Design and Evaluation of a Memristor‐based Look‐up Table for Non‐volatile Field Programmable Gate Arrays

Haider A. F. Almurib, T. Nandha Kumar, Fabrizio Lombardi

IET Circuits, Devices & Systems (2016)


Abstract

IET Circuits, Devices & Systems, Volume 10, Issue 4, p. 292-300. Research Article, Free Access.

Design and evaluation of a memristor-based look-up table for non-volatile field programmable gate arrays

Haider Abbas F. Almurib, Faculty of Engineering, The University of Nottingham Malaysia Campus, Semenyih, Selangor, Malaysia
Thulasiraman Nandha Kumar (Corresponding Author, nandhakumaar.t@nottingham.edu.my), Faculty of Engineering, The University of Nottingham Malaysia Campus, Semenyih, Selangor, Malaysia
Fabrizio Lombardi, Department of Electrical and Computer Engineering, Northeastern University, Boston, 02115 USA

First published: 01 July 2016
https://doi.org/10.1049/iet-cds.2015.0217
Abstract

This study presents the detailed design and analysis of a new memristor-based look-up table (LUT) for field programmable gate arrays (FPGAs). The proposed memory utilises memristors as storage elements with N-type metal–oxide–semiconductor transistors for row access. New WRITE and READ operations are proposed; the proposed LUT requires no additional circuit to handle the WRITE 1 (0) operation. Moreover, the proposed method requires only three power lines and needs a RESTORE pulse only for the READ 0 operation, thus saving 25% of the READ time when compared with previous methods. In addition, the proposed method does not require the REFRESH pulse and does not dissipate power during stand-by mode. Extensive simulation results are presented with respect to different operational features such as normalised state parameter, pulse width and LUT size. In addition to a circuit-level evaluation, the proposed LUT scheme has also been assessed with respect to FPGA implementation. Simulation results using sequential benchmarks mapped on Spartan 4 and 5 FPGAs show that the proposed non-volatile LUT outperforms existing static random access memory cell-based LUTs in terms of performance.

1 Introduction

Field programmable gate arrays (FPGAs) offer programmability at relatively low development cost and good performance [1]. The common FPGA architecture consists of a regular, programmable two-dimensional (2D) array of configurable logic blocks with programmable input/outputs along its perimeter [2]. All configurable resources [inclusive of the look-up tables (LUTs)] are controlled by the configuration bits stored in static random access memory (SRAM) cells [2]. However, an SRAM is unable to retain the configuration bits should the power be lost.
Hence, as a possible solution, non-volatile (NV) flash memory is integrated into the FPGA for storing the configuration bits [3]. However, this leads to issues such as a larger silicon area, an increase in cost [4] and a very slow data retrieval time. Moreover, at nanoscale technology nodes, a substantial increase in leakage current is encountered when the FPGA is in stand-by mode [5], causing additional power dissipation. Thus, an alternative NV memory block (as an LUT) that uses the so-called memristor as storage device is proposed in this paper to overcome the above-mentioned issues.

Emerging NV memory technologies such as spin transfer torque (STT)-magnetic RAM (MRAM) [6, 7], phase change memory (PCM) [8-10], conductive bridge RAM (CBRAM) [10] and resistive RAM (RRAM) [11-14, 16] have been proposed to potentially supersede SRAM-based LUTs, because they offer nearly zero stand-by power, high speed, high density and high endurance at low cost. A comparison of different NV memories [16-18] with respect to different metrics is presented in Table 1. The on/off resistance ratio of an STT-MRAM cell is comparatively poor; moreover, for robust operation, one of the most crucial requirements is the design of the sense amplifier [7]. The PCM-based FPGA of [8] incurs a long configuration time and high power dissipation when the power is on; the PCM-based LUT of [9] suffers from a high active leakage power during normal operation. Furthermore, a PCM requires a large programming current and incurs a resistance drift due to material relaxation. The switching voltage of the CBRAM is low, resulting in poor retention [10].

Table 1.
Features of different NV memories

Feature | STT-MRAM | PCM | CBRAM | RRAM, memristors
cell size, F² | 37 | 8–16 | <20 | >5
read latency, ns | <10 | 48 | ∼20 | <10
write latency, ns | 12.5 | 40–150 | ∼100 | ∼10
energy per bit access, pJ | 0.02 | 100 | 2 | 2
endurance | >10^15 | 10^8 | >10^5 | 10^5

The memristor is a passive element postulated in [19] and realised by Hewlett-Packard (HP) Labs using a nanoscale thin film of titanium dioxide [20]. Memristor-based memories have been extensively analysed in the technical literature [11-15]; these memories have been advocated as a potential replacement for conventional NV flash memories due to their high density and low power consumption [12, 15]. An LUT in an FPGA must meet specific requirements to ensure that a memristor-based implementation is effective: its size is usually small (Table 1) and, once programming (WRITE) for an application is accomplished, the FPGA requires a fast and NV READ operation (Table 1).

A novel architecture that avoids the 3D stacking process for the interconnect of an FPGA has been proposed in [15]; it uses only memristors and metal wires for implementing the interconnect. Turkyilmaz et al. [14] have proposed a different technique that uses NVSRAM memories with bipolar OxRRAM. A power reduction in stand-by mode is reported compared with a conventional SRAM-based FPGA. However, the above techniques employ six transistors (for SRAM) and two memristors per information bit; perfect clock gating is also assumed. Therefore, these methods suffer from dynamic power dissipation and incur substantial area and delay overheads. Moreover, these methods have complicated WRITE and READ operations.

This paper proposes a novel memristor-based memory block for an LUT of an FPGA; as shown later in this paper, it has a fast access time and no power dissipation during stand-by mode.
New WRITE and READ operations are proposed for the memory block; the proposed WRITE 1 (0) operation is performed by applying a pulse of +Vdd (−Vdd) only at the word line (WL). Similarly, the READ operation is carried out by applying a pulse of ±Vdd at WL, while the status of the selected memristor is found at the bit line (BL). The substantial differences between the proposed method and previous methods [11-13] are listed in Table 2. Simulation results for benchmark circuits in FPGA implementation are provided to substantiate the applicability of the proposed LUT scheme and its superior performance compared with an SRAM-based LUT.

Table 2. Differences between proposed method and [11-13]

Feature | [11-13] | Proposed method
WRITE operation | data monitor circuit required | data monitor circuit not required
WRITE operation | performed at WL and BL | performed only at WL
READ operation | negative pulse | positive pulse
RESTORE pulse | positive pulse | negative pulse
RESTORE pulse | required for both READ 1 and 0 | required only for READ 0
REFRESH pulse | required | not required
V/2 biasing | required | not required
data integrity | affected | not affected
number of power rails | four | three

This paper differs significantly in technical content from [21, 22] in the simulation and analysis of (i) the performance of the normalised state parameters (NSPs) of the unselected memristors at different array sizes under the WRITE operation, (ii) a new READ method using an equivalent circuit model, (iii) the worst-case performance of the READ and WRITE operations, (iv) the proposed NV and SRAM-based LUT designs for Xilinx Virtex4 and Virtex5 FPGAs to implement the International Symposium on Circuits and Systems (ISCAS)'89 sequential benchmark circuits.

2 Proposed LUT

The proposed memory block for a two-input LUT (Fig. 1) consists of two parts: (a) the memory array that consists of four nanocross wires (in which a memristor is connected at every junction); (b) a controller (dotted box of Fig.
1) that is utilised to provide appropriate control signals to the memory. The main focus of this paper is to design an LUT using memristors and evaluate its performance for the READ and WRITE operations. A controller for the memory inputs has been designed; it has 70 metal–oxide–semiconductor (MOS) transistors at the 45 nm technology node [24] and a static leakage current of 6061.568 pA; no issue with noise margin has been observed.

Fig. 1 Proposed memory block as a two-input LUT implemented using memristors

In the memory array, the horizontal wires are the WLs and the vertical wires are the BLs. Every BL is connected to the ground (GND) through an N-type MOS (NMOS) transistor (T1, T2). According to the input data, the controller block controls the data to be driven on WL. Moreover, it switches the transistors on and off by controlling the gate signals (G1 and G2) and selects (Sel) the appropriate BL value to the output (Out). A and B are the inputs to the control block, i.e. the two inputs of the LUT; Out is the output of the LUT. WRITE is executed on a column-wise basis when WriteEn is high. To WRITE to all memristors connected to a column, the controller selects the corresponding BL, generates the appropriate voltage based on the input data and applies it to the respective WLs. The voltage requirements for the signals of the WRITE and READ operations of the selected memristor (M11) are shown in Table 3. The truth table of the functions of the controller block is given in Table 4.

Table 3. Voltage requirements for WRITE and READ operations on M11 using the proposed method

Operation | Voltage on WL1 | Voltage on WL2 | Voltage on BL1 | Voltage on BL2
write 1 | Vdd | floating | GND | floating
write 0 | −Vdd | floating | GND | floating
read | ±Vdd | GND | to load | to load

Table 4.
Truth table of controller

A | B | WL1 | WL2 | G1 | G2 | Sel
0 | 0 | 1 | 0 | 1 | 0 | 0
0 | 1 | 1 | 0 | 0 | 1 | 1
1 | 0 | 0 | 1 | 1 | 0 | 0
1 | 1 | 0 | 1 | 0 | 1 | 1

As an example, let the inputs of the LUT (AB) be 00; the controller selects the memristor M11 for the operation (i.e. WRITE/READ) by turning on T1 and applying the appropriate voltage to WL1. Similarly, to select the memristor M22, the appropriate voltage is applied to WL2 and T2 is turned on. The output multiplexer is enabled only during the READ operation.

The READ is performed on each memristor; the READ value is obtained at Out. ReadEn is made high to READ a memristor. Then, depending on the values of A and B, the controller selects the corresponding BL of the memristor and applies the READ voltage (Table 3) to the respective WL of the memristor. Moreover, the controller generates the appropriate select signal (0/1) to propagate the READ value to Out. For example, to READ M11, BL1 is selected, Sel is '0' and the READ voltage is applied at WL1. In addition, the proposed controller block is designed such that simultaneous READ and WRITE operations are not possible. Prior to writing the configuration bits to the LUT, all memristors are assumed to be in the off state (0 state).

The proposed memory block provides many qualitative advantages compared with an SRAM-based LUT. When the power to the proposed LUT is turned off, the NSPs [21, 22] of the memristors are retained due to their NV nature. Therefore, unlike an SRAM-based LUT, there is no power dissipation during the stand-by mode; moreover, the dynamic power dissipation is very small when compared with an SRAM-based LUT, as the latter consumes more power, particularly when switching the inverters. Six transistors (6 T) are usually required for storing a single bit in a complementary MOS (CMOS) SRAM-based LUT; hence, a two-input LUT requires 24 transistors. In the proposed memory block, only four memristors and two transistors are required.
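The controller truth table (Table 4) can be expressed compactly in code. The following is a minimal sketch in Python, not the authors' controller design: the function names `controller` and `selected_memristor` are hypothetical, and the Boolean expressions are simply read off the four rows of Table 4 (WL1 follows the complement of A, WL2 follows A, G1 follows the complement of B, G2 and Sel follow B). The signals are logic levels only; the actual WL voltage (+Vdd, −Vdd) depends on the operation, as listed in Table 3.

```python
def controller(a: int, b: int) -> dict:
    """Return the logic-level control signals of Table 4 for LUT inputs A and B."""
    if a not in (0, 1) or b not in (0, 1):
        raise ValueError("A and B must each be 0 or 1")
    return {
        "WL1": 1 - a,  # row 1 is driven when A = 0
        "WL2": a,      # row 2 is driven when A = 1
        "G1": 1 - b,   # T1 (column 1) is on when B = 0
        "G2": b,       # T2 (column 2) is on when B = 1
        "Sel": b,      # output multiplexer picks BL1 (0) or BL2 (1)
    }


def selected_memristor(a: int, b: int) -> str:
    """Name the memristor M<row><column> addressed by inputs A and B."""
    signals = controller(a, b)
    row = 1 if signals["WL1"] else 2
    column = 1 if signals["G1"] else 2
    return f"M{row}{column}"


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, controller(a, b), selected_memristor(a, b))
```

With AB = 00 the sketch selects M11 through WL1 and T1, and with AB = 11 it selects M22 through WL2 and T2, matching the examples given in the text.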
A detailed evaluation of performance and related metrics is presented in subsequent sections. In the proposed work, the updated version of the simulation program with integrated circuit emphasis (SPICE) model of [23] is utilised for simulating the memristor, as it closely resembles the HP Labs implementation [20]. Moreover, throughout this paper, unless otherwise specified, the default values of the parameters used in [21] are adopted.

3 Operational features

Consider first WRITE and READ on a single-bit LUT in which one terminal of the memristor is connected to the WL, while the other terminal is connected to the BL. An NMOS transistor acts as a load; by switching it on or off, the corresponding memristor is selected or unselected for the WRITE/READ operation.

3.1 WRITE 1 operation

For the WRITE 1 operation, a pulse (+Vdd) is applied to WL until the NSP of the memristor (initially at ROFF) changes from ROFF (0) to RON (1); thus, a logic 1 is successfully written to the memory cell. Fig. 2a shows the value of the NSP, whereas Fig. 2b shows the output voltage at the load after applying +Vdd (+0.9 V) until 150 ns. The NSP fully changes from the initial value of 0 to 1 at about 135 ns; thus, the WRITE 1 operation is correctly executed.

Fig. 2 Response to a sequence of WRITE 1, READ 1, RESTORE, WRITE 0, READ 0 and RESTORE operations on a single memristor using the proposed method
(a) Applied voltage and resulting NSP
(b) Output voltage

3.2 WRITE 0 operation

For a WRITE 0 operation, a pulse (−Vdd) is applied to WL until the NSP of the memristor (initially at RON) changes from RON (1) to ROFF (0). Thus, a logic 0 is successfully written to the memory cell. As shown in Fig. 2a for WRITE 0, −Vdd (−0.9 V) is applied to WL at 170 ns and the NSP changes from 1 and reaches the value of 0 at about 320 ns, i.e. the WRITE 0 operation is correctly executed.
Thus, unlike [11-13], the proposed WRITE operation requires applying +Vdd (+0.9 V) and −Vdd (−0.9 V) only to WL, i.e. it does not require an additional circuit to monitor the incoming data and then channel it to WL/BL depending on the value of the data for the WRITE operation. In addition, the proposed method does not require the application of +Vdd and +0.5 Vdd to WL and BL to unselect a memristor; therefore, the proposed method uses only three power rails (+Vdd, −Vdd and GND). Moreover, as shown later, in the proposed method the changes to the NSP of the unselected memristors are small when compared with [11-13].

3.3 READ 1(0) operation

A pulse of +Vdd (with a duration of 10 ns) is applied to WL (READ pulse) of the memristor with NSP at 1(0) and the READ value is found across the NMOS. The simulation results are shown in Figs. 2a and b. The READ 1 operation performed at 155 ns does not affect the NSP of the memristor. Therefore, unlike [11-13], the proposed method does not require a RESTORE pulse following the READ 1 operation. Thus, the proposed method accomplishes significant READ and RESTORE time savings when compared with [11-13], as analytically shown next.

Let α be the fraction of READ 1 operations and β be the fraction of READ 0 operations over a total number of N READ operations (α + β = 1); Ttotal denotes the total time required to perform all N READs and is given by

$$T_{\mathrm{total}} = N\left[\alpha\,(T_1 + T_{r1}) + \beta\,(T_0 + T_{r0})\right] \qquad (1)$$

where T1 is the time required for a READ 1 operation, Tr1 is the RESTORE time after a READ 1, T0 is the time required for the READ 0 operation and Tr0 is the RESTORE time after the READ 0 operation. For the methods of [11-13], Tr1 = Tr0 ≠ 0, while for the proposed method, Tr1 = 0 and Tr0 ≠ 0. Assuming that all pulse widths are equal (i.e. T1 = T0 = Tr0 = T), the total time to perform all READ operations using the proposed method is given by

$$T_{\mathrm{total}} = N\,T\,(\alpha + 2\beta) \qquad (2)$$

whereas the methods of [11-13] require 2NT under the same assumption. The largest saving in total time (Ts-max) that can be achieved using the proposed method is 50% of the Ttotal of [11-13]; it occurs when all READ operations are READ 1 (α = 1).
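The READ time saving can be checked numerically. The sketch below assumes that the total READ time has the form Ttotal = N[α(T1 + Tr1) + β(T0 + Tr0)] with α + β = 1, which follows from the symbol definitions above; the function names and the default values N = 1000 and T = 10 ns are illustrative choices, not values prescribed by the paper (only the 10 ns pulse width is taken from the text).

```python
def total_read_time(n, alpha, t1, tr1, t0, tr0):
    """Total time for n READs, a fraction alpha of which are READ 1 operations."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    beta = 1.0 - alpha  # fraction of READ 0 operations
    return n * (alpha * (t1 + tr1) + beta * (t0 + tr0))


def read_time_saving(alpha, t=10e-9, n=1000):
    """Fractional saving of the proposed method, assuming equal pulse widths t."""
    # Previous methods: a RESTORE pulse follows both READ 1 and READ 0.
    previous = total_read_time(n, alpha, t, t, t, t)
    # Proposed method: no RESTORE pulse after READ 1 (Tr1 = 0).
    proposed = total_read_time(n, alpha, t, 0.0, t, t)
    return (previous - proposed) / previous


if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        print(f"alpha = {alpha:.1f}: saving = {100 * read_time_saving(alpha):.1f}%")
```

Under these assumptions the saving equals α/2, so it ranges from 0 when every READ is a READ 0 to 50% when every READ is a READ 1.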
Alternatively, for an equal number of READ 1 and READ 0 operations (α = 0.5 and β = 0.5), the proposed method achieves a saving of 25% in Ttotal, i.e. it saves 25% of the READ and RESTORE times compared with [11-13].

A READ 0 operation performed at 355 ns shows (Fig. 2b) that the output voltage is low during the READ 0 operation. However, as shown in Fig. 2a, the NSP changes slightly toward RON. To restore the NSP to its original value, a pulse of −Vdd is applied for 10 ns; this signal is referred to as the RESTORE pulse. A circuit is required to control the application of the RESTORE pulse; this circuit operates depending on the NSP value of the memristor when a READ operation is performed. A RESTORE pulse is then applied at 365 ns to restore the NSP to its original value (0). However, this pulse does not fully restore the NSP to its original value due to the insufficient width of the RESTORE pulse. Hence, consecutive READ 0 operations may cause the NSP to reach the threshold level, thus eventually changing the value stored in the memory.

Fig. 3 Equivalent circuit for LUT reading
(a) Floating-row method
(b) Grounded-row method

An assessment of the number of consecutive READ 0 and RESTORE operations can be found in [21]; a ringing behaviour has been reported in the operation of the memristor. For ringing to have no effect on the memory state, the width of the READ pulse must be smaller than or equal to the RESTORE pulse width [21]. Under this condition, the proposed method does not require a REFRESH pulse to be applied after a few READ operations [21]. As described previously, an additional circuit is required to decide when to apply the RESTORE pulse. A similar simulation assessment (consecutive READ and RESTORE operations) has been performed on the methods presented in [11-13].
The effect of the READ 0 operation on the NSP is similar to that of the proposed method; however, the READ 1 operation causes the NSP of the memristor to fall below the threshold value. Therefore, [11-13] require a REFRESH pulse when the NSP reaches the threshold value of 0.5, and a circuit is required to monitor the NSP of a memristor [21]. A more detailed analysis is presented in [21].

4 LUT operations

In this section, different features related to the LUT-level performance of the proposed LUT block (Fig. 1) are studied in terms of the WRITE and READ memory operations by considering the scenario under which the memristor M11 is selected for the WRITE and READ operations, while the other memristors (M12, M21 and M22) are unselected.

4.1 WRITE operation

Consider all the memristors to be at the ROFF state, with the proposed WRITE 1 and 0 operations performed sequentially on M11. As expected, the NSP of M11 changes from 0 to 1 (1 to 0) during the WRITE 1 (0) operation. However, the NSPs of the unselected memristors M21 and M12 show a slight change toward RON (well below the threshold value) during the WRITE 1 operation and no change during the WRITE 0 operation; therefore, the correct logic value (0) is stored in M21 and M12. The NSP of the unselected memristor M22 does not change during the WRITE 1 operation, but it changes significantly toward RON during the WRITE 0 operation. Hence, to overcome the change in the NSP of some of the unselected memristors, the WRITE operation of the proposed memory (i.e. the programming of an LUT) executes in two phases as follows:

Phase 1 (Refresh Phase): Before writing to the array, all the memristors are written to 0.
Phase 2 (Write 1): Write a 1 to all selected memristors of the LUT.

A similar experiment was performed using the methods proposed in [11-13] and the results are presented in Table 5. Columns 2 and 3 show the number of unselected memristors that are affected during the WRITE 1 operation in the proposed method and in [11-13], respectively.
Columns 4 and 5 represent the worst-case change in NSP of those affected (unselected) memristors using the proposed and previous methods [11-13]. The number of unselected memristors affected in the proposed method is twice that of [11-13]. However, for the proposed method the worst-case NSP is well below the threshold (i.e. 0.5), whereas for [11-13] the worst-case NSP is always greater than the threshold level. Hence, the previous methods do not preserve the memory state of the unselected memristors when performing a WRITE operation on the selected memristors.

Table 5. NSP of unselected memristors under Scenario 1

LUT size | Affected unselected memristors, proposed method | Affected unselected memristors, [11-13] | Worst-case NSP (initial value 0), proposed method | Worst-case NSP (initial value 0), [11-13]
2 × 2 | 2 | 1 | 0.179 | 0.868
4 × 4 | 6 | 3 | 0.242 | 0.838
6 × 6 | 10 | 5 | 0.261 | 0.827
8 × 8 | 14 | 7 | 0.269 | 0.812

4.2 READ operation

The LUT is read using the proposed method by reading the selected cell in a row and then using the MUX to output the content of the selected cell. To increase the output voltage difference between the READ 0 and READ 1 operations, a grounded-row method (Fig. 3b), which connects the unselected rows to ground rather than leaving them floating (previous methods, Fig. 3a), is utilised in this paper. The following modelling analysis supports the utilisation of the proposed method. Figs. 3a and b show the equivalent circuits of the 2 × 2 LUT under these methods at steady state. Rij denotes the resistance of the memristor Mij, RTj denotes the on resistance of transistor Tj and Vj denotes the output voltage at node Outj, where i = 1, 2 and j = 1, 2.
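As a numerical check of these equivalent circuits, the sketch below solves both read configurations for a 2 × 2 array. It is an illustration under stated assumptions, not the authors' derivation: the floating-row case is modelled by nodal analysis with the two bit lines coupled through the series path R21 + R22, and the grounded-row case by a voltage divider in which each unselected memristor appears in parallel with its column transistor. The function names are hypothetical; the component values (R11 = 19 kΩ, R12 = R21 = R22 = 100 Ω, RT = 5.552 kΩ, Vin = 0.9 V) are the example values used in the paper's comparison.

```python
def parallel(*resistances):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def floating_row(vin, r11, r12, r21, r22, rt):
    """Output voltages (V1, V2) when the unselected row is left floating."""
    g11, g12, gt = 1.0 / r11, 1.0 / r12, 1.0 / rt
    gm = 1.0 / (r21 + r22)  # bit lines coupled through R21 in series with R22
    # Kirchhoff's current law at Out1 and Out2:
    #   (g11 + gt + gm) * V1 - gm * V2 = g11 * Vin
    #   -gm * V1 + (g12 + gt + gm) * V2 = g12 * Vin
    a11, a12 = g11 + gt + gm, -gm
    a21, a22 = -gm, g12 + gt + gm
    b1, b2 = g11 * vin, g12 * vin
    det = a11 * a22 - a12 * a21
    v1 = (b1 * a22 - a12 * b2) / det  # Cramer's rule
    v2 = (a11 * b2 - a21 * b1) / det
    return v1, v2


def grounded_row(vin, r11, r12, r21, r22, rt):
    """Output voltages (V1, V2) when the unselected row is tied to ground."""
    rp1 = parallel(rt, r21)  # unselected memristor in parallel with T1
    rp2 = parallel(rt, r22)  # unselected memristor in parallel with T2
    return vin * rp1 / (r11 + rp1), vin * rp2 / (r12 + rp2)


if __name__ == "__main__":
    values = dict(vin=0.9, r11=19e3, r12=100.0, r21=100.0, r22=100.0, rt=5552.0)
    print("floating row:", floating_row(**values))
    print("grounded row:", grounded_row(**values))
```

With these inputs the two floating-row outputs differ by only about 0.03 V, whereas the grounded-row outputs differ by about 0.44 V, which is why the grounded-row method separates a stored 0 from a stored 1 far more clearly.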
Under the floating-row method, let RM = R21 + R22 and RT = RT1 = RT2, so the output voltages V1 and V2 are given by (3) and (4). Under the grounded-row method, RT = RT1 = RT2 and the output voltages V1 and V2 are given by

$$V_1 = V_{in}\,\frac{R_{P1}}{R_{11} + R_{P1}} \qquad (5)$$

$$V_2 = V_{in}\,\frac{R_{P2}}{R_{12} + R_{P2}} \qquad (6)$$

where 1/RP1 = 1/RT + 1/R21 and 1/RP2 = 1/RT + 1/R22 are the parallel resistances between the unselected memristors and the transistors.

Consider the floating-row method; as shown in (3) and (4), the difference between the two outputs (V1 and V2) is decided solely by R11 and R12. Moreover, the second term of both equations is significantly larger, which makes the difference between the two voltages very small. As an example, consider R11 = 19 kΩ and R12 = R21 = R22 = 100 Ω. As RT = 5.552 kΩ at 32 nm feature size and Vin = 0.9 V, the output voltages are V1 = 0.8399 V and V2 = 0.8695 V. Therefore, it is very difficult to distinguish between the two outputs when the two memristors have different values of NSP.

Now consider the grounded-row method, where the output voltage is effectively set by a voltage divider. The output voltage therefore depends mostly on the resistances of the column selected for the READ operation. Under the same conditions as previously considered for the floating-row method, the output voltages are now V1 = 0.0046 V and V2 = 0.4460 V. That is, there is now a clear distinction between a memristor with an NSP of 0 and a memristor with an NSP of 1. This is made possible by the parallel arrangement of the unselected memristors and the transistor resistance, i.e. the selected memristor largely determines the outcome of the voltage divider.

Next, the performance of the grounded-row method is studied for different array sizes; it is observed that the final value of the READ output voltage, and hence the difference, reduces as the array grows. The worst case occurs when the resistance of all unselected memristors is the smallest (RON), i.e. in this case the unselected memristors on a column have NSPs of 1. Fig.
4 shows the output voltage difference when reading two memristors (one with an NSP of 1 and the other with an NSP of 0) for different values of RON and LUT sizes (the LUT size refers to the LUT memory size) in the worst case of the grounded-row method. As expected, with an increase in memory size, the difference in the output voltage decreases due to the decrease in the load resistance. Moreover, as RON increases (under a constant ROFF/RON ratio), the output voltage difference decreases, as caused by the decrease in the amount of current flowing in the circuit. As shown in Fig. 4, a comparator with a reference voltage of 0.1 V can sense logic 1 and 0 of an LUT with up to eight inputs. Thus, unlike [11-13], the proposed READ operation does not require a sense amplifier.

Fig. 4 Worst-case difference between reading 0 and 1 NSP for different RON and LUT sizes for the grounded-row method

5 Simulation results

This section presents a simulation-based evaluation of the proposed scheme; a comparison with [11-13] is made with respect to the worst-case scenario (the results for the average case are reported in [21]). Moreover, a comparison is performed between the proposed memristor-based and SRAM-based LUTs.

5.1 WRITE operation

A detailed performance analysis is carried out by applying the WRITE operation to LUTs of different sizes and then comparing the results with [11-13]. The following scenarios (as in [22]) are simulated:

Scenario 1: WRITE 1 to all memristors.
Scenario 2: WRITE 0 to all memristors.
Scenario 3: WRITE 0 to a memristor while the NSPs of all memristors are initially 1.
Scenario 4: WRITE 1 to a memristor while the NSPs of all memristors are 0.

The worst-case values for the WRITE operation at different array sizes are calculated based on the simulation results obtained for the above scenarios. From the results (Fig. 5), it can be observed that the worst-case delay incurred by the proposed method is slightly less than [11-13].
Moreover, the energy dissipated and the EDP of the proposed method are significantly less than those of [11-13].

Fig. 5 Worst-case WRITE operation results for different sizes of LUTs
(a) Delay
(b) Energy
(c) Energy delay product (EDP)

Next, the WRITE operation performance of the proposed method is compared with SRAM-based volatile LUTs. In this volatile LUT, each cell is a 6 T SRAM designed at 32 nm. Table 6 shows the results for the SRAM-based and the proposed NV LUTs; the average write time and the average EDP of the SRAM-based scheme are significantly less than those of the proposed memristor-based cell.

Table 6. Average WRITE delay, energy and EDP of SRAM and the proposed memristor-based LUTs

LUT size | Average delay, ns (SRAM) | Average delay, ns (Proposed) | Average energy, pJ (SRAM) | Average energy, pJ (Proposed) | Average EDP, pJns (SRAM) | Average EDP, pJns (Proposed)
2 × 2 | 0.103 | 281.19 | 0.0016 | 30.53 | 0.00017 | 10,666
4 × 4 | 0.206 | 631.28 | 0.0067 | 97.50 | 0.0014 | 98,448
6 × 6 | 0.309 | 1151.76 | 0.0151 | 213.85 | 0.0047 | 435,339
8 × 8 | 0.412 | 1821.81 | 0.0269 | 379.5 | 0.0111 | 1,285,099

5.2 READ operation

The READ operation is performed on different array sizes considering the scenarios stated in [22]:

Scenario 5: READ 0 when only the NSP of one memristor is 0 (i.e. the NSPs of all other memristors are 1).
Scenario 6: READ 0 when the NSPs of all memristors are 0.
Scenario 7: READ 1 when only the NSP of one memristor is 1 (i.e. the NSPs of all other memristors are 0).
Scenario 8: READ 1 when the NSPs of all memristors are 1.

The process of reading the LUT is performed in two steps: (i) the row on which the selected (target) cell is located is initially read to determine the worst propagation delay of the READ, and therefore to establish the minimum width of the READ signal; (ii) the MUX outputs the contents of the target cell. Fig. 6 shows the results of the worst-case READ operation applicable to the whole LUT for the proposed method and [11-13]. It is observed that the average and worst-case values of the EDP in both methods increase as the size of the array increases.
Of the two, the proposed method incurs a significantly smaller EDP value for all LUT sizes. Moreover, the proposed method requires significantly less READ time and dissipates less energy for all LUT sizes. In [11-13], the unselected rows for the READ operation are left floating, and this significantly decreases the read voltage difference between the READ 0 and READ 1 operations.

Fig. 6 Worst-case READ operation results for different sizes of LUTs
(a) Delay
(b) Energy
(c) EDP

Next, the READ operation performance of the proposed method is compared with SRAM-based volatile LUTs. As shown in Table 7, the proposed method incurs a significantly reduced READ delay and EDP compared with an SRAM-based LUT. This confirms the viability of the proposed cell, because in an FPGA, READ operations are performed more often than WRITE operations.

Table 7. Average READ delay, energy and EDP of SRAM and proposed memristor-based LUTs

LUT size | Average delay, ps (SRAM) | Average delay, ps (Proposed) | Average energy, fJ (SRAM) | Average energy, fJ (Proposed) | Average EDP, fJps (SRAM) | Average EDP, fJps (Proposed)
2 × 2 | 320.25 | 3.073


Keywords

Memristor, Non-Volatile Memory, Resistive Switching
