Nano-Memory Simulation

Bradley Berg
December 14, 2005

ABSTRACT

The wires interconnecting a grid of sub-lithographic memory cells are too small to be directly manipulated. Randomized address decoders can be used to access individual memory cells. Either differentiated wires are randomly positioned or undifferentiated wires are randomly connected. Randomized access requires a mapping memory to translate ordered addresses into random addresses. A paged serial access chip architecture is presented to minimize the size of the memory map and to compensate for the resultant timing latencies. Simulations were performed to determine plausible memory configurations for differentiated Core-Shell wires and Random-Particle decoders. With both approaches about 25% of the cells were usable, assuming a projected 10% failure rate for wires and cells. About 25% of the cells were lost due to failures and the remaining half were lost as a result of randomized access.

1. Introduction

Initial nano-memory offerings will compete against dominant incumbent technologies, DRAM, SRAM, and flash, as well as several emerging technologies. Random access nano-memories will have to outperform either DRAM (< 50ns) or SRAM (< 8ns) at lower cost. If a near term random access memory technology (e.g. Ovonic phase change memory [1]) is to displace DRAM it will need to be even faster, raising the performance bar for nano-memories. Non-volatile memory has less stringent performance constraints, but has higher density requirements. Flash memory can be read quickly (< 60ns), but writes can take milliseconds.

Some proposed nano-memories may be fast enough to compete for use as main memory, but many will not. A claim has been made that the nano-memory being developed by Nantero has an access time of 1/2 nanosecond [2]. Even if the storage medium itself is fast, the total chip level access time may be substantially slower. Proposed nano-memory schemes need to cope with high failure rates and randomized configuration. These issues are addressed with mesa-scale (usually CMOS) address translation maps and error correction. An additional factor that may degrade nano-memory performance is output coupling delay. This occurs at the junction of a nano-scale read sensor and a mesa-scale output line: because the large output line has higher capacitance, it takes time for the small read sensor to transfer enough charge to register on it. Considering these factors it is likely that nano-memories will be better positioned to compete on their density advantage rather than on speed.

File systems are stored in paged memory and comprise the bulk of the storage bits in a computing system. Currently, file systems are stored on low-cost disks with access times in the tens of milliseconds. New high capacity memory chips can store frequently accessed pages in fast solid-state devices to dramatically improve file system access speed [3].

This paper considers chip level architectures for paged nano-memory devices. It is applicable to nano-memory technologies whose performance is not sufficient to compete with incumbent and emerging random access devices. Plausible circuitry to access pages of nano-memory is devised using simulation. The simulations take into account varying degrees of fabrication faults and are used to compare configuration options.

2. Paged Memory Configuration

Solid-state paged memory can be used to improve file system performance and to build portable storage devices.
With the advent of low cost non-volatile solid-state memory, paged memory can be combined with rotating disks to create high performance file systems. There are four different uses for solid-state paged memory in general purpose computer systems.

* Portable drives are already widely used in pen drives and music players. Over time it may be desirable for personal data to migrate to pen drives [4].

* Non-volatile memory can be used to cache data for rotating disk drives. Microsoft Windows Vista (a.k.a. Longhorn) buffers disk reads in main memory, while disk writes are buffered in non-volatile memory on a hybrid drive [5].

* Solid state disks can potentially sustain peak transfer rates for a given transfer protocol with access times significantly faster than hard drives. The transfer rate for SATA 1 is 150MB/second. The recently introduced SATA 2.5 specification has a peak rate of 300MB/second, and the next generation SATA 3 will double the rate again to 600MB/second [6].

* Even higher transfer rates can be achieved by storing local file systems in non-volatile memory placed on the mainboard. Local storage can use full speed DMA channels to transfer data in and out of main memory. This can greatly improve the performance of local desktop and server systems and vector data in and out of supercomputer systems.

DeHon [7, 8] proposed that nano-memory chips be hierarchically organized using a set of crossbar grids of nano-wires. Within a grid nano-wires are partitioned into bundles, and each bundle is individually addressable by mesa-scale (typically CMOS) wiring. Throughout this project each grid contains 64 usable bundles per axis; the actual number of bundles will be greater due to flaws, unaddressable wires, and parity bits.

Bundles are kept small enough that the bulk of address decoding can be done with reliable mesa-scale circuits. A practical bundle size is near the ratio of mesa-scale to nano-scale pitches, which is expected to be about 10 (e.g. 90nm:9nm). On this basis the page address size is set at 3 bits and is used to select a wire within a bundle using a decoder.
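To make the addressing concrete, the sketch below splits a grid address into the fields shown in the GRID ADDRESSING diagram of the next section. This is a minimal illustration only; the 18-bit packing order and the Python helper are assumptions, not a specification of the hardware.

    def split_grid_address(addr):
        # Illustrative unpacking of an 18-bit grid address into the
        # X Bundle(6) | Y Bundle(6) | X Page(3) | Y Page(3) fields.
        # The bit ordering here is an assumption for illustration.
        y_page   =  addr        & 0x07   # low 3 bits
        x_page   = (addr >> 3)  & 0x07   # next 3 bits
        y_bundle = (addr >> 6)  & 0x3F   # next 6 bits
        x_bundle = (addr >> 12) & 0x3F   # high 6 bits
        return x_bundle, y_bundle, x_page, y_page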
GRID ADDRESSING

    +--------------------------------------------------+
    | X Bundle(6) | Y Bundle(6) | X Page(3) | Y Page(3) |
    +--------------------------------------------------+

    -++++++++++++++-++++++++++++++-
    -++++++++++++++-++++++++++++++-                  +-----+
    -++++++++++++++-++++++++++++++-                ->|     |
    -++++++++++++++-++++++++++++++-      Page(3)   ->| Map |
    -++++++++++++++-++++++++++++++-                ->|     |
     ||||||||||||||  ||||||||||||||                  +-----+
     ||||||||||||||  ||||||||||||||                   |||||
     ||||||||||||||  ||||||||||||||    Decoder Inputs |||||
     ||||||||||||||  ||||||||||||||                   |||||
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----   Bundle
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----
     |||||        ||||||||||||||  ||||||||||||||
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----   Bundle
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----
    -+++++--------++++++++++++++-++++++++++++++----
                  ||||||||||||||  ||||||||||||||

The X and Y Bundle addresses are scanned, and the X and Y Page addresses select a particular 4096 bit page out of the 64 pages within the grid. When a page is accessed, bits are selected over the entire grid. Correspondingly, the heat generated at cross points is dissipated over a wide area, avoiding hot spots on the chip. Memory cells this small are likely to be particularly susceptible to thermal perturbations; heat can more easily change their analog characteristics or damage the device. Spreading access out over pages also increases endurance, since over a long period of time no small group of bits will be repeatedly changed.

3. Fault Model

This section details sources of faults within a grid. With respect to page addressing, however, the decoder logic simply needs to know whether or not a particular page address can access a fully functional nano-wire. Faults are categorized into always on, always off, and intermittent [9, 10].

Hard faults are found using a discovery process: each address in the grid is tested to see if data can be successfully written, and as invalid addresses are found the corresponding map entries are marked (a sketch follows at the end of this section). The address map is scanned sequentially and invalid or unused addresses are skipped.

The validity of neighboring addresses has no effect on a mapped address, so faults can be treated as independent events. Consequently a single composite fault metric is used. Simplifying the fault model down to a single parameter implies the fundamental chip architecture does not need to be altered as the capacity or fault rate changes. At most a small number of bundles can be added to or removed from grids to accommodate design changes.

Intermittent faults occur while the chip is in use and cannot be detected at the time of fabrication. These are managed using error correction and are discussed in section 4.
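A minimal sketch of the discovery process follows. The grid.write and grid.read test hooks and the valid map array are hypothetical; real hardware would stream test patterns through the page buffers.

    def discover(grid, valid):
        # Hard-fault discovery: write test patterns to every address
        # and read them back.  Failing addresses are marked invalid
        # in the map so that sequential scans skip them.
        for addr in range(len(valid)):
            ok = True
            for pattern in (0x55, 0xAA):     # alternating bit patterns
                grid.write(addr, pattern)
                ok = ok and (grid.read(addr) == pattern)
            valid[addr] = ok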
Any hard fault has a corresponding intermittent counterpart, so the same categorization of hard faults can also be used to structure a detailed analysis of intermittent faults.

3.1 Crossbar Faults

Crossbar faults occur in an individual memory cell at the point where two nano-wires cross.

* Short: The wires can short, causing failures along both intersecting wires. Mark both wires as faulty.

* Open: Only the single cell will appear to be always 0 or 1. Additional logic to remap single cells would increase the circuit size, so it is better to just mark both wires as faulty.

* Cell: The memory cell itself could fail and appear as always 0 or 1. As before, both crossing wires are marked faulty.

3.2 Wire Faults

Due to their delicate nature nano-wires can easily break, and their close proximity means that a small fabrication error can allow them to come in contact with each other. Note that within a bundle duplicate wires may be selected by the same address, operating as a single line. In some cases a break in one of the wires may be masked by the other.

* Broken: Cells past the point of breakage will not be accessible. Disable the address for the broken wire.

* Touch: Two or more touching wires will probably have different addresses. Once a faulty wire is found the discovery process needs to account for other interacting addresses. All interacting addresses must be marked faulty.

3.3 Contact Faults

All wires in a bundle are activated and then their activation is selectively blocked by enabled decoder input (control) lines. An input line blocks a wire when it is activated and passes it when deactivated. A faulty uncontrolled contact never blocks; conversely, a faulty controlled contact always blocks. There are no hard contact faults possible for the Random-Particle decoder since the contacts are randomly present or not. However, intermittent contact faults might still occur.

* Uncontrolled: A contact is uncontrolled when it fails to block even though its input line is activated. This causes a wire to be activated when it should not be; if an address activates multiple wires they will interfere. With a linear decoder the wire will always be selected, so the address is discarded. With a logarithmic decoder, if each wire is still accessible through a unique address despite the uncontrolled contact, it remains usable; in fact this condition may be undetectable.

* Controlled: When a contact is always controlled the wire is not selected when it should be. In this case the address will not have a detectable wire and is indistinguishable from a wire missing from the bundle.

4. Error Correction

Error correction is required to correct intermittent faults that occur after a chip is fabricated. Reed-Solomon codes are effective at correcting errors in paged access memories. The page is divided into a sequence of k-bit symbols and additional parity symbols are appended to the page. The parity symbols can be used to correct errors in up to a fixed number of symbols determined by design parameters. Any number of bits within a symbol can be corrected; consequently, lengthy sequences of errors can be corrected.

Unlike mesa-scale memory grids, failures in a nano-scale grid are likely to involve complete wires. Serially reading cells along a faulty nano-wire yields a contiguous sequence of failed bits. Reading cells perpendicular to a faulty nano-wire distributes the failures throughout the page. In this case many symbols need correcting, as each failed cell occurs in a different symbol. This requires many correctable symbols and consequently many parity bits.
A more balanced error pattern can be achieved by scanning half of the grid along one axis and half along the other. The grid can be divided into quadrants and accessed along a different axis in each quadrant, as the following diagram illustrates. This cuts the number of distributed faults in half.

    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ----------------||||||||||||||||
    ||||||||||||||||----------------
    ||||||||||||||||----------------
    ||||||||||||||||----------------
    ||||||||||||||||----------------
    ||||||||||||||||----------------
    ||||||||||||||||----------------
    ||||||||||||||||----------------
    ||||||||||||||||----------------
    ||||||||||||||||----------------
    ||||||||||||||||----------------
    ||||||||||||||||----------------
    ||||||||||||||||----------------
    ||||||||||||||||----------------
    ||||||||||||||||----------------
    ||||||||||||||||----------------
    ||||||||||||||||----------------

Dividing each 4096 bit page into three RS(255, 223) code words reduces the number of parity bits needed even further. RS(255, 223) is a Reed-Solomon code with 8 bit symbols, of which up to 16 can be corrected per code word. Each code word contains up to 223 data symbols and 32 parity symbols. Together the three code words use (32 * 8 * 3) 768 parity bits per page. Alternatively, using a single code word for each page requires a 10 bit symbol and three times the parity symbols for the same correction capacity, resulting in (32 * 3 * 10) 960 parity bits per page.

As the page is scanned, cells are transferred to alternating code words (sketched at the end of this section). Invalid data due to a nano-wire failure is then evenly distributed over all three code words. Consequently the number of failures per code word will not exceed the upper bound of 16 corrections per code word.

The data capacity of the three RS(255, 223) code words is (223 * 8 * 3) 5352 bits, which is more than is needed. Rounding up the 4096 bit page size to a multiple of 3 symbols gives (171 * 8 * 3) 4104 data bits. The capacity required of the grid is then (4104 + 768) 4872 total bits per page, which can be stored in (70 * 70) 4900 grid bits. A 70 by 70 grid leaves enough room to store an additional byte per code word that can be used as a checksum. The checksum can detect when errors exceed the correctable limit.

Once a fault is corrected, the corresponding nano-wires are re-mapped to spare bundles. This is done very simply by updating the bundle map and rewriting the page. At some point there might be no more spare bundles available to be remapped. Spare grids can also be added to the chip so that an entire grid can be rewritten to a spare.
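The alternation of scanned cells over the three code words amounts to round-robin interleaving, sketched below. The helper is illustrative and omits the Reed-Solomon parity computation itself.

    def interleave(page_bits, words=3):
        # Distribute scanned cells round-robin over the code words so
        # that a run of bad bits from one faulty nano-wire is divided
        # evenly among them, keeping each code word within its
        # 16-symbol correction limit.
        code_words = [[] for _ in range(words)]
        for i, bit in enumerate(page_bits):
            code_words[i % words].append(bit)
        return code_words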
5. Nano-Memory Simulation

Simulation is used to determine plausible chip configurations for nano-memories. In particular, Core-Shell and Random-Particle decoders are simulated with varying defect rates, bundle sizes, and numbers of decoder inputs.

5.1 Core-Shell Nano-Wires

Core-Shell decoders can double up the contacts to increase reliability. For independent contact failures the probability of a fault is squared. For a given fabrication process contact failures may not be totally independent, in which case the actual failure rate will not be as low as squaring suggests. Random-Particle contacts cannot use double contacts to increase their reliability, as the contacts are random and cannot be duplicated.

Different Core-Shell nano-wires are developed through a chemical design process, and the number of different types of shells that can be made is bounded by the chemistry. Consequently the simulated models seek to use a small but reasonable number of wire types.

The input address map for a Core-Shell decoder can use one bit for each wire type; when a wire type is addressable in a bundle the bit is set. To determine whether a particular wire type (address) is present, the map bits are shifted and the address is decremented each time a one is encountered, while a second counter counts the number of shifts. When the address counter reaches zero, the shift counter holds the input line value (a software sketch of this translation appears below, following the preliminary simulation results). The number of cycles needed for decoding each bundle can be up to the number of wire types. The decoding process has to be faster than the memory access time to achieve maximum access speed. For paged access it is assumed that the memory is slower than DRAM, so there is plenty of time for the computation. A more complex equivalent circuit could perform the same computation in a single cycle.

Example: 7 of 8 addresses are usable in a bundle with map settings of:

    A3:   0  1  2  3  4  5  6  7
    Map:  1  0  1  1  0  1  0  0  1  0  1  1
    A4:   0  1  2  3  4  5  6  7  8  9 10 11

    Hit                      Miss
    A3 = 3 (in)              A3 = 7 (in)
    A4 = 0 (out)             A4 = 0 (out)

    A3 A4                    A3 A4
     2  1                     6  1
     2  2                     6  2
     1  3                     5  3
     0  4                     4  4
     0  5                     4  5
    Hit                       3  6
                              3  7
                              3  8
                              2  9
                              2 10
                              1 11
                              0
                             Miss

To prevent the upper addresses from always being mapped to a miss, the addresses need to be cycled. This can be done by exclusive or'ing the low bits of the bundle index with the input address. In the example above the low 3 bits of the bundle index would be exclusive or'ed with the input address A3.

5.1.2 Single Core-Shell [11, 12]

The decoder for the Single Core-Shell wires modeled here can use either a linear encoding (8 control lines) or a dual-rail log encoder (6 control lines). A preliminary simulation was run to determine reasonable bundle sizes and to observe the effect of disabling individual input addresses in groups. Each simulation was run to determine the number of bundles required along each axis to produce a working 64 by 64 bundle grid. There are 8 wire types with no faults, and 1000 grids were generated. Simulations were not run for the blank fields, as it was apparent after a few runs that 2 bit maps were impractical.

    map bits per bundle -> (bundles per axis, wires, map bits per axis)

    bundle         2                 4                  8
       6                       205, 1230, 820     124,  744, 992
       7                       175, 1225, 700     103,  721, 824
       8     198, 1584, 396    118,  944, 472     *88,  704, 704
       9                       126, 1134, 504     *91,  819, 728
      10     160, 1600, 320    116, 1160, 464     *85,  850, 680
      12     145, 1740, 290    100, 1200, 400      80,  960, 640
      14     125, 1750, 250     88, 1232, 352      77, 1078, 616

Note that the more plausible configurations are marked with an asterisk in this and subsequent tables.

Observations:

* Using fewer map bits than wire types requires many more wires, because addressable wires are being discarded.

* Using fewer wires per bundle than wire types requires more wires and more map bits.

* Using more wires per bundle than wire types uses more wires but fewer map bits, with diminishing returns. 8 to 10 wires per bundle seem reasonable.
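The shift-and-count translation described at the start of this section can be modeled in software as below. This is a minimal sketch: the two hardware counters become a loop, and a hit returns the input line value (the position of the selected map bit). The outcomes match the worked example.

    def translate(map_bits, a3_in, bundle_index=0):
        # Cycle the input address with the low 3 bits of the bundle
        # index so upper addresses are not always mapped to a miss.
        remaining = a3_in ^ (bundle_index & 0x7)
        for position, bit in enumerate(map_bits):
            if bit == 1:
                if remaining == 0:
                    return position          # hit: input line value
                remaining -= 1
        return None                          # miss: address unusable

    # Worked example from section 5.1 (bundle_index = 0):
    MAP = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
    assert translate(MAP, 3) == 5            # hit
    assert translate(MAP, 7) is None         # miss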
Simulations incorporating different fault rates were run next. Ten 70 by 70 grids were generated in each of two runs. The average number of bundles required was used to determine the number of wires and map bits needed for each axis. The first table contains the raw results and the second multiplies out the number of bundles to yield the number of nano-wires and map bits required.

    (run1 run2 | truncated average bundles per grid)

    Fault   8 wire bundles    9 wire bundles    10 wire bundles
      0%    103  99 | 101      98  98 |  98      92  92 |  92
      1%    117 117 | 117     110 110 | 110     103 104 | 103
      5%    125 125 | 125     121 120 | 120     114 112 | 113
     10%    137 137 | 137     129 127 | 128     126 123 | 124
     20%    162 166 | 164     158 157 | 157     151 150 | 150
     30%    201 200 | 200     196 196 | 196     191 191 | 191

    (average bundles, wires, map bits)

    Fault   8 wire bundles    9 wire bundles     10 wire bundles
      0%    101, 808, 808    * 98, 882, 784      92, 920, 736
      1%    117, 936, 936    *110, 990, 880     103,1030, 824
      5%    125,1000,1000    *120,1080, 960     113,1130, 904
     10%    137,1096,1096    *128,1152,1024     124,1240, 992
     20%    164,1312,1312    *157,1413,1256     150,1500,1200
     30%    200,1600,1600    *196,1764,1568     191,1910,1528

Observations:

* Using nine wires per bundle provides a good trade-off between map size and wire utilization.

* For fault rates of 10% and under, wire utilization is dominated by duplicate wire addresses rather than faulty wires. Note that the ideal utilization is (70 * 8) 560 wires; the 0% fault case shows the utilization lost solely to duplicate wires.

5.1.3 Double Core-Shell

Coating nano-wires with two shells reduces the number of different material types and etching steps. Four material types can be used to fabricate up to 12 different wire types. The table below lists the possible combinations of wire coatings and the etch step at which each group is added. In 5 etches there are 9 wire types:

    k1 k2    k1 k3    k1 k4
    k2 k3    k2 k4    k3 k4
    k2 k1    k3 k1    k4 k1

For 11 wire types add a 6th etch:

    k3 k2    k4 k2

For 12 wire types add a 7th etch:

    k4 k3
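The count of 12 wire types follows from treating each type as an ordered (inner shell, outer shell) pair of distinct materials. A quick enumeration makes this concrete (illustrative only; it says nothing about the etching sequence):

    from itertools import permutations

    MATERIALS = ("k1", "k2", "k3", "k4")

    # Each double-shell wire type is an ordered pair of distinct
    # coating materials, so 4 materials yield 4 * 3 = 12 wire types.
    wire_types = list(permutations(MATERIALS, 2))
    assert len(wire_types) == 12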
As before, simulations for ten 70 by 70 grids were run for 9, 11, and 12 wire types, showing the effect of using 5, 6, or 7 etches respectively.

9 wire types with 9 bit shift/count map

    (run1 run2 | truncated average bundles per grid)

    Fault   9 wire bundles    10 wire bundles    11 wire bundles
      0%    104 102 | 103      98  96 |  97       92  93 |  92
      1%    105 104 | 104     101 101 | 101       95  95 |  95
      5%    112 113 | 112     106 106 | 106      102 103 | 102
     10%    123 126 | 124     121 121 | 121      112 113 | 112
     20%    146 147 | 146     143 141 | 142      139 139 | 139
     30%    181 180 | 180     181 181 | 181      176 176 | 176

    (average bundles, wires, map bits)

    Fault   9 wire bundles    10 wire bundles    11 wire bundles
      0%    103, 927, 927    * 97, 970, 873       92,1012, 828
      1%    104, 936, 936     101,1010, 909      * 95,1045, 855
      5%    112,1008,1008     106,1060, 954      *102,1122, 918
     10%    124,1116,1116     121,1210,1089      *112,1232,1008
     20%    146,1314,1314    *142,1420,1278      139,1529,1251
     30%   *180,1620,1620     181,1810,1629      176,1936,1584

11 wire types with 11 bit shift/count map

    (run1 run2 | truncated average bundles per grid)

    Fault   11 wire bundles    12 wire bundles    13 wire bundles
      0%     86  86 |  86       83  81 |  82       78  79 |  78
      1%     88  86 |  87       85  84 |  84       82  82 |  82
      5%     93  93 |  93       90  90 |  90       86  88 |  87
     10%    100 103 | 101      100  98 |  99       95  95 |  95
     20%    124 123 | 123      119 119 | 119      117 117 | 117
     30%    149 149 | 149      147 145 | 146      142 143 | 142

    (average bundles, wires, map bits)

    Fault   11 wire bundles    12 wire bundles    13 wire bundles
      0%     86, 946, 946       82, 984, 902       78,1014, 858
      1%     87, 957, 957       84,1008, 924       82,1066, 902
      5%     93,1023,1023       90,1080, 990       87,1131, 957
     10%    101,1111,1111       99,1188,1089       95,1235,1045
     20%    123,1353,1353      119,1428,1309      117,1521,1287
     30%    149,1639,1639      146,1752,1606      142,1846,1562

12 wire types with 12 bit shift/count map

    (run1 run2 | truncated average bundles per grid)

    Fault   12 wire bundles    13 wire bundles    14 wire bundles
      0%     86  80 |  83       82  81 |  81       77  77 |  77
      1%     81  81 |  81       79  79 |  79       77  77 |  77
      5%     86  88 |  87       84  84 |  84       83  83 |  83
     10%     95  95 |  95       93  90 |  91       88  89 |  88
     20%    109 115 | 112      108 109 | 108      106 108 | 107
     30%    138 135 | 136      133 133 | 133      132 132 | 132

    (average bundles, wires, map bits)

    Fault   12 wire bundles    13 wire bundles    14 wire bundles
      0%     83, 996, 996       81,1053, 972       77,1078, 924
      1%     81, 972, 972       79,1027, 948       77,1078, 924
      5%     87,1044,1044       84,1092,1008       83,1162, 996
     10%     95,1140,1140       91,1183,1092       88,1232,1056
     20%    112,1344,1344      108,1404,1296      107,1498,1284
     30%    136,1632,1632      133,1729,1596      132,1848,1584

Observations:

* Typical wire utilization and map size were close for 8 to 12 wire types. It is probably not worth the additional cost of performing the sixth or seventh etches.

5.2 Random-Particle

Williams and Kuekes describe a scheme for building a decoder based on the random deposition of gold particles [13]. Control lines with random controlled and uncontrolled contacts are produced, and the deposition process is tuned so that there is an even distribution of controlled and uncontrolled contacts.

The number of input lines needs to be larger than in the Core-Shell decoder in order to uniquely address nano-wires. Consequently the dense mapping scheme used for Core-Shell decoders cannot be used; instead, each 3 bit page address is mapped to a setting for each input line. For each bundle, an 8 input line decoder uses an (8 * 8) 64 bit map, a 10 line decoder uses (8 * 10) 80 bits, and a 12 line decoder uses (8 * 12) 96 bits.

A preliminary simulation was run to determine a reasonable number of input lines and bundle size. The six most promising configurations were selected for further analysis; their simulation results are shown in the second set of tables.
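The map sizes above follow from storing one bit per input line for each of the 8 page addresses in a bundle. A small helper (names are illustrative) reproduces the per-axis map bit figures in the tables below:

    def rp_map_bits(bundles, input_lines, pages=8):
        # A Random-Particle decoder stores an explicit input-line
        # setting for every page address, so each bundle needs
        # pages * input_lines map bits.
        return bundles * pages * input_lines

    assert rp_map_bits(78, 10) == 6240   # 10 input lines, 12 wire bundles
    assert rp_map_bits(68, 12) == 6528   # 12 input lines, 14 wire bundles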
    (average bundles, wires, map bits)

    Inputs   bundle 8         bundle 10        bundle 12        bundle 14        bundle 16
       8    120, 960,7680    106,1060,6784    *98,1176,6272     93,1116,5952     93,1488,5952
      10     95, 760,7600    *81, 810,6480    *78, 936,6240    *77,1078,6160     74,1184,5920
      12     83, 664,7968     75, 750,7200     70, 840,6720    *68, 952,6528    *67,1072,6432

    (input lines, wires per bundle) -> (average bundles, wires, map bits)

    Fault    8, 12
             64 bit map
      0%    106,1272, 6784
      1%    112,1344, 7168
      5%    117,1404, 7488
     10%    124,1488, 7936
     20%    138,1656, 8832
     30%    158,1896,10112

    Fault   10, 10            10, 12            10, 14
            80 bit map
      0%     89, 890, 7120    *83, 996,6640      82,1148,6560
      1%     94, 940, 7520    *86,1032,6880      82,1148,6560
      5%     98, 980, 7840    *88,1056,7040      86,1204,6880
     10%    103,1030, 8240     93,1116,7440      87,1218,6960
     20%    114,1140, 9120    103,1236,8240      96,1344,7680
     30%    132,1320,10560    119,1428,9520     109,1526,8720

    Fault   12, 14            12, 16
            96 bit map
      0%     74,1036,7104     74,1184,7104
      1%     75,1050,7200     75,1200,7200
      5%     75,1050,7200     75,1200,7200
     10%    *77,1078,7392     76,1216,7296
     20%     83,1162,7968    *80,1280,7680
     30%     91,1274,8736    *85,1360,8160

Observations:

* The number of nano-wires used is close to that for Core-Shell wires.

* The number of map bits required is about 6 or 7 times larger than for Core-Shell wires.

6. Conclusions

Using a paged address scheme for nano-memories relaxes several design parameters, lowering technical risks. This is particularly relevant for first generation devices. Access is distributed over many bits, ensuring there are no hot spots at the chip or memory cell level. Dispersed access also means longer endurance (number of rewrites) over the life of the chip.

Support logic for paged access is simpler than the logic for random access. This is particularly true of nano-memory, which has to cope with randomized wire placement and high fault rates. Fault management can also result in irregular timing, which is undesirable in a random access memory; the buffers used in paged access memory eliminate the irregularities. The overall access speeds required for paged access are also less stringent than for random access.

The following recommendations are based on the simulation runs:

Single Core-Shell

* Use 8 wire types in 9 wire bundles.
* Use either an 8 bit linear decoder or a 3 bit dual-rail log decoder.
* Double up the decoder input lines to reduce contact faults.

Double Core-Shell

* With a fault rate of 10%, use 9 wire types in 11 wire bundles.
* Use a 4 bit dual-rail log decoder.
* Double up the decoder input lines to reduce contact faults.
* Compress the input map using the shift-and-count mapping method.

Random-Particle

* Use a 10 to 12 bit log decoder.
* Use 12 to 16 wires per bundle.
* As the map size is the same for paged and random access, Random-Particle wires are suitable for either access mode.

All 3 fabrication methods are limited by the small number of wire types, leading to many duplicate nano-wires. At a 10% fault rate duplicate wires dominate wire utilization over faulty wires by about a factor of two. The major limitation of the Core-Shell process is the cost of making additional wire types; the limiting factor for Random-Particle is the large input map size. With a typical 10:1 mesa-scale to nano-scale pitch ratio the potential density increase for nano-memories is 22 to 28 times that of conventional CMOS memory.

ACKNOWLEDGEMENT

Thanks to Eric Rachlin for his assistance with the discovery algorithm for the Random-Particle decoder.

BIBLIOGRAPHY

[1] Stefan Lai (Intel) and Tyler Lowrey (Ovonyx).
OUM - A 180 nm Nonvolatile Memory Cell Element Technology For Stand Alone and Embedded Applications. ftp://download.intel.com/technology/silicon/OUM_doc.pdf

[2] On the Tube. The Economist. May 8th, 2003. http://www.economist.com/science/displaystory.cfm?story_id=1763552

[3] Bradley A. Berg. New Computers Based on Non-Volatile Random Access Memory. July 18, 2003. http://www.techneon.com/paper/nvram.html

[4] Bradley A. Berg. Securing Personal Portable Storage. May 12, 2005. http://www.techneon.com/paper/pen.html

[5] Jack Creasey (Microsoft). Hybrid Hard Drives with Non-Volatile Flash and Longhorn. http://download.microsoft.com/download/9/8/f/98f3fe47-dfc3-4e74-92a3-088782200fe7/TWST05002_WinHEC05.ppt

[6] Michael Alexenko (Maxtor). ATA for the Enterprise: The Present and Future State of ATA. February 21, 2001. http://www.sata-io.org/docs/srvrio0201b.pdf

[7] Andre DeHon. Array-Based Architecture for FET-Based, Nanoscale Electronics. IEEE Transactions on Nanotechnology, Vol. 2, No. 1, pp. 23-32, March 2003.

[8] Andre DeHon, Patrick Lincoln, and John Savage. Stochastic Assembly of Sublithographic Nanoscale Interfaces. IEEE Transactions on Nanotechnology, Vol. 2, No. 3, pp. 165-174, 2003.

[9] Myung-Hyun Lee, Young Kwan Kim, and Yoon-Hwa Choi. A Defect-Tolerant Memory Architecture for Molecular Electronics. IEEE Transactions on Nanotechnology, Vol. 3, No. 1, March 2004.

[10] Philip J. Kuekes, Warren Robinett, Gadiel Seroussi, and R. Stanley Williams. Defect-tolerant Interconnect to Nanoelectronic Circuits: Internally Redundant Demultiplexers Based on Error-correcting Codes. Quantum Science Research, Hewlett-Packard Labs, 1501 Page Mill Road, Palo Alto, CA.

[11] Lincoln J. Lauhon, Mark S. Gudiksen, Deli Wang, and Charles M. Lieber. Epitaxial Core-shell and Core-multishell Nanowire Heterostructures. Nature, Vol. 420, pp. 57-61, 2002.

[12] Dongmok Whang, Song Jin, and Charles M. Lieber. Nanolithography Using Hierarchically Assembled Nanowire Masks. Nano Letters, Vol. 3, No. 7, pp. 951-954, 2003.

[13] Stanley Williams and Philip Kuekes. Demultiplexer for a Molecular Wire Crossbar Network. United States Patent 6,256,767, July 3, 2001. http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=/netahtml/srchnum.htm&r=1&f=G&l=50&s1=6,256,767.WKU.&OS=PN/6,256,767&RS=PN/6,256,767