SSD

Reads and writes (programs) operate at page granularity.
Erases operate at block granularity.
A page cannot be rewritten until the block containing it has been erased.
- Read-modify-write: the data is copied into an internal register, updated there, and written to a free page. The original page is then marked invalid.
When free blocks run out, the garbage collection thread selects a victim block to erase. If the victim still contains valid pages, they must first be migrated to another free block. This extra rewriting is what causes write amplification.
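The interaction between out-of-place updates and garbage collection can be sketched with a toy simulator. This is a minimal model under stated assumptions, not how any particular controller works: the page-level mapping, the greedy victim choice (fewest valid pages), the uniform random workload, and all names and default sizes are hypothetical.

```python
import random

def simulate_write_amplification(num_blocks=64, pages_per_block=32,
                                 utilization=0.8, host_writes=20000, seed=0):
    """Toy page-mapped FTL with greedy GC; returns flash writes / host writes."""
    rng = random.Random(seed)
    num_logical = int(num_blocks * pages_per_block * utilization)
    # Keep enough spare space that GC can always find a victim with invalid pages.
    assert num_logical < (num_blocks - 2) * pages_per_block
    mapping = {}                                 # logical page -> block holding it
    valid = [set() for _ in range(num_blocks)]   # valid logical pages per block
    written = [0] * num_blocks                   # pages programmed since last erase
    free_blocks = list(range(1, num_blocks))
    active = 0                                   # block currently being filled
    flash_writes = 0

    def program(lpn):
        nonlocal active, flash_writes
        if written[active] == pages_per_block:   # active block is full
            active = free_blocks.pop()
        old = mapping.get(lpn)
        if old is not None:
            valid[old].discard(lpn)              # old copy is marked invalid
        valid[active].add(lpn)
        mapping[lpn] = active
        written[active] += 1
        flash_writes += 1

    def collect():
        # Greedy policy (an assumption): erase the full block with the fewest valid pages.
        full = [b for b in range(num_blocks)
                if b != active and written[b] == pages_per_block]
        victim = min(full, key=lambda b: len(valid[b]))
        for lpn in list(valid[victim]):
            program(lpn)                         # migrate live data before the erase
        written[victim] = 0                      # erase
        free_blocks.append(victim)

    for _ in range(host_writes):
        while len(free_blocks) < 2:
            collect()
        program(rng.randrange(num_logical))      # random single-page update
    return flash_writes / host_writes

print(f"write amplification: {simulate_write_amplification():.2f}")
```

In this model, raising `utilization` leaves less spare space, so victim blocks hold more live pages and the ratio grows.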

Charge transfer during program and erase operations causes irreversible oxide stress in the NAND floating-gate cell, which gives the cell a finite lifetime.
The number of times a flash cell has been programmed and erased is counted in program/erase (P/E) cycles.
The lifetime of a page is defined as the maximum number of P/E cycles beyond which the SSD's ECC can no longer correct the page within a guaranteed data retention time ².

Each wordline (WL) corresponds to a page. To read a page, the read reference voltage is applied to its WL while a pass-through voltage is applied to the other WLs.
In TLC NAND, a WL holds three separate pages. The page layout is vendor-specific, as exemplified in the following figure:
- Reading the LSB page is the easiest, while reading the MSB page requires applying multiple reference voltages. MSB read latency is therefore much higher than LSB read latency.

Pages in a block must be programmed sequentially, in a zig-zag order, to avoid disturbing adjacent cells during programming.
Multiple blocks form a plane (see above). Each plane has a cache register that acts as a latch for sensed data. Because of this layout, multiple blocks in the same plane cannot be read simultaneously, whereas pages from different planes can be read in parallel.
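A rough latency model makes the plane constraint concrete. It is a sketch under simplifying assumptions: reads that target the same plane are fully serialized, reads to different planes overlap completely, and bus transfer time is ignored. The function name and the 50 µs sensing time are hypothetical.

```python
from collections import Counter

def read_batch_latency_us(requests, t_read_us=50.0):
    """Estimate the time to serve a batch of (plane, block, page) reads.

    Reads sharing a plane contend for its single cache register, so they are
    serialized; the busiest plane determines the batch latency.
    """
    per_plane = Counter(plane for plane, _block, _page in requests)
    return max(per_plane.values(), default=0) * t_read_us

same_plane = [(0, 0, 0), (0, 1, 0), (0, 2, 0), (0, 3, 0)]   # four blocks, one plane
four_planes = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]  # one read per plane
print(read_batch_latency_us(same_plane))   # 200.0
print(read_batch_latency_us(four_planes))  # 50.0
```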

The real cost of random updates: repeated updates invalidate pages scattered across multiple blocks that still contain other valid pages. As free blocks are depleted, garbage collection must copy that live data, and this copying is the real cost of random updates.
To prevent write amplification:
- Write sizes should be a multiple of the page size, so that invalidated pages can be reclaimed together.
- Small writes should be buffered, coalesced, and aligned.
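One way to follow both rules in an append-only stream is to buffer small writes and hand only whole pages to the device. The class below is a minimal sketch: `CoalescingWriter`, the `sink` callback, and the 4096-byte page size are hypothetical, and it assumes the stream starts at a page-aligned offset.

```python
class CoalescingWriter:
    """Buffers small appends and emits only page-sized multiples to `sink`."""

    def __init__(self, sink, page_size=4096):
        self.sink = sink            # callable that receives bytes to write
        self.page_size = page_size
        self.buf = bytearray()

    def write(self, data):
        self.buf += data
        # Flush the largest prefix that is a whole number of pages.
        full = len(self.buf) - len(self.buf) % self.page_size
        if full:
            self.sink(bytes(self.buf[:full]))
            del self.buf[:full]

    def close(self):
        # Zero-pad the tail so the final write is also page-aligned.
        if self.buf:
            pad = -len(self.buf) % self.page_size
            self.sink(bytes(self.buf) + b"\0" * pad)
            self.buf.clear()

chunks = []
writer = CoalescingWriter(chunks.append)
for _ in range(10):
    writer.write(b"x" * 1000)       # ten small 1000-byte writes
writer.close()
print([len(c) for c in chunks])     # [4096, 4096, 4096]
```

Padding the tail trades a little space for alignment; a real system would also have to record the logical length of the data.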
Cluster block
- An SSD performs reads and writes in units of a clustered block, which is composed of physical pages striped over multiple chips.
- Random writes perform as well as sequential writes when the write size is a multiple of the cluster block size.
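The alignment condition can be checked with simple arithmetic. The helpers below are an illustrative sketch: the function names, the round-robin striping rule, and the example geometry (8 chips, 4 KiB pages) are assumptions, since real layouts are device-specific.

```python
def cluster_block_size(page_size, num_chips, pages_per_chip=1):
    """Bytes covered by one clustered block striped over `num_chips` chips."""
    return page_size * num_chips * pages_per_chip

def stripe(logical_page, num_chips):
    """Map a logical page to (chip, page index on that chip), round-robin."""
    return logical_page % num_chips, logical_page // num_chips

def is_cluster_aligned(offset, length, cluster_size):
    """True if the write starts on a cluster boundary and covers whole clusters."""
    return length > 0 and offset % cluster_size == 0 and length % cluster_size == 0

size = cluster_block_size(page_size=4096, num_chips=8)
print(size)                                      # 32768
print(is_cluster_aligned(32768, 65536, size))    # True: two whole clusters
print(is_cluster_aligned(4096, 32768, size))     # False: straddles two clusters
print([stripe(p, 8)[0] for p in range(8)])       # one page per chip
```

A write that passes the check touches every chip equally, which is why its placement (random or sequential) matters little.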
Parallel write
- Writing a large buffer with one thread is just as fast as writing many smaller buffers with many concurrent threads.
- Multi-threading helps when writing many buffers that can’t be merged.
Sequential write (counterexample)
- Because data was first written randomly to blocks 0 and 1, later sequential updates leave holes through invalidation. Consistently random updates, on the other hand, do not have this problem.

1. Physically Addressed Queueing (PAQ): Improving Parallelism in Solid State Disks, Myoungsoo Jung et al., ISCA 2012.
2. Flash Correct-and-Refresh: Retention-Aware Error Management for Increased Flash Memory Lifetime, Yu Cai et al.
3. Error Characterization, Mitigation, and Recovery in Flash Memory Based Solid-State Drives, Yu Cai et al.