
SSD FTL #

The SSD FTL (Flash Translation Layer) provides an indirection layer over the physical address space so that the drive conforms to the disk HAL (a linear block-device interface). Because flash does not allow in-place block updates, a mapping from logical to physical addresses is needed.

Block-based #

(Figure: block-based FTL mapping 1)

A write proceeds as follows (a sketch follows the list):

  1. Calculate the logical block number (LBN) from the request's logical page number (LPN)
  2. Look up the physical block number (PBN) in the mapping table
  3. Write to the page at that offset in the physical block
    • if the offset is already occupied, write the update to the same offset in another free block and copy the remaining valid data over.
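
A minimal sketch of this write path in Python, assuming a fixed number of pages per block and using a dict to stand in for the NAND array (class and variable names are illustrative, not from any real firmware):

```python
PAGES_PER_BLOCK = 64

class BlockFTL:
    def __init__(self, free_blocks):
        self.map = {}                    # LBN -> PBN
        self.free_blocks = free_blocks   # erased physical block numbers
        self.flash = {}                  # (PBN, offset) -> data

    def write(self, lpn, data):
        lbn, offset = divmod(lpn, PAGES_PER_BLOCK)   # 1. LBN from LPN
        pbn = self.map.get(lbn)                      # 2. LBN -> PBN lookup
        if pbn is None:
            pbn = self.free_blocks.pop()
            self.map[lbn] = pbn
        if (pbn, offset) not in self.flash:          # 3. offset still free
            self.flash[(pbn, offset)] = data
            return
        # Offset already occupied: write the update to the same offset in
        # a fresh block and copy the remaining valid pages over.
        new_pbn = self.free_blocks.pop()
        for off in range(PAGES_PER_BLOCK):
            if off == offset:
                self.flash[(new_pbn, off)] = data
            elif (pbn, off) in self.flash:
                self.flash[(new_pbn, off)] = self.flash[(pbn, off)]
        # Erase the old block and return it to the free pool.
        for off in range(PAGES_PER_BLOCK):
            self.flash.pop((pbn, off), None)
        self.map[lbn] = new_pbn
        self.free_blocks.append(pbn)

ftl = BlockFTL(free_blocks=[0, 1, 2, 3])
ftl.write(5, "v1")      # first write lands at offset 5 of a mapped block
ftl.write(5, "v2")      # update forces a copy into a fresh block
```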

Remark #

  • Pro: the mapping table is small (one entry per block)
  • Con: heavy garbage collection, caused by the difficulty of finding free blocks whose page at the required offset is still free

Page-based #

(Figure: page-based FTL mapping 2)

Remark #

  • Pro: higher associativity increases the chance of finding a free page in an open block
  • Con: the mapping table is huge; roughly 1 GB of mapping for a 1 TB SSD (arithmetic sketched below)
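
A quick back-of-the-envelope check of that figure, assuming 4 KB pages and 4-byte mapping entries (the entry size is an assumption, not stated above):

```python
# Page-level mapping table size for a 1 TB SSD, assuming 4 KB pages and
# 4-byte physical page numbers per entry.
TB, GB, KB = 2**40, 2**30, 2**10

pages = 1 * TB // (4 * KB)      # 268,435,456 logical pages
table_bytes = pages * 4         # one 4-byte entry per page
print(table_bytes / GB)         # -> 1.0
```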

Hybrid-FTL #

Motivation: harness the space efficiency of block-based mapping while using a small amount of page-based mapping to record updates. For example, if 90% of a 512 GB SSD (256 KB blocks) is covered by block-level mapping, the hybrid FTL needs only about 117 MB instead of 1 GB for fully page-based mapping. 3
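
The arithmetic behind the 117 MB figure, following footnote 3 (256 KB blocks, 4 KB pages, 8 bytes per entry):

```python
# Hybrid mapping size for a 512 GB SSD: 90% block-mapped (256 KB blocks),
# 10% page-mapped (4 KB pages), 8 bytes per entry, as in footnote 3.
GB, MB, KB = 2**30, 2**20, 2**10
cap = 512 * GB

block_entries = cap // (256 * KB) * 0.9     # ~1.9M block entries
page_entries = cap // (4 * KB) * 0.1        # ~13.4M page entries
print((block_entries + page_entries) * 8 / MB)   # -> ~117 MB
```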

  • A new page is written to a data block; subsequent updates to that page go to a log block
  • Log blocks are scarce (to keep the FTL mapping small); when they run out, a merge is required
  • A log block can be fully-associative or block-associative (like a CPU cache):
    • With fully-associative log blocks (a log block can take updates from any data block, and a data block may scatter its updates across multiple log blocks), a full merge can cascade into recursively copying pages.
    • With block-associative log blocks, log-block thrashing becomes the problem.

Another way of thinking about hybrid-FTL 5 #

A hybrid FTL uses page-level mappings for new data and converts them to block-level mappings (via a merge) when it runs out of mapping cache.

(Figure: two write orders and the merges they require)

  • The left write order is fully sequential (ordering + co-location) and requires only a switch merge

  • The right write order has co-location only and requires a full merge to reorder pages (a sketch of both merge paths follows)
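
A toy sketch of the two merge paths, assuming one data block plus one log block represented as Python lists (purely illustrative, not how real firmware stores blocks):

```python
# Toy model: a block is a list indexed by physical page offset; each
# entry holds the data written there.

def merge(data_block, log_block, log_offsets):
    """log_offsets[i] = logical offset of the page stored in log_block[i].
    Returns (merge_kind, new_data_block)."""
    n = len(data_block)
    if log_offsets == list(range(n)):
        # The log block was written fully sequentially: a switch merge
        # simply relabels it as the new data block, with no copying.
        return "switch", log_block
    # Full merge: gather the freshest version of every offset, in order,
    # into a new erased block (n reads + n writes, plus erases).
    latest = dict(zip(log_offsets, log_block))
    new_block = [latest.get(off, data_block[off]) for off in range(n)]
    return "full", new_block

# Left write order in the figure: sequential updates -> switch merge.
print(merge(["a0", "a1", "a2", "a3"], ["A0", "A1", "A2", "A3"], [0, 1, 2, 3])[0])
# Right write order: co-located but out of order -> full merge.
print(merge(["a0", "a1", "a2", "a3"], ["A2", "A0", "A3", "A1"], [2, 0, 3, 1])[0])
```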

Remark #

Hybrid FTL is essentially just a postponed version of block-based FTL: performance suffers when it falls back to block-based behavior under heavy update traffic.

  • Pro: a small mapping table (e.g., ~117 MB instead of 1 GB in the example above)
  • Con: expensive full merges

On-Demand: DFTL 4 #

The de facto FTL today.

Motivation

  • For workloads with high temporal locality, it suffices to keep the working set of page-level mapping entries in SRAM while leaving the cold translation pages in flash.
  • The architecture is reminiscent of a TLB.
    • In its Cached Mapping Table, instead of caching an entire translation page, it caches only individual entries
    • Unlike the offset-indexed lookup of a normal mapping table, it is a fully-associative dictionary (a sketch follows)
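
A minimal sketch of such a cached mapping table, using an OrderedDict as the LRU structure; read_translation_page and write_translation_page are hypothetical callbacks standing in for the flash accesses DFTL performs on a miss or eviction:

```python
from collections import OrderedDict

ENTRIES_PER_TRANSLATION_PAGE = 512   # assumption: 4 KB page / 8 B entry

class CachedMappingTable:
    """LRU, fully-associative LPN -> PPN cache (TLB-like), a sketch only."""

    def __init__(self, capacity, read_translation_page, write_translation_page):
        self.capacity = capacity
        self.cache = OrderedDict()       # lpn -> (ppn, dirty)
        self.read_tp = read_translation_page
        self.write_tp = write_translation_page

    def lookup(self, lpn):
        if lpn in self.cache:            # hit: refresh LRU position
            self.cache.move_to_end(lpn)
            return self.cache[lpn][0]
        # Miss: fetch only this entry's translation page from flash.
        tpage = self.read_tp(lpn // ENTRIES_PER_TRANSLATION_PAGE)
        ppn = tpage[lpn % ENTRIES_PER_TRANSLATION_PAGE]
        self._insert(lpn, ppn, dirty=False)
        return ppn

    def update(self, lpn, ppn):
        self._insert(lpn, ppn, dirty=True)

    def _insert(self, lpn, ppn, dirty):
        if lpn in self.cache:
            dirty = dirty or self.cache[lpn][1]
        self.cache[lpn] = (ppn, dirty)
        self.cache.move_to_end(lpn)
        while len(self.cache) > self.capacity:
            victim, (vppn, vdirty) = self.cache.popitem(last=False)
            if vdirty:                   # write back only dirty victims
                tpn = victim // ENTRIES_PER_TRANSLATION_PAGE
                tpage = self.read_tp(tpn)
                tpage[victim % ENTRIES_PER_TRANSLATION_PAGE] = vppn
                self.write_tp(tpn, tpage)
```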

Architecture

(Figure: DFTL architecture)

  • Cached Mapping Table (CMT): an LRU-based, fully-associative dictionary like a TLB, which relies on temporal locality for a good cache hit rate
  • Global Translation Directory (GTD): the table that tracks where the translation pages live on flash
  • How to read and write?
    • Current Data Block: ingests new writes
    • Page-based mappings allow sequential writes within these blocks, conforming to the large-block sequential-write requirement.
    • For write (and update) requests, DFTL allocates the next available free page in the Current Data Block, writes to it, and then updates the mapping entry in the CMT (a sketch of this write path follows the list)
      • even page-sized, randomly scattered I/O becomes sequential thanks to out-of-place updates
      • johnson: each channel has one open block for incoming writes
      • although an open block can be written in the specified zig-zag sequence, updates still invalidate pages; as free blocks are depleted, copying live data during GC becomes the real cost of random updates
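
A sketch of that write path, reusing the CachedMappingTable sketch above; CurrentDataBlock and flash_write are hypothetical stand-ins, not DFTL's actual code:

```python
class CurrentDataBlock:
    """Hypothetical allocator for the open data block: hands out the next
    free physical page, switching to a fresh block when the current one
    fills up (per-channel open blocks, GC, etc. are omitted)."""

    def __init__(self, free_blocks, pages_per_block=64):
        self.free_blocks = free_blocks
        self.pages_per_block = pages_per_block
        self.block = self.free_blocks.pop()
        self.next_offset = 0

    def allocate(self):
        if self.next_offset == self.pages_per_block:
            self.block = self.free_blocks.pop()   # open a fresh block
            self.next_offset = 0
        ppn = self.block * self.pages_per_block + self.next_offset
        self.next_offset += 1
        return ppn


def dftl_write(lpn, data, cmt, cdb, flash_write):
    """Write path sketch: take the next free page in the Current Data
    Block, program it, then update the cached mapping entry (the old
    physical page implicitly becomes invalid)."""
    ppn = cdb.allocate()
    flash_write(ppn, data)
    cmt.update(lpn, ppn)    # marked dirty in the CMT, flushed on eviction
    return ppn
```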

GC

  • GC is triggered when the number of free blocks drops below a threshold
  • If the victim block is a data block (sketched after this list):
    • Copy its valid pages to the Current Data Block
    • Lazy copying: don't update the translation page right away; if the translation entry is cached in the CMT, update it there directly (it will be written back to flash upon eviction)
    • Batch update: translation entries of the victim block that fall in the same translation page can be updated together with one read-modify-write
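
A sketch of GC over a victim data block along those lines; lpn_of (a reverse lookup, e.g. from the page's out-of-band area) and the flash/translation-page helpers are hypothetical:

```python
from collections import defaultdict

def gc_data_block(valid_ppns, lpn_of, cmt, cdb, flash_read, flash_write,
                  read_translation_page, write_translation_page,
                  entries_per_tp=512):
    """GC sketch for a victim data block: copy valid pages to the Current
    Data Block, prefer fixing mappings in the CMT (lazy copying), and
    batch the remaining translation-page updates by translation page."""
    pending = defaultdict(list)          # translation page no. -> [(lpn, new_ppn)]
    for old_ppn in valid_ppns:           # only the still-valid pages
        lpn = lpn_of(old_ppn)            # reverse lookup, e.g. from OOB area
        new_ppn = cdb.allocate()
        flash_write(new_ppn, flash_read(old_ppn))
        if lpn in cmt.cache:             # lazy copying: update the CMT only
            cmt.update(lpn, new_ppn)
        else:                            # defer, grouped by translation page
            pending[lpn // entries_per_tp].append((lpn, new_ppn))
    for tpn, updates in pending.items(): # one read-modify-write per page
        tpage = read_translation_page(tpn)
        for lpn, new_ppn in updates:
            tpage[lpn % entries_per_tp] = new_ppn
        write_translation_page(tpn, tpage)
```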

Performance

  • Few dirty write-backs: approximately 97% of the evictions in the read-dominant TPC-H benchmark incurred no eviction overhead.
  • The worst-case overhead of a CMT miss is two translation page reads (one for the victim chosen by the replacement algorithm and one for the originally requested entry) plus one translation page write (for the victim)
    • Because a single translation page holds many mappings, CMT entries physically co-located with the victim can be written back in the same batch
      • Read the page once, update multiple entries, and write it back
      • Optimization: scan the CMT for other dirty entries belonging to the same translation page and flush them together, reducing critical-path evictions (sketched below)
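
A sketch of that batched eviction, again against the CachedMappingTable sketch above (helpers are hypothetical):

```python
def evict_with_batching(cmt, read_translation_page, write_translation_page,
                        entries_per_tp=512):
    """Batched eviction sketch: when the LRU victim is dirty, also flush
    every other dirty CMT entry that lives in the same translation page,
    so a single page write cleans many entries at once."""
    victim, (vppn, vdirty) = cmt.cache.popitem(last=False)
    if not vdirty:
        return                             # clean victim: nothing to write
    tpn = victim // entries_per_tp
    tpage = read_translation_page(tpn)
    tpage[victim % entries_per_tp] = vppn
    for lpn, (ppn, dirty) in list(cmt.cache.items()):
        if dirty and lpn // entries_per_tp == tpn:
            tpage[lpn % entries_per_tp] = ppn
            cmt.cache[lpn] = (ppn, False)  # now clean, stays cached
    write_translation_page(tpn, tpage)
```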

Reference #


  1. SSD Survey ↩︎

  2. SSD Survey ↩︎

  3. (512 GB / 256 KB × 0.9 + 512 GB / 4 KB × 0.1) × 8 bytes ≈ 117 MB, assuming each mapping entry consumes 8 bytes (check!) ↩︎

  4. http://www.cse.psu.edu/~buu1/papers/ps/dftl-asplos09.pdf ↩︎ ↩︎

  5. The Unwritten Contract of Solid State Drives: http://pages.cs.wisc.edu/~jhe/eurosys17-he.pdf ↩︎

Last modified: April 1, 2020