Outline
The Continuing Need For Lower Latency
59 Years (Source: Wikipedia)

  Capacity      Price      $/MByte         IOPS               Latency
  5 MBytes      $57,000    $15,200/MByte   ~1.5 random IOPS   600 ms
  300 MBytes    $60,000    $200/MByte      30 random IOPS     33 ms
  2.52 GBytes   $82,000    $36/MByte       ~160 IOPS total    25 ms
  1.8 TB        -          -               ~150 IOPS          6.6 ms
  2 TB          -          -               500,000+ IOPS      ~60 µs   (2016 NVMe NAND SSD)
Lower Storage Latency Requires Media and Platform Improvements
[Diagram: 3D XPoint memory (SCM) as persistent memory, addressing both media bottlenecks and platform HW / SW bottlenecks]
Addressing Media Latency: Next Gen NVM / SCM Resistive RAM NVM Options

Scalable resistive memory element families and their defining switching characteristics:
- Phase Change: energy (heat) converts material between crystalline (conductive) and amorphous (resistive) phases
- Magnetic Tunnel Junction (MTJ): switching of magnetic resistive layer by spin-polarized electrons
- Electrochemical Cells (ECM): formation / dissolution of nano-bridge by electrochemistry
- Binary Oxide Filament Cells: reversible filament formation by oxidation-reduction
- Interfacial Switching: oxygen vacancy drift/diffusion induced barrier modulation

Crosspoint structure: wordlines, memory element, and selector device form a cross point array in backend layers (~4λ² cell). Selectors allow dense packing and individual access to bits.
Breakthrough material advances: compatible switch and memory cell materials.
*Results have been estimated or simulated using internal analysis, architecture simulation, or modeling, and are provided for informational purposes. Any differences in your system hardware, software, or configuration may affect your actual performance.
3D XPoint Technology Instantiation
3D XPoint Technology Video
Demonstration of a 3D XPoint SSD Prototype
Need to Address System Architecture To Go Lower
[Chart: latency (µs, 0-120 scale) for NAND MLC NVMe SSD (4 kB read) vs. 3D XPoint NVMe SSD (4 kB read) vs. DIMM memory (64 B read)]
Block Storage Platform Changes
Addressing Interface Efficiency With NVMe / PCI
[Chart: latency across HDD + SAS/SATA, NAND SSD + SAS/SATA, NAND SSD + NVMe, 3D XPoint SSD + NVMe, and 3D XPoint persistent memory]
NVMe eliminates ~20 µs of controller latency.
The 3D XPoint SSD delivers < 10 µs latency, roughly 7X lower, and 3D XPoint persistent memory goes lower still.
[Chart: latency (µs) vs. IOPS for SATA, SAS, and PCIe NVMe; series: 2/4/6 WV LSI, 2/4 WV AHCI, and 1 FD with 1/2/4 CPUs]
PCIe NVMe approaches the theoretical max of 800K IOPS at 18 µs.
[Chart: bandwidth (GB/sec): SATA 0.55; 4x PCIe Gen3/NVMe 3.2; 8x PCIe Gen3/NVMe 6.4]
PCIe/NVMe provides more than 10X the bandwidth of SATA, and even more with Gen 4.
Synchronous Completion for Queue Depth 1?
(From Yang, FAST '12: the 10th USENIX Conference on File and Storage Technologies)

Async (interrupt-driven) completion, 9.0 µs total:
- CPU: user, kernel (system call), context switch to another process (P2), device interrupt, return to user (Tb = 1.4 µs, Tu = 2.7 µs)
- Storage device: 4.1 µs
- OS cost per command = Ta + Tb = 4.9 + 1.4 = 6.3 µs

Sync (polling) completion, 4.4 µs from system call to return to user:
- CPU: user, kernel (polls for completion), return to user
- Storage device: 2.9 µs
- OS cost = 4.4 µs
Standards for Low Latency Replication
Persistent Memory Oriented Platform Changes
Why Persistent Memory?
[Chart: latency (µs, 0-25 scale) for NAND MLC NVMe SSD (4 kB read) vs. 3D XPoint NVMe SSD (4 kB read) vs. 3D XPoint DIMM memory (64 B read)]
Open NVM Programming Model
- Interfaces for accessing kernel PM support
- PM-aware file system interfaces for applications accessing a PM-aware file system
- Kernel support for block NVM extensions
- Interfaces for legacy applications to access block NVM extensions
NVM Library: pmem.io
64-bit Linux initially. Open source: http://pmem.io
[Diagram: the application reaches persistent memory either through the standard file API into a pmem-aware file system (kernel space), or directly via load/store through MMU mappings to an Intel 3D XPoint DIMM]
Libraries: libpmem; the transactional libpmemobj, libpmemblk, and libpmemlog; plus libvmem and libvmmalloc.
Write I/O Replaced with Persist Points
[Diagram: the application uses the standard file API through a pmem-aware file system, or load/store through MMU mappings, directly to the NVDIMM; no page cache in the data path]
Operating System Support for Persistent Memory
The Data Path
[Diagram: per-core L1 caches, shared L2 and L3 caches, and NVDIMMs on the memory bus]
New Instructions For Flushing Writes
[Diagram: CLFLUSH, CLFLUSHOPT, and CLWB flush writes from the cache hierarchy (L1/L2/L3) down to the NVDIMMs]
Flushing Writes from Caches
  Instruction      Meaning
  CLFLUSH addr     Cache line flush; available for a long time
Flushing Writes from Memory Controller

  Instruction      Meaning
  PCOMMIT          Persistent commit: flush stores accepted by the memory subsystem
Example Code

  MOV X1, 10        ; X1, X2 are in pmem
  MOV X2, 20
  ...
  MOV R1, X1        ; stores to X1 and X2 are globally visible,
  ...               ;   but may not be persistent yet
  CLFLUSHOPT X1     ; X1 and X2 moved from caches to memory
  CLFLUSHOPT X2
  ...
  SFENCE            ; order the flushes before PCOMMIT
  PCOMMIT
  ...
  SFENCE            ; ensures PCOMMIT has completed
Join the Discussion about Persistent Memory
Learn about the Persistent Memory programming model
http://www.snia.org/forums/sssi/nvmp
Read the documents and code supporting ACPI 6.0 and Linux NFIT drivers
http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
https://git.kernel.org/cgit/linux/kernel/git/djbw/nvdimm.git/log/?h=nd
https://github.com/pmem/ndctl
http://pmem.io/documents/
https://github.com/01org/prd
3D XPoint memory:
- Persistent memory: <1 µsec
- NVMe SSD (ultra-fast SSD): <10 µsec