
NVIDIA TESLA P100 GPU ACCELERATOR

World's most advanced data center accelerator for PCIe-based servers
HPC data centers need to support the ever-growing demands of scientists and researchers while staying within a tight budget. The old approach of deploying lots of commodity compute nodes requires huge interconnect overhead that substantially increases costs without proportionally increasing performance.

NVIDIA Tesla P100 GPU accelerators are the most advanced ever built, powered by the breakthrough NVIDIA Pascal architecture and designed to boost throughput and save money for HPC and hyperscale data centers. The newest addition to this family, Tesla P100 for PCIe, enables a single node to replace half a rack of commodity CPU nodes by delivering lightning-fast performance in a broad range of HPC applications.

SPECIFICATIONS
GPU Architecture: NVIDIA Pascal
NVIDIA CUDA Cores: 3584
Double-Precision Performance: 4.7 TeraFLOPS
Single-Precision Performance: 9.3 TeraFLOPS
Half-Precision Performance: 18.7 TeraFLOPS
GPU Memory: 16 GB CoWoS HBM2 at 732 GB/s, or 12 GB CoWoS HBM2 at 549 GB/s
System Interface: PCIe Gen3
Max Power Consumption: 250 W
ECC: Yes
Thermal Solution: Passive
Form Factor: PCIe Full Height/Length
Compute APIs: CUDA, DirectCompute, OpenCL, OpenACC
TeraFLOPS measurements with NVIDIA GPU Boost technology.

MASSIVE LEAP IN PERFORMANCE

[Chart: NVIDIA Tesla P100 for PCIe Performance. Application speed-up (0X to 30X) for NAMD, VASP, MILC, HOOMD-Blue, AMBER, and Caffe/AlexNet, comparing 2X K80, 2X P100 (PCIe), and 4X P100 (PCIe). Baseline: dual CPU server, Intel E5-2698 v3 @ 2.3 GHz, 256 GB system memory; pre-production Tesla P100.]
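The quoted TeraFLOPS figures follow directly from the core count and clock rate. As a rough cross-check (assuming a GPU Boost clock of about 1.3 GHz, which is not stated on this sheet), each CUDA core retires two single-precision FLOPs per cycle via a fused multiply-add, P100's FP64 rate is half its FP32 rate, and its FP16 rate is double:

```python
# Rough cross-check of the Tesla P100 TeraFLOPS figures.
# Assumption (not on this sheet): GPU Boost clock of ~1.30 GHz.
cuda_cores = 3584
boost_clock_hz = 1.30e9
flops_per_core_per_cycle = 2  # one fused multiply-add (FMA) = 2 FLOPs

fp32 = cuda_cores * flops_per_core_per_cycle * boost_clock_hz / 1e12
fp64 = fp32 / 2   # P100 FP64 units run at half the FP32 rate
fp16 = fp32 * 2   # P100 packs two FP16 operations per FP32 lane

print(f"FP32: {fp32:.1f} TFLOPS")  # ~9.3
print(f"FP64: {fp64:.1f} TFLOPS")  # ~4.7
print(f"FP16: {fp16:.1f} TFLOPS")  # ~18.6
```

The result lands on the datasheet's 9.3/4.7/18.7 TeraFLOPS figures to within rounding.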
Tesla P100 PCIe | Data Sheet | Oct16
A GIANT LEAP IN PERFORMANCE
Tesla P100 for PCIe is reimagined from silicon to software, crafted with innovation at every level. Each groundbreaking technology delivers a dramatic jump in performance to substantially boost data center throughput.

PASCAL ARCHITECTURE
More than 18.7 TeraFLOPS of FP16, 4.7 TeraFLOPS of double-precision, and 9.3 TeraFLOPS of single-precision performance power new possibilities in deep learning and HPC workloads.

CoWoS HBM2
Compute and data are integrated on the same package using Chip-on-Wafer-on-Substrate with HBM2 technology for 3X memory performance over the previous-generation architecture.

PAGE MIGRATION ENGINE
Simpler programming and performance tuning with Unified Memory mean that applications can now scale beyond the GPU's physical memory size to virtually limitless levels.
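The 732 GB/s HBM2 bandwidth on the 16 GB model can likewise be sanity-checked from the memory interface. A minimal sketch, assuming a 4096-bit-wide HBM2 interface at roughly 715 MHz with double data rate (interface parameters not stated on this sheet):

```python
# Rough cross-check of the 16 GB P100's 732 GB/s HBM2 bandwidth.
# Assumptions (not on this sheet): 4096-bit bus, ~715 MHz, double data rate.
bus_width_bits = 4096
memory_clock_hz = 715e6
data_rate = 2  # DDR: two transfers per clock

bandwidth_gbs = bus_width_bits / 8 * memory_clock_hz * data_rate / 1e9
print(f"{bandwidth_gbs:.0f} GB/s")  # ~732
```

The very wide, comparatively slow bus is what on-package CoWoS stacking makes practical; a conventional off-package GDDR5 interface would need far higher clocks to reach the same throughput.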

[Charts: "Exponential HPC and hyperscale performance" (TeraFLOPS, FP32 and FP16, for K40, M40, and P100); "3X memory boost" (bi-directional bandwidth in GB/sec for K40, M40, and P100); "Virtually limitless memory scalability" (addressable memory in GB for K40, M40, and P100).]
To learn more about the Tesla P100 for PCIe visit www.nvidia.com/tesla
© 2016 NVIDIA Corporation. All rights reserved. NVIDIA, the NVIDIA logo, Tesla, NVIDIA GPU Boost, CUDA, and NVIDIA Pascal are
trademarks and/or registered trademarks of NVIDIA Corporation in the U.S. and other countries. OpenCL is a trademark of Apple Inc.
used under license to the Khronos Group Inc. All other trademarks and copyrights are the property of their respective owners. OCT16
