Bitcoin, since its January 2009 deployment,1 has experienced exponential growth. As of July 2017,
there are about 16.5 million Bitcoins (BTCs) in circulation; given the current (as of this writing)
BTC-to-USD exchange rate of $2,500, Bitcoin's market capitalization therefore exceeds $41 billion,
making it the most successful of the nearly 1,000 cryptocurrencies in use today (coinmarketcap.com).
Underpinning Bitcoin's success is a series of technological innovations that span from algorithms
to distributed software to hardware. Amazingly, these innovations were not initiated by corporations
or governments but rather emerged through a grass-roots collaboration of enthusiasts.
In this article, I discuss the hardware that maintains the integrity of the Bitcoin system, which
evolved from CPUs to GPUs to field-programmable gate arrays (FPGAs) to application-specific
integrated circuits (ASICs).2 As Bitcoin's value grew, the industry rapidly matured and the system
attained extraordinary scale, equivalent to 3.2 billion high-end GPUs. The latest round of Bitcoin
hardware, dedicated ASICs, has co-evolved with datacenter design, and now most computation is
performed in specialized ASIC datacenters that collectively form an ASIC cloud.3,4

HOW THE BITCOIN SYSTEM WORKS
The Bitcoin system maintains a global, distributed cryptographic ledger of transactions, or
blockchain, through a consensus algorithm running on hardware scattered across the world. These
machines perform a computationally intense proof-of-work function called mining, which integrates
BTC transactions into the blockchain. Each transaction debiting a sender's account and crediting a
receiver's account is aggregated with other pending transactions into a block by a single machine
and posted to the blockchain's head. A block also contains a hash of the previous head block,
creating a total order. Upon receiving notice of a block's posting, other nodes in the system will
verify that the transaction is in order (for instance, not improperly creating, moving, or
destroying BTCs) and then use the new block as the head block for future blockchain updates.
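The chaining described above, where each block carries a hash of the previous head block, can be illustrated with a minimal Python sketch. This is not Bitcoin's actual block format: the JSON encoding, the field names (prev_hash, transactions), and the helper functions are assumptions made purely for illustration.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's canonical JSON encoding (illustrative, not Bitcoin's wire format)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, transactions: list) -> dict:
    """Aggregate pending transactions into a block that points at the current head."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    block = {"prev_hash": prev, "transactions": transactions}
    chain.append(block)
    return block


def chain_is_ordered(chain: list) -> bool:
    """Check that every block commits to the hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
append_block(chain, [{"from": "alice", "to": "bob", "btc": 1.5}])
append_block(chain, [{"from": "bob", "to": "carol", "btc": 0.5}])
print(chain_is_ordered(chain))  # True

chain[0]["transactions"][0]["btc"] = 100  # tamper with an earlier block
print(chain_is_ordered(chain))  # False
```

Because each block commits to its predecessor's hash, altering any earlier block invalidates every later link, which is what gives the ledger its total order.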
SEPTEMBER 2017 59
BLOCKCHAIN TECHNOLOGY IN FINANCE
(a) BTC-to-USD exchange rate versus year. (b) Mining difficulty level versus year.
FIGURE 1. Bitcoin price and mining difficulty trends. (a) The price of Bitcoins (BTCs) took off in
mid-2010, a year and a half after the system went live, and has since risen steadily but with
periods of considerable volatility. (Source: bitcoincharts.com.) (b) Finding a block header hash
value below the target threshold (the algorithm underlying Bitcoin's blockchain) is 850 billion
times more difficult than it was originally. The approximate introduction dates of new mining
technologies are indicated: CPUs, GPUs, field-programmable gate arrays (FPGAs), and
application-specific integrated circuits (ASICs) in different VLSI nodes. (Data from
blockchain.info.)

that (6 exahashes per second). Earning one block corresponds to about 2^71 double SHA-256 hashes,
an impressive amount of computation since each double hash is a few thousand operations itself.
Two factors increase mining difficulty. First, due to rising exchange rates, mining can cover the
cost of more rigs. Second, mining software and hardware have both continually improved. Dips in
difficulty often align with BTC price bubble bursts; in these cases, BTC value did not justify
operating costs for the more inefficient miners, and their operators pulled them offline.

A Cambrian explosion of mining technology
The dots in Figure 1b indicate when new Bitcoin mining technology was introduced. The first
publicly available CUDA-based GPU miner appeared in September 2010, followed a month later by the
first OpenCL miner. Shortly afterward, in November 2010, pooled mining emerged, allowing parties to
mine together and split the rewards pro rata.6 These mining pools rapidly scaled to thousands of
members, giving users small, frequent payouts instead of large 50- or 25-BTC payouts every several
months. By this time, mining a block was equivalent to several months of computation for a single
high-end consumer GPU.
Developers released the first open source FPGA miner code in June 2011. The first ASIC miner
debuted in January 2013 in 130-nm VLSI technology, and more advanced ASICs rapidly followed, racing
to the most advanced 16-nm node by mid-2015.

Performance and energy-efficiency advances
High-end, overclocked six-core CPUs like the Intel Core i7-990x eventually reached 33 megahashes
per second (MH/s) when using SIMD (single instruction, multiple data) extensions. Top-tier
consumer-grade Nvidia GPUs like the GTX 570 reached 155 MH/s, while $450 AMD GPUs like the 7970
performed even better, reaching 0.675 gigahashes per second (GH/s).
The next evolutionary step was FPGA-based miners, which emerged in June 2011. Open source versions
used four Xilinx Spartan-6s, which were less cost-effective in terms of hash search time than AMD
GPUs but operated on 60 W instead of 200 W. A commercial company, Butterfly Labs (BFL), began to
market and sell a range of FPGA miners. These would have supplanted GPU miners due to energy
costs, but the appearance of ASICs provided orders of magnitude cost
FIGURE 3. GPU Bitcoin miners. (a) Open-air rig with five GPUs suspended above the motherboards and connected via PCI Express
extender cables and a single high-wattage power supply. (b) Homebrew 69-GPU mining datacenter. Note the ample power cabling (left)
and cooling system, consisting of box fans and an air duct (right). Photos by James Gibson (gigavps).
customized for Bitcoin mining. Those interested in details on the first four generations should
consult my paper from the 2013 International Conference on Compilers, Architecture, and Synthesis
for Embedded Systems (CASES).2 Much of the information in this paper was drawn from an analysis of
bitcointalk.org's mining hardware forum (bitcointalk.org/index.php?board=76.0), which as of July
2017 had more than 525,000 posts.

CPUs: first-generation miners
The Bitcoin miner source code (github.com/bitcoin/bitcoin/blob/master/src/miner.cpp) is
surprisingly simple. The basic computation

    while (1) {
        HDR[kNoncePos]++;
        if (SHA256(SHA256(HDR)) < (65535 << 208) / DIFFICULTY)
            return;
    }

leverages existing high-performance SHA-256 hashing libraries. One simple optimization employs a
midstate buffer, which hashes the block header's beginning portion that precedes the nonce and has
a constant intermediate hash value. More optimizations are discussed elsewhere.7
The SHA-256 computation takes in 512-bit blocks and performs 64 rounds of a basic encryption
operation involving several long chains of 32-bit additions and rotations, as well as bit-wise XOR,
majority, and mux functions. An array of 64 32-bit constants is also used. Each round depends on
the last, creating a chain of dependencies between operations. Successive SHA-256 rounds cannot be
parallelized, but each nonce trial is parallel in a classic Eureka-style computation, making this
amenable to parallelization. Furthermore, some operations inside a round are parallelizable.
However, typical out-of-order multicore machines have extra hardware optimized for less regular
computations, resulting in wasted performance and energy efficiency.

GPUs: second-generation miners
In October 2010, Bitcoin mining software for GPUs was released on the web, and it was rapidly
optimized and adapted for use in several open source efforts. Typically, this software would
implement the Bitcoin protocol and GPU voltage/temperature/error control in a language such as Java
or Python, and the core nonce-search algorithm as a single OpenCL file (see, for example,
github.com/Diablo-D3/DiabloMiner/blob/master/src/main/resources/DiabloMiner.cl) that was compiled
down by installed runtimes into the GPU's hidden native instruction-set architecture.
GPUs proved much more accessible than FPGAs for Bitcoin enthusiasts, requiring PC-building skills
but no formal training in parallel programming or FPGA tools. After investing resources in a
GPU-based mining rig that was literally minting cash, the natural inclination was to scale up.
Efforts to scale hash rates through GPUs pushed the limits of consumer computing in novel ways. A
crowd-sourced standard evolved,2 wherein five GPUs were suspended over an inexpensive AMD
motherboard with minimum DRAM, connected via five PCI Express extender cables to reduce motherboard
costs, and using a large high-efficiency power supply to drive all GPUs. The system was open-air to
maximize airflow, as Figure 3a shows. These approaches enabled the mining hardware to be amortized
across five GPUs, improving capital efficiency.
After optimizing per-GPU overhead, the next scaling challenge was meeting the prodigious power and
cooling requirements of multiple GPUs. With each GPU consuming 300 W, the power density exceeded
that supported by both high-density datacenters and residential electric grids. Most successful
Bitcoin mining operations typically relocated to warehouse spaces with a large air volume for
cooling and cheap industrial power rates. Figure 3b shows a homebrew datacenter consisting of a
69-GPU rack cooled by an array of 12 box fans and an airduct.

FPGAs: third-generation miners
June 2011 brought the first open source FPGA Bitcoin miner implementations. FPGAs are inherently
good at processing SHA-256's rotate-by-constant and bit-level operations, but not its 32-bit add
operations.
The typical FPGA miner replicated multiple SHA-256 hash functions and unrolled them. With full
unrolling, the module created
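The basic nonce-search loop of the CPU miner, together with the relationship between difficulty and the expected number of hashes per block, can be sketched in Python as follows. This is a simplified illustration rather than the miner.cpp implementation: the all-zero 80-byte header, the nonce offset NONCE_POS, the little-endian interpretation of the digest, and the artificially easy demo target are assumptions chosen so the search finishes quickly on a laptop.

```python
import hashlib
import math
import struct

MAX_TARGET = 65535 << 208  # target at difficulty 1, as in the loop in the text
NONCE_POS = 76             # assumed byte offset of the 32-bit nonce in an 80-byte header


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice, as the mining loop does."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def expected_hashes(difficulty: int) -> float:
    """Average number of nonce trials per block: 2^256 divided by the target."""
    return 2**256 / (MAX_TARGET // difficulty)


def mine(header: bytes, target: int, max_tries: int = 1 << 22):
    """Try successive nonces until the double hash, read as an integer, is below target."""
    buf = bytearray(header)
    for nonce in range(max_tries):
        struct.pack_into("<I", buf, NONCE_POS, nonce)
        digest = double_sha256(bytes(buf))
        if int.from_bytes(digest, "little") < target:
            return nonce, digest
    return None  # no nonce found within max_tries


# A difficulty of 850 billion corresponds to roughly 2^71.6 double hashes per block.
print(math.log2(expected_hashes(850_000_000_000)))

# Demo with a very easy target (about one success per 4,096 trials on average).
result = mine(bytes(80), target=1 << 244)
print(result is not None)
```

Each nonce trial is independent of the others, which is why the search parallelizes so well across GPU threads, FPGA pipelines, and ASIC cores even though the rounds inside a single SHA-256 evaluation are sequential.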
FIGURE 4. ASIC Bitcoin miners. (a) USB hub hosting an array of ASICMiner Block Erupter USB
stick-style miners and a USB-powered cooling fan. Each USB stick's 130-nm ASIC hashes at 330
megahashes per second (MH/s), or about half the MH/s performance of a $450 28-nm AMD Radeon HD 7970
GPU. (b) Bitmain Antminer S1 machine with two parallel sea-of-ASICs printed circuit boards. Photos
by DennisD7 and dogie of bitcointalk.org.
others and rapidly dropped in price. Figure 4a shows a USB hub hosting an array of USB stick-style
Bitcoin miners and a USB-powered cooling fan. Each USB stick has a 130-nm ASIC that hashes at 330
MH/s at 1.05 V and 2.5 W, reaching 392 MH/s at 1.15 V. The ASIC performs one hash per cycle,
mirroring earlier FPGA designs. It is 40 times more energy efficient than the 28-nm AMD 7970 GPU
and 4.4 times cheaper per GH/s.
ASICMiner shares reached 4 BTCs each in October 2013, signifying a 40-fold return to the initial
investors. Of the three early ASIC mining companies, it was the most innovative in trying out new
products and business models.

Avalon
Avalon also secured grass-roots funding through direct presales of units via an online store. A key
founder, N.G. Zhang, established his reputation with the design of a top Bitcoin FPGA board,
Icarus. Avalon focused on a 110-nm TSMC implementation of a double SHA-256 pipeline, measuring
4 x 4 mm, and packaged 300 chips across three blades inside a 4U-ish machine. Like ASICMiner,
Avalon was based in Shenzhen, China. The company preordered sales of 300 rigs, each priced at
$1,299 or 108 BTCs at the time, and hashing at 66 GH/s at 600 W.
Avalon taped out slightly after ASICMiner, with a target date of 10 January 2013. On 30 January,
Bitcoin developer Jeff Garzik became the first customer in history to receive an ASIC mining rig,
which earned about 15 BTCs the first day. Avalon offered batches of 600 rigs for 75 BTCs on 2
February ($1,600) and 25 March ($5,500). They sold out almost immediately. Avalon followed up with
direct chip sales, selling more than 100 batches of 10,000 chips for 780 BTCs per batch, or about
$78,000, enabling others to design systems around the new chips.

THE ASIC WAR: FIFTH-GENERATION BITCOIN MINERS
The next generation of ASICs departed from the first in several ways. After first-generation ASICs
had proven their value in Bitcoin mining, venture capitalists and other investors funded a swath of
start-ups, many featuring industry veterans. Moreover, the competition was not easily beaten GPUs
but rather other ASICs. New ASICs had to best the previous generation in cost/performance and
energy efficiency to be competitive and stay ahead of ever-rising difficulty levels. These
successive generations had two potential sources of innovation: better architectures and more
advanced process nodes. To date, there have been more than 37 different ASIC efforts.
BitFury, with star chip designer Valery Nebesny, reached 55 nm first in mid-2013 with a
best-of-class fully custom implementation in many ways superior to 28-nm designs, reaching 0.8 W
per GH/s and 2.5 GH/s per chip. Sixteen chips were placed on a printed circuit board, and 16 PCBs
went into a backplane. Unlike most other architectures that unrolled double SHA-256 hashes into
long pipelines, BitFury's used rolled hashes that iterate in place. It also introduced support for
string designs, with ASIC power pins connected serially like Christmas tree lights, eliminating the
DC-DC converters that comprise 20 to 40 percent of Bitcoin server cost. BitFury's initial 40,000
chips went to a large datacenter provider that financed the NRE costs. Later, individual chips were
sold, and interesting variants ranging from USB keys to blades were sold by third parties online,
including on Amazon.com.
Sweden-based KnCMiner reached 28 nm by October 2013. Shortly afterward, San Francisco-based Hash
Fast
implemented 28-nm miners, targeting energy efficiencies that matched or exceeded BitFury's designs,
at 0.7 W per GH/s. Figure 4b shows Bitmain's Antminer S1. There is evidence that 21 Inc reached 22
nm around December 2013, but the details are closely guarded secrets.

THE ASIC VICTORS: SIXTH-GENERATION BITCOIN MINERS
Current sixth-generation Bitcoin miners are the products of companies that survived the ASIC war
and advanced to bleeding-edge nodes as they emerged (for example, 20 nm and 16 nm). The two main
publicly known contenders are BitFury (bitfury.com) and Bitmain (www.bitmain.com), which have 16-nm
chips. Both companies' implementations run at ultralow voltages; BitFury miners exceed 0.07 W per
GH/s, which is 100 times more energy efficient than the first 130-nm ASIC miners and 8,000 times
more energy efficient than GPU miners.

new immersion-cooled datacenters in the Republic of Georgia, Iceland, and Finland.10
Merged ASIC development and datacenter operation have become prevalent in the industry for three
reasons. First, the ASIC, enclosing machine, and datacenter can be codesigned. This eliminates the
need to worry about varying customer environments (temperature, customs certification, 220-V/110-V
compatibility, setup and tech support, shipping and returns, warranties, and so on) and enables new
cost, energy-efficiency, and performance optimizations. Second, the time to get an ASIC running is
greatly shortened if the product does not have to be packaged, troubleshot, and shipped to the
customer, which means that the chips can start hashing earlier. This is particularly important when
the network hash rate is increasing exponentially and the bulk of the profits are earned early in a
machine's life. Third, tuning an ASIC chip to exactly meet promised

Bitcoin mining is an example of the emerging class of planet-scale applications. Today, companies
including Apple, Facebook, and Google are deploying planet-scale applications like Siri, Facebook
Live, and Brain, respectively, for which computational demand scales with the number of users just
like with Bitcoin. Ultimately, the TCO of the datacenters that run these computations becomes so
large that it makes economic sense to build specialized ASICs to reduce hardware cost and power
consumption. Following this trend, last year Google announced the creation of neural-network ASICs
for their datacenter workloads.11 Recent ASIC cloud research shows how the lessons from Bitcoin
mining hardware apply to other workloads like YouTube's video transcoding.12 The future of ASIC
clouds is bright, in part due to the many pioneers who took financial, legal, and technical risks
to accelerate Bitcoin development and design an entirely new class of hardware.
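As a rough cross-check of the energy-efficiency figures quoted for the ASIC generations above, the watts-per-GH/s arithmetic can be worked through in a few lines of Python. The hash rates and power figures are taken from the text, with one loudly stated assumption: the 200-W power draw applied to the AMD 7970 is borrowed from the earlier FPGA-versus-AMD-GPU comparison and may not match the exact board the article's 40-times figure was based on.

```python
def watts_per_ghs(power_w: float, hash_rate_ghs: float) -> float:
    """Energy cost of hashing, in watts per gigahash per second."""
    return power_w / hash_rate_ghs


asicminer_usb = watts_per_ghs(2.5, 0.330)  # 130-nm USB stick: 330 MH/s at 2.5 W
amd_7970 = watts_per_ghs(200.0, 0.675)     # ASSUMED 200 W for the 0.675-GH/s GPU
avalon_rig = watts_per_ghs(600.0, 66.0)    # 110-nm Avalon rig: 66 GH/s at 600 W
bitfury_16nm = 0.07                        # W per GH/s, as quoted in the text

print(round(asicminer_usb, 1))                 # 7.6
print(round(avalon_rig, 1))                    # 9.1
print(round(amd_7970 / asicminer_usb))         # 39
print(round(asicminer_usb / bitfury_16nm))     # 108
```

Under these assumptions the first 130-nm ASIC comes out about 39 times more efficient than the GPU, and the 16-nm parts about 108 times more efficient than the 130-nm ASIC, in line with the 40-times and 100-times figures given in the text.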
ACKNOWLEDGMENTS
This work was partially supported by NSF awards 1228992, 1563767, and 1565446, and by STARnet's
Center for Future Architectures Research, a SRC program sponsored by MARCO and DARPA.

REFERENCES
1. S. Nakamoto, "Bitcoin: A Peer-to-Peer Electronic Cash System," 2008; bitcoin.org/bitcoin.pdf.
2. M.B. Taylor, "Bitcoin and the Age of Bespoke Silicon," Proc. Int'l Conf. Compilers,
Architectures, and Synthesis for Embedded Systems (CASES 13), 2013, article no. 16.
3. I. Magaki et al., "ASIC Clouds: Specializing the Datacenter," Proc. 43rd Int'l Symp. Computer
Architecture (ISCA 16), 2016, pp. 178-190.
4. M. Khazraee et al., "Moonwalk: NRE Optimization in ASIC Clouds," Proc. 22nd Int'l Conf.
Architectural Support for Programming Languages and Operating Systems (ASPLOS 17), 2017,
pp. 511-526.
5. J. Light, "For Virtual Prospectors, Life in the Bitcoin Mines Gets Real," The Wall Street J.,
19 Sept. 2013; www.wsj.com/articles/for-virtual-prospectors-life-in-the-bitcoin-mines-gets-real-1379644359.
6. M. Rosenfeld, "Analysis of Bitcoin Pooled Mining Reward Systems," 22 Dec. 2011;
arxiv.org/pdf/1112.4980.pdf.
7. N.T. Courtois, M. Grajek, and R. Naik, "Optimizing SHA256 in Bitcoin Mining," Proc. Int'l Conf.
Cryptography and Security Systems (CCSS 14), 2014, pp. 131-144.
8. J. Barkatullah and T. Hanke, "Goldstrike 1: CoinTerra's First-Generation Cryptocurrency Mining
Processor for Bitcoin," IEEE Micro, vol. 35, no. 2, 2015, pp. 68-76.
9. N. Popper, "Into the Bitcoin Mines," The New York Times, 21 Dec. 2013;
dealbook.nytimes.com/2013/12/21/into-the-bitcoin-mines.
10. A. Kampl, "Bitcoin 2-Phase Immersion Cooling and the Implications for High Performance
Computing," Electronics Cooling, Mar. 2014, pp. 24-29.
11. C. Metz, "Google Built Its Very Own Chips to Power Its AI Bots," Wired, 18 May 2016;
www.wired.com/2016/05/google-tpu-custom-chips.
12. M. Khazraee et al., "Specializing a Planet's Computation: ASIC Clouds," IEEE Micro, vol. 37,
no. 3, 2017, pp. 62-69.