
1 Lots of Bits
1.1 Big numbers
1.1.1 Just counting
In the beginning there were just scratches in the sand. Three goats III, six goats IIIIII. That way of representing numbers still turns up occasionally; look at the 2-and-2 count on the old scoreboard at Fenway Park. In the computer world it's called unary notation. Pretty early on someone realized that it was hard to tell who had more goats if one person had IIIIIIIIIIIIIIIIIIIII and the other person had IIIIIIIIIIIIIIIIIIII. So abbreviations were devised. In the Italian peninsula they used an acute angle V for five and a cross X for ten. This made it easier to see that the goatherd with XXI goats had more goats than the one with only XX. Around the beginning of the Christian era, when chiseled inscriptions became all the rage, the Romans used the letters of the alphabet in place of the marks, so one became the letter I, five became the letter V, and ten became the letter X. Roman numerals were not the first system for representing numbers. And they were certainly not the best. In fact the system was so clumsy for arithmetic that even the Romans didn't use it for that. But it does illustrate a central point. Whoever it was that first wrote X instead of IIIIIIIIII was performing the first act of data compression: representing the same information with a shorter string of symbols. In this case the compression method was pretty simple: use several symbols instead of just one so the representation becomes shorter.

1.1.2 Decimal notation


The big breakthrough in the art of writing numbers was the development of decimal notation. In decimal notation there are ten symbols, 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9, called digits. The string 473, for example, means 4 hundreds plus 7 tens plus 3. The rightmost position represents units, and every position further to the left has ten times as much weight as the position just to its right.


473 = 4 × 100 + 7 × 10 + 3 × 1.


This representation is very efficient: it takes only three symbols to represent a number that would be hundreds of symbols long in unary. And unlike Roman numerals, which require the invention of more and more individual symbols to represent larger and larger numbers, decimal notation uses just ten symbols and allows representation of numbers of any size. Moreover, the rules for producing and interpreting these strings of digits are extremely simple and regular. For example: "To represent a number ten times as large, slide all the digits one place to the left and put a 0 as the new rightmost digit."
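The positional rule translates directly into a program. Here is a minimal Python sketch, not part of the original notes (the function name decimal_value is ours, for illustration):

```python
def decimal_value(numeral):
    """Interpret a string of decimal digits positionally."""
    value = 0
    for digit in numeral:
        # Sliding the digits accumulated so far one place to the left
        # multiplies by ten; the new digit takes the units position.
        value = value * 10 + int(digit)
    return value

assert decimal_value("473") == 4 * 100 + 7 * 10 + 3 * 1
assert decimal_value("4730") == 10 * decimal_value("473")  # the "times ten" rule
```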

1.1.3 Scientific notation


Yet for bigger numbers, decimal notation itself becomes hard to read. Which is bigger, 100000000000 or 10000000000? Huge numbers are usually represented using so-called scientific notation, as a small number shown multiplied by a power of ten. So we might have written 473 in scientific notation as 473 = 4.73 × 10^2, except that in that case ordinary decimal notation is actually more concise. But writing 100000000000 and 10000000000 as 10^11 and 10^10 makes it clear which is larger.
10^n = 10 × 10 × ⋯ × 10 (n tens multiplied together) = 1 followed by n 0's.

Usually scientific notation is used when the exact value of a number is not known, and what is shown is really an approximation. That's another reason for not using scientific notation for a number such as 473. But one might say something like "The sun is 9.3 × 10^7 miles from the earth," because we don't know the number of miles to the sun down to the last mile. Just writing a power of ten, such as 10^11, suggests that you mean that whole number exactly. But writing 1.00 × 10^11 suggests that you are referring to some measurable quantity that is approximately that number but might actually be something like 1.002743 × 10^11. There is no mathematical difference between 10^11 and 1.00 × 10^11; it's just a subtlety of what's implied by the way these expressions are used. It gets hard pretty quickly to associate numbers as big as that with things in our everyday experience. Here is a table that may help a bit.

Number              | Count                                   | Distance (meters)                | Time (seconds)
1 = 10^0            | You                                     | Human body                       | Heartbeat
10 = 10^1           | Blocking group                          | House                            | Speak a sentence
100 = 10^2          | Dormitory                               | Football field                   | Brush your teeth
1000 = 10^3         | Freshman class                          | Across the campus                | Eat a meal
10^4                | Harvard students                        | Boston                           | Final exam
10^5                | Cambridge                               | Massachusetts                    | Sunrise to sunrise
10^6 = million      | Montana                                 | Boston to Chicago                | Spring break
10^7                | Massachusetts                           | Boston to Hawaii                 | Grow a crop
10^8                | US females                              | Twice around earth               | College
10^9 = billion      | China                                   | One moon orbit                   | Mozart's life
10^10               | People on earth                         |                                  | Harvard's life
10^11               | People who ever lived; neurons in brain | To the sun                       | Civilization
10^12 = trillion    |                                         | To Jupiter                       | Homo sapiens sapiens
10^13               |                                         | To Pluto                         | Fire tamed
10^14               | Cells in human body                     |                                  | Hominid bipedalism
10^15 = quadrillion |                                         |                                  | Monkeys
10^16               |                                         | Distance light travels in a year | Insects
10^17               |                                         | To nearest star                  | Photosynthesis
10^18               | Insects on earth                        |                                  | Origin of universe
10^19               |                                         | Thickness of Milky Way           |
10^20               |                                         |                                  |
10^21               | Grains of sand on earth's beaches       | Across the Milky Way             |
10^22               | Stars in the universe                   | To nearest major galaxy          |
10^23               |                                         |                                  |
10^24               |                                         |                                  |
10^25               | Atoms in a pound of iron                | Diameter of universe             |
10^26 ... 10^30     |                                         |                                  |
10^31               | Bacteria on earth                       |                                  |
10^50               | Atoms in the earth                      |                                  |
10^85               | Particles in universe?                  |                                  |

Each row of the table starts with a power of ten, so each successive row represents a scale ten times that of the previous row. Put differently, each row is one order of magnitude greater than the previous row. The second column includes something whose number is around that size. Some of these numbers, especially the larger ones, are rather speculative, but the attempt has been to show a quantity whose exact size is within a factor of two of the number in the left column. For example, the official Chinese estimate of the population of that country was 1.295 billion in 2004, though some other estimates are as high as 1.5 billion. Either way, the order of magnitude is 10^9. Similarly the third column shows distances in meters, and the fourth column shows times in seconds. Since the big bang was on the order of 10^18 seconds ago, and the diameter of the universe is on the order of 10^25 meters, the rest of those columns simply can't be filled in. There just aren't any distances or times greater than the last ones listed in those columns. Reading down the distance and time columns gives one a foreshortened sense of the universe, like the famous New Yorker cartoon showing Manhattan in the foreground and most of the world in half the page, just on the other side of the Hudson River. Going up by factors of ten results in enormous jumps of space and time in only a few steps.



The numbers of things can get very large indeed. There are a billion times more bacteria on earth than there are stars in the universe. If you were to lay down atoms at a rate of one per meter, a pound of iron would suffice to get you across the entire universe. If you had started enumerating the grains of sand on the earth's beaches on the day the universe was born and had counted steadily at a rate of one grain per second, you would by today have gotten through only a thousandth of the sand. Big as these numbers are, the biggest are dwarfed by a googol. Before Google (spelled differently) was the name of a search engine that I used to find some of these numbers, "googol" was a word invented to designate 10^100, a 1 followed by a hundred zeroes: 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000. That's a quadrillion times the number of particles in the universe.

1.2 Naming things


1.2.1 How many things are there, or could there be?
A googol of something would be a lot of them, so a googol is a big number. Yet it takes only a couple of lines to write down a googol in decimal notation. If you had a googol of things and you wanted to give them each a different serial number, you could identify each one of them by a string of fewer than a hundred digits. The serial numbers would serve as names. Each thing could be uniquely identified by its number. The reason we don't need numbers as big as a googol very often is that there just aren't that many different things that we ever have occasion to name. In fact, there aren't even that many different things in the universe. The number of things that could exist is quite another matter. For example, imagine a parking lot with twenty spaces, and suppose the same twenty cars park in the lot every day, but not necessarily in the same spaces. The number of different arrangements of cars in parking spaces turns out to be 20 × 19 × 18 × ⋯ × 1 = 2,432,902,008,176,640,000, a number bigger than the number of seconds since the big bang. The number of ways the cars have ever actually been arranged is a lot smaller: no more than the number of days in the lifetime of a car, most likely. If you are inclined to object that a parking lot with twenty cars is not a thing but an arrangement of other things, remember that your body is an arrangement of atoms that have been around for billions of years rearranged in other ways to constitute other things. Inevitably, most of the things that could exist, don't. This is convenient, because it means that the names of things don't have to be too long. If you wanted to give a serial number to every atom of the earth, the numbers would have to be only 50 digits long, because there are fewer than 10^50 atoms in all. Even if you wanted to give a different number to every particle in the universe, any one serial number would have to be only 85 digits. And you could, in theory, give a different ten-digit telephone number to every human being who has ever lived. The names of things (the strings of symbols that can be used to identify things) can be a lot shorter than the count of things might suggest.
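The arithmetic in the last two paragraphs is easy to check. Here is a short Python sketch, assuming only the standard math module:

```python
import math

# Ways to arrange 20 cars in 20 spaces: 20 x 19 x ... x 1, i.e. 20 factorial.
arrangements = math.factorial(20)
print(arrangements)             # 2432902008176640000

# Seconds since the big bang, on the order of 10^18, is smaller.
assert arrangements > 10**18

# Digits needed for serial numbers: 50 digits suffice to number
# 10^50 atoms, and 85 digits suffice for 10^85 particles.
assert len(str(10**50 - 1)) == 50
assert len(str(10**85 - 1)) == 85
```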

1.2.2 Numerals vs. numbers


This is a good place to start watching our language. Why are things like XVII called Roman numerals, rather than Roman numbers? Because a numeral is a string of symbols that represents a number. Numbers themselves are abstract things. So from now on we will talk about decimal numerals when referring to strings of decimal digits. So how much smaller is a decimal numeral than the number it represents? Well, if there are n things to be named with strings of l decimal digits, then l has to be big enough so that 10^l ≥ n: nine digits to represent up to 10^9 things, and so on. The base 10 logarithm of a number n, or in symbols log10 n, is the power to which you have to raise 10 to get n,
10^(log10 n) = n.

So the length l of the numerals needs to be at least log10 n, where n is the number of things to be represented. If n is exactly a power of 10, that is the end of the story: three digits to represent 1000 things numbered 000 to 999, four to represent 10,000 things, and so on. If n is not exactly a power of 10, you still need as many digits as the next larger power of 10. For example, if you want decimal numerals for 597 things, you need three digits, the same as would be needed if there were 1000 things, 1000 being the next power of ten larger than 597. You wouldn't have to name them sequentially, 000 through 596, but you would have to use sequences of length three, since there are only 100 strings of two decimal digits. Another way to look at it is that if log10 n is not a whole number, the number of digits needed is the next whole number larger than log10 n. For example, to name 597 different things with decimal numerals, we need three-digit numerals, because log10 597 = 2.77597433 and the next integer larger than that number is 3. Of course you don't actually need to calculate 2.77597433; all you need to see is that the next power of 10 greater than or equal to 597 is 1000 or 10^3. We write ⌈x⌉ for the next integer greater than or equal to x, so for example ⌈3.2⌉ = 4 and ⌈17⌉ = 17 and ⌈log10 597⌉ = ⌈2.77597433⌉ = 3. So then the number of decimal digits needed to assign a different numeral to every one of n things is exactly l = ⌈log10 n⌉. Suppose we use the twenty-six letters of the Roman alphabet to name things rather than the ten decimal digits; how much shorter could the names be? Well, there are 26^l different sequences of exactly l letters: 26 letters, 26 × 26 = 676 two-letter combinations aa, ab, ac, ..., zy, zz. So to give a different name, say, to every star in the universe we'd need the length l of the strings to be long enough so that 26^l ≥ 10^23. That one you can do on a scientific calculator by just multiplying 26 by itself repeatedly and counting how many times you have to do it before the answer is at least 10^23, but there is a better way. For 26^l ≥ 10^23 we need l to be an integer greater than or equal to log26 10^23. We want l to be as small as possible, so we should have
l = ⌈log26 10^23⌉
  = ⌈23 × log26 10⌉           (because log_b(x^c) = c × log_b x)
  = ⌈23 × (log 10 / log 26)⌉  (because log_b x = log x / log b)
  = ⌈23 × 0.7067⌉
  = ⌈16.254⌉
  = 17


letters per name. So seventeen letters are enough to name all the stars in the universe; that's shorter than a lot of the names people give their children!
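For readers who want to check the computation, here is a Python sketch of the same calculation (assuming the standard math module):

```python
import math

# Smallest l with 26^l >= 10^23, i.e. l = ceil(23 * log26(10)).
l = math.ceil(23 * math.log(10) / math.log(26))
print(l)                        # 17

# Check the boundary exactly, with integer arithmetic.
assert 26**17 >= 10**23 and 26**16 < 10**23
```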

1.2.3 A few basic facts about logarithms


A few basic facts about logarithms are used here. This would be a good moment to review the laws mentioned, if your memory of this part of high-school math is rusty. All the basic facts make sense in the context of the simple rule that the length of the number n as expressed in base b notation is about log_b n. So here are the basics you need to remember. (If the base of the logarithm is the same on both sides of an equation, it isn't mentioned at all.)
log(m × n) = log m + log n. The product of a 5-digit number and a 10-digit number takes about 15 digits to write down.

log(m/n) = log m − log n. Really the same as the previous rule, expressed differently, since it amounts to log m = log(n × (m/n)) = log n + log(m/n).

log(n^a) = a × log n. For example, if n is a four-digit number, then n^5 will be about a 20-digit number. This rule works when a is negative too, and the rule about lengths makes sense in this context if you interpret a number of length −5 as a fraction less than 1 whose first nonzero digit is 5 places to the right of the decimal point.
log_b n = log_a n / log_a b, for any a. For example, since log_2 10 ≈ 3.32, decimal numerals are about a third the length of the corresponding binary numerals.

Note that in a quotient such as log x / log y, it doesn't matter what the bases of the logarithms are, as long as the base is the same in the numerator as in the denominator. (Use whatever is handy on your calculator; sometimes it's base 10, sometimes it's base e.) The general rule is the important one. If you are using d different symbols to give names to n different things, you need strings of length ⌈log_d n⌉.
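These rules can also be checked numerically. A Python sketch, with arbitrarily chosen values of m, n, and a:

```python
import math

m, n = 12345.0, 678.0

# Product and quotient rules.
assert math.isclose(math.log10(m * n), math.log10(m) + math.log10(n))
assert math.isclose(math.log10(m / n), math.log10(m) - math.log10(n))

# Power rule, with a negative exponent.
a = -3.0
assert math.isclose(math.log10(n ** a), a * math.log10(n))

# Change of base: the quotient log x / log y is the same in any base.
assert math.isclose(math.log2(n), math.log10(n) / math.log10(2))
assert math.isclose(math.log2(n), math.log(n) / math.log(2))  # natural log too

# The general naming rule: d symbols, n things, names of length ceil(log_d n).
assert math.ceil(math.log(597) / math.log(10)) == 3
```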




1.3 Binary notation


It's only an accident of human anatomy that we use base 10 notation rather than some other base. People learn to count on their fingers, so that first goatherd who wrote X for ten might well have just been keeping track after he ran out of fingers. There are traces of base 20 and base 5 in human culture, but base 10 is pretty much a cultural universal. For computers it is more natural to use base 2, or binary, since voltages and currents tend to be either on or off and magnetic fields tend to be either this way or that way. Now this isn't strictly true; a voltage inside a computer that is supposed to be 5 or 0 volts is almost never 5.000000000000 volts or 0.000000000000 volts, it is just near one of those values. How near, and how the ideal of a binary computer is realized in a real world where no measurable quantity is completely precise, will be an important subject, but not yet.

1.3.1 Bits = BInary digiTS


The two symbols in binary notation are 0 and 1, and they are called bits, which is short for binary digits. All the rules of decimal arithmetic work on binary numerals, if you remember that 1 + 1 = 10, or in other words, 1 plus 1 is 0, carry a 1. This means that just as in decimal notation a 1 followed by n zeroes represents 10^n, so in binary notation a 1 followed by n zeroes represents 2^n.

n   | 2^n (binary) | 2^n (decimal)
0   | 1            | 1
1   | 10           | 2
2   | 100          | 4
3   | 1000         | 8
4   | 10000        | 16
5   | 100000       | 32
6   | 1000000      | 64
7   | 10000000     | 128
8   | 100000000    | 256
9   | 1000000000   | 512
10  | 10000000000  | 1024
16  |              | 65536
20  |              | 1,048,576
32  |              | 4,294,967,296 ≈ 4.295 × 10^9
64  |              | ≈ 1.845 × 10^19
128 |              | ≈ 3.403 × 10^38

1.3.2 Binary to decimal, decimal to binary


If you memorize the first ten rows of this table, you will be better off. For example, it becomes easy to figure out what number is represented by a binary numeral such as 101010. That numeral has 1s in positions 1, 3, and 5 counting from the right and starting with position 0, so

101010₂ = 2^5 + 2^3 + 2^1 = 32 + 8 + 2 = 42.

Position       | 5  | 4  | 3 | 2 | 1 | 0
Bit            | 1  | 0  | 1 | 0 | 1 | 0
Position value | 32 | 16 | 8 | 4 | 2 | 1
Bit value      | 32 | 0  | 8 | 0 | 2 | 0

To convert a number n from decimal to binary, first find the largest power of 2 less than or equal to n. There will be a 1 in the bit position corresponding to that power. Subtract that power of 2 from n and repeat the procedure starting with the remainder. For example, to convert 19 to binary, find the largest power of two less than or equal to 19, which is 16 or 2^4; subtracting that from 19 leaves 3, which is 2^1 + 2^0. So the binary numeral for 19 would have 1s in positions 4, 1, and 0, in other words 10011.

19 = 16 + 2 + 1 = 1×2^4 + 0×2^3 + 0×2^2 + 1×2^1 + 1×2^0 = 10011₂
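The procedure just described translates directly into code. A Python sketch (the function names to_binary and from_binary are ours):

```python
def to_binary(n):
    """Convert a positive integer to a binary numeral by repeatedly
    subtracting the largest power of 2 that still fits."""
    power = 1
    while power * 2 <= n:          # find the largest power of 2 <= n
        power *= 2
    bits = ""
    while power >= 1:
        if n >= power:             # a 1 in this position; subtract and continue
            bits += "1"
            n -= power
        else:
            bits += "0"
        power //= 2
    return bits

def from_binary(bits):
    """Sum the powers of 2 at the positions holding a 1."""
    return sum(2**i for i, b in enumerate(reversed(bits)) if b == "1")

assert to_binary(19) == "10011"
assert from_binary("101010") == 42
```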




1.3.3 Most and least significant bits


Whatever the base, decimal, binary, or otherwise, the digits at the left end of the numeral say more about the approximate size of the number than do the digits at the right end. The fact that there is a 1 at the fourth bit position from the right in the binary numeral 10011 indicates that the number represented is within a factor of 2 of 2^4 = 16. (The positions are counted starting with 0 at the right end.) The fact that there is a 1 at the rightmost bit position tells us little about the approximate value of the number. Similarly in decimal. The leading digit of 3 in the decimal numeral 342 tells us that the number is in the range of 3 × 10^2; the 2 at the right end provides no information by itself about the approximate size of the number. For this reason the leftmost or leading nonzero digit of a numeral is called the most significant digit. If the numeral is in binary, the leftmost bit is called the most significant bit. Digits or bits towards the right end are called the least significant. You will hear computer geeks, when asked a simple yes or no question, reply "yes, but that is only the most significant bit of the answer," meaning that the real answer is more nuanced than a single-bit response would allow.

1.3.4 Negative numbers


There are 2^n different patterns of n bits, enough to represent all the numbers from 0 to 2^n − 1 if there were no need to represent negative numbers too. But usually half of the bit-patterns are used to represent negative numbers and half are used for 0 and positive numbers. Usually two's complement notation is used for the negative numbers: to get the notation for −n, take the bit-pattern for n, complement all the bits (replacing 1 by 0 and vice versa), and add 1. For example, the sixteen-bit representation for +9 is 0000000000001001, so −9 is obtained by first complementing all the bits, yielding 1111111111110110, and then adding 1, to get 1111111111110111. In two's-complement notation, the most significant bit is the sign bit: 1 means negative and 0 means positive or zero. But the other bits are not just the notation for the corresponding positive number; −9 is not represented by a sign bit of 1 followed by the notation for +9. Two's complement arithmetic has the great advantage that numbers can be added and subtracted without checking whether they are positive or negative; the same algorithm works for two positive numbers, two negative numbers, or a positive and a negative number. (Try adding +9 and −9; you should get all 0s, with a carry propagating all the way off the left end of the sum.) A small oddity is that given a fixed number of bits, say n, it is possible to represent one more negative number than positive number. (Something like this has to be true, since there is an even number, 2^n, of bit-patterns, one of them has to represent 0, and that leaves an odd number to be used for positive and negative numbers.) So the biggest positive number that can be represented using n bits is 2^(n−1) − 1; for example, the numbers that can be represented using 16 bits range from −32,768 to +32,767.
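Here is a Python sketch of the complement-and-add-1 recipe and the all-0s check (the function name is ours):

```python
def twos_complement(n, width=16):
    """Bit pattern for n (possibly negative) in `width` bits:
    complement the bits of |n| and add 1, as described above."""
    if n >= 0:
        return format(n, f"0{width}b")
    pattern = format(-n, f"0{width}b")
    flipped = "".join("1" if b == "0" else "0" for b in pattern)
    return format(int(flipped, 2) + 1, f"0{width}b")

assert twos_complement(9) == "0000000000001001"
assert twos_complement(-9) == "1111111111110111"

# Adding +9 and -9 gives all 0s once the carry off the left end is discarded.
total = (int(twos_complement(9), 2) + int(twos_complement(-9), 2)) % 2**16
assert total == 0

# The representable range for 16 bits.
assert (-(2**15), 2**15 - 1) == (-32768, 32767)
```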

1.3.5 Binary logarithms


Because binary notation is so widely used in the computer world, base 2 logarithms are also widely used, and there is a special abbreviation for binary logarithms: lg n = log2 n. The number of bits needed to represent a positive number n is about lg n, or lg n + 1 if you need to account for the fact that half the patterns in two's complement notation must be reserved for negative numbers.
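A quick Python illustration (Python's built-in int.bit_length happens to compute exactly this count):

```python
import math

# Bits needed to write n in binary: the next integer above lg n,
# which is exactly floor(lg n) + 1.
for n in (19, 42, 1024, 65536):
    assert n.bit_length() == math.floor(math.log2(n)) + 1
    print(n, n.bit_length())
# One extra bit (the sign bit) is needed if negative numbers
# must share the patterns, as in two's complement.
```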

1.4 Bits, bytes, and nibbles


1.4.1 ASCII
A string of eight bits can represent any number between zero and 2^8 − 1 = 255 inclusive, in exactly the same way that a string of three decimal digits can represent any number between zero and 10^3 − 1 = 999 inclusive. But strings of bits can be used to represent things other than numbers. For example, the letters of the alphabet, plus punctuation marks and other symbols, are represented as sequences of bits by means of various codes. The most standard of these is called ASCII, which stands for the American Standard Code for Information Interchange. In ASCII, the upper-case letter A is represented as 01000001, and the lower-case a is 01100001. We'll say that 01000001 is the code word associated with the symbol A. ASCII is an eight-bit code, so 256 different symbols can be represented using 8-bit code words. Sequences of symbols can then be represented by concatenating their code words; for example, abc = 01100001 01100010 01100011. We've left a little white space between the code words for visual clarity, but it's not needed to translate the sequence of 24 bits back into three symbols, since it is known in advance that code words are always exactly 8 bits long.
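Here is a Python sketch of encoding and decoding with fixed-length 8-bit code words (the function names are ours):

```python
def to_ascii_bits(text):
    # Encode each character as its 8-bit ASCII code word, then concatenate.
    return "".join(format(ord(c), "08b") for c in text)

def from_ascii_bits(bits):
    # Fixed-length code words: just cut the stream into 8-bit chunks.
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))

coded = to_ascii_bits("abc")
assert coded == "011000010110001001100011"
assert from_ascii_bits(coded) == "abc"
```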

1.4.2 Bytes
A chunk of eight bits is a convenient unit of information, in part because of its correspondence with the length of the code words for characters. A unit of eight bits is called a byte. Reportedly the person who coined that term chose that spelling because he feared that if he called it a "bite," people typing it might think he meant "bit" and change the meaning by mistakenly repairing the spelling.

1.4.3 Hexadecimal
Strings of more than a few bits are hard to read and copy. A convenient notation uses sixteen different symbols for the sixteen different patterns of four bits. The first ten symbols used are the same as the ten decimal digits, in order. That leaves the six patterns whose decimal values would be 10 through 15; the first six letters of the Roman alphabet are used for those patterns. This is a base-sixteen or hexadecimal notation, with the sixteen hexadecimal digits being 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F.

Bit pattern | Hexadecimal digit
0000        | 0
0001        | 1
0010        | 2
0011        | 3
0100        | 4
0101        | 5
0110        | 6
0111        | 7
1000        | 8
1001        | 9
1010        | A
1011        | B
1100        | C
1101        | D
1110        | E
1111        | F

We won't be doing hexadecimal arithmetic (though there are people who can do that in their sleep!). All we need to be able to do is to transcribe a string of bits into hexadecimal digits by breaking the bit string into chunks of length four and translating the chunks into hex digits. For example, the ASCII code for the string abc mentioned above would be rendered in hex as follows.
0110 0001 0110 0010 0110 0011
  6    1    6    2    6    3

011000010110001001100011 = 616263₁₆. If a string of eight bits is a byte, then a string of four bits, or half a byte, has to be a nibble. Sorry!
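The transcription is mechanical, as a short Python sketch shows (the function name is ours):

```python
HEX_DIGITS = "0123456789ABCDEF"

def bits_to_hex(bits):
    """Transcribe a bit string whose length is a multiple of 4
    by translating each 4-bit chunk (nibble) into one hex digit."""
    return "".join(HEX_DIGITS[int(bits[i:i + 4], 2)]
                   for i in range(0, len(bits), 4))

assert bits_to_hex("011000010110001001100011") == "616263"
```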

1.4.4 Big-endian and little-endian


Today all computers are organized internally to move and store eight bits at a time, or some multiple of eight bits (32 or 128, for example). If more information than that needs to be moved, it has to be moved in a sequence of steps, a chunk at a time, until all the information has been moved. It is just as though you had to carry a pile of firewood from one place to another. If you can carry eight sticks at a time, then the number of trips you have to make is about one-eighth the number of sticks in the pile. The number of bits the processor can carry from one place to another in a single step is one of the factors affecting the observed running speed of the machine. Modern computers typically have 32-bit or 64-bit parallelism. When computer networks started to come into being, a standardization problem arose. If there is a communication channel connecting two computers, whether it is a phone line or a radio link or a piece of fiber optic cable, the bits travel down it one after the next. The speed, or bit rate, varies and can be enormous, but it is always one bit after the next, or serially, as the expression goes. If one computer wants to send the character a to the other, and they both agree on the ASCII code so a is going to be represented as 01100001, which bit goes first, the 0 at the left end or the 1 at the right end? It makes a difference, since if the first computer is sending in leftmost-bit-first order and the receiving computer is expecting to see rightmost-bit-first, what it thinks it has received will be not a but 10000110, which is the code for a different symbol. For a time the question of which bit should travel down the wire first had all the earmarks of a holy war. Computer scientist Danny Cohen, in a hilarious reference to Gulliver's Travels, called the warring viewpoints the big-endian and little-endian conventions. Cohen's paper, "On Holy Wars and a Plea for Peace," is worth reading. It doesn't require knowing anything you haven't already learned, and plays the holy war metaphor and others, such as the differing orthographic conventions of English, Hebrew, and Chinese, for all they are worth.[1]

[1] http://khavrinen.lcs.mit.edu/wollman/ien-137.txt

1.5 Many bytes


1.5.1 Addressable memory
We can render any text as a string of bits by translating the text into ASCII and using one byte for each character of the text. For example, the previous sentence is 129 characters long, so it would take 129 bytes to store it. (That's assuming we care only what the sentence says and we don't also have to identify which type font and other such decorative information.) The numbers of bytes in texts can get pretty big. An 8.5 × 11 inch sheet has around 3000 characters, so a 500-page book could easily be more than a million characters in length. Rather than billions and trillions of bytes, which wouldn't even mean the same thing in England as they do in the U.S., a different set of terms has evolved for quantities of data. Computers store data internally in byte-sized memory cells. The memory cells have addresses and the data stored at an address are the contents of that memory cell. The metaphor is to houses and their street addresses; the occupants of the residence at 11223 Cove Lane can change, and in the same way the contents of the memory cell at address 326 can change when the computer stores new data in that cell. Computer memory addresses are themselves numbers, and computer programs need to manipulate the addresses a lot. The addresses are represented in binary when the computer pushes them around, so there is a maximum size address that a computer can easily handle. If the number of bits in an address on a particular computer is 23, say, then the total number of memory cells that are directly addressable is 2^23. For this reason physical computer memory is manufactured and sold in units that are powers of two in size. You could not go into a computer store and buy a million bytes of memory. You'd have to buy memory in the next larger power of two, which turns out to be 1,048,576. That quantity of memory is called a megabyte, and though that is sometimes called a million bytes, in the computer world a megabyte of data or memory almost always means 2^20 bytes, which is about 5% more than a million.

1.5.2 Internal and external storage


There are fundamentally two kinds of computer storage, and it is important to understand the difference. The data the computer can store and retrieve in a single step are in internal memory, today typically stored physically on silicon chips. The processor can retrieve any data from internal memory in a fixed amount of time, independent of the address of the data. External storage devices typically have some kind of moving parts: disk drives, for example. Retrieving a piece of data from external memory is usually much slower than from internal memory, because it may involve a time delay while parts physically move so that the data can be read. (Even if the disk drive is physically housed inside the case of the computer, it is still referred to as external memory, because it is external to the computer's central processor.) Though the usage is not completely standardized, it is convenient to reserve the term memory for internal storage, as opposed to external storage. In general, when someone says that her machine has X amount of memory, the reference is to internal memory.

1.5.3 Words for lots of bytes


Here is the full set of useful terms for storage sizes, and a sense of how big they are.

Term                                   | Abbreviation | Exact value   | Approx. value | Example
Kilobyte                               | KB           | 2^10 = 1024   | 1.02 × 10^3   | A paragraph
Megabyte                               | MB           | 2^20 = 1024^2 | 1.05 × 10^6   | A book
Gigabyte (with a hard G, as in "girl") | GB           | 2^30 = 1024^3 | 1.07 × 10^9   | A few bookshelves
Terabyte                               | TB           | 2^40 = 1024^4 | 1.10 × 10^12  | Widener
Petabyte                               | PB           | 2^50 = 1024^5 | 1.13 × 10^15  | 50 Libraries of Congress
Exabyte                                | EB           | 2^60 = 1024^6 | 1.15 × 10^18  | All words ever spoken?

Each line of this table is about a thousand times larger than the previous line; to be precise, 1024 times as large. The entire system of nomenclature laid out here arises from the accident that 2^10 happens to be close to a power of 10. The personal computer you buy today probably has at least 50 GB, and you can buy a TB for less than $750 (see http://www.pricewatch.com). Such storage capacities were unthinkable only a few years ago. Miniaturization has drastically reduced the cost and increased the capacity of computer storage. A decade ago one could not have bought a petabyte for all the money in the world, nor could one have owned enough warehouses to store it, even if it could be purchased. Be careful if someone tries to sell you 1Gb of memory. That small b may not be an inconsequential typographical variant of B. Small b means bit and big B means byte, so 1Gb would be a gigabit, only one-eighth as much memory as 1GB. In truth, memory is not sold by the bit, but other things, for example the transmission speeds of data lines, are sometimes measured in bits per second (b/sec or bps) and sometimes in bytes per second, so be alert.

1.5.4 K = 1024 or K = 1000?


The abbreviations K, M, G, T, P are convenient but can cause confusion. A kilobyte is 1024 bytes but a kilometer is 1000 meters; when does K mean 10^3 and when does it mean 2^10? Alas, the convention that 1K means 1024 and not 1000, 1M means 2^20, and so on, does not apply even to everything in the electronics world. It is invariable when talking about bytes in the internal memory of computers. But the oldies station to which I listen on the radio is broadcast on the 103.3 MHz frequency, and that truly means 103.3 × 10^6, not 103.3 × 2^20. (A Hz or Hertz is a unit of frequency, a cycle per second. We'll get back to that later.) An in-between case is disk sizes. If my computer can handle 30-bit addresses, it has no trouble with a gigabyte of memory, but 2^30 + 1 bytes would be a nuisance, since addresses would now need to be 31 bits long. But the addressing logic that ties internal memory sizes to powers of two does not apply when the storage is external. There is nothing magic about any particular disk size. It may make sense to organize the data on a disk so that it is read and written in chunks of some convenient size, a number of bytes exactly equal to some power of 2, for example, so that when it moves into or out of internal memory it fits in a slot naturally defined by the computer's addressing logic. Such chunks of disk data are called pages. But there is no physical reason for the number of pages on a disk to be a power of 2 or any other round number. Now: If I have 1 GB of internal memory on my computer and I buy a 1K GB disk, does that mean that I could store 1024 copies of my internal memory on the disk, 1000 copies, or perhaps even less? Maybe the disk truly holds only 10^12 bytes, which would suffice to hold only 931 copies of the 1GB internal memory of my machine! Caveat emptor: the conventions are not universal, and while none of these numbers varies from the others by more than 10%, that's enough to have generated some lawsuits by consumers who were thinking K as in kilobytes (meaning 1024) against manufacturers who were thinking K as in kilometers (meaning 1000).[2]

[2] http://austin.bizjournals.com/austin/stories/2003/09/15/daily39.html
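The competing conventions are easy to compare in a few lines of Python:

```python
# "Giga" two ways: the memory convention (2^30) vs the metric prefix (10^9).
GB_BINARY = 2**30
GB_DECIMAL = 10**9
print(GB_BINARY / GB_DECIMAL)   # 1.0737..., about a 7% difference

# A "1K GB" disk that truly holds 10^12 bytes stores this many
# copies of a 2^30-byte internal memory:
print(10**12 // 2**30)          # 931
```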

1.6 Physical memory


1.6.1 Magnetic core memory
In the 1960s, computer memory was made of magnetic cores, little donuts strung on wires. The magnetic field of a core went around it in a circle. Each core stored one bit of information, a 0 if the magnetic field went around the donut in one direction and a 1 if the magnetic field went the opposite way. The stuff of which the cores were made was easily remagnetizable.




If a wire was put through a core and an electric current was pulsed through the wire, a magnetic field would be created strong enough to change the direction in which the core was magnetized. The cores were actually strung on a grid of wires. A single core could be picked out by pulsing two wires, each with half as much current as was needed to flip the bit. The other cores on the same wires would be unaffected. Magnetic core technology replaced a 1950s vacuum-tube technology, which itself supplanted a 1940s technology based on electromechanical dials and gears. Harvard was at the forefront of the computer world in those days; you can see the Aiken-IBM Mark I computer, with its rows of gears and dials, in the middle of the first floor of the Science Center. Crucial aspects of the technology needed to store and retrieve bits using magnetic cores were also developed at Harvard, by An Wang, then a graduate student of Howard Aiken's. In one of a long series of missed boats in the applied sciences, Harvard took no interest in the invention. But MIT did, and the combination of Wang's invention with inventions of MIT's Jay Forrester made commercial core memories possible. The manufacture of core memories was never automated. Cores were strung manually onto wires, almost entirely by poorly paid garment workers in the Far East, working with microscopes. By the time the manufacturing process was mature, the cores were barely a millimeter in diameter and the cost of core memory was approaching a penny a bit. But that still resulted in enormous costs for what we would today consider a tiny amount of memory.

[Image: the Aiken Mark I]

[Images: a single core, from www.hpmuseum.org/tech9100.htm; a 1957 core memory plane from Siemens Corp., w4.siemens.de/archiv/images/preview/1954_dv.jpg]


The mechanical and vacuum-tube computers had the equivalent of a few hundreds to a few thousands of bits of memory; the ENIAC, an important vacuum tube computer, had 18,000 vacuum tubes. While vacuum tube machines were much faster than mechanical computers (performing a few thousand operations per second rather than three or four operations per second), the tubes were hot, so vacuum tube machines required lots of electric power, and the tubes tended to burn out quickly. Core memory made possible the construction of machines with memories in the millions of bits, essential for complex calculations but out of the question for vacuum tube machines.

1.6.2 Silicon transistors


In 1947 the transistor was invented at Bell Labs. Transistors were used in consumer electronics, radios and televisions, before they were used in computers. Vacuum tube TVs and radios were very hot, very fragile, and very unreliable because the tubes tended to burn out and replacing them required a specialist. It was a great leap forward when vacuum tube televisions and radios gave way to sets made with transistors. Transistor radios produced no heat, had no breakable glass, and used so little power that they could run on flashlight batteries. (The vacuum tube electronics were eliminated from television sets too, but until recently there was no alternative to a big picture tube, which was hot, fragile, and used lots of electricity.) These devices were still assembled by hand, the hundreds of discrete components with two or three wires sticking out of them laid out and soldered together with molten metal.

1.6.3 Integrated circuits


In the late 1950s engineers Jack Kilby and Robert Noyce independently figured out how to miniaturize transistors and to fabricate several at one time by a process like photography. The design of a circuit was like a photographic negative. The configuration of transistors and the connections between them could be transferred onto layers of silicon and other materials. Silicon is the right basic material because it has desirable electrical properties. When mixed with various impurities, it is neither a conductor nor an insulator but a semiconductor, conducting or not conducting electricity at one spot depending on the presence or absence of electrical charges stored at nearby spots. It thus becomes an electrically controllable electric switch, exactly the role that vacuum tubes had had in the earliest electronic computers. Happily, silicon is dirt cheap (or sand cheap, to be precise, since sand is the raw material used to make silicon chips). The first of these integrated circuits, consisting of just a few transistors, became commercially available in 1961 from Texas Instruments and Fairchild Semiconductor. As the manufacturing process improved and researchers learned more about the electrical properties of silicon, the size of the individual transistors shrank and it became possible to squeeze more onto a single silicon chip. At first many chips were defective, because specks of dust in the air during the manufacturing process could ruin one of the tiny features of the chip, a wire or a transistor. As a result chips could not be too large: the bigger the surface area, the more likely that the chip would not work because of some defect in manufacturing. As the manufacturing environment became cleaner, the physical dimensions of chips could increase without an unacceptably large increase in the number of defective chips.

1.6.4 Moore's law


The combined effect of miniaturization, improved design, and improved manufacturing was that integrated circuits rapidly became faster and more powerful. In 1965, only four years after the first commercially available chip, Gordon Moore, then an engineer at Fairchild Semiconductor and later a founder of Intel, observed that the number of components on a chip seemed to be doubling every year. In that year the biggest chips had around 64 components. He speculated that the doubling might go on even for a decade, which would mean that by 1975 chips as large as 64 kbits might be possible. Moore's paper is readable today.[4] The rate slowed down a bit, but the thesis that the number of components on a chip doubles every eighteen months early on became known as Moore's law. The law has remained true far longer than Moore or anyone else could have imagined in 1965. Doubling every eighteen months meant improvement by a factor of four in three years, by a factor of sixteen in six years, by a factor of more than 250 in twelve years. That only takes us to 1977, but already then magnetic core memory technology had been killed by the availability of cheaper silicon memory chips. Except for peculiarly specialized applications, all computers made since the mid-70s have used silicon chips for memory. As of 2004, 39 years have passed since Moore's law was propounded. That is 39/1.5 = 26 eighteen-month periods, so if the law has held true, the number of components on a chip should have increased by a factor of 2^(39/1.5) = 2^26 = 67,108,864. The law had been proposed in 1965, when the biggest chip that could be manufactured had 64 = 2^6 components. An increase by a factor of 2^26 starting from a size of 2^6 would be a chip of size 2^32, or 4,294,967,296. It is, of course, comical to carry out these calculations to that level of precision. But incredibly, the biggest memory chips being manufactured in 2004 are 4 gigabit chips.[5] And projections for the future of semiconductor technologies suggest that doubling in size at the same rate will continue for at least two or three more cycles. Some have attributed the success of Moore's law to its existence, suggesting that manufacturers repeatedly rise to the specific challenge of meeting the expectations that the law itself has created. I doubt that technological advances can over such a long period be shaped by such artificial pronouncements about the future. Certainly many less amazing predictions about the future of technology have not come true. But what is true is that the phenomenon described by Moore's law is absolutely extraordinary. Our capacity to store information has increased by a factor of four billion over four decades. With the 4 gigabit chips costing $114, the cost per bit has dropped to about 3 × 10^−8 dollars per bit. There is nothing else that humankind can do even a million times faster, bigger, cheaper, or otherwise better than it could do in the Stone Age. The fastest airplanes fly at around 1000 m/sec; the fastest humans can run short distances at a rate of around 10 m/sec, so the speed improvement is by a factor of only a hundred, or perhaps a thousand if you adopt the more realistic rate of 1 m/sec for Stone Age people walking long distances. The change in our capacity to store and communicate information is like nothing else that is or ever was. Not even close. And that is what makes the information revolution possible.

[4] ftp://download.intel.com/research/silicon/moorespaper.pdf
[5] http://www.infoworld.com/article/04/04/06/HNtoshsandisk_1.html

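A Python sketch recomputing the arithmetic of the preceding paragraph:

```python
# Doublings in the 39 years from 1965 to 2004, one per 18 months.
doublings = 39 / 1.5            # 26 doublings
print(2 ** doublings)           # 2^26 = 67108864.0

# Starting from 64 = 2^6 components in 1965:
print(64 * 2**26)               # 2^32 = 4294967296 components

# Cost per bit of a $114, 4-gigabit chip.
print(114 / (4 * 2**30))        # about 2.7e-08 dollars per bit
```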




1.7 Images
We have been estimating the size of texts, as though the only thing of interest is the letters and symbols of which they are composed. Of course that is not true. Books contain diagrams and photographs, and they are set in multiple fonts and typefaces. How many bytes does it take to store a real page so that it can be reproduced perfectly from the stored version? Even getting the question to be precisely meaningful will take a bit of effort, but we don't need to explain everything at once. Let's agree that "perfectly" just means good enough so that the human eye can't see the imperfections, though they might be visible with a microscope. When a page is printed digitally, the surface is divided into tiny square dots, each of which can be black or white. The dots are called pixels, short for picture elements. The number of dots per inch, horizontally or vertically, is called the resolution of the printer. (Actually, the resolution does not have to be the same in both directions, so the squares could actually be rectangles.) The lowest-resolution printers are in fax machines; they print around 100 dots per inch, and it is easy to see with the naked eye how jagged the letterforms are. A very high resolution printer might print 2400 dots per inch; at that resolution the naked eye could not tell that, for example, a diagonal line is actually a jagged staircase of tiny steps. Let's imagine a page printed at 2400 dpi. Suppose that the printed area is 7 by 10 inches. How many dots is that? 7 × 10 × 2400 × 2400 ≈ 4 × 10^8 dots. Since each dot can be either black or white, each dot records one bit of information, so the page has around 50 megabytes (4 × 10^8 bits ÷ 8 bits/byte = 0.5 × 10^8 bytes = 50 × 10^6 bytes). So a single page digitized at 2400 dpi consists of as many bits as an entire shelf full of books would take if only the text were preserved! And that's just for black and white. What if each dot needs a color? A common scheme for representing color values uses 24 bits per dot (eight bits of brightness value, 0 to 255, for each of red, green, and blue). Now we are up to around 1.2 × 10^9 bytes per page, more than a gigabyte. At that rate a 50 GB disk would be filled by just 40 pages of images.
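A Python sketch recomputing the page arithmetic:

```python
dots = 7 * 10 * 2400 * 2400       # pixels on a 7 x 10 inch page at 2400 dpi
print(dots)                       # 403,200,000, about 4 x 10^8

bw_bytes = dots // 8              # one bit per black-or-white pixel
print(bw_bytes)                   # about 5 x 10^7 bytes, roughly 50 MB

color_bytes = dots * 3            # 24 bits = 3 bytes of color per pixel
print(color_bytes)                # about 1.2 x 10^9 bytes, over a gigabyte

print(50 * 10**9 // color_bytes)  # a 50 GB disk holds about 41 such pages
```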


That can't be right; photographs and documents do not require that much storage in practice. How is it possible to store complex documents, photographs, and images in much smaller amounts of disk space? The answer is data compression, and to understand it we need to explore the difference between mere bits and information.


