Bits, Bytes and Bubblegum

 What Are Bits?

Children take computers for granted today, much like their parents took radio and television for granted during their childhoods. Bits and bytes are a part of childhood now, along with Saturday morning cartoons and bubblegum.

It won't be long before you hear children on the playground singing:

Bits, bytes and bubblegum,
Tell me where your disk comes from,
Do not transmit while I send,
Or I'll have to boot again.

But what about those of us who didn't grow up in a house full of computers? How can we make sense of all this? Everybody talks about bits and bytes, but when you come right down to it, what are they?

A bit is an entity that, at all times, has one of two possible values. The two values are "on" and "off." Computer programmers use the numbers 0 and 1 to represent these two values: 0 stands for "off" and 1 stands for "on." Thus, a bit that is off has the value 0; a bit that is on has the value 1.

Conceptually, computer memory consists of a long string of bits. In other words, computer memory can be thought of as a long string of 0s and 1s.

For example, suppose that a small part of a computer memory contains four bits that are on, followed by two bits that are off, followed by two bits that are on. Using 0s and 1s, we would say that this area of the memory contains:

11110011

You may ask, how are the 0s and 1s stored? Are there really tiny electrical components that are off or on? The answer is, it doesn't matter. A small plastic and silicon memory chip stores the 0s and 1s in a certain way. A floppy disk stores them in another way. In fact, each different type of computer memory stores the 0s and 1s in its own way.

When a computer executes a program (a sequence of instructions), what is happening is that some part of the hardware is working with strings of 0s and 1s. Strange as it seems, the only work a computer does, ultimately, is manipulate 0s and 1s.

Of course, this is like saying that all Shakespeare did was write down long sequences of letters from the alphabet. The importance of information comes not from how it is stored, but from how it is organized and what it means. The word "rose" written on a piece of paper would have the same meaning as the word "rose" stored on a floppy disk.

The basic way that bits are organized is in groups of eight. A group of eight bits is called a "byte." You can think of this as a unit of measurement to memorize: one foot equals 12 inches, one liter equals one thousand milliliters, one byte equals eight bits.

We can consider a byte to be an entity that contains eight consecutive 0s and 1s. For example, one byte might contain 10001011, while another byte might contain 00110101.

If you were to work it out, you would see that there are exactly 256 different combinations: each of the eight bits can be either 0 or 1, so there are 2x2x2... (8 times), or 256, possibilities. Thus, you can think of computer memory as being a long sequence of bits, divided into bytes; each byte containing one of 256 different combinations of 0s and 1s.
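If you have Python handy, you can check this count for yourself. Here is a quick sketch that generates every possible sequence of eight 0s and 1s and counts them:

```python
from itertools import product

# Generate every possible sequence of eight bits (each position
# can be "0" or "1").
patterns = list(product("01", repeat=8))

print(len(patterns))  # 256
print(2 ** 8)         # 256 -- the same count, as a power of 2
```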

 How Do Computers Store Information?

Computers store information by representing it as a series of bytes. For example, the word "hello" is stored as:

01101000 01100101 01101100 01101100 01101111

There are several different schemes used to represent information as bytes. The scheme that is most commonly used to represent text — letters, punctuation and other symbols — is called the "ASCII code." ("ASCII" stands for American Standard Code for Information Interchange.) This is a standard code that is used almost all of the time with personal computers.

Using the ASCII code, each of the letters and other symbols is assigned one of the 256 possible byte patterns. For instance, the letter "h" is represented by the pattern "01101000", as you can see by looking at the first byte of the example above.

Within the ASCII code, small letters are distinguished from capital letters. For instance, while the letter "h" is "01101000", the letter "H" is "01001000." Each punctuation symbol and each of the ten digits (zero to nine) also has its own code.
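You can verify these two patterns yourself with a short Python sketch. The built-in ord function returns a character's ASCII value, and format renders that value as eight binary digits:

```python
# ord gives the ASCII value of a character; the "08b" format
# renders that value as a string of eight binary digits.
print(format(ord("h"), "08b"))  # 01101000
print(format(ord("H"), "08b"))  # 01001000
```

Notice that the two patterns differ in exactly one bit; in the ASCII code, a capital letter and its small counterpart always differ by 32.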

Now, how many different code patterns do we actually need? Well, we need 26 for the small letters and 26 for the capital letters, ten for the digits (0 to 9), and 32 for punctuation and other symbols. This adds up to 94. Since the ASCII code contains 256 possible patterns, a lot of them won't be needed.

In fact, only half of the 256 patterns have standardized values. The other 128 patterns are used differently by different computer manufacturers. For example, IBM uses these patterns to represent various foreign and graphical symbols.

Take a look at the accompanying chart which shows the first 78 patterns of the ASCII code. Notice that each letter, digit and special character has its own unique pattern. (The first 32 patterns are used for internal signals that can be disregarded.)

If you want to construct the pattern that represents a particular sequence of symbols, you can look it up in the chart, character by character. This will show you how the symbols would be stored in computer memory as 0s and 1s.
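Rather than looking each character up by hand, you can let a program do the work. Here is a small Python sketch (the function name to_bit_patterns is my own, not a standard one) that performs the lookup for a whole sequence of symbols:

```python
def to_bit_patterns(text):
    # Look up the ASCII value of each character and render it
    # as eight binary digits, one byte per character.
    return " ".join(format(ord(c), "08b") for c in text)

print(to_bit_patterns("hello"))
# 01101000 01100101 01101100 01101100 01101111
```

The output matches the "hello" example shown earlier, byte for byte.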

 How Are Numbers Stored?

The ASCII code is used to represent characters as bytes of 0s and 1s. Each character takes up one byte. Thus, a file that consists of 10,000 characters requires 10,000 bytes of memory.

This scheme is fine for storing information which contains many different characters. However, it is an inefficient way to store numbers. Here is why.

All numbers consist of the ten digits (0 to 9), and possibly a minus sign. There is no need to take up the full eight bits for each digit. Eight bits give you 256 possibilities and you only need ten. Obviously, somewhere, bits would be wasted.

Instead, computers can use several other formats to store numbers. The two most common schemes are called "floating point" and "fixed point." Floating point is used to store numbers that have a decimal point. Fixed point is used to store whole numbers with no decimal point. I won't go into the details here; the important thing to understand is that these schemes require less space than the ASCII code.

For example, consider the number 31589. Using the ASCII code, it would take five bytes to store this number, one byte for each digit. Using the fixed point format, the entire number could be stored in two bytes. This is a significant saving, especially when you are dealing with many numbers.
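You can see the difference in a short Python sketch. The standard struct module packs a whole number into a fixed number of bytes; here it is packed as a two-byte integer, which is one common fixed point layout (the details vary from computer to computer):

```python
import struct

# One byte per digit, as the ASCII code would store it.
ascii_form = "31589".encode("ascii")

# The same number packed as a 16-bit (two-byte) integer.
fixed_form = struct.pack(">h", 31589)

print(len(ascii_form))  # 5 bytes
print(len(fixed_form))  # 2 bytes
```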

The basic difference between the ASCII code and the other schemes is that the ASCII code does not consider digits to have numeric properties. Each digit is merely a symbol with no particular meaning. Fixed point and floating point codes treat digits as representing numeric values.

Thus, when a group of digits is to be manipulated as separate symbols with no numeric significance, the digits are stored using the ASCII code. For instance, during input and output, all numbers are stored as separate digits, each taking one byte.

When a group of digits is to be manipulated as a number, it is stored using one of the other formats. For example, when a computer does arithmetic, the numbers are usually stored in this manner.

Strictly speaking, it is not fair to say that every file requires exactly one byte for each symbol in the file. If the file contains numbers, a program may use one of the numeric formats which stores the digits more compactly.

A program that performs many calculations, for example a spreadsheet program, will store numbers in this way. A program that works with characters symbolically, for example a word processor, will use the ASCII code and store every symbol in a separate byte.

 How Do We Measure Bits and Bytes?

Computers typically store thousands and thousands of bytes of information. It is important that you understand how we describe the capacities of such systems. For example, inches are grouped into feet, yards and miles. How do we group large amounts of bytes?

To answer this question, we first have to understand something about the way that computers work with numbers. In everyday life, we use a system of numbers that is based upon the ten digits (0 to 9). This system is called the "decimal system."

However, computers store information as a series of bits that can be considered to be 0s and 1s. You can see that it would make sense to use a number system based upon these digits. In fact, that is what is done. The system is called the "binary system." Within the binary system, all numbers consist of only 0s and 1s.
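If you are curious, a quick Python sketch shows how the same number looks in each system. The format function renders a decimal number as binary digits, and int with a base of 2 converts binary digits back to decimal:

```python
# The decimal number 13 written in binary digits...
print(format(13, "b"))  # 1101

# ...and the binary digits 1101 converted back to decimal.
print(int("1101", 2))   # 13
```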

Fortunately, for those of us who are used to the decimal system, the 0s and 1s are handled automatically and we rarely have to concern ourselves with the details. However, we do notice the binary system in odd ways.

In the decimal system, round numbers are those that are powers of ten. For example, 1000 and 1000000 are round numbers. In the binary system, round numbers are those which are powers of two. Here are a few of these numbers.

Some binary round numbers

Round Number         ...as a power of 2
1,024                2^10 = 2x2x2... (10 times)
1,048,576            2^20 = 2x2x2... (20 times)
1,073,741,824        2^30 = 2x2x2... (30 times)
1,099,511,627,776    2^40 = 2x2x2... (40 times)

The system that we use to measure large amounts of bits and bytes is a modified form of the metric system. The metric system uses a prefix to mean that a number should be multiplied by a particular scale factor. For example, "kilo" means 1000. Thus, a kilometer is 1000 meters.

With computers, the prefixes are used to stand for the round numbers from the binary system. Here are the most common prefixes along with their meaning.

The most common metric prefixes used with computers

Prefix   Abbreviation   Value
kilo     K              2^10 (1,024)
mega     M              2^20 (1,048,576)
giga     G              2^30 (1,073,741,824)
tera     T              2^40 (1,099,511,627,776)
peta     P              2^50 (1,125,899,906,842,624)
exa      E              2^60 (1,152,921,504,606,846,976)

Thus, 1 kilobyte is 1,024 bytes and 1 megabyte is 1,048,576 bytes. Using the abbreviations, we would say 1K bytes and 1M bytes. Sometimes, "B" is used as the abbreviation for "byte", as in 1 KB and 1 MB.

Occasionally, the "B" is left out altogether. For example, suppose you hear someone say that a particular file is 512K. This means that the file consists of 512 kilobytes (512x1024 bytes) of data.
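These calculations are easy to carry out yourself. Here is a small Python sketch that works out the size of that 512K file, along with the value of a megabyte:

```python
KILOBYTE = 2 ** 10  # 1,024 bytes
MEGABYTE = 2 ** 20  # 1,048,576 bytes

# A "512K" file holds 512 kilobytes of data.
print(512 * KILOBYTE)  # 524288
print(MEGABYTE)        # 1048576
```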

 The True Meaning of the Bit

Thus, bits and bytes are ways of conceptualizing computer memory as sequences of 0s and 1s. Computers are based on a number system that uses only 0s and 1s. Thus, each 0 and 1 in the memory can be considered as one digit in the binary system.

Now you can understand the true meaning of the word "bit." It is an abbreviation for "binary digit." The bit is so important because it describes the point at which the software (instructions) meets the hardware (machinery). The job of the software is to manipulate the 0s and 1s that the hardware stores as "off" and "on" signals.

Remember, bits and bytes only seem complicated when you are not accustomed to them. Today's children learn about the binary system and the ASCII code as a matter of course. So when your youngster crawls into your lap and asks, "Do you remember before there were bits?" You can answer, "Are you kidding? When I was young, all we had was paper and pencils and the decimal system..."