Encoding the World with Computer Data
The basic unit of data in a computer is a bit:
- A bit is a one or zero.
- A bit can be used to represent two things: on or off, yes or no, true or false, and so on.
Most computer systems depend on the byte for storing
data:
- A byte is eight bits.
- It can be used to represent 256 different things (you get 256 by taking 2 to the eighth power: 2^8).
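The counting rule for bits and bytes can be checked with a short Python sketch. The function name `distinct_values` is hypothetical and chosen only for illustration.

```python
def distinct_values(bits: int) -> int:
    """Return how many different things a group of `bits` bits can represent."""
    # Each added bit doubles the number of possible patterns.
    return 2 ** bits

# One bit has two states; one byte (eight bits) has 256.
for bits in (1, 8):
    print(bits, "bit(s):", distinct_values(bits), "values")
```

Each extra bit doubles the count, which is why eight bits give 2 x 2 x 2 x 2 x 2 x 2 x 2 x 2 = 256.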
From the byte, we create files:
- Files can be virtually unlimited in size.
- Files contain data or programs (sometimes both, but not very often).
- Files can be used to transfer data between computers via disk media.
Generally speaking, computers are effective in handling
three types of data from the "real world":
- Text
- Pictures or Graphics
- Sound or Audio
A fourth type, video, is actually a series of pictures
augmented with audio.
Now we can encode a lot of things in the real world:
| Data Type | Encoding Scheme |
| Text |
ASCII (American Standard Code for Information Interchange) Text -- A 7-bit standard for encoding letters, numbers, and special characters; 128 characters can be encoded.
ANSI Text -- An 8-bit extension of ASCII that enables 256 characters to be encoded.
Unicode -- Originally a 16-bit standard (65,536 characters); later versions extend beyond 16 bits to cover far more characters.
|
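A minimal Python sketch of the text encodings described above. It assumes Python's built-in codecs: `"latin-1"` stands in here for one 8-bit "ANSI" extension of ASCII, and `"utf-16-le"` for a 16-bit Unicode encoding. Both choices are illustrative assumptions, not something the original text specifies.

```python
text = "Hi!"

# ord() gives the numeric code of each character; ASCII codes fit in 7 bits (0-127).
codes = [ord(ch) for ch in text]
print(codes)  # [72, 105, 33]

# Encoding to ASCII stores one byte per character.
ascii_bytes = text.encode("ascii")
print(len(ascii_bytes))  # 3

# A character outside ASCII (here U+00E9, an accented "e") needs a wider encoding.
accented = "\u00e9"
print(len(accented.encode("latin-1")))    # 1 byte in an 8-bit encoding
print(len(accented.encode("utf-16-le")))  # 2 bytes in a 16-bit encoding
```

Trying `accented.encode("ascii")` raises `UnicodeEncodeError`, since the character has no code among ASCII's 128.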
| Raster Graphics (Bitmaps) |
Each pixel of the image (a point in an X, Y coordinate space) is assigned a color value. The number of bytes used for each pixel's color determines the size of the image.
For example:
- 1 byte: 256 colors
- 2 bytes: 65,536 (65K) colors (high color)
- 3 bytes: 16 million plus colors (true color)
An 800x600 image (an SVGA computer screen) at various color depths would have the following sizes (add some for information about color palettes, etc.):
- 256 colors: 480,000 bytes
- 65K colors: 960,000 bytes
- 16 million colors: 1,440,000 bytes (about 1.4 megabytes, MB)
Storing every pixel uncompressed in this way gives the "best possible" image for a given
image size and color depth.
Graphic file formats such as JPEG and GIF use compression to reduce the
file/data sizes.
|
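The raster image sizes quoted above follow from multiplying width, height, and bytes per pixel. A small Python sketch of that arithmetic follows. The function name `raw_image_bytes` is hypothetical, and the result ignores palette and header overhead.

```python
def raw_image_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Uncompressed pixel data size: one color value per pixel."""
    return width * height * bytes_per_pixel

# An 800x600 image at 1, 2, and 3 bytes per pixel.
for bytes_per_pixel in (1, 2, 3):
    print(bytes_per_pixel, "byte(s) per pixel:",
          raw_image_bytes(800, 600, bytes_per_pixel), "bytes")
```

This prints 480000, 960000, and 1440000 bytes. Compressed formats such as JPEG and GIF store the same picture in less space.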
| Audio |
Digital sound depends on capturing sound levels at a very rapid pace, typically thousands of times a second.
Digital audio is based on the following:
- Capturing sound at rates ranging from about 5,000 to 44,100 times a second (5 to 44.1 kilohertz)
- Each "capture" (sample) is stored in one byte (256 levels) or two bytes (65,536 levels)
- There may be one channel (monaural) or two channels (stereo)
One minute of CD-Audio Quality (the best level) involves the following:
- 60 seconds times
- 2 bytes (for 65,536 levels) times
- 2 channels (stereo) times
- 44,100 hertz
- For a total of 10,584,000 bytes (about 10.6 MB)
Lower grade (telephone quality) involves the following:
- 60 seconds times
- 1 byte (for 256 levels) times
- 1 channel times
- 11,000 hertz
- For a total of 660,000 bytes
As you can see, digital audio requires a significant amount of
storage. Audio files are often compressed by reducing the sampling frequency,
the number of channels, and the bytes per sample, and by other techniques (MP3, for instance), but these
lossy methods degrade the quality of the audio to some degree.
|
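The audio storage figures can be reproduced by multiplying duration, bytes per sample, channels, and sampling rate. The Python sketch below uses the hypothetical function name `raw_audio_bytes` and assumes the standard CD sampling rate of 44,100 samples per second.

```python
def raw_audio_bytes(seconds: int, bytes_per_sample: int,
                    channels: int, sample_rate_hz: int) -> int:
    """Uncompressed audio size: one sample per channel at every sampling instant."""
    return seconds * bytes_per_sample * channels * sample_rate_hz

# One minute of CD-quality stereo (assumed rate: 44,100 Hz, 2 bytes per sample).
cd_minute = raw_audio_bytes(60, 2, 2, 44_100)
# One minute of the lower-grade example: 1 byte per sample, mono, 11,000 Hz.
phone_minute = raw_audio_bytes(60, 1, 1, 11_000)

print(cd_minute)     # 10584000
print(phone_minute)  # 660000
```

Halving any one factor (sampling rate, channels, or bytes per sample) halves the storage, which is why reducing them is the simplest way to shrink an audio file.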
We often combine groups of bytes for various data storage
situations:
- 2 bytes (2^16 = 65,536 combinations) for digital audio and high color graphics.
- 3 bytes (2^24 = 16,777,216 combinations) for true color graphics.
- Kilobytes (KB), each 1,024 bytes, for bulk storage of data.
- Megabytes (MB), each 1,048,576 bytes, for bulk storage of data.
- Gigabytes (GB), each 1,073,741,824 bytes, for bulk storage of data.
- By the way, there are also tera-, peta-, and even larger prefixes for referring to large numbers of bytes.
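The byte groupings and storage units above are all powers of two, which the Python sketch below makes explicit. The constant names and the `describe` helper are hypothetical, introduced only for illustration.

```python
# Storage units as powers of two, matching the byte counts listed above.
KB = 2 ** 10  # 1,024 bytes
MB = 2 ** 20  # 1,048,576 bytes
GB = 2 ** 30  # 1,073,741,824 bytes

def describe(n_bytes: int) -> str:
    """Express a byte count in the largest unit that fits, to one decimal place."""
    for unit, size in (("GB", GB), ("MB", MB), ("KB", KB)):
        if n_bytes >= size:
            return f"{n_bytes / size:.1f} {unit}"
    return f"{n_bytes} bytes"

print(2 ** 16, 2 ** 24)     # 65536 16777216
print(describe(660_000))    # 644.5 KB
print(describe(1_440_000))  # 1.4 MB
```

Note that these are the binary units (1 KB = 1,024 bytes). The same byte count looks slightly larger when divided by 1,000 or 1,000,000 instead.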
Data is stored collectively in files, which can be as small
as a few bytes or very, very large (even hundreds of megabytes).
|