Lord Generic Productions

A Crash Course in Game Design and Production
  Week 5 - Sound and Music Specification

Welcome back! This is the fifth installment in "A Crash Course in Game Design and Production." Like last time, this lesson is in multiple parts. In PART ONE, we'll talk about the basics of sound reproduction on the computer. In PART TWO, we'll discuss computer music and sound effects, and what we need to write the Sound and Music Specification. In PART THREE, we will write the fifth section of the Design Spec for our Course Project, the Sound and Music Specification. This is part 1 of 3.

  Part 1 - Basics of Sound Reproduction on the Computer

Before we can talk about the Sound and Music Specification, we need to get up to speed on how sound works, at least as far as we're concerned, and what we need to know to be able to talk about Music and Sound later on.

Sound Terminology
Sound - Anything we can hear. When you drop something and it hits a table, say, it and the table vibrate; they shake a little. This vibration moves the air around them in all directions. As the objects vibrate back and forth, the air around the objects is pushed away or drawn toward them in "pulses," like the ripples in a pond when you drop a rock in it. The air pulses eventually reach your ear, and move the hairs inside back and forth at the same rate as the objects are vibrating (the strength of the pulses decreases with distance from the objects).

These vibrations in your ear are converted to electric signals which go to your brain and you "hear" a sound. If the vibrations are too fast or too slow, the hairs in your ear won't respond to them. They either can't go back and forth fast enough to match the vibrations, or the vibrations are so slow that their force isn't enough to move the hairs. You can't hear these vibrations. They are not sound. If you can't hear it, it isn't sound, I don't care what the speaker propaganda says. Physicists and philosophers argue that. If I didn't hear it, it didn't make a sound for me.

Amplitude - How far the hairs in your ear move as they vibrate back and forth. The Amplitude is the total distance from back to forth. The farther the hairs move back and forth, the louder the sound. If a garbage truck falls off the Empire State Building and hits the ground, the vibrations that it causes will be stronger than those caused by a Yugo. The sound may be similar, but it will be MUCH louder. Loudness in the air is usually measured in decibels (dB), but when you're trying to reproduce a sound, the amplitude is relative to the hardware (amplifier and speakers) you are using.

Frequency - How many times the hairs in our ears are vibrating per second. The measure for this is Cycles (back and forth) Per Second, or Hertz, named after the guy who figured this stuff out. The human ear can typically "hear" sounds ranging from about 20 Hertz (20hz) up to somewhere between 16000 and 20000 Hertz (16khz to 20khz; khz means Kilo-Hertz, thousands of cycles per second).

Tone, Musical Tone, Pure Tone, or Note - A continuous "sound" at a certain frequency for a certain duration. If your ear hairs are vibrating at 440hz (440 times back and forth per second) for a while, you are "hearing" an "A Note." A guitar or piano string vibrating at 440hz will make a similar tone. They probably won't sound exactly the SAME, because in the real world things can't continuously vibrate at a set rate over ANY length of time. There are a lot of technical reasons why, which are way beyond the scope of this course. Things that affect the sound you hear include the material that is vibrating, the density of that material, how it started vibrating, artificial damping of the sound, the transit medium (things sound different underwater than they do in the air), etc. On an instrument, when you play a "note" the actual frequency you hear fluctuates many times a second, but it averages pretty close to the "standard" frequency of the note. An "A" on a guitar will be on average about 440hz, although at any time the frequency may be between 437 and 444hz.

Synthesis vs Digital Sampling

There are 2 ways to create a sound for the computer to play: Synthesis and Digital Sampling.

Synthesis - The computer program itself "creates" the sound by sending frequency and/or amplitude parameters to an electronic sound generator. On the PC, this is generally either a sound card or the PC speaker. On the Adlib(tm) or SoundBlaster(tm) cards, the frequency determines the musical tone, and the amplitude determines the volume. This is called FM synthesis (frequency modulation) and is great for producing pure musical tones. Music, for the most part, is a sequence of distinct musical tones, played with varying rhythms. Unfortunately, FM synthesis is awful for trying to recreate "real life" sounds, like the human voice. "Real" sounds are NEVER pure tones. The frequency and amplitude fluctuate thousands of times a second. They may average at around a standard frequency, but the parameters of how they vary with time are too complicated to approximate with any accuracy. Musical instruments are easier to synthesize, since they are designed to play "notes." You at least have a starting point. Where do you start trying to synthesize a cough?

Digital Sampling - The computer "listens" to a real world sound, and "records" the frequency and amplitude changes over time. The process is called "sampling" or "digitizing." On the PC, the SoundBlaster(tm) compatible cards convert the electrical signals from their microphone or line inputs into digital information. It actually only records the amplitude many thousands of times a second. When it "plays" the "sampled" sounds, it converts the data back to an electrical signal which drives the speaker back and forth.

Digital Sampling is the best way to reproduce real world sounds. Pretty much anything you can hear can be sampled into the computer. You can sample your voice saying "Level Completed" and play it in your game. For short duration sound effects and musical tags for your game, digital samples are ideal.

The real drawback is that sampled sounds take up a TON of memory and hard drive space. To accurately reproduce a sound, the computer needs to sample the sound many thousands of times a second, with a very fine amplitude scale. Example: If you sample a five second sound 44,000 times per second, with 2 bytes of amplitude information per sample, the sound will take 440,000 bytes, or about 430k of ram. If you want to sample in stereo, that's about 859k for five seconds! Digital sampling is awkward for long musical pieces. They sound great, but the storage requirements aren't usually worth it.
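
The arithmetic behind that example is just seconds x samples per second x bytes per sample x channels. Here is a minimal sketch of it in Python (used only for illustration; the function name sample_size_bytes is made up for this lesson and is not part of any sound library):

```python
def sample_size_bytes(seconds, sampling_rate, bytes_per_sample, channels=1):
    """Return the raw storage needed for an uncompressed digital sample."""
    return seconds * sampling_rate * bytes_per_sample * channels

# Five seconds, 44,000 samples per second, 2 bytes per sample.
mono = sample_size_bytes(5, 44000, 2)
stereo = sample_size_bytes(5, 44000, 2, channels=2)

print(f"mono:   {mono} bytes, about {mono / 1024:.0f}k")
print(f"stereo: {stereo} bytes, about {stereo / 1024:.0f}k")
```

Try plugging in 22000 samples per second and 1 byte per sample to see how much space you save by giving up some quality.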

For long musical pieces, FM synthesis is generally better than digital sampling, but if you are creative, and your hardware supports it, you can merge the two forms. You can, for example, sample every note on a piano, then tell the computer to play the individual notes in some order, and it will sound exactly like a piano playing a tune. This is a hybrid between pure synthesis and pure sampling, and is called "wavetable synthesis."

What we can do in Euphoria

For our digital sound needs, we're going to use
The Ecstacy Sound System v0.3 for Euphoria v2.1+ Copyright (c) 1999, Liquid-Nitrogen Software.

This library has the best feature set of all the Euphoria Sound Blaster libraries, is easy to implement, and is the most stable.

Here's what it can do (from the documentation)
Mix up to 64 sounds at one time.
Mixing rate from 8000Hz - 44100Hz.
8-Bit, Mono output only.
Each sound can have a separate volume from 0 - 255.
Each sound can have a separate playback rate from 1Hz - 88200Hz.
Each sound can optionally be looped to the beginning when it reaches the end.
Global volume-level from 0 - 255 affects all sounds.
Load 8-Bit or 16-Bit Mono or Stereo ".WAV" files.
Save and load multiple wave files to/from a single ".PAK" file.
Auto-detect sound card, or set BASE_PORT, IRQ and DMA manually.
Optionally write debugging information to a debug file.

Here are the system requirements
To use this library package you will need the following:

A copy of the latest version of the Euphoria Programming Language(V2.1+).
A 486/66Mhz+ CPU.
A Sound Blaster or compatible sound card with a DSP version 2.0 or higher.

Known Bugs:
Doesn't work with Windows 2000. May not work on Windows NT. (unsure)

We have no ability yet to do FM synthesis in Dos32, so the rest of this discussion will be geared toward the Hows and Whats of Digital Sampling.

More Terms and Definitions

Sampling Rate or Sampling Frequency - This is how many samples we are going to record per second. This is how frequently we are looking at the signal coming in. Since it's called a "frequency," the sampling rate is denoted in Hertz or KiloHertz. If you sample a sound 44,000 times a second, it's called a 44khz sample. This has NOTHING to do with the frequency of the SOUND being sampled. You can, for example, sample a 440hz sound 44,000 times a second, making a 44khz sample of a 440hz sound. I know it's confusing. Read this about 20 times and you'll get it.
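
If it helps, here is a small sketch that keeps the two numbers apart. It is written in Python purely for illustration, and sample_tone is a made-up helper, not a function from any sound library. It pretends to "listen" to a perfectly pure 440hz tone (something no real instrument produces) and measures its amplitude 44,000 times a second:

```python
import math

def sample_tone(sound_hz, sampling_rate, seconds):
    """Measure the amplitude of a pure tone sampling_rate times per second."""
    count = int(sampling_rate * seconds)
    return [math.sin(2 * math.pi * sound_hz * n / sampling_rate)
            for n in range(count)]

# A 44khz sample of a 440hz sound, one second long.
samples = sample_tone(440, 44000, 1)

# How many measurements we get for each back-and-forth of the tone.
samples_per_cycle = 44000 // 440

print(len(samples), "samples in one second")
print(samples_per_cycle, "samples for each cycle of the 440hz tone")
```

The sampling rate decides how many numbers end up in the list. The sound frequency decides how quickly those numbers swing up and down.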

There is a relationship between sampling rate and the kinds of sounds you can sample. A 22khz SOUND vibrates the speaker in and out 22,000 times a second, meaning it moves it IN 22,000 times a second, and OUT 22,000 times a second. In order to sample this sound adequately, you need a high enough sampling rate that you can record all the IN's and OUT's. You must be able to sample at least 44,000 times a second to do this. So the maximum SOUND FREQUENCY you can record is half of your SAMPLING RATE. A 22khz sample can reproduce any sound up to 11khz. An 8khz sample can reproduce any sound up to 4khz.
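
The "half the sampling rate" rule is easy to turn into a pair of helper functions. This is only a sketch in Python, and the names max_sound_frequency and min_sampling_rate are invented for this lesson:

```python
def max_sound_frequency(sampling_rate):
    """Highest sound frequency a sampling rate can capture: half the rate."""
    return sampling_rate / 2

def min_sampling_rate(sound_hz):
    """Lowest sampling rate that records every IN and every OUT of a sound."""
    return sound_hz * 2

for rate in (8000, 22000, 44000):
    print(f"{rate}hz sampling rate -> sounds up to {max_sound_frequency(rate):.0f}hz")

print(f"a 22000hz sound needs at least {min_sampling_rate(22000)} samples per second")
```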

Sound Quality - How close to the actual sound is your sample? As we've just seen, the higher the sampling rate, the higher the sound frequency we can reproduce, therefore the closer to the actual sound we get. CDs are sampled at 44.1khz, and they sound great. Your telephone "samples" at about 8khz, and it sounds sucky in comparison. If you have fiber optic phone cable, the sampling rate is about 15khz so it's better.

Amplitude Resolution - This is how many discrete amplitude levels your sound hardware is capable of reproducing. Early Sound Blasters could reproduce 128 amplitude levels for each direction of the speaker vibration in and out, giving 256 total steps. This entire range can be represented by 8 bits or one byte of information. The resolution is called 8-bit. When you hear the term "8-bit sound card" or "8-bit sample" it means that the sound you hear from it will have at most 256 different amplitude variations. Newer sound cards, like the SoundBlaster 16(tm), have 16 bits of amplitude resolution, or 32768 amplitude levels in each direction for 65536 total steps. They drive the speaker back and forth more accurately, so the sound they can produce is closer to what you sampled than 8-bit cards. 16-bit samples have better sound quality than 8-bit samples, because the speaker moves closer to the way the original sound would have moved it, but they take up twice as much hard drive space and memory at the same sample rate.
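
To see what amplitude resolution does to a single sample, here is a rough sketch in Python. The names amplitude_levels and quantize are made up for illustration, and it assumes amplitudes are given as numbers from -1.0 (speaker fully in) to 1.0 (speaker fully out) and stored as signed steps. Real file formats may lay the steps out differently.

```python
def amplitude_levels(bits):
    """Total number of discrete amplitude steps available with this many bits."""
    return 2 ** bits

def quantize(amplitude, bits):
    """Round an amplitude in the range -1.0..1.0 to the nearest signed step."""
    half = amplitude_levels(bits) // 2        # steps in each direction
    step = round(amplitude * half)
    return max(-half, min(half - 1, step))    # clamp to the available range

for bits in (8, 16):
    print(bits, "bits:", amplitude_levels(bits), "levels,",
          "0.3 is stored as step", quantize(0.3, bits))
```

With 8 bits the nearest available step can be noticeably off from the true amplitude. With 16 bits the steps are 256 times finer, so the stored value lands much closer.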

I'm guessing most telephones today have 8 bit resolution. Anyone know?

Sound Mixing - Often we want to be able to play more than one sound at the same time. For example, in our game we will have a siren sound looping in the background and every other game sound needs to play "on top" of it. The ability to layer multiple sounds into a single sound is called Mixing. To use sound mixing, we assign sounds to play on different "channels," and as the sound card prepares to play all the sounds we need it to, it takes the amplitude samples from each of the playing "channels" and averages them to form a composite sample and then plays it. We'll talk more about that next time.
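
As a preview, the averaging idea can be sketched in a few lines. This is Python used only for illustration, mix is a made-up function, and the sample values are invented. It is not how the Ecstacy library is actually called.

```python
def mix(channels):
    """Average the samples from every playing channel into one composite sound."""
    length = max(len(c) for c in channels)
    mixed = []
    for i in range(length):
        # A channel that has already finished playing contributes silence (0).
        total = sum(c[i] if i < len(c) else 0 for c in channels)
        mixed.append(total // len(channels))  # integer average, rounded down
    return mixed

siren = [100, 50, -100, -50, 100, 50]   # looping background sound
shot = [60, -60, 20]                    # short effect played "on top"

print(mix([siren, shot]))
```

Notice that once the short sound runs out, the siren comes through at half its original amplitude. That is one side effect of plain averaging.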

In the next part we'll look at sound and music concepts applicable to games.

  End of Week 5 - Sound and Music Specification
  Part 1 - The Basics of Sound Reproduction on the Computer

If you have any questions for group discussion or have any other questions, comments or suggestions, email them to me at Pastor@BeRighteous.com

Mail monetary donations large or small to
 Lord Generic Productions 1218 Karen Ave Santa Ana, Ca 92704
  A Crash Course in Game Design and Production - Euphoria Edition
(C) Copyright 1996,2001 Lord Generic Productions - All Rights Reserved