Digital storytelling: Audio editing

Ever wanted to create a soundtrack? Like the idea of mixing audio? We provide some tips, principles and examples on how to get creative with sound.

Digital audio: what's going on?

Sounds familiar: analogue recording

Sound is a wave-like vibration of air (or other material). We can capture that sound by capturing an impression of that vibration.

In a gramophone recording, the sound vibrated against a needle which cut a correspondingly jagged groove in a rotating disc. During playback, the groove vibrates the needle which vibrates the air. You can test this with a vinyl record and a five pound note:

This is an analogue recording: the shape of the soundwave is essentially directly captured, like for like, in the recording medium (in this case the jagged groove of the vinyl) — much like how a seismograph draws a wavy line when jostled by an earthquake.

The wave might change medium more than once before it's captured: to be recorded on magnetic tape, the vibration hits a microphone which produces a corresponding electrical current which produces a corresponding magnetic field. But whatever the medium, it's ultimately a direct replication of the original waveform.

Digital recording: a series of still images...

Digital sound is not like analogue sound... it’s more like animation: digital sound is a series of still images arranged together to create the illusion of audio. The more still images taken, the more accurate the representation of the waveform.

A sound wave is sampled and quantised
A sound wave, in red, represented digitally, in blue (after sampling and 4-bit quantization).
CC BY-SA 3.0 Aquegg, Wikipedia

In the above graph, the blue samples are snapshots of the sound wave. There are two dimensions to those samples: the sample rate (x-axis) and the bit depth (y-axis).

Sample rate

The sample rate is the number of samples being made in a given period of time. Think of it like frames of film:

Galloping horse, animated using photos by Eadweard Muybridge

The above gif is made up of 15 still images (numbered 2-16) animating at 10 frames per second (10 fps or 10 Hz). Most cinema film animates at 24 fps (24 Hz): that's sufficient to trick the eye for most humans.

Sound needs a lot more frames to be understandable, not least because sound itself is made up of tiny vibrations: human hearing can discern vibrations between 20 and 20,000 Hz as different pitches, so 24 samples per second is not going to come remotely close to replicating the nuances of those sounds. A sample rate of 8,000 Hz serves for basic telephone communication (though ess-es sound like effs), but to properly capture the range of human hearing you need at least two samples in every cycle of the highest pitch we can hear (in other words, a sample rate of at least double 20,000 Hz). That's why CD audio has a sample rate of 44,100 Hz (or 44.1 kHz), a little over that 40,000 Hz minimum.

The free sound editing tool Audacity supports rates from 8 kHz to 384 kHz. In theory, 60 kHz is considered to be more detailed than the human ear can discern, and 48 kHz is considered the standard rate for most uses.
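The two-samples-per-cycle rule can be checked with a few lines of Python. This is a minimal sketch, not anything Audacity does internally, and the function names `nyquist_limit` and `sample_sine` are made up for illustration. It shows that a 6,000 Hz tone sampled at the 8,000 Hz telephone rate gives the same sample values (flipped in sign) as a 2,000 Hz tone, so once sampled the two pitches cannot be told apart.

```python
import math


def nyquist_limit(sample_rate_hz):
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2


def sample_sine(freq_hz, sample_rate_hz, n_samples):
    """Take n_samples snapshots of a sine wave of the given frequency."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]


# CD audio: the limit sits just above the top of human hearing (20,000 Hz).
print(nyquist_limit(44100))  # 22050.0

# Telephone-rate audio: anything above 4,000 Hz cannot be captured properly.
high_tone = sample_sine(6000, 8000, 16)
low_tone = sample_sine(2000, 8000, 16)

# The 6,000 Hz samples are just the 2,000 Hz samples with the sign flipped.
indistinguishable = all(abs(h + l) < 1e-9 for h, l in zip(high_tone, low_tone))
print(indistinguishable)  # True
```

This kind of mix-up is called aliasing, and it is one reason a rate such as 8,000 Hz loses the high-pitched detail that separates an "s" from an "f".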

Bit depth

Bit depth is the amount of information being recorded in each sample. If samples are like frames of animation, bit depth is like the size of the image being animated, or, more accurately, like the number of colours being used in that image.

4-bit colour rabbit testcard: natural colour variation is reduced to blocks of simple colour
A 4-bit colour palette reduces our bunny to blocks of simple colour.

The above image uses a 4-bit colour palette: that is to say that it is made from 16 colours. Four binary digits can count from 0000 to 1111 (0 to 15 in decimal), and including zero that makes 16 possible values. Because there are only 16 colours to play with, the natural colour of the rabbit photo is reduced to blocks of the nearest available colour in the palette.

With audio bit depth, the 'colours' are amplitudes (amplitude is the amount of vibration occurring in the sound medium — the volume of the sound). The amplitude of a wave at a particular sample point is rounded to the nearest available value ('quantization'). Let's have that graph again from earlier to see that happening:

A sound wave is sampled and quantised
A sound wave, in red, represented digitally, in blue (after sampling and 4-bit quantization).
CC BY-SA 3.0 Aquegg, Wikipedia

The greater the bit depth, the more accurate the representation of the sound (the 'resolution'). CD audio uses a bit depth of 16 which gives a resolution of 65,536 possible amplitudes. To carry on with our colour palette analogy, that's something looking more like this:

16-bit colour rabbit testcard: natural colour variation is pretty accurately reproduced but it could be more naturalistic
A 16-bit colour palette gives a far more natural looking rabbit but blocks of colour are still visible, especially in the gradients at the bottom of the testcard.

24-bit audio offers over 16 million values and is considered to be of a professional standard. It's a close approximation to human hearing capabilities.

24-bit colour rabbit testcard: natural colour variation is represented in a naturalistic way
A 24-bit colour palette gives a suitably natural looking rabbit and the gradients are smooth. The same is even more true for 24-bit audio.
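The rounding step described above can be sketched in Python. This is a simplified illustration, assuming amplitudes run from -1.0 to 1.0 and that the available values are evenly spaced across that range; real audio formats store signed integers and differ in the details. The names `levels` and `quantize` are made up for this example.

```python
def levels(bit_depth):
    """Number of distinct amplitude values available at a given bit depth."""
    return 2 ** bit_depth


def quantize(amplitude, bit_depth):
    """Round an amplitude in [-1.0, 1.0] to the nearest available value.

    Simplified scheme: the 2**bit_depth values are spread evenly from
    -1.0 to 1.0 (an assumption made for illustration).
    """
    steps = levels(bit_depth) - 1
    index = round((amplitude + 1.0) / 2.0 * steps)
    index = max(0, min(steps, index))  # keep out-of-range input in bounds
    return index / steps * 2.0 - 1.0


print(levels(4))   # 16
print(levels(16))  # 65536
print(levels(24))  # 16777216

# The same amplitude lands further from its true value at 4 bits than at 16.
error_4_bit = abs(quantize(0.3, 4) - 0.3)
error_16_bit = abs(quantize(0.3, 16) - 0.3)
print(error_4_bit > error_16_bit)  # True
```

The gap between the true amplitude and its rounded value is the quantization error, and each extra bit halves the largest error that can occur.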

In addition to 16-bit and 24-bit formats, the free sound editing tool Audacity has a 32-bit option, albeit using a 'floating point' method whose rounding error grows as the values get larger. For most purposes, 24-bit is more than sufficient.
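To see why a floating point value is less precise at larger values, the sketch below measures the gap between one 32-bit float and the next one up. It assumes the standard IEEE 754 single-precision layout (an assumption about how 32-bit float samples are stored, not something taken from Audacity's documentation), and `float32_spacing` is a made-up helper name. It only handles positive, finite numbers.

```python
import struct


def float32_spacing(x):
    """Gap between x (stored as a 32-bit float) and the next larger float.

    Works for positive, finite x only: adding 1 to the raw bit pattern
    steps to the next representable value.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    current = struct.unpack("<f", struct.pack("<I", bits))[0]
    following = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return following - current


quiet = float32_spacing(0.001)  # a very quiet sample
loud = float32_spacing(1.0)     # a full-scale sample

# Quiet samples are stored on a much finer grid than loud ones.
print(quiet < loud)  # True
```

In other words, the grid of available values is finest near silence and coarsest near full volume, which is the opposite of the evenly spaced steps used by 16-bit and 24-bit integer formats.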

Sourcing sounds

Bear in mind that published sounds are generally subject to copyright law, so you can’t just use any sound you want.

Fortunately, there are plenty of free-to-use sounds out there. You can find some here:

For more information and advice, take a look at:

Audacity

Audacity is a free audio-editing tool. It can be downloaded for use on your own computer but it's also available on campus machines.


Other tools

Adobe Audition is a more advanced audio-editing tool. It's available as part of the Adobe Creative Cloud suite.

There's also a range of free online tools for sound creation and editing. Here's a selection of interesting tools and resources for use at your own risk:

Audio file types

There are a number of audio file formats. The most common are:

WAVE

Waveform audio file format (.wav) is the standard filetype for digital audio. It is used for professional and archival purposes. Wave files are generally an uncompressed representation of the string of samples that make up a digital recording.

Samples making up a wave
The sample points in a wave file
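As a rough illustration of a wave file being a string of samples, the sketch below builds a short 16-bit mono tone with Python's standard `wave` module and then reads the file's properties back. The function name `sine_wav_bytes` and the 440 Hz tone are arbitrary choices for the example, and the file is kept in memory rather than saved to disk.

```python
import io
import math
import struct
import wave


def sine_wav_bytes(freq_hz=440.0, seconds=1.0, sample_rate=44100):
    """Return the bytes of a 16-bit mono .wav file containing a sine tone."""
    n_frames = int(seconds * sample_rate)
    frames = bytearray()
    for n in range(n_frames):
        value = math.sin(2 * math.pi * freq_hz * n / sample_rate)
        # Scale to the 16-bit range and store as a little-endian integer.
        frames += struct.pack("<h", int(value * 32767))

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
    return buffer.getvalue()


data = sine_wav_bytes(seconds=0.5)

# Read the file back to confirm what was stored.
with wave.open(io.BytesIO(data), "rb") as wav_file:
    info = (
        wav_file.getnchannels(),
        wav_file.getsampwidth(),
        wav_file.getframerate(),
        wav_file.getnframes(),
    )
print(info)  # (1, 2, 44100, 22050)
```

Writing `data` to a file ending in .wav should give something that opens in an audio editor as a half-second tone.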

Because they are generally uncompressed, wave files need to record every point of data in the file. Consequently, file sizes can be very large. A CD-quality 16-bit 44.1 kHz stereo recording takes up about 10 MB of file-space per minute:

  44,100 samples per second x 16 bits of resolution x 2 channels
= 1,411,200 bits per second
= 176,400 bytes per second
= 172 KB per second
= 10.09 MB per minute.
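The same arithmetic can be wrapped in a small Python helper to try other settings. This is a sketch with made-up function names; it counts only the raw sample data (ignoring the small file header) and, like the sum above, treats 1 KB as 1,024 bytes and 1 MB as 1,024 KB.

```python
def wav_bytes_per_second(sample_rate_hz, bit_depth, channels):
    """Raw sample data produced each second by uncompressed audio."""
    bits_per_second = sample_rate_hz * bit_depth * channels
    return bits_per_second // 8


def wav_megabytes_per_minute(sample_rate_hz, bit_depth, channels):
    """The same figure per minute, in MB (1 MB = 1024 * 1024 bytes)."""
    per_second = wav_bytes_per_second(sample_rate_hz, bit_depth, channels)
    return per_second * 60 / (1024 * 1024)


# CD quality: 44.1 kHz, 16-bit, stereo
print(wav_bytes_per_second(44100, 16, 2))                # 176400
print(round(wav_megabytes_per_minute(44100, 16, 2), 2))  # 10.09

# 48 kHz, 24-bit, stereo
print(round(wav_megabytes_per_minute(48000, 24, 2), 2))  # 16.48
```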

mp3

MPEG-1 (or MPEG-2...) Audio Layer III (not that anyone calls it that) is the most common audio file format. It uses 'lossy' compression a bit like the compression used to make a JPEG image file. As with JPEG, there's the ability to select the 'quality' of the file, in this case in the form of the bitrate (the number of kilobits of data being used each second): the lower the bitrate, the smaller the size, but the more compression artifacts (in other words, the worse the sound quality).
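To get a feel for what the bitrate means for file size, here is a rough Python estimate. It assumes a constant bitrate, takes 1 kbps to mean 1,000 bits per second, and ignores any tags or headers in the file; the function names are made up for the example.

```python
def mp3_bytes_per_minute(bitrate_kbps):
    """Approximate size of one minute of constant-bitrate mp3 audio."""
    bits_per_second = bitrate_kbps * 1000
    return bits_per_second * 60 // 8


def reduction_vs_cd(bitrate_kbps):
    """How many times smaller than CD-quality audio (1,411,200 bits/s)."""
    cd_bits_per_second = 44100 * 16 * 2
    return cd_bits_per_second / (bitrate_kbps * 1000)


print(mp3_bytes_per_minute(320))       # 2400000
print(mp3_bytes_per_minute(128))       # 960000
print(round(reduction_vs_cd(320), 2))  # 4.41
```

So even at 320 kbps the file holds under a quarter of the data in the equivalent CD-quality recording, which is why some information has to be thrown away.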

The free sound editing tool Audacity supports a range of bitrates up to 320 kbps, but even this will lose some information from the original recording. The video below demonstrates this by isolating the lost information from a 320 kbps encoding of "Tom's Diner" by Suzanne Vega:

moDernisT_v1 from Ryan Patrick Maguire on Vimeo.

Audacity has a range of mp3 encoding options including a number of preset settings.
