
Audio

Multimedia Week 2

Audio
Sound that can be heard: vibrations, made by electronic devices or other sources, at frequencies audible to humans (about 20-20,000 Hz)

Digital Audio Today


Analog elements in the audio chain are replaced with digital elements.
Mostly linear signal processing.
Wide range of digital formats and storage media.
Rapid development of technology.
Rapid increase of signal processing power => possibility to implement new, complex features.

Timeline of Audio Format (Physical)


1870s: Phonograph Cylinder
1895: Gramophone Record
1930s: Wire Recording
1940s: Reel-to-Reel Magnetic Tape
1948: Vinyl Record
1960s: 8-Track
1963: Compact Cassette
1969: Microcassette
1970s: Elcaset
1979: Compact Disc
1987: Digital Audio Tape (DAT)
1990s: Digital Compact Cassette
1991: MiniDisc
1996: DVD-Audio
1999: Super Audio CD (SACD)
2003: DualDisc

Timeline of Audio Format (Content)


1975: Dolby Stereo
1985: Audio Interchange File Format (AIFF)
1991: ATRAC
1992: Waveform (WAV) & Dolby Digital Surround Cinema Sound
1993: Digital Theatre System (DTS)
1995: MP3
1999: Windows Media Audio (WMA)
2000: Free Lossless Audio Codec (FLAC)
2001: Advanced Audio Coding (AAC)
2002: Ogg Vorbis

Digital Sound
Digitizing sound is done by measuring the voltage at many points in time, translating each measurement into a number, and writing the numbers to a file. This process is called sampling. The sound wave is sampled, and the samples become the digitized sound. The device used for sampling is called an analog-to-digital converter (ADC).
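The sampling process described above can be sketched in a few lines of code. This is a minimal simulation of what an ADC does, assuming a 440 Hz sine tone sampled at 8 kHz (both values are illustrative choices, not from the slides):

```python
import math

SAMPLE_RATE = 8000   # samples per second (illustrative choice)
FREQ = 440.0         # tone frequency in Hz (illustrative choice)
DURATION = 0.01      # seconds of sound to capture

def sample_sine(freq, rate, duration):
    """Measure the 'voltage' of a sine wave at evenly spaced points in time."""
    n = int(rate * duration)
    return [math.sin(2 * math.pi * freq * t / rate) for t in range(n)]

samples = sample_sine(FREQ, SAMPLE_RATE, DURATION)
print(len(samples))   # 80 samples for 10 ms of sound at 8 kHz
```

Each list element is one sample; a real ADC would additionally round each value to a fixed number of bits, which is covered under quantization below.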

Digital Sound
The difference between a sound wave and its samples can be compared to the difference between an analog clock, where the hands seem to move continuously, and a digital clock, where the display changes abruptly every second. Since the audio samples are numbers, they are easy to edit. However, the main use of an audio file is to play it back. This is done by converting the numeric samples back into voltages that are continuously fed into a speaker.

Digital Sound
The main problem in audio sampling is how often to sample a given sound. The second problem is the sample size, i.e., how many bits to use per sample. Audio sampling is also called pulse code modulation (PCM). The term pulse modulation refers to techniques for converting a continuous wave into a stream of binary numbers (audio samples).

Sampling
Physical terms: Frequency, Amplitude, Spectrum

Sampling
Original Signal

PAM

PCM

Audio: Sampling
Clipping

Audio Sampling
Quantization

Block diagram: Sampling
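The two effects named on these slides, clipping and quantization, can be sketched together: clipping flattens samples that exceed the converter's input range, and quantization maps each continuous sample to the nearest of a finite set of levels. A minimal sketch assuming signed 8-bit samples (the bit depth is an illustrative choice):

```python
def quantize(sample, bits=8):
    """Clip a sample to [-1.0, 1.0], then map it to a signed integer level."""
    clipped = max(-1.0, min(1.0, sample))   # clipping
    levels = 2 ** (bits - 1) - 1            # 127 levels each side for 8 bits
    return round(clipped * levels)          # quantization

print(quantize(0.5))    # 64   (0.5 * 127 = 63.5, rounds to 64)
print(quantize(1.7))    # 127  (out of range: clipped to 1.0 first)
print(quantize(-2.0))   # -127
```

The rounding step is where quantization error comes from: every sample between two levels is forced onto one of them.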

Audio Data Rates

Quality                  Format (examples)    Transfer Rate   Disk Space (1 hour)   Disk Space (100,000 hours)
Netcasting               RealAudio            20 Kbit/s       8.8 MByte             0.9 TByte
Preview                  RealAudio            80 Kbit/s       35.2 MByte            3.5 TByte
Preview                  MPEG Layer 3 (MP3)   192 Kbit/s      84.4 MByte            8.4 TByte
Broadcasting or Editing  MPEG Layer 2         384 Kbit/s      168.8 MByte           16.9 TByte
Archive (uncompressed)   Waveform PCM         1538 Kbit/s     675.9 MByte           67.6 TByte
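The disk-space figures follow from the transfer rates with a few lines of arithmetic, assuming 1 Kbit = 1024 bits and 1 MByte = 1024 x 1024 bytes (an inference from the numbers; the slides do not state the units):

```python
def disk_space_mb(rate_kbit_s, hours):
    """Disk space in MBytes, assuming 1 Kbit = 1024 bits, 1 MByte = 1024**2 bytes."""
    bits = rate_kbit_s * 1024 * hours * 3600   # total bits transferred
    return bits / 8 / 1024 ** 2                # bits -> bytes -> MBytes

for rate in (20, 80, 192, 384, 1538):
    print(round(disk_space_mb(rate, 1), 1))
# 8.8, 35.2, 84.4, 168.8, 675.9 -- matching the 1-hour column
```

Multiplying the 1-hour figures by 100,000 and converting to TBytes reproduces the last column the same way.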

Space Requirements
One Minute of Sound

Sampling Rate   Mono 8 bit   Mono 16 bit   Stereo 8 bit   Stereo 16 bit
44.1k           2646k        5292k         5292k          10584k
22.05k          1323k        2646k         2646k          5292k
11.025k         661.5k       1323k         1323k          2646k
8k              480k         960k          960k           1920k
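The one-minute figures are simple products: sampling rate x bytes per sample x channels x 60 seconds (here "k" denotes 1000 bytes):

```python
def bytes_per_minute(rate_hz, bits, channels):
    """Uncompressed PCM storage for one minute of sound."""
    return rate_hz * (bits // 8) * channels * 60

print(bytes_per_minute(44100, 16, 2))   # 10584000 -> the table's 10584k
print(bytes_per_minute(8000, 8, 1))     # 480000   -> the table's 480k
```

This is why uncompressed CD-quality audio (44.1 kHz, 16-bit stereo) needs roughly 10 MByte per minute, which motivates the compression techniques that follow.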

Audio Compression
Motivation
need to minimize transmission costs or provide cost-efficient storage
demand to transmit over channels of limited capacity, such as mobile radio channels
need to share capacity among different services (voice, audio, data, graphics, images) in an integrated-services network

Audio Compression
Two important features of audio compression are that it can be lossy and that it requires fast decoding. The encoder can be slow, but the decoder has to be fast. Sound has three important attributes: its speed, amplitude, and period.

Audio Compression
The speed of sound depends mostly on the medium it passes through and on the temperature. In air, at sea level (one atmosphere of pressure) and at 20 Celsius (68 Fahrenheit), the speed of sound is 343.8 meters per second (about 1128 feet per second). The human ear is sensitive to a wide range of sound frequencies, normally from about 20 Hz to about 22,000 Hz, depending on a person's age and health.

Audio Compression
The sensitivity of the human ear to sound level depends on the frequency (which is why sirens have a high pitch).

Compression Approaches
Delta coding
Encode differences only

Predictive coding
Predict the next sample

Linear Predictive Coding (LPC) - mostly for speech
Describe fundamental frequencies + error
CELP, RPE, cell-phone standards

Variable Rate Encoding
Don't encode silences

Subband coding
Split into frequency bands each encoded separately + efficiently

Psycho-acoustical coding
drop bits where you can't hear them
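The first approach in the list above, delta coding, fits in a few lines: keep the first sample, then store only the difference from each previous sample; decoding is a running sum. The sample values here are illustrative:

```python
def delta_encode(samples):
    """Keep the first sample, then store successive differences only."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

def delta_decode(deltas):
    """Rebuild the original samples with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out

samples = [100, 102, 105, 104, 100]
deltas = delta_encode(samples)
print(deltas)                         # [100, 2, 3, -1, -4]
assert delta_decode(deltas) == samples
```

The gain comes from the differences being small: they can be stored in fewer bits than the raw samples, which is the idea that ADPCM builds on.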

Compression
PCM (Pulse Code Modulation)
u-LAW (Mu-law logarithmic coding)
LPC-10E (Linear Predictive Coding, 2.4 kb/s)
CELP (code-excited LPC, builds on LPC, 4.8 kb/s)
GSM (European cell phones, RPE-LPC), 1650 bytes/sec (at 8000 samples/sec)
ADPCM (adaptive delta PCM, 24/32/40 kbps)
MPEG Audio Layers (builds on ADPCM)
  Layer 2: from 32 kbps to 384 kbps - target bit rate of 128 kbps
  Layer 3: from 32 kbps to 320 kbps - target bit rate of 64 kbps
  Complex compression, using perceptual models
RealAudio, Windows Media formats (build on the above, proprietary)
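The u-LAW entry above refers to mu-law companding: the signal is passed through a logarithmic curve before 8-bit quantization, so small amplitudes get finer resolution than large ones. A sketch of the standard curve with mu = 255, the parameter used in North American telephony:

```python
import math

MU = 255  # mu-law parameter used in G.711 (North American / Japanese telephony)

def mu_law_compress(x):
    """Map a sample in [-1, 1] through the logarithmic mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y):
    """Inverse curve: recover the (approximate) linear sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# A quiet sample at 0.01 uses about 23% of the coded range,
# instead of 1% as it would under plain linear quantization.
print(round(mu_law_compress(0.01), 2))                 # 0.23
print(round(mu_law_expand(mu_law_compress(0.01)), 4))  # 0.01 round trip
```

Quantizing the compressed value rather than the raw sample is what lets 8-bit u-LAW telephony sound comparable to roughly 12-bit linear PCM.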

Practical Time
Get Audacity
Install it
Read the manual
Try to familiarize yourself with it
Do some editing; see what you can do

Homework (1)
Record your voice for about 3-5 minutes. Read or tell a short story. Edit it; try to reduce the original recording down to 2-3 minutes. Add some music in the background. Alternatively, you could do a sound mix; such work will also be acceptable. Give me the original recording and the edited recording on Wednesday (27 August 2012) before 11 pm. Have fun.
