
CMM5013 Multimedia Authoring and Programming

Chapter 4: Multimedia Building Blocks

Part 3 - Sound
What is sound?
Waveforms and attributes of sound

Capturing digital audio


Sampling

Soundcard technology

MIDI (Musical Instrument Digital Interface)

Sound
Sound is a complex relationship involving a vibrating object (sound source), a transmission medium (usually air), a receiver (ear) and a perceptor (brain). As the sound source vibrates, it bumps into molecules of the surrounding medium, causing pressure waves to travel away from the source in all directions. As the pressure waves get further from the source they become weaker, because their energy dissipates.

Waveforms
Sound waves are represented as waveforms
Periodic waveform = a waveform that repeats itself at regular intervals
Noise = a waveform that does not exhibit regularity

Cycle = the unit of regularity
Frequency is measured in Hertz (Hz), after Heinrich Hertz (a pioneer in the study of electromagnetic waves)
One cycle per second = 1 Hz
kHz or kilohertz (1 kHz = 1000 Hz)

Waveforms

Example waveforms
Piano

Pan flute

Snare drum

Capture and playback of digital audio


[Diagram: air pressure variations are captured by a microphone and converted into a voltage; the Analogue to Digital Converter (ADC) turns the signal into binary data (e.g. 0101001101 0110101111); on playback the Digital to Analogue Converter (DAC) converts the binary data back into a voltage, which is reproduced as air pressure variations]

The attributes of sound


Sound is described in terms of several characteristics:
Pitch (the frequency of the waveform, in Hz)
Amplitude (or loudness)
Timbre (or tone quality)

In addition, all sounds have a duration, and successive sounds can form rhythmic patterns over time

The Analogue to Digital Converter (ADC)


An ADC is a device that converts analogue signals into digital signals
An analogue signal is a continuous value
It can have any value on an infinite scale

A digital signal is a discrete value

It has a finite value (usually an integer)

The ADC monitors the continuous analogue signal at a set rate and converts what it sees into a discrete value at that specific moment in time
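As a rough illustration (not part of the original slides), the sketch below samples a sine wave at a chosen rate; the 440 Hz tone, 8 kHz sampling rate and function name are assumptions for the example only.

import math

def sample_sine(freq_hz=440.0, sample_rate_hz=8000, duration_s=0.01):
    """Sample a continuous sine wave at discrete, evenly spaced moments in time."""
    n_samples = int(sample_rate_hz * duration_s)
    # The ADC "looks" at the analogue signal once every 1/sample_rate seconds
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

samples = sample_sine()
print(len(samples), "samples captured in 0.01 s")   # 80 samples at 8 kHz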

Digital sampling: Sampling frequency

Definition: The recording of a signal's value at discrete intervals in time

The Nyquist theorem


The Nyquist theorem states:
In order to be able to reconstruct a signal, the sampling frequency must be at least twice the frequency of the signal being sampled
Named after Harry Nyquist of Bell Telephone Labs (1928)

The highest frequency that can be produced in a given digital audio system (i.e. half the sampling rate) is called the Nyquist frequency
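For example, CD audio sampled at 44.1 kHz has a Nyquist frequency of 44,100 / 2 = 22,050 Hz, which comfortably covers the roughly 20 Hz to 20 kHz range of human hearing.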

Digital sampling: Sample resolution

Sample resolution
The resolution of a sample is the number of bits it uses to store a given amplitude value, e.g.
8 bits (256 different values)
16 bits (65,536 different values)

A higher resolution will give higher quality but will require more memory (or disk storage)

Quantisation
Definition: The restriction of a continuously varying signal to a finite set of discrete values. Samples are usually represented as integers. If the input signal has a voltage corresponding to a value between 53 and 54, the ADC may round it off to 53. Because of this rounding, the value of a sample is generally slightly different from the original signal
This is known as quantisation error and is unavoidable. Increasing the sample resolution can reduce the error, but never eliminate it entirely
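As an illustrative sketch (not from the original slides), the function below quantises a voltage to an 8-bit integer level; the -1.0 V to +1.0 V input range and the function name are assumed conventions.

def quantise(voltage, bits=8, v_min=-1.0, v_max=1.0):
    """Map a continuous voltage onto one of 2**bits discrete integer levels."""
    levels = 2 ** bits                      # 8 bits -> 256 levels
    step = (v_max - v_min) / levels
    index = int((voltage - v_min) / step)   # rounding down introduces quantisation error
    return min(index, levels - 1)           # clamp the top of the range

print(quantise(0.4173))   # 181 -- nearby voltages all map to this same level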

Quantisation example

Calculating the size of digital audio


The size is the raw uncompressed memory that the digital audio occupies. The formula is as follows:

size (bytes) = (sampling rate × duration × resolution × number of channels) / 8

The answer will be in bytes. Where:

sampling rate is in Hz
duration is in seconds
resolution is in bits
number of channels = 1 for mono, 2 for stereo, etc.
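A minimal sketch of this calculation (the function and parameter names are my own):

def audio_size_bytes(sample_rate_hz, duration_s, resolution_bits, channels):
    """Raw, uncompressed size of sampled audio in bytes."""
    return sample_rate_hz * duration_s * resolution_bits * channels / 8

# 10 seconds of 22.05 kHz, 8-bit, mono audio
print(audio_size_bytes(22_050, 10, 8, 1))   # 220500.0 bytes (about 215 KB)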

Calculating the data rate of digital audio


Data rate = the rate at which the data must be supplied to the audio hardware in order to be played back at the correct speed. The formula is as follows:

data rate (bytes per second) = (sampling rate × resolution × number of channels) / 8

The answer will be in bytes per second. Where:

sampling rate is in Hz
resolution is in bits
number of channels = 1 for mono, 2 for stereo, etc.
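For example, CD-quality audio (44.1 kHz, 16-bit, stereo) must be supplied at (44,100 × 16 × 2) / 8 = 176,400 bytes per second, roughly 172 KB/s.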

CD-DA (CD-Digital Audio)


Compact discs are recorded using 2 channels (stereo), 16 bits per sample, at 44.1 kHz. It therefore takes just over 10 MB to store a single minute of CD audio. A compact disc can hold about 74 minutes of digital audio but only around 650 MB of data! A Sony MiniDisc has a data capacity of about 140 MB.
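As a quick check of these figures: 176,400 bytes/s × 60 s = 10,584,000 bytes, just over 10 MB per minute, and 74 minutes comes to roughly 747 MB of raw audio. That is more than the 650 MB quoted for data, largely because audio tracks carry less error-correction overhead than data tracks.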

Digital audio editing software


One of the most powerful and professional PC-based packages is a tool called Sound Forge

http://www.sonicfoundry.com/

The purpose of a soundcard


The purpose of a soundcard is to provide some or all of the following functions:
Play sounds
Provide a MIDI device interface
Synthesise musical sounds and sound effects
Provide a game port interface (joystick, game pad, etc.)
Record sounds from an input (microphone, CD player, etc.)
Mix sounds coming from many sources (microphone, CD player, digital audio output) and send them to an output (speakers, headphones)
Control a CD-ROM drive with a drive controller (mostly in older soundcards)

MIDI (Musical Instrument Digital Interface)


MIDI is a standard for specifying a musical performance. Rather than sending raw digital audio, it sends instructions to musical instruments telling them what note to play, at what volume, using what sound, and so on. The synthesiser that receives the MIDI events is responsible for generating the actual sounds. MIDI files are small
Their size is proportional to the number of MIDI events
Although they cannot contain any digital audio!

MIDI sequencers
A MIDI sequencer allows musicians to edit and create musical compositions much as a word processor edits text
Cut and paste
Insert / delete

Summary
There are two main types of digital audio
Sampled audio
Captured by sampling an analogue waveform at a set rate

MIDI data
Instructions on how to perform some musical composition

Sampled audio requires more storage space than MIDI information. Modern soundcards can capture and play back both sampled audio and MIDI information

Part 4 - Video
Analogue video
What is digital video?
Calculating the size of digital video
Compression techniques
Digital video formats
Video capture hardware
Digital video editing
Consumer desktop video

Analogue video
There are two main analogue video formats
PAL and NTSC

PAL is the European television standard
NTSC is the American and Japanese standard
National Television System Committee
480 lines of vertical resolution out of 525

Television usually has a 4:3 aspect ratio


For every 1 pixel down there are 1.333 pixels across

Digital TV has an aspect ratio of 16:9 (widescreen)

PAL video
The PAL video image is composed of 625 lines. The actual picture is contained in 576 lines
The rest is taken up by the vertical blanking interval, which carries Teletext and other information

[Diagram: the PAL frame is 625 lines, of which 576 lines form the visible picture, 768 pixels wide; 768 / 576 = 1.333 = 4:3, shown at 25 frames per second]

What is digital video?


Digital video is the digitisation of analogue video signals into numerical format. Conversion from analogue to digital format requires the use of an ADC (Analogue to Digital Converter)
A Digital to Analogue Converter (DAC) can be used to output digital video on analogue equipment

Sound can be captured separately


Digital video can have zero or more channels of audio. NICAM stereo broadcasts have 2 channels. DVD has 6 channels of sound (called 5.1)

Calculating the size of digital video


size (bytes) = (width × height × colour depth × fps) / 8
Where:
width = image width in pixels
height = image height in pixels
colour depth is measured in bits per pixel
fps = number of frames per second

Example calculation
Calculate the size of 1 second of PAL video:
size (bytes) = (width × height × colour depth × fps) / 8
             = (768 × 576 × 24 × 25) / 8
             = 33,177,600 bytes
             ≈ 31.64 MB
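A minimal sketch of the same calculation (the function and parameter names are my own):

def video_size_bytes(width, height, colour_depth_bits, fps, duration_s=1):
    """Raw, uncompressed size of digital video in bytes."""
    return width * height * colour_depth_bits * fps * duration_s / 8

per_second = video_size_bytes(768, 576, 24, 25)
print(per_second)                    # 33177600.0 bytes (about 31.64 MB)
print(per_second * 60 / 2**20)       # roughly 1,898 MB for one minute of PAL video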

Compression techniques
Since the size of raw digital video is so prohibitively large, we need some means to compress the information. Lossy compression techniques cause some information to be lost from the original image
You can never recreate the source image from the compressed version

Lossless compression techniques do not lose information


You can always recreate an exact replica of the source information

Compression techniques
The two main forms of compression are:
Compression of repeating information
Take a newsreader as an example: most of the screen does not change (the background, desk, etc.), so we only need to store the parts of the image that actually change

Removal of low-visibility artefacts


Things the eye cannot easily identify can be removed or highly compressed and synthesised upon playback

Digital video formats


Microsoft AVI
Files with a .avi extension

Apple QuickTime
Files with a .mov or .qt extension

MPEG / MJPEG
Files with a .mpg extension

Microsoft AVI
Audio Video Interleave format. Interleaving is a technique used to embed two or more things into the same stream of information: in every chunk of information you will find some video data and some audio data

[Diagram: a single stream of binary data made up of alternating chunks of video information and audio information (8, 16 or 24 bits)]

MPEG video
Named after the Moving Picture Experts Group, who devised the compression and file formats. There are a number of MPEG formats:
MPEG-2 is used for digital TV broadcasts and DVDs
MPEG-1 is a format used for low quality video (generally displayed on computers)
MPEG-1 Layer 3 is the popular encoding mechanism for MP3 audio files
MPEG-4 is a new format for multimedia presentations

Can require separate hardware to decode higher quality MPEG video data

MPEG compression example

A simple scene showing a car moving across a desert landscape

Only the difference between the current frame and the next frame needs to be stored. This is called interframe coding
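A toy sketch of the idea (not the actual MPEG algorithm, which also uses motion compensation and transform coding): store a full frame once, then store only the pixels that change in the frames that follow. The function and frame values below are invented for illustration.

def frame_difference(previous, current):
    """Store only the pixels that changed between two frames (toy interframe coding)."""
    return {i: pixel
            for i, (old, pixel) in enumerate(zip(previous, current))
            if pixel != old}

frame1 = [10, 10, 10, 200, 10, 10]       # flattened greyscale pixels
frame2 = [10, 10, 10, 10, 200, 10]       # the bright "car" has moved one pixel
print(frame_difference(frame1, frame2))  # {3: 10, 4: 200} -- only 2 of 6 pixels stored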

QuickTime
Developed by Apple, Inc., primarily for playback without any hardware assistance. It can achieve compression ratios of 25:1 to 200:1. The QuickTime format can also store audio, graphics, 3D and text, making it much more versatile for multimedia

Video capture hardware


The hardware has various input and output connectors
Composite video in/out S-Video in/out Audio in/out

Special chips provide the processing power to compress/decompress the video information
CODEC (Compressor / Decompressor)

Digital video editing


Analogue tape editing is a linear process
To find the section you want, you may have to fast-forward or rewind the video tape
To move a section to another place in the sequence, you have to either re-record the section onto another tape or physically cut and splice the video tape

Digital video editing can be non-linear


You can move sections around inside the computer and play those sections back in any order Non-destructive editing and Edit Decision Lists (EDL)

Consumer desktop video

Summary
Today we have seen how analogue video formats are composed and how digital video can be used to store them electronically. Digital video demands huge file sizes
even before sound is added!

Compression techniques help to reduce the file sizes to more manageable levels

Part 5 - Animation
Depiction of objects as they vary over time. Traditionally based on individually drawing or photographing the frames in a sequence. Computer animation also results in a sequence of images, but these are created by software.

Animation

Such animation is made from a series of stills and relies on something called persistence of vision. Persistence of vision is the phenomenon whereby an image on the eye's retina remains for a brief time after viewing. This means that a series of still images which vary slightly, if shown rapidly, will give the illusion of movement. If each of the eight pictures below were shown in the same position in rapid succession, the result would be a rotating arrow.

Animation
The pictures shown below are stills of the rotating Ford logo from the Ford Motor Company site.

Animation
TV gives the illusion of continuous movement by showing the stills at a rate of 30 frames per second. Feature films are filmed at a rate of 24 frames per second but each frame is projected twice, giving 48 flashes per second. Today, most animation is performed by computer. Examples are Bugs, Toy Story, Jurassic Park and the BBC's Walking with Dinosaurs. The computer will produce a wire frame of the scene, then apply textures and lighting effects before moving on to the production of the next frame, each of which may take hours or even days to produce.

Cel Animation
Cel animation is the technique used to produce the old Tom & Jerry cartoons and the newer computer-generated Disney cartoons. It draws its name from the sheets of celluloid used in old hand-drawn animations. The celluloid sheets permitted layering, where the background to a sequence could be drawn on one or more sheets and the sheets containing the animated characters placed on top. Such animation starts with the production of keyframes, which are the first and last frames of an action sequence.

Cel Animation
The frames in between the keyframes are then produced using a process known as tweening. Tweening is where the number of frames which must appear between keyframes is calculated and those frames are drawn. Tweening may be performed by computer if the keyframes are not too far apart.
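A minimal sketch of linear tweening between two keyframe positions (the function name and the straight-line assumption are mine; real systems also tween rotation, scale, colour, and so on):

def tween(key_start, key_end, n_inbetweens):
    """Linearly interpolate the in-between positions between two keyframes."""
    frames = []
    for i in range(1, n_inbetweens + 1):
        t = i / (n_inbetweens + 1)           # fraction of the way from start to end
        frames.append(tuple(s + (e - s) * t for s, e in zip(key_start, key_end)))
    return frames

# A point moving from (0, 0) to (100, 50) with 4 in-between frames
print(tween((0, 0), (100, 50), 4))   # [(20.0, 10.0), (40.0, 20.0), (60.0, 30.0), (80.0, 40.0)]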

Computer Animation
Computer animation works in much the same way as Cel Animation. It even uses the same terms, such as keyframes, layers and tweening. Although theoretically limited by the scan rate of the monitor being used, the frame rate of any animation is typically dictated by the memory and processing power of the computer as well as the channel it is being shown across (if any).

Computer Animation
To produce smooth animation, a minimum rate of 15 frames per second must be sustained; any lower and the animation will appear jerky. The frame rate depends mostly on the power of the processor and on the bandwidth between main memory (66/100 MHz bus) and the processor, and between the processor and the graphics card (VL-Bus, PCI, AGP x1/x2/x4). It also depends, of course, on the performance of the graphics card, the speed of its onboard memory and its integrated acceleration techniques.

Computer Animation
The frame rate does of course also depend on the characteristics of the animation frames. The more frames, the greater the demands on the hardware. The pixel depth of each frame also impacts greatly on performance, as does the resolution of each frame of the animation. If you do not have control of the playback hardware, it may be necessary to work to, and recommend, a minimum-specification machine (e.g. MPC2). This will inevitably mean keeping the frame rate and pixel depth low and the size of the animation (pixels high and wide) relatively small.

Morphing
Morphing is a special effect in motion pictures and animations that changes (or morphs) one image into another through a seamless transition. This is done by first creating or scanning the first and last images. Then key points are specified, i.e. points on the original image which should become points on the final image. The computer then produces the frames in between, with each successive frame becoming progressively more morphed.

Morphing
The creator can usually specify the number of intermediate frames to be produced, as well as the number of key points. Both the number of key points and the number of frames will affect the time taken to produce the sequence of frames.
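As a rough sketch of the in-between step only (the warping of the images around the key points is omitted, and the function name and values are invented): each intermediate frame moves the key points and cross-dissolves the pixel values of the two images by a fraction t.

def morph_step(points_a, points_b, pixel_a, pixel_b, t):
    """One intermediate morph frame: move key points and cross-dissolve a pixel value."""
    points = [(xa + (xb - xa) * t, ya + (yb - ya) * t)
              for (xa, ya), (xb, yb) in zip(points_a, points_b)]
    pixel = (1 - t) * pixel_a + t * pixel_b       # simple cross-dissolve of one pixel
    return points, pixel

# Halfway (t = 0.5) between two key points and two greyscale values
print(morph_step([(10, 20)], [(30, 60)], 0, 255, 0.5))   # ([(20.0, 40.0)], 127.5)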

Morphing

Exercise
1) What is sound? What is the difference between sound and a waveform? Please elaborate.
2) What is the difference between PAL and NTSC?
3) What is morphing?
4)
