
Overview of multichannel reproduction techniques

Jukka Rauhala Helsinki University of Technology


jrauhala@cc.hut.fi

Abstract

Stereo systems have dominated audio reproduction for many decades. Recently, multichannel systems have gained commercial interest for two main reasons. First, the movie business has invested heavily in the home theater concept, which includes multichannel audio. Secondly, digital signal processing and the transition to digital storage formats have made multichannel audio storage and processing much easier. Currently, there are a number of formats, loudspeaker configurations and coding methods. The most popular formats are Dolby Digital and DTS. The purpose of this paper is to give a brief overview of the most important multichannel solutions, including different loudspeaker configurations and coding methods.

1 INTRODUCTION

A multichannel reproduction system is simply an audio reproduction system with more than two channels. The main reason for increasing the number of channels is to enhance the listener's spatial perception. With two channels, usually referred to as stereo, only a very limited sense of space can be created. The main limitations are that the optimal listening area is very small and that, depending on the loudspeaker placement, the sound sources are usually confined to the front of the listener.

Multichannel reproduction covers the recording, mixing and actual reproduction of multichannel sound. This paper concentrates on reproduction, that is, on what happens after recording and mixing. Multichannel reproduction systems can improve the sense of space significantly. However, open questions remain: which loudspeaker configuration is optimal, and which coding method and multichannel technique are best.

The era of audio reproduction started monophonically. Audio systems then moved from one channel to two. The first commercial stereo reproduction systems were introduced in the 1950s, based on Alan Blumlein's research.
Since then, stereo has been the dominant configuration in reproduction systems. At the same time, the first commercial multichannel systems were launched in movie theaters. Although multichannel systems have existed for many decades, they did not gain popularity until recent years. In particular, the DVD format, a digital video format that includes multichannel audio, has become very popular (Rumsey, 2001).

Chapter 2 reviews the basic issues related to spatial hearing. Chapter 3 introduces the most important loudspeaker configurations. Chapter 4 examines the major multichannel coding methods.

2 SPATIAL HEARING

Spatial hearing plays an important role in multichannel sound reproduction, and knowledge of psychoacoustics is essential when developing multichannel systems. The important characteristics of spatial hearing include localization, distance perception, depth perception and spaciousness. These characteristics are covered briefly in this chapter.

There are three important cues for sound localization. First, the Interaural Time Difference (ITD) (Rumsey, 2001) means that the same sound arrives at the two ears at slightly different times, depending on the angle of the sound source relative to the head. Secondly, the Interaural Level Difference (ILD) (Rumsey, 2001) similarly means that the same sound is perceived with a slight amplitude difference between the ears, depending on the angle. Thirdly, the sound carries spectral cues caused by the physical characteristics of the human body. Together, these three cues enable humans to localize sounds very accurately.

One important term related to localization is the Head-Related Transfer Function (HRTF) (Rumsey, 2001). An HRTF is a transfer function from the sound source to the ear canal. There is a unique HRTF for every source direction, including elevation. HRTFs can differ considerably between people, since they depend on the listener's physical characteristics, such as the size and shape of the shoulders, neck, back and pinnae.

Distance and depth perception mean the ability to judge the distance of a sound source. Rumsey (2001) lists five differences that a listener can recognize when the same sound is heard from two sources at different distances:

1. The more distant one is quieter.
2. The more distant one has less high-frequency content.
3. The more distant one is more reverberant.
4. The more distant one has a smaller time difference between the direct sound and the first floor reflection.
5. The more distant one has an attenuated ground reflection.

An important factor that determines the ability to recognize distance is the reflectiveness of the environment. In a more reflective room, the listener is able to determine the distance of a sound source better than in a less reflective room. The reason is that the auditory system uses the reverberation time and the timing of the early reflections to resolve the distance.

3 CHANNEL AND LOUDSPEAKER CONFIGURATIONS

In this chapter, the most important loudspeaker configurations are introduced. In Section 3.1, three-channel stereo is presented. Four-channel surround is studied in Section 3.2. The most popular multichannel configuration, 5.1-channel surround, is introduced in Section 3.3. In addition, other configurations are briefly covered in Section 3.4.

3.1 Three-channel stereo

Three-channel stereo is the simplest multichannel configuration. Figure 1 shows an example of a three-channel stereo system. In addition to the normal left and right channels, it has a center channel. It never became very popular, as it provides a very limited sense of space compared to the alternatives. However, it is the basis for many other multichannel configurations.

Figure 1. Layout of three-channel stereo system (loudspeakers L, C, R).

The origin of three-channel stereo can be traced back to the 1930s, when Steinberg and Snow developed a stereophonic system with three channels. It has also been used in movie sound systems, as it enables the use of a wide screen. In the early 1990s, Michael Gerzon proposed developing three-channel stereo systems (Rumsey, 2001), but the proposals were not very successful.

Three-channel stereo has some advantages over two-channel stereo. First, it enables a wider stage or screen. Secondly, it has a better center image. Thirdly, it enables a wider listening area. Clearly, however, three channels cannot compete with systems that have more channels. Moreover, the loudspeakers are all in front of the listener, so there is no 360-degree sense of space. Three-channel stereo also has one big disadvantage compared to two-channel stereo: the center loudspeaker is difficult to place, for example in home theaters.

3.2 Four-channel surround

Four-channel surround, or '3-1 stereo', was an effort to enhance three-channel stereo to create a greater sense of space. It adds a surround channel, which is located behind or to the side of the listener, as seen in Figure 2. The surround channel can be reproduced using multiple loudspeakers. The configuration was developed for movie applications by 20th Century Fox in the 1950s.

Figure 2. Layout of four-channel surround system (front loudspeakers L, C, R and a surround channel array).

Four-channel surround is clearly a better alternative than three-channel stereo, as the surround channel enables sounds behind the listener. However, the surround channel is very limited: it does not succeed in creating spaciousness or envelopment.

3.3 5.1-channel surround

5.1-channel surround is the most popular multichannel system at the moment. It has the same three-channel front image as the previous systems. In addition, it has two rear/side channels and one band-limited effects channel called the Low-Frequency Effects (LFE) channel, as seen in Figure 3. The 5.1-channel loudspeaker layout and channel configuration were specified by the International Telecommunication Union (ITU-R) in 1993 (ITU-R, 1993).

Figure 3. Layout of 5.1-surround system (front loudspeakers L, C, R and surround loudspeakers LS, RS).

Having two rear channels instead of one improves the sense of space considerably. In theory, 5.1-channel surround can almost achieve 360° sound localization. However, it was designed to provide just a three-channel stereo sound image at the front, while using the rear channels for effects. 5.1-channel surround was also designed to be compatible with two-channel stereo recordings.

Even though 5.1-channel surround is a very good configuration compared to the others, it still has serious drawbacks. First, it does not provide a 360° sound image, as discussed above. Secondly, because it is compatible with two-channel stereo, its front sound stage is narrower than it could be. Thirdly, positioning the two rear loudspeakers creates two kinds of problems: specific knowledge of the ideal loudspeaker layout is needed, and it can be difficult to find physical locations that create the optimal sound image.

3.4 Other configurations

In addition to the configurations presented here, a number of other configurations exist. 5.1-channel surround is the most commonly used, but it still has its challenges, and better solutions are under development. One extension to 5.1-channel surround is to change the band-limited LFE channel into a full-bandwidth channel. 7.1-channel surround adds two additional front loudspeakers. There are 7.1 formats for both movie theaters and consumer products. A quadraphonic system is a four-channel system in which the loudspeakers form a square, 90° apart as seen from the listener. Advanced techniques, such as VBAP, allow an arbitrary number of channels to be used in two or three dimensions.

4 SURROUND SYSTEMS

In this chapter, multichannel surround systems are introduced. A surround system here means a system that can include formats for coding and transferring surround sound. In Section 4.1, analog surround systems are presented. In Section 4.2, digital surround systems are studied. In addition, multichannel techniques, including wave field synthesis, are briefly covered in Section 4.3.
4.1 Analog surround systems

One problem with analog surround systems has been the requirement for compatibility with two-channel systems. It is an important requirement, because it cannot be assumed that multichannel systems exist everywhere. The main solution used in analog surround systems is matrixing, which is used in all the systems described here. Matrixing simply means that the source channels are matrixed into fewer channels in order to be compatible with two-channel systems. The channels are then dematrixed to reproduce the multichannel sound. A typical matrixing form is "4-2-4": four channels are encoded into two and decoded back into four. The center channel is added to both the left and right channels in phase, and the surround channel is added to both the left and right channels out of phase (Rumsey, 1999). The problem with matrixing is that it creates side effects, such as front signals appearing in the rear channels.

4.1.1 Dolby Labs' systems

Dolby Labs released their Dolby Stereo surround system in the 1970s (Rumsey, 2001). It was designed for movie theaters and had several formats, from three up to six channels. However, the term "Dolby Stereo" usually refers to their four-channel system, which has been very popular in the movie business. It used the 3-1 configuration, with three front channels and one surround channel.

In 1982, Dolby launched the Dolby Surround system, which was an attempt to bring multichannel systems to the consumer market (Rumsey, 2001). It was based on Dolby Stereo and used the same matrix, so movies could be decoded at home in the same way as in movie theaters. Passive Dolby Surround decoding suffers from serious crosstalk between adjacent channels. Dolby developed the Dolby Pro Logic system, also targeted at the consumer market, to improve on Dolby Surround (Rumsey, 2001). A Pro Logic decoder tries to find the direction of the dominant signal component and then attenuates the other channels.

All these Dolby Labs systems were developed especially for movie sound. They have some characteristics, such as a stereo image that moves as the content changes and a front image that tends to collapse toward the center, which make them unsuitable for music reproduction. There have nevertheless been some efforts to use them for music, with mixed results.

In addition, one problem with these Dolby systems was the inability to place sounds directly to the rear of the listener. Hence, in 1998, Dolby developed a new system together with Lucasfilm THX (Rumsey, 2001), called "Dolby Digital Surround EX". It adds a center rear channel, as seen in Figure 4.
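The crosstalk of passive matrix decoding can be illustrated numerically. The following Python sketch assumes a simplified real-valued 4-2-4 matrix with -3 dB gains and a plain polarity inversion for the out-of-phase surround; practical encoders typically use phase shifters instead, and the function names are hypothetical rather than part of any Dolby specification.

```python
import math

G = 1 / math.sqrt(2)  # -3 dB matrix gain (assumed for this sketch)


def encode_4_2_4(left, center, right, surround):
    """Matrix four source channels into a two-channel pair (Lt, Rt)."""
    lt = left + G * center + G * surround   # center and surround added to the left
    rt = right + G * center - G * surround  # surround polarity inverted on the right
    return lt, rt


def decode_passive(lt, rt):
    """Passive dematrixing: sums and differences only, no steering."""
    left = lt
    right = rt
    center = G * (lt + rt)    # in-phase content goes to the center
    surround = G * (lt - rt)  # out-of-phase content goes to the surround
    return left, center, right, surround


# A signal present only in the left source channel.
decoded = decode_passive(*encode_4_2_4(1.0, 0.0, 0.0, 0.0))
print([round(x, 3) for x in decoded])
```

With a signal in the left channel only, the decoded center and surround outputs both carry that signal at about 0.707 of its level. This is the kind of leakage from front to rear that steering decoders such as Pro Logic try to reduce.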

Figure 4. Dolby Digital Surround EX channel configuration (front loudspeakers L, C, R, surround loudspeakers LS, RS and a matrix-derived rear center loudspeaker).

4.1.2 Other systems

Rocktron Corporation developed the Circle Surround system mainly to compete with Dolby Surround (SRS, 2003). The technique was similar to Dolby Surround, but with some enhancements. It had separate modes for music and video. In addition, it was claimed to reproduce unencoded material as well as encoded material. Circle Surround has not been widely used.

Lexicon, a company best known for its effects processors, also developed a multichannel surround matrix decoder that is suitable for music as well (Lexicon, 2003). They called their system "Logic 7". A notable difference from the systems presented earlier is that Logic 7 can provide seven channels. The system is able to produce a good surround effect from two-channel material. Logic 7 decodes Dolby matrix material in a way very similar to Dolby Pro Logic. It is also claimed to reproduce a 3-2 format movie that has been 3-1 matrix surround encoded in a way that is very close to the original.

4.2 Digital surround systems

Digital surround systems have become very popular in the past few years, for two main reasons. First, digital storage has a great advantage over analog storage: it is cheap, and it is easy to store several channels separately, which was not possible with common analog storage formats. Hence, two-channel compatibility is not the problem for digital surround systems that it was for analog ones. Secondly, digital video formats, which also include multichannel sound, have become very popular. The most common digital surround systems are briefly introduced in this section.

4.2.1 Dolby Digital

Dolby Digital is a signal coding and representation method developed by Dolby Labs (Dolby Labs, 2003). It is the digital counterpart of Dolby Stereo and Dolby Surround. The main purpose of its creation was to get rid of analog matrix encoding. It uses the AC-3 coding algorithm to encode and decode multiple channels at bit rates from 32 kbit/s up to 640 kbit/s. Dolby Digital is very widely used, especially on DVD video releases.

In addition, Dolby Digital has extra features, including dialog normalization, dynamic range control information and downmix control information. Dialog normalization is a method used especially in broadcasting to ensure that the dialog level does not differ too much between programs. Dynamic range control information is a mechanism that carries control information alongside the audio data, so that the decoder can adapt the dynamic range to the listening environment, for example a quiet room versus a room with a lot of background noise.
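As a rough illustration of the downmix control information listed among these features, the sketch below lets a decoder fold five full-bandwidth channels into two using coefficients delivered with the audio. The function name, channel labels and coefficient values are assumptions made for illustration and do not reproduce the actual AC-3 bitstream syntax; the LFE channel is simply left out of the downmix here.

```python
def downmix_to_stereo(frame, center_gain, surround_gain):
    """Fold one five-channel sample frame into a two-channel (Lo, Ro) pair.

    frame maps the hypothetical labels "L", "C", "R", "Ls", "Rs" to sample
    values; the two gains play the role of the transmitted coefficients.
    """
    lo = frame["L"] + center_gain * frame["C"] + surround_gain * frame["Ls"]
    ro = frame["R"] + center_gain * frame["C"] + surround_gain * frame["Rs"]
    return lo, ro


# Example: the mixer chose a gain of 0.5 (about -6 dB) for center and surround.
frame = {"L": 0.2, "C": 0.4, "R": 0.1, "Ls": 0.0, "Rs": 0.2}
lo, ro = downmix_to_stereo(frame, center_gain=0.5, surround_gain=0.5)
print(round(lo, 3), round(ro, 3))
```

Because only the coefficients travel with the stream, the same multichannel audio can serve both multichannel and two-channel playback without storing a second mix.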
Downmix control information means that, instead of carrying two separate mixes for multichannel and two-channel reproduction, the stream carries downmix coefficients. The person doing the mix can set these coefficients to ensure that the downmix sounds as intended regardless of the channel configuration. The actual downmix is then generated by the decoder using the coefficients.

4.2.2 DTS

DTS (Digital Theater Systems) is a signal coding method very similar to Dolby Digital (DTS, 2003). The main difference is that its bit rates range from 32 kbit/s up to 4.096 Mbit/s. Hence, the maximum bit rate is much higher than that of Dolby Digital. However, this does not necessarily mean that the quality of DTS coding is better than that of Dolby Digital. DTS also provides additional features, such as variable bit rate coding, lossless coding, a downmix control option and a dynamic range control option. It is also a very common format on DVD video releases, but not as common as Dolby Digital.

4.2.3 SDDS

Sony Dynamic Digital Sound (SDDS) is another digital film sound format. It uses Sony's ATRAC coding method for coding audio signals. The main difference from Dolby Digital and DTS is that SDDS targets a 7.1 configuration instead of a 5.1 configuration. SDDS is not very commonly used.

4.2.4 MPEG-2

MPEG-2 is a multichannel codec in the MPEG codec family developed by the Moving Picture Experts Group (MPEG). There are two versions of it: MPEG-2 BC and MPEG-2 AAC. In addition, the MPEG-4 format also includes multichannel features. MPEG surround formats have not been very popular.

MPEG-2 BC is a version that is backwards compatible with the corresponding two-channel format, MPEG-1. This means that MPEG-2 BC coded data can be decoded with an MPEG-1 decoder. There are two main problems with MPEG-2 BC. First, it is not possible to do the downmix in the decoder. Secondly, the bit rate is higher than it would be without backwards compatibility.

MPEG-2 AAC is an advanced algorithm that does not include backwards compatibility with MPEG-1. It encodes multichannel data into one bitstream and can take advantage of interchannel redundancy. Hence, the MPEG-2 AAC algorithm is very powerful, and its encoding results can be considered similar to those of Dolby Digital.

4.3 Multichannel techniques

In this section, important multichannel techniques are studied. These techniques are able to use more loudspeakers than the systems presented previously. In addition, they make it possible to get closer to 360° sound localization. The techniques presented in this section are pairwise panning, Ambisonics, Vector Base Amplitude Panning (VBAP) and wave field synthesis.

4.3.1 Pairwise panning

Pairwise panning is a technique that creates a virtual sound source between two loudspeakers. This is done by changing the energy ratio of the direct signal applied to the loudspeaker pair (Chowning, 1971). It can be applied to a system with multiple loudspeakers by using one or two loudspeakers at a time to create the virtual source (Pulkki, 1999).
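One common way to choose the gains of a loudspeaker pair is the tangent panning law combined with constant-power normalization. The sketch below uses that law as an assumption; it is not necessarily the formulation used in the cited papers, and the function name, the sign convention and the default 30-degree half aperture are illustrative choices.

```python
import math


def pan_gains(source_deg, half_aperture_deg=30.0):
    """Return (g_left, g_right) for a virtual source between two loudspeakers.

    Angles are measured from the median plane, positive toward the left
    loudspeaker. The loudspeakers sit at +/- half_aperture_deg.
    """
    if abs(source_deg) > half_aperture_deg:
        raise ValueError("source must lie between the two loudspeakers")
    # Tangent law: tan(source) / tan(aperture) = (gl - gr) / (gl + gr)
    ratio = math.tan(math.radians(source_deg)) / math.tan(math.radians(half_aperture_deg))
    g_left, g_right = 1.0 + ratio, 1.0 - ratio
    # Normalize so that g_left**2 + g_right**2 == 1 (constant total power).
    norm = math.hypot(g_left, g_right)
    return g_left / norm, g_right / norm


for angle in (0.0, 15.0, 30.0):
    gl, gr = pan_gains(angle)
    print(angle, round(gl, 3), round(gr, 3))
```

A source in the middle receives equal gains, while a source at the loudspeaker direction is reproduced by that loudspeaker alone. In a multi-loudspeaker setup the same function would be applied to whichever adjacent pair encloses the source.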
The sound quality achieved with pairwise panning is good. The main problem is spreading: the perceived angle of the virtual source depends on frequency. Hence, the direction of a broadband signal is not perceived accurately; instead, the signal appears to spread. Tests by Pulkki show that the directional spread can be as high as 5 degrees (Pulkki, 1999).

4.3.2 Ambisonics

Ambisonics is a multichannel recording and reproduction system. In this paper, only the reproduction part is covered. Ambisonics was originally developed by Gerzon, Barton and Fellgett in the early 1970s. It has different signal formats. First, the A-format is for microphone pickup. Secondly, the B-format is for studio equipment and processing. Thirdly, the C-format is for transmission. Fourthly, the D-format is for decoding and reproduction. In addition, the UHJ format is used for encoding multichannel audio in a way that remains compatible with mono and stereo systems.

The C-format, as well as the UHJ format, includes four signals: L, R, T and Q. L and R are the two-channel compatible channels, the T channel extends the horizontal image and the Q channel adds the third dimension. The D-format signals are reproduced using loudspeakers, and they can be derived from either B- or C-format signals. One great advantage of Ambisonics is that it does not limit the number of loudspeakers used. On the other hand, it performs best when the loudspeakers are placed evenly around the listeners (Pulkki, 1997). Ambisonics was a very promising system in its time, but it never became widely used.

4.3.3 VBAP

Vector base amplitude panning is a multichannel technique that can create two- or three-dimensional sound fields using an unlimited number of loudspeakers. VBAP was developed by Ville Pulkki. VBAP is more flexible than Ambisonics, because it does not prescribe the loudspeaker placement. In addition, it is computationally very efficient (Pulkki, 1997).

Three-dimensional VBAP is based on loudspeakers forming triangles. Each virtual source is mapped to one triangle, called the active triangle, as seen in Figure 5. The virtual source is reproduced using amplitude panning on the three loudspeakers of the active triangle. The gain factors for the loudspeakers are calculated using vector arithmetic (Pulkki, 1997).
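The vector arithmetic behind the gain factors can be sketched as follows. The unit vectors l1, l2, l3 pointing to the loudspeakers of the active triangle and the source direction p satisfy p = g1*l1 + g2*l2 + g3*l3, and the gains are obtained by solving this 3x3 system. The sketch uses Cramer's rule in plain Python; the function names, the angle convention and the example loudspeaker directions are assumptions for illustration and are not taken from any reference implementation.

```python
import math


def unit_vector(azimuth_deg, elevation_deg):
    """Cartesian unit vector for a direction given in degrees."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))


def det3(a, b, c):
    """Determinant of the 3x3 matrix whose rows are a, b and c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def vbap_gains(source, speakers):
    """Gains (g1, g2, g3) such that source is parallel to sum(g_i * l_i)."""
    l1, l2, l3 = speakers
    d = det3(l1, l2, l3)
    if abs(d) < 1e-12:
        raise ValueError("loudspeaker vectors do not span three dimensions")
    # Cramer's rule: replace one loudspeaker vector at a time by the source.
    g = (det3(source, l2, l3) / d, det3(l1, source, l3) / d, det3(l1, l2, source) / d)
    if min(g) < -1e-9:
        raise ValueError("source lies outside this triangle")
    g = tuple(max(x, 0.0) for x in g)  # remove tiny negative rounding errors
    norm = math.sqrt(sum(x * x for x in g))  # constant-power normalization
    return tuple(x / norm for x in g)


# Hypothetical active triangle: two front loudspeakers and one elevated.
triangle = [unit_vector(30, 0), unit_vector(-30, 0), unit_vector(0, 45)]
print([round(x, 3) for x in vbap_gains(unit_vector(10, 15), triangle)])
```

A negative gain indicates that the source falls outside the chosen triangle, which is how a full system would decide which triangle is the active one.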
Figure 5. Virtual source reproduction using three-dimensional VBAP (the virtual sound source lies inside the triangle formed by sound sources 1, 2 and 3).

4.3.4 Wave field synthesis

Wave field synthesis (WFS) is another multichannel reproduction technique. It was introduced by Berkhout (De Vries and Boone, 1999). It can be used together with wave field analysis (WFA), a method for recording multichannel audio data for WFS. WFS is based on generating sound fields with arrays of loudspeakers, as seen in Figure 6. These sound fields maintain their temporal and spatial properties inside the listening area surrounded by the loudspeaker arrays. The sound pressure is distributed along the loudspeaker arrays so that all loudspeakers produce sound, resulting in the desired sound field. An unlimited number of loudspeakers can be used; the only requirement is that each loudspeaker has omnidirectional characteristics.
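The idea that every loudspeaker of the array contributes to the field can be illustrated with a strongly simplified delay-and-attenuate sketch for a virtual point source behind a straight array. Each loudspeaker replays the source signal delayed by the propagation time from the virtual source. The 1/sqrt(r) amplitude weighting, the speed of sound and all names are assumptions of this sketch; complete WFS driving functions include further filtering and weighting terms that are omitted here.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def wfs_point_source(speaker_xs, source_x, source_depth):
    """Return a (delay_seconds, gain) pair for each loudspeaker.

    The loudspeakers sit on a line at the x positions in speaker_xs (metres).
    The virtual source is source_depth metres behind that line at x = source_x.
    Delays and gains are relative to the loudspeaker nearest to the source.
    """
    if source_depth <= 0:
        raise ValueError("the virtual source must lie behind the array")
    distances = [math.hypot(x - source_x, source_depth) for x in speaker_xs]
    nearest = min(distances)
    drives = []
    for r in distances:
        delay = (r - nearest) / SPEED_OF_SOUND  # later arrival for farther loudspeakers
        gain = math.sqrt(nearest / r)           # assumed 1/sqrt(r) weighting
        drives.append((delay, gain))
    return drives


# Hypothetical five-loudspeaker array with 0.5 m spacing, source 2 m behind it.
for delay, gain in wfs_point_source([-1.0, -0.5, 0.0, 0.5, 1.0], 0.0, 2.0):
    print(round(delay * 1000, 3), "ms", round(gain, 3))
```

The resulting delays curve away from the loudspeaker nearest to the source, so the individual contributions add up to an approximately circular wavefront that appears to originate from the virtual source position.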

Figure 6. An example of an array of loudspeakers for WFS.

With WFS, very high-quality spatial sound reproduction can be achieved, especially when the spacing between the loudspeakers is no more than half of the wavelength. On the other hand, WFS requires a large number of loudspeakers in order to produce good quality. Comparing WFS with Ambisonics or VBAP shows that the basis of WFS is very different, as it uses all loudspeakers to reproduce a sound instead of just a few. One big similarity between the three techniques, however, is that none of them limits the number of loudspeakers.

5 CONCLUSION

In this paper, multichannel reproduction techniques have been studied. Three conclusions can be drawn. First, to create spatial sound sources with multichannel techniques, it is better to have as many channels and loudspeakers as possible. However, compromises have to be made in practical situations. Of the channel and loudspeaker configurations introduced here, 5.1-channel surround seems to be the best alternative. Secondly, digital surround systems have major advantages over analog surround systems. The main one is that storing digital data is much easier: multichannel data can be stored in separate tracks, which also makes it easy to maintain compatibility with two-channel systems. Thirdly, multichannel techniques such as VBAP and WFS seem very interesting, and they may enable many new applications in the future.

REFERENCES

Begault, D. R. 1994. 3-D sound for virtual reality and multimedia. San Diego, CA. Academic Press Professional, Inc. 293 pages.

Chowning, J. 1971. The simulation of moving sound sources. J. Audio Eng. Soc., vol. 19, no. 1. Pages 2-6.

De Vries, D. and Boone, M. 1999. Wave field synthesis and analysis using array technology. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, New York.

Dolby Labs. 2003. www.dolby.com.

DTS. 2003. www.dtstech.com. Digital Theater Systems.

ITU-R. 1993. Recommendation BS.775: Multichannel stereophonic sound system with and without accompanying picture. International Telecommunication Union.

Lexicon. 2003. www.lexicon.com.

Mac Caba, C. 2002. Surround audio that lasts: Future-proof Ambisonic recording and processing technique for the real world. AES 112th Convention, Munich, Germany.

Pulkki, V. 1997. Virtual sound source positioning using vector base amplitude panning. Journal of the Audio Engineering Society. Pages 456-466.

Pulkki, V. 1999. Uniform spreading of amplitude panned virtual sources. Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. Mohonk Mountain House, New Paltz, New York.

Rumsey, F. 2001. Spatial Audio. Woburn, MA. Focal Press. 240 pages.

SRS Labs Inc. 2003. www.srslabs.com.
