XYtri – From stereo to surround, and back!

Abstract

XYtri is an easy-to-use, integrated solution born out of the need for accurately verifiable,
uncompromised traditional stereophonic recordings when tracking small- to large-scale
projects that are also to be captured in surround.

<Img: XYtri#1>

The XYtri-setup consists of three XY-type configurations (with microphones of cardioid
characteristic), subsequently named L-XY, C-XY and R-XY, which result in three coincident
and therefore quite mono-compatible stereophonic ranges (facing L, C and R), as well as two
connecting runtime-based stereophonic ranges (L-XY_R to C-XY_L and C-XY_R to R-XY_L).

The following paper, first presented at the 25th International Audio Convention in Leipzig
(November 2008), details XYtri's technical realization and the rationale behind and history of
its development. It is accompanied by auditory samples of original 6-channel recordings and
derived mixups (7.0 / 7.1) and mixdowns (5.1 / 2.0).

Technical realization

The original design (XYtri#1) uses three XY-pairs at ±45°, each covering a range of 196°,
arranged in the shape of an isosceles triangle with a distance of 51.5 cm between L-XY
/ C-XY and C-XY / R-XY. This results in two AB-ranges of 180° based on a runtime-difference of
about 1.5 msec. [Calculations according to data by Eberhard Sengpiel]
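
The runtime-difference can be cross-checked from the spacing alone. The following is a minimal sketch, assuming a speed of sound of 343 m/s (roughly 20 °C, a value not stated in this paper) and sound travelling along the line that joins the two pairs, which gives the largest possible delay. The function name is illustrative only.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 °C


def max_runtime_difference_ms(spacing_m, speed=SPEED_OF_SOUND_M_S):
    """Largest time-of-arrival difference between two spaced microphones.

    The maximum occurs for sound travelling along the line joining them.
    """
    return spacing_m / speed * 1000.0


# 51.5 cm between adjacent XY-pairs, as in XYtri#1
delay_ms = max_runtime_difference_ms(0.515)
print(f"{delay_ms:.2f} ms")
```

Under these assumptions the result agrees with the figure of about 1.5 msec quoted above.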

There have been two subsequent modifications:

• XYtri#2
The angle of L-XY and R-XY has been extended to ±67.5°. While L-XY_R and C-XY_L are kept
parallel, there is now a rear-facing AB-configuration between the now also parallel L-XY_L
and R-XY_R. The angle of ±67.5° results in a coverage of 143°. The direction of the
outward-facing XY-pairs is no longer parallel to the base-axis but facing down by 21.5°.

• XYtri#3
The angle of all three XY-pairs, placed within an equilateral triangle, has been set to ±60°,
resulting in a uniformly coincident coverage of 158° while creating three uniform
runtime-based angles. The direction of the outward-facing XY-pairs is 20° down from base-axis.

A modified configuration of XYtri#1 tilts all three XY-pairs down by 45° while moving the
direction of both side-facing pairs back by ±30°, forming a tetrahedron. This setup is in many
ways equivalent to XYtri#3 and can be applied at a raised elevation, which is of particular use
in live recordings of concerts as this set does not obstruct the view as much as the original
version. The sonic quality has not yet been tested.

<Img: XYtri#1 configuration tilted down 45°>

For the horizontally level sets, the advantage of flying them above and slightly behind the
conductor's head is that sound approaching from close below is attenuated due to the
cardioid characteristic of the capsules. Sound from instrumentalists further off (and usually
raised), e.g. the woodwinds or the brass, approaches at an angle closer to the 0°-axis.

From stereo to surround…

The skill of assessing a stereophonic range with headphones can be acquired easily. We are all
more or less used to interpreting binaural cues, and although the impression of the
soundstage when using headphone-based monitoring does differ from that observed using
loudspeakers, it can be intuitively translated and further honed through practice.

Recordings tracked with only one coincident setup allow for the precise localization of
sound sources through differences in sound pressure level between the left and right
channel, whereas recordings tracked with only one runtime-based setup contain cues based
on time of arrival differences between the left and right channel. In both instances one of the
greatest challenges for the listener, when compared to a loudspeaker-based evaluation, is the
ability to accurately judge the width of the sound stage.

In the course of many live recordings I have employed a variety of configurations, from
simple to complex, most of which I had to set up, evaluate and monitor using headphones.
In most situations I use visual tools to complement the auditory impression by displaying
levels, frequency curve and the stereophonic power balance. Still I have found many
complex setups, e.g. ones based on the DECCA-Tree, or sessions intended for surround,
distinctly difficult to judge on location. XYtri gives me confidence in this respect.

…and back. Downmixing the XYtri

The configuration maps (nearly) discretely to a 7.0 monitoring environment [01] and can be
folded down to 5.0, LCR (analogous to OCT-2 [02]), combined or discrete directional stereo, as
well as directional mono.

The following two screenshots give an overview of the primitives used to process the six
inputs to create a 5.1 / 2.0 downmix.

<Img: Folddown to 5.1>

In the above case the following processing is applied:

• the difference of C-XY_L minus C-XY_R is low-cut and fed to the front L speaker
• the difference of C-XY_R minus C-XY_L is low-cut and fed to the front R speaker
• a small part (0.3) of the sum of C-XY_L plus C-XY_R is low-cut and fed to the C speaker
• a small part of the low-cut-filtered sum is added to the front L (0.3) and R (0.4) speakers
• the same sum is low-passed and fed to the LFE
• 2/3 of L-XY_L and 1/3 of L-XY_R are low-cut and fed to the rear L speaker
• 2/3 of R-XY_R and 1/3 of R-XY_L are low-cut and fed to the rear R speaker
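
The gain structure of the list above can be sketched as a small function. This is only an illustration under stated assumptions: the low-cut and low-pass filters are left out because no cutoff frequencies are given here, the LFE is assumed to take the unscaled sum, and the function and key names are hypothetical.

```python
def folddown_51(ch):
    """Fold one sample frame of the six XYtri inputs down to 5.1.

    ch maps the input names used in the text to sample values.
    Low-cut and low-pass filtering is deliberately omitted in this sketch.
    """
    c_l, c_r = ch["C-XY_L"], ch["C-XY_R"]
    centre_sum = c_l + c_r
    return {
        "front L": (c_l - c_r) + 0.3 * centre_sum,
        "front R": (c_r - c_l) + 0.4 * centre_sum,
        "C": 0.3 * centre_sum,
        "LFE": centre_sum,  # assumption: unscaled sum, low-pass omitted
        "rear L": (2 / 3) * ch["L-XY_L"] + (1 / 3) * ch["L-XY_R"],
        "rear R": (2 / 3) * ch["R-XY_R"] + (1 / 3) * ch["R-XY_L"],
    }


# Arbitrary example frame, not measured data
frame = {"L-XY_L": 0.3, "L-XY_R": 0.6, "C-XY_L": 1.0, "C-XY_R": 0.5,
         "R-XY_L": 0.6, "R-XY_R": 0.3}
out = folddown_51(frame)
for name, value in out.items():
    print(f"{name}: {value:.3f}")
```

In a real signal chain each branch would pass through its filter before summing; the gains themselves are the ones listed above.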

<Img: Folddown to 2.0>

In the above case the following processing is applied:

• the difference of C-XY_L minus C-XY_R is low-cut and fed to the front L speaker
• the difference of C-XY_R minus C-XY_L is low-cut and fed to the front R speaker
• a small part of the sum of C-XY_L plus C-XY_R is low-cut and added to the front L (0.4) and
R (0.6) speakers

To increase the sense of ambience in the stereophonic representation…

• L-XY_L is fed to the front L speaker
• R-XY_R is fed to the front R speaker
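
Leaving the optional ambience feeds aside, the centre-pair part of the 2.0 folddown can be expanded into effective gains per input. This is a worked sketch, assuming the same low-cut filter is applied to the difference and sum branches so that it can be factored out; the symbols L_out and R_out are introduced here for illustration.

```latex
\begin{aligned}
L_{\text{out}} &= (\text{C-XY\_L} - \text{C-XY\_R}) + 0.4\,(\text{C-XY\_L} + \text{C-XY\_R})
               = 1.4\,\text{C-XY\_L} - 0.6\,\text{C-XY\_R} \\
R_{\text{out}} &= (\text{C-XY\_R} - \text{C-XY\_L}) + 0.6\,(\text{C-XY\_L} + \text{C-XY\_R})
               = 1.6\,\text{C-XY\_R} - 0.4\,\text{C-XY\_L}
\end{aligned}
```

Each output is thus its own capsule signal boosted, minus a smaller share of the opposite capsule, which widens the coincident image compared with feeding the pair directly.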

Matrix-Processing for 7.x

Matrixing can be employed to increase channel-separation in 7.0 and 7.1. The concept
currently being explored uses differences in SPL between coincident pairs and compensates
for runtime-differences when multiplexing signals from spatially distinct inputs.

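A trailing subscript d1, d2 or d3 marks a signal, or a bracketed expression, that is delayed
by that amount:

\[
\begin{aligned}
L   &= \big( (\text{C-XY\_L} - \text{C-XY\_R})_{d_1} + (\text{C-XY\_L} - \text{L-XY\_L}_{d_1}) \big)_{d_3} \\
R   &= \big( (\text{C-XY\_R} - \text{C-XY\_L})_{d_1} + (\text{C-XY\_R} - \text{R-XY\_R}_{d_1}) \big)_{d_3} \\
C   &= (\text{C-XY\_L} + \text{C-XY\_R})_{d_2} \\
L_m &= \big( (\text{L-XY\_R} - \text{L-XY\_L})_{d_1} + (\text{L-XY\_R} - \text{C-XY\_R}_{d_1}) \big)_{d_3} \\
R_m &= \big( (\text{R-XY\_L} - \text{R-XY\_R})_{d_1} + (\text{R-XY\_L} - \text{C-XY\_L}_{d_1}) \big)_{d_3} \\
L_s &= (\text{L-XY\_L} - \text{L-XY\_R})_{d_2} + (\text{L-XY\_L} - \text{R-XY\_R}_{d_2}) \\
R_s &= (\text{R-XY\_R} - \text{R-XY\_L})_{d_2} + (\text{R-XY\_R} - \text{L-XY\_L}_{d_2})
\end{aligned}
\]

• d1: Delay between C-XY_L and L-XY_L / C-XY_R and R-XY_R
• d2: Delay between L-XY and R-XY; this constitutes the total delay of all processed signals
• d3: Delay d2 - Delay d1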
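
The delay structure of the L and C formulas can be sketched in code. This is a minimal sketch, assuming that a trailing d1, d2 or d3 means "delayed by that amount", that delays are expressed in whole samples, and that signals are plain lists of samples. All function names are hypothetical, and the concept itself is described above as still being explored.

```python
def delay(signal, n):
    """Delay a signal by n samples, keeping its length (zeros are shifted in)."""
    if n <= 0:
        return list(signal)
    return ([0.0] * n + list(signal))[:len(signal)]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def front_left(c_l, c_r, l_l, d1, d2):
    """L = ((C-XY_L - C-XY_R)d1 + (C-XY_L - L-XY_Ld1))d3, with d3 = d2 - d1."""
    d3 = d2 - d1
    coincident = delay(sub(c_l, c_r), d1)  # level difference within the C pair
    spaced = sub(c_l, delay(l_l, d1))      # C-XY_L against the delayed L-XY_L
    return delay(add(coincident, spaced), d3)


def centre(c_l, c_r, d2):
    """C = (C-XY_L + C-XY_R)d2."""
    return delay(add(c_l, c_r), d2)


# Unit impulse on C-XY_L only; the delays are arbitrary illustrative values.
impulse = [1.0] + [0.0] * 7
silence = [0.0] * 8
left = front_left(impulse, silence, silence, d1=2, d2=5)
mid = centre(impulse, silence, d2=5)
print(left)
print(mid)
```

The remaining channels follow the same pattern with the inputs exchanged according to the formulas above.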

Usability

<Img: 3 x XY + 2 x AB using XYtri#1>

With large ensembles there is always the question of how to capture the width of the sound
stage convincingly in a stereophonic representation, perhaps one that can also be extended
later. Another concern is that of discovering new ways to maximize the enveloping
nature of a surround monitoring setup without compromising the possibility of folding down to
an „optimal“ stereophonic representation. XYtri, a composite of traditional stereophonic
techniques, is useful in these kinds of recording situations. It preserves the option to „expand“
the image by allowing the mixer to extend the „edges“ beyond the L and R speakers,
enfolding the listener more than can usually be achieved even in a live performance.

<Img: XYtri 7.0-mapping>

As a live location recordist I have certain requirements:

• Setup time is precious and I require immediate feedback, especially regarding my
decisions concerning microphone placement
• Currently most of my clients ask for a stereophonic recording only. Still I often attempt to
capture a wider than usual image in case I am subsequently asked to do a surround remix
• If I can set up in an adjoining but separate space I usually employ a stereophonic
near-field-monitoring solution only; then again I like to be present in the performance space
• I often monitor using only headphones[4], using SpectraFoo for visual confirmation of
critical variables
• Evaluating a stereophonic panorama (AB, XY, Blumlein, ORTF) with headphones is
relatively easy, whereas the image from complex configurations in my experience has to
be assembled in a controlled environment and cannot be reliably judged „on the fly“

This screenshot displays the setup for a live recording of a harp recital. I am easily able to
monitor each of the five stereophonic ranges while tracking and for subsequent „peace of
mind“-confirmation.

There is an overlap of monitored channels. Inputs 1+2, for example, produce the leftward-facing
XY-image, inputs 2+3 the forward-and-left-facing AB-image.
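
The overlap can be made explicit by pairing each input with its neighbour. This is a small sketch, assuming the six inputs are patched in the order L-XY_L, L-XY_R, C-XY_L, C-XY_R, R-XY_L, R-XY_R, which matches the two examples just given; the function name is illustrative.

```python
INPUTS = ["L-XY_L", "L-XY_R", "C-XY_L", "C-XY_R", "R-XY_L", "R-XY_R"]


def monitoring_pairs(inputs=INPUTS):
    """Pair each input with its neighbour; XY and AB ranges alternate."""
    pairs = []
    for i in range(len(inputs) - 1):
        kind = "XY" if i % 2 == 0 else "AB"
        pairs.append((kind, inputs[i], inputs[i + 1]))
    return pairs


for number, (kind, left, right) in enumerate(monitoring_pairs(), start=1):
    print(f"inputs {number}+{number + 1}: {kind} ({left}, {right})")
```

Six inputs thus yield five monitorable stereophonic ranges: three coincident and two runtime-based.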

<Img: MetricHalo 2d Mixer>


„Yet another setup“

In 2001 Eberhard Sengpiel suggested that everyone invent, and patent, a surround sound
setup, preferably with a flashy name. The question he poses is whether we really need such a
thing. Should we not rather shape a recording according to the requirements of the music?[03]
The „flat“ XYtri could be expanded to include a rear-facing XY-configuration, but what purpose
would that serve?

There are many spaces that bring with them a very special ambience. In my work as a live
location recordist I have tracked uncorrelated ambience many times by „sampling“ with
microphones of omnidirectional characteristic at two or more points. This gives me a return
on the reverb that I can use to sweeten the more direct sound of the recording or rebalance it
in the mix.

I have consistently found that these „sampling points“ primarily have to be spatially distinct
from the main setup while avoiding extraneous noise, e.g. from an audience. They can be
behind the musicians, on the second tier of a concert hall, in the far wings of a church etc.
Discovering them is a matter of creativity.

While remixing many basically quadraphonically tracked sessions to surround I have found
the (true sound of the) ambience feeds to again be very helpful. I believe that at least part of
their applicability to sculpt the rear sound stage is due to the distance between the sampling
points and the main setup that introduces specific runtime / phase differences to the mix. This
can only be achieved by a complex setup, tailored to the requirements of the location and
musical event.

Personally I am „into runtime-differences“. I enjoy the spaciousness of AB and often employ
ORTF as the configuration of my main microphone. I use XY primarily as a spot when I either
want to directionally cover a wider range than I can with one microphone only, while
retaining the option to manipulate the width of the image at a later stage, or when I want to
conserve the movement of the musician in space.

My first step towards XYtri came when doing a stereo and surround production with the
ensemble „wavegarden“ featuring mainly crystal bowls and flutes.

I had placed a Blumlein-setup with ribbons in the central position, with two musicians sitting
opposite one another on both sides of the main axis. Since the crystal bowls are arranged in
the shape of a horseshoe I complemented the main setup with two XY-pairs on both sides of
the ring of bowls, facing inward and aligned to be on the diagonal axes of the Blumlein-pair.
Seen in retrospect it was kind of like XYtri in reverse.

What does it look like (currently)?

Norddeutscher Kammerchor (June 2008)

Ralf Kleemann (October 2008)

Junge Symphoniker Hamburg Cantabile Hamburg (February 2009)

What does it sound like?

(Samples will be presented live)


Conclusion

I have found the XYtri-setup to work very well in a variety of recording situations. It allows me
to set up quickly and judge the ongoing recording confidently, produces fine results and lets
me fine-tune them in many respects.

Please visit the website <http://XYtri.blumlein.net/> for current information on ongoing
developments as well as selected sound samples in 5.1 surround and stereo. I very much
appreciate any and all comments!

© 2008-2009 by Andrew Levine, blumlein records


[Version 090224]

Appendix

[01] The 7.0 configuration referred to consists of L, C, R, Lmiddle, Rmiddle, Lsurround and Rsurround.
<http://en.wikipedia.org/wiki/Surround_sound>

[02] OCT, the „Optimized Cardioid Triangle“, was proposed by Dr. Günther Theile in 2000.
It features a microphone of cardioid characteristic facing forward plus two microphones
of hypercardioid characteristic, set 8 cm back and spaced at 40 to 100 cm, facing left and
right.
The successor, OCT-2, was proposed by Dr. Helmut Wittek and Dr. Günther Theile in
2004. In this configuration the forward facing microphone is placed 40 cm from the rear
pair, thereby creating a runtime-difference of about 1 msec.
<http://www.hauptmikrofon.de/oct2.htm>

[03] Eberhard Sengpiel (2001): Surround-Hauptmikrofone - wie sich die Bilder gleichen
(Integrated surround setups - how similar these pictures are)
<http://sengpielaudio.com/Surround-Hauptmikrofone.pdf>
Eberhard Sengpiel's website (http://sengpielaudio.com/) features a wealth of condensed
knowledge (in German).
