Meetings Archive

Active Acoustic Absorbers: Do They Work?

Date: 10 May 2011
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Lecture by John Vanderkooy.

Lecture Report

Active acoustic absorbers can replace low-frequency ‘passive’ absorption techniques. Passive techniques generally involve solving resonance problems in a room by introducing absorbing materials or resonant structures. General-purpose acoustic absorbers, comprising sheets of heavy material attached to rigid frames, are somewhat impractical in many rooms, because the size at which they become effective is between a quarter- and a half-wavelength of the frequency of interest — around six feet when treating a 50Hz resonance. Membrane absorbers and resonators are tuned to move when stimulated by certain frequencies, and hence to terminate standing waves. These can be relatively small in size, but because they can react to only a small range of frequencies, several may be required to treat a room. We can alleviate serious problems in a room by equalising the loudspeakers that excite them, but this addresses only problems of sound pressure level. Equalisation cannot treat the equally insidious phase and reverberation time discontinuities that afflict specific frequencies.

Active absorption is effected by positioning subwoofers or full-range loudspeakers strategically, and driving them with a specially-calculated signal that cancels the incident sound, using less space and treating a greater range of frequencies than a passive absorber could.

A relatively simple and theoretically ideal example of this is the ‘delay and cancel’ scheme. Taking a rectangular room, we can place two loudspeakers 25% and 75% of the distance along a wall, and drive them coherently. The images of these loudspeakers reflected in the other walls are evenly spaced, creating a plane wave along the room. This can be cancelled at the rear of the room using a similar arrangement of loudspeakers, delayed appropriately. The room then becomes effectively anechoic at low frequencies.

Given a rectangular room of dimensions 8 × 7 × 3.5 metres and a reverberation time of one second, we can use the Sabine formula to calculate that the room contains 31.5 sabins (square metres of ideal absorption). The effective area of an active absorber is equal to:

A_abs = λ² / 4π

This is about 3.7 sabins at 50Hz (an extra 12% of absorption in this room), and 24 sabins at 20Hz (an extra 74%). The theoretical benefits of ideal active absorption are clear, but there are some practical difficulties. Firstly, an ideal loudspeaker is a point source radiator, and the wavefront that we want to treat is generally closer to a plane wave. How do we know that our absorber is not interacting elsewhere with the wave that we are attempting to absorb? Secondly, how does this treatment work in a room where the absorption signal itself is reflected?
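The numbers above are easy to reproduce. Here is a minimal Python sketch; the metric Sabine constant of 0.161 and a speed of sound of 343 m/s are assumptions, not values quoted in the lecture.

```python
# Minimal check: Sabine absorption of the example room versus the effective
# area of an ideal active absorber, A_abs = lambda^2 / (4*pi).
# Assumes the metric Sabine constant 0.161 and c = 343 m/s.
import math

c = 343.0                       # speed of sound, m/s (assumed)
room_volume = 8 * 7 * 3.5       # m^3, the room used in the lecture example
rt60 = 1.0                      # reverberation time, s

room_absorption = 0.161 * room_volume / rt60    # Sabine equation, in sabins (m^2)

for f in (50.0, 20.0):
    wavelength = c / f
    a_abs = wavelength ** 2 / (4 * math.pi)     # effective area of the absorber
    print(f"{f:>4.0f} Hz: absorber ~{a_abs:.1f} sabins "
          f"({100 * a_abs / room_absorption:.0f}% of the room's {room_absorption:.1f} sabins)")
```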

John Vanderkooy’s derivation of the driving voltage for an acoustic absorber was performed very rapidly at the lecture, but the brief answer is that both conditions are met without difficulty. The emerging formula for the absorbing signal is:

q(t) = 2πc/ρ × ∫ ∫ p(r,t) dt dt

Where q(t) is the desired volume velocity of the active absorber loudspeaker. Thus, the cone velocity of the absorber is proportional to the double time integral of the pressure at the loudspeaker from the room. To produce this volume velocity we could use a velocity-sensing coil on the loudspeaker in feedback, for example. The pressure p(r,t) must not be contaminated by the absorber signal itself, so we must know the absorber response and subtract it from the microphone signal. Eliminating the self-pressure of the loudspeaker (caused by its provision of the absorbing signal), and shaping the output transfer function to be both stable and correct for the loudspeaker, are significant challenges.
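The signal flow just described (predict and subtract the absorber’s own contribution, then double-integrate the remaining external pressure) can be sketched in discrete time. The snippet below is only an illustration of that structure, not Vanderkooy’s implementation: the impulse response h_self, the leaky-integrator factor and the simple framing are invented placeholders.

```python
# Illustrative sketch only: remove the absorber's own (predicted) pressure from
# the microphone signal, then double-integrate what remains to obtain the
# desired volume velocity.  h_self and the leak factor are placeholders.
import numpy as np

fs = 48_000
dt = 1.0 / fs
rho, c = 1.2, 343.0                      # air density and speed of sound (assumed)
gain = 2 * np.pi * c / rho               # scaling constant from the formula above

h_self = np.zeros(64)                    # placeholder absorber-to-microphone
h_self[10] = 0.5                         # impulse response (measured in practice)

def absorber_drive(mic, q_prev):
    """mic: microphone samples; q_prev: absorber output already produced (same
    length), used to predict and remove the absorber's own pressure."""
    self_pressure = np.convolve(q_prev, h_self)[:len(mic)]
    p_ext = mic - self_pressure          # external (room) pressure only
    leak = 0.999                         # leaky integrators keep d.c. bounded
    acc1 = acc2 = 0.0
    q = np.zeros(len(mic))
    for n, p in enumerate(p_ext):
        acc1 = leak * acc1 + p * dt      # first time integral of pressure
        acc2 = leak * acc2 + acc1 * dt   # second time integral
        q[n] = gain * acc2               # desired volume velocity
    return q
```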

In summary, active absorption is an acoustically valid way of treating low-frequency problems in real rooms, but there are considerable practical difficulties in doing it well. One practical barrier is the necessity for near zero-latency analogue-to-digital conversion and DSP in order to suppress the local absorber signal, read the instantaneous external pressure from the room, react to it, and hence calculate the desired absorption signal.

Report by Ben Supper


Wireless Audio Streaming

Date: 3 May 2011
Time: 19:00

Location: Anglia Ruskin University, Room 302
Helmore Building
Cambridge CB1 1PT

Lecture by Gary Spittle, Cambridge Silicon Radio.


Mixing Out of the Box … The New Way?

Date: 17 May 2011
Time: 18:30

Location: School of Music, University of Leeds
University of Leeds
12 Cavendish Rd

Lecture by Simon Humphrey, The ChairWorks Studio.

Simon Humphrey is a recording engineer and producer based at the ChairWorks, a thriving independent recording facility in Castleford, near Leeds.

Simon will discuss the burning questions that ‘float’ around today when it comes to the role of the engineer, producer and recording studio. He will tackle that all-too-often-questioned subject of recording studios and their validity in a modern recording industry – ‘why do studios matter?’

Using the ChairWorks Studio as a blueprint, Simon will discuss ‘how they do it’.  While some may think studios are the ‘kiss of death’, we will hear, perhaps most importantly, ‘why do they do it?’

Moving on to studio equipment, Simon puts forward the question, ‘Mixing out of the box – the new way?’ Computers could have consigned analogue mixers and the outboard rack to history, but they haven’t. Simon will discuss rack versus plug-ins and explain how the two can work together on a mix.

Simon’s experience working in education has given students a privileged view of the inner workings of the music industry, and it seems students everywhere ask the same three ‘golden questions’. Simon will discuss these and give his answers.

Throughout, Simon will be talking about his career, which spans almost 40 years. It is an invaluable look at the working life of a seasoned engineer whose work many see as a benchmark of engineering today.


Music Performance Acoustics

Date: 25 May 2011
Time: 19:00

Location: Anglia Ruskin University, Room 302
Helmore Building
Cambridge CB1 1PT

Lecture by Nick Durup, Sharps Redmore Partnership.


The Eigenharp – a radically new musical instrument

Date: 14 Jun 2011
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Lecture and demonstration by John Lambert and Geert Bevin.

Developed in the UK, the Eigenharp represents a significant departure from previous electronic instruments. A wholly new sensor design, software system and physical form factor combine to make a highly expressive and inspiring experience for the musician who wishes to use software synthesis and sampling. The instrument has a growing following, and a number of important artists now own one, including film composer Hans Zimmer, Grammy Award winner Imogen Heap and saxophonist Courtney Pine.

The talk will introduce the instrument and demonstrate its capabilities, before exploring the engineering challenges that were encountered and solved as part of its eight-year development process. The sensor design, physical layout and communications system will be described, along with a discussion of the importance of different types of software instruments, the emerging Open Sound Control standard and the use and limitations of MIDI in expressive environments.

John Lambert will be accompanied by Geert Bevin, a senior software engineer and musician at Eigenlabs who will be demonstrating and playing the instrument.

Time permitting, there will be an opportunity to try an Eigenharp at the end of the talk.


Life Behind the Glass: A Recording Engineer’s Journey

Date: 21 Jun 2011
Time: 18:00

Location: The Chairworks, Perseverence Works
Morrison Street
Castleford WF10 4BE

Lecture by Walter Samuel.

Walter Samuel will talk about his life working as an engineer since the 1970s. He will explore how the role of the recording engineer has changed over the years and his experiences in the music industry. This is a rare insight into the career of such a distinguished practitioner. Walter will also demonstrate his recording techniques using DPA microphones.

Walter’s talk will be followed by a guided tour around The Chairworks, a rare opportunity to see behind the scenes of this highly specified studio complex.


High amplification speech systems for crowd management

Date: 29 Jun 2011
Time: 19:00

Location: Anglia Ruskin University, Room 302
Helmore Building
Cambridge CB1 1PT

Lecture by Paul Malpas, Engineered Acoustic Designs.

Paul Malpas has over 20 years’ experience in acoustic consultancy. For the majority of this time he has operated as a specialist designer responsible for the delivery of electroacoustic, audio system and speech intelligibility projects and solutions within multi-disciplinary design environments. Paul’s seminar will involve, amongst other things, discussion and demonstration of high amplification speech systems for crowd management.

As always, non-AES members are welcome. Don’t forget to interact with us on Facebook too at www.facebook.com/aescambridge


Point One Pitfalls — Monitoring For Surround Mixing Explained

Date: 19 Apr 2011
Time: 18:30

Location: School of Music, University of Leeds
University of Leeds
12 Cavendish Rd

Lecture by Roger Quested.

**Free shuttle bus from PLASA Focus. Contact north@aes-uk.org to book your place.**

Roger Quested will use his knowledge of the world’s top studios and recordings to explain how to make a studio monitoring system produce the best possible surround mixes.  Demonstrating on his company’s own Quested 5.1 V3110 System, he will discuss i) positioning of speakers and subwoofers, ii) the LFE channel and how best to integrate subwoofers into the system and environment, and iii) common mistakes to be avoided.

With experience of surround systems gained from working with such names as Hans Zimmer (Gladiator, Pearl Harbour, The Dark Knight), Hackenbacker Studios (Downton Abbey, Spooks, Shaun Of The Dead) and Trevor Horn, this will be a chance for those hoping to maximise their surround mixes to understand the complex monitoring elements that affect all studio environments.


Harmonic Phase — The missing factor in distortion measurement

Date: 12 Apr 2011
Time: 19:00

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Lecture by Keith Howard

Lecture Report

It is a truism that harmonic distortion affects the perceived quality of an audio signal. It is less readily accepted that such distortion may sometimes be pleasant. In 1977, Hiraga’s article ‘Amplifier Musicality’1 controversially suggested that certain kinds of harmonic distortion may improve the perceived quality of a Hi‑Fi system. This notion is now dubbed ‘euphonic distortion’, although more than thirty years later, few, if any, new insights exist on the subject.

Less controversially, many recording engineers insist on specifying equipment that introduces certain types of harmonic distortion at high input levels, and then overdriving it: valve amplifiers and analogue tape, for example. The effect of this distortion is not obvious, but it imparts a diaphanous quality of warmth or complexity. Other types of harmonic distortion, such as Class B amplifier crossover distortion, are undoubtedly dysphonic: even tiny amounts of crossover distortion are audible, and very unpleasant.

Keith Howard has measured, characterised and emulated harmonic distortion in certain situations. This research led to a number of important conclusions. One of these gives this lecture its title: it is not sufficient to record only the level and spectrum of harmonic distortion in order to reproduce it effectively. Rendering the harmonic phase correctly is just as important.

The distortion algorithm that Keith Howard uses in his experiments is based around a waveshaping kernel. This is a mapping function which converts each input sample value to an output sample value. Being time and frequency invariant, this is a simplification of a class of systems that are commonly applied for non-linear signal processing. The mapping function may be controlled, using any of a number of methods, to produce a certain pattern of harmonics for an input sinusoid of a given amplitude. To add a second harmonic, for example, the following trigonometric identity is used as a starting point:

2 cos² x - 1 = cos 2x

So 2x² - 1 is the waveshaping kernel function, from which the d.c. on the output must be filtered if anything but a full-amplitude sinusoid is presented.

X-squared kernel

For the third harmonic, a different identity is used:

4 cos³ x - 3 cos x = cos 3x

So 4x³ - 3x is used as a waveshaping kernel function to generate the third harmonic.

X-cubed kernel

In waveshaping, the amplitude of a harmonic falls faster than that of the input signal, so that attenuating the input (in this case, by 3dB) changes the shape of the output wave.

X-cubed kernel, 3dB down

The generation of wave shaping functions may also be performed iteratively using Chebyshev polynomials.
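As an illustration of the technique (not Keith Howard’s AddDistortion code), a kernel that adds a chosen set of harmonics to a full-amplitude sinusoid can be assembled directly from Chebyshev polynomials of the first kind. The harmonic levels below are arbitrary examples.

```python
# Sketch: build a waveshaping kernel from Chebyshev polynomials of the first
# kind, so that a full-amplitude sinusoid gains a chosen set of harmonics.
# Illustration only; this is not Keith Howard's AddDistortion.
import numpy as np
from numpy.polynomial import chebyshev as C

coeffs = np.zeros(4)
coeffs[1] = 1.0                 # T1(x) = x          -> fundamental
coeffs[2] = 10 ** (-40 / 20)    # T2(x) = 2x^2 - 1   -> 2nd harmonic at -40 dB (assumed)
coeffs[3] = 10 ** (-46 / 20)    # T3(x) = 4x^3 - 3x  -> 3rd harmonic at -46 dB (assumed)
kernel = C.Chebyshev(coeffs)

fs, f0 = 48_000, 1_000
t = np.arange(fs) / fs
x = np.cos(2 * np.pi * f0 * t)      # full-amplitude test sinusoid
y = kernel(x)
y -= np.mean(y)                     # remove d.c. (it appears once the input is
                                    # no longer a full-amplitude sinusoid)

# The spectrum of y shows components at f0, 2*f0 and 3*f0 at the chosen levels.
spectrum = np.abs(np.fft.rfft(y * np.hanning(len(y))))
```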

Keith’s method of designing and applying waveshaping distortion is encapsulated in a free program called AddDistortion, available from the freeware page of his web site.

Beyond the second and third harmonics, the fractions of each order of polynomial become strongly interdependent. For any input signal that is not a sinusoid at a certain predetermined amplitude, it is not possible to add a fourth harmonic without also introducing a second harmonic. The same is true for any harmonic beyond the third. Also, because the distortion kernel is derived from a series of continuous functions, discontinuities such as corners or jumps in the transfer characteristic cannot be modelled. A final complication is that the signal must be interpolated before waveshaping and decimated afterwards. This prevents aliasing distortion from occurring when the upper harmonics pass the Nyquist limit.
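That oversampling step can be sketched with scipy’s polyphase resampler; the 8x factor and the third-harmonic kernel below are assumptions chosen for illustration, not figures from the lecture.

```python
# Sketch: apply a waveshaping kernel at a higher sample rate so the generated
# harmonics stay below the Nyquist limit, then return to the original rate.
# The 8x oversampling factor and the cubic kernel are illustrative assumptions.
import numpy as np
from scipy.signal import resample_poly

def waveshape_oversampled(x, kernel, factor=8):
    up = resample_poly(x, factor, 1)          # interpolate (anti-imaging filtered)
    shaped = kernel(up)                       # nonlinear mapping, sample by sample
    return resample_poly(shaped, 1, factor)   # decimate (anti-alias filtered)

fs, f0 = 48_000, 15_000
t = np.arange(fs) / fs
x = 0.9 * np.sin(2 * np.pi * f0 * t)

# The third-harmonic kernel from the text, 4x^3 - 3x: its 45kHz product would
# alias down to 3kHz if the waveshaping were done directly at 48kHz.
y = waveshape_oversampled(x, lambda v: 4 * v**3 - 3 * v)
```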

The ramifications of these limitations are far-reaching. For example, we could attempt to correct a system that distorts audio in a known way, by applying pre-distortion to the input. However, this results in problems. If the system introduces a second harmonic, we might generate this harmonic in antiphase in the input so that it cancels the distortion product. However, the second harmonic introduced in the input will itself be distorted by the system, and will generate a fourth harmonic in the output, and very likely a third harmonic as an intermodulation product. We have eliminated the second harmonic, but possibly made the problem somewhat worse. If we then anticipate the fourth harmonic, an eighth harmonic will appear in the output, and so on. Such correction cannot therefore be performed using analogue circuitry. This rule was often advanced in the argument against the use of corrective feedback when the debate raged in the Hi-Fi community a few decades ago. However, a correct transfer characteristic may be carefully derived in the digital domain by generating a true inverse function, which is effective at least until a certain maximum frequency is reached.
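This cascade of new harmonics is easy to verify numerically. The sketch below passes a sine through an arbitrary weak square-law system (y = x + ax², with a chosen purely for illustration) and shows that cancelling the second harmonic at the input leaves third and fourth harmonic products in the output.

```python
# Numerical check of the pre-distortion argument: cancelling the second
# harmonic of a square-law system at its input creates third and fourth
# harmonics at its output.  The 1% nonlinearity is an arbitrary assumption.
import numpy as np

fs, f0, a = 48_000, 1_000, 0.01
n = fs                                        # one second, so bin k is k Hz
t = np.arange(n) / fs
system = lambda x: x + a * x * x

def harmonic_db(y, k):
    spec = np.abs(np.fft.rfft(y)) / (n / 2)
    return 20 * np.log10(spec[k * f0] + 1e-12)

s = np.sin(2 * np.pi * f0 * t)
plain = system(s)

# Pre-distort: add the second harmonic in antiphase so it cancels at the output.
predistorted = system(s + (a / 2) * np.cos(2 * np.pi * 2 * f0 * t))

for k in (2, 3, 4):
    print(f"H{k}: plain {harmonic_db(plain, k):7.1f} dB,  "
          f"pre-distorted {harmonic_db(predistorted, k):7.1f} dB")
```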

When more complicated signals are distorted by nonlinear functions, it is known that harmonic distortion is a very small part of the overall picture: Brockbank and Wass determined analytically that, for a signal containing thirty harmonic products, the intermodulation distortion generated by a nonlinearity in the system comprises 99% of the total distortion power2. Full measurement and analysis of intermodulation distortion requires at least as many components in the input signal as harmonics that are under scrutiny.

This method and these observations take us to a practical example of the importance of harmonic phase. Keith advanced three case studies, the first of which demonstrates the point effectively; the other two highlight the opportunities for wider research.

Case study 1: Crossover distortion

In 1975, James Moir performed a series of listening experiments in which a Class AB amplifier was biased at different levels, and the audibility of the resulting distortion measured3. Keith Howard’s first attempt to reproduce these results using a waveshaping kernel was not effective: amounts of distortion that would have been perceived as unacceptable in the listening test were barely audible in practice. The generated transfer characteristic looks nothing like crossover distortion, and has very little effect on a low-amplitude signal.

Crossover distortion with same polarity

However, by alternating the polarity of the harmonic partials but keeping them at the same level, a more familiar characteristic is revealed:

Different polarity

For a full-deflection sine tone, these would measure exactly the same on a spectrogram or a THD+n meter, but they are clearly not the same. The resulting waveform reproduces the results of Moir’s test satisfactorily, and keeps the distortion components far higher as the amplitude falls. It also proves that when we are analysing or modelling distortion, we are interested just as much in the waveshaping function as in the absolute levels of the harmonic partials.
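The polarity experiment can be reproduced with the Chebyshev kernels described earlier. In the sketch below, two kernels share the same harmonic magnitudes but differ in the signs of the partials: they measure identically on a full-scale sine, yet their distortion diverges as the input level falls. The harmonic orders and levels are arbitrary choices, not Moir’s or Howard’s figures.

```python
# Two kernels with identical harmonic magnitudes (odd orders 3..15) but
# different partial polarities: identical measurements at full scale,
# different behaviour at lower levels.  Levels are arbitrary assumptions.
import numpy as np
from numpy.polynomial import chebyshev as C

orders = np.arange(3, 16, 2)                  # odd harmonics 3..15
level = 0.01                                  # -40 dB per partial (assumed)

def make_kernel(signs):
    coeffs = np.zeros(16)
    coeffs[1] = 1.0                           # fundamental
    coeffs[orders] = level * signs
    return C.Chebyshev(coeffs)

same_sign = make_kernel(np.ones(len(orders)))
alt_sign = make_kernel((-1.0) ** np.arange(len(orders)))

def distortion_db(kernel, amplitude, f0=1_000, fs=48_000):
    t = np.arange(fs) / fs
    y = kernel(amplitude * np.cos(2 * np.pi * f0 * t))
    spec = np.abs(np.fft.rfft(y))
    harm = np.sqrt(sum(spec[k * f0] ** 2 for k in range(2, 16)))
    return 20 * np.log10(harm / spec[f0])

for amp in (1.0, 0.25, 0.1):
    print(f"input {20 * np.log10(amp):6.1f} dB re full scale: "
          f"same-sign {distortion_db(same_sign, amp):6.1f} dB, "
          f"alternating {distortion_db(alt_sign, amp):6.1f} dB")
```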

Case study 2: Hysteresis in transformers

In addition to the nonlinearities caused by saturation, audio transformers exhibit an asymmetrical transfer characteristic caused by their magnetic memory (hysteresis). As well as being frequency dependent, this characteristic makes modelling the distortion very difficult, because phase shift is introduced into the signal as well as wave shaping. Keith suggested a number of ways in which this could be incorporated into the distortion model in future, by using two waveshaping kernels in quadrature.

Case study 3: Loudspeaker distortion

The mechanisms that cause loudspeaker distortion are split into many different types: some, such as the cone or spider hitting their maximum excursion, are proportional to the displacement of the loudspeaker; some, such as eddy currents, are proportional to the force applied to the coil; others are proportional to cone velocity. The problem of modelling this distortion therefore falls into the same category as hysteresis in transformers: non-linearities act in different ways at different phases of the signal, and a static waveshaping kernel is clearly of limited use.

References

1. Hiraga, J.  ‘Amplifier Musicality’. Hi-Fi News & Record Review. Vol. 22(3), March 1977, pp.41–45.

2. Brockbank, R. A., and Wass, C. A. A.  ‘Non-linear distortion in transmission systems’.  J. I.E.E., Vol. 92, III, 17, 1945, pp.45–56.

3. Moir, J.  ‘Crossover Distortion in Class AB Amplifiers’.  50th AES Conference, March 1975. Paper number L-47.

Further reading

Howard, K.  ‘Weighting Up’.  Multimedia Manufacturer, September/October 2005, pp.7–11.

Report by Ben Supper


21st Century Mastering Workshop

Date: 8 Mar 2011
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

A group of leading mastering engineers discuss the latest techniques and the challenges they face.

A recording of the workshop is available here (66MB mp3)

Now that digital recording technology has superseded analogue, is it the ‘perfect sound forever’ that we were promised at the launch of the CD back in 1982?

A group of leading mastering engineers discuss a range of topics encompassing:

  • Synchronisation: How come it’s the sound that is always out of sync? Why is it not the pictures?
  • Dither: Is it important any more? Can we hear the difference? What changed?
  • Compression: How loud does it need to be? What is required for the best results when broadcasting or digitally distributing data-compressed files?
  • Creation of a future-proof archive: Just which of those 37 files labelled ‘Master – Final Version’ is actually the master, and whose responsibility is it to keep a record of this information?

On the panel are:
Crispin Murray – Metropolis Mastering (Moderator)
Mazen Murad – Metropolis Mastering
Ray Staff – AIR Mastering
David Woolley – Thornquest

They will share some experiences and advice, along with, hopefully, some amusing anecdotes about what to avoid in order to produce the best results.


A4V: Audio for Visuals Symposium

Date: 5 Feb 2011
Time: 09:00

Location: National Film and Television School
Beaconsfield Studios, Station Road
Beaconsfield HP9 1LG

‘A4V: Audio for Visuals’ is a one-day symposium covering the many aspects of creating sound for pictures.

The symposium will be held on Saturday 5th February 2011 at the National Film and Television School, Beaconsfield, starting at 9.00am and finishing at 5.30pm. It is aimed at audio engineers seeking to broaden their horizons and students wishing to get a flavour of the wide range of visuals-related disciplines that make up today’s audio industry.

For more information about the event, the provisional programme and details of the various ways to register, please visit the A4V: Audio for Visuals web page.


How I Does Filters

Date: 8 Dec 2009
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Christmas Lecture by Peter Eastty of Oxford Digital

The audio recording of this lecture is available to download here (MP3, 15MB), and the presentation is here.

Meeting Report

The PowerPoint visuals for this lecture are available from the website, and it’s highly recommended that you view them while reading this report or listening to the recording, because many of the key concepts are graphical and make no sense without them. Download the visuals here (4MB PDF).

Peter opened his lecture with the assertion that it would not be mathematical, an interesting proposition for a topic notorious for the complexity and abstraction of its mathematics. He also pointed out that he has never designed an analogue filter in his life: instead of approaching digital filter design by designing analogue filters and translating them to the digital domain, he has always considered it from an exclusively discrete-time, sampled perspective. The first design aims are to create a bell-shaped presence filter, and a shelf filter: essentially the same as EQ controls on a mixing console (see page 3 of the visuals). Both of these filter types are defined by three independent parameters, as illustrated: gain, frequency, and Q (for the bell filter) or overshoot for the shelf. To achieve this, there are just three building blocks available (page 4): multiplication, addition, and a delay of one or more samples. So, how do we go about arranging these building blocks to make filters?

Peter took us on a rapid yet simple-to-follow tour of the effect of simple combinations of multiplication, addition and delay, with the most intuitive explanation I’ve ever heard of visualising filter responses in the z-plane. It’s difficult to paraphrase in a way that conveys the meaning without simply writing down the whole lecture, but listening to the recording whilst viewing the visuals will convey the message. He started by considering the behaviour of the simplest possible combination of a delay and adder – effectively a one-tap FIR – derived the z-plane representation and frequency response entirely graphically and intuitively, then proceeded to extend it by adding a multiplier (for coefficient values other than 1), resulting in the insight that the coefficient moves a zero along the x-axis of the z-plane (page 19).
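That simplest structure can be written down in a couple of lines. The sketch below takes one common form of it, y[n] = x[n] + a·x[n−1], and confirms numerically that the coefficient places its single zero at z = −a on the real axis; it is a generic check, not Peter’s own example.

```python
# The simplest filter discussed above: y[n] = x[n] + a*x[n-1].
# H(z) = 1 + a*z^-1 has one zero at z = -a, so the coefficient slides the zero
# along the real (x) axis of the z-plane.
from scipy.signal import freqz

fs = 48_000
for a in (1.0, 0.5, -0.5):
    w, h = freqz([1.0, a], worN=[0, fs / 2], fs=fs)   # evaluate at DC and Nyquist
    print(f"a = {a:+.1f}: zero at z = {-a:+.1f}, "
          f"|H| at DC = {abs(h[0]):.2f}, at Nyquist = {abs(h[1]):.2f}")
```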

Next, it was shown that responses could be combined by cascading filters together (page 20 and following), but that the elements of cascaded filters can also be combined into one structure with a single adder (accumulator) with identical behaviour, with simple mathematical relationships between the multiplier values (coefficients) in the cascaded filters and the combined structure. Based on this, a relationship was derived between the coefficients and the positions of zeroes on the x-axis – a little maths involving a square root, but still pretty straightforward (page 23). Of course, square roots often give rise to roots of negative numbers (page 26 – looks remarkably like the quadratic formula) – so what do you do then? Well, in a move highly reminiscent of complex numbers, each zero moves off the x-axis in relation to a couple of new equations, to create a symmetrical pair (pages 27-29), and all the findings so far are summarised on page 30.

So far, the lecture had focussed on things that reduce gain (zeroes) – what about things that increase it, pulling that 3D surface upwards (page 31) instead of downwards? Just like with op-amps, positive gain is created using feedback loops, and the feedback loop contains very similar topologies to the filters already discussed (pages 32-34). Second-order filters with only negative gain response (all-zero) can be combined with second order filters with only positive gain response (all-pole) (pages 35-36), and the resulting structure is often known as a biquad, beloved of digital mixing console designers for (among many other things) digital versions of traditional parametric equalisers and filters. It is shown to have two symmetrical pole/zero pairs, and when the frequency response is plotted against log frequency, it can give rise to the familiar bell-curve EQ frequency response (page 41) if the pole and zero are associated with the same frequency. The distance between the pole and the zero was shown to be related to the Q or bandwidth of the filter (pages 43-50), and the geometry for the curves of constant frequency is calculated.
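For readers who want something concrete to experiment with, a peaking (‘bell’) biquad in the widely used Audio EQ Cookbook form is sketched below. It illustrates the gain/frequency/Q parameterisation being described, but it is not Peter Eastty’s own formulation from the slides.

```python
# A standard peaking ('bell') EQ biquad in Audio EQ Cookbook form: an
# illustration of the gain/frequency/Q parameterisation, not Peter Eastty's
# own equations.
import math

def peaking_biquad(fs, f0, gain_db, q):
    """Return normalised (b, a) coefficients for a peaking EQ biquad."""
    A = 10 ** (gain_db / 40)                  # square root of the linear gain
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * A, -2 * math.cos(w0), 1 - alpha * A]
    a = [1 + alpha / A, -2 * math.cos(w0), 1 - alpha / A]
    return [bi / a[0] for bi in b], [ai / a[0] for ai in a]

def process(x, b, a):
    """Direct Form I biquad, sample by sample."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x1, x2, y1, y2 = xn, x1, yn, y1
        out.append(yn)
    return out

# Example: a +6dB bell at 1kHz with a Q of 1.4, at a 48kHz sample rate.
b, a = peaking_biquad(fs=48_000, f0=1_000, gain_db=6.0, q=1.4)
```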

Curves of constant gain were also shown (pages 51-55), and it was then shown (pages 56-58) that the curves of constant frequency and curves of constant gain are orthogonal at all points – important for independent control of them. All these equations were pulled together, and with the addition of a gain correction term (page 60), resulted in the definitive equations for biquads (page 61). It was demonstrated with code snippets that these equations are directly implemented in Oxford Digital’s products. The effect of a biquad with same gain for the pole and zero, but at different frequencies, was illustrated (pages 65-66): it was shown that the perfectly-damped response is achieved when the gain circle has its origin on the unit circle (page 67).

Higher-order filters can be created by adding extra pole/zero pairs on the same constant frequency curves, but on carefully-chosen constant-gain curves (pages 76-81). It was then demonstrated how to make non-integer-order filters, by using the fact that a coincident pole/zero pair cancel each other, so by introducing such a pair (no effect on the filter) and then slowly moving them apart, the pole/zero configurations for integer orders can be interpolated between. This is truly novel, and although the graphics illustrating the configurations are not in the visuals linked to above, they can be seen in Peter’s convention paper presented at the 125th AES Convention, entitled “Accurate IIR Equalization to an Arbitrary Frequency Response, with Low Delay and Low Noise Real-Time Adjustment”.

This ability to have non-integer order IIR filters permits the construction of arbitrary filter responses, but without the usual penalties of FIR filters (namely, long processing delays and poor phase performance). Peter demonstrated his real-time filter software, running on a laptop with a frequency response curve that is manipulated by attaching handles and moving them arbitrarily as desired. Naturally, some extreme frequency responses result in filter orders in the hundreds, and CPU power is limited, so the filter order can be limited and the response gracefully falls away from the handles if an impractical response is requested. The filter changes response quickly and completely smoothly (to the audio) in real time, even with rapid changes to extreme filter responses with orders greater than 100 – how this is achieved, Peter declined to elaborate further! Controlling the coefficients of IIR filters such that smooth changes in gain/frequency/bandwidth are achieved without artefacts or (worse) instability is regarded as a challenging task for simple conventional filter designs, so achieving this for Peter’s much more sophisticated arbitrary-response EQ with extremely high orders is impressive.

Peter concluded his fascinating lecture with the observation – made possible by his EQ – that if one creates a dramatic comb-like filter response (in this case, alternating 12dB gain boosts at roughly octave intervals), then shifts the frequencies of all the gain/cut points together in logarithmic frequency (i.e. groups all the handles together and drags them left/right at once), the resulting effect sounds like the playback pitch is being increased or decreased, despite the audio remaining at constant pitch and playback speed. Peter makes the entirely plausible suggestion that the rapid scaling of a complex frequency-domain structure in log frequency creates a psychoacoustic illusion of pitch shift, because it resembles the frequency scaling of harmonic structures that is characteristic of pitch shift.

Many thanks to Peter Eastty for a fascinating and entertaining Christmas lecture, which delivered valuable insights for both seasoned digital audio engineers and those new to the field, and revealed genuinely groundbreaking technology.

Meeting report by Michael Page


Remastering and Audio Restoration at Abbey Road Studios

Date: 11 May 2010
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Lecture by Simon Gibson of Abbey Road Studios

Abstract

EMI has an archive going back to 1898 and, since Abbey Road Studios opened in 1931, there has been a gradual increase in the remastering of that back catalogue for new formats. Starting with a potted history of EMI, the early years of recording and the work of Alan Blumlein, we move on to the emergence of remastering at Abbey Road and the systems and techniques used today. The talk will then concentrate on the use made of CEDAR Audio’s Retouch software in the audio restoration of The Beatles album remasters as well as its more unusual use in the creation of music for the video game The Beatles: Rock Band. Along the way we will hear rare audio extracts from EMI’s archive and clips from The Beatles’ recordings to demonstrate these remastering and restoration techniques.

We regret that due to the large amount of copyrighted material played during this lecture we are unable to provide a recording.

Lecture report

Simon’s work at Abbey Road Studios focuses on audio restoration. Much of this is for EMI’s own vast catalogue, which dates back to the first recording by The Gramophone Company (EMI’s predecessor) in 1898. EMI’s archive comprises hundreds of thousands of items, not just audio discs and tapes but also artwork, photographic records and other materials.

In the case of recordings from the pre-tape era, the preference is to transcribe from the metal masters as these deliver a superior quality to a shellac disc. Before 1925 there was no 78rpm standard, so the only way to set the correct playback speed was by musical pitch. 1925 also saw the introduction of electrical recordings, with microphones replacing horns. An early example of a UK electrical recording was Handel’s Messiah conducted by Thomas Beecham.

A notable past EMI employee was Alan Blumlein. He joined The Gramophone Company in 1929 and was with them when Abbey Road Studios opened in 1931, the same year that Electrical Musical Industries (EMI) was formed from the merger of The Gramophone Company and the Columbia Gramophone Company. During his tenure with EMI this remarkable man developed moving coil microphones, a binaural cutter head and a stereo ribbon microphone. His wide-ranging stereo patent, lodged in 1931, expired in 1952 and, incredibly, it wasn’t renewed.

A stereo recording of the Royal Philharmonic Orchestra, again with Beecham as conductor, was made by EMI in 1934, while stereophonic tests by the team at the Hayes research laboratory included recording a passing train. However, EMI didn’t consider stereo important at that time! Blumlein died during World War II while testing an airborne radar system but his microphones are still used today.

Until the early 1950s recordings were made directly to discs using a pair of ‘gravity-fed’ cutting machines driven by weights. As is well documented, when the Allies liberated Germany in 1945 they discovered the Magnetophon, a recording device using ¼” tape. Ampex in the US and EMI developed their own versions, the EMI machine being the BTR-1, which ran at 30ips. Simon noted that test recordings made on that machine still sound good today.

Stereo recording at 15ips began in 1955. A recording made of a Beecham rehearsal at Kingsway Hall in 1958 used a Vortexion mixer, EMI BTR-2 and Reslo microphones. The reason? All other mics were in use at the time!

Having provided this potted history, Simon moved on to talk about restoration. Early transfers from disc at Abbey Road were made using EMI’s own (analogue) equipment which could remove some of the clicks and pops, the bigger clicks being edited out after the transfer to tape.

In the mid-80s computer technology started to be employed and is now widely used for audio restoration. Abbey Road has an extensive array of tools developed by Cambridge, UK-based Cedar Audio. This includes what Simon referred to as “declickle”, a combination of de-click, de-crackle and broadband noise reduction. These tools have to be used with discretion, Simon noted, particularly where the human voice is concerned, as it is prone to suffer if processing is overused.

Simon sees the role of a remastering engineer as being like that of a curator, re-presenting works for successive generations. He will recommend that a work is remastered from scratch where the technology has improved sufficiently to make a significant difference to the end result.

By far the biggest problem when dealing with material on analogue tape is old edits. The splices can come unstuck.  Oxide  shedding is another issue, often solved by baking the tape at 50° for three days. EMI’s tapes are not usually a problem in this respect as they are stored in a good environment at the company’s library in Hayes.

The remastering process involves finding the best source – which in itself can be a painstaking process – transferring the material to digital, treating it and editing on SADiE. Lastly, some EQ or compression may be applied, but only where appropriate. An engineer’s ears are the final arbiter, Simon noted – knowing when to leave well alone.

Equipment employed includes TC Electronic’s System 6000 dynamic processor and, as mentioned earlier, restoration tools from Cedar Audio, in particular Retouch. A full description of this remarkable system is not possible here, but suffice to say that it acts a bit like an audio version of Photoshop. The different elements within a piece of audio are represented as differently coloured visual images and the engineer can then remove an individual element by effectively painting it out.

Simon went on to discuss a particular project which used Retouch in a rather unusual way to create the soundtrack for the Beatles’ Rock Band game title. This involved isolating various instruments and vocals to create individual tracks that those playing the game can reproduce by ‘playing’ their instruments.

Because the original Beatles recordings were made on 2-, 3- or 4-track machines, it is not possible to simply mute individual instruments or voices, as it would be today with the virtually unlimited number of tracks offered by hard disk recording systems. Each instrument or voice therefore had to be extracted on to a separate track by identifying its visual pattern on the Retouch display.

The results were impressive, but unfortunately the only way to hear them will be to play the game!

Report by Bill Foster


Surround Sound Codecs in Broadcasting

Date: 14 Apr 2009
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

‘Surround sound audio codecs in broadcasting – an introduction and latest results from independent listening tests’

Lecture by David Marston, BBC R&D

Abstract

Surround sound systems are now becoming a popular addition to many people’s homes. This means there is now a demand for surround sound content to be delivered to homes via broadcasting, Internet or recorded media. Whichever way it gets to its destination, it is going to require data reduction along its journey. This may be in the transmission end of a broadcast chain, or in the transport of audio from a studio out over a broadcaster’s network.

This data reduction uses audio coders designed for surround sound. There are currently numerous different audio coders available, often with different attributes and performance. Choosing which coder to use is not a simple choice, and one of the key factors in this choice is the sound quality. It is inevitable that for serious data reduction, the coder will have to be lossy and therefore compromise sound quality. Our work assessed the sound quality of a selection of audio coders using the most accurate instrument of measurement available: the human ear. Here we present the codecs tested, how the tests were done, and of course the results.

Meeting Report

This paper described the methodology used for a series of evaluation tests conducted by members of the EBU on a range of commercially available audio codecs.

In his introduction, David explained that the measurement of perceptual audio coding systems cannot be carried out using conventional objective measuring tools, as one would do for wow and flutter, for example. An objective measure based on psychoacoustic principles such as PEAQ can work reasonably well with MPEG-style stereo codecs, but there is nothing available yet for surround systems. A disadvantage of using such measurement is that any new method is likely to be incorporated into a codec’s design to ensure good test results.

The only effective test, therefore, is subjective listening using humans – a slow and expensive process if a good-sized sample is employed, although you do get useful results.

There are various parameters that can be looked at: overall quality, spatial quality in the case of surround sound, intelligibility, cascaded codecs, and so on. When a selection of different codecs and coding rates have to be tested in multiple combinations, the complexity increases further. In these instances a measurement system such as PEAQ can be used as a pre-filter.

The main subjective testing methods today are MUSHRA (MUlti Stimulus test with Hidden Reference and Anchors, standardised as BS1534), which is designed for mid-range to higher quality codecs, can test multiple codecs at the same time and was used in the EBU tests; BS1116, which is designed for high quality codecs but tests only one codec at a time; and P800. The latter is for speech and was not relevant for these tests.

MUSHRA produces a quality value and BS1116 an impairment value for each codec. On occasion it may be relevant to have more than one value, for example for temporal and spatial quality. A single value makes testing faster, as well as being easier for the listener and for analysis; however, it can hide differences in listeners’ perceptions.

Ensuring a gender balance has also been a problem as most of the listeners have been male. Training is important, whoever takes the tests. Listeners must be taught to identify coding artefacts and other problems, as well as how to use the assessment interface. For scoring, a numerical scale is useful because it avoids interpretations of words like ‘Fair’ or ‘Good’.

Each listener hears five codecs; any more would make the test too tiresome and could degrade the accuracy of results. During the MUSHRA test listeners are always given the reference, and also included in the randomised sequence is a hidden low quality anchor reference, a 3.5kHz low-pass filtered version of the original. In the EBU test another, spatially reduced, anchor was added. For BS1116, listeners hear one codec at a time, which is compared with a hidden reference and the known reference. This takes much longer, therefore each listener is limited to four codecs.
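Generating such a low-quality anchor is straightforward: the sketch below low-pass filters the reference item at 3.5kHz. The Butterworth filter and its order are arbitrary illustrative choices, not taken from the BS1534 specification.

```python
# Sketch: create a MUSHRA-style low-quality anchor by low-pass filtering the
# reference at 3.5kHz.  The Butterworth filter and order are illustrative
# choices, not the exact characteristic required by BS1534.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def lowpass_anchor(reference, fs, cutoff_hz=3500.0, order=8):
    sos = butter(order, cutoff_hz, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, reference, axis=0)    # zero-phase, filters each channel

fs = 48_000
reference = np.random.randn(fs * 5, 2)            # placeholder 5-second stereo item
anchor = lowpass_anchor(reference, fs)
```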

It is important to select a cross-section of experienced and novice listeners. Some may prove to have poor listening skills, or have a hearing impairment, but it is not always possible to identify this in advance. So it is better to use them for the test and reject their findings afterwards, often based on their ability to rank the hidden reference and low quality anchor.

David showed a slide of the MUSHRA test interface and explained how the listener can select each of the examples in order to make direct comparisons with the reference. He went on to describe the listening set-up at Kingswood Warren – soon to disappear!

Choosing the test material is always difficult. It must be critical, in order to highlight coding artefacts, but at the same time be unbiased, eg not material that is known to disadvantage a specific codec. The material must also be appropriate for the application: a mixture of music, speech and jingles (which will already have been compressed) for a broadcast codec, for example. The final choice of ten pieces of test material was made by a selection panel.

One of the techniques used by Institut für Rundfunktechnik (IRT), which analysed the results, was the Spearman Rank Correlation. This looks at the ranking of all the scores, and if anybody’s ranking was massively different from the average they were rejected. Around ten percent of listeners were eliminated at this stage.
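A minimal sketch of that post-screening step is shown below, using scipy’s Spearman correlation between each listener’s grades and the panel mean; the 0.5 acceptance threshold is a placeholder, not IRT’s actual criterion.

```python
# Sketch of listener post-screening: compare each listener's ranking of the
# conditions with the panel average using Spearman rank correlation and reject
# outliers.  The 0.5 threshold is a placeholder, not IRT's actual criterion.
import numpy as np
from scipy.stats import spearmanr

def screen_listeners(scores, threshold=0.5):
    """scores: (listeners, conditions) array of MUSHRA-style grades.
    Returns a boolean mask of listeners whose ranking agrees with the panel."""
    scores = np.asarray(scores, dtype=float)
    panel_mean = scores.mean(axis=0)
    rhos = np.array([spearmanr(listener, panel_mean)[0] for listener in scores])
    return rhos >= threshold

# Example: the third listener ranks the conditions in a very different order.
grades = np.array([
    [100, 80, 60, 40, 20],
    [ 95, 75, 65, 35, 25],
    [ 30, 90, 20, 95, 60],
])
print(screen_listeners(grades))    # -> [ True  True False]
```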

There are three phases to this series of tests. The first two covered the most commonly used codecs for emission (transmission), the last link in the chain and usually the one with the lowest bit rate. Phase three looked at combinations of higher bit rate codecs used in the production/distribution chain – which are designed to be cascaded – combined with low bit rate emission codecs, and how they interact.

To ensure randomisation it was decided to split the codecs into three groups, based on their bit rates. Each listener’s five codecs contained at least one from each of these high, medium and low bit-rate groups, with the remaining two being from a single group to ensure a strong intra-group comparison within the test; eg a listener might hear one high, three medium and one low bit-rate codec.

Ten test items were used covering a varied selection of material, including applause, harpsichord, sax and piano, a church organ and Robert Plant.

IRT carried out the analysis to produce the test results. Some listeners were rejected if they fell outside of the Spearman Rank Correlation threshold, which compares the ranking given by each listener with the overall rankings. After this process some codecs dropped below the minimum of 15 listeners and so extra listening tests had to be carried out.

David went on to show the various test results and explained that some of the codecs used for the test were pre-production prototypes, or have since been upgraded. One common element was that the most difficult item to encode – usually the applause – normally ranked much lower than the mean. For example, one codec was rated 30 on applause but 90 on music, proving that perceptual coding is very content-dependent. [Note: This report does not list the codecs involved or their rankings due to the risk of misrepresenting the current performance of those codecs.]

The conclusion from Phase 1, as would be expected, was that higher bit rates produce better quality. The detailed results for each codec have hopefully given their developers something to work on in terms of improving their performance.

Phase 2 retained the applause sample from Phase 1 as a reference item but the other samples, although similar in terms of content type, were different. When results from Phases 1 and 2 were compared they were similar, proving that the testing methodology was valid. Phase 2 again showed that excellent quality can be achieved from low bit-rate codecs, but not for every type of content, and again it gave the developers guidance on areas where improvements can be made.

Phase 3 combined cascaded high bit-rate distribution codecs such as Dolby E, apt-x and Linear Acoustics with a selection of emission codecs. Ten items were selected from the samples used in the previous tests and these were cascaded five times through the same distribution codec before being passed through one or two different emission codecs. Various combinations were tested.

It was decided to use BS1116 rather than MUSHRA for this phase. Because this is an impairment scale, it was not possible to make any direct comparisons with the results of Phases 1 and 2. The conclusion was that distribution codecs still introduce some impairment, having the effect of creating a ‘ceiling’ to the overall quality attainable. The recommendation therefore is to use the highest bit rate possible.

Overall conclusions from these listening tests were that perceptual coding is still an imperfect art and there is room for improvement. Analysis is not easy, but these tests do reveal things that objective tests could never do, as well as uncovering things you wouldn’t expect.

Meeting report by Bill Foster


An Interview with Bob Stuart of Meridian Audio

Date: 15 Dec 2009
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Conducted by Keith Howard

Bob Stuart has been a major figure in the British audio industry for over 30 years. Best known as Chairman and co-founder, with Allen Boothroyd, of what is today Meridian Audio Ltd, he has done much more than steer the company through challenging times to its current high-profile position manufacturing some of the most sophisticated audio equipment available. A pioneer of active and then DSP-equipped loudspeakers, he was quick to recognise the potential of CD and, as part of the ARA, to push for a version of DVD dedicated to high-resolution multichannel audio. Meridian’s own lossless compression algorithm, MLP, was developed in anticipation of this and selected by the DVD Forum for DVD-Audio in a technology shoot-out against stern competition. In expanded form it remains the basis of the Dolby TrueHD lossless compression scheme used in Blu-ray Disc. With a long-standing interest in psychoacoustics, which he studied alongside electronic engineering at Birmingham University, Bob is one of very few creators of high-quality audio equipment to have explored the fundamentals of sound perception and generated computer models of human hearing to help guide the design process. In recent years, in collaboration with Peter Craven, he has investigated the effects of digital anti-aliasing and reconstruction filters, one intriguing result being that Meridian’s latest flagship CD player – the 808.2 Signature Reference – uses minimum-phase rather than linear-phase output filtering.

These subjects and many others are covered in this interview, with Bob presenting supporting material to clarify the issues.

An Interview with Bob Stuart (audio, 23MB)


The Engineering Art Behind the Beolab 5 Loudspeaker

Date: 11 Nov 2008
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Lecture by Gert Munch, Bang & Olufsen

During this lecture Gert Munch will demonstrate how the development of several key technologies, including the development of “acoustical lenses,” led to the design and implementation of the BeoLab 5 loudspeakers.

Gert is based at the Acoustics Research division of Bang & Olufsen, Denmark; he is a specialist in electro-acoustics and has worked at B&O for 30 years. In that time he contributed to the development and design of numerous speaker models, including the subject of this evening’s lecture, the BeoLab 5.

The aims for the BeoLab 5 design included

  • to make the best possible loudspeakers with the most convincing total sound experience
  • to give best possible experience wherever you sit, wherever the loudspeakers are placed
  • to reproduce the whole audible spectrum and dynamic range
  • to make a loudspeaker that didn’t sound like one!

In order to realise the ambition, the following requirements were specified:

  • Adaptive bass control including a moving microphone measurement system
  • Active loudspeaker design using high power ICE power amps
  • Thermal compression compensation (to remove temperature dependency of response)
  • Advanced thermal protection including thermal modelling and monitoring
  • Precise mechanical control and fitting for consistency
  • DSP Processing for response correction and manufacturing variation control

A little history: In the mid-1980s, B&O made the ‘Penta’ loudspeaker, which embodied the early attempts at B&O to take control of speaker directivity. It had a tapered design, with centralised tweeters, to minimise the effects of floor and ceiling reflections, a factor recognised by B&O engineers as critical to the sound in a real room.

To further understand these reflection issues, the Archimedes project was established (running from 1988 to 1992), and carried out by B&O in conjunction with the Technical University of Denmark and KEF (the UK-based loudspeaker manufacturer). This work led to many ideas about improving loudspeakers and a new, improved unit was designed that, unfortunately, never made it to market.

BeoLab 5 evolution: Also around this time, Sausalito Audio Works was pioneering loudspeaker design incorporating what it dubbed ‘Acoustic Lens Technology’. Despite some initial scepticism, B&O engineers concluded that the speakers from Sausalito actually sounded good.

After several iterations at B&O of the initial Sausalito design, the BeoLab 5 was the evolutionary result. Its distinctive shape (some liken it to a Dalek or a pylon) make it easily recognisable – and it weighs in at a hefty 61kg!

The Acoustic Lens (perhaps a ‘lens’ in the sense that a curved-mirror in a reflector telescope can be a lens) is a mechanical structure that consists of a specially shaped reflector mounted atop an upward facing driver, the special shape being a quarter of an ellipsoid.

An ellipse has two focal points; the drive unit is located at the first so that, by virtue of the shape, all sound passes through the second (assuming a ray-tracing model and an infinitesimal source).

Prior to the building of the speaker, some ray-tracing based simulations were attempted. This simulation technique was later abandoned because such a basic model lacks the ability to predict diffraction effects, a critical factor in loudspeaker directivity.

An audience member asks, ‘Why not place the speaker at the second focal point and do away with the lens?’ Gert’s answer is that such an approach would not provide any control over the radiation pattern – and it is this radiation pattern control that the ‘acoustic lens’ technology seeks to master.

Later modelling attempts included Boundary Element and Finite Element Analysis.

An animated picture is shown to demonstrate a radiation pattern simulation. The key point is that the response looks the same at a wide range of angles in the horizontal plane. Comparing the two-dimensional Finite element model with the 3-dimensional boundary element model, it is noted that, as presented, they look very similar, providing further confidence in their validity and the concept in general.

Gert points out that, at least initially, the ideal radiation pattern of this speaker appears to be similar to that of a dipole. However, the problem with traditional dipole loudspeakers is that they must be placed at least 1m away from the wall behind them to achieve good performance, a restriction which can prove inconvenient in real-world situations, usually due to restrictions imposed by one’s cohabitee.

In the BeoLab 5, B&O have aimed to make a design with a forward directivity similar to that of a dipole but, due to the attenuated rear-response, one which can be placed directly against a wall.

Taking the power average from nine measurements made at random room positions yields an approximation of the loudspeaker’s power response. Other measurement techniques have been tried, but this power-averaging technique, Gert reports, shows better correlation with subjective testing.

Efficiency of loudspeakers is generally low and the BeoLab 5 is no exception. Free-field, 200W of electrical power input might yield 1W of acoustic power. The BeoLab 5 contains amplifiers capable of supplying around 2.5kW of power!

Gert notes there can be huge changes in power response at around 100Hz for differing speaker placements, so a filter is introduced to equalise the power response for the particular position and room. A normal tone control can never compensate for this kind of problem; much more precise control is provided in the BeoLab 5 using digital signal processing.

The BeoLab 5 includes a formidable array of signal processing. The crossovers are performed digitally, and much more besides.

During factory test, the response of each driver is automatically equalised to compensate for manufacturing tolerances. Overall equalisation is also applied to achieve the overall target frequency response. This production testing employs a total of 6 microphones – four at the front (one close to each of the drivers) and two at the rear. A reference speaker provides the target for the equalisation process. Each production speaker is adjusted to match the frequency response of this reference unit with a target error of less than 0.5dB.

Temperature and air pressure can alter the measurements significantly, so these are monitored during this phase.

Using an in-built, motorised microphone which slides out from under the speaker, automatic correction of low-frequency response up to around 300Hz can be invoked by the user to reduce the effects of the room in which the speakers are placed. Gert points out that this correction is not a modal correction – it’s more like a general equalisation, with the filter response being smoothed during the measurement process.

Interestingly, the target response for this ‘auto-correction’ system is not, as one might expect, a flat response, but rather a response that has been determined empirically through critical listening.

The thermal monitoring uses a combined technique of feed-forward modelling in conjunction with average temperature measurement of the driver mechanical assembly. Each driver also has thermal modelling, arranged such that should, on average, too much power be applied to any driver, progressive attenuation is applied to its output (and also to outputs to all drivers of higher frequency to maintain a consistent tonal balance).
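Protection of this general kind can be sketched as a first-order thermal model per driver, with a gain ramp once the modelled temperature approaches a limit. Every constant below is an invented placeholder rather than a Bang & Olufsen value, and the propagation of the attenuation to the higher-frequency drivers is omitted for brevity.

```python
# Illustrative feed-forward thermal protection: a first-order thermal model per
# driver, with progressive attenuation as the modelled voice-coil temperature
# approaches its limit.  All constants are invented placeholders.
import math

class DriverThermalModel:
    def __init__(self, fs, tau_s=20.0, deg_per_watt=2.0, t_ambient=25.0,
                 t_warn=80.0, t_max=100.0):
        self.alpha = math.exp(-1.0 / (tau_s * fs))   # one-pole smoothing per sample
        self.k = deg_per_watt
        self.temp = t_ambient
        self.t_ambient, self.t_warn, self.t_max = t_ambient, t_warn, t_max

    def gain(self, drive_power_w):
        """Update the modelled temperature from the drive power (feed-forward)
        and return the linear gain to apply to this driver's output."""
        target = self.t_ambient + self.k * drive_power_w
        self.temp = self.alpha * self.temp + (1 - self.alpha) * target
        if self.temp <= self.t_warn:
            return 1.0
        # Ramp the gain down linearly between the warning and maximum temperatures.
        span = self.t_max - self.t_warn
        return max(0.0, 1.0 - (self.temp - self.t_warn) / span)
```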

A “party test” is also carried out which runs the speakers at full-power for three days!

The BeoLab 5 is a no-compromise design that might at first appear to be at the more esoteric end of hi-fi. But many thousands of units have been sold, proving that many consumers still aspire to achieve great audio reproduction and are prepared to buy-in to new technology to achieve it.

It was fascinating to hear about the design philosophy and gain some insight into the processes. On behalf of all present I’d like to extend thanks both to Gert for the presentation, and to B&O for making it possible.

Report by Nathan Bentall


Grand Designs — Networked DSP for Really Big Buildings

Date: 14 Jul 2009
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Lecture by Michael Page of Peavey Digital Research

Download MP3 audio recording of this lecture (18MB)

Download PowerPoint slides of this lecture (6MB, MS PowerPoint 97-2003 compatible)

Abstract

Advances in audio distribution and control over digital networks have delivered tremendous benefits for operators of large venues and premises, such as theme parks, cruise ships, stadiums, live performance venues, airports and industrial complexes. Audio for entertainment attractions, background music, paging systems and evacuation purposes may all be transported and controlled on a single distributed system, via Ethernet and IP local area networks. Audio processing for acoustic correction, routing, mixing and other processes is all easily performed using programmable DSP, located both centrally and at distributed nodes. Michael Page of Peavey Digital Research will discuss and demonstrate the technology used to achieve this.

Meeting Report

Michael started his talk by listing the applications for the networked audio DSP systems he’d come to talk about: a diverse range including airports, stadiums, theme parks, ports, houses of worship, legislatures, and convention centres. Then, displaying an aerial view of the truly gigantic Hartsfield-Jackson Atlanta Airport, he posed the question: what does it take to wire an airport for sound?

It sounded like a straightforward question, until Michael started discussing it. He started by talking about the audio system outputs: each boarding gate area (all 179 of them, at Atlanta) needs an individual output, each lounge, each concession, each arrivals hall zone, each check-in zone, each customs hall, each luggage reclaim zone… not to mention all the non-public areas. Each of these many hundreds of outputs needs, in addition to level control: EQ for loudspeaker correction, EQ for room correction, delay for time-alignment, possibly dynamic range processing, and possibly ambient level sensing. Ambient level sensing is a particularly complex DSP function: it uses a measurement microphone to detect the ambient level in a space, so that the level of the loudspeakers can be adjusted to ensure a consistent signal-to-ambient-noise ratio for the listeners. But if the audio system is active while this measurement is being made – as is often the case – sophisticated DSP is needed to “null-out” the contribution from the loudspeakers from the signal picked up by the microphone, in order to obtain an accurate measurement.
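To make the ambient-sensing idea concrete, here is a minimal Python sketch under stated assumptions: an adaptive (NLMS) filter models the loudspeaker-to-microphone path and subtracts the programme contribution, leaving an ambient-noise estimate that can drive the zone gain. The filter length, step size and gain law are illustrative; the lecture did not describe the specific algorithm used in commercial products.

```python
import numpy as np

def ambient_level_db(mic, programme, taps=256, mu=0.5, eps=1e-6):
    """Estimate ambient level at the mic after nulling the programme feed (NLMS)."""
    w = np.zeros(taps)                    # adaptive model of the speaker-to-mic path
    residual = np.zeros(len(mic))
    for n in range(taps, len(mic)):
        x = programme[n - taps:n][::-1]           # most recent programme samples
        e = mic[n] - np.dot(w, x)                 # mic minus modelled loudspeaker sound
        w += mu * e * x / (np.dot(x, x) + eps)    # NLMS coefficient update
        residual[n] = e                           # what remains is mostly ambient noise
    rms = np.sqrt(np.mean(residual[taps:] ** 2))
    return 20.0 * np.log10(rms + 1e-12)

def zone_gain_db(ambient_db, target_snr_db=15.0, programme_db=-20.0):
    """Raise the zone level to hold a constant signal-to-ambient-noise ratio."""
    return (ambient_db + target_snr_db) - programme_db

# Usage: programme plus a quieter simulated "crowd" component at the microphone.
rng = np.random.default_rng(0)
programme = rng.standard_normal(48000)
ambient = 0.1 * rng.standard_normal(48000)
mic = 0.5 * programme + ambient               # simplistic loudspeaker-to-mic path
level = ambient_level_db(mic, programme)
print("Ambient estimate: %.1f dBFS, zone gain: %+.1f dB" % (level, zone_gain_db(level)))
```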

Next, Michael considered the inputs. Each boarding gate has a paging station, plus there are paging stations for every lounge, concourse and information desk, any of which may be routed to any system output. There may be background music inputs for lounges; automated message playback systems (“Please do not leave your bags unattended”, etc.); automatic announcements from the fire alarm system; and all these need to be prioritised, so that evacuation announcements aren’t blocked by the background music, for example. Each input typically needs EQ and dynamics processing, and needs to be routable or mixable to any combination of the several hundred outputs. So: we’ve got an audio system with several hundred inputs, several hundred outputs, all connected with a giant intelligent mixer, and a sizeable amount of DSP on every input and output. The inputs and outputs are distributed over six huge buildings, across a site a mile long and half a mile wide, and it has to integrate with the security, life safety, building management and enterprise management systems. Finally, it needs to be extremely robust and redundant, so that it keeps running even if the system sustains major component or infrastructure failures, perhaps caused by a large fire or a bomb explosion. Wiring this for sound isn’t as simple as first thought!

Michael next considered another application: stadiums need to deliver high-quality sound to every seat, despite huge acoustic differences between seating areas and loudspeaker placements. So each block of seats needs separate loudspeakers and processing – plus zones for all the internal areas: locker rooms, bars, restaurants, VIP areas, conference centres, car parks, atriums, etc. Stadiums don’t need as many inputs as an airport, but the ambient level processing is even more critical because of the differences in ambient level between sides of the stadium at crucial points in events!

Finally, Michael discussed the requirements of theme parks. Audio-visual experience attractions such as “Terminator 2 3D” at Universal Studios are an obvious application: audio is a fundamental part of these attractions, and they require high-level, high-quality audio reproduction from a large number of independent channels to be precisely synchronised, sometimes interactively, with the motion control and video control systems that create the other dimensions of the visitor experience. Audio reproduction, even if it is only zone-specific background music, is usually present at pretty much every publicly accessible location in a theme park, and all public areas must have audio coverage for life safety announcements such as fire evacuation. As with airports, the wide range of different acoustic environments and geographic spread requires a large number of independently-addressable audio zones. The audio system inputs may be local, such as interactive audio playback within rides, or remote, such as background music, advertisements or paging announcements. Parade grounds and live shows complicate matters further, with many radio microphones and loudspeakers covering a very large area.

Now the problem is understood – how is it solved? Traditionally, it was analogue: large quantities of analogue multicore, thousands of crosspoints of punch-on patchbay, and many racks of analogue signal processing. It was difficult and expensive to engineer robust system redundancy, and very difficult to get computer-interfaced control of the audio signal processing. Each audio channel required a balanced line connection, requiring a huge quantity – and weight – of cable.

All this was revolutionised in the early 1990s by the arrival of DSP technology. DSP brought huge cost and functional benefits to the audio installation industry, for two principal reasons: it allows very simple interfacing of audio functionality to computer systems; and it permits arbitrary, heterogeneous arrangements of audio DSP functionality to be realised cost-effectively in generic hardware. The other crucial development from digital audio technology was digital audio networking, carrying many channels of uncompressed audio at low latencies on standard computer networking infrastructure. Analogue audio multicore cables were hugely expensive to buy, and even more expensive to install, whereas computer networking cables are flood-wired into all commercial and public buildings. So despite the relatively high transceiver costs, audio networking was a vastly cheaper way of getting audio around a commercial building. It also implicitly provided computer-controlled audio signal routing, saving the cost of expensive dedicated audio routers. Since the late 1990s the de facto standard audio networking technology for the commercial installation industry has been CobraNet, which is Ethernet-based (layer 2), has a latency of about 5 milliseconds, and conveys up to 64 channels of audio in both directions.

To indicate the state of the art in audio networking, Michael spoke about a relatively new technology called Audinate Dante. It’s Internet Protocol based, typically runs over Gigabit Ethernet, and it’s scalable for both bandwidth and latency, making it very flexible for a wide range of applications. It may be configured for performance comparable to CobraNet, but in principle it can also function (with higher latency and lower bandwidth) over poorer-quality networks such as the public internet, or alternatively it can function as an ultra-low-latency, ultra-high-bandwidth point-to-point link between audio processors.

Michael then explained how these technologies are brought together. A system typically comprises some number of analogue audio i/o units, DSP units and control interfaces, connected by an Ethernet network for audio and control data, but these units may all be physically remote from each other. Control interfaces are used to communicate with user interface devices, uninterruptable power supplies, fire and life safety systems, building services management (HVAC) systems, show control systems, and many other possibilities.

This is a very successful technology area, with a number of companies actively competing. Peak Audio developed the first product of this kind in the early 1990s, the MediaMatrix system, which comprises a PC-AT motherboard with custom ISA backplane, DSP ISA cards with Motorola 56K DSPs, and analogue i/o boards. This was first used to provide an adaptive, distributed sound reinforcement system in the US Senate Chamber, which posed some unique challenges that could only be solved by computer-controlled DSP. The MediaMatrix product was licensed to Peavey, who manufactured and distributed it, and it was extremely successful.

The second generation MediaMatrix product was the Nion, launched in 2004. It has a PowerPC CPU running embedded Linux, for distributed control and communications with the other Nions on the network, and monitoring the DSPs and audio interfaces. It has a number of Analog Devices SHARC floating-point DSPs, a proprietary high-bandwidth low-latency audio link bus that uses Cat-5 cable to connect Nion units together, and a CobraNet interface module. CobraNet has such high bandwidths and low latencies that it needs dedicated data processing hardware: a generic CPU doesn’t have sufficient network performance. It also features a selection of “general purpose I/O” connections for control interfacing: logic i/o, relays, high-current outputs (for driving lamps, solenoids, etc.), control voltage inputs and outputs, and rotary encoder connections, for creating simple custom control panels.

Michael gave a demonstration of the NWare software, a Windows application used for defining the DSP and control functionality. It has a graphical user interface resembling a CAD drawing tool, allowing the user to drag-and-drop blocks representing DSP functions, audio i/o, control functions, control scripts, and many other functions. It also allows creation of custom control panels for PC-based or touch-screen user interfaces. When the design is complete, the “deploy” button is pressed to generate the DSP and control code, and download it to Nions connected on the network, which immediately take on the designed functionality.

The lecture was wrapped up with a look at the NWare system design for the MediaMatrix system at Emirates Stadium. As well as huge quantities of signal processing blocks, it featured touch-screen graphical user interfaces based on architectural plans of the stadium for ergonomic control and monitoring of audio across many different zones in the stadium at once. Custom support for communicating with UPS devices is implemented in the Python scripting language, which executes on the Nion. This vast system design gave a flavour of the tremendous complexity of the audio system implemented with MediaMatrix.

The NWare software can be downloaded for free from the downloads section on the MediaMatrix website.

Meeting report by Michael Page


Critical Listening/Evaluation — a path to the future of quality music

Date: 3 Jun 2009
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Special lecture by George Massenburg of George Massenburg Labs

George Massenburg needs little introduction – even if you don’t know of him, you have probably heard his recordings. For a detailed biography, see www.massenburg.com/cgi-bin/ml/bio.html.

Meeting Report

Quality recordings

What is difficult to represent in this report is the passion George exudes about music, a passion which drives him to strive (and help others to strive) to continually improve the quality of recorded music. Many recordings were replayed in the course of this lecture, some made by George, others not. Most were 192kHz, 24-bit; some were transferred from analogue master tapes.

George began by replaying a Diana Krall track, pointing out the subtlety and detail captured by Al Schmitt. In a change of style, the next track was by Neil Young – a new song about the recent financial crisis with the chorus line “A bailout is coming, but not for you”. Elements of the recording were described: a pair of guitars (slide and acoustic), rock’n’roll drums and a hi-hat “somewhere in the background”.

George then played a clip from YouTube of a recent and currently very popular track by Autotune The News (their second track, pirates. drugs. gay marriage), an original piece where television newsreaders have been cleverly edited in time and pitch such that they appear to be singing. The point here? Although the YouTube clip has been extremely popular (it received 1.5 million hits in the first week, possibly setting a web record), and although George admitted to thinking it “brilliant”, the audio quality is very poor. George pointed out that repeated listening at this YouTube-quality quickly gets very annoying because of the low-fidelity sound.

Compression Artefacts

George then played the results of some subtraction tests on lossy audio codecs, a technique which George refers to as the Moorer test as it was originally suggested by James A Moorer. In these tests, high-quality 192kHz, 24-bit recordings were converted to various encoded forms such as MP3 and AAC. The encoded files were then decoded and upsampled back to the original 192kHz, 24-bit. A sample-by-sample subtraction was then performed, and the resultant difference – the error introduced by the codec – then replayed. The resulting error signal is surprisingly high in amplitude (estimated by George as typically 25-30% peak), clearly correlated to the signal and with a complex relationship to the original sound (not simple harmonic distortion).
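For readers who would like to try the subtraction test themselves, a minimal Python sketch is shown below, assuming ffmpeg and the numpy/soundfile/scipy packages are installed and that the source file is stereo. The file names, the 192kbps bitrate and the cross-correlation alignment step are illustrative assumptions rather than details given in the lecture.

```python
import subprocess
import numpy as np
import soundfile as sf
from scipy.signal import correlate

ORIGINAL = "master_192k_24bit.wav"            # hypothetical high-resolution source

# 1) Encode to a lossy format, then decode and upsample back to 192kHz.
subprocess.run(["ffmpeg", "-y", "-i", ORIGINAL, "-ar", "44100",
                "-codec:a", "libmp3lame", "-b:a", "192k", "encoded.mp3"], check=True)
subprocess.run(["ffmpeg", "-y", "-i", "encoded.mp3",
                "-ar", "192000", "decoded.wav"], check=True)

original, fs = sf.read(ORIGINAL, dtype="float64")   # assumed stereo
decoded, _ = sf.read("decoded.wav", dtype="float64")

# 2) Codecs and resamplers introduce delay, so align the decoded copy to the
#    original by cross-correlating the first few seconds of the left channel.
seg = min(len(original), len(decoded), 4 * fs)
xcorr = correlate(original[:seg, 0], decoded[:seg, 0], mode="full", method="fft")
lag = int(np.argmax(np.abs(xcorr))) - (seg - 1)
decoded = np.roll(decoded, lag, axis=0)

# 3) Sample-by-sample subtraction: the residual is the error the codec path added.
length = min(len(original), len(decoded))
residual = original[:length] - decoded[:length]
print("Peak residual: %.1f%% of full scale" % (100.0 * np.max(np.abs(residual))))
sf.write("codec_error.wav", residual, fs)      # listen to the artefacts in isolation
```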

George takes the view that his students should learn to recognise the nature of the codec error using this subtraction method and then listen to the encoded music. Using this learning technique, listeners can familiarise themselves with the artefacts’ sound in isolation, and can subsequently pick them out more readily when the encoded material is played.

Listen Again

George believes that every time we hear a piece of music we should have the possibility of hearing something new – “to take home something else” – and that this is more readily achieved with high resolution recordings. Although George concedes that it’s possible to make a “pretty good” 44.1kHz/16-bit CD, he remembers the first time he heard a digital recording: rather than being impressed, he was “horrified”.

Subsequent work to push the boundaries of converter technology (George recalls the contribution of Paul Frindle in this area) has convinced him that good digital now is good. He believes that we don’t have to go back to magnetic tape to make good records, and describes himself as having “an easy peace” with both vinyl and analogue tape.

Recording Tips

George deprecates recording techniques in which small elements are recorded separately and later combined/corrected/stretched/re-tuned, etc. He believes a key to great music recording is to maintain a performance focus. Preferably, the band should perform and be recorded playing simultaneously in the same space. George offers these suggestions to help your next recording:

  • 1) Only use destructive record.
  • 2) No punch-ins.
  • 3) No one is allowed to take the recording home and ‘tweak’ it – they can do another take, but the previous one will be overwritten.

The AES UK section and George wish to thank the companies who kindly supplied equipment for this lecture, namely ATC (monitor loudspeakers), Digidesign (ProTools system), Arcam (DVD player) and Prism Sound (D/A converters).

Report by Nathan Bentall (edited by Keith Howard)


Reality is Not a Recording / A Recording is Not Reality

Date: 12 May 2009
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Lecture by Jim Anderson of Jim Anderson Sounds

Abstract

The former New York Times film critic Vincent Canby wrote: “all of us have different thresholds at which we suspend disbelief, and then gladly follow fictions to conclusions that we find logical.” Any recording is a ‘fiction’, a falsity, even in its most pure form. It is the responsibility, if not the duty, of the recording engineer, and producer, to create a universe so compelling and transparent that the listener isn’t aware of any manipulation. Using basic recording techniques, and standard manipulation of audio, a recording is made, giving the listener an experience that is not merely logical but better than reality. How does this occur? What techniques can be applied? How does an engineer create a convincing loudspeaker illusion that a listener will perceive as a plausible reality?

Meeting Report

Jim Anderson: Professor of Recorded Music, Clive Davis Department of Recorded Music, New York University

Jim started his lecture with the attention-grabbing statement that audio recording is trickery, a devious deception – then expanded the point to explain that the aim is to make you, the listener, believe you’re hearing the truth, when actually it’s sleight of hand. He set about illustrating this by playing back a diverse range of audio recordings over the course of the lecture and discussing them, casting some light onto the techniques and tricks he’d used to exercise that devious deception – and, without exception, to create musical listening experiences of quite exceptional quality.

Jim started by playing the commercial release of J. J. Johnson’s “The Brass Orchestra” – it was extremely punchy, dynamic, and live-sounding. He then played another track: while obviously the same piece, and possibly the very same performance, it had much less impact – the drums were much quieter and the soloist was clearly off-mic. This was from a simple stereo pair of mics placed to capture the “air” of the room, and it illustrated the striking difference between the somewhat artificial, yet highly appealing experience created by the commercial release, and the fly-on-the-wall experience of the performance – which is arguably the “real” experience. Jim then discussed some of the details of this performance and the techniques he’d used to create the “false”, yet plausible and appealing final product: it was captured live-performance-style in a single take with no overdubbing; microphone selection was key in realising tonal and dynamic differences within the group; and the studio had a “good” acoustic for performance, which was nevertheless enhanced with artificial concert-hall reverb. The artist wanted to mix first without the solos, in order to get all the internal balances right, then add the solos later – so the whole thing was mixed twice.

Jim expanded on the microphone selection points by playing “High Noon – The Jazz Soul of Frankie Laine”, featuring baritone sax player Gary Smulyan. Jim used ribbon microphones, with their smooth, easy sound, on all of the nine-piece backing group, but used condensers to bring the baritone sax and French horn into sharp dynamic focus. This allows the backing to sit up-front in the mix while keeping the sax solo sounding appropriately prominent.

To illustrate another interesting technique, Jim played drummer Marvin “Smitty” Smith tracking “The Road Less Travelled”: Marvin had requested “more depth, more breadth” in the kick drum. Jim met this requirement by using a Beyer Opus 51, a boundary-effect mic designed for piano, under a sheet of wood to isolate it from the rest of the kit. He used two Opus 51s and an M88 in the middle, to create a mid/side array. In stereo, it creates a perfect image of the kit; in mono, it collapses and provides a remarkably leakage-free kick drum.

Among other recordings Jim discussed, he played a track by Patricia Barber, recorded in Chicago. It had an extraordinarily huge, deep, broad-sounding kick drum and very prominent, snappy drums in general, while the female vocal was up-front yet full in the low-mids. He then played another recording, with the same trumpeter in the same room, yet smoother-sounding – because a tube mic was used rather than a ribbon. The kick drum was only 18”, but with good tuning and an M/S mic it gave huge depth and finish.

All recordings played so far had been tracked straight to digital: Jim’s next recording was a modern attempt to recreate the classic 1970s Blue Note sound, for an album called “Hubsound – The Music of Freddie Hubbard”. In contrast to direct-to-digital tracking, this was done using a 16-track 2” tape machine running at 15 inches per second with no noise reduction. It’s impossible to make lots of overdubs because 16 tracks is very limiting. In this way, it emulates not only the sound, but also the practical constraints – and therefore the recording techniques – of the Blue Note vintage.

Next up, we heard Gonzalo Rubalcaba performing “Here’s that Rainy Day” in Criteria Studio A in LA: solo piano in a large, live, rectangular room. Mics were a U87 above, a DPA 4007 close, and a DPA 4006 a little further back; beyond that, a pair of U87s in a modified Polyhymnia configuration, so the room sound was also captured in case a surround mix was subsequently needed.

He then played for us Bebo Valdes, a live recording made in a recording truck at the Village Vanguard nightclub. Mics were just a Sanken CUW180 with a pair of ratcheted, movable capsules, here set up for X/Y. Mic pres with A/Ds were on stage, plus an audience microphone, and optical links connected the A/Ds to the truck. The recording setup was triple-redundant with Tascam DA98s, but the primary recorder was ProTools HD. Jim created a rough mix on a Yamaha DM2000, for the performers to check each performance immediately afterwards. Mics were a combination of omnis and cardioids on piano, the Sanken X/Y on bass, and omnis on the audience. The worth of the latter was shown when the audience started singing along – precise capture of the audience really added atmosphere to the final product.

Jim concluded by playing us his first ever jazz recording – Ella Fitzgerald at the New Orleans Jazz and Heritage Festival in 1977. Ella knew Stevie Wonder was in the audience, so she called him up to join in! The encore was the duet “You Are The Sunshine Of My Life”. It was a pretty magical moment to capture for a first jazz recording, particularly as the tape ran out immediately after the end of the song – a close-run thing.

Jim wrapped up this interesting talk – and listening session – by maintaining he’s the liar! Thanks to PMC and Arcam for the superlative audio reproduction system kindly lent to us for the evening.

Meeting report by Michael Page


How to make a high-resolution record label

Date: 9 Jun 2009
Time: 00:00

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Lecture by Philip Hobbs of Linn Records

Philip Hobbs is a Producer and Audio Consultant at Linn Records Ltd. He first worked for Linn in 1982, leaving to study on the Tonmeister course and returning to Linn in 1987 after graduating. Philip’s main roles at Linn have been in music recording and speaker design. Philip described himself as being the ‘worst sort of communicator’, because, according to him, he is both ‘Scottish and an engineer’.

Philip talked tonight of how Linn’s business has been ‘transformed over the last 3 years’ by the introduction of their music download service – a service where the customer can choose the download quality all the way up to 192kHz, 24-bit, and where all the downloads are DRM-free.

Linn History

Philip gave a ‘two minute trip down memory lane’ describing how Linn started.

Linn was founded by Ivor Tiefenbrun as an offshoot of Castle Precision Engineering, a machining company that made parts for such things as aircraft and Rolls-Royce. The original home of Linn, Linn Business Park, gave Linn its name, and Linn established its hi-fi pedigree with the Linn Sondek LP12, a record player still in production today.

Linn expanded its product range and has made amplifiers, CD players, active loudspeakers and Digital Stream Players which stream files from hard disk.

Linn The Record Label

Like many other hardware manufacturers, Linn developed an interest in the recording industry. The initial motivation was to make recordings to test the reproduction capability of the LP12 and to investigate vinyl cutting lathes to the same end, but Linn has subsequently blossomed into a serious audiophile record label with many original recordings.

The label has traditionally focused on classical music; Carol Kidd was the first Linn ‘proper jazz artist’, and Linn released her first album in 1983. Linn made an initial pressing of 7000 records, selling them through record shops. In 1984, Linn hooked up with a band called The Blue Nile, releasing their first album ‘A Walk Across the Rooftops’. The Blue Nile were keen to sell lots of records and Linn ‘spent hundreds of thousands trying to get them to release their second album’. Philip, who was designing speakers for Linn at this time, estimates the total bill in relation to The Blue Nile to be just under £1 million. By 1992, Linn were working on building their classical catalogue in the ‘standard boutique label’ philosophy by focusing on recording quality.

By 2006, Linn had around 250 titles and had established distribution in Japan and America. In Philip’s words, the business ‘was a complete catastrophe’, as it was ‘not economically viable to sell CDs in commercial retail space’ – a trend, according to Philip, that had been developing since the 1990s. Philip recalls that the situation had become so bad by 2006 that Linn were faced with a decision either to leave the record business altogether, or to find some radical new approach – to find ‘a way to get back to the customers’, avoiding ‘the frustration that traditional retailing gives to many companies, and record companies in particular’, namely that ‘the company is so far away from the people they’re selling to’.

Download Revolution

The conclusion at Linn was that they needed to use the internet ‘to connect directly to the customers without compromising on quality’. At this time, Apple’s iTunes service was well established and, thanks to the widening availability of fast broadband services, the ‘possibility you could sell someone 1GB of data’ was becoming a reality.

So Linn built a web site where customers could download music directly – similar to the iTunes idea, but with a unique selling point: the ability to provide lossless downloads at up to 24-bit/192kHz sampling rate, where the customer is free to choose the download resolution/sampling frequency from studio-master quality down to MP3-level quality. Like iTunes, customers can buy individual tracks or whole albums, with the higher quality downloads commanding a higher price tag.

According to Philip, despite an initial cost of £100k, the site is now profitable after around 2.5 years of service.

Philip stated that the more traditional distribution method – physical media sold through shop retail – returns around a 15-20% share of the ticket price to the record label. For example, if a CD retails for £15, £2.50 for the record company would be considered ‘doing pretty well’. With the download service, which operates in the absence of ‘middlemen’, the margins increase considerably. Philip estimates that the download business returns a margin of around 80%, and further, that profits would probably increase were Linn to move entirely away from physical media (which they continue to support out of loyalty to a minority of, presumably similarly loyal, customers).

High Quality Master Recordings

A key factor that made Linn’s Hi-Res download business viable was their long-term focus on making recordings of the best quality possible, a commitment which had led them to make many of their master recordings at 96kHz or 192kHz. This resulted in a ready supply of high-resolution back catalogue. This situation was, according to Philip, in contrast to many other record companies, whose masters were typically made/archived at 44.1kHz or 48kHz.

Customer Focus

Linn have a base of around 120,000 customers. With their focussed direct-marketing approach, a Friday evening email newsletter often results in many £1000s of business from the download service by Monday.

Philip also points out that these direct marketing activities rarely offer significant discounts (which would reduce profits) – usually, they simply aim to draw the customer’s attention to new material or other works similar to previously purchased material.

Download Usage – how are the customers using the downloads?

Philip sees four main customer types, split by playback method:

  • PCs with sound cards, Windows Media Player, iTunes, etc.
  • Portable devices (iPods, Zunes, etc.)
  • Burn-to-disc – customers making CD-R/DVD-R copies
  • Streamed Media Players – from Linn and others – files streamed from local server

The DRM Issue

Linn considered the possibility of using DRM to protect their downloaded material, but at the time the web site was being prepared, it became clear to Linn that DRM just didn’t work sufficiently well. According to Philip, many people felt that moral arguments eventually killed DRM, but he wonders whether it was in large part due to an inability to make a smoothly working system without imposing excessively limiting restrictions on the customers.

Download Formats

Offering such a wide range of download quality options has provided Linn with some interesting statistics on the decisions customers make when offered a quality/cost choice. Despite the price differentials, in 2007, 25% of purchases were of the ‘studio master’ quality. By 2008, the figure had risen to around 50%, and so far in 2009 it seems to have risen further, to around 70%. Of the CD-quality albums downloaded, customers are showing a 50/50 split between choosing FLAC and WMA.

Additionally, of those customers who purchased studio-master quality downloads and were offered a choice between 96kHz and 192kHz, 80% chose the higher rate, in spite of the fact that many players can’t play 192kHz!

Further, Philip is convinced that around half the customers who have purchased studio-master quality downloads don’t currently have the playback equipment to support the sample rate/bit-depth they bought. His conclusion is that, given a choice, Linn’s customers prefer to buy the best quality available. If this seems odd, there may be some logic here, and in some ways it maintains the Linn tradition. The original Linn LP12 can be upgraded all the way to its current production specification; this ability to upgrade has been a Linn philosophy, at least for the LP12, for many years. By upgrading the equipment, the customer can benefit without buying into a whole new format. If the customer buys the Studio Master, the data they get is all that was recorded – it is essentially ‘as good as it will ever be’ – and with such a music collection, future equipment upgrades may offer further sound improvements when replaying the original material, in many ways similar to vinyl.

High Resolution Benefits

Philip gave a striking demonstration of the potential enjoyment offered by high-resolution, high-quality recording by playing Handel’s Messiah conducted by John Butt (for which Philip was himself the recording engineer). Unbeknown to the audience, the playback began at 88.2kHz/24-bit, but as it progressed, the resolution dropped to 44.1kHz/16-bit, then to 192kbps MP3, then to 96kbps MP3. Although these differences were not immediately obvious to all (at least in the listening environment in which they were presented), Philip described how it was common for the listeners’ attention to progressively drift to other matters as the bit-rate dropped: they ‘tend to get bored and start thinking about something else’. This certainly described my personal experience with surprising accuracy.

Streaming Player

Philip briefly demonstrated one of the Linn Streaming Players, which offer one possible method of replaying the downloaded material. One of the benefits Philip sees for customers with this type of equipment is a significant increase in convenience. Gone are the walls of CD/LP shelves, replaced by a compact hard-disk-based server controlled via a little application on an iPhone – a use case Philip describes as ‘addictive’.

The Future

Linn are beginning to diversify. They have taken on a couple of small labels and are offering downloads for them alongside their own material. For those interested in purchasing downloads, Linn’s web site may be found at http://www.linnrecords.com/ and test files for evaluating quality (and compatibility)  can be found at http://www.linnrecords.com/linn-downloads-testfiles.aspx

The AES would like to thank Philip for his fascinating talk. I’m sure many members were greatly encouraged to hear that there are still many customers for whom recording quality is something worth paying for.

Report by Nathan Bentall

Edited by Keith Howard


Intelligent Audio Editing Technologies

Date: 11 Jan 2011
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Lecture by Dr. Josh Reiss, Senior Lecturer, Centre for Digital Music, Queen Mary University of London.
A recording of the lecture is available here (81MB mp3)

The tools of our trade have transformed in the last twenty years, but the workflow of a mixing engineer is almost the same. A large proportion of the time and effort spent mixing down a multitrack recording is invested not in the execution of creative judgement, but in the mundane manipulation of equalisers, dynamics compressors, panning, and replay levels, so that the timbre and blend of individual channels is correct enough to attempt a balance.

There are two good reasons why much of this work has not already been automated. The first is that the task is not trivial: it is a highly parallel and cross-adaptive problem, and the correct value for every setting will depend to some extent on every other. The second reason is a resistance from those who assume that automating the mixdown process will either remove the requirement for a skilled hand and ear, or result in lazy use of automation to the extent that their careers or their integrity will be threatened. To make all music sound the same is not the goal of automation. Rather, automatic mixing will speed up the repetitive parts of an engineer’s job so that more effort can be expended on the art of production.

We need only look at the evolution of digital cameras to see what could be possible with audio. A typical consumer camera of twenty years ago would have had a fixed focal length and aperture, and perhaps an adjustable shutter speed. Now, multi-point auto-focus is a standard feature, the exposure time, aperture, and colour balance are adjusted automatically, a digital signal processor ameliorates camera shake, and so on. Poor shots may be recognised and retaken as many times as is necessary, because the photographer can immediately view their photograph. In spite of these enhancements, professional photographers still exist, and still need to be taught about the optics and anatomy of a camera. However, the emphasis of photographic discipline has shifted towards the creative side of the profession: there is less time spent setting up the camera and developing exposures, and more time in perfecting the technique and shot, and retouching the images.

There are, broadly speaking, four kinds of automatic sound processing tool:

Adaptive processing. Adaptive processes adjust instantaneously to the material that is being played through them. De-noisers and transient shapers are adaptive in nature.

Automatic processing. Automatic processes place some aspects of operation under user control, and make intelligent guesses about the positions of other controls. The ‘automatic’ mode on a dynamics compressor is such an example.

Cross-adaptive processing. A cross-adaptive tool must be aware of, and react to, every signal within the system. For example, the automatic level control on a public address system that adjusts to the ambient noise level may be cross-adaptive.

Reverse engineering tools. Deconstruction of a mix for historical reasons would involve taking the multitrack session master and the stereo master, and determining which processes must be applied to the former to derive the latter. It would be useful to automate some of this.

Adaptive mixing tools require two components: an accumulative feature extraction process, and a set of constrained control rules. Much of the difficulty of getting these tools right is in obtaining the correct information from the audio in the first place: to detect, for example, the pattern of onsets, the correct loudness, and thus precise masking information. The target for an equaliser can then be to reduce temporal and spectral masking, rather than to aim for a flat frequency response. Panning can be used to reduce spatial masking. A compressor can be inserted when the probability of a particular instrument being heard falls below a certain threshold, so that the instrument can be boosted to a certain average loudness without its peak loudness exceeding a higher threshold.
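To make the idea of accumulated features plus constrained control rules concrete, here is a minimal Python sketch of a cross-adaptive fader balance. The simple gated RMS measure (rather than a perceptual loudness model), the frame size and the ±12dB constraint are simplifications for illustration, not details of the Centre for Digital Music system.

```python
import numpy as np

def track_level_db(track, frame=4096):
    """Accumulated RMS level of one track, in dB, ignoring near-silent frames."""
    n = len(track) // frame * frame
    frames = track[:n].reshape(-1, frame)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    rms = rms[rms > 1e-4]                        # crude gate against silence
    return 20.0 * np.log10(np.mean(rms) + 1e-12)

def auto_fader_gains(tracks, max_change_db=12.0):
    """Cross-adaptive rule: every fader depends on the levels of all tracks."""
    levels = np.array([track_level_db(t) for t in tracks])
    target = np.mean(levels)                     # pull every track towards the mean
    gains_db = np.clip(target - levels, -max_change_db, max_change_db)
    return 10.0 ** (gains_db / 20.0)

# Usage with synthetic material: one loud and one quiet sine "track".
fs = 44100
t = np.arange(2 * fs) / fs
tracks = [0.5 * np.sin(2 * np.pi * 220 * t), 0.05 * np.sin(2 * np.pi * 440 * t)]
gains = auto_fader_gains(tracks)
mix = sum(g * tr for g, tr in zip(gains, tracks))
print("Fader gains applied:", np.round(gains, 2))
```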

Dr. Reiss played some examples of automated mixing from the Centre for Digital Music, showing us the system element by element. First, each instrument was manipulated in isolation. Then an automatic fader balance was performed. Finally, with one button, the compressor, equaliser, panning, and fader settings were set up for an entire multitrack jazz recording. The result was surprisingly effective, although the automatic nature of the balancing was clear. The vocals, for example, were somewhat quieter than custom usually allows, and the mix was equalised to a fairly flat spectrum whereas most commercial music is boosted at the top and bottom ends. Nevertheless, the power of automated mixing was effectively demonstrated – the result was perfectly reasonable for a monitor mix and, as the algorithms are perfected, the results will certainly improve further.

Suggestions and examples of other automatic tools were shown, including a feedback eliminator for live sound, which set itself the target of keeping the loop gain of the system below 0dB in every frequency band; it achieved this by finding the transfer function of the system and calculating its inverse. A plug-in for automatically correcting inter-channel delay was also demonstrated, which successfully reduced the artefacts created by spill between one microphone and another. The aim of these tools is again to free up the balance engineer’s hands and mind for the more creative aspects of live sound engineering.
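The delay-correction plug-in’s internals were not described in detail, but the textbook approach is to estimate the lag between channels by cross-correlation and then advance the late channel, as in the Python sketch below (the excerpt length and the alignment method are illustrative assumptions).

```python
import numpy as np
from scipy.signal import correlate

def estimate_delay(reference, delayed):
    """Delay of `delayed` relative to `reference`, in samples (positive = late)."""
    xcorr = correlate(delayed, reference, mode="full", method="fft")
    return int(np.argmax(np.abs(xcorr))) - (len(reference) - 1)

def align(reference, delayed):
    """Advance the late channel so the spill lines up with the reference."""
    lag = estimate_delay(reference, delayed)
    if lag > 0:
        return delayed[lag:]
    return np.pad(delayed, (abs(lag), 0))

# Usage: spill arriving 120 samples late on a second microphone.
fs = 48000
src = np.random.randn(fs)
mic2 = np.concatenate([np.zeros(120), 0.3 * src])[:fs]
lag = estimate_delay(src, mic2)
print("Estimated delay: %d samples (%.2f ms)" % (lag, 1000.0 * lag / fs))
```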

The scope for further work in refining these tools is clear, although they already work impressively well. Informal blind testing has shown that it is hard to distinguish the automated mixes from those executed by students (at least, in short excerpts). In an act of subterfuge, Dr Reiss entered an automated mixdown into a student competition, and confessed his crime only after the competition was judged. Although the mix failed to win a place in the competition, it also failed to arouse the judges’ suspicions. Inevitably, technology will soon change our craft beyond recognition. Fortunately for us, the researchers appear no closer to developing a substitute for talent.

Report by Ben Supper


Loudness

Date: 13 Jan 2009
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Lecture by Thomas Lund of TC Electronics

Thomas Lund, TC Electronic A/S

Thomas Lund’s background includes work as a recording engineer and musician and the study of medicine – an unusual combination which may contribute to his understanding of loudness perception. Thomas has also been involved in the design of many of TC Electronic’s products, he has contributed to various standardisation groups on the subject of loudness, and he has authored many papers presented to the AES and other bodies.

Traditional Loudness Measurement

Recent years have seen the ‘level’ of pop/rock music, as delivered by CD, steadily increase. Thomas cited the simplistic way that audio level has been measured as a partial cause. Historically, audio level has often been measured by peak programme meters, and commonly used definitions of overload have been very simplistic, such as peak-level counting (e.g. three consecutive full-scale samples equals overload). Such simple techniques of measuring (and, by association, limiting) the level may have worked well when systems consisted of a microphone, a preamp and an ADC, but with digital processing techniques numerous methods have been devised to increase the apparent loudness of material delivered on CD while ‘working around’ the peak-level limitations, apparently (we must assume) to some perceived commercial benefit to the record industry.

Many hold the opinion that such ‘hot mastering’ techniques are severely detrimental to the overall quality of modern music releases. Thomas calls this drive for increased level whatever the cost, coupled with a high willingness of broadcasters and consumers to use large amounts of data compression (for archiving, broadcast and replay), a ‘war on music’.

The Problems of Incorrect Levels

With such hot-mastering techniques, it is trivial to generate digital signals whose analogue output exceeds 0dBFS after the reconstruction or up-sampling filters. These greater-than-0dBFS peak levels can cause serious problems in the reproduction chain, where some processes have been implemented with the assumption that 0dBFS is the largest signal they should expect.

Thomas offered demonstrations based on a commercially available ‘professional-grade’ sample rate converter, subtracting output from input. In this experiment the output should have been silent, but differences could be heard clearly, manifested as ticks and signal-related noise. Other potential problem areas, according to Thomas, include limiting in mix busses and codecs such as MPEG-1 Layer 3. These processes can all exhibit similar problems when faced with very high level inputs, a phenomenon Thomas further demonstrated. The codec problems can depend on the implementation of the codec as well as the codec itself.

Because of these issues, Thomas recommends normalising to -3dBFS – not to 0dBFS – in digital mixing and recording situations. He pointed out that the final 3dB increase can be done in the mastering room without any real quality loss, given that most recordings use 24 bits.

Better Methods of Loudness/Level Measurement

Thomas gave a functional summary of various improved methods of measuring loudness level and showed relative results based on ITU-R BS.1770. A simple improvement is the over-sampling peak programme meter which offers a more accurate representation of the true peak level.
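To illustrate why over-sampling helps, the Python sketch below compares a naïve sample-peak reading with a 4x over-sampled ‘true peak’ reading, in the spirit of ITU-R BS.1770; the resampling filter here is simply the scipy default, not the exact BS.1770 interpolator.

```python
import numpy as np
from scipy.signal import resample_poly

def sample_peak_dbfs(x):
    """Naive peak: the largest absolute sample value."""
    return 20.0 * np.log10(np.max(np.abs(x)) + 1e-12)

def true_peak_dbfs(x, oversample=4):
    """Over-sampled peak: upsample first, then take the largest absolute value."""
    return 20.0 * np.log10(np.max(np.abs(resample_poly(x, oversample, 1))) + 1e-12)

# Usage: a full-scale sine whose waveform peaks fall between the samples.
fs = 44100
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 11025 * t + np.pi / 4)     # samples only ever reach 0.707
print("Sample peak: %.2f dBFS" % sample_peak_dbfs(x))   # about -3 dBFS
print("True peak:   %.2f dBFS" % true_peak_dbfs(x))     # close to 0 dBFS
```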

Thomas also presented the ‘LM5 Loudness Radar Meter’, a loudness meter available from TC Electronic as a plug-in for Pro Tools.

TC LM5 Loudness Radar Meter

This meter includes representations described as ‘Loudness Units’ (LU) or LkFS, ‘Consistency’ and ‘Center of Gravity’. Center of Gravity indicates the overall loudness of the programme material or music track, while Consistency indicates the ‘intrinsic loudness changes’ present in the track, with 0 representing a steady-state signal (one which has no loudness changes at all, e.g. a sine wave) and progressively more negative numbers indicating reducing Consistency. Low Consistency scores, such as -4 or lower, indicate that the material may have a large dynamic range.

Conclusions

In conclusion, Thomas offered the following recommendations:

  • Stop Counting Samples: There are better methods of measuring peak levels than counting the number of consecutive full-scale samples
  • True Peak Level: Set maximum peak level at -1dBFS using a true peak meter equipped with oversampling capability.
  • Dialog Level: Suggested level of dialog is -26 to -22 LkFS.
  • Music: Suggested level of music is around -20 LkFS.
  • Avoid peak-level normalisation.

If audio level is anchored only to peak level or only to dialogue – both commonly used techniques – loudness chaos is likely to ensue, with extreme level jumps between programmes, commercials and other home sources.

The tools and understanding exist to deliver well-balanced loudness levels between different programmes and material, giving the end listener a more pleasant viewing/listening experience, with the potential for reduced distortion and overall quality improvement. Thomas outlined the problems and offered tools and methods for solving them.

Report by Nathan Bentall (edited by Keith Howard)


From Hi-Fi to PA: Predicting and Measuring What We Hear

Date: 20 Oct 2009
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Lecture by Peter Mapp, Mapp Associates

Download MP3 recording of this lecture (24MB)

Everyone wants ‘high quality’ sound – but what does this mean? Is sound quality measurable? Is it predictable? The talk will look at how we can assess sound quality – both in large spaces such as concert halls, cathedrals and even railway stations, and in small rooms such as home theatres and hi-fi listening spaces. After introducing a number of parameters and concepts that affect sound quality and the listening experience, Peter will discuss how these can be measured and potentially predicted. In particular, the use of 3D computer modelling of rooms will be highlighted, together with the importance of bass frequency reproduction. A number of case studies and examples of problem sound systems/rooms will be presented. Peter will conclude the talk with an insight into some of his latest research and the introduction of a new measurement/assessment concept, SQI – the Sound Quality Index.


The Anatomy of a Modern Audio-Video Amplifier

Date: 10 Nov 2009
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Lecture by John Dawson, Arcam

Download audio recording of lecture (14MB MP3)

A modern Audio-Video amplifier/receiver (AVR) is an exceedingly complex piece of consumer electronics, requiring expertise in many aspects of analogue and digital audio and high definition video, plus considerable software skills. As such it represents a huge project for any small to medium sized audio company. This lecture takes a look inside the Arcam AVR600 – one of the few such units developed outside of the large Japanese CE companies – and will discuss some of the design choices made in order to try to ensure a good chance of commercial success.


Special Lecture: An Interview with Neville Thiele

Date: 24 Nov 2009
Time: 18:30

Location: Royal Academy of Engineering
3 Carlton House Terrace
London SW1Y 5DG

Special Lecture: Interview conducted by Keith Howard

Download audio recording of lecture (24MB MP3)

An excellent Tutorial by Neville Thiele can be found here (AES Members only, log-in required for www.aes.org)

Abstract

Neville Thiele’s name is known to anyone who has ever taken an interest in the practical design of moving coil loudspeakers, through the Thiele-Small parameters that bear his name and that of Richard Small. In 1961 he wrote a seminal paper on the design of vented (reflex) loudspeakers that – although it was largely ignored for 10 years until reproduced in the AES Journal – is now acknowledged as initiating the filter parameter based approach to loudspeaker analysis and synthesis which today is routinely used by the audio industry at large. In recognition of this, in 1994 he was awarded the AES Silver Medal.

In this interview-based lecture, Neville Thiele will talk about what led up to this breakthrough and its significance to the speaker design process. He will then give three short presentations on loudspeaker-related topics: filter-assisted bass alignments and novel crossover approaches; driver ageing effects; and driver impedance correction in crossover networks. Questions will then be invited from the audience.