Every year, as the summer draws to a close, a bunch of audio researchers from both academia and industry get together from across the world to share their latest ideas on, and developments in, digital audio effects. The fact that I’m often one of those researchers is one of the many reasons I love my job. This year the 13th International Conference on Digital Audio Effects (or DAFx10 for short) was held at the Institute of Electronic Music and Acoustics at the University of Music and Performing Arts Graz in Austria. For four days we heard presentations and keynote talks and saw posters discussing everything from transforming the emotional content of speech through to accurately modelling the shape of the plucking finger for a physical model of a guitar. We also ate and drank very well.
Although there were no jaw-dropping, standout moments this year, there was a lot of really good stuff. Axel Röbel, head of the analysis-synthesis team at IRCAM (Institut de Recherche et Coordination Acoustique/Musique) in Paris, which is probably the most widely known and respected music technology research institution, presented a new set of plugins called ircamtools. Although they’re not cheap (over £1k for the whole bundle!), the set incorporates all of IRCAM’s accumulated knowledge and expertise in sound transformation and acoustic space modelling. There’s some serious stuff in here for so-called ‘higher level’ transformations of sound (such as changing the age and gender of a voice or making a flute solo more breathy). Most impressive of all, though, was a demonstration of some ongoing work not yet included in these tools. He demonstrated taking a spoken phrase and changing the emotional content – from relaxed to angry, for example. The human ear is so used to hearing and extracting information from speech that it can soon spot if a voice has been tampered with, so coming up with voice processors that sound remotely natural is very hard, especially when you are altering the sound so drastically. The transformations he played were superb – meaningful and plausible.
There’s been a lot of focus in the last couple of years on using graphics cards for audio processing. Graphics cards (GPUs) are very good at parallel processing (most CPUs can only process one thread at a time, although the emergence of dual- and quad-core CPUs is changing this). This means that, if audio algorithms can be adapted to use graphics processing instructions, a GPU can process many channels of audio simultaneously (in fact, at the Audio Engineering Society’s convention this spring a GPU was shown running one million oscillators in real time. Yup, one million). At DAFx the use of GPUs for filtering and room modelling was demonstrated. Amazingly, for filtering tasks it is not the GPU itself that limits what can be done but the speed at which hundreds of channels of audio can be transferred to and from it via the PCI Express bus. Other big areas included the simulation of guitar amplifiers by modelling the behaviour of the individual analogue components – it’s actually very hard to get a laptop to imagine it’s a Marshall stack! I think my favourite gadget was a pair of ‘foley shoes’ – shoes containing sensors so that you play them by walking. That may seem like an odd instrument to play, but not if you’re a foley artist in the baking heat of California who’s been asked to provide the footstep sound effects for a documentary on Scott of the Antarctic – just pop the shoes on and hook yourself up to the sampler! Favourite paper title? That has to go to “Virtual Auditory Myography of Timpani-playing Avatars” (animations of musical gesture to aid musicians playing together across the internet, to you and me). I was there to present some work on improving the separation between frequency and amplitude information in cross-synthesis (aka vocoding) and played some examples of how frequency shaping (using Christopher Penrose’s “Shapee” algorithm) could be improved.
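For readers new to cross-synthesis, a minimal single-frame sketch in Python may help. It shows only the textbook idea of flattening one sound's spectrum and imposing another sound's spectral envelope on it. It is not the Shapee algorithm and not the method presented at the conference. The function names, the moving-average envelope and the `width` parameter are all assumptions made for illustration, and the slow standard-library DFT is there only to keep the example self-contained.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real frame (O(n^2), illustration only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def envelope(mags, width):
    """Crude spectral envelope: circular moving average over neighbouring bins."""
    n = len(mags)
    span = 2 * width + 1
    return [sum(mags[(k + d) % n] for d in range(-width, width + 1)) / span
            for k in range(n)]


def cross_synthesize_frame(carrier, modulator, width=2, eps=1e-9):
    """Keep the carrier's fine spectral detail, impose the modulator's envelope."""
    c_spec = dft(carrier)
    m_spec = dft(modulator)
    c_env = envelope([abs(c) for c in c_spec], width)
    m_env = envelope([abs(m) for m in m_spec], width)
    # Divide out the carrier's own envelope, then multiply by the modulator's.
    shaped = [c * m_env[k] / (c_env[k] + eps) for k, c in enumerate(c_spec)]
    return idft(shaped)
```

A real vocoder would run this over overlapping, windowed frames with an FFT library and overlap-add the results. The sketch treats a single frame, so it illustrates the envelope swap and nothing more.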
We always get well looked after and entertained. One evening we travelled over to the Mumuth, a brand new concert venue with a stunning rig for electroacoustic replay – a massive array of loudspeakers which can be individually remotely positioned. The standout piece of the evening was Natasha Barrett’s “Reality and Secrets No. 1”. Whilst the other composers had produced some interesting sounds on the rig, she was much more able to create a sense of actual objects moving around the space, objects that obey the laws of physics and are entirely real but you just can’t see them – children played around us, enormous chains were hauled above our heads and wisps of cello melody blew through us like bits of paper in the wind. She is a serious talent, someone with complete control over the tools she’s using to create convincing and involving music. Chatting to others after the concert, it seems she smashes it pretty much everywhere she goes, and she’d brought the house down at the recent Ambisonics symposium too. You can hear a binaural (surround sound via headphones) rendition of the piece at her website (www.natashabarrett.org). Another evening was spent in the city’s art gallery, for a private viewing of an exhibition called “The Human Condition” – not an amazing collection but some standout pieces including a beautiful short film, “Per Speculum” by Adrian Paci.
All in all, a great week – although there was nothing this year that was totally new, there’s been a lot of consolidation in audio processing – making the results more believable and usable, and using new computing hardware in ways that allow more audio processors to run faster and better. Next year the conference moves to IRCAM in Paris, a place that I’ve dreamed about going to ever since I was a teenager watching documentaries on it, so I’ll definitely be going along. The year after that, it comes to us in York. No pressure, then.
http://dafx10.iem.at/


