Welcome to the first of my new series of interviews with virtual instrument makers and audio production software designers!
To kick things off, I’ve invited 24-year-old composer and software designer Gabriel Heinrich, who studied composition for film and theater in Arnhem, Netherlands, to discuss his amazing VST plug-in, VirtualSoundStage. I’ve asked him to share a little about his past and what led to the development of VSS, now available in version 2.0.
Gabriel Heinrich: “Creating orchestral mockups was pretty much all I did for years. I was absolutely fascinated by the idea of creating something that sounds like the scores I love, all by myself.
We (composers) spend a lot of time on re-creating film scores from sheet music, and comparing them to the original recordings. That’s a very worthwhile endeavor, because you learn so many different things at the same time, but it’s also very painful. Sometimes you spend hours on a particular section, tweaking every last detail, and eventually you become quite content with what you are listening to. But once you compare to the original there’s no denying that it all sounds like shit. However, that’s exactly the kind of reality check you need if you get caught up in the habit of listening to your phony orchestra over and over again.
However, what really becomes apparent in these comparisons is that the stereo and room image in a sampled orchestra usually ends up all over the place. There are a thousand little moments that keep throwing your ears off, while the originals have this unity that’s able to draw you in. That really annoyed me, and I thought very hard about how one could possibly change this, which led me to the ideas that later became VirtualSoundStage. I didn’t know anything about programming at that time, so I remember creating prototypes in Reaktor, and I got a real kick out of it. Once I could listen to something, I was completely hooked and I was destined to create this thing that had developed in my head. I started teaching myself how to program, and I think about eight months later VirtualSoundStage 1 was finished. I was still studying at that time, so whenever I wasn’t busy chasing a deadline for a composition I was programming. That didn’t go too well, and eventually I was completely burned out. I decided to drop out of school because I really needed a break from composing. I moved to a new city, and I have just spent one-and-a-half years on developing VirtualSoundStage 2.0.”
Jason Watts: “You mentioned using NI Reaktor to work on prototypes of VSS. I’m curious: would you mind elaborating on how Reaktor allowed you to begin the process?”
GH: “The main thing I was interested in back then was using a combination of inter-channel delay and level differences for panning. I was curious if it would really sound different than the pan controls in our DAWs. To do that, I only needed a programmable delay and gain, and both are available in Reaktor.
I didn’t want to randomly play around with the delay and level values. Instead I came up with a formula which calculates those values based on a position and a microphone setup. Implementing that formula in Reaktor was quite challenging. You have to chain a lot of different processing blocks in weird ways if you want Reaktor to evaluate complex mathematical expressions. But eventually I figured it out, and I was amazed by how much better everything sounds if you use time differences to position instruments. I think I even got Reaktor to produce some very basic position-specific reflections. However, some other things I wanted to do were just impossible with Reaktor, and the CPU performance also got really bad. Eventually, I had to leave Reaktor behind. But it was the experiments that motivated me to learn to program, and to create a real audio plug-in.”
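The idea of deriving a delay and a level difference from a position and a microphone setup can be sketched in a few lines. This is only a minimal illustration for a spaced pair of omnidirectional microphones with simple 1/distance level falloff; it is not the formula used in VSS, which the interview does not disclose. The function name `spaced_pair_pan`, the 2D coordinates in meters, and the speed-of-sound constant are all assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature (assumption)


def spaced_pair_pan(source, mic_left, mic_right):
    """Hypothetical helper: inter-channel differences for a spaced omni pair.

    All positions are (x, y) tuples in meters.
    Returns (delay_seconds, level_db) of the RIGHT channel relative to the LEFT:
    a positive delay means the sound reaches the right microphone later,
    a negative level means the right channel is quieter.
    """
    d_left = math.dist(source, mic_left)
    d_right = math.dist(source, mic_right)

    # Time difference from the difference in path length.
    delay_seconds = (d_right - d_left) / SPEED_OF_SOUND

    # Level difference assuming amplitude falls off as 1/distance.
    level_db = 20.0 * math.log10(d_left / d_right)

    return delay_seconds, level_db


if __name__ == "__main__":
    # An instrument placed to the left of center, 3 m in front of the pair.
    delay, level = spaced_pair_pan((-2.0, 3.0), (-1.0, 0.0), (1.0, 0.0))
    print(f"right channel delay: {delay * 1000:.2f} ms, level: {level:.2f} dB")
```

A conventional DAW pan control changes only the level relationship between the channels; the sketch shows how a position also implies a time offset, which is the part Gabriel found made the difference.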
JW: “You also mentioned that the stereo and room images in sampled orchestral sections tend to be all over the place. Do you feel that the imaging issues are built into the individual instrument samples, or that they only appear when combining sections from different manufacturers and recording spaces together in a mix?”
GH: “Actually, both, but the problems become more apparent when you combine samples from different manufacturers.
The general problem with samples is that the naturally recorded sounds get processed with envelopes and crossfades. It’s the only way to mimic the behavior of a real player to create phrases out of those static single notes. But if you cut into the samples, you damage the reverb at note transitions and endings. That can be quite irritating, because the sound of the space seems to be constantly changing.
The common approach to mask this is to add a reverb on top, or to use release samples. But release samples, again, have to use crossfading for the transitions. That’s still something your ears might pick up on as being fake.
However, the real problem with envelopes and stereo images becomes obvious if you consider a sample library that’s recorded in place. This means the instruments have been recorded in their most common position in an orchestra. Now, let’s also consider a microphone setup where inter-channel delay is the dominating source of the stereo image, e.g. a Decca Tree. Here, the sampled notes have a slightly different starting point in the left and right channel, depending on the position of the instrument. The envelope, however, will still process both channels absolutely synchronously. That’s not a small issue, because our ears are only able to determine inter-aural delays based on sudden differences in a sound, which mainly happen at the start and end points of a note. So what happens is this: when a note starts, you correctly sense its position by the inter-channel delay, but as soon as the envelope kicks in (i.e. the sound of both channels changes absolutely simultaneously) the sound jumps to the center.
Some libraries suffer more from this effect than others. On sustain patches with good release samples, I don’t even think this is much of an issue. But in legato patches, this is often very annoying. By synchronously crossfading into both channels of a legato transition, and then out of it, most of the natural stereo information gets overwritten. There are so many libraries where the legato patches are jumping around in the stereo field. I am not aware of any sample developer who has addressed this issue by using a delay in their envelopes. Right now, the best solution is to record all sections in center position, like 8dio, Cinematic Strings, and others do.
So that’s an issue with the stereo image of samples in general. But there’s also the problem of layering libraries recorded in different rooms, and with different microphone setups. First, there’s the issue of combining different spaces: your ears perceive two or more different rooms simultaneously. That is something that just doesn’t happen in live orchestral recordings. Then there is the problem with layering different microphone setups. Every microphone setup produces its own particular stereo sound stage. This means every point of the room results in a very specific inter-channel delay and level difference. When multiple of these sound stages overlap in a mix, you end up without a clear stereo image.
When we combine sample libraries, we get confronted with a lot of things we never experience in real life. It’s easy to underestimate how sensitive our ears are to all the details of a sound that we use to localize it, and those details are what we have been using all our lives to orient ourselves in the real world. And of course that’s not magically turned off when we listen to music.”
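The envelope problem Gabriel describes, and the delayed envelope he mentions as a possible remedy, can be illustrated with a small sketch. It assumes a linear fade-out and a known inter-channel delay expressed in samples; the function names and parameters are hypothetical, and nothing here reflects how any particular sampler or VSS itself works.

```python
def linear_fade_out(n_samples, fade_start, fade_len):
    """Per-sample gain: 1.0 before fade_start, then a linear ramp down to 0.0."""
    gains = []
    for i in range(n_samples):
        if i < fade_start:
            gains.append(1.0)
        elif i < fade_start + fade_len:
            gains.append(1.0 - (i - fade_start) / fade_len)
        else:
            gains.append(0.0)
    return gains


def stereo_fade_gains(n_samples, fade_start, fade_len, right_delay_samples=0):
    """Hypothetical helper returning (left_gains, right_gains).

    With right_delay_samples=0 both channels fade in lockstep, which is the
    synchronous behaviour described in the interview. A non-zero value shifts
    the right channel's envelope by the same offset that the recording already
    contains between the channels, so the fade keeps that time difference.
    """
    left = linear_fade_out(n_samples, fade_start, fade_len)
    right = linear_fade_out(n_samples, fade_start + right_delay_samples, fade_len)
    return left, right


if __name__ == "__main__":
    # 48 samples is 1 ms at a 48 kHz sample rate (assumed values).
    sync_l, sync_r = stereo_fade_gains(1000, 100, 200)
    dly_l, dly_r = stereo_fade_gains(1000, 100, 200, right_delay_samples=48)
    print("synchronous envelopes identical:", sync_l == sync_r)
    print("delayed right envelope lags:", dly_r[148] > dly_l[148])
```

Multiplying each channel of a note by its own gain list, rather than by one shared list, is the whole difference between the two cases.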
JW: “So the idea behind VSS was to give a composer the ability to mask, to some degree, the original early reflections, and “room” in the samples with newly created ERs. Correct? Basically merging everything into one virtual space?”
GH: “Actually, I’d say the most important idea behind VSS is its direct signal processing. It’s something that’s really been missing from the tool-set of a composer and it makes a tremendous difference.
Getting the early reflections right is only the next logical step toward achieving realism. They play such an important role in the way we localize sounds and perceive the room. I believe early reflections offer the best possibilities to improve the depth of our mixes.
If you are working with sampled instruments, there are two different scenarios: First, there are those libraries which have been recorded in an environment with little or no room sound. For those libraries you definitely need to add some early reflections. Here, VSS is just a straightforward choice.
But then there are the sample libraries recorded in more reverberant spaces. In this case, the situation is somewhat different. One needs to be careful when adding more early reflections to those libraries. There’s a point where it’s just too much and the room image becomes blurry.
The main goal here is to add just enough early reflections to cover the gaps at note transitions and endings. That’s where the natural room sound gets corrupted by the post-processing in the sampler. Adding a subtle, but consistent, layer of early reflections often makes a huge difference in terms of realism.
On some particular libraries it really does work as you described it: the early reflections from VSS can mask the original ones and you can bring the instruments into a different space. But for others, the sound of the room is so dominant that the only thing you can do is try to balance things out. That’s definitely not a perfect concept, but there are so many great libraries out there which have been recorded in this way, and we all love to use them. This influenced some major design decisions in VSS 2.0. Now it’s possible to dial back every part of the processing to a point where VSS is not doing anything anymore. I think that’s absolutely essential. It allows you to use VSS very carefully to do whatever subtle adjustments you feel necessary.”
JW: “Do you have any tips you’d like to share for the reduction of built-in reflections, and “room sound” from very wet samples? Or would you instead suggest attempting to match drier, less reverberant sections to the more naturally reverberant samples using VSS?”
GH: “Well, there are reverb reduction plug-ins like DeVerberate, but I don’t suggest using them. In my opinion it’s too likely that the sound of the direct signal will be degraded or you end up with artifacts. In whatever I do, I believe in starting with the best possible source material, and then I add and/or subtract as little as possible.
That’s also why I am critical of using only the close mics from a sample library. So often, the sonic quality in the main microphones is noticeably superior. There are some libraries where the close mics are well-balanced, and are a real alternative. But for a lot of other libraries, the close mics are only meant to be used to add a little bit more bite. My suggestion is to find the most convincing microphone position, and use VSS to improve on that. For dry libraries that probably means using VSS heavily to add a stereo image, and a room. On more reverberant libraries you might just change the position to match another library, or add a little bit of early reflections to smooth things out.”
JW: “Something I’ve noticed while using your plug-in is that with careful adjustment to the air absorption feature in combination with the direct signal, one can approximate the sound of close mics, etc., within a mix, bringing things forward, making them clearer, and so forth. Was this feature intentional, or just a natural by-product of the combined controls you’ve offered? I find it very useful in mimicking scores which have close mics in addition to the mains.”
GH: “My idea for the air absorption feature in VSS 2.0 was to make the quality of the filter as high as possible so that you can use it on everything without worrying about sound degradation, and to make it as realistic as possible. In the beginning it was only intended to filter the frequencies in exactly the same way that air absorbs them in real-world scenarios. Effectively, this means reducing the amount of high frequencies. Air absorption has quite a special frequency response. You can’t replicate it with a regular equalizer, but it’s extremely effective, probably because our experience has trained us to associate this frequency response with distance.
However, something else interesting happened. I noticed that it’s possible to reverse the process. Instead of cutting away frequencies as air absorbs them, you can also bring them back. This allows you to move the sound closer to you instead of always pushing it away. Of course this has its limits, but I like the effect. The air absorption ended up becoming a lot more flexible than I initially intended. Every now and then I even think about making the air absorption into a small, handy plug-in of its own.”
JW: “Looking ahead, are there any plans for future versions of VSS you’d like to share?”
GH: “First of all, there are a couple of things I’d like to add to VSS 2.0. There are some things that I didn’t think about, or had to cut for the initial release. Next, I am going to add the possibility to organize the instruments in groups. This will take care of any screen clutter in huge projects. Additionally, it will make it possible to use various rooms and microphone setups at the same time. I’d also like to add a more diffuse version of the ERs, some automation, and other things.
The reason we don’t already have automation in VSS is rooted in the inter-plug-in communication. I am very glad that the unified user interface seems to work pretty smoothly, but to make it work I had to use some unorthodox methods. This makes the implementation of automation a lot more difficult than in other plug-ins. But I think I finally have an idea of how to solve this, so hopefully VSS will offer automation soon.
Of course I also have an idea in my head for VSS 3.0, but that would be a gigantic project. Given how things are going now, it would have to be financed via crowdfunding. I think the more important next step is to make it easier for users to grasp the concepts. This means tutorials, more documentation, and a much bigger preset library. The next real expansion is probably going to be a tool to create a reverb tail from within VSS. This would finally make VSS an all-in-one solution for reverberation.”
JW: “I have one more question for you before we end the interview. Do you have any plans to add multi-mic setups that might include, for instance, the option of adding close mics to the mains?”
GH: “That’s another question I get frequently. Replicating the process a sound engineer goes through when mixing different microphone positions in a real recording session sounds like a good idea, but somehow I suspect that it won’t lead to better sounding results than we get now. However, in my opinion there’s another very good reason to add multi-mic setups to VSS. With more microphones you could create surround mixes from stereo samples. That would make a huge difference for a lot of composers. Almost everyone uses sample libraries to create surround mixes for movies and games. If there’s going to be a 3.0 version of VSS that’s definitely something I want to add.”
Thanks to Gabriel Heinrich for taking time out of his busy schedule to answer these questions so thoroughly! It has been both enlightening and enjoyable.
Please be sure to visit Gabriel’s website, http://www.VirtualSoundStage.com, and try the free demo! You won’t regret it!
*This interview was conducted via email correspondence.