Sackboy’s Voice – Full Of Eastern Promise

I’m back from a two week trip to Japan. It’s the longest and most far, far away holiday I’ve had for many years. There were lots of temples, photos, temples, walking, temples, eating, temples, a proposal of marriage and more temples. I even managed to make a few sound recordings; the massive ravens which are everywhere, lots of other birdsong, Buddhist chanting, trains, pedestrian and level crossings. I’m looking forward to giving them a listen and a bit of an edit. Oh, and did I mention the temples?

So, I’m feeling refreshed and ready for the year ahead. Mr. Credit Card isn’t looking so healthy. He could just about handle the holiday; I seem to recall he was actually quite looking forward to it as he doesn’t get out the wallet much, never mind the country. But he started to turn a strange colour when I asked him to take on the engagement ring. “That’s what you’re for,” I told him. He said I had to “give him lots of money” if we were to remain friends. The bastard!

In other news, LittleBigPlanet has been nominated for 8 GANG awards. There’s some symmetry there with the 8 AIAS awards the game won at DICE last week, ignoring the fact that none of those were for audio BUT you don’t win Game of the Year and Console Game of the Year without good audio so I’m a winner too. Or so I keep telling myself.

Interestingly, Sackboy won an AIAS award for Outstanding Character Performance, so I am now officially an Academy Award winning voice actor. Yes, thank you, thank you ladies and gentlemen. No autographs please, put those pens away. If you’d like me to cough or hold my breath in the Sack-stylee as some after-dinner entertainment then contact my mum. Be warned though, she drives a hard bargain, and you’d be best advised not to refuse her offer of a wee cup of tea.

Seriously though, I think the sound design decision not to litter the Sackfolk with inane voice samples contributed significantly towards that award. This is something which some sound designers find hard to resist – there are a couple of 3rd party trailers and adverts out there where some arse has added chipmunk voices to the Sackfolk to make them sound comic and cute. This is a great example of something which is easy to do in the linear medium (so deceptively easy it doesn’t actually require any thought) turning out to be almost impossibly hard to do [properly] in the interactive medium. Which would be purely academic were it not for the fact that an inability to speak is a strong component of the Sackboy/Sackgirl IP.

Ignoring implementation issues such as contextual blindness and technical limitations such as memory constraints, not having a voice allows players to feel that their Sackboy, which they lovingly dress, customise and emote with, is theirs. This sense of ownership would be hard to achieve if the character you were controlling had a mind of their own, voice and language being the most personal means of communication and expression. Which is why when Sackboy does speak it is with the voice of his player, his lips moving to match those of his puppet master.

Anyways, in addition to the 8 nominations from GANG, there are also nominations for the two audio categories at the BAFTA video games awards and the audio award at GDC, so there’s still the chance for some award winning audio love coming towards LittleBigPlanet over the next couple of months.

If you want to read more about the audio in LittleBigPlanet then check out my recently published article at Music4Games, and keep an eye out for the March edition of Game Developer Magazine.

From Scarface to Simlish

Blair Jackson investigates the people and practices behind three games which had significant parts of their sound crafted in the San Francisco Bay Area – Scarface’s in-game sound effects and mix at Skywalker Sound, Sam & Max’s dialogue recording at Studio Jory and The Sims 2’s fictitious Simlish language crafted at Maxis.

Recreating Reality

At the inaugural Develop Conference in Brighton I spoke about the differences between sound in the real world and sound in the virtual worlds that we create. Right on cue, the day before my presentation, Mark Rein, VP of Epic Games, claimed in his keynote session that:

“the future of gaming is not Donkey Kong, it is these big expansive worlds”

It seems pertinent to examine the fundamentals of our craft at a time when much of the industry is evangelising the idea of complex, immersive virtual worlds and promising to deliver photo-realism, HD graphics and advanced surround sound technologies in their games. Surely, when creating a virtual world, we should emulate the real world as closely as possible? It turns out that, as ever with interactive media, the task of recreating reality is not quite so straightforward…

Brain Food

At the School of Sound conference in 2001, Professor Paul Robertson, a musician who has spent much of the last 20 years working with neuroscientists to try and unravel the mysteries of music and the mind, opened his presentation with some fascinating facts about the brain. One point in particular has stuck with me ever since, albeit sketchily paraphrased from memory:

Music is not processed in just one specialised part of the brain – so far we have identified 12 different areas which can be utilised when listening to music, one of which only shows activity when listening to bird song! Even after a severe brain trauma the chances are that the brain will still be capable of processing music to some degree. In other words – if you’re not musical you’re brain dead.

The brain is amazing. And it is wired for sound.

Contract Negotiations

Our brains enable us to create and appreciate art. Whilst the technologies which allow us to make games are astounding, the reason we even bother at all is that, by some twist of evolutionary fate, we have the ability to ignore that games are totally and utterly ridiculous. Think about it; you can sit in front of your monitor or TV and, with the lights down, sound turned up, interfacing via some manner of controller, save the world from those pesky aliens. Again.

You have to be a willing victim to become immersed in any virtual world, be it conveyed via a game, film, broadcast or book. Film theory has the concept of the film/audience contract whereby an audience fulfils their obligation by buying in to the film’s virtual world so long as the film fulfils its obligation of providing an entertaining experience. Similarly, the game/player contract ensures that no matter what we throw at the player, as long as it is explained competently and is entertaining, they will accept it.

In his book Audio-Vision, film sound theorist Michel Chion takes this one stage further by naming this the audio/visual contract to emphasise that sound and image are two separate entities which are only fused together in the minds of an audience. This is a phenomenon he calls synchresis (synchronism and synthesis).

The audio/visual contract and synchresis are what allow our virtual worlds to entertain. They also permit wondrous things such as acousmatic sound; sound which has been disassociated from its source. For example, combining the sound of an axe chopping wood with the moving image of a bat hitting a ball will create an incredibly powerful strike which adds up to more than the sum of its parts.

Caged Beast

Before we use sounds in our virtual worlds we must first capture them. Capturing anything involves a transformation; the caged animal is a different beast to its free-roaming brethren. A sound in the real world is free to interact with its environment in an infinite number of ways. A captured sound lacks this vivacity; it is flat, static, a shadow of its former self. Therefore, your choice of microphone and its location are, whether you are conscious of it or not, a manipulation which takes place before a sound is even recorded.

Microphones do not hear what our brains hear. The brain is capable of some rather impressive filtering whereas a microphone records whatever hits its membrane. However, these filtering skills are severely diminished when you are listening to a sound recording. For example, when listening to me give my presentation the audience were (hopefully!) tuned in solely to the sound of my voice, whereas if you were to listen to a recording of my presentation you would be distracted by the room’s acoustics, the aircon, the projector’s fan, general noise from the audience and that damn loudspeaker on the right which buzzed throughout the whole day. In our virtual worlds we need to perform this filtering on behalf of our brains, which is the process of mixing. Dynamic, real-time mixing is a relatively new frontier but it is a high priority in our next generation games.
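
To make the mixing idea above concrete, here is a minimal sketch of one common real-time mixing technique: priority-based ducking, where the most important active sound plays at full level and everything else is attenuated on its behalf. The function name, bus names and gain values are all illustrative, not from any particular engine.

```python
# Illustrative sketch: priority-based ducking, one way a real-time mix
# system can "filter on the brain's behalf". Names/values are hypothetical.

def compute_bus_gains(active_buses, duck_amount=0.4):
    """Return per-bus linear gains: the highest-priority active bus
    plays at full level; every lower-priority bus is ducked."""
    if not active_buses:
        return {}
    top = max(priority for _, priority in active_buses)
    gains = {}
    for name, priority in active_buses:
        gains[name] = 1.0 if priority == top else duck_amount
    return gains

# While dialogue (priority 3) is active, music and ambience duck:
print(compute_bus_gains([("dialogue", 3), ("music", 1), ("ambience", 0)]))
# -> {'dialogue': 1.0, 'music': 0.4, 'ambience': 0.4}
```

In a real mixer the gain changes would of course be smoothed over time rather than snapped, but the principle is the same: the mix system decides, moment to moment, what the player's attention should be tuned in to.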

Drool

Our perception of the real world is so much more complex than that offered by even the best binaural recording. Even if our virtual worlds could precisely mimic sound in the real world and be photorealistic, we’d still just be sat there in front of our monitor or TV. Fortuitously, sound and music can be used to represent some of the information which is missing from the virtual world.

Music, in particular, is like monosodium glutamate for the ears (MSG is the chemical they add to snacks and some Chinese foodstuffs to make them taste better than they intrinsically do). The use of music in virtual worlds is especially interesting as it is such an incredibly abstract concept; our lives are not scored by music. In a virtual world, music can be used to intensify an experience and act as an emotional signifier without revealing itself as a manipulative device. Whilst there is no doubt that this is a very powerful tool when used well, it is in no way realistic.

It’s behind you

Everything I’ve covered so far has, for the most part, been applicable to the virtual worlds of both films and games. Surround sound is an area where the two are not so closely aligned.

In film, surround sound is largely concerned with the use of diffuse sound; hence the surround channels being represented by a barrage of speakers along the walls of the theatre. Film established fairly early on in its experiments with surround sound that directional sound was distracting to an audience as it diverted their attention away from the screen/virtual world. Game cinematics tend to stick to the conventions of film surround sound, but in-game sound uses the surrounds in a directional manner. This divergence is a result of the contrasting voyeuristic nature of film against the participatory nature of games. It’s also because we lazily let the game handle most of the panning in a rather simplistic fashion, though you can expect our use of surround sound to become more sophisticated as we take advantage of discrete surround panning. It should be noted that a game which has surround sound differs greatly from a game which really uses surround sound to its advantage.
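
The “rather simplistic fashion” mentioned above usually amounts to a pan law applied automatically per sound source, based on its angle relative to the listener. Here’s a toy two-channel sketch of a constant-power pan law, purely to illustrate the kind of calculation an engine makes on our behalf; a real surround panner would distribute gain across more channels, but the idea is the same.

```python
import math

def constant_power_pan(angle_radians):
    """Given a source angle relative to the listener (0 = dead ahead,
    +pi/2 = hard right, -pi/2 = hard left), return (left, right) gains
    using a constant-power pan law, so perceived loudness stays even
    as a source moves across the field."""
    # Map the angle to a pan position in [0, 1]: 0 = hard left, 1 = hard right
    pan = (math.sin(angle_radians) + 1.0) / 2.0
    left = math.cos(pan * math.pi / 2.0)
    right = math.sin(pan * math.pi / 2.0)
    return left, right

# A source dead ahead gets equal gain in both channels (~0.707 each),
# preserving total power; a source at +pi/2 collapses to the right channel.
```

Constant power (left² + right² = 1) is the standard choice because a naive linear crossfade dips in loudness at the centre; it’s also exactly the kind of detail the game quietly handles while we, as the article admits, lazily let it.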

Whilst some directional cues can be beneficial to the player it is folly to think that directional surround sound offers any exactness. Despite this apparent failure to represent the virtual world accurately, the audio/visual contract ensures that as long as this doesn’t infringe upon the gaming experience it will be readily accepted by the player.

Another area of surround sound which is evolving is the way in which surround music is mixed. Even though there is nothing realistic about the use of music or surround sound in our virtual worlds, there is a debate about whether music should make overt use of the surround channels. Some argue that players will find it strange that music is coming from behind them, but proponents of this argument fail to take into account how strange it is that, irrespective of locus, there is any music at all. Anyone who dismisses surround music hasn’t heard it done well, if at all.

Say wha’?

Dialogue in our virtual worlds is significantly different from everyday conversation in the real world. The fundamental difference is caused by dialogue being a vocalisation of the written word. Genuine conversation is a chaotic stream of consciousness littered with interruptions, mispronunciations and other utterances that we are astonishingly good at deciphering. This is in sharp contrast to dialogue which tends to:

  • consist of complete sentences
  • pause at the end of each sentence
  • be well thought through and articulate
  • be absolutely packed with information
  • stay clear of naturalism unless reflecting the emotional state of a character

The only thing that dialogue has in common with spontaneous speech is that it is spoken aloud.

Similar to the process of mixing, whereby we recreate the brain’s ability to focus on that which is most important, dialogue distils everyday speech down to its essential function; conveying information. In games we tend to overuse dialogue and force it to carry all the weight of the narrative and exposition. This is in spite of dialogue being but one method of communicating information to the player. Here, we should try and emulate the real world more by taking some of the weight off dialogue and using other aspects of the virtual world to tell the story and communicate information to the player.

Conclusion

Sound in a virtual world works in a decidedly different way to sound in the real world. Thankfully, our brains are quite forgiving of the discrepancies between the infinite complexity of the real and the constructed, refined simplicity of the virtual. The best way to recreate reality is by emulating our brain’s response to it. Only then can we begin to create a plausible, entertaining experience for our audience and fulfil our half of the audio/visual contract.