RIP Ralph Baer – The Almost-Father of Game Audio

I was saddened to hear that Ralph Baer, creator of the first video game console, passed away yesterday. Baer was already 50 years old when the Magnavox Odyssey was released to the public in 1972, so I was somewhat surprised in 2009 to find that he was not only alive and kicking but that he had a website through which he made himself available. So, I mailed Ralph to ask him about the role of sound, specifically the lack of it, in this pioneering device and was both amazed and delighted when he actually replied! Here is that e-mail exchange in full.

----- Original Message -----
From: “Kenneth Young”
To: rhbaer@REDACTED
Sent: Wednesday, July 8, 2009 5:59:45 PM GMT -05:00 US/Canada Eastern
Subject: Magnovox Odyssey

Dear Mr. Baer,

I’m giving a talk next week, part of which will include a brief overview of the technological and historical trends of audio in games. Your Brown Box/the Odyssey is obviously an important landmark, not only because it was the first home console, but also because it was silent. I was wondering if you’d be so kind as to fill me in on the specific reasons for the lack of any audio output from the console – was it an unnecessary expense, too technically challenging, simple beeps not considered to be a valuable contribution to the experience of using the console, or was audio output the last thing on your mind given that you were solving bigger, more important challenges?

Any insight you can give me in to the early pioneering days of console development would be much appreciated.

All the best,

Kenneth Young
Audio Designer
Media Molecule Ltd.

Date: Thu, 9 Jul 2009 00:00:33 +0000
From: rhbaer@REDACTED
To: kcmyoung@REDACTED
Subject: Re: Magnovox Odyssey

Hello Kenneth:

For an engineer like me who has been designing and building audio equipment for 60+ years – audio amps, noise reduction devices, audiology hardware, speaker systems, audio test equipment, musical instruments etc – to have ignored the need for sound in our early videogame hardware, is just plain baffling. The fact of the matter is that we just didn’t think of it except in the case of the light gun where we played with sound but dropped it because of cost.

I was an early member of the Audio Engineering Society before WWII and when I came out of the Army in 1946 I built a record player noise reduction circuit to reduce scratch from 78 RPM records at low musical signal levels…probably before Dolbie was out of knee pants.

Everything is obvious in retrospect, I guess. Implementing sound in a ping-pong game is technically easy: You simply use the device (a flip-flop in the case of our hardware) that reverses the direction of the ball in ping-pong upon coincidence with the paddle and tap into that FF’s waveform edge to develop (at a minimum) a transient that comes out of a speaker as a loud click, after it is capacitively coupled from the FF output to a transistor driver which, in turn, feeds the speaker…..but we just didn’t think of it until the Pong arcade game showed up (courtesy of Al Alcorn at Atari).

To whom are you addressing your talk?



From: kcmyoung@REDACTED
To: rhbaer@REDACTED
Subject: RE: Magnovox Odyssey
Date: Thu, 9 Jul 2009 10:51:59 +0100

Hi Ralph,

That’s wonderful information – thank you so much for sharing it with me.

I’m talking to my industry peers at the Develop Conference in Brighton, England. The focus of the audio track this year is “real-time audio”, and most folk will be talking about many of the new technologies coming online for manipulating and synthesising audio in real-time. The niche of my talk is looking at historical uses of real-time audio manipulation in games, because there are a lot of lessons to be (re)learned. For example, real-time synthesis of music was the norm up until games started using better sounding CD audio and, latterly, digital file formats. Currently, there’s much talk of how inevitable it is that we switch back to real-time synthesis of music once we have enough processing power and tech to be able to synthesise music which can sound as good as a CD. Which is great, because it opens the door to procedural generation of musical score, but we’ve been there before so it’s important to be aware of our heritage to minimise reinventing the wheel and repeating mistakes.

I’d always taken the fact that the Odyssey was silent as just one of those things, but in putting this talk together it bugged me that I didn’t know why. Thanks for satisfying my curiosity – the internet is a wonderful thing!



What a legend!

It is kinda perplexing that an experienced audio engineer like Ralph managed to overlook the low-hanging fruit offered by the synchronisation of sound and image, but it is a great example of sound not being at the front of people’s minds, irrespective of its power and the potential contribution it can make to the player experience.
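Ralph’s recipe in the email above – tap the paddle-collision flip-flop’s edge, couple it through a capacitor to a transistor driver, hear a click – maps neatly onto DSP terms: the coupling capacitor acts as a high-pass filter that turns a voltage step into a decaying transient. Here’s a minimal sketch of that idea in Python; the parameter values (a 2 ms RC time constant, 44.1 kHz sample rate) are my own invented illustrations, not anything from the original hardware:

```python
import numpy as np

def flip_flop_click(sample_rate=44100, rc_ms=2.0, duration_ms=20.0):
    """Simulate Baer's circuit: a flip-flop's step edge passed through
    a coupling capacitor (a first-order high-pass) becomes a click."""
    n = int(sample_rate * duration_ms / 1000)
    tau = rc_ms / 1000                 # RC time constant in seconds
    t = np.arange(n) / sample_rate
    # A unit step through an RC high-pass decays as exp(-t/RC):
    # full amplitude at the edge, then a rapid fade to silence.
    return np.exp(-t / tau)

click = flip_flop_click()
print(round(float(click[0]), 3))  # maximum amplitude right at the edge
```

Send that buffer to a speaker and you get exactly the "loud click" Ralph describes – sound synchronised to the ball/paddle collision, essentially for free.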

Near the end of my aforementioned talk, I looked out and caught sight of this ancient, bald man sitting in the audience in a dark navy blazer (not a common occurrence at a game audio lecture!) – I didn’t get an opportunity to confirm if it was him. To be honest, I didn’t really want to know, because that would have weirded me out.

Watching that video, I can’t help but reflect on this experience and think that it really was Ralph. I feel bad for not interacting with him more, but I’m grateful I got the opportunity to do so.

Rest in peace, Ralph.

Character Introductions in Destiny and The Swapper

Introducing a character can be a tricky business. This is especially true of those characters present throughout the gameplay experience – player characters, buddies, guides, narrators and their ilk. Game teams exposed to such characters over months and years of development become so familiar with them that evaluating their introduction, the final implementation of which is often conceived of and added to the game nearer the end of development, can be incredibly difficult. It’s no surprise then that this is a common weak point in the presentation of a game – it’s a blind spot that takes great care to notice and real commitment to address. With voice being so inextricably linked to the portrayal of a character and their identity it often has a central role to play in their introduction, and is therefore a prime candidate for such oversight.

Destiny, Bungie’s lush sounding über-game, has the not-so-enviable task of having to introduce a narrator, enemies, a guide and the player character, all in its opening moments. But, as you’d expect from such a lavish production and distinguished team, much thought and effort has gone in to how to make this work.

Looking past my inability to get over Bill Nighy attempting a straight, earnest delivery [I’d love to have been a fly on the wall in that recording session: “Err, could you tone down the ‘Bill Nighy’ a bit please, Bill?”], what we have here is a fairly bog-standard narrator – an enlightened, disembodied voice, speaking with authority and bringing the player up to speed:

Whilst this character isn’t introduced, this is of course the trump card of the narrator – they are a thoroughly familiar convention which can get by without such formalities. The audience is keen to be brought up to speed and the narrator obliges (or “the dude abides”). But that’s not to say the narrator can just start talking willy-nilly – whilst there is no need to introduce the role of the narrator, the disembodied voice nonetheless needs its introduction to be justified if it isn’t to feel like an uninvited guest. This is fairly straightforward to achieve when the nature of most narrators is to explain. In Destiny, the narrator is introduced by answering the subconscious question on the mind of every player – “WTF is that big round thing they just discovered on Mars WTF?”

The baddies, AI guide and player character are all introduced economically in the short sequence that follows:

This is no mean feat and is nicely done. The same breadcrumb technique of ‘visual intrigue followed by vocal explanation’ that was used to introduce the narrator is used again here to introduce Ghost, the AI buddy character. The baddies, the Fallen, are introduced the other way around – having been explained by the narrator in the previous scene we know who they are when we experience them here for the first time (it obviously helps that the music and their voices identify them as evil). Ghost is also the vehicle used to introduce the player, by answering the question “what is that little robot scanner thing searching for?”. I initially interpreted the exposition that follows as suggesting that I (me, Kenny) had literally been resurrected in the world of Destiny. I love this concept! Whilst it would be intolerable for every game to attempt to justify the relationship between the player and the game world in this way, I do like it when a game takes a crack at this – it’s kinda hokey, but I’m a willing victim for this kind of 4th-wall-breaking malarkey.

However, it was only in writing this piece and scrutinising Destiny’s intro sequence more closely that I realised this wasn’t the intention (it doesn’t make any sense within the game’s own backstory/timescales). My confusion here stemmed from the use of first person perspective – it felt like Ghost was talking directly to me. But whilst I was the one he was making eye contact with he was actually talking to my character – it’s not that I had died (in my distant future) and been brought back to life, it’s that my character had previously been killed but was being brought back to life. I didn’t pick up on this because my character effectively hadn’t been introduced yet – as such I hadn’t been given an opportunity to assume their role. Despite this failure, it remains an intriguing setup – attempting to create a scenario which justifies the player’s disorientation and explains their lack of understanding about their character’s role in the game world is a really nice touch.

However, where this failure becomes more significant is that the first person perspective and lack of any spoken response from my character also led me to believe that I was inhabiting a silent protagonist. So, I was taken aback and brought out of the experience at the end of the first mission when my character suddenly appears in the third person and speaks for the first time (and rather pointlessly it must be said):

Whilst it is conventional, if somewhat clunky, for a first person game to have exposition take place from a third person perspective, it is an oversight for Destiny to have failed to set an expectation for this from the get-go and be so inconsistent in its presentation, especially when Bungie have otherwise made such a concerted effort to justify and explain everything.

The player character in Destiny is in many respects the opposite of an acousmêtre – rather than being a disembodied voice which loses all of its power by gaining a body, it is a voiceless body which is robbed of its power by gaining a voice.

Another beautiful sounding game which uses voice in interesting ways when introducing its characters is Facepalm Games’ The Swapper.

The game begins with the player’s astronaut character landing on a lonely planet. There is no real introduction to the character; it relies on the age-old “press buttons to establish a relationship” paradigm. This doesn’t feel awkward – it fits the sense of mystery quite nicely. Indeed, there is nobody there to greet you as you enter what appears to be an abandoned space station. The location is introduced in a rather matter-of-fact fashion by the disembodied voice of the space station’s computer, the lack of formality matching up nicely with its cold, dead, robot personality:

Dat wind! [wub wub] Plowing on through the introductory tutorial sequence, you eventually find a computer terminal which displays a memo log, giving you your first insight into what went on here before your arrival. The computer then plays you a panicked conversation, what you assume are the final moments of the last person to inhabit this place, perhaps the same person whose mail you had just read:

N.B. You’ll need to watch this full screen if you want to read the log!

As you carry on, the computer continues to play you more voice logs, presumably recordings from the locations you are exploring. However, the mystery becomes even more intriguing when you catch sight of another astronaut in the space station:

They look like you, is it a rogue clone? What does it all mean?! Eventually, you catch up with them, and it’s at this point that they speak to you:

It was only then, when I heard her familiar voice, that I realised the voice logs I’d been hearing were actually intended to be live radio broadcasts, that I was meant to be intrigued by the existence of someone else on the space station and feel compelled to track them down.

Does this confusion matter? It’s meant to be a mystery, so isn’t it OK for things to be a little confusing and to become clearer over time? If that is the intention, then absolutely. But I don’t get the impression that this is the intended experience here – this feels like another case of a developer being blinded by the additional context in their heads, which prevents them from seeing that they aren’t communicating it effectively to their audience.

When the space station’s computer introduces the radio broadcast by saying “Radio uplink available – broadcast location: Mine Science Laboratory, Space Station Thesius” the developer thinks they have been crystal clear in calling out that this is contemporary (that’s certainly how it comes across when you know this is the intention). What they didn’t take into account is that by doing so immediately after introducing the game’s concept of ‘computer logs written in the past’, there was a danger that the broadcasts would be framed and heard in this context too. There’s nothing about the computer’s introductory sentence which indicates that the broadcast isn’t from the past – if anything, the inclusion of its precise location and the fact that it is introduced (made “available”) helps to suggest that it is a documented recording. Perhaps if it had unexpectedly just happened it would have felt much more like an “overheard broadcast”. But, even then, the broadcasts are missing any information or context to ground them in the here and now. I suspect the vagueness of the script here is intentional – however, it’s one thing for the player to not understand what is going on, but something else entirely for them to be misled as a result of flawed design.

Despite this bold but problematic use of disembodied voice to introduce the “other” character, The Swapper also contains this incredibly novel and more successful device:

The Watchers, sentient alien monoliths, communicate with the player via text. Normally, this would be in danger of grating in a game which also uses voice, but here that juxtaposition is also what makes it work. Text usually has an implied voice even if there is none to be heard, but here it has none and none is implied. It isn’t text-as-subtitles underscoring an indecipherable alien language – it’s genuinely voiceless and silent text which takes up the entirety of the screen, and your attention, when the player character is stood directly in front of a Watcher. It’s as if the stone is communicating telepathically with you, injecting language directly into your brain. This is of course how all language works – you decipher its coded meaning internally – but it’s incredibly unusual for this magical process to be used in a way which is so overtly aware of this magic. And that’s what gives The Watchers their amazing power. It’s the visual equivalent of a disembodied voice – omnipresent and all consuming.

Not only is this novel, sophisticated, intriguing and succinct, it doesn’t let any of these factors get in the way of its simple primary purpose which is to introduce the character completely and accurately (as per the author’s intent). It’s pure genius.

The Use of Voice in Portal 2

By Kenneth Young

This post contains spoilers. If you haven’t played Portal 2 yet then, er, you should!

Portal 2 represents a real milestone for me. It’s the first game that, prior to even getting my grubby mitts on it, had me more excited about its prospective use of voice than any other aspect of the project. It’s a Valve game, so you trust that it’s going to deliver engaging gameplay and you know that the use of voice is going to be considered and interesting because, time and again, you’ve seen Valve push themselves and push their medium. But it’s also a sequel, and this lowers your expectations, especially as it’s the sequel to something quite so wonderful as the original Portal. The core gameplay couldn’t be reinvented, and Valve were clever enough not to mess with it; they added some new bells and whistles to keep things interesting in the story mode, but the most significant new draw of the gameplay in Portal 2 is its multiplayer, which is segregated into a separate experience.

Portal 2’s story, however, must have presented a bigger challenge. It couldn’t just be more of the same because we’d already had that experience. But the magic formula they’d discovered was so darn potent there was no way they could escape from it. Story and gameplay in Portal were intertwined in such blissful harmony; the repetitive nature of the puzzles and formulaic interactions with GLaDOS, their masterful development and gradual disintegration, culminating with the disintegration of GLaDOS herself, is what made the game tick and boom. So, faced with similar gameplay but unable to simply “do a GLaDOS”, Valve made Portal 2’s story a deft exploration of different characterisations which all fit over the same core gameplay.

The vast majority of articulate videogame characters are perceived primarily through their voices, and the characters in the Portal universe appear to be no different at first glance. But whilst your average videogame character suffers from the affliction of being an intelligent voice bolted on to an idiot meat-bag puppet (which all too often isn’t so much an example of “The Uncanny Valley” as it is just plain “broken”), Portal makes full use of the power afforded by the disembodied voice.

I’m sure if you were to ask a fan of Portal to describe what GLaDOS looks like they would attempt to articulate what you see in the image above. But that’s the least of GLaDOS – that shell really only exists to give you something to focus on and destroy at the end of the first game. And that shell doesn’t even belong exclusively to her, it’s simply the mechanism through which her AI is connected to the Aperture Science facility. When GLaDOS is in charge, she becomes the voice of the facility, and the facility her body – you can’t separate the two from each other. When you enter a test chamber and see a barrage of wall panels rotating into place – that’s her. When those wall panels stutter and malfunction it’s a sign that not everything is quite so clinically precise as it may have first appeared – a chink in her armour perhaps? And that camera on the wall…

GLaDOS is a brilliant homage to HAL, the sentient, murderous AI in 2001: A Space Odyssey. If you were to describe HAL as “a red camera lens”, you’d kinda be missing the point – he is first and foremost a disembodied voice and that is what gives his character all of its power. HAL is all-seeing and all-knowing; a true acousmêtre.

Cave Johnson, the founder of Aperture Science, is the most disembodied of all the voices in the Portal universe. His voice isn’t that of a sentient AI and he certainly can’t reveal himself Wizard of Oz style, for his is a voice from the grave. Your whereabouts within the proto-Aperture Science labs trigger instructions from, and the musings of, Johnson, the old voice recordings being contemporary with the dilapidated, vintage surroundings. Perhaps the most interesting aspect of this setup to consider is that despite the story playing up the primitive nature of this technology (in order to contrast it against the sophistication of the modern Aperture Science facility) it is in fact identical to that used in the real world by Portal 2 (and all other videogames); the simple triggering of pre-recorded voice samples as the player moves around, tripping switches. Considering that the in-game implementation is essentially identical for Cave Johnson and GLaDOS, it’s testament to the writing and design that only Johnson is perceived to be a mere voice recording.
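Mechanically, both Cave Johnson’s “old recordings” and GLaDOS’s “live” commentary reduce to the same thing: a trigger volume that fires a pre-recorded sample once as the player passes through it. A toy sketch of that shared mechanism – all names and the 2D region structure are my own invention for illustration, not Valve’s actual code:

```python
from dataclasses import dataclass

@dataclass
class VoiceTrigger:
    """Axis-aligned region that plays one voice sample, once."""
    x_min: float; x_max: float; y_min: float; y_max: float
    sample: str
    fired: bool = False

    def contains(self, x, y):
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max

def update(triggers, player_x, player_y, play):
    for t in triggers:
        if not t.fired and t.contains(player_x, player_y):
            t.fired = True
            play(t.sample)  # identical code path whether the fiction says
                            # "vintage tape" (Johnson) or "live AI" (GLaDOS)

played = []
triggers = [VoiceTrigger(0, 5, 0, 5, "johnson_intro.wav"),
            VoiceTrigger(10, 15, 0, 5, "glados_taunt.wav")]
update(triggers, 2, 2, played.append)
update(triggers, 2, 2, played.append)   # lingering doesn't replay the line
update(triggers, 12, 3, played.append)
print(played)
```

The fiction does all the differentiating work; the implementation is the same tripwire either way, which is precisely the point made above.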

“Hello? Are you there?”

Gun turrets are the only consistently embodied voices to be found in Portal and Portal 2. It’s interesting, then, that they are written to be so overtly stupid. They certainly aren’t intended to be perceived as complex AIs, they’re simply laser sensors with guns. But the point is that embodied voices in videogames are a lot harder to sell as intelligent beings because the body they inhabit is inevitably going to be relatively stupid. For example, compare the super-intelligence of GLaDOS to the annoying tendency in other games of AI ground troops to get in your way despite them being a far more sophisticated piece of game technology. It’s no coincidence that GLaDOS is perceived as having such a sophisticated personality despite lacking a physical body – it is precisely because of the disembodied nature of her voice that the writers are able to pull this feat off quite so convincingly (assisted by lots of linear, scripted sequences, naturally). The use of voice on the gun turrets is clever because it doesn’t try and make them something they’re not – it acknowledges their intrinsic lack of intelligence, something the player would soon uncover during gameplay, and makes their character stronger as a result.

The multiplayer characters in Portal 2 aren’t a million miles away from the gun turrets – they are also overtly dumb, in a vaudeville comedy double-act fashion, but perhaps even more so due to the fact that they communicate via robotic vocalisations rather than speech. Their most explicit communication happens during cutscenes, seen from the voyeuristic point of view of GLaDOS’ spy-cams. It’s interesting that Valve allowed themselves to take control away from the player here, both through the use of these cutscenes and through relinquishing their love of the silent protagonist, but these seem like compromises which are necessary in order to create an engaging two-player experience. I’m sure the decision to use robot gibberish was primarily driven by not wanting to alienate the player from their character.

And then there’s Wheatley. Once Wheatley takes over the facility in Portal 2 he’s effectively just a stupid version of GLaDOS, which is certainly a fun juxtaposition (hilarity ensues etc.), but it’s essentially an exploration of the “turn everything on its head” approach that is the stuff of many sequels. No, the most interesting use of voice with Wheatley’s character comes at the beginning of the game prior to this transformation.

Initially, when you first meet him, Wheatley is not a disembodied voice, he’s a well-meaning little robot AI chappy; a discarded Personality Core, literally an earlier, inferior version of GLaDOS. He doesn’t have the same monotonous robot voice associated with GLaDOS or the gun turrets (or any of the other personality cores encountered in Portal 2 for that matter); this instantly sets him apart as being more likeable, more human, reinforced by the fact that he’s trying to save your life, albeit rather incompetently. The nice thing about his characterisation is that it brilliantly and immediately addresses the problem of the player as a silent protagonist; Wheatley has verbal diarrhoea. He doesn’t shut up. He can’t shut up. Every potential awkward silence is filled with utter nonsense by this drivelling idiot. It’s a genius idea, and a great performance by Stephen Merchant. The only problem is that by setting this precedent, when you eventually pick Wheatley up and carry him around, it’s a bit awkward to have him in the middle of the screen, staring at you in silence – it would have been better to make his default position be staring ahead (which he actually does a couple of times), only having him turn around to speak to you. It’s hardly a big deal, but it stood out to me in stark contrast to the otherwise brilliant presentation and considered use of voice in the game.

Once Wheatley becomes omniscient, he comes and goes as GLaDOS did in Portal 1 – the same role explored through a different personality. I found it a lot more interesting to see GLaDOS adopt Wheatley’s previous incarnation; a voice forced into a physical manifestation so that you can carry it about with you (as a potato-powered microchip stuck on the end of your portal gun). This could have been awkward, but by making her small and unobtrusive and, crucially, limited by her vegetable-matter power supply, GLaDOS’ voice is also empowered to come and go as the puzzle gameplay allows. This might sound like an insignificant point, but bear in mind that the thing that permitted GLaDOS’ intermittent communication in the first place was the disembodied nature of her voice – to have found a solution (i.e. an excuse) which allows this to continue despite her altered state, and persistence in the player’s field of view, shows an attention to detail and a respect for the player experience that most developers fail to match.

It’s easy to overlook why Valve settled on these solutions in the first place – the characters don’t behave this way because they were thought up in ignorance of the game and then crowbarred into it, their behaviour was dictated by the requirements of the gameplay. They consistently follow two very simple rules – everything else is the result of a problem solving exercise (i.e. a design process) that endeavours to stay true to these fundamental tenets:

  • Don’t use voice to communicate information or story to the player unless they are able to listen
  • Keep the player engaged; don’t lose them

That first rule sounds pretty obvious but in practice it’s actually rather hard to abide by, especially if you’re ignoring the fact that you’re working on a game and instead pretending that you’re working on a film-like experience with an attentive audience. The most common faux pas is to have a character talk to the player whilst their mind is occupied with another task, e.g. during gameplay (you know, that thing that people do when they’re playing a game?). There’s a real conflict here, especially in games with meat-bag characters that follow you about – if you’re with another character then it’s awkward not to have them say anything (because this silence highlights the fact that they’re just digital meat-bags rather than the “real people” the designers want them to be perceived as). Unfortunately, as the amount of meaningless dialogue that is injected into the game increases (in a desperate attempt to make the characters “come to life”), so does the player’s apathy for any speech they might hear irrespective of how important the information it conveys might be.
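One common way to honour that first rule is to queue non-critical lines and only dispatch them when the player isn’t mid-task – which is roughly what Valve’s end-of-chamber pacing amounts to. A hedged sketch of that idea; the state names and class design here are invented for illustration, not drawn from any actual engine:

```python
from collections import deque

class DialogueManager:
    """Queue lines; only speak when the player can actually listen."""
    BUSY_STATES = {"combat", "puzzle"}   # invented example states

    def __init__(self):
        self.queue = deque()
        self.spoken = []

    def say(self, line):
        self.queue.append(line)

    def update(self, player_state):
        # Rule 1: don't talk over gameplay that demands attention.
        if player_state in self.BUSY_STATES:
            return
        if self.queue:
            self.spoken.append(self.queue.popleft())

dm = DialogueManager()
dm.say("Well done. Here come the test results...")
dm.update("puzzle")      # player still solving: the line is held back
dm.update("traversal")   # the walk to the exit: safe to speak
print(dm.spoken)
```

The Portal test chambers effectively bake this gating into level design: the walk from puzzle exit to elevator is the guaranteed “traversal” window in which GLaDOS gets to talk.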

The solution regularly used to tackle this problem is to dovetail the gameplay with non-interactive sequences (i.e. cutscenes). The problem with cutscenes is that they neuter the player’s agency, which is one of the reasons why players can quickly lose interest in them – they’d much rather be playing the game they were just enjoying thankyouverymuch. In other words, many games’ solution to rule one is to go ahead and break rule two. Doh. Whilst cutscenes aren’t intrinsically destined to alienate players, they have a real propensity to go on for too long and throw far too much information at the player, a problem which is exacerbated by not finding elegant ways to give the player any information or story exposition during gameplay.

Valve are totally ninja at making the player feel like they’re in control when they’re actually taking part in an interactive cutscene. Much of the time you aren’t even aware of it; all those short walks at the end of each Portal test chamber are cut and dried linear, pseudo-interactive cutscenes. The task of moving from A to B without any obstruction isn’t difficult enough to prevent the player from being able to listen to what GLaDOS, or Wheatley, has to say. In fact, having been challenged to complete a puzzle, the player is smugly and actively looking forward to hearing what their adversary has to say about it, and this gives the writers a few seconds to smuggle in a little more character development or exposition as well as the laying down of the next gauntlet. It’s not so epically long that players will lose interest, and it certainly helps that the script is so engaging. These frequent, positive interactions buy the player’s trust and permit more extensive interactive cutscenes. It’s a beautiful, beautiful thing when the right balance is struck between gameplay and story, and both Portal games do this with aplomb.

The setting and characterisation in the original Portal had such a pure and convenient relationship with its gameplay that I’m sure it was easy for many to dismiss it as being irrelevant to their own work. But when such low-hanging fruit is only just being utilised by our medium it’s pretty clear there’s lots of gold left to be mined in them hills. And as if to prove the point, Portal 2 invents yet more characterisations, styles of performance, writing and design that are perfectly suited to our medium and more or less unexplored by anyone else. Which is exciting to behold, but it’d be nice to see even a few competent copycats, if not a few more pioneers.

Portal 2 is a real milestone. It inspires me; I hope it inspires you too.

In The Know: Voice Over in Flash Gordon, Star Wars, The Big Lebowski & Alan Wake

“Marooned on the planet of Mongo after they had deflected it with their rocket ship from its course towards the earth, Flash Gordon and Dale Arden fall into the clutches of Ming, The Merciless, Ruler of Mongo…..Ming’s attempt to force Dale to marry him is frustrated by Flash and Prince Thun of Lion Men..Flash and Dale are captured by shark men and brought to their undersea city..Thun is left unconscious on the bank of the river..Kala, king of shark men, orders Dale returned to Ming..Flash is thrown in to a torture chamber..the wall opens, a yellow hand gives Flash a diving helmet and directions to Dale’s room..on arriving at the room flash finds the window torn out and Dale missing! LET’S RETURN TO THUN, THE LION MAN…”

So begins strip #13 of the original 1936 Flash Gordon Sunday story line. If you were an avid follower of the strip this intro would act as a handy reminder of what went on last time. If this was the first time you’d met Flash, or you’d missed out on a few strips, this would help get you up to speed and allow you to dive into the story with the minimum of fuss. It’s crass and unsophisticated, but space is at a premium and, heck, using text is just so darn easy. It works. It does its job – no more, no less.

It’s cute, then, that a couple of years later, when they started making serialised film stories about Flash’s adventures, they used precisely the same method to solve the same problem (N.B. this clip is from the 1940 serial Flash Gordon Conquers the Universe):

It’s worth noting that, just like the first comic strip in a series, the first film in a series had no use for such recapitulations, there being nothing to recap – where the first comic strip relied on text and pictures to set up whatever it was Gordon must spend the rest of the series saving mankind from, the first film episode used voice over and moving image to convey the peril. The fact that the first episode used VO and subsequent episodes used text to set the scene, when they could quite easily have used a VO-led “previously on Flash Gordon”-style reminder montage, only serves to highlight the fact that they were well aware of the film’s heritage and chose to give a knowing nod and a wink to fans of the comic strip. But such trivialities didn’t put off one of Flash Gordon’s biggest fans when he came to emulate it some 40 years later with the release of his very own serialised space opera:

The opening crawl to Star Wars is pure cheese – it immediately lets you know that the film isn’t going to take itself seriously. It says “we could have wasted time doing this properly, but this is just a popcorn flick and we won’t pretend otherwise – here’s what’s been going on in this galaxy up to this point. Got it? Good. Right, here we go”. As a method or device for communicating information it is indisputably unsophisticated; the lowest common denominator. But by referencing cheap-ass film serials from the late 30s and their low budget laziness, it buys our acceptance by being knowingly lazy. That doesn’t apply so much these days of course, but in the mid-1970s, just before A New Hope‘s release, the Flash Gordon film serials had been shown on PBS stations across the USA, making them a relevant cultural reference point for much of the film’s younger audience as well as grown-ups familiar with the broadcasts and screenings from their own youth. You don’t even have to get the reference to appreciate what it’s trying to do – the writing is so pulpy and the technique so lazy, it’s quite clear what’s in store. Or so you think – one could argue that this introduction was designed to set the bar so shockingly low that it made the following spectacle of cutting edge special effects have even more impact. But the real point I’m trying to make [stay on target] is that something is missing…

When George Lucas watched the 1930s and 40s Flash Gordon serials as a boy, he did so on TV in the 1950s. Not only were these TV versions renamed to avoid confusion with a newfangled 1954/55 Flash Gordon made-for-TV series, but they’d been messed around with. Crucially, voice over had been added to the opening crawl + (oh, didn’t anyone tell you? Yeah, people who watch TV are stupid and can’t read. Also, TVs used to be really small, kids!):

So, Lucas intended for the opening of his film (indeed, the whole film itself) to be nostalgically naive but he had enough sense (or enough people with sense working with him) not to piss away whatever semblance of class it may have had with its cutting edge visuals by slapping a narrator’s voice over the top. I had the opportunity recently to ask the film’s producer, Gary Kurtz, if VO was ever on the cards for the opening crawl – “no, that was never a consideration”. And thank goodness for that!

You don’t see much “lazy text” in movies these days (I’m not sure location introductions or time settings count: “France, 1945”), film having shed this hang-over from the silent era long ago in favour of more sophisticated ways of communicating information. The closest you’ll get is probably the lazy use of a voice over to introduce a character. One great, knowing, example of this is the intro to the Coen Brothers’ The Big Lebowski (1998). It’s probably best if you watch this for yourself – if you haven’t, you really ought to. In lieu of a YouTube clip here’s a transcript:

“Way out west there was this fella… fella I wanna tell ya about… fella by the name of Jeff Lebowski. At least that was the handle his loving parents gave him, but he never had much use for it himself. Mr. Lebowski, he called himself “The Dude”. Now, “Dude” – that’s a name no one would self-apply where I come from. But then there was a lot about the Dude that didn’t make a whole lot of sense. And a lot about where he lived, likewise. But then again, maybe that’s why I found the place so darned interestin’.

They call Los Angeles the “City Of Angels.” I didn’t find it to be that, exactly. But I’ll allow there are some nice folks there. ‘Course I can’t say I’ve seen London, and I ain’t never been to France. And I ain’t never seen no queen in her damned undies, so the feller says. But I’ll tell you what – after seeing Los Angeles, and this here story I’m about to unfold, well, I guess I seen somethin’ every bit as stupefyin’ as you’d see in any of them other places. And in English, too. So I can die with a smile on my face, without feelin’ like the good Lord gypped me.

Now this here story I’m about to unfold took place back in the early ’90s, just about the time of our conflict with Sad’m and the I-raqis. I only mention it because sometimes there’s a man – I won’t say a hero, ’cause, what’s a hero? – but sometimes, there’s a man – and I’m talkin’ about the Dude here – sometimes, there’s a man, well, he’s the man for his time and place. He fits right in there. And that’s the Dude, in Los Angeles. And even if he’s a lazy man – and the Dude was most certainly that, quite possibly the laziest in Los Angeles County, which would place him high in the runnin’ for laziest worldwide – but sometimes there’s a man… sometimes, there’s a man… Aw. I lost my train of thought here. But… aw, hell. I’ve done introduced him enough.”

Narration is a clichéd, hackneyed and lazy technique. But this sequence is aware of this, and it invites us to share in the gag; it’s incredibly clumsy: the narrator explains what he is doing even though it is self-evident, repeats himself, goes off on tangents, stumbles and rambles on for so long that he forgets what he’s talking about. It’s beautifully realised and terribly witty.

And then there’s the intro to Alan Wake:

It’s as narratively lazy as the Star Wars opening crawl, but without any of the pop culture references. And it’s as ridiculous as the opening narration to The Big Lebowski, but without any of the knowing intent. The opening words leading up to the logo reveal are good – the quote from Stephen King is genuinely thought provoking – but their lacklustre performance and inconsiderate editing, which has removed whatever pacing may have been in the original performance, present them as if they were the “small print” at the end of a financial services ad; that is to say, with no attempt to make sure you’ve had enough time to let any of them sink in. No pauses between sentences whatsoever. Kinda like this:


The intro to Alan Wake continues. Not content with simply telling you all about a nightmare he’s had, we get to see Alan’s nightmare in action. The problem with this is that by telling us what happened Alan strips the visuals of any meaningful purpose they may have had, primarily because there is little to no interplay between what is said and what is seen. Conversely, when we are shown the mysterious disappearance of the dead body of a hitch-hiker Alan has hit with his car, his tardy comment on this event several seconds later, “Suddenly, his body was gone”, comes too late, has zero value and feels incredibly awkward. This duplication of information doesn’t make the experience twice as strong, it makes it the square root of what it could have been. It’s a bit like being told several versions of the same story at the same time by multiple storytellers – you can still understand what’s going on but it fails to present the story as strongly as a well-honed, practiced yarn from someone with the gift of the gab. It could so easily have been made more coherent:

“Suddenly…” [cut to missing body] “…his body was gone”.

We then get to work our way through the rest of the dream as gameplay. There’s a constant running commentary from Alan who continues to describe his dream as we relive it with him. This is an interesting concept, but the VO is clunky and annoying, frequently ruining the experience. Whenever Alan helpfully chimes in with a comment it’s always a prime example of information-heavy game dialogue. Any intrigue or inferred information that has been built up gets thoroughly neutered:

“You don’t even recognise me, do you, writer?”
“Think you’re God? You think you can just make up stuff? Play with people’s lives and kill them when you think it adds to the drama?”
“You’re in this story now, and I’ll make you suffer!”
“You’re a joke. There wouldn’t be a single readable sentence in your books if it wasn’t for your editor.”
“You’ll never publish another one of your shitty stories, ‘cause I’m gonna kill you!”
“It’s not like your stories are any good, not like they have any artistic merit. You’re a lousy writer! Cheap thrills and pretentious shit, that’s all you’re good for – just look at me! Look at your work!”

“I realised that the hitch-hiker was a character from the story I’d been working on”.

Thanks for your invaluable insight, Alan. The thing is, I agree with everything the hitch-hiker is saying. Alan Wake is an incompetent storyteller, and here is one of his creations quite rightly having a go at him. But before I hail Remedy as my new heroes, knowingly using bad game dialogue for comic effect, we need to take into account the fact that it’s highly unlikely anybody heard what the hitch-hiker had to say. Despite the fact I’ve quoted him word for word, I’ve got to be totally honest; I’m really not that perceptive. The first time I played through the beginning of the intro level I wasn’t listening to what the hitch-hiker was shouting at me:

a) because he was trying to chop me in half with an axe

b) because he was trying to chop me in half with an axe

c) because I was focused on playing the game and controlling Mr. Wake, and was therefore too busy flailing and button-mashing in an attempt to get away from this axe-wielding psycho who was trying to chop me in half

d) because game dialogue is so endemically, habitually toilet that I have been trained to completely ignore it in a self-preserving attempt to have a good experience. This Pavlovian conditioning, which I strongly believe all experienced gamers suffer from (and this therefore includes most game developers), is part of the reason why we totally suck at this stuff – we don’t listen. Especially when someone is trying to chop you in half with an axe.

It doesn’t matter if Remedy knowingly filled their game with worst-in-class VO, dialogue and poor storytelling, wrapping it up with the oh-so-hilarious excuse that the person telling the story is a poor storyteller – they didn’t let the audience in on the gag. But I fail to see how this could have been a good entertainment experience even if they had. I’m not even convinced that this was actually their intention; it’s far more likely that they’re making the same mistakes and are just as clueless about this story-in-games lark as the rest of us. Which is such a shame – there are aspects of the sound in the game which are absolutely world class.

What I do know for sure is that Alan Wake is not the game that will teach you to listen to game dialogue again.

The search [dramatic pause] continues…


* Voice Over (short for “Voice Over Picture”, often abbreviated to “VO”) and dialogue are not the same thing. You’d think this was fairly obvious what with them being two distinct terms, VO being a particularly transparent one, and yet many folks working in game development use the terms interchangeably. Can you imagine anyone of any responsibility on a film calling their dialogue VO? They’d really have to care so little about what they were making to have such a nonchalant attitude. But that’s the thing – voice tends to matter in film, it’s merely a cheap trick in games.

+ I made this discovery as a result of hunting for Flash Gordon clips on YouTube and noticing that they were radically different to the DVD versions I’d received as my “Secret Santa” at last year’s Media Molecule Christmas party. After a bit of research it turns out that both versions have proliferated since the original film versions became public domain.

Voice in Bioware’s Dragon Age: Origins

I’m obsessed with the use of voice in games. I find it one of the most fascinating aspects of the medium. Most related discussions you might encounter, certainly as an audio practitioner, focus on processes relating to getting a good acting performance and how to best capture it – casting, auditions, directing, ensemble performances, mo-cap, union versus non-union talent (in the USA at least), which microphone is the best for a particular kind of sound and other tricks of the trade. Then there are the production and in-game implementation issues of managing tens or even hundreds of thousands of assets, localisation fun, batch processing of files, the pros and cons of temp dialogue, real-time versus off-line manipulation and the joys of synthetic speech. All of which are important, none of which address the fundamental issue of why most game dialogue is excruciatingly painful to behold.

There’s been an obvious improvement in the general standard of writing and acting in games over the past decade. And as time goes on this can only get better – actors, directors and home-grown writers who understand the medium and the unique challenges it presents will naturally continue to hone their craft. But there are a couple of stumbling blocks:

  • There’s a general trend for more dialogue.
  • Integration between story/writing and game design is minimal or non-existent in most games and their development processes.
This first issue is just so face-palmingly stupid. If we want the quality of the experience to go up we need less dialogue. Fact. If you don’t believe me just watch this:

I get the marketing polemic here. Fine, whatever. But which comic genius thought it would be a good idea to start this video with a performance by one of the 20th century’s finest actors? This only serves to highlight how bad all the dialogue is in this bizarrely awesome train wreck of a video.

The second stumbling block is even more of a challenge. There’s only so far we can take this with “better acting” and “better writing” – the biggest issue is integrating these aspects with the game design itself which needs to use voice more appropriately instead of relying on it to communicate anything and everything. Whilst dialogue and its performance should of course be as awesome as possible, it is the role it plays and the information that it is burdened to convey which seals its fate.

Despite my criticism of Bioware/LucasArts’ approach in the voice abhorrence that will be Star Wars: The Old Republic, there are glimmers of hope to be found in Bioware’s other products. Dragon Age: Origins mesmerised me for a week over the past Christmas holiday period. Mrs. Kenny was working, so I was able to indulge in some epic RPG action, which isn’t something I normally have the time for – I refuse to play this kind of game in small chunks and much prefer being totally engrossed. It was totally sweet, sitting there in my pants, refusing to wash unless absolutely necessary. Eating and sleeping became thoroughly irritating necessities. The only way I can justify this to myself is that I know the experience will eventually come to an end, and I can pick up my real life from my last save game. It’s for this reason that I’ve never played World of Warcraft – the idea of a never-ending RPG gives me the fear. Looking past the fact that the Dwarves have American accents (WTF?!), some of my favourite experiences in Dragon Age were driven by its use (or lack of use) of voice.

In contrast, I found Mass Effect, Bioware’s previous IP, to be a bit of a chore – all RPGs are intrinsically a bit grindy for sure, but Mass Effect didn’t give me a story or universe I wanted to see more of, and the user interface didn’t make life any easier during combat. As soon as I started playing Dragon Age another thing became dazzlingly clear – the dialogue tree system in Mass Effect (which they’ve retained for Mass Effect 2 and SW:ToR) is not to my tastes. It presents you with a list of general directions/responses your character can take and, once you have made your selection, you listen to what your character has to say along those lines. I have a few problems with this:

  • Frequently, my character will say something which I categorically had no intention whatsoever for them to say, in a way which just doesn’t suit the character I’m trying to be. I’ve been forced to choose from a small selection of directions which are compromised abstractions, the result being frustration with my character and the game.
  • I’ve got to listen to the mouthy bugger, and if I skip this I have no idea what they’ve just said because of the limitations of the aforementioned abstractions which are only vaguely representative of my character’s actual response and not the entirety of the rambling speech he then goes on to make.
  • I am my character (this is an RPG, no?), so why do they do things and say things which I have little control over, and know a whole bunch of stuff which I don’t? I mean, I’m meant to be them, but I’m having it rammed down my throat that I’m quite clearly not them. They are themselves more than I am them. If that’s what I was looking for I’d watch a film, a really good film that has a century-long legacy of perfecting this kind of storytelling.
  • In summary – why give me a choice, the illusion of control, only to immediately remind me who’s really in charge? I don’t get this kind of frustration, certainly not to the same degree, playing a game with purely linear cutscenes.
Dragon Age, however, uses a different system whereby you are presented with several verbatim options for what your character could say and, then, as soon as you click on one of these phrases it is as if your character has already said it and you immediately hear and see the other party’s response. This works beautifully for several reasons:

  • Having read all the options, considered whether each fits with the character you have established and any potential outcomes, there is no need for you to hear your character speak this information out loud again (a trap fallen into by earlier games, such as Ion Storm’s Deus Ex) because you’ve already just “heard” it in your head when reading it. And so, the act of clicking replaces the act of speaking.
  • To highlight how awesome this is, compare it to what happens when you select an action for your character to perform rather than a phrase to speak – you generally have to watch your character perform the action. Why? Because if you didn’t see your character perform the action and yet you instantly saw the results of said action, this discontinuity would require a mechanism which explains the passage of time. But we know our character has said something aloud when the other characters present respond appropriately to our chosen selection, so there is therefore no need to hear it – it has clearly already been said. It’s as if the time spent reading your options replaces the time spent talking and communicating your thoughts to the other parties.
  • If your character were to speak out loud, ignoring the redundancy of hearing it all again, whose voice is this we are hearing? It certainly isn’t mine or my character’s – it’s some poor bugger who’s been in a recording studio for weeks, where everyone in the recording session has zoned out because it’s the end of another long day of the same monotonous pap, and the director has long since given up trying to get every line perfect. There isn’t even the time for that, never mind the will. And it’s not that the character is mute – this is not the same as Gordon Freeman, the silent protagonist of the Half-Life series, where the player is never given the option to “speak” – it’s that this interface paradigm bypasses the need to hear the character speak. But similar to Gordon Freeman, by not hearing a prescribed character voice the player isn’t bumped out of the experience and is empowered to fully inhabit their character.
  • The experience becomes less about communicating information via voice, and more about communicating via the written word. This opens the door to a whole new world of immersive experiences that voice and dialogue can never get even remotely close to. You can certainly get quite close using sound and the moving image, with judicious use of voice, but you will never have the time or a big enough team to realise this in a game the size and scope of Dragon Age.
  • Less time and money needs to be spent on voice records and localisation. And the experience is better! Low hanging fruit or what?
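
The mechanical difference described in the bullets above can be sketched as a tiny data model. This is purely illustrative – the class and function names are hypothetical, not Bioware’s actual implementation:

```python
from dataclasses import dataclass

@dataclass
class DialogueOption:
    text: str            # what the player reads on screen
    npc_response: str    # the line the other party speaks next

@dataclass
class DialogueNode:
    options: list        # the choices currently on offer

def lines_to_play(node, index, verbatim):
    """Return the voiced lines triggered by clicking option `index`.

    Dragon Age model (verbatim=True): the option text IS what the player
    character said - clicking replaces speaking, so only the NPC's
    response needs to be voiced.

    Mass Effect model (verbatim=False): the option is a paraphrase, so
    the player character's full expanded line plays first, then the
    NPC's response.
    """
    opt = node.options[index]
    if verbatim:
        return [opt.npc_response]
    return [f"PLAYER (expanded from '{opt.text}')", opt.npc_response]
```

In the verbatim model the player’s reading time stands in for the character’s speaking time, which is why the NPC can respond “instantly” without any discontinuity.
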
So, in playing, or trying to play, Mass Effect and then coming to Dragon Age, the difference between these two systems was made so abundantly clear that Dragon Age felt like it had invented a brand new, revolutionary interface paradigm. In reality, this is the same old tried and tested dialogue tree system of Neverwinter Nights and Knights of the Old Republic. So why did Bioware change this for the Mass Effect series of games?

Well, there are things that you can do in Mass Effect that you could never do in Dragon Age – talking to a group of people for example. Check out this scene (taking note of the interactive music which manages to score the scene rather nicely despite the dialogue trees doing their best to mess up the timing):

Pulling off this interactive cinematic in Dragon Age would be impossible given that your character spends no time speaking – this scene is all about your crew focussing on what you have to say, being inspired and motivated (it’s also about motivating the player, which is interesting given that it’s the player’s character doing all the motivating). So, perhaps the story the writers of Mass Effect wanted to tell couldn’t be done using the standard Bioware model. It’s just a shame that the method they’ve settled upon has knock-ons which make the experience less personal, and more frustrating for me as a player. I’m really interested to see what model their next new IP ends up using.

Whilst Dragon Age is more to my tastes it’s not perfection incarnate. For example, there’s some contradictory behaviour. At the beginning of the game you create your character by choosing their sex, race, specialisation, class (or caste), appearance and voice. This voice is heard when your character is engaged in combat, which is useful because these vocalisations can be a really important aural tool which supports the animations for fighting, exertion and death (however, there are generally enough sounds going on during a fight that these events could arguably be described as surplus to requirements). But what I find particularly inappropriate is that when you move your character or send them to pick up an item they will sometimes respond with an “as you say” or “your wish is my command” type phrase. This is odd – I am the character, not a 3rd party, so why are they addressing me as if this is a commander-led situation as in a Real-Time Strategy or God game? This identity crisis is made especially apparent precisely because you never hear your character speak at any other time. Interestingly, all the Non-Player Characters in your party will respond in a similar way when you give them such an instruction – this makes sense, because it’s as if they are responding to an “unspoken” command issued to them by your character (i.e. you).

One aspect of Dragon Age which I really enjoyed was those rare occasions (I only recall a couple of them) where voice was dropped altogether in favour of communicating what was going on via text. Check this out:

I don’t know how that views/feels for you as a non-interactive experience (my impatient clicking certainly makes it quite hard to read all the text – sorry!) but for me this was a most magical encounter. I mean, just compare the special feelings generated when reading a line like:

“The Presence in the gem is at first alarmed when it senses your touch. It recoils in fear, and the images that rush through your mind are ones of imprisonment and loneliness”

…with the awkward spoken dialogue and exposition near the beginning of the clip:

“Is that blood in there? Whose, I wonder? You’d think it would be all dried up after so long. There must be magic involved!”

Was this dialogue meant to be Scooby-Doo bad as part of some in-joke at Bioware that they’d all rather be making intelligent text adventures than spoon-fed talkies with hastily written and recorded dialogue that treats the audience for their adult-rated games like children? Who knows? But I found this “text adventure” sequence magical – the voices, sounds and images I experienced whilst reading about and interacting with that spirit were better than anything else I found in the entire game. Point is, this wasn’t just a simple text adventure – interacting with the visuals, music, sound (that ambiguous yet suggestive whispering voice!) and text all added up to something which cannot be experienced in any other medium (including most games).

And whilst text is rarely used for story or exposition, as in the example above, Dragon Age isn’t shy in using text descriptors to heighten the experience in a rather subtle way. For example, the game is awash with “Ominous Doors”, none of which look remotely ominous in the slightest, but their text descriptors do a fantastic job of hinting at what might lie beyond, allowing your brain to fill in the blanks and make the experience more sophisticated than what mere pixels can convey. That is a powerful tool. And Bioware seem to know it – Dragon Age is packed full of information which is locked away in the form of text inside ancient books and scrolls.

So if Bioware aren’t shy about using text throughout their experiences – which makes the “oh, people don’t read things”, “people don’t want to read”, “lots of people can’t read” argument somewhat weak – when are they going to do the right thing and cut back on their crummy spoken dialogue, so that whatever voice acting they do have stands a chance of being totally awesome and contributes more to the experience of playing one of their games than it takes away? I’d just luuuuuuuuuuuuuuuuuuuuuurve to see the telemetry for how much of their dialogue gets skipped. Go on. I mean, prove me wrong kids, prove me wrong…

I’d hate to give the impression that Dragon Age is devoid of interesting, or more sophisticated uses of voice. In the ‘Anvil of the Void’ quest the character of Hespith, encountered in the Halls of Bownammar in the Dead Trenches, offers up one of the most intriguing examples in the game. As you explore the Halls you hear her chanting a dark and disturbing verse:

First day, they come and catch everyone.
Second day, they beat us and eat some for meat.
Third day, the men are all gnawed on again.
Fourth day, we wait and fear for our fate.
Fifth day, they return and it’s another girl’s turn.
Sixth day, her screams we hear in our dreams.
Seventh day, she grew as in her mouth they spew.
Eighth day, we hated as she is violated.
Ninth day, she grins and devours her kin.
Now she does feast, as she’s become the beast.

All of which sets the grisly, eerie mood rather nicely. The deeper you delve, the more of the verse you hear, which gives you a palpable sense of approaching doom (i.e. a kick-ass boss fight). Now, throughout Dragon Age you encounter many demons, the voices of which have been processed with the requisite reversed-reverb effect to make them sound suitably otherworldly. And yet here we have a spirit who you can hear speaking through solid rock and locked doors from several hundred yards away, and her voice is totally unprocessed. The cool thing is it doesn’t need an effect – a disembodied voice chanting such disturbing material gives it an otherworldliness that all the fancy-pants processing in the world cannot impart to any source material.
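
As an aside, the reversed-reverb treatment mentioned above is usually achieved by reversing the dry signal, running it through a reverb, then reversing the result again, so the tail swells up into each syllable instead of trailing after it. A minimal sketch, with a naive convolution and a synthetic decaying impulse response standing in for a real reverb (all parameters illustrative):

```python
def convolve(signal, impulse):
    """Naive FIR convolution - fine for a sketch, too slow for real audio."""
    out = [0.0] * (len(signal) + len(impulse) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse):
            out[i + j] += s * h
    return out

def reverse_reverb(dry, impulse):
    """Reverse, reverb, reverse again: the reverb tail now precedes each
    transient, which is what makes processed 'demon' voices sound so
    otherworldly."""
    wet = convolve(list(reversed(dry)), impulse)
    return list(reversed(wet))

# A single click standing in for a syllable, and an exponentially
# decaying impulse response standing in for a reverb.
voice = [1.0] + [0.0] * 63
ir = [0.6 ** n for n in range(32)]
ghostly = reverse_reverb(voice, ir)
# The energy now ramps *up* to the transient rather than decaying after it.
```
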

I say this all the time, but I think it’s worth emphasising and reiterating – the use of a sound/voice is just as important as (perhaps more important than) the sound/voice itself.

Interactive Audio Crimes in Heavy Rain

How OCD am I? I refused to fold the piece of card that came with Heavy Rain into the intended origami figurine because I didn’t want to inflict any creases on it and, instead, recreated it, warts and all, using a blank piece of paper. That’s special :)

Heavy Rain wears its cinematic points of reference on its sleeve, making little attempt to disguise the potent musk of Fincher, Demme, Hitchcock and Lang. It is perhaps so keen to communicate that it is a familiarly cinematic experience that it’s in danger of coming across as a second rate copycat. Which does the game a disservice, because the way the gameplay allows you to interact with its story, characters and environment offers experiences and feelings not often encountered in a game, which was a pleasant surprise, and it is primarily for this reason that Heavy Rain is an experience worth checking out.

The audio experience in the game is pretty good overall. Quite a lot has been made of its acting but, frankly, it’s a mixed bag which, whilst an improvement on the normal dross in games, nonetheless fails to be on par with the cinematic experiences it tries to emulate. Interestingly, the interactive conversations have consistently poorer performances and “off” delivery of lines than the linear performances which play out in scenes, which makes sense given how hard it is to stay on target whilst performing branching conversations (both as a performer and as a director). I suspect what people have been responding positively to is the facial capture, which really is impressive – I love that the actors with speech impediments have this accurately reflected in their facial mocap, because this level of detail really helps to fuse the visual with the aural.

However, there are several aspects of the presentation of the audio in Heavy Rain which stood out to me as being not up to scratch. Some of it is understandable and useful as stimulus for discussion, but some of it is just plain disappointing. What follows isn’t meant to be a personal criticism of the audio personnel (assuming they are aware of these issues and would have made them better if they could have) – lord knows I’ve committed plenty of worse audio crimes in the past. They’re simply observations and thoughts I had whilst playing the game.

  • The opening shot of the prologue had a dog bark sitting on top of the first piano notes of the score, which I found to be pretty clumsy – the dog bark should have followed after the piano chord. The bark is needed in this scene; it immediately gave the sense of a warm family home before I even knew where I was (knowledge of the game’s story from reviews probably helped here – I suppose you rarely come to a game with zero expectations or prior knowledge). Contrasting this feeling against the sadness of the score felt bitter-sweet, which is a rather sophisticated feeling “for a game”, especially in the opening seconds. But this arrhythmic clash between the dog bark and the music annoyed me, so I restarted the game – it turned out to be down to serendipity, and must have been due to a random stinger or a random start point in the ambience track. The notion of a rather linear interactive experience being ruined because of a lack of control (or lack of thought) over the audio is intriguing – we don’t currently have a nice easy way of telling a sound “be random, but not in this context”…
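
A hypothetical sketch of what telling a sound “be random, but not in this context” might look like: instead of firing the moment it’s requested, a stinger request is quantised to the next beat and then randomised within it, while keeping clear of an exclusion window around the beat itself. All names and parameters here are invented for illustration:

```python
import random

def schedule_stinger(request_time, beat_interval, exclusion_window,
                     rng=random.random):
    """Choose a playback time for a one-shot stinger.

    Rather than playing at request_time (which may clash arrhythmically
    with the score, as with the dog bark above), snap to the next beat
    boundary, then add a random offset that stays at least
    `exclusion_window` seconds away from either side of the beat.
    """
    next_beat = ((request_time // beat_interval) + 1) * beat_interval
    usable = beat_interval - 2 * exclusion_window
    offset = exclusion_window + rng() * usable
    return next_beat + offset
```

So a request at 1.3s against a 0.5s beat grid with a 0.1s window lands somewhere in 1.6–1.9s – still random, but never right on top of a beat.
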

  • Pressing start brings up the good ol’ reliable Start Menu (note that said button is rarely used to actually start things these days – a good example of why game controllers are totally baffling to newcomers to the medium). The menu has a nice “heavy rain falling on a puddle” visual effect, and there is a suitable transition sound to lead you into the menu. It’s strange then that when you press the select button you are taken to the Controls page (again, why is it called select and never used to select anything?) which, despite having an identical visual effect, has a weedy dry transition sound rather than the lush wet one which scores the Start Menu transition (and works a million times better). Actually, it’s worse than that – the sound which plays on entering the Controls page also plays when entering the Start Menu; it’s just that the Start Menu has an additional sound thrown on top. Just about every game ever (EVER!) suffers from this kind of messy UI sound issue. That’s because UI code is uniformly a total nightmare spaghetti junction of legacy filth – you often can’t easily add a sound to this mess without having to hack it in at several different places, all of which makes the whole thing even more likely to have problems, especially once the code gets tweaked in ignorance of the sound hooks. Gah. My favourite UI audio cock-up is the same sound playing on top of itself – best case scenario is it plays back twice as loud (I love how that is “relatively good”), worst case scenario it plays back all phasey (either because you are adding a bit of random pitch shift or because the sounds get called on consecutive frames). Brilliant. Even the Xbox 360’s dashboard menu does this kind of stuff. Gotta love how uniformly slack our UI sound is as an industry (and we’re the experts – doh!).
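
The “same sound playing on top of itself” failure is straightforward to guard against at the playback layer; a minimal, hypothetical retrigger guard (the frame threshold is an arbitrary illustrative choice):

```python
class RetriggerGuard:
    """Drop duplicate requests for the same UI sound arriving within a
    few frames of each other - the doubled-volume / phasey playback
    described above."""

    def __init__(self, min_frames_between=3):
        self.min_frames = min_frames_between
        self.last_played = {}   # sound name -> frame it last played on

    def should_play(self, sound, frame):
        last = self.last_played.get(sound)
        if last is not None and frame - last < self.min_frames:
            return False        # too soon: would double up or phase
        self.last_played[sound] = frame
        return True
```

The catch, of course, is that this only helps if every UI sound call in the spaghetti actually routes through it.
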

  • The game has some really nice foley (e.g. resting your arms on the balcony railings outside your room in the prologue) but it also has some “missing” foley as a result of the mix (your footsteps on the balcony are completely inaudible in the prologue, which makes your character feel a bit floaty and fake). I wonder how they’re mixing this stuff – global falloff? That’s fine 90% of the time, but you really need to be able to tweak it on a per-camera basis as each shot has a unique context which may need to be acknowledged in the mix – “far away = quieter” doesn’t really cut it when you’re trying to be cinematic. There’s also some crappy foley at times – the footsteps are incredibly inconsistent, varying from ‘subtle and varied’ to ‘inappropriately clompy with limited sample sets’. There’s a bug in the hall landing of your house (again, in the prologue) whereby your character’s bare feet are scored with these nice, subtle footstep sounds, but if you go downstairs and come back they get stuck on some gawd awful “carpeted footstep” or “shoe footstep” sounds – this is interesting simply because it A/Bs the difference between nice subtle foley and the rubbish overt foley we are used to in games. Every subtle foot movement now has a great sodding clomp sound on it. I then took my dude outside where he got attacked by a passing bee or giant invisible hornet – there was no buzzing sound (fine, I guess?), but the sound of my manic feet clomping away rather ruined whatever the moment was meant to represent. Brilliant. However, the worst foley crime in Heavy Rain is the ****ing clapping – whenever anyone claps in this game it is deeply upsetting. Good foley grounds your characters in their world; bad foley only serves to remind you of the artifice of what you are looking at – this is just as true in linear media as in games.
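
The per-camera mixing idea is simply a global falloff with an optional per-shot multiplier layered on top. A toy sketch, with purely illustrative numbers and camera names:

```python
def foley_gain(distance, rolloff=0.1, camera_overrides=None, camera=None):
    """Base 'far away = quieter' falloff, with an optional per-camera
    multiplier so a given shot can pull footsteps up or down in the mix."""
    base = max(0.0, 1.0 - rolloff * distance)
    if camera_overrides and camera in camera_overrides:
        base *= camera_overrides[camera]
    return min(base, 1.0)

# Hypothetical: the balcony shot wants footsteps louder than raw distance suggests
overrides = {"balcony_wide": 2.0}
assert foley_gain(5.0) == 0.5
assert foley_gain(5.0, camera_overrides=overrides, camera="balcony_wide") == 1.0
```

The override table is the bit that matters – it gives a mixer somewhere to express “this shot is different” without rewriting the falloff for everything else.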

  • The reverb effect applied to the “in your head” VO is inappropriately nasty – really toppy and ear-bleedy. Presumably this is a real-time reverb rather than something baked into the audio? It’s a really strange decision.

  • The dynamic interactions aren’t scored effectively with sound. For example, the sliding door in the bedroom does not adapt to the speed that I move it at, there is silence and then a one-shot sound event plays. If I’m moving the door slowly then the result is the sound finishing a second before the door actually closes. Interestingly, there’s a point in the animation where it doesn’t matter what speed you are moving at and the door closes at a set speed – this is the point the sound should have played at. Hell, that was probably the idea, but if your system is based on timing rather than designed to deal with the dynamic interaction you are setting yourself up for a fall when the designers come along and change everything (which you know they’re going to do). Similarly, when shaving, there is a one-shot “shaving” sound that plays irrespective of how long the razor spends in contact with my skin. I appreciate that getting this right is a bit more work, but given the importance of these mechanics to the game (reinforcing the sense of ‘mundane normality’) I think this is a bit of a let-down.
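
The door example boils down to driving a loop from the interaction itself rather than firing a timed one-shot. A speculative sketch of what I mean (thresholds and mappings plucked from thin air):

```python
class SlidingDoorSound:
    """Drives a looped 'door slide' sound from the door's actual motion,
    instead of firing a timed one-shot and hoping the player keeps pace."""

    def __init__(self):
        self.playing = False
        self.volume = 0.0
        self.pitch = 1.0

    def update(self, velocity):
        speed = abs(velocity)
        if speed > 0.01:
            self.playing = True
            self.volume = min(speed, 1.0)              # faster slide = louder
            self.pitch = 0.8 + 0.4 * min(speed, 1.0)   # and slightly higher
        else:
            self.playing = False
            self.volume = 0.0

door = SlidingDoorSound()
door.update(0.5)
assert door.playing and door.volume == 0.5
door.update(0.0)   # player stops mid-slide: sound stops with the door
assert not door.playing
```

When the designers inevitably change the door’s behaviour, this survives, because the sound never knew anything about timings in the first place.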

  • Coming out of the Start menu always results in a momentary loss of audio for a few frames just after the game (and audio) has resumed. This is the kind of stuff you have to put up with when people outside the audio department don’t really care about the audio and they are more focussed on shipping the game than maintaining the quality of the player’s experience. There’s no way they would have shipped the game with a bug that caused the screen to go blank for a few frames after coming out of the Start menu, but for some reason it’s OK for the audio experience to be a bit shonky?

  • A beautiful, sad ending to the first half of the prologue was ruined by an audio fade-out that didn’t finish properly and got cut off mid-fade. Criminal. All that time and effort gone into making this scene do its thing, utterly shat on by poor audio presentation. This didn’t happen again in any other scenes, though I found some of the fades to be a bit fast (linear fades are never going to sound good – don’t use them!).

  • This one is the daddy of all crimes – providing only two voice samples for the “shouting the name of someone who is lost” sections which crop up four or five times throughout the game. I don’t care how this came to be, there really is no excuse – it’s a really poor design decision. It makes me properly physically angry, shouty and bitey. And just to tip me over the edge, the same samples are used again later in the game in a totally different context. This could have been a really poignant, meaningful moment – reusing the assets reminding me of the context and emotions I was feeling earlier in the game, but that couldn’t happen when the only emotion I was feeling (and recalling) was anger at the game itself.

  • Trophy announcements appear just at the end of a chapter, the annoying chirpy sound ruining the poise of the moment. To be fair, this afflicts all console games these days, but it’s especially jarring here – there isn’t really a good time for this event anywhere in this game, which begs the question “why bother”?

  • The mix is good 99.9% of the time in Heavy Rain. But occasionally it’s insanely bad, and always at incredibly inappropriate moments. At one point my virtual wife was berating me for being a bad father but I couldn’t hear her because the background walla in the police station of people muttering and writing notes with pencils was louder than the dialogue. I entertained the idea that this was intentional, perhaps my character “wasn’t listening”, but nothing else in the setup of the scene backed this up. Weird. The very last scene in the game was similarly ruined by the fact that the music was louder than the characters talking to each other.

  • One of the most common crimes was the lack of transitions between music cues. You expect this kind of naïve implementation in a LittleBigPlanet level (UGC FTW!), you do not expect it in a finely crafted cinematic score. The music was always appropriate but the lack of transitions from one piece to another was so incredibly jarring it consistently took me out of the experience.

  • The other consistently annoying thing is the way that if you “fail” one of the quicktime events you have to restart it and put up with audio that is 100% identical. This wasn’t a big deal most of the time for me, but on those handful of occasions where the UI completely fails to communicate exactly what you are meant to do (the infamous cutting off your finger scene being one of the worst culprits) and you are forced to repeat a sequence a dozen times before you work it out, the repetition in the audio becomes incredibly annoying. It’s OK that the animation/mocap is the same each time, I’m too busy focussing on the UI to notice, but it’s really weird and annoying that your character makes the same breathing sound every few seconds – it feels like an unexplained time-warp rather than them/me simply “having another go”.

  • Finally, my favourite crime by far was the sound of an idling car engine playing over the top of David Cage’s screen credit even as said car had pulled away into the distance. Oh the sweet, sweet irony. To be fair, it’s pretty quiet, but the idea that any film director would let that slip or that any film sound person would even dare to allow such a thing to happen is totally beyond comprehension.

Conclusion – interactive audio is hard! On a film, these kinds of “technical” and aesthetic problems can be fixed in seconds – in fact, they’re unlikely to materialise even at the final mix stage because they’re likely to have been caught and addressed as a matter of course. But on a game, fixing this stuff requires more effort, a higher pain threshold, and often requires collaboration with and support from the team you’re working with (at a level unprecedented in most film productions). The audio experience in Heavy Rain, and its failings, suggest a team experienced in high-production-quality linear media, but lacking in interactive audio development experience (e.g. nicely recorded foley which you can’t hear at times due to a naive implementation approach, and repetitive audio which screams “My first game! Our assumptions were tested and they were so wrong it hurts!”). Whilst these kinds of issues can be found in most games (including many of the ones I’ve worked on – always be learning!), they really stand out in a linear, cinematic game because the point of reference isn’t other games, it’s the linear soundtrack perfection of film. That’s a cruel juxtaposition, but whatever the reasons are for these failings – a lack of experience or time, poor collaboration or a lack of support and understanding for audio from the other disciplines – it remains a quality bar by which the audio experience could and should be measured.

GDC 2010

It’s been a year since I posted anything here, and I haven’t been updating with the vim and vigour I aspire to. Sorry folks. I rather naively thought that after LittleBigPlanet had shipped I’d have a nice quiet year and be able to do a bit more writing – that’s certainly what I demanded in my letter to Santa having just emerged from seven months of crunch. Turned out that supporting our community with lots of DLC made 2009 pretty hard going. That’s set to continue, so things are still going to be pretty quiet round these here parts.

But after another round of inspirational talks at GDC (which just wrapped up a couple of hours ago!) I’ve got a little bit of spare time in my hotel room and a lot of cool ideas buzzing around my head, some of which I’ll share here. You lucky ducks.

Clint Bajakian (THE CLINT BAJAKIAN) gave an excellent talk on adaptive music techniques this afternoon. Near the end he touched on where he’d like to see ‘horizontal’/‘vertical’ streaming systems go in the future, which was basically fading out/in a phat pile of stems at different times, demonstrating this with an example mocked up in Pro Tools. One of the big problems this solves is the “stuff getting cut off when transitioning” issue, because you get rid of those elements elegantly at an appropriate point before the transition. Awesome, and the examples he had worked beautifully, but it raised a question in my mind: how do you remove something before a transition when the thing which instigates that transition has not yet happened? Bumcakes…

Then I had the idea that in some circumstances you can quite easily predict a future game state (especially in rather linear, controlled gameplay experiences) and use that information to drive the music system. The example I gave in the Q&A/discussion at the end was that in a God of War combat-esque situation you can track player health and enemy health – if the player is totally kicking ass, has good health and the baddie is nearly dead, then you could quite confidently initiate the first stage of the transition process. The tricky part is when the probability is ambiguous, such as when the player is running low on health and the baddie is similarly close to death – it could go either way, and this makes it impossible to know whether to transition to the EPIC FAIL music or back down to the low-intensity (or whatever) music. Yet more bumcakes served up right there…

I left my braindump hanging at that point because I hadn’t finished thinking about it, but there was some more discussion along those lines. Lennie Moore picked up on it and approached the problem from the angle of thinking about the player’s emotional state – this was an excellent point and it triggered more brain juice but we ran out of time so I didn’t get to share.

The day before, Sid Meier’s GDC keynote had promoted the idea of going out of your way to make the player feel good. The example he gave was the fudged probability model in Civilization Revolution, which does funky things like reduce the chances of the player failing a battle twice in a row because this would upset them – when players feel like they’ve been cheated they are more likely to throw in the towel. This certainly explains why Civ is so bloody addictive! The key point is that it’s about what the player feels should be the outcome of a battle and has nothing to do with the actual probabilities of success versus failure.
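
I don’t know Firaxis’s actual numbers, but the shape of the fudge is something like this (the 0.3 boost is pure invention on my part):

```python
import random

def battle(true_win_chance, lost_last_battle, rng=random.random):
    """Fudged battle resolution: if the player lost the previous battle,
    boost their odds so back-to-back losses become much rarer."""
    chance = true_win_chance
    if lost_last_battle:
        chance = min(1.0, chance + 0.3)  # made-up boost, not Firaxis's real tuning
    return rng() < chance

# With a rigged roll of 0.75: a 0.6 chance fails normally,
# but succeeds after a loss (0.6 + 0.3 = 0.9 > 0.75).
assert battle(0.6, lost_last_battle=False, rng=lambda: 0.75) is False
assert battle(0.6, lost_last_battle=True, rng=lambda: 0.75) is True
```

Note that the displayed odds and the rolled odds are different things – the player only ever sees the former.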

Tying these two ideas together is the cool bit, and it’s not unreasonable to suggest it would work well because they both deal with issues of probability. So, when the player is nearly dead and they are close to defeating the baddie, but not guaranteed to do so, the game seamlessly (and totally under the radar of the player) cheats big style by making the player invincible so that the music can safely do an awesome transition. It’s win-win. The player is happy for being bad-ass and their gameplay experience has been scored to perfection.

For me, these idea jams are what GDC is all about – it’s also a really good example of why audio folks should try and go to some non-audio talks. Don’t be shy in sharing your thoughts, because this is what makes a conference successful in the first place. Be inspired and be inspiring. That feedback loop = Awesome to the power of Win.

Sackboy’s Voice – Full Of Eastern Promise

I’m back from a two week trip to Japan. It’s the longest and most far, far away holiday I’ve had for many years. There were lots of temples, photos, temples, walking, temples, eating, temples, a proposal of marriage and more temples. I even managed to make a few sound recordings; the massive ravens which are everywhere, lots of other birdsong, Buddhist chanting, trains, pedestrian and level crossings. I’m looking forward to giving them a listen and a bit of an edit. Oh, and did I mention the temples?

So, I’m feeling refreshed and ready for the year ahead. Mr. Credit Card isn’t looking so healthy. He could just about handle the holiday; I seem to recall he was actually quite looking forward to it as he doesn’t get out the wallet much, never mind the country. But he started to turn a strange colour when I asked him to take on the engagement ring. “That’s what you’re for” I told him. He said I had to “give him lots of money” if we were to remain friends. The bastard!

In other news, LittleBigPlanet has been nominated for 8 GANG awards. There’s some symmetry there with the 8 AIAS awards the game won at DICE last week, ignoring the fact that none of those were for audio BUT you don’t win Game of the Year and Console Game of the Year without good audio so I’m a winner too. Or so I keep telling myself.

Interestingly, Sackboy won an AIAS award for Outstanding Character Performance, so I am now officially an Academy Award winning voice actor. Yes, thank you, thank you ladies and gentlemen. No autographs please, put those pens away. If you’d like me to cough or hold my breath in the Sack-stylee as some after-dinner entertainment then contact my mum. Be warned though, she drives a hard bargain, and you’d be best advised not to refuse her offer of a wee cup of tea.

Seriously though, I think the sound design decision not to litter the Sackfolk with inane voice samples contributed significantly towards that award. This is something which some sound designers find hard to resist – there are a couple of 3rd party trailers and adverts out there where some arse has added chipmunk voices to the Sackfolk to make them sound comic and cute. This is a great example of something which is easy to do in the linear medium (so deceptively easy it doesn’t actually require any thought) turning out to be almost impossibly hard to do [properly] in the interactive medium. Which would be purely academic were it not for the fact that an inability to speak is a strong component of the Sackboy/Sackgirl IP.

Ignoring implementation issues such as contextual blindness and technical limitations such as memory constraints, not having a voice allows players to feel that their Sackboy, which they lovingly dress, customise and emote with, is theirs. This sense of ownership would be hard to achieve if the character you were controlling had a mind of their own, voice and language being the most personal means of communication and expression. Which is why when Sackboy does speak it is with the voice of his player, his lips moving to match those of his puppet master.

Anyways, in addition to the 8 nominations from GANG, there are also nominations for the two audio categories at the BAFTA video games awards and the audio award at GDC, so there’s still the chance for some award winning audio love coming towards LittleBigPlanet over the next couple of months.

If you want to read more about the audio in LittleBigPlanet then check out my recently published article at Music4Games, and keep an eye out for the March edition of Game Developer Magazine.

Game Developers as Toilets

I haven’t been thinking about sound much this last week – been doing mainly music t’ings. In its place then, some silliness…

At lunch yesterday the coders were discussing “what kind of programmer are you”: are you the kind of programmer that comes along and torpedoes the toilet lovingly constructed by the other programmers? Or are you the kind of programmer that enjoys licking the toilet clean?

Others were “confused as to why we have to keep re-imagining the toilet what with it having been invented already”, “sick and tired of the toilet being engaged all the time” and “convinced there is more money to be made in the construction of urinals”.

Programmers, eh?

Feeling left out I decided that I was the bleach; a nice smelling addition, absolutely necessary for an enjoyable toilet experience.

Hypersensitivity to Repetition

I’ve been aware of it for a few years, but I’m becoming increasingly sensitive to sound repetition. Or, at least, my reaction to repetition has become more intense, usually resulting in me shouting at the TV and digging my nails into Mrs. Kenny’s leg with much gnashing of teeth and cries of “WHY?! WHY WOULD ANYONE DO THAT?!”.

I’m not talking about subtle variations of the same sound, I’m talking about playing the exact same sound over and over again without any attempt at variation and with no stylistic or contextual justification for doing so.

Cheap-ass TV adverts are a likely bet to set me off on one of my angry, spitty fits. Likewise for cheap-ass TV documentaries with their obsession with (badly) adding foley to old, silent stock film footage. But when you watch a Hollywood blockbuster you expect a superior experience to no-budget broadcast productions, because someone who cares has been paid a lot of money to put the soundtrack together.

There are exceptions of course. In the pressure cooker of post-production it’s understandable that re-using a sound might be the quickest solution, or perhaps even unavoidable when there is an army of audio personnel beavering away. So, last night when I watched Terminator 2 for the first time in many years and heard the same sound being used for gas igniting, a bullet ricochet and a tyre exploding, it intrigued me more than annoyed me; the events were half an hour apart and only a freak (or a specialist) (or a specialist freak) would notice such a thing. I recall that there is a squawk that is used in the Lord of the Rings trilogy for both a passing crow and an orc being shot in the neck by an arrow. The reason these events jump out at me from the soundtrack is that the sound in question is so distinctive that the first time it is heard it lodges itself in my mind as “a nice sound that was fitting and I liked and will no doubt steal that idea one day, muhahahaha”, and from that point on it is a marked man and any re-appearance is quite likely to be picked up on by my hyper-analytical auditory system.

But when I was watching WALL-E last weekend – a film whose audio was unusually hyped in its PR – and, on at least three separate occasions, a sound was used over and over again to score the same event without any justification other than pure laziness, I was a bit miffed. There’s using a sound again with good reason, as when EVE introduces herself and says her name twice exactly the same way to reinforce the fact that she is artificial, a robot/machine/computer; we’re used to hearing the UI sounds on computers being identical, which gives us a grounded sense of familiarity that a task is performing as we would expect. Then there’s using the same metal impact sound half a dozen times in the space of 2 seconds to score a robot knocking repeatedly on a door because you are enormously crap at your job and are clearly open to the idea of me biting you in the arm and gouging away at your face. Repeatedly.

But why would a beautiful sounding film, albeit one with too much music for my tastes, let its standards slip? The people who cut the sounds on cheap-ass adverts and TV documentaries are the same people who add multiple instances of library sounds at Pixar without even thinking; non-sound people who need to bring their mute creations to life during production but don’t want to pay for it or understand why it’s important to use someone who has a clue or gives a rat’s ass even when it’s “just temporary”. It’s the same people who add temp music as a quick fix and then moan at the composer when their shiny new music isn’t identical to the temp track:

“Dammit man, can’t you make this music sound more like Thomas Newman? The temp track we’re using is perfect!”

“I am Thomas Newman”.

Poor old Ben Burtt, it’s not his fault.

All of which makes me glad I work in games. Which is ironic considering how dreadfully repetitive sounding games are, especially if you are unfortunate enough to overhear someone else playing one in the same room [shudders]. But at least I have the convenient excuse of games being outrageously repetitive experiences. It’s a different kind of repetition though. Honestly.

I’m pretty sure my hypersensitivity is in part due to me actively trying to avoid repetition in my work. If an individual sound event has six variations and is set to have a certain amount of random pitch variation and I don’t hear that reflected in-game, then something is broken. As previously mentioned, UI sounds are the exception; not hearing the same sound would be confusing to the user (“why did I get a different result for performing an identical task?”) and, personally, I like to take that to the extreme. But if you work in linear media you have no excuse. Except perhaps for pecking-order politics. For that you have my sympathy.
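
For completeness, the variation-and-pitch behaviour I’m describing is trivially simple – which is rather the point. A sketch (sample names hypothetical):

```python
import random

def pick_variation(variations, last_index=None, pitch_spread=0.05):
    """Pick a random variation, avoiding an immediate repeat, and give
    it a small random pitch offset. Returns (index, pitch)."""
    candidates = [i for i in range(len(variations)) if i != last_index]
    index = random.choice(candidates)
    pitch = 1.0 + random.uniform(-pitch_spread, pitch_spread)
    return index, pitch

# Six footstep variations, never the same one twice in a row
footsteps = ["fs_01", "fs_02", "fs_03", "fs_04", "fs_05", "fs_06"]
last = None
for _ in range(20):
    index, pitch = pick_variation(footsteps, last)
    assert index != last
    assert 0.95 <= pitch <= 1.05
    last = index
```

If the in-game result doesn’t sound like that, something between the data and the speakers is broken – which is usually where the detective work starts.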