Tuesday, July 3, 2012

Nothing Human: Creating More Convincing Conversations

Videogames are, generally speaking, about a few simple interactions.  Fighting is the one that undoubtedly shows up most often, but of course, there are plenty of other sorts - racing cars, playing sports, and so on.  But one thing that developers have yet to really master is something which is arguably far more pertinent to our interests as human beings - "simply" talking with other people.

There are all sorts of distinct challenges when attempting to simulate conversation.  Most games have distinct failure and success states, so dialogue has to be able to fit into our preconceived rules about winning or losing the game.  It has to give players a feeling of reactivity, otherwise it comes across as unnatural and stilted or even non-interactive.  Although there are manifold ways in which developers attempt to circumvent these problems, few have ever really been successful in creating dialogue interactions that feel realistic and believable.

In this article, I'll be breaking down some of those challenges in more detail and providing a few examples of techniques used to create compelling conversations, as well as a number of recommendations that can be used to approach tired dialogue systems from new angles.  Although I use role-playing games almost exclusively as examples here, it's only because there are so few examples of other games, especially in the mainstream, that revolve so heavily around conversation.

The Uncanny Valley of Dialogue

Although the much-abused term uncanny valley is almost always used in the context of graphics when it comes to videogames, it's in dialogue where the uncanny valley becomes most obvious.  Just think about all the times you've been playing a game and a character's said something that doesn't make sense for a certain situation.  Perhaps you heard a dialogue line repeat itself.  Maybe a line was even cut off mid-sentence in favor of another one.  Or, after your first play-through of a game, maybe you found that most of the conversation options all caused characters to respond in the same way.

This is doubly true for the player as well.  Even though games will often provide radically different choices in dialogue, or allow for multiple lines of inquiry, those lines always equate to simple choices between A, B, and C - there's little in the way of nuance and expression.  In Mass Effect, I can't choose to say a line in a cocky manner, or a respectful manner, or a snide one - I have to choose between the options given to me, and if I'm not given an option I like, then I'll have to pick another one.  For all those dialogue choices out there, a lot of time could be saved by realistically boiling everything down to the essential choices.

Even when we're given the option to choose the tone of our replies, the meaning can never stray from what's explicitly defined by the developers in advance.

The reason for this is obvious - short of creating some sort of super-intelligent AI to parse text and provide responses that make sense in context, and then coming up with technology to voice-act those lines in real-time, it's pretty much impossible to have dialogue that is truly adaptive to what the player can say.  Dialogue content is completely finite.  Although many interactions can be governed by sets of rules (wherein we abstract our conceptions of reality), speaking to another human being is something that is not so easily predicted.  People have personalities, they make rash and unpredictable decisions, they let emotions guide them.  Think back on all the times, even just today, when you said or did something based on impulse - while there might be some rules behind how you behave, they're certainly not things that you can divine, and you likely know yourself better than anyone else!

While we've had years and years of training to abstract activities like violence, and others, such as a football game, can be understood in terms of the rules governing the sport and simple approximations of physics, we have little understanding of other people, and even if we get to know someone very, very well, we can't predict complete strangers - in other words, the same rules don't apply.  Most videogame content works by producing sets of common rules that different actors can interact within - in a shooter, you interact with the world by shooting at enemies, who behave in predictable and finite ways no matter how many you fight; in a racing game you interact by piloting a vehicle in competition with others to reach a final point - but that doesn't work very well for talking.

Approximating Humanity

The first solution to this problem is to create a system for interacting with characters that deals less with providing highly detailed content, and more with content that is highly reactive to the player's input.  Necessarily, this will result in a dialogue system that is more approximate, but on paper, does this seem like such a bad idea?  After all, we already accept a great deal of approximation from videogames.  We understand that quips like "low on ammo!" serve as auditory feedback on another game mechanic, and we also accept that, under the hood, the game only has so many lines available to offer.

Smart developers know how to space out these lines to get the most mileage out of them - such as recording 10 different variations on the same line and making sure that players hear repeats as little as possible.  With a smart implementation, it's impressive how often players' brains will trick them.  We don't have extremely long-term memories for most things, so creating dialogue content that exploits the limitations of short-term memory will do a lot of the work for developers.  But when it comes to more significant, memorable dialogue, that approach quickly becomes useless as long-term memory takes over.
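To make the "space out the repeats" idea concrete, here is a minimal sketch of one way it could be done: a shuffled queue that plays every variation once before any line comes back.  The class name BarkPool and its methods are my own invention for illustration, not taken from any particular game or engine, and the sketch assumes the variations are all distinct.

```python
import random


class BarkPool:
    """Cycles through variations of one line so repeats are spaced out."""

    def __init__(self, variations, rng=None):
        if not variations:
            raise ValueError("need at least one variation")
        self._variations = list(variations)
        self._rng = rng or random.Random()
        self._queue = []
        self._last = None

    def next_line(self):
        # Refill and reshuffle only once every variation has been played.
        if not self._queue:
            self._queue = self._variations[:]
            self._rng.shuffle(self._queue)
            # Lines are popped from the end, so make sure the next one
            # is not the line that was just played before the reshuffle.
            if len(self._queue) > 1 and self._queue[-1] == self._last:
                self._queue[0], self._queue[-1] = self._queue[-1], self._queue[0]
        self._last = self._queue.pop()
        return self._last
```

With ten recorded variations, a player hears all ten before any repeat, which is usually a long enough gap for short-term memory to lose track.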

There are really only two solutions to the problem - either create a dialogue system wherein most dialogue only plays once, but some will repeat indefinitely based on the needs of gameplay, or create a dialogue system wherein the number of available responses increases dramatically.  Specifically, this means building a set of rules just like any other for the game, ones that approximate human interaction, and provide gameplay flexibility, even if it means giving up the "immersion factor."

Dialogue in Morrowind lacked the emotional resonance that unique responses could provide, but made up for it with huge amounts of detail and universal mechanics that determined how individuals reacted to the player.

The latter solution used to be very common in older role-playing games.  The Elder Scrolls series, up until fairly recently, used keywords to simulate paths of inquiry.  Players could have a list of ten, twenty or more inquiries, and would largely receive the same responses, with variations for specific cases (like plot-advancing dialogue, or minor variations such as changing pronouns and gendered terms around).  In Morrowind, for instance, most characters would recite the same lines over and over again, but mechanics took over - a character's individual reaction modifier would change based on the player's behavior (maybe asking about a taboo subject would net you a -20 reaction penalty), and racial or cultural background would also affect what a person could offer information on (so asking people about events on the other side of the world would rarely be worthwhile).
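A rough sketch of how a keyword system like this might hang together is below.  To be clear, this is not how Morrowind actually implements it: the 0-100 disposition scale, the threshold of 30, the topic tables and every name in the code are assumptions made up for illustration.  Only the -20 taboo penalty comes from the example above.

```python
from dataclasses import dataclass

# Hypothetical data: which cultures know about which keywords.
TOPICS_BY_CULTURE = {
    "islander": {"local rumors", "the old gods"},
    "mainlander": {"local rumors", "the capital"},
}

# Shared responses: most characters recite the same line for a keyword.
RESPONSES = {
    "local rumors": "They say the mines have gone quiet.",
    "the old gods": "We do not speak of that here.",
    "the capital": "Crowded, loud, and expensive.",
}

TABOO_PENALTY = {"the old gods": -20}  # the -20 figure is the article's example
MIN_DISPOSITION = 30  # assumed threshold below which the NPC stops answering


@dataclass
class Npc:
    name: str
    culture: str
    disposition: int = 50  # assumed 0-100 scale


def ask(npc, topic):
    """Apply any reaction penalty, then return the NPC's reply to a keyword."""
    penalty = TABOO_PENALTY.get(topic, 0)
    npc.disposition = max(0, min(100, npc.disposition + penalty))
    if topic not in TOPICS_BY_CULTURE.get(npc.culture, set()):
        return "I wouldn't know anything about that."
    if npc.disposition < MIN_DISPOSITION:
        return "I have nothing more to say to you."
    return RESPONSES[topic]
```

Notice that the writing budget here is tiny (three lines of dialogue), while the variety comes entirely from the rules: who knows what, and how much they currently like you.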

The big issue with such a system is that, now that you're dealing with a ruleset, suddenly designers and programmers have to start thinking about dialogue in terms of gameplay rules.  No longer can conversations be about conveying lots of unique emotions and subtleties to the player - now they're about cause and effect, winning and losing - and have to be crafted with that in mind.  And, like any game, those rules have to be consistent, predictable, and simple enough to understand.  By effectively turning dialogue into a mini-game, it becomes subject to all of the same constraints any other game is.

Dialogue by the Tree

Dialogue trees are a staple of role-playing games as well, and by far the most popular way to simulate conversation.  Dialogue trees, true to their name, take the form of a number of topics of inquiry, which then branch out into more paths.  For example, a dialogue tree structure might consist of: general inquiry -> clarification -> opinion, with the final option taking the player back to the "root" of the conversation.
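The structure described above (general inquiry -> clarification -> opinion -> back to root) can be sketched as a small table of nodes, each holding one character line and the player choices that lead out of it.  The Node class, the walk function and the sample lines are all hypothetical; real games typically author this in a dedicated tool rather than in code.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    npc_line: str
    # Maps the player's choice text to the id of the next node.
    choices: dict = field(default_factory=dict)


TREE = {
    "root": Node("What do you want to know?",
                 {"Tell me about the city.": "inquiry"}),
    "inquiry": Node("It's a trade hub, mostly.",
                    {"What kind of trade?": "clarification"}),
    "clarification": Node("Spices and salt.",
                          {"What do you make of it?": "opinion"}),
    "opinion": Node("It keeps us fed.",
                    {"Let's talk about something else.": "root"}),
}


def walk(tree, start, picks):
    """Follow a list of player choices and return the NPC lines heard."""
    node_id = start
    heard = [tree[node_id].npc_line]
    for pick in picks:
        node_id = tree[node_id].choices[pick]
        heard.append(tree[node_id].npc_line)
    return heard
```

Even this toy example needs one unique line per node, which hints at how quickly the writing load grows once each node offers three or four choices instead of one.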

Obviously, developers can do a lot with dialogue trees, and games like Planescape: Torment are testament to that, with thousands of unique lines of dialogue and dozens of paths of inquiry that, in themselves, make up much of the gameplay.  At the same time, however, the key limitation becomes abundantly clear: while you can take a few shortcuts with a dialogue tree format, pretty soon you're going to end up with huge, sprawling conversations.  Content bloat is felt very quickly when using dialogue trees, and if players notice too many shortcuts being taken (like identical responses to two radically different dialogue options) then the sense of realism that dialogue trees usually go for is completely shattered.

Despite the bloat, however, it's clear that dialogue trees have a big advantage - they do a much better job at simulating the act of conversation.  Even if you need (mostly) unique lines for every inquiry the player makes, the big benefit of that is that characters can have much more personality, the player is able to express an opinion in more nuanced ways, and, most importantly, that the mechanical side of dialogue disappears.  While there's always going to be a binary yes or no choice, many games do an excellent job of obscuring exactly where the variables in conversations are flipped - Dragon Age: Origins, for example, will show the amount of influence earned or lost after a conversation, but deliberately hides which dialogue options actually affect it, to better simulate the act of talking to a person rather than picking responses from a list for best effect.

Dialogue trees can provide much more detail than almost any other dialogue mechanic short of a graphic novel, but the amount of writing required can become prohibitive to presentation.

Additionally, treating the reuse of lines as "cheating" on the part of the developers might be a little unfair.  Usually, it's actually quite acceptable to reuse responses and add in separate lines that help redirect the conversation.  Most players never really notice these - next time you're playing a dialogue-heavy role-playing game, take note of how many times you hear connecting statements like "anyway", "however", "meanwhile", and so on - in almost every instance, they're being used to disguise points where the conversation has branched and needs to re-converge.  Once you pick up on this, it's surprising just how scripted many conversations really are, but so long as there are enough unique lines to keep the illusion going, it works splendidly.
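The re-convergence trick is simple enough to sketch in a few lines.  In this made-up example, each player choice gets a short unique reaction plus a connecting word, and then every branch rejoins the same shared line.  The choices, reactions and function name are all invented for illustration.

```python
# The one expensive line that every branch eventually reaches.
SHARED_CONTINUATION = "the caravan leaves at dawn, so be ready."

# Each player choice maps to (unique reaction, connective that hides the seam).
BRANCHES = {
    "I'll help you.": ("Good to hear.", "Anyway,"),
    "What's in it for me?": ("Always about coin with you.", "However,"),
    "I don't trust you.": ("Fair enough.", "Meanwhile,"),
}


def respond(choice):
    """Return a reply that feels unique but re-converges on the shared line."""
    reaction, bridge = BRANCHES[choice]
    return f"{reaction} {bridge} {SHARED_CONTINUATION}"
```

Three choices produce three different-sounding replies, but only the short reactions had to be written separately.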

There's also the player's own emotional impact and investment to consider.  Gamers are usually not developers, and they're not going to be wise to the tricks and shortcuts used - they're often in a very different mindset when playing games, enjoying the content as it's presented to them rather than fussing over the details or looking for holes to poke.  Therefore, as much as I want to complain about Commander Shepard's heroic speech influencing absolutely nothing in Mass Effect, I have to admit that, as a player, it's still pretty cool to be able to give that speech in the first place - to me, it feels like a real decision, and the way that it influences the tone of the narrative can't be denied.  Even if it doesn't really matter to the game whether Shepard's an idealist or pragmatist, it matters to me.

Closing Thoughts

The unfortunate fact is that there are very few other ways that developers have actually handled dialogue - although I could bring up text parsers, they're really not much different from the keyword system mentioned above, except that the possible inquiries are kept ambiguous.  For all the games industry has managed to so effectively simulate the act of killing another human being, or driving a car, the more complex and subtle, less predictable and deterministic act of talking to another person is something that's still up in the air.

That said, there are a few ways that existing dialogue systems can be enhanced to produce conversations that feel more natural and realistic:
  1. Abandon the UI.  Sometimes, it's better to produce results by getting the player to actually do something rather than picking options off of a list or typing them into a dialogue box.  A great example of this can be seen in Half-Life 2, where Alyx Vance will react to all the things the player can do in the game world - they're not deep interactions, sure, and the game is a shooter so there's little meaning behind them as far as gameplay goes, but there's something far more satisfying about using the tools available in one's arsenal to provoke a response rather than selecting "(Shine flashlight in Alyx's eyes)" in a menu.  The upcoming Naughty Dog action game, The Last of Us, looks to be taking this model to new heights.
  2. Emotional impact doesn't mean detail.  It's a common misconception that players need voice acting, an orchestral score and Hollywood-style direction to connect with the characters in games.  Our brains are willing to fill in an exceptional number of gaps in presentation, and sometimes the most engaging experiences are the ones that exist in our own heads rather than in code.  The emotions that players bring to the mechanics of a game are far more potent than the ones that writers try to squeeze out of an audience via sympathetic techniques.
  3. Remember what the player's done.  This is a little thing, but one of the key ways to make characters feel real in a game is to make them reference past events to provide a sense of continuity to the world.  These don't have to be dialogue choices either - one of Deus Ex's most remembered moments is Paul Denton scolding or praising the player depending on the level of lethality used in the game's opening mission, while other characters have the opposite reaction.  This simple distinction, which ties in with the broader mechanics of the game and was probably tracked in all of one or two variables, does more to draw the player into the game than all the expensive cinematic sequences in the world.
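The third point is the easiest to illustrate.  Below is a minimal sketch of the "one or two variables" idea: a record of what the player did in a mission, and characters whose debrief lines key off it in opposite directions.  This is a guess at the general shape, not Deus Ex's actual code; the character labels, the lines and the rule that any kill counts as lethal are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class MissionRecord:
    kills: int = 0
    knockouts: int = 0

    @property
    def was_lethal(self):
        # Assumed rule: a single kill marks the whole mission as lethal.
        return self.kills > 0


# Two characters react to the same flag in opposite ways.
REACTIONS = {
    "idealist": {
        True: "Did you have to kill them all?",
        False: "Clean work. Nobody had to die.",
    },
    "hardliner": {
        True: "Good. That's how you send a message.",
        False: "You went soft on them out there.",
    },
}


def debrief(character, record):
    """Pick the character's line based on how the player behaved."""
    return REACTIONS[character][record.was_lethal]
```

For example, debrief("idealist", MissionRecord(kills=3)) returns the scolding line, while the hardliner praises the same playthrough.  Four written lines and one boolean are enough to make the world feel like it was paying attention.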
As usual, thanks for reading, and please feel free to leave comments below!

1 comment:

  1. One of the aspects I've heard being praised about The Walking Dead episodic game from Telltale is its ability to remember choices across the episodes. Once the player picks a certain response to a question, the game provides feedback indicating that a certain character will remember that choice and then the player must live with that lie over the course of the game. Sean Vanaman, the co-designer/writer for the episodes, described it as "carrying rocks in the imaginary backpack of the player." It's an interesting approach to this problem.

    Of course, I'm partial to the Planescape: Torment method too for the same reason that I also dislike it: the massive amount of text. While it's great to see as a player, it'd be a major hurdle for me to emulate myself. I'd have to spend months just writing instead of coding. And while I prefer the permanence of a game like Dragon Age (in that you often can't loop the dialogue trees) as a player, it's daunting to think about as a designer. Crafting content that some -- or even many -- players won't see bothers me.

    I think I side with the context sensitive option (1) in this. Then again, you have to place something in the world to narrate too. Either as part of the text parser ("You see a table") or another character ("Don't point the flashlight at me, Gordon"). Such a choice prompts its own struggles.