Although games have certainly come a long way since the days of Spacewar! and William Crowther's Adventure, the great bulk of these advancements have been in the realm of graphics. Games definitely look a lot more sophisticated than they ever have before. However, one area that is still painfully lacking in games is artificial intelligence, particularly regarding dialog between players and computer-controlled characters. What I intend to do here is discuss a few approaches game developers have taken to address this issue--and why sometimes less is more.
Perhaps one of the most sought-after goals in all of computing is the achievement of artificial intelligence (AI). Although the nature of AI and whether such a goal is possible have long been hotly contested by both computer scientists and psychologists (John Searle springs to mind), game developers have long joined the fray. The idea there is that games must continue to challenge players and adapt; they must essentially "learn" from experience and behave in ways that imitate actual human behavior. Today, the term "AI," at least as it applies to gaming, is usually in reference to tactics in fighting or strategy games. There's nothing as disruptive as seeing some enemies in a first-person shooter continuously running into a wall--or just standing on the corner as someone shoots their elbow fifty times until they die. Obviously, it takes a great deal of good programming to make enemies behave more believably, but it always seems to be in the realm of the possible. At least, I don't doubt that one day we'll have first-person shooters in which the computer opponents behave just as cunningly as a human. Indeed, it will probably be indistinguishable which characters are being controlled by the computer and which are actual human opponents (particularly in the case of multiplayer online games).
However, no matter how well a character might duck, jump, and devise strategies to beat even the most manipulative human, there is one easy way to tell if there is silicon or gray matter at the helm: say hello. To put it simply, even in 2006, a five-year-old child has a much better language system in place than even the most sophisticated computer.
Chatterbots and Natural Language Processing
Although there certainly have been cleverly coded chatterbots, or programs that attempt to mimic humans in a chat room (usually a text-only environment), no one with any sense would confuse these parlor tricks with actual intelligence. They usually work by ferreting out keywords and responding in what is hopefully a believable manner. The famous program ELIZA, for instance, used the conceit of a Rogerian therapist. Since Rogerian therapists are thought to possess a very strange and contrived manner of conversation anyway, ELIZA's odd questions and responses seemed believable enough. Other chatterbots have relied on various other techniques, such as simulating bad typing, spelling and the like. The classic test these bots were subjected to is the "Turing Test." The idea there is that a human wouldn't be able to tell the difference between a bot and another human. Needless to say, no bot has yet managed to pass that test in a truly satisfactory manner.
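To make the keyword trick concrete, here is a minimal sketch in Python of a bot that ferrets out a keyword and echoes part of the input back as a question. The rule table, the `respond` function, and the fallback line are all invented for illustration; this is not ELIZA's actual script, only the general idea.

```python
import re

# Hypothetical rule table: (keyword pattern, response template).
# The first rule that matches wins; nothing here "understands" anything.
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\b(mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Stock reply used when no keyword is found.
FALLBACK = "Please go on."


def respond(user_input: str) -> str:
    """Return the canned response for the first rule whose keyword matches."""
    text = user_input.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Reflect the captured fragment back at the speaker.
            return template.format(match.group(1))
    return FALLBACK
```

A call such as `respond("I am tired of this game.")` simply reflects the phrase back as a question, which is about all the "therapist" conceit requires.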
Perhaps one area where game developers have worked the hardest in achieving believable dialog is interactive fiction (or, "adventure games," or "interactive drama," or what have you). With the typical text adventure, the goal might be to simulate "natural language" to the point where the player and narrator could function like a dungeon master and player of a tabletop role-playing game. Rather than rely on strict, highly artificial syntax and a database of stock responses, the narrator could "talk" with the player to figure out what he wanted to try, then act accordingly.
Obviously, such a parsing system would be extremely difficult to program. The reasons are apparent to anyone who has really thought much about language. I really think that Searle's Chinese Room Argument puts the issue best. Does knowing all the appropriate symbolic responses to a string of Chinese characters really mean that you understand Chinese? Of course not. Readers familiar with Noam Chomsky might also cite the nature of a generative grammar as an impossible obstacle. To put it simply, even seriously limited native speakers of a language can produce a near-infinite variety of acceptable statements and responses. Imagine how many different ways you could really say "take lantern": "Make that lantern mine." "Lantern; oh, lantern, joinest my inventory thou now must do." "That brass lantern that was on the table? Now it's in my hand." You get the idea. And that's not even taking into consideration other possible responses, such as "How'd that lantern get there?" or "That's a really ugly lantern" or "Damn. That's so cliche. A brass lantern?" An actual human could respond to each of these statements with little difficulty, but a computer trying to parse them would probably retort something like, "I don't see a 'cliche' here."
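For contrast, a minimal sketch of the kind of verb-noun parser that produces retorts like the one above might look as follows. The vocabulary, the room contents, and the `parse` function are assumptions made up for this example rather than the parser of any particular game.

```python
# Hypothetical vocabulary and room contents for the sketch.
VERBS = {"take", "drop", "examine"}
OBJECTS_HERE = {"lantern", "table"}
FILLER = {"the", "a", "an"}


def parse(command: str) -> str:
    """Treat the first word as the verb and the last word as the noun."""
    words = [w.strip(".,;:!?\"'") for w in command.lower().split()]
    words = [w for w in words if w and w not in FILLER]
    if not words:
        return "Pardon?"
    verb = words[0]
    if verb not in VERBS:
        return f"I don't know how to '{verb}'."
    if len(words) < 2:
        return f"What do you want to {verb}?"
    noun = words[-1]
    if noun not in OBJECTS_HERE:
        return f"I don't see a '{noun}' here."
    return f"You {verb} the {noun}."
```

"take lantern" works; "Make that lantern mine." is rejected at the first word, even though any human would read the two as the same request.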
A recent return to earlier forms of dialog can be seen in a free game called Facade, which revives the tradition of simulating natural language processing in order to fool players into thinking its parser is much more sophisticated than it actually is.
There are a few ways a game developer can approach this problem. As I mentioned before, several text adventure game engines have worked ever more diligently towards providing better natural language processing (NLP). What you'll find as you read more about these systems is that the creation of a "natural language processor" has resulted mostly in the creation of an "artificial language" (i.e., jargon) that these folks use to befuddle themselves and the outside world into thinking they know what they're talking about. To make a long story short, the whole process of language acquisition is about as poorly understood to us as the inner workings of the human brain. Suffice it to say, NLP is a very long way away, and certainly not a practical option for a modern game developer interested in better dialog.
Another approach that became common in graphical adventure games is to present dialog as a series of menu options. For instance, a player encountering a troll might be shown a menu like the following:
(a) Die, troll! (attack)
(b) Excuse me, sir, do you mind if I go past?
(c) Hello, my name is Fred. What's yours?
Once the player has made a selection, the game can respond accordingly. Usually in these situations there is really only one "correct" response, which either must be chosen for the game to progress or is actually the inevitable choice regardless. In the above exchange, for example, it might be pre-ordained that the player must fight the troll. Selecting options B or C might result merely in a grunt from the troll and an attack. This technique presents the illusion of choice without having to bother with more dynamic gameplay.
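The "illusion of choice" is easy to see once the menu is written down as data. The sketch below is a hypothetical encoding of the troll menu (the `TROLL_MENU` table, the outcome tags, and both helper functions are my own invention): three different lines for the player, but only one outcome.

```python
# Hypothetical menu: option key -> (line shown to the player, outcome tag).
TROLL_MENU = {
    "a": ("Die, troll! (attack)", "fight"),
    "b": ("Excuse me, sir, do you mind if I go past?", "fight"),
    "c": ("Hello, my name is Fred. What's yours?", "fight"),
}


def choose(menu, key):
    """Return the (player line, outcome) pair for the selected option."""
    return menu[key]


def distinct_outcomes(menu):
    """Collect the set of outcomes the menu can actually lead to."""
    return {outcome for _line, outcome in menu.values()}
```

Here `distinct_outcomes(TROLL_MENU)` contains a single entry, which is exactly the pre-ordained fight described above.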
However, other possibilities exist. The game could follow a "branching tree structure." Maybe choices B or C could lead independently to a series of other choices, and so on, until at last there were millions of possible outcomes. The problem with this structure is the exponential growth of the structure. Of course, everyone will remember those great "Choose Your Own Adventure Books," which generally solved this problem by quickly ending the narrative if the player made the "wrong" decision. It's really a form of cheating to tell a player to "choose his own adventure" when, in reality, we all know that it's a highly linear narrative with a single correct trajectory through a series of pre-defined decision-making moments.
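A little arithmetic shows why the tree gets out of hand. The sketch below assumes, purely for illustration, that every exchange offers the same number of options; the function names are mine. It compares a full tree with the "wrong choice ends the story" shortcut.

```python
def leaf_count(branching: int, depth: int) -> int:
    """Number of distinct endings if every choice leads somewhere new."""
    return branching ** depth


def node_count(branching: int, depth: int) -> int:
    """Total scenes an author must write for the full tree (root included)."""
    return sum(branching ** level for level in range(depth + 1))


def pruned_node_count(branching: int, depth: int) -> int:
    """Scenes needed when only one option per turn continues the story
    and every other option ends it immediately."""
    return 1 + branching * depth
```

With three options per exchange, thirteen exchanges already yield `leaf_count(3, 13)`, or 1,594,323 endings, while the pruned version needs only `pruned_node_count(3, 13)`, or 40 scenes. That gap is why the linear shortcut is so tempting.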
The number of games that take the menu-based approach is large indeed, and it seems to be the most common approach even today. Perhaps the most well-known adventure games of this type are LucasArts', particularly Ron Gilbert's Secret of Monkey Island series. These developers always seemed aware of the limitations of the menu-based system and made light of it, often to quite humorous effect. Indeed, anyone who has played the first Secret of Monkey Island game will remember the sword-fighting scenes, where the player must exchange insults with his opponents in order to progress. However, the player can only succeed if Guybrush Threepwood (the avatar) offers the response that correctly corresponds to the one hurled by the opponent. Unfortunately, Guybrush can only learn what the right responses are by a combination of trial and error and experience in many such battles, so that at last he has the whole repertoire at his disposal. To my mind, this is the most clever use ever made of a menu-based dialog system. At the very least, it was the most fun. Revolution attempted, well, a revolution, by creating a very complex dialog system in its game Lure of the Temptress, where players could string together very long commands and responses to characters. Unfortunately, this resulted more in confusion and frustration than enjoyment.
Of course, one of the obvious problems with a menu-based system is that a player might very well desire to order a dish that's not on the menu. This is often the case in rigidly linear games that try to force a player along a narrow route. There is also a problem in that the resulting conversations (particularly when they're spoken out by actors) feel tremendously artificial when things are said out of order. For instance, a player might hear a character say, "Ah, good point" or "Well, now that you mention it" at a moment that makes no sense. Furthermore, we don't use the same tone of voice and inflections at the beginning as we do the end of a conversation. Game developers have tried to get around these limitations by having the characters break off with an automated "goodbye" of some sort, but these canned responses often seem repetitive and unconvincing. The game that springs to my mind here is Sierra's Gabriel Knight 2.
Some games try to shore up some of these weaknesses by offering icon-based menu options rather than strings of dialogue. This is true of many later LucasArts games, such as Sam & Max Hit the Road, but also many modern games such as Revolution's Broken Sword 3: The Sleeping Dragon or Unknown Identity's The Black Mirror. The idea here seems to be that presenting the player with icons allows for a bit more room for abstraction and generalization--and also a bit of suspense, since the player won't know exactly what the avatar will say. For instance, in The Black Mirror, players are frequently presented with two choices for responding to dialog: a smiley mask ("positive") and a scowling mask ("negative"). This technique offers the player significant control without totally relinquishing the element of surprise. I might add that it also greatly simplifies translation, since there will obviously be much less text involved at the interface level.
One of the most interesting systems of abstract dialogue can be found in SSI's Gold Box Games. There, players could choose among "haughty," "meek," "sly," "nice," and "abusive" options in dialog. This five-tier system, while perhaps somewhat overly complex and a bit redundant, nevertheless makes for some very interesting possibilities when dealing with computer-controlled characters. However, again we have the problem of the branching narratives. The solution is either to link most of the choices to the same fixed result (i.e., four will cause the creature to attack, one will cause him to flee) or limit the dialog to one or two turns. Most games seem to opt for both, with the result that only a highly linear and narrow set of "correct" decisions will result in the desired outcome.
The norm now seems to be tending towards either a purely icon-based menu system or a hybrid of icons and text. Funcom took an interesting approach with Dreamfall, in which the avatars seem to be thinking aloud about the dialog options (e.g., "I should try to show some interest in his job."). Cinemaware attempted a similar approach to dialog in its 1987 game King of Chicago. In this game, the characters would talk using pop-up bubbles like those seen in comic books. The player made choices about the dialogue by selecting one of two "thought clouds" that would appear above the avatar's head, again in the fashion of comic books. More interesting still, the player's mouse arrow was replaced by a fly during these segments, and sometimes the avatar, Pinky, would make remarks indicating that he was being pestered by the fly. Unlike most other games, the player could elect to do nothing, and after brief pauses Pinky would make a random choice and the game would move forward, thus preserving the game's frantic pace. Both Cinemaware's and Funcom's approaches really seem to help the player "get into the head" of the avatar and seem to represent some interesting possibilities.
Ostensibly, we might think that what's really called for is a more dynamic system in which the computer could generate believable responses to any combination-- a sort of "on-the-fly" approach to dialog. However, with a highly limited icon-based menu system, it quickly becomes apparent just how unfeasible this is--particularly if the game is to have any sort of narrative guiding the action. Should we allow a hopelessly abusive player to nevertheless win the game? Or do we want to punish that player by refusing to grant him victory? Several CRPGs have tried to allow as much flexibility as possible, even to the point of allowing a player to go on a "killing spree," killing enemies and friendly characters alike--and still "complete" the game. Increasingly, games are offering many variations on their end sequences, so that "good" players will see a different set of screens and narrated resolutions than "bad" ones. However, such a degree of flexibility seems to rob the game of any authorial intent, which, at least in most adventure games, is one of the reasons why the game is enjoyable in the first place.
Games like LucasArts' 1995 classic The Dig, for instance, rely on a fairly linear series of events. If the player could just kill off all the non-player characters at the beginning and still manage to win the game, the integrity of the game would fall apart. All of the wonderful interactions and character development that make that game special would be irrelevant.
What I suppose this all boils down to is that for dialog to be any fun, it has to be interesting, non-trivial (i.e., it has to serve some real purpose in the game), and dramatic (here, I'm treading on Brenda Laurel's work). There's hardly anything dramatic about a game that lets you do whatever you want, with no thought of anything being "inevitable" or "determined by fate," i.e., those things we associate with good drama. Of course, the problem is that for a game to be any fun, the player has to make choices. My contention is that those choices should be purposeful, but constrained by the author's overall intention. To use a somewhat bad example, consider Unknown Identity's game The Black Mirror, which ends with the player's avatar committing suicide. Undoubtedly, many players would like to see a different ending, and some may even resent the fact that they are forced to watch their avatar leap off the edge of a tower. However, allowing some other choice would ruin the effect the developers intended. I might liken this to the death of Floyd in Planetfall. Surely, most players would have opted to save Floyd had they been given the chance. However, the reason the moment is remembered so poignantly is that it was fated to be; i.e., it was not a choice to be made but a situation to be felt.
To put it simply, players had better be careful what they wish for when they wish for AI. A truly dynamic dialog system would be about as much fun as reading a "novel" that was just a stack of blank papers upon which the reader was supposed to "choose his own adventure." Can you see the Emperor's new clothes?
In reality, a good dialog system would have to allow for a certain leeway while maintaining the author's control over the narrative as a whole (i.e., the meta-narrative). Certainly, there ought to be nearly unlimited ways for players to complete the games; hundreds and hundreds of small choices would nevertheless lead up to a satisfying conclusion (punctuated by set developments).
A precedent for what I'm talking about can be seen in Greek theater. While all the Greeks knew perfectly well what would happen to Oedipus in Sophocles' famous play, they had no idea how the playwright would arrange events--or the exact dialogue he would have coming out of the characters' mouths. To my mind, a great dialog system would work the same way. While players could experience the game in greatly different ways depending on the way they wanted to play the characters, the overall outcome would always be the same. The fun would be in discovering all the varied possibilities of getting there.
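One way to picture this is as a story graph whose branches keep rejoining at fixed "set developments." The sketch below is only an illustration under that assumption; the scene names and the `all_paths` helper are invented, and a real game would need far more scenes.

```python
# Hypothetical story graph: scene -> scenes the player can move to next.
# Branches fan out, then rejoin at the author's fixed milestones.
STORY = {
    "start": ["flatter_guard", "bribe_guard", "sneak_past"],
    "flatter_guard": ["inside_palace"],
    "bribe_guard": ["inside_palace"],
    "sneak_past": ["inside_palace"],
    "inside_palace": ["confront_king", "consult_oracle"],
    "confront_king": ["truth_revealed"],
    "consult_oracle": ["truth_revealed"],
    "truth_revealed": [],
}


def all_paths(graph, node, ending):
    """List every route from `node` to `ending` (the graph has no cycles)."""
    if node == ending:
        return [[node]]
    paths = []
    for next_scene in graph[node]:
        for tail in all_paths(graph, next_scene, ending):
            paths.append([node] + tail)
    return paths
```

In this toy graph there are six different ways to play, yet every one of them passes through the palace and ends at the same revelation. Adding more rejoining stages multiplies the routes without multiplying the endings.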
Really nice article, Matt. This is something we'll have to get into in our book later on. :)
One nice dialogue system I enjoyed is the one used in the mediocre Wing Commander 3: Heart of the Tiger and the excellent Wing Commander 4: The Price of Freedom space-sim games. While the main gameplay stuck to the tried and true WC formula, the sequences where you could talk to fellow wingmen in between missions had two branching dialogue options. Some of these did have significant repercussions, while many did not, but the decision basically boiled down to: is my avatar (played by Mark Hamill) going to be a righteous hero or a conniving asshole? It was fun to see how the acting in the different FMV clips would play out.
WC 3 had an interesting Japanese dating-sim element with the flirtations between your avatar and two people on the ship: the mechanic or a co-pilot. Several FMV scenes had you choose how you wanted to pursue the relationships and while this had little effect on gameplay, it added a soap-opera level of intrigue that enhanced the game as a whole.
There was a spot in WC 4 where the dialogue branching added a layer to the plot. Late in the game, you had the choice between two missions: would you go to a weapons depot and pick up much-needed ammo for your ships or go and save a civilian ship under attack? It's the Dove VS Hawk argument, but if you picked the "Rescue the Civvies" option, the last few missions would be a bit harder, yet if you picked the "Secure the Weapons" option, the later missions would be a bit easier.
Another game with an interesting dialogue system was The Legend of Kyrandia 3: Malcolm's Revenge. Your avatar was Malcolm, an evil court jester set free from his stone imprisonment. The dialogue options were controlled by a slider with three settings: Nice, Neutral, or Evil. The game featured no dialogue trees, but having a slider gave the game variety; certain puzzles could be solved only with the appropriate "mood" selected.
=- Mat Tschirgi =- Armchair Arcade Editor
Maybe I'm the only one, but I don't really need NLP in a game, because that's a sure recipe for having to type a lot. Menu-based conversation serves perfectly fine in most games. In fact, the frequently quoted Facade would have been served better by such a conversation system. I may be missing something, but apart from the Facade programmers, I don't particularly recall that many game developers or players wishing for strong AI in conversation systems.
Go here http://cogsci.ucsd.edu/~asaygin/tt/ttest.html#hi for lots of info on the Turing Test.
Alan Turing was the English genius who designed the Bombe - the machine used to help break the German codes produced by the Enigma machine during WWII.
Turing is one of the great theorists of computer science - one of his works discussed the idea of a computer "appearing" as a human via a conversation held at a keyboard, called the Turing Test.
Personally I am very much interested in Turing and computers behaving and interacting like humans. As I progress in my studies to become a psychiatrist - having to digest a lot of literature on psychiatry and psychology and a lot of the old theorists/behaviourists - my sense of human behaviour and interaction does become different. You see clear patterns emerging and quite often human behaviour can be very predictable. I would love to create an AI-'engine' capable of these sorts of behaviours and patterns and see what that brings. 'Elise Plus' here we come ;)
-= Mark Vergeer - Armchair Arcade editor =-
Great piece and comments. Menu-based dialog is a convenient way out of the whole interaction issue, since the pre-canned queries can branch out to pre-canned responses. To me, this is a purely passive system, again, no more advanced than the "Choose your own Adventure" books from our youths (though those were quite fun for what they were, some of the more advanced ones even incorporated dice and combat). On the other end of the spectrum is when we played Dungeons & Dragons - paper versions - and there was a human taking the role of "Dungeon/Game Master". You were guaranteed interactive conversations there - active conversations - while still being kept within the confines of the game world (eventually you'd get some nugget of useful information so you could continue the adventure or get an important clue). Obviously in some cases, that would be the holy grail of conversation-based gaming, the "Turing test" if you will (though there is some modern debate whether the Turing test is a valid qualifier, but that's a different discussion).
Then there's Douglas Adams' "Starship Titanic". While I own it, I have yet to play it. That was among the last mainstream games to try a different type of conversation system. While the game itself played out like a traditional adventure game, the dialog system reverted back to text adventure style, where the player would actually actively type in questions. To me, this is a great solution to the passive versus active methodologies, as on the game end it's passive, but on the player end it gives the ILLUSION of being active. Of course the game itself was not entirely a success, but ideas like that need to be explored more.
Not at all! There's really no reason why a menu-based conversation system can't be smart, context-aware and interactive.
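As a rough sketch of what "context-aware" could mean in practice, the menu can be filtered by what the player already knows, so options appear and disappear as the conversation goes on. Everything below (the `Option` class, the fact names, the sample lines) is hypothetical and not taken from any particular game.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One menu line plus the facts that must (or must not) be known."""
    text: str
    requires: frozenset = frozenset()
    forbids: frozenset = frozenset()


# Hypothetical options for a conversation with an innkeeper.
OPTIONS = [
    Option("Have you seen anything strange lately?"),
    Option("Tell me about the missing lantern.", requires=frozenset({"heard_of_lantern"})),
    Option("Who are you?", forbids=frozenset({"met_innkeeper"})),
]


def available(options, known_facts):
    """Return only the lines that make sense given what the player knows."""
    return [
        o.text
        for o in options
        if o.requires <= known_facts and not (o.forbids & known_facts)
    ]
```

A first-time visitor sees the introduction; a returning player who has heard about the lantern sees the follow-up question instead.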
When conversation isn't the point of the game, as it is in such rare examples as Facade and Starship Titanic (which I owned, and played a good deal through), it's much better for conversations to be short and to the point. After all, games would become tedious if players were continually missing clues because they missed the right subject to talk about.
The problem for me with menu-based dialog is that I'm forced to play with either the attitude or the questions that the designers thought of, which may not be how I want to approach the conversation or the types of questions I want to ask. I realize you can't have infinite flexibility and variety, but I get frustrated when say there are four choices, and none of the four are said in a way that I either want to say them or I want to ask them in. I can think of countless games like that and I just don't think it's a particularly elegant solution. I'd prefer to see more alternatives mixed in with the relative 'easy way out' in menuing that's become all but standard for EVERY game.
Wow, these are great comments. We've got some incredibly smart people here! :-) I was thinking a lot about Mark while writing this post, since I knew he was training as a psychiatrist. I've been listening to a series of lectures by Daniel Robinson on psychology lately, and have also listened to John Searle's Teaching Company courses on the philosophy of mind. Both courses talk a bit about artificial intelligence, so it's been on my mind lately. ;-)
First, a quick response to Bill. I'd argue that in most adventure games, at least, you are supposed to be essentially "role-playing" a certain character, with a history (no matter how shallow) and personality. For instance, in the Monkey Island series, you are playing Guybrush Threepwood, and he's expected to offer certain kinds of dialog and responses (he's got a personality). If that were opened up to the point where you could have him say anything (act "out of character"), then you couldn't honestly say you were "playing as Guybrush Threepwood." I think that part of the fun here is (a) enjoying the character and (b) enjoying playing as the character. Of course, in games where you were supposed to define your own character, you should have as much leeway as possible in determining how your avatar responds to given situations.
As far as the other questions...The basic problem seems to boil down to how much stock we put in the notion of "understanding." If you teach a child to say "I'm doing fine" in response to "How are you?", does that mean the child really understands what "I'm doing fine" means? Or is the kid just obeying instructions? I know I've often had conversations with people about things I knew very little about, but just by being able to ask seemingly relevant questions, I was able to fool them into thinking I knew more about the subject. For instance, I know next to nothing about football other than the names of a few teams, such as Arsenal and Manchester United. If I encounter someone who wants to talk about the subject, I can usually trick them into thinking I'm a football buff just by virtue of being an American and asking something like, "What are your thoughts on Manchester United this year?" From there, just by asking more questions and confirming or asking for clarification on certain points ("Now, what makes you say that, exactly?"), I can fool them into thinking that football is something of an obsession of mine, when really I've probably watched maybe two games in my life. In short, I don't understand football, but just by responding to certain keywords, I can seem to.
I think a computer might be able to do something similar one day. I think of something like Godel's Incompleteness Theorem and think, well, what we need now is a computer program that can take advantage of all the information being delivered on the net. Instead of thinking of an AI as something that you have to completely program from the get-go, I think it makes more sense to program it as something like Wintermute.
For instance, a chatterbot could spend a great deal of time in chatrooms on the net, first by "lurking" and collecting vast amounts of possible responses (and the keywords/phrases that likely lead to them) in a database ("machine learning" or "data mining"), and then experimenting by seeing how well it was able to trick people into thinking it was human. There would need to be some sort of self-correcting mechanism in place, so that unintelligible or give-away responses were gradually weeded from the database. While a system like this might never work perfectly, given enough time and access to enough data and experiments, I'm pretty sure it could reach a very impressive level of mimicry. It'd be very ironic if the first really convincing AIs were programmed by spammers trying to lure customers on chatsites.
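Here is a toy sketch of that lurk-then-experiment loop, just to show the moving parts. The `LurkingBot` class, its scoring rule, and the `drop_below` threshold are all assumptions of mine; a real system would need far better keyword extraction and a far larger database.

```python
from collections import defaultdict


class LurkingBot:
    """Collects replies seen in chat, indexed by keyword, and weeds out
    replies that keep giving the bot away."""

    def __init__(self, drop_below=-2):
        self.scores = defaultdict(dict)  # keyword -> {reply: score}
        self.drop_below = drop_below     # hypothetical weeding threshold

    @staticmethod
    def _keywords(text):
        # Crude keyword extraction: lowercase words longer than three letters.
        cleaned = (w.strip(".,!?").lower() for w in text.split())
        return {w for w in cleaned if len(w) > 3}

    def observe(self, prompt, reply):
        """Lurking phase: remember that `reply` followed `prompt`."""
        for keyword in self._keywords(prompt):
            self.scores[keyword].setdefault(reply, 0)

    def reply(self, prompt):
        """Experiment phase: pick the best-scoring reply for any keyword."""
        candidates = {}
        for keyword in self._keywords(prompt):
            for text, score in self.scores.get(keyword, {}).items():
                candidates[text] = max(score, candidates.get(text, score))
        if not candidates:
            return None
        return max(sorted(candidates), key=candidates.get)

    def feedback(self, prompt, reply, fooled):
        """Self-correction: reward replies that passed, weed out give-aways."""
        delta = 1 if fooled else -1
        for keyword in self._keywords(prompt):
            replies = self.scores.get(keyword, {})
            if reply in replies:
                replies[reply] += delta
                if replies[reply] < self.drop_below:
                    del replies[reply]
```

After a few rounds of negative feedback, a give-away reply drops out of the database for that keyword and the bot falls back on whatever has survived.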
Of course, the one thing that seems to be missing here is the hormonal rushes that infants get when they do something right. It must be a big thrill for a child who learns that saying "mama" gets all sorts of wonderful things going. How can we make a computer actually desire to do something rather than just do it because it was programmed to do so? Unless we can integrate some type of organic components into the hardware, I don't see how this will ever be possible.
I think it's interesting that games seem to be using the net less as proving grounds for AI and more to unite human players with other humans in "massive multiplayer games." Who really needs AI when you have so many other humans to play with? Why play against a computer opponent in a chess game when you can easily log in and play with real chess buffs? We're in a situation somewhat like that of the Romans with slave labor; why invent so many labor-saving devices when you can cheaply buy a slave to do the work instead?
If you ask a question the authors didn't think of, it's unlikely the programmers thought of a satisfying response. The technology is just not there. The best they can do really is to give some kind of generic response, and steer the conversation back on track. This would become obvious after a while. A pseudo-intelligent conversation system just makes the procedure less obvious.
When you look at Oblivion, most NPCs have plenty of subjects to talk about, but after a while, you don't really want to spend that long in conversation mode, because you can skip most topics without running the risk of missing something important. Sometimes, a character has a piece of surprise knowledge, which falls outside the running quest topics or generic responses.
Now, had Oblivion some form of text conversation, you would have to ask the NPC about each subject separately. And if that NPC had some piece of surprise knowledge, you would never know about it.
I guess it's often a case of "A question you can't ask is a question you don't need to ask".
The most obvious method of storing knowledge in a database is in the form of a matrix called a neural network. Unfortunately, problems in such matrices tend to be NP-complete, and there is no amount of computing power available to retrieve solutions once the matrix grows over a certain size. So for a generic knowledge database, some sort of heuristic algorithm must be found to retrieve a reasonably close context for any subject (after all, a real conversation partner may also be clueless ;) ). If this is not enough, which would seem likely given the scope of the knowledge base, then the obvious solution would be a neural net of heuristic routines to access the knowledge database (for which the problems would also be NP-complete).