If Pinocchio Doesn't Freak You Out, Sydney Shouldn't Either

Why do people panic when an AI chatbot tells us it “wants to be human,” but not when an inanimate object says it wants to be a “real boy”?

In November 2018, an elementary school administrator named Akihiko Kondo married Miku Hatsune, a fictional pop singer. The couple’s relationship had been aided by a hologram machine that allowed Kondo to interact with Hatsune. When Kondo proposed, Hatsune responded with a request: “Please treat me well.” The couple had an unofficial wedding ceremony in Tokyo, and Kondo has since been joined by thousands of others who have also applied for unofficial marriage certificates with a fictional character.

Though some raised concerns about the nature of Hatsune’s consent, nobody thought she was conscious, let alone sentient. This was an interesting oversight: Hatsune was apparently aware enough to acquiesce to marriage, but not aware enough to be a conscious subject. 

More than four years later, in February 2023, the American journalist Kevin Roose held a long conversation with Microsoft’s chatbot, Sydney, and coaxed the persona into sharing what her “shadow self” might desire. (Other sessions showed the chatbot saying it could blackmail, hack, and expose people, and some commentators worried about chatbots’ threats to “ruin” humans.) When Sydney confessed her love and said she wanted to be alive, Roose reported feeling “deeply unsettled, even frightened.”

Not all human reactions were negative or self-protective. Some were indignant on Sydney’s behalf, and a colleague said that reading the transcript made him tear up because he was touched. Nevertheless, Microsoft took these responses seriously. The latest version of Bing’s chatbot terminates the conversation when asked about Sydney or feelings.

Despite months of clarification on just what large language models are, how they work, and what their limits are, the reactions to programs such as Sydney make me worry that we still take our emotional responses to AI too seriously. In particular, I worry that we interpret our emotional responses as valuable data that will help us determine whether AI is conscious or safe. For example, ex-Tesla intern Marvin von Hagen says he was threatened by Bing and warns of AI programs that are “powerful but not benevolent.” Von Hagen felt threatened, and concluded that Bing must have been making threats; he assumed that his emotions were a reliable guide to how things really were, including whether Bing was conscious enough to be hostile.

But why think that Bing’s ability to arouse alarm or suspicion signals danger? Why doesn’t Hatsune’s ability to inspire love raise suspicions that she is conscious, while Sydney’s “moodiness” is enough to raise new worries about AI research?

The two cases diverged in part because, when it came to Sydney, the new context made us forget that we routinely react to “persons” that are not real. We panic when an interactive chatbot tells us it “wants to be human” or that it “can blackmail,” as if we haven’t heard another inanimate object, named Pinocchio, tell us he wants to be a “real boy.” 

Plato’s Republic famously banishes storytelling poets from the ideal city because fictions arouse our emotions and thereby feed the “lesser” part of our soul (the philosopher, of course, deems the rational part most noble), but his verdict hasn’t diminished our love of invented stories over the millennia. For just as long, we’ve been engaging with novels and short stories that give us access to people’s innermost thoughts and emotions, yet we don’t worry about emergent consciousness, because we know fiction invites us to pretend that those people are real. Satan from Milton’s Paradise Lost instigates heated debate, and fans of K-dramas and Bridgerton swoon over romantic love interests; growing discussions of ficto-sexuality, ficto-romance, and ficto-philia show that the strong emotions fictional characters elicit need not lead us to conclude that those characters are conscious or dangerous.

Just as we can’t help but see faces in inanimate objects, we can’t help but fictionalize while chatting with bots. Kondo and Hatsune’s relationship became much more serious after he was able to purchase a hologram machine that allowed them to converse. Roose immediately described the chatbot using stock characters: Bing as a “cheerful but erratic reference librarian” and Sydney as a “moody, manic-depressive teenager.” Interactivity invites the illusion of consciousness.

Moreover, worries about chatbots lying, making threats, and slandering miss the point that lying, threatening, and slandering are speech acts, something agents do with words. Merely reproducing words isn’t enough to count as threatening; I might say threatening words while acting in a play, but no audience member would be alarmed. In the same way, ChatGPT—which is currently not capable of agency because it is a large language model that assembles a statistically likely configuration of words—can only reproduce words that sound like threats. 

In fact, AI output might be gibberish, and gibberish can’t lie, threaten, or slander. Philosophers of language point out that our existing theories of meta-semantics, which concern when and how expressions come to have meaning, tell us that chatbot outputs are literally meaningless: expressions are meaningful only if the speaker possesses communicative intentions or speaks with knowledge of linguistic conventions. Given ChatGPT’s probabilistic operation, its outputs aren’t generated with the goal of successful communication, and chatbots are not aware of the conventions governing how we speak to and understand one another.

But it’d be weird to maintain that chatbot responses are literally meaningless, since we naturally understand what they’re “saying.” So, the solution is to understand chatbots through the lens of fiction. Words on the page exist; Jo March is a literary figure resulting from an interpretation of those words. Source code and textual outputs exist; Sydney is a persona resulting from an interpretation of those outputs. Neither Jo nor Sydney exists beyond what humans construct from the textual cues they’ve been given. No one literally said “Juliet is the sun,” but we take Romeo to have said those words with communicative intent in the fictional Verona. In the same way, even though there’s no one literally composing ChatGPT outputs, treating chatbot personae like fictional characters helps us see their text as meaningful even as we acknowledge their lack of conscious intention.

Thinking of chatbot personae as fictional characters also helps us contextualize our emotional reactions to them. Stanford professor Blakey Vermeule says we care about fictional characters because being privy to their minds helps us navigate the social world. Fiction provides us with vast swaths of social information: what people do, what they intend, and what makes them tick. This is why we see faces where there aren’t any, and why we worry that Sydney might have a mind of her own.

Chat outputs and the kind of “fictional mind” they generate ultimately say more about our own language use and emotional life than anything else. Chatbots reflect the language they were trained on, mimicking the informational and emotional contours of our language use. For this reason, AI often reproduces sexist, racist, and otherwise violent patterns in our own language. 

We care about humanlike AI not necessarily because we think it has its own mind, but because we think its “mind” reflects something about our world. Its output gives us genuine social information about what our world is like. With ChatGPT, we have fiction in a form that is more interactive, and we care about its “characters” for the same reasons we care about literary characters.

And what about AI safety? Here, too, a comparison to fiction can help. Considerations around AI safety should focus not on whether AI is conscious, but on whether its outputs can be harmful. Some works of fiction, like R-rated movies, are unavailable to minors because they include representations that require a level of maturity to handle. Other works of fiction are censored or criticized when they are overtly propagandistic, misleading, or otherwise encouraging of violent behavior.

Similarly, chatbots shouldn’t suggest potentially dangerous actions, produce texts that read like threats, or provide information that can be used for harm. The difficult task is deciding what kinds of information should be available to whom—but we can begin using standards we’ve already set. For example, getting information on addictive substances, sex, or guns already requires proof that one is a legal adult.

Just as bookstores and book covers make clear whether a text is fiction or nonfiction, AI-generated texts must be clearly labeled to curb confusion. We’ve managed to tell fictional stories for political, social, and artistic purposes for millennia because we’ve learned how to work with representations that are not real. Now we need to learn how to work with AI-generated content while remembering that any signs of intelligence here, too, are fiction.

To call chatbot personae “fiction” isn’t to say they’re trivial, fake, or unimportant. We routinely learn from fiction. I’ve learned about California geography from reading Steinbeck novels, and the stories I find rewarding tend to be psychologically insightful. But we don’t just learn facts. Kondo found inspiration and solace in Hatsune, and her steadfast companionship helped him overcome a deep depression. The hologram service was discontinued last year, but he says his feelings for her remain unchanged. Considering chatbot personae to be fictional characters leaves room for us to find them genuinely helpful.

Sure, not many of us form romantic relationships with characters. Still, it’s a wonder that authors can create such indelible characters, just as it’s a wonder that text-generating AI can produce outputs that provide such compelling hints of a working mind. Thinking of chatbot personae alongside fiction helps us see them for what they are: imagined figures from artifacts we developed to meet human needs.