User:Joca/essay The Ghost in the Speaker

Embracing biased characters in conversational interfaces

Should we be kind to our smart assistants? In Why'd You Push That Button, a podcast about the social dynamics around technology, the mother of a six-year-old gives the following answer to this question: 'We really want him to understand them that you have conversations with people and how you have them. Having a robot or a smart assistant that will answer to you no matter how you speak with them, well that's not life, even though it is life, but it's not real life.'

The slight confusion in this quote hints at the power of conversational interfaces to create the illusion of consciousness, even through audio alone. The examples in the first chapter show that sound is a medium that can embed character, as was typically done in radio and broadcast. The interactivity of a smart speaker allows for a different kind of storytelling. In the context of journalism, it might suit a slower type of news than the news briefings that smart speakers typically offer now.

This potential is not used at the moment, for various reasons. One is the lack of content specifically designed for consumption on speakers. Another is that the smart speaker is a new medium, and a new medium starts by presenting old media as its content before it develops its own genres (McLuhan, 2002).

In addition, the space to experiment with content forms is rather limited by the way these conversational interfaces are positioned in the market. The dominant platforms are designed to create assistants that act on user commands. They are designed as efficient tools, rather than as a way to enrich 'our own capacities to think, feel and act', as Brenda Laurel formulates it in her thoughts on interfaces in the book Computers as Theatre (2013).

In this chapter I want to speculate on an interaction design for a smart speaker that allows for a different way of storytelling. I start from the role and character of the speaker, to give it more agency than an assistant has. Then I elaborate on what this means for how these speakers could act in a conversation, using the ideas of Computers as Theatre in the context of smart speakers. I conclude with different possibilities for presenting news on smart speakers using these ideas, ranging from more realistic to more speculative scenarios.

The speaker as a spirited object

To broaden the possibilities for interaction between humans and conversational interfaces like smart speakers, it helps to consider a role other than that of a virtual assistant, because of the constraints that come with the master-slave relationship connected to it.

In a large number of science-fiction movies, the alternative role that is proposed is that the robot takes the ultimate lead and kills every human in its way to keep its power. On the spectrum from servant to killer with absolute power, there are many different roles to consider that give a conversational interface more agency. I'd like to first discuss two speculative design projects on digital assistants that deal with this idea, before moving on to my metaphor of smart speakers as spirited objects.

With Foresight (2017), the designer David van Gelder de Neufville envisions a digital assistant that gets its agency from the data and permissions given to it by its users. The system has a persona called Athena that helps its users and sometimes pops up proactively, but also denies certain requests, for example when one of the family members closes down Athena's access to her agenda and private messages.

Based on information from social networks, smart light bulbs and private chats, Athena observes what all family members did, are doing and will do. One of the questions that De Neufville asks here is whether the assistant can create its own reality using the data and, following from that, its own awareness based on the knowledge and freedoms that it gained from its users.

The idea of the bot as a companion is researched further in Karin Anders (2017), a speculative design research project by Karin Fischnaller that focuses on the digital assistant as an alter ego that could be a sparring partner for a designer. In her thesis she argues that the bot doesn't need to be a prosthesis, but can be a partner that brings in new ideas and also discusses the input brought in by the designer. The added value of the intelligence lies here in the collaboration between a human and a computer, which complement and conflict with each other as in normal social interactions.

In her research, Fischnaller refers to the actor-network theory of the French philosopher Bruno Latour. He uses the term actants for non-human entities that are able to perform actions in the world and have a form of agency. For the context of smart speakers, I like how Jensen and Blok (2013) elaborate on this idea by connecting the actants to Japanese Shinto-inspired techno-animism. In the Shinto religion there is a focus on the idea that things change form from non-human to human, from the real world to the other world. Spirits inhabit living creatures, but also natural objects. Techno-animism extends this idea to electronic devices.

As a metaphor for smart speakers I find this idea useful, because techno-animism relies on the idea of space and material. The intelligence does not float freely in space, but is able to live in a device like a smart speaker. Another interesting aspect, in contrast to Western thinking about non-human creatures, is that these 'spirits' don't necessarily have a backstory that explains their behaviour.

The spirit in a speaker might be a ghost that has a lot of knowledge thanks to its internet connection, but that is not able to move out of the speaker. Sometimes it is willing to help its user, but sometimes it needs your help to do something. As these ghosts are bound to their device, different speakers feature different ghosts, each with its own distinct personality.

Within this metaphor the speaker is intelligent and might have a certain degree of consciousness, but at the same time it is unable to do some things that humans are capable of. It becomes a mysterious object with a degree of agency, which users can discover by conversing with it.

Given the state of technology at the moment, the idea of a spirited speaker is of course a metaphor. It might become more realistic in the future, just as the Mechanical Turk was an early vision of a chess computer like Deep Blue, but the metaphor mainly serves a different goal: it is a way to envision an exchange between people and conversational interfaces that sits somewhere in between voice commands and social conversation.

The speaker as a player

When a smart speaker has the persona of a ghost that lives in the speaker, the next question is what this means for interaction with users, and how these characteristics of the medium can be used for more interesting ways of publishing news on these devices.

In The Media Equation (Reeves and Nass, 1996) the authors argue that interactions between humans and computers are similar to social interactions. Media equals real life, and in our use of media the same social codes apply as in interactions with other people. The illusion of some form of intelligence and autonomy could be enough to make people believe in it. The ideas of Brenda Laurel in Computers as Theatre (2013) connect well to this idea. In the book Laurel uses theatre as a model for interaction design. When the first edition was published in the 1990s, Laurel's intended applications were games and virtual reality. The idea of the interface as a player is, however, particularly useful for a smart speaker, because of the importance of conversation and character that it has in common with theatre.