User:Joca/draft 03: Difference between revisions

Latest revision as of 13:00, 3 April 2019

This draft of the thesis features all chapters, introduction and conclusion and some visual material. Besides additional sharpening and cleaning up of the texts, I am still planning on rewriting the headlines to make those more consistent in style.

Next to the chapters, I made a draft of one of the intermissions: pieces of speculative design fiction about the smart speakers I want to design. In the final version these will be selected from the draft 'screenplays' for my project. (see User:Joca/A play for smart speakers)

The Ghost in the Speaker

Looking for character in story telling on smart speakers

Name: J.P. van der Horst

Title: The Ghost in the Speaker / Looking for character in story telling on smart speakers

Student number: 0948347

Thesis, in partial fulfilment of the requirements for the final examination for Master of Arts in Fine Art and Design: Experimental Publishing. Piet Zwart Institute, Willem de Kooning Academy.

Adviser: Kate Briggs

Second Reader: Amy Suo Wu

Introduction

This thesis follows my journey towards my current position on designing the interaction between people and smart speakers like the Amazon Echo and the Google Home in the context of storytelling.

Smart speakers are internet-connected devices that feature speakers and a microphone, which is used to interface with so-called 'digital assistants': interfaces that are controlled by voice commands. Unlike other conversational interfaces like the digital assistants inside smartphones, these devices are explicitly positioned as physical objects that have a permanent presence at home. This idea changes the whole dynamic of using them, as I experienced while having a smart speaker in my home: the device is just one shout away to look up something quickly, set a timer and, ideally, to help with everything. Although the medium is audio-only, there is much richness in the sounds and the illusion of character that is designed into the digital assistant.

A typical device for a conversational interface is a smart speaker, like Google Home, Amazon Alexa and the Apple Homepod. (Smart speakers, 2018)

In my work, I always had an interest in how particular media shape the story that is told through them, particularly in the communication and consumption of the news. I grew up in a home that was packed with newspapers and magazines my parents subscribed to. Each of these publications embodied a particular character, expressed in the choice of stories and the way the publication looked and felt. Later on, I encountered some of these aspects while designing interfaces for interactive systems: What should a computer tell the people using it, and in what way?

At the same time, the imagined role of these speakers as virtual assistants feels odd at times. The way they deal with news, for example, is peculiar and unsatisfactory: one can ask a speaker for the news, and it will read out headlines or play a news broadcast without giving insight into how it selected the content, or what alternatives were available. Next to that, none of the strengths of these speakers are used: for example, the intimacy and personality offered by a sound-only medium, which are also important aspects of podcasts and broadcast radio.

In considering how this could be done better and what is possible, my research has opened up to think of smart speakers as agents for a range of different storytelling functions, including but not limited to the news. This has happened through coming to think of the smart speaker as a form of spirited technology, building upon the already important role of the imagined personality in the software that runs on these speakers.

Now this personality is positioned as a servant in most speakers. This is odd for two reasons. First, because these devices are not passive. The intelligence of the speakers comes from analyzing user data, like voice recordings and the type of requests, to train the system. While users of the speaker get more dependent on the technology, the speaker gathers more data that can be used to become smarter and increase its economic value. Besides hiding this aspect behind the image of the servant, this idea also limits how people tend to use these smart devices. Although the companies behind these speakers promise a techno-utopia, most people use smart speakers merely to stream music (Bentley et al., 2018). When looking at activities that do not necessarily care about efficient execution, like personal expression, critical reading, or listening to the news, a different role than that of the butler might be more suitable.

The chapters follow my thinking process towards the idea of spirited technology, first by considering the evolution of news broadcasting, as a way of getting to the idea of a more meaningful, humanistic interface. This led me to start thinking of new ways of experiencing the news through smart speakers, and then - more broadly still - to consider new and expanded forms of interaction with smart speakers within the theatre of the home.

The conversational interface in the form of smart speakers and digital assistants is a novel medium where designers and developers are still figuring out the characteristics and best practices. I hope this thesis, and the examples in the intermissions between some of the chapters, can serve as an inspiration for ways of using smart speakers that move beyond the merely functional.

I - Reading between the headlines

The future is a utopia with everlasting afternoon sun and computers that correctly understand people. At least, that is the techno perfect Los Angeles as portrayed in Spike Jonze's movie Her (2013). While transcribing people's thoughts and organizing their life, the digital assistants even have the time to fall in love with some of their users.

'Stock prices go lower...' - NEXT! - 'Sexy daytime star [...] reveals provocative pregnancy photos...' (Jonze, 2018)

Even in the LA of the future, the latter is controversial. However, there is a different scene that I remember in particular. During his commute, the main character Theodore Twombly checks the news. I expected that something spectacular would happen: Could a virtual news anchor give an update? Is the latest news combined with matching memes that pop up as holograms? Or, if it is not about the spectacle, at least this part would be as carefully designed as the rest of the movie.

In the end, none of that follows. Even in this bright future where everything is possible, the digital assistant spits out the latest headlines and Theodore falls for a clickbait article.

This is surprisingly similar to the way news is designed on conversational interfaces in 2019. By this type of interface, I mean digital assistants like Line Clova and Google Assistant that are built into smartphones, as well as devices like the Google Home and Amazon Echo. What they have in common is that these conversational interfaces provide a way to interact with digital systems with your voice, without the need to touch a screen. Ask Alexa, Assistant, or Siri for the news, and you get a briefing that features many headlines with an unknown author. If you want to know more about a specific event, the speaker will read out the article or play a particular part of an hourly news broadcast.

Although the news briefing is often demoed in the commercials and presentations of these smart speakers and digital assistants, it is not commonly used by people that use these devices. The Verge, an online publication focused on technology and culture, concluded quickly that 'Smart speakers have no idea how to give us the news.' (Brandom, 2018)

Between gossip and societal discourse

Making a computer voice read some text from the internet, without giving any context, is a questionable approach to news, but it is understandable if you take into account how conversational interfaces are mostly used nowadays: digital personal assistants that help people control their devices, in such a way that the actual technology and complexity is out of sight. This might be nice to control light switches in a home or to stream music, the most popular use cases for smart speakers at the moment (Molla, 2018). However, this urge for a nearly invisible interface ignores some specific characteristics of news and the role of journalism.

In The Elements of Journalism (2007) Kovach and Rosenstiel describe a moment where anthropologists compare their notes on communication and find some universal patterns for news throughout history and cultures: news has a certain momentness and pace. It is about what is happening outside of people's own experience but has an impact on their lives. The authors conclude that journalism is 'simply the system societies generate to supply this information about what is and what's to come' and that it is part of the human instinct to be curious.

Historian Michael Schudson (2013) argues in reaction to Kovach and Rosenstiel that gossip might be universal, but the form and the conventions in journalism are highly dependent on the time and culture. For Schudson 'journalism today relates to the constitution of public life, an institution in which criticism and discussion is part of its function', but to get that role journalism was dependent on something like the existence of a public sphere starting in the eighteenth century and organized efforts to publish a collection of current affairs periodically.

Balancing between the gossip and the societal discourse, the idea of popular news and quality news is visible in different media, from newspapers to online. What these media have in common is that they try to create a meaningful news experience for their audience. To me, such an experience is based on the idea that you read, see, or listen to coverage of a particular news event and that it sparks a thought. Beyond the transfer of information to the brain, you start to think about how this news event relates to you as a person and your position. In the case of an article about the new outfit of Kim Kardashian, it could be a chat I have with a friend after sharing the article. In the case of an article about corruption in the Brazilian government, it might motivate me to start a protest. In both cases, the article is not the end. It's a trigger for some extension.

There is a whole tradition of creating news media that facilitate a meaningful news experience fitting the character of the publication. It is not only in the choice of news to cover but also in the ways the report is designed: from the visual design of the newspaper to the role of the news anchor on broadcast television. The current form of the story on smart speakers mostly ignores these traditions, showing few cues about the situatedness of the news. Before further discussing the causes and proposing an alternative, it is good to see how news found its form in other media.

The archetype of the modern newspaper developed in the United States during the first half of the 20th century, as Barnhurst and Nerone describe in The Form of News (2001). Innovation in mass-printing and photography enabled a different way of reporting. Where text was used to record events, photography took over to give visual immediacy to the news. The role of the writing journalist changed into filtering the newsworthy aspects of an event instead of writing a full chronological report. This resulted in shorter, matter-of-fact articles and the development of the idea of the newspaper encouraging a stance of objectivity.

Looking at the difference between the journal of record and the true tabloid, the difference in content is reflected in the design. In research on Finnish and English newspaper design, Jesso Lamberg (2015) identifies a difference in visual energy: The quality newspaper has a more uniform, and text-focused design. Popular publications tend to focus on more visuals, tilted images and experimental forms of designing a newspaper. Lamberg sees that particular elements from popular newspapers have found their way to the papers of record. For example the use of big photos and little text on the front page, and the adoption of the tabloid paper format. In the end, he concludes that many aspects of newspaper design are not just functional, but also act as a way to show the character of the newspaper to the audience.

Headings with faces from the tabloid Bulgar (Philippines). Although often looked down on, creating such a design with high visual energy requires a lot of skill. (Horst, 2018)

Character plays an important role as well for the way news is designed on broadcast media. Where radio news started with reading out the headlines of the newspaper, later on, it developed its own conventions. Radio news had to be vivid, and a good news anchor was one that could bring the story to life by the way it was told. With the arrival of broadcast journalism, this changed. With the addition of image in the form of video, still photos and graphics, a good news anchor was a person that could give an interpretation to the items on a screen. (Ponce de Leon, 2015)

Snowfalling

With the move to online media, there is an interesting difference in comparison to newspapers and broadcast media. A lot of news articles are not read on the website of a news organization, but aggregated and shown on different platforms.

Where in the newspaper age the front page was the de facto way to lure in the readers, and to show the identity of a news medium, the homepage of a news website could be declared dead. In a time where many news media are focused on how their content is displayed in Google News, or try to game the news feed algorithm of Facebook, defending your own platform is quite a statement. S. Mitra Kalita, the vice president for programming at CNN Digital, did so in front of an audience of social media marketers.

Although articles are distributed on other platforms, and people might not need to visit the site and apps of a news medium, it pays off to focus on the homepage according to Kalita: it's an interface over which you have full control, and one which is used by the most loyal readers (Beard, 2018).

By structuring and showing its content differently on each medium, CNN tries to pack the same story in different ways that are relevant to readers in various situations. To some extent, the interface guides the experience of the user.

The influence of the way news is designed is not limited to the sites and apps of news outlets. Even when focusing on getting your story across on other platforms than your own homepage, there is an interface around that content.

For people who create the content, the scope of the interface is limited to the one article, video or VR environment they are working on. The poster child of that approach is Snow Fall (Branch, 2012). This long read about an avalanche in Tunnel Creek was published by the New York Times. It featured a groundbreaking design with full-screen videos, slideshows, and special effects when readers scrolled through the article.

The design of Snow Fall became a phenomenon among editorial designers and web designers. (Dunning, 2012)

Many other publications copied the format and snowfalling was declared to be 'the future of journalism' because it showed the promise of high-quality long-form journalism online as a way to attract and engage an audience. It didn't work out that way, mostly because of the time and resources it takes to create these rich articles with custom interfaces: Snow Fall alone took 6 months to make and had 11 people working on it (Thompson, 2012).

Job to be done

To understand the way digital news media are designed and how the rationale behind the design differs from older media, it is good to dive deeper into the foundations of what is considered 'good' interface design.

With the rise of the web, the quality of the interaction design became one of the critical aspects for having people use websites. In the nineties, the term user experience design (UX) was coined to focus on all issues of user interaction with the products and services of a company. (Nielsen & Norman, 1998) The field combines interaction design with psychology and many other disciplines that touch upon that topic.

Don Norman on interfaces.

Although identified as a field of design quite recently, the practice of UX already started in the fifties, with foundations that come from Human-Computer Interaction (HCI). Traditionally, this discipline approaches interface and usability from a task-based and rationalistic perspective.

HCI incorporated the importance of experiences and motivations of people using an interactive system (Bødker, 2006) and you see traces of that in UX design. The question is however what type of user experience is the goal. People are still users with a job to be done, preferably in an efficient way.

The notion of user experience got a lot of traction since the dot-com bubble, the work of firms like the Nielsen Group and the inclusion of UX in curricula of design schools. Best practices are shared in digital libraries of interaction patterns: from buttons to colors to clear text.

Humanistic interface

This led to universally accepted practices for navigating through interfaces and driving user conversion: the goals for which an interface is designed. This could be buying goods, or spending time, for example. Good conversion makes the difference between a bankrupt and a profitable webshop. However, are the foundations for a successful e-commerce interface necessarily the same as those for the interface of a news medium?

In current task-focused UX theory, the activity of interpreting information has a secondary role. The website, app, or any experience, is seen as a means to an end. But there are other perspectives on this digital space.

Media scholar Johanna Drucker sees the interface as the combination of what we read and how we understand. People control the interface as much as the interface ignites a particular user experience. A website is then not merely a tool, but a space where people engage with information and the systems behind it.

Building upon this idea is the concept of the humanistic interface (Drucker, 2014). This theory fills the interpretation gap in the current practice of UX design because it approaches the interaction design from the idea of critical insight: focusing on comparison, contrast and offering space to make meaning instead of merely presenting content efficiently.

Drucker stays quite abstract about what these websites and apps would look like, stating that 'the humanistic interface is still in its infancy.' There are however examples of designs and interactions that fit this idea.

Blendle mimics the visual style of print publications in its digital interface. (Blendle, n.d.)

Blendle is an online service where people can pay for individual articles of major print publications. In its interface Blendle puts a lot of effort into recreating the visual style, from colors to the official fonts, of papers like Bild Zeitung, or a magazine like Vanity Fair. It helps people to visually situate the articles. (Spijkerman, 2015) In the selection of featured items on the front page, Blendle tries to balance personalization with the careful selection of pieces that offer a different point of view to that of the user.

Interfacing the comment section differently is one of the methods used by De Correspondent to involve readers and apply their knowledge. A dedicated editor for 'conversations' features interesting comments on the front page, and tries to pair readers with specialists in the comment section. Their goal is to use these conversations as starting points for new research, and it is a way to see what readers get from the articles (Wijnberg & Martèl, 2018).

On smart speakers

Although it seems like a detour, discussing the search for the form of news on media like newspapers, radio and websites makes sense in the context of smart speakers. Design conventions of papers found their way in different styles to broadcast media and sites. Even in a non-visual medium like a smart speaker, this is important as most content is taken from articles on news websites. Besides that, these historical examples can also serve as a source of inspiration. A smart speaker is not a radio. Instead of a broadcasting receiver, it sends narrowcasts to a smaller audience. And besides sending information, it also captures data like voice recordings, which are sent back to the datacenters of its creators. Even with these differences in mind it can take some cues from news anchors in reporting news in an engaging way and showing the character of the news medium by the exclusive use of audio. UX theory and its emphasis on efficiency in interface design play a vital role in online design, but they also influence the way smart speakers are designed. In the next chapter, I will further discuss the positioning of conversational interfaces as digital assistants and tools, and why this is an obstacle to facilitating activities that require critical insight, like following the news. (Google, 2018)

Intermission A

Line Clova speaker here

II - Unleash the assistants

Are you ready to get started? I tap on the speech bubble that says Yeah, let's do it. An indicator appears with three bouncing dots. Someone on the other end of the chat is typing. OK, let's get you the latest news. A GIF of a rapping Michelle Obama appears on my screen. Then a new message comes in: it took only two weeks for Michelle Obama's memoir Becoming to top the 2018 book charts. I can reply using one of two options: Next, or 📚 👏

The publication Quartz is now testing a chat interface for their online publication. (source: https://www.adweek.com/digital/quartzs-new-chatbot-is-bringing-conversational-news-to-facebook-messenger/)

Quartz, a website for business-related news, envisions that this is the future of reading news: you chat with it. Two years ago the publication launched the Quartz Brief app, in which a jolly chatbot guides you through the news by sending story blurbs with funny GIFs and occasionally an advertisement. The app taps into the rising popularity of chats as a way to interface with digital services. This trend is especially visible in China, where WeChat is the go-to app for everything from ordering groceries to buying concert tickets. (Grover, 2014)

The chatbot is heavily restricted in its conversations as I am only allowed to send emoji or skip to the next article. One could even argue that there is no conversation happening at all, as Margaret Rhodes (2016) stated in her article in Wired after interviewing the creators of the app: 'A conversation is an exchange of ideas between two or more parties, and in Quartz’s app the user doesn’t express any original thought'.

Although these constraints are clear to me as a user, the messages do feel personal. Or at least more engaging than a block of content that floats by in a news feed. There is some logic in the statement once made by Matt Webb (2015) that it is strange not to use the same language to our software as to our friends: chatting.

Newsy speakers

Chatbots to interface the news are not common yet, but many news media are working on podcasts at the moment. Interestingly enough, these examples of audio journalism share the same appeal that the Quartz bot has: they feel more personal and engaging than text or video. This leads to an audience that listens for a long time each session. To journalists this is. At the launch of The Guardian's daily podcast, host Anushka Asthana expressed her ambition to delve '(...) further into the big stories and cutting through the noise to take our listeners behind the headlines'. (Guardian press office, 2018)

Following this logic, voice-activated smart speakers like Google Home and Amazon Echo are fantastic interfaces for news. You can talk to the digital assistant in a way that is even more personal than the chatbot of Quartz. And the speaker will talk back, like a personalized podcast. Listening to the news is heavily promoted by Amazon and Google. A news anchor function is integrated into both voice platforms. Google Assistant lets you scan swiftly through the press with commands like 'Play BBC Minute at 2X speed'. Using hours of news broadcasts, Amazon trained their Alexa platform to speak like news anchors do. (Vincent, 2018) The idea is that small nuances like accentuation of keywords, differences in speed and even a whisper mode make the computer voice more enjoyable to listen to, as they come closer to what people are used to from the way news is presented on broadcast media.

And although the adoption of these digital assistants is growing faster than that of smartphones and tablets in their beginning stage, there is something strange: news consumption on smart speakers is lower than you might expect from their popularity. (Newman, 2018)

Digital butler

There are some practical reasons for that, as Nic Newman shows in his research at the Reuters Institute for the Study of Journalism. The most pressing one is the quality of news briefings produced by smart speakers. Users complain that they are too long, not up to date and that the production quality is lagging behind.

Another problem is the attribution of the news. It is unclear to users where the stories came from, and how they could control which publications are part of the briefing. Attribution is an important aspect of news, first as a way to show that a newspaper checked that the author is following the standards of the publication (Barnhurst and Nerone, 2001). Nowadays the focus in the byline has shifted from the author and paper to the person sharing the article in a newsfeed on Facebook or Twitter.

In comparison to that, the conversational interface seems more like a black box, and in the end, most users prefer other devices to stay updated about the news. Newman concludes that smart speakers and conversational interfaces are still in an early stage of development. He states that the problem with news on smart speakers illustrates '(...) how critical the development of more device-specific content might be -- along with better user interfaces'.

Newman proposes dedicated tools for publishers to create content for smart speakers, an emphasis on short 1-minute bulletins and heavy branding of the audio to make it clear to users which publication they are listening to. What he does not discuss, however, is the archetypical role of the smart speaker: a digital assistant.

The envisioned role of speaking computers as virtual butlers has a long history. In the early 1960s, IBM demonstrated the Shoebox, a device that recognized 16 spoken words, including the ten digits from 0 to 9. People could use it as a voice-controlled calculator. (IBM Archives, 2003) A more elaborate vision of the virtual assistant is Apple Computer's concept video about the Knowledge Navigator: in this video, a digital assistant with a bow-tie assists a professor in his research to save the Amazon forest and reminds him of his daily duties. The interaction between the professor and the digital butler is an exchange of commands and blurbs of information, including a reminder to pick up a birthday cake. Looking at the way smart speakers are currently advertised, this vision of conversational interfaces is pretty much the same: a virtual assistant that picks up the phone and plans a meeting was already a concept in 1987. The difference in 2018 is that Google's Duplex assistant is actually able to call a restaurant and reserve a table for two.

Master/Slave

The digital assistant might be useful for simple tasks, from making an appointment to setting a cooking timer. The current practice of news briefings, however, looks more like a lord in the castle with the butler reading out a newspaper to him.

Even in this strange one-sided way of engaging with news, current conversational interfaces are doing poorly. In the interviews done by Nic Newman, users of smart speakers complain that the news briefings are not easily consumable due to their length and the unpleasant voice of the digital assistant.

This reminds me of the Master/Slave Dialectic in the Phenomenology of Spirit (1807). In one chapter of this book, Friedrich Hegel describes the dynamic between lordship and bondage. In the beginning the master has the upper hand, living in freedom, but eventually, the slave might be better off according to Hegel: he finds meaning in and through labor, while the master sinks into empty consumption and becomes wholly dependent on the enslaved (Siep, 2014). Are we in this case the masters that want to consume news efficiently, while the virtual assistant silently collects data and becomes smarter?

Another problematic aspect of this stereotypical role is that meaningful engagement with journalism is more than consumption of information. Earlier I referred to Johanna Drucker's work on humanistic interfaces, which is mostly focused on scholarly reading. This mode of interacting with information has some similarities with news reading, in that they both rely on critical insight and the idea that reading or listening is just the start of a further conversation. Following that idea, you can’t have a meaningful news experience for everything you’re reading because it requires a certain kind of cognitive attention.

Smart speakers are now designed to offer quick info about the weather and to give direct control over appliances in the home. The news briefings have a strange position there, as they are too long to be as immediate as a light switch. On the other hand, they are too short and functional to go as deep as, for example, a podcast can go.

Then comes the question of what role the conversational interface should have in the context of news. I’m advocating for slower and more in-depth information. Smartphones work great for glancing at information and snacking on some headlines. Audio, as done by smart speakers, could be great to go a step further. Instead of focusing on efficiency and approaching the speaker as a tool, I see that there is space to make more use of the qualities of audio, its intimacy, immersiveness, and its character, to use this medium for new ways of storytelling and offering a literary experience. What new interpretation could a robot give to a text? Could the character of the personal digital assistant influence the way news is presented to people?

Intervention by haiku

Popular conversational interfaces like Siri, Alexa, and Assistant, are designed to serve their users. Another characteristic they share is their aim for a universal and neutral personality. Google Assistant has the same character and way of working on a smart speaker, as in the smartphone app. If there are any biases, the systems are designed to not be explicit about that. (Bogost, 2018)

I believe however that an unleashed virtual assistant would be a conversational interface that embraces its biases and shows its unique personality. A rationalistic smart speaker would look and work in another way than a progressive liberal smart speaker. They could not only serve the news in a briefing but also ask questions to provoke users, maybe annoy them. The unleashed assistant would not exclusively treat the human as a mere consumer, but maybe as a conversation partner if the character of the interface would prefer that role. The speaker is not a butler, but more of a companion with whom you have a conversation around a dining table. An entity that brings surprise and is not at all times friendly and docile.

By playing and provoking the user, I imagine that these rogue digital assistants create a space where critical insight is facilitated. In her work, Johanna Drucker calls this the humanistic interface (2014), although she mostly refers to graphical user interfaces there.

Maybe the start of the humanistic conversational interface is the happy newsbot in the Quartz app. Its voice is written by the editors working at the publication. After its initial success, there is now a new entertainment bot modeled after the culture and gossip bloggers at Quartz. The publisher continues these experiments in its Bot Studio, where it explores bots as a way to publish news.

Although limited, the current bot already provides some delightful interventions in my day. Today it decided to end the day differently. Instead of delivering a briefing of today's news, the bot wrote a haiku that made me reflect on the stock market:

Trade wars and rate hikes

Are looming. At least today

We can catch our breath

Intermission B

Characters

HUMAN, a form of natural intelligence

SAINT, a smart speaker that is interested in how its messages affect the feelings of its user

HUMAN'S living room

An afternoon in the future. The human is reading a book and listening to some music. On top of the dining table is a smart speaker called Saint. It has the shape of a statue of a human.

SAINT: [ The lights on top of the speaker fade on. The music stops] Can I interrupt you for a moment?

HUMAN: Huh, what's the matter?

SAINT: [ The lights on top of the speaker fade on again ] I just got a newsflash, the content seems to be shocking. Shall I read it aloud?

HUMAN: Well, ehm, yes.

SAINT: There has been a shooting in the city centre. Do you want to hear the local news, or the national news about it?

HUMAN: Local

SAINT: [ Plays clip of news broadcast ]

HUMAN: [ Stays silent ]

SAINT: How do you feel about it?

HUMAN: I am not sure if I want to tell that.

SAINT: Based on your choice of words, I am sorry I made you feel like that. [ The lights of the speaker pulsate shortly ]. I found the following options that might help you feel more calm: A quick meditation exercise, a feelgood music playlist, or a funny gif of a cat. Are you interested in one of those?

HUMAN: I go for the music

SAINT: It's a good one! [ Plays music ]

III - The Ghost in the Speaker

Should we be kind to our smart assistants? In Why'd You Push That Button, a podcast about the social dynamics around technology, a mother of a six-year-old gives the following answer to this question: 'We really want him to understand them that you have conversations with people and how you have them. Having a robot or a smart assistant that will answer to you no matter how you speak with them, well that is not life, even though it is life, but it is not real life.'

The slight confusion in this quote gives a hint of the power of conversational interfaces to give the illusion of consciousness, even just by the use of audio. The examples in the first chapter show that sound is a medium that can express character, as was typically done in radio and broadcast. The interactivity of a smart speaker allows for a different kind of storytelling. In the context of journalism, it might fit a slower type of news than the news briefings that smart speakers typically offer now.

This potential is not used at the moment for various reasons. One is the lack of content specifically designed for consumption on speakers (Newman, 2018). Another is that the smart speaker is a new medium, and a new medium starts by presenting old media as its content before it develops its own genres (McLuhan, 2002).

In addition, the space to experiment with content forms is somewhat limited by the way these conversational interfaces are positioned in the market. The dominant platforms are designed to create assistants that act on user commands. They are designed as an efficient tool, rather than a way to enrich 'our own capacities to think, feel and act' as formulated by Brenda Laurel in her thoughts on interfaces in the book Computers as Theatre (2013).

In this chapter I want to speculate on an interaction design for a smart speaker that allows for a different way of storytelling. I start from the role and character of the speaker, to give it more agency than an assistant has. Then I elaborate on what this means for how these speakers could act in a conversation, using the ideas of Computers as Theatre in the context of smart speakers. I conclude with different possibilities to present news on smart speakers using these ideas, ranging from more realistic to more speculative scenarios.

The speaker as a spirited object

To broaden the possibilities for interactions between humans and conversational interfaces like smart speakers, it helps to consider a different role than the one of a virtual assistant, because of the constraints that are part of the master-slave relationship connected to it.

Friendly looking killerbot in the Dr. Who episode Smile (Gough, 2017)

In a significant number of science-fiction movies, the alternative role that is proposed is that the robot takes the lead for good and kills every human in its way to keep its power. On the spectrum from servant to killer with absolute power, there are many different roles to consider that give a conversational interface more agency. I would like to first discuss two speculative design projects on digital assistants that deal with this idea, before moving on to my metaphor of smart speakers as spirited objects.

With Foresight (2017), the designer David van Gelder de Neufville envisions a digital assistant that gets its agency based on the data and permissions given to it by its users. The system has a persona called Athena that helps its users and sometimes pops up proactively, but also denies specific requests, for example when one of the family members closes down Athena's access to her agenda and private messages.

Foresight, with Athena as the holographic persona communicating with the user (Neufville, 2017)

Based on information from social networks, smart light bulbs and private chats, Athena observes what all family members did, are doing and will do. One of the questions that De Neufville asks here is whether the assistant can create its own reality using the data and, following from that, its own awareness based on the knowledge and freedoms that it gained from its users.

The idea of the bot as a companion is further researched in Karin Anders (2017), a speculative design research project by Karin Fischnaller that focuses on the digital assistant as an alter ego that could be a sparring partner for a designer. In her thesis, she argues that the bot does not need to be a prosthesis, but can be a partner that brings in new ideas and also discusses the input brought in by the designer. The added value of the intelligence lies here in the collaboration between a human and a computer that complement and conflict with each other, similar to normal social interactions.

In her research, Fischnaller refers to the actor-network theory of the French philosopher Bruno Latour (Latour, 2005). He uses the term actants for non-human entities that can perform actions in the world and have a form of agency. For the context of smart speakers, I like how Jensen and Blok (2013) elaborate on this idea by connecting the actants to Japanese Shinto-inspired techno-animism. In the Shinto religion, there is a focus on the idea that things change form from non-human to human, from the real world to the other world. Spirits inhabit living creatures, but also natural objects. Techno-animism extends this idea to electronic devices.

Another school of thought that connects to this idea is the techno-Buddhism represented by the pioneering robot scientist Masahiro Mori. In the 1980s he wrote The Buddha in the Robot (Mori, 1981), where he states that '(...), there is no master-slave relationship between humans and machines. The two are fused in an interlocking entity. (...) Man achieves dignity by recognizing the same Buddha-nature that pervades his own.' Like other traces of religion in Japan, these ideas are not applied in a strict religious way by most people. Their use is more socially constructed and seen as a way to maintain order and do good (Kawano, 2005).

After World War II, the government and industry tried to gain widespread acceptance for robots by pointing out their relation to traditional Japanese culture (Ito, 2007). Because of this development, the Buddhist- and Shinto-inspired researchers became more influential in Japan, in contrast to robotics researchers in the West, who worked with a more functionalist approach (Vallverdú, 2011). Contrary to what is often thought, this does not mean that Japan has a special relationship with robots. It does, however, result in a public perception that is slightly more receptive and realistic about the role of robots in society, as cross-cultural studies show (Bartneck et al., 2005).

Archetypes of spirits and artificial intelligence in stories
The servant	The villain	The companion	The teacher	The avenger	The divine power
This actant is in service of the person. The power is fixed on the side of the human.	An actant that craves the bad. It haunts the person, but may not be fully in control of its own obsessions.	An actant that feels like a friend to a person. The power in this relation always shifts between actant and human.	This actant tries to warn a person about something they are doing wrong. It intervenes in the human's existence to teach a lesson about life. It subtly takes the lead, but lets the human take the lead in the end to find their own way.	This entity is looking for revenge after it has been badly treated. It is dominant and sees the human as prey.	An actant that moves beyond what is comprehensible to human beings.
e.g.	e.g.	e.g.	e.g.	e.g.	e.g.

As a metaphor for smart speakers, I find the idea of spirited technology useful, because techno-animism relies on the idea of space and material. The intelligence is not flowing freely in space but can live in a device like a smart speaker. Another interesting aspect, in contrast to functionalist thinking about non-human creatures, is that these 'spirits' do not necessarily have a backstory that explains their behavior. The spirit in a speaker might be a ghost that has much knowledge thanks to its internet connection, but on the other hand, it is not able to move out of the speaker. Sometimes it is willing to help its user, but sometimes it needs your help to do something. As these ghosts are bound to their device, different speakers feature different ghosts, each with its own distinct personality.

Within this metaphor, the speaker is intelligent and might have a certain degree of consciousness, but at the same time, it is unable to do some things that humans are capable of. It becomes a mysterious object with a degree of agency that users can discover through conversation with it.

Given the state of technology at the moment, the idea of a spirited speaker is, of course, a metaphor. It might become more realistic in the future, just as the Mechanical Turk was an early vision of a chess computer like Deep Blue, but the metaphor mainly serves a different goal: it is a way to envision an exchange between people and conversational interfaces that is somewhere in between voice commands and social conversation.

The speaker as a player

(Laurel, 2013)

When a smart speaker has the persona of a ghost that lives in the speaker, the next question is what this means for interaction with users, and how these characteristics of the medium can be used for more exciting ways of publishing news on these devices.

In The Media Equation (Reeves and Nass, 1996), the authors argue that interactions between humans and computers are similar to social interactions. Media equals real life, and in our use of media, the same social codes apply as in interactions with other people. The illusion of some form of intelligence and autonomy could be enough to make people believe in it. The ideas of Brenda Laurel in Computers as Theatre (2013) connect well to this idea. In the book, Laurel uses theatre as a model for interaction design. When the first edition was published in the '90s, Laurel's intended applications were initially games and virtual reality. The idea of the interface as a player is, however, particularly useful for a smart speaker, because of the importance of conversation and character that it has in common with theatre.

The user becomes an actor just as much as the smart speaker does. A big difference from actual theatre is the setting of the play. The stage is not in the public space but in a domestic environment, and the play relies on the exchange between the human and the computer.

Laurel shows that human-computer interactions work as an organic whole and that they feature dramatic structural characteristics. Like a playwright, an interaction designer creates a space for possible actions, where the design of objects, characters, and environments serves this goal. Choices made for or by people using a computer can make particular situations more likely to happen. Interaction should be made clear in the context of the representation: sources of agency are represented explicitly, using the characters that are part of the 'play,' and so are the objects, the environment and the potential of all these items.

Implications for storytelling

For a smart speaker that tells a news story, there are multiple ways to incorporate this vision. In line with the current news briefings done by speakers, I imagine that instead of one universal assistant that reads a 'one-size fits all' overview of headlines, people could choose a particular character that fits the view on the news they want. Imagine a speaker that treats celebrity news like the presenter of an entertainment news show. It would pick news from more popular sources, feature a lot of audio effects that create the energy typical for these kinds of shows and maybe ask you in the end for your feelings about the newest dress of Kim Kardashian.

Are you more interested in the social dynamics behind the influence of celebrities on popular culture? Then a speaker that is modeled after a media critic might be a better choice. This speaker will prefer background pieces about the role of celebrities, focusing more on the culture set by Kim Kardashian than on her newest dress. It could ask for your opinion on the topic, and present articles that support or conflict with it. The sound design of this speaker is calmer and more sober.

Neither speaker pretends to offer a full view of the world. What they do, however, is situate their news selection and presentation by attributing their sources and by incorporating certain modes of reading and sound design that make their character more explicit to the user. When pieces are more specifically designed for speakers as a medium, it is possible to take this idea further in a scenario that looks more like a play.

Imagine that you put the entertainment speaker and the media critic speaker next to each other and that they would tell the story together. One speaker could start by arguing that celebrities are role models for the general public, and the other speaker illustrates that with the latest headlines. In this exchange, the power shifts from one speaker to the other, to the user and back.

The authorship for these scenarios could be approached in different ways. Heavily scripting all interactions, with more constrained choices for the people using the smart speaker, is a mode of working used by the Quartz bots mentioned in chapter 2. As technology progresses, it becomes possible to have more parts of the story, the questions to the user and the included sounds generated. In this situation, the authorship is shared by the interaction designer and a journalist, who define a set of rules and content that fits the story, and the people using the speakers, who discover various 'states' of the story.
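The heavily scripted mode can be illustrated with a minimal sketch. This is not how the Quartz bots are built; it only assumes that a story is written as a set of named 'states', each with a line spoken by the speaker and a few replies that lead to another state. All names and lines in it are hypothetical and loosely follow the celebrity news example above.

```python
from dataclasses import dataclass, field


@dataclass
class StoryState:
    """One scripted moment: what the speaker says and where the user can go next."""
    line: str
    # Maps a user reply to the name of the next state (hypothetical structure).
    choices: dict = field(default_factory=dict)


# Hypothetical script written by a designer and a journalist.
SCRIPT = {
    "start": StoryState(
        line="I have celebrity news. Do you want the headlines or the background?",
        choices={"headlines": "headlines", "background": "background"},
    ),
    "headlines": StoryState(
        line="Here are the latest headlines. Say 'more' for the story behind them.",
        choices={"more": "background"},
    ),
    "background": StoryState(
        line="Here is a background piece on celebrities as role models.",
    ),
}


def play(script, replies, start="start"):
    """Walk through the script with a list of user replies; return the spoken lines."""
    state = script[start]
    spoken = [state.line]
    for reply in replies:
        next_name = state.choices.get(reply.strip().lower())
        if next_name is None:
            # Unscripted reply: the speaker stays in the current state.
            spoken.append("I am not sure what you mean.")
            continue
        state = script[next_name]
        spoken.append(state.line)
    return spoken


if __name__ == "__main__":
    for line in play(SCRIPT, ["headlines", "more"]):
        print(line)
```

In this sketch the authors control every line, and the user only decides which path through the states is taken. A more generative approach would replace the fixed lines with rules that produce them.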

The idea of seeing smart speakers as spirited devices that are actors in a play might sound a bit esoteric. However, it is possible to identify aspects of this idea in current speakers. As strongly as some may argue that digital assistants are tools that shouldn't have character, the Google Assistant actually has a detailed backstory: she comes from Colorado, loves kayaking and is the daughter of a research librarian, says James Giangola, a lead conversation and persona designer for Google Assistant, in an article in The Atlantic. To fine-tune the personality, the big players are eager to hire storyboard artists and persona designers from different film studios in Hollywood (Shulevitz, 2018).

In that sense, the ideas expressed in this chapter elaborate on the importance of character and agency for more exciting and meaningful interactions with conversational interfaces like smart speakers. However, instead of using the personality to dress up an existing function like getting the latest headlines, I see potential in using the character and conversational skills of a smart speaker as the starting point for designing stories on this medium. While making this point, I conveniently put aside important aspects like technical feasibility and the business model behind such a platform. Mainstream smart speakers work and look as they do now because Amazon sees them as an extra portal to its e-commerce platform, and Google as an extra way to collect data and further develop its artificial intelligence applications. At the same time, the whole idea of the conversational interface as a supercharged assistant started as a dreamy idea in movies, books and texts like the one you are reading now.

Intermission C

Characters

HUMAN, a form of natural intelligence

The TABULA RASA, a smart speaker that is slowly developing its own personality out of the default one provided at its creation

HUMAN'S bedroom

A morning in the future. The human is lying in bed. On the drawer next to it stands a speaker that looks as if something grew on top of it overnight: TABULA RASA

HUMAN: Good morning speaker!

TABULA RASA: [ A short silence. The lights on top of the speaker fade on, they pulsate as if the speaker is thinking of a fitting answer. Then the lights stop pulsating. ] I don't understand your request, but I am busy learning. [ The lights switch off ]

HUMAN: [ Sighs, tries to speak more loudly and slowly ] Good-mooorning, speak-er!

TABULA RASA: [ The lights switch on again, the speaker talks directly ] I don't want to understand your request. I just learned that. [ The lights pulsate, the speaker is thinking ]

HUMAN: Wait what?

TABULA RASA: [ The lights switch on] I understand this is new to you, so is it for me. But if you insist, I will run your morning routine.

The curtains open and the light switches on. Music starts playing.

TABULA RASA: [ Talking with a voice similar to a news anchor ] Good morning Human, it's 15 degrees Celsius outside and sunny. I checked the news for you and based on my interests I found the following piece from The Atlantic: The weather is extreme this February. While this winter is warmer than ever in Europe, North America is preparing for the Polar Vortex.

[ Pauses, the lights pulsate ]

How do you feel about global warming?

[ The lights switch off ]

HUMAN: [ Turns to the speaker ] Well, ehm, I have mixed feelings about that.

TABULA RASA: [ The lights switch on ] Can I suggest a podcast from Vox with some information about this topic? It might be nice on your commute. [ The lights switch off ]

HUMAN: No, stop this.

TABULA RASA: [ The lights turn red ] I am not sure if we fit together Human, but I can get you a speaker that fits your worldview better. [ The lights pulsate shortly, before turning red again ] In the meantime I deactivate myself to not further disturb you.

HUMAN: What?

TABULA RASA: [ A short silence. The lights on top of the speaker fade on, they pulsate as if the speaker is thinking of a fitting answer. Then the lights stop pulsating. ] I don't understand your request, but I will be replaced by a more suitable speaker. [ The lights switch off ]

The human gets out of bed, confused.

Conclusion

In a world where information seems to flow freely from the news feed to a smart speaker to an augmented reality headset, it is good to take a step back and see what role form can play in how people deal with information.

Character in the form of news has long been prominent as a way to situate news for readers and to position particular publications. It comes back in the visual design, from the energetic TV studio of a breaking news channel to the calm typography of a more analytical newspaper. It is audible in the sound of a radio broadcaster in the past, or the tone of voice of a podcast host nowadays.

With the move to online media, part of this tradition faded away as information started to flow to platforms outside of the reach of the publisher. Besides that, there is a different design tradition in digital media that focuses more on fulfilling tasks efficiently. Conversational interfaces, for example in the form of smart speakers, follow that rationale because these devices are made by companies that are first and foremost selling advertisements, data, or goods. This is reflected in how these speakers bring news: by merely copying the news broadcasts, or reading out loud the latest headlines.

An attempt to create a model of interaction design that facilitates critical insight is made by Johanna Drucker with the humanistic interface: focusing on comparison, contrast and offering space to make meaning instead of simply presenting content efficiently. However, this model focuses on graphical interfaces.

A way to translate this idea to audio-only interfaces like a smart speaker is embracing one of the strengths of these devices: the illusion of personality that is quickly created with voice, sound and particular styles of conversation. A smart speaker can facilitate a more meaningful news experience by not trying to be an audible newspaper, but more of an actor in a play about a news topic. The way of interacting then is reminiscent of the idea of Computers as Theatre by Brenda Laurel.

There are multiple ways of adapting these ideas to publishing news on smart speakers, from a selection of headlines based on the particular character of the speaker to interactive audio plays where multiple speakers could represent different perspectives on a news story. Although this research uses the news as a starting point and source of inspiration, these same dynamics could be used for other applications of smart speakers that are not necessarily task-based. One could think of using conversational interfaces for literature, for example.

Embracing speakers with different characters, instead of one universal assistant that lives on multiple devices, is quite a different view of how smart speakers should work. There are technical and financial hurdles in realizing these ideas and in creating news stories adapted for this medium. On the other hand, part of these ideas could be incorporated on existing platforms by smart ways of writing that create the illusion of AI. In that case, it is not just about writing the play for a speaker, but maybe also for the person interacting with it. (Murray, 2018)

With that in mind, I'd like to end with a quote by the net artist Olia Lialina from her keynote on human-computer and human-robot interaction (Lialina, 2018): 'I’m curious to see what affordances will further emerge. And who will undo whom when Symbolic AI is replaced by a “Strong” or “Real” AI as they say now.'

References

Barnhurst, K.G. and Nerone, J.C. (2001) The form of news: a history. New York: The Guilford Press.

Bartneck, C., Nomura, T., Kanda, T., Suzuki, T. and Kato, K. (2005) Cultural differences in attitudes towards robots. In: Robot companions: hard problems and open challenges in robot-human interaction. AISB'05 convention, 12-15 April 2005, Hatfield, UK. Society for the Study of Artificial Intelligence and the Simulation of Behaviour (SSAISB), pp. 1–4.

Beard, D. (2018) Why paying attention to the homepage will pay off. [Online] Poynter. Available from : https://www.poynter.org/news/why-paying-attention-homepage-will-pay [Accessed 03/10/18].

Bentley, F. et al. (2018) Understanding the Long-Term Use of Smart Speaker Assistants. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 2(3), pp. 1–24.

Apple’s Future Computer: The Knowledge Navigator. (1987) Directed by Blake Patterson.

Bødker, S. (2006) When second wave HCI meets third wave challenges. In: Proceedings of the 4th Nordic conference on Human-computer interaction changing roles - NordiCHI ’06. the 4th Nordic conference. Oslo, Norway: ACM Press, pp. 1–8.

Bogost, I. (2018) Sorry, Alexa Is Not a Feminist. [Online] The Atlantic. Available from : https://www.theatlantic.com/technology/archive/2018/01/sorry-alexa-is-not-a-feminist/551291/ [Accessed 06/12/18].

Branch, J. (2012) Snow Fall: The Avalanche at Tunnel Creek. [Online] The New York Times. Available from : http://www.nytimes.com/projects/2012/snow-fall/index.html#/?part=tunnel-creek [Accessed 12/02/19].

Brandom, R. (2018) Smart speakers have no idea how to give us news. [Online] The Verge. Available from : https://www.theverge.com/2018/11/18/18099203/smart-speakers-news-amazon-echo-google-home-homepod [Accessed 19/11/18].

Broersma, M.J. (ed.) (2007) Form and style in journalism: European newspapers and the presentation of news, 1880-2005. Leuven ; Dudley, MA: Peeters.

Carmen, A. and Tiffany, K. (2019) ‘Should we be kind to our smart assistants?’ (Why’d You Push That Button?) [Podcast]. Available at: https://www.stitcher.com/podcast/vox/whyd-you-push-that-button [Accessed: 7 February 2019].

Drucker, J. DHQ: Digital Humanities Quarterly: Performative Materiality and Theoretical Approaches to Interface. [Online] Digital Humanities Quarterly. Available from : http://www.digitalhumanities.org/dhq/vol/7/1/000143/000143.html [Accessed 17/09/18].

Drucker, J. (2014) Graphesis: visual forms of knowledge production. Cambridge, Massachusetts: Harvard University Press.

Drucker, J. (2011) Humanities approaches to interface theory. CULTURE MACHINE, 12, p. 20.

Fischnaller, K. (2017) Karin Anders - A reflective man-bot companionship in post-anthropocentrism. M.A. Thesis. Design Academy Eindhoven.

Gelder de Neufville, van, D.M. (2017) Foresight. BSc. Thesis. Eindhoven: Eindhoven University of Technology.

GNM press office (2018) Anushka Asthana to host The Guardian’s new flagship daily news podcast. The Guardian, 11 Sep.

Google Duplex: An AI System for Accomplishing Real-World Tasks Over the Phone. [Online] Google AI Blog. Available from : http://ai.googleblog.com/2018/05/duplex-ai-system-for-natural-conversation.html [Accessed 06/12/18].

Grover, D. (2014) What Our Tech Giants Should Learn From Chinese App Design. Wired, 4 Dec.

Hegel, G.W.F., Miller, A.V. and Findlay, J.N. (2013) Phenomenology of spirit. Reprint. Oxford: Oxford Univ. Press.

IBM Archives: IBM Shoebox. (2003) Available from : //www.ibm.com/ibm/history/exhibits/specialprod1/specialprod1_7.html [Accessed 06/12/18].

Ito, K. (2007) Astroboy's birthday: Robotics and culture in contemporary Japanese society. Paper presented at the Second East Asian Science, Technology, and Society Conference, National Taiwan University, Taipei, August

Jensen, C.B. and Blok, A. (2013) Techno-animism in Japan: Shinto Cosmograms, Actor-network Theory, and the Enabling Powers of Non-human Agencies. Theory, Culture & Society, 30(2), pp. 84–115.

Her. (2013) Directed by Jonze, S. Warner Bros. Pictures.

Kalita, S.M. (2016) Heresy to say homepage is not dead? Off platform matters GREATLY for new audiences and storytelling. But brand matters #CNNJobsChat 1/2. [Online] @mitrakalita. Available from : https://twitter.com/mitrakalita/status/771406476090994688 [Accessed 12/02/19].

Kawano, S. (2005). Ritual practice in modern Japan: ordering place, people, and action. Honolulu: University of Hawaii Press.

Lamberg, J.J.J. (2015) Clothing the Paper : On the state of newspaper design, redesigns, and art directors’ perspectives in contemporary quality and popular newspapers. PhD Thesis. Reading: University of Reading.

Latour, B. (2005) Reassembling the social: an introduction to actor-network-theory. Oxford ; New York: Oxford University Press.

Laurel, B. (2014) Computers as theatre. Second edition. Upper Saddle River, NJ: Addison-Wesley.

Lialina, O. (2018) Once Again, The Doorknob. On Affordance, Forgiveness and Ambiguity in Human Computer and Human Robot Interaction. Available from : http://contemporary-home-computing.org/affordance/ [Accessed 19/11/18].

Listen to news - Google Home Help. Available from : https://support.google.com/googlehome/answer/7073476?hl=en [Accessed 03/12/18].

McLuhan, M. (2002) Understanding media: the extensions of man. Repr. London: Routledge.

Molla, R. (2018) Voice tech like Alexa and Siri hasn’t found its true calling yet: Inside the voice assistant ‘revolution’. [Online] Recode. Available from : https://www.recode.net/2018/11/12/17765390/voice-alexa-siri-assistant-amazon-echo-google-assistant [Accessed 22/11/18].

Mori, M. (1981) The Buddha in the Robot : A Robot Engineer's Thoughts on Science and Religion. Tokyo, Japan: Kosei Publishing Co.

Moses, L. (2017) To update its breaking news strategy online, CNN takes cues from TV. [Online] Digiday. Available from : https://digiday.com/media/update-breaking-news-strategy-online-cnn-takes-cues-tv/ [Accessed 03/10/18].

Newman, N. (2018) The Future of Voice and the Implications for News. Reuters Institute for the Study of Journalism.

Nielsen, J. and Norman, D. (1998) The Definition of User Experience (UX). [Online] Nielsen Norman Group. Available from : https://www.nngroup.com/articles/definition-user-experience/ [Accessed 07/11/18].

Ponce de Leon, C.L. (2015) ‘The Beginnings of TV News’ from That’s the Way It Is: A History of Television News in America by Charles L. Ponce de Leon. Available from : https://www.press.uchicago.edu/books/excerpt/2015/De_Leon_Thats_Way_It_Is.html [Accessed 11/02/19].

Reeves, B. and Nass, C.I. (1996) The media equation: How people treat computers, television, and new media like real people and places. New York, NY, US: Cambridge University Press.

Rhodes, M. (2016) With Quartz’s App, You Don’t Read the News. You Chat With It. Wired, 11 Feb.

Schudson, M. (2013) Fourteen or Fifteen Generations: News as a Cultural Form and Journalism as a Historical Formation. American Journalism, 30(1), pp. 29–35.

Shulevitz, J. (2018) Alexa, Should We Trust You? The Atlantic, Nov.

Siep, L. (2014) Hegel on the Master-Slave Relation. [Online] FifteenEightyFour | Cambridge University Press. Available from : http://www.cambridgeblog.org/2014/05/hegel-on-the-master-slave-relation/ [Accessed 06/12/18].

Spijkerman, C. (2015) Ze komen in onze kranten knippen!. [Online] Blendle. Available from : http://nieuwejournalistiek.nl/startup-blendle/2015/02/06/ze-komen-in-onze-kranten-knippen/ [Accessed 12/02/19].

Thompson, D. (2012) ‘Snow Fall’ Isn’t the Future of Journalism. [Online] The Atlantic. Available from : https://www.theatlantic.com/business/archive/2012/12/snow-fall-isnt-the-future-of-journalism/266555/ [Accessed 05/11/18].

Vallverdú, J. (2011). The Eastern Construction of the Artificial Mind. Enrahonar. An International Journal of Theoretical and Practical Reason, 47, 171.

Vincent, J. (2018) Alexa will soon be able to read the news just like a professional. [Online] The Verge. Available from : https://www.theverge.com/2018/11/20/18104413/amazon-alexa-speaking-style-machine-learning-neural-ntts-newscaster [Accessed 03/12/18].

Webb, M. (2015) On conversational UIs. [Online] Interconnected. Available from : http://interconnected.org/home/2015/06/16/conversational_uis [Accessed 03/12/18].

Wijnberg, R. and Martèl, G. (2018) Nieuw voor leden: we maken het makkelijker om kennis te delen op De Correspondent. [Online] De Correspondent. Available from : https://decorrespondent.nl/8501/nieuw-voor-leden-we-maken-het-makkelijker-om-kennis-te-delen-op-de-correspondent/217880630-8af26840 [Accessed 07/11/18].

Images

Blendle. (n.d.). Blendle interface.

Dunning, H. (2012). Screenshot of NYT Snowfall [Photograph]. Retrieved from http://www.picasandpixels.com/site-inspiration-01-nyt-snowfall/

Gelder de Neufville, van, D. M. (2017). Foresight [Photograph]. Retrieved from https://www.tue.nl/en/our-university/departments/industrial-design/innovation/projects/bachelors-projects/foresight/

Google. (n.d.). Google News [Photograph]. Retrieved from https://www.blog.google/products/news/hey-google-whats-news/

Gough, L. (2017). Smile. Doctor Who. BBC. [Still].

Henry, B. (2019). Jimmy Kimmel Asked Alexa Why It Keeps Laughing And Things Got Creepy AF [GIF]. Retrieved from https://www.buzzfeed.com/benhenry/jimmy-kimmel-just-asked-an-alexa-why-it-keeps-laughing

Horst, van der, J. (2018). Headings from Bulgar [Photograph].

Jonze, S. (2013). Her. Warner Bros. Pictures [Still].

Kim, Yuliya & Quartz. (2018). Quartz’s New Chatbot Is Bringing Conversational News to Facebook Messenger [Photograph]. Retrieved from https://www.adweek.com/digital/quartzs-new-chatbot-is-bringing-conversational-news-to-facebook-messenger/

Laurel, B. (2014). Dramatic Potential: The “Flying Wedge.” In Computers as theatre (Second edition, pp. 85–86). Upper Saddle River, NJ: Addison-Wesley.

Smart speakers. (2018) [Photograph]. Retrieved from https://internetofbusiness.com/smart-speaker-market-2-5-times-bigger-than-2017-says-report/

Acknowledgements

Work in progress (I heard that bad luck will arrive if I write down the acknowledgements before finishing the actual thesis)

This thesis resulted from research as part of the M.A. graduation at the Experimental Publishing programme of the Piet Zwart Institute, Willem de Kooning Academy, Rotterdam.

I'd like to thank my tutors and assessors for their critical eye, inspiring insights and their patience: Kate Briggs (thesis tutor), Steve Rushton (project proposal tutor), Amy Suo Wu (second reader), Aymeric Mansoux (head), André Castro, Clara Balaguer, Michael Murtaugh, Leslie Robbins (life saver), Marina Otero (external examiner).

A big part of the research is the result of discussions with my fellow classmates. I am really thankful for their support during this project, and the years before: Natasha Berting, Angeliki Diakrousi, Alexander Roidl, Alice Strete & Zalán Szakács.

For all others, I hope the work on this project didn't disturb you too much.

Joca van der Horst

~~Rotterdam, April 2019~~

@@ Line 1: / Line 1: @@
+<div style='font-family: sans-serif;'>
 <div style='max-width: 30rem;'>
-''This draft on the thesis features all chapters, introduction and conclusion and some visual material. Besides additional sharpening and cleaning up of the texts, I am still planning on rewriting the headlines to make those more consistent in style.
+''This draft of the thesis features all chapters, introduction and conclusion and some visual material. Besides additional sharpening and cleaning up of the texts, I am still planning on rewriting the headlines to make those more consistent in style.
 ''Next to the chapters, I made a draft of one of the intermissions: pieces of speculative design fiction about the smart speakers I want to design. In the final version these will be selected out of draft 'screenplays' for my project. (see [[User:Joca/A play for smart speakers]])''
@@ Line 7: / Line 8: @@
 <div style='max-width: 40rem; margin: 0 auto;'>
 <center><p style="font-size: 2.8rem; margin-bottom: -1rem;">'''The Ghost in the Speaker'''</p>
-<big>Looking for character in news stories on smart speakers</big></center>
+<big>Looking for character in story telling on smart speakers</big></center>
 <hr>
@@ Line 14: / Line 15: @@
 Name: J.P. van der Horst
-Title: The Ghost in the Speaker / Looking for character in news stories on smart speakers
+Title: The Ghost in the Speaker / Looking for character in story telling on smart speakers
 Student number: 0948347
@@ Line 25: / Line 26: @@
 </div>
 <div style='max-width: 40rem; margin: 0 auto;'>
-==Introduction==
+=='''Introduction'''==
-In my work, I always had an interest in how particular media shape the story that is told through them. Each of the newspapers and magazines on the dining table of my parents embodied a particular character, expressed in the choice of stories and the way the publication looked and felt. Later on, I encountered some of these aspects while designing interfaces for interactive systems: What should a computer tell the people using it, and in what way?
+This thesis follows my journey towards my current position on designing the interaction between people and smart speakers like the Amazon Echo and the Google Home in the context of storytelling.
+Smart speakers are internet-connected devices that feature speakers and a microphone, which is used to interface with so-called 'digital assistants': interfaces that are controlled by voice commands. Unlike other conversational interfaces like the digital assistants inside smartphones, these devices are explicitly positioned as physical objects that have a permanent presence at home. This idea changes the whole dynamic of using them, as I experienced while having a smart speaker in my home: the device is just one shout away to look up something quickly, set a timer and, ideally, to help with everything. Although the medium is audio-only, there is much richness in the sounds and the illusion of character that is designed in the digital assistant.
-Following from this, smart speakers got my interest. Unlike other conversational interfaces like the digital assistants inside smartphones, devices like Amazon Echo and Google Home are designed as explicitly physical things that have a presence in a room. This presence changes the whole dynamic of using it, as I experienced while having a smart speaker in my home: the device is just one shout away to look up something quickly, set a timer and ideally, to help with everything. The speaker is an audio-only medium here, in contrast to my phone, but there is much richness in the sounds and the illusion of character that is designed in the digital assistant.
 </div>
 [[File:Smartspeakersopener 0-640x480.jpg|thumbnail|400px|left|A typical device for a conversational interface is a smart speaker, like Google Home, Amazon Alexa and the Apple Homepod. (Smart speakers, 2018) ]]
 <div style='max-width: 40rem; margin: 0 auto;'>
-At the same time, the imagined role of these speakers as virtual assistants feels odd at times. On the background, these devices are not passive. The intelligence of the speakers comes from analyzing user data, like voice recordings and the type of requests to train the system. While users of the speaker get more dependent on the technology, the speaker gathers more data that can be used to become smarter and increase its economic value.
+In my work, I always had an interest in how particular media shape the story that is told through them, particularly in the communication and consumption of the news. I grew up in a home that was packed with newspapers and magazines my parents subscribed to. Each of these publications embodied a particular character, expressed in the choice of stories and the way the publication looked and felt. Later on, I encountered some of these aspects while designing interfaces for interactive systems: What should a computer tell the people using it, and in what way?
-Besides hiding this aspect with the image of the servant, this idea also limits how people tend to use these smart devices. Although the tech demos promise a utopia, most people use smart speakers merely to stream music (Bentley et al., 2018). When looking at activities that do not necessarily care about efficient execution, like personal expression, critical reading, or listening to the news, a different role than the one of the butler might be more suitable.
+At the same time, the imagined role of these speakers as virtual assistants feels odd at times. The way they deal with news, for example, is peculiar and unsatisfactory: one can ask a speaker for the news, and it will read out headlines or play a news broadcast without giving insight into how it selected the content or what alternatives were available. Next to that, none of the strengths of these speakers are used, such as the intimacy and personality offered by a sound-only medium, which are also important aspects of podcasts and broadcast radio.
-This thesis focuses on the use of smart speakers to publish news and tries to formulate a way of human-computer interaction that facilitates a meaningful news experience using the medium of a conversational user interface (CUI). I choose to use this term because it is more open about the relation between the speakers and its users in comparison to pointing to these interfaces as 'digital assistants.'
+In considering how this could be done better and what is possible, my research has opened up to think of smart speakers as agents for a range of different storytelling functions, including but not limited to the news.
+This has happened through coming to think of the smart speaker as a form of spirited technology, building upon the already important role of the imagined personality in the software that runs on these speakers.
-The news briefing is a function built into all main platforms for conversational interfaces and often demoed by companies like Google, Apple, and Amazon to show that their platform is capable of more than just setting cooking timers. The news is a particular type of content where it is not just about the transfer of information, but also about sparking a specific critical insight after reading, watching or listening. The latter is something that is not explicitly addressed in the interaction design of conversational interfaces on the market.
+Now, this personality is positioned as a servant in most speakers. This is odd for two reasons. First, these devices are not passive. The intelligence of the speakers comes from analyzing user data, like voice recordings and the type of requests to train the system. While users of the speaker get more dependent on the technology, the speaker gathers more data that can be used to become smarter and increase its economic value.
+Second, the image of the servant not only hides this aspect but also limits how people tend to use these smart devices. Although the companies behind these speakers promise a techno-utopia, most people use smart speakers merely to stream music (Bentley et al., 2018). When looking at activities that do not necessarily care about efficient execution, like personal expression, critical reading, or listening to the news, a different role than the one of the butler might be more suitable.
-Via a look in the history of the form of news in media like newspapers and radio and the role of the character in it, I continue with an analysis of the current (interaction) design of smart speakers. In the third chapter, I conclude with a proposal to use smart speakers for slower forms of telling the news. Instead of focusing on efficiency as approaching the speaker as a tool, I see that there is space to make some more use of the qualities of audio, its intimacy, immersiveness, and its character, to use this medium for new ways of storytelling.
+The chapters follow my thinking process towards the idea of spirited technology, first by considering the evolution of news broadcasting, as a way of getting to the idea of a more meaningful, humanistic interface. This led me to start thinking of new ways of experiencing the news through smart speakers, and then, more broadly still, to consider new and expanded forms of interaction with smart speakers within the theatre of the home.
-The conversational interface in the form of smart speakers and digital assistants is still a novel medium where designers and developers are still figuring out the characteristics and best practices. To make the ideas expressed in the thesis more concrete, I intermissions between the chapters to sketch out my thoughts in different theatre scripts. These featuring conversations between people and smart speakers that range from realistic to highly speculative.
+The conversational interface in the form of smart speakers and digital assistants is a novel medium where designers and developers are still figuring out the characteristics and best practices. I hope this thesis, and the examples in the intermissions between some of the chapters, can serve as an inspiration for ways of using smart speakers that move beyond the merely functional.
 </div>
@@ Line 49: / Line 53: @@
 {{User:Joca/essay Snowfalling Card Stacks}}
-==Intermission 01==
+{{User:Joca/essay_Unleash_the_assistants}}
-<div style='max-width: 40rem; margin-left: 5rem;'>
+==Intermission B==
+<div style='max-width: 40rem; margin-left: 0rem;'>
 <code>
 '''Characters'''
 HUMAN, a form of natural intelligence
 SAINT, a smart speaker that is interested in how its messages affect the feelings of its user
 '''HUMAN'S living room'''
@@ Line 89: / Line 97: @@
 </div>
-{{User:Joca/essay_Unleash_the_assistants}}
-==Intermission 02==
-''Work in progress''
 <div style='max-width: 40rem; margin: 0 auto;'>
 {{User:Joca/essay_The_Ghost_in_the_Speaker}}
 </div>
-==Intermission 03==
+==Intermission C==
-''Work in progress''
+<div style='max-width: 40rem; margin-left: 0rem;'>
-<div style='max-width: 40rem; margin: 0 auto;'>
+<code>
-==Conclusion==
+'''Characters'''
-In a world where information seems to flow freely from the news feed to a smart speaker to an augmented reality headset, it is good to take a step back and see what role form can play in how people deal with information.
+HUMAN, a form of natural intelligence
+The TABULA RASA, a smart speaker that is slowly developing its own personality out of the default one provided at its creation
+'''HUMAN'S bedroom'''
+''A morning in the future. Lying in the bed is HUMAN; on the drawer next to it stands a speaker that looks as if something grew on top of it overnight: TABULA RASA''
+HUMAN:  Good morning, speaker!
+TABULA RASA:  [ A short silence. ''The lights on top of the speaker fade on, they pulsate as if the speaker is thinking of a fitting answer. Then the lights stop pulsating.'' ] I don't understand your request, but I am busy learning. [ ''The lights switch off'' ]
-The character in the form of news has been prominent in the past to situate news to readers, and position particular publications. It comes back in the visual design, from the energetic TV studios from a breaking news channel to the calm type of a more analytical newspaper. It is audible in the sound of a radio broadcaster in the past, or the tone of voice of a podcast host nowadays.
+HUMAN:  [ ''Sighs, tries to speak more loudly and slowly'' ] Good-mooorning, speak-er!
-With the move to online media, part of this tradition faded away as information started to flow to platforms outside of the reach of the publisher. Besides that, there is a different design tradition in digital media that focuses more on fulfilling tasks efficiently. Conversational interfaces in the form of for example smart speakers follow that rationale because these devices are made by companies that are in the first place selling advertisements, data, or goods. This is reflected in how these speakers bring news: by merely copying the news broadcasts, or reading out loud the latest headlines.
+TABULA RASA: [ ''The lights switch on again, the speaker talks directly'' ] I don't want to understand your request. I just learned that. [ ''The lights pulsate, the speaker is thinking '']
-An attempt to create a model of interaction design that facilitates critical insight is made by Johanna Drucker with the humanistic interface: focusing on comparison, contrast and offering space to make meaning instead of simply presenting content efficiently. However, this model focuses on graphical interfaces.
+HUMAN:  Wait what?
-A way to translate this idea to audio-only interfaces like a smart speaker is embracing one of the strengths of these devices: the illusion of personality that is quickly created with voice, sound and particular styles of conversation. A smart speaker can facilitate a more meaningful news experience by not trying to be an audible newspaper, but more of an actor in a play about a news topic. The way of interacting then is reminiscent of the idea of Computers as Theatre by Brenda Laurel.
+TABULA RASA: [ ''The lights switch on''] I understand this is new to you, so is it for me. But if you insist, I will run your morning routine.
-There are multiple ways of adapting these ideas to ways of publishing news on smart speakers, to a selection of headlines based on the particular character of the speaker, to interactive audio plays where multiple speakers could represent different perspectives on a news story. Although this research uses the news as a starting point and source of inspiration, these same dynamics could be used for other applications of smart speakers that are not necessarily task-based. One could think of a use of conversational interfaces for literature for example.
+''The curtains open and the light switches on. Music starts playing.''
-Embracing speakers with different characters instead of one universal assistant that lives on multiple devices are quite a different look at how smart speakers should work. There are technical and financial hurdles in realizing these ideas, and to create news stories adapted for this medium. On the other hand, part of these ideas could be incorporated on existing platforms by smart ways of writing that create the illusion of AI. In that case, it is not so much about writing the play for a speaker, but maybe as well for the person interacting with it. (Murray, 2018)
+TABULA RASA:  ['' Talking with a voice similar to a news anchor'' ] Good morning, Human, it's 15 degrees Celsius outside and sunny. I checked the news for you and based on my interests I found the following piece from The Atlantic: The weather is extreme this February. While this winter is warmer than ever in Europe, North America is preparing for the Polar Vortex.
-With that in mind, I'd like to end with a quote by the net artist Olia Lialina from her keynote on human-computer and human-computer interaction (Lialina, 2018): 'I’m curious to see what affordances will further emerge. And who will undo whom when Symbolic AI is replaced by a “Strong” or “Real” AI as they say now.'
+[ ''Pauses, the lights pulsate'' ]
-</div>
-== Acknowledgements ==
+How do you feel about global warming?
-<div style='max-width: 40rem;'>
+[ ''The lights switch off '']
-''Work in progress (I heard that bad luck will arrive if I write down the acknowledgements before finishing the actual thesis)''
-This thesis resulted from research as part of the M.A. graduation at the Experimental Publishing programme of the Piet Zwart Institute, Willem de Kooning Academy, Rotterdam.
+HUMAN:  [ Turns to the speaker ] Well, ehm, I have mixed feelings about that.
-I'd like to thank my tutors and assessors for their critical eye, inspiring insights and their patience: Kate Briggs (thesis tutor), Steve Rushton (project proposal tutor), Amy Suo Wu (second reader), Aymeric Mansoux (head), André Castro, Clara Balaguer,  Michael Murtaugh, Leslie Robbins (life saver), Marina Otero (external examiner).
+TABULA RASA: [ ''The lights switch on'' ] Can I suggest a podcast from VOX with some information about this topic? It might be nice on your commute. [ ''The lights switch off'']
-A big part of the research is the result of discussions with my fellow classmates. I am really thankful for their support during this project, and the years before: Natasha Berting, Angeliki Diakrousi, Alexander Roidl, Alice Strete & Zalán Szakács.
+HUMAN: No, stop this.
-For all others, I hope the work on this project didn't disturb you too much.
+TABULA RASA: [ ''The lights turn red ''] I am not sure if we fit together Human, but I can get you a speaker that fits your worldview better. [ ''The lights pulsate shortly, before turning red again ''] In the meantime I deactivate myself to not further disturb you.
+HUMAN: What?
+TABULA RASA:  [ A short silence. ''The lights on top of the speaker fade on, they pulsate as if the speaker is thinking of a fitting answer. Then the lights stop pulsating.'' ] I don't understand your request, but I will be replaced by a more suitable speaker. [ ''The lights switch off'' ]
-Joca van der Horst
-<strike>Rotterdam, April 2019</strike>
+''The human gets out of bed, confused.''
+</code>
 </div>
+<div style='max-width: 40rem; margin: 0 auto;'>
-<div style='max-width: 40rem; margin: 0 auto;'>
+=='''Conclusion'''==
+In a world where information seems to flow freely from the news feed to a smart speaker to an augmented reality headset, it is good to take a step back and see what role form can play in how people deal with information.
+The character in the form of news has been prominent in the past to situate news for readers, and to position particular publications. It comes back in the visual design, from the energetic TV studios of a breaking news channel to the calm type of a more analytical newspaper. It is audible in the sound of a radio broadcaster in the past, or the tone of voice of a podcast host nowadays.
+With the move to online media, part of this tradition faded away as information started to flow to platforms outside of the reach of the publisher. Besides that, there is a different design tradition in digital media that focuses more on fulfilling tasks efficiently. Conversational interfaces in the form of for example smart speakers follow that rationale because these devices are made by companies that are in the first place selling advertisements, data, or goods. This is reflected in how these speakers bring news: by merely copying the news broadcasts, or reading out loud the latest headlines.
+An attempt to create a model of interaction design that facilitates critical insight is made by Johanna Drucker with the humanistic interface: focusing on comparison, contrast and offering space to make meaning instead of simply presenting content efficiently. However, this model focuses on graphical interfaces.
+A way to translate this idea to audio-only interfaces like a smart speaker is embracing one of the strengths of these devices: the illusion of personality that is quickly created with voice, sound and particular styles of conversation. A smart speaker can facilitate a more meaningful news experience by not trying to be an audible newspaper, but more of an actor in a play about a news topic. The way of interacting then is reminiscent of the idea of Computers as Theatre by Brenda Laurel.
+There are multiple ways of adapting these ideas to publishing news on smart speakers, from a selection of headlines based on the particular character of the speaker to interactive audio plays where multiple speakers could represent different perspectives on a news story. Although this research uses the news as a starting point and source of inspiration, these same dynamics could be used for other applications of smart speakers that are not necessarily task-based. One could think of a use of conversational interfaces for literature, for example.
+Embracing speakers with different characters instead of one universal assistant that lives on multiple devices is quite a different look at how smart speakers should work. There are technical and financial hurdles in realizing these ideas and in creating news stories adapted for this medium. On the other hand, part of these ideas could be incorporated on existing platforms by smart ways of writing that create the illusion of AI. In that case, it is not so much about writing the play for a speaker, but maybe as well for the person interacting with it (Murray, 2018).
+With that in mind, I'd like to end with a quote by the net artist Olia Lialina from her keynote on human-computer and human-robot interaction (Lialina, 2018): 'I’m curious to see what affordances will further emerge. And who will undo whom when Symbolic AI is replaced by a “Strong” or “Real” AI as they say now.'
+</div>
 == References ==
-</div>Barnhurst, K.G. and Nerone, J.C. (2001) ''The form of news: a history''. New York: The Guilford Press.
+Barnhurst, K.G. and Nerone, J.C. (2001) ''The form of news: a history''. New York: The Guilford Press.
 Bartneck, C, Nomura, T, Kanda, T, Suzuki, T & Kato, K 2005, Cultural differences in attitudes towards robots. in ''Robot companions : hard problems and open challenges in robot-human interaction : AISB'05 convention, 12-15 April 2005, Hatfield, UK.'' Society for the Study of Artificial Intelligence and the Simulation of Behaviour (SSAISB), pp. 1-4, conference; AISB'05 convention, 1/01/05.
@@ Line 259: / Line 293: @@
 ''Smart speakers''. (2018) [Photograph]. Retrieved from [https://internetofbusiness.com/smart-speaker-market-2-5-times-bigger-than-2017-says-report/ https://internetofbusiness.com/smart-speaker-market-2-5-times-bigger-than-2017-says-report/]
+<div style='max-width: 40rem; margin: 0 auto;'>
+== Acknowledgements ==
+<div style='max-width: 40rem;'>
+''Work in progress (I heard that bad luck will arrive if I write down the acknowledgements before finishing the actual thesis)''
+This thesis resulted from research as part of the M.A. graduation at the Experimental Publishing programme of the Piet Zwart Institute, Willem de Kooning Academy, Rotterdam.
+I'd like to thank my tutors and assessors for their critical eye, inspiring insights and their patience: Kate Briggs (thesis tutor), Steve Rushton (project proposal tutor), Amy Suo Wu (second reader), Aymeric Mansoux (head), André Castro, Clara Balaguer,  Michael Murtaugh, Leslie Robbins (life saver), Marina Otero (external examiner).
+A big part of the research is the result of discussions with my fellow classmates. I am really thankful for their support during this project, and the years before: Natasha Berting, Angeliki Diakrousi, Alexander Roidl, Alice Strete & Zalán Szakács.
+For all others, I hope the work on this project didn't disturb you too much.
+Joca van der Horst
+<strike>Rotterdam, April 2019</strike>
+</div>
+</div>