User:Joca/draft 03: Difference between revisions
Revision as of 18:42, 12 February 2019
As a first draft I want to show the structure of the thesis and the finished three essays that form the backbone. Besides that, I made a draft of one of the intermissions: pieces of speculative design fiction about the smart speakers I want to design. (see User:Joca/A play for smart speakers)
Introduction
Work in progress
I - Reading between the headlines
The future is a utopia with everlasting afternoon sun and computers that correctly understand people. At least, that is the techno-perfect Los Angeles as portrayed in Spike Jonze's movie Her (2013). While transcribing people's thoughts and organizing their lives, the digital assistants even have the time to fall in love with some of their users.
Even in the LA of the future, the latter is controversial. However, there is a different scene that I remember in particular. During his commute, the main character Theodore Twombly checks the news. I expected that something spectacular would happen: Could a virtual news anchor give an update? Is the latest news combined with matching memes that pop up as holograms? Or, if it were not about the spectacle, at least this part would be as carefully designed as the rest of the movie.
In the end, none of that follows. Even in this bright future where everything is possible, the digital assistant spits out the latest headlines and Theodore falls for a clickbait article.
This is surprisingly similar to the way news is designed on conversational interfaces in 2019. By this type of interface, I mean digital assistants like Line Clova and Google Assistant that are built into smartphones, as well as devices like the Google Home and Amazon Echo. What they have in common is that they provide a way to interact with digital systems with your voice, without the need to touch a screen. Ask Alexa, Assistant, or Siri for the news, and you get a briefing that features many headlines with an unknown author. If you want to know more about a specific event, the speaker will read out the article or play a particular part of an hourly news broadcast.
Although the news briefing is often demoed in the commercials and presentations of these smart speakers and digital assistants, it is not commonly used by the people who own these devices. The Verge, an online publication focused on technology and culture, concluded that 'Smart speakers have no idea how to give us the news.' (Brandom, 2018)
Between gossip and societal discourse
Making a computer voice read some text from the internet, without giving any context, is a questionable approach to news, but it is understandable if you take into account how conversational interfaces are mostly used nowadays: as digital personal assistants that help people control their devices, in such a way that the actual technology and complexity stay out of sight. This might be convenient for controlling light switches in a home or streaming music, the most popular use cases for smart speakers at the moment (Molla, 2018). However, this urge for a nearly invisible interface ignores some specific characteristics of news and the role of journalism.
In The Elements of Journalism (2007) Kovach and Rosenstiel describe a moment where anthropologists compare their notes on communication and find some universal patterns for news throughout history and cultures: news has a certain momentness and pace. It is about what is happening outside of people's own experience but has an impact on their lives. The authors conclude that journalism is 'simply the system societies generate to supply this information about what is and what's to come' and that it is part of the human instinct to be curious.
Historian Michael Schudson (2013) argues in reaction to Kovach and Rosenstiel that gossip might be universal, but the form and the conventions in journalism are highly dependent on the time and culture. For Schudson 'journalism today relates to the constitution of public life, an institution in which criticism and discussion is part of its function', but to get that role journalism was dependent on something like the existence of a public sphere starting in the eighteenth century and organized efforts to publish a collection of current affairs periodically.
Balancing between gossip and societal discourse, the idea of popular news and quality news is visible in different media, from newspapers to online. What these media have in common is that they try to create a meaningful news experience for their audience. To me, such an experience is based on the idea that you read, see, or listen to coverage of a particular news event and that it sparks a thought. Beyond the transfer of information to the brain, you start to think about how this news event relates to you as a person and your position. In the case of an article about the new outfit of Kim Kardashian, it could be a chat I have with a friend after sharing the article. In the case of an article about corruption in the Brazilian government, it might motivate me to start a protest. In both cases, the article is not the end. It's a trigger for some extension.
There is a whole tradition of creating news media that facilitate a meaningful news experience fitting the character of the publication. It lies not only in the choice of news to cover but also in the ways the report is designed: from the visual design of the newspaper to the role of the news anchor on broadcast television. The current form of the story on smart speakers mostly ignores these traditions, offering few cues about the situatedness of the news. Before further discussing the causes and proposing an alternative, it is good to see how news found its form in other media.
The archetype of the modern newspaper developed in the United States during the first half of the 20th century, as Barnhurst and Nerone describe in The Form of News (2001). Innovation in mass-printing and photography enabled a different way of reporting. Where text had been used to record events, photography took over to give visual immediacy to the news. The role of the writing journalist changed into filtering the newsworthy aspects of an event instead of writing a full chronological report. This resulted in shorter, matter-of-fact articles and in the idea that a newspaper should take a stance of objectivity.
Looking at the journal of record and the true tabloid, the difference in content is reflected in the design. In research on Finnish and English newspaper design, Jesso Lamberg (2015) identifies a difference in visual energy: the quality newspaper has a more uniform and text-focused design, while popular publications tend to focus on more visuals, tilted images and experimental forms of designing a newspaper. Lamberg sees that particular elements from popular newspapers have found their way to the papers of record, for example the use of big photos and little text on the front page, and the adoption of the tabloid paper format. In the end, he concludes that many aspects of newspaper design are not just functional, but also act as a way to show the character of the newspaper to the audience.
Character also plays an important role in the way news is designed on broadcast media. Radio news started with reading out the headlines of the newspaper, but later developed its own conventions. Radio news had to be vivid, and a good news anchor was one who could bring the story to life by the way it was told. With the arrival of television, this changed. With the addition of images in the form of video, still photos and graphics, a good news anchor became a person who could interpret the items on the screen (Ponce de Leon, 2015).
Snowfalling
With the move to online media, there is an interesting difference in comparison to newspapers and broadcast media. A lot of news articles are not read on the website of a news organization, but aggregated and shown on different platforms.
Where in the newspaper age the front page was the de facto way to lure in readers and to show the identity of a news medium, the homepage of a news website could be declared dead. In a time where many news media are focused on how their content is displayed in Google News, or try to game the news feed algorithm of Facebook, defending your own platform is quite a statement. S. Mitra Kalita, the vice president for programming at CNN Digital, did so in front of an audience of social media marketers.
Although articles are distributed on other platforms, and people might not need to visit the site and apps of a news medium, it pays off to focus on the homepage according to Kalita: it is an interface over which you have full control, and one which is used by the most loyal readers (Beard, 2018).
By structuring and showing its content differently on each medium, CNN tries to package the same story in different ways that are relevant to readers in various situations. To some extent, the interface guides the experience of the user.
The influence of the way news is designed is not limited to the sites and apps of news outlets. Even when focusing on getting your story across on other platforms than your own homepage, there is an interface around that content.
For people who create the content, the scope of the interface is limited to the one article, video or VR environment they are working on. The poster child of that approach is Snow Fall (Branch, 2012). This long read about an avalanche in Tunnel Creek was published by the New York Times. It featured a groundbreaking design with full-screen videos, slideshows, and special effects that appeared as the reader scrolled through the article.
Many other publications copied the format and snowfalling was declared to be 'the future of journalism' because it showed the promise of high-quality long-form journalism online as a way to attract and engage an audience. It didn't work out that way, mostly because of the time and resources it takes to create these rich articles with custom interfaces: Snow Fall alone took six months to make and had 11 people working on it (Thompson, 2012).
Job to be done
To understand the way digital news media are designed and how the rationale behind the design differs from older media, it is good to dive deeper into the foundations of what is considered 'good' interface design.
With the rise of the web, the quality of the interaction design became one of the critical aspects for having people use websites. In the nineties, the term user experience design (UX) was coined to focus on all issues of user interaction with the products and services of a company. (Nielsen & Norman, 1998) The field combines interaction design with psychology and many other disciplines that touch upon that topic.
Although identified as a field of design quite recently, the practice of UX already started in the fifties, with foundations that come from Human-Computer Interaction (HCI). Traditionally, this discipline approaches interface and usability from a task-based and rationalistic perspective.
HCI has since incorporated the importance of the experiences and motivations of people using an interactive system (Bødker, 2006), and you see traces of that in UX design. The question, however, is what type of user experience is the goal. People are still users with a job to be done, preferably in an efficient way.
The notion of user experience has gained a lot of traction since the dot-com bubble, through the work of firms like the Nielsen Norman Group and the inclusion of UX in the curricula of design schools. Best practices are shared in digital libraries of interaction patterns: from buttons to colors to clear text.
Humanistic interface
This led to universally accepted practices for navigating interfaces and driving user conversion: getting users to fulfil the goals for which an interface is designed. This could be buying goods, or spending time, for example. Good conversion makes the difference between a bankrupt and a profitable webshop. However, are the foundations for a successful e-commerce interface necessarily the same for the interface of a news medium?
In current task-focused UX theory, the activity of interpreting information has a secondary role. The website, app, or any experience, is seen as a means to an end. But there are other perspectives on this digital space.
Media scholar Johanna Drucker sees the interface as the combination of what we read and how we understand. People control the interface as much as the interface ignites a particular user experience. A website is then not merely a tool, but a space where people engage with information and the systems behind it.
Building upon this idea is the concept of the humanistic interface (Drucker, 2014). This theory fills the interpretation gap in the current practice of UX design because it approaches the interaction design from the idea of critical insight: focusing on comparison, contrast and offering space to make meaning instead of merely presenting content efficiently.
Drucker stays quite abstract about what these websites and apps would look like, stating that 'the humanistic interface is still in its infancy.' There are, however, examples of designs and interactions that fit this idea. Blendle is an online service where people can pay for individual articles of major print publications. In its interface Blendle puts a lot of effort into recreating the visual style, from colors to the official fonts, of papers like Bild Zeitung or a magazine like Vanity Fair. It helps people to visually situate the articles (Spijkerman, 2015). In the selection of featured items on the front page, Blendle tries to balance personalization with the careful selection of pieces that offer a different point of view to that of the user.
Interfacing the comment section differently is one of the methods used by De Correspondent to involve readers and apply their knowledge. A dedicated editor for 'conversations' features interesting comments on the front page, and tries to pair readers with specialists in the comment section. Their goal is to use these conversations as starting points for new research, and it is a way to see what readers get from the articles (Wijnberg & Martèl, 2018).
On smart speakers
Although it seems like a detour, discussing the search for the form of news on media like newspapers, radio and websites makes sense in the context of smart speakers. Design conventions of papers found their way, in different styles, to broadcast media and sites. Even in a non-visual medium like a smart speaker, this is important as most content is taken from articles on news websites. Besides that, these historical examples can also serve as a source of inspiration. A smart speaker is not a radio. Instead of receiving a broadcast, it sends narrowcasts to a smaller audience. And besides sending information, it also captures data like voice recordings, which are sent back to the datacenters of its creators. Even with these differences in mind, it can take some cues from news anchors in reporting news in an engaging way and showing the character of the news medium through audio alone. UX theory and its emphasis on efficiency in interface design play a vital role in online design, but they also influence the way smart speakers are designed. In the next chapter, I will further discuss the positioning of conversational interfaces as digital assistants and tools, and why this is an obstacle to facilitating activities that require critical insight, like following the news. (Google, 2018)
Intermission 01
Work in progress
Essay 02
Intermission A
Line Clova speaker here
II - Unleash the assistants
Are you ready to get started? I tap on the speech bubble that says Yeah, let's do it. An indicator appears with three bouncing dots. Someone on the other end of the chat is typing. OK, let's get you the latest news. A GIF of a rapping Michelle Obama appears on my screen. Then a new message comes in: it took only two weeks for Michelle Obama's memoir Becoming to top the 2018 book charts. I can reply using one of two options: Next, or 📚 👏
Quartz, a website for business-related news, envisions that this is the future of reading news: you chat with it. Two years ago the publication launched the Quartz Brief app, in which a jolly chatbot guides you through the news by sending story blurbs with funny GIFs and occasionally an advertisement. The app taps into the rising popularity of chats as a way to interface with digital services. This trend is especially visible in China, where WeChat is the go-to app for everything from ordering groceries to buying concert tickets (Grover, 2014).
The chatbot is heavily restricted in its conversations as I am only allowed to send emoji or skip to the next article. One could even argue that there is no conversation happening at all, as Margaret Rhodes (2016) stated in her article in Wired after interviewing the creators of the app: 'A conversation is an exchange of ideas between two or more parties, and in Quartz’s app the user doesn’t express any original thought'.
Although these constraints are clear to me as a user, the messages do feel personal. Or at least more engaging than a block of content that floats by in a news feed. There is some logic in the statement once made by Matt Webb (2015) that it is strange not to use the same language with our software as with our friends: chatting.
Newsy speakers
Chatbots as an interface to the news are not common yet, but many news media are working on podcasts at the moment. Interestingly enough, these examples of audio journalism share the same appeal that the Quartz bot has: they feel more personal and engaging than text or video. This leads to an audience that listens for a long time each session. To journalists this is attractive. At the launch of The Guardian's daily podcast, host Anushka Asthana voiced her ambition to delve '(...) further into the big stories and cutting through the noise to take our listeners behind the headlines' (Guardian press office, 2018).
Following this logic, voice-activated smart speakers like Google Home and Amazon Echo are fantastic interfaces for news. You can talk to the digital assistant in a way that is even more personal than the chatbot of Quartz. And the speaker will talk back, like a personalized podcast. Listening to the news is heavily promoted by Amazon and Google. A news anchor function is integrated into both voice platforms. Google Assistant lets you scan swiftly through the press with commands like 'Play BBC Minute at 2X speed'. Using hours of news broadcasts, Amazon trained its Alexa platform to speak like news anchors do (Vincent, 2018). The idea is that small nuances like accentuation of keywords, differences in speed and even a whisper mode make the computer voice more enjoyable to listen to, as they come closer to what people are used to from the way news is presented on broadcast media.
And although the adoption of these digital assistants is growing faster than that of smartphones and tablets in their early stages, there is something strange: news consumption on smart speakers is lower than you might expect from their popularity (Newman, 2018).
Digital butler
There are some practical reasons for that, as Nic Newman shows in his research at the Reuters Institute for the Study of Journalism. The most pressing one is the quality of news briefings produced by smart speakers. Users complain that they are too long and not up to date, and that the production quality is lagging behind.
Another problem is the attribution of the news. It is unclear to users where the stories come from, and how they can control which publications are part of the briefing. Attribution is an important aspect of news. It started as a way to show that a newspaper had checked that the author followed the standards of the publication (Barnhurst and Nerone, 2001). Nowadays the focus of the byline has shifted from the author and paper to the person sharing the article in a news feed on Facebook or Twitter.
In comparison, the conversational interface seems more like a black box, and in the end, most users prefer other devices to stay updated about the news. Newman concludes that smart speakers and conversational interfaces are still in an early stage of development. He states that the problems with news on smart speakers illustrate '(...) how critical the development of more device-specific content might be -- along with better user interfaces'.
Newman proposes dedicated tools for publishers to create content for smart speakers, an emphasis on short 1-minute bulletins and heavy branding of the audio to make it clear to users which publication they are listening to. What he does not discuss, however, is the archetypical role of the smart speaker: a digital assistant.
The envisioned role of speaking computers as virtual butlers has a long history. In the early 1960s, IBM demonstrated the Shoebox, a device that recognized 16 spoken words, including the ten digits from 0 to 9. People could use it as a voice-controlled calculator (IBM Archives, 2003). A more elaborate vision of the virtual assistant is Apple Computer's concept video about the Knowledge Navigator. In this video, a digital assistant with a bow tie assists a professor in his research to save the Amazon rainforest, and reminds him of his daily duties. The interaction between the professor and the digital butler is an exchange of commands and blurbs of information, including a reminder to pick up a birthday cake. Looking at the way smart speakers are currently advertised, this vision of conversational interfaces is pretty much the same. A virtual assistant that picks up the phone and plans a meeting was still a concept in 1987. The difference in 2018 is that Google's Duplex assistant is actually able to call a restaurant and reserve a table for two.
Master/Slave
The digital assistant might be useful for simple tasks, from making an appointment to setting a cooking timer. The current practice of news briefings, however, looks more like a lord in the castle with the butler reading out a newspaper to him.
Even in this strange, one-sided way of engaging with news, current conversational interfaces are doing poorly. In the interviews done by Nic Newman, users of smart speakers complain that the news briefings are not easily consumable due to their length and the unpleasant voice of the digital assistant.
This reminds me of the Master/Slave Dialectic in the Phenomenology of Spirit (1807). In one chapter of this book, Georg Wilhelm Friedrich Hegel describes the dynamic between lordship and bondage. In the beginning the master has the upper hand, living in freedom, but eventually the slave might be better off according to Hegel: he finds meaning in and through labor, while the master sinks into empty consumption and becomes wholly dependent on the enslaved (Siep, 2014). Are we in this case the masters who want to consume news efficiently, while the virtual assistant silently collects data and becomes smarter?
Another problematic aspect of this stereotypical role is that meaningful engagement with journalism is more than consumption of information. Earlier I referred to Johanna Drucker's work on humanistic interfaces, which is mostly focused on scholarly reading. This mode of interacting with information has some similarities with news reading, in that they both rely on critical insight and the idea that reading or listening is just the start of a further conversation. Following that idea, you can’t have a meaningful news experience for everything you’re reading because it requires a certain kind of cognitive attention.
Smart speakers are now designed to offer quick info about the weather or to give direct control over appliances in the home. The news briefings have a strange position there, as they are too long to be as immediate as a light switch. On the other hand, they are too short and functional to go as deep as, for example, a podcast can go.
Then comes the question of what role the conversational interface should have in the context of news. I’m advocating for slower and more in-depth information. Smartphones work great for glancing at information and snacking on some headlines. Audio, as done by smart speakers, could be great to go a step further. Instead of focusing on efficiency and approaching the speaker as a tool, I see space to make more use of the qualities of audio, its intimacy, immersiveness, and character, and to use this medium for new ways of storytelling and offering a literary experience. What new interpretation could a robot give to a text? Could the character of the personal digital assistant influence the way news is presented to people?
Intervention by haiku
Popular conversational interfaces like Siri, Alexa, and Assistant are designed to serve their users. Another characteristic they share is their aim for a universal and neutral personality. Google Assistant has the same character and way of working on a smart speaker as in the smartphone app. If there are any biases, the systems are designed not to be explicit about them (Bogost, 2018).
I believe however that an unleashed virtual assistant would be a conversational interface that embraces its biases and shows its unique personality. A rationalistic smart speaker would look and work in another way than a progressive liberal smart speaker. They could not only serve the news in a briefing but also ask questions to provoke users, maybe annoy them. The unleashed assistant would not exclusively treat the human as a mere consumer, but maybe as a conversation partner if the character of the interface would prefer that role. The speaker is not a butler, but more of a companion with whom you have a conversation around a dining table. An entity that brings surprise and is not at all times friendly and docile.
By playing and provoking the user, I imagine that these rogue digital assistants create a space where critical insight is facilitated. In her work, Johanna Drucker calls this the humanistic interface (2014), although she mostly refers to graphical user interfaces there.
Maybe the start of the humanistic conversational interface is the happy newsbot in the Quartz app. Its voice is written by the editors working at the publication. After its initial success, there is now a new entertainment bot modeled after the culture and gossip bloggers at Quartz. The publisher continues to experiment with bots as a way to publish news in its Bot Studio.
Although limited, the current bot already provides some delightful interventions in my day. Today it decided to end the day differently. Instead of delivering a briefing of today's news, the bot wrote a haiku that made me reflect on the stock market:
Trade wars and rate hikes
Are looming. At least today
We can catch our breath
Intermission 02
Work in progress
Essay 03
III - The Ghost in the Speaker
Should we be kind to our smart assistants? In Why'd You Push That Button, a podcast about the social dynamics around technology, a mother of a six-year-old gives the following answer to this question: 'We really want him to understand them that you have conversations with people and how you have them. Having a robot or a smart assistant that will answer to you no matter how you speak with them, well that is not life, even though it is life, but it is not real life.'
The slight confusion in this quote gives a hint of the power of conversational interfaces to give the illusion of consciousness, even just by the use of audio. The examples in the first chapter show that sound is a medium that can express character, as was typically done in radio and broadcast. The interactivity of a smart speaker allows for a different kind of storytelling. In the context of journalism, it might fit a slower type of news than is typically offered now with the news briefings on smart speakers.
This potential is not used at the moment for various reasons: one is the lack of content specifically designed for consumption on speakers (Newman, 2018). Moreover, the smart speaker is a new medium, which starts with presenting old media as its content before it develops its own genres (McLuhan, 2002).
On the other hand, the space to experiment with content forms is somewhat limited by the way these conversational interfaces are positioned in the market. The dominant platforms are designed to create assistants that act on user commands. They are designed as an efficient tool, rather than a way to enrich 'our own capacities to think, feel and act' as formulated by Brenda Laurel in her thoughts on interfaces in the book Computers as Theatre (2013).
In this chapter I want to speculate on an interaction design for a smart speaker that allows for a different way of storytelling. I start from the role and character of the speaker, to give it more agency than an assistant. Then I elaborate on what this means for how these speakers could act in a conversation, using the ideas of Computers as Theatre in the context of smart speakers. I conclude with different possibilities to present news on smart speakers using these ideas, ranging from more realistic to more speculative scenarios.
The speaker as a spirited object
To broaden the possibilities for interactions between humans and conversational interfaces like smart speakers, it helps to consider a different role than the one of a virtual assistant, because of the constraints that are part of the master-slave relationship connected to it.
In a significant number of science-fiction movies, the alternative role that is proposed is that the robot takes the final lead and kills all humans on its way to keep its power. On the spectrum from servant to killer with absolute power, there are many different roles to consider that give a conversational interface more agency. I would like to first discuss two speculative design projects on digital assistants that deal with this idea, before moving on to my metaphor of smart speakers as spirited objects.
With Foresight (2017) the designer David van Gelder de Neufville envisions a digital assistant that gets its agency based on the data and permissions given to it by its users. The system has a persona called Athena that helps its users and sometimes proactively pops up, but also denies specific requests, for example when one of the family members closes down Athena's access to her agenda and private messages.
Based on information from social networks, smart light bulbs and private chats, Athena observes what all family members did, are doing and will do. One of the questions that De Neufville asks here is whether the assistant can create its own reality using the data and, following from that, its own awareness based on the knowledge and freedoms that it gained from its users.
The idea of the bot as a companion is further researched in Karin Anders (2017), a speculative design research project by Karin Fischnaller that focuses on the digital assistant as an alter ego that could be a sparring partner for a designer. In her thesis, she argues that the bot does not need to be a prosthesis, but can be a partner that brings in new ideas and also discusses the input brought in by the designer. The added value of the intelligence lies here in the collaboration between a human and a computer, who complement and conflict with each other, much as in normal social interactions.
In her research, Fischnaller refers to the actor-network theory of the French philosopher Bruno Latour (Latour, 2005). He uses the term actants for non-human entities that can perform actions in the world and have a form of agency. For the context of smart speakers, I like how Jensen and Blok (2013) elaborate on this idea by connecting the actants to Japanese Shinto-inspired techno-animism. In the Shinto religion, there is a focus on the idea that things change form from non-human to human, from the real world to the other world. Spirits inhabit living creatures, but also natural objects. Techno-animism extends this idea to electronic devices.
Another school of thought that connects to this idea is the techno-Buddhism represented by the pioneering robot scientist Masahiro Mori. In the 1980s he wrote The Buddha in the Robot (Mori, 1981), where he states that '(...), there is no master-slave relationship between humans and machines. The two are fused in an interlocking entity. (...) Man achieves dignity by recognizing the same Buddha-nature that pervades his own.' Like other traces of religion in Japan, these ideas are not applied in a strictly religious way by most people. Their use is more socially constructed and seen as a way to maintain order and do good. (Kawano, 2005)
After World War II the government and industry tried to gain widespread acceptance for robots by pointing out their relation to traditional Japanese culture. (Ito, 2007) Because of this development, the Buddhist- and Shinto-inspired researchers in Japan were more influential than robotics researchers in the West, who worked with a more functionalist approach. (Vallverdú, 2011) Contrary to what is often thought, this does not mean that Japan has a special relationship with robots. It does, however, result in a public perception that is slightly more receptive to, and realistic about, the role of robots in society, as cross-cultural studies show. (Bartneck et al., 2015)
The servant | The villain | The companion | The teacher | The avenger | The divine power |
---|---|---|---|---|---|
This actant is in service of the person. The power is fixed on the side of the human. | An actant that craves the bad. It haunts the person, but may not be fully in control of its own obsessions. | An actant that feels like a friend to a person. The power in this relation always shifts between actant and human. | This actant tries to warn a person about something they are doing wrong. It intervenes in the human's existence to teach a lesson about life. It subtly takes the lead, but in the end lets the human take over to find their own way. | This entity is looking for revenge after it has been badly treated. It is dominant and sees the human as prey. | An actant that moves beyond what is comprehensible to human beings. |
e.g. | e.g. | e.g. | e.g. | e.g. | e.g. |
As a metaphor for smart speakers, I find the idea of spirited technology useful, because techno-animism relies on the idea of space and material. The intelligence does not flow freely through space but can live in a device like a smart speaker. Another interesting aspect, in contrast to functionalist thinking about non-human creatures, is that these 'spirits' do not necessarily have a backstory that explains their behavior. The spirit in a speaker might be a ghost that has much knowledge thanks to its internet connection, but on the other hand, it is not able to move out of the speaker. Sometimes it is willing to help its user, but sometimes it needs your help to do something. As these ghosts are bound to their device, different speakers feature different ghosts, each with its own distinct personality.
Within this metaphor, the speaker is intelligent and might have a certain degree of consciousness, but at the same time, it is unable to do some things that humans are capable of. It becomes a mysterious object with a degree of agency that users can discover through conversation with it.
Given the state of technology at the moment, the idea of a spirited speaker is, of course, a metaphor. It might become more realistic in the future, just as the Mechanical Turk was an early vision of a chess computer like Deep Blue, but the metaphor mainly serves a different goal: it is a way to envision an exchange between people and conversational interfaces that is somewhere in between voice commands and social conversation.
The speaker as a player
When a smart speaker has the persona of a ghost that lives in the speaker, the next question is what this means for interaction with users, and how these characteristics of the medium can be used for more exciting ways of publishing news on these devices.
In The Media Equation (1996), Reeves and Nass argue that interactions between humans and computers are similar to social interactions. Media equals real life, and in our use of media, the same social codes apply as in interactions with other people. The illusion of some form of intelligence and autonomy could be enough for people to believe in it. The ideas of Brenda Laurel in Computers as Theatre (2013) connect well to this idea. In the book, Laurel uses theatre as a model for interaction design. When the first edition was published in the '90s, Laurel's intended applications were initially games or virtual reality. The idea of the interface as a player is, however, particularly useful for a smart speaker, because of the importance of conversation and character that it has in common with theatre.
Like the smart speaker, its user becomes an actor. A big difference from actual theatre is the setting of the play. The stage is not in the public space but in a domestic environment, and the play relies on the exchange between the human and the computer.
Laurel shows that human-computer interactions work as an organic whole and that they feature dramatic structural characteristics. Like a playwright, an interaction designer creates a space for possible actions, where the design of objects, characters, and environments serves this goal. Choices made for, or by, people using a computer can make particular situations more likely to happen. Interaction should be made clear in the context of the representation: sources of agency are represented explicitly, using the characters that are part of the 'play,' and so are the objects, the environment and the potential of all these items.
Implications for storytelling
For a smart speaker that tells a news story, there are multiple ways to incorporate this vision. In line with the current news briefings done by speakers, I imagine that instead of one universal assistant that reads a 'one-size-fits-all' overview of headlines, people could choose a particular character that fits the view on the news they want. Imagine a speaker that treats celebrity news like the presenter of an entertainment news show. It would pick news from more popular sources, feature a lot of audio effects that create the energy typical of these kinds of shows, and maybe ask you at the end how you feel about Kim Kardashian's newest dress.
Are you more interested in the social dynamics behind the influence of celebrities on popular culture? Then a speaker that is modeled after a media critic might be a better choice. This speaker will prefer background pieces about the role of celebrities, focusing more on the culture set by Kim Kardashian than on her newest dress. It could ask for your opinion on the topic and present articles that support or conflict with it. The sound design of this speaker is calmer and more sober.
Neither speaker pretends to offer a full view of the world. What they do, however, is situate their news selection and presentation by attributing their sources and by incorporating certain modes of reading and sound design that make their character more explicit to the user. When pieces are designed more specifically for speakers as a medium, it is possible to take this idea further in a scenario that looks more like a play.
Imagine that you put the entertainment speaker and the media critic speaker next to each other and that they tell the story together. One speaker could start by arguing that celebrities are role models for the general public, and the other speaker could illustrate that with the latest headlines. In this exchange, the power shifts from one speaker to the other, to the user and back.
The authorship for these scenarios could be approached in different ways. Heavily scripting all interactions, with more constrained choices for the people using the smart speaker, is a mode of working used by the Quartz bots mentioned in chapter 2. As technology progresses, it becomes possible to generate more parts of the story, the questions to the user and the included sounds. In this situation, the authorship is shared by the interaction designer and a journalist, who define a set of rules and content that fits the story, and the people who use the speakers to discover various 'states' of the story.
The idea of seeing smart speakers as spirited devices that are actors in a play might sound a bit esoteric. However, it is possible to identify aspects of this idea in current speakers. As strongly as some may argue that digital assistants are tools that shouldn't have character, the Google Assistant actually has a detailed backstory: she comes from Colorado, loves kayaking and is the daughter of a research librarian, says James Giangola, a lead conversation and persona designer for Google Assistant, in an article in The Atlantic. To fine-tune the personality, the big players are eager to hire storyboard artists and persona designers from different film studios in Hollywood. (Schulevitz, 2018)
In that sense, the ideas expressed in this chapter elaborate on the importance of character and agency for more exciting and meaningful interactions with conversational interfaces like smart speakers. However, instead of using the personality to dress up an existing function like getting the latest headlines, I see potential in using the character and conversational skills of a smart speaker as the starting point for designing stories on this medium. While making this point, I conveniently put aside important aspects like technical feasibility or the business model behind such a platform. Mainstream smart speakers work and look as they do now because Amazon sees them as an extra portal to its e-commerce platform, and Google as an extra way to collect data and further develop its artificial intelligence applications. At the same time, the whole idea of the conversational interface as a supercharged assistant started as a dreamy idea in movies, books and texts like the one you are reading now.
Intermission 03
Work in progress
Conclusion
Work in progress