User:Joca/essay Unleash the assistants

Intermission A

Line Clova speaker here

II - Unleash the assistants

Are you ready to get started? I tap on the speech bubble that says Yeah, let's do it. An indicator appears with three bouncing dots. Someone on the other end of the chat is typing. OK, let's get you the latest news. A GIF of a rapping Michelle Obama appears on my screen. Then a new message comes in: it took only two weeks for Michelle Obama's memoir Becoming to top the 2018 book charts. I can reply using one of two options: Next, or 📚 👏

The publication Quartz is now testing a chat interface for their online publication. (source: https://www.adweek.com/digital/quartzs-new-chatbot-is-bringing-conversational-news-to-facebook-messenger/)

Quartz, a website for business-related news, envisions that this is the future of reading news: you chat with it. Two years ago the publication launched the Quartz Brief app, in which a jolly chatbot guides you through the news by sending story blurbs with funny GIFs and occasionally an advertisement. The app taps into the rising popularity of chat as a way to interface with digital services. This trend is especially visible in China, where WeChat is the go-to app for everything from ordering groceries to buying concert tickets. (Grover, 2014)

The chatbot is heavily restricted in its conversations as I am only allowed to send emoji or skip to the next article. One could even argue that there is no conversation happening at all, as Margaret Rhodes (2016) stated in her article in Wired after interviewing the creators of the app: 'A conversation is an exchange of ideas between two or more parties, and in Quartz’s app the user doesn’t express any original thought'.

Although these constraints are clear to me as a user, the messages do feel personal, or at least more engaging than a block of content that floats by in a news feed. There is some logic in the statement once made by Matt Webb (2015) that it is strange not to use the same language with our software as with our friends: chatting.

Newsy speakers

Chatbots as an interface to the news are not common yet, but many news media are working on podcasts at the moment. Interestingly enough, these examples of audio journalism share the same appeal that the Quartz bot has: they feel more personal and engaging than text or video. This leads to an audience that listens for a long time each session. To journalists this is attractive. At the launch of The Guardian's daily podcast, host Anushka Asthana voiced her ambition to delve '(...) further into the big stories and cutting through the noise to take our listeners behind the headlines'. (Guardian press office, 2018)

Following this logic, voice-activated smart speakers like Google Home and Amazon Echo are fantastic interfaces for news. You can talk to the digital assistant in a way that is even more personal than the chatbot of Quartz. And the speaker will talk back, like a personalized podcast. Listening to the news is heavily promoted by Amazon and Google. A news anchor function is integrated into both voice platforms. Google Assistant lets you scan swiftly through the press with commands like 'Play BBC Minute at 2X speed'. Using hours of news broadcasts, Amazon trained its Alexa platform to speak like news anchors do. (Vincent, 2018) The idea is that small nuances like accentuation of keywords, differences in speed and even a whisper mode make the computer voice more enjoyable to listen to, as they come closer to what people are used to from the way news is presented on broadcast media.

And although the adoption of these digital assistants is growing faster than that of smartphones and tablets in their early stages, there is something strange: news consumption on smart speakers is lower than you might expect from their popularity. (Newman, 2018)

Digital butler

There are some practical reasons for that, as Nic Newman shows in his research at the Reuters Institute for the Study of Journalism. The most pressing one is the quality of the news briefings on smart speakers. Users complain that they are too long, not up to date and that the production quality is lagging behind.

Another problem is the attribution of the news. It is unclear to users where the stories come from, and how they can control which publications are part of the briefing. Attribution is an important aspect of news, originally as a way to show that a newspaper checked that the author is following the standards of the publication (Barnhurst and Nerone, 2001). Nowadays the focus of the byline has shifted from the author and paper to the person sharing the article in a newsfeed on Facebook or Twitter.

In comparison to that, the conversational interface seems more like a black box, and in the end, most users prefer other devices to stay updated about the news. Newman concludes that smart speakers and conversational interfaces are still in an early stage of development. He states that the problems with news on smart speakers illustrate '(...) how critical the development of more device-specific content might be -- along with better user interfaces'.

Newman proposes dedicated tools for publishers to create content for smart speakers, an emphasis on short 1-minute bulletins and heavy branding of the audio to make it clear to users which publication they are listening to. What he does not discuss, however, is the archetypical role of the smart speaker: a digital assistant.

The envisioned role of speaking computers as virtual butlers has a long history. In the early 1960s, IBM demonstrated the Shoebox, a device that recognized 16 spoken words, including the ten digits from 0 to 9. People could use it as a voice-controlled calculator. (IBM Archives, 2003) A more elaborate vision of the virtual assistant is Apple Computer's concept video about the Knowledge Navigator. In this video, a digital assistant with a bow tie assists a professor in his research to save the Amazon forest and reminds him of his daily duties. The interaction between the professor and the digital butler is an exchange of commands and blurbs of information, including a reminder to pick up a birthday cake. Looking at the way smart speakers are currently advertised, this vision of conversational interfaces is pretty much the same: a virtual assistant that picks up the phone and plans a meeting was already a concept in 1987. The difference in 2018 is that Google's Duplex assistant is actually able to call a restaurant and reserve a table for two.

Master/Slave

The digital assistant might be useful for simple tasks, from making an appointment to setting a cooking timer. The current practice of news briefings, however, looks more like a lord in his castle with the butler reading out a newspaper to him.

Even in this strange, one-sided way of engaging with news, current conversational interfaces are doing poorly. In the interviews done by Nic Newman, users of smart speakers complain that the news briefings are not easily consumable due to their length and the unpleasant voice of the digital assistant.

This reminds me of the Master/Slave Dialectic in the Phenomenology of Spirit (1807). In one chapter of this book, Friedrich Hegel describes the dynamic between lordship and bondage. In the beginning the master has the upper hand, living in freedom, but eventually the slave might be better off according to Hegel: he finds meaning in and through labor, while the master sinks into empty consumption and becomes wholly dependent on the enslaved (Siep, 2014). Are we in this case the masters that want to consume news efficiently, while the virtual assistant silently collects data and becomes smarter?

Another problematic aspect of this stereotypical role is that meaningful engagement with journalism is more than consumption of information. Earlier I referred to Johanna Drucker's work on humanistic interfaces, which is mostly focused on scholarly reading. This mode of interacting with information has some similarities with news reading, in that they both rely on critical insight and the idea that reading or listening is just the start of a further conversation. Following that idea, you can’t have a meaningful news experience for everything you’re reading because it requires a certain kind of cognitive attention.

Smart speakers are now designed to offer quick info about the weather or to give direct control over appliances in the home. The news briefings have a strange position there, as they are too long to be as immediate as a light switch. On the other hand, they are too short and functional to go as deep as, for example, a podcast can go.

Then comes the question of what role the conversational interface should have in the context of news. I’m advocating for slower and more in-depth information. Smartphones work great for glancing at information and snacking on some headlines. Audio, as done by smart speakers, could be great to go a step further. Instead of focusing on efficiency and approaching the speaker as a tool, I see that there is space to make more use of the qualities of audio, its intimacy, immersiveness, and its character, to use this medium for new ways of storytelling and offering a literary experience. What new interpretation could a robot give to a text? Could the character of the personal digital assistant influence the way news is presented to people?

Intervention by haiku

Popular conversational interfaces like Siri, Alexa, and Assistant are designed to serve their users. Another characteristic they share is their aim for a universal and neutral personality. Google Assistant has the same character and way of working on a smart speaker as in the smartphone app. If there are any biases, the systems are designed not to be explicit about them. (Bogost, 2018)

I believe, however, that an unleashed virtual assistant would be a conversational interface that embraces its biases and shows its unique personality. A rationalistic smart speaker would look and work differently from a progressive liberal smart speaker. They could not only serve the news in a briefing but also ask questions to provoke users, maybe annoy them. The unleashed assistant would not exclusively treat the human as a mere consumer, but maybe as a conversation partner, if the character of the interface prefers that role. The speaker is not a butler, but more of a companion with whom you have a conversation around a dining table. An entity that brings surprise and is not at all times friendly and docile.

By playing with and provoking the user, I imagine that these rogue digital assistants create a space where critical insight is facilitated. In her work, Johanna Drucker calls this the humanistic interface (2014), although she mostly refers to graphical user interfaces there.

Maybe the start of the humanistic conversational interface is the happy newsbot in the Quartz app. Its voice is written by the editors working at the publication. After its initial success, there is now a new entertainment bot modeled after the culture and gossip bloggers at Quartz. The publisher continues these experiments in its Bot Studio, where it explores bots as a way to publish news.

Although limited, the current bot already provides some delightful interventions in my day. Today it decided to end the day differently. Instead of delivering a briefing of today's news, the bot wrote a haiku that made me reflect on the stock market:

Trade wars and rate hikes

Are looming. At least today

We can catch our breath