User:Nadiners/ Thesis draft

From XPUB & Lens-Based wiki

7 feb

Intro/Chapter 1?

‘To Unpublish’: As I type this word my autocorrect software does not recognise it. It is a new verb introduced in the online Oxford Dictionary:

       "Make (content that has previously been published online) unavailable to the public."

Proper usage of the word is illustrated with two telling and relevant examples:

  • ‘Once the images have been published on the internet it will be practically impossible for any court order to unpublish them.’
  • ‘After an outcry on Twitter, the magazine unpublished the column, but the editors at the blog Retraction Watch managed to find a cached version, reminding us all that the internet never forgets.’

In our daily lives, we now constantly publish content. This is often intentional, as a way to express ourselves. However, just as often we unknowingly leave a trail of digital content which automatically disseminates into the network. We lose track both of where we have been and of what we have told the world about ourselves. To unpublish is to try to retake control. We actively look to remove specific content from the internet, to limit what others can know about us. The dictionary examples remind us that this is easier said than done.

Before proceeding, I'd like to make a distinction between publishing and disseminating data. Data is something given: it is raw, a symbol. Not all data is published. Our privately held medical records are not public in the manner of our Facebook posts. The data that we produce only becomes a publication once it has been analysed and has spread through the network into the public sphere. Publishing, thus, is about making knowledge public (giving it to the people).

As digital memory storage expands, the trail of data that we produce, and ultimately publish, keeps growing without limit. Once it reaches the people and comes out as knowledge, it becomes more meaningful, and much harder to forget, let alone to destroy.

If we break it down even more, knowledge is typically defined in terms of information, and information in terms of data (the DIKW model, first specified in detail by R. L. Ackoff in 1988). 'Data', from the Latin 'datum', means 'something given'. Data constantly gives itself, disseminates itself, but how do you get data back? David Thorne's email thread with Jane illustrates the essence of digital information.


               From: Jane Gilles
               Date: Wednesday 8 Oct 2008 12.19pm
               To: David Thorne
               Subject: Overdue account
                                       
               Dear David,                                                                                  
               Our records indicate that your account is overdue by the amount of $233.95. 
               If you have already made this payment please contact us within the next 7 days
               to confirm payment has been applied to your account and is no longer outstanding.
                                                               
               Yours sincerely, Jane Gilles
                                                                                
               From: David Thorne
               Date: Wednesday 8 Oct 2008 12.37pm
               To: Jane Gilles
               Subject: Re: Overdue account
                                       
               Dear Jane,
               
               I do not have any money so am sending you this drawing I did of a spider instead. 
               I value the drawing at $233.95 so trust that this settles the matter.
                                                                
               Regards, David.
                                       
                                       
               From: Jane Gilles
               Date: Thursday 9 Oct 2008 10.07am
               To: David Thorne
               Subject: Overdue account
                                       
               Dear David,
               
                Thank you for contacting us. Unfortunately we are unable to accept drawings 
                as payment and your account remains in arrears of $233.95. Please contact us 
                within the next 7 days to confirm payment has been applied to your account 
                and is no longer outstanding.
               
               Yours sincerely, Jane Gilles
               
               From: David Thorne
               Date: Thursday 9 Oct 2008 10.32am
               To: Jane Gilles
               Subject: Re: Overdue account
                                       
              Dear Jane,
              
              Can I have my drawing of a spider back then please.
              
              Regards, David.
              
              From: Jane Gilles
              Date: Thursday 9 Oct 2008 11.42am
              To: David Thorne
              Subject: Re: Re: Overdue account
                                       
              Dear David,
              
              You emailed the drawing to me. Do you want me to email it back to you?
              
              Yours sincerely, Jane Gilles
              
              From: David Thorne
              Date: Thursday 9 Oct 2008 11.56am
              To: Jane Gilles
              Subject: Re: Re: Re: Overdue account
              
              Dear Jane,
              
              Yes please.
              
              Regards, David.


The dialogue continues, with Jane insisting that she has returned the original spider: "I copied and pasted it from the email you sent me". But of course, there is no original spider in the way that there would be an original paper drawing. And with every new email in the chain, the spider is replicated again. This is how data is. Once data is created it can only disseminate; the very idea of receiving your data back is absurd. The very nature of data makes it impossible to delete. Any human trying to get rid of it would be working against the force of nature.
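To make the point concrete, here is a minimal sketch in Python. The byte string standing in for the drawing and the send function are my own hypothetical stand-ins, not part of Thorne's exchange, and real email involves far more machinery. The sketch only shows that each "sending" creates a fresh copy, and that no copy can be told apart from any other.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 digest that identifies the content of the data."""
    return hashlib.sha256(data).hexdigest()


def send(attachment: bytes) -> bytes:
    """Hypothetical stand-in for emailing: the recipient gets a new copy."""
    return bytes(bytearray(attachment))


# Hypothetical stand-in for the spider drawing on David's computer.
spider = b"a drawing of a spider"

jane_inbox = send(spider)       # David emails the drawing to Jane
david_inbox = send(jane_inbox)  # Jane emails it "back" to David

copies = [spider, jane_inbox, david_inbox]

# Every copy has the same fingerprint, so none of them is "the original".
all_identical = len({fingerprint(c) for c in copies}) == 1

# "Giving it back" has not reduced the number of spiders; it has added one.
print(all_identical, len(copies))
```

After the exchange there are three spiders where there was one, and nothing in the data itself marks any of them as the one that was sent.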

The intention behind unpublishing is not to delete or destroy anything; the purpose is to limit the public's access to the information. An individual wishing to remove previously published content might do so to protect their privacy and identity. Or a large corporation might want to remove content as a form of censorship, in order to protect its reputation or its commercial success. There are many reasons why someone might want to unpublish.

Nonetheless, the very nature of data can make unpublishing unviable. Once knowledge is public, then even if its destruction is in principle possible, the process of unpublishing becomes harder. If "knowledge" has already spread through the network, limiting access to it won't stop it from spreading.

Chapter 2?

There are many reasons why deleting data is, in practice, impossible. For data to be truly deleted, you must physically destroy the hardware, because otherwise recovery is always possible. For data that is on the internet, meaning on someone else's server, it would be impossible to physically go there and destroy the hardware holding the specific unwanted content. (FOOTNOTE: To destroy hardware one would need a plan as sophisticated as the one described in the TV series Mr. Robot - in which Elliot Alderson, a cybersecurity engineer and hacker, is recruited by "Mr. Robot" to join a group of hacktivists called "fsociety". The group aims to destroy all debt records by encrypting the financial data of the largest conglomerate in the world, E Corp.)

Even if the actual content you want deleted is in fact deleted, its metadata will live on. This is a record of when, where, and by whom a specific piece of content was deleted. Metadata is sometimes more telling than the content itself.

Another backlash effect is very common. When you really try to take something off the internet, you will in fact bring more attention to what you are trying to delete. This is called the Streisand effect. It "is the phenomenon whereby an attempt to hide, remove, or censor a piece of information has the unintended consequence of publicizing the information more widely, usually facilitated by the Internet" (Wikipedia). Newspapers like the Guardian regularly report on failed attempts by some individuals to have sensitive information redacted. And even though Google now removes some search results from its listings, it tells you that it has done so. To a determined bloodhound, this can function like a trail of blood.

All of these obstacles to unpublishing are exacerbated by the growing army of trolls who, in the name of free speech, are willing to go to great lengths to disrupt others' attempts to control their data trail. This can be done with good intent (perhaps as in the early days of WikiLeaks). But often it is not. One of the greatest obstacles to countering revenge porn attacks is the willingness of others, found lurking on 4chan or in Reddit forums, to replicate and reupload images of innocent victims whom they have never met.

Given these challenges to unpublishing, if you'd like something to be unpublished, deleting it is rarely an option. The only way to divert attention from it is to carry on producing content.

"In the past, censorship worked by blocking the flow of information. In the twenty-first century censorship works by flooding people with irrelevant information." (Homo Deus, Harari)

The Nature of Information Makes It Impossible to Unpublish

Abstract

In my thesis I will discuss how the very nature of information does not allow it to disappear. When content goes on the internet, we tend to call it publishing; what I propose is that the reverse, 'unpublishing', is not possible. I will begin by comparing data (the raw material of publishing) to energy: once created, both follow a natural course in which they can be either stored or disseminated. Today we live in a data-centric world, constantly leaving trails of data, whether voluntarily or unconsciously. This will lead me to discuss how data, from anything living or material, constantly feeds the algorithms on the network. This defeats the purpose of publishing: if data is produced to grow the web, and it no longer serves the people (by giving them knowledge) but the people instead serve the algorithm, then what does publishing become? There is an overproduction of data, and because data disseminates so easily there is inevitably a problem of excess. It is all being stored in huge data centres, but what do we do with the junk? If, as is often claimed, approximately 90% of the DNA our bodies carry is useless, must there be a natural reason for this? Is nature telling us again that we can't get rid of data? Does the question then become similar to that of what to do with nuclear waste? As destruction is beyond our capabilities, we just have to find a way to store it harmlessly and make it completely inaccessible. The next chapter turns to accessibility: although we cannot delete, destroy or unpublish information, we still try hard to block access to it and to regulate its flow. I will give various examples of how net neutrality, laws such as 'the right to be forgotten', and content moderators try to regulate the web; for better or for worse, they are working against the force of nature. This does not mean we should give up, however: there is some data that we can control and put away, or that can exist beside the network.

intro

Read this to begin: Overdue Account

'Data', from the Latin 'datum', means 'something given'. Data constantly gives, but how do you give data back? David Thorne's email thread with Jane illustrates the essence of digital information. Once data is created it can only disseminate; the very idea of receiving your data back is absurd.


Chapter 1 - the nature of information

Before entering the topic I'd like to introduce the DIKW classification model, which represents the relationships between the following:

  • Data : raw symbols or signs
  • Information : data that are processed to be useful; provides answers to "who", "what", "where", and "when" questions
  • Knowledge : application of data and information; answers "how" questions
  • Wisdom : evaluated understanding. (optional)

The model was first specified in detail by R. L. Ackoff in 1988. Typically information is defined in terms of data, knowledge in terms of information, and wisdom in terms of knowledge.

To Publish: to make knowledge public. 'Public' comes from the Latin 'populus', meaning 'people', so to publish is to give knowledge to people.

If we agree on these definitions, then data, the raw segments, does not count as published merely because it is available to the public.

Chapter 2 - Feeding the algorithm rather than publishing

Chapter 3 - excess problem overproduction .. what to do with the junk

Chapter 3 or 4 - Access and regulation