NLTK text analysis

From XPUB & Lens-Based wiki

Natural Language Tool Kit_141020_Michael

Basic

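from urllib.request import urlopen
from nltk import word_tokenize, Text

url = "https://git.xpub.nl/XPUB/S13-Words-for-the-Future-notebooks/raw/branch/master/txt/words-for-the-future/UNDECIDABILITY.txt"

# Download the essay; without this step `text` is undefined
text = urlopen(url).read().decode("utf-8")

tokens = word_tokenize(text) # requires NLTK's tokenizer models to be downloaded
len(tokens)
tokens[-1] # the last token
tokens[:10] # the first 10 tokens (indexes 0-9); index 10 is not included
tokens[21:30] # indexes 21-29; index 30 is not included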
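
The slicing rules used above can be checked on a small list. This is a minimal sketch: the sample sentence is invented for illustration, and a plain whitespace split stands in for `word_tokenize` so that it runs without NLTK data.

```python
# Stand-in for nltk.word_tokenize: a plain whitespace split (assumption, illustration only)
sample = "one two three four five six seven eight nine ten eleven twelve"
tokens = sample.split()

first_ten = tokens[:10]  # indexes 0..9, so the 10th word is included
middle = tokens[2:5]     # indexes 2, 3, 4; index 5 is excluded
last = tokens[-1]        # negative indexes count from the end

print(len(first_ten), first_ten[-1])
print(middle)
print(last)
```

The end index of a slice is always exclusive, which is why `tokens[:10]` returns exactly ten items.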

strengers = Text(tokens)
strengers.concordance("multiplicity", width = 84, lines = 72)

Displaying 11 of 11 matches:
 ] attempts to escape the vortex of multiplicity are useless.  [ 6 ] In his fifth m
 , he subsequently focuses on [ i ] multiplicity [ i ] as a way for literature to co
fore , let  s think visibility and multiplicity together , as : a multiplication of
n the contrary , it is generating a multiplicity of different gazes that are all leg
ed and thus incomplete and open . A Multiplicity of Gazes An undecidable artwork is 
ics today , is that they generate a multiplicity of gazes and of forms of spectators
 positions and points of view . The multiplicity of gazes produced and gathered by u
tes a radical collectivity based on multiplicity and on conflicting positions that a
ility and from its encounter with a multiplicity of gazes . Preserving it is possibl
encounter between undecidable art , multiplicity of gazes , and a curatorial dimensi
 ibid , p. 98 . 7 . Italo Calvino , Multiplicity , [ i ] Six Memos for the Next Mill
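
To make clearer what a concordance does, here is a rough keyword-in-context sketch in plain Python. The `kwic` helper is hypothetical and is not how NLTK implements `concordance`; it only shows the idea of finding each match and clipping the text on both sides to a fixed width.

```python
def kwic(tokens, query, width=40, lines=25):
    """Return keyword-in-context rows as (left, word, right) tuples.

    Hypothetical helper, not part of NLTK: matching is case-insensitive and
    each side is clipped to roughly half of `width` characters.
    """
    half = max(1, (width - len(query) - 2) // 2)
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() != query.lower():
            continue
        left = " ".join(tokens[:i])[-half:]      # keep the text nearest the match
        right = " ".join(tokens[i + 1:])[:half]
        rows.append((left.rjust(half), tok, right))  # right-align the left context
        if len(rows) == lines:
            break
    return rows


tokens = "A Multiplicity of Gazes : the multiplicity of gazes produced by art".split()
for left, word, right in kwic(tokens, "multiplicity", width=40):
    print(left, word, right)
```

Right-aligning the left context is what makes the query word line up in one column, as in the output above.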


for line in strengers.concordance_list("the", width=82, lines=74):
    print (line.left_print, line.query, line.right_print)

dability Silvia Bottiroli Multiplying the Visible The word [ i ] undecidable [ i
lvia Bottiroli Multiplying the Visible The word [ i ] undecidable [ i ] appears i
e [ i ] appears in [ i ] Six Memos for the Next Millennium [ i ] written by Italo
ry lectures at Harvard University . In the last months of his life Calvino worked
rishly on these lectures , but died in the process . In the five memos he left be
ectures , but died in the process . In the five memos he left behind , he did not
i ] Visibility [ i ] , revolves around the capacity of literature to generate ima
flow continuously . Calvino focuses on the imagination as  the repertory of what
alvino focuses on the imagination as  the repertory of what is potential ; what 
 exist but might have existed.  [ 2 ] The main concern that he brings forth lies
ncern that he brings forth lies within the relation between contemporary culture 
contemporary culture and imagination : the risk to definitely lose , in the overp
ion : the risk to definitely lose , in the overproduction of images , the power o
se , in the overproduction of images , the power of bringing visions into focus w
g [ i ] in terms of images.  [ 3 ] In the last pages of the lecture , he propose
f images.  [ 3 ] In the last pages of the lecture , he proposes a shift from und
he proposes a shift from understanding the fantastic world of the artist , not as
m understanding the fantastic world of the artist , not as indefinable , but as [
th this word , Calvino means to define the coexistence and the relation , within 
no means to define the coexistence and the relation , within any literary work , 
, between three different dimensions . The first dimension is the artist  s imag
nt dimensions . The first dimension is the artist  s imagination  a world of po
at no work will succeed in realizing . The second is the reality as we experience
l succeed in realizing . The second is the reality as we experience it by living 
we experience it by living . Finally , the third is the world of the actual work 
 it by living . Finally , the third is the world of the actual work , made by the
 . Finally , the third is the world of the actual work , made by the layers of si
the world of the actual work , made by the layers of signs that accumulate in it 
ns that accumulate in it ; compared to the first two worlds , it is  also infini
ctory to formulation.  [ 4 ] He calls the link between these three worlds  the 
 the link between these three worlds  the undecidable , the paradox of an infini
these three worlds  the undecidable , the paradox of an infinite whole that cont
ino , artistic operations involve , by the means of the infinity of linguistic po
c operations involve , by the means of the infinity of linguistic possibilities ,
infinity of linguistic possibilities , the infinity of the artist  s imagination
uistic possibilities , the infinity of the artist  s imagination , and the infin
ty of the artist  s imagination , and the infinity of contingencies . Therefore 
ity of contingencies . Therefore ,  [ the ] attempts to escape the vortex of mul
erefore ,  [ the ] attempts to escape the vortex of multiplicity are useless.  
 as a way for literature to comprehend the complex nature of the world that for t
re to comprehend the complex nature of the world that for the author is a whole o
e complex nature of the world that for the author is a whole of wholes , where th
he author is a whole of wholes , where the acts of watching and knowing also inte
watching and knowing also intervene in the observed reality and alter it . Calvin
are readable as different narratives . The lecture revolves around some novels th
ain multiple worlds and make space for the readers  imaginations . The common so
space for the readers  imaginations . The common source to all these experiments
all these experiments seems to rely in the understanding of the contemporary nove
 seems to rely in the understanding of the contemporary novel  as an encyclopedi
 , as a network of connections between the events , the people , and the things o
rk of connections between the events , the people , and the things of the world. 
 between the events , the people , and the things of the world.  [ 7 ] Therefore
vents , the people , and the things of the world.  [ 7 ] Therefore , let  s thi
ic production and define a context for the undecidable , or rather for undecidabi
le , or rather for undecidability , as the quality of being undecidable . Calvino
tion modes and doesn  t fade out from the scene of the  real  world . We might
d doesn  t fade out from the scene of the  real  world . We might stretch this
 s potentiality is that of multiplying the visible as an actual counterstrategy t
isible as an actual counterstrategy to the proliferation of images that surrounds
ly articulates , redefines , or alters the complex system of links , bounds , and
specific to some artworks within which the three worlds that Calvino describes me
tains and under certain terms performs the possibility of its actualisation , a w
into one actual form . In particular , the potentiality generated by undecidable 
c of  and and and  as opposite to the logic of  either or  that seems to
ature and just exist as such . None of the images of an artwork are being more or
twork are being more or less real than the others , no matter whether they come a
vidual or collective fantasies . It is the art ( work ) as such that creates a gr
s such that creates a ground where all the images that come into visibility share
images that come into visibility share the same gradient of reality , no matter w
itors or spectators to enter into  if the invitation of art is often that of los
itation of art is often that of losing the contact with known worlds in order to 
Here , spectators are invited to enter the work  s fictional world carrying with
ctional world carrying with themselves the so-called real world and all their oth
ll these worlds are equally welcomed . The artwork may then be navigated either b


Making my own pattern

for w in strengers:
    if w.endswith("ity"):
        print(w) # prints every match in text order, so repeated words and case variants all show up
Undecidability
University
visibility
Visibility
capacity
reality
infinity
infinity
infinity
multiplicity
multiplicity
reality
visibility
multiplicity
undecidability
quality
potentiality
visibility
undecidability
undecidability
quality
possibility
potentiality
potentiality
reality
reality
visibility
reality
undecidability
reality
contemporaneity
possibility
possibility
possibility
undecidability
community
possibility
multiplicity
Multiplicity
multiplicity
multiplicity
community
collectivity
multiplicity
reality
responsibility
undecidability
potentiality
undecidability
collectivity
visibility
Undecidability
possibility
potentiality
quality
undecidability
multiplicity
intensity
multiplicity
Visibility
University
Multiplicity
University

# and now collected in a list, squashing case, and using a "set" to remove duplicates.

ity = []
for w in strengers :
    if w.endswith("ity"):
        #print(w)
        ity.append(w.lower())
        #strengers.concordance()
ity = set(ity)        
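
A small variation on the same idea, as a sketch on an invented token list: `collections.Counter` keeps the number of occurrences as well as the unique words, and `sorted` gives a reproducible order, since a plain `set` has no guaranteed order. Unlike the loop above, this version lowercases before testing the ending.

```python
from collections import Counter

# Invented sample tokens, for illustration only
tokens = ["Visibility", "visibility", "reality", "Multiplicity",
          "multiplicity", "the", "reality"]

# Lowercase before counting so "Visibility" and "visibility" become one entry
counts = Counter(w.lower() for w in tokens if w.lower().endswith("ity"))

# Iterating a Counter yields its keys; sorting makes the order reproducible
unique_ity = sorted(counts)

print(unique_ity)
print(counts.most_common())
```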


with open("nami_undecidibility_Michael_NLTK_141020.text", "w") as output:

    s = 0 # extra spacing around the query word; grows with every line written

    for word in ity:
        for line in strengers.concordance_list(word, width=82, lines=74):
            # pad the query word with 2 + int(s) spaces on each side
            gap = " " * (2 + int(s))
            t = line.left_print + gap + line.query + gap + line.right_print
            print(t[:82], file=output) # keep only the first 82 characters
            s = s + 0.3 # int(s) goes up by one roughly every third or fourth line
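
The widening gap in the script above comes from `int(s)` truncating a slowly growing number. The sketch below isolates that arithmetic; `spaced_line` is a hypothetical helper name and the sample strings are invented.

```python
def spaced_line(left, query, right, s, width=82):
    """Rebuild one output line with a gap of 2 + int(s) spaces around the query."""
    gap = " " * (2 + int(s))
    return (left + gap + query + gap + right)[:width]


s = 0
gaps = []
for _ in range(8):
    gaps.append(2 + int(s))  # int() truncates, so the gap widens only when s passes a whole number
    s = s + 0.3

print(gaps)
print(spaced_line("on the imagination as", "reality", "of what is potential", s=1.2))
```

Because every line is cut to 82 characters, the wider the gap becomes, the more of the right-hand context falls off the end of the line.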

Outcome

[Image: output of the script above]

Which parts of the original text should go into the reinterpreted page?

Similarity function

Similarity.png
In presenting the original essay "Undecidability", I wanted to condense the essential parts of what the author means by undecidability.
To choose which parts to condense, I used the "Similarity" function in NLTK. With it, I filtered the five words most similar to "undecidability": 'real', 'potentiality', 'resonances', 'spectatorship' and 'original'.
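
The exact call is not shown on this page; with NLTK it was presumably something like `strengers.similar("undecidability", 5)`. As a rough illustration of the idea behind it, the sketch below ranks words by how many (previous word, next word) contexts they share with a target word. The `similar_words` helper and the sample sentence are hypothetical, and NLTK's own scoring may differ.

```python
from collections import Counter, defaultdict


def similar_words(tokens, target, num=5):
    """Rank words by the number of (previous, next) contexts shared with `target`.

    Hypothetical helper that only approximates the idea behind NLTK's
    Text.similar(); it is not the library's implementation.
    """
    words = [t.lower() for t in tokens]
    target = target.lower()

    # Map every word to the set of (previous word, next word) pairs around it
    contexts = defaultdict(set)
    for prev, word, nxt in zip(words, words[1:], words[2:]):
        contexts[word].add((prev, nxt))

    target_contexts = contexts.get(target, set())
    shared = Counter()
    for word, ctx in contexts.items():
        overlap = len(ctx & target_contexts)
        if word != target and overlap:
            shared[word] = overlap
    return [w for w, _ in shared.most_common(num)]


tokens = "the undecidable artwork and the potential artwork and the real world".split()
print(similar_words(tokens, "undecidable"))
```

Here "potential" counts as similar to "undecidable" because both appear between "the" and "artwork".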

Concordance function

Then I analysed the contexts of these five words using the "Concordance" function. Through this process, I was able to choose which parts of the original essay to carry over to my new interpretation web page.
Concordance.png Concordance2.png