Interfacing the law prototyping Zalan Szakacs

CGI & TF-IDF search engine – Michael

Pad of the lesson: prototyping-17042018

CGI (the other one): Common Gateway Interface. From the early days of the web; it introduced the idea of browsing with a query and running software dynamically online https://docs.python.org/2/library/cgi.html
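
A minimal sketch of what a CGI program does, assuming the web server puts the part of the URL after "?" into the QUERY_STRING environment variable (that is how CGI passes the query). The function name handle_request and the field name "text" are only chosen for illustration; the sketch uses urllib.parse instead of the cgi module.

```python
import os
from urllib.parse import parse_qs


def handle_request(environ):
    """Build a CGI response: header line, blank line, then the body."""
    # The server hands over everything after '?' in the URL as QUERY_STRING.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # parse_qs returns a list of values per key; take the first or a default.
    text = query.get("text", [""])[0]
    body = "You searched for: {}".format(text)
    return "Content-Type: text/plain; charset=utf-8\n\n" + body


if __name__ == "__main__":
    # Under a real server QUERY_STRING is set; run by hand it is simply empty.
    print(handle_request(os.environ))
```

Requesting script.py?text=law would then produce a plain-text page ending in "You searched for: law".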

perl-- early programming language, web editing, design https://www.perl.org/

https://www.amazon.com/Games-Diversions-Perl-Culture-Journal/dp/0596003129
error_in_time https://vimeo.com/228916519 (poem-programming)
'There's more than one way to do it!' - Larry Wall, creator of Perl

you can run perl scripts: perl mystery.pl

Comma separated catalogue – Andre + Michael

Pad of the lesson: CSV-libgen

CSV:

comma-separated values
plain-text file
commas are used as field separators
A record ends at a line terminator (such as new line \n)
All records have to have the same number of fields
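
The rules above can be checked with Python's csv module. This is a small sketch, not part of the lesson pad; the function name check_field_counts and the sample data are made up for illustration.

```python
import csv
import io


def check_field_counts(csv_text):
    """Return (header, rows, line numbers of rows with a wrong field count)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows, bad = [], []
    for row in reader:
        # Every record should have as many fields as the header row.
        if len(row) != len(header):
            bad.append(reader.line_num)
        rows.append(row)
    return header, rows, bad


# Made-up sample: a quoted field may contain a comma; the last record is short.
sample = 'id,title,year\n1,"Commas, quoted",1999\n2,Missing year\n'
header, rows, bad = check_field_counts(sample)
print(header, rows, bad)
```

Quoting is why splitting on "," by hand is fragile: the second field of the first record contains a comma but is still one field.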

How to handle it? Tools for handling CSV

head - output the first part of files
tail - output the last part of files
CSVKit - "swiss army knife" of CSV files. Install: sudo pip3 install csvkit

Documentation: https://csvkit.readthedocs.io/en/1.0.1/#

Python CSV core lib https://docs.python.org/3.5/library/csv.html
structured-text-tools https://github.com/dbohdan/structured-text-tools
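
For comparison, a rough Python equivalent of head and tail, as a sketch only (the shell tools have many more options). It works on any iterable of lines, including an open file.

```python
from collections import deque
from itertools import islice


def head(lines, n=10):
    """First n items, reading no further than needed."""
    return list(islice(lines, n))


def tail(lines, n=10):
    """Last n items; the deque keeps only n items in memory at a time."""
    return list(deque(lines, maxlen=n))


numbers = [str(i) for i in range(100)]
print(head(numbers, 3))
print(tail(numbers, 2))
```

With a file this would be used as head(open_file, 5), which is handy for peeking at a large CSV without loading it all.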

What is it? How is it structured? What data does it contain? Is it incomplete or messy?

Headers are in a separate file, libgen_columns.txt; we might need to integrate them into content.csv, as Python's csv.DictReader (like most CSV tools) takes the first row of a CSV file as the header row
Convert libgen_columns.txt to CSV:

An alternative: cat libgen_columns.txt | sed "s/\ \ \`/\"/g" | sed "s/\`.*/\",/g"| tr --delete '\n'

MAC: cat libgen_columns.txt | sed "s/\ \ \`/\"/g" | sed "s/\`.*/\",/g" | tr -d '\n'

converting txt to csv (MAC): cat libgen_columns.txt | sed "s/\ \ \`/\"/g" | sed "s/\`.*/\",/g" | tr -d '\n' > headers.csv
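
The same conversion could be sketched in Python. This assumes each column line in libgen_columns.txt looks like two spaces, a name in backticks, then the SQL type (which is what the sed patterns above match); the sample lines below are invented, not copied from the real file.

```python
import csv
import io
import re

# Assumed line shape (not confirmed by the notes): "  `Name` type ...,"
COLUMN = re.compile(r"^\s*`([^`]+)`")


def column_names(lines):
    """Collect the backtick-quoted name at the start of each line."""
    names = []
    for line in lines:
        match = COLUMN.match(line)
        if match:
            names.append(match.group(1))
    return names


def header_row(names):
    """Format the names as one fully quoted CSV line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(names)
    return buffer.getvalue()


# Invented sample lines in the assumed shape.
sample = [
    "  `ID` int(15) unsigned NOT NULL,",
    "  `Title` varchar(2000) DEFAULT '',",
    ") ENGINE=MyISAM;",
]
print(header_row(column_names(sample)), end="")
```

Unlike the sed pipeline, this leaves no trailing comma and skips lines that do not start with a backtick-quoted name.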

Workshop 1 Building our own search engine – Dusan Barok (monoskop)

#!/usr/local/bin/python3
import cgi
import cgitb; cgitb.enable()
import nltk
import re

print ("Content-type:text/html;charset=utf-8")
print ()

#cgi.print_environ()

f = cgi.FieldStorage()
submit1 = f.getvalue("submit1", "")
submit2 = f.getvalue("submit2", "")

text = f.getvalue("text", "")

### SORTING
import csv  # reads tfidf.csv, whose scores are sorted below
### SEARCHING
# keyword to search for, taken from the "text" field of the form
keyword = text


print ("""<!DOCTYPE html>
<html>
<head>
	<title>Search</title>
	<meta charset="utf-8">
</head>
<body>
<p style='font-size: 20pt; font-family: Courier'>Search by keyword</p>
	<form method="get">
	<textarea name="text" style="background: yellow; font-size: 10pt; width: 370px; height: 28px;" autofocus></textarea>
	<input type="submit" name="submit" value="Search" style='font-size: 9pt; height: 32px; vertical-align:top;'>

</form>
<p style='font-size: 9pt; font-family: Courier'>
	webring <br>
<a href="http://145.24.204.185:8000/form.html">joca</a>
<a href="http://145.24.198.145:8000/form.html">alice</a>
<a href="http://145.24.246.69:8000/form.html">michael</a>
<a href="http://145.24.165.175:8000/form.html">ange</a>
<a href="http://145.24.254.39:8000/form.html">zalan</a>

</p>
</body>
</html>""")
x = 0
if text :
	#read csv, and split on "," the line
	csv_file = csv.reader(open('tfidf.csv', "r"), delimiter=",")
	col_names = next(csv_file)
	#loop through csv list
	for row in csv_file:
		#if current rows value is equal to input, print that row
		if keyword == row[0] :
			tfidf_list = list(zip(col_names, row))
			del tfidf_list[0]
			sorted_by_second = sorted(tfidf_list, key=lambda x:float(x[1]), reverse=True)
			print ("<p></p>")
			print ("--------------------------------------------------------------------------------------")
			print ("<p style='font-size: 20pt; font-family: Courier'>Results</p>")
			for item in sorted_by_second:
				x = x+1
				print ("--------------------------------------------------------------------------------------")
				print ("<br></br>")
				print(x, item)
				n = item[0]

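				# open the matching text and split it into sentences
				with open("cgi-bin/texts/{}".format(n), "r") as text_file:
					sents = nltk.sent_tokenize(text_file.read())

				# print every sentence that contains the keyword as a whole word;
				# re.escape stops characters like ( or * in the query from breaking the pattern
				pattern = re.compile(r'\b{}\b'.format(re.escape(text)))
				for sentence in sents:
					if pattern.search(sentence):
						print ("<br></br>")
						print(sentence)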
				print ("<br></br>")
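
The lookup-and-rank step of the script above, pulled out as a function that can be tried without a web server. A sketch only: it assumes tfidf.csv has the word in the first column and one column of scores per text file, as the script implies; the function name rank_documents, the file names and the scores are invented.

```python
import csv
import io


def rank_documents(csv_text, keyword):
    """Return (filename, score) pairs for keyword, highest score first."""
    reader = csv.reader(io.StringIO(csv_text))
    col_names = next(reader)
    for row in reader:
        if row and row[0] == keyword:
            # Skip the first column (the word itself) and pair names with scores.
            scores = zip(col_names[1:], (float(v) for v in row[1:]))
            return sorted(scores, key=lambda pair: pair[1], reverse=True)
    return []  # keyword not in the table


# Invented table in the assumed layout.
table = "word,a.txt,b.txt,c.txt\nlaw,0.1,0.7,0.0\nbook,0.3,0.0,0.2\n"
print(rank_documents(table, "law"))
```

Keeping this step separate from the printing makes it easier to check the ranking on a small table before wiring it into the CGI page.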

Workshop 2 Tunneling – Marcell Mars (memory of the world)

Pad of the day Marcell Mars workshop

mapping locations with firewall location/exclusive spaces: the physical border is the range of the wi-fi
mapping the network: network switch boxes, ethernet cables, router https://en.wikipedia.org/wiki/Router_(computing)

AP - access point
eduroam - authorization https://en.wikipedia.org/wiki/Eduroam

works everywhere in the eduroam environment: the same password works in every participating building. Metadata tells where the information comes from; the university vouches for its users to the central institution

what they will know from a website:

domain name
subdomain

HTTPS: https://en.wikipedia.org/wiki/HTTPS secure connection, needs a certificate (we used to pay for it)
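
A sketch of which parts of a URL fall on which side, assuming an ordinary HTTPS connection: the host name is still visible on the network (through the DNS lookup and the connection setup), while the path and query travel encrypted. The function names are invented, and taking the last two labels as the domain is a simplification that fails for names like example.co.uk.

```python
from urllib.parse import urlsplit


def visible_to_network(url):
    """Parts an observer on the network can typically still see."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    return {
        "hostname": host,
        "subdomain": ".".join(labels[:-2]),  # simplification, see note above
        "domain": ".".join(labels[-2:]),
    }


def hidden_by_https(url):
    """Parts that are sent inside the encrypted connection."""
    parts = urlsplit(url)
    return {"path": parts.path, "query": parts.query}


url = "https://en.wikipedia.org/wiki/HTTPS?action=history"
print(visible_to_network(url))
print(hidden_by_https(url))
```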

https://letsencrypt.org/ https://en.wikipedia.org/wiki/Let's_Encrypt free certificates, so https can now be used for every site; mozilla was behind that

notary (certificate authority)

https://www.digicert.com/ wildcard: the first word is replaced by an asterisk:

*.wikipedia.org

the notary used to charge you for a wildcard certificate; now they are free https://en.wikipedia.org/wiki/Wildcard_character
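
A sketch of how a wildcard name such as *.wikipedia.org is usually matched against a host name: the asterisk stands for exactly one leftmost label. The function name matches_wildcard is invented, and real certificate validation checks much more than this.

```python
def matches_wildcard(pattern, hostname):
    """True if hostname is covered by pattern, e.g. '*.wikipedia.org'."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    # The wildcard replaces one label, so the label counts must be equal.
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        return p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


for name in ["en.wikipedia.org", "wikipedia.org", "en.m.wikipedia.org"]:
    print(name, matches_wildcard("*.wikipedia.org", name))
```

So one wildcard certificate covers en.wikipedia.org and de.wikipedia.org, but not the bare wikipedia.org or a deeper name like en.m.wikipedia.org.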

ex: somebody steals the certificate of the wikipedia server and then impersonates wikipedia

jessica and logan use the part after the slash in the browser

DNS servers root servers

https://www.lifewire.com/what-is-a-dns-server-2625854

Automatic (DHCP): automatic IP address
systemd-resolve --status (for ubuntu)
cat /etc/resolv.conf
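
What cat /etc/resolv.conf shows can also be read from Python. A sketch that only picks out the nameserver lines; the function name and the sample contents are invented, and other settings in the file are ignored.

```python
def nameservers(resolv_conf_text):
    """Return the DNS server addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Skip blank lines and comments (lines starting with '#' or ';').
        if not fields or fields[0][0] in "#;":
            continue
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


# Invented sample contents.
sample = """# hypothetical file contents
nameserver 1.1.1.1
search lan
nameserver 127.0.0.53
"""
print(nameservers(sample))
```

On a real machine the text would come from open("/etc/resolv.conf").read().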

1.1.1.1 DNS https://blog.cloudflare.com/announcing-1111/
cat /etc/hosts
ip a: link/ether is the hardware (MAC) address (with barcode)

https://en.wikipedia.org/wiki/International_Mobile_Equipment_Identity

macchanger in linux --> you appear as another device and they don't know your MAC address
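
A sketch of what a randomised MAC address looks like. It only generates an address, it does not change anything on the machine, and it is not how macchanger itself is implemented. The first octet is adjusted so the address is marked as locally administered and unicast.

```python
import random


def random_mac(rng=random):
    """Return a random locally administered, unicast MAC address."""
    # Set bit 1 (locally administered) and clear bit 0 (unicast, not multicast).
    first = (rng.randrange(256) | 0x02) & 0xFE
    rest = [rng.randrange(256) for _ in range(5)]
    return ":".join("{:02x}".format(octet) for octet in [first] + rest)


print(random_mac())
```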

http://sci-hub.hk/
ping sci-hub.hk <is the server alive>
copy the 80.82.77.83 into /etc/hosts

127.0.0.1 facebook.com <you block facebook, telling your computer to go to localhost instead>
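
How the computer reads such lines can be sketched as a small parser: each line is an address followed by one or more names. The function name parse_hosts is invented, and the sketch assumes the first entry for a name wins.

```python
def parse_hosts(hosts_text):
    """Map each host name to the address on its /etc/hosts-style line."""
    table = {}
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name, address)  # assumption: first entry wins
    return table


sample = """127.0.0.1 localhost
80.82.77.83 sci-hub.hk
127.0.0.1 facebook.com  # blocked
"""
table = parse_hosts(sample)
print(table["sci-hub.hk"], table["facebook.com"])
```

Because this file is consulted before DNS, the entry for sci-hub.hk keeps working even if the DNS servers stop answering for it, and the facebook.com entry sends the browser to the local machine.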

Network interface - part of the computer that talks to the internet. An ip address is assigned to the network interface on your computer (wireless, cable)

discussion on current discourse on security, politics: Amazon Web Services "more dangerous than google" ! https://aws.amazon.com/?nc2=h_lg cloud computing (set up for domains)

Even Apple is on Amazon servers: https://techcrunch.com/2018/02/27/apple-now-relies-on-google-cloud-platform-and-amazon-s3-for-icloud-data/

https://arstechnica.com/information-technology/2018/04/in-effort-to-shut-down-telegram-russia-blocks-amazon-google-network-addresses/

Letter: In solidarity with Library Genesis and Sci-Hub, inviting other 'custodians' into civil disobedience http://custodians.online