Prototyping 11

04/03 Compiling static sites: Working with the MediaWiki API + Python with Michael

Working with API

PHP is a popular general-purpose scripting language that is especially suited to web development. Data flowing in and out of websites through open APIs was very popular 15 years ago, but many of these are now being shut down because of privacy problems.

API keys

If you want to use an API you often need to sign up and agree to legal terms to get a key: another level of agreement, on top of username and password. This helps the site control its users and their information.

RSS

Markup came from blogging and zines and is related to RSS. RSS is a web feed which allows users and applications to access updates to websites in a standardised, computer-readable format. These feeds can, for example, allow a user to keep track of many different websites in a single news aggregator. You can subscribe to and unsubscribe from different feeds and curate your own. Today it is mostly used for podcasts.

Web 2.0

Refers to websites that emphasize user-generated content, ease of use, participatory culture and interoperability for end users. The term was invented by Darcy DiNucci in 1999 and later popularized by Tim O'Reilly and Dale Dougherty at the O'Reilly Media Web 2.0 Conference in late 2004. From the point of view of a developer maintaining a website, an API can be dangerous. This is where API etiquette comes in.

ActivityPub is an open, [https://en.wikipedia.org/wiki/Decentralised_system decentralized] social networking protocol based on Pump.io's ActivityPump protocol. It provides a client/server API for creating, updating and deleting content, as well as a federated server-to-server API for delivering notifications and content.

A bot user has a higher API request limit.

Python

Data structures: {key: value} = dictionary, [items] = list
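
A minimal sketch of the two data structures, assuming the kind of nested data an API usually returns. The keys and values here are made up for illustration.

```python
# A dictionary maps keys to values and is written with curly braces.
page = {"title": "Prototyping", "views": 42}

# A list is an ordered sequence and is written with square brackets.
revisions = [101, 102, 103]

# Look up a dictionary value by key, and a list item by position.
title = page["title"]
first_revision = revisions[0]

# JSON coming back from an API is usually dictionaries containing lists.
response = {"query": {"pages": [page]}}
first_title = response["query"]["pages"][0]["title"]

print(title, first_revision, first_title)
```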

In terminal

If you are in interactive mode in the terminal, _ points to the result of the last action. Typing i. + Tab shows all the image properties we can use.

Mark Pilgrim

How to translate an ask query: change from the query you write in a semantic wiki page to the same query sent through the API (site.api ask).
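
A rough sketch of that translation, assuming the wiki runs Semantic MediaWiki and exposes its `ask` API action. The wiki address, category and property names are hypothetical. It only builds the request URL and does not contact any server; a `site.api` call with the `ask` action would presumably send the same parameters.

```python
from urllib.parse import urlencode

# Hypothetical wiki endpoint, used only for illustration.
API_URL = "https://wiki.example.org/api.php"


def build_ask_url(ask_query: str, api_url: str = API_URL) -> str:
    """Return the URL that sends an ask query to the 'ask' API action."""
    params = {"action": "ask", "query": ask_query, "format": "json"}
    return api_url + "?" + urlencode(params)


# The same conditions and printouts you would put inside {{#ask: ...}} on a page.
ask_query = "[[Category:Events]]|?Date|limit=5"
url = build_ask_url(ask_query)
print(url)
```

The conditions and printouts stay the same; only the wrapper changes from the in-page parser function to API parameters.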

RPi and Tor

Sessions with Aymeric

Server - it is a computer, but the real server is the software running on it. Example of server software: NGINX, an HTTP server. By default it serves web traffic on port 80, the default HTTP port, where the traffic is not encrypted, or on port 443, the TLS port (HTTPS, the s is for secure), where the traffic is encrypted. When you configure a server you want to make 443 the default, even when a request comes in on port 80.

With Tor we use port 80 because the traffic is already encrypted. If you use Tor with 443 it will be encrypted twice, but this adds no value and is not necessary. You can use 443 just for certification: people serve a certificate through Tor so that visitors can verify who is behind the site.

Generating a certificate - a SELF-SIGNED CERTIFICATE. You type a few commands in your shell and you create one. The problem: each browser, operating system, etc. comes with a list of trusted vendors, and a self-signed certificate is not on that list. So you can buy a certificate from one of these vendors, which means you need to trust a third party (a vendor). The price varies and can be very high. There is a lot of criticism about this: it costs a lot, and some vendors don't have sufficient security, are not up to standard, or have been compromised. There is a "mafia" around it. Historically the biggest one was Verisign. These are the companies that provide a certificate (a file) that you can install in NGINX so that it can serve port 443, if you want to encrypt your website and not get a warning. If people get a warning they will be scared and will not enter your site.

Hacker communities were upset about this, because you can't easily self-certify for port 443 and the only other option was very expensive. That is why Let's Encrypt was founded (supported by the Linux Foundation, it left beta in April 2016): they became a vendor and they issue free certificates for your domain (with the free Certbot tool). It really exploded and everyone started using it, because browsers became very encryption oriented. They are on the list of trusted vendors. This process took about 5 years because the big vendors were against it.
You have to work a bit more with this than with the big vendors, where you just get the certificate via email. There are some concerns: Let's Encrypt asks you to renew the certificate every 3 months, and they are moving towards lifetimes of only a few days, so there is less time to crack the keys. This will make them very powerful and allow them to control most websites, because so many are using them. This is why there should be more free vendors.

POSTFIX - an email server.

Our laptop is a server too, because it has software that is able to communicate. Being part of the internet means you have a public IP. The problem with IPs: an IPv4 address only goes up to 255.255.255.255, about 4.3 billion combinations, and these are running out (IPv4 exhaustion). This is why IPv6 was made, to create a bigger range, and this number is really enormous.

LAN - local area network. You have a router that has two network interfaces: a public IP and local (internal) IPs. When you connect your phone, computer or Pi, it is going to get a local IP. When a new machine is connected it asks the network "is there anyone there?" The DHCP server (usually the router) replies "yes, I'm here". Then the machine asks for an IP, the server offers an IP, and the machine needs to agree to it. Then the connection is established through wifi or cable.

Then you open the browser and want to access a website. The router's DNS translates the domain into the IP of the machine that serves this website. A domain is just a "skin" that we put on an IP address to make it more human friendly. My machine asks my router for the connection; if the destination is not found there, the request is passed on to the next router, and so on.

Traceroute - reveals the whole route (all the networks) that you go through to establish a connection. This is why encryption is important: you are going through all these routers and networks. The ping command is well known in the gaming community because it tells you how fast your connection is. Packet loss during ping can be caused by a crappy connection.

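
To put numbers on the address ranges mentioned above, here is a small sketch using Python's standard `ipaddress` module. The two example addresses are picked for illustration only.

```python
import ipaddress

# Size of the whole IPv4 and IPv6 address spaces.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses  # 2**32
ipv6_total = ipaddress.ip_network("::/0").num_addresses       # 2**128

# A typical home LAN address versus an address on the public internet.
lan_address = ipaddress.ip_address("192.168.1.23")
public_address = ipaddress.ip_address("8.8.8.8")

print("IPv4 addresses:", ipv4_total)
print("IPv6 addresses:", ipv6_total)
print(lan_address, "is private:", lan_address.is_private)
print(public_address, "is private:", public_address.is_private)
```

Private ranges such as 192.168.x.x can be reused inside every LAN, which is one reason the router needs a separate public IP.
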
LEASE - when you are assigned an IP it will be yours for some amount of time (hours, days). The lease will expire at some point. You can go to the settings of your router, where there is a list of assigned IPs. DHCP binding - sets a permanent IP for a machine. This is good for hosting something: if you host a website it would be annoying if the IP changed all the time. MAC address - the unique ID of your network card; the DHCP server asks for this address when the connection is established. Terminal command: ifconfig. ISP - your internet provider can give you a static public IP.
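
The lease and binding ideas can be sketched with a toy model. This is not a real DHCP implementation; the class name, MAC addresses and address range are made up for illustration, and the time is passed in as a plain number so the behaviour is easy to follow.

```python
import ipaddress


class ToyDhcpServer:
    """Hands out addresses from a pool; a toy model, not real DHCP."""

    def __init__(self, network: str, lease_seconds: int = 3600):
        hosts = list(ipaddress.ip_network(network).hosts())
        self.router_ip = hosts[0]   # first host address is kept for the router
        self.free = hosts[1:]       # the remaining addresses can be leased
        self.lease_seconds = lease_seconds
        self.bindings = {}          # MAC -> fixed IP (DHCP binding)
        self.leases = {}            # MAC -> (IP, expiry time)

    def bind(self, mac: str, ip: str) -> None:
        """Reserve a permanent address for one network card."""
        address = ipaddress.ip_address(ip)
        self.bindings[mac] = address
        if address in self.free:
            self.free.remove(address)

    def request(self, mac: str, now: float):
        """Return (ip, expiry) for this MAC, renewing or creating a lease."""
        if mac in self.bindings:
            ip = self.bindings[mac]
        elif mac in self.leases and self.leases[mac][1] > now:
            ip = self.leases[mac][0]  # lease still valid: keep the same IP
        else:
            expired = self.leases.pop(mac, None)
            if expired is not None:
                self.free.append(expired[0])  # expired lease returns to the pool
            ip = self.free.pop(0)
        self.leases[mac] = (ip, now + self.lease_seconds)
        return self.leases[mac]


server = ToyDhcpServer("192.168.1.0/24", lease_seconds=3600)
server.bind("b8:27:eb:00:00:01", "192.168.1.50")  # the Pi always gets .50

laptop_ip, laptop_expiry = server.request("aa:bb:cc:00:00:02", now=0)
pi_ip, pi_expiry = server.request("b8:27:eb:00:00:01", now=0)
print("laptop:", laptop_ip, "until", laptop_expiry)
print("pi:", pi_ip, "until", pi_expiry)
```

A machine without a binding may come back after its lease expired and receive a different address, which is exactly what makes hosting without a binding annoying.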

The RPi has a private IP. How can we make it talk with another machine? We need to make ports 80 and 443 available. But the Pi does not get the public IP, so a machine outside the network can't talk to it directly. How can we do it? We have the router in between: it is friendly to your local network, but if another machine asks for port 80 on the public IP, the firewall software will say "no, it's closed", or you just get nothing (you can configure your firewall to either refuse the packet politely or drop it silently). You can configure your router to have port 80 open.

NAT - a mechanism that is built into any router. The router receives the request on the public IP, the firewall says "yes, it's open", then the request goes to the NAT, which sits in the middle and translates the public IP into a private IP. It is configured to establish the connection between the outside and the inside. Tinc does the same but completely virtually.

Example: you want to talk to the website on IP 5.35.137.12:80 (port 80). You can go through NAT or not. You try to enter that router, you say "hello" and you get the HTML back. Then you ask for another page, your browser gets it and translates it into graphic elements.

You can host different websites on one server. Both websites (or more) will have the same IP but a different Host header. You first point each domain at the IP address (a DNS record, IN A). So if you want one IP to serve multiple websites, you need software that reads the incoming request and directs it the right way. You can have several domains with the same IP and filter at the Host level.

In modern browsers: HTTP is port 80, HTTPS is port 443. You can make NGINX listen on other ports, but then you have to add the port to your URL and configure that port to be open or encrypted in NGINX.

Two rules: 1. One piece of software per port. 2. Each IP address can be used only once on the same subnetwork.
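
The two steps described above, the router forwarding an open port to a private IP and the web server choosing a site by its Host header, can be sketched as two lookup tables. This is a toy model: all addresses, domain names and folders are hypothetical, and a real router and NGINX do this in their own configuration rather than in Python.

```python
# Port-forwarding table on the router: public port -> (private IP, private port).
# All addresses and names here are made up for the example.
PORT_FORWARDS = {
    80: ("192.168.1.50", 80),
    443: ("192.168.1.50", 443),
}

# Name-based virtual hosts on the web server: Host header -> site folder.
VIRTUAL_HOSTS = {
    "zine.example.org": "/var/www/zine",
    "wiki.example.org": "/var/www/wiki",
}


def nat_translate(public_port: int):
    """Return the private (ip, port) for an incoming port, or None if closed."""
    return PORT_FORWARDS.get(public_port)


def pick_site(host_header: str):
    """Choose which site answers, based on the Host header of the request."""
    hostname = host_header.split(":")[0].lower()  # drop an optional :port
    return VIRTUAL_HOSTS.get(hostname)


def handle(public_port: int, host_header: str) -> str:
    """Follow one request from the public IP to the folder that serves it."""
    target = nat_translate(public_port)
    if target is None:
        return "dropped by firewall"
    site = pick_site(host_header)
    if site is None:
        return "no such site"
    return f"{target[0]}:{target[1]} serves {site}"


print(handle(80, "zine.example.org"))
print(handle(443, "wiki.example.org"))
print(handle(22, "zine.example.org"))
```

Both domains reach the same private machine; only the Host lookup tells them apart.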

The .local problem - on macOS, a .local name signals that you don't want to access the internet but something on the local network. It shouldn't be a problem for accessing websites, only for ssh to the Pi.

15/4 VPN

Traceroute - if you want to connect to a website, your data packet jumps from one router to another. The internet protocol works like a postal service: you know the address but you don't know how to get there, so you need help. Data packet - a small "postcard" (command) with the message you want to deliver to another point. On the "postcard" there is the info about where it comes from and where it goes. It goes through the first router; this router has a view of your network but also of another network that it belongs to and you don't. The packet goes from router to router until it gets to a router that is in the same network as the receiving address. The "postcard" can also get lost on the way.

The flaw: routers can read the data that is written on the "postcard" because it is not encrypted. This was abused in the past: you can spy on the communication (sniff) but you can also change it. This system is weak security-wise. Nowadays you have "envelopes" instead of "postcards". You still know the address and the place it was sent from, but the data is safe inside, sealed with strong glue: encrypted. The more powerful the "glue" (encryption), the harder it is to retrieve the data. This is always relative to the power of computers and computer science; it always evolves. HTTP - the packet is a "postcard". HTTPS - the packet is an "envelope".
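
The postcard and envelope picture can be sketched as a small data model of what a router on the path is able to read. No real encryption happens here: the `sealed` flag simply stands in for it, and the addresses and message are made up.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    source: str       # "from" address
    destination: str  # "to" address
    payload: bytes    # the message itself
    sealed: bool      # False = postcard (HTTP), True = envelope (HTTPS)


def router_view(packet: Packet) -> dict:
    """What a router on the path can read from a passing packet."""
    view = {"from": packet.source, "to": packet.destination}
    if packet.sealed:
        view["payload"] = f"<{len(packet.payload)} unreadable bytes>"
    else:
        view["payload"] = packet.payload.decode("utf-8")
    return view


postcard = Packet("192.0.2.10", "198.51.100.7", b"GET /secret-page", sealed=False)
envelope = Packet("192.0.2.10", "198.51.100.7", b"GET /secret-page", sealed=True)

print(router_view(postcard))
print(router_view(envelope))
```

In both cases the router still sees the "from" and "to" addresses; only the message is hidden in the envelope.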

There is still a flaw: the metadata is still visible, the "from" and "to" addresses, who you are talking to, the time and so on. So routers can still surveil your behaviour. This is still the main argument in mainstream media: you shouldn't worry about surveillance because "we are just interested in the metadata, not in the actual data". So on the route you take between routers there can be an evil router collecting your data. All the places that you visit are collected, and this is data that can be sold, analysed and turned into other data, like behavioural models that can be sold for more money to people who design products or commercial spaces. This is how internet and phone networks can be so cheap: they sell your info. Even when you send emails.

How can we hide the sender and receiver addresses then? VPN. A VPN (virtual private network) is usually promoted as anti-surveillance, but this is not true. When you are on your network you can't choose your IP, and the router IPs are pretty stable. It is always traceable to your physical location. If you take your laptop and connect from somewhere else, maybe the path to your address will be similar, but the entry point will be different. What if the address (B) moves too? You can't access someone else's LAN without specifying router and port.

Your computer has one network interface on it, but you can create a virtual network interface. We have the internet with the public IPs, we have the LANs, but we can also create a network between a certain number of computers, separate from the physical infrastructure of the internet. So if I move with my machine somewhere else, behind a different router, it will still connect to this virtual network. This virtual network we created is the VPN, a kind of virtual LAN. If you are part of this network you have 2 IPs: public and virtual. When you sign into a VPN you always have a node that is the entry point and connects you with that network.

VPN providers use VPN software so that others can't see your "from" address. The traffic will show a different node's IP. But how can you trust that node? It receives both the "from" and the "to" address. These VPN providers are a trap, they get your data. They can't really protect you, and this is a very smart business. A VPN can be interesting for using nodes in other countries.

OpenVPN - you have the server and the client, and everything goes through that server. In Tinc everyone is both a client and a server. When you connect, it first tries to connect to a node that is publicly accessible, so you need to configure one node to be publicly accessible, because a new client that wants to connect doesn't know much about the network. You can tell the new client manually which node is the public node for accessing the network. If you install Tinc you can access the sandbox without XVM(?). When connecting with Tinc it first tries to connect to your "to" address directly, and if it doesn't manage it takes the route it needs to take. So Tinc is a more sophisticated way to connect than a regular VPN.

So how can we visit a website without anyone knowing about it? The Tor browser = Firefox + Tor software. You can install the Tor browser, or the Tor software on its own, which you can use to configure applications to go through Tor. Or you can use torify, which forces an application to go through Tor even if it was not written to communicate through Tor.

When you launch the browser it asks for the Tor directory. It connects to it and asks for a Circuit. A Circuit is made of 3 nodes with public IPs and is identified by some ID (C1 for example). The 1st node is the Guard: the entry point, which should be stable. The 2nd is the Relay: it can be any node. The 3rd is the Exit: it only receives traffic that should go outside to the internet. This path looks a bit like a VPN, but I'm still exposed to these nodes. When the Circuit is generated it makes 3 unique keys, KG (key guard), KR (key relay) and KE (key exit), and each node knows just its own key, not the others.

The 3 packets process - Tor generates packet no. 1 (an envelope), which can be very small; its "from" address is going to be the Exit node and it is encrypted with KE, a key that only you and the Exit node know. Then Tor creates packet no. 2, which wraps the first and is encrypted with KR. And then another packet around all of this, encrypted with KG. This is why it's called an Onion. Tor changes Circuits every few minutes. The Exit node doesn't know where the packet is from. The ID of the Circuit goes inside the packets too, and that is how the Exit node knows how to send a packet back. The only one that can open the whole onion is the A machine (the one that sent the first packet), because it has the 3 keys. When you have an onion address it is your final address (your B). This address is available in the Tor directory.
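
The three layers can be sketched like this. It is an illustration of the layering only: `toy_cipher` is a made-up XOR scheme that is NOT secure, the keys are fixed strings instead of keys negotiated with each node, and circuit IDs and addressing are left out.

```python
import hashlib


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 based keystream. Illustration only, NOT secure.

    Applying it twice with the same key restores the original bytes.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# One key per node of the circuit, named as in the notes (values are made up).
KG, KR, KE = b"key-guard", b"key-relay", b"key-exit"


def wrap_onion(message: bytes) -> bytes:
    """The sender encrypts for the Exit first, then the Relay, then the Guard."""
    packet = toy_cipher(KE, message)  # innermost layer
    packet = toy_cipher(KR, packet)   # middle layer
    return toy_cipher(KG, packet)     # outermost layer


def peel(key: bytes, packet: bytes) -> bytes:
    """Each node removes only its own layer."""
    return toy_cipher(key, packet)


message = b"GET / HTTP/1.1"
onion = wrap_onion(message)
after_guard = peel(KG, onion)        # still wrapped in KR and KE
after_relay = peel(KR, after_guard)  # still wrapped in KE
after_exit = peel(KE, after_relay)   # plain message leaves towards the internet

print(after_exit)
```

Only the sender holds all three keys, so only the sender can build the whole onion or open a full reply.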

FTP - a method, a protocol, for interaction between 2 nodes (port 21). You can write your own FTP client or server with Python, for example. It used to be the most common way to share files. It is not encrypted, and it is complicated to make it encrypted. At some point you couldn't open it in a regular browser because it was considered unsafe. Now, during the pandemic, Google is allowing FTP in Chrome again. FTP is an equivalent to HTTP (ftp://). Old institutions (universities and such) still use it, but it is being ignored by the internet community.