User:Alice/IFL
My contribution to Xpub Library
Research questions
- How can we represent the books in the collection differently, using the idea of stacks as mixtapes organized by study path and reading time?
- What kind of interface would best suit a library that serves our community?
Knowledge organization
Examples
Stacks
A stack is a number of books that are read over the same period of time, alternating between them. They usually have a topic in common, or follow a certain study path that can bring you to a point of knowledge. Rather than a bookshelf, where books are lined up and often forgotten, the stack on your table/nightstand/toilet consists of books prone to be opened and reopened at any time.
Reading time
For this first challenge, I looked into representing a text by the amount of time it takes to read (similar to Medium's reading-time estimate). I adapted this script (license), which uses BeautifulSoup to extract the text from an HTML file and prints the estimated reading time in minutes and hours.
import urllib.request
from math import ceil

import bs4

WPM = 180          # assumed reading speed, in words per minute
WORD_LENGTH = 5    # assumed average word length, in characters


def extract_text(url):
    # Fetch the page and return every text node in the document.
    html = urllib.request.urlopen(url).read()
    soup = bs4.BeautifulSoup(html, 'html.parser')
    return soup.find_all(string=True)


def is_visible(element):
    # Skip text a browser would not render, comments, and whitespace-only nodes.
    if element.parent.name in ['style', 'script', '[document]', 'head', 'title']:
        return False
    elif isinstance(element, bs4.element.Comment):
        return False
    elif not element.strip():
        return False
    return True


def filter_visible_text(page_texts):
    return filter(is_visible, page_texts)


def count_words_in_text(text_list, word_length):
    # Approximate the word count as characters divided by the average word length.
    total_words = 0
    for current_text in text_list:
        total_words += len(current_text) / word_length
    return total_words


def estimate_reading_time(url):
    texts = extract_text(url)
    filtered_text = filter_visible_text(texts)
    total_words = count_words_in_text(filtered_text, WORD_LENGTH)
    minutes = ceil(total_words / WPM)
    hours = round(minutes / 60, 2)
    return [minutes, hours]


html_file = 'file:///home/alice/Documents/Reader%20final/txt/where_is.html'
values = estimate_reading_time(html_file)
print('It will take you', values[0], 'minutes or', values[1], 'hours to read this text')
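As a possible next step toward stacks as mixtapes, the same kind of estimate could be summed over a whole stack. The sketch below is only an illustration under a few assumptions: it counts whitespace-separated words instead of dividing characters by an average word length, and `reading_minutes`, `format_duration`, `stack_reading_time` and the sample stack are hypothetical names and data, not part of the adapted script.

```python
from math import ceil

WPM = 180  # assumed reading speed in words per minute, as in the script above


def reading_minutes(text, wpm=WPM):
    """Estimate reading time in whole minutes from whitespace-separated words."""
    return ceil(len(text.split()) / wpm)


def format_duration(minutes):
    """Render minutes as 'Xh YYm', like the running time printed on a mixtape."""
    hours, mins = divmod(minutes, 60)
    return f"{hours}h {mins:02d}m"


def stack_reading_time(stack):
    """Sum the estimated reading time of every text in a stack (title -> text)."""
    return sum(reading_minutes(text) for text in stack.values())


if __name__ == "__main__":
    # Hypothetical stack: two placeholder texts of 900 and 270 words.
    stack = {"text_a": "word " * 900, "text_b": "word " * 270}
    print(format_duration(stack_reading_time(stack)))
```

Rounding each text up separately means the stack total can be slightly higher than rounding once at the end; for a "running time" label that seemed like the safer direction.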
Interface
I want to research the prospect of having only a command line interface through which anyone can search the library, shared online using gotty.
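To get a feel for what such a command line search could look like, here is a minimal sketch. It assumes the catalog is a CSV file with `title`, `author` and `tags` columns; those column names, the `--catalog` option and the sample rows are all hypothetical. A script like this is the kind of command that gotty could then wrap and share in the browser.

```python
import argparse
import csv
import io

# Hypothetical stand-in catalog; the real one would be read from a CSV file.
SAMPLE_CATALOG = """title,author,tags
Book One,Author A,networks
Book Two,Author B,libraries
"""


def load_catalog(handle):
    """Read catalog rows from an open CSV file as a list of dicts."""
    return list(csv.DictReader(handle))


def search_catalog(rows, query):
    """Return rows where any text field contains the query (case-insensitive)."""
    needle = query.lower()
    return [
        row for row in rows
        if any(needle in value.lower()
               for value in row.values() if isinstance(value, str))
    ]


def main(argv=None):
    parser = argparse.ArgumentParser(description="Search the library catalog")
    parser.add_argument("query", nargs="?", default="",
                        help="search term; empty lists the whole catalog")
    parser.add_argument("--catalog",
                        help="path to a CSV catalog (defaults to the sample)")
    args = parser.parse_args(argv)
    if args.catalog:
        with open(args.catalog, newline="", encoding="utf-8") as handle:
            rows = load_catalog(handle)
    else:
        rows = load_catalog(io.StringIO(SAMPLE_CATALOG))
    for row in search_catalog(rows, args.query):
        print(f"{row['title']} / {row['author']}")


if __name__ == "__main__":
    main()
```

Matching against every column keeps the interface to a single search term, which seems closer to browsing a shelf than filling in a form.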
Things I'm currently playing with:
Syncthing testing
Syncthing
Session with Tash, Andre & Alice: 28.05.2018
How do we install and configure Syncthing on the Raspberry Pi and on two of our own machines?
Syncthing can be used to sync book files and catalog files between different instances of our library (e.g. syncing the catalog between the server and the Pis, and syncing book files between Pis).
Files are not stored in the cloud, and Syncthing allows for a decentralized, read-write architecture (unlike rsync, which syncs in one direction in a master-slave relationship).
Running Syncthing
At first start Syncthing will generate a configuration file, some keys and then start the admin GUI in your browser.
The GUI remains available on https://localhost:8384/.
For Syncthing to be able to synchronize files with another device, it must be told about that device. This is accomplished by exchanging “device IDs”. A device ID is a unique, cryptographically-secure identifier that is generated as part of the key generation the first time you start Syncthing. It is printed in the startup log, and you can see it in the web GUI by selecting the “gear menu” (top right) and “Show ID”.
Two devices will only connect and talk to each other if they are both configured with each other’s device ID. Since the configuration must be mutual for a connection to happen, device IDs don’t need to be kept secret. They are essentially part of the public key.
To get your two devices to talk to each other, click “Add Device” at the bottom right on both, and enter the device ID of the other side. You should also select the folder(s) that you want to share. The device name is optional and purely cosmetic. It can be changed later if required.
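The mutual-configuration rule can be pictured with a toy model. This is not Syncthing code and does not use any Syncthing API: the `Device` class, `can_connect` and the short IDs are hypothetical placeholders that only mirror the rule that both sides must know each other's ID.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    device_id: str                                   # placeholder, not a real device ID
    known_ids: set = field(default_factory=set)      # IDs added via "Add Device"

    def add_device(self, other_id):
        """Record another device's ID, as "Add Device" does in the GUI."""
        self.known_ids.add(other_id)


def can_connect(a, b):
    """Two devices talk only if each one is configured with the other's ID."""
    return b.device_id in a.known_ids and a.device_id in b.known_ids


if __name__ == "__main__":
    pi = Device("PI-ID")
    laptop = Device("LAPTOP-ID")
    pi.add_device(laptop.device_id)
    print(can_connect(pi, laptop))   # one-sided configuration
    laptop.add_device(pi.device_id)
    print(can_connect(pi, laptop))   # mutual configuration
```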
Configuration
Syncthing keeps its settings in a config.xml file, which can be edited in a terminal or through the web GUI.
Each folder element describes one folder. The following attributes may be set on the folder element:
id - The folder ID, must be unique. (mandatory)
label - The label of a folder is a human readable and descriptive local name. May be different on each device, empty, and/or identical to other folder labels. (optional)
path - The path to the directory where the folder is stored on this device; not sent to other devices. (mandatory)
type - Controls how the folder is handled by Syncthing. Possible values are:
readwrite - The folder is in default mode. Sending local and accepting remote changes.
readonly - The folder is in “send-only” mode – it will not be modified by Syncthing on this device.
rescanIntervalS - The rescan interval, in seconds. Can be set to zero to disable when external plugins are used to trigger rescans.
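To make the attribute list above concrete, this sketch builds one folder element with Python's standard XML library. The folder ID, label, path and the 60-second rescan interval are made-up example values, `make_folder_element` is a hypothetical helper, and the `readwrite`/`readonly` values are the ones listed in these notes (other Syncthing versions may name them differently).

```python
import xml.etree.ElementTree as ET

# Type values as listed in the notes above; treated here as an assumption.
FOLDER_TYPES = ("readwrite", "readonly")


def make_folder_element(folder_id, path, label="", folder_type="readwrite",
                        rescan_interval_s=60):
    """Build a <folder> element carrying the attributes described above."""
    if folder_type not in FOLDER_TYPES:
        raise ValueError(f"unknown folder type: {folder_type}")
    return ET.Element("folder", {
        "id": folder_id,                           # mandatory, must be unique
        "label": label,                            # optional, local display name
        "path": path,                              # mandatory, local directory
        "type": folder_type,
        "rescanIntervalS": str(rescan_interval_s),
    })


if __name__ == "__main__":
    # Hypothetical catalog folder on a Pi.
    catalog = make_folder_element("catalog", "/home/pi/library/catalog",
                                  label="Catalog")
    print(ET.tostring(catalog, encoding="unicode"))
```

Declaring the catalog and the book files as two separate folders like this would let each be shared with a different set of devices.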
Because the Pi has no browser of its own to open the GUI, you can change the GUI address in the config file from the loopback address (127...) to 0.0.0.0, served on the Apache web server. You can then look at the GUI remotely in a browser on another machine. Alternatively, you can add device IDs directly to the config file from the terminal.
Question: Can we have read-write permissions on the main Pi, and read-only permissions on all the others? Probably.
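The GUI address change described above could also be scripted rather than edited by hand. This is a rough sketch under assumptions: `SAMPLE_CONFIG` is a heavily trimmed stand-in for a real config.xml, it assumes the GUI address lives in an `address` element inside `gui` with `127.0.0.1:8384` as the starting value, and `set_gui_address` is a hypothetical helper. Syncthing should be stopped while the real file is edited.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for config.xml; the real file contains much more.
SAMPLE_CONFIG = """<configuration>
    <gui enabled="true">
        <address>127.0.0.1:8384</address>
    </gui>
</configuration>"""


def set_gui_address(config_xml, new_address):
    """Return the config text with the GUI listen address replaced."""
    root = ET.fromstring(config_xml)
    address = root.find("./gui/address")
    if address is None:
        raise ValueError("no <gui><address> element found")
    address.text = new_address
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Listen on all interfaces so the GUI is reachable from other machines.
    print(set_gui_address(SAMPLE_CONFIG, "0.0.0.0:8384"))
```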
Troubleshooting
Kernel Panic
Don't use the shark SD cards! Aymeric bought them for super cheap and they will corrupt the f up.
Kernel panic means you have to try and reboot the Pi in recovery mode. Or... abort.
Merging & file conflicts
Editing CSV files on different nodes at the same time will result in conflicts.
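One way to think about these conflicts is a row-by-row three-way merge: compare each node's copy of the catalog against the last version both had in common, and only flag rows that both sides changed differently. The sketch below is an experiment, not something Syncthing does for you; it assumes every row has a unique `id` column, and `load_rows`, `merge_catalogs` and the sample data are hypothetical.

```python
import csv
import io


def load_rows(csv_text, key="id"):
    """Index CSV rows by their key column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def merge_catalogs(base, ours, theirs):
    """Three-way merge of row dicts; returns (merged rows, conflicting keys)."""
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:        # both sides agree (same edit, or both deleted)
            winner = o
        elif o == b:      # only the other node changed or deleted this row
            winner = t
        elif t == b:      # only this node changed or deleted this row
            winner = o
        else:             # both changed it differently: needs a human
            conflicts.append(key)
            winner = o    # keep our version provisionally
        if winner is not None:
            merged[key] = winner
    return merged, conflicts


if __name__ == "__main__":
    base = load_rows("id,title\n1,A\n2,B\n3,C\n")
    ours = load_rows("id,title\n1,A edited\n2,B\n3,C ours\n")
    theirs = load_rows("id,title\n1,A\n3,C theirs\n4,D\n")
    merged, conflicts = merge_catalogs(base, ours, theirs)
    print(sorted(merged), conflicts)
```

This only works if each node keeps a copy of the last common version of the catalog, which is an extra thing to store and sync.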
How to make a fault tolerant, decentralized file system which will allow up-to-date uploads, edits and deletions between different nodes?
Important for us: How do we keep the catalog and the files separate so that only the catalog is visible to the public? And how do we make sure the files and the catalog are synced in a distributed way?