User:Alice/IFL

From XPUB & Lens-Based wiki

Revision as of 23:18, 13 May 2018

My contribution to Xpub Library

Research questions

  • How can we represent the books in the collection differently, using the idea of stacks as mixtapes that depend on study path and reading time?
  • What kind of interface would best suit a library that serves our community?

Reading time

For this first challenge, I looked into representing a text by the time it takes to read, similar to what Medium does. I adapted this script (https://github.com/assafelovic/reading_time_estimator; license: https://github.com/assafelovic/reading_time_estimator/blob/master/LICENSE), using BeautifulSoup to extract the text from an HTML file and print the estimated reading time in minutes and hours.

import urllib.request
from math import ceil

import bs4

# Assumed average reading speed (words per minute) and average word length
# (characters per word) used for the estimate.
WPM = 180
WORD_LENGTH = 5


def extract_text(url):
    """Fetch the page at `url` and return every text node in it."""
    html = urllib.request.urlopen(url).read()
    soup = bs4.BeautifulSoup(html, 'html.parser')
    return soup.find_all(string=True)


def is_visible(element):
    """Return True for text nodes a reader would actually see."""
    if element.parent.name in ['style', 'script', '[document]', 'head', 'title']:
        return False
    if isinstance(element, bs4.element.Comment):
        return False
    # Skip whitespace-only nodes such as the newlines between tags.
    if not element.strip():
        return False
    return True


def filter_visible_text(page_texts):
    return filter(is_visible, page_texts)


def count_words_in_text(text_list, word_length):
    """Estimate the word count as total characters / average word length."""
    total_words = 0
    for current_text in text_list:
        total_words += len(current_text) / word_length
    return total_words


def estimate_reading_time(url):
    """Return [minutes, hours] needed to read the visible text at `url`."""
    texts = extract_text(url)
    filtered_text = filter_visible_text(texts)
    total_words = count_words_in_text(filtered_text, WORD_LENGTH)
    minutes = ceil(total_words / WPM)
    hours = round(minutes / 60, 2)
    return [minutes, hours]


html_file = 'file:///home/alice/Documents/Reader%20final/txt/where_is.html'

values = estimate_reading_time(html_file)

print('It will take you', values[0], 'minutes or', values[1], 'hours to read this text')
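To make the arithmetic behind the estimate concrete, here is a small sketch that applies the same character-based formula to a plain string, with no file access or HTML parsing. The helper names `estimate_minutes` and `format_duration` are illustrative and not part of the adapted script; the two constants are the ones used in the script above.

```python
from math import ceil

WPM = 180          # reading speed in words per minute, as in the script above
WORD_LENGTH = 5    # average characters per word, as in the script above


def estimate_minutes(text, wpm=WPM, word_length=WORD_LENGTH):
    """Estimate reading time in whole minutes from the character count."""
    estimated_words = len(text) / word_length
    return ceil(estimated_words / wpm)


def format_duration(minutes):
    """Turn a number of minutes into a string such as '1 h 25 min'."""
    hours, rest = divmod(minutes, 60)
    if hours:
        return f'{hours} h {rest} min'
    return f'{rest} min'


# 76,500 characters / 5 = 15,300 estimated words; 15,300 / 180 = 85 minutes.
sample = 'x' * 76500
print(format_duration(estimate_minutes(sample)))  # 1 h 25 min
```

Splitting the total into hours and remaining minutes with `divmod` may read more naturally on a book listing than a fractional number of hours.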

Interface

I want to research the prospect of having only a command line interface through which anyone can search the library.
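
As a starting point for that research, here is a minimal sketch of what a command line search could look like, using only Python's standard `argparse` module. Everything in it is an assumption made for illustration: the `CATALOGUE` entries are made-up placeholders, and the field names and the `search` and `main` functions are hypothetical, not an existing part of the library.

```python
import argparse

# Made-up placeholder entries; a real catalogue would be loaded from the
# library's own data.
CATALOGUE = [
    {'title': 'Placeholder Title One', 'author': 'First Author', 'minutes': 95},
    {'title': 'Placeholder Title Two', 'author': 'Second Author', 'minutes': 240},
]


def search(catalogue, query):
    """Return entries whose title or author contains `query` (case-insensitive)."""
    needle = query.lower()
    return [
        book for book in catalogue
        if needle in book['title'].lower() or needle in book['author'].lower()
    ]


def main(argv=None):
    """Parse a query from the command line and print the matching books."""
    parser = argparse.ArgumentParser(description='Search the library catalogue.')
    parser.add_argument('query', help='text to look for in titles and authors')
    args = parser.parse_args(argv)
    for book in search(CATALOGUE, args.query):
        print(f"{book['title']} by {book['author']} ({book['minutes']} min)")


# Passing an explicit argument list stands in for typing the query in a shell.
main(['second'])
```

Keeping `search` separate from the argument parsing means the same function could later sit behind a different front end, for example a terminal shared through a browser.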

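Things I'm currently playing with:

  • Gotty (https://github.com/yudai/gotty)
  • Tmux (https://github.com/tmux/tmux/wiki)
  • Urwid (http://urwid.org/tutorial/index.html)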