User:Rita Graca/specialissue9: Difference between revisions

From XPUB & Lens-Based wiki

Revision as of 12:49, 26 June 2019

Introduction to Shadow Libraries

See pad: https://pad.xpub.nl/p/IFL_introductions

In the first class of this trimester, we (XPUB) were introduced to a big collection of shadow libraries. Each library seems to follow a different path: distinct methods for cataloging, an idea of what the role of the librarian should be, a specific user interface, etc.

We looked into Aaaaarg, Ubu, Project Gutenberg, Library Genesis, Clockwise libraries, Memory of the World, Monoskop, Bibliotecha and The Pirate Bay. We tried to answer some questions:

   Content: What is in the library? How much?
   Users: Who is using / uploading / downloading?
   Catalog: What is the system? How is it organized? How about its ontology?
   Infrastructure: What are the technical specs? Software? Hardware?
   Politics: What is the attitude?
   Economy: Sponsors? Donation? Advertising?
   Law: How does it interface?

Pedro and I looked into Library Genesis and .Onion libraries. However, we were browsing some of these websites for the first time. After Bodó's workshop, we learned a lot more about LibGen and saw that some of our first impressions were wrong.

LIBRARY GENESIS

Homepage of Library Genesis


Content: What is in the library? How much?
In 2014 it had 25 million documents (42 terabytes)?

Users: Who is using / uploading / downloading?
(Balázs Bodó input: everyone. It is not only countries with less access to expensive books that use this library. There is a lot of activity from high-income North American and European countries, which are the biggest per capita users.)

Catalog: What is the system? How is it organized? How about its ontology?
You can search by topics, or by genre (fiction, comics, etc.)
An index list is provided in each category
No curation, everything seems valid. From their FAQ: "we are random book collectors: if we see a book somewhere and it's not in LG yet, we take it.", "We do not fetch specific books, we rather gather collections from public zones on the Web."
(BB input: the focus of the library depends on the kind of archives being dumped: by year, language, etc. E.g. pre-2011 it was Russian and academic work, mostly natural sciences and mathematics. And actually, there are not many downloads of fiction.)
Uploading is advertised as being very easy, so is there a lot of duplicated material? (BB input: actually, LibGen works more with batch uploads, big quantities of files being uploaded from one source, rather than single-book uploads by random users.)

Infrastructure: What are the technical specs? Software? Hardware?
Russian servers?

Politics: What is the attitude?
They seem to distance themselves from the idea of bringing academic research to people without access / Sci-Hub
"If you are from India, Pakistan or Iran, you may have difficulties with finances and be tempted to place such requests, then this answer is for you. There may exist some sites on the net that can help you find certain books upon request, but we simply cannot do this. If you need the book urgently and it's missing in LG, please, do not rely on us and try to get it from some other place."

Economy: Sponsors? Donation? Advertising?
No sponsors or donations visible on the website, nor in the site map or forum.
Advertising on download
On Reddit you can find a thread (https://www.reddit.com/r/libgen/comments/2m2m1p/libgen_needs_donations_for_a_new_server/) where they asked for donations in bitcoin to buy a new server

Law: How does it interface?
...


.ONION LIBRARIES

Content: What is in the library? How much?
Much less content, curatorial side, a small list of books, focused on one/two categories. libraryqtlpitkix is focused on the sciences.

Users: Who is using / uploading / downloading?
You have to access them through a different kind of browser; we used Tor. It's hard to find them by chance, so it takes an effort to see these libraries. The collections are very specific: are these books hard to find in LibGen? Rare findings?

Catalog: What is the system? How is it organized? How about its ontology?
Different kinds of organization: either one page with a scrolling list, or a warezy structure of directories, with everything stored and organized together

Infrastructure: What are the technical specs? Software? Hardware?
...

Politics: What is the attitude?
Economy: Sponsors? Donation? Advertising?

Law: How does it interface?
...

Guests

Balázs Bodó

What information can we understand from the top 10 downloaded books in LibGen?

Data from LibGen, top 10 books downloaded organized by country


TOP 10 IN PORTUGAL

General observations:
— 1 book in Portuguese (a translation), all others in English
— almost all universities make the mandatory books for the year available through a photocopy/copy store, which can have an impact on downloads


  • BOOK 1

444166, 34, Norman K. Denzin, Yvonna S. Lincoln, The SAGE Handbook of Qualitative Research, Sage Publications, Inc, 2005, English

ISBN: 9781483349800

official publisher — Sage Publications, £120.00
pirate — LibGen, available

3 Portuguese bookstores:
Fnac — 129,89 €, not available in store
Bertrand — 148,82€, not available in store
Wook — 148,82€, not available in store

international:
amazon.com — $132.14, in stock with more buying choices, ebooks, etc.

Found in public libraries (Porto and Lisboa)? no

Syllabus of:
Degree (BSc) in Psychology, ISCTE-IUL (note: this is a well-rated university in Lisbon, capital city)
(https://fenix.iscte-iul.pt/disciplinas/l5207/2018-2019/2-semestre/bibliografia?locale=en_EN_ISCTE)

  • BOOK 2

881883, 32, Henry Gleitman, Alan J. Fridlund, Daniel Reisberg, Psicologia, Fundação Calouste Gulbenkian, 2003, Portuguese

ISBN: 9789723113709

official publisher — Fundação Calouste Gulbenkian, 40.00€, not available (the foundation has a partnership with Ebooks from Marka, but the book was not available there either)
pirate — LibGen, available

3 Portuguese bookstores:
Fnac — 32€, available
Bertrand — 40,02€, not available
Wook — 40,02€, not available

international:
amazon.com — only option: buy new, $203.00, available

Found in public libraries (Porto and Lisboa)? yes

Syllabus of:
Universidade Católica Portuguesa, Psicologia course (https://fch.lisboa.ucp.pt/asset/4016/file) (note: this is a well-rated university in Lisbon, capital city)
Universidade Lusófona, Ciências Sociais course (revistas.ulusofona.pt/index.php/cadernosociomuseologia/article/view/2672/2039)

  • BOOK 3

777469, 28, Gerard J. Tortora, Bryan H. Derrickson, Introduction to the Human Body, Wiley, 2009, English

pirate — LibGen, pdf available

3 Portuguese bookstores:
Wook: 63,24€, online store, available
Bertrand: 63,24€, online store, available; not available in physical store
FNAC: 66,29 €, needs to be ordered online, takes 1 to 2 weeks

international:
Amazon: hardcover rental, $24.99 (due date: Aug 13, 2019)
Buy new, $66.66, in stock
Ebook not available

  • BOOK 4

819203, 26, Frank H. Netter MD, Atlas of Human Anatomy 4e (Netter Basic Science), Saunders, 2006, English

pirate: LibGen, pdf

3 Portuguese bookstores:
Wook — Not available
Bertrand — 80,73€ — Online store, available, Not available physical in store
FNAC — 63,90 € Online and in stock

International:
Amazon: Buy new, $71.97, 1 In Stock.
Buy used, $15.69, Condition: Good

Syllabus of:
Anatomia e Histologia, Universidade de Coimbra
https://sigarra.up.pt/fcnaup/pt/ucurr_geral.ficha_uc_view?pv_ocorrencia_id=331954

DEPT. DE FÍSICA
http://fisica.uc.pt/fa/20092010/showit__uc.php?id_disc=155&id_typ=4&lnkd=links11non&lnkdi=links11non&lkwc=:QwM:fisica:QhM:class:QhM:getpresentation.do:QqM:idclass:QkM:114:QgM:idyear:QkM:4&ckwc=&inwc=1

Teaching Human Anatomy to the Graduation Course in Health Sciences of the Lisbon University: Five Years of a New Educational Experience
https://www.actamedicaportuguesa.com/revista/index.php/amp/article/view/4254/3358

Syllabus of Universidade de Aveiro, Escola Superior de Saude
http://www.ua.pt/essua/PageDisc.aspx?id=2863

  • BOOK 5

746957, 25, Ian Shaw (editor), The Oxford History of Ancient Egypt, Oxford University Press, USA, 2000, English

ISBN: 9780192804587

pirate — LibGen, available

3 Portuguese bookstores:
Fnac — 15,20 €, not available in store
Bertrand — 16,11€, not available in store
Wook — 16,11€, not available in store

international:
amazon.com — $15.50, with other buying options

Syllabus of:
Universidade de Lisboa (http://repositorio.ul.pt/bitstream/10451/2461/1/ulsd059259_td_vol.2_13.%20Bibliografia.pdf)

  • BOOK 9

814731, 22, Maria Teresa Marabini Moevs, Cosa: the Italian sigillata, University of Michigan Press, 2006, English

pirate: LibGen, pdf

3 Portuguese bookstores:
Wook — 0 results
Bertrand — 0 results
FNAC — 85,18€, Online 1 stock

international:
Amazon: $88.63, 2 in stock.

The SAGE Handbook of Qualitative Research, $122.37 on Amazon


See pads:
https://pad.xpub.nl/p/IFL_bodobalasz
https://pad.xpub.nl/p/libgen_top10

Rietveld Academy Library

Librarianship, politics of classification, machine-readable metadata, and more.

http://catalogue.rietveldacademie.nl/eg/opac/home
http://catalogue.rietveldacademie.nl/class/

See pad: https://pad.constantvzw.org/p/rietveld_library

Eva Weinmayr

Experiencing books that have been copied, altered, reframed, edited, pirated, ...

Bo & Rita:

  1. Feminism/Postmodernism

Reproducing (but not exactly). Producing again, not reproducing. Improving materiality (imposition, binding), adding value. Localizing.
Crafting
This book was photocopied and assembled in a copy store at a client's request.
It is not imitating or copying: the design of the cover is clearly different. It doesn't try to reproduce the original perfectly, or to fake it.
There are design decisions made: the spine writing is upside down. (recultured)

What is the price of these books? What is the price of a reproduction?
Original book: 35 dollars on Amazon (US)

  • Cover

Wouldn't it be easier to photocopy the whole cover? It was a choice to keep only the title and a few other elements; it probably took more time if they had to use a sheet of paper to hide the other elements in the photocopy machine.
The back cover is also very different: no information about the author and no short biography. The emphasis on the author is removed.

  • Spine

The authoritative book has the title running from top to bottom.
The reproduced one has it running from bottom to top. It's crafted. (Crafting)

  • Format

Printed size, paper and binding are slightly different.
Deliberate decision: a bit smaller than A5. Maybe they chose this size because it was convenient for printing?

  • Digital copy from LibGen:

The cover is the same.
The contents are photocopied.
There is no Back-cover.

In the authoritative copy there's a slip of paper from 2004 reminding the person who borrowed the book to put it back.


  2. A Room of One's Own
picture and work by Kajsa Dahlberg, 2006


Marginal notes: a very similar approach to the annotated text we were doing in Steve's class. How many people were involved?

You can see the photocopy marks.

The notes are all readable, in that each has its own space; they are not all on top of each other. How was this done? Did someone design and edit the notes, or are all the authors using the same book? Is this one book annotated by different people, or many books each annotated by its owner?

A very subjective vision: your reading is being interrupted by others' thoughts. The authorship is shared, a collective way of reading and writing. Authorship is participative, interdependent. You transmit your knowledge; the reading is shared.

  • Cover

Looks like a Reclam edition, but different. Just like the cover of the Feminism/Postmodernism copy, it doesn't try to reproduce the original perfectly or to fake it.

Reproducing
Adapting
Caring
Adding
Quoting
Inserting
influencing
opinioning
bringing into conversation
layering
merging

The Reclam book:
https://www.reclam.de/detail/978-3-15-018887-3/Woolf__Virginia/Ein_Zimmer_fuer_sich_allein

http://wiki.evaweinmayr.com/index.php/Main_Page

See pad: https://pad.xpub.nl/p/IFL_weynmayr

Dušan Barok

See pad: https://pad.xpub.nl/p/IFL_2019-06-04

Techniques

OCR

command line for simple tesseract:

tesseract nameofpicture.png outputbase
Scan of a book page


Output: recognition of the characters with tesseract-ocr and styled with javascript


https://github.com/tesseract-ocr/tesseract
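
The same tesseract command can be called from a script, for example to run it over many scanned pages. This is only a minimal sketch, not the code used in class: the function names are made up for illustration, and it assumes the tesseract binary is installed and on the PATH. The only facts it relies on are the command form shown above (image, then output base), the `-l` language option, and that tesseract writes its text to `outputbase.txt` by default.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_command(image, outputbase, lang=None):
    """Build the argument list: tesseract <image> <outputbase> [-l <lang>]."""
    command = ["tesseract", str(image), str(outputbase)]
    if lang:
        # -l selects the language model, e.g. "eng" or "por"
        command += ["-l", lang]
    return command


def ocr_page(image, outputbase, lang=None):
    """Run tesseract on one scanned page and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    subprocess.run(tesseract_command(image, outputbase, lang), check=True)
    # By default tesseract writes plain text to <outputbase>.txt
    return Path(f"{outputbase}.txt").read_text(encoding="utf-8")
```

Calling `ocr_page("nameofpicture.png", "outputbase")` would run the same command as above and return the recognized text as a string.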

See pad: https://pad.xpub.nl/p/IFL_2018-05-14

Steganography

See pad: https://pad.xpub.nl/p/steganography_amy

Prototyping

Image classifier for annotations

At the time of this special issue, a point of interest for everyone was annotations. We were reading and annotating texts together and debating the possibilities of sharing these notes. One particular discussion was about what could/should be considered as annotation: folding corners of pages, linking to other contents, highlighting, scribbling, drawing. I was curious if we could train a computer to see all of these traces, so I started prototyping some examples.

Aim: make the computer recognize "clean" pages of books or "annotated" pages of books.

Using the script from .py.rate.chnic session 2, pad notes here, and Alex's git here. My data set here.

"Annotated" example from data set > test set


"Clean" example from data set > test set


Each set (test and training) had 50 examples of "clean" pages and "annotated" pages; it makes sense to add more in the future.
The results were not very accurate. Pages with handwritten text gave better results, while highlighting and computer notes were often misinterpreted. It would be useful to see what the computer is looking for, to understand whether the script is breaking the image into parts, and to try other scripts.
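
One way to see where the classifier goes wrong is to count, per label, how often the prediction matches the real label. The sketch below is an assumption about how this could be checked, not part of the script I used: the function names are hypothetical, and it only expects two lists of labels ("clean" / "annotated") of the same length.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count (true, predicted) pairs, e.g. ("annotated", "clean")."""
    return Counter(zip(true_labels, predicted_labels))


def per_class_accuracy(true_labels, predicted_labels):
    """Share of correctly predicted pages for each true label."""
    counts = confusion_counts(true_labels, predicted_labels)
    totals = Counter(true_labels)
    return {label: counts[(label, label)] / totals[label] for label in totals}
```

Keeping a separate label for each kind of trace (handwriting, highlighting, computer notes) would show directly which annotations the computer misses most.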

Some results:

Computer categorization for text files

The actions of categorizing and cataloging happen in the most mundane activities, but they are not innocent. They translate values and certain visions of the world.
In the Rietveld Academy Library, we saw how the librarians are challenging the Library of Congress classification. With Dušan we browsed in the Monoskop Index, an interesting combination of a “book index, library catalog, and tag cloud”.
With this script, I was experimenting with an automated classification of text files. The script searches for the three most common words in the text and tries to match these words to a category. For example, if one of the most common words is “books”, the category of the text is considered “Library Studies”. The same would happen with the words “archives”, “author”, “bibliographic”, “bibliotheca”, “book”, “bookcase”, etc. The script only has one category right now, but it would be easy to add more. By doing so, I would be making associations that are very personal and sometimes inaccurate, and I would be creating a bias in the catalog.
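
The idea can be sketched in a few lines of Python. This is a simplified reconstruction, not the script in the git repository: the function names, the small stop-word list and the "Uncategorized" fallback are assumptions added for illustration. Only the "Library Studies" category and its keywords come from the description above.

```python
import re
from collections import Counter

# Only "Library Studies" comes from the text; more categories could be added here.
CATEGORIES = {
    "Library Studies": {
        "archives", "author", "bibliographic", "bibliotheca",
        "book", "bookcase", "books",
    },
}

# A tiny stop-word list (an assumption) so that "the" or "and" don't win.
STOPWORDS = {
    "the", "and", "of", "a", "to", "in", "is", "that",
    "it", "for", "are", "on", "as", "with",
}


def most_common_words(text, n=3):
    """Return the n most frequent words, ignoring case and stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]


def categorize(text, n=3):
    """Match the most common words against each category's keywords."""
    top = set(most_common_words(text, n))
    matches = [name for name, keywords in CATEGORIES.items() if keywords & top]
    return matches or ["Uncategorized"]
```

Every keyword added to `CATEGORIES` is a personal decision about what a word "means", which is exactly where the bias enters the catalog.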

Testing it with Balázs Bodó's text, Own Nothing


Git: https://git.xpub.nl/rita/categorization_of_files

Workshop: Knowledge in Action

Test runs, June 12

Through role-play, you will perform the activities crucial to the sustenance of libraries. You will interpret and reimagine the actors that take part in knowledge production and distribution, such as the librarian, the researcher, the pirate, the publisher, the reader, the writer, the student, the copyist, the printer.

The workshop consists of three activities where different scenarios shift your accustomed perspective to start common dialogues. Put yourselves in the shoes of the librarian, imagine together a reading space, and contest the morality of knowledge ownership.


Workshop with 3 activities

ONE: Librarian's Choice

1. We provide 10 books (updated to: the group chooses 10 books from the library). As a group, you should decide on 5 books to keep and 5 to throw away. You are a librarian, so try to think beyond your personal preferences. The group will have to debate to reach agreement.

2. As a second action, we ask you to re-think the decisions, now based on specific situations. You are the librarian:

  • Decide on what books to keep for a leisure library
  • Decide on what books to keep for a retirement house
  • Decide on what books to keep for a book café

Did anything change?

3. As a third action, you now decide on book categories. We provide 10 categories from the Willem de Kooning library; the group should decide on 5 to keep and 5 to throw away.

Excluded books from the first round
Choosing between library categories


TWO: Perfect Library

1. Think now as a user, the reader, the library goer. The goal is to create our collective perfect library. We provide post-its and ask you to:

  • Write three categories of books you would like to have.
  • Think about furniture/spaces.
  • Custom/random stuff.

2. You should all think about the organization of the categories and organization of the space. What books are near what?

Second round:
3. Look at the Willem de Kooning library now. Revise the categories.

  • Write three categories you would like to have in your perfect library
  • Browse around the library shelves and see if they have your categories
  • Think about the three categories you wrote in action 1. How do these categories fit in current categories? Write your category proposal on tags, insert them where you'd like them to be.

Are the spaces you imagined in action 1 present?

Creating together an ideal library


THREE: Future Library

  • Choose a role/character.
  • Take some time to read the quotes and familiarize yourself with your character.
  • We provide a case.
  • You should defend your character's best interests. Make use of the quotes if you want, but feel free to improvise.

CASE 1: Ming recently graduated from a Chinese university, so she lost access to her academic research database subscription. Access to academic resources is way too expensive for her. Her nearest library is actually very good at providing the resources; however, she prefers writing at home, where she can be alone. She uses shadow libraries to access research materials.
Speak from the best interest of the roles you have selected and interpret the scenario.

Second round:

CASE 2: A researcher just made significant discoveries in a particular field and would like to make the work available to as many people as possible.
Speak from the best interest of the roles you have selected and interpret the scenario.


Feedback from Femke:
Great you found multiple ways to play out and discuss complex questions around libraries, categorisation, legality, inclusion and exclusion. The three formats are quite different and that adds to the space for reflection that you open up.

Maybe in your introduction, you can explain why you took this specific direction: why are you interested in the subject matter(s) and methods.

The connection to the Leeszaal selection process is great.

You could spatialize/theatricalize the actions a lot more: what does the ‘keep’ and ‘throw’ areas look like (is there a bin?), badges for the roles ...

In general: transitions between ‘rounds’ seem a bit abrupt (but there is obviously a story interconnecting them). After 2nd round: It seems important to think about the flow between the three parts (that now feel more like two parts). How are the parts conceptually connected or maybe more clearly differentiated? How do you explain their relation, or their difference? Perhaps think of a way to connect all three through the question of inclusion/exclusion?

From the first test: The ‘ideal library’ is a bit of a generic question and risks leading to quite predictable generic proposals (a beanbag ;-)). Maybe think of ways to push the categories a bit more, and to ask people to think of an ideal library for these specific categories.

The pre-selected categories could be made more challenging: in relation to Leeszaal, and in relation to thinking about inclusion/exclusion (can you find more ambiguous, conflicting or otherwise problematic categories?). How can you choose them in such a way that they produce discussion, tension (different scales, levels of formality …)

The part introducing ‘choose your categories’ could be formulated a bit more sharply, I feel, to really challenge participants to try out what categories can do, and how they might differ from person to person.

The role game in the end works very well. Of course the ‘flashcards’ could do with some more work. They are a good vehicle, not just for the workshop but seem to provide you with a useful structure to think through Issue #9.

We all felt that the players should have a bit more time with the role, and be asked to introduce themselves (we tried it, it worked). You could both play a role in this round, to spice up the discussion. You could be Ming for example!

Works well to come back to what happened at the end and reflect with the participants.