OuNuPo: Difference between revisions
Andre Castro (talk | contribs) |
Andre Castro (talk | contribs) (→week 2) |
Revision as of 11:21, 30 January 2018
Special Issue 5: OuNuPo, Ouvroir de Numérisation Potentielle
Project Description
Main partner: WORM (WORM's Pirate Bay to be precise)
Special guests: Manetta Berends and Cristina Cochior (Algolit group)
The outcome of the special issue will be the following things:
- 2 book scanners (one to stay at XPUB, one to stay at WORM)
- one unique (as in unique copy) reader in the form of an artist's book.
The reader will be a collection of texts curated by students and staff on the topics of book scanners, text mobility, constrained writing, algorithmic literature and, I also hope, the culture and politics of OCR, text analysis, and AI in the context of text processing and generation.
- a collection of different software back-ends for the book scanner, so as to reconfigure its functionality, i.e. you might scan a book and get a PDF whose content has been manipulated in a poetical or critical way, or you could get something else entirely: a sound file, a collection of images, etc.
- a gigantic collection of files produced by feeding the reader as input source into the scanner, making use of the plethora of different back-ends. IMPORTANT: the reader will never exist as a typical digital alter ego of the analog original, but only through a multitude of different digital interpretations.
- an evening launch at WORM with a presentation of the reader, back-ends, results, etc.
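The idea of interchangeable back-ends can be sketched as a small registry that maps a name to a function receiving the scanned text. This is only an illustration: the registry, the `backend` decorator, the `process` function and the three example back-ends are hypothetical names, not part of any existing scanner software, and the OCR step is assumed to have already produced plain text.

```python
# Hypothetical registry of back-ends: name -> function taking OCR text.
BACKENDS = {}


def backend(name):
    """Register a function as a scanner back-end under `name`."""
    def register(func):
        BACKENDS[name] = func
        return func
    return register


@backend("plain")
def plain(text):
    # The "typical" output: the text as recognised.
    return text


@backend("reverse-lines")
def reverse_lines(text):
    # A poetical manipulation: mirror every line.
    return "\n".join(line[::-1] for line in text.splitlines())


@backend("word-count")
def word_count(text):
    # A critical reading: how often does each word occur?
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def process(text, name):
    """Run the scanned text through the back-end called `name`."""
    return BACKENDS[name](text)
```

Scanning the same page with `process(page, "reverse-lines")` or `process(page, "word-count")` then yields different interpretations of one source, which is the point of the collection of files described above.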
Sessions
- 09.01.2018 Intro to the 'Feminist Reader' pad
- 10.01.2018 The Alphabet as Software - R&W
- 15.01.2018 Optical character recognition - pad
- 17.01.2018 Python and N+7 - pad
- 18.01.2018 OuLi/NuPo, natural language processing - pad
- 29.01.2018 Sonification - pad
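As a reminder of the N+7 constraint from the 17.01 session: every noun in a text is replaced by the noun found seven entries further on in a dictionary. A minimal sketch, assuming a tiny made-up noun list stands in for the dictionary, that any word found in that list counts as a noun, and that the list wraps around at the end (`NOUNS` and `n_plus` are illustrative names):

```python
import re

# Hypothetical, alphabetically sorted noun list standing in for a real dictionary.
NOUNS = sorted([
    "alphabet", "book", "camera", "character", "file", "font", "image",
    "letter", "library", "page", "pirate", "reader", "scanner", "sound",
    "text", "word",
])


def n_plus(text, nouns=NOUNS, n=7):
    """Replace each word found in `nouns` by the entry n places later, wrapping around."""
    index = {word: i for i, word in enumerate(nouns)}

    def shift(match):
        word = match.group(0)
        key = word.lower()
        if key not in index:
            return word  # not a known noun: leave untouched
        replacement = nouns[(index[key] + n) % len(nouns)]
        # Keep an initial capital if the original word had one.
        return replacement.capitalize() if word[0].isupper() else replacement

    return re.sub(r"[A-Za-z]+", shift, text)


print(n_plus("The scanner reads a book"))  # The character reads a library
```

With a real dictionary and a part-of-speech tagger the substitutions become less predictable; the short list here only keeps the example self-contained.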
Independent Research
week 2
Look into your assigned topic, try, test, and on a wiki page write a recipe/report/tutorial based on your experiments and research. Andre Castro (talk) 20:01, 15 January 2018 (CET)
- Zalan: page segmentation; image detection
- Alex: Web OCR implementation: WebOCR
- Angeliki + Alice: training Tesseract to detect a new font
- Tash: retraining Tesseract to misinterpret text (hiding messages in plain sight - Steganography) here
- Joca: image manipulation to optimize OCR
week 5
Raw data sonification/visualization research: a recipe, a work (with documentation on how you got to the result), a survey of tools, a workflow, or whatever you feel curious about, concerning that topic.
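One possible starting recipe, given as a sketch rather than a prescribed method: wrap any raw bytes in a WAV container so that each byte is heard as one 8-bit sample. It uses only Python's standard `wave` module; the function name `sonify_bytes` and the 8000 Hz sample rate are arbitrary choices for illustration.

```python
import io
import wave


def sonify_bytes(data, sample_rate=8000):
    """Return WAV file bytes in which each input byte is one 8-bit unsigned mono sample."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(1)           # 1 byte per sample = 8-bit unsigned PCM
        wav.setframerate(sample_rate)
        wav.writeframes(data)         # raw data is used as-is, no scaling
    return buffer.getvalue()


# Example: turn a short text into audio bytes.
raw = "Ouvroir de Numérisation Potentielle".encode("utf-8") * 400
audio = sonify_bytes(raw)
print(len(raw), "bytes of data ->", len(audio), "bytes of WAV")
```

Writing `audio` to a file ending in `.wav` gives something any audio player can open; replacing the text with the bytes of an image or a PDF is the same call with different input.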
Links/References/Reading List
Archivist Quill Book Scanner (Base Kit)
- includes 2 Canon PowerShot cameras running the Canon Hack Development Kit (http://chdk.wikia.com/wiki/CHDK)
gPhoto: a free, redistributable, ready-to-use set of digital camera software applications for Unix-like systems, written by a whole team of dedicated volunteers around the world. It supports more than 2300 cameras.
Klijn, Edwin. 2008. ‘The Current State-of-Art in Newspaper Digitization: A Market Perspective’. D-Lib Magazine 14 (1/2). https://doi.org/10.1045/january2008-klijn.
DIY Book Scanner forum: Hardware & building
Planning Overview
weeks 2,3,4,5: January
- @Steve and Delphine: develop the reader = select texts in January, then edit and work on form in February. Documentation on the wiki: steve writing and notation
- @Andre: OCR
- @Michael
- @Aymeric
- @Cristina & Manetta
weeks 6,7,8,9: February
- week 6: delivery of the scanner parts
- weeks 7, 8: @Aymeric + Frederic / WORM: scanner assembly
weeks 10,11,12,13: March
- week 11,12: 2nd and 3rd week of March
- testing scanner in
- 9-17 March: barcode DJ at WORM (http://kubriel.servus.at/index.php?s=barcodedjs), from a Hungarian artists' group that produces barcode DJ sets. Could they be invited to perform with the book scanner?
- 15-16 March: Algoliterary Encounters at VARIA, with a workshop on the 16th (could be a good moment to beta-test the platform / present it in public)
week 13: 4th week of March
- 28 March (wed) - Launch
- 29 March (thu) - Assessment