User:Pedro Sá Couto/Prototyping 3rd

From XPUB & Lens-Based wiki
=Watermarks, downloading, deleting=
https://pad.xpub.nl/p/IFL_2018-05-13
== Watermarking ==
According to the Digital Watermarking Alliance (http://digitalwatermarkingalliance.org/about/quick-facts/):
    Digital watermarking is the process by which identifying data is woven into media content such as images, printed materials, movies, music or TV programming, giving those objects a unique, digital identity that can be used for a variety of valuable applications.
    ...
    Digital watermarking can enable content identification, forensic tracking and copyright communication on a broad scale and can provide a range of solutions for identifying, securing, managing and tracking digital images, audio, video, and printed materials.
   
===JSTOR (Ithaka Harbors, Inc) publishers===
* Download an article from http://jstor.org/ from within the school network
* Search the PDF for watermarks
* Look for leads in https://sourceforge.net/p/pdfedit/mailman/message/27874955/ and https://github.com/kanzure/pdfparanoia/issues?utf8=%E2%9C%93&q=jstor
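The "search the PDF for watermarks" step can be started with a small Node.js script. This is only a minimal sketch under stated assumptions: the patterns in <code>PATTERNS</code> are guesses at what a download stamp might contain (an IP address, a "downloaded from" notice), and the function names are made up for this example. It inflates every stream it can and greps the result, so it will miss text that is split across several PDF text operators or stored with another filter.

```javascript
const zlib = require("zlib");

// Hypothetical patterns: adjust them after looking at a real file.
const PATTERNS = [
  /\b\d{1,3}(?:\.\d{1,3}){3}\b/g, // something shaped like an IPv4 address
  /downloaded from/gi,
  /jstor/gi,
];

// Inflate a stream body if it is Flate-compressed, otherwise keep the raw bytes.
function tryInflate(data) {
  try {
    return zlib.inflateSync(data);
  } catch (err) {
    return data;
  }
}

// Return the (decoded where possible) text of every "stream ... endstream" body.
function extractStreams(pdf) {
  const texts = [];
  let pos = 0;
  while (true) {
    const start = pdf.indexOf("stream", pos);
    if (start === -1) break;
    const end = pdf.indexOf("endstream", start);
    if (end === -1) break;
    // The stream data begins after the end-of-line that follows the keyword.
    let dataStart = start + "stream".length;
    if (pdf[dataStart] === 0x0d) dataStart++; // \r
    if (pdf[dataStart] === 0x0a) dataStart++; // \n
    texts.push(tryInflate(pdf.subarray(dataStart, end)).toString("latin1"));
    pos = end + "endstream".length;
  }
  return texts;
}

// Collect every distinct match, from the raw file and from its decoded streams.
function findWatermarkCandidates(pdf) {
  const texts = [pdf.toString("latin1"), ...extractStreams(pdf)];
  const hits = new Set();
  for (const text of texts) {
    for (const pattern of PATTERNS) {
      for (const match of text.matchAll(pattern)) hits.add(match[0]);
    }
  }
  return [...hits];
}

// Usage (hypothetical file name):
// const fs = require("fs");
// console.log(findWatermarkCandidates(fs.readFileSync("article.pdf")));
```

Comparing the output for two copies of the same article, downloaded from different places, is a quick way to see which strings change per download.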
===Verso Books===
* Create an account at https://www.versobooks.com/
* Download a book as EPUB; a free one will do, for instance https://www.versobooks.com/books/2772-verso-2017-mixtape
* Search the EPUB for watermarks
* For more leads, see the Institute for Biblio-Immunology's First Communique, https://pastebin.com/raw/E1xgCUmb, and https://www.booxtream.com/


=Tesseract, OCR, Book scan=

Revision as of 19:24, 24 May 2019


Bash ocr.png
<script type="text/javascript">

    // Collect every element with class 'ocr_line' in 'lines'.
    var lines = document.querySelectorAll(".ocr_line");

    // Loop through each element in 'lines'.
    for (var i = 0; i < lines.length; i++) {

      var line = lines[i];
      console.log(line.title);

      // The title is expected to look like "bbox x0 y0 x1 y1; ...".
      // Split it on spaces, so parts[1] to parts[4] hold the box corners
      // (parseInt ignores a trailing ";").
      var parts = line.title.split(" ");
      console.log(parts);

      // Width and height are the distances between opposite corners.
      // The names lineLeft/lineTop avoid a global 'var top', which would
      // clash with the read-only window.top and turn the maths into NaN.
      var lineLeft = parseInt(parts[1], 10);
      var lineTop = parseInt(parts[2], 10);
      var width = parseInt(parts[3], 10) - lineLeft;
      var height = parseInt(parts[4], 10) - lineTop;

      // Position the line absolutely on the page and outline it.
      line.style.cssText = "position: absolute; left: " + lineLeft + "px; top: " + lineTop + "px; width: " + width + "px; height: " + height + "px; border: 5px solid lightblue";

      var words = line.querySelectorAll(".ocrx_word");

      for (var e = 0; e < words.length; e++) {

        var span = words[e];
        console.log(span.title);

        // Same "bbox x0 y0 x1 y1" layout as the line, kept in its own variable.
        var wparts = span.title.split(" ");
        console.log(wparts);

        var wleft = parseInt(wparts[1], 10);
        var wtop = parseInt(wparts[2], 10);
        var wwidth = parseInt(wparts[3], 10) - wleft;
        var wheight = parseInt(wparts[4], 10) - wtop;

        // Words are placed relative to their enclosing line, so subtract the line's origin.
        span.style.cssText = "position: absolute; left: " + (wleft - lineLeft) + "px; top: " + (wtop - lineTop) + "px; width: " + wwidth + "px; height: " + wheight + "px; border: 2px solid purple";
      }
    }