User:Pedro Sá Couto/TW/REPUBLISHING FLOW

From XPUB & Lens-Based wiki



Revision as of 04:04, 6 June 2020

STEPS

Republishing is separated into 6 steps:

1. Moving the book from the webserver to a workspace

1.1 Replacing all spaces with underscores

2. Creating the watermark from the gathered form in Tactical Watermarks

2.1 Create the watermark as a PDF with ReportLab
2.2 Convert it to a PNG

3. Append the watermark to the PDF

3.1 Burst the PDF into pages
3.2 Rotate the watermark with PIL
3.3 Overlay the watermark with PIL
3.4 Merge all images into a PDF

4. OCR the PDF if it has not been OCRed already
5. Save the file in a directory open to Library Genesis Staff
6. Delete all the unwanted traces

FLOW

RUN.SH

To start the flow, I run ./run.sh:

sudo chmod 777 *
./movebookfolder.sh
./watermarkformtxt.sh
./appendwatermarktopdf.sh
./republish.sh
./deletetraces.sh
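
As listed, with no `set -e`, the shell script would carry on after a failing step, so the cleanup could still run on a half-processed book. A minimal Python sketch of the same sequence that stops at the first failure follows. The `run_steps` helper is hypothetical and not part of the original scripts; only the script names come from the listing above.

```python
import subprocess

# Script names as listed in run.sh; assumed to be executable and in the current directory.
STEPS = [
    "./movebookfolder.sh",
    "./watermarkformtxt.sh",
    "./appendwatermarktopdf.sh",
    "./republish.sh",
    "./deletetraces.sh",
]


def run_steps(commands):
    """Run each command in order and return the commands that completed.

    subprocess.run(..., check=True) raises CalledProcessError at the first
    command that exits non-zero, so the later steps are not executed.
    """
    done = []
    for cmd in commands:
        subprocess.run(cmd, check=True)
        done.append(cmd)
    return done
```

Calling `run_steps([[step] for step in STEPS])` from the directory that holds the scripts would run the whole flow.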


1. Moving the book from the webserver to a workspace
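
A minimal sketch of this step, including the space-to-underscore renaming from step 1.1. The function name `move_book` and both directory arguments are hypothetical placeholders; the original `movebookfolder.sh` is not shown on this page and may work differently.

```python
import shutil
from pathlib import Path


def move_book(src_dir, work_dir):
    """Move every file from src_dir into work_dir, replacing spaces with underscores.

    Returns the list of new paths. Sub-directories in src_dir are left alone.
    """
    src_dir, work_dir = Path(src_dir), Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for item in sorted(src_dir.iterdir()):
        if not item.is_file():
            continue
        # "my book.pdf" becomes "my_book.pdf" so later shell steps need no quoting
        target = work_dir / item.name.replace(" ", "_")
        shutil.move(str(item), str(target))
        moved.append(target)
    return moved
```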


2. Creating the watermark from the gathered form in Tactical Watermarks


3. Append the watermark to the PDF
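
A minimal sketch of steps 3.2 to 3.4 with PIL, assuming the pages have already been burst into images and the watermark is a PNG with transparency. The function names, the 90 degree angle and the left-edge placement are assumptions for illustration; `appendwatermarktopdf.sh` may position the watermark differently.

```python
from PIL import Image


def overlay_watermark(page, watermark, angle=90, margin=10):
    """Rotate the watermark and paste it along the left edge of the page."""
    rotated = watermark.convert("RGBA").rotate(angle, expand=True)
    out = page.convert("RGBA")
    # Keep `margin` pixels from the left edge and centre the watermark vertically.
    x = margin
    y = max((out.height - rotated.height) // 2, 0)
    # The third argument uses the watermark's alpha channel as the paste mask.
    out.paste(rotated, (x, y), rotated)
    return out.convert("RGB")  # Pillow's PDF writer does not accept RGBA images


def merge_to_pdf(pages, pdf_path):
    """Write a list of RGB page images into one multi-page PDF."""
    first, *rest = pages
    first.save(pdf_path, "PDF", save_all=True, append_images=rest)
```

Each page image would go through `overlay_watermark` before the whole list is passed to `merge_to_pdf`.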


4. OCR the PDF if it has not been OCRed already
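
This page does not say which OCR tool is used. As one possible sketch, assuming the OCRmyPDF command-line tool is installed, its `--skip-text` option leaves pages that already contain text untouched, which covers the "if not OCRed already" condition. The helper names below are hypothetical.

```python
import shutil
import subprocess


def ocr_command(src_pdf, dst_pdf):
    """Build an ocrmypdf call; --skip-text skips pages that already have text."""
    return ["ocrmypdf", "--skip-text", str(src_pdf), str(dst_pdf)]


def ocr_if_available(src_pdf, dst_pdf):
    """Run the OCR command when ocrmypdf is on PATH; return True if it was run."""
    if shutil.which("ocrmypdf") is None:
        return False  # tool not installed: nothing is done
    subprocess.run(ocr_command(src_pdf, dst_pdf), check=True)
    return True
```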


5. Save the file in a directory open to Library Genesis Staff


6. Delete all the unwanted traces
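
A minimal sketch of the cleanup, assuming the traces are the intermediate page images, the form text and the watermark PDF left in the workspace. The patterns and the `delete_traces` name are hypothetical and would need to match what the earlier scripts actually produce.

```python
from pathlib import Path

# Hypothetical patterns for intermediate files left behind by the earlier steps.
TRACE_PATTERNS = ("*.png", "*.txt", "watermark*.pdf")


def delete_traces(work_dir, keep=(), patterns=TRACE_PATTERNS):
    """Delete files in work_dir that match patterns, except names listed in keep.

    Returns the sorted names of the files that were removed.
    """
    removed = []
    for pattern in patterns:
        for path in Path(work_dir).glob(pattern):
            if path.is_file() and path.name not in keep:
                path.unlink()
                removed.append(path.name)
    return sorted(removed)
```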



RESULTS IN EACH STEP

2. Watermark from ReportLab

The watermark is created from the .txt input

3. Overlaying the pages

Burst the PDF into pages

Rotate the watermark with PIL

Overlay the watermark on the cover with PIL