User:Pedro Sá Couto/TW/REPUBLISHING FLOW: Difference between revisions
Revision as of 00:04, 7 June 2020
STEPS
Republishing is divided into six steps:
1. Move the book from the webserver to a work directory
- 1.1 Replace all spaces with underscores
2. Create the watermark from the form gathered in Tactical Watermarks
- 2.1 Create the watermark in pdf with reportlab
- 2.2 Convert to a png
3. Append the watermark to the pdf
- 3.1 Burst the pdf cover
- 3.2 Rotate the watermark with PIL
- 3.3 Overlay the watermark with PIL
- 3.4 OCR the new cover
- 3.5 Resize the OCRed cover to fit the book
- 3.6 Merge the cover and the pdf into one
4. OCR the pdf if not OCRed already
5. Save the file in a directory open to Library Genesis Staff
6. Delete all the unwanted traces
FLOW
RUN.SH
To start the flow, I run ./run.sh, which calls each step's script in order:
sudo chmod 777 *
./movebookfolder.sh
./watermarkformtxt.sh
./appendwatermarktopdf.sh
./republish.sh
./deletetraces.sh
1. Move the book from the webserver to a work directory
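A minimal sketch of this step, assuming the uploaded book is a single file and the work directory is an ordinary folder. The function name `move_book` and both paths are hypothetical and do not come from the original scripts.

```python
from pathlib import Path
import shutil


def move_book(src: Path, work_dir: Path) -> Path:
    """Move `src` into `work_dir`, replacing spaces in the filename with underscores."""
    work_dir.mkdir(parents=True, exist_ok=True)
    # Step 1.1: spaces in filenames are awkward in shell scripts, so swap them out.
    target = work_dir / src.name.replace(" ", "_")
    shutil.move(str(src), str(target))
    return target
```

Returning the new path lets the later steps pick up the renamed file without guessing its name.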
2. Create the watermark from the form gathered in Tactical Watermarks
2.1 Create the watermark in pdf with reportlab
2.2 Convert to a png
3. Append the watermark to the pdf
3.1 Burst the pdf cover
3.2 Rotate the watermark with PIL
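A short sketch of the rotation with PIL. The 90 degree default and the name `rotate_watermark` are assumptions. `expand=True` grows the canvas so the rotated text is not cropped.

```python
from PIL import Image


def rotate_watermark(watermark: Image.Image, angle: float = 90) -> Image.Image:
    """Rotate the watermark counter-clockwise, keeping its transparency."""
    rgba = watermark.convert("RGBA")  # keep an alpha channel for the overlay step
    return rgba.rotate(angle, expand=True)
```

With a 90 degree angle the width and height of the image swap, so a wide strip of text becomes a tall one that can run along the edge of the cover.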
3.3 Overlay the watermark with PIL
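The overlay could be done with PIL's alpha compositing, as sketched here. The position and the name `overlay_watermark` are hypothetical. The watermark is assumed to be a png with a transparent background.

```python
from PIL import Image


def overlay_watermark(cover, watermark, position=(0, 0)):
    """Composite the watermark onto the cover at `position` (top-left corner)."""
    base = cover.convert("RGBA")
    # Paste the watermark onto a transparent layer the same size as the cover,
    # so alpha_composite receives two images of equal size.
    layer = Image.new("RGBA", base.size, (0, 0, 0, 0))
    layer.paste(watermark.convert("RGBA"), position)
    return Image.alpha_composite(base, layer).convert("RGB")
```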
3.4 OCR the new cover
3.5 Resize the OCRed cover to fit the book
3.6 Merge the cover and the pdf into one
4. OCR the pdf if not OCRed already
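The source does not name the OCR tool. As an illustration only, the sketch below assumes the `ocrmypdf` command-line tool is installed. Its `--skip-text` option leaves pages that already carry a text layer untouched, which matches the "if not OCRed already" condition. Both function names are hypothetical.

```python
import subprocess


def build_ocr_command(src_pdf, dst_pdf):
    """Return the ocrmypdf call that only OCRs pages without existing text."""
    return ["ocrmypdf", "--skip-text", str(src_pdf), str(dst_pdf)]


def ocr_if_needed(src_pdf, dst_pdf):
    """Run the OCR command; raises CalledProcessError if ocrmypdf fails."""
    subprocess.run(build_ocr_command(src_pdf, dst_pdf), check=True)
```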
5. Save the file in a directory open to Library Genesis Staff
6. Delete all the unwanted traces
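What counts as an unwanted trace is not spelled out in the source. The sketch below assumes it means the intermediate files left in the work directory (burst pages, the watermark pdf and png, the temporary cover). `delete_traces` and its `keep` parameter are hypothetical.

```python
from pathlib import Path
import shutil


def delete_traces(work_dir: Path, keep=()):
    """Remove everything in `work_dir` except the names listed in `keep`."""
    keep_names = set(keep)
    removed = []
    for entry in sorted(work_dir.iterdir()):
        if entry.name in keep_names:
            continue
        if entry.is_dir():
            shutil.rmtree(entry)  # intermediate folders, e.g. burst pages
        else:
            entry.unlink()
        removed.append(entry.name)
    return removed
```

Returning the list of removed names makes it easy to log what was cleaned up.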
RESULTS IN EACH STEP
2. Watermark from ReportLab
The watermark is created from the .txt input
3. Overlaying the pages