Wiki to print
Publishing Workflows
Alessandro Ludovico: Post-Digital Print: The Mutation of Publishing Since 1894 (2012)
From Print to Ebooks: a Hybrid Publishing Toolkit for the Arts
https://github.com/DigitalPublishingToolkit/NN09
https://github.com/DigitalPublishingToolkit/Hybrid-Publishing-Resources
Wiki To Print Workflow
Required software
- mwclient - Python library
- pandoc
- WeasyPrint
Download scripts
git config --global http.sslVerify false
git clone https://git.xpub.nl/repos/wiki_publishing.git
mwclient
Python library to interface with the MediaWiki API.
https://github.com/mwclient/mwclient
Use: to download content from wiki pages through the wiki-download.py script. To see its options:
./wiki-download.py -h
More on Wiki_publishing
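As a rough illustration of what a download script might do with mwclient (a sketch, not the actual wiki-download.py), the following fetches the wikitext of one page and saves it to a file. The host, API path and page title are placeholders, and page_filename is a hypothetical helper introduced only for this example.

```python
import re
from pathlib import Path


def page_filename(title, ext=".wiki"):
    """Turn a wiki page title into a filesystem-friendly name (hypothetical helper)."""
    # Collapse every run of characters other than letters, digits, "_", "." or "-" into "_"
    safe = re.sub(r"[^\w.-]+", "_", title.strip())
    return safe.strip("_") + ext


def download_page(host, title, path="/w/", outdir="."):
    """Fetch the current wikitext of one page and write it into outdir."""
    import mwclient  # third-party library: pip install mwclient

    site = mwclient.Site(host, path=path)  # path is the wiki's script path (assumed "/w/")
    text = site.pages[title].text()        # wikitext of the latest revision
    target = Path(outdir) / page_filename(title)
    target.write_text(text, encoding="utf-8")
    return target
```

A call such as download_page("wiki.example.org", "Some page") would then write Some_page.wiki to the current directory, assuming the wiki is reachable and readable without login.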
Pandoc
A universal document converter - converts from one markup language into another
Use: to convert downloaded wiki pages into HTML files
Extensive documentation is available in Pandoc’s Manual or via man pandoc
pandoc example1: convert HTML string to markdown
echo "<h1>Hello Pandoc</h1><p>from html to markdown</p>" | pandoc -f html -t markdown
pandoc example2: mediawiki file to HTML
Pandoc common arguments
-f - option standing for “from”, is followed by the input format;
-t - option standing for “to”, is followed by the output format;
-s - option standing for “standalone”, produces output with an appropriate header and footer;
-o - option for file output;
mediawiki - placeholder for the MediaWiki input filename; replace it with the actual name
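The arguments above can be combined in a small Python wrapper around the pandoc command. This is a minimal sketch that assumes pandoc is installed and on the PATH. The function names and the file names page.wiki and page.html are placeholders, not part of the wiki_publishing scripts.

```python
import subprocess


def pandoc_command(source, target):
    """Build the argument list for a MediaWiki-to-HTML conversion."""
    return [
        "pandoc",
        "-f", "mediawiki",  # input format
        "-t", "html",       # output format
        "-s",               # standalone: full document with header and footer
        "-o", target,       # output file
        source,             # input file
    ]


def convert(source, target):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(source, target), check=True)
```

Calling convert("page.wiki", "page.html") runs the same command you would type in the terminal. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.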
WeasyPrint
A visual rendering engine for HTML and CSS that can export to PDF. It aims to support web standards for printing. The CSS layout engine is written in Python, designed for pagination, and meant to be easy to hack on.
https://weasyprint.org/, WeasyPrint documentation
Use: to convert HTML + CSS into a PDF
The standalone weasyprint command can produce a PDF with a single instruction:
weasyprint input.html -s style.css output.pdf
Where:
- input.html - is the source HTML file (could also be a URL)
- -s - is the option for a CSS stylesheet
- output.pdf - is the resulting PDF
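The same conversion can also be done from Python through WeasyPrint's HTML and CSS classes. The sketch below assumes the weasyprint package is installed. The function names source_kwargs and html_to_pdf are hypothetical, chosen for this example only.

```python
from urllib.parse import urlparse


def source_kwargs(source):
    """Pick the HTML() keyword: url= for web addresses, filename= for local files."""
    if urlparse(source).scheme in ("http", "https"):
        return {"url": source}
    return {"filename": source}


def html_to_pdf(source, stylesheet, target):
    """Render an HTML file or URL to a PDF, applying one extra CSS stylesheet."""
    from weasyprint import CSS, HTML  # third-party library: pip install weasyprint

    HTML(**source_kwargs(source)).write_pdf(
        target,
        stylesheets=[CSS(filename=stylesheet)],
    )
```

html_to_pdf("input.html", "style.css", "output.pdf") mirrors the command line shown above. Working from Python is mostly useful when the PDF step is part of a longer script.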
@page
The @page CSS rule, which sets the page size and orientation, is rendered correctly in the PDF.
@page {
  size: A5 portrait;
}
@page:left and @page:right
Options for left and right pages, such as margin sizes, which have to alternate in order to produce a bound work, are rendered correctly.
@page:right {
  margin-left: 3cm; /* inner margin */
  margin-right: 1cm; /* outer margin */
}
@page:left {
  margin-right: 3cm; /* inner margin */
  margin-left: 1cm; /* outer margin */
}
@bottom
WeasyPrint also applies @bottom rules consistently, including page counters.
@page {
  /* margin boxes must be nested inside an @page rule */
  @bottom-right {
    margin: 10pt 0 30pt 0;
    border-top: .25pt solid #FF05F6;
    content: "Testing WeasyPrint";
    font-size: 6pt;
    color: #00FFF2;
  }
  @bottom-center {
    margin: 10pt 0 30pt 0;
    content: counter(page);
    font-size: 6pt;
  }
}
CSS Custom Fonts
Older WeasyPrint releases did not support the CSS @font-face rule; current releases do.
Either way, WeasyPrint can use the fonts installed on your system.
On Linux, fc-list gives you a list of the fonts installed on your system.
Page breaks
To control page breaks in the PDF, use the CSS properties page-break-before and page-break-after (page-break-inside accepts only avoid).
With the values:
- always
- avoid
- left
- right
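To see the effect of a page break value, one option is to render a small HTML string with WeasyPrint and count the resulting pages. The sketch below assumes the values listed above are used with the standard CSS properties page-break-before and page-break-after, and that the weasyprint package is installed. Both function names are hypothetical.

```python
ALLOWED_VALUES = {"always", "avoid", "left", "right"}


def page_break_rule(selector, where="before", value="always"):
    """Return a one-line CSS rule that sets page-break-before or page-break-after."""
    if where not in ("before", "after"):
        raise ValueError("where must be 'before' or 'after'")
    if value not in ALLOWED_VALUES:
        raise ValueError(f"unsupported value: {value}")
    return f"{selector} {{ page-break-{where}: {value}; }}"


def count_pages(html, css):
    """Lay out an HTML string with the given CSS and return the number of pages."""
    from weasyprint import CSS, HTML  # third-party library: pip install weasyprint

    document = HTML(string=html).render(stylesheets=[CSS(string=css)])
    return len(document.pages)
```

For example, count_pages("<h1>One</h1><h1>Two</h1>", page_break_rule("h1")) lets you compare the page count with and without the rule.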
WeasyPrint examples
- stylesheet for the RadiatedBook