Wiki to print

From XPUB & Lens-Based wiki

Wiki To Print Workflow

required software

download scripts

git -c http.sslVerify=false clone https://git.xpub.nl/repos/wiki_publishing.git

mwclient

Python library to interface with the MediaWiki API.

https://github.com/mwclient/mwclient

Use: to download content from wiki pages with the wiki-download.py script. Run ./wiki-download.py -h to see its options.
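
As a rough idea of what a download script built on mwclient might do, the sketch below fetches the wikitext of a few pages and saves each one to a .mediawiki file. It is not the wiki-download.py script itself: the host wiki.example.org, the path /w/, and the helper names connect, safe_filename and download_pages are all assumptions made for illustration.

```python
from pathlib import Path


def connect(host, path="/w/"):
    """Open a connection to a MediaWiki site (host and path are placeholders)."""
    import mwclient  # imported lazily so the helpers below work without it
    return mwclient.Site(host, path=path)


def safe_filename(title):
    """Turn a page title into a file name, e.g. 'My Page' -> 'My_Page.mediawiki'."""
    cleaned = title.strip().replace(" ", "_").replace("/", "-")
    return cleaned + ".mediawiki"


def download_pages(site, titles, out_dir="."):
    """Save the wikitext of each titled page into out_dir; return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for title in titles:
        text = site.pages[title].text()  # mwclient: current wikitext of the page
        target = out / safe_filename(title)
        target.write_text(text, encoding="utf-8")
        written.append(target)
    return written
```

For instance, download_pages(connect("wiki.example.org"), ["Wiki to print"], "pages") would write pages/Wiki_to_print.mediawiki, assuming such a page exists on that (hypothetical) wiki.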

More on Wiki_publishing

Pandoc

[Figure: Pandoc diagram.jpg]

A universal document converter: it converts from one markup language into another.

https://pandoc.org/

Use: to convert downloaded wiki pages into HTML files

Extensive documentation is available in Pandoc’s Manual or via man pandoc.


pandoc example1: convert HTML string to markdown

echo "<h1>Hello Pandoc</h1><p>from html to markdown</p>" | pandoc -f html -t markdown

pandoc example2: mediawiki file to HTML

pandoc -f mediawiki -t html -s mediawiki -o output.html

Pandoc common arguments

-f - option standing for “from”, is followed by the input format;

-t - option standing for “to”, is followed by the output format;

-s - option standing for “standalone”, produces output with an appropriate header and footer;

-o - option for file output;

mediawiki - the MediaWiki input filename; replace it with the actual name of your file
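
When many downloaded pages need converting, the same options can be assembled in a script. The sketch below builds the pandoc argument list for a MediaWiki-to-HTML conversion and runs it with subprocess. The function names and the file names page.mediawiki and page.html are hypothetical, and pandoc is assumed to be installed and on the PATH.

```python
import subprocess


def pandoc_command(input_path, output_path):
    """Build the argument list for converting a MediaWiki file to standalone HTML."""
    return [
        "pandoc",
        "-f", "mediawiki",   # from: input format
        "-t", "html",        # to: output format
        "-s",                # standalone: full document with header and footer
        input_path,          # the downloaded wiki page
        "-o", output_path,   # write the result to this file
    ]


def convert(input_path, output_path):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(input_path, output_path), check=True)
```

Calling convert("page.mediawiki", "page.html") would then run the equivalent of the command-line form with those two file names.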



WeasyPrint

A visual rendering engine for HTML and CSS that can export to PDF. It aims to support web standards for printing. The CSS layout engine is written in Python, designed for pagination, and meant to be easy to hack on.

https://weasyprint.org/, WeasyPrint documentation

Use: to convert HTML + CSS into a PDF


The standalone weasyprint command can produce a PDF with a single instruction:

weasyprint input.html -s style.css output.pdf


Where:

  • input.html - is the source HTML file (could also be a URL)
  • -s - is the option for a CSS stylesheet
  • output.pdf - is the resulting PDF
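
WeasyPrint can also be called from Python rather than from the command line. A minimal sketch, assuming the weasyprint package is installed and that the HTML and CSS files exist; the helper names pdf_path_for and render_pdf are made up for this example.

```python
from pathlib import Path


def pdf_path_for(html_path):
    """Derive the output name by swapping the extension: book.html -> book.pdf."""
    return str(Path(html_path).with_suffix(".pdf"))


def render_pdf(html_path, css_path, pdf_path=None):
    """Render an HTML file to PDF with one extra stylesheet; return the PDF path."""
    from weasyprint import HTML, CSS  # imported here so pdf_path_for works without it
    pdf_path = pdf_path or pdf_path_for(html_path)
    HTML(filename=html_path).write_pdf(pdf_path, stylesheets=[CSS(filename=css_path)])
    return pdf_path
```

With these helpers, render_pdf("input.html", "style.css") would write input.pdf next to the source file.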


@page

The @page CSS rule, which sets the page size and orientation, is rendered correctly in the PDF.

@page {
size: A5 portrait;
}

@page:left and @page:right

Options for left and right pages, such as margin sizes that must alternate in order to produce a bound work, are rendered correctly.

@page:right {
  margin-left: 3cm; /*inner margin*/
  margin-right:1cm; /*outer margin*/ 
}

@page:left {
  margin-right: 3cm; /*inner margin*/
  margin-left:1cm; /*outer margin*/
}
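
To see what these margins leave for the text, a small worked calculation helps. The sketch below assumes the A5 portrait page from the earlier @page rule (A5 measures 148 × 210 mm) together with the 3 cm inner and 1 cm outer margins above; the function name is illustrative.

```python
A5_WIDTH_MM = 148  # ISO 216: A5 is 148 mm x 210 mm


def text_block_width_mm(page_width_mm, inner_mm, outer_mm):
    """Width left for content once both side margins are subtracted."""
    return page_width_mm - inner_mm - outer_mm


# 3 cm inner margin + 1 cm outer margin, as in the rules above
width = text_block_width_mm(A5_WIDTH_MM, inner_mm=30, outer_mm=10)
print(width)  # 108
```

The width is the same on left and right pages; only the position of the text block changes, shifting away from the spine to leave room for the binding.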

[Figure: Weasyprint-margins.png]

@bottom

WeasyPrint also applies @bottom margin-box rules consistently, including page counters. These rules must be nested inside an @page rule.

@page {
  @bottom-right {
    margin: 10pt 0 30pt 0;
    border-top: .25pt solid #FF05F6;
    content: "Testing WeasyPrint";
    font-size: 6pt;
    color: #00FFF2;
  }

  @bottom-center {
    margin: 10pt 0 30pt 0;
    content: counter(page);
    font-size: 6pt;
  }
}


CSS Custom Fonts

Early versions of WeasyPrint did not support the CSS @font-face rule; current versions do.

WeasyPrint can also use the fonts available on your system.

On Linux, fc-list lists the fonts installed on your system.
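
To find the exact family name to use in a font-family declaration, the fc-list output can be filtered. A minimal sketch, assuming fontconfig's fc-list is on the PATH; the function names are hypothetical, and the form fc-list : family asks fontconfig to print only the family names.

```python
import subprocess


def parse_families(fc_list_output):
    """Return sorted, de-duplicated family names from `fc-list : family` output."""
    families = set()
    for line in fc_list_output.splitlines():
        # a font may report several comma-separated family names on one line
        for name in line.split(","):
            name = name.strip()
            if name:
                families.add(name)
    return sorted(families)


def installed_families():
    """Ask fontconfig for the installed font families."""
    result = subprocess.run(
        ["fc-list", ":", "family"], capture_output=True, text=True, check=True
    )
    return parse_families(result.stdout)
```

Any name returned by installed_families() should be usable as a font-family value in the stylesheet passed to WeasyPrint.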

Page breaks

To control page breaks in the PDF, use the CSS properties:

  • page-break-before
  • page-break-after
  • page-break-inside (accepts only auto and avoid)

With the values:

  • always
  • avoid
  • left
  • right
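
To try these values out, it helps to have a document long enough to break across several pages. The sketch below generates a small test HTML string in which every chapter starts on a right-hand page. The page-break-before and page-break-after properties are standard CSS, while the function name, the class name chapter and the chapter titles are invented for the example.

```python
def build_test_document(chapter_titles, paragraphs_per_chapter=3):
    """Return an HTML string whose chapters each begin on a right-hand page."""
    css = (
        "@page { size: A5 portrait; }\n"
        ".chapter { page-break-before: right; }\n"  # start each chapter on a right page
        "h1 { page-break-after: avoid; }\n"          # keep a heading with the text after it
    )
    sections = []
    for title in chapter_titles:
        paragraphs = "".join(
            f"<p>Paragraph {n} of {title}.</p>"
            for n in range(1, paragraphs_per_chapter + 1)
        )
        sections.append(
            f'<section class="chapter"><h1>{title}</h1>{paragraphs}</section>'
        )
    content = "".join(sections)
    return (
        '<!DOCTYPE html><html><head><meta charset="utf-8">'
        f"<style>\n{css}</style></head><body>{content}</body></html>"
    )
```

Writing the returned string to a file and rendering it with weasyprint should show blank left-hand pages inserted wherever a chapter would otherwise have started on one.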

Weasyprint examples