Wiki to print

From XPUB & Lens-Based wiki
Revision as of 15:03, 14 October 2018 by Andre Castro

Wiki To Print Workflow

Required software

mwclient

Python library to interface with the MediaWiki API.

https://github.com/mwclient/mwclient

Use: to download content from wiki pages with the wiki-download.py script. Run ./wiki-download.py -h to list its options.
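As a rough illustration of what a download script can do with mwclient, the sketch below fetches the wikitext of one page and saves it to a file. It is not the actual wiki-download.py, whose code is not shown on this page. The host wiki.example.org, the path /w/, the page title, the .wiki file extension and the helper names connect, fetch_page_text and save_page are all assumptions made for the example.

```python
from pathlib import Path


def connect(host, path="/w/"):
    """Open a connection to a MediaWiki site (host and path are placeholders)."""
    import mwclient  # imported lazily so the helpers below work without it
    return mwclient.Site(host, path=path)


def fetch_page_text(site, title):
    """Return the raw wikitext of the page called `title`."""
    return site.pages[title].text()


def save_page(site, title, out_dir="."):
    """Write the wikitext of `title` to <out_dir>/<title>.wiki and return the path."""
    # Replace characters that are awkward in filenames (assumed naming scheme).
    safe_name = title.replace(" ", "_").replace("/", "_")
    out_path = Path(out_dir) / f"{safe_name}.wiki"
    out_path.write_text(fetch_page_text(site, title), encoding="utf-8")
    return out_path


# Example use (hypothetical host and page title):
# site = connect("wiki.example.org", path="/w/")
# save_page(site, "Wiki to print")
```

Keeping the fetch step separate from the connection makes it easy to try the saving logic against a stand-in object before pointing it at a real wiki.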

Wiki_publishing

Pandoc

[Figure: Pandoc diagram]

A universal document converter: it converts from one markup language into another

https://pandoc.org/

Use: to convert downloaded wiki pages into HTML files

Extensive documentation is available in Pandoc’s Manual or via man pandoc


pandoc example1: convert HTML string to markdown

echo "<h1>Hello Pandoc</h1><p>from html to markdown</p>" | pandoc -f html -t markdown

pandoc example2: mediawiki file to HTML

Pandoc common arguments

-f - option standing for “from”, is followed by the input format;

-t - option standing for “to”, is followed by the output format;

-s - option standing for “standalone”, produces output with an appropriate header and footer;

-o - option standing for “output”, is followed by the output filename;

mediawiki - placeholder for the MediaWiki input filename; replace it with the actual name of your file
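The command for example 2 does not appear on this page. One invocation consistent with the arguments listed above can be sketched in Python as follows. The filenames page.wiki and page.html and the helper names pandoc_command and convert are placeholders, and the exact flags of the original example may have differed.

```python
import shutil
import subprocess


def pandoc_command(input_file, output_file, from_format="mediawiki", to_format="html"):
    """Build the pandoc argument list: -f <from> -t <to> -s <input> -o <output>."""
    return ["pandoc", "-f", from_format, "-t", to_format,
            "-s", input_file, "-o", output_file]


def convert(input_file, output_file):
    """Run pandoc on input_file; raises CalledProcessError if pandoc fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_command(input_file, output_file), check=True)


# Show the equivalent shell command for the placeholder filenames.
print(" ".join(pandoc_command("page.wiki", "page.html")))
# prints: pandoc -f mediawiki -t html -s page.wiki -o page.html
```

Building the argument list separately from running it makes the command easy to inspect, and passing a list to subprocess.run avoids shell quoting problems with filenames that contain spaces.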



WeasyPrint

A visual rendering engine for HTML and CSS that can export to PDF. It aims to support web standards for printing. The CSS layout engine is written in Python, designed for pagination, and meant to be easy to hack on.

https://weasyprint.org/, WeasyPrint documentation

Use: to convert HTML + CSS into a PDF
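A minimal sketch of this last step, using WeasyPrint's Python API, might look like the following. The filenames, the A4 and 2cm defaults, and the helper names page_css and html_to_pdf are assumptions for illustration. The only WeasyPrint calls used are HTML(filename=...), CSS(string=...) and write_pdf(..., stylesheets=[...]).

```python
def page_css(size="A4", margin="2cm"):
    """Return a minimal print stylesheet that sets the paper size and margins."""
    return f"@page {{ size: {size}; margin: {margin}; }}"


def html_to_pdf(html_file, pdf_file, css_text=None):
    """Render html_file to pdf_file, applying css_text as an extra stylesheet."""
    from weasyprint import CSS, HTML  # imported lazily; requires WeasyPrint
    # Fall back to the default A4 page rule when no stylesheet text is given.
    stylesheet = CSS(string=css_text or page_css())
    HTML(filename=html_file).write_pdf(pdf_file, stylesheets=[stylesheet])


# Example use (placeholder filenames):
# html_to_pdf("page.html", "page.pdf", page_css(size="A5", margin="15mm"))
```

The @page rule is where paged-media CSS sets the paper size and margins, so changing the layout for print usually starts there rather than in the HTML.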