EPubChapters

From Media Design: Networked & Lens-Based wiki

Turning a couple of HTML files into an ePub

Prerequisites

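  • Run this script from a directory that contains a folder called chapters holding your HTML files. Only files ending in .html are added to the book, in alphabetical order of file name
  • The files you want to add must be valid XHTML
  • Use the attached skeleton, which does not contain any HTML files. The script expects it in a folder called epub-raw-files next to the chapters folder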
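
The XHTML requirement can be partly checked before building. The following is a minimal sketch, not part of the original script: it only tests whether a chapter parses as well-formed XML, which is weaker than full XHTML validity. The function name is_well_formed is an assumption made for illustration. Note that files using named entities such as &nbsp; may be rejected by this parser even when a full XHTML validator would accept them.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xhtml):
    """Return True if the given text or bytes parse as XML, False otherwise."""
    try:
        ET.fromstring(xhtml)
        return True
    except ET.ParseError:
        return False

# Two tiny in-memory examples: one well-formed, one with an unclosed <p>
good = ('<html xmlns="http://www.w3.org/1999/xhtml">'
        '<head><title>t</title></head><body><p>Hi</p></body></html>')
bad = "<html><body><p>Unclosed paragraph</body></html>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

To check real files, read each one in binary mode, for example open("chapters/" + name, "rb").read(), and pass the result to is_well_formed.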

Code

#!/usr/bin/env python3
import os, shutil

# Remove previous epub files
if os.path.isdir("/tmp/epub"):
    shutil.rmtree("/tmp/epub")

# Copy ePub skeleton
shutil.copytree("epub-raw-files", "/tmp/epub")

# Copy chapters
for chapter in os.listdir("chapters"):
    shutil.copyfile("chapters/" + chapter, "/tmp/epub/OEBPS/" + chapter)

###
# Create item list
manifest = ""
spine = ""
nav = ""
playorder = 1
for filename in sorted(os.listdir("/tmp/epub/OEBPS/")):
    # Only .html files become chapters; the name without ".html" is used as id and label
    if filename.endswith(".html"):
        name = filename[:-5]
        manifest += '<item id="' + name + '" href="' + filename + '" media-type="application/xhtml+xml"/>'
        spine += '<itemref idref="' + name + '"/>'
        nav += '<navPoint id="' + name + '" playOrder="' + str(playorder) + '"><navLabel><text>' + name + '</text></navLabel><content src="' + filename + '"/></navPoint>'
        playorder += 1

# Add chapters to catalogue
opf = open("/tmp/epub/OEBPS/content.opf", "w")

content = """<?xml version='1.0' encoding='utf-8'?>
<package xmlns="http://www.idpf.org/2007/opf" 
            xmlns:dc="http://purl.org/dc/elements/1.1/" 
            unique-identifier="bookid" version="2.0">
  <metadata>
    <dc:title>Book of Chapters</dc:title>
    <dc:creator>Anonymous</dc:creator>
    <dc:identifier id="bookid">urn:uuid:12345</dc:identifier>
    <dc:language>en-US</dc:language>
    <meta name="cover" content="cover-image" /> 
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="cover" href="title.html" media-type="application/xhtml+xml"/>
    <item id="cover-image" href="images/cover.png" media-type="image/png"/>
    <item id="css" href="stylesheet.css" media-type="text/css"/>
""" + manifest + """
  </manifest>
  <spine toc="ncx">
    <itemref idref="cover" linear="no"/>
""" + spine + """
  </spine>
  <guide>
    <reference href="title.html" type="cover" title="Cover"/>
  </guide>
</package>"""

opf.write(content)
opf.close()

###
# Write the TOC of our book

ncx = open("/tmp/epub/OEBPS/toc.ncx", "w")

content = """<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE ncx PUBLIC "-//NISO//DTD ncx 2005-1//EN" 
                 "http://www.daisy.org/z3986/2005/ncx-2005-1.dtd">
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <head>
    <meta name="dtb:uid" content="urn:uuid:12345"/>
    <meta name="dtb:depth" content="1"/>
    <meta name="dtb:totalPageCount" content="0"/>
    <meta name="dtb:maxPageNumber" content="0"/>
  </head>
  <docTitle>
    <text>Book of Chapters</text>
  </docTitle>
  <navMap>
""" + nav + """
  </navMap>
</ncx>"""

ncx.write(content)
ncx.close()
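
The script uses each file name directly as the XML id and as the label. File names that start with a digit or contain characters such as & would then produce invalid ids or broken XML. The following sketch shows one way to guard against that. The helpers make_id and nav_point, and the "ch-" prefix, are assumptions made for illustration and are not part of the script above.

```python
import re
from xml.sax.saxutils import escape, quoteattr

def stem_of(filename):
    """Strip a trailing .html, mirroring what the script does with [:-5]."""
    return filename[:-5] if filename.endswith(".html") else filename

def make_id(filename):
    """Derive an XML-safe id from a chapter file name (hypothetical helper)."""
    safe = re.sub(r"[^A-Za-z0-9_.-]", "-", stem_of(filename))
    # XML ids may not start with a digit, so always add a letter prefix
    return "ch-" + safe

def nav_point(filename, playorder):
    """Build one navPoint element with an escaped label and quoted attributes."""
    return (
        "<navPoint id=" + quoteattr(make_id(filename))
        + ' playOrder="' + str(playorder) + '">'
        + "<navLabel><text>" + escape(stem_of(filename)) + "</text></navLabel>"
        + "<content src=" + quoteattr(filename) + "/></navPoint>"
    )

print(nav_point("1-salt&pepper.html", 1))
```

If this approach is used, the same make_id value has to go into the manifest item id and the spine idref, since the spine refers to manifest entries by id.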

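Finally, package the folder as an ePub. The mimetype file must be the first entry in the archive and stored uncompressed (-0), so it is added on its own before everything else:

cd /tmp/epub
zip -0X img.epub mimetype
zip -Xr9D img.epub * -x mimetype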
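
The same packaging step can also be done from Python instead of the zip command. This is a sketch under assumptions, using the standard zipfile module. The function name build_epub and its parameters are invented for illustration. It writes mimetype first and uncompressed, then adds all other files with compression and without directory entries.

```python
import os
import zipfile

def build_epub(src_dir, out_path):
    """Zip src_dir into out_path with mimetype as first, uncompressed entry."""
    out_abs = os.path.abspath(out_path)
    with zipfile.ZipFile(out_path, "w") as epub:
        # Counterpart of `zip -0X`: mimetype goes first and is only stored
        epub.write(os.path.join(src_dir, "mimetype"), "mimetype",
                   compress_type=zipfile.ZIP_STORED)
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()  # keep the archive order stable between runs
            for name in sorted(files):
                path = os.path.join(root, name)
                arcname = os.path.relpath(path, src_dir)
                # Skip mimetype (already added) and the archive itself
                if arcname == "mimetype" or os.path.abspath(path) == out_abs:
                    continue
                epub.write(path, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

For the folder built by the script above, a call such as build_epub("/tmp/epub", "/tmp/img.epub") would write the archive next to the working folder rather than inside it.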


Attachments