Anatomy of an ePUB
Latest revision as of 19:22, 16 November 2021
What is an ePUB?
"EPUB (electronic publication; also sometimes ePub, EPub, or epub) is a free and open e-book standard, by the International Digital Publishing Forum (IDPF). Files have the extension .epub. EPUB is designed for reflowable content, meaning that the text display can be optimized for the particular display device. The format is meant to function as a single format that publishers and conversion houses can use in-house, as well as for distribution and sale." (source: wikipedia)
Technically, an EPUB is just a ZIP archive containing a bunch of XHTML and XML files, so anyone can make one with a wide range of tools, including the most basic text editors. This simplicity, built on top of open standards, is what makes the format attractive to the publishing industry.
Try some epubs
- download http://www.gutenberg.org/ebooks/11.epub
- Test on the various readers
- unzip the epub - what's inside?
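Because an EPUB is a ZIP archive, the "what's inside?" step can also be done from a script. The following is a minimal sketch using Python's standard zipfile module; the function name list_epub_entries and the tiny in-memory demo archive (with hypothetical file names) are assumptions made for illustration, not part of any particular book.

```python
import io
import zipfile


def list_epub_entries(epub_file):
    """Return the entry names of an EPUB (path or file-like object), in archive order."""
    with zipfile.ZipFile(epub_file) as archive:
        return archive.namelist()


# Tiny in-memory stand-in for a downloaded book (hypothetical file names).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr("META-INF/container.xml", "<container/>")
    archive.writestr("OEBPS/content.opf", "<package/>")

for name in list_epub_entries(demo):
    print(name)
```

To inspect a real book, pass the path of the downloaded file instead of the demo object, for example list_epub_entries("11.epub") if the file was saved under that name.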
Inside a simple ePUB
- mimetype
- META-INF folder
  - container.xml: tells the reader software where in the zip file to find the book.
- OEBPS folder: the book's content (the folder name can change)
  - images folder: images (PNG) go here (the folder name can change)
  - content.opf: lists what's in the zip file
  - toc.ncx: table of contents
  - xhtml files: the book's contents are in these
  - CSS files
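To illustrate how container.xml tells the reader where the book is, here is a small sketch that reads the full-path attribute of the rootfile element. The sample document is a typical container.xml written for this example; the OEBPS/content.opf value and the function name find_package_path are assumptions matching the layout listed above.

```python
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

# Illustrative container.xml; the full-path value is an assumption.
SAMPLE_CONTAINER = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""


def find_package_path(container_xml):
    """Return the path of the package (.opf) file declared in container.xml."""
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    return rootfile.get("full-path")


print(find_package_path(SAMPLE_CONTAINER))
```

This is why the OEBPS folder can be renamed: the reader does not rely on the folder name, only on the path declared in container.xml.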
Practical ePub dissection: table of contents
The last file we need to understand for hacking ePubs is the TOC (table of contents). It is usually called toc.ncx and located in the OEBPS folder of the ePub container. So far we have only covered ePubs with a single HTML file for the content, which is why the TOC had only one item/link.
The three important parts of the toc.ncx file are:
- <head>: make sure you use the same ebook uid as the one you declared in content.opf
- <docTitle>: make sure you use the same or similar title as your ebook title. This one will be displayed as the book title in your TOC.
- <navMap>: this is where you describe the chapters of your book. The navMap is made of navPoints; each navPoint tag represents a chapter and where it is located in the container. For example:
<navPoint id="navpoint-1" playOrder="1">
  <navLabel>
    <text>Book cover</text>
  </navLabel>
  <content src="title.html"/>
</navPoint>
- id: can in theory be anything you want, but to make it easier to remember and for the sake of compatibility, use the same id names as those defined in content.opf.
- playOrder: the display order of this item in the TOC. Must be an integer, and the values must be continuous, i.e. 1, 2, 3, 4.
- <text>: the title of the chapter as it will be displayed in the TOC
- <content>: needs to point to a valid declared HTML content file in your container.
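The navPoint rules above can be checked with a short script. This is a sketch only: the sample toc.ncx, the uid value, the second chapter and the helper names read_nav_points and play_order_is_continuous are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of NCX documents, in ElementTree's {uri}tag notation.
NCX = "{http://www.daisy.org/z3986/2005/ncx/}"

# Illustrative toc.ncx with two chapters (uid, titles and file names are assumptions).
SAMPLE_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <head><meta name="dtb:uid" content="book-id-123"/></head>
  <docTitle><text>Sample Book</text></docTitle>
  <navMap>
    <navPoint id="navpoint-1" playOrder="1">
      <navLabel><text>Book cover</text></navLabel>
      <content src="title.html"/>
    </navPoint>
    <navPoint id="navpoint-2" playOrder="2">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="chapter1.html"/>
    </navPoint>
  </navMap>
</ncx>"""


def read_nav_points(ncx_xml):
    """Return (playOrder, label, src) for every navPoint, in document order."""
    root = ET.fromstring(ncx_xml)
    points = []
    for nav_point in root.iter(NCX + "navPoint"):
        label = nav_point.find(NCX + "navLabel/" + NCX + "text").text
        src = nav_point.find(NCX + "content").get("src")
        points.append((int(nav_point.get("playOrder")), label, src))
    return points


def play_order_is_continuous(points):
    """True when the playOrder values are exactly 1, 2, 3, ... in order."""
    orders = [order for order, _label, _src in points]
    return orders == list(range(1, len(orders) + 1))


points = read_nav_points(SAMPLE_NCX)
for order, label, src in points:
    print(order, label, src)
print("continuous:", play_order_is_continuous(points))
```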
Practical ePub dissection: font embedding
Font embedding lets an ebook designer provide and use their own set of fonts. Not all readers fully support this feature yet, but a reader that lacks support simply falls back to its own fonts, so there is no reason not to start using it!
- Example:
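@font-face {
  font-family: "Linux Libertine";
  font-style: normal;
  font-weight: normal;
  src: url(LinLibertine_Re.ttf);
}
- Practical implementation: http://www.pigsgourdsandwikis.com/2011/04/embedding-fonts-in-epub-ipad-iphone-and.html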
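A font declared with @font-face only works if the referenced file is actually packaged in the container. The sketch below extracts the url(...) targets from @font-face rules and reports the ones missing from a list of archive entries. The regular expressions are a simplification that does not cover every CSS syntax variant, and the names font_urls, missing_fonts and OEBPS/style.css are assumptions for illustration.

```python
import posixpath
import re

# Matches url(...) with optional quotes; a simplification of real CSS syntax.
FONT_URL = re.compile(r"""url\(\s*['"]?([^'")]+?)['"]?\s*\)""")

SAMPLE_CSS = """
@font-face {
  font-family: "Linux Libertine";
  font-style: normal;
  font-weight: normal;
  src: url(LinLibertine_Re.ttf);
}
"""


def font_urls(css_text):
    """Return every url(...) target found inside @font-face blocks."""
    urls = []
    for block in re.findall(r"@font-face\s*\{(.*?)\}", css_text, flags=re.DOTALL):
        urls.extend(FONT_URL.findall(block))
    return urls


def missing_fonts(css_path, css_text, archive_names):
    """Return font paths referenced by the stylesheet but absent from the archive.

    URLs are resolved relative to the folder of the CSS file.
    """
    css_dir = posixpath.dirname(css_path)
    names = set(archive_names)
    missing = []
    for url in font_urls(css_text):
        target = posixpath.normpath(posixpath.join(css_dir, url))
        if target not in names:
            missing.append(target)
    return missing


print(font_urls(SAMPLE_CSS))
print(missing_fonts("OEBPS/style.css", SAMPLE_CSS, ["OEBPS/style.css"]))
```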
An ebook or a website?
- ePUB 2.x is pretty much like a simple website
- ePUB 3.x brings HTML5 and scripting inside electronic books (see http://idpf.org/epub/30/spec/epub30-changes.html#sec-new-changed)
ePUB repacking
- Choose an ePUB
- Modify some files
- repack:
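cd /tmp/epub
# mimetype goes in first, stored uncompressed (-0) and without extra file attributes (-X)
zip -0Xq my.epub mimetype
# then everything else: recursive (-r), maximum compression (-9), no directory entries (-D)
zip -Xr9Dq my.epub * -x mimetype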
- test
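The repack step can also be sketched in Python with the standard zipfile module, following the same order as the zip commands: the mimetype file first and uncompressed, then everything else compressed. The function name repack_epub is an assumption for this example, and the sketch expects source_dir to be the unpacked book folder containing a mimetype file.

```python
import os
import zipfile


def repack_epub(source_dir, epub_path):
    """Zip source_dir into epub_path with mimetype as the first, uncompressed entry."""
    output = os.path.abspath(epub_path)
    with zipfile.ZipFile(epub_path, "w") as archive:
        # The mimetype file is written first and stored without compression.
        archive.write(
            os.path.join(source_dir, "mimetype"),
            "mimetype",
            compress_type=zipfile.ZIP_STORED,
        )
        for folder, _subdirs, files in os.walk(source_dir):
            for file_name in sorted(files):
                full_path = os.path.join(folder, file_name)
                # Skip the archive being written if it lives inside source_dir.
                if os.path.abspath(full_path) == output:
                    continue
                arcname = os.path.relpath(full_path, source_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already added above
                archive.write(full_path, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

A call such as repack_epub("/tmp/epub", "/tmp/my.epub") would mirror the shell commands above. Only files are added, so no directory entries end up in the archive.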