Friday, 13 November 2009

making epubs from pdfs

As I've said earlier, I've found that epub files are definitely easier to work with on e-readers than PDFs.

My initial thought was that, given I have a reasonable amount of stuff in PDF format, I should convert it. But how?

PDF files are essentially modified PostScript with some embedded metadata, whereas epub is a zip-based format containing a manifest, CSS for formatting, and the document source material in XHTML - conceptually not unlike an OpenOffice document file in structure.
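To make that structure concrete, here is a minimal sketch in Python of an epub container built by hand with the standard zipfile module. The file names under OEBPS/, the title and the identifier are placeholders of my own, and a strict EPUB 2 validator would also want an NCX table of contents, which I've left out for brevity.

```python
import io
import zipfile

# Points the reader at the package (manifest) file inside the archive.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# The package file: metadata, a manifest of every file, and the reading order.
# Title and identifier are placeholder values.
CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">example-id-0001</dc:identifier>
  </metadata>
  <manifest>
    <item id="css" href="style.css" media-type="text/css"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>
"""

STYLE_CSS = "body { font-family: serif; margin: 1em; }\n"

CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example</title>
    <link rel="stylesheet" type="text/css" href="style.css"/>
  </head>
  <body>
    <p>Hello, epub.</p>
  </body>
</html>
"""


def build_epub() -> bytes:
    """Return the bytes of a tiny epub archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry must come first and must be stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("OEBPS/content.opf", CONTENT_OPF)
        zf.writestr("OEBPS/style.css", STYLE_CSS)
        zf.writestr("OEBPS/chapter1.xhtml", CHAPTER_XHTML)
    return buf.getvalue()


if __name__ == "__main__":
    with open("example.epub", "wb") as fh:
        fh.write(build_epub())
```

Unzipping the result shows the same layout you would see inside any epub: the mimetype marker, the container pointer, the manifest, the stylesheet and the XHTML content.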

My initial thought experiment, based in part on a very useful howto on hand creation of epub files, was to write a print driver (ok, a PPD) to print the PDF to XHTML based on public domain pdf-to-text and pdf-to-html code, apply a default style, and create a manifest based on the embedded metadata.
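As a rough sketch of the last two steps of that idea, here is some Python that takes text already extracted from a PDF and a dictionary of its embedded metadata, and produces the XHTML and the manifest. The function names, the "Title" and "Author" keys, and the file names are all assumptions of mine for illustration, and the extraction step itself (whichever pdf-to-text tool you use) is not shown.

```python
import uuid
from html import escape


def text_to_xhtml(text: str, title: str) -> str:
    """Wrap extracted plain text in XHTML, one <p> per blank-line-separated block."""
    # Collapse the hard line breaks that text extraction tends to leave behind.
    blocks = [" ".join(block.split()) for block in text.split("\n\n")]
    paras = "\n".join(f"    <p>{escape(b)}</p>" for b in blocks if b)
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>{escape(title)}</title>
    <link rel="stylesheet" type="text/css" href="style.css"/>
  </head>
  <body>
{paras}
  </body>
</html>
"""


def metadata_to_opf(meta: dict) -> str:
    """Build an OPF manifest from PDF-style metadata (keys are assumed names)."""
    title = escape(meta.get("Title", "Untitled"))
    author = escape(meta.get("Author", "Unknown"))
    # PDFs carry no epub identifier, so generate a random one.
    book_id = f"urn:uuid:{uuid.uuid4()}"
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{title}</dc:title>
    <dc:creator>{author}</dc:creator>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">{book_id}</dc:identifier>
  </metadata>
  <manifest>
    <item id="css" href="style.css" media-type="text/css"/>
    <item id="body" href="body.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="body"/>
  </spine>
</package>
"""


if __name__ == "__main__":
    sample = "First paragraph, broken\nacross two lines.\n\nSecond paragraph."
    print(text_to_xhtml(sample, "Sample"))
    print(metadata_to_opf({"Title": "Sample", "Author": "A. N. Other"}))
```

This only handles running text; anything with columns, tables or images would need much more than paragraph splitting, which is where a real converter earns its keep.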

However, Stanza also allows PDF files to be saved as epub documents. They already have the technology, and I suspect that their epub conversion is a little more sophisticated than anything I would write, given both that their native format is epub and that they are now an Amazon subsidiary.

A bit of creative play might be in order ...

