[sugar] eBook ideas

Don Hopkins dhopkins
Fri Mar 30 02:53:02 EDT 2007


I've been working on a PDF eBook reader activity for the Internet Archive. 

The goal is to make a lightweight, efficient PDF reader that can search for and download books from the Internet Archive.

I'm basing it on the xbook activity, but adding support for browsing the Internet Archive, and changing the way it renders. 
The xbook activity uses evince to render PDF; evince runs in a separate process and supports other formats besides PDF. Instead, I want to integrate the poppler PDF library into Python and use it to write a lightweight PDF book reader that draws with Cairo directly.

(See Irvin Probst's goal "2/ write a minimal set of python bindings for libpoppler")
http://www.mail-archive.com/evince-list@gnome.org/msg00710.html

I made a Python module named "poppler" (in a project named pypoppler) that can read PDF files and render them by calling Cairo directly (instead of making an intermediate bitmap with its own renderer or running a separate process). It can take a pycairo context as a parameter. So now I can render PDF files into Cairo contexts from Python, at any scale or rotation or clipping region! 
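To make "any scale or rotation" concrete, here is a minimal sketch of the arithmetic for fitting a page into a drawing area. It is not taken from pypoppler: the function name fit_page, its parameters, and the 1200x900 view size are assumptions for illustration. It uses only the standard library and computes the scale and angle one would set on a Cairo context before asking a page to render.

```python
import math


def fit_page(page_w, page_h, view_w, view_h, quarter_turns=0):
    """Return (scale, angle) for showing a whole page inside a view.

    page_w, page_h: page size in PDF points (1/72 inch).
    view_w, view_h: drawing area size in pixels (illustrative values below).
    quarter_turns:  rotation in 90-degree steps; any integer is accepted.
    """
    quarter_turns %= 4
    # A 90 or 270 degree rotation swaps the page's width and height.
    if quarter_turns % 2:
        rot_w, rot_h = page_h, page_w
    else:
        rot_w, rot_h = page_w, page_h
    # Largest uniform scale at which the rotated page still fits the view.
    scale = min(view_w / rot_w, view_h / rot_h)
    angle = quarter_turns * math.pi / 2.0
    return scale, angle


# Hypothetical example: a US Letter page (612 x 792 points) in a 1200 x 900 view.
upright = fit_page(612, 792, 1200, 900)
sideways = fit_page(612, 792, 1200, 900, quarter_turns=1)
print(upright, sideways)
```

With pycairo, one plausible way to apply the result is to translate to the center of the view, rotate by the angle, scale by the scale factor, and then translate by (-page_w / 2, -page_h / 2) before drawing the page, so the page ends up centered whatever the rotation.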

I used SWIG to define a "poppler" Python module with wrappers around C++ classes for Document and Page. They wrap the corresponding poppler glib objects.

I put in the most important methods that will be useful to us now (count pages, get a page, measure the page, render the page through Cairo), but haven't fleshed it out with support for all the other things you can do with PDF files (annotations, fonts, index, form fields, etc).

I found some m4 macros for configuring Python extensions with SWIG, which helped make the configuration process easier.
The header file popplerwrappers.h is a SWIG-friendly, valid C++ header defining the Document and Page classes with inline function definitions; it gets processed by SWIG as well as included in the generated wrapper.

The "poppler.i" SWIG file defines some headers and initialization code, defines some typemaps, and loads some typemaps from "pycairo.i".
It defines some initialization code that imports the pycairo interface (so we can pass pycairo.Context in and get the cairo_t to draw with), and initializes the glib object system.
It defines some typemaps for out parameters (so functions like size and getCropBox can return multiple numbers).

The "pycairo.i" SWIG library defines typemaps that let you convert between C cairo_t pointers and Python pycairo.Context objects (which could be used by other projects).

Typemaps let you tell SWIG how to pass and convert parameters in and out of functions (like the Cairo context).

I've found some cool Python libraries for generating charts and reports in PDF (PyChart and ReportLab). Is anyone considering including stuff like that in the standard distribution, and are there any favorites? SimCity could use PyChart and ReportLab to display its history graphs and statistics.

Here are some ideas I wrote about the eBook reader and related projects that people could work on:

http://wiki.laptop.org/go/Summer_of_Code/2006

ebook reader

    Mentor: Don Hopkins 

Work with a crossmark/html book reader, or produce tools for converting to/from this format, to give children annotatable access to the world's digitized books.

Don Hopkins is developing a PDF-based eBook reader for the Internet Archive, using the "poppler" library to draw with Cairo. It will have a simple book reading user interface to search, page, zoom, pan, rotate, arrange pages in various configurations, follow links, navigate the index, etc. It should be fully usable in "book mode" with the game controller. It will be able to browse and search the Internet Archive eBook library, and download eBooks to read. It can use the Internet Archive RSS feeds and web services to get lists and descriptions of books, search the archive, and download XML metadata and PDF documents.
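As a rough illustration of the "lists and descriptions of books" step, here is a minimal sketch that pulls titles, links, and descriptions out of an RSS 2.0 feed using only the Python standard library. The sample feed, its example.org URLs, and the function name parse_book_feed are made up for illustration; this does not reflect the actual layout of the Internet Archive's feeds or services, and no network access is attempted.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 document standing in for a real book feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example book feed</title>
    <item>
      <title>Alice's Adventures in Wonderland</title>
      <link>http://example.org/details/alice</link>
      <description>A made-up entry for illustration.</description>
    </item>
    <item>
      <title>The Wonderful Wizard of Oz</title>
      <link>http://example.org/details/oz</link>
      <description>Another made-up entry.</description>
    </item>
  </channel>
</rss>
"""


def parse_book_feed(xml_text):
    """Return a list of dicts with the title, link and description of each item."""
    root = ET.fromstring(xml_text)
    books = []
    for item in root.iter("item"):
        books.append({
            # findtext returns the default when the child element is missing.
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            "description": item.findtext("description", default="").strip(),
        })
    return books


for book in parse_book_feed(SAMPLE_FEED):
    print(book["title"], "->", book["link"])
```

A reader activity could feed a list like this into its book browser and fetch the PDF only when a title is picked.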

Other interesting eBook related projects:

Optimizing the eBook activity and libraries for low power and memory consumption. Optimizing Cairo library image rendering. Reusing the "poppler" PDF rendering module for other purposes. Integrating useful PDF generation modules (e.g. PyGraph, ReportLab). Writing some useful components and applications using PDF generation and rendering modules. Extending Poppler's API to support editing PDF documents. Developing a simple PDF editor component (for annotating eBooks and editing graphics).

Collaborative shared eBook reading activity: synchronize the document, page and a cursor over the network, so kids can take turns reading an eBook out loud together, with special support for plays and scripts. Each child chooses one or more characters to read, and the eBook parses the text to know who speaks each line, and prompts each child to read their lines by zooming and highlighting the text to read.
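The "who speaks each line" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the script format (a character name in capitals followed by a colon), the sample text, and the names parse_script and prompts are all hypothetical, and real play texts extracted from PDF would need much more forgiving parsing.

```python
import re

# Assumed format: "SPEAKER: text", with the speaker's name in capital letters.
SPEAKER_LINE = re.compile(r"^([A-Z][A-Z ]*):\s*(.+)$")

SCRIPT = """
WOLF: Little pig, let me come in.
PIG: Not by the hair
of my chinny chin chin.
WOLF: Then I'll huff and I'll puff.
"""


def parse_script(text):
    """Return a list of (speaker, speech) pairs.

    Lines without a speaker prefix are treated as continuations of the
    previous speech; blank lines are skipped.
    """
    speeches = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = SPEAKER_LINE.match(line)
        if match:
            speeches.append([match.group(1).strip(), match.group(2)])
        elif speeches:
            speeches[-1][1] += " " + line
    return [tuple(speech) for speech in speeches]


def prompts(speeches, readers):
    """Yield (index, child, speaker, speech) in reading order.

    readers maps a character name to the child who chose it; child is
    None when nobody has picked that character.
    """
    for index, (speaker, speech) in enumerate(speeches):
        yield index, readers.get(speaker), speaker, speech


for index, child, speaker, speech in prompts(parse_script(SCRIPT),
                                             {"WOLF": "Ana", "PIG": "Ben"}):
    print(index, child, speaker, speech)
```

In a shared activity, the index is the kind of small value that could be synchronized over the network as the cursor, with each laptop highlighting the speech when the child value matches its own reader.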




