[IAEP] Get Internet Archive Books Activity available soon

Jim Simmons nicestep at gmail.com
Sun Jun 28 19:21:01 EDT 2009


Sayamindu,

On the OPDS issue of only linking to PDFs: the Internet Archive uses
some pretty rigid file naming conventions, so if you want a DJVU and
only the URL for the PDF is given, it could be as simple as changing the
filename suffix from .pdf to .djvu.
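
A minimal sketch of that suffix swap, in Python. The function name
swap_suffix and the example URL are made up for illustration, and it
assumes the DJVU really does sit next to the PDF under the same base
name, which would need checking against real items:

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def swap_suffix(url, new_suffix=".djvu", old_suffix=".pdf"):
    """Return url with old_suffix at the end of its path replaced by new_suffix.

    Only the path component is touched, so any query string or fragment
    is carried over unchanged. Raises ValueError if the path does not
    end in old_suffix (compared case-insensitively).
    """
    parts = urlsplit(url)
    root, ext = posixpath.splitext(parts.path)
    if ext.lower() != old_suffix:
        raise ValueError("URL path does not end in %s: %s" % (old_suffix, url))
    return urlunsplit(parts._replace(path=root + new_suffix))


if __name__ == "__main__":
    # Hypothetical item name, used only to show the transformation.
    print(swap_suffix("http://www.archive.org/download/example_item/example_item.pdf"))
```

The download can still fail if a given item has no DJVU derivative, so
a caller would want to handle a "not found" response rather than trust
the rewritten URL.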

I especially hope that you will have time to give my Activity a try.
I'm also interested to know what you think would be possible with OPDS
that I'm not already doing.

Project Gutenberg has a huge XML file in "Dublin Core" format that
tells you everything about their books except the URL to download them
from, which makes their far simpler offline catalog a better deal for
what I'm trying to do.  I'm a lot happier with the IA Advanced
Search.  It seems to give you everything they have, though at times I
wish that was more.  For instance, they have a field "publication
date".  But it isn't the *book's* publication date, it's the date the
*ebook* became available.  And some of the books have decent
descriptions, but most just say who scanned and uploaded the book and
where they got the original.
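
For anyone curious what a query against the Advanced Search looks
like, here is a rough sketch that only builds the request URL and does
not contact the site. The endpoint and the parameter names (q, fl[],
rows, output) and field names are written from memory of the public
Advanced Search form and should be treated as assumptions to verify;
build_search_url is a hypothetical helper, not code from the Activity:

```python
from urllib.parse import urlencode

# Assumed endpoint of the Internet Archive Advanced Search.
SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def build_search_url(keywords, language=None,
                     fields=("identifier", "title", "creator", "description"),
                     rows=50):
    """Build an Advanced Search URL for book (text) items.

    keywords: free-text search terms
    language: optional language name to restrict the results
    fields:   metadata fields to request for each result
    rows:     maximum number of results to return
    """
    # Restrict the search to texts, and optionally to one language.
    terms = ["(%s)" % keywords, "mediatype:texts"]
    if language:
        terms.append("language:%s" % language)

    # fl[] is repeated once for each requested field.
    params = [("q", " AND ".join(terms))]
    params += [("fl[]", field) for field in fields]
    params += [("rows", str(rows)), ("output", "json")]
    return SEARCH_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("jules verne", language="Yiddish"))
```

Asking only for the fields you intend to display keeps the response
small, which matters on a slow connection.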

PG's contents are also available through the Internet Archive, so it
might be possible to use my new Activity to download PG books in epub
format, when you have that working.

I read a book last week about the MIT Media Lab written by Stewart
Brand back in the 1980s, and back then the buzzword was convergence.
That's how I feel now: lots of stuff *that close* to converging.  And
when it does, look out.  We'll bury those kids in books.

James Simmons


On Sun, Jun 28, 2009 at 4:08 PM, Sayamindu Dasgupta<sayamindu at gmail.com> wrote:
> On Mon, Jun 29, 2009 at 2:31 AM, Jim Simmons<nicestep at gmail.com> wrote:
>> I uploaded the first version of Get Internet Archive Books to ASLO
>> about an hour ago, so perhaps by the time you read this it will be
>> available to try out there.  I'm sending this email to IAEP because
>> I'd like some feedback from those who might use the Activity.  What
>> this Activity does is to provide a front end to the Advanced Search
>> function of the Internet Archive website.  In essence it gives you a
>> nice GUI to search through the archive, get information about books,
>> then download the books you choose to the Journal.  It's very similar
>> to the offline catalog feature of Read Etexts, but better, because it
>> has much more information on the books.  The screenshots at ASLO tell
>> the story so I won't give more details here.  Suffice it to say if you
>> are looking for books with pictures, or books in languages other than
>> English, then this Activity will be of interest.  If you've ever
>> dreamed of reading the works of Jules Verne in Yiddish then this
>> Activity will make those dreams come true.
>>
>> Currently the Activity can only download the DJVU format.  This format
>> is an alternative to PDF for documents consisting of scanned-in book
>> pages.  It gives better results than PDF in less than half the disk
>> space.  You can use Read to view these files.  Unfortunately, Read's
>> support for DJVU is flaky, at least in .82 on the XO.  I'm pretty sure
>> I'm downloading the books correctly, but it's possible I'm to blame
>> for this.  I'll need to do some more testing to know for sure.  Future
>> versions will support downloading PDFs and other formats offered by
>> this website.
>>
>
> http://dev.laptop.org/~sayamindu/Read-56.xo will give much better
> performance in 8.2.x OLPC OS releases.
>
> On a related note - you will probably be interested to know that the
> Internet Archive has started work on experimental OPDS support:
> http://bookserver.archive.org/ (unfortunately they only link to the
> PDF variants from that catalogue)
>
> Cheers,
> Sayamindu
>
>
> --
> Sayamindu Dasgupta
> [http://sayamindu.randomink.org/ramblings]
>

