An office docs to Flash converter (i.e. my SlideShare clone)

July 31, 2009

Ever since a friend told me about running OpenOffice as a daemon to automate tasks, I had wanted to try it as part of a proof of concept for a mini SlideShare clone. The idea was simple: upload the original file, convert it with OO, and display a page for it. OpenOffice exposes an API and there are many clients for it. I chose not to develop a new client and used JodConverter instead. Of course, I would have to write my own converter if I wanted to inject or run custom procedures over a document.

To make things more interesting, I decided to throw in Juno, a semi-DSL for Python, and to design for a high-demand service architecture. I also didn't want to ask the user to go through a full registration form and procedure. It seemed nicer to use Craigslist's model, in which you receive a “publication kit” by email that you can use to manage and delete your ad.
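A “publication kit” like this can be sketched in a few lines of modern Python 3. This is only an illustration of the idea, not the code behind the site: the secret key, the URL layout and every function name below are assumptions made for the example.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real deployment would load it from config.
SECRET_KEY = b"change-me"


def new_document_hash() -> str:
    """Return a random, URL-safe identifier for an uploaded document."""
    return secrets.token_hex(16)


def delete_token(doc_hash: str, email: str) -> str:
    """Derive a delete token from the document hash and the owner's email."""
    msg = f"{doc_hash}:{email}".encode("utf-8")
    return hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()


def build_kit(base_url: str, doc_hash: str, email: str) -> dict:
    """Build the two links that would go into the email (assumed URL layout)."""
    token = delete_token(doc_hash, email)
    return {
        "view_url": f"{base_url}/{doc_hash}/",
        "delete_url": f"{base_url}/delete/{doc_hash}/{token}",
    }


def can_delete(doc_hash: str, email: str, token: str) -> bool:
    """Check a token from a delete link using a constant-time comparison."""
    return hmac.compare_digest(delete_token(doc_hash, email), token)
```

Because the token is derived from the hash and the email, nothing extra has to be stored to validate a delete link, which fits the "no registration" goal.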

Well, with the WANTS list ready, I set out to code it. Most of it was done over a weekend, except for the Juno part. It really is an amazing framework, but still miles away from the ease of DSLs like Sinatra. It is mostly shortcuts over web.py, which is a fantastic framework too. After patching it to run under lighttpd using FastCGI over a Unix domain socket, and switching from SQLite to MySQL to avoid a threading error message, I was good to go.

The database, by the way, just stores a mapping of hash -> document info. I could have used Redis, but I'm saving the awesomeness for other projects.
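To give an idea of how little the database has to do, here is a minimal sketch of a hash -> document info store. It uses Python's built-in sqlite3 module purely as a stand-in so the example runs anywhere (the site itself uses MySQL), and the table layout and function names are assumptions, not the real schema.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the store and create the single table if needed (assumed schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "hash TEXT PRIMARY KEY, info TEXT NOT NULL)"
    )
    return conn


def put_document(conn, doc_hash, info):
    """Store the document info dict as JSON under its hash."""
    conn.execute(
        "INSERT OR REPLACE INTO documents (hash, info) VALUES (?, ?)",
        (doc_hash, json.dumps(info)),
    )
    conn.commit()


def get_document(conn, doc_hash):
    """Return the info dict for a hash, or None if it is unknown."""
    row = conn.execute(
        "SELECT info FROM documents WHERE hash = ?", (doc_hash,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def delete_document(conn, doc_hash):
    """Remove a document; return True if something was deleted."""
    cur = conn.execute("DELETE FROM documents WHERE hash = ?", (doc_hash,))
    conn.commit()
    return cur.rowcount > 0
```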

The whole process is as follows:

  • The user uploads a doc, xls or ppt file, along with an email address and some file info.
  • The info is saved to the db and a hash is generated.
  • In the filesystem, a public directory is created using the hash as its name, and the original file is saved there.
  • A default "please wait" HTML page is created there too.
  • A JSON fragment is posted to the queue, containing the path to the file, the email and other information.
  • The consumer daemon converts the file to .pdf using JodConverter and OO, and saves it in the file's directory.
  • It also runs pdf2swf from swftools, because OO's swf converter sometimes mangles the colors. The .swf is saved in that directory too.
  • A new index page is created, with links to the original file, the pdf and the swf, along with a swf player.
  • Mail is sent with the direct link and a delete link.
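The upload half of these steps can be sketched as follows. This is a rough illustration under stated assumptions, not the actual handler: the function name, the JSON field names and the wait page text are made up, and `publish` stands for whatever callable hands the message to the queue (for example a thin wrapper around an AMQP client).

```python
import json
import secrets
from pathlib import Path

# Placeholder page shown until the consumer replaces it with the real index.
WAIT_PAGE = (
    "<html><body><p>Your document is being converted. "
    "Please check back soon.</p></body></html>"
)


def handle_upload(public_root, filename, data, email, title, publish):
    """Save an upload under a fresh hash directory and queue a conversion job."""
    doc_hash = secrets.token_hex(16)
    doc_dir = Path(public_root) / doc_hash
    doc_dir.mkdir(parents=True)

    # Keep only the base name so a crafted filename cannot escape the directory.
    original = doc_dir / Path(filename).name
    original.write_bytes(data)
    (doc_dir / "index.html").write_text(WAIT_PAGE, encoding="utf-8")

    # Hypothetical message layout; the consumer only needs path and email.
    message = json.dumps(
        {"hash": doc_hash, "path": str(original), "email": email, "title": title}
    )
    publish(message)
    return doc_hash
```

Passing `publish` in as a parameter keeps the web side ignorant of the queue library, so the same function can be exercised with a plain list's `append` while testing.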

The web front end only handles two operations: file upload and delete. Everything else is done by the consumer daemon, which decouples the web app from the heavy processing.
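The consumer side could look roughly like the sketch below. It assumes the JodConverter 2.x command-line jar (which takes an input and an output file and talks to an OpenOffice instance that is already listening) and the `pdf2swf input.pdf -o output.swf` form from swftools. The jar path, the function names and the message layout are hypothetical, and queue handling, index page generation and mail are left out.

```python
import json
import subprocess
from pathlib import Path

# Hypothetical install location of the JodConverter command-line jar.
JODCONVERTER_JAR = "/opt/jodconverter/lib/jodconverter-cli.jar"


def conversion_commands(source):
    """Return the two command lines: office file -> pdf, then pdf -> swf."""
    source = Path(source)
    pdf = source.with_suffix(".pdf")
    swf = source.with_suffix(".swf")
    return [
        # Assumes an OpenOffice daemon is already running for JodConverter.
        ["java", "-jar", JODCONVERTER_JAR, str(source), str(pdf)],
        ["pdf2swf", str(pdf), "-o", str(swf)],
    ]


def handle_message(body, run=subprocess.check_call):
    """Decode one queued JSON job and run both conversions in order."""
    job = json.loads(body)
    for cmd in conversion_commands(job["path"]):
        run(cmd)  # check_call raises if a converter exits with an error
    return job
```

The `run` parameter exists only so the command sequence can be checked without Java or swftools installed; in the daemon it would stay at its default.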

Results may vary, but for my purposes it turned out to be a nice sandbox. You can give it a try at http://www.tinyppt.net.

Software:

  • JodConverter
  • OpenOffice
  • RabbitMQ (AMQP message queue)
  • Python and an AMQP connector
  • Juno
  • MySQL

Architecture diagram

architecture diagram

