Visual Slide Presentation Analysis: May 2007

Thursday, May 31, 2007

Update

Log of project (in point form for simplicity)

*Unable to find a suitable viewer or utility to make use of the .sep file that was generated with the gsdjvu library.

*Started work on writing a program to separate foreground/background.

*Began reading up on suitable libraries available.

*Tried the sixlegs PNG Java package, but found it unsuitable because of its lack of documentation.

*Found a suitable library: Java Advanced Imaging (JAI).

*Started implementation in Java, which includes:

- learning how to use JAI, as well as some basics of image processing, since I have no prior experience with it

- able to read image pixels from a png image and dump them to a text file

- able to read a folder of png images and dump the pixel values into their corresponding output text files

- ran into problems with heap size -> searched online for a suitable solution -> used the JVM parameter -Xmx to solve it

- found out more about writing images by setting individual pixels' RGB values
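The steps above can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: it uses the standard javax.imageio and BufferedImage classes rather than JAI, and the class name PixelDump, its method names, and the "x y r g b" text layout are all assumptions made for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import javax.imageio.ImageIO;

// Hypothetical helper class; names and output format are illustrative only.
class PixelDump {

    // Render every pixel as "x y r g b", one pixel per line.
    static String pixelsToText(BufferedImage image) {
        StringBuilder out = new StringBuilder();
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                int rgb = image.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                out.append(x).append(' ').append(y).append(' ')
                   .append(r).append(' ').append(g).append(' ').append(b).append('\n');
            }
        }
        return out.toString();
    }

    // Dump every .png file in inputDir to a matching .txt file in outputDir.
    static void dumpFolder(File inputDir, File outputDir) throws IOException {
        File[] files = inputDir.listFiles((dir, name) -> name.toLowerCase().endsWith(".png"));
        if (files == null) {
            throw new IOException("Not a readable directory: " + inputDir);
        }
        for (File png : files) {
            BufferedImage image = ImageIO.read(png);
            if (image == null) {
                continue; // ImageIO could not decode this file as an image
            }
            String base = png.getName().substring(0, png.getName().length() - 4);
            try (PrintWriter writer = new PrintWriter(new File(outputDir, base + ".txt"))) {
                writer.print(pixelsToText(image));
            }
        }
    }

    // Build an image by setting each pixel's RGB value individually.
    static BufferedImage solidImage(int width, int height, int r, int g, int b) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        int rgb = (r << 16) | (g << 8) | b;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                image.setRGB(x, y, rgb);
            }
        }
        return image;
    }

    public static void main(String[] args) throws IOException {
        // args[0] = folder of PNG slides, args[1] = existing folder for the text dumps
        dumpFolder(new File(args[0]), new File(args[1]));
    }
}
```

For a large folder of slides, the heap limit can be raised when launching, for example with java -Xmx512m PixelDump slides dumps (the 512m value and the folder names are only examples).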

TODO:

Start implementing a counter for the dominant pixel value across a set of slides, then use that pixel value to write an image file, to see whether an accurate background pixel can be obtained.
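One possible shape for this counter is sketched below. It is only an assumption about how the TODO might be approached: it counts exact RGB values over all pixels of all slides and picks the most frequent one, using standard Java classes instead of JAI. The class name DominantPixel and its methods are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.imageio.ImageIO;

// Hypothetical sketch of the planned dominant-pixel counter.
class DominantPixel {

    // Count each RGB value across all slides and return the most frequent one.
    // Ties are broken arbitrarily; an empty list yields 0 (black).
    static int dominantRgb(List<BufferedImage> slides) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (BufferedImage slide : slides) {
            for (int y = 0; y < slide.getHeight(); y++) {
                for (int x = 0; x < slide.getWidth(); x++) {
                    int rgb = slide.getRGB(x, y) & 0xFFFFFF; // ignore alpha
                    counts.merge(rgb, 1, Integer::sum);
                }
            }
        }
        int best = 0;
        int bestCount = -1;
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    // Create an image filled entirely with the given RGB value.
    static BufferedImage backgroundImage(int width, int height, int rgb) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                image.setRGB(x, y, rgb);
            }
        }
        return image;
    }

    public static void main(String[] args) throws IOException {
        // args[0] = output PNG, remaining args = slide PNG files
        List<BufferedImage> slides = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            BufferedImage slide = ImageIO.read(new File(args[i]));
            if (slide != null) {
                slides.add(slide);
            }
        }
        int rgb = dominantRgb(slides);
        // The 640x480 output size is an arbitrary choice for this sketch.
        ImageIO.write(backgroundImage(640, 480, rgb), "png", new File(args[0]));
    }
}
```

Counting exact RGB values is the simplest option; slides with gradients or compression noise might need the values grouped into coarser bins first, which this sketch does not attempt.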

Wednesday, May 23, 2007

.sep files

Finally managed to compile the gsdjvu source. It took me more than a day: I first tried on Windows with Cygwin, and then on Linux, which I am new to.

One of gsdjvu's methods is supposed to separate the foreground and background of a PS or PDF file. However, it writes its output as a .sep file, which I have not yet figured out how to read.

It seems to be a TIFF separation file or something similar, but so far every attempt to open it with existing viewers has failed.
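One quick way to test the TIFF guess is to look at the first four bytes of the file, since a TIFF file begins with either "II" followed by 42, 0 (little-endian) or "MM" followed by 0, 42 (big-endian). The sketch below only checks that signature and says nothing about what gsdjvu actually writes; the class name TiffMagicCheck is hypothetical, and readNBytes requires Java 9 or later.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper: checks only the 4-byte TIFF signature, nothing more.
class TiffMagicCheck {

    // True if the header starts with a little-endian or big-endian TIFF signature.
    static boolean looksLikeTiff(byte[] header) {
        if (header.length < 4) {
            return false;
        }
        boolean littleEndian = header[0] == 'I' && header[1] == 'I'
                && header[2] == 42 && header[3] == 0;
        boolean bigEndian = header[0] == 'M' && header[1] == 'M'
                && header[2] == 0 && header[3] == 42;
        return littleEndian || bigEndian;
    }

    // Read up to the first four bytes of the file.
    static byte[] readHeader(String path) throws IOException {
        try (InputStream in = new FileInputStream(path)) {
            byte[] header = new byte[4];
            int read = in.readNBytes(header, 0, 4);
            return Arrays.copyOf(header, read);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] = path to the .sep file
        byte[] header = readHeader(args[0]);
        System.out.println(looksLikeTiff(header)
                ? "Starts with a TIFF signature"
                : "Does not start with a TIFF signature");
    }
}
```

If the signature is missing, the file is not a plain TIFF and a hex dump of its first bytes would be the next thing to look at.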

Wednesday, May 16, 2007

DjVu Image Compression Format and Ruby on Rails

While reading up on foreground/background separation techniques, I came across the DjVu image compression format.

DjVu can store compressed images of documents as very small files without compromising readability. It first separates the foreground and the background of the document, then compresses the background while keeping the text at high resolution. In addition, the foreground text can be OCR-ed.

This seems to be related to what I need for the first portion of the project and I shall see whether I can make use of this over the next few days.

There is also a Ruby on Rails meeting coming up later, and I shall attend it to see if it can be useful for the project. In any case, it is probably good to learn about it, though I currently have no knowledge of Ruby, let alone Rails.

I also need to decide on the software side of the project: the programming languages to be used, libraries, etc.

Thursday, May 10, 2007

Project Initialisation

The project for my honours year is called "Visual Slide Presentation Analysis". The main goal of the project is to be able to analyse and extract valuable visual information from slides as well as to segment and classify them.

The first portion of the project will involve separating the foreground and background of the slides, which I will be finding out more about over the next few weeks.

The following are the tasks that are lined up for the next few weeks as I begin the project.

1) Find out more about the PNG image format
2) Start learning a scripting language (either Perl or Ruby)
3) Find out more about existing techniques (if any) on foreground/background separation
4) Find out what users look for in slides, and whether most slides add value or are just summarized versions of papers
