this is my blog

Busy busy busy...
at 22:32 | General, Thesis.

It has been quite a while since my last blog post. As usual I am too busy doing work, finishing my thesis and other stuff. I still have to get used to regularly updating this website.

I have just put a new design for this website online. I created it quite a while ago, but I guess this is the most I will do to this website for at least another two months.

Right now I am in the final two months of my thesis (hopefully). I am trying to write continuously, so that I finish a draft of my thesis by the end of September and can finish my Master's somewhere in October.

Obviously, once I've finished my Master's, I'll post the end product, and hopefully I will have some more time to actually get this website more up to date.

iPhone Application: iRoll
at 14:08 | General, Applications.

It's been a while since my last post. I have been busy with a lot of things, mostly my thesis and work (websites...). Another thing I have been doing is finally finishing my Game Programming project. For this project I have created a small iPhone application.

The application is a game based on Yahtzee. As an addition to the default game where you can roll dice and save your score, I've added two things which are more related to Artificial Intelligence.

First, there's Computer Vision involved. If you have a device with a camera (iPhone), you can take a picture of real dice and the application will automatically recognize what you rolled, so you can use this roll in your game.

Another feature is suggestions. The application will give you some suggestions on what dice to keep and which to re-roll, but also which score category you should fill in. The suggestions are a result of calculating the full game graph and taking the path with the highest score.
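To give an idea of how such a suggestion can be computed, here is a minimal sketch in Python. It is not the code from the application: it only looks one re-roll ahead instead of walking the full game graph, it only knows a handful of score categories, and all function and category names are made up for this illustration.

```python
from collections import Counter
from itertools import combinations, product


def score(dice, category):
    """Score five dice in one category (simplified subset of the rules)."""
    counts = Counter(dice)
    if isinstance(category, int):            # upper section: 1..6
        return category * counts[category]
    if category == "three_of_a_kind":
        return sum(dice) if max(counts.values()) >= 3 else 0
    if category == "yahtzee":
        return 50 if max(counts.values()) == 5 else 0
    if category == "chance":
        return sum(dice)
    raise ValueError(f"unknown category: {category}")


def best_category(dice, open_categories):
    """Category that scores highest for a finished roll."""
    return max(open_categories, key=lambda c: score(dice, c))


def expected_score(kept, open_categories):
    """Average best score when the dice not in `kept` are rolled once more."""
    kept = tuple(kept)
    outcomes = list(product(range(1, 7), repeat=5 - len(kept)))
    total = sum(
        max(score(kept + rest, c) for c in open_categories)
        for rest in outcomes
    )
    return total / len(outcomes)


def suggest_keep(roll, open_categories):
    """Dice to keep so that the expected score after one re-roll is highest."""
    roll = tuple(roll)
    candidates = sorted({
        tuple(roll[i] for i in idx)
        for k in range(6)
        for idx in combinations(range(5), k)
    })
    return max(candidates, key=lambda kept: expected_score(kept, open_categories))
```

For example, `suggest_keep((6, 6, 6, 6, 2), ["yahtzee"])` suggests keeping the four sixes. A full solution would apply the same idea recursively over all remaining re-rolls and open categories.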

Anyway, here are some screenshots of the result:

[Screenshots: iRoll 1, iRoll 2, iRoll 3]

If you are interested in the game, you can buy it for $0.99 (or €0.79) in iTunes (App Store) through the following link: iRoll. Also, here's a link to the final paper for this project: Download.

Labeling
at 13:41 | General, Thesis.

This week I am trying to finish the new labeling for the Hollywood2 data. Instead of automatic labeling, which was done by Ivan Laptev, I am now manually labeling all 1700 videos. Obviously this will take up quite some time, but it will be very useful for further experiments. Also I will make this data openly available, for use in the future.

To aid me in labeling, I created a simple GUI application in Matlab. With this tool I can easily go through all frames of a video, rewind, pause etc. To save an action I simply have to press the button for the particular action. Using this tool I should be able to finish the labeling before the end of the week.

Label Tool Matlab

More Results
at 12:05 | General, Thesis.

Since my last entry I haven't done many new things, mainly because I have been running more experiments with different settings, but also because I have been working a bit more on my iPhone program for a course called Game Programming (I will probably make a blog entry on this as well).

Yesterday, however, I finished some code for checking the performance on the results of splitting the test data into smaller pieces of video. The splitting was done on the shot boundaries that come with the Hollywood2 dataset. As we do not have specific labels for this split data, I cannot say anything specific about the performance. What I can do is check random videos and verify the results.
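The splitting itself is simple. Here is a minimal Python sketch of the idea, assuming the shot boundaries are given as frame indices at which a new shot starts. That input format and the function name are assumptions for this example, not the actual format of the Hollywood2 files or my own code.

```python
def split_at_shots(num_frames, boundaries):
    """Split frames 0..num_frames into (start, end) ranges, one per shot.

    `boundaries` holds the frame indices where a new shot starts
    (assumed format). `end` is exclusive.
    """
    # Ignore duplicates and boundaries outside the video.
    cuts = sorted({b for b in boundaries if 0 < b < num_frames})
    starts = [0] + cuts
    ends = cuts + [num_frames]
    return list(zip(starts, ends))
```

A video of 100 frames with boundaries at frames 30 and 70 gives three pieces: `(0, 30)`, `(30, 70)` and `(70, 100)`. Each piece can then be passed to the classifier as if it were a separate video.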

So I wrote a script that picks a random video and plots the likelihood of every action for each part of the video (divided using the shot boundaries). The results look promising: for most videos I looked at, the actions are identified with the highest likelihood in the right shot. However, this is not true for every video, and sometimes, even though there's no action taking place in the video, it still has high likelihoods for some actions. As the labels for each video are generated by an automated process, described in Learning realistic human actions from movies, it is likely that errors exist. Because this can become quite a problem if you want a more specific location of an action in a video, in the coming days I will be looking at a way of labeling the data by hand. This might take some time, but it will be useful for further experiments.
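The check I do by eye on the plots can also be written down in a few lines. Below is a minimal Python sketch, not the script I actually used: it picks a random video and reports, for every action, the shot with the highest likelihood. The data layout (a dictionary from action name to a list of per-shot likelihoods) and the function names are assumptions for this example.

```python
import random


def pick_random_video(video_ids, seed=None):
    """Pick one video id at random; pass a seed to repeat the same pick."""
    return random.Random(seed).choice(sorted(video_ids))


def best_shot_per_action(likelihoods):
    """Map each action to (shot index, likelihood) of its best-scoring shot.

    `likelihoods` maps an action name to a list with one value per shot.
    """
    return {
        action: max(enumerate(per_shot), key=lambda pair: pair[1])
        for action, per_shot in likelihoods.items()
    }
```

With made-up values such as `{"Run": [0.1, 0.8, 0.2]}`, the result is `{"Run": (1, 0.8)}`, meaning the second shot (index 1) is the most likely place for that action.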

Just to show one of the promising results I got yesterday, here's a short video from Raising Arizona (1987) and the likelihoods for each shot. In the video, you can see a man running in the second, fourth and sixth shot. The likelihood results clearly show the same pattern.

View video

split data results

First Results
at 00:21 | General, Thesis.

Today we finally got the first results that are comparable to the ones from Ivan Laptev's paper. Interestingly enough, our results seem significantly better. This should not be the case, as we followed the same path Laptev used in his paper with the same dataset and same values for the different algorithms involved.

Obviously somewhere along the way, something must be different. Ivo and I will have to find out how this is possible and at least come up with a logical explanation.

The next step for me will be creating code for splitting up the dataset into smaller time frames. It will then be possible to see if the trained classifier is also able to find the correct actions in smaller time frames. This is useful for finding the actual position of the action in a movie (not just indicating whether an action occurs in a given movie).