Expectation-Maximization Algorithm for Bernoulli Mixture Models (Tutorial)

Even though the title is quite a mouthful, this post is about two really cool ideas:

1. A solution to the "chicken-and-egg" problem (known as the Expectation-Maximization method, described by A. Dempster, N. Laird and D. Rubin in 1977), and
2. An application of this solution to automatic image clustering by similarity, using Bernoulli Mixture Models.

For the curious, an implementation of the automatic image clustering is shown in the video below. The source code (C#, Windows x86/x64) is also available for download!

Automatic clustering of handwritten digits from MNIST database using Expectation-Maximization algorithm

While automatic image clustering nicely illustrates the E-M algorithm, E-M has been successfully applied in a number of other areas: I have seen it being used for word alignment in automated machine translation, valuation of derivatives in financial models, and gene expression clustering/motif finding in bioinformatics.

As a side note, the notation used in this tutorial closely matches the one used in Christopher M. Bishop's "Pattern Recognition and Machine Learning". This should hopefully encourage you to check out his great book for a broader understanding of E-M, mixture models or machine learning in general.

Alright, let's dive in!

1. Expectation-Maximization Algorithm

Imagine the following situation. You observe some data set $\mathbf{X}$ (e.g. a bunch of images). You hypothesize that these images are of $K$ different objects... but you don't know which images represent which objects.

Let $\mathbf{Z}$ be a set of latent (hidden) variables, which tell precisely that: which images represent which objects.

Clearly, if you knew $\mathbf{Z}$, you could group images into the clusters (where each cluster represents an object), and vice versa, if you knew the groupings you could deduce $\mathbf{Z}$. A classical "chicken-and-egg" problem, and a perfect target for an Expectation-Maximization algorithm.

Here's a general idea of how the E-M algorithm tackles it. First of all, all images are assigned to clusters arbitrarily. Then we use this assignment to modify the parameters of the clusters (e.g. we change what object is represented by that cluster) to maximize the clusters' ability to explain the data; after which we re-assign all images to the expected most-likely clusters. Wash, rinse, repeat, until the assignment explains the data well enough (i.e. images from the same clusters are similar enough).

(Notice the words "maximize" and "expected" in the previous paragraph: this is where the expectation and maximization stages in the E-M algorithm come from.)

To formalize (and generalize) this a bit further, say that you have a set of model parameters $\mathbf{\theta}$ (in the example above, some sort of cluster descriptions).

To solve the problem of cluster assignments we effectively need to find model parameters $\mathbf{\theta'}$ that maximize the likelihood of the observed data $\mathbf{X}$, or, equivalently, the model parameters that maximize the log likelihood $\ln \,\text{Pr}(\mathbf{X} | \theta)$.

Using some simple algebra we can show that for any latent variable distribution $q(\mathbf{Z})$, the log likelihood of the data can be decomposed as
\begin{align}
\ln \,\text{Pr}(\mathbf{X} | \theta) = \mathcal{L}(q, \theta) + \text{KL}(q || p), \label{eq:logLikelihoodDecomp}
\end{align}
where $\text{KL}(q || p)$ is the Kullback-Leibler divergence between $q(\mathbf{Z})$ and the posterior distribution $\,\text{Pr}(\mathbf{Z} | \mathbf{X}, \theta)$, and
\begin{align}
\mathcal{L}(q, \theta) := \sum_{\mathbf{Z}} q(\mathbf{Z}) \left( \mathcal{L}(\theta) - \ln q(\mathbf{Z}) \right)
\end{align}
with $\mathcal{L}(\theta) := \ln \,\text{Pr}(\mathbf{X}, \mathbf{Z}| \mathbf{\theta})$ being the "complete-data" log likelihood (i.e. log likelihood of both observed and latent data).
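The decomposition above can be checked numerically on a tiny model. The following is a minimal sketch, assuming a hypothetical toy setup with a single observation and one binary latent variable; the numbers in `prior` and `likelihood` are made up for illustration and are not taken from the image clustering example.

```python
import math

# Hypothetical toy model: one observation x and one binary latent variable Z.
prior = [0.3, 0.7]        # Pr(Z = z | theta), assumed values
likelihood = [0.9, 0.2]   # Pr(X = x | Z = z, theta), assumed values

joint = [p * l for p, l in zip(prior, likelihood)]   # Pr(X, Z | theta)
evidence = sum(joint)                                # Pr(X | theta)
posterior = [j / evidence for j in joint]            # Pr(Z | X, theta)


def lower_bound(q):
    """L(q, theta) = sum_Z q(Z) * (ln Pr(X, Z | theta) - ln q(Z))."""
    return sum(qz * (math.log(jz) - math.log(qz))
               for qz, jz in zip(q, joint) if qz > 0)


def kl(q, p):
    """KL(q || p) = sum_Z q(Z) * ln(q(Z) / p(Z))."""
    return sum(qz * math.log(qz / pz) for qz, pz in zip(q, p) if qz > 0)


q = [0.5, 0.5]                      # an arbitrary distribution over Z
log_evidence = math.log(evidence)   # ln Pr(X | theta)
bound = lower_bound(q)              # L(q, theta)
gap = kl(q, posterior)              # KL(q || p)
```

For any choice of `q`, `bound + gap` should agree with `log_evidence` up to floating-point error, and choosing `q = posterior` should drive the gap to zero.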

To understand what the E-M algorithm does in the expectation (E) step, observe that $\text{KL}(q || p) \geq 0$ for any $q(\mathbf{Z})$ and hence $\mathcal{L}(q, \theta)$ is a lower bound on $\ln \,\text{Pr}(\mathbf{X} | \theta)$.

Then, in the E step, the gap between the $\mathcal{L}(q, \theta)$ and $\ln \,\text{Pr}(\mathbf{X} | \theta)$ is minimized by minimizing the Kullback-Leibler divergence $\text{KL}(q || p)$ with respect to $q(\mathbf{Z})$ (while keeping the parameters $\theta$ fixed).

Since $\text{KL}(q || p)$ is minimized at $\text{KL}(q || p) = 0$ when $q(\mathbf{Z}) = \,\text{Pr}(\mathbf{Z} | \mathbf{X}, \theta)$, at the E step $q(\mathbf{Z})$ is set to the conditional distribution $\,\text{Pr}(\mathbf{Z} | \mathbf{X}, \theta)$.

To update the model parameters in the M step, the lower bound $\mathcal{L}(q, \theta)$ is maximized with respect to the parameters $\theta$ (while keeping $q(\mathbf{Z}) = \,\text{Pr}(\mathbf{Z} | \mathbf{X}, \theta)$ fixed; notice that $\theta$ in this equation corresponds to the old set of parameters, hence to avoid confusion let $q(\mathbf{Z}) = \,\text{Pr}(\mathbf{Z} | \mathbf{X}, \theta^\text{old})$).

The maximization of $\mathcal{L}(q, \theta)$ w.r.t. $\theta$ at the M step can be re-written as
\begin{align*}
\theta^\text{new} &= \underset{\mathbf{\theta}}{\text{arg max }} \left. \mathcal{L}(q, \theta) \right|_{q(\mathbf{Z}) = \,\text{Pr}(\mathbf{Z} | \mathbf{X}, \theta^\text{old})} \\
&= \underset{\mathbf{\theta}}{\text{arg max }} \left. \sum_{\mathbf{Z}} q(\mathbf{Z}) \left( \mathcal{L}(\theta) - \ln q(\mathbf{Z}) \right) \right|_{q(\mathbf{Z}) = \,\text{Pr}(\mathbf{Z} | \mathbf{X}, \theta^\text{old})} \\
&= \underset{\mathbf{\theta}}{\text{arg max }} \sum_{\mathbf{Z}} \,\text{Pr}(\mathbf{Z} | \mathbf{X}, \theta^\text{old}) \left( \mathcal{L}(\theta) - \ln \,\text{Pr}(\mathbf{Z} | \mathbf{X}, \theta^\text{old}) \right) \\
&= \underset{\mathbf{\theta}}{\text{arg max }} \mathbb{E}_{\mathbf{Z} | \mathbf{X}, \theta^\text{old}} \left[ \mathcal{L}(\theta) \right] - \sum_{\mathbf{Z}} \,\text{Pr}(\mathbf{Z} | \mathbf{X}, \theta^\text{old}) \ln \,\text{Pr}(\mathbf{Z} | \mathbf{X}, \theta^\text{old}) \\
&= \underset{\mathbf{\theta}}{\text{arg max }} \mathbb{E}_{\mathbf{Z} | \mathbf{X}, \theta^\text{old}} \left[ \mathcal{L}(\theta) \right] - (C \in \mathbb{R}) \\
&= \underset{\mathbf{\theta}}{\text{arg max }} \mathbb{E}_{\mathbf{Z} | \mathbf{X}, \theta^\text{old}} \left[ \mathcal{L}(\theta) \right],
\end{align*}

i.e. in the M step the expectation of the joint log likelihood of the complete data is maximized with respect to the parameters $\theta$.

So, just to summarize,

• Expectation step: $q^{t + 1}(\mathbf{Z}) \leftarrow \,\text{Pr}(\mathbf{Z} | \mathbf{X}, \mathbf{\theta}^t)$
• Maximization step: $\mathbf{\theta}^{t + 1} \leftarrow \underset{\mathbf{\theta}}{\text{arg max }} \mathbb{E}_{\mathbf{Z} | \mathbf{X}, \theta^t} \left[ \mathcal{L}(\theta) \right]$ (where the superscript in $\mathbf{\theta}^t$ indicates the value of parameter $\mathbf{\theta}$ at iteration $t$).
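The two steps summarized above can be sketched for a small Bernoulli mixture. This is only an illustrative sketch in plain Python, not the C# implementation mentioned earlier. The closed-form M-step updates (responsibility-weighted means and mixing weights) are the standard ones for Bernoulli mixtures and are an assumption here, as are the names `em_bernoulli_mixture`, `mu`, `pi` and `resp` and the toy data.

```python
import math


def em_bernoulli_mixture(X, mu, pi, iterations=20, eps=1e-9):
    """Run E-M on binary vectors X from initial means mu and mixing weights pi."""
    mu = [row[:] for row in mu]   # work on copies of the initial parameters
    pi = list(pi)
    N, D, K = len(X), len(X[0]), len(pi)
    resp = []
    for _ in range(iterations):
        # E step: q(Z) <- Pr(Z | X, theta_old), stored as responsibilities.
        resp = []
        for x in X:
            log_w = []
            for k in range(K):
                lw = math.log(pi[k])
                for d in range(D):
                    p = min(max(mu[k][d], eps), 1.0 - eps)  # keep log finite
                    lw += math.log(p) if x[d] else math.log(1.0 - p)
                log_w.append(lw)
            m = max(log_w)                       # subtract max for stability
            w = [math.exp(l - m) for l in log_w]
            total = sum(w)
            resp.append([wk / total for wk in w])
        # M step: maximize the expected complete-data log likelihood,
        # which for this model has a closed form (assumed standard result).
        for k in range(K):
            Nk = max(sum(r[k] for r in resp), eps)
            mu[k] = [sum(r[k] * x[d] for r, x in zip(resp, X)) / Nk
                     for d in range(D)]
            pi[k] = Nk / N
    return mu, pi, resp


# Toy "images": six 4-pixel binary vectors (made-up data).
X = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0],
     [0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
mu, pi, resp = em_bernoulli_mixture(
    X, mu=[[0.6, 0.6, 0.4, 0.4], [0.4, 0.4, 0.6, 0.6]], pi=[0.5, 0.5])

# Hard cluster labels from the responsibilities of the last E step.
assignments = [r.index(max(r)) for r in resp]
```

The initial means are deliberately asymmetric: starting every cluster from identical parameters would leave the responsibilities uniform, and the clusters would never separate.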

Phew. Let's go to the image clustering example, and see how all of this actually works.

3D Display Simulation using Head-Tracking with Kinect

During my final year in Cambridge I had the opportunity to work on the project that I wanted to implement for the last three years.

It all started when I saw Johnny Lee's "Head Tracking for Desktop VR Displays using the Wii Remote" project in early 2008 (see below). He cunningly used the infrared camera in the Nintendo Wii's remote and a head mounted sensor bar to track the location of the viewer's head and render view dependent images on the screen. He called it a "portal to the virtual environment".

I always thought that it would be really cool to have this behaviour without having to wear anything on your head (and it was - see the video below!).

My "portal to the virtual environment" which does not require head gear. And it has 3D Tetris!

I am a firm believer in three-dimensional displays, and I am certain that we do not see the widespread adoption of 3D displays simply because of a classic network effect (also known as the "chicken-and-egg" problem). The creation and distribution of three-dimensional content is inevitably much more expensive than that of regular, old-school 2D content. If there is no demand (i.e. no one has a 3D display at home/work), then the content providers do not have much of an incentive to bother creating the 3D content. Vice versa, if there is no content then consumers do not see much incentive to invest in (inevitably more expensive) 3D displays.

A "portal to the virtual environment", or as I like to call it, a 2.5D display could effectively solve this. If we could enhance every 2D display to get what you see in Johnny's and my videos (and I mean every: LCD, CRT, you-name-it), then suddenly everyone can consume the 3D content even without having the "fully" 3D display. At that point it starts making sense to mass-create 3D content.

The terms "fully" and 2.5D, however, require a bit of explanation.

CST Part II

Nearly a year has passed since the last update. In order to at least partially rectify this situation, I am planning to publish a series of posts describing what I have been up to during that time. A natural place to start then is the final year of the Computer Science undergraduate degree (also known as CST Part II) at Cambridge.

CST Part II undoubtedly offers the largest amount of flexibility out of all years of the Tripos. Three written exams (or as they are known in Cambridge, papers) have to be taken at the end of the year. Each paper contains fourteen questions, out of which five have to be selected and answered during a three hour exam.

This implies that it is enough to select a proper subset of subjects (out of twenty-four subjects offered in Part II) in order to do well in the final exams. This is where various strategies come into play.

Here is one that worked for me:

1. Choose the courses that you are really interested in,
2. Choose the courses that you are really good at.

The only remaining task then is to identify courses that satisfy the above criteria.

Let's start with #2: it is easy to be good at well-taught courses. That includes courses with well-written handouts, good exercise sheets and effective lecturers. Here is my list (but YMMV):

1. All courses by Prof John Daugman (this typically includes "Information Theory and Coding" and "Computer Vision"). There are multiple good reasons for taking these courses: first of all, it is really worth seeing a polymath at work. I have not seen anyone closer to a Renaissance man than John Daugman. Secondly, the questions for ITC and CV follow the round-robin pattern over the years and never deviate from the material in the learning guides.
2. "Business Studies" and "E-Commerce" courses by a serial entrepreneur-in-residence Jack Lang. I have been to a number of "business" courses, both within and outside the academic environment. I have not seen anyone eliminate the bullshit factor more effectively than Jack. This is both a tremendously effective skill to learn, and provides a very down-to-earth set of introductory lectures to anyone who aspires to be an entrepreneur. Exam-wise, Jack follows the same no-bullshit approach. The questions test your basic knowledge and provide an easy way to grab fifteen out of twenty marks (the median results for the year 2011-2012 were 15/20 for BS and 16/20 for E-C).
3. "Computer Systems Modelling" by Dr Richard Gibbens. Besides the fact that Dr Gibbens is a very good lecturer, he is also very stable when it comes to setting the exam questions (viz. one out of two questions is always about probability theory and the other is about queuing theory). Stability is predictability, and predictability is an expensive commodity in Cambridge exams.
4. "Bioinformatics" (also known as "Algorithms III") by Dr Pietro Lio'. Despite sporting a couple of PhDs, Dr Lio' is a tremendously helpful lecturer. Seriously. He demonstrated it multiple times by arranging additional examples classes, Q/A sessions, handing out complementary lecture material and, of course, by setting the exam questions that yielded "very good results, well spread but with very few low marks" (his own words).
5. "Temporal Logic and Model Checking" by Prof Mike Gordon (not to be confused with the evil "Hoare Logic" course!).

Temporal Logic and Model Checking scribbles...

Behind the nasty notation hurdle (see the image on the right) lies another self-contained and straightforward course. This is also reflected in the exam results: the median marks for both TL&MC questions in 2011/2012 were 17/20!

Once again, the list above should be taken with a grain of salt, especially for the relatively young courses like Bioinformatics or TL&MC, where a large variance in the exam question difficulty has not been statistically disproved.

However, written papers account for only 75% of the final grade. The remaining 25% is allocated for the Part II dissertation. One of the biggest challenges in Part II is juggling between writing up the dissertation and learning for the exams, especially in the Easter term.

Here is my solution:

1. Take all the courses in Michaelmas and Lent terms.
2. Do not take any courses (apart from business ones) in the Easter term.
3. Finish the implementation of the Part II project by the end of Christmas break. If you need to spend all your break on project work - do so; leave the revision for the Easter term (see below).
4. Finish the write-up of the dissertation in the first two weeks of the Easter term. If you need to spend the whole Easter break on the dissertation work - do so.
5. After the first two weeks of the Easter term you should be done with your dissertation and your courses (modulo a couple of business lectures per week). This means that between the first day of week three in the Easter term and the day minus-one of the first exam, you should be spending 95% of your time revising.

This approach has two main strengths: first of all, the amount of task juggling is reduced to a bare minimum. You work on your Part II project during the holidays, you study during the first two terms, and you spend all your time revising during the Easter term. Secondly, you avoid the "re-learning" (cf. Michaelmas courses in Parts 1A and 1B) since there is no new material between the revision and the exams to interfere. It tremendously reduces the amount of time required to prepare for 10-11 courses.

Finally, three tips for your revision:

Examination revision plan (Gantt chart in Microsoft Project)

1. Go for the quality, not for the quantity. It is much better to have six strong questions in each paper than seven or eight average ones. Most people will go for the latter, making a huge mistake. Having four strong answers yields you a first class result while still leaving two questions for risk management.
2. Find a suitable working space and establish a working regime. Separate "working" and "leisure" environments (e.g. work at the library instead of your college room). Seeing fifty people working hard around you both increases the motivation and decreases the temptation to procrastinate.
3. Create a revision plan and follow it. Download Microsoft Project from DreamSpark and create a Gantt chart (you can see mine above). I am not joking. It is surprisingly easy to slip up by a day-or-two ("I will revise these two topics tomorrow and I will be done with the course."). Having an interactive schedule which can be easily updated helps to see what trade-offs are actually being made ("I could revise these two topics tomorrow, but that will leave me only one day to revise the whole Information Retrieval subject").

So how does this all work in practice? For me this strategy yielded an average grade of >80% (well above the first-class threshold) and a Jennings prize for the third year in a row. YMMV.

Perhaps more importantly though, it left me with enough time to do a Part II project that was highly-commended by the Computer Laboratory, and a >200 page dissertation that was highly-commended in the international 2012 Undergraduate Awards. More on the Part II project will follow in the later posts.

It's not all doom and gloom. With Bjarne Stroustrup, creator of C++. Queens' College, Cambridge (2012)

All Shall Be Well

You step in the stream,
But the water has moved on.
-- Internet Folklore

Rather surprisingly even for me, it has already been over five months since my last post. Writing a thorough and yet succinct account about everything that happened over that time is way beyond my literary abilities, so here's a bullet-point summary about a few things worth mentioning from that period.

1. Cambridge CST Part IB Exams

Jennings Prize '11

Back in June 2011, folks in Computer Science Tripos Part IB (including me) had the opportunity to enjoy four consecutive days of exams. The structure of papers was straightforward: three hours, nine questions (one or two questions per course); five questions had to be answered.

If I could give a single piece of advice to someone who is about to go through this process, it would be "time". Three hours will not be enough to answer five questions, and there will be almost no time to think about anything if you get stuck. Most likely, the bottleneck in your answers will be the speed of your writing, so don't make a huge mistake by thinking that you will be able to do less revision and figure out things "on the spot". You won't.

At the end of the day, of course, it's not the grades that matter, but... receiving the thing on the left was still nice.

2. Internship at Microsoft

My office at Microsoft (Redmond, WA, 2011)

Three days after my last exam I arrived at my office at Microsoft, in Redmond, WA.

Over the summer, I was working as a Software Development Engineer in Test in the Microsoft Office team. In particular, I was writing a performance testing suite for the feature that is used across Office and Windows divisions when developing and updating over 50k pages of documentation on MSDN.

Microsoft Campus

Well, I say "working". The summer was literally packed with events for interns! Starting with the intern day of caring, when we spent a full working day volunteering for the community, and ending with the huge intern celebration involving Dave Matthews and customized Xbox + Kinect bundles for everyone (photo on the right)!

Customized Xbox

And then, there were meetings and talks by senior people in the company. And by senior people, I don't mean my manager's manager. Just to throw in a few names, we heard talks from Andy Lees, Steven Sinofsky and, of course, Steve Ballmer (or as he's known within the company, SteveB). I even had a half-an-hour chat with the head of the Microsoft Office Division, Kurt DelBene - definitely one of my best experiences over the whole internship!

Also, the perks that you get as an intern are incredible: paid flights, subsidized housing, free bike, two weeks of free rental car, gym plan (and if you're around the Redmond, WA area - check out the Pro Sports Club). That, and all-you-can-drink soda at work! (OK, so you don't get free food as in some other places, but trust me, you will be able to afford food from your intern compensation package).

Seattle Skyline, WA, 2011

Finally, there are the people that you are working with. If you are as lucky as me, then everyone's going to be exceptionally smart, and nevertheless, extremely approachable. I wouldn't have achieved anything over this summer if not for my team; their help and guidance was invaluable. And not only limited to the working environment - the weekend trip to Vancouver, BC was pretty epic, and I already miss our regular Friday basketball games, just outside the office.

However, to make things even better, I have already received an offer for the summer of 2012! If there aren't any major changes, there is a good chance that I will spend my next summer working as a Program Manager intern in the Microsoft Office team!

Microsoft Campus (Redmond, WA, 2011)

3. Vacation

This year Ada and I decided to go to Turkey. We spent the final two weeks of September in the beautiful Alanya region: the weather (every single day above 30°C/86°F) justified the nickname of the region ("Where The Sun Smiles") and the Mediterranean Sea was simply magnificent. We were also very lucky with the choice of the hotel: two pools, five minutes away from the sea, extremely polite and courteous staff, and great all inclusive food and drinks!

Anyway, a picture is worth more than a thousand words, so here's a few of them. Enjoy!

P.S. Fact: it is possible to get a 500% discount when buying a leather jacket in Turkey. Verified. And it takes only a little bit over two hours of negotiating.

Backpropagation Tutorial

The PhD thesis of Paul J. Werbos at Harvard in 1974 described backpropagation as a method of teaching feed-forward artificial neural networks (ANNs). In the words of Wikipedia, it led to a "renaissance" in ANN research in the 1980s.

As we will see later, it is an extremely straightforward technique, yet most of the tutorials online seem to skip a fair amount of details. Here's a simple (yet still thorough and mathematical) tutorial of how backpropagation works from the ground-up; together with a couple of example applets. Feel free to play with them (and watch the videos) to get a better understanding of the methods described below!

Training a single perceptron (linear classifier)

Training a multilayer neural network

1. Background

To start with, imagine that you have gathered some empirical data relevant to the situation that you are trying to predict - be it fluctuations in the stock market, chances that a tumour is benign, likelihood that the picture that you are seeing is a face or (like in the applets above) the coordinates of red and blue points.

We will call this data training examples and we will describe the $i$th training example as a tuple $(\vec{x_i}, y_i)$, where $\vec{x_i} \in \mathbb{R}^n$ is a vector of inputs and $y_i \in \mathbb{R}$ is the observed output.

Ideally, our neural network should output $y_i$ when given $\vec{x_i}$ as an input. In case that does not always happen, let's define the error measure as a simple squared distance between the actual observed output and the prediction of the neural network: $E := \sum_i (h(\vec{x_i}) - y_i)^2$, where $h(\vec{x_i})$ is the output of the network.
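The error measure defined above is easy to compute directly. Here is a minimal sketch, assuming a hypothetical stand-in predictor `h` (not a real network) and made-up training examples.

```python
def squared_error(h, examples):
    """E = sum_i (h(x_i) - y_i)^2 over training examples (x_i, y_i)."""
    return sum((h(x) - y) ** 2 for x, y in examples)


def h(x):
    """Hypothetical predictor standing in for the network: sum of two inputs."""
    return x[0] + x[1]


# Made-up training examples as (input vector, observed output) tuples.
examples = [((1.0, 2.0), 3.0), ((0.0, 1.0), 2.0), ((2.0, 2.0), 3.0)]
error = squared_error(h, examples)   # 0 + 1 + 1 = 2.0
```

Training then amounts to adjusting whatever parameters `h` depends on so that this number goes down.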

2. Perceptrons (building-blocks)

The simplest classifiers out of which we will build our neural network are perceptrons (fancy name thanks to Frank Rosenblatt). In reality, a perceptron is a plain-vanilla linear classifier which takes a number of inputs $a_1, ..., a_n$, scales them using some weights $w_1, ..., w_n$, adds them all up (together with some bias $b$) and feeds everything through an activation function $\sigma : \mathbb{R} \rightarrow \mathbb{R}$.

A picture is worth a thousand equations:

Perceptron (linear classifier)

To slightly simplify the equations, define $w_0 := b$ and $a_0 := 1$. Then the behaviour of the perceptron can be described as $\sigma(\vec{a} \cdot \vec{w})$, where $\vec{a} := (a_0, a_1, ..., a_n)$ and $\vec{w} := (w_0, w_1, ..., w_n)$.

To complete our definition, here are a few examples of typical activation functions:

• sigmoid: $\sigma(x) = \frac{1}{1 + \exp(-x)}$,
• hyperbolic tangent: $\sigma(x) = \tanh(x)$,
• plain linear $\sigma(x) = x$ and so on.
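The perceptron and the activation functions listed above can be sketched in a few lines. This is a minimal illustration, assuming the convention $w_0 := b$ and $a_0 := 1$ from the text; the function names and the example weights are made up.

```python
import math


def sigmoid(x):
    """Sigmoid activation: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def linear(x):
    """Plain linear activation."""
    return x


def perceptron(inputs, weights, activation):
    """Compute sigma(a . w) for inputs a_1..a_n and weights w_0..w_n (w_0 = bias)."""
    a = [1.0] + list(inputs)   # prepend a_0 := 1 so that w_0 acts as the bias b
    if len(a) != len(weights):
        raise ValueError("expected one weight per input plus a bias weight")
    return activation(sum(ai * wi for ai, wi in zip(a, weights)))


weights = [-1.0, 2.0, 0.5]   # w_0 = b = -1, w_1 = 2, w_2 = 0.5 (assumed values)
inputs = [0.5, 0.0]          # a_1, a_2

out_linear = perceptron(inputs, weights, linear)      # -1 + 2*0.5 + 0.5*0 = 0.0
out_sigmoid = perceptron(inputs, weights, sigmoid)    # sigmoid(0) = 0.5
out_tanh = perceptron(inputs, weights, math.tanh)     # tanh(0) = 0.0
```

Passing the activation in as a function keeps the weighted sum identical across all three variants, which mirrors the single formula $\sigma(\vec{a} \cdot \vec{w})$.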

Now we can finally start building neural networks.

Halfway There

Another term in Cambridge has gone by - four out of nine to go. In the meantime, here's a quick update of what I've been up to in the past few months.

1. Microsoft internship

Redmond, WA, 2011

In January I had the opportunity to visit Microsoft's headquarters in Redmond, WA, to interview for the Software Development Engineer in Test intern position in the Office team. In short - a great trip, in every aspect.

I left London Heathrow on January 11th, 2:20 PM and landed in Seattle-Tacoma at 4:10 PM (I suspect that there might have been a few time zones in between those two points). I arrived at the Marriott Redmond roughly an hour later, which meant that because of my anti-jetlag technique ("do not go to bed until 10-11 PM in the new timezone no matter what") I had a few hours to kill. Ample time to unpack, grab dinner in the Marriott's restaurant and go for a short stroll around Redmond before going to sleep.

On the next day I had four interviews arranged. The interviews themselves were absolutely stress-free; it felt more like a chance to meet and have a chat with some properly smart (and down-to-earth) folks.

Top of the Space Needle. Seattle, WA, 2011

The structure of the interviews seemed fairly typical: each interview consisted of some algorithm/data structure problems, a short discussion about the past experience and the opportunity to ask questions (obviously a great chance to learn more about the team/company/company culture, etc). Since this was my third round of summer internship applications (I have worked as a software engineer for Wolfson Microelectronics in '09 and Morgan Stanley in '10), everything made sense and was pretty much what I expected.

My trip ended with a quick visit to Seattle on the next day: a few pictures of the Space Needle, a cup of Seattle's Best Coffee and there I was on my flight back to London, having spent \$0.00 (yeap, Microsoft paid for everything - flights, hotel, meals, taxis, etc). Even so, the best thing about Microsoft definitely seemed to be the people working there; since I have received and accepted the offer, we'll see if my opinion remains unchanged after this summer!

2. Lent term v2.0

TrueMobileCoverage group project

Well, things are still picking up speed. Seven courses with twenty-eight supervisions in under two months, plus managing a group project (crowd-sourcing mobile network signal strength, the link is on the left), a few basketball practices each week on top of that and you'll see the reason why this blog has not been updated for a couple of months.

It's not all doom and gloom, of course. Courses themselves are great, lecturers make some decently convoluted material understandable in minutes and an occasional formal hall (e.g. below) also helps.

All in all, my opinion, that Cambridge provides a great opportunity to learn a huge amount of material in a very short timeframe, remains unchanged.

There will be more to come about some cool things that I've learnt in separate posts, but now speaking of learning - it's revision time... :-)

Me and Ada at the CompSci formal. Cambridge, England, 2011

Conway's Game of Life (cont.)

"Beauty in things exists in the mind which contemplates them."
- David Hume (1711-1776)

Conway's Game of Life theme continues. Here is a short video with the Game of Life, this time running on Altera DE2 FPGA board with custom soft MIPS CPU.

Game of Life running on Altera DE2 FPGA board.

Morgan Stanley Internship

After work (Canary Wharf, 2010)

Last week I received a short e-mail from my former manager at Morgan Stanley:

"Hi Manfred,

Just to let you know that GlobalAxe all went live last week and so far no issues at all."

Since the people on the trading floor started using my system and it seems to be standing on its feet so far, it probably is a good time to recap on what had happened over my ten week internship at Morgan Stanley.

I was working as a technology analyst in the repo trading team (in the institutional securities group). My task was to develop and integrate a new screen into the trading software, to create an associated e-mail subsystem generating daily/weekly reports for senior executives and to code a website which would provide access to the data for executives/sales people without the trading software on their machines.

Development-wise it involved working with quite a wide range of technologies, such as C# and CAB for UI development, Java/Spring for e-mail report generation/server backend, MVC under ASP.NET for the website, Transact-SQL for Sybase DB backend; everything interconnected with SOAP/XML and distributed locally over in-house pubsub systems or through IBM's MQ for inter-continental data transactions.

Even though working with and learning about all these technologies was fun in its own right, the best thing I would say about my experience was the people.

Night at Canary Wharf, 2010

There is no better feeling than having a quick call with traders in New York demoing them the stuff that you just wrote, then dropping an e-mail to Tokyo checking if your recent changes made it through to their database, discussing the architecture of your system with the guys in your team and then going to the global team video-meeting; all in the same day.

And sometimes you feel the need to pinch yourself, because the level of responsibility that you get as an intern is staggering. You have the same rights and responsibilities as any other team member: a screw up in your code can block sixty people from submitting their code before the end of the iteration, a failure to convince the head of traders in NY that what you are doing is going to help them will affect the name of the whole team, and so on.

But then, you own your project: you make the final design decisions, you implement it and you give it to the end-users, who often appear to be bigshots. And that more than makes up for a few late nights in the office. Plus, Canary Wharf is absolutely beautiful at night.

Without expanding too much (and breaching too many non-disclosure agreements) - it was definitely the best experience so far: in terms of team, project, technology, skill, involvement and everything else. And it seems like I will have a chance to repeat it again: I have already received an unconditional offer for the internship at MS next summer!

Oh, and regarding the summer days spent in glass, steel and stone towers... well, Majorca more than made up for it!

CST Part 1A

... otherwise known as Part 1A of the Computer Science Tripos at the University of Cambridge has officially ended.

All in all, a rather enjoyable year. From the introduction to ML by the brilliant Prof. Larry Paulson, to the realms of Discrete Mathematics II (with a wicked proof of the existence of ordinal numbers in the exam); from Algorithms to Software Design, from Digital Electronics to Operating Systems, from Floating-Point Computation to Regular Languages and Finite Automata; and everything in between.

A crash course, but the one that is definitely worth going through.

If you're just about to come to Cambridge (or just starting your part 1A), here are a few simple tips that proved to be helpful for me:

• Don't fall behind - in lectures, ticks, homeworks, supervision assignments - at the Cambridge pace it's difficult to catch up.
• Do things in advance - it's usually a very good idea and pays off well.
• Make sure that you keep your work/life balance: do a bit of sports (many choices on the University of Cambridge Societies website) and go out once in a while. Paradoxically, having a few hours off in a week will help you to stay on top of things.
• Finally, keep an eye on these subjects: Discrete Mathematics II, Operating Systems II, Floating-Point Computation, HW ticks (sorted by the effort they took from me, in decreasing order). They might be different for you, but these particular ones are worth being aware of.

But most importantly - enjoy what you're doing.
Good luck and have fun.

"The Hard Way is the Right Way"

While Europe is paralyzed by the volcanic ash and I am restrained from coming back to Cambridge, here is a quick recap of what I have been doing for the past couple of months.

1. Guitar

It looks like it is becoming a tradition - another vacation ends with a guitar video. This time - a short slow blues in Stefan Grossman's style. The usual apologies for the sound, video and playing quality apply.

2. Professional

I was accepted as a student member of the Chartered Institute for IT (formerly the British Computer Society). For the next two years (at least) I should be reachable through this e-mail: manfredas.zabarauskas@bcs.org.

3. Studies

Without getting into the gory details, the outcomes of the Lent (read: second) term in Cambridge can be summarized by:

Conway's Game of Life

Mandelbrot set (1280x1024)

1. a wallpaper "spit out" by ML depicting a rather standard Mandelbrot set fractal (on the right),
2. an animation "spit out" by Java depicting eight hundred generations of the spacefiller pattern in Conway's Game of Life (further on the right),
3. and an insane amount of material to prepare for the upcoming exams; both broad and not as shallow as I expected.

I am still enjoying it immensely, even though the amount of my spare time has decreased. (As an afterthought - I still had time to play basketball for Blues, Lions and my college, so maybe it was not that bad).

4. Internship

Over the summer I will be working as a Technology Analyst for Morgan Stanley, in their Innovative Data, Environments, Analytics & Systems (IDEAS) group.

Sunrise over Canary Wharf

During the interviews Morgan Stanley really left a very good impression both at the level of knowledge of the people working there and the communication and culture inside the company. We will see if that applies in day-to-day situations. In any case, I have nothing against spending the summer in Canary Wharf and seeing the investment banking industry from inside.

It will surely give me more things to write about in this blog, so be sure to check back once in a while.

Competition: internship offer packs from Morgan Stanley and Citigroup.