Working with Hadoop

My last tweet was “Its after midnight and i just finished my first MapReduce application. Next up, blogging about it.”

Well, now it’s time to blog about it. And share some code. Won’t that be fun?

So I’ve always wanted to do some distributed programming or parallel programming (technically not the same, but they aim at solving similar problems), but the barrier to entry was always too high and the learning curve a bit steep. At least for a hobby. I remember getting a demo account on my university’s cluster after attending a workshop when I was still a student, but I never had the time to try it out and it silently expired.

Well, I’ve been reading about Hadoop for quite some time, and I finally decided to dedicate some time to learning how to use it. I was pleasantly surprised. I gotta say, that’s one well-written application. Doug Cutting is one exceptional programmer; not that he’s waiting for my recognition. The project wouldn’t be where it is without the Hadoop community, so kudos to all.

The whole experience only took a couple of hours. That included reading the documentation, setting up Hadoop as a single-node cluster, and writing my first application.

Hadoop could certainly use some more tutorials. The documentation is very good, but at first glance it might look a little overwhelming. Let’s hope that more people will use it and share their experience.

Next up: my first app, the problems I faced, and my comments.