Wednesday, March 7, 2012

Heap space error using the Hadoop tutorial with the Cloudera VMware virtual machine

Today I was trying out the Cloudera Hadoop tutorial together with their pre-configured VMware virtual machine running CentOS (the Cloudera Hadoop VMs).

After putting everything in place, I launched the hadoop command:


hadoop jar wordcount.jar org.myorg.wordcount.WordCount /usr/joe/wordcount/input /usr/joe/wordcount/output


and, surprisingly, it kept hanging at map 0%, reduce 0%.

I had to kill the virtual machine and reboot.
The next time, I got the following output:


Hmm, so it seems Hadoop is not happy with the amount of heap space it can get.

After some trial and error, the solution was to increase the memory setting of the virtual machine. The Cloudera VM comes configured with 1 GB of memory. Changing this to 2 GB and rebooting solved the problem.
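
To confirm that the guest really sees the new memory size after the reboot, a small check like the one below can help. This is only a minimal sketch, assuming a Linux guest that exposes /proc/meminfo (as CentOS does). The function names and the 1800 MB threshold are my own illustration and not part of the Cloudera tutorial; the threshold sits a bit under 2048 MB because the kernel reports slightly less than the configured amount.

```shell
#!/bin/sh
# Print total memory in MB, read from a meminfo-style file.
# Defaults to /proc/meminfo; the optional argument is only there for testing.
mem_total_mb() {
    # MemTotal is reported in kB, so divide by 1024 to get MB.
    awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Succeed (exit status 0) when the machine has less than roughly 2 GB.
# The 1800 MB threshold is an assumption, chosen to leave some margin.
needs_more_memory() {
    [ "$(mem_total_mb "$1")" -lt 1800 ]
}

if needs_more_memory; then
    echo "Only $(mem_total_mb) MB visible: consider raising the VM memory to 2 GB"
else
    echo "$(mem_total_mb) MB visible: memory setting looks fine"
fi
```

On the 1 GB default the script should print the first message, and after raising the setting to 2 GB the second. The memory size itself is changed in the virtual machine settings of the VMware product while the VM is powered off.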