Running Hadoop on Mac OS X Single Node Cluster

This guide will get you past the trouble of getting Hadoop installed and running on Mac OS X. I’ve tested it with Hadoop 2.2.x on OS X 10.9.

The bulk of the process is well documented in Michael Noll’s Ubuntu guide. I’m only going to add the extra steps that OS X requires on top of that guide. You should probably use Homebrew (brew) for the installation unless you have a great reason not to.

In <HADOOP>/etc/hadoop/hadoop-env.sh (where <HADOOP> is your Hadoop installation directory), add or edit these lines:

# The java implementation to use.

export JAVA_HOME=`/usr/libexec/java_home -v 1.6`

# Extra Java runtime options.  Empty by default.

export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

export HADOOP_OPTS="$HADOOP_OPTS -Djava.awt.headless=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="

YARN_OPTS="$YARN_OPTS -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk -Djava.awt.headless=true"
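The JAVA_HOME line above depends on the OS X helper /usr/libexec/java_home. As a minimal sketch, assuming you would rather get a clear message than a silently empty JAVA_HOME when that helper is missing, the lookup can be wrapped in a small function. The name resolve_java_home is hypothetical and not part of Hadoop.

```shell
#!/usr/bin/env bash
# Sketch: resolve JAVA_HOME through the OS X java_home helper.
# resolve_java_home is a hypothetical helper name, not a Hadoop function.
# Arguments: [version, default 1.6] [path to helper, default /usr/libexec/java_home]
resolve_java_home() {
  local version="${1:-1.6}"
  local helper="${2:-/usr/libexec/java_home}"
  if [ ! -x "$helper" ]; then
    echo "java_home helper not found at $helper" >&2
    return 1
  fi
  "$helper" -v "$version"
}

# Only overwrite JAVA_HOME when the lookup succeeded.
if java_home="$(resolve_java_home 1.6)"; then
  export JAVA_HOME="$java_home"
  echo "JAVA_HOME=$JAVA_HOME"
else
  echo "Could not resolve a Java 1.6 home; set JAVA_HOME by hand." >&2
fi
```

Keeping the result in a temporary variable means an existing JAVA_HOME is left alone if the helper is unavailable.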

If you hit X11 permission problems over ssh, run the following (note that it disables X access control for all hosts):

xhost +

Make sure you enable ssh using System Preferences > Sharing > Remote Login.

Starting Hadoop:

<HADOOP>/sbin/start-dfs.sh

<HADOOP>/sbin/start-yarn.sh
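Once both scripts have run, it helps to confirm that the daemons actually came up. The sketch below assumes the JDK's jps tool is on your PATH and checks its output for the five daemons a Hadoop 2.x single-node setup normally starts. The function name missing_daemons is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list expected Hadoop 2.x daemons that are absent from jps output.
# missing_daemons is a hypothetical helper; it reads jps-style lines
# ("<pid> <name>") on stdin and prints one missing daemon name per line.
missing_daemons() {
  local running d
  running="$(awk '{print $2}')"
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # -x matches whole lines, so "SecondaryNameNode" does not count as "NameNode".
    if ! printf '%s\n' "$running" | grep -qx "$d"; then
      echo "$d"
    fi
  done
}

# Run the check only when jps is available.
if command -v jps >/dev/null 2>&1; then
  missing="$(jps | missing_daemons)"
  if [ -z "$missing" ]; then
    echo "All expected daemons are running."
  else
    echo "Not running:" $missing
  fi
fi
```

If a daemon is reported as not running, the log files under the Hadoop logs directory are usually the first place to look.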

Running the WordCount example (download the files as in Michael’s guide). The paths below are relative, so run these from <HADOOP>:

bin/hdfs dfs -mkdir -p /user/hduser/gutenberg

bin/hdfs dfs -copyFromLocal -f /tmp/gutenberg/* /user/hduser/gutenberg

bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /user/hduser/gutenberg /user/hduser/gutenberg-output

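If the examples jar is not at that path, locate it (this assumes Hadoop lives under /usr/local/hadoop):

find /usr/local/hadoop -name "hadoop*examples*.jar"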
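To sanity-check what the job produces, you can compute the same kind of counts locally on a small input. This is only a rough approximation of the WordCount example, assuming it splits on whitespace and emits one word and its count per line, separated by a tab. The function name local_wordcount is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: approximate WordCount locally with standard Unix tools.
# local_wordcount is a hypothetical helper; it reads text on stdin and
# prints "<word><TAB><count>" lines sorted by word.
local_wordcount() {
  tr -s '[:space:]' '\n' \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{print $2 "\t" $1}'
}

# Small demonstration on inline text.
printf 'the cat the dog\nthe end\n' | local_wordcount
```

You could then compare a few words against the job's output in HDFS, for example with bin/hdfs dfs -cat on the files under /user/hduser/gutenberg-output. The exact output file names depend on the job, so list the directory first.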
