Running Hadoop on Mac OS X Single Node Cluster

This guide will get you past the troubles of getting Hadoop installed and running on Mac OS X. I’ve tested it for Hadoop 2.2.x and OS X 10.9.

The meat of the process you need to follow is well documented by Michael Noll. I’m only going to add the steps that OS X requires over and above the Ubuntu guide. You should probably use brew unless you have a great reason not to.
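If you do go the Homebrew route, the install itself is a one-liner (Homebrew typically puts Hadoop under /usr/local/Cellar/hadoop, so substitute that path wherever <HADOOP> appears below):

brew install hadoop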

In <HADOOP>/etc/hadoop/hadoop-env.sh, add or edit these lines:

# The java implementation to use.
export JAVA_HOME=`/usr/libexec/java_home -v 1.6`

# Extra Java runtime options. Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
export HADOOP_OPTS="$HADOOP_OPTS -Djava.awt.headless=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="

YARN_OPTS="$YARN_OPTS -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk -Djava.awt.headless=true"
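After saving hadoop-env.sh, a quick sanity check that the install and environment are picked up (again assuming <HADOOP> is your install directory) is:

<HADOOP>/bin/hadoop version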

For X11 permissions problems over ssh, run:

xhost +

Make sure you enable ssh using System Preferences > Sharing > Remote Login.
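If ssh localhost still prompts for a password, set up a passwordless key as in Michael’s guide (this assumes you don’t already have a key you’d rather reuse):

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

ssh localhost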

Starting Hadoop.
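If this is a brand-new install, format HDFS once before the first start (this wipes any existing HDFS metadata, so only do it on a fresh setup):

<HADOOP>/bin/hdfs namenode -format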

<HADOOP>/sbin/start-dfs.sh

<HADOOP>/sbin/start-yarn.sh
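If everything came up, jps (which ships with the JDK) should list NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager alongside Jps itself:

jps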

Running the WordCount Examples (download the files as in Michael’s guide).

bin/hdfs dfs -mkdir -p /user/hduser/gutenberg

bin/hdfs dfs -copyFromLocal -f /tmp/gutenberg/* /user/hduser/gutenberg
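You can confirm the upload with:

bin/hdfs dfs -ls /user/hduser/gutenberg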

bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /user/hduser/gutenberg /user/hduser/gutenberg-output

If the examples jar isn’t at that path on your install, locate it with:

find /usr/local/hadoop -name "hadoop*examples*.jar"
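Once the job completes, the results land in part files under the output directory (for the new MapReduce API the reducer output is typically named part-r-00000):

bin/hdfs dfs -ls /user/hduser/gutenberg-output

bin/hdfs dfs -cat /user/hduser/gutenberg-output/part-r-00000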

Conversion Details with Brecht Palombo

I posted an interview with Brecht over at Keepify. He started and runs Distressed Pro, a sales intelligence tool for real estate brokers. The interview is a deep dive into the details of how he acquired early customers and what he does now to make conversions happen.

Also, check out the Bootstrapped with Kids podcast where he and Scott Yewell discuss the progress of their online business pursuits each week.