forked from apache/spark
[SPARK-6511] [DOCUMENTATION] Explain how to use Hadoop provided builds
This provides preliminary documentation pointing out how to use the Hadoop free builds. I am hoping over time this list can grow to include most of the popular Hadoop distributions. Getting more people using these builds will help us long term reduce the number of binaries we build.

Author: Patrick Wendell <[email protected]>

Closes apache#6729 from pwendell/hadoop-provided and squashes the following commits:

1113b76 [Patrick Wendell] [SPARK-6511] [Documentation] Explain how to use Hadoop provided builds
Showing 2 changed files with 33 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
--- | ||
layout: global | ||
displayTitle: Using Spark's "Hadoop Free" Build | ||
title: Using Spark's "Hadoop Free" Build | ||
--- | ||
|
||
Spark uses Hadoop client libraries for HDFS and YARN. Starting in Spark 1.4, the project packages "Hadoop free" builds that let you more easily connect a single Spark binary to any Hadoop version. To use these builds, you need to modify `SPARK_DIST_CLASSPATH` to include Hadoop's package jars. The most convenient way to do this is to add an entry in `conf/spark-env.sh`. | ||
|
||
This page describes how to connect Spark to Hadoop for different types of distributions. | ||
|
||
# Apache Hadoop | ||
For Apache distributions, you can use Hadoop's 'classpath' command. For instance: | ||
|
||
{% highlight bash %} | ||
### in conf/spark-env.sh ### | ||
|
||
# If 'hadoop' binary is on your PATH | ||
export SPARK_DIST_CLASSPATH=$(hadoop classpath) | ||
|
||
# With explicit path to 'hadoop' binary | ||
export SPARK_DIST_CLASSPATH=$(/path/to/hadoop/bin/hadoop classpath) | ||
|
||
# Passing a Hadoop configuration directory | ||
export SPARK_DIST_CLASSPATH=$(hadoop --config /path/to/configs classpath) | ||
|
||
{% endhighlight %} |
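As a rough sketch of a more defensive entry for `conf/spark-env.sh`, the variant below only sets `SPARK_DIST_CLASSPATH` when a `hadoop` binary can actually be found on `PATH`. The guard and the warning text are illustrative assumptions, not part of Spark or of this commit.

```shell
### in conf/spark-env.sh (illustrative sketch) ###

# Only call 'hadoop classpath' if a 'hadoop' binary is on PATH;
# otherwise leave SPARK_DIST_CLASSPATH untouched and print a warning.
if command -v hadoop >/dev/null 2>&1; then
  export SPARK_DIST_CLASSPATH="$(hadoop classpath)"
else
  # Hypothetical warning message, written to stderr.
  echo "warning: 'hadoop' not found on PATH; SPARK_DIST_CLASSPATH not set" >&2
fi
```

Quoting the command substitution keeps the value intact if any classpath entry contains spaces or glob characters.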