I. Building
First, I downloaded the Spark source code (v1.6.0.zip) and extracted all the files.
・Apache Spark source code
https://github.com/apache/spark/releases
v1.6.0.zip
・after extracting
C:\enjyoyspace
├─spark-1.6.0
│  ├─assembly
│  ├─bagel
│  ├─bin
│  ├─build
Next, I built a Spark package for Hadoop 2.6.0 with Apache Maven, compiling against Scala 2.10.6. Apache Maven is so lovable :) The pom.xml change and the build command are as follows.
・changes of pom.xml
C:\enjyoyspace\spark-1.6.0\pom.xml
before: <scala.version>2.10.5</scala.version>
after: <scala.version>2.10.6</scala.version>
・command
cd C:\enjyoyspace\spark-1.6.0
mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -DskipTests clean package
(install Apache Maven: https://maven.apache.org/install.html)
The Spark package (Hadoop 2.6.0) was produced.
・Spark package (Hadoop 2.6.0)
C:\enjyoyspace\spark-1.6.0\assembly\target\scala-2.10\spark-assembly-1.6.0-hadoop2.6.0.jar
II. Operation Check
I checked that Spark works by running a small interactive program in spark-shell. It ran normally :)
・command
C:\enjyoyspace\spark-1.6.0>bin\spark-shell
・using Spark
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/
Using Scala version 2.10.6 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc.
SQL context available as sqlContext.
scala> val lines = sc.parallelize(List("Sqoop", "from external datastores into HDFS", "Julia", "Julia is a high-performance dynamic programming language", "JuliaCon 2016"))
lines: org.apache.spark.rdd.RDD[String] = ParallelCollectionRDD[0] at parallelize at <console>:27
scala> lines.count()
res0: Long = 5
scala> val words = lines.flatMap(line => line.split(" "))
words: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[1] at flatMap at <console>:29
scala> words.count()
res1: Long = 16
scala>
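To see why the session reports 5 lines but 16 words, here is a minimal sketch that repeats the same split on a plain Scala List instead of an RDD. It is only an illustration and not part of the session above: the names sampleLines, wordsPerLine and allWords are my own, and it assumes nothing beyond the Scala standard library, so it can be pasted into either spark-shell or the ordinary Scala REPL.

```scala
// Plain Scala collections only: no SparkContext is needed here.
// sampleLines, wordsPerLine and allWords are illustrative names, not from the session.
val sampleLines = List(
  "Sqoop",
  "from external datastores into HDFS",
  "Julia",
  "Julia is a high-performance dynamic programming language",
  "JuliaCon 2016"
)

// map keeps exactly one result per input line: here, the word count of that line.
val wordsPerLine = sampleLines.map(line => line.split(" ").length)

// flatMap splits every line and flattens the pieces into a single list of words.
val allWords = sampleLines.flatMap(line => line.split(" "))

println(sampleLines.size)  // 5
println(wordsPerLine)      // List(1, 5, 1, 7, 2)
println(allWords.size)     // 16
```

The per-line counts add up to 16, which matches what words.count() returned on the RDD. On an RDD, flatMap plays the same role, except that the split runs per partition and nothing is computed until an action such as count() is called.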
・Spark Web UI (http://localhost:4040/jobs/)