Connecting Spark with Hadoop

Awantik Das edited this page Jan 25, 2019 · 1 revision

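Set the following properties in the Spark configuration file (typically conf/spark-defaults.conf under the Spark installation directory).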

Example:

spark.master yarn

spark.eventLog.enabled true

spark.eventLog.dir hdfs://localhost:9000/spark-logs

spark.serializer org.apache.spark.serializer.KryoSerializer

spark.driver.memory 5g

spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"

spark.history.provider org.apache.spark.deploy.history.FsHistoryProvider

spark.history.fs.logDirectory hdfs://localhost:9000/spark-logs

spark.history.fs.update.interval 10s

spark.history.ui.port 18080
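For the history server to show finished applications, the directory that applications write event logs to (spark.eventLog.dir) should be the same one the history server reads from (spark.history.fs.logDirectory). The following is a minimal sketch of a sanity check for that, not part of Spark itself: `parse_spark_conf` and `check_log_dirs` are hypothetical helper names, and the parser assumes the simple "key, whitespace, value" layout used in the example above (it does not handle the `=` or `:` separators that Java properties files also allow).

```python
def parse_spark_conf(text):
    """Parse spark-defaults.conf-style text into a dict (hypothetical helper).

    Each non-empty, non-comment line is a key followed by whitespace and
    the value; the value itself may contain spaces.
    """
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)  # split on the first run of whitespace only
        conf[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return conf


def check_log_dirs(conf):
    """Return True if event logging is on and both log settings agree."""
    enabled = conf.get("spark.eventLog.enabled", "false").lower() == "true"
    written = conf.get("spark.eventLog.dir")
    read = conf.get("spark.history.fs.logDirectory")
    return enabled and written is not None and written == read


# A subset of the properties from the example above.
EXAMPLE = """\
spark.master yarn
spark.eventLog.enabled true
spark.eventLog.dir hdfs://localhost:9000/spark-logs
spark.history.fs.logDirectory hdfs://localhost:9000/spark-logs
"""

conf = parse_spark_conf(EXAMPLE)
print(check_log_dirs(conf))
```

To check a real installation, the same functions could be fed the contents of the configuration file instead of the `EXAMPLE` string.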

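  1. Start the history server from Spark's sbin directory:

     ~/packages/spark-2.4.0-bin-hadoop2.7/sbin$ ./start-history-server.sh

     With the settings above, the history server UI is served on port 18080.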
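Once the history server is running, one way to confirm it is reachable is to query its REST endpoint for the list of applications. The sketch below assumes the server runs on localhost with the port 18080 configured above; `applications_url` and `list_applications` are hypothetical helper names, and only the Python standard library is used.

```python
import json
from urllib.request import urlopen


def applications_url(host="localhost", port=18080):
    """Build the URL of the history server's application listing."""
    return f"http://{host}:{port}/api/v1/applications"


def list_applications(host="localhost", port=18080, timeout=5):
    """Fetch the application list as parsed JSON.

    Raises an error from urllib if the history server is not reachable.
    """
    with urlopen(applications_url(host, port), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `list_applications()` should return one entry per application whose event log the server has found. An empty list most likely means that no application has written to the log directory yet.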