

What is SparkConf?

SparkConf provides the configuration for any Spark application. To start a Spark application on a local machine or a cluster, we need to set some configuration parameters, and this is done using SparkConf.

Features of SparkConf and their usage

The most commonly used features of SparkConf when working with PySpark are given below:

  • set(key, value) - sets a configuration property.
  • setMaster(value) - sets the master URL to connect to.
  • setAppName(value) - sets the application name.
  • get(key, defaultValue=None) - gets the configuration value of a key, returning defaultValue if the key is not set.
  • setSparkHome(value) - sets the Spark installation path on worker nodes.

Consider the following example, which uses some attributes of SparkConf and prints the application name, 'PySpark Demo App'.

The first thing any Spark program does is create a SparkContext object, which tells the application how to access a cluster. To accomplish this, you pass a SparkConf so that the SparkContext object contains the configuration information for the application. Below we describe the SparkContext in detail:


What is SparkContext?

The SparkContext is the first and most essential thing that gets initiated when we run any Spark application. Generating a SparkContext is the most important step of any Spark driver application, as it is the entry gate for all Spark functionality. In the PySpark shell it is available as sc by default.

Note: Remember that only one SparkContext can be active per process, so creating another one alongside sc will give an error.


SparkContext accepts the following parameters, described below:

  • master - The URL of the cluster that Spark connects to.
  • appName - The name of your application.
  • sparkHome - The Spark installation directory.
  • pyFiles - The .zip or .py files to send to the cluster and add to the PYTHONPATH.
  • environment - The environment variables for the worker nodes.
  • batchSize - The number of Python objects represented as a single Java object. Set it to 1 to disable batching, 0 to choose the batch size automatically based on object sizes, or -1 to use an unlimited batch size.
  • serializer - The RDD serializer.
  • conf - An object of L{SparkConf} to set all the Spark properties.
  • profiler_cls - A class of custom profiler used to do the profiling; pyspark.profiler.BasicProfiler is the default one.

Master and appName are the most widely used among these parameters. The following is the initial code for any PySpark application.
