Lazy Evaluation in Apache Spark and Its Advantages

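Lazy evaluation means that execution does not start until an action is triggered.
Transformations are lazy by nature: when we call a transformation such as map or filter on an RDD, Spark does not execute it immediately. It only records the operation in the RDD's lineage, and it runs the recorded operations when an action such as collect or count needs a result.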
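The idea can be sketched without a cluster. The block below is a toy model in plain Python, not Spark's API: `LazyDataset` and `traced_double` are hypothetical names invented for illustration. Transformations only record what to do, and the recorded work runs when the `collect` action is called.

```python
class LazyDataset:
    """Toy stand-in for an RDD (an assumption, not Spark code).

    Transformations return a new LazyDataset with one more recorded step;
    nothing touches the data until an action such as collect() is called.
    """

    def __init__(self, source, ops=()):
        self._source = source
        self._ops = tuple(ops)

    def map(self, fn):
        # Transformation: record the step, do not run it.
        return LazyDataset(self._source, self._ops + (("map", fn),))

    def filter(self, pred):
        # Transformation: record the step, do not run it.
        return LazyDataset(self._source, self._ops + (("filter", pred),))

    def collect(self):
        # Action: replay every recorded step over the source data.
        out = []
        for item in self._source:
            keep = True
            for kind, fn in self._ops:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                out.append(item)
        return out


calls = []  # records every element the map function actually processed


def traced_double(x):
    calls.append(x)
    return x * 2


ds = LazyDataset([1, 2, 3, 4]).map(traced_double).filter(lambda x: x > 4)
calls_before_action = len(calls)  # no work has happened yet

result = ds.collect()             # the action triggers execution
calls_after_action = len(calls)
```

Before `collect` is called, `traced_double` has not run at all; only the action causes the four elements to be processed. In a real Spark program the same pattern would be written against an RDD or DataFrame, and Spark, not this toy class, would decide how to execute it.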

Some advantages of lazy evaluation in Spark:

  1. Increases manageability: With lazy evaluation, users can freely organize their Spark program into smaller operations. Spark groups these operations together, which reduces the number of passes over the data.
  2. Saves computation and increases speed: Lazy evaluation plays a key role in saving computation overhead. A value that is never used does not need to be calculated, so only the necessary values are computed. It also saves round trips between the driver and the cluster, which speeds up the process.
  3. Reduces complexity: The two main complexities of any operation are time complexity and space complexity, and lazy evaluation helps with both. Since not every operation is executed, time is saved. In principle, laziness also lets us describe operations over very large or even infinite data structures, because elements are produced only on demand. An action is triggered only when the data is required, which reduces overhead.
  4. Optimization: Because Spark sees the whole chain of transformations before running anything, it can optimize the plan, for example by reducing the number of queries.
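Points 2 and 3 can be illustrated with Python's built-in lazy iterators. This is an analogy in plain Python, not Spark code, and `expensive_square` is a hypothetical function made up for the example. The source is an infinite counter, yet only the elements needed for the final result are ever computed, and the map and filter steps are applied together in a single pass.

```python
from itertools import count, islice

evaluated = []  # records which inputs were actually computed


def expensive_square(x):
    evaluated.append(x)
    return x * x


numbers = count(0)                                    # infinite source: 0, 1, 2, ...
squares = map(expensive_square, numbers)              # lazy, nothing computed yet
even_squares = filter(lambda v: v % 2 == 0, squares)  # lazy, nothing computed yet

evaluated_before = len(evaluated)

# Asking for the first three results plays the role of an action.
first_three = list(islice(even_squares, 3))
```

Only the inputs 0 through 4 are squared, because the third even square (16) is found at input 4 and the pipeline stops there. The roughly analogous action on a Spark RDD is `take`, although how much data Spark actually reads depends on its own scheduling and partitioning.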


