Python and PySpark
| S.No. | Python | PySpark |
|-------|--------|---------|
| 1 | Python is an interpreted, high-level, general-purpose programming language. | PySpark is the Python API (and shell) for Apache Spark, i.e., the interface that gives access to Spark from Python. |
| 2 | Plain Python runs on a single machine, so it is slower on large datasets. | PySpark distributes the work across a cluster, so it can be many times faster than plain Python on big data. |
| 3 | Comparatively easy to learn because of its simple syntax and rich standard library. | PySpark takes more effort to master because it layers Spark's distributed-computing concepts (RDDs, DataFrames, lazy evaluation) on top of Python. |
| 4 | It is a dynamically typed language, so type errors surface only at run time. | PySpark DataFrames carry an explicit schema, which catches many type mismatches earlier than plain Python code. |
| 5 | Programs written in plain Python run locally and cannot be submitted to a Spark cluster. | Programs written in PySpark can be submitted to a Spark cluster and run in a distributed manner. |
| 6 | Python ships with a large set of built-in packages and libraries, most of which can also be used from PySpark driver code. | PySpark is best thought of as a set of libraries, with sub-packages such as Spark SQL, Spark Streaming, and Spark MLlib. |
| 7 | Python works as an interpreter, executing code line by line. | In PySpark, Python is mainly a scripting front end: once a Spark job starts, most of the work is executed by the Spark engine on the executors rather than by interpreted driver code. |
| 8 | Eager evaluation can waste a lot of memory, especially when whole collections are materialized for iteration. | Spark evaluates transformations lazily, computing values only when an action requires them, so intermediate results need not be held in memory all at once. |
| 9 | Python supports heavyweight process forking (e.g., via `multiprocessing`), but the GIL prevents true multi-threaded execution of Python bytecode. | PySpark achieves parallelism by distributing tasks across executor processes on the cluster, sidestepping the GIL. |
| 10 | RDD operations cannot be performed in plain Python. | RDD operations can be performed. |
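Rows 5, 7, and 10 describe how PySpark code is lazy, distributable, and built around RDD operations. A minimal sketch, assuming a local Spark installation (`pip install pyspark`); on a real cluster the master URL would point at the cluster manager instead of `local[*]`:

```python
from pyspark.sql import SparkSession

# Start a local Spark session for illustration.
spark = (SparkSession.builder
         .master("local[*]")
         .appName("rdd-demo")
         .getOrCreate())
sc = spark.sparkContext

# Transformations (map, filter) are lazy: nothing executes yet.
rdd = sc.parallelize(range(1, 11))
squares = rdd.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

# Actions (collect, sum) trigger the actual distributed execution.
result = squares.collect()
total = squares.sum()
print(result)  # [4, 16, 36, 64, 100]
print(total)   # 220

spark.stop()
```

The same script can be run locally with `python` or submitted to a cluster with `spark-submit`, which is exactly the distinction row 5 draws.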
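The memory point in row 8 mirrors Python's own eager-vs-lazy distinction: a list comprehension materializes every element up front, while a generator yields one value at a time on demand, much like Spark's lazy transformations. A small pure-Python illustration (variable names are my own):

```python
import sys

# Eager: the list holds all one million squares in memory at once.
eager = [x * x for x in range(1_000_000)]

# Lazy: the generator produces one square at a time, on demand.
lazy = (x * x for x in range(1_000_000))

# The generator object stays tiny no matter how many values it can yield.
print(sys.getsizeof(eager) > sys.getsizeof(lazy))  # True
print(next(lazy), next(lazy), next(lazy))          # 0 1 4
```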
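Row 9's concurrency contrast can be sketched with the standard library: Python threads share one GIL, but `multiprocessing` forks separate interpreter processes that run truly in parallel, which is roughly the approach Spark takes with its executor processes, only scaled out across machines:

```python
from multiprocessing import Pool

def square(x):
    # Runs inside a worker process with its own interpreter and GIL.
    return x * x

if __name__ == "__main__":
    # Four separate processes split the work; no GIL contention.
    with Pool(processes=4) as pool:
        results = pool.map(square, range(10))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```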
Author:
A.Yoga Sai Satwik