PySpark on Windows

PySpark can be installed on Windows in two different ways. Although Spark is a distributed compute engine, it also works standalone. Most developers who are familiar with Jupyter Notebook prefer to keep using it, so it has to be integrated with PySpark.

PySpark Jupyter Notebook

There are other sets of Python developers who prefer to … read the rest

PySpark Tutorial

In this PySpark Tutorial, we will understand why PySpark is becoming popular among data engineers and data scientists. This PySpark Tutorial will also highlight the key limitation of PySpark compared with Spark written in Scala (PySpark vs Spark Scala). PySpark is actually a Python API for Spark and helps the Python developer community collaborate with Apache Spark using … read the rest