Apache Spark 3.0 Release Note (Preview)

A preview of Apache Spark 3.0 has been released and is available for testing. The preview was released on 2019-Nov-08 and announced via Twitter. It was launched to enable wide-scale community testing of this major release.

This Spark 3.0 preview is not a stable release in terms of either API or functionality, but it is meant to give the community early access to try the code that will become Spark 3.0. If you would like to test the release, please download it, and send feedback using either the mailing lists or JIRA.

The Spark issue tracker already contains a list of features in 3.0.

What's new in Spark 3.0

Spark 3.0 aims to be faster, easier to use, and smarter. The release extends Spark's scope with more than 3000 resolved JIRAs. Data developers and machine learning engineers alike should find the new features exciting to explore. Beyond the features listed here, other major initiatives are planned for the future. Future articles will include plenty of intuitive examples and demos.

The following features are covered:

  • Accelerator-aware scheduling
  • Adaptive query execution
  • Dynamic partition pruning
  • Join hints
  • New query explain
  • Better ANSI compliance
  • Observable metrics
  • New UI for structured streaming
  • New UDAF and built-in functions
  • New unified interface for Pandas UDF
  • Various enhancements in the built-in data sources (e.g., Parquet, ORC, and JDBC)
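
As a rough illustration of how two of the features above (adaptive query execution and dynamic partition pruning) could be switched on, here is a minimal Python sketch. It assumes the Spark SQL configuration keys `spark.sql.adaptive.enabled` and `spark.sql.optimizer.dynamicPartitionPruning.enabled` as they appear in the Spark 3.0 line; key names and defaults in a preview build may differ. The helper names `to_submit_flags` and `apply_to_builder` are hypothetical and exist only for this example.

```python
# Settings for two of the listed features. The values are illustrative,
# and the keys are assumed to match the Spark 3.0 SQL configuration.
SPARK3_FEATURE_CONF = {
    "spark.sql.adaptive.enabled": "true",  # adaptive query execution
    "spark.sql.optimizer.dynamicPartitionPruning.enabled": "true",  # dynamic partition pruning
}


def to_submit_flags(conf):
    """Render settings as a list of spark-submit style `--conf key=value` arguments."""
    flags = []
    for key, value in sorted(conf.items()):
        flags.extend(["--conf", f"{key}={value}"])
    return flags


def apply_to_builder(builder, conf):
    """Apply each setting to a builder object that exposes .config(key, value).

    In PySpark this could be SparkSession.builder; any object with a
    compatible .config method works, which keeps the helper easy to test.
    """
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder


if __name__ == "__main__":
    # Print the flags so they can be pasted after `spark-submit`.
    print(" ".join(to_submit_flags(SPARK3_FEATURE_CONF)))
```

With PySpark installed, `apply_to_builder(SparkSession.builder, SPARK3_FEATURE_CONF).getOrCreate()` would be one way to start a session with these settings, and the flag list could be appended to a `spark-submit` command line instead.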

A summary of each newly added feature is available in this article.

Additional PySpark Resources & Reading Material

PySpark Frequently Asked Questions

Refer to our PySpark FAQ space, where important questions are answered and key information is clarified. It also links to important PySpark tutorial pages within the site.

PySpark Example Code

See our GitHub repository, which lists PySpark examples with code snippets.

PySpark/Spark Related Interesting Blogs

Here is a list of informative blogs and related articles that you might find interesting:

  1. PySpark Frequently Asked Questions
  2. Apache Spark Introduction
  3. How Spark Works
  4. PySpark Installation on Windows 10
  5. PySpark Jupyter Notebook Configuration On Windows
  6. PySpark Tutorial
  7. Apache Spark 3.0 Release Note (Preview)
  8. PySpark Complete Guide