Spark and Python for Big Data with PySpark

Learn how to use Spark with Python, including Spark Streaming, Machine Learning, Spark 2.0 DataFrames and more!
4.51 (25460 reviews)
Platform: Udemy
Language: English
Category: Data Science
Students: 143,605
Content: 10.5 hours
Last update: May 2020
Regular price: $174.99

What you will learn

Use Python and Spark together to analyze Big Data

Learn how to use the new Spark 2.0 DataFrame Syntax

Work on Consulting Projects that mimic real world situations!

Classify Customer Churn with Logistic Regression

Use Spark with Random Forests for Classification

Learn how to use Spark's Gradient Boosted Trees

Use Spark's MLlib to create Powerful Machine Learning Models

Learn about the Databricks Platform!

Get set up on Amazon Web Services EC2 for Big Data Analysis

Learn how to use AWS Elastic MapReduce Service!

Learn how to leverage the power of Linux with a Spark Environment!

Create a Spam filter using Spark and Natural Language Processing!

Use Spark Streaming to Analyze Tweets in Real Time!

Comidoc Review

Our Verdict

Overall, this course is a good starting point for learning PySpark, with in-depth hands-on examples and practical projects. However, expect some outdated content, particularly in installation steps and APIs, where you may need external resources for up-to-date information. The heavy focus on machine learning, combined with limited coverage of core Spark concepts, can also make the course feel unbalanced, which lowers its overall value.

What We Liked

  • Comprehensive coverage of PySpark, including data manipulation and machine learning techniques
  • Hands-on examples and practical projects that are useful for beginners
  • Detailed explanations of concepts with a step-by-step approach
  • Instructor goes the extra mile to ensure learners do not feel lost

Potential Drawbacks

  • Outdated content, particularly in areas such as setting up AWS EC2 and Databricks, using the Twitter API for streaming, and working with DataFrames
  • Little focus on core Spark concepts such as the master node and worker nodes
  • Insufficient coverage of data pre-processing, and reliance on complementary courses for topics such as RDDs and log files
  • Fast pace may make it challenging to keep up and fully grasp the content
Udemy ID: 980798
Course created date: 10/10/2016
Course indexed date: 07/08/2019
Course submitted by: Bot