Create big data streaming pipelines with Spark using Python. Run analytics on live Tweet data from Twitter. Integrate Spark Streaming with tools like Apache Kafka, used by Fortune 500 companies. Work with new features of the most recent version of Spark: 2.3.
This course covers all the fundamentals of Apache Spark Streaming with Python and teaches you everything you need to know about developing Spark Streaming applications using PySpark, the Python API for Spark. By the end of this course, you will have gained in-depth knowledge of Spark Streaming and general big data manipulation skills to help your company adopt Spark Streaming for building big data processing pipelines and data analytics applications. This course will be absolutely critical to anyone trying to make it in data science today.
What will you learn from this course?
In this course, you'll learn the following:
An overview of the architecture of Apache Spark.
How to develop Apache Spark 2.0 applications with PySpark using RDD transformations and actions and Spark SQL.
How to work with Spark's primary abstraction, resilient distributed datasets (RDDs), to process and analyze large data sets.
Advanced techniques to optimize and tune Apache Spark jobs by partitioning, caching, and persisting RDDs.
How to analyze structured and semi-structured data using Datasets and DataFrames, and develop a thorough understanding of Spark SQL.
How to scale up Spark Streaming applications for both bandwidth and processing speed.
How to integrate Spark Streaming with cluster computing tools like Apache Kafka.
How to connect your Spark Streaming application to a data source like Amazon Web Services (AWS) Kinesis.
Best practices for working with Apache Spark in the field.
An overview of the big data ecosystem.