Apache Spark Streaming with Python and PySpark
MP4 | Video: AVC 1280x720 | Audio: AAC 44KHz 2ch | Duration: 3.5 Hours | Lec: 37 | 2.89 GB
Genre: eLearning | Language: English
Add Spark Streaming to your Data Science and Machine Learning Python Projects
What is this course about?
This course covers the fundamentals of Apache Spark Streaming with Python and teaches you what you need to know to develop Spark Streaming applications using PySpark, the Python API for Spark. By the end of the course, you will have in-depth knowledge of Spark Streaming and general big data manipulation skills that can help your company adopt Spark Streaming for building big data processing pipelines and data analytics applications. The course is especially relevant to anyone building a career in data science today.
What will you learn from this Apache Spark streaming course?
In this Apache Spark streaming course, you'll learn the following:
An overview of the architecture of Apache Spark.
How to develop Apache Spark streaming applications with PySpark using RDD transformations and actions and Spark SQL.
How to work with Spark's primary abstraction, resilient distributed datasets (RDDs), to process and analyze large data sets.
Advanced techniques to optimize and tune Apache Spark jobs by partitioning, caching and persisting RDDs.
How to analyze structured and semi-structured data using Datasets and DataFrames, and develop a thorough understanding of Spark SQL.
How to scale up Spark Streaming applications for both bandwidth and processing speed.
How to integrate Spark Streaming with distributed messaging systems like Apache Kafka.
How to connect your Spark Streaming application to a data source like Amazon Web Services (AWS) Kinesis.
Best practices for working with Apache Spark Streaming in the field.
Big data ecosystem overview.
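To give a feel for what "RDD transformations and actions" look like in a streaming application, here is a minimal word-count sketch. It is not taken from the course material. It assumes a PySpark installation that still ships the DStream API (pyspark.streaming), a local two-thread master, and a text source on localhost port 9999; the host, port, app name, 5-second batch interval, and the helper names split_words, build_word_count and main are all illustrative choices.

```python
def split_words(line):
    """Lower-case one line of text and split it into words (plain Python)."""
    return line.lower().split()


def build_word_count(lines):
    """Chain RDD-style transformations on a DStream of text lines.

    flatMap -> map -> reduceByKey yields (word, count) pairs per micro-batch.
    """
    return (lines.flatMap(split_words)
                 .map(lambda word: (word, 1))
                 .reduceByKey(lambda a, b: a + b))


def main():
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    # Assumption: local mode with 2 threads (one receives, one processes).
    sc = SparkContext("local[2]", "StreamingWordCountSketch")
    ssc = StreamingContext(sc, 5)  # assumed 5-second micro-batches

    # Assumption: a text server is listening on localhost:9999.
    lines = ssc.socketTextStream("localhost", 9999)
    build_word_count(lines).pprint()  # output operation: print each batch

    ssc.start()
    ssc.awaitTermination()
```

To try it, start a text source in one terminal (for example `nc -lk 9999`), call main() in another, and type lines into the first terminal; each batch interval should print the word counts for the text received during that interval.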