Databricks Fundamentals & Apache Spark Core
Learn how to process big data using Databricks & Apache Spark 2.4 and 3.0.0 – DataFrame API and Spark SQL
What you’ll learn
- Databricks
- Apache Spark Architecture
- Apache Spark DataFrame API
- Apache Spark SQL
- Selecting and manipulating columns of a DataFrame
- Filtering, dropping and sorting rows of a DataFrame
- Joining, reading, writing and partitioning DataFrames
- Aggregating DataFrame rows
- Working with User Defined Functions
- Using the DataFrameWriter API
Requirements
- Basic Scala knowledge
- Basic SQL knowledge
Description
Welcome to this course on Databricks and Apache Spark 2.4 and 3.0.0.
Apache Spark is a big data processing framework that runs at scale.
In this course, we will learn how to write Spark Applications using Scala and SQL.
Databricks is a company founded by the original creators of Apache Spark.
Databricks offers a managed and optimized version of Apache Spark that runs in the cloud.
The main focus of this course is to teach you how to use the DataFrame API & SQL to accomplish tasks such as:
- Write and run Apache Spark code using Databricks
- Read and write data using the Databricks File System (DBFS)
- Explain how Apache Spark runs on a cluster with multiple Nodes
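To give a feel for the first two tasks, here is a minimal sketch of writing a small DataFrame to DBFS and reading it back. It is an illustration only, not course material: the path `dbfs:/tmp/databricks_fundamentals_sketch/people` and the sample rows are made up, and the code assumes it runs in a Databricks notebook, where the `dbfs:/` scheme is available.

```scala
import org.apache.spark.sql.SparkSession

// In a Databricks notebook a session named `spark` already exists;
// getOrCreate() simply returns that session.
val spark = SparkSession.builder().appName("dbfs-sketch").getOrCreate()
import spark.implicits._

// Hypothetical DBFS location, chosen only for this sketch.
val path = "dbfs:/tmp/databricks_fundamentals_sketch/people"

// Made-up sample data.
val people = Seq(("Ada", 36), ("Linus", 28)).toDF("name", "age")

// Write the DataFrame as Parquet, replacing any earlier output at this path.
people.write.mode("overwrite").parquet(path)

// Read the same files back into a new DataFrame and display them.
val reloaded = spark.read.parquet(path)
reloaded.show()
```

Outside Databricks the `dbfs:/` scheme is normally not available, so a local or cloud storage path would be needed instead.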
Use the DataFrame API and SQL to perform data manipulation tasks such as:
- Selecting, renaming and manipulating columns
- Filtering, dropping and aggregating rows
- Joining DataFrames
- Creating UDFs and using them with the DataFrame API or Spark SQL
- Writing DataFrames to external storage systems
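The manipulation tasks listed above could be sketched as follows. This is a rough illustration under stated assumptions rather than an excerpt from the course: the `employees` and `departments` data, the column names, and the `first_letter` UDF are all hypothetical, and the code assumes a Databricks notebook or `spark-shell` session.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, col, count, udf}

// Returns the existing session in a notebook or spark-shell.
val spark = SparkSession.builder().appName("dataframe-sketch").getOrCreate()
import spark.implicits._

// Made-up sample data for illustration.
val employees = Seq(
  ("Ada", "eng", 120.0),
  ("Linus", "eng", 100.0),
  ("Grace", "ops", 90.0),
  ("Tim", "ops", 40.0)
).toDF("name", "dept_id", "salary")

val departments = Seq(("eng", "Engineering"), ("ops", "Operations"))
  .toDF("dept_id", "dept_name")

// Select columns and rename one of them.
val selected = employees
  .select(col("name"), col("dept_id"), col("salary"))
  .withColumnRenamed("name", "employee")

// Filter rows: keep salaries of at least 90.
val wellPaid = selected.filter(col("salary") >= 90.0)

// Join with the department lookup table on the shared key.
val joined = wellPaid.join(departments, Seq("dept_id"), "inner")

// Aggregate rows per department.
val summary = joined
  .groupBy("dept_name")
  .agg(count("*").as("headcount"), avg("salary").as("avg_salary"))

// A UDF used through the DataFrame API...
val firstLetter = udf((s: String) => s.take(1))
val withLetter = joined.withColumn("first_letter", firstLetter(col("employee")))

// ...and the same logic registered for use from Spark SQL.
spark.udf.register("first_letter", (s: String) => s.take(1))
joined.createOrReplaceTempView("staff")
val viaSql = spark.sql("SELECT employee, first_letter(employee) AS first_letter FROM staff")

summary.show()
viaSql.show()
```

Each step returns a new DataFrame, so the intermediate values can be inspected one at a time with `show()` while following along.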
List and explain the elements of the Apache Spark execution hierarchy, such as:
- Jobs
- Stages
- Tasks
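As a rough illustration of how these three levels relate: an action triggers at least one job, a shuffle splits a job into stages, and each stage runs one task per partition. The sketch below is an assumption-laden example, not course material; the partition count of 8 and the `bucket` column are arbitrary choices, and the code assumes a Databricks notebook or `spark-shell` session.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Returns the existing session in a notebook or spark-shell.
val spark = SparkSession.builder().appName("execution-sketch").getOrCreate()

// Transformations are lazy: no job runs while the plan is being built.
// range(start, end, step, numPartitions) creates 8 input partitions here.
val numbers = spark.range(0, 1000000, 1, 8)

// groupBy needs a shuffle, which marks a stage boundary.
val buckets = numbers
  .withColumn("bucket", col("id") % 10)
  .groupBy("bucket")
  .count()

// The physical plan contains an Exchange node where the shuffle happens.
buckets.explain()

// collect() is an action: it triggers the actual execution as one or more jobs.
val rows = buckets.collect()

// The first stage reads 8 partitions, so it runs 8 tasks.
println(s"input partitions: ${numbers.rdd.getNumPartitions}")
println(s"result rows: ${rows.length}")
```

The resulting jobs, stages, and tasks can then be inspected in the Spark UI. The number of tasks after the shuffle depends on settings such as `spark.sql.shuffle.partitions` and on adaptive query execution, so it may differ between clusters and Spark versions.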
Who this course is for:
- Software developers curious about big data, data engineering and data science
- Beginner data engineers who want to learn how to work with Databricks
- Beginner data scientists who want to learn how to work with Databricks
Created by Wadson Guimatsa
Last updated 9/2021
English
English [Auto]
Size: 5.29 GB
https://www.udemy.com/course/databricks-fundamentals-apache-spark-core/.