Learn Data Engineering with Databricks on AWS Cloud

Published 9/2023
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz
Language: English | Size: 1.89 GB | Duration: 4h 49m

Build Data Engineering Pipelines on AWS using Databricks core features such as Spark and Delta Lake

What you'll learn
Data Engineering leveraging Databricks features
Databricks CLI to manage files, Data Engineering jobs and clusters for Data Engineering Pipelines
Deploying Data Engineering applications developed using PySpark on job clusters
Deploying Data Engineering applications developed using PySpark in Notebooks on job clusters
Building Data Engineering Pipelines using Spark Structured Streaming on Databricks Clusters
Setting up a development environment to develop Data Engineering applications using Databricks

Prior programming experience in SQL and Python is a must.
Experience in Spark Programming is a must.
Prior basic experience or understanding of cloud services like AWS is important.

Get ready to learn data engineering with Databricks on AWS Cloud with this complete course. Gain familiarity with the course details and topics designed to help you succeed.

This comprehensive course is designed to equip you with the skills and knowledge needed to excel in the field of data engineering using two powerful platforms: Databricks and Amazon Web Services (AWS). Data engineering is the backbone of any successful data-driven initiative, and Databricks, a unified analytics platform, has emerged as a leading choice for data engineers and data scientists worldwide. Combined with AWS, a cloud computing powerhouse, it forms a robust ecosystem that can handle data at scale, provide advanced analytics capabilities, and support a wide range of data sources and formats.

Learn about Data Engineering with Databricks on AWS with Hands-On Labs

Learn Data Engineering with Databricks on AWS Cloud is a hands-on practice course designed to familiarize you with the core functionality of Databricks by connecting it with AWS to perform data engineering. Through hands-on exercises, you'll gain a thorough understanding of Databricks' architecture and how it revolutionizes data engineering in the cloud. You'll explore the seamless integration of Databricks with AWS services, such as Amazon S3 and Glue, unlocking a world of possibilities for managing and analyzing your data.

This course has been meticulously designed to provide you with both a solid theoretical foundation and extensive hands-on practice in the dynamic realms of data engineering, Databricks, and Amazon Web Services (AWS).

The course comprises approximately 50 labs, starting from the basics and progressing to more complex topics.

Who should take this course?

The course "Learn Data Engineering with Databricks on AWS Cloud" is designed for a wide range of individuals who are interested in building expertise in data engineering using Databricks on the AWS Cloud.

If you're looking to start a career in data engineering, this course is an excellent choice. It will provide you with the foundational knowledge and practical skills needed to become a successful data engineer. Data scientists and analysts who want to expand their skill set and work with large-scale data processing, data pipelines, and data lakes can greatly benefit from this course. IT professionals who want to transition into roles focused on data engineering and cloud computing can use this course as a stepping stone to acquire the necessary skills and knowledge. Individuals interested in cloud computing, specifically AWS, and its applications in data engineering will gain a deep understanding of cloud-based data engineering solutions.

Requirements
● Basic knowledge of SQL or writing queries in any language
● Scripting in Python
● Willingness to explore, learn, and put in the extra effort to succeed
● An active AWS account and know-how of basic cloud fundamentals
● Programming experience using Python
● Data Engineering experience using Spark

Section 1: Getting Started with Databricks on AWS
Lecture 1 Introduction to Getting Started with Databricks on AWS
Lecture 2 Signing up for AWS free account
Lecture 3 Logging in to AWS Management Console
Lecture 4 Setting up Databricks workspace on AWS using Quickstart
Lecture 5 Logging in to Databricks Workspace on AWS
Lecture 6 Cleaning up the workspace and resources
Lecture 7 Quick Walkthrough of Databricks UI on AWS
Lecture 8 Creating Single-Node Databricks cluster on AWS
Lecture 9 Upload Data using AWS Databricks UI
Lecture 10 Develop Spark Application using AWS Databricks Notebook
Lecture 11 Writing DataFrame to DBFS
Lecture 12 Export and Import AWS Databricks Notebooks
Section 2: AWS Storage Solutions
Lecture 13 Getting Started with AWS S3
Lecture 14 Overview of AWS S3 Glacier
Lecture 15 Creating S3 Bucket and adding Objects
Lecture 16 Version Control in AWS S3
Lecture 17 AWS S3 Cross-Region Replication for Fault Tolerance
Lecture 18 Setup and configure AWS S3 CLI using IAM user credentials
Lecture 19 Managing Objects in AWS S3 using CLI
Section 3: AWS S3 and IAM Role
Lecture 20 Overview of IAM for Databricks on AWS
Lecture 21 Creating AWS IAM User
Lecture 22 Logging into AWS Management Console using IAM User
Lecture 23 Validate Programmatic Access to AWS IAM User
Lecture 24 AWS IAM Identity-based policies
Lecture 25 AWS IAM User Groups
Lecture 26 AWS IAM Custom Policies
Section 4: Integrating S3 and Glue Catalog
Lecture 27 Introduction to Integrating AWS S3 and Glue Catalog with Databricks
Lecture 28 Create AWS IAM Group for Databricks Developers
Lecture 29 Creating AWS IAM Users and adding to group
Lecture 30 Creating AWS S3 Bucket for Databricks Developers
Lecture 31 Grant Permissions on AWS S3 Bucket to the users in group
Lecture 32 Attach AWS IAM Policy to grant access to Glue
Lecture 33 Upload JSON Dataset to S3 to crawl using AWS Glue Crawler
Lecture 34 Create AWS IAM Custom Service Role for Glue Crawlers
Lecture 35 Create and Run Glue Crawler to Create Multiple Glue Catalog Tables
Lecture 36 Overview of Integration of Databricks Clusters and AWS EC2 Instances
Lecture 37 Create AWS IAM Role or Instance Profile
Lecture 38 Registering AWS IAM Instance Profile with Databricks Account
Lecture 39 Attach AWS IAM Instance Profile to Databricks Cluster
Lecture 40 Grant Permissions on S3 to Databricks Clusters
Lecture 41 Integrate Databricks Cluster with Glue Catalog via Instance Profile
Section 5: Set up local development environment for Databricks
Lecture 42 Set up single-node Databricks cluster
Lecture 43 Install Databricks Connect
Lecture 44 Configure Databricks Connect
Section 6: Using Databricks CLI
Lecture 45 Install and configure Databricks CLI
Lecture 46 Interacting with File System using Databricks CLI
Lecture 47 Getting Cluster details using Databricks CLI
Section 7: Spark Jobs Deployment using Notebooks
Lecture 48 Modularizing Notebooks
Lecture 49 Running Job using Notebook
Lecture 50 Refactor application as Databricks Notebooks
Lecture 51 Run Notebooks using Development Cluster

Who this course is for
Software engineers, aspiring data engineers, data analysts, and data scientists
Programmers and Database Administrators with experience in writing SQL queries
BI Analysts looking to enhance their understanding of data engineering, particularly in the context of big data and cloud platforms, who can leverage this course to broaden their skill set
IT professionals who want to transition into roles focused on data engineering and cloud computing