Spark and Scala

About the Course

Become a Spark expert by mastering Scala and Python. The Apache Spark & Scala course helps participants understand how Spark's in-memory distributed datasets speed up iterative workloads as well as interactive queries. This course is part of the Developer's learning path.

Course Objectives

After completing the 'Apache Spark & Scala' course, you will be able to:
1) Understand Scala and its implementation.
2) Apply lazy values, control structures, loops, collections, etc.
3) Understand the concepts of traits and OOP in Scala.
4) Understand functional programming in Scala.
5) Get an insight into Big Data challenges.
6) Understand how Spark acts as a solution to these challenges.
7) Install Spark and run Spark operations in the Spark shell.
8) Understand what RDDs are in Spark.
9) Implement a Spark application on YARN (Hadoop).
10) Analyze the Hive and Spark SQL architecture.

Who should take this course?

This course is a foundation for anyone who aspires to enter the field of Big Data and keep up with the latest developments in fast processing of ever-growing data using Spark and related projects. The following professionals can take this course:
1. Big Data Enthusiasts.
2. Software Architects, Engineers and Developers.
3. Data Scientists and Analytics Professionals.

Course Curriculum

1. Introduction to Scala
Topics - Why Scala?, What is Scala?, Introducing Scala, Installing Scala, Journey - Java to Scala, First Dive - Interactive Scala, Writing Scala Scripts - Compiling Scala Programs, Scala Basics, Scala Basic Types, Defining Functions, IDE for Scala, Scala Community.

2. Scala Essentials
Topics - Immutability in Scala - Semicolons, Method Declaration, Literals, Lists, Tuples, Options, Maps, Reserved Words, Operators, Precedence Rules, If Statements, Scala For Comprehensions, While Loops, Do-While Loops, Conditional Operators, Pattern Matching, Enumerations.

3. Traits and OOP in Scala
Topics - Traits Intro: Traits as Mixins, Stackable Traits, Creating Traits. Basic OOP: Class and Object Basics, Scala Constructors, Nested Classes, Visibility Rules.

4. Functional Programming in Scala
Topics - What is Functional Programming?, Function Literals and Closures, Recursion, Tail Calls, Functional Data Structures, Implicit Function Parameters, Call by Name, Call by Value.

5. Introduction to Big Data and Spark
Topics - Introduction to Big Data, Challenges with Big Data, Batch vs. Real-Time Big Data Analytics, Batch Analytics - Hadoop Ecosystem Overview, Real-Time Analytics Options, Streaming Data - Storm, In-Memory Data - Spark, What is Spark?, Modes of Spark, Spark Installation Demo, Overview of Spark on a Cluster, Spark Standalone Cluster.

6. Spark Baby Steps
Topics - Invoking Spark Shell, Creating the Spark Context, Loading a File in Shell, Performing Some Basic Operations on Files in Spark Shell, Building a Spark Project with sbt, Running Spark Project with sbt, Caching Overview, Distributed Persistence, Spark Streaming Overview, Example: Streaming Word Count.

7. Playing with RDDs
Topics - RDDs, Transformations in RDD, Actions in RDD, Loading Data in RDD, Saving Data through RDD, Key-Value Pair RDD, MapReduce and Pair RDD Operations, Scala and Hadoop Integration Hands-On.

8. Spark with SQL - When Spark Meets Hive
Topics - Analyze Hive and Spark SQL Architecture, Analyze Spark SQL, Context in Spark SQL, Implement a Spark SQL Example, Implement Data Visualization in Spark, Loading of Data, Hive Queries through Spark, Testing Tips in Scala, Performance Tuning Tips in Spark, Shared Variables: Broadcast Variables, Shared Variables: Accumulators.