Scala Class and Object
A class is a blueprint for objects. Once you define a class, you can create
objects from that blueprint with the keyword new. The following example
defines a simple class in Scala:
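class Point(x: Int, y: Int) {
  // Mutable fields initialised from the constructor parameters
  var x1: Int = x
  var y1: Int = y

  // Shift the point by (dx, dy) and print the new coordinates
  def move(dx: Int, dy: Int): Unit = {
    x1 = x1 + dx
    y1 = y1 + dy
    println(s"$x1 $y1")
  }
}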
Command to compile (assuming the class is saved in Demo.scala):
> scalac Demo.scala
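The class above has no entry point, so compiling it produces nothing to run. As a minimal sketch, a Demo.scala file could pair the class with a singleton object that creates a Point with new and calls move. The object name Demo and the sample coordinates are assumptions chosen to match the file name in the compile command; they are not part of the original listing.

```scala
// Demo.scala: a minimal sketch; the object name Demo and the values used
// below are illustrative assumptions.
class Point(x: Int, y: Int) {
  var x1: Int = x
  var y1: Int = y

  // Shift the point by (dx, dy) and print the new coordinates
  def move(dx: Int, dy: Int): Unit = {
    x1 = x1 + dx
    y1 = y1 + dy
    println(s"$x1 $y1")
  }
}

// A singleton object holding the program entry point
object Demo {
  def main(args: Array[String]): Unit = {
    val pt = new Point(10, 20) // create an object from the class blueprint
    pt.move(5, 5)              // prints "15 25"
  }
}
```

On a standard Scala 2 installation, this would typically be compiled with scalac Demo.scala and then run with scala Demo.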
Explanation all in one:
Scala Course Content
Introduction to Scala
Introducing Scala and the deployment of Scala for Big Data applications and Apache Spark analytics.
Pattern Matching
The importance of Scala, the concept of the REPL (Read Evaluate Print Loop), deep dive into Scala pattern matching, type inference, higher-order functions, currying, traits, application space and Scala for data analysis.
Executing the Scala code
Learning about the Scala interpreter, static object timer in Scala, testing String equality in Scala, implicit classes in Scala, the concept of currying in Scala, various classes in Scala.
Classes concept in Scala
Learning about the concept of classes, understanding constructor overloading, abstract classes, the type hierarchy in Scala, the concept of object equality, and val and var in Scala.
Case classes and pattern matching
Understanding sealed traits and the wildcard, constructor, tuple, variable, and constant patterns.
Concepts of traits with example
Understanding traits in Scala, the advantages of traits, linearization of traits, the Java equivalent and avoiding boilerplate code.
Scala Java Interoperability
Implementation of traits in Scala and Java, handling classes that extend multiple traits.
Scala collections
Introduction to Scala collections, classification of collections, the difference between Iterator and Iterable in Scala, example of a list sequence in Scala.
Mutable collections vs. Immutable collections
The two types of collections in Scala, mutable and immutable; understanding lists and arrays in Scala, the list buffer and array buffer, Queue in Scala, the double-ended queue (Deque), Stacks, Sets, Maps, Tuples in Scala.
Use Case bobsrockets package
Introduction to Scala packages and imports, selective imports, the Scala test classes, introduction to the JUnit test class, the JUnit interface via the JUnit 3 suite for Scala test, packaging of Scala applications in a directory structure, example of Spark Split and Spark Scala.
Spark Course Content
Introduction to Spark
Introduction to Spark, how Spark overcomes the drawbacks of working with MapReduce, understanding in-memory MapReduce, interactive operations on MapReduce, the Spark stack, fine vs. coarse grained update, Spark Hadoop YARN, HDFS revision, YARN revision, an overview of Spark and how it is better than Hadoop, deploying Spark without Hadoop, Spark history server, Cloudera distribution.
Spark Basics
Spark installation guide, Spark configuration, memory management, executor memory vs. driver memory, working with Spark Shell, the concept of Resilient Distributed Datasets (RDD), learning to do functional programming in Spark, the architecture of Spark.
Working with RDDs in Spark
Spark RDD, creating RDDs, RDD partitioning, operations and transformations in RDD, deep dive into Spark RDDs, the RDD general operations, a read-only partitioned collection of records, using the concept of RDD for faster and more efficient data processing, RDD actions such as collect, count, collectAsMap, and saveAsTextFile, pair RDD functions.
Aggregating Data with Pair RDDs
Understanding the concept of key-value pairs in RDDs, learning how Spark makes MapReduce operations faster, various operations of RDD, MapReduce interactive operations, fine & coarse grained update, Spark stack.
Writing and Deploying Spark Applications
Comparing Spark applications with Spark Shell, creating a Spark application using Scala or Java, deploying a Spark application, Scala built application, creation of mutable list, set & set operations, list, tuple, concatenating lists, creating an application using SBT, deploying an application using Maven, the web user interface of a Spark application, a real world example of Spark and configuring Spark.
Parallel Processing
Learning about Spark parallel processing, deploying on a cluster, introduction to Spark partitions, file-based partitioning of RDDs, understanding HDFS and data locality, mastering the technique of parallel operations, comparing repartition & coalesce, RDD actions.
Spark RDD Persistence
The execution flow in Spark, understanding the RDD persistence overview, Spark execution flow & Spark terminology, distributed shared memory vs. RDD, RDD limitations, Spark shell arguments, distributed persistence, RDD lineage, key/value pairs for sorting, implicit conversions like countByKey, reduceByKey, sortByKey, aggregateByKey.
Spark Streaming & MLlib
Spark Streaming architecture, writing streaming program code, processing of Spark streams, processing the Spark Discretized Stream (DStream), the context of Spark Streaming, streaming transformations, Flume Spark streaming, request count and DStream, multi-batch operation, sliding window operations and advanced data sources. Different algorithms, the concept of iterative algorithms in Spark, analyzing with Spark graph processing, introduction to K-Means and machine learning, various variables in Spark like shared variables, broadcast variables, learning about accumulators.
Improving Spark Performance
Introduction to various variables in Spark like shared variables, broadcast variables, learning about accumulators, the common performance issues and troubleshooting the performance problems.
Spark SQL and Data Frames
Learning about Spark SQL, the context of SQL in Spark for providing structured data processing, JSON support in Spark SQL, working with XML data, Parquet files, creating HiveContext, writing Data Frame to Hive, reading JDBC files, understanding the Data Frames in Spark, creating Data Frames, manual inferring of schema, working with CSV files, reading JDBC tables, Data Frame to JDBC, user defined functions in Spark SQL, shared variables and accumulators, learning to query and transform data in Data Frames, how Data Frame provides the benefit of both Spark RDD and Spark SQL, deploying Hive on Spark as the execution engine.
Scheduling/Partitioning
Learning about scheduling and partitioning in Spark, hash partition, range partition, scheduling within and around applications, static partitioning, dynamic sharing, fair scheduling, map partition with index, the Zip, GroupByKey, Spark master high availability, standby Masters with ZooKeeper, single node recovery with the local file system, higher-order functions.
Apache Spark – Scala Project
Project 1: Movie Recommendation
Topics – In this project you will gain hands-on experience in deploying Apache Spark for movie recommendation. You will be introduced to MLlib, the Spark machine learning library, with a guide to its algorithms and coding. You will understand how to deploy collaborative filtering, clustering, regression, and dimensionality reduction in MLlib. Upon completion of the project you will have gained experience in working with streaming data, sampling, testing and statistics.
Project 2: Twitter API Integration for Tweet Analysis
Topics – With this project you will learn to integrate the Twitter API for analyzing tweets. You will write server-side code in a scripting language such as PHP, Ruby or Python to call the Twitter API and get the results in JSON format. You will then read the results and perform operations such as aggregation, filtering and parsing as needed to produce the tweet analysis.
Project 3: Data Exploration Using Spark SQL – Wikipedia data set
Topics – This project lets you work with Spark SQL. You will gain experience in combining Spark SQL with ETL applications, real-time analysis of data, performing batch analysis, deploying machine learning, creating visualizations and processing graphs.