Syllabus
Introduction
Figure 1

Figure 2

Figure 3
## Gartner Hype Cycle 2014
Figure 4
## Gartner Hype Cycle 2019
Figure 5
## Gartner Hype Cycle 2020
Figure 6

MapReduce Programming Paradigm
Figure 1

Figure 2

Figure 3

Figure 4

Spark Computing Environment
Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

Figure 7

Figure 8

Figure 9

Figure 10

Figure 11

Figure 12

Figure 13

Figure 14

Data Parallel Computing with Spark
Page Rank
Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

Figure 7

Figure 8

Figure 9

Figure 10

Figure 11

Figure 12


Figure 13
Figure 14

Figure 15
Figure 16

Figure 17
Figure 18

Figure 19

Locality Sensitive Hashing
Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

Figure 7

Frequent Itemsets
Figure 1
Repeat the process with an increasing number of items, adding items only to
the sets already found to be frequent.
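The repeat-and-grow step above can be sketched in a few lines of Python. This is a minimal illustration of a level-wise (Apriori-style) search, not the lesson's own code; the function name `frequent_itemsets`, the `baskets` input, and the `min_support` threshold are hypothetical names chosen for the example.

```python
from itertools import combinations


def frequent_itemsets(baskets, min_support):
    """Level-wise search: only itemsets already found frequent are extended.

    baskets     -- iterable of item collections (hypothetical input format)
    min_support -- minimum number of baskets that must contain an itemset
    Returns a dict mapping each frequent itemset (frozenset) to its support.
    """
    baskets = [frozenset(b) for b in baskets]

    def support(candidate):
        # Count the baskets that contain every item of the candidate.
        return sum(1 for b in baskets if candidate <= b)

    # Pass 1: frequent single items.
    items = {item for b in baskets for item in b}
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}

    result = {}
    k = 1
    while current:
        for itemset in current:
            result[itemset] = support(itemset)
        k += 1
        # Candidates of size k are unions of two frequent (k-1)-itemsets...
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # ...kept only if every (k-1)-subset is itself frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        current = {c for c in candidates if support(c) >= min_support}
    return result
```

The sketch recounts support by scanning all baskets for each candidate, which keeps the code short; a distributed version would count candidates in a single pass per level.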
Clustering
Recommendation Systems
Distributed Machine Learning with Spark
Figure 1

Figure 2

Figure 3
Figure 4

Figure 5
Figure 6

Figure 7


Figure 8

Figure 9
Figure 10
A and B than of C.

Figure 11


Figure 12
w is good according to intuition, theory, and practice.

Figure 13
Figure 14
Slide from the book.
Figure 15
Figure 16
- After some more mathematical manipulations:
- Everything comes back to an optimization problem on w:
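A common textbook form of this optimization problem is given here as a sketch under stated assumptions. It assumes the standard hard-margin setup, with labels $y_i \in \{-1, +1\}$, data points $x_i$, and a bias term $b$; these symbols are assumptions and are not given in the text here.

```latex
\min_{w,\,b} \ \frac{1}{2}\,\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \,\bigl(w \cdot x_i + b\bigr) \ \ge\ 1 \quad \text{for all } i
```

Under these assumptions, minimizing $\lVert w \rVert$ is equivalent to maximizing the margin, since the distance from the hyperplane to the closest points is $1 / \lVert w \rVert$.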
Figure 17
- For each data point:
  - If the margin is greater than 1, don’t care.
  - If the margin is less than 1, pay a linear penalty.
- Introducing slack variables:
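The per-point penalty rule above can be illustrated with a short Python sketch. It assumes the usual soft-margin formulation, in which the slack of a point equals its hinge loss and a trade-off constant `C` weights the total slack. The names `margin`, `slack`, `soft_margin_objective`, and `C` are hypothetical, chosen for this example only.

```python
def margin(w, b, x, y):
    """Functional margin y * (w . x + b) of one labelled point (y is -1 or +1)."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)


def slack(w, b, x, y):
    """Slack (hinge loss): zero if margin >= 1, otherwise linear in the shortfall."""
    return max(0.0, 1.0 - margin(w, b, x, y))


def soft_margin_objective(w, b, data, C):
    """Assumed soft-margin objective: 0.5 * ||w||^2 + C * (sum of slacks).

    data -- list of (x, y) pairs; C -- hypothetical trade-off constant.
    """
    regulariser = 0.5 * sum(wi * wi for wi in w)
    return regulariser + C * sum(slack(w, b, x, y) for x, y in data)
```

A larger `C` penalizes margin violations more heavily, while a smaller `C` tolerates them in exchange for a wider margin.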
Figure 18
Figure 19