Syllabus
Introduction
Figure 1

Figure 2

Figure 3
## Gartner Hype Cycle 2014

Figure 4
## Gartner Hype Cycle 2019

Figure 5
## Gartner Hype Cycle 2020

Figure 6

MapReduce Programming Paradigm
Figure 1

Figure 2
Figure 3

Figure 4

Spark Computing Environment
Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

Figure 7

Figure 8

Figure 9

Figure 10

Figure 11

Figure 12

Figure 13

Figure 14

Data Parallel Computing with Spark
Page Rank
Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

Figure 7
Figure 8

Figure 9

Figure 10

Figure 11

Figure 12

Figure 13


Figure 14

Figure 15
Figure 16

Figure 17
Figure 18

Figure 19

Locality Sensitive Hashing
Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

Figure 7

Frequent Itemsets
Figure 1
Repeat the process with one more item per set each round, adding items only
to sets already found to be frequent.
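
The repetition described above can be sketched in a few lines of Python. This is a minimal, single-machine illustration that assumes the process in question is an A-Priori style level-wise search. The function name `frequent_itemsets`, the `min_support` threshold, and the sample baskets are hypothetical and do not come from the lesson.

```python
from itertools import combinations


def frequent_itemsets(baskets, min_support):
    """Level-wise search: grow candidates only from itemsets already found frequent."""
    baskets = [frozenset(b) for b in baskets]

    # Level 1: count every single item.
    counts = {}
    for basket in baskets:
        for item in basket:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s for s, c in counts.items() if c >= min_support}
    result = {s: counts[s] for s in current}

    k = 2
    while current:
        # Candidates of size k: unions of two frequent (k-1)-itemsets
        # whose every (k-1)-subset is itself frequent.
        candidates = set()
        for a in current:
            for b in current:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in current
                    for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)

        # Count only the surviving candidates, then keep the frequent ones.
        counts = {c: sum(1 for basket in baskets if c <= basket) for c in candidates}
        current = {s for s, c in counts.items() if c >= min_support}
        result.update({s: counts[s] for s in current})
        k += 1
    return result


# Hypothetical sample data for illustration only.
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "beer"},
    {"milk", "beer"},
    {"bread"},
]
result = frequent_itemsets(baskets, min_support=2)
```

Because a set can only be frequent if all of its subsets are frequent, each round counts far fewer candidates than the full set of item combinations. In a Spark setting the counting step would typically be distributed, which this sketch does not attempt.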

Clustering
Recommendation Systems
Distributed Machine Learning with Spark
Figure 1

Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
A and B than of C.
Figure 11
Figure 12
w is good according to intuition, theory, and practice.
Figure 13
Figure 14
Slide from the book
Figure 15
Figure 16
- After some more mathematical manipulations:
- Everything comes back to an optimization problem on w:
Figure 17
- For each data point:
  - If the margin is greater than 1, don’t care.
  - If the margin is less than 1, pay a linear penalty.
- Introducing slack variables:
Figure 18
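
The per-point rule above (no cost once the margin exceeds 1, a linear penalty below it) matches what is commonly called the hinge loss. The sketch below is a plain-Python illustration under that assumption. The names `hinge_loss` and `soft_margin_objective`, the bias term `b`, and the trade-off constant `C` are illustrative choices and are not taken from the figures.

```python
def hinge_loss(w, b, x, y):
    """Penalty for one data point (x, y), with label y in {-1, +1}."""
    # Margin: label times the signed score of the linear classifier.
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    # Margin of at least 1 costs nothing; below 1 the cost grows linearly.
    return max(0.0, 1.0 - margin)


def soft_margin_objective(w, b, points, C=1.0):
    """Assumed standard form: 0.5 * ||w||^2 + C * (sum of per-point penalties)."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    penalty = sum(hinge_loss(w, b, x, y) for x, y in points)
    return regularizer + C * penalty


# Hypothetical weights and points for illustration only.
w, b = [1.0, 0.0], 0.0
safe = hinge_loss(w, b, [2.0, 0.0], +1)        # margin 2: no penalty
inside = hinge_loss(w, b, [0.5, 0.0], +1)      # margin 0.5: small penalty
wrong_side = hinge_loss(w, b, [1.0, 0.0], -1)  # margin -1: larger penalty
total = soft_margin_objective(
    w, b, [([2.0, 0.0], +1), ([0.5, 0.0], +1), ([1.0, 0.0], -1)]
)
```

Under this reading, the slack variable for a data point takes the same value as its hinge loss at the optimum, which is why the two descriptions are often used interchangeably.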
Figure 19