author    msabhi <abhi.is2006@gmail.com>  2016-12-04 08:52:46 -0500
committer GitHub <noreply@github.com>     2016-12-04 08:52:46 -0500
commit    729cbb73db20226f91b40d16c4af9102c3c80b98 (patch)
tree      a0550080aeb14194f561defad9b212fba33aac13 /chapter/8
parent    a9883554b8e4ab00e41dbd8a358f97628f35f392 (diff)

Fixed indentation

Diffstat (limited to 'chapter/8'):
 chapter/8/big-data.md | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md
index bae6b83..6778f52 100644
--- a/chapter/8/big-data.md
+++ b/chapter/8/big-data.md
@@ -105,20 +105,21 @@ The properties that power RDD with the above mentioned features :
 Spark's API provides two kinds of operations on an RDD:
-Transformations - lazy operations that return another RDD.
-`map(f: T ⇒ U) : RDD[T] ⇒ RDD[U]` : Return a MappedRDD[U] by applying function f to each element.
-`flatMap(f: T ⇒ Seq[U]) : RDD[T] ⇒ RDD[U]` : Return a new FlatMappedRDD[U] by first applying a function to all elements and then flattening the results.
-`filter(f: T ⇒ Bool) : RDD[T] ⇒ RDD[T]` : Return a FilteredRDD[T] containing the elements for which f returns true.
-`groupByKey()` : When called on a (K, V) RDD, return a new RDD[(K, Iterable[V])].
-`reduceByKey(f: (V, V) ⇒ V)` : When called on a (K, V) RDD, return a new RDD[(K, V)] by aggregating the values for each key, e.g. reduceByKey(_ + _).
-`join : (RDD[(K, V)], RDD[(K, W)]) ⇒ RDD[(K, (V, W))]` : When called on (K, V) and (K, W) RDDs, return a new RDD[(K, (V, W))] by joining them on key K.
+- Transformations - lazy operations that return another RDD.
+  - `map(f: T ⇒ U) : RDD[T] ⇒ RDD[U]` : Return a MappedRDD[U] by applying function f to each element.
+  - `flatMap(f: T ⇒ Seq[U]) : RDD[T] ⇒ RDD[U]` : Return a new FlatMappedRDD[U] by first applying a function to all elements and then flattening the results.
+  - `filter(f: T ⇒ Bool) : RDD[T] ⇒ RDD[T]` : Return a FilteredRDD[T] containing the elements for which f returns true.
+  - `groupByKey()` : When called on a (K, V) RDD, return a new RDD[(K, Iterable[V])].
+  - `reduceByKey(f: (V, V) ⇒ V)` : When called on a (K, V) RDD, return a new RDD[(K, V)] by aggregating the values for each key, e.g. reduceByKey(_ + _).
+  - `join : (RDD[(K, V)], RDD[(K, W)]) ⇒ RDD[(K, (V, W))]` : When called on (K, V) and (K, W) RDDs, return a new RDD[(K, (V, W))] by joining them on key K.
-Actions - operations that trigger computation on an RDD and return values.
-`reduce(f: (T, T) ⇒ T) : RDD[T] ⇒ T` : Return a value of type T by reducing the elements using the specified commutative and associative binary operator.
-`collect()` : Return an Array[T] containing all elements.
-`count()` : Return the number of elements.
+- Actions - operations that trigger computation on an RDD and return values.
+
+  - `reduce(f: (T, T) ⇒ T) : RDD[T] ⇒ T` : Return a value of type T by reducing the elements using the specified commutative and associative binary operator.
+  - `collect()` : Return an Array[T] containing all elements.
+  - `count()` : Return the number of elements.
 Why RDD over Distributed Shared Memory (DSM)?
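The transformations and actions listed in the hunk above can be illustrated with plain-Python analogues. This is a sketch of the semantics only, not Spark itself: real RDDs are lazy, partitioned, and distributed, while each "RDD" here is just an in-memory list, and all function names are hypothetical stand-ins for the Spark operations of the same name.

```python
# Plain-Python analogues of the RDD operations above (illustrative only).
from collections import defaultdict

def rdd_map(f, rdd):            # map(f: T => U): RDD[T] => RDD[U]
    return [f(x) for x in rdd]

def rdd_flat_map(f, rdd):       # flatMap(f: T => Seq[U]): apply, then flatten
    return [y for x in rdd for y in f(x)]

def rdd_filter(f, rdd):         # filter(f: T => Bool): keep elements where f is true
    return [x for x in rdd if f(x)]

def group_by_key(rdd):          # groupByKey(): RDD[(K, V)] => RDD[(K, Iterable[V])]
    groups = defaultdict(list)
    for k, v in rdd:
        groups[k].append(v)
    return list(groups.items())

def reduce_by_key(f, rdd):      # reduceByKey(f: (V, V) => V): aggregate values per key
    out = {}
    for k, v in rdd:
        out[k] = f(out[k], v) if k in out else v
    return list(out.items())

def join(left, right):          # join: (RDD[(K, V)], RDD[(K, W)]) => RDD[(K, (V, W))]
    return [(k, (v, w)) for k, v in left for k2, w in right if k == k2]

# Actions such as reduce/collect/count are trivial on a plain list.
words = rdd_flat_map(str.split, ["to be", "or not to be"])
pairs = rdd_map(lambda w: (w, 1), words)
counts = reduce_by_key(lambda a, b: a + b, pairs)
print(sorted(counts))   # [('be', 2), ('not', 1), ('or', 1), ('to', 2)]
```

The word-count pipeline at the end mirrors the canonical Spark example: `flatMap` to split lines into words, `map` to key each word with 1, and `reduceByKey(_ + _)` to sum the counts.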