From b82f8af807915aaa0020a39d1b1a61f5d23ca2ff Mon Sep 17 00:00:00 2001
From: cnnrznn
Date: Wed, 16 Nov 2016 12:14:42 -0500
Subject: Update dist-langs.md

---
 chapter/4/dist-langs.md | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)
(limited to 'chapter')

diff --git a/chapter/4/dist-langs.md b/chapter/4/dist-langs.md
index 9c8a8c9..d8201fa 100644
--- a/chapter/4/dist-langs.md
+++ b/chapter/4/dist-langs.md
@@ -1,11 +1,23 @@
 ---
 layout: page
 title: "Distributed Programming Languages"
-by: "Joe Schmoe and Mary Jane"
+by: "A Systems Person"
 ---
-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. {% cite Uniqueness --file dist-langs %}
+Distributed programming is hard because of:
+
+* Network partitions
+* Node failures
+* Efficiency / Communication
+* Data distribution / locality
+
+Approaches:
+
+* Message-passing
+* RPC
+* Actors
+* Coordination (Linda)

 ## References

-{% bibliography --file dist-langs %}
\ No newline at end of file
+{% bibliography --file dist-langs %}
-- 
cgit v1.2.3


From c59d17f62954a29c3556d4211680bdebe6842af6 Mon Sep 17 00:00:00 2001
From: cnnrznn
Date: Wed, 16 Nov 2016 13:03:00 -0500
Subject: Update dist-langs.md bullets

---
 chapter/4/dist-langs.md | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)
(limited to 'chapter')

diff --git a/chapter/4/dist-langs.md b/chapter/4/dist-langs.md
index d8201fa..fd3f053 100644
--- a/chapter/4/dist-langs.md
+++ b/chapter/4/dist-langs.md
@@ -11,12 +11,28 @@ Distributed programming is hard because of:
 * Efficiency / Communication
 * Data distribution / locality

-Approaches:
+### Two major, orthogonal approaches to distributed languages:

+#### Actor / Object model
+
+* Erlang
+* Cloud Haskell
+
+#### Dataflow model
+
+The dataflow model has its roots in functional programming.
+Some languages that use this model are:
+
+* Multilisp
+* MapReduce (Spark, Hadoop, etc.)
+
+### Why GPLs, not DSLs?
+
+* problem of domain-composition
+* problem of abstraction
+* problem of ecosystem
+* problem of tumultuous architecture
+* "any GPL + library can act as a DSL" - Mernik

 ## References
-- 
cgit v1.2.3


From f832573aad966fa9f600f1707eb709f7e89814c3 Mon Sep 17 00:00:00 2001
From: cnnrznn
Date: Wed, 16 Nov 2016 13:03:41 -0500
Subject: Update dist-langs.md

---
 chapter/4/dist-langs.md | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)
(limited to 'chapter')

diff --git a/chapter/4/dist-langs.md b/chapter/4/dist-langs.md
index fd3f053..d268c3a 100644
--- a/chapter/4/dist-langs.md
+++ b/chapter/4/dist-langs.md
@@ -4,19 +4,12 @@ title: "Distributed Programming Languages"
 by: "A Systems Person"
 ---

-Distributed programming is hard because of:
-
-* Network partitions
-* Node failures
-* Efficiency / Communication
-* Data distribution / locality
-
 ### Two major, orthogonal approaches to distributed languages:

 #### Actor / Object model

 * Erlang
-* Cloud Haskell
+* Cloud Haskell (I know, right? Why?)
 #### Dataflow model
-- 
cgit v1.2.3


From 1818c8eabf2cb1c65019bafd57198eaade8af9c0 Mon Sep 17 00:00:00 2001
From: cnnrznn
Date: Wed, 16 Nov 2016 13:51:13 -0500
Subject: Update dist-langs.md

---
 chapter/4/dist-langs.md | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)
(limited to 'chapter')

diff --git a/chapter/4/dist-langs.md b/chapter/4/dist-langs.md
index d268c3a..9f3a91a 100644
--- a/chapter/4/dist-langs.md
+++ b/chapter/4/dist-langs.md
@@ -8,10 +8,13 @@ by: "A Systems Person"

 #### Actor / Object model

+The actor model has its roots in procedural programming.
+This model maps in a straightforward way to a distributed environment.
+
 * Erlang
 * Cloud Haskell (I know, right? Why?)

-#### Dataflow model
+#### Dataflow model (static and stream)

 The dataflow model has its roots in functional programming.
 Some languages that use this model are:

@@ -27,6 +30,20 @@ Some languages that use this model are:
 * problem of tumultuous architecture
 * "any GPL + library can act as a DSL" - Mernik

+#### Erlang vs C: A Tar and Feathering
+
+[citation erlang paper]
+
+Erlang has only one clear benefit over C, which is dynamic code upgrading.
+However, there are ways of making C behave in a similar fashion with minimal downtime.
+Shuffler [citation] is a system for continuous randomization of code.
+Using techniques discussed in that paper, one could dynamically replace sections of a binary.
+Another, slightly hack-ish workaround would be to receive the upgrade, serialize the current state, and finally run the new binary based on the serialized state.
+
+Other than dynamic code swapping and poor error detection, Erlang does not offer anything that is not offered by a traditional OS.
+Isolation, concurrency, and message passing can all be accomplished with unix-style system calls.
+Why is this language not considered redundant?
+
 ## References

 {% bibliography --file dist-langs %}
-- 
cgit v1.2.3


From 53e58a99885ddcf08fa5a352a917a9e6100e093a Mon Sep 17 00:00:00 2001
From: cnnrznn
Date: Wed, 16 Nov 2016 13:58:53 -0500
Subject: Update dist-langs.md

---
 chapter/4/dist-langs.md | 2 ++
 1 file changed, 2 insertions(+)
(limited to 'chapter')

diff --git a/chapter/4/dist-langs.md b/chapter/4/dist-langs.md
index 9f3a91a..8745be9 100644
--- a/chapter/4/dist-langs.md
+++ b/chapter/4/dist-langs.md
@@ -39,6 +39,8 @@ However, there are ways of making C behave in a similar fashion with minimal dow
 Shuffler [citation] is a system for continuous randomization of code.
 Using techniques discussed in that paper, one could dynamically replace sections of a binary.
 Another, slightly hack-ish workaround would be to receive the upgrade, serialize the current state, and finally run the new binary based on the serialized state.
+A third way of circumventing this problem would be to encapsulate the code in a shared library, and have logic in the program to unmap the old code, replace the library, and remap it; a minimal sketch of this shared-library approach appears below.
+This approach is analogous to Erlang's approach.

 Other than dynamic code swapping and poor error detection, Erlang does not offer anything that is not offered by a traditional OS.
 Isolation, concurrency, and message passing can all be accomplished with unix-style system calls.
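To make the shared-library approach concrete, here is a minimal C++ sketch of swapping code in a running process with `dlopen`/`dlsym`/`dlclose`. The library name `./libhandler.so` and the `handler()` signature are assumptions for the illustration, not anything from the Erlang paper.

```cpp
// Minimal sketch of dynamic code swapping via dlopen (link with -ldl).
// Assumes a hypothetical ./libhandler.so exporting: extern "C" int handler(int).
#include <dlfcn.h>
#include <cstdio>

using handler_fn = int (*)(int);

int main() {
  for (int round = 0; round < 2; ++round) {
    void* lib = dlopen("./libhandler.so", RTLD_NOW);   // map the current code
    if (!lib) { std::fprintf(stderr, "%s\n", dlerror()); return 1; }

    auto handler = reinterpret_cast<handler_fn>(dlsym(lib, "handler"));
    if (handler) std::printf("handler(7) = %d\n", handler(7));

    dlclose(lib);  // unmap the old code; an upgraded libhandler.so dropped in
                   // its place is picked up by the next dlopen(), no restart
  }
  return 0;
}
```

The process keeps running across the swap; only the mapped code changes, which is essentially the property Erlang's code upgrading provides.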
-- 
cgit v1.2.3


From 82620a042d98e2d3c1f2e87be17c87cd329ccca3 Mon Sep 17 00:00:00 2001
From: cnnrznn
Date: Wed, 16 Nov 2016 14:27:18 -0500
Subject: Update dist-langs.md

---
 chapter/4/dist-langs.md | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)
(limited to 'chapter')

diff --git a/chapter/4/dist-langs.md b/chapter/4/dist-langs.md
index 8745be9..a097f51 100644
--- a/chapter/4/dist-langs.md
+++ b/chapter/4/dist-langs.md
@@ -32,18 +32,15 @@ Some languages that use this model are:

 #### Erlang vs C: A Tar and Feathering

-[citation erlang paper]
-
-Erlang has only one clear benefit over C, which is dynamic code upgrading.
-However, there are ways of making C behave in a similar fashion with minimal downtime.
-Shuffler [citation] is a system for continuous randomization of code.
-Using techniques discussed in that paper, one could dynamically replace sections of a binary.
-Another, slightly hack-ish workaround would be to receive the upgrade, serialize the current state, and finally run the new binary based on the serialized state.
-A third way of circumventing this problem would be to encapsulate the code in a shared library, and have logic in the program to unmap the old code, replace the library, and remap it; a minimal sketch of this shared-library approach appears below.
-This approach is analogous to Erlang's approach.
-
-Other than dynamic code swapping and poor error detection, Erlang does not offer anything that is not offered by a traditional OS.
-Isolation, concurrency, and message passing can all be accomplished with unix-style system calls.
+{% cite Armstrong2010 --file dist-langs %}
+
+Erlang offers nothing that is unavailable in C.
+
+For example, dynamic code swapping is one of Erlang's major selling points.
+However, code swapping can easily be achieved in C with dynamic linking.
+This approach is analogous to the example offered in the Erlang paper.
+
+Other selling points, such as isolation, concurrency, and message passing, can all be accomplished with unix-style system calls.
 Why is this language not considered redundant?

 ## References
-- 
cgit v1.2.3


From 21ef2e4488013769d08a27765b21017e7713a91f Mon Sep 17 00:00:00 2001
From: cnnrznn
Date: Wed, 16 Nov 2016 14:34:56 -0500
Subject: Update dist-langs.md

---
 chapter/4/dist-langs.md | 6 ++++++
 1 file changed, 6 insertions(+)
(limited to 'chapter')

diff --git a/chapter/4/dist-langs.md b/chapter/4/dist-langs.md
index a097f51..9f04232 100644
--- a/chapter/4/dist-langs.md
+++ b/chapter/4/dist-langs.md
@@ -43,6 +43,12 @@ This approach is analogous to the example offered in the Erlang paper.
 Other selling points, such as isolation, concurrency, and message passing, can all be accomplished with unix-style system calls.
 Why is this language not considered redundant?

+#### MapReduce: A New Hope
+
+Unlike Erlang, MapReduce and the DSLs that implement the paradigm are "all the rage."
+MapReduce has seen wide adoption because it offers a true abstraction of the problems of distributed computing.
+Erlang only provided a way of detecting a process failure; it did not consider machine or network failures.
+ ## References {% bibliography --file dist-langs %} -- cgit v1.2.3 From 8fdcf6b4a99834901812f5d1c41e596ecc370647 Mon Sep 17 00:00:00 2001 From: msabhi Date: Thu, 17 Nov 2016 16:38:52 -0500 Subject: Update big-data.md --- chapter/8/big-data.md | 27 +++++++++++++++++++++++++-- 1 file changed, 25 insertions(+), 2 deletions(-) (limited to 'chapter') diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md index bfd3e7b..b08ae70 100644 --- a/chapter/8/big-data.md +++ b/chapter/8/big-data.md @@ -4,8 +4,31 @@ title: "Large Scale Parallel Data Processing" by: "Joe Schmoe and Mary Jane" --- -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. {% cite Uniqueness --file big-data %} +Though highly efficient and one of the first major programming models for distributed batch processing, it too has a few limitations.
+MapReduce does not scale easily and is highly inefficient for iterative and graph algorithms such as PageRank and many machine learning algorithms. Iterative algorithms require the programmer to explicitly handle intermediate results by writing them to disk. Hence, every iteration rereads the input and writes its results back to disk; this heavy disk I/O is a performance bottleneck for any batch processing system.
+Graph algorithms also require the exchange of messages between vertices. In the case of PageRank, every vertex needs the contributions of all its adjacent nodes to calculate its score. MapReduce lacks this message-passing model, which makes graph algorithms complex to reason about.
+The `bulk synchronous parallel` (BSP) model was introduced by Leslie Valiant in the late 1980s to model the hardware design of parallel computers. It has gained popularity as an alternative to MapReduce since it addresses the above-mentioned issues to an extent.
+In BSP model
++ Computation consists of several steps called supersteps.
+2> The processors involved have their own local memory, and every processor is connected to the others via point-to-point communication.
+3> At every superstep, a processor receives input at the beginning, performs computation, and produces output at the end.
+4> Barrier synchronization synchronizes all the processors at the end of every superstep.
+A notable feature of the model is the complete control over data through communication between processors at every superstep.
+Though similar to the MapReduce model, BSP preserves data in memory across supersteps, which helps in reasoning about iterative graph algorithms.
+Pregel is Google's implementation of the classic BSP model, built exclusively to analyze large graphs (PageRank being the canonical example). It was followed by open-source implementations - Apache's Giraph and Hama - which are BSP models built on top of Hadoop.
+Pregel is highly scalable and fault-tolerant, and it can successfully represent large, complex graphs. Google claims the API becomes easy once a developer adopts the "think like a vertex" mode.
+Pregel's computation is iterative, and every iteration is called a superstep. The system takes as input a directed graph with properties assigned to both vertices and edges. At each superstep, all vertices execute, in parallel, a user-defined function that represents the behavior of a vertex. The function has access to the messages sent to its vertex in the previous superstep S-1; it can update the state of the vertex and its edges, and it can send messages to other vertices, which will receive them in the next superstep S+1. Synchronization happens only between supersteps. Every vertex is either active or inactive at any superstep, and the iteration stops when all vertices are inactive. A vertex can deactivate itself by voting to halt, and it becomes active again if it receives a message. This message-passing design eliminates the shared memory, remote reads, and latency of the MapReduce model.
+Pregel's API provides
+1> compute(), a method in which the user implements the logic to change the state of the graph/vertex at every superstep. Message delivery is guaranteed through an iterator at every superstep.
+2> User-defined handlers for handling issues like a missing destination vertex.
+3> Combiners, which reduce the number of messages passed from multiple vertices to the same destination vertex.
+4> Aggregators, which capture the global state of the graph. A reduce operation combines the value given by every vertex to the aggregator, and the aggregated value is passed on to all the vertices in the next superstep.
+5> Fault tolerance, achieved through checkpointing and instructing the workers to save the state of their nodes to persistent storage. When a machine fails, all workers restart execution from the state of their most recent checkpoint.
+6> Master and worker implementation: the master partitions the graph into sets of vertices (hash on vertex ID mod the number of partitions) and outgoing edges per partition. Each partition is assigned to a worker, which manages the state of all its vertices by executing the compute() method and coordinating message communication. The workers also notify the master of the vertices that are active for the next superstep.
+Pregel works well for sparse graphs. However, dense graphs can cause communication overhead that overwhelms the system. Also, the entire computation state resides in main memory.
+Apache Giraph is an open-source implementation of Pregel that adds new features - master computation, sharded aggregators, edge-oriented input, out-of-core computation - making it more efficient. Another high-performance graph processing framework is GraphLab, developed at Carnegie Mellon University, which executes on top of MPI.
+
 ## References

-{% bibliography --file big-data %}
\ No newline at end of file
+{% bibliography --file big-data %}
-- 
cgit v1.2.3


From 99b7aadcaf75b04c9d7a31d6d9671d13f609db36 Mon Sep 17 00:00:00 2001
From: msabhi
Date: Thu, 17 Nov 2016 16:40:46 -0500
Subject: Update big-data.md

---
 chapter/8/big-data.md | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)
(limited to 'chapter')

diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md
index b08ae70..9dfd1d6 100644
--- a/chapter/8/big-data.md
+++ b/chapter/8/big-data.md
@@ -7,24 +7,25 @@ by: "Joe Schmoe and Mary Jane"
 Though highly efficient and one of the first major programming models for distributed batch processing, MapReduce has a few limitations.
MapReduce does not scale easily and is highly inefficient for iterative and graph algorithms such as PageRank and many machine learning algorithms. Iterative algorithms require the programmer to explicitly handle intermediate results by writing them to disk. Hence, every iteration rereads the input and writes its results back to disk; this heavy disk I/O is a performance bottleneck for any batch processing system.
Graph algorithms also require the exchange of messages between vertices. In the case of PageRank, every vertex needs the contributions of all its adjacent nodes to calculate its score. MapReduce lacks this message-passing model, which makes graph algorithms complex to reason about.
-The `bulk synchronous parallel` (BSP) model was introduced by Leslie Valiant in the late 1980s to model the hardware design of parallel computers. It has gained popularity as an alternative to MapReduce since it addresses the above-mentioned issues to an extent.
+### Bulk synchronous parallel model
+This model was introduced by Leslie Valiant in the late 1980s to model the hardware design of parallel computers. It has gained popularity as an alternative to MapReduce since it addresses the above-mentioned issues to an extent.
 In BSP model
 + Computation consists of several steps called supersteps.
++ The processors involved have their own local memory, and every processor is connected to the others via point-to-point communication.
++ At every superstep, a processor receives input at the beginning, performs computation, and produces output at the end.
++ Barrier synchronization synchronizes all the processors at the end of every superstep.
+A notable feature of the model is the complete control over data through communication between processors at every superstep.
+Though similar to map reduce model, BSP preserves data in memory across supersteps and helps in reasoning iterative graph algorithms.
`Pregel` is Google's implementation of the classic BSP model, built exclusively to analyze large graphs (PageRank being the canonical example). It was followed by open-source implementations - Apache's Giraph and Hama - which are BSP models built on top of Hadoop.
Pregel is highly scalable and fault-tolerant, and it can successfully represent large, complex graphs. Google claims the API becomes easy once a developer adopts the "think like a vertex" mode.
Pregel's computation is iterative, and every iteration is called a superstep. The system takes as input a directed graph with properties assigned to both vertices and edges. At each superstep, all vertices execute, in parallel, a user-defined function that represents the behavior of a vertex. The function has access to the messages sent to its vertex in the previous superstep S-1; it can update the state of the vertex and its edges, and it can send messages to other vertices, which will receive them in the next superstep S+1. Synchronization happens only between supersteps. Every vertex is either active or inactive at any superstep, and the iteration stops when all vertices are inactive. A vertex can deactivate itself by voting to halt, and it becomes active again if it receives a message. This message-passing design eliminates the shared memory, remote reads, and latency of the MapReduce model.
#### Pregel's API provides
+ compute(), a method in which the user implements the logic to change the state of the graph/vertex at every superstep. Message delivery is guaranteed through an iterator at every superstep.
+ User-defined handlers for handling issues like a missing destination vertex.
+ Combiners, which reduce the number of messages passed from multiple vertices to the same destination vertex.
+ Aggregators, which capture the global state of the graph. A reduce operation combines the value given by every vertex to the aggregator, and the aggregated value is passed on to all the vertices in the next superstep.
+ Fault tolerance, achieved through checkpointing and instructing the workers to save the state of their nodes to persistent storage. When a machine fails, all workers restart execution from the state of their most recent checkpoint.
+ Master and worker implementation: the master partitions the graph into sets of vertices (hash on vertex ID mod the number of partitions) and outgoing edges per partition. Each partition is assigned to a worker, which manages the state of all its vertices by executing the compute() method and coordinating message communication. The workers also notify the master of the vertices that are active for the next superstep.
Pregel works well for sparse graphs. However, dense graphs can cause communication overhead that overwhelms the system. Also, the entire computation state resides in main memory.
Apache Giraph is an open-source implementation of Pregel that adds new features - master computation, sharded aggregators, edge-oriented input, out-of-core computation - making it more efficient. Another high-performance graph processing framework is GraphLab, developed at Carnegie Mellon University, which executes on top of MPI.
-- 
cgit v1.2.3


From d63ceb8aa8c280d789e3163426ea86bff16997e0 Mon Sep 17 00:00:00 2001
From: msabhi
Date: Thu, 17 Nov 2016 16:41:41 -0500
Subject: Update big-data.md

---
 chapter/8/big-data.md | 1 +
 1 file changed, 1 insertion(+)
(limited to 'chapter')

diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md
index 9dfd1d6..6a3edd3 100644
--- a/chapter/8/big-data.md
+++ b/chapter/8/big-data.md
@@ -14,6 +14,7 @@ In BSP model
 + The processors involved have their own local memory, and every processor is connected to the others via point-to-point communication.
 + At every superstep, a processor receives input at the beginning, performs computation, and produces output at the end.
 + Barrier synchronization synchronizes all the processors at the end of every superstep.
++ A notable feature of the model is the complete control over data through communication between processors at every superstep.
Though similar to the MapReduce model, BSP preserves data in memory across supersteps, which helps in reasoning about iterative graph algorithms.
`Pregel` is an implementation of classic BSP model by Google (PageRank) to analyze large graphs exclusively. It was followed by open source implementations - Apache’s Giraph and Hama; which were BSP models built on top of Hadoop. -- cgit v1.2.3 From ce6727b47cbd0d4f0ac3407b498982a09c4b3e50 Mon Sep 17 00:00:00 2001 From: msabhi Date: Thu, 17 Nov 2016 16:43:04 -0500 Subject: Update big-data.md --- chapter/8/big-data.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'chapter') diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md index 6a3edd3..eef8a8f 100644 --- a/chapter/8/big-data.md +++ b/chapter/8/big-data.md @@ -15,7 +15,7 @@ In BSP model + At every superstep, a processor receives input at the beginning, performs computation and outputs at the end. + Barrier synchronization synchs all the processors at the end of every superstep. + -A notable feature of the model is the complete control on data through communication between every processor at every superstep.
++A notable feature of the model is the complete control on data through communication between every processor at every superstep.
Though similar to map reduce model, BSP preserves data in memory across supersteps and helps in reasoning iterative graph algorithms.
`Pregel` is an implementation of classic BSP model by Google (PageRank) to analyze large graphs exclusively. It was followed by open source implementations - Apache’s Giraph and Hama; which were BSP models built on top of Hadoop. Pregel is highly scalable, fault-tolerant and can successfully represent larger complex graphs. Google claims the API becomes easy once a developer adopts “think like a vertex” mode. -- cgit v1.2.3 From 7ad2750c2af8b62717eef017cbdb4a370fbca2e5 Mon Sep 17 00:00:00 2001 From: msabhi Date: Thu, 17 Nov 2016 16:44:00 -0500 Subject: Update big-data.md --- chapter/8/big-data.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'chapter') diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md index eef8a8f..d17d2b1 100644 --- a/chapter/8/big-data.md +++ b/chapter/8/big-data.md @@ -19,8 +19,8 @@ In BSP model Though similar to map reduce model, BSP preserves data in memory across supersteps and helps in reasoning iterative graph algorithms.
`Pregel` is an implementation of classic BSP model by Google (PageRank) to analyze large graphs exclusively. It was followed by open source implementations - Apache’s Giraph and Hama; which were BSP models built on top of Hadoop. Pregel is highly scalable, fault-tolerant and can successfully represent larger complex graphs. Google claims the API becomes easy once a developer adopts “think like a vertex” mode. -Pregel’s computation system is iterative and every iteration is called as superstep. The system takes a directed graph as input with properties assigned to both vertices and graph. At each superstep, all vertices executes in parallel, a user-defined function which represents the behavior of the vertex. The function has access to message sent to its vertex from the previous superstep S-1 and can update the state of the vertex, its edges, the graph and even send messages to other vertices which would receive in the next superstep S+1. The synchronization happens only between two supersteps. Every vertex is either active or inactive at any superstep. The iteration stops when all the vertices are inactive. A vertex can deactivate itself by voting for it and gets active if it receives a message. This asynchronous message passing feature eliminates the shared memory, remote reads and latency of Map reduce model. -#### Pregel’s API provides +Pregel’s computation system is iterative and every iteration is called as superstep. The system takes a directed graph as input with properties assigned to both vertices and graph. At each superstep, all vertices executes in parallel, a user-defined function which represents the behavior of the vertex. The function has access to message sent to its vertex from the previous superstep S-1 and can update the state of the vertex, its edges, the graph and even send messages to other vertices which would receive in the next superstep S+1. The synchronization happens only between two supersteps. Every vertex is either active or inactive at any superstep. The iteration stops when all the vertices are inactive. A vertex can deactivate itself by voting for it and gets active if it receives a message. This asynchronous message passing feature eliminates the shared memory, remote reads and latency of Map reduce model.
+Pregel’s API provides
+ compute() method for the user to implement the logic to change the state of the graph/vertex at every superstep. It guarantees message delivery through an iterator at every superstep. + User defined handler for handling issues like missing destination vertex etc. + Combiners reduce the amount of messages passed from multiple vertices to the same destination vertex. -- cgit v1.2.3 From c4ffeb4613d4cc96d4f520d4f57be9b22d951c1f Mon Sep 17 00:00:00 2001 From: msabhi Date: Thu, 17 Nov 2016 16:44:30 -0500 Subject: Update big-data.md --- chapter/8/big-data.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'chapter') diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md index d17d2b1..89e3d6a 100644 --- a/chapter/8/big-data.md +++ b/chapter/8/big-data.md @@ -13,9 +13,9 @@ In BSP model + Computation consists of several steps called as supersets. + The processors involved have their own local memory and every processor is connected to other via a point-to-point communication. + At every superstep, a processor receives input at the beginning, performs computation and outputs at the end. -+ Barrier synchronization synchs all the processors at the end of every superstep. -+ -+A notable feature of the model is the complete control on data through communication between every processor at every superstep.
++ Barrier synchronization synchs all the processors at the end of every superstep.
+ +A notable feature of the model is the complete control on data through communication between every processor at every superstep.
Though similar to map reduce model, BSP preserves data in memory across supersteps and helps in reasoning iterative graph algorithms.
`Pregel` is an implementation of classic BSP model by Google (PageRank) to analyze large graphs exclusively. It was followed by open source implementations - Apache’s Giraph and Hama; which were BSP models built on top of Hadoop. Pregel is highly scalable, fault-tolerant and can successfully represent larger complex graphs. Google claims the API becomes easy once a developer adopts “think like a vertex” mode. -- cgit v1.2.3 From 5b5b7221244988db67903836ebe6d76194001f23 Mon Sep 17 00:00:00 2001 From: msabhi Date: Thu, 17 Nov 2016 16:45:45 -0500 Subject: Update big-data.md --- chapter/8/big-data.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'chapter') diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md index 89e3d6a..0bafaed 100644 --- a/chapter/8/big-data.md +++ b/chapter/8/big-data.md @@ -26,7 +26,7 @@ Pregel’s API provides
+ Combiners reduce the amount of messages passed from multiple vertices to the same destination vertex. + Aggregators capture the global state of the graph. A reduce operation combines the value given by every vertex to the aggregator. The combined/aggregated value is passed onto to all the vertices in the next superstep. + Fault tolerance is achieved through checkpointing and instructing the workers to save the state of nodes to a persistent storage. When a machine fails, all workers restart the execution with state of their recent checkpoint. -+ Master and worker implementation : The master partitions graph into set of vertices (hash on vertex ID mod number of partitions) and outgoing edges per partition. Each partition is assigned to a worker who manages the state of all its vertices by executing compute() method and coordinating the message communication. The workers also notifies the master of the vertices that are active for the next superstep. ++ Master and worker implementation : The master partitions graph into set of vertices (hash on vertex ID mod number of partitions) and outgoing edges per partition. Each partition is assigned to a worker who manages the state of all its vertices by executing compute() method and coordinating the message communication. The workers also notifies the master of the vertices that are active for the next superstep.
Pregel works good for sparse graphs. However, dense graph could cause communication overhead resulting in system to break. Also, the entire computation state resides in the main memory. Apache Giraph is an open source implementation of Pregel in which new features like master computation, sharded aggregators, edge-oriented input, out-of-core computation are added making it more efficient. The most high performance graph processing framework is GraphLab which is developed at Carnegie Melon University and uses the BSP model and executes on MPI. -- cgit v1.2.3 From 8acda3d0cfc366f2c6e52836635c347c312ba7c2 Mon Sep 17 00:00:00 2001 From: msabhi Date: Thu, 17 Nov 2016 16:46:45 -0500 Subject: Update big-data.md --- chapter/8/big-data.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'chapter') diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md index 0bafaed..ec39127 100644 --- a/chapter/8/big-data.md +++ b/chapter/8/big-data.md @@ -26,7 +26,7 @@ Pregel’s API provides
+ Combiners reduce the amount of messages passed from multiple vertices to the same destination vertex. + Aggregators capture the global state of the graph. A reduce operation combines the value given by every vertex to the aggregator. The combined/aggregated value is passed onto to all the vertices in the next superstep. + Fault tolerance is achieved through checkpointing and instructing the workers to save the state of nodes to a persistent storage. When a machine fails, all workers restart the execution with state of their recent checkpoint. -+ Master and worker implementation : The master partitions graph into set of vertices (hash on vertex ID mod number of partitions) and outgoing edges per partition. Each partition is assigned to a worker who manages the state of all its vertices by executing compute() method and coordinating the message communication. The workers also notifies the master of the vertices that are active for the next superstep.
++ Master and worker implementation : The master partitions graph into set of vertices (hash on vertex ID mod number of partitions) and outgoing edges per partition. Each partition is assigned to a worker who manages the state of all its vertices by executing compute() method and coordinating the message communication. The workers also notifies the master of the vertices that are active for the next superstep.
Pregel works good for sparse graphs. However, dense graph could cause communication overhead resulting in system to break. Also, the entire computation state resides in the main memory. Apache Giraph is an open source implementation of Pregel in which new features like master computation, sharded aggregators, edge-oriented input, out-of-core computation are added making it more efficient. The most high performance graph processing framework is GraphLab which is developed at Carnegie Melon University and uses the BSP model and executes on MPI. -- cgit v1.2.3 From 59d3351d76963bf6da6489233a4f7adc098382d0 Mon Sep 17 00:00:00 2001 From: msabhi Date: Thu, 17 Nov 2016 16:47:58 -0500 Subject: Update big-data.md --- chapter/8/big-data.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'chapter') diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md index ec39127..e2ff3e3 100644 --- a/chapter/8/big-data.md +++ b/chapter/8/big-data.md @@ -15,8 +15,7 @@ In BSP model + At every superstep, a processor receives input at the beginning, performs computation and outputs at the end. + Barrier synchronization synchs all the processors at the end of every superstep.
-A notable feature of the model is the complete control on data through communication between every processor at every superstep.
-Though similar to map reduce model, BSP preserves data in memory across supersteps and helps in reasoning iterative graph algorithms.
+A notable feature of the model is the complete control on data through communication between every processor at every superstep. BSP preserves data in memory across supersteps and helps in reasoning iterative graph algorithms.
`Pregel` is an implementation of classic BSP model by Google (PageRank) to analyze large graphs exclusively. It was followed by open source implementations - Apache’s Giraph and Hama; which were BSP models built on top of Hadoop. Pregel is highly scalable, fault-tolerant and can successfully represent larger complex graphs. Google claims the API becomes easy once a developer adopts “think like a vertex” mode. Pregel’s computation system is iterative and every iteration is called as superstep. The system takes a directed graph as input with properties assigned to both vertices and graph. At each superstep, all vertices executes in parallel, a user-defined function which represents the behavior of the vertex. The function has access to message sent to its vertex from the previous superstep S-1 and can update the state of the vertex, its edges, the graph and even send messages to other vertices which would receive in the next superstep S+1. The synchronization happens only between two supersteps. Every vertex is either active or inactive at any superstep. The iteration stops when all the vertices are inactive. A vertex can deactivate itself by voting for it and gets active if it receives a message. This asynchronous message passing feature eliminates the shared memory, remote reads and latency of Map reduce model.
@@ -27,6 +26,7 @@ Pregel’s API provides
+ Aggregators capture the global state of the graph. A reduce operation combines the value given by every vertex to the aggregator. The combined/aggregated value is passed onto to all the vertices in the next superstep. + Fault tolerance is achieved through checkpointing and instructing the workers to save the state of nodes to a persistent storage. When a machine fails, all workers restart the execution with state of their recent checkpoint. + Master and worker implementation : The master partitions graph into set of vertices (hash on vertex ID mod number of partitions) and outgoing edges per partition. Each partition is assigned to a worker who manages the state of all its vertices by executing compute() method and coordinating the message communication. The workers also notifies the master of the vertices that are active for the next superstep.
+ Pregel works good for sparse graphs. However, dense graph could cause communication overhead resulting in system to break. Also, the entire computation state resides in the main memory. Apache Giraph is an open source implementation of Pregel in which new features like master computation, sharded aggregators, edge-oriented input, out-of-core computation are added making it more efficient. The most high performance graph processing framework is GraphLab which is developed at Carnegie Melon University and uses the BSP model and executes on MPI. -- cgit v1.2.3 From 607c2f97c8c032b912bc64c553b43b694f10f693 Mon Sep 17 00:00:00 2001 From: msabhi Date: Thu, 17 Nov 2016 16:59:19 -0500 Subject: Update big-data.md --- chapter/8/big-data.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) (limited to 'chapter') diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md index e2ff3e3..cf13efa 100644 --- a/chapter/8/big-data.md +++ b/chapter/8/big-data.md @@ -1,7 +1,7 @@ --- layout: page title: "Large Scale Parallel Data Processing" -by: "Joe Schmoe and Mary Jane" +by: "JingJing and Abhilash" --- Though highly efficient and one of the first major programming models for distributed batch processing, it too has a few limitations.
@@ -32,5 +32,7 @@ Apache Giraph is an open source implementation of Pregel in which new features l ## References +"Bulk synchronous model" http://www.cse.unt.edu/~tarau/teaching/parpro/papers/Bulk%20synchronous%20parallel.pdf. +"Pregel: A System for Large-Scale Graph Processing." +"One trillion edges: graph processing at Facebook-scale" -{% bibliography --file big-data %} -- cgit v1.2.3 From 37e2fe6098829d50679546be27d744487918d488 Mon Sep 17 00:00:00 2001 From: msabhi Date: Thu, 17 Nov 2016 16:59:38 -0500 Subject: Update big-data.md --- chapter/8/big-data.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'chapter') diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md index cf13efa..f1e53e0 100644 --- a/chapter/8/big-data.md +++ b/chapter/8/big-data.md @@ -33,6 +33,6 @@ Apache Giraph is an open source implementation of Pregel in which new features l ## References "Bulk synchronous model" http://www.cse.unt.edu/~tarau/teaching/parpro/papers/Bulk%20synchronous%20parallel.pdf. -"Pregel: A System for Large-Scale Graph Processing." +"Pregel: A System for Large-Scale Graph Processing."
"One trillion edges: graph processing at Facebook-scale" -- cgit v1.2.3 From 3fc056ab35031b0c47df3a52c65a812428383250 Mon Sep 17 00:00:00 2001 From: msabhi Date: Thu, 17 Nov 2016 17:01:47 -0500 Subject: Update big-data.md --- chapter/8/big-data.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'chapter') diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md index f1e53e0..4c1f060 100644 --- a/chapter/8/big-data.md +++ b/chapter/8/big-data.md @@ -34,5 +34,5 @@ Apache Giraph is an open source implementation of Pregel in which new features l ## References "Bulk synchronous model" http://www.cse.unt.edu/~tarau/teaching/parpro/papers/Bulk%20synchronous%20parallel.pdf. "Pregel: A System for Large-Scale Graph Processing."
-"One trillion edges: graph processing at Facebook-scale" +"One Trillion Edges: Graph Processing at Facebook-Scale." Accessed November 17, 2016. http://www.vldb.org/pvldb/vol8/p1804-ching.pdf. -- cgit v1.2.3 From 74473b82407edd9bc5f442103715985e1adc5859 Mon Sep 17 00:00:00 2001 From: Jingjing Ren Date: Thu, 24 Nov 2016 22:15:48 -0500 Subject: add mapreduce+flumejava+skeleton --- chapter/8/big-data.md | 113 ++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 106 insertions(+), 7 deletions(-) (limited to 'chapter') diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md index 4c1f060..34a14f1 100644 --- a/chapter/8/big-data.md +++ b/chapter/8/big-data.md @@ -1,18 +1,116 @@ --- layout: page title: "Large Scale Parallel Data Processing" -by: "JingJing and Abhilash" +by: "Jingjing and Abhilash" --- +## Introduction +`JJ: Placeholder for introduction` The booming Internet has generated big data... + + +This chapter is organized in JJ: need to fill in more stuff + +- **Data paralleling**: + - MapReduce {% cite dean2008mapreduce --file big-data %} + - FlumeJava {% cite chambers2010flumejava --file big-data %} + - ... +- **Graph paralleling**: + - Pregel + - ... + +For each programming model, we will discuss the motivation, basic model, execution model, fault-tolerance and performance. + + +Ideas: get a table of what to include in the context +Idea: instead of data/graph, maybe add one more layer (unstructured vs. structured) + +# Data paralleling + +## MapReduce (2004) +MapReduce {% cite dean2008mapreduce --file big-data %} is a programming model that allows programmers to express the simple computations for terabytes data on thousands of commodity machines. + +**Basic & Examples** +This model applies to computations that are usually parallelizable: A `map` function can operate on each logical "record", this generates a set of intermediate key/value pairs, and then a `reduce` function applies on all values that share the same key and generate one or zero output value. + +Concretely, considering the problem of counting the number of occurrence of each word in a large collection of documents: each time, a `map` function that emits a word plus its count 1; a `reduce` function sums together all counts emitted for the same word + +``` +map(String key, String value): + // key: document name + // value: document contents + for each word w in value: + EmitIntermediate(w, "1"); + +reduce(String key, Iterator values): + // key: a word + // values: a list of counts + int result = 0; + for each v in values: + result += ParseInt(v); + Emit(AsString(result)); +``` + +Conceptually, the map and reduction functions have associated **types**: +``` +map (k1,v1) -> → list(k2,v2) +reduce (k2,list(v2)) -> list(v2) +``` +The input keys and values are drawn from a different domain than the output keys and values. The intermediate keys and values are from the same domain as the output keys and values. The implementation given by the authors essentially pass strings and it is users' responsibility to convert between strings and appropriate types. + +More formalized descriptions about the `map` and `reduce` function can be found in the original paper {% cite dean2008mapreduce --file big-data %}. 
+**Execution**
+At a high level, when the user program calls the *MapReduce* function, the input files are split into *M* pieces and the *map* function runs on the corresponding splits; the intermediate key space is then partitioned into *R* pieces using a partitioning function; after the reduce functions all successfully complete, the output is available in *R* files. The sequence of actions {% cite dean2008mapreduce --file big-data %} is shown in the figure below. We can see from labels (4) and (5) that the intermediate key/value pairs are written to and read from disk; this is key to fault tolerance in the MapReduce model, and also a bottleneck for more complex computation algorithms.
+ MapReduce Execution Overview +
+**Fault Tolerance**
+In this model, there are two parts that can fail: the master and the workers.
+- Worker failure: The master pings every worker periodically, and if a worker does not respond within a certain amount of time, the master marks it as failed and re-assigns its task to an idle worker.
+- Master failure: If the master fails, the MapReduce function fails. The model assumes that the master won't fail; there are separate mechanisms to back up the master, which are out of the scope of our discussion.
+
+The output of the distributed computation should be the same as that of a non-faulting sequential execution of the entire program. The model relies on the atomic commits of map and reduce task outputs to achieve this. The basic idea is to create private temporary files and rename them only when the task has finished.
+
+There are some practices in this paper that make the model work very well at Google; one of them is **backup tasks**: when a MapReduce operation is close to completion, the master schedules backup executions of the remaining in-progress tasks ("stragglers"). The task is marked as completed whenever either the primary or the backup execution completes.
+
+`JJ: what about other refinement: `
+
+**Performance**
+In the paper, the authors measure the performance of MapReduce with two computations running on a large cluster of machines. One computation *greps* through approximately 1TB of data. The other *sorts* approximately 1TB of data. Both computations take on the order of a hundred seconds. In addition, backup tasks do help reduce execution time substantially. In an experiment where 200 out of 1746 tasks were intentionally killed, the scheduler was able to recover quickly and finish the whole computation in only 5% more time.
+Overall, the performance is very good for such conceptually simple computations.
+
+
+## FlumeJava (2010)
+Many real-world computations involve a pipeline of MapReduces, and this motivates additional management to chain those separate MapReduce stages together in an efficient way. FlumeJava {% cite chambers2010flumejava --file big-data %} helps build those pipelines and keeps computations modular. At its core, FlumeJava is a couple of classes that represent immutable parallel collections. It defers evaluation and optimization by internally constructing an execution-plan dataflow graph.

+**Core Abstraction**
+
+- `PCollection<T>`, an immutable bag of elements of type `T`
+- `recordOf(...)`, which specifies the encoding of an instance
+- `PTable<K,V>`, a subclass of `PCollection<Pair<K,V>>`, an immutable multi-map with keys of type `K` and values of type `V`
+- `parallelDo()`, which can express both the map and reduce parts of MapReduce
+- `groupByKey()`, the same as the shuffle step of MapReduce `JJ: clear this in MapReduce`
+- `combineValues()`, semantically a special case of `parallelDo()`; a combination of a MapReduce combiner and a MapReduce reducer, which is more efficient than doing all the combining in the reducer
+
+**Deferred Evaluation**
+`(JJ: placeholder) join, deferred/materialized; execution plan; figure 1 initial execution plan`
+
+**Optimizer**
+`(JJ: placeholder) parallelDo Fusion; MSCR; overall goal to produce the fewest, most efficient MSCR operations in the final optimized plan`
+
+# Graph paralleling
 Though highly efficient and one of the first major programming models for distributed batch processing, MapReduce has a few limitations.
MapReduce does not scale easily and is highly inefficient for iterative and graph algorithms such as PageRank and many machine learning algorithms. Iterative algorithms require the programmer to explicitly handle intermediate results by writing them to disk. Hence, every iteration rereads the input and writes its results back to disk; this heavy disk I/O is a performance bottleneck for any batch processing system.
Graph algorithms also require the exchange of messages between vertices. In the case of PageRank, every vertex needs the contributions of all its adjacent nodes to calculate its score. MapReduce lacks this message-passing model, which makes graph algorithms complex to reason about.
## Bulk synchronous parallel model
This model was introduced by Leslie Valiant in the late 1980s to model the hardware design of parallel computers. It has gained popularity as an alternative to MapReduce since it addresses the above-mentioned issues to an extent.
In BSP model
+ Computation consists of several steps called supersteps.
+ The processors involved have their own local memory, and every processor is connected to the others via point-to-point communication.
+ At every superstep, a processor receives input at the beginning, performs computation, and produces output at the end.
+ Barrier synchronization synchronizes all the processors at the end of every superstep.
A notable feature of the model is the complete control over data through communication between processors at every superstep. BSP preserves data in memory across supersteps, which helps in reasoning about iterative graph algorithms. A schematic sketch of a single BSP worker loop appears below.
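As a schematic illustration of these points - and not the API of any particular BSP system - a worker's life can be sketched as a loop of local computation, point-to-point message exchange, and a barrier. The `exchange()` and `barrier()` helpers here are single-process stand-ins for a real runtime's communication layer:

```cpp
// Schematic single-process sketch of a BSP worker loop. exchange() simply
// hands the outbox back as the next superstep's inbox, and barrier() is a
// no-op; in a real system these would talk to the other workers.
#include <functional>
#include <utility>
#include <vector>

struct Message { int dst; double value; };

std::vector<Message> exchange(std::vector<Message> outbox) { return outbox; }
void barrier() {}

void run_bsp(const std::function<std::vector<Message>(const std::vector<Message>&)>& step,
             int supersteps) {
  std::vector<Message> inbox;                   // messages produced in superstep S-1
  for (int s = 0; s < supersteps; ++s) {
    std::vector<Message> outbox = step(inbox);  // compute on local memory
    inbox = exchange(std::move(outbox));        // point-to-point communication
    barrier();                                  // all workers sync before S+1
  }
}

int main() {
  run_bsp([](const std::vector<Message>& in) {
    return std::vector<Message>{{0, in.size() + 1.0}};  // toy step function
  }, 3);
}
```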
@@ -20,7 +118,7 @@ A notable feature of the model is the complete control on data through communica Pregel is highly scalable, fault-tolerant and can successfully represent larger complex graphs. Google claims the API becomes easy once a developer adopts “think like a vertex” mode. Pregel’s computation system is iterative and every iteration is called as superstep. The system takes a directed graph as input with properties assigned to both vertices and graph. At each superstep, all vertices executes in parallel, a user-defined function which represents the behavior of the vertex. The function has access to message sent to its vertex from the previous superstep S-1 and can update the state of the vertex, its edges, the graph and even send messages to other vertices which would receive in the next superstep S+1. The synchronization happens only between two supersteps. Every vertex is either active or inactive at any superstep. The iteration stops when all the vertices are inactive. A vertex can deactivate itself by voting for it and gets active if it receives a message. This asynchronous message passing feature eliminates the shared memory, remote reads and latency of Map reduce model.
Pregel's API provides:
+ compute(), a method in which the user implements the logic to change the state of the graph/vertex at every superstep. Message delivery is guaranteed through an iterator at every superstep. (A sketch of a user-defined compute() appears after this list.)
+ User-defined handlers for handling issues like a missing destination vertex.
+ Combiners, which reduce the number of messages passed from multiple vertices to the same destination vertex.
+ Aggregators, which capture the global state of the graph. A reduce operation combines the value given by every vertex to the aggregator, and the aggregated value is passed on to all the vertices in the next superstep.
+ Fault tolerance, achieved through checkpointing and instructing the workers to save the state of their nodes to persistent storage. When a machine fails, all workers restart execution from the state of their most recent checkpoint.
+ Master and worker implementation: the master partitions the graph into sets of vertices (hash on vertex ID mod the number of partitions) and outgoing edges per partition. Each partition is assigned to a worker, which manages the state of all its vertices by executing the compute() method and coordinating message communication. The workers also notify the master of the vertices that are active for the next superstep.
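For concreteness, here is what a user-defined compute() can look like for PageRank, written in the spirit of the C++ API in the Pregel paper. The `MessageIterator` and `VertexBase` classes below are simplified stand-ins invented for this sketch; the real interface is richer:

```cpp
// PageRank in a Pregel-style vertex API. The base class and iterator are
// minimal stand-ins so the sketch is self-contained.
#include <cstddef>
#include <cstdint>
#include <vector>

struct MessageIterator {
  std::vector<double> msgs;
  std::size_t i = 0;
  bool Done() const { return i >= msgs.size(); }
  void Next() { ++i; }
  double Value() const { return msgs[i]; }
};

class VertexBase {
 public:
  virtual ~VertexBase() = default;
  virtual void Compute(MessageIterator* msgs) = 0;
 protected:
  int64_t superstep_ = 0; double value_ = 1.0;
  int64_t num_vertices_ = 1; int64_t out_degree_ = 1;
  int64_t superstep() const { return superstep_; }
  double* MutableValue() { return &value_; }
  double GetValue() const { return value_; }
  int64_t NumVertices() const { return num_vertices_; }
  void SendMessageToAllNeighbors(double) { /* runtime delivers at S+1 */ }
  void VoteToHalt() { /* deactivate until a message arrives */ }
};

class PageRankVertex : public VertexBase {
 public:
  void Compute(MessageIterator* msgs) override {
    if (superstep() >= 1) {
      double sum = 0;
      for (; !msgs->Done(); msgs->Next()) sum += msgs->Value();  // neighbor contributions
      *MutableValue() = 0.15 / NumVertices() + 0.85 * sum;
    }
    if (superstep() < 30) {
      SendMessageToAllNeighbors(GetValue() / out_degree_);  // rank split over out-edges
    } else {
      VoteToHalt();  // inactive until an incoming message reactivates us
    }
  }
};
```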
"One Trillion Edges: Graph Processing at Facebook-Scale." Accessed November 17, 2016. http://www.vldb.org/pvldb/vol8/p1804-ching.pdf. - -- cgit v1.2.3 From e2e0995491d8f3588d6214a2b21351063f17e9e3 Mon Sep 17 00:00:00 2001 From: Jingjing Ren Date: Thu, 24 Nov 2016 22:28:51 -0500 Subject: mv ref to .bib --- chapter/8/big-data.md | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) (limited to 'chapter') diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md index 34a14f1..d49d5a1 100644 --- a/chapter/8/big-data.md +++ b/chapter/8/big-data.md @@ -14,7 +14,7 @@ This chapter is organized in