From 3ceac593715dec5c1dc42357d3a6d28e96eeb722 Mon Sep 17 00:00:00 2001
From: Connor Zanin
Date: Mon, 12 Dec 2016 14:54:07 -0500
Subject: image

---
 chapter/4/MR.png | Bin 0 -> 83219 bytes
 chapter/4/dist-langs.md | 16 ++++++++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)
 create mode 100644 chapter/4/MR.png

diff --git a/chapter/4/MR.png b/chapter/4/MR.png
new file mode 100644
index 0000000..54db004
Binary files /dev/null and b/chapter/4/MR.png differ
diff --git a/chapter/4/dist-langs.md b/chapter/4/dist-langs.md
index 21adfa6..64eccc0 100644
--- a/chapter/4/dist-langs.md
+++ b/chapter/4/dist-langs.md
@@ -283,9 +283,21 @@ Computation is data-centric, and expressed easily as a directed acyclic graph (D
 
 Unlike the DSM and actor models, processes are not exposed to the programmer. Rather, the programmer designs the data transformations, and a system is responsible for initializing processes and distributing work accross a system.
 
-#### Multilisp ()
-#### MapReduce ()
+#### MapReduce (2004)
+
+* input key-value pairs -> output key-value pairs
+* Map and Reduced chained to create programs
+* Map
+  * input key-value pairs transformed into intermediate key-value pairs
+* Reduce
+  * intermediate keys are aggregated by key
+  * function performs some action based on all values associated with an intermediate key
+* Map and Reduce may emit zero, one, or many key-value pairs per input
+
+![Alt text] (/MR.png "MapReduce Wordcount Workflow")
+
 #### DryadLINQ ()
+#### Discretized Streams (2012)
 
 ### Which is best? Why?
 
-- 
cgit v1.2.3
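
The MapReduce notes added in this patch (map over input key-value pairs, group intermediate pairs by key, reduce each group) can be illustrated with a word count, the workflow the `MR.png` caption names. What follows is a minimal single-process sketch, not the actual distributed system: `run_mapreduce`, `map_wordcount`, and `reduce_wordcount` are hypothetical names chosen for illustration, and the in-memory grouping step stands in for the shuffle a real framework would do across machines.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

KV = Tuple[str, int]


def map_wordcount(key: str, value: str) -> Iterator[KV]:
    """Map: one (document name, text) pair -> zero or more (word, 1) pairs."""
    for word in value.lower().split():
        yield (word, 1)


def reduce_wordcount(key: str, values: List[int]) -> Iterator[KV]:
    """Reduce: all counts seen for one word -> a single (word, total) pair."""
    yield (key, sum(values))


def run_mapreduce(
    inputs: Iterable[Tuple[str, str]],
    mapper: Callable[[str, str], Iterable[KV]],
    reducer: Callable[[str, List[int]], Iterable[KV]],
) -> List[KV]:
    # Map phase: every input pair may emit zero, one, or many intermediate pairs.
    groups: Dict[str, List[int]] = defaultdict(list)
    for in_key, in_value in inputs:
        for mid_key, mid_value in mapper(in_key, in_value):
            # Group intermediate values by key (local stand-in for the shuffle).
            groups[mid_key].append(mid_value)

    # Reduce phase: one reducer call per intermediate key, over all its values.
    output: List[KV] = []
    for mid_key in sorted(groups):
        output.extend(reducer(mid_key, groups[mid_key]))
    return output


docs = [("a.txt", "the quick fox"), ("b.txt", "the lazy dog the end")]
counts = dict(run_mapreduce(docs, map_wordcount, reduce_wordcount))
```

Because the output is again a list of key-value pairs, it could be fed to another `run_mapreduce` call, which is one way to read the note that Map and Reduce are chained to create programs.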