Diffstat (limited to 'chapter')
-rw-r--r--  chapter/1/figures/grpc-benchmark.png  bin 0 -> 17014 bytes
-rw-r--r--  chapter/1/figures/grpc-client-transport-handler.png  bin 0 -> 67834 bytes
-rw-r--r--  chapter/1/figures/grpc-cross-language.png  bin 0 -> 27394 bytes
-rw-r--r--  chapter/1/figures/grpc-googleapis.png  bin 0 -> 33354 bytes
-rw-r--r--  chapter/1/figures/grpc-languages.png  bin 0 -> 47003 bytes
-rw-r--r--  chapter/1/figures/grpc-server-transport-handler.png  bin 0 -> 60913 bytes
-rw-r--r--  chapter/1/figures/hello-world-client.png  bin 0 -> 30161 bytes
-rw-r--r--  chapter/1/figures/hello-world-server.png  bin 0 -> 13005 bytes
-rw-r--r--  chapter/1/figures/http2-frame.png  bin 0 -> 12057 bytes
-rw-r--r--  chapter/1/figures/http2-stream-lifecycle.png  bin 0 -> 49038 bytes
-rw-r--r--  chapter/1/figures/protobuf-types.png  bin 0 -> 19941 bytes
-rw-r--r--  chapter/1/gRPC.md  323
-rw-r--r--  chapter/1/rpc.md  378
-rw-r--r--  chapter/2/futures.md  602
-rw-r--r--  chapter/2/images/1.png  bin 0 -> 41235 bytes
-rw-r--r--  chapter/2/images/15.png  bin 0 -> 48459 bytes
-rw-r--r--  chapter/2/images/5.png  bin 0 -> 20821 bytes
-rw-r--r--  chapter/2/images/6.png  bin 0 -> 19123 bytes
-rw-r--r--  chapter/2/images/7.png  bin 0 -> 30068 bytes
-rw-r--r--  chapter/2/images/8.png  bin 0 -> 13899 bytes
-rw-r--r--  chapter/2/images/9.png  bin 0 -> 6463 bytes
-rw-r--r--  chapter/2/images/p-1.png  bin 0 -> 39600 bytes
-rw-r--r--  chapter/2/images/p-1.svg  4
-rw-r--r--  chapter/2/images/p-2.png  bin 0 -> 40084 bytes
-rw-r--r--  chapter/2/images/p-2.svg  4
-rw-r--r--  chapter/3/E_account_spreadsheet_vats.png  bin 0 -> 183811 bytes
-rw-r--r--  chapter/3/E_vat.png  bin 0 -> 53914 bytes
-rw-r--r--  chapter/3/message-passing.md  462
-rw-r--r--  chapter/3/sentinel_nodes.png  bin 0 -> 157837 bytes
-rw-r--r--  chapter/3/supervision_tree.png  bin 0 -> 143187 bytes
-rw-r--r--  chapter/4/MR.png  bin 0 -> 83219 bytes
-rw-r--r--  chapter/4/dist-langs.md  512
-rw-r--r--  chapter/5/langs-extended-for-dist.md  265
-rw-r--r--  chapter/6/acidic-to-basic-how-the-database-ph-has-changed.md  182
-rw-r--r--  chapter/6/being-consistent.md  82
-rw-r--r--  chapter/6/consistency-crdts.md  11
-rw-r--r--  chapter/6/resources/partitioned-network.jpg  bin 0 -> 24303 bytes
-rw-r--r--  chapter/8/Hive-architecture.png  bin 0 -> 33250 bytes
-rw-r--r--  chapter/8/Hive-transformation.png  bin 0 -> 48497 bytes
-rw-r--r--  chapter/8/big-data.md  717
-rw-r--r--  chapter/8/cluster-overview.png  bin 0 -> 22912 bytes
-rw-r--r--  chapter/8/ecosystem.png  bin 0 -> 190654 bytes
-rw-r--r--  chapter/8/edge-cut.png  bin 0 -> 79745 bytes
-rw-r--r--  chapter/8/hadoop-ecosystem.jpg  bin 0 -> 76009 bytes
-rw-r--r--  chapter/8/spark-ecosystem.png  bin 0 -> 49070 bytes
-rw-r--r--  chapter/8/spark_pipeline.png  bin 0 -> 17570 bytes
-rw-r--r--  chapter/8/sparksql-data-flow.jpg  bin 0 -> 128479 bytes
-rw-r--r--  chapter/8/sql-vs-dataframes-vs-datasets.png  bin 0 -> 48229 bytes
-rw-r--r--  chapter/8/vertex-cut-datastructure.png  bin 0 -> 570007 bytes
-rw-r--r--  chapter/9/DAG.jpg  bin 0 -> 44785 bytes
-rw-r--r--  chapter/9/DiagramStream.jpg  bin 0 -> 28155 bytes
-rw-r--r--  chapter/9/Kafka.jpg  bin 0 -> 63202 bytes
-rw-r--r--  chapter/9/Naiad.jpg  bin 0 -> 35678 bytes
-rw-r--r--  chapter/9/TimelyD.jpg  bin 0 -> 45235 bytes
-rw-r--r--  chapter/9/Topology.jpg  bin 0 -> 32913 bytes
-rw-r--r--  chapter/9/streaming.md  272
56 files changed, 3780 insertions, 34 deletions
diff --git a/chapter/1/figures/grpc-benchmark.png b/chapter/1/figures/grpc-benchmark.png
new file mode 100644
index 0000000..9f39c71
--- /dev/null
+++ b/chapter/1/figures/grpc-benchmark.png
Binary files differ
diff --git a/chapter/1/figures/grpc-client-transport-handler.png b/chapter/1/figures/grpc-client-transport-handler.png
new file mode 100644
index 0000000..edd5236
--- /dev/null
+++ b/chapter/1/figures/grpc-client-transport-handler.png
Binary files differ
diff --git a/chapter/1/figures/grpc-cross-language.png b/chapter/1/figures/grpc-cross-language.png
new file mode 100644
index 0000000..c600f67
--- /dev/null
+++ b/chapter/1/figures/grpc-cross-language.png
Binary files differ
diff --git a/chapter/1/figures/grpc-googleapis.png b/chapter/1/figures/grpc-googleapis.png
new file mode 100644
index 0000000..62718e5
--- /dev/null
+++ b/chapter/1/figures/grpc-googleapis.png
Binary files differ
diff --git a/chapter/1/figures/grpc-languages.png b/chapter/1/figures/grpc-languages.png
new file mode 100644
index 0000000..1f1c50d
--- /dev/null
+++ b/chapter/1/figures/grpc-languages.png
Binary files differ
diff --git a/chapter/1/figures/grpc-server-transport-handler.png b/chapter/1/figures/grpc-server-transport-handler.png
new file mode 100644
index 0000000..fe895c0
--- /dev/null
+++ b/chapter/1/figures/grpc-server-transport-handler.png
Binary files differ
diff --git a/chapter/1/figures/hello-world-client.png b/chapter/1/figures/hello-world-client.png
new file mode 100644
index 0000000..c4cf7d4
--- /dev/null
+++ b/chapter/1/figures/hello-world-client.png
Binary files differ
diff --git a/chapter/1/figures/hello-world-server.png b/chapter/1/figures/hello-world-server.png
new file mode 100644
index 0000000..a51554b
--- /dev/null
+++ b/chapter/1/figures/hello-world-server.png
Binary files differ
diff --git a/chapter/1/figures/http2-frame.png b/chapter/1/figures/http2-frame.png
new file mode 100644
index 0000000..59d6ed5
--- /dev/null
+++ b/chapter/1/figures/http2-frame.png
Binary files differ
diff --git a/chapter/1/figures/http2-stream-lifecycle.png b/chapter/1/figures/http2-stream-lifecycle.png
new file mode 100644
index 0000000..87333cb
--- /dev/null
+++ b/chapter/1/figures/http2-stream-lifecycle.png
Binary files differ
diff --git a/chapter/1/figures/protobuf-types.png b/chapter/1/figures/protobuf-types.png
new file mode 100644
index 0000000..aaf3a1e
--- /dev/null
+++ b/chapter/1/figures/protobuf-types.png
Binary files differ
diff --git a/chapter/1/gRPC.md b/chapter/1/gRPC.md
new file mode 100644
index 0000000..f6c47b7
--- /dev/null
+++ b/chapter/1/gRPC.md
@@ -0,0 +1,323 @@
+---
+layout: page
+title: "gRPC"
+by: "Paul Grosu (Northeastern U.), Muzammil Abdul Rehman (Northeastern U.), Eric Anderson (Google, Inc.), Vijay Pai (Google, Inc.), and Heather Miller (Northeastern U.)"
+---
+
+<h1>
+<p align="center">gRPC</p>
+</h1>
+
+<h4><em>
+<p align="center">Paul Grosu (Northeastern U.), Muzammil Abdul Rehman (Northeastern U.), Eric Anderson (Google, Inc.), Vijay Pai (Google, Inc.), and Heather Miller (Northeastern U.)</p>
+</em></h4>
+
+<hr>
+
+<h3><em><p align="center">Abstract</p></em></h3>
+
+<em>gRPC has been built from a collaboration between Google and Square as a public replacement of Stubby, ARCWire and Sake {% cite Apigee %}. The gRPC framework is a form of an Actor Model based on an IDL (Interface Description Language), which is defined via the Protocol Buffer message format. With the introduction of HTTP/2, the internal Google Stubby and Square Sake frameworks have now been made available to the public. By working on top of the HTTP/2 protocol, gRPC enables messages to be multiplexed and compressed bi-directionally as preemptive streams, maximizing the capacity of any microservices ecosystem. Google has also taken a new approach to public projects: instead of just releasing a paper describing the concepts, it now also provides the implementation showing how to properly interpret the standard.
+</em>
+
+<h3><em>Introduction</em></h3>
+
+In order to understand gRPC and the flexibility it offers in turning a microservices ecosystem into a Reactive Actor Model, it is important to appreciate the nuances of the HTTP/2 Protocol upon which it is based. Afterward we will describe the gRPC Framework - focusing specifically on the gRPC-Java implementation - with the scope to expand this chapter over time to all implementations of gRPC. At the end we will cover examples demonstrating these ideas, taking a user through the initial steps of working with the gRPC-Java framework.
+
+<h3>1 <em>HTTP/2</em></h3>
+
+The HTTP 1.1 protocol has been a success for some time, though some key features began to be requested by the community with the increase of distributed computing, especially in the area of microservices. The phenomenon of creating more modularized functional units - organically constructed on a <em>share-nothing model</em> with a bidirectional, high-throughput request and response methodology - demands a new protocol for communication and integration. Thus HTTP/2 was born as a new standard: a binary wire protocol providing compressed streams that can be multiplexed for concurrency. As many microservices implementations scan header messages before actually processing any payload in order to scale up the processing and routing of messages, HTTP/2 now provides header compression for this purpose. One last important benefit is that the server endpoint can push cached resources to the client based on anticipated future communication, dramatically saving client communication time and processing.
+
+<h3>1.1 <em>HTTP/2 Frames</em></h3>
+
+The HTTP/2 protocol is a framed protocol, which expands the capability for bidirectional, asynchronous communication. Every message is thus part of a frame that has a header, frame type and stream identifier, aside from the standard frame length, for processing. Each stream can have a priority, which allows dependencies between streams to be expressed, forming a <em>priority tree</em>. The data can be either a request or a response, which allows for bidirectional communication, with the capability of flagging the communication for stream termination, flow control with priority settings, continuation, and push responses from the server for client confirmation. Below is the format of the HTTP/2 frame {% cite RFC7540 %}:
+
+<p align="center">
+ <img src="figures/http2-frame.png" /><br>
+ <em>Figure 1: The encoding of an HTTP/2 frame.</em>
+</p>
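+
+To make the frame layout in Figure 1 concrete, below is a minimal sketch of ours (not taken from any gRPC implementation) of the fixed fields that every HTTP/2 frame header carries, per RFC 7540:
+
+```java
+// Sketch of the 9-byte HTTP/2 frame header shown in Figure 1 (RFC 7540, Section 4.1).
+final class Http2FrameHeader {
+    final int length;   // 24-bit length of the frame payload
+    final int type;     // 8-bit frame type (DATA, HEADERS, PRIORITY, RST_STREAM, ...)
+    final int flags;    // 8-bit flags (e.g., END_STREAM, END_HEADERS)
+    final int streamId; // 31-bit stream identifier; 0 means the frame applies to the whole connection
+
+    Http2FrameHeader(int length, int type, int flags, int streamId) {
+        this.length = length;
+        this.type = type;
+        this.flags = flags;
+        this.streamId = streamId;
+    }
+}
+```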
+
+<h3>1.2 <em>Header Compression</em></h3>
+
+The HTTP header is one of the primary methods of passing information about the state of other endpoints, the request or response, and the payload. This enables endpoints to save time when processing a large quantity of streams, with the ability to forward information along without wasting time inspecting the payload. Since the header information can be quite large, it is now possible to compress the headers to allow for better throughput and capacity of stored stateful information.
+
+<h3>1.3 <em>Multiplexed Streams</em></h3>
+
+As streams are core to the implementation of HTTP/2, it is important to discuss the details of their implementation in the protocol. Many streams can be open simultaneously from many endpoints, and each stream will be in one of the states described below. The streams are multiplexed together, forming a chain of streams that is transmitted over the wire, allowing asynchronous, bi-directional concurrency to be performed by the receiving endpoint. Below is the lifecycle of a stream {% cite RFC7540 %}:
+
+<p align="center">
+ <img src="figures/http2-stream-lifecycle.png" /><br>
+ <em>Figure 2: The lifecycle of an HTTP/2 stream.</em>
+</p>
+
+To better understand this diagram, it is important to define some of the terms in it:
+
+<em>PUSH_PROMISE</em> - This is sent by one endpoint to alert the other that it will be sending some data over the wire.
+
+<em>RST_STREAM</em> - This makes termination of a stream possible.
+
+<em>PRIORITY</em> - This is sent by an endpoint to communicate the priority of a stream.
+
+<em>END_STREAM</em> - This flag denotes the end of a <em>DATA</em> frame.
+
+<em>HEADERS</em> - This frame will open a stream.
+
+<em>Idle</em> - This is the state a stream is in before it is opened by sending or receiving a <em>HEADERS</em> frame.
+
+<em>Reserved (Local)</em> - To be in this state means that the local endpoint has sent a <em>PUSH_PROMISE</em> frame.
+
+<em>Reserved (Remote)</em> - To be in this state means that the stream has been reserved by a remote endpoint.
+
+<em>Open</em> - To be in this state means that both endpoints can send frames.
+
+<em>Closed</em> - This is a terminal state.
+
+<em>Half-Closed (Local)</em> - This means that no frames can be sent by the local endpoint except for <em>WINDOW_UPDATE</em>, <em>PRIORITY</em>, and <em>RST_STREAM</em>.
+
+<em>Half-Closed (Remote)</em> - This means that the remote endpoint is no longer using the stream to send frames of data.
+
+<h3>1.4 <em>Flow Control of Streams</em></h3>
+
+Since many streams will compete for the bandwidth of a connection, flow control is needed to prevent bottlenecks and collisions in the transmission. This is done via the <em>WINDOW_UPDATE</em> frame for every stream - and for the overall connection as well - which lets the sender know how much room the receiving endpoint has for processing new data.
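+
+As a rough illustration of that mechanism, the following is a simplified sketch of ours (not how Netty or gRPC actually implement it) of how a sender might account for a per-stream flow-control window:
+
+```java
+// Simplified per-stream flow-control accounting: sending DATA consumes the window,
+// WINDOW_UPDATE frames from the receiver replenish it.
+final class StreamFlowControl {
+    private int window = 65_535; // initial window size defined by RFC 7540
+
+    // Called before sending DATA; returns how many of the requested bytes may be sent now.
+    synchronized int reserve(int bytesWanted) {
+        int granted = Math.min(bytesWanted, window);
+        window -= granted;
+        return granted;
+    }
+
+    // Called when a WINDOW_UPDATE frame arrives from the receiving endpoint.
+    synchronized void onWindowUpdate(int increment) {
+        window += increment;
+    }
+}
+```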
+
+<h3>2 <em>Protocol Buffers with RPC</em></h3>
+
+Though gRPC was built on top of HTTP/2, an IDL had to be used to define the communication between endpoints. The natural direction was to use Protocol Buffers, a method of structuring key-value-based data for serialization between a server and a client. At the time gRPC development started, only version 2.0 (proto2) was available, which only implemented data structures without any request/response mechanism. An example of a Protocol Buffer data structure would look something like this:
+
+```
+// A message containing the user's name.
+message Hello {
+ optional string name = 1;
+}
+```
+<p align="center">
+ <em>Figure 3: Protocol Buffer version 2.0 representing a message data-structure.</em>
+</p>
+
+This message will also be encoded for highest compression when sent over the wire. For example, let us say that the value of the message is the string <em>"Hi"</em>. Every Protocol Buffer type has a wire-type value, and in this case a string has a wire type of `2`, as noted in Table 1 {% cite Protobuf-Types %}.
+
+<p align="center">
+ <img src="figures/protobuf-types.png" /><br>
+ <em>Table 1: Tag values for Protocol Buffer types.</em>
+</p>
+
+One will notice that there is a number associated with each field element in the Protocol Buffer definition, which represents its <em>tag</em>. In Figure 3, the field `name` has a tag of `1`. When a message gets encoded, each field (key) will start with a one-byte value (8 bits), where the least-significant 3 bits encode the <em>type</em> and the remaining bits encode the <em>tag</em>. In this case the tag is `1` and the type is `2`. Thus the encoding will be `00001 010`, which has a hexadecimal value of `0A`. The following byte is the length of the string, which is `2`, followed by the string itself as `48` and `69`, representing `H` and `i`. Thus the whole transmission will look as follows:
+
+```
+0A 02 48 69
+```
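+
+The same four bytes can be derived programmatically. Below is a small Java sketch of ours (not generated Protocol Buffer code) that computes the key byte and assembles the encoded message for the `name = "Hi"` example:
+
+```java
+import java.nio.charset.StandardCharsets;
+
+public class ProtobufEncodingSketch {
+    public static void main(String[] args) {
+        int fieldTag = 1;                        // `name` is field 1 in Figure 3
+        int wireType = 2;                        // 2 = length-delimited (strings, bytes, messages)
+        int key = (fieldTag << 3) | wireType;    // 0b00001010 = 0x0A
+
+        byte[] value = "Hi".getBytes(StandardCharsets.UTF_8);  // 0x48 0x69
+        byte[] encoded = new byte[2 + value.length];
+        encoded[0] = (byte) key;                 // 0x0A
+        encoded[1] = (byte) value.length;        // 0x02 (string length)
+        System.arraycopy(value, 0, encoded, 2, value.length);
+
+        for (byte b : encoded) {
+            System.out.printf("%02X ", b);       // prints: 0A 02 48 69
+        }
+    }
+}
+```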
+
+Thus the language had to be updated to support gRPC, and service definitions with request and response messages were added in version 3.0.0 of Protocol Buffers. The updated implementation looks as follows {% cite HelloWorldProto %}:
+
+```
+// The request message containing the user's name.
+message HelloRequest {
+ string name = 1;
+}
+
+// The response message containing the greetings
+message HelloReply {
+ string message = 1;
+}
+
+// The greeting service definition.
+service Greeter {
+ // Sends a greeting
+ rpc SayHello (HelloRequest) returns (HelloReply) {}
+}
+```
+<p align="center">
+ <em>Figure 4: Protocol Buffer version 3.0.0 representing a message data-structure with the accompanied RPC definition.</em>
+</p>
+
+Notice the addition of a service, where the RPC call would use one of the messages as the structure of a <em>Request</em> with the other being the <em>Response</em> message format.
+
+Once such a Proto file is written, one compiles it with gRPC to generate the <em>Client</em> and <em>Server</em> files representing the classical two endpoints of an RPC implementation.
+
+<h3>3 <em>gRPC</em></h3>
+
+gRPC was built on top of HTTP/2. We will cover the specifics of gRPC-Java here, and expand to the other implementations over time. gRPC is a cross-platform framework that allows integration across many languages, as denoted in Figure 5 {% cite gRPC-Overview %}.
+
+<p align="center">
+ <img src="figures/grpc-cross-language.png" /><br>
+ <em>Figure 5: gRPC allows for asynchronous language-agnostic message passing via Protocol Buffers.</em>
+</p>
+
+To ensure scalability, benchmarks are run on a daily basis to verify that gRPC performs optimally under high-throughput conditions, as illustrated in Figure 6 {% cite gRPC-Benchmark %}.
+
+<p align="center">
+ <img src="figures/grpc-benchmark.png" /><br>
+ <em>Figure 6: Benchmark showing the queries-per-second on two virtual machines with 32 cores each.</em>
+</p>
+
+To standardize, most of the public Google APIs - including the Speech API, Vision API, Bigtable, Pub/Sub, etc. - have been ported to support gRPC, and their definitions can be found at the following location:
+
+<p align="center">
+ <img src="figures/grpc-googleapis.png" /><br>
+ <em>Figure 7: The public Google APIs have been updated for gRPC, and can be found at <a href="https://github.com/googleapis/googleapis/tree/master/google">https://github.com/googleapis/googleapis/tree/master/google</a></em>
+</p>
+
+
+<h3>3.1 <em>Supported Languages</em></h3>
+
+The officially supported languages are listed in Table 2 {% cite gRPC-Languages %}.
+
+<p align="center">
+ <img src="figures/grpc-languages.png" /><br>
+ <em>Table 2: Officially supported languages by gRPC.</em>
+</p>
+
+<h3>3.2 <em>Authentication</em></h3>
+
+There are two methods of authentication that are available in gRPC:
+
+* SSL/TLS
+* Google Token (via OAuth2)
+
+gRPC is flexible in that one can plug in a custom authentication system if that is preferred.
+
+<h3>3.3 <em>Development Cycle</em></h3>
+
+In its simplest form, using gRPC follows a structured set of steps with this general flow:
+
+<em>1. Download gRPC for the language of interest.</em>
+
+<em>2. Implement the Request and Response definition in a ProtoBuf file.</em>
+
+<em>3. Compile the ProtoBuf file and run the code-generators for the specific language. This will generate the Client and Server endpoints.</em>
+
+<em>4. Customize the Client and Server code for the desired implementation.</em>
+
+Most of these steps will require tweaking the Protobuf file and testing the throughput to ensure that the network and CPU capacities are used optimally.
+
+<h3>3.4 <em>The gRPC Framework (Stub, Channel and Transport Layer)</em></h3>
+
+One starts by initializing a communication <em>Channel</em> between the <em>Client</em> and the <em>Server</em> and storing that as a <em>Stub</em>. The <em>Credentials</em> are provided to the Channel when it is initialized. These form a <em>Context</em> for the Client's connection to the Server. Then a <em>Request</em> can be built based on the definition in the Protobuf file. The Request and the associated expected <em>Response</em> are executed by the <em>service</em> constructed in the Protobuf file. The Response is then parsed for any data coming from the Channel.
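+
+As a concrete illustration of this flow, below is a minimal sketch of a gRPC-Java client for the <em>Greeter</em> service from Figure 4. It assumes a server is listening locally on port 50051 and uses the classes generated from the Protobuf file; the exact builder methods may differ slightly between gRPC-Java releases.
+
+```java
+import io.grpc.ManagedChannel;
+import io.grpc.ManagedChannelBuilder;
+import io.grpc.examples.helloworld.GreeterGrpc;
+import io.grpc.examples.helloworld.HelloReply;
+import io.grpc.examples.helloworld.HelloRequest;
+
+public class GreeterClientSketch {
+    public static void main(String[] args) throws InterruptedException {
+        // The Channel is the connection to the Server; the host/port are assumptions for a local server.
+        ManagedChannel channel = ManagedChannelBuilder
+            .forAddress("localhost", 50051)
+            .usePlaintext(true)               // skip TLS for local experimentation only
+            .build();
+
+        // The Stub is generated by the Protobuf/gRPC compiler from the service in Figure 4.
+        GreeterGrpc.GreeterBlockingStub stub = GreeterGrpc.newBlockingStub(channel);
+
+        // Build the Request, execute the remote call, and parse the Response.
+        HelloRequest request = HelloRequest.newBuilder().setName("world").build();
+        HelloReply reply = stub.sayHello(request);
+        System.out.println(reply.getMessage());
+
+        channel.shutdown();
+    }
+}
+```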
+
+The connection can be asynchronous and bi-directionally streaming so that data is constantly flowing back and available to be read when ready. This allows one to treat the Client and Server as endpoints where one can not only adjust the flow but also intercept and decorate messages to filter, request, and retrieve the data of interest.
+
+The <em>Transport Layer</em> performs the retrieval and placement of the binary protocol on the wire. <em>gRPC-Java</em> has three implementations, though a user can implement their own: <em>Netty</em>, <em>OkHttp</em>, and <em>inProcess</em>.
+
+<h3>3.5 <em>gRPC Java</em></h3>
+
+The Java implementation of gRPC has been built with the mobile platform in mind, and to provide that capability it only requires JDK 6.0. Though the core of gRPC is built with data centers in mind - specifically to support C/C++ on the Linux platform - the Java and Go implementations are two very reliable platforms for experimenting with microservice ecosystem implementations.
+
+There are several moving parts to understanding how gRPC-Java works. The first important step is to ensure that the Client and Server stub interface code gets generated by the Protobuf plugin compiler. The required dependencies are usually placed in your <em>Gradle</em> build file, `build.gradle`, as follows:
+
+```
+ compile 'io.grpc:grpc-netty:1.0.1'
+ compile 'io.grpc:grpc-protobuf:1.0.1'
+ compile 'io.grpc:grpc-stub:1.0.1'
+```
+
+When you build using Gradle, then the appropriate base code gets generated for you, which you can override to build your preferred implementation of the Client and Server.
+
+Since the HTTP/2 protocol has to be implemented, the chosen approach was a <em>Metadata</em> class that converts key-value pairs into HTTP/2 headers and vice versa - for the Netty implementation via <em>GrpcHttp2HeadersDecoder</em> and <em>GrpcHttp2OutboundHeaders</em>.
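+
+For example, a sketch of attaching an application-defined key-value pair on the client side might look as follows; the header name here is hypothetical, and wrapping the stub with `MetadataUtils.attachHeaders` is only one of several ways to do this:
+
+```java
+import io.grpc.Metadata;
+import io.grpc.examples.helloworld.GreeterGrpc;
+import io.grpc.stub.MetadataUtils;
+
+final class MetadataSketch {
+    // Wraps a generated stub so an extra key-value pair travels as an HTTP/2 header on every call.
+    static GreeterGrpc.GreeterBlockingStub withTraceId(
+            GreeterGrpc.GreeterBlockingStub stub, String traceId) {
+        Metadata headers = new Metadata();
+        Metadata.Key<String> traceKey =
+            Metadata.Key.of("x-trace-id", Metadata.ASCII_STRING_MARSHALLER); // hypothetical header name
+        headers.put(traceKey, traceId);
+        return MetadataUtils.attachHeaders(stub, headers);
+    }
+}
+```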
+
+Another key insight is that the code handling the HTTP/2 conversion for the Client and the Server lives in the <em>NettyClientHandler.java</em> and <em>NettyServerHandler.java</em> classes, shown in Figures 8 and 9.
+
+<p align="center">
+ <img src="figures/grpc-client-transport-handler.png" /><br>
+ <em>Figure 8: The Client Transport Handler for gRPC-Java.</em>
+</p>
+
+<p align="center">
+ <img src="figures/grpc-server-transport-handler.png" /><br>
+ <em>Figure 9: The Server Transport Handler for gRPC-Java.</em>
+</p>
+
+
+<h3>3.5.1 <em>Downloading gRPC Java</em></h3>
+
+The easiest way to download the gRPC-Java implementation is by performing the following command:
+
+```
+git clone -b v1.0.0 https://github.com/grpc/grpc-java.git
+```
+
+Next, compile on a Windows machine using Gradle (or Maven) with the following steps. If you are using any firewall software, it might be necessary to temporarily disable it while compiling gRPC-Java, as sockets are used for the tests:
+
+```
+cd grpc-java
+set GRADLE_OPTS=-Xmx2048m
+set JAVA_OPTS=-Xmx2048m
+set DEFAULT_JVM_OPTS="-Dfile.encoding=utf-8"
+echo skipCodegen=true > gradle.properties
+gradlew.bat build -x test
+cd examples
+gradlew.bat installDist
+```
+
+If you are having issues with Unicode (UTF-8) translation when using Git on Windows, you can try the following commands after entering the `examples` folder:
+
+```
+wget https://raw.githubusercontent.com/benelot/grpc-java/feb88a96a4bc689631baec11abe989a776230b74/examples/src/main/java/io/grpc/examples/routeguide/RouteGuideServer.java
+
+copy RouteGuideServer.java src\main\java\io\grpc\examples\routeguide\RouteGuideServer.java
+```
+
+<h3>3.5.2 <em>Running the Hello World Demonstration</em></h3>
+
+Make sure you open two Command (Terminal) windows, each within the `grpc-java\examples\build\install\examples\bin` folder. In the first of the two windows type the following command:
+
+```
+hello-world-server.bat
+```
+
+You should see the following:
+
+<p align="center">
+ <img src="figures/hello-world-server.png" /><br>
+ <em>Figure 10: The Hello World gRPC Server.</em>
+</p>
+
+In the second of the two windows type the following command:
+
+```
+hello-world-client.bat
+```
+
+You should see the following response:
+
+<p align="center">
+ <img src="figures/hello-world-client.png" /><br>
+ <em>Figure 11: The Hello World gRPC Client and the response from the Server.</em>
+</p>
+
+<h3>4 <em>Conclusion</em></h3>
+
+This chapter presented an overview of the concepts behind gRPC and HTTP/2, and will be expanded over time in both breadth and language implementations. In the area of microservices, one can see how a server endpoint can actually spawn more endpoints - where the message content is the Protobuf definition for the new endpoints to be generated - for load balancing, much like in the classical Actor Model.
+
+## References
+
+` `[Apigee]: https://www.youtube.com/watch?v=-2sWDr3Z0Wo
+
+` `[Authentication]: http://www.grpc.io/docs/guides/auth.html
+
+` `[Benchmarks]: http://www.grpc.io/docs/guides/benchmarking.html
+
+` `[CoreSurfaceAPIs]: https://github.com/grpc/grpc/tree/master/src/core
+
+` `[ErrorModel]: http://www.grpc.io/docs/guides/error.html
+
+` `[gRPC]: https://github.com/grpc/grpc/blob/master/doc/g_stands_for.md
+
+` `[gRPC-Companies]: http://www.grpc.io/about/
+
+` `[gRPC-Languages]: http://www.grpc.io/docs/
+
+` `[gRPC-Protos]: https://github.com/googleapis/googleapis/
+
+` `[Netty]: http://netty.io/
+
+` `[RFC7540]: http://httpwg.org/specs/rfc7540.html
+
+` `[HelloWorldProto]: https://github.com/grpc/grpc/blob/master/examples/protos/helloworld.proto
+
+` `[Protobuf-Types]: https://developers.google.com/protocol-buffers/docs/encoding
+
+` `[gRPC-Overview]: http://www.grpc.io/docs/guides/
+
+` `[gRPC-Languages]: http://www.grpc.io/about/#osp
+
+` `[gRPC-Benchmark]: http://www.grpc.io/docs/guides/benchmarking.html
diff --git a/chapter/1/rpc.md b/chapter/1/rpc.md
index b4bce84..ccc9739 100644
--- a/chapter/1/rpc.md
+++ b/chapter/1/rpc.md
@@ -1,11 +1,381 @@
---
layout: page
-title: "Remote Procedure Call"
-by: "Joe Schmoe and Mary Jane"
+title: "RPC is Not Dead: Rise, Fall and the Rise of Remote Procedure Calls"
+by: "Muzammil Abdul Rehman and Paul Grosu"
---
-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. {% cite Uniqueness --file rpc %}
+## Introduction:
+
+*Remote Procedure Call* (RPC) is a design *paradigm* that allows two entities to communicate over a communication channel in a general request-response mechanism. The definition of RPC has mutated and evolved significantly over the past four decades, and therefore the RPC *paradigm* is a generic, broadly classifying term that refers to all RPC-esque systems that have arisen over this period. It has moved on from a simple *client-server* design to a group of inter-connected *services*. While the initial RPC *implementations* were designed as tools for outsourcing computation to a server in a distributed system, RPC has evolved over the years to build a language-agnostic ecosystem of applications. The RPC *paradigm* has been part of the driving force in creating truly revolutionary distributed systems and giving rise to various communication schemes and protocols between diverse systems.
+
+The RPC *paradigm* has been used to implement our everyday systems. From lower-level applications like Network File Systems{% cite sunnfs --file rpc %} and Remote Direct Memory Access{% cite rpcoverrdma --file rpc %} access protocols to ecosystems of microservices, RPC has been used everywhere. RPC has a diverse variety of applications -- SunNFS{% cite sunnfs --file rpc %}, Twitter's Finagle{% cite finagle --file rpc %}, Apache Thrift{% cite thrift --file rpc %}, Java RMI{% cite rmipaper --file rpc %}, SOAP, CORBA{% cite corba --file rpc %} and Google's gRPC{% cite grpc --file rpc %} to name a few.
+
+RPC has evolved over the years. Starting off as a synchronous, insecure, request-response system, RPC has grown into a secure, asynchronous, resilient *paradigm* that has influenced protocols and programming designs like HTTP, REST, and just about anything with a request-response system. It has transitioned to an asynchronous, bidirectional communication mechanism for connecting services and devices across the internet. While the initial RPC implementations mainly focused on a local, private network with multiple clients communicating with a server and synchronously waiting for its response, modern RPC systems have *endpoints* communicating with each other, asynchronously passing arguments and processing responses, as well as having two-way request-response streams (from client to server, and also from server to client). RPC has influenced various design paradigms and communication protocols.
+
+## Remote Procedure Calls:
+
+The *Remote Procedure Call paradigm* can be defined, at a high level, as a set of two communication *endpoints* connected over a network, with one endpoint sending a request and the other endpoint generating a response based on that request. In the simplest terms, it's a request-response paradigm where the two *endpoints*/hosts have different *address spaces*. The host that requests a remote procedure is referred to as the *caller*, and the host that responds to it is referred to as the *callee*.
+
+The *endpoints* in RPC can be a client and a server, two nodes in a peer-to-peer network, two hosts in a grid computation system, or even two microservices. RPC communication is not limited to two hosts; rather, it can have multiple hosts or *endpoints* involved {% cite anycastrpc --file rpc %}.
+
+<p align="center">
+[ Image Source: {% cite rpcimage --file rpc %}]
+</p>
+<figure>
+ <img src="{{ site.baseurl }}/resources/img/rpc_chapter_1_ycog_10_steps.png" alt="RPC in 10 Steps." />
+<p>Fig1. - Remote Procedure Call.</p>
+</figure>
+
+The simplest RPC implementation looks like Fig1. In this case, the *client*(or *caller*) and the *server*(or *callee*) are separated by a physical network. The main components of the system are the client routine/program, the client stub, the server routine/program, the server stub, and the network routines. A *stub* is a small program that is generally used as a stand-in(or an interface) for a larger program{% cite stubrpc --file rpc %}. A *client stub* exposes the functionality provided by the server routine to the client routine while the server stub provides a client-like program to the server routine{% cite rpcimage --file rpc %}. The client stub takes the input arguments from the client program and returns the result, while the server stub provides input arguments to the server program and gets the results. The client program can only interact with the client stub that provides the interface of the remote server to the client. This stub also provides marshalling/pickling/serialization of the input arguments sent to the stub by the client routine. Similarly, the server stub provides a client interface to the server routines as well as the marshalling services.
+
+When a client routine performs a *remote procedure*, it calls the client stub, which serializes the input arguments. This serialized data is sent to the server using OS network routines (TCP/IP){% cite rpcimage --file rpc %}. The data is then deserialized by the server stub and presented to the server routines with the given arguments. The return value from the server routines is serialized again and sent over the network back to the client, where it's deserialized by the client stub and presented to the client routine. This *remote procedure* is generally hidden from the client routine and appears as a *local procedure* to the client. RPC services also require a discovery service/host-resolution mechanism to bootstrap the communication between the client and the server.
+
+One important feature of RPC is a different *address space* {% cite implementingrpc --file rpc %} for each of the endpoints. In RPC, all the hosts have separate *address spaces*: they can't share pointers or references to a memory location on one host. This *address space* isolation means that all the information passed in messages between the communicating hosts is passed by value (objects or variables), not by reference. Since RPC is a *remote* procedure call, the values sent to the *remote* host cannot be pointers or references to *local* memory. However, passing links to a global shared storage location (Amazon S3, Microsoft Azure, Google Cloud Store) is not impossible, but rather dependent on the type of system (see the *Applications* section for detail).
+
+Originally, RPC was developed as a synchronous request-response mechanism, tied to a specific programming language implementation, with a custom network protocol to outsource computation {% cite implementingrpc --file rpc %}. It had a registry system to register all the servers. One of the earliest RPC-based systems {% cite implementingrpc --file rpc %} was implemented in the Cedar programming language in the early 1980's. The goal of this system was to provide programming semantics similar to local procedure calls. Developed for a LAN with an inefficient network protocol and a *serialization* scheme to transfer information using that protocol, this system aimed at executing a *procedure* (also referred to as a *method* or a *function*) in a remote *address space*. The single-threaded, synchronous client and server were written in the Cedar programming language, with a registry system used by the servers to *bind* (or register) their procedures. The clients used this registry system to find a specific server to execute their *remote* procedures. This RPC implementation {% cite implementingrpc --file rpc %} had a very specific use-case: it was built for outsourcing computation within the "Xerox research internetwork", a small, closed ethernet network with 16-bit addresses{% cite implementingrpc --file rpc %}.
+
+Modern RPC-based systems are language-agnostic, asynchronous, load-balanced systems. Authentication and authorization to these systems have been added as needed along with other security features. Most of these systems have fault-handling built into them as modules and the systems are generally spread all across the internet.
+
+RPC programs operate over a network (or a communication channel); therefore, they need to handle remote errors and be able to communicate information successfully. Error handling generally varies and is categorized as *remote-host* or *network* failure handling. Depending on the type of the system and the error, the caller (or the callee) returns an error, and these errors can be handled accordingly. For asynchronous RPC calls, it's possible to specify events to ensure progress.
+
+RPC implementations use a *serialization* (also referred to as *marshalling* or *pickling*) scheme on top of an underlying communication protocol (traditionally TCP over IP). These *serialization* schemes allow both the *caller* and the *callee* to become language-agnostic, allowing the two systems to be developed in parallel without any language restrictions. Some examples of serialization schemes are JSON, XML, and Protocol Buffers {% cite grpc --file rpc %}.
+
+Modern RPC systems allow different components of a larger system to be developed independently of one another. The language-agnostic nature combined with a decoupling of some parts of the system allows the two components (caller and callee) to scale separately and add new functionalities. This independent scaling of the system might lead to a mesh of interconnected RPC *services* facilitating one another.
+
+### Examples of RPC
+
+RPC has become very predominant in modern systems. Google even performs on the order of 10^10^ RPC calls per second {% cite grpcpersec --file rpc %}. That's *tens of billions* of RPC calls *every second* - over a single day, more calls than the *annual GDP of the United States* measured in dollars {%cite usgdp --file rpc%}.
+
+In the simplest RPC systems, a client connects to a server over a network connection and performs a *procedure*. This procedure could be as simple as `return "Hello World"` in your favorite programming language. However, the complexity of this remote procedure has no upper bound.
+
+Here's the code of this simple RPC server, written in Python3.
+```python
+from xmlrpc.server import SimpleXMLRPCServer
+
+# a simple RPC function that returns "Hello World!"
+def remote_procedure():
+ return "Hello World!"
+
+server = SimpleXMLRPCServer(("localhost", 8080))
+print("RPC Server listening on port 8080...")
+server.register_function(remote_procedure, "remote_procedure")
+server.serve_forever()
+```
+
+This code for a simple RPC client for the above server, written in Python3, is as follows.
+
+```python
+import xmlrpc.client
+
+with xmlrpc.client.ServerProxy("http://localhost:8080/") as proxy:
+ print(proxy.remote_procedure())
+```
+
+In the above example, we create a simple function called `remote_procedure` and *bind* it to port *8080* on *localhost*. The RPC client then connects to the server and *requests* the `remote_procedure` with no input arguments. The server then *responds* with the return value of the `remote_procedure`.
+
+One can even view the *three-way handshake* as an example of the RPC paradigm. The *three-way handshake* is most commonly used in establishing a TCP connection. Here, a server-side application *binds* to a port on the server, and a hostname resolution entry is added to a DNS server (which can be seen as a *registry* in RPC). Now, when the client has to connect to the server, it asks a DNS server to resolve the hostname to an IP address, and then the client sends a SYN packet. This SYN packet can be seen as a *request* to another *address space*. The server, upon receiving this, returns a SYN-ACK packet. This SYN-ACK packet from the server can be seen as a *response* from the server, as well as a *request* to establish the connection. The client then *responds* with an ACK packet.
+
+## Evolution of RPC:
+
+The RPC paradigm was first proposed in the 1970's and still continues as a relevant model for performing distributed computation; initially developed for a LAN, it can now be implemented on open networks, as web services across the internet. It has had a long and arduous journey to its current state. Here are the three main (overlapping) stages that RPC went through.
+
+### The Rise: All Hail RPC(Early 1970's - Mid 1980's)
+
+RPC started off strong. With RFC 674{% cite rfc674 --file rpc %} and RFC 707{% cite rfc674 rfc707 --file rpc %} coming out and specifying the design of Remote Procedure Calls, followed by Nelson et al.{% cite implementingrpc --file rpc %} coming up with one of the first RPC implementations for the Cedar programming language, RPC revolutionized systems in general and gave rise to one of the earliest distributed systems (apart from the internet, of course).
+
+With these early achievements, people started using RPC as the de facto design choice. It became a Holy Grail in the systems community for a few years after the first implementation.
+
+### The Fall: RPC is Dead(Late 1970's - Late 1990's)
+
+RPC, despite being an initial success, wasn't without flaws. Within a year of its inception, the limitations of RPC started to catch up with it. RFC 684 criticized RPC for its latency, failures, and cost. It also focused on message-passing systems as an alternative to the RPC design. Similarly, a few years down the road, in 1988, Tanenbaum et al. presented similar concerns against RPC {%cite critiqueofrpc --file rpc %}. It talked about problems with heterogeneous devices, message passing as an alternative, packet loss, network failure, and RPC's synchronous nature, and highlighted that RPC is not a one-size-fits-all model.
+
+In 1994, *A Note on Distributed Computing* was published. This paper claimed RPC to be "fundamentally flawed" {%cite notedistributed --file rpc %}. It talked about a unified object view and cited four main problems with dividing these objects for distributed computing in RPC: communication latency, address space separation, partial failures, and concurrency issues (resulting from two concurrent client requests accessing the same remote object). Although most of these problems (except partial failures) are inherently associated with distributed computing itself, partial failures for RPC systems meant that progress might not always be possible in an RPC system.
+
+This era wasn't a dead end for RPC, though. Some of the preliminary designs for modern RPC systems were introduced in this period. Perhaps the earliest such system was SunRPC {% cite sunnfs --file rpc %}, used for the Sun Network File System (NFS). Soon to follow SunRPC was CORBA{% cite corba --file rpc %}, which was followed by Java RMI{% cite rmipaper --file rpc %}.
+
+However, the initial implementations of these systems were riddled with various issues and design flaws. For instance, Java RMI didn't handle network failures and assumed a reliable network with zero-latency{% cite rmipaper --file rpc %}.
+
+### The Rise, Again: Long Live RPC(Late 1990's - Today)
+
+Despite facing problems in its early days, RPC withstood the test of time. Researchers realized the limitations of RPC and focused on rectifying them; instead of forcing RPC everywhere, they started to use RPC in the applications where it was needed. Designers started adding exception handling, asynchrony, network-failure handling, and heterogeneity across different languages/devices to RPC.
+
+In this era, SunRPC went through various additions and came to be known as Open Network Computing RPC (ONC RPC). CORBA and RMI have also undergone various modifications as internet standards were set.
+
+A new breed of RPC also started in this era, Async (asynchronous) RPC, giving rise to systems that use *futures* and *promises*, like Finagle{% cite finagle --file rpc %} and Cap'n Proto (post-2010).
+
+
+<p align="center">
+[ Image Source: {% cite norman --file rpc %}]
+</p>
+<figure>
+ <img src="{{ site.baseurl }}/resources/img/rpc_chapter_1_syncrpc.jpg" alt="RPC in 10 Steps." />
+<p>Fig2. - Synchronous RPC.</p>
+</figure>
+
+
+<p align="center">
+[ Image Source: {% cite norman --file rpc %}]
+</p>
+<figure>
+ <img src="{{ site.baseurl }}/resources/img/rpc_chapter_1_asyncrpc.jpg" alt="RPC in 10 Steps." />
+<p>Fig3. - Asynchronous RPC.</p>
+</figure>
+
+
+A traditional, synchronous RPC is a *blocking* operation, while an asynchronous RPC is a *non-blocking* operation{%cite dewan --file rpc %}. Fig2. shows a synchronous RPC call while Fig3. shows an asynchronous RPC call. In synchronous RPC, the client sends a request to the server, then blocks and waits for the server to perform its computation and return the result. Only after getting the result from the server does the client proceed. In an asynchronous RPC, the client performs a request to the server and waits only for the acknowledgment of the delivery of the input parameters/arguments. After this, the client proceeds onwards, and when the server is finished processing, it sends an interrupt to the client. The client receives this message from the server, retrieves the results, and continues.
+
+Asynchronous RPC makes it possible to separate the remote call from the return value, making it possible to write a single-threaded client that handles multiple RPC calls and processes each at the interval it chooses{%cite async --file rpc%}. It also allows easier handling of slow clients/servers as well as transferring large data easily (due to its incremental nature){%cite async --file rpc%}.
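+
+To make the contrast concrete, here is a small illustrative sketch in Java, using a plain future as a stand-in for a real RPC framework: the call returns immediately, the client keeps working, and the result is consumed when it arrives.
+
+```java
+import java.util.concurrent.CompletableFuture;
+
+public class AsyncRpcSketch {
+    // Stand-in for the remote procedure; a real system would send the request over the network.
+    static String remoteProcedure(String name) {
+        return "Hello " + name + "!";
+    }
+
+    public static void main(String[] args) {
+        // The "remote call" returns a future immediately instead of blocking the client.
+        CompletableFuture<String> reply =
+            CompletableFuture.supplyAsync(() -> remoteProcedure("World"));
+
+        doOtherClientWork();                    // the client keeps working while the call is in flight
+
+        reply.thenAccept(System.out::println);  // callback runs once the response is available
+        reply.join();                           // block only when the result is actually required
+    }
+
+    static void doOtherClientWork() { /* ... */ }
+}
+```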
+
+In the post-2000 era, MAUI{% cite maui --file rpc %}, Cap'n Proto{% cite capnprotosecure --file rpc %}, gRPC{% cite grpc --file rpc %}, Thrift{% cite thrift --file rpc %} and Finagle{% cite finagle --file rpc %} have been released, which have significantly boosted the widespread use of RPC.
+
+Most of these newer systems came with their own Interface Description Languages (IDLs). These IDLs specified the common protocols and interfacing language that could be used to transfer information between clients and servers written in different programming languages, making these RPC implementations language-agnostic. Some of the most common IDLs are JSON, XML, and ProtoBufs.
+
+A high-level overview of some of the most important RPC implementations follows.
+
+#### Java Remote Method Invocation
+Java RMI (Java Remote Method Invocation){% cite rmibook --file rpc %} is a Java implementation for performing RPC (Remote Procedure Calls) between a client and a server. The client, using a stub, passes the information over a socket connection to the server that contains the remote objects. The Remote Object Registry (ROR){% cite rmipaper --file rpc %} on the server contains the references to objects that can be accessed remotely and through which the client connects. The client can then request the invocation of methods on the server, which processes the requested call and responds with the answer.
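+
+A minimal sketch of the client side of that flow looks as follows; the `Greeter` interface and the `"Greeter"` binding name are hypothetical, and only the standard `java.rmi` API is used:
+
+```java
+import java.rmi.Remote;
+import java.rmi.RemoteException;
+import java.rmi.registry.LocateRegistry;
+import java.rmi.registry.Registry;
+
+// The remote interface shared between the client and the server.
+interface Greeter extends Remote {
+    String sayHello(String name) throws RemoteException;
+}
+
+public class RmiClientSketch {
+    public static void main(String[] args) throws Exception {
+        // Ask the Remote Object Registry for the stub, then invoke it as if it were local.
+        Registry registry = LocateRegistry.getRegistry("localhost", 1099);
+        Greeter greeter = (Greeter) registry.lookup("Greeter");
+        System.out.println(greeter.sayHello("World"));
+    }
+}
+```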
+
+RMI provides some security by being encoded but not encrypted, though that can be augmented by tunneling over a secure connection or other methods. Moreover, RMI is very specific to Java: it cannot take advantage of the language-independence feature that is inherent to most RPC implementations. Perhaps the main problem with RMI is the way it handles *access transparency*: a programmer (not the client program) cannot distinguish between local objects and remote objects, making it relatively difficult to handle partial failures in the network{%cite roi --file rpc %}.
+
+#### CORBA
+CORBA (Common Object Request Broker Architecture){% cite corba --file rpc %} was created by the Object Management Group {% cite corbasite --file rpc %} to allow for language-agnostic communication among multiple computers. It is an object-oriented model defined via an Interface Definition Language (IDL), and the communication is managed through an Object Request Broker (ORB), which acts as a broker for objects. CORBA can be viewed as a language-independent RMI system where each client and server have an ORB by which they communicate. The benefit of CORBA is that it allows for multi-language implementations that can communicate with each other, but much of the criticism around CORBA relates to poor consistency among implementations, and it's relatively outdated by now. Moreover, CORBA suffers from the same access transparency issues as Java RMI.
+
+#### XML-RPC and SOAP
+The XML-RPC specification {% cite Wiener --file rpc%} performs an HTTP POST request to a server, formatted as XML composed of a *header* and a *payload*, that calls only one method. It was originally released in the late 1990's and, unlike RMI, it provides transparency by using HTTP as the transport mechanism.
+
+The header has to provide the basic information, like the user agent and the size of the payload. The payload has to initiate a `methodCall` structure by specifying the name via `methodName` and the associated parameter values. Parameters for the method can be scalars, structures or (recursive) arrays. The scalar types can be one of `i4`, `int`, `boolean`, `string`, `double`, `dateTime.iso8601` or `base64`. The scalars are used to create more complex structures and arrays.
+
+Below is an example as provided by the XML-RPC documentation{% cite Wiener --file rpc%}:
+
+```XML
+
+POST /RPC2 HTTP/1.0
+User-Agent: Frontier/5.1.2 (WinNT)
+Host: betty.userland.com
+Content-Type: text/xml
+Content-length: 181
+
+<?xml version="1.0"?>
+<methodCall>
+ <methodName>examples.getStateName</methodName>
+ <params>
+ <param>
+ <value><i4>41</i4></value>
+ </param>
+ </params>
+ </methodCall>
+```
+
+The response to a request will have the `methodResponse` with `params` and values, or a `fault` with the associated `faultCode` in case of an error {% cite Wiener --file rpc %}:
+
+```XML
+HTTP/1.1 200 OK
+Connection: close
+Content-Length: 158
+Content-Type: text/xml
+Date: Fri, 17 Jul 1998 19:55:08 GMT
+Server: UserLand Frontier/5.1.2-WinNT
+
+<?xml version="1.0"?>
+<methodResponse>
+ <params>
+ <param>
+ <value><string>South Dakota</string></value>
+ </param>
+ </params>
+ </methodResponse>
+```
+
+SOAP (Simple Object Access Protocol) is a successor of XML-RPC as a web-services protocol for communicating between a client and a server. It was initially designed by a group at Microsoft {% cite soaparticle1 --file rpc %}. The SOAP message is an XML-formatted message composed of an envelope inside which a header and a payload are provided (just like XML-RPC). The payload of the message contains the request and response of the message, which is transmitted over HTTP or SMTP (unlike XML-RPC).
+
+SOAP can be viewed as a superset of XML-RPC that provides support for more complex authentication schemes{%cite soapvsxml --file rpc %}, as well as support for WSDL (Web Services Description Language), allowing easier discovery of and integration with remote web services{%cite soapvsxml --file rpc %}.
+
+The benefit of SOAP is that it provides the flexibility of transmission over multiple transport protocols. The XML-based messages allow SOAP to become language-agnostic, though parsing such messages can become a bottleneck.
+
+#### Thrift
+Thrift is an *asynchronous* RPC system created by Facebook and now part of the Apache Foundation {% cite thrift --file rpc %}. It has a language-agnostic Interface Description Language (IDL) from which one generates the code for the client and server. It provides the opportunity for compressed serialization by customizing the protocol and the transport after the description file has been processed.
+
+Perhaps the biggest advantage of Thrift is that its binary data format has a very low overhead. It has a relatively lower transmission cost (as compared to other alternatives like SOAP){%cite thrifttut --file rpc %}, making it very efficient for transferring large amounts of data.
+
+#### Finagle
+Finagle is a fault-tolerant, protocol-agnostic runtime for doing RPC, with a high-level API for composing futures (see the Async RPC section) and with the RPC calls generated under the hood. It was created by Twitter and is written in Scala to run on the JVM. It is based on three object types: Service objects, Filter objects and Future objects {% cite finagle --file rpc %}.
+
+A Future object represents an asynchronously requested computation that will return a response at some time in the future. These Future objects are the main communication mechanism in Finagle; all the inputs and outputs are represented as Future objects.
+
+The Service objects are an endpoint that will return a Future upon processing a request. These Service objects can be viewed as the interfaces used to implement a client or a server.
+
+A sample Finagle Server that reads a request and returns the version of the request is shown below. This example is taken from Finagle documentation{% cite finagletut --file rpc %}
+
+```Scala
+import com.twitter.finagle.{Http, Service}
+import com.twitter.finagle.http
+import com.twitter.util.{Await, Future}
+
+object Server extends App {
+ val service = new Service[http.Request, http.Response] {
+ def apply(req: http.Request): Future[http.Response] =
+ Future.value(
+ http.Response(req.version, http.Status.Ok)
+ )
+ }
+ val server = Http.serve(":8080", service)
+ Await.ready(server)
+}
+```
+
+A Filter object transforms requests for further processing in case additional customization is required for a request. Filters provide application-independent operations, like timeouts. They take in a Service and provide a new Service object with the Filter applied. Aggregating multiple Filters is also possible in Finagle.
+
+A sample timeout Filter that takes in a service and creates a new service with timeouts is shown below. This example is taken from Finagle documentation{% cite finagletut --file rpc %}
+
+```Scala
+import com.twitter.finagle.{Service, SimpleFilter}
+import com.twitter.util.{Duration, Future, Timer}
+
+class TimeoutFilter[Req, Rep](timeout: Duration, timer: Timer)
+ extends SimpleFilter[Req, Rep] {
+
+ def apply(request: Req, service: Service[Req, Rep]): Future[Rep] = {
+ val res = service(request)
+ res.within(timer, timeout)
+ }
+}
+```
+
+#### Open Network Computing RPC(ONC RPC)
+ONC was originally introduced as SunRPC {%cite sunnfs --file rpc %} for the Sun NFS. The Sun NFS system had a stateless server with client-side caching and unique file handles, and supported NFS read, write, truncate, unlink, etc. operations. However, SunRPC was later revised as ONC in 1995 {%cite rfc1831 --file rpc %} and then in 2009 {%cite rfc5531 --file rpc %}. The IDL used in ONC (and SunRPC) is External Data Representation (XDR), a serialization mechanism specific to network communication, and therefore ONC is limited to applications like Network File Systems.
+
+#### Mobile Assistance Using Infrastructure(MAUI)
+The MAUI project {% cite maui --file rpc %}, developed by Microsoft, is a computation offloading system for mobile devices. It's an automated system that offloads mobile code to a dedicated infrastructure in order to increase the battery life of the mobile device, minimize the load on the programmer, and perform complex computations offsite. MAUI uses RPC as the communication protocol between the mobile device and the infrastructure.
+
+#### gRPC
+
+gRPC is a multiplexed, bi-directional streaming RPC protocol developed by Google and Square. The IDL for gRPC is Protocol Buffers (also referred to as ProtoBuf), and gRPC is meant as a public replacement of Stubby, ARCWire, and Sake {% cite Apigee --file rpc %}. More details on Protocol Buffers, Stubby, ARCWire, and Sake are available in our gRPC chapter{% cite grpcchapter --file rpc %}.
+
+gRPC provides a platform for scalable, bi-directional streaming using both synchronized and asynchronous communication.
+
+In a general RPC mechanism, the client initiates a connection to the server, and only the client can *request* while the server can only *respond* to the incoming requests. However, in bi-directional gRPC streams, although the initial connection is initiated by the client (call it *endpoint 1*), once the connection is established, both the server (call it *endpoint 2*) and *endpoint 1* can send *requests* and receive *responses*. This significantly eases development where both *endpoints* are communicating with each other (as in grid computing). It also saves the hassle of creating two separate connections between the endpoints (one from *endpoint 1* to *endpoint 2* and another from *endpoint 2* to *endpoint 1*), since both streams are independent.
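+
+For instance, a sketch of the client side of such a stream, using the `routeChat` call from the RouteGuide example that ships with gRPC-Java, might look like this (the messages sent are purely illustrative):
+
+```java
+import io.grpc.examples.routeguide.RouteGuideGrpc;
+import io.grpc.examples.routeguide.RouteNote;
+import io.grpc.stub.StreamObserver;
+
+final class BidiStreamSketch {
+    static void chat(RouteGuideGrpc.RouteGuideStub asyncStub) {
+        // The response observer handles messages the server pushes at any time.
+        StreamObserver<RouteNote> requestObserver =
+            asyncStub.routeChat(new StreamObserver<RouteNote>() {
+                @Override public void onNext(RouteNote note) { System.out.println(note.getMessage()); }
+                @Override public void onError(Throwable t)   { t.printStackTrace(); }
+                @Override public void onCompleted()          { System.out.println("server closed its side"); }
+            });
+
+        // Meanwhile the client writes its own requests on the same stream.
+        requestObserver.onNext(RouteNote.newBuilder().setMessage("hello from endpoint 1").build());
+        requestObserver.onCompleted(); // close the client side; the server may keep streaming until done
+    }
+}
+```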
+
+It multiplexes the requests over a single connection using header compression. This makes it possible for gRPC to be used for mobile clients where battery life and data usage are important.
+The core library is in C -- except for Java and Go -- and surface APIs are implemented for all the other languages connecting through it{% cite CoreSurfaceAPIs --file rpc %}.
+
+Since Protocol Buffers have been utilized by many individuals and companies, it is natural for them to extend their RPC ecosystems via gRPC. Companies like Cisco, Juniper and Netflix {% cite gRPCCompanies --file rpc %} have found it practical to adopt it.
+A majority of the Google Public APIs, like their places and maps APIs, have been ported to gRPC ProtoBuf {% cite gRPCProtos --file rpc %} as well.
+
+More details about gRPC and bi-directional streaming can be found in our gRPC chapter {% cite grpcchapter --file rpc %}
+
+#### Cap'n Proto
+Cap'n Proto{% cite capnprotosecure --file rpc %} is a data interchange RPC system that bypasses the data-encoding step (like JSON or ProtoBuf) to significantly improve performance. It's developed by the original author of gRPC's ProtoBuf, but since it uses raw bytes (binary data) for encoding/decoding, it outperforms gRPC's ProtoBuf. It uses futures and promises to combine various remote operations into a single operation to save transportation round-trips. This means that if a client calls a function `foo` and then calls another function `bar` on the output of `foo`, Cap'n Proto will aggregate these two operations into a single `bar(foo(x))` where `x` is the input to the function `foo` {% cite capnprotosecure --file rpc %}. This saves multiple round-trips, especially in object-oriented programs.
+
+### The Heir to the Throne: gRPC or Thrift
+
+Although there are many candidates to be considered as top contenders for the RPC throne, most of these are targeted at a specific type of application. ONC is generally specific to the Network File System (though it's being pushed as a standard), Cap'n Proto is relatively new and untested, MAUI is specific to mobile systems, the open-source Finagle is primarily being used at Twitter (not widespread), and Java RMI simply doesn't even come close due to its transparency issues (sorry to burst your bubble, Java fans).
+
+Probably the most powerful and practical systems out there are Apache Thrift and Google's gRPC, primarily because these two systems cater to a large number of programming languages, have a significant performance benefit over other techniques, and are being actively developed.
+
+Thrift was released several years earlier, while the first stable release of gRPC came out in August 2016. However, despite having been 'out there' longer, Thrift is currently less popular than gRPC {%cite trendrpcthrift --file rpc %}.
+
+Both gRPC {% cite gRPCLanguages --file rpc %} and Thrift support most of the popular languages, including Java, C/C++, and Python. Thrift supports other languages like Ruby, Erlang, Perl, JavaScript, Node.js and OCaml, while gRPC currently also supports Node.js and Go.
+
+The gRPC core is written in C (with the exception of the Java and Go implementations), and wrappers written in other languages communicate with this core, while the Thrift core is written in C++.
+
+gRPC also provides easier bidirectional streaming communication between the caller and the callee. The client generally initiates the communication {% cite gRPCLanguages --file rpc %}, and once the connection is established, the client and the server can perform reads and writes independently of each other. Bi-directional streaming in Thrift, however, can be difficult to handle, since Thrift focuses explicitly on a client-server model. To enable bidirectional, async streaming, one may have to run two separate systems {%cite grpcbetter --file rpc%}.
+
+The two also differ in exception handling. In Thrift, exceptions can be returned built into the message, while in gRPC, the programmer has to define this behavior explicitly. Thrift's built-in exception handling makes it easier to write client-side applications.
+
+Although custom authentication mechanisms can be implemented in both systems, gRPC comes with Google-backed authentication using SSL/TLS and Google tokens {% cite grpcauth --file rpc %}.
+
+Moreover, gRPC network communication is done over HTTP/2. HTTP/2 makes it feasible for the communicating parties to multiplex network connections over the same port, which is more efficient (in terms of memory usage) than HTTP/1.1. Since gRPC communication happens over HTTP/2, gRPC can easily multiplex different services. Multiplexing services is also possible in Thrift; however, due to the lack of support from the underlying transport protocol, it has to be performed in code using the `TMultiplexedProcessor` class {% cite multiplexingthrift --file rpc %}.
+
+Both gRPC and Thrift also allow async RPC calls: a client can send a request to the server, continue with its execution, and process the response from the server when it arrives.
+
+
+The major differences between gRPC and Thrift can be summed up in this table.
+
+| Comparison | Thrift | gRPC |
+| ----- | ----- | ----- |
+| License | Apache2 | BSD |
+| Sync/Async RPC | Both | Both |
+| Supported Languages | C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, and OCaml | C/C++, Python, Go, Java, Ruby, PHP, C#, Node.js, Objective-C |
+| Core Language | C++| C |
+| Exceptions | Can be built into the message | Handled by the programmer |
+| Authentication | Custom | Custom + Google Tokens |
+| Bi-Directionality | Not straightforward | Straightforward |
+| Multiplexing | Possible via `TMultiplexedProcessor` class | Possible via HTTP/2 |
+
+It's difficult to definitively choose one over the other. However, with the increasing popularity of gRPC, and the fact that it's still in the early stages of development, the general trend {%cite trendrpcthrift --file rpc %} over the past year has started to shift in favor of gRPC, and it's giving Thrift a run for its money. Although search interest may not be a rigorous metric, gRPC was searched for, on average, three times more often than Thrift {%cite trendrpcthrift --file rpc %}.
+
+**Note:** This comparison was performed in December 2016, so the results are expected to change with time.
+
+## Applications:
+
+Since its inception, various papers have applied the RPC paradigm to different domains, and RPC implementations have been used to create new systems. Here are some of the applications and systems that incorporate RPC.
+
+#### Shared State and Persistence Layer
+
+One major limitation (and, at the same time, advantage) of RPC is the separate *address space* of every machine in the network: *pointers* or *references* to a data object cannot be passed between the caller and the callee. Interweave {% cite interweave2 interweave1 interweave3 --file rpc %} is a *middleware* system that addresses this by allowing scalable sharing of arbitrary datatypes between language-independent processes running on heterogeneous hardware. Interweave is specifically designed to be compatible with RPC-based systems and allows easier access to resources shared between different applications using memory blocks and locks.
+
+Although research has been done to provide a global shared state for RPC-based systems, these systems tend to take away the sense of independence and modularity between the *caller* and the *callee* by using shared storage instead of separate *address spaces*.
+
+#### GridRPC
+
+Grid computing is one of the most widely used applications of the RPC paradigm. At a high level, it can be seen as a mesh (or network) of computers connected to form a *grid*, such that each system can leverage resources from any other system in the network.
+
+In the GridRPC paradigm, each computer in the network can act as the *caller* or the *callee* depending on the amount of resources required {% cite grid1 --file rpc %}. It's also possible for the same computer to act as the *caller* as well as the *callee* for *different* computations.
+
+Some of the most popular GridRPC-compliant middleware implementations are GridSolve{% cite gridsolve1 gridsolve2 --file rpc %} and Ninf-G{% cite ninf --file rpc %}. Ninf is older than GridSolve and was first published in the late 1990s. It's a simple RPC layer that also provides authentication and secure communication between the two parties. GridSolve, on the other hand, is relatively complex and provides a middleware for the communication using a client-agent-server model.
+
+#### Mobile Systems and Computation Offloading
+
+Mobile systems have become very powerful these days. With multi-core processors and gigabytes of RAM, they can undertake relatively complex computations without a hassle. This advancement, however, comes at a cost: their batteries, despite becoming larger, drain quickly with usage. Moreover, mobile data (network bandwidth) is still limited and expensive. For these reasons, it's better to offload computations from mobile systems when possible, and RPC plays an important role in the communication for this *computation offloading*. Some of these services use GridRPC technologies to offload the computation, while others use an RMI (Remote Method Invocation) system.
+
+The Ibis Project {% cite ibis --file rpc %} builds an RMI (similar to Java RMI) and GMI (Group Method Invocation) model to facilitate outsourcing computation. Cuckoo {% cite cuckoo --file rpc %} uses this Ibis communication middleware to offload computation from applications (built using Cuckoo) running on Android smartphones to remote Cuckoo servers.
+
+Microsoft's MAUI project {% cite maui --file rpc %} uses RPC communication and allows partitioning of .NET applications for "fine-grained code offload to maximize energy savings with minimal burden on the programmer". MAUI decides at runtime which methods to offload to the external MAUI server.
+
+#### Async RPC, Futures and Promises
+
+Remote Procedure Calls can be asynchronous, and these async RPCs play an integral role in *futures* and *promises*. *Futures* and *promises* are programming constructs where a *future* is seen as a variable, data value, return type, or error, while a *promise* is seen as a *future* that doesn't have a value yet. We follow Finagle's {% cite finagle --file rpc %} definition of *futures* and *promises*, where the *promise* of a *future* (an empty *future*) is considered the *request*, while the asynchronous fulfillment of this *promise* by a *future* is seen as the *response*. This construct is primarily used for asynchronous programming.
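+
+To make this concrete, here is a minimal, hypothetical sketch using Scala's standard `scala.concurrent` library (the `remoteAdd` function is our own placeholder, not part of any RPC framework): the caller immediately receives a *future* for the response, while the callee fulfills the matching *promise* asynchronously.
+
+```scala
+import scala.concurrent.{Future, Promise}
+import scala.concurrent.ExecutionContext.Implicits.global
+
+// A stand-in for an async remote call: the caller gets back a Future
+// (the response placeholder) right away.
+def remoteAdd(a: Int, b: Int): Future[Int] = {
+  val response = Promise[Int]()          // the not-yet-fulfilled response
+  Future { response.success(a + b) }     // the "server side" fulfills it later
+  response.future                        // read-only view handed to the caller
+}
+
+// The caller registers a callback instead of blocking on the reply.
+remoteAdd(2, 3).foreach(sum => println(s"reply: $sum"))
+```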
+
+Perhaps the most renowned systems using this type of RPC model are Twitter's Finagle{% cite finagle --file rpc %} and Cap'n Proto{% cite capnprotosecure --file rpc %}.
+
+#### RPC in Microservices Ecosystem:
+
+RPC implementations have moved from a one-server model to multiple servers and on to dynamically-created, load-balanced microservices. RPC started as separate implementations (REST, streaming RPC, MAUI, gRPC, Cap'n Proto), and it is now possible to integrate all of these implementations under a single abstraction: the *endpoint*. Endpoints are the building blocks of *microservices*. A *microservice* is usually a *service* with a very simple, well-defined purpose, written in almost any language, that interacts with other microservices to give the feel of one large monolithic *service*. These microservices are language-agnostic: one *microservice* for airline tickets written in C/C\++ might be communicating with a number of other microservices for individual airlines written in different languages (Python, C\++, Java, Node.js) using a language-agnostic, asynchronous RPC framework like gRPC{%cite grpc --file rpc %} or Thrift{%cite thrift --file rpc %}.
+
+The use of RPC has allowed us to create new microservices on-the-fly. Microservices can not only be created and bootstrapped at runtime, but also come with inherent features like load-balancing and failure-recovery. This bootstrapping might occur on the same machine, inside a Docker container {% cite docker --file rpc %}, or across a network (using any combination of DNS, NATs or other mechanisms).
+
+RPC can be described as the "glue" that holds all the microservices together{% cite microservices1rpc --file rpc %}: it is one of the primary communication mechanisms between microservices running on different systems. A microservice requests another microservice to perform an operation or query; the other microservice, upon receiving such a request, performs the operation and returns a response. This operation could vary from a simple computation, to invoking another microservice and creating a chain of RPC events, to creating new microservices on the fly to dynamically load-balance the system.
+
+An example of a microservices ecosystem that uses futures/promises is Finagle{%cite finagle --file rpc %} at Twitter.
+
+## Security in RPC:
+
+The initial RPC implementation {% cite implementingrpc --file rpc %} was developed for an isolated LAN and didn't focus much on security. There are various attack surfaces in that model: a malicious registry, a malicious server, a client mounting a Denial-of-Service attack, or a Man-in-the-Middle between client and server.
+
+As time progressed and the internet evolved, new standards came along, and RPC implementations became much more secure. Security in RPC is generally added as a *module* or a *package*. These modules have libraries for authentication and authorization of the communicating parties (caller and callee). These modules are not always bug-free, and it's possible to gain unauthorized access to the system. The security community tries to rectify these situations using code inspection and bug bounty programs to catch such bugs beforehand. However, with time new bugs arise and the cycle continues: a vicious cycle between attackers and security experts, each trying to outdo the other.
+
+For example, the Oracle Network File System uses *Secure RPC*{% cite oraclenfs --file rpc %} to perform authentication in NFS. This *Secure RPC* uses the Diffie-Hellman authentication mechanism with DES encryption to allow only authorized users to access the NFS. Similarly, Cap'n Proto {% cite capnprotosecure --file rpc %} claims that it is resilient to memory leaks, segfaults, and malicious inputs and can be used between mutually untrusting parties. However, in Cap'n Proto "the RPC layer is not robust against resource exhaustion attacks, possibly allowing denials of service", nor has it undergone any formal verification {% cite capnprotosecure --file rpc %}.
+
+It's always possible to come up with a *threat model* that would make a given RPC implementation insecure to use. However, one has to understand that using any distributed system increases the attack surface, and claiming one *paradigm* to be more secure than another would be a biased statement: *paradigms* are ideas, and it is up to system designers to use these *paradigms* to build systems that take care of real-world concerns like security and load-balancing. There's always a possibility of a request being rerouted to a malicious server (if the registry gets hacked), or of there being no trust between the *caller* and the *callee*. We therefore maintain that the RPC *paradigm* is neither secure nor insecure, and that the most secure systems are the ones that are in an isolated environment, disconnected from the public internet, with a self-destruct mechanism{% cite selfdest --file rpc %} in place, in an impenetrable bunker, guarded by the Knights Templar (*they don't exist! Well, maybe Fort Meade comes close*).
+
+## Discussion:
+
+The RPC *paradigm* shines the most in *request-response* mechanisms. Futures and promises also appear to be a new breed of RPC. This leads one to question whether every *request-response* system is a modified implementation of the RPC *paradigm*, or whether it actually brings anything new to the table. Modern communication protocols, like HTTP and REST, might just be a different flavor of RPC. In HTTP, a client *requests* a web page (or some other content), and the server then *responds* with the required content. The dynamics of this communication might be slightly different from your traditional RPC, but an HTTP stateless server adheres to most of the concepts behind the RPC *paradigm*. Similarly, consider sending a request to your favorite Google API. Say you want to translate your latitude/longitude to an address using their Reverse Geocoding API, or want to find a good restaurant in your vicinity using their Places API: you'll send a *request* to their server to perform a *procedure* that takes a few input arguments, like the coordinates, and returns the result. Even though these APIs follow a RESTful design, they look like an extension of the RPC *paradigm*.
+
+The RPC paradigm has evolved over time, to the extent that it has become very difficult to differentiate RPC from non-RPC. With each passing year, the restrictions and limitations of RPC shrink. Current RPC implementations even support the server *requesting* information from the client and the client *responding* to those requests, and vice versa (bidirectionality). This *bidirectional* nature has transitioned RPC from a simple *client-server* model to a set of *endpoints* communicating with each other.
+
+For the past four decades, researchers and industry leaders have tried to come up with *their* definition of RPC. The proponents of the RPC paradigm view every *request-response* communication as an implementation of the RPC paradigm, while those against RPC try to explicitly enumerate its limitations. These limitations, however, seem to slowly vanish as new RPC models are introduced. RPC supporters consider it the Holy Grail of distributed systems and view it as the foundation of modern distributed communication. From Apache Thrift and ONC to HTTP and REST, they advocate it all as RPC, while REST developers hold strong opinions against RPC.
+
+Moreover, with modern global storage mechanisms, the need for RPC systems to have separate *address spaces* seems to be slowly dissolving and disappearing into thin air. So the question remains: what *is* RPC and what *is not* RPC? This is an open-ended question. There is no unanimous agreement about what RPC should look like, except that it involves communication between two *endpoints*. What we think of RPC is:
+
+*In the world of distributed systems, where every individual component of a system, be it a hard disk, a multi-core processor, or a microservice, is an extension of the RPC, it's difficult to come up with a concrete definition of the RPC paradigm. Therefore, anything loosely associated with a request-response mechanism can be considered RPC.*
+
+<blockquote>
+<p align="center">
+<em>**RPC is not dead, long live RPC!**</em>
+</p>
+</blockquote>
## References
-{% bibliography --file rpc %} \ No newline at end of file
+{% bibliography --file rpc --cited %}
diff --git a/chapter/2/futures.md b/chapter/2/futures.md
index 5c56e92..0075773 100644
--- a/chapter/2/futures.md
+++ b/chapter/2/futures.md
@@ -1,11 +1,605 @@
---
layout: page
title: "Futures"
-by: "Joe Schmoe and Mary Jane"
+by: "Kisalaya Prasad and Avanti Patil"
---
-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. {% cite Uniqueness --file futures %}
+# Introduction
-## References
+As human beings, we have the ability to multitask, i.e., we can walk, talk and eat at the same time, except when we sneeze. A sneeze is like a blocking activity in the normal course of action, because it forces us to stop what we're doing for a brief moment before resuming where we left off. Activities like multitasking are called multithreading in computer lingo. In contrast to this behaviour, a computer processor executes one thing at a time. So when we say that a computer system is multi-threaded, it is actually an illusion created by the processor, whose time is shared between multiple processes. Sometimes the processor is blocked when a task is hindered from normal execution by a blocking call. Such blocking calls range from IO operations like reading/writing to disk to sending/receiving packets over the network. Blocking calls can take a disproportionate amount of time compared to the processor's regular work, e.g., iterating over a list.
-{% bibliography --file futures %} \ No newline at end of file
+
+The processor can handle blocking calls in one of two ways:
+- **Synchronous method**: The processor waits for the blocking call to complete and return its result, and only then resumes the next task. The problem with this method is that CPU time is not utilized in an ideal manner.
+- **Asynchronous method**: With asynchrony, the CPU's time is used to work on some other task via a preemptive time-sharing algorithm. The processor is never blocked, and when the asynchronous call returns its result, the processor can switch back to the previous process and resume it from the point where it left off. A minimal sketch contrasting the two styles is shown after this list.
+
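+As a rough illustration (a sketch using Scala's standard library, not tied to any particular system, with `fetchData` as a placeholder for a slow call), the same operation can be handled by blocking the current thread or by registering a callback and moving on:
+
+```scala
+import scala.concurrent.{Await, Future}
+import scala.concurrent.ExecutionContext.Implicits.global
+import scala.concurrent.duration._
+
+// A stand-in for a slow, blocking operation (e.g. a network read).
+def fetchData(): Future[String] = Future { Thread.sleep(100); "data" }
+
+// Synchronous handling: the calling thread waits for the result.
+val result = Await.result(fetchData(), 1.second)
+
+// Asynchronous handling: register a callback and keep doing other work.
+fetchData().foreach(data => println(s"got $data"))
+```
+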
+In the world of asynchronous communication, many constructs have been defined to help programmers reach an ideal level of resource utilization. In this article we will talk about the motivation behind the rise of promises and futures, how the current notion of futures and promises has evolved over time, the various execution models associated with them, and, finally, how this construct helps us today in different general-purpose programming languages.
+
+
+<figure>
+ <img src="./images/1.png" alt="timeline" />
+</figure>
+
+# Motivation
+
+The rise of promises and futures as a topic of relevance can be traced in parallel to the rise of asynchronous and distributed systems. This seems natural, since a future represents a value that will become available in the future, which fits naturally with the latency inherent in these heterogeneous systems. The recent adoption of NodeJS and server-side Javascript has only made promises more relevant. But the idea of having a placeholder for a result came significantly before the current notion of futures and promises. As we will see in further sections, this idea of having a *"placeholder for a value that might not be available"* has changed meanings over time.
+
+Thunks can be thought of as a primitive notion of a future or promise. According to their inventor P. Z. Ingerman, a thunk is "a piece of coding which provides an address" {% cite 23 --file futures %}. Thunks were designed as a way of binding actual parameters to their formal definitions in Algol-60 procedure calls. If a procedure is called with an expression in the place of a formal parameter, the compiler generates a thunk which computes the expression and leaves the address of the result in some standard location.
+
+
+The first mention of futures was by Baker and Hewitt in a paper on incremental garbage collection of processes. They coined the term *call-by-future* to describe a calling convention in which each formal parameter to a method is bound to a process which evaluates the expression in the parameter in parallel with the other parameters. Before this paper, Algol 68 also offered a way to make this kind of concurrent parameter evaluation possible, using collateral clauses and parallel clauses for parameter binding.
+
+
+In their paper, Baker and Hewitt introduced the notion of a future as a 3-tuple representing an expression E, consisting of (1) a process which evaluates E, (2) a memory location where the result of E needs to be stored, and (3) a list of processes which are waiting on E. However, the major focus of their work was not the role futures play in asynchronous distributed computing, but rather garbage collecting the processes evaluating expressions whose results are not needed by the function.
+
+
+The MultiLisp language, presented by Halstead in 1985, built upon this call-by-future with a `future` annotation. Binding a variable `x` to a future expression creates a process which evaluates that expression and binds `x` to a token representing its (eventual) result. This allows the computation to move past the expression without waiting for it to complete; if the value is never used, the current computation never pauses for it. MultiLisp also had a lazy future construct, called `delay`, which only gets evaluated when the value is first required.
+
+This design of futures influenced the design of promises in Argus by Liskov and Shrira in 1988. Both futures in MultiLisp and promises in Argus provision for the result of a call to be picked up later. Building upon the initial design of futures in MultiLisp, Argus extended the original idea by introducing strongly typed promises and integration with call streams. Call streams are a language-independent communication mechanism connecting a sender and a receiver in a distributed programming environment. A sender can use a stream to make calls to the receiver like normal RPC, but it can also make stream calls, where it chooses not to wait for the reply and can make further calls. Stream calls are a natural use-case for a placeholder for the result of a call that will arrive in the future: promises. Call streams also had provisions for handling network failures, which made it easier to propagate exceptions from the callee to the caller and to handle the typical problems of a multi-computer system. The paper also talked about stream composition: call streams could be arranged in pipelines where the output of one stream is used as the input of the next. This notion is not much different from what is known as promise pipelining today, which will be introduced in more detail later.
+
+
+E is an object-oriented programming language for secure distributed computing, created by Mark S. Miller, Dan Bornstein, and others at Electric Communities in 1997. One of the major contributions of E was the first non-blocking implementation of promises. It traces its roots to Joule, which was a dataflow programming language. E has an eventual-send operator, `<-`. An eventual send doesn't wait for the operation to complete; the program moves on to the next sequential statement. Eventual sends queue a pending delivery and complete immediately, returning a promise. A pending delivery includes a resolver for the promise. Further messages can be eventually sent to a promise even before it is resolved; these messages are queued up and forwarded once the promise is resolved. The notion of promise pipelining in E is also inherited from Joule.
+
+
+Among the modern languages, Python was perhaps the first to come up with something along the lines of E's promises, with the Twisted library. Coming out in 2002, Twisted has a concept of Deferred objects, which are used to receive the result of an operation that has not yet completed. Deferreds are just like normal objects and can be passed along, but they don't have a value yet; they support attaching a callback which gets called once the result of the operation is available.
+
+
+Promises and Javascript have an interesting history. In 2007, inspired by Python's Twisted, the Dojo toolkit came up with its own implementation, dojo.Deferred. This inspired Kris Zyp to propose the CommonJS Promises/A spec in 2009. Ryan Dahl introduced the world to NodeJS in the same year. In its early versions, Node used promises for its non-blocking API. When NodeJS moved away from promises to its now familiar error-first callback API (the first argument of the callback is an error object), it left a void for a promises API. Q.js, by Kris Kowal, was an implementation of the Promises/A spec from around this time. The FuturesJS library by AJ ONeal was another library which aimed to solve flow-control problems without using promises in the strictest sense. In 2011, jQuery v1.5 introduced promises to its wider and ever-growing audience, with an API subtly different from the Promises/A spec. With the rise of HTML5 and its various APIs came the problem of different and messy interfaces, which added to the already infamous callback hell. The Promises/A+ spec aimed to solve this problem. From this point on, following the widespread adoption of the A+ spec, promises were finally made part of the ECMAScript 2015 language specification. Still, a lack of backward compatibility and the additional features they provide mean that libraries like Bluebird and Q.js still have a place in the Javascript ecosystem.
+
+
+# Different Definitions
+
+
+Future, promise, delay or deferred generally refer to the same synchronisation mechanism, where an object acts as a proxy for a yet-unknown result. When the result becomes available, code attached to the promise gets executed.
+
+In some languages, however, there is a subtle difference between a Future and a Promise:
+"A 'Future' is a read-only reference to a yet-to-be-computed value."
+"A 'Promise' is pretty much the same, except that you can write to it as well."
+
+
+In other words, a future is a read-only window onto a value written into a promise. You can get the Future associated with a Promise by calling the future method on it, but conversion in the other direction is not possible. Another way to look at it: if you make a Promise, you are responsible for keeping it, but if someone else makes a Promise to you, you expect them to honor it in the Future.
+
+More technically, in Scala, "SIP-14 – Futures and Promises" defines them as follows:
+- A future is a placeholder object for a result that does not yet exist.
+- A promise is a writable, single-assignment container which completes a future. Promises can complete the future with a result to indicate success, or with an exception to indicate failure.
+
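+A minimal Scala sketch of this relationship, using nothing beyond the standard `scala.concurrent` API:
+
+```scala
+import scala.concurrent.{Future, Promise}
+
+val p = Promise[Int]()
+val f: Future[Int] = p.future      // read-only view of the promise
+
+p.success(42)                      // or: p.failure(new Exception("boom"))
+// The promise is single-assignment: completing it a second time would throw.
+```
+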
+An important difference between Scala and Java (6) futures is that Scala futures are asynchronous in nature. Java's futures, at least until Java 6, were blocking; only later versions of Java introduced asynchronous futures, the construct more familiar in the distributed computing world.
+
+
+In Java 8, the `Future<T>` interface has methods to check if the computation is complete, to wait for its completion, and to retrieve the result of the computation when it is complete. `CompletableFuture`s can be thought of as promises, since their value can be set explicitly. But `CompletableFuture` also implements the `Future` interface, so it can be used as a future too. Promises can be thought of as futures with a public set method which the caller (or anybody else) can use to set the value of the future.
+
+
+In the Javascript world, jQuery introduces a notion of Deferred objects, which are used to represent a unit of work which is not yet finished. A Deferred object contains a promise object which represents the result of that unit of work. Promises are values returned by a function, while the deferred object can be canceled by its caller.
+
+
+C# also makes the distinction between futures and promises. In C#, futures are implemented as `Task<T>`; in fact, in earlier versions of the Task Parallel Library futures were implemented as a class `Future<T>`, which later became `Task<T>`. The result of the future is available in the read-only property `Task<T>.Result`, which returns `T`. Tasks are asynchronous in C#.
+
+# Semantics of Execution
+
+Over the years, promises and futures have been implemented in different programming languages, and each language chose to implement them in its own way. In this section, we introduce some of the different ways in which futures and promises actually get executed and resolved underneath their APIs.
+
+## Thread Pools
+
+A thread pool is a group of ready, idle threads to which work can be handed. Thread pools amortize the overhead of worker creation, which can add up in a long-running process. Actual implementations vary, but what differentiates thread pools is the number of threads they use, which can be either fixed or dynamic. The advantage of a fixed thread pool is that it degrades gracefully: the amount of load a system can handle is fixed, and a fixed thread pool effectively limits the amount of load the system is put under. The granularity of a thread pool is the number of threads it instantiates.
+
+
+In Java, an Executor is an object which executes Runnable tasks. Executors provide a way of abstracting out the details of how a task will actually run. These details, like selecting a thread to run the task and how the task is scheduled, are managed by the object implementing the Executor interface. A Thread is an example of a Runnable in Java. Executors can be used instead of creating threads explicitly.
+
+
+Similar to Executor, there is an ExecutionContext as part of scala.concurrent. The basic intent behind it is the same as an Executor: it is responsible for executing computations. How it does so is opaque to the caller: it can create a new thread, use a pool of threads, or run the computation on the caller's own thread, although the last option is generally not recommended. The scala.concurrent package comes with a default implementation of ExecutionContext, a global static thread pool.
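+
+For instance, a user-supplied ExecutionContext can be built from any Java executor. Here is a small sketch backed by a fixed-size pool (the pool size of 4 is an arbitrary choice for illustration):
+
+```scala
+import java.util.concurrent.Executors
+import scala.concurrent.{ExecutionContext, Future}
+
+// Wrap a plain Java thread pool in an ExecutionContext.
+val pool = Executors.newFixedThreadPool(4)
+implicit val fixedEc: ExecutionContext = ExecutionContext.fromExecutorService(pool)
+
+// Futures created in this scope now run on the fixed-size pool.
+val f = Future { 21 * 2 }
+```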
+
+
+ExecutionContext.global is an execution context backed by a ForkJoinPool. ForkJoin is a thread pool implementation designed to take advantage of multiprocessor environments. What makes fork/join unique is that it implements a work-stealing algorithm: idle threads pick up work from threads that are still busy. The ForkJoinPool manages a small number of threads, usually limited to the number of processor cores available. It is possible for the number of threads to grow if all of the available threads are busy and wrapped inside blocking calls, although such a situation is highly undesirable for most systems. The ForkJoin framework works to avoid pool-induced deadlock and to minimize the amount of time spent switching between threads.
+
+
+In Scala, futures are generally a good framework for reasoning about concurrency: they can be executed in parallel, waited on, composed, are immutable once written and, most importantly, are non-blocking (although it is possible to have blocking futures, as in Java 6). In Scala, futures (and promises) are based on ExecutionContext, which gives users the flexibility to implement their own ExecutionContext if they need a specific behavior, like blocking futures. The default ForkJoin pool works well in most scenarios.
+
+The Scala futures API expects an ExecutionContext to be passed along. This parameter is implicit, and is usually ExecutionContext.global. An example:
+
+
+```scala
+implicit val ec = ExecutionContext.global
+val f: Future[String] = Future { "hello world" }
+```
+
+In this example, the global execution context is used to asynchronously run the created future. Taking another example,
+
+
+```scala
+import scala.util.{Success, Failure}
+
+implicit val ec = ExecutionContext.global
+
+val f = Future {
+  Http("http://api.fixer.io/latest?base=USD").asString
+}
+
+f.onComplete {
+  case Success(response) => println(response.body)
+  case Failure(t) => println(t)
+}
+```
+
+
+It is generally a good idea to use callbacks with Futures, as the value may not be available when you want to use it.
+
+So, how does it all work together?
+
+As we mentioned, futures require an ExecutionContext, which is an implicit parameter to virtually all of the futures API. This ExecutionContext is used to execute the future. Scala is flexible enough to let users implement their own execution contexts, but let's talk about the default ExecutionContext, which is backed by a ForkJoinPool.
+
+
+ForkJoinPool is ideal for many small computations that spawn off and then come back together. Scala's ForkJoinPool requires the tasks submitted to it to be ForkJoinTasks; tasks submitted to the global ExecutionContext are quietly wrapped inside a ForkJoinTask and then executed. The ForkJoinPool can also support possibly-blocking tasks, using the ManagedBlock mechanism, which creates a spare thread if required to ensure that there is sufficient parallelism when the current thread is blocked. To summarize, ForkJoinPool is a really good general-purpose ExecutionContext which works well in most scenarios.
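+
+In Scala, code run on the global pool can advertise that it is about to block by wrapping it in `scala.concurrent.blocking`, which lets the pool compensate with a spare thread if needed. A small sketch (the one-second sleep stands in for any blocking call):
+
+```scala
+import scala.concurrent.{Future, blocking}
+import scala.concurrent.ExecutionContext.Implicits.global
+
+val f = Future {
+  blocking {
+    Thread.sleep(1000)   // a blocking call; the pool may add a spare thread
+  }
+  "done"
+}
+```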
+
+
+## Event Loops
+
+Modern systems typically rely on many other systems to provide their functionality. There's a file system underneath, a database system, and other web services to rely on for information. Interaction with these components typically involves periods where we're doing nothing but waiting for the response to come back. This is the single largest waste of computing resources.
+
+
+Javascript is a single-threaded asynchronous runtime. Conventionally, async programming is associated with multi-threading, but we're not allowed to create new threads in Javascript. Instead, asynchronicity in Javascript is achieved using an event-loop mechanism.
+
+
+Javascript has historically been used to interact with the DOM and user interactions in the browser, and thus an event-driven programming model was a natural fit for the language. This has scaled up surprisingly well in high throughput scenarios in NodeJS.
+
+
+The general idea behind the event-driven programming model is that the logic flow control is determined by the order in which events are processed. This is underpinned by a mechanism which constantly listens for events and fires a callback when one is detected. This is Javascript's event loop in a nutshell.
+
+
+A typical Javascript engine has a few basic components:
+- **Heap**: used to allocate memory for objects.
+- **Stack**: function call frames go onto a stack, from which they are picked off the top to be executed.
+- **Queue**: a message queue holds the messages to be processed.
+
+
+Each message has a callback function which is fired when the message is processed. These messages can be generated by user actions like button clicks or scrolling, or by actions like HTTP requests, requests to a database to fetch records, or reading/writing to a file.
+
+
+Separating when a message is queued from when it is executed means the single thread doesn't have to wait for an action to complete before moving on to another. We attach a callback to the action we want to do, and when the time comes, the callback is run with the result of our action. Callbacks work well in isolation, but they force us into a continuation-passing style of execution, otherwise known as callback hell.
+
+
+```javascript
+
+getData = function(param, callback){
+ $.get('http://example.com/get/'+param,
+ function(responseText){
+ callback(responseText);
+ });
+}
+
+getData(0, function(a){
+ getData(a, function(b){
+ getData(b, function(c){
+ getData(c, function(d){
+ getData(d, function(e){
+
+ });
+ });
+ });
+ });
+});
+
+```
+
+<center><h4> VS </h4></center>
+
+```javascript
+
+getData = function(param, callback){
+ return new Promise(function(resolve, reject) {
+ $.get('http://example.com/get/'+param,
+ function(responseText){
+ resolve(responseText);
+ });
+ });
+}
+
+getData(0).then(getData)
+  .then(getData)
+  .then(getData)
+  .then(getData);
+
+
+```
+
+> **Programs must be written for people to read, and only incidentally for machines to execute.** - *Harold Abelson and Gerald Jay Sussman*
+
+
+Promises are an abstraction which make working with async operations in javascript much more fun. Callbacks lead to inversion of control, which is difficult to reason about at scale. Moving on from a continuation passing style, where you specify what needs to be done once the action is done, the callee simply returns a Promise object. This inverts the chain of responsibility, as now the caller is responsible for handling the result of the promise when it is settled.
+
+The ES2015 spec specifies that “promises must not fire their resolution/rejection function on the same turn of the event loop that they are created on.” This is an important property because it ensures deterministic order of execution. Also, once a promise is fulfilled or failed, the promise’s value MUST not be changed. This ensures that a promise cannot be resolved more than once.
+
+Let’s take an example to understand the promise resolution workflow as it happens inside the Javascript Engine.
+
+Suppose we execute a function g() which, in turn, calls a function f(). f returns a promise which, after counting down for 1000 ms, resolves with a single value, true. Once f's promise is resolved, a value of true or false is alerted based on the value of the promise.
+
+
+<figure>
+ <img src="./images/5.png" alt="timeline" />
+</figure>
+
+Now, Javascript's runtime is single-threaded. This statement is both true and not true. The thread which executes the user code is single-threaded: it executes whatever is on top of the stack, runs it to completion, and then moves on to whatever is next on the stack. But there are also a number of helper threads which handle things like network or timer/setTimeout events. A timer thread handles the countdown for setTimeout.
+
+<figure>
+ <img src="./images/6.png" alt="timeline" />
+</figure>
+
+Once the timer expires, the timer thread puts a message on the message queue. The queued-up messages are then handled by the event loop. The event loop, as described above, is simply an infinite loop which checks whether a message is ready to be processed, picks it up, and puts it on the stack for its callback to be executed.
+
+<figure>
+ <img src="./images/7.png" alt="timeline" />
+</figure>
+
+Here, since the promise is resolved with a value of true, we are alerted with true when the callback is picked up for execution.
+
+<figure>
+ <img src="./images/8.png" alt="timeline" />
+</figure>
+
+Some finer details:
+- We've ignored the heap here, but all the functions, variables and callbacks are stored on the heap.
+- Even though Javascript is said to be single-threaded, there are a number of helper threads that assist the main thread with things like timeouts, UI, network operations, and file operations.
+- Run-to-completion helps us reason about the code in a nice way: whenever a function starts, it needs to finish before yielding the main thread, so the data it accesses cannot be modified by someone else. This also means every function needs to finish in a reasonable amount of time, otherwise the program appears hung. This makes Javascript well suited for I/O tasks, which are queued up and then picked up when finished, but not for data-processing-intensive tasks, which generally take a long time to finish.
+- We haven't talked about error handling, but it is handled in the exact same way, with the error callback being called with the error object the promise was rejected with.
+
+
+Event loops have proven to be surprisingly performant. When network servers are designed around multithreading, a few hundred concurrent connections is enough for the CPU to spend so much of its time task switching that overall performance starts to degrade; switching from one thread to another has overhead which adds up significantly at scale. Apache used to choke at as low as a few hundred concurrent users when using a thread per connection, while Node can scale up to 100,000 concurrent connections based on event loops and asynchronous IO.
+
+
+## Thread Model
+
+
+The Oz programming language introduced the idea of a dataflow concurrency model. In Oz, whenever the program comes across an unbound variable, it waits for it to be resolved. This dataflow property of variables helps us write threads in Oz that communicate through streams in a producer-consumer pattern. The major benefit of a dataflow-based concurrency model is that it's deterministic: the same operation called with the same parameters always produces the same result, which makes it a lot easier to reason about concurrent programs, provided the code is side-effect free.
+
+
+Alice ML is a dialect of Standard ML with support for lazy evaluation and concurrent, distributed, and constraint programming. The early aim of the Alice project was to reconstruct the functionality of the Oz programming language on top of a typed language. Building on the Standard ML dialect, Alice also provides concurrency features as part of the language through the use of a future type. Futures in Alice represent the undetermined result of a concurrent operation; promises in Alice ML are explicit handles for futures.
+
+
+Any expression in Alice can be evaluated in its own thread using the spawn keyword. Spawn always returns a future, which acts as a placeholder for the result of the operation. Futures in Alice ML can be thought of as functional threads, in the sense that threads in Alice always have a result. A thread is said to be touching a future if it performs an operation that requires the value the future stands for. All threads touching a future are blocked until the future is resolved. If the thread computing a future raises an exception, the future is failed and this exception is re-raised in the threads touching it. Futures can also be passed along as values; this is how Alice achieves the dataflow model of concurrency.
+
+
+Alice also allows for lazy evaluation of expressions. Expressions preceded with the lazy keyword are evaluated to a lazy future, which is evaluated when it is first needed. If the computation associated with a concurrent or lazy future ends with an exception, the result is a failed future. Requesting a failed future does not block; it simply raises the exception that caused the failure.
+
+# Implicit vs. Explicit Promises
+
+
+We define implicit promises as ones where we don't have to manually trigger the computation, versus explicit promises where we have to trigger the resolution of the future manually, either by calling a start function or by requiring the value. The distinction lies in what triggers the calculation: with implicit promises, the creation of a promise also triggers the computation, while with explicit promises, one needs to trigger the resolution of a promise separately. This trigger can in turn be explicit, like calling a start method, or implicit, like lazy evaluation, where the first use of a promise's value triggers its evaluation.
+
+
+The idea of explicit futures was introduced in the Baker and Hewitt paper. They're a little trickier to implement and require some support from the underlying language, and as such they aren't that common. The Baker and Hewitt paper talked about using futures as placeholders for arguments to a function, which get evaluated in parallel when they're needed. MultiLisp also had a mechanism to delay the evaluation of a future to the time when its value is first used, using the defer construct. Lazy futures in Alice ML have a similar explicit invocation mechanism: the first thread touching a future triggers its evaluation.
+
+An example of explicit futures (from Alice ML):
+
+```
+fun enum n = lazy n :: enum (n+1)
+
+```
+
+This example generates an infinite stream of integers; if it were started eagerly when it was created, it would compete for the system's resources.
+
+Implicit futures were originally introduced by Friedman and Wise in a paper in 1978. The ideas presented in that paper inspired the design of promises in MultiLisp. Futures are also implicit in Scala and Javascript, where they're supported as libraries on top of the core languages. Implicit futures can be implemented this way because they don't require support from the language itself. Alice ML's concurrent futures are also an example of implicit invocation.
+
+For example
+
+```scala
+
+val f = Future {
+ Http("http://api.fixer.io/latest?base=USD").asString
+}
+
+f onComplete {
+ case Success(response) => println(response.body)
+ case Failure(t) => println(t)
+}
+
+```
+
+This sends the HTTP call as soon as the Future is created. In Scala, although futures are implicit, Promises can be used to get explicit-like behavior. This is useful in a scenario where we need to stack up some computations and then resolve the Promise.
+
+An example:
+
+```scala
+
+val p = Promise[Foo]()
+
+p.future.map( ... ).filter( ... ) foreach println
+
+p.success(new Foo)
+
+```
+
+Here, we create a Promise, and complete it later. In between we stack up a set of computations which get executed once the promise is completed.
+
+
+# Promise Pipelining
+
+One of the criticisms of traditional RPC systems is that they're blocking. Imagine a scenario where you need to call an API `a` and another API `b`, then aggregate the results of both calls and use that result as a parameter to a third API `c`. The logical way to go about this would be to call `a` and `b` in parallel, then, once both finish, aggregate the results and call `c`. Unfortunately, in a blocking system, the way to go about it is to call `a`, wait for it to finish, call `b`, wait, then aggregate and call `c`. This seems like a waste of time, but in the absence of asynchronicity there is no better option. Even with asynchronicity, it gets a little difficult to manage or scale the system linearly. Fortunately, we have promises.
+
+
+<figure>
+ <img src="./images/p-1.png" alt="timeline" />
+</figure>
+
+<figure>
+ <img src="./images/p-2.png" alt="timeline" />
+</figure>
+
+Futures/promises can be passed along, waited upon, or chained and joined together. These properties help make life easier for the programmers working with them, and they also reduce the latency associated with distributed computing. Promises enable dataflow concurrency, which is deterministic and easier to reason about.
+
+The history of promise pipelining can be traced back to the call streams in Argus. In Argus, call streams are a mechanism for communication between distributed components. The communicating entities, a sender and a receiver, are connected by a stream, and the sender can make calls to the receiver over it. Streams can be thought of as RPC, except that they allow the caller to run in parallel with the receiver while the call is processed. When making a call in Argus, the caller receives a promise for the result. In the paper on Promises, Liskov and Shrira mention that, having integrated promises into call streams, the next logical step is stream composition: arranging streams into pipelines where the output of one stream can be used as the input of the next. They talk about composing streams using fork and coenter.
+
+Channels in Joule were a similar idea: a channel connects an acceptor and a distributor. Joule, a direct ancestor of the E language, discussed this in more detail.
+
+```
+
+t3 := (x <- a()) <- c(y <- b())
+
+t1 := x <- a()
+t2 := y <- b()
+t3 := t1 <- c(t2)
+
+```
+
+Without pipelining in E, this call would require three round trips: first to send a() to x, then b() to y, and finally c to t1 (the result of the first call) with t2 as an argument. With pipelining, the later messages can be sent with the promises resulting from the earlier messages as arguments. This allows all the messages to be sent together, saving the costly round trips. This assumes x and y are on the same remote machine; otherwise, we can still evaluate t1 and t2 in parallel.
+
+
+Notice that this pipelining mechanism is different from asynchronous message passing: with asynchronous message passing, even if t1 and t2 get evaluated in parallel, to resolve t3 we still wait for t1 and t2 to be resolved and then send another call to the remote machine.
+
+
+Modern promise specifications, like the one in Javascript, come with methods which make working with promise pipelining easier. Javascript provides a Promise.all method, which takes in an iterable of promises and returns a new promise which gets resolved when all the promises in the iterable are resolved. There's also a Promise.race method, which returns a promise that is resolved as soon as the first promise in the iterable resolves.
+
+```javascript
+
+var p1 = Promise.resolve(1);
+var p2 = new Promise(function (resolve, reject) {
+ setTimeout(resolve, 100, 2);
+});
+
+Promise.all([p1, p2]).then(values => {
+ console.log(values); // [1,2]
+});
+
+Promise.race([p1, p2]).then(function(value) {
+ console.log(value); // 1
+});
+
+```
+
+In Scala, futures have an onSuccess method which acts as a callback for when the future completes. This callback itself can be used to chain futures together sequentially, but that results in bulkier code. Fortunately, the Scala API comes with combinators which allow the results of futures to be combined more easily; examples of combinators are map, flatMap, filter, and withFilter.
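+
+As a sketch of the a/b/c scenario described earlier (fetchA, fetchB and callC are placeholder names, not a real API), a for-comprehension over futures expresses the pipeline without any blocking:
+
+```scala
+import scala.concurrent.Future
+import scala.concurrent.ExecutionContext.Implicits.global
+
+def fetchA(): Future[Int]           = Future { 1 }         // placeholder call
+def fetchB(): Future[Int]           = Future { 2 }         // placeholder call
+def callC(sum: Int): Future[String] = Future { s"c($sum)" }
+
+// Start a and b first so they run in parallel, then combine and call c.
+val fa = fetchA()
+val fb = fetchB()
+
+val result: Future[String] = for {
+  a <- fa
+  b <- fb
+  c <- callC(a + b)
+} yield c
+```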
+
+
+# Handling Errors
+
+If the world ran without errors, we would rejoice in unison, but that is not the case, in the programming world or anywhere else. When you run a program, you either receive the expected output or an error. An error can be defined as wrong output or an exception. In a synchronous programming model, the most logical way of handling errors is a try...catch block.
+
+```javascript
+
+try{
+ do something1;
+ do something2;
+ do something3;
+ ...
+} catch ( exception ){
+ HandleException;
+}
+
+```
+
+Unfortunately, the same thing doesn’t directly translate to asynchronous code.
+
+
+```javascript
+
+foo = doSomethingAsync();
+
+try{
+ foo();
+ // This doesn’t work as the error might not have been thrown yet
+} catch ( exception ){
+ handleException;
+}
+
+
+```
+
+
+
+Although most of the earlier papers did not talk about error handling, the Promises paper by Liskov and Shrira did acknowledge the possibility of failure in a distributed environment. In Argus, the 'claim' operation waits until the promise is ready; it then returns normally if the call terminated normally, and otherwise it signals the appropriate 'exception', e.g.,
+
+```
+y: real := pt$claim(x)
+   except when foo: ...
+          when unavailable(s: string): ...
+          when failure(s: string): ...
+   end
+
+```
+Here x is a promise object of type pt; the form pt$claim illustrates the way Argus identifies an operation of a type, by concatenating the type name with the operation name. When there are communication problems, RPCs in Argus terminate with either the 'unavailable' exception or the 'failure' exception.
+- 'Unavailable' means that the problem is temporary, e.g., communication is impossible right now.
+- 'Failure' means that the problem is permanent, e.g., the handler's guardian does not exist.
+Thus stream calls (and sends) whose replies are lost because of broken streams terminate with one of these exceptions. Both exceptions carry a string argument that explains the reason for the failure, e.g., failure("handler does not exist") or unavailable("cannot communicate"). Since any call can fail, every handler can raise the failure and unavailable exceptions. The paper also talked about propagation of exceptions from the called procedure to the caller. The paper about the E language talks about broken promises and setting a promise to the exception of broken references.
+
+In modern languages like Scala, promises generally come with two callbacks: one to handle the success case and the other to handle the failure, e.g.:
+
+```scala
+
+f onComplete {
+ case Success(data) => handleSuccess(data)
+ case Failure(e) => handleFailure(e)
+}
+```
+
+In Scala, the Try type represents a computation that may either result in an exception, or return a successfully computed value. For example, Try[Int] represents a computation which can either result in Int if it's successful, or return a Throwable if something is wrong.
+
+```scala
+
+val a: Int = 100
+val b: Int = 0
+def divide: Try[Int] = Try(a/b)
+
+divide match {
+ case Success(v) =>
+ println(v)
+ case Failure(e) =>
+ println(e) // java.lang.ArithmeticException: / by zero
+}
+
+```
+
+The Try type can be pipelined, allowing exceptions to be caught and recovered from along the way.
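+
+For example, a chain of Try computations can recover from a failure part-way through (a small sketch):
+
+```scala
+import scala.util.Try
+
+val parsed: Try[Int] = Try("not-a-number".toInt)   // Failure(NumberFormatException)
+  .map(_ * 2)                                      // skipped, since the Try is a Failure
+  .recover { case _: NumberFormatException => 0 }  // recovers to Success(0)
+```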
+
+#### In Javascript
+```javascript
+
+promise.then(function (data) {
+ // success callback
+ console.log(data);
+}, function (error) {
+ // failure callback
+ console.error(error);
+});
+
+```
+Scala futures exception handling:
+
+When asynchronous computations throw unhandled exceptions, the futures associated with those computations fail. Failed futures store an instance of Throwable instead of the result value. Futures provide the onFailure callback method, which accepts a PartialFunction to be applied to the Throwable. TimeoutException, scala.runtime.NonLocalReturnControl and ExecutionException are treated differently.
+
+Scala promises exception handling:
+
+When failing a promise with an exception, three subtypes of Throwable are handled specially. If the Throwable used to break the promise is a scala.runtime.NonLocalReturnControl, then the promise is completed with the corresponding value. If the Throwable used to break the promise is an instance of Error, InterruptedException, or scala.util.control.ControlThrowable, the Throwable is wrapped as the cause of a new ExecutionException which, in turn, fails the promise.
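+
+Beyond callbacks, Scala futures can also recover from failures in a pipeline using the recover combinator; a minimal sketch:
+
+```scala
+import scala.concurrent.Future
+import scala.concurrent.ExecutionContext.Implicits.global
+
+val risky: Future[Int] = Future { 10 / 0 }   // fails with ArithmeticException
+
+val safe: Future[Int] = risky.recover {
+  case _: ArithmeticException => 0           // fall back to a default value
+}
+```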
+
+
+To handle errors with asynchronous methods and callbacks, the error-first callback style (which we've seen before, and which was adopted by Node) is the most common convention. Although this works, it is not very composable and eventually takes us back to callback hell. Fortunately, promises allow asynchronous code to apply structured error handling. The promise .then method takes two callbacks: an onFulfilled, to handle when the promise is resolved successfully, and an onRejected, to handle when it is rejected.
+
+```javascript
+
+var p = new Promise(function(resolve, reject){
+ resolve(100);
+});
+
+p.then(function(data){
+ console.log(data); // 100
+},function(error){
+  console.error(error);
+});
+
+var q = new Promise(function(resolve, reject){
+ reject(new Error(
+ {'message':'Divide by zero'}
+ ));
+});
+
+q.then(function(data){
+ console.log(data);
+},function(error){
+  console.error(error); // {'message':'Divide by zero'}
+});
+
+```
+
+
+Promises also have a catch method, which works the same way as the rejection callback, but also helps deal with errors in a composition. Exceptions in promises behave the same way as they do in a synchronous block of code: they jump to the nearest exception handler.
+
+
+```javascript
+function work(data) {
+ return Promise.resolve(data+"1");
+}
+
+function error(data) {
+ return Promise.reject(data+"2");
+}
+
+function handleError(error) {
+ return error +"3";
+}
+
+
+work("")
+.then(work)
+.then(error)
+.then(work) // this will be skipped
+.then(work, handleError)
+.then(check);
+
+function check(data) {
+ console.log(data == "1123");
+ return Promise.resolve();
+}
+
+```
+
+The same behavior can be written using a catch block.
+
+```javascript
+
+work("")
+.then(work)
+.then(error)
+.then(work)
+.catch(handleError)
+.then(check);
+
+function check(data) {
+ console.log(data == "1123");
+ return Promise.resolve();
+}
+
+```
+
+
+# Futures and Promises in Action
+
+
+## Twitter Finagle
+
+
+Finagle is a protocol-agnostic, asynchronous RPC system for the JVM that makes it easy to build robust clients and servers in Java, Scala, or any JVM-hosted language. It uses Futures to encapsulate concurrent tasks. Finagle
+introduces two other abstractions built on top of Futures to reason about distributed software:
+
+- **Services** are asynchronous functions which represent system boundaries.
+
+- **Filters** are application-independent blocks of logic like handling timeouts and authentication.
+
+In Finagle, operations describe what needs to be done, while the actual execution is left to the runtime. The runtime comes with robust implementations of connection pooling, failure detection and recovery, and load balancing.
+
+Example of a Service:
+
+
+```scala
+
+val service = new Service[HttpRequest, HttpResponse] {
+ def apply(request: HttpRequest) =
+ Future(new DefaultHttpResponse(HTTP_1_1, OK))
+}
+
+```
+A timeout filter can be implemented as:
+
+```scala
+
+def timeoutFilter(d: Duration) =
+ { (req, service) => service(req).within(d) }
+
+```
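+
+As a rough usage sketch (not Finagle's exact Filter API), the timeout filter decorates the HTTP service defined earlier into a new request-handling function; the one-second bound and the in-scope com.twitter.util.Timer are illustrative assumptions:
+
+```scala
+import com.twitter.util.{Duration, Future, Timer}
+
+// `service`, HttpRequest and HttpResponse come from the Service example above.
+def withTimeout(implicit timer: Timer): HttpRequest => Future[HttpResponse] =
+  req => service(req).within(Duration.fromSeconds(1))
+```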
+
+
+## Correctables
+Correctables were introduced by Rachid Guerraoui, Matej Pavlovic, and Dragos-Adrian Seredinschi at OSDI '16, in a paper titled Incremental Consistency Guarantees for Replicated Objects. As the title suggests, Correctables aim to solve the problems with consistency in replicated objects. They provide incremental consistency guarantees by capturing successive changes to the value of a replicated object. Applications can opt to receive a fast but possibly inconsistent result if eventual consistency is acceptable, or to wait for a strongly consistent result. The Correctables API draws inspiration from, and builds on, the API of Promises. Promises have a two-state model to represent an asynchronous task: it starts in a blocked state and proceeds to a ready state when the value is available. This cannot represent the incremental nature of Correctables, so a Correctable instead starts in an updating state. It remains in the updating state during intermediate updates, and when the final result is available, it transitions to a final state. If an error occurs in between, it moves into an error state. Each state change triggers a callback.
+
+<figure>
+ <img src="./images/15.png" alt="timeline" />
+</figure>
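+
+To make the state model concrete, here is a small hypothetical sketch in Scala of a correctable-like handle (invented names, not the paper's actual API): a callback fires on every intermediate update and once more on the final result or on error.
+
+```scala
+// Hypothetical sketch of the Correctables state model; all names are invented.
+sealed trait State[+A]
+case class Updating[A](preliminary: A) extends State[A] // fast, possibly inconsistent view
+case class Final[A](value: A)          extends State[A] // strongly consistent result
+case class Errored(e: Throwable)       extends State[Nothing]
+
+class CorrectableLike[A](onChange: State[A] => Unit) {
+  def update(preliminary: A): Unit = onChange(Updating(preliminary)) // may fire many times
+  def complete(value: A): Unit     = onChange(Final(value))          // fires once
+  def fail(e: Throwable): Unit     = onChange(Errored(e))
+}
+
+// An application that tolerates eventual consistency can act on the first
+// Updating callback; one that needs strong consistency waits for Final.
+val balance = new CorrectableLike[Int]({
+  case Updating(v) => println(s"fast, possibly stale: $v")
+  case Final(v)    => println(s"confirmed: $v")
+  case Errored(e)  => println(s"error: $e")
+})
+```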
+
+
+## Folly Futures
+Folly is a library by Facebook for asynchronous C++ inspired by the implementation of Futures by Twitter for Scala. It builds upon the futures in the C++11 standard. Like Scala's futures, they also allow implementing a custom executor, which provides different ways of running a Future (thread pool, event loop, etc.).
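+
+For comparison, the analogous knob in Scala is the ExecutionContext; a minimal sketch that runs futures on a custom fixed-size thread pool instead of the default global pool:
+
+```scala
+import java.util.concurrent.Executors
+import scala.concurrent.{ExecutionContext, Future}
+
+// Futures scheduled with this implicit context run on a dedicated
+// four-thread pool rather than the default global pool.
+implicit val customEc: ExecutionContext =
+  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4))
+
+val f = Future { 21 * 2 } // executed on the custom pool
+f.foreach(println)        // prints 42
+```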
+
+
+## NodeJS Fiber
+Fibers provide coroutine support for V8 and Node. Applications can use Fibers to let users write code without a ton of callbacks, without sacrificing the performance benefits of asynchronous IO. Think of fibers as light-weight threads for Node.js where the scheduling is in the hands of the programmer. The node-fibers library doesn’t recommend mixing the raw API and application code without any abstractions, and provides a Futures implementation which is ‘fiber-aware’.
+
+
+# References
+
+{% bibliography --file futures %}
diff --git a/chapter/2/images/1.png b/chapter/2/images/1.png
new file mode 100644
index 0000000..569c326
--- /dev/null
+++ b/chapter/2/images/1.png
Binary files differ
diff --git a/chapter/2/images/15.png b/chapter/2/images/15.png
new file mode 100644
index 0000000..15a2a81
--- /dev/null
+++ b/chapter/2/images/15.png
Binary files differ
diff --git a/chapter/2/images/5.png b/chapter/2/images/5.png
new file mode 100644
index 0000000..b86de04
--- /dev/null
+++ b/chapter/2/images/5.png
Binary files differ
diff --git a/chapter/2/images/6.png b/chapter/2/images/6.png
new file mode 100644
index 0000000..aaafdbd
--- /dev/null
+++ b/chapter/2/images/6.png
Binary files differ
diff --git a/chapter/2/images/7.png b/chapter/2/images/7.png
new file mode 100644
index 0000000..7183fb6
--- /dev/null
+++ b/chapter/2/images/7.png
Binary files differ
diff --git a/chapter/2/images/8.png b/chapter/2/images/8.png
new file mode 100644
index 0000000..d6d2e0e
--- /dev/null
+++ b/chapter/2/images/8.png
Binary files differ
diff --git a/chapter/2/images/9.png b/chapter/2/images/9.png
new file mode 100644
index 0000000..1b67a45
--- /dev/null
+++ b/chapter/2/images/9.png
Binary files differ
diff --git a/chapter/2/images/p-1.png b/chapter/2/images/p-1.png
new file mode 100644
index 0000000..7061fe3
--- /dev/null
+++ b/chapter/2/images/p-1.png
Binary files differ
diff --git a/chapter/2/images/p-1.svg b/chapter/2/images/p-1.svg
new file mode 100644
index 0000000..87e180b
--- /dev/null
+++ b/chapter/2/images/p-1.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" standalone="yes"?>
+
+<svg version="1.1" viewBox="0.0 0.0 720.0 540.0" fill="none" stroke="none" stroke-linecap="square" stroke-miterlimit="10" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><clipPath id="p.0"><path d="m0 0l720.0 0l0 540.0l-720.0 0l0 -540.0z" clip-rule="nonzero"></path></clipPath><g clip-path="url(#p.0)"><path fill="#000000" fill-opacity="0.0" d="m0 0l720.0 0l0 540.0l-720.0 0z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m273.45404 246.13159l163.37006 -17.354324" fill-rule="nonzero"></path><path stroke="#93c47d" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m276.86194 245.76958l159.96216 -16.99231" fill-rule="evenodd"></path><path fill="#93c47d" stroke="#93c47d" stroke-width="1.0" stroke-linecap="butt" d="m276.86194 245.76959l0.9995117 -1.2370911l-2.9537048 1.4446716l3.1912842 0.7919159z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m40.0 88.13911l613.98425 0l0 44.000008l-613.98425 0z" fill-rule="nonzero"></path><path fill="#000000" d="m151.20563 109.902855q0 0.34375 -0.015625 0.578125q-0.015625 0.234375 -0.03125 0.4375l-6.53125 0q0 1.4375 0.796875 2.203125q0.796875 0.765625 2.296875 0.765625q0.40625 0 0.8125 -0.03125q0.40625 -0.046875 0.78125 -0.09375q0.390625 -0.0625 0.734375 -0.125q0.34375 -0.078125 0.640625 -0.15625l0 1.328125q-0.65625 0.1875 -1.484375 0.296875q-0.828125 0.125 -1.71875 0.125q-1.203125 0 -2.0625 -0.328125q-0.859375 -0.328125 -1.421875 -0.9375q-0.546875 -0.625 -0.8125 -1.515625q-0.265625 -0.890625 -0.265625 -2.03125q0 -0.984375 0.28125 -1.859375q0.296875 -0.875 0.828125 -1.53125q0.546875 -0.671875 1.328125 -1.0625q0.796875 -0.390625 1.796875 -0.390625q0.984375 0 1.734375 0.3125q0.75 0.296875 1.265625 0.859375q0.515625 0.5625 0.78125 1.375q0.265625 0.796875 0.265625 1.78125zm-1.6875 -0.21875q0.03125 -0.625 -0.125 -1.140625q-0.140625 -0.515625 -0.453125 -0.890625q-0.3125 -0.375 -0.78125 -0.578125q-0.453125 -0.203125 -1.078125 -0.203125q-0.515625 0 -0.953125 0.203125q-0.4375 0.203125 -0.765625 0.578125q-0.3125 0.359375 -0.5 0.890625q-0.1875 0.515625 -0.234375 1.140625l4.890625 0zm12.460419 5.375l-2.140625 0l-2.515625 -3.546875l-2.484375 3.546875l-2.078125 0l3.609375 -4.671875l-3.453125 -4.640625l2.078125 0l2.4375 3.578125l2.40625 -3.578125l2.0 0l-3.5 4.671875l3.640625 4.640625zm9.819794 -4.828125q0 1.25 -0.34375 2.1875q-0.34375 0.921875 -0.953125 1.53125q-0.609375 0.609375 -1.453125 0.921875q-0.828125 0.296875 -1.8125 0.296875q-0.4375 0 -0.890625 -0.046875q-0.4375 -0.046875 -0.890625 -0.15625l0 3.890625l-1.609375 0l0 -13.109375l1.4375 0l0.109375 1.5625q0.6875 -0.96875 1.46875 -1.34375q0.796875 -0.390625 1.71875 -0.390625q0.796875 0 1.390625 0.34375q0.609375 0.328125 1.015625 0.9375q0.40625 0.609375 0.609375 1.46875q0.203125 0.859375 0.203125 1.90625zm-1.640625 0.078125q0 -0.734375 -0.109375 -1.34375q-0.109375 -0.609375 -0.34375 -1.046875q-0.234375 -0.4375 -0.59375 -0.6875q-0.359375 -0.25 -0.859375 -0.25q-0.3125 0 -0.625 0.109375q-0.3125 0.09375 -0.65625 0.328125q-0.328125 0.21875 -0.703125 0.59375q-0.375 0.375 -0.8125 0.9375l0 4.515625q0.453125 0.1875 0.9375 0.296875q0.5 0.09375 0.96875 0.09375q1.3125 0 2.046875 -0.875q0.75 -0.890625 0.75 -2.671875zm21.936462 -1.25l-7.984375 0l0 -1.359375l7.984375 0l0 1.359375zm0 3.234375l-7.984375 0l0 -1.359375l7.984375 0l0 1.359375z" fill-rule="nonzero"></path><path fill="#0000ff" d="m210.85876 115.059105l-0.03125 -1.25q-0.765625 0.75 -1.546875 1.09375q-0.78125 0.328125 -1.65625 0.328125q-0.796875 0 -1.359375 -0.203125q-0.5625 -0.203125 
-0.9375 -0.5625q-0.359375 -0.359375 -0.53125 -0.84375q-0.171875 -0.484375 -0.171875 -1.046875q0 -1.40625 1.046875 -2.1875q1.046875 -0.796875 3.078125 -0.796875l1.9375 0l0 -0.828125q0 -0.8125 -0.53125 -1.3125q-0.53125 -0.5 -1.609375 -0.5q-0.796875 0 -1.5625 0.1875q-0.765625 0.171875 -1.578125 0.484375l0 -1.453125q0.296875 -0.109375 0.671875 -0.21875q0.390625 -0.109375 0.796875 -0.1875q0.421875 -0.078125 0.875 -0.125q0.453125 -0.0625 0.921875 -0.0625q0.84375 0 1.515625 0.1875q0.6875 0.1875 1.15625 0.578125q0.46875 0.375 0.71875 0.953125q0.25 0.5625 0.25 1.34375l0 6.421875l-1.453125 0zm-0.171875 -4.234375l-2.0625 0q-0.59375 0 -1.03125 0.125q-0.4375 0.109375 -0.71875 0.34375q-0.28125 0.21875 -0.421875 0.53125q-0.125 0.296875 -0.125 0.6875q0 0.28125 0.078125 0.53125q0.09375 0.234375 0.28125 0.421875q0.1875 0.1875 0.484375 0.3125q0.296875 0.109375 0.71875 0.109375q0.5625 0 1.28125 -0.34375q0.71875 -0.34375 1.515625 -1.078125l0 -1.640625z" fill-rule="nonzero"></path><path fill="#980000" d="m230.9671 118.94973q-4.28125 -3.953125 -4.28125 -8.75q0 -1.125 0.21875 -2.234375q0.234375 -1.125 0.734375 -2.25q0.515625 -1.125 1.34375 -2.234375q0.828125 -1.125 2.015625 -2.234375l0.9375 0.953125q-3.59375 3.546875 -3.59375 7.875q0 2.15625 0.90625 4.140625q0.90625 1.984375 2.6875 3.75l-0.96875 0.984375z" fill-rule="nonzero"></path><path fill="#0000ff" d="m253.85669 110.23098q0 1.15625 -0.328125 2.078125q-0.3125 0.90625 -0.90625 1.546875q-0.578125 0.640625 -1.421875 0.984375q-0.84375 0.328125 -1.90625 0.328125q-0.828125 0 -1.6875 -0.15625q-0.859375 -0.15625 -1.703125 -0.5l0 -12.5625l1.609375 0l0 3.609375l-0.0625 1.71875q0.6875 -0.9375 1.484375 -1.3125q0.796875 -0.390625 1.703125 -0.390625q0.796875 0 1.390625 0.34375q0.609375 0.328125 1.015625 0.9375q0.40625 0.609375 0.609375 1.46875q0.203125 0.859375 0.203125 1.90625zm-1.640625 0.078125q0 -0.734375 -0.109375 -1.34375q-0.109375 -0.609375 -0.34375 -1.046875q-0.234375 -0.4375 -0.59375 -0.6875q-0.359375 -0.25 -0.859375 -0.25q-0.3125 0 -0.625 0.109375q-0.3125 0.09375 -0.65625 0.328125q-0.328125 0.21875 -0.703125 0.59375q-0.375 0.375 -0.8125 0.9375l0 4.515625q0.484375 0.1875 0.96875 0.296875q0.5 0.09375 0.9375 0.09375q0.5625 0 1.0625 -0.171875q0.5 -0.171875 0.890625 -0.578125q0.390625 -0.421875 0.609375 -1.09375q0.234375 -0.6875 0.234375 -1.703125z" fill-rule="nonzero"></path><path fill="#980000" d="m271.99628 118.94973q-4.28125 -3.953125 -4.28125 -8.75q0 -1.125 0.21875 -2.234375q0.234375 -1.125 0.734375 -2.25q0.515625 -1.125 1.34375 -2.234375q0.828125 -1.125 2.015625 -2.234375l0.9375 0.953125q-3.59375 3.546875 -3.59375 7.875q0 2.15625 0.90625 4.140625q0.90625 1.984375 2.6875 3.75l-0.96875 0.984375z" fill-rule="nonzero"></path><path fill="#0000ff" d="m294.1671 114.715355q-0.625 0.234375 -1.296875 0.34375q-0.65625 0.125 -1.359375 0.125q-2.21875 0 -3.40625 -1.1875q-1.1875 -1.203125 -1.1875 -3.5q0 -1.109375 0.34375 -2.0q0.34375 -0.90625 0.953125 -1.546875q0.625 -0.640625 1.484375 -0.984375q0.875 -0.34375 1.90625 -0.34375q0.734375 0 1.359375 0.109375q0.625 0.09375 1.203125 0.3125l0 1.546875q-0.59375 -0.3125 -1.234375 -0.453125q-0.625 -0.15625 -1.28125 -0.15625q-0.625 0 -1.1875 0.25q-0.546875 0.234375 -0.96875 0.6875q-0.40625 0.4375 -0.65625 1.078125q-0.234375 0.640625 -0.234375 1.4375q0 1.6875 0.8125 2.53125q0.828125 0.84375 2.28125 0.84375q0.65625 0 1.265625 -0.140625q0.625 -0.15625 1.203125 -0.453125l0 1.5z" fill-rule="nonzero"></path><path fill="#980000" d="m313.02545 118.94973q-4.28125 -3.953125 -4.28125 -8.75q0 -1.125 0.21875 -2.234375q0.234375 -1.125 0.734375 
-2.25q0.515625 -1.125 1.34375 -2.234375q0.828125 -1.125 2.015625 -2.234375l0.9375 0.953125q-3.59375 3.546875 -3.59375 7.875q0 2.15625 0.90625 4.140625q0.90625 1.984375 2.6875 3.75l-0.96875 0.984375zm6.5854187 -17.703125q4.265625 3.953125 4.265625 8.8125q0 1.0 -0.203125 2.078125q-0.203125 1.078125 -0.703125 2.203125q-0.484375 1.125 -1.3125 2.28125q-0.828125 1.171875 -2.09375 2.328125l-0.9375 -0.953125q1.8125 -1.78125 2.703125 -3.734375q0.890625 -1.953125 0.890625 -4.078125q0 -4.421875 -3.59375 -7.953125l0.984375 -0.984375z" fill-rule="nonzero"></path><path fill="#980000" d="m340.12546 101.246605q4.265625 3.953125 4.265625 8.8125q0 1.0 -0.203125 2.078125q-0.203125 1.078125 -0.703125 2.203125q-0.484375 1.125 -1.3125 2.28125q-0.828125 1.171875 -2.09375 2.328125l-0.9375 -0.953125q1.8125 -1.78125 2.703125 -3.734375q0.890625 -1.953125 0.890625 -4.078125q0 -4.421875 -3.59375 -7.953125l0.984375 -0.984375z" fill-rule="nonzero"></path><path fill="#000000" d="m349.19525 116.965355q0.484375 0.015625 0.921875 -0.09375q0.453125 -0.09375 0.78125 -0.296875q0.34375 -0.203125 0.546875 -0.5q0.203125 -0.296875 0.203125 -0.671875q0 -0.390625 -0.140625 -0.625q-0.125 -0.25 -0.296875 -0.453125q-0.171875 -0.203125 -0.3125 -0.4375q-0.125 -0.234375 -0.125 -0.625q0 -0.1875 0.078125 -0.390625q0.078125 -0.21875 0.21875 -0.390625q0.15625 -0.1875 0.390625 -0.296875q0.25 -0.109375 0.5625 -0.109375q0.328125 0 0.625 0.140625q0.3125 0.125 0.53125 0.40625q0.234375 0.28125 0.359375 0.703125q0.140625 0.40625 0.140625 0.96875q0 0.78125 -0.28125 1.484375q-0.28125 0.703125 -0.84375 1.25q-0.5625 0.5625 -1.40625 0.875q-0.828125 0.328125 -1.953125 0.328125l0 -1.265625z" fill-rule="nonzero"></path><path fill="#0000ff" d="m368.52234 110.590355q0 -1.1875 0.3125 -2.109375q0.328125 -0.921875 0.921875 -1.546875q0.609375 -0.640625 1.4375 -0.96875q0.84375 -0.328125 1.875 -0.328125q0.453125 0 0.875 0.0625q0.4375 0.046875 0.859375 0.171875l0 -3.921875l1.625 0l0 13.109375l-1.453125 0l-0.0625 -1.765625q-0.671875 0.984375 -1.46875 1.46875q-0.78125 0.46875 -1.703125 0.46875q-0.796875 0 -1.40625 -0.328125q-0.59375 -0.34375 -1.0 -0.953125q-0.40625 -0.609375 -0.609375 -1.453125q-0.203125 -0.859375 -0.203125 -1.90625zm1.640625 -0.09375q0 1.6875 0.5 2.515625q0.5 0.828125 1.40625 0.828125q0.609375 0 1.296875 -0.546875q0.6875 -0.546875 1.4375 -1.625l0 -4.3125q-0.40625 -0.1875 -0.890625 -0.28125q-0.484375 -0.109375 -0.953125 -0.109375q-1.3125 0 -2.0625 0.859375q-0.734375 0.84375 -0.734375 2.671875z" fill-rule="nonzero"></path><path fill="#980000" d="m395.0838 118.94973q-4.28125 -3.953125 -4.28125 -8.75q0 -1.125 0.21875 -2.234375q0.234375 -1.125 0.734375 -2.25q0.515625 -1.125 1.34375 -2.234375q0.828125 -1.125 2.015625 -2.234375l0.9375 0.953125q-3.59375 3.546875 -3.59375 7.875q0 2.15625 0.90625 4.140625q0.90625 1.984375 2.6875 3.75l-0.96875 0.984375z" fill-rule="nonzero"></path><path fill="#0000ff" d="m417.89526 109.902855q0 0.34375 -0.015625 0.578125q-0.015625 0.234375 -0.03125 0.4375l-6.53125 0q0 1.4375 0.796875 2.203125q0.796875 0.765625 2.296875 0.765625q0.40625 0 0.8125 -0.03125q0.40625 -0.046875 0.78125 -0.09375q0.390625 -0.0625 0.734375 -0.125q0.34375 -0.078125 0.640625 -0.15625l0 1.328125q-0.65625 0.1875 -1.484375 0.296875q-0.828125 0.125 -1.71875 0.125q-1.203125 0 -2.0625 -0.328125q-0.859375 -0.328125 -1.421875 -0.9375q-0.546875 -0.625 -0.8125 -1.515625q-0.265625 -0.890625 -0.265625 -2.03125q0 -0.984375 0.28125 -1.859375q0.296875 -0.875 0.828125 -1.53125q0.546875 -0.671875 1.328125 -1.0625q0.796875 -0.390625 1.796875 -0.390625q0.984375 0 1.734375 
0.3125q0.75 0.296875 1.265625 0.859375q0.515625 0.5625 0.78125 1.375q0.265625 0.796875 0.265625 1.78125zm-1.6875 -0.21875q0.03125 -0.625 -0.125 -1.140625q-0.140625 -0.515625 -0.453125 -0.890625q-0.3125 -0.375 -0.78125 -0.578125q-0.453125 -0.203125 -1.078125 -0.203125q-0.515625 0 -0.953125 0.203125q-0.4375 0.203125 -0.765625 0.578125q-0.3125 0.359375 -0.5 0.890625q-0.1875 0.515625 -0.234375 1.140625l4.890625 0z" fill-rule="nonzero"></path><path fill="#980000" d="m436.11298 118.94973q-4.28125 -3.953125 -4.28125 -8.75q0 -1.125 0.21875 -2.234375q0.234375 -1.125 0.734375 -2.25q0.515625 -1.125 1.34375 -2.234375q0.828125 -1.125 2.015625 -2.234375l0.9375 0.953125q-3.59375 3.546875 -3.59375 7.875q0 2.15625 0.90625 4.140625q0.90625 1.984375 2.6875 3.75l-0.96875 0.984375zm6.5854187 -17.703125q4.265625 3.953125 4.265625 8.8125q0 1.0 -0.203125 2.078125q-0.203125 1.078125 -0.703125 2.203125q-0.484375 1.125 -1.3125 2.28125q-0.828125 1.171875 -2.09375 2.328125l-0.9375 -0.953125q1.8125 -1.78125 2.703125 -3.734375q0.890625 -1.953125 0.890625 -4.078125q0 -4.421875 -3.59375 -7.953125l0.984375 -0.984375z" fill-rule="nonzero"></path><path fill="#000000" d="m451.7682 116.965355q0.484375 0.015625 0.921875 -0.09375q0.453125 -0.09375 0.78125 -0.296875q0.34375 -0.203125 0.546875 -0.5q0.203125 -0.296875 0.203125 -0.671875q0 -0.390625 -0.140625 -0.625q-0.125 -0.25 -0.296875 -0.453125q-0.171875 -0.203125 -0.3125 -0.4375q-0.125 -0.234375 -0.125 -0.625q0 -0.1875 0.078125 -0.390625q0.078125 -0.21875 0.21875 -0.390625q0.15625 -0.1875 0.390625 -0.296875q0.25 -0.109375 0.5625 -0.109375q0.328125 0 0.625 0.140625q0.3125 0.125 0.53125 0.40625q0.234375 0.28125 0.359375 0.703125q0.140625 0.40625 0.140625 0.96875q0 0.78125 -0.28125 1.484375q-0.28125 0.703125 -0.84375 1.25q-0.5625 0.5625 -1.40625 0.875q-0.828125 0.328125 -1.953125 0.328125l0 -1.265625z" fill-rule="nonzero"></path><path fill="#0000ff" d="m479.82965 103.44973q-1.265625 -0.265625 -2.1875 -0.265625q-2.1875 0 -2.1875 2.28125l0 1.640625l4.09375 0l0 1.34375l-4.09375 0l0 6.609375l-1.640625 0l0 -6.609375l-2.984375 0l0 -1.34375l2.984375 0l0 -1.546875q0 -3.71875 3.875 -3.71875q0.96875 0 2.140625 0.21875l0 1.390625zm-9.75 2.296875l0 0z" fill-rule="nonzero"></path><path fill="#980000" d="m497.65674 118.94973q-4.28125 -3.953125 -4.28125 -8.75q0 -1.125 0.21875 -2.234375q0.234375 -1.125 0.734375 -2.25q0.515625 -1.125 1.34375 -2.234375q0.828125 -1.125 2.015625 -2.234375l0.9375 0.953125q-3.59375 3.546875 -3.59375 7.875q0 2.15625 0.90625 4.140625q0.90625 1.984375 2.6875 3.75l-0.96875 0.984375zm6.5854187 -17.703125q4.265625 3.953125 4.265625 8.8125q0 1.0 -0.203125 2.078125q-0.203125 1.078125 -0.703125 2.203125q-0.484375 1.125 -1.3125 2.28125q-0.828125 1.171875 -2.09375 2.328125l-0.9375 -0.953125q1.8125 -1.78125 2.703125 -3.734375q0.890625 -1.953125 0.890625 -4.078125q0 -4.421875 -3.59375 -7.953125l0.984375 -0.984375z" fill-rule="nonzero"></path><path fill="#980000" d="m524.7567 101.246605q4.265625 3.953125 4.265625 8.8125q0 1.0 -0.203125 2.078125q-0.203125 1.078125 -0.703125 2.203125q-0.484375 1.125 -1.3125 2.28125q-0.828125 1.171875 -2.09375 2.328125l-0.9375 -0.953125q1.8125 -1.78125 2.703125 -3.734375q0.890625 -1.953125 0.890625 -4.078125q0 -4.421875 -3.59375 -7.953125l0.984375 -0.984375zm20.514648 0q4.265625 3.953125 4.265625 8.8125q0 1.0 -0.203125 2.078125q-0.203125 1.078125 -0.703125 2.203125q-0.484375 1.125 -1.3125 2.28125q-0.828125 1.171875 -2.09375 2.328125l-0.9375 -0.953125q1.8125 -1.78125 2.703125 -3.734375q0.890625 -1.953125 0.890625 -4.078125q0 -4.421875 -3.59375 
-7.953125l0.984375 -0.984375z" fill-rule="nonzero"></path><path fill="#d0e0e3" d="m433.02225 144.22713l64.97638 0l0 384.75592l-64.97638 0z" fill-rule="nonzero"></path><path stroke="#000000" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m433.02225 144.22713l64.97638 0l0 384.75592l-64.97638 0z" fill-rule="nonzero"></path><path fill="#434343" d="m469.6318 285.2438q0 0.859375 -0.359375 1.515625q-0.34375 0.640625 -0.984375 1.078125q-0.625 0.421875 -1.515625 0.640625q-0.875 0.21875 -1.953125 0.21875q-0.46875 0 -0.953125 -0.046875q-0.484375 -0.03125 -0.921875 -0.09375q-0.4375 -0.046875 -0.828125 -0.125q-0.390625 -0.078125 -0.703125 -0.15625l0 -1.59375q0.6875 0.25 1.546875 0.40625q0.875 0.140625 1.984375 0.140625q0.796875 0 1.359375 -0.125q0.5625 -0.125 0.921875 -0.359375q0.359375 -0.25 0.515625 -0.59375q0.171875 -0.359375 0.171875 -0.8125q0 -0.5 -0.28125 -0.84375q-0.265625 -0.34375 -0.71875 -0.609375q-0.4375 -0.28125 -1.015625 -0.5q-0.578125 -0.234375 -1.171875 -0.46875q-0.59375 -0.25 -1.171875 -0.53125q-0.5625 -0.28125 -1.015625 -0.671875q-0.4375 -0.390625 -0.71875 -0.90625q-0.265625 -0.515625 -0.265625 -1.234375q0 -0.625 0.265625 -1.21875q0.265625 -0.609375 0.8125 -1.078125q0.546875 -0.46875 1.40625 -0.75q0.859375 -0.296875 2.046875 -0.296875q0.296875 0 0.65625 0.03125q0.359375 0.03125 0.71875 0.078125q0.375 0.046875 0.734375 0.125q0.359375 0.0625 0.65625 0.125l0 1.484375q-0.71875 -0.203125 -1.4375 -0.296875q-0.703125 -0.109375 -1.375 -0.109375q-1.421875 0 -2.09375 0.46875q-0.65625 0.46875 -0.65625 1.265625q0 0.5 0.265625 0.859375q0.28125 0.34375 0.71875 0.625q0.453125 0.28125 1.015625 0.515625q0.578125 0.21875 1.171875 0.46875q0.59375 0.234375 1.15625 0.515625q0.578125 0.28125 1.015625 0.6875q0.453125 0.390625 0.71875 0.921875q0.28125 0.515625 0.28125 1.25z" fill-rule="nonzero"></path><path fill="#434343" d="m469.14743 310.52505l-6.90625 0l0 -12.125l6.90625 0l0 1.390625l-5.25 0l0 3.75l5.03125 0l0 1.40625l-5.03125 0l0 4.171875l5.25 0l0 1.40625z" fill-rule="nonzero"></path><path fill="#434343" d="m470.1318 332.52505l-1.859375 0l-1.8125 -3.875q-0.203125 -0.453125 -0.421875 -0.734375q-0.203125 -0.296875 -0.453125 -0.46875q-0.25 -0.171875 -0.546875 -0.25q-0.28125 -0.078125 -0.640625 -0.078125l-0.78125 0l0 5.40625l-1.65625 0l0 -12.125l3.25 0q1.046875 0 1.8125 0.234375q0.765625 0.234375 1.25 0.65625q0.484375 0.40625 0.703125 1.0q0.234375 0.578125 0.234375 1.296875q0 0.5625 -0.171875 1.078125q-0.15625 0.5 -0.484375 0.921875q-0.328125 0.40625 -0.828125 0.71875q-0.484375 0.296875 -1.109375 0.4375q0.515625 0.171875 0.859375 0.625q0.359375 0.4375 0.734375 1.171875l1.921875 3.984375zm-2.640625 -8.796875q0 -0.96875 -0.609375 -1.453125q-0.609375 -0.484375 -1.71875 -0.484375l-1.546875 0l0 4.015625l1.328125 0q0.59375 0 1.0625 -0.140625q0.46875 -0.140625 0.796875 -0.40625q0.328125 -0.265625 0.5 -0.640625q0.1875 -0.390625 0.1875 -0.890625z" fill-rule="nonzero"></path><path fill="#434343" d="m470.78806 342.40005l-4.109375 12.125l-2.234375 0l-4.03125 -12.125l1.875 0l2.609375 8.171875l0.75 2.390625l0.75 -2.390625l2.625 -8.171875l1.765625 0z" fill-rule="nonzero"></path><path fill="#434343" d="m469.14743 376.52505l-6.90625 0l0 -12.125l6.90625 0l0 1.390625l-5.25 0l0 3.75l5.03125 0l0 1.40625l-5.03125 0l0 4.171875l5.25 0l0 1.40625z" fill-rule="nonzero"></path><path fill="#434343" d="m470.1318 398.52505l-1.859375 0l-1.8125 -3.875q-0.203125 -0.453125 -0.421875 -0.734375q-0.203125 -0.296875 -0.453125 -0.46875q-0.25 -0.171875 -0.546875 -0.25q-0.28125 -0.078125 -0.640625 -0.078125l-0.78125 0l0 
5.40625l-1.65625 0l0 -12.125l3.25 0q1.046875 0 1.8125 0.234375q0.765625 0.234375 1.25 0.65625q0.484375 0.40625 0.703125 1.0q0.234375 0.578125 0.234375 1.296875q0 0.5625 -0.171875 1.078125q-0.15625 0.5 -0.484375 0.921875q-0.328125 0.40625 -0.828125 0.71875q-0.484375 0.296875 -1.109375 0.4375q0.515625 0.171875 0.859375 0.625q0.359375 0.4375 0.734375 1.171875l1.921875 3.984375zm-2.640625 -8.796875q0 -0.96875 -0.609375 -1.453125q-0.609375 -0.484375 -1.71875 -0.484375l-1.546875 0l0 4.015625l1.328125 0q0.59375 0 1.0625 -0.140625q0.46875 -0.140625 0.796875 -0.40625q0.328125 -0.265625 0.5 -0.640625q0.1875 -0.390625 0.1875 -0.890625z" fill-rule="nonzero"></path><path fill="#d0e0e3" d="m208.50131 144.22713l64.97638 0l0 384.75592l-64.97638 0z" fill-rule="nonzero"></path><path stroke="#000000" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m208.50131 144.22713l64.97638 0l0 384.75592l-64.97638 0z" fill-rule="nonzero"></path><path fill="#434343" d="m245.09523 288.07193q-1.453125 0.609375 -3.0625 0.609375q-2.5625 0 -3.9375 -1.53125q-1.375 -1.546875 -1.375 -4.546875q0 -1.46875 0.375 -2.640625q0.375 -1.171875 1.078125 -2.0q0.71875 -0.828125 1.71875 -1.265625q1.0 -0.453125 2.234375 -0.453125q0.84375 0 1.5625 0.15625q0.734375 0.140625 1.40625 0.4375l0 1.625q-0.65625 -0.359375 -1.375 -0.546875q-0.703125 -0.203125 -1.53125 -0.203125q-0.859375 0 -1.546875 0.328125q-0.6875 0.3125 -1.171875 0.921875q-0.484375 0.609375 -0.75 1.484375q-0.25 0.875 -0.25 2.0q0 2.359375 0.953125 3.5625q0.953125 1.1875 2.796875 1.1875q0.78125 0 1.5 -0.171875q0.71875 -0.1875 1.375 -0.515625l0 1.5625z" fill-rule="nonzero"></path><path fill="#434343" d="m245.00148 310.52505l-6.984375 0l0 -12.125l1.6875 0l0 10.71875l5.296875 0l0 1.40625z" fill-rule="nonzero"></path><path fill="#434343" d="m240.25148 321.79068l-2.796875 0l0 -1.390625l7.25 0l0 1.390625l-2.78125 0l0 9.328125l2.78125 0l0 1.40625l-7.25 0l0 -1.40625l2.796875 0l0 -9.328125z" fill-rule="nonzero"></path><path fill="#434343" d="m244.62648 354.52505l-6.90625 0l0 -12.125l6.90625 0l0 1.390625l-5.25 0l0 3.75l5.03125 0l0 1.40625l-5.03125 0l0 4.171875l5.25 0l0 1.40625z" fill-rule="nonzero"></path><path fill="#434343" d="m245.22023 376.52505l-2.15625 0l-3.53125 -7.5625l-1.03125 -2.421875l0 6.109375l0 3.875l-1.53125 0l0 -12.125l2.125 0l3.359375 7.15625l1.21875 2.78125l0 -6.5l0 -3.4375l1.546875 0l0 12.125z" fill-rule="nonzero"></path><path fill="#434343" d="m245.5171 387.8063l-3.59375 0l0 10.71875l-1.671875 0l0 -10.71875l-3.59375 0l0 -1.40625l8.859375 0l0 1.40625z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m275.37988 154.37495l159.52756 23.685043" fill-rule="nonzero"></path><path stroke="#e06666" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m275.3799 154.37495l156.1376 23.181747" fill-rule="evenodd"></path><path fill="#e06666" stroke="#e06666" stroke-width="1.0" stroke-linecap="butt" d="m431.51752 177.5567l-1.2775574 0.9472351l3.2214355 -0.6586304l-2.8911133 -1.5661621z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m343.18604 134.28076l22.677185 2.9606323l-2.551178 14.992126l-22.677185 -2.9606323z" fill-rule="nonzero"></path><path fill="#e06666" d="m354.0799 156.38391l1.5900879 0.42819214q-0.5466919 1.6304474 -1.8093567 2.4425812q-1.2600708 0.79670715 -2.871399 0.5863342q-1.9986572 -0.26094055 -3.0022888 -1.7156067q-0.98553467 -1.4680634 -0.57107544 -3.903656q0.26757812 -1.5723419 0.97821045 -2.6771393q0.72875977 -1.1181946 1.8974915 -1.5643921q1.1868591 -0.45959473 2.4418335 -0.2957611q1.5958252 
0.2083435 2.4664917 1.1414185q0.870697 0.9330597 0.9132385 2.451355l-1.6532898 0.03627014q-0.06451416 -1.0169067 -0.56933594 -1.5870667q-0.5048218 -0.57014465 -1.3259583 -0.6773529q-1.2549744 -0.16383362 -2.1972961 0.6270752q-0.92681885 0.79293823 -1.2547302 2.7198334q-0.33312988 1.9577179 0.27389526 2.9509125q0.6096802 0.9777832 1.8181763 1.1355591q0.9760742 0.12742615 1.7265015 -0.37338257q0.75302124 -0.51623535 1.1488037 -1.725174z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m273.4716 190.73802l160.53543 -11.842529" fill-rule="nonzero"></path><path stroke="#e06666" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m276.88937 190.48589l157.11765 -11.590378" fill-rule="evenodd"></path><path fill="#e06666" stroke="#e06666" stroke-width="1.0" stroke-linecap="butt" d="m276.88937 190.4859l1.0388184 -1.2042694l-2.9986572 1.3488464l3.1641235 0.8942261z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m275.37988 209.34238l158.61417 19.433075" fill-rule="nonzero"></path><path stroke="#93c47d" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m275.3799 209.34238l155.2125 19.016296" fill-rule="evenodd"></path><path fill="#93c47d" stroke="#93c47d" stroke-width="1.0" stroke-linecap="butt" d="m430.5924 228.35869l-1.2529907 0.9794769l3.2035828 -0.7404938l-2.9300842 -1.4919891z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m333.4637 185.07516l55.62204 7.3385773l-2.551178 14.992126l-55.62204 -7.3385773z" fill-rule="nonzero"></path><path fill="#93c47d" d="m344.0724 210.80086l-1.5335693 -0.20233154l2.282013 -13.410477l1.6420288 0.21664429l-0.81314087 4.7784424q1.2763367 -1.1712341 2.9028625 -0.9566345q0.898468 0.11853027 1.6410217 0.5947571q0.74520874 0.46080017 1.1436157 1.1910706q0.4165039 0.7168884 0.5534363 1.6805725q0.13696289 0.96369934 -0.041412354 2.0118713q-0.42492676 2.4971313 -1.897644 3.70549q-1.4700928 1.1929626 -3.2050476 0.96406555q-1.7349548 -0.22891235 -2.4669495 -1.7911987l-0.20721436 1.2177277zm0.82388306 -4.9346313q-0.29638672 1.7418213 0.0345459 2.589264q0.5749512 1.3682098 1.907135 1.5439758q1.0843506 0.1430664 2.0317688 -0.6775665q0.9500427 -0.83602905 1.2674255 -2.7011719q0.32263184 -1.8959656 -0.28164673 -2.905548q-0.6043091 -1.0095978 -1.6731567 -1.1506195q-1.0843506 -0.1430664 -2.0343933 0.6929779q-0.9500427 0.8360443 -1.2516785 2.6086884zm10.417725 10.452469q-1.069397 -1.90625 -1.6235046 -4.327667q-0.55148315 -2.4368134 -0.13180542 -4.9031067q0.37246704 -2.1888428 1.4079285 -4.085312q1.22995 -2.2017975 3.340271 -4.2716675l1.1927795 0.15737915q-1.4379578 1.7488098 -1.9332886 2.518753q-0.7727356 1.1903992 -1.3315125 2.5193634q-0.6939087 1.6578522 -0.9876709 3.384262q-0.7501831 4.408493 1.2595825 9.165375l-1.1927795 -0.15737915zm10.658783 -6.2690277l1.5898132 0.43040466q-0.54663086 1.6300049 -1.809082 2.4405823q-1.2598572 0.795166 -2.8709106 0.5826111q-1.998291 -0.26365662 -3.0017395 -1.7199249q-0.98532104 -1.469635 -0.5708618 -3.9050903q0.2675476 -1.5722656 0.9780884 -2.6763153q0.7286377 -1.1174164 1.8971863 -1.5621338q1.1866455 -0.45809937 2.4414062 -0.2925415q1.5955505 0.21051025 2.466034 1.1448975q0.8705139 0.9343872 0.9130249 2.453003l-1.6530151 0.034072876q-0.06448364 -1.0171661 -0.56918335 -1.588089q-0.5047302 -0.57092285 -1.3257141 -0.679245q-1.2547607 -0.16555786 -2.19693 0.62423706q-0.92666626 0.79185486 -1.2545471 2.7186432q-0.33312988 1.9576111 0.2737732 2.9517975q0.6095581 0.97875977 1.8178406 1.1381836q0.9758911 0.12875366 1.7261963 -0.3711548q0.75289917 
-0.5153198 1.1486206 -1.723938zm2.6727295 8.027954l-1.1772766 -0.15533447q3.4894104 -4.0313263 4.2395935 -8.439835q0.2911682 -1.7109833 0.19244385 -3.4576569q-0.09185791 -1.4147949 -0.43444824 -2.7523499q-0.21463013 -0.8793793 -1.0047302 -2.937912l1.1773071 0.15531921q1.3441162 2.52565 1.771698 4.946121q0.37420654 2.0824585 0.0017089844 4.2713013q-0.41967773 2.4662933 -1.7735596 4.651718q-1.3357544 2.1720734 -2.9927368 3.718628z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m276.33405 274.71732l159.52756 23.685028" fill-rule="nonzero"></path><path stroke="#bf9000" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m276.33405 274.71732l156.1376 23.181732" fill-rule="evenodd"></path><path fill="#bf9000" stroke="#bf9000" stroke-width="1.0" stroke-linecap="butt" d="m432.47165 297.89905l-1.2775269 0.9472351l3.221405 -0.6586304l-2.8911133 -1.5661621z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m343.18744 255.03127l22.677185 2.9606476l-2.551178 14.992126l-22.677185 -2.9606323z" fill-rule="nonzero"></path><path fill="#bf9000" d="m353.7983 277.53867l1.667572 0.43832397q-0.6546631 1.4272766 -1.8963318 2.1160583q-1.2236023 0.67541504 -2.927887 0.45291138q-2.138092 -0.2791443 -3.170105 -1.7532654q-1.0320129 -1.4741516 -0.62802124 -3.848053q0.41708374 -2.4510193 1.9028931 -3.6437073q1.5013123 -1.1906738 3.5309448 -0.9256897q1.9676819 0.25689697 2.9815674 1.7444153q1.0138855 1.4875183 0.6020508 3.9077148q-0.023590088 0.13873291 -0.08892822 0.42959595l-7.281952 -0.9507141q-0.1798706 1.6153259 0.4970398 2.570343q0.6768799 0.9550476 1.9008484 1.1148376q0.8986206 0.11734009 1.6125488 -0.2621765q0.73205566 -0.39291382 1.29776 -1.3905945zm-4.9844055 -3.3768005l5.453705 0.7120056q0.0987854 -1.2319336 -0.30758667 -1.9153137q-0.62753296 -1.0588989 -1.8979797 -1.224762q-1.1310425 -0.14764404 -2.0523682 0.5199585q-0.90319824 0.6541748 -1.1957703 1.9081116z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m274.42575 311.08038l160.53543 -11.842529" fill-rule="nonzero"></path><path stroke="#bf9000" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m277.84354 310.82825l157.11761 -11.590393" fill-rule="evenodd"></path><path fill="#bf9000" stroke="#bf9000" stroke-width="1.0" stroke-linecap="butt" d="m277.84354 310.82825l1.0387878 -1.2042542l-2.9986572 1.3488464l3.1641235 0.8942261z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m275.38113 346.36475l159.52756 23.685028" fill-rule="nonzero"></path><path stroke="#134f5c" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m275.38116 346.36475l156.1376 23.181732" fill-rule="evenodd"></path><path fill="#134f5c" stroke="#134f5c" stroke-width="1.0" stroke-linecap="butt" d="m431.51877 369.54648l-1.2775269 0.9472351l3.221405 -0.6586304l-2.8910828 -1.5661316z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m343.18732 325.66556l22.677155 2.9606323l-2.551178 14.992126l-22.677185 -2.9606323z" fill-rule="nonzero"></path><path fill="#134f5c" d="m349.54965 350.8171l1.4348755 -8.432098l-1.4718628 -0.19213867l0.22033691 -1.2948914l1.4718933 0.19216919l0.17575073 -1.0328064q0.16525269 -0.9711609 0.4169922 -1.4267883q0.34259033 -0.6170654 1.0123901 -0.923584q0.67245483 -0.3218994 1.757019 -0.18029785q0.6972046 0.091033936 1.5231018 0.3564148l-0.49447632 1.4166565q-0.49554443 -0.15924072 -0.96035767 -0.21990967q-0.7591553 -0.099121094 -1.124115 0.18414307q-0.3623352 0.26785278 -0.51187134 1.1465149l-0.15213013 
0.8940735l1.9057007 0.24880981l-0.22033691 1.2948608l-1.9057007 -0.24880981l-1.4348755 8.432098l-1.642334 -0.2144165z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m273.47284 382.7278l160.53543 -11.842529" fill-rule="nonzero"></path><path stroke="#134f5c" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m276.89062 382.47568l157.11765 -11.590393" fill-rule="evenodd"></path><path fill="#134f5c" stroke="#134f5c" stroke-width="1.0" stroke-linecap="butt" d="m276.89062 382.47568l1.0388184 -1.2042542l-2.9986572 1.3488464l3.1641235 0.8942261z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m275.37988 417.41953l159.52756 23.685059" fill-rule="nonzero"></path><path stroke="#a64d79" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m275.3799 417.41953l156.1376 23.181732" fill-rule="evenodd"></path><path fill="#a64d79" stroke="#a64d79" stroke-width="1.0" stroke-linecap="butt" d="m431.51752 440.6013l-1.2775574 0.9472351l3.2214355 -0.6586304l-2.8911133 -1.5661621z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m318.4788 393.75833l72.09448 9.480316l-2.551178 14.992126l-72.09448 -9.480316z" fill-rule="nonzero"></path><path fill="#741b47" d="m336.46692 420.44815l0.20986938 -1.2331848q-1.1760864 1.3267517 -2.973114 1.0904541q-1.1618652 -0.15280151 -2.0457764 -0.91516113q-0.8658142 -0.77575684 -1.2139282 -1.9877319q-0.32995605 -1.2253723 -0.075531006 -2.720581q0.24655151 -1.4489746 0.928772 -2.5727234q0.6977234 -1.1217346 1.7657471 -1.6274414q1.0835266 -0.5036621 2.29187 -0.34475708q0.8830261 0.116119385 1.501709 0.5756836q0.6341553 0.4616394 0.9656372 1.1198425l0.8183899 -4.8093567l1.6421204 0.21594238l-2.282074 13.410675l-1.5336914 -0.20166016zm-4.4098816 -5.544159q-0.31741333 1.8651733 0.3152771 2.8939514q0.64819336 1.0307922 1.717102 1.1713562q1.0844116 0.14260864 1.993042 -0.63619995q0.90859985 -0.7788086 1.2181091 -2.5977478q0.3383789 -1.9884644 -0.2788086 -3.0151978q-0.6145935 -1.0421448 -1.7454834 -1.1908569q-1.099884 -0.14465332 -1.995636 0.6516113q-0.89572144 0.79626465 -1.2236023 2.7230835zm10.849762 10.425415q-1.0694885 -1.9057007 -1.6236267 -4.326721q-0.5515442 -2.4364624 -0.13183594 -4.9028015q0.37246704 -2.1888733 1.407959 -4.085663q1.230011 -2.2022095 3.3404236 -4.2728577l1.1928406 0.15686035q-1.4380188 1.7493286 -1.9333496 2.5194397q-0.77279663 1.1906738 -1.3315735 2.5197754q-0.6939392 1.6580505 -0.98773193 3.384491q-0.7501831 4.4085693 1.2597351 9.164337l-1.1928406 -0.15686035zm10.895721 -5.8008423l1.6673584 0.43988037q-0.65460205 1.4268494 -1.8961487 2.1145935q-1.2234192 0.67437744 -2.9275208 0.45028687q-2.137848 -0.28112793 -3.1697083 -1.7563782q-1.0318604 -1.4752197 -0.62789917 -3.8490906q0.41705322 -2.4508972 1.90271 -3.642395q1.5011597 -1.1894226 3.530548 -0.9225769q1.9674377 0.25872803 2.9812012 1.747345q1.0137329 1.4886475 0.6019287 3.908722q-0.023620605 0.13873291 -0.08895874 0.42956543l-7.281067 -0.9574585q-0.17984009 1.6153564 0.49694824 2.5711365q0.67678833 0.9557495 1.9006348 1.1166992q0.89849854 0.118133545 1.6123657 -0.2607727q0.7319641 -0.39230347 1.2976074 -1.3895569zm-4.9837646 -3.3817444l5.453064 0.71707153q0.0987854 -1.2320251 -0.30752563 -1.9158325q-0.6274414 -1.0596008 -1.8977661 -1.226654q-1.1308899 -0.14868164 -2.052124 0.51812744q-0.9031067 0.6534729 -1.1956482 1.9072876zm8.479828 7.0406494l0.32000732 -1.8805847l1.8899841 0.24853516l-0.32000732 1.8805847q-0.17575073 1.0327759 -0.65509033 1.6158752q-0.4819641 0.59854126 -1.329773 0.83377075l-0.3440857 
-0.77020264q0.56344604 -0.14654541 0.88739014 -0.5609741q0.3239441 -0.4144287 0.4965515 -1.2427368l-0.9449768 -0.12426758zm5.1080627 0.6717224l1.434845 -8.431793l-1.4717102 -0.19351196l0.22033691 -1.2948303l1.4717102 0.19351196l0.17572021 -1.0327759q0.16525269 -0.97109985 0.4169922 -1.4265442q0.3425293 -0.6168823 1.0122986 -0.9227905q0.6723633 -0.32131958 1.7567749 -0.17871094q0.69714355 0.09164429 1.5229492 0.35784912l-0.4944458 1.4163818q-0.4954834 -0.159729 -0.9602356 -0.2208252q-0.75909424 -0.099823 -1.1239929 0.18313599q-0.3623047 0.2675476 -0.5118103 1.1461792l-0.15216064 0.89404297l1.9054871 0.25057983l-0.22033691 1.2947998l-1.9054871 -0.25054932l-1.4348145 8.431763l-1.6421204 -0.21591187zm5.1492004 4.7115173l-1.1773682 -0.15481567q3.4895935 -4.0325623 4.239807 -8.441132q0.2911377 -1.711029 0.19238281 -3.4575806q-0.09185791 -1.4146729 -0.43447876 -2.7519836q-0.21466064 -0.87924194 -1.0047913 -2.9373474l1.1773682 0.15484619q1.3442383 2.5249329 1.7718201 4.9450684q0.37423706 2.0821838 0.0017700195 4.271057q-0.41970825 2.466339 -1.7736206 4.6522217q-1.335846 2.1725159 -2.9928894 3.7196655z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m273.4716 453.7826l160.53543 -11.842499" fill-rule="nonzero"></path><path stroke="#741b47" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m276.88937 453.53046l157.11765 -11.590363" fill-rule="evenodd"></path><path fill="#741b47" stroke="#741b47" stroke-width="1.0" stroke-linecap="butt" d="m276.88937 453.5305l1.0388184 -1.2042847l-2.9986572 1.3488464l3.1641235 0.8942261z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m274.42447 484.65225l159.52759 23.685028" fill-rule="nonzero"></path><path stroke="#3d85c6" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m274.42447 484.65222l156.13766 23.181732" fill-rule="evenodd"></path><path fill="#3d85c6" stroke="#3d85c6" stroke-width="1.0" stroke-linecap="butt" d="m430.56213 507.83395l-1.2775574 0.9472351l3.2214355 -0.65859985l-2.8911133 -1.5661621z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m272.51617 521.0153l160.53543 -11.842529" fill-rule="nonzero"></path><path stroke="#3c78d8" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m275.93396 520.7632l157.11765 -11.590393" fill-rule="evenodd"></path><path fill="#3c78d8" stroke="#3c78d8" stroke-width="1.0" stroke-linecap="butt" d="m275.934 520.7632l1.0387878 -1.2042847l-2.9986572 1.348877l3.1641235 0.89416504z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m318.4788 462.90244l72.09448 9.480316l-2.551178 14.992126l-72.09448 -9.480316z" fill-rule="nonzero"></path><path fill="#3d85c6" d="m334.14395 488.05753q-1.0632629 0.6639099 -1.970398 0.87557983q-0.9045105 0.19625854 -1.8804626 0.06790161q-1.5956421 -0.20980835 -2.3320312 -1.0946045q-0.73373413 -0.90023804 -0.5265198 -2.117981q0.123291016 -0.7244873 0.5482788 -1.267456q0.4250183 -0.54299927 1.0120544 -0.8282471q0.58703613 -0.28527832 1.284668 -0.3826599q0.5193176 -0.07354736 1.5136719 -0.053100586q2.0558777 0.018188477 3.0559387 -0.1812439q0.05770874 -0.33914185 0.07345581 -0.4316101q0.17050171 -1.0019531 -0.22341919 -1.4792786q-0.54074097 -0.63842773 -1.7955627 -0.8034363q-1.1618652 -0.15280151 -1.7877502 0.1746521q-0.6259155 0.3274536 -1.067627 1.3410034l-1.5872803 -0.4451294q0.40811157 -1.0021973 1.011383 -1.5690308q0.6187744 -0.5647583 1.62146 -0.77960205q1.020813 -0.22824097 2.2911377 -0.06121826q1.2548218 0.16500854 1.9795532 0.5597534q0.72476196 
0.39474487 1.0204773 0.8906555q0.29574585 0.49591064 0.31976318 1.1924744q0.022125244 0.42843628 -0.16412354 1.5228577l-0.37509155 2.2042847q-0.39083862 2.2967834 -0.40283203 2.9255676q0.006164551 0.615448 0.23706055 1.2131348l-1.7350769 -0.22814941q-0.17678833 -0.54330444 -0.12072754 -1.2451172zm0.48486328 -3.6870117q-0.9588318 0.23638916 -2.8159485 0.26010132q-1.046051 0.004272461 -1.4958191 0.13424683q-0.44973755 0.12997437 -0.74243164 0.45394897q-0.2927246 0.3239746 -0.3661499 0.7555847q-0.11279297 0.6628418 0.30685425 1.1750488q0.43777466 0.49884033 1.3982544 0.6251221q0.96047974 0.12631226 1.7749023 -0.19210815q0.8170471 -0.3338318 1.2940063 -0.9960327q0.3604126 -0.53570557 0.54403687 -1.6147461l0.10229492 -0.6011658zm5.703949 9.764496q-1.0694885 -1.9057007 -1.6236572 -4.326721q-0.5515137 -2.4364624 -0.13183594 -4.9028015q0.37249756 -2.1888428 1.4079895 -4.085663q1.230011 -2.202179 3.3404236 -4.272827l1.1928406 0.15686035q-1.4380188 1.7492981 -1.9333496 2.5194397q-0.77279663 1.1906433 -1.3315735 2.5197754q-0.6939392 1.6580505 -0.98773193 3.3844604q-0.7502136 4.4086 1.2597351 9.164337l-1.1928406 -0.15686035zm5.204529 -3.3500366l-1.5336914 -0.20166016l2.282074 -13.410706l1.6421204 0.21594238l-0.81314087 4.778534q1.2763672 -1.1717224 2.9030151 -0.9578247q0.89849854 0.118133545 1.6411133 0.59402466q0.74523926 0.46047974 1.1436768 1.1905212q0.41653442 0.7166748 0.5534973 1.6802673q0.13696289 0.963562 -0.041412354 2.0117493q-0.42492676 2.4971619 -1.8977356 3.7061157q-1.4701538 1.193512 -3.2052002 0.96533203q-1.7350769 -0.22814941 -2.467102 -1.7900391l-0.20721436 1.2177429zm0.82388306 -4.9346924q-0.29641724 1.7418518 0.034576416 2.5891113q0.5749817 1.3678894 1.9072571 1.5430603q1.0844116 0.14260864 2.0318604 -0.67837524q0.95007324 -0.83639526 1.2674866 -2.7015686q0.32263184 -1.8959961 -0.28170776 -2.9052734q-0.6043091 -1.0092773 -1.6732483 -1.1498413q-1.0844116 -0.14257812 -2.0344849 0.69381714q-0.95010376 0.83639526 -1.2517395 2.6090698zm8.363342 6.1427917l0.32003784 -1.8805542l1.8899841 0.24850464l-0.32003784 1.8805847q-0.17575073 1.0327759 -0.65509033 1.6159058q-0.4819641 0.59851074 -1.3297424 0.83374023l-0.3440857 -0.77020264q0.56344604 -0.1465149 0.8873596 -0.5609436q0.3239441 -0.4144287 0.49658203 -1.2427673l-0.9450073 -0.12426758zm11.041382 1.4519348l0.20983887 -1.2331543q-1.1760559 1.3267212 -2.9730835 1.0904236q-1.1618652 -0.152771 -2.0457764 -0.91516113q-0.8658142 -0.77575684 -1.2139282 -1.9877319q-0.32998657 -1.2253418 -0.075531006 -2.7205505q0.24655151 -1.4489746 0.928772 -2.572754q0.6977234 -1.1217346 1.7657471 -1.6274414q1.0835266 -0.5036621 2.29187 -0.34475708q0.8830261 0.116119385 1.501709 0.5757141q0.6341553 0.4616089 0.9656067 1.119812l0.8184204 -4.8093567l1.6421204 0.21594238l-2.282074 13.410706l-1.5336914 -0.20169067zm-4.409912 -5.5441284q-0.3173828 1.8651428 0.31530762 2.893921q0.64819336 1.0307922 1.717102 1.1713562q1.0844116 0.14260864 1.9930115 -0.63619995q0.9086304 -0.7788086 1.2181396 -2.5977478q0.3383789 -1.9884644 -0.2788086 -3.0151978q-0.6145935 -1.0421448 -1.7454834 -1.1908569q-1.099884 -0.1446228 -1.995636 0.65164185q-0.89572144 0.79626465 -1.2236328 2.7230835zm8.773895 10.152435l-1.1773376 -0.15484619q3.4895935 -4.0325623 4.2397766 -8.441132q0.2911682 -1.711029 0.19241333 -3.45755q-0.09188843 -1.4147034 -0.43447876 -2.7520142q-0.21466064 -0.87924194 -1.0047913 -2.937317l1.1773376 0.15481567q1.3442383 2.5249329 1.7718506 4.9450684q0.37423706 2.0822144 0.001739502 4.271057q-0.41967773 2.466339 -1.7736206 4.6522217q-1.3358154 2.1725159 -2.9928894 3.719696z" 
fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m7.0 5.0l689.98425 0l0 49.007874l-689.98425 0z" fill-rule="nonzero"></path><path fill="#434343" d="m279.24628 31.451248q-0.8125 0.3125 -1.5625 0.46875q-0.75 0.171875 -1.578125 0.171875q-1.296875 0 -2.3125 -0.390625q-1.0 -0.390625 -1.703125 -1.140625q-0.6875 -0.765625 -1.046875 -1.890625q-0.34375 -1.125 -0.34375 -2.625q0 -1.53125 0.390625 -2.71875q0.390625 -1.1875 1.109375 -2.015625q0.71875 -0.828125 1.75 -1.25q1.046875 -0.4375 2.328125 -0.4375q0.421875 0 0.78125 0.03125q0.375 0.015625 0.71875 0.0625q0.359375 0.046875 0.71875 0.140625q0.359375 0.078125 0.75 0.203125l0 2.265625q-0.78125 -0.375 -1.5 -0.53125q-0.71875 -0.15625 -1.296875 -0.15625q-0.859375 0 -1.484375 0.3125q-0.609375 0.3125 -1.0 0.875q-0.390625 0.5625 -0.578125 1.34375q-0.1875 0.765625 -0.1875 1.6875q0 0.984375 0.1875 1.765625q0.1875 0.765625 0.578125 1.3125q0.40625 0.53125 1.03125 0.8125q0.625 0.28125 1.484375 0.28125q0.296875 0 0.65625 -0.046875q0.359375 -0.0625 0.71875 -0.15625q0.375 -0.109375 0.734375 -0.234375q0.359375 -0.140625 0.65625 -0.28125l0 2.140625zm10.726044 -4.3125q0 1.109375 -0.328125 2.03125q-0.3125 0.921875 -0.90625 1.578125q-0.59375 0.65625 -1.453125 1.03125q-0.859375 0.359375 -1.96875 0.359375q-1.046875 0 -1.875 -0.3125q-0.8125 -0.3125 -1.390625 -0.90625q-0.578125 -0.609375 -0.890625 -1.515625q-0.296875 -0.921875 -0.296875 -2.140625q0 -1.125 0.3125 -2.03125q0.328125 -0.921875 0.921875 -1.578125q0.59375 -0.65625 1.453125 -1.0q0.875 -0.359375 1.953125 -0.359375q1.0625 0 1.890625 0.3125q0.828125 0.296875 1.390625 0.921875q0.578125 0.609375 0.875 1.515625q0.3125 0.90625 0.3125 2.09375zm-2.359375 0.046875q0 -1.46875 -0.5625 -2.203125q-0.546875 -0.734375 -1.625 -0.734375q-0.59375 0 -1.015625 0.234375q-0.40625 0.234375 -0.671875 0.640625q-0.265625 0.390625 -0.390625 0.9375q-0.125 0.53125 -0.125 1.140625q0 1.484375 0.59375 2.234375q0.59375 0.734375 1.609375 0.734375q0.578125 0 0.984375 -0.21875q0.421875 -0.234375 0.671875 -0.625q0.265625 -0.40625 0.390625 -0.953125q0.140625 -0.546875 0.140625 -1.1875zm9.741669 4.734375l0 -6.140625q0 -1.546875 -1.140625 -1.546875q-0.578125 0 -1.109375 0.46875q-0.515625 0.453125 -1.109375 1.25l0 5.96875l-2.25 0l0 -9.421875l1.953125 0l0.046875 1.390625q0.296875 -0.359375 0.609375 -0.65625q0.3125 -0.296875 0.671875 -0.5q0.359375 -0.21875 0.765625 -0.328125q0.421875 -0.109375 0.953125 -0.109375q0.71875 0 1.25 0.234375q0.546875 0.234375 0.90625 0.671875q0.359375 0.421875 0.53125 1.03125q0.1875 0.609375 0.1875 1.359375l0 6.328125l-2.265625 0zm9.913544 0l-2.609375 0l-3.734375 -9.421875l2.515625 0l1.953125 5.34375l0.59375 1.71875l0.578125 -1.65625l1.96875 -5.40625l2.4375 0l-3.703125 9.421875zm13.194794 -5.4375q0 0.234375 -0.015625 0.609375q-0.015625 0.359375 -0.046875 0.6875l-6.1875 0q0 0.625 0.1875 1.109375q0.1875 0.46875 0.53125 0.78125q0.359375 0.3125 0.84375 0.484375q0.484375 0.171875 1.078125 0.171875q0.6875 0 1.46875 -0.109375q0.78125 -0.109375 1.625 -0.34375l0 1.796875q-0.359375 0.09375 -0.78125 0.1875q-0.421875 0.078125 -0.875 0.140625q-0.4375 0.078125 -0.90625 0.109375q-0.453125 0.03125 -0.875 0.03125q-1.078125 0 -1.9375 -0.3125q-0.84375 -0.3125 -1.4375 -0.90625q-0.59375 -0.59375 -0.90625 -1.46875q-0.3125 -0.890625 -0.3125 -2.046875q0 -1.15625 0.3125 -2.09375q0.3125 -0.9375 0.890625 -1.609375q0.578125 -0.671875 1.390625 -1.03125q0.828125 -0.375 1.828125 -0.375q1.015625 0 1.78125 0.3125q0.765625 0.296875 1.28125 0.859375q0.53125 0.5625 0.796875 1.328125q0.265625 0.765625 0.265625 1.6875zm-2.296875 -0.328125q0 
-0.546875 -0.15625 -0.953125q-0.140625 -0.421875 -0.390625 -0.6875q-0.25 -0.28125 -0.59375 -0.40625q-0.34375 -0.125 -0.734375 -0.125q-0.84375 0 -1.390625 0.578125q-0.546875 0.5625 -0.65625 1.59375l3.921875 0zm9.960419 5.765625l0 -6.140625q0 -1.546875 -1.140625 -1.546875q-0.578125 0 -1.109375 0.46875q-0.515625 0.453125 -1.109375 1.25l0 5.96875l-2.25 0l0 -9.421875l1.953125 0l0.046875 1.390625q0.296875 -0.359375 0.609375 -0.65625q0.3125 -0.296875 0.671875 -0.5q0.359375 -0.21875 0.765625 -0.328125q0.421875 -0.109375 0.953125 -0.109375q0.71875 0 1.25 0.234375q0.546875 0.234375 0.90625 0.671875q0.359375 0.421875 0.53125 1.03125q0.1875 0.609375 0.1875 1.359375l0 6.328125l-2.265625 0zm12.413544 -0.09375q-0.609375 0.140625 -1.234375 0.21875q-0.625 0.09375 -1.171875 0.09375q-0.9375 0 -1.609375 -0.203125q-0.671875 -0.1875 -1.109375 -0.578125q-0.4375 -0.40625 -0.65625 -1.015625q-0.203125 -0.625 -0.203125 -1.484375l0 -4.59375l-2.53125 0l0 -1.765625l2.53125 0l0 -2.421875l2.328125 -0.59375l0 3.015625l3.65625 0l0 1.765625l-3.65625 0l0 4.421875q0 0.8125 0.359375 1.234375q0.375 0.40625 1.25 0.40625q0.5625 0 1.078125 -0.09375q0.53125 -0.09375 0.96875 -0.21875l0 1.8125zm8.038544 -11.90625q0 0.296875 -0.125 0.578125q-0.109375 0.265625 -0.3125 0.46875q-0.1875 0.1875 -0.46875 0.3125q-0.265625 0.109375 -0.578125 0.109375q-0.3125 0 -0.59375 -0.109375q-0.265625 -0.125 -0.46875 -0.3125q-0.203125 -0.203125 -0.3125 -0.46875q-0.109375 -0.28125 -0.109375 -0.578125q0 -0.3125 0.109375 -0.578125q0.109375 -0.265625 0.3125 -0.46875q0.203125 -0.203125 0.46875 -0.3125q0.28125 -0.125 0.59375 -0.125q0.3125 0 0.578125 0.125q0.28125 0.109375 0.46875 0.3125q0.203125 0.203125 0.3125 0.46875q0.125 0.265625 0.125 0.578125zm-2.515625 4.34375l-2.65625 0l0 -1.765625l4.984375 0l0 7.65625l2.71875 0l0 1.765625l-8.03125 0l0 -1.765625l2.984375 0l0 -5.890625zm15.710419 2.875q0 1.109375 -0.328125 2.03125q-0.3125 0.921875 -0.90625 1.578125q-0.59375 0.65625 -1.453125 1.03125q-0.859375 0.359375 -1.96875 0.359375q-1.046875 0 -1.875 -0.3125q-0.8125 -0.3125 -1.390625 -0.90625q-0.578125 -0.609375 -0.890625 -1.515625q-0.296875 -0.921875 -0.296875 -2.140625q0 -1.125 0.3125 -2.03125q0.328125 -0.921875 0.921875 -1.578125q0.59375 -0.65625 1.453125 -1.0q0.875 -0.359375 1.953125 -0.359375q1.0625 0 1.890625 0.3125q0.828125 0.296875 1.390625 0.921875q0.578125 0.609375 0.875 1.515625q0.3125 0.90625 0.3125 2.09375zm-2.359375 0.046875q0 -1.46875 -0.5625 -2.203125q-0.546875 -0.734375 -1.625 -0.734375q-0.59375 0 -1.015625 0.234375q-0.40625 0.234375 -0.671875 0.640625q-0.265625 0.390625 -0.390625 0.9375q-0.125 0.53125 -0.125 1.140625q0 1.484375 0.59375 2.234375q0.59375 0.734375 1.609375 0.734375q0.578125 0 0.984375 -0.21875q0.421875 -0.234375 0.671875 -0.625q0.265625 -0.40625 0.390625 -0.953125q0.140625 -0.546875 0.140625 -1.1875zm9.741669 4.734375l0 -6.140625q0 -1.546875 -1.140625 -1.546875q-0.578125 0 -1.109375 0.46875q-0.515625 0.453125 -1.109375 1.25l0 5.96875l-2.25 0l0 -9.421875l1.953125 0l0.046875 1.390625q0.296875 -0.359375 0.609375 -0.65625q0.3125 -0.296875 0.671875 -0.5q0.359375 -0.21875 0.765625 -0.328125q0.421875 -0.109375 0.953125 -0.109375q0.71875 0 1.25 0.234375q0.546875 0.234375 0.90625 0.671875q0.359375 0.421875 0.53125 1.03125q0.1875 0.609375 0.1875 1.359375l0 6.328125l-2.265625 0zm10.554169 0l-0.0625 -1.234375q-0.296875 0.3125 -0.625 0.578125q-0.3125 0.265625 -0.703125 0.46875q-0.390625 0.1875 -0.859375 0.296875q-0.453125 0.109375 -1.0 0.109375q-0.71875 0 -1.265625 -0.21875q-0.546875 -0.21875 -0.921875 -0.59375q-0.375 -0.375 -0.5625 
-0.90625q-0.1875 -0.546875 -0.1875 -1.203125q0 -0.671875 0.28125 -1.234375q0.28125 -0.5625 0.859375 -0.96875q0.578125 -0.40625 1.4375 -0.640625q0.875 -0.234375 2.046875 -0.234375l1.234375 0l0 -0.5625q0 -0.359375 -0.109375 -0.65625q-0.09375 -0.296875 -0.328125 -0.5q-0.21875 -0.203125 -0.578125 -0.3125q-0.359375 -0.109375 -0.890625 -0.109375q-0.84375 0 -1.65625 0.1875q-0.8125 0.1875 -1.5625 0.53125l0 -1.8125q0.671875 -0.265625 1.546875 -0.4375q0.890625 -0.171875 1.859375 -0.171875q1.046875 0 1.796875 0.203125q0.75 0.1875 1.234375 0.59375q0.484375 0.390625 0.71875 1.0q0.234375 0.59375 0.234375 1.390625l0 6.4375l-1.9375 0zm-0.328125 -4.171875l-1.375 0q-0.578125 0 -0.984375 0.125q-0.390625 0.109375 -0.640625 0.3125q-0.25 0.1875 -0.375 0.4375q-0.109375 0.25 -0.109375 0.546875q0 0.5625 0.359375 0.875q0.375 0.296875 1.015625 0.296875q0.46875 0 0.984375 -0.34375q0.515625 -0.34375 1.125 -0.984375l0 -1.265625zm7.7104187 -7.171875l-2.65625 0l0 -1.765625l4.984375 0l0 11.34375l2.71875 0l0 1.765625l-8.03125 0l0 -1.765625l2.984375 0l0 -9.578125zm23.498962 11.34375l-1.703125 -3.890625q-0.25 -0.5625 -0.6875 -0.84375q-0.421875 -0.28125 -1.0 -0.28125l-0.4375 0l0 5.015625l-2.28125 0l0 -12.125l3.53125 0q1.0 0 1.8125 0.171875q0.8125 0.171875 1.390625 0.578125q0.578125 0.390625 0.890625 1.03125q0.3125 0.640625 0.3125 1.5625q0 0.671875 -0.203125 1.203125q-0.1875 0.515625 -0.546875 0.890625q-0.34375 0.375 -0.84375 0.609375q-0.484375 0.21875 -1.046875 0.3125q0.4375 0.09375 0.8125 0.515625q0.375 0.40625 0.734375 1.1875l1.9375 4.0625l-2.671875 0zm-0.5625 -8.546875q0 -0.890625 -0.578125 -1.28125q-0.5625 -0.390625 -1.6875 -0.390625l-1.0 0l0 3.421875l0.921875 0q0.53125 0 0.953125 -0.109375q0.4375 -0.109375 0.734375 -0.328125q0.3125 -0.234375 0.484375 -0.5625q0.171875 -0.328125 0.171875 -0.75zm13.179169 0.28125q0 0.953125 -0.328125 1.75q-0.3125 0.796875 -0.9375 1.390625q-0.609375 0.578125 -1.546875 0.90625q-0.921875 0.328125 -2.140625 0.328125l-1.171875 0l0 3.890625l-2.296875 0l0 -12.125l3.5625 0q1.171875 0 2.078125 0.265625q0.90625 0.25 1.515625 0.75q0.625 0.484375 0.9375 1.203125q0.328125 0.71875 0.328125 1.640625zm-2.390625 0.15625q0 -0.484375 -0.15625 -0.875q-0.15625 -0.390625 -0.46875 -0.671875q-0.3125 -0.28125 -0.78125 -0.421875q-0.46875 -0.15625 -1.125 -0.15625l-1.203125 0l0 4.4375l1.28125 0q0.59375 0 1.046875 -0.15625q0.453125 -0.15625 0.765625 -0.453125q0.3125 -0.3125 0.46875 -0.734375q0.171875 -0.4375 0.171875 -0.96875zm12.288544 7.640625q-0.8125 0.3125 -1.5625 0.46875q-0.75 0.171875 -1.578125 0.171875q-1.296875 0 -2.3125 -0.390625q-1.0 -0.390625 -1.703125 -1.140625q-0.6875 -0.765625 -1.046875 -1.890625q-0.34375 -1.125 -0.34375 -2.625q0 -1.53125 0.390625 -2.71875q0.390625 -1.1875 1.109375 -2.015625q0.71875 -0.828125 1.75 -1.25q1.046875 -0.4375 2.328125 -0.4375q0.421875 0 0.78125 0.03125q0.375 0.015625 0.71875 0.0625q0.359375 0.046875 0.71875 0.140625q0.359375 0.078125 0.75 0.203125l0 2.265625q-0.78125 -0.375 -1.5 -0.53125q-0.71875 -0.15625 -1.296875 -0.15625q-0.859375 0 -1.484375 0.3125q-0.609375 0.3125 -1.0 0.875q-0.390625 0.5625 -0.578125 1.34375q-0.1875 0.765625 -0.1875 1.6875q0 0.984375 0.1875 1.765625q0.1875 0.765625 0.578125 1.3125q0.40625 0.53125 1.03125 0.8125q0.625 0.28125 1.484375 0.28125q0.296875 0 0.65625 -0.046875q0.359375 -0.0625 0.71875 -0.15625q0.375 -0.109375 0.734375 -0.234375q0.359375 -0.140625 0.65625 -0.28125l0 2.140625z" fill-rule="nonzero"></path></g></svg>
+
diff --git a/chapter/2/images/p-2.png b/chapter/2/images/p-2.png
new file mode 100644
index 0000000..ccc5d09
--- /dev/null
+++ b/chapter/2/images/p-2.png
Binary files differ
diff --git a/chapter/2/images/p-2.svg b/chapter/2/images/p-2.svg
new file mode 100644
index 0000000..f5c6b05
--- /dev/null
+++ b/chapter/2/images/p-2.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" standalone="yes"?>
+
+<svg version="1.1" viewBox="0.0 0.0 720.0 540.0" fill="none" stroke="none" stroke-linecap="square" stroke-miterlimit="10" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><clipPath id="p.0"><path d="m0 0l720.0 0l0 540.0l-720.0 0l0 -540.0z" clip-rule="nonzero"></path></clipPath><g clip-path="url(#p.0)"><path fill="#000000" fill-opacity="0.0" d="m0 0l720.0 0l0 540.0l-720.0 0z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m40.0 88.13911l613.98425 0l0 44.000008l-613.98425 0z" fill-rule="nonzero"></path><path fill="#000000" d="m151.20563 109.902855q0 0.34375 -0.015625 0.578125q-0.015625 0.234375 -0.03125 0.4375l-6.53125 0q0 1.4375 0.796875 2.203125q0.796875 0.765625 2.296875 0.765625q0.40625 0 0.8125 -0.03125q0.40625 -0.046875 0.78125 -0.09375q0.390625 -0.0625 0.734375 -0.125q0.34375 -0.078125 0.640625 -0.15625l0 1.328125q-0.65625 0.1875 -1.484375 0.296875q-0.828125 0.125 -1.71875 0.125q-1.203125 0 -2.0625 -0.328125q-0.859375 -0.328125 -1.421875 -0.9375q-0.546875 -0.625 -0.8125 -1.515625q-0.265625 -0.890625 -0.265625 -2.03125q0 -0.984375 0.28125 -1.859375q0.296875 -0.875 0.828125 -1.53125q0.546875 -0.671875 1.328125 -1.0625q0.796875 -0.390625 1.796875 -0.390625q0.984375 0 1.734375 0.3125q0.75 0.296875 1.265625 0.859375q0.515625 0.5625 0.78125 1.375q0.265625 0.796875 0.265625 1.78125zm-1.6875 -0.21875q0.03125 -0.625 -0.125 -1.140625q-0.140625 -0.515625 -0.453125 -0.890625q-0.3125 -0.375 -0.78125 -0.578125q-0.453125 -0.203125 -1.078125 -0.203125q-0.515625 0 -0.953125 0.203125q-0.4375 0.203125 -0.765625 0.578125q-0.3125 0.359375 -0.5 0.890625q-0.1875 0.515625 -0.234375 1.140625l4.890625 0zm12.460419 5.375l-2.140625 0l-2.515625 -3.546875l-2.484375 3.546875l-2.078125 0l3.609375 -4.671875l-3.453125 -4.640625l2.078125 0l2.4375 3.578125l2.40625 -3.578125l2.0 0l-3.5 4.671875l3.640625 4.640625zm9.819794 -4.828125q0 1.25 -0.34375 2.1875q-0.34375 0.921875 -0.953125 1.53125q-0.609375 0.609375 -1.453125 0.921875q-0.828125 0.296875 -1.8125 0.296875q-0.4375 0 -0.890625 -0.046875q-0.4375 -0.046875 -0.890625 -0.15625l0 3.890625l-1.609375 0l0 -13.109375l1.4375 0l0.109375 1.5625q0.6875 -0.96875 1.46875 -1.34375q0.796875 -0.390625 1.71875 -0.390625q0.796875 0 1.390625 0.34375q0.609375 0.328125 1.015625 0.9375q0.40625 0.609375 0.609375 1.46875q0.203125 0.859375 0.203125 1.90625zm-1.640625 0.078125q0 -0.734375 -0.109375 -1.34375q-0.109375 -0.609375 -0.34375 -1.046875q-0.234375 -0.4375 -0.59375 -0.6875q-0.359375 -0.25 -0.859375 -0.25q-0.3125 0 -0.625 0.109375q-0.3125 0.09375 -0.65625 0.328125q-0.328125 0.21875 -0.703125 0.59375q-0.375 0.375 -0.8125 0.9375l0 4.515625q0.453125 0.1875 0.9375 0.296875q0.5 0.09375 0.96875 0.09375q1.3125 0 2.046875 -0.875q0.75 -0.890625 0.75 -2.671875zm21.936462 -1.25l-7.984375 0l0 -1.359375l7.984375 0l0 1.359375zm0 3.234375l-7.984375 0l0 -1.359375l7.984375 0l0 1.359375z" fill-rule="nonzero"></path><path fill="#0000ff" d="m210.85876 115.059105l-0.03125 -1.25q-0.765625 0.75 -1.546875 1.09375q-0.78125 0.328125 -1.65625 0.328125q-0.796875 0 -1.359375 -0.203125q-0.5625 -0.203125 -0.9375 -0.5625q-0.359375 -0.359375 -0.53125 -0.84375q-0.171875 -0.484375 -0.171875 -1.046875q0 -1.40625 1.046875 -2.1875q1.046875 -0.796875 3.078125 -0.796875l1.9375 0l0 -0.828125q0 -0.8125 -0.53125 -1.3125q-0.53125 -0.5 -1.609375 -0.5q-0.796875 0 -1.5625 0.1875q-0.765625 0.171875 -1.578125 0.484375l0 -1.453125q0.296875 -0.109375 0.671875 -0.21875q0.390625 -0.109375 0.796875 -0.1875q0.421875 -0.078125 0.875 -0.125q0.453125 -0.0625 0.921875 -0.0625q0.84375 0 
1.515625 0.1875q0.6875 0.1875 1.15625 0.578125q0.46875 0.375 0.71875 0.953125q0.25 0.5625 0.25 1.34375l0 6.421875l-1.453125 0zm-0.171875 -4.234375l-2.0625 0q-0.59375 0 -1.03125 0.125q-0.4375 0.109375 -0.71875 0.34375q-0.28125 0.21875 -0.421875 0.53125q-0.125 0.296875 -0.125 0.6875q0 0.28125 0.078125 0.53125q0.09375 0.234375 0.28125 0.421875q0.1875 0.1875 0.484375 0.3125q0.296875 0.109375 0.71875 0.109375q0.5625 0 1.28125 -0.34375q0.71875 -0.34375 1.515625 -1.078125l0 -1.640625z" fill-rule="nonzero"></path><path fill="#980000" d="m230.9671 118.94973q-4.28125 -3.953125 -4.28125 -8.75q0 -1.125 0.21875 -2.234375q0.234375 -1.125 0.734375 -2.25q0.515625 -1.125 1.34375 -2.234375q0.828125 -1.125 2.015625 -2.234375l0.9375 0.953125q-3.59375 3.546875 -3.59375 7.875q0 2.15625 0.90625 4.140625q0.90625 1.984375 2.6875 3.75l-0.96875 0.984375z" fill-rule="nonzero"></path><path fill="#0000ff" d="m253.85669 110.23098q0 1.15625 -0.328125 2.078125q-0.3125 0.90625 -0.90625 1.546875q-0.578125 0.640625 -1.421875 0.984375q-0.84375 0.328125 -1.90625 0.328125q-0.828125 0 -1.6875 -0.15625q-0.859375 -0.15625 -1.703125 -0.5l0 -12.5625l1.609375 0l0 3.609375l-0.0625 1.71875q0.6875 -0.9375 1.484375 -1.3125q0.796875 -0.390625 1.703125 -0.390625q0.796875 0 1.390625 0.34375q0.609375 0.328125 1.015625 0.9375q0.40625 0.609375 0.609375 1.46875q0.203125 0.859375 0.203125 1.90625zm-1.640625 0.078125q0 -0.734375 -0.109375 -1.34375q-0.109375 -0.609375 -0.34375 -1.046875q-0.234375 -0.4375 -0.59375 -0.6875q-0.359375 -0.25 -0.859375 -0.25q-0.3125 0 -0.625 0.109375q-0.3125 0.09375 -0.65625 0.328125q-0.328125 0.21875 -0.703125 0.59375q-0.375 0.375 -0.8125 0.9375l0 4.515625q0.484375 0.1875 0.96875 0.296875q0.5 0.09375 0.9375 0.09375q0.5625 0 1.0625 -0.171875q0.5 -0.171875 0.890625 -0.578125q0.390625 -0.421875 0.609375 -1.09375q0.234375 -0.6875 0.234375 -1.703125z" fill-rule="nonzero"></path><path fill="#980000" d="m271.99628 118.94973q-4.28125 -3.953125 -4.28125 -8.75q0 -1.125 0.21875 -2.234375q0.234375 -1.125 0.734375 -2.25q0.515625 -1.125 1.34375 -2.234375q0.828125 -1.125 2.015625 -2.234375l0.9375 0.953125q-3.59375 3.546875 -3.59375 7.875q0 2.15625 0.90625 4.140625q0.90625 1.984375 2.6875 3.75l-0.96875 0.984375z" fill-rule="nonzero"></path><path fill="#0000ff" d="m294.1671 114.715355q-0.625 0.234375 -1.296875 0.34375q-0.65625 0.125 -1.359375 0.125q-2.21875 0 -3.40625 -1.1875q-1.1875 -1.203125 -1.1875 -3.5q0 -1.109375 0.34375 -2.0q0.34375 -0.90625 0.953125 -1.546875q0.625 -0.640625 1.484375 -0.984375q0.875 -0.34375 1.90625 -0.34375q0.734375 0 1.359375 0.109375q0.625 0.09375 1.203125 0.3125l0 1.546875q-0.59375 -0.3125 -1.234375 -0.453125q-0.625 -0.15625 -1.28125 -0.15625q-0.625 0 -1.1875 0.25q-0.546875 0.234375 -0.96875 0.6875q-0.40625 0.4375 -0.65625 1.078125q-0.234375 0.640625 -0.234375 1.4375q0 1.6875 0.8125 2.53125q0.828125 0.84375 2.28125 0.84375q0.65625 0 1.265625 -0.140625q0.625 -0.15625 1.203125 -0.453125l0 1.5z" fill-rule="nonzero"></path><path fill="#980000" d="m313.02545 118.94973q-4.28125 -3.953125 -4.28125 -8.75q0 -1.125 0.21875 -2.234375q0.234375 -1.125 0.734375 -2.25q0.515625 -1.125 1.34375 -2.234375q0.828125 -1.125 2.015625 -2.234375l0.9375 0.953125q-3.59375 3.546875 -3.59375 7.875q0 2.15625 0.90625 4.140625q0.90625 1.984375 2.6875 3.75l-0.96875 0.984375zm6.5854187 -17.703125q4.265625 3.953125 4.265625 8.8125q0 1.0 -0.203125 2.078125q-0.203125 1.078125 -0.703125 2.203125q-0.484375 1.125 -1.3125 2.28125q-0.828125 1.171875 -2.09375 2.328125l-0.9375 -0.953125q1.8125 -1.78125 2.703125 -3.734375q0.890625 -1.953125 0.890625 
-4.078125q0 -4.421875 -3.59375 -7.953125l0.984375 -0.984375z" fill-rule="nonzero"></path><path fill="#980000" d="m340.12546 101.246605q4.265625 3.953125 4.265625 8.8125q0 1.0 -0.203125 2.078125q-0.203125 1.078125 -0.703125 2.203125q-0.484375 1.125 -1.3125 2.28125q-0.828125 1.171875 -2.09375 2.328125l-0.9375 -0.953125q1.8125 -1.78125 2.703125 -3.734375q0.890625 -1.953125 0.890625 -4.078125q0 -4.421875 -3.59375 -7.953125l0.984375 -0.984375z" fill-rule="nonzero"></path><path fill="#000000" d="m349.19525 116.965355q0.484375 0.015625 0.921875 -0.09375q0.453125 -0.09375 0.78125 -0.296875q0.34375 -0.203125 0.546875 -0.5q0.203125 -0.296875 0.203125 -0.671875q0 -0.390625 -0.140625 -0.625q-0.125 -0.25 -0.296875 -0.453125q-0.171875 -0.203125 -0.3125 -0.4375q-0.125 -0.234375 -0.125 -0.625q0 -0.1875 0.078125 -0.390625q0.078125 -0.21875 0.21875 -0.390625q0.15625 -0.1875 0.390625 -0.296875q0.25 -0.109375 0.5625 -0.109375q0.328125 0 0.625 0.140625q0.3125 0.125 0.53125 0.40625q0.234375 0.28125 0.359375 0.703125q0.140625 0.40625 0.140625 0.96875q0 0.78125 -0.28125 1.484375q-0.28125 0.703125 -0.84375 1.25q-0.5625 0.5625 -1.40625 0.875q-0.828125 0.328125 -1.953125 0.328125l0 -1.265625z" fill-rule="nonzero"></path><path fill="#0000ff" d="m368.52234 110.590355q0 -1.1875 0.3125 -2.109375q0.328125 -0.921875 0.921875 -1.546875q0.609375 -0.640625 1.4375 -0.96875q0.84375 -0.328125 1.875 -0.328125q0.453125 0 0.875 0.0625q0.4375 0.046875 0.859375 0.171875l0 -3.921875l1.625 0l0 13.109375l-1.453125 0l-0.0625 -1.765625q-0.671875 0.984375 -1.46875 1.46875q-0.78125 0.46875 -1.703125 0.46875q-0.796875 0 -1.40625 -0.328125q-0.59375 -0.34375 -1.0 -0.953125q-0.40625 -0.609375 -0.609375 -1.453125q-0.203125 -0.859375 -0.203125 -1.90625zm1.640625 -0.09375q0 1.6875 0.5 2.515625q0.5 0.828125 1.40625 0.828125q0.609375 0 1.296875 -0.546875q0.6875 -0.546875 1.4375 -1.625l0 -4.3125q-0.40625 -0.1875 -0.890625 -0.28125q-0.484375 -0.109375 -0.953125 -0.109375q-1.3125 0 -2.0625 0.859375q-0.734375 0.84375 -0.734375 2.671875z" fill-rule="nonzero"></path><path fill="#980000" d="m395.0838 118.94973q-4.28125 -3.953125 -4.28125 -8.75q0 -1.125 0.21875 -2.234375q0.234375 -1.125 0.734375 -2.25q0.515625 -1.125 1.34375 -2.234375q0.828125 -1.125 2.015625 -2.234375l0.9375 0.953125q-3.59375 3.546875 -3.59375 7.875q0 2.15625 0.90625 4.140625q0.90625 1.984375 2.6875 3.75l-0.96875 0.984375z" fill-rule="nonzero"></path><path fill="#0000ff" d="m417.89526 109.902855q0 0.34375 -0.015625 0.578125q-0.015625 0.234375 -0.03125 0.4375l-6.53125 0q0 1.4375 0.796875 2.203125q0.796875 0.765625 2.296875 0.765625q0.40625 0 0.8125 -0.03125q0.40625 -0.046875 0.78125 -0.09375q0.390625 -0.0625 0.734375 -0.125q0.34375 -0.078125 0.640625 -0.15625l0 1.328125q-0.65625 0.1875 -1.484375 0.296875q-0.828125 0.125 -1.71875 0.125q-1.203125 0 -2.0625 -0.328125q-0.859375 -0.328125 -1.421875 -0.9375q-0.546875 -0.625 -0.8125 -1.515625q-0.265625 -0.890625 -0.265625 -2.03125q0 -0.984375 0.28125 -1.859375q0.296875 -0.875 0.828125 -1.53125q0.546875 -0.671875 1.328125 -1.0625q0.796875 -0.390625 1.796875 -0.390625q0.984375 0 1.734375 0.3125q0.75 0.296875 1.265625 0.859375q0.515625 0.5625 0.78125 1.375q0.265625 0.796875 0.265625 1.78125zm-1.6875 -0.21875q0.03125 -0.625 -0.125 -1.140625q-0.140625 -0.515625 -0.453125 -0.890625q-0.3125 -0.375 -0.78125 -0.578125q-0.453125 -0.203125 -1.078125 -0.203125q-0.515625 0 -0.953125 0.203125q-0.4375 0.203125 -0.765625 0.578125q-0.3125 0.359375 -0.5 0.890625q-0.1875 0.515625 -0.234375 1.140625l4.890625 0z" fill-rule="nonzero"></path><path fill="#980000" 
d="m436.11298 118.94973q-4.28125 -3.953125 -4.28125 -8.75q0 -1.125 0.21875 -2.234375q0.234375 -1.125 0.734375 -2.25q0.515625 -1.125 1.34375 -2.234375q0.828125 -1.125 2.015625 -2.234375l0.9375 0.953125q-3.59375 3.546875 -3.59375 7.875q0 2.15625 0.90625 4.140625q0.90625 1.984375 2.6875 3.75l-0.96875 0.984375zm6.5854187 -17.703125q4.265625 3.953125 4.265625 8.8125q0 1.0 -0.203125 2.078125q-0.203125 1.078125 -0.703125 2.203125q-0.484375 1.125 -1.3125 2.28125q-0.828125 1.171875 -2.09375 2.328125l-0.9375 -0.953125q1.8125 -1.78125 2.703125 -3.734375q0.890625 -1.953125 0.890625 -4.078125q0 -4.421875 -3.59375 -7.953125l0.984375 -0.984375z" fill-rule="nonzero"></path><path fill="#000000" d="m451.7682 116.965355q0.484375 0.015625 0.921875 -0.09375q0.453125 -0.09375 0.78125 -0.296875q0.34375 -0.203125 0.546875 -0.5q0.203125 -0.296875 0.203125 -0.671875q0 -0.390625 -0.140625 -0.625q-0.125 -0.25 -0.296875 -0.453125q-0.171875 -0.203125 -0.3125 -0.4375q-0.125 -0.234375 -0.125 -0.625q0 -0.1875 0.078125 -0.390625q0.078125 -0.21875 0.21875 -0.390625q0.15625 -0.1875 0.390625 -0.296875q0.25 -0.109375 0.5625 -0.109375q0.328125 0 0.625 0.140625q0.3125 0.125 0.53125 0.40625q0.234375 0.28125 0.359375 0.703125q0.140625 0.40625 0.140625 0.96875q0 0.78125 -0.28125 1.484375q-0.28125 0.703125 -0.84375 1.25q-0.5625 0.5625 -1.40625 0.875q-0.828125 0.328125 -1.953125 0.328125l0 -1.265625z" fill-rule="nonzero"></path><path fill="#0000ff" d="m479.82965 103.44973q-1.265625 -0.265625 -2.1875 -0.265625q-2.1875 0 -2.1875 2.28125l0 1.640625l4.09375 0l0 1.34375l-4.09375 0l0 6.609375l-1.640625 0l0 -6.609375l-2.984375 0l0 -1.34375l2.984375 0l0 -1.546875q0 -3.71875 3.875 -3.71875q0.96875 0 2.140625 0.21875l0 1.390625zm-9.75 2.296875l0 0z" fill-rule="nonzero"></path><path fill="#980000" d="m497.65674 118.94973q-4.28125 -3.953125 -4.28125 -8.75q0 -1.125 0.21875 -2.234375q0.234375 -1.125 0.734375 -2.25q0.515625 -1.125 1.34375 -2.234375q0.828125 -1.125 2.015625 -2.234375l0.9375 0.953125q-3.59375 3.546875 -3.59375 7.875q0 2.15625 0.90625 4.140625q0.90625 1.984375 2.6875 3.75l-0.96875 0.984375zm6.5854187 -17.703125q4.265625 3.953125 4.265625 8.8125q0 1.0 -0.203125 2.078125q-0.203125 1.078125 -0.703125 2.203125q-0.484375 1.125 -1.3125 2.28125q-0.828125 1.171875 -2.09375 2.328125l-0.9375 -0.953125q1.8125 -1.78125 2.703125 -3.734375q0.890625 -1.953125 0.890625 -4.078125q0 -4.421875 -3.59375 -7.953125l0.984375 -0.984375z" fill-rule="nonzero"></path><path fill="#980000" d="m524.7567 101.246605q4.265625 3.953125 4.265625 8.8125q0 1.0 -0.203125 2.078125q-0.203125 1.078125 -0.703125 2.203125q-0.484375 1.125 -1.3125 2.28125q-0.828125 1.171875 -2.09375 2.328125l-0.9375 -0.953125q1.8125 -1.78125 2.703125 -3.734375q0.890625 -1.953125 0.890625 -4.078125q0 -4.421875 -3.59375 -7.953125l0.984375 -0.984375zm20.514648 0q4.265625 3.953125 4.265625 8.8125q0 1.0 -0.203125 2.078125q-0.203125 1.078125 -0.703125 2.203125q-0.484375 1.125 -1.3125 2.28125q-0.828125 1.171875 -2.09375 2.328125l-0.9375 -0.953125q1.8125 -1.78125 2.703125 -3.734375q0.890625 -1.953125 0.890625 -4.078125q0 -4.421875 -3.59375 -7.953125l0.984375 -0.984375z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m349.9367 132.14035l22.677155 2.9606323l-2.551178 14.992126l-22.677155 -2.9606323z" fill-rule="nonzero"></path><path fill="#e06666" d="m360.83054 154.24348l1.5900879 0.4282074q-0.5466919 1.6304474 -1.8093567 2.442566q-1.2600403 0.79670715 -2.8713684 0.5863342q-1.9986572 -0.2609253 -3.0023193 -1.7156067q-0.98550415 -1.4680481 -0.5710449 -3.9036407q0.2675476 -1.5723419 
0.97821045 -2.6771545q0.72875977 -1.1181793 1.8974915 -1.5643921q1.1868286 -0.45959473 2.441803 -0.29574585q1.5958557 0.2083435 2.4665222 1.1414032q0.8706665 0.93307495 0.913208 2.451355l-1.6532898 0.03627014q-0.06451416 -1.0169067 -0.56933594 -1.5870514q-0.50479126 -0.57014465 -1.3259583 -0.6773529q-1.2549744 -0.16384888 -2.1972961 0.6270752q-0.92681885 0.79293823 -1.2546997 2.719818q-0.3331604 1.9577332 0.27389526 2.9509277q0.60964966 0.97776794 1.8181458 1.1355438q0.97610474 0.1274414 1.7265015 -0.37338257q0.75302124 -0.51623535 1.1488037 -1.725174z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m7.0 5.0l689.98425 0l0 49.007874l-689.98425 0z" fill-rule="nonzero"></path><path fill="#434343" d="m284.5312 31.919998l-2.6875 0l-1.203125 -3.734375l-0.421875 -1.53125l-0.421875 1.5625l-1.1875 3.703125l-2.5625 0l-0.671875 -12.125l2.0 0l0.296875 7.78125l0.078125 2.109375l0.546875 -1.890625l1.328125 -4.3125l1.4375 0l1.40625 4.578125l0.46875 1.609375l0.03125 -1.890625l0.296875 -7.984375l1.953125 0l-0.6875 12.125zm7.6322937 -12.0q0 0.296875 -0.125 0.578125q-0.109375 0.265625 -0.3125 0.46875q-0.1875 0.1875 -0.46875 0.3125q-0.265625 0.109375 -0.578125 0.109375q-0.3125 0 -0.59375 -0.109375q-0.265625 -0.125 -0.46875 -0.3125q-0.203125 -0.203125 -0.3125 -0.46875q-0.109375 -0.28125 -0.109375 -0.578125q0 -0.3125 0.109375 -0.578125q0.109375 -0.265625 0.3125 -0.46875q0.203125 -0.203125 0.46875 -0.3125q0.28125 -0.125 0.59375 -0.125q0.3125 0 0.578125 0.125q0.28125 0.109375 0.46875 0.3125q0.203125 0.203125 0.3125 0.46875q0.125 0.265625 0.125 0.578125zm-2.515625 4.34375l-2.65625 0l0 -1.765625l4.984375 0l0 7.65625l2.71875 0l0 1.765625l-8.03125 0l0 -1.765625l2.984375 0l0 -5.890625zm14.991669 7.5625q-0.609375 0.140625 -1.234375 0.21875q-0.625 0.09375 -1.171875 0.09375q-0.9375 0 -1.609375 -0.203125q-0.671875 -0.1875 -1.109375 -0.578125q-0.4375 -0.40625 -0.65625 -1.015625q-0.203125 -0.625 -0.203125 -1.484375l0 -4.59375l-2.53125 0l0 -1.765625l2.53125 0l0 -2.421875l2.328125 -0.59375l0 3.015625l3.65625 0l0 1.765625l-3.65625 0l0 4.421875q0 0.8125 0.359375 1.234375q0.375 0.40625 1.25 0.40625q0.5625 0 1.078125 -0.09375q0.53125 -0.09375 0.96875 -0.21875l0 1.8125zm8.101044 0.09375l0 -6.140625q0 -1.546875 -1.140625 -1.546875q-0.578125 0 -1.109375 0.46875q-0.515625 0.453125 -1.109375 1.25l0 5.96875l-2.25 0l0 -13.109375l2.25 0l0 3.234375l-0.109375 1.703125q0.296875 -0.34375 0.59375 -0.609375q0.296875 -0.28125 0.640625 -0.46875q0.34375 -0.1875 0.734375 -0.28125q0.40625 -0.09375 0.890625 -0.09375q0.71875 0 1.25 0.234375q0.546875 0.234375 0.90625 0.671875q0.359375 0.421875 0.53125 1.03125q0.1875 0.609375 0.1875 1.359375l0 6.328125l-2.265625 0zm23.280212 -8.265625q0 0.953125 -0.328125 1.75q-0.3125 0.796875 -0.9375 1.390625q-0.609375 0.578125 -1.546875 0.90625q-0.921875 0.328125 -2.140625 0.328125l-1.171875 0l0 3.890625l-2.296875 0l0 -12.125l3.5625 0q1.171875 0 2.078125 0.265625q0.90625 0.25 1.515625 0.75q0.625 0.484375 0.9375 1.203125q0.328125 0.71875 0.328125 1.640625zm-2.390625 0.15625q0 -0.484375 -0.15625 -0.875q-0.15625 -0.390625 -0.46875 -0.671875q-0.3125 -0.28125 -0.78125 -0.421875q-0.46875 -0.15625 -1.125 -0.15625l-1.203125 0l0 4.4375l1.28125 0q0.59375 0 1.046875 -0.15625q0.453125 -0.15625 0.765625 -0.453125q0.3125 -0.3125 0.46875 -0.734375q0.171875 -0.4375 0.171875 -0.96875zm9.819794 -3.890625q0 0.296875 -0.125 0.578125q-0.109375 0.265625 -0.3125 0.46875q-0.1875 0.1875 -0.46875 0.3125q-0.265625 0.109375 -0.578125 0.109375q-0.3125 0 -0.59375 -0.109375q-0.265625 -0.125 -0.46875 -0.3125q-0.203125 
-0.203125 -0.3125 -0.46875q-0.109375 -0.28125 -0.109375 -0.578125q0 -0.3125 0.109375 -0.578125q0.109375 -0.265625 0.3125 -0.46875q0.203125 -0.203125 0.46875 -0.3125q0.28125 -0.125 0.59375 -0.125q0.3125 0 0.578125 0.125q0.28125 0.109375 0.46875 0.3125q0.203125 0.203125 0.3125 0.46875q0.125 0.265625 0.125 0.578125zm-2.515625 4.34375l-2.65625 0l0 -1.765625l4.984375 0l0 7.65625l2.71875 0l0 1.765625l-8.03125 0l0 -1.765625l2.984375 0l0 -5.890625zm15.694794 2.78125q0 1.296875 -0.375 2.25q-0.359375 0.9375 -1.015625 1.5625q-0.640625 0.625 -1.53125 0.9375q-0.890625 0.296875 -1.9375 0.296875q-0.359375 0 -0.71875 -0.046875q-0.34375 -0.046875 -0.640625 -0.125l0 3.6875l-2.25 0l0 -13.109375l1.953125 0l0.046875 1.390625q0.296875 -0.359375 0.609375 -0.65625q0.3125 -0.296875 0.671875 -0.5q0.359375 -0.21875 0.765625 -0.328125q0.421875 -0.109375 0.953125 -0.109375q0.828125 0 1.46875 0.328125q0.65625 0.328125 1.09375 0.953125q0.453125 0.609375 0.671875 1.5q0.234375 0.875 0.234375 1.96875zm-2.375 0.09375q0 -0.78125 -0.109375 -1.328125q-0.109375 -0.546875 -0.328125 -0.890625q-0.203125 -0.359375 -0.5 -0.515625q-0.296875 -0.171875 -0.6875 -0.171875q-0.578125 0 -1.109375 0.46875q-0.515625 0.453125 -1.109375 1.25l0 4.125q0.28125 0.09375 0.671875 0.171875q0.390625 0.0625 0.796875 0.0625q0.546875 0 0.984375 -0.21875q0.4375 -0.234375 0.75 -0.640625q0.3125 -0.40625 0.46875 -0.984375q0.171875 -0.59375 0.171875 -1.328125zm12.366669 -0.65625q0 0.234375 -0.015625 0.609375q-0.015625 0.359375 -0.046875 0.6875l-6.1875 0q0 0.625 0.1875 1.109375q0.1875 0.46875 0.53125 0.78125q0.359375 0.3125 0.84375 0.484375q0.484375 0.171875 1.078125 0.171875q0.6875 0 1.46875 -0.109375q0.78125 -0.109375 1.625 -0.34375l0 1.796875q-0.359375 0.09375 -0.78125 0.1875q-0.421875 0.078125 -0.875 0.140625q-0.4375 0.078125 -0.90625 0.109375q-0.453125 0.03125 -0.875 0.03125q-1.078125 0 -1.9375 -0.3125q-0.84375 -0.3125 -1.4375 -0.90625q-0.59375 -0.59375 -0.90625 -1.46875q-0.3125 -0.890625 -0.3125 -2.046875q0 -1.15625 0.3125 -2.09375q0.3125 -0.9375 0.890625 -1.609375q0.578125 -0.671875 1.390625 -1.03125q0.828125 -0.375 1.828125 -0.375q1.015625 0 1.78125 0.3125q0.765625 0.296875 1.28125 0.859375q0.53125 0.5625 0.796875 1.328125q0.265625 0.765625 0.265625 1.6875zm-2.296875 -0.328125q0 -0.546875 -0.15625 -0.953125q-0.140625 -0.421875 -0.390625 -0.6875q-0.25 -0.28125 -0.59375 -0.40625q-0.34375 -0.125 -0.734375 -0.125q-0.84375 0 -1.390625 0.578125q-0.546875 0.5625 -0.65625 1.59375l3.921875 0zm7.3822937 -5.578125l-2.65625 0l0 -1.765625l4.984375 0l0 11.34375l2.71875 0l0 1.765625l-8.03125 0l0 -1.765625l2.984375 0l0 -9.578125zm12.772919 -0.65625q0 0.296875 -0.125 0.578125q-0.109375 0.265625 -0.3125 0.46875q-0.1875 0.1875 -0.46875 0.3125q-0.265625 0.109375 -0.578125 0.109375q-0.3125 0 -0.59375 -0.109375q-0.265625 -0.125 -0.46875 -0.3125q-0.203125 -0.203125 -0.3125 -0.46875q-0.109375 -0.28125 -0.109375 -0.578125q0 -0.3125 0.109375 -0.578125q0.109375 -0.265625 0.3125 -0.46875q0.203125 -0.203125 0.46875 -0.3125q0.28125 -0.125 0.59375 -0.125q0.3125 0 0.578125 0.125q0.28125 0.109375 0.46875 0.3125q0.203125 0.203125 0.3125 0.46875q0.125 0.265625 0.125 0.578125zm-2.515625 4.34375l-2.65625 0l0 -1.765625l4.984375 0l0 7.65625l2.71875 0l0 1.765625l-8.03125 0l0 -1.765625l2.984375 0l0 -5.890625zm12.835419 7.65625l0 -6.140625q0 -1.546875 -1.140625 -1.546875q-0.578125 0 -1.109375 0.46875q-0.515625 0.453125 -1.109375 1.25l0 5.96875l-2.25 0l0 -9.421875l1.953125 0l0.046875 1.390625q0.296875 -0.359375 0.609375 -0.65625q0.3125 -0.296875 0.671875 -0.5q0.359375 -0.21875 0.765625 
-0.328125q0.421875 -0.109375 0.953125 -0.109375q0.71875 0 1.25 0.234375q0.546875 0.234375 0.90625 0.671875q0.359375 0.421875 0.53125 1.03125q0.1875 0.609375 0.1875 1.359375l0 6.328125l-2.265625 0zm10.194794 -12.0q0 0.296875 -0.125 0.578125q-0.109375 0.265625 -0.3125 0.46875q-0.1875 0.1875 -0.46875 0.3125q-0.265625 0.109375 -0.578125 0.109375q-0.3125 0 -0.59375 -0.109375q-0.265625 -0.125 -0.46875 -0.3125q-0.203125 -0.203125 -0.3125 -0.46875q-0.109375 -0.28125 -0.109375 -0.578125q0 -0.3125 0.109375 -0.578125q0.109375 -0.265625 0.3125 -0.46875q0.203125 -0.203125 0.46875 -0.3125q0.28125 -0.125 0.59375 -0.125q0.3125 0 0.578125 0.125q0.28125 0.109375 0.46875 0.3125q0.203125 0.203125 0.3125 0.46875q0.125 0.265625 0.125 0.578125zm-2.515625 4.34375l-2.65625 0l0 -1.765625l4.984375 0l0 7.65625l2.71875 0l0 1.765625l-8.03125 0l0 -1.765625l2.984375 0l0 -5.890625zm12.835419 7.65625l0 -6.140625q0 -1.546875 -1.140625 -1.546875q-0.578125 0 -1.109375 0.46875q-0.515625 0.453125 -1.109375 1.25l0 5.96875l-2.25 0l0 -9.421875l1.953125 0l0.046875 1.390625q0.296875 -0.359375 0.609375 -0.65625q0.3125 -0.296875 0.671875 -0.5q0.359375 -0.21875 0.765625 -0.328125q0.421875 -0.109375 0.953125 -0.109375q0.71875 0 1.25 0.234375q0.546875 0.234375 0.90625 0.671875q0.359375 0.421875 0.53125 1.03125q0.1875 0.609375 0.1875 1.359375l0 6.328125l-2.265625 0zm11.710419 -7.78125q0.265625 0.328125 0.359375 0.6875q0.109375 0.359375 0.109375 0.734375q0 0.78125 -0.28125 1.390625q-0.265625 0.59375 -0.765625 1.015625q-0.484375 0.40625 -1.1875 0.609375q-0.6875 0.203125 -1.515625 0.203125q-0.5 0 -0.921875 -0.09375q-0.40625 -0.09375 -0.625 -0.21875q-0.15625 0.15625 -0.265625 0.34375q-0.109375 0.1875 -0.109375 0.421875q0 0.15625 0.0625 0.3125q0.078125 0.140625 0.21875 0.265625q0.15625 0.109375 0.34375 0.1875q0.203125 0.0625 0.453125 0.078125l2.234375 0.078125q0.765625 0.015625 1.359375 0.1875q0.609375 0.171875 1.046875 0.5q0.4375 0.3125 0.671875 0.765625q0.234375 0.4375 0.234375 1.015625q0 0.65625 -0.296875 1.234375q-0.296875 0.59375 -0.890625 1.015625q-0.59375 0.4375 -1.484375 0.6875q-0.890625 0.25 -2.078125 0.25q-1.140625 0 -1.96875 -0.1875q-0.8125 -0.171875 -1.34375 -0.5q-0.515625 -0.3125 -0.765625 -0.765625q-0.25 -0.453125 -0.25 -0.984375q0 -0.328125 0.078125 -0.609375q0.09375 -0.28125 0.25 -0.53125q0.171875 -0.25 0.421875 -0.484375q0.25 -0.25 0.59375 -0.5q-0.453125 -0.25 -0.6875 -0.65625q-0.234375 -0.421875 -0.234375 -0.875q0 -0.328125 0.078125 -0.59375q0.09375 -0.28125 0.21875 -0.53125q0.140625 -0.25 0.3125 -0.46875q0.171875 -0.234375 0.375 -0.46875q-0.34375 -0.34375 -0.578125 -0.828125q-0.21875 -0.484375 -0.21875 -1.21875q0 -0.78125 0.28125 -1.390625q0.28125 -0.625 0.78125 -1.046875q0.5 -0.421875 1.1875 -0.640625q0.703125 -0.21875 1.515625 -0.21875q0.421875 0 0.796875 0.046875q0.390625 0.03125 0.703125 0.140625l3.265625 0l0 1.640625l-1.484375 0zm-5.328125 9.03125q0 0.546875 0.546875 0.796875q0.546875 0.265625 1.546875 0.265625q0.640625 0 1.078125 -0.125q0.453125 -0.109375 0.71875 -0.3125q0.28125 -0.1875 0.40625 -0.453125q0.125 -0.25 0.125 -0.53125q0 -0.25 -0.125 -0.421875q-0.109375 -0.171875 -0.328125 -0.296875q-0.203125 -0.109375 -0.484375 -0.171875q-0.28125 -0.0625 -0.625 -0.078125l-2.0 -0.03125q-0.265625 0.1875 -0.4375 0.34375q-0.171875 0.171875 -0.265625 0.328125q-0.09375 0.171875 -0.125 0.328125q-0.03125 0.171875 -0.03125 0.359375zm0.375 -7.5625q0 0.75 0.4375 1.203125q0.4375 0.4375 1.234375 0.4375q0.421875 0 0.71875 -0.140625q0.3125 -0.140625 0.515625 -0.375q0.203125 -0.234375 0.296875 -0.53125q0.109375 -0.3125 0.109375 
-0.640625q0 -0.796875 -0.4375 -1.234375q-0.4375 -0.4375 -1.21875 -0.4375q-0.421875 0 -0.734375 0.140625q-0.3125 0.140625 -0.515625 0.375q-0.203125 0.234375 -0.3125 0.546875q-0.09375 0.3125 -0.09375 0.65625z" fill-rule="nonzero"></path><path fill="#d0e0e3" d="m439.77292 141.8203l64.97638 0l0 384.75592l-64.97638 0z" fill-rule="nonzero"></path><path stroke="#000000" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m439.77292 141.8203l64.97638 0l0 384.75592l-64.97638 0z" fill-rule="nonzero"></path><path fill="#434343" d="m476.38245 282.837q0 0.859375 -0.359375 1.515625q-0.34375 0.640625 -0.984375 1.078125q-0.625 0.421875 -1.515625 0.640625q-0.875 0.21875 -1.953125 0.21875q-0.46875 0 -0.953125 -0.046875q-0.484375 -0.03125 -0.921875 -0.09375q-0.4375 -0.046875 -0.828125 -0.125q-0.390625 -0.078125 -0.703125 -0.15625l0 -1.59375q0.6875 0.25 1.546875 0.40625q0.875 0.140625 1.984375 0.140625q0.796875 0 1.359375 -0.125q0.5625 -0.125 0.921875 -0.359375q0.359375 -0.25 0.515625 -0.59375q0.171875 -0.359375 0.171875 -0.8125q0 -0.5 -0.28125 -0.84375q-0.265625 -0.34375 -0.71875 -0.609375q-0.4375 -0.28125 -1.015625 -0.5q-0.578125 -0.234375 -1.171875 -0.46875q-0.59375 -0.25 -1.171875 -0.53125q-0.5625 -0.28125 -1.015625 -0.671875q-0.4375 -0.390625 -0.71875 -0.90625q-0.265625 -0.515625 -0.265625 -1.234375q0 -0.625 0.265625 -1.21875q0.265625 -0.609375 0.8125 -1.078125q0.546875 -0.46875 1.40625 -0.75q0.859375 -0.296875 2.046875 -0.296875q0.296875 0 0.65625 0.03125q0.359375 0.03125 0.71875 0.078125q0.375 0.046875 0.734375 0.125q0.359375 0.0625 0.65625 0.125l0 1.484375q-0.71875 -0.203125 -1.4375 -0.296875q-0.703125 -0.109375 -1.375 -0.109375q-1.421875 0 -2.09375 0.46875q-0.65625 0.46875 -0.65625 1.265625q0 0.5 0.265625 0.859375q0.28125 0.34375 0.71875 0.625q0.453125 0.28125 1.015625 0.515625q0.578125 0.21875 1.171875 0.46875q0.59375 0.234375 1.15625 0.515625q0.578125 0.28125 1.015625 0.6875q0.453125 0.390625 0.71875 0.921875q0.28125 0.515625 0.28125 1.25z" fill-rule="nonzero"></path><path fill="#434343" d="m475.89807 308.11826l-6.90625 0l0 -12.125l6.90625 0l0 1.390625l-5.25 0l0 3.75l5.03125 0l0 1.40625l-5.03125 0l0 4.171875l5.25 0l0 1.40625z" fill-rule="nonzero"></path><path fill="#434343" d="m476.88245 330.11826l-1.859375 0l-1.8125 -3.875q-0.203125 -0.453125 -0.421875 -0.734375q-0.203125 -0.296875 -0.453125 -0.46875q-0.25 -0.171875 -0.546875 -0.25q-0.28125 -0.078125 -0.640625 -0.078125l-0.78125 0l0 5.40625l-1.65625 0l0 -12.125l3.25 0q1.046875 0 1.8125 0.234375q0.765625 0.234375 1.25 0.65625q0.484375 0.40625 0.703125 1.0q0.234375 0.578125 0.234375 1.296875q0 0.5625 -0.171875 1.078125q-0.15625 0.5 -0.484375 0.921875q-0.328125 0.40625 -0.828125 0.71875q-0.484375 0.296875 -1.109375 0.4375q0.515625 0.171875 0.859375 0.625q0.359375 0.4375 0.734375 1.171875l1.921875 3.984375zm-2.640625 -8.796875q0 -0.96875 -0.609375 -1.453125q-0.609375 -0.484375 -1.71875 -0.484375l-1.546875 0l0 4.015625l1.328125 0q0.59375 0 1.0625 -0.140625q0.46875 -0.140625 0.796875 -0.40625q0.328125 -0.265625 0.5 -0.640625q0.1875 -0.390625 0.1875 -0.890625z" fill-rule="nonzero"></path><path fill="#434343" d="m477.5387 339.99326l-4.109375 12.125l-2.234375 0l-4.03125 -12.125l1.875 0l2.609375 8.171875l0.75 2.390625l0.75 -2.390625l2.625 -8.171875l1.765625 0z" fill-rule="nonzero"></path><path fill="#434343" d="m475.89807 374.11826l-6.90625 0l0 -12.125l6.90625 0l0 1.390625l-5.25 0l0 3.75l5.03125 0l0 1.40625l-5.03125 0l0 4.171875l5.25 0l0 1.40625z" fill-rule="nonzero"></path><path fill="#434343" d="m476.88245 396.11826l-1.859375 0l-1.8125 
-3.875q-0.203125 -0.453125 -0.421875 -0.734375q-0.203125 -0.296875 -0.453125 -0.46875q-0.25 -0.171875 -0.546875 -0.25q-0.28125 -0.078125 -0.640625 -0.078125l-0.78125 0l0 5.40625l-1.65625 0l0 -12.125l3.25 0q1.046875 0 1.8125 0.234375q0.765625 0.234375 1.25 0.65625q0.484375 0.40625 0.703125 1.0q0.234375 0.578125 0.234375 1.296875q0 0.5625 -0.171875 1.078125q-0.15625 0.5 -0.484375 0.921875q-0.328125 0.40625 -0.828125 0.71875q-0.484375 0.296875 -1.109375 0.4375q0.515625 0.171875 0.859375 0.625q0.359375 0.4375 0.734375 1.171875l1.921875 3.984375zm-2.640625 -8.796875q0 -0.96875 -0.609375 -1.453125q-0.609375 -0.484375 -1.71875 -0.484375l-1.546875 0l0 4.015625l1.328125 0q0.59375 0 1.0625 -0.140625q0.46875 -0.140625 0.796875 -0.40625q0.328125 -0.265625 0.5 -0.640625q0.1875 -0.390625 0.1875 -0.890625z" fill-rule="nonzero"></path><path fill="#d0e0e3" d="m215.25197 141.8203l64.976364 0l0 384.75592l-64.976364 0z" fill-rule="nonzero"></path><path stroke="#000000" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m215.25197 141.8203l64.976364 0l0 384.75592l-64.976364 0z" fill-rule="nonzero"></path><path fill="#434343" d="m251.84589 285.66513q-1.453125 0.609375 -3.0625 0.609375q-2.5625 0 -3.9375 -1.53125q-1.375 -1.546875 -1.375 -4.546875q0 -1.46875 0.375 -2.640625q0.375 -1.171875 1.078125 -2.0q0.71875 -0.828125 1.71875 -1.265625q1.0 -0.453125 2.234375 -0.453125q0.84375 0 1.5625 0.15625q0.734375 0.140625 1.40625 0.4375l0 1.625q-0.65625 -0.359375 -1.375 -0.546875q-0.703125 -0.203125 -1.53125 -0.203125q-0.859375 0 -1.546875 0.328125q-0.6875 0.3125 -1.171875 0.921875q-0.484375 0.609375 -0.75 1.484375q-0.25 0.875 -0.25 2.0q0 2.359375 0.953125 3.5625q0.953125 1.1875 2.796875 1.1875q0.78125 0 1.5 -0.171875q0.71875 -0.1875 1.375 -0.515625l0 1.5625z" fill-rule="nonzero"></path><path fill="#434343" d="m251.75214 308.11826l-6.984375 0l0 -12.125l1.6875 0l0 10.71875l5.296875 0l0 1.40625z" fill-rule="nonzero"></path><path fill="#434343" d="m247.00214 319.38388l-2.796875 0l0 -1.390625l7.25 0l0 1.390625l-2.78125 0l0 9.328125l2.78125 0l0 1.40625l-7.25 0l0 -1.40625l2.796875 0l0 -9.328125z" fill-rule="nonzero"></path><path fill="#434343" d="m251.37714 352.11826l-6.90625 0l0 -12.125l6.90625 0l0 1.390625l-5.25 0l0 3.75l5.03125 0l0 1.40625l-5.03125 0l0 4.171875l5.25 0l0 1.40625z" fill-rule="nonzero"></path><path fill="#434343" d="m251.97089 374.11826l-2.15625 0l-3.53125 -7.5625l-1.03125 -2.421875l0 6.109375l0 3.875l-1.53125 0l0 -12.125l2.125 0l3.359375 7.15625l1.21875 2.78125l0 -6.5l0 -3.4375l1.546875 0l0 12.125z" fill-rule="nonzero"></path><path fill="#434343" d="m252.26776 385.3995l-3.59375 0l0 10.71875l-1.671875 0l0 -10.71875l-3.59375 0l0 -1.40625l8.859375 0l0 1.40625z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m282.13055 151.96812l159.52756 23.685043" fill-rule="nonzero"></path><path stroke="#e06666" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m282.13055 151.96812l156.1376 23.181747" fill-rule="evenodd"></path><path fill="#e06666" stroke="#e06666" stroke-width="1.0" stroke-linecap="butt" d="m438.26816 175.14987l-1.2775269 0.9472351l3.221405 -0.6586304l-2.8911133 -1.5661469z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m281.175 287.1842l160.53543 -11.842499" fill-rule="nonzero"></path><path stroke="#e06666" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m284.59277 286.9321l157.11765 -11.590393" fill-rule="evenodd"></path><path fill="#e06666" stroke="#e06666" stroke-width="1.0" stroke-linecap="butt" 
d="m284.5928 286.9321l1.0387878 -1.2042847l-2.9986572 1.3488464l3.1641235 0.8942261z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m282.13196 195.7803l159.52756 23.685043" fill-rule="nonzero"></path><path stroke="#bf9000" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m282.13196 195.7803l156.1376 23.181747" fill-rule="evenodd"></path><path fill="#bf9000" stroke="#bf9000" stroke-width="1.0" stroke-linecap="butt" d="m438.26956 218.96205l-1.2775574 0.94721985l3.2214355 -0.6586151l-2.8911133 -1.5661621z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m348.98535 176.09427l22.677155 2.960617l-2.551178 14.992126l-22.677155 -2.9606323z" fill-rule="nonzero"></path><path fill="#bf9000" d="m359.59622 198.60167l1.667572 0.43832397q-0.6546631 1.4272461 -1.8963623 2.1160583q-1.2235718 0.6753998 -2.9278564 0.45289612q-2.138092 -0.2791443 -3.170105 -1.7532654q-1.0320129 -1.4741364 -0.62805176 -3.8480682q0.41708374 -2.451004 1.9028931 -3.643692q1.5013123 -1.1906586 3.5309753 -0.92567444q1.9676819 0.2568817 2.9815674 1.7444153q1.0138855 1.4875183 0.6020508 3.9076996q-0.023620605 0.13873291 -0.08895874 0.42959595l-7.281952 -0.95069885q-0.17984009 1.6153107 0.4970398 2.570343q0.6768799 0.9550476 1.9008789 1.1148376q0.8986206 0.11732483 1.6125488 -0.26219177q0.73205566 -0.39291382 1.29776 -1.3905792zm-4.9844055 -3.3768005l5.453705 0.7120056q0.0987854 -1.2319489 -0.30758667 -1.9153137q-0.62753296 -1.0588989 -1.8980103 -1.224762q-1.131012 -0.1476593 -2.0523376 0.519928q-0.90322876 0.6542053 -1.1957703 1.9081421z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m281.1764 308.67355l160.53543 -11.842529" fill-rule="nonzero"></path><path stroke="#bf9000" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m284.59418 308.42142l157.11765 -11.590393" fill-rule="evenodd"></path><path fill="#bf9000" stroke="#bf9000" stroke-width="1.0" stroke-linecap="butt" d="m284.59418 308.42142l1.0388184 -1.2042542l-2.9986572 1.3488464l3.1641235 0.8942261z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m281.17642 226.63377l159.52756 23.685028" fill-rule="nonzero"></path><path stroke="#134f5c" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m281.17642 226.63377l156.13763 23.181747" fill-rule="evenodd"></path><path fill="#134f5c" stroke="#134f5c" stroke-width="1.0" stroke-linecap="butt" d="m437.31406 249.8155l-1.2775574 0.9472351l3.2214355 -0.6586151l-2.8911133 -1.5661621z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m348.98257 205.9346l22.677185 2.960617l-2.5512085 14.992126l-22.677155 -2.960617z" fill-rule="nonzero"></path><path fill="#134f5c" d="m355.34494 231.08614l1.4348755 -8.432083l-1.4718933 -0.19216919l0.22033691 -1.2948608l1.4718933 0.19215393l0.17575073 -1.0328064q0.16525269 -0.9711609 0.4170227 -1.4267731q0.3425598 -0.61709595 1.0123901 -0.923584q0.67245483 -0.3218994 1.7569885 -0.18031311q0.6972046 0.09101868 1.5231323 0.35643005l-0.49447632 1.4166565q-0.49554443 -0.15924072 -0.96035767 -0.21992493q-0.7591858 -0.099121094 -1.1241455 0.18414307q-0.3623352 0.26785278 -0.5118408 1.1465149l-0.15216064 0.8940735l1.9057007 0.24880981l-0.22033691 1.2948608l-1.9057007 -0.24879456l-1.4348755 8.432068l-1.6423035 -0.21440125z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m280.22348 326.75668l160.53543 -11.842529" fill-rule="nonzero"></path><path stroke="#134f5c" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" 
d="m283.6413 326.50452l157.11765 -11.590363" fill-rule="evenodd"></path><path fill="#134f5c" stroke="#134f5c" stroke-width="1.0" stroke-linecap="butt" d="m283.6413 326.50455l1.0387878 -1.2042847l-2.9986572 1.3488464l3.1641235 0.8942261z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m282.13055 351.51535l159.52756 23.685028" fill-rule="nonzero"></path><path stroke="#a64d79" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m282.13055 351.51535l156.1376 23.181702" fill-rule="evenodd"></path><path fill="#a64d79" stroke="#a64d79" stroke-width="1.0" stroke-linecap="butt" d="m438.26816 374.69708l-1.2775269 0.9472351l3.221405 -0.6586304l-2.8911133 -1.5661621z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m325.22943 326.89877l72.09448 9.480286l-2.551178 14.992126l-72.09448 -9.480316z" fill-rule="nonzero"></path><path fill="#741b47" d="m343.2176 353.58856l0.20983887 -1.2331543q-1.1760559 1.3267212 -2.9730835 1.0904236q-1.1618958 -0.152771 -2.045807 -0.91516113q-0.8657837 -0.77575684 -1.2138977 -1.9877319q-0.32998657 -1.2253418 -0.07556152 -2.7205505q0.24658203 -1.4489746 0.9288025 -2.572754q0.6976929 -1.1217346 1.7657471 -1.6274414q1.0834961 -0.5036621 2.2918396 -0.34475708q0.8830261 0.116119385 1.501709 0.5757141q0.6341858 0.4616089 0.9656372 1.119812l0.8183899 -4.8093567l1.6421204 0.21594238l-2.282074 13.410706l-1.5336609 -0.20169067zm-4.409912 -5.5441284q-0.3173828 1.8651428 0.31530762 2.893921q0.64816284 1.0307922 1.717102 1.1713562q1.0844116 0.14260864 1.9930115 -0.63619995q0.90859985 -0.7788086 1.2181396 -2.5977478q0.3383789 -1.9884644 -0.2788391 -3.0151978q-0.614563 -1.0421448 -1.7454529 -1.1908569q-1.0999146 -0.1446228 -1.995636 0.6516113q-0.89575195 0.79626465 -1.2236328 2.723114zm10.849762 10.4253845q-1.069458 -1.9057007 -1.6236267 -4.326721q-0.5515137 -2.4364624 -0.13183594 -4.9028015q0.37246704 -2.1888733 1.4079895 -4.085663q1.230011 -2.202179 3.340393 -4.272827l1.1928711 0.15682983q-1.4380493 1.7493286 -1.9333801 2.5194397q-0.7727661 1.1906738 -1.331543 2.519806q-0.6939697 1.6580505 -0.98773193 3.3844604q-0.7502136 4.4085693 1.2597351 9.164337l-1.1928711 -0.15686035zm10.895752 -5.8008423l1.6673584 0.4399109q-0.65460205 1.4268188 -1.8961487 2.114563q-1.2234497 0.67437744 -2.9275208 0.45028687q-2.137848 -0.2810974 -3.1697083 -1.7563477q-1.0318604 -1.4752502 -0.6279297 -3.8490906q0.41708374 -2.4509277 1.9027405 -3.642395q1.5011292 -1.1894531 3.530548 -0.9225769q1.9674377 0.2586975 2.9811707 1.747345q1.0137634 1.488617 0.6019287 3.9086914q-0.023590088 0.13873291 -0.08892822 0.42956543l-7.281067 -0.957428q-0.17984009 1.6153259 0.49691772 2.571106q0.67678833 0.95578 1.9006348 1.1166992q0.89852905 0.11816406 1.6123657 -0.2607727q0.7319641 -0.39227295 1.2976379 -1.3895569zm-4.983795 -3.3817444l5.453064 0.71707153q0.0987854 -1.2320251 -0.30752563 -1.9158325q-0.6274414 -1.0596008 -1.8977356 -1.2266235q-1.1308899 -0.14871216 -2.052124 0.51812744q-0.9031067 0.6534424 -1.1956787 1.9072571zm8.479828 7.0406494l0.32003784 -1.8805542l1.8899536 0.24850464l-0.32000732 1.8805847q-0.17575073 1.0327759 -0.65509033 1.6159058q-0.4819641 0.59851074 -1.3297424 0.83374023l-0.3440857 -0.77020264q0.56344604 -0.1465149 0.8873596 -0.5609436q0.3239441 -0.4144287 0.49658203 -1.2427673l-0.9450073 -0.12426758zm5.1080933 0.6717224l1.4348145 -8.431793l-1.4717102 -0.19351196l0.22033691 -1.2948303l1.4717102 0.19354248l0.17575073 -1.0328064q0.16525269 -0.97109985 0.41696167 -1.4265442q0.3425598 -0.6168518 1.0122986 -0.9227905q0.6723938 -0.32131958 1.7568054 
-0.17871094q0.69711304 0.091674805 1.5229187 0.35784912l-0.49441528 1.4163818q-0.49551392 -0.159729 -0.9602356 -0.2208252q-0.75909424 -0.099823 -1.1240234 0.18313599q-0.3623047 0.2675476 -0.5118103 1.1461792l-0.15213013 0.89404297l1.9054565 0.25057983l-0.22033691 1.2948303l-1.9054565 -0.25057983l-1.4348145 8.431793l-1.6421204 -0.21594238zm5.1492004 4.711548l-1.1773682 -0.15481567q3.4895935 -4.032593 4.2397766 -8.441162q0.2911682 -1.711029 0.19241333 -3.45755q-0.09185791 -1.4146729 -0.43447876 -2.7520142q-0.21466064 -0.87924194 -1.0047913 -2.937317l1.1773682 0.15481567q1.3442078 2.5249329 1.7718201 4.9450684q0.37423706 2.0822144 0.001739502 4.2710876q-0.41967773 2.4663086 -1.7736206 4.652191q-1.3358154 2.1725159 -2.992859 3.719696z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m280.22223 430.7826l160.53543 -11.842499" fill-rule="nonzero"></path><path stroke="#741b47" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m283.64005 430.53046l157.11761 -11.590363" fill-rule="evenodd"></path><path fill="#741b47" stroke="#741b47" stroke-width="1.0" stroke-linecap="butt" d="m283.64005 430.5305l1.0387878 -1.2042847l-2.9986572 1.3488464l3.1641235 0.8942261z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m281.17514 482.24542l159.52756 23.685059" fill-rule="nonzero"></path><path stroke="#3d85c6" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m281.17514 482.2454l156.13763 23.181732" fill-rule="evenodd"></path><path fill="#3d85c6" stroke="#3d85c6" stroke-width="1.0" stroke-linecap="butt" d="m437.31277 505.42715l-1.2775269 0.9472351l3.221405 -0.6586304l-2.8911133 -1.5661621z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m279.2668 518.6085l160.53546 -11.84256" fill-rule="nonzero"></path><path stroke="#3c78d8" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m282.68463 518.3563l157.11761 -11.590363" fill-rule="evenodd"></path><path fill="#3c78d8" stroke="#3c78d8" stroke-width="1.0" stroke-linecap="butt" d="m282.68463 518.3563l1.0388184 -1.2042236l-2.9986877 1.3488159l3.164154 0.8942261z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m325.22943 460.4956l72.09448 9.480316l-2.551178 14.992126l-72.09448 -9.480316z" fill-rule="nonzero"></path><path fill="#3d85c6" d="m340.89462 485.6507q-1.0632935 0.6639404 -1.970398 0.87557983q-0.9045105 0.19625854 -1.8804932 0.06793213q-1.5956421 -0.20983887 -2.3320007 -1.094635q-0.73376465 -0.90023804 -0.5265503 -2.117981q0.123291016 -0.7244873 0.5483093 -1.267456q0.4249878 -0.54296875 1.0120239 -0.8282471q0.58706665 -0.28527832 1.284668 -0.3826599q0.51934814 -0.07354736 1.5136719 -0.053100586q2.0558777 0.018188477 3.0559692 -0.1812439q0.05770874 -0.33914185 0.07345581 -0.4316101q0.17047119 -1.0019531 -0.2234497 -1.479248q-0.54071045 -0.63845825 -1.7955322 -0.8034668q-1.1618958 -0.152771 -1.7877808 0.1746521q-0.625885 0.3274536 -1.067627 1.3410034l-1.5872803 -0.44509888q0.4081421 -1.0022278 1.0114136 -1.5690308q0.6187744 -0.5647888 1.62146 -0.77963257q1.020813 -0.22824097 2.2911377 -0.061187744q1.2548218 0.16500854 1.9795532 0.5597229q0.72473145 0.39474487 1.0204773 0.8906555q0.29571533 0.49591064 0.31973267 1.1924744q0.022125244 0.42843628 -0.16409302 1.5228577l-0.37512207 2.2042847q-0.39083862 2.2967834 -0.4028015 2.9255981q0.006134033 0.6154175 0.23703003 1.2131348l-1.7350464 -0.22817993q-0.17681885 -0.54330444 -0.12072754 -1.2451172zm0.48486328 -3.6869812q-0.9588623 0.23635864 -2.815979 0.2600708q-1.046051 0.004272461 
-1.4957886 0.13424683q-0.44976807 0.12997437 -0.74246216 0.45394897q-0.2926941 0.3239746 -0.3661499 0.7555847q-0.11279297 0.6628418 0.30688477 1.1750488q0.43777466 0.49884033 1.3982544 0.6251526q0.96047974 0.12628174 1.7749023 -0.19213867q0.8170471 -0.33380127 1.2940063 -0.9960327q0.3604126 -0.53570557 0.54403687 -1.6147156l0.10229492 -0.6011658zm5.7039185 9.764496q-1.0694885 -1.9057007 -1.6236267 -4.3267517q-0.5515442 -2.4364624 -0.13183594 -4.902771q0.37246704 -2.1888733 1.407959 -4.0856934q1.230011 -2.202179 3.3404236 -4.272827l1.1928711 0.15686035q-1.4380493 1.7492981 -1.9333801 2.5194397q-0.77279663 1.1906433 -1.3315735 2.5197754q-0.6939392 1.6580505 -0.98773193 3.384491q-0.7501831 4.4085693 1.2597656 9.164337l-1.1928711 -0.15686035zm5.204529 -3.3500671l-1.5336609 -0.20166016l2.282074 -13.410706l1.6420898 0.21594238l-0.81314087 4.778534q1.2763977 -1.1717224 2.9030151 -0.9578247q0.89852905 0.11816406 1.6411133 0.59402466q0.74523926 0.46047974 1.1436768 1.1905212q0.41656494 0.7166748 0.5534973 1.6802673q0.13696289 0.963562 -0.041412354 2.0117798q-0.42492676 2.4971619 -1.8977051 3.7060852q-1.4701538 1.193512 -3.2052307 0.96533203q-1.7350464 -0.22814941 -2.467102 -1.7900391l-0.20721436 1.2177429zm0.82388306 -4.9346924q-0.29638672 1.7418518 0.034576416 2.5891113q0.5749817 1.3678894 1.9072571 1.5430603q1.0844116 0.14260864 2.0318909 -0.67837524q0.95007324 -0.83639526 1.267456 -2.7015686q0.32263184 -1.8959656 -0.28167725 -2.9052734q-0.6043396 -1.0092773 -1.6732483 -1.1498413q-1.0844421 -0.14257812 -2.0345154 0.69381714q-0.95007324 0.83639526 -1.2517395 2.6090698zm8.363373 6.1428223l0.32000732 -1.8805847l1.8899841 0.24853516l-0.32000732 1.8805542q-0.17575073 1.0328064 -0.65509033 1.6159058q-0.4819641 0.59851074 -1.3297424 0.83374023l-0.3440857 -0.7701721q0.5634155 -0.14654541 0.8873596 -0.5609741q0.3239441 -0.4144287 0.4965515 -1.2427368l-0.9449768 -0.12426758zm11.041382 1.4519043l0.20983887 -1.2331543q-1.1760559 1.3267517 -2.9730835 1.0904236q-1.1618958 -0.152771 -2.045807 -0.91516113q-0.8658142 -0.7757263 -1.2138977 -1.9877319q-0.32998657 -1.2253418 -0.07556152 -2.7205505q0.24658203 -1.4489746 0.9288025 -2.572754q0.6976929 -1.1217346 1.7657166 -1.6274109q1.0835266 -0.5036621 2.29187 -0.3447876q0.8830261 0.116119385 1.501709 0.5757141q0.6341858 0.4616089 0.9656372 1.119812l0.8183899 -4.809326l1.6421204 0.21591187l-2.282074 13.410706l-1.5336609 -0.20169067zm-4.409912 -5.5441284q-0.3173828 1.8651733 0.31530762 2.893921q0.64816284 1.0308228 1.717102 1.1713867q1.0844116 0.14257812 1.9930115 -0.63623047q0.90859985 -0.7788086 1.2181396 -2.5977173q0.3383484 -1.9884949 -0.2788391 -3.0152283q-0.614563 -1.0421448 -1.7454529 -1.1908569q-1.0999146 -0.1446228 -1.995636 0.65164185q-0.89575195 0.79626465 -1.2236328 2.7230835zm8.773895 10.152435l-1.1773682 -0.15481567q3.4895935 -4.032593 4.239807 -8.441162q0.2911377 -1.711029 0.19238281 -3.45755q-0.09185791 -1.4146729 -0.43447876 -2.7520142q-0.21466064 -0.87924194 -1.0047913 -2.937317l1.1773682 0.15481567q1.3442383 2.5249329 1.7718201 4.945099q0.37423706 2.0821838 0.0017700195 4.271057q-0.41970825 2.4663086 -1.7736511 4.6522217q-1.3358154 2.1724854 -2.992859 3.7196655z" fill-rule="nonzero"></path><path fill="#000000" fill-opacity="0.0" d="m280.20993 463.73132l163.37009 -17.35431" fill-rule="nonzero"></path><path stroke="#93c47d" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m283.61786 463.36935l159.96216 -16.99234" fill-rule="evenodd"></path><path fill="#93c47d" stroke="#93c47d" stroke-width="1.0" stroke-linecap="butt" d="m283.61786 
463.36932l0.9994812 -1.2370911l-2.9536743 1.4446716l3.1912537 0.79193115z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m279.435 383.9106l158.6142 19.433075" fill-rule="nonzero"></path><path stroke="#93c47d" stroke-width="1.0" stroke-linejoin="round" stroke-linecap="butt" d="m279.43503 383.91058l155.2125 19.016327" fill-rule="evenodd"></path><path fill="#93c47d" stroke="#93c47d" stroke-width="1.0" stroke-linecap="butt" d="m434.64752 402.9269l-1.2529907 0.97946167l3.2035828 -0.7404785l-2.9300842 -1.4919739z" fill-rule="evenodd"></path><path fill="#000000" fill-opacity="0.0" d="m327.2065 356.47412l55.62204 7.338562l-2.551178 14.992126l-55.62207 -7.338562z" fill-rule="nonzero"></path><path fill="#93c47d" d="m337.8152 382.1998l-1.5335999 -0.20233154l2.2820435 -13.410461l1.6420288 0.21664429l-0.81314087 4.7784424q1.2763062 -1.1712341 2.902832 -0.9566345q0.898468 0.11853027 1.6410522 0.5947571q0.7451782 0.4607849 1.1435852 1.1910706q0.41653442 0.7168884 0.5534668 1.6805725q0.13693237 0.9636841 -0.041412354 2.0118713q-0.42495728 2.4971313 -1.897644 3.7055054q-1.4700928 1.1929321 -3.2050476 0.9640503q-1.7349854 -0.22891235 -2.4669495 -1.7911987l-0.20721436 1.2177124zm0.82385254 -4.9346313q-0.29638672 1.7418213 0.034576416 2.589264q0.57492065 1.3682251 1.907135 1.5439758q1.0843506 0.1430664 2.0317688 -0.67755127q0.9500122 -0.83602905 1.267395 -2.7011719q0.32263184 -1.8959656 -0.28164673 -2.905548q-0.60427856 -1.009613 -1.6731567 -1.1506348q-1.0843506 -0.1430664 -2.0343628 0.69299316q-0.9500427 0.83602905 -1.251709 2.608673zm10.417755 10.452484q-1.0694275 -1.90625 -1.6235352 -4.327667q-0.55148315 -2.4368286 -0.1317749 -4.9031067q0.37246704 -2.1888428 1.4079285 -4.085327q1.22995 -2.2017822 3.3402405 -4.2716675l1.1927795 0.15737915q-1.4379578 1.7488098 -1.933258 2.5187683q-0.7727661 1.1903992 -1.3315125 2.5193481q-0.6939087 1.6578674 -0.9877014 3.3842773q-0.7501831 4.408478 1.259613 9.165375l-1.1927795 -0.15737915zm10.658783 -6.269043l1.5898132 0.43041992q-0.54663086 1.6300049 -1.8091125 2.4405823q-1.2598572 0.795166 -2.8708801 0.5826111q-1.9983215 -0.26367188 -3.0017395 -1.7199402q-0.98532104 -1.469635 -0.57089233 -3.9050903q0.2675476 -1.5722656 0.9780884 -2.6763q0.7286682 -1.1174316 1.8972168 -1.5621338q1.1866455 -0.45809937 2.4413757 -0.2925415q1.5955505 0.21051025 2.4660645 1.1448975q0.8705139 0.9343872 0.9130249 2.453003l-1.6530151 0.034057617q-0.06448364 -1.0171509 -0.56921387 -1.5880737q-0.5046997 -0.57092285 -1.3257141 -0.67926025q-1.2547607 -0.16555786 -2.1968994 0.6242676q-0.9266968 0.7918396 -1.2545776 2.718628q-0.33309937 1.9576111 0.2738037 2.9517822q0.6095276 0.97875977 1.81781 1.1381836q0.97592163 0.12875366 1.7261963 -0.3711548q0.7529297 -0.5153198 1.1486511 -1.723938zm2.6727295 8.027954l-1.1773071 -0.15533447q3.489441 -4.031311 4.239624 -8.439819q0.2911377 -1.7109985 0.19241333 -3.457672q-0.09185791 -1.4147949 -0.43444824 -2.7523499q-0.21463013 -0.879364 -1.0046997 -2.9378967l1.1772766 0.15533447q1.3441467 2.5256348 1.771698 4.946106q0.37420654 2.0824585 0.001739502 4.2713013q-0.41967773 2.466278 -1.7735596 4.6517334q-1.3357849 2.172058 -2.9927368 3.7185974z" fill-rule="nonzero"></path></g></svg>
+
diff --git a/chapter/3/E_account_spreadsheet_vats.png b/chapter/3/E_account_spreadsheet_vats.png
new file mode 100644
index 0000000..8ce9624
--- /dev/null
+++ b/chapter/3/E_account_spreadsheet_vats.png
Binary files differ
diff --git a/chapter/3/E_vat.png b/chapter/3/E_vat.png
new file mode 100644
index 0000000..131b0de
--- /dev/null
+++ b/chapter/3/E_vat.png
Binary files differ
diff --git a/chapter/3/message-passing.md b/chapter/3/message-passing.md
index 5898e23..a35a75e 100644
--- a/chapter/3/message-passing.md
+++ b/chapter/3/message-passing.md
@@ -1,11 +1,463 @@
---
layout: page
-title: "Message Passing"
-by: "Joe Schmoe and Mary Jane"
+title: "Message Passing and the Actor Model"
+by: "Nathaniel Dempkowski"
---
-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. {% cite Uniqueness --file message-passing %}
+# Introduction
-## References
+Message passing programming models have been discussed since the beginning of distributed computing, and as a result "message passing" can mean many things. A broad definition, such as the one on Wikipedia, includes Remote Procedure Calls (RPC) and the Message Passing Interface (MPI). There are also popular process calculi like the pi-calculus and Communicating Sequential Processes (CSP) which have inspired practical message passing systems. For example, Go's channels are based on the idea of first-class communication channels from the pi-calculus, and Clojure's `core.async` library is based on CSP. However, when people talk about message passing today they mostly mean the actor model: a ubiquitous, general message passing programming model that has been developing since the 1970s and is used today to build massive, scalable systems.
-{% bibliography --file message-passing %} \ No newline at end of file
+In the field of message passing programming models, it is important to consider not only recent state-of-the-art research, but also the historic early papers on message passing and the actor model, which are the roots of the programming models described in more recent work. It is enlightening to see which aspects of the models have stuck around, and many of the more recent papers reference and address deficiencies present in older ones. There have been plenty of programming languages designed around message passing, especially ones focused on the actor model for organizing units of computation.
+
+In this chapter I describe the four primary variants of the actor model: classic actors, process-based actors, communicating event-loops, and active objects. I attempt to highlight historic and modern languages that exemplify these models, as well as the philosophies and tradeoffs programmers need to understand to make the best use of them.
+
+Although the actor model originated as far back as the 1970s, it is still being developed and incorporated into the programming languages of today, as many recently published papers and systems in the field demonstrate. Several robust, industrial-strength actor systems are being used to power massive, scalable distributed systems: Akka has been used to serve PayPal's billions of transactions, {% cite PayPalAkka --file message-passing %} Erlang has been used to send messages for WhatsApp's hundreds of millions of users, {% cite ErlangWhatsAppTalk --file message-passing %} and Orleans has been used to serve Halo 4's millions of players. {% cite OrleansHalo4Talk --file message-passing %} These industrial actor frameworks take a few different approaches to monitoring, fault tolerance, and managing actor lifecycles, which are detailed later in the chapter.
+
+An important framing for the actor models presented here is the question "Why message passing, and specifically why the actor model?" Given the vast number of distributed programming models out there, one might ask why this one was so important when it was initially proposed, and why it has facilitated advanced languages, systems, and libraries that are widely used today. As we'll see throughout this chapter, some of the broadest advantages of the actor model are the isolation of each actor's state, scalability, and making it easier for programmers to reason about their systems.
+
+# Original proposal of the actor model
+
+The actor model was originally proposed in _A Universal Modular ACTOR Formalism for Artificial Intelligence_ {% cite Hewitt:1973:UMA:1624775.1624804 --file message-passing %} in 1973 as a method of computation for artificial intelligence research. The original goal was to model parallel computation and communication in a way that could safely be distributed concurrently across workstations. The paper makes few presumptions about implementation details, instead defining the high-level message passing communication model. Gul Agha developed the model further by focusing on actors as a basis for concurrent object-oriented programming. This work is collected in _Actors: A Model of Concurrent Computation in Distributed Systems_. {% cite Agha:1986:AMC:7929 --file message-passing %}
+
+Actors are defined as independent units of computation with isolated state. These units have two core characteristics:
+
+* they can send messages asynchronously to one another, and
+* they have a mailbox containing the messages they have received, allowing messages to arrive at any time and be queued for processing.
+
+Messages are of the form:
+
+```
+(request: <message-to-target>
+ reply-to: <reference-to-messenger>)
+```
+
+Actors process messages from their mailboxes by matching each message's `request` field sequentially against patterns or rules, which can be specific values or logical statements. When a pattern matches, computation occurs and the result of that computation is implicitly returned to the reference in the message's `reply-to` field. This is a type of continuation, where the continuation is a message to another actor. These messages are one-way, and there is no guarantee that a message will ever be received in response. The actor model is so general because it places few restrictions on systems: asynchrony and the absence of message delivery guarantees make it possible to model real distributed systems. If message delivery were guaranteed, the model would be much less general, able to model only systems that include complex message-delivery protocols. This originally proposed variant of the actor model is limited compared to many of the others, but the early ideas of taking advantage of distributed processing power to enable greater parallel computation are already there.
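+
+To make these mechanics a bit more concrete, here is a rough sketch in Python (rather than in any of the actor languages discussed in this chapter) of how a mailbox, sequential request matching, and a `reply-to` continuation could fit together. All of the names (`ToyActor`, `send`, the handler table) are illustrative and not part of any real framework.
+
+```
+import queue
+import threading
+import time
+
+class ToyActor:
+    """A toy actor: a mailbox plus a loop that matches requests against handlers."""
+    def __init__(self, handlers):
+        self.handlers = handlers          # request tag -> behavior
+        self.mailbox = queue.Queue()      # received messages queue here until processed
+        threading.Thread(target=self._run, daemon=True).start()
+
+    def send(self, request, payload=None, reply_to=None):
+        """Asynchronous one-way send: enqueue the message and return immediately."""
+        self.mailbox.put((request, payload, reply_to))
+
+    def _run(self):
+        while True:
+            request, payload, reply_to = self.mailbox.get()
+            handler = self.handlers.get(request)
+            if handler is None:
+                continue                  # no matching rule: the message is dropped
+            result = handler(payload)
+            if reply_to is not None:
+                # the "continuation" is just another one-way message
+                reply_to.send("reply", result)
+
+printer = ToyActor({"reply": print})
+adder = ToyActor({"add-one": lambda n: n + 1})
+adder.send("add-one", 41, reply_to=printer)   # eventually prints 42
+time.sleep(0.1)                               # give the daemon threads time to run
+```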
+
+Interestingly, the original paper introduces the actor model in the context of hardware, presenting actors almost as another machine architecture. It describes the concepts of an "actor machine" and a "hardware actor" as the context for the actor model, which is quite different from the way we think about modern actors as abstracting away the hardware details we don't want to deal with. The concept is reminiscent of something like a Lisp machine, though specially built to use the actor model of computation for artificial intelligence.
+
+# Classic actor model
+
+The classic actor model was formalized as a unit of computation in Agha's _Concurrent Object-Oriented Programming_. {% cite Agha:1990:COP:83880.84528 --file message-passing %} The classic actor expands on the original proposal of actors, keeping the ideas of asynchronous communication through messages between isolated units of computation and state. The classic actor contains the following primitive operations:
+
+* `create`: create an actor from a behavior description and a set of parameters, including other existing actors
+* `send`: send a message to another actor
+* `become`: have an actor replace their behavior with a new one
+
+As in the original actor model, classic actors communicate by asynchronous message passing. They are primitive, independent units of computation which can be used to build higher-level abstractions for concurrent programming. Actors are uniquely addressable, and have their own independent mailboxes or message queues. In the classic actor model, state changes are specified and aggregated using the `become` operation: each time an actor processes a message, it computes the behavior it will use for the next message it processes. A `become` operation's argument is a named continuation, `b`, representing the behavior the actor should adopt, along with some state that should be passed to `b`.
+
+This continuation model is flexible. You could create a purely functional actor where the new behavior would be identical to the original and no state would be passed. An example of this is the `AddOne` actor below, which processes a message according to a single fixed behavior.
+
+```
+(define AddOne
+ [add-one [n]
+ (return (+ n 1))])
+```
+
+The model also enables the creation of stateful actors which change behavior and pass along an object representing some state. This state can be the result of many operations, which enables the aggregation of state changes at a higher level of granularity than something like variable assignment. An example of this is a `BankAccount` actor given in _Concurrent Object-Oriented Programming_. {% cite Agha:1990:COP:83880.84528 --file message-passing %}
+
+```
+(define BankAccount
+ (mutable [balance]
+ [withdraw-from [amount]
+ (become BankAccount (- balance amount))
+ (return 'withdrew amount)]
+ [deposit-to [amount]
+ (become BankAccount (+ balance amount))
+ (return 'deposited amount)]
+ [balance-query
+ (return 'balance-is balance)]))
+```
+
+Stateful continuations enable flexibility in the behavior of an actor over time in response to the actions of other actors in the system. Limiting state and behavior changes to `become` operations changes the level at which one analyzes a system, freeing the programmer from worrying about interference during state changes. In the example above, the programmer only has to worry about changes to the account's balance during `become` statements in response to a sequential queue of well-defined message types.
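+
+To complement the Rosette code, the following is a minimal Python sketch of the `become` idea under simplifying assumptions: the names and message shapes are hypothetical, and no concurrency or mailbox is modeled. Each handler returns the behavior to use for the next message, so the balance only changes between messages, never in the middle of handling one.
+
+```
+def bank_account(balance):
+    """A behavior description: a function from one message to the next behavior."""
+    def behavior(message):
+        kind, amount = message
+        if kind == "withdraw-from":
+            return bank_account(balance - amount)   # 'become' with an updated balance
+        if kind == "deposit-to":
+            return bank_account(balance + amount)
+        if kind == "balance-query":
+            print("balance-is", balance)
+        return behavior                             # queries and unmatched messages keep the behavior
+    return behavior
+
+# A trivial sequential "mailbox": each processed message yields the next behavior.
+current = bank_account(100)
+for msg in [("deposit-to", 50), ("withdraw-from", 30), ("balance-query", None)]:
+    current = current(msg)                          # ends by printing: balance-is 120
+```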
+
+If you squint a little, this actor definition sounds similar to Alan Kay’s original definition of object-oriented programming. That definition describes a system where objects have behavior and their own memory, and communicate by sending and receiving messages that may contain other objects or simply trigger actions. Kay's ideas sound closer to what we consider the actor model today, and less like what we consider object-oriented programming. That is, Kay's focus in this description is on designing the messaging and communications that dictate how objects interact.
+
+<blockquote cite="http://lists.squeakfoundation.org/pipermail/squeak-dev/1998-October/017019.html">
+<p>The big idea is "messaging" -- that is what the kernal [sic] of Smalltalk/Squeak is all about (and it's something that was never quite completed in our Xerox PARC phase). The Japanese have a small word -- ma -- for "that which is in between" -- perhaps the nearest English equivalent is "interstitial". The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be.</p>
+<footer>Alan Kay {% cite KayQuote --file message-passing %}</footer>
+</blockquote>
+
+## Concurrent Object-Oriented Programming (1990)
+
+One could say that the renaissance of actor models in mainstream programming began with Gul Agha's work. His seminal book _Actors: A Model of Concurrent Computation in Distributed Systems_ {% cite Agha:1986:AMC:7929 --file message-passing %} and later paper, _Concurrent Object-Oriented Programming_ {% cite Agha:1990:COP:83880.84528 --file message-passing %}, offer classic actors as a natural solution to problems at the intersection of two trends in computing: increased distributed computing resources and the rising popularity of object-oriented programming. The paper defines common patterns of parallelism: pipeline concurrency, divide and conquer, and cooperative problem solving. It then focuses on how the actor model can be used to solve these problems in an object-oriented style, on some of the challenges that arise with distributed actors and objects, and on strategies and tradeoffs for communication and reasoning about behaviors.
+
+This paper surveys many systems and languages implementing solutions in this space, and starts to identify some of the advantages, from the programmer's perspective, of programming with actors. One of the core languages used for examples in the paper is Rosette {% cite Tomlinson:1988:ROC:67387.67410 --file message-passing %}, but the paper largely focuses on the potential and benefits of the model. Agha claims the benefits of using objects stem from a separation of concerns.
+
+<blockquote>
+<p>By separating the specification of what is done (the abstraction) from how it is done (the implementation), the concept of objects provides modularity necessary for programming in the large. It turns out that concurrency is a natural consequence of the concept of objects.</p>
+<footer>Gul Agha {% cite Agha:1990:COP:83880.84528 --file message-passing %}</footer>
+</blockquote>
+
+Splitting concerns into multiple pieces makes it easier for the programmer to reason about the behavior of the program. It also allows the programmer to use more flexible abstractions in their programs.
+
+<blockquote>
+<p>It is important to note that the actor languages give special emphasis to developing flexible program structures which simplify reasoning about programs.</p>
+<footer>Gul Agha {% cite Agha:1990:COP:83880.84528 --file message-passing %}</footer>
+</blockquote>
+
+This flexibility turns out to be a highly discussed advantage which continues to be touted in modern actor systems.
+
+## Rosette
+
+Rosette was both a language for concurrent object-oriented programming of actors and a runtime system for managing the usage of and access to resources by those actors. Rosette {% cite Tomlinson:1988:ROC:67387.67410 --file message-passing %} is mentioned throughout Agha's _Concurrent Object-Oriented Programming_, {% cite Agha:1990:COP:83880.84528 --file message-passing %} and the code examples given in the paper are written in Rosette. Agha is even an author on the Rosette paper, so it's clear that Rosette is foundational to the classic actor model. It seems to be the language that almost defines what the classic actor model looks like in the context of concurrent object-oriented programming.
+
+The motivation behind Rosette was to provide strategies for dealing with problems like search, where the programmer needs a means to control how resources are allocated to sub-computations in order to optimize performance in the face of combinatorial explosion. For example, in a search problem you might first compute an initial set of results that you want to further refine. It would be too computationally expensive to exhaustively refine every result, so you want to choose the best ones based on some metric and only proceed with those. Rosette supports the use of concurrency in solving computationally intensive problems whose structure is not statically defined, but rather depends on some heuristic to return results. Rosette's architecture uses actors in two distinct ways, described as two layers with different responsibilities:
+
+* _Interface layer_: This implements mechanisms for monitoring and control of resources. The system resources and hardware are viewed as actors.
+* _System environment_: This is comprised of actors who actually describe the behavior of concurrent applications and implement resource management policies based on the interface layer.
+
+The Rosette language has a number of object-oriented features, many of which we take for granted in modern object-oriented programming languages. It implements dynamic creation and modification of objects for extensible and reconfigurable systems, supports inheritance, and has objects which can be organized into classes. The more interesting characteristic is that the concurrency in Rosette is inherent and declarative rather than explicit as with many modern object-oriented languages. In Rosette, the concurrency is an inherent property of the program structure and resource allocation. This is different from a language like Java, where all of the concurrency is very explicit. The Java concurrency model is best covered in _Java Concurrency in Practice_, though Java 8 introduces a few new concurrency techniques that the book does not discuss. {% cite Peierls:2005:JCP:1076522 --file message-passing %} The motivation behind this declarative concurrency comes from the heterogeneous nature of distributed concurrent computers. Different computers and architectures have varying concurrency characteristics, and the authors argue that forcing the programmer to tailor their concurrency to the specific machine makes it difficult to re-map a program to another one. This idea of using actors as a more flexible and natural abstraction over concurrency and distribution of resources is an important one which is seen in some form within many actor systems.
+
+Actors in Rosette are organized into three types of classes which describe different aspects of the actors within the system:
+
+* _Abstract classes_ specify requests, responses, and actions within the system which can be observed. The idea behind these is to expose the higher-level behaviors of the system, but tailor the actual actor implementations to the resource constraints of the system.
+* _Representation classes_ specify the resource management characteristics of implementations of abstract classes.
+* _Behavior classes_ specify the actual implementations of actors in given abstract and representation classes.
+
+These classes provide a concrete object-oriented abstraction for organizing actors that accounts for the practical constraints of a distributed system. They represent a step toward handling not just the information flow and behavior of the system, but also the underlying hardware and resources. Rosette's model feels like a direct expression of concerns that every production actor system inevitably ends up addressing.
+
+## Akka
+
+Akka is an effort to bring an industrial-strength actor model to the JVM runtime, which was not explicitly designed to support actors. Akka was developed out of initial efforts of [Scala Actors](#scala-actors) to bring the actor model to the JVM. There are a few notable changes from Scala Actors that make Akka worth mentioning, especially as it is being actively developed while Scala Actors is not. Some important changes are detailed in _On the Integration of the Actor Model in Mainstream Technologies: The Scala Perspective_. {% cite Haller:2012:IAM:2414639.2414641 --file message-passing %}
+
+Akka provides a programming interface with both Java and Scala bindings for actors which looks similar to Scala Actors, but has different semantics in how it processes messages. Akka's `receive` operation defines a global message handler which does not block while no matching messages are available; it is instead only triggered when a matching message can be processed. Akka also will not leave a message in an actor's mailbox if there is no matching pattern to handle it: the message is simply discarded and an event is published to the system. Akka's interface additionally provides stronger encapsulation to avoid exposing direct references to actors. Akka actors have a limited `ActorRef` interface which only provides methods to send or forward messages to its actor, and checks are done to ensure that no direct reference to an instance of an `Actor` subclass is accessible after an actor is created. To some degree this fixes problems in Scala Actors where public methods could be called on actors, breaking many of the guarantees programmers expect from message passing. This system is not perfect, but in most cases it limits the programmer to simply sending messages to an actor through a limited interface.
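+
+As a rough sketch (not taken from the Akka documentation), the following classic, untyped Akka snippet illustrates these two points: `receive` is a partial function over incoming messages, and other code only ever holds an `ActorRef`. The `Counter` actor and its `Tick`/`Read` messages are our own illustrative names.
+
+```
+import akka.actor.{Actor, ActorSystem, Props}
+
+// Illustrative message types; these are not part of Akka itself.
+case object Tick
+case object Read
+
+class Counter extends Actor {
+  private var count = 0
+
+  // receive is a partial function: a message with no matching case is not
+  // kept in the mailbox, it is published to the system as an unhandled event.
+  def receive = {
+    case Tick => count += 1
+    case Read => sender() ! count
+  }
+}
+
+object Example extends App {
+  val system = ActorSystem("example")
+  // Only the limited ActorRef is exposed, never the Counter instance itself.
+  val counter = system.actorOf(Props[Counter], "counter")
+  counter ! Tick
+  counter ! Read
+}
+```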
+
+The Akka runtime also provides performance advantages over Scala Actors. The runtime uses a single continuation closure for many or all messages an actor processes, and provides methods to change this global continuation. This can be implemented more efficiently on the JVM than Scala Actors' continuation model, which uses control-flow exceptions that incur additional overhead. Additionally, nonblocking message-insert and task-schedule operations are used for extra performance.
+
+Akka is the production-ready result of the classic actor model lineage. It is actively developed and actually used to build scalable systems. The production usage of Akka is detailed later in this chapter. Akka has been successful enough that it has been ported to other languages/runtimes. There is an [Akka.NET](http://getakka.net/) project which brings the Akka programming model to .NET and Mono using C# and F#. Akka has even been ported to JavaScript as [Akka.js](https://github.com/unicredit/akka.js/), built on top of [Scala.js](http://www.scala-js.org/).
+
+# Process-based actors
+
+The process-based actor model is essentially an actor modeled as a process that runs from start to completion. This view is broadly similar to the classic actor, but the two models differ in how the lifecycle and behaviors of actors are managed. The first language to explicitly implement this model is Erlang, {% cite Armstrong:2010:ERL:1810891.1810910 --file message-passing %} and its designers even say in a retrospective that their view of computation is broadly similar to Agha's classic actor model.
+
+A process-based actor is defined as a computation which runs from start to completion, in contrast to the classic actor model, which defines an actor almost as a state machine of behaviors and the logic to transition between them. Similar state-machine-like behavior transitions are possible through recursion with process-based actors, but programming them feels fundamentally different from using the previously described `become` statement.
+
+These actors use a `receive` primitive to specify the messages that an actor can accept at a given state or point in time. `receive` statements have some notion of defining acceptable messages, usually based on patterns, conditionals, or types. If a message is matched, the corresponding code is evaluated; otherwise the actor simply blocks until it gets a message that it knows how to handle. The semantics of this `receive` are different from the receive previously described in the section about Akka, which is explicitly only triggered when an actor gets a message it knows how to handle. Depending on the language implementation, `receive` might specify an explicit message type or perform some pattern matching on message values.
+
+An example of these core concepts of a process with a defined lifecycle and use of the `receive` statement to match messages is a simple counter process written in Erlang. {% cite Armstrong:2010:ERL:1810891.1810910 --file message-passing %}
+
+```
+counter(N) ->
+    receive
+        tick ->
+            counter(N+1);
+        {From, read} ->
+            From ! {self(), N},
+            counter(N)
+    end.
+```
+
+This demonstrates the use of `receive` to match on two different message values: `tick`, which increments the counter, and `{From, read}`, where `From` is a process identifier and `read` is a literal. In response to another process sending the message `tick` (for example, `CounterId ! tick.`), the process calls itself with an incremented value. This resembles the `become` statement, but uses recursion and an argument value instead of a named behavior continuation and some state. If the counter receives a message of the form `{<processId>, read}`, it sends that process a message containing the counter's process identifier and value, and calls itself recursively with the same value.
+
+## Erlang
+
+Erlang is the origin of the process-based actor model, and its implementation gets to the core of what it means to be a process-based actor. Ericsson originally developed this model to program large, highly reliable, fault-tolerant telecommunications switching systems. Erlang's development started in 1985, but its model of programming is still used today. The Erlang model was motivated by four key properties needed to program fault-tolerant operations:
+
+* Isolated processes
+* Pure message passing between processes
+* Detection of errors in remote processes
+* The ability to determine what type of error caused a process crash
+
+The Erlang researchers initially believed that shared memory was an obstacle to fault tolerance, and they saw message passing of immutable data between processes as the way to avoid shared memory altogether. There was a concern that passing around and copying data would be costly, but the Erlang developers saw fault tolerance as a more important concern than performance. This model was developed essentially independently from other actor systems and research; its development started before Agha's classic actor model formalization was even published, yet it ends up with a broadly similar view of computation to Agha's classic actor model.
+
+Erlang actors run as lightweight isolated processes. They do not have visibility into one another, and they pass around pure, immutable messages. Messages carry no dangling pointers or data references between objects, which enforces the idea of immutable, separated data between actors, unlike many of the early classic actor implementations in which references to actors and data could be passed around freely.
+
+Erlang implements a blocking `receive` operation as a means of processing messages from a process's mailbox. It uses value matching on message tuples to describe the kinds of messages a given actor can accept.
+
+Erlang also seeks to build failure into the programming model, as one of the core assumptions of a distributed system is that machines and network connections are going to fail. Erlang provides the ability for processes to monitor one another through two primitives:
+
+* `monitor`: one-way unobtrusive notification of process failure/shutdown
+* `link`: two-way notification of process failure/shutdown allowing for coordinated termination
+
+These primitives can be used to construct complex supervision hierarchies that handle failures in isolation, rather than letting failures impact the entire system. Supervision hierarchies are notably the dominant fault-tolerance scheme in the world of actors: almost every actor system used to build distributed systems takes a similar approach, and it seems to work. Erlang's philosophies, developed to build a reliable fault-tolerant telephone exchange, appear broadly applicable to the fault-tolerance problems of distributed systems.
+
+An example of a process `monitor` written in Erlang is given below. {% cite Armstrong:2010:ERL:1810891.1810910 --file message-passing %}
+
+```
+on_exit(Pid, F) ->
+    spawn(fun() -> monitor(Pid, F) end).
+
+monitor(Pid, F) ->
+    process_flag(trap_exit, true),
+    link(Pid),
+    receive
+        {'EXIT', Pid, Why} ->
+            F(Why)
+    end.
+```
+
+This defines two functions: `on_exit`, which spawns a process running `monitor` to call a given function when a given process exits, and `monitor`, which uses `link` to receive a message when that process exits and then calls the function with the reason it exited. You could imagine chaining many of these `monitor` and `link` operations together to build processes that monitor one another for failure and perform recovery operations depending on the failure behavior.
+
+It is worth mentioning that Erlang achieves all of this through the Erlang virtual machine (BEAM), which runs as a single OS process with one scheduler thread per core. This single OS process then manages many lightweight Erlang processes. The Erlang VM implements all of the concurrency, monitoring, and garbage collection for Erlang processes, almost acting like an operating system itself. This is unlike any other language or actor system described here.
+
+## Scala Actors
+
+Scala Actors is an example of taking and enhancing the Erlang model while bringing it to a new platform. Scala Actors brings lightweight Erlang-style message-passing concurrency to the JVM and integrates it with the heavyweight thread/process concurrency models. {% cite Haller:2009:SAU:1496391.1496422 --file message-passing %} The original paper describes the problem as "an impedance mismatch between message-passing concurrency and virtual machines such as the JVM": VMs usually map threads to heavyweight OS processes, while a lightweight process abstraction reduces programmer burden and leads to more natural abstractions. The authors claim that "The user experience gained so far indicates that the library makes concurrent programming in a JVM-based system much more accessible than previous techniques."
+
+The realization of this model depends on efficiently multiplexing actors to threads. This technique was originally developed in Scala Actors and later adopted by Akka. The integration allows actors to invoke methods that block the underlying thread in a way that doesn't prevent other actors from making progress. This matters in an event-driven system where handlers are executed on a thread pool, because the underlying event handlers can't block threads without risking thread pool starvation. The end result is that Scala Actors enabled a new lightweight concurrency primitive on the JVM, with enhancements over Erlang's model. Scala's pattern-matching capabilities enable more advanced matching on messages than Erlang's tuple value matching. Scala Actors are of the type `Any => Unit`, which means that they are essentially untyped: they can receive literally any value and match on it with potential side effects. This behavior can be problematic, and systems like Cloud Haskell and Akka aim to improve on it. Akka especially draws directly on the work of Scala Actors, and has now become the standard actor framework for Scala programmers.
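+
+A minimal sketch using the now-deprecated `scala.actors` library (roughly Scala 2.9 to 2.11) illustrates this untyped, pattern-matching style; the counter behavior and message shapes are our own example, not code from the paper.
+
+```
+import scala.actors.Actor
+import scala.actors.Actor._
+
+object Example extends App {
+  // The handler is effectively Any => Unit: any value can be sent, and a
+  // message that matches no pattern simply stays in the mailbox.
+  val counter = actor {
+    var n = 0
+    loop {
+      react {
+        case "tick"                => n += 1
+        case ("read", from: Actor) => from ! n
+      }
+    }
+  }
+
+  counter ! "tick"
+  counter ! (("read", self))
+  receive { case value => println("count: " + value) }
+}
+```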
+
+## Cloud Haskell
+
+Cloud Haskell is an extension of Haskell which essentially implements an enhanced version of Erlang's message-passing model of computation. {% cite epstein2011 --file message-passing %} It enhances Erlang's model with advantages from Haskell's model of functional programming in the form of purity, types, and monads. Cloud Haskell enables the use of pure functions for remote computation, which means that these functions are idempotent and can be restarted or run elsewhere in the case of failure without worrying about side effects or undo mechanisms. This alone isn't so different from Erlang, which operates on immutable data in the context of isolated memory.
+
+One of the largest improvements over Erlang is the introduction of typed channels for sending messages. These provide guarantees to the programmer about the types of messages their actors can handle, which is something Erlang lacks. In Erlang, all you have is dynamic pattern matching on value patterns, and the hope that messages of the wrong type don't get passed around your system. Cloud Haskell processes can also use multiple typed channels to pass messages between actors, rather than Erlang's single untyped channel. Haskell's monadic types make it possible for programmers to ensure that pure and effectful code are not mixed, which makes it easier to reason about where side effects happen in the system. Cloud Haskell allows shared memory within an actor process, which is useful for certain applications; this might sound like it could cause problems, but the type system forbids shared-memory structures from being shared across actors. Finally, Cloud Haskell allows for the serialization of function closures, which means that higher-order functions can be distributed across actors: as long as a function and its environment are serializable, they can be spun off as a remote computation and seamlessly continued elsewhere. These improvements over Erlang make Cloud Haskell a notable project in the space of process-based actors. Cloud Haskell is actively maintained and has also developed the Cloud Haskell Platform, which aims to provide the common functionality needed to build and manage a production actor system using Cloud Haskell.
+
+# Communicating event-loops
+
+The communicating event-loop model was introduced in the E language, {% cite Miller:2005:CSP:1986262.1986274 --file message-passing %} and it aims to change the level of granularity at which communication happens within an actor-based system. The previously described actor systems organize communication at the actor level, while the communicating event-loop model puts communication between actors in the context of actions on objects within those actors. The overall messages still reference higher-level actors, but those messages refer to more granular actions within an actor's state.
+
+## E Language
+
+The E language implements a model which is closer to imperative object-oriented programming. A single actor-like node of computation, called a "vat", contains many objects. The vat holds not just those objects, but also a mailbox shared by all of the objects inside, as well as a call stack for methods on those objects. The shared message queue and event loop act as one abstraction barrier for computation across actors, while references to individual objects within a vat are used for addressing communication and computation across actors at a different level of abstraction.
+
+This immediately raises other concerns. When handing out references at a different level of granularity than actor-global, how do you ensure the benefits of isolation that the actor model provides? After all, by referencing objects inside of an actor from many places it sounds like we're just reinventing shared-memory problems. This is answered by two different modes of execution: immediate and eventual calls.
+
+<figure class="main-container">
+ <img src="./E_vat.png" alt="An E vat" />
+ <footer>{% cite Miller:2005:CSP:1986262.1986274 --file message-passing %}</footer>
+</figure>
+
+This diagram shows an E vat, which consists of a heap of objects and a thread of control for executing methods on those objects. The stack and queue represent messages in the two different modes of execution that are used when operating on objects in E. The stack is used for immediate execution, while the queue is used for eventual execution. Immediate calls are processed first, and new immediate calls are added to the top of the stack. Eventual calls are then processed from the queue afterwards. These different modes of message passing are highlighted in communication across vats below.
+
+<figure class="main-container">
+ <img src="./E_account_spreadsheet_vats.png" alt="Communication between E vats" />
+ <footer>{% cite Miller:2005:CSP:1986262.1986274 --file message-passing %}</footer>
+</figure>
+
+From this diagram we can see that local calls among objects within a vat are handled on the immediate stack. Then when a call needs to be made across vats, it is handled on the eventual queue, and delivered to the appropriate object within the vat at some point in the future.
+
+E's reference-states define many of the isolation guarantees around computation that we expect from actors. Two different ways to reference objects are defined:
+
+* _Near reference_: This is a reference only possible between two objects in the same vat. These expose both synchronous immediate-calls and asynchronous eventual-sends.
+* _Eventual reference_: This is a reference which crosses vat boundaries, and only exposes asynchronous eventual-sends, not synchronous immediate-calls.
+
+The difference in semantics between the two types of references means that only objects within the same vat are granted synchronous access to one another. The most an eventual reference can do is asynchronously send and queue a message for processing at some unspecified point in the future. This means that within the execution of a vat, a degree of temporal isolation can be defined between the objects and communications within the vat, and the communications to and from other vats.
+
+This code example ties into the previous diagrams, and demonstrates the two different types of reference semantics. {% cite Miller:2005:CSP:1986262.1986274 --file message-passing %}
+
+```
+def makeStatusHolder(var myStatus) {
+    def myListeners := [].diverge()
+
+    def statusHolder {
+        to addListener(newListener) {
+            myListeners.push(newListener)
+        }
+
+        to getStatus() { return myStatus }
+
+        to setStatus(newStatus) {
+            myStatus := newStatus
+            for listener in myListeners {
+                listener <- statusChanged(newStatus)
+            }
+        }
+    }
+    return statusHolder
+}
+```
+
+This creates an object `statusHolder` with methods defined by `to` statements. An immediate call from another vat-local object, such as `statusHolder.setStatus(123)`, causes a message to be synchronously delivered to this object. Other objects can register as listeners either immediately, by calling `statusHolder.addListener(newListener)`, or eventually, with `statusHolder <- addListener(newListener)`, where `<-` is the eventual-send operator. Registered listeners are eventually notified, via eventual-sends, whenever the value held by `statusHolder` changes.
+
+The motivation for this referencing model comes from wanting to work at a finer-grained level of references than a traditional actor exposes. The simplest example is wanting to ensure that another actor in your system can read a value but can't write to it. How do you do that in another actor model? You might imagine creating a read-only variant of an actor which doesn't expose a write message type, or one that proxies only `read` messages to another actor which supports both `read` and `write` operations. In E, because you are handing out object references, you would simply pass around references to a `read` method only, and you don't have to worry about other actors in your system being able to write values. These finer-grained references make reasoning about state guarantees easier, because you are no longer exposing references to an entire actor but instead to the granular capabilities of the actor. Finer-grained references also enable partial failures and recoveries within an actor: individual objects within an actor can fail and be restarted without affecting the health of the entire actor. This is similar in spirit to the supervision hierarchies seen in Erlang, and even means that messages to a failed object could be queued for processing while that object is recovering. This could not happen with the same granularity in another actor system, but feels like a natural outcome of object-level references in E.
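+
+As a rough sketch of this idea in plain Scala (not E syntax), handing out only a read capability instead of the whole object looks something like the following; the `Counter` class and its names are purely illustrative.
+
+```
+class Counter {
+  private var n = 0
+  def increment(): Unit = n += 1
+  def read(): Int = n
+
+  // The only reference other components receive: they can observe the value
+  // but have no way to reach increment().
+  val readOnly: () => Int = () => read()
+}
+
+object Example extends App {
+  val counter = new Counter
+  val reader = counter.readOnly // safe to share; holders cannot write
+  counter.increment()
+  println(reader()) // prints 1
+}
+```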
+
+## AmbientTalk/2
+
+AmbientTalk/2 is a modern revival of the communicating event-loops actor model as a distributed programming language with an emphasis on developing mobile peer-to-peer applications. {% cite Cutsem:2007:AOE:1338443.1338745 --file message-passing %} This idea was originally realized in AmbientTalk/1 {% cite Dedecker:2006:APA:2171327.2171349 --file message-passing %} where actors were modelled as ABCL/1-like active objects {% cite Yonezawa:1986:OCP:960112.28722 --file message-passing %}, but AmbientTalk/2 models actors similarly to E's vats. The authors of AmbientTalk/2 felt limited by not allowing passive objects within an actor to be referenced by other actors, so they chose to go with the more fine-grained approach which allows for remote interactions between and movement of passive objects.
+
+Actors in AmbientTalk/2 are representations of event loops. The message queue is the event queue, messages are events, asynchronous message sends are event notifications, and object methods are the event handlers. The event loop serially processes messages from the queue to avoid race conditions. Local objects within an actor are owned by that actor, which is the only entity allowed to directly execute methods on them. As in E, objects within the same actor can communicate either synchronously or asynchronously, while objects referenced from outside an actor can only be communicated with asynchronously by sending messages. Objects can additionally declare themselves serializable, which means they can be copied and sent to other actors for use as local objects. When this happens, there is no maintained relationship between the original object and its copy.
+
+AmbientTalk/2 uses the event loop model to enforce three essential concurrency control properties:
+
+* _Serial execution_: Events are processed sequentially from an event queue, so the handling of a single event is atomic with respect to other events.
+* _Non-blocking communication_: An event loop doesn't suspend computation to wait for other event loops, instead all communication happens strictly as asynchronous event notifications.
+* _Exclusive state access_: Event handlers (object methods) and their associated state belong to a single event loop, which has exclusive access to their mutable state. Mutating another event loop's state is only possible indirectly, by sending an event notification asking for the mutation to occur.
+
+The end result of all this decoupling and isolation of computation is a natural fit for mobile ad hoc networks. In this domain, connections are volatile, with limited range and transient failures. Removing coupling based on time or synchronization suits the domain well, and the communicating event-loop actor model is a natural model for programming these systems. AmbientTalk/2 provides additional features on top of the communicating event-loop model, like service discovery, which enable ad hoc network creation: actors near each other can broadcast their existence and advertise common services that can be used for communication.
+
+AmbientTalk/2 is most notable as a reimagining of the communicating event-loops actor model for a modern use case. This again speaks to the broader advantages of actors and their applicability to solving the problems of distributed systems.
+
+# Active Objects
+
+Active object actors draw a distinction between two different types of objects: active and passive objects. Every active object has a single entry point defining a fixed set of messages that are understood. Passive objects are the objects that are actually sent between actors, and are copied around to guarantee isolation. This enables a separation of concerns between data that relates to actor communication and data that relates to actor state and behavior.
+
+The active object model as initially described in the ABCL/1 language defines objects with a state and three modes:
+
+* `dormant`: Initial state of no computation, simply waiting for a message to activate the behavior of the actor.
+* `active`: A state in which computation is performed, triggered when a message arrives that satisfies the patterns and constraints the actor has declared it can process.
+* `waiting`: A state of blocked execution, where the actor is active, but waiting until a certain type or pattern of message arrives to continue computation.
+
+## ABCL/1 Language
+
+The ABCL/1 language implements the active object model described above, representing a system as a collection of objects, and the interactions between those objects as concurrent messages being passed around. {% cite Yonezawa:1986:OCP:960112.28722 --file message-passing %} One interesting aspect of ABCL/1 is the idea of explicitly different modes of message passing. Other actor models generally have a notion of priority around the values, types, or patterns of messages they process, usually defined by the ordering of their receive operation, but ABCL/1 implements two different modes of message passing with different semantics. They have standard queued messages in the `ordinary` mode, but more interestingly they have `express` priority messages. When an object receives an express message it halts any other processing of ordinary messages it is performing, and processes the `express` message immediately. This enables an actor to accept high-priority messages while in `active` mode, and also enables monitoring and interrupting actors.
+
+The language also offers different models of synchronization around message-passing between actors. Three different message-passing models are given that enable different use cases:
+
+* `past`: Requests another actor to perform a task, while simultaneously proceeding with computation without waiting for the task to be completed.
+* `now`: Waits for a message to be received, and to receive a response. This acts as a basic synchronization barrier across actors.
+* `future`: Acts like a typical future, continuing computation until a remote result is needed, and then blocking until that result is received.
+
+It is interesting to note that all of these modes can be expressed by the `past` style of message-passing, as long as the type of the message and which actor to reply to with results are known.
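+
+To make that claim concrete, here is a small sketch in plain Scala rather than ABCL/1 syntax: the worker below only understands a past-style, fire-and-forget send that carries an explicit reply-to, and the `now` and `future` styles are rebuilt on top of it. All of the names here are our own.
+
+```
+import java.util.concurrent.LinkedBlockingQueue
+import scala.concurrent.{Await, Future, Promise}
+import scala.concurrent.duration._
+
+object PastOnly extends App {
+  // A message carrying an explicit reply-to, in the spirit of a past-style send.
+  case class Request(n: Int, replyTo: Promise[Int])
+
+  private val mailbox = new LinkedBlockingQueue[Request]()
+
+  // The "actor": a process that repeatedly takes messages from its mailbox and
+  // answers through the reply-to carried inside each message.
+  private val worker = new Thread(() => while (true) {
+    val msg = mailbox.take()
+    msg.replyTo.success(msg.n + 1)
+  })
+  worker.setDaemon(true)
+  worker.start()
+
+  // past: send and immediately continue, keeping only a handle to the reply.
+  def past(n: Int): Future[Int] = {
+    val p = Promise[Int]()
+    mailbox.put(Request(n, p))
+    p.future
+  }
+
+  // future: keep the handle and block only when the value is actually needed.
+  val f = past(41)
+
+  // now: a past send followed immediately by waiting on the reply.
+  val now = Await.result(past(1), 1.second)
+
+  println(now)                       // 2
+  println(Await.result(f, 1.second)) // 42
+}
+```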
+
+The key difference with this style of actor is how it relates to managing actor lifecycle. In the active object style, lifecycle is a result of messages or requests to actors, whereas other styles have a more explicit notion of lifecycle and of creating and destroying actors.
+
+## Orleans
+
+Orleans takes the concept of actors whose lifecycle is dependent on messaging or requests and places them in the context of cloud applications. {% cite Bykov:2011:OCC:2038916.2038932 --file message-passing %} Orleans does this via actors (called "grains") which are isolated units of computation and behavior that can have multiple instantiations (called "activations") for scalability. These actors also have persistence, meaning they have a persistent state that is kept in durable storage so that it can be used to manage things like user data.
+
+Orleans uses a different notion of identity than other actor systems. In other systems, an "actor" might refer to a behavior, and instances of that actor might refer to identities the actor represents, such as individual users. In Orleans, an actor represents that persistent identity, and the actual instantiations are in fact reconcilable copies of that identity.
+
+The programmer essentially assumes that a single entity is handling requests to an actor, but the Orleans runtime actually allows for multiple instantiations for scalability. These instantiations are invoked in response to an RPC-like call from the programmer which immediately returns an asynchronous promise.
+
+In Orleans, declaring an actor just looks like writing any other class which implements a specific interface. A simple example is a `PlayerGrain` which can join games. All methods of an Orleans actor (grain) interface must return a `Task` or `Task<T>`, as they are all asynchronous.
+
+```
+public interface IPlayerGrain : IGrainWithGuidKey
+{
+    Task<IGameGrain> GetCurrentGame();
+    Task JoinGame(IGameGrain game);
+}
+
+public class PlayerGrain : Grain, IPlayerGrain
+{
+    private IGameGrain currentGame;
+
+    public Task<IGameGrain> GetCurrentGame()
+    {
+        return Task.FromResult(currentGame);
+    }
+
+    public Task JoinGame(IGameGrain game)
+    {
+        currentGame = game;
+        Console.WriteLine("Player {0} joined game {1}", this.GetPrimaryKey(), game.GetPrimaryKey());
+        return TaskDone.Done;
+    }
+}
+```
+
+Invoking a method on an actor is done like any other asynchronous call, using the `await` keyword in C#. This can be done from either a client or inside another actor (grain). In both cases the call looks almost exactly the same, the only difference being that clients use `GrainClient.GrainFactory` while actors can use `GrainFactory` directly.
+
+```
+IPlayerGrain player = GrainClient.GrainFactory.GetGrain<IPlayerGrain>(playerId);
+Task joinGameTask = player.JoinGame(currentGame);
+await joinGameTask;
+```
+
+Here a game client gets a reference to a specific player, and has that player join the current game. This code looks like any other asynchronous C# code a developer would be used to writing, but this is really an actor system where the runtime has abstracted away many of the details. The runtime handles all of the actor lifecycle in response to the requests clients and other actors within the system make, as well as persistence of state to long-term storage.
+
+Multiple instances of an actor can be running and modifying the state of that actor at the same time. The immediate question is: how does that actually work? It doesn't intuitively seem like transparently accessing and changing multiple isolated copies of the same state should produce anything but problems when it's time to do something with that state.
+
+Orleans solves this problem by providing mechanisms to reconcile conflicting changes. If multiple instances of an actor modify persistent state, they need to be reconciled into a consistent state in some meaningful way. The default here is a last-write-wins strategy, but Orleans also exposes the ability to create fine-grained reconciliation policies, as well as a number of common reconcilable data structures. If an application requires a certain reconciliation algorithm, the developer can implement it using Orleans. These reconciliation mechanisms are built upon Orleans' concept of transactions.
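+
+As a rough, framework-agnostic illustration of the default last-write-wins idea (this is not Orleans' actual API, and it is written in Scala for consistency with the other sketches in this chapter), reconciliation can be as simple as keeping the replica with the most recent write:
+
+```
+object LastWriteWins {
+  // A replica's view of some actor state, tagged with when it was written.
+  final case class Versioned[A](value: A, writtenAtMillis: Long)
+
+  // Last-write-wins: the replica with the most recent write supersedes the rest.
+  def reconcile[A](replicas: Seq[Versioned[A]]): Versioned[A] =
+    replicas.maxBy(_.writtenAtMillis)
+
+  def main(args: Array[String]): Unit = {
+    val merged = reconcile(Seq(
+      Versioned("balance = 10", writtenAtMillis = 1000L),
+      Versioned("balance = 25", writtenAtMillis = 2000L)
+    ))
+    println(merged.value) // balance = 25
+  }
+}
+```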
+
+Transactions in Orleans are a way to causally reason about the different instances of actors that are involved in a computation. Because in this model computation happens in response to a single outside request, a given actor's chain of computation via associated actors always contains a single instantiation of each actor. This causal chain of instantiations is treated as a single transaction. At reconciliation time, Orleans uses these transactions, along with the current instantiation state, to reconcile to a consistent state.
+
+All of this is a long-winded way of saying that Orleans' programmer-centric contribution is that it separates the concerns of running and managing actor lifecycles from the concerns of how data flows throughout your distributed system. It does this in a fault-tolerant way, and for many programming tasks you likely wouldn't have to worry about scaling and reconciling data in response to requests. It provides the benefits of the actor model through a programming model that abstracts away details you would otherwise have to worry about when using actors in production.
+
+# Why the actor model?
+
+The actor programming model offers benefits to programmers of distributed systems by allowing easier reasoning about behavior, providing a lightweight concurrency primitive that naturally scales across many machines, and enabling looser coupling among components of a system, which allows for change without service disruption. Actors let a programmer reason more easily about behavior because they are, at a fundamental level, isolated from other actors. When programming an actor, the programmer only has to worry about the behavior of that actor and the messages it can send and receive. This alleviates the need to reason about the entire system: the programmer has a fixed set of concerns and can ensure behavioral correctness in isolation, rather than worrying about interactions they hadn't anticipated. Actors provide a single means of communication (message passing), which alleviates many of the concerns a programmer has around concurrent modification of data. The data in question is restricted to the data within a single actor and the messages it has been passed, rather than all of the accessible data in the whole system.
+
+Actors are lightweight, meaning that the programmer usually does not have to worry about how many actors they are creating. This is a contrast to other fundamental units of concurrency like threads or processes, which a programmer has to be acutely aware of, as they incur high costs of creation, and quickly run into machine resource and performance limitations.
+
+<blockquote>
+<p>Without a lightweight process abstraction, users are often forced to write parts of concurrent applications in an event-driven style which obscures control flow, and increases the burden on the programmer.</p>
+<footer>Philipp Haller {% cite Haller:2009:SAU:1496391.1496422 --file message-passing %}</footer>
+</blockquote>
+
+Unlike threads and processes, actors can also easily be told to run on other machines as they are functionally isolated. This cannot traditionally be done with threads or processes, as they are unable to be passed over the network to run elsewhere. Messages can be passed over the network, so an actor does not have to care where it is running as long as it can send and receive messages. They are more scalable because of this property, and it means that actors can naturally be distributed across a number of machines to meet the load or availability demands of the system.
+
+Finally, because actors are loosely coupled, only depending on a set of input and output messages to and from other actors, their behavior can be modified and upgraded without changing the entire system. For example, a single actor could be upgraded to use a more performant algorithm to do its work, and as long as it can process the same input and output messages, nothing else in the system has to change. This isolation is a contrast to methods of concurrent programming like remote procedure calls, futures, and promises. These models emphasize a tighter coupling between units of computation, where a process may call a method directly on another process and expect a specific result. This means that both the caller and callee (receiver of the call) need to have knowledge of the code being run, so you lose the ability to upgrade one without impacting the other. This becomes a problem in practice, as it means that as the complexity of your distributed system grows, more and more pieces become linked together.
+
+This is not desirable, as a key characteristic of distributed systems is availability, and the more things are linked together, the more of your system you have to take down or halt to make changes/upgrades. Actors compare favorably to other concurrent programming primitives like threads or remote procedure calls due to their low cost and loosely coupled nature. They are also programmer friendly, and ease the programmer burden of reasoning about a distributed system.
+
+# Modern usage in production
+
+It is important when reviewing models of programming distributed systems not to look just to academia, but to see which of these systems are actually used in industry to build things. This can give us insight into which features of actor systems are actually useful, and the trends that exist throughout these systems.
+
+_On the Integration of the Actor Model in Mainstream Technologies: The Scala Perspective_ {% cite Haller:2012:IAM:2414639.2414641 --file message-passing %} provides some insight into the requirements of an industrial-strength actor implementation on a mainstream platform. These requirements were drawn out of an initial effort with [Scala Actors](#scala-actors) to bring the actor model to mainstream software engineering, as well as lessons learned from the deployment and advancement of production actors in [Akka](#akka).
+
+* _Library-based implementation_: It is not obvious which concurrency abstraction wins in real world cases, and different concurrency models might be used to solve different problems, so implementing a concurrency model as a library enables flexibility in usage.
+* _High-level domain-specific language_: A domain-specific language or something comparable is a requirement to compete with languages that specialize in concurrency, otherwise your abstractions are lacking in idioms and expressiveness.
+* _Event-driven implementation_: Actors need to be lightweight, meaning they cannot be mapped to an entire VM thread or process. For most platforms this means an event-driven model.
+* _High performance_: Most industrial applications that use actors are highly performance sensitive, and high performance enables more graceful scalability.
+* _Flexible remote actors_: Many applications can benefit from remote actors, which can communicate transparently over the network. Flexibility in deployment mechanisms is also very important.
+
+These attributes give us a good basis for analyzing whether an actor system can be successful in production. These are attributes that are necessary, but not sufficient for an actor system to be useful in production.
+
+## Failure handling
+
+One of the most important reasons people use actor systems in production is their support for failure handling and recovery. The root of this support is the previously mentioned ability of actors to supervise one another and to have supervisors notified of failures. _Designing Reactive Systems: The Role of Actors in Distributed Architecture_ {% cite ReactiveSystems --file message-passing %} details four well-known recovery steps that a supervising actor may take when it is notified of a problem with one of its workers.
+
+* Ignore the error and let the worker resume processing
+* Restart the worker and reset their state
+* Stop the worker entirely
+* Escalate the problem to the supervisor's supervising actor
+
+Based on this scheme, all actors within a system have a supervisor, which amounts to a large tree of supervision. At the top of the tree is the actor system itself, which may have a default recovery scheme like simply restarting the actor. An interesting note is that this frees individual actors from handling their own failures: the philosophy shifts to "actors will fail", and explicit supervising actors and policies handle those failures outside of the business logic of the individual actor.
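+
+These four recovery steps correspond closely to the directives of, for example, Akka's supervisor strategies. A minimal sketch in classic Akka Scala follows; the exception-to-directive mapping is purely illustrative.
+
+```
+import akka.actor.{Actor, OneForOneStrategy, Props, SupervisorStrategy}
+import akka.actor.SupervisorStrategy.{Escalate, Restart, Resume, Stop}
+import scala.concurrent.duration._
+
+class Supervisor extends Actor {
+  // Decide, per failing child, which of the four recovery steps to apply.
+  override val supervisorStrategy: SupervisorStrategy =
+    OneForOneStrategy(maxNrOfRetries = 3, withinTimeRange = 1.minute) {
+      case _: ArithmeticException   => Resume   // ignore the error, keep the worker's state
+      case _: IllegalStateException => Restart  // restart the worker, resetting its state
+      case _: NullPointerException  => Stop     // stop the worker entirely
+      case _: Exception             => Escalate // pass the problem up to our own supervisor
+    }
+
+  def receive = {
+    // Children created here are supervised by this actor.
+    case props: Props => sender() ! context.actorOf(props)
+  }
+}
+```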
+
+<figure class="main-container">
+ <img src="./supervision_tree.png" alt="An actor supervision hierarchy tree" />
+ <footer>An actor supervision hierarchy. {% cite ReactiveSystems --file message-passing %}</footer>
+</figure>
+
+Another approach that naturally falls out of supervision hierarchies is that they can be distributed across machines (nodes) within a cluster of actors for fault tolerance.
+
+<figure class="main-container">
+ <img src="./sentinel_nodes.png" alt="Actor supervision across cluster nodes." />
+ <footer>Actor supervision across cluster nodes. {% cite ReactiveSystems --file message-passing %}</footer>
+</figure>
+
+Critical actors can be monitored across nodes, which means that failures can be detected across nodes within a cluster. This allows other actors within the cluster to react to the state of the entire system, not just the state of their local machine, which is important for a number of problems that arise in distributed systems, like load balancing and data/request partitioning. It also naturally allows for some form of recovery involving the other machines within a cluster, such as automatically spinning up another node or restarting the failed machine/node.
+
+Flexibility around failure handling is a key advantage of using actors in production systems. Supervision means that worker actors can focus on business logic, and failure-handling actors can focus on managing and recovering those actors. Actors can also be cluster-aware and have a view into the state of the entire distributed system.
+
+## Actors as a framework
+
+One trend that seems common among the actor systems we see in production is extensive environments and tooling. Akka, Erlang, and Orleans are the primary actor systems that see real production use, and the reason for this is that they essentially act as frameworks where many of the common problems of actors are taken care of for you. They offer support for managing and monitoring the deployment of actors as well as patterns or modules to handle problems like fault-tolerance and load balancing which every distributed actor system has to address. This allows the programmer to focus on the problems within their domain, rather than the common problems of monitoring, deployment, and composition.
+
+Akka and Erlang provide modules that you can piece together to build various pieces of functionality into your system. Akka provides a large number of modules and extensions to configure and monitor a distributed system built using actors. It provides a number of utilities to meet common use-case and deployment scenarios, and these are thoroughly listed and documented. For example, Akka includes modules to deal with the following common issues (and more):
+
+* Fault Tolerance via supervision hierarchies
+* Routing to balance load across actors
+* Persistence to save and recover actor state across failures and restarts
+* A testing framework specifically for actors
+* Cluster management to group and distribute actors across physical machines
+
+Additionally, Akka supports Akka Extensions, which are a mechanism for adding your own features to Akka. These are powerful enough that some core features of Akka, like Typed Actors and Serialization, are implemented as Akka Extensions.
+
+Erlang provides the Open Telecom Platform (OTP), a framework composed of a set of modules and standards designed to help build applications. OTP takes the generic patterns and components of Erlang and provides them as libraries that enable code reuse and best practices when developing new systems. Some examples of OTP libraries are:
+
+* A real-time distributed database
+* An interface to relational databases
+* A monitoring framework for machine resource usage
+* Support for interfacing with other communication protocols like SSH
+* A test framework
+
+Cloud Haskell also provides something analogous to Erlang's OTP called the Cloud Haskell Platform.
+
+Orleans is different from these as it is built from the ground up with a more declarative style and runtime. This does a lot of the work of distributing and scaling actors for you, but it is still definitely a framework which handles a lot of the common problems of distribution so that programmers can focus on building the logic of their system. Orleans takes care of the distribution of actors across machines, as well as creating new actor instances to handle increased load. Additionally, Orleans also deals with reconciliation of consistency issues across actor instantiations, as well as persistence of actor data to durable storage. These are common issues that the other industrial actor frameworks also address in some capacity using modules and extensions.
+
+## Module vs. managed runtime approaches
+
+Based on my research, there have been two prevalent approaches to frameworks that are actually used to build production actor systems in industry. These are high-level philosophies about the meta-organization of an actor system, design concerns that aren't directly visible when looking only at the base actor programming and execution models. The easiest way to describe them is as the "module approach" and the "managed runtime approach". A high-level analogy is that the module approach is similar to manually managing memory, while the managed runtime approach is similar to garbage collection. In the module approach, you care about the lifecycle and physical allocation of actors within your system, while in the managed runtime approach you care more about the reconciliation behavior and flow of persistent state between automatic instantiations of your actors.
+
+Both Akka and Erlang take a module approach to building their actor systems. This means that when you build a system using these languages/frameworks, you are using smaller composable components as pieces of the larger system you want to build. You are explicitly dealing with the lifecycles and instantiations of actors within your system, where to distribute them across physical machines, and how to balance actors to scale. Some of these problems might be handled by libraries, but at some level you are specifying how all of the organization of your actors is happening. The JVM or Erlang VM isn't doing it for you.
+
+Orleans goes in another direction, which I call the managed runtime approach. Instead of providing small components which let you build your own abstractions, they provide a runtime in the cloud that attempts to abstract away a lot of the details of managing actors. It does this to such an extent that you no longer even directly manage actor lifecycles, where they live on machines, or how they are replicated and scaled. Instead you program with actors in a more declarative style. You never explicitly instantiate actors, instead you assume that the runtime will figure it out for you in response to requests to your system. You program in strategies to deal with problems like domain-specific reconciliation of data across instances, but you generally leave it to the runtime to scale and distribute the actor instances within your system.
+
+Both approaches have been successful in industry. Erlang has the famous use case of a telephone exchange and a successful history since then. Akka has an entire page detailing its usage in giant companies. Orleans has been used as a backend to massive Microsoft-scale games and applications with millions of users. It seems like the module approach is more popular, but there's only really one example of the managed runtime approach out there. There's no equivalent to Orleans on the JVM or Erlang VM, so realistically it doesn't have as much exposure in the distributed programming community.
+
+## Comparison to Communicating Sequential Processes
+
+One popular model of message-passing concurrency that has been getting attention is Communicating Sequential Processes (CSP). The basic idea behind CSP is that concurrent communication between processes is done by passing messages through channels. Arguably the most popular modern implementation of this is Go's channels. A lot of surface-level discussions of actors treat them simply as a lightweight concurrency primitive that passes messages. This zoomed-out view might conflate CSP-style channels and actors, but it misses a lot of subtleties, as CSP really can't be considered an actor model. The core difference is that CSP implements some form of synchronous messaging between processes, while the actor model entirely decouples messaging between a sender and a receiver. Actors are much more independent, meaning it's easier to run them in a distributed environment without changing their semantics. Additionally, receiver failures don't affect senders in the actor model. Actors are a more loosely coupled abstraction across a distributed environment, while CSP embraces tight coupling as a means of synchronization across processes. To conflate the two misses the point of both, as actors operate at a fundamentally different level of abstraction from CSP.
+
+# References
+
+{% bibliography --file message-passing %}
diff --git a/chapter/3/sentinel_nodes.png b/chapter/3/sentinel_nodes.png
new file mode 100644
index 0000000..21e8bd1
--- /dev/null
+++ b/chapter/3/sentinel_nodes.png
Binary files differ
diff --git a/chapter/3/supervision_tree.png b/chapter/3/supervision_tree.png
new file mode 100644
index 0000000..95bc84b
--- /dev/null
+++ b/chapter/3/supervision_tree.png
Binary files differ
diff --git a/chapter/4/MR.png b/chapter/4/MR.png
new file mode 100644
index 0000000..54db004
--- /dev/null
+++ b/chapter/4/MR.png
Binary files differ
diff --git a/chapter/4/dist-langs.md b/chapter/4/dist-langs.md
index 9c8a8c9..295a307 100644
--- a/chapter/4/dist-langs.md
+++ b/chapter/4/dist-langs.md
@@ -4,8 +4,516 @@ title: "Distributed Programming Languages"
by: "Joe Schmoe and Mary Jane"
---
-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. {% cite Uniqueness --file dist-langs %}
+
+## Problems of Distributed Programming
+
+There are problems that exist in distributed system environments that do not exist in single-machine environments.
+Partial failure, concurrency, and latency are three problems that make distributed computing fundamentally different from local computing.
+In order to understand the design decisions behind programming languages and systems for distributed computing, it is necessary to discuss these three problems that make distributed computing unique.
+In this section, we present an overview of these three problems and their impact on distributed programming models.
+
+### Partial Failure
+
+In the case of a crash in a local environment, either the machine has failed (total failure), or the source of the crash can be learned from a central resource manager such as the operating system {% cite waldo1997 --file dist-langs.bib %}.
+If an application consists of multiple communicating processes, partial failure is possible; however, because the cause of the partial failure can be determined, this kind of partial failure can be repaired given the operating system's knowledge.
+For example, a process can be restored based on a checkpoint, another process in the application can query the operating system about the failed process' state, etc.
+
+Because failure in a distributed setting involves another player, the network, it is impossible in most cases to determine the cause of failure.
+In a distributed environment, there is no (reliable) central manager that can report on the state of all components.
+Further, due to the inherent concurrency in a distributed system, nondeterminism is a problem that must be considered when designing distributed models, languages, and systems.
+Communication is perhaps the most obvious example of this; messages may be lost or arrive out-of-order.
+Finally, unlike in a local environment where failure returns control to the caller, failure may not be reported or the response may simply vanish.
+Because of this, distributed communication must be designed expecting partial failure, and be able to "fail gracefully."
+
+Several methods have been developed to deal with the problem of partial failure.
+One method, made popular with batch processing and MapReduce style frameworks, is to remember the series of computations needed to obtain a result and recompute the result in the case of failure.
+Systems such as MapReduce, Spark, GraphX, and Spark Streaming use this model, as well as implement optimizations to make it fast.
+Another method of dealing with partial failure is the two-phase commit.
+To perform a change of state across many components, a logically central "leader" first checks whether all components are ready to perform an action.
+If all reply "yes," the action is *committed*.
+Otherwise, as in the case of partial failure, no changes are committed.
+Two-phase commit ensures that state is not changed in a partial manner; a minimal sketch of the protocol appears after this paragraph.
+Another solution to partial failure is redundancy, or replication.
+If one replica of a computation fails, the others may survive and continue.
+Replication can also be used to improve performance, as in MapReduce and Spark Streaming.
+Checkpoint and restore has also been implemented as a way to recover from partial failure.
+By serializing a recent "snapshot" of state to stable storage, recomputing current state is made cheap.
+This is the primary method of handling partial failure in RDD-based systems.
+In other systems, like Argus, objects are reconstructed from state that is automatically or manually serialized to disk.
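+
+The following is a minimal Python sketch of the two-phase commit described above. The class and method names (`Participant`, `prepare`, and so on) are hypothetical and only illustrate the prepare/commit structure; a real implementation must also handle timeouts, crashed coordinators, and persistent logging.
+
+```
+class Participant:
+    def __init__(self, name, will_fail=False):
+        self.name = name
+        self.will_fail = will_fail
+        self.committed = False
+
+    def prepare(self, action):
+        # Phase 1: vote "yes" only if the action can definitely be applied.
+        return not self.will_fail
+
+    def commit(self, action):
+        # Phase 2: apply the action for real.
+        self.committed = True
+
+    def abort(self, action):
+        # Phase 2 (failure path): discard any tentative state.
+        self.committed = False
+
+def two_phase_commit(action, participants):
+    # Phase 1: ask every participant whether it is ready.
+    votes = [p.prepare(action) for p in participants]
+    if all(votes):
+        # Phase 2: everyone said yes, so the action is committed everywhere.
+        for p in participants:
+            p.commit(action)
+        return True
+    # At least one "no" (or failure): nothing is committed anywhere.
+    for p in participants:
+        p.abort(action)
+    return False
+
+nodes = [Participant("a"), Participant("b", will_fail=True)]
+print(two_phase_commit("transfer $10", nodes))  # False: no partial change
+```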
+
+### Consistency (Concurrency)
+
+If computing on shared data could be avoided, parallel computations would not be bottlenecked by serialized accesses.
+Unfortunately, there are many instances where operating on shared data is necessary.
+While problems with shared data can be dealt with fairly simply in the local case, distribution introduces problems that make consistency more complex.
+
+In local computing, enforcing consistency is fast and straightforward.
+Traditionally, a piece of data is protected by another piece of data called a *lock*.
+To operate on the data, a concurrent process *acquires* the lock, makes its changes, then *releases* the lock.
+Because the data is either located in on-board memory or an on-chip cache, passing the shared data around is relatively fast.
+As in the case of partial failure, a central resource manager (the OS) is present and can respond to a failed process that has obtained a lock.
+
+In a distributed environment, coordination and locking are more difficult.
+First, because of the lack of a central resource manager, there needs to be a way of preserving or recovering the state of shared data in the case of failure.
+The problem of acquiring locks also becomes harder due to partial failure and higher latency.
+Synchronization protocols must expect and be able to handle failure.
+
+To deal with operations on shared data, there are a few standard techniques.
+A *sequencer* can be used to serialize requests to a shared piece of data; a small sketch follows this paragraph.
+When a process on a machine wants to write to shared data, it sends a request to a logically central process called a sequencer.
+The sequencer takes all incoming requests and serializes them, and sends the serialized operations (in order) to all machines with a copy of the data.
+The shared data will then undergo the same sequence of transformations on each machine, and therefore be consistent.
+A similar method for dealing with consistency is the message queue.
+In the actor model, pieces of an application are represented as actors which respond to requests.
+Actors may use *message queues* which behave similarly to sequencers.
+Incoming method calls or requests are serialized in a queue which the actor can use to process requests one at a time.
+Finally, some systems take advantage of the semantics of operations on shared data, distinguishing between read-only operations and write operations.
+If an operation is determined to be read-only, the shared data can be distributed and accessed locally.
+If an operation writes to shared data, further synchronization is required.
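+
+Here is a minimal Python sketch of the sequencer idea described above. The `Sequencer` and `Replica` names are made up for illustration; a real sequencer would be a separate networked process and would have to tolerate failures itself.
+
+```
+import itertools
+
+class Replica:
+    def __init__(self):
+        self.value = 0
+        self.log = []
+
+    def apply(self, seq_no, op):
+        self.value = op(self.value)
+        self.log.append(seq_no)
+
+class Sequencer:
+    def __init__(self, replicas):
+        self.replicas = replicas
+        self.counter = itertools.count()
+
+    def submit(self, op):
+        seq_no = next(self.counter)   # serialize: one global order for all writes
+        for r in self.replicas:
+            r.apply(seq_no, op)       # every replica sees the same sequence
+
+replicas = [Replica(), Replica()]
+seq = Sequencer(replicas)
+seq.submit(lambda v: v + 5)
+seq.submit(lambda v: v * 2)
+assert all(r.value == 10 for r in replicas)   # replicas remain consistent
+```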
+
+Unfortunately, none of these techniques can survive a network partition.
+Consistency requires communication, and a partitioned network will prevent updates to state on one machine from propagating.
+Distributed systems therefore may be forced by other requirements to loosen their consistency guarantees.
+Below, the CAP theorem formalizes this idea.
+
+### Latency
+
+Latency is another major problem that is unique to distributed computing.
+Unlike the other problems discussed in this section, latency does not necessarily affect program correctness.
+Rather, it is a problem that impacts application performance, and can be a source of nondeterminism.
+
+In the case of local computing, latency is minimal and fairly constant.
+Although there may be subtle timing differences that arise from contention among concurrent processes, these fluctuations are relatively small.
+As well, machine hardware is constant.
+There are no changes to the latency of communication channels on a single machine.
+
+Distribution introduces network topology.
+This topology significantly (orders of magnitude) increases the latency of communication, as well as introducing a source of nondeterminism.
+At any time, routing protocols or hardware changes (or both) may cause the latency between two machines to change.
+Therefore, distributed applications may not rely on specific timings of communication in order to function.
+Distributed processes may also be more restricted.
+Because communication across the network is costly, applications may necessarily be designed to minimize communication.
+
+A more subtle (and sinister) problem with increased latency and the network is the inability of a program to distinguish between a slow message and a failed message.
+This situation is analogous to the halting problem, and forces distributed applications to make decisions about when a message, link, or node has "failed."
+
+Several methods have been developed to cope with the latency of communication.
+Static and dynamic analysis may be performed on communication patterns so that entities that communicate frequently are placed closer together than those that communicate infrequently.
+Another approach that has been used is data replication.
+If physically separate entities all need to perform reads on a piece of data, that data can be replicated and read from local hardware.
+Another approach is pipelining; a common example of this is in some flavors of the HTTP protocol.
+Pipelining requests allows a process to continue with other work, or issue more requests without blocking for the response of each request.
+Pipelining lends itself to an asynchronous style of programming, where a callback can be assigned to handle the results of a request.
+*Futures* and *promises* have built on this programming style, allowing computations to be queued, and performed when the value of a future or promise is resolved.
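+
+Below is a small Python sketch of this pipelining style using futures. The `slow_request` function is a hypothetical stand-in for a network call; the point is only that requests are issued without blocking and callbacks run when each value resolves.
+
+```
+from concurrent.futures import ThreadPoolExecutor
+import time
+
+def slow_request(url):
+    time.sleep(0.1)                    # pretend this is network latency
+    return "response from " + url
+
+with ThreadPoolExecutor(max_workers=4) as pool:
+    # Issue both requests up front instead of waiting on each in turn.
+    futures = [pool.submit(slow_request, u) for u in ["node-a", "node-b"]]
+    # ... the caller is free to do other work here while requests are in flight ...
+    for f in futures:
+        # The callback style described above: run this when the value resolves.
+        f.add_done_callback(lambda fut: print(fut.result()))
+```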
+
+### The CAP Theorem
+
+Indeed, the three problems outlined above are not independent, and a solution for one may come at the cost of *amplifying* the effects of another.
+For example, let's suppose that when a request to our system arrives, a response should be issued as soon as possible.
+Here, we want to minimize latency.
+Unfortunately, this may come at the cost of consistency.
+We are forced to either (1) honor latency and send a possibly inconsistent result, or (2) honor consistency and wait for the distributed system to synchronize before replying.
+
+The CAP theorem {% cite gilbert2002brewer --file dist-langs.bib %} formalizes this notion.
+CAP stands for Consistency, Availability, and tolerance to Partitioning.
+The theorem states that a distributed system may only have two of these three properties.
+
+Since its introduction, experience suggests this theorem is not as rigid as originally proposed {% cite brewer2012cap --file dist-langs.bib %}.
+In practice, for example, the rarity of network partitions makes it easier to satisfy all three.
+As well, advancements in consistency models, such as CRDTs {% cite shapiro2011conflict --file dist-langs.bib %}, allow the balance between consistency and availability to be tuned to the requirements of the system.
+
+## Three Major Approaches to Distributed Languages
+
+Clearly, there are problems present in distributed programming that prevent traditional local programming models from applying directly to distributed environments.
+Languages and systems built for writing distributed applications can be classified into three categories: distributed shared memory, actors, and dataflow.
+Each model has strengths and weaknesses.
+Here, we describe each model and provide examples of languages and systems that implement them.
+
+### Distributed Shared Memory
+
+Virtual memory provides a powerful abstraction for processes.
+It allows each process running on a machine to believe it is the sole user of the machine, as well as providing each process with more (or fewer) memory addresses than are physically present.
+The operating system is responsible for mapping virtual memory addresses to physical ones and swapping addresses to and from disk.
+
+Distributed shared memory (DSM) takes the virtual memory abstraction one step further by allowing virtual addresses to be mapped to physical memory regions on remote machines.
+Given such an abstraction, programs can communicate simply by reading from and writing to shared memory addresses.
+DSM is appealing because the programming model is the same for local and distributed systems.
+However, it requires an underlying system to function properly.
+Mirage, Orca, and Linda are three systems that use distributed shared memory to provide a distributed programming model.
+
+#### Mirage (1989)
+
+Mirage is an OS-level implementation of DSM.
+In Mirage, regions of memory known as *segments* are created and indicated as shared.
+A segment consists of one or more fixed-size pages.
+Other local or remote processes can *attach* segments to arbitrary regions of their own virtual address space.
+When a process is finished with a segment, the region can be *detached*.
+Requests to shared memory are transparent to user processes.
+Faults occur on the page level.
+
+Operations on shared pages are guaranteed to be coherent; after a process writes to a page, all subsequent reads will observe the results of the write.
+To accomplish this, Mirage uses a protocol for requesting read or write access to a page.
+Depending on the permissions of the current "owner" of a page, the page may be invalidated on other nodes.
+The behavior of the protocol is outlined by the table below.
+
+| State of Owner | State of Requester | Clock Check? | Invalidation? |
+|----------------|--------------------|--------------|-------------------------------------------------------------|
+| Reader | Reader | No | No |
+| Reader | Writer | Yes | Yes, Requester is possibly sent current version of the page |
+| Writer | Reader | Yes | No, Owner is demoted to Reader |
+| Writer | Writer | Yes | Yes |
+
+Crucially, the semantics of this protocol are that at any time there may only be either (1) a single writer or (2) one or more readers of a page.
+When a single writer exists, no other copies of the page are present.
+When a read request arrives for a page that is being written to, the writer is demoted to a reader.
+These two properties ensure coherence.
+Many read copies of a page may be present, both to minimize network traffic and provide locality.
+When a write request arrives for a page, all read instances of the page are invalidated.
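+
+The following is a toy Python model of the single-writer or multiple-readers invariant just described. It deliberately omits Mirage's timer/clock check and the actual page transfer; the `Page` class and node names are purely illustrative.
+
+```
+class Page:
+    def __init__(self):
+        self.readers = set()
+        self.writer = None
+
+    def request_read(self, node):
+        if self.writer is not None and self.writer != node:
+            self.readers.add(self.writer)   # demote the current writer to a reader
+            self.writer = None
+        self.readers.add(node)
+
+    def request_write(self, node):
+        # Invalidate every other copy so that exactly one writer remains.
+        self.readers = set()
+        self.writer = node
+
+page = Page()
+page.request_read("A")
+page.request_read("B")        # many readers may coexist
+page.request_write("C")       # all read copies are invalidated
+assert page.readers == set() and page.writer == "C"
+```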
+
+To ensure fairness, the system associates each page with a timer.
+When a request is honored for a page, the timer is reset.
+The timer guarantees that the page will not be invalidated for a minimum period of time.
+Future requests that result in invalidation or demotion (writer to reader) are only honored once the timer has expired.
+
+#### Orca (1992)
+
+Orca is a programming language built for distribution and is based on the DSM model.
+Orca expresses parallelism explicitly through processes.
+Processes in Orca are similar to procedures, but are concurrent instead of serial.
+When a process is forked, it can take parameters that are either passed as a copy of the original data, or passed as a *shared data-object*.
+Processes communicate through these shared objects.
+
+Shared data objects in Orca are similar to objects in OOP.
+An object is defined abstractly by a name and a set of interfaces.
+An implementation of the object defines any private data fields as well as the interfaces (methods).
+Importantly, these interfaces are guaranteed to be indivisible, meaning that simultaneous calls to the same interface are serializable.
+Although serializability alone does not eliminate nondeterminism from Orca programs, it keeps the model simple while allowing programmers to construct richer, multi-operation locks for arbitrary semantics and logic.
+
+Another key feature of Orca is the ability to express symbolic data structures as shared data objects.
+In Orca, a generic graph type is offered as a first-class data-object.
+Because operations are serializable at the method level, the graph data-object can offer methods that span multiple nodes while still retaining serializability.
+
+#### Linda (1993)
+
+Linda is a programming model based on DSM.
+In Linda, shared memory is known as the *tuple space*; the basic unit of shared data is the tuple.
+Instead of processes reading and writing to shared memory addresses, processes can insert, extract, or copy entries from the tuple space.
+Under the Linda model, processes communicate and distribute work through tuples.
+
+A Linda tuple is not a fixed size; it can contain any combination of primitive data types and values.
+To insert a tuple into the space, the fields of the tuple are fully evaluated before insertion.
+A process can decide to evaluate a tuple serially, or spin up a background task to first evaluate the fields, then insert the tuple.
+To retrieve a tuple from the space, a *template* tuple is provided that contains a number of fixed fields to match against, as well as *formals* that are "filled in" by the tuple that matches the search.
+If many tuples match the template, one is selected arbitrarily.
+When retrieving a tuple, the tuple may be left in the tuple space or removed.
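+
+A toy, in-memory Python sketch of these operations follows. The class and method names (`TupleSpace`, `out`, `in_`) are made up, the matching rules are simplified (`None` plays the role of a formal), and a real implementation would distribute the tuples across nodes.
+
+```
+import threading
+
+class TupleSpace:
+    def __init__(self):
+        self.tuples = []
+        self.cond = threading.Condition()
+
+    def out(self, tup):
+        with self.cond:
+            self.tuples.append(tup)
+            self.cond.notify_all()
+
+    def _match(self, template, tup):
+        # None acts as a "formal" that matches (and is filled by) any value.
+        return len(template) == len(tup) and all(
+            t is None or t == v for t, v in zip(template, tup))
+
+    def in_(self, template, remove=True):
+        # Blocks until a matching tuple appears; leaves it in place if remove=False.
+        with self.cond:
+            while True:
+                for tup in self.tuples:
+                    if self._match(template, tup):
+                        if remove:
+                            self.tuples.remove(tup)
+                        return tup
+                self.cond.wait()
+
+space = TupleSpace()
+space.out(("count", 42))
+print(space.in_(("count", None)))   # -> ("count", 42), and the tuple is removed
+```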
+
+In practice, the tuple space is disjointly distributed among the nodes in the cluster.
+The number and type of elements in a tuple defines the tuple's *class*.
+All requests made for a particular class of tuple are sent through a *rendezvous point*, which provides a logically central way of performing bookkeeping about tuples.
+The rendezvous services requests for insertion and deletions of all tuples of a class.
+In the most basic implementation of Linda, each rendezvous point is located on a single participating node in the cluster.
+
+### Actor / Object model
+
+Unlike DSM, communication in the actor model is explicit and exposed through message passing.
+Messages can be synchronous or asynchronous, point-to-point or broadcast style.
+In the actor model, concurrent entities do not share state as they do in DSM.
+Each process, object, actor, etc., has its own address space.
+The model maps well to single multicore machines as well as to clusters of machines.
+Although an underlying system is required to differentiate between local and remote messages, the location of processes, objects, or actors can be transparent to the application programmer.
+Erlang, Emerald, Argus, and Orleans are just a few of many implementations of the actor model.
+
+#### Emerald (1987)
+
+Emerald is a distributed programming language based around a unified object model.
+Programs in Emerald consist of collections of Objects.
+Critically, Emerald provides the programmer with a unified object model so as to abstract object location from the invocation of methods.
+With that in mind, Emerald also provides the developer with the tools to designate explicitly the location of objects.
+
+Objects in Emerald resemble objects in other OOP languages such as Java.
+Emerald objects expose methods to implement logic and provide functionality, and may contain internal state.
+However, there are a few key differences between Emerald and Java objects.
+First, objects in Emerald may have an associated process which starts after initialization.
+In Emerald, object processes are the basic unit of concurrency.
+Additionally, an object may *not* have an associated process.
+Objects that do not have a process are known as *passive* and more closely resemble traditional Java objects; their code is executed when called by processes belonging to other objects.
+Second, processes may not touch internal state (members) of other objects.
+Unlike Java, all internal state of Emerald objects must be accessed through method calls.
+Third, objects in Emerald may contain a special *monitor* section which can contain methods and variables that are accessed atomically.
+If multiple processes make simultaneous calls to a "monitored" method, the calls are effectively serialized.
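+
+The sketch below approximates an Emerald-style monitor section in Python: calls to monitored methods are serialized by a per-object lock. The `Counter` class and its methods are hypothetical, not Emerald syntax.
+
+```
+import threading
+
+class Counter:
+    def __init__(self):
+        self._monitor = threading.Lock()   # guards the monitored section
+        self._count = 0                    # monitored variable
+
+    def increment(self):                   # monitored method
+        with self._monitor:
+            self._count += 1
+
+    def read(self):                        # monitored method
+        with self._monitor:
+            return self._count
+
+counter = Counter()
+threads = [threading.Thread(target=counter.increment) for _ in range(10)]
+for t in threads: t.start()
+for t in threads: t.join()
+print(counter.read())   # always 10: simultaneous calls are effectively serialized
+```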
+
+Emerald also takes an OOP approach to system upgrades.
+With a large system, it may not be desirable to disable the system, recompile, and re-launch.
+Emerald uses abstract types to define sets of interfaces.
+Objects that implement such interfaces can be "plugged in" where needed.
+Therefore, code may be dynamically upgraded, and different implementations may be provided for semantically similar operations.
+
+#### Argus (1988)
+
+Argus is a distributed programming language and system.
+It uses a special kind of object, called a *guardian*, to create units of distribution and group highly coupled data and logic.
+Argus procedures are encapsulated in atomic transactions, or *actions*, which allow operations that encompass multiple guardians to exhibit serializability.
+The presence of guardians and actions was motivated by Argus' use as a platform for building distributed, consistent applications.
+
+An object in Argus is known as a guardian.
+Like traditional objects, guardians contain internal data members for maintaining state.
+Unique to Argus is the distinction between volatile and stable data members.
+To cope with crashes, data members that are stable are periodically serialized to disk.
+When a guardian crashes and is restored, this serialized state is used to reconstruct it.
+As in Emerald, internal data members may not be accessed directly.
+Instead, guardians are interacted with through methods known as *handlers*.
+When a handler is called, a new process is created to handle the operation.
+Additionally, guardians may contain a background process for performing continual work.
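+
+A minimal Python sketch of the stable/volatile distinction follows, assuming a hypothetical `AccountGuardian` with one stable field and one volatile field; only the stable field is checkpointed and used for reconstruction after a crash.
+
+```
+import pickle
+
+class AccountGuardian:
+    def __init__(self, balance=0):
+        self.balance = balance      # stable: survives crashes
+        self.cache = {}             # volatile: lost on crash
+
+    def checkpoint(self, path):
+        with open(path, "wb") as f:
+            pickle.dump({"balance": self.balance}, f)   # only stable members
+
+    @classmethod
+    def restore(cls, path):
+        with open(path, "rb") as f:
+            stable = pickle.load(f)
+        return cls(balance=stable["balance"])           # volatile state starts fresh
+
+g = AccountGuardian(balance=100)
+g.cache["last_caller"] = "alice"
+g.checkpoint("/tmp/account.ckpt")
+recovered = AccountGuardian.restore("/tmp/account.ckpt")
+print(recovered.balance, recovered.cache)   # 100 {}
+```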
+
+Argus encapsulates handler calls in what it calls *actions*.
+Actions are designed to solve the problems of consistency, synchronization, and fault tolerance.
+To accomplish this, actions are serializable as well as total.
+Being serializable means that no actions interfere with one another.
+For example, a read operation that spans multiple guardians either "sees" the complete effect of a simultaneous write operation, or it sees nothing.
+Being total means that write operations that span multiple guardians either fully complete or fully fail.
+This is accomplished by a two-phase commit protocol that serializes the state of *all* guardians involved in an action, or discards partial state changes.
+
+#### Erlang (2000)
+
+Erlang is a distributed language which combines functional programming with message passing.
+Units of distribution in Erlang are processes.
+These processes may be co-located on the same node or distributed amongst a cluster.
+
+Processes in Erlang communicate by message passing.
+Specifically, a process can send an asynchronous message to another process' *mailbox*.
+At some future time, the receiving process may enter a *receive* clause, which searches the mailbox for the first message that matches one of a set of patterns.
+The branch of code that is executed in response to a message is dependent on the pattern that is matched.
+
+In general, an application written in Erlang is separated into two broad components: *workers* and *monitors*.
+Workers are responsible for application logic.
+Erlang offers a special function `link(Pid)` which allows one process to monitor another.
+If the process indicated by `Pid` fails, the monitor process will be notified and is expected to handle the error.
+Worker processes are "linked" by monitor processes which implement the fault-tolerance logic of the application.
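+
+The toy Python sketch below illustrates the worker/monitor split: the monitor observes a worker's failure and restarts it. Erlang's `link(Pid)` is built into the VM; here the "link" is simulated with a wrapper thread and an event, and the names (`flaky_worker`, `monitor`) are made up.
+
+```
+import threading
+import time
+
+def flaky_worker():
+    time.sleep(0.1)
+    raise RuntimeError("worker crashed")
+
+def monitor(worker_fn, restarts=2):
+    for attempt in range(restarts + 1):
+        done = threading.Event()
+        failed = []
+
+        def linked():
+            try:
+                worker_fn()
+            except Exception as e:   # the "exit signal" the monitor observes
+                failed.append(e)
+            finally:
+                done.set()
+
+        threading.Thread(target=linked).start()
+        done.wait()
+        if not failed:
+            return
+        print("attempt", attempt, ": restarting after", failed[0])
+
+monitor(flaky_worker)
+```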
+
+Erlang, first implemented in Prolog, has the features and styles of a functional programming language.
+Variables in Erlang are immutable; once assigned a value, they cannot be changed.
+Because of this, loops in Erlang are written as tail recursive function calls.
+Although this at first seems like a flawed practice (to traditional procedural programmers), tail recursive calls do not grow the current stack frame, but instead replace it.
+It is worth noting that stack frame growth is still possible, but if the recursive call is "alone" (the result of the inner function is the result of the outer function), the stack will not grow.
+
+#### Orleans (2011)
+
+Orleans is a programming model for distributed computing based on actors.
+An Orleans program can be conceptualized as a collection of actors.
+Because Orleans is intended as a model for building cloud applications, actors do not spawn independent processes as they do in Emerald or Argus.
+Rather, Orleans actors are designed to execute only when responding to requests.
+
+As in Argus, Orleans encapsulates requests (root function calls) within *transactions*.
+When processing a transaction, function calls may span many actors.
+To ensure consistency in the end result of a transaction, Orleans offers another abstraction, *activations*, to allow each transaction to operate on a consistent state of actors.
+An activation is an instance of an actor.
+Activations allow (1) consistent access to actors during concurrent transactions, and (2) high throughput when an actor becomes "hot."
+Consistency is achieved by only allowing transactions to "touch" activations of actors that are not being used by another transaction.
+High throughput is achieved by spawning many activations of an actor for handling concurrent requests.
+
+For example, suppose there is an actor that represents a specific YouTube video.
+This actor will have data fields like `title`, `content`, `num_views`, etc.
+Suppose there are concurrent requests (in turn, transactions) for viewing the video.
+In the relevant transaction, the `num_views` field is incremented.
+Therefore, in order to run the view requests concurrently, two activations (or copies) of the actor are created.
+
+Because there is concurrency within individual actors, Orleans also supports means of state reconciliation.
+When concurrent transactions modify different activations of an actor, state must eventually be reconciled.
+In the case of the above example, it may not be necessary to know immediately the exact view count of a video, but we would like to be able to know this value eventually.
+To accomplish reconciliation, Orleans provides data structures that can be automatically merged.
+As well, the developer can implement arbitrary logic for merging state.
+In the case of the YouTube video, we would want logic to determine the delta of views since the start of each activation, and add that to the actor's sum.
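+
+The following Python sketch shows one way such delta-based reconciliation could look, in the spirit of the view-count example above. The classes and merge strategy are hypothetical; they are not Orleans APIs.
+
+```
+class VideoActivation:
+    def __init__(self, base_views):
+        self.base_views = base_views   # view count when the activation started
+        self.views = base_views
+
+    def record_view(self):
+        self.views += 1
+
+    def delta(self):
+        return self.views - self.base_views
+
+class VideoActor:
+    def __init__(self):
+        self.num_views = 0
+        self.activations = []
+
+    def activate(self):
+        act = VideoActivation(self.num_views)
+        self.activations.append(act)
+        return act
+
+    def reconcile(self):
+        # Merge strategy: sum the deltas observed by each activation.
+        self.num_views += sum(a.delta() for a in self.activations)
+        self.activations = []
+
+video = VideoActor()
+a1, a2 = video.activate(), video.activate()   # two concurrent transactions
+a1.record_view(); a2.record_view(); a2.record_view()
+video.reconcile()
+print(video.num_views)   # 3
+```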
+
+### Dataflow model
+
+In the dataflow model, programs are expressed as transformations on data.
+Given a set of input data, programs are constructed as a series of transformations and reductions.
+Computation is data-centric, and expressed easily as a directed acyclic graph (DAG).
+Unlike the DSM and actor models, processes are not exposed to the programmer.
+Rather, the programmer designs the data transformations, and a system is responsible for initializing processes and distributing work across a cluster.
+
+#### MapReduce (2004)
+
+MapReduce is a model and system for writing distributed programs that is data-centric.
+Distributed programs are structured as series of *Map* and *Reduce* data transformations.
+These two primitives are borrowed from traditional functional languages, and can be used to express a wide range of logic.
+The key strength of this approach is that computations can be reasoned about and expressed easily while an underlying system takes care of the "dirty" aspects of distributed computing such as communication, fault-tolerance, and efficiency.
+
+A MapReduce program consists of a few key stages.
+First, the data is read from a filesystem or other data source as a list of key-value pairs.
+These pairs are distributed amongst a set of workers called *Mappers*.
+Each mapper processes each element in its partition, and may output zero, one, or many *intermediate* key-value pairs.
+Then, intermediate key-value pairs are grouped by key.
+Finally, *Reducers* take all values pertaining to an intermediate key and output zero, one, or many output key-value pairs.
+A MapReduce job may consist of one or many iterations of map and reduce.
+
+Crucially, for each stage the programmer is only responsible for programming the Map and Reduce logic.
+The underlying system (in the case of Google, a C++ library) handles distributing input data and *shuffling* intermediate entries.
+Optionally, the user can implement custom logic for formatting input and output data.
+
+An example program in MapReduce is illustrated below.
+First, the input file is partitioned and distributed to a set of worker nodes.
+Then, the map function transforms lines of the text file into key-value pairs in the format (\< word \>, 1).
+These intermediate pairs are aggregated by key: the word.
+In the reduce phase, the list of 1's is summed to compute a wordcount for each word.
+
+<figure class="fullwidth">
+ <img src="{{ site.baseurl }}/chapter/4/MR.png" alt="A Sample MapReduce Program" />
+</figure>
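+
+A single-process Python sketch of the same wordcount is shown below; a real framework would run the map and reduce tasks on different machines and perform the shuffle over the network.
+
+```
+from collections import defaultdict
+
+def map_fn(line):
+    # Emit an intermediate (<word>, 1) pair for every word on the line.
+    return [(word, 1) for word in line.split()]
+
+def reduce_fn(word, counts):
+    return (word, sum(counts))
+
+def mapreduce(lines):
+    # Map phase
+    intermediate = [pair for line in lines for pair in map_fn(line)]
+    # Shuffle phase: group intermediate pairs by key
+    groups = defaultdict(list)
+    for word, count in intermediate:
+        groups[word].append(count)
+    # Reduce phase
+    return [reduce_fn(word, counts) for word, counts in groups.items()]
+
+print(mapreduce(["the quick brown fox", "the lazy dog"]))
+# [('the', 2), ('quick', 1), ...]
+```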
+
+#### Discretized Streams (2012)
+
+Discretized Streams is a model for processing streaming data in real-time based on the traditional dataflow paradigm.
+Streams of data are "chunked" discretely based on a time interval.
+These chunks are then operated on as normal inputs to DAG-style computations.
+Because this model is implemented on top of the MapReduce framework Spark, streaming computations can be flexibly combined with static MapReduce computations as well as live queries.
+
+Discretized Streams (D-Streams) are represented as a series of RDD's, each spanning a certain time interval.
+Like traditional RDD's, D-Streams offer stateless operations such as *map*, *reduce*, *groupByKey*, etc., which can be performed regardless of previous inputs and outputs.
+Unlike traditional RDD's, D-Streams offer *stateful* operations.
+These stateful operations, such as *runningReduce*, are necessary for producing aggregate results for a *possibly never-ending* stream of input data.
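+
+The toy Python sketch below illustrates the two ideas: events are discretized into per-interval batches, and a running (stateful) aggregate is carried across batches. The function names (`discretize`, `running_reduce`) are illustrative, not the actual D-Streams API.
+
+```
+def discretize(events, interval):
+    # events: list of (timestamp, value); group them into interval-sized chunks.
+    batches = {}
+    for t, v in events:
+        batches.setdefault(int(t // interval), []).append(v)
+    return [batches[k] for k in sorted(batches)]
+
+def running_reduce(batches, func, state=0):
+    # Stateless work happens per batch; the state is carried across batches.
+    for batch in batches:
+        for v in batch:
+            state = func(state, v)
+        yield state
+
+events = [(0.1, 2), (0.4, 3), (1.2, 5), (2.7, 1)]
+print(list(running_reduce(discretize(events, 1.0), lambda a, b: a + b)))
+# [5, 10, 11] -- one running total per one-second interval
+```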
+
+Because the inputs are not known *a priori*, fault tolerance in streaming systems must behave slightly differently.
+For efficiency, the system periodically creates a checkpoint of intermediate data.
+When a node fails, the computations performed since the last checkpoint are remembered, and a new node is assigned to recompute the lost partitions from the previous checkpoint.
+Two other approaches to fault tolerance in streaming systems are replication and upstream backup.
+Replication is not cost effective as every process must be duplicated, and does not cover the case of all replicas failing.
+Upstream backup is slow as the system must wait for a backup node to recompute everything in order to recover state.
+
+#### GraphX (2013)
+
+Many real world problems are expressed using graphs.
+GraphX is a system built on top of the Spark MapReduce framework {% cite zaharia2012resilient --file dist-langs.bib %} that exposes traditional graph operations while internally representing a graph as a collection of RDD's.
+GraphX exposes these operations through what it calls a Resilient Distributed Graph (RDG).
+Internally, an RDG is a collection of RDD's that define a vertex-cut of a graph {% cite gonzalez2012powergraph --file dist-langs.bib %}.
+Because they are built on top of RDD's, RDG's inherit immutability.
+When a transformation is performed, a new graph is created.
+In this way, fault tolerance in GraphX can be executed the same way as it is in vanilla Spark; when a fault happens, the series of computations is remembered and re-executed.
+
+A key feature of GraphX is that it is a DSL implemented as a library on top of a general-purpose framework.
+Because it uses the general purpose computing framework of Spark, arbitrary MapReduce jobs may be performed in the same program as more specific graph operations.
+In other graph-processing frameworks, results from a graph query would have to be written to disk to be used as input to a general purpose MapReduce job.
+
+With GraphX, if you can structure your application logic as a series of graph operations, an implementation may be created on top of RDD's.
+Because many real-world applications, like social media "connections," are naturally expressed as graphs, GraphX can be used to create a highly scalable, fault-tolerant implementation.
+
+## Comparing Design
+
+Here, we present a taxonomy which can be used to classify each of the examples.
+Among these distributed systems, there appear to be three major defining characteristics: level of implementation, granularity, and level of abstraction.
+Importantly, these characteristics are orthogonal and each present an opportunity for a design decision.
+
+### Level of Implementation
+
+The programming model exposed by each of the examples is implemented at some level in the computer system.
+In Mirage, the memory management needed to enable DSM is implemented within the operating system.
+Mirage's model of DSM lends itself to an OS implementation because data is shared by address.
+In other systems, such as Orca, Argus, and Erlang, the implementation is at the compiler level.
+These are languages that support distribution through syntax and programming style.
+Finally, some systems are implemented as libraries (e.g. Linda, MapReduce).
+In such cases, the underlying language is powerful enough to support the desired operations.
+The library is used to ease programmer burden and supply domain-specific syntax.
+
+The pros and cons of different implementation strategies are discussed further under *DSL's as Libraries*.
+
+### Granularity of Logic and State
+
+The granularity of logic and state is another major characteristic of these systems.
+Generally, the actor and DSM models can be considered fine grain while the dataflow model is coarse grain.
+
+The actor and DSM models can be considered fine grain because they can be used to define the logic and states of individual workers.
+Under the DSM model, an application may be composed of many separately compiled programs.
+Each of these programs captures a portion of the logic, communication being done through shared memory regions.
+Under the actor model, actors or objects may be used to wrap separate logic.
+These units of logic communicate through RPC or message-passing.
+The benefit to using the actor or DSM model is the ability to wrap unique, cohesive logic in modules.
+Unfortunately, this means that the application or library developer is responsible for handling problems such as scaling and process location.
+
+The dataflow model can be considered coarse grain because the logic of every worker is defined by a single set of instructions.
+Workers proceed by receiving a partition of the data and the program, and executing the transformation.
+Crucially, each worker operates under the same logic.
+
+To borrow an idea from traditional parallel programming, the actor/DSM model implements a *multiple instruction multiple data* (MIMD) architecture, whereas the dataflow model implements a *single instruction multiple data* (SIMD) architecture.
+
+### Level of Abstraction
+
+Each of the examples of systems for distributed computing offers a different level of abstraction from problems like partial failure.
+Depending on the requirements of the application, it may be sufficient to let the system handle these problems.
+In other cases, it may be necessary to be able to define custom logic.
+
+In some systems, like Emerald, it is possible to specify the location of processes.
+When the communication patterns of the application are known, this can allow for optimizations on a per-application basis.
+In other systems, the system handles the resource allocation of processes.
+While this may ease development, the developer must trust the system to make the decision.
+
+The actor and DSM models require the application developer to create and destroy individual processes.
+This means that either the application must contain logic for scaling, or a library must be developed to handle scaling.
+Systems that implement the dataflow model handle scaling automatically.
+An exception to this rule is Orleans, which follows the actor model but handles scaling automatically.
+
+Finally, fault tolerance is abstracted to varying degrees in these systems.
+Argus exposes fault tolerance to the programmer; object data members are labeled as volatile or stable.
+Periodically, these stable members are serialized to disk and can be used for object reconstruction.
+Other systems, especially those based on dataflow, fully abstract the problem of partial failure.
+
+## Thoughts on System Design
+
+### Domain-Specific Languages
+
+The definition of a domain-specific language is a hot topic and there have been several attempts to concretely define what exactly *it* is.
+
+Here is the definition as given by {% cite Mernik2005 --file dist-langs.bib %}:
+
+> Domain-specific languages are languages tailored to a specific application domain.
+
+Another definition is offered (and commonly cited) by {% cite Deursen2000 --file dist-langs.bib %}:
+
+> A domain-specific language is a programming language or executable specification language that offers, through appropriate notations and abstractions, expressive power focused on, and usually restricted to, a particular problem domain.
+
+Generally, I would refer to a domain-specific language (DSL) as a *system*, be it a standalone language, compiler extension, library, set of macros, etc., that is designed for a set of cohesive operations to be easily expressed.
+
+For example, the Python Twitter library is designed for easily expressing operations that manage a Twitter account.
+
+The problem in defining this term (I believe) is the vagueness of the components *domain* and *language*.
+Depending on the classification, a set of problems designated in a certain domain may span a "wide" or "narrow" scope.
+For example, does "tweeting" qualify as a domain (within the twitter library)?
+Would "social media sharing" qualify as a domain (containing the twitter library)?
+For my purposes I will accept the definition of a domain as a "well-defined, cohesive set of operations."
+
+It is also difficult to come up with a definition for a language.
+A language may be qualified if it has its own compiler.
+An orthogonal definition qualifies a language by its style, as in the case of sets of macros.
+This confusion is why I adopt the even more vague term *system* in my own definition.
+
+### Distribution as a Domain
+
+Given the examples of models and systems, I think it is reasonable to qualify distribution as a domain.
+Distributed computing has a unique set of problems, such as fault-tolerance, nondeterminism, network partitioning, and consistency, that distinguish it from parallel computing on a single machine.
+The languages, systems, and models presented here all are built to assist the developer in dealing with these problems.
+
+### DSL's as Libraries
+
+The examples given above demonstrate a trend.
+At first, systems designed to tackle the distribution domain were implemented as stand-alone languages.
+Later, these systems appear as libraries built on top of existing general-purpose languages.
+For many reasons, this style of system development is superior.
+
+#### Domain Composition
+
+Systems like GraphX and Spark Streaming demonstrate a key benefit of developing DSL's as libraries: composition.
+When DSL's are implemented on top of a common language, they may be composed.
+For example, a C++ math library may be used along with MapReduce to perform complex transformations on individual records.
+If the math library and MapReduce were individual languages with separate compilers, composition would be difficult or impossible.
+Further, the GraphX system demonstrates that domains exist at varying degrees of generality, and that building the library for one domain on top of another may result in unique and efficient solutions.
+DSL's that are implemented as full languages with unique compilers are unattractive because existing libraries that handle common tasks must be re-written for the new language.
+
+#### Ecosystem
+
+Another problem that drives DSL development towards libraries is ecosystem.
+In order for a DSL to be adopted, there must be a body of developers that can incorporate the DSL into existing systems.
+If either (1) the DSL does not incorporate well with existing code bases or (2) the DSL requires significant investment to learn, adoption will be less likely.
+
## References
-{% bibliography --file dist-langs %} \ No newline at end of file
+{% bibliography --file dist-langs %}
diff --git a/chapter/5/langs-extended-for-dist.md b/chapter/5/langs-extended-for-dist.md
index feb4694..70a33fc 100644
--- a/chapter/5/langs-extended-for-dist.md
+++ b/chapter/5/langs-extended-for-dist.md
@@ -1,10 +1,271 @@
---
layout: page
title: "General Purpose Languages Extended for Distribution"
-by: "Joe Schmoe and Mary Jane"
+by: "Sam Caldwell"
---
-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. {% cite Uniqueness --file langs-extended-for-dist %}
+## Introduction
+
+In very general terms, a distributed system is composed of nodes
+that internally perform computation and communicate with each other.
+Therefore programming a distributed system requires two distinct
+models: one for the computation on each node and one to model the
+network of communications between nodes.
+
+A slightly secondary concern is fault-tolerance. Failure of nodes and communication links is one of the defining aspects of distributed computing {% cite note-on-dc --file langs-extended-for-dist %}. A programming model for distributed systems must therefore either implement a strategy for handling failures or equip programmers to design their own.
+
+Nodes can perform computation in any of the various paradigms
+(imperative, functional, object-oriented, relational, and so on). The
+models are not completely orthogonal. Some matters necessarily
+concern both. Serialization is a communication concern that is greatly
+influenced by the design of the computation language. Similar concerns
+affect the means of deploying and updating systems.
+
+Early designs of distributed programming models in the late 1970s focused on *building novel programming languages* (Eden, Argus, Emerald). As time has gone on, researchers have shifted towards *extending existing languages* with facilities for programming distributed systems. This article explores the history of these designs and the tradeoffs to be made between the two approaches.
+
+## Languages and Libraries
+
+The different approaches to implementing a distributed programming model differ on the tradeoffs they offer to both language designers and users.
+
+### Clean-slate language implementations
+A straightforward approach to implementing a new language is to start from scratch. Beginning with a clean slate offers several advantages. First, the implementor has complete control (up to what they are willing and able to implement) over every aspect of the design. Second, some elements of language design are difficult to integrate with other languages. Type systems are a good example of the problem. How to combine the designs of two type systems into one that has the properties of both is an open problem. If types play a prominent role in the language, then starting anew may be the only option. However, as explained below, this strategy has apparent drawbacks not only in terms of implementation effort for the creator(s) but also for users.
+
+### Extension to an existing language
+Another option is to use another language as a starting point and modify that language's implementation for your own purposes. Think of it as forking a compiler and/or runtime and then making your own modifications. An advantage of this approach is the savings in implementation effort. If the computation models coincide, then the language designer only has to implement the communication model and fault-tolerance strategy. Another plus is that users of the existing language may be more likely to consider trying or adopting the new language since it is only a step away from what they already know. A downside of maintaining any fork is keeping up to date with upstream changes.
+
+### Library
+The final approach to consider is similar to extending a language, but this time doing so only by writing code in that language to create new abstractions implementing a language model. This strategy offers similar savings in implementation effort to extending a language (perhaps more so). But a major benefit is that it is significantly easier for users to adopt; the only new concepts they must learn are the specifics of the programming model, as opposed to other concerns such as syntax.
+
+## Roadmap
+Language designers have explored a broad spectrum of designs for distributed programming models.
+
+### Actors
+One common-sense and popular approach to extending a language for
+distribution is to implement the constructs of the Actor model (discussed in chapter 3). This
+is the approach taken by Termite Scheme, CloudHaskell, and Scala
+Actors, to name a few. Doing so requires two steps: first, implementing facilities
+for spawning and linking actors and sending and receiving messages; and second, fitting actor-style concurrency on top of the language's concurrency model.
+
+
+Erlang/OTP {% cite Erlang --file langs-extended-for-dist %} is a language designed with the intention of
+building resilient distributed applications for operating telephony
+switches. Erlang provides a simple base language and an extensive
+library of facilities for distribution. Erlang/OTP fits into the
+Actor model of distribution: the basic agents in a system are
+processes which communicate by sending messages to each other.
+Erlang/OTP stands out for the extent to which fault tolerance is
+considered. Erlang programmers are encouraged to think about failure
+as a routine occurrence, and the platform provides libraries for describing policies
+on how to handle such failures. Chapter 3 includes a more detailed overview of Erlang.
+
+TermiteScheme {% cite TermiteScheme --file langs-extended-for-dist %} is an extension to the Scheme
+language with constructs for Erlang-style distribution. The primary
+innovations of Termite come from leveraging features of the host
+language. In particular, the features of interest are macros and
+continuations.
+
+Macros allow users or library developers to design higher-level abstractions than
+the actor model and treat them as first-class in the language. Macros are a very powerful tool for abstraction. They allow library authors to elevate many patterns, such as patterns of communication, to first-class constructs. A simple example is a construct for RPC implemented by TermiteScheme as a macro expanding to a send followed by a receive.
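+
+Written out by hand in Python, the RPC pattern looks like the sketch below: an RPC is a send that carries a reply mailbox, followed by a blocking receive on that mailbox. In Termite this send-then-receive pair is packaged up once, as a macro; the names here (`serve`, `rpc`) are purely illustrative.
+
+```
+import queue
+import threading
+
+def serve(mailbox):
+    while True:
+        request, reply_to = mailbox.get()
+        if request is None:
+            return
+        reply_to.put(request * 2)          # the "remote procedure" being invoked
+
+def rpc(mailbox, request):
+    reply_to = queue.Queue()
+    mailbox.put((request, reply_to))       # send ...
+    return reply_to.get()                  # ... then receive the reply
+
+server_mailbox = queue.Queue()
+threading.Thread(target=serve, args=(server_mailbox,), daemon=True).start()
+print(rpc(server_mailbox, 21))             # 42
+server_mailbox.put((None, None))           # shut the server down
+```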
+
+A continuation is a concrete representation of how to finish computation of a program. A helpful analogy is to the call-stack in a procedural language. The call-stack tells you, once you’ve finished the current procedure, what to do next. Likewise, a continuation tells you, once you’ve finished evaluating the current expression, what to do next. Languages with first-class continuations have a way of reifying this concept, historically named `call/cc`, an abbreviation for `call-with-current-continuation`. First-class continuations allow for simple process migration; a process can capture its continuation and use that as the behavior of a spawned
+actor on another node.
+
+In addition to supporting the classic actor style which places few if
+any constraints on what messages may be sent, Haskell and Scala based
+implementations leverage the host language's powerful type system to
+create more structured communication patterns. Both CloudHaskell and
+Akka, the successor to Scala Actors, provide *typed
+ channels* between actors. For example, the type
+checker will reject a program where an actor expecting to receive
+numbers is instead sent a string. Anecdotally, typed actors in Akka are not commonly used. This might suggest that errors due to incorrect message types are not that serious of a concern for users.
+
+One disadvantage of this approach, implementing only Erlang's simple actor model, is that Erlang also provides an extensive support platform for creating and deploying distributed systems. Even after the Erlang model has been implemented in Scheme, it is a long road to feature parity.
+
+### Types
+
+Several efforts have explored the interplay between statically typed
+languages and distributed computing. Research is focused on extending functional languages like SML, Haskell, and Scala which feature relatively advanced type systems. Areas of investigation include how to integrate an existing model of distribution with a type system and how to take advantage of types to help programmers.
+
+CloudHaskell {% cite CloudHaskell --file langs-extended-for-dist %} is an extension to Haskell.
+CloudHaskell is largely designed in the mold of
+Erlang-style process-based actors. It offers a novel typeclass-based approach for
+ensuring safe serialization (as in only serializing things known to be
+serializable). Typeclasses {% cite typeclasses --file langs-extended-for-dist %} are a feature of Haskell's type system that enables flexible overloading. Typeclass *instances* can be *derived automatically* {% cite deriving-typeclasses --file langs-extended-for-dist %} to reduce boilerplate. For example, the code to perform serialization and deserialization can be automatically generated for each kind of message using its definition. CloudHaskell takes the stance that the cost of communication operations should be readily apparent to the programmer. A consequence of this is that calls to the (automatically generated) serialization functions for messages must be explicitly inserted into the code. For example, where in Erlang we might write `send msg`, in CloudHaskell we write `send (serialize msg)`. CloudHaskell is implemented as an extension to GHC Haskell (it was first implemented as a library, and later the features it needed were added as extensions to GHC).
+
+The fact that CloudHaskell allows concurrency within an actor demonstrates the separation between computation and communication models. That is, in CloudHaskell a single process might internally be implemented by a group of threads. To the rest of the system this fact is completely opaque (a good thing).
+
+CloudHaskell also takes a stance against "everything is serializable"; for example, mutable reference cells should not be. This makes serializing functions (closures) tricky because it is not apparent how to tell if a closure's environment is serializable. To solve this, CloudHaskell uses *static pointers* and forces closures to use pre-serialized environments. Pre-serialization means that a closure is a pair of a closed expression and a byte string representing the environment; presumably the first code the closure will execute when invoked is to deserialize the environment to retrieve the proper information.
+
+ML5 {% cite ML5 --file langs-extended-for-dist %} is an extension to
+the SML {% cite sml --file langs-extended-for-dist %} programming language. ML5 uses the notion of
+*location* and *movability*. The type system keeps track
+of the locations (nodes) in the system, which location each piece of
+the program is, and what can and cannot be moved between locations. As
+a result, the system is able to detect unpermitted attempts to access a resource. The ML5 type system cleanly integrates with the ML type system. ML5 supports type inference, a tricky but fundamental feature of SML.
+
+Though applicable to distributed systems in general, the presentation of ML5 focuses in particular on client-server interactions, and more specifically on those between a web front-end and back-end. The ML5 compiler is able to generate Javascript code that executes in-browser.
+
+An example of some ML5 code (from {% cite ML5 --file langs-extended-for-dist %}) follows:
+
+```
+extern javascript world home
+extern val alert : string -> unit @ home
+do alert [Hello, home!]
+```
+
+This snippet declares a location in the system called `home` that can execute Javascript code. Additionally, there is a function named `alert` that can be used from the location `home` (defined through an interface to plain Javascript). The last line is an example of calling the `alert` function with the string `"Hello, home!"` (yes, strings are in brackets).
+
+A slightly more complex example (again from {% cite ML5 --file langs-extended-for-dist %}) demonstrates movement between locations:
+```
+extern bytecode world server
+extern val server : server addr @ home
+extern val version : unit -> string @ server
+extern val alert : string -> unit @ home
+do alert (from server get version ())
+```
+This example features two locations (worlds), the default `home` location and one named `server`. As before, we can call `alert` from `home`, but now we can also call `version` on `server`. Locations access one another using addresses; the declaration of `server` as a `server addr @ home` says that the home location knows the address of the `server` location and can therefore access `version`.
+
+ML5 does not consider fault-tolerance. For the web front-end/back-end examples this is perhaps not too big of an issue. Still, the absence of a compelling strategy for handling failures is a shortcoming of the design. Note that in the examples above, `home` is the default location, in the same way that `main` is the default entry point for C programs, so these programs typecheck; an error would be raised if the `alert` function were called from a different location.
+
+AliceML {% cite Alice --file langs-extended-for-dist %} is an
+example of another extension to SML. AliceML leverages SML's advanced module
+system to explore building *open* distributed systems. Open
+systems are those where each node is only loosely coupled to its
+peers, allowing for dynamic updates and replacement of components. AliceML
+enables this by forcing components to program to interfaces and
+dynamically type-checking components as they enter the system.
+
+### Objects
+Object-Oriented languages have been extended for distribution in
+various fashions. Objects are an appealing model for agents in a
+distributed system. Indeed, Alan Kay's metaphor of objects as
+miniature computers and method invocation as message passing
+{% cite smalltalkHistory --file langs-extended-for-dist %} seems directly evocative of distributed
+systems.
+
+Eden {% cite Eden --file langs-extended-for-dist %} was an early pioneer (the project began in 1979) in
+both distributed and object-oriented programming languages. In fact,
+Eden can be classified as a language extended for distribution: Eden
+applications “were written in the Eden Programming Language (EPL) — a
+version of Concurrent Euclid to which the Eden team had added
+support for remote object invocation” {% cite bhjl07 --file langs-extended-for-dist %}. However, Eden
+did not take full advantage of the overlap (namely, objects) between
+the computation and distribution models.
+
+See Chapter 4 for a more extensive overview of the languages using an object-based model for distribution, such as Emerald, Argus, and E.
+
+### Batch Processing
+Another common extension is in the domain of batch processing
+large-scale data. MapReduce (and correspondingly Hadoop) is the landmark example of a programming model for distributed batch processing. Essentially, batch processing systems present a restricted programming model. For example, MapReduce programs must fit into the rigid two-step model of produce then aggregate. A restricted model allows the system to make more guarantees, such as for fault tolerance. Subsequently, language designers have sought to increase the expressiveness of the programming models and to boost performance. Chapter 8 covers batch processing languages in more detail.
+
+MBrace {% cite MBrace --file langs-extended-for-dist %} is an extension to the F# programming language for writing computations for clusters. MBrace provides a *monadic* interface, in which cluster computations are first-class values that can be sequenced and combined, allowing cluster jobs to be built out of smaller jobs while still exploiting available parallelism. MBrace is a framework that features its own runtime for provisioning, scheduling, and monitoring nodes in a cluster.
+
+Batch processing systems offer an interesting take on how to handle
+fault tolerance. The common approach taken is to use a central
+coordinator (for example, on the machine that initiated the job) to
+detect the failure of nodes. By tracking what every other node in the
+system is doing, the coordinator can restart a task on failure.
+
+### Consistency
+
+Several languages explore designs for ensuring the *consistency* of distributed applications, that is, the requirement that all nodes agree on some state. CRDTs are a family of distributed data structures that maintain consistency. Chapter 6 explains consistency and CRDTs in further detail, while Chapter 7 takes a closer look at the following languages.
+
+Dedalus {% cite Dedalus --file langs-extended-for-dist %} and Bloom {% cite Bloom --file langs-extended-for-dist %} both represent logic-programming-based attempts to design languages providing consistency guarantees. Dedalus is meant to provide the underlying model of distribution, while Bloom provides a high-level language intended to be usable for creating applications. Logic programming is an attractive model because execution is order-independent. Using Datalog, a logic-programming language, as the computation model encourages programmers to develop applications that are agnostic to message reorderings. Bloom is implemented as a Ruby library called Bud.
+
+Lasp {% cite Lasp --file langs-extended-for-dist %} is a language model in which CRDTs are the only form of distributed data. CRDT sets are the primary data structure, and Lasp programs are built by composing set transformations such as maps, filters, and folds.
+
+Lasp is implemented as an Erlang library. In practice, applications using Lasp are free to use other forms of distributed data, but consistency is only promised for the library-provided tools. Lasp nodes then share the same fault-tolerance platform as Erlang applications.
+
+Lasp’s programming model is quite restrictive: sets are the primary data structure, and programs are limited to transformations such as maps, filters, and folds over them. Future work may show how to enrich the programming model while still making the same strong consistency guarantees.
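+
+A toy sketch of this style of composition (illustrative Python, not Lasp's actual Erlang API): a grow-only set whose derived views are recomputed over the merged state, so two replicas that diverge and then merge agree on the composed result.
+
+```python
+class GSet:
+    """Grow-only set: the merge of two replicas is simply their union."""
+    def __init__(self, items=None):
+        self.items = set(items or [])
+    def add(self, x):
+        self.items.add(x)
+    def merge(self, other):
+        return GSet(self.items | other.items)
+    def map(self, f):
+        return GSet({f(x) for x in self.items})
+    def filter(self, pred):
+        return GSet({x for x in self.items if pred(x)})
+
+a, b = GSet({1, 2}), GSet({2, 3})
+a.add(4); b.add(5)                 # the replicas diverge independently
+merged = a.merge(b)                # {1, 2, 3, 4, 5}
+view = merged.filter(lambda x: x % 2 == 0).map(lambda x: x * 10)
+print(sorted(view.items))          # [20, 40] -- the derived view is a function of the merged state
+```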
+
+### Tuplespaces
+Linda {% cite Linda --file langs-extended-for-dist %} is a distributed programming model more in the vein of shared memory than message passing. Linda programs are a collection of processes and a shared *tuplespace*. The tuplespace holds an unordered collection of tuples. A process adds data to the tuplespace using the `out` form. The `in` form takes a *pattern*, then removes and returns a tuple matching the pattern from the tuplespace, blocking until such a tuple appears.
+
+A simple example demonstrating communication in Linda:
+
+```
+;; INITIALLY the tuplespace is empty
+;; Process A
+out ("Hello", "from process A");
+
+;; after Process A executes the tuplespace contains one record, ("Hello", "from process A")
+
+;; Process B
+in("Hello", x);
+;; x = "from process A"
+;; the tuplespace is empty again
+```
+
+If the tuplespace were not empty but instead contained tuples of some form *other than* 2-entry tuples whose first element was the string `"Hello"`, the above interaction would remain unchanged. However, if some process C had entered the tuple `("Hello", "I am process C")`, then
+B would receive either A’s or C’s tuple, with the other tuple remaining in the tuplespace.
+
+The Linda model of distribution has several advantages. Linda processes are not only spatially decoupled (able to execute on different machines, as in actor implementations) but *temporally* decoupled as well. That is, a Linda process can communicate with other processes even after it exits! Tuples remain in the shared space even after the process that created them exits, so a different process can receive those tuples at any point in the future. Another point is that Linda processes don’t need to know the identities of the other processes they communicate with. Unlike actors, which need to know the address of another actor’s mailbox in order to send it a message, Linda processes operate entirely through their connection to the tuplespace. Processes need to “know exactly as much about one another as is appropriate for the programming situation at hand” {% cite Linda --file langs-extended-for-dist %}.
+
+Linda is agnostic to the internal computation language of each node. Implementations of tuplespaces have been built on top of C and Fortran, as well as a little-used Java implementation named JavaSpaces. (There is reportedly at least one financial institution in the UK that has some JavaSpaces-based systems.) The choice of C and Fortran may be why Linda did not gain more popularity: both languages are low-level and not well-suited for creating new abstractions (`void*` is just about the only abstraction mechanism in C).
+
+Maintaining the consistency of the tuplespace is of paramount importance for Linda implementations. Another concern is the detection of process crashes. Unfortunately, presentations of Linda have not clearly addressed these concerns.
+
+## Discussion
+
+The line between a language and a library can be extremely blurry
+{% cite LanguagesAsLibraries --file langs-extended-for-dist %}. A language provides some building blocks
+and forms for combining and composing {% cite 700pl --file langs-extended-for-dist %}. For example, numbers, strings, and Booleans are common primitive building blocks. Operations like addition, concatenation, and negation offer ways of combining such blocks. More interesting operations allow *abstraction* over parts of the language: function (or λ) abstraction allows for creating operations over all numbers. Objects provide a way of combining data and functions in a way that abstracts over particular implementation details. Most importantly, a language
+defines the *semantics*, or meaning, of such operations.
+
+Many libraries can
+be described in the same way. The first-order values provided by a
+library are the primitives while the provided functions, classes, or objects perform
+combination and composition. A library can implement the primitives
+and operations such that they are in correspondence with the forms and semantics defined by a language.
+
+A theme in the literature is presenting an idea as a language and
+implementing it as a library or extension to an existing language.
+Doing so allows the authors to analyze a minimal presentation of their
+idea, but enjoy the benefits of the library/extension approach. Both Linda and Lasp take this approach, for example.
+
+Language models benefit from implementations {% cite Redex --file langs-extended-for-dist %}. The
+reasons to implement a new distributed language model on top of
+existing work are abundant:
+
+* As mentioned above, a distributed language includes both a model
+  for single-node computation and a model for inter-node communication.
+  Building a distributed language from scratch thus also means
+  re-implementing single-node computation, and we would rather re-use existing language implementations.
+
+* Implementing real-world applications calls for models of many
+ domains besides computation and communication, such as persistence
+ and parsing. The availability of a rich repository of libraries is
+ an important concern for many users in adopting a new technology
+ {% cite socioPLT --file langs-extended-for-dist %}.
+
+* Many applications of distributed systems are extremely
+ performance sensitive. Performance engineering and fine-tuning are
+ time and labor intensive endeavors. It makes sense to re-use as much
+ of the work in existing language ecosystems as possible.
+
+These reasons form the foundation of an argument in favor of general-purpose programming languages that allow for the creation of rich abstractions. Such a language can be adapted for different models of computation (although most general-purpose languages fix a computation model, such as functional or object-oriented). When models are implemented as libraries in a common language, users can mix and match different models to get exactly the right behavior where the application needs it.
+
+One can argue that one reason the Linda model failed to catch on is that the primary implementations were extensions to C and Fortran. While general-purpose, C and Fortran suffer from a paucity of options for creating abstractions.
+
+That being said, there are advantages to clean-slate languages. A fresh start grants the designer full control over every part of the language. For some areas, such as novel type systems, this is almost a necessity; however, we have seen designs like ML5 that elegantly integrate with a base language's type system. Finally, a language designer can put their new ideas center stage, making it easy for people to see what is new and cool; in time those ideas may spread to other languages.
+
+Erik Meijer points out in {% cite salesman --file langs-extended-for-dist %} that programmers don't jump to new technologies to access new features. Rather, in time those features make their way to old technologies. Examples abound, such as the arrival of λ in Java and of course all of Meijer's work in C# and Visual Basic.
+
+Ideas from distributed programming models have influenced more than just the languages used in practice. Microservices {% cite microservices --file langs-extended-for-dist %} are a popular technique for architecting large systems as a collection of manageable components. Microservices apply the lessons learned from organizing programs
+around actors to organizing the entire system. Actor programming uses
+shared-nothing concurrent agents and emphasizes fault-tolerance.
+Building a system using microservices means treating an entire block
+of functionality as an actor: it can exchange messages with the other
+components, but it can also go on- and offline (or crash) independently
+of the rest of the system. The result is a more resilient and modular system.
+
+## Conclusion
+
+All design lies on a spectrum of tradeoffs; to gain convenience in one
+area means to sacrifice in another. Distributed systems come with a
+famous trade-off: the CAP theorem. Building new language models on top
+of an existing, flexible general-purpose language is an attractive
+option for both implementors and users.
## References
diff --git a/chapter/6/acidic-to-basic-how-the-database-ph-has-changed.md b/chapter/6/acidic-to-basic-how-the-database-ph-has-changed.md
new file mode 100644
index 0000000..ffc94c0
--- /dev/null
+++ b/chapter/6/acidic-to-basic-how-the-database-ph-has-changed.md
@@ -0,0 +1,182 @@
+---
+layout: page
+title: "ACIDic to BASEic: How the database pH has changed"
+by: "Aviral Goel"
+---
+
+## 1. The **ACID**ic Database Systems
+
+Relational Database Management Systems are the most ubiquitous database systems for persisting state. Their properties are defined in terms of transactions on their data. A database transaction can be either a single operation or a sequence of operations, but is treated as a single logical operation on the data by the database. The properties of these transactions provide certain guarantees to the application developer. The acronym **ACID** was coined by Andreas Reuter and Theo Härder in 1983 to describe them.
+
+* **Atomicity** guarantees that any transaction will either complete or leave the database unchanged. If any operation of the transaction fails, the entire transaction fails. Thus, a transaction is perceived as an atomic operation on the database. This property is guaranteed even during power failures, system crashes and other erroneous situations.
+
+* **Consistency** guarantees that any transaction will always result in a valid database state, i.e., the transaction preserves all database rules, such as unique keys.
+
+* **Isolation** guarantees that concurrent transactions do not interfere with each other. No transaction views the effects of other transactions prematurely. In other words, they execute on the database as if they were invoked serially (though a read and write can still be executed in parallel).
+
+* **Durability** guarantees that upon the completion of a transaction, the effects are applied permanently on the database and cannot be undone. They remain visible even in the event of power failures or crashes. This is done by ensuring that the changes are committed to disk (non-volatile memory).
+
+<blockquote><p><b>ACID</b>ity implies that if a transaction is complete, the database state is structurally consistent (adhering to the rules of the schema) and stored on disk to prevent any loss.</p></blockquote>
+
+Because of these strong guarantees, this model simplifies the life of the developer and has traditionally been the go-to approach in application development. It is instructive to examine how these properties are enforced.
+
+Single-node databases can simply rely upon locking to ensure *ACID*ity. Each transaction marks the data it operates upon, enabling the database to block other concurrent transactions from modifying the same data. The lock has to be acquired both while reading and while writing data. The locking mechanism enforces a strict, linearizable consistency: all transactions are performed in a particular sequence and invariants are always maintained. An alternative, *multiversioning*, allows a read and a write operation to execute in parallel. Each transaction that reads data from the database is provided the earlier, unmodified version of any data that is being modified by a concurrent write. This means that read operations don't have to acquire locks: reads execute without blocking writes, and writes execute without blocking reads.
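+
+As a rough illustration of the multiversioning idea (a minimal in-memory sketch, not how any particular database implements it): each write creates a new timestamped version, and a reader sees only versions no newer than its snapshot, so reads never block on writes.
+
+```python
+class MVStore:
+    def __init__(self):
+        self.versions = {}   # key -> list of (commit_ts, value)
+        self.clock = 0
+    def write(self, key, value):
+        self.clock += 1
+        self.versions.setdefault(key, []).append((self.clock, value))
+    def read(self, key, snapshot_ts):
+        # only versions committed at or before the reader's snapshot are visible
+        visible = [(ts, v) for ts, v in self.versions.get(key, []) if ts <= snapshot_ts]
+        return max(visible)[1] if visible else None
+
+db = MVStore()
+db.write("x", 1)
+snapshot = db.clock            # a reader starts here and remembers the clock
+db.write("x", 2)               # a concurrent writer commits a newer version
+print(db.read("x", snapshot))  # 1 -- the reader is not blocked and still sees its snapshot
+```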
+
+This model works well on a single node, but it exposes a serious limitation when too many concurrent transactions are performed. A single database server can only process so many concurrent read operations, and the situation worsens when many concurrent write operations are performed: to guarantee *ACID*ity, the writes must be performed in sequence, so the last write request may have to wait an arbitrary amount of time, a totally unacceptable situation for many real-time systems. This forces the application developer to decide on a **scaling** strategy.
+
+### 1.1. Scaling transaction volume
+
+To increase the volume of transactions against a database, two scaling strategies can be considered:
+
+* **Vertical Scaling** is the easiest approach to scaling a relational database: the database is simply moved to a larger computer that provides more transactional capacity. Unfortunately, it's far too easy to outgrow the capacity of the largest system available, and it is costly to purchase a bigger system each time that happens. Since such systems tend to be specialized hardware, vendor lock-in adds further costs.
+
+* **Horizontal Scaling** is a more viable option and can be implemented in two ways. Data can be segregated into functional groups spread across databases; this is called *functional scaling*. Data within a functional group can be further split across multiple databases, enabling functional areas to be scaled independently of one another for even more transactional capacity; this is called *sharding*.
+
+Horizontal scaling through functional partitioning enables a high degree of scalability. However, the functionally separate tables may be related by constraints such as foreign keys. For these constraints to be enforced by the database itself, all tables have to reside on a single database server, which limits horizontal scaling. To work around this limitation, the tables in a functional group have to be stored on different database servers, but then a single database server can no longer enforce constraints between the tables. In order to ensure the *ACID*ity of distributed transactions, distributed databases employ a two-phase commit (2PC) protocol.
+
+* In the first phase, a coordinator node interrogates all other nodes to ensure that a commit is possible. If all databases agree then the next phase begins, else the transaction is canceled.
+
+* In the second phase, the coordinator asks each database to commit the data.
+
+2PC is a blocking protocol, and updates can take from a few milliseconds up to a few minutes to commit. While a transaction is being processed, other transactions are blocked, and so is the application that initiated the transaction. Another option is to handle consistency across databases at the application level, but this only complicates the situation for the application developer, who is likely to end up implementing a similar strategy if *ACID*ity is to be maintained.
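+
+A minimal sketch of the coordinator's two phases (illustrative only: in-process stand-ins for the participating databases, with no network, timeouts, or recovery logging):
+
+```python
+class Participant:
+    """Stand-in for a database holding part of the distributed transaction."""
+    def __init__(self, name, can_commit=True):
+        self.name, self.can_commit = name, can_commit
+        self.state = "init"
+    def prepare(self):                       # phase 1: vote yes/no
+        self.state = "prepared" if self.can_commit else "aborted"
+        return self.can_commit
+    def commit(self):                        # phase 2: make the change durable
+        self.state = "committed"
+    def abort(self):
+        self.state = "aborted"
+
+def two_phase_commit(participants):
+    votes = [p.prepare() for p in participants]   # phase 1: collect all votes
+    if all(votes):
+        for p in participants:                    # phase 2: everyone commits
+            p.commit()
+        return True
+    for p in participants:                        # any "no" vote aborts the whole transaction
+        p.abort()
+    return False
+
+nodes = [Participant("orders"), Participant("payments", can_commit=False)]
+print(two_phase_commit(nodes))       # False
+print([p.state for p in nodes])      # ['aborted', 'aborted']
+```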
+
+## 2. The Distributed Concoction
+
+A distributed application is expected to have the following three desirable properties:
+
+1. **Consistency** - This is the guarantee of total ordering of all operations on a data object such that each operation appears indivisible. This means that any read operation must return the most recently written value. This provides a very convenient invariant to the client application. This definition of consistency is the same as the **Atomic**ity guarantee provided by relational database transactions.
+
+2. **Availability** - Every request to a distributed system must result in a response. However, this is too vague a definition. Whether a node failed while responding, ran a very long computation to generate a response, or whether the request or response was lost due to network issues is generally impossible for the client to determine. Hence, for all practical purposes, availability can be defined as the service responding to requests in a timely fashion; how much delay an application can bear depends on the application domain.
+
+3. **Partition Tolerance** - Partitioning is the loss of messages between the nodes of a distributed system. During a network partition, the system can lose an arbitrary number of messages between nodes. A partition-tolerant system will always respond correctly unless a total network failure happens.
+
+The consistency requirement implies that every request will be treated atomically by the system even if the nodes lose messages due to network partitions.
+The availability requirement implies that every request should receive a response even if a partition causes messages to be lost arbitrarily.
+
+## 3. The CAP Theorem
+
+![Partitioned Network](resources/partitioned-network.jpg)
+
+In the network above, all messages between the node sets M and N are lost due to a network issue. The system as a whole detects this situation. There are two options:
+
+1. **Availability first** - The system allows applications to read and write data objects on these nodes independently even though the nodes are not able to communicate. An application writes to a data object on a node in M. Due to the **network partition**, this change is not propagated to replicas of the data object in N. The application subsequently reads that data object, and the read executes on one of the nodes of N. The read returns the older value of the data object, leaving the application state not **consistent**.
+
+2. **Consistency first** - The system does not allow any application to write to data objects as it cannot ensure **consistency** of replica states. This means that the system is perceived to be **unavailable** by the applications.
+
+If there are no partitions, clearly both consistency and availability can be guaranteed by the system. This observation led Eric Brewer to conjecture, in an invited talk at PODC 2000:
+
+<blockquote>It is impossible for a web service to provide the following three guarantees:
+Consistency
+Availability
+Partition Tolerance</blockquote>
+
+This is called the CAP theorem.
+
+It is clear that the prime culprit here is the network partition. If there were no network partitions, any distributed service could be both highly available and provide strong consistency of shared data objects. Unfortunately, network partitions cannot be prevented in a distributed system.
+
+## 4. Two of Three - Exploring the CAP Theorem
+
+The CAP theorem dictates that the three desirable properties, consistency, availability, and partition tolerance, cannot all be offered simultaneously. Let's study whether it's possible to achieve two of these three properties.
+
+### Consistency and Availability
+If there are no network partitions, then there is no loss of messages and all requests receive a response within the stipulated time, so it is clearly possible to achieve both consistency and availability. Distributed systems running over an intranet are an example of such systems.
+
+### Consistency and Partition Tolerance
+Without availability, both of these properties can be achieved easily. A centralized system can provide these guarantees: the state of the application is maintained on a single designated node, all updates from clients are forwarded by the other nodes to this designated node, and it updates the state and sends the response. When a failure happens, the system does not respond and is perceived as unavailable by the client. Distributed locking algorithms in databases also provide these guarantees.
+
+### Availability and Partition Tolerance
+Without atomic consistency, it is very easy to achieve availability even in the face of partitions. Even if nodes fail to communicate with each other, they can individually handle query and update requests issued by the client. The same data object will have different states on different nodes as the nodes progress independently. This weak consistency model is exhibited by web caches.
+
+It's clear that any two of these three properties are easy to achieve in a distributed system. Since large-scale distributed systems have to take partitions into account, will they have to sacrifice availability for consistency, or consistency for availability? Giving up either consistency or availability entirely is too big a sacrifice.
+
+## 5. The **BASE**ic distributed state
+
+When viewed through the lens of the CAP theorem and its consequences for distributed application design, we realize that we cannot commit to both perfect availability and strong consistency. But we can explore the middle ground: we can guarantee availability most of the time, with an occasionally inconsistent view of the data, and consistency is eventually achieved when communication between the nodes resumes. This leads to the following properties of current distributed applications, referred to by the acronym BASE.
+
+* **Basically Available** services are those which are partially available when partitions happen. Thus, they appear to work most of the time. Partial failures result in the system being unavailable only for a section of the users.
+
+* **Soft State** services provide no strong consistency guarantees. They are not write consistent. Since replicas may not be mutually consistent, applications have to accept stale data.
+
+* **Eventually Consistent** services try to make application state consistent whenever possible.
+
+## 6. Partitions and latency
+Any large-scale distributed system has to deal with latency. In fact, network partitions and latency are fundamentally related: once a request is made and no response is received within some duration, the sender has to assume that a partition has happened. The sender node can then take one of the following steps:
+
+1) Cancel the operation as a whole. In doing so, the system is choosing consistency over availability.
+2) Proceed with the rest of the operation. This can lead to inconsistency but makes the system highly available.
+3) Retry the operation until it succeeds. This means that the system is trying to ensure consistency at the cost of reduced availability.
+
+Essentially, a partition is characterized by an upper bound on the time spent waiting for a response: whenever this bound is exceeded, the system must choose C over A or A over C. Also, a partition may be perceived by only a pair of nodes rather than by all of them, which means that partitions are a local occurrence.
+
+## 7. Handling Partitions
+Once a partition has happened, it has to be handled explicitly. The designer has to decide which operations will be functional during partitions. The partitioned nodes will continue their attempts at communication. When the nodes are able to establish communication, the system has to take steps to recover from the partitions.
+
+### 7.1. Partition mode functionality
+When at least one side of the system has entered partition mode, the system has to decide which functionality to support. This decision depends on the invariants that the system must maintain. Depending on the nature of the problem, the designer may choose to compromise on certain invariants by allowing the partitioned system to provide functionality that might violate them; this means the designer is choosing availability over consistency. Other invariants may have to be maintained, and operations that would violate them will either have to be modified or prohibited; this means the designer is choosing consistency over availability.
+
+Deciding which operations to prohibit, modify, or delay also depends on other factors, such as where the data lives. If the data is stored on one node, then operations on that data can typically proceed on that node but not on other nodes.
+
+In any event, the bottom line is that if the designer wishes for the system to be available, certain operations have to be allowed. Each node has to maintain a history of these operations so that it can be merged with the rest of the system when the node is able to reconnect. Since operations can happen simultaneously on multiple disconnected nodes, all sides maintain this history. One way to maintain this information is through version vectors, as sketched below.
+
+Another interesting problem is communicating the progress of these operations to the user. Until the system gets out of partition mode, the operations cannot be committed completely; until then, the user interface has to faithfully represent their incomplete or in-progress status to the user.
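+
+A minimal version-vector sketch (illustrative Python, not tied to any particular system): each replica counts its own updates, and comparing vectors tells us whether one history extends the other or whether the two sides diverged concurrently during the partition.
+
+```python
+def bump(vv, replica):
+    """Return a copy of the version vector with this replica's counter incremented."""
+    vv = dict(vv)
+    vv[replica] = vv.get(replica, 0) + 1
+    return vv
+
+def dominates(a, b):
+    """True if history a includes everything recorded in history b."""
+    return all(a.get(k, 0) >= n for k, n in b.items())
+
+def concurrent(a, b):
+    return not dominates(a, b) and not dominates(b, a)
+
+base = {}
+left = bump(base, "M")                    # an update applied on side M during the partition
+right = bump(base, "N")                   # an independent update on side N
+print(concurrent(left, right))            # True -- the histories must be reconciled on recovery
+print(dominates(bump(left, "M"), left))   # True -- a later state on M subsumes the earlier one
+```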
+
+### 7.2. Partition Recovery
+When the partitioned nodes are able to communicate again, they have to exchange information to restore consistency. Both sides have progressed in independent directions; now the delayed operations on either side have to be performed and violated invariants have to be fixed. Given the state and history of both sides, the system has to accomplish the following tasks.
+
+#### 7.2.1. Consistency
+During recovery, the system has to reconcile the inconsistent states of the two sides. One straightforward approach is to start from the state at the time of the partition and apply the operations from both sides in an appropriate order, ensuring that the invariants are maintained. Depending on the operations allowed during the partition phase, this process may or may not be possible. The general problem of conflict resolution is not solvable, but a restricted set of operations can ensure that the system can always merge conflicts. For example, Google Docs limits operations to style and text editing, whereas source-code control systems such as the Concurrent Versions System (CVS) may encounter conflicts that require manual resolution. Research has been done on techniques for automatic state convergence: using commutative operations allows the system to sort the operations into a consistent global order and execute them, though not all operations can be made commutative (addition with bounds checking, for example, is not). Marc Shapiro and his colleagues at INRIA have developed *commutative replicated data types (CRDTs)* that provably converge as operations are performed. By implementing state through CRDTs, we can ensure availability and automatic state convergence after partitions.
+
+#### 7.2.2. Compensation
+During a partition, it's possible for both sides to perform actions that are externalized, i.e., whose effects are visible outside the system. To compensate for these actions, the partitioned nodes have to maintain a history.
+
+For example, consider a system in which both sides have executed the same purchase order during a partition. During the recovery phase, the system has to detect this and distinguish it from two intentional orders. Once detected, the duplicate order has to be rolled back. If the order has already been committed successfully, then the problem has been externalized: the user will see twice the amount deducted from their account for a single purchase. The system now has to credit the appropriate amount to the user's account and possibly send an email explaining the entire debacle. All of this depends on the system maintaining a history during the partition; if the history is not present, duplicate orders cannot be detected and the user will have to catch the mistake and ask for compensation.
+
+It would have been better if the duplicate order had not been issued in the first place, but the requirement to maintain availability trumps consistency. Mistakes in such cases cannot always be corrected internally, but by admitting them and compensating for them, the system arguably exhibits equivalent behavior.
+
+## 8. What's the right pH for my distributed solution?
+
+Whether an application chooses to be an *ACID*ic or *BASE*ic service depends on the domain. An application developer has to consider the consistency-availability tradeoff on a case by case basis. *ACID*ic databases provide a very simple and strong consistency model making application development easy for domains where data inconsistency cannot be tolerated. *BASE*ic systems provide a very loose consistency model, placing more burden on the application developer to understand the invariants and manage them carefully during partitions by appropriately limiting or modifying the operations.
+
diff --git a/chapter/6/being-consistent.md b/chapter/6/being-consistent.md
new file mode 100644
index 0000000..233d987
--- /dev/null
+++ b/chapter/6/being-consistent.md
@@ -0,0 +1,82 @@
+---
+layout: page
+title: "Being Consistent"
+by: "Aviral Goel"
+---
+
+## Replication and Consistency
+Availability and consistency are defining characteristics of any distributed system. As dictated by the CAP theorem, accommodating network partitions requires a trade-off between the two. Modern large-scale, Internet-based distributed systems have to be highly available. To manage huge volumes of data (big data) and to reduce access latency for a geographically diverse user base, their data centers also have to be geographically spread out. Network partitions, which would otherwise happen with low probability on a local network, become practically certain events in such systems. To ensure availability in the event of partitions, these systems have to replicate data objects. This raises the question: how do we ensure the consistency of these replicas? It turns out there are different notions of consistency to which a system can adhere.
+
+* **Strong Consistency** implies linearizability of updates, i.e., all updates applied to a replicated data type are serialized in a global total order. This means that any update has to be applied to all replicas simultaneously. This notion of consistency is clearly too restrictive: a single unavailable node violates the condition, and forcing all updates to happen synchronously impacts availability negatively. It does not fit the requirements of highly available, fault-tolerant systems.
+
+* **Eventual Consistency** is a weaker model of consistency that does not guarantee immediate consistency of all replicas. Any local update is immediately executed on the replica. The replica then sends its state asynchronously to other replicas. As long as all replicas share their states with each other, the system eventually achieves stability. Each replica finally contains the same value. During the execution, all updates happen asynchronously at all replicas in a non-deterministic order. So replicas can be inconsistent between updates. If updates arrive concurrently at a replica, a consensus protocol can be employed to ensure that both updates taken together do not violate an invariant. If they do, a rollback has to be performed and the new state is communicated to all the other replicas.
+
+Most large-scale distributed systems try to be **eventually consistent** to ensure high availability and partition tolerance. But conflict resolution is hard: there is little guidance on correct approaches to consensus, and it's easy to come up with an error-prone, ad hoc approach. What if we could side-step conflict resolution and rollback completely? Is there a way to design data structures that do not require any consensus protocol to merge concurrent updates?
+
+## A Distributed Setting
+
+Consider a replicated counter. Each node can increment the value of its local copy. Imagine three nodes that increment their local copies at arbitrary points in time; each replica sends its increments asynchronously to the other two replicas. Whenever a replica receives an increment from a peer, it adds it to its current value, and if two increments arrive concurrently, both are added. Merging replicas in this example is therefore trivial.
+
+Let's take a look at an interesting generalization of this: an integer vector with one entry per node, where each node increments only its own entry and replicas merge by taking the per-index maximum.
+
+
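+A minimal sketch of the state-based form of these two examples (illustrative Python, assuming each replica knows its own id): the counter keeps one entry per replica, merge takes the per-index maximum, and the counter's value is the sum of the entries.
+
+```python
+class GCounter:
+    """Grow-only counter: per-replica entries, merged by per-entry max."""
+    def __init__(self, replica_id, entries=None):
+        self.replica_id = replica_id
+        self.entries = dict(entries or {})
+    def increment(self):
+        self.entries[self.replica_id] = self.entries.get(self.replica_id, 0) + 1
+    def merge(self, other):
+        merged = {k: max(self.entries.get(k, 0), other.entries.get(k, 0))
+                  for k in self.entries.keys() | other.entries.keys()}
+        return GCounter(self.replica_id, merged)
+    def value(self):
+        return sum(self.entries.values())
+
+a, b = GCounter("A"), GCounter("B")
+a.increment(); a.increment()           # A counts 2
+b.increment()                          # B counts 1
+print(a.merge(b).value())              # 3, regardless of merge order
+print(b.merge(a).merge(a).value())     # 3 -- re-merging changes nothing (idempotent)
+```
+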
+We can make an interesting observation from the previous examples:
+
+__*All distributed data structures don't need conflict resolution*__
+
+This raises the following question:
+
+__*How can we design a distributed structure such that we don't need conflict resolution?*__
+
+The answer to this question lies in an algebraic structure called the **join semilattice**.
+
+## Join Semilattice
+A join-semilattice or upper semilattice is a *partial order* `≤` with a *least upper bound* (LUB) `⊔` for all pairs.
+`m = x ⊔ y` is a Least Upper Bound of `{x, y}` under `≤` iff `∀m′, x ≤ m′ ∧ y ≤ m′ ⇒ x ≤ m ∧ y ≤ m ∧ m ≤ m′`.
+
+`⊔` is:
+
+**Associative**
+
+`(x ⊔ y) ⊔ z = x ⊔ (y ⊔ z)`
+
+**Commutative**
+
+`x ⊔ y = y ⊔ x`
+
+**Idempotent**
+
+`x ⊔ x = x`
+
+The examples we saw earlier are structures that can be modeled as join semilattices: in the state-based formulation, both the grow-only counter and the integer vector keep one entry per replica and merge by taking the per-index maximum of the states being merged (the counter's value is then the sum of its entries).
+So, if we can model the state of a data structure as a partially ordered set and design the merge operation to always compute the "larger" of the two states, its replicas will never need consensus; they will always converge as execution proceeds. Such data structures are called CRDTs (Conflict-free Replicated Data Types). But what about the consistency of these replicas?
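+
+A quick illustrative check that the per-index maximum really is a join, i.e., that it satisfies the three laws above, which is what lets replicas merge in any order and any number of times and still converge:
+
+```python
+def join(a, b):
+    # least upper bound of two integer-vector states: per-index maximum
+    return tuple(max(x, y) for x, y in zip(a, b))
+
+x, y, z = (1, 0, 2), (0, 3, 1), (2, 2, 0)
+assert join(join(x, y), z) == join(x, join(y, z))   # associative
+assert join(x, y) == join(y, x)                     # commutative
+assert join(x, x) == x                              # idempotent
+print(join(join(x, y), z))                          # (2, 3, 2)
+```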
+
+## Strong Eventual Consistency (SEC)
+We discussed a notion of consistency, *eventual consistency*, in which replicas eventually become consistent once there are no more updates to merge. But the update operation is left unspecified: it's possible for an update to leave a replica in a state that conflicts with a later update, in which case the replica may have to roll back and use consensus to ensure that all replicas do the same. This is complicated and wasteful. If replicas are modeled as CRDTs, however, updates never conflict: regardless of the order in which updates are applied, all replicas eventually reach equivalent states, and no conflict arbitration is necessary. This kind of eventual consistency is a stronger notion than the one requiring conflict arbitration and hence is called *Strong Eventual Consistency*.
+
+### Strong Eventual Consistency and CAP Theorem
+
+Let's study SEC data objects from the perspective of CAP theorem.
+
+#### 1. Consistency and Network Partition
+Each distributed replica will communicate asynchronously with other reachable replicas. These replicas will eventually converge to the same value. There is no consistency guarantee on the value of replicas not reachable due to network conditions and hence this condition is strictly weaker than strong consistency. But as soon as those replicas can be reached, they will also converge in a self-stabilizing manner.
+
+#### 2. Availability and Network Partition
+Each distributed replica will always be available for local reads and writes regardless of network partitions. In fact, if there are n replicas, a single replica will function even if the remaining n - 1 replicas crash simultaneously. This **provides an extreme form of availability**.
+
+SEC facilitates maximum consistency and availability in the event of network partitions by relaxing the requirement of global consistency. Note that this is achieved by virtue of modeling the data objects as join semilattices.
+
+#### Strong Eventual Consistency and Linearizability
+In a distributed setting, a replica has to handle concurrent updates. Beyond its sequential behavior, a CRDT must ensure that its concurrent behavior also preserves strong eventual consistency. This makes it possible for CRDTs to exhibit behavior that is simply not possible for sequentially consistent objects.
+
+Consider a set CRDT used in a distributed setting. One replica p<sub>i</sub> executes the sequence `add(a); remove(b)`. Another replica p<sub>j</sub> executes the sequence `add(b); remove(a)`. Both then send their states asynchronously to a third replica p<sub>k</sub>, which has to merge them: each element exists in one incoming state but not in the other. The CRDT designer has several choices here; let's assume the implementation always prefers inclusion over exclusion, so p<sub>k</sub> will include both `a` and `b` (a sketch of this behavior follows below).
+
+Now consider a sequential execution of the two sequences on an ordinary set. The order of execution will be either `add(a); remove(b); add(b); remove(a)` or `add(b); remove(a); add(a); remove(b)`. In both cases one of the elements ends up excluded, which differs from the state of the CRDT set.
+Thus, strongly eventually consistent data structures can be sequentially inconsistent.
+Similarly, if there were `n` sequentially consistent replicas, they would need consensus to agree on a single order of execution across all replicas; but if `n - 1` replicas crash, consensus cannot happen. This makes sequential consistency incomparable to strong eventual consistency.
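+
+An illustrative sketch of an add-wins (observed-remove) set running the scenario above (a toy formulation, not any particular published design): adds are tagged with unique ids, and a remove only cancels the tags it has observed, so a concurrent add on another replica survives the merge.
+
+```python
+import uuid
+
+class ORSet:
+    def __init__(self):
+        self.adds = set()      # (element, unique tag) pairs
+        self.removed = set()   # tags observed as removed
+    def add(self, x):
+        self.adds.add((x, uuid.uuid4().hex))
+    def remove(self, x):
+        self.removed |= {tag for (e, tag) in self.adds if e == x}
+    def merge(self, other):
+        m = ORSet()
+        m.adds = self.adds | other.adds
+        m.removed = self.removed | other.removed
+        return m
+    def value(self):
+        return {e for (e, tag) in self.adds if tag not in self.removed}
+
+pi, pj = ORSet(), ORSet()
+pi.add("a"); pi.remove("b")   # remove(b) observes no tag for b here, so it cancels nothing
+pj.add("b"); pj.remove("a")   # likewise for a on this replica
+pk = pi.merge(pj)
+print(sorted(pk.value()))     # ['a', 'b'] -- both survive, unlike any sequential interleaving
+```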
+
+## What Next?
+This chapter introduced Strong Eventual Consistency and the formalism behind CRDTs, join semilattices, which enables CRDTs to exhibit strong eventual consistency. The discussion however does not answer an important question:
+
+__*Can all standard data structures be designed as CRDTs?*__
+
+The next chapter sheds more light on the design of CRDTs and attempts to answer this question.
diff --git a/chapter/6/consistency-crdts.md b/chapter/6/consistency-crdts.md
deleted file mode 100644
index fcb49e7..0000000
--- a/chapter/6/consistency-crdts.md
+++ /dev/null
@@ -1,11 +0,0 @@
----
-layout: page
-title: "Consistency & CRDTs"
-by: "Joe Schmoe and Mary Jane"
----
-
-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. {% cite Uniqueness --file consistency-crdts %}
-
-## References
-
-{% bibliography --file consistency-crdts %} \ No newline at end of file
diff --git a/chapter/6/resources/partitioned-network.jpg b/chapter/6/resources/partitioned-network.jpg
new file mode 100644
index 0000000..513fc13
--- /dev/null
+++ b/chapter/6/resources/partitioned-network.jpg
Binary files differ
diff --git a/chapter/8/Hive-architecture.png b/chapter/8/Hive-architecture.png
new file mode 100644
index 0000000..9f61454
--- /dev/null
+++ b/chapter/8/Hive-architecture.png
Binary files differ
diff --git a/chapter/8/Hive-transformation.png b/chapter/8/Hive-transformation.png
new file mode 100644
index 0000000..a66e9ea
--- /dev/null
+++ b/chapter/8/Hive-transformation.png
Binary files differ
diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md
index bfd3e7b..a04c72a 100644
--- a/chapter/8/big-data.md
+++ b/chapter/8/big-data.md
@@ -1,11 +1,720 @@
---
layout: page
title: "Large Scale Parallel Data Processing"
-by: "Joe Schmoe and Mary Jane"
+by: "Jingjing and Abhilash Mysore Somashekar"
---
+## Introduction
+The growth of the Internet has generated so-called big data (terabytes or petabytes of it). It is not possible to fit such datasets on a single machine or to process them with a single program, and often the computation has to be fast enough to provide practical services. A common approach taken by tech giants like Google, Yahoo, and Facebook is to process big data across clusters of commodity machines. Many of the computations are conceptually straightforward, and Google proposed the MapReduce framework, which separates the programming logic from the underlying execution details (data distribution, fault tolerance, and scheduling). The model has proved to be simple and powerful, and the idea has since inspired many other programming models.
-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. {% cite Uniqueness --file big-data %}
+This chapter covers the original MapReduce framework in two parts: the programming model and the execution model. For each part, we first introduce the original MapReduce design and its limitations. We then present follow-up models (e.g., FlumeJava) that work around these limitations, and other models (e.g., Dryad, Spark) that take alternative designs to circumvent the inabilities of MapReduce. We also review declarative programming interfaces (Pig, Hive, SparkSQL) built on top of MapReduce frameworks to provide programming efficiency and optimization benefits. In the last section, we briefly outline the ecosystems of Hadoop and Spark.
-## References
+Outline
+1. Programming Models
+ - 1.1 Data parallelism: MapReduce, FlumeJava, Dryad, Spark
+ - 1.2 Querying: Hive/HiveQL, Pig Latin, SparkSQL
+ - 1.3 Large-scale parallelism on Graph: BSP, GraphX
+2. Execution Models
+ - 2.1 MapReduce execution model
+ - 2.2 Spark execution model
+ - 2.3 Hive execution model
+ - 2.4 SparkSQL execution model
+3. Big Data Ecosystem:
+ - 3.1 Hadoop ecosystem
+ - 3.2 Spark ecosystem
+
+## 1 Programming Models
+### 1.1 Data parallelism
+*Data parallelism* is the simultaneous execution, on multiple machines or threads, of the same function across groups of elements of a dataset. Data parallelism can be thought of as a form of SIMD ("single instruction, multiple data") execution, a class of parallel execution in Flynn's taxonomy. By comparison, one could think of the sequential computation as *"for all elements in the dataset, do operation A"* over a single big dataset, whose size can reach terabytes or petabytes. The challenges in parallelizing this computation include how to abstract different types of computations in a simple and correct way, how to distribute the data to hundreds or thousands of machines, and how to schedule tasks and handle failures.
+
+<figure class="main-container">
+ <img src="{{ site.baseurl }}/resources/img/data-parallelism.png" alt="Data Parallelism" />
+</figure>
+
+**MapReduce** {% cite dean2008mapreduce --file big-data %} is a programming model proposed by Google, initially to satisfy its demand for large-scale indexing for its web search service. It provides a simple programming interface, the *map* and *reduce* functions, and automatically handles parallelization and distribution, while the underlying execution system provides fault tolerance and scheduling.
+
+The MapReduce model is simple and powerful and quickly became very popular among developers. However, when developers write real-world applications, they often end up writing a lot of boilerplate and chaining many stages together. Moreover, pipelining MapReduce jobs forces them to write additional coordination code; the development style regresses from a simple logical computation abstraction to lower-level coordination management. As we will discuss in *Section 2, Execution Models*, MapReduce writes all data to disk after each stage, which causes severe delays. Programmers need to optimize manually for targeted performance, and this again requires them to understand the underlying execution model; the whole process soon becomes cumbersome. The **FlumeJava** {% cite chambers2010flumejava --file big-data %} library intends to support developing data-parallel pipelines by abstracting away the complexity of data representation and implicitly handling optimizations. It defers evaluation, constructs an execution plan from parallel collections, optimizes the plan, and then executes the underlying MR primitives. The optimized execution is comparable to hand-optimized pipelines, so there is little need to write raw MR programs directly.
+
+
+After MapReduce, Microsoft proposed its counterpart data-parallelism model, **Dryad** {% cite isard2007dryad --file big-data %}, which abstracts individual computational tasks as vertices and constructs a communication graph between those vertices. Programmers describe this DAG and let the Dryad execution engine construct the execution plan and manage scheduling and optimization. One advantage of Dryad over MapReduce is that Dryad vertices can process an arbitrary number of inputs and outputs, while MR supports only a single input and a single output for each vertex. Besides this flexibility, Dryad also supports different types of communication channel: files, TCP pipes, and shared-memory FIFOs. The programming model is less elegant than MapReduce's, and programmers are not meant to interact with it directly; instead, they are expected to use the high-level programming interface DryadLINQ {% cite yu2008dryadlinq --file big-data %}, which is more expressive and integrates well with the .NET framework. We will see some examples at the end of *Section 1.1.3 Dryad*.
+
+
+Dryad expresses computation as acyclic data flows, which can be too expensive for some complex applications, e.g., iterative machine learning algorithms. **Spark** {% cite zaharia2010spark --file big-data %} is a framework that uses functional programming and pipelining to provide such support. It is largely inspired by MapReduce's model and builds upon the ideas of DAGs and lazy evaluation from DryadLINQ. Instead of writing data to disk for each job as MapReduce does, Spark can cache results across jobs. Spark explicitly caches computational data in memory through a specialized immutable data structure named Resilient Distributed Datasets (RDDs) and reuses the same dataset across multiple parallel operations. Spark builds upon RDDs to achieve fault tolerance by using the lineage information of a lost RDD to recompute it, which incurs less overhead than the checkpoint-based fault tolerance of distributed shared memory systems. Moreover, Spark is the underlying framework upon which many very different systems are built, e.g., Spark SQL & DataFrames, GraphX, and Spark Streaming, which makes it easy to mix and match these systems in the same application. These features make Spark a good fit for iterative jobs and interactive analytics and also help it provide better performance.
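+
+As a brief illustration of this caching (a PySpark sketch with an illustrative input path, not an example from the Spark paper), the chain of transformations below only describes a lineage; `cache()` asks Spark to keep the resulting RDD in memory so that both subsequent actions reuse it instead of recomputing from the input:
+
+```python
+from pyspark import SparkContext
+
+sc = SparkContext("local", "rdd-cache-sketch")
+lines = sc.textFile("hdfs:///input/docs")            # illustrative input path
+counts = (lines.flatMap(lambda line: line.split())   # split lines into words
+               .map(lambda w: (w, 1))
+               .reduceByKey(lambda a, b: a + b)
+               .cache())                              # keep this RDD in memory across actions
+print(counts.count())    # first action materializes and caches the RDD
+print(counts.take(5))    # second action reuses the cached result
+```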
+
+The following four sections discuss the programming models of MapReduce, FlumeJava, Dryad, and Spark.
+
+
+### 1.1.1 MapReduce
+In this model, parallelizable computations are abstracted into map and reduce functions. The computation accepts a set of key/value pairs as input and produces a set of key/value pairs as output. The process involves two phases:
+- *Map*, written by the user, accepts a set of key/value pairs ("records") as input, applies the *map* operation to each record, and produces a set of intermediate key/value pairs as output.
+- *Reduce*, also written by the user, accepts an intermediate key and the set of values associated with that key, operates on them, and produces zero or one output value.
+ Note: there is a *Shuffle* phase between *map* and *reduce*, provided by the MapReduce library, which groups all the intermediate values with the same key together and passes them to the *Reduce* function. We discuss this further in Section 2, Execution Models.
+
+Conceptually, the map and reduce functions have associated **types**:
+
+\\[map (k1,v1) \rightarrow list(k2,v2)\\]
+
+\\[reduce (k2,list(v2)) \rightarrow list(v2)\\]
+
+
+The input keys and values are drawn from a different domain than the output keys and values. The intermediate keys and values are from the same domain as the output keys and values.
+
+
+Concretely, consider the problem of counting the number of occurrences of each word in a large collection of documents: the `map` function emits each word together with a count of 1, and the `reduce` function sums all the counts emitted for the same word.
+
+```
+map(String key, String value):
+ // key: document name
+ // value: document contents
+ for each word w in value:
+ EmitIntermediate(w, "1");
+
+reduce(String key, Iterator values):
+ // key: a word
+ // values: a list of counts
+ int result = 0;
+ for each v in values:
+ result += ParseInt(v);
+ Emit(AsString(result));
+```
+
+During execution, the MapReduce library assigns a master node to manage data partitioning and scheduling; the other nodes serve as workers that run either *map* or *reduce* operations on demand. More details of the execution model are discussed later. Here, it's worth mentioning that the intermediate results are written to disk and the reduce operation reads them from disk; this is crucial for fault tolerance.
+
+*Fault Tolerance*
+MapReduce runs on hundreds or thousands of unreliable commodity machines, so the library must provide fault tolerance. The library assumes that the master node does not fail, and the master monitors worker failures. If no status update is received from a worker within a timeout, the master marks it as failed, and it may then reschedule the associated task on another worker, depending on the task's type and status. The commits of *map* and *reduce* task outputs are atomic: an in-progress task writes its data to private temporary files, and once the task succeeds, it negotiates with the master and renames the files to complete the task; in the case of failure, the worker discards the temporary files. This guarantees that if the computation is deterministic, the distributed implementation produces the same output as a non-faulting sequential execution.
+
+*Limitations*
+Many analytics workloads, like K-means and logistic regression, and graph processing applications, like PageRank and shortest paths via parallel breadth-first search, require multiple stages of MapReduce jobs. In a regular MapReduce framework like Hadoop, this requires the developer to manually handle the iterations in the driver code. At every iteration, the result of stage T is written to HDFS and loaded back again at stage T+1, causing a performance bottleneck: network bandwidth and CPU resources are wasted, and above all the disk I/O operations are inherently slow. To address these challenges for iterative workloads on MapReduce, frameworks like HaLoop {% cite bu2010haloop --file big-data %}, Twister {% cite ekanayake2010twister --file big-data %} and iMapReduce {% cite zhang2012imapreduce --file big-data %} adopt special techniques like caching data between iterations and keeping the mappers and reducers alive across iterations.
+
+
+### 1.1.2 FlumeJava
+FlumeJava {% cite chambers2010flumejava --file big-data %} was introduced to make it easy to develop, test, and run efficient data-parallel pipelines. FlumeJava represents each dataset as an object, and transformations are invoked by applying methods to these objects. It constructs an efficient internal execution plan from a pipeline of MapReduce jobs, using deferred evaluation and optimizing based on the plan's structure. Its debugging support allows programmers to run pipelines on a local machine first and then deploy them to large clusters.
+
+*Core Abstraction*
+- `PCollection<T>`, an immutable bag of elements of type `T`, which can be created from an in-memory Java `Collection<T>` or by reading a file with an encoding specified by `recordOf`.
+- `recordOf(...)`, which specifies the encoding of an instance.
+- `PTable<K, V>`, a subclass of `PCollection<Pair<K,V>>`: an immutable multi-map with keys of type `K` and values of type `V`.
+- `parallelDo()`, which can express both the map and reduce parts of MapReduce.
+- `groupByKey()`, the same as the shuffle step of MapReduce.
+- `combineValues()`, semantically a special case of `parallelDo()`: a combination of a MapReduce combiner and a MapReduce reducer, which is more efficient than doing all the combining in the reducer.
+- `flatten`, which takes a list of `PCollection<T>`s and returns a single logical `PCollection<T>`.
+
+An example implemented using FlumeJava:
+```java
+PTable<String,Integer> wordsWithOnes =
+ words.parallelDo(
+ new DoFn<String, Pair<String,Integer>>() {
+ void process(String word,
+ EmitFn<Pair<String,Integer>> emitFn) {
+ emitFn.emit(Pair.of(word, 1));
+ }
+ }, tableOf(strings(), ints()));
+PTable<String,Collection<Integer>>
+ groupedWordsWithOnes = wordsWithOnes.groupByKey();
+PTable<String,Integer> wordCounts =
+ groupedWordsWithOnes.combineValues(SUM_INTS);
+```
+
+*Deferred Evaluation & Optimizer*
+One of the merits of using FlumeJava to pipeline MapReduce jobs is that it enables automatic optimization by executing parallel operations lazily using *deferred evaluation*. The state of each `PCollection` object is either *deferred* (not yet computed) or *materialized* (computed). When the program invokes `parallelDo()`, it creates an operation pointer to the actual deferred operation object. These operations form a directed acyclic graph called the execution plan. The execution plan is not evaluated until `run()` is called, which triggers optimization of the plan and then evaluation in forward topological order. The optimization strategies for transforming the modular execution plan into an efficient one include:
+- Fusion: $$f(g(x)) \Rightarrow (f \circ g)(x)$$, which is essentially function composition. This usually helps reduce the number of steps.
+- MapShuffleCombineReduce (MSCR) operation: a combination of ParallelDo, GroupByKey, CombineValues and Flatten into one MapReduce job. This extends MapReduce to accept multiple inputs and multiple outputs. The following figure illustrates an MSCR operation with 3 input channels, 2 grouping (GroupByKey) output channels and 1 pass-through output channel.
+ <figure class="main-container">
+ <img src="{{ site.baseurl }}/resources/img/mscr.png" alt="A MapShuffleCombineReduce operation with 3 input channels" />
+ </figure>
+
+An overall optimizer strategy involves a sequence of optimization actions with the ultimate goal of producing the fewest, most efficient MSCR operations:
+1. Sink Flattens: $$h(f(a)+g(b)) \rightarrow h(f(a)) + h(g(b))$$
+2. Lift CombineValues operations: if a *CombineValues* operation immediately follows a *GroupByKey* operation, the GroupByKey records that fact and the original *CombineValues* is left in place; it can then be treated as a normal *ParallelDo* operation and becomes subject to ParallelDo fusion.
+3. Insert fusion blocks: for chains of ParallelDo operations between two GroupByKey operations, decide which ParallelDos fuse up into the earlier operation and which fuse down into the later one.
+4. Fuse ParallelDos.
+5. Fuse MSCRs: create MSCR operations, and convert any remaining unfused ParallelDo operations into trivial MSCRs.
+The SiteData example {%cite chambers2010flumejava --file big-data %} shows that 16 data-parallel operations can be optimized into two MSCR operations in the final execution plan (refer to Figure 5 in the original paper). One limitation of the optimizer is that all these optimizations are based on the structure of the execution plan; FlumeJava does not analyze user-defined functions.
+
+
+### 1.1.3 Dryad
+Dryad is a general-purpose data-parallel execution engine that allows developers to *explicitly* specify an arbitrary directed acyclic graph (DAG) for computations, where each vertex is a computation task and the edges represent communication channels (file, TCP pipe, or shared-memory FIFO) between tasks.
+
+A Dryad job is a logical computation graph that is automatically mapped to physical resources at runtime. From the programmer's point of view, the channels produce or consume heap objects, and the type of data channel makes no difference to reading or writing these objects. In the Dryad system, a process called the "job manager" connects to the cluster network and is responsible for scheduling jobs by consulting the name server (NS) and delegating commands to the daemon (D) running on each computer in the cluster.
+
+
+*Writing program*
+
+The Dryad library is written in C++ and uses a mixture of method calls and operator overloading. It describes a Dryad graph as $$G=\langle V_G, E_G, I_G, O_G \rangle$$, where $$V_G$$ is a sequence of vertices, $$E_G$$ is a set of directed edges, and $$I_G$$ and $$O_G$$ represent the *input* and *output* vertices.
+
+- *Creating new vertices* The library calls a static program factory to create a graph vertex; it also provides the $$^$$ operator to clone a graph and $$\otimes$$ to concatenate sequences.
+- *Adding graph edges* $$C=A\circ B$$ creates a new graph $$C=\langle V_A \otimes V_B, E_A \cup E_B \cup E_{new}, I_A, O_B \rangle$$. The new edges are defined in one of two ways:
+ 1) $$A>=B$$ pointwise composition,
+ 2) $$A>>B$$ complete bipartite graph between $$O_A$$ and $$I_B$$.
+- *Merging two graphs* $$C=A \mid\mid B$$ creates a new graph $$C=\langle V_A \otimes^* V_B, E_A \cup E_B, I_A \cup^* I_B, O_A\cup^* O_B \rangle$$.
+
+The following is an example graph-builder program.
+```cpp
+GraphBuilder XSet = moduleX^N;
+GraphBuilder DSet = moduleD^N;
+GraphBuilder MSet = moduleM^(N*4);
+GraphBuilder SSet = moduleS^(N*4);
+GraphBuilder YSet = moduleY^N;
+GraphBuilder HSet = moduleH^1;
+
+GraphBuilder XInputs = (ugriz1 >= XSet) || (neighbor >= XSet);
+GraphBuilder YInputs = ugriz2 >= YSet;
+
+GraphBuilder XToY = XSet >= DSet >> MSet >= SSet;
+
+for (i = 0; i < N*4; ++i)
+{
+ XToY = XToY || (SSet.GetVertex(i) >= YSet.GetVertex(i/4));
+}
+GraphBuilder YToH = YSet >= HSet;
+GraphBuilder HOutputs = HSet >= output;
+
+GraphBuilder final = XInputs || YInputs || XToY || YToH || HOutputs;
+```
+
+In fact, developers are not expected to write raw Dryad programs as complex as the one above. Instead, Microsoft introduced a more declarative querying model, DryadLINQ {% cite yu2008dryadlinq --file big-data %}. We will discuss querying models and their power to express complex operations like join in *section 1.2 Querying*. Here we just show a glimpse of a DryadLINQ query (which is compiled into Dryad jobs and executed on the Dryad execution engine):
+
+```c#
+//SQL-style syntax to join two input sets:
+// scoreTriples and staticRank
+var adjustedScoreTriples =
+ from d in scoreTriples
+ join r in staticRank on d.docID equals r.key
+ select new QueryScoreDocIDTriple(d, r);
+var rankedQueries =
+ from s in adjustedScoreTriples
+ group s by s.query into g
+ select TakeTopQueryResults(g);
+
+// Object-oriented syntax for the above join
+var adjustedScoreTriples =
+ scoreTriples.Join(staticRank,
+ d => d.docID, r => r.key,
+ (d, r) => new QueryScoreDocIDTriple(d, r));
+var groupedQueries =
+ adjustedScoreTriples.GroupBy(s => s.query);
+var rankedQueries =
+ groupedQueries.Select(
+ g => TakeTopQueryResults(g));
+```
+
+*Fault tolerance policy*
+The communication graph is acyclic, so given immutable inputs, the computation result should remain the same regardless of the sequence of failures. When a vertex fails, the job manager either gets notified or observes a heartbeat timeout, and it then immediately schedules the vertex for re-execution.
+
+*Comparison with FlumeJava*
+Both support multiple inputs/outputs for the computation nodes. The big difference is that FlumeJava still follows the MapReduce approach of reading from and writing to disk between stages, whereas Dryad has the option of in-memory transmission. This puts Dryad in a good position to perform optimizations like re-using in-memory data. On the other hand, Dryad has no optimizations on the graph itself.
+
+
+### 1.1.4 Spark
+
+Spark {%cite zaharia2010spark --file big-data %} is a fast, in-memory data processing engine with an elegant and expressive development interface that enables developers to efficiently execute machine learning, SQL, or streaming workloads requiring fast iterative access to datasets. It offers a functional-style programming model (similar to DryadLINQ) in which a developer creates acyclic data flow graphs and transforms a set of input data through map/reduce-like operators. Spark provides two main abstractions: distributed in-memory storage (the RDD) and parallel operations (based on Scala's collection API) on datasets, with high-performance processing, scalability, and fault tolerance.
+
+*Distributed in-memory storage - Resilient Distributed Data sets*
+
+An RDD is a partitioned, read-only collection of objects that can be created from data in stable storage or by transforming another RDD. It can be distributed across multiple nodes in a cluster (parallelized) and is fault tolerant (resilient). If a node fails, an RDD can always be recovered using its lineage: the DAG of computations performed on the source dataset. An RDD is stored in memory (as much as fits, with the rest spilled to disk) and is immutable; it can only be transformed into a new RDD. These transformations are deferred: they are built up and staged, and are not actually applied until an action is performed on an RDD. Thus, it is important to note that while one might have applied many transformations to a given RDD, the resulting transformed RDD may not be materialized even though one holds a reference to it.
+
+The properties that give an RDD the above-mentioned features are:
+- A list of dependencies on other RDD’s.
+- An array of partitions that a dataset is divided into.
+- A compute function to do a computation on partitions.
+- Optionally, a Partitioner for key-value RDDs (e.g. to say that the RDD is hash-partitioned)
+- Optional preferred locations (aka locality info), (e.g. block locations for an HDFS file)
+
+
+<figure class="main-container">
+ <img src="./spark_pipeline.png" alt="Spark pipeline" />
+</figure>
+
+
+Spark's API provides two kinds of operations on an RDD (a short word-count sketch follows the lists below):
+
+- Transformations - lazy operations that return another RDD.
+ - `map(f: T => U): RDD[T] ⇒ RDD[U]` Return a MappedRDD[U] by applying function f to each element.
+ - `flatMap(f: T ⇒ Seq[U]): RDD[T] ⇒ RDD[U]` Return a new FlatMappedRDD[U] by first applying a function to all elements and then flattening the results.
+ - `filter(f: T ⇒ Bool): RDD[T] ⇒ RDD[T]` Return a FilteredRDD[T] containing the elements for which f returns true.
+ - `groupByKey()` Called on an RDD of (K, V) pairs, return a new RDD[(K, Iterable[V])].
+ - `reduceByKey(f: (V, V) => V)` Called on an RDD of (K, V) pairs, return a new RDD[(K, V)] by aggregating values, e.g. reduceByKey(_ + _).
+ - `join: (RDD[(K, V)], RDD[(K, W)]) ⇒ RDD[(K, (V, W))]` Called on two RDDs of (K, V) and (K, W) pairs, return a new RDD[(K, (V, W))] by joining them on key K.
+
+
+- Actions - operations that trigger computation on an RDD and return values.
+
+ - `reduce(f: (T, T) ⇒ T): RDD[T] ⇒ T` Return a single T by reducing the elements using the specified commutative and associative binary operator.
+ - `collect()` Return an Array[T] containing all elements.
+ - `count()` Return the number of elements.
+
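+To make the two kinds of operations concrete, below is a minimal word-count sketch (not taken from the Spark paper). A `SparkContext` `sc` and the input path are assumptions made purely for illustration.
+
+```scala
+// Hedged sketch: word count with RDD transformations and a final action.
+val lines  = sc.textFile("input.txt")          // assumed local or HDFS path
+val counts = lines
+  .flatMap(_.split(" "))                       // transformation: RDD[String]
+  .map(word => (word, 1))                      // transformation: RDD[(String, Int)]
+  .reduceByKey(_ + _)                          // transformation: RDD[(String, Int)]
+counts.persist()                               // mark the RDD to be kept in memory once computed
+val numDistinctWords = counts.count()          // action: triggers the actual computation
+```
+
+Note that nothing is computed until `count()` is invoked; `persist()` only marks the RDD to be kept in memory after it has been computed, as explained next.
+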
+RDDs by default are discarded after use. However, Spark provides two explicit operations, persist() and cache(), to ensure RDDs are kept in memory once they have been computed for the first time.
+
+*Why RDDs, not Distributed Shared Memory (DSM)?*
+
+RDDs are immutable and can only be created through coarse-grained transformations, while DSM allows fine-grained read and write operations to each memory location. Since RDDs are immutable and can be re-derived from their lineage, they do not require checkpointing, so they do not incur the checkpointing overhead that DSM does. Additionally, in DSM any failure requires the whole program to be restored, whereas with RDDs only the lost RDD partitions need to be recovered, and this recovery happens in parallel on the affected nodes. Because RDDs are immutable, a straggler (slow node) can be replaced with a backup copy as in MapReduce; this is hard to implement in DSM, where two copies point to the same memory location and can interfere with one another's updates.
+
+
+
+***Challenges in Spark*** {% cite armbrust2015scaling --file big-data%}
+
+- *Functional API semantics* The `groupByKey` operator is costly in terms of performance: it shuffles every (key, value) pair so that all values for a given key end up on a single machine, where the aggregation is then performed, resulting in computation and network overhead. Spark does provide the `reduceByKey` operator, which performs a partial aggregation on individual worker nodes before returning the distributed collection. However, developers who are not aware of this functionality can unintentionally choose `groupByKey`. Functional programmers (Scala developers) tend to think declaratively about the problem and only see the end result of the `groupByKey` operator; they are not necessarily trained in how `groupByKey` is implemented atop the cluster. Therefore, to use Spark, unlike a pure functional language, one needs to understand how the underlying cluster is going to execute the code. The burden of preserving performance is left to the programmer, who is expected to understand Spark's execution model and to know when to use `reduceByKey` over `groupByKey` (see the sketch after this list).
+
+- *Debugging and profiling* There are no debugging tools, and developers find it hard to tell whether a computation is happening mostly on a single machine or whether the data structures they used were inefficient.
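+
+To make the first challenge concrete, here is a hedged sketch (the names are illustrative) of the two ways to sum counts per key. Both produce the same result, but only the second pre-aggregates on each worker before shuffling:
+
+```scala
+// pairs is assumed to be an RDD[(String, Int)], e.g. (word, 1) pairs.
+val byGroup  = pairs.groupByKey().mapValues(_.sum) // shuffles every (word, 1) pair across the network
+val byReduce = pairs.reduceByKey(_ + _)            // combines counts locally first, then shuffles partial sums
+```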
+
+### 1.2 Querying: declarative interfaces
+MapReduce takes care of processing over a cluster, failure and recovery, data partitioning, etc. However, the framework suffers from rigidity with respect to its single input data format (key/value pairs) and its two-stage data flow. Several important patterns like equi-joins and theta-joins {% cite okcan2011processing --file big-data%}, which can be highly complex depending on the data, require programmers to implement them by hand. MapReduce thus lacks high-level abstractions and requires programmers to be well versed in design patterns like map-side joins, reduce-side equi-joins, etc. Also, Java-based MapReduce code (as in the Hadoop framework) can become repetitive when the programmer wants to implement common operations like projection and filtering. A simple word count program, as shown below, can span up to 63 lines.
+
+*Complete code for Word count in Hadoop (Java based implementation of MapReduce)*
+
+```java
+import java.io.IOException;
+import java.util.*;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.conf.*;
+import org.apache.hadoop.io.*;
+import org.apache.hadoop.mapreduce.*;
+import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
+import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
+import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
+import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
+
+public class WordCount
+{
+    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable>
+    {
+        private final static IntWritable one = new IntWritable(1);
+        private Text word = new Text();
+
+        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException
+        {
+            String line = value.toString();
+            StringTokenizer tokenizer = new StringTokenizer(line);
+            while (tokenizer.hasMoreTokens())
+            {
+                word.set(tokenizer.nextToken());
+                context.write(word, one);
+            }
+        }
+    }
+
+    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable>
+    {
+        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException
+        {
+            int sum = 0;
+            for (IntWritable val : values)
+            {
+                sum += val.get();
+            }
+            context.write(key, new IntWritable(sum));
+        }
+    }
+
+    public static void main(String[] args) throws Exception
+    {
+        Configuration conf = new Configuration();
+        Job job = new Job(conf, "wordcount");
+        job.setOutputKeyClass(Text.class);
+        job.setOutputValueClass(IntWritable.class);
+        job.setMapperClass(Map.class);
+        job.setReducerClass(Reduce.class);
+        job.setInputFormatClass(TextInputFormat.class);
+        job.setOutputFormatClass(TextOutputFormat.class);
+        FileInputFormat.addInputPath(job, new Path(args[0]));
+        FileOutputFormat.setOutputPath(job, new Path(args[1]));
+        job.waitForCompletion(true);
+    }
+}
+```
+
+*Why SQL over MapReduce ?*
+
+SQL already provides operations like join, group by, and sort that can be mapped onto the MapReduce operations mentioned above. By leveraging an SQL-like interface, it becomes easy for non-experts in MapReduce and non-programmers, such as data scientists, to focus on the logic rather than hand-coding complex operations {% cite armbrust2015scaling --file big-data%}. Such a high-level declarative language can express their tasks easily while leaving all of the execution-optimization details to the backend engine.
+SQL also lessens the amount of code (see the examples in each model's section) and significantly reduces development time.
+Most importantly, as discussed later in this section, frameworks like Pig, Hive, and Spark SQL take advantage of these declarative queries by realizing them as a DAG upon which the compiler can apply a transformation whenever an optimization rule is satisfied. Spark, which does provide a high-level abstraction unlike MapReduce, lacks this very optimization, resulting in the human errors discussed in the Spark data-parallel section.
+
+Sawzall {% cite pike2005interpreting --file big-data%} is a programming language built on top of MapReduce. It consists of a *filter* phase (map) and an *aggregation* phase (reduce). A user program only needs to specify the filter function and emit the intermediate pairs to external pre-built aggregators. This largely removes the burden of writing reducers: as the following example shows, programmers can use built-in aggregators to do the reducing. Data serialization uses Google's *protocol buffers*, which can produce a *meta-data* file for the declared schema, but the schema is not used for any optimization purpose per se. Sawzall is good for most straightforward processing of large datasets, but it does not support more complex and still common operations like *join*. The pre-built aggregators are limited, and it is non-trivial to add more.
+
+- *A simple aggregation (record count and sum) in Sawzall*
+
+ ```
+ count: table sum of int;
+ total: table sum of float;
+ x: float = input;
+ emit count <- 1;
+ emit total <- x;
+ ```
+
+Apart from Sawzall, Pig {%cite olston2008pig --file big-data %} and Hive {%cite thusoo2009hive --file big-data %} are the other major components that sit on top of Hadoop framework for processing large data sets without the users having to write Java based MapReduce code. Both support more complex operations than Sawzall: e.g. database join.
+
+Hive was built by Facebook to organize datasets in structured formats while still utilizing the MapReduce framework. It has its own SQL-like language, HiveQL {%cite thusoo2010hive --file big-data %}, which is easy for anyone who understands SQL. Hive reduces code complexity and eliminates lots of the boilerplate that would otherwise be an overhead with the Java-based MapReduce approach.
+
+- *Word count implementation in Hive*
+
+ ```
+ CREATE TABLE docs (line STRING);
+ LOAD DATA INPATH 'docs' OVERWRITE INTO TABLE docs;
+ CREATE TABLE word_counts AS
+ SELECT word, count(1) AS count FROM
+ (SELECT explode(split(line, '\\s')) AS word FROM docs) w
+ GROUP BY word
+ ORDER BY word;
+ ```
+
+Pig Latin, from Yahoo, aims at a sweet spot between declarative and procedural programming. For advanced programmers, SQL is an unnatural way to express program logic, so Pig Latin instead decomposes a data transformation into a sequence of steps. This makes Pig more verbose than Hive. Unlike Hive, Pig Latin does not persist metadata; instead, it has better interoperability with other applications in Yahoo's data ecosystem.
+
+- *Word count implementation in PIG*
+
+ ```
+ lines = LOAD 'input_file.txt' AS (line:chararray);
+ words = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) as word;
+ grouped = GROUP words BY word;
+ wordcount = FOREACH grouped GENERATE group, COUNT(words);
+ DUMP wordcount;
+ ```
+
+SparkSQL, though it has similar goals to Pig, fares better thanks to the Spark execution engine, Spark's efficient fault-tolerance mechanism, and a specialized data structure called the Dataset.
+
+- *Word count example in SparkSQL*
+
+ ```
+ val ds = sqlContext.read.text("input_file").as[String]
+ val result = ds
+ .flatMap(_.split(" "))
+ .filter(_ != "")
+ .toDF()
+ .groupBy($"value")
+ .agg(count("*") as "count")
+ .orderBy($"count" desc)
+ ```
+
+The following subsections discuss Hive, Pig Latin, and SparkSQL in detail.
+
+
+### 1.2.1 Hive/HiveQL
+
+Hive {% cite thusoo2010hive --file big-data%} is a data-warehousing infrastructure built on top of the MapReduce framework Hadoop. The primary responsibility of Hive is to provide data summarization, query, and analysis. It supports analysis of large datasets stored in Hadoop's HDFS {% cite shvachko2010hadoop --file big-data%}. It offers SQL-like access to structured data, known as HiveQL (or HQL), as well as big-data analysis with the help of MapReduce. HiveQL queries are compiled into MapReduce jobs that are executed on Hadoop. This drastically brings down the development time for writing and maintaining Hadoop jobs.
+
+Data in Hive is organized into three units:
+
+`Tables`: Like RDBMS tables, Hive tables consist of rows and columns, and every table maps to an HDFS directory. All the data in a table is serialized and stored in files under the corresponding directory. Hive is extensible to accept user-defined data formats and customized serialization and deserialization methods. It also supports external tables on data stored in HDFS, NFS, or local directories.
+
+`Partitions`: The distribution of data into subdirectories of the table directory is determined by one or more partitions. A table can be further partitioned on columns.
+
+`Buckets`: Data in each partition can be further divided into buckets based on the hash of a column in the table. Each bucket is stored as a file in the partition directory.
+
+***HiveQL***: The Hive query language consists of a subset of SQL along with some extensions. The language is very SQL-like and supports features like subqueries, joins, cartesian products, group by, aggregation, describe, and more. MapReduce programs can also be embedded in Hive queries. A sample query using MapReduce looks like this:
+```
+FROM (
+ MAP inputdata USING 'python mapper.py' AS (word, count)
+ FROM inputtable
+ CLUSTER BY word
+ )
+ REDUCE word, count USING 'python reduce.py';
+```
+*Example from {% cite thusoo2010hive --file big-data%}*
+
+This query uses mapper.py to transform inputdata into (word, count) pairs, distributes the data to reducers by hashing on the word column (given by CLUSTER BY), and aggregates using reduce.py.
+
+
+***Serialization/Deserialization***
+Hive uses LazySerDe as its default SerDe implementation. A SerDe is a combination of a serializer and a deserializer that lets developers instruct Hive on how their records should be processed. The Deserializer interface translates rows into internal objects lazily, so the cost of deserializing a column is incurred only when it is needed. The Serializer, conversely, converts a Java object into a format that Hive can write to HDFS or another supported system. Hive also provides a RegexSerDe, which allows the use of regular expressions to parse columns out of a row.
+
+
+
+### 1.2.2 Pig Latin
+Pig Latin {% cite olston2008pig --file big-data%} is a programming model built on top of MapReduce that provides a higher-level, dataflow-style description of computations. Unlike Hive, which has SQL-like syntax, the goal of Pig Latin is to attract experienced programmers to perform ad-hoc analysis on big data and to let them write execution logic as a sequence of steps. For example, suppose we have a table of URLs: `(url, category, pagerank)`. The following is a simple SQL query that finds, for each sufficiently large category, the average pagerank of high-pagerank URLs in that category.
+
+```
+SELECT category, AVG(pagerank)
+FROM urls WHERE pagerank > 0.2
+GROUP BY category HAVING COUNT(*) > 1000000
+```
+
+Pig Latin provides an alternative way to carry out the same operations, as a sequence of steps that programmers can reason about more easily:
+
+```
+good_urls = FILTER urls BY pagerank > 0.2;
+groups = GROUP good_urls BY category;
+big_groups = FILTER groups BY COUNT(good_urls) > 1000000;
+output = FOREACH big_groups GENERATE
+ category, AVG(good_urls.pagerank);
+```
+
+
+*Interoperability* Pig Latin is designed to support ad-hoc data analysis, which means the input only requires a function to parse the content of files into tuples. This avoids a time-consuming import step. As for the output, Pig gives users the freedom to convert tuples into byte sequences in a format they define. This allows Pig to interoperate with other existing applications in Yahoo's ecosystem.
-{% bibliography --file big-data %} \ No newline at end of file
+*Nested Data Model* Pig Latin has a flexible, fully nested data model and allows complex, non-atomic data types such as sets, maps, and tuples to occur as fields of a table. The benefits include: it is closer to how programmers think; data can be stored in the same nested fashion, saving recombination time; the language can remain algebraic; and it allows rich user-defined functions.
+
+*UDFs as First-Class Citizens* Pig Latin supports user-defined functions (UDFs) for customized grouping, filtering, or per-tuple processing, treating them as first-class citizens in the language.
+
+*Debugging Environment* Pig Latin has a novel interactive debugging environment that can generate a concise example data table to illustrate the output of each step.
+
+*Limitations* The procedural design gives users more control over execution, but at the same time the data schema is not enforced explicitly, so it is much harder to apply database-style optimizations. Pig Latin has no control structures like loops or conditionals; if needed, one has to embed Pig Latin in Java, JDBC-style, which can easily fail without static syntax checking. It is also not easy to debug.
+
+
+
+
+### 1.2.3 SparkSQL
+
+The major contributions of Spark SQL {% cite armbrust2015spark --file big-data%} are the DataFrame API and the Catalyst optimizer. Spark SQL aims to provide relational processing over native RDDs and over several external data sources through a programmer-friendly API, to deliver high performance through DBMS techniques, to support semi-structured data and external databases, and to support advanced analytical processing like machine learning algorithms and graph processing.
+
+***Programming API***
+
+Spark SQL runs on top of Spark, providing SQL interfaces. A user can interact with these interfaces through JDBC/ODBC, the command line, or the DataFrame API.
+The DataFrame API lets users intermix relational and procedural code with ease. A DataFrame is a collection of schema-based rows of data with named columns, on which relational operations can be performed with optimized execution. Unlike an RDD, a DataFrame lets developers define the structure of the data, and it is analogous to a table in a relational database or to R/Python's data frames. DataFrames can be constructed from tables of external sources or from existing native RDDs. A DataFrame is lazy: each object represents a logical plan that is not executed until an output operation like save or count is performed.
+Spark SQL supports all the major SQL data types, including complex types like arrays, maps, and unions.
+Some of the DataFrame operations include projection (select), filter (where), join, and aggregations (groupBy).
+Illustrated below is an example of relational operations on an employees data frame to compute the number of female employees in each department.
+
+```scala
+employees.join(dept, employees("deptId") === dept("id"))
+ .where(employees("gender") === "female")
+ .groupBy(dept("id"), dept("name"))
+ .agg(count("name"))
+```
+Operators such as $$===$$ for equality tests, $$>$$ for greater-than, the arithmetic operators ($$+$$, $$-$$, etc.), and the aggregators build up an abstract syntax tree of the expression, which can be passed to Catalyst for optimization.
+A `cache()` operation on the data frame makes Spark SQL store the data in memory so it can be reused in iterative algorithms and interactive queries. In Spark SQL the memory footprint is considerably smaller because it applies columnar compression schemes like dictionary encoding and run-length encoding.
+
+The DataFrame API also supports inline UDF definitions without complicated packaging and registration. Because UDFs and queries are both expressed in the same general-purpose language (Python or Scala), users can use standard debugging tools.
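+
+As a rough sketch of what this looks like (the UDF name and the "results" table are illustrative assumptions; `sqlContext` is assumed to exist), an inline UDF can be registered and then used directly from SQL:
+
+```scala
+// Hedged sketch: register a Scala closure as a UDF and call it from a query.
+val threshold = 0.5
+sqlContext.udf.register("isHigh", (score: Double) => score > threshold)
+val high = sqlContext.sql("SELECT * FROM results WHERE isHigh(score)")
+```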
+
+However, a DataFrame lacks type safety. In the above example, attributes are referred to by string names, so it is not possible for the compiler to catch errors. If an attribute name is incorrect, the error is detected only at runtime, when the query plan is created.
+
+A DataFrame can also be brittle and verbose, because the user has to cast rows and columns to specific types before doing anything with them. This is error-prone: one can accidentally choose the wrong index for a row or column and end up with a `ClassCastException`.
+
+Spark introduced an extension of the DataFrame called the ***Dataset*** to provide compile-time type safety. It embraces an object-oriented programming style and adds a feature called Encoders. Encoders translate between JVM representations (objects) and Spark's internal binary format. Spark's built-in encoders are quite advanced in that they generate bytecode to interact with off-heap data and provide on-demand access to individual attributes without having to deserialize an entire object.
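+
+A minimal sketch of the difference, assuming a Spark version with Dataset support, an existing `sqlContext`, and a hypothetical employees.json file:
+
+```scala
+import sqlContext.implicits._
+
+// With a case class, field access is checked at compile time.
+case class Employee(name: String, deptId: Long, gender: String)
+
+val employeesDS = sqlContext.read.json("employees.json").as[Employee] // typed Dataset[Employee]
+val females = employeesDS.filter(_.gender == "female")                // a typo here fails to compile
+
+// The untyped DataFrame version only fails at runtime if "gender" is misspelled.
+val femalesDF = sqlContext.read.json("employees.json").filter($"gender" === "female")
+```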
+
+
+To wind up, we can compare SQL vs DataFrame vs Dataset as below:
+
+<figure class="main-container">
+ <img src="./sql-vs-dataframes-vs-datasets.png" alt="SQL vs Dataframe vs Dataset" />
+</figure>
+*Figure from the website :* https://databricks.com/blog/2016/07/14/a-tale-of-three-apache-spark-apis-rdds-dataframes-and-datasets.html
+
+
+
+### 1.3 Large-scale parallelism on graphs
+MapReduce does not extend easily to iterative and graph algorithms like PageRank or machine learning algorithms. Iterative algorithms require the programmer to explicitly handle intermediate results (writing them to disk), resulting in a lot of boilerplate code. Every iteration reads its input from and writes its results to disk, and the resulting disk I/O is a performance bottleneck for any batch processing system.
+
+Also, graph algorithms require an exchange of messages between vertices. In the case of PageRank, every vertex requires the contributions from all its adjacent vertices to calculate its score. MapReduce lacks this message-passing model, which makes graph algorithms hard to reason about. One model that is commonly employed for implementing distributed graph processing is the graph-parallel model.
+
+In the graph-parallel abstraction, a user-defined vertex program is instantiated concurrently for each vertex and interacts with adjacent vertex programs through messages or shared state. Each vertex program can read and modify its vertex property and, in some cases, adjacent vertex properties. When all vertex programs vote to halt, the program terminates. The bulk-synchronous parallel (BSP) model {% cite valiant1990bridging --file big-data%} is one of the most commonly used graph-parallel models.
+
+BSP was introduced by Valiant as a bridging model between parallel hardware and software. It gained popularity as an alternative to MapReduce since it addresses the above-mentioned issues with MapReduce.
+BSP is a synchronous message-passing model in which:
+
+- Computation consists of a series of steps called supersteps.
+- The processors involved have their own local memory, and every processor is connected to the others via point-to-point communication.
+- At every superstep, a processor receives input at the beginning, performs computation, and produces output at the end.
+- A processor at superstep S can send messages to another processor for superstep S+1 and can likewise receive messages sent at superstep S-1.
+- Barrier synchronization syncs all the processors at the end of every superstep.
+- A notable feature of the model is complete control of data movement through communication between processors at every superstep. Though similar to the MapReduce model, BSP preserves data in memory across supersteps, which helps in reasoning about iterative graph algorithms.
+
+The graph-parallel abstractions allow users to succinctly describe graph algorithms and provide a runtime engine to execute them in a distributed fashion. They simplify the design, implementation, and application of sophisticated graph algorithms to large-scale real-world problems. Each of these frameworks presents a different view of graph computation, tailored to an originating domain or family of graph algorithms. However, these frameworks do not address data preprocessing and graph construction, favor snapshot recovery over fine-grained fault tolerance, and lack support from distributed data flow frameworks. Data-parallel systems are well suited to graph construction and are highly scalable, but they suffer from the very problems, mentioned before, for which the graph-parallel systems came into existence. GraphX {%cite xin2013graphx --file big-data%} is a computation system that builds upon Spark's Resilient Distributed Dataset (RDD) to form a new abstraction, the Resilient Distributed Graph (RDG), which represents records as vertices and their relations as edges. RDGs leverage RDDs' fault-tolerance mechanism and expressivity.
+
+***How does GraphX improve over the existing graph-parallel and data flow models?***
+
+Similar to the data flow model, GraphX moves away from the vertex-centric view and adopts transformations on graphs that yield new graphs. The RDGs in GraphX provide a set of elegant and expressive computational primitives to support graph transformations and enable graph-parallel systems like Pregel {%cite malewicz2010pregel --file big-data%} and PowerGraph {%cite gonzalez2012powergraph --file big-data%} to be expressed on top of Spark with minimal code changes. GraphX simplifies the process of graph ETL and analysis through operations like filter, view, etc. It minimizes communication and storage overhead across the system by adopting vertex-cuts for effective partitioning.
+
+**GraphX**
+
+GraphX models graphs as property graphs, in which vertices and edges can have properties. Property graphs are directed multigraphs: they may contain multiple parallel edges with the same source and destination, which captures scenarios where multiple relationships exist between two vertices. For example, in a social graph where every vertex represents a person, two people could be co-workers and friends at the same time. A vertex is keyed by a unique 64-bit long identifier (Vertex ID), while edges carry the corresponding source and destination vertex identifiers.
+
+The GraphX API provides the primitives below for graph transformations (from the website: https://spark.apache.org/docs/2.0.0-preview/graphx-programming-guide.html):
+
+- `graph` - constructs property graph given a collection of edges and vertices.
+- `vertices: VertexRDD[VD]`, `edges: EdgeRDD[ED]`- decompose the graph into a collection of vertices or edges by extracting vertex or edge RDDs.
+- `mapVertices(map: (Id,V)=>(Id,V2)) => Graph[V2, E]`- transform the vertex collection.
+- `mapEdges(map: (Id, Id, E)=>(Id, Id, E2))` - transform the edge collection.
+- `triplets: RDD[EdgeTriplet[VD, ED]]` - returns a collection of the form ((i, j), (PV(i), PE(i, j), PV(j))). The operator essentially requires a multiway join between the vertex and edge RDDs. This operation is optimized by shifting the site of the joins to the edges, using the routing table, so that only vertex data needs to be shuffled.
+- `leftJoin` - given a collection of vertices and a graph, returns a new graph which incorporates the property of matching vertices from the given collection into the given graph without changing the underlying graph structure.
+- `subgraph` - Applies predicates to return a subgraph of the original graph by filtering all the vertices and edges that don’t satisfy the vertices and edges predicates respectively.
+- `aggregateMessages` (previously `mapReduceTriplets`) - takes two functions, sendMsg and mergeMsg. The sendMsg function maps over every edge triplet in the graph, while mergeMsg acts like a reduce function in MapReduce, aggregating those messages at their destination vertex. This is an important primitive for analytics tasks and iterative graph algorithms (e.g., PageRank, shortest paths) where individual vertices rely on the aggregated properties of their neighbors (a short sketch follows this list).
+- `filterVertices(f: (Id, V)=>Bool): Graph[V, E]` - Filter the vertices by applying the predicate function f to return a new graph post filtering.
+- `filterEdges(f: Edge[V, E]=>Bool): Graph[V, E]` - Filter the edges by applying the predicate function f to return a new graph post filtering.
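+
+Below is a minimal sketch of `aggregateMessages` (the vertex and edge data are made up for illustration, and a SparkContext `sc` is assumed): every edge sends the value 1 to its destination, and the messages are summed to obtain each vertex's in-degree.
+
+```scala
+import org.apache.spark.graphx._
+import org.apache.spark.rdd.RDD
+
+// A tiny illustrative graph: 1 -> 2, 1 -> 3, 2 -> 3.
+val vertices: RDD[(VertexId, String)] =
+  sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
+val edges: RDD[Edge[Int]] =
+  sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(1L, 3L, 1), Edge(2L, 3L, 1)))
+val graph = Graph(vertices, edges)
+
+// sendMsg: every triplet sends 1 to its destination vertex.
+// mergeMsg: messages arriving at the same vertex are summed.
+val inDegrees: VertexRDD[Int] =
+  graph.aggregateMessages[Int](triplet => triplet.sendToDst(1), _ + _)
+```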
+
+
+***Why partitioning is important in graph computation systems ?***
+Graph-parallel computation requires every vertex or edge to be processed in the context of its neighborhood. Each transformation depends on the result of distributed joins between vertices and edges. This means that graph computation systems rely on graph partitioning (edge-cuts in most of the systems) and efficient storage to minimize communication and storage overhead and ensure balanced computation.
+
+<figure class="main-container">
+ <img src="./edge-cut.png" alt="edge cuts" />
+</figure>
+
+*Figure from {%cite xin2013graphx --file big-data%}*
+
+***Why Edge-cuts are expensive ?***
+Edge-cut partitioning assigns vertices to machines and cuts the edges that span machines. The communication and storage overhead is proportional to the number of edges cut, which makes balancing the number of cuts a priority. For most real-world graphs, constructing an optimal edge-cut is cost-prohibitive, so most systems use random (hash) partitioning, which achieves appropriate work balance but nearly worst-case communication overhead.
+
+***Vertex-cuts - GraphX's solution to effective partitioning***: An alternative approach that does the opposite of an edge-cut: evenly assign edges to machines, but allow vertices to span multiple machines. The communication and storage overhead of a vertex-cut is proportional to the sum, over vertices, of the number of machines each vertex spans. Therefore, we can reduce communication overhead and ensure balanced computation by evenly assigning edges to machines in a way that minimizes the number of machines spanned by each vertex.
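+
+In the GraphX API this choice is exposed through `partitionBy`; a brief sketch, re-using the `graph` built in the earlier aggregateMessages example:
+
+```scala
+import org.apache.spark.graphx.PartitionStrategy
+
+// Repartition the edges with a 2D vertex-cut strategy; a vertex may now be
+// replicated on every partition that holds one of its edges.
+val partitioned = graph.partitionBy(PartitionStrategy.EdgePartition2D)
+```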
+
+***Implementation of Vertex-cut***
+
+<figure class="main-container">
+ <img src="./vertex-cut-datastructure.png" alt="vertex-cut-implementation" />
+</figure>
+
+*Figure from the website : https://spark.apache.org/docs/2.0.0-preview/graphx-programming-guide.html*
+
+The GraphX RDG structure implements a vertex-cut representation of a graph using three unordered horizontally partitioned RDD tables. These three tables are as follows:
+
+- `EdgeTable(pid, src, dst, data)`: Stores adjacency structure and edge data.
+- `VertexDataTable(id, data)`: Stores vertex data. Contains states associated with vertices that are changing in the course of graph computation
+- `VertexMap/Routing Table(id, pid)`: Maps vertex ids to the partitions that contain their adjacent edges. Remains static as long as the graph structure doesn’t change.
+
+
+
+## 2 Execution Models
+There are many possible implementations of these programming models. In this section, we discuss a few different execution models, how the above programming interfaces exploit them, and the benefits and limitations of each design. At a very high level, MapReduce, its variants, and Spark all adopt a master/workers model, where the master (or driver, in Spark) is responsible for managing data and dynamically scheduling tasks to workers. The master monitors the workers' status, and when a failure happens, the master reschedules the task to another idle worker. However, data in MapReduce (section 2.1) is distributed over the cluster and must be moved in and out of disk, whereas Spark (section 2.2) takes an in-memory processing approach. This saves significant I/O and is thus much faster than MapReduce. As for fault tolerance, MapReduce relies on data persistence, while Spark achieves it using lineage (recomputation of failed tasks).
+
+For the more declarative querying models, the execution engine needs to take care of query compilation and, in the meantime, has the opportunity to optimize. For example, Hive (section 2.3) not only needs a driver, as MapReduce and Spark do, but also has to manage the metastore and to take advantage of optimizations borrowed from traditional database design. SparkSQL (section 2.4) adopts the Catalyst framework for SQL optimization, both rule-based and cost-based.
+
+
+### 2.1 MapReduce execution model
+The original MapReduce model was implemented and deployed on Google's infrastructure. As described in section 1.1.1, the user program defines the map and reduce functions, and the underlying system manages data partitioning and schedules jobs across different nodes. Figure 2.1.1 shows the overall flow when the user program calls the MapReduce function:
+1. Split data. The input files are split into *M* pieces;
+2. Copy processes. The user program creates a master process and the workers. The master picks idle workers to do either map or reduce tasks;
+3. Map. A map worker reads the corresponding split and passes it to the map function. The generated intermediate key/value pairs are buffered in memory;
+4. Partition. The buffered pairs are periodically written to local disk, partitioned into *R* regions. The locations are then passed back to the master;
+5. Shuffle. A reduce worker reads from the local disks and groups together all occurrences of the same key;
+6. Reduce. The reduce worker iterates over the grouped intermediate data and calls the reduce function on each key and its set of values. The worker appends the output to a final output file;
+
+<figure class="fullwidth">
+ <img src="{{ site.baseurl }}/resources/img/mapreduce-execution.png" alt="MapReduce Execution Overview" />
+</figure>
+<p>Figure 2.1.1 Execution overview<label for="sn-proprietary-monotype-bembo" class="margin-toggle sidenote-number"></label><input type="checkbox" id="sn-proprietary-monotype-bembo" class="margin-toggle"/><span class="sidenote">from original MapReduce paper {%cite dean2008mapreduce --file big-data%}</span></p>
+
+At steps 4 and 5, the intermediate dataset is written to disk by the map worker and then read from disk by the reduce worker. Transferring big chunks of data over the network is expensive, so the data is stored on the local disks of the cluster, and the master tries to schedule each map task on the machine that contains the corresponding data, or on a nearby machine, to minimize network traffic.
+
+
+### 2.2 Spark execution model
+
+<figure class="main-container">
+ <img src="./cluster-overview.png" alt="MapReduce Execution Overview" />
+</figure>
+*Figure & information (this section) from the website: http://spark.apache.org/docs/latest/cluster-overview.html*
+
+The Spark driver defines the SparkContext, the entry point for any job, which specifies the environment/configuration and the dependencies of the submitted job. It connects to the cluster manager and requests resources for the execution of the jobs.
+The cluster manager manages and allocates the required system resources to Spark jobs. Furthermore, it coordinates and keeps track of the live/dead nodes in the cluster. It enables the execution of the jobs submitted by the driver on the worker nodes (also called Spark workers), and it tracks and reports the status of the jobs running on them.
+A Spark worker executes the business logic submitted by the user via the Spark driver. Spark workers are abstracted away from the user and are allocated dynamically by the cluster manager to the Spark driver for the execution of submitted jobs. The driver listens for and accepts incoming connections from its executors throughout its lifetime.
+
+***Job scheduler optimization:*** Spark's job scheduler tracks the persistent RDDs saved in memory. When an action (count or collect) is performed on an RDD, the scheduler first analyzes the lineage graph to build a DAG of stages to execute. Each stage contains only transformations with narrow dependencies; stage boundaries fall at the wide dependencies, for which the scheduler has to fetch the missing partitions from other workers in order to build the target RDD. The job scheduler is highly performant: it assigns tasks to machines based on data locality, or to the preferred machines recorded in the RDD. If a task fails, the scheduler re-runs it on another node, and it also recomputes the stage's parent stages if they are missing.
+
+***How are persistent RDD’s memory managed ?***
+
+Persistent RDDs are stored in memory as Java objects (for performance), in memory as serialized data (for a smaller memory footprint at the cost of performance), or on disk. If a worker runs out of memory upon creation of a new RDD, a Least Recently Used (LRU) policy evicts the least recently accessed RDD, unless it is the same RDD as the new one; in that case the old RDD is excluded from eviction, given that it may be reused again soon. Long lineage chains involving wide dependencies are checkpointed to reduce the time needed to recover an RDD. Since RDDs are read-only, checkpointing is straightforward: consistency is not a concern, and there is none of the overhead of managing consistency found in distributed shared memory.
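+
+A brief sketch of how these choices surface in the API (re-using the `counts` RDD from the earlier word-count sketch; the checkpoint directory is an assumption):
+
+```scala
+import org.apache.spark.storage.StorageLevel
+
+// Alternative storage levels (an RDD's level can only be set once):
+counts.persist(StorageLevel.MEMORY_ONLY)        // deserialized Java objects in memory (fastest)
+// counts.persist(StorageLevel.MEMORY_ONLY_SER) // serialized: smaller footprint, more CPU
+// counts.persist(StorageLevel.MEMORY_AND_DISK) // spill partitions to disk when memory is full
+
+sc.setCheckpointDir("/tmp/checkpoints")         // assumed checkpoint directory
+counts.checkpoint()                             // truncate a long lineage chain
+```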
+
+### 2.3 Hive execution model
+
+The Hive execution model {% cite thusoo2010hive --file big-data%} is composed of the components below (see the Hive architecture diagram further down):
+
+- Driver: Similar to the drivers of Spark/MapReduce applications, the driver in Hive handles query submission and its flow across the system. It also manages the session and its statistics.
+
+- Metastore: The Hive metastore stores all information about the tables, their partitions, schemas, columns and their types, etc., making the data format and its storage transparent to users. It, in turn, helps in data exploration, query compilation, and optimization. Because the metastore manages the structure of Hadoop files, it must be kept up to date on a regular basis.
+
+- Query Compiler: The Hive query compiler is similar to traditional database compilers. It processes the query in three steps:
+ - Parse: In this phase it uses Antlr (a parser generator tool) to generate the abstract syntax tree (AST) of the query.
+ - Transformation of the AST to a DAG (directed acyclic graph): In this phase it generates a logical plan and performs compile-time type checking. The logical plan is generated using the metadata (stored in the metastore) of the required tables. Errors are flagged if any issues are found during type checking.
+
+ - Optimization: Optimization forms the core of any declarative interface. In Hive, optimization happens through chains of transformations of the DAG. A transformation can even include user-defined optimizations, and it applies an action on the DAG only if a rule is satisfied. Every node in the DAG implements a Node interface, which makes it easy to manipulate the operator DAG using other interfaces like GraphWalker, Dispatcher, Rule, and Processor. By transformation we mean walking through the DAG and, for every Node encountered, performing a Rule satisfiability check; if a Rule is satisfied, the corresponding Processor is invoked. A Dispatcher maintains the mapping from Rules to Processors.
+
+ <figure class="main-container" align="center">
+ <img src="./Hive-transformation.png" alt="Hive transformation" />
+ </figure>
+
+*Figure to depict the transformation flow during optimization, from:* {%cite thusoo2010hive --file big-data %}
+
+- Execution Engine: The execution engine finally executes the tasks in order of their dependencies. A MapReduce task first serializes its part of the plan into a plan.xml file. This file is then added to the job cache, and mappers and reducers are spawned to execute the relevant sections of the operator DAG. The final results are stored in a temporary location and then moved to the final destination (in the case of, say, an INSERT INTO query).
+
+
+***Summarizing the flow***
+
+*Hive architecture diagram*
+<figure class="main-container">
+ <img src="./Hive-architecture.png" alt="Hive architecture" />
+</figure>
+
+
+The query is first submitted via the CLI, the web UI, or another interface. It goes through all the compiler phases explained above to form an optimized DAG of MapReduce tasks, which the execution engine runs in the correct order using Hadoop.
+
+
+Some of the important optimization techniques in Hive are:
+
+ - Column Pruning - Consider only the required columns needed in the query processing for projection.
+ - Predicate Pushdown - Filter the rows as early as possible by pushing down the predicates. It is important that unnecessary records are filtered first and transformations are applied to only the needed ones.
+ - Partition Pruning - Predicates on partitioned columns are used to prune out files of partitions that do not satisfy the predicate.
+ - Map Side Joins - Smaller tables in the join operation can be replicated in all the mappers, so the join can be performed on the map side.
+ - Join Reordering - Reduce "reducer side" join operation memory by keeping only smaller tables in memory. Larger tables need not be kept in memory.
+ - Repartitioning data to handle skew in GROUP BY processing can be achieved by performing the GROUP BY in two MapReduce stages. In the first stage, data is distributed randomly to the reducers and partial aggregation is performed. In the second stage, these partial aggregations are distributed on the GROUP BY columns to different reducers.
+ - Similar to combiners in MapReduce, hash-based partial aggregations can be performed in the mappers to reduce the data sent from the mappers to the reducers. This helps reduce the amount of time spent sorting and merging the resulting data.
+
+
+
+
+### 2.4 SparkSQL execution model
+
+The SparkSQL {% cite armbrust2015spark --file big-data%} execution model leverages the Catalyst framework to optimize the SQL before submitting it to the Spark core engine for scheduling.
+Catalyst is a query optimizer. Query optimizers for MapReduce frameworks can greatly improve the performance of the queries developers write and significantly reduce development time. A good query optimizer should be able to optimize user queries, be extensible so users can provide information about their data, and even dynamically incorporate developer-defined rules.
+
+Catalyst leverages Scala's functional-language features, like pattern matching and runtime metaprogramming, to let developers concisely specify complex relational optimizations.
+
+Catalyst includes both rule-based and cost-based optimization. It is extensible, so new optimization techniques and features can be added to Spark SQL, and developers can provide data-source-specific rules.
+Catalyst executes rules over its Tree data type: a composition of node objects, where each node has a node type (a subclass of the TreeNode class in Scala) and zero or more children. Node objects are immutable and are manipulated using functional transformations. The transform method of a Tree applies pattern matching to select the subtrees on which an optimization rule should be applied.
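+
+To illustrate the style of such a rule, here is a self-contained toy version of constant folding written with plain Scala case classes; these are illustrative stand-ins, not Catalyst's actual TreeNode classes, though Catalyst rules are expressed with the same kind of pattern matching.
+
+```scala
+// A toy expression tree and a Catalyst-style constant-folding rule.
+sealed trait Expr
+case class Literal(value: Int) extends Expr
+case class Attribute(name: String) extends Expr
+case class Add(left: Expr, right: Expr) extends Expr
+
+def fold(e: Expr): Expr = e match {
+  case Add(Literal(a), Literal(b)) => Literal(a + b)          // fold two constants
+  case Add(left, Literal(0))       => fold(left)              // x + 0 => x
+  case Add(Literal(0), right)      => fold(right)             // 0 + x => x
+  case Add(left, right)            => Add(fold(left), fold(right))
+  case other                       => other
+}
+
+// fold(Add(Attribute("x"), Add(Literal(1), Literal(2)))) == Add(Attribute("x"), Literal(3))
+```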
+
+Hence, in Spark SQL, the transformation of user queries happens in four phases:
+
+<figure class="main-container">
+ <img src="./sparksql-data-flow.jpg" alt="SparkSQL optimization plan Overview" />
+</figure>
+*Figure from : {% cite armbrust2015spark --file big-data%}*
+
+***Analyzing a logical plan to resolve references:*** In the analysis phase, a relation, either from the abstract syntax tree (AST) returned by the SQL parser or from a DataFrame, is analyzed to create a logical plan, which is still unresolved (the columns referred to may not exist or may have the wrong data types). The logical plan is resolved using Catalyst's Catalog object (which tracks the tables from all data sources): named attributes are mapped to the input provided, relations are looked up by name in the catalog, and types are propagated and coerced through expressions.
+
+***Logical plan optimization:*** In this phase, rules such as constant folding, predicate pushdown, projection pruning, null propagation, and Boolean expression simplification are applied to the logical plan.
+
+***Physical planning:*** In this phase, Spark generates multiple physical plans from the input logical plan and chooses among them based on a cost model. The physical planner also performs rule-based physical optimizations, such as pipelining projections or filters into one Spark map operation. In addition, it can push operations from the logical plan into data sources that support predicate or projection pushdown.
+
+***Code generation:*** The final phase generates the Java bytecode that runs on each machine. Catalyst transforms the Tree representing a SQL expression into an AST of Scala code, then compiles and runs the generated code. A special Scala feature, quasiquotes, aids in the construction of the abstract syntax tree (AST).
+
+
+## 3. Big Data Ecosystem
+*3.1 Hadoop ecosystem*
+
+Apache Hadoop is an open-source framework that supports distributed processing of large datasets. It involves dozens of projects, all of which are listed [here](https://hadoopecosystemtable.github.io/). In this section, the key players are two parts of the system: the Hadoop Distributed File System (HDFS) and Hadoop's open-source implementation of the MapReduce model.
+
+<figure class="main-container">
+ <img src="./hadoop-ecosystem.jpg" alt="Hadoop Ecosystem" />
+</figure>
+*Figure is from http://thebigdatablog.weebly.com/blog/the-hadoop-ecosystem-overview*
+
+
+HDFS forms the data management layer: a distributed file system designed to provide reliable, scalable storage across large clusters of unreliable commodity machines. The idea was inspired by GFS {%cite ghemawat2003google --file big-data%}. Unlike the closed-source GFS, HDFS is open source, and it provides libraries and interfaces to work with different storage systems, like S3, KFS, etc.
+
+To satisfy different needs, big companies like Facebook and Yahoo developed additional tools. Facebook's Hive, as a warehouse system, provides a more declarative programming interface and translates queries into Hadoop jobs. Yahoo's Pig platform is an ad-hoc analysis tool that can structure HDFS objects and supports operations like grouping, joining, and filtering.
+
+
+*3.2 Spark ecosystem*
+
+Apache Spark's rich ecosystem consists of third-party libraries like Mesos {%cite hindman2011mesos --file big-data%} and YARN {%cite vavilapalli2013apache --file big-data%} and several major components already discussed in this article, like Spark core, SparkSQL, and GraphX.
+In this section we discuss the remaining, yet very important, components and libraries that help Spark deliver high performance.
+
+<figure class="main-container">
+ <img src="./spark-ecosystem.png" alt="Spark ecosystem" />
+</figure>
+
+*Spark Streaming - A Spark component for streaming workloads*
+
+Spark achieves fault-tolerant, high-throughput streaming workloads in real time through the lightweight Spark Streaming API. Spark Streaming is based on the discretized streams model {% cite zaharia2012discretized --file big-data%}. It processes streaming workloads as a series of small batch workloads, leveraging the fast scheduling of the Apache Spark core and the fault-tolerance capabilities of RDDs: an RDD represents each batch of streaming data, and transformations are applied to it. Data sources for Spark Streaming include live streams from Twitter, Apache Kafka {% cite kreps2011kafka --file big-data%}, Akka Actors (http://doc.akka.io/docs/akka/2.4.1/scala/actors.html), IoT sensors, Apache Flume (https://flume.apache.org/FlumeUserGuide.html), etc. Spark Streaming also unifies batch and streaming workloads, so developers can use the same code for both, and it supports integrating streaming data with historical data.
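+
+A minimal sketch of a streaming word count over micro-batches (the socket source, host, and port are illustrative assumptions, and a SparkContext `sc` is assumed to exist):
+
+```scala
+import org.apache.spark.streaming.{Seconds, StreamingContext}
+
+// One-second micro-batches on top of the existing SparkContext `sc`.
+val ssc = new StreamingContext(sc, Seconds(1))
+val lines = ssc.socketTextStream("localhost", 9999)   // assumed text source
+val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
+counts.print()
+
+ssc.start()            // start receiving and processing batches
+ssc.awaitTermination() // block until the streaming job is stopped
+```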
+
+
+*Apache Mesos*
+
+
+Apache Mesos{%cite hindman2011mesos --file big-data%} is an open-source heterogeneous cluster and resource manager developed at the University of California, Berkeley and used by companies such as Twitter, Airbnb and Netflix for handling workloads in a distributed environment through dynamic resource sharing and isolation. It aids in the deployment and management of applications in large-scale clustered environments. Mesos abstracts node allocation by combining the resources of the machines in a cluster into a single pool, enabling fault-tolerant, elastic distributed systems. A variety of workloads can draw nodes from this single pool, avoiding the need to allocate specific machines to different workloads. Mesos is highly scalable, achieves fault tolerance through Apache ZooKeeper {%cite hunt2010zookeeper --file big-data%}, and is an efficient CPU- and memory-aware resource scheduler.
+
+*Alluxio/Tachyon*
+
+Alluxio/Tachyon{% cite li2014tachyon --file big-data%} is an open-source, memory-centric distributed storage system that provides high-throughput writes and reads, enabling reliable data sharing at memory speed across cluster jobs. Tachyon integrates with different computation frameworks, such as Apache Spark and Hadoop MapReduce. In the big data ecosystem, Tachyon sits between computation frameworks like Spark or MapReduce and storage systems such as Amazon S3, OpenStack Swift, GlusterFS, HDFS, or Ceph. It caches frequently read datasets in memory, avoiding a trip to disk for every load. Spark RDDs can be stored inside Tachyon automatically, which makes Spark more resilient and avoids GC overheads.
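+
+As a minimal sketch of what this looks like from the Spark side (assuming a Spark 1.x setup in which `StorageLevel.OFF_HEAP` blocks were backed by Tachyon; the input path is hypothetical):
+
+```scala
+import org.apache.spark.{SparkConf, SparkContext}
+import org.apache.spark.storage.StorageLevel
+
+object OffHeapCacheSketch {
+  def main(args: Array[String]): Unit = {
+    val sc = new SparkContext(new SparkConf().setAppName("off-heap-cache"))
+    val logs = sc.textFile("hdfs:///data/logs")   // hypothetical input path
+    // Cache the RDD outside the JVM heap (backed by Tachyon in Spark 1.x),
+    // so repeated reads avoid both disk and JVM garbage-collection pressure.
+    logs.persist(StorageLevel.OFF_HEAP)
+    println(logs.filter(_.contains("ERROR")).count())
+    sc.stop()
+  }
+}
+```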
+
+
+
+
+## References
+{% bibliography --file big-data %}
diff --git a/chapter/8/cluster-overview.png b/chapter/8/cluster-overview.png
new file mode 100644
index 0000000..b1b7c1a
--- /dev/null
+++ b/chapter/8/cluster-overview.png
Binary files differ
diff --git a/chapter/8/ecosystem.png b/chapter/8/ecosystem.png
new file mode 100644
index 0000000..c632ec2
--- /dev/null
+++ b/chapter/8/ecosystem.png
Binary files differ
diff --git a/chapter/8/edge-cut.png b/chapter/8/edge-cut.png
new file mode 100644
index 0000000..ae30396
--- /dev/null
+++ b/chapter/8/edge-cut.png
Binary files differ
diff --git a/chapter/8/hadoop-ecosystem.jpg b/chapter/8/hadoop-ecosystem.jpg
new file mode 100644
index 0000000..2ba7aa9
--- /dev/null
+++ b/chapter/8/hadoop-ecosystem.jpg
Binary files differ
diff --git a/chapter/8/spark-ecosystem.png b/chapter/8/spark-ecosystem.png
new file mode 100644
index 0000000..d3569fc
--- /dev/null
+++ b/chapter/8/spark-ecosystem.png
Binary files differ
diff --git a/chapter/8/spark_pipeline.png b/chapter/8/spark_pipeline.png
new file mode 100644
index 0000000..ac8c383
--- /dev/null
+++ b/chapter/8/spark_pipeline.png
Binary files differ
diff --git a/chapter/8/sparksql-data-flow.jpg b/chapter/8/sparksql-data-flow.jpg
new file mode 100644
index 0000000..1cf98f5
--- /dev/null
+++ b/chapter/8/sparksql-data-flow.jpg
Binary files differ
diff --git a/chapter/8/sql-vs-dataframes-vs-datasets.png b/chapter/8/sql-vs-dataframes-vs-datasets.png
new file mode 100644
index 0000000..600c68b
--- /dev/null
+++ b/chapter/8/sql-vs-dataframes-vs-datasets.png
Binary files differ
diff --git a/chapter/8/vertex-cut-datastructure.png b/chapter/8/vertex-cut-datastructure.png
new file mode 100644
index 0000000..4379bec
--- /dev/null
+++ b/chapter/8/vertex-cut-datastructure.png
Binary files differ
diff --git a/chapter/9/DAG.jpg b/chapter/9/DAG.jpg
new file mode 100644
index 0000000..54c041a
--- /dev/null
+++ b/chapter/9/DAG.jpg
Binary files differ
diff --git a/chapter/9/DiagramStream.jpg b/chapter/9/DiagramStream.jpg
new file mode 100644
index 0000000..81ca7bc
--- /dev/null
+++ b/chapter/9/DiagramStream.jpg
Binary files differ
diff --git a/chapter/9/Kafka.jpg b/chapter/9/Kafka.jpg
new file mode 100644
index 0000000..9ebe2c4
--- /dev/null
+++ b/chapter/9/Kafka.jpg
Binary files differ
diff --git a/chapter/9/Naiad.jpg b/chapter/9/Naiad.jpg
new file mode 100644
index 0000000..98b2348
--- /dev/null
+++ b/chapter/9/Naiad.jpg
Binary files differ
diff --git a/chapter/9/TimelyD.jpg b/chapter/9/TimelyD.jpg
new file mode 100644
index 0000000..cfb99eb
--- /dev/null
+++ b/chapter/9/TimelyD.jpg
Binary files differ
diff --git a/chapter/9/Topology.jpg b/chapter/9/Topology.jpg
new file mode 100644
index 0000000..6abc10c
--- /dev/null
+++ b/chapter/9/Topology.jpg
Binary files differ
diff --git a/chapter/9/streaming.md b/chapter/9/streaming.md
index cab6dea..d805d1d 100644
--- a/chapter/9/streaming.md
+++ b/chapter/9/streaming.md
@@ -1,10 +1,278 @@
---
layout: page
title: "Large Scale Streaming Processing"
-by: "Joe Schmoe and Mary Jane"
+by: "Fangfan Li"
---
-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. {% cite Uniqueness --file streaming %}
+The previous chapter discussed large scale batch processing systems, where the computation runs over pieces of data stored across a distributed file system. Those systems satisfy requirements such as scalability and fault tolerance for applications that deal with 'big data' stored in a distributed way. Batch processing systems are suitable for processing *static* datasets, where the input data do not change over time during the whole process; the system can distribute the computation and perform synchronization assuming the inputs stay the same throughout. In this *static* model, the processing system first *pulls* data from disk and then performs the computation over the pulled data. However, a large number of networked applications are not *static*: the data is constantly in motion and the input is provided as a *stream*, with new data constantly arriving. In the *stream* model, data is *pushed* to the processor. This fundamental difference makes traditional batch processing systems unsuitable for streaming applications, since even the slightest change in the dataset would require the batch processor to *pull* the whole dataset and perform the computation again. In this chapter, we introduce the history of stream processing and the systems created for it.
+
+There are many challenges in implementing a large scale stream processing system. Like large scale batch processing systems, large scale streaming systems have to deal with consistency and fault tolerance due to their distributed nature. Moreover, latency on the order of several minutes is at most a nuisance in batch processing, while it is far less tolerable in large scale stream processing.
+
+Despite these challenges, there is a great deal of active research and many production systems in the stream processing area, and we want to answer the following questions in this article: 1) what are the earliest ideas in stream processing, and why would people want to analyze a stream of data; 2) what exactly is a stream, and how is it implemented in real systems; 3) what systems have been built for large scale stream processing, and what are the differences between them; 4) what systems are companies actually using for their applications, and do they build their own or use existing ones?
+
+
+## Data in constant motion
+
+Computing over data streams has long been studied in the theory of computing. Suppose we have a sequence of elements and we want to compute the frequency moments of the data (e.g., count how many times each distinct value appears in the sequence). To do that, we could maintain a full histogram of the data, with a counter for each value. However, memory is not unlimited, so we cannot keep a counter for every value; instead, we can use randomized algorithms to approximate the frequency moments with limited resources{% cite alon1996space --file streaming %}. In this early line of work, analyzing the stream with randomized algorithms was driven by the lack of computational resources.
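+
+To make this concrete, here is a small sketch of an AMS-style estimator for the second frequency moment (the sum of squared counts). It is only an illustration: a seeded hash stands in for the 4-wise independent sign functions of the original paper, and the item type and estimator count are arbitrary choices.
+
+```scala
+import scala.util.Random
+import scala.util.hashing.MurmurHash3
+
+// Each estimator keeps a single running sum of +1/-1 signs; its square is an
+// (approximately) unbiased estimate of F2, and averaging several estimators tightens it.
+class AmsF2(numEstimators: Int, seedBase: Int = 42) {
+  private val sums = Array.fill(numEstimators)(0L)
+
+  private def sign(item: String, i: Int): Long =
+    if ((MurmurHash3.stringHash(item, seedBase + i) & 1) == 0) 1L else -1L
+
+  def add(item: String): Unit =
+    for (i <- sums.indices) sums(i) += sign(item, i)
+
+  def estimate: Double =
+    sums.map(z => (z * z).toDouble).sum / numEstimators
+}
+
+object AmsDemo extends App {
+  val est = new AmsF2(numEstimators = 64)
+  Iterator.fill(10000)(s"user-${Random.nextInt(100)}").foreach(est.add)
+  println(f"estimated F2 = ${est.estimate}%.0f")   // the true F2 here is roughly 1,000,000
+}
+```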
+
+Besides randomized processing of data sequences, systems were also developed to deal with input data that is not static and predictable. Rather than being driven by a lack of resources, those projects were mostly motivated by the fact that in emerging networked environments, the value of the ever-increasing amount of data is realized only within the time that it is needed. TelegraphCQ {% cite chandrasekaran2003telegraphcq --file streaming %} is one of the earliest such systems; it aims to meet the challenges of handling large numbers of continuous queries over high-volume, highly variable data. In contrast to the traditional view that data can be found statically in known locations, the authors of TelegraphCQ recognized that data becomes fluid, constantly moving and changing: a traditional database can 'pull' data from storage, whereas in stream processing data is 'pushed' into the query processor. Applications that use this *data in motion* include event-based processing, where the system reacts to some special data received or to some event (e.g., at a certain time), and query processing over streaming data sources, as in network monitoring. TelegraphCQ is one example of a system that can process queries over data streams.
+
+The fundamental difference between TelegraphCQ and other traditional query systems is its view of the input data: instead of handling a query over a fully materialized static dataset, TelegraphCQ has to react to newly arrived data and process queries *on the fly*. In order to always be ready to react, queries need to be always running, so TelegraphCQ runs *continuous queries*: the queries run constantly, and as new data arrives, the processor routes it to the set of active queries that are listening. TelegraphCQ also uses *shared processing* to avoid the overhead of processing each query individually: to avoid blocking and interrupting the dataflow, data should be processed simultaneously by all the queries that require it. In TelegraphCQ, queries with such commonality can be combined to improve performance.
+
+TelegraphCQ shows the importance of modeling data as a stream and how such a stream can be processed; however, it was only implemented as a non-distributed prototype.
+
+Beyond TelegraphCQ, there are systems built for continuous querying over large scale streaming data. For example, PipelineDB{% cite pipelinedb --file streaming %} is a system designed to run SQL queries continuously on streaming data, where the output of those continuous queries is stored in regular tables that can be queried like any other table. PipelineDB can reduce the cardinality of its input streams by applying filtering or aggregation as the continuous queries read the raw data, so that only the needed information is persisted to disk (the raw data is then discarded). By doing this, PipelineDB can process large volumes of data very efficiently using a relatively small amount of resources.
+
+As described before, stream processing is not only query processing. Apache Flink {% cite apacheflink --file streaming %} is a system that supports both event-based processing and query processing. Each program in Flink is a streaming dataflow consisting of streams and transformation operators; the data in a streaming dataflow can come from multiple sources (i.e., producers) and travel to one or more sinks (i.e., consumers). The data is transformed as it travels through the operators, which is where the computation happens. To distribute the work, streams are split into stream partitions and operators are split into operator subtasks in Flink, where each subtask executes independently.
+
+What is event-based processing in Flink, then? Unlike batch processing, aggregating events is more subtle in stream processing; for example, we cannot simply count the elements in a stream, since it is generally unbounded. Instead, Flink enables event-based processing with the notions of time and windows: for example, we can specify something like 'count over a 5-minute window'. Besides time-based windows, Flink also supports count windows, where a trigger might be 'do something when the 100th element arrives'. Flink has different notions of time, such as event time, when an event was created, and processing time, when an operator performs a time-based operation. These times are used internally to keep the order and state of each event and are also used by the windowing logic. Flexible streaming windows translate into flexible triggering conditions, which is what makes event-based processing possible in Flink.
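+
+As a sketch of what a time-based window looks like in code (using the older Flink DataStream Scala API; the host, port and window size are arbitrary):
+
+```scala
+import org.apache.flink.streaming.api.scala._
+import org.apache.flink.streaming.api.windowing.time.Time
+
+object WindowedWordCount {
+  def main(args: Array[String]): Unit = {
+    val env = StreamExecutionEnvironment.getExecutionEnvironment
+    env
+      .socketTextStream("localhost", 9999)            // an unbounded source
+      .flatMap(_.toLowerCase.split("\\s+"))
+      .map(word => (word, 1))
+      .keyBy(0)                                       // partition the stream by word
+      .timeWindow(Time.minutes(5))                    // "count over a 5-minute window"
+      .sum(1)
+      .print()
+    env.execute("windowed word count")
+  }
+}
+```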
+
+We have only very briefly introduced PipelineDB and Apache Flink here; there are many other systems that can perform stream processing at large scale, and we will look at a few of them in detail in the section on how to process data streams.
+
+## How to represent data stream
+
+Why would we need to process a data stream at large scale? An example illustrates the idea. Suppose you are Twitter: you have a constant feed of users' comments and posts, you want to find out the most *trending* topic that people are talking about right now, and your advertisement team wants to act on it. You could store all the posts made during the day, from 12:01 a.m. to 11:59 p.m., in a large file system and then run a batch *Spark* {% cite zaharia2012resilient --file streaming %} job to analyze them. The *Spark* job itself may take several hours, and after all this work the *trending* topic that comes out of your analysis might be useless, since it may no longer be hot. So we want a stream processing system that takes the constant stream of posts from all the different sources as input and outputs the result with low latency (i.e., before it becomes useless).
+
+Before diving into the details of large scale processing, we first introduce a few concepts based on this example: producer, processor and consumer.
+
+- The producer is where the data stream comes from; here it would be a user who is tweeting.
+- The consumer is where the results are needed; the advertisement team would be the consumer.
+- The processor is the *magical* component that takes the stream and produces the results.
+
+<figure class="fullwidth">
+ <img src="{{ site.baseurl }}/chapter/9/DiagramStream.jpg" alt="An example of a stream processing system" />
+</figure>
+
+The producers and consumers are fairly straightforward; it is the processor that this chapter focuses on.
+
+In this section, we first illustrate what the *stream* between producers and processors actually is, i.e., the tuples flowing between the components.
+
+We have been talking about a stream of data, but this is a bit under-specified: since the data can be collected from many producers (i.e., different users), how do we combine that data into actual streams and send them to the processors? What does a data stream really look like?
+
+A natural view of a data stream is an infinite sequence of tuples read from a queue. However, a traditional queue is not sufficient in a large scale system, since a consumed tuple might get lost, or the consumer might fail and request the previous tuple again after a restart. Furthermore, since the processing power of a single machine is limited, we want several machines to be able to read from the same queue so that they can work on the stream in parallel. An alternative design is a multi-consumer queue, where a pool of readers reads from a single queue and each record goes to one of them. In a traditional multi-consumer queue, once a consumer reads a record out, it is gone. This is problematic in a large stream processing system, since messages are likely to be lost during transmission, and we want to keep track of which data has been successfully consumed and which might have been lost on its way to the consumer. Thus we need a slightly fancier queue that tracks *what* has been consumed, in order to be resilient in the face of packet loss or network failure.
+
+A naive approach to handling lost messages or failures is to record each message upon sending it and wait for an acknowledgement from the receiver. This simple method is a pragmatic choice, since storage in many messaging systems is a scarce resource: the system wants to free the data as soon as it knows it has been consumed successfully, keeping the queue small. However, getting the two ends to agree about what has been consumed is not a trivial problem. Acknowledgements fix the problem of losing messages: if a message is lost, it is not acknowledged, so the data is still in the queue and can be sent again, ensuring that each message is processed at least once. But they also create new problems. First, the receiver might successfully consume a message *m1* but fail to send the acknowledgment, so the sender sends *m1* again and the receiver processes the same data twice. Another problem is memory consumption, since the sender now has to keep track of every single message sent out, across multiple stages, and can only free them when they are acknowledged.
+
+<figure class="fullwidth">
+ <img src="{{ site.baseurl }}/chapter/9/Kafka.jpg" alt="An example of a stream processing system" />
+</figure>
+
+Apache Kafka {% cite apachekafka --file streaming %} handles this differently to achieve better performance. Apache Kafka is a distributed streaming platform to which producers, processors and consumers can all subscribe, creating or reading the streams they need; one can think of Kafka as the stream between all components in a stream processing system. Records in Kafka are grouped into topics, where each topic is a category to which records are published. Each topic is divided into several partitions; a topic can have multiple subscribers, but each partition has one reader at a time. Each record is assigned an offset that uniquely identifies it within its partition. This way, Kafka ensures that the single reader of a partition consumes the data in order. Since each topic has many partitions, Kafka balances the load over many consumer instances by assigning different partitions to them. This makes the state about what has been consumed very small, just one number (the offset) per partition, and with periodic checkpointing, the equivalent of message acknowledgements becomes very cheap. Kafka retains all published records, whether they have been consumed or not, for a configurable retention period. This also allows consumers to rewind the stream and replay everything from a point of interest by going back to a specific offset. For example, if the user code has a bug that is discovered later, the user can re-consume the messages from a previous offset once the bug is fixed, while ensuring that the events are processed in the order of their origination, or the user can simply start computing from the latest records, from "now".
+
+With the notions of topics and partitions, Kafka guarantees a total order over records within a partition, and multiple consumers can subscribe to a single topic, which increases throughput. If a strong guarantee on the ordering of all records in a topic is needed, the user can simply put all records of that topic into one partition.
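+
+The sketch below shows what this looks like from a consumer's point of view, using the Kafka Java client from Scala (the topic name, partition and offset are hypothetical): the consumer is pinned to a single partition, so it sees records in order, and seeking to an earlier offset replays everything from that point.
+
+```scala
+import java.time.Duration
+import java.util.{Collections, Properties}
+import org.apache.kafka.clients.consumer.KafkaConsumer
+import org.apache.kafka.common.TopicPartition
+import scala.jdk.CollectionConverters._
+
+object ReplayConsumer {
+  def main(args: Array[String]): Unit = {
+    val props = new Properties()
+    props.put("bootstrap.servers", "localhost:9092")
+    props.put("group.id", "trending-topics")
+    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
+    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
+
+    val consumer = new KafkaConsumer[String, String](props)
+    val partition = new TopicPartition("posts", 0)
+    consumer.assign(Collections.singletonList(partition))   // pin to one partition: in-order reads
+    consumer.seek(partition, 12345L)                         // rewind: replay from this offset
+
+    while (true) {
+      val records = consumer.poll(Duration.ofMillis(500))
+      for (record <- records.asScala)
+        println(s"offset=${record.offset} value=${record.value}")
+    }
+  }
+}
+```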
+
+These features of Apache Kafka make it a very popular platform used by many stream processing systems, and we can think of the stream as Apache Kafka in the rest of this article.
+
+## How to process data stream
+
+Now we know what the stream looks like and how we can ensure that the data in the stream is successfully processed. We now turn to the processors that consume the data stream. There are two main approaches to processing a data stream. The first is the continuous queries model, similar to TelegraphCQ, where the queries keep running and the arrival of data initiates the processing. The other is micro-batching, where the streaming computation becomes a series of stateless, deterministic batch computations over batches of the stream, with a timer triggering the processing of each batch. We discuss Apache Storm as an example of the first design, and Spark Streaming, Naiad and Google Dataflow as examples of the second. These systems differ not only in how they process the stream but also in how they ensure fault tolerance, one of the most important aspects of a large scale distributed system.
+
+### a) Continuous queries (operators) on each tuple
+
+- Apache Storm
+
+After MapReduce, Hadoop, and the related batch processing systems came out, data could be processed at scales previously unthinkable. However, as stated before, large scale stream processing became more and more important for many businesses. *Apache Storm* {% cite apachestorm --file streaming %} is one of the first systems that could be described as the "Hadoop of stream processing" and that fed this need. With the primitives provided by *Storm*, users can process messages in a way that does not lose data and that scales.
+
+In *Storm*, the logic of every processing job is described as a *Storm* topology. A *Storm* topology can be thought of as the analogue of a MapReduce job in Hadoop; the difference is that a MapReduce job eventually finishes, whereas a *Storm* topology runs forever. There are three components in a topology: streams, spouts and bolts.
+
+In *Storm*, a stream is an unbounded sequence of tuples, and tuples can contain arbitrary types of data. This relates to the core concept of *Storm*: processing the tuples in a stream.
+
+The next abstraction in a topology is the spout. A spout is a source of streams. For example, a spout may read tuples off a Kafka queue, as discussed before, and emit them as a stream.
+
+A bolt is where the processing really takes place; it can take multiple streams as input and produce multiple streams as output. Bolts are where the logic of the topology is implemented: they can run functions, filter data, compute aggregations and so forth.
+
+A topology is then an arbitrary combination of these three components, where spouts and bolts are the vertices and streams are the edges.
+
+
+```java
+TopologyBuilder builder = new TopologyBuilder();
+// a spout emitting words, run with 10 parallel tasks
+builder.setSpout("words", new TestWordSpout(), 10);
+// first bolt appends "!" to every tuple coming from the spout
+builder.setBolt("exclaim1", new ExclamationBolt(), 3)
+        .shuffleGrouping("words");
+// second bolt reads from both the spout and the first bolt
+builder.setBolt("exclaim2", new ExclamationBolt(), 5)
+        .shuffleGrouping("words")
+        .shuffleGrouping("exclaim1");
+
+```
+
+<figure class="fullwidth">
+ <img src="{{ site.baseurl }}/chapter/9/Topology.jpg" alt="The topology created by the example code" />
+</figure>
+
+Here is how we can build a simple topology containing a spout and two bolts, where the spout emits words and each bolt appends an exclamation mark '!' to its input. The exclaim1 bolt is connected to the spout, while the exclaim2 bolt is connected to both the spout and exclaim1, as specified by the groupings; we explain what 'shuffle grouping' means in the next paragraph. The nodes are arranged as shown in the graph. For example, if the spout emits the tuple ["Hi"] and it travels through exclaim1 to exclaim2, then exclaim2 will emit ["Hi!!"].
+
+Since all the work is distributed, a given vertex does not necessarily run on a single machine; its tasks can be spread over different workers in the cluster. The parameters 10, 3 and 5 in the example code specify the amount of parallelism the user wants. *Storm* also provides different *stream grouping* schemes that let users determine which consumer task of a vertex receives each tuple of its input stream. One grouping method is shuffle grouping, as shown in our example, where the tuples of the output stream are randomly distributed across the bolt's consumers in a way such that each consumer is guaranteed to get an equal number of tuples. Another example is fields grouping, where the tuples of the stream are partitioned by the fields specified in the grouping: tuples with the same value in those fields always go to the same bolt task.
+
+A natural question to ask is what happens if something goes wrong, for example if a single tuple gets lost. One might think that *Storm* maintains a queue, similar to what we discussed before, to ensure that every tuple is processed at least once. In fact, *Storm* does not keep such queues internally; the reason may be that there would be too much state to maintain if it had to construct such a queue for every edge. Instead, *Storm* maintains a directed acyclic graph (DAG) for every single tuple, where each DAG records how the original tuple was split among different workers. *Storm* uses the DAG to track each tuple; if a tuple fails to be processed, the system replays the tuple from the spout.
+
+<figure class="fullwidth">
+ <img src="{{ site.baseurl }}/chapter/9/DAG.jpg" alt="The simple tuple DAG" />
+</figure>
+
+Two concerns arise here. The first is whether *Storm* can track every DAG efficiently and scalably: would it actually use more resources than just maintaining the queues? The second is that starting all the way from the spout again, instead of from an intermediate queue, seems like a step backwards. For the first concern, *Storm* uses a very efficient algorithm to track the DAG of each tuple: it takes at most about 20 bytes per spout tuple, even if the DAG contains trillions of tuples. For the second concern, if we look at the guarantees provided by the two techniques, tracking DAGs and intermediate queues, they are actually the same: both guarantee that each tuple is processed at least once, so there is no fundamental difference between them.
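+
+The efficient algorithm is essentially an XOR trick; the sketch below is a simplified, single-process illustration of the idea (Storm's acker distributes this state and also stores a spout tuple id and a task id, which is where the roughly 20 bytes come from).
+
+```scala
+import scala.util.Random
+
+// Every tuple in the tree gets a random 64-bit id. Each id is XORed into a
+// single checksum once when the tuple is emitted (anchored) and once when it
+// is acked; the checksum returns to zero exactly when every emit has a
+// matching ack, so one long suffices no matter how large the tuple tree is.
+class TupleTreeTracker {
+  private var checksum: Long = 0L
+
+  def emitted(tupleId: Long): Unit = checksum ^= tupleId
+  def acked(tupleId: Long): Unit   = checksum ^= tupleId
+  def fullyProcessed: Boolean      = checksum == 0L
+}
+
+object TrackerDemo extends App {
+  val tracker = new TupleTreeTracker
+  val ids = Seq.fill(5)(Random.nextLong())
+  ids.foreach(tracker.emitted)      // spout and bolts emit five tuples
+  ids.foreach(tracker.acked)        // each tuple is eventually acked
+  println(tracker.fullyProcessed)   // true: the tuple tree is fully processed
+}
+```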
+
+Thus, as shown above, *Storm* provides the primitives we need: it can process a stream of data, distribute the work among multiple workers, and guarantee that each tuple in the stream is processed.
+
+### b) Micro-batch
+
+We have seen *Apache Storm* as a stream processing system with the guarantees such a system needs. However, the core of *Storm* is to process the stream at the granularity of individual tuples. Sometimes such granularity is unnecessary: in the Twitter example from before, we may only be interested in the *stream* of tuples that arrived within a 5-minute interval. With *Storm*, such a specification can only be built on top of the system, while one really wants a convenient way to express it within the system itself. In the next sections, we introduce several other stream processing systems. All of them can act on data streams in real time at large scale, like *Storm*, but they provide more ways for users to express how tuples in the stream should be grouped and then processed. We refer to grouping tuples before processing them as putting them into small *micro-batches*; the processor can then produce results by working on those batches instead of on single tuples.
+
+- Spark Streaming
+
+*Spark Streaming* {% cite zaharia2012discretized --file streaming %} is built upon *Apache Spark*, a system for large-scale parallel batch processing, and uses a data-sharing abstraction called 'Resilient Distributed Datasets', or RDDs, to ensure fault tolerance while achieving low latency. The challenges with 'big data' stream processing were long recovery times when failures happen, and stragglers that can increase the processing time of the whole system. Spark Streaming overcomes these challenges with a parallel recovery mechanism that improves efficiency over traditional replication and backup schemes, and that also tolerates stragglers.
+
+The challenge of fault tolerance comes from the fact that a stream processing system may need hundreds of nodes, and at such scale the two major problems are *faults* and *stragglers*. Some systems use a continuous processing model, like *Storm*, in which long-running, stateful operators receive each tuple, update their state and send out result tuples. While this model is natural, it makes faults difficult to handle. As shown before, *Storm* uses *upstream backup*, where messages are buffered and replayed if they fail to be processed. Another approach to fault tolerance used by previous systems is replication, where there are two copies of everything. The first approach takes a long time to recover, while the second doubles the storage cost. Moreover, neither approach handles stragglers: in the first, a straggler must be treated as a failure, which incurs a costly recovery, while in the second a straggler slows down both replicas because of the synchronization protocols used to coordinate them.
+
+*Spark Streaming* overcomes these challenges with a new stream processing model: instead of running long-lived queries, it divides the stream into a series of batches of tuples over small time intervals, then launches a Spark job to process each batch. Each computation is deterministic given the input data in its time interval, which makes *parallel recovery* possible: when a node fails, each node in the cluster works to recompute part of the lost node's RDDs. *Spark Streaming* recovers from stragglers in a similar way.
+
+The *D-Stream* is the *Spark Streaming* abstraction, and in the *D-Stream* model a streaming computation is treated as a series of deterministic batch computations over small time intervals. Each batch of the stream is stored as RDDs, and the result of processing those RDDs is also stored as RDDs. A *D-Stream* is a sequence of RDDs that can be transformed into new *D-Streams*. For example, a stream can be divided into one-second batches; to process the events in second *s*, *Spark Streaming* would first launch a map job over the events that happened in second *s*, and it would then launch a reduce job that takes both this mapped result and the reduced result for second *s - 1*. Thus each *D-Stream* turns into a sequence of RDDs, and the *lineage* (i.e., the sequence of operations used to build it) of the *D-Streams* is tracked for recovery. If a node fails, the lost RDD partitions are recovered by re-running the operations that were used to create them. The recomputation can be run in parallel on separate nodes since the *lineage* is distributed, and the work of a straggler can be re-run in the same way.
+
+```scala
+import org.apache.spark.SparkConf
+import org.apache.spark.streaming.{Seconds, StreamingContext}
+
+val conf = new SparkConf().setAppName("streaming-word-count")
+val ssc = new StreamingContext(conf, Seconds(1))        // 1-second batches
+val lines = ssc.socketTextStream("localhost", 9999)     // D-Stream of text lines
+val words = lines.flatMap(_.split(" "))
+val pairs = words.map(word => (word, 1))
+val wordCounts = pairs.reduceByKey(_ + _)
+wordCounts.print()
+ssc.start()                                             // begin consuming the stream
+ssc.awaitTermination()
+
+```
+
+Let's look at an example of how to count the words received from a TCP socket with *Spark Streaming*. We first set the processing interval to 1 second and create a *D-Stream* called lines that represents the streaming data received from the given TCP socket. We then split the lines by spaces into words, so the stream of words is represented as the words *D-Stream*. The words stream is further mapped to a *D-Stream* of pairs, which is then reduced to count the occurrences of each word in each batch of data.
+
+*Spark Streaming* handles the slow-recovery and straggler issues by dividing the stream into small batches over small time intervals and using RDDs to keep track of how the result of each batch is computed. This model makes recovery and straggler handling easier, because the computation can be re-run in parallel, while RDDs make the process fast.
+
+**Structured Streaming** Besides *Spark Streaming*, Apache Spark recently added a new, higher-level API, *Structured Streaming*{% cite structuredstreaming --file streaming %}, which is also built on top of the notion of RDDs but makes a strong guarantee that the output of the application is at all times equivalent to executing a batch job on a prefix of the data, a property known as *prefix integrity*. *Structured Streaming* makes sure that the output tables are always consistent with all the records in a prefix of the data, so out-of-order data is easy to identify and can simply be used to update its respective row in the result table. *Structured Streaming* provides a simple API where users specify a query as if it were over a static table, and the system automatically converts the query into a stream processing job.
+
+```scala
+// Read data continuously from an S3 location
+val inputDF = spark.readStream.json("s3://logs")
+
+// Do operations using the standard DataFrame API and write to MySQL
+inputDF.groupBy($"action", window($"time", "1 hour")).count()
+ .writeStream.format("jdbc")
+ .start("jdbc:mysql//...")
+
+```
+The programming model of *Structured Streaming* views the latest data as newly appended rows of an unbounded table: at every trigger interval, new rows are added to the table, which eventually updates the output table. Event time is natural in this view, since each event from a producer is a row and the event-time is just a column value in that row, which makes window-based aggregations simply a grouping on the event-time column.
+
+Unlike other systems, where users have to specify how to aggregate records when outputting, *Structured Streaming* takes care of updating the result table when new data arrives; users just specify different output modes to decide what gets written to external storage. For example, in Complete Mode the entire updated result table is written to external storage, while in Update Mode only the rows that were updated in the result table are written out.
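+
+A small sketch of how these output modes appear in the API (the source path is the same hypothetical one as above; `console` output is used just for illustration):
+
+```scala
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.functions.window
+
+object OutputModesSketch {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder.appName("output-modes").getOrCreate()
+    import spark.implicits._
+
+    val counts = spark.readStream.json("s3://logs")
+      .groupBy($"action", window($"time", "1 hour")).count()
+
+    // Complete Mode: the whole result table is rewritten on every trigger.
+    counts.writeStream.outputMode("complete").format("console").start()
+
+    // Update Mode: only rows whose counts changed since the last trigger are written.
+    counts.writeStream.outputMode("update").format("console").start()
+
+    spark.streams.awaitAnyTermination()
+  }
+}
+```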
+
+- Naiad
+
+*Naiad* {% cite murray2013naiad --file streaming %} is another distributed system for executing computations over data streams, developed at *Microsoft*. *Naiad* combines the high throughput of batch processors with the low latency of stream processors through its computation model, called *timely dataflow*, which enables dataflow computations with timestamps.
+
+A *timely dataflow* graph, like the topology described for *Storm*, contains stateful vertices that compute on the stream. Each graph contains input vertices and output vertices, which are responsible for consuming messages from and producing messages to external sources. Every message exchanged is associated with a timestamp called an epoch; the external source is responsible for providing the epoch and for notifying the input vertices of the end of each epoch. The notion of an epoch is powerful because it allows the producer to determine the start and end of each batch arbitrarily by assigning different epoch numbers to tuples. For example, epochs can be divided by time, as in *Spark Streaming*, or by the start of some event.
+
+<figure class="fullwidth">
+ <img src="{{ site.baseurl }}/chapter/9/Naiad.jpg" alt="A simple Timely Dataflow" />
+</figure>
+
+
+```csharp
+// Called once no more messages with timestamp `time` can arrive:
+// emit the per-record counts accumulated for that epoch, then free the state.
+void OnNotify(T time)
+{
+    foreach (var pair in counts[time])
+        this.SendBy(output, pair, time);
+    counts.Remove(time);
+}
+
+```
+
+In this example, A and B are processing vertices, each with one message being processed, and the OnNotify function runs on vertex B. For A, the number 2 in its message (e2, 2) indicates that the message was generated in epoch 2, so a counter on B counting the number of messages in epoch 2 would increase by 1. In the example code, *counts* is the table that records, per epoch, the number of occurrences of each distinct message received (it is updated in other callbacks). Once B is notified that an epoch has ended, the OnNotify function is triggered, and a count for each distinct input record in that epoch is sent to the output.
+
+
+*Naiad* can also execute cyclic dataflow programs. If there is a loop in the dataflow graph, for example where a message must be processed together with the processed result of a previous message, then each message circulating in the loop carries another counter along with its epoch. This loop counter increases by one each time the message completes a trip around the loop. The epoch and loop counters together let the system track the progress of the whole computation.
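+
+A toy sketch of these timestamps (not the Naiad API): a timestamp is an epoch plus one counter per nested loop; entering a loop appends a counter, each pass through the feedback edge increments the innermost one, and leaving the loop drops it again.
+
+```scala
+object TimelyTimestamps {
+  // A timestamp: the epoch set by the input, plus a loop counter for every
+  // cycle the message is currently inside.
+  case class Timestamp(epoch: Int, loopCounters: List[Int]) {
+    // Lexicographic comparison: epoch first, then loop counters element-wise.
+    def lessOrEqual(other: Timestamp): Boolean =
+      if (epoch != other.epoch) epoch < other.epoch
+      else loopCounters.zip(other.loopCounters)
+        .find { case (a, b) => a != b }
+        .map { case (a, b) => a < b }
+        .getOrElse(loopCounters.size <= other.loopCounters.size)
+  }
+
+  def enterLoop(t: Timestamp): Timestamp =          // ingress: append a fresh counter
+    t.copy(loopCounters = t.loopCounters :+ 0)
+  def nextIteration(t: Timestamp): Timestamp =      // feedback: increment the innermost counter
+    t.copy(loopCounters = t.loopCounters.init :+ (t.loopCounters.last + 1))
+  def exitLoop(t: Timestamp): Timestamp =           // egress: drop the innermost counter
+    t.copy(loopCounters = t.loopCounters.init)
+
+  def main(args: Array[String]): Unit = {
+    val t  = enterLoop(Timestamp(epoch = 2, loopCounters = Nil))  // a message enters a loop in epoch 2
+    val t2 = nextIteration(t)                                     // after one trip around the loop
+    println(t2)                  // Timestamp(2, List(1))
+    println(t.lessOrEqual(t2))   // true: work at t could still result in work at t2
+  }
+}
+```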
+
+Tracking progress is not a trivial task, since many messages with different timestamps are in flight between nodes. For example, suppose a node *n* is in charge of notifying the end of each epoch and performing the task 'count the number of events in each epoch'. When can *n* say for sure that a certain epoch has ended, so that the counting job can start? The tricky issue is that even if the node has been receiving messages with epoch *e*, there may still be messages with epoch *e-1* *circulating* in the dataflow (i.e., not yet consumed); if *n* fired the counting job now, it would produce wrong results, since those circulating messages would not be counted. *Naiad* accomplishes this by tracking all messages that have been sent but not yet consumed; from these, the system computes a *could-result-in* relation. In this relation, a message could lead to a notification for the end of epoch *e* only if the message has timestamp *t* <= *e* and there is a path from the message's location to the notification location *n* such that every ingress, egress and feedback vertex on that path preserves *t* <= *e*. This is guaranteed because messages are never sent "back in time". The *could-result-in* relation thus keeps track of epochs, and functions relying on epochs can work correctly.
+
+*Naiad* is the implementation of *timely dataflow* on a cluster, where the tracker on each machine broadcasts both the messages that have not yet been consumed and those recently consumed, so that every tracker maintains a single view of the global *could-result-in* relation and the progress of the whole computation can be tracked. *Naiad* also optimizes its performance by tackling micro-stragglers, for example by tuning the TCP layer to reduce network latency and customizing garbage collection.
+
+Another interesting point about *Naiad* is how it deals with failures. As described before, some systems achieve fault tolerance through replication, and systems such as *Storm* replay tuples from the beginning. *Spark Streaming* keeps the *lineage* of all operations and is able to rebuild the RDDs in parallel. *Naiad* can more or less be seen as taking the replay approach: it checkpoints the computation, and can perform more compact checkpointing when requested. When the system periodically checkpoints, all processes pause and finish their ongoing work; the system then checkpoints each vertex and resumes. To recover from a failure, all live processes revert to the last durable checkpoint, and the work of the failed vertex is reassigned to other processes. This method may have higher recovery latency than other approaches, due to both checkpointing and resuming.
+
+In short, *Naiad* allows processing of messages from different epochs and aggregation of results within the same epoch by using timestamps on messages. Moreover, by allowing producers to set epochs on messages arbitrarily (i.e., to set logical time), *Naiad* provides a powerful way to create batches out of streams. However, the computation model of *Naiad* introduces higher latency when dealing with failures.
+
+- Google Dataflow
+
+We have now seen three different systems that can process data streams at large scale; however, each of them is constrained in how it views the dataset. *Storm* performs stream processing on each tuple, while *Spark Streaming* and *Naiad* have their own ways of grouping tuples into small batches before processing. The authors of *Google Dataflow* {% cite akidau2015dataflow --file streaming %} believe that the fundamental problem with these views is that they are tied to the processing engine: for example, if you use *Spark Streaming* to process the stream, you can only group tuples into small time intervals. The motivation for *Google Dataflow* is therefore a general underlying system in which users can express whatever processing model they want.
+
+*Google Dataflow* is a system that allows batch, micro-batch and stream processing, letting users choose based on the tradeoffs of each model: latency versus resource constraints. *Google Dataflow* implements many features in order to achieve this goal, and we will briefly discuss them.
+
+*Google Dataflow* provides a windowing model that supports unaligned, event-time windows, which helps users express how to batch tuples together in a stream. Windowing slices a dataset into finite chunks to be processed as a group; one can think of it as the batching discussed before. Unaligned windows are windows that apply only to certain tuples during a period: for example, with an unaligned window *w[1:00, 2:00)(k)*, only events with key *k* during the time period [1:00, 2:00) are grouped by this window. This is powerful because it provides a way of batching tuples other than just by time before processing.
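+
+To make 'unaligned windows' concrete, here is a toy version of session windows in plain Scala (not the Dataflow API): each event initially gets its own window of length `gap`, and overlapping windows for the same key are merged, so every key ends up with its own, unaligned window boundaries.
+
+```scala
+object SessionWindows {
+  case class Window(start: Long, end: Long) {
+    def overlaps(other: Window): Boolean = start < other.end && other.start < end
+    def merge(other: Window): Window = Window(start min other.start, end max other.end)
+  }
+
+  // Assign: every event timestamp opens a window [t, t + gap).
+  def assign(timestamp: Long, gap: Long): Window = Window(timestamp, timestamp + gap)
+
+  // Merge: collapse overlapping windows of one key into sessions.
+  def mergeSessions(windows: List[Window]): List[Window] =
+    windows.sortBy(_.start).foldLeft(List.empty[Window]) { (acc, w) =>
+      acc match {
+        case last :: rest if last.overlaps(w) => last.merge(w) :: rest
+        case _                                => w :: acc
+      }
+    }.reverse
+
+  def main(args: Array[String]): Unit = {
+    val gap = 10L
+    val eventTimesForKeyK = List(1L, 5L, 30L, 36L, 100L)
+    println(mergeSessions(eventTimesForKeyK.map(assign(_, gap))))
+    // -> List(Window(1,15), Window(30,46), Window(100,110)): three sessions for key k
+  }
+}
+```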
+
+The next question is how *Google Dataflow* knows when to emit the results of a given window; this requires some signal that the window is done. *Google Dataflow* handles this by providing different triggering methods. One example is completion estimation, which is useful when combined with percentile watermarks: one might care more about quickly processing a minimum percentage of the input data than about finishing every last piece of it. Another interesting triggering method is responding to data arrival, which is useful for applications that group data by count: for example, the processor can fire once 100 data points have been received. These triggering semantics help make *Google Dataflow* a general-purpose processing system: the first method lets users deal with stragglers, while the second provides a way to support tuple-based windows.
+
+In addition to controlling when results are emitted, the system also provides control over how successive results for a window relate to each other. Results can be *discarding*: the window contents are discarded upon triggering, which makes data storage more efficient, since once the results are consumed they can be cleared from the buffers. Results can also be *accumulating*: upon triggering, the contents are left intact in persistent state, and later results refine previous ones. This mode is useful when downstream consumers are expected to overwrite old results as new ones arrive; for example, we might write the view count of a certain movie from the streaming pipeline with low latency, and refine the count at the end of the day by running a slower batch job over the aggregated data. The last mode is *accumulating & retracting*: in addition to the *accumulating* semantics, a copy of the emitted value is stored in persistent state, and when the window triggers again, a retraction of the previous value is emitted first, followed by the new value. This is useful when the previous result and the later one need to be combined. For example, suppose a process counts the number of views during a certain period, and a user goes offline during the window and comes back after the window has ended and the count *c* has already been emitted; the process now needs to retract the previous result *c* and indicate that the correct number is *c + 1*.
+
+```java
+PCollection<KV<String, Integer>> output = input
+ .apply(Window.trigger(Repeat(AtPeriod(1, MINUTE)))
+ .accumulating())
+ .apply(Sum.integersPerKey());
+
+```
+
+The example code above shows how to apply a trigger that fires repeatedly on a one-minute period, where PCollection can be viewed as the data stream abstraction in *Google Dataflow*. The *accumulating* mode is also specified, so that the *Sum* can be refined over time.
+
+*Google Dataflow* also relies on MillWheel{% cite akidau2013millwheel --file streaming %} as the underlying execution engine to achieve exactly-once delivery of tuples. MillWheel is a framework for building low-latency data-processing applications used at Google. It achieves exactly-once delivery by first checking incoming records and discarding duplicates, then holding productions (i.e., records to be produced to any downstream stream) as pending until they are checkpointed; only then are the pending productions sent, and they are retried until acknowledged by the receiver.
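+
+The sketch below is a toy, single-process illustration of these two pieces (it is not the MillWheel API): incoming records are deduplicated by a unique id, and produced records are held as pending until the downstream receiver acknowledges them, being retried otherwise.
+
+```scala
+import scala.collection.mutable
+
+class ExactlyOnceStage[A](send: (Long, A) => Unit) {
+  private val seen    = mutable.Set.empty[Long]      // ids already processed (dedup state)
+  private val pending = mutable.Map.empty[Long, A]   // produced but not yet acknowledged
+
+  def receive(id: Long, record: A)(process: A => A): Unit =
+    if (seen.add(id)) {              // a duplicate delivery is silently dropped
+      val out = process(record)
+      pending(id) = out              // record the production before sending it
+      send(id, out)
+    }
+
+  def onAck(id: Long): Unit = pending.remove(id)                         // safe to forget
+  def onTimeout(id: Long): Unit = pending.get(id).foreach(send(id, _))   // retry until acked
+}
+```
+
+In MillWheel itself, both the dedup data and the pending productions are persisted to a backing store, so the same guarantees survive a worker crash.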
+
+In conclusion, one of the core principles driving *Google Dataflow* is accommodating the diversity of known use cases, which it does by providing a rich set of abstractions for windowing, triggering and output control. Compared to the more specialized systems discussed above, *Google Dataflow* is a general system that can fulfill batch, micro-batch, and stream processing requirements.
+
+
+## The systems being used nowadays
+
+So far we have discussed what stream processing is and the different models and systems built for it. As shown before, the systems differ in how they view the stream: for example, *Storm* operates at the level of individual tuples, while *Spark Streaming* groups tuples into micro-batches and processes them at the level of a batch. They also differ in how they deal with failures: *Storm* replays tuples from the spout, while *Naiad* relies on checkpointing. We then introduced *Google Dataflow*, which seems to be the most flexible tool so far, allowing users to express how to group and control the tuples in a stream.
+
+Despite all their differences, they all started with more or less the same goal: to be *the* stream processing system used by companies, and we showed several examples of why companies might need such a system. In this section, we discuss three companies that use stream processing systems at the core of their business: Alibaba, Twitter and Spotify.
+
+### Alibaba
+Alibaba is the largest e-commerce retailer in the world, with annual sales in 2015 greater than eBay and Amazon combined. Alibaba Search is its personalized search and recommendation platform, and Apache Flink powers critical aspects of it{% cite alibabaflink --file streaming %}.
+
+Alibaba's processing engine runs two different pipelines: a batch pipeline and a streaming pipeline, where the first processes all data sources and the latter processes updates that occur after the batch job finishes. The second pipeline is an example of stream processing. One example application for the streaming pipeline is the online machine learning recommendation system. On special days of the year (e.g., Singles' Day in China, which is very similar to Black Friday in the U.S.), transaction volume is huge and previously trained models would not correctly reflect current trends, so Alibaba needs a streaming job to take real-time data into account. There are many reasons Alibaba chose Flink: for example, Flink is general enough to express both the batch pipeline and the streaming pipeline. Another reason is that changes to products must be reflected in the final search results, so at-least-once semantics is needed, while other products at Alibaba may need exactly-once semantics, and Flink provides both.
+
+Alibaba developed a forked version of Flink called Blink to fit some of its unique requirements. One important improvement is a more robust integration with YARN{% cite hadoopyarn --file streaming %}, which is used as the global resource manager for Flink. YARN requires a Flink job to grab all required resources up front and cannot acquire or release resources dynamically. Since the Alibaba search engine currently runs on over 1000 machines, better resource utilization is critical. Blink improves on this by giving each job its own JobMaster to request and release resources as the job requires, which optimizes resource usage.
+
+### Twitter
+
+Twitter is one of the 'go-to' examples people think of when considering large scale stream processing, since it has a huge amount of data that needs to be processed in real time. Twitter bought the company that created *Storm* and used *Storm* as its real-time analysis tool for several years {% cite toshniwal2014storm --file streaming %}. However, as the data volume and the complexity of use cases grew, Twitter needed to build a new real-time stream processing system, since *Storm* could no longer satisfy the new requirements. We discuss how *Storm* was used at Twitter and then the system built to replace it, *Heron*.
+
+- Storm@Twitter
+
+Twitter requires complex computation on streaming data in real time, since each interaction with a user requires making a number of complex decisions, often based on data that has just been created, and it used *Storm* as its real-time distributed stream processing engine. As described before, *Storm* is one of the early open-source and popular stream processing systems in use today; it was developed by Nathan Marz at BackType, which was acquired by Twitter in 2011. After the acquisition, *Storm* was improved and open-sourced by Twitter and then picked up by various other organizations.
+
+We first briefly introduce the structure of *Storm* at Twitter. *Storm* runs on a distributed cluster, and clients submit topologies to a master node, which is in charge of distributing and coordinating the execution of the topologies. The actual spouts and bolts are tasks; multiple tasks are grouped into an executor, and multiple executors are in turn grouped into a worker process. Worker processes are distributed to worker nodes (i.e., machines), each of which may run multiple worker processes. Each worker node runs a supervisor that communicates with the master node, so that the state of the computation can be tracked.
+
+As shown before, *Storm* can guarantee that each tuple is processed 'at least once'; at Twitter, however, *Storm* provides two kinds of semantic guarantees: 'at least once' and 'at most once'. 'At least once' semantics is guaranteed by the directed acyclic graph as shown before, and 'at most once' semantics is achieved by dropping tuples in case of failure (e.g., by disabling the acknowledgements of each tuple). Note that for 'at least once' semantics, the coordinator (i.e., ZooKeeper) checkpoints the processed tuples in the topology, and after recovering from a failure the system can resume processing tuples from the last recorded 'checkpoint'.
+
+*Storm* fulfilled many requirements at Twitter with satisfactory performance. *Storm* ran on hundreds of servers, with several hundred topologies running on these clusters, some spanning more than a few hundred nodes; terabytes of data flowed through the clusters every day, generating several billion output tuples. These topologies were used for tasks ranging from simple filtering and aggregation of various streams to complex machine learning on streaming data. *Storm* was resilient to failures and achieved relatively low latency: a machine could be taken down for maintenance without interrupting the topology, and the 99th percentile response time for processing a tuple was close to 1 ms.
+
+In conclusion, *Storm* was a critical piece of infrastructure at Twitter that powered many of the real-time, data-driven decisions made there.
+
+- Twitter Heron
+
+*Storm* long served as the core of Twitter's real-time analysis; however, as the scale of data being processed increased, along with the diversity and number of use cases, many limitations of *Storm* became apparent {% cite kulkarni2015twitter --file streaming %}.
+
+Several issues with *Storm* made using it at Twitter challenging. The first is debuggability: there is no clean mapping from the logical units of computation in the topology to physical processes, which makes finding the root cause of misbehavior extremely hard. Another challenge is that as cluster resources become precious, the need for dedicated cluster resources in *Storm* leads to inefficiency, and it is better to share resources across different types of systems. In addition, Twitter needed a more efficient system: simply because of the increased scale, any improvement in performance translates into a huge benefit.
+
+Twitter realized that in order to meet all these needs, it had to build a new real-time stream processing system, Heron, which is API-compatible with Storm and provides significant performance improvements and lower resource consumption, along with better debuggability, scalability and manageability.
+
+A key design goal for Heron is compatibility with the *Storm* API, so Heron runs topologies, graphs of spouts and bolts, just like Storm. Unlike *Storm*, though, a Heron topology is translated into a physical plan before execution, and there are multiple components in the physical plan.
+
+Each topology runs as an Aurora{% cite apacheaurora --file streaming %} job instead of using Nimbus{% cite nimbusproject --file streaming %} as the scheduler. Nimbus used to be the master node of *Storm* that schedules and manages all running topologies: it deploys topologies on *Storm* and assigns workers to execute them. Aurora is likewise a service scheduler that can manage long-running services; Twitter chose Aurora because it is developed and used by other Twitter projects. Each Aurora job consists of several containers. The first container runs the Topology Master, which provides a single point of contact for discovering the status of the topology and also serves as the gateway for topology metrics through an endpoint. The other containers each run a Stream Manager, a Metrics Manager and a number of Heron Instances.
+
+The key functionality of each Stream Manager is to route tuples efficiently: all Stream Managers are connected to each other, and tuples between Heron Instances in different containers are transmitted through their Stream Managers, so the Stream Managers can be viewed as super-nodes for communication. The Stream Manager also provides a backpressure mechanism: if a receiving component is unable to handle incoming data, the senders dynamically adjust the rate at which data flows through the network. For example, if the Stream Managers of the bolts are overwhelmed, they notify the Stream Managers of the spouts to slow down, ensuring all the data is properly processed. A Heron Instance carries out the real work of a spout or a bolt; unlike a worker in *Storm*, each Heron Instance runs only a single task as a process. In addition to performing the work, a Heron Instance also collects several metrics, which are sent to the Metrics Manager in the same container and on to the central monitoring system.
+
+The components of a Heron topology are clearly separated, so failures at different levels are handled differently. For example, if the Topology Master dies, the container restarts the process, and the stand-by Topology Master takes over as master while the restarted one becomes the stand-by. When a Stream Manager dies, it is restarted in the same container, and after rediscovering the Topology Master it fetches its state and checks whether any changes need to be made. Similarly, all other failures are handled gracefully by Heron.
+
+Heron addresses the challenges of *Storm*. First, each task is performed by a single Heron Instance, and the different functionalities are abstracted into different levels, which makes debugging clearer. Second, the provisioning of resources is abstracted out, which makes sharing infrastructure with other systems easier. Third, Heron provides multiple metrics along with the backpressure mechanism, which can be used to reason precisely about, and achieve, a consistent rate of delivering results.
+
+*Storm* has been decommissioned, and Heron is now the de facto streaming system at Twitter. An interesting note is that after migrating all topologies to Heron, there was an overall 3x reduction in hardware. Not only does Heron reduce the infrastructure needed, it also outperforms *Storm*, delivering 6-14x improvements in throughput and 5-10x reductions in tuple latency.
+
+### Spotify
+Another company that deploys a large scale distributed streaming system is Spotify {% cite spotifylabs --file streaming %}. Every small piece of information, such as listening to a song or searching for an artist, is sent to Spotify's servers and processed. Many Spotify features, such as music and playlist recommendations, need such a stream processing system. Originally, Spotify collected all the data generated by client software, stored it in HDFS, and processed it on an hourly basis with a batch job (i.e., the data collected each hour was stored and processed together).
+
+In the original Spotify architecture, each job had to determine, with high probability, that all data for an hourly bucket had been successfully written to persistent storage before the job fired. Each job ran as a batch job reading files from storage, so late-arriving data for an already completed bucket could not be appended, since jobs generally read the data of an hourly bucket only once; each job therefore had to treat late data specially. All late data was written to the currently open hourly bucket instead.
+
+Spotify then decided to use *Google Dataflow*, since the features it provides are exactly what Spotify wants. The previous batch jobs can be written as streaming jobs with a one-hour window size, all the data in the stream can be grouped by both window and key, and late-arriving data can be handled gracefully if the accumulation mode is set to *accumulating & retracting*. *Google Dataflow* also reduces the export latency of the hourly analysis results, since when assigning windows, Spotify sets an early trigger that emits a pane (i.e., a partial result) every N tuples until the window closes.
+
+The worst end-to-end latency observed with the new Spotify system based on *Google Dataflow* is four times lower than with the previous system, with much lower operational overhead.
## References