author    Heather Miller <heather.miller@epfl.ch>    2016-12-18 15:59:07 -0500
committer GitHub <noreply@github.com>    2016-12-18 15:59:07 -0500
commit    b8ea0136877b83801707f85ca902ce033fffe360 (patch)
tree      0405d53412ce19830d1550f08d4879148d79201f /chapter/1
parent    313f5c7bdd02346683b376f54301792e9f2e48f5 (diff)
parent    5b299d189255e04ca3481a15e15b2a2e10b29b42 (diff)
Merge branch 'master' into Jingjing-Abhilash-bigdata-v2
Diffstat (limited to 'chapter/1')
-rw-r--r--  chapter/1/figures/grpc-benchmark.png                 bin 0 -> 17014 bytes
-rw-r--r--  chapter/1/figures/grpc-client-transport-handler.png  bin 0 -> 67834 bytes
-rw-r--r--  chapter/1/figures/grpc-cross-language.png            bin 0 -> 27394 bytes
-rw-r--r--  chapter/1/figures/grpc-googleapis.png                bin 0 -> 33354 bytes
-rw-r--r--  chapter/1/figures/grpc-languages.png                 bin 0 -> 47003 bytes
-rw-r--r--  chapter/1/figures/grpc-server-transport-handler.png  bin 0 -> 60913 bytes
-rw-r--r--  chapter/1/figures/hello-world-client.png             bin 0 -> 30161 bytes
-rw-r--r--  chapter/1/figures/hello-world-server.png             bin 0 -> 13005 bytes
-rw-r--r--  chapter/1/figures/http2-frame.png                    bin 0 -> 12057 bytes
-rw-r--r--  chapter/1/figures/http2-stream-lifecycle.png         bin 0 -> 49038 bytes
-rw-r--r--  chapter/1/figures/protobuf-types.png                 bin 0 -> 19941 bytes
-rw-r--r--  chapter/1/gRPC.md    323
-rw-r--r--  chapter/1/rpc.md     378
13 files changed, 697 insertions, 4 deletions
diff --git a/chapter/1/figures/grpc-benchmark.png b/chapter/1/figures/grpc-benchmark.png
new file mode 100644
index 0000000..9f39c71
--- /dev/null
+++ b/chapter/1/figures/grpc-benchmark.png
Binary files differ
diff --git a/chapter/1/figures/grpc-client-transport-handler.png b/chapter/1/figures/grpc-client-transport-handler.png
new file mode 100644
index 0000000..edd5236
--- /dev/null
+++ b/chapter/1/figures/grpc-client-transport-handler.png
Binary files differ
diff --git a/chapter/1/figures/grpc-cross-language.png b/chapter/1/figures/grpc-cross-language.png
new file mode 100644
index 0000000..c600f67
--- /dev/null
+++ b/chapter/1/figures/grpc-cross-language.png
Binary files differ
diff --git a/chapter/1/figures/grpc-googleapis.png b/chapter/1/figures/grpc-googleapis.png
new file mode 100644
index 0000000..62718e5
--- /dev/null
+++ b/chapter/1/figures/grpc-googleapis.png
Binary files differ
diff --git a/chapter/1/figures/grpc-languages.png b/chapter/1/figures/grpc-languages.png
new file mode 100644
index 0000000..1f1c50d
--- /dev/null
+++ b/chapter/1/figures/grpc-languages.png
Binary files differ
diff --git a/chapter/1/figures/grpc-server-transport-handler.png b/chapter/1/figures/grpc-server-transport-handler.png
new file mode 100644
index 0000000..fe895c0
--- /dev/null
+++ b/chapter/1/figures/grpc-server-transport-handler.png
Binary files differ
diff --git a/chapter/1/figures/hello-world-client.png b/chapter/1/figures/hello-world-client.png
new file mode 100644
index 0000000..c4cf7d4
--- /dev/null
+++ b/chapter/1/figures/hello-world-client.png
Binary files differ
diff --git a/chapter/1/figures/hello-world-server.png b/chapter/1/figures/hello-world-server.png
new file mode 100644
index 0000000..a51554b
--- /dev/null
+++ b/chapter/1/figures/hello-world-server.png
Binary files differ
diff --git a/chapter/1/figures/http2-frame.png b/chapter/1/figures/http2-frame.png
new file mode 100644
index 0000000..59d6ed5
--- /dev/null
+++ b/chapter/1/figures/http2-frame.png
Binary files differ
diff --git a/chapter/1/figures/http2-stream-lifecycle.png b/chapter/1/figures/http2-stream-lifecycle.png
new file mode 100644
index 0000000..87333cb
--- /dev/null
+++ b/chapter/1/figures/http2-stream-lifecycle.png
Binary files differ
diff --git a/chapter/1/figures/protobuf-types.png b/chapter/1/figures/protobuf-types.png
new file mode 100644
index 0000000..aaf3a1e
--- /dev/null
+++ b/chapter/1/figures/protobuf-types.png
Binary files differ
diff --git a/chapter/1/gRPC.md b/chapter/1/gRPC.md
new file mode 100644
index 0000000..f6c47b7
--- /dev/null
+++ b/chapter/1/gRPC.md
@@ -0,0 +1,323 @@
+---
+layout: page
+title: "gRPC"
+by: "Paul Grosu (Northeastern U.), Muzammil Abdul Rehman (Northeastern U.), Eric Anderson (Google, Inc.), Vijay Pai (Google, Inc.), and Heather Miller (Northeastern U.)"
+---
+
+<h1>
+<p align="center">gRPC</p>
+</h1>
+
+<h4><em>
+<p align="center">Paul Grosu (Northeastern U.), Muzammil Abdul Rehman (Northeastern U.), Eric Anderson (Google, Inc.), Vijay Pai (Google, Inc.), and Heather Miller (Northeastern U.)</p>
+</em></h4>
+
+<hr>
+
+<h3><em><p align="center">Abstract</p></em></h3>
+
+<em>gRPC arose from a collaboration between Google and Square as a public replacement of Stubby, ARCWire and Sake {% cite Apigee %}. The gRPC framework is a form of Actor Model based on an IDL (Interface Description Language), which is defined via the Protocol Buffer message format. With the introduction of HTTP/2, the capabilities of Google's internal Stubby and Square's Sake frameworks have now been made available to the public. By working on top of the HTTP/2 protocol, gRPC enables messages to be multiplexed and compressed bidirectionally as preemptive streams, maximizing the capacity of any microservices ecosystem. Google has also taken a new approach to public projects: instead of just releasing a paper describing the concepts, it now also provides an implementation showing how to properly interpret the standard.
+</em>
+
+<h3><em>Introduction</em></h3>
+
+In order to understand gRPC and the flexibility it offers in letting a microservices ecosystem evolve into a Reactive Actor Model, it is important to appreciate the nuances of the HTTP/2 protocol upon which it is based. Afterward we describe the gRPC framework - focusing specifically on the gRPC-Java implementation - with the intention of expanding this chapter over time to all implementations of gRPC. Finally, we cover examples demonstrating these ideas, taking a user through the initial steps of working with the gRPC-Java framework.
+
+<h3>1 <em>HTTP/2</em></h3>
+
+The HTTP/1.1 protocol has been a success for some time, though the growth of distributed computing - especially in the area of microservices - exposed the need for some key features. The trend of building more modularized functional units, organically composed on a <em>share-nothing model</em> with bidirectional, high-throughput request and response flows, demands a new protocol for communication and integration. Thus HTTP/2 was born: a binary wire protocol providing compressed streams that can be multiplexed for concurrency. Since many microservices implementations scan message headers before processing any payload in order to scale up the processing and routing of messages, HTTP/2 provides header compression for this purpose. One last important benefit is that the server endpoint can push cached resources to the client based on anticipated future communication, dramatically saving client communication time and processing.
+
+<h3>1.1 <em>HTTP/2 Frames</em></h3>
+
+The HTTP/2 protocol is a framed protocol, which expands the capability for bidirectional, asynchronous communication. Every message is part of a frame that has a header, a frame type, and a stream identifier, aside from the standard frame length used for processing. Each stream can have a priority, allowing dependencies between streams to form a <em>priority tree</em>. The data can be either a request or a response, enabling bidirectional communication, with flags available for stream termination, flow control with priority settings, continuation, and push responses from the server for client confirmation. Below is the format of the HTTP/2 frame {% cite RFC7540 %}:
+
+<p align="center">
+ <img src="figures/http2-frame.png" /><br>
+  <em>Figure 1: The encoding of an HTTP/2 frame.</em>
+</p>
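
The fixed 9-octet frame header shown in Figure 1 (24-bit length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream identifier, per RFC 7540 §4.1) can be packed and unpacked in a few lines of Python. This is an illustrative sketch, not code from any gRPC implementation:

```python
import struct

def encode_frame_header(length, frame_type, flags, stream_id):
    """Pack the 9-octet HTTP/2 frame header: 24-bit length, 8-bit type,
    8-bit flags, then 1 reserved bit + 31-bit stream identifier."""
    return (struct.pack(">I", length)[1:]                      # low 3 bytes of length
            + struct.pack(">BBI", frame_type, flags,
                          stream_id & 0x7FFFFFFF))             # clear reserved bit

def decode_frame_header(header):
    """Inverse of encode_frame_header for a 9-byte header."""
    length = int.from_bytes(header[0:3], "big")
    frame_type, flags = header[3], header[4]
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF
    return length, frame_type, flags, stream_id

# A DATA frame (type 0x0) carrying 2 payload bytes on stream 1,
# with the END_STREAM flag (0x1) set.
hdr = encode_frame_header(2, 0x0, 0x1, 1)
assert len(hdr) == 9
assert decode_frame_header(hdr) == (2, 0x0, 0x1, 1)
```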
+
+<h3>1.2 <em>Header Compression</em></h3>
+
+The HTTP header is one of the primary means of passing information about the state of the endpoints, the request or response, and the payload. This lets endpoints save time when processing a large quantity of streams, since information can be forwarded along without wasting time inspecting the payload. Since header information can be quite large, it is now possible to compress headers, allowing for better throughput and a larger capacity of stored stateful information.
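
The real HTTP/2 header compression scheme (HPACK) is considerably more sophisticated, but its core idea - a header sent once is remembered by both sides, and later occurrences are replaced by a short table index - can be sketched with a toy codec. This is an illustration of the indexing idea only, not the HPACK algorithm:

```python
class ToyHeaderCodec:
    """Toy header-indexing codec: both endpoints grow identical tables,
    so a repeated header can be sent as a small integer index."""
    def __init__(self):
        self.table = []

    def encode(self, headers):
        out = []
        for h in headers:
            if h in self.table:
                out.append(self.table.index(h))   # repeat: send the index
            else:
                self.table.append(h)
                out.append(h)                     # first use: send the literal
        return out

    def decode(self, encoded):
        headers = []
        for item in encoded:
            if isinstance(item, int):
                headers.append(self.table[item])  # look up a known header
            else:
                self.table.append(item)           # learn a new header
                headers.append(item)
        return headers

sender, receiver = ToyHeaderCodec(), ToyHeaderCodec()
first = sender.encode([(":method", "GET"), (":path", "/")])
second = sender.encode([(":method", "GET"), (":path", "/")])
assert receiver.decode(first) == [(":method", "GET"), (":path", "/")]
assert second == [0, 1]   # the repeated headers collapse to two indices
assert receiver.decode(second) == [(":method", "GET"), (":path", "/")]
```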
+
+<h3>1.3 <em>Multiplexed Streams</em></h3>
+
+As streams are core to the implementation of HTTP/2, it is important to discuss the details of how they work in the protocol. Many streams can be open simultaneously across many endpoints, and each stream will be in one of the following states. Streams are multiplexed together, forming a chain of frames transmitted over the wire and allowing asynchronous, bidirectional concurrency at the receiving endpoint. Below is the lifecycle of a stream {% cite RFC7540 %}:
+
+<p align="center">
+ <img src="figures/http2-stream-lifecycle.png" /><br>
+  <em>Figure 2: The lifecycle of an HTTP/2 stream.</em>
+</p>
+
+To better understand this diagram, it is important to define some of the terms in it:
+
+<em>PUSH_PROMISE</em> - Sent by one endpoint to alert another that it will be sending data over the wire.
+
+<em>RST_STREAM</em> - Allows immediate termination of a stream.
+
+<em>PRIORITY</em> - Sent by an endpoint to indicate the priority of a stream.
+
+<em>END_STREAM</em> - This flag denotes the end of a <em>DATA</em> frame.
+
+<em>HEADERS</em> - This frame opens a stream.
+
+<em>Idle</em> - The initial state of a stream; sending or receiving a <em>HEADERS</em> frame transitions it to <em>Open</em>.
+
+<em>Reserved (Local)</em> - The state of a stream after the local endpoint has sent a <em>PUSH_PROMISE</em> frame.
+
+<em>Reserved (Remote)</em> - The state of a stream that has been reserved by a remote endpoint via a <em>PUSH_PROMISE</em> frame.
+
+<em>Open</em> - In this state both endpoints can send frames.
+
+<em>Closed</em> - This is the terminal state.
+
+<em>Half-Closed (Local)</em> - The local endpoint can no longer send frames except <em>WINDOW_UPDATE</em>, <em>PRIORITY</em>, and <em>RST_STREAM</em>.
+
+<em>Half-Closed (Remote)</em> - The remote endpoint is no longer using the stream to send data frames.
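
The transitions between these states can be modeled as a small state machine. The sketch below covers only a simplified subset of the lifecycle in Figure 2 (HEADERS, END_STREAM, and RST_STREAM transitions; the Reserved states are omitted):

```python
# Simplified HTTP/2 stream state machine (subset of RFC 7540, Section 5.1).
TRANSITIONS = {
    ("idle", "send_headers"): "open",
    ("idle", "recv_headers"): "open",
    ("open", "send_end_stream"): "half_closed_local",
    ("open", "recv_end_stream"): "half_closed_remote",
    ("half_closed_local", "recv_end_stream"): "closed",
    ("half_closed_remote", "send_end_stream"): "closed",
    ("open", "rst_stream"): "closed",
    ("half_closed_local", "rst_stream"): "closed",
    ("half_closed_remote", "rst_stream"): "closed",
}

class Stream:
    def __init__(self, stream_id):
        self.stream_id = stream_id
        self.state = "idle"

    def on(self, event):
        """Apply an event; illegal events are protocol errors."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event} not allowed in state {self.state}")
        self.state = TRANSITIONS[key]
        return self.state

s = Stream(1)
s.on("send_headers")                         # idle -> open
s.on("send_end_stream")                      # open -> half-closed (local)
assert s.on("recv_end_stream") == "closed"   # both sides done
```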
+
+<h3>1.4 <em>Flow Control of Streams</em></h3>
+
+Since many streams compete for the bandwidth of a connection, flow control is needed to prevent bottlenecks and collisions in the transmission. This is done via the <em>WINDOW_UPDATE</em> frame for every stream - and for the overall connection as well - to let the sender know how much room the receiving endpoint has for processing new data.
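
The sender-side accounting works roughly as follows: the sender decrements its view of the receiver's window as it transmits data, blocks when the window is exhausted, and resumes when a WINDOW_UPDATE replenishes it. This is a simplified per-stream sketch (real HTTP/2 also maintains a connection-level window):

```python
class FlowControlledSender:
    """Sender-side view of one stream's flow-control window."""
    def __init__(self, initial_window=65535):   # HTTP/2 default window size
        self.window = initial_window

    def send(self, payload):
        if len(payload) > self.window:
            raise BlockingIOError("window exhausted; wait for WINDOW_UPDATE")
        self.window -= len(payload)             # consume window credit
        return payload

    def on_window_update(self, increment):
        self.window += increment                # receiver granted more room

s = FlowControlledSender(initial_window=10)
s.send(b"0123456789")        # exactly fills the window
blocked = False
try:
    s.send(b"x")             # would overrun: sender must wait
except BlockingIOError:
    blocked = True
s.on_window_update(5)        # WINDOW_UPDATE arrives
s.send(b"x")
assert blocked and s.window == 4
```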
+
+<h3>2 <em>Protocol Buffers with RPC</em></h3>
+
+Though gRPC is built on top of HTTP/2, an IDL was needed to define the communication between endpoints. The natural direction was to use Protocol Buffers, a method of structuring key-value-based data for serialization between a server and client. At the start of gRPC development only version 2.0 (proto2) was available, which only implemented data structures without any request/response mechanism. An example of a Protocol Buffer data structure looks something like this:
+
+```
+// A message containing the user's name.
+message Hello {
+ string name = 1;
+}
+```
+<p align="center">
+ <em>Figure 3: Protocol Buffer version 2.0 representing a message data-structure.</em>
+</p>
+
+This message is also encoded for the highest compression when sent over the wire. For example, suppose the message is the string <em>"Hi"</em>. Every Protocol Buffer type has a wire-type value, and in this case a string has a value of `2`, as noted in Table 1 {% cite Protobuf-Types %}.
+
+<p align="center">
+ <img src="figures/protobuf-types.png" /><br>
+ <em>Table 1: Tag values for Protocol Buffer types.</em>
+</p>
+
+One will notice that there is a number associated with each field in the Protocol Buffer definition, which represents its <em>tag</em>. In Figure 3, the field `name` has a tag of `1`. When a message gets encoded, each field (key) starts with a one-byte value (8 bits), where the least-significant 3 bits encode the <em>type</em> and the remaining bits the <em>tag</em>. In this case the tag is `1` and the type is `2`, so the encoding is `00001 010`, which has a hexadecimal value of `0A`. The following byte is the length of the string, which is `2`, followed by the string itself as `48` and `69`, the ASCII codes of `H` and `i`. Thus the whole transmission, in hexadecimal, looks as follows:
+
+```
+0A 02 48 69
+```
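
This encoding can be reproduced in a few lines of Python. The sketch below covers only the simple case discussed here - a single length-delimited field whose tag and length each fit in one byte (longer values and larger tags use varint encoding):

```python
def encode_string_field(tag, value):
    """Encode a proto string field: key byte = (tag << 3) | wire_type,
    then a length byte, then the UTF-8 bytes. Valid only for short
    strings and tags small enough to fit in a single byte."""
    WIRE_TYPE_LENGTH_DELIMITED = 2
    data = value.encode("utf-8")
    key = (tag << 3) | WIRE_TYPE_LENGTH_DELIMITED
    return bytes([key, len(data)]) + data

encoded = encode_string_field(1, "Hi")
assert encoded.hex() == "0a024869"   # i.e. 0A 02 48 69
```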
+
+Thus the language had to be updated to support gRPC, and a service definition with a request and a response message was added in version 3.0.0 of Protocol Buffers. The updated implementation looks as follows {% cite HelloWorldProto %}:
+
+```
+// The request message containing the user's name.
+message HelloRequest {
+ string name = 1;
+}
+
+// The response message containing the greetings
+message HelloReply {
+ string message = 1;
+}
+
+// The greeting service definition.
+service Greeter {
+ // Sends a greeting
+ rpc SayHello (HelloRequest) returns (HelloReply) {}
+}
+```
+<p align="center">
+ <em>Figure 4: Protocol Buffer version 3.0.0 representing a message data-structure with the accompanied RPC definition.</em>
+</p>
+
+Notice the addition of a service, where the RPC call uses one of the messages as the structure of the <em>Request</em>, with the other being the <em>Response</em> message format.
+
+Once such a Proto file is written, one compiles it with the gRPC code generators to produce the <em>Client</em> and <em>Server</em> files representing the classical two endpoints of an RPC implementation.
+
+<h3>3 <em>gRPC</em></h3>
+
+gRPC was built on top of HTTP/2; we will cover the specifics of gRPC-Java here and expand to the other implementations over time. gRPC is a cross-platform framework that allows integration across many languages, as depicted in Figure 5 {% cite gRPC-Overview %}.
+
+<p align="center">
+ <img src="figures/grpc-cross-language.png" /><br>
+ <em>Figure 5: gRPC allows for asynchronous language-agnostic message passing via Protocol Buffers.</em>
+</p>
+
+To ensure scalability, benchmarks are run on a daily basis to verify that gRPC performs optimally under high-throughput conditions, as illustrated in Figure 6 {% cite gRPC-Benchmark %}.
+
+<p align="center">
+ <img src="figures/grpc-benchmark.png" /><br>
+ <em>Figure 6: Benchmark showing the queries-per-second on two virtual machines with 32 cores each.</em>
+</p>
+
+To standardize, most of the public Google APIs - including the Speech API, Vision API, Bigtable, Pub/Sub, etc. - have been ported to support gRPC, and their definitions can be found at the following location:
+
+<p align="center">
+ <img src="figures/grpc-googleapis.png" /><br>
+  <em>Figure 7: The public Google APIs have been updated for gRPC, and can be found at <a href="https://github.com/googleapis/googleapis/tree/master/google">https://github.com/googleapis/googleapis/tree/master/google</a></em>
+</p>
+
+
+<h3>3.1 <em>Supported Languages</em></h3>
+
+The officially supported languages are listed in Table 2 {% cite gRPC-Languages %}.
+
+<p align="center">
+ <img src="figures/grpc-languages.png" /><br>
+ <em>Table 2: Officially supported languages by gRPC.</em>
+</p>
+
+<h3>3.2 <em>Authentication</em></h3>
+
+There are two methods of authentication that are available in gRPC:
+
+* SSL/TLS
+* Google Token (via OAuth2)
+
+gRPC is flexible in that one can plug in a custom authentication system if that is preferred.
+
+<h3>3.3 <em>Development Cycle</em></h3>
+
+In its simplest form, working with gRPC follows a structured set of steps with this general flow:
+
+<em>1. Download gRPC for the language of interest.</em>
+
+<em>2. Implement the Request and Response definition in a ProtoBuf file.</em>
+
+<em>3. Compile the ProtoBuf file and run the code-generators for the specific language. This will generate the Client and Server endpoints.</em>
+
+<em>4. Customize the Client and Server code for the desired implementation.</em>
+
+Most of these steps will require tweaking the Protobuf file and testing the throughput to ensure that the network and CPU capacities are optimally utilized.
+
+<h3>3.4 <em>The gRPC Framework (Stub, Channel and Transport Layer)</em></h3>
+
+One starts by initializing a communication <em>Channel</em> between the <em>Client</em> and a <em>Server</em> and storing it as a <em>Stub</em>. The <em>Credentials</em> are provided to the Channel when it is initialized. These form a <em>Context</em> for the Client's connection to the Server. Then a <em>Request</em> can be built based on the definition in the Protobuf file. The Request and the associated expected <em>Response</em> are executed by the <em>service</em> defined in the Protobuf file. The Response is then parsed for any data coming from the Channel.
+
+The connection can be asynchronous and bidirectionally streaming so that data is constantly flowing back and available to be read when ready. This allows one to treat the Client and Server as endpoints where one can adjust not just the flow but also intercept and decorate calls to filter, request, and retrieve the data of interest.
+
+The <em>Transport Layer</em> performs the retrieval and placement of the binary protocol on the wire. <em>gRPC-Java</em> has three implementations - <em>Netty</em>, <em>OkHttp</em>, and <em>inProcess</em> - though a user can implement their own.
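
The Stub/Channel/Transport layering can be modeled in a few lines. This is a conceptual sketch only - the class and method names are invented for illustration and are not the real gRPC API; JSON stands in for Protocol Buffers and a direct function call stands in for HTTP/2 on the wire:

```python
import json

class InProcessTransport:
    """Toy transport: turns a request into bytes and hands it to a
    server-side handler, standing in for HTTP/2 framing on a real wire."""
    def __init__(self, handler):
        self.handler = handler

    def round_trip(self, payload: bytes) -> bytes:
        request = json.loads(payload)
        return json.dumps(self.handler(request)).encode()

class Channel:
    """Toy channel: owns a transport and executes calls over it."""
    def __init__(self, transport):
        self.transport = transport

    def call(self, method, request):
        wire = json.dumps({"method": method, "body": request}).encode()
        return json.loads(self.transport.round_trip(wire))

class GreeterStub:
    """Toy stub: exposes the remote service as plain method calls."""
    def __init__(self, channel):
        self.channel = channel

    def say_hello(self, name):
        return self.channel.call("SayHello", {"name": name})["message"]

def server_handler(request):
    # Toy server routine for the Greeter service of Figure 4.
    assert request["method"] == "SayHello"
    return {"message": "Hello " + request["body"]["name"]}

stub = GreeterStub(Channel(InProcessTransport(server_handler)))
assert stub.say_hello("world") == "Hello world"
```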
+
+<h3>3.5 <em>gRPC Java</em></h3>
+
+The Java implementation of gRPC has been built with the mobile platform in mind, and to provide that capability it supports JDK 6. Though the core of gRPC is built with data centers in mind - specifically supporting C/C++ on the Linux platform - the Java and Go implementations are two very reliable platforms for experimenting with microservice ecosystem implementations.
+
+There are several moving parts to understanding how gRPC-Java works. The first important step is to ensure that the Client and Server stub interface code is generated by the Protobuf compiler plugin. The required dependencies are usually placed in your <em>Gradle</em> build file, `build.gradle`, as follows:
+
+```
+ compile 'io.grpc:grpc-netty:1.0.1'
+ compile 'io.grpc:grpc-protobuf:1.0.1'
+ compile 'io.grpc:grpc-stub:1.0.1'
+```
+
+When you build using Gradle, then the appropriate base code gets generated for you, which you can override to build your preferred implementation of the Client and Server.
+
+Since gRPC has to implement the HTTP/2 protocol, the chosen method was to have a <em>Metadata</em> class that converts key-value pairs into HTTP/2 headers and vice-versa, which the Netty implementation does via <em>GrpcHttp2HeadersDecoder</em> and <em>GrpcHttp2OutboundHeaders</em>.
+
+Another key insight is that the code handling the HTTP/2 conversion for the Client and the Server lives in the <em>NettyClientHandler.java</em> and <em>NettyServerHandler.java</em> classes, shown in Figures 8 and 9.
+
+<p align="center">
+ <img src="figures/grpc-client-transport-handler.png" /><br>
+  <em>Figure 8: The Client Transport Handler for gRPC-Java.</em>
+</p>
+
+<p align="center">
+ <img src="figures/grpc-server-transport-handler.png" /><br>
+  <em>Figure 9: The Server Transport Handler for gRPC-Java.</em>
+</p>
+
+
+<h3>3.5.1 <em>Downloading gRPC Java</em></h3>
+
+The easiest way to download the gRPC-Java implementation is by running the following command:
+
+```
+git clone -b v1.0.0 https://github.com/grpc/grpc-java.git
+```
+
+Next, compile on a Windows machine using Gradle (or Maven) with the following steps. If you are using any firewall software, it might be necessary to temporarily disable it while compiling gRPC-Java, as sockets are used for the tests:
+
+```
+cd grpc-java
+set GRADLE_OPTS=-Xmx2048m
+set JAVA_OPTS=-Xmx2048m
+set DEFAULT_JVM_OPTS="-Dfile.encoding=utf-8"
+echo skipCodegen=true > gradle.properties
+gradlew.bat build -x test
+cd examples
+gradlew.bat installDist
+```
+
+If you are having issues with Unicode (UTF-8) translation when using Git on Windows, you can try the following commands after entering the `examples` folder:
+
+```
+wget https://raw.githubusercontent.com/benelot/grpc-java/feb88a96a4bc689631baec11abe989a776230b74/examples/src/main/java/io/grpc/examples/routeguide/RouteGuideServer.java
+
+copy RouteGuideServer.java src\main\java\io\grpc\examples\routeguide\RouteGuideServer.java
+```
+
+<h3>3.5.2 <em>Running the Hello World Demonstration</em></h3>
+
+Make sure you open two Command (Terminal) windows, each within the `grpc-java\examples\build\install\examples\bin` folder. In the first of the two windows type the following command:
+
+```
+hello-world-server.bat
+```
+
+You should see the following:
+
+<p align="center">
+ <img src="figures/hello-world-server.png" /><br>
+ <em>Figure 10: The Hello World gRPC Server.</em>
+</p>
+
+In the second of the two windows type the following command:
+
+```
+hello-world-client.bat
+```
+
+You should see the following response:
+
+<p align="center">
+ <img src="figures/hello-world-client.png" /><br>
+  <em>Figure 11: The Hello World gRPC Client and the response from the Server.</em>
+</p>
+
+<h3>4 <em>Conclusion</em></h3>
+
+This chapter presented an overview of the concepts behind gRPC and HTTP/2, and will be expanded over time in both breadth and language implementations. In the area of microservices, one can see how a server endpoint can spawn more endpoints, where the message content is the protobuf definition for new endpoints to be generated for load-balancing, much like in the classical Actor Model.
+
+## References
+
+[Apigee]: https://www.youtube.com/watch?v=-2sWDr3Z0Wo
+
+[Authentication]: http://www.grpc.io/docs/guides/auth.html
+
+[Benchmarks]: http://www.grpc.io/docs/guides/benchmarking.html
+
+[CoreSurfaceAPIs]: https://github.com/grpc/grpc/tree/master/src/core
+
+[ErrorModel]: http://www.grpc.io/docs/guides/error.html
+
+[gRPC]: https://github.com/grpc/grpc/blob/master/doc/g_stands_for.md
+
+[gRPC-Companies]: http://www.grpc.io/about/
+
+[gRPC-Protos]: https://github.com/googleapis/googleapis/
+
+[Netty]: http://netty.io/
+
+[RFC7540]: http://httpwg.org/specs/rfc7540.html
+
+[HelloWorldProto]: https://github.com/grpc/grpc/blob/master/examples/protos/helloworld.proto
+
+[Protobuf-Types]: https://developers.google.com/protocol-buffers/docs/encoding
+
+[gRPC-Overview]: http://www.grpc.io/docs/guides/
+
+[gRPC-Languages]: http://www.grpc.io/about/#osp
+
+[gRPC-Benchmark]: http://www.grpc.io/docs/guides/benchmarking.html
diff --git a/chapter/1/rpc.md b/chapter/1/rpc.md
index b4bce84..ccc9739 100644
--- a/chapter/1/rpc.md
+++ b/chapter/1/rpc.md
@@ -1,11 +1,381 @@
---
layout: page
-title: "Remote Procedure Call"
-by: "Joe Schmoe and Mary Jane"
+title: "RPC is Not Dead: Rise, Fall and the Rise of Remote Procedure Calls"
+by: "Muzammil Abdul Rehman and Paul Grosu"
---
-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. {% cite Uniqueness --file rpc %}
+## Introduction:
+
+*Remote Procedure Call* (RPC) is a design *paradigm* that allows two entities to communicate over a communication channel following a general request-response mechanism. The definition of RPC has mutated and evolved significantly over the past four decades, and therefore the RPC *paradigm* is a broad classifying term for all RPC-esque systems that have arisen in that time. RPC has moved on from a simple *client-server* design to groups of inter-connected *services*. While the initial RPC *implementations* were designed as tools for outsourcing computation to a server in a distributed system, RPC has evolved over the years to build language-agnostic ecosystems of applications. The RPC *paradigm* has been part of the driving force in creating truly revolutionary distributed systems, giving rise to various communication schemes and protocols between diverse systems.
+
+The RPC *paradigm* underlies many of our everyday systems. From lower-level applications like Network File Systems {% cite sunnfs --file rpc %} and Remote Direct Memory Access {% cite rpcoverrdma --file rpc %} to ecosystems of microservices, RPC is used everywhere. Its applications are diverse: SunNFS {% cite sunnfs --file rpc %}, Twitter's Finagle {% cite finagle --file rpc %}, Apache Thrift {% cite thrift --file rpc %}, Java RMI {% cite rmipaper --file rpc %}, SOAP, CORBA {% cite corba --file rpc %}, and Google's gRPC {% cite grpc --file rpc %}, to name a few.
+
+Starting off as a synchronous, insecure, request-response system, RPC has evolved into a secure, asynchronous, resilient *paradigm* that has influenced protocols and programming designs like HTTP, REST, and just about anything with a request-response mechanism. It has transitioned into an asynchronous, bidirectional communication mechanism for connecting services and devices across the internet. While the initial RPC implementations focused on a local, private network, with multiple clients communicating with a server and synchronously waiting for its response, modern RPC systems have *endpoints* communicating with each other, asynchronously passing arguments and processing responses, and even maintaining two-way request-response streams (from client to server, and from server to client).
+
+## Remote Procedure Calls:
+
+The *Remote Procedure Call paradigm* can be defined, at a high level, as a set of two communication *endpoints* connected over a network, with one endpoint sending a request and the other endpoint generating a response based on that request. In the simplest terms, it is a request-response paradigm in which the two *endpoints*/hosts have different *address spaces*. The host that requests a remote procedure is referred to as the *caller*, and the host that responds to it as the *callee*.
+
+The *endpoints* in RPC can be a client and a server, two nodes in a peer-to-peer network, two hosts in a grid computation system, or even two microservices. RPC communication is not limited to two hosts, but can involve multiple hosts or *endpoints* {% cite anycastrpc --file rpc %}.
+
+<p align="center">
+[ Image Source: {% cite rpcimage --file rpc %}]
+</p>
+<figure>
+ <img src="{{ site.baseurl }}/resources/img/rpc_chapter_1_ycog_10_steps.png" alt="RPC in 10 Steps." />
+<p>Fig1. - Remote Procedure Call.</p>
+</figure>
+
+The simplest RPC implementation looks like Fig 1. In this case, the *client* (or *caller*) and the *server* (or *callee*) are separated by a physical network. The main components of the system are the client routine/program, the client stub, the server routine/program, the server stub, and the network routines. A *stub* is a small program that is generally used as a stand-in (or an interface) for a larger program {% cite stubrpc --file rpc %}. A *client stub* exposes the functionality provided by the server routine to the client routine, while the *server stub* provides a client-like program to the server routine {% cite rpcimage --file rpc %}. The client stub takes the input arguments from the client program and returns the result, while the server stub provides input arguments to the server program and gets the results. The client program interacts only with the client stub, which provides the interface of the remote server to the client. This stub also performs the marshalling/pickling/serialization of the input arguments sent to it by the client routine. Similarly, the server stub provides a client interface to the server routines as well as marshalling services.
+
+When a client routine performs a *remote procedure*, it calls the client stub, which serializes the input arguments. This serialized data is sent to the server using OS network routines (TCP/IP) {% cite rpcimage --file rpc %}. The data is then deserialized by the server stub and presented to the server routines with the given arguments. The return value from the server routines is serialized again and sent over the network back to the client, where it is deserialized by the client stub and presented to the client routine. This *remote procedure* is generally hidden from the client routine, to which it appears as a *local procedure*. RPC services also require a discovery service/host-resolution mechanism to bootstrap the communication between the client and the server.
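
The stub round-trip described above can be sketched in a few lines of Python (an in-process toy: JSON stands in for the serialization scheme, and a direct function call stands in for the OS network routines):

```python
import json

def server_routine(a, b):
    # The actual procedure the client wants executed remotely.
    return a + b

def server_stub(wire_bytes):
    # Deserialize the arguments, invoke the real routine,
    # then serialize the result for the trip back.
    args = json.loads(wire_bytes)
    result = server_routine(*args)
    return json.dumps(result).encode()

def client_stub(*args):
    # Serialize the arguments, ship them over the "network"
    # (here just a function call), and deserialize the reply.
    wire_bytes = json.dumps(args).encode()
    reply = server_stub(wire_bytes)
    return json.loads(reply)

# To the client routine this looks like an ordinary local call.
assert client_stub(2, 3) == 5
```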
+
+One important feature of RPC is that each endpoint has its own *address space* {% cite implementingrpc --file rpc %}. Hosts cannot share pointers or references to a memory location on one host. This *address space* isolation means that all information passed in messages between the communicating hosts is passed by value (objects or variables), not by reference. Since RPC is a *remote* procedure call, the values sent to the *remote* host cannot be pointers or references to *local* memory. Passing links to a globally shared storage location (Amazon S3, Microsoft Azure, Google Cloud Store) is possible, however, depending on the type of system (see the *Applications* section for detail).
+
+Originally, RPC was developed as a synchronous request-response mechanism, tied to a specific programming-language implementation, with a custom network protocol for outsourcing computation {% cite implementingrpc --file rpc %}, and a registry system with which all the servers registered. One of the earliest RPC-based systems {% cite implementingrpc --file rpc %} was implemented in the Cedar programming language in the early 1980s. The goal of this system was to provide programming semantics similar to local procedure calls. Developed for a LAN with an inefficient network protocol and a *serialization* scheme to transfer information over it, this system aimed at executing a *procedure* (also referred to as a *method* or a *function*) in a remote *address space*. The single-threaded synchronous client and server were written in Cedar, with a registry system used by the servers to *bind* (or register) their procedures. The clients used this registry system to find a specific server to execute their *remote* procedures. This RPC implementation {% cite implementingrpc --file rpc %} had a very specific use-case: it was built for outsourcing computation within the "Xerox research internetwork", a small, closed ethernet network with 16-bit addresses {% cite implementingrpc --file rpc %}.
+
+Modern RPC-based systems are language-agnostic, asynchronous, load-balanced systems. Authentication and authorization to these systems have been added as needed along with other security features. Most of these systems have fault-handling built into them as modules and the systems are generally spread all across the internet.
+
+RPC programs communicate over a network (or a communication channel); therefore, they need to handle remote errors and be able to communicate information reliably. Error handling generally varies and is categorized as *remote-host* or *network* failure handling. Depending on the type of system and the error, the caller (or the callee) returns an error, which can then be handled accordingly. For asynchronous RPC calls, it is possible to specify events to ensure progress.
+
+RPC implementations use a *serialization* (also referred to as *marshalling* or *pickling*) scheme on top of an underlying communication protocol (traditionally TCP over IP). These *serialization* schemes allow both the *caller* and the *callee* to remain language-agnostic, so both systems can be developed in parallel without any language restrictions. Some examples of serialization schemes are JSON, XML, and Protocol Buffers {% cite grpc --file rpc %}.
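To make this concrete, here is a small Python 3 sketch of the same logical call, `add(2, 3)` (an illustrative method name), serialized under two of these schemes: a hand-rolled JSON envelope and the stdlib's XML-RPC encoder.

```python
import json
import xmlrpc.client

# The same logical request -- call add(2, 3) -- spelled in two
# serialization schemes. Neither side needs to know what language the
# other is written in; they only have to agree on the wire format.
json_request = json.dumps({"method": "add", "params": [2, 3]})
xml_request = xmlrpc.client.dumps((2, 3), "add")

print(json_request)  # {"method": "add", "params": [2, 3]}
print(xml_request)   # an XML <methodCall> document
```

Either byte string can cross the network to a callee written in any language with a JSON or XML-RPC parser.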
+
+Modern RPC systems allow different components of a larger system to be developed independently of one another. The language-agnostic nature combined with a decoupling of some parts of the system allows the two components (caller and callee) to scale separately and add new functionalities. This independent scaling of the system might lead to a mesh of interconnected RPC *services* facilitating one another.
+
+### Examples of RPC
+
+RPC has become very predominant in modern systems. Google alone performs on the order of 10^10^ RPC calls per second {% cite grpcpersec --file rpc %}. That's *tens of billions* of RPC calls *every second*: more calls in a single hour than there are dollars in the *annual GDP of the United States* {%cite usgdp --file rpc%}.
+
+In the simplest RPC systems, a client connects to a server over a network connection and invokes a *procedure*. This procedure could be as simple as `return "Hello World"` in your favorite programming language. However, the complexity of this remote procedure has no upper bound.
+
+Here's the code for this simple RPC server, written in Python 3.
+```python
+from xmlrpc.server import SimpleXMLRPCServer
+
+# a simple RPC function that returns "Hello World!"
+def remote_procedure():
+    return "Hello World!"
+
+server = SimpleXMLRPCServer(("localhost", 8080))
+print("RPC Server listening on port 8080...")
+server.register_function(remote_procedure, "remote_procedure")
+server.serve_forever()
+```
+
+The code for a simple RPC client for the above server, also written in Python 3, is as follows.
+
+```python
+import xmlrpc.client
+
+with xmlrpc.client.ServerProxy("http://localhost:8080/") as proxy:
+ print(proxy.remote_procedure())
+```
+
+In the above example, we create a simple function called `remote_procedure` and *bind* it to port *8080* on *localhost*. The RPC client then connects to the server and *requests* the `remote_procedure` with no input arguments. The server then *responds* with the return value of the `remote_procedure`.
+
+One can even view the *three-way handshake* as an example of the RPC paradigm. The *three-way handshake* is most commonly used in establishing a TCP connection. Here, a server-side application *binds* to a port on the server, and a hostname resolution entry is added to a DNS server (which can be seen as a *registry* in RPC). When the client has to connect to the server, it asks a DNS server to resolve the hostname to an IP address, and then sends a SYN packet. This SYN packet can be seen as a *request* to another *address space*. The server, upon receiving this, returns a SYN-ACK packet. This SYN-ACK packet from the server can be seen as a *response* from the server, as well as a *request* to establish the connection. The client then *responds* with an ACK packet.
+
+## Evolution of RPC:
+
+The RPC paradigm was first proposed in the 1970's and still continues as a relevant model for performing distributed computation: initially developed for a LAN, it can now be implemented on open networks, as web services across the internet. It has had a long and arduous journey to its current state. Here are the three main (overlapping) stages that RPC went through.
+
+### The Rise: All Hail RPC (Early 1970's - Mid 1980's)
+
+RPC started off strong. With RFC 674 {% cite rfc674 --file rpc %} and RFC 707 {% cite rfc674 rfc707 --file rpc %} specifying the design of remote procedure calls, followed by Nelson et al. {% cite implementingrpc --file rpc %} producing the first RPC implementation for the Cedar programming language, RPC revolutionized systems in general and gave rise to some of the earliest distributed systems (apart from the internet, of course).
+
+With these early achievements, people started using RPC as the de facto design choice. It became a Holy Grail in the systems community for a few years after the first implementation.
+
+### The Fall: RPC is Dead (Late 1970's - Late 1990's)
+
+RPC, despite being an initial success, wasn't without flaws. Within a year of its inception, the limitations of RPC started to catch up with it. RFC 684 criticized RPC for its latency, failure modes, and cost. It also focused on message-passing systems as an alternative to the RPC design. Similarly, a few years down the road, in 1988, Tanenbaum et al. presented similar concerns against RPC {%cite critiqueofrpc --file rpc %}. It discussed problems with heterogeneous devices, message passing as an alternative, packet loss, network failure, and RPC's synchronous nature, and highlighted that RPC is not a one-size-fits-all model.
+
+In 1994, *A Note on Distributed Computing* was published. This paper claimed RPC to be "fundamentally flawed" {%cite notedistributed --file rpc %}. It discussed a unified object view and cited four main problems with dividing these objects for distributed computing in RPC: communication latency, address space separation, partial failures, and concurrency issues (resulting from two concurrent client requests accessing the same remote object). Most of these problems are inherently associated with distributed computing itself, but partial failures in particular meant that progress might not always be possible in an RPC system.
+
+This era wasn't a dead end for RPC, though. Some of the preliminary designs for modern RPC systems were introduced in this era. Perhaps the earliest system of this era was SunRPC {% cite sunnfs --file rpc %}, used for the Sun Network File System (NFS). Soon to follow SunRPC was CORBA {% cite corba --file rpc %}, which was followed by Java RMI {% cite rmipaper --file rpc %}.
+
+However, the initial implementations of these systems were riddled with various issues and design flaws. For instance, Java RMI didn't handle network failures and assumed a reliable network with zero latency {% cite rmipaper --file rpc %}.
+
+### The Rise, Again: Long Live RPC (Late 1990's - Today)
+
+Despite facing problems in its early days, RPC withstood the test of time. Researchers recognized the limitations of RPC and focused on rectifying them; instead of forcing RPC everywhere, they started to use RPC in applications where it was needed. Designers added exception handling, asynchrony, network-failure handling, and heterogeneity between different languages/devices to RPC.
+
+In this era, SunRPC went through various additions and came to be known as Open Network Computing RPC (ONC RPC). CORBA and RMI have also undergone various modifications as internet standards were set.
+
+A new breed of RPC also started in this era, asynchronous (async) RPC, giving rise to systems that use *futures* and *promises*, like Finagle {% cite finagle --file rpc %} and Cap'n Proto (post-2010).
+
+
+<p align="center">
+[ Image Source: {% cite norman --file rpc %}]
+</p>
+<figure>
+ <img src="{{ site.baseurl }}/resources/img/rpc_chapter_1_syncrpc.jpg" alt="RPC in 10 Steps." />
+<p>Fig2. - Synchronous RPC.</p>
+</figure>
+
+
+<p align="center">
+[ Image Source: {% cite norman --file rpc %}]
+</p>
+<figure>
+ <img src="{{ site.baseurl }}/resources/img/rpc_chapter_1_asyncrpc.jpg" alt="Asynchronous RPC." />
+<p>Fig3. - Asynchronous RPC.</p>
+</figure>
+
+
+A traditional, synchronous RPC is a *blocking* operation, while an asynchronous RPC is a *non-blocking* operation {%cite dewan --file rpc %}. Fig2. shows a synchronous RPC call while Fig3. shows an asynchronous RPC call. In synchronous RPC, the client sends a request to the server, then blocks and waits for the server to perform its computation and return the result. Only after getting the result from the server does the client proceed. In an asynchronous RPC, the client sends a request to the server and waits only for the acknowledgment of the delivery of the input parameters/arguments. After this, the client proceeds onwards, and when the server is finished processing, it sends an interrupt to the client. The client receives this message from the server, collects the results, and continues.
+
+Asynchronous RPC separates the remote call from the return value, making it possible to write a single-threaded client that handles multiple RPC calls, processing each at the point it chooses {%cite async --file rpc%}. It also allows easier handling of slow clients/servers, as well as transferring large data more easily (due to its incremental nature) {%cite async --file rpc%}.
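The blocking/non-blocking difference can be sketched in Python 3, with a thread pool standing in for the remote server (the procedure and its timing are illustrative, not a real network call):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def slow_remote_procedure(x):
    """Stand-in for a remote call that takes ~0.1 s of server work."""
    time.sleep(0.1)
    return x * 2

# Synchronous RPC: the client blocks until the result arrives.
result = slow_remote_procedure(21)

# Asynchronous RPC: the client gets a handle back immediately and keeps
# working; .result() collects the response whenever the client is ready.
with ThreadPoolExecutor() as pool:
    future = pool.submit(slow_remote_procedure, 21)
    # ... the client is free to do other work here ...
    async_result = future.result()

print(result, async_result)  # 42 42
```

In the asynchronous case, the client could submit many such calls before collecting any result, which is exactly what makes a single-threaded client able to juggle multiple outstanding RPCs.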
+
+In the post-2000 era, MAUI{% cite maui --file rpc %}, Cap'n Proto{% cite capnprotosecure --file rpc %}, gRPC{% cite grpc --file rpc %}, Thrift{% cite thrift --file rpc %} and Finagle{% cite finagle --file rpc %} have been released, which have significantly boosted the widespread use of RPC.
+
+Most of these newer systems came with their own Interface Description Languages (IDLs). An IDL specifies the common protocols and interfacing language that can be used to transfer information between clients and servers written in different programming languages, making these RPC implementations language-agnostic. Some of the most common serialization formats used alongside these IDLs are JSON, XML, and ProtoBuf.
+
+A high-level overview of some of the most important RPC implementations follows.
+
+#### Java Remote Method Invocation
+Java RMI (Java Remote Method Invocation) {% cite rmibook --file rpc %} is a Java implementation for performing RPC (Remote Procedure Calls) between a client and a server. The client, using a stub, passes information over a socket connection across the network to the server that contains the remote objects. The Remote Object Registry (ROR) {% cite rmipaper --file rpc %} on the server holds references to the objects that can be accessed remotely, and it is through this registry that the client connects. The client can then request the invocation of methods on the server, which processes the requested call and responds with the answer.
+
+RMI provides some security in that its messages are encoded, though not encrypted; that can be augmented by tunneling over a secure connection or other methods. Moreover, RMI is very specific to Java. It cannot take advantage of the language-independence feature that is inherent to most RPC implementations. Perhaps the main problem with RMI is its handling of *access transparency*: a programmer (not the client program) cannot distinguish between local objects and remote objects, which makes partial failures in the network relatively difficult to handle {%cite roi --file rpc %}.
+
+#### CORBA
+CORBA (Common Object Request Broker Architecture) {% cite corba --file rpc %} was created by the Object Management Group {% cite corbasite --file rpc %} to allow for language-agnostic communication among multiple computers. It is an object-oriented model defined via an Interface Definition Language (IDL), and the communication is managed through an Object Request Broker (ORB), which acts as a broker for objects. CORBA can be viewed as a language-independent RMI system where each client and server has an ORB through which they communicate. The benefit of CORBA is that it allows implementations in multiple languages to communicate with each other, but much of the criticism around CORBA relates to poor consistency among implementations, and it is relatively outdated by now. Moreover, CORBA suffers from the same access transparency issues as Java RMI.
+
+#### XML-RPC and SOAP
+The XML-RPC specification {% cite Wiener --file rpc%} performs an HTTP POST request to a server, formatted as XML composed of a *header* and a *payload*, that calls exactly one method. It was originally released in the late 1990's and, unlike RMI, it uses the ubiquitous HTTP as its transport mechanism.
+
+The header has to provide basic information, like the user agent and the size of the payload. The payload initiates a `methodCall` structure by specifying the name via `methodName` and the associated parameter values. Parameters for the method can be scalars, structures, or (recursive) arrays. A scalar's type can be one of `i4`, `int`, `boolean`, `string`, `double`, `dateTime.iso8601`, or `base64`. The scalars are used to build the more complex structures and arrays.
+
+Below is an example as provided by the XML-RPC documentation {% cite Wiener --file rpc%}:
+
+```XML
+POST /RPC2 HTTP/1.0
+User-Agent: Frontier/5.1.2 (WinNT)
+Host: betty.userland.com
+Content-Type: text/xml
+Content-length: 181
+
+<?xml version="1.0"?>
+<methodCall>
+ <methodName>examples.getStateName</methodName>
+ <params>
+ <param>
+ <value><i4>41</i4></value>
+ </param>
+ </params>
+ </methodCall>
+```
+
+The response to a request will have the `methodResponse` with `params` and values, or a `fault` with the associated `faultCode` in case of an error {% cite Wiener --file rpc %}:
+
+```XML
+HTTP/1.1 200 OK
+Connection: close
+Content-Length: 158
+Content-Type: text/xml
+Date: Fri, 17 Jul 1998 19:55:08 GMT
+Server: UserLand Frontier/5.1.2-WinNT
+
+<?xml version="1.0"?>
+<methodResponse>
+ <params>
+ <param>
+ <value><string>South Dakota</string></value>
+ </param>
+ </params>
+ </methodResponse>
+```
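Python 3's standard `xmlrpc.client` module can produce and parse payloads of exactly this shape, which makes it convenient for experimenting with the format (the method name and values below mirror the documentation example above):

```python
import xmlrpc.client

# Build a request payload like the examples.getStateName call above.
request = xmlrpc.client.dumps((41,), "examples.getStateName")

# Build a response payload carrying "South Dakota", then parse it back.
response = xmlrpc.client.dumps(("South Dakota",), methodresponse=True)
params, method = xmlrpc.client.loads(response)
print(params[0])  # South Dakota
```

Note that `loads` returns a `(params, methodname)` pair, with `methodname` set to `None` for responses, and raises `xmlrpc.client.Fault` if the payload is a `fault` message.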
+
+SOAP (Simple Object Access Protocol) is a successor of XML-RPC as a web-services protocol for communication between a client and server. It was initially designed by a group at Microsoft {% cite soaparticle1 --file rpc %}. A SOAP message is an XML-formatted message composed of an envelope inside which a header and a payload are provided (just like XML-RPC). The payload of the message contains the request or response, which can be transmitted over HTTP or SMTP (unlike XML-RPC, which uses HTTP only).
+
+SOAP can be viewed as a superset of XML-RPC that provides support for more complex authentication schemes {%cite soapvsxml --file rpc %}, as well as for WSDL (Web Services Description Language), allowing easier discovery of, and integration with, remote web services {%cite soapvsxml --file rpc %}.
+
+The benefit of SOAP is that it provides the flexibility of transmission over multiple transport protocols. Its XML-based messages make SOAP language-agnostic, though parsing such messages can become a bottleneck.
+
+#### Thrift
+Thrift is an *asynchronous* RPC system created by Facebook and now part of the Apache Foundation {% cite thrift --file rpc %}. It provides a language-agnostic Interface Description Language (IDL) from which one generates the code for the client and server. It also provides the opportunity for compressed serialization by customizing the protocol and the transport after the description file has been processed.
+
+Perhaps the biggest advantage of Thrift is that its binary data format has very low overhead. It has a relatively lower transmission cost (as compared to other alternatives like SOAP) {%cite thrifttut --file rpc %}, making it very efficient for transferring large amounts of data.
+
+#### Finagle
+Finagle is a fault-tolerant, protocol-agnostic runtime for doing RPC, and a high-level API for composing futures (see the Async RPC section), with RPC calls generated under the hood. It was created by Twitter and is written in Scala to run on a JVM. It is based on three object types: Service objects, Filter objects, and Future objects {% cite finagle --file rpc %}.
+
+A Future object represents a computation that has been requested asynchronously and will return a response at some time in the future. These Future objects are the main communication mechanism in Finagle: all inputs and outputs are represented as Future objects.
+
+A Service object is an endpoint that returns a Future upon processing a request. These Service objects can be viewed as the interfaces used to implement a client or a server.
+
+A sample Finagle server that reads a request and returns the version of the request is shown below. This example is taken from the Finagle documentation {% cite finagletut --file rpc %}.
+
+```Scala
+import com.twitter.finagle.{Http, Service}
+import com.twitter.finagle.http
+import com.twitter.util.{Await, Future}
+
+object Server extends App {
+ val service = new Service[http.Request, http.Response] {
+ def apply(req: http.Request): Future[http.Response] =
+ Future.value(
+ http.Response(req.version, http.Status.Ok)
+ )
+ }
+ val server = Http.serve(":8080", service)
+ Await.ready(server)
+}
+```
+
+A Filter object transforms requests for further processing in case additional customization is required. Filters provide service-independent operations, like timeouts. They take in a Service and return a new Service object with the Filter applied. Composing multiple Filters is also possible in Finagle.
+
+A sample timeout Filter that takes in a service and creates a new service with timeouts is shown below. This example is taken from the Finagle documentation {% cite finagletut --file rpc %}.
+
+```Scala
+import com.twitter.finagle.{Service, SimpleFilter}
+import com.twitter.util.{Duration, Future, Timer}
+
+class TimeoutFilter[Req, Rep](timeout: Duration, timer: Timer)
+ extends SimpleFilter[Req, Rep] {
+
+ def apply(request: Req, service: Service[Req, Rep]): Future[Rep] = {
+ val res = service(request)
+ res.within(timer, timeout)
+ }
+}
+```
+
+#### Open Network Computing RPC (ONC RPC)
+ONC was originally introduced as SunRPC {%cite sunnfs --file rpc %} for the Sun NFS. The Sun NFS system had a stateless server with client-side caching and unique file handles, and supported NFS read, write, truncate, unlink, etc. operations. SunRPC was later revised as ONC in 1995 {%cite rfc1831 --file rpc %} and again in 2009 {%cite rfc5531 --file rpc %}. The IDL used in ONC (and SunRPC) is External Data Representation (XDR), a serialization mechanism specific to network communication, and therefore ONC is mostly limited to applications like network file systems.
+
+#### Mobile Assistance Using Infrastructure (MAUI)
+The MAUI project {% cite maui --file rpc %}, developed by Microsoft, is a computation-offloading system for mobile devices. It's an automated system that offloads mobile code to a dedicated infrastructure in order to increase the device's battery life, minimize the load on the programmer, and perform complex computations offsite. MAUI uses RPC as the communication protocol between the mobile device and the infrastructure.
+
+#### gRPC
+
+gRPC is a multiplexed, bi-directional streaming RPC protocol developed by Google and Square. The IDL for gRPC is Protocol Buffers (also referred to as ProtoBuf), and gRPC is meant as a public replacement for Stubby, ARCWire, and Sake {% cite Apigee --file rpc %}. More details on Protocol Buffers, Stubby, ARCWire, and Sake are available in our gRPC chapter {% cite grpcchapter --file rpc %}.
+
+gRPC provides a platform for scalable, bi-directional streaming using both synchronous and asynchronous communication.
+
+In a general RPC mechanism, the client initiates a connection to the server, and only the client can *request* while the server can only *respond* to incoming requests. In bi-directional gRPC streams, the initial connection is still initiated by the client (call it *endpoint 1*), but once the connection is established, both the server (call it *endpoint 2*) and *endpoint 1* can send *requests* and receive *responses*. This significantly eases the development of systems where both *endpoints* communicate with each other (as in grid computing). It also saves the hassle of creating two separate connections between the endpoints (one from *endpoint 1* to *endpoint 2* and another from *endpoint 2* to *endpoint 1*), since both streams are independent.
+
+gRPC multiplexes requests over a single connection and uses header compression. This makes gRPC suitable for mobile clients, where battery life and data usage are important.
+The core library is in C -- except for the Java and Go implementations -- and surface APIs for all the other languages connect through it {% cite CoreSurfaceAPIs --file rpc %}.
+
+Since Protocol Buffers has already been adopted by many individuals and companies, gRPC makes it natural for them to extend their RPC ecosystems. Companies like Cisco, Juniper, and Netflix {% cite gRPCCompanies --file rpc %} have found it practical to adopt it.
+A majority of the Google Public APIs, like the Places and Maps APIs, have been ported to gRPC ProtoBuf {% cite gRPCProtos --file rpc %} as well.
+
+More details about gRPC and bi-directional streaming can be found in our gRPC chapter {% cite grpcchapter --file rpc %}.
+
+#### Cap'n Proto
+Cap'n Proto {% cite capnprotosecure --file rpc %} is a data-interchange RPC system that bypasses the separate data-encoding step (of formats like JSON or ProtoBuf) to significantly improve performance. It's developed by the original author of gRPC's ProtoBuf, but since it uses its in-memory binary representation directly on the wire, it outperforms gRPC's ProtoBuf. It uses futures and promises to combine various remote operations into a single operation to save transport round-trips. This means that if a client calls a function `foo` and then calls another function `bar` on the output of `foo`, Cap'n Proto will aggregate these two operations into a single `bar(foo(x))`, where `x` is the input to the function `foo` {% cite capnprotosecure --file rpc %}. This saves multiple round-trips, especially in object-oriented programs.
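The round-trip saving from this *promise pipelining* can be illustrated with a toy Python 3 model. This is a sketch of the idea only, not Cap'n Proto's actual API: chained calls are queued locally and the whole chain is resolved in a single round trip.

```python
class Connection:
    """Toy connection that just counts network round trips."""
    def __init__(self):
        self.round_trips = 0

class PipelinedPromise:
    """Toy model of promise pipelining: .then() queues a dependent call
    locally instead of waiting for the previous result to come back."""
    def __init__(self, connection, value):
        self.connection = connection
        self.value = value
        self.pending = []

    def then(self, fn):
        self.pending.append(fn)
        return self  # still a promise; nothing has been sent yet

    def resolve(self):
        # One round trip executes the whole chain "server-side".
        self.connection.round_trips += 1
        result = self.value
        for fn in self.pending:
            result = fn(result)
        return result

conn = Connection()
# bar(foo(x)): two dependent calls, but only one round trip.
promise = PipelinedPromise(conn, 10)
result = promise.then(lambda x: x + 1).then(lambda x: x * 2).resolve()
print(result, conn.round_trips)  # 22 1
```

Without pipelining, the client would need one round trip to learn `foo(x)` before it could even send the `bar` request, doubling the latency for this two-call chain.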
+
+### The Heir to the Throne: gRPC or Thrift
+
+Although there are many candidates to be considered as top contenders for the RPC throne, most of them target a specific type of application. ONC is generally specific to the Network File System (though it's being pushed as a standard), Cap'n Proto is relatively new and untested, MAUI is specific to mobile systems, the open-source Finagle is primarily used at Twitter (not widespread), and Java RMI simply doesn't come close, due to its transparency issues (sorry to burst your bubble, Java fans).
+
+Probably the most powerful and practical systems out there are Apache Thrift and Google's gRPC, primarily because these two systems cater to a large number of programming languages, have a significant performance benefit over other techniques, and are being actively developed.
+
+Thrift was released years earlier, while the first stable release of gRPC came out in August 2016. However, despite being 'out there' longer, Thrift is currently less popular than gRPC {%cite trendrpcthrift --file rpc %}.
+
+gRPC {% cite gRPCLanguages --file rpc %} and Thrift both support most of the popular languages, including Java, C/C++, and Python. Thrift additionally supports languages like Ruby, Erlang, Perl, JavaScript, Node.js, and OCaml, while gRPC also supports Node.js and Go.
+
+The gRPC core is written in C (with the exception of the Java and Go implementations), with wrappers in other languages communicating with that core, while the Thrift core is written in C++.
+
+gRPC also provides easier bidirectional streaming communication between the caller and callee. The client generally initiates the communication {% cite gRPCLanguages --file rpc %}, and once the connection is established, the client and the server can perform reads and writes independently of each other. Bi-directional streaming in Thrift, however, can be difficult to handle, since Thrift focuses explicitly on a client-server model. To enable bidirectional, async streaming, one may have to run two separate systems {%cite grpcbetter --file rpc%}.
+
+Thrift provides exception handling as part of the message, while in gRPC the programmer has to handle exceptions explicitly. That is, in Thrift, exceptions can be returned built into the response message, while in gRPC, the programmer explicitly defines this behavior. Thrift's exception handling makes it easier to write client-side applications.
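The contrast can be sketched in Python 3. The response shapes below are hypothetical stand-ins, not generated Thrift or gRPC code; they only illustrate where the error-handling responsibility lives.

```python
# Thrift-style sketch: the error travels inside the response message,
# so generated client code can raise it automatically for the caller.
def thrift_style_call():
    response = {"error": "no such user"}  # server built the error into the reply
    if "error" in response:
        raise RuntimeError(response["error"])  # raised by the client library
    return response["result"]

# gRPC-style sketch (as described above): the programmer defines how
# errors are represented; the caller must inspect the response itself.
def grpc_style_call():
    response = {"status": "NOT_FOUND", "result": None}
    return response  # caller checks response["status"] explicitly
```

In the first style, client code simply wraps calls in `try`/`except`; in the second, forgetting to check the status field silently propagates an invalid result.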
+
+Although custom authentication mechanisms can be implemented in both of these systems, gRPC comes with Google-backed authentication using SSL/TLS and Google tokens {% cite grpcauth --file rpc %}.
+
+Moreover, gRPC network communication is done over HTTP/2. HTTP/2 makes it feasible for communicating parties to multiplex network connections over the same port. This is more efficient (in terms of memory usage) compared to HTTP/1.1. Since gRPC communication is done over HTTP/2, gRPC can easily multiplex different services. Multiplexing services is possible in Thrift as well; however, due to a lack of support from the underlying transport protocol, it is performed in code, using a `TMultiplexedProcessor` class {% cite multiplexingthrift --file rpc %}.
+
+Both gRPC and Thrift allow async RPC calls. This means that a client can send a request to the server, continue with its execution, and process the response from the server when it arrives.
+
+
+The major differences between gRPC and Thrift can be summed up in this table.
+
+| Comparison | Thrift | gRPC |
+| ----- | ----- | ----- |
+| License | Apache2 | BSD |
+| Sync/Async RPC | Both | Both |
+| Supported Languages | C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, and OCaml | C/C++, Python, Go, Java, Ruby, PHP, C#, Node.js, Objective-C |
+| Core Language | C++| C |
+| Exceptions | Can be built into the message | Implemented by the programmer |
+| Authentication | Custom | Custom + Google Tokens |
+| Bi-Directionality | Not straightforward | Straightforward |
+| Multiplexing | Possible via the `TMultiplexedProcessor` class | Possible via HTTP/2 |
+
+It's difficult to definitively choose one over the other. However, with the increasing popularity of gRPC, and the fact that it's still in the early stages of development, the general trend {%cite trendrpcthrift --file rpc %} over the past year has started to shift in favor of gRPC, and it's giving Thrift a run for its money. Though it may not be a rigorous metric, gRPC was searched for, on average, three times more often than Thrift {%cite trendrpcthrift --file rpc %}.
+
+**Note:** This comparison is performed in December 2016 so the results are expected to change with time.
+
+## Applications:
+
+Since its inception, various papers have been published applying the RPC paradigm to different domains, as well as using RPC implementations to create new systems. Here are some of the applications and systems that incorporate RPC.
+
+#### Shared State and Persistence Layer
+
+One major limitation (and advantage) of RPC is the separate *address space* of each machine in the network. This means that *pointers* or *references* to a data object cannot be passed between the caller and the callee. Interweave {% cite interweave2 interweave1 interweave3 --file rpc %} addresses this: it is a *middleware* system that allows scalable sharing of arbitrary datatypes between language-independent processes running on heterogeneous hardware. Interweave is specifically designed to be compatible with RPC-based systems, and it allows easier access to resources shared between different applications using memory blocks and locks.
+
+Although research has been done on providing a global shared state for RPC-based systems, these systems tend to take away the sense of independence and modularity between the *caller* and the *callee* by using shared storage instead of separate *address spaces*.
+
+#### GridRPC
+
+Grid computing is one of the most widely used applications of the RPC paradigm. At a high level, it can be seen as a mesh (or network) of computers connected with each other to form a *grid*, such that each system can leverage resources from any other system in the network.
+
+In the GridRPC paradigm, each computer in the network can act as the *caller* or the *callee*, depending on the amount of resources required {% cite grid1 --file rpc %}. It's also possible for the same computer to act as the *caller* as well as the *callee* for *different* computations.
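This dual caller/callee role can be sketched in Python 3. The class and procedure names are illustrative, and direct method calls stand in for the network transport:

```python
class GridNode:
    """Toy grid node: it can serve procedures (callee) and also
    invoke procedures on peers (caller)."""
    def __init__(self, name):
        self.name = name
        self.procedures = {}

    def register(self, procedure, fn):
        # Acting as a callee: expose a procedure to the grid.
        self.procedures[procedure] = fn

    def handle(self, procedure, *args):
        # Serve an incoming remote call.
        return self.procedures[procedure](*args)

    def call(self, peer, procedure, *args):
        # Acting as a caller: leverage a peer's resources.
        return peer.handle(procedure, *args)

a, b = GridNode("a"), GridNode("b")
a.register("square", lambda x: x * x)
b.register("double", lambda x: 2 * x)

# Node a is the callee for "square" and the caller for "double",
# while node b plays the opposite roles.
print(b.call(a, "square", 7))  # 49
print(a.call(b, "double", 7))  # 14
```

A real GridRPC middleware adds discovery, scheduling, and authentication on top of this symmetric call pattern.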
+
+Some of the most popular GridRPC-compliant middleware implementations are GridSolve {% cite gridsolve1 gridsolve2 --file rpc %} and Ninf-G {% cite ninf --file rpc %}. Ninf is older than GridSolve and was first published in the late 1990's. It's a simple RPC layer that also provides authentication and secure communication between the two parties. GridSolve, on the other hand, is relatively complex and provides a middleware for the communication using a client-agent-server model.
+
+#### Mobile Systems and Computation Offloading
+
+Mobile systems have become very powerful these days. With multi-core processors and gigabytes of RAM, they can undertake relatively complex computations without hassle. Due to this advancement, they consume larger amounts of energy, and hence their batteries, despite becoming larger, drain quickly with usage. Moreover, mobile data (network bandwidth) is still limited and expensive. Under these constraints, it's better to offload computations from mobile systems when possible. RPC plays an important role in the communication for this *computation offloading*. Some of these services use GridRPC technologies to offload computation, whereas others use an RMI (Remote Method Invocation) system.
+
+The Ibis Project {% cite ibis --file rpc %} builds an RMI (similar to Java RMI) and GMI (Group Method Invocation) model to facilitate outsourcing computation. Cuckoo {% cite cuckoo --file rpc %} uses the Ibis communication middleware to offload computation from applications (built using Cuckoo) running on Android smartphones to remote Cuckoo servers.
+
+Microsoft's MAUI project {% cite maui --file rpc %} uses RPC communication and allows partitioning of .NET applications and "fine-grained code offload to maximize energy savings with minimal burden on the programmer". MAUI decides at runtime which methods to offload to the external MAUI server.
+
+#### Async RPC, Futures and Promises
+
+Remote procedure calls can be asynchronous, and these async RPCs play an integral role in *futures* and *promises*. *Futures* and *promises* are programming constructs where a *future* represents a value/data/return type/error that will eventually be available, while a *promise* is a *future* that doesn't have a value yet. We follow Finagle's {% cite finagle --file rpc %} definition of *futures* and *promises*, where the *promise* of a *future* (an empty *future*) is considered the *request*, while the async fulfillment of this *promise* by a *future* is seen as the *response*. This construct is primarily used for asynchronous programming.
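Python 3's `concurrent.futures.Future` can model this request/response split directly. In this sketch, the unfulfilled future plays the role of the *promise* (the outstanding request), and setting its result from another thread stands in for the server's asynchronous *response*:

```python
from concurrent.futures import Future
import threading

# A future with no value yet -- in Finagle's terms, the promise/request.
promise = Future()

def server_side():
    # The asynchronous response: fulfilling the promise from elsewhere.
    promise.set_result("Hello World!")

threading.Thread(target=server_side).start()

# The client only blocks at the moment it actually needs the value.
print(promise.result(timeout=1))  # Hello World!
```

(Instantiating `Future` directly like this is normally reserved for tests; executors create them for you in production code, but it makes the promise/fulfillment split explicit here.)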
+
+Perhaps the most renowned systems using this type of RPC model are Twitter's Finagle{% cite finagle --file rpc %} and Cap'n Proto{% cite capnprotosecure --file rpc %}.
+
+#### RPC in Microservices Ecosystem:
+
+RPC implementations have moved from a one-server model to multiple servers, and on to dynamically-created, load-balanced microservices. RPC started as separate implementations -- REST, streaming RPC, MAUI, gRPC, Cap'n Proto -- and it is now possible to integrate all of these behind a single abstraction, a user *endpoint*. These endpoints are the building blocks of *microservices*. A *microservice* is usually a *service* with a very simple, well-defined purpose, written in almost any language, that interacts with other microservices to give the feel of one large monolithic *service*. These microservices are language-agnostic: one *microservice* for airline tickets written in C/C\++ might be communicating with a number of other microservices for individual airlines written in different languages (Python, C\++, Java, Node.js) using a language-agnostic, asynchronous RPC framework like gRPC {%cite grpc --file rpc %} or Thrift {%cite thrift --file rpc %}.
+
+The use of RPC has allowed us to create new microservices on-the-fly. Microservices can not only be created and bootstrapped at runtime, but also have inherent features like load-balancing and failure-recovery. This bootstrapping might occur on the same machine, in a Docker container {% cite docker --file rpc %}, or across a network (using any combination of DNS, NATs, or other mechanisms).
+
+RPC can be described as the "glue" that holds all the microservices together {% cite microservices1rpc --file rpc %}: it is one of the primary communication mechanisms between microservices running on different systems. One microservice requests that another perform an operation or query; the other, upon receiving such a request, performs the operation and returns a response. This operation could range from a simple computation, to invoking yet another microservice (creating a chain of RPC events), to spinning up new microservices on the fly to dynamically load-balance the system.
+
+An example of a microservices ecosystem that uses futures/promises is Finagle{%cite finagle --file rpc %} at Twitter.
+
+## Security in RPC:
+
+The initial RPC implementation {% cite implementingrpc --file rpc %} was developed for an isolated LAN and did not focus much on security. That model has various attack surfaces: a malicious registry, a malicious server, a client mounting a denial-of-service attack, or a man-in-the-middle between client and server.
+
+As time progressed and the internet evolved, new standards came along, and RPC implementations became much more secure. Security in RPC is generally added as a *module* or *package*: libraries for authentication and authorization of the communicating parties (caller and callee). These modules are not always bug-free, and it is sometimes possible to gain unauthorized access to the system. The security community tries to catch such bugs beforehand through code inspection and bug-bounty programs, yet with time new bugs arise and the cycle continues; it is an ongoing contest between attackers and security experts, each trying to outdo the other.
+
+For example, the Oracle Network File System uses *Secure RPC* {% cite oraclenfs --file rpc %} to perform authentication in NFS. *Secure RPC* uses a Diffie-Hellman authentication mechanism with DES encryption to allow only authorized users to access the NFS. Similarly, Cap'n Proto {% cite capnprotosecure --file rpc %} claims to be resilient to memory leaks, segfaults, and malicious inputs, and to be usable between mutually untrusting parties. However, in Cap'n Proto "the RPC layer is not robust against resource exhaustion attacks, possibly allowing denials of service", nor has it undergone any formal verification {% cite capnprotosecure --file rpc %}.
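The Diffie-Hellman step that Secure RPC builds on can be sketched in a few lines: both sides derive the same shared secret from publicly exchanged values, and that secret then keys the symmetric (DES, in Secure RPC's case) encryption of credentials. The parameters below are illustrative, not Secure RPC's actual values:

```python
# Public prime (2**64 - 59) and generator; real deployments use far
# larger, standardized parameters.
p, g = 2**64 - 59, 5

client_secret = 123456789             # each side keeps its exponent private
server_secret = 987654321

client_public = pow(g, client_secret, p)   # exchanged in the clear
server_public = pow(g, server_secret, p)

client_key = pow(server_public, client_secret, p)
server_key = pow(client_public, server_secret, p)
assert client_key == server_key       # both sides now hold the same key
```

An eavesdropper sees only `client_public` and `server_public`; recovering the shared key from those is the discrete-logarithm problem.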
+
+Although it is always possible to construct a *threat model* under which a given RPC implementation is insecure, one has to understand that using any distributed system increases the attack surface, and claiming one *paradigm* to be more secure than another would be a biased statement: a *paradigm* is an idea, and it is up to individual system designers to build real systems from it, taking care of concerns like security and load balancing. There is always the possibility of a request being rerouted to a malicious server (if the registry gets hacked), or of no trust existing between *caller* and *callee*. We therefore maintain that the RPC *paradigm* is neither secure nor insecure, and that the most secure systems are the ones in an isolated environment, disconnected from the public internet, with a self-destruct mechanism {% cite selfdest --file rpc %} in place, in an impenetrable bunker, guarded by the Knights Templar (*they don't exist! Well, maybe Fort Meade comes close*).
+
+## Discussion:
+
+The RPC *paradigm* shines most in *request-response* mechanisms, and futures and promises appear to be a new breed of RPC. This leads one to question whether every *request-response* system is a modified implementation of the RPC *paradigm*, or whether it actually brings anything new to the table. Modern communication protocols, like HTTP and REST, might just be a different flavor of RPC. In HTTP, a client *requests* a web page (or some other content), and the server *responds* with the required content. The dynamics of this communication might differ slightly from traditional RPC, yet a stateless HTTP server adheres to most of the concepts behind the RPC *paradigm*. Similarly, consider sending a request to your favorite Google API. Say you want to translate a latitude/longitude pair to an address using the Reverse Geocoding API, or find a good restaurant in your vicinity using the Places API: you send a *request* to their server to perform a *procedure* that takes a few input arguments, like the coordinates, and returns the result. Even though these APIs follow a RESTful design, they can be seen as an extension of the RPC *paradigm*.
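The REST-as-RPC reading can be made concrete: the "procedure" is named by the URL path and the arguments travel as query parameters. A small sketch, using a hypothetical endpoint (`maps.example.com`) and sending no actual request:

```python
from urllib.parse import urlencode

def make_stub(base_url, method):
    """Build a client-side stub: calling it marshals keyword arguments
    into a URL, just as an RPC stub marshals arguments into a request
    message before sending it."""
    def stub(**kwargs):
        return f"{base_url}/{method}?{urlencode(sorted(kwargs.items()))}"
    return stub

reverse_geocode = make_stub("https://maps.example.com/api", "reverse_geocode")
print(reverse_geocode(lat=46.52, lng=6.57))
# https://maps.example.com/api/reverse_geocode?lat=46.52&lng=6.57
```

From the caller's point of view, `reverse_geocode(lat=…, lng=…)` reads exactly like a remote procedure call; only the wire format is RESTful.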
+
+The RPC paradigm has evolved over time, to the extent that it has become very difficult to differentiate RPC from non-RPC. With each passing year, the restrictions and limitations of RPC recede. Current RPC implementations even support the server *requesting* information from the client in order to *respond*, and vice versa (bidirectionality). This *bidirectional* nature has transitioned RPC from a simple *client-server* model to a set of *endpoints* communicating with each other.
+
+For the past four decades, researchers and industry leaders have tried to come up with *their* definition of RPC. Proponents of the RPC paradigm view every *request-response* communication as an implementation of it, while those against RPC try to explicitly enumerate its limitations; those limitations, however, seem to slowly vanish as new RPC models are introduced. RPC supporters consider it the Holy Grail of distributed systems, the foundation of modern distributed communication: from Apache Thrift and ONC to HTTP and REST, they advocate it all as RPC, while REST developers hold strong opinions against RPC.
+
+Moreover, with modern global storage mechanisms, the need for RPC systems to have a separate *address space* seems to be slowly dissolving and disappearing into thin air. So the question remains: what *is* RPC and what *is not*? This is an open-ended question. There is no unanimous agreement on what RPC should look like, except that it involves communication between two *endpoints*. What we think of RPC is:
+
+*In the world of distributed systems, where every individual component of a system, be it a hard disk, a multi-core processor, or a microservice, is an extension of RPC, it is difficult to come up with a concrete definition of the RPC paradigm. Therefore, anything loosely associated with a request-response mechanism can be considered RPC.*
+
+<blockquote>
+<p align="center">
+<em><strong>RPC is not dead, long live RPC!</strong></em>
+</p>
+</blockquote>
## References
-{% bibliography --file rpc %} \ No newline at end of file
+{% bibliography --file rpc --cited %}