Alan Shreve is a hacker, entrepreneur, and the creator of ngrok.com. ngrok is the best way to expose services behind a NAT or firewall to the internet for demos, webhook development, and IoT connectivity. Today, he's giving us a whirlwind tour of gRPC and how to use it in your production web service.
Q: How do Microservices talk to each other?
A: SOAP. Just kidding. Today, it's mostly HTTP + JSON.
I will die happy if I never write another REST client library in my life.
It's a lot of repetitive boilerplate.
Why do REST APIs suck?
Streaming is difficult (nigh-impossible in some languages)
Bi-directional streaming isn’t possible at all
Operations are difficult to model (e.g. ‘restart the machine’)
Inefficient (textual representations aren’t optimal for networks)
Your internal services aren’t RESTful anyways, they’re just HTTP endpoints
Hard to fetch many resources in a single request (the problem GraphQL was created to solve)
Corollary: humans are expensive and don’t like writing client libraries
So let's use gRPC to build a cache service
gRPC is a "high-performance open source universal RPC framework."
Let's build a caching service together using gRPC.
We don't define the API in code. We define it in an Interface Definition Language (IDL): in this case, protobufs.
Here's our caching service (app.proto):
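A minimal version might look like this (a sketch; the exact field names and numbering are illustrative):

```protobuf
syntax = "proto3";

package rpc;

// Cache is the caching service we'll build throughout this post.
service Cache {
  rpc Store(StoreReq) returns (StoreResp) {}
  rpc Get(GetReq) returns (GetResp) {}
}

message StoreReq {
  string key = 1;
  bytes val = 2;
}

message StoreResp {}

message GetReq {
  string key = 1;
}

message GetResp {
  bytes val = 1;
}
```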
Snap your fingers and now you have client libraries in 9 different languages: C++, Java (including Android), Python, Go, Ruby, C#, JavaScript, Objective-C, PHP.
But wait, there's more: you also have server stubs for that API in 7 languages: C++, Java, Python, Go, Ruby, C#, JavaScript.
We won't dive into the generated code itself, but let's see how we can use it.
server.go
Let's look at server.go.
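A server might look like this. This is a sketch: it assumes protoc has generated a package (here imported as `rpc`) containing the `CacheServer` interface, `RegisterCacheServer`, and the request/response types.

```go
// server.go — a sketch; assumes protoc-generated stubs in a package named rpc.
package main

import (
	"context"
	"log"
	"net"

	"google.golang.org/grpc"
)

// CacheService implements the generated rpc.CacheServer interface.
type CacheService struct {
	store map[string][]byte
}

func (s *CacheService) Get(ctx context.Context, req *rpc.GetReq) (*rpc.GetResp, error) {
	return &rpc.GetResp{Val: s.store[req.Key]}, nil
}

func (s *CacheService) Store(ctx context.Context, req *rpc.StoreReq) (*rpc.StoreResp, error) {
	s.store[req.Key] = req.Val
	return &rpc.StoreResp{}, nil
}

func main() {
	lis, err := net.Listen("tcp", "localhost:5051")
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}
	srv := grpc.NewServer()
	rpc.RegisterCacheServer(srv, &CacheService{store: make(map[string][]byte)})
	if err := srv.Serve(lis); err != nil {
		log.Fatalf("failed to serve: %v", err)
	}
}
```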
client.go
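The client side, sketched under the same assumption of a generated `rpc` package:

```go
// client.go — a sketch; assumes the same protoc-generated rpc package.
package main

import (
	"context"
	"fmt"
	"log"

	"google.golang.org/grpc"
)

func main() {
	conn, err := grpc.Dial("localhost:5051", grpc.WithInsecure())
	if err != nil {
		log.Fatalf("failed to dial server: %v", err)
	}
	cache := rpc.NewCacheClient(conn)

	_, err = cache.Store(context.Background(), &rpc.StoreReq{Key: "gopher", Val: []byte("con")})
	if err != nil {
		log.Fatalf("failed to store: %v", err)
	}

	resp, err := cache.Get(context.Background(), &rpc.GetReq{Key: "gopher"})
	if err != nil {
		log.Fatalf("failed to get: %v", err)
	}
	fmt.Printf("got cached value %s\n", resp.Val)
}
```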
Note: you don't have to write any networking or serialization code.
Wait...
Is this just WSDL all over again?
gRPC compared to SOAP/WSDL
Inextricably tied to XML (gRPC's serialization is pluggable)
Very heavyweight service definition format: an XML/XSD nightmare
Unnecessarily complex, bloated with features few need (two-phase commit?!)
Inflexible and intolerant of forward compatibility (unlike protobuf)
Performance and streaming problems left unsolved
Machine-readable API contracts are actually a really great idea
Making clients responsible for generating their own libraries (instead of vendors hand-writing them) was also the right call
gRPC compared to Swagger
Solves the machine-readable contract problem, but none of the other problems with HTTP/JSON (performance, streaming, modeling)
Swagger definitions are cumbersome and incredibly verbose; compared to writing gRPC protobuf definitions, they're a gigantic pain
gRPC compared to Thrift
Thrift is actually a really great idea, with very similar project goals
Never achieved same ubiquity and ease of use. This is really hard. Requires all major language implementations to be:
well documented
reliable
highly performant
easy to install
Implementing the methods
Let's fill in the stubs that gRPC generated for us:
gRPC errors
There are a number of error codes, each corresponding to a type of error. They're similar to HTTP status codes, but there is no response body.
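Filling in the generated stub, a cache miss can return one of these codes via the `status` and `codes` packages. This is a sketch assuming the generated `rpc` package and the `CacheService` type from the server above:

```go
import (
	"context"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

func (s *CacheService) Get(ctx context.Context, req *rpc.GetReq) (*rpc.GetResp, error) {
	val, ok := s.store[req.Key]
	if !ok {
		// Return a gRPC status code instead of inventing an error format.
		return nil, status.Errorf(codes.NotFound, "key not found: %s", req.Key)
	}
	return &rpc.GetResp{Val: val}, nil
}
```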
Secure transport
There's a simple API for that. On the server:
On the client:
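Both halves, sketched with grpc-go's `credentials` package (the certificate and key paths are placeholders):

```go
import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

// Server: serve over TLS.
creds, err := credentials.NewServerTLSFromFile("cert.pem", "key.pem")
if err != nil {
	log.Fatalf("failed to load TLS keys: %v", err)
}
srv := grpc.NewServer(grpc.Creds(creds))

// Client: verify the server's certificate when dialing.
creds, err := credentials.NewClientTLSFromFile("cert.pem", "")
if err != nil {
	log.Fatalf("failed to load TLS cert: %v", err)
}
conn, err := grpc.Dial("localhost:5051", grpc.WithTransportCredentials(creds))
```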
What's going on underneath the hood? How does it work?
protobuf serialized over HTTP/2:
HTTP/2
protobuf serialization (pluggable)
Clients open one long-lived connection to a gRPC server
A new HTTP/2 stream for each RPC call
Allows simultaneous in-flight RPC calls
Allows client-side and server-side streaming
Implementations
There are three implementations at the moment, all high-performance and event-loop driven:
C: Ruby, Python, Node.js, PHP, C#, Objective-C, and C++ are all bindings to the "C core" (PHP via a PECL extension, under Apache or nginx/php-fpm)
Java: Netty + BoringSSL via JNI
Go: a pure Go implementation using the stdlib crypto/tls package
Where did gRPC come from?
Originally pioneered by a team at Google
The next-generation version of an internal Google project called 'Stubby'
Now a F/OSS project with a completely open spec and contributors from many companies
Development is still primarily executed by Google devs
What if you have misbehaving clients?
Let's try Multitenancy (associating client ID with each request):
app.proto:
client.go:
server.go:
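One way to do this, sketched end to end (the `account_token` field, the token value, and the `CheckQuota` helper are all illustrative, not from the talk):

```go
// app.proto gains a client identifier on each request, e.g.:
//
//   message StoreReq {
//     string key = 1;
//     bytes val = 2;
//     string account_token = 3;
//   }

// client.go then sends its token with every call:
_, err = cache.Store(context.Background(), &rpc.StoreReq{
	AccountToken: "my-account-token", // placeholder credential
	Key:          "gopher",
	Val:          []byte("con"),
})

// server.go can now rate-limit or reject per account:
func (s *CacheService) Store(ctx context.Context, req *rpc.StoreReq) (*rpc.StoreResp, error) {
	// CheckQuota is a hypothetical per-account accounting helper.
	if err := s.accounts.CheckQuota(req.AccountToken); err != nil {
		return nil, status.Errorf(codes.ResourceExhausted, "quota exceeded")
	}
	s.store[req.Key] = req.Val
	return &rpc.StoreResp{}, nil
}
```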
"It's too slow"
Now someone tells you your service is too slow. And here you realize you have no visibility. What do you do?
Option 1: add logging:
client.go:
server.go:
Note: this is a lot of boilerplate. Turns out gRPC has something for that: the client interceptor. Every time you make a remote call, the interceptor middleware will be invoked.
Client interceptor
interceptor.go:
client.go:
server.go:
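A logging client interceptor, sketched against grpc-go's `UnaryClientInterceptor` signature (the server side needs no changes for this):

```go
// interceptor.go — logs every outgoing RPC and its duration.
func WithClientLogging() grpc.DialOption {
	return grpc.WithUnaryInterceptor(func(
		ctx context.Context, method string, req, reply interface{},
		cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption,
	) error {
		start := time.Now()
		err := invoker(ctx, method, req, reply, cc, opts...)
		log.Printf("invoked method=%s duration=%s error=%v", method, time.Since(start), err)
		return err
	})
}

// client.go — install it once, at dial time:
conn, err := grpc.Dial("localhost:5051", grpc.WithInsecure(), WithClientLogging())
```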
Server interceptor
On the server side, there's a server interceptor.
interceptor.go:
server.go:
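The mirror image on the server, using grpc-go's `UnaryServerInterceptor` signature:

```go
// interceptor.go — logs every handled RPC and its duration.
func ServerLogging() grpc.ServerOption {
	return grpc.UnaryInterceptor(func(
		ctx context.Context, req interface{},
		info *grpc.UnaryServerInfo, handler grpc.UnaryHandler,
	) (interface{}, error) {
		start := time.Now()
		resp, err := handler(ctx, req)
		log.Printf("handled method=%s duration=%s error=%v", info.FullMethod, time.Since(start), err)
		return resp, err
	})
}

// server.go — pass it as a server option:
srv := grpc.NewServer(ServerLogging())
```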
Adding timeouts
Server-side timeout:
server.go:
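One way to bound the server's work is to wrap the slow part of the handler in `context.WithTimeout` (the `slowFetch` backend call is a hypothetical stand-in):

```go
func (s *CacheService) Get(ctx context.Context, req *rpc.GetReq) (*rpc.GetResp, error) {
	// Give the backend lookup at most two seconds.
	ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
	defer cancel()

	val, err := s.slowFetch(ctx, req.Key) // hypothetical backend call that honors ctx
	if err != nil {
		return nil, status.Errorf(codes.DeadlineExceeded, "fetch failed: %v", err)
	}
	return &rpc.GetResp{Val: val}, nil
}
```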
Now you look in the logs and find you're still failing your SLA. Some round trips take 2.2 seconds (more than 2 seconds). Why? Your timeout only covers a portion of the full request/response round trip.
So let the client set its own timeout:
client.go:
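The client sets a deadline on the context it passes to the call, and gRPC transmits that deadline with the request:

```go
// Set a deadline covering the entire round trip.
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()

resp, err := cache.Get(ctx, &rpc.GetReq{Key: "gopher"})
if err != nil {
	log.Fatalf("failed to get: %v", err)
}
```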
How does that interact with the timeout you set previously on the server? Simple: "the context propagates through."
Dry run
Now let's say you want to call your service with a dry-run flag: run the request without side effects. And you want it to work on every mutating API.
This is simple to implement by passing the right gRPC metadata (analogous to HTTP headers).
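A sketch with grpc-go's `metadata` package (the `dry-run` key name is illustrative):

```go
import "google.golang.org/grpc/metadata"

// Client: attach a dry-run flag as metadata on the outgoing context.
ctx := metadata.NewOutgoingContext(context.Background(),
	metadata.Pairs("dry-run", "1"))
_, err := cache.Store(ctx, &rpc.StoreReq{Key: "gopher", Val: []byte("con")})

// Server: check the incoming metadata before mutating any state.
func (s *CacheService) Store(ctx context.Context, req *rpc.StoreReq) (*rpc.StoreResp, error) {
	if md, ok := metadata.FromIncomingContext(ctx); ok && len(md["dry-run"]) > 0 {
		return &rpc.StoreResp{}, nil // validate only, no side effects
	}
	s.store[req.Key] = req.Val
	return &rpc.StoreResp{}, nil
}
```

Because metadata rides alongside every RPC, this one check works uniformly across all mutating methods.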
Networks fail
Let's add retry logic.
Idempotent logic? We can have safe retries:
But now failed operations are slow. What's going on?
Structured errors
In your response message, you add a new field that indicates whether it is possible to retry in the case of an error.
You really want a structured error, not just a code and string. You want a full object's worth of parameters.
Something like this:
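One possible shape for it in the proto file (message and field names are illustrative):

```protobuf
// A structured error carried in the response body itself.
message CacheError {
  enum Code {
    UNKNOWN = 0;
    NOT_FOUND = 1;
    OVER_QUOTA = 2;
  }
  Code code = 1;
  bool retryable = 2;
  string message = 3;
}

message StoreResp {
  CacheError error = 1;
}
```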
Unfortunately, this gets a little messy: a lot of manual serialization and deserialization is required, since gRPC errors don't carry response bodies.
This is one of the larger frustrations working with gRPC compared to HTTP.
Another feature request: Cache dump
Let's say you now want to add the ability to dump the contents of your cache service.
Now you run into out-of-memory errors. What defensive measures can you add?
Set max number of concurrent streams (simultaneous HTTP/2 streams per client):
gRPC also lets you register an InTapHandle, a piece of code just like the server interceptor, but invoked a little earlier in the request lifecycle.
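Both defenses as grpc-go server options, sketched (the stream cap and the `overloaded` load check are illustrative):

```go
import (
	"context"

	"google.golang.org/grpc"
	"google.golang.org/grpc/tap"
)

srv := grpc.NewServer(
	// Cap simultaneous HTTP/2 streams per client connection.
	grpc.MaxConcurrentStreams(64),

	// Runs before the request is even decoded, earlier than interceptors,
	// so rejected requests cost almost nothing.
	grpc.InTapHandle(func(ctx context.Context, info *tap.Info) (context.Context, error) {
		if overloaded() { // hypothetical load check
			return nil, status.Errorf(codes.ResourceExhausted, "server overloaded")
		}
		return ctx, nil
	}),
)
```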
But what if that doesn't solve all your memory issues?
Streaming
app.proto:
server.go:
client.go:
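A server-streaming `Dump` method, sketched end to end under the same generated-package assumptions as before:

```go
// app.proto gains a server-streaming method, e.g.:
//
//   rpc Dump(DumpReq) returns (stream DumpItem) {}
//
//   message DumpReq {}
//   message DumpItem {
//     string key = 1;
//     bytes val = 2;
//   }

// server.go — send one item at a time instead of buffering the whole cache:
func (s *CacheService) Dump(req *rpc.DumpReq, stream rpc.Cache_DumpServer) error {
	for k, v := range s.store {
		if err := stream.Send(&rpc.DumpItem{Key: k, Val: v}); err != nil {
			return err
		}
	}
	return nil
}

// client.go — receive items until io.EOF signals the end of the stream:
stream, err := cache.Dump(context.Background(), &rpc.DumpReq{})
if err != nil {
	log.Fatalf("failed to dump: %v", err)
}
for {
	item, err := stream.Recv()
	if err == io.EOF {
		break
	}
	if err != nil {
		log.Fatalf("failed to receive item: %v", err)
	}
	fmt.Printf("%s: %s\n", item.Key, item.Val)
}
```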
Load balancing
There's not enough time to go into detail, but one thing to note: The fact that you're establishing a single persistent connection means that every request you make goes to the same server.
So, you have to put the load-balancing logic in the client.
You can of course put this logic into a middleware server so the actual client doesn't have to worry about it. This is still pretty new; the load-balancing spec is experimental.
gRPC clients in other languages
Using your service from Python:
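A Python client against the same service, sketched (the generated module names follow protoc's `app.proto` → `app_pb2`/`app_pb2_grpc` convention and are assumptions):

```python
import grpc
import app_pb2
import app_pb2_grpc

channel = grpc.insecure_channel("localhost:5051")
cache = app_pb2_grpc.CacheStub(channel)

# Same Store/Get API, no hand-written client library required.
cache.Store(app_pb2.StoreReq(key="gopher", val=b"con"))
resp = cache.Get(app_pb2.GetReq(key="gopher"))
print(resp.val)
```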
gRPC downsides
Load Balancing
Structured error handling is unfortunate
No support for browser JS
Breaking API changes
Poor documentation for some languages
No standardization across languages
Where is gRPC used in production?
ngrok — all 20+ internal services communicate via gRPC
Square — replacement for all of their internal RPC; one of the very first adopters of and contributors to gRPC
CoreOS — production API for etcd v3 is entirely gRPC
Google — production APIs for Google Cloud services (e.g. PubSub, Speech Rec)
Netflix, Yik Yak, VSCO, Cockroach, + many more
The future of gRPC
The future of gRPC is easy to track: look at the grpc/grpc-proposals repository and grpc-io mailing list.
New languages (Swift and Haskell are currently experimental)
Further stability, reliability, and performance improvements
Increasingly fine-grained APIs for customizing behavior (connection management, channel tracing)
Browser JS!