Death to JSON!

September 15, 2017

TL;DR: I don’t really want JSON to die, but we need to stop using it for APIs.

Background

Back in 2011 I was working at LiveProfile, a mobile messenger startup. JSON was relatively new and REST APIs (as we know them today) were still in their infancy. We were building out a SOA (Service Oriented Architecture) for our backend system. Mobile clients used the XMPP protocol and we were using an Erlang XMPP server, ejabberd. We wanted to keep the XMPP server simple, just a frontend, with all business logic (contact lists, stored messages, authentication, etc.) built in a Java backend. The Erlang application needed to communicate with that backend, so we began looking at various options for Erlang-to-backend communication.

At the time there were a few major players:

We knew that we could do an HTTP API, as this was tried and tested and probably the most popular option, but we didn’t want to discount RPC, so we took a serious look at Apache Thrift (which at the time had recently been open sourced by Facebook).

RPC

RPC, or Remote Procedure Call, is what we ultimately decided on. RPC, in the most basic sense, is a system of request/response messages typically used in a client/server model. An RPC system defines a strict protocol for request and response messages. This includes the serialization of messages between client and server and, in some cases such as Apache Thrift, the transport protocol as well. At LiveProfile we chose Apache Thrift over an HTTP API for the following reasons.

Performance

Prior to HTTP/2, you needed one connection for each concurrent request (pipelining existed in HTTP/1.1, but it did not fit the bill). A high-concurrency application ends up with many, many HTTP connections. This is not a problem per se, but it is much more efficient to multiplex over a single TCP connection (or a few) than to hold open hundreds or thousands of sockets.

In addition, while HTTP implementations are rather efficient, the Apache Thrift transport protocol was more efficient. This makes a big difference when you are handling tens of thousands of concurrent requests. Thrift supported multiplexing with an asynchronous transport protocol, so we needed just a few sockets instead of many.
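To make the multiplexing idea concrete, here is a minimal sketch in Python. It is not Thrift's actual protocol: the "connection" is just a pair of in-memory queues, and the names MultiplexedClient, toy_server and demo are made up for illustration. The point is only that tagging each request with an ID lets many calls share one connection and lets replies come back in any order.

```python
import asyncio
import itertools


class MultiplexedClient:
    """Toy client: many in-flight calls share one 'connection' (a pair of queues)."""

    def __init__(self, outbound: asyncio.Queue, inbound: asyncio.Queue) -> None:
        self._outbound = outbound
        self._inbound = inbound
        self._ids = itertools.count(1)
        self._pending: dict[int, asyncio.Future] = {}

    async def call(self, payload: str) -> str:
        # Tag the request with an ID so its reply can be matched later.
        request_id = next(self._ids)
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        await self._outbound.put((request_id, payload))
        return await future

    async def read_loop(self) -> None:
        # A single reader routes each reply to the caller waiting on that ID.
        while True:
            request_id, reply = await self._inbound.get()
            self._pending.pop(request_id).set_result(reply)


async def toy_server(requests: asyncio.Queue, replies: asyncio.Queue) -> None:
    in_flight = set()

    async def handle(request_id: int, payload: str) -> None:
        # Later requests finish first, so replies arrive out of order.
        await asyncio.sleep(0.01 * (5 - request_id))
        await replies.put((request_id, payload.upper()))

    while True:
        request_id, payload = await requests.get()
        task = asyncio.create_task(handle(request_id, payload))
        in_flight.add(task)  # keep a reference until the handler is done
        task.add_done_callback(in_flight.discard)


async def demo() -> list[str]:
    to_server, to_client = asyncio.Queue(), asyncio.Queue()
    client = MultiplexedClient(to_server, to_client)
    background = [
        asyncio.create_task(toy_server(to_server, to_client)),
        asyncio.create_task(client.read_loop()),
    ]
    try:
        # Four concurrent calls, one shared connection.
        return await asyncio.gather(*(client.call(w) for w in ["a", "b", "c", "d"]))
    finally:
        for task in background:
            task.cancel()
```

Running asyncio.run(demo()) gives each caller its own reply even though the toy server answers the last request first. With one connection per request, the same workload would need four sockets instead of one.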

Serialization performance is also important, and while JSON parsing and serialization performance has gotten a lot better in recent years, it will probably always trail behind a binary serialization format. For example, see this benchmark or search for yourself.
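One reason a binary format tends to do less work is that it does not carry field names or spell numbers out as text. The sketch below uses Python's standard json and struct modules on a made-up record (the field names and the fixed layout are assumptions for illustration, and this is not the Thrift or protobuf encoding). It compares sizes only and is not a speed benchmark.

```python
import json
import struct

# Hypothetical record; the field names are illustrative, not from any real API.
record = {"user_id": 123456, "timestamp": 1505433600, "online": True}

# JSON repeats every field name and writes numbers as decimal text.
json_bytes = json.dumps(record).encode("utf-8")

# Fixed binary layout: unsigned 32-bit id, unsigned 32-bit timestamp, 1-byte bool.
# "<" selects little-endian with no padding, so the result is 4 + 4 + 1 = 9 bytes.
LAYOUT = struct.Struct("<II?")
binary_bytes = LAYOUT.pack(record["user_id"], record["timestamp"], record["online"])


def decode(blob: bytes) -> dict:
    """Rebuild the record; both sides must already agree on the layout."""
    user_id, timestamp, online = LAYOUT.unpack(blob)
    return {"user_id": user_id, "timestamp": timestamp, "online": online}


print(len(json_bytes), "bytes as JSON vs", len(binary_bytes), "bytes packed")
```

The trade-off is that the packed bytes are meaningless without the layout, which is exactly the role a Thrift or protobuf definition plays.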

Strong Message/Type Contracts

JSON schemas do exist now, and XML schemas have existed for a while, but the mapping process can be tedious, leaving lots of room for human error. The errors typically encountered are using the wrong type or misspelling a field name. While something like a JSON schema does help, it’s practically guaranteed that someone using your API will forgo the schema and write the JSON by hand.
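Here is a small sketch of how a misspelled field name slips through hand-written JSON handling, and how a typed message definition catches it. The SendMessage type and its recipient field are hypothetical, and the dataclass merely stands in for the kind of checking that generated RPC code gives you for free.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SendMessage:
    """Hypothetical message type with one required string field."""
    recipient: str


def parse_loosely(raw: str) -> str:
    # Typical hand-rolled handling: a missing or misspelled key silently becomes "".
    return json.loads(raw).get("recipient", "")


def parse_strictly(raw: str) -> SendMessage:
    # Unknown or missing keys raise TypeError when the dataclass is constructed.
    message = SendMessage(**json.loads(raw))
    if not isinstance(message.recipient, str):
        raise TypeError("recipient must be a string")
    return message


typo = '{"recipent": "bob"}'  # note the misspelled key

print(repr(parse_loosely(typo)))  # the typo goes unnoticed
try:
    parse_strictly(typo)
except TypeError as error:
    print("rejected:", error)
```

The loose parser happily returns an empty recipient, which is the kind of bug that only shows up in production. The strict parser fails at the boundary, where the mistake is cheap to find.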

gRPC, a modern RPC framework

Thrift served us well in 2011, but today there is a better alternative, gRPC. gRPC uses proto3 for serialization and HTTP/2 for the transport layer. The gRPC ecosystem has many robust server and client libraries that provide high performance RPC in many languages.

gRPC’s use of HTTP/2 allows it to take advantage of multiplexing over a single TCP connection. The proto3 message format is mature and high performance, meaning lower CPU usage and in some cases better space efficiency for messages (although compressed JSON does come close).

The beauty of protobuf is that there is a single, canonical definition of your API. For example, see the following service definition:

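syntax = "proto3";

// A service with a single unary RPC.
service HelloService {
  rpc SayHello (HelloRequest) returns (HelloResponse);
}

// The "= 1" is the field number used on the wire, not a default value.
message HelloRequest {
  string greeting = 1;
}

message HelloResponse {
  string reply = 1;
}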

With the above service definition, code can be generated to produce stubs for server implementations and full, working clients. There is no need to worry about type mismatches or field name mismatches. In addition, a message can have new fields added at any time to support new functionality without clients needing to update.
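The reason new fields don't break old clients is that every field on the wire is prefixed with its field number and a wire type, so a reader can skip fields it doesn't recognize. Below is a hand-rolled sketch of that mechanism in Python. It covers only two wire types (varint and length-delimited), the second field is hypothetical, and real generated protobuf code does all of this for you (and typically preserves unknown fields rather than dropping them as this sketch does).

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative int as a base-128 varint, low 7 bits first."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def decode_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Return (value, next_position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_string_field(field_number: int, text: str) -> bytes:
    payload = text.encode("utf-8")
    tag = (field_number << 3) | 2  # wire type 2: length-delimited
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def decode_known_fields(data: bytes, known: set[int]) -> dict[int, object]:
    """Decode a message, keeping only the field numbers this reader knows."""
    fields: dict[int, object] = {}
    pos = 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:  # varint
            value, pos = decode_varint(data, pos)
        elif wire_type == 2:  # length-delimited (strings, bytes, nested messages)
            length, pos = decode_varint(data, pos)
            value = data[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if field_number in known:
            fields[field_number] = value
    return fields


# A newer server adds a hypothetical field 2 to HelloResponse.
new_response = encode_string_field(1, "hi") + encode_string_field(2, "en")

# An old client that only knows field 1 (reply) still reads the message.
old_client_view = decode_known_fields(new_response, known={1})
```

Here old_client_view contains only field 1, and the extra field is skipped without an error. The flip side of this design is that field numbers, not names, are the contract: renaming a field is safe, but reusing a number for a different meaning is not.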

Should JSON really die?

Well, no. For anything that might be handwritten (or hand-modified), like a configuration file, JSON is perfect. Browser-to-server communication (via JavaScript, for example) is also still probably easier and more practical with JSON in most cases, but if you do choose gRPC for writing your services, grpc-gateway will automatically handle the JSON-to-gRPC translation for you.

gRPC really shines for internal service communication and specific use cases like a mobile iOS or Android application talking to a backend server, so I highly recommend using it in place of a REST API for those.
