Introduction
Documenting an undocumented API is a deceptively difficult exercise. Two popular approaches to autogenerating a specification exist. Firstly, proxies can be used to generate specifications from network traffic. Secondly, tools exist to generate specifications from the structure of a codebase.
Rather than deal with the fuss of manually configuring proxies, at-your-service uses a service worker as a proxy directly on the frontend.
It records network traffic independently, with no need to integrate with application code. The tool records observations of network traffic over time and learns the shape of APIs.
What This Is About
Why Care?
There is a huge variety of applications for OpenAPI specifications:
- Generate client TypeScript definitions for your API with libraries such as openapi-typescript
- Generate Postman collections and convert into a range of other formats with APIMatic
- Generate mock servers with dynamic sampling using Prism
- Generate API clients and documentation using OpenAPI Generator
Information is valuable. Attaining it has a cost, but the dividend it provides can pay that back many times over. Accurate specifications and documentation offer a lot of utility.
The Problem
If there is no documentation for an API, or that documentation is not completely accurate, then users of the API face a problem. The only way to know what the API does is to observe it in action or inspect the source code.
An observation in this sense means taking a look at HTTP messages and inferring the structure of headers and bodies. If there is truly no documentation at all then even the structure of paths must be determined.
In the real world there is almost always information on the paths of an API as this is basic information required to use it. But documentation on API behaviour requires far more maintenance and rigour, and its existence doesn’t guarantee accuracy.
It is always in our best interests to ensure the information we have to hand on APIs is as accurate as possible.
Solution 1: Inspect Code and Document the API
The optimal solution is to create an accurate specification of the API's behaviour by hand. While optimal, this can be difficult or outright impossible for several reasons.
Firstly, the API may belong to a third party or be otherwise inaccessible, making it impossible to review the codebase. Secondly, it requires a non-trivial investment of time and resources. The scope of work required to document large, complex systems can be intimidatingly vast and costly: the backend may be a complex monolith, or an API gateway behind which sits any number of microservices.
While this solution is best practice, it often isn’t practical.
Solution 2: Autogenerate Specification from Code
An effective solution with implementations in a variety of languages and frameworks is to autogenerate OpenAPI specifications from code. This approach relies on some general assumptions being true, namely that data is deserialized into defined data structures or otherwise modelled such that an underlying schema can be derived.
Codegen tools rely on heuristics that may not hold in a particular codebase, for example that server frameworks are used in a particular way. Where these assumptions fail, the generated result may prove unreliable in practice.
If this solution doesn’t work then it may be possible to invest a small amount of effort to address issues in the codebase preventing autogeneration from succeeding. However, if the scope of that work is large then you may as well implement Solution 1.
Solution 3: Infer Behaviour From Observations in a Proxy
The idea is to configure a proxy in front of a backend service that forwards requests and saves information about the application layer of the network stack. Then an API is interacted with in some way, requests are fired, and responses to those requests received. Finally, once sufficient data has been collected it is converted into some format and from that format an OpenAPI specification is generated.
An advantage of this approach is the inversion of control it offers: developers can create conditions that exhaust the possible behaviours of an API. That way the observations of the API are complete, meaning no additional observation exists that would change the generated specification. As a quick and dirty solution, or just the only solution, that’s nice.
A minimum set of requests can be identified to fully observe the behaviour of an API. In terms of the accuracy of any resulting specification, we can say that it is accurate as long as the conditions under which it was generated remain true. Should those conditions change, a new specification can be generated to account for new, revised, or deleted information.
The at-your-service tool takes the proxy approach to API observability, specification creation, and code generation. In this case a Service Worker is the proxy and everything happens in the context of a browser environment. Its main design goal is to enable rapid prototyping and investigation of backend APIs with a tool that is easy to install and doesn’t require integration with application code.
Proxies
How They Work
Proxies act as middlemen that stand between backend services and a client. When a client sends a request to a backend service the request is routed through a proxy, which forwards it on to a destination server. The server then replies to the proxy, which relays the response back to the client, and so the proxy is aware of everything that is happening on the network from the application’s perspective.
Service Workers as Proxies
Service workers are a special class of web worker that have unique capabilities. They are used to create progressive web apps and implement advanced caching behaviour.
Under the hood they act like a proxy server. Service workers can intercept requests dispatched by a frontend application through the FetchEvent and can cache responses or perform other tasks. This process does not require any integration with application code. They are installed asynchronously and act independently.
Therefore, rather than using a third-party proxy such as Charles, the at-your-service tool uses a service worker that emits events as messages. The main thread listens for these messages and converts them into an optimised data structure.
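To make that concrete, here is a minimal sketch of the service-worker side. The names (NetworkSample, buildSample) and message shape are illustrative assumptions, not at-your-service's actual internals; the fetch-interception and postMessage pattern itself is standard Service Worker API usage.

```typescript
// Illustrative shape of what the worker might send to the main thread.
interface NetworkSample {
  method: string;
  url: string;
  status: number;
  responseBody: unknown;
}

// Extract the fields a client on the main thread would need in order to
// infer a schema. The response is cloned so its original body stream can
// still be delivered to the page untouched.
async function buildSample(
  request: Request,
  response: Response
): Promise<NetworkSample> {
  return {
    method: request.method,
    url: request.url,
    status: response.status,
    responseBody: await response.clone().json().catch(() => null),
  };
}

// Inside a real service worker the global scope exposes `clients` and fetch
// events; this guard lets the sketch load outside a worker without running.
const workerScope = globalThis as any;
if (typeof workerScope.clients !== "undefined") {
  workerScope.addEventListener("fetch", (event: any) => {
    event.respondWith(
      (async () => {
        const response = await fetch(event.request);
        const sample = await buildSample(event.request, response);
        // Forward the observation to every controlled page.
        for (const client of await workerScope.clients.matchAll()) {
          client.postMessage(sample);
        }
        return response;
      })()
    );
  });
}
```

Note that the page still receives the original response: the worker only observes a clone, which is why no application code needs to change.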
Observations
Recording Requests and Responses
To produce accurate schemas and model code all possibilities must be accounted for. Imagine an endpoint /api that returns either { "error": "oh no" } or { "error": null }.
The goal is to determine the type of the final response. To do that the tool uses a heuristic: it assumes that all bodies are JSON, so the underlying types of values in JSON objects can be inspected directly. Aggregating bodies produces a schema for a given path that includes information from all observations; in the example above, the aggregated type of error is string or null.
Each request/response pair is considered an event and stored if it contains new information. An event is considered to have new information if its schema has not been observed so far.
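A sketch of this heuristic, under the stated assumption that every body is JSON: derive a structural schema from a value, serialize it as a key, and store an observation only when that key is new for the path. The function names here (deriveSchema, recordSample) are hypothetical, not the library's real API.

```typescript
// A structural schema derived from a JSON value.
type Schema =
  | { type: "null" | "boolean" | "number" | "string" }
  | { type: "array"; items: Schema[] }
  | { type: "object"; properties: Record<string, Schema> };

// Map a JSON value to its structural type, recursively.
function deriveSchema(value: unknown): Schema {
  if (value === null) return { type: "null" };
  if (Array.isArray(value)) {
    return { type: "array", items: value.map(deriveSchema) };
  }
  if (typeof value === "object") {
    const properties: Record<string, Schema> = {};
    for (const [key, v] of Object.entries(value as object)) {
      properties[key] = deriveSchema(v);
    }
    return { type: "object", properties };
  }
  return { type: typeof value as "boolean" | "number" | "string" };
}

// Schemas observed so far, keyed by path.
const seen = new Map<string, Set<string>>();

// Store an observation only if it carries new information, i.e. its
// derived schema has not been seen for this path. Returns true if stored.
function recordSample(path: string, body: unknown): boolean {
  const key = JSON.stringify(deriveSchema(body));
  const schemas = seen.get(path) ?? new Set<string>();
  seen.set(path, schemas);
  if (schemas.has(key)) return false; // no new information
  schemas.add(key);
  return true;
}
```

For the /api example, { "error": "oh no" } and { "error": null } produce two distinct schemas and are both stored; a later { "error": "anything else" } derives the same schema as the first and is discarded. Merging the stored schemas yields error: string or null.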
Architecture of At Your Service
The tool consists of a client that runs on the main thread, and a service worker that acts as a proxy. The service worker listens for fetch events that fire when a request is dispatched. It emits events containing information about requests and responses that the client receives on the main thread.
The client stores these events as Samples in a Store. A Sample is only stored if its schema differs from previous observations. The Store persists data and hydrates when the browser refreshes, which lets it work with both single-page and multi-page applications.
The at-your-service tool includes a UI that consists of a button that opens a drawer. Inside that drawer is the basic functionality of the library. You can copy an OpenAPI 3.1 specification that uses JSON Schema 2020-12 for request and response objects.
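As an illustration (not the tool's verbatim output), a specification covering the /api example from earlier might contain a fragment like the following, using JSON Schema 2020-12's type unions, which OpenAPI 3.1 adopts in place of the older nullable keyword:

```yaml
openapi: 3.1.0
info:
  title: Observed API
  version: 0.0.0
paths:
  /api:
    get:
      responses:
        "200":
          description: Observed response
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    # JSON Schema 2020-12 expresses nullability as a type union
                    type: [string, "null"]
```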
You can also leverage the quicktype library to generate model code and type definitions for response bodies in 10+ languages such as TypeScript, Rust, and Python.
Summary
Some innovative solutions have cropped up in this space recently, notably Akita for API discovery and observability of backend services: it attaches a proxy agent that records information about requests and sends metadata to an aggregator.
Using service workers to this effect may be of interest to anyone looking to explore an application’s network traffic.