Leveled Logging

One of the recent updates to selfster was integrating a logging library. The goal was to get the foundations implemented and leave room for iteration as needs become more concrete. Another goal was to keep the total dependencies and API surface area to a minimum. Libraries like logrus, zap, and glog are good tools, but also a lot. Logrus is one of the first structured logging libraries in Go and follows an older style where it’s mostly a big leveled extension to the log package. Zap adds explicit structure but is also built for hyperscale environments, with some ergonomics discouraged for performance reasons. There is a lot to think through before picking a tool, so let’s take a step back and look at our needs.

Leveled Logging

Dave Cheney boils down, in slightly hyperbolic fashion, that logging is for two things

I believe that there are only two things you should log:

  1. Things that developers care about when they are developing or debugging software.
  2. Things that users care about when using your software.

The thesis there is that you only need two levels: INFO for users and operators, and DEBUG for developers. I agree with this in essence, but I maintain that you still need at least ERROR to better filter the log stream for important messages. The real kicker is recognizing that no one is going to read your logs unless something is broken, so only log what matters. As for pre-filtering, I’ll agree that you shouldn’t be able to set the minimum level any higher than INFO. As the 12 Factor app states,

A twelve-factor app never concerns itself with routing or storage of its output stream

and

each running process writes its event stream, unbuffered, to stdout

Basically, filtering, routing, and storing logs is an external tool’s job, not the app’s. So in that environment you may still benefit from an error level as a quick incident heuristic, but you shouldn’t be able to log only errors. Similarly, the intermediate levels, like WARN, CRITICAL, TRACE, and DEBUG-2, are all silliness.

So we do want levels, just not many. Next we want some way to define multiple fields for logs so we can give better context for messages.

Structured Logging

Structured logging defines a standard multi-field format for logs. This allows the logs to be treated more like events in a stream instead of unstructured text. With a defined structure we can enable other applications to read, filter, aggregate and act on the log stream. Some basic fields might be

  1. Timestamp
  2. Level
  3. Message

From there we can expand based on the use case. For an http server request logger, we might add

  1. IP address
  2. Request ID
  3. Path
  4. Status Code
  5. Latency

The actual output format can be fields separated by whitespace, JSON, XML, or any other suitable format. The trick is how to handle field types, grouping and nesting attributes, and setting attributes that are shared between multiple log calls.

As an example, take a request-scoped logger. This logger might be a clone of the default process logger. We need to add some basic context as the request starts its handler sequence, context that will be shared throughout the request’s lifetime. We add a request ID to the logger at the start of the handler chain. Now every subsequent handler can pull out the request logger and log something, and the message will also contain the request ID. In the log stream, even if request logs are interleaved, you can construct a time-ordered view of a single request with a simple grep call. On the return path, the logging handler can get info about the request and response to log the request as handled, still sharing the same context. Request IDs should be ordinal; that way you can get useful metrics, like requests per hour, right from the logs.

Implementation

The implementation strategy I ultimately started with is the proposed standard library package slog, which was still experimental at the time of integration. It covers the needs of selfster without making assumptions about the environment, has just enough features, and strives to be as performant as possible. It draws a lot of inspiration from zap and zerolog while sitting somewhere in between in terms of features. The logging library itself is wrapped with an internal interface to help control usage patterns in the application and allow easier replacement of the backing library if needed. Since its integration in the project, Go 1.21 has shipped slog as part of the standard library.

An interesting thing you get with request-scoped logging is a pretty cheap method of extracting the four golden monitoring signals:

  • Latency, measured per request.
  • Errors, by having a consistent pattern for error responses, like logging at ERROR level.
  • Traffic, by using ordinal request IDs: subtract a request ID from roughly an hour ago from the most recent one to get requests in the last hour.
  • Saturation, which the 99th percentile of latency over a small window can indicate.
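The traffic signal is just arithmetic on ordinal IDs. A trivial sketch, with a hypothetical function name:

```go
package main

import "fmt"

// requestsPerHour: with ordinal request IDs, throughput over a window
// reduces to a subtraction between two log lines' IDs.
func requestsPerHour(latestID, idOneHourAgo uint64) uint64 {
	return latestID - idOneHourAgo
}

func main() {
	// e.g. ID 15200 in the newest log line, ID 12000 an hour earlier
	fmt.Println(requestsPerHour(15200, 12000)) // 3200 requests in the last hour
}
```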

last-modified: 2023-09-27 15:23 CDT