Using Go's 'context' library in panic handlers

Dec 1, 2016 · 532 words · 3 minute read

golang

golang · bugsnag · context · advent

So far in this series, I’ve looked at how to use context for two cross-cutting concerns: logging and performance monitoring. The third is one that somewhat overlaps with the logging, which is dealing with panics.

There are a ludicrous number of services out there which let you capture detailed information about what went wrong when you get a panic. I’m going to talk about bugsnag, but there’s no particular reason that another one like sentry would be much different.

This post comes with a cautionary tale. Unfortunately when a panic happens, it starts rapidly unwinding stacks until recover() is called in a defer block somewhere. These are somewhat few and far between, and then occasionally fitted into libraries in which perhaps they should not be. I’m looking at you, database/sql and your custom type encoding/decoding API swallowing my exceptions.

As that unwinding happens, potentially useful information in the context are thrown away. Even if the recovery handler was able to find them, it would need to look inside the private contents of the context to get anything useful out of it.

But all is not lost. There are a few things you can do to mitigate this, and you can use context to help a little.

First, supply your own defer/recover block, perhaps in your base or wrapper handler function, and make sure it has access to an object which has a unique ID and/or a context in it, and then pass it through with whatever else you can gather:

    func (handler *BaseHandler) Recover() {
        if exc := recover(); exc != nil {
            panicField = zap.Object("panic", exc)
            // log the panic
            logger := logging.WithContext(handler.Ctx)
            logger.Error("caught a panic", panicField)
            // Bugsnag...
            metadata = make(bugsnag.MetaData)
            metadata['IDs'] = handler.GetIds()
            metadata['Args'] = handler.GetArgs()
            bugsnag.Notify(exc, metadata)
        }
    }

Secondly, make sure that this unique ID is logged on every line in your logs. While an individual context can’t be mutated, I’ve found that methods which “add” information to a context stored on, eg a request object, is useful:

    func (handler *BaseHandler) AddStringId(idName, id string) {
        handler.Ctx = logging.NewContext(
            handler.Ctx, zap.String(idName, id),
        )
        handler.ids[fieldName]
    }

A warning, though - be careful storing context objects on other objects; in general they should not persist longer than the scope in which they were created. This ‘handler’ object happens to be created for each request, so I’m happy with this - but I wouldn’t add any ID unless I knew it was relevant to the entire request.

Finally, make sure you’re logging everything you need to be able to reconstruct the call: API call arguments, the address of the event in the kafka logs, key identifiers, etc, and that this line has the unique ID.

    func (handler *BaseHandler) logRq(args map[string]interface{}) {
        fields := handler.asFieldsScrubbed(args)
        if handler.Auth != nil {
            fields = append(fields, handler.Auth.Fields()...)
        }
        logger := logging.WithContext(handler.Ctx)
        logger.Info(handler.Path, fields...)
    }

So far that’s about the best I have found to be able to do. It would be amazing though if a panic handler could poke around in the call stack, find the most specific context, walk its private contents and send everything to the exception handling service. So far, I haven’t found any library which does this, but perhaps one day!