The Problems with Errors

Currently I find myself working simultaneously in two very different languages: my work at Canonical sees me writing almost exclusively in Python, while in my spare time I'm working with Go and QML. Both Python and Go have some wonderful, killer features, but today I'd like to write about their error handling strategies, and particularly an annoyance I have with both of their approaches. Before we begin, we should recap the approaches used by both languages.

Exceptions vs Error Codes

This post is not about exceptions vs. error codes. I want to get this out of the way quickly, since this seems to be a bone of contention between the gophers and the pythonistas. The problems I'll describe later apply equally to both languages.

The Python Way

Python indicates errors using exceptions. The "pythonic" idiom for error handling is to perform operations, and catch exceptions later. The mantra when writing Python code is that it's "easier to ask for forgiveness than permission". That is, don't ask whether you can perform a certain operation, just try it, and if it blows up, you can catch the exception and recover gracefully. A typical (rather contrived) example of this style of error handling is shown below:

try:
    print(my_dict['custom_greeting'])
except KeyError:
    print("Hello World")

The important point here is that when writing this code, you need to know what possible exceptions your code will raise. Learning the names of the various low-level exceptions is a key skill in becoming an experienced Python programmer. When writing low level code like the example above, it's reasonable to expect the programmer to learn the comprehensive list of possible exceptions raised by a particular operation. The Python documentation is reasonably good at listing the exceptions raised by a particular type. For example, the documentation for the code above says:

d[key]

    Return the item of d with key key. Raises a KeyError if key is not in the map.

(some text omitted for clarity).

Learning which exceptions to catch for low-level code isn't so hard.

The Go Way

I should mention at this point that I'm not a Go expert, so anything I say in this article about Go may very well be wrong!

Go uses returned error codes instead of exceptions. A similarly contrived example to the Python snippet above might look like this:

var custom_greeting, exists = m["custom_greeting"]
if !exists {
    fmt.Print("Hello world")
} else {
    fmt.Print(custom_greeting)
}

In this particular case, the "error code" is a simple boolean value, indicating whether the key existed or not. Like Python, the Go documentation is reasonably good at pointing out what the error code return values are:

An index expression on a map a of type map[K]V may be used in an assignment or initialization of the special form

v, ok = a[x]
v, ok := a[x]
var v, ok = a[x]

where the result of the index expression is a pair of values with types (V, bool). In this form, the value of ok is true if the key x is present in the map, and false otherwise. The value of v is the value a[x] as in the single-result form.

The Problem

In Go

The problem comes when calling higher level library code. In both languages, it can be very hard to determine what the potential error codes or exceptions are. Consider this snippet of Go code:

resp, err := http.Get("http://google.com")

This attempts to do an HTTP "get" request to the google.com server. There are literally hundreds of things that can go wrong here. Some errors are common, some are rare; some errors are recoverable, some are not. In all cases though, we need to handle the error correctly. To name just a few of the potential problems we might encounter:

  • The computer has no default route set, and so cannot work out which network interface to use.
  • The computer has no DNS server configured, and so cannot resolve 'google.com' to an IP address.
  • The DNS server cannot resolve the 'google.com' domain to an IP address.
  • The connection to the DNS server times out.
  • The TCP connection is rejected by the destination server.
  • The HTTP connection times out.
  • One of the many HTTP error codes is returned.

The documentation for the http.Get call says the following:

func (*Client) Get

func (c *Client) Get(url string) (resp *Response, err error)

Get issues a GET to the specified URL. If the response is one of the following redirect codes, Get follows the redirect after calling the Client's CheckRedirect function.

301 (Moved Permanently)
302 (Found)
303 (See Other)
307 (Temporary Redirect)

An error is returned if the Client's CheckRedirect function fails or if there was an HTTP protocol error. A non-2xx response doesn't cause an error.

When err is nil, resp always contains a non-nil resp.Body. Caller should close resp.Body when done reading from it.

Nowhere (that I can see) in the documentation does it give a list of the possible error codes. Without knowing which possible error codes may be returned by that library call, how am I supposed to handle them?

In Python

Before the Python readers get too smug, the problem exists in your language as well:

import urllib.request
response = urllib.request.urlopen("http://google.com")

This code is just as fragile as the Go code was above. The Python docs for this call say:

urllib.request.urlopen(url, data=None[, timeout], *, cafile=None, capath=None, cadefault=False)

    Open the URL url, which can be either a string or a Request object.

    data must be a bytes object specifying additional data to be sent to the server, or None if no such data is needed. data may also be an iterable object and in that case Content-Length value must be specified in the headers. Currently HTTP requests are the only ones that use data; the HTTP request will be a POST instead of a GET when the data parameter is provided.

    data should be a buffer in the standard application/x-www-form-urlencoded format. The urllib.parse.urlencode() function takes a mapping or sequence of 2-tuples and returns a string in this format. It should be encoded to bytes before being used as the data parameter. The charset parameter in Content-Type header may be used to specify the encoding. If charset parameter is not sent with the Content-Type header, the server following the HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1 encoding. It is advisable to use charset parameter with encoding used in Content-Type header with the Request.

    urllib.request module uses HTTP/1.1 and includes Connection:close header in its HTTP requests.

    The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used). This actually only works for HTTP, HTTPS and FTP connections.

    The optional cafile and capath parameters specify a set of trusted CA certificates for HTTPS requests. cafile should point to a single file containing a bundle of CA certificates, whereas capath should point to a directory of hashed certificate files. More information can be found in ssl.SSLContext.load_verify_locations().

    The cadefault parameter specifies whether to fall back to loading a default certificate store defined by the underlying OpenSSL library if the cafile and capath parameters are omitted. This will only work on some non-Windows platforms.

    Warning

    If neither cafile nor capath is specified, and cadefault is False, an HTTPS request will not do any verification of the server’s certificate.

    For http and https urls, this function returns a http.client.HTTPResponse object which has the following HTTPResponse Objects methods.

    For ftp, file, and data urls and requests explicity handled by legacy URLopener and FancyURLopener classes, this function returns a urllib.response.addinfourl object which can work as context manager and has methods such as

        geturl() — return the URL of the resource retrieved, commonly used to determine if a redirect was followed
        info() — return the meta-information of the page, such as headers, in the form of an email.message_from_string() instance (see Quick Reference to HTTP Headers)
        getcode() – return the HTTP status code of the response.

    Raises URLError on errors.

    Note that None may be returned if no handler handles the request (though the default installed global OpenerDirector uses UnknownHandler to ensure this never happens).

    In addition, if proxy settings are detected (for example, when a *_proxy environment variable like http_proxy is set), ProxyHandler is default installed and makes sure the requests are handled through the proxy.

    The legacy urllib.urlopen function from Python 2.6 and earlier has been discontinued; urllib.request.urlopen() corresponds to the old urllib2.urlopen. Proxy handling, which was done by passing a dictionary parameter to urllib.urlopen, can be obtained by using ProxyHandler objects.

    Changed in version 3.2: cafile and capath were added.

    Changed in version 3.2: HTTPS virtual hosts are now supported if possible (that is, if ssl.HAS_SNI is true).

    New in version 3.2: data can be an iterable object.

    Changed in version 3.3: cadefault was added.

I've included the complete documentation for that function because I don't want to be accused of skulduggery. In the middle of that documentation is a single sentence:

Raises URLError on errors.

"Aha!" you may be thinking, "Python docs are better than Go docs! We win the language wars! Death to the gophers!" "Not so fast", I say, there're a few problems here.

First, that sentence suggests to me that this method raises URLError on all errors, which is flat out wrong, as a simple scan of the source code will reveal, and this Python interpreter session will confirm:

>>> import urllib.request
>>> urllib.request.urlopen("https://google.com", capath="/etc/ca-certificates/")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.4/urllib/request.py", line 143, in urlopen
    raise ValueError('SSL support not available')
ValueError: SSL support not available

Straight away we can see that the documentation is at best misleading, and at worst flat out wrong.
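
The same point can be checked without a network connection or any particular SSL build. The sketch below is not from the documentation; failure_type is a hypothetical helper introduced only for illustration. It passes urlopen a string with no URL scheme and reports what comes back. As far as I can tell from the source, the request is rejected with a plain ValueError before any connection is attempted.

```python
import urllib.error
import urllib.request


def failure_type(url):
    """Return (exception class name, is it a URLError?) for urlopen(url).

    Hypothetical helper for illustration only. Returns (None, False)
    if the call succeeds.
    """
    try:
        urllib.request.urlopen(url)
    except Exception as exc:
        return type(exc).__name__, isinstance(exc, urllib.error.URLError)
    return None, False


# A string without a scheme is rejected while the request is being built,
# so no network access happens here.
print(failure_type("no-scheme-here"))
```

Anyone relying on "Raises URLError on errors" alone would let this one escape.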

The second issue is that, even if we ignore the fact that the method does in fact raise things other than URLError, the documentation only lets us write the following code:

import urllib.error
import urllib.request

try:
    response = urllib.request.urlopen("http://google.com")
except urllib.error.URLError as e:
    pass  # now what?

So yeah, we can catch URLError instances, but there's no list (as far as I can see) of all the possible errors, and how to identify them through the exception instance.
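
To make the "now what?" concrete, here is a rough sketch of the kind of guesswork this forces on the caller. describe_failure is a hypothetical helper, and the mapping inside it is an assumption pieced together from reading the urllib source, not something the documentation promises: HTTPError is a subclass of URLError, and the reason attribute of a URLError often (but not always) holds the underlying OSError.

```python
import socket
import urllib.error


def describe_failure(exc):
    """Guess what went wrong from an exception raised by urlopen().

    The mapping below is an assumption drawn from reading the source,
    not a documented contract.
    """
    # HTTPError subclasses URLError, so it has to be tested first.
    if isinstance(exc, urllib.error.HTTPError):
        return "server answered with HTTP status %d" % exc.code
    if isinstance(exc, urllib.error.URLError):
        reason = exc.reason  # often the underlying OSError, sometimes a str
        if isinstance(reason, socket.gaierror):
            return "could not resolve the host name"
        if isinstance(reason, socket.timeout):
            return "connection timed out"
        if isinstance(reason, ConnectionRefusedError):
            return "connection refused"
        return "some other URLError: %r" % (reason,)
    return "not a URLError at all: %r" % (exc,)


# Exercise the helper with hand-built exceptions, so no network is needed.
refused = urllib.error.URLError(ConnectionRefusedError(111, "Connection refused"))
print(describe_failure(refused))  # prints "connection refused"
```

Every branch here was found by experiment or by reading library internals, which is exactly the work that a documented list of error cases would make unnecessary. It is also likely to be incomplete.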

In Both Languages

This situation leaves us with a few terrible options:

We could catch all errors, assume that they're all fatal, and not continue any processing in this function. This leaves us with code that looks like this in Go:

resp, err := http.Get(full_url)
if err != nil {
    return nil, err
}

The Python equivalent is just as repugnant. We can let the exception unwind the stack by not catching it in the first place (or we could catch it and replace it with a different exception):

response = urllib.request.urlopen("http://google.com")

The other option is to simply ignore the errors, and try to continue anyway. This is probably an even worse idea than assuming all errors are fatal. In Go, this would look like:

resp, _ := http.Get(full_url)

We use _ to ignore the returned error code. In Python, we catch all exceptions and don't do any handling:

try:
    response = urllib.request.urlopen("http://google.com")
except Exception:
    pass

As an aside, note how the different error raising techniques (i.e. exceptions vs. error codes) affect the code you need to write. In the first set of examples, where we're assuming that all errors are fatal (which is probably the more sensible approach), returning error codes requires you to type more: you need to test for an error code and explicitly return it, while exceptions will do that for you. In the second set of examples, where we're ignoring all error codes (which is probably a really dumb thing to do), it's easier (less typing) to ignore an error code than it is an exception. I'm not saying that either approach is better than the other, I just find it interesting to note the concrete symptoms of language design.

The Solution

Both languages ship a standard library with reasonable documentation. I say "reasonable" since, while they do a good job of documenting the parameters required and the effect of the function in question, they almost never do a good job of documenting the possible errors that function may raise.

The list of possible errors raised is an *essential* part of a function's contract, and must be documented.

To my mind, the error cases present within a function are every bit as important as the parameters it takes, the value(s) it returns, and the effect of calling the function. They're a fundamental piece of information that's required in order to call the function safely.
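
As an illustration of what such a contract could look like, here is a minimal sketch. Everything in it is hypothetical (load_greeting, the GreetingError classes, and the JSON file layout are invented for this example). It funnels the low-level failures it knows about into a small, named set of exceptions and lists that set in the docstring. It assumes the caller passes a string path and does not claim to cover every conceivable failure.

```python
import json


class GreetingError(Exception):
    """Base class for everything load_greeting() raises deliberately."""


class GreetingFileUnreadable(GreetingError):
    """The file could not be opened or read."""


class GreetingFileMalformed(GreetingError):
    """The file was read, but its contents are not usable."""


def load_greeting(path):
    """Return the 'custom_greeting' string stored in the JSON file at path.

    Raises:
        GreetingFileUnreadable: the file is missing, is a directory, or
            cannot be read. The original OSError is kept as __cause__.
        GreetingFileMalformed: the file is not valid UTF-8 JSON, or is
            not an object with a string 'custom_greeting' entry.
    """
    try:
        with open(path, encoding="utf-8") as f:
            config = json.load(f)
    except OSError as e:
        raise GreetingFileUnreadable(path) from e
    except ValueError as e:
        # JSONDecodeError and UnicodeDecodeError are both ValueErrors.
        raise GreetingFileMalformed(path) from e
    if not isinstance(config, dict) or not isinstance(
            config.get("custom_greeting"), str):
        raise GreetingFileMalformed(path)
    return config["custom_greeting"]
```

A caller can now catch GreetingError to treat every documented failure alike, or catch one of the two subclasses to react differently to each, without having to read the implementation.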

If I'm writing an application, I want to be able to give a reasonably strong guarantee to the user that the application will do the correct thing in all cases. This means I need to be able to react to errors in a sane manner. Without good error-case documentation, I'm stuck with a strategy that assumes that all errors are equally fatal, which, to be honest, seems like cheating.

I'd love to see some tooling that would help programmers figure out which errors are likely to be raised in each different scenario. This is something I might look at for Python, since inspecting function objects is relatively straightforward.
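
A very rough sketch of where such a tool for Python might start is shown below. This is an assumption about one possible approach, not an existing tool: it uses the standard ast and inspect modules to list the exception names that appear in raise statements directly inside a function's source. The function names are made up for this example.

```python
import ast
import inspect
import textwrap


def raised_names_in_source(source):
    """Return the sorted names of exceptions raised directly in source."""
    names = set()
    for node in ast.walk(ast.parse(textwrap.dedent(source))):
        if not isinstance(node, ast.Raise) or node.exc is None:
            continue  # not a raise statement, or a bare re-raise
        target = node.exc
        if isinstance(target, ast.Call):       # raise ValueError("...")
            target = target.func
        if isinstance(target, ast.Name):       # ValueError
            names.add(target.id)
        elif isinstance(target, ast.Attribute):  # urllib.error.URLError
            names.add(target.attr)
    return sorted(names)


def raised_names(func):
    """Same as above, for a function object whose source file is available."""
    return raised_names_in_source(inspect.getsource(func))


SAMPLE = '''
def lookup(table, key):
    if key is None:
        raise ValueError("key must not be None")
    if key not in table:
        raise KeyError(key)
    return table[key]
'''

print(raised_names_in_source(SAMPLE))  # prints ['KeyError', 'ValueError']
```

The obvious limitation is that this only sees raise statements written in the function itself. It does not follow calls into other functions, so it would miss most of what a high-level call like urlopen can raise, and it cannot see exceptions raised by code implemented in C. A useful tool would need to walk the call graph as well.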

Library authors: please - document your error cases!

