My Debugging Process

Note: This post is mostly aimed at my work colleagues. We're working on deploying systems that consist of multiple micro-services and I have a few tools and techniques that aid debugging. This post explains my process.

Directory Setup

I use a slightly non-standard directory setup. I have a directory for each service and within that directory I store my bzr branches, as well as a virtualenv directory, and a config file for the service. It looks like this (for our current sprint):

core-image-watcher/
  trunk/
  # other branches
  secrets.conf
  ve/
core-image-publisher/
  trunk/
  # other branches
  secrets.conf
  ve/
core-image-tester/
  trunk/
  # other branches
  secrets.conf
  ve/
core-result-checker/
  trunk/
  # other branches
  secrets.conf
  ve/

The non-standard parts here are:

  • Use a single virtualenv folder per service, rather than one per branch. This means less time spent downloading packages from PyPI (which, if you had my Internet connection, you'd appreciate), and lets us install a few utilities that are useful for development.
  • Keep a single config file per service, rather than editing the default config file in each branch. This helps prevent you from accidentally publishing your cloud credentials to the world (although apparently that's not enough in my case).
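If you want to lean on this layout from a helper script, a minimal sketch might look like the following. The function name service_paths is made up for illustration, and it assumes exactly the layout shown above (branch directories sitting next to ve/ and secrets.conf inside the service directory).

```python
from pathlib import Path


def service_paths(branch_dir):
    """Return the shared virtualenv and config paths for a branch checkout.

    Hypothetical helper: assumes <service>/<branch>/, <service>/ve/ and
    <service>/secrets.conf, as in the directory listing above.
    """
    service_dir = Path(branch_dir).resolve().parent
    return {
        "service": service_dir.name,
        "virtualenv": service_dir / "ve",
        "config": service_dir / "secrets.conf",
    }


# Works for trunk or any other branch of the same service.
paths = service_paths("core-image-watcher/trunk")
print(paths["service"])
print(paths["config"])
```

Because both paths are derived from the branch location, switching branches never changes which virtualenv or config file gets used.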

Development Process

There are two essential packages I make sure I have installed in every virtualenv: IPython and pudb.

IPython is seriously awesome. It's the only Python interpreter I ever use. It can do a whole lot of clever things, but I'll show you just a few of them here.

Understanding the Code:

If you want some help with an object foo, type foo? and press enter to see its docstring (much like help(foo), but way less typing). Sometimes the docstring isn't enough and you want to look at the source code. In that case, just add another '?': foo?? will show you the source code for foo. This works for functions, classes, even entire modules, as long as they're written in Python (there's no source to show for builtins and C extensions).
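If you're curious what this boils down to outside IPython, here's a rough plain-Python sketch using the standard library's inspect module. It's only an approximation of what ? and ?? display (IPython also shows things like the signature and file name), and json.dumps is just a convenient stand-in for foo.

```python
import inspect
import json

# Roughly the docstring part of what `json.dumps?` displays.
doc = inspect.getdoc(json.dumps)

# Roughly what `json.dumps??` displays: the Python source of the object.
source = inspect.getsource(json.dumps)

print(doc.splitlines()[0])
print(source.splitlines()[0])
```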

Editing the Code:

Often I'll use the ?? trick to view some source code and then find something that needs to be changed. I used to edit everything in a separate editor, but that can be tiresome - often I know which object I want to edit, but I've forgotten where I imported it from. Also, after editing the code, I then need to re-import the module in IPython, which is error-prone and annoying. There's a better way! Type %edit followed by the object you want to edit. IPython will open whatever editor is set as your default and point it to the object you want to edit. When you close the editor, IPython will reload those edited objects. Here's a short example of me fixing a bug in our code: [video not included]
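To see why the manual re-import is error-prone, here's a minimal sketch in plain Python. This is not the bug fix from our code: the module demo_mod and its greet function are made up for illustration, and the "edit" is simulated by rewriting the file.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Keep the demo independent of cached .pyc files.
sys.dont_write_bytecode = True

# Create a throwaway module on disk (hypothetical name: demo_mod).
workdir = Path(tempfile.mkdtemp())
module_file = workdir / "demo_mod.py"
module_file.write_text("def greet():\n    return 'old'\n")
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

import demo_mod
from demo_mod import greet  # a second name bound to the same function

# Simulate editing the file in a separate editor, then re-importing by hand.
module_file.write_text("def greet():\n    return 'new and fixed'\n")
importlib.reload(demo_mod)

fresh = demo_mod.greet()  # goes through the reloaded module
stale = greet()           # still the function object from before the edit

print(fresh)
print(stale)
```

The name imported with from ... import keeps pointing at the old function after the reload, which is exactly the kind of thing that's easy to miss when you're doing this by hand.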

Debugging the Code:

Sometimes I'll be running functions in IPython and get an uncaught exception. I used to wish I was in a debugger so I could figure out what went wrong. Then I learned that typing %debug after an uncaught exception will run ipdb in post-mortem mode, allowing you to inspect the stack and learn what went wrong.
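Post-mortem debugging works because the exception's traceback keeps the failed frames (and their local variables) alive. Here's a small non-interactive sketch of that idea; parse_size and innermost_frame are made-up names for illustration, not part of our services.

```python
def innermost_frame(exc):
    """Walk the traceback to the frame where the exception was raised."""
    tb = exc.__traceback__
    while tb.tb_next is not None:
        tb = tb.tb_next
    return tb.tb_frame


def parse_size(text):
    """Toy function with a bug: it doesn't know about 'GB'."""
    number, unit = text.split()
    factor = {"KB": 1024, "MB": 1024 ** 2}[unit]
    return int(number) * factor


try:
    parse_size("12 GB")
except KeyError as exc:
    frame = innermost_frame(exc)
    failing_function = frame.f_code.co_name
    failing_locals = dict(frame.f_locals)  # copy of the locals at the failure

print(failing_function)
print(failing_locals["unit"])
```

Outside IPython, the standard library's pdb.post_mortem(exc.__traceback__) gives you an interactive prompt on that same frame.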

You can also debug any arbitrary statement by typing %debug <statement>. I think of this approach as 'bottom-up' debugging - I'm debugging errors in the low-level functional code.

Sometimes though you need to debug something in the imperative guts of the service. This is where I've started using pudb. pudb is a drop-in replacement for pdb or ipdb, but has a much nicer curses-based user interface. You can invoke it in two ways:

  • From within the code: add a line that contains import pudb; pudb.set_trace() wherever you want the debugger to turn on.
  • From the command line. You can run the entire service inside the debugger from the command line: python -m pudb <script-to-run>

The latter approach is demonstrated below: [video not included]

It's hard to show in the video, but I'm navigating modules (press 'm'), setting breakpoints ('b'), and dropping into IPython (press '!').
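Going back to the first approach (a set_trace() line in the code): if you'd rather not add and remove that line all the time, one option is a guarded breakpoint. This is only a sketch under a few assumptions: it needs Python 3.7 or newer for the built-in breakpoint(), and both the SERVICE_DEBUG variable and the handle_message function are made up for illustration.

```python
import os


def debug_here():
    """Drop into the debugger only when SERVICE_DEBUG=1 (hypothetical variable).

    breakpoint() honours the PYTHONBREAKPOINT environment variable, so
    PYTHONBREAKPOINT=pudb.set_trace selects pudb instead of pdb.
    The debugger opens inside this helper; step out once to reach the caller.
    """
    if os.environ.get("SERVICE_DEBUG") == "1":
        breakpoint()


def handle_message(payload):
    """Stand-in for some code path in a service."""
    debug_here()
    return payload.upper()


# With SERVICE_DEBUG unset this runs straight through.
print(handle_message("image ready"))
```

Running the service with SERVICE_DEBUG=1 PYTHONBREAKPOINT=pudb.set_trace would then stop at that point, assuming pudb is installed in the virtualenv.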

Updating the pip-cache

You may notice that I don't have a locally checked out copy of our pip cache branch in my directory structure. That's because I have a handy-dandy script that updates the cache for me. Given a service name (like 'core-image-watcher') and the path to a requirements.txt file, it will ensure that all the packages are present in the pip cache branch. It's pretty rough around the edges, but it may be useful to you. You can get it from: lp:~thomir/+junk/update-pip-cache
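I won't reproduce the script here, but the core check ("which pinned packages are missing from the cache?") can be sketched roughly as follows. This is not the actual script: it assumes the cache is a flat directory of sdist or wheel files, only understands name==version lines, and all the function names are made up for illustration.

```python
import re
import tempfile
from pathlib import Path

SDIST_SUFFIXES = (".tar.gz", ".tar.bz2", ".zip")


def canonical(text):
    """Lower-case and treat '_' like '-' so names compare consistently."""
    return text.lower().replace("_", "-")


def pinned_requirements(requirements_text):
    """Yield (name, version) for each 'name==version' line; skip the rest."""
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()
        match = re.fullmatch(r"([A-Za-z0-9._-]+)==([A-Za-z0-9.]+)", line)
        if match:
            yield match.group(1), match.group(2)


def missing_from_cache(requirements_text, cache_dir):
    """Return the pinned requirements with no matching file in cache_dir."""
    cached = {canonical(p.name) for p in Path(cache_dir).iterdir() if p.is_file()}
    missing = []
    for name, version in pinned_requirements(requirements_text):
        stem = canonical(f"{name}-{version}")
        has_sdist = any(stem + suffix in cached for suffix in SDIST_SUFFIXES)
        has_wheel = any(f.startswith(stem + "-") and f.endswith(".whl") for f in cached)
        if not (has_sdist or has_wheel):
            missing.append((name, version))
    return missing


# Toy cache containing one package out of two requirements.
with tempfile.TemporaryDirectory() as cache:
    Path(cache, "requests-2.5.1.tar.gz").touch()
    requirements = "requests==2.5.1\npudb==2014.1  # debugger\n"
    missing = missing_from_cache(requirements, cache)

print(missing)
```

Anything that ends up in the missing list is what would still need to be downloaded and committed to the cache branch.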

Running the Services

Sooner or later you'll want to run everything together to make sure everything is working as expected. I simply run each service in a new terminal window. This allows me to switch from one terminal window to the next to monitor what each service is doing. In our current sprint the communication from one service to the next is fairly linear, so it's reasonably easy to follow.

Because we're using RabbitMQ, if anything goes wrong in one service, you can simply shut that service down, do some hacking, and fire it back up again.

Wrapping up

I've found the combination of all the above tips (plus a few I'm not sharing here) to be a significant help when writing Python services. Let me know if you have any similar tips.

