22.2 Overall approach
Finding your bug is a process of confirming the many things that you believe are true — until you find one which is not true.
—Norm Matloff
Finding the root cause of a problem is always challenging. Most bugs are subtle and hard to find because if they were obvious, you would’ve avoided them in the first place. A good strategy helps. Below I outline a four-step process that I have found useful:
Google!
Whenever you see an error message, start by googling it. If you’re lucky, you’ll discover that it’s a common error with a known solution. When googling, improve your chances of a good match by removing any variable names or values that are specific to your problem.
You can automate this process with the errorist (Balamuta 2018a) and searcher (Balamuta 2018b) packages. See their websites for more details.
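If you would rather see what the manual version of this step involves, here is a minimal base R sketch. The helpers `generalise_message()` and `search_url()` are hypothetical names invented for this illustration (they are not part of errorist or searcher), the regular expressions are deliberately crude, and prefixing the query with "r" is just one assumption about what makes a good search.

```r
# Hypothetical helper: strip details that are specific to your problem.
generalise_message <- function(msg) {
  # Drop anything in single or double quotes (typically object, column,
  # or file names).
  msg <- gsub("'[^']*'|\"[^\"]*\"", " ", msg, perl = TRUE)
  # Drop standalone numbers such as indices or lengths.
  msg <- gsub("\\b[0-9]+(\\.[0-9]+)?\\b", " ", msg, perl = TRUE)
  # Collapse the leftover whitespace.
  trimws(gsub("\\s+", " ", msg, perl = TRUE))
}

# Hypothetical helper: turn the generalised message into a search URL.
search_url <- function(msg) {
  query <- paste("r", generalise_message(msg))
  paste0(
    "https://www.google.com/search?q=",
    utils::URLencode(query, reserved = TRUE)
  )
}

# Capture the message from a failing call; `my_data_2024` is a made-up
# object name that does not exist.
msg <- tryCatch(
  get("my_data_2024"),
  error = function(e) conditionMessage(e)
)
url <- search_url(msg)

# In an interactive session you could then open the search with:
# utils::browseURL(url)
```

Note that R translates many error messages, so in a non-English locale the captured message may find fewer matches than its English equivalent.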
Make it repeatable
To find the root cause of an error, you’re going to need to execute the code many times as you consider and reject hypotheses. To make that iteration as quick as possible, it’s worth some upfront investment to make the problem both easy and fast to reproduce.
Start by creating a reproducible example (Section 1.7). Next, make the example minimal by removing code and simplifying data. As you do this, you may discover inputs that don’t trigger the error. Make note of them: they will be helpful when diagnosing the root cause.
If you’re using automated testing, this is also a good time to create an automated test case. If your existing test coverage is low, take the opportunity to add some nearby tests to ensure that existing good behaviour is preserved. This reduces the chances of creating a new bug.
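As a rough illustration of this step, the sketch below shrinks a problem down to a tiny function and a handful of inputs, then records which inputs trigger the error and which do not. Everything here is hypothetical: `first_vs_median()` stands in for your own code, and `fails()` is a small helper written for this example.

```r
# Hypothetical function under investigation: is the first value above
# the median?
first_vs_median <- function(x) {
  mid <- median(x)
  if (x[1] > mid) "high" else "low"
}

# Hypothetical helper: TRUE if evaluating `expr` throws an error.
fails <- function(expr) {
  inherits(try(expr, silent = TRUE), "try-error")
}

# Minimal inputs, including some that do NOT trigger the error.
inputs <- list(
  no_na    = c(1, 2, 3),
  na_last  = c(1, 2, NA),
  na_first = c(NA, 1, 2),
  single   = 5
)

# Record which inputs reproduce the problem.
outcome <- vapply(
  inputs,
  function(x) fails(first_vs_median(x)),
  logical(1)
)
print(outcome)

# A candidate automated test. It is expected to fail until the bug is
# fixed, so it is defined here but not run.
test_handles_na <- function() {
  stopifnot(!fails(first_vs_median(c(1, 2, NA))))
}
```

Here only the inputs containing `NA` fail, which already narrows the search considerably. In a package you would more likely write the test with a testing framework such as testthat; base R is used here only to keep the sketch self-contained.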
Figure out where it is
If you’re lucky, one of the tools in the following section will help you to quickly identify the line of code that’s causing the bug. Usually, however, you’ll have to think a bit more about the problem. It’s a great idea to adopt the scientific method. Generate hypotheses, design experiments to test them, and record your results. This may seem like a lot of work, but a systematic approach will end up saving you time. I often waste a lot of time relying on my intuition to solve a bug (“oh, it must be an off-by-one error, so I’ll just subtract 1 here”), when I would have been better off taking a systematic approach.
If this fails, you might need to ask for help from someone else. If you’ve followed the previous step, you’ll have a small example that’s easy to share with others. That makes it much easier for other people to look at the problem, and makes them more likely to help you find a solution.
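One way to make the hypothesis-driven approach described above concrete is to write each hypothesis down next to the experiment that tests it. The following is only a sketch under assumed conditions: `mean_positive()` is a hypothetical buggy function, and the data frame is just one possible way to record results.

```r
# Hypothetical buggy function: intended to return the mean of the
# positive values in `x`.
mean_positive <- function(x) {
  mean(x[x > 0])
}

x <- c(3, -1, NA, 5)
observed <- mean_positive(x)  # NA, although 4 was expected

# Each row pairs a hypothesis with the result of a small experiment.
experiments <- data.frame(
  hypothesis = c(
    "the input contains NA",
    "the comparison x > 0 yields NA",
    "subsetting with an NA index keeps an NA element",
    "mean() miscomputes the two positive values"
  ),
  supported = c(
    anyNA(x),
    anyNA(x > 0),
    anyNA(x[x > 0]),
    !isTRUE(all.equal(mean(c(3, 5)), 4))
  )
)
print(experiments)
```

In this made-up case the first three hypotheses are supported and the last is rejected, which points at the subsetting step rather than at `mean()`. The record of what you tried is also useful material to include if you end up asking someone else for help.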
Fix it and test it
Once you’ve found the bug, you need to figure out how to fix it and to check that the fix actually worked. Again, it’s very useful to have automated tests in place. Not only does this help to ensure that you’ve actually fixed the bug, it also helps to ensure you haven’t introduced any new bugs in the process. In the absence of automated tests, make sure to carefully record the correct output, and check against the inputs that previously failed.
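If you have no testing framework in place, a small table of recorded inputs and expected outputs can play the same role. The sketch below assumes a hypothetical function, `mean_positive()`, that previously returned `NA` when its input contained a missing value and has since been fixed. The cases and expected values are invented for illustration.

```r
# Hypothetical function after the fix: missing values are excluded
# before the positive values are selected.
mean_positive <- function(x) {
  keep <- !is.na(x) & x > 0
  mean(x[keep])
}

# Recorded expectations: inputs that already worked, plus the input
# that previously failed.
cases <- list(
  list(input = c(1, 2, 3),      expected = 2),  # worked before the fix
  list(input = c(-2, 4, 6),     expected = 5),  # worked before the fix
  list(input = c(3, -1, NA, 5), expected = 4)   # failed before the fix
)

# Check every case; all.equal() tolerates floating point noise.
results <- vapply(
  cases,
  function(case) {
    isTRUE(all.equal(mean_positive(case$input), case$expected))
  },
  logical(1)
)

# Stop with an error if the fix is incomplete or broke an old case.
stopifnot(all(results))
```

Keeping the cases that already passed alongside the one that failed is what guards against the fix introducing a new bug.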
References
Balamuta, James. 2018a. Errorist: Automatically Search Errors or Warnings. https://github.com/coatless/errorist.
Balamuta, James. 2018b. Searcher: Query Search Interfaces. https://github.com/coatless/searcher.