Book review: Working effectively with legacy code by Michael C. Feathers

legacy_codeWorking effectively with legacy code by Michael C. Feathers is one of the programmer’s classic texts. I’d seen it lying around the office at ScraperWiki but hadn’t picked it up since I didn’t think I was working with legacy code. I returned to read it having found it at the top of the list of recommended programming books from Stackoverflow at dev-books. Reading the description I learnt that it’s more a book about testing than about legacy code. Feathers defines legacy code simply as code without tests, he is of the Agile school of software development for whom tests are central.

With this in mind I thought it would be a useful read for me to improve my own code with the application of better tests and perhaps incidentally picking up some object-oriented style, in which I am currently lacking.

Following the theme of my previous blog post on women authors I note that there are two women authors in the 30 books on the dev-books list. It’s interesting that a number of books in the style of Working Effectively explicitly reference women as project managers, or testers in the text, i.e part of the team – I take this as a recognition that there exists a problem which needs to be addressed and this is pretty much the least you can do. However, beyond the family, friends and publishing team the acknowledgements mention one women in a lengthy list.

The book starts with a general overview of the techniques it will introduce, including the tools used to address them. These come down to testing frameworks and the refactoring tools found in many IDEs. The examples in the book are typically written in C++ or Java. I particularly liked the introduction of the ideas of the “seam”, a place where behaviour can be changed without editing the code and the “enabling point” – the place where a change can be made at that seam. A seam may be a class that can be replaced by another one, or a value altered. In desperate cases (in C) the preprocessor can be used to invoke test-time changes in the executed code.

There are then a set of chapters that answer questions that a legacy code-ridden developer might have such as:

  • I can’t get this class into a test harness
  • How do I know that I’m not breaking anything?
  • I need to make a change. What methods should I test?

This makes the book easy to navigate, if not a bit inelegant. It seems to me that the book addresses two problems in getting suitably sized pieces of code into a test harness. One of these is breaking the code into suitable sized pieces by, for example, extracting methods. The second is gaining independence of the pieces of code such that they can be tested without building a huge infrastructure up to support them.

Although I’ve not done any serious programming in Java or C++ I felt I generally understood the examples presented. My favoured language is Python, and the problems I tackle tend to be more amenable to a functional style of programming. Despite this I think many of the methods described are highly relevant – particularly those describing how to break down monster functions. The book is highly pragmatic, it accepts that the world is not full of applications in which beautiful structure diagrams are replicated by beautiful code.

There are differences between these compiled object-oriented languages and Python though. C#, Java, and C++ all have a collection of keywords (like public, private, protected, static and final) which control who can see what methods exist on a class and whether they can be over-ridden or replaced. These features present challenges for bringing legacy code under test. Python, on the other hand, has a “gentleman’s agreement” that method names starting with an underscore are private, but that’s it, and there are no mechanisms to prevent you using these “private” functions! Similarly, pretty much any method in Python can be over-ridden by monkey-patching. That’s to say if you don’t like a function in an imported library you can simply overwrite it with your own version after you’ve imported the library. This is not necessarily a good thing. A second difference is that Python comes with a unit testing framework and a mocking library rather than them being functionality which is third-party added. Although to be fair, the mocking library in Python was originally third party.

I’ve often felt I should programme in a more object-oriented style but this book has made me reconsider. It’s quite clear that spaghetti code can be written in an object oriented language as well as any other. And I suspect the data processing for which I am normally coding fits very well with a functional style of coding. The ideas of single responsibility functions, and testing still fit well with more functional programming styles.

Working effectively is readable and pragmatic. I suspect the developer’s dirty secret is that actually we wrote the legacy code that we’re now trying to fix.