March 16, 2005
Code Reviews
by peterbYou might think I don't proofread my articles carefully before I post them. Every time I post an article -- every single time -- about 5 minutes after it goes up, one of my friends will point out a spelling mistake, or a typo, or some sentence where I left out a verb. But I do proofread. And I always miss some of the errors. I think the problem, in large part, is that I've written and read the article so many times in my mind that it's difficult to read the article on the screen before me. I'll read the sentence with the missing verb, and my brain will helpfully supply it. "No problem!" my brain says. "That looks great!"
This is why I like having all the code I write go through code review.
Code review means many things to many people. Every software development house has its own process. Some of them review every change. Others only review during crunch time. Some have "group" reviews where a bunch of developers read code together. Others just ship context diffs around between developers. When I say "I like having all the code I write go through code review" what I really mean is "I like having another intelligent programmer look at the code with an eye towards finding the places where I screwed up." I've certainly sat through code reviews that were useless. But they don't have to be.
And make no mistake — there will be screwups. Different research presents different numbers. I've seen claims as high as an average of 1 bug for every 20 lines of (unreviewed) code. Carnegie Mellon claims the average commercial product (many of which presumably underwent some form of code review) has between 20 and 30 bugs per thousand lines of code, which sounds like a low estimate to me. Basically, if you think that your code doesn't have at least this many bugs, it's a surefire indication that you're either not writing enough code, or not adequately testing your code.
With this in mind, there is more than one good reason to do code reviews. First I want to talk about those reasons (some are obvious, some aren't), and then I want to talk about the mechanics of what I think makes a code review effective or useless.
Reasons To Do Code Reviews
Catching the stupid mistakes. I discussed this above. Every programmer has blind spots, or is fatigued from staring at the same file for hours at a time. Problems that sneak past the tired developer often jump out in sharp relief to someone with a fresh pair of eyes.
Converging on Consistent Style. I think this is a huge reason to do code reviews. Typically, what happens in a development group of any size is this: early on in the codebase's development, someone writes up a document specifying the stylistic conventions to be used when writing code. This can go from the highest level ("Everything is in C, no GCC-specific extensions allowed") to the lowest ("If you write conditionals without putting braces around the block that gets executed, I'm going to come to your office and yell at you.") Then, for the next few years, everyone completely ignores the style guide; the only person who even remembers it exists is the person who wrote it. New developers are hired, and they begin writing code that is completely at odds with the style guidelines, and from then on, whenever any developer has to look at anyone else's code, she (or he) has to spend an extra few minutes acclimating to the subtle differences. Not because those difference are significant, but just because the code looks different. That makes the code less maintainable. Code reviews give developers a perfect opportunity to say to each other "Hey, the way you use tabs instead of spaces in these files? I hate that. It makes me want to die." This sort of gentle (yet loving) peer pressure is much more effective at achieving a consistent style across a large codebase than a document checked into the depths of CVS somewhere.
Find the Guru (or, alternatively) Find the Bozo. Developing software is an activity that takes long periods of intense concentration. The cost of an interruption is severe — a 5 minute question in the middle of the day can cost a half-hour for the interruptee to get back up to speed. If you're like me, that means there's a tendency to huddle in front of your monitor and try to not talk to people. If you don't talk to people, you can't get a good sense of their abilities, and having that sense will be important when you hit a tough problem and need to decide who you are going to interrupt. Code reviews provide a structured way (you can plan for the interruptions in advance) for you to talk to your coworkers about their code, or simply to see how their minds work. Along a similar vein, if there's someone who consistently writes lousy code, and never seems to improve no matter how often their problem patterns are discussed, that will come out fairly quickly as well.
Knowledge transfer. I've heard the following sentiment many times, in many places: "Oh, man, if [insert name here] ever leaves the company, we're so hosed. He's the only person who understands [heinously complex software component]." Sometimes people will say that because the employee in question is, say, Albert Einstein, and the component in question is the Gravity-Powered Relativity Time Machine. However, the common case is that the employee is just really, really smart, and is the only person who has taken the time to understand the component. If you're doing regular code reviews, then you have a natural chance to expose other people to the heinously complex (and not so complex) parts of your software before the crisis where the guru gets hit by a truck and no one else knows how the software works.
How To Do Code Reviews
The only real rule is "do what works for you and your organization." But I'll share my prejudices with you anyway.
Review everything, all the time. If reviews are only done during "special" times, you're just setting up another bit of process that will hit everyone at the most stressful time. Reviews should be soothing, because they are helping you insure that the bugs get caught in-house, instead of by the customers. Making them a regular part of your process will pay off in spades.
Review incrementally. There's a tendency, particularly when just getting the review process off the ground, to want to grab Alessio, pull him into a room, and ask him to go through 8,500 lines of code all at once. This will lead to utter failure. You'll get some good comments for about the first 100 lines of code, and then you'll get 8,400 lines of "Yeah, that looks good. Let's go on to the next section." Every checkin should be reviewed. If the developer is working in a branch, do the reviews in that branch as the developer checks in his manageable pieces. Don't wait for it to hit the trunk in one huge indigestible mass. There is a time and a place for the huge review of everything, but that's during the design phase. Mixing design review and comments about off-by-one errors is begging for failure.
Small review panels are better. There's a saying in the open source world: "Many eyes make all bugs shallow." This is bunk. Many eyes make only shallow bugs shallow. Having a large review group just lets the phenomenon known as diffusion of responsibility kick in. Plus, the collective cost to the organization of reviews goes up exponentially. One or two reviewers is fine. More than that is overkill. A good rule of thumb is: if you need a projector to do the code review, you're doomed. You can do your reviews in a meeting, or you can do them by email. Whatever works for you.
Track the review status formally. Your bug tracking system should tell you whether a given change has been reviewed or not. If you rely on developers' memories, the tendency will be for people to say "Uh, yeah, I think I looked at that. It was OK."
So: review early, review everything, review incrementally, and review in small groups. Do this, and your code will be better, more consistent, more maintainable, and your bug count will go down.
Now, if only I could develop a process for getting my articles reviewed before I post them.
Additional Resources
Darley J M & Latane B., Bystander intervention in emergencies: diffusion of responsibility. J. Personal. Soc. Psychol. 8:377-83, 1968. [New York Univ. and Columbia Univ., New York, NY]Posted by peterb at March 16, 2005 08:56 PM | Bookmark This
I believe that's a misapplication of the diffusion of responsibility phenomenon. With a typical DoR you have a large group of witnesses or bystanders with no vested interest in a particular situation who are implicitly expected to act on that situation, often with no personal gain for themselves.
Contrast this with a typical open source situation -- contributors have many, many reasons to invest effort and time in finding bugs, providing optimizations, porting to new targets, etc. Sometimes it's economic -- I need MySQL to run on my old VAX/8600!!! -- and many times it's just flat out ego -- I was the first person to port Linux to my toaster!!!!
That's very different than driving by a crime in action and not doing anything about it. It comes down to return on investment of effort.
That's fair. I think the point I was really trying to get at didn't quite come across. I wasn't trying to slam open source as much as I was trying to say that a code review with 15 people in a room is less effective than a code review with 3 people in a room, because with 15 people it's easier for each individual reviewer to simply not contribute (and, if they all have jobs, they've even more incentive to not talk because they want to get out of the room in less than 4 hours, which won't happen if they are all actively participating).
Which brings up another point I forgot to mention: if you're doing code reviews in person, set a time limit. Personally, I can't review code in a meeting context for more than about an hour without wanting a nap. (For some reason, if I'm reviewing someone else's code "on my own" for commenting via email, I can concentrate for longer periods of time).
Posted by peterb at March 17, 2005 12:06 AMThis reminded me a bit of an excellent Den Beste article. Thought you might be interested:
It's OK to be wrong (http://denbeste.nu/cd_log_entries/2002/10/ItsOKtobewrong.shtml)
Posted by Strider at March 17, 2005 11:04 AMPlease help support Tea Leaves by visiting our sponsors.