
Managing Virtual and Distributed Development Groups in an Outsourced World – Part 3 of 5

In Part 2 of “Who Moved My Team,” we discussed how to manage change. Today we look at how to evaluate a team’s performance.

Part 3 – How Can I Evaluate My Team’s Performance?

I have yet to meet a development manager who would admit that they didn’t actually know who their best developers were. But when asked, “What makes you think she’s your best developer?” most managers will give some pat answer such as, “I’ve been doing this for 20 years – it’s my job to know.”

In fact, we’ve seen again and again that when a team is evaluated, it’s often the “best” developers who are causing some of the most serious problems. That’s not to say that they aren’t “good,” or even the most important developers on the team. They frequently are. But often, it’s the most prolific developer, the one who’s been with the company since the beginning, who knows all the ins and outs of the code and who personally wrote many of the core routines, who is the one that breaks the build most often, or introduces schema changes which cause problems for other developers, or has surprisingly high defect rates. It’s not always clear why this is so, but we have observed that such key developers are frequently given the most difficult tasks, rarely turn to others for help or review, and are usually overloaded. Combine that with the fact that the environment in which they made their first and greatest contributions was one of frantic activity and heroic effort by a few developers, and it may be that they just have difficulty making the transition from a skunk-works mentality to that of a mature, disciplined development shop.

Why Bother?

So, do we really care who the best developers are? Isn’t it a team effort? There are many reasons to try to evaluate individuals, which go beyond the simple, “whom do I keep, and whom do I fire?” questions. The goal is not so much to force-rank your team, but to determine the strengths and weaknesses of individuals so that they can be better managed.

If you can clearly identify these strengths and weaknesses, it becomes much easier to coach these individuals to higher levels of performance. For instance, a developer who consistently misses deadlines may turn into the company’s most valuable asset if just given a private office space with a door, because he is brilliant but easily distracted (see [1] for a discussion of the impact of walled offices on developer productivity). Another may benefit enormously from outside training in a particular subject. Finally, knowing your team’s individual strengths helps you meet their needs better, and thus keep your most valuable employees on hand.

Traits of a Good Developer

What makes a good developer? Generally, strong developers are characterized by the same thing that characterizes a strong development team: the ability to deliver quality code on time. There is also the “technical prowess” aspect, but in my opinion, finding the Alpha Geek is not really the same thing as finding the best developer.

So how can you tell a good developer from a bad one? Part of it is, in fact, the walking-around thing; the human touch – the manager’s investment of personal time in her team – should not be overlooked. But if you’re a world away, what else do you have? You need more quantitative data – and there are several metrics which are particularly revealing:

Estimation Accuracy

Not to beat a dead horse, but it’s all about delivering quality code on time. Developers are notoriously inaccurate when it comes to estimating times, and there is absolutely no substitute for developers estimating all tasks, and then comparing actuals to estimates. And by all tasks, I mean all coding tasks. At one company, we were tracking estimation accuracy. We were pleasantly surprised to discover that the median task was delivered in 80% of its estimated time. Unfortunately, that didn’t explain our problems, so we did a little more digging and discovered something interesting: the average time estimate was just over one hour. What was happening was that people would estimate simple, straightforward (and quick) bug fixes, but when faced with the hairier and more time-consuming bugs and features, they just wouldn’t bother with an estimate. (“But I don’t know how long it’s going to take me – I haven’t even started it yet!” is a reasonable response, but not very helpful when trying to scope work.) When we started mandating estimates on every task, the results quickly began to look much uglier. The bottom line is, if you’re not estimating every task, you’re not getting valid data.
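To make this concrete, here’s a minimal sketch (in Python, with made-up task records – the field names are hypothetical, not from any particular tracking tool) of how you might compare estimates to actuals and catch the “only the quick tasks get estimated” problem:

```python
from statistics import median

# Hypothetical task records exported from an issue tracker (field names are made up).
# Estimates and actuals are in hours; tasks that were never estimated have estimate=None.
tasks = [
    {"id": 101, "estimate": 1.0,  "actual": 0.8},
    {"id": 102, "estimate": 0.5,  "actual": 0.4},
    {"id": 103, "estimate": None, "actual": 16.0},   # the hairy bug nobody estimated
    {"id": 104, "estimate": 2.0,  "actual": 5.5},
]

estimated = [t for t in tasks if t["estimate"] is not None]

# Ratio of actual to estimated time; 0.8 means "delivered in 80% of the estimate".
ratios = [t["actual"] / t["estimate"] for t in estimated]
print("Median actual/estimate ratio:", round(median(ratios), 2))
print("Average estimate (hours):", round(sum(t["estimate"] for t in estimated) / len(estimated), 2))

# The tell-tale number: how much real work is slipping past estimation entirely?
unestimated_hours = sum(t["actual"] for t in tasks if t["estimate"] is None)
total_hours = sum(t["actual"] for t in tasks)
print("Share of hours never estimated:", round(unestimated_hours / total_hours, 2))
```

The median ratio can look flattering while most of the real hours hide in the unestimated bucket – which is exactly the trap we fell into.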

As an aside, time can be measured in different units, e.g. elapsed, active, on-task, etc. (see our separate post on measuring time). In the end, it really doesn’t matter which units developers use to give estimates, but the developers themselves must recognize which time frame they are operating in and be consistent about it. And you, of course, must know how to interpret their estimates.

Defect Rates

The other part of the “ability to deliver quality code on time” mantra is delivering quality code. Measuring defect rates is difficult, but important. With a locally managed team, managers can often get a sense of this by asking developers (all developers, not just the “good” ones) where they think defects are coming from. Some will cover for their friends and some will pin the blame on someone they don’t like, but if you consistently ask this question of everyone, a picture will eventually form.

A more direct way to get this information is through the dreaded “bug bowl” (aka “test fest,” “bug-o-rama,” etc.). I say dreaded, because I don’t like them for many reasons, which I won’t go into here. However, they can serve a purpose, especially if done with a specific goal. One approach is to grant bonuses to developers based on the number of bugs they find, but subtract money from their bonus pool for each bug of theirs that someone finds. There are some key caveats and rules to be followed here, but it’s possible to quickly generate an awareness of who tends to introduce bugs. This can be done with your outsourced team as easily as with a local group.
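As a rough illustration of the incentive arithmetic (the dollar amounts and the helper function below are hypothetical, purely for illustration, not a recommended pay scale), the scoring might look something like this:

```python
def bug_bowl_bonus(bugs_found, bugs_attributed, find_reward=50, attribution_penalty=75, floor=0):
    """Net bug-bowl bonus for one developer (all figures hypothetical).

    bugs_found      - bugs this developer found in other people's code
    bugs_attributed - bugs someone else found that were traced back to this developer
    The penalty is larger than the reward so that introducing bugs is never a
    break-even proposition; the floor keeps the bonus from going negative.
    """
    return max(floor, bugs_found * find_reward - bugs_attributed * attribution_penalty)

print(bug_bowl_bonus(bugs_found=12, bugs_attributed=3))  # 375
print(bug_bowl_bonus(bugs_found=2, bugs_attributed=8))   # 0 (floored)
```

The asymmetric penalty is deliberate: finding bugs should pay, but introducing them should never come out even.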


Maintainable Code

Lastly, code must be maintainable by others. When a developer boasts that “no one understands this module but me,” it’s actually a pretty severe self-indictment. A developer who writes code that the other developers (of comparable skill and experience) can’t easily extend, fix, or otherwise rework is either unskilled, undisciplined, or trying to build job security. No one stays with a company forever, and managers must plan for turnover. In his book Extreme Programming Explained, Kent Beck describes the importance of cross-pollinating developers, so that no one “owns” one section of code. [2] This is important for the reasons he outlines, but from my point of view the most important one is to ensure that code is maintainable by others.

One common measure people apply to maintainable code is the comment ratio. This can be interesting, but comments for their own sake are rarely useful. There are more revealing approaches, however. First, you can solicit developers’ feedback (directly or indirectly) on the toughest modules to work in, just as when hunting for defect rates. Or you can actually try to measure it. In this case, you’re not so interested in who wrote the code that had to be fixed, but in who wrote the code that the bug was written on top of. Another way to frame it is, “What is the average defect-introduction rate by module?” For instance, if your team is already tracking defects as they are fixed, and determining who wrote the bugs, you can take it one step further and start tracking who wrote the code that the buggy code was leveraging. If Bob writes buggy code in module A, then his defect rate goes up. But if the average developer is twice as likely to introduce a bug in module A as in any other module, then the author of module A might actually be a larger part of the problem than Bob.
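Here’s a minimal sketch of that per-module attribution, again in Python with made-up data (the record format and names are hypothetical): each fixed defect records who introduced it and which module it landed in, and we tally defects three ways – per developer, per module, and charged back to the module’s original author.

```python
from collections import Counter

# Hypothetical defect log: who introduced each bug, and in which module it landed.
defects = [
    {"introduced_by": "bob",   "module": "A"},
    {"introduced_by": "bob",   "module": "A"},
    {"introduced_by": "carol", "module": "A"},
    {"introduced_by": "dave",  "module": "B"},
    {"introduced_by": "carol", "module": "C"},
]

# Who originally wrote each module (made-up names).
module_authors = {"A": "alice", "B": "bob", "C": "carol"}

per_developer = Counter(d["introduced_by"] for d in defects)
per_module = Counter(d["module"] for d in defects)

# Charge each defect to the module's original author as well: if module A
# attracts far more bugs than the others, its author may be a bigger part
# of the problem than the people tripping over it.
per_module_author = Counter(module_authors[d["module"]] for d in defects)

print("Defects introduced, per developer:", dict(per_developer))
print("Defects attracted, per module:", dict(per_module))
print("Defects charged to module authors:", dict(per_module_author))
```

If module A attracts a disproportionate share of everyone’s bugs, the interesting number isn’t Bob’s personal count but the one charged to module A’s author.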

What Works?

It would be great to keep track of which developers are writing code that others are having trouble extending, and how many defects per 1000 lines of code different developers are introducing, based on the technologies they’re implementing, and all of that. But how do you do this with a remote team? For that matter, how do you do this with your local team?

It’s true that this level of detail is difficult to track without specialized tools, but valuable information can be obtained just by tracking simpler values: estimates versus actuals, task completion rates, and bug recidivism, to name three. If the monitoring of these relatively straightforward metrics is coupled with regular polling of the development team about problem areas of the code (“if we could stop for four weeks, which sections of the code would you want to see rewritten?…”), or for that matter, problem developers, some real conclusions can begin to be drawn about both your team and your individual contributors.

Next Up

We continue the conversation about managing virtual and distributed teams in Part 4 – How Can I Ensure IP Continuity When No One Else Actually Works on the Product I Manage?

[1] Joel Spolsky, The Joel Test: 12 Steps to Better Code, JoelOnSoftware.com

[2] Kent Beck, Extreme Programming Explained, Addison-Wesley, Boston