A software organization is only as good as its ability to ship quality code on time.

Managing Virtual and Distributed Development Groups in an Outsourced World – Part 1 of 5

Agile software processes are difficult to execute well when your team is local, and vastly more so when it is not. However, with a few exceptions (namely managing the risk of a wholesale team change), the goals of managing an offshore team are the same as those of managing a local one: delivering quality code on time, while ensuring intellectual property security and continuity, and providing for the growth and improvement of your team. It is the execution of these goals that is necessarily different.

As long as outsourcing managers try to emulate a close-contact approach, their results will suffer. You cannot manage by walking around, and should not try to. Instead, embrace the concept of management from a distance, without relying on physical bridges. Rather than a manager flying to India once a month to make sure they “have the big picture,” tools and techniques should be employed to generate the big picture without direct contact. In other words, we must construct “virtual bridges.”

There are four major questions that are key in any development organization, but which are particularly difficult to answer in an outsourced environment:

  • How can I know the release will really ship on time?
  • How can I manage change?
  • How can I evaluate my team’s performance?
  • How can I ensure IP continuity, when no one in my physical work center actually works on the product I manage?

Managers who can answer these questions accurately and consistently can write their own ticket in the evolving software development world. Parts 1 through 4 of this blog examine each of these questions in detail, and Part 5 concludes with obtaining key metrics and best practices for managing virtual/distributed teams.

Part 1 – How Can I Know the Release Will Really Ship on Time?

Ultimately, a release shipping on time is the acid test for a software team – because a software organization is only as good as its ability to ship quality code on time. You can have the best developers, the best hardware, and the best processes – but if you consistently miss your ship dates, you are simply not a good software organization. It doesn’t matter how strong your architecture is, how smart your developers are, or how advanced your technology is. If you’re 18 months late with the latest killer app, no one will care. Likewise, if your team is consistently four to six weeks late with every release, you’ll be looking for a new job before long. At least, you would be if you worked for me.

The 80% Rule

So how can you tell? The obvious answer is to ask the developers – but a corollary to the old 80% rule (“80% of the problems come from 20% of the code”) is that 80% of the code is always 80% done. Ask most software developers, “How’s it coming?” and the majority of the time you’ll hear something along the lines of “nearly done,” or “looking good – a few things left to do.” Obviously, you’ll get the occasional “I’m concerned – it’s taking much longer than I’d expected” (the 20%), but most of the time, you’ll get the expected answer. The point is not that software developers deliberately deceive, but that software development is difficult and full of unknowns. The more quantitative, unbiased data you can accumulate, the better.

Milestones

Consider a manager with a distributed team, starting a large new software project. The project will take a year, and the team is half a world away, so clearly some sort of accountability mechanism must be put in place. A common mechanism is the milestone, so let’s start there. Let’s say there are six two-month milestones (M1-M6) in place. The project begins, and as many do, encounters some growing pains.  Nevertheless, no one wants to start off a project by missing their first milestone, so the team pulls together and works long hours to hit the first set of deliverables, even while still tweaking the infrastructure.

The next milestone is more complex and reflects that the team should be in full swing. Though they hit their first milestone, things are still pretty messy. In fact, M2 is nearly missed when it’s discovered that some of the work in M1 was actually defective, and two weeks are lost correcting the problems. But M2 is met – largely through the heroics of a few key people.

Local management decides that they have to get the problem under control, so they crack down. No vacations, and everyone must work Saturdays until we “catch up” or “get ahead of the problem.” The problem is that people under that kind of pressure actually get less work done than those who work regular hours.1  The developers are tired and distracted, and so introduce more defects.

M3 is a death march for the entire team, but the overtime and extra work are successful, and M3 is met. During M4, the wheels really come off of the cart. It’s been six months of overtime and pressure, cancelled vacations, and six- to seven-day work weeks. At this point, some key employees quit.  Specifically, the ones most confident of their ability to get a new job with a more humane atmosphere – read: the best developers. With fewer people and some key personnel missing, the remaining team members are under enormous pressure.

Near the end, it’s realized that many of the features that were thought to have been completed are actually bug-ridden and will need to be re-implemented. By working frantically, just enough features are completed in time to make the M4 goal – except that new errors are being found every day. Those are put into the backlog to be fixed later, masking the problem.2 Upon “successful” completion of M4, the development team takes a much-needed long weekend. They return to a laundry list of bugs that the testing team has discovered in their absence, which means that nearly one-third of the work will need to be re-implemented. Needless to say, M5 is missed and the project ultimately fails.

Milestones give periodic snapshots of the project, and have three major problems:

1) They tend to spawn “milestone-itis,” where hitting the milestone is seen as an end in itself,
2) A large portion of the available time may be spent preparing (administratively and otherwise) for the milestone meeting, and
3) There’s little visibility into what’s going on the rest of the time.

Task-Based Management

Another way to handle this is the “task-based” approach, where instead of large, discrete milestones, a progress chart is generated with a daily target of completed items. If the project above has 500 subtasks, one might expect to complete an average of two tasks per work day over the year. Not all tasks are equivalent, but if you pass month three and you’ve only completed 50 tasks, it’s probably time to start asking some hard questions.
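To make the arithmetic concrete, here is a minimal sketch of that daily target in Python. The figures of 250 working days per year and 21 working days per month are my assumptions (the text above gives only the 500 subtasks and the two-per-day average), and the function names are purely illustrative.

```python
def expected_tasks(total_tasks: int, total_workdays: int, workdays_elapsed: int) -> float:
    """Linear target: how many tasks should be done after `workdays_elapsed` days."""
    return total_tasks * workdays_elapsed / total_workdays


def completion_ratio(completed: int, total_tasks: int,
                     total_workdays: int, workdays_elapsed: int) -> float:
    """Completed tasks as a fraction of the linear target (1.0 means on pace)."""
    return completed / expected_tasks(total_tasks, total_workdays, workdays_elapsed)


TOTAL_TASKS = 500         # from the example project
TOTAL_WORKDAYS = 250      # assumption: roughly 250 working days in a year
WORKDAYS_PER_MONTH = 21   # assumption: roughly 21 working days in a month

elapsed = 3 * WORKDAYS_PER_MONTH                                   # end of month three
target = expected_tasks(TOTAL_TASKS, TOTAL_WORKDAYS, elapsed)      # 126.0
ratio = completion_ratio(50, TOTAL_TASKS, TOTAL_WORKDAYS, elapsed)

print(f"Target after {elapsed} work days: {target:.0f} tasks")
print(f"Completed 50 tasks: {ratio:.0%} of target")                # about 40%
```

Under these assumptions, 50 completed tasks at the end of month three is well under half of the target, which is exactly the kind of gap that should prompt the hard questions.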

The advantages of this method are obvious. First, there’s no “milestone” to prepare for, so there’s less time spent in preparation/administration. Second, there’s more consistent visibility into the pace of the work. Local management can’t bury a schedule slip under a blizzard of overtime and bonuses – it can be noticed immediately. However, if you look at the scenario above, the schedule was kept, with a few minor blips, right up until the moment M4 was delivered. That is, at M4, the correct number of tasks had been “delivered.” The observant manager might have recognized the bumps along the road for what they were – real problems being dealt with by overwhelming manpower – but she might as easily have attributed them to the standard ups and downs of the development process. It was not until things totally fell apart after the end of M4, in September, that the scope of the problem became apparent.

Metrics-Based Management

What were the warning signs? Really, there were two. First, people were working longer hours, both during the week and on weekends. Eventually, vacations were cancelled, weekend work was mandated, and developers started leaving in frustration. Second, as people were overworked, a greater and greater percentage of their time was dedicated to fixing bugs. Had these two metrics been available, it would have been apparent before the end of M1 – in January/February – that there was a problem with workload.3 Around the end of April, management would have noticed that the time spent on bug fixes had spiked. In July, they could have observed a number of senior people stopping work (as they quit) and a number of new names showing up on the development team. In August, they would have seen the bug fix numbers spike again. And all along, since January, the average number of hours worked by the developers had hovered in the range of 60 to 70 hours per week.

[Fig. 1: Task progress vs. developer hours chart]

This project was in trouble back in January. Managing to milestones, the trouble wasn’t discovered until October – far too late to attempt to fix things. With “task-based” metrics, it was clear at the end of August that things were in trouble, allowing four months to try to correct course. But by monitoring more subtle cues – specifically defect rates, workloads, and changes to those metrics over time – it would have been clear 11 months before the end of the project that it was in serious trouble, and clearly not improving.
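As a rough illustration of how such cues could be checked automatically, here is a small Python sketch. Everything in it is an assumption made for the example: the `WeekMetrics` record, the 50-hour threshold, the three-week window, the 1.5x spike factor, and the sample numbers are hypothetical and are not taken from the project described above.

```python
from dataclasses import dataclass


@dataclass
class WeekMetrics:
    week: int
    avg_hours: float      # average hours worked per developer that week
    bugfix_share: float   # fraction of logged time spent fixing defects (0 to 1)
    tasks_done: int       # cumulative tasks completed
    tasks_planned: int    # cumulative tasks planned


def warning_flags(weeks, max_hours=50.0, sustained=3, spike=1.5):
    """Return (week, reason) pairs for sustained overtime and bug-fix spikes."""
    flags = []
    overtime_run = 0
    for i, w in enumerate(weeks):
        # Count consecutive weeks above the (hypothetical) overtime threshold.
        overtime_run = overtime_run + 1 if w.avg_hours > max_hours else 0
        if overtime_run == sustained:
            flags.append((w.week, "sustained overtime"))
        # Compare this week's bug-fix share with the average of all earlier weeks.
        if i > 0:
            baseline = sum(p.bugfix_share for p in weeks[:i]) / i
            if baseline > 0 and w.bugfix_share > spike * baseline:
                flags.append((w.week, "bug-fix spike"))
    return flags


# Made-up sample data: task counts stay on plan while the team works 65-hour weeks.
history = [
    WeekMetrics(1, 65, 0.10, 10, 10),
    WeekMetrics(2, 65, 0.10, 20, 20),
    WeekMetrics(3, 65, 0.10, 30, 30),
    WeekMetrics(4, 65, 0.10, 40, 40),
    WeekMetrics(5, 65, 0.25, 50, 50),
    WeekMetrics(6, 65, 0.12, 60, 60),
]

on_schedule = all(w.tasks_done >= w.tasks_planned for w in history)
flags = warning_flags(history)

print("On schedule by task count:", on_schedule)
for week, reason in flags:
    print(f"Week {week}: {reason}")
```

In this made-up data the task count never slips, yet the overtime flag fires in week 3 and the bug-fix flag in week 5. The thresholds themselves matter less than the habit of watching these numbers alongside task completion.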

What Works?

The lesson learned here is that milestones give an incomplete picture and encourage local management to manage to the milestone, rather than the project. Task-based management gives a better view, but can only tell you when things have gone wrong. A better approach is to manage to metrics – what’s going on with the team, as well as what’s going on with the code base.

Another way of thinking about this is from a doctor’s point of view. Running a project by milestones is like someone telling you when the patient’s heart has stopped. You call the crash cart, use the paddles, and pray they can be saved.  If you’ve missed a project milestone, you’re already in extremis, and it’s unlikely the project will be a success. Likewise, using task completion rates is like tracking one vital sign – say, the pulse. Useful in many ways, but ineffective if the patient is suffering from cancer.

As a remote manager, you want the fullest array of vital signs that you can make available for yourself. Some metrics that are of particular importance are:

  • Who’s working on the project?
  • How hard are they working?
  • What are they spending their time on?
  • Are certain areas of the code causing undue problems?
  • How are their estimates holding up?

And, perhaps most important,

  • How are these metrics changing over time?
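For the last two questions, a minimal Python sketch of how one might quantify them follows. The helper names and sample values are hypothetical; the trend is an ordinary least-squares slope over evenly spaced weekly readings, and the estimate check is simply total actual hours divided by total estimated hours.

```python
def trend_per_week(values):
    """Least-squares slope of a metric sampled once per week (units per week)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two weekly readings")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def estimate_ratio(estimated_hours, actual_hours):
    """Total actual effort divided by total estimated effort (1.0 means spot on)."""
    return sum(actual_hours) / sum(estimated_hours)


# Hypothetical readings: average weekly hours creeping up over four weeks.
weekly_hours = [45, 50, 55, 60]
print("Hours trend:", trend_per_week(weekly_hours), "hours/week per week")

# Hypothetical tasks: estimated vs. actual hours for three completed tasks.
print("Actual/estimate ratio:", estimate_ratio([8, 16, 24], [12, 20, 40]))
```

A steadily positive slope in hours worked, or an actual-to-estimate ratio that keeps climbing, says more than any single week's reading.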

Next Up

We continue the conversation about managing virtual and distributed teams in Part 2 – How Can I Manage Change?


 1 Steve Maguire, Debugging the Development Process, Microsoft Press, pp. 151ff
 2 I once worked on a project where some significant piece of functionality wasn’t done by the “feature freeze” deadline. It wasn’t even started. Rather than admitting to having missed the deadline, a developer was instructed to add the widget that would drive the functionality to the screen, and just leave it unimplemented. So then it could be classified as “complete,” and when QA pointed out that it didn’t work, they’d just call it a bug, and the developer could spend a week during the hardening period to actually implement the feature. Swear to God.
 3 The progress chart (Fig. 1) shows the relationship among task progress, defect fix time, and average hours spent on a project in trouble. In this example it’s obvious at a glance that this project is in serious trouble early on – even though the task completion rate remains well over 100%.