Measuring and Improving Developer Productivity

Why is it important to have an unbiased and quantitative measure of developer productivity? 

Unbiased and quantitative measures of developer productivity are crucial for informed decision-making – engineering is an expensive and important function. Many decisions in engineering are driven by opinion rather than data. Having an unbiased and quantitative measure of developer productivity provides concrete evidence to guide decisions and reach better outcomes.

Measuring productivity can help you find the right structure to maximize productivity – there isn’t a one-size-fits-all approach, because individual skills and context matter. Every team is made up of individuals with a unique set of skills, and recognizing and leveraging those skills is crucial to the team’s success. Productivity measures can help you balance your team and determine its most effective way of working.

It supports informed tool, framework, and work modality selection – when deciding which programming languages or frameworks to adopt, a measure of productivity shows what works best for your teams. It likewise helps you identify which work modalities, such as Test-Driven Development (TDD) or Continuous Integration, are most beneficial.

To improve your approach, you need to measure it – without measurement, you can’t see the impact of the changes you make. Our algorithm allows you to quantify the value of different approaches and make decisions based on data rather than opinion.
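As a rough illustration of measuring the effect of a change, the sketch below compares a team's mean weekly output before and after a process change. The function name, the weekly scores, and the choice of a simple mean are all assumptions made for this example; they are not the tool's actual method, and a real comparison would also need to account for noise and team changes.

```python
from statistics import mean


def relative_change(before, after):
    """Return the fractional change in mean weekly output after a change."""
    if not before or not after:
        raise ValueError("both periods need at least one observation")
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline output is zero; relative change is undefined")
    return (mean(after) - baseline) / baseline


# Hypothetical weekly output scores for one team (made-up numbers).
before_change = [40, 44, 36, 40]
after_change = [48, 52, 44, 48]

print(f"{relative_change(before_change, after_change):+.0%}")  # prints "+20%"
```

The same comparison could be run per team or per practice (for example, before and after adopting TDD), as long as the output score is measured the same way in both periods.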

Why don’t traditional metrics work well as a measure of developer productivity?

Process metrics play their role in mature engineering organizations, but they’re not productivity measures – they provide valuable insights into the operations of an organization. By measuring certain process KPIs, organizations can identify areas for improvement and work towards eliminating inefficiencies such as wait times and scheduling issues. These metrics can help improve productivity, but they don’t measure it directly: reducing wait times, for instance, may raise productivity, but only indirectly. They can also be gamed. Instead of increasing real productivity, employees measured on process metrics may make smaller commits, ship smaller functions, or simply increase the frequency of certain actions. Without a quantitative assessment of what’s shipped, it’s difficult to determine whether changes actually increase productivity.

Story points are subjective and not calibrated to productivity – they were initially conceived to prevent engineers from underestimating tasks, but they can be easily manipulated. For example, a Scrum Master can adjust the number of points included in a sprint to ensure it can be completed successfully. The meaning and size of a story point also evolve over time, and estimates vary greatly from one team to another: what is five story points for one team could be eight for another. This makes it impossible to benchmark performance using story points.

Why Prior Methods of Measuring Productivity Fall Short
Metrics | Definition | Counterproductive Incentive
Story Points (or Velocity Points) | Effort estimation for tasks | Encourages inflating the number of points a task will require to complete
Pull Request Counts | Number of proposed code changes | Encourages making smaller, less significant changes
Avg. Code Review Turnaround Time | Time for code review and approval | Encourages reviewers to be less diligent to improve their turnaround time
Release Frequency | Rate of software updates | Encourages prioritizing speed over quality and feature bloat
Lines of Code | Number of lines written in a program | Encourages verbose, redundant code
Commit Counts | Number of updates to a code repository | Encourages multiple small commits instead of meaningful larger updates

How It Works

How did your research begin? Where did the idea come from?

The research behind the tool was inspired by a need for productivity metrics – Simon was the technical lead of several software engineering teams looking to improve their way of working around 2010. His business was acquired in 2016 based on the impressive results of a team in Moldova. The buyer wanted to include a measure of productivity in the purchase contract and guarantee that productivity wouldn’t decrease over the next three years. This sparked Simon’s curiosity about how such a measure could be implemented, leading to further research and development of the tool.

How does your research algorithm measure developer productivity? What are you measuring?

The algorithm reads source code from repositories, analyzes it, and uses git metadata – to arrive at the historical and current output of developers, teams, and organizations. It doesn’t measure productivity based on activity. Instead, it sizes the code and builds a score in output units; viewed over time, that score provides a measure of productivity.

The algorithm examines various dimensions of the source code, such as APIs consumed, persistence layers, classes, class surfaces, dependencies, dependency injections, and architectural patterns. It also takes into account code complexity.
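To make the idea of "output units over time" concrete, here is a minimal sketch. It assumes each change has already been sized by some scoring step; that step is the tool's own analysis and is not reproduced here. The `SizedChange` type, the `output_units` field, and all the numbers are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SizedChange:
    author: str
    day: date
    output_units: float  # hypothetical size score already assigned to the change


def output_per_week(changes):
    """Sum output units per (author, ISO year, ISO week)."""
    totals = defaultdict(float)
    for change in changes:
        iso_year, iso_week, _ = change.day.isocalendar()
        totals[(change.author, iso_year, iso_week)] += change.output_units
    return dict(totals)


# Made-up data: one large change versus four small ones in the same week.
example_changes = [
    SizedChange("alice", date(2024, 1, 3), 12.0),
    SizedChange("bob", date(2024, 1, 1), 3.0),
    SizedChange("bob", date(2024, 1, 2), 3.0),
    SizedChange("bob", date(2024, 1, 4), 3.0),
    SizedChange("bob", date(2024, 1, 5), 3.0),
]

weekly = output_per_week(example_changes)
for key in sorted(weekly):
    print(key, weekly[key])
```

In this made-up data, both authors end the week with the same total even though one made four times as many changes, which is the difference between sizing what was shipped and counting activity.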

How does it account for productivity in different languages? 

It grants the same score to functionally comparable programs, irrespective of coding style or language – even when they are developed by two different developers. It can size code written in 10+ languages, so output can be evaluated uniformly across programming languages. It works best when applied to object-oriented languages, as it can thoroughly analyze classes and dependencies. It’s continuously updated and extended to new programming languages and frameworks, ensuring it remains a reliable tool for measuring code size and developer productivity.

Research Findings/Practical Implications

What are your findings around different QA practices?

Test automation significantly enhances performance – teams that have implemented test automation significantly outperform those that haven’t. While it’s possible to manage without it in the early stages of a project, as the codebase and complexity grow, the lack of proper test coverage can lead to a decline in velocity. Adequate test coverage is beneficial for productivity in the long run. While waiting for feedback from QA, software engineers may lose a lot of time, which can be mitigated with proper test coverage.

Every team needs QA whether it’s automated or manual  – having a dedicated manual QA team can be more beneficial than not having one at all, as it can free up software engineers to be more productive. However, manual testing can be time-consuming, leading to delays in feedback. Automated testing is not always feasible or valuable. For instance, when developing code for hardware like TVs or gaming consoles, a field test is necessary. 

Junior vs. senior engineers:

  • Junior engineers – tend to produce a lot of lines of code but often need to refactor and rework them to achieve a satisfactory result.
  • Senior engineers – tend to write less code overall, but to greater effect (i.e., it needs less rework).

How does remote vs. in-office work affect developer productivity?

Hybrid setups can work – research has shown that hybrid setups can be effective. Their success depends on factors such as the team’s dynamics, culture, and location.

The tails of performance outcomes are extended with remote work – in remote settings, low performers tend to do worse, which could be due to a lack of supervision, distractions at home, or the absence of a conducive work environment. At the same time, more individuals tend to perform exceptionally well when working remotely.

You might consider flexible work even if it ends up less productive for your team – even if a team might be slightly less productive working from home, offering it as a benefit could attract more talent. It’s important to weigh the potential benefits against any potential decrease in productivity.  The nature of the work and the team culture also play significant roles in determining the effectiveness of each setting.

Are offshore or satellite office teams as productive as headquarters teams?

Satellite offices can be as productive as headquarters   – we’ve even seen situations where satellite offices outperform their headquarters. There is no general rule that a satellite office is always better or worse than the headquarters. With the right organization, a satellite office can deliver good results.

Discipline is key to productivity regardless of location – for instance, when a business team and engineers share an office, the business team may bypass the process and inject items into the roadmap without proper prioritization. Discipline prevents this, wherever the team is located.

Case study on the potential effect of an offshore team:

How does developer use of generative AI impact productivity? 

Generative AI boosts productivity – our research has observed that generative AI enhances productivity. However, the extent of this improvement may vary with the specific tools used. Pick a tool based on your individual needs and preferences.

What DevOps practices and tools are shown to boost productivity? 

DevOps practices and tools can significantly improve throughput – adopting DevOps practices and tools can be a key differentiator in the evolution of output over time. Case studies have shown that companies that adopt DevOps practices and tools suited to their needs can see a marked improvement in their throughput. 

However, once these practices and tools are in place, continuous improvement is necessary – when you have multiple teams all using DevOps practices and tools, the question becomes, “What next?” There is always room for improvement, and this is where the value of process metrics may start to diminish. Process metrics can provide valuable insights into the impact of DevOps practices and tools, but as time goes on and improvements are made, they may no longer provide the information needed for further growth and development.

How does software engineering output distribute among top vs. bottom quartile performers? 

Engineering productivity varies greatly between individuals and is highly contextual – the productivity of an engineer can be influenced by a variety of factors. These can include their understanding of the business and code base, their motivation and work ethic, and even external factors such as life events or organizational changes. 

Productivity is not static – even top performers can have periods of lower productivity due to various factors. Understanding this can help managers to better support their team and maintain productivity levels.

Data can help identify high performers and understand their methods – by analyzing productivity data, managers can identify high performers and understand how they achieve their results. This information can then be used to help other team members improve their own productivity.

The concept of 10x engineers is relative – the idea of an engineer who is ten times more productive than their peers is often seen as a stretch. Simply putting together a team of high performers does not guarantee a high-performing team; its performance also depends on what the team is working on and how well its members work together.
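One way to look at how output distributes between top and bottom quartiles is to rank engineers by output and compute each quartile's share of the total. The sketch below does this for made-up per-engineer scores; the function name and the numbers are assumptions for illustration only and do not represent research results.

```python
def quartile_shares(outputs):
    """Return each quartile's share of total output, lowest quartile first."""
    if len(outputs) < 4:
        raise ValueError("need at least four values to form quartiles")
    ranked = sorted(outputs)
    total = sum(ranked)
    if total <= 0:
        raise ValueError("total output must be positive")
    n = len(ranked)
    shares = []
    for q in range(4):
        # Integer boundaries split the ranked list into four near-equal groups.
        start = q * n // 4
        end = (q + 1) * n // 4
        shares.append(sum(ranked[start:end]) / total)
    return shares


# Hypothetical output scores for eight engineers over the same period.
scores = [10, 20, 30, 40, 50, 60, 70, 120]
for label, share in zip(["Q1 (bottom)", "Q2", "Q3", "Q4 (top)"], quartile_shares(scores)):
    print(f"{label}: {share:.1%}")
```

Because individual output is highly contextual, a split like this is better read as a prompt for understanding what top performers do differently than as a ranking of people.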

What are the most important things to get right?

There are three reasons to measure productivity, but the third isn’t valid:

  1. Drive change and improvement – the primary reason for measuring developer productivity should be to identify areas for improvement and drive positive change.
  2. Validation and confirmation – some may use productivity measurements to confirm that everything is running smoothly. However, this should not be the sole reason for measuring productivity, as it may lead to complacency and hinder growth.
  3. Political reasons – using productivity measurements for political reasons, such as to prove someone’s incompetence, is not a healthy or productive use of these insights.

Measuring the productivity impact of changes in your way of working – the data should be used to help and empower teams to improve and to understand why things might not be working as they should.

Understanding the reasons behind the output – there can be many reasons for a given level of output. For example, a senior engineer could be slowed down because they are helping other team members. Output is not the ultimate measure, but it’s important to measure it.

What are common pitfalls?

Not measuring productivity – failing to measure productivity can lead to inefficiencies and missed opportunities for improvement. 

Weaponizing the data to create a toxic work environment – productivity metrics are meant to help you understand the impact of changes and improve productivity, not to serve as a performance management tool for individuals. Using them as such can lead to people shutting down, which makes change very difficult. Use the data responsibly.
