AI Leads To Slowdown In Developer Productivity

Wednesday, 16 July 2025

Empirical research into whether access to AI-powered tools, primarily Cursor, shortens or lengthens the time taken to deal with routine software development tasks produced an unexpected result. Using AI tools increased task-completion time by 19%.

This result took the study's lead authors, Joel Becker and Nate Rush, by surprise. On the basis of prior research, anecdotal evidence and their own experience, they were pretty confident before conducting the experiment that access to AI-powered tools would lead to increased productivity - i.e. shorter task completion times. And from what we have reported over the past couple of years on I Programmer, that would have been our expectation too.

The research comes from METR (Model Evaluation & Threat Research), a non-profit research group focused on evaluating frontier AI models. The organization was founded in 2022 by Elizabeth Barnes, formerly a researcher at OpenAI where she worked on AI alignment, that is, ensuring that advanced AI systems act in accordance with human values, goals, and intentions. METR is now dedicated to understanding the potential capabilities and risks of advanced AI as these systems become more autonomous.

METR's recent empirical study was a randomized controlled trial (RCT) to understand how "early-2025 AI tools" affect the productivity of experienced open-source developers working on their own repositories. As this chart shows, and contrary to all prior expectations, which forecast a speedup of 20 to 40%, the observed result was a slowdown of 19%.

METRchart

The result ran counter not only to the METR researchers' expectations and the experts' forecasts, but also to the forecasts made by the developers participating in the study and even to their estimates after having taken part - instead of perceiving the 19% slowdown, they felt they had achieved a speedup of 20%!
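To make the gap between perception and measurement concrete, here is a minimal arithmetic sketch. It assumes a hypothetical task that takes 2 hours without AI, and it reads "a speedup of 20%" as "20% less time", which is an interpretation rather than something the study's summary spells out. The function name `scaled_time` is ours, purely for illustration.

```python
def scaled_time(baseline_minutes: float, percent_change: float) -> float:
    """Return the task time after applying a percentage change to the baseline."""
    return baseline_minutes * (1 + percent_change / 100)


# Assumption: a hypothetical task that takes 2 hours without AI.
baseline = 120.0

# Observed effect reported by the study: tasks took 19% longer with AI allowed.
observed = scaled_time(baseline, 19)

# Developers' own impression, read here as "20% less time" (an assumption).
perceived = scaled_time(baseline, -20)

# How far apart the impression and the measurement are for this one task.
gap = observed - perceived

print(f"Observed with AI:  {observed:.1f} min")
print(f"Perceived with AI: {perceived:.1f} min")
print(f"Gap:               {gap:.1f} min")
```

On these assumptions the developer would believe the task took about 96 minutes when it actually took about 143, a difference of roughly three quarters of an hour on a single 2-hour task.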

Details of the methodology of the RCT are given in the arXiv paper, Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity, but in brief it involved 16 experienced developers working on large open-source repositories to which they had contributed for 5 years on average. The developers themselves identified the tasks (pull requests) for the study - bug fixes, features, and refactors that would normally be part of their regular work, 246 in total and with an average completion time of around 2 hours. Each issue was randomly assigned to either allow or disallow use of AI. When AI was allowed, developers could use any tools they chose (primarily Cursor Pro with Claude 3.5/3.7 Sonnet, frontier models at the time of the study); when it was disallowed, they worked without generative AI assistance.
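The random assignment described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the researchers' actual procedure or analysis: the function names, the issue identifiers and the timings are all hypothetical, and the comparison shown is a simple ratio of mean completion times, whereas the paper's own estimator may well differ.

```python
import random
from statistics import mean

CONDITIONS = ("ai_allowed", "ai_disallowed")


def assign_conditions(issue_ids, seed=0):
    """Randomly assign each issue to one of the two conditions.

    A seeded generator is used so the assignment is reproducible.
    """
    rng = random.Random(seed)
    return {issue: rng.choice(CONDITIONS) for issue in issue_ids}


def percent_change_in_time(times_allowed, times_disallowed):
    """Percentage change in mean completion time with AI allowed,
    relative to AI disallowed. Positive values mean a slowdown."""
    return (mean(times_allowed) / mean(times_disallowed) - 1) * 100


# Hypothetical issue identifiers, for illustration only.
issues = [f"issue-{n}" for n in range(1, 9)]
assignment = assign_conditions(issues, seed=42)
for issue, condition in assignment.items():
    print(issue, condition)

# Made-up completion times in minutes, NOT data from the study.
example_allowed = [130.0, 150.0, 140.0]
example_disallowed = [110.0, 125.0, 120.0]
change = percent_change_in_time(example_allowed, example_disallowed)
print(f"Change in mean completion time: {change:+.1f}%")
```

Randomizing at the level of the individual issue, rather than the developer, means each developer contributes tasks to both conditions, so differences in individual skill do not line up with the AI/no-AI split.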

As this chart shows, the forecast time was slightly greater for tasks with AI disallowed than with AI allowed, but AI-disallowed tasks were actually completed in a slightly shorter average time. In the case of AI-allowed tasks, the observed implementation time was considerably longer than forecast.

METRchart2

These counter-intuitive results led the researchers into an extensive analysis of why the use of AI tools slowed the developers down, and they came up with some interesting observations.

Some were to do with developer experience: when using AI, developers were slowed down more on issues they were more familiar with, i.e. AI assistance was less helpful the more experienced the developer.

Others were to do with the projects themselves, such as developers reporting that AI performs worse in large and complex environments.

It was the issue of low AI reliability that impinged on developers' time. Overall, developers accepted less than 44% of AI generations. Moreover, the majority reported making major changes to clean up AI code, such that 9% of time was spent reviewing and cleaning AI output.

The researchers attempted to reconcile their observation of reduced productivity when using AI-powered tools with the widely held perception, even among the developers in the study, that AI enhances developer productivity. One factor could have been that the developers in the RCT had typically only used Cursor for a few dozen hours prior to the study.

Another could be that:

"AI capabilities may be comparatively lower in settings with very high quality standards, or with many implicit requirements (e.g. relating to documentation, testing coverage, or linting/formatting) that take humans substantial time to learn".

The researchers are at pains not to let their findings undermine the usefulness of generative AI, stating:

"It seems plausible or likely that AI tools are useful in many other contexts different from our setting, for example, for less experienced developers, or for developers working in an unfamiliar codebase."

More Information

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity by Joel Becker, Nate Rush, Elizabeth Barnes, David Rein

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity METR blog

Programming In The Age of AI

Does AI Help or Hinder?

GitHub Copilot Provides Productivity Boost


Last Updated ( Wednesday, 16 July 2025 )
