24 posts tagged “productivity”
2025
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity (via) METR - for Model Evaluation & Threat Research - are a non-profit research institute founded by Beth Barnes, a former alignment researcher at OpenAI (see Wikipedia). They've previously contributed to system cards for OpenAI and Anthropic, but this new research represents a slightly different direction for them:
We conduct a randomized controlled trial (RCT) to understand how early-2025 AI tools affect the productivity of experienced open-source developers working on their own repositories. Surprisingly, we find that when developers use AI tools, they take 19% longer than without—AI makes them slower.
The full paper (PDF) has a lot of details that are missing from the linked summary.
METR recruited 16 experienced open source developers for their study, with varying levels of exposure to LLM tools. They then assigned them tasks from their own open source projects, randomly assigning whether AI was allowed or not allowed for each of those tasks.
They found a surprising difference between developer estimates and actual completion times:
After completing the study, developers estimate that allowing AI reduced completion time by 20%. Surprisingly, we find that allowing AI actually increases completion time by 19%—AI tooling slowed developers down.
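To make the per-task randomization and that headline figure concrete, here's a minimal Python sketch. It is not METR's analysis code and makes no attempt to reproduce the paper's statistical method. The function names `assign_conditions` and `percent_change` and all of the timings are hypothetical, chosen only so the arithmetic lands on 19%.

```python
import random
from statistics import mean


def assign_conditions(issues, seed=0):
    """Randomly mark each issue as AI-allowed (True) or AI-disallowed (False).

    Hypothetical helper: a seeded coin flip per issue, so the assignment
    is reproducible.
    """
    rng = random.Random(seed)
    return {issue: rng.choice([True, False]) for issue in issues}


def percent_change(ai_minutes, no_ai_minutes):
    """Percent change in mean completion time with AI, relative to without.

    A positive result means tasks took longer when AI was allowed.
    """
    baseline = mean(no_ai_minutes)
    return 100 * (mean(ai_minutes) - baseline) / baseline


# Made-up timings, for illustration only (not data from the study).
no_ai_minutes = [100, 100, 100]
ai_minutes = [119, 119, 119]

assignment = assign_conditions(["issue-1", "issue-2", "issue-3", "issue-4"])
slowdown = percent_change(ai_minutes, no_ai_minutes)  # 19.0 with these timings
```

A developer's own estimate of "20% faster" would correspond to a `percent_change` of -20 under this definition, which is why the gap between perception and measurement is so striking.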
I shared my initial intuition about this paper on Hacker News the other day:
My personal theory is that getting a significant productivity boost from LLM assistance and AI tools has a much steeper learning curve than most people expect.
This study had 16 participants, with a mix of previous exposure to AI tools - 56% of them had never used Cursor before, and the study was mainly about Cursor.
They then had those 16 participants work on issues (about 15 each), where each issue was randomly assigned a "you can use AI" vs. "you can't use AI" rule.
So each developer worked on a mix of AI-tasks and no-AI-tasks during the study.
A quarter of the participants saw increased performance, 3/4 saw reduced performance.
One of the top performers for AI was also someone with the most previous Cursor experience. The paper acknowledges that here:
However, we see positive speedup for the one developer who has more than 50 hours of Cursor experience, so it's plausible that there is a high skill ceiling for using Cursor, such that developers with significant experience see positive speedup.
My intuition here is that this study mainly demonstrated that the learning curve on AI-assisted development is high enough that asking developers to bake it into their existing workflows reduces their performance while they climb that learning curve.
I got an insightful reply there from Nate Rush, one of the authors of the study, which included these notes:
- Some prior studies that find speedup do so with developers that have similar (or less!) experience with the tools they use. In other words, the "steep learning curve" theory doesn't differentially explain our results vs. other results.
- Prior to the study, 90+% of developers had reasonable experience prompting LLMs. Before we found slowdown, this was the only concern that most external reviewers had about experience was about prompting -- as prompting was considered the primary skill. In general, the standard wisdom was/is Cursor is very easy to pick up if you're used to VSCode, which most developers used prior to the study.
- Imagine all these developers had a TON of AI experience. One thing this might do is make them worse programmers when not using AI (relatable, at least for me), which in turn would raise the speedup we find (but not because AI was better, but just because with AI is much worse). In other words, we're sorta in between a rock and a hard place here -- it's just plain hard to figure out what the right baseline should be!
- We shared information on developer prior experience with expert forecasters. Even with this information, forecasters were still dramatically over-optimistic about speedup.
- As you say, it's totally possible that there is a long-tail of skills to using these tools -- things you only pick up and realize after hundreds of hours of usage. Our study doesn't really speak to this. I'd be excited for future literature to explore this more.
In general, these results being surprising makes it easy to read the paper, find one factor that resonates, and conclude "ah, this one factor probably just explains slowdown." My guess: there is no one factor -- there's a bunch of factors that contribute to this result -- at least 5 seem likely, and at least 9 we can't rule out (see the factors table on page 11).
Here's their table of the most likely factors:
I think Nate's right that jumping straight to a conclusion about a single factor is a shallow and unproductive way to think about this report.
That said, I can't resist the temptation to do exactly that! The factor that stands out most to me is that these developers were all working in repositories they have a deep understanding of already, presumably on non-trivial issues since any trivial issues are likely to have been resolved in the past.
I think this is a really interesting paper. Measuring developer productivity is notoriously difficult. I hope this paper inspires more work with a similar level of detail to analyzing how professional programmers spend their time:
To compare how developers spend their time with and without AI assistance, we manually label a subset of 128 screen recordings with fine-grained activity labels, totaling 143 hours of video.
Using LLMs and Cursor to become a finisher (via) Zohaib Rauf describes a pattern I've seen quite a few examples of now: engineers who moved into management but now find themselves able to ship working code again (at least for their side projects) thanks to the productivity boost they get from leaning on LLMs.
Zohaib also provides a very useful detailed example of how they use a combination of ChatGPT and Cursor to work on projects, by starting with a spec created through collaboration with o1, then saving that as a SPEC.md Markdown file and adding that to Cursor's context in order to work on the actual implementation.
2024
When presented with a difficult task, I ask myself: “what if I didn’t do this at all?”. Most of the time, this is a stupid question, and I have to do the thing. But ~5% of the time, I realize that I can completely skip some work.
Advanced Topics in Reminders and To Do Lists. Fred Benenson’s advanced guide to the Apple Reminders ecosystem. I live my life by Reminders—I particularly like that you can set them with Siri, so “Hey Siri, remind me to check the chickens made it to bed at 7pm every evening” sets up a recurring reminder without having to fiddle around in the UI. Fred has some useful tips here I hadn’t seen before.
Tom Scott, and the formidable power of escalating streaks
Ten years ago yesterday, Tom Scott posted this video to YouTube about “Special Crossings For Horses In Britain”. It was the first in his Things You Might Not Know series, but more importantly it was the start of a streak.
[... 1,352 words]
2023
[On AI-assisted programming] I feel like I got a small army of competent hackers to both do my bidding and to teach me as I go. It's just pure delight and magic.
It's riding a bike downhill and playing with legos and having a great coach and finishing a project all at once.
When you start a creative project but don’t finish, the experience drags you down. Worst of all is when you never decisively abandon a project, instead allowing it to fade into forgetfulness. The fades add up; they become a gloomy haze that whispers, you’re not the kind of person who DOES things.
When you start and finish, by contrast — and it can be a project of any scope: a 24-hour comic, a one-page short story, truly anything — it is powerful fuel that goes straight back into the tank. When a project is finished, it exits the realm of “this is gonna be great” and becomes instead something you (and perhaps others) can actually evaluate. Even if that evaluation is disastrous, it is also, I will insist, thrilling and productive. A project finished is the pump of a piston, preparing the engine for the next one.
2022
Coping strategies for the serial project hoarder
I gave a talk at DjangoCon US 2022 in San Diego last month about productivity on personal projects, titled “Massively increase your productivity on personal projects with comprehensive documentation and automated tests”.
[... 3,865 words]
2020
We’re generally only impressed by things we can’t do - things that are beyond our own skill set. So, by definition, we aren’t going to be that impressed by the things we create. The end user, however, is perfectly able to find your work impressive.
2019
For creative work, you can't cheat. My belief is that there are 5 creative hours in everyone's day. All I ask of people at Shopify is that 4 of those are channeled into the company.
Weeknotes: Niche Museums, Kepler, Trees and Streaks
Every now and then someone will ask “so when are you going to build Museums Near Me then?”, based on my obsession with niche museums and websites like www.owlsnearme.com.
[... 872 words]
Seeking the Productive Life: Some Details of My Personal Infrastructure (via) Stephen Wolfram’s 15,000 word epic about his personal approach to productivity, developed over the past thirty years. This is a fascinating document: I found myself thinking “surely there can’t be more information than this” and then spotting that the scrollbar wasn’t even a third done yet. Very hard to summarize: it turns out if you’re the work-from-home CEO of your own privately held 800 person company you can construct some very opinionated habits.
2014
What are some tips for improving public speaking skills quickly?
Practice your talk, out loud, in private, as many times as possible before you deliver it. There’s no better way of ensuring you know your material and that you can deliver it at a sensible pace without freezing up.
[... 127 words]
2013
What should be considered when deciding to do a marathon?
Running a marathon is easier than you think.
[... 200 words]
How do I overcome my fear of public speaking (of people just “switching off”, or simply getting up and leaving the room)?
Look for opportunities to give “lightning talks”: 5 minute talks given as part of a series of talks. These are excellent for beginner speakers as they help force you to get to the point as quickly as possible, and you only have to survive for five minutes! They are good for the audience too, as if they don’t enjoy your talk they only have to sit politely for a couple of minutes before the next talk comes along.
[... 107 words]
Is there a website or app that helps you track what you have completed each day?
I’ve tried a few solutions for this. Surprisingly the one that has stuck for me is Evernote—I keep a different document for each week (I tried a document per day but that was annoying to update, and meant I didn’t look at my older notes as often) and each day I add a new header at the top of the document for that day. Being able to link through to other notes from my day summaries is useful too.
[... 103 words]
I want to write a short summary for every article I read online for future use. What is the best tool to do this?
http://pinboard.in is well suited to this. It’s a bookmarking service (like the old Delicious) with a bookmarklet that lets you quickly annotate and add tags to a link, privately or in public. For an extra fee the site will archive copies of the pages you are linking to as well in case they vanish in the future.
What are some recommended efficient apps for personal productivity (e.g., setting & receiving reminders for completing tasks)?
Things is excellent (at least if you are a Mac/iPhone person)—intuitive, powerful and with flawless syncing. Only catch is it’s a tad expensive considering you have to buy the iPhone and Mac apps separately.
[... 59 words]
What are some productive things to do for 15 minutes a day?
Learn a foreign language—using DuoLingo on the iPhone, or with podcasts such as Coffee Break Spanish.
[... 36 words]
2012
What are some creative programs to use for presentations?
Here’s a trick I’ve used with success in the past: set up your Mac to have 9 virtual desktops, then arrange your “slides” on each desktop using a combination of applications. I’ve done this with a title slide in Keynote on the first desktop, a text editor with some sample code on the second, a terminal prompt set up for live coding on the third, a browser showing a demo on the fourth and so on. Learn the keyboard commands to switch between desktops and off you go.
[... 169 words]
2011
What are some things I can do if I’m lying awake and unable to sleep for an hour?
Listen to a podcast. If you’re lying at rest in bed you’ll still get at least some of the benefits of sleeping, and you might find that listening to the podcast helps take your mind off things and sends you to sleep.
[... 66 words]
2009
Software engineers today are about 200-400% more productive than software engineers were 10 years ago because of open source software, better programming tools, common libraries, easier access to information, better education, and other factors. This means that one engineer today can do what 3-5 people did in 1999!
2007
Spend 10 minutes collecting everything you need to work on a problem, and unplug the internet for 2 hours. You'll finish in 30 minutes.
Geek | Manager. Meri Williams is one of the most productive people I know. This is her new blog on being a manager and a geek at the same time, with plenty of productivity advice.