This might be a niche preference, but I would absolutely and happily read a whole post from you on the limits of python for data analysis, as someone who's used and appreciated both pandas and matplotlib! In my experience python is fine, and R is better. Wonderful post overall!
> Typical use cases were situations where the model had hallucinated an API call or a function parameter or a return value, and when it saw the error message it recognized the problem and often came up with the right way to fix the issue. But sometimes this process could go haywire. Just the other day I asked for a fairly simple (I thought) function that could load two protein structures and align them. And the model just couldn’t figure out how to correctly call the superimpose() function from the biotite package
For these kinds of situations, it helps to already have the package's documentation, examples of how to use it, and so forth in the context. Anything rare that isn't part of the ubiquitous Python stack (or its equivalent in another language) needs this, I think.
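For example, the kind of snippet worth keeping in the context for the superimpose() case quoted above would look roughly like this. This is a sketch based on my reading of the biotite docs, so treat the exact signatures and return values as assumptions to verify, and the PDB file names as placeholders:

```python
# Sketch: load two protein structures and superimpose one onto the other
# with biotite. Verify signatures against your installed version's docs.
import biotite.structure as struc
from biotite.structure.io.pdb import PDBFile

# Load the first model from each PDB file (placeholder file names).
fixed = PDBFile.read("fixed.pdb").get_structure(model=1)
mobile = PDBFile.read("mobile.pdb").get_structure(model=1)

# superimpose() expects both atom arrays to have the same length, so
# restrict both structures to their C-alpha atoms first.
fixed_ca = fixed[fixed.atom_name == "CA"]
mobile_ca = mobile[mobile.atom_name == "CA"]

# Returns the fitted coordinates and the transformation that was applied.
fitted, transform = struc.superimpose(fixed_ca, mobile_ca)

# Quality of the fit: root-mean-square deviation after superposition.
print(struc.rmsd(fixed_ca, fitted))
```

A dozen lines like this in the context usually pins the model to the real API instead of a hallucinated one.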
I still struggle with getting the context right for a new project. It is quite fatiguing to neither enjoy the process of doing a thing from scratch nor get a timely solution that just works.
It's a lot of work, though, if you're just putting together some quick teaching examples and every week you're working with a different library or asking totally different questions. If I need to engineer the context each time, I might just as well read the library documentation and do things manually.
> so much of programming is reading the documentation and tutorials and code examples
My own work is not like that, and as a result I'd been finding it hard to understand what LLMs were useful for. What you write is depressingly plausible.
I think the point you make in this article is excellent and goes beyond programming. LLMs are great at fuzzily reproducing big fields (provided they're not too niche), but they don't have the ability to reason about how the facts within them really connect.
Thank you for this clear, well written piece!
I wonder whether this analysis also applies to math research. From the outside, it seems like models have been getting better at math, but I wonder how cleanly we can separate "deep reasoning" from simply having a very good tool bag.
I think it's difficult to assess for math. Unless you're doing cutting-edge math research, you'll almost never encounter a problem that the AI hasn't seen before. Maybe the litmus test would be whether mathematicians are starting to develop new proofs with AI. As far as I understand, this is not quite happening yet.
Great explanation!
This is so true.
Why should they?
Why should machines understand —
when we, the supposed origin of meaning, have long confused understanding with prediction,
and intelligence with efficiency?
Computation was never meant to know.
It sustains coherence syntactically —
while human sense-making, once an ontopoietic act of becoming,
has decayed into the repetition of learned operations.
We built machines that complete patterns —
and in doing so, we trained ourselves to live as patterns.
What we call artificial intelligence is not a rival to thought.
It is a mirror of our epistemic exhaustion:
a reflection of cognition stripped of orientation,
of syntax detached from soul.
The real hallucination was never inside the model.
It lives in our belief that truth can be computed,
that care can be outsourced,
that the architecture of meaning can persist
without subjects capable of sustaining it.
AI does not imitate humanity.
It exposes the redundancy of a civilization
that has mistaken expression for awareness
and automation for understanding.
🜂 The question is not whether machines can think like us —
but whether we still can think as beings at all.
BTW, can you say more about plotnine? I googled it and skimmed some opinions, and the main point I picked up is that it is closer to ggplot2.
Yes, it's for the most part a direct copy of the ggplot2 API to Python. Since I'm very familiar with ggplot2 this makes it easy for me to use. But also, I think the ggplot2 approach to building plots is exceptionally powerful. Once you understand how it works, you'd never want to go back to something like matplotlib or seaborn.
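To make that concrete, here's a small example of what plotnine code looks like. The data frame and column names are made up for illustration, but the grammar (compose the plot from data, aesthetic mappings, geoms, and facets with `+`) is the actual plotnine API:

```python
# A plot built the ggplot2 way: map columns to aesthetics, then add
# layers and facets with "+".
import pandas as pd
from plotnine import ggplot, aes, geom_point, geom_smooth, facet_wrap

# Toy data frame (made-up columns, for illustration only).
df = pd.DataFrame({
    "dose":     [1, 2, 4, 8, 1, 2, 4, 8],
    "response": [3, 5, 9, 16, 2, 4, 8, 15],
    "group":    ["a"] * 4 + ["b"] * 4,
})

plot = (
    ggplot(df, aes(x="dose", y="response", color="group"))
    + geom_point()
    + geom_smooth(method="lm")  # a fitted line is just one more layer
    + facet_wrap("~group")      # "one panel per group" is one more term
)
plot.save("response.png")       # or display `plot` directly in a notebook
```

Quick on-the-fly changes like "just make this other plot" tend to be a single added or swapped `+` term rather than a restructuring of the plotting code.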
Why don't you like Python for data analysis? And what are your issues with pandas and matplotlib? I don't do much Python anymore, so I'm a bit behind the curve and am curious how things have progressed.
It's a long story; I'll have to write a post about it. But in brief, stuff is simply too cumbersome, due to design flaws. You can do everything in Python, but it's too complicated and insufficiently intuitive. I let my students freely choose between R and Python, whichever they feel more comfortable with. And the experience I've had for coming up on two decades is that when a student uses Python and I ask them to make a quick modification to their analysis on the fly ("just make this other plot" or "calculate this other quantity"), they can never do it. It always requires them to go back to their desk, work for 30 minutes, and then they have the result. And to be clear, these are strong students. All the evidence I have says they are limited by their methods. Things that I can do in 5 minutes in R take them 30 minutes in Python.
That sounds a lot like my experience… I should probably finally learn R.