Daniel Waterworth

Derivation.js: A better structure for collaborative web applications

The state of things

Polarization

It's easy to find polarized opinions online about the use of LLMs for programming.

Some people seem to believe that human programmers are already obsolete: that LLMs can do everything human programmers can do, but faster, and that any view to the contrary is willful ignorance from programmers in denial, having spent the better part of their lives learning skills that are now unnecessary.

Others, who are perhaps in denial themselves, push the opposite view: LLMs are not useful at all, because they are not infallible, and the mistakes they make take longer to rectify than building the thing from scratch without them.

Nuance

Reality, as always, is more nuanced than this.

It has been clear for some time that, for certain tasks, LLMs are better programmers than humans. For instance, if I wanted to ingest some data from a CSV file and produce a plot of it, I wouldn't even consider doing it myself. I couldn't write a perfect script for that job without consulting reference material, because I simply haven't memorized all of matplotlib.

LLMs have. They are excellent at writing stuff that requires you to have memorized things, but they can struggle on tasks that require less memorization and more reasoning.

Another view that you'll find is that LLMs are just fancy autocomplete and are therefore incapable of any reasoning, but that's also not true. It's clear that some genuine reasoning is possible, but they are not oracles.

Of course, evaluating LLMs is made more complex by the fact that they are always changing. If your view on what LLMs are capable of doing today is based on experiments you conducted last year, your opinions are likely to be somewhat out of date.

So, as of writing, simply put, my view is this: there is a continuum between tasks that require memorization and tasks that require reasoning. LLMs are likely to do better at tasks on the memorization side of things, but limited reasoning is certainly possible.

This explains why LLMs do better at generating code from scratch (which requires more memorization) than at maintaining code that already exists (which requires more reasoning).

The interesting thing is what you do with this information. I'm not trying to prove any of the above to you, because I hope this is already self-evident.

How to use LLMs more effectively

Now, if you have a task that requires more reasoning, does that mean that LLMs are of no help at all, since they are incapable of one-shotting the whole task? Or is there some way that we can attack the task that makes it more amenable to LLMs?

Of course, the answer is the latter, and, although I'm sure there are a myriad of interesting techniques for productively using LLMs on complex tasks, I'm just going to tell you about one.

Perhaps you'll recognize another trope from online discussions on coding. This time in haiku form:

Claude, write a web app,

Python and Ruby are slow,

Write it in C, please.

It stands to reason, does it not? If I have an oracle that can produce code in any language, why not choose a fast one?

Of course, the flaw in this reasoning is that LLMs are not oracles. Python and Ruby are simply easier to write, so LLMs are more likely to produce good code in those languages.

Let's extend this reasoning further, however. If we have a difficult task, perhaps there's some framework in which the answer is easy to express.

Derivation.js

This finally gets us to derivation.js, a set of libraries I've been working on for building collaborative web applications. The core library, 'derivation', an npm package written in TypeScript, provides a dataflow execution model.

You see, the problem with the way a lot of people write realtime applications is that they write the logic to render the UI from a snapshot, and then, separately, they write the logic to decide when to refresh the view. The same knowledge ends up expressed in two places, and, when that happens, it's inevitable that the two will drift out of sync.
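To make the problem concrete, here is a hand-rolled sketch of that anti-pattern (not derivation.js code; all names here are hypothetical). The render function knows what it depends on, and the event handler has to repeat that knowledge:

```typescript
// Illustrative anti-pattern: render logic and refresh logic are written
// separately, so the dependency knowledge is duplicated.

interface Message { author: string; text: string; }

const state = { messages: [] as Message[], onlineUsers: [] as string[] };

// Logic #1: what the view depends on is implicit in the render function.
function renderSidebar(): string {
  return `${state.onlineUsers.length} online, ${state.messages.length} messages`;
}

// Logic #2: deciding *when* to re-render repeats that dependency knowledge.
// If renderSidebar later depends on a new field and nobody adds its events
// here, the view silently goes stale.
function handleServerEvent(event: { kind: string }): string | null {
  if (event.kind === "message" || event.kind === "presence") {
    return renderSidebar();
  }
  return null;
}
```

The bug surface is exactly the gap between those two functions: every change to one demands a matching change to the other.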

The defining feature of derivation.js, by contrast, is that you write your logic once, in a functional style, and the library takes on the responsibility of updating everything efficiently.
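The dataflow idea itself can be sketched in a few lines. This is a minimal from-scratch illustration of the pattern of base values and automatically recomputed derived values; it is not derivation.js's actual API, and the class names are my own invention:

```typescript
// Minimal sketch of the dataflow pattern: base values you can set, and
// derived values that recompute automatically when their inputs change.
// Not the derivation.js API; purely illustrative.

type Listener = () => void;

interface Reactive { subscribe(l: Listener): void; }

class Base<T> implements Reactive {
  private listeners: Listener[] = [];
  constructor(private value: T) {}
  get(): T { return this.value; }
  set(value: T): void {
    this.value = value;
    this.listeners.forEach((l) => l());
  }
  subscribe(l: Listener): void { this.listeners.push(l); }
}

class Derived<T> implements Reactive {
  private value: T;
  private listeners: Listener[] = [];
  constructor(inputs: Reactive[], private compute: () => T) {
    this.value = compute();
    // When any input changes, recompute and notify downstream values.
    inputs.forEach((input) => input.subscribe(() => {
      this.value = this.compute();
      this.listeners.forEach((l) => l());
    }));
  }
  get(): T { return this.value; }
  subscribe(l: Listener): void { this.listeners.push(l); }
}

// Usage: a message list and a derived count that stays in sync by itself.
const messages = new Base<string[]>([]);
const count = new Derived([messages], () => messages.get().length);
messages.set(["hello"]); // count.get() is now 1, with no manual refresh
```

The point is that the dependency from `count` to `messages` is declared once, in the derivation itself, so there is no second place for the refresh logic to fall out of date.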

To build a full application, you create base reactive values on the server, construct derived values from them, mirror them to the browser (via WebSockets), derive further values there, and show the results in a UI. It's reactive end-to-end, from the database to the user's eyeballs.

My hypothesis is that, with this structure in place, LLMs will be able to produce working collaborative web applications much more easily. More easily than if they were to try to do it in C, obviously, but also much more easily than in a high level language.

Early indications are that it works. I've already had Claude one-shot a working chat application using my template (which for the time being is private).

So, watch this space.