*October 25, 2023*

# AI Firemen, Tinkerers

![[stokers.jpg]]

> [Wikipedia](https://en.wikipedia.org/wiki/Fireman_(steam_engine)): A **fireman**, **stoker** or **watertender** is a person whose occupation it is to tend the fire for the running of a [boiler](https://en.wikipedia.org/wiki/Boiler "Boiler"), heating a building, or powering a [steam engine](https://en.wikipedia.org/wiki/Steam_engine "Steam engine")

If large language models are engines, we are burning coal. We have yet to narrow down what exactly *can* be done with a transformer, let alone how it works or how to optimize its process. [This comment](https://news.ycombinator.com/item?id=35377828) left on [[The Cambrian Period of AI]] captures it perfectly:

<div class="hn-comment"><div class="hn-header"> <span class="hn-username">gwd</span> <span class="hn-timestamp">2023-03-30</span> </div><div class="hn-body">Your description of "dumb AI" as being "just useful enough to maintain a flow of funding" reminds me of A Collection of Unmitigated Pedatry's description of the start of the Industrial Revolution [1]:<br><br> &gt; The specificity matters here because _each innovation in the chain required not merely the discovery of the principle, but also the design and an economically viable use-case to all line up in order to have impact_. The steam engine is an excellent example of this problem. Early tinkering with the idea of using heat to create steam to power rotary motion – the core function of a steam-engine – go all the way back to Vitruvius (c. 80 BC -15 AD) and Heron of Alexandria (c. 10-70 AD). 
With the benefit of hindsight we can see they were tinkering with an importance principle but the devices they actually produced – the aeolipile – had no practical use – it’s fearsomely fuel inefficient, produces little power and has to be refilled with water (that then has to be heated again from room temperature to enable operation).<br><br> He goes on to say that the very first "commercial" steam engine only happened to be commercially useful because of the particular situation in which it was invented: England had cut down most of their trees, but luckily had lots of coal. The engine wasn't quite useful enough to use to (say) pump water out of an iron mine, because it was so resource-hungry that it the cost of the fuel _plus_ getting fuel _to_ the engine would be too much. But it's _just barely_ useful enough to pump water out of a _coal_ mine, if you can provide it coal without having to transport it. And that gave it just enough of a commercial toe-hold to fund its optimization.<br><br> It sounds like "dumb AI" of the 2000's may have performed a similar function; and we are, perhaps, on the edge of the "AI revolution", similar to the Industrial Revolution, where we reach a hockey-stick proliferation.<br><br> EDIT: Fixed link<br><br> [1] <a href="https://acoup.blog/2022/08/26/collections-why-no-roman-industrial-revolution/comment-page-1/">[https://acoup.blog/2022/08/26/collections-why-no-roman-indus...]</a> </div> </div>

From an engineering point of view, a major facet of the problem is how we look at the machinery. The transformer architecture is often described in very specific language, suggestive of what its inventors had in mind. In truth, it doesn't seem to work for the reasons it was originally expected to. We use language like "queries", "keys", and "attention" because we need a vocabulary of some kind to stay on the same page. But many of these preconceived intuitions probably stand in the way of our understanding.
In all likelihood, our view is still very narrow. To build a *working* engine, you only need to grasp a tiny subset of the principles necessary for an *efficient* engine. If you don't understand thermodynamics, the only way to improve an engine is to tinker ^[The development of thermodynamics/statistical physics is intimately tied to the development of the engine; see [here](https://en.wikipedia.org/wiki/Thermodynamics#History)]. You change things here and there and see if it gets better or worse. The rate of improvement is limited to the rate at which you stumble into improvements.

To call ourselves AI "engineers" is to overstate what we understand. Realistically, we are AI "Firemen". To run a steam engine, the fireman's job is to keep the fire burning and the engine turning. It is mostly hard manual labour and requires little understanding of how the engine works. We shovel data into the furnace, sit back, and wait to see if the machine does what we want it to. If it doesn't, we turn a few knobs, change the fuel, and try again.

We graduate when we discover the equivalent of thermodynamics; that is when we earn the *engine* in engineer. The scope of what can be done with AI, and the precision with which it is used, will become contextualized and designable. For the time being, we'll be tinkerers and firemen.