In 1965, in an obscure technical journal, a young electronics engineer made a striking claim:
“The complexity for minimum component costs has increased at a rate of roughly a factor of two per year… there is no reason to believe it will not remain nearly constant for at least ten years.”1
What he meant was that the number of transistors you could fit on a semiconductor chip - which roughly translates to the chip’s computational power - would double every year. (He later revised this to every two years.)
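If you like numbers, here’s the toy version of that arithmetic (Python, purely illustrative - the 1965 starting count is a round number I’ve assumed, not a historical figure):

```python
# Compound doubling under Moore's observation.
def transistor_count(start, years, doubling_period):
    """Components per chip after `years` of compound doubling."""
    return start * 2 ** (years / doubling_period)

start_1965 = 64  # roughly the scale Moore was writing about; an assumption
print(transistor_count(start_1965, 10, 1))  # doubling yearly: 64 * 2**10
print(transistor_count(start_1965, 10, 2))  # every two years: 64 * 2**5
```

Either way, the curve is exponential - which is the whole allure.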
The author was Gordon Moore, who went on to co-found the semiconductor giant Intel. When a journalist stumbled upon the article a few years later, it went a kind of pre-internet viral over several years: it was dubbed Moore’s Law, and it became deeply enmeshed in the foundational mythos of Silicon Valley.
Over the decades, Moore’s Law became synonymous with Silicon Valley’s prodigious technological advances.
As Silicon Valley companies grew in wealth and influence, Moore’s Law was buttressed by other laws, such as Metcalfe’s Law2, more commonly known as the “Network Effect”, which posits that the value of a network is the number of nodes squared. So a network of 2 telephones has a value of 4, but a network of 3 telephones has a value of 9 - and so on.
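In code, the claim is one line (a sketch; the “value” units are entirely arbitrary, which is rather the point):

```python
# Metcalfe's Law as stated: network value grows with the square of its nodes.
def metcalfe_value(nodes: int) -> int:
    return nodes ** 2

for n in (2, 3, 10, 1_000_000):
    print(f"{n:>9,} nodes -> value {metcalfe_value(n):,}")
# 2 -> 4, 3 -> 9, 10 -> 100, and a million nodes -> a trillion units of "value"
```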
And then there is the “Power Law”3, borrowed from mathematics and adapted (to put it mildly) by venture capitalists to explain how, across a spread of risky investments, 10 per cent of them will provide 90 per cent of the gains.
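You can conjure that distribution on demand, which is worth remembering. A hedged sketch: draw returns from a heavy-tailed (Pareto) distribution - my assumption, not the VCs’ actual data - and the top decile duly dominates:

```python
import random

random.seed(0)
# 100 hypothetical investments with Pareto-distributed (heavy-tailed) returns.
returns = sorted((random.paretovariate(1.2) for _ in range(100)), reverse=True)
top_decile_share = sum(returns[:10]) / sum(returns)
print(f"Top 10% of investments -> {top_decile_share:.0%} of total returns")
# With a tail this heavy, the top decile typically captures most of the gains.
```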
These laws are repeated so often that they have become primordial forces in the Silicon Valley universe; so utterly immutable and incontestable are they that all we mere humans can do - if we are smart - is meekly obey them.
Show me the data
These laws also feed into another dimension of the Silicon Valley mythos: that of a ruthlessly data-driven culture, where all decisions are rigorously optimised through empirical evidence. Tech, we are frequently told, is a world of hard-nosed rationality, where sentiment and feelings have no place.
But it turns out that Silicon Valley’s interpretation of rational data-informed decision making is surprisingly elastic – as indeed is its understanding of laws.
There is no law that determines the pace of technological development of microchips; Moore was simply making an observation. And Moore’s observation, once miraculously transformed into a law, became a self-fulfilling - and of course self-validating - prophecy: Moore, as CEO of Intel, alongside other manufacturers, planned research and investment specifically to hit the next doubling target defined by his eponymous law.
In another universe somewhere they call this a circular economy.
Likewise, the Network Effect is an entirely arbitrary way to ascertain value. It sounds plausible, but it became a self-fulfilling mantra as investors ascribed a network-effect value to social media platforms.
They invested accordingly.
And that’s how we ended up with Facebook.
To be fair, Facebook does indeed have a significant value on the stock market.
But ask yourself: how much value does Facebook have for you?
And the Power Law simply describes venture capitalists’ assumptions about their own investments; there is no science there, just a lot of confirmation bias.
Technology Magic
These are not so much laws then, but rather more like “Lore”; they are folk beliefs held to be true by a given culture, and beliefs that fit into – and by doing so reinforce – a larger value system.
It’s what you might call reinforcement feedback learning.
The bigger value system is critical; Moore’s Law became a byword for Silicon Valley’s predilection for exponential growth - not just of computing power, but of revenues, profits, total addressable markets, and the all-important “market capitalisation”.
Not coincidentally, The Network Effect and The Power Law also amplified and fetishised growth as a universal and quasi-spiritual goal for the entire technology industry.
Reid Hoffman, co-founder of LinkedIn and one of the “PayPal Mafia” alongside Elon Musk and Peter Thiel, famously called the tech industry’s fixation on growth “Blitzscaling”4, as he explained to wide-eyed Stanford students in 2015. In Hoffman’s view, internet technologies had harnessed these mystical and irresistible forces for the obvious betterment of humanity. (Or at least for Y Combinator fellows, which amounted to much the same thing.)
Some people might argue that Hoffman’s blitzscaling was essentially a function of easy access to vast pools of unregulated venture capital, plus the radical relaxation of anti-trust enforcement. In what other era, these people might ask, would Amazon’s 15 or so years of losing billions, or Uber’s predatory pricing, have escaped legal sanction?
But of course these people could only be those who had somehow not read the laws - or lore - of Silicon Valley.
In recent years the gravitational pull of Moore’s Law on Silicon Valley’s self-aggrandizing mythos has been on the wane. Moore’s Intel is now a husk of its former self, left in the silicon dust of contract foundries like TSMC and Samsung, which now make virtually all the cutting-edge chips sold by companies like Apple and Nvidia.
Indeed, Nvidia’s CEO Jensen Huang declared in 2019 that “Moore’s law is dead”.5 He should probably know.
But as Moore’s Law is quietly discarded like an iPhone with no more security updates, and the world is seemingly mesmerised by Silicon Valley’s wondrous new AI, another law has fortuitously emerged to take its place: the Scaling Law.
Set out as far back as 2020 in research papers by leading AI researchers (one of them Dario Amodei, co-founder of Anthropic)6, the Scaling Laws claim that as an AI system’s model size, dataset size and compute are increased, the AI will improve with striking regularity.
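For the curious, the empirical form reported in that paper is a trio of power laws - test loss falls smoothly as parameters N, dataset size D, or compute C grow (exponents quoted approximately from the paper):

```latex
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N},\quad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D},\quad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}
% fitted exponents reported as roughly: alpha_N ~ 0.076, alpha_D ~ 0.095, alpha_C ~ 0.05
```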
Not only that, but as models’ size and computation are increased, they show signs of “emergent” behaviours; that is, they acquire unforeseen capabilities that more closely resemble human - or maybe even superhuman - intelligence.
The Scaling Law is rather like Moore’s, Metcalfe’s and the Power Law all rolled into one; it calls for growth above all, and at all costs.
And because the scaling law requires the building of infrastructure at such incredible scale, only a handful of companies with limitless reserves of cash can possibly compete.
So it is the scaling law that all AI companies are now religiously following. And they have the air of zealots about them.
You may have noticed. It’s a little disquieting.
Take, for example, Sam Altman’s recent musings on the matter:
“A defining characteristic of the Intelligence Age will be massive prosperity.
Although it will happen incrementally, astounding triumphs – fixing the climate, establishing a space colony, and the discovery of all of physics – will eventually become commonplace. With nearly-limitless intelligence and abundant energy – the ability to generate great ideas, and the ability to make them happen – we can do quite a lot.”7
(Helpfully, for this article at least, Sam also proclaimed a “Moore’s Law for Everything” a few years back too.)8
Strict adherence to the scaling laws, plainly, can only result in an even greater concentration of networked digital power.
The wealthiest companies the world has ever known are now betting pretty much everything they own on scaling up their AI systems to unimaginable size. In doing so, they are facing three bottlenecks:
The first is getting hold of the advanced GPU chips that power AI systems. Moore’s Law might be dead, but the chips are still getting more complex - and a lot more expensive. Nvidia, which supplies 90 per cent of them, has ramped up production at TSMC, its main producer, to 6 million a year; the big tech companies - and fossil-rich Gulf states (that’s a whole other newsletter) - are buying up virtually all of them. If you want to set up your own AI model these days, you’ll need to rent your chips off them - and it’s generally assumed you’ll need a billion dollars to spare.
Secondly, the AI labs are running out of data to feed their models. They have scraped practically the entire internet, they have illegally plundered copyrighted archives, and they still need more; so now they are creating “synthetic data” to feed their models - running the very acute risk of “model collapse”, with potentially dire consequences for our information ecology.
The third bottleneck has been well covered recently: energy - there simply isn’t enough out there to power these gargantuan systems. AI is causing the US grid to melt, as it will many other grids before long. To get an extra 800MW of power, Microsoft recently announced a deal to restart Three Mile Island, the US nuclear power station infamous for its 1979 accident, which shut down in 2019. But even that won’t be enough. Microsoft plans to build a 5GW data centre; that’s more than twice as much power as Berlin draws.
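The back-of-envelope version (my own arithmetic; the Berlin figure is an assumed round number, not a sourced statistic):

```python
# Rough annualised output of a 5 GW data centre vs a large city's consumption.
datacentre_gw = 5.0
twh_per_year = datacentre_gw * 24 * 365 / 1000   # GW running all year -> TWh
berlin_twh = 13.0  # assumed ballpark for Berlin's annual electricity use
print(f"{twh_per_year:.1f} TWh/yr, about {twh_per_year / berlin_twh:.1f}x Berlin")
# ~43.8 TWh/yr -- comfortably "more than twice" Berlin, on these assumptions
```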
This is energy we simply don’t have – unless we burn more and more fossil fuels.
That is why AI data centres are now the fastest growing source of carbon emissions in Europe and North America.
There is a fourth potential constraint, and that is capital. Obviously. But as the big tech companies have essentially unlimited piles of it, it’s not slowing them down so far. That said, investors are getting nervous about where all that money’s going, so that may change soon.
Yes, but….
But are the scaling laws any more reliable than the other Silicon Valley laws?
Arvind Narayanan and Sayash Kapoor, authors of the recently published AI Snake Oil (good so far - not yet finished), argue not:
“What exactly is a “better” model? Scaling laws only quantify the decrease in perplexity, that is, improvement in how well models can predict the next word in a sequence. Of course, perplexity is more or less irrelevant to end users”9
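(For reference: perplexity is just the exponentiated average negative log-probability a model assigns to each next token - my sketch of the standard definition, not the authors’ own formulation:)

```latex
\mathrm{PPL}(x_{1:T}) = \exp\!\left(-\frac{1}{T}\sum_{t=1}^{T}\log p_\theta\left(x_t \mid x_{<t}\right)\right)
```

Lower is better; it says nothing directly about usefulness.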
What really matters is the emergent behaviours; and on that point they continue:
“Emergence is not governed by any law-like behaviour.”
And we know that to be true; ChatGPT was barely any improvement on GPT-3 in scaling terms - but plainly, it crossed the yawning chasm from GPT-3’s obviously machine-made responses to convincingly human-like ones.
Like Moore’s Law before it, the Scaling Law is far more like “Lore”: a folk belief held in an AI industry that already ascribes alarmingly mystical qualities to the systems it is building. And that is largely because the labs essentially have no idea how the models they build actually work.
So how are they hoping to keep these ineffably huge machines from going off the rails? Mostly they use a method called “Reinforcement Learning from Human Feedback” (RLHF).
In other words, they get humans - normally very badly paid humans - to rate how close the model’s answers are to what a human would say. And then the AI system learns from that.
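At its core this is preference learning. A minimal sketch, with toy “features” and numbers I’ve invented for illustration (real systems train a large neural reward model, not three weights):

```python
import math

# Toy reward model: a weight vector scoring an answer's "features",
# trained on human preference pairs with a Bradley-Terry / logistic loss --
# the reward-modelling step at the heart of RLHF.
w = [0.0, 0.0, 0.0]

def reward(features):
    return sum(wi * fi for wi, fi in zip(w, features))

# Each pair: (features of the answer the rater preferred, the other answer).
preferences = [
    ([1.0, 0.2, 0.0], [0.1, 0.9, 0.5]),
    ([0.8, 0.1, 0.1], [0.2, 0.7, 0.6]),
]

lr = 0.1
for _ in range(200):
    for chosen, rejected in preferences:
        # Model's probability of agreeing with the human rater.
        p = 1 / (1 + math.exp(reward(rejected) - reward(chosen)))
        # Gradient ascent on the log-likelihood of the human's choice.
        for i in range(len(w)):
            w[i] += lr * (1 - p) * (chosen[i] - rejected[i])

print([round(x, 2) for x in w])  # weights drift toward whatever raters rewarded
```

Note what the loop optimises: agreement with the raters. Nothing in it asks whether the raters - or the features they are shown - are right.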
But the problem with this is exactly the same as with the laws discussed above: if we’re working with systems based on belief rather than evidence, then the systems themselves are essentially reflections of the developers’ own values and assumptions - what data is valuable and what is not, for example.
If humans are feeding back, there is a significant risk that, rather than serving as a check on the system’s values and assumptions, they will instead validate the developers’ values and assumptions - and, of course, reinforce them.
So AI companies’ strict adherence to the Scaling Law actually looks a lot more like blind faith than anything grounded in empirical evidence. Microsoft CEO Satya Nadella more or less confirmed this on a recent earnings call:
“The risk of under-investing in AI is dramatically greater than the risk of over-investing,”10
He said that when pressed by investors, even though he could offer no obvious pathway to future revenues.
As some highly regarded tech analysts have recently observed11, that sounds strikingly similar to Pascal’s Wager, a 17th century thought experiment in which Pascal claimed it was entirely rational to believe in God because:
if you believe in God and you’re wrong, you may have wasted time and money and other earthly possessions, but…
if you don’t believe in God and you’re wrong, then you’ll burn in hell for eternity.
Of course, burning in hell in this context would be another company’s AI model reaching some kind of superintelligence - and naturally its first priority would be crushing Microsoft into some kind of irrelevant e-waste, with no end-of-life support.
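The wager’s arithmetic is easy to sketch (the symbols are mine, purely illustrative): let p be the probability that a rival reaches superintelligence, and C the finite cost of over-investing; being left behind counts as an unbounded loss.

```latex
\mathbb{E}[\text{invest}] \ge -C \quad (\text{finite}), \qquad
\mathbb{E}[\text{abstain}] = p\cdot(-\infty) + (1-p)\cdot 0 = -\infty \ \ \text{for any } p > 0
```

Any non-zero p makes investing look “rational” - which is exactly why the wager works regardless of the evidence.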
If this is the rationale, then it’s hardly surprising that more than a few investors are looking a little spooked right now.
These companies have no real idea whether what they are doing is actually going to pay off - much less reach superintelligence - but they are betting their entire companies on it.
Indeed the entire global economy is betting on it (the parts with capital, that is).
And as it happens, they are betting the planet on it too.
Most people in AI wave away these issues with talk of nuclear, geothermal or fusion energy - none of which is realistic, for a range of reasons.
However, Google’s former CEO Eric Schmidt was altogether more honest about this problem last week:
“My own opinion is that we’re not going to hit the climate goals anyway because we are not organized to do it and yes the needs in this area [AI] will be a problem. But I’d rather bet on AI solving the problem than constraining it.”
Schmidt said out loud what many in AI are actually thinking.
We can be reasonably sure that’s pretty much what Sam Altman also meant when he talked of AI “fixing the climate” earlier.
And that gives us a pretty good understanding of where Silicon Valley lore on infinite growth is currently taking us.
So maybe now is the time to think about alternative - and more evidence-based - ways to meet our current predicament.
Thanks for reading this far…
And I’d love to hear your thoughts….
And if you want still more - here are some links I found interesting from the last few days…
From climate change to landfill, AI promises to solve Earth’s big environmental problems – but there’s a hitch (The Conversation)
The climate protesters who threw soup at a van Gogh painting. (And why they won’t stop.) (Politico)
How Self-Driving Cars Get Help From Humans Hundreds of Miles Away (New York Times)
AI coding assistants do not boost productivity or prevent burnout, study finds (Techspot)
Bill Gates, Big Agriculture and the fight for the future of Africa’s farmland (The Continent)
oh… and for more on Pascal’s Wager and a seriously deep dive into semiconductors, please watch below…
Cramming more components onto integrated circuits, Gordon E. Moore (Electronics, Volume 38, Number 8, April 19, 1965)
Metcalfe's law (Wikipedia)
Understanding the VC Power Law: Why Fund Size Matters in Venture Capital Returns, Kelly Perdew (Medium.com)
Blitzscaling 01: Overview of the Five Stages of Blitzscaling, Reid Hoffman (YouTube)
Scaling Laws for Neural Language Models, Jared Kaplan et al. (arXiv)
The Intelligence Age, Sam Altman
Moore's Law for Everything, Sam Altman
AI scaling myths (Arvind Narayanan and Sayash Kapoor, AI Snake Oil Substack)