The large language models that power AI chatbots are advancing so quickly that the computing power they need to reach a given level of performance halves roughly every eight months, while the chips themselves are making far more modest progress.
There are two ways to improve the performance of AI systems, notes MIT researcher Tamay Besiroglu: increase the size of large language models, which requires a commensurate increase in computing power at a time when AI hardware is in short supply, or optimize the underlying algorithms to make more efficient use of existing hardware. Developers of large language models currently seem to prefer the second approach.
The researchers analyzed 231 large language models developed between 2012 and 2023 and found that the computing power required to reach a fixed level of performance dropped by half every eight months on average. That is significantly faster than Moore's Law, the empirical observation that the number of transistors on a chip (a rough proxy for its performance) doubles every 18 to 24 months. The researchers attribute part of this gain to code optimization, although its exact contribution cannot be determined because AI algorithms often cannot be examined directly. Improvements in hardware, of course, also played a role.
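To get a feel for the size of that gap, here is a back-of-the-envelope sketch (not from the study itself) that converts both trends into annual growth factors, assuming simple exponential compounding with fixed doubling and halving times:

```python
# Rough comparison of the two growth rates described above.
# Assumption: both trends compound exponentially at a fixed doubling rate.

def annual_factor(doubling_months: float) -> float:
    """Growth factor per year for a quantity that doubles every `doubling_months`."""
    return 2 ** (12 / doubling_months)

# Algorithmic efficiency: required compute halves every ~8 months,
# which is equivalent to effective compute doubling every 8 months.
algorithmic = annual_factor(8)    # ~2.83x per year

# Moore's Law: transistor counts double every 18 to 24 months.
moore_fast = annual_factor(18)    # ~1.59x per year
moore_slow = annual_factor(24)    # ~1.41x per year

print(f"algorithms: {algorithmic:.2f}x/year, "
      f"hardware: {moore_slow:.2f}-{moore_fast:.2f}x/year")
```

Under these assumptions, algorithmic progress delivers roughly a 2.8x efficiency gain per year, against roughly 1.4x to 1.6x per year from hardware alone.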
The difference between the two rates is an indicator of how effectively developers of large language models use the resources available to them. Algorithms cannot be optimized endlessly, Besiroglu believes, and it is not clear whether this pace of improvement will continue in the long term. There is also a concern that making models more efficient could, paradoxically, increase the AI industry's overall energy consumption, since cheaper computation tends to invite more use, so it would be a mistake to focus on one aspect and ignore the rest, the researchers warn.