
Running AI models is turning into a memory game

uip5
Educational Expenses

When we talk about the cost of AI infrastructure, the focus is usually on Nvidia and GPUs — but memory is an increasingly important part of the picture. As hyperscalers prepare to build out billions of dollars’ worth of new data centers, the price for DRAM chips has jumped roughly 7x in the last year.


At the same time, there’s a growing discipline in orchestrating all that memory to make sure the right data gets to the right agent at the right time. The companies that master it will be able to make the same queries with fewer tokens, which can be the difference between folding and staying in business.


Semiconductor analyst Doug O’Laughlin has an interesting look at the importance of memory chips on his Substack, where he talks with Val Bercovici, chief AI officer at Weka. They’re both semiconductor guys, so the focus is more on the chips than the broader architecture, but the implications for AI software are pretty significant too.

With prompt caching, the question is how long Claude holds your prompt in cached memory: You can pay for a 5-minute window, or pay more for an hour-long window. It’s much cheaper to draw on data that’s still in the cache, so if you manage it right, you can save an awful lot. There is a catch, though: Every new bit of data you add to the query may bump something else out of the cache window.
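To make the trade-off concrete, here is a rough Python sketch of the arithmetic. Everything in it is an assumption for illustration: the price multipliers are placeholders rather than quoted rates, the names `CachePricing` and `prefix_cost` are hypothetical, and it assumes that each cache hit restarts the window. It also ignores the eviction problem described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachePricing:
    """Illustrative multipliers relative to the normal input-token price (assumed values)."""
    write_multiplier: float  # cost factor when the prefix is (re)written to the cache
    read_multiplier: float   # cost factor when the prefix is read from the cache
    ttl_seconds: int         # how long an entry survives without being used


def prefix_cost(prefix_tokens, request_times, pricing, base_price_per_token=1.0):
    """Total cost of sending the same prompt prefix at each time in request_times.

    request_times are seconds in ascending order. A request that arrives within
    ttl_seconds of the previous one counts as a cache hit and restarts the window
    (an assumption of this sketch); otherwise the prefix is written again.
    """
    total = 0.0
    last_use = None
    for t in request_times:
        hit = last_use is not None and (t - last_use) <= pricing.ttl_seconds
        multiplier = pricing.read_multiplier if hit else pricing.write_multiplier
        total += prefix_tokens * base_price_per_token * multiplier
        last_use = t
    return total


# Placeholder pricing tiers: a short, cheaper-to-write window and a long, pricier one.
SHORT_WINDOW = CachePricing(write_multiplier=1.25, read_multiplier=0.1, ttl_seconds=5 * 60)
LONG_WINDOW = CachePricing(write_multiplier=2.0, read_multiplier=0.1, ttl_seconds=60 * 60)
NO_CACHE = CachePricing(write_multiplier=1.0, read_multiplier=1.0, ttl_seconds=0)

# Four requests: two close together, a 13-minute gap, then two more close together.
times = [0, 120, 900, 1000]
costs = {
    "short": prefix_cost(1000, times, SHORT_WINDOW),
    "long": prefix_cost(1000, times, LONG_WINDOW),
    "none": prefix_cost(1000, times, NO_CACHE),
}
print(costs)
```

Under these made-up numbers, the short window pays for a second cache write after the 13-minute gap, while the long window pays more up front and then only pays for reads. Which one wins depends entirely on how bursty the traffic is, which is why managing the window matters.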


This is complex stuff, but the upshot is simple enough: Managing memory in AI models is going to be a huge part of AI going forward. Companies that do it well are going to rise to the top.


And there is plenty of progress to be made in this new field. Back in October, I covered a startup called Tensormesh that was working on one layer in the stack known as cache optimization.

