LLM Inference Speed (Tech Deep Dive) | Thinking Machines: AI & Philosophy podcast

Tags: Tech, Machine Learning, Artificial Intelligence, Society, Philosophy, Daniel Reid Cahn, MLOps

Content provided by Daniel Reid Cahn. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Daniel Reid Cahn or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ro.player.fm/legal.

Published 1y ago · 39:36

In this tech talk, we dive deep into the technical specifics around LLM inference.

The big question is: Why are LLMs slow? How can they be faster? And might slow inference affect UX in the next generation of AI-powered software?

We jump into:

Is fast model inference the real moat for LLM companies?
What are the implications of slow model inference on the future of decentralized and edge model inference?
As demand rises, what will the latency/throughput tradeoff look like?
What innovations on the horizon might massively speed up model inference?

23 episodes
