Content provided by Sequoia Capital. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Sequoia Capital or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://ro.player.fm/legal.

Fireworks Founder Lin Qiao on How Fast Inference and Small Models Will Benefit Businesses

39:18
 

In the first wave of the generative AI revolution, startups and enterprises built on top of the best closed-source models available, mostly from OpenAI. The AI customer journey moves from training to inference, and as these first products find product-market fit (PMF), many are hitting a wall on latency and cost.

Fireworks Founder and CEO Lin Qiao led the PyTorch team at Meta that rebuilt the whole stack to meet the complex needs of the world’s largest B2C company. Meta moved PyTorch to its own non-profit foundation in 2022, and Lin started Fireworks with the mission to compress the timeframe of training and inference and to democratize access to GenAI beyond the hyperscalers, letting a diversity of AI applications thrive.

Lin predicts when open and closed source models will converge and reveals her goal to build simple API access to the totality of knowledge.

Hosted by: Sonya Huang and Pat Grady, Sequoia Capital

Mentioned in this episode:

  • PyTorch: the leading framework for building deep learning models, originated at Meta and now part of the Linux Foundation umbrella
  • Caffe2 and ONNX: ML frameworks Meta used that PyTorch eventually replaced
  • Conservation of complexity: the idea that every computer application has inherent complexity that cannot be reduced but merely moved between the backend and frontend, originated by Xerox PARC researcher Larry Tesler
  • Mixture of Experts: a class of transformer models that route requests between different subsets of a model based on use case
  • Fathom: a product the Fireworks team uses for video conference summarization
  • LMSYS Chatbot Arena: crowdsourced open platform for LLM evals hosted on Hugging Face

00:00 - Introduction

02:01 - What is Fireworks?

02:48 - Leading PyTorch

05:01 - What do researchers like about PyTorch?

07:50 - How Fireworks compares to open source

10:38 - Simplicity scales

12:51 - From training to inference

17:46 - Will open and closed source converge?

22:18 - Can you match OpenAI on the Fireworks stack?

26:53 - What is your vision for the Fireworks platform?

31:17 - Competition for Nvidia?

32:47 - Are returns to scale starting to slow down?

34:28 - Competition

36:32 - Lightning round
