Content provided by Nathan Lambert. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Nathan Lambert or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://ro.player.fm/legal.

Interviewing Sebastian Raschka on the state of open LLMs, Llama 3.1, and AI education

1:03:42

Manage episode 431929127 series 3590272
This week, I had the pleasure of chatting with Sebastian Raschka. Sebastian is doing a ton of work on the open language model ecosystem and AI research broadly. He’s been writing the great Ahead of AI newsletter (which has the biggest audience overlap with Interconnects, at 26%, so a lot of you know him) and multiple educational books, all on top of being a full-time machine learning engineer at Lightning AI, where he maintains LitGPT, which he described as being like Karpathy’s NanoGPT, with slightly more abstractions.

This conversation mostly covers keeping up with AI research, the state of the open LLM ecosystem post-Llama 3.1, and many narrow topics in between. I learned that Sebastian used to be an arXiv moderator, which gives some color on how arXiv works and how he sifts through thousands of papers. We cover a lot of ground here, so I hope you enjoy it.

00:00:00 Introduction & Sebastian's background
00:04:28 The state of deep learning and language models in 2018
00:08:02 Sebastian's work at Lightning AI and LitGPT
00:12:23 Distillation and its potential in language model training
00:14:14 Implementing language models and common pitfalls
00:18:45 Modern architectures: Mixture-of-experts models, early vs. late fusion multimodal
00:24:23 Sebastian's book on building language models from scratch
00:27:13 Comparing ChatGPT, Claude, and Google's Gemini for various tasks
00:38:21 Vibing and checking new language models during implementation
00:40:42 Selecting papers to read and moderating arXiv
00:45:36 Motivation for working on AI education
00:52:46 Llama 3 fine-tuning
00:57:26 The potential impact of AI on jobs in writing and education
01:00:57 The future directions of AI

More details: https://www.interconnects.ai/interviewing-sebastian-raschka


58 episodes
