
Content provided by Ronak Nathani and Guang Yang. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Ronak Nathani and Guang Yang or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined at https://ro.player.fm/legal.

Todd Underwood - On lessons from running ML systems at Google for a decade, what it takes to be a ML SRE, challenges with generalized ML platforms and much more - #10

1:07:34
 

Todd is a Sr Director of Engineering at Google, where he leads Site Reliability Engineering teams for Machine Learning. Having recently presented on how ML breaks in production, based on an examination of more than a decade of outage postmortems at Google, Todd joins the show to chat about why many of the ways that ML systems break in production have nothing to do with ML, what’s different about engineering reliable systems for ML versus traditional software (and the many ways in which they are similar), what he looks for when hiring ML SREs, and more.


55 episodes
