Streaming Data Into The Lakehouse With Iceberg And Trino At Going
In this episode, I had the pleasure of speaking with Ken Pickering, VP of Engineering at Going, about the intricacies of streaming data into a Trino and Iceberg lakehouse. Ken shared his journey from product engineering to becoming deeply involved in data-centric roles, highlighting his experiences in ecommerce and InsurTech. At Going, Ken leads the data platform team, focusing on finding travel deals for consumers, a task that involves handling massive volumes of flight data and event stream information.
Ken explained the dual approach of passive and active search strategies used by Going to manage the vast data landscape. Passive search involves aggregating data from global distribution systems, while active search is more transactional, querying specific flight prices. This approach helps Going sift through approximately 50 petabytes of data annually to identify the best travel deals.
We delved into the technical architecture supporting these operations, including the use of Confluent for data streaming, Starburst Galaxy for transformation, and Databricks for modeling. Ken emphasized the importance of an open lakehouse architecture, which allows for flexibility and scalability as the business grows.
Ken also discussed the composition of Going's engineering and data teams, highlighting the collaborative nature of their work and the reliance on vendor tooling to streamline operations. He shared insights into the challenges and strategies of managing data life cycles, ensuring data quality, and maintaining uptime for consumer-facing applications.
Throughout our conversation, Ken provided a glimpse into the future of Going's data architecture, including potential expansions into other travel modes and the integration of large language models for enhanced customer interaction. This episode offers a comprehensive look at the complexities and innovations in building a data-driven travel advisory service.
454 episodes
All episodes
CSVs Will Never Die And OneSchema Is Counting On It (54:40)
Breaking Down Data Silos: AI and ML in Master Data Management (57:30)
Building a Data Vision Board: A Guide to Strategic Planning (49:59)
How Orchestration Impacts Data Platform Architecture (59:39)
An Exploration Of The Impediments To Reusable Data Pipelines (51:32)
The Art of Database Selection and Evolution (59:56)
Bridging Code and UI in Data Orchestration with Kestra (44:30)
Streaming Data Into The Lakehouse With Iceberg And Trino At Going (39:49)
An Opinionated Look At End-to-end Code Only Analytical Workflows With Bruin (56:11)
Feldera: Bridging Batch and Streaming with Incremental Computation (47:36)
Accelerate Migration Of Your Data Warehouse with Datafold's AI Powered Migration Agent (48:50)
Bring Vector Search And Storage To The Data Lake With Lance (58:01)
The Role of Python in Shaping the Future of Data Platforms with DLT (54:08)
Build Your Data Transformations Faster And Safer With SDF (42:36)
Scaling Airbyte: Challenges and Milestones on the Road to 1.0 (57:11)