Technology
Business
RudderStack
Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
Total 400 episodes
The PRQL: Is A/B Testing Only Relevant for B2C?

Eric and Kostas preview their upcoming conversation with Che Sharma of geteppo.com.
03:22 | 11/03/2022
78: The Etymology of Reverse ETL & Why It’s a Key Piece Of The Modern Data Stack with Boris Jabes of Census

Highlights from this week’s conversation include: Boris’ background career journey (2:32); The origins of “reverse ETL” (6:39); Reverse Fivetran (16:35); Product as an experience (22:41); Fivetran users vs Census users (24:14); How to add value to a data dump (26:56); Ways companies are creating IP (33:48); The cascade effect of the modern data stack (37:56); Defining “data federation” (43:51); Lessons from building a product (49:10). The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
01:05:51 | 09/03/2022
The PRQL: Reverse ETL and the Distinction Between Operation vs Analysis on Data

Eric and Kostas preview their upcoming conversation with Boris Jabes of Census.
03:06 | 04/03/2022
77: Standardizing Unstructured Data with Verl Allen of Claravine

Highlights from this week’s conversation include: Verl’s career journey (2:46); M&A data evaluation criteria (7:12); What Claravine does (10:48); The breadth of data (15:03); Adding to content and advertising data (18:22); How Claravine standardizes data (23:53); Designing a data model (25:40); The underlying technologies of building a product (33:43); The main consumer (35:02); Maintaining quality (39:06); Helping solidify definitions (41:37); Implementing Claravine’s model across various companies (44:54); Internal changes’ effect on the model (46:47); Connection brought about by structure (49:19); Applying unstructured context to structured stamping (52:36).
01:00:55 | 02/03/2022
The PRQL: If Everything Is Data, How Can We Make Sense of It All?

Eric and Kostas preview their upcoming conversation with Verl Allen of Claravine.
06:19 | 25/02/2022
76: Why a Data Team Should Limit Its Own Superpowers with Sean Halliburton of CNN

Highlights from this week’s conversation include: Sean’s career journey (3:27); Optimization and localized testing results (7:49); Denying potential access to more data (13:46); Other dimensions data has (18:32); The other side of capturing events (20:55); Data equivalent of API contracts (25:03); SDK restrictiveness for developers (27:40); How to know if you’re still sending the right data (30:38); Debugging that starts in a client of a mobile app (36:08); Communicating about data (38:36); The next phase of tooling (41:49); Advice for aspiring managers (45:21).
51:43 | 23/02/2022
The PRQL: How Important Is the Human Factor When Working With Data?

Eric and Kostas preview their upcoming show with Sean Halliburton of WarnerMedia.
04:00 | 18/02/2022
75: How To Become a Data Engineer with Parham Parvizi of the Data Stack Academy

Highlights from this week’s conversation include: Par’s background and current role (2:48); About Talend (6:46); Nonlinear pathways to data engineering roles (11:08); What a data engineer needs to be successful (17:37); Before “data engineer” was a title (27:59); Signs you should be a data engineer (32:39); Curiosity and data engineering (38:31); Defining the modern data stack (45:07); How to get a feel for data engineering (52:52).
58:56 | 16/02/2022
The PRQL: Can We Define the Role of the Data Engineer (Yet)?

In this PRQL, Eric and Kostas preview their upcoming conversation with Parham Parvizi of tura.io.
03:48 | 11/02/2022
74: Kostas Respawns at Starburst, is Interviewed by Eric, and Reminisces About Winamp

Highlights from this week’s conversation include: Big News: podcast hits, Kostas’ career change (2:19); Kostas’ career start in data pipelines (4:09); The Winamp and Napster era (11:46); Starting an API gateway (16:56); Observing new technology from afar (23:43); Starting Blendo (32:38); Problems faced in architecting the product (37:12); Kostas’ role at Starburst (40:25).
44:59 | 09/02/2022
The PRQL: What Prompts a Conversation About Winamp & Quake Arena on The Data Stack Show?

Eric and Kostas preview some exciting news coming up on episode 74 of the Data Stack Show.
03:13 | 04/02/2022
73: What a High Performing Data Team (and Stack) Looks Like with Paige Berry of Netlify

Highlights from this week’s conversation include: Paige’s career path (2:44); Paige’s role and responsibilities at Netlify (6:38); Sharing data insights (8:55); Scope in the context of delivering an insight (12:39); Defining “insight” (15:10); Where the client journey begins (16:43); Miscommunication because of vague terminology (20:06); Netlify’s internal knowledge repository (23:01); Breaking down Netlify’s hub and spoke model (30:45); What data tools to use and when (35:21); The metric layer and BI (44:17); Next steps in the data space (49:42).
56:53 | 02/02/2022
The PRQL: How High Performing Data Teams Put Tooling in the Background

This week on the PRQL, Eric and Kostas discuss tooling as they preview the upcoming show with Paige Berry of Netlify.
03:52 | 28/01/2022
72: Building Data Ops Into the Data Lifecycle with Douwe Maan of Meltano

Highlights from this week’s conversation include: Douwe’s career journey (3:04); The missing piece in GitLab’s data tooling (7:35); The open-source offering in the data space (12:38); Singer’s connection with Meltano (22:31); How Meltano manages connectors on a diverse codebase (35:21); The data house side of Meltano (39:47); Data house operating versus Airflow (44:06); Meltano’s vision present today (47:02).
55:02 | 26/01/2022
The PRQL: Is It Viable to Manage Integrations Open Source?

Eric and Kostas preview the upcoming show featuring Douwe Maan of Meltano.
06:00 | 21/01/2022
71: ETL at the Edges with Jimmy Chan of Dropbase

Highlights from this week’s conversation include: Jimmy’s career background (3:01); How to use data cubes (5:52); What Dropbase is and who it is built for (11:01); Getting sales and marketing data in usable formats (16:46); Ensuring data remains flexible and transferable (28:36); Defining what “offline data” is and how to use it (34:09); How Dropbase can work with the rest of the data stack (43:30).
57:23 | 19/01/2022
The PRQL: Is Kostas an Excel Power User Yes/No?

Eric and Kostas preview the upcoming conversation with Jimmy Chan of Dropbase.
05:33 | 14/01/2022
70: The Difference Between Data Lakes and Data Warehouses with Vinoth Chandar of Apache Hudi

Highlights from this week’s conversation include: Vinoth’s career background (3:19); Building a data lake at Uber (6:52); Defining what a data lake is (14:01); How data warehouses differ from data lakes (22:46); When you should utilize an open source solution in your data stack (37:36); Evolving from a data warehouse to a data lake (45:09); Early wins Hudi earned inside of Uber (52:30).
01:00:12 | 12/01/2022
The PRQL: What Old Tech Concepts Were Borrowed to Build the Data Lake House?

Eric and Kostas preview the upcoming show as they talk about data lakes and data warehouses and why these are important.
04:37 | 07/01/2022
69: What is the Modern Data Stack?

Highlights from this week’s conversation include: Panel introductions and backgrounds (2:55); What the modern data stack means to each of our panelists (5:04); Defining the fundamental components of a modern data stack (17:22); How the modern stack drives insights and actions for businesses (28:03); Getting to a uniform definition of the modern stack (33:45); Managing the modernization of a large scale data stack (39:09); How testing works in the dbt context (48:44); The relationship between the data warehouse and the data lake (52:25); What has us most excited for the future of modern data stacks (56:02).
01:03:31 | 05/01/2022
The PRQL: Should Data Trust Drive the Evolution of Your Data Stack?

In this PRQL, Eric and Kostas preview their upcoming show where they discuss the modern data stack with some of the top experts in the industry.
04:30 | 31/12/2021
68: Season Three Recap: Holiday Edition with Eric Dodds and Kostas Pardalis

In this episode, Eric and Kostas look back over the great topics and guests from season three of the Data Stack Show.
25:11 | 29/12/2021
67: Now is the Time to Think About Data Quality with Manu Bansal of Lightup Data

Highlights from this week’s conversation include: Manu’s career background and describing Lightup (2:31); Why traditional tools don’t work for modern data problems (6:04); How a data lake differs from a data warehouse (11:35); Defining data quality (14:07); The business impact of solving and applying data quality (31:36); Constructing a healthy financial view on the impact of data (41:09); How to work with unstructured data in a meaningful way (47:44).
56:11 | 22/12/2021
The PRQL: Will Data Quality Always Require a Human in the Loop?

Eric and Kostas preview the upcoming show by talking about data quality.
04:23 | 21/12/2021
66: How Data Infrastructure Has Evolved and Managing High Performing Data Teams with Srivatsan Sridharan

Highlights from this week’s conversation include: Starting his career on the first-ever data team at Yelp (2:00); How to approach the adoption of new technology (7:04); When to use stream processing vs. batching (11:35); What is a pipeline and why is it core to a data engineer? (14:07); Where a new data scientist should begin their career (19:14); The key factors impacting a new technology decision (27:09); Managing team emotions in decision making (34:25); The unique challenge of Fintech vs other consumer industries (45:03).
50:50 | 15/12/2021
The PRQL: How Would You Define a Data Pipeline? Featuring the RudderStack Eng. Team

On the PRQL this week, Eric and Kostas bring in some of the RudderStack engineering team to discuss data pipelines and preview episode 66 of the Data Stack Show.
05:15 | 10/12/2021
65: Operationalizing Data from the Warehouse With Aayush Jain of Cliff.ai

Highlights from this week’s conversation include: Aayush’s career background (4:13); How his biological sciences academic training impacts his work (8:04); How do we allow dashboards to get messy? (9:35); Building cultural or technical solutions to effective dashboards (15:19); Using data dashboards to make material business improvements (23:19); What is business observability? (32:23); Building a platform for operations teams (43:15); How important community is to the cliff.ai business proposition (41:03).
55:35 | 08/12/2021
The PRQL: Why is the Data Engineer's Role Expanding?

In this show PRQL, Eric and Kostas talk about the evolution of the role of a data engineer and preview the conversation with Aayush Jain.
09:46 | 03/12/2021
64: Data Stack Composability and Commoditization with Michel Tricot of Airbyte

Highlights from this week’s conversation include: Announcement: Data Stack Live! (1:00); Michel’s career background (4:13); Solving the technical and process challenges of moving data (7:04); Lessons learned from managing data at Live Ramp (9:35); How to build a modern data stack (16:19); Triggers to signal when more data infrastructure is needed (23:19); Why Airbyte is an open-source product (30:23); Airbyte’s role in providing support to open-source problems (38:15); How important DPT is for the Airbyte protocol and platform (41:03).
55:37 | 01/12/2021
The PRQL: The Beauty of Commoditization

For this week's PRQL, Eric and Kostas preview their upcoming episode with Michel Tricot.
07:17 | 26/11/2021
63: The ETL - ELT Flip With Ciaran Dynes of Matillion

On this week’s episode of The Data Stack Show, Eric and Kostas have a conversation with Ciaran Dynes, the Chief Product Officer at Matillion, a powerful and easy-to-use, completely cloud-capable ETL/ELT solution.
56:05 | 24/11/2021
The PRQL: What Part of the Data Stack Will Be Commoditized Next?

On this week's PRQL, Kostas and Eric preview their upcoming conversation with Ciaran Dynes of Matillion.
06:47 | 19/11/2021
62: The Internet of Everything with Rob Rastovich of ThingLogix

Highlights from this week’s conversation include: Rob’s career began as an early adopter in internet marketing and then he got the bug for machine-to-machine IoT (2:47); Making assumptions about mass scale (8:44); Pervasiveness of IoT in the market (11:47); Initial reactions to technological advances that we take for granted today (17:28); What makes IoT unique (23:56); Killing the SQL server (29:11); What really separates a smart device from a dumb device that can send data to the cloud (33:13); 5G, LoRa, and drawbacks and advances in widespread IoT adoption (37:05); Security and privacy in IoT (41:23); Using IoT as a cattle rancher (45:20).
51:52 | 17/11/2021
The PRQL: Are you afraid of IOT?

In this PRQL, Eric and Kostas preview their upcoming conversation with Rob Rastovich of ThingLogix, Inc.
07:23 | 12/11/2021
61: What is Data Design? With Kevin Gervais of Touchless

Highlights from this week’s conversation include: Kevin’s interaction with data at an early age (2:35); Working with telecom data (5:08); Analyzing emojis in customer sentiment (8:44); Infrastructure needed for diverse data (12:22); Building better interfaces and looking out for human error (24:17); Dealing with differences in identities in different layers of the stack (41:21).
55:04 | 10/11/2021
The PRQL: Will we ever get rid of the CSV?

09:30 | 08/11/2021
Data Debrief: The Highs and Lows of Open Source Projects

Eric and Kostas break down further topics from episode 60 about stream processing and open source projects.
05:30 | 05/11/2021
60: Architecting a Boring Stream Processing Tool With Ashley Jeffs of Benthos

Highlights from this week’s conversation include: A brief overview of Ashley’s background (2:47); Benthos’ creation and the problems it was meant to address (4:01); Use cases for Benthos (18:25); Key features of Benthos that make it stand out (22:23); Adding windowing to Benthos for fun (29:23); The highs and lows of maintaining an open source project for five years (32:17); The architecture of Benthos (36:23); The importance of ordering in streaming processing (42:15); Gaining traction with an open source project (53:21); Benthos’ blobfish mascot (58:03).
01:06:54 | 03/11/2021
Data Debrief: What Open Source Data Projects Have Come Out of Facebook, Whoops, *Meta?

On this week's debrief, Kostas and Eric talk about the variety of open source projects that come from Facebook.
13:50 | 29/10/2021
59: Making ETL Optional with Justin Borgman of Starburst Data

Highlights from this week’s conversation include: Starburst Data is Justin’s second startup (2:42); Starburst focuses on doing data warehousing analytics without the need for the data warehouse (4:14); Multi-cloud solutions among merger and acquisition use cases (8:32); Ways the stack is increasing in complexity (12:25); Comparing essential components of a data stack from 2010 to now (15:01); The future of ETL (27:36); The best maturity stage for an organization to implement Starburst (31:27); Starburst connectors (36:55); Monetizing enterprise solutions while promoting open source ones (41:52); The history of Presto and Trino (45:37); Benefits of a decentralized data mesh (49:53).
57:30 | 27/10/2021
Data Debrief: Will Enterprise Build The Future of Data Tooling?

On this week's Data Debrief, Eric and Kostas dig more into the topic of data tooling.
07:52 | 22/10/2021
58: Data Federation is No Longer The "F" Word with Scott Gnau of InterSystems

Highlights from this week’s conversation include: Solving problems with data has been a long-time passion of Scott’s (2:52); Day-to-day use of data at InterSystems (6:25); The technical aspects involved in constructing a data fabric (17:52); Companies at a variety of maturity levels can adopt a data fabric (26:49); A paradigm shift in the marketplace (28:39); Comparing and contrasting data fabric and data mesh (30:49); Sharing data across the business and not having it siloed in different departments (39:46); Privacy and security within a data fabric (41:22); The future of data fabric and pushing the edge (43:17).
49:45 | 20/10/2021
Data Debrief: Can Tools Help Solve Data Quality Organizational Challenges?

On this Data Debrief, Eric and Kostas are joined by Brian from RudderStack to talk about data quality.
06:55 | 15/10/2021
57: Improving Data Quality Using Data Product SLAs with Egor Gryaznov of Bigeye

Highlights from this week’s conversation include: Egor’s software engineering background and history with Uber (2:19); Experimentation platforms and analytics definitions (7:49); Bigeye’s function and use cases (9:40); Managing the relationship between the data engineer maintaining the pipelines and the downstream teams providing the context (18:49); Pinpointing problems in data compared to problems in software (21:55); Defining data quality at Bigeye (24:13); Machine learning models as a data product (28:38); Determining SLAs (32:22); How Bigeye brings different parties together and addresses natural communication barriers (36:42); Looking at when an organization needs to implement data quality tooling (45:54).
56:07 | 13/10/2021
56: Stream Processing and Observability with Jeff Chao of Stripe

Highlights from this week’s conversation include: Jeff’s history with stream processing (2:52); Working with Mantis to address the impact of Netflix downtime (4:20); Defining observability as operational insight (6:58); Time series data and the value of data today (18:52); Data integration’s shift from batch to streaming (29:34); The current state of change data capture (32:20); How an engineer thinks of the end-user (56:21).
01:03:55 | 06/10/2021
55: Tables vs. Streams and Defining Real-Time with Pete Goddard of Deephaven Data Labs

55: Tables vs. Streams and Defining Real-Time with Pete Goddard of Deephaven Data Labs

Highlights from this week’s conversation include:Pete’s background in data engineering and capital market trading (2:10)Comparison of the tooling from 2012 when Deephaven started with that of today (10:30)Taking a closer look at defining real-time data (19:47)Getting non-technical people, clients, and developers all on the same platform (36:11)Deephaven’s incremental update model (40:25)Kafka, timely data flow, and Deephaven (44:22)Use cases for Deephaven (51:52)Going to GitHub to try out Deephaven (1:02:43)The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
01:06:48 29/09/2021
54: The Center of the Modern Data Stack with Neil Rahilly of Mixpanel

Highlights from this week’s conversation include:
Neil’s programming hobby turned into a career and how he cold-contacted Mixpanel for a job (2:28)
Lessons learned from nine years at Mixpanel (5:05)
Defining product analytics (8:06)
How Mixpanel has evolved into the product it is today (10:56)
The importance of Mixpanel’s real-time analysis (19:52)
Looking at Arb, Mixpanel’s own arbitrary segmentation database (23:44)
The business impact that the rise of the cloud data warehouse had on Mixpanel (34:56)
Sub-second latencies and real-time use cases (49:05)
Career advice from Neil (1:02:02)
01:08:53 22/09/2021
53: What Religion, a Cult, and a Tech Product Have in Common, with Bart Farrell of DoKC

Highlights from this week’s conversation include:
Bart’s journey from southern California, to New York, to Egypt, to London, to Spain (3:31)
Exposure to different communities and finding shared language and experience (10:21)
Looking back at early online communities and how they furthered your learning journey (27:50)
How the level of niche-ness impacts a community (44:06)
The cautionary tale of WeWork (57:28)
Surefire community killers (1:03:44)
Open source communities in tech and the passion that drives them (1:08:11)
Follow the Data on Kubernetes Community at DoK.community and on Twitter at @DoKCommunity. You can follow Bart at @birthmarkbart.
01:20:05 15/09/2021
52: Discussing Data Warehouses, Lakes, and Meshes with James Serra of EY

Highlights from this week’s conversation include:
James’ background at Microsoft and current work with EY’s data fabric (2:22)
The external and internal facing components of EY’s data fabric (6:39)
The importance of the data lineage (11:29)
The most important requirements for data quality (15:32)
Looking at the data capabilities of Microsoft (21:30)
The data warehouse, explained (29:00)
Using a data warehouse or a data lake (34:33)
Defining the buzzword data mesh (51:13)
The problem with data mesh (59:31)
01:08:53 08/09/2021
51: Democratizing AI and ML with Tristan Zajonc of Continual

Topics in this wide-ranging conversation include:
Tristan’s background with Cloudera and the need for continual operational ML and AI (3:15)
How the complexity of Continual is hidden behind a simplicity of use (14:48)
Focusing on data that lives within a data warehouse (18:43)
Understanding features in the ML conversation (22:47)
The three layers of Continual (26:11)
The importance of SQL to Continual (30:19)
Caching layers and the data warehouse centric approach (38:28)
Betting on the warehouse being a central component of data stack architecture (43:34)
54:50 01/09/2021