Rudderstack
Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
50: From Data Infrastructure to Data Management with Ananth Packkildurai
Highlights from this week’s episode:
Ananth’s background (2:51)
The evolution of Slack (4:54)
Kafka and Presto, two of the most reliable and flexible tools for Ananth (9:43)
How Snowflake gained an advantage over Presto (13:24)
Opinions about data lakes (17:23)
Core features of data infrastructure (23:22)
The tools define the process, and not the other way around (31:30)
Defining a data mesh (36:44)
Data is inherently social in nature (40:31)
Lessons learned from writing Data Engineering Weekly (49:14)
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
58:44 | 25/08/2021
49: MLOps - The Finalization of the Data Stack with Ben Rogojan of Facebook
Topics in this conversation include:
Ben's background and his shift to data engineering (2:19)
Trends in the data space: finding the most efficient tools, the Snowflake phenomenon, and keeping up with new functionalities (5:33)
Key differences in data practices in small companies and Facebook-sized companies (12:38)
Having to build tools specifically designed for Facebook because of SaaS product limitations (16:00)
Team structure at Facebook (18:17)
Developing more robust systems that are resistant to pipeline failure (19:50)
Defining data stacks (24:01)
A sample data stack for a young company (28:37)
Why Redshift and Snowflake have trended in the opposite direction (33:02)
BigQuery and Snowflake comparisons (36:06)
MLOps and whose responsibility it is (39:12)
Feast, Tecton, and feature stores (45:40)
Having a good community around an open source product (49:30)
55:09 | 18/08/2021
48: Season Two Recap with Eric Dodds and Kostas Pardalis
Highlights from this week’s episode:
Dissecting the different team structures from organizations in season two (1:16)
The people behind the data are key to the data itself (9:17)
Open source licensing and the core components needed for large scale commercial viability (15:13)
Game-changing core technologies in the new data economy (22:09)
Snowflake vs. Databricks battle. "The UFC of Geeks" (25:54)
32:45 | 11/08/2021
47: Taming the Four Dragons of Data with Sven Balnojan of Mercateo Gruppe
Highlights from this week’s episode include:
Sven's Ph.D. in Singularity Theory (2:59)
The Databricks vs. Snowflake conversation (8:17)
The difficulty of not just inventing something new, but making it accessible (18:01)
Databricks and unstructured data (22:22)
Organizational change responding to technological change (29:27)
The three-dimensional evolution of a successful open source project (40:31)
51:05 | 04/08/2021
46: A New Paradigm in Stream Processing with Arjun Narayan of Materialize
Highlights from this week’s episode include:
Introducing Arjun and how he fell in love with databases (2:51)
Looking at what Materialize brings to the stack (5:28)
Analytics starts with a human in the loop and comes into its own when analysts get themselves out and automate it (15:46)
Using Materialize instead of the materialized view from another tool (18:44)
Comparing Postgres and Materialize and looking at what's under the hood of Materialize (23:16)
Making Materialize simple to use (32:33)
Why Materialize doubled down on writing 100% in Rust (35:43)
The best use case to start with (42:03)
Lessons learned from making Materialize a cloud offering (44:22)
Keeping databases to the cloud for low latency (48:31)
56:12 | 28/07/2021
45: Open Source and Attribution with Ophir Prusak of Codesmith
Highlights from today's conversation include:
Ophir's decision to switch from software engineering to marketing and riding the startup train (2:39)
Open sourcing in the world of software (5:55)
How open source has changed Ophir's life as a marketeer working at startups (10:28)
Chartio's sunsetting drove Ophir to search for a data tooling replacement (27:27)
Discussing trends in adoption of tools for small scale and large scale companies (35:01)
Data challenges related to attribution--how wrong do you want to be? (44:07)
55:59 | 21/07/2021
44: Leveraging Data in a Post-Covid World with Ruben Ugarte of Practico Analytics
Highlights from this week's episode:
Ruben's background (2:36)
Massive shifts in data caused by COVID (4:47)
Big Tech is no longer untouchable (9:54)
Accelerations in the BI space (15:17)
A focus on people and on trust (23:43)
Numbers are filtered by the biases of the people viewing them (28:46)
AI trends and adoption (38:06)
Using qualitative data for insights, particularly at early stages (40:56)
Recommendations for taking stock of who is using the data and assessing what their skills are (50:06)
57:33 | 14/07/2021
43: Modern Authentication and User Management with Sokratis Vidros of Clerk.dev
Highlights from this week's episode:
Sokratis' realization that big corporations were not the best thing for him (2:56)
Transitioning from Workable to Clerk.dev (3:40)
Convincing developers to outsource components to a service (9:36)
Clerk's layered solutions and how it affects the developer and the end-user (12:41)
Starting with Clerk from scratch vs. using Clerk to replace an existing component (19:55)
Synergies and SaaS starter kit (24:06)
Building Clerk to avoid a single point of failure (29:19)
Reflecting back on the transformation and growth of Workable, and how it was like working at eight different companies (33:03)
Lessons that Sokratis has taken away from his years as a developer (42:18)
46:00 | 07/07/2021
42: Scaling Data Science with Ryan Boyer of Shipt
Highlights from this week’s episode include:
Ryan's full circle path from stocking shelves at Target to using data science for a company owned by Target (2:00)
Building great tools and wielding them effectively (5:04)
Changes at Shipt since being acquired (9:29)
How people’s bias impacts models built by data scientists (12:30)
The different data sources Shipt incorporates (22:02)
How Ryan's work as a data scientist has changed as Shipt has grown (25:29)
How data science helps marketing (31:38)
Improving search experience (34:23)
Shipt's evolving data stack (38:27)
New trends in data science (47:06)
52:53 | 30/06/2021
41: Doing MLOps on Top of Apache Pulsar and Trino with Joshua Odmark of Pandio
Highlights from this week’s episode:
Joshua started his first company at age 15 and then sold two more startups after that (2:15)
Embracing the open source movement and not reinventing the wheel if you don't have to (12:15)
Pulsar seemed built to address Kafka's weaknesses (17:23)
Using Redis as a coordinator for federated learning and taking advantage of its portability (23:05)
The pillars of Pandio and some practical use cases (31:24)
Feature stores and model versioning (38:23)
Seeing Pulsar as the future because of the ability to run tens of millions of topics (41:04)
50:20 | 23/06/2021
40: Graph Processing on Snowflake for Customer Behavioral Analytics
Highlights from this week’s episode include:
Launching Affinio and the engineering backgrounds of the co-founders (2:36)
The massive transformation in customer data privacy regulation in the past eight years (6:23)
Creating the underpinning technology that can apply to any customer behavioral data set (10:05)
Ranking and scoring surfing patterns and sorting nodes and edges (14:13)
Placing the importance of attributes into a simple UI experience (19:28)
Going from a columnar database to a graph processing system (25:20)
Working with custom or atypical data (32:46)
The decision to work with Snowflake (37:43)
Next steps for utilizing third-party tools within Snowflake (52:18)
57:46 | 16/06/2021
39: Diving deeper into CDC with Ali Hamidi and Taron Foxworth of Meroxa
Highlights from this week’s episode include:
Meroxa is a real-time data engineering managed platform (4:53)
Use cases for CDC (6:20)
Meroxa leverages open source tools to provide initial snapshots and start the CDC stream (12:29)
Making the platform publicly available (14:14)
What the Meroxa user experience looks like (16:10)
Raising Series A funding (17:49)
Easiest and most difficult data sources for CDC (20:23)
The current state of open CDC (23:16)
Expected latency when using CDC (29:56)
CDC, reverse ETL, and a focus on real-time (36:39)
Are existing parts of the stack when Meroxa is adapted? (39:45)
49:41 | 11/06/2021
38: Graph Databases & Data Governance with David Allen of Neo4j
Highlights from this week's episode include:
David’s background in comparative databases (1:50)
David’s experience and lessons he learned from writing his book (3:23)
How writing a technical book compares to writing technical documentation (4:41)
The process of writing a book (6:30)
The best and worst part of David’s book writing experience (8:02)
An introduction to what Neo4j is (9:08)
What you need to graph (11:13)
Typical problems a graph database is a good solution for (13:00)
The difference between performance and relational databases (18:41)
How Neo4j addresses performance and ergonomics (23:30)
Neo4j and scalability (26:20)
How Neo4j fits in the modern data stack (31:48)
Neo4j use cases (35:45)
Practical implementation of Neo4j (40:51)
Neo4j’s relationship with open source (45:50)
50:48 | 02/06/2021
37: The Components of Data Governance with Dave Melillo of FanDuel
Highlights from this week's episode include:
Dave's "nerdy" interests in sports statistics and data (2:12)
Trends in collecting, processing, and using data (4:45)
Finding a better term for "reverse ETL" (5:48)
The blurring of the distinction between sources and destinations (7:41)
The role of BI is changing (13:24)
Data governance and the physical execution behind it (19:00)
Data governance is defining and managing data in a logical way that is actionable by the business (23:43)
Consolidation of tools and services (28:49)
Databricks vs. Snowflake (33:49)
Dave's focus on regulatory data at FanDuel (45:47)
54:27 | 26/05/2021
36: Crypto and Compliance with Nick Fogle, Co-Founder of Churnkey and Wavve
On this week's episode of The Data Stack Show, Eric and Kostas talk with Nick Fogle, co-founder of Churnkey and Wavve. Together they discuss how having a legal background can impact engineering decisions, dealing with privacy and compliance concerns, and selling Wavve and starting Churnkey as a result.
Highlights from this week's episode include:
Nick's background in economics and law and teaching himself to code (2:01)
Thinking like a lawyer and trying to minimize risk to the greatest extent possible (4:23)
GDPR and compliance (8:23)
Blockchain contracts (18:26)
Unique challenges surrounding compliance with a cryptocurrency startup (21:41)
Reconciling the right to be forgotten, GDPR, and blockchain permanence (27:16)
Building Churnkey after developing it as a way to lower churn among Wavve users (31:31)
How Churnkey's stack works (37:16)
Crypto predictions (39:02)
42:40 | 19/05/2021
35: The Future of Development is Distributed with Jim Walker of Cockroach Labs
This week on The Data Stack Show, Eric and Kostas talk with Jim Walker, the VP of product marketing at Cockroach Labs, about distributed systems, competing against the speed of light, and making data easy.
Highlights from this week's episode include:
Jim's background in translating deep technical concepts into understandable English and his work at Cockroach Labs (2:23)
The origin of Cockroach Labs and distributed SQL (6:10)
Living without atomic clocks (10:10)
Having the speed of light as the ultimate competitor (13:49)
CockroachDB’s users (19:35)
Figuring out big data for transactions (25:14)
Dealing with failure (35:04)
Open source code, community, and consumption (39:26)
Making data easy, and what's next for Cockroach (43:12)
Bringing programming into marketing (46:18)
Mentioned links: Spanner White Paper, Raft & Paxos, Michael Stonebraker
54:27 | 12/05/2021
34: The Intersection of Data Engineering and Marketing with John Marbach of Grafana Labs
On this week's episode of The Data Stack Show, Eric and Kostas talk with John Marbach, senior growth manager at Grafana Labs. In this conversation, John discusses marketing ops and the blending of roles of data engineering and marketing.
Highlights from this week's episode include:
Introduction to John Marbach and working in the blurred lines between marketing and data engineering (2:14)
How managing pipeline building and consuming data influences the use of downstream tools (6:28)
Experiments in marketing (11:28)
Exploring the role of marketing ops (15:35)
How accruing technical debt can grind things to a halt (20:35)
Matching the stack with the company's scale (24:48)
CDPs and marketing to developers (28:40)
Biggest challenges and barriers between data engineering and marketing (35:19)
Takes on reverse ETL (39:07)
Thoughts on cryptocurrency and the blockchain (44:08)
49:07 | 28/04/2021
33: ML is a Data Quality Problem with Peter Gao from Aquarium Learning
On this week's episode of The Data Stack Show, Eric and Kostas talk with Peter Gao, co-founder and CEO at Aquarium Learning. A former engineer at Cruise Automation, Peter and Aquarium Learning help ML teams improve their model performance by improving their data.
Highlights from this week's episode include:
How getting hit by a drunk driver made researching self-driving cars personal for Peter (2:12)
Filtering out the hype in self-driving car news to get a clear picture of its state today (6:52)
The data required for a self-driving vehicle (13:56)
Operation Vacation and how Aquarium can help provide the tools to make models better (16:53)
Utilizing neural networks to index data (20:41)
How Aquarium fits in the ML stack (30:25)
Interesting use cases of Aquarium (33:59)
Distinguishing subclasses of machine learning (40:05)
Human involvement in machine learning (46:13)
56:35 | 14/04/2021
32: Cooking with Data Ops with Chris Bergh from DataKitchen
On this week's episode of The Data Stack Show, Eric and Kostas talk with Chris Bergh, the CEO and head chef at DataKitchen. DataKitchen’s mission is to provide the software, service, and knowledge that makes it possible for every data and analytics team to realize their full potential with DataOps.
Highlights from this week's episode include:
Chris' background and how the lessons learned in the Peace Corps and at NASA apply to him today (2:03)
Why AI left Chris feeling like a jilted lover (7:49)
Most projects that people do in data analytics fail (10:12)
Three things that DataOps focuses on (16:37)
Comparing and contrasting DevOps and DataOps (22:30)
The types of data that DataKitchen handles and building a product or a service around DataOps (29:29)
Fixing problems at the source instead of just offering a tool to slightly improve things downstream (37:17)
Where we are at in the process of how companies are going to run on data (41:43)
58:31 | 07/04/2021
31: How a 160-Year-Old Publisher is Using Data with Jenna Lemonias from The Atlantic
On this week's episode of The Data Stack Show, Eric and Kostas chat with Jenna Lemonias, director of data science at The Atlantic. The Atlantic, a publication that's been around since 1857, is adapting with the times and is implementing and emulating some of the data science practices seen at big tech companies.
Highlights from this week's episode include:
Jenna's background in astrophysics and how she pivoted to data science (2:14)
Differences in dealing with data at a FinTech company and then at a publication (4:40)
The relationship between analog and digital data at The Atlantic (9:22)
How The Atlantic structures its data science team (11:44)
The role data engineering plays (14:42)
Using natural language processing and machine-generated metadata (17:37)
The Atlantic's data stack (28:22)
The kind of data that's important to The Atlantic (29:44)
Big projects forthcoming for the data science team (37:13)
43:14 | 31/03/2021
30: The DataStack Journey with Rachel Bradley-Haas and Alex Dovenmuehle of Big Time Data
On this week’s episode of The Data Stack Show, Eric and Kostas are joined by the co-founders of Big Time Data, Rachel Bradley-Haas and Alex Dovenmuehle, formerly of Mattermost and, prior to that, Heroku. At Big Time Data, they work together to provide companies with the ability to derive value and insights from decentralized datasets, improve business processes through data enrichment and automation, and build a scalable foundation to enable a data-driven culture.
Highlights from this week’s episode include:
Rachel and Alex's background and their goal to make data approachable for companies everywhere (3:09)
The data stack journey: making decisions when you're small that allow you to grow with your data and your organization (12:28)
The problems faced when a data stack isn't nurtured early on (15:59)
Changes in data stack technology (21:32)
How Alex and Rachel's roles at Big Time Data differ and interact with each other (39:00)
Client use cases (43:34)
Comparing the stacks of seed-stage startups, mid-sized companies, and giant enterprises (48:54)
01:01:31 | 24/03/2021
29: The Present and Future of Data Engineering with Joe Reis and Matthew Housley from Ternary Data
On this week’s episode of The Data Stack Show, Eric and Kostas are joined by Matthew Housley, CTO, and Joe Reis, CEO and co-founder of Ternary Data. These self-described “recovering data scientists” focus on teaching skills to build a solid foundation for organizations to work with their data.
Highlights from this week’s episode include:
Joe and Matt’s background and expertise (2:44)
Common threads and trends in the data sphere (9:39)
Differences and commonalities between startups and enterprises and the way they deal with data (18:28)
Discussing how the role of data engineering has evolved over the years and what it might morph into in the near future (27:52)
The ideal data infrastructure and what future shifts excite them (39:52)
How ML is shaping the data space (44:30)
The state of real time (49:56)
58:07 | 17/03/2021
28: Next Gen Data Governance with Stefania from Avo
On this week’s episode of The Data Stack Show, Eric and Kostas are joined by Stefanía Bjarney Ólafsdóttir, the CEO and co-founder of Avo. Avo, which started in 2018, provides data analytics governance as a service, helping organizations make data-driven decisions to improve their customer experience.
Highlights from this week’s episode include:
Stefania's background with mathematics, philosophy, bioinformatics and consumer mobile (2:39)
Making pioneering decisions as head of data science at QuizUp (8:34)
Is less more? Choosing fundamental parts of the customer experience and understanding them very well (16:56)
Bringing data consumers closer to data producers (18:34)
Avo's mission to provide analytics governance as a service (25:09)
Avo use cases (36:37)
Focusing on event-based data (44:29)
59:18 | 10/03/2021
27: Building B2B Marketplaces with Mike Luby from LeafLink
On this week’s episode of The Data Stack Show, Eric and Kostas are joined by Mike Luby, director of engineering at LeafLink. LeafLink is a B2B wholesale marketplace for the cannabis industry where thousands of brands can manage and track their orders and relationships.
Highlights from this week’s episode include:
The infrastructure LeafLink provides for the cannabis supply chain and how it deals with compliance issues (2:03)
Structuring product management organization to launch high-velocity teams (8:08)
How it started vs. how it's going (12:00)
Containerization and leveraging AWS tools for LeafLink's stack (13:19)
Shifting to an event-driven architecture (24:46)
Using APIs to provide critical integrations for customers to automate and optimize their businesses (32:47)
Keeping an eye on the future but building for today (36:56)
42:22 | 03/03/2021
26: Democratizing the Insurance Market with Daniel Gremmell from Policygenius Inc.
On this week’s episode of The Data Stack Show, Eric and Kostas are joined by Daniel Gremmell, head of data at Policygenius, Inc. Policygenius, an insurance marketplace, strives to make it easy for people to understand their options, compare quotes, and buy a policy all in one place with help from licensed experts.
Highlights from this week’s episode include:
What brought Daniel to Policygenius and how his background in industrial engineering and statistics impacts what he does (1:49)
Policygenius consolidates carriers and pairs insurance customers with live experts to get the best prices and plans (6:29)
How data analysts and data scientists re-shape the customer experience of selecting insurance (10:36)
How roles and titles like "head of data" are changing the industry (24:32)
Organizing a company with structured embedding (27:28)
Policygenius' data stack (31:31)
38:47 | 24/02/2021
25: MLOps and Feature Stores with Willem Pienaar from Tecton
On this week’s episode of The Data Stack Show, Kostas is joined by Willem Pienaar, tech lead at Tecton, to discuss machine learning, features, and feature stores. Highlights from this week’s episode include: Willem Pienaar's background in South Africa and southeast Asia and his path from Gojek to Tecton (1:58); Tecton was founded by the builders of Uber's Michelangelo (6:37); Defining features and their life cycles (10:05); Comparing a feature store to a database (16:40); Data architecture in a feature store (26:16); How feature stores evolve as a company expands (30:12); Main touchpoints between the feature and the data infrastructure (37:59); How Tecton manages productizing complex architectures (41:44); How Feast and Tecton work together (45:12); Tecton impressing VCs and preparing for a competitive, emerging market (48:14).
51:12 17/02/2021
24: Demystifying AI with Duc Haba
On this week’s episode of The Data Stack Show, Eric is joined by Duc Haba, an AI researcher and enterprise mobility solution architect consultant who most recently did AI consulting work with Cognizant. Their discussion revolves around demystifying artificial intelligence and why so many people either fear AI or place too much trust in it. Duc talks about some of the AI projects he has worked on, some successes and some failures, and points to how the data biases that humans bring into the models can radically alter the outcome of those endeavors. Highlights from this week’s episode include: Duc's background with AI and getting to work with LeVar Burton (1:44); Demystifying AI and coming up with a definition for it (3:34); Misplaced fears of AI (7:53); Misplaced trust in AI (10:36); Public versus hidden AI (13:58); Acquiring the data needed to train AI models (23:11); Examples of interesting AI projects Duc has worked on (27:58); Where to go to learn more about AI (35:06); Thinking of AI as something that can help your business do better at what it's already been doing (39:53); Anticipating the near-future of AI (44:16).
51:00 10/02/2021
23: Migrating from On-Premises to the Cloud with Alex Lancaster from Intuit
On this week’s episode of The Data Stack Show, Kostas and Eric are joined by Alex Lancaster, risk data engineering manager at Intuit. Alex has been with Intuit, known for products like QuickBooks, TurboTax, and Mint, for 15 years and was part of a recent massive and successful re-architecting from on-premises to the cloud. Highlights from this week’s episode include: Alex and his role at Intuit (1:51); Data marts at Intuit (2:57); Revolutionary changes in the data engineering space in the past 15 years (6:46); Security in the cloud vs. on-prem (12:46); Data architecture at Intuit (15:42); Doing ETLs inside or outside of the database (19:11); How to transition successfully from on-prem to cloud: forklifting vs. re-stacking (23:22); Alex’s application of software engineering skills to data engineering (28:44); Dealing with data engineering challenges related to security and regulation (31:48); Pipelines managed and challenges in data types (36:45).
42:58 03/02/2021
22: Season One Recap with Eric Dodds and Kostas Pardalis
Season One of The Data Stack Show is in the books, and in this episode, Kostas and Eric take a look back at some of the biggest takeaways, trends, and topics from the season. With some great guests already set for season two, the next slate of episodes is shaping up to take an even deeper dive into the world of data and the people shaping it. Key points in the conversation include: Patterns with data warehouses and data lakes (3:38); Looking back at the people behind the data and their stories (8:12); Minimizing flaws while remembering that data is built by humans, for humans (11:02); Using proven technology and making mature solutions (15:20); Data involves a significant amount of trust (23:38).
29:42 29/01/2021
21: Data Integrity and Governance with Patrick Thompson and Ondrej Hrebicek from Iteratively
On this week’s episode of The Data Stack Show, Kostas and Eric are joined by the co-founders of Iteratively, CEO Patrick Thompson and CTO Ondrej Hrebicek. Iteratively helps companies know that their data can be trusted by helping capture clean, consistent product analytics. Today’s conversation goes behind the scenes at Iteratively and explores how trust in data can help accelerate the velocity of an organization. Highlights from this week’s episode include: Patrick and Ondrej’s background and the biggest problem Iteratively addresses (2:50); Why some companies still use spreadsheet schema management and the potential pitfalls they’re setting themselves up for (4:39); Defining schema in the context of data (7:02); Viewing the process as a team sport (11:34); Identifying common mistakes and implementing best practices (13:46); A walkthrough of Iteratively (17:13); Utilizing a JSON schema format (26:58); Laying Iteratively on top of or integrating it with an implementation for analytics (30:36); Entry point into organizations (33:02); Organizational change and velocity realized after implementing Iteratively (36:04); What’s next for Iteratively? (42:47).
46:23 20/01/2021
20: Transforming the Real Estate Market with Predictive Analytics with Arian Osman from Homesnap
This week on The Data Stack Show, Kostas and Eric are joined by Arian Osman, a senior data scientist at Homesnap who is also nearing the end of his PhD in computational sciences and informatics and is the developer of an e-commerce clothing brand. Homesnap is designed for both homebuyers and agents to access data from the MLS (Multiple Listing Service), providing real-time, accurate information to all parties involved. Highlights from this week’s episode include: Arian’s background and an overview of Homesnap (2:30); Utilizing data in Arian’s e-commerce clothing brand (7:14); Homesnap’s sell speed feature and visualizing outputs (13:28); The psychology that drives upper and lower limits (19:33); Deciding the life cycle of a model (25:50); Collaborating with internal stakeholders (30:47); Unique challenges of data in the real estate domain (38:16); Useful third-party tools (43:33).
52:42 13/01/2021
19: Defining Data Governance with Stephen Bailey from Immuta
This week on The Data Stack Show, Kostas and Eric are joined by Stephen Bailey, Director of Applied Data Science at Immuta. Immuta is a startup that focuses on enabling data teams to have fast, efficient, and understandable access controls on their data. Highlights from this week’s episode include: The problem that Immuta solves (2:04); Stephen’s background researching how the brain works (4:56); Immuta’s stack (15:09); Leveraging metadata (18:02); The main use case for Immuta is simplifying the access control layer (20:06); Unifying data (31:52); Defining the quality of data (34:04); Learning to trust the numbers (39:42); What’s next for Immuta (46:15).
53:10 06/01/2021
18: Data Science in Health Insurance with Jason Haupt of Bind
This week on The Data Stack Show, Kostas and Eric are joined by Jason Haupt, data science lead at Bind, a no-deductible health insurance company determined to give immediate answers and clear costs before point of care. Jason’s unique background, which includes a Ph.D. in particle physics and work at the Large Hadron Collider at CERN, has informed the way he approaches data at Bind. Highlights from this week’s episode include: Jason’s background in particle physics and his path to Bind (2:53); A cloud-only approach to data and utilizing AWS (9:01); Focusing on activities that help Bind's members (12:08); Dealing with 12,000 columns of data from an insurance claim form (17:13); Rethinking the relationship between marketing and product teams (25:28); Examining the data pipeline (29:30); Privacy and security concerns with medical information (35:45); How experience with the LHC impacted the way he thinks about data (40:06); Transition from academic work to industry (46:20).
54:31 31/12/2020
17: Working with Data at Netflix with Ioannis Papapanagiotou
This week on The Data Stack Show, Kostas and Eric are joined by Ioannis Papapanagiotou, senior engineering manager at Netflix. Ioannis oversees Netflix’s data storage platform and its data integration platform. Their conversation covers the various responsibilities of his lean teams, their use of open source technology, and their adoption of change data capture solutions. Key points in this week’s episode include: Ioannis’ background with academia and Netflix (4:42); Comparing the data storage and data integration teams (6:19); Discussing indexing and encryption (20:31); Netflix’s role in the open source community (27:21); Implementing change data capture (40:42); Using Bulldozer to efficiently move data in batches from data warehouse tables to key-value stores (42:43).
57:07 09/12/2020
16: Applying the Event Sourcing Pattern at Scale with Andrew Elster from Earnnest
On this week’s episode of The Data Stack Show, Kostas and Eric conclude their two-part conversation about Earnnest, a digital platform originally designed for facilitating real estate transactions. In the previous episode, they talked with CTO and co-founder Daniel Jeffords, and in this week’s episode, they talk with the other co-founder, Andrew Elster, CIO and chief architect. Andrew shares more about Earnnest’s stack, explains the decision to use Elixir, and talks about the vision for scaling up the product. Key topics in the conversation include: Andrew’s journey from electrical engineering, to avoiding pirates in oceanic oil exploration, to starting Earnnest (2:57); Keeping the platform flexible to expand beyond real estate transactions (10:24); Being adaptable to support existing workflows (18:33); The evolution of the database and implementing event sourcing (25:01); Using a functional language like Elixir (30:54); Developing Earnnest with scale in mind (37:33).
45:53 03/12/2020
15: Early Stage Analytics and Learning from the Y Combinator Experience with Axel Delafosse from Pool
This week on The Data Stack Show, Kostas and Eric are joined by Axel Delafosse, founder and CEO of Pool, a messaging app designed to help couples spend less time deciding what to do and more time together. Axel shares the story of how he went from having his idea shot down in person by Paul Graham to being accepted into Y Combinator. While Pool is still a young startup, Axel offers insight from lessons he’s learned along the way. Highlights from this week’s episode include: Pool Messenger, “the ultimate antidote to decision paralysis” (2:50); Pitching to Paul Graham and applying to YC (6:17); The importance of the co-founder relationship (14:01); The YC experience and losing Facebook’s API (17:37); Products die, relationships last (22:05); Breaking down the data stack (28:50); Using data and conversations with users to evaluate the experience (36:12).
47:59 19/11/2020
14: Breaking Down Electronic Money Transfers and Modernizing Real Estate Transactions with Dan Jeffords of Earnnest
This week on The Data Stack Show, Kostas and Eric chat with Daniel Jeffords, CTO and co-founder of Earnnest, a financial tool for the real estate industry. Earnnest’s digital platform allows buyers to securely and electronically deposit funds directly to an escrow holder and keeps agents, buyers, and escrow holders in the loop with automated emails and tracking information. Highlights from this week’s episode include: Earnnest’s approach to the way payments are handled in an antiquated real estate industry (2:12); Clearing up the differences in the way money changes hands: ACH, wire, and checks (12:39); How Earnnest works and who the involved parties are (21:06); Disrupting a highly regulated industry (24:24); Emphasizing security and transparency (30:09); Erlang, Elixir, Dwolla and more: how Earnnest uses data (33:40); Trying very hard to store very little data (42:58).
48:07 11/11/2020
13: Building Open Source Products at Scale with Reza Shafii from Kong Inc.
This week on The Data Stack Show, Reza Shafii, vice president of products at Kong Inc., discusses open source projects and products with Kostas and Eric. Kong is a cloud connectivity company best known for being the creator and primary supporter of Kong, the most widely adopted open-source microservice API gateway. Highlights from this week’s episode include: Being a self-proclaimed middleware geek (2:17); Middleware explained (5:41); Kong as a company, open source project, and a brand (10:44); Drawing the lines between the open source and proprietary parts of a SaaS platform (24:22); Dealing with the extra friction in adopting middleware from the bottom up (33:02).
39:52 06/11/2020
12: Building a CDP on your Data Warehouse with Nicholas Ziech-Lopez of MessageGears
In this episode of The Data Stack Show, hosts Kostas Pardalis and Eric Dodds talk with Nicholas Ziech-Lopez, director of product strategy at MessageGears. MessageGears is designed to reduce data friction for marketers by connecting directly to a brand’s data source and using their live data. This episode centers on the world of CDPs and where MessageGears fits in that space. Highlights from this week’s episode include: Nicholas’ arrival at MessageGears and the company’s background (2:20); MessageGears data sources (6:52); Accessing the data warehouses (9:19); Coordination and crossover of data and marketing roles (20:57); Being a customer marketing platform (31:43); Dealing with messy data (36:04); Bridging the physical and digital world with consumers (43:49); What’s coming up next for MessageGears (51:09).
55:50 28/10/2020
11: Why Modern Cyber Security is a Data Problem with Jack Naglieri of Panther Labs
This week’s episode of The Data Stack Show features a conversation between hosts Kostas Pardalis and Eric Dodds and guest Jack Naglieri, founder and CEO of Panther Labs. Panther, a San Francisco-based startup, is an open platform that helps security teams detect and respond to breaches in cloud-native environments, providing a modern alternative to traditional SIEMs. Highlights from this week’s episode include: Introduction to Jack and Panther Labs (2:33); The different pillars of data security (10:24); Onboarding process for a company using Panther (18:40); Thinking of security as a data problem (24:55); Using S3 and other infrastructure suggestions that will be helpful in the long run (32:16); Use cases for analyzing past and real-time data (39:20); Panther’s data stack (42:54); Open source technology being helpful for the community (47:57); The future for Panther (54:39).
58:38 21/10/2020
10: The Evolution of the BI Market with Huy Nguyen of Holistics
In this week’s episode of The Data Stack Show, Kostas Pardalis and Eric Dodds are joined by Huy Nguyen, CTO and co-founder of Holistics. Holistics takes an approach to business intelligence and data analytics that they call DataOps, focusing on data team productivity and company-wide access to insights. Important points in the conversation included: Introduction to Huy and Holistics (3:12); Approaching BI with more than just visualization (8:59); How friction between different roles within an organization is addressed by Holistics (15:20); Holistics as a complementary tool (23:25); Describing their own data stack (34:47); History of BI and trends for the future (39:33).
56:28 14/10/2020
09: Building the Operating System for Work with Ivan Kanevski of Slapdash
On this week’s episode of The Data Stack Show, Kostas Pardalis and Eric Dodds are joined by Slapdash co-founder Ivan Kanevski. Slapdash describes itself as the operating system for work. Slapdash emphasizes reducing the time people spend controlling their computer in relation to the time they spend expressing their intent. Key topics discussed were: Starting Slapdash and expanding on tools from working at Facebook (3:31); Being client agnostic and working with the tools that people bring to the job (7:35); Distinctions between mouse-centric and keyboard-centric users (12:58); Slapdash’s approach to collecting data (16:08); Building Slapdash to scale and using Postgres (19:45); Using a graph model and a focus on efficiency (24:50); Challenges of reducing latency (29:35); Opening up Slapdash to be programmable (38:17).
43:36 07/10/2020
08: When data alone is not enough - Reinventing book shopping at Bookshop.org with Mason Stewart
In this week’s episode of The Data Stack Show, Kostas Pardalis and Eric Dodds chat with Mason Stewart, the lead engineer at Bookshop.org. Bookshop is an online bookstore with a mission to financially support local, independent bookstores. Their hope is to help strengthen the fragile ecosystem and margins around bookselling and keep local bookstores an integral part of our culture and communities. Among other topics, today’s conversation covers data stack decisions that some might call boring but are better described as mature, and the intertwining of human interaction with data for problem solving and recommendations. Highlights from this week’s episode include: Background on Mason and Bookshop.org (3:28); Technical challenges of keeping up with a rapidly expanding business (10:00); Interacting with data from fulfillment partners (14:36); Data schema for books and dealing with Elasticsearch (24:46); Human intervention in recognizing problems and exceptions (31:38); In-depth look at Bookshop’s data stack (37:06); Using curated lists from bookstores instead of algorithmic recommendations (43:50).
54:06 30/09/2020
07: Discussing Data Engineering Best Practices with IFTTT’s Peter Darche
In this week’s episode of The Data Stack Show, Kostas Pardalis and Eric Dodds connect with IFTTT data scientist Peter Darche. IFTTT is a free platform that helps all your products and services work better together through automated tasks. Their discussion covers a lot of ground, including IFTTT's data stack, its use cases, and, once and for all, how to pronounce the company’s name. Highlights from this week’s episode include: Background on IFTTT (2:12); Peter tells the proper way to pronounce IFTTT (3:34); An overview of IFTTT’s technological architecture (6:14); The uses of data and analytics at IFTTT (8:04); Constructing the data stack (10:11); Dealing with challenges (15:20); Best practices for communicating with internal teams about the data (23:04); Discussing functional data engineering (26:05).
35:15 23/09/2020
06: The Technical Challenges and Opportunities of Building a Startup Inside a Large Bank with Sam Bledsoe of Ruby
This week on The Data Stack Show, Kostas Pardalis and Eric Dodds continue their conversation about Ruby, a startup designed to help families navigate their financial situation in some of life’s most challenging moments, by talking with the Nashville company’s CTO, Sam Bledsoe. This follow-up discussion digs into how Ruby's data engineering and marketing setups co-exist and how they rely on Azure. Highlights from this week’s episode include: Sam’s background and more info about Ruby (2:11); Privacy-related challenges at the intersection of banking and medical data (4:33); What to expect from using Azure (15:06); Breaking down the stack (24:44); The need for marketing people with technological skills (36:20); Talking BigQuery, Redshift, Spark and data virtualization (41:15); Biggest changes anticipated in moving as a spin-off from the bank (43:59).
48:46 16/09/2020
05: The Convergence of Data Engineering and Marketing with Nic Discepoli of Ruby
On this week’s episode of The Data Stack Show, Kostas Pardalis and Eric Dodds are joined by Nic Discepoli for the first of two conversations about Ruby, a startup where he is the customer analytics lead. This Nashville-based company is designed to help families navigate their financial situation in some of life’s most challenging moments, often corresponding with a medical event. Their conversation included topics like: Launching Ruby and Nic’s involvement in the company (3:35); Realizing that tracking data manually on spreadsheets was no longer sustainable (7:07); Rundown of Ruby’s toolset (9:50); Challenges with data quality (14:27); Using unique IDs and following UTM parameters through the stack (21:04); Recalculating customer acquisition cost with data (33:05).
44:06 09/09/2020
04: Relational to Real-Time with Change Data Capture with DeVaris Brown of Meroxa
In this episode of The Data Stack Show, Kostas Pardalis and Eric Dodds talk change data capture (CDC) with DeVaris Brown, co-founder and CEO of Meroxa. Their conversation digs into the benefits of utilizing CDC and how Meroxa is using it. Highlights from the conversation include: Introduction to DeVaris and Meroxa (3:24); Why CDC has more traction today (6:58); How CDC is changing the way we build products (12:52); Where CDC is playing an important role (21:11); The experience that Meroxa delivers (24:42); Looking at Meroxa’s sources, technology and data stack (27:28); DeVaris’ vision for the company (37:10).
50:07 02/09/2020
03: Turning All Data at Grofers into Live Event Streams
In this week’s episode of The Data Stack Show, Kostas Pardalis connects with Satyam Krishna, a data engineer at Grofers, India’s largest low-price online supermarket. Grofers boasts a network of more than 5,000 partner stores, a user base with three million iOS and Android app downloads, and an efficient supply chain that allows it to deliver more than 25 million products to customers every month. Satyam offers insights into how he helped build the data engineering function at Grofers, how they developed a robust data stack, how they’re turning production databases into live event streams using change data capture, how Grofers’ internal customers consume data, and how the company made adjustments due to the pandemic. Topics of discussion included: Satyam moving from a developer to a data engineer (2:43); Describing Grofers’ data stack and data lake (6:41); Who is consuming data inside the company and what are some of their common uses specific to Grofers? (12:03); What are the biggest issues day-to-day as a data engineer? (18:21); COVID’s impact on business practices and the data stack (21:28); The big problem of data discoverability and metadata cataloging (27:44); Completely changing architecture to something that can scale up (33:16).
38:32 27/08/2020
02: The Importance of Data During a Global Pandemic with Utkarsh Gupta of 1mg
In this episode, Kostas Pardalis sits down with Utkarsh Gupta, senior engineer of data science at 1mg, India’s largest online healthcare platform. Together they discuss 1mg’s data infrastructure, its response to the global pandemic, and how data drives its product and business. Highlights from the show included discussions about: Utkarsh and 1mg’s background (1:32); 1mg being based on a bedrock of data (4:25); Business analytics (5:33); Effects of COVID-19 pandemic on business (11:40); Description of 1mg’s data stack (16:53); Biggest challenges faced and managing collaboration (27:08); Opinions on open source technology (40:31).
45:27 19/08/2020
01: Discussing Mattermost Data Infrastructure with Alex Dovenmuehle
In this episode, Kostas Pardalis sits down with Alex Dovenmuehle, head of data engineering for Mattermost, an open-source self-hosted communication tool that optimizes dev workflows in highly secure environments. Kostas and Alex discuss: Alex's background and experience (2:29); Data stack Mattermost is using (9:25); How Mattermost built their data stack (21:05); Using data to understand the story of the customer's journey (24:58); Focus on privacy and security (26:33); Practical ways Mattermost is using data (37:14); What's next for data analytics at Mattermost and wrap up (42:45).
45:45 12/08/2020