<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Data Engineer Things]]></title><description><![CDATA[Data Engineer Things is dedicated to creating and sharing learning resources for data engineering. Our audience ranges from aspirational data engineers to experienced data leaders. Subscribe to grow and learn together!]]></description><link>https://dataengineerthings.substack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!tTfP!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66064eba-64c2-444a-8c61-2f9c14174abd_800x800.png</url><title>Data Engineer Things</title><link>https://dataengineerthings.substack.com</link></image><generator>Substack</generator><lastBuildDate>Mon, 06 Apr 2026 12:12:11 GMT</lastBuildDate><atom:link href="https://dataengineerthings.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Xinran Waibel]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[dataengineerthings@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[dataengineerthings@substack.com]]></itunes:email><itunes:name><![CDATA[Data Engineer Things]]></itunes:name></itunes:owner><itunes:author><![CDATA[Data Engineer Things]]></itunes:author><googleplay:owner><![CDATA[dataengineerthings@substack.com]]></googleplay:owner><googleplay:email><![CDATA[dataengineerthings@substack.com]]></googleplay:email><googleplay:author><![CDATA[Data Engineer Things]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Data Engineer Things Newsletter - Data Pulse Edition (Mar 2026)]]></title><description><![CDATA[OpenAI's P99 latency 
with 800 million users, Netflix's LLM post training, LinkedIn's exabyte scale clusters, ETL &#8594; ECL, and Elements of a modern data strategy]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-data-d8d</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-data-d8d</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Tue, 17 Mar 2026 15:01:09 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7Oba!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F192e8bca-e7e3-4f90-884d-a3460870f170_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7Oba!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F192e8bca-e7e3-4f90-884d-a3460870f170_1456x1048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7Oba!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F192e8bca-e7e3-4f90-884d-a3460870f170_1456x1048.png 424w, https://substackcdn.com/image/fetch/$s_!7Oba!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F192e8bca-e7e3-4f90-884d-a3460870f170_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!7Oba!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F192e8bca-e7e3-4f90-884d-a3460870f170_1456x1048.png 1272w, 
https://substackcdn.com/image/fetch/$s_!7Oba!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F192e8bca-e7e3-4f90-884d-a3460870f170_1456x1048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7Oba!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F192e8bca-e7e3-4f90-884d-a3460870f170_1456x1048.png" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/192e8bca-e7e3-4f90-884d-a3460870f170_1456x1048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:337646,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/188303152?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F192e8bca-e7e3-4f90-884d-a3460870f170_1456x1048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7Oba!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F192e8bca-e7e3-4f90-884d-a3460870f170_1456x1048.png 424w, https://substackcdn.com/image/fetch/$s_!7Oba!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F192e8bca-e7e3-4f90-884d-a3460870f170_1456x1048.png 848w, 
https://substackcdn.com/image/fetch/$s_!7Oba!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F192e8bca-e7e3-4f90-884d-a3460870f170_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!7Oba!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F192e8bca-e7e3-4f90-884d-a3460870f170_1456x1048.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Hello Folks,</p><p style="text-align: justify;">Great to connect with you again for another edition! 
I&#8217;m writing this from Philadelphia, where the worst of winter is finally behind us and the days are slowly getting longer and warmer.</p><p style="text-align: justify;">Much like the shifting seasons, data engineering itself is in the middle of a transformation. In 2026, it's no longer just about moving and storing data; it's about making data meaningful, trustworthy, and AI-ready. Whether you&#8217;re building pipelines, designing data platforms, or enabling AI systems, the common thread is clear: the future of data engineering is as much about context and reliability as it is about movement and scale.</p><p style="text-align: justify;">If you&#8217;re looking for a place where these conversations are happening in person, check out the Data Engineering Open Forum in San Francisco on April 16th. The agenda is packed with sessions you'll actually want to stay for, and it's a great chance to connect with engineers navigating the same challenges you are.</p><p>- Ananda</p><div><hr></div><h3><strong>&#128218;</strong> Data Pulse</h3><h4><a href="https://www.dataengineeringweekly.com/p/data-engineering-after-ai">Data Engineering After AI: ECL - Extract, Contextualize, Link</a></h4><blockquote><p><strong>&#128214; Topic</strong>: Data &amp; Context Engineering<br>&#129504; <strong>Level</strong>: Beginner</p></blockquote><p style="text-align: justify;"><strong>Summary: </strong>As AI continues to automate data engineering tasks such as pipeline generation, transformation logic, and schema inference, the core role of data engineers is shifting from moving data to defining and managing its meaning. Traditional ETL architectures focused on data movement and often embedded business logic within pipelines, allowing interpretive context to drift as data passed through successive transformations. 
An alternative framework, ECL (Extract, Contextualize, Link), addresses this gap by emphasizing three stages: extracting reliable data from source systems, enriching it with contextual definitions, and linking entities across systems to preserve coherence.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!p7NY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389d84ee-51d7-47a5-b7f2-61309db97551_739x220.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!p7NY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389d84ee-51d7-47a5-b7f2-61309db97551_739x220.png 424w, https://substackcdn.com/image/fetch/$s_!p7NY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389d84ee-51d7-47a5-b7f2-61309db97551_739x220.png 848w, https://substackcdn.com/image/fetch/$s_!p7NY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389d84ee-51d7-47a5-b7f2-61309db97551_739x220.png 1272w, https://substackcdn.com/image/fetch/$s_!p7NY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389d84ee-51d7-47a5-b7f2-61309db97551_739x220.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!p7NY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389d84ee-51d7-47a5-b7f2-61309db97551_739x220.png" width="739" height="220" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/389d84ee-51d7-47a5-b7f2-61309db97551_739x220.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:220,&quot;width&quot;:739,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:169190,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/188303152?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389d84ee-51d7-47a5-b7f2-61309db97551_739x220.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!p7NY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389d84ee-51d7-47a5-b7f2-61309db97551_739x220.png 424w, https://substackcdn.com/image/fetch/$s_!p7NY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389d84ee-51d7-47a5-b7f2-61309db97551_739x220.png 848w, https://substackcdn.com/image/fetch/$s_!p7NY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389d84ee-51d7-47a5-b7f2-61309db97551_739x220.png 1272w, https://substackcdn.com/image/fetch/$s_!p7NY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389d84ee-51d7-47a5-b7f2-61309db97551_739x220.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><a href="https://www.dataengineeringweekly.com/p/data-engineering-after-ai">Source</a></figcaption></figure></div><p><strong>&#128161; Why is this relevant for 
DEs?</strong></p><ul><li><p style="text-align: justify;"><strong>Shift in core responsibilities:</strong> Data engineers should focus more on designing architectures that preserve and govern the semantic meaning of data across systems. </p></li><li><p style="text-align: justify;"><strong>Need for semantic and governance infrastructure:</strong> With ECL, data engineers need to build and manage data contracts, lineage systems, and context stores that ensure data definitions remain consistent, versioned, and trustworthy as data moves through multiple transformation layers. </p></li><li><p style="text-align: justify;"><strong>Emergence of a new role</strong>: The role is evolving from pipeline engineer to &#8220;context architect,&#8221; with data engineers designing the contextual frameworks that allow AI systems and downstream applications to interpret and use data reliably.</p></li></ul><h4><a href="https://blog.bytebytego.com/p/how-openai-scaled-to-800-million">OpenAI: Scaling to 800 Million Users With Postgres - P99 latency</a></h4><blockquote><p><strong>&#128214; Topic</strong>: Databases<br>&#129504; <strong>Level</strong>: Intermediate</p></blockquote><p><strong>Summary</strong>: OpenAI scaled PostgreSQL to support over 800 million ChatGPT users using a single primary database with dozens of read replicas, skipping complex sharding entirely. By focusing on reducing load on the primary writer, optimizing queries and connections, and preventing cascading failures, they achieved low double-digit millisecond P99 latency and 99.999% availability. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WZzY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F641a0ccc-ef3e-4324-a824-273d9c1e0cf4_759x430.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WZzY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F641a0ccc-ef3e-4324-a824-273d9c1e0cf4_759x430.png 424w, https://substackcdn.com/image/fetch/$s_!WZzY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F641a0ccc-ef3e-4324-a824-273d9c1e0cf4_759x430.png 848w, https://substackcdn.com/image/fetch/$s_!WZzY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F641a0ccc-ef3e-4324-a824-273d9c1e0cf4_759x430.png 1272w, https://substackcdn.com/image/fetch/$s_!WZzY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F641a0ccc-ef3e-4324-a824-273d9c1e0cf4_759x430.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WZzY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F641a0ccc-ef3e-4324-a824-273d9c1e0cf4_759x430.png" width="719" height="407.33860342555994" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/641a0ccc-ef3e-4324-a824-273d9c1e0cf4_759x430.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:430,&quot;width&quot;:759,&quot;resizeWidth&quot;:719,&quot;bytes&quot;:118937,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/188303152?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F641a0ccc-ef3e-4324-a824-273d9c1e0cf4_759x430.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!WZzY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F641a0ccc-ef3e-4324-a824-273d9c1e0cf4_759x430.png 424w, https://substackcdn.com/image/fetch/$s_!WZzY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F641a0ccc-ef3e-4324-a824-273d9c1e0cf4_759x430.png 848w, https://substackcdn.com/image/fetch/$s_!WZzY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F641a0ccc-ef3e-4324-a824-273d9c1e0cf4_759x430.png 1272w, https://substackcdn.com/image/fetch/$s_!WZzY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F641a0ccc-ef3e-4324-a824-273d9c1e0cf4_759x430.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>&#128161; Why is this relevant for DEs?</strong></p><ul><li><p style="text-align: justify;"><strong>Optimize before adding complexity:</strong> OpenAI showed that tuning PostgreSQL through query optimization, caching, and connection pooling can delay or eliminate the need for sharding and distributed systems.</p></li><li><p style="text-align: justify;"><strong>Design for your actual workload:</strong> ChatGPT&#8217;s workload is largely read-heavy, so read replicas and caching worked well. 
Match your architecture to what your system actually does, not to generic best practices.</p></li><li><p style="text-align: justify;"><strong>Reliability is key:</strong> Tools like PgBouncer, rate limiting, and workload isolation show that modern data engineering is as much about keeping systems stable as it is about building pipelines.</p></li></ul><h4><strong><a href="https://netflixtechblog.com/scaling-llm-post-training-at-netflix-0046f8790194">Netflix: Scaling LLM Post-Training</a></strong></h4><blockquote><p><strong>&#128214; Topic</strong>: Data Engineering &amp; AI<br>&#129504; <strong>Level</strong>: Intermediate</p></blockquote><p style="text-align: justify;"><strong>Summary:</strong> Netflix built an LLM post-training framework to scale the adaptation of foundation models for production use cases such as recommendations, personalization, and search. Pre-trained models must be adapted to understand Netflix&#8217;s catalog and user behavior, but doing this at Netflix scale, with massive data pipelines and distributed GPU clusters, is a major engineering challenge. The framework, built on their ML platform Mako using PyTorch, Ray, and vLLM, enables techniques like fine-tuning, reinforcement learning, preference optimization, and knowledge distillation. 
The framework is organized around four pillars: <strong>data, model, compute, and workflow</strong>, providing a unified way to manage datasets, shard models, orchestrate GPUs, and run multi-stage training pipelines.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oDE9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2fcb3e0-3016-47ab-ad58-a4cab281ce3d_703x261.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oDE9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2fcb3e0-3016-47ab-ad58-a4cab281ce3d_703x261.png 424w, https://substackcdn.com/image/fetch/$s_!oDE9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2fcb3e0-3016-47ab-ad58-a4cab281ce3d_703x261.png 848w, https://substackcdn.com/image/fetch/$s_!oDE9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2fcb3e0-3016-47ab-ad58-a4cab281ce3d_703x261.png 1272w, https://substackcdn.com/image/fetch/$s_!oDE9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2fcb3e0-3016-47ab-ad58-a4cab281ce3d_703x261.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oDE9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2fcb3e0-3016-47ab-ad58-a4cab281ce3d_703x261.png" width="703" height="261" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b2fcb3e0-3016-47ab-ad58-a4cab281ce3d_703x261.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:261,&quot;width&quot;:703,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:155097,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/188303152?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2873a576-4917-474a-bc31-2a146b7e943c_703x261.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oDE9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2fcb3e0-3016-47ab-ad58-a4cab281ce3d_703x261.png 424w, https://substackcdn.com/image/fetch/$s_!oDE9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2fcb3e0-3016-47ab-ad58-a4cab281ce3d_703x261.png 848w, https://substackcdn.com/image/fetch/$s_!oDE9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2fcb3e0-3016-47ab-ad58-a4cab281ce3d_703x261.png 1272w, https://substackcdn.com/image/fetch/$s_!oDE9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2fcb3e0-3016-47ab-ad58-a4cab281ce3d_703x261.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://netflixtechblog.com/scaling-llm-post-training-at-netflix-0046f8790194">Source</a></figcaption></figure></div><p><strong>&#128161; Why is this relevant for DEs?</strong></p><ul><li><p style="text-align: justify;"><strong>Data pipelines are foundational to LLM training</strong>: Post-training workflows depend on well-prepared data: curated datasets, proper tokenization, and efficient streaming to distributed training systems. Data engineers own the pipelines that handle all of this, from selecting and transforming domain-specific data to streaming it into distributed training.</p></li><li><p style="text-align: justify;"><strong>LLM systems require scalable data infrastructure</strong>: Netflix&#8217;s framework underscores the need for distributed workflows to coordinate across GPUs, storage, and orchestration layers. 
Data engineers are central to building the plumbing that moves, batches, and serves data reliably for both training and inference.</p></li></ul><h4><a href="https://www.linkedin.com/blog/engineering/infrastructure/rethinking-hfds-block-placement-for-exabyte-scale-clusters">LinkedIn: Maintaining exabyte-scale Hadoop clusters</a></h4><blockquote><p><strong>&#128214; Topic</strong>: Data Infrastructure<br>&#129504; <strong>Level</strong>: Advanced</p></blockquote><p style="text-align: justify;"><strong>Summary:</strong> LinkedIn re-engineered the block placement strategy in Apache Hadoop&#8217;s HDFS to support exabyte-scale clusters storing about <strong>5 exabytes of data and 10 billion objects</strong> while maintaining <strong>99.99% availability</strong>. As clusters expanded to thousands of nodes, the default rack-based replication policy caused heavy overhead during maintenance because large volumes of data had to be re-replicated before nodes could go offline.</p><p style="text-align: justify;">To address this, LinkedIn introduced <strong>upgrade domains</strong>, logical groupings of datanodes that distribute replicas across broader failure boundaries than racks. 
By updating the block placement policy without disrupting production traffic, LinkedIn eliminated large-scale replication during maintenance, reducing network congestion, speeding up upgrades, and allowing maintenance on up to<strong> 4.5% of datanodes per day</strong> while maintaining performance and reliability.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JQlb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81e9318-5e27-4589-add1-70c5bbe2f69a_728x550.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JQlb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81e9318-5e27-4589-add1-70c5bbe2f69a_728x550.png 424w, https://substackcdn.com/image/fetch/$s_!JQlb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81e9318-5e27-4589-add1-70c5bbe2f69a_728x550.png 848w, https://substackcdn.com/image/fetch/$s_!JQlb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81e9318-5e27-4589-add1-70c5bbe2f69a_728x550.png 1272w, https://substackcdn.com/image/fetch/$s_!JQlb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81e9318-5e27-4589-add1-70c5bbe2f69a_728x550.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JQlb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81e9318-5e27-4589-add1-70c5bbe2f69a_728x550.png" width="728" height="550" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c81e9318-5e27-4589-add1-70c5bbe2f69a_728x550.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:550,&quot;width&quot;:728,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:90462,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/188303152?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81e9318-5e27-4589-add1-70c5bbe2f69a_728x550.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JQlb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81e9318-5e27-4589-add1-70c5bbe2f69a_728x550.png 424w, https://substackcdn.com/image/fetch/$s_!JQlb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81e9318-5e27-4589-add1-70c5bbe2f69a_728x550.png 848w, https://substackcdn.com/image/fetch/$s_!JQlb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81e9318-5e27-4589-add1-70c5bbe2f69a_728x550.png 1272w, https://substackcdn.com/image/fetch/$s_!JQlb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81e9318-5e27-4589-add1-70c5bbe2f69a_728x550.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://www.linkedin.com/blog/engineering/infrastructure/rethinking-hfds-block-placement-for-exabyte-scale-clusters">Source</a></figcaption></figure></div><p><strong>&#128161; Why is this relevant for DEs?</strong></p><ul><li><p style="text-align: justify;"><strong>Data reliability and availability at massive scale:</strong> Data engineers working with distributed storage systems such as Apache Hadoop must design architectures that ensure high data availability, redundancy, and fault tolerance, even when clusters contain thousands of nodes and exabytes of data.</p></li><li><p style="text-align: justify;"><strong>Operational scalability is a core data engineering challenge:</strong> As data platforms grow, routine operations such as hardware upgrades, patching, and cluster maintenance must be redesigned to 
avoid massive data movement or downtime, highlighting the need for an architecture that scales operationally as well as technically.</p></li></ul><h4><a href="https://www.analytics8.com/blog/elements-of-a-data-strategy/">Elements of a Modern Data Strategy</a></h4><blockquote><p><strong>&#128214; Topic</strong>: Data Strategy<br>&#129504; <strong>Level</strong>: Intermediate</p></blockquote><p style="text-align: justify;"><strong>Summary: </strong>A modern data strategy is what separates organizations that talk about data from those that actually drive results with it. It aligns people, processes, and technology across five foundational pillars:</p><ul><li><p style="text-align: justify;">Tying every data initiative directly to business outcomes through deep stakeholder engagement.</p></li><li><p style="text-align: justify;">Selecting a tech stack that scales and works as a cohesive ecosystem.</p></li><li><p style="text-align: justify;">Embedding governance for high-quality data and trust.</p></li><li><p style="text-align: justify;">Investing in talent with clear roles and continuous enablement.</p></li><li><p style="text-align: justify;">Laying out a prioritized roadmap that balances quick wins with long-term transformation.
</p></li></ul><p style="text-align: justify;">Without this foundation, organizations end up with fragmented data, slow decision-making, shelfware investments, and an inability to capitalize on AI and automation when it matters most.</p><p><strong>&#128161; Why is this relevant for DEs?</strong></p><ul><li><p style="text-align: justify;">The data stack, pipelines, storage architecture, and integration layers defined in a data strategy are largely designed and implemented by data engineers, making them central to turning strategy into working systems.</p></li><li><p style="text-align: justify;">Data engineers operationalize data governance, lineage, access controls, and reliable data pipelines, ensuring that data across the organization is trusted, consistent, and usable for analytics and AI.</p></li></ul><div><hr></div><h3>&#9874;&#65039; Data Engineering Open Forum 2026</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!K679!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf85f342-1bd6-4509-8ec4-6f276476ec6a_1920x1080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!K679!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf85f342-1bd6-4509-8ec4-6f276476ec6a_1920x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!K679!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf85f342-1bd6-4509-8ec4-6f276476ec6a_1920x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!K679!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf85f342-1bd6-4509-8ec4-6f276476ec6a_1920x1080.jpeg 
1272w, https://substackcdn.com/image/fetch/$s_!K679!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf85f342-1bd6-4509-8ec4-6f276476ec6a_1920x1080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K679!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf85f342-1bd6-4509-8ec4-6f276476ec6a_1920x1080.jpeg" width="499" height="280.6875" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/af85f342-1bd6-4509-8ec4-6f276476ec6a_1920x1080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:499,&quot;bytes&quot;:165482,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/188303152?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf85f342-1bd6-4509-8ec4-6f276476ec6a_1920x1080.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!K679!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf85f342-1bd6-4509-8ec4-6f276476ec6a_1920x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!K679!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf85f342-1bd6-4509-8ec4-6f276476ec6a_1920x1080.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!K679!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf85f342-1bd6-4509-8ec4-6f276476ec6a_1920x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!K679!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf85f342-1bd6-4509-8ec4-6f276476ec6a_1920x1080.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: justify;">The Data Engineering Open Forum (DEOF) is a community-driven conference featuring in-depth sessions 
that address the real challenges and innovations in data engineering today. The talks cover a wide variety of topics, including AI agents, multimodal data, data lineage, data observability, the semantic layer, and more.</p><p style="text-align: justify;">Out of a stacked lineup, the two sessions below caught my eye.</p><ul><li><p style="text-align: justify;"><strong><a href="https://www.dataengineeringopenforum.com/?session=from-manual-to-magical-building-ai-agents-for-etl-automation#agenda">From Manual to Magical: Building AI Agents for ETL Automation by Himanshi Manglunia, Senior Data Engineer @ AWS</a>:</strong> This talk demonstrates a production-grade, autonomous ETL system in which AI agents (powered by Kiro and Claude) handle end-to-end pipeline changes.</p></li><li><p style="text-align: justify;"><strong><a href="https://www.dataengineeringopenforum.com/?session=inside-openais-internal-ai-data-agent#agenda">Inside OpenAI&#8217;s Internal AI Data Agent by Bonnie Xu, Staff Software Engineer @ OpenAI</a>:</strong> This session provides a look under the hood of OpenAI&#8217;s internal data agent, a tool that lets employees turn questions into insights in minutes. 
Xu unpacks the agent&#8217;s core architecture, the multiple layers of context it uses to answer queries, and how the team keeps that context updated with almost zero manual intervention.</p></li></ul><p>&#128203; Agenda and more details <strong><a href="https://www.dataengineeringopenforum.com/?utm_source=newsletter">HERE</a></strong>.</p><p>&#128073; <a href="https://luma.com/deof2026?coupon=DETCOMMUNITY">RSVP</a> before March 22 to get an exclusive DET community discount (33% off).</p><div><hr></div><h3><strong>&#128142; Open Source Gems</strong></h3><p><strong><a href="https://cube.dev/product/cube-core">Cube Core: Semantic Layer</a></strong></p><p style="text-align: justify;">Cube Core is an open-source semantic layer that lets organizations define metrics, dimensions, and business logic once and reuse them across BI tools, embedded analytics, and AI agents through standard REST, GraphQL, and SQL APIs. It works with all major SQL data sources (Snowflake, Databricks, BigQuery, Postgres, and more) and includes a built-in caching engine for sub-second query performance.</p><p>&#128161; <strong>Why is this useful for DEs</strong>?</p><ul><li><p style="text-align: justify;">With Cube Core, data engineers define metrics and business logic in one place, and every tool (BI dashboards, applications, and AI agents) pulls from the same definitions.</p></li><li><p style="text-align: justify;">Most organizations run multiple BI tools and data platforms. 
A semantic layer sits between them and provides a single, consistent way to query data, eliminating duplicate logic and unnecessary complexity.</p></li></ul><p><strong>GitHub:</strong> <a href="https://github.com/cube-js/cube">https://github.com/cube-js/cube</a></p><div><hr></div><h3><strong>&#128161; DE Tip of the Month </strong></h3><h3><strong><a href="https://docs.getdbt.com/docs/build/incremental-microbatch?version=1.10">Processing large time-series datasets in dbt </a></strong></h3><p style="text-align: justify;">Microbatch incremental models in dbt efficiently process large time-series datasets by splitting transformations into small, time-bounded batches based on an event_time column.</p><ul><li><p style="text-align: justify;">Atomic, idempotent batch execution: Each batch represents a self-contained unit of work that can run independently, be retried if it fails, and even execute in parallel for faster data processing.</p></li><li><p style="text-align: justify;">Simpler incremental logic: Unlike traditional incremental models that require custom SQL conditions, microbatch automatically determines which batches to run, simplifying model design.</p></li><li><p style="text-align: justify;">Flexible backfills and failure recovery: Engineers can easily reprocess historical data or retry failed batches by specifying time ranges (--event-time-start and --event-time-end) without rebuilding the entire dataset.</p></li></ul><div><hr></div><h3>&#128202; Community Poll</h3><div class="poll-embed" data-attrs="{&quot;id&quot;:473261}" data-component-name="PollToDOM"></div><p>Until next time, cheers!</p><p><a href="https://www.linkedin.com/in/anandaganesh/">Ananda</a>, <a href="https://www.linkedin.com/in/sukanyawadawadagi/">Sukanya</a> &amp; <a href="https://www.linkedin.com/in/vjanz/">Volker</a></p><div><hr></div><h4>&#8505;&#65039; About Data Engineer Things</h4><p><a href="https://www.dataengineerthings.org/">Data Engineer Things</a> (DET) is a global community 
built by data engineers for data engineers. Subscribe to the <a href="https://dataengineerthings.substack.com/">newsletter</a> and follow us on <a href="https://www.linkedin.com/company/data-engineer-things/posts/?feedView=all">LinkedIn</a> to gain access to exclusive learning resources and networking opportunities, including articles, webinars, meetups, conferences, mentorship, and much more.</p>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter - Community Spotlight Edition (Mar 2026)]]></title><description><![CDATA[Why nobody believes "100x faster" benchmarks, and why the dedicated graph database might be dead.]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-community-324</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-community-324</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Thu, 05 Mar 2026 16:03:09 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!LK4R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd94751f3-fc94-4f3f-a7b0-a93bcbbbbcaa_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LK4R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd94751f3-fc94-4f3f-a7b0-a93bcbbbbcaa_1456x1048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LK4R!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd94751f3-fc94-4f3f-a7b0-a93bcbbbbcaa_1456x1048.png 424w, https://substackcdn.com/image/fetch/$s_!LK4R!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd94751f3-fc94-4f3f-a7b0-a93bcbbbbcaa_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!LK4R!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd94751f3-fc94-4f3f-a7b0-a93bcbbbbcaa_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!LK4R!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd94751f3-fc94-4f3f-a7b0-a93bcbbbbcaa_1456x1048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LK4R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd94751f3-fc94-4f3f-a7b0-a93bcbbbbcaa_1456x1048.png" width="1456" height="1048" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d94751f3-fc94-4f3f-a7b0-a93bcbbbbcaa_1456x1048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:270152,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/189008910?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd94751f3-fc94-4f3f-a7b0-a93bcbbbbcaa_1456x1048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LK4R!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd94751f3-fc94-4f3f-a7b0-a93bcbbbbcaa_1456x1048.png 424w, https://substackcdn.com/image/fetch/$s_!LK4R!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd94751f3-fc94-4f3f-a7b0-a93bcbbbbcaa_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!LK4R!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd94751f3-fc94-4f3f-a7b0-a93bcbbbbcaa_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!LK4R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd94751f3-fc94-4f3f-a7b0-a93bcbbbbcaa_1456x1048.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Hi everyone,</p><p>This series is designed for one outcome: <strong>clear lessons on how experienced builders approach data engineering problems</strong>, not just a founder story or a project overview, so that our DET community can build reusable mental models for thinking about scale and operability under real constraints.</p><p>To kick things off, we sat down with <strong>Weimo Liu</strong>, co-founder of <strong>PuppyGraph</strong>, whose career spans database research, <strong>TigerGraph</strong>, and Google&#8217;s <strong>F1</strong> team. His perspective is a perfect stress test for the themes we care about: why graph ideas have been academically compelling for decades, why production adoption is still hard, and how &#8220;scale&#8221; in industry changes what success even means. 
In this interview, Weimo walks through the shift from chasing <strong>benchmark wins to optimizing for cost and operability</strong>&#8212;and shares a provocative approach to graphs: running graph queries directly on modern table formats like <strong>Apache Iceberg</strong>, instead of asking enterprises to migrate and reload everything.</p><p>Let&#8217;s dive in and walk through Weimo&#8217;s journey.</p><p>- Swetha Sekhar</p><div><hr></div><h3><strong>&#127903;&#65039; Conference: Data Engineering Open Forum 2026</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!e4SV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e374f98-924c-4cf5-b46c-553e385156f6_1134x327.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!e4SV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e374f98-924c-4cf5-b46c-553e385156f6_1134x327.png 424w, https://substackcdn.com/image/fetch/$s_!e4SV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e374f98-924c-4cf5-b46c-553e385156f6_1134x327.png 848w, https://substackcdn.com/image/fetch/$s_!e4SV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e374f98-924c-4cf5-b46c-553e385156f6_1134x327.png 1272w, https://substackcdn.com/image/fetch/$s_!e4SV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e374f98-924c-4cf5-b46c-553e385156f6_1134x327.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!e4SV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e374f98-924c-4cf5-b46c-553e385156f6_1134x327.png" width="1134" height="327" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7e374f98-924c-4cf5-b46c-553e385156f6_1134x327.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:327,&quot;width&quot;:1134,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:254314,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/189008910?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e374f98-924c-4cf5-b46c-553e385156f6_1134x327.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!e4SV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e374f98-924c-4cf5-b46c-553e385156f6_1134x327.png 424w, https://substackcdn.com/image/fetch/$s_!e4SV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e374f98-924c-4cf5-b46c-553e385156f6_1134x327.png 848w, https://substackcdn.com/image/fetch/$s_!e4SV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e374f98-924c-4cf5-b46c-553e385156f6_1134x327.png 1272w, 
https://substackcdn.com/image/fetch/$s_!e4SV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e374f98-924c-4cf5-b46c-553e385156f6_1134x327.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>We are hosting the 3rd Data Engineering Open Forum (DEOF) on April 16 in San Francisco! 
Here is what you can expect at the event:</p><ul><li><p>Sessions by speakers who are solving cutting-edge problems in data engineering, for example, Apache project creators and PMC members like <a href="https://www.linkedin.com/in/julienledem/">Julien Le Dem</a>, <a href="https://www.linkedin.com/in/boyang-jerry-peng/">Boyang Jerry Peng</a>, and <a href="https://www.linkedin.com/in/yezhaoqin/">Jack Ye</a>.</p></li><li><p>Intentionally designed activities you can sign up for to network in small groups (because we know it&#8217;s hard to strike up conversations at a conference).</p></li><li><p>Opportunities to connect with data engineering teams at top tech companies (like <strong>Netflix</strong>, <strong>Airbnb</strong>, and more) at their booths.</p></li></ul><p>Our ultimate goal is that when you look back someday, you can confidently say &#8220;I&#8217;m so glad I went to DEOF&#8221; because the people you met there or the ideas you walked away with made a difference in your career.</p><p>&#128073; See the agenda <strong><a href="https://www.dataengineeringopenforum.com/">HERE</a></strong>. <strong><a href="https://luma.com/deof2026?utm_source=newsletter-march-3">RSVP</a></strong> before the Early Bird price ends on March 11.</p><div><hr></div><h3>Spotlight: Weimo Liu</h3><div class="pullquote"><p>&#8220;I realized users weren&#8217;t really looking for a graph database, they were looking for the graph itself.&#8221;</p></div><blockquote><p><em>For those in the DET community who may not know you yet, could you briefly introduce yourself?</em></p></blockquote><p>Hi everyone, I&#8217;m <a href="https://www.linkedin.com/in/weimoliu/">Weimo</a>. It sounds like the &#8220;self-driving&#8221; car, Waymo. I&#8217;m the co-founder of <a href="https://www.puppygraph.com/">PuppyGraph</a>. Before starting PuppyGraph, I worked at TigerGraph and on Google&#8217;s F1 team. 
TigerGraph is a graph database startup, and F1 is Google&#8217;s internal unified SQL query engine, serving billions of queries per day. Thanks so much for having me. It&#8217;s a real pleasure to be here.</p><div><hr></div><blockquote><p><em>You&#8217;ve been working in the database world since your PhD years. Tell us about your career journey: what initially drew you to databases, and what kept you in the space for so many years?</em></p></blockquote><p>Back in college, I did very well in my data structures class, and my professor invited me to join his lab. That led me into database research, where I published papers on spatial databases.</p><p>When I applied to PhD programs, I actually looked up professors who had published heavily at SIGMOD and VLDB in recent years, and found my advisor, Dr. Zhang.</p><p>After my PhD, I joined TigerGraph. The founding CTO was a close friend of Dr. Zhang, and I worked there for almost three years. Later, I joined Google, partly because it felt like a safe choice, and worked on the F1 team.</p><p><strong>What keeps me in databases is that the field is very concrete. The problems are understandable, progress is measurable, and when something improves, you can usually prove it.</strong></p><div><hr></div><blockquote><p><em>You once said that &#8220;half the database papers back then were about graphs.&#8221; What made graph workloads so captivating for researchers?</em></p></blockquote><p>First, graphs are hard. The data is highly connected, and it&#8217;s very difficult to shard and distribute efficiently, which constantly creates room for new algorithms and system designs.</p><p>Second, the motivation is very natural. 
<strong>Researchers don&#8217;t need to invent artificial use cases just to justify the work &#8212; real graph problems already exist everywhere, such as anti-fraud, network observability, social network analytics, and cybersecurity</strong>.</p><p>Third, graph theory in mathematics is extremely rich. Computer scientists can borrow powerful ideas from math and turn them into systems that are both interesting and practical.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2tDg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fc8e1a-b19b-4a52-bab9-ccf1f1065789_1514x730.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2tDg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fc8e1a-b19b-4a52-bab9-ccf1f1065789_1514x730.png 424w, https://substackcdn.com/image/fetch/$s_!2tDg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fc8e1a-b19b-4a52-bab9-ccf1f1065789_1514x730.png 848w, https://substackcdn.com/image/fetch/$s_!2tDg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fc8e1a-b19b-4a52-bab9-ccf1f1065789_1514x730.png 1272w, https://substackcdn.com/image/fetch/$s_!2tDg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fc8e1a-b19b-4a52-bab9-ccf1f1065789_1514x730.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!2tDg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fc8e1a-b19b-4a52-bab9-ccf1f1065789_1514x730.png" width="1514" height="730" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60fc8e1a-b19b-4a52-bab9-ccf1f1065789_1514x730.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:730,&quot;width&quot;:1514,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:149207,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/189008910?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5eea42a-b2c9-4574-a076-2c62ede33ddd_1514x730.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2tDg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fc8e1a-b19b-4a52-bab9-ccf1f1065789_1514x730.png 424w, https://substackcdn.com/image/fetch/$s_!2tDg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fc8e1a-b19b-4a52-bab9-ccf1f1065789_1514x730.png 848w, https://substackcdn.com/image/fetch/$s_!2tDg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fc8e1a-b19b-4a52-bab9-ccf1f1065789_1514x730.png 1272w, https://substackcdn.com/image/fetch/$s_!2tDg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60fc8e1a-b19b-4a52-bab9-ccf1f1065789_1514x730.png 
1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Source: <a href="https://puppygraph.com/">puppygraph.com</a></figcaption></figure></div><div><hr></div><blockquote><p><em>After your PhD, you took your first industry role at TigerGraph. What surprised you most about the difference between academic database research and real-world database engineering?</em></p></blockquote><p>In academia, performance is everything. You aim for benchmarks that are 10&#215; faster than the state of the art.</p><p>In industry, speed alone doesn&#8217;t mean much. Cost, scalability, and operability often matter far more. 
You can&#8217;t just spend unlimited resources to get a better benchmark number.</p><p>Also, no one really cares about &#8220;10&#215; faster,&#8221; or even &#8220;100&#215; faster.&#8221; <strong>If you walk around AWS re:Invent, every product claims to be 100&#215; better than the state of the art, without even defining what that state of the art is. There&#8217;s no peer review, and nobody truly believes those numbers</strong>.</p><div><hr></div><blockquote><p><em>Later at Google, you worked on database systems at a much larger scale. What kinds of problems were you solving there, and how did that experience change how you thought about data systems?</em></p></blockquote><p><a href="https://research.google/pubs/f1-a-distributed-sql-database-that-scales/">F1</a> is a federated query engine that can query almost every data source and format inside Google. It serves billions of queries per day and handles most of Google&#8217;s OLAP workloads &#8212; from the fastest, most expensive systems to the slowest, cheapest ones.</p><p>In the early days, Google had many different OLAP systems across different organizations. Over time, more and more teams connected their data sources to F1. Eventually, most of them were either deprecated or absorbed into the F1 ecosystem.</p><p>That experience made me realize how important unified federation really is.</p><div><hr></div><h4>&#128161; Editorial Note: F1</h4><p><em>F1 is Google&#8217;s globally distributed SQL database built on top of <a href="https://spanner.fyi/">Spanner</a>, giving full relational features (SQL, ACID, secondary indexes) at planetary scale. Spanner handles sharding, replication via <a href="https://en.wikipedia.org/wiki/Paxos_(computer_science)">Paxos</a>, and external consistency using TrueTime (a globally synchronized clock), while F1 provides a stateless SQL layer that parses, optimizes, and executes distributed query plans. 
Data is organized with hierarchical, interleaved tables to co-locate related rows in the same key ranges, reducing cross-shard transactions. Writes that span multiple shards use Spanner&#8217;s <a href="https://martinfowler.com/articles/patterns-of-distributed-systems/two-phase-commit.html">two-phase commit</a> across the participating Paxos groups, and reads use consistent snapshots. In short, Spanner provides globally consistent distributed storage, and F1 turns it into a fully featured distributed SQL database.</em></p><p>&#128073; Read the full F1 paper <strong><a href="https://paperhub.s3.amazonaws.com/8f4f0d6a5f3740a64d0a6a1df1bfade1.pdf">HERE</a></strong>.<br>&#128073; Read the full Spanner paper <strong><a href="https://storage.googleapis.com/gweb-research2023-media/pubtools/pdf/44915.pdf">HERE</a></strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gAcg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddba257-18e3-4b02-abdd-3511eb9e9b67_511x390.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gAcg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddba257-18e3-4b02-abdd-3511eb9e9b67_511x390.png 424w, https://substackcdn.com/image/fetch/$s_!gAcg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddba257-18e3-4b02-abdd-3511eb9e9b67_511x390.png 848w, https://substackcdn.com/image/fetch/$s_!gAcg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddba257-18e3-4b02-abdd-3511eb9e9b67_511x390.png 1272w, 
https://substackcdn.com/image/fetch/$s_!gAcg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddba257-18e3-4b02-abdd-3511eb9e9b67_511x390.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gAcg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddba257-18e3-4b02-abdd-3511eb9e9b67_511x390.png" width="511" height="390" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fddba257-18e3-4b02-abdd-3511eb9e9b67_511x390.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:390,&quot;width&quot;:511,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:53819,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/189008910?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddba257-18e3-4b02-abdd-3511eb9e9b67_511x390.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gAcg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddba257-18e3-4b02-abdd-3511eb9e9b67_511x390.png 424w, https://substackcdn.com/image/fetch/$s_!gAcg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddba257-18e3-4b02-abdd-3511eb9e9b67_511x390.png 848w, 
https://substackcdn.com/image/fetch/$s_!gAcg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddba257-18e3-4b02-abdd-3511eb9e9b67_511x390.png 1272w, https://substackcdn.com/image/fetch/$s_!gAcg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddba257-18e3-4b02-abdd-3511eb9e9b67_511x390.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Spanner architecture, <a 
href="https://storage.googleapis.com/gweb-research2023-media/pubtools/pdf/44915.pdf">source</a></figcaption></figure></div><div><hr></div><blockquote><p><em>What prompted you to write a new graph query framework? What problem did you feel wasn&#8217;t being solved, or wasn&#8217;t being solved in the right way?</em></p></blockquote><p>I had been thinking about this for a long time. At TigerGraph, many potential users showed strong interest in graph technology, but most of them couldn&#8217;t actually adopt it in production. For example, a large bank spent 18 months loading all of its data into the system &#8212; something that would be unrealistic for most enterprises. That told me something fundamental was wrong.</p><p>After joining Google&#8217;s F1 team, I realized these users weren&#8217;t really looking for a graph database; they were looking for the graph itself.</p><p>I didn&#8217;t start the project immediately because, unlike inside Google, the external data world wasn&#8217;t standardized. That changed when I read an <a href="https://a16z.com/announcement/investing-in-tabular/">a16z blog post</a> announcing that the Apache Iceberg team had left Netflix to found Tabular. I realized the timing was finally right.</p><p><strong>My co-founders and I actually reached out to the Apache Iceberg creators with a very simple demo: running graph queries directly on Iceberg, faster than most graph databases on the market. That surprised them too, since they hadn&#8217;t optimized Iceberg for graph workloads at all. 
They supported us a lot, not just on engineering, but also on go-to-market.</strong></p><div><hr></div><blockquote><p><em>During the development of the framework, what were some of the hardest technical problems you had to solve?</em></p></blockquote><p>Graph systems are notoriously difficult to scale, which is one of the main reasons they have not been widely adopted by enterprises with very large datasets.</p><p>We define the operators in a graph query plan as NodeOperators and EdgeOperators, where each operator takes a collection of nodes or edges as input and produces a collection of nodes or edges as output. PuppyGraph assumes that all graph queries, patterns, and algorithms can be expressed as combinations of these operators.</p><p><strong>PuppyGraph implements a rule-based optimizer for logical query execution plans and a hybrid rule-and-cost-based optimizer for physical execution plans. Because all inputs and outputs are collections, any individual operator can be evaluated with massively parallel processing (MPP) and vectorized execution.</strong></p><div><hr></div><blockquote><p><em>How did you find your co-founders and early team members, and what qualities did you look for when building the team?</em></p></blockquote><p>They&#8217;re all old friends. Our CTO was my college roommate, and our chief architect lived next door. They&#8217;re well-known competitive programming contestants, and honestly, much better programmers than I am.</p><p><strong>Trust mattered the most. In the early days, you need to move fast and collaborate extremely smoothly. We were lucky to have already built that trust years ago; it really feels like a reunion of old teammates.</strong></p><div><hr></div><blockquote><p><em>Has the graph problem space changed in response to the rise of AI? Where do you think the field is actually heading?</em></p></blockquote><p>Yes, very much so. Initially, we didn&#8217;t think of this problem space as being closely related to AI. 
But more and more AI companies and AI teams started reaching out.</p><p><strong>AI generates more data, and at the same time, it activates a lot of previously &#8220;cold&#8221; data. A human data analyst might be able to keep track of hundreds of tables, but AI systems can reason over thousands, or even more, simultaneously.</strong> Our view is that when you already have tables, you already have knowledge. You don&#8217;t need to build a separate knowledge graph &#8212; you can simply treat your existing data as a graph. That structure naturally provides context to LLMs.</p><div><hr></div><blockquote><p><em>Before we let you go&#8230; who should we interview next, and why?</em></p></blockquote><p>Haha &#8212; I&#8217;d suggest Zhou Sun, the co-founder of Mooncake Labs. He&#8217;s sharp, opinionated, and deeply knowledgeable about databases. I think it would be a really engaging conversation.</p><div><hr></div><h3>Key Takeaways</h3><ul><li><p>The central reframing: many teams don&#8217;t need &#8220;a graph database&#8221;; they need <strong>graph-style queries over the data they already have</strong>, and open table formats like <strong>Apache Iceberg</strong> make that approach more feasible.</p></li><li><p><strong>AI shifts the bottleneck from &#8220;finding data&#8221; to &#8220;connecting data with meaningful relationships.&#8221;</strong> When AI can reason across thousands of tables, the problem becomes building reliable structure and context across them.</p></li><li><p><strong>Weimo&#8217;s differentiator is translation:</strong> turning research-grade system design into something enterprises can run, where operability and time-to-value matter as much as performance.</p></li></ul><div><hr></div><h3>Community poll</h3><div class="poll-embed" data-attrs="{&quot;id&quot;:462916}" data-component-name="PollToDOM"></div><p></p><div><hr></div><h3>&#128172; How to stay connected</h3><ul><li><p><a 
href="https://www.linkedin.com/in/weimoliu/">LinkedIn</a></p></li><li><p><a href="https://www.puppygraph.com/">PuppyGraph</a></p></li></ul><div><hr></div><h3>&#8505;&#65039; About Data Engineer Things</h3><p><a href="https://www.dataengineerthings.org/">Data Engineer Things</a> (DET) is a global community built by data engineers for data engineers. Subscribe to the <a href="https://dataengineerthings.substack.com/">newsletter</a> and follow us on <a href="https://www.linkedin.com/company/data-engineer-things/posts/?feedView=all">LinkedIn</a> to gain access to exclusive learning resources and networking opportunities, including articles, webinars, meetups, conferences, mentorship, and much more.</p>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter - Data Pulse Edition (Feb 2026)]]></title><description><![CDATA[Data Movement at Netflix, Uber's Trillion-Record Lake, AI Skills for Agents and Building Your Brand in Data Engineering]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-data-fef</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-data-fef</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Tue, 17 Feb 2026 16:02:48 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!bDiI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb746ef5b-d76d-42e5-8078-51fd0671bdc6_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bDiI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb746ef5b-d76d-42e5-8078-51fd0671bdc6_1456x1048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bDiI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb746ef5b-d76d-42e5-8078-51fd0671bdc6_1456x1048.png 424w, 
https://substackcdn.com/image/fetch/$s_!bDiI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb746ef5b-d76d-42e5-8078-51fd0671bdc6_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!bDiI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb746ef5b-d76d-42e5-8078-51fd0671bdc6_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!bDiI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb746ef5b-d76d-42e5-8078-51fd0671bdc6_1456x1048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bDiI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb746ef5b-d76d-42e5-8078-51fd0671bdc6_1456x1048.png" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b746ef5b-d76d-42e5-8078-51fd0671bdc6_1456x1048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:209937,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/185447857?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb746ef5b-d76d-42e5-8078-51fd0671bdc6_1456x1048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!bDiI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb746ef5b-d76d-42e5-8078-51fd0671bdc6_1456x1048.png 424w, https://substackcdn.com/image/fetch/$s_!bDiI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb746ef5b-d76d-42e5-8078-51fd0671bdc6_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!bDiI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb746ef5b-d76d-42e5-8078-51fd0671bdc6_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!bDiI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb746ef5b-d76d-42e5-8078-51fd0671bdc6_1456x1048.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Hello Folks,</p><p>Great to connect with you through this month&#8217;s newsletter! I hope the resolutions you set at the start of the year are still going strong.</p><p>I'm writing this from Coimbatore, India, where the cooler winter mornings are slowly transitioning to warmer and brighter days. There's something I love about this time of year here - clear skies, fresh energy in the air, and that sense of momentum building as the season shifts.</p><p>That momentum reminds me of data engineering. The big transformations rarely happen overnight, but every new tool we explore, every system we scale, and every challenge we solve adds up over time. It&#8217;s that steady, consistent work that fuels real progress. And through it all, we're building something meaningful: data systems people can actually rely on.</p><p>Outside of work, I enjoy reading, travelling, and watching movies - small ways to recharge and stay curious. In many ways, this community offers the same: a space to learn from each other and continue growing together.</p><p>This edition is packed with ideas and resources from across the community. 
I hope you find something here that sparks a new idea or gives you a fresh perspective.</p><p>Happy reading, and thank you for being on this journey with us.</p><p>- Sri</p><div><hr></div><h3><strong>&#128218;</strong> Data Pulse</h3><h4><a href="https://netflixtechblog.medium.com/data-bridge-how-netflix-simplifies-data-movement-36d10d91c313">Netflix: Simplify Data Movement using Data Bridge</a></h4><blockquote><p><strong>&#128214; Topic</strong>: Data Ingestion<br>&#129504; <strong>Level</strong>: Intermediate</p></blockquote><p><strong>Summary: </strong>Netflix introduces <strong>Data Bridge</strong>, a unified control plane that standardizes how data is moved across its vast ecosystem of data stores. Instead of teams building custom pipelines for every new data movement use case, Data Bridge separates what data needs to move and why from the implementation details of how it is moved. Data engineers declare their intent once, and the platform handles routing, execution, and operational concerns behind the scenes.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1KIZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea2b0ab4-8b9e-497c-a889-4b3ebc3e5ed5_1275x582.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1KIZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea2b0ab4-8b9e-497c-a889-4b3ebc3e5ed5_1275x582.webp 424w, https://substackcdn.com/image/fetch/$s_!1KIZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea2b0ab4-8b9e-497c-a889-4b3ebc3e5ed5_1275x582.webp 848w, 
https://substackcdn.com/image/fetch/$s_!1KIZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea2b0ab4-8b9e-497c-a889-4b3ebc3e5ed5_1275x582.webp 1272w, https://substackcdn.com/image/fetch/$s_!1KIZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea2b0ab4-8b9e-497c-a889-4b3ebc3e5ed5_1275x582.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1KIZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea2b0ab4-8b9e-497c-a889-4b3ebc3e5ed5_1275x582.webp" width="1275" height="582" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ea2b0ab4-8b9e-497c-a889-4b3ebc3e5ed5_1275x582.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:582,&quot;width&quot;:1275,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:32228,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/185447857?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea2b0ab4-8b9e-497c-a889-4b3ebc3e5ed5_1275x582.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1KIZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea2b0ab4-8b9e-497c-a889-4b3ebc3e5ed5_1275x582.webp 424w, 
https://substackcdn.com/image/fetch/$s_!1KIZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea2b0ab4-8b9e-497c-a889-4b3ebc3e5ed5_1275x582.webp 848w, https://substackcdn.com/image/fetch/$s_!1KIZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea2b0ab4-8b9e-497c-a889-4b3ebc3e5ed5_1275x582.webp 1272w, https://substackcdn.com/image/fetch/$s_!1KIZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea2b0ab4-8b9e-497c-a889-4b3ebc3e5ed5_1275x582.webp 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Data Bridge control plane, <a href="https://netflixtechblog.medium.com/data-bridge-how-netflix-simplifies-data-movement-36d10d91c313">source</a></figcaption></figure></div><p><strong>&#128161; Why is this relevant for DEs?</strong></p><ul><li><p><strong>Platform over Pipelines:</strong> It helps data engineers move from building siloed data movement pipelines to designing a centralized orchestration layer. This enables reuse, governance, and delivery at scale, rather than reinventing the wheel for every use case.</p></li><li><p><strong>Self-Service:</strong> New teams and data stores can plug into the system without reinventing ingestion or replication patterns.</p></li><li><p><strong>Operational Consistency:</strong> Built-in handling for retries, monitoring, and failure management keeps reliability consistent across all data transfers.</p></li><li><p><strong>Reduced Fragmentation:</strong> It consolidates fragmented data movement systems into one standardized approach.</p></li></ul><h4><a href="http://varianceexplained.org/r/start-blog/">Advice to Start a Blog</a></h4><blockquote><p><strong>&#128214; Topic</strong>: Brand Building<br>&#129504; <strong>Level</strong>: All levels</p></blockquote><p><strong>Summary:</strong> David Robinson argues that aspiring data scientists (<em>note: valid for DEs as well</em>) should start blogging as a key strategy for breaking into the field. Rather than just completing courses, candidates should publicly share analyses, tutorials, and projects on topics they find interesting. 
Blogging serves three critical purposes: it provides hands-on practice with real-world data analysis and communication skills; it creates a portfolio that demonstrates capabilities to potential employers better than resumes alone; and it generates feedback from the community while helping build a professional network. Robinson emphasizes that posts don&#8217;t need to be perfect! Sharing any public work is valuable, and even simple explanations of concepts you&#8217;ve mastered can resonate with audiences.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!otds!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe966789-fdf7-4a86-8e33-1a257389f495_639x330.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!otds!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe966789-fdf7-4a86-8e33-1a257389f495_639x330.jpeg 424w, https://substackcdn.com/image/fetch/$s_!otds!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe966789-fdf7-4a86-8e33-1a257389f495_639x330.jpeg 848w, https://substackcdn.com/image/fetch/$s_!otds!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe966789-fdf7-4a86-8e33-1a257389f495_639x330.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!otds!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe966789-fdf7-4a86-8e33-1a257389f495_639x330.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!otds!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe966789-fdf7-4a86-8e33-1a257389f495_639x330.jpeg" width="639" height="330" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be966789-fdf7-4a86-8e33-1a257389f495_639x330.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:330,&quot;width&quot;:639,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19248,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/185447857?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe966789-fdf7-4a86-8e33-1a257389f495_639x330.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!otds!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe966789-fdf7-4a86-8e33-1a257389f495_639x330.jpeg 424w, https://substackcdn.com/image/fetch/$s_!otds!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe966789-fdf7-4a86-8e33-1a257389f495_639x330.jpeg 848w, https://substackcdn.com/image/fetch/$s_!otds!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe966789-fdf7-4a86-8e33-1a257389f495_639x330.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!otds!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe966789-fdf7-4a86-8e33-1a257389f495_639x330.jpeg 1456w" 
sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Don&#8217;t be afraid to post, <a href="https://x.com/rundavidrun/status/587671657193455616">source</a></figcaption></figure></div><p><strong>&#128161; Why is this relevant for DEs?</strong></p><ul><li><p><strong>Portfolio &amp; Personal Brand:</strong> A blog provides concrete examples of your work that make interviews and applications more compelling than resumes alone, while establishing you as a thought leader and building a professional network that creates unexpected job opportunities.</p></li></ul><ul><li><p><strong>Practice with Purpose:</strong> Blogging forces
you to work with real-world messy data and communicate findings, developing the exact skills employers need while revealing knowledge gaps and hidden strengths through community feedback.</p></li></ul><h4><a href="https://www.linkedin.com/blog/engineering/infrastructure/engineering-linkedins-job-ingestion-system-at-scale">LinkedIn: Engineering the Job Ingestion System at Scale</a></h4><blockquote><p><strong>&#128214; Topic</strong>: Data Platform<br>&#129504; <strong>Level</strong>: Intermediate</p></blockquote><p><strong>Summary: </strong>LinkedIn shares how they built a large-scale job ingestion system capable of processing millions of job postings from diverse sources reliably and efficiently. Instead of relying on fragmented ingestion workflows, LinkedIn designed a unified, scalable ingestion architecture that standardizes parsing, validation, enrichment, and indexing of job data. The system focuses on handling high-volume, heterogeneous inputs while ensuring data quality, low latency, and operational stability across downstream search and recommendation systems.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ojgw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F293d89c0-527b-4214-ba20-eec0bfbdb204_1920x792.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ojgw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F293d89c0-527b-4214-ba20-eec0bfbdb204_1920x792.png 424w, https://substackcdn.com/image/fetch/$s_!ojgw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F293d89c0-527b-4214-ba20-eec0bfbdb204_1920x792.png 848w, 
https://substackcdn.com/image/fetch/$s_!ojgw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F293d89c0-527b-4214-ba20-eec0bfbdb204_1920x792.png 1272w, https://substackcdn.com/image/fetch/$s_!ojgw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F293d89c0-527b-4214-ba20-eec0bfbdb204_1920x792.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ojgw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F293d89c0-527b-4214-ba20-eec0bfbdb204_1920x792.png" width="1456" height="601" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/293d89c0-527b-4214-ba20-eec0bfbdb204_1920x792.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:601,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Job ingestion flow&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Job ingestion flow" title="Job ingestion flow" srcset="https://substackcdn.com/image/fetch/$s_!ojgw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F293d89c0-527b-4214-ba20-eec0bfbdb204_1920x792.png 424w, https://substackcdn.com/image/fetch/$s_!ojgw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F293d89c0-527b-4214-ba20-eec0bfbdb204_1920x792.png 848w, 
https://substackcdn.com/image/fetch/$s_!ojgw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F293d89c0-527b-4214-ba20-eec0bfbdb204_1920x792.png 1272w, https://substackcdn.com/image/fetch/$s_!ojgw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F293d89c0-527b-4214-ba20-eec0bfbdb204_1920x792.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Job ingestion flow, <a
href="https://www.linkedin.com/blog/engineering/infrastructure/engineering-linkedins-job-ingestion-system-at-scale">source</a></figcaption></figure></div><p><strong>&#128161; Why is this relevant for DEs?</strong></p><ul><li><p><strong>Scalable Ingestion Architecture: </strong>Instead of building ingestion logic per partner or per feed, LinkedIn invested in a standardized ingestion framework. This represents the move from ad hoc pipelines to platform engineering. The architecture supports horizontal scaling and distributed processing, allowing millions of records to be processed per day without degrading performance or increasing operational overhead.</p></li><li><p><strong>Data Quality at Ingestion Layer: </strong>Schema validation, normalization, and enrichment happen early in the pipeline, reducing downstream correction cycles and improving the reliability of search and recommendation systems.</p></li><li><p><strong>Handling Heterogeneous Sources: </strong>In real-world systems, inputs from different sources rarely follow a single format. A strong ingestion framework abstracts this variability and converts it into a unified internal schema. 
For DEs, this means designing flexible parsers, schema evolution strategies, and metadata-driven mappings that can scale without rewriting pipelines for every new source.</p></li><li><p><strong>Operational Resilience: </strong>The architecture is designed to manage failures, retries, and backpressure effectively, ensuring stability even during traffic spikes.</p></li></ul><h4><a href="https://www.uber.com/en-IN/blog/apache-hudi-at-uber/?uclick_id=9bc37d06-5ebb-43f5-b4ba-7817e58d1a0c">Uber: Engineering for Trillion-Record-Scale Data Lake Operations</a></h4><blockquote><p><strong>&#128214; Topic</strong>: Data Lake<br>&#129504; <strong>Level</strong>: Beginner</p></blockquote><p><strong>Summary: </strong>Uber shares how they adopted <a href="https://hudi.apache.org">Apache Hudi</a> to solve large-scale data lake challenges such as late-arriving data, incremental processing, and inconsistent batch pipelines. Hudi adds transactional semantics, upserts, and time travel capabilities on top of cloud object storage, enabling Uber to treat their data lake more like a database while retaining scalability.</p><p>Instead of rebuilding datasets from scratch, Uber uses Hudi&#8217;s incremental ingestion and record-level updates to efficiently manage continuously evolving datasets across thousands of pipelines.</p><p><strong>&#128161; Why is this relevant for DEs?</strong></p><ul><li><p><strong>Reliable Data Lakes (Lakehouse Foundations):</strong> Hudi brings ACID transactions, schema evolution, and rollback support to data lakes, which helps data engineers safely handle late data, reprocessing, and failures without breaking downstream pipelines.
</p></li><li><p><strong>Scales with Streaming and Batch:</strong> Hudi supports both streaming ingestion and batch workloads, helping teams unify real-time and batch pipelines under a single data store.</p></li><li><p><strong>Incremental Pipelines:</strong> It allows pipelines to process just the new or updated data since the previous run, instead of scanning terabytes of data in each run.</p></li><li><p><strong>Indexing and Fast Upserts:</strong> It allows data engineers to perform fast row-level updates across hundreds of partitions and billions of records.</p></li></ul><div><hr></div><h3>&#9874;&#65039; Workshop: Getting Started with Protobuf APIs</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KZiq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042fa021-f7b2-47f5-983e-4577ad5e536c_800x327.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KZiq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042fa021-f7b2-47f5-983e-4577ad5e536c_800x327.png 424w, https://substackcdn.com/image/fetch/$s_!KZiq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042fa021-f7b2-47f5-983e-4577ad5e536c_800x327.png 848w, https://substackcdn.com/image/fetch/$s_!KZiq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042fa021-f7b2-47f5-983e-4577ad5e536c_800x327.png 1272w, https://substackcdn.com/image/fetch/$s_!KZiq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042fa021-f7b2-47f5-983e-4577ad5e536c_800x327.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KZiq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042fa021-f7b2-47f5-983e-4577ad5e536c_800x327.png" width="599" height="244.84125" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/042fa021-f7b2-47f5-983e-4577ad5e536c_800x327.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:327,&quot;width&quot;:800,&quot;resizeWidth&quot;:599,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KZiq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042fa021-f7b2-47f5-983e-4577ad5e536c_800x327.png 424w, https://substackcdn.com/image/fetch/$s_!KZiq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042fa021-f7b2-47f5-983e-4577ad5e536c_800x327.png 848w, https://substackcdn.com/image/fetch/$s_!KZiq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042fa021-f7b2-47f5-983e-4577ad5e536c_800x327.png 1272w, https://substackcdn.com/image/fetch/$s_!KZiq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F042fa021-f7b2-47f5-983e-4577ad5e536c_800x327.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>If schema evolution keeps biting you in production (breaking changes, inconsistent validation, tooling sprawl), this Protobuf workshop is worth your time.
You will learn:</p><ul><li><p>How to adopt Protobuf to prevent breaking changes and evolve schemas safely across teams and languages</p></li><li><p>How to use Protobuf in event streaming and data pipelines for better data quality</p></li><li><p>Best practices for designing real-world APIs</p></li></ul><p>&#128073; Sign up for the workshop <strong><a href="https://fandf.co/4a4w0w9">HERE</a></strong>.</p><p><em>(This message is sponsored by Buf)</em></p><div><hr></div><h3><strong>&#128142; Open Source Gems</strong></h3><p><strong><a href="https://github.com/vercel-labs/skills">skills CLI for the open agent skills ecosystem</a></strong></p><p>Skills are reusable capabilities for AI agents. They provide procedural knowledge that helps agents accomplish specific tasks more effectively. skills is an open-source CLI for installing and managing skill packages for agents.</p><p>Together with <a href="https://skills.sh/">skills.sh</a>, a directory and leaderboard for skill packages, it allows you to easily install and manage skills for all common AI tools like <strong>Claude Code</strong>, <strong>Codex</strong>, <strong>Cursor</strong>, <strong>OpenClaw</strong>, <strong>Gemini CLI</strong>, and <a href="https://github.com/vercel-labs/skills?tab=readme-ov-file#supported-agents">many more</a>.</p><p>&#128161; <strong>Why is this relevant for DEs</strong>?</p><p>Using AI to implement data pipelines, or any type of software development, has become the new norm. However, in Data Engineering in particular, we work with highly specialized tools, different versions, and varying best practices. This often leads to significant back and forth when using AI to speed up implementation. Installing skills can help, as they target specific use cases. 
Tools like Claude Code automatically apply a skill when they determine it fits the use case, which can significantly improve the quality of the output.</p><p>&#129489;&#8205;&#128187; <strong>Skills for DEs</strong></p><ul><li><p><a href="https://skills.sh/wshobson/agents/data-quality-frameworks">data-quality-frameworks</a>: Production patterns for implementing data quality with Great Expectations, dbt tests, and data contracts to ensure reliable data pipelines.</p></li><li><p><a href="https://skills.sh/obra/superpowers/brainstorming">brainstorming</a>: Turn ideas into fully formed designs and specs.</p></li><li><p><a href="https://skills.sh/wshobson/agents/dbt-transformation-patterns">dbt-transformation-patterns</a>: Production-ready patterns for dbt.</p></li><li><p><a href="https://skills.sh/astronomer/agents/authoring-dags">authoring-dags</a>: Creating and validating Airflow DAGs using best practices.</p></li><li><p><a href="https://skills.sh/wshobson/agents/data-storytelling">data-storytelling</a>: Transform raw data into compelling narratives.</p></li></ul><p><em>And many more.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9Wa9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47228967-2ecd-461d-8d81-3b0b85416b37_2220x1420.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9Wa9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47228967-2ecd-461d-8d81-3b0b85416b37_2220x1420.png 424w, https://substackcdn.com/image/fetch/$s_!9Wa9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47228967-2ecd-461d-8d81-3b0b85416b37_2220x1420.png 848w, 
https://substackcdn.com/image/fetch/$s_!9Wa9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47228967-2ecd-461d-8d81-3b0b85416b37_2220x1420.png 1272w, https://substackcdn.com/image/fetch/$s_!9Wa9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47228967-2ecd-461d-8d81-3b0b85416b37_2220x1420.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9Wa9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47228967-2ecd-461d-8d81-3b0b85416b37_2220x1420.png" width="1456" height="931" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/47228967-2ecd-461d-8d81-3b0b85416b37_2220x1420.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:931,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:629004,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/185447857?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47228967-2ecd-461d-8d81-3b0b85416b37_2220x1420.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9Wa9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47228967-2ecd-461d-8d81-3b0b85416b37_2220x1420.png 424w, 
https://substackcdn.com/image/fetch/$s_!9Wa9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47228967-2ecd-461d-8d81-3b0b85416b37_2220x1420.png 848w, https://substackcdn.com/image/fetch/$s_!9Wa9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47228967-2ecd-461d-8d81-3b0b85416b37_2220x1420.png 1272w, https://substackcdn.com/image/fetch/$s_!9Wa9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47228967-2ecd-461d-8d81-3b0b85416b37_2220x1420.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Installing a skill via CLI</figcaption></figure></div><p><strong>GitHub:</strong> <a href="https://github.com/vercel-labs/skills">https://github.com/vercel-labs/skills</a></p><div><hr></div><h3><strong>&#128161; DE Tip of the Month</strong></h3><h3><strong>Treat data contracts as non-negotiable in modern pipelines</strong></h3><p>As data feeds more dashboards, models, and AI systems, the dominant failure mode is shifting from failed jobs to incorrect data. Pipelines run successfully, but silent changes in schema, freshness, or semantics break dashboards, models, and business decisions.</p><p>Data contracts help prevent this. They define clear expectations between data producers and consumers so everyone knows what a dataset promises to deliver.</p><p><strong>&#128161; Why it matters now more than ever</strong></p><ul><li><p>AI agents, auto-generated SQL, and self-serve analytics are increasing the number of data consumers who lack deep context about the data.</p></li><li><p>Faster development with a variety of tools like dbt, Spark, and Flink increases the risk of unintended schema changes.</p></li><li><p>The cost of bad data is often higher than pipeline downtime.</p></li></ul><p>&#128161; <strong>How to start</strong></p><ul><li><p>Add contracts to your most critical datasets.</p></li><li><p>Enforce them with tests and freshness checks.</p></li><li><p>Make them visible in your metadata catalog.</p></li><li><p>Start small and expand gradually.</p></li></ul><p>Teams that treat data contracts seriously spend less time firefighting and build stronger trust between data producers and consumers.</p><div><hr></div><p>Let us know what you like the most in the newsletter.
See you next time!</p><div class="poll-embed" data-attrs="{&quot;id&quot;:450229}" data-component-name="PollToDOM"></div><p>Until next time, cheers!</p><p><a href="https://www.linkedin.com/in/srivigneshkn/">Sri</a>, <a href="https://www.linkedin.com/in/sukanyawadawadagi/">Sukanya</a> &amp; <a href="https://www.linkedin.com/in/vjanz/">Volker</a></p><div><hr></div><h4>&#8505;&#65039; About Data Engineer Things</h4><p><a href="https://www.dataengineerthings.org/">Data Engineer Things</a> (DET) is a global community built by data engineers for data engineers. Subscribe to the <a href="https://dataengineerthings.substack.com/">newsletter</a> and follow us on <a href="https://www.linkedin.com/company/data-engineer-things/posts/?feedView=all">LinkedIn</a> to gain access to exclusive learning resources and networking opportunities, including articles, webinars, meetups, conferences, mentorship, and much more.</p>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter - Community Spotlight Edition (Feb 2026)]]></title><description><![CDATA[A non-traditional path into data and how frustration with monolithic BI tools became a career in building them]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-community-88a</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-community-88a</guid><dc:creator><![CDATA[Eddy Zulkifly]]></dc:creator><pubDate>Tue, 03 Feb 2026 16:03:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!GVox!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40c3ec69-edda-44ff-9807-72f5ba2ae383_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GVox!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40c3ec69-edda-44ff-9807-72f5ba2ae383_1456x1048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GVox!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40c3ec69-edda-44ff-9807-72f5ba2ae383_1456x1048.png 424w,
https://substackcdn.com/image/fetch/$s_!GVox!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40c3ec69-edda-44ff-9807-72f5ba2ae383_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!GVox!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40c3ec69-edda-44ff-9807-72f5ba2ae383_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!GVox!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40c3ec69-edda-44ff-9807-72f5ba2ae383_1456x1048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GVox!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40c3ec69-edda-44ff-9807-72f5ba2ae383_1456x1048.png" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/40c3ec69-edda-44ff-9807-72f5ba2ae383_1456x1048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:284325,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/180265814?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40c3ec69-edda-44ff-9807-72f5ba2ae383_1456x1048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!GVox!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40c3ec69-edda-44ff-9807-72f5ba2ae383_1456x1048.png 424w, https://substackcdn.com/image/fetch/$s_!GVox!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40c3ec69-edda-44ff-9807-72f5ba2ae383_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!GVox!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40c3ec69-edda-44ff-9807-72f5ba2ae383_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!GVox!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40c3ec69-edda-44ff-9807-72f5ba2ae383_1456x1048.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Hi everyone,</p><p>This interview highlights how <strong>diverse backgrounds and perspectives help people build better data tools</strong>.</p><p>I met Archie through a virtual data conference called <a href="https://www.mdsfest.com/">MDS Fest</a>, where he introduced Evidence as an open-source reporting tool that uses only SQL and Markdown (<a href="https://www.youtube.com/live/L0M_RWSp4RE?si=hNZAm5R51LxR3caM">source</a>). What stood out to me was his non-traditional path into data, moving from strategy consulting to building open-source data tools. Spending years translating raw data into business decisions gave Archie a deep perspective on how data actually gets used, and where data tools often get in the way.</p><p>Since then, he&#8217;s been active in the open-source community shipping data tools, and he&#8217;s now leading growth at Evidence.</p><p>Hope you find the lessons from this conversation meaningful.</p><p>Let&#8217;s go!<br>&#8212; Eddy</p><div><hr></div><h3>Spotlight: Archie Wood (Head of Growth at Evidence)</h3><div class="pullquote"><p>&#8220;Someone has to fix this and I&#8217;d like to be part of that solution&#8221;</p></div><p>For years, BI meant expensive, monolithic systems that stifled adaptability. But as data environments grew more complex, the demand for flexibility exploded.</p><p>This shift has driven the rise of composable BI. 
Instead of a &#8220;black box&#8221;, modern stacks use decoupled components to give teams total control over their metrics and visuals.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sQmY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd876e6-db21-4319-a9fb-b7ea977cd7a2_1209x1094.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sQmY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd876e6-db21-4319-a9fb-b7ea977cd7a2_1209x1094.jpeg 424w, https://substackcdn.com/image/fetch/$s_!sQmY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd876e6-db21-4319-a9fb-b7ea977cd7a2_1209x1094.jpeg 848w, https://substackcdn.com/image/fetch/$s_!sQmY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd876e6-db21-4319-a9fb-b7ea977cd7a2_1209x1094.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!sQmY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd876e6-db21-4319-a9fb-b7ea977cd7a2_1209x1094.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sQmY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd876e6-db21-4319-a9fb-b7ea977cd7a2_1209x1094.jpeg" width="1209" height="1094" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3dd876e6-db21-4319-a9fb-b7ea977cd7a2_1209x1094.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1094,&quot;width&quot;:1209,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:755619,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/180265814?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd876e6-db21-4319-a9fb-b7ea977cd7a2_1209x1094.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sQmY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd876e6-db21-4319-a9fb-b7ea977cd7a2_1209x1094.jpeg 424w, https://substackcdn.com/image/fetch/$s_!sQmY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd876e6-db21-4319-a9fb-b7ea977cd7a2_1209x1094.jpeg 848w, https://substackcdn.com/image/fetch/$s_!sQmY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd876e6-db21-4319-a9fb-b7ea977cd7a2_1209x1094.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!sQmY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd876e6-db21-4319-a9fb-b7ea977cd7a2_1209x1094.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Full-stack composable BI, <a href="https://www.pracdata.io/p/the-evolution-of-business-intelligence-stack">source</a></figcaption></figure></div><p>It was exactly this frustration with the &#8220;old way&#8221; that pushed <strong>Archie Wood</strong> (Head of Growth at Evidence) to move from analyzing data to <strong>building the tools behind it</strong>. 
His path is anything but traditional, and his story reveals a lot about where modern BI is heading.</p><div><hr></div><h3>&#128202; Sponsored Insight: The State of Airflow 2026 Report</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vf8-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883a3669-b99b-48f3-b9eb-d85438acbc9c_1920x1008.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vf8-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883a3669-b99b-48f3-b9eb-d85438acbc9c_1920x1008.png 424w, https://substackcdn.com/image/fetch/$s_!vf8-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883a3669-b99b-48f3-b9eb-d85438acbc9c_1920x1008.png 848w, https://substackcdn.com/image/fetch/$s_!vf8-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883a3669-b99b-48f3-b9eb-d85438acbc9c_1920x1008.png 1272w, https://substackcdn.com/image/fetch/$s_!vf8-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883a3669-b99b-48f3-b9eb-d85438acbc9c_1920x1008.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vf8-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883a3669-b99b-48f3-b9eb-d85438acbc9c_1920x1008.png" width="420" height="220.3846153846154" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/883a3669-b99b-48f3-b9eb-d85438acbc9c_1920x1008.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:420,&quot;bytes&quot;:1096748,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/180265814?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883a3669-b99b-48f3-b9eb-d85438acbc9c_1920x1008.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vf8-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883a3669-b99b-48f3-b9eb-d85438acbc9c_1920x1008.png 424w, https://substackcdn.com/image/fetch/$s_!vf8-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883a3669-b99b-48f3-b9eb-d85438acbc9c_1920x1008.png 848w, https://substackcdn.com/image/fetch/$s_!vf8-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883a3669-b99b-48f3-b9eb-d85438acbc9c_1920x1008.png 1272w, https://substackcdn.com/image/fetch/$s_!vf8-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883a3669-b99b-48f3-b9eb-d85438acbc9c_1920x1008.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The State of Apache Airflow 2026 Report, the largest data engineering survey ever, draws on insights from 5,800+ data engineers on how Airflow is actually being used 
today. In this report, you&#8217;ll learn:</p><ul><li><p>How the role of the data engineer is evolving</p></li><li><p>How early adopters are leveraging Airflow 3 features</p></li><li><p>How orchestration-first teams tend to ship AI to production faster</p></li></ul><p>&#128073; Read the report <a href="https://www.astronomer.io/airflow/state-of-airflow/?utm_campaign=2025q4-ebook-tf-state-of-airflow-2026&amp;utm_medium=paidmedia&amp;utm_source=data-engineering-things">HERE</a></p><p><em>(This message is sponsored by Astronomer.)</em></p><div><hr></div><blockquote><p><em>Could you walk us through your journey into the data engineering space?</em></p></blockquote><p>Today I work as an open-source maintainer and lead growth for a BI tool but I got here by being deeply frustrated as a non-technical data user.<br><br>I maintain several open-source projects (<a href="https://github.com/evidence-dev/duckdb_gsheets">DuckDB GSheets</a> and <a href="https://github.com/archiewood/gosql">gosql</a>) on top of Evidence because I&#8217;ve lived the pain of data tools getting in the way of insights.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gEFj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F507f66ac-cde9-44f2-b89b-6fe3133dcbf6_1998x992.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gEFj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F507f66ac-cde9-44f2-b89b-6fe3133dcbf6_1998x992.png 424w, https://substackcdn.com/image/fetch/$s_!gEFj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F507f66ac-cde9-44f2-b89b-6fe3133dcbf6_1998x992.png 848w, 
https://substackcdn.com/image/fetch/$s_!gEFj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F507f66ac-cde9-44f2-b89b-6fe3133dcbf6_1998x992.png 1272w, https://substackcdn.com/image/fetch/$s_!gEFj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F507f66ac-cde9-44f2-b89b-6fe3133dcbf6_1998x992.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gEFj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F507f66ac-cde9-44f2-b89b-6fe3133dcbf6_1998x992.png" width="1456" height="723" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/507f66ac-cde9-44f2-b89b-6fe3133dcbf6_1998x992.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:723,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:249325,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/180265814?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F507f66ac-cde9-44f2-b89b-6fe3133dcbf6_1998x992.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gEFj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F507f66ac-cde9-44f2-b89b-6fe3133dcbf6_1998x992.png 424w, 
https://substackcdn.com/image/fetch/$s_!gEFj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F507f66ac-cde9-44f2-b89b-6fe3133dcbf6_1998x992.png 848w, https://substackcdn.com/image/fetch/$s_!gEFj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F507f66ac-cde9-44f2-b89b-6fe3133dcbf6_1998x992.png 1272w, https://substackcdn.com/image/fetch/$s_!gEFj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F507f66ac-cde9-44f2-b89b-6fe3133dcbf6_1998x992.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Archie&#8217;s open source projects, <a href="https://archie.evidence.app/">source</a></figcaption></figure></div><p>I&#8217;ve been working with data for about 8 years, though I came into data in a non-traditional way. I started in management consulting, where data was mainly used to make slides and I worked in Excel until it fell over at a couple hundred thousand rows. Tableau opened my eyes to what better BI tools could do, and I was hooked.</p><p>Later, I joined an e-commerce company in London called Patch Plants (UK houseplant retailer) as Strategy Manager and eventually Chief of Staff, where I learned SQL and Python because I kept pestering the data team with requests. They eventually gave me developer access and let me self-serve. We used Snowflake, dbt, Looker, Metabase and a lot of Google Sheets. It was a wild couple of years. Between 2019 and 2021, Patch grew nearly 200% and delivered over a million plants to 300,000 customers. Being part of that growth while building out the BI function and navigating complex challenges like Brexit was an incredible learning experience.</p><p>In 2021, I moved to Canada and spent a bit of time working on my own data startup ideas, but nothing really stuck. Eventually, I met Adam and Sean, the founders of <a href="https://evidence.dev/">Evidence</a>, who were building exactly what I&#8217;d always wanted: a more flexible BI tool that gave you more control over visuals. It often felt like the BI software I used was slowing me down more than helping me, and I remember thinking, &#8220;Someone has to fix this and I&#8217;d like to be part of that solution&#8221;. I traded SQL and spreadsheets for building code extensions that made it easier for data folks to tell data stories using software engineering principles.<br><br><strong>&#128161; Key takeaway:</strong> Non-traditional paths into data are a superpower. 
Firsthand frustrations coupled with domain expertise become the fuel to build better data tools.</p><div><hr></div><blockquote><p><em>As an open source maintainer and Head of Growth at Evidence, how do you balance the engineering mindset with the community and adoption side of the work?</em></p></blockquote><p>It comes in waves. Some months I&#8217;m deep in code, shipping features non-stop. Other months are more about storytelling and figuring out how to explain what we&#8217;ve built, who it&#8217;s for, and where it fits.</p><p>A big part of making that balance work is being clear about who we&#8217;re talking about when we say &#8220;users&#8221;, because we have two different audiences.</p><p>Our open-source product is something teams can self-host and deploy as a static site. The source code is fully public on GitHub, and the community around it includes contributors, maintainers, and people using it in production on their own infrastructure.</p><p>Evidence Studio is our commercial SaaS product. It isn&#8217;t open source, though we&#8217;re still working out whether to open source parts of it over time.<br><br>When you&#8217;re balancing engineering, customer success, and an open-source community, empathy matters a lot. In open source especially, it&#8217;s easy to think &#8220;people are using it wrong,&#8221; but most of the time it means we designed something with too much friction. Evidence initially required installing Node.js and managing local dependencies, which worked fine for contributors but was a blocker for a lot of non-technical users. As a result, we&#8217;ve built a browser-based experience to make onboarding smoother and collaboration easier, while maintaining core functionality.</p><p>Keeping the project authentic to its open-source roots is a constant balance, and it&#8217;s not always straightforward. 
One way we&#8217;ve supported open-source development is by offering commercial support services for teams self-hosting the open-source version. Bigger enterprises often want the ability to pick up the phone and resolve issues quickly, and that support helps fund the work while keeping the open-source project healthy.<br><br><strong>&#128161; Key takeaway:</strong> Balancing engineering and community comes down to empathy. Data engineers need to listen to users, understand their real needs, and remove friction wherever possible.</p><div><hr></div><blockquote><p><em>On a personal level, you&#8217;re balancing building tools, growing a community and parenting. What habits or systems have helped you sustain creativity and focus amid that mix?</em></p></blockquote><p>I&#8217;m still in the early days of parenting, so any &#8220;system&#8221; I have keeps changing. Kids force you to adapt constantly. My partner and I both work full time, and because she&#8217;s based in New York, I handle weekdays and she takes weekends. It&#8217;s intense, but it works.</p><p>I run my days in blocks: mornings and evenings are family time, the core workday is protected, and nights are for decompression or light work. Startups don&#8217;t respect schedules: if something breaks, you jump in. But parenting has made me ruthless about focus. You quickly see how much time gets wasted otherwise.</p><p><strong>&#128161; Key takeaway:</strong> It&#8217;s important to <strong>be intentional with time management and stay adaptable.</strong> Knowing when to pivot is crucial to staying productive and effective.</p><div><hr></div><blockquote><p><em>What&#8217;s something you&#8217;ve learned from the Evidence community that changed your thinking about how open-source BI should evolve?</em></p></blockquote><p>BI tools are deceptively complex pieces of software. Setting one up can take weeks, and the person doing it usually has three other jobs. 
They might be a data engineer, director of data, or even a software developer doing BI on the side. By the time they&#8217;ve cleaned and modelled their data, they just want a tool that helps them explore and visualize it without a complex setup.</p><p>The gap between lightweight data exploration (querying in SQL or using a notebook) and building polished dashboards is still massive. It shouldn&#8217;t be. Most people want something that feels as quick and flexible as a notebook or SQL client, but with the clarity and polish of BI.</p><p>That&#8217;s where open source really shines. It can move faster, stay flexible, and meet people where they already work. Projects like Evidence, Streamlit and Rill Data live in that middle ground between exploration and presentation, where you can go from query to insight to something you&#8217;re proud to share, without waiting on a whole implementation cycle.</p><p><strong>&#128161; Key takeaway:</strong> There&#8217;s a distinct product niche between tools that offer out-of-the-box data exploration and BI tools that provide well-governed dashboards.</p><div><hr></div><blockquote><p><em>You mentioned other open-source projects (Streamlit, Rill) that sit in this middle ground. 
As a data engineer, how do these compare to Evidence, and when should one choose which tool?</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cGXb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1cea49-f37c-4702-bdfa-98b396182f95_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cGXb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1cea49-f37c-4702-bdfa-98b396182f95_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!cGXb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1cea49-f37c-4702-bdfa-98b396182f95_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!cGXb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1cea49-f37c-4702-bdfa-98b396182f95_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!cGXb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1cea49-f37c-4702-bdfa-98b396182f95_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cGXb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1cea49-f37c-4702-bdfa-98b396182f95_1536x1024.png" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5a1cea49-f37c-4702-bdfa-98b396182f95_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:638508,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/180265814?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1cea49-f37c-4702-bdfa-98b396182f95_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cGXb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1cea49-f37c-4702-bdfa-98b396182f95_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!cGXb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1cea49-f37c-4702-bdfa-98b396182f95_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!cGXb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1cea49-f37c-4702-bdfa-98b396182f95_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!cGXb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a1cea49-f37c-4702-bdfa-98b396182f95_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">BI tools based on app customization and reporting vs self-serve analytics</figcaption></figure></div><p><strong><a href="https://evidence.dev/">Evidence</a></strong> is the most analytics-engineering friendly if your goal is <em>trusted reporting as code</em>. It&#8217;s SQL-first and Markdown-first, so dashboards behave like software artifacts: version controlled, reviewable in PRs, reproducible across environments, and easy to ship as a static site or data product. It&#8217;s especially strong when you want to pair charts with narrative and definitions (&#8220;what changed, why it changed, and what to do next&#8221;).</p><p><strong><a href="https://streamlit.io/">Streamlit</a></strong> is best thought of as a Python application framework, not a BI tool. 
It&#8217;s ideal when you need more than charts (custom workflows, business logic, forms) and UI needs to be bespoke. The tradeoff is governance isn&#8217;t built in: metric definitions, permissions, consistency, and performance patterns are on your team to implement. It&#8217;s great for prototyping, but can become &#8220;a collection of apps&#8221; without standards.</p><p><strong><a href="https://www.rilldata.com/">Rill</a></strong> is closer to a productized BI experience: fast dashboards, metrics-driven exploration, and workflows optimized for operational analytics and drill down analysis. It shines when you want a Looker-style experience without building an app. Compared to Evidence, it&#8217;s less about narrative reporting and more about interactive exploration with an opinionated structure for speed and consistency.</p><p><strong>Rule of thumb:</strong></p><ul><li><p>Choose <strong>Evidence</strong> for governed, version-controlled reporting with narrative.</p></li><li><p>Choose <strong>Rill</strong> for fast metrics + interactive exploration with a BI-product feel.</p></li><li><p>Choose <strong>Streamlit</strong> when you need a custom data application, not just dashboards.</p></li></ul><div><hr></div><blockquote><p><em>Was there a specific GitHub issue or a discussion on a PR that provided a distinct technical learning moment you can share?</em></p></blockquote><p>One small example that stuck with me was <strong>number formatting</strong>. 
A user opened a GitHub discussion pointing out that our default format:</p><p>&#8364;1,234.56 was incorrect!</p><p>In <strong>Germany</strong>, the same number should be displayed as:</p><p>1.234,56 &#8364;</p><p>That difference matters:</p><ul><li><p><strong>Comma</strong> for decimals (,56)</p></li><li><p><strong>Period</strong> for thousands (1.234)</p></li><li><p><strong>Currency </strong>symbol at the end (&#8364;)</p></li></ul><p>The user could technically &#8220;fix&#8221; it upstream in the database layer, but that would turn the number into a string and break Evidence features like charts and numeric props. This edge case required a code change to support locale-aware formatting properly.</p><p>It was a good reminder that small details can impact clarity and trust, and we only caught it because someone raised it publicly on GitHub.</p><p>&#128161; <strong>Key takeaway:</strong> Open source makes your blind spots visible. Real users surface real-world edge cases that make the product better for everyone.</p><div><hr></div><blockquote><p><em>What advice would you give to data engineers who are curious about building open source tools?</em></p></blockquote><p>Start by contributing. If you use an open source tool and spot a bug or a feature gap, engage with the community on GitHub, Slack, or Discord. Small contributions are the best way to learn how open source really works: people building in public, improving tools they rely on, and collaborating across the world.</p><p>If you want to build your own open source tool or company, go in with equal parts excitement and realism. It&#8217;s deeply rewarding but hard to sustain. 
You&#8217;ll meet amazing contributors and a few demanding users, and you&#8217;ll eventually face the hardest question: how to monetize without breaking what makes it open.</p><p><strong>&#128161; Key takeaway:</strong> Start small and engage the open source community on public channels.</p><div><hr></div><h3>Community poll</h3><div class="poll-embed" data-attrs="{&quot;id&quot;:437579}" data-component-name="PollToDOM"></div><div><hr></div><h3>&#128172; How to stay connected</h3><ul><li><p><a href="https://www.linkedin.com/in/archiesarrewood/">https://www.linkedin.com/in/archiesarrewood/</a></p></li><li><p><a href="https://github.com/archiewood">https://github.com/archiewood</a></p></li><li><p><a href="https://evidence.dev/">https://evidence.dev/</a></p></li></ul><div><hr></div><h3>&#8505;&#65039; About Data Engineer Things</h3><p><a href="https://www.dataengineerthings.org/">Data Engineer Things</a> (DET) is a global community built by data engineers for data engineers. Subscribe to the <a href="https://dataengineerthings.substack.com/">newsletter</a> and follow us on <a href="https://www.linkedin.com/company/data-engineer-things/posts/?feedView=all">LinkedIn</a> to gain access to exclusive learning resources and networking opportunities, including articles, webinars, meetups, conferences, mentorship, and much more.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://dataengineerthings.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineer Things! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter - Data Pulse Edition (Jan 2026)]]></title><description><![CDATA[DET editors' New Year resolutions, templatizing Spark declarative pipelines, Meta's video invisible watermarking, Lyft's feature store architecture]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-data</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-data</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Wed, 14 Jan 2026 16:02:22 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!mOS6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F748ef0fa-9801-45fe-8e52-17a7dc5b1a5f_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mOS6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F748ef0fa-9801-45fe-8e52-17a7dc5b1a5f_1456x1048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mOS6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F748ef0fa-9801-45fe-8e52-17a7dc5b1a5f_1456x1048.png 424w, 
https://substackcdn.com/image/fetch/$s_!mOS6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F748ef0fa-9801-45fe-8e52-17a7dc5b1a5f_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!mOS6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F748ef0fa-9801-45fe-8e52-17a7dc5b1a5f_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!mOS6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F748ef0fa-9801-45fe-8e52-17a7dc5b1a5f_1456x1048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mOS6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F748ef0fa-9801-45fe-8e52-17a7dc5b1a5f_1456x1048.png" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/748ef0fa-9801-45fe-8e52-17a7dc5b1a5f_1456x1048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:168144,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/182380267?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F748ef0fa-9801-45fe-8e52-17a7dc5b1a5f_1456x1048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!mOS6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F748ef0fa-9801-45fe-8e52-17a7dc5b1a5f_1456x1048.png 424w, https://substackcdn.com/image/fetch/$s_!mOS6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F748ef0fa-9801-45fe-8e52-17a7dc5b1a5f_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!mOS6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F748ef0fa-9801-45fe-8e52-17a7dc5b1a5f_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!mOS6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F748ef0fa-9801-45fe-8e52-17a7dc5b1a5f_1456x1048.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>First of all, Happy New Year, everyone!</p><p>Hope you all had a happy and restorative holiday season. My own break was a beautiful blur of firsts: spending the first Christmas and New Year with my new daughter. There is something profoundly grounding about watching a child experience the magic of a Christmas market for the first time. Between the smell of roasted nuts and the twinkling lights, I found myself doing a different kind of &#8220;data collection&#8221;: capturing every giggle and wide-eyed stare.</p><p>That calm period gave me some much-needed space to reflect on 2025. In the data world, 2025 was the year we stopped talking about AI alone and started weaving it into our actual pipelines. We moved from Copilots that write SQL to AI-assisted metadata management. Personally, it taught me that while the tools change faster than a toddler&#8217;s mood, the fundamentals (reliability, cost-efficiency, and trust) remain our North Star.</p><p>Let&#8217;s make 2026 the year we build less &#8220;tech debt&#8221; and more &#8220;data wealth.&#8221;</p><p>With that, we start our first newsletter of the year with a bonus section. A few of our DET editors share their New Year Resolutions with you, our amazing community. Hope you enjoy this edition and perhaps draw some inspiration if you&#8217;re in the process of writing your own. 
</p><p>- Chozhan</p><div><hr></div><h3><strong>&#128218;</strong> Data Pulse</h3><h4><strong><a href="https://www.databricks.com/blog/chaos-scale-templatizing-spark-declarative-pipelines-dlt-meta">Templatizing Spark Declarative Pipelines with DLT-META</a></strong></h4><blockquote><p><strong>&#128214; Topic</strong>: Data Pipelines<br>&#129504; <strong>Level</strong>: Beginner</p></blockquote><p><strong>Summary:</strong> Databricks introduces <strong>DLT-META</strong>, an open-source framework designed to solve pipeline sprawl. It allows engineers to use a metadata-driven approach to generate Delta Live Tables (DLT) pipelines. Instead of writing unique code for every table, you define the pipeline logic in a JSON/YAML template, and the framework dynamically generates the Spark jobs.</p><p><strong>&#128161; Why is this relevant for DEs?</strong></p><ul><li><p><strong>Operational Scalability:</strong> It enables us to move from &#8220;coding&#8221; individual pipelines to &#8220;architecting&#8221; frameworks, allowing a single engineer to manage thousands of tables.</p></li><li><p><strong>Reduced Tech Debt:</strong> By templatizing the logic, you ensure consistent data quality checks (expectations) and governance across all data assets.</p></li><li><p><strong>Faster Onboarding:</strong> New data sources can be integrated by simply updating a metadata file rather than deploying new code modules.</p></li><li><p><strong>DRY (Don&#8217;t Repeat Yourself):</strong> It enforces a &#8220;standard library&#8221; of transformations, making maintenance and debugging significantly simpler across the lakehouse.</p></li></ul><h4><a href="https://engineering.fb.com/2025/11/04/video-engineering/video-invisible-watermarking-at-scale/">Meta: Video Invisible Watermarking at Scale</a></h4><blockquote><p><strong>&#128214; Topic</strong>: Data Engineering<br>&#129504; <strong>Level</strong>: Intermediate</p></blockquote><p><strong>Summary: </strong>Meta uses Video Invisible Watermarking, 
a system designed to embed traceable metadata into video content without affecting visual quality. This enables Meta to detect AI-generated videos, verify who posted a video first, and identify the source and tools used to create a video. This is a massive-scale data engineering challenge given the number of videos circulating across Meta platforms every minute. It requires processing billions of video uploads in real time while ensuring the watermark survives compression, cropping, and screen recording. In this post, Meta shares how they overcame the challenges of scaling invisible watermarking, including how they built a CPU-based solution that offers comparable performance to GPUs, but with better operational efficiency.</p><p>&#128161; <strong>Why is this relevant for DEs</strong>?</p><ul><li><p><strong>Complex Data Modeling:</strong> It showcases how to handle unstructured data as a first-class citizen in a pipeline.</p></li><li><p><strong>High-Throughput Processing:</strong> DEs can learn about the infrastructure required to run computationally expensive ML models (for watermarking) on every single write operation.</p></li><li><p><strong>Data Provenance and Trust:</strong> As GenAI content floods the web, building &#8220;Trust Infrastructure&#8221; like watermarking will become a core responsibility for data platforms.</p></li><li><p><strong>Multimodal Engineering:</strong> It bridges the gap between traditional signal processing and modern distributed data pipelines at exabyte scale.</p></li></ul><h4><strong><a href="https://eng.lyft.com/lyfts-feature-store-architecture-optimization-and-evolution-7835f8962b99">Lyft&#8217;s Feature Store: Architecture, Optimization, and Evolution</a></strong></h4><blockquote><p><strong>&#128214; Topic</strong>: Data Architecture<br>&#129504; <strong>Level</strong>: Advanced</p></blockquote><p><strong>Summary:</strong> Lyft's revamped Feature Store is a mission-critical "platform of platforms" designed to centralize and scale machine 
learning feature management across the organization's entire rideshare stack. The implementation utilizes a "Features-as-Code" approach where engineers define features via Spark SQL for business logic and JSON for metadata. This configuration is automatically converted into production-ready Airflow DAGs that compute and publish data to both a Hive-based offline store for training and a high-performance online serving layer. For low-latency retrieval, the system employs a hybrid architecture featuring DynamoDB backed by a ValKey write-through cache, alongside OpenSearch specifically for vector embedding support. This robust setup now handles over a trillion annual operations, serving as the foundational infrastructure for everything from real-time pricing to fraud detection.</p><p>&#128161; <strong>Why is this relevant for DEs</strong>?</p><ul><li><p><strong>Democratizing feature engineering:</strong> By allowing anyone with SQL and JSON knowledge to deploy production-grade pipelines.</p></li><li><p><strong>Separating storage and computation:</strong> To ensure high-throughput batch writes never interfere with sub-millisecond read performance.</p></li><li><p><strong>Solving the small file problem:</strong> Through automated background compaction and clustering within the streaming lakehouse layer.</p></li><li><p><strong>Enforcing organization-wide Data Contracts</strong>: Guaranteeing feature freshness, ownership, and schema stability for downstream consumers.</p></li><li><p><strong>Reducing operational overhead:</strong> By migrating complex orchestration from in-house tools to managed services like Astronomer (Airflow).</p></li><li><p><strong>Future-proofing infrastructure for AI:</strong> With native vector indexing in OpenSearch to support emerging LLM and agentic workflows.</p></li><li><p><strong>Standardizing multimodal data management</strong>: Unifying raw metadata, binary blobs, and embeddings into a single searchable 
ecosystem.</p></li></ul><div><hr></div><h3>&#128161;Blog: Building AI Agents for Data Engineering Ops</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JgiO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd2dcc41-7767-408b-9d00-8812ce80cbf3_1600x870.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JgiO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd2dcc41-7767-408b-9d00-8812ce80cbf3_1600x870.png 424w, https://substackcdn.com/image/fetch/$s_!JgiO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd2dcc41-7767-408b-9d00-8812ce80cbf3_1600x870.png 848w, https://substackcdn.com/image/fetch/$s_!JgiO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd2dcc41-7767-408b-9d00-8812ce80cbf3_1600x870.png 1272w, https://substackcdn.com/image/fetch/$s_!JgiO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd2dcc41-7767-408b-9d00-8812ce80cbf3_1600x870.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JgiO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd2dcc41-7767-408b-9d00-8812ce80cbf3_1600x870.png" width="667" height="362.81868131868134" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fd2dcc41-7767-408b-9d00-8812ce80cbf3_1600x870.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:792,&quot;width&quot;:1456,&quot;resizeWidth&quot;:667,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!JgiO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd2dcc41-7767-408b-9d00-8812ce80cbf3_1600x870.png 424w, https://substackcdn.com/image/fetch/$s_!JgiO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd2dcc41-7767-408b-9d00-8812ce80cbf3_1600x870.png 848w, https://substackcdn.com/image/fetch/$s_!JgiO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd2dcc41-7767-408b-9d00-8812ce80cbf3_1600x870.png 1272w, https://substackcdn.com/image/fetch/$s_!JgiO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd2dcc41-7767-408b-9d00-8812ce80cbf3_1600x870.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Debugging production data pipelines is a critical part of a data engineer&#8217;s job but it can be particularly difficult. The team at Corelayer explored how AI agents can assist with investigations: detecting anomalies, correlating logs and DAG runs, forming hypotheses, and producing evidence-backed root cause analyses with humans firmly in the loop. 
In this article, you will learn about the failure modes, the operational complexity, and the design principles (evals, guardrails, transparency) that matter when applying AI to real-world data engineering workflows.</p><p>&#128073;&#127996; Read the full article <a href="https://www.corelayer.com/blog/production-data-eng-is-hard?utm_source=det_newsletter&amp;utm_medium=email&amp;utm_campaign=det_jan_2026">HERE</a>.</p><p><em>(This message is sponsored by Corelayer.)</em></p><div><hr></div><h3>&#10024; New Year Resolutions from DET Editors</h3><h4>Volker</h4><blockquote><p>In our first Community Spotlight newsletter, the vote showed that the biggest blockers to writing or speaking about work and experiences are &#8220;stuck in drafts forever&#8221; and &#8220;too tired after coding all day.&#8221; I feel both. I tend to overthink everything I share, so ideas die in perfectionism. But data engineering moves fast, and the value of insights fades quickly. <strong>This year, I want to be braver: share earlier, share imperfectly, and not be afraid of having opinions</strong>.</p></blockquote><h4>Swetha</h4><blockquote><p>In 2026, I want to <strong>develop a consistent technical writing practice in data engineering, using it both to support others in the field and as a learning tool</strong>. Alongside this, I also want to expand the mentoring guidance I provide to aspiring data engineers, sharing industry trends and interview expectations and helping them achieve their goals. I have always loved contributing to the community, and I look forward to making a deliberate effort to do that this year.</p></blockquote><h4>Eddy</h4><blockquote><p>In 2026, my resolution is to <strong>create more conversations about data engineering</strong>. A lot of my growth as a data/software engineer came through meetups and having conversations with dev/product/sales/marketing folks across the tech spectrum. 
I&#8217;ve been fortunate to grow in a place with a vibrant data community, and that environment played a huge role in my professional and technical development. In the future, I hope to carry that same mindset forward: investing in conversations, building community, and learning in public.</p></blockquote><h4>Ananda</h4><blockquote><p>I plan to <strong>contribute to open source</strong>, as it gives me significant leverage to create impact across the data engineering community. I intend to increase my knowledge in large-scale data engineering and infrastructure optimization. A more ambitious idea is to learn a new programming language, probably Rust, to expand my skill set. I&#8217;ll share what I learnt with the community through mentoring, blogging, and speaking engagements, taking small, consistent steps that compound into decent progress.</p></blockquote><h4>Chozhan</h4><blockquote><p>Starting late 2025, we&#8217;ve been inching toward Autonomous Data Engineering, where pipelines don&#8217;t just alert us when they break, but suggest the fix or even auto heal. With that as an opportunity, I approach 2026 with the mindset of <strong>reclaiming the mental space for creative architecture &amp; strategy that drives more business value by (re)designing every system to automate the mundane &amp; repetitive</strong>. Also, I learned a lot in the past years from being part of communities and this year, I&#8217;d like to give back more by sharing my experiences, especially failures &amp; lessons learned using different mediums.</p></blockquote><p><em>So, dear community, we are curious to hear what your 2026 resolutions are. 
Let us know via the poll at the end </em>&#128071;<em> or in the comments or the community chat </em>&#128172;<em>.</em> </p><div><hr></div><h3><strong>&#128142; Open Source Gems</strong></h3><p><strong><a href="https://lancedb.com/">LanceDB: The Multimodal Vector Database for the AI Era</a></strong></p><p>LanceDB is an open-source, serverless-native vector database designed to simplify the management and retrieval of unstructured data for AI applications. Built on the high-performance <strong>Lance</strong> columnar data format, it offers up to 100x faster random access performance compared to traditional formats like Parquet. It is uniquely useful because it allows engineers to store raw data (images, videos), metadata, and vector embeddings together in a single, unified table. This "SQLite-like" approach means it can run embedded directly in your application or scale to massive cloud-native lakehouses without the overhead of managing a complex database cluster.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z5kL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c0ce17-ea24-4e97-9eab-50afcffcb588_921x201.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z5kL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c0ce17-ea24-4e97-9eab-50afcffcb588_921x201.png 424w, https://substackcdn.com/image/fetch/$s_!z5kL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c0ce17-ea24-4e97-9eab-50afcffcb588_921x201.png 848w, 
https://substackcdn.com/image/fetch/$s_!z5kL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c0ce17-ea24-4e97-9eab-50afcffcb588_921x201.png 1272w, https://substackcdn.com/image/fetch/$s_!z5kL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c0ce17-ea24-4e97-9eab-50afcffcb588_921x201.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z5kL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c0ce17-ea24-4e97-9eab-50afcffcb588_921x201.png" width="425" height="92.75244299674267" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b4c0ce17-ea24-4e97-9eab-50afcffcb588_921x201.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:201,&quot;width&quot;:921,&quot;resizeWidth&quot;:425,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;GitHub - lancedb/lancedb-private: Developer-friendly, serverless vector  database for AI applications. Easily add long-term memory to your LLM apps!&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="GitHub - lancedb/lancedb-private: Developer-friendly, serverless vector  database for AI applications. Easily add long-term memory to your LLM apps!" title="GitHub - lancedb/lancedb-private: Developer-friendly, serverless vector  database for AI applications. Easily add long-term memory to your LLM apps!" 
srcset="https://substackcdn.com/image/fetch/$s_!z5kL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c0ce17-ea24-4e97-9eab-50afcffcb588_921x201.png 424w, https://substackcdn.com/image/fetch/$s_!z5kL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c0ce17-ea24-4e97-9eab-50afcffcb588_921x201.png 848w, https://substackcdn.com/image/fetch/$s_!z5kL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c0ce17-ea24-4e97-9eab-50afcffcb588_921x201.png 1272w, https://substackcdn.com/image/fetch/$s_!z5kL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c0ce17-ea24-4e97-9eab-50afcffcb588_921x201.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>&#128161; <strong>Why is this relevant for DEs</strong>?</p><ul><li><p><strong>Unifies raw media and embeddings</strong> in a single table, eliminating complex sync logic between disparate data stores.</p></li><li><p><strong>Eliminates Parquet&#8217;s random access bottleneck</strong>, delivering the sub-millisecond retrieval required for real-time RAG.</p></li><li><p><strong>Scales from local to petabyte-scale cloud storage</strong> using an embedded, zero-infrastructure serverless architecture.</p></li></ul><p><strong>GitHub:</strong> <a href="https://github.com/lancedb/lancedb">https://github.com/lancedb/lancedb</a></p><div><hr></div><h3><strong>&#128161; DE Tip of the Month</strong></h3><h3><strong>Build Context-Aware Metadata &amp; Smart Lineage</strong></h3><p>Most lineage systems only show which tables depend on which, but context-aware metadata explains <em>why</em> changes happened and what triggered them. 
By capturing deployment info, config changes, and upstream data shifts, teams can perform faster root-cause analysis and smarter testing. This turns metadata into an operational control plane rather than passive documentation.</p><p>&#128161; <strong>Key ideas &amp; how to apply:</strong></p><ul><li><p>Capture deployment metadata alongside data changes<br>Example: tagging runs in Airflow / Dagster with context.</p><pre><code><code>run_tags = {
    "change_type": "schema_update",
    "jira_ticket": "DATA-2411",
    "release": "v2026.01",
    "affected_columns": "order_status, delivery_eta"
}</code></code></pre><p><em>Store these tags in your metadata system (OpenLineage, DataHub, Marquez).</em></p></li><li><p>Track schema evolution events and backfills explicitly (change_type = backfill)</p></li></ul><p>Use this metadata to auto-trigger downstream tests &amp; validations and to feed lineage into impact analysis before major releases. When something breaks, you already know what changed, who changed it, and which downstream systems are affected, so incidents get resolved faster and more safely.</p><div><hr></div><p>Let us know what you like the most in the newsletter. See you next time!</p><div class="poll-embed" data-attrs="{&quot;id&quot;:432351}" data-component-name="PollToDOM"></div><p>Until next time, cheers!</p><p>Chozhan, Shubham &amp; Sugandhi</p><div><hr></div><h4>&#8505;&#65039; About Data Engineer Things</h4><p><a href="https://www.dataengineerthings.org/">Data Engineer Things</a> (DET) is a global community built by data engineers for data engineers. Subscribe to the <a href="https://dataengineerthings.substack.com/">newsletter</a> and follow us on <a href="https://www.linkedin.com/company/data-engineer-things/posts/?feedView=all">LinkedIn</a> to gain access to exclusive learning resources and networking opportunities, including articles, webinars, meetups, conferences, mentorship, and much more.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://dataengineerthings.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineer Things! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter - Community Spotlight Edition (Dec 2025)]]></title><description><![CDATA[From big tech to independent consulting: the art of professional brand building and communication. Featuring Ben Rogojan (aka the Seattle Data Guy).]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-community</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-community</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Thu, 18 Dec 2025 16:00:50 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!M680!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179a3c80-7b0a-4fab-95ec-468ebb22fb12_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!M680!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179a3c80-7b0a-4fab-95ec-468ebb22fb12_1456x1048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!M680!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179a3c80-7b0a-4fab-95ec-468ebb22fb12_1456x1048.png 424w, 
https://substackcdn.com/image/fetch/$s_!M680!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179a3c80-7b0a-4fab-95ec-468ebb22fb12_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!M680!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179a3c80-7b0a-4fab-95ec-468ebb22fb12_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!M680!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179a3c80-7b0a-4fab-95ec-468ebb22fb12_1456x1048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!M680!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179a3c80-7b0a-4fab-95ec-468ebb22fb12_1456x1048.png" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/179a3c80-7b0a-4fab-95ec-468ebb22fb12_1456x1048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:258743,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/181481110?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179a3c80-7b0a-4fab-95ec-468ebb22fb12_1456x1048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!M680!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179a3c80-7b0a-4fab-95ec-468ebb22fb12_1456x1048.png 424w, https://substackcdn.com/image/fetch/$s_!M680!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179a3c80-7b0a-4fab-95ec-468ebb22fb12_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!M680!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179a3c80-7b0a-4fab-95ec-468ebb22fb12_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!M680!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F179a3c80-7b0a-4fab-95ec-468ebb22fb12_1456x1048.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Hi everyone,</p><p>I&#8217;m excited to launch our new <strong>Community Spotlight</strong> series! Alongside our classic <strong>Data Pulse</strong> edition, this format focuses entirely on deep-dive interviews. We have one rule for these: <strong>they must offer actionable learnings</strong>.</p><p>So, grab your favorite warm drink and meet our first guest: <a href="https://www.linkedin.com/in/benjaminrogojan/">Ben Rogojan</a>, aka the <strong>Seattle Data Guy</strong>! From leaving Facebook to building an independent empire, Ben breaks down how he grew a following of 100k+ and why communication often beats coding.</p><p>One thing Ben mentions really stood out to me: &#8220;<em>You can get a lot of scale through content, but it&#8217;s hard to beat in-person relationships.</em>&#8221;</p><p>I recently returned from AWS re:Invent in Las Vegas, and I can absolutely confirm that. It was wonderful to meet so many people there, but now it&#8217;s time to slow down &#127876;.</p><p>Let&#8217;s dive in!</p><p>- Volker</p><div><hr></div><h3>Spotlight: Ben Rogojan (Seattle Data Guy)</h3><div class="pullquote"><p>&#8220;You don&#8217;t have to learn or be everything at once. Wherever your journey is now, enjoy it. You never know when it&#8217;ll shift.&#8221;</p></div><blockquote><p><em>Please introduce yourself briefly to the Data Engineer Things community and share how you&#8217;ve built your personal brand to reach over 100,000 followers.</em></p></blockquote><p>Hello! My name is <a href="https://www.linkedin.com/in/benjaminrogojan/">Ben Rogojan</a>, I&#8217;ve been working in the data world for over a decade now. 
I started as an analyst and then found data engineering when I was looking for a role that matched the skills I enjoyed over title. While I was working full-time I had a few people ask me to help on some side projects so I started the <a href="https://www.theseattledataguy.com/">Seattle Data Guy</a> brand as a consulting company but I very quickly realized I needed to find a way to get myself known. <strong>So I started writing content I wish I had more of when I started</strong>.</p><div><hr></div><blockquote><p><em>If your journey from corporate data engineer to successful consultant and content creator was a Git repository, what would be the commit message for where you are right now? And what was the most significant &#8220;merge conflict&#8221; you had to resolve along the way?</em></p></blockquote><p>I think the commit message would say something like &#8220;breaking out and trying new things&#8221; or &#8220;pattern interrupt phase&#8221;. I&#8217;ve been spending a lot of time trying to figure out how to shake up my thinking and habits. I&#8217;ve been consulting for a few years and it can be tempting to fall into the same habits. <strong>So I have been looking for places where I can challenge my habits</strong>.</p><p>For merge conflicts, I think one that sticks out was when I was making the decision to leave Facebook. I had spent so much time and effort getting a job there that it felt wrong. I had a consulting business that was growing but it was hard to rationalize the decision due to all the prior effort.</p><div><hr></div><blockquote><p><em>You made the transition from working at Facebook to running your own data consulting business. What were the most critical steps in that journey, and what would you do differently if you were starting over today?</em></p></blockquote><p>I had been consulting off and on through most of my career. 
The first project I did was actually helping a client move from Access to SQL Server and now it&#8217;s a lot of SQL Server to cloud migration projects.</p><p>The most critical step for any consultant is figuring out how you will land clients. Some people are good at marketing, others at sales motions, and still others are great at networking in person. <strong>In terms of what I would have done differently, I think I would have put more effort into meeting more people in person and building relationships</strong>.</p><p>I believe that&#8217;s even more true now than it was in the past. <strong>You can get a lot of scale through content, but it&#8217;s hard to beat in-person relationships</strong>.</p><div class="pullquote"><p>&#8220;By the end of the first year I had already made more than I did working at Facebook and since then my total take-home has grown comfortably.&#8221;</p></div><blockquote><p><em>Your LinkedIn profile states you&#8217;re &#8220;Tool-Agnostic, Outcome-Obsessed&#8221;. How did you develop this clear value statement, and how has it helped you attract the right clients for your consulting business?</em></p></blockquote><p>I think it can be tempting to prescribe a prior solution to every client. Also, <strong>I&#8217;ve come across many clients whose data stacks have gotten decently chaotic due to the fact that a consultant decided to add their preferred stack on top of what already exists</strong>.</p><p>Instead, I aim to come into all my projects by understanding the company&#8217;s business needs first, their technical talent, budget, and so forth. From there I look for the tools that meet the company&#8217;s needs. 
In some cases buying a solution can prove far faster and easier on a company with limited data resources and in others the company&#8217;s goal is to make data a major part of their offering and the overhead of adding more data per customer would be too expensive for an out of the box solution.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qy8N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9ea4af-5f2e-4f1a-b5a4-9c7970c3c644_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qy8N!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9ea4af-5f2e-4f1a-b5a4-9c7970c3c644_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!Qy8N!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9ea4af-5f2e-4f1a-b5a4-9c7970c3c644_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!Qy8N!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9ea4af-5f2e-4f1a-b5a4-9c7970c3c644_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!Qy8N!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9ea4af-5f2e-4f1a-b5a4-9c7970c3c644_1024x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qy8N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9ea4af-5f2e-4f1a-b5a4-9c7970c3c644_1024x768.png" width="1024" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f9ea4af-5f2e-4f1a-b5a4-9c7970c3c644_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:212310,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/181481110?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9ea4af-5f2e-4f1a-b5a4-9c7970c3c644_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Qy8N!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9ea4af-5f2e-4f1a-b5a4-9c7970c3c644_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!Qy8N!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9ea4af-5f2e-4f1a-b5a4-9c7970c3c644_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!Qy8N!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9ea4af-5f2e-4f1a-b5a4-9c7970c3c644_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!Qy8N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f9ea4af-5f2e-4f1a-b5a4-9c7970c3c644_1024x768.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Tool based thinking vs outcome based thinking, source: <a href="https://www.linkedin.com/in/benjaminrogojan/">Ben</a></figcaption></figure></div><div><hr></div><blockquote><p><em>Your newsletter and YouTube channel have both reached over 100,000 subscribers. What content strategy has been most effective for growing your audience, and how do you balance creating content with client work?</em></p></blockquote><p>I&#8217;ve tried multiple times to create and follow a content strategy and calendar. But I always tend to drift.</p><p>So my approach is to <strong>find 2-4 themes I enjoy at the moment based on the problems I am either experiencing with recent clients, or discussing with data leaders and create content around that</strong>. 
I find that there are trends in problems so if you keep experiencing a similar problem, it&#8217;s likely there are plenty more people dealing with it.</p><div><hr></div><blockquote><p><em>If you had to start over today with zero followers across all platforms, which single platform would you focus on first, and what specific content format would you prioritize to build your audience most efficiently as a beginner?</em></p></blockquote><p>I think <strong>writing is always the easiest place to start</strong>. Start sharing your ideas on a platform like Substack (as long as it stays friendly to email) or, if you don&#8217;t want to write long articles, consider a platform like LinkedIn. I don&#8217;t think there is a best place. Early on, I believe you&#8217;re focused on finding your voice. So having a large audience isn&#8217;t the goal. <strong>You&#8217;re trying to learn, figure out what type of content you enjoy</strong>, etc. Overall, good content gets noticed.</p><div><hr></div><blockquote><p><em>Many professionals struggle to grow their audience while maintaining focus on their core business. What&#8217;s your framework for deciding which content to create, and how do you balance creating content that attracts followers versus content that converts followers into clients?</em></p></blockquote><p>Early on I was mostly focused on creating content that I enjoyed and felt needed to be covered. Since then I&#8217;ve started to create a mix of content for both new data engineers and possible clients.</p><p>I think this is an area I could generally improve. I don&#8217;t spend a lot of time on a content plan with clear ratios of content x and y. 
Instead, <strong>I write what I want to write about</strong>.</p><div><hr></div><blockquote><p><em>For data professionals considering consulting, what has been your most effective approach for finding and securing new clients, and how has this evolved as your brand has grown?</em></p></blockquote><p>I&#8217;ve spoken with dozens of consultants from both technical and non-technical backgrounds and <strong>the most common way most of them land clients is through their network</strong>. I know that can seem hard if you&#8217;re just starting out as a data engineer or analyst, since you&#8217;ve likely got a small network. But there are plenty of ways to grow it.</p><p>Working, going to events, and looking for chances to give, whether by sharing content, working on open source projects, etc.</p><p>I&#8217;d also add that you don&#8217;t need to find large projects first. <strong>There are plenty of projects out there where people need help automating an Excel report or some other task that you might think is small but you&#8217;ll learn a lot from (I did plenty of those projects)</strong>.</p><div><hr></div><blockquote><p><em>What are the two most impactful actions data professionals can take today to start building their personal brand, even if they&#8217;re currently employed full-time?</em></p></blockquote><p><strong>We are all building our brand every day</strong>. That doesn&#8217;t have to mean posting on LinkedIn, YouTube or Twitter.</p><p>So I&#8217;d say first, <strong>build a reputation at your job as the person that gets things done well</strong>. Be willing to take on hard problems. 
That&#8217;s how I started consulting as well. At one job, a director who was a consultant turning around a team learned through the grapevine that I was technical, and he reached out asking if I wanted to help on a project.</p><p><strong>Second, if you do want to do content, create content that you wished you&#8217;d had when you started</strong>.</p><div><hr></div><blockquote><p><em>How important are communication skills versus technical skills for data professionals today, and what are the three most impactful lessons you&#8217;ve learned?</em></p></blockquote><p>Learning how to communicate seems to be a forever lesson for me. I am always speaking with people who are better at conveying ideas, getting buy-in, teaching and other skills that require different forms of communication.</p><p>Technical skills are always important. I really enjoy <a href="https://www.alexewerlof.com/">Alex Ewerl&#246;f</a>&#8217;s diagram of skills for making sure that&#8217;s not skipped. In terms of lessons:</p><ol><li><p><strong>Think about who you are communicating with and tailor the message for them</strong> - It&#8217;s tempting to get frustrated when you explain something you think should be simple to understand but the other party doesn&#8217;t seem to get it. I view this as my failure to understand my audience instead of their failure to understand.</p></li><li><p><strong>Images work</strong> - Even an image that isn&#8217;t perfect is more likely to keep your reader engaged. It also makes it far easier to explain concepts like your internal network map of your various servers so that new employees can quickly get up to speed. Or, if you&#8217;re trying to get buy-in for a dashboard, having a mock-up makes the end state all the more real. Don&#8217;t just write when you can draw or diagram.</p></li><li><p><strong>Cut out fluff</strong> - I tend to lean on the fluffy side of writing. I ramble and sometimes add sections purely for self-indulgence. 
But when I reread it, I realize that what I&#8217;ve written adds very little to the overall piece. So cut it out.</p></li></ol><div><hr></div><blockquote><p><em>What&#8217;s your vision for the future of independent data professionals, and what emerging opportunities should they be positioning themselves for?</em></p></blockquote><p>I think data problems will continue to grow for a few reasons.</p><p>I believe businesses will continue to demand more and more from their data.</p><p>For some businesses that will mean more granular or complex data like images and unstructured data.</p><p>For others it&#8217;ll be integrating data sets that have never been connected.</p><p>And for still others it&#8217;ll just be answering key questions about the business. <strong>I think many people would be surprised by how many businesses and organizations are early in their data journey or perhaps need to revamp it</strong>. I like to say that many companies are in different data decades. So there will continue to be plenty of work for the next few years.</p><div><hr></div><blockquote><p><em>What&#8217;s one message you&#8217;d like to share with the Data Engineer Things community?</em></p></blockquote><p><strong>You don&#8217;t have to learn or be everything at once. Wherever your journey is now, enjoy it</strong>. You never know when it&#8217;ll shift. I didn&#8217;t know that the last day I would enter Facebook&#8217;s offices would be in March 2020. I kept assuming I&#8217;d be able to go back. Then it was gone.</p><p>And one day I am sure I&#8217;ll have my last consulting client.</p><p>I&#8217;ll put out my last YouTube video.</p><p>So enjoy being in whatever stage you&#8217;re at.</p><p>If you&#8217;re learning, learn! Dive deep, don&#8217;t worry about what other people are doing.</p><p>If you&#8217;re executing, execute to the best of your ability.</p><p>If you&#8217;re raising a family, do that. 
<strong>Be in that moment, because it&#8217;ll end and you&#8217;ll miss it</strong>.</p><div><hr></div><h3>Connect with Ben</h3><ul><li><p><a href="https://www.linkedin.com/in/benjaminrogojan/">LinkedIn Ben</a></p></li><li><p><a href="https://www.linkedin.com/company/seattle-data-guy">LinkedIn Seattle Data Guy</a></p></li><li><p><a href="https://seattledataguy.substack.com/">Substack</a></p></li><li><p><a href="https://www.youtube.com/c/SeattleDataGuy">YouTube</a></p></li></ul><div><hr></div><h3>Community poll</h3><div class="poll-embed" data-attrs="{&quot;id&quot;:418846}" data-component-name="PollToDOM"></div><div class="pullquote"><p><em>&#127876;</em></p><p>The Data Engineer Things team wishes you <strong>Happy Holidays</strong> and a <strong>Happy New Year</strong>! We look forward to giving back to this community with even more deep-dive interviews, news, and resources in the coming year.</p></div><h3>&#8505;&#65039; About Data Engineer Things</h3><p><a href="https://www.dataengineerthings.org/">Data Engineer Things</a> (DET) is a global community built by data engineers for data engineers. Subscribe to the <a href="https://dataengineerthings.substack.com/">newsletter</a> and follow us on <a href="https://www.linkedin.com/company/data-engineer-things/posts/?feedView=all">LinkedIn</a> to gain access to exclusive learning resources and networking opportunities, including articles, webinars, meetups, conferences, mentorship, and much more.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://dataengineerthings.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineer Things! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter #26 (Dec 2025)]]></title><description><![CDATA[Thinking like a Data engineer, Real-time distributed graph, Secret behind super fast databases, Multimodal data workloads, and more.]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-26</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-26</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Thu, 04 Dec 2025 16:02:43 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ZuUR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8db62b89-408c-43ff-8872-b1d431a21e1d_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZuUR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8db62b89-408c-43ff-8872-b1d431a21e1d_1456x1048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZuUR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8db62b89-408c-43ff-8872-b1d431a21e1d_1456x1048.png 424w, 
https://substackcdn.com/image/fetch/$s_!ZuUR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8db62b89-408c-43ff-8872-b1d431a21e1d_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!ZuUR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8db62b89-408c-43ff-8872-b1d431a21e1d_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!ZuUR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8db62b89-408c-43ff-8872-b1d431a21e1d_1456x1048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZuUR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8db62b89-408c-43ff-8872-b1d431a21e1d_1456x1048.png" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8db62b89-408c-43ff-8872-b1d431a21e1d_1456x1048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:299894,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/178289327?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8db62b89-408c-43ff-8872-b1d431a21e1d_1456x1048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!ZuUR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8db62b89-408c-43ff-8872-b1d431a21e1d_1456x1048.png 424w, https://substackcdn.com/image/fetch/$s_!ZuUR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8db62b89-408c-43ff-8872-b1d431a21e1d_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!ZuUR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8db62b89-408c-43ff-8872-b1d431a21e1d_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!ZuUR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8db62b89-408c-43ff-8872-b1d431a21e1d_1456x1048.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>Hi Everyone,</p><p>Welcome to the final newsletter edition of 2025, a year that pushed data engineering through one of its most transformative leaps. From the rise of agentic data engineering and multimodal distributed engines to real-time graph architectures reshaping event processing at massive scale, one thing became clear: data engineering is evolving faster than ever, fueled by the breakneck pace of AI.</p><p>Amid all this change, one constant stood out: data engineers remain vital to organizations. They build the reliable, scalable foundations that data systems depend on, ensuring data is clean, governed, and available in real time while optimizing cost and performance as workloads grow.</p><p>Today, data engineers sit at the very center of AI readiness, designing data layers that support multimodal workloads, enforcing data contracts to keep teams aligned, and building real-time infrastructure that powers everything from fraud detection to personalized experiences.</p><p>We close the year with a powerful reminder: as our systems scale, so must our influence. Becoming a force multiplier is no longer optional for data engineers; it&#8217;s essential for driving lasting impact. 2026 will be an even more exciting and challenging year for data engineering.</p><p>Thank you for being part of this journey. 
</p><p>- Ananda</p><div><hr></div><h3><strong>&#128218;</strong> Data Pulse</h3><h4><strong><a href="https://www.dataengineeringweekly.com/p/thinking-like-a-data-engineer">Thinking Like a Data Engineer</a></strong></h4><blockquote><p><strong>&#128214; Topic</strong>: Data Engineer<br>&#129504; <strong>Level</strong>: Beginner</p></blockquote><p><strong>Summary: </strong>Ananth Packkildurai&#8217;s article reminds us that becoming a data engineer isn&#8217;t really about mastering technology and tools; it is about systems thinking. With the help of a few mentors, he realized that curiosity, the ability to model the world around you, the willingness to iterate, and the confidence to trust yourself matter far more than any framework or technology.</p><p>&#128161; <strong>Why is this relevant for DEs</strong>?<br>This article is valuable for data engineers because it reframes the discipline beyond tools, cloud platforms, or pipelines, and focuses on the thinking patterns that truly drive long-term success. In an industry where tools and frameworks change every year, curiosity, systems thinking, iterative design, and self-belief are timeless skills that help engineers navigate complexity and uncertainty. For anyone entering or growing in the data engineering field, these lessons offer a grounding perspective on what actually makes a data engineer effective, resilient, and innovative.</p><h4><a href="https://www.databricks.com/blog/introducing-python-user-defined-table-functions-udtfs-unity-catalog">Python User-Defined Table Functions (UDTFs) in Unity Catalog</a></h4><blockquote><p><strong>&#128214; Topic</strong>: Data Engineering<br>&#129504; <strong>Level</strong>: Intermediate</p></blockquote><p><strong>Summary: </strong>Databricks introduces user-defined table functions (UDTFs) in Unity Catalog for PySpark. This feature allows users to write stateful, table-generating logic in Python and register it as a governed object in Unity Catalog (UC). 
Unlike a scalar UDF, which returns one value per row, a UDTF can return zero, one, or many rows for each input row, which makes it well suited to exploding data or other complex transformations. You define it as a Python class with an <code>eval</code> method and call it directly from SQL using the <code>TABLE()</code> keyword. UDTFs are useful for complex tasks like ML inference or pattern detection, but are a poor fit for simple, one-to-one transformations, which built-in Spark functions handle better.</p><p>&#128161; <strong>Why is this relevant for DEs</strong>?<br>With UDTFs, data engineers can now implement complex Python logic once, register it centrally under Unity Catalog&#8217;s security model, and instantly make it available for use across all workspaces, SQL Endpoints, and pipelines.</p><p>UDTFs also provide automated lineage capture, reduce code duplication, and let heavy setup logic (like loading a model) run only once per partition via the Python class structure, which can significantly boost performance for stateful processing.</p><h4><a href="https://netflixtechblog.com/how-and-why-netflix-built-a-real-time-distributed-graph-part-1-ingesting-and-processing-data-80113e124acc">Netflix: Real-Time Distributed Graph (Data Ingestion and Processing)</a></h4><blockquote><p><strong>&#128214; Topic</strong>: Data Architecture<br>&#129504; <strong>Level</strong>: Advanced</p></blockquote><p>Netflix has evolved from pure video streaming into a multi-vertical platform offering ads, live events, and mobile games, which requires connecting member behavior across many devices and services. Its microservices architecture created siloed data, making it challenging to understand cross-app interactions. To solve this, Netflix built a Real-Time Distributed Graph (RDG) that links events such as watching, logging in, or playing games into a unified, relationship-driven graph. 
Events flow from devices &#8594; API Gateway &#8594; Kafka &#8594; Apache Flink, where they are filtered, enriched, deduplicated, and transformed into graph nodes and edges before being stored through their Data Mesh at &gt;5 million records per second. Distributed Flink jobs (one per Kafka topic) allow scalable, low-latency processing. The RDG stores a property-graph model of nodes (members, titles, devices) and edges (watching, logging in, playing).</p><p>&#128161; <strong>Why is this relevant for DEs</strong>?<br>This system illustrates the real-world challenges DEs face when dealing with large-scale, cross-device behavioral data. Netflix&#8217;s RDG shows how traditional warehouses or microservice-siloed datasets fail for real-time identity stitching, behavioral tracking, and relationship-centric analytics. The architecture highlights core DE competencies: event-driven design with Kafka, low-latency stream processing with Flink, schema governance with Avro + registry, backfill strategies with Iceberg, and graph modeling. 
For data engineers, the RDG is a model for designing high-throughput, low-latency, scalable systems that unify fragmented data to power personalization, fraud detection, recommendations, and cross-domain insights.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k-uV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda349e7-1620-4170-9a92-909b89cd8ef4_1706x910.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k-uV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda349e7-1620-4170-9a92-909b89cd8ef4_1706x910.png 424w, https://substackcdn.com/image/fetch/$s_!k-uV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda349e7-1620-4170-9a92-909b89cd8ef4_1706x910.png 848w, https://substackcdn.com/image/fetch/$s_!k-uV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda349e7-1620-4170-9a92-909b89cd8ef4_1706x910.png 1272w, https://substackcdn.com/image/fetch/$s_!k-uV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda349e7-1620-4170-9a92-909b89cd8ef4_1706x910.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k-uV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda349e7-1620-4170-9a92-909b89cd8ef4_1706x910.png" width="1456" height="777" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fda349e7-1620-4170-9a92-909b89cd8ef4_1706x910.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:777,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:356918,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/178289327?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda349e7-1620-4170-9a92-909b89cd8ef4_1706x910.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!k-uV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda349e7-1620-4170-9a92-909b89cd8ef4_1706x910.png 424w, https://substackcdn.com/image/fetch/$s_!k-uV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda349e7-1620-4170-9a92-909b89cd8ef4_1706x910.png 848w, https://substackcdn.com/image/fetch/$s_!k-uV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda349e7-1620-4170-9a92-909b89cd8ef4_1706x910.png 1272w, https://substackcdn.com/image/fetch/$s_!k-uV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda349e7-1620-4170-9a92-909b89cd8ef4_1706x910.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://netflixtechblog.com/how-and-why-netflix-built-a-real-time-distributed-graph-part-1-ingesting-and-processing-data-80113e124acc">source</a></figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1BEZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa35de7a7-53ba-4d72-98e4-6bab3b956cf1_708x112.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1BEZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa35de7a7-53ba-4d72-98e4-6bab3b956cf1_708x112.png 424w, 
https://substackcdn.com/image/fetch/$s_!1BEZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa35de7a7-53ba-4d72-98e4-6bab3b956cf1_708x112.png 848w, https://substackcdn.com/image/fetch/$s_!1BEZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa35de7a7-53ba-4d72-98e4-6bab3b956cf1_708x112.png 1272w, https://substackcdn.com/image/fetch/$s_!1BEZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa35de7a7-53ba-4d72-98e4-6bab3b956cf1_708x112.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1BEZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa35de7a7-53ba-4d72-98e4-6bab3b956cf1_708x112.png" width="708" height="112" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a35de7a7-53ba-4d72-98e4-6bab3b956cf1_708x112.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:112,&quot;width&quot;:708,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:33106,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/178289327?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa35de7a7-53ba-4d72-98e4-6bab3b956cf1_708x112.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!1BEZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa35de7a7-53ba-4d72-98e4-6bab3b956cf1_708x112.png 424w, https://substackcdn.com/image/fetch/$s_!1BEZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa35de7a7-53ba-4d72-98e4-6bab3b956cf1_708x112.png 848w, https://substackcdn.com/image/fetch/$s_!1BEZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa35de7a7-53ba-4d72-98e4-6bab3b956cf1_708x112.png 1272w, https://substackcdn.com/image/fetch/$s_!1BEZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa35de7a7-53ba-4d72-98e4-6bab3b956cf1_708x112.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><a href="https://netflixtechblog.com/how-and-why-netflix-built-a-real-time-distributed-graph-part-1-ingesting-and-processing-data-80113e124acc">source</a></figcaption></figure></div><h4><a href="https://leaddev.com/leadership/become-better-force-multiplier-4-steps">Become a better force multiplier in 4 steps</a></h4><blockquote><p><strong>&#128214; Topic</strong>: Career Development<br>&#129504; <strong>Level</strong>: Intermediate</p></blockquote><p>Force multiplication means creating a large, sustained impact without being directly involved in every task. Individual Contributors (ICs) can achieve this by sharing knowledge instead of repeating answers, delegating with guidance rather than control, removing friction points that slow others down, and stepping back to connect broader patterns across teams. At its heart, force multiplication is about shifting from doing everything yourself to helping others move faster and grow. 
It&#8217;s the impact you create when you document what you know, coach teammates, remove friction, and help everyone stay aligned. In the end, it&#8217;s not about taking on more work; it&#8217;s about empowering people so things keep moving forward even when you&#8217;re not in the room.</p><p>&#128161; <strong>Why is this relevant for DEs</strong>?<br>For data engineers, force multiplication is essential because modern data systems scale far beyond what any single engineer can build or maintain alone. High-impact DEs create durable value by documenting pipelines and playbooks, mentoring teammates on standards, designing reusable data frameworks, removing infrastructure bottlenecks, and aligning cross-functional teams around unified data models or platform capabilities. Whether you&#8217;re building ingestion frameworks, managing governance, introducing orchestration standards, or guiding architectural decisions, your ability to enable ML engineers, software teams, and fellow DEs multiplies organizational velocity. 
Force multiplication turns a DE from a &#8220;pipeline builder&#8221; into a strategic technical leader whose influence shapes reliability, scalability, and long-term data culture.</p><div><hr></div><h3>&#127908; Online Summit: What&#8217;s Ahead in 2026 for Data &amp; Analytics</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mGFI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3197ddb1-48d9-4408-872a-fa200721bb1d_554x242.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mGFI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3197ddb1-48d9-4408-872a-fa200721bb1d_554x242.png 424w, https://substackcdn.com/image/fetch/$s_!mGFI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3197ddb1-48d9-4408-872a-fa200721bb1d_554x242.png 848w, https://substackcdn.com/image/fetch/$s_!mGFI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3197ddb1-48d9-4408-872a-fa200721bb1d_554x242.png 1272w, https://substackcdn.com/image/fetch/$s_!mGFI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3197ddb1-48d9-4408-872a-fa200721bb1d_554x242.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mGFI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3197ddb1-48d9-4408-872a-fa200721bb1d_554x242.png" width="416" height="181.71841155234657" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3197ddb1-48d9-4408-872a-fa200721bb1d_554x242.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:242,&quot;width&quot;:554,&quot;resizeWidth&quot;:416,&quot;bytes&quot;:75371,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/178289327?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2410a739-6f71-4957-812c-dab99ceed59e_627x258.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mGFI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3197ddb1-48d9-4408-872a-fa200721bb1d_554x242.png 424w, https://substackcdn.com/image/fetch/$s_!mGFI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3197ddb1-48d9-4408-872a-fa200721bb1d_554x242.png 848w, https://substackcdn.com/image/fetch/$s_!mGFI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3197ddb1-48d9-4408-872a-fa200721bb1d_554x242.png 1272w, https://substackcdn.com/image/fetch/$s_!mGFI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3197ddb1-48d9-4408-872a-fa200721bb1d_554x242.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The summit focuses on preparing organizations for next-generation trends in data, analytics, and AI by showcasing evolving technologies, modern architectural best practices, and 
emerging AI/ML trends.</p><ul><li><p>&#8220;Streaming Intelligence: Unlocking AI&#8217;s Potential with Real-Time Context&#8221; by Jake Bengtson, VP of AI, Striim.</p></li><li><p>&#8220;Putting GenAI to Work with Business Data: Driving Value and Managing Risk&#8221; by Kristy Hollingshead, Associate Director of Data Science, Further.</p></li></ul><p>&#128073;&#127996; Sign up for the online summit <a href="https://tdwi.org/events/virtual/dec/whats-ahead-for-data-and-analytics-in-2026/home.aspx#nav-agenda">HERE</a> (free registration).</p><div><hr></div><h3>&#128278; Featured Read</h3><h4><strong>SIMD: The real superpower behind super fast databases</strong></h4><p><em>Author: <a href="https://medium.com/@shubham-tomar">Shubham Tomar</a></em></p><p>Modern analytical databases like ClickHouse, DuckDB, BigQuery, and Redshift feel incredibly fast because they&#8217;re smart at every step: how they store data, how they fetch it, and especially how they process it. Things like columnar storage, compression, and pruning help reduce the amount of data they touch, but the real boost comes from what happens inside the CPU when it starts crunching the numbers. This is where <a href="https://celerdata.com/glossary/single-instruction-multiple-data-simd">SIMD</a> (Single Instruction, Multiple Data) becomes the hidden superpower. Instead of performing operations one value at a time, SIMD allows CPUs to process multiple values simultaneously using wide vector registers (through instruction sets such as SSE, AVX, and AVX-512). Because registers operate at near-zero latency compared to RAM, vectorized operations massively boost throughput for large-scale analytical workloads.</p><p>These databases exploit SIMD through vectorized execution engines, where operations are performed on batches of values rather than one row at a time. Early systems like MonetDB/X100 introduced this idea, and modern engines such as DuckDB, ClickHouse, and other columnar databases have fully embraced it. 
By processing batches of thousands of values with SIMD instructions that each operate on several values at once, these systems make far better use of the CPU&#8217;s cache, avoid unnecessary branching, and can achieve speedups of 10x or more.</p><p>For data engineers, this means queries return faster without endless tuning, pipelines scale smoothly as data grows, and compute costs stay under control because the engine does far more work per CPU cycle. Understanding how SIMD and <a href="https://www.dremio.com/wiki/vectorized-query-execution/">vectorized execution</a> work helps data engineers make better-informed choices when selecting analytical databases during the design phase. When you know which engines fully utilize SIMD under the hood, you can pick systems that deliver faster queries, lower compute costs, and better scalability right from day one instead of discovering performance issues later and trying to fix them with tuning, caching, or hardware upgrades.</p><p><strong>Key Features and Learnings</strong></p><ul><li><p><strong>SIMD enables data-level parallelism:</strong> one CPU instruction can operate on multiple values at once.</p></li><li><p><strong>CPU registers are the real performance hotspot:</strong> operations on registers take about 1 cycle vs. 
100&#8211;200 cycles for RAM access.</p></li><li><p><strong>Massive speedups for scans &amp; aggregates:</strong> SUM, COUNT, and predicates run across entire vectors in one instruction.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4ajK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F853e94ce-6181-400c-a6fc-6b8b71b4fbcf_759x354.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4ajK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F853e94ce-6181-400c-a6fc-6b8b71b4fbcf_759x354.png 424w, https://substackcdn.com/image/fetch/$s_!4ajK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F853e94ce-6181-400c-a6fc-6b8b71b4fbcf_759x354.png 848w, https://substackcdn.com/image/fetch/$s_!4ajK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F853e94ce-6181-400c-a6fc-6b8b71b4fbcf_759x354.png 1272w, https://substackcdn.com/image/fetch/$s_!4ajK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F853e94ce-6181-400c-a6fc-6b8b71b4fbcf_759x354.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4ajK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F853e94ce-6181-400c-a6fc-6b8b71b4fbcf_759x354.png" width="759" height="354" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/853e94ce-6181-400c-a6fc-6b8b71b4fbcf_759x354.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:354,&quot;width&quot;:759,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:141592,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/178289327?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F853e94ce-6181-400c-a6fc-6b8b71b4fbcf_759x354.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4ajK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F853e94ce-6181-400c-a6fc-6b8b71b4fbcf_759x354.png 424w, https://substackcdn.com/image/fetch/$s_!4ajK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F853e94ce-6181-400c-a6fc-6b8b71b4fbcf_759x354.png 848w, https://substackcdn.com/image/fetch/$s_!4ajK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F853e94ce-6181-400c-a6fc-6b8b71b4fbcf_759x354.png 1272w, https://substackcdn.com/image/fetch/$s_!4ajK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F853e94ce-6181-400c-a6fc-6b8b71b4fbcf_759x354.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://blog.dataengineerthings.org/simd-the-real-superpower-behind-super-fast-databases-104ce03dfa20">source</a></figcaption></figure></div><p>&#128073;&#127996; Read the full article <strong><a href="https://blog.dataengineerthings.org/simd-the-real-superpower-behind-super-fast-databases-104ce03dfa20">HERE</a></strong>.</p><div><hr></div><h3><strong>&#128142; Open Source Gems</strong></h3><p><strong><a href="https://www.daft.ai/blog/introducing-flotilla-simplifying-multimodal-data-processing-at-scale">Flotilla - Distributed engine - Multimodal data workloads</a></strong><br>Flotilla is <a href="https://docs.daft.ai/en/stable/architecture/">Daft&#8217;s</a> next-generation distributed execution engine built for multimodal data workloads that involve massive PDFs, images, audio, video, embeddings, and 
GPU-heavy operations. Flotilla introduces a node-level, streaming, concurrent processing model powered by its Swordfish engine. Daft&#8217;s published benchmarks show Flotilla running 2&#8211;7&#215; faster than <a href="https://docs.ray.io/en/latest/data/data.html">Ray Data</a> and 4&#8211;18&#215; faster than Spark, while using smaller, cheaper clusters and avoiding out-of-memory (OOM) failures through bounded-memory streaming execution.</p><p><strong>Why is this useful for data engineers?<br></strong>Flotilla simplifies building scalable multimodal pipelines, which previously required complex configuration tuning, cluster resizing, and careful partition management. You can now write workflows such as PDF ingestion, image processing, video object detection, transcription, and embedding generation declaratively with DataFrame APIs, and Flotilla distributes their execution across CPUs and GPUs. According to Daft, this makes Flotilla faster, more reliable, and easier to operate than traditional distributed engines, a productivity gain for data engineers working on AI, LLM retrieval systems, and unstructured-data-heavy applications.</p><p>&#129489;&#8205;&#128187; Daft on <a href="https://github.com/Eventual-Inc/Daft">GitHub</a>. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!S_37!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa734d9d0-5f0f-4d12-b478-67c6290d1aad_619x446.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!S_37!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa734d9d0-5f0f-4d12-b478-67c6290d1aad_619x446.png 424w, https://substackcdn.com/image/fetch/$s_!S_37!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa734d9d0-5f0f-4d12-b478-67c6290d1aad_619x446.png 848w, https://substackcdn.com/image/fetch/$s_!S_37!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa734d9d0-5f0f-4d12-b478-67c6290d1aad_619x446.png 1272w, https://substackcdn.com/image/fetch/$s_!S_37!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa734d9d0-5f0f-4d12-b478-67c6290d1aad_619x446.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!S_37!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa734d9d0-5f0f-4d12-b478-67c6290d1aad_619x446.png" width="728" height="524.5363489499192" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a734d9d0-5f0f-4d12-b478-67c6290d1aad_619x446.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:446,&quot;width&quot;:619,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:26385,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/178289327?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F995e88ff-0273-4821-99b5-0e86dfd5d79c_619x561.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!S_37!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa734d9d0-5f0f-4d12-b478-67c6290d1aad_619x446.png 424w, https://substackcdn.com/image/fetch/$s_!S_37!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa734d9d0-5f0f-4d12-b478-67c6290d1aad_619x446.png 848w, https://substackcdn.com/image/fetch/$s_!S_37!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa734d9d0-5f0f-4d12-b478-67c6290d1aad_619x446.png 1272w, https://substackcdn.com/image/fetch/$s_!S_37!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa734d9d0-5f0f-4d12-b478-67c6290d1aad_619x446.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2Gmy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cddd4d3-17bc-49a3-b922-88b2bdc732a1_770x370.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2Gmy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cddd4d3-17bc-49a3-b922-88b2bdc732a1_770x370.png 424w, 
https://substackcdn.com/image/fetch/$s_!2Gmy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cddd4d3-17bc-49a3-b922-88b2bdc732a1_770x370.png 848w, https://substackcdn.com/image/fetch/$s_!2Gmy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cddd4d3-17bc-49a3-b922-88b2bdc732a1_770x370.png 1272w, https://substackcdn.com/image/fetch/$s_!2Gmy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cddd4d3-17bc-49a3-b922-88b2bdc732a1_770x370.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2Gmy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cddd4d3-17bc-49a3-b922-88b2bdc732a1_770x370.png" width="728" height="349.8181818181818" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7cddd4d3-17bc-49a3-b922-88b2bdc732a1_770x370.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:370,&quot;width&quot;:770,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:61405,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/178289327?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cddd4d3-17bc-49a3-b922-88b2bdc732a1_770x370.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!2Gmy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cddd4d3-17bc-49a3-b922-88b2bdc732a1_770x370.png 424w, https://substackcdn.com/image/fetch/$s_!2Gmy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cddd4d3-17bc-49a3-b922-88b2bdc732a1_770x370.png 848w, https://substackcdn.com/image/fetch/$s_!2Gmy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cddd4d3-17bc-49a3-b922-88b2bdc732a1_770x370.png 1272w, https://substackcdn.com/image/fetch/$s_!2Gmy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cddd4d3-17bc-49a3-b922-88b2bdc732a1_770x370.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://www.daft.ai/blog/benchmarks-for-multimodal-ai-workloads">source</a></figcaption></figure></div><h3><strong>&#128161; DE Tip of the Month</strong></h3><p><strong><a href="https://www.gability.com/en/courses/data-modeling/02-dimension-types/08-fast-changing-dimension/#:~:text=Fast%20Changing%20Dimension%20Handling%20*%20Identify%20the,dimension%20with%20the%20main%20dimension%20using%20mini%2Ddimension.">Handling rapidly changing dimensions in the data warehouse:</a></strong></p><p>Rapidly changing dimensions should not be handled with SCD Type 2 alone: every change to a volatile attribute adds a new row, so the dimension bloats and queries slow down. The best practice is to split attributes by how often they change: keep slowly changing attributes in the main dimension and track their history with Type 2, and move frequently changing attributes into a separate mini-dimension.</p><ul><li><p>Identify fast- vs. slow-changing attributes: Attributes that seldom change should remain in the main dimension table.
</p></li><li><p><a href="https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/junk-dimension/">Create a junk dimension</a>: Move the fast-changing attributes into a junk dimension table.</p></li><li><p><a href="https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/type-4-mini-dimension/">Create a mini-dimension table</a>: Use it as a bridge table that links the main dimension to the junk dimension.</p></li><li><p>Update fact tables: Add both the main dimension key and the mini-dimension key to the fact tables.</p></li></ul><div><hr></div><p>Let us know what you like most about the newsletter. See you next time!</p><div class="poll-embed" data-attrs="{&quot;id&quot;:412154}" data-component-name="PollToDOM"></div><p>Cheers,</p><p>Ananda, Chozhan, Srivignesh</p><div><hr></div><h4>&#8505;&#65039; About Data Engineer Things</h4><p><a href="https://www.dataengineerthings.org/">Data Engineer Things</a> (DET) is a global community built by data engineers for data engineers. Subscribe to the <a href="https://dataengineerthings.substack.com/">newsletter</a> and follow us on <a href="https://www.linkedin.com/company/data-engineer-things/posts/?feedView=all">LinkedIn</a> to gain access to exclusive learning resources and networking opportunities, including articles, webinars, meetups, conferences, mentorship, and much more.</p>
]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter #25 (Nov 2025)]]></title><description><![CDATA[From Reddit Questions to LinkedIn's AI Stack: Lessons Across Data Decades]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-25</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-25</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Wed, 05 Nov 2025 16:02:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!hub_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00baafdb-3511-447f-bc4b-b0344ff464fa_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hub_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00baafdb-3511-447f-bc4b-b0344ff464fa_1456x1048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hub_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00baafdb-3511-447f-bc4b-b0344ff464fa_1456x1048.png 424w, 
https://substackcdn.com/image/fetch/$s_!hub_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00baafdb-3511-447f-bc4b-b0344ff464fa_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!hub_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00baafdb-3511-447f-bc4b-b0344ff464fa_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!hub_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00baafdb-3511-447f-bc4b-b0344ff464fa_1456x1048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hub_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00baafdb-3511-447f-bc4b-b0344ff464fa_1456x1048.png" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/00baafdb-3511-447f-bc4b-b0344ff464fa_1456x1048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:255727,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/175724520?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00baafdb-3511-447f-bc4b-b0344ff464fa_1456x1048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!hub_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00baafdb-3511-447f-bc4b-b0344ff464fa_1456x1048.png 424w, https://substackcdn.com/image/fetch/$s_!hub_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00baafdb-3511-447f-bc4b-b0344ff464fa_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!hub_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00baafdb-3511-447f-bc4b-b0344ff464fa_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!hub_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00baafdb-3511-447f-bc4b-b0344ff464fa_1456x1048.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Highlights of the DET Newsletter November 2025 Edition</figcaption></figure></div><p>Hey folks,</p><p>October brought me to the Pacific Northwest for the Airflow Summit 2025, and I&#8217;m still processing everything I learned and everyone I met.</p><p>I caught up with Ben (Seattle Data Guy) over coffee, and we did what data engineers do best: talked about the <em>good old days</em> when Hadoop was king and SSIS was the ETL workhorse &#128517;.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BZmd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5b93a0-26f3-4067-b381-f5437650fd22_1065x489.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BZmd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5b93a0-26f3-4067-b381-f5437650fd22_1065x489.png 424w, https://substackcdn.com/image/fetch/$s_!BZmd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5b93a0-26f3-4067-b381-f5437650fd22_1065x489.png 848w, https://substackcdn.com/image/fetch/$s_!BZmd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5b93a0-26f3-4067-b381-f5437650fd22_1065x489.png 1272w, 
https://substackcdn.com/image/fetch/$s_!BZmd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5b93a0-26f3-4067-b381-f5437650fd22_1065x489.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BZmd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5b93a0-26f3-4067-b381-f5437650fd22_1065x489.png" width="1065" height="489" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ce5b93a0-26f3-4067-b381-f5437650fd22_1065x489.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:489,&quot;width&quot;:1065,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:852706,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/175724520?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5b93a0-26f3-4067-b381-f5437650fd22_1065x489.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BZmd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5b93a0-26f3-4067-b381-f5437650fd22_1065x489.png 424w, https://substackcdn.com/image/fetch/$s_!BZmd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5b93a0-26f3-4067-b381-f5437650fd22_1065x489.png 848w, 
https://substackcdn.com/image/fetch/$s_!BZmd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5b93a0-26f3-4067-b381-f5437650fd22_1065x489.png 1272w, https://substackcdn.com/image/fetch/$s_!BZmd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5b93a0-26f3-4067-b381-f5437650fd22_1065x489.png 1456w" sizes="100vw"></picture></div></a><figcaption class="image-caption">A photo of Ben and me, plus one from my visit to the Pike Place Market in Seattle</figcaption></figure></div><p>For me, the 
conference was a reminder that tools like Airflow aren&#8217;t just open source software. They are collective intelligence. When OpenAI shares how they use Airflow, when SAP shows how they&#8217;re building RAG pipelines, when someone releases an open benchmark framework, we all get better. Open source is king!</p><p>Ben said something in his talk that I keep thinking about: Companies aren&#8217;t in the same decade when it comes to data infrastructure. Some are still wrangling Hadoop, while others are orchestrating LLM pipelines. But batch is still king, dashboards are still slow, and aligning with the business is still hard. The tools change, but the fundamentals don&#8217;t.</p><p>There&#8217;s something grounding about that. Much like the changing seasons, data engineering is about balance: keeping systems reliable while adapting to change. <strong>And when we do that work in the open, sharing our lessons and mistakes, we all move forward together</strong>.</p><p>Hope you find something here that sparks your curiosity.</p><p>Happy reading, and thanks for being here.<br>- Volker</p><div><hr></div><h3><strong>&#128218;</strong> Data Pulse</h3><h4><strong><a href="https://motherduck.com/blog/data-engineers-answer-10-top-reddit-questions/">4 Senior Data Engineers Answer 10 Top Reddit Questions</a></strong></h4><blockquote><p><strong>&#128214; Topic</strong>: Data Engineering<br>&#129504; <strong>Level</strong>: Beginner</p></blockquote><p><strong>Summary</strong>: Four data engineers (Ben Rogojan, Julien Hurault, Mehdi Ouazza, and Simon Sp&#228;ti) answer the 10 most upvoted questions from r/dataengineering (174K members), covering topics from interview preparation and data quality to cloud costs and career wisdom.</p><p>&#128161; <strong>Why is this relevant for DEs</strong>?<br>This article addresses the real-world challenges data engineers face daily, offering battle-tested wisdom from DEs with decades of combined experience. 
Here are our top 5 lessons from the post:</p><ul><li><p>Fundamentals matter more than chasing the newest tools.</p></li><li><p>Select your data infrastructure based on real-world factors like budget, timeline, and team skills rather than following trends.</p></li><li><p>Seemingly simple changes, such as a schema change, are more complex than they look; handling them well takes strong leadership, a clear strategy, and transparent communication.</p></li><li><p>Challenge demands for real-time data to avoid unnecessary complexity: a &#8220;must-have&#8221; often turns out to be a case of &#8220;you aren&#8217;t gonna need it&#8221; (<a href="https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it">YAGNI</a>).</p></li><li><p>Focus on stakeholder needs and business objectives while designing systems that can gracefully handle and recover from technical issues.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nnQe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195a79c3-995e-4709-8414-948a96a4f414_886x333.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nnQe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195a79c3-995e-4709-8414-948a96a4f414_886x333.png 424w, https://substackcdn.com/image/fetch/$s_!nnQe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195a79c3-995e-4709-8414-948a96a4f414_886x333.png 848w, https://substackcdn.com/image/fetch/$s_!nnQe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195a79c3-995e-4709-8414-948a96a4f414_886x333.png 1272w, 
https://substackcdn.com/image/fetch/$s_!nnQe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195a79c3-995e-4709-8414-948a96a4f414_886x333.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nnQe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195a79c3-995e-4709-8414-948a96a4f414_886x333.png" width="886" height="333" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/195a79c3-995e-4709-8414-948a96a4f414_886x333.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:333,&quot;width&quot;:886,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:194808,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/175724520?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195a79c3-995e-4709-8414-948a96a4f414_886x333.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nnQe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195a79c3-995e-4709-8414-948a96a4f414_886x333.png 424w, https://substackcdn.com/image/fetch/$s_!nnQe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195a79c3-995e-4709-8414-948a96a4f414_886x333.png 848w, 
https://substackcdn.com/image/fetch/$s_!nnQe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195a79c3-995e-4709-8414-948a96a4f414_886x333.png 1272w, https://substackcdn.com/image/fetch/$s_!nnQe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F195a79c3-995e-4709-8414-948a96a4f414_886x333.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a 
href="https://motherduck.com/blog/data-engineers-answer-10-top-reddit-questions/">source</a></figcaption></figure></div><h4><strong><a href="https://blog.dataexpert.io/p/the-2025-breaking-into-data-engineering-roadmap">The 2025 AI + Data Engineering Roadmap</a></strong></h4><blockquote><p><strong>&#128214; Topic</strong>: Career<br>&#129504; <strong>Level</strong>: Intermediate</p></blockquote><p><strong>Summary</strong>: Zach Wilson&#8217;s 2025 data engineering roadmap outlines the essential skills needed to break into the field, emphasizing that traditional SQL and Python knowledge must now be complemented with AI integration capabilities. The guide covers six critical areas: SQL fundamentals, Python with AI integration, distributed compute systems, data modeling and quality, portfolio projects, and personal branding.</p><p>&#128161; <strong>Why is this relevant for DEs</strong>?<br>This article reflects a rapidly evolving landscape in which data engineering and AI are converging. Modern data engineers can no longer rely solely on traditional ETL skills; they must understand how to build pipelines that support AI workloads, including embedding generation, vector search, and RAG implementations. The roadmap provides actionable guidance on which technologies to prioritize (Spark, Iceberg, vector databases) and emphasizes practical demonstration through portfolio projects that showcase both data engineering and AI integration skills.</p><h4><strong><a href="https://blog.bytebytego.com/p/the-evolution-of-linkedins-generative">The Evolution of LinkedIn&#8217;s Generative AI Tech Stack</a></strong></h4><blockquote><p><strong>&#128214; Topic</strong>: Generative AI<br>&#129504; <strong>Level</strong>: Advanced</p></blockquote><p><strong>Summary</strong>: LinkedIn evolved its GenAI infrastructure from fragmented, team-specific implementations to a unified platform capable of supporting sophisticated multi-agent systems. 
The transformation involved shifting from Java to Python as the primary language, adopting LangChain as the application framework, and building centralized systems for prompt management, skill registries, memory, and model inference. By 2025, LinkedIn leveraged its existing Messaging infrastructure as the orchestration backbone for AI agents, enabling complex workflows like the Hiring Assistant while maintaining strict privacy, security, and human-in-the-loop controls.</p><p>&#128161; <strong>Why is this relevant for DEs</strong>?<br>This article shows three architectural principles data engineers can apply when building AI systems. Reuse before rebuild: LinkedIn used existing Messaging infrastructure for orchestration rather than creating new systems. Standardize foundations first: centralized prompt management and skill registries prevented fragmentation as teams scaled. Reduce friction: adopting Python for both development and production eliminated costly translation steps. Successful AI platforms prioritize developer velocity and shared abstractions over chasing the newest frameworks.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Vzkz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66b97856-da59-4b0d-ad54-acf5d36ccdc1_954x1064.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Vzkz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66b97856-da59-4b0d-ad54-acf5d36ccdc1_954x1064.png 424w, https://substackcdn.com/image/fetch/$s_!Vzkz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66b97856-da59-4b0d-ad54-acf5d36ccdc1_954x1064.png 848w, 
https://substackcdn.com/image/fetch/$s_!Vzkz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66b97856-da59-4b0d-ad54-acf5d36ccdc1_954x1064.png 1272w, https://substackcdn.com/image/fetch/$s_!Vzkz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66b97856-da59-4b0d-ad54-acf5d36ccdc1_954x1064.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Vzkz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66b97856-da59-4b0d-ad54-acf5d36ccdc1_954x1064.png" width="954" height="1064" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/66b97856-da59-4b0d-ad54-acf5d36ccdc1_954x1064.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1064,&quot;width&quot;:954,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:273801,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/175724520?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66b97856-da59-4b0d-ad54-acf5d36ccdc1_954x1064.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Vzkz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66b97856-da59-4b0d-ad54-acf5d36ccdc1_954x1064.png 424w, 
https://substackcdn.com/image/fetch/$s_!Vzkz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66b97856-da59-4b0d-ad54-acf5d36ccdc1_954x1064.png 848w, https://substackcdn.com/image/fetch/$s_!Vzkz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66b97856-da59-4b0d-ad54-acf5d36ccdc1_954x1064.png 1272w, https://substackcdn.com/image/fetch/$s_!Vzkz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66b97856-da59-4b0d-ad54-acf5d36ccdc1_954x1064.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption"><a href="https://blog.bytebytego.com/p/the-evolution-of-linkedins-generative">source</a></figcaption></figure></div><h4><a href="https://www.alibabacloud.com/blog/building-a-unified-lakehouse-for-large-scale-recommendation-systems-with-apache-paimon-at-tiktok_602568">Building a Unified Lakehouse for Large-Scale Recommendation Systems with Apache Paimon at TikTok</a></h4><blockquote><p><strong>&#128214; Topic</strong>: Lakehouse Architecture<br>&#129504; <strong>Level</strong>: Advanced</p></blockquote><p><strong>Summary</strong>: TikTok built a unified Lakehouse using <a href="https://paimon.apache.org/">Apache Paimon</a> to support Large-scale Recommendation Models (LRMs) that rely on user behavior sequences, replacing fragmented pipelines with a four-layer architecture (DIM, DWD, DWS, ADS) that unifies stream and batch processing. The solution processes 600TB of data in 5 hours using <a href="https://nightlies.apache.org/flink/flink-cdc-docs-stable/">Flink CDC</a> for real-time features and Spark for point-in-time joins, while reducing state size from 1PB to 300GB through user-grouped samples.</p><p>&#128161; <strong>Why is this relevant for DEs</strong>?<br>Schema standardization across teams eliminates 70%+ of redundant pipeline development, while unified storage with transactional guarantees solves the classic Lambda architecture consistency problems that plague many organizations. The bucket partitioning by user ID strategy is particularly valuable for any user-centric analytics, enabling efficient sort-merge joins without expensive shuffling operations. 
Most importantly, TikTok&#8217;s approach of treating data as shared assets rather than team-owned silos demonstrates how proper lakehouse design can turn data engineering from reactive pipeline maintenance into proactive platform building. It reduces time-to-insight from days to hours while supporting both real-time serving and batch analytics from a single source of truth.</p><p>&#127909; <strong>Watch the related talk!</strong></p><p><em>Flink Forward Asia 2025 talk by Shuiqiang Chen, Big Data Engineer at TikTok</em></p><div id="youtube2-tQCxZjYgl5U" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;tQCxZjYgl5U&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/tQCxZjYgl5U?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h4><strong><a href="https://www.helenmin.com/blog/developer-marketing-in-the-age-of-ai">Developer Marketing in the Age of AI</a></strong></h4><blockquote><p><strong>&#128214; Topic</strong>: Personal Brand<br>&#129504; <strong>Level</strong>: Beginner</p></blockquote><p><strong>Summary</strong>: AI has transformed developer marketing by expanding the audience to include AI agents, compressing development timelines from days to hours, and enabling non-traditional developers to ship production code. While the core principles remain (show, don&#8217;t tell; be helpful; remove friction), the key differentiator has shifted from technical capability to brand taste and trust in an era of abundant AI-generated demos. 
</p><p>&#128161; <strong>Why is this relevant for DEs</strong>?<br>As data engineers build their personal brands and engage with the community, this article highlights critical shifts in how technical professionals should present their work and expertise. In an era where AI can generate demos and code quickly, a data engineer&#8217;s personal brand becomes a trust signal: showing taste, depth of thinking, and clear opinions about data architecture matters more than just technical capability. Data engineers should focus on creating content that serves both humans and LLMs (documentation, tutorials, architecture explanations), engage authentically in real-time communities (Discord, Slack, Twitter), and demonstrate genuine helpfulness rather than self-promotion.</p><div><hr></div><h3>&#127908; Online Conference: Signals Summit 2025</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!F4mb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b45a1-ca07-4f86-b4b0-f6a47cf04c66_1600x271.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!F4mb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b45a1-ca07-4f86-b4b0-f6a47cf04c66_1600x271.png 424w, https://substackcdn.com/image/fetch/$s_!F4mb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b45a1-ca07-4f86-b4b0-f6a47cf04c66_1600x271.png 848w, https://substackcdn.com/image/fetch/$s_!F4mb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b45a1-ca07-4f86-b4b0-f6a47cf04c66_1600x271.png 1272w, 
https://substackcdn.com/image/fetch/$s_!F4mb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b45a1-ca07-4f86-b4b0-f6a47cf04c66_1600x271.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!F4mb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b45a1-ca07-4f86-b4b0-f6a47cf04c66_1600x271.png" width="1456" height="247" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c2b45a1-ca07-4f86-b4b0-f6a47cf04c66_1600x271.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:247,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!F4mb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b45a1-ca07-4f86-b4b0-f6a47cf04c66_1600x271.png 424w, https://substackcdn.com/image/fetch/$s_!F4mb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b45a1-ca07-4f86-b4b0-f6a47cf04c66_1600x271.png 848w, https://substackcdn.com/image/fetch/$s_!F4mb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b45a1-ca07-4f86-b4b0-f6a47cf04c66_1600x271.png 1272w, 
https://substackcdn.com/image/fetch/$s_!F4mb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b45a1-ca07-4f86-b4b0-f6a47cf04c66_1600x271.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Signals Summit 2025 brings together data leaders to explore how to design resilient, reliable, and AI-ready systems. Here are a few sessions that caught our eye:</p><ul><li><p>&#8220;The Promise of MCP-Powered Workflows: A New Operating System for Data&#8221; by Maxime Beauchemin, creator of Apache Airflow.</p></li><li><p>&#8220;Beyond the Modern Data Stack: Why Architecture Is the Real Source of Trust&#8221; by Joe Reis, co-author of <em>Fundamentals of Data Engineering</em>.</p></li><li><p>&#8220;Data Contracts, Metadata, and Lineage: What Actually Works?&#8221; by Hannah Rounds and Callum O&#8217;Connor.</p></li></ul><p>&#128073;&#127996; Sign up for the online summit <a href="https://hubs.ly/Q03P_4lX0">HERE</a> (free registration).</p><p><em>(This message is sponsored by Sifflet.)</em></p><div><hr></div><h3>&#129309;&#127996; DET Mentorship Program</h3><p>The new DET Mentorship Program is here! The revamped mentorship program is designed to meet you where you are in your professional journey by offering different types of mentorship experiences. During the pilot phase, you can book:</p><ul><li><p>Career Development: Beginner + Early Career</p></li><li><p>Career Development: Mid-Career</p></li><li><p>Resume Review</p></li><li><p>Interview Preparation</p></li></ul><p>&#128073;&#127996; To get started and find a mentor (or apply to be a mentor), read the <a href="http://mentorship.dataengineerthings.org/">program guide</a>.</p><div><hr></div><h3>&#9999;&#65039; DET Writers Workshop Series</h3><p>The DET Medium Editorial team is hosting a 4-week virtual technical writing workshop series. 
This workshop is designed to help intermediate to advanced writers sharpen their technical writing skills.</p><p><strong>&#128467;&#65039; Dates: </strong>Every Monday from Nov 24 through Dec 15, 2025</p><p><strong>&#128341; Time:</strong> 6:00 PM - 7:15 PM Eastern</p><p>&#128073;&#127996; Sign up for the workshop <a href="https://forms.gle/ejpia6nzhd1SAnAk7">HERE</a>.</p><div><hr></div><h3>&#128278; Featured Read</h3><h4><strong>Why Data Product Management Is Nothing Like Software Product Management</strong></h4><p><em>Author: <a href="https://medium.com/@cmgambetti">Clay Gambetti</a></em></p><p>Data product management isn&#8217;t just software PM with different content; it&#8217;s a fundamentally different discipline.</p><p>In software PM, conceptual technical understanding suffices. In data PM, it doesn&#8217;t. When an engineer asks about acceptable data latency, they&#8217;re asking a product question that requires technical depth to answer. Say &#8220;as fast as possible&#8221; and you drive expensive architectural decisions. Say &#8220;daily is fine&#8221; and you can use simpler infrastructure. You can&#8217;t make that call without understanding both user needs and technical implications.</p><p>Software products have uncertainty around user needs. Data products add fundamental uncertainty about the data itself. You often don&#8217;t know if data quality will be adequate until real data meets real users. This affects how you scope work, communicate with stakeholders, plan roadmaps, and define success.</p><p><strong>The Five Core Competencies</strong></p><ol><li><p><strong>Technical fluency</strong> to evaluate tradeoffs and understand when constraints are real vs. 
negotiable</p></li><li><p><strong>Adapted user research</strong> that validates data feasibility early and understands downstream impact</p></li><li><p><strong>Translation skills</strong> between business needs, technical requirements, and business risk</p></li><li><p><strong>Cross-discipline stakeholder management</strong> for leaders, engineers, analysts, and compliance</p></li><li><p><strong>Data product sense</strong> to anticipate quality issues and recognize when simpler is better</p></li></ol><p><strong>Managing Tensions, Not Optimizing for One Dimension</strong></p><p>Data product management is about managing tensions: business urgency vs. technical reality, comprehensive solutions vs. focused products, short-term wins vs. long-term investment.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mY-G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cdd019-38b7-418c-9c3f-1d0da0315205_841x574.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mY-G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cdd019-38b7-418c-9c3f-1d0da0315205_841x574.png 424w, https://substackcdn.com/image/fetch/$s_!mY-G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cdd019-38b7-418c-9c3f-1d0da0315205_841x574.png 848w, https://substackcdn.com/image/fetch/$s_!mY-G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cdd019-38b7-418c-9c3f-1d0da0315205_841x574.png 1272w, 
https://substackcdn.com/image/fetch/$s_!mY-G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cdd019-38b7-418c-9c3f-1d0da0315205_841x574.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mY-G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cdd019-38b7-418c-9c3f-1d0da0315205_841x574.png" width="539" height="367.87871581450656" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/29cdd019-38b7-418c-9c3f-1d0da0315205_841x574.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:574,&quot;width&quot;:841,&quot;resizeWidth&quot;:539,&quot;bytes&quot;:92112,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/175724520?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cdd019-38b7-418c-9c3f-1d0da0315205_841x574.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mY-G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cdd019-38b7-418c-9c3f-1d0da0315205_841x574.png 424w, https://substackcdn.com/image/fetch/$s_!mY-G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cdd019-38b7-418c-9c3f-1d0da0315205_841x574.png 848w, 
https://substackcdn.com/image/fetch/$s_!mY-G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cdd019-38b7-418c-9c3f-1d0da0315205_841x574.png 1272w, https://substackcdn.com/image/fetch/$s_!mY-G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29cdd019-38b7-418c-9c3f-1d0da0315205_841x574.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a 
href="https://blog.dataengineerthings.org/why-data-product-management-is-nothing-like-software-product-management-bcb6abac5b64">source</a></figcaption></figure></div><p>&#128073;&#127996; Read the full article <strong><a href="https://blog.dataengineerthings.org/why-data-product-management-is-nothing-like-software-product-management-bcb6abac5b64">HERE</a></strong>.</p><div><hr></div><h3><strong>&#128142; Open Source Gems</strong></h3><p><strong>just</strong><br>just is a command runner that saves and runs project-specific commands from a file called <code>justfile</code>. It offers a simpler, more user-friendly alternative to <code>make</code>, with cross-platform support, command-line arguments for recipes, and a syntax inspired by makefiles but without the build-system complexity.</p><p><strong>Why is this useful for data engineers?<br></strong>It replaces scattered bash scripts and hard-to-remember commands with a single, self-documenting <code>justfile</code> that standardizes how your team runs dbt models, data quality checks, pipeline deployments, and other data workflows. That makes onboarding faster and ensures everyone executes critical data operations consistently.</p><p>&#129489;&#8205;&#128187; just is available on <a href="https://github.com/casey/just">GitHub</a>. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q2KL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49da9caa-a8b2-4926-a430-b437daca9fd9_1686x690.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q2KL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49da9caa-a8b2-4926-a430-b437daca9fd9_1686x690.png 424w, https://substackcdn.com/image/fetch/$s_!q2KL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49da9caa-a8b2-4926-a430-b437daca9fd9_1686x690.png 848w, https://substackcdn.com/image/fetch/$s_!q2KL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49da9caa-a8b2-4926-a430-b437daca9fd9_1686x690.png 1272w, https://substackcdn.com/image/fetch/$s_!q2KL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49da9caa-a8b2-4926-a430-b437daca9fd9_1686x690.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q2KL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49da9caa-a8b2-4926-a430-b437daca9fd9_1686x690.png" width="1456" height="596" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49da9caa-a8b2-4926-a430-b437daca9fd9_1686x690.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:596,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:84677,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/175724520?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49da9caa-a8b2-4926-a430-b437daca9fd9_1686x690.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q2KL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49da9caa-a8b2-4926-a430-b437daca9fd9_1686x690.png 424w, https://substackcdn.com/image/fetch/$s_!q2KL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49da9caa-a8b2-4926-a430-b437daca9fd9_1686x690.png 848w, https://substackcdn.com/image/fetch/$s_!q2KL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49da9caa-a8b2-4926-a430-b437daca9fd9_1686x690.png 1272w, https://substackcdn.com/image/fetch/$s_!q2KL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49da9caa-a8b2-4926-a430-b437daca9fd9_1686x690.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption"><a href="https://github.com/casey/just">source</a></figcaption></figure></div><div><hr></div><h3><strong>&#128161; DE Tip of the Month</strong></h3><p>Build idempotent pipelines! Running the same pipeline multiple times with the same inputs should produce the same result.</p><p>Why it matters: When you need to backfill historical data or retry a failed task, non-idempotent pipelines become your worst enemy. 
They create duplicates, corrupt historical loads, and turn simple reruns into multi-day debugging marathons.</p><p>Four habits that keep a pipeline idempotent:</p><ol><li><p>Replace plain <code>INSERT</code> with an upsert (<code>MERGE</code> or <code>INSERT ... ON CONFLICT</code>, depending on your database), so a rerun overwrites rows instead of duplicating them</p></li><li><p>Avoid <code>datetime.now()</code> in your logic; use the execution context instead</p></li><li><p>Read from specific partitions: always process data for a specific time window</p></li><li><p>Use the execution date: let your orchestrator tell you what data to process</p></li></ol><p>How to work with execution context:</p><ul><li><p>Airflow: <a href="https://airflow.apache.org/docs/apache-airflow/stable/templates-ref.html">Airflow template variables</a></p></li><li><p>Dagster: <a href="https://docs.dagster.io/guides/build/partitions-and-backfills/partitioning-assets">Partitioning assets</a></p></li><li><p>Prefect: <a href="https://docs.prefect.io/v3/concepts/runtime-context">Runtime context</a></p></li></ul><div><hr></div><p>Let us know what you liked most in this newsletter. See you next time!</p><div class="poll-embed" data-attrs="{&quot;id&quot;:398287}" data-component-name="PollToDOM"></div><p>Cheers,</p><p>Chozhan, Srivignesh and Volker</p><div><hr></div><h4>&#8505;&#65039; About Data Engineer Things</h4><p><a href="https://www.dataengineerthings.org/">Data Engineer Things</a> (DET) is a global community built by data engineers for data engineers. 
Subscribe to the <a href="https://dataengineerthings.substack.com/">newsletter</a> and follow us on <a href="https://www.linkedin.com/company/data-engineer-things/posts/?feedView=all">LinkedIn</a> to gain access to exclusive learning resources and networking opportunities, including articles, webinars, meetups, conferences, mentorship, and much more.</p>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter #24 (Oct 2025)]]></title><description><![CDATA[Netflix's journey from simple dashboard to trillion-row scale. Are platforms really simplifying our lives? 
Community insights on cutting through architectural complexity.]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-24</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-24</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Thu, 09 Oct 2025 15:30:43 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!vYvr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1b7fc08-41d9-4aa8-9148-a7510892922d_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vYvr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1b7fc08-41d9-4aa8-9148-a7510892922d_1456x1048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vYvr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1b7fc08-41d9-4aa8-9148-a7510892922d_1456x1048.png 424w, https://substackcdn.com/image/fetch/$s_!vYvr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1b7fc08-41d9-4aa8-9148-a7510892922d_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!vYvr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1b7fc08-41d9-4aa8-9148-a7510892922d_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!vYvr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1b7fc08-41d9-4aa8-9148-a7510892922d_1456x1048.png 
1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vYvr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1b7fc08-41d9-4aa8-9148-a7510892922d_1456x1048.png" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e1b7fc08-41d9-4aa8-9148-a7510892922d_1456x1048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:161459,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/173114685?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1b7fc08-41d9-4aa8-9148-a7510892922d_1456x1048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vYvr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1b7fc08-41d9-4aa8-9148-a7510892922d_1456x1048.png 424w, https://substackcdn.com/image/fetch/$s_!vYvr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1b7fc08-41d9-4aa8-9148-a7510892922d_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!vYvr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1b7fc08-41d9-4aa8-9148-a7510892922d_1456x1048.png 1272w, 
https://substackcdn.com/image/fetch/$s_!vYvr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1b7fc08-41d9-4aa8-9148-a7510892922d_1456x1048.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Highlights of the DET Newsletter October 2025 Edition</figcaption></figure></div><p>Hey folks,</p><p>I&#8217;m writing to you from Cincinnati, where fall is settling in and the leaves are starting to show their colors. 
The crisp air, football weekends, and streets lined with shades of red and gold make this one of my favorite times of the year. There&#8217;s something about the change of seasons that feels a lot like data engineering&#8212;constant shifts, small adjustments, and moments of beauty when everything comes together.</p><p>Much like the rhythm of fall, data engineering is about balance&#8212;keeping systems reliable while adapting to change. Whether it&#8217;s new tools, evolving patterns, or fresh ideas from the community, there&#8217;s always something to learn and something to improve.</p><p>This edition brings a mix of ideas and resources from across our community. Hope you find something here that sparks your curiosity as you sip on your favorite fall drink. &#127810;</p><p>Happy reading, and thanks for being here.</p><p>- Sukanya</p><div><hr></div><h3>&#128240; Data Pulse</h3><ul><li><p><strong>Stream Processing</strong>: <a href="https://github.com/apache/iggy">Open-source project Apache Iggy</a> is a persistent message streaming platform written in Rust, supporting QUIC, TCP (custom binary specification), and HTTP (regular REST API) transport protocols, capable of processing millions of messages per second at ultra-low latency.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JzDm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396b1bd0-ce9b-4899-96d7-f720e2a97a01_2048x1198.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JzDm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396b1bd0-ce9b-4899-96d7-f720e2a97a01_2048x1198.png 424w, 
https://substackcdn.com/image/fetch/$s_!JzDm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396b1bd0-ce9b-4899-96d7-f720e2a97a01_2048x1198.png 848w, https://substackcdn.com/image/fetch/$s_!JzDm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396b1bd0-ce9b-4899-96d7-f720e2a97a01_2048x1198.png 1272w, https://substackcdn.com/image/fetch/$s_!JzDm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396b1bd0-ce9b-4899-96d7-f720e2a97a01_2048x1198.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JzDm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396b1bd0-ce9b-4899-96d7-f720e2a97a01_2048x1198.png" width="1456" height="852" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/396b1bd0-ce9b-4899-96d7-f720e2a97a01_2048x1198.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:852,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;web_ui.png&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="web_ui.png" title="web_ui.png" srcset="https://substackcdn.com/image/fetch/$s_!JzDm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396b1bd0-ce9b-4899-96d7-f720e2a97a01_2048x1198.png 424w, 
https://substackcdn.com/image/fetch/$s_!JzDm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396b1bd0-ce9b-4899-96d7-f720e2a97a01_2048x1198.png 848w, https://substackcdn.com/image/fetch/$s_!JzDm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396b1bd0-ce9b-4899-96d7-f720e2a97a01_2048x1198.png 1272w, https://substackcdn.com/image/fetch/$s_!JzDm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396b1bd0-ce9b-4899-96d7-f720e2a97a01_2048x1198.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Apache Iggy web UI (<a href="https://github.com/apache/iggy">Source</a>)</figcaption></figure></div></li><li><p><strong>Trends Report</strong>: <a href="https://www.infoq.com/articles/ai-ml-data-engineering-trends-2025/">InfoQ released its annual AI, ML &amp; Data Engineering Trends Report</a>. Physical AI and multi-modal models will revolutionize data processing pipelines by enabling real-time analysis of diverse data types in unified workflows. MCP standardization means easier integration between different AI tools and data systems, while commoditized RAG enables rapid deployment of knowledge-based applications. The evolution toward agentic AI directly impacts Data Engineering by automating complex ETL processes, data quality monitoring, and infrastructure management.</p></li></ul><ul><li><p><strong>AI</strong>: <a href="https://www.anthropic.com/news/claude-sonnet-4-5">Introducing Claude Sonnet 4.5</a>! Anthropic released Claude Sonnet 4.5, achieving 77.2% on SWE-bench Verified. Key updates include checkpoints in Claude Code, a VS Code extension, and the new <a href="https://www.anthropic.com/engineering/building-agents-with-the-claude-agent-sdk">Claude Agent SDK</a>. For DEs, its reported 30+ hours of sustained focus on a task can support end-to-end pipeline development without losing context, while the Agent SDK allows building sophisticated data processing agents for ETL workflows, monitoring, and error recovery.</p></li><li><p><strong>AI</strong>: <a href="https://github.com/firebase/genkit">Open-source project Genkit</a> is a framework for building full-stack AI-powered applications, built and used in production by Google&#8217;s Firebase. It provides SDKs for multiple programming languages with varying levels of stability, including JavaScript/TypeScript, Go, and Python.</p></li></ul><pre><code><code>import { genkit } from 'genkit';
import { googleAI } from '@genkit-ai/google-genai';

// Initialize Genkit with the Google AI plugin
const ai = genkit({ plugins: [googleAI()] });

// Send a single prompt and read the generated text
const { text } = await ai.generate({
    model: googleAI.model('gemini-2.5-flash'),
    prompt: 'What is Data Engineer Things?'
});</code></code></pre><ul><li><p><strong>Conferences</strong>: Big Data LDN (London) 2025 took place from September 24&#8211;25 at Olympia London, bringing together thousands of data, analytics, and AI specialists to discuss building data-driven businesses. Watch out for the 2025 playlist on the <a href="https://www.youtube.com/@Bigdataldn/playlists">official YouTube channel</a>.</p></li></ul><div><hr></div><h3>&#128278; Featured Read</h3><h4><strong>Stop That Platform Hype for Good</strong></h4><p><em>Author: <a href="https://medium.com/@bernd.wessely">Bernd Wessely</a></em></p><p>Every vendor claims their platform is <em>the</em> answer to enterprise IT chaos&#8212;but are platforms really simplifying our lives, or just adding another layer of complexity? </p><p>This article cuts through the hype, explains why most platforms fail to deliver on their promise, and challenges the idea that &#8220;platform engineering&#8221; is the silver bullet.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lo5J!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767f9f4b-db54-46e1-8812-c1bb7e5f6e3b_700x467.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lo5J!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767f9f4b-db54-46e1-8812-c1bb7e5f6e3b_700x467.jpeg 424w, https://substackcdn.com/image/fetch/$s_!lo5J!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767f9f4b-db54-46e1-8812-c1bb7e5f6e3b_700x467.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!lo5J!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767f9f4b-db54-46e1-8812-c1bb7e5f6e3b_700x467.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!lo5J!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767f9f4b-db54-46e1-8812-c1bb7e5f6e3b_700x467.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lo5J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767f9f4b-db54-46e1-8812-c1bb7e5f6e3b_700x467.jpeg" width="700" height="467" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/767f9f4b-db54-46e1-8812-c1bb7e5f6e3b_700x467.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:467,&quot;width&quot;:700,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lo5J!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767f9f4b-db54-46e1-8812-c1bb7e5f6e3b_700x467.jpeg 424w, https://substackcdn.com/image/fetch/$s_!lo5J!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767f9f4b-db54-46e1-8812-c1bb7e5f6e3b_700x467.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!lo5J!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767f9f4b-db54-46e1-8812-c1bb7e5f6e3b_700x467.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!lo5J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F767f9f4b-db54-46e1-8812-c1bb7e5f6e3b_700x467.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The OS is an open toolbox &#8211; platforms are closed (<a 
href="https://blog.dataengineerthings.org/stop-that-platform-hype-for-good-536a4f68272e">Source</a>)</figcaption></figure></div><p>Highlights from the article:</p><ul><li><p><strong>Platform &#8800; Productivity:</strong> Platforms often become another layer of indirection instead of solving core problems.</p></li><li><p><strong>One-size-fits-none:</strong> What works for one company&#8217;s scale, culture, and team may be a terrible fit for another.</p></li><li><p><strong>Illusion of simplification:</strong> Instead of making life easier, platforms can hide complexity until it explodes later.</p></li><li><p><strong>Focus on fundamentals:</strong> Teams thrive when they improve automation, monitoring, and developer experience&#8212;not when they chase buzzwords.</p></li><li><p><strong>Reality check:</strong> Engineering isn&#8217;t about buying platforms; it&#8217;s about building reliable systems that solve <em>your</em> problems.</p></li></ul><p>&#128073;&#127996; Read the full article <strong><a href="https://blog.dataengineerthings.org/stop-that-platform-hype-for-good-536a4f68272e">HERE</a></strong>.</p><div><hr></div><h3><strong>&#128218; Articles of the Month</strong></h3><ul><li><p><a href="https://newsletter.posthog.com/p/avoid-these-ai-coding-mistakes">Avoid these AI coding mistakes</a>: PostHog engineers break down what actually works (and doesn&#8217;t) when coding with AI in production environments. Key insight: in large codebases (think 8,984 files), you need serious guardrails. AI excels at autocomplete, fixing tests, and research, but struggles with unfamiliar languages and writing quality tests from scratch. 
Success requires treating it like a skill you develop through experimentation, understanding its limitations, and staying hands-on because you&#8217;re ultimately responsible for what ships.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!m71B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F847b6caa-fc38-4287-9abd-867bd3c2f438_768x588.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!m71B!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F847b6caa-fc38-4287-9abd-867bd3c2f438_768x588.png 424w, https://substackcdn.com/image/fetch/$s_!m71B!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F847b6caa-fc38-4287-9abd-867bd3c2f438_768x588.png 848w, https://substackcdn.com/image/fetch/$s_!m71B!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F847b6caa-fc38-4287-9abd-867bd3c2f438_768x588.png 1272w, https://substackcdn.com/image/fetch/$s_!m71B!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F847b6caa-fc38-4287-9abd-867bd3c2f438_768x588.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!m71B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F847b6caa-fc38-4287-9abd-867bd3c2f438_768x588.png" width="393" height="300.890625" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/847b6caa-fc38-4287-9abd-867bd3c2f438_768x588.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:588,&quot;width&quot;:768,&quot;resizeWidth&quot;:393,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!m71B!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F847b6caa-fc38-4287-9abd-867bd3c2f438_768x588.png 424w, https://substackcdn.com/image/fetch/$s_!m71B!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F847b6caa-fc38-4287-9abd-867bd3c2f438_768x588.png 848w, https://substackcdn.com/image/fetch/$s_!m71B!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F847b6caa-fc38-4287-9abd-867bd3c2f438_768x588.png 1272w, https://substackcdn.com/image/fetch/$s_!m71B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F847b6caa-fc38-4287-9abd-867bd3c2f438_768x588.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Usefulness of AI coding advice (<a href="https://newsletter.posthog.com/p/avoid-these-ai-coding-mistakes">Source</a>)</figcaption></figure></div></li><li><p><a href="https://netflixtechblog.com/scaling-muse-how-netflix-powers-data-driven-creative-insights-at-trillion-row-scale-aa9ad326fd77">Scaling Muse: How Netflix Powers Data-Driven Creative Insights at Trillion-Row Scale</a>: Learn how Netflix scaled <em>Muse</em> to <strong>trillion-row datasets</strong>, combining Druid, Iceberg, and probabilistic (<a href="https://odino.org/my-favorite-data-structure-hyperloglog/">HyperLogLog</a>) sketches to balance speed with accuracy. A masterclass in high-performance data engineering.</p></li><li><p><a href="https://www.theengineeringmanager.com/managing-managers/going-direct/">Going direct</a> is all about empowering teams to communicate openly across departments and levels without following org chart hierarchies, enabling faster decision-making, increased collaboration, and reduced bottlenecks. 
For Data Engineers, direct communication is crucial for resolving data pipeline issues quickly during incidents, enabling cross-team data integration projects, and allowing autonomous tactical decisions while escalating only high-risk architectural changes - helping data teams move faster in today&#8217;s leaner organizations.</p></li><li><p><a href="https://yewjin.substack.com/p/note-to-my-younger-self?r=ewzt3&amp;utm_campaign=post&amp;utm_medium=web&amp;triedRedirect=true">Note To My Younger Self</a>: In this reflective piece, a Google engineering leader shares wisdom he wishes he had known earlier, from avoiding the chase for validation to embracing uncertainty. A valuable read for data engineers thinking long-term about their careers.</p></li></ul><p><em>(&#9997;&#65039; Interested in publishing articles on DET on Medium? Read submission guidelines <a href="https://medium.com/data-engineer-things/write-for-data-engineer-things-32dc9294c5db">here</a>.)</em></p><div><hr></div><h3><strong>&#127912; Community Spotlight</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q0Ui!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc6e9aa-cf33-4266-ba7a-d607b93185fd_1199x350.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q0Ui!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc6e9aa-cf33-4266-ba7a-d607b93185fd_1199x350.png 424w, https://substackcdn.com/image/fetch/$s_!q0Ui!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc6e9aa-cf33-4266-ba7a-d607b93185fd_1199x350.png 848w, 
https://substackcdn.com/image/fetch/$s_!q0Ui!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc6e9aa-cf33-4266-ba7a-d607b93185fd_1199x350.png 1272w, https://substackcdn.com/image/fetch/$s_!q0Ui!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc6e9aa-cf33-4266-ba7a-d607b93185fd_1199x350.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q0Ui!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc6e9aa-cf33-4266-ba7a-d607b93185fd_1199x350.png" width="1199" height="350" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1bc6e9aa-cf33-4266-ba7a-d607b93185fd_1199x350.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:350,&quot;width&quot;:1199,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:121317,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/173114685?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc6e9aa-cf33-4266-ba7a-d607b93185fd_1199x350.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q0Ui!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc6e9aa-cf33-4266-ba7a-d607b93185fd_1199x350.png 424w, 
https://substackcdn.com/image/fetch/$s_!q0Ui!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc6e9aa-cf33-4266-ba7a-d607b93185fd_1199x350.png 848w, https://substackcdn.com/image/fetch/$s_!q0Ui!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc6e9aa-cf33-4266-ba7a-d607b93185fd_1199x350.png 1272w, https://substackcdn.com/image/fetch/$s_!q0Ui!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bc6e9aa-cf33-4266-ba7a-d607b93185fd_1199x350.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>Please introduce yourself briefly to the DET community and share how your journey has shaped your unique perspective on data systems.</strong></p><blockquote><p>Hi everyone! I&#8217;m <a href="https://www.linkedin.com/in/anandaganesh/">Anandaganesh Balakrishnan</a>, a Data Engineering leader with over 18 years of experience designing and modernizing enterprise data infrastructure across the banking, trading, real estate investment, and utilities sectors. I&#8217;ve led major transformation projects at ING Bank, Susquehanna International Group (SIG), and Roc360, focusing on building data systems that are faster, more scalable, and efficient. Working across Asia, Europe, and North America has been an enriching journey that&#8217;s taught me to adapt to different cultures, collaborate with diverse teams, and see how great ideas can lead to large-scale innovation.</p></blockquote><p><strong>What was the most challenging &#8220;merge conflict&#8221; you had to resolve when transitioning from traditional systems to cloud-native architectures?</strong></p><blockquote><p>Architecting modern, scalable cloud-native data platforms across organizations meant redoing the data flow and the underlying technology, of course, but it also meant getting teams, processes, and everyone&#8217;s way of thinking on board with new tools. 
Getting past those disagreements taught me that modernization isn&#8217;t just about the tech itself; it&#8217;s about <strong>guiding changes in how people work</strong>, the steps they follow, and the tools they use.</p></blockquote><p><strong>What&#8217;s your framework for cutting through architectural complexity when modernizing enterprise data platforms, and what&#8217;s one common mistake you see organizations make during these transformations?</strong></p><blockquote><p>When modernizing enterprise data platforms, the key to managing complexity is <strong>keeping strategy (business/data) and execution in sync</strong>. It starts with having a clear vision of what the business wants to achieve, whether that&#8217;s faster insights, lower costs, or stronger compliance, and making sure every architectural choice supports those goals. On the ground, this means building modular, flexible systems that can adapt quickly today while scaling for tomorrow. A common pitfall I see is teams rushing into tool choices or cloud migrations without that alignment, which often leads to fragmented systems and unnecessary technical debt. Real transformation happens when strategy guides every architectural decision and day-to-day execution works toward that same vision.</p></blockquote><p><strong>As someone who mentors others and has led strategic programs, what&#8217;s the most important skill data professionals need to develop to transition from IC to enterprise architect, and how should they build credibility with C-level executives?</strong></p><blockquote><p>The most important skill for data professionals looking to grow into enterprise architects is <strong>strategic thinking</strong>: seeing how every technical decision connects to the bigger business picture. It is about understanding how data drives revenue, manages risk, and supports the company&#8217;s overall objectives.
To earn credibility with C-level leaders, speak the language of business outcomes rather than outputs: show how your architectural choices drive measurable value, cut costs, or create a competitive edge. When you <strong>align technology efforts with what really matters to the business</strong>, you help bridge the gap between engineering and leadership. That alignment builds trust, strengthens collaboration, and shows that technology is a true partner in driving the company&#8217;s success.</p></blockquote><p><strong>Your career has involved positions as a project leader, database developer, data engineer, technical editor, and more. With your broad perspective on the data job landscape, what are the key roles for data professionals in the future, and what emerging technologies should data professionals be positioning themselves for?</strong></p><blockquote><p>Looking ahead, the most impactful roles in data will center on <strong>data architecture, agentic data stacks, platform engineering, and AI-ready infrastructure</strong>. As organizations move toward intelligent, automated systems, data professionals will need to design sustainable data platforms, infrastructures, and architectures that enable self-optimizing, adaptive data pipelines, which I refer to as agentic data engineering.</p></blockquote><p><strong>On your LinkedIn profile, you mention your favorite books: Atomic Habits, The Black Swan, and Money - Master the Game. For data engineers, what are the most impactful lessons you&#8217;ve learned from these books? Feel free to share one lesson from each.</strong></p><blockquote><p>Atomic Habits instilled in me the discipline to stay focused and execute projects with consistency and efficiency. The Black Swan, my favorite among them, deepened my understanding of how rare, unpredictable events, like today&#8217;s AI disruption, can reshape entire industries, reinforcing <strong>the need to unlearn and relearn</strong> continuously.
Money - Master the Game taught me that sharing knowledge and empowering others not only creates collective progress but also leads to personal and professional growth.</p></blockquote><p><strong>What advice would you give to career starters who want to get into the world of data engineering? What should they focus on first?</strong></p><blockquote><p>If you&#8217;re beginning as a data engineer, <strong>get the basics down first before you mess around with fancy tools</strong>. Get good with Python, SQL, and distributed computing; these skills are essential. Also, learn how data helps businesses make decisions and generate revenue.</p></blockquote><div><hr></div><h3><strong>&#128161; DE Tip of the Month</strong></h3><p>Before adding new orchestration or scheduling tools, explore the capabilities of what you already use. For example, Databricks Jobs, Airflow, and Prefect all support retries, alerting, and conditional workflows out of the box, which is often enough for most pipelines.</p><p>If you outgrow these basics, you can consider more specialized workflow tools:</p><ul><li><p><strong><a href="https://github.com/dagster-io/dagster">Dagster</a></strong>: Strong focus on data-aware orchestration and observability</p></li><li><p><strong><a href="https://spacelift.io/blog/argo-workflows?">Argo Workflows</a></strong>: Kubernetes-native workflows for large-scale automation</p></li><li><p><strong><a href="https://atlan.com/mage-data-orchestration/">Mage</a></strong>: Lightweight and friendly alternative for quick pipeline building</p></li></ul><p>Begin with what&#8217;s built-in; you&#8217;ll be surprised how much complexity it saves.</p><div><hr></div><p>Let us know what you like the most in the newsletter.
See you next time!</p><div class="poll-embed" data-attrs="{&quot;id&quot;:386335}" data-component-name="PollToDOM"></div><p>Cheers,</p><p>Sukanya and Ananda</p><div><hr></div><h4>&#8505;&#65039; About Data Engineer Things</h4><p><a href="https://www.dataengineerthings.org/">Data Engineer Things</a> (DET) is a global community built by data engineers for data engineers. Subscribe to the <a href="https://dataengineerthings.substack.com/">newsletter</a> and follow us on <a href="https://www.linkedin.com/company/data-engineer-things/posts/?feedView=all">LinkedIn</a> to gain access to exclusive learning resources and networking opportunities, including articles, webinars, meetups, conferences, mentorship, and much more.</p>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter #23 (Sept 2025)]]></title><description><![CDATA[Open-Source Breakthroughs with Sail and MLflow, Hidden Infrastructure Behind ChatGPT, Streaming Data Into Iceberg, Data Quality Tips, and More.]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-23</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-23</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Tue, 09 Sep 2025 15:03:22 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!VxSI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f519fc1-1e76-4dfc-91fb-234dc81d9e72_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VxSI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f519fc1-1e76-4dfc-91fb-234dc81d9e72_1456x1048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VxSI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f519fc1-1e76-4dfc-91fb-234dc81d9e72_1456x1048.png 424w,
https://substackcdn.com/image/fetch/$s_!VxSI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f519fc1-1e76-4dfc-91fb-234dc81d9e72_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!VxSI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f519fc1-1e76-4dfc-91fb-234dc81d9e72_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!VxSI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f519fc1-1e76-4dfc-91fb-234dc81d9e72_1456x1048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VxSI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f519fc1-1e76-4dfc-91fb-234dc81d9e72_1456x1048.png" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5f519fc1-1e76-4dfc-91fb-234dc81d9e72_1456x1048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:155702,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/170741130?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f519fc1-1e76-4dfc-91fb-234dc81d9e72_1456x1048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!VxSI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f519fc1-1e76-4dfc-91fb-234dc81d9e72_1456x1048.png 424w, https://substackcdn.com/image/fetch/$s_!VxSI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f519fc1-1e76-4dfc-91fb-234dc81d9e72_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!VxSI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f519fc1-1e76-4dfc-91fb-234dc81d9e72_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!VxSI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f519fc1-1e76-4dfc-91fb-234dc81d9e72_1456x1048.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Highlights of the DET Newsletter September 2025 Edition.</figcaption></figure></div>
<p>Hey folks,</p><p>Good to connect with you all! Our newsletter intros have rotated between teammates in recent editions, and this time I get to do the writing. I&#8217;m excited to welcome you to this issue.</p><p>I&#8217;m writing from the Bay Area as summer winds down. Having lived in Arizona, Texas, and now California, I&#8217;ve experienced heat in all its forms, from dry to humid to the famous Bay Area microclimates. And much like climate, data engineering is all about navigating different conditions and making sure systems stay reliable.</p><p>That&#8217;s why the value of data engineering lies not just in moving and shaping data, but in enabling people to make better decisions and build better products. Without it, analytics and AI rest on shaky foundations. With it, organizations gain speed, resilience, and trust in the data they use every day.</p><p>This newsletter is written by and for the data engineering community. Each edition brings together perspectives, tools, and lessons that reflect the collective experience of practitioners in the field. We hope it sparks ideas and helps strengthen the connections across our growing community.</p><p>Happy reading, and thanks for being part of the community.</p><p>- Shubham</p><div><hr></div><h3>&#128240; Data Pulse</h3><ul><li><p><strong>Data Engineering</strong>: <a href="https://lakesail.com/blog/sail-0-3-2/">Sail released 0.3.2 with native Delta Lake support</a>.
Sail is an <a href="https://github.com/lakehq/sail">open-source</a> Rust-based drop-in replacement for Apache Spark. The 0.3.2 release marks a significant milestone with native Delta Lake integration, enabling direct read/write operations on existing lakehouse datasets across S3, Azure, GCS, and Cloudflare R2. Built entirely in Rust, Sail eliminates JVM overhead and garbage collection pauses, and the project reports 4x faster execution at just 6% of the cost compared to traditional Spark deployments. The Delta Lake support, built against low-level APIs for optimal performance, makes Sail worth a look!</p></li><li><p><strong>AI</strong>: <a href="https://github.com/mlflow/mlflow/releases/tag/v3.3.0">MLflow 3.3.0 is now available</a>! This release introduces several major features and improvements, especially for <a href="https://mlflow.org/docs/latest/genai/eval-monitor/">open-source AI observability and evaluation</a>, including <a href="https://mlflow.org/docs/latest/genai/tracing/integrations/listing/agno/">Agno Tracing integration</a>.
<a href="https://github.com/agno-agi/agno">Agno is an open-source framework</a> for building multi-agent systems with memory, knowledge, and reasoning, along with tracing capabilities.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!skav!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054197b7-48a0-47f3-9434-abaabe63bdab_2418x1518.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!skav!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054197b7-48a0-47f3-9434-abaabe63bdab_2418x1518.png 424w, https://substackcdn.com/image/fetch/$s_!skav!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054197b7-48a0-47f3-9434-abaabe63bdab_2418x1518.png 848w, https://substackcdn.com/image/fetch/$s_!skav!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054197b7-48a0-47f3-9434-abaabe63bdab_2418x1518.png 1272w, https://substackcdn.com/image/fetch/$s_!skav!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054197b7-48a0-47f3-9434-abaabe63bdab_2418x1518.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!skav!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054197b7-48a0-47f3-9434-abaabe63bdab_2418x1518.png" width="1456" height="914" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/054197b7-48a0-47f3-9434-abaabe63bdab_2418x1518.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:914,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Agno Tracing via autolog&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Agno Tracing via autolog" title="Agno Tracing via autolog" srcset="https://substackcdn.com/image/fetch/$s_!skav!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054197b7-48a0-47f3-9434-abaabe63bdab_2418x1518.png 424w, https://substackcdn.com/image/fetch/$s_!skav!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054197b7-48a0-47f3-9434-abaabe63bdab_2418x1518.png 848w, https://substackcdn.com/image/fetch/$s_!skav!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054197b7-48a0-47f3-9434-abaabe63bdab_2418x1518.png 1272w, https://substackcdn.com/image/fetch/$s_!skav!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054197b7-48a0-47f3-9434-abaabe63bdab_2418x1518.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Agent tracing in MLflow 3.3.0 with the open-source framework Agno (<a href="https://mlflow.org/docs/latest/genai/tracing/integrations/listing/agno/">Source</a>)</figcaption></figure></div>
<ul><li><p><strong>AI: </strong><a href="https://docs.unsloth.ai/basics/gpt-oss-how-to-run-and-fine-tune">Unsloth now supports training the new gpt-oss model from OpenAI</a>! Unsloth, the <a href="https://github.com/unslothai/unsloth">open-source</a> LLM fine-tuning framework written in Python with custom Triton kernels, now supports OpenAI's newly released <a href="https://openai.com/index/introducing-gpt-oss/">gpt-oss</a> models (20B and 120B parameters), enabling fine-tuning on just 14GB of VRAM for the 20B variant through custom MXFP4 quantization techniques.
The <a href="https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(20B)-Fine-tuning.ipynb">official Colab notebook</a> is a great way for Data Engineers to get started.</p></li><li><p><strong>Data Engineering: </strong><a href="https://aiven.io/blog/iceberg-topics-for-apache-kafka-zero-etl-zero-copy">Aiven rolls out Iceberg Topics for Apache Kafka: Zero ETL, Zero Copy</a>. Iceberg Topics turn any Kafka topic into an Apache Iceberg table with zero ETL and zero data copies, making streaming data instantly queryable with SQL. By removing connectors and avoiding duplication, this open-source approach cuts cost, simplifies pipelines, and gives teams both real-time and analytical views of the same data.</p></li><li><p><strong>AI:</strong> <a href="https://engineering.fb.com/2025/08/13/data-infrastructure/agentic-solution-for-warehouse-data-access/">Creating AI agent solutions for warehouse data access and security</a>. Meta is tackling growing data warehouse access complexity with a new agentic architecture that uses intelligent <em>data-user</em> and <em>data-owner</em> AI agents. 
These agents streamline access requests, enforce security, and guide users through data discovery, exploration, and permission workflows, while auditing and feedback mechanisms ensure guardrails remain in place.</p></li></ul><div><hr></div><h3>&#128467; DET Meetups in Seattle, Warsaw, NYC, and Bay Area </h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0jEB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d2c9b10-065e-478c-b8c0-a93461942923_936x528.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0jEB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d2c9b10-065e-478c-b8c0-a93461942923_936x528.png 424w, https://substackcdn.com/image/fetch/$s_!0jEB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d2c9b10-065e-478c-b8c0-a93461942923_936x528.png 848w, https://substackcdn.com/image/fetch/$s_!0jEB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d2c9b10-065e-478c-b8c0-a93461942923_936x528.png 1272w, https://substackcdn.com/image/fetch/$s_!0jEB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d2c9b10-065e-478c-b8c0-a93461942923_936x528.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0jEB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d2c9b10-065e-478c-b8c0-a93461942923_936x528.png" width="596" height="336.20512820512823" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d2c9b10-065e-478c-b8c0-a93461942923_936x528.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:528,&quot;width&quot;:936,&quot;resizeWidth&quot;:596,&quot;bytes&quot;:452418,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/170741130?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d2c9b10-065e-478c-b8c0-a93461942923_936x528.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!0jEB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d2c9b10-065e-478c-b8c0-a93461942923_936x528.png 424w, https://substackcdn.com/image/fetch/$s_!0jEB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d2c9b10-065e-478c-b8c0-a93461942923_936x528.png 848w, https://substackcdn.com/image/fetch/$s_!0jEB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d2c9b10-065e-478c-b8c0-a93461942923_936x528.png 1272w, https://substackcdn.com/image/fetch/$s_!0jEB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d2c9b10-065e-478c-b8c0-a93461942923_936x528.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>We have three meetup events coming up in September and one more in October:</p><ul><li><p>Seattle meetup at Databricks on Thu, Sept 18 (<strong><a href="https://bit.ly/47mtvFA">RSVP</a></strong>)</p></li><li><p>Warsaw meetup at Netflix on Thu, Sept 25 (<strong><a href="https://luma.com/nza6i4bf">RSVP</a></strong>)</p></li><li><p>NYC meetup at Capital One on Thu, Sept 25 (<strong><a href="https://luma.com/qllrsadk">RSVP</a></strong>)</p></li><li><p>Bay Area meetup at Altimate AI on Wed, Oct 1 (<strong><a href="https://www.meetup.com/data-engineer-things-bay-area-meetup/events/310939555/?isFirstPublish=true">RSVP</a></strong>)</p></li></ul><p><em>(&#127908; Interested in speaking at our meetups or online webinars?
Submit talk proposals <a href="http://meetup.dataengineerthings.org/cfp">here</a>.)</em></p><div><hr></div><h3><strong>&#127941; Get Apache Airflow Certification for Free</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1nGT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051cf538-2544-4836-8a06-6cf207998c21_1920x1008.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1nGT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051cf538-2544-4836-8a06-6cf207998c21_1920x1008.png 424w, https://substackcdn.com/image/fetch/$s_!1nGT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051cf538-2544-4836-8a06-6cf207998c21_1920x1008.png 848w, https://substackcdn.com/image/fetch/$s_!1nGT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051cf538-2544-4836-8a06-6cf207998c21_1920x1008.png 1272w, https://substackcdn.com/image/fetch/$s_!1nGT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051cf538-2544-4836-8a06-6cf207998c21_1920x1008.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1nGT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051cf538-2544-4836-8a06-6cf207998c21_1920x1008.png" width="482" height="252.91758241758242" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/051cf538-2544-4836-8a06-6cf207998c21_1920x1008.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:482,&quot;bytes&quot;:111671,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/170741130?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051cf538-2544-4836-8a06-6cf207998c21_1920x1008.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1nGT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051cf538-2544-4836-8a06-6cf207998c21_1920x1008.png 424w, https://substackcdn.com/image/fetch/$s_!1nGT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051cf538-2544-4836-8a06-6cf207998c21_1920x1008.png 848w, https://substackcdn.com/image/fetch/$s_!1nGT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051cf538-2544-4836-8a06-6cf207998c21_1920x1008.png 1272w, https://substackcdn.com/image/fetch/$s_!1nGT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051cf538-2544-4836-8a06-6cf207998c21_1920x1008.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>Join the Beyond Analytics virtual conference on Sept 16 for a free workshop on Apache Airflow 3 fundamentals. This workshop will help you prepare for the official Airflow certification exam and answer any questions you may have.
Plus, you will get a discount code for a free certification ($150 value).</p><p>&#128073;&#127996; Sign up for the workshop <strong><a href="https://www.astronomer.io/lp/beyond-analytics-de/?utm_campaign=event-beyond-analytics-9-25&amp;utm_medium=paidmedia&amp;utm_source=data-engineering-things">HERE</a></strong>.</p><p><em>(This message is sponsored by Astronomer.)</em></p><div><hr></div><h3>&#128278; Featured Read</h3><h4><strong>Ever Wonder What Actually Happens When You Hit &#8220;Send&#8221; on ChatGPT?</strong></h4><p><em>Author: <a href="https://brilliantprogrammer.medium.com">Deepanshu Tyagi</a></em></p><p>We all know ChatGPT feels fast and reliable, but what really happens in those few seconds after you press send? This article reveals the hidden data engineering that powers the experience and shows why the real magic lies in the infrastructure behind the model.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n_PJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fca0481-0d5a-4399-99d8-e0a9f7ac15c9_2194x1260.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n_PJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fca0481-0d5a-4399-99d8-e0a9f7ac15c9_2194x1260.png 424w, https://substackcdn.com/image/fetch/$s_!n_PJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fca0481-0d5a-4399-99d8-e0a9f7ac15c9_2194x1260.png 848w, https://substackcdn.com/image/fetch/$s_!n_PJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fca0481-0d5a-4399-99d8-e0a9f7ac15c9_2194x1260.png 1272w, 
https://substackcdn.com/image/fetch/$s_!n_PJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fca0481-0d5a-4399-99d8-e0a9f7ac15c9_2194x1260.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n_PJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fca0481-0d5a-4399-99d8-e0a9f7ac15c9_2194x1260.png" width="1456" height="836" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9fca0481-0d5a-4399-99d8-e0a9f7ac15c9_2194x1260.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:836,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:538698,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/170741130?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fca0481-0d5a-4399-99d8-e0a9f7ac15c9_2194x1260.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!n_PJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fca0481-0d5a-4399-99d8-e0a9f7ac15c9_2194x1260.png 424w, https://substackcdn.com/image/fetch/$s_!n_PJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fca0481-0d5a-4399-99d8-e0a9f7ac15c9_2194x1260.png 848w, 
https://substackcdn.com/image/fetch/$s_!n_PJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fca0481-0d5a-4399-99d8-e0a9f7ac15c9_2194x1260.png 1272w, https://substackcdn.com/image/fetch/$s_!n_PJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fca0481-0d5a-4399-99d8-e0a9f7ac15c9_2194x1260.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a 
href="https://current.confluent.io/post-conference-videos-2025/building-stream-processing-platform-at-openai-lnd25">OpenAI</a></figcaption></figure></div><p><strong>Scale at millions:</strong> ChatGPT manages millions of simultaneous conversations with low latency and high reliability.</p><p><strong>Streaming backbone:</strong> A customized <strong>PyFlink + Kafka</strong> setup processes events in near real time, orchestrated across regions with Kubernetes.</p><p><strong>Kafka Forwarder:</strong> Middleware that hides Kafka&#8217;s complexity and allows engineers to focus on AI instead of distributed systems.</p><p><strong>Zero-downtime upgrades:</strong> Multi-cluster architecture and smart traffic shifting let OpenAI swap infrastructure while keeping the system live.</p><p><strong>Event-driven AI:</strong> Each interaction becomes part of a <strong>data flywheel</strong> that continuously improves future responses.</p><p>Want to dive deeper? The article references these talks:</p><ul><li><p><a href="https://current.confluent.io/post-conference-videos-2025/building-stream-processing-platform-at-openai-lnd25">Building Stream Processing Platform at OpenAI</a></p></li><li><p><a href="https://current.confluent.io/post-conference-videos-2025/taming-the-kafka-chaos-how-openai-simplifies-kafka-consumption-lnd25">Taming the Kafka Chaos: How OpenAI Simplifies Kafka Consumption</a></p></li><li><p><a href="https://current.confluent.io/post-conference-videos-2025/changing-engines-mid-flight-kafka-migrations-at-openai-lnd25">Changing engines mid-flight: Kafka migrations at OpenAI</a></p></li></ul><p>&#128073;&#127996; Read the full article <strong><a href="https://blog.dataengineerthings.org/ever-wonder-what-actually-happens-when-you-hit-send-on-chatgpt-3e13176d4b05#0e0e-84902f593cdf">HERE</a></strong>.</p><div><hr></div><h3><strong>&#128218; Articles of the Month</strong></h3><ul><li><p><a 
href="https://blog.dataengineerthings.org/the-equality-delete-problem-in-apache-iceberg-143dd451a974">The Equality Delete Problem in Apache Iceberg</a>: Equality deletes in Apache Iceberg make streaming CDC ingestion tricky, slowing queries and limiting compatibility. This article explains the problem and shows how RisingWave tackles it with smarter delete strategies.</p></li><li><p><a href="https://rmoff.net/2025/08/18/kafka-to-iceberg-exploring-the-options">Kafka to Iceberg - Exploring the Options</a>: Thinking of streaming Kafka data into Apache Iceberg? This article breaks down three practical approaches - Flink SQL, Kafka Connect and Confluent Tableflow.</p></li><li><p><a href="https://blog.dataengineerthings.org/how-do-iceberg-delta-lake-and-hudi-ensure-atomicity-52bd3faf97d0">How do Iceberg, Delta Lake, and Hudi ensure atomicity?</a> Consistency in data lakes depends on atomicity. See how Iceberg, Delta Lake, and Hudi safeguard reliability by guaranteeing all-or-nothing writes.</p></li><li><p><a href="https://newsletter.pragmaticengineer.com/p/how-to-get-unstuck-during-coding-interviews">How experienced engineers get unstuck in coding interviews</a>: With many companies sticking to algorithmic interviews despite AI tools, this post reveals a systematic approach to getting unstuck during whiteboard coding. Data Engineers will recognize the parallels between boundary thinking for algorithm optimization and the same mental models needed for query optimization and distributed system design.</p></li><li><p><a href="https://netflixtechblog.com/from-facts-metrics-to-media-machine-learning-evolving-the-data-engineering-function-at-netflix-6dcc91058d8d">From Facts &amp; Metrics to Media Machine Learning: Evolving the Data Engineering Function at Netflix</a>: This article explores how Netflix is expanding its data engineering capabilities with a new Media ML Data Engineering specialization. 
It highlights the role of their Media Data Lake in powering machine learning workflows across video, audio, image, and text assets.</p></li><li><p><a href="https://hbr.org/2025/08/soft-skills-matter-now-more-than-ever-according-to-new-research">Soft Skills Matter Now More Than Ever &#8211; Harvard Business Review:</a> In today&#8217;s AI-driven workplace, technical expertise alone isn&#8217;t enough. This article highlights research showing that collaboration, adaptability, and critical thinking are now the skills that set professionals apart and future-proof careers.</p></li></ul><p><em>(&#9997;&#65039; Interested in publishing articles on DET on Medium? Read submission guidelines <a href="https://medium.com/data-engineer-things/write-for-data-engineer-things-32dc9294c5db">here</a>.)</em></p><div><hr></div><h3>&#128161; DE Tip of the Month</h3><p>Getting started with data quality checks doesn't have to mean adding new frameworks to your data infrastructure. You can accomplish a lot with the checks built into open-source tools you already run, such as Airflow.</p><p>However, if you need more advanced capabilities, here are three popular dedicated data quality projects to consider:</p><ul><li><p><a href="https://github.com/sodadata/soda-core">Soda Core</a>: SQL-based data testing with YAML configuration</p></li><li><p><a href="https://greatexpectations.io/">Great Expectations</a>: Python-based data validation with extensive documentation</p></li><li><p><a href="https://github.com/awslabs/deequ">Deequ</a>: Scala library built on Apache Spark for large-scale data validation</p></li></ul><p><strong>Start simple with your existing tools before adding complexity.</strong></p><div><hr></div><p>Let us know what you like most about the newsletter. 
See you next time!</p><div class="poll-embed" data-attrs="{&quot;id&quot;:365574}" data-component-name="PollToDOM"></div><p>Cheers,</p><p><a href="https://www.linkedin.com/in/shubhamgondane/">Shubham</a> and <a href="https://www.linkedin.com/in/vjanz/">Volker</a></p><div><hr></div><h4>&#8505;&#65039; About Data Engineer Things</h4><p><a href="https://www.dataengineerthings.org/">Data Engineer Things</a> (DET) is a global community built by data engineers for data engineers. Subscribe to the <a href="https://dataengineerthings.substack.com/">newsletter</a> and follow us on <a href="https://www.linkedin.com/company/data-engineer-things/posts/?feedView=all">LinkedIn</a> to gain access to exclusive learning resources and networking opportunities, including articles, webinars, meetups, conferences, mentorship, and much more.</p>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter #22 (Aug 2025)]]></title><description><![CDATA[Explore the latest in data engineering: Agentic Data Engineering, Data Modeling for Data Products, and What Makes a Great Data Engineer in August's DET newsletter.]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-22</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-22</guid><dc:creator><![CDATA[Anandaganesh Balakrishnan]]></dc:creator><pubDate>Tue, 12 Aug 2025 15:03:15 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!3RHB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0820c1b-6d16-4315-a76c-f1ad1e35d8e0_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3RHB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0820c1b-6d16-4315-a76c-f1ad1e35d8e0_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3RHB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0820c1b-6d16-4315-a76c-f1ad1e35d8e0_1456x1048.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!3RHB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0820c1b-6d16-4315-a76c-f1ad1e35d8e0_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!3RHB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0820c1b-6d16-4315-a76c-f1ad1e35d8e0_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!3RHB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0820c1b-6d16-4315-a76c-f1ad1e35d8e0_1456x1048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3RHB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0820c1b-6d16-4315-a76c-f1ad1e35d8e0_1456x1048.jpeg" width="727" height="523.2802197802198" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b0820c1b-6d16-4315-a76c-f1ad1e35d8e0_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:727,&quot;bytes&quot;:158860,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/167326098?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0820c1b-6d16-4315-a76c-f1ad1e35d8e0_1456x1048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!3RHB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0820c1b-6d16-4315-a76c-f1ad1e35d8e0_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!3RHB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0820c1b-6d16-4315-a76c-f1ad1e35d8e0_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!3RHB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0820c1b-6d16-4315-a76c-f1ad1e35d8e0_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!3RHB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0820c1b-6d16-4315-a76c-f1ad1e35d8e0_1456x1048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Highlights of the DET Newsletter August 2025 Edition</figcaption></figure></div><p>Hey All!</p><p>Great to connect with you through this newsletter, made possible by the incredible team working diligently behind the scenes. This month, I have taken the lead in curating content that I hope will help you on your data engineering journey.</p><p>I am from Philadelphia, home of the Liberty Bell and the boldest ideas. I worked in India and the Netherlands before moving to the US 12 years ago for my Master&#8217;s. I have worked primarily in fintech and currently work in utilities. In my free time, I read, play tennis, travel, and watch movies.</p><p>Data engineering is vast, and hundreds of tools and features are added each quarter. The fundamentals remain the same, even as innovations push the boundaries of how efficiently businesses use quality data. Knowing <a href="https://www.oreilly.com/library/view/data-engineering-design/9781098165826/">data engineering design patterns</a> will help data engineers at every level build high-quality data systems.</p><p>In this newsletter, I have included content on where data engineering is heading, caveats in implementing changes, and modeling data products.</p><p>Read, gain insights, and make an impact!</p><p>- Ananda</p><div><hr></div><h3>&#128240; Data Pulse</h3><ul><li><p><strong>AI</strong>: <a href="https://ollama.com/blog/new-app">Ollama released their new app for macOS and Windows</a>. 
The new app offers a user-friendly UI to download and chat with AI models, with support for file processing, multimodal capabilities, and adjustable context length.</p></li><li><p><strong>AI: </strong><a href="https://www.ascend.io/blog/introducing-agentic-data-engineering-the-first-ai-native-data-stack">Agentic Data Engineering: Ascend.io</a> &#8211; Agentic data engineering uses AI agents with context on metadata, logic, dependencies, and runtime behavior to build, manage, and optimize data pipelines efficiently and at scale. Ascend&#8217;s automation engine uses rich metadata to orchestrate pipelines by handling dependencies, initiating tasks, and enabling custom event-driven workflows.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4EPL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4879d572-811a-4754-b163-2fd379848124_1874x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4EPL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4879d572-811a-4754-b163-2fd379848124_1874x1024.png 424w, https://substackcdn.com/image/fetch/$s_!4EPL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4879d572-811a-4754-b163-2fd379848124_1874x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!4EPL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4879d572-811a-4754-b163-2fd379848124_1874x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!4EPL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4879d572-811a-4754-b163-2fd379848124_1874x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4EPL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4879d572-811a-4754-b163-2fd379848124_1874x1024.png" width="695" height="379.9587912087912" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4879d572-811a-4754-b163-2fd379848124_1874x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:796,&quot;width&quot;:1456,&quot;resizeWidth&quot;:695,&quot;bytes&quot;:852761,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/167326098?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4879d572-811a-4754-b163-2fd379848124_1874x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4EPL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4879d572-811a-4754-b163-2fd379848124_1874x1024.png 424w, 
https://substackcdn.com/image/fetch/$s_!4EPL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4879d572-811a-4754-b163-2fd379848124_1874x1024.png 848w, https://substackcdn.com/image/fetch/$s_!4EPL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4879d572-811a-4754-b163-2fd379848124_1874x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!4EPL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4879d572-811a-4754-b163-2fd379848124_1874x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">An overview of the Ascend platform (<a href="https://www.ascend.io/product">Source</a>)</figcaption></figure></div></li><li><p><strong>AI: </strong><a href="https://www.starburst.io/blog/lakeside-ai/">Lakeside AI</a> enables organizations to deliver AI-ready data without needing a full migration to a lakehouse architecture. Through federated access, it lets teams explore data in existing systems and bring only the most relevant datasets to the lakehouse when needed.</p></li><li><p><strong>Data Analytics</strong>: <a href="https://sqlrooms.org/">SQLRooms</a> is a React-based open-source framework for building data-centric applications on DuckDB. It enables quick, local analytics by querying file formats such as Parquet, Avro, and CSV directly through SQL. It is well suited to dashboards, data exploration, and prototyping, and offers a high-performance alternative to a database-backed app.</p></li></ul><div><hr></div><h3>&#128467; DET NYC Meetup on Sept 25</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0q2U!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580f1e3e-069f-402e-8c7d-5f34c08cd5d2_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0q2U!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580f1e3e-069f-402e-8c7d-5f34c08cd5d2_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!0q2U!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580f1e3e-069f-402e-8c7d-5f34c08cd5d2_1280x720.png 848w, 
https://substackcdn.com/image/fetch/$s_!0q2U!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580f1e3e-069f-402e-8c7d-5f34c08cd5d2_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!0q2U!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580f1e3e-069f-402e-8c7d-5f34c08cd5d2_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0q2U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580f1e3e-069f-402e-8c7d-5f34c08cd5d2_1280x720.png" width="535" height="300.9375" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/580f1e3e-069f-402e-8c7d-5f34c08cd5d2_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:535,&quot;bytes&quot;:100755,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/167326098?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580f1e3e-069f-402e-8c7d-5f34c08cd5d2_1280x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0q2U!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580f1e3e-069f-402e-8c7d-5f34c08cd5d2_1280x720.png 424w, 
https://substackcdn.com/image/fetch/$s_!0q2U!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580f1e3e-069f-402e-8c7d-5f34c08cd5d2_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!0q2U!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580f1e3e-069f-402e-8c7d-5f34c08cd5d2_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!0q2U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F580f1e3e-069f-402e-8c7d-5f34c08cd5d2_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Attention, New Yorkers! You asked for it, and it's finally here: we are officially launching the DET NYC Meetup! Join the first NYC meetup on Sept 25th for an evening of learning, networking, and fun. A big thank you to Capital One for providing the venue and refreshments.</p><ul><li><p><strong>When</strong>: 6:00 PM to 8:00 PM on Thursday, Sept 25th</p></li><li><p><strong>Where</strong>: Capital One's NYC office</p></li><li><p><strong>&#128073;&#127996; <a href="https://lu.ma/qllrsadk">RSVP</a></strong></p></li></ul><p>To hear about future NYC events, join the <a href="http://meetup.dataengineerthings.org/nyc">meetup group</a> (click the &#8220;Subscribe&#8221; button at the top right).</p><p>We are also looking for volunteers to help organize NYC meetups. Contact <a href="https://www.linkedin.com/in/sanchitburkule/">Sanchit Burkule</a> if you want to be part of the team.</p><p><em>(&#127908; Interested in speaking at our meetups or online webinars? 
Submit talk proposals <a href="http://meetup.dataengineerthings.org/cfp">here</a>.)</em></p><div><hr></div><h3>&#127916; Webinar: Improving Airflow Data Pipeline Reliability</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LQgW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5beeaff9-7636-40db-8320-5abe67203685_1920x1243.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LQgW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5beeaff9-7636-40db-8320-5abe67203685_1920x1243.png 424w, https://substackcdn.com/image/fetch/$s_!LQgW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5beeaff9-7636-40db-8320-5abe67203685_1920x1243.png 848w, https://substackcdn.com/image/fetch/$s_!LQgW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5beeaff9-7636-40db-8320-5abe67203685_1920x1243.png 1272w, https://substackcdn.com/image/fetch/$s_!LQgW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5beeaff9-7636-40db-8320-5abe67203685_1920x1243.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LQgW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5beeaff9-7636-40db-8320-5abe67203685_1920x1243.png" width="530" height="343.2623626373626" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5beeaff9-7636-40db-8320-5abe67203685_1920x1243.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:943,&quot;width&quot;:1456,&quot;resizeWidth&quot;:530,&quot;bytes&quot;:721795,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/167326098?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5beeaff9-7636-40db-8320-5abe67203685_1920x1243.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LQgW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5beeaff9-7636-40db-8320-5abe67203685_1920x1243.png 424w, https://substackcdn.com/image/fetch/$s_!LQgW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5beeaff9-7636-40db-8320-5abe67203685_1920x1243.png 848w, https://substackcdn.com/image/fetch/$s_!LQgW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5beeaff9-7636-40db-8320-5abe67203685_1920x1243.png 1272w, https://substackcdn.com/image/fetch/$s_!LQgW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5beeaff9-7636-40db-8320-5abe67203685_1920x1243.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>At Astronomer, a team of just five data engineers manages 27,000 daily tasks, powering 18+ data products. 
In this webinar (11 am ET, Aug 21st), Maggie Stark, Staff Data Engineer at Astronomer, will share how her team reduced DAG failure rates by 81%:</p><ul><li><p>Using Airflow Asset scheduling to prevent upstream dependency issues</p></li><li><p>Orchestrating cross-DAG dependencies with a Control DAG</p></li><li><p>Implementing centralized observability to monitor SLAs and debug faster</p></li></ul><p>&#128073;&#127996; Sign up for the webinar <strong><a href="https://www.astronomer.io/events/webinars/how-to-increase-the-reliability-of-your-airflow-pipelines-video/?utm_source=data-engineering-things&amp;utm_medium=paidmedia&amp;utm_campaign=webinar-pipeline-reliability-8-25">HERE</a></strong>.</p><p><em>(This message is sponsored by Astronomer.)</em></p><div><hr></div><h3>&#128278; Featured Read</h3><h4>Data Modeling for Data Products: A Practical Guide</h4><p><em>Author: <a href="https://mahdiqb.medium.com/">Mahdi Karabiben</a></em></p><p><strong>Rethinking Data Modeling for the Age of Data Products</strong><br>As data teams embrace product thinking and business-driven use cases, this article explores how to shift from rigid, monolithic models to adaptive, use-case-focused designs that evolve with the business. 
A data product is a curated, reliable dataset designed to serve a specific business purpose or use case.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!58jR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac190796-8d96-4d1f-9976-30c5a5dc13ba_700x235.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!58jR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac190796-8d96-4d1f-9976-30c5a5dc13ba_700x235.png 424w, https://substackcdn.com/image/fetch/$s_!58jR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac190796-8d96-4d1f-9976-30c5a5dc13ba_700x235.png 848w, https://substackcdn.com/image/fetch/$s_!58jR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac190796-8d96-4d1f-9976-30c5a5dc13ba_700x235.png 1272w, https://substackcdn.com/image/fetch/$s_!58jR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac190796-8d96-4d1f-9976-30c5a5dc13ba_700x235.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!58jR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac190796-8d96-4d1f-9976-30c5a5dc13ba_700x235.png" width="700" height="235" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ac190796-8d96-4d1f-9976-30c5a5dc13ba_700x235.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:235,&quot;width&quot;:700,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!58jR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac190796-8d96-4d1f-9976-30c5a5dc13ba_700x235.png 424w, https://substackcdn.com/image/fetch/$s_!58jR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac190796-8d96-4d1f-9976-30c5a5dc13ba_700x235.png 848w, https://substackcdn.com/image/fetch/$s_!58jR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac190796-8d96-4d1f-9976-30c5a5dc13ba_700x235.png 1272w, https://substackcdn.com/image/fetch/$s_!58jR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac190796-8d96-4d1f-9976-30c5a5dc13ba_700x235.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Conceptual data model <a href="https://blog.dataengineerthings.org/data-modeling-for-data-products-a-practical-guide-2db003cc7e72">(Source)</a></figcaption></figure></div><p><strong>Approach to modeling data products:</strong></p><ul><li><p>Understand the <strong>&#8220;why&#8221;</strong> (core business needs and key business metrics)</p></li><li><p>Build the 
<strong>conceptual model</strong> (business domains and how they relate to each other) and get stakeholder buy-in.</p></li><li><p>Define the <strong>logical model</strong> at the domain level.</p></li><li><p>Adopt <strong>distributed ownership</strong> (a universal conceptual model with domain-specific modeling autonomy).</p></li><li><p>Evolve data products and models continuously through <strong>incremental development</strong>.</p></li><li><p>Build a <strong>metric tree</strong> to connect business goals to underlying data components.</p></li><li><p>Create a <strong>semantic layer</strong> that maps business metrics to data, ensuring consistency, governance, and simplified access for users.</p></li><li><p>Use <strong><a href="https://preset.io/blog/introducing-entity-centric-data-modeling-for-analytics/">entity-centric modeling</a></strong> with enriched, denormalized tables to simplify queries and boost performance.</p></li></ul><p>&#128073;&#127996; Read the full article <strong><a href="https://blog.dataengineerthings.org/data-modeling-for-data-products-a-practical-guide-2db003cc7e72">HERE</a></strong>.</p><div><hr></div><h3><strong>&#128218; Articles of the Month</strong></h3><ul><li><p><a href="https://blog.dataengineerthings.org/how-does-doordash-evolve-realtime-processing-platform-with-iceberg-15486712cfbc">DoorDash's real-time processing platform with Iceberg:</a> Learn how DoorDash moved its real-time processing platform (30 million messages per second) from its previous architecture (Kafka &#8594; Flink &#8594; S3 &#8594; Snowpipe &#8594; Snowflake) to one built on Iceberg (Kafka &#8594; Flink &#8594; Iceberg + S3). The change saved 25-49% in storage costs, enabled concurrent writes and hidden partitioning, and improved operational flexibility through query engines like Trino.</p></li><li><p><a href="https://blog.bytebytego.com/p/a-guide-to-database-sharding-key">Database Sharding: Key Strategies:</a> Sharding is a horizontal scaling technique that 
partitions a single database's data across multiple machines, spreading the storage and query load among them. This article explores the core concepts of database sharding, including its importance, how it functions, and the trade-offs involved.</p></li><li><p><a href="https://airbnb.tech/data/mussel-airbnbs-key-value-store-for-derived-data/">Mussel - Airbnb&#8217;s Key-Value Store for Derived Data:</a> An excellent article on how Airbnb uses Mussel to store derived data efficiently, achieving 99.9% availability, average read QPS (queries per second) &gt; 800k, write QPS &gt; 35k, and average P95 read latency under 8 ms.</p></li><li><p><a href="https://seattledataguy.substack.com/p/speed-without-understanding-one-of">Speed Without Understanding - One of the Biggest Risks in Data Engineering:</a> Rushing to implement changes without fully understanding their effects is a major risk in data engineering. Even small changes require thorough investigation and communication across teams to surface unintended consequences.</p></li></ul><p><em>(&#9997;&#65039; Interested in publishing articles on DET on Medium? Read submission guidelines <a href="https://medium.com/data-engineer-things/write-for-data-engineer-things-32dc9294c5db">here</a>.)</em></p><div><hr></div><h3>&#10024; Share Your DET Story</h3><p>We&#8217;re gathering testimonials from community members (like you!) to showcase the impact of the DET community and inspire others to join us. If the community has helped you learn, grow, or connect in any way, we&#8217;d love to hear your story! 
Fill out <a href="https://forms.gle/qE6wF7fs4KFL1kQr8">this</a> short form to share your DET story.</p><div><hr></div><h3>&#127912; Community Spotlight</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PbhM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20a58fbb-9345-4d20-a12f-84aab4f0ec23_1199x350.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PbhM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20a58fbb-9345-4d20-a12f-84aab4f0ec23_1199x350.png 424w, https://substackcdn.com/image/fetch/$s_!PbhM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20a58fbb-9345-4d20-a12f-84aab4f0ec23_1199x350.png 848w, https://substackcdn.com/image/fetch/$s_!PbhM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20a58fbb-9345-4d20-a12f-84aab4f0ec23_1199x350.png 1272w, https://substackcdn.com/image/fetch/$s_!PbhM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20a58fbb-9345-4d20-a12f-84aab4f0ec23_1199x350.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PbhM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20a58fbb-9345-4d20-a12f-84aab4f0ec23_1199x350.png" width="1199" height="350" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/20a58fbb-9345-4d20-a12f-84aab4f0ec23_1199x350.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:350,&quot;width&quot;:1199,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:138118,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/167326098?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20a58fbb-9345-4d20-a12f-84aab4f0ec23_1199x350.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PbhM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20a58fbb-9345-4d20-a12f-84aab4f0ec23_1199x350.png 424w, https://substackcdn.com/image/fetch/$s_!PbhM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20a58fbb-9345-4d20-a12f-84aab4f0ec23_1199x350.png 848w, https://substackcdn.com/image/fetch/$s_!PbhM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20a58fbb-9345-4d20-a12f-84aab4f0ec23_1199x350.png 1272w, https://substackcdn.com/image/fetch/$s_!PbhM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20a58fbb-9345-4d20-a12f-84aab4f0ec23_1199x350.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>Please introduce yourself briefly to the Data Engineer Things community!</strong></p><blockquote><p>Hi! I&#8217;m <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Shachar Meir&quot;,&quot;id&quot;:171302498,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/833aede5-203e-44ee-8617-421a44a41efd_1399x1399.jpeg&quot;,&quot;uuid&quot;:&quot;c484c1ec-1524-4451-aed7-118a82a8bb38&quot;}" data-component-name="MentionToDOM"></span>, a data executive with over 20 years of experience. Today I&#8217;m an independent advisor. 
I help companies succeed with their data, and help individuals grow their careers.</p><p>Throughout my career I&#8217;ve built data and analytics teams in various companies, ranging from scale-ups to bigger companies such as PayPal and Meta.</p></blockquote><p><strong>If your career were a Git repository, what would be the commit message for where you are right now?</strong></p><blockquote><p><em>Shachar was starting to get comfortable so we had to change something. This change may or may not work, we&#8217;ll find out...</em></p><p>I left what most people would consider the dream job (DE Director at Meta), with a salary I could never have dreamt of. I left A LOT of money on the table for something completely unknown. It was scary, nerve-wracking, and most people would say it was a crazy thing to do or even a mistake.</p><p>Today, two years later, I&#8217;m not looking back. Best decision of my life.</p></blockquote><p><strong>In one of your recent posts, you discuss the "career danger zone" for tech managers who lose technical skills while managing small teams. How can data professionals avoid falling into this trap?</strong></p><blockquote><p>In short &#8211; the &#8220;career danger zone&#8221; is the situation where a manager leads a small team and is not hands-on. Managers basically make themselves irrelevant this way, and when they look for another job they fail the technical interviews and realise they are not recruitable anymore.</p><p>The solution is simple:</p><ol><li><p>Understand and recognise the danger zone</p></li><li><p>Unless you are a Director leading a 50+ person organisation &#8211; STAY HANDS-ON. Don&#8217;t lose your technical skills. You will become irrelevant.</p></li></ol></blockquote><p><strong>What's the single best piece of career advice you've received that data engineers and aspiring data leaders should hear today?</strong></p><blockquote><p>Nobody cares what you do, and nobody cares how you do it. 
What they care about is how it&#8217;s impacting the business. If you understand the business and use your skills to drive business outcomes &#8211; you will win this game. Anything else is just a waste of everybody&#8217;s time.</p></blockquote><p><strong>Based on your experience across multiple industries, what skills do you believe will be most valuable for data professionals in the next 3-5 years?</strong></p><blockquote><ol><li><p><strong>Curiosity</strong> &#8211; asking questions and being in the details</p></li><li><p><strong>Changeability</strong> &#8211; ability to take feedback, adapt, and change</p></li><li><p><strong>Clarity</strong> &#8211; ability to understand and explain complex topics, and ask the right questions</p></li><li><p><strong>Collaboration</strong> &#8211; ability to build relationships, influence, build communities, and get things done as a group</p></li></ol><p>It&#8217;s pretty much guaranteed that through the next 3-5 years:</p><ul><li><p>Our roles will change</p></li><li><p>Technologies will change</p></li><li><p>Our workplace and work culture will change</p></li></ul><p>I strongly believe that people with these 4 skills will survive anything that comes their way.</p></blockquote><p><strong>What's one message you'd like to share with the Data Engineer Things community about creating impact through data leadership?</strong></p><blockquote><p>Data is the glue that connects every function in the organisation. 
Understand the business, learn how to lead with data, and your growth will be unlimited.</p></blockquote><h4>&#10145;&#65039; Learn More</h4><p>If you want to learn more about building a successful career in Data Engineering, have a look at the following fireside chat with <a href="https://www.linkedin.com/in/shacharmeir/">Shachar</a> and <a href="https://www.linkedin.com/in/xinranwaibel/">Xinran</a>.</p><div id="youtube2-dYYAPStjyV4" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;dYYAPStjyV4&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/dYYAPStjyV4?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div><hr></div><h3>&#128161; DE Tip of the Month</h3><p><a href="https://airbyte.com/data-engineering-resources/data-federation">Data federation</a> is an effective way to query multiple data sources through a single interface. Federation doesn&#8217;t move data; it accesses it virtually, in place at the source. This is useful for quick prototyping or for linking cloud and on-premises systems, and it works well when you want to limit data movement, present a single unified view, and get up-to-date information. However, it is a poor fit for high-volume workloads or complex transformations.</p><div><hr></div><p>Let us know what you like most in the newsletter. 
See you next time!</p><div class="poll-embed" data-attrs="{&quot;id&quot;:354192}" data-component-name="PollToDOM"></div><p>Cheers,</p><p><a href="https://www.linkedin.com/in/anandaganesh/">Ananda</a>, <a href="https://www.linkedin.com/in/sukanyawadawadagi/">Sukanya</a>, and <a href="https://www.linkedin.com/in/xinranwaibel/">Xinran</a></p><div><hr></div><h4>&#8505;&#65039; About Data Engineer Things</h4><p><a href="https://www.dataengineerthings.org/">Data Engineer Things</a> (DET) is a global community built by data engineers for data engineers. Subscribe to the <a href="https://dataengineerthings.substack.com/">newsletter</a> and follow us on <a href="https://www.linkedin.com/in/xinranwaibel/">LinkedIn</a> to gain access to exclusive learning resources and networking opportunities, including articles, webinars, meetups, conferences, mentorship, and much more.</p>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter #21 (July 2025)]]></title><description><![CDATA[Explore the latest in data engineering: Git-for-Data, DAIS 2025 highlights, AI tools, Seattle meetup, and community tips&#8212;all in July&#8217;s DET newsletter.]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-21</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-21</guid><dc:creator><![CDATA[Volker Janz]]></dc:creator><pubDate>Tue, 15 Jul 2025 15:01:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Ceqs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb48d86e7-205c-4484-8dfa-4d573ea4cf54_1456x1048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ceqs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb48d86e7-205c-4484-8dfa-4d573ea4cf54_1456x1048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ceqs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb48d86e7-205c-4484-8dfa-4d573ea4cf54_1456x1048.png 424w, 
https://substackcdn.com/image/fetch/$s_!Ceqs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb48d86e7-205c-4484-8dfa-4d573ea4cf54_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!Ceqs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb48d86e7-205c-4484-8dfa-4d573ea4cf54_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!Ceqs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb48d86e7-205c-4484-8dfa-4d573ea4cf54_1456x1048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ceqs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb48d86e7-205c-4484-8dfa-4d573ea4cf54_1456x1048.png" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b48d86e7-205c-4484-8dfa-4d573ea4cf54_1456x1048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:202662,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/167325489?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb48d86e7-205c-4484-8dfa-4d573ea4cf54_1456x1048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Ceqs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb48d86e7-205c-4484-8dfa-4d573ea4cf54_1456x1048.png 424w, https://substackcdn.com/image/fetch/$s_!Ceqs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb48d86e7-205c-4484-8dfa-4d573ea4cf54_1456x1048.png 848w, https://substackcdn.com/image/fetch/$s_!Ceqs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb48d86e7-205c-4484-8dfa-4d573ea4cf54_1456x1048.png 1272w, https://substackcdn.com/image/fetch/$s_!Ceqs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb48d86e7-205c-4484-8dfa-4d573ea4cf54_1456x1048.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Highlights of the DET Newsletter July 2025 Edition</figcaption></figure></div><p>Hey there,</p><p>Nice to meet you! You might wonder why Xinran isn't writing the intro this time - that's because we now have a motivated team behind the newsletter, and this month it's my pleasure to introduce you to it.</p><p>I'm writing to you from sunny California, where I've recently settled after a life-changing move from Germany to the US. Change has been the defining theme of my year - from navigating the relocation to watching my children experience their first American summer. As in data engineering, sometimes the most challenging transformations yield the most rewarding results!</p><p>Speaking of change, you might notice our newsletter looks different this month. We've reimagined the format to deliver even more value to you, our readers. You'll also find the theme of change in this month's featured read, which discusses a Git-for-Data approach to managing change in data systems.</p><p>On that note, let me open the newsletter with a quote from a respected fellow Data Engineer:</p><div class="pullquote"><p>Changing is scary, but so is staying the same.</p></div><p>Enjoy reading!</p><p>- Volker</p><div><hr></div><h3>&#128240; Data Pulse</h3><ul><li><p><strong>AI</strong>: Anthropic published a new resource: <a href="https://www.anthropic.com/partners/powered-by-claude">Powered by Claude</a>. 
A curated list of projects using Claude in production, a great source for AI inspiration.</p></li><li><p><strong>AI</strong>: Google released <a href="https://blog.google/technology/developers/introducing-gemini-cli-open-source-ai-agent/">Gemini CLI: your open-source AI agent</a>.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bl6H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff82b3aa8-d636-4636-928f-09613a59cb3f_800x474.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bl6H!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff82b3aa8-d636-4636-928f-09613a59cb3f_800x474.gif 424w, https://substackcdn.com/image/fetch/$s_!Bl6H!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff82b3aa8-d636-4636-928f-09613a59cb3f_800x474.gif 848w, https://substackcdn.com/image/fetch/$s_!Bl6H!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff82b3aa8-d636-4636-928f-09613a59cb3f_800x474.gif 1272w, https://substackcdn.com/image/fetch/$s_!Bl6H!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff82b3aa8-d636-4636-928f-09613a59cb3f_800x474.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Bl6H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff82b3aa8-d636-4636-928f-09613a59cb3f_800x474.gif" width="800" height="474" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f82b3aa8-d636-4636-928f-09613a59cb3f_800x474.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:474,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3413681,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/167325489?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff82b3aa8-d636-4636-928f-09613a59cb3f_800x474.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Bl6H!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff82b3aa8-d636-4636-928f-09613a59cb3f_800x474.gif 424w, https://substackcdn.com/image/fetch/$s_!Bl6H!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff82b3aa8-d636-4636-928f-09613a59cb3f_800x474.gif 848w, https://substackcdn.com/image/fetch/$s_!Bl6H!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff82b3aa8-d636-4636-928f-09613a59cb3f_800x474.gif 1272w, https://substackcdn.com/image/fetch/$s_!Bl6H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff82b3aa8-d636-4636-928f-09613a59cb3f_800x474.gif 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Gemini CLI</figcaption></figure></div><ul><li><p><strong>Data Modeling</strong>: <a href="https://www.getdbt.com/blog/dbt-fusion-engine-public-beta-bigquery">dbt Fusion engine public beta is now available on BigQuery</a>.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PAoJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb175ff-2c0b-4ece-a80e-ea2acf810135_1200x249.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!PAoJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb175ff-2c0b-4ece-a80e-ea2acf810135_1200x249.png 424w, https://substackcdn.com/image/fetch/$s_!PAoJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb175ff-2c0b-4ece-a80e-ea2acf810135_1200x249.png 848w, https://substackcdn.com/image/fetch/$s_!PAoJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb175ff-2c0b-4ece-a80e-ea2acf810135_1200x249.png 1272w, https://substackcdn.com/image/fetch/$s_!PAoJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb175ff-2c0b-4ece-a80e-ea2acf810135_1200x249.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PAoJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb175ff-2c0b-4ece-a80e-ea2acf810135_1200x249.png" width="1200" height="249" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fdb175ff-2c0b-4ece-a80e-ea2acf810135_1200x249.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:249,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16771,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/167325489?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb175ff-2c0b-4ece-a80e-ea2acf810135_1200x249.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PAoJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb175ff-2c0b-4ece-a80e-ea2acf810135_1200x249.png 424w, https://substackcdn.com/image/fetch/$s_!PAoJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb175ff-2c0b-4ece-a80e-ea2acf810135_1200x249.png 848w, https://substackcdn.com/image/fetch/$s_!PAoJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb175ff-2c0b-4ece-a80e-ea2acf810135_1200x249.png 1272w, https://substackcdn.com/image/fetch/$s_!PAoJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdb175ff-2c0b-4ece-a80e-ea2acf810135_1200x249.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><ul><li><p><strong>Event streaming</strong>: Going beyond micro-batch. The new <a href="https://www.databricks.com/dataaisummit/session/real-time-mode-technical-deep-dive-how-we-built-sub-300-millisecond">real-time mode in Apache Spark Structured Streaming</a> delivers p99 latencies under 300 milliseconds for both stateless and stateful stream processing.</p></li><li><p><strong>Multimodal data formats</strong>: The rise of AI has created an unprecedented need to handle multimodal data at scale. 
Check out up-and-coming open-source projects such as <a href="https://github.com/facebookincubator/nimble">Meta&#8217;s Nimble</a>, <a href="https://github.com/lancedb/lancedb">Lance</a>, and <a href="https://github.com/Eventual-Inc/Daft">Daft</a>.</p></li><li><p><strong><a href="https://www.databricks.com/blog/apache-icebergtm-v3-moving-ecosystem-towards-unification">Apache Iceberg v3</a></strong>: New features include deletion vectors, row lineage, semi-structured data, geospatial types, and interoperability with Delta Lake.</p></li><li><p><strong><a href="https://www.databricks.com/blog/announcing-full-apache-iceberg-support-databricks">Full Apache Iceberg support in Unity Catalog</a></strong>: Read and write Managed Iceberg tables and use Unity Catalog to access and govern Iceberg tables in external catalogs.</p></li><li><p><strong><a href="https://www.databricks.com/blog/bringing-declarative-pipelines-apache-spark-open-source-project">Declarative Pipelines</a></strong>: The declarative API for building robust data pipelines is now open-source and available for Apache Spark.</p></li><li><p><strong><a href="https://www.databricks.com/product/lakebase">Lakebase</a></strong>: Serverless, Postgres-compatible OLTP database for the lakehouse.</p></li><li><p><strong><a href="https://www.databricks.com/blog/introducing-databricks-free-edition">Free Edition</a></strong>: Free Edition offers the same suite of tools that was previously limited to paying customers, so everyone can experiment with and learn the latest data and AI technologies.</p></li></ul><div><hr></div><h3>&#128467; DET Seattle Meetup</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xIJN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7281c9e6-063f-441a-93bf-eee7e7db641d_2048x1152.jpeg" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xIJN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7281c9e6-063f-441a-93bf-eee7e7db641d_2048x1152.jpeg 424w, https://substackcdn.com/image/fetch/$s_!xIJN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7281c9e6-063f-441a-93bf-eee7e7db641d_2048x1152.jpeg 848w, https://substackcdn.com/image/fetch/$s_!xIJN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7281c9e6-063f-441a-93bf-eee7e7db641d_2048x1152.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!xIJN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7281c9e6-063f-441a-93bf-eee7e7db641d_2048x1152.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xIJN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7281c9e6-063f-441a-93bf-eee7e7db641d_2048x1152.jpeg" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7281c9e6-063f-441a-93bf-eee7e7db641d_2048x1152.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:226916,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/167325489?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7281c9e6-063f-441a-93bf-eee7e7db641d_2048x1152.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!xIJN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7281c9e6-063f-441a-93bf-eee7e7db641d_2048x1152.jpeg 424w, https://substackcdn.com/image/fetch/$s_!xIJN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7281c9e6-063f-441a-93bf-eee7e7db641d_2048x1152.jpeg 848w, https://substackcdn.com/image/fetch/$s_!xIJN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7281c9e6-063f-441a-93bf-eee7e7db641d_2048x1152.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!xIJN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7281c9e6-063f-441a-93bf-eee7e7db641d_2048x1152.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">DET Seattle Meetup on July 24th</figcaption></figure></div><p>Join the next Seattle Meetup for a deep dive into Apache Hudi and Lance&#8212;two open-source frameworks for building scalable lakehouse architectures.</p><ul><li><p><strong>When</strong>: 5:00 PM to 8:00 PM on Thursday, July 24</p></li><li><p><strong>Where</strong>: Docusign Tower, 999 3rd Ave #1000, Seattle, WA 98104</p></li><li><p><strong>Talk #1</strong>: Redefining Open Lakehouse Architecture with Apache Hudi 1.0, by Dipankar Mazumdar</p></li><li><p><strong>Talk #2</strong>: Multimodal AI Lakehouse with Lance &amp; LanceDB, by Jack Ye </p></li><li><p><strong>&#128073;&#127996; <a href="https://www.meetup.com/data-engineer-things-seattle-meetup/events/308773412/">RSVP</a></strong></p></li></ul><p><em>(&#127908; 
Interested in speaking at our meetups or online webinars? Submit talk proposals <a href="http://meetup.dataengineerthings.org/cfp">here</a>.)</em></p><div><hr></div><h3>&#128278; Featured Read</h3><h4>In Git We Trust: Git for Data over Data Lakes</h4><p><em>Author: <a href="https://medium.com/@ciro.greco">Ciro Greco</a></em></p><blockquote><p>Just as every software team adopted Git by 2010, every data team will adopt Git-for-Data by 2030.</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4XbN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92df411f-c6df-47a0-817f-b4d6ad4681be_2000x1125.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4XbN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92df411f-c6df-47a0-817f-b4d6ad4681be_2000x1125.webp 424w, https://substackcdn.com/image/fetch/$s_!4XbN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92df411f-c6df-47a0-817f-b4d6ad4681be_2000x1125.webp 848w, https://substackcdn.com/image/fetch/$s_!4XbN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92df411f-c6df-47a0-817f-b4d6ad4681be_2000x1125.webp 1272w, https://substackcdn.com/image/fetch/$s_!4XbN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92df411f-c6df-47a0-817f-b4d6ad4681be_2000x1125.webp 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!4XbN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92df411f-c6df-47a0-817f-b4d6ad4681be_2000x1125.webp" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/92df411f-c6df-47a0-817f-b4d6ad4681be_2000x1125.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:60312,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/167325489?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92df411f-c6df-47a0-817f-b4d6ad4681be_2000x1125.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4XbN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92df411f-c6df-47a0-817f-b4d6ad4681be_2000x1125.webp 424w, https://substackcdn.com/image/fetch/$s_!4XbN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92df411f-c6df-47a0-817f-b4d6ad4681be_2000x1125.webp 848w, https://substackcdn.com/image/fetch/$s_!4XbN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92df411f-c6df-47a0-817f-b4d6ad4681be_2000x1125.webp 1272w, 
https://substackcdn.com/image/fetch/$s_!4XbN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92df411f-c6df-47a0-817f-b4d6ad4681be_2000x1125.webp 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Git-for-data (<a href="https://blog.det.life/in-git-we-trust-git-for-data-over-data-lakes-322fe8375ace">Source</a>)</figcaption></figure></div><p>Data engineering is facing a crisis of reliability. 
While software developers enjoy the safety of Git's branches, commits, and rollbacks, data teams still work with loosely managed files with no built-in versioning. This creates three critical problems:</p><p>&#8226; Reproducibility nightmares</p><p>&#8226; Dangerous experimentation</p><p>&#8226; Manual everything</p><p><strong>The Solution: Git Concepts for Data</strong></p><p>Building on open table formats like Apache Iceberg and Delta Lake, the article envisions data workflows with:</p><p>&#8226; <strong>Branches</strong> to isolate experiments on real production data</p><p>&#8226; <strong>Commits</strong> to track multi-table changes as atomic units</p><p>&#8226; <strong>Merges</strong> to publish only after passing data quality checks</p><p>&#8226; <strong>Rollbacks</strong> to instantly undo bad pipeline runs</p><p>As AI becomes core to every product and data powers user-facing applications, we need software engineering-grade reliability for data operations. The convergence of AI adoption, real-time data applications, and Python as the data lingua franca is driving this inevitable shift.</p><p>The article showcases two use cases for Git-for-Data using <a href="https://www.bauplanlabs.com/">Bauplan</a>, a Pythonic data platform for transformation and AI workloads with built-in Git-for-Data capabilities.</p><p>&#128073;&#127996; Read the full article <strong><a href="https://medium.com/data-engineer-things/in-git-we-trust-git-for-data-over-data-lakes-322fe8375ace">HERE</a></strong>.</p><div><hr></div><h3><strong>&#128218; Articles of the Month</strong></h3><ul><li><p><a href="https://medium.com/data-engineer-things/what-is-apache-hive-f839362d62cb">What is Apache Hive?</a>: Lakehouses might be trending, but many companies still rely heavily on Apache Hive. 
Here's an excellent article that clearly explains Hive fundamentals.</p></li><li><p><a href="https://corp.roblox.com/newsroom/2025/06/roblox-path-to-2-trillion-analytics-events-a-day">Roblox&#8217;s Path to 2&#8239;Trillion Analytics Events a Day</a>: Learn how Roblox built a real-time schema-based analytics pipeline, enabling them to handle over 2 trillion events daily with reduced latency and costs.</p></li><li><p><a href="https://medium.com/data-engineer-things/the-6-most-common-customer-questions-on-data-projects-57924236b782">The 6 Most Common Customer Questions on Data Projects</a>: A playbook on communication, expectation-setting, and collaborative processes between engineering teams and customers.</p></li><li><p><a href="https://airbnb.tech/uncategorized/embedding-based-retrieval-for-airbnb-search/">Embedding-Based Retrieval for Airbnb Search</a>: Discover how Airbnb tackled the challenge of searching millions of listings by building a two-tower neural network that maps both queries and homes into numerical vectors. Their innovative approach to training data construction and serving infrastructure delivered booking improvements comparable to their largest ML ranking advancements in years.</p></li></ul><p><em>(&#9997;&#65039; Interested in publishing articles on DET on Medium? 
Read submission guidelines <a href="https://medium.com/data-engineer-things/write-for-data-engineer-things-32dc9294c5db">here</a>.)</em></p><div><hr></div><h3>&#127912; Community Spotlight</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ow0d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaed78a7-1e89-4dc0-ae40-73e9917f11e4_1199x350.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ow0d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaed78a7-1e89-4dc0-ae40-73e9917f11e4_1199x350.png 424w, https://substackcdn.com/image/fetch/$s_!Ow0d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaed78a7-1e89-4dc0-ae40-73e9917f11e4_1199x350.png 848w, https://substackcdn.com/image/fetch/$s_!Ow0d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaed78a7-1e89-4dc0-ae40-73e9917f11e4_1199x350.png 1272w, https://substackcdn.com/image/fetch/$s_!Ow0d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaed78a7-1e89-4dc0-ae40-73e9917f11e4_1199x350.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ow0d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaed78a7-1e89-4dc0-ae40-73e9917f11e4_1199x350.png" width="1199" height="350" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eaed78a7-1e89-4dc0-ae40-73e9917f11e4_1199x350.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:350,&quot;width&quot;:1199,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:138336,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/167325489?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaed78a7-1e89-4dc0-ae40-73e9917f11e4_1199x350.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ow0d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaed78a7-1e89-4dc0-ae40-73e9917f11e4_1199x350.png 424w, https://substackcdn.com/image/fetch/$s_!Ow0d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaed78a7-1e89-4dc0-ae40-73e9917f11e4_1199x350.png 848w, https://substackcdn.com/image/fetch/$s_!Ow0d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaed78a7-1e89-4dc0-ae40-73e9917f11e4_1199x350.png 1272w, https://substackcdn.com/image/fetch/$s_!Ow0d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaed78a7-1e89-4dc0-ae40-73e9917f11e4_1199x350.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>Tell us about yourself and how you started to engage with the DET community.</strong></p><blockquote><p>I'm <a href="https://www.linkedin.com/in/vjanz/">Volker</a>, a Principal Data Engineer who has been working with data in the gaming industry for nearly 15 years. I started publishing <a href="https://vojay.medium.com/">articles</a> with DET about a year ago, and now I'm happy to be contributing to the newsletter!</p></blockquote><p><strong>What's one Data Engineering resource (book, course, tool) you'd recommend that most engineers might not know about?</strong></p><blockquote><p><a href="https://www.amazon.com/Pragmatic-Programmer-journey-mastery-Anniversary/dp/0135957052">The Pragmatic Programmer by David Thomas and Andrew Hunt</a>. 
It might seem a bit out of the box since it focuses on general software development rather than data engineering, but the lessons it teaches can be easily adopted. Its core concepts have become guiding principles throughout my career in tech.</p></blockquote><p><strong>If your data engineering career were a Git repository, what would be the commit message for where you are right now?</strong></p><blockquote><p>refactor(career): add documentation so others can learn from it</p><p>Tip: Use <a href="https://gist.github.com/joshbuchea/6f47e86d2510bce28f8e7f42ae84c716">Semantic Commit Messages</a> in your data engineering code changes to make your work more understandable for your team!</p></blockquote><p><strong>What's the best piece of career advice you've received that other Data Engineers might benefit from?</strong></p><blockquote><p>Ask the Spice Girls question: "So tell me what you want, what you really, really want."</p><p>This question forces you to uncover the motivation, the WHY, behind a stakeholder's request. Before you start working on any project, I recommend clarifying three things for yourself and your team: WHAT are we building? HOW are we building it? WHY are we building it? Understanding the <em>why</em> helps you design simpler, more valuable solutions and increases your visibility because you understand the business value behind the technical requirements.</p></blockquote><div><hr></div><h3>&#128161; DE Tip of the Month</h3><p>Always set alerts on pipeline failures! Silent failures quickly become holiday spoilers. I learned this the hard way by spending my 4th of July debugging pipelines. Don't let this happen to you. Monitor early, alert often. For example, set up alerts for missing or delayed data to catch jobs that fail quietly without throwing an error.</p><div><hr></div><p>Let us know how you like the new newsletter format. 
See you next time!</p><div class="poll-embed" data-attrs="{&quot;id&quot;:342690}" data-component-name="PollToDOM"></div><p>Cheers,</p><p><a href="https://www.linkedin.com/in/vjanz/">Volker</a>, <a href="https://www.linkedin.com/in/shubhamgondane/">Shubham</a>, and <a href="https://www.linkedin.com/in/xinranwaibel/">Xinran</a></p><div><hr></div><h4>&#8505;&#65039; About Data Engineer Things</h4><p><a href="https://www.dataengineerthings.org/">Data Engineer Things</a> (DET) is a global community built by data engineers for data engineers. Subscribe to the <a href="https://dataengineerthings.substack.com/">newsletter</a> and follow us on <a href="https://www.linkedin.com/in/xinranwaibel/">LinkedIn</a> to gain access to exclusive learning resources and networking opportunities, including articles, webinars, meetups, conferences, mentorship, and much more.</p><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://dataengineerthings.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineer Things! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter #20]]></title><description><![CDATA[Numaflow webinar, free ebook on Airflow 3, volunteering, and Seattle coffee social.]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-20</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-20</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Tue, 10 Jun 2025 15:02:26 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c606821-25e3-4857-ab5e-9ade13edc4d3_1920x1080.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there,</p><p>I&#8217;m writing to you from San Francisco, where I&#8217;m attending the Data + AI Summit this week! I had the honor of joining the Open Lakehouse Mini Summit on Monday, a mini-conference for open-source contributors. 
I enjoyed learning about the Apache Spark 4.x roadmap, including real-time streaming mode, variant type, DataSource API v2, and declarative pipelines, and having open discussions on yet-to-be-solved problems like handling multimodal data and BLOB at scale.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Jna8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e2d0d20-cfd8-4f54-97ed-61dbc95710db_3221x2545.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Jna8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e2d0d20-cfd8-4f54-97ed-61dbc95710db_3221x2545.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Jna8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e2d0d20-cfd8-4f54-97ed-61dbc95710db_3221x2545.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Jna8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e2d0d20-cfd8-4f54-97ed-61dbc95710db_3221x2545.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Jna8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e2d0d20-cfd8-4f54-97ed-61dbc95710db_3221x2545.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Jna8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e2d0d20-cfd8-4f54-97ed-61dbc95710db_3221x2545.jpeg" width="459" height="362.53434065934067" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3e2d0d20-cfd8-4f54-97ed-61dbc95710db_3221x2545.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1150,&quot;width&quot;:1456,&quot;resizeWidth&quot;:459,&quot;bytes&quot;:1660128,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/165222813?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e2d0d20-cfd8-4f54-97ed-61dbc95710db_3221x2545.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Jna8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e2d0d20-cfd8-4f54-97ed-61dbc95710db_3221x2545.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Jna8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e2d0d20-cfd8-4f54-97ed-61dbc95710db_3221x2545.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Jna8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e2d0d20-cfd8-4f54-97ed-61dbc95710db_3221x2545.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Jna8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e2d0d20-cfd8-4f54-97ed-61dbc95710db_3221x2545.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Checking in at the Data + AI Summit 2025 in San Francisco.</figcaption></figure></div><div><hr></div><h3>&#128467; DET Webinar on Numaflow on June 18</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w0kS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c606821-25e3-4857-ab5e-9ade13edc4d3_1920x1080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w0kS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c606821-25e3-4857-ab5e-9ade13edc4d3_1920x1080.jpeg 
424w, https://substackcdn.com/image/fetch/$s_!w0kS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c606821-25e3-4857-ab5e-9ade13edc4d3_1920x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!w0kS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c606821-25e3-4857-ab5e-9ade13edc4d3_1920x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!w0kS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c606821-25e3-4857-ab5e-9ade13edc4d3_1920x1080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w0kS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c606821-25e3-4857-ab5e-9ade13edc4d3_1920x1080.jpeg" width="520" height="292.5" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4c606821-25e3-4857-ab5e-9ade13edc4d3_1920x1080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:520,&quot;bytes&quot;:225236,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/165222813?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c606821-25e3-4857-ab5e-9ade13edc4d3_1920x1080.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!w0kS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c606821-25e3-4857-ab5e-9ade13edc4d3_1920x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!w0kS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c606821-25e3-4857-ab5e-9ade13edc4d3_1920x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!w0kS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c606821-25e3-4857-ab5e-9ade13edc4d3_1920x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!w0kS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c606821-25e3-4857-ab5e-9ade13edc4d3_1920x1080.jpeg 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">DET Webinar: A Modern Approach to Real-Time Processing with Numaflow on June 18th.</figcaption></figure></div><p>DET Webinar is back! In the upcoming session, you will learn about <a href="https://numaflow.numaproj.io/">Numaflow</a>, an open-source framework from Intuit that rethinks real-time event processing for a broader audience of engineers and developers. Whether you're working on real-time processing, designing event-driven applications, or powering ML workflows, Numaflow offers a simple, flexible way to connect streaming sources, transform events, and move data in real time. Join us to see how modern stream processing is becoming more accessible and why it matters more than ever.</p><ul><li><p><strong>Speakers: </strong>Sri Harsha Yayi (Staff Product Manager, Intuit) and Vigith Maurice (Principal Software Engineer, Intuit)</p></li><li><p><strong>When</strong>: 10 am - 10:45 am on Wed, June 18th (PT)</p></li><li><p><strong>Where</strong>: Google Meet</p></li><li><p><strong>&#128073;&#127996; <a href="https://calendar.app.google/79kzvHo1Gn5C7XU77">RSVP</a></strong></p></li></ul><div><hr></div><h4>&#127916; Free eBook: Practical Guide to Apache Airflow 3</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!38EC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5551483b-62c0-445e-89c4-06173b534621_1920x1008.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!38EC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5551483b-62c0-445e-89c4-06173b534621_1920x1008.png 424w, https://substackcdn.com/image/fetch/$s_!38EC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5551483b-62c0-445e-89c4-06173b534621_1920x1008.png 848w, https://substackcdn.com/image/fetch/$s_!38EC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5551483b-62c0-445e-89c4-06173b534621_1920x1008.png 1272w, https://substackcdn.com/image/fetch/$s_!38EC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5551483b-62c0-445e-89c4-06173b534621_1920x1008.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!38EC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5551483b-62c0-445e-89c4-06173b534621_1920x1008.png" width="500" height="262.3626373626374" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5551483b-62c0-445e-89c4-06173b534621_1920x1008.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:500,&quot;bytes&quot;:1016279,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/165222813?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5551483b-62c0-445e-89c4-06173b534621_1920x1008.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!38EC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5551483b-62c0-445e-89c4-06173b534621_1920x1008.png 424w, https://substackcdn.com/image/fetch/$s_!38EC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5551483b-62c0-445e-89c4-06173b534621_1920x1008.png 848w, https://substackcdn.com/image/fetch/$s_!38EC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5551483b-62c0-445e-89c4-06173b534621_1920x1008.png 1272w, https://substackcdn.com/image/fetch/$s_!38EC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5551483b-62c0-445e-89c4-06173b534621_1920x1008.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Apache Airflow is one of the most popular open-source frameworks for data orchestration. Whether you are a new or experienced Airflow user, this book will be your guide for getting started with Airflow 3. 
You'll learn how to:</p><ul><li><p>Set up a local development environment and write your first pipeline.</p></li><li><p>Use new Airflow 3.0 features, including DAG versioning, backfill, asset-oriented syntax, and the new UI.</p></li><li><p>Prepare your DAGs for a smooth upgrade from Airflow 2 to 3.</p></li></ul><p>&#128073;&#127996; Read the book <strong><a href="https://www.astronomer.io/ebooks/practical-guide-to-apache-airflow-3/?utm_source=data-engineering-things&amp;utm_medium=paidmedia&amp;utm_campaign=ebook-practical-guide-af3-4-25">HERE</a></strong>.</p><p><em>(This message is sponsored by Astronomer.)</em></p><div><hr></div><h4>&#128587;&#127995;&#8205;&#9792;&#65039; Volunteers Needed</h4><p>We are looking for passionate and dedicated volunteers to join the DET team and shape the future of our community! As a volunteer, you will help us build the new mentorship program, design a one-stop community site, curate content for the newsletter, or host community events. This is a fantastic opportunity to make meaningful connections, learn technical and soft skills, and enrich your professional portfolio!</p><p>&#128073;&#127996; Apply to be a volunteer <strong><a href="https://forms.gle/a4Ai2ETXaANxcpjE6">HERE</a></strong>.</p><div><hr></div><h4>&#9749;&#65039; Seattle Coffee Social on June 13</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zWE9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4324c3f9-f0b2-4371-8e8c-1caabfb51db2_1080x1080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zWE9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4324c3f9-f0b2-4371-8e8c-1caabfb51db2_1080x1080.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!zWE9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4324c3f9-f0b2-4371-8e8c-1caabfb51db2_1080x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!zWE9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4324c3f9-f0b2-4371-8e8c-1caabfb51db2_1080x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!zWE9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4324c3f9-f0b2-4371-8e8c-1caabfb51db2_1080x1080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zWE9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4324c3f9-f0b2-4371-8e8c-1caabfb51db2_1080x1080.jpeg" width="325" height="325" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4324c3f9-f0b2-4371-8e8c-1caabfb51db2_1080x1080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1080,&quot;width&quot;:1080,&quot;resizeWidth&quot;:325,&quot;bytes&quot;:94359,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/165222813?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4324c3f9-f0b2-4371-8e8c-1caabfb51db2_1080x1080.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!zWE9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4324c3f9-f0b2-4371-8e8c-1caabfb51db2_1080x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!zWE9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4324c3f9-f0b2-4371-8e8c-1caabfb51db2_1080x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!zWE9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4324c3f9-f0b2-4371-8e8c-1caabfb51db2_1080x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!zWE9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4324c3f9-f0b2-4371-8e8c-1caabfb51db2_1080x1080.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Meet me and Saransh Arora (DET Seattle Lead) in downtown Seattle for a casual coffee social:</p><ul><li><p><strong>Where</strong>: Mr West Cafe Bar Downtown, 720 Olive Wy, Seattle, WA 98101</p></li><li><p><strong>When</strong>: 5 pm - 6 pm, Friday, June 13</p></li><li><p><strong>&#128073;&#127996; <a href="https://forms.gle/w6CUUwWQb6Y7agNR8">RSVP</a></strong></p></li></ul><div><hr></div><h4><strong>&#128218; Articles of the Week</strong></h4><ul><li><p><a href="https://blog.det.life/how-we-implemented-a-custom-b-tree-to-handle-10tb-of-time-series-data-64481fef908c">How We Implemented a Custom B-tree to Handle 10TB of Time-Series Data</a> by Coders Stop</p></li><li><p><a href="https://blog.det.life/why-are-there-so-many-databases-87d334c5dce6">Why Are There So Many Databases?</a> by Cai Parry-Jones</p></li><li><p><a href="https://blog.det.life/data-quality-with-airflow-sql-check-operators-a-step-by-step-guide-abb1d800dada">Data Quality With Airflow SQL Check Operators: A Step-by-Step Guide</a> by Lorena Gongang (<em>You might think you need some advanced framework for Data Quality, but sometimes Airflow could be all you need. This is a comprehensive follow-along guide for any engineer who wants to implement DQ in their pipelines.</em>)</p></li><li><p><a href="https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-design-strategies">Prompting Strategies</a> by Google Cloud (<em>aka. 
How to clearly communicate your requirements to GenAI to get exactly what you need.</em>)</p></li></ul><div><hr></div><p>Have a great week and see you next time!</p><p>Cheers,</p><p>Xinran Waibel</p><p>Head of the <a href="https://www.dataengineerthings.org/">Data Engineer Things</a> community</p><p><em>(&#128161; Don&#8217;t forget to follow us on <a href="https://www.linkedin.com/company/data-engineer-things/">LinkedIn</a> and join the DET <a href="http://join.det.life/">Slack</a> community!)</em></p>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter #19]]></title><description><![CDATA[Bay Area meetup at eBay HQ, OpenXData conference, and speaking opportunities.]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-19</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-19</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Tue, 13 May 2025 15:00:41 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72d91856-eadb-4d54-a1af-d6462ddd7e68_1920x1080.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello everyone,</p><p>How have you been? I had the most amazing April! I visited my family in China and took a week-long trip to Chengdu, Sichuan, a city famous for numbing peppers and pandas. It was the most amazing culinary adventure I&#8217;ve ever had. Then I flew to Los Gatos to host the 2nd Data Engineering Open Forum, and I met many of you in person at the event. I&#8217;m hoping to bring the open forum to more cities and exciting things are in the works. Stay tuned!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xs7t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec21900e-339e-46b7-b9c1-1b6341a88b43_2880x2880.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xs7t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec21900e-339e-46b7-b9c1-1b6341a88b43_2880x2880.jpeg 424w, https://substackcdn.com/image/fetch/$s_!xs7t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec21900e-339e-46b7-b9c1-1b6341a88b43_2880x2880.jpeg 848w, https://substackcdn.com/image/fetch/$s_!xs7t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec21900e-339e-46b7-b9c1-1b6341a88b43_2880x2880.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!xs7t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec21900e-339e-46b7-b9c1-1b6341a88b43_2880x2880.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xs7t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec21900e-339e-46b7-b9c1-1b6341a88b43_2880x2880.jpeg" width="368" height="368" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec21900e-339e-46b7-b9c1-1b6341a88b43_2880x2880.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:368,&quot;bytes&quot;:1956796,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/163409832?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec21900e-339e-46b7-b9c1-1b6341a88b43_2880x2880.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xs7t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec21900e-339e-46b7-b9c1-1b6341a88b43_2880x2880.jpeg 424w, https://substackcdn.com/image/fetch/$s_!xs7t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec21900e-339e-46b7-b9c1-1b6341a88b43_2880x2880.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!xs7t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec21900e-339e-46b7-b9c1-1b6341a88b43_2880x2880.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!xs7t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec21900e-339e-46b7-b9c1-1b6341a88b43_2880x2880.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">A photo recap of my April: spicy food, pandas, and the DE Open Forum. 
</figcaption></figure></div><div><hr></div><h3>&#128467; Bay Area Meetup at eBay HQ on May 28</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xPju!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72d91856-eadb-4d54-a1af-d6462ddd7e68_1920x1080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xPju!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72d91856-eadb-4d54-a1af-d6462ddd7e68_1920x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!xPju!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72d91856-eadb-4d54-a1af-d6462ddd7e68_1920x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!xPju!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72d91856-eadb-4d54-a1af-d6462ddd7e68_1920x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!xPju!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72d91856-eadb-4d54-a1af-d6462ddd7e68_1920x1080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xPju!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72d91856-eadb-4d54-a1af-d6462ddd7e68_1920x1080.jpeg" width="548" height="308.25" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/72d91856-eadb-4d54-a1af-d6462ddd7e68_1920x1080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:548,&quot;bytes&quot;:1024419,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/163409832?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72d91856-eadb-4d54-a1af-d6462ddd7e68_1920x1080.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xPju!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72d91856-eadb-4d54-a1af-d6462ddd7e68_1920x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!xPju!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72d91856-eadb-4d54-a1af-d6462ddd7e68_1920x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!xPju!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72d91856-eadb-4d54-a1af-d6462ddd7e68_1920x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!xPju!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72d91856-eadb-4d54-a1af-d6462ddd7e68_1920x1080.jpeg 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Data Engineer Things (DET) Bay Area Meetup on May 28, 2025.</figcaption></figure></div><p>Join the next DET Bay Area Meetup for an evening of learning and networking!</p><ul><li><p><strong>When</strong>: 6 pm - 8 pm on Wednesday, May 28</p></li><li><p><strong>Where</strong>: eBay's Global Headquarters, 2025 Hamilton Ave, San Jose, CA 95125</p></li><li><p><strong>Talk #1</strong>: Graph Data Engineering for Complex Low-Latency Analytics &amp; AI at Scale, by Sahana Chattopadhyay Debnath (Senior Data Engineer at eBay)</p></li><li><p><strong>Talk #2</strong>: Inside the next-gen Presto C++ engine, by Aditi Pandit (Principal Engineer at IBM)</p></li><li><p><strong>&#128073;&#127996; <a 
href="https://www.meetup.com/data-engineer-things-bay-area-meetup/events/307686868/?slug=data-engineer-things-bay-area-meetup&amp;eventId=307686868">RSVP</a></strong></p></li></ul><div><hr></div><h4>&#127916; The OpenXData Conference on May 21</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Zspt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20714fe4-1b19-496e-b95c-d918af847dde_1216x692.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Zspt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20714fe4-1b19-496e-b95c-d918af847dde_1216x692.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Zspt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20714fe4-1b19-496e-b95c-d918af847dde_1216x692.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Zspt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20714fe4-1b19-496e-b95c-d918af847dde_1216x692.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Zspt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20714fe4-1b19-496e-b95c-d918af847dde_1216x692.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Zspt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20714fe4-1b19-496e-b95c-d918af847dde_1216x692.jpeg" width="548" height="311.85526315789474" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/20714fe4-1b19-496e-b95c-d918af847dde_1216x692.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:692,&quot;width&quot;:1216,&quot;resizeWidth&quot;:548,&quot;bytes&quot;:485098,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/163409832?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20714fe4-1b19-496e-b95c-d918af847dde_1216x692.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Zspt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20714fe4-1b19-496e-b95c-d918af847dde_1216x692.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Zspt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20714fe4-1b19-496e-b95c-d918af847dde_1216x692.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Zspt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20714fe4-1b19-496e-b95c-d918af847dde_1216x692.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Zspt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20714fe4-1b19-496e-b95c-d918af847dde_1216x692.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The OpenXData Conference on May 21, 2025.</figcaption></figure></div><p>The OpenXData conference on Wednesday, May 21, is a free virtual event on open data architectures, covering topics like data lakehouses, stream processing, query engines, and data for AI/ML. 
At the event, you will find talks by great speakers from Netflix, dbt Labs, Databricks, Microsoft, Google, Meta, Peloton, and many more:</p><ul><li><p>Powering Amazon Unit Economics at Scale Using Apache Hudi, by Jason Liu (Senior Software Engineer at Amazon)</p></li><li><p>A Flexible, Efficient Lakehouse Architecture for Streaming Ingestion, by Rajwardhan Singh (Engineering Manager at Zoom)</p></li><li><p>Data Mesh and Governance at Twilio, by Aakash Pradeep (Principal Software Engineer at Twilio)</p></li></ul><p>&#128073;&#127996; Check out the full agenda and sign up <strong><a href="https://www.openxdata.ai/?utm_source=dataengineeringthings&amp;utm_medium=newsletter&amp;utm_campaign=2025_05_openxdata&amp;utm_id=2025_05_openxdata&amp;utm_content=202505_newsletter">HERE</a></strong>.</p><p><em>(This message is sponsored by the MLOps Community.)</em></p><div><hr></div><h4>&#128483;&#65039; Speaking at Future Community Events</h4><p>Within the community, we want to foster a safe environment for both learning and sharing, and we encourage our members (like you) to share their knowledge with everyone. It could be any data engineering topic you are passionate about: how you solved a real-world problem with data engineering, techniques for leveraging open-source frameworks, how to build a data team, communication skills, etc.</p><p>&#128073;&#127996; Want to speak at one of our in-person meetups or online webinars? Submit your talk proposals <strong><a href="http://meetup.dataengineerthings.org/cfp">HERE</a></strong>.</p><div><hr></div><h4><strong>&#128218; Articles of the Week</strong></h4><ul><li><p><a href="https://medium.com/data-engineer-things/why-most-data-engineering-resumes-get-rejected-in-10-seconds-and-how-to-fix-yours-398d5611a779">Why Most Data Engineering Resumes Get Rejected in 10 Seconds (And How to Fix Yours)</a> by Coders Stop. <em>(This article resonated with me. It has detailed examples of dos and don&#8217;ts and practical tips. 
If you are working on your resume, do check it out.)</em> </p></li><li><p><a href="https://blog.det.life/airflow-3-and-airflow-ai-sdk-in-action-analyzing-league-of-legends-490f5b4522f5">Airflow 3 and Airflow AI SDK in Action &#8212; Analyzing League of Legends</a> by Volker Janz</p></li><li><p><a href="https://medium.com/netflix-techblog/behind-the-scenes-building-a-robust-ads-event-processing-pipeline-e4e86caf9249">Behind the Scenes: Building a Robust Ads Event Processing Pipeline</a> by Netflix Technology Blog</p></li><li><p><a href="https://leaddev.com/communication/turn-workplace-conflict-collaboration">Turn workplace conflict into collaboration</a> by David Dye and Karin Hurt. <em>(In case you haven&#8217;t noticed, I&#8217;ve built a habit of recommending at least one article about soft skills in each newsletter. This time, it&#8217;s about conflict resolution.)</em></p></li></ul><div><hr></div><p>Have a great week and see you next time!</p><p></p><p>Cheers,</p><p>Xinran Waibel</p><p>Head of the <a href="https://www.dataengineerthings.org/">Data Engineer Things</a> community</p><p><em>(&#128161; Don&#8217;t forget to follow us on <a href="https://www.linkedin.com/company/data-engineer-things/">LinkedIn</a> and join the DET <a href="http://join.det.life/">Slack</a> community!)</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://dataengineerthings.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineer Things! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter #18]]></title><description><![CDATA[Upcoming meetups, our Medium editorial team, and volunteer opportunities.]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-18</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-18</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Thu, 13 Mar 2025 15:03:29 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90bc57ab-4c8a-40c5-965d-4f88499dfefd_1728x1473.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello everyone, &#127799;</p><p>I hope 2025 is treating you well. You haven&#8217;t heard from me for a while because I have been focusing on my own growth. Once again, I&#8217;m venturing outside of my comfort zone, and it&#8217;s still so uncomfortable and scary no matter how many times I&#8217;ve done it! I&#8217;ve been learning and reflecting a lot on delegation, ownership, and candor, and I hope to share my takeaways with you in a blog later. 
But until then, if you are also going through this, stay strong and keep climbing!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kNT1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea3de774-6c41-42cb-94ab-dca45a9ad96d_800x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kNT1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea3de774-6c41-42cb-94ab-dca45a9ad96d_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!kNT1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea3de774-6c41-42cb-94ab-dca45a9ad96d_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!kNT1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea3de774-6c41-42cb-94ab-dca45a9ad96d_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!kNT1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea3de774-6c41-42cb-94ab-dca45a9ad96d_800x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kNT1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea3de774-6c41-42cb-94ab-dca45a9ad96d_800x800.png" width="304" height="304" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ea3de774-6c41-42cb-94ab-dca45a9ad96d_800x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:800,&quot;resizeWidth&quot;:304,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;How to Leave your Comfort Zone and Enter your 'Growth Zone'&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="How to Leave your Comfort Zone and Enter your 'Growth Zone'" title="How to Leave your Comfort Zone and Enter your 'Growth Zone'" srcset="https://substackcdn.com/image/fetch/$s_!kNT1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea3de774-6c41-42cb-94ab-dca45a9ad96d_800x800.png 424w, https://substackcdn.com/image/fetch/$s_!kNT1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea3de774-6c41-42cb-94ab-dca45a9ad96d_800x800.png 848w, https://substackcdn.com/image/fetch/$s_!kNT1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea3de774-6c41-42cb-94ab-dca45a9ad96d_800x800.png 1272w, https://substackcdn.com/image/fetch/$s_!kNT1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea3de774-6c41-42cb-94ab-dca45a9ad96d_800x800.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a href="https://positivepsychology.com/comfort-zone/">Positive Psychology</a></figcaption></figure></div><div><hr></div><h3>&#128467; Upcoming DET Meetups</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9Cfp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F645a168d-5893-4aed-bafd-725ba8cc45ec_4032x2284.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9Cfp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F645a168d-5893-4aed-bafd-725ba8cc45ec_4032x2284.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!9Cfp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F645a168d-5893-4aed-bafd-725ba8cc45ec_4032x2284.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9Cfp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F645a168d-5893-4aed-bafd-725ba8cc45ec_4032x2284.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9Cfp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F645a168d-5893-4aed-bafd-725ba8cc45ec_4032x2284.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9Cfp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F645a168d-5893-4aed-bafd-725ba8cc45ec_4032x2284.jpeg" width="540" height="305.89285714285717" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/645a168d-5893-4aed-bafd-725ba8cc45ec_4032x2284.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2284,&quot;width&quot;:4032,&quot;resizeWidth&quot;:540,&quot;bytes&quot;:2012948,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/158958784?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaaa84d3-0fb8-4dc6-a80f-dfea129aa756_4032x3024.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!9Cfp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F645a168d-5893-4aed-bafd-725ba8cc45ec_4032x2284.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9Cfp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F645a168d-5893-4aed-bafd-725ba8cc45ec_4032x2284.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9Cfp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F645a168d-5893-4aed-bafd-725ba8cc45ec_4032x2284.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9Cfp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F645a168d-5893-4aed-bafd-725ba8cc45ec_4032x2284.jpeg 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The last DET Seattle meetup was a blast.</figcaption></figure></div><p>Spring is here! If you are in London or Seattle, it's time to get out there and join our meetups in March and April.</p><p><strong>London Meetup on March 20</strong>:</p><ul><li><p><strong>&#128073;&#127996; <a href="https://www.meetup.com/data-engineer-things-london-meetup/events/305772435/?eventOrigin=group_upcoming_events">RSVP</a></strong></p></li><li><p>Talk #1: Scaling similarity: running algorithmically complex computation at scale in Spark for ELT by Robert Vadai</p></li><li><p>Talk #2: Building a Data Platform in AWS, dbt, Iceberg Glue and Athena by Sanchit Agarwal</p></li><li><p>Talk #3: Data Platforms without Boundaries by Hugo Lu</p></li></ul><p><strong>Seattle Meetup on April 2</strong>:</p><ul><li><p><strong>&#128073;&#127996; <a href="https://www.meetup.com/data-engineer-things-seattle-meetup/events/306518876/?eventOrigin=group_upcoming_events">RSVP</a></strong></p></li><li><p>Talk #1: Data Governance with Unity Catalog</p></li><li><p>More talks are coming soon.</p></li><li><p>Don&#8217;t forget to say hi to Saransh Arora - he is a data engineer at AWS and the new lead for our Seattle meetup.</p></li></ul><p><strong>Bay Area Meetup</strong></p><p><strong>&#128073;&#127996; </strong>We are looking for a co-organizer (to lead event organization and logistics) and a sponsor (to provide venue and food) for our Bay Area meetup. Email dataengineerthings@gmail.com if you are interested in learning more.</p><p><em>(&#128483;&#65039; Would you like to speak at one of our meetups? 
Submit your talk proposals <a href="http://meetup.dataengineerthings.org/cfp">here</a>.)</em></p><div><hr></div><h4>&#128075;&#127996; Meet the DET Medium Editorial Team</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kHtn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90bc57ab-4c8a-40c5-965d-4f88499dfefd_1728x1473.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kHtn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90bc57ab-4c8a-40c5-965d-4f88499dfefd_1728x1473.png 424w, https://substackcdn.com/image/fetch/$s_!kHtn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90bc57ab-4c8a-40c5-965d-4f88499dfefd_1728x1473.png 848w, https://substackcdn.com/image/fetch/$s_!kHtn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90bc57ab-4c8a-40c5-965d-4f88499dfefd_1728x1473.png 1272w, https://substackcdn.com/image/fetch/$s_!kHtn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90bc57ab-4c8a-40c5-965d-4f88499dfefd_1728x1473.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kHtn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90bc57ab-4c8a-40c5-965d-4f88499dfefd_1728x1473.png" width="1728" height="1473" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/90bc57ab-4c8a-40c5-965d-4f88499dfefd_1728x1473.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1473,&quot;width&quot;:1728,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1439349,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dataengineerthings.substack.com/i/158958784?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb54cd56c-3258-432a-a293-294ab0cf405f_1728x2304.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kHtn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90bc57ab-4c8a-40c5-965d-4f88499dfefd_1728x1473.png 424w, https://substackcdn.com/image/fetch/$s_!kHtn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90bc57ab-4c8a-40c5-965d-4f88499dfefd_1728x1473.png 848w, https://substackcdn.com/image/fetch/$s_!kHtn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90bc57ab-4c8a-40c5-965d-4f88499dfefd_1728x1473.png 1272w, https://substackcdn.com/image/fetch/$s_!kHtn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90bc57ab-4c8a-40c5-965d-4f88499dfefd_1728x1473.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Please join me in welcoming the newest members of our Medium editorial team: Toma Khalifa, Sreyashi Das, Lakshmi Kristam, Parthiban Manavalan, and Nancy Amandi!</p><blockquote><p>In Yaakov&#8217;s own words: <em>&#8220;As our team expands, we&#8217;ll be able to take on larger projects, such as hosting additional writing workshops, trainings, and mentorships. I&#8217;m enthusiastic for what the future holds for us.&#8221;</em></p></blockquote><p><strong>&#128073;&#127996; </strong>We are looking for more editors for our <a href="https://blog.det.life/">Medium publication</a>. You&#8217;ll build connections, network with industry leaders, deepen your knowledge of data engineering, and grow professionally. 
Apply to be an editor by filling out <a href="https://forms.gle/UPyf3HU7AX1rXDV7A">THIS</a> form.</p><div><hr></div><h4><strong>&#128218; Articles of the Week</strong></h4><ul><li><p><a href="https://blog.det.life/i-interviewed-200-data-engineers-heres-what-separates-the-best-from-the-rest-3092524e5875">I Interviewed 200+ Data Engineers. Here&#8217;s What Separates the Best from the Rest!</a> by Shashwath Shenoy</p></li><li><p><a href="https://blog.det.life/a-non-beginner-data-engineering-roadmap-2025-edition-2b39d865dd0b">A non-beginner Data Engineering Roadmap &#8212; 2025 Edition</a> by Ernani Castro</p></li><li><p><a href="https://www.youtube.com/watch?v=xCwk7hyUIn0">Winning the Data Job Interviews</a> (video) by Shachar Meir</p></li><li><p><a href="https://netflixtechblog.com/introducing-impressions-at-netflix-e2b67c88c9fb">Introducing Impressions at Netflix</a> by Tulika Bhatt</p></li><li><p><a href="https://leaddev.com/culture/build-a-productive-code-review-culture">Build a productive code review culture</a> by Ara Ramanathan</p></li></ul><div><hr></div><p>Have a great week and see you next time!</p><p></p><p>Cheers,</p><p>Xinran Waibel</p><p>Head of the <a href="https://www.dataengineerthings.org/">Data Engineer Things</a> community</p><p><em>(&#128073;&#127996; Don&#8217;t forget to join the DET <a href="http://join.det.life/">Slack</a> community and follow us on <a href="https://www.linkedin.com/company/data-engineer-things/">LinkedIn</a>!)</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://dataengineerthings.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineer Things! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter #17]]></title><description><![CDATA[Volunteer opportunities, DEML Summit recordings, and what's coming in 2025.]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-17</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-17</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Tue, 19 Nov 2024 16:03:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66064eba-64c2-444a-8c61-2f9c14174abd_800x800.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello there! &#128075;&#127996;</p><p>Time flies by and we are near the end of 2024! My family started a holiday tradition several years ago: writing down and sharing a list of things we are grateful for. Here are a few things on my gratitude list. I&#8217;m grateful for the DET community members who generously offer their time and knowledge to help others grow. I&#8217;m also thankful for the genuine constructive feedback I received from my colleagues and for being able to grow from that feedback. 
Lastly, I&#8217;m grateful that I can be myself at work and with the community - because I&#8217;m at my best when I don&#8217;t need to pretend, when I can be curious all the way, and when I can say no.</p><p>While a growth mindset requires us to be aware of where we lack, it&#8217;s just as important to recognize what we have, such as strengths, achievements, and connections. I encourage you to create your own list and celebrate it!</p><div><hr></div><h3>&#127881; A New Chapter for DET Medium Publication</h3><p>Introducing the new Editor-in-Chief for our publication on Medium: <a href="https://www.linkedin.com/in/yaakovbressler/">Yaakov Bressler</a>!</p><p>With over 7 years of experience in data engineering, Yaakov is currently a Lead Data Engineer at Capital One. Not only is he an active writer himself, but he also has been an editor for DET since the beginning of our publication. In the last year of working with Yaakov, I've witnessed firsthand how responsible, reflective, and selfless he is, so I have confidence in the positive impact he will bring to the publication.</p><p>Interested in learning and growing through writing? Here is how you can get started:</p><ul><li><p><a href="https://blog.det.life/write-for-data-engineer-things-32dc9294c5db">Write for DET on Medium</a></p></li><li><p><a href="https://forms.gle/tD3Rx2qxfv45MfKX7">Join our editorial team</a></p></li><li><p>New to technical writing? 
<a href="https://forms.gle/amn4WtY1sDyc71QC9">Sign up for the writer's workshop</a></p></li></ul><div><hr></div><h4>&#127916; DEML Summit 2024 Session Recordings</h4><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!a1Ke!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!a1Ke!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!a1Ke!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!a1Ke!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!a1Ke!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!a1Ke!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg" width="1456" height="364" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:364,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:56639,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!a1Ke!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!a1Ke!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!a1Ke!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!a1Ke!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Recordings from DEML Summit 2024 are now available on <a href="https://www.youtube.com/@DEMLSummit/videos">YouTube</a>, including:</p><ul><li><p><strong><a href="https://youtu.be/vhO1pHtyz7Q?si=wvEQZ6U_QFDwL9Tp">Data Management for LLMs</a> </strong>by<strong> </strong>Abi Aryan, Author and Founder at Abide A.I.</p></li><li><p><strong><a 
href="https://youtu.be/JxP4pBAJHiU?si=aTRWpq4GTftj4P6l">Event Tracking Redefined: A Data Engineer's Guide to Creating Actionable Insights</a> </strong>by<strong> </strong>Saurabh Mishra, Staff Software Engineer at ThredUP Inc.</p></li><li><p><strong><a href="https://youtu.be/iXFO5uxSOmk?si=pCr6PQkMhhK6LMNw">Maestro Netflix&#8217;s Data ML Workflow Orchestrator</a></strong> by Jun He, Staff Software Engineer at Netflix.</p></li><li><p><strong><a href="https://youtu.be/3UfidL674Kk?si=MlaZesoTV5FuW8vl">The Resilience of SQL</a></strong> by Rui Machado, VP of Data Engineering and Architecture at H&amp;M Group.</p></li></ul><p><em>Thank you to our sponsor (Databricks), co-host (SeattleDataGuy), and volunteers (Mert Bozkir, Matthew Hoang, Sreyashi Das, Rho Lall, and Aishwarya Prasad Venkatesh) for making this conference happen!</em></p><div><hr></div><h3>&#129716; 2025 Looking Ahead</h3><p>We are always exploring new ideas to make the community better. I&#8217;d like to share a few things coming up in early 2025:</p><ul><li><p>I will be speaking at <a href="https://datadaytexas.com/">Data Day Texas</a> on Jan 25. If you plan to attend as well, drop me a note on Slack.</p></li><li><p>We are now part of O'Reilly&#8217;s community partner program, which means our members will get access to free O'Reilly learning resources, e.g., free books!</p></li><li><p>&#8220;My involvement in the DET community helped me land the new job&#8221; - I&#8217;ve heard this from several folks! I&#8217;m brainstorming a new program that community members can join to build their professional brands.</p></li></ul><p>Any suggestions from you? </p><div><hr></div><p>Have a great week and see you next time (probably in 2025)! 
&#10052;&#65039;</p><p></p><p>Cheers,</p><p>Xinran Waibel</p><p><em>(&#128073;&#127996; Don&#8217;t forget to join the DET <a href="http://join.det.life/">Slack</a> community and follow us on <a href="https://www.linkedin.com/company/data-engineer-things/">LinkedIn</a>!)</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://dataengineerthings.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineer Things! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter #16]]></title><description><![CDATA[Sharing Xinran's DEML Summit 2024 playlist.]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-16</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-16</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Wed, 02 Oct 2024 17:03:54 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ppuw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello there,</p><p>The <a href="https://www.accelevents.com/e/data-engineering-and-machine-learning-summit-2024#about">Data Engineering and Machine Learning (DEML) 
Summit 2024</a> is happening tomorrow and Friday (Oct 3rd - Oct 4th)!</p><p>Before the conference, I would like to share a few sessions I plan to attend myself. Hopefully, you will find some of these topics interesting too.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ppuw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ppuw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ppuw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ppuw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ppuw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ppuw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg" width="564" height="317.25" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:564,&quot;bytes&quot;:118327,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ppuw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ppuw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ppuw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ppuw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Data Engineering and Machine Learning (DEML) Summit 2024 on Oct 3rd - 4th.</figcaption></figure></div><div><hr></div><h4><strong>&#127911; Xinran&#8217;s DEML Summit 2024 Playlist</strong></h4><h4><strong>1. <a href="https://www.accelevents.com/e/data-engineering-and-machine-learning-summit-2024?sessionId=341110#agenda">Opportunity to Advance Data Engineering in the Environmental Sector</a></strong></h4><ul><li><p><strong>Speaker</strong>: Prashank Mishra<strong>, </strong>Analytics &amp; Data Engineer at Seirify Data</p></li><li><p><strong>Time</strong>: 8:00 am Thu, Oct 3rd (PT)</p></li><li><p><strong>Why I&#8217;m attending</strong>: I&#8217;ve heard someone joking that most data professionals today are working on getting more people to click on ads. Regardless of how true this joke is, I definitely noticed a lack of visibility in data engineering work in other fields, such as the environmental sector. 
I&#8217;m curious to learn about how Prashank solves challenges in the water utility space through data engineering.</p></li></ul><p></p><h4>2. <strong><a href="https://www.accelevents.com/e/data-engineering-and-machine-learning-summit-2024?sessionId=341124#agenda">Data Management for LLMs</a></strong></h4><ul><li><p><strong>Speaker</strong>: Abi Aryan, Author and Founder at Abide A.I.</p></li><li><p><strong>Time</strong>: 9:00 am Thu, Oct 3rd (PT)</p></li><li><p><strong>Why I&#8217;m attending</strong>: As more teams adopt LLMs in their products, they will need support from data engineers and ML engineers for managing backend data, and the data requirements will look different from those of traditional ML models. I&#8217;d like to hear Abi&#8217;s perspectives on how data engineering and ML engineering will shift in the LLM world.</p></li></ul><p></p><h4>3. <strong><a href="https://www.accelevents.com/e/data-engineering-and-machine-learning-summit-2024?sessionId=341114#agenda">Event Tracking Redefined: A Data Engineer's Guide to Creating Actionable Insights</a></strong></h4><ul><li><p><strong>Speaker</strong>: Saurabh Mishra, Staff Software Engineer at ThredUP Inc</p></li><li><p><strong>Time</strong>: 2:00 pm Thu, Oct 3rd (PT)</p></li><li><p><strong>Why I&#8217;m attending</strong>: The best data quality strategy focuses on prevention: in other words, establishing programmatic data contracts to stop data regressions from releases. How do you implement a technical solution and people process that can scale for constant product innovation? That&#8217;s a question I think about every day. I want to hear about how Saurabh and his team are tackling this problem.</p></li></ul><p></p><h4>4. 
<strong><a href="https://www.accelevents.com/e/data-engineering-and-machine-learning-summit-2024?sessionId=341116#agenda">10 Most Neglected Data Engineering Tasks</a></strong></h4><ul><li><p><strong>Speaker</strong>: Veronika Durgin, VP of Data at Saks</p></li><li><p><strong>Time</strong>: 10:00 am Fri, Oct 4th (PT)</p></li><li><p><strong>Why I&#8217;m attending</strong>: As a data engineering team, you often have to juggle business priorities and less visible work that offers long-term value. Securing time and resources for the latter can be hard. I&#8217;m curious to hear Veronika&#8217;s strategy for addressing those common yet neglected tasks.</p></li></ul><p></p><h4>5. <a href="https://www.accelevents.com/e/data-engineering-and-machine-learning-summit-2024?sessionId=344057#agenda">Unified Data + AI Governance with Unity Catalog</a></h4><ul><li><p><strong>Speaker</strong>: Victoria Bukta, Member Of Technical Staff at Databricks</p></li><li><p><strong>Time</strong>: 2:00 pm Fri, Oct 4th (PT)</p></li><li><p><strong>Why I&#8217;m attending</strong>: You have likely already heard about the <em>biggest data news of the year</em>: <a href="https://www.databricks.com/blog/databricks-tabular">Databricks acquiring Tabular</a>. How will the existing open-source catalog frameworks evolve to support table format unification and AI? I&#8217;m hoping to get insights from Victoria&#8217;s talk on Unity Catalog.</p></li></ul><div><hr></div><h4><strong>&#128467; Full Agenda and Registration</strong></h4><p>Check out the complete conference agenda <a href="https://www.accelevents.com/e/data-engineering-and-machine-learning-summit-2024#agenda">here</a> and explore other sessions. Sign up for DEML Summit 2024 for free at <a href="https://www.demlsummit.com/signup">demlsummit.com/signup</a>.</p><p><em>(Special thanks to <a href="https://www.databricks.com/">Databricks</a> for sponsoring the conference!)</em></p><div><hr></div><p>Have a great week and see you next time! 
&#129414;</p><p></p><p>Cheers,</p><p>Xinran Waibel (<a href="https://www.linkedin.com/in/xinranwaibel/">LinkedIn</a>)</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://dataengineerthings.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Data Engineer Things! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter #15]]></title><description><![CDATA[In-person community gathering at Big Data LDN on Sept 18]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-15</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-15</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Thu, 05 Sep 2024 14:01:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!OSmC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506b024e-b9e9-4edb-b2b2-191fcde1ad21_2560x1440.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!OSmC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506b024e-b9e9-4edb-b2b2-191fcde1ad21_2560x1440.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OSmC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506b024e-b9e9-4edb-b2b2-191fcde1ad21_2560x1440.png 424w, https://substackcdn.com/image/fetch/$s_!OSmC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506b024e-b9e9-4edb-b2b2-191fcde1ad21_2560x1440.png 848w, https://substackcdn.com/image/fetch/$s_!OSmC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506b024e-b9e9-4edb-b2b2-191fcde1ad21_2560x1440.png 1272w, https://substackcdn.com/image/fetch/$s_!OSmC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506b024e-b9e9-4edb-b2b2-191fcde1ad21_2560x1440.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OSmC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506b024e-b9e9-4edb-b2b2-191fcde1ad21_2560x1440.png" width="526" height="295.875" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/506b024e-b9e9-4edb-b2b2-191fcde1ad21_2560x1440.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:526,&quot;bytes&quot;:705307,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OSmC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506b024e-b9e9-4edb-b2b2-191fcde1ad21_2560x1440.png 424w, https://substackcdn.com/image/fetch/$s_!OSmC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506b024e-b9e9-4edb-b2b2-191fcde1ad21_2560x1440.png 848w, https://substackcdn.com/image/fetch/$s_!OSmC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506b024e-b9e9-4edb-b2b2-191fcde1ad21_2560x1440.png 1272w, https://substackcdn.com/image/fetch/$s_!OSmC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F506b024e-b9e9-4edb-b2b2-191fcde1ad21_2560x1440.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The DET community will be part of <strong>Big Data LDN</strong> this year! </p><ul><li><p>On Sept 18, I will be co-hosting a session on career development in data engineering in the X-Axis Keynote Theatre. If you are a data professional or an aspirational data engineer hoping to level up in your career, this session is for you.</p></li><li><p>At the end of the conference day, there will be a DET London Meetup. Please join us to network with the local data community. Looking forward to meeting you in person there. 
(Special thanks to Airbyte, Hetz Ventures, and Orchestra for co-hosting the meetup.)</p></li></ul><p><strong>&#128073;&#127996; Sign up for <a href="https://bigdataldn.com/register">Big Data LDN</a> and the <a href="https://www.meetup.com/data-engineer-things-london-meetup/events/302664845/?eventOrigin=group_upcoming_events">meetup</a> today (free registration).</strong></p><div><hr></div><h4>&#127903; Register for Data Engineering &amp; Machine Learning Summit 2024</h4><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!a1Ke!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!a1Ke!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!a1Ke!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!a1Ke!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!a1Ke!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!a1Ke!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg" width="1456" height="364" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:364,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:56639,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!a1Ke!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 424w, https://substackcdn.com/image/fetch/$s_!a1Ke!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 848w, https://substackcdn.com/image/fetch/$s_!a1Ke!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!a1Ke!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04a871c4-0f30-4908-a357-0c131883b23e_1600x400.jpeg 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><p>The DEML Summit is a month away! 
It is a free-to-attend virtual conference consisting of talks by data practitioners from various industries. Our lineup includes:</p><ul><li><p>&#8220;Event Tracking Redefined: A Data Engineer's Guide to Creating Actionable Insights&#8221; by Saurabh Mishra, Staff Software Engineer at thredUP</p></li><li><p>&#8220;Idea to Insight: Changing Data Culture through Innovation&#8221; by Gu Xie, Head of Data Engineering at Group 1001</p></li><li><p>&#8220;10 Most Neglected Data Engineering Tasks&#8221; by Veronika Durgin, VP of Data at Saks</p></li><li><p>&#8230; and many more!</p></li></ul><p><strong>&#128073;&#127996; Register for the conference <a href="http://www.demlsummit.com/signup">HERE</a> (free registration).</strong></p><p><em>(Special thanks to Databricks for sponsoring the conference.)</em></p><div><hr></div><h4>&#128218;<strong> Articles of the Week</strong></h4><ul><li><p><a href="https://blog.det.life/netflix-maestro-and-apache-airflow-competitors-or-companions-in-workflow-orchestration-2bce948956a5">Netflix Maestro and Apache Airflow &#8212; Competitors or Companions in Workflow Orchestration?</a> by Volker Janz</p></li><li><p><a href="https://blog.det.life/pyspark-interview-questions-for-data-engineers-part-i-cfa52ec6102d">PySpark Interview Questions for Data Engineers</a> by Vishal Barvaliya</p></li><li><p><a href="https://www.uber.com/blog/pinot-for-low-latency/">Pinot for Low-Latency Offline Table Analytics</a> by Uber Engineering</p></li><li><p><a href="https://medium.com/google-cloud/did-google-just-kill-streamlit-76f719d9e275">Did Google Just Kill Streamlit?</a> by Om Kamath</p></li></ul><p><em>(Want to publish blogs with DET?
Read our submission guidelines <a href="https://medium.com/data-engineer-things/write-for-data-engineer-things-32dc9294c5db">here</a>.)</em></p><div><hr></div><p>One last thing before you go&#8230; I want to give a special shout-out to the <strong>Data Engineering for AI/ML</strong> virtual conference on Sept 12, organized by the MLOps community. You will hear from data leaders from DuckDB, Salesforce, NVIDIA, Lidl, and many more. Don&#8217;t miss out on this great opportunity to network and learn.</p><p><strong>&#128073;&#127996; Sign up for the conference <a href="https://home.mlops.community/home/events/dataengforai?agenda_day=665ee46e12463aa4df4d48b2&amp;agenda_track=665ee46f12463aa4df4d48c4&amp;agenda_stage=665ee46e12463aa4df4d48b8&amp;agenda_filter_view=stage&amp;agenda_view=list">HERE</a> (free registration).</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-4GV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb604fb88-0437-4b26-8b7f-c9e0753688d9_1200x628.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-4GV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb604fb88-0437-4b26-8b7f-c9e0753688d9_1200x628.png 424w, https://substackcdn.com/image/fetch/$s_!-4GV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb604fb88-0437-4b26-8b7f-c9e0753688d9_1200x628.png 848w, https://substackcdn.com/image/fetch/$s_!-4GV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb604fb88-0437-4b26-8b7f-c9e0753688d9_1200x628.png 1272w, 
https://substackcdn.com/image/fetch/$s_!-4GV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb604fb88-0437-4b26-8b7f-c9e0753688d9_1200x628.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-4GV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb604fb88-0437-4b26-8b7f-c9e0753688d9_1200x628.png" width="512" height="267.94666666666666" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b604fb88-0437-4b26-8b7f-c9e0753688d9_1200x628.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:628,&quot;width&quot;:1200,&quot;resizeWidth&quot;:512,&quot;bytes&quot;:60204,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-4GV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb604fb88-0437-4b26-8b7f-c9e0753688d9_1200x628.png 424w, https://substackcdn.com/image/fetch/$s_!-4GV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb604fb88-0437-4b26-8b7f-c9e0753688d9_1200x628.png 848w, https://substackcdn.com/image/fetch/$s_!-4GV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb604fb88-0437-4b26-8b7f-c9e0753688d9_1200x628.png 1272w, 
https://substackcdn.com/image/fetch/$s_!-4GV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb604fb88-0437-4b26-8b7f-c9e0753688d9_1200x628.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div><hr></div><p>Have a great week and see you next time!
&#129414;</p><p></p><p>Cheers,</p><p>Xinran Waibel (<a href="https://www.linkedin.com/in/xinranwaibel/">LinkedIn</a>)</p>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter #14]]></title><description><![CDATA[Join the Data Engineering and Machine Learning Summit 2024 on Oct 3rd and 4th.]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-14</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-14</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Thu, 08 Aug 2024 15:01:29 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello there,</p><p>I&#8217;m thrilled to announce that DET will be co-hosting the <a href="https://www.accelevents.com/e/data-engineering-and-machine-learning-summit-2024#about">Data Engineering and Machine Learning (DEML) Summit</a> with SeattleDataGuy on Oct 3rd and 4th!</p><p>The DEML Summit is
a free-to-attend online conference made up of talks by practitioners: those who truly get their hands dirty with real-world data. This will be our third time hosting, and we had over 3,500 attendees last year.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ppuw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ppuw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ppuw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ppuw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ppuw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ppuw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg" width="564" height="317.25" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:564,&quot;bytes&quot;:118327,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ppuw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ppuw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ppuw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ppuw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe899c6c-141e-4b97-9e39-d341f7e75981_1920x1080.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>If you are interested in <strong>attending</strong>, <strong>speaking at</strong>, or <strong>sponsoring</strong> the conference, read on!</p><div><hr></div><h4><strong>&#127775; Speaker Lineup Sneak Peek &amp; Registration</strong></h4><p>The agenda is still a work in progress, but here are a few invited speakers I&#8217;m super excited about:</p><ul><li><p><a href="https://www.linkedin.com/in/jheua/">Jun He</a> (Staff Software Engineer at Netflix): Jun will share more about <a href="https://netflixtechblog.com/maestro-netflixs-workflow-orchestrator-ee13a06f9c78">Maestro</a>, Netflix&#8217;s data orchestrator that recently became available to the public.</p></li><li><p><a href="https://www.linkedin.com/in/himatejam/">Himateja Madala</a> (Engineering Leader at Disney Streaming): Himateja will talk about the role of data lineage and metadata management.</p></li><li><p><a href="https://www.linkedin.com/in/shacharmeir/">Shachar Meir</a> (Data 
Advisor, Ex-Director of Data Engineering at Meta): Shachar has prepared a session with tips to help you stand out in data interviews.</p></li></ul><p>The conference will provide a wealth of learning and networking opportunities that you wouldn&#8217;t want to miss. Grab your ticket today and stay tuned for more session announcements!</p><p><strong>&#128073;&#127996; Sign up for the virtual conference <a href="https://www.accelevents.com/e/data-engineering-and-machine-learning-summit-2024#about">HERE</a> (free registration).</strong></p><p><em>(Special thanks to Databricks for sponsoring the conference.)</em></p><div><hr></div><h4><strong>&#128221; Call for Proposals (CFP)</strong></h4><p>We are actively looking for sessions for the following conference tracks:</p><ul><li><p>Data Engineering &amp; Data Infrastructure</p></li><li><p>Data Science &amp; Machine Learning</p></li><li><p>Tech Demo (tutorials that demonstrate how to implement techniques or incorporate frameworks)</p></li></ul><p><strong>&#128073;&#127996; Please submit your talk proposal <a href="https://forms.gle/V6hwWuPQHAk2Xboq9">HERE</a> by Monday, Aug 12th</strong>. Speakers of accepted talks will be notified in early September.</p><div><hr></div><h4><strong>Sponsoring DEML</strong></h4><p>Contact us at <a href="http://demlsummit-info@googlegroups.com">demlsummit-info@googlegroups.com</a> if you are interested in sponsoring the conference.</p><div><hr></div><p>Have a great week and see you next time! 
&#129414;</p><p></p><p>Cheers,</p><p>Xinran Waibel (<a href="https://www.linkedin.com/in/xinranwaibel/">LinkedIn</a>)</p>]]></content:encoded></item><item><title><![CDATA[Data Engineer Things Newsletter #13]]></title><description><![CDATA[Upcoming online webinar on career growth and in-person mentorship opportunities.]]></description><link>https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-13</link><guid isPermaLink="false">https://dataengineerthings.substack.com/p/data-engineer-things-newsletter-13</guid><dc:creator><![CDATA[Data Engineer Things]]></dc:creator><pubDate>Mon, 20 May 2024 15:02:11 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2efce92b-a947-421c-9058-86de368cfbef_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello there,</p><p>Creating more in-person learning and networking opportunities for the community has been my focus this year.
The DET team hosted two more successful meetup events in the last month: one in <a href="https://www.meetup.com/data-engineer-things-seattle-meetup/">Seattle</a> and another in <a href="https://www.meetup.com/data-engineer-things-london-meetup/">London</a>!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Cqif!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75f0112-19b3-4a90-b9dc-c37fc3978e6e.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Cqif!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75f0112-19b3-4a90-b9dc-c37fc3978e6e.heic 424w, https://substackcdn.com/image/fetch/$s_!Cqif!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75f0112-19b3-4a90-b9dc-c37fc3978e6e.heic 848w, https://substackcdn.com/image/fetch/$s_!Cqif!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75f0112-19b3-4a90-b9dc-c37fc3978e6e.heic 1272w, https://substackcdn.com/image/fetch/$s_!Cqif!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75f0112-19b3-4a90-b9dc-c37fc3978e6e.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Cqif!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75f0112-19b3-4a90-b9dc-c37fc3978e6e.heic" width="478" height="358.5" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d75f0112-19b3-4a90-b9dc-c37fc3978e6e.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:478,&quot;bytes&quot;:2110491,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Cqif!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75f0112-19b3-4a90-b9dc-c37fc3978e6e.heic 424w, https://substackcdn.com/image/fetch/$s_!Cqif!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75f0112-19b3-4a90-b9dc-c37fc3978e6e.heic 848w, https://substackcdn.com/image/fetch/$s_!Cqif!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75f0112-19b3-4a90-b9dc-c37fc3978e6e.heic 1272w, https://substackcdn.com/image/fetch/$s_!Cqif!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75f0112-19b3-4a90-b9dc-c37fc3978e6e.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Everyone at the DET Seattle Meetup on April 18.</figcaption></figure></div><p>The next meetups in the <a href="https://www.meetup.com/data-engineer-things-bay-area-meetup/">Bay Area</a> and <a href="https://www.meetup.com/data-engineer-things-seattle-meetup/">Seattle</a> will happen in late June.
Stay tuned.</p><div><hr></div><h4><strong>&#127775; DET Online Webinar on May 23</strong></h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!V0kL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2efce92b-a947-421c-9058-86de368cfbef_1920x1080.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!V0kL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2efce92b-a947-421c-9058-86de368cfbef_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!V0kL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2efce92b-a947-421c-9058-86de368cfbef_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!V0kL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2efce92b-a947-421c-9058-86de368cfbef_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!V0kL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2efce92b-a947-421c-9058-86de368cfbef_1920x1080.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!V0kL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2efce92b-a947-421c-9058-86de368cfbef_1920x1080.png" width="474" height="266.625" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2efce92b-a947-421c-9058-86de368cfbef_1920x1080.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:474,&quot;bytes&quot;:986539,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!V0kL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2efce92b-a947-421c-9058-86de368cfbef_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!V0kL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2efce92b-a947-421c-9058-86de368cfbef_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!V0kL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2efce92b-a947-421c-9058-86de368cfbef_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!V0kL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2efce92b-a947-421c-9058-86de368cfbef_1920x1080.png 1456w" sizes="100vw"></picture></div></a><figcaption class="image-caption">DET Webinar: Growth Moments with Shachar Meir on May 23</figcaption></figure></div><p>Career and personal growth isn't linear. It happens in pivotal "moments" that transform your perspective and actions forever. The same applies to Data Engineering. In this online webinar, <a href="https://www.linkedin.com/in/shacharmeir/">Shachar Meir</a>, Data Advisor and Ex-Director of Data Engineering at Meta, will share some of the most impactful "growth moments" from his career in data. These are real-world stories with significant lessons that offer valuable insights and learnings.</p><p>&#128073;&#127996; Sign up <strong><a href="https://us06web.zoom.us/meeting/register/tZEufuqvqTgoGNzGbIAXNMi9hdW47zmvt2Nd">HERE</a></strong> (limited to 100 attendees).</p><div><hr></div><h4><strong>&#127775; DET Mentors at Snowflake Data Cloud Summit on June 6</strong></h4><p>Looking for guidance on growing your career in data engineering?
Join us at Snowflake Dev Day to get mentorship from experienced DET Mentors:</p><ul><li><p><strong>Date &amp; Time</strong>: 3 PM - 4 PM on Thu, June 6</p></li><li><p><strong>Location</strong>: Moscone Center, 747 Howard St., San Francisco, CA 94103. Meet us at the Connect() lounge area.</p></li><li><p><strong>Signup</strong>: <a href="https://www.snowflake.com/summit/devday/">HERE</a> (Free registration)</p></li></ul><p>Below are the mentors you will get to meet and connect with in person:</p><ul><li><p><a href="https://www.linkedin.com/in/hao-xu-a04436103/">Hao Xu</a> (Lead Software Engineer at JPMorgan Chase)</p></li><li><p><a href="https://www.linkedin.com/in/jaibalani/">Jai Balani</a> (Senior Data Engineer at Netflix)</p></li><li><p><a href="https://www.linkedin.com/in/sekhar-sahu/">Sekhar Sahu</a> (Staff Software Engineer at Zeta Global)</p></li><li><p><a href="https://www.linkedin.com/in/sharathchandra1288/">Sharath Vandanapu</a> (Senior Manager, Data Engineering at Confluent)</p></li><li><p><a href="https://www.linkedin.com/in/sreyashidas/">Sreyashi Das</a> (Senior Data Engineer at Netflix)</p></li><li><p><a href="https://www.linkedin.com/in/yaakovbressler/">Yaakov Bressler</a> (Senior Data Engineer at Headspace)</p></li></ul><div><hr></div><h4><strong>&#127903; Community Discount for AIQCon</strong></h4><p>The MLOps community is hosting the <a href="https://www.aiqualityconference.com/">AI Quality Conference</a> on June 25 in San Francisco. At this event, you'll engage with industry leaders and builders who are creating the gold standard for AI quality. Topics include accuracy, transparency, generalization, bias mitigation, efficiency, cost, and more.</p><p>Apply code <strong>testinprod</strong> at checkout for 20% off. 
(Thank you, Demetrios!)</p><div><hr></div><h4><strong>&#128218; Articles of the Week</strong></h4><ul><li><p><a href="https://blog.det.life/how-to-think-about-internal-data-products-as-a-data-engineer-42cef9081ebf">How to Think about Internal Data Products as a Data Engineer</a> by Hugo Lu</p></li><li><p><a href="https://blog.det.life/minds-and-machines-ai-for-mental-health-support-fine-tuning-llms-with-lora-in-practice-0ff19edb9d76">Minds and Machines &#8212; AI for Mental Health Support, Fine-Tuning LLMs with LoRA in Practice</a> by Volker Janz</p></li><li><p><a href="https://netflixtechblog.medium.com/data-gateway-a-platform-for-growing-and-protecting-the-data-tier-f1ed8db8f5c6">Data Gateway &#8212; A Platform for Growing and Protecting the Data Tier</a> by Netflix Technology Blog</p></li><li><p><a href="https://medium.com/dbsql-sme-engineering/one-big-table-vs-dimensional-modeling-on-databricks-sql-755fc3ef5dfd">One Big Table vs. Dimensional Modeling on Databricks SQL</a> by Databricks (Most concepts discussed are applicable outside of the Databricks platform)</p></li><li><p><a href="https://leaddev.com/managing-time-crisis/how-turn-engineering-incident-opportunity">How to turn an engineering incident into an opportunity</a> by Cory Watson</p></li></ul><p><em>(Want to publish blogs with DET? Read our submission guidelines <a href="https://medium.com/data-engineer-things/write-for-data-engineer-things-32dc9294c5db">here</a>.)</em></p><div><hr></div><p>Have a great week and see you next time! 
&#129414;</p><p>Cheers,</p><p>Xinran Waibel (<a href="https://www.linkedin.com/in/xinranwaibel/">LinkedIn</a>)</p>]]></content:encoded></item></channel></rss>