Forem

# dataengineering

Posts

Building an AI-Powered Customer Churn Prediction Pipeline on AWS (Step-by-Step)

2
Comments
5 min read
Beyond Tagging: A Blueprint for Real-Time Cost Attribution in Data Platforms

Comments
9 min read
Introducing `everyrow.io/dedupe`: An LLM-based approach to semantic deduplication

2
Comments
6 min read
New release: LightningChart Python 2.1

Comments
1 min read
Why Your Model is Failing (Hint: It’s Not the Architecture)

Comments
4 min read
Architecting for the Crash: Why 'Clean Data' is the Only Safety Net in Trading Wind-Down (TWD)

1
Comments
3 min read
How One Can Start Their Journey in Data Engineering

Comments 2
4 min read
The Time Our Pipeline Processed the Same Day’s Data 47 Times

Comments
5 min read
Firehose and Iceberg Tables

Comments
4 min read
Building Production-Grade Data Analytics Pipelines: A Real-World Case Study in Government Data

Comments
9 min read
Day 16: Delta Lake Explained - How Spark Finally Became Reliable for Production ETL

Comments
2 min read
Data Engineering Uncovered: What It Is and Why It Matters

3
Comments 1
3 min read
Google's LEGO tribute 🧩

27
Comments 8
1 min read
Migrate the legacy Greenplum to Apache Cloudberry with cbcopy

Comments
7 min read
Unpacking the Google File System Paper: A Simple Breakdown

Comments
3 min read