Forem

# dataengineering

Posts

Bulletproof Power Query (Part 2): A Smart, Fuzzy-Match Rename Function

Comments
4 min read
System Architecture Analysis: The Data Pipeline Issues of TraderKnows

Comments
2 min read
Building an AI-Powered Customer Churn Prediction Pipeline on AWS (Step-by-Step)

2
Comments
5 min read
Beyond Tagging: A Blueprint for Real-Time Cost Attribution in Data Platforms

Comments
9 min read
Introducing `everyrow.io/dedupe`: An LLM-based approach to semantic deduplication

2
Comments
6 min read
Why Your Model is Failing (Hint: It’s Not the Architecture)

Comments
4 min read
Architecting for the Crash: Why 'Clean Data' is the Only Safety Net in Trading Wind-Down (TWD)

1
Comments
3 min read
How One Can Start Their Journey in Data Engineering

Comments 2
4 min read
The Time Our Pipeline Processed the Same Day’s Data 47 Times

Comments
5 min read
Building Production-Grade Data Analytics Pipelines: A Real-World Case Study in Government Data

2
Comments
9 min read
Day 16: Delta Lake Explained - How Spark Finally Became Reliable for Production ETL

Comments
2 min read
Apache Iceberg Explained: From Data Lakes to Metadata, Snapshots, and Real-World Usage

2
Comments 2
4 min read
Data Engineering Uncovered: What It Is and Why It Matters

3
Comments 1
3 min read
SQL - PostgreSQL: Execution Order

4
Comments
5 min read
Migrate the legacy Greenplum to Apache Cloudberry with cbcopy

Comments
7 min read