Forem

# dataengineering

Posts

5 Database Design Mistakes I Keep Seeing (And How to Catch Them Early)

1
Comments
7 min read
Scaling Fuzzy Matching: From Local Scripts to Production Pipelines

7
Comments
5 min read
Offloading Statistical Computations to BigQuery: Efficient EDA with Python and Seaborn

1
Comments
2 min read
Why Most Data Projects Fail Before the First Model Is Built

5
Comments
2 min read
Data Relationship Intelligence Is Infrastructure — Not a Feature

4
Comments 1
1 min read
AI Data Engineer Skills Deep-Dive: Entry-Level Reality + Senior Differentiators (Follow-up to Part 1)

Comments
4 min read
Why Data Teams Still “Guess” Join Keys in 2026

Comments 1
2 min read
How DiDi Scaled to Hundreds of Petabytes with Apache Ozone

Comments
4 min read
XLTable: Bringing the OLAP Experience Back to Excel on Modern Data Warehouses

Comments
4 min read
Stop Bad Data From Breaking Your Pipelines — A Python Data Quality Framework

Comments
3 min read
How to Implement Data Modelling in Power BI

2
Comments
2 min read
The Power of Generic Reading in PySpark: A Unified Approach to Data

1
Comments
3 min read
AI Data Engineer vs Data Engineer: What Actually Changed? (50+ Job Analysis)

Comments
4 min read
The Two SQL Concepts That Made Me Finally Understand Real Data: Joins & Window Functions.

1
Comments
3 min read
Scaling Relationship Discovery Beyond Brute Force

5
Comments 1
1 min read