Forem

# dataengineering

Posts

Apache Kafka in Data Engineering

Comments
1 min read
Data Engineering 102: Understanding Transactions, ACID, and Isolation in PostgreSQL

3
Comments
5 min read
Building a Robust Data Observability Framework to Ensure Data Quality and Integrity

1
Comments 1
7 min read
Benchmarking Multimodal AI Workloads: Daft vs Spark vs Ray Data

10
Comments
1 min read
All About Change Data Capture CDC

1
Comments
6 min read
🚀 Day 17 of My Python Learning Journey

Comments
1 min read
Sagas vs ACID Transactions: Ensuring Reliability in Distributed Architectures

1
Comments
11 min read
JOIN the data analytics race: Apache Doris vs. ClickHouse, Databricks, and Snowflake

Comments 1
6 min read
A Beginner’s Journey with PostgreSQL

2
Comments
3 min read
Break Through Data Silos: Practices of Multi-cloud Observability Integration Based on Object Storage Service (OSS)

Comments
12 min read
Real-Time CDC with Debezium and Kafka for Sharded PostgreSQL Integration

1
Comments
9 min read
Comprehensive Guide: kwargs vs XCom in Python & Airflow

Comments
4 min read
Precise Data Extraction: Pattern-Based Partitioning for Structured Extraction

1
Comments
3 min read
Apache Gravitino 1.0.0 — From Metadata Management to Contextual Engineering

1
Comments
7 min read
Apache Kafka in Data engineering

6
Comments 1
1 min read
How I Built a MongoDB Archiving System for Crawled Data

1
Comments 2
7 min read
🧭System Design Roadmap for Data Engineers

4
Comments
3 min read
Orchestrating and Observing Data Pipelines with Airflow, PostgreSQL, and Polar

2
Comments
3 min read
💥 Polars vs. Pandas: Why Your Next ETL Pipeline Should Run on Rust (Part 1/5)

2
Comments
2 min read
(Ⅱ) A Complete Guide to Core Data Warehouse Design Standards: From Layers, Types to Lifecycle

Comments
6 min read
Building Distributed Systems with Ray—Just Like Running a Restaurant

1
Comments
7 min read
The State of Apache Iceberg v4 - October 2025 Edition

3
Comments
6 min read
ACID, Isolation Levels, and MVCC: Architecture and Execution in Relational Databases

2
Comments
10 min read
Data Automation: A Deep Dive

1
Comments
5 min read
Why Data Partitioning Is Harder Than It Looks

1
Comments
2 min read