Forem

# dataengineering

Posts

Star vs. Snowflake Schema

Comments
4 min read
A real-world example of CsvPath schemas

Comments
5 min read
Data Engineer: The Architect of the “Data Flow” in the Digital Age

Comments
2 min read
Building a Modern Data Platform to Track Kenya’s Food Prices — A Data Engineering Case Study

Comments
5 min read
Inside the Edge: How Real-Time Data Pipelines Power Connected Devices

Comments
3 min read
Final Project Report 1: Schema Evolution Support on Apache SeaTunnel Flink Engine

Comments
4 min read
From Pandas to Upstream Control: The Evolution PyData Needs Next

Comments
6 min read
Building Reliable Legal AI: Never Missing a Supreme Court Case

2
Comments
26 min read
Statistics Day 2: Correlation Isn’t Causation — Here’s Why It Matters!

5
Comments
4 min read
Containerization for Data Engineering: A Practical Guide with Docker and Docker Compose

Comments
2 min read
Kafka consumer lag—Measure and reduce

Comments
5 min read
Understanding Kafka Consumer Lag: Causes, Risks, and How to Fix It

Comments
3 min read
Building a Real-Time Crypto Data Pipeline with Debezium CDC

Comments
5 min read
Understanding Kafka Lag, Why It Happens and How To Fix It.

2
Comments
4 min read
This Is Probably the Most Lightweight Alternative Technology to Logical Data Warehouses

5
Comments
4 min read
The State of Apache Iceberg, Polaris, and Arrow: November 5-11

Comments
5 min read
Understanding Kafka Lag: Why It Happens and How to Fix It

Comments
4 min read
Right Approach to JSON Log Analysis: A Hands-on Guide to Efficient Practices with Alibaba Cloud SLS

Comments
7 min read
Understanding reasons behind Kafka lag and how to minimize it.

Comments
3 min read
Reducing Consumer Lag in Apache Kafka

5
Comments
3 min read
Stop Copy-Pasting Between Excel and Code: Automate Your Data Workflows with GridScript

Comments
2 min read
SQL: Summing categories

Comments
2 min read
Build a Complete Data Pipeline from Scratch: CSV to Dashboard Using Python, MySQL, and Airflow

2
Comments 1
3 min read
The Future of Data Pipelines: How AI Is Redefining ETL Forever

1
Comments
4 min read
Translating Between CSV Schema Languages

Comments
4 min read