#dataengineering
Building a Modern Data Platform to Track Kenya’s Food Prices — A Data Engineering Case Study
Rose Wabere
Nov 13 '25
#spark #pyspark #grafana #dataengineering
5 min read
Part 1: Database Concepts & Architecture
Data Tech Bridge
Dec 18 '25
#architecture #database #dataengineering
14 min read
Final Project Report 1: Schema Evolution Support on Apache SeaTunnel Flink Engine
Apache SeaTunnel
Nov 14 '25
#opensource #apacheseatunnel #bigdata #dataengineering
4 min read
I Built an ETL Pipeline That Actually Thinks & And Cut Token Costs by 52% (And Here's What I Learned)
Seenivasa Ramadurai
Dec 17 '25
#ai #dataengineering #performance #llm
1 reaction
17 min read
Firehose and Iceberg Tables
Evan
Dec 17 '25
#programming #dataengineering #aws #awsdatalake
2 reactions
4 min read
Beyond SQL: Solving Data Warehouse Performance Bottlenecks with Smart Algorithms, Not Just Bigger Clusters
Judy
Dec 17 '25
#algorithms #database #dataengineering #performance
5 reactions
13 min read
From Pandas to Upstream Control: The Evolution PyData Needs Next
David Aronchick
Nov 12 '25
#dataengineering #python #distributedsystems #machinelearning
6 min read
Statistics Day 2: Correlation Isn’t Causation — Here’s Why It Matters!
Chanchal Singh
Nov 13 '25
#statistics #datascience #machinelearning #dataengineering
5 reactions
4 min read
Unpacking the Google File System Paper: A Simple Breakdown
rajeevrajeshuni
Dec 15 '25
#distributedsystems #dataengineering #systemdesign #pwl
6 reactions
3 min read
Apache Dev List Digest: Iceberg, Polaris, Arrow & Parquet (Dec 9th - Dec15th, 2025)
Alex Merced
Dec 15 '25
#news #dataengineering #opensource
1 reaction
7 min read
Kafka consumer lag—Measure and reduce
Aimé Bangirahe
Nov 10 '25
#devops #performance #dataengineering #monitoring
5 min read
Understanding Kafka Consumer Lag: Causes, Risks, and How to Fix It
Eric Kahindi
Nov 10 '25
#systemdesign #performance #dataengineering #monitoring
3 min read
Building a dbt-UI I Wish Existed
remis haroon
Dec 15 '25
#dbt #data #dataengineering #dataanalytics
1 reaction
3 min read
Building a Real-Time Crypto Data Pipeline with Debezium CDC
Aineah Simiyu
Nov 10 '25
#python #kafka #cdc #dataengineering
5 min read
Undestanding Kafka Lag, Why It Happens and How To Fix It.
John Kioko
Nov 10 '25
#kafka #programming #datascience #dataengineering
2 reactions
4 min read