Forem

# dataengineering

Posts

🧪 Virtual Environments for Data Engineers — 2025 Edition

Comments
1 min read
Apache Iceberg Table Optimization #2: The Basics of Compaction — Bin Packing Your Data for Efficiency

Comments
3 min read
Big Data Fundamentals: big data tutorial

1
Comments
5 min read
Big Data Fundamentals: big data tutorial

1
Comments
5 min read
Apache Iceberg Table Optimization #7: Using Iceberg Metadata Tables to Determine When Compaction Is Needed

Comments
3 min read
Apache Iceberg Table Optimization #5: Avoiding Metadata Bloat with Snapshot Expiration and Rewriting Manifests

Comments
3 min read
Apache Iceberg Table Optimization #4: Smarter Data Layout — Sorting and Clustering Iceberg Tables

1
Comments
3 min read
Apache Iceberg Table Optimization #3: Optimizing Compaction for Streaming Workloads in Apache Iceberg

Comments
3 min read
Apache Iceberg Table Optimization #1: The Cost of Neglect — How Apache Iceberg Tables Degrade Without Optimization

Comments
3 min read
Big Data Fundamentals: big data tutorial

1
Comments
5 min read
Database Design Errors to Avoid & How To Fix Them

11
Comments 1
5 min read
Cross-Platform Multi-Channel Attribution in Marketing: Balancing Costs and Results Across Devices

1
Comments
5 min read
2025 Data Warehouse Benchmark: What BigQuery, Snowflake, and Others Don’t Tell You

3
Comments
2 min read
🛒 Real-Life Data Lakehouse Use Case: Revolutionizing Retail Analytics

2
Comments
2 min read
Data and analytics reimagined with Terraform and DevOps principles

Comments
3 min read
Top Trends and Applications in Data Engineering and AI for the Modern Enterprise

Comments 1
5 min read
A Real-Time Earthquake Monitoring Pipeline with Kafka, MySQL, PostgreSQL, and Grafana

3
Comments
4 min read
Pandas vs Polars: Is It Time to Rethink Python’s Trusted DataFrame Library?

3
Comments 2
3 min read
Big Data Fundamentals: big data tutorial

5
Comments
5 min read
Big Data Fundamentals: big data project

5
Comments
5 min read
Big Data Fundamentals: big data example

5
Comments
5 min read
Big Data Fundamentals: big data

5
Comments
6 min read
Why Your Data Fails You - and How a Data Platform Can Fix It

1
Comments
4 min read
Cloud Data Tools Simplified: AWS, Google Cloud, and Azure

1
Comments
7 min read
Understanding Consistency in PostgreSQL: A Deep Dive into the “C” in ACID

Comments
3 min read