Forem

# dataengineering

Posts

Building Streaming Iceberg Tables for Real-Time Logistics Analytics

Comments
4 min read
Building a Scalable Community Health Worker Analytics Platform: My Journey with dbt and Snowflake

Comments
4 min read
The Great Table Format Debate: A Deep Dive into Apache Iceberg, Delta Lake, and Apache Hudi

1
Comments
18 min read
Amazon Kinesis vs Amazon MSK: The Complete Guide for Stream Processing on AWS

Comments
29 min read
Day 8: Accelerating Spark Joins - Broadcast, Shuffle Optimization & Skew Handling

Comments
2 min read
Mastering Serverless Data Pipelines: AWS Step Functions Best Practices for 2026

Comments
5 min read
2025 Year in Review: Apache Iceberg, Polaris, Parquet, and Arrow

Comments
6 min read
A Stranger In a New Town: CsvPath metadata fields

Comments
6 min read
Interesting links - November 2025

Comments
19 min read
💀 RIP Copy-Paste: Google NotebookLM Just Killed Manual Data Entry

Comments
3 min read
Unified Data Fabric: Serverless Spark on ROSA Integrating with AWS Glue Catalog

8
Comments 1
39 min read
dupl

Comments
1 min read
Apache Dev List Digest: Iceberg, Polaris, Arrow & Parquet (Nov 18–24, 2025)

Comments
5 min read
Building a Realistic Banking Dummy Data Generator with Bad-Data Simulation

1
Comments
1 min read
How to Sync Data from an Oracle Table to Elasticsearch using Kafka Connect

1
Comments 1
5 min read