Forem

# dataengineering

Posts

How I Built a MongoDB Archiving System for Crawled Data

1
Comments 2
7 min read
Complete Guide: Dockerizing Spark, Kafka, and Jupyter for YouTube Pipeline

Comments
9 min read
Dockerized Spark and Kafka: YouTube Data Pipeline Implementation

Comments
7 min read
RIP Amazon Data Firehose Change Data Capture

7
Comments 3
4 min read
Event-Driven Architectures on AWS: Beyond Lambda

4
Comments
2 min read
YouTube Data Processing Pipeline

2
Comments 1
4 min read
🔄 ETL vs ELT: What’s the Difference and Why It Matters?

Comments
2 min read
CDC in AWS: Content Data Capture from AWS RDS MySQL into AWS MSK Kafka topic using Debezium

1
Comments
5 min read
LLPY-03: Intelligent Extraction and Processing of Legal Data

Comments
21 min read
Create a Microsoft Fabric Lakehouse

5
Comments
6 min read
🏗️ The Role of a Data Engineer: Beyond Pipelines

Comments
2 min read
Beyond Flat Tables: Model Hierarchical Data in Supabase with Recursive Queries

2
Comments
7 min read
Personal Picks: Data Product News (October 1, 2025)

Comments
7 min read
From Kafka to Clean Tables: Building a Confluent Snowflake Pipeline with Streams & Tasks

4
Comments 1
10 min read
Git Integration in Microsoft Fabric

4
Comments
3 min read