Forem

# dataengineering

Posts

Well-formed, Valid, Canonical, and Correct

1
Comments
4 min read
Why we use Apache Airflow for Data Engineering

Comments
2 min read
Building ML Infrastructure in TypeScript - Part 1: The Vision

5
Comments
3 min read
Building a Real-Time Healthcare Data Pipeline with Apache Spark: From SQS to Parquet (Part 2)

Comments
8 min read
🧱 OLTP vs OLAP: When Transaction Meets Analytics

Comments
2 min read
Virtual Private Database (VPD) | DBMS_RLS | fine-grained access control (FGAC) | mrcaption49

5
Comments
5 min read
DBMS_SCHEDULER with Practical example | mrcaption49

5
Comments
4 min read
Building a News Sentiment Analysis Pipeline with Apache Airflow and Snowflake

11
Comments
3 min read
SQL CASE Statements: The Order Matters!

Comments
2 min read
Why Data Cleaning is 80% of Data Science

Comments
2 min read
Slowly Changing Dimensions: Strategies for Maintaining History and Integrity in Analytical Systems

1
Comments
8 min read
Tableau Sales Dashboard Performance (Updated for 2025)

1
Comments
4 min read
Build a Lightweight Serverless ETL Pipeline to Iceberg Tables with AWS Lambda Athena

2
Comments
4 min read
Big Data Fundamentals: data pipeline with python

Comments
6 min read
Big Data Fundamentals: data pipeline tutorial

Comments
6 min read
Data Science vs Business Analytics

Comments
1 min read
Big Data Fundamentals: data pipeline example

Comments
6 min read
Big Data Fundamentals: data pipeline

Comments
6 min read
What Is Change Data Capture (CDC) and How It Works on Google Cloud

Comments
2 min read
💾 Parquet or Avro? CSV or JSON?

Comments
1 min read
Reading CSVs with varying column counts that pandas cannot read using DuckDB

1
Comments
3 min read
Working with Apache to automate collection of Weather data for Kenya’s major Agricultural Areas

Comments
5 min read
2025 Data Warehouse Benchmark: What BigQuery, Snowflake, and Others Don’t Tell You

1
Comments
2 min read
Big Data Fundamentals: data warehouse example

Comments
5 min read
Big Data Fundamentals: data warehouse

Comments
6 min read