Forem

# dataengineering

Posts

Event-Driven Data Pipelines - Real-Time Orchestration on AWS

2
Comments
4 min read
Streamlit from Scratch: How to Build an App to Explore and Visualize Data from a CSV

Comments
4 min read
OLTP and OLAP: Enterprise Data Processing Systems

Comments
5 min read
Stop Manually Tracing Azure Synapse Dependencies

Comments
1 min read
Building Pangolin: My Holiday Break, an AI IDE, and a Lakehouse Catalog for the Curious

Comments
6 min read
Part 8: Databricks Pipeline & Dashboard

Comments
2 min read
Part 4: Building the Bronze Layer with Auto Loader and Delta Lake

Comments
2 min read
Part 5: Building a ZIP Code Dimension Table

Comments
2 min read
Part 2: Project Architecture

Comments
2 min read
Part 1: Creating Databricks Workspace and Enabling Unity Catalog

Comments
2 min read
End-to-End Real-Time Data Engineering on Databricks Using Spark Structured Streaming and Delta Lake

Comments
1 min read
Building Bulletproof Data Pipelines: Orchestration, Testing, and Monitoring (Part 3 of 3)

Comments
8 min read
Part 3: Simulating Real-Time Streaming Data Using Databricks Sample Datasets

Comments
1 min read
The Database Query That Could Cost a Company Millions (And Why Data Engineers Exist)

Comments
5 min read
Automating Serverless Data Ingestion: How to Connect External APIs to BigQuery using Python and Cloud Functions

Comments
12 min read
The Data Liberation: Amazon Athena and the Architecting of a Serverless Future

Comments
3 min read
The Ultimate Guide to Data Engineering on Google Cloud (2026)

5
Comments
3 min read
When code-gen suggests deprecated Pandas APIs — a subtle drift that broke a pipeline

Comments
3 min read
Part 6: Silver Layer – Cleansing, Enrichment, and Dimensions

Comments
2 min read
Part 7: Gold Layer – Metrics, Watermarks, and Aggregations

Comments
2 min read
Why Data SLAs Fail — and How to Enforce Them with a Unified Reliability Framework

Comments
2 min read
Unveiling the Power of Databases in the Realm of Big Data

Comments
2 min read
When an AI Suggests DataFrame.append: Missing Pandas Deprecations in Generated Code

Comments 1
3 min read
S3-Native Kafka Alternatives: What's Actually Different

Comments
3 min read
Analysing Drivers of Digital Transformation in Corporate Innovation Capacity Using Amazon SageMaker Studio and Kaggle API

Comments
2 min read