Forem

# dataengineering

Posts

Analytics don't want duplicated data, so get it exactly-once with Flink/Kafka

Comments
3 min read
Metadata for win — Apache Parquet

Comments
5 min read
Remove unwanted partition data in Azure Synapse (SQL DW)

1
Comments
6 min read
Replacing SaaS ETL with Python dlt: A painless experience for Yummy.eu

2
Comments
3 min read
Simplifying SDMX Data Integration with Python

2
Comments
3 min read
Unlocking the Power of Large Language Models (LLMs): Your Ultimate Guide

6
Comments
1 min read
Clustering vs Partitioning your Apache Iceberg Tables

7
Comments
10 min read
From Messy Data to Super Mario Pipeline: My First Adventure in Data Engineering

1
Comments
12 min read
Database generated events: LiveSync’s database connector vs CDC

4
Comments
5 min read
The Data Professions

1
Comments
3 min read
MySQL: Using and Enhancing `DATETIME` and `TIMESTAMP`

1
Comments
3 min read
Analyzing Svenskalag Data using DBT and DuckDB

1
Comments
4 min read
How I've implemented the Medallion architecture using Apache Spark and Apache Hadoop

12
Comments
6 min read
Working with Dates and Times in SQL: Tips and Tricks

Comments
3 min read
FastAPI for Data Applications: From Concept to Creation. Part I

4
Comments
5 min read
Bridging Backend and Data Engineering: Communicating Through Events

Comments
1 min read
Using Elasticsearch Percolate Queries, Netflix Efficiently Improves Reverse Searches

1
Comments
3 min read
How to setup resources for k8s pod

2
Comments
3 min read
Multi-tenant workload isolation in Apache Doris: a better balance between isolation and utilization

3
Comments
9 min read
Data Mesh: An Executive Guide to Modern Data Architecture in Manufacturing

1
Comments
13 min read
Difference between Data Analysts, Data Scientists, and Data Engineers

Comments 1
1 min read
What is Data Ethics?

Comments
8 min read
Converting .shp files to CSV with GeoPandas

14
Comments 1
2 min read
Apache Iceberg and Data Lakehouse Partitioning

8
Comments 1
7 min read
Data warehouse vs data lake

1
Comments
8 min read