Forem

# dataengineering

Posts

My journey learning Apache Spark

1
Comments
2 min read
AWS DATA ENGINEER - 101

3
Comments
2 min read
The Journey From a CSV File to Apache Hive Table

2
Comments
6 min read
Capítulo 2 - Modelos de Datos y Lenguajes de Consulta

2
Comments
7 min read
All About Parquet Part 05 - Compression Techniques in Parquet

15
Comments
5 min read
All About Parquet Part 10 - Performance Tuning and Best Practices with Parquet

15
Comments
6 min read
All About Parquet Part 08 - Reading and Writing Parquet Files in Python

28
Comments
5 min read
All About Parquet Part 07 - Metadata in Parquet | Improving Data Efficiency

5
Comments
5 min read
All About Parquet Part 04 - Schema Evolution in Parquet

5
Comments 1
5 min read
All About Parquet Part 01 - An Introduction

2
Comments
4 min read
All About Parquet Part 09 - Parquet in Data Lake Architectures

1
Comments
5 min read
All About Parquet Part 02 - Parquet's Columnar Storage Model

2
Comments
4 min read
All About Parquet Part 06 - Encoding in Parquet | Optimizing for Storage

3
Comments
6 min read
All About Parquet Part 03 - Parquet File Structure | Pages, Row Groups, and Columns

3
Comments
5 min read
From a Unified Bronze Layer to Multiple Silver Layers: Streamlining Data Transformation in Databricks Unity Catalog

2
Comments
5 min read
*Mastering Informatica Intelligent Cloud Services (IICS) for Cloud Data Integration*

1
Comments
3 min read
Data Engineering with Scala: Mastering Real-Time Data Processing with Apache Flink and Google Pub/Sub

8
Comments
15 min read
Building a Big Data Playground Sandbox for Learning

8
Comments
5 min read
Still Using SQL, Python, & Excel for Data Deduplication? Here's Why You Need Better Tools.

6
Comments
4 min read
Capture Browser XHR/Fetch API Response Automatically into JSON Files

Comments
1 min read
The True Cost of Poor Data Quality: Why It Matters and How to Improve It

3
Comments
6 min read
From ETL and ELT to Reverse ETL

Comments 1
4 min read
Explaining the History of Data Lakehouse

1
Comments
2 min read
Building a User-Friendly, Budget-Friendly Alternative to dbt Cloud

Comments
1 min read
O que é Engenharia de Dados?

3
Comments
1 min read