#dataengineering
Posts
My journey learning Apache Spark
Paulet Wairagu
Oct 26 '24
#spark #sql #dataengineering
1 reaction
2 min read
AWS DATA ENGINEER - 101
Sajjad Rahman
Oct 24 '24
#aws #dataengineering #awschallenge #awsbigdata
3 reactions
2 min read
The Journey From a CSV File to Apache Hive Table
Abdullah Haggag
Oct 24 '24
#hadoop #hive #bigdata #dataengineering
2 reactions
6 min read
Capítulo 2 - Modelos de Datos y Lenguajes de Consulta
Pablo Arango Ramirez
Oct 22 '24
#data #sql #nosql #dataengineering
2 reactions
7 min read
All About Parquet Part 05 - Compression Techniques in Parquet
Alex Merced
Oct 21 '24
#database #datascience #dataengineering
15 reactions
5 min read
All About Parquet Part 10 - Performance Tuning and Best Practices with Parquet
Alex Merced
Oct 21 '24
#database #datascience #dataengineering
15 reactions
6 min read
All About Parquet Part 08 - Reading and Writing Parquet Files in Python
Alex Merced
Oct 21 '24
#database #datascience #dataengineering #data
28 reactions
5 min read
All About Parquet Part 07 - Metadata in Parquet | Improving Data Efficiency
Alex Merced
Oct 21 '24
#data #database #dataengineering #datascience
5 reactions
5 min read
All About Parquet Part 04 - Schema Evolution in Parquet
Alex Merced
Oct 21 '24
#database #datascience #dataengineering
5 reactions
1 comment
5 min read
All About Parquet Part 01 - An Introduction
Alex Merced
Oct 21 '24
#database #dataengineering
2 reactions
4 min read
All About Parquet Part 09 - Parquet in Data Lake Architectures
Alex Merced
Oct 21 '24
#data #database #datascience #dataengineering
1 reaction
5 min read
All About Parquet Part 02 - Parquet's Columnar Storage Model
Alex Merced
Oct 21 '24
#database #datascience #dataengineering
2 reactions
4 min read
All About Parquet Part 06 - Encoding in Parquet | Optimizing for Storage
Alex Merced
Oct 21 '24
#database #datascience #dataengineering
3 reactions
6 min read
All About Parquet Part 03 - Parquet File Structure | Pages, Row Groups, and Columns
Alex Merced
Oct 21 '24
#database #datascience #dataengineering
3 reactions
5 min read
From a Unified Bronze Layer to Multiple Silver Layers: Streamlining Data Transformation in Databricks Unity Catalog
prakhyatkarri
Oct 20 '24
#databricks #unitycatalog #medallionarchitecture #dataengineering
2 reactions
5 min read
*Mastering Informatica Intelligent Cloud Services (IICS) for Cloud Data Integration*
Rodolfo Mendivil
Oct 18 '24
#iics #data #etl #dataengineering
1 reaction
3 min read
Data Engineering with Scala: Mastering Real-Time Data Processing with Apache Flink and Google Pub/Sub
Geazi Anc
Oct 18 '24
#dataengineering #scala #datascience #flink
8 reactions
15 min read
Building a Big Data Playground Sandbox for Learning
Abdullah Haggag
Oct 17 '24
#dataengineering #bigdata #opensource
8 reactions
5 min read
Still Using SQL, Python, & Excel for Data Deduplication? Here's Why You Need Better Tools.
Farah Kim
Oct 17 '24
#algorithms #ai #dataengineering
6 reactions
4 min read
Capture Browser XHR/Fetch API Response Automatically into JSON Files
Dendi Handian
Sep 12 '24
#help #dataengineering #chrome #javascript
1 min read
The True Cost of Poor Data Quality: Why It Matters and How to Improve It
Mark Yu
Oct 16 '24
#database #datascience #dataengineering #management
3 reactions
6 min read
From ETL and ELT to Reverse ETL
luminousmen
Oct 15 '24
#dataengineering #bigdata #data
1 comment
4 min read
Explaining the History of Data Lakehouse
Pavol Z. Kutaj
Oct 14 '24
#lakehouse #dataengineering #warehouse
1 reaction
2 min read
Building a User-Friendly, Budget-Friendly Alternative to dbt Cloud
Marco Porracin
Sep 8 '24
#dbt #dataengineering #opensource #datascience
1 min read
O que é Engenharia de Dados?
Norton Augusto Herrero dos Santos
Oct 12 '24
#dataengineering #datascience
3 reactions
1 min read