Forem

# dataengineering

Posts

Building a Scalable Scientific LLM Pipeline: From Raw Data to Hugging Face

Comments
6 min read
Real-Time ETLT: Meet the Demands of Modern Data Processing

1
Comments
5 min read
What is Data Engineering?

Comments
5 min read
Personal Picks: Data Product News (April 30, 2025)

Comments
6 min read
#34 50 Advanced SQL Queries Every Developer Should Know

2
Comments
7 min read
Setting Up Presto with Apache Superset using Docker 🐳 : Hands-On Guide

Comments 2
3 min read
Most Important DWDM questions in MAKAUT exam

Comments
3 min read
Understanding AWS Regions and Availability Zones: A Guide for Beginners

Comments
5 min read
Data Warehousing and Data Mining

Comments
2 min read
What I Learned Cleaning 1 Million Rows of CSV Data Without Pandas

5
Comments
2 min read
Big Data Processing - Case Study 3 (Hadoop) 03:02

Comments
1 min read
Building My First Real-Time Dashboard with ClickHouse and Streamlit: TrendLite Breakdown

2
Comments
2 min read
From Reddit Trolls to Real-Time Analytics: Building an LLM-Powered Flink Deployment System

3
Comments 1
7 min read
How to Handle Big Data Transformations Without Pandas (and My Favorite Workarounds)

5
Comments
3 min read
Implementing Databricks Asset Bundles Without Dying in the Attempt

Comments
9 min read
Big Data Processing - Case Study 2 (Databricks) 01:42

Comments
1 min read
Big Data Processing - Case Study 2 (Hadoop) 04:26

Comments
1 min read
Big Data Processing - Case Study 2 (Spark) 01:52

Comments
1 min read
Big Data Processing - Case Study 1 (Hadoop) 02:01

Comments
1 min read
The Ultimate Linux Command Cheat Sheet for Data Engineers and Analysts

75
Comments 4
4 min read
Free Datasets for Practicing Data Engineering Skills: A 2025 Guide

3
Comments
3 min read
Building a Stock Data Pipeline with requests, Apache Airflow and PostgreSQL

1
Comments
4 min read
Why do AWS dashboards keep breaking — and is there a better way?

Comments 1
1 min read
Complete Beginner's Guide: Building a Weather ETL Pipeline with PySpark

2
Comments 1
5 min read
Event Sourcing as a creative tool for engineers

1
Comments
5 min read