Forem

# dataengineering

Posts

Apache Data Lakehouse Weekly: February 26 – March 5, 2026

1
Comments
6 min read
🚀 Python ETL Project with Public Weather Data + MySQL in the Cloud

Comments
2 min read
Top 5 Snowflake Data Ingestion Tools in 2026 (Compared & Reviewed)

Comments
9 min read
How Linux Powers Real-World Data Engineering

2
Comments
14 min read
Databricks SQL Essentials - Array Data Type

Comments
6 min read
Taking Action on your GCP bill: Automating BigQuery Storage Cleanup

8
Comments
5 min read
Dynamic Selector Fallbacks: How to Scrape E-commerce Sites That Change Frequently

Comments
5 min read
Monitoring Share of Search: Automating IKEA Product Visibility Tracking

Comments
5 min read
Efficient Parallelism in Python: A Practical Guide to concurrent.futures module

Comments
5 min read
Is AWS Glue Data Catalog Sufficient as a Data Catalog? Organizing Its Design, Limitations, and Complementary Strategies

7
Comments
10 min read
🤖 Feature Pipeline — Where Your Raw Data Becomes AI Fuel🤖

13
Comments
2 min read
The Vinted Arbitrage War: Building a Scraper That Doesn't Get IP-Banned

Comments 1
9 min read
Building a Real-Time Data Pipeline: Streaming TCP Socket Data to PostgreSQL with Node.js

Comments
3 min read
I built pq - the jq of Parquet. Here's why data engineers need a better CLI

2
Comments
1 min read
The Ultimate Databricks Data Engineer Associate Exam Guide for AWS Engineers

1
Comments
45 min read