#dataengineering
ClickHouse Has a Free Column-Oriented Database — Query Billions of Rows in Milliseconds
Alex Spinov
Mar 27
#database #dataengineering #analytics #opensource
2 min read
How Linux is Used in Real-World Data Engineering
Frederick M
Mar 27
#cli #dataengineering #linux #tutorial
3 min read
100 Spark Interview Questions for Data Engineer
Hannah Usmedynska
Mar 27
#career #dataengineering #interview #resources
1 reaction
11 min read
Flowfile v0.8.0 — Your Flows Can Run Themselves Now
Edwardvaneechoud
Mar 26
#showdev #automation #dataengineering #tooling
4 min read
Apache Data Lakehouse Weekly: March 20–27, 2026
Alex Merced
Mar 27
#news #community #dataengineering #opensource
7 min read
Frosty: 150+ AI Open Source Sub-Agents to Automate Snowflake
Priyank Malviya
Mar 26
#ai #agents #opensource #dataengineering
2 min read
When Synthetic Data Lies: A Hidden Correlation Problem I Didn’t Expect
Mohamed Hussain S
Mar 26
#dataengineering #clickhouse #analytics #debugging
3 reactions
3 min read
Building & Monitoring Data Backends: Tools, Architecture, and Observability
soy
Mar 26
#database #sql #dataengineering
4 min read
Issues of Multi-GB Spreadsheets in Data Lakes
Toby Patrick
Mar 26
#data #dataengineering #performance
4 min read
Asset-Based Data Orchestration: Lessons from Building a Multi-State Social Data Platform
uninterrupted
for u11d
Mar 26
#dagster #dataorchestration #dataengineering
1 reaction
6 min read
The Backyard Quarry, Part 2: Designing a Schema for Physical Objects
Ken W Alger
Mar 26
#dataengineering #softwareengineering #backendengineering #datamodeling
2 reactions
5 min read
The Vinted Arbitrage War: Building a Scraper That Doesn't Get IP-Banned
KazKN
Mar 25
#webdev #python #dataengineering #tutorial
9 min read
Is AWS Glue Data Catalog Sufficient as a Data Catalog? Organizing Its Design, Limitations, and Complementary Strategies
Aki
for AWS Community Builders
Mar 25
#aws #dataengineering
6 reactions
10 min read
I built pq - the jq of Parquet. Here's why data engineers need a better CLI
Evgenii Orlov
Mar 25
#rust #cli #dataengineering #opensource
1 reaction
1 min read
How to Build a Scalable Serverless Social Media Ingestion & Analytics Pipeline on AWS
CapeStart
Mar 26
#aws #serverless #dataengineering
1 reaction
4 min read