Forem

# dataengineering

Posts

Amazon S3 Files: from Kafka to S3 via NFS

2
Comments 1
11 min read
50,000 Cells. One Network. How Do You Know Which One Is Quietly Breaking?

1
Comments
5 min read
Snowflake vs Redshift vs BigQuery: Which One Should You Use?

1
Comments
4 min read
Getting Data from Different Sources in Power BI

Comments
13 min read
5 Data Engineering Techniques That Increased Our LLM Efficiency by 70%

Comments
1 min read
A Day in the Life of a Data Engineer (Real Talk, No Filter)

Comments
5 min read
Monitoring and Observability for Real-Time Streaming Pipelines

Comments
10 min read
AI Won't Stop Itself From Being Stupid - That's YOUR Job

Comments
8 min read
Apache Data Lakehouse Weekly: March 10–17, 2026

1
Comments
8 min read
Scaling multi-node GPU data pipelines using Dask on Kubernetes

1
Comments
10 min read
AI Citation Registries and Structured Record Requirements for AI Interpretation

1
Comments
3 min read
Soda Moved to ELv2. Provero Is Apache 2.0.

1
Comments
3 min read
Credit data is messier than equity data, always

1
Comments 1
7 min read
Building with Snowflake Cortex Analyst — What I Learned About Semantic Layers and Guardrails

Comments
2 min read
🚀 Apache Spark Just Killed the Microbatch Barrier (And Why Flink Should Be Worried)

1
Comments
3 min read