Forem

# bigdata

Posts

The Data Refinery: Why Apache Spark is the Engine Behind Real-World Big Data Use Cases

Comments
2 min read
I Built a Knowledge Base That Thinks — Inspired by Karpathy’s LLM Wiki

5
Comments
6 min read
Processing High Frequency Solar Data Without HPC: Real Constraints and Design Decisions in MackSun

Comments
3 min read
ETL vs ELT: Which One Should You Use and Why?

Comments
5 min read
Infrastructure Design for Credit Risk Modeling

Comments
1 min read
CONNECTING POSTGRESQL WITH POWERBI (FOR A LOAN PERFORMANCE DASHBOARD)

1
Comments
4 min read
ETL vs ELT: Which One Should You Use and Why?

3
Comments
5 min read
Mastering Schema Evolution: Why Apache Avro is the King of Big Data (Part 2)

Comments
3 min read
B-Trees, Clustered Indexes, and the OLAP Revolution (Part 2) 📊

Comments
3 min read
Growing with the Community: Zhang Shenghang’s Path to Apache SeaTunnel PMC Member

Comments
4 min read
The Hidden Costs of Idle EMR Clusters (And How to Stop the Bleed)

1
Comments
3 min read
(5) When Your Data Warehouse Breaks Down, It’s Probably a Naming Problem

Comments
5 min read
Understanding Data Modeling in Power BI: Joins, Relationships, and Schemas Explained.

Comments
3 min read
PySpark : The Big Brain of Data Processing

3
Comments
5 min read
Modernizing Data Movement for the AI-Ready Enterprises

1
Comments
4 min read