Forem

# bigdata

Posts

Connecting Multiple Kafka Clusters in ClickHouse Using Named Collections

9
Comments
3 min read
SPL computing performance test series: associate tables and wide table

Comments
6 min read
Leveraging AI in Education: Exploring Big Data and Related Applications

Comments
11 min read
GlusterFS vs. JuiceFS

Comments
7 min read
50%+ Cut in Both Storage & Compute Costs: Designing NetEase Games' Cloud Big Data Platform

Comments
9 min read
What is '_spark_metadata' Directory in Spark Structured Streaming?

2
Comments
1 min read
SQL is consuming the lives of data scientists

5
Comments 3
20 min read
⛏ Get Mining into Data with These Top 5 Resources

5
Comments 2
6 min read
Data warehouse with “no house” performs better than the one with “the house”

1
Comments
11 min read
Is Your Latest Data Really the Latest? Check the Data Update Mechanism of Your Database

2
Comments 1
6 min read
Introduction to Big-data

2
Comments 2
3 min read
The performance problems of data warehouse and solutions

Comments
14 min read
Snowflake: Revolutionizing data warehousing

3
Comments 1
6 min read
Listen to That Poor BI Engineer: We Need Fast Joins

Comments
5 min read
Data warehouse running on file system

1
Comments
9 min read
Apache Doris 2.0 Beta Now Available: Faster, Stabler, and More Versatile

Comments
15 min read
3 Data Observability Tools

Comments
3 min read
Spark AI - Bringing Chat GPT to Data Engineering

13
Comments 1
5 min read
Why Are There So Many Snapshot Tables in BI Systems?

5
Comments
9 min read
Why does wide table prevail?

5
Comments
13 min read
Next Big Data System

Comments
1 min read
Open-source SPL: The Breaker of Closed Database Computing System

Comments 1
8 min read
Bulk load to Elastic Search with PySpark

7
Comments
2 min read
Routable computing engine implements front-end database

Comments
5 min read
How does the in-memory database bring memory’s advantage into play?

Comments
12 min read