DEV Community

# bigdata

Posts

Metadata for win — Apache Parquet

Comments
5 min read
Comprehensive Guide to Schema Inference with MongoDB Spark Connector in PySpark

Comments
3 min read
Advanced Insights into Automated Data Processing Tools

1
Comments
4 min read
How to Build an API with Strong Security Measures

Comments
4 min read
Documenting Rate Limits and Throttling in REST APIs

Comments
5 min read
GraphQL API Design Best Practices for Efficient Data Management

1
Comments
5 min read
The current Lakehouse is like a false proposition

6
Comments 1
10 min read
Is distributed technology the panacea for big data processing?

7
Comments 1
10 min read
Big Data: the tool we need.

Comments
2 min read
PySpark: missing value

Comments
2 min read
Cross-cluster replication for read-write separation

2
Comments
4 min read
Stream Data at scale from millions of sources with Amazon Kinesis (Serverless)

13
Comments
7 min read
Trino & Iceberg Made Easy: A Ready-to-Use Playground

22
Comments
3 min read
The Role of Data Integration in Healthcare Research and Precision Medicine

Comments 1
4 min read
Automating Data Processes for Efficiency and Accuracy

Comments
5 min read
Auto-increment columns in Apache Doris

Comments
11 min read
What to use: Parquet or CSV?

22
Comments
3 min read
Accelerating ETL Processes for Timely Business Intelligence

Comments
4 min read
Are There “Queries over Trillion-Row Tables in Seconds”? Is “N-Times Faster Than ORACLE” an Exaggeration?

Comments
4 min read
A glimpse into the future of data processing infrastructure.

Comments
9 min read
Safeguarding Data Quality By Addressing Data Privacy and Security Concerns

1
Comments 1
4 min read
Best Practices for Designing an Efficient ETL Pipeline

5
Comments
4 min read
The Role of Big Data Analytics in BFSI: Leveraging Data for Competitive Advantage

Comments
4 min read
LLMs, DevOps, and Big Data Musings

Comments
3 min read
Understanding and Mitigating Message Loss in Apache Kafka

17
Comments
9 min read