#bigdata
Metadata for win — Apache Parquet
Rahul Dubey | May 25 '24 | 5 min read
#python #bigdata #datascience #dataengineering
Comprehensive Guide to Schema Inference with MongoDB Spark Connector in PySpark
Chetan Gupta | Jun 27 '24 | 3 min read
#pyspark #bigdata #mongodb #spark
Advanced Insights into Automated Data Processing Tools
Data Expertise | Jun 16 '24 | 4 min read | 1 reaction
#automateddataprocessing #machinelearning #bigdata #datascience
How to Build an API with Strong Security Measures
Ovais | Jun 12 '24 | 4 min read
#api #bigdata #datascience #datamanagement
Documenting Rate Limits and Throttling in REST APIs
Ovais | Jun 12 '24 | 5 min read
#api #bigdata #datamanagement #datascience
GraphQL API Design Best Practices for Efficient Data Management
Ovais | Jun 12 '24 | 5 min read | 1 reaction
#api #datamanagement #bigdata #graphql
The current Lakehouse is like a false proposition
Judy | Jun 12 '24 | 10 min read | 6 reactions | 1 comment
#lackhouse #bigdata #development #programming
Is distributed technology the panacea for big data processing?
Judy | Jun 6 '24 | 10 min read | 7 reactions | 1 comment
#bigdata #processing #development #lauguage
Big Data: the tool we need.
Delmiro Ribeiro | May 26 '24 | 2 min read
#bigdata #database #datascience #backend
PySpark: missing value
ChelseaLiu0822 | Apr 18 '24 | 2 min read
#pyspark #python #dataengineering #bigdata
Cross-cluster replication for read-write separation
Apache Doris | May 21 '24 | 4 min read | 2 reactions
#database #bigdata #dataengineering #tutorial
Stream Data at scale from millions of sources with Amazon Kinesis (Serverless)
Asanka Boteju | May 20 '24 | 7 min read | 13 reactions
#bigdata #kinesis #aws #strems
Trino & Iceberg Made Easy: A Ready-to-Use Playground
ChunTing Wu | May 20 '24 | 3 min read | 22 reactions
#bigdata #datascience #tutorial #dataengineering
The Role of Data Integration in Healthcare Research and Precision Medicine
Ovais | May 13 '24 | 4 min read | 1 comment
#dataintegration #healthcare #datascience #bigdata
Automating Data Processes for Efficiency and Accuracy
Ovais | May 8 '24 | 5 min read
#dataextraction #bigdata #datamanagement #datascience
Auto-increment columns in Apache Doris
Apache Doris | May 8 '24 | 11 min read
#database #dataegnineering #tutorial #bigdata
What to use parquet or CSV?
Hitesh | May 7 '24 | 3 min read | 22 reactions
#datascience #database #python #bigdata
Accelerating ETL Processes for Timely Business Intelligence
Ovais | May 7 '24 | 4 min read
#changedatacapture #bigdata #datamanagement #datascience
Are There “Queries over Trillion-Row Tables in Seconds”? Is “N-Times Faster Than ORACLE” an Exaggeration?
jbx1279 | Apr 13 '24 | 4 min read
#sql #performance #bigdata #database
A glimpse into the future of data processing infrastructure.
Kostas Pardalis | May 2 '24 | 9 min read
#database #bigdata #snowflake #spark
Safeguarding Data Quality By Addressing Data Privacy and Security Concerns
Ovais | Apr 30 '24 | 4 min read | 1 reaction | 1 comment
#datascience #bigdata #datamanagement #datamigration
Best Practices for Designing an Efficient ETL Pipeline
Ovais | Apr 30 '24 | 4 min read | 5 reactions
#etl #datascience #bigdata #datamanagement
The Role of Big Data Analytics in BFSI: Leveraging Data for Competitive Advantage
Ajay | Mar 27 '24 | 4 min read
#bigdata #bfsi #data #analytics
LLMs, DevOps, and Big Data Musings
bfuller | Apr 25 '24 | 3 min read
#devops #llm #ai #bigdata
Understanding and Mitigating Message Loss in Apache Kafka
Yusen Meng | Apr 25 '24 | 9 min read | 17 reactions
#bigdata #datareliability #streamprocessing #distributed