#spark
Posts
A glimpse into the future of data processing infrastructure.
Kostas Pardalis
May 2 '24
#database #bigdata #snowflake #spark
9 min read
Learning Spark 2.0 Knowledge Dump
OdyAsh
Apr 29 '24
#spark #dataengineering #learning #python
3 min read
How to connect Spark and S3 for file processing
Carlos Filho for AWS Community Builders
Apr 19 '24
#aws #awscommunitybuilders #spark #s3
4 reactions
13 min read
Predicate Pushdown - Understanding Practically With An Example
Aniketh Deshpande
Apr 17 '24
#spark #optimisation #sql #interview
4 reactions
1 comment
2 min read
Template for design document of Apache Spark project
Pankaj
Apr 2 '24
#spark #pyspark
1 min read
Spark Associate Developer Certification Guide
Labinot Vila
Mar 19 '24
#spark #certification #databricks #developer
1 comment
3 min read
Embarking on the Data Odyssey: A Deep Dive into Data Engineering for Tech Enthusiasts
Gaurav Vishwakarma
Feb 24 '24
#dataengineering #data #python #spark
3 min read
Different file formats, a benchmark doing basic operations
Pedro H Goncalves
Mar 10 '24
#dataengineering #spark #benchmark #datascience
10 reactions
2 comments
9 min read
Enhancing Data Security with Spark: A Guide to Column-Level Encryption - Part 1
Mostefa Brougui for AWS Community Builders
Mar 8 '24
#aws #spark #encryption #dataprotection
3 reactions
5 min read
GroupBy and Join in Spark
Cris Crawford
Mar 4 '24
#spark #sql #dataengineering
3 reactions
2 min read
Configuring and using Hadoop and Spark on Ubuntu 22.04 LTS (with Canada 2021 Census data)
Jordan Bell
Feb 14 '24
#tutorial #bash #ubuntu #spark
16 min read
An Introduction to Hive UDFs with Scala
Omer Farooq Ahmed
Dec 14 '23
#hive #spark #scala
2 reactions
1 comment
5 min read
BigData Journey from Hadoop and MapReduce to AWS EMR
Olga Woschitz
Nov 21 '23
#bigdata #emr #spark #hadoop
9 min read
Running Jobs on Athena Spark
elliott cordo for AWS Heroes
Oct 13 '23
#aws #awsbigdata #spark #dataengineering
3 reactions
2 min read
Spark on AWS Glue: Performance Tuning 4 (Spark Join)
Tomoya Oda
Jul 16 '23
#aws #glue #spark #performance
2 reactions
2 min read
Spark on AWS Glue: Performance Tuning 2 (Glue DynamicFrame vs Spark DataFrame)
Tomoya Oda
Jul 16 '23
#aws #glue #spark #performance
4 reactions
2 min read
Spark on AWS Glue: Performance Tuning 1 (CSV vs Parquet)
Tomoya Oda
Jul 16 '23
#aws #glue #spark
1 reaction
4 min read
A new Kedro dataset for Spark Structured Streaming
Juan Luis Cano Rodríguez for Kedro
Jul 12 '23
#python #kedro #spark #streaming
1 reaction
7 min read
Apache Spark and Hadoop Monitoring in Grafana via Graphite
Nurhak Şentürk
Jun 21 '23
#grafana #graphite #spark #hadoop
2 reactions
8 min read
Debug long running Spark job
anhcodes
May 31 '23
#spark
10 min read
Using pyspark to stream data from coingecko API and visualise using dash
James
Jun 18 '23
#spark #python #dataengineering #codenewbie
3 reactions
6 min read
Flatten Map Spark Python
Ivan G
May 15 '23
#spark #pyspark
6 min read
Creating a Election Monitoring System Using MongoDB, Spark, Twilio SMS Notifications, and Dash
Stefen
Jun 13 '23
#data #spark #dash #mongodb
10 min read
Build an Open Source LakeHouse with minimun code effort (Spark + Hudi + DBT+ Hivemetastore + Trino)
Michael Martin
Jun 9 '23
#spark #lakehouse #dbt #hudi
1 reaction
1 comment
8 min read
Bulk load to Elastic Search with PySpark
Valery C. Briz
Jun 1 '23
#elasticsearch #spark #pyspark #bigdata
7 reactions
2 min read