Forem

Data Science

Data science lets us extract meaning from data and interpret it.

Posts

Entity Resolution on 208,000 Real Records with the Golden Suite

Comments
7 min read
Why I Built a 3,200-Line Python Pipeline to Generate Synthetic Financial Data From Math -- Not AI

Comments
5 min read
Confusion Matrix, Precision, Recall, and F1: A Practical Medical Screening Guide

Comments
3 min read
Understanding the Data Science Lifecycle: From messy data to real-world impact – a step-by-step journey

1
Comments
5 min read
From Dirty CSV to Golden Records: A Python Walkthrough

Comments
10 min read
Thursday: April 9 - Visual AI Agents Workshop

1
Comments
1 min read
📘 Master Note: The Hidden Mechanics of PCA & ICA

Comments
2 min read
Why I Build Healthcare AI That Clinicians Actually Use

Comments
2 min read
The ‘Missing Middle’ of Data Processing in Java (10M Rows in ~40s)

7
Comments 1
3 min read
Monte Carlo Simulation in 5 Minutes: From Zero to Confidence Intervals in One API Call

Comments
7 min read
Generate better Synthetic Data for Fine-Tuning with Skillware

Comments
1 min read
End-to-End Data Ingestion in Power BI: Connecting and Preparing Data from Multiple Sources

Comments
5 min read
QuillSort — A data sorter

Comments
4 min read
PageRank, Louvain, and Shortest Path — Without Deploying Neo4j

Comments
6 min read
How I Used Python to Analyze S&P 500 Returns Since 1928

Comments
4 min read