Forem: Anshika

My Deep Learning Journey with Andrew Ng

Anshika — Fri, 25 Jul 2025 12:20:23 +0000

Just Completed: Neural Networks and Deep Learning on Coursera!

I'm excited to share that I've just finished the Neural Networks and Deep Learning course on Coursera as part of the Deep Learning Specialization. This foundational course has been an incredible journey into the world of AI and machine learning!

What I Learned

Core Concepts Mastered:

Neural Network Fundamentals: Understanding perceptrons, multi-layer networks, and the mathematical foundations behind them
Forward and Backward Propagation: Implementing the core algorithms that make neural networks learn
Activation Functions: Exploring sigmoid, tanh, ReLU, and their impact on network performance
Gradient Descent Optimization: Understanding how networks minimize cost functions
Deep Neural Networks: Building and training networks with multiple hidden layers
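
To make the activation functions above concrete, here is a minimal NumPy sketch of sigmoid, tanh, and ReLU together with their derivatives. The function names are my own for illustration and are not taken from the course notebooks.

```python
import numpy as np

def sigmoid(z):
    # Squashes any real input into (0, 1); often used on the output layer
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def tanh_grad(z):
    # Derivative of tanh; np.tanh itself provides the forward pass
    return 1.0 - np.tanh(z) ** 2

def relu(z):
    # Passes positive inputs through unchanged and zeroes out the rest
    return np.maximum(0.0, z)

def relu_grad(z):
    # Convention used here: gradient 0 at z == 0
    return (z > 0).astype(float)

z = np.array([-2.0, 0.0, 2.0])
print("sigmoid:", sigmoid(z), "grad:", sigmoid_grad(z))
print("tanh:   ", np.tanh(z), "grad:", tanh_grad(z))
print("relu:   ", relu(z), "grad:", relu_grad(z))
```

Sigmoid and tanh flatten out for large |z|, so their gradients shrink there, while ReLU keeps a gradient of 1 for every positive input.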

Hands-On Experience:

Implemented neural networks from scratch using Python and NumPy
Built binary and multi-class classification models
Worked with real datasets to solve practical problems
Optimized network architectures and hyperparameters
Developed intuition for debugging neural network performance

Key Takeaways

Mathematical Foundation Matters: The course emphasized understanding the underlying math rather than just using black-box libraries. This deep dive into linear algebra, calculus, and probability has given me a solid foundation for more advanced topics.

Implementation from Scratch: Writing forward and backward propagation algorithms manually was challenging but incredibly valuable. It demystified how popular frameworks like TensorFlow and PyTorch work under the hood.
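
As an illustration of what writing it manually can look like, here is a small sketch of forward and backward propagation for a network with one tanh hidden layer and a sigmoid output. The toy XOR data, layer size, learning rate, and step count are assumptions chosen for the example, not values from the course assignments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed for illustration): XOR, with one example per column
X = np.array([[0, 0, 1, 1],
              [0, 1, 0, 1]], dtype=float)
Y = np.array([[0, 1, 1, 0]], dtype=float)
m = X.shape[1]

n_hidden = 4
W1 = rng.normal(scale=0.5, size=(n_hidden, 2))
b1 = np.zeros((n_hidden, 1))
W2 = rng.normal(scale=0.5, size=(1, n_hidden))
b2 = np.zeros((1, 1))

def forward(X):
    # Hidden layer uses tanh, output layer uses sigmoid
    A1 = np.tanh(W1 @ X + b1)
    A2 = 1.0 / (1.0 + np.exp(-(W2 @ A1 + b2)))
    return A1, A2

def cost(A2):
    # Binary cross-entropy averaged over the examples
    eps = 1e-12
    return float(-np.mean(Y * np.log(A2 + eps) + (1 - Y) * np.log(1 - A2 + eps)))

lr = 0.5
losses = []
for step in range(2000):
    A1, A2 = forward(X)
    losses.append(cost(A2))

    # Backward pass: apply the chain rule layer by layer
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m
    db2 = dZ2.mean(axis=1, keepdims=True)
    dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)  # tanh'(z) = 1 - tanh(z)^2
    dW1 = dZ1 @ X.T / m
    db1 = dZ1.mean(axis=1, keepdims=True)

    # Gradient descent update
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2

print(f"cost at start: {losses[0]:.4f}, cost at end: {losses[-1]:.4f}")
```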

Hyperparameter Tuning is an Art: Learning when to adjust learning rates, choose different activation functions, or modify network architecture based on performance metrics was eye-opening.

Practical Projects

Some highlights from the programming assignments:

Logistic Regression as a Neural Network: Understanding how simple logistic regression connects to neural network concepts
Planar Data Classification: Building a network to classify non-linearly separable data
Deep Neural Network Application: Creating a multi-layer network for image recognition tasks

What's Next?

This course is just the beginning! I'm planning to:

Continue with the rest of the Deep Learning Specialization
Apply these concepts to personal projects
Explore computer vision and NLP applications
Contribute to open-source ML projects

For Fellow Learners

If you're considering this course, here's my advice:

Don't skip the math: Even if it seems daunting, understanding the mathematical foundations pays off
Code along actively: Don't just watch the videos - implement everything yourself
Experiment beyond assignments: Try different parameters and see how they affect results
Join study groups: The discussion forums are incredibly helpful

Resources That Helped Me

Andrew Ng's clear explanations and intuitive examples
The programming assignments with detailed starter code
Supplementary reading on linear algebra and calculus
Community discussions and peer interactions

The field of deep learning is evolving rapidly, and this course has given me the foundational knowledge to keep learning and growing. Excited to see where this journey takes me next!

Course: Neural Networks and Deep Learning

Instructor: Andrew Ng

Platform: Coursera

Building a Breast Cancer Prediction App with Machine Learning and Streamlit

Anshika — Mon, 07 Jul 2025 12:02:15 +0000

Medical AI is revolutionizing healthcare, and machine learning models are becoming powerful tools for early disease detection. In this comprehensive tutorial, I'll walk you through building a complete breast cancer prediction system using the Wisconsin Breast Cancer dataset.

What We'll Build

By the end of this tutorial, you'll have:

A fully trained logistic regression model for cancer prediction
An interactive Streamlit web application
Comprehensive exploratory data analysis
A complete GitHub repository ready for deployment

Live Demo: Streamlit app

GitHub Repository: House Price Prediction

Understanding the Dataset

The Wisconsin Breast Cancer dataset contains 569 samples with 30 features each, computed from digitized images of breast mass fine needle aspirates. Each sample is classified as either:

Benign (B): Non-cancerous tumor
Malignant (M): Cancerous tumor

🔧 Setting Up the Environment

First, let's set up our development environment:

# Create virtual environment
python -m venv breast_cancer_env
source breast_cancer_env/bin/activate  # On Windows: breast_cancer_env\Scripts\activate

# Install required packages
pip install pandas numpy scikit-learn matplotlib seaborn streamlit plotly joblib

Exploratory Data Analysis

The first step in any machine learning project is understanding your data. Here's what we discovered:

Key Insights:

Dataset Balance: ~63% benign, ~37% malignant cases
Feature Correlations: Strong correlations between mean, SE, and worst values of the same measurements
Distinguishing Features: concave_points_worst, perimeter_worst, and concave_points_mean show the highest correlation with malignancy

Visualization Highlights:

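import matplotlib.pyplot as plt
import seaborn as sns

# Target variable distribution
df['diagnosis'].value_counts().plot(kind='bar')
plt.title('Distribution of Diagnosis')
plt.show()

# Correlation matrix (numeric columns only, since 'diagnosis' is still 'B'/'M' here)
sns.heatmap(df.corr(numeric_only=True), annot=False, cmap='coolwarm')
plt.title('Feature Correlation Matrix')
plt.show()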

Building the Machine Learning Model

Data Preprocessing

from sklearn.preprocessing import StandardScaler

# Convert diagnosis to binary
df['diagnosis'] = df['diagnosis'].map({'B': 0, 'M': 1})

# Separate features and target
X = df.drop(['diagnosis', 'id'], axis=1)
y = df['diagnosis']

# Feature scaling
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

Model Training

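from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.3, random_state=42)

# Train logistic regression
lr = LogisticRegression()
lr.fit(X_train, y_train)

# Evaluate on the held-out test set
y_pred = lr.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')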
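
One thing to watch in the snippets above: the scaler is fitted on the full feature matrix before the train/test split, so test-set statistics leak into preprocessing. A common alternative, sketched here, is to wrap the scaler and the classifier in a scikit-learn Pipeline so the scaler only sees training rows. This sketch assumes scikit-learn's bundled copy of the dataset (load_breast_cancer) rather than the CSV used in the post. Note that the bundled copy encodes 0 = malignant and 1 = benign, the opposite of the mapping used above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Bundled dataset (assumption: stands in for the CSV used in this post)
X, y = load_breast_cancer(return_X_y=True)

# Split first, keeping the class ratio the same in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

# The pipeline fits the scaler on the training rows only
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

The fitted pipeline can be saved with joblib as a single object, so the app applies the same scaling at prediction time.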

Model Performance

Our logistic regression model achieved impressive results:

Accuracy: 98%
Precision: High precision for both classes
Recall: Excellent recall for malignant cases
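
For readers who want to see how these metrics are computed, here is a small NumPy sketch that derives accuracy, precision, and recall from the confusion-matrix counts. The labels below are made up for illustration (1 = malignant) and are not the model's actual predictions.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Confusion-matrix counts, treating 1 as the positive (malignant) class
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Of the cases flagged malignant, how many really were?
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of the truly malignant cases, how many did we catch?
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Hypothetical labels for illustration only
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a screening setting, recall on the malignant class usually deserves the closest look, because a false negative is a missed cancer.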

Medical Disclaimer & Ethics

Important: This application is for educational purposes only. Key considerations:

Always consult qualified healthcare professionals
AI should augment, not replace, medical expertise
Consider bias in training data
Ensure patient data privacy and security
Regular model retraining and validation

Deployment Options

Local Development

streamlit run app.py

Streamlit Cloud

Push code to GitHub
Connect repository to Streamlit Cloud
Deploy with one click

Docker Deployment

FROM python:3.9-slim
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8501
CMD ["streamlit", "run", "app.py"]

Future Enhancements

Model Improvements

Ensemble Methods: Random Forest, XGBoost
Deep Learning: Neural networks for complex patterns
Feature Engineering: Automated feature selection

Application Features

Multi-language Support: Reach global healthcare providers
API Integration: Connect with hospital systems
Mobile App: Native iOS/Android applications
Real-time Monitoring: Track model performance

Advanced Analytics

Explainable AI: SHAP values for feature importance
Uncertainty Quantification: Confidence intervals
Bias Detection: Fairness across demographic groups

Key Takeaways

Data Quality Matters: Clean, well-preprocessed data is crucial
Model Simplicity: Logistic regression can be highly effective
User Experience: Medical applications need intuitive interfaces
Validation is Critical: Rigorous testing ensures reliability
Ethical Considerations: Always prioritize patient safety

Technical Stack Summary

Data Science: pandas, numpy, scikit-learn
Visualization: matplotlib, seaborn, plotly
Web Framework: Streamlit
Deployment: Streamlit Cloud, Docker
Version Control: Git, GitHub

Conclusion

Building this breast cancer prediction system taught me the importance of combining technical excellence with ethical responsibility. Machine learning in healthcare requires not just accurate models, but also thoughtful user experience design and careful consideration of real-world implications.

The project demonstrates how modern tools like Streamlit can democratize AI deployment, making sophisticated machine learning models accessible to healthcare professionals without extensive technical backgrounds.

Remember: the goal isn't to replace medical professionals, but to provide them with powerful tools that can help save lives through early detection and improved diagnosis accuracy.

Have you built similar healthcare ML applications? What challenges did you face? Share your experiences in the comments below!

If you found this helpful, please give it a ❤️ and consider following for more AI and machine learning content!

Building My First End-to-End Machine Learning Project

Anshika — Sun, 06 Jul 2025 19:42:39 +0000

A complete journey from data to deployment with Python, Scikit-learn, and Streamlit

Introduction

As a budding data scientist, I wanted to create a comprehensive machine learning project that showcases the entire ML pipeline - from data preprocessing to model deployment. Today, I'm excited to share my House Price Prediction project that predicts real estate prices using machine learning!

Live Demo: Streamlit app

GitHub Repository: House Price Prediction

Project Overview

This project predicts house prices based on various features like:

Median income in the area
House age and size characteristics
Population and demographic data
Geographic location

The goal was to build a real-world applicable model with a user-friendly interface that anyone can use to get instant price predictions.

Tech Stack

Python: Core programming language
Scikit-learn: Machine learning algorithms
Streamlit: Web application framework
Pandas & NumPy: Data manipulation
Matplotlib & Seaborn: Data visualization
Plotly: Interactive charts

The Dataset

I used the California Housing Dataset containing 20,640 samples with features like:

Median income
House age
Average rooms/bedrooms
Population density
Geographic coordinates

This dataset is perfect for learning because it's:

Real-world data
Clean and well-structured
Sufficient size for training
Interpretable features

Key Steps in My ML Pipeline

1. Exploratory Data Analysis (EDA)

First, I dove deep into understanding the data:

# Check data distribution
plt.figure(figsize=(15, 10))
df.hist(bins=30, alpha=0.7)
plt.suptitle('Feature Distributions')
plt.show()

# Correlation analysis
correlation_matrix = df.corr()
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm')

Key Insights:

Median income has the strongest correlation with price (0.69)
Location (latitude/longitude) significantly impacts pricing
House age has a moderate negative correlation

2. Feature Engineering

I created three new features to improve model performance:

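# Engineer new features
df['rooms_per_household'] = df['AveRooms'] / df['AveOccup']
# Note: scikit-learn's California Housing frame has no 'HouseHolds' column
# (AveOccup is already population per household), so this line assumes a
# household-count column was added to df beforehand.
df['population_per_household'] = df['Population'] / df['HouseHolds']
df['bedrooms_per_room'] = df['AveBedrms'] / df['AveRooms']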

These engineered features provided better insights into housing quality and density.

3. Data Preprocessing

from sklearn.preprocessing import StandardScaler

# Handle outliers using IQR method: keep rows within 1.5 * IQR of the quartiles
Q1 = df['price'].quantile(0.25)
Q3 = df['price'].quantile(0.75)
IQR = Q3 - Q1
df_clean = df[(df['price'] >= Q1 - 1.5*IQR) & (df['price'] <= Q3 + 1.5*IQR)]

# Scale features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

4. Model Training & Evaluation

I chose Linear Regression for interpretability:

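import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

model = LinearRegression()
model.fit(X_train_scaled, y_train)

# Predict on the held-out set (X_test_scaled is assumed to be the scaled test split)
y_pred = model.predict(X_test_scaled)

# Evaluate performance
test_r2 = r2_score(y_test, y_pred)
test_rmse = np.sqrt(mean_squared_error(y_test, y_pred))

print(f"R² Score: {test_r2:.4f}")
# Prices are in units of $100,000, so multiplying by 100 reports thousands of dollars
print(f"RMSE: ${test_rmse*100:.0f}k")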

Model Performance:

R² Score: 0.60 (explains 60% of price variance)
RMSE: ~$68k
MAE: ~$50k
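
To show where these three numbers come from, here is a short NumPy sketch that computes R², RMSE, and MAE by hand. The values are made up for illustration. It assumes prices are stored in units of $100,000, as in scikit-learn's California Housing target, which is why the errors are multiplied by 100 to report thousands of dollars.

```python
import numpy as np

def regression_report(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)                  # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean(residuals ** 2))
    mae = np.mean(np.abs(residuals))
    return r2, rmse, mae

# Hypothetical prices in units of $100,000
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
r2, rmse, mae = regression_report(y_true, y_pred)

print(f"R²: {r2:.2f}")
print(f"RMSE: ${rmse * 100:.1f}k")
print(f"MAE: ${mae * 100:.1f}k")
```

RMSE is never smaller than MAE because squaring weights large errors more heavily, which matches the gap between the ~$68k and ~$50k figures above.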

Building the Web Application

The most exciting part was creating an interactive web app using Streamlit.

App Features:

Interactive sliders for all input features
Real-time predictions with instant results
Visualization of results and comparisons
Feature importance explanations
Mobile-responsive design

Results & Insights

Model Performance

Explains about 60% of the variance in house prices (R² ≈ 0.60; R² measures explained variance, not accuracy)
Identifies median income as the strongest price predictor
Location factors (lat/long) significantly impact pricing
Engineered features improved model performance by 5%

Key Learnings

Feature engineering can significantly boost model performance
Data visualization is crucial for understanding patterns
Model interpretability is as important as accuracy
User experience matters in ML applications

Future Improvements

Advanced Algorithms: Implement Random Forest, XGBoost
Hyperparameter Tuning: Use GridSearchCV for optimization
Cross-Validation: Implement k-fold cross-validation
Real-time Data: Integrate with real estate APIs
Model Monitoring: Add performance tracking
Cloud Deployment: Deploy on AWS/GCP for scalability

Lessons Learned

Technical Lessons

Data quality is more important than model complexity
Feature engineering often beats algorithm selection
Model interpretability is crucial for business applications
User interface design significantly impacts adoption

Project Management

Documentation is essential for portfolio projects
Version control (Git) saves time and prevents disasters
Modular code makes debugging and improvements easier
Testing with sample data prevents deployment issues

Impact on My Learning Journey

This project has significantly enhanced my skills in:

End-to-end ML pipeline development
Data preprocessing and feature engineering
Model evaluation and interpretation
Web application development
Project documentation and presentation
Version control and collaboration

I'd love to hear your thoughts! Please:

Star the GitHub repository if you find it useful
Comment with suggestions or questions
Share if you think others might benefit

What's your first ML project story? Share in the comments below! 👇

This post chronicles my journey building my first complete ML project. The code, data, and live demo are all available for you to explore, learn from, and build upon. Happy coding!

Unsupervised Learning Finally Makes Sense – My Journey Through ML Course 3

Anshika — Sat, 05 Jul 2025 17:29:25 +0000

Hey everyone!

I'm so happy to share that I’ve officially completed the entire Machine Learning Specialization by Andrew Ng on Coursera — a journey that’s helped me build a solid foundation in both core ML theory and hands-on application.

This was the third and final course in the series, titled:

“Unsupervised Learning, Recommenders, Reinforcement Learning”

by DeepLearning.AI & Stanford University

What This Final Course Covered

This last course introduced some really exciting and practical machine learning areas that go beyond supervised learning:

Unsupervised Learning
- K-Means Clustering
- Anomaly Detection
- Principal Component Analysis (PCA)
Recommender Systems
- Content-based filtering
- Collaborative filtering with matrix factorization
Introduction to Reinforcement Learning (theoretical only)
- What RL is and how it differs from supervised/unsupervised learning
- High-level applications like robotics and game-playing agents

Although reinforcement learning wasn’t covered in depth (no coding for it), it was a great introduction to the concept and its use cases.

Concepts That Stuck With Me

Unsupervised learning helps uncover hidden patterns in unlabeled data.
K-Means Clustering is simple but powerful for grouping similar data points — great for tasks like customer segmentation.
Anomaly Detection is critical in areas like fraud detection and system health monitoring.
PCA reduces the dimensionality of high-dimensional datasets while preserving as much of the variance as possible, which is useful for both visualization and performance.
Recommender Systems use data cleverly to personalize experiences — I now have a better understanding of what powers platforms like Netflix and Spotify!
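
As a concrete example of one of these ideas, here is a minimal NumPy sketch of K-Means: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat. The two synthetic blobs and the kmeans helper are assumptions for illustration, not code from the course labs.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialise centroids at k distinct, randomly chosen data points
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: distance from every point to every centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # assignments have stabilised
        centroids = new_centroids
    return centroids, labels

# Two well-separated synthetic blobs (made up for this example)
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(50, 2))
X = np.vstack([blob_a, blob_b])

centroids, labels = kmeans(X, k=2)
print("centroids:\n", centroids)
```

On real data the result depends on the initial centroids, so it is common to run the algorithm several times and keep the clustering with the lowest total distance.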

Tools and Frameworks I Used

Throughout the specialization, I worked with:

Python
NumPy, pandas, matplotlib
Jupyter Notebooks & Google Colab
Implemented algorithms from scratch to better understand the math

Practice Highlights

Some of the hands-on work included:

Visualizing gene expression data with PCA
Building a basic movie recommender system
Detecting anomalies in server and sensor data
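
Here is a rough sketch of the density-based idea behind the anomaly detection exercise: fit an independent Gaussian to each feature of normal data, then flag new points whose probability falls below a threshold. The server metrics and the threshold epsilon are hypothetical values for illustration. In practice the threshold is typically tuned on a labelled validation set.

```python
import numpy as np

def fit_gaussian(X):
    # Per-feature mean and variance of the "normal" training data
    return X.mean(axis=0), X.var(axis=0)

def density(X, mu, var):
    # Product of independent univariate Gaussian densities, one per feature
    coef = 1.0 / np.sqrt(2.0 * np.pi * var)
    expo = np.exp(-((X - mu) ** 2) / (2.0 * var))
    return np.prod(coef * expo, axis=1)

# Hypothetical server metrics: latency (ms) and throughput (MB/s)
rng = np.random.default_rng(0)
X_train = rng.normal(loc=[14.0, 15.0], scale=[1.0, 1.0], size=(300, 2))
mu, var = fit_gaussian(X_train)

X_new = np.array([[14.2, 15.1],   # close to typical behaviour
                  [25.0, 3.0]])   # far from anything seen in training
epsilon = 1e-4                    # hypothetical threshold
p = density(X_new, mu, var)
is_anomaly = p < epsilon
print(is_anomaly)
```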

All exercises were designed to feel like real-world applications — not just theory!

My ML Journey So Far

This post marks the completion of my Machine Learning Specialization:

Supervised Machine Learning: Concepts I Finally Understand
→ Linear/Logistic Regression, Loss functions, Evaluation Metrics

Advanced Learning Algorithms: Concepts That Finally Clicked
→ Neural networks, forward/backward propagation, and building models from scratch

This post
→ Unsupervised learning, recommendation systems, and a peek into reinforcement learning!

What’s Next?

Now that I’ve wrapped up this specialization, here’s what I plan to do next:

Build end-to-end ML projects combining supervised & unsupervised learning
Dive into Generative AI, LLMs, and NLP
Compete in Kaggle challenges
Continue sharing my learnings right here on Dev.to!

Thanks so much for following along with my ML journey!

Let me know if you’re also learning ML or building something cool — I’d love to connect!

Happy Learning!

Advanced Learning Algorithms: Concepts That Finally Clicked

Anshika — Wed, 02 Jul 2025 11:36:15 +0000

After writing about the basics of supervised machine learning, I went one step further and completed the second course in the Machine Learning Specialization by Andrew Ng. This one was a game-changer — it covered the why behind how machines learn, especially when things start to get nonlinear and complex.

Here are the key concepts that finally clicked for me:

1. Regularization — Not Just a Buzzword

I used to hear "regularization" everywhere, but I didn’t really understand what it meant.

Turns out, it’s like teaching your model not to overthink. Overly large weights = too much memorizing = poor generalization. L2 regularization (adding a penalty on the squared weights to the cost) shrinks those extreme weight values and keeps your model grounded.

Takeaway: Regularization isn’t just a math trick — it’s essential for better generalization.
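
To make the penalty term concrete, here is a small NumPy sketch of an L2-regularized logistic regression cost and its gradient. The data, the weights, and the lambda_ value are made up for illustration. The cost uses the common convention of adding (lambda / 2m) times the sum of squared weights and leaving the bias unpenalized.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_cost_and_grad(w, b, X, y, lambda_):
    m = X.shape[0]
    p = sigmoid(X @ w + b)
    eps = 1e-12
    # Usual cross-entropy cost on the data
    data_cost = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    # L2 penalty on the weights only (the bias b is not penalized)
    reg_cost = (lambda_ / (2 * m)) * np.sum(w ** 2)
    # The penalty adds (lambda_ / m) * w to the weight gradient
    dw = X.T @ (p - y) / m + (lambda_ / m) * w
    db = np.mean(p - y)
    return data_cost + reg_cost, dw, db

# Hypothetical data and deliberately large weights
X = np.array([[0.5, 1.5], [1.0, 1.0], [1.5, 0.5], [3.0, 0.5]])
y = np.array([0, 0, 1, 1])
w = np.array([2.0, -3.0])
b = 0.5

cost_plain, _, _ = regularized_cost_and_grad(w, b, X, y, lambda_=0.0)
cost_reg, dw, _ = regularized_cost_and_grad(w, b, X, y, lambda_=1.0)
print(f"without penalty: {cost_plain:.4f}, with penalty: {cost_reg:.4f}")
```

Because the extra gradient term is proportional to w, every update nudges each weight toward zero, and larger weights get nudged harder.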

2. Neural Networks — Finally Got the Intuition

Neural networks always sounded intimidating. But once I saw how a simple neural net is just a bunch of logistic regressions stacked and activated, it clicked.

I now understand:

How each layer transforms data
Why activation functions like ReLU or sigmoid matter
What it means to learn weights through backpropagation

Takeaway: Neural nets are just math + layering + learning — no magic, just logic.
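
The "stacked logistic regressions" picture can be sketched in a few lines of NumPy. Each column of a layer's weight matrix acts as one logistic regression unit, and the layer's output becomes the next layer's input. The dense helper and all the weights below are hypothetical values for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense(a_in, W, b, activation=sigmoid):
    # a_in: (n_features,), W: (n_features, n_units), b: (n_units,)
    return activation(a_in @ W + b)

x = np.array([1.0, 2.0])

# Layer 1: three units, so three columns of weights
W1 = np.array([[1.0, -1.0, 0.5],
               [0.5, 1.0, -1.0]])
b1 = np.array([0.0, 0.1, -0.1])

# Layer 2: one output unit fed by the three hidden activations
W2 = np.array([[1.0], [-2.0], [0.5]])
b2 = np.array([0.2])

a1 = dense(x, W1, b1)   # hidden activations
a2 = dense(a1, W2, b2)  # network output

# The first hidden unit on its own is exactly a logistic regression on x
unit0 = sigmoid(x @ W1[:, 0] + b1[0])
print(a1, a2, unit0)
```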

3. Backpropagation — The Learning Engine

This was the hardest part at first. Chain rule? Gradients? But visualizing how errors move backward through layers to update weights made it clear.

Now I know:

The loss tells us how wrong we were
Gradients tell us how to fix it
Backpropagation adjusts all layers efficiently

Takeaway: Backpropagation is how the network learns — by tweaking each layer's weights based on the output error.
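
A useful way to convince yourself the chain rule is doing what you think is a gradient check: compute the gradient analytically, then compare it with a numerical estimate. This sketch does that for a single sigmoid neuron with cross-entropy loss. The input values are arbitrary choices for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, b, x, y):
    a = sigmoid(w * x + b)
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))

def backprop(w, b, x, y):
    # Chain rule: dL/dw = dL/da * da/dz * dz/dw
    a = sigmoid(w * x + b)
    dL_da = -(y / a) + (1 - y) / (1 - a)
    da_dz = a * (1 - a)
    dz_dw = x
    dz_db = 1.0
    return dL_da * da_dz * dz_dw, dL_da * da_dz * dz_db

w, b, x, y = 0.8, -0.2, 1.5, 1.0
dw, db = backprop(w, b, x, y)

# Numerical estimate using central differences
h = 1e-6
dw_num = (loss(w + h, b, x, y) - loss(w - h, b, x, y)) / (2 * h)
db_num = (loss(w, b + h, x, y) - loss(w, b - h, x, y)) / (2 * h)

print(f"dw analytic {dw:.6f} vs numeric {dw_num:.6f}")
print(f"db analytic {db:.6f} vs numeric {db_num:.6f}")
```

The three factors multiply out to (a - y) * x, which is the compact form used in most implementations.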

4. Deep vs. Shallow Models

Shallow models (like logistic regression) work fine for simple data. But deeper networks capture complex patterns, like those found in images or sequences.

I learned:

Why adding more layers lets us learn hierarchical features
How depth adds power — but also complexity and risk of overfitting

Takeaway: Depth adds capability, but only if used wisely.

5. TensorFlow — My First Real ML Framework

This was my first time working with TensorFlow, and it really helped bridge the gap between theory and code.

Using tf.keras, I could:

Build neural networks in just a few lines
Train models and track accuracy/loss in real-time
Understand how each concept from the course translates into actual working code

Takeaway: TensorFlow makes ML implementation accessible — and fun!

6. Model Tuning Matters (More Than I Thought)

Before, I underestimated how important things like learning rate, initialization, and number of units were. Now I realize:

Poor weight initialization can kill training
Learning rate can make or break convergence
You need trial and error (and patience)

Takeaway: Tuning isn’t optional — it’s part of the craft.
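
The learning-rate point is easy to see on a toy problem. This sketch runs gradient descent on J(w) = w² with three learning rates. The function, the starting point, and the rates are assumptions picked for illustration.

```python
def gradient_descent(lr, steps=50, w0=5.0):
    # Minimise J(w) = w**2, whose gradient is 2 * w
    w = w0
    for _ in range(steps):
        w -= lr * 2 * w
    return w

for lr in (0.01, 0.1, 1.1):
    print(f"lr={lr}: w after 50 steps = {gradient_descent(lr):.6g}")
```

With the smallest rate the weight is still far from the minimum at 0 after 50 steps. The middle rate gets very close. The largest rate overshoots on every step, so the weight grows instead of shrinking.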

What’s Next?

Up next, I’m diving into Unsupervised Learning, Recommenders, and Reinforcement Learning — the third course in the specialization. I’m excited to explore clustering algorithms, anomaly detection, and even get a taste of how reinforcement learning works!

And yes — I plan to keep building with TensorFlow too!

I’ll share my takeaways from that soon. Until then — happy learning.

If you're just getting started with ML, feel free to check out my first post:

👉 Supervised Machine Learning: Concepts I Finally Understand

Supervised Machine Learning: Concepts I Finally Understand

Anshika — Fri, 27 Jun 2025 11:50:00 +0000

Hi, I'm Anshika — a B.Tech student diving into the world of AI and Machine Learning.

I just completed Andrew Ng’s Supervised Machine Learning course on Coursera (the first in the ML Specialization by DeepLearning.AI), and I wanted to document my learnings, struggles, and next steps as I begin my ML journey.

What is Supervised Machine Learning?

Supervised ML is about teaching machines using labeled data.

You provide inputs (features) along with the correct outputs (labels), and the model learns to predict outputs for new, unseen inputs.

There are two key types:

Regression → Predict continuous values (e.g., house price, traffic speed)
Classification → Predict categories (e.g., spam vs. not spam)

Key Concepts I Learned

Linear Regression (with one and multiple variables)
Gradient Descent – how the model "learns"
Cost Function (Mean Squared Error) – measuring how wrong the model is
Logistic Regression – used for binary classification problems
Overfitting vs. Underfitting – finding the balance between simplicity and accuracy
Regularization (L2) – prevents the model from overfitting the training data
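
To tie the first three concepts together, here is a minimal NumPy sketch of linear regression trained with gradient descent on a squared-error cost. It uses the 1/(2m) scaling that is common in course material. The tiny dataset (points on the line y = 2x + 1), the learning rate, and the step count are assumptions for illustration.

```python
import numpy as np

def compute_cost(X, y, w, b):
    # Squared-error cost with the conventional 1/(2m) factor
    err = X @ w + b - y
    return float(np.sum(err ** 2) / (2 * len(y)))

def fit(X, y, lr=0.1, steps=2000):
    w = np.zeros(X.shape[1])
    b = 0.0
    m = len(y)
    for _ in range(steps):
        err = X @ w + b - y
        # Step downhill along the gradient of the cost
        w -= lr * (X.T @ err) / m
        b -= lr * err.mean()
    return w, b

# Points that lie exactly on y = 2x + 1
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

w, b = fit(X, y)
print(f"w = {w[0]:.4f}, b = {b:.4f}, cost = {compute_cost(X, y, w, b):.2e}")
```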

Tools I Used

Python
NumPy
Jupyter Notebook (for practice exercises)

What Helped Me Understand Better

Visualizing gradient descent and cost function graphs
Coding linear regression from scratch before using libraries
Reading discussion forums whenever I got stuck
Taking handwritten notes to simplify complex terms

What’s Next?

Now that I’ve finished the Supervised Learning course, I plan to:

Continue the specialization: Next up → Unsupervised Learning
Apply regression to a real-world dataset (maybe traffic or energy!)
Start writing beginner-friendly tutorials alongside learning

If you’re on a similar journey or just starting out — feel free to reach out! Let’s learn and build together.

Thanks for reading!

🔗 Connect with me:

GitHub: https://github.com/anshikalohan
LinkedIn: https://www.linkedin.com/in/anshika-lohan-570484273/