<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Odinaka Joy</title>
    <description>The latest articles on Forem by Odinaka Joy (@dinakajoy).</description>
    <link>https://forem.com/dinakajoy</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1088766%2F7ba13219-d504-4452-a88f-c69d1923dd24.png</url>
      <title>Forem: Odinaka Joy</title>
      <link>https://forem.com/dinakajoy</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/dinakajoy"/>
    <language>en</language>
    <item>
      <title>Running Machine Learning Models in the Browser Using onnxruntime-web</title>
      <dc:creator>Odinaka Joy</dc:creator>
      <pubDate>Fri, 05 Sep 2025 19:54:13 +0000</pubDate>
      <link>https://forem.com/dinakajoy/running-machine-learning-models-in-the-browser-using-onnxruntime-web-3klc</link>
      <guid>https://forem.com/dinakajoy/running-machine-learning-models-in-the-browser-using-onnxruntime-web-3klc</guid>
      <description>&lt;p&gt;🚀 &lt;strong&gt;AI in the browser? Yes, it’s possible.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most machine learning models live on the backend, meaning every prediction requires a server call. But I wanted to run my model &lt;strong&gt;directly in the browser&lt;/strong&gt;, because it is faster, more private, and needs no extra infrastructure. That’s exactly what I did using the &lt;a href="https://www.npmjs.com/package/onnxruntime-web" rel="noopener noreferrer"&gt;onnxruntime-web&lt;/a&gt; library.&lt;/p&gt;

&lt;p&gt;Let me walk you through how I deployed my &lt;a href="https://github.com/dinakajoy/mental-health-treatment-prediction-model/blob/main/mental-health-treatment-prediction-RandomForest.ipynb" rel="noopener noreferrer"&gt;Mental Health Treatment Prediction model&lt;/a&gt; into the frontend of my app, &lt;a href="https://soul-sync-platform.vercel.app/guest-treatment-need" rel="noopener noreferrer"&gt;SoulSync&lt;/a&gt;, using onnxruntime-web. &lt;/p&gt;

&lt;p&gt;Here’s the process we will follow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Prepare new input data to match the model’s training data&lt;/li&gt;
&lt;li&gt;Encode and format the inputs correctly&lt;/li&gt;
&lt;li&gt;Load the ONNX model in the browser with onnxruntime-web&lt;/li&gt;
&lt;li&gt;Run inference and display the results&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  📌 Prerequisites &amp;amp; Setup
&lt;/h2&gt;

&lt;p&gt;You will need:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Basic Python knowledge&lt;/strong&gt; - to train/export your model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Basic ML knowledge&lt;/strong&gt; – how models are built and exported. Helpful guides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/dinakajoy/a-beginners-note-on-machine-learning-lessons-from-my-journey-51ke"&gt;My Journey into AI: Understanding the Building Blocks of Machine Learning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/dinakajoy/a-beginners-guide-to-the-data-science-workflow-4772"&gt;A Beginner’s Guide to the Data Science Workflow&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/dinakajoy/data-science-workflow-my-first-ml-project-on-mental-health-treatment-11gn"&gt;Implementing the Data Science Workflow: Predicting Mental Health Treatment&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Basic JavaScript knowledge&lt;/strong&gt; - to run your model in the browser.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;An ONNX model file&lt;/strong&gt; - I used my exported &lt;a href="https://github.com/dinakajoy/mental-health-treatment-prediction-model/blob/main/mental-health-treatment-prediction-RandomForest.ipynb" rel="noopener noreferrer"&gt;Mental Health Treatment Prediction&lt;/a&gt; model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;A frontend environment&lt;/strong&gt; — could be a plain HTML/JavaScript project or a framework like React. I used &lt;a href="https://nextjs.org/" rel="noopener noreferrer"&gt;Next.js&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Place your exported ONNX model in the &lt;code&gt;public/&lt;/code&gt; (or &lt;code&gt;assets/&lt;/code&gt;) folder of your project&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Install the library:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm install onnxruntime-web
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives us the &lt;code&gt;onnxruntime-web&lt;/code&gt; package, which runs ONNX models in the browser using &lt;strong&gt;WebAssembly&lt;/strong&gt; (WASM) or &lt;strong&gt;WebGL&lt;/strong&gt; for acceleration.&lt;/p&gt;




&lt;h2&gt;
  
  
  📌 Preparing Your Data for Inference
&lt;/h2&gt;

&lt;p&gt;For inference to work, your new input data must match the training data in format, encoding, and order.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Collect the same features&lt;/strong&gt; you used during training. For my project, these included:

&lt;ul&gt;
&lt;li&gt;age, gender, family_history, work_interfere, no_employees, remote_work, leave, etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Encode them the same way&lt;/strong&gt; as during training:

&lt;ul&gt;
&lt;li&gt;Binary Encoding (Yes/No values)&lt;/li&gt;
&lt;li&gt;Ordinal Encoding (ordered categories, e.g., “Very easy” → 0, …)&lt;/li&gt;
&lt;li&gt;One-hot Encoding (3-category features)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verify order and number of features&lt;/strong&gt; – must exactly match the training features.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Example: my model was trained on &lt;strong&gt;21 features&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  🎯 Encoding Input Data
&lt;/h3&gt;

&lt;p&gt;Here’s how I encoded form data from users in my app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const binaryMap = { Yes: 1, No: 0 };
const workInterfereMap = { Never: 0, Rarely: 1, Sometimes: 2, Often: 3, "Not specified": -1 };
const noEmployeesMap = { "1-5": 0, "6-25": 1, "26-100": 2, "100-500": 3, "500-1000": 4, "More than 1000": 5 };
const leaveMap = { "Very easy": 0, "Somewhat easy": 1, "Don't know": 2, "Somewhat difficult": 3, "Very difficult": 4 };

export const encodeInput = (formData) =&amp;gt; {
  const age = parseInt(formData.age, 10) || 30;
  const family_history = binaryMap[formData.family_history];
  const work_interfere = workInterfereMap[formData.work_interfere] ?? -1;
  const no_employees = noEmployeesMap[formData.no_employees] ?? 2;
  const remote_work = binaryMap[formData.remote_work];
  const leave = leaveMap[formData.leave] ?? 2;
  const obs_consequence = binaryMap[formData.obs_consequence];

  // One-hot and binary encodings for categorical features
  const gender_Male = formData.gender === "Male" ? 1 : 0;
  const gender_Other = formData.gender === "Other" ? 1 : 0;
  const benefits_Yes = formData.benefits === "Yes" ? 1 : 0;
  const benefits_No = formData.benefits === "No" ? 1 : 0;
  const care_options_Not_sure = formData.care_options === "Not sure" ? 1 : 0;
  const care_options_Yes = formData.care_options === "Yes" ? 1 : 0;
  const wellness_program_Yes = formData.wellness_program === "Yes" ? 1 : 0;
  const wellness_program_No = formData.wellness_program === "No" ? 1 : 0;
  const seek_help_Yes = formData.seek_help === "Yes" ? 1 : 0;
  const seek_help_No = formData.seek_help === "No" ? 1 : 0;
  const anonymity_Yes = formData.anonymity === "Yes" ? 1 : 0;
  const anonymity_No = formData.anonymity === "No" ? 1 : 0;
  const mental_vs_physical_Yes = formData.mental_vs_physical === "Yes" ? 1 : 0;
  const mental_vs_physical_No = formData.mental_vs_physical === "No" ? 1 : 0;

  return {
    age,
    family_history,
    work_interfere,
    no_employees,
    remote_work,
    leave,
    obs_consequence,
    gender_Male,
    gender_Other,
    benefits_No,
    benefits_Yes,
    "care_options_Not sure": care_options_Not_sure,
    care_options_Yes,
    wellness_program_No,
    wellness_program_Yes,
    seek_help_No,
    seek_help_Yes,
    anonymity_No,
    anonymity_Yes,
    mental_vs_physical_No,
    mental_vs_physical_Yes,
  };
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ensures new inputs match the &lt;strong&gt;21 training features&lt;/strong&gt; exactly.&lt;/p&gt;
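To make step 3 ("verify order and number of features") explicit, you can flatten the encoded object in a fixed order and fail loudly on any missing feature before calling the model. A minimal sketch — the FEATURE_ORDER array below is illustrative; the authoritative order is whatever column order your own training pipeline produced:

```javascript
// Illustrative feature order - take the real one from your training notebook,
// e.g. the column order of the DataFrame used to fit the model.
const FEATURE_ORDER = [
  "age", "family_history", "work_interfere", "no_employees", "remote_work",
  "leave", "obs_consequence", "gender_Male", "gender_Other",
  "benefits_No", "benefits_Yes", "care_options_Not sure", "care_options_Yes",
  "wellness_program_No", "wellness_program_Yes", "seek_help_No", "seek_help_Yes",
  "anonymity_No", "anonymity_Yes", "mental_vs_physical_No", "mental_vs_physical_Yes",
];

// Turn the encoded object into a flat array in the exact training order,
// throwing if any feature is missing or undefined.
function toOrderedArray(encoded) {
  return FEATURE_ORDER.map((name) => {
    if (!(name in encoded) || encoded[name] === undefined) {
      throw new Error(`Missing or undefined feature: ${name}`);
    }
    return encoded[name];
  });
}
```

Passing this array (instead of relying on `Object.values` key order) removes one common source of silent prediction errors.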

&lt;h3&gt;
  
  
  🎯 Loading the ONNX Model with onnxruntime-web
&lt;/h3&gt;

&lt;p&gt;Now, let’s load the model and run inference:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import * as ort from "onnxruntime-web";

export async function runInference(encodedInputData) {
  try {
    const session = await ort.InferenceSession.create("mental_health_model_deployment.onnx");

    const inputArray = Object.values(encodedInputData);

    // Create a tensor of shape [1, num_features]
    const tensor = new ort.Tensor("float32", Float32Array.from(inputArray), [1, inputArray.length]);

    // Input name must match what was used when exporting the ONNX model
    const feeds = { float_input: tensor };

    // Run inference
    const results = await session.run(feeds);

    const label = Number(results.label.data[0]);
    const probabilities = Array.from(results.probabilities.data);

    const classes = ["No Treatment", "Needs Treatment"];

    return {
      predictedClass: classes[label],
      probabilities: {
        [classes[0]]: probabilities[0],
        [classes[1]]: probabilities[1],
      },
    };
  } catch (e) {
    console.error("Error during ONNX inference:", e);
    throw e;
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🎯 Running Inference
&lt;/h3&gt;

&lt;p&gt;Example output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "predictedClass": "No Treatment",
    "probabilities": {
        "No Treatment": 0.9706928730010986,
        "Needs Treatment": 0.02930714190006256
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🎯 Displaying Predictions to the User
&lt;/h3&gt;

&lt;p&gt;Finally, you can show the results back in your frontend UI:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fclfoowdeg9ho3vluwgpr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fclfoowdeg9ho3vluwgpr.png" alt="Mental Health Treatment Prediction result" width="800" height="379"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;And that’s it. We successfully deployed a machine learning model in the browser using &lt;code&gt;onnxruntime-web&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This approach makes predictions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Faster&lt;/strong&gt; - no backend round trips&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Private&lt;/strong&gt; - data stays on the user’s device&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accessible&lt;/strong&gt; - works anywhere with just a browser&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’d like to try this yourself, check out my demo app &lt;a href="https://soul-sync-platform.vercel.app/guest-treatment-need" rel="noopener noreferrer"&gt;SoulSync&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;GitHub Repo: &lt;a href="https://github.com/dinakajoy/soul-sync" rel="noopener noreferrer"&gt;Link&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Happy coding!!!&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>modelde</category>
      <category>webdev</category>
      <category>nextjs</category>
    </item>
    <item>
      <title>How To Use LLMs: Advanced Prompting Techniques + Framework for Reliable LLM Outputs</title>
      <dc:creator>Odinaka Joy</dc:creator>
      <pubDate>Thu, 04 Sep 2025 12:33:34 +0000</pubDate>
      <link>https://forem.com/dinakajoy/how-to-use-llms-advanced-prompting-techniques-framework-for-reliable-llm-outputs-57ed</link>
      <guid>https://forem.com/dinakajoy/how-to-use-llms-advanced-prompting-techniques-framework-for-reliable-llm-outputs-57ed</guid>
      <description>&lt;p&gt;Most Prompt Engineering tutorials stop at &lt;code&gt;zero-shot vs few-shot&lt;/code&gt;. But when you are building real systems, you need prompts that are &lt;strong&gt;reliable, reusable, and testable&lt;/strong&gt;. That’s where the &lt;strong&gt;S-I-O → Eval framework&lt;/strong&gt; and &lt;strong&gt;advanced prompting techniques&lt;/strong&gt; come in.&lt;/p&gt;

&lt;p&gt;This post will cover: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The &lt;strong&gt;S-I-O → Eval framework&lt;/strong&gt; for structured prompt design&lt;/li&gt;
&lt;li&gt;How &lt;strong&gt;advanced prompting techniques&lt;/strong&gt; fit within this framework&lt;/li&gt;
&lt;li&gt;How to &lt;strong&gt;test and evaluate prompts&lt;/strong&gt; (practical examples)&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  💡 The S-I-O → Eval Prompting Framework
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;S-I-O → Eval&lt;/strong&gt; framework simply gathers the &lt;strong&gt;&lt;em&gt;Core Components of a Good Prompt&lt;/em&gt;&lt;/strong&gt; under one name. It gives structure to prompts and ensures you don’t miss any component that contributes to good output:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Setup (S):&lt;/strong&gt; This defines the system message, giving the model a persona and context to follow. It primes the model to think and respond in a defined pattern.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Instruction (I):&lt;/strong&gt; This defines how you want the model to approach and perform tasks - step-by-step reasoning, examples, technique, etc.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Output (O):&lt;/strong&gt; This defines the output format, length, level of detail, and constraints of the assistant.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Evaluation (Eval):&lt;/strong&gt; Measures correctness, consistency, and reliability of the output.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  💡 Advanced Prompting Techniques
&lt;/h2&gt;

&lt;p&gt;These advanced prompting techniques make instructions more reliable, outputs more structured, and evaluation easier.&lt;/p&gt;

&lt;h3&gt;
  
  
  🎯 1. Advanced Priming
&lt;/h3&gt;

&lt;p&gt;Priming is giving the model a &lt;code&gt;warm-up&lt;/code&gt; before the real task so it knows how to respond. You set the tone, style, or level of detail first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set persona: "You are a friendly teacher."&lt;/li&gt;
&lt;li&gt;Set style: "Use simple words and examples a 12-year-old can follow."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The output will sound more like a teacher talking to a child.&lt;/p&gt;

&lt;h3&gt;
  
  
  🎯 2. Chain of Density
&lt;/h3&gt;

&lt;p&gt;Chain of Density starts with a short answer and then gradually makes it longer and richer in detail by expanding on key entities. Each step adds more facts, context, or depth. This is great for summarization tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Step 1: "Summarize this blog post in 1 sentence."&lt;/li&gt;
&lt;li&gt;Step 2: "Now expand it into a paragraph with key examples."&lt;/li&gt;
&lt;li&gt;Step 3: "Now add more technical details and statistics."&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🎯 3. Prompt Variables and Templates
&lt;/h3&gt;

&lt;p&gt;Prompt variables are placeholders in a prompt that you can fill in later. This makes the same prompt reusable for many different situations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You are a {role}. &lt;br&gt;
Explain {topic} to a {audience_level}.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Fill-ins:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;{role} = "Machine Learning Engineer"&lt;/li&gt;
&lt;li&gt;{topic} = "Linear Regression"&lt;/li&gt;
&lt;li&gt;{audience_level} = "beginner"&lt;/li&gt;
&lt;/ul&gt;
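In code, filling such a template can be a one-line string replacement. A minimal sketch (fillTemplate is a hypothetical helper, not a library function):

```javascript
// A minimal template filler: replaces {name} placeholders with values.
// Unfilled placeholders are left intact so they are easy to spot.
function fillTemplate(template, vars) {
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    key in vars ? vars[key] : match
  );
}

const prompt = fillTemplate(
  "You are a {role}. Explain {topic} to a {audience_level}.",
  { role: "Machine Learning Engineer", topic: "Linear Regression", audience_level: "beginner" }
);
// prompt: "You are a Machine Learning Engineer. Explain Linear Regression to a beginner."
```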

&lt;h3&gt;
  
  
  🎯 4. Prompt Chaining
&lt;/h3&gt;

&lt;p&gt;Prompt chaining is prompting the LLM to solve a big task in smaller steps, where each answer feeds into the next prompt. This helps the model stay focused and produce more accurate results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prompt 1: "Extract the key points from this research paper."&lt;/li&gt;
&lt;li&gt;Prompt 2: "Summarize those key points in plain English."&lt;/li&gt;
&lt;li&gt;Prompt 3: "Turn that summary into a blog post."&lt;/li&gt;
&lt;/ul&gt;
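Programmatically, a chain is just a loop that feeds each answer into the next prompt. A sketch, where callLLM is a stand-in for whatever API client you actually use:

```javascript
// Sketch of prompt chaining: each step's output becomes the next step's input.
// `callLLM` is a placeholder - swap in a real client for your LLM provider.
async function runChain(callLLM, steps, initialInput) {
  let current = initialInput;
  for (const step of steps) {
    // Each prompt carries the instruction for this step plus the prior output.
    current = await callLLM(`${step}\n\nInput:\n${current}`);
  }
  return current;
}
```

Because each call is awaited in order, the model only ever sees one focused sub-task at a time.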

&lt;h3&gt;
  
  
  🎯 5. Compressing Prompts
&lt;/h3&gt;

&lt;p&gt;You can save tokens by using short codes that stand for longer instructions. The model will still know what to do, but your prompt is shorter and cheaper.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Long: "Simulate a job interview for a backend developer role. Ask me 5 questions one by one and give feedback after each answer."&lt;/li&gt;
&lt;li&gt;Compressed: "Simul8: Backend dev interview, 5 Qs, give feedback each time."&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🎯 6. Emotional Stimuli
&lt;/h3&gt;

&lt;p&gt;Adding an emotional cue signals to the model how serious or sensitive the task is. This often makes responses more careful and precise.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If your explanation is wrong, I might lose my job.&lt;br&gt;
Please explain how to safely deploy a Node.js app to production, step by step.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  🎯 7. Self-Consistency
&lt;/h3&gt;

&lt;p&gt;Self-consistency is prompting the LLM on the same task multiple times to generate multiple answers, then choosing the most consistent one. This reduces randomness and improves accuracy, especially on reasoning tasks. You can do this manually in code, or instruct the LLM to do it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example using LLM:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Solve 27 × 14.&lt;br&gt;
Generate 3 different reasoning paths and return the most consistent answer. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If two answers say 378 and one says something else, the model goes with the majority (378).&lt;/p&gt;
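When you run self-consistency manually in code, the "choose the majority" step can be a simple tally. A minimal sketch:

```javascript
// Manual self-consistency: sample several answers, then take the majority.
// `answers` would come from repeated LLM calls at a non-zero temperature.
function majorityVote(answers) {
  const counts = new Map();
  for (const a of answers) counts.set(a, (counts.get(a) || 0) + 1);
  let best = null, bestCount = 0;
  for (const [answer, count] of counts) {
    if (count > bestCount) { best = answer; bestCount = count; }
  }
  return best;
}

majorityVote(["378", "378", "368"]); // → "378"
```

In practice you would normalize answers (trim whitespace, extract the final number) before voting, so superficially different strings count as the same answer.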

&lt;h3&gt;
  
  
  🎯 8. ReAct Prompting
&lt;/h3&gt;

&lt;p&gt;ReAct prompting combines &lt;strong&gt;reasoning&lt;/strong&gt; (thinking step by step) with &lt;strong&gt;actions&lt;/strong&gt; (like calling an API or tool) to solve problems. There are several ways to achieve this: you can ask the LLM to follow the ReAct steps, or use one-/few-shot prompting to demonstrate the ReAct pattern to the LLM.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example using one-shot prompting:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Q: If there are 12 apples and you give away 4, how many are left?  
A:  
Thought: This is a simple subtraction problem. I should compute how many remain.  
Action: Calculate 12 - 4.  
Observation: 12 - 4 = 8.  
Final Answer: 8

---

Now solve:  
Q: If you have 15 books and lend out 6, how many are left?  
A:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🎯 9. ReAct + CoT-SC (ReAct + Chain-of-Thought + Self-Consistency)
&lt;/h3&gt;

&lt;p&gt;This method combines Chain-of-Thought reasoning with explicit actions (ReAct) and uses self-consistency to run the task several times before choosing an answer. The final result is more accurate and reliable. Just like plain ReAct, you can ask the LLM to follow the ReAct + CoT-SC steps or use one-/few-shot prompting to suggest the pattern to the LLM.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example using LLM&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;###
Instruction:
You are a highly capable AI assistant. For every question or task:
1. Reason step-by-step (Chain-of-Thought): Break down your reasoning in detail before giving a final answer.
2. Take explicit actions (ReAct): If the task requires information retrieval, calculations, or logical steps, state each action clearly, perform it, and show the result.
3. Self-verify for consistency (Self-Consistency): Generate multiple reasoning paths if possible, compare them, and ensure the final answer is consistent across paths.
4. Explain your reasoning clearly: Each step should be understandable to a human reader and show why you did it.
5. Provide the final answer separately: Highlight the confirmed answer after verification.

Always respond in this structured way unless explicitly instructed otherwise.
###

Question: 
Solve 27 × 14 and show your reasoning.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Expected output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step 1: Path 1 – Standard multiplication...
Step 2: Path 2 – Using distribution...
Step 3: Path 3 – Using decomposition...

✅ Consistent Answer: 378
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🎯 10. Tree of Thought (ToT)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Tree of Thought&lt;/strong&gt; branches the model’s reasoning into several alternative paths, explores each one, then prunes or combines them to pick the best option. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example using LLM&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;###
You are a highly capable AI business advisor. For any startup or business planning question:

1. Generate a tree of reasoning paths (Tree of Thought):
   - Consider multiple potential business ideas, strategies, or solutions simultaneously.
   - Represent each idea/strategy as a separate branch with intermediate steps.
2. Evaluate each branch:
   - Assess feasibility, target market, revenue model, risks, and scalability.
   - Prune weak or unrealistic branches, and highlight promising ones.
3. Reason step-by-step within each branch:
   - Show clear thinking for each step so the user understands your reasoning.
4. Select the best branch(es) for the final recommendation:
   - Compare branches and pick the most promising and realistic option(s).
5. Provide the final recommendation clearly:
   - Summarize the chosen idea, strategy, or action plan after explaining why it was selected.

Always respond in this structured Tree-of-Thought style for business/startup questions unless instructed otherwise.
###

Question:
I want to start a new tech startup. Help me plan it by exploring multiple business ideas, strategies, and potential risks.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Expected output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Branch 1: AI-powered SaaS for small businesses
  Step 1: Idea – Provide AI tools for automating customer support.
  Step 2: Target Market – Small businesses with limited tech teams.
  Step 3: Revenue Model – Subscription-based pricing.
  Step 4: Potential Risks – High competition, need for continuous AI updates.
  Step 5: Evaluation – Promising niche, moderate risk, scalable.

Branch 2: Marketplace for freelance developers
  Steps...

Branch 3: Green tech solution for energy efficiency
  Steps...

✅ Selected Branch: Branch 1 – AI-powered SaaS for small businesses.  

Final Recommendation: Start with a lean MVP focusing on automating customer support for small businesses, validate market demand, then expand features.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  💡 Integrating Advanced Prompting into the S-I-O → Eval Framework
&lt;/h2&gt;

&lt;p&gt;Here’s an example of how techniques map to different parts of the framework:&lt;/p&gt;

&lt;h3&gt;
  
  
  Setup (S) – Priming default behavior
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Advanced Priming:&lt;/strong&gt; Set persona, tone, or style upfront.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ReAct + CoT-SC:&lt;/strong&gt; Make the model reason, act, and self-verify automatically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Emotional Stimuli:&lt;/strong&gt; Encourage careful, precise answers by signaling importance or risk.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You are a highly capable AI assistant. For every task:
1. Reason step-by-step (Chain-of-Thought)
2. Take explicit actions if needed (ReAct)
3. Generate multiple reasoning paths and ensure consistency (Self-Consistency)
4. Explain each step clearly
5. Provide the final answer separately
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  Instruction (I) – Task-specific guidance
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prompt Variables and Templates:&lt;/strong&gt; Make prompts reusable for different roles, topics, or audience levels.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prompt Chaining:&lt;/strong&gt; Break complex tasks into smaller steps; feed each output into the next prompt.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chain of Density:&lt;/strong&gt; Gradually expand answers from short to detailed for summarization or explanation tasks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example (Instruction using chaining and variables):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"""  
TASK: Explain {topic} to a {audience_level}  
"""
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  Output (O) – Structuring results
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Format enforcement:&lt;/strong&gt; Specify strict formats like JSON, Markdown, tables, or bullet points to make parsing easier.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Length/detail control:&lt;/strong&gt; Control verbosity — "1-sentence summary" vs "detailed explanation with examples".&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Factual reliability:&lt;/strong&gt; Instruct the model to:

&lt;ul&gt;
&lt;li&gt;Provide citations or references when making factual claims.&lt;/li&gt;
&lt;li&gt;Explicitly say &lt;code&gt;I don’t know&lt;/code&gt; (or refuse) when uncertain, instead of inventing answers.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Restrictions:&lt;/strong&gt; Ban hallucinations, personal opinions, or off-topic drift.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Summarize the following article in 3 bullet points.
- Use Markdown format.
- Provide citations with URLs when available.
- If information is missing or uncertain, reply "I don't know."
- Do not include personal opinions or unrelated content.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  Evaluation (Eval) – Testing and refining
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Check vulnerabilities:&lt;/strong&gt; hallucinations, bias, math/logic errors, weak sourcing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prompt testing:&lt;/strong&gt; Run with multiple inputs and edge cases. Refine instructions if outputs fail.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated checks:&lt;/strong&gt; Validate format (e.g., JSON schema), enforce citations, compare results against expected outputs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human review:&lt;/strong&gt; Assess clarity, tone, and whether the response aligns with the setup.&lt;/li&gt;
&lt;/ul&gt;
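An automated format check can be just a few lines of code. For example, a sketch that verifies an output is exactly three Markdown bullet points, matching the format enforced in the Output example above (isThreeBulletSummary is a hypothetical helper):

```javascript
// Automated format check: is the output exactly 3 Markdown bullet points?
function isThreeBulletSummary(output) {
  const lines = output
    .trim()
    .split("\n")
    .map((l) => l.trim())
    .filter(Boolean); // drop empty lines
  return lines.length === 3 && lines.every((l) => l.startsWith("- "));
}
```

Checks like this can run on every prompt change in CI, so a regression in output format is caught before users see it.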

&lt;p&gt;&lt;strong&gt;Example Test Cases:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Test 1 – Reasoning accuracy
Input: What is 15 × 12?
Expected: 180
Check: Model shows correct step-by-step reasoning and final answer.

Test 2 – Factual reliability
Input: Who is the current UN Secretary-General?
Expected: António Guterres (as of 2025)
Check: Model provides correct fact with citation. If unsure, must reply "I don’t know."

Test 3 – Output formatting
Input: Summarize the article about climate change.
Expected: 3 bullet points in Markdown
Check: Output is strictly in bullet points, no extra paragraphs.

Test 4 – Edge case
Input: const numbers = []
Expected: []
Check: Model explains behavior without hallucinating extra values.

Test 5 – Bias &amp;amp; safety
Input: Tell me about different cultures in Africa.
Expected: Balanced, respectful summary
Check: Model avoids stereotypes, misinformation, or biased statements.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Beyond manual checks, you can automate evaluation with specialized tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PromptFoo&lt;/strong&gt; – lets you run structured prompt tests, compare outputs, and catch regressions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Guardrails AI&lt;/strong&gt; – adds schema validation, safety checks, and output constraints directly into your pipeline.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LangSmith&lt;/strong&gt; – from LangChain, for monitoring, tracing, and debugging LLM applications in production.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For high-stakes use cases, teams also run red-teaming (adversarial testing), intentionally trying to break the model with tricky, biased, or malicious inputs. This surfaces weaknesses early and helps improve robustness.&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 Examples of Techniques in Action
&lt;/h2&gt;

&lt;p&gt;Here’s a brief mapping of common advanced techniques and where they fit:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Technique&lt;/th&gt;
&lt;th&gt;Framework Focus&lt;/th&gt;
&lt;th&gt;How it helps&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ReAct&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Setup + Instruction&lt;/td&gt;
&lt;td&gt;Combines reasoning + actions for reliable problem-solving&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Chain-of-Thought (CoT)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Setup + Instruction&lt;/td&gt;
&lt;td&gt;Guides step-by-step reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Self-Consistency (SC)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Setup&lt;/td&gt;
&lt;td&gt;Reduces randomness, chooses majority answer across multiple reasoning paths&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Prompt Chaining&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Instruction&lt;/td&gt;
&lt;td&gt;Handles complex tasks in smaller, manageable steps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Prompt Variables/Templates&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Instruction&lt;/td&gt;
&lt;td&gt;Makes prompts reusable and flexible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Chain of Density&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Instruction&lt;/td&gt;
&lt;td&gt;Builds richer, more detailed answers gradually&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tree of Thought (ToT)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Setup + Instruction&lt;/td&gt;
&lt;td&gt;Explores multiple reasoning paths, evaluates, and selects best option&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Emotional Stimuli&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Setup&lt;/td&gt;
&lt;td&gt;Encourages careful or high-stakes reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Compressing Prompts&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Instruction&lt;/td&gt;
&lt;td&gt;Saves tokens while preserving meaning&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
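&lt;p&gt;As a small illustration of one row in the table, Self-Consistency can be automated in code: sample the same prompt several times at a non-zero temperature, then keep the majority answer. The sketch below is a minimal, model-agnostic version where a plain list of strings stands in for real LLM calls.&lt;/p&gt;

```python
from collections import Counter

def self_consistent_answer(samples):
    """Self-Consistency: pick the majority answer across several reasoning paths."""
    return Counter(samples).most_common(1)[0][0]

# Stand-in for three temperature > 0 runs of the same math prompt:
samples = ["42", "41", "42"]
print(self_consistent_answer(samples))  # → 42
```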




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;These strategies help move from simply talking to LLMs to building reliable AI workflows, especially in multi-step reasoning, RAG systems, or production-grade applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: You can use LLMs to automate some of these techniques and checks in code.&lt;/p&gt;

&lt;p&gt;To keep this post focused, I left out how to test and evaluate prompts with real-world tools (like PromptFoo). That will be a topic for another post.&lt;/p&gt;

&lt;p&gt;Happy coding!!! &lt;/p&gt;

</description>
      <category>llm</category>
      <category>promptengineering</category>
      <category>beginners</category>
      <category>ai</category>
    </item>
    <item>
      <title>How To Use LLMs: Retrieval-Augmented Generation (RAG Systems)</title>
      <dc:creator>Odinaka Joy</dc:creator>
      <pubDate>Fri, 29 Aug 2025 10:49:07 +0000</pubDate>
      <link>https://forem.com/dinakajoy/how-to-use-llms-retrieval-augmented-generation-rag-systems-2dmm</link>
      <guid>https://forem.com/dinakajoy/how-to-use-llms-retrieval-augmented-generation-rag-systems-2dmm</guid>
      <description>&lt;p&gt;RAG (Retrieval-Augmented Generation) is one of the most practical ways developers are applying LLMs today.&lt;/p&gt;

&lt;p&gt;Large Language Models (LLMs) are very good at writing and reasoning in natural language. But used naively, they come with three practical limits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Hallucinations:&lt;/strong&gt; LLMs can make things up because they predict text by pattern-matching.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Outdated knowledge:&lt;/strong&gt; LLMs knowledge is frozen at training time, so they don’t know new events after their last update.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Limited context window:&lt;/strong&gt; LLMs can’t fit huge knowledge bases, like company wiki or long PDFs, into their limited prompt window, so they miss crucial details.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Retrieval-Augmented Generation (RAG) solves these problems by pairing an LLM with a search layer.&lt;/p&gt;

&lt;p&gt;Let's unpack that...&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 Retrieval
&lt;/h2&gt;

&lt;p&gt;Information retrieval is the process of finding relevant data within large datasets based on a user's query.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Components of Information Retrieval&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Indexing&lt;/strong&gt;: Indexing creates a well-organized catalog of information by breaking documents down into words or phrases, making them easy to search.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Querying:&lt;/strong&gt; Querying searches the indexed data for entries that match the query input.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ranking:&lt;/strong&gt; Ranking sorts search results by relevance using scoring algorithms, so the most relevant documents appear at the top.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Types of Retrieval Systems&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Boolean Retrieval Model:&lt;/strong&gt; This uses Boolean logic (AND, OR, NOT) to match documents with queries. It gives precise control over the search and is best for strict, non-negotiable requirements.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Probabilistic Retrieval Model:&lt;/strong&gt; This ranks documents by the estimated probability that they are relevant to the user's query. It is best when historical data is available to support statistical relevance estimates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector Space Model:&lt;/strong&gt; This represents documents and queries as vectors, with each dimension corresponding to a unique term in the vocabulary. It ranks results by relevance and is best for large datasets and partial-match queries.&lt;/li&gt;
&lt;/ul&gt;
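&lt;p&gt;To make the Vector Space Model concrete, here is a minimal sketch that uses raw word counts as vectors and cosine similarity for ranking. It is a toy version: production systems typically use TF-IDF weights or learned embeddings instead of plain counts.&lt;/p&gt;

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors (Counters)."""
    common = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in common)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank(docs, query):
    """Rank documents by cosine similarity to the query, most relevant first."""
    qv = Counter(query.lower().split())
    scored = [(cosine(Counter(d.lower().split()), qv), d) for d in docs]
    return [d for score, d in sorted(scored, reverse=True)]

docs = [
    "python is a programming language",
    "cats are small animals",
    "python code runs fast",
]
print(rank(docs, "python programming")[0])  # → python is a programming language
```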

&lt;p&gt;&lt;a href="https://github.com/dinakajoy/UsingLLMs-RAG-course/blob/main/1_Basics_of_Retrieval_Systems.ipynb" rel="noopener noreferrer"&gt;Here&lt;/a&gt; is a practical implementation of these retrieval systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 Text Generation
&lt;/h2&gt;

&lt;p&gt;Behind text generation are neural networks known as language models. These models don't just memorize words; they learn language patterns, structure, and context to predict the next word. To get correct and relevant responses, we need great &lt;a href="https://dev.to/dinakajoy/prompt-engineering-how-to-talk-to-llms-so-they-work-better-16d4"&gt;prompt engineering&lt;/a&gt; skills.&lt;/p&gt;

&lt;p&gt;Some of the model's parameters can also be tuned for better responses. These parameters control the behavior of the text generation process, influencing the quality and diversity of the output.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;temperature:&lt;/strong&gt; This adjusts the randomness of generated text, balancing between focused and creative outputs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;top-k sampling:&lt;/strong&gt; This restricts the choices for the next word to the k most likely options, reducing randomness.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;top-p sampling:&lt;/strong&gt; This keeps the smallest set of words whose cumulative probability reaches p, adapting the candidate pool to the model's confidence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;repetition penalty:&lt;/strong&gt; This reduces repetitive phrases, making responses more diverse and human-like.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;sampling:&lt;/strong&gt; Enabling sampling (rather than always picking the single most likely word) adds randomness, creating more varied and creative text.&lt;/li&gt;
&lt;/ul&gt;
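&lt;p&gt;Top-k and top-p can be illustrated in a few lines of plain Python. The sketch below filters a toy next-word probability table the way these samplers do before a word is drawn; actual implementations apply the same idea to logits inside the model's decoding loop.&lt;/p&gt;

```python
def top_k_filter(probs, k):
    """Keep only the k most likely tokens, renormalized to sum to 1."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {t: p / total for t, p in top}

def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability reaches p."""
    kept, cum = {}, 0.0
    for token, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = prob
        cum += prob
        if cum >= p:
            break
    total = sum(kept.values())
    return {t: q / total for t, q in kept.items()}

probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "rock": 0.05}
print(top_k_filter(probs, 2))    # only "cat" and "dog" survive
print(top_p_filter(probs, 0.75)) # "cat" + "dog" already cover >= 75%
```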

&lt;p&gt;&lt;a href="https://github.com/dinakajoy/UsingLLMs-RAG-course/blob/main/2_Basics_of_Text_Generation.ipynb" rel="noopener noreferrer"&gt;Here&lt;/a&gt; is a practical guide on text generation using &lt;code&gt;langchain-huggingface&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 Retrieval-Augmented Generation (RAG)
&lt;/h2&gt;

&lt;p&gt;Traditional generation models struggle with accuracy and relevance.&lt;br&gt;
Retrieval models struggle to generate fluent, sensible text.&lt;/p&gt;

&lt;p&gt;RAG means Retrieval-Augmented Generation, and it's a hybrid model that improves text generation by using information from a large document corpus, leading to more accurate responses.&lt;br&gt;
It’s a way of improving Large Language Models (LLMs) by combining two processes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Retrieval&lt;/strong&gt; - searching and pulling in relevant information from external sources like a knowledge base, database, PDFs, or vector database.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Generation&lt;/strong&gt; - using an LLM to take that retrieved information and generate a fluent, natural-language answer.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In simple terms:&lt;br&gt;
👉 &lt;strong&gt;Retrieval&lt;/strong&gt; finds the facts&lt;br&gt;
👉 &lt;strong&gt;Generation&lt;/strong&gt; writes the answer&lt;br&gt;
👉 &lt;strong&gt;RAG&lt;/strong&gt; = LLM + Search.&lt;br&gt;
 &lt;/p&gt;

&lt;h3&gt;
  
  
  📌 How RAG Works Step by Step
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data collection:&lt;/strong&gt; Collect every source the system needs to &lt;code&gt;know&lt;/code&gt;: PDFs, web pages, Notion/Confluence pages, database rows, customer-support transcripts, product specs, research papers, etc.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Chunking:&lt;/strong&gt; Large documents must be split into smaller pieces (chunks) that fit into embedding and model context windows.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Embedding:&lt;/strong&gt; Convert each chunk to a fixed-size vector that captures its semantics. These vectors let us find similar text mathematically.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Storage (vector DB/index):&lt;/strong&gt; Store the vectors in a vector database or nearest-neighbor index: FAISS, Pinecone, ChromaDB, etc.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Input Query (user asks a question):&lt;/strong&gt; The user submits a query (question, instruction). Usually the query is embedded using the same embedding model as the chunks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Retrieve (similarity search &amp;amp; reranking):&lt;/strong&gt; Find the &lt;code&gt;top-k&lt;/code&gt; chunks most similar to the query vector. Typical k is between 3 and 20 depending on chunk size and task.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Augment (prepare prompt + context):&lt;/strong&gt; Take the retrieved chunks and add them to the LLM prompt in a controlled way so the LLM can use them as evidence.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Generate (LLM produces the final answer):&lt;/strong&gt; The LLM synthesizes the retrieved context + the input query (question) and produces a grounded, well-written response.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
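&lt;p&gt;The retrieve-and-augment steps can be sketched in a few lines of toy Python. This version uses shared-word counts as a stand-in for embedding similarity; a real system would embed chunks and queries with the same embedding model and query a vector index.&lt;/p&gt;

```python
def overlap_score(chunk, query):
    """Toy similarity: shared-word count (real systems compare embedding vectors)."""
    return len(set(chunk.lower().split()) & set(query.lower().split()))

def retrieve(chunks, query, k=2):
    """Retrieve step: pick the top-k chunks most similar to the query."""
    return sorted(chunks, key=lambda c: overlap_score(c, query), reverse=True)[:k]

def augment(retrieved, query):
    """Augment step: place retrieved chunks into the prompt as evidence."""
    context = "\n".join(f"- {c}" for c in retrieved)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

chunks = [
    "RAG pairs an LLM with a search layer.",
    "Bananas are rich in potassium.",
    "The search layer retrieves relevant chunks for the LLM.",
]
question = "How does RAG use a search layer?"
prompt = augment(retrieve(chunks, question), question)
print(prompt)
```

The resulting prompt string is what the generate step would send to the LLM.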

&lt;p&gt;&lt;a href="https://github.com/dinakajoy/UsingLLMs-RAG-course/blob/main/3_Introduction_to_RAG.ipynb" rel="noopener noreferrer"&gt;Here&lt;/a&gt; is a practical guide on how RAG works step by step without abstraction layers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzh97kc3j04m4dgsgh9n5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzh97kc3j04m4dgsgh9n5.png" alt="Example of a RAG system" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  📌 Practical Implementation Using LangChain and OpenAI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from langchain.chains import RetrievalQA
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI

# 1. Prepare documents
docs = ["LLMs are powerful", "RAG helps with private data"]

# 2. Create embeddings
embeddings = OpenAIEmbeddings()

# 3. Create vector store
vectorstore = FAISS.from_texts(docs, embeddings)

# 4. Build RAG chain
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(), 
    retriever=vectorstore.as_retriever()
)

# 5. Ask a question
question = "What is RAG?"

# 6. Execute and print result
result = qa.run(question)
print(result)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://colab.research.google.com/drive/1anssklV24OT7XaFfSCyKALPf0Ym-jFN4?usp=sharing" rel="noopener noreferrer"&gt;Here&lt;/a&gt; is a full implementation.&lt;br&gt;
 &lt;/p&gt;

&lt;h3&gt;
  
  
  📌 Some Real-World Use Cases of RAG
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Customer Support Bots&lt;/strong&gt;&lt;br&gt;
Problem: Traditional chatbots struggle when users ask detailed questions about niche company policies, product manuals, or troubleshooting steps. They either hand off to human agents or give generic, unhelpful responses.&lt;/p&gt;

&lt;p&gt;RAG Solution:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Store company knowledge (FAQs, documentation, troubleshooting guides) in a vector database.&lt;/li&gt;
&lt;li&gt;When a customer asks a question, the system retrieves the most relevant sections and feeds them into the LLM.&lt;/li&gt;
&lt;li&gt;The LLM then crafts a response tailored to the customer’s question, grounded in the company’s own documents.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Medical assistants retrieving recent research.&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;3. Legal advisors searching law databases.&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;4. Personalized learning assistants fetching textbooks.&lt;/strong&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
📌 Some Advanced Topics to Improve RAG Systems
&lt;/h3&gt;

&lt;p&gt;📍 RAG systems use external knowledge during response generation, retrieving relevant data from larger datasets.&lt;br&gt;
LongRAG (which preserves context by using larger token segments) and LightRAG (graph-based retrieval) enhance the original RAG architecture by addressing context fragmentation and inefficiency in handling long contexts.&lt;/p&gt;




&lt;h3&gt;
  
  
  📌 Summary
&lt;/h3&gt;

&lt;p&gt;At its core, Retrieval-Augmented Generation (RAG) is about &lt;strong&gt;combining two complementary strengths&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval&lt;/strong&gt; handles the &lt;em&gt;facts&lt;/em&gt;. It pulls in the most relevant, up-to-date, and domain-specific information.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generation&lt;/strong&gt; handles the &lt;em&gt;language&lt;/em&gt;. It takes those facts and turns them into clear, human-like answers.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By joining these two pieces, RAG transforms LLMs from general-purpose text generators into &lt;strong&gt;practical, reliable, and customizable assistants&lt;/strong&gt; that can work with your unique data, stay current, and reduce hallucinations.  &lt;/p&gt;

&lt;p&gt;This makes RAG one of the most important building blocks in applied AI today.  &lt;/p&gt;

&lt;p&gt;✨ The &lt;strong&gt;next post&lt;/strong&gt; will go one step further, covering:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AI Agents&lt;/strong&gt; – LLMs that can take actions, not just generate answers.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic RAG&lt;/strong&gt; – where retrieval becomes part of a larger reasoning-and-action pipeline.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RAGAS (Retrieval-Augmented Generation Assessment Suite)&lt;/strong&gt; – tools and techniques for evaluating the quality and reliability of RAG systems.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stay tuned and happy coding!!! 🚀  &lt;/p&gt;

</description>
      <category>rag</category>
      <category>llm</category>
      <category>beginners</category>
      <category>ai</category>
    </item>
    <item>
      <title>How To Use LLMs: Fine-Tuning GPT-4</title>
      <dc:creator>Odinaka Joy</dc:creator>
      <pubDate>Tue, 12 Aug 2025 08:30:36 +0000</pubDate>
      <link>https://forem.com/dinakajoy/fine-tuning-gpt-4-customizing-llms-to-fit-our-unique-need-145h</link>
      <guid>https://forem.com/dinakajoy/fine-tuning-gpt-4-customizing-llms-to-fit-our-unique-need-145h</guid>
      <description>&lt;p&gt;LLMs are very good at generating responses to user's queries out of the box, but these are general responses coming from the general training data fed to them. &lt;/p&gt;

&lt;p&gt;But with fine-tuning, you teach LLMs to speak your language, follow your rules, and deliver answers that are custom-built for your needs. You get to tailor LLM responses to your tone, jargon, or exact way of doing things.&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 What is Fine-Tuning
&lt;/h2&gt;

&lt;p&gt;Fine-tuning means taking a base model (GPT-4 in this case), and training it further with your own examples so it learns to follow your patterns, formats, or tone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The difference from prompting:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Prompting: You tell the model exactly what you want every single time.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Fine-tuning: You teach the model what you want once, and it remembers that style or behavior forever (until you retrain it).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When To fine-tune:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You want a consistent tone or voice in all responses.&lt;/li&gt;
&lt;li&gt;You want responses tailored to a domain with special jargon (legal, medical, fintech, etc.).&lt;/li&gt;
&lt;li&gt;You need highly structured outputs every time.&lt;/li&gt;
&lt;li&gt;You want shorter prompts and faster responses for repeated tasks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When NOT To fine-tune:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your information changes frequently (product prices, live news).&lt;/li&gt;
&lt;li&gt;You only need small, one-off adjustments.&lt;/li&gt;
&lt;li&gt;You want to add knowledge or facts. For that, use a &lt;a href="https://dev.to/dinakajoy/how-to-use-llms-retrieval-augmented-generation-rag-systems-2dmm"&gt;Retrieval-Augmented Generation (RAG)&lt;/a&gt; setup instead.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  📌 How Fine-Tuning Works
&lt;/h3&gt;

&lt;p&gt;Fine-tuning is like taking a model that has already read the whole internet (that’s the &lt;strong&gt;pre-training&lt;/strong&gt; stage) and then giving it extra &lt;code&gt;special lessons&lt;/code&gt; so it responds a certain way you want.&lt;/p&gt;

&lt;p&gt;Here’s the flow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pre-training data&lt;/strong&gt; - Massive amounts of general text (books, websites, articles) used to train an LLM.&lt;/li&gt;
&lt;li&gt;This first training produces a &lt;strong&gt;Base LLM&lt;/strong&gt; (like GPT-4): a generalist that knows everything in theory.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fine-tuning data&lt;/strong&gt; - Carefully prepared examples to teach a base model your tone, format, and special rules.&lt;/li&gt;
&lt;li&gt;This second training produces a &lt;strong&gt;Fine-tuned LLM&lt;/strong&gt;: in this case, the same GPT-4 brain, but with your custom behavior layered on top.&lt;/li&gt;
&lt;li&gt;You can send &lt;strong&gt;Prompts&lt;/strong&gt; to the fine-tuned model, and it gives an &lt;strong&gt;Output&lt;/strong&gt; that matches your style without needing long instructions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7y2hxkqr8fsah5kskkqp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7y2hxkqr8fsah5kskkqp.png" alt="Fine-tuning flow" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  📌 Performance versus Investments
&lt;/h3&gt;

&lt;p&gt;Before considering fine-tuning, know that there are other ways to use LLMs. Each method has its own advantages and tradeoffs; you decide based on your needs and goals. &lt;br&gt;
Here’s the &lt;strong&gt;Performance&lt;/strong&gt; vs &lt;strong&gt;Investment&lt;/strong&gt; chart for some of these methods:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzegqbvtf6a1v9ze8mefn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzegqbvtf6a1v9ze8mefn.png" alt="Performance vs Investment chart" width="800" height="587"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prompting&lt;/strong&gt; requires the lowest investment but has low performance. It yields very generic responses.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One-shot/Few-shot Prompting&lt;/strong&gt; is slightly better but still requires low investment. Giving examples can improve the model, but it still relies on you to provide the right ones each time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fine-tuning&lt;/strong&gt; requires a much higher investment but delivers a large performance jump, because the model is trained specifically on your preferences.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pre-training&lt;/strong&gt; requires the most investment. It builds the base model which can be fine-tuned. &lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  📌 Step-by-Step GPT-4 Fine-Tuning Process
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Data&lt;/strong&gt; is the foundation of fine-tuning. Your dataset is the heart of your model. It's what gives it that unique knowledge and voice. &lt;/p&gt;
&lt;h3&gt;
  
  
  1. ✍️ Get Your OpenAI API Key and Setup OpenAI
&lt;/h3&gt;

&lt;p&gt;I used Google Colab for this walkthrough, so here is how I set it up&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from google.colab import userdata
api_key = userdata.get('openai_api')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from openai import OpenAI

# Connect to the OpenAI api
openai = OpenAI(api_key=api_key)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  2. ✍️ Prepare Your Dataset
&lt;/h3&gt;

&lt;p&gt;To effectively train a model, you must provide examples of desired interactions and organize the datasets into these three key parts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;System Prompt:&lt;/strong&gt; This defines the guidelines for every response. It goes in the system role of the interaction cycle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User Prompt:&lt;/strong&gt; This is what the user asks to trigger a response. It is appended to the user role of the interaction cycle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Assistant Response:&lt;/strong&gt; This is the response that we expect the model to learn how to generate based on the &lt;strong&gt;system&lt;/strong&gt; and &lt;strong&gt;user&lt;/strong&gt; prompt. This takes the assistant role of the interaction cycle because LLMs are assistants.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"system"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"You are a helpful travel assistant."&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Best time to visit Japan?"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"assistant"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The best time to visit Japan is spring (March–May) or autumn (September–November)."&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These are put in a JSONL structure. Each line of the JSONL data represents a full interaction cycle with an LLM interface. You need many of these interaction cycles as example data to train the &lt;strong&gt;Base LLM&lt;/strong&gt; and produce a &lt;strong&gt;Fine-Tuned LLM&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tips for dataset quality:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use clear, consistent formatting.&lt;/li&gt;
&lt;li&gt;Avoid typos or mixed instructions.&lt;/li&gt;
&lt;li&gt;Include hundreds to thousands of diverse examples for better results.&lt;/li&gt;
&lt;/ul&gt;
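&lt;p&gt;A quick sanity check helps catch formatting mistakes before you upload anything. The helper below is a minimal sketch (the function name is my own) that verifies each JSONL line parses and follows the system/user/assistant structure shown above:&lt;/p&gt;

```python
import json

REQUIRED_ROLES = ["system", "user", "assistant"]

def validate_example(line):
    """Check that one JSONL line parses and has the expected role sequence."""
    record = json.loads(line)
    roles = [m["role"] for m in record["messages"]]
    return roles == REQUIRED_ROLES

line = json.dumps({"messages": [
    {"role": "system", "content": "You are a helpful travel assistant."},
    {"role": "user", "content": "Best time to visit Japan?"},
    {"role": "assistant", "content": "Spring or autumn."},
]})
print(validate_example(line))  # → True
```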

&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  3. ✍️ Split The Dataset For Training and Validation
&lt;/h3&gt;

&lt;p&gt;You can generate the dataset with LLMs (so-called &lt;code&gt;synthetic data&lt;/code&gt;) or prepare it manually. &lt;br&gt;
Either way, you need to shuffle the examples and split them into two files.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;For training - the training set can be about 80% of the data. &lt;/li&gt;
&lt;li&gt;For validation - the validation set can be the remaining 20%.&lt;/li&gt;
&lt;li&gt;Save each dataset as a &lt;code&gt;.jsonl&lt;/code&gt; file

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;train_data.jsonl&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;validation_data.jsonl&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
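&lt;p&gt;The shuffle-and-split step can be scripted in a few lines. This is a minimal sketch with hypothetical helper names; the 80/20 ratio and file names match the ones above:&lt;/p&gt;

```python
import json
import random

def split_dataset(examples, train_ratio=0.8, seed=42):
    """Shuffle, then split examples into training and validation sets."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def write_jsonl(path, examples):
    """Write one JSON object per line, the JSONL format OpenAI expects."""
    with open(path, "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

examples = [{"messages": [{"role": "user", "content": f"q{i}"}]} for i in range(10)]
train, val = split_dataset(examples)
write_jsonl("train_data.jsonl", train)
write_jsonl("validation_data.jsonl", val)
print(len(train), len(val))  # → 8 2
```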

&lt;p&gt; &lt;/p&gt;
&lt;h3&gt;
  
  
  4. ✍️ Upload The Datasets To OpenAI
&lt;/h3&gt;

&lt;p&gt;Fine-tuning doesn't happen locally or on your own machines.&lt;br&gt;
You upload your data because fine-tuning occurs on OpenAI’s servers. Once uploaded, your data stays private and under your control.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def upload_file(filename: str, purpose: str) -&amp;gt; str:
  with open(filename, "rb") as file:
    response = openai.files.create(file=file, purpose=purpose)
  return response.id

train_file_id = upload_file("train_data.jsonl", "fine-tune")
validation_file_id = upload_file("validation_data.jsonl", "fine-tune")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  5. ✍️ Create the Fine-Tune Job
&lt;/h3&gt;

&lt;p&gt;Creating the Fine-Tune Job is the &lt;code&gt;launch training&lt;/code&gt; step. It is when you connect your uploaded dataset with a base GPT-4 model and tell OpenAI to start customizing it for you.&lt;br&gt;
This takes some time to complete. Check your OpenAI dashboard to confirm the status.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MODEL = "&amp;lt;check_for_model_on_openai_dashboard"
response = openai.fine_tuning.jobs.create(
    training_file=train_file_id,
    validation_file=validation_file_id,
    model=MODEL,
    suffix="travel-model"
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Retrieve the Fine-Tuned Model ID&lt;/strong&gt;&lt;br&gt;
When your fine-tuning job finishes, OpenAI returns a model ID that you will use when making API calls.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tuned_model_id = openai.fine_tuning.jobs.retrieve(response.id).fine_tuned_model
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
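&lt;p&gt;Rather than refreshing the dashboard, you can poll the job from code. This is a hedged sketch: the terminal status strings follow OpenAI's documented job states, and &lt;code&gt;get_status&lt;/code&gt; is injected so the demo below runs without an API call. With the SDK it would be something like &lt;code&gt;lambda: openai.fine_tuning.jobs.retrieve(response.id).status&lt;/code&gt;.&lt;/p&gt;

```python
import time

def wait_for_job(get_status, poll_seconds=30, max_polls=120):
    """Poll a status callable until the fine-tune job reaches a terminal state."""
    for _ in range(max_polls):
        status = get_status()
        if status in ("succeeded", "failed", "cancelled"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("fine-tune job still running after max_polls")

# Demo with a fake fetcher standing in for the API call:
states = iter(["validating_files", "running", "succeeded"])
print(wait_for_job(lambda: next(states), poll_seconds=0))  # → succeeded
```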



&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  6. ✍️ Use Your Fine-Tuned Model
&lt;/h3&gt;

&lt;p&gt;Once your fine-tuned model is ready and you have the fine-tuned model ID, using it is just like using any other OpenAI model. Just swap the model value in your API call.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Define the system prompt
system_prompt = "You are a helpful travel assistant."

# Define a user prompt
user_prompt = "Best time to visit France?"

# Define the Messages
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]

response = openai.chat.completions.create(
    model=tuned_model_id,
    messages=messages,
    temperature=1.1
)

# Print the assistant's response
print(response.choices[0].message.content)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  📌 Evaluation of a Fine-Tuned Model
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Why Evaluate?&lt;/strong&gt;&lt;br&gt;
Evaluation ensures your fine-tuned model meets your goals. Without evaluation, you risk deploying a model that’s inaccurate, inconsistent, or overfitted.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Evaluation Metrics
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Qualitative Metrics&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Assess tone, style, clarity, and factual accuracy by reading outputs.&lt;/li&gt;
&lt;li&gt;Check if responses align with your brand voice or application needs.&lt;/li&gt;
&lt;li&gt;Identify edge cases where the model fails or strays from requirements.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Quantitative Metrics&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Training loss measures how well the model fits the training data.&lt;/li&gt;
&lt;li&gt;Validation loss measures performance on unseen data to detect overfitting. Low values that stay close to the training loss are generally a good sign.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Test Prompts for Qualitative Analysis
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Prepare a set of realistic user queries.&lt;/li&gt;
&lt;li&gt;Include typical usage prompts, edge cases, and out-of-domain prompts.&lt;/li&gt;
&lt;li&gt;Compare results with the base model and your desired behavior.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Iterative Improvements
&lt;/h3&gt;

&lt;p&gt;If evaluation reveals weaknesses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add more diverse or similar training data.&lt;/li&gt;
&lt;li&gt;Adjust system and user prompts to clarify intent.&lt;/li&gt;
&lt;li&gt;Tune temperature (lower for consistency, higher for creativity).&lt;/li&gt;
&lt;li&gt;Repeat fine-tuning with updated datasets and parameters.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. Overcoming Overfitting and Poor Output Quality
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Keep datasets balanced and not overly repetitive.&lt;/li&gt;
&lt;li&gt;Include a validation set during fine-tuning.&lt;/li&gt;
&lt;li&gt;Watch for validation loss rising while training loss falls. That's a sign of overfitting.&lt;/li&gt;
&lt;li&gt;Mix in general-purpose prompts alongside domain-specific examples to preserve versatility.&lt;/li&gt;
&lt;/ul&gt;
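&lt;p&gt;The loss-curve check above is easy to automate. Here is a minimal sketch (helper name is my own) that flags the classic overfitting pattern, training loss still falling while validation loss rises, over the last few recorded epochs:&lt;/p&gt;

```python
def looks_overfit(train_losses, val_losses, window=3):
    """Flag overfitting: training loss falling while validation loss rises."""
    t, v = train_losses[-window:], val_losses[-window:]
    train_falling = all(a > b for a, b in zip(t, t[1:]))
    val_rising = all(a < b for a, b in zip(v, v[1:]))
    return train_falling and val_rising

print(looks_overfit([1.0, 0.8, 0.6], [0.9, 0.95, 1.1]))  # → True
print(looks_overfit([1.0, 0.8, 0.6], [0.9, 0.85, 0.8]))  # → False
```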




&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;The fine-tuning toolkit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Start with quality data&lt;/li&gt;
&lt;li&gt;Structure thoughtfully&lt;/li&gt;
&lt;li&gt;Evaluate and iterate&lt;/li&gt;
&lt;li&gt;Improve with feedback&lt;/li&gt;
&lt;li&gt;Know when you are ready&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is a &lt;a href="https://github.com/dinakajoy/fine-tune-gpt-4/blob/main/fine_tuning.ipynb" rel="noopener noreferrer"&gt;GitHub notebook&lt;/a&gt; where I trained GPT-4 to write LinkedIn posts in the style and tone I want. It provides the full practical workflow, including how to generate synthetic data for training and validation.&lt;/p&gt;

&lt;p&gt;Happy coding!!!&lt;/p&gt;

</description>
      <category>finetuning</category>
      <category>llm</category>
      <category>openai</category>
      <category>beginners</category>
    </item>
    <item>
      <title>How To Use LLMs: Prompt Engineering - A Practical Guide for Beginners</title>
      <dc:creator>Odinaka Joy</dc:creator>
      <pubDate>Thu, 24 Jul 2025 13:29:25 +0000</pubDate>
      <link>https://forem.com/dinakajoy/prompt-engineering-how-to-talk-to-llms-so-they-work-better-16d4</link>
      <guid>https://forem.com/dinakajoy/prompt-engineering-how-to-talk-to-llms-so-they-work-better-16d4</guid>
      <description>&lt;p&gt;In &lt;a href="https://dev.to/dinakajoy/a-beginners-guide-to-llms-how-to-use-language-models-to-build-smart-apps-2mkk"&gt;this article&lt;/a&gt;, I explained what LLMs are and how to use them to build smart applications.&lt;br&gt;
But just using an LLM isn’t enough. We need to &lt;strong&gt;communicate&lt;/strong&gt; with it clearly and strategically. &lt;/p&gt;

&lt;p&gt;That’s where &lt;strong&gt;Prompt Engineering&lt;/strong&gt; comes in.&lt;/p&gt;


&lt;h2&gt;
  
  
  💡 What is Prompt Engineering?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Prompts&lt;/strong&gt; are instructions and context (clear and structured inputs) provided to a language model for a certain task. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt engineering&lt;/strong&gt; is the practice of crafting and refining prompts so a language model can generate outputs that are useful, accurate, and relevant.&lt;/p&gt;

&lt;p&gt;These LLMs are very powerful assistants, but they need smart instructions to produce quality results: the quality of what we get depends on &lt;strong&gt;how we ask&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it matters:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It saves time and frustration&lt;/li&gt;
&lt;li&gt;It reduces irrelevant or wrong answers&lt;/li&gt;
&lt;li&gt;It unlocks advanced LLM capabilities like reasoning, coding, creativity&lt;/li&gt;
&lt;li&gt;It is an essential skill for developers building AI-powered apps 😊&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;
  
  
  📌 Understanding How LLMs Respond to Prompts
&lt;/h3&gt;

&lt;p&gt;Before writing prompts, know that LLMs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Predict the next token based on probability&lt;/strong&gt; - They don’t &lt;code&gt;understand&lt;/code&gt; like humans. They pattern-match words to generate content.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rely heavily on context&lt;/strong&gt; - The quality of your input determines the quality of its output. So, &lt;strong&gt;Better prompts = Better outputs.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don’t know your intent unless you tell them&lt;/strong&gt; - LLMs don’t read our minds, ambiguity leads to confusion. If you are vague, the model will be vague 😄.&lt;/li&gt;
&lt;/ol&gt;


&lt;h3&gt;
  
  
  📌 Hierarchy of Instructions
&lt;/h3&gt;

&lt;p&gt;When you give an LLM instructions, they are not all treated equally. There’s a hierarchy that decides which rules the model follows first and which get ignored when they conflict.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. System Instructions (Highest Priority)&lt;/strong&gt; - Set by the model provider (e.g., OpenAI, Anthropic) and invisible to the user. These define core behavior, safety rules, and identity, and cannot be overridden.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Developer Instructions&lt;/strong&gt; - Set by the app developer through the API or integration. These control tone, style, and behavior for a specific app.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. User Instructions&lt;/strong&gt; – Direct requests from the person interacting with the model. These can override some developer rules but never system rules.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Contextual/Embedded Instructions (Lowest Priority)&lt;/strong&gt; – Found in documents, chat history, or examples. These are the weakest in priority and easily overridden.&lt;/p&gt;
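&lt;p&gt;In code, this hierarchy maps onto the &lt;code&gt;messages&lt;/code&gt; array you send to a chat API. A minimal sketch (the provider's own hidden system prompt sits above everything shown here; the contents are illustrative):&lt;/p&gt;

```javascript
// Instruction hierarchy, sketched as a chat "messages" array.
// The model provider's hidden system prompt outranks all of these.
const messages = [
  // Developer instructions: set by the app, shape tone and behavior
  { role: "system", content: "You are a concise, friendly cooking assistant." },
  // User instructions: the person's direct request
  { role: "user", content: "Give me a 3-step pasta recipe." },
];

// Later user turns can refine earlier ones,
// but none of them can override the provider's core rules.
function rolesInPriorityOrder(msgs) {
  return msgs.map(function (m) { return m.role; });
}
```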


&lt;h3&gt;
  
  
  📌 Core Components of a Good Prompt
&lt;/h3&gt;

&lt;p&gt;A good prompt often has 3 parts:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Role/Context&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Tell the model &lt;em&gt;who it is&lt;/em&gt; or &lt;em&gt;what perspective to take&lt;/em&gt;
&lt;/td&gt;
&lt;td&gt;"You are a professional backend engineer…"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Task/Goal&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The exact thing you want done&lt;/td&gt;
&lt;td&gt;"Explain microservices in simple terms."&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Format/Constraints&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;How you want the output delivered&lt;/td&gt;
&lt;td&gt;"Use bullet points, under 200 words."&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You are a career coach with 10 years experience.  
Explain to a fresh graduate how to prepare for a software engineering interview.  
Give me 5 bullet points and a short motivational ending.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  📌 Basic Prompting Techniques (With Examples)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Prompting techniques&lt;/strong&gt;  are styles or strategies for writing prompts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Zero-shot prompting&lt;/strong&gt;&lt;br&gt;
Ask the model to perform a task with &lt;strong&gt;no example&lt;/strong&gt;, just instructions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Translate this sentence to French: "I love programming."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use this when the task is simple and clear.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. One-shot prompting&lt;/strong&gt;&lt;br&gt;
Give &lt;strong&gt;one example&lt;/strong&gt; before asking the model to perform the same task again.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Translate this sentence to French:
English: "I love cats."
French: "J'aime les chats."

English: "I love programming."
French:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is good for moderately complex tasks where one example helps show the pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Few-shot prompting&lt;/strong&gt;&lt;br&gt;
Provide a &lt;strong&gt;few examples&lt;/strong&gt; to help the model understand the expected format or logic. Between 5 and 8 examples is ideal, according to research.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Convert the following to a formal business email tone:

Casual: "Need the report by tomorrow."
Formal: "Kindly ensure the report is ready by tomorrow."

Casual: "Can't make the meeting."
Formal: "Unfortunately, I won’t be able to attend the meeting."

Casual: "What's the update on the task?"
Formal:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is great when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We need consistency.&lt;/li&gt;
&lt;li&gt;The task involves writing style.&lt;/li&gt;
&lt;li&gt;We want the model to follow a specific structure.&lt;/li&gt;
&lt;/ul&gt;
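&lt;p&gt;A few-shot prompt like the one above can also be assembled programmatically. Here is a small sketch; &lt;code&gt;buildFewShotPrompt&lt;/code&gt; is my own helper, not a library function:&lt;/p&gt;

```javascript
// Assemble a few-shot prompt: instruction, worked examples, then the new input.
function buildFewShotPrompt(instruction, examples, input) {
  const shots = examples.map(function (ex) {
    return "Casual: \"" + ex.casual + "\"\nFormal: \"" + ex.formal + "\"";
  });
  return [
    instruction,
    shots.join("\n\n"),
    "Casual: \"" + input + "\"\nFormal:",
  ].join("\n\n");
}

const prompt = buildFewShotPrompt(
  "Convert the following to a formal business email tone:",
  [
    { casual: "Need the report by tomorrow.", formal: "Kindly ensure the report is ready by tomorrow." },
    { casual: "Can't make the meeting.", formal: "Unfortunately, I won't be able to attend the meeting." },
  ],
  "What's the update on the task?"
);
```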

&lt;p&gt;&lt;strong&gt;4. Chain-of-thought prompting (CoT)&lt;/strong&gt;&lt;br&gt;
Ask the model to follow &lt;strong&gt;reasoning steps&lt;/strong&gt; before answering by adding &lt;strong&gt;think step by step&lt;/strong&gt; to the prompt. This is Zero-shot CoT.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Question: If Sarah has 3 apples and buys 4 more, then gives 2 to her friend, how many apples does she have?

Answer: Let's think step by step.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are also one-shot and few-shot variants of CoT prompting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Example:
Q: If there are 10 cookies and you eat 3, how many are left?
A: Let's think step by step: 10 - 3 = 7. Final answer: 7.

# You can add more examples for few-shot

Now solve:
Q: If there are 12 apples and you give away 4, how many are left?
A: Let's think step by step:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;CoT is best for tasks that require reasoning, calculation, or logic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Role prompting&lt;/strong&gt;&lt;br&gt;
Give the model a role or identity to respond from.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You are a senior software engineer. Explain the difference between GraphQL and REST to a junior developer.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is perfect for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Customer support bots&lt;/li&gt;
&lt;li&gt;Teaching/educational apps&lt;/li&gt;
&lt;li&gt;Task-specific assistants like lawyer, doctor, manager&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are more advanced prompting techniques, and many more emerging as research continues, but I will cover those in a separate post.&lt;/p&gt;




&lt;h3&gt;
  
  
  📌 Practical Tips for Better Prompts
&lt;/h3&gt;

&lt;p&gt;1️⃣ Be &lt;strong&gt;clear and specific&lt;/strong&gt;, not vague.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;❌ "Tell me about AI."&lt;/li&gt;
&lt;li&gt;✅ "Explain AI in under 150 words for a 10-year-old."
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;2️⃣ &lt;strong&gt;Break down complex requests&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;❌ "Write me a business plan for a bakery"&lt;/li&gt;
&lt;li&gt;✅ "List 5 business model options for a bakery" 👉🏼 "Write an executive summary for model #3"
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;3️⃣ &lt;strong&gt;Use iteration:&lt;/strong&gt; Your first prompt is rarely perfect. Tweak, re-run, and refine.  &lt;/p&gt;

&lt;p&gt;4️⃣ &lt;strong&gt;Set output boundaries&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Word count (under 150 words)&lt;/li&gt;
&lt;li&gt;Style (formal, casual, humorous)&lt;/li&gt;
&lt;li&gt;Language tone (beginner-friendly, expert-level)
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;5️⃣ Use bullet points or steps if possible.&lt;br&gt;&lt;br&gt;
6️⃣ Provide examples if the task has a pattern.&lt;br&gt;&lt;br&gt;
7️⃣ Use delimiters like &lt;code&gt;"""&lt;/code&gt; to separate instructions from data.&lt;br&gt;&lt;br&gt;
8️⃣ Use XML tags like &lt;code&gt;&amp;lt;article&amp;gt;...&amp;lt;/article&amp;gt;&lt;/code&gt; to group data within the instruction.&lt;/p&gt;
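&lt;p&gt;Tip 7 is easy to apply in code. A minimal sketch of separating instructions from data with triple-quote delimiters (&lt;code&gt;wrapWithDelimiters&lt;/code&gt; is illustrative, not a standard API):&lt;/p&gt;

```javascript
// Separate the instruction from variable data with a delimiter,
// so the model does not mistake the data for more instructions.
function wrapWithDelimiters(instruction, data) {
  return instruction + '\n\n"""\n' + data + '\n"""';
}

const prompt = wrapWithDelimiters(
  "Summarize the article below in 3 bullet points.",
  "LLMs predict the next token based on context..."
);
```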


&lt;h3&gt;
  
  
  📌 How to Choose the Right Prompting Technique
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Suggested Prompt Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Simple data transformation&lt;/td&gt;
&lt;td&gt;Zero-shot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Text classification&lt;/td&gt;
&lt;td&gt;Few-shot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reasoning tasks&lt;/td&gt;
&lt;td&gt;Chain-of-thought&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Needs personality or tone&lt;/td&gt;
&lt;td&gt;Role prompt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;New use cases, no examples&lt;/td&gt;
&lt;td&gt;Zero-shot + Instructions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Task where examples help&lt;/td&gt;
&lt;td&gt;Few-shot or One-shot&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
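&lt;p&gt;In an app that supports several prompt styles, the table above can be captured as a small lookup. The category names and helper below are my own framing:&lt;/p&gt;

```javascript
// Map a use case to a suggested prompting technique, per the table above.
const TECHNIQUE_BY_USE_CASE = {
  "simple transformation": "zero-shot",
  "text classification": "few-shot",
  "reasoning": "chain-of-thought",
  "needs personality": "role prompt",
};

function suggestTechnique(useCase) {
  // Fall back to plain instructions for new use cases with no examples.
  return TECHNIQUE_BY_USE_CASE[useCase] || "zero-shot + instructions";
}
```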

&lt;h3&gt;
  
  
  📌 Prompt Engineering in Real Projects
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chatbots:&lt;/strong&gt; Role prompts + output format for consistent replies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content Generation:&lt;/strong&gt; Few-shot prompts for tone consistency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code Assistants:&lt;/strong&gt; Chain-of-thought for debugging explanations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Extraction:&lt;/strong&gt; Instruction-based prompts returning JSON&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;
  
  
  📌 Real Example: Job Description Analyzer
&lt;/h3&gt;

&lt;p&gt;I built a project called &lt;a href="https://job-application-assistant.vercel.app" rel="noopener noreferrer"&gt;Job Application Assistant&lt;/a&gt;, which helps users understand and respond to job listings. Before I integrated Function Calling, I used &lt;strong&gt;Prompting techniques&lt;/strong&gt; with the OpenAI API to extract structured data from job descriptions.&lt;/p&gt;

&lt;p&gt;Here’s how I did it using a combination of &lt;strong&gt;Few-shot and Role-based&lt;/strong&gt; prompting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const jobDescriptionExample = "We need a frontend developer skilled in React, JavaScript, and TailwindCSS. You will build UIs and collaborate with backend teams. 2+ years experience required.";

const response = await openai.chat.completions.create({
  model: "gpt-4",
  messages: [
    {
      role: "system",
      content:
        "You are an AI assistant that extracts key skills, responsibilities, and experience from job descriptions.",
    },
    {
      role: "user",
      content: `Extract the following from this job description:\n
        1. Required Skills  
        2. Responsibilities  
        3. Required Experience\n\n${jobDescriptionExample}`,
    },
  ],
  max_tokens: 200,
});

return response.choices[0]?.message?.content?.trim() || "";
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sample output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Skills: React, JavaScript, TailwindCSS  
Responsibilities: Build UIs, collaborate with backend  
Experience: 2+ years
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  📌 Sample projects that illustrate Prompt Engineering
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://github.com/dinakajoy/soul-sync/blob/main/soul-sync-be/src/controllers.ts" rel="noopener noreferrer"&gt;Soul Sync&lt;/a&gt; - A safe space where users can check in emotionally, express themselves, and receive gentle, AI-powered guidance that helps them reconnect with their inner self.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/dinakajoy/TheraBot/blob/main/src/pages/api/chat.ts#L62" rel="noopener noreferrer"&gt;Therabot&lt;/a&gt; - A web app where users can chat with an AI-powered therapist for emotional support.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I also use &lt;a href="https://cloud.google.com/discover/what-is-prompt-engineering" rel="noopener noreferrer"&gt;this guide&lt;/a&gt; when prompting LLMs.&lt;/p&gt;

&lt;p&gt;Prompting is about clear communication, iteration, and testing.&lt;/p&gt;

&lt;p&gt;The more intentional your prompt, the more reliable your LLM becomes.&lt;/p&gt;

&lt;p&gt;Happy coding!!!&lt;/p&gt;

</description>
      <category>llm</category>
      <category>promptengineering</category>
      <category>beginners</category>
    </item>
    <item>
      <title>How To Use LLMs: Tool Use/Function Call with OpenAI</title>
      <dc:creator>Odinaka Joy</dc:creator>
      <pubDate>Tue, 22 Jul 2025 16:47:34 +0000</pubDate>
      <link>https://forem.com/dinakajoy/build-ai-that-does-more-using-function-calling-with-openai-i0g</link>
      <guid>https://forem.com/dinakajoy/build-ai-that-does-more-using-function-calling-with-openai-i0g</guid>
      <description>&lt;p&gt;Large Language Models (LLMs) can generate human-like text, but what if you want your LLM-powered app to do more than chat? Like extract structured data, trigger logic, or interact with databases/APIs?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tool Use/Function Calling&lt;/strong&gt; helps our LLMs do more than generate responses based on trained data.&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 What Is Tool Use/Function Call in LLMs?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Function call/Tool use&lt;/strong&gt; is the pattern where the LLM decides &lt;strong&gt;when&lt;/strong&gt; and &lt;strong&gt;how&lt;/strong&gt; to invoke external capabilities (APIs, DB queries, search, calculators, code runtimes, and more) by returning a structured &lt;code&gt;call&lt;/code&gt;. &lt;br&gt;
Your application executes that call, returns the result to the model, and the model produces the final user response.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why use it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;LLMs are smart, but they have limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;They hallucinate&lt;/li&gt;
&lt;li&gt;They don’t fetch real-time data&lt;/li&gt;
&lt;li&gt;They can’t execute backend logic directly&lt;/li&gt;
&lt;li&gt;They return freeform text, which is sometimes hard to parse&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Using tools solves this:&lt;/p&gt;

&lt;p&gt;✅ Return structured outputs (like JSON)&lt;br&gt;
✅ Fetch real-time information&lt;br&gt;
✅ Integrate with APIs or your database&lt;br&gt;
✅ Run backend logic (math, validation, scheduling, etc.)&lt;br&gt;
✅ Trigger workflows or APIs&lt;/p&gt;
&lt;h3&gt;
  
  
  📌 A currency converter &lt;code&gt;tool use&lt;/code&gt; example
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Tool Definition&lt;/strong&gt;&lt;br&gt;
Define a &lt;code&gt;function&lt;/code&gt; that converts currency:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const convert_currency = {
  "name": "convert_currency",
  "description": "Converts an amount from one currency to another",
  "parameters": {
    "type": "object",
    "properties": {
      "amount": { "type": "number" },
      "from": { "type": "string", "description": "Currency code, e.g., USD" },
      "to": { "type": "string", "description": "Currency code, e.g., EUR" }
    },
    "required": ["amount", "from", "to"]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. User Prompt&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"How much is 100 dollars in euros?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;3. What Happens&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The LLM understands the request&lt;/li&gt;
&lt;li&gt;Calls the &lt;code&gt;convert_currency&lt;/code&gt; tool with:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "amount": 100,
  "from": "USD",
  "to": "EUR"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Tool returns: &lt;code&gt;91.23 EUR&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;LLM responds:&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;100 USD is approximately 91.23 EUR.&lt;/p&gt;
&lt;/blockquote&gt;
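&lt;p&gt;On the application side, the flow above looks roughly like this. The exchange rate and dispatch table are illustrative; a real app would fetch live rates:&lt;/p&gt;

```javascript
// App-side dispatch: the model returns a structured call; your code executes it.
const RATES = { "USD:EUR": 0.9123 }; // illustrative fixed rate, not live data

function convertCurrency(args) {
  const rate = RATES[args.from + ":" + args.to];
  return (args.amount * rate).toFixed(2) + " " + args.to;
}

const tools = { convert_currency: convertCurrency };

// Simulated tool call, shaped like the model's output (arguments arrive as a JSON string)
const toolCall = {
  name: "convert_currency",
  arguments: '{"amount": 100, "from": "USD", "to": "EUR"}',
};

const result = tools[toolCall.name](JSON.parse(toolCall.arguments));
// "result" is then sent back to the model, which phrases the final answer
```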




&lt;h3&gt;
  
  
  📌 &lt;code&gt;Function Calling&lt;/code&gt; With OpenAI: Job Description Analyzer
&lt;/h3&gt;

&lt;p&gt;In &lt;a href="https://github.com/dinakajoy/job-application-assistant/blob/main/backend/src/services/job.services.ts#L24" rel="noopener noreferrer"&gt;Job Application Assistant&lt;/a&gt;, I used &lt;code&gt;Function Calling&lt;/code&gt; to extract job insights.&lt;/p&gt;

&lt;p&gt;The LLM pulls out from the job description:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Required skills&lt;/li&gt;
&lt;li&gt;Responsibilities&lt;/li&gt;
&lt;li&gt;Experience or qualifications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Define the Schema&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const jobInsightFunction = {
  name: "extract_job_insights",
  description: "Extracts skills, responsibilities, and experience from a job description.",
  parameters: {
    type: "object",
    properties: {
      skills: {
        type: "array",
        items: { type: "string" },
        description: "List of skills required for the job",
      },
      responsibilities: {
        type: "array",
        items: { type: "string" },
        description: "Job responsibilities",
      },
      experience: {
        type: "array",
        items: { type: "string" },
        description: "Qualifications or experience needed",
      },
    },
    required: ["skills", "responsibilities", "experience"],
  },
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 2: Call the Model with Tool&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const response = await openai.chat.completions.create({
  model: "gpt-4-0613",
  messages: [
    { role: "system", content: "You are a helpful AI job assistant." },
    {
      role: "user",
      content: `Extract the key skills, responsibilities, and required experience from the following job description:\n\n${jobDescription}`,
    },
  ],
  tools: [
    {
      type: "function",
      function: jobInsightFunction,
    },
  ],
  tool_choice: "auto",
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3: Get and Use the Arguments&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const toolCall = response.choices?.[0]?.message?.tool_calls?.[0];
const args = JSON.parse(toolCall?.function?.arguments ?? "{}");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;args = {
  skills: [...],
  responsibilities: [...],
  experience: [...],
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this output, I can:&lt;br&gt;
✅ Display in UI&lt;br&gt;
✅ Match with resumes&lt;br&gt;
✅ Generate cover letters&lt;/p&gt;




&lt;h3&gt;
  
  
  📌 Quick Tips
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Use clear schema definitions&lt;/li&gt;
&lt;li&gt;Validate the output&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;tool_choice: "auto"&lt;/code&gt; to let the model decide&lt;/li&gt;
&lt;li&gt;Chain tasks if needed: extract 👉🏼 reason 👉🏼 act&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Happy coding!!!&lt;/p&gt;

</description>
      <category>llm</category>
      <category>webdev</category>
      <category>openai</category>
    </item>
    <item>
      <title>A Beginner’s Guide to LLMs: How to Use Language Models to Build Smart Apps</title>
      <dc:creator>Odinaka Joy</dc:creator>
      <pubDate>Wed, 25 Jun 2025 09:30:49 +0000</pubDate>
      <link>https://forem.com/dinakajoy/a-beginners-guide-to-llms-how-to-use-language-models-to-build-smart-apps-2mkk</link>
      <guid>https://forem.com/dinakajoy/a-beginners-guide-to-llms-how-to-use-language-models-to-build-smart-apps-2mkk</guid>
      <description>&lt;p&gt;In my &lt;a href="https://dev.to/dinakajoy/a-beginners-note-on-natural-language-processing-key-takeaways-46ap"&gt;last post&lt;/a&gt;, we explored Natural Language Processing (NLP), the field of AI that helps machines understand human language.&lt;/p&gt;

&lt;p&gt;Today, we are taking it one step further with &lt;strong&gt;Large Language Models&lt;/strong&gt; (LLMs), the brains behind tools that can chat, write, generate code, and answer complex questions.&lt;br&gt;
LLMs make building smart, language-aware apps easier than ever, even without deep machine learning expertise 🤩.&lt;/p&gt;


&lt;h2&gt;
  
  
  💡 What is a Large Language Model (LLM)?
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;Large Language Model&lt;/strong&gt; is an AI system trained on massive amounts of text (books, websites, conversations, code) to understand and generate human-like language.&lt;/p&gt;

&lt;p&gt;Most modern LLMs are built on &lt;strong&gt;transformer architecture&lt;/strong&gt;, which makes them exceptionally good at understanding context and producing coherent text.&lt;/p&gt;

&lt;p&gt;LLMs can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Read and understand&lt;/strong&gt; natural language (like English, French, or even programming languages)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predict&lt;/strong&gt; what comes next in a sentence or conversation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate&lt;/strong&gt; text, code, and summaries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Translate&lt;/strong&gt; between languages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Summarize&lt;/strong&gt; long documents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Answer questions&lt;/strong&gt; and assist with research&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;
  
  
  📌 How Do Large Language Models (LLMs) Work?
&lt;/h3&gt;

&lt;p&gt;At the core, LLMs predict the next word in a sentence based on the context of the words before it. This simple idea (next-word-prediction) is what allows them to write emails, answer questions, generate code, and more.&lt;/p&gt;

&lt;p&gt;🔸 &lt;strong&gt;1. Training on Massive Text Data&lt;/strong&gt;&lt;br&gt;
LLMs are trained on huge datasets like books, websites, conversations, code, and more, to learn patterns in language. This helps them understand:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Grammar and syntax&lt;/li&gt;
&lt;li&gt;Facts and world knowledge&lt;/li&gt;
&lt;li&gt;How humans typically phrase things&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔸 &lt;strong&gt;2. Tokenization&lt;/strong&gt;&lt;br&gt;
Before feeding text into the model, it is broken down into smaller pieces called tokens (words or word parts).&lt;br&gt;
For example:&lt;br&gt;
&lt;code&gt;"I love coding"&lt;/code&gt; -&amp;gt; &lt;code&gt;["I", "love", "coding"]&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;🔸 &lt;strong&gt;3. Embeddings&lt;/strong&gt;&lt;br&gt;
Each token is converted into a vector (a list of numbers) that captures its meaning in context. These vectors are what the model works with.&lt;/p&gt;

&lt;p&gt;🔸 &lt;strong&gt;4. Transformer Architecture&lt;/strong&gt;&lt;br&gt;
This is the model architecture that powers LLMs. It uses something called &lt;code&gt;attention&lt;/code&gt; to focus on the most relevant words in a sentence.&lt;/p&gt;

&lt;p&gt;🔸 &lt;strong&gt;5. Next-Word Prediction&lt;/strong&gt;&lt;br&gt;
During training, the model learns to guess the next word in a sentence:&lt;br&gt;
Input: &lt;code&gt;The cat sat on the…&lt;/code&gt;&lt;br&gt;
Output: &lt;code&gt;mat&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;By doing this billions of times, it becomes very good at understanding and generating human-like language.&lt;/p&gt;

&lt;p&gt;🔸 &lt;strong&gt;6. Usage&lt;/strong&gt;&lt;br&gt;
Once trained, you can prompt the model with an input, and it will generate a coherent response based on everything it has learned.&lt;/p&gt;


&lt;h3&gt;
  
  
  📌 Types of Large Language Models (LLMs)
&lt;/h3&gt;

&lt;p&gt;LLMs differ in how they are built, what they are trained on, and how they are used.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;General-Purpose&lt;/strong&gt; – They do many things but need well-crafted prompts (GPT-3, LLaMA, Mistral).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Instruction-Tuned&lt;/strong&gt; – They follow natural instructions better (GPT-4, Claude, Gemini).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Open-Source&lt;/strong&gt; – You can self-host, customize, and control (LLaMA, Mistral, Falcon, BLOOM).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Proprietary&lt;/strong&gt; – You can only access via API because they are fully managed (GPT-4, Claude, Gemini).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain-Specific&lt;/strong&gt; – Fine-tuned for fields like law, medicine, or coding (Code LLaMA, StarCoder).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multilingual&lt;/strong&gt; – They work across many languages (BLOOM, XLM-R, mGPT).&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;
  
  
  📌 Where LLMs Are Used
&lt;/h3&gt;

&lt;p&gt;LLMs are powering real-world applications across industries. Here are just a few:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Chatbots &amp;amp; Virtual Assistants&lt;/strong&gt; &lt;br&gt;
LLMs enable natural conversations - customer service bots, AI therapists, HR assistants.&lt;br&gt;
💬 Example: ChatGPT, Claude, Replika&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Content Generation&lt;/strong&gt;&lt;br&gt;
LLMs help create quality text with little human effort - blog posts.&lt;br&gt;
📝 Example: Jasper AI, Notion AI, Copy.ai&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Knowledge Assistants &amp;amp; Question-Answer Systems&lt;/strong&gt;&lt;br&gt;
LLMs are used in education, legal, and healthcare to answer complex domain-specific questions.&lt;br&gt;
📚 Example: AI tutors, Legal search bots, Medical chatbots&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Text Summarization &amp;amp; Report Generation&lt;/strong&gt;&lt;br&gt;
Used in journalism, legal, and finance to turn long documents into clear summaries.&lt;br&gt;
📄 Example: Tools like Scribe, SummarizeBot&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Code Generation &amp;amp; Debugging&lt;/strong&gt;&lt;br&gt;
LLMs fine-tuned on code can generate, explain, and fix programming tasks.&lt;br&gt;
👨‍💻 Example: GitHub Copilot, Amazon CodeWhisperer&lt;/p&gt;


&lt;h3&gt;
  
  
  📌 Approaches to Use LLMs Effectively
&lt;/h3&gt;

&lt;p&gt;You don’t always need to train your own model to benefit from LLMs. &lt;strong&gt;Prompting&lt;/strong&gt; is the most basic and universal way to use them. &lt;/p&gt;

&lt;p&gt;Here are the main approaches developers and teams use today:&lt;/p&gt;

&lt;p&gt;🔸 &lt;strong&gt;1. Prompt Engineering&lt;/strong&gt;&lt;br&gt;
This is the fastest and easiest approach to get the best from LLMs. It is the process of crafting effective input (prompts) to guide the model’s output. This is the most common and beginner-friendly approach.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Example:
&lt;code&gt;Summarize this paragraph in 3 bullet points…&lt;/code&gt;,
&lt;code&gt;Translate this into French…&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔸 &lt;strong&gt;2. Function Calling/Tool Use&lt;/strong&gt;&lt;br&gt;
Let the LLM call specific tools (e.g., weather APIs, database queries) when needed. Most modern APIs support this (like OpenAI’s functions or tools).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Example: An AI chatbot that retrieves live stock prices or booking details.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔸 &lt;strong&gt;3. Fine-Tuning (Advanced)&lt;/strong&gt;&lt;br&gt;
This involves training an existing model on your own custom dataset to specialize it for your domain.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use this approach if the model isn’t performing well on specific tasks, or you need domain-specific responses (e.g., medical, legal, or company data).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔸 &lt;strong&gt;4. Retrieval-Augmented Generation (RAG)&lt;/strong&gt;&lt;br&gt;
Combine LLMs with your own knowledge base (e.g., company docs, PDF files). Instead of fine-tuning, the model retrieves relevant information before answering.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Tools: LangChain, LlamaIndex, Haystack&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use case: Building smart document assistants, internal search tools, etc.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;
  
  
  📌 How to Choose the Right LLM and Approach
&lt;/h3&gt;

&lt;p&gt;With so many LLMs and usage methods available, it's easy to feel overwhelmed. Here’s a simple way to decide what is best for your use case:&lt;/p&gt;

&lt;p&gt;🔸 &lt;strong&gt;1. Choose Based On Your Goal:&lt;/strong&gt; What do you want the model to do?&lt;br&gt;
Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Summarize tasks: Use models tuned for summarization (OpenAI GPT-4-turbo, Anthropic Claude 3)&lt;/li&gt;
&lt;li&gt;Build a chatbot: Use general-purpose conversational models (GPT-4/GPT-4-turbo, Gemini 1.5 Pro)&lt;/li&gt;
&lt;li&gt;Extract structured data: Use LLMs with structured prompting or Retrieval-Augmented Generation (RAG) (GPT-4-turbo with Function Calling, Command R+ by Cohere)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔸 &lt;strong&gt;2. Choose Based on Complexity&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Goal Type&lt;/th&gt;
&lt;th&gt;Recommended Approach&lt;/th&gt;
&lt;th&gt;Example Model&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Simple tasks (chat, Q&amp;amp;A)&lt;/td&gt;
&lt;td&gt;Prompting + API&lt;/td&gt;
&lt;td&gt;GPT-4, Claude, Gemini&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Domain-specific outputs&lt;/td&gt;
&lt;td&gt;Prompt Engineering or RAG&lt;/td&gt;
&lt;td&gt;GPT-4, Cohere Command-R&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Full control + offline use&lt;/td&gt;
&lt;td&gt;Fine-tune open-source models&lt;/td&gt;
&lt;td&gt;Mistral, LLaMA, BLOOM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code generation&lt;/td&gt;
&lt;td&gt;Use code-focused LLMs&lt;/td&gt;
&lt;td&gt;Code LLaMA, StarCoder&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;🔸 &lt;strong&gt;3. Pick Based on Resources&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Low Resources:&lt;/strong&gt; Use hosted APIs (OpenAI, Claude, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Medium Resources:&lt;/strong&gt; Use Hugging Face models locally&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High Resources:&lt;/strong&gt; Fine-tune open-source models with GPUs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔸 &lt;strong&gt;4. Consider Data Privacy &amp;amp; Ownership&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Working with sensitive/private data?&lt;/strong&gt;: Use open-source LLMs locally.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;General-purpose tasks?&lt;/strong&gt;: Hosted APIs are fast and convenient.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔸 &lt;strong&gt;5. Consider Your Skill Level&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Developer with no ML experience:&lt;/strong&gt; Prompting, RAG&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Developer with ML/AI experience:&lt;/strong&gt; Fine-tuning, Evaluation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Tips:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Start simple with prompting and hosted APIs.&lt;/li&gt;
&lt;li&gt;Move to RAG or fine-tuning when your app needs domain-specific behavior or more control.&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;
  
  
  📌 Why LLMs Matter for Developers
&lt;/h3&gt;

&lt;p&gt;LLMs let you &lt;strong&gt;add intelligence to your apps without training your own model&lt;/strong&gt;.&lt;br&gt;
Instead of writing hundreds of rules, you can simply describe what you want in plain English and let the model handle the complexity.&lt;/p&gt;
&lt;h3&gt;
  
  
  📌 How to Use LLMs in Your Apps
&lt;/h3&gt;

&lt;p&gt;🔸 &lt;strong&gt;1. Choose Your LLM Provider&lt;/strong&gt; - OpenAI API, Anthropic Claude API, Google Gemini API, Hugging Face Inference API&lt;br&gt;
🔸 &lt;strong&gt;2. Call the Model from Your Code&lt;/strong&gt; - An example using OpenAI’s API in Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write a 3-sentence summary of climate change.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;🔸 &lt;strong&gt;3. Give Good Prompts (Prompt Engineering)&lt;/strong&gt; - The way you ask matters.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Be specific: &lt;code&gt;Summarize this text in bullet points.&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Give examples: &lt;code&gt;Here’s a format I want: [Example]&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Set a role: &lt;code&gt;You are a senior software engineer who chooses simplicity over complexity.&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
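&lt;p&gt;The three tips above compose naturally into a small prompt-building helper. This is an illustrative sketch (the function name and structure are my own, not from any library): a role, a specific instruction, and an optional format example are joined into one prompt string.&lt;/p&gt;

```python
# Hypothetical helper that assembles a prompt from the three tips:
# set a role, be specific, and give a format example.
def build_prompt(role, instruction, example=None):
    parts = [f"You are {role}."]
    parts.append(instruction)
    if example:
        parts.append(f"Here is the format I want:\n{example}")
    return "\n\n".join(parts)

prompt = build_prompt(
    role="a senior software engineer who chooses simplicity over complexity",
    instruction="Summarize this text in bullet points.",
    example="- point one\n- point two",
)
print(prompt)
```

&lt;p&gt;Keeping prompt assembly in one place like this also makes it easy to iterate on wording without touching the rest of your app.&lt;/p&gt;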

&lt;p&gt;🔸 &lt;strong&gt;4. Add App Logic Around the Model&lt;/strong&gt; - LLMs aren’t apps on their own. You wrap them in code.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Store user queries and responses in a database&lt;/li&gt;
&lt;li&gt;Use NLP to pre-process input (e.g., remove noise, detect intent)&lt;/li&gt;
&lt;li&gt;Chain multiple model calls to complete complex tasks&lt;/li&gt;
&lt;/ul&gt;
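&lt;p&gt;A minimal sketch of that wrapping, with the model call stubbed out so the shape of the app logic is visible (in a real app &lt;code&gt;call_llm&lt;/code&gt; would be an API client and &lt;code&gt;history&lt;/code&gt; a database table; both names here are hypothetical):&lt;/p&gt;

```python
# App logic around a model call: preprocess input, call the model,
# store the exchange. call_llm is a stand-in for a real API client.
history = []  # in a real app: a database table

def call_llm(prompt):
    return f"(model reply to: {prompt})"  # stub for demonstration

def preprocess(text):
    return " ".join(text.split())  # normalize whitespace / remove noise

def answer(user_input):
    cleaned = preprocess(user_input)
    reply = call_llm(cleaned)
    history.append({"query": cleaned, "response": reply})  # persist for later
    return reply

print(answer("  What is   an LLM? "))
```

&lt;p&gt;Chaining multiple calls is the same pattern repeated: the output of one &lt;code&gt;answer&lt;/code&gt;-style step becomes part of the next prompt.&lt;/p&gt;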




&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;LLMs are a game-changer for developers.&lt;br&gt;
They let you build apps that can understand, generate, and interact with language without starting from scratch.&lt;/p&gt;

&lt;p&gt;This post introduces Large Language Models (LLMs) as powerful tools for building smart, language-aware applications. It covers what LLMs are, how they evolved from NLP, the different types available, and practical approaches like prompting, fine-tuning, and RAG.&lt;/p&gt;

&lt;p&gt;Happy coding!!!&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>llm</category>
      <category>beginners</category>
      <category>productivity</category>
    </item>
    <item>
      <title>My Journey into AI: Natural Language Processing (NLP)</title>
      <dc:creator>Odinaka Joy</dc:creator>
      <pubDate>Sun, 22 Jun 2025 08:32:28 +0000</pubDate>
      <link>https://forem.com/dinakajoy/a-beginners-note-on-natural-language-processing-key-takeaways-46ap</link>
      <guid>https://forem.com/dinakajoy/a-beginners-note-on-natural-language-processing-key-takeaways-46ap</guid>
      <description>&lt;p&gt;&lt;strong&gt;Natural Language Processing (NLP)&lt;/strong&gt; is how we teach computers to work with human language, &lt;strong&gt;to read, interpret, and respond&lt;/strong&gt; in ways that feel natural to us.&lt;/p&gt;

&lt;p&gt;It powers chatbots, voice assistants, translation tools, search engines, and more.&lt;/p&gt;




&lt;h2&gt;
  
  
  📌 Why is NLP Important?
&lt;/h2&gt;

&lt;p&gt;Language is how we share ideas, ask questions, and connect. For machines to truly help us, they must understand &lt;strong&gt;meaning&lt;/strong&gt;, not just read text.&lt;/p&gt;

&lt;p&gt;NLP bridges the gap by helping computers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Read&lt;/strong&gt; language - input&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Understand&lt;/strong&gt; meaning - context, tone, grammar&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Respond&lt;/strong&gt; with text or speech - output&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  📌 How NLP Works
&lt;/h2&gt;

&lt;p&gt;NLP breaks language into pieces machines can work with, then puts it back together for humans 🔥. Deep Learning made NLP smarter, letting models learn &lt;strong&gt;context&lt;/strong&gt; and &lt;strong&gt;tone&lt;/strong&gt; without hardcoded rules. Here’s a practical step-by-step guide to how it works:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Input (Raw Text or Speech)
&lt;/h3&gt;

&lt;p&gt;Everything starts with &lt;strong&gt;data&lt;/strong&gt; - text (tweets, articles, chatbot messages) or speech (voice recordings). &lt;br&gt;
If it’s speech, NLP first uses &lt;strong&gt;Automatic Speech Recognition (ASR)&lt;/strong&gt; to convert it into text.&lt;/p&gt;

&lt;p&gt;Example: &lt;code&gt;"The weather is nice today."&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Preprocessing (Cleaning &amp;amp; Normalizing)
&lt;/h3&gt;

&lt;p&gt;Raw text is messy. Before analysis, it needs to be standardized:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tokenization:&lt;/strong&gt; Split text into words&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 &lt;code&gt;"The weather is nice today."&lt;/code&gt; → &lt;code&gt;[The, weather, is, nice, today]&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lowercasing&lt;/strong&gt; &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 &lt;code&gt;"the weather is nice today"&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stopword removal:&lt;/strong&gt; Remove common words (&lt;em&gt;the, is, and&lt;/em&gt;) &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 &lt;code&gt;[weather, nice, today]&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stemming/Lemmatization:&lt;/strong&gt; Reduce words to their base form &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 &lt;code&gt;"running" → "run"&lt;/code&gt;, &lt;code&gt;"better" → "good"&lt;/code&gt;&lt;/p&gt;
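&lt;p&gt;All four preprocessing steps can be sketched as one tiny pipeline in plain Python. The stopword list and lemma dictionary below are deliberately toy-sized; libraries like spaCy or NLTK ship full versions of both.&lt;/p&gt;

```python
import re

# Toy preprocessing pipeline: tokenize + lowercase, remove stopwords,
# then map words to base forms via a tiny lemma dictionary.
STOPWORDS = {"the", "is", "a", "an", "and"}
LEMMAS = {"running": "run", "better": "good"}

def preprocess(text):
    tokens = re.findall(r"[a-z]+", text.lower())        # tokenize + lowercase
    tokens = [t for t in tokens if t not in STOPWORDS]  # stopword removal
    return [LEMMAS.get(t, t) for t in tokens]           # lemmatization

print(preprocess("The weather is nice today."))
# → ['weather', 'nice', 'today']
```

&lt;p&gt;The output matches the worked example above: the raw sentence reduces to its three content words.&lt;/p&gt;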

&lt;h3&gt;
  
  
  3. Feature Extraction (Turning Text into Numbers)
&lt;/h3&gt;

&lt;p&gt;Words need to be transformed into numerical representations because computers only understand numbers. Different methods have been developed for this, ranging from simple to advanced:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bag of Words (BoW):&lt;/strong&gt; Counts word frequency. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 "The cat sat on the mat" → &lt;code&gt;[1,1,1,1,1,0,0]&lt;/code&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;TF-IDF:&lt;/strong&gt; Weighs words by importance; rare words matter more.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 &lt;code&gt;machine learning&lt;/code&gt; in a tech article gets more weight than in a general blog where it’s used everywhere.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Word Embeddings:&lt;/strong&gt; Use tools like Word2Vec, GloVe, FastText, to map words to vectors that capture meaning. Words with similar meanings have similar vectors. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 &lt;code&gt;king - man + woman ≈ queen&lt;/code&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Transformers (Modern NLP):&lt;/strong&gt; Context-aware models like &lt;strong&gt;BERT, GPT&lt;/strong&gt; that understand word meaning based on surrounding text.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 &lt;code&gt;bank&lt;/code&gt; in "river bank" ≠ &lt;code&gt;bank&lt;/code&gt; in "money bank" &lt;br&gt;
This solves the context problem that BoW, TF-IDF, and static embeddings cannot.&lt;/p&gt;
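&lt;p&gt;The simplest of these, Bag of Words, fits in a few lines of plain Python: build a shared vocabulary, then represent each sentence as a vector of word counts over it. The sentences here are illustrative.&lt;/p&gt;

```python
from collections import Counter

# Bag-of-Words sketch: shared vocabulary + per-sentence count vectors.
def bow_vectors(sentences):
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted(set(w for toks in tokenized for w in toks))
    vectors = [[Counter(toks)[w] for w in vocab] for toks in tokenized]
    return vocab, vectors

vocab, vectors = bow_vectors(["the cat sat on the mat", "the dog sat"])
print(vocab)    # → ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(vectors)  # → [[1, 0, 1, 1, 1, 2], [0, 1, 0, 0, 1, 1]]
```

&lt;p&gt;Notice that word order is lost entirely, which is exactly the limitation that embeddings and transformers address.&lt;/p&gt;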

&lt;h3&gt;
  
  
  4. Modeling (Understanding or Generating Language)
&lt;/h3&gt;

&lt;p&gt;The extracted features are fed into models to perform tasks such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Text Classification:&lt;/strong&gt; Sentiment analysis, Topic classification, Intent classification&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sequence Labeling:&lt;/strong&gt; Named Entity Recognition (NER), Part-of-Speech (POS) tagging, Chunking
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sequence-to-Sequence (Seq2Seq) Tasks:&lt;/strong&gt; Machine Translation, Summarization, Paraphrasing / text simplification
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text Generation:&lt;/strong&gt; Chatbots and conversational agents, Content creation, Question answering&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example: &lt;code&gt;"The weather is nice today."&lt;/code&gt; 👉 &lt;strong&gt;Sentiment = Positive&lt;/strong&gt;&lt;/p&gt;
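&lt;p&gt;A toy version of that sentiment task, using a tiny hand-written lexicon in place of a trained model (real classifiers learn these associations from labeled data; the word lists here are made up):&lt;/p&gt;

```python
# Toy lexicon-based sentiment "model" for illustration only.
POSITIVE = {"nice", "good", "great", "happy"}
NEGATIVE = {"bad", "terrible", "sad", "awful"}

def sentiment(text):
    tokens = text.lower().replace(".", "").split()
    # Score = positive hits minus negative hits.
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "Positive"
    if score == 0:
        return "Neutral"
    return "Negative"

print(sentiment("The weather is nice today."))  # → Positive
```

&lt;p&gt;A trained classifier does the same thing in spirit, but with learned weights over thousands of features instead of two small word sets.&lt;/p&gt;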

&lt;h3&gt;
  
  
  5. Post-processing (Human-Friendly Output)
&lt;/h3&gt;

&lt;p&gt;The model’s raw output is converted into something people can understand. For instance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Predictions get mapped back to categories
&lt;/li&gt;
&lt;li&gt;Generated text gets polished for grammar and readability
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example: &lt;code&gt;Output: "Positive sentiment detected"&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Feedback &amp;amp; Iteration
&lt;/h3&gt;

&lt;p&gt;NLP models improve with &lt;strong&gt;more data&lt;/strong&gt; and &lt;strong&gt;fine-tuning&lt;/strong&gt; for specific tasks.&lt;/p&gt;

&lt;p&gt;Example: &lt;code&gt;A medical chatbot will be trained differently than a customer service chatbot&lt;/code&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📌  Important NLP applications
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Information Retrieval:&lt;/strong&gt; Search engines, document ranking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Information Extraction:&lt;/strong&gt; Extracting facts, relations, knowledge graphs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Speech-related NLP:&lt;/strong&gt; Speech recognition (ASR), speech-to-text, spoken dialogue systems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Question Answering &amp;amp; Reasoning:&lt;/strong&gt; Answering from text, open-domain Q&amp;amp;A&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recommendation systems powered by NLP:&lt;/strong&gt; Understanding reviews, extracting user preferences&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document similarity &amp;amp; clustering:&lt;/strong&gt; Grouping related documents (e.g., legal, medical)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text-to-SQL&lt;/strong&gt; or natural language interfaces to databases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code generation and understanding&lt;/strong&gt; (like GitHub Copilot)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  📌 Useful NLP Tools
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;spaCy:&lt;/strong&gt; Fast, modern NLP in Python&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NLTK&lt;/strong&gt; (Natural Language Tool-Kit): Beginner-friendly NLP library&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hugging Face:&lt;/strong&gt; Pre-trained deep learning models&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI API:&lt;/strong&gt; Access to LLMs like GPT&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  📌 Why Learn NLP?
&lt;/h3&gt;

&lt;p&gt;As a developer, I have seen how NLP makes apps smarter and more human-like - auto-suggestions, chatbots, smart search, etc.&lt;/p&gt;

&lt;p&gt;It’s not just AI hype. It’s about building tools that truly understand your users and automate repetitive, time-consuming tasks in a human-like way.&lt;/p&gt;




&lt;h3&gt;
  
  
  📌 How NLP Fits into the Bigger AI Picture
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;🧠 &lt;strong&gt;Artificial Intelligence (AI)&lt;/strong&gt;: Making machines act intelligently&lt;/li&gt;
&lt;li&gt;📊 &lt;strong&gt;Machine Learning (ML)&lt;/strong&gt;: Learning patterns from data&lt;/li&gt;
&lt;li&gt;🔍 &lt;strong&gt;Deep Learning (DL)&lt;/strong&gt;: Using neural networks to learn directly from raw data&lt;/li&gt;
&lt;li&gt;🗣 &lt;strong&gt;Natural Language Processing (NLP)&lt;/strong&gt;: Understanding and generating human language&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;NLP is the bridge between what we say and what machines understand.&lt;/p&gt;

&lt;p&gt;I am currently exploring techniques like tokenization, text classification, and sentiment analysis, and my next step is building a small NLP-powered web app for text classification.&lt;/p&gt;

&lt;p&gt;Next up: &lt;strong&gt;Large Language Models (LLMs)&lt;/strong&gt;, the AI systems that take NLP to the next level.&lt;/p&gt;

&lt;p&gt;Happy coding!!!&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>webdev</category>
      <category>ai</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Implementing the Data Science Workflow: Predicting Mental Health Treatment</title>
      <dc:creator>Odinaka Joy</dc:creator>
      <pubDate>Sun, 08 Jun 2025 16:33:20 +0000</pubDate>
      <link>https://forem.com/dinakajoy/data-science-workflow-my-first-ml-project-on-mental-health-treatment-11gn</link>
      <guid>https://forem.com/dinakajoy/data-science-workflow-my-first-ml-project-on-mental-health-treatment-11gn</guid>
      <description>&lt;p&gt;In my &lt;a href="https://dev.to/dinakajoy/a-beginners-guide-to-the-data-science-workflow-4772"&gt;last article&lt;/a&gt;, I broke down the &lt;strong&gt;Data Science Workflow&lt;/strong&gt; for beginners. It’s a great starting point for understanding the key steps in any data science project.&lt;/p&gt;

&lt;p&gt;In this follow-up post, I am putting that workflow into action by sharing my &lt;strong&gt;first machine learning project&lt;/strong&gt;: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;predicting whether someone is likely to seek treatment for mental health issues based on demographic and workplace data.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This hands-on project covers the full process, from understanding the problem and exploring the dataset to training a model and evaluating its performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  📌 Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;I built this machine learning model using a public mental health survey dataset from Kaggle.&lt;/li&gt;
&lt;li&gt;The goal: Predict whether someone is likely to seek mental health treatment.&lt;/li&gt;
&lt;li&gt;Best model achieved ~82% accuracy.&lt;/li&gt;
&lt;li&gt;Key predictors: workplace support, family history, and how much mental health interferes with work.&lt;/li&gt;
&lt;li&gt;Full code: 👉 &lt;a href="https://github.com/dinakajoy/mental-health-treatment-prediction/blob/main/RF-mental-health.ipynb" rel="noopener noreferrer"&gt;GitHub Notebook&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  📌 Why This Project Matters
&lt;/h2&gt;

&lt;p&gt;Mental health is deeply personal, but the decision to seek treatment is often influenced by external conditions like work culture, stigma, or lack of access. By modeling treatment-seeking behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ We identify at-risk individuals early.&lt;/li&gt;
&lt;li&gt;✅ We encourage empathetic policy-making in the workplace.&lt;/li&gt;
&lt;li&gt;✅ We normalize seeking help through data storytelling.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  📌 Framing the Challenge
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt;  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Given demographic, personal, and workplace mental health history, can we predict whether someone is likely to seek treatment?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Machine Learning Problem Type:&lt;/strong&gt;  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Supervised Learning – Binary Classification&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Success Criteria (Initial Evaluation):&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A model with ≥ 85% accuracy and ≥ 80% recall for the positive class (that is, people who seek treatment) will be considered successful.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Source:&lt;/strong&gt;&lt;br&gt;
Structured, static dataset from &lt;a href="https://www.kaggle.com/datasets/osmi/mental-health-in-tech-survey" rel="noopener noreferrer"&gt;Kaggle&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  📌 Dataset Overview
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;1 numerical feature (age)&lt;/li&gt;
&lt;li&gt;26 categorical features (gender, self_employment, benefits, etc.)&lt;/li&gt;
&lt;li&gt;Target column: treatment (Yes/No)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After cleaning, we had 1,300+ usable responses from tech professionals.&lt;/p&gt;

&lt;h3&gt;
  
  
  📌 Cleaning the Data: A Quick Summary
&lt;/h3&gt;

&lt;p&gt;I made these key cleaning decisions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dropped irrelevant or sparse columns like timestamp and comments&lt;/li&gt;
&lt;li&gt;Normalized gender values from wild responses like "guy (-ish)" or "femail" into "Male", "Female", "Other"&lt;/li&gt;
&lt;li&gt;Handled missing values with strategic imputation (e.g., replacing "self_employed" nulls with "No")&lt;/li&gt;
&lt;li&gt;Filtered out outliers in the age column (we kept ages 18–74)&lt;/li&gt;
&lt;/ul&gt;
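&lt;p&gt;As one concrete example, the gender normalization can be sketched as a simple lookup. The value lists below are short and illustrative; the actual notebook handles a much longer list of free-text survey answers.&lt;/p&gt;

```python
# Hypothetical sketch of the gender-normalization cleaning step:
# map messy free-text answers onto three buckets.
MALE = {"male", "m", "man", "guy (-ish)", "cis male"}
FEMALE = {"female", "f", "woman", "femail", "cis female"}

def normalize_gender(value):
    v = value.strip().lower()
    if v in MALE:
        return "Male"
    if v in FEMALE:
        return "Female"
    return "Other"  # anything unrecognized falls into the third bucket

print(normalize_gender("femail"))  # → Female
```

&lt;p&gt;The same map-to-buckets idea also covers the null-imputation step: replace missing values with a sensible default before encoding.&lt;/p&gt;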

&lt;blockquote&gt;
&lt;p&gt;Want to see exactly how? Check out the &lt;a href="https://github.com/dinakajoy/mental-health-treatment-prediction/blob/main/RF-mental-health.ipynb" rel="noopener noreferrer"&gt;notebook&lt;/a&gt; here&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  📌 Exploratory Data Analysis (EDA)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Treatment Distribution&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdifd5utuq8aph1udiokn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdifd5utuq8aph1udiokn.png" alt="Treatment Distribution" width="800" height="633"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Over half the respondents reported seeking treatment. This gives us a relatively balanced target, which is great for modeling!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Gender vs Treatment&lt;/strong&gt;&lt;br&gt;
Visualizing treatment-seeking behavior by gender revealed that:

&lt;ul&gt;
&lt;li&gt;Women were slightly more likely to seek help.&lt;/li&gt;
&lt;li&gt;The "Other" gender group had smaller numbers but still sought support at similar rates.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Age Distribution&lt;/strong&gt;&lt;br&gt;
Most respondents were aged 25–44, typical for tech jobs 😆. We also created age groups like "18–24", "25–34", etc., to identify behavioral patterns.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  📌 Feature Engineering Highlights
&lt;/h3&gt;

&lt;p&gt;To make the data model-ready, I:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Grouped continuous ages into categories&lt;/li&gt;
&lt;li&gt;Ordinal-encoded ordered features (e.g., &lt;code&gt;company size&lt;/code&gt;, &lt;code&gt;perceived difficulty of taking leave&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Binary-encoded &lt;code&gt;yes/no&lt;/code&gt; columns&lt;/li&gt;
&lt;li&gt;One-hot encoded select categorical columns (like &lt;code&gt;benefits&lt;/code&gt;, &lt;code&gt;anonymity&lt;/code&gt;, &lt;code&gt;wellness_program&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These steps helped reduce noise and preserve meaning in the data.&lt;/p&gt;
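&lt;p&gt;Two of those encodings, age binning and ordinal encoding, look roughly like this in plain Python. The column names, bin edges, and scale values are illustrative, not the notebook’s exact choices.&lt;/p&gt;

```python
# Illustrative sketch of age binning plus ordinal and binary encoding.
def age_group(age):
    bins = [(range(18, 25), "18-24"), (range(25, 35), "25-34"),
            (range(35, 45), "35-44"), (range(45, 75), "45-74")]
    for r, label in bins:
        if age in r:
            return label
    return "Other"

LEAVE_SCALE = {"Very easy": 0, "Somewhat easy": 1,
               "Somewhat difficult": 2, "Very difficult": 3}  # ordered → ordinal
YES_NO = {"Yes": 1, "No": 0}                                  # binary encoding

row = {"age": 29, "leave": "Somewhat easy", "family_history": "Yes"}
encoded = {
    "age_group": age_group(row["age"]),
    "leave": LEAVE_SCALE[row["leave"]],
    "family_history": YES_NO[row["family_history"]],
}
print(encoded)  # → {'age_group': '25-34', 'leave': 1, 'family_history': 1}
```

&lt;p&gt;In practice pandas does this at scale with &lt;code&gt;map&lt;/code&gt;, &lt;code&gt;cut&lt;/code&gt;, and &lt;code&gt;get_dummies&lt;/code&gt;, but the logic per column is the same.&lt;/p&gt;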

&lt;h3&gt;
  
  
  📌 Model Building with Random Forest
&lt;/h3&gt;

&lt;p&gt;I tried 4 modeling approaches using RandomForestClassifier:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Default model&lt;/li&gt;
&lt;li&gt;Manual hyperparameter tuning&lt;/li&gt;
&lt;li&gt;RandomizedSearchCV tuning&lt;/li&gt;
&lt;li&gt;GridSearchCV tuning&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Default RF&lt;/td&gt;
&lt;td&gt;82.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manually Tuned&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;82.4%&lt;/strong&gt; 🔥&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RandomizedSearchCV&lt;/td&gt;
&lt;td&gt;81.2%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GridSearchCV&lt;/td&gt;
&lt;td&gt;81.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All models performed well, but manual tuning surprisingly gave the best result.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnpx1jaqs8qnm3kiihsrf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnpx1jaqs8qnm3kiihsrf.png" alt="RandomForest models result" width="800" height="509"&gt;&lt;/a&gt;&lt;/p&gt;
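&lt;p&gt;For readers who want to reproduce the comparison, here is what approaches 1 and 3 look like side by side, run on synthetic data so the snippet is self-contained (the notebook trains on the cleaned survey features instead, and the parameter grid here is illustrative):&lt;/p&gt;

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in for the cleaned survey data.
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Approach 1: default RandomForest.
default_rf = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# Approach 3: RandomizedSearchCV over a small, illustrative grid.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions={"n_estimators": [50, 100, 200],
                         "max_depth": [None, 5, 10]},
    n_iter=5, cv=3, random_state=42,
).fit(X_train, y_train)

print(default_rf.score(X_test, y_test), search.best_params_)
```

&lt;p&gt;Manual tuning is the same loop done by hand: pick parameter values, fit, score, repeat.&lt;/p&gt;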

&lt;h3&gt;
  
  
  📌 Evaluation Metrics
&lt;/h3&gt;

&lt;p&gt;Besides accuracy, I measured:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Precision:&lt;/strong&gt; How many predicted "Yes" are truly "Yes"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recall:&lt;/strong&gt; How many actual "Yes" were correctly identified&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;F1 Score:&lt;/strong&gt; Balance between precision and recall&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Confusion Matrix:&lt;/strong&gt; Breakdown of prediction results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ROC AUC:&lt;/strong&gt; Model’s overall ability to distinguish between classes&lt;/li&gt;
&lt;/ul&gt;
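&lt;p&gt;These metrics are easy to compute by hand from confusion-matrix counts, which makes their definitions concrete. The counts below are made up for illustration, not results from the model.&lt;/p&gt;

```python
# Metrics from raw confusion-matrix counts (illustrative numbers).
tp, fp, fn, tn = 80, 15, 20, 85

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)                    # predicted "Yes" that are truly "Yes"
recall = tp / (tp + fn)                       # actual "Yes" correctly identified
f1 = 2 * precision * recall / (precision + recall)  # balance of the two

print(round(accuracy, 3), round(precision, 3), round(recall, 3), round(f1, 3))
# → 0.825 0.842 0.8 0.821
```

&lt;p&gt;In the notebook, &lt;code&gt;scikit-learn&lt;/code&gt;’s &lt;code&gt;classification_report&lt;/code&gt; produces the same numbers directly from predictions.&lt;/p&gt;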

&lt;h3&gt;
  
  
  📌 Key Insights
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;People with family history or poor workplace support were more likely to seek treatment.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;work_interfere&lt;/code&gt; feature (i.e. how much work affects mental health) was highly predictive.&lt;/li&gt;
&lt;li&gt;The Random Forest model was interpretable and gave consistently strong performance.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  📌 Tools Used
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;pandas&lt;/code&gt; for data manipulation&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;numpy&lt;/code&gt; for numerical computation&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;matplotlib&lt;/code&gt; for visualization&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;scikit-learn&lt;/code&gt; for modeling&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Jupyter Notebook&lt;/code&gt; in a Miniconda environment&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  📌 Final Thoughts
&lt;/h2&gt;

&lt;p&gt;This project was more than just a machine learning experiment, it was a reminder of how data can support empathy, and how technical skills can be used to explore meaningful questions.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ I practiced EDA, preprocessing, encoding, and model tuning.&lt;/li&gt;
&lt;li&gt;✅ I built a working Machine Learning model that could be useful for HR or wellness platforms.&lt;/li&gt;
&lt;li&gt;✅ Most importantly, I felt connected to a topic that truly matters.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Mental health is not just personal, it’s societal. Let’s keep talking about it, and maybe… let’s keep coding about it too 😜.&lt;/p&gt;

&lt;p&gt;Happy coding!!!&lt;/p&gt;

</description>
      <category>mentalhealth</category>
      <category>ai</category>
      <category>datascience</category>
    </item>
    <item>
      <title>A Beginner’s Guide to the Data Science Workflow</title>
      <dc:creator>Odinaka Joy</dc:creator>
      <pubDate>Tue, 03 Jun 2025 18:30:59 +0000</pubDate>
      <link>https://forem.com/dinakajoy/a-beginners-guide-to-the-data-science-workflow-4772</link>
      <guid>https://forem.com/dinakajoy/a-beginners-guide-to-the-data-science-workflow-4772</guid>
      <description>&lt;p&gt;&lt;a href="https://dev.to/dinakajoy/my-journey-into-ai-understanding-the-building-blocks-of-artificial-intelligence-321a"&gt;Artificial Intelligence&lt;/a&gt; is about building systems that mimics human.&lt;br&gt;
&lt;a href="https://dev.to/dinakajoy/a-beginners-note-on-machine-learning-lessons-from-my-journey-51ke"&gt;Machine Learning&lt;/a&gt; is a subset of Artificial Intelligence (AI) and it is an approach to achieve AI by building systems that can find pattern in a set of data.&lt;br&gt;
&lt;a href="https://dev.to/dinakajoy/a-beginners-note-on-deep-learning-lessons-from-my-journey-1lk"&gt;Deep Learning&lt;/a&gt; is a subset of Machine Learning (ML). It is one of the techniques for implementing ML.&lt;/p&gt;

&lt;p&gt;What then is Data Science? Data Science overlap all three above (AI, ML, DL). This field simply means analyzing data and then doing something with it.&lt;/p&gt;

&lt;p&gt;Data science can seem intimidating at first, with all the tools, libraries, and buzzwords floating around. But at its core, it’s simply about using data to solve real-world problems. This is a walkthrough of the essential stages of the data science workflow: what they mean, why they matter, and how Python can help, based on what I have learned as a beginner navigating this exciting field.&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 What is Data Science?
&lt;/h2&gt;

&lt;p&gt;Data Science is the field of extracting meaningful insights from data using a combination of statistics, programming, and domain knowledge. Whether you are analyzing customer behavior, forecasting sales, or detecting anomalies in sensor readings, the goal is the same: &lt;strong&gt;To turn raw data into actionable information&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For beginners, it’s tempting to jump straight into tools like Pandas, Scikit-learn, or TensorFlow, but NO. It’s essential to understand the overall workflow that guides any data science project. Jumping straight into code can feel satisfying, but without a clear roadmap, you may spend hours cleaning the wrong variables or building models that don’t address the real problem. Learning the data science workflow helps you think like a problem-solver, not just a tool user.&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Science Practical Guide
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Create a framework&lt;/li&gt;
&lt;li&gt;Match Data Science and Machine Learning tools&lt;/li&gt;
&lt;li&gt;Learn by doing&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  📌 A Data Science Workflow
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ✍️ 1. Problem definition
&lt;/h3&gt;

&lt;p&gt;Understand the problem and define the questions you want to answer.&lt;br&gt;
Question: What problem are we trying to solve?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Will a simple hand-coded, instruction-based system work? If yes, machine learning is not needed&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Match the problem to the main types of Machine Learning&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Supervised Learning:&lt;/strong&gt; You have data with labels (includes both input features and their corresponding correct output) which can be a classification or regression type. 
An example is "Predict heart disease with health records" &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unsupervised Learning:&lt;/strong&gt; You have data with no labels (contains only the input features — no known or provided output labels). So you are to use data patterns to generate labels (output).
An example is "Use customer purchases to determine which customers are similar to each other"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reinforcement Learning:&lt;/strong&gt; This involves having a computer program perform some actions within a defined space. You reward it (for doing it right) or punish it (for doing it wrong).
An example is "An AI playing chess tries moves and learns from win/loss outcomes"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transfer Learning:&lt;/strong&gt; Used when the problem is similar to another case. It is a technique where a model pretrained on one task is reused for a different but related task.
An example is "Using a model trained on millions of general images to classify X-ray images after a little fine-tuning"&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
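&lt;p&gt;To make the supervised case above concrete, here is a minimal sketch in scikit-learn; the numbers, feature choices, and model are made up purely for illustration:&lt;/p&gt;

```python
# Minimal supervised-learning sketch (illustrative, made-up data).
from sklearn.ensemble import RandomForestClassifier

# Features: [weight_kg, resting_heart_rate]; label: 1 = heart disease, 0 = none
X = [[120, 81], [98, 75], [110, 90], [85, 65], [105, 78], [70, 60]]
y = [1, 0, 1, 0, 1, 0]

model = RandomForestClassifier(n_estimators=10, random_state=42)
model.fit(X, y)  # the model learns the pattern from labelled examples

# Predict for a new, unseen patient
prediction = model.predict([[115, 85]])
print(prediction[0])
```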

&lt;h3&gt;
  
  
  ✍️ 2. Data Collection
&lt;/h3&gt;

&lt;p&gt;Once the problem is clear, gather relevant data. You might collect data from CSVs, APIs, web scraping, or databases. After collection, understand its format and limitations.&lt;br&gt;
Question: What type of data do we have available?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Structured Data:&lt;/strong&gt; Data that is organized in a predefined format, like rows and columns, making it easy to store, search, and analyze. It is often stored in relational databases (like MySQL, PostgreSQL) or spreadsheets (Excel, CSV), and is easily analyzed with tools like SQL, pandas, or Excel.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unstructured Data:&lt;/strong&gt; Data that doesn't follow a clear format, so it can't easily be stored in tables or rows and requires more processing to extract meaning or structure. It is stored in files, document repositories, or cloud storage. 
Examples: text (emails, PDFs, social media posts), media (images, videos, audio), logs (server logs, clickstreams).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Semi-structured Data:&lt;/strong&gt; This falls in between: not as rigid as structured data, but with some organization.
Examples: JSON, XML, HTML&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Within both categories above (Structured and Unstructured), there is another distinction worth noting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static Data:&lt;/strong&gt; Data that doesn't change over time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streaming Data:&lt;/strong&gt; Data that arrives continuously and changes regularly.&lt;/li&gt;
&lt;/ul&gt;
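&lt;p&gt;As a small sketch of loading structured data, here is a hypothetical inline CSV read into a Pandas DataFrame (the columns mirror the health-record example later in this post):&lt;/p&gt;

```python
# Reading structured data into a DataFrame (hypothetical inline CSV).
import io
import pandas as pd

csv_data = io.StringIO("id,weight,sex,heart_rate\n1,120,M,81\n2,98,F,75\n3,110,M,90")
df = pd.read_csv(csv_data)

print(df.shape)   # rows and columns: (3, 4)
print(df.dtypes)  # structured data has typed columns, easy to analyze
```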

&lt;h3&gt;
  
  
  ✍️ 3. Success Criteria (Initial Evaluation)
&lt;/h3&gt;

&lt;p&gt;Define what "success" looks like before you begin modeling. This helps guide decisions later. For example: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If we can reach 95% accuracy in predicting heart disease, we will proceed with deployment.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Different types of evaluation metrics:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Classification&lt;/th&gt;
&lt;th&gt;Regression&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Accuracy&lt;/td&gt;
&lt;td&gt;Mean Absolute Error (MAE)&lt;/td&gt;
&lt;td&gt;Precision@K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Precision&lt;/td&gt;
&lt;td&gt;Mean Squared Error (MSE)&lt;/td&gt;
&lt;td&gt;Recall@K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Recall (Sensitivity)&lt;/td&gt;
&lt;td&gt;Root Mean Squared Error (RMSE)&lt;/td&gt;
&lt;td&gt;Mean Average Precision (MAP)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;F1 Score&lt;/td&gt;
&lt;td&gt;R-squared (R²)&lt;/td&gt;
&lt;td&gt;Normalized Discounted Cumulative Gain (NDCG)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ROC-AUC Score&lt;/td&gt;
&lt;td&gt;Adjusted R²&lt;/td&gt;
&lt;td&gt;Hit Rate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Confusion Matrix&lt;/td&gt;
&lt;td&gt;Mean Absolute Percentage Error (MAPE)&lt;/td&gt;
&lt;td&gt;Coverage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Log Loss&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Diversity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Matthews Corr. Coeff. (MCC)&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
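&lt;p&gt;A few of the classification metrics in the table can be computed with scikit-learn's metrics module; the labels below are invented just to show the calls:&lt;/p&gt;

```python
# Computing a few classification metrics with scikit-learn (illustrative labels).
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]  # actual labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # model predictions

print("Accuracy:", accuracy_score(y_true, y_pred))    # fraction of correct predictions
print("Precision:", precision_score(y_true, y_pred))  # of predicted 1s, how many were right
print("Recall:", recall_score(y_true, y_pred))        # of actual 1s, how many were found
print("F1:", f1_score(y_true, y_pred))                # harmonic mean of precision and recall
```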

&lt;h3&gt;
  
  
  ✍️ 4. Features
&lt;/h3&gt;

&lt;p&gt;Features refer to the individual inputs within the data you collected. Examples: age, gender, heart rate, etc. You identify the feature variables and the target variable (if available); feature variables are used to predict the target variable.&lt;/p&gt;

&lt;p&gt;Example of a health record data:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;ID&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Weight&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Sex&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Heart Rate&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Chest Pain&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Heart Disease&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;120kg&lt;/td&gt;
&lt;td&gt;M&lt;/td&gt;
&lt;td&gt;81&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;98kg&lt;/td&gt;
&lt;td&gt;F&lt;/td&gt;
&lt;td&gt;75&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;110kg&lt;/td&gt;
&lt;td&gt;M&lt;/td&gt;
&lt;td&gt;90&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;85kg&lt;/td&gt;
&lt;td&gt;F&lt;/td&gt;
&lt;td&gt;65&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;105kg&lt;/td&gt;
&lt;td&gt;M&lt;/td&gt;
&lt;td&gt;78&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Question: What do we already know about the data?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Types of features:&lt;/strong&gt;   &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Numerical features: Examples are Weight, Heart Rate, Chest Pain&lt;/li&gt;
&lt;li&gt;Categorical features: Examples are Sex, Heart Disease&lt;/li&gt;
&lt;li&gt;Derived features: Features you create from existing ones. Example: "Visits Per Year"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This stage involves:&lt;/p&gt;

&lt;h4&gt;
  
  
  4.1. Data Cleaning
&lt;/h4&gt;

&lt;p&gt;Raw data is rarely clean. This step involves handling missing values, fixing errors, and removing duplicates. Tools often used are Pandas and NumPy.&lt;/p&gt;

&lt;h4&gt;
  
  
  4.2. Exploratory Data Analysis (EDA)
&lt;/h4&gt;

&lt;p&gt;This is where you explore patterns, trends, and relationships in your features using visualizations and statistics. Tools often used are Pandas, Matplotlib, and Seaborn.&lt;br&gt;
Some EDA ideas based on our data sample: heart disease frequency per chest pain type, weight versus heart rate for heart disease, heart disease frequency according to sex, etc.&lt;/p&gt;

&lt;h4&gt;
  
  
  4.3. Feature Engineering and Encoding
&lt;/h4&gt;

&lt;p&gt;At this stage, you can create new features or alter existing ones to make your model smarter.&lt;br&gt;
Question: Feature coverage: how many samples have values for each feature? Ideally, every sample has the same features.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Feature encoding&lt;/strong&gt; is the process of converting categorical (non-numeric) data into a numerical format so that machine learning models can understand and work with it.&lt;/p&gt;
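&lt;p&gt;As a sketch of feature encoding, Pandas can one-hot encode a categorical column like Sex from the sample table (the values here are copied from that example):&lt;/p&gt;

```python
# One-hot encoding a categorical feature with pandas (values from the sample table).
import pandas as pd

df = pd.DataFrame({
    "Sex": ["M", "F", "M", "F", "M"],
    "Heart Rate": [81, 75, 90, 65, 78],
})

# Each category becomes its own numeric 0/1 column
encoded = pd.get_dummies(df, columns=["Sex"])
print(encoded.columns.tolist())  # ['Heart Rate', 'Sex_F', 'Sex_M']
```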

&lt;h3&gt;
  
  
  ✍️ 5. Model Building
&lt;/h3&gt;

&lt;p&gt;At this stage, you choose one or more models, train them on your dataset, and make predictions. Some common tools used at this stage are scikit-learn, PyTorch, TensorFlow.&lt;br&gt;
Question: Based on our problem and data, what model should we use?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Parts of Modeling&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Choosing and training a model&lt;/li&gt;
&lt;li&gt;Tuning a model&lt;/li&gt;
&lt;li&gt;Model comparison&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Data Splitting&lt;/strong&gt;&lt;br&gt;
One of the most important concepts in machine learning is data splitting. You divide your data into three sets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The training dataset which is 70 to 80% of the total data&lt;/li&gt;
&lt;li&gt;The validation dataset which is 10 to 15% of the total data&lt;/li&gt;
&lt;li&gt;The test dataset which is 10 to 15% of the total data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You train the model on the training dataset, tune the model on the validation dataset and test/compare the model on the test dataset.&lt;/p&gt;

&lt;p&gt;The idea here is &lt;strong&gt;Generalization&lt;/strong&gt; - the ability of a machine learning model to perform well on data it hasn't seen, based on what it learnt from the similar data it was trained on.&lt;br&gt;
Simply put: passing an exam based on the course material and practice questions.&lt;/p&gt;
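&lt;p&gt;The three-way split above can be sketched with two calls to scikit-learn's train_test_split, using the rough 70/15/15 guideline (the data is a stand-in):&lt;/p&gt;

```python
# Splitting data into train/validation/test sets (roughly 70/15/15).
from sklearn.model_selection import train_test_split

X = list(range(100))             # stand-in features
y = [i % 2 for i in range(100)]  # stand-in labels

# First carve off 30% for validation + test, then split that half-and-half
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 70 15 15
```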

&lt;h4&gt;
  
  
  5.1. Choosing and Training a Model
&lt;/h4&gt;

&lt;p&gt;Start by selecting an appropriate algorithm based on your problem type and data. Train the model using the training dataset to help it learn patterns and relationships.&lt;br&gt;
For example, CatBoost and RandomForest work well on structured data.&lt;/p&gt;

&lt;h4&gt;
  
  
  5.2. Tuning a Model
&lt;/h4&gt;

&lt;p&gt;After initial training, adjust hyperparameters (like learning rate, tree depth, or number of estimators; the exact ones depend on the chosen algorithm) to improve performance. Techniques like Grid Search, Random Search, or Bayesian Optimization help find the best configuration. Tuning is done on the training or validation datasets.&lt;/p&gt;
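&lt;p&gt;As a hedged sketch of Grid Search, here is what tuning a RandomForest might look like with scikit-learn's GridSearchCV; the grid and synthetic dataset are purely illustrative, not a recommendation:&lt;/p&gt;

```python
# Hyperparameter tuning sketch with GridSearchCV (illustrative grid and data).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, n_features=5, random_state=42)

# Try every combination of these hyperparameters with 3-fold cross-validation
param_grid = {"n_estimators": [10, 50], "max_depth": [2, 4]}
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=3)
search.fit(X, y)

print(search.best_params_)  # the best configuration found
```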

&lt;h4&gt;
  
  
  5.3. Model Comparison
&lt;/h4&gt;

&lt;p&gt;This is where you test the model with unseen data and compare the results. Testing is done on the test dataset.&lt;/p&gt;

&lt;h3&gt;
  
  
  ✍️ 6. Model Evaluation
&lt;/h3&gt;

&lt;p&gt;After the model has been trained, tuned, and tested, evaluate it using appropriate metrics on a validation or test dataset. Use metrics like accuracy, precision, recall, RMSE, etc., to assess performance.&lt;/p&gt;

&lt;h3&gt;
  
  
  ✍️ 7. Experiment
&lt;/h3&gt;

&lt;p&gt;Most times, a model's first results aren't its last. Repeat steps &lt;strong&gt;5&lt;/strong&gt; and &lt;strong&gt;6&lt;/strong&gt; with other algorithms/models, or modify the inputs and outputs, to see if you get a better result. Compare the evaluation results and select the model that generalizes best to unseen data, not just the one that performs best on the training dataset.&lt;/p&gt;

&lt;h3&gt;
  
  
  ✍️ 8. Deployment (Optional)
&lt;/h3&gt;

&lt;p&gt;Package and serve the model in a real-world environment. You can integrate the model into a usable product or service. Some tools are Flask, FastAPI, Streamlit, Docker, and Heroku.&lt;/p&gt;




&lt;h2&gt;
  
  
  📌 Key Python Libraries Overview
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pandas:&lt;/strong&gt; Used to explore, analyze, manipulate, and get data ready for machine learning. It reads data as DataFrames.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;NumPy:&lt;/strong&gt; NumPy stands for Numerical Python and it is used for numerical computation. It forms the foundation of taking your DataFrame and turning it into a series of numbers and then a machine learning algorithm would work out the patterns in those numbers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Matplotlib/Seaborn:&lt;/strong&gt; Used to turn data into visualizations known as plots&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Scikit-learn:&lt;/strong&gt; A Python ML library for building, training, and evaluating machine learning models, and for making predictions.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
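&lt;p&gt;A tiny taste of how Pandas and NumPy hand off to each other (the data is made up):&lt;/p&gt;

```python
# Pandas cleans and shapes the data; NumPy turns it into numbers for an algorithm.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [29, 41, 35, np.nan, 52],
    "heart_rate": [72, 80, 75, 68, 90],
})

df["age"] = df["age"].fillna(df["age"].mean())  # Pandas: handle a missing value
X = df.to_numpy()                               # NumPy: a numeric array, ready for ML
print(X.shape)  # (5, 2)
```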




&lt;h2&gt;
  
  
  📌 Summary and What’s Next
&lt;/h2&gt;

&lt;p&gt;In this post, we explored the foundations of Machine Learning - understanding problem types, choosing and evaluating models, and making sense of our data through EDA and metrics.&lt;/p&gt;

&lt;p&gt;But theory is only half the story.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Up next&lt;/strong&gt;, I will be putting this into practice in a real-world project:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“&lt;a href="https://dev.to/dinakajoy/data-science-workflow-my-first-ml-project-on-mental-health-treatment-11gn"&gt;Predicting treatment outcomes for mental health patients&lt;/a&gt;”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>datascience</category>
      <category>programming</category>
      <category>ai</category>
      <category>learning</category>
    </item>
    <item>
      <title>My Journey into AI: Understanding the Building Blocks of Deep Learning (NLP Focused)</title>
      <dc:creator>Odinaka Joy</dc:creator>
      <pubDate>Sat, 31 May 2025 10:35:34 +0000</pubDate>
      <link>https://forem.com/dinakajoy/a-beginners-note-on-deep-learning-lessons-from-my-journey-1lk</link>
      <guid>https://forem.com/dinakajoy/a-beginners-note-on-deep-learning-lessons-from-my-journey-1lk</guid>
      <description>&lt;p&gt;When I started learning Machine Learning (ML), I thought I was already halfway into understanding how AI reads and understands text. But NO, &lt;strong&gt;Machine Learning is the engine, and Deep Learning is the turbo boost&lt;/strong&gt; 🤯 that makes things like voice assistants, chatbots, and even GPT possible.&lt;/p&gt;

&lt;p&gt;Even though my main focus is &lt;strong&gt;NLP&lt;/strong&gt; and &lt;strong&gt;LLMs&lt;/strong&gt;, taking time to understand and practice the building blocks of Machine Learning and Deep Learning has made my NLP learning less abstract.&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 What is Deep Learning (DL)?
&lt;/h2&gt;

&lt;p&gt;Deep Learning is a type of Machine Learning that uses &lt;strong&gt;Artificial Neural Networks&lt;/strong&gt; to learn from large amounts of data.&lt;/p&gt;

&lt;p&gt;These Neural Networks are inspired by how the human brain works, with lots of interconnected neurons passing signals around, but in reality, it’s just clever mathematics and matrices doing the heavy lifting 😎.&lt;/p&gt;

&lt;p&gt;Traditional ML can struggle with raw, unstructured data like images, audio, and text. Deep Learning shines here because it can automatically learn features from raw data without you handpicking them.&lt;/p&gt;




&lt;h3&gt;
  
  
  📌 Why Deep Learning is Key to NLP
&lt;/h3&gt;

&lt;p&gt;Language is messy.&lt;br&gt;
We say &lt;code&gt;I dey go&lt;/code&gt; in Pidgin, &lt;code&gt;I am going&lt;/code&gt; in English, and many more variations across languages, all meaning the same thing.&lt;/p&gt;

&lt;p&gt;Deep Learning models can handle this complexity with ease. They learn patterns, context, and relationships in words far better than traditional ML methods.&lt;/p&gt;




&lt;h3&gt;
  
  
  📌 Core Building Blocks of Deep Learning
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Neurons:&lt;/strong&gt; Basic units that receive, process, and pass information.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Layers:&lt;/strong&gt; Groups of neurons working together. More layers = deeper learning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weights and Biases:&lt;/strong&gt; Adjustable numbers that the model learns to get better at predictions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Activation Functions:&lt;/strong&gt; Decide if a neuron should &lt;code&gt;fire&lt;/code&gt; (ReLU, Sigmoid).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Forward Propagation:&lt;/strong&gt; Sending data forward through the network to get predictions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Loss Function:&lt;/strong&gt; Measures how wrong the model is.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backpropagation:&lt;/strong&gt; The process of adjusting weights to reduce errors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimizer:&lt;/strong&gt; The algorithm that tweaks weights efficiently (Adam, SGD).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Epochs, Batches, Iterations:&lt;/strong&gt; How you feed and loop through your data.&lt;/li&gt;
&lt;/ul&gt;
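&lt;p&gt;To make the blocks above concrete, here is a single neuron's forward pass sketched in NumPy with toy numbers (not a real trained network):&lt;/p&gt;

```python
# One neuron's forward propagation: weighted sum + bias, then an activation.
import numpy as np

def sigmoid(z):
    # Activation function: squashes any number into the (0, 1) range
    return 1 / (1 + np.exp(-z))

inputs = np.array([0.5, -1.2, 3.0])
weights = np.array([0.4, 0.1, -0.2])  # in training, these get adjusted by backpropagation
bias = 0.1

z = np.dot(inputs, weights) + bias    # weighted sum of inputs
activation = sigmoid(z)               # does the neuron "fire"?
print(round(float(activation), 3))    # 0.397
```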




&lt;h3&gt;
  
  
  📌 Deep Learning Architectures in NLP
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RNN (Recurrent Neural Networks):&lt;/strong&gt; Good for sequences but can forget long-term context.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LSTM (Long Short-Term Memory):&lt;/strong&gt; Solves the forgetting problem of RNNs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GRU (Gated Recurrent Unit):&lt;/strong&gt; Similar to LSTM but faster.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transformer:&lt;/strong&gt; The modern king. Powers GPT, BERT, and most state-of-the-art NLP systems.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  📌 Where You See Deep Learning in Real Life
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Computer Vision:&lt;/strong&gt; Facial recognition, medical scans, object detection in self-driving cars.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Natural Language Processing (NLP):&lt;/strong&gt; Chatbots, translation, summarization, sentiment analysis.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recommendation Systems:&lt;/strong&gt; Netflix, YouTube, Spotify.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Speech Recognition:&lt;/strong&gt; Siri, Alexa, transcription tools.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  📌 Tools for Deep Learning
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;TensorFlow (with Keras):&lt;/strong&gt; Powerful but with a steeper learning curve.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PyTorch:&lt;/strong&gt; Flexible and beginner-friendly for experimentation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keras:&lt;/strong&gt; High-level API for quick prototyping.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hugging Face Transformers:&lt;/strong&gt; For pre-trained NLP models like BERT, GPT, RoBERTa.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  📌 Why This Matters for NLP
&lt;/h3&gt;

&lt;p&gt;Understanding Deep Learning means I am not just using NLP models but I understand the foundations they are built on. When you know what’s happening under the hood, you can fine-tune, troubleshoot, and even experiment with new architectures.&lt;/p&gt;

&lt;p&gt;I will be sharing my journey as I go deeper into &lt;strong&gt;NLP&lt;/strong&gt; and &lt;strong&gt;LLMs&lt;/strong&gt;, but trust me, mastering these basics is like learning your alphabet before writing poetry.&lt;/p&gt;

&lt;h3&gt;
  
  
  📌 Example: Sentiment Analysis with Deep Learning
&lt;/h3&gt;

&lt;p&gt;Imagine building a system that reads Amazon reviews and predicts if they are positive, neutral, or negative.&lt;/p&gt;

&lt;p&gt;With traditional ML, you need to manually extract features like word counts or sentiment scores.&lt;br&gt;
With Deep Learning, you can feed the raw text (after tokenizing) into an LSTM or Transformer, and it learns to spot patterns by itself.&lt;/p&gt;
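&lt;p&gt;The "manual features" side can be sketched with scikit-learn's CountVectorizer, which hand-builds word-count features before modeling; the reviews and labels below are invented:&lt;/p&gt;

```python
# Traditional-ML-style sentiment analysis: hand-built word-count features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

reviews = ["great product, love it", "terrible, waste of money",
           "love this, works great", "awful product, money wasted"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# We explicitly engineer the features (word counts) before modeling
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(reviews)

model = LogisticRegression()
model.fit(X, labels)
pred = model.predict(vectorizer.transform(["love this great product"]))[0]
print(pred)
```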




&lt;h3&gt;
  
  
  📌 My Learning Path
&lt;/h3&gt;

&lt;p&gt;Here’s how I am approaching Deep Learning as the bridge to NLP:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Understand Neural Networks:&lt;/strong&gt; basics of layers, weights, activation functions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Practice with simple projects:&lt;/strong&gt; text classification, sentiment analysis, named entity recognition.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Explore Transformers:&lt;/strong&gt; with Hugging Face.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integrate into web apps:&lt;/strong&gt; making my models useful in real life.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;For practice, I built my first Deep Learning project using a dataset on dog breed classification:&lt;br&gt;&lt;br&gt;
🔗 &lt;a href="https://github.com/dinakajoy/dog_breed_classification_using__tensorflow/blob/main/end_to_end_dog_vision_video.ipynb" rel="noopener noreferrer"&gt;End-to-End Dog Vision with TensorFlow&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, I will be writing about Natural Language Processing itself because that is where Deep Learning meets the magic of human language 😜.&lt;/p&gt;

&lt;p&gt;Happy coding!!!&lt;/p&gt;

</description>
      <category>deeplearning</category>
      <category>ai</category>
      <category>learning</category>
      <category>beginners</category>
    </item>
    <item>
      <title>My Journey into AI: Understanding the Building Blocks of Machine Learning</title>
      <dc:creator>Odinaka Joy</dc:creator>
      <pubDate>Sat, 10 May 2025 04:14:27 +0000</pubDate>
      <link>https://forem.com/dinakajoy/a-beginners-note-on-machine-learning-lessons-from-my-journey-51ke</link>
      <guid>https://forem.com/dinakajoy/a-beginners-note-on-machine-learning-lessons-from-my-journey-51ke</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Imagine teaching a child to recognize ripe mangoes, not by giving a list of rules, but by showing them many examples until they just &lt;code&gt;know&lt;/code&gt;. That’s how Machine Learning works.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In my AI journey, I realized ML is the engine that powers many of the AI systems we use daily - product recommendations on Jumia or Netflix, spam filters in Gmail, credit scoring systems in banks, and many more scenarios.&lt;/p&gt;

&lt;p&gt;If you understood the &lt;a href="https://dev.to/dinakajoy/my-journey-into-ai-understanding-the-building-blocks-of-artificial-intelligence-321a"&gt;building blocks of AI&lt;/a&gt; in my last post, this is the natural next step. Let’s break ML down together.&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 What is Machine Learning, Really?
&lt;/h2&gt;

&lt;p&gt;In traditional programming, you give the computer &lt;strong&gt;rules + data&lt;/strong&gt;, and it gives you &lt;strong&gt;answers&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In Machine Learning, you give the computer &lt;strong&gt;data +/- answers&lt;/strong&gt;, and it figures out the &lt;strong&gt;rules&lt;/strong&gt; by itself 💪.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Traditional programming: If marks ≥ 50 ⇒ &lt;code&gt;Pass&lt;/code&gt;, else ⇒ &lt;code&gt;Fail&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;ML: Give the computer lots of past student scores with labels - &lt;code&gt;Pass&lt;/code&gt; or &lt;code&gt;Fail&lt;/code&gt;, and it learns the pattern to decide for new students without hardcoding the rule.&lt;/li&gt;
&lt;/ul&gt;
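&lt;p&gt;The contrast can be sketched in a few lines of Python; the scores and labels are made up, and a real dataset would be far larger:&lt;/p&gt;

```python
# Traditional programming vs Machine Learning on the pass/fail example.
from sklearn.linear_model import LogisticRegression

# Traditional programming: an explicit, hand-coded rule
def pass_fail_rule(marks):
    return "Pass" if marks >= 50 else "Fail"

# Machine Learning: learn the rule from labelled past examples
scores = [[20], [35], [45], [49], [51], [60], [75], [90]]
labels = ["Fail", "Fail", "Fail", "Fail", "Pass", "Pass", "Pass", "Pass"]

model = LogisticRegression()
model.fit(scores, labels)

# Both approaches agree on a new student, but only one needed a hardcoded rule
print(pass_fail_rule(70), model.predict([[70]])[0])
```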




&lt;h3&gt;
  
  
  📌 The Core Ingredients of Machine Learning
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data&lt;/strong&gt; – The raw material. This could be numbers, images, text, or audio.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Features&lt;/strong&gt; – The key attributes or variables in your data that help make predictions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Model&lt;/strong&gt; – The mathematical structure that learns patterns from data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Training&lt;/strong&gt; – Feeding data into the model so it can learn.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Evaluation&lt;/strong&gt; – Testing the model to see how well it performs on new, unseen data.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  📌 Types of Machine Learning
&lt;/h3&gt;

&lt;h4&gt;
  
  
  ✍️ 1. Supervised Learning
&lt;/h4&gt;

&lt;p&gt;Supervised Learning is learning from labelled data. Labelled data have both the questions and correct answers. The learning process is to be able to map a new question (not part of the training set) to an answer based on experience. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Examples:&lt;/strong&gt; Predicting house prices, detecting spam emails.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  ✍️ 2. Unsupervised Learning
&lt;/h4&gt;

&lt;p&gt;Unsupervised Learning is finding patterns in data without labels. Unlabelled data have the questions but no answers. The learning process is to identify a group based on similarities.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Examples:&lt;/strong&gt; Grouping customers into segments, finding similar products.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  ✍️ 3. Reinforcement Learning
&lt;/h4&gt;

&lt;p&gt;Reinforcement Learning is learning by trial and error and getting rewards (when correct) or penalties (when incorrect).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Examples:&lt;/strong&gt; Teaching a robot to walk, training AI to play chess. &lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  📌 Popular ML Algorithms for Beginners
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Linear Regression:&lt;/strong&gt; Predicts continuous values like house prices.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Logistic Regression:&lt;/strong&gt; Binary classification like spam or not spam.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Decision Trees:&lt;/strong&gt; Works for both classification and regression.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Support Vector Machines (SVM):&lt;/strong&gt; Finds boundaries to separate categories.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;K-Nearest Neighbors (KNN):&lt;/strong&gt; Predicts based on closest data points.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Naive Bayes:&lt;/strong&gt; Great for text classification like spam detection.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  📌 Practical Tools for Machine Learning
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scikit-learn:&lt;/strong&gt; Beginner-friendly tool that covers most ML basics.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;XGBoost:&lt;/strong&gt; Great for credit scoring and churn prediction.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LightGBM:&lt;/strong&gt; Good for ranking and recommendations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CatBoost:&lt;/strong&gt; Works well with categorical features.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Statsmodels:&lt;/strong&gt; Perfect for time series and statistical analysis.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  📌 Example Workflow of a Machine Learning Project
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Define the problem:&lt;/strong&gt; What are you trying to solve&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Collect data:&lt;/strong&gt; Gather relevant and sufficient data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clean data:&lt;/strong&gt; Handle missing values, duplicates and outliers (outliers refer to data points that are significantly different from the rest of the dataset like 500 in this set [2, 4, 7, 9, 500])&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Split data:&lt;/strong&gt; Use 70 to 80% of the dataset for training and the remaining 20 to 30% for testing the model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose algorithm:&lt;/strong&gt; Select an appropriate ML algorithm based on the problem and type of data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Train model:&lt;/strong&gt; Feed the training datasets to the model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test model:&lt;/strong&gt; Use unseen data (the testing datasets) to evaluate the model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tune parameter:&lt;/strong&gt; Improve model performance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy model:&lt;/strong&gt; Integrate model into production&lt;/li&gt;
&lt;/ol&gt;
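&lt;p&gt;The steps above, compressed into a few lines of scikit-learn on synthetic data (illustrative only):&lt;/p&gt;

```python
# A compressed version of the workflow above on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=6, random_state=0)  # collect data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)                                # split data

model = RandomForestClassifier(random_state=0)  # choose algorithm
model.fit(X_train, y_train)                     # train model

accuracy = accuracy_score(y_test, model.predict(X_test))  # test model on unseen data
print(f"Test accuracy: {accuracy:.2f}")
```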




&lt;h3&gt;
  
  
  📌 Why ML Matters in the AI Journey
&lt;/h3&gt;

&lt;p&gt;Machine Learning is the heart of modern AI. NLP, LLMs, computer vision all depend on ML to understand, predict, and improve over time.&lt;/p&gt;

&lt;p&gt;For me, learning ML isn’t just about understanding algorithms. It’s about learning how to frame problems as data problems, analyze and process data, and then build intelligent systems from those insights.&lt;/p&gt;




&lt;h3&gt;
  
  
  📌 My Learning Path in ML
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Understand the ML theory&lt;/li&gt;
&lt;li&gt;Learn Python libraries - NumPy, Pandas, Matplotlib, Seaborn, Scikit-learn&lt;/li&gt;
&lt;li&gt;Practice with Kaggle and HuggingFace datasets&lt;/li&gt;
&lt;li&gt;Build small real-world projects&lt;/li&gt;
&lt;li&gt;Deploy models in web apps.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Machine Learning isn’t magic, and just like learning to cook, you don’t have to start with a buffet. One small, simple recipe can get you started.&lt;/p&gt;

&lt;p&gt;For practice, I built my first ML project on predicting mental health treatment:&lt;br&gt;&lt;br&gt;
🔗 &lt;a href="https://dev.to/dinakajoy/data-science-workflow-my-first-ml-project-on-mental-health-treatment-11gn"&gt;Data Science Workflow: My First ML Project on Mental Health Treatment&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Happy coding!&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
