<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ephantus Macharia</title>
    <description>The latest articles on Forem by Ephantus Macharia (@ephantus_macharia_).</description>
    <link>https://forem.com/ephantus_macharia_</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3818259%2F2a0ce59f-1b47-4c2b-a228-f6b44f45d932.jpg</url>
      <title>Forem: Ephantus Macharia</title>
      <link>https://forem.com/ephantus_macharia_</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ephantus_macharia_"/>
    <language>en</language>
    <item>
      <title>Understanding Data Modelling in Power BI: Joins, Relationships, and Schemas.</title>
      <dc:creator>Ephantus Macharia</dc:creator>
      <pubDate>Mon, 30 Mar 2026 18:39:36 +0000</pubDate>
      <link>https://forem.com/ephantus_macharia_/understanding-data-modelling-in-power-bi-joins-relationships-and-schemas-254o</link>
      <guid>https://forem.com/ephantus_macharia_/understanding-data-modelling-in-power-bi-joins-relationships-and-schemas-254o</guid>
      <description>&lt;p&gt;Data modelling is the process of defining how your tables are structured and how they relate to one another. Instead of dumping everything into one giant flat table, you organise your data into multiple purpose-built tables and link them together. Power BI then uses those links to filter, aggregate, and display data correctly across your entire report.&lt;/p&gt;

&lt;h1&gt;
  
  
  Joins
&lt;/h1&gt;

&lt;p&gt;How Tables Connect at the Query Level&lt;br&gt;
A join is how you combine rows from two tables based on a shared column. In Power BI, joins happen inside Power Query  before the data even hits your model.&lt;/p&gt;

&lt;p&gt;There are four main join types:&lt;/p&gt;
&lt;h1&gt;
  
  
  Inner Join
&lt;/h1&gt;

&lt;p&gt;Returns only rows that have a match in both tables. If a farmer exists in your Sales table but not in your Customers table, that row is dropped.&lt;/p&gt;

&lt;p&gt;You only use when you want complete, matched records.&lt;/p&gt;
&lt;h1&gt;
  
  
  Left Outer Join
&lt;/h1&gt;

&lt;p&gt;Returns all rows from the left table, plus any matches from the right. Unmatched rows from the right come back as nulls.&lt;/p&gt;

&lt;p&gt;you  use it when you want to keep all records from your primary table regardless of whether a match exists.&lt;/p&gt;
&lt;h1&gt;
  
  
  Right Outer Join
&lt;/h1&gt;

&lt;p&gt;The mirror of a left join  keeps all rows from the right table and fills nulls where the left has no match.&lt;/p&gt;
&lt;h1&gt;
  
  
  Full Outer Join
&lt;/h1&gt;

&lt;p&gt;Returns all rows from both tables. Nulls appear wherever there's no match on either side.&lt;/p&gt;

&lt;p&gt;Use when  you want a complete picture and  you are willing to handle the nulls.&lt;/p&gt;

&lt;p&gt;In Power Query, you access these through Home → Merge Queries, then pick your join type from the dropdown.&lt;/p&gt;
&lt;h1&gt;
  
  
  Relationships
&lt;/h1&gt;

&lt;p&gt;How Tables Connect in the Model&lt;br&gt;
Once your tables are loaded, relationships are how Power BI understands the links between them inside the data model. Unlike joins (which physically merge rows), relationships are virtual  they let Power BI filter one table through another without duplicating data.&lt;/p&gt;

&lt;p&gt;You define relationships in the Model view, and they work automatically whenever you use fields from multiple tables in a visual.&lt;/p&gt;

&lt;p&gt;The Three Relationship Types&lt;/p&gt;

&lt;p&gt;One-to-Many (1:*) —The most common type. One row in Table A matches multiple rows in Table B. Example: one County matches many farmers. One Product matches many sales transactions.&lt;/p&gt;

&lt;p&gt;One-to-One (1:1)  Each row in Table A matches exactly one row in Table B. Example: one Employee record maps to one HR Profile. Rare in practice, often a sign you could just merge the tables.&lt;/p&gt;

&lt;p&gt;Many-to-Many (:)  Multiple rows in Table A match multiple rows in Table B. Example: one Order can contain many Products, and one Product can appear in many Orders. Power BI supports this natively, but it can create ambiguous filter paths best handled with a bridge table in between.&lt;/p&gt;
&lt;h1&gt;
  
  
  Filter Direction
&lt;/h1&gt;

&lt;p&gt;Every relationship has a filter direction  it controls which way filters flow between tables.&lt;/p&gt;
&lt;h1&gt;
  
  
  Single direction
&lt;/h1&gt;

&lt;p&gt;filters flow one way only (from the "one" side to the "many" side). This is the safe default.&lt;br&gt;
Both directioners flow both ways, Powerful, but can cause unexpected results in complex models. Use sparingly.&lt;br&gt;
How Joins and Relationships Are Connected&lt;br&gt;
This is where it clicks: joins and relationships solve the same problem at different stages of your pipeline.&lt;/p&gt;

&lt;p&gt;A join in Power Query physically combines two tables into one before loading. A relationship in the model keeps tables separate but links them logically. The choice between them comes down to this:&lt;/p&gt;

&lt;p&gt;Join (Power Query)  Relationship (Model)&lt;br&gt;
When it runs    At data refresh / load time At query / visual render time&lt;br&gt;
Result  One merged table    Two separate linked tables&lt;br&gt;
Best for    Lookup columns you need in the fact table   Filtering and aggregating across tables&lt;br&gt;
Performance Can increase table size Keeps model lean&lt;br&gt;
A common pattern: use a join to bring a single lookup column (like County Region) into your fact table, while using relationships to connect your full dimension tables (Date, Product, Customer) for filtering.&lt;/p&gt;
&lt;h1&gt;
  
  
  Schemas
&lt;/h1&gt;

&lt;p&gt;How You Arrange Your Tables&lt;br&gt;
A schema is the overall blueprint of your model  how many tables you have and how they're arranged. There are two schemas you'll encounter most in Power BI.&lt;/p&gt;
&lt;h1&gt;
  
  
  Star Schema
&lt;/h1&gt;

&lt;p&gt;The star schema has one central fact table surrounded by several dimension tables. The fact table holds your numbers (revenue, yield, quantity). The dimension tables hold your descriptive context (who, what, when, where).&lt;/p&gt;

&lt;p&gt;dim_Date        ─┐&lt;br&gt;
dim_County      ─┤&lt;br&gt;
dim_CropType    ─┼──── fact_FarmerProduction&lt;br&gt;
dim_Season      ─┤&lt;br&gt;
dim_SoilType    ─┘&lt;/p&gt;

&lt;p&gt;Every dimension connects directly to the fact table in a one-to-many relationship. This is the recommended structure for Power BI it's simple, fast, and the DAX engine is optimised for it.&lt;/p&gt;
&lt;h1&gt;
  
  
  Snowflake Schema
&lt;/h1&gt;

&lt;p&gt;The snowflake schema normalises the star further by splitting dimension tables into sub-tables. Instead of one flat dim_Product table, you might have dim_Product → dim_Category → dim_SubCategory.&lt;/p&gt;

&lt;p&gt;It reduces data duplication but adds complexity. In Power BI, the extra join hops can slow down queries and make DAX harder to write. Unless you have a strong reason (very large dimension tables with many repeated values), stick with the star schema.&lt;/p&gt;
&lt;h1&gt;
  
  
  Flat Table
&lt;/h1&gt;

&lt;p&gt;A single flat table with no relationships is fine for small, simple datasets. If you're working with under 10,000 rows and don't need to join to anything else, a flat table keeps things uncomplicated. The moment you need to combine data sources, or your dimension data is repeated hundreds of times, move to a star schema.&lt;/p&gt;

&lt;p&gt;Putting It All Together&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Load your raw tables into Power Query
Use joins to pull in any lookup values you need directly in the fact table (e.g., a region name from a county lookup)
Load separate dimension tables  Date, County, Crop Type, Season — without merging them
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Define relationships in Model view between your fact table and each dimension (one-to-many, single-direction filter)
Arrange your model as a star schema — fact table in the centre, dimensions around it
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Concept    : What it does, Where in Power BI
&lt;/h1&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Inner join: Keep only matched rows  Power Query → Merge
Left join:  Keep all left rows + matches    Power Query → Merge
One-to-many relationship: Link dimension to fact table  Model view
Many-to-many relationship:Complex links, use bridge table   Model view
Star schema:    Fact + flat dimensions  Model view layout
Snowflake schema:Fact + normalised sub-dimensions   Model view layout

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;Data modelling in Power BI isn't about complexity, it's about clarity. A well-structured star schema with clean one-to-many relationships will outperform a messy flat table every time, both in query speed and in how easy your DAX becomes to write and maintain.&lt;/p&gt;

&lt;p&gt;Start with your fact table, build your dimensions, connect them with single-direction relationships, and keep it flat. &lt;/p&gt;

</description>
      <category>beginners</category>
      <category>tutorial</category>
      <category>productivity</category>
    </item>
    <item>
      <title>From confusion to clarity;How Excel-Data analysis has Transformed my skills</title>
      <dc:creator>Ephantus Macharia</dc:creator>
      <pubDate>Fri, 27 Mar 2026 09:54:01 +0000</pubDate>
      <link>https://forem.com/ephantus_macharia_/from-confusion-to-clarityhow-excel-data-analysis-has-transformed-my-skills-4ji8</link>
      <guid>https://forem.com/ephantus_macharia_/from-confusion-to-clarityhow-excel-data-analysis-has-transformed-my-skills-4ji8</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;The Dataset That Changed Everything&lt;/strong&gt;&lt;br&gt;
I will be honest. When I first opened the Jumia Kenya product dataset, I had no idea where to begin. There were 115 rows of product data, but the prices were buried inside text strings like "KSh 1,525", the ratings were written as "4.5 out of 5", the review counts were all negative numbers, and a full 50 per cent of the rows had no rating information at all. It looked less like a dataset and more like a problem waiting to punish me.&lt;br&gt;
That experience, the confusion, the slow process of fixing each issue one by one, and the moment when the data finally came alive is exactly what this article is about. Learning Excel data analysis did not just teach me a set of formulas. It changed the way I think, the way I approach problems, and the way I trust my own conclusions. This is my story of how that happened, told through the real data I cleaned, interpreted, and turned into a working dashboard.&lt;/p&gt;

&lt;p&gt;Step One:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Working on Messy Data&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;What Real-World Data Taught Me First&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-world data is rarely clean  I learned that the hard way!&lt;/li&gt;
&lt;li&gt;Data analysis means looking at clean and beautiful tables and extracting insights  or so I thought!&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Well, I was wrong  most of the work happens before even a single chart is drawn!&lt;/p&gt;
&lt;h1&gt;
  
  
  The 6 Problems I Found in the Jumia Dataset
&lt;/h1&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Prices were stored as text  every price had a "KSh" prefix attached to it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ratings were written as sentences  "4.5 out of 5" instead of just "4.5".&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Review counts were negative  every review count was entered as a minus value.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;One product had a price range  "KSh 1,620 – KSh 1,980" instead of a single value.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Percentage Discounts were stored as text  "38%" instead of a real number.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;58 products had no rating at all  half of the products were blank for ratings!&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  What This Taught Me
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Every single problem needed a deliberate solution in Excel.&lt;/li&gt;
&lt;li&gt;Not only did I learn how to solve these problems, but I also learned how to solve them permanently!&lt;/li&gt;
&lt;li&gt;The order in which I solved these problems also taught me how to diagnose the problem before even touching the data!.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=VALUE(SUBSTITUTE(SUBSTITUTE(A2,"KSh",""),",",""))   // Strip KSh and commas from price text, then convert to a true number
=VALUE(LEFT(A2,3))   // Extract just the numeric rating from "4.5 out of 5."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=ABS(A2)   // Convert negative review counts to positive values
=IF(ISBLANK(A2),"No Rating",IF(A2&amp;lt;3,"Poor",IF(A2&amp;lt;4.5,"Average","Excellent")))   // 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  What Actually Surprised Me About Data Cleaning
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;The difficulty was not the surprising part. Each fix is straightforward once you know the function&lt;/li&gt;
&lt;li&gt;What surprised me is the amount of change the data underwent after the fixes
The column of seemingly random characters, the wall of meaningless text, suddenly looked like:
✅ Sortable
✅ Calculable
✅ Chart-ready&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  The Real Meaning of Data Cleaning
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Data cleaning is not about correcting errors. It is about transforming noise into a signal&lt;/li&gt;
&lt;li&gt;The moment the data underwent its transformation is the moment I understood the true meaning of cleaning&lt;/li&gt;
&lt;li&gt;No explanation or book can do justice to the experience of going through the transformation yourself&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Skill Gained: &lt;strong&gt;What I Do Differently Now&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; I instinctively inspect every new dataset for the following six things before I do anything else&lt;/li&gt;
&lt;li&gt;It takes five minutes to inspect&lt;/li&gt;
&lt;li&gt; It saves hours of confusion&lt;/li&gt;
&lt;li&gt;This is not something I learned by reading about it. This is something I learned by going through the experience 
once.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Step Two:&lt;/p&gt;

&lt;h1&gt;
  
  
  Creating Meaning with Formulas Data Enrichment
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Clean data tells you what exists. Enriched data tells you what it means&lt;/li&gt;
&lt;li&gt;This is where Excel formulas started feeling genuinely powerful&lt;/li&gt;
&lt;li&gt;I was no longer just correcting errors; I was creating knowledge&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;New Columns I Added to the Dataset&lt;br&gt;
 &lt;strong&gt;Discount Amount&lt;/strong&gt; (KES)&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Formula: Current Price subtracted from Old Price&lt;/li&gt;
&lt;li&gt;Reveals the real savings in shillings, not just a percentage&lt;/li&gt;
&lt;li&gt;Why it matters:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;64% off a KES 199 item = only KES 354 saved&lt;br&gt;
39% off a KES 3,750 drill = KES 2,393 saved&lt;br&gt;
 Percentage figures alone were hiding this distinction entirely&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rating Category&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Used an IFS formula to classify every product into a clear tier:&lt;/p&gt;

&lt;p&gt;Poor — rating below 3&lt;br&gt;
 Average — rating between 3 and 4.4&lt;br&gt;
 Excellent — rating of 4.5 and above&lt;/p&gt;

&lt;p&gt;Turns a raw number into a label anyone can read instantly&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Discount Category&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Grouped every product into one of three discount tiers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Low Discount — below 20%&lt;/li&gt;
&lt;li&gt; Medium Discount — between 20% and 40%&lt;/li&gt;
&lt;li&gt; High Discount — above 40%&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why These Columns Mattered&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rating Category and Discount Category became the foundation of almost every comparison in the final analysis&lt;/li&gt;
&lt;li&gt;Without them, grouping and comparing products would have required manual sorting every single time&lt;/li&gt;
&lt;li&gt;With them, a single AVERAGEIF or COUNTIF formula answers any group-level question instantly
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=IFS(D2&amp;lt;3,"Poor",D2&amp;lt;4.5,"Average",D2&amp;gt;=4.5,"Excellent
=IFS(C2&amp;lt;20,"Low Discount",C2&amp;lt;=40,"Medium Discount",C2&amp;gt;40,"High Discount")   
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This step taught me something important about data analysis: the raw data rarely tells the whole story. The enriched data does. A number like 3.7 says very little on its own. The label "Average" is placed alongside it, in context with 114 other products.&lt;/p&gt;

&lt;p&gt;Step Three: &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Charts&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After cleaning and enriching the data, I ran a full descriptive analysis using AVERAGE, COUNTIF, AVERAGEIF, and CORREL functions. But the moment the analysis truly came alive was when I built the visualizations. The charts below were produced directly from the cleaned Jumia dataset, and each one taught me something that the tables had kept hidden.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhezu8ihjk9iwj5yn11mk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhezu8ihjk9iwj5yn11mk.png" alt=" " width="800" height="407"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This chart showed me immediately that 65 out of 115 products more than half carry a discount above 40%. At first, I assumed this meant they were the best-performing products. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsie5uk0igsqm3myz2nd3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsie5uk0igsqm3myz2nd3.png" alt=" " width="656" height="492"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The rating category chart was the most visually striking finding of the entire analysis. The grey "No Rating" segment representing 50% of all products dominates the chart. This is not just a design choice; it is a data quality alarm. Half the products in this dataset have never been reviewed. Any conclusion I draw about ratings applies only to the other half, and I must clearly state this every time I present findings. Learning to read that caveat into a chart and to communicate it honestly felt like a genuine step forward as an analyst.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmrwx33ikqruwrvo8h1el.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmrwx33ikqruwrvo8h1el.png" alt=" " width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The top 10 discount chart delivered a surprise. The highest-discounted products are not expensive electronics or premium appliances. They are small everyday items: a bottle opener, a keychain, crochet needles, and a pillowcase. The product with the single highest discount in the entire dataset (64% off) costs just KES 199. That is a powerful reminder that percentage discounts and absolute value are entirely different things, a lesson I learned from the data, not from a textbook&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Average rating and average reviews by discount category&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Finding that medium-discount products outperform high-discount ones on both measures&lt;br&gt;
This final chart is the one I am most proud of, because it contradicts the most natural assumption in the entire dataset. I expected high-discount products to have the most reviews and the highest ratings. More discounts should mean more buyers, and more buyers should mean more reviews. The data said the opposite. Medium-discount products (20–40% off) had an average rating of 4.28 and 15.3 reviews. High-discount products rated only 3.61 and averaged 11.1 reviews. The correlation between discount percentage and reviews was just −0.14, essentially zero. Higher discounts do not drive customer engagement. Product quality does.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Products Analysed-115 &lt;/li&gt;
&lt;li&gt;Avg Current Price-1174
&lt;/li&gt;
&lt;li&gt;Avg Discount-36.96%
&lt;/li&gt;
&lt;li&gt;Avg Rating-3.89/5 &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;How This Has Made Me a Better Analyst  and a Better Thinker&lt;br&gt;
Working through this project from raw CSV to finished dashboard gave me five concrete skills that I did not have before, and that I now use every time I open a spreadsheet.&lt;br&gt;
1) Skills This Project Built&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Diagnose the first  scan for broken data before touching anything&lt;/li&gt;
&lt;li&gt; Write self-explanatory formulas SUBSTITUTE, VALUE, ABS, IFS, AVERAGEIF&lt;/li&gt;
&lt;li&gt;Trust the data, not your expectations  correlation was −0.14, not what I hoped&lt;/li&gt;
&lt;li&gt; Charts are not decoration  every chart reveals what the table could not&lt;/li&gt;
&lt;li&gt; Declare data gaps honestly  50% missing ratings must be stated. 
not hidden&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The Spreadsheet That Taught Me to Think&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Here I Started vs Where I finished&lt;/li&gt;
&lt;li&gt;Started, finished 115 rows of messy data. A fully formatted Excel dashboard. Did not know VLOOKUP. Can clean, enrich, analyse, and visualise. Assumed data was neat. Know how to diagnose and fix real problems. Trusted percentages. Know how to check the numbers 
behind them&lt;/li&gt;
&lt;/ol&gt;

&lt;h1&gt;
  
  
  What This Project Actually Taught Me
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Not just which function to use, but when and why&lt;/li&gt;
&lt;li&gt;Not just how to build charts, but how to read and explain them&lt;/li&gt;
&lt;li&gt;Not just Excel skills, but thinking skills&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Things That Made It Real
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Excel gave me the tools&lt;/li&gt;
&lt;li&gt; The Jumia dataset gave me the practice&lt;/li&gt;
&lt;li&gt; This course gave me the framework&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I will carry these skills into every dataset, every report, and every decision I face from here on.&lt;/p&gt;

</description>
      <category>data</category>
      <category>excell</category>
      <category>productivity</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Data Analysis Setup: Tools, Installation, and Best Practices</title>
      <dc:creator>Ephantus Macharia</dc:creator>
      <pubDate>Mon, 23 Mar 2026 06:38:00 +0000</pubDate>
      <link>https://forem.com/ephantus_macharia_/data-analysis-setup-tools-installation-and-best-practices-3cd4</link>
      <guid>https://forem.com/ephantus_macharia_/data-analysis-setup-tools-installation-and-best-practices-3cd4</guid>
      <description>&lt;p&gt;In previous centuries, we used to decide as we always used to: based on a gut feeling, a coin toss or a prayer. However, we can not afford to guess in a world that is changing this rapidly. However, it is feared that the analysis of data removes humanity in the decision-making process- that it reduces individuals to statistics. I want to argue the opposite. Proper data analysis is the most understanding thing that you can do. It prevents the projection of personal prejudices on the world and makes one see people as they are, not as you think they are. It is not a de-humanizing of the process, it is simply a human element that has gotten right at last.&lt;/p&gt;

&lt;p&gt;Below are various tools that will help you kickstart your journey  of Data Analysis:&lt;/p&gt;

&lt;p&gt;EXCEL INSTALLATION GUIDE&lt;/p&gt;

&lt;p&gt;Step 1&lt;/p&gt;

&lt;p&gt;Steps into Installaling Excel on windows&lt;/p&gt;

&lt;p&gt;1.Go to the official page of MICROSOFT OFFICE Website:&lt;a href="https://www.office.com/" rel="noopener noreferrer"&gt;https://www.office.com/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2 . Sign in with your Microsoft Account.&lt;/p&gt;

&lt;p&gt;Click  install Office&lt;/p&gt;

&lt;p&gt;4 Open the downloaded OfficeSetup.exe file.&lt;/p&gt;

&lt;p&gt;5   Wait for the installation to complete.&lt;/p&gt;

&lt;p&gt;6.After installation:&lt;/p&gt;

&lt;p&gt;7 Open Microsoft Excel from the Start Menu.&lt;/p&gt;

&lt;p&gt;8  Sign in to activate the software.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ANACONDA INSTALLATION GUIDE.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Steps into Installing  Anaconda  On windows&lt;/p&gt;

&lt;p&gt;Anaconda-platforms is a tool developed to design to securely build,  and deploy artificial intelligence and machine learning models, primarily using Python and open-source software.&lt;/p&gt;

&lt;p&gt;Step 2&lt;/p&gt;

&lt;p&gt;1 .Go to the official Anaconda website: &lt;a href="https://www.anaconda.com" rel="noopener noreferrer"&gt;https://www.anaconda.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2fmjlu1uov9ym5p1q05v.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2fmjlu1uov9ym5p1q05v.jpg" alt=" " width="800" height="633"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2 .Download the Anaconda Distribution for Windows.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F395kd17cbb2exx4zrsox.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F395kd17cbb2exx4zrsox.jpg" alt=" " width="502" height="390"&gt;&lt;/a&gt;&lt;br&gt;
3 .Open the downloaded .exe installer.&lt;/p&gt;

&lt;p&gt;4 .Click Next → Agree to License.&lt;/p&gt;

&lt;p&gt;5 .Choose Just Me installation.&lt;/p&gt;

&lt;p&gt;6 .Select the installation location (default recommended).&lt;/p&gt;

&lt;p&gt;7 .Click Install.&lt;/p&gt;

&lt;p&gt;8 .After installation, click Finish.&lt;/p&gt;

&lt;p&gt;9 .To verify installation:&lt;/p&gt;

&lt;p&gt;10 .Open Anaconda Navigator from the Start Menu&lt;/p&gt;

</description>
      <category>beginners</category>
    </item>
  </channel>
</rss>
