<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: samkaruri</title>
    <description>The latest articles on Forem by samkaruri (@samkaruri).</description>
    <link>https://forem.com/samkaruri</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3818887%2F1be35e4f-9712-4631-8814-1cf74845a775.png</url>
      <title>Forem: samkaruri</title>
      <link>https://forem.com/samkaruri</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/samkaruri"/>
    <language>en</language>
    <item>
      <title>Understanding Data Modeling in Power BI: Joins, Relationships and Schemas Explained</title>
      <dc:creator>samkaruri</dc:creator>
      <pubDate>Sun, 05 Apr 2026 12:56:12 +0000</pubDate>
      <link>https://forem.com/samkaruri/understanding-data-modeling-in-power-bi-joins-relationships-and-schemas-explained-1ah9</link>
      <guid>https://forem.com/samkaruri/understanding-data-modeling-in-power-bi-joins-relationships-and-schemas-explained-1ah9</guid>
      <description>&lt;h2&gt;
  
  
  Data Modeling in Power BI.
&lt;/h2&gt;

&lt;p&gt;A good data model is like a great relationship: clear, structured, and built on the right connections.&lt;br&gt;
Data modeling in Power BI is the process of connecting tables through relationships so that raw data can be transformed into meaningful insights and interactive reports. This process entails setting up relationships, structures, and calculations among the data tables so that Power BI can analyze and visualize them correctly.&lt;/p&gt;

&lt;p&gt;In this article, we shall dive into how Power BI creates data models by expounding on the topics below.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Joins.
&lt;/h3&gt;

&lt;p&gt;In Power BI, joins are achieved either through relationships in the Model view or by merging queries in Power Query. Relationships are the more flexible approach because they create a "virtual" join between tables through a shared key column, whereas a merge physically combines the tables into one.&lt;/p&gt;

&lt;h4&gt;
  
  
  Left Outer Joins.
&lt;/h4&gt;

&lt;p&gt;This is the most commonly used join for data modeling. It preserves all rows from the left table. Matching rows from the right table are appended; unmatched left rows get null values in the right-table columns.&lt;/p&gt;

&lt;p&gt;Example: Imagine Safaricom runs an SMS campaign asking customers to rate a new M-Pesa feature:&lt;br&gt;
Customers who reply have their feedback recorded.&lt;br&gt;
Customers who don’t reply still exist in the customer list, but their feedback is null.&lt;/p&gt;
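&lt;p&gt;A minimal Python sketch of the left outer join described above, using plain dictionaries rather than Power BI itself. The customer names, phone numbers, and ratings are made-up illustrative data, and left_outer_join is a hypothetical helper, not a Power BI or Power Query function.&lt;/p&gt;

```python
# Hypothetical sample data: every customer who was sent the SMS (left table).
customers = [
    {"phone": "0700000001", "name": "Amina"},
    {"phone": "0700000002", "name": "Brian"},
    {"phone": "0700000003", "name": "Chebet"},
]

# Hypothetical replies keyed by phone number (right table).
feedback = {"0700000001": 5, "0700000003": 3}


def left_outer_join(left_rows, right_lookup, key, new_column):
    """Keep every left row; fill new_column with None when there is no match."""
    joined = []
    for row in left_rows:
        merged = dict(row)
        # dict.get returns None for a missing key, which plays the role of null.
        merged[new_column] = right_lookup.get(row[key])
        joined.append(merged)
    return joined


result = left_outer_join(customers, feedback, "phone", "rating")
for row in result:
    print(row["name"], row["rating"])
```

&lt;p&gt;All three customers appear in the result; the customer with no reply keeps a row whose rating is None.&lt;/p&gt;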

&lt;h4&gt;
  
  
  Right Outer Joins.
&lt;/h4&gt;

&lt;p&gt;A Right Outer join returns all rows from the right table. Where the left table has no matching row, its columns are filled with null values.&lt;br&gt;
Example: Joining Sales Details (left) and Client Details (right) on their common ClientID field returns every row from Client Details along with any matching rows from Sales Details, with null values in the sales columns for clients who have no sales. A query over this join might select the Order ID, Client Name, and Order Date, and sort the results by descending Order Date.&lt;/p&gt;

&lt;h4&gt;
  
  
  Full Outer Joins.
&lt;/h4&gt;

&lt;p&gt;Retains all rows from both tables. Matching records are merged, while non-matching rows from either side are kept with null values in the missing fields.&lt;/p&gt;

&lt;p&gt;Example: Comparing employee records from two separate HR systems. All employees from both systems are included, with nulls indicating where records do not match.&lt;/p&gt;

&lt;h4&gt;
  
  
  Inner Joins.
&lt;/h4&gt;

&lt;p&gt;Returns only rows with matching keys in both tables. Unmatched rows from either side are discarded.&lt;/p&gt;

&lt;p&gt;Example: Sales analysis limited to currently active employees. An inner join on Employee ID drops sales records tied to terminated employees and excludes active employees with no sales.&lt;/p&gt;
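&lt;p&gt;The difference between the full outer and inner joins described above can be sketched in Python with two small lookups. The employee IDs and names are invented for illustration, and the code only mimics the join semantics; it is not how Power Query implements them.&lt;/p&gt;

```python
# Hypothetical employee records from two HR systems, keyed by employee ID.
system_a = {101: "Achieng", 102: "Baraka", 103: "Cynthia"}
system_b = {102: "Baraka", 103: "Cynthia", 104: "David"}

# Inner join: only IDs present in both systems.
inner_ids = sorted(set(system_a).intersection(system_b))
inner = [(emp_id, system_a[emp_id], system_b[emp_id]) for emp_id in inner_ids]

# Full outer join: every ID from either system, None where a side is missing.
all_ids = sorted(set(system_a).union(system_b))
full_outer = [
    (emp_id, system_a.get(emp_id), system_b.get(emp_id)) for emp_id in all_ids
]

print(inner)
print(full_outer)
```

&lt;p&gt;The inner join keeps only employees 102 and 103, while the full outer join keeps all four IDs and uses None to mark the system that has no record.&lt;/p&gt;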

&lt;h4&gt;
  
  
  Left Anti &amp;amp; Right Anti.
&lt;/h4&gt;

&lt;p&gt;A Left Anti join returns only the rows from the left table that do not have a corresponding match in the right table; any matching rows are excluded.&lt;/p&gt;

&lt;p&gt;Example: When coming up with a campaign target list, a Left Anti Join removes clients who have already been contacted, leaving only those who have not yet been reached.&lt;/p&gt;

&lt;p&gt;A Right Anti join returns only the rows from the right table that have no corresponding match in the left table.&lt;/p&gt;

&lt;p&gt;Example: Comparing registered passengers in a bus booking system (left table) with actual travelers recorded at a terminal (right table). A Right Anti Join highlights passengers who boarded but are not captured in the booking system, pointing to unrecorded tickets or system gaps.&lt;/p&gt;
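&lt;p&gt;Both anti joins amount to a set difference on the key column. The sketch below assumes the booking system is the left table and the terminal records are the right table; the passenger IDs are hypothetical.&lt;/p&gt;

```python
# Hypothetical data: passengers in the booking system (left table) and
# passengers recorded at the terminal (right table), identified by ID.
booked = {"P001", "P002", "P003"}
boarded = {"P002", "P003", "P004"}

# Left anti: keys found only in the left table (booked but never recorded).
left_anti = sorted(booked - boarded)

# Right anti: keys found only in the right table (boarded without a booking).
right_anti = sorted(boarded - booked)

print(left_anti)   # ['P001']
print(right_anti)  # ['P004']
```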

&lt;h3&gt;
  
  
  2. Relationships in Power BI.
&lt;/h3&gt;

&lt;p&gt;Relationships in Power BI are the connections between tables in a data model. They allow you to combine data from multiple tables for accurate calculations, filtering, and visualizations—without merging everything into one giant "flat" table.&lt;br&gt;
Think of relationships as the "glue" that tells Power BI how rows in one table relate to rows in another. This is essential for star schema or snowflake schema designs, where you have fact tables (containing measurable numbers like sales, orders) and dimension tables (containing descriptive attributes like products, dates, customers).&lt;/p&gt;
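&lt;p&gt;To make the "glue" idea concrete, here is a small Python sketch of a one-to-many relationship between a dimension table and a fact table. The table contents and the total_sales_for_category helper are assumptions made for illustration; Power BI performs this filter propagation internally.&lt;/p&gt;

```python
# Hypothetical dimension table: one row per product (the "one" side).
products = {
    1: {"name": "Smartphone X", "category": "Electronics"},
    2: {"name": "Mango Juice", "category": "Groceries"},
}

# Hypothetical fact table: many sales rows per product (the "many" side).
sales = [
    {"product_id": 1, "amount": 300},
    {"product_id": 2, "amount": 5},
    {"product_id": 1, "amount": 250},
]


def total_sales_for_category(category):
    """Filter the dimension, then follow the key to the matching fact rows."""
    matching_ids = {
        pid for pid, row in products.items() if row["category"] == category
    }
    return sum(s["amount"] for s in sales if s["product_id"] in matching_ids)


electronics_total = total_sales_for_category("Electronics")
print(electronics_total)  # 550
```

&lt;p&gt;The filter is applied to the descriptive table and flows through the shared key to the numeric table, which is the same direction a one-to-many relationship filters in a Power BI model.&lt;/p&gt;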

&lt;h3&gt;
  
  
  3. Schemas: Star, Snowflake, and Flat Table.
&lt;/h3&gt;

&lt;p&gt;Data modeling in Power BI relies heavily on how data is structured. Choosing the right schema (Star, Snowflake, or Flat Table) directly impacts performance, ease of analysis, and overall user experience. Each approach strikes a different balance between simplicity, efficiency, and flexibility, so it is important to understand when and how to use each one. Let's look at each of them.&lt;/p&gt;

&lt;h4&gt;
  
  
  Star Schema (Highly Recommended).
&lt;/h4&gt;

&lt;p&gt;The Star Schema is the most recommended approach in Power BI. It consists of a single central Fact table connected directly to multiple Dimension tables through simple one-to-many relationships. There are no intermediate lookup layers or complex relationship paths.&lt;/p&gt;

&lt;p&gt;To support this structure, Dimension tables are intentionally denormalized during data preparation. For example, instead of maintaining separate tables for Product, Product Subcategory, and Product Category, all attributes are combined into one unified Product dimension.&lt;/p&gt;
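&lt;p&gt;The denormalization step described above can be sketched in Python as follows. The three lookup tables and the flatten_product helper are hypothetical; in practice this work would typically be done with merges in Power Query before the data is loaded.&lt;/p&gt;

```python
# Hypothetical normalized lookup tables for the product hierarchy.
categories = {1: "Electronics", 2: "Groceries"}
subcategories = {
    10: {"name": "Phones", "category_id": 1},
    20: {"name": "Beverages", "category_id": 2},
}
products = [
    {"product_id": 1001, "name": "Smartphone X", "subcategory_id": 10},
    {"product_id": 1002, "name": "Mango Juice", "subcategory_id": 20},
]


def flatten_product(product):
    """Resolve the subcategory and category into columns on the product row."""
    sub = subcategories[product["subcategory_id"]]
    return {
        "product_id": product["product_id"],
        "product": product["name"],
        "subcategory": sub["name"],
        "category": categories[sub["category_id"]],
    }


# One unified Product dimension: every attribute lives on a single row.
product_dimension = [flatten_product(p) for p in products]
for row in product_dimension:
    print(row)
```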

&lt;h4&gt;
  
  
  Why it works effectively in Power BI:
&lt;/h4&gt;

&lt;p&gt;Performance: Queries require only a single relationship step between dimensions and facts, which aligns well with the VertiPaq engine’s optimization.&lt;br&gt;
Simplified DAX: Filters flow directly from dimension tables to the Fact table without requiring complex logic.&lt;br&gt;
User-friendly design: Each business entity is represented by one table, making it easier for report creators to understand and use.&lt;/p&gt;

&lt;p&gt;The main drawback is data redundancy, where repeated text values may appear across many rows. However, VertiPaq minimizes this impact through dictionary encoding, storing repeated values efficiently in memory.&lt;/p&gt;
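&lt;p&gt;The general idea behind dictionary encoding can be illustrated with a few lines of Python. This is a simplified sketch of the concept only, with made-up column values; it does not reflect VertiPaq's actual internal implementation.&lt;/p&gt;

```python
# A column with many repeated text values, as in a denormalized dimension.
category_column = ["Phones", "Phones", "Beverages", "Phones", "Beverages"]

dictionary = []  # each distinct value is stored once
positions = {}   # maps a value to its index in the dictionary
encoded = []     # the column stored as small integers

for value in category_column:
    if value not in positions:
        positions[value] = len(dictionary)
        dictionary.append(value)
    encoded.append(positions[value])

print(dictionary)  # ['Phones', 'Beverages']
print(encoded)     # [0, 0, 1, 0, 1]

# Decoding recovers the original column from the integers.
decoded = [dictionary[i] for i in encoded]
```

&lt;p&gt;Each repeated string is replaced by a small integer, so the cost of repetition grows with the number of distinct values rather than the number of rows.&lt;/p&gt;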

&lt;h4&gt;
  
  
  Snowflake Schema
&lt;/h4&gt;

&lt;p&gt;A Snowflake Schema normalizes Dimension tables into multiple related sub-tables, forming hierarchical structures. For instance, a Product table may link to a Product Subcategory table, which then connects to a Product Category table.&lt;/p&gt;

&lt;p&gt;While this design improves storage efficiency and enforces data consistency, it introduces challenges in Power BI. Queries become slower due to multiple relationship steps, and DAX expressions grow more complex. Filter propagation across several tables can also lead to ambiguous relationships and unexpected results.&lt;/p&gt;

&lt;p&gt;Because of these limitations, it is generally advised to use Power Query to flatten Snowflake structures into a Star Schema before loading data into the model.&lt;/p&gt;

&lt;p&gt;In a star schema, dimensions are denormalized—flattened into a single table per entity—allowing for faster query performance and simpler relationships. Snowflake schemas, where dimensions are normalized into multiple related tables, can still be used in complex scenarios, but they increase DAX complexity and may hurt performance.&lt;/p&gt;

&lt;h4&gt;
  
  
  Flat Table.
&lt;/h4&gt;

&lt;p&gt;The Flat Table approach combines all data into a single large table, eliminating the separation between Fact and Dimension tables.&lt;/p&gt;

&lt;p&gt;In Power BI Import mode, this is considered a poor practice. Including large volumes of repeated descriptive data alongside transactional records increases memory usage, slows data refresh, and complicates DAX calculations. Even simple operations, like removing filters on a single attribute, become more difficult compared to a Star Schema.&lt;/p&gt;

&lt;p&gt;There is, however, one valid use case: DirectQuery mode. When Power BI sends queries to external systems like Snowflake, Google BigQuery, or Databricks, a pre-joined Flat Table can reduce the need for runtime joins and improve query performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;A good data model in Power BI is much like a great relationship: clear, well-structured, and built on the right connections. When tables are properly joined through relationships, organized into an optimal schema, and supported by thoughtful calculations, your data transforms from raw numbers into powerful, actionable insights.&lt;/p&gt;

&lt;p&gt;Throughout this article, we explored the foundational elements of effective data modeling:&lt;/p&gt;

&lt;p&gt;Joins (via Power Query merges or model relationships) determine how accurately and flexibly your tables connect. Whether using Left Outer joins to preserve all customers in a campaign analysis or Anti joins to identify gaps, choosing the right join ensures you retain the data you need without unnecessary duplication.&lt;br&gt;
Relationships act as the invisible yet critical glue that enables seamless filter propagation and accurate DAX calculations across your model.&lt;br&gt;
Schemas define the overall architecture of your data model. The Star Schema stands out as the gold standard for most Power BI projects because it delivers superior query performance, simplified DAX expressions, and an intuitive structure that aligns perfectly with Power BI’s VertiPaq engine. While the Snowflake Schema offers better normalization in highly complex hierarchical data, it often comes at the cost of slower performance and increased complexity. The Flat Table approach, though tempting for its simplicity, is generally discouraged in Import mode due to higher memory consumption and maintenance challenges — though it can have niche value in certain DirectQuery scenarios.&lt;/p&gt;

&lt;p&gt;Ultimately, investing time in building a clean, star-schema-based data model pays dividends in speed, scalability, maintainability, and user experience. A well-designed model not only makes your reports faster and more reliable but also empowers business users to explore data confidently without needing constant support from analysts.&lt;br&gt;
Remember: Great reports don’t start with beautiful visuals — they start with solid data modeling. By mastering joins, relationships, and schema design, you lay a strong foundation that turns raw data into meaningful stories and smarter business decisions.&lt;br&gt;
Start applying these principles in your next Power BI project, and you’ll quickly see the difference a thoughtful data model can make. The clearer your model, the brighter your insights.&lt;/p&gt;

</description>
      <category>powerbi</category>
      <category>datamodeling</category>
      <category>businessintelligence</category>
      <category>dataanalytics</category>
    </item>
    <item>
      <title>How Excel is Used in Real-World Data Analysis.</title>
      <dc:creator>samkaruri</dc:creator>
      <pubDate>Sat, 04 Apr 2026 11:14:16 +0000</pubDate>
      <link>https://forem.com/samkaruri/how-excel-is-used-in-real-world-data-analysis-2clj</link>
      <guid>https://forem.com/samkaruri/how-excel-is-used-in-real-world-data-analysis-2clj</guid>
      <description>&lt;h2&gt;
  
  
  Introduction.
&lt;/h2&gt;

&lt;p&gt;If you consistently interact with large amounts of data, then hands-on experience in Excel is a must-have skill. In today’s data-driven world, organizations and individuals rely heavily on analyzed data to make informed decisions. Excel has become one of the most efficient analytical tools for collecting, organizing, and analyzing data. As a result, it plays a pivotal role in transforming raw data into meaningful insights that support effective decision-making.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Excel?
&lt;/h2&gt;

&lt;p&gt;Microsoft Excel is part of the Microsoft Office Suite and is widely used by data analysts to store, organize, and manipulate data in a tabular format. It is a spreadsheet application consisting of rows and columns that form cells. Inside these cells, data can be entered, calculated, and visualized. In addition, Excel has flexible functionalities that enable users to manipulate data with ease and perform effective analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Excel in the Real World of Data Analysis
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Data Cleaning and Organization.
&lt;/h3&gt;

&lt;p&gt;In real-world scenarios, data is often disorganized and unstructured, making it difficult to analyze. To address this, analysts use Excel to clean and organize data using features such as:&lt;/p&gt;

&lt;h4&gt;
  
  
  Remove Duplicates.
&lt;/h4&gt;

&lt;p&gt;Repetitive entries in a dataset make it inaccurate and difficult to analyze effectively. The Remove Duplicates feature identifies and removes duplicates in a dataset.&lt;/p&gt;

&lt;h4&gt;
  
  
  Text Functions.
&lt;/h4&gt;

&lt;p&gt;These functions include TRIM, LEFT, RIGHT, and CONCAT, which help format and standardize text data. CONCAT, for instance, joins text from multiple cells into one.&lt;/p&gt;
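&lt;p&gt;For readers who think in code, the behavior of these text functions can be approximated in Python. The helpers below are hypothetical stand-ins written for illustration, not Excel APIs, and the sample name is made up.&lt;/p&gt;

```python
def trim(text):
    """Like TRIM: drop leading and trailing spaces, collapse repeated inner spaces."""
    return " ".join(part for part in text.split(" ") if part)


def left(text, count):
    """Like LEFT: the first count characters."""
    return text[:count]


def right(text, count):
    """Like RIGHT: the last count characters."""
    return text[max(len(text) - count, 0):]


def concat(*parts):
    """Like CONCAT: join several pieces of text into one."""
    return "".join(parts)


raw_name = "  jane   doe "
clean_name = trim(raw_name)
print(clean_name)                   # jane doe
print(left(clean_name, 4))          # jane
print(right(clean_name, 3))         # doe
print(concat("KE", "-", "001"))     # KE-001
```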

&lt;h4&gt;
  
  
  Sorting and Filtering.
&lt;/h4&gt;

&lt;p&gt;Sorting arranges data in ascending or descending order, while filtering displays only the records that meet specified criteria.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Analyzing Data Using Formulas and Functions.
&lt;/h3&gt;

&lt;p&gt;Once data has been cleaned and structured, the next step is analysis. Excel provides a wide range of formulas that allow users to perform calculations and analyze data efficiently. Some commonly used ones include:&lt;/p&gt;

&lt;h4&gt;
  
  
  SUMIF()
&lt;/h4&gt;

&lt;p&gt;The Excel SUMIF function sums the values in a range that meet a single condition. Its syntax is:&lt;/p&gt;

&lt;p&gt;=SUMIF(range, criteria, [sum_range])&lt;/p&gt;

&lt;p&gt;The function checks each cell in range against criteria and adds the corresponding cells in sum_range. If sum_range is omitted, the cells in range itself are summed.&lt;/p&gt;
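&lt;p&gt;The logic of SUMIF can be sketched in Python as shown below. The sumif helper is a hypothetical illustration that handles exact-match criteria only, and the region names and revenue figures are invented sample data.&lt;/p&gt;

```python
def sumif(criteria_range, criteria, sum_range=None):
    """Add the sum_range cells whose matching criteria_range cell equals criteria.

    Only exact-match criteria are sketched here; Excel also accepts
    comparison and wildcard criteria, which this helper does not handle.
    """
    if sum_range is None:
        sum_range = criteria_range
    return sum(
        value
        for cell, value in zip(criteria_range, sum_range)
        if cell == criteria
    )


regions = ["Nairobi", "Mombasa", "Nairobi", "Kisumu"]
revenue = [1200, 800, 450, 300]

nairobi_total = sumif(regions, "Nairobi", revenue)
print(nairobi_total)  # 1650
```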

&lt;h4&gt;
  
  
  VLOOKUP() / XLOOKUP()
&lt;/h4&gt;

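&lt;p&gt;When each record in a dataset has a unique identifier, VLOOKUP can use that identifier to retrieve related values from another table. XLOOKUP is a newer alternative that defaults to an exact match and can also return values from columns to the left of the lookup column.&lt;/p&gt;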
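&lt;p&gt;An exact-match lookup of this kind can be sketched in Python as follows. The vlookup helper and the product table are hypothetical and only mirror the idea of searching the first column for an identifier and returning a value from another column.&lt;/p&gt;

```python
# Hypothetical lookup table: the first column holds the unique identifier.
product_table = [
    ("P-100", "Keyboard", 25.0),
    ("P-200", "Monitor", 140.0),
    ("P-300", "Mouse", 12.5),
]


def vlookup(lookup_value, table, col_index):
    """Exact-match lookup on the first column; col_index is 1-based as in Excel."""
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    return None  # Excel would show an #N/A error here


price = vlookup("P-200", product_table, 3)
print(price)  # 140.0
```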

&lt;h4&gt;
  
  
  AVERAGE()
&lt;/h4&gt;

&lt;p&gt;The AVERAGE function in Excel calculates the mean by adding all values in a range and dividing by their count.&lt;/p&gt;

&lt;p&gt;Example: A sales analyst might use SUM() to calculate total revenue and IF() to categorize sales performance as “High” or “Low.”&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Data Visualization.
&lt;/h3&gt;

&lt;p&gt;After analyzing the data, presenting it in a clear and understandable way becomes essential. Excel enables users to visualize data using a wide variety of charts and graphs, transforming raw numbers into meaningful visual insights. Common visualization features in Excel include:&lt;/p&gt;

&lt;h4&gt;
  
  
  Bar charts.
&lt;/h4&gt;

&lt;p&gt;Ideal for comparing categories or showing differences between groups (e.g., sales by market or product performance).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuph1gl3vrss1jv7ju7mq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuph1gl3vrss1jv7ju7mq.png" alt="barchart.png" width="800" height="471"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Line graphs.
&lt;/h4&gt;

&lt;p&gt;Best suited for showing trends over time, such as monthly revenue, stock prices, or online orders.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdx5h3wcserqvc4xamy0j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdx5h3wcserqvc4xamy0j.png" alt="linechart.png" width="800" height="456"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Pie charts.
&lt;/h4&gt;

&lt;p&gt;Provide useful insights by displaying proportions or percentages of a whole (e.g., market share distribution or budget breakdown).&lt;br&gt;
Excel also supports column charts, scatter plots, area charts, and histograms for more specific analytical needs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8vzpdsuokv2glczjgrbq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8vzpdsuokv2glczjgrbq.png" alt="pie.png" width="800" height="401"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These visualizations help decision-makers quickly understand trends, patterns, relationships, and outliers without needing to analyze complex spreadsheets manually.&lt;/p&gt;

&lt;p&gt;Example: An organization can use a line chart to track the monthly sales performance of different products. By visualizing this data, managers can identify seasonal fluctuations and periods of rapid or slow growth, and make timely decisions to improve underperforming months.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Pivot Tables for Data Summarization
&lt;/h3&gt;

&lt;p&gt;In addition to formulas and charts, Excel offers powerful tools for summarizing data. One of the most effective features is Pivot Tables, which allow users to summarize large datasets and extract insights quickly.&lt;/p&gt;

&lt;p&gt;Example: A business can use a Pivot Table to analyze total sales by region, product, or time period without writing complex formulas.&lt;/p&gt;
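&lt;p&gt;The summarization a Pivot Table performs is essentially a group-and-aggregate operation. The Python sketch below shows that idea with a hypothetical pivot_sum helper and invented sales rows; it is not a reproduction of Excel's Pivot Table feature.&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical transaction rows, one per sale.
sales = [
    {"region": "East", "product": "Tea", "amount": 100},
    {"region": "West", "product": "Tea", "amount": 80},
    {"region": "East", "product": "Coffee", "amount": 60},
    {"region": "West", "product": "Coffee", "amount": 40},
    {"region": "East", "product": "Tea", "amount": 20},
]


def pivot_sum(rows, group_by, value):
    """Total the value column for each distinct entry in the group_by column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_by]] += row[value]
    return dict(totals)


by_region = pivot_sum(sales, "region", "amount")
by_product = pivot_sum(sales, "product", "amount")
print(by_region)   # {'East': 180, 'West': 120}
print(by_product)  # {'Tea': 200, 'Coffee': 100}
```

&lt;p&gt;Changing the grouping column is the code equivalent of dragging a different field into the Rows area of a Pivot Table.&lt;/p&gt;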

&lt;h3&gt;
  
  
  5. Trend and Relationship Analysis.
&lt;/h3&gt;

&lt;p&gt;Beyond summarizing data, Excel can also be used to identify patterns and relationships between variables. This helps analysts make predictions and data-driven decisions.&lt;/p&gt;

&lt;p&gt;Correlation analysis using functions like CORREL()&lt;br&gt;
Scatter plots to visualize relationships&lt;br&gt;
Trendlines to predict future values&lt;/p&gt;

&lt;p&gt;Example: A company might analyze the relationship between discount percentages and customer purchases to determine optimal pricing strategies.&lt;/p&gt;
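&lt;p&gt;CORREL() returns the Pearson correlation coefficient of two ranges. The Python sketch below computes the same statistic by hand; the correl helper is written for illustration and the discount and purchase figures are made up, chosen to be perfectly linear so the result is easy to check.&lt;/p&gt;

```python
from math import sqrt


def correl(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical data: purchases rise by 10 for every 5 points of discount.
discount_percent = [0, 5, 10, 15, 20]
purchases = [100, 110, 120, 130, 140]

r = correl(discount_percent, purchases)
print(round(r, 3))  # 1.0
```

&lt;p&gt;A value near 1 indicates that the two variables rise together, a value near -1 that one falls as the other rises, and a value near 0 that there is little linear relationship.&lt;/p&gt;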

&lt;h3&gt;
  
  
  6. Automation with Conditional Formatting.
&lt;/h3&gt;

&lt;p&gt;To further enhance efficiency, Excel provides automation features such as conditional formatting, which highlights important data points based on predefined rules.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Applications of Excel.
&lt;/h2&gt;

&lt;p&gt;Excel’s versatility allows it to be applied across various industries. For example:&lt;/p&gt;

&lt;h3&gt;
  
  
  Finance.
&lt;/h3&gt;

&lt;p&gt;Managing budgets, building forecasts, and creating financial models.&lt;/p&gt;

&lt;h3&gt;
  
  
  Marketing.
&lt;/h3&gt;

&lt;p&gt;Analyzing campaigns and segmenting customer data.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Healthcare.
&lt;/h3&gt;

&lt;p&gt;Tracking patient data and generating reports.&lt;/p&gt;

&lt;h3&gt;
  
  
  Education.
&lt;/h3&gt;

&lt;p&gt;Managing student records and performance analysis.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Personal Reflection
&lt;/h2&gt;

&lt;p&gt;Learning Excel has significantly changed the way I understand and interpret data. Initially, data appeared overwhelming and difficult to analyze. However, with Excel, I have learned how to organize information systematically, apply formulas to uncover insights, and present findings in a clear and visual way. It has improved my ability to make data-driven decisions and strengthened my analytical thinking. Excel is not just a tool for calculations; it is a powerful platform for turning raw data into meaningful knowledge.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion.
&lt;/h2&gt;

&lt;p&gt;In conclusion, Microsoft Excel remains one of the most essential tools in real-world data analysis. Its combination of simplicity and powerful features makes it suitable for both beginners and advanced users. Whether it’s cleaning data, performing calculations, or visualizing trends, Excel continues to play a vital role in helping individuals and organizations make informed decisions.&lt;/p&gt;

</description>
      <category>excel</category>
      <category>dataanalysis</category>
      <category>beginners</category>
      <category>exceltips</category>
    </item>
  </channel>
</rss>
