<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Lawrence Murithi</title>
    <description>The latest articles on Forem by Lawrence Murithi (@lawrence_murithi).</description>
    <link>https://forem.com/lawrence_murithi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3713327%2Fe1187555-8b89-4a2c-9168-be280e1c6b86.png</url>
      <title>Forem: Lawrence Murithi</title>
      <link>https://forem.com/lawrence_murithi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/lawrence_murithi"/>
    <language>en</language>
    <item>
      <title>Connecting Power BI to SQL Databases: A Practical Guide for Data Analysts</title>
      <dc:creator>Lawrence Murithi</dc:creator>
      <pubDate>Tue, 17 Mar 2026 12:03:47 +0000</pubDate>
      <link>https://forem.com/lawrence_murithi/connecting-power-bi-to-sql-databases-a-practical-guide-for-data-analysts-5745</link>
      <guid>https://forem.com/lawrence_murithi/connecting-power-bi-to-sql-databases-a-practical-guide-for-data-analysts-5745</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In most modern organizations, data is one of the most valuable assets. Companies collect large amounts of information from sales systems, websites, customer platforms, and operational databases. To make sense of this information, businesses use tools that can transform this raw data into clear insights. One of the most widely used tools for this purpose is the Microsoft Power BI platform.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Power BI
&lt;/h2&gt;

&lt;p&gt;Power BI is a business intelligence and data visualization tool developed by Microsoft. It allows users to connect to different data sources, analyze data, and create interactive dashboards and reports. These reports help organizations monitor performance, understand trends, and support decision-making among other uses.&lt;/p&gt;

&lt;p&gt;Power BI is commonly used by data analysts, business managers, and decision makers because it can present complex data in simple visual forms such as charts, tables, maps, and dashboards.&lt;/p&gt;

&lt;p&gt;Most organizations store their operational and analytical data in SQL databases. SQL databases are designed to store large amounts of structured data in tables. They allow users to query, filter, update, and analyze data efficiently using Structured Query Language (SQL). SQL databases are reliable, secure, and scalable, hence they are widely used in business systems such as sales platforms, customer management systems, and inventory systems.&lt;/p&gt;

&lt;p&gt;Connecting Power BI to a database allows analysts to access this stored data directly. Instead of manually exporting data into spreadsheets, Power BI can retrieve the data automatically, refresh it when the database changes, and build dashboards that always reflect the latest information.&lt;/p&gt;

&lt;p&gt;This article explains how Power BI connects to SQL databases, how to connect to a local PostgreSQL database, how to connect to a cloud database such as Aiven PostgreSQL, and how the loaded data is modeled for analysis.&lt;/p&gt;

&lt;h3&gt;
  
  
  Understanding the Power BI Interface
&lt;/h3&gt;

&lt;p&gt;Before connecting to a database, it is helpful to understand the Power BI Desktop interface. Power BI Desktop is the main application used for building reports and dashboards.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fccc89cx73itso4ddn7vt.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fccc89cx73itso4ddn7vt.jpg" alt="BI Desktop" width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
The Power BI Desktop interface includes several sections such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ribbon (Top Menu) – Contains commands and tabs such as Get Data, Transform Data, and Publish.&lt;/li&gt;
&lt;li&gt;Report Canvas – The workspace where charts and dashboards are created.&lt;/li&gt;
&lt;li&gt;Visualizations Pane – Used to select and customize charts.&lt;/li&gt;
&lt;li&gt;Fields Pane – Displays the tables and columns loaded into Power BI.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can download Power BI desktop app &lt;a href="https://www.microsoft.com/en-us/power-platform/products/power-bi/desktop" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  Connecting Power BI to a Local PostgreSQL Database
&lt;/h3&gt;

&lt;p&gt;PostgreSQL is one of the most popular open-source relational databases used in data analytics. Many organizations run databases locally on their own servers. &lt;br&gt;
The steps below explain how to connect Power BI to a local PostgreSQL database.&lt;/p&gt;
&lt;h4&gt;
  
  
  Step 1: Open Power BI Desktop
&lt;/h4&gt;

&lt;p&gt;Start by opening Power BI Desktop on your computer.&lt;br&gt;
When the application opens, a blank report canvas appears. This is where you will build your report after loading the data.&lt;/p&gt;
&lt;h4&gt;
  
  
  Step 2: Get Data
&lt;/h4&gt;

&lt;p&gt;On the Home tab of the ribbon, click Get Data.&lt;br&gt;
This button opens a list of available data sources. Power BI supports many data sources including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Excel&lt;/li&gt;
&lt;li&gt;SQL Server&lt;/li&gt;
&lt;li&gt;PostgreSQL&lt;/li&gt;
&lt;li&gt;Web APIs
The Get Data feature is the starting point for connecting Power BI to any data source. Other data sources are as shown on the image.
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flsqplwvo6mi1vf0n859m.jpg" alt="Get Data" width="800" height="450"&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  Step 3: Select PostgreSQL Database
&lt;/h4&gt;

&lt;p&gt;From the list of available data connectors, click &lt;strong&gt;more&lt;/strong&gt; to view more options. Scroll down, select &lt;strong&gt;PostgreSQL Database&lt;/strong&gt; and click &lt;strong&gt;Connect&lt;/strong&gt;.&lt;/p&gt;
&lt;h4&gt;
  
  
  Step 4: Enter the Database Connection Details
&lt;/h4&gt;

&lt;p&gt;After selecting PostgreSQL and clicking connect, Power BI opens a connection window that requires connection details for the connection to go through.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdwa7subvjsrzicalek1n.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdwa7subvjsrzicalek1n.jpg" alt="Credentials" width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Server&lt;/strong&gt; – The location of the database server. If the database is on your computer, use localhost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Database&lt;/strong&gt; – The name of the database you want to connect to.&lt;/p&gt;
&lt;h4&gt;
  
  
  Step 5: Provide Login Credentials
&lt;/h4&gt;

&lt;p&gt;After a connection is made, Power BI will ask for authentication details.&lt;br&gt;
You will need to provide:&lt;br&gt;
&lt;strong&gt;Username&lt;br&gt;
Password&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;These credentials were set up during installation of the PostgreSQL and allows Power BI to securely access the database.&lt;/p&gt;

&lt;p&gt;Once the credentials are entered, click Connect.&lt;/p&gt;
&lt;h4&gt;
  
  
  Step 6: Select Tables to Import
&lt;/h4&gt;

&lt;p&gt;After connecting successfully, Power BI opens the Navigator Window which displays all available tables in the database.&lt;br&gt;
You can preview the contents of each table before loading them.&lt;br&gt;
There are two options:&lt;br&gt;
&lt;strong&gt;Load&lt;/strong&gt; – Import the data directly.&lt;br&gt;
&lt;strong&gt;Transform Data&lt;/strong&gt; – Clean or modify the data before loading it.&lt;/p&gt;
&lt;h3&gt;
  
  
  Connecting Power BI to a Cloud Database (Aiven PostgreSQL)
&lt;/h3&gt;

&lt;p&gt;Many organizations now store their databases in the cloud. Cloud databases are accessible through the internet and provide benefits such as scalability, backups, and easier management.&lt;br&gt;
Aiven is a cloud platform that provides managed PostgreSQL databases.&lt;br&gt;
Connecting Power BI to a cloud database is not different to connecting to a local database, only that additional security steps are required.&lt;/p&gt;

&lt;p&gt;The steps below explain how to connect Power BI to an Aiven PostgreSQL database.&lt;/p&gt;
&lt;h4&gt;
  
  
  Step 1: Get the Database Connection Details from Aiven
&lt;/h4&gt;

&lt;p&gt;Login to Aiven and inside the dashboard, you will find the connection information for your PostgreSQL service. These details are used by Power BI to locate and connect to the database.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgniq9zxf2r4e78vijr8d.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgniq9zxf2r4e78vijr8d.jpg" alt="Connection Details" width="800" height="289"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Step 2: Download and install the SSL Certificate
&lt;/h4&gt;

&lt;p&gt;Cloud database providers often require SSL encryption to secure the connection.&lt;br&gt;
An SSL certificate ensures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data transferred between Power BI and the database is encrypted&lt;/li&gt;
&lt;li&gt;Unauthorized users cannot intercept the connection&lt;/li&gt;
&lt;li&gt;The database server identity is verified&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In Aiven, download the certificate file(CA Certificate) from the Connection Information section of the service dashboard.&lt;br&gt;
Rename the downloaded file from &lt;strong&gt;ca.pem&lt;/strong&gt; to &lt;strong&gt;ca.crt&lt;/strong&gt; and install the Certificate on your PC.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzahcljjsav3r40i5mh96.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzahcljjsav3r40i5mh96.jpg" alt="SSL Certificate" width="597" height="763"&gt;&lt;/a&gt;&lt;br&gt;
Choose Local Machine as the location of the installation and click next.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx30qrvf497dbag9gsbad.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx30qrvf497dbag9gsbad.jpg" alt="Local Machine" width="793" height="775"&gt;&lt;/a&gt;&lt;br&gt;
Choose &lt;strong&gt;place all certificates in the following store&lt;/strong&gt; and browse certificate store to &lt;strong&gt;Trusted Root Certification Authorities&lt;/strong&gt;. &lt;br&gt;
Click &lt;strong&gt;ok&lt;/strong&gt; and &lt;strong&gt;finish&lt;/strong&gt;.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyg4l0upupjwxaz519po5.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyg4l0upupjwxaz519po5.jpg" alt="Store" width="783" height="811"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Step 3: Connect Power BI
&lt;/h4&gt;

&lt;p&gt;Open Power BI desktop as before, click &lt;strong&gt;Get Data&lt;/strong&gt;, click &lt;strong&gt;more&lt;/strong&gt;, scroll down and select &lt;strong&gt;PostgreSQL Database&lt;/strong&gt;.&lt;br&gt;
Copy the &lt;strong&gt;Server Name&lt;/strong&gt; from the service URL(&lt;strong&gt;host_name:port_number&lt;/strong&gt;) on Connection Information and paste on Power BI.&lt;br&gt;
Input the name of your database and click &lt;strong&gt;ok&lt;/strong&gt;.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu18dtkx45ilpu9hofwse.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu18dtkx45ilpu9hofwse.jpg" alt="server &amp;amp; db" width="800" height="425"&gt;&lt;/a&gt;&lt;br&gt;
Copy the username and password from Aiven, input them on the Power BI credentials window that opens and click &lt;strong&gt;connect&lt;/strong&gt;.&lt;br&gt;
Once the connection is successful, a navigator window opens and displays all tables in the database. &lt;br&gt;
Select the tables you want to work with and click on &lt;strong&gt;load/transfrom data&lt;/strong&gt; depending on what you wish to do with the data.&lt;br&gt;
Transform data option is used to clean raw data e.g delete any duplicates and address null values using the most appropriate method.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj384cz97geuk12zvbe1s.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj384cz97geuk12zvbe1s.jpg" alt="Load data" width="800" height="424"&gt;&lt;/a&gt;&lt;br&gt;
Successfully loaded data displays on the data pane as shown on figure below. &lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmoqoi3v0etj70udifbs3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmoqoi3v0etj70udifbs3.jpg" alt="Tables" width="800" height="427"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Creating Relationships Between Tables
&lt;/h4&gt;

&lt;p&gt;Once loaded, Power BI automatically detects relationships between tables based on matching columns using primary and foreign keys. Relationships not created can be manually created by dragging a column from one table onto the matching column in another table.&lt;br&gt;
These relationships allow Power BI to combine information across multiple tables. &lt;br&gt;
For example:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F17mn1jzfmmcng1bk54zg.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F17mn1jzfmmcng1bk54zg.jpg" alt="Connection" width="543" height="237"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the Model View, the tables appear as connected boxes. The relationships show how data flows between tables.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ebu7s8y3tr1j5io4sqr.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ebu7s8y3tr1j5io4sqr.jpg" alt="Relatioships" width="800" height="424"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Data Modeling and Why its Important
&lt;/h2&gt;

&lt;p&gt;Data modeling is the process of defining how data is stored, structured, and related within a database. It ensures that Power BI understands how different tables are related.&lt;/p&gt;

&lt;p&gt;Good data modeling allows Power BI to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Filter data correctly&lt;/li&gt;
&lt;li&gt;Calculate totals accurately&lt;/li&gt;
&lt;li&gt;Create meaningful visualizations&lt;/li&gt;
&lt;li&gt;Avoid duplicated values&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, when analyzing sales:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The sales table stores transaction records.&lt;/li&gt;
&lt;li&gt;The customers table provides customer information.&lt;/li&gt;
&lt;li&gt;The products table describes the items sold.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  why SQL skills are important for Power BI analysts
&lt;/h3&gt;

&lt;p&gt;Power BI is a powerful tool for building reports and dashboards, but it does not replace the need for strong data handling skills. Most business data is stored in SQL databases, and before that data can be visualized in Power BI, it must first be retrieved, cleaned, and structured properly. &lt;br&gt;
SQL skills give Power BI analysts a real edge by providing an easier way to grab just what you need without pulling everything into Power BI. &lt;br&gt;
Without SQL, analysts may rely too much on raw data, which can lead to slow reports, incorrect results, and inefficient workflows.&lt;br&gt;
SQL allows analysts to:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Retrieve Data&lt;/strong&gt;&lt;br&gt;
Analysts can write queries to select specific rows and columns relevant to their analysis from a database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- selecting only products name and price columns from products table&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why this matters in Power BI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduces the amount of data imported&lt;/li&gt;
&lt;li&gt;Improves performance&lt;/li&gt;
&lt;li&gt;Makes the model easier to manage&lt;/li&gt;
&lt;li&gt;Avoids unnecessary columns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Filter Data&lt;/strong&gt;&lt;br&gt;
In real-world scenarios, not all data is useful for analysis. Analysts often need to focus on specific time periods, regions, or business conditions. SQL thus makes it easy to filter datasets based on a specific criteria before loading them into Power BI.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Retrieving only sales from 2024 onwards.&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;sale_date&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2024-01-01'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why this matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduces dataset size&lt;/li&gt;
&lt;li&gt;Speeds up report loading&lt;/li&gt;
&lt;li&gt;Focuses analysis on relevant data&lt;/li&gt;
&lt;li&gt;Avoids unnecessary processing inside Power BI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Perform Aggregations&lt;/strong&gt;&lt;br&gt;
Aggregation is the process of summarizing data. In business analysis, analysts often need totals, averages, counts, and other summary metrics. SQL can summarize large datasets quickly by using functions such as GROUP BY, SUM, COUNT, and AVG.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Calculating total sales per product&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_quantity&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why aggregation in SQL is important:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduces data volume before loading&lt;/li&gt;
&lt;li&gt;Improves Power BI performance&lt;/li&gt;
&lt;li&gt;Simplifies data models&lt;/li&gt;
&lt;li&gt;Avoids heavy calculations in DAX&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Preparing data for Analysis
&lt;/h3&gt;

&lt;p&gt;Raw data must be cleaned or transformed before it is ready for visualization. &lt;br&gt;
SQL can be used to:&lt;/p&gt;
&lt;h4&gt;
  
  
  Joining Tables and Combining Data
&lt;/h4&gt;

&lt;p&gt;Business data is usually stored in multiple tables.&lt;br&gt;
SQL allows analysts to combine these tables using joins. Joined datasets in SQL can simplify the data model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sales_amount&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why this is important:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Combines related data into one dataset&lt;/li&gt;
&lt;li&gt;Reduces the need for complex relationships in Power BI&lt;/li&gt;
&lt;li&gt;Makes analysis easier&lt;/li&gt;
&lt;li&gt;Prevents duplication errors&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Data Cleaning and Preparation
&lt;/h4&gt;

&lt;p&gt;Raw data is often messy. It may contain:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Missing values&lt;/li&gt;
&lt;li&gt;Duplicate records&lt;/li&gt;
&lt;li&gt;Incorrect formats&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;SQL helps clean and prepare the data before it is loaded into Power BI hence leading to better insights.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Eliminating duplicates&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;--Handling missing values&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; 
  &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phone_number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Not Provided'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;phone&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why data cleaning matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ensures data accuracy&lt;/li&gt;
&lt;li&gt;Improves report reliability&lt;/li&gt;
&lt;li&gt;Reduces cleaning work in Power BI&lt;/li&gt;
&lt;li&gt;Prevents errors in calculations&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Creating Calculated Fields
&lt;/h4&gt;

&lt;p&gt;SQL allows analysts to create new columns based on existing data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Calculate total sales&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; 
  &lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;quantity&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_sales&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why calculated fields are useful:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prepares key metrics before loading&lt;/li&gt;
&lt;li&gt;Reduces need for DAX calculations&lt;/li&gt;
&lt;li&gt;Keeps logic centralized in the database&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Supporting Advanced Analysis
&lt;/h4&gt;

&lt;p&gt;SQL also supports more advanced operations such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Window functions (running totals, ranking)&lt;/li&gt;
&lt;li&gt;Subqueries&lt;/li&gt;
&lt;li&gt;Common Table Expressions (CTEs)&lt;/li&gt;
&lt;li&gt;Data transformations
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; 
  &lt;span class="n"&gt;sale_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sales_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;sale_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;running_total&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Power BI is a powerful tool that helps organizations transform raw data into meaningful insights. By connecting directly to SQL databases, Power BI allows analysts to access structured data stored in business systems and convert it into interactive dashboards and reports.&lt;br&gt;
SQL prepares the foundation, and Power BI builds the story on top of it. Strong SQL skills allow analysts to work more efficiently, produce accurate reports, and deliver better insights for decision-making.&lt;br&gt;
When SQL and Power BI are used together, they provide a powerful combination for modern data analysis and business intelligence.&lt;/p&gt;

</description>
      <category>luxdev</category>
      <category>dataengineering</category>
      <category>sql</category>
      <category>database</category>
    </item>
    <item>
      <title>Mastering SQL Joins and Window Functions</title>
      <dc:creator>Lawrence Murithi</dc:creator>
      <pubDate>Tue, 03 Mar 2026 10:36:56 +0000</pubDate>
      <link>https://forem.com/lawrence_murithi/mastering-sql-joins-and-window-functions-1f30</link>
      <guid>https://forem.com/lawrence_murithi/mastering-sql-joins-and-window-functions-1f30</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;SQL (Structured Query Language) is a powerful tool used to search, manage, and analyze large amounts of data. It is widely used by data enthusiasts, software developers and even marketing professionals.&lt;br&gt;
In real-world databases, data is not stored in one large table. It is divided into multiple related tables. This makes storage efficient and avoids duplication. To work effectively with such data, you must understand SQL joins and window functions. These two features allow you to combine data correctly and perform advanced analysis without losing important details.&lt;/p&gt;
&lt;h2&gt;
  
  
  SQL Joins
&lt;/h2&gt;

&lt;p&gt;A JOIN in SQL is used to combine rows from two or more tables based on a related column. This relationship is usually created using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A primary key (unique identifier in one table)&lt;/li&gt;
&lt;li&gt;A foreign key (reference to that key in another table)
Joins are essential when working with relational databases because data is often split across multiple tables.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Importance of Joins
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Combining related data from multiple tables&lt;/li&gt;
&lt;li&gt;Maintaining relational integrity&lt;/li&gt;
&lt;li&gt;Supporting one-to-many and many-to-many relationships&lt;/li&gt;
&lt;li&gt;Building meaningful reports and analytics&lt;/li&gt;
&lt;li&gt;Preventing unnecessary duplication of data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The type of join you use directly affects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The number of rows returned&lt;/li&gt;
&lt;li&gt;Whether NULL values appear&lt;/li&gt;
&lt;li&gt;How business logic is interpreted
NB: Choosing the wrong join can lead to missing data, duplicated records, or incorrect analysis.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Types of SQL Joins
&lt;/h3&gt;
&lt;h4&gt;
  
  
  INNER JOIN
&lt;/h4&gt;

&lt;p&gt;The INNER JOIN returns only the rows that have matching values in both tables.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Combines records based on a related column&lt;/li&gt;
&lt;li&gt;Returns only matching rows&lt;/li&gt;
&lt;li&gt;Excludes non-matching rows
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fijwwf41l1fwvnmt9kowo.jpg" alt="Inner Join" width="800" height="105"&gt;
INNER JOIN is used when:&lt;/li&gt;
&lt;li&gt;You only need matched data&lt;/li&gt;
&lt;li&gt;You want to exclude incomplete relationships&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  LEFT (OUTER) JOIN
&lt;/h4&gt;

&lt;p&gt;The LEFT (OUTER) JOIN returns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All rows from the left table&lt;/li&gt;
&lt;li&gt;Matching rows from the right table&lt;/li&gt;
&lt;li&gt;NULL values if no match exists
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fftfmhtuhwawp91s4otip.jpg" alt="Left Join" width="800" height="85"&gt;
LEFT JOIN is used when:&lt;/li&gt;
&lt;li&gt;You want all records from the main table&lt;/li&gt;
&lt;li&gt;You want to identify missing matches&lt;/li&gt;
&lt;li&gt;You need complete reporting from one side&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  RIGHT (OUTER) JOIN
&lt;/h4&gt;

&lt;p&gt;The RIGHT (OUTER) JOIN returns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All rows from the right table&lt;/li&gt;
&lt;li&gt;Matching rows from the left table&lt;/li&gt;
&lt;li&gt;NULL where no match exists on the left
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjn4qq9vkqqibun7ys4i9.jpg" alt="Right Join" width="800" height="83"&gt;
NB: RIGHT JOIN works like LEFT JOIN but from the opposite direction.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  FULL (OUTER) JOIN
&lt;/h4&gt;

&lt;p&gt;The FULL JOIN returns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All rows from both tables&lt;/li&gt;
&lt;li&gt;Matching records where possible&lt;/li&gt;
&lt;li&gt;NULL values where no match exists
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk7lesr3ojfu51sn1gj8g.jpg" alt="Full Join" width="800" height="83"&gt;
The FULL JOIN is used when:&lt;/li&gt;
&lt;li&gt;Comparing two datasets&lt;/li&gt;
&lt;li&gt;Identifying differences between systems&lt;/li&gt;
&lt;li&gt;Performing reconciliation tasks&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  CROSS JOIN
&lt;/h4&gt;

&lt;p&gt;A CROSS JOIN returns all possible combinations of rows thus can create very large results.&lt;br&gt;
If Table A has 5 rows and Table B has 10 rows:&lt;br&gt;
Result = 50 rows.&lt;br&gt;
It does not use a matching condition.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftbk5gfm1fjjiemhjcp68.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftbk5gfm1fjjiemhjcp68.jpg" alt="Cross Join" width="800" height="112"&gt;&lt;/a&gt;&lt;br&gt;
A CROSS JOIN is used to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate combinations&lt;/li&gt;
&lt;li&gt;Create calendar expansions&lt;/li&gt;
&lt;li&gt;Test scenarios&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  SELF JOIN
&lt;/h4&gt;

&lt;p&gt;A self join joins a table to itself. Aliases are used to refer to the same tale&lt;br&gt;
Example:&lt;br&gt;
Employee table:&lt;br&gt;
| EmployeeID | ManagerID |&lt;br&gt;
To show each employee and their manager name, the table is joined to itself.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxkfb28zynixzjr1xj2wm.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxkfb28zynixzjr1xj2wm.jpg" alt="Self Join" width="800" height="103"&gt;&lt;/a&gt;&lt;br&gt;
Self joins are useful for hierarchical data.&lt;/p&gt;
&lt;h4&gt;
  
  
  NATURAL JOIN
&lt;/h4&gt;

&lt;p&gt;A natural join Joins all tables using columns that have the same name.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq7932nmokz5hfoco4y1k.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq7932nmokz5hfoco4y1k.jpg" alt="Natural Join" width="800" height="47"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Performance Considerations for Joins.
&lt;/h3&gt;

&lt;p&gt;Joins can affect performance, especially in large databases.&lt;br&gt;
Best practices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Index join columns (primary and foreign keys)&lt;/li&gt;
&lt;li&gt;Avoid unnecessary joins&lt;/li&gt;
&lt;li&gt;Filter data early using WHERE&lt;/li&gt;
&lt;li&gt;Understand execution plans&lt;/li&gt;
&lt;li&gt;Be careful with joins that multiply rows unintentionally&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Improper joins can cause:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Duplicate results&lt;/li&gt;
&lt;li&gt;Data inflation&lt;/li&gt;
&lt;li&gt;Slow query execution&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Window Functions
&lt;/h2&gt;

&lt;p&gt;Window functions allow us to perform advanced calculations across a group of related rows while keeping the original data. They are useful for ranking, running totals, moving averages, and analytical reporting.&lt;br&gt;
Window functions often remove the need for complex self-joins and provide an analytical layer within SQL.&lt;br&gt;
Window functions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; Keep every row&lt;/li&gt;
&lt;li&gt;Add calculated values to each row&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Structure of a window function:
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT column_1,
       function() OVER (
           PARTITION BY column
           ORDER BY column
       ) AS output_column
FROM table_name;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h4&gt;
  
  
  1.  OVER()
&lt;/h4&gt;

&lt;p&gt;The OVER() clause defines how the window function operates and controls:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Partitioning&lt;/li&gt;
&lt;li&gt;Ordering&lt;/li&gt;
&lt;li&gt;Optional frame boundaries&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  2.  PARTITION BY()
&lt;/h4&gt;

&lt;p&gt;The PARTITION BY divides rows into logical groups. If omitted, the entire dataset is treated as one group.&lt;/p&gt;
&lt;h4&gt;
  
  
  3.  ORDER BY()
&lt;/h4&gt;

&lt;p&gt;ORDER BY defines the sequence of rows inside each partition.&lt;/p&gt;

&lt;p&gt;It is essential for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ranking&lt;/li&gt;
&lt;li&gt;Running totals&lt;/li&gt;
&lt;li&gt;Time-based comparisons&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If ORDER BY is omitted, row processing order is undefined.&lt;/p&gt;
&lt;h4&gt;
  
  
  4. Frame Clause (ROWS vs RANGE)
&lt;/h4&gt;

&lt;p&gt;Used to define a range of rows(boundary) and commonly used for moving averages and cumulative calculations.&lt;br&gt;
In the ROWS subclause, the frame is defined by beginning and ending row positions while in the RANGE subclause, the frame is defined by a value range.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ROWS BETWEEN lower_bound AND upper_bound
RANGE BETWEEN lower_bound AND upper_bound
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Types of SQL Window Functions
&lt;/h3&gt;

&lt;p&gt;Window functions fall into three main categories.&lt;/p&gt;

&lt;h4&gt;
  
  
  1. Aggregate Window Functions
&lt;/h4&gt;

&lt;p&gt;These include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AVG() - Calculates moving averages.&lt;/li&gt;
&lt;li&gt;SUM() - Creates running totals.&lt;/li&gt;
&lt;li&gt;COUNT() - calculates the number of items found in a group.&lt;/li&gt;
&lt;li&gt;MIN() - returns the minimum value.&lt;/li&gt;
&lt;li&gt;MAX() - returns the maximum value.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some use cases of Aggregate window functions include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Department totals&lt;/li&gt;
&lt;li&gt;Running totals&lt;/li&gt;
&lt;li&gt;Moving averages&lt;/li&gt;
&lt;li&gt;Cumulative metrics&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  2. Ranking Window Functions
&lt;/h4&gt;

&lt;p&gt;They are used to assign position or rank.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ROW_NUMBER() - Assigns a unique number to each row. &lt;/li&gt;
&lt;li&gt;RANK() - Assigns rank with gaps when ties exist.&lt;/li&gt;
&lt;li&gt;DENSE_RANK() - Similar to RANK but does not skip numbers and better for ranking reports where gaps are not desired.&lt;/li&gt;
&lt;li&gt;PERCENT_RANK() - calculates the relative rank of a row within a group of rows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some use cases of Ranking window functions include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Top N per group&lt;/li&gt;
&lt;li&gt;Performance ranking&lt;/li&gt;
&lt;li&gt;Leaderboards&lt;/li&gt;
&lt;li&gt;Percentile analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  3. Offset (Value) Window Functions
&lt;/h4&gt;

&lt;p&gt;They are used to access data from other rows.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LAG() - shows previous row value and used in time-based analysis.&lt;/li&gt;
&lt;li&gt;LEAD() - shows next row value and used in time-based analysis.&lt;/li&gt;
&lt;li&gt;FIRST_VALUE() - returns the first value in an ordered set of values within a partition.&lt;/li&gt;
&lt;li&gt;LAST_VALUE() - returns the last value in an ordered set of values within a partition.&lt;/li&gt;
&lt;li&gt;NTH_VALUE() - Divides rows into equal groups and useful in performance analysis and segmentation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some use cases of Offset window functions are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Month-over-month growth&lt;/li&gt;
&lt;li&gt;Time-series comparison&lt;/li&gt;
&lt;li&gt;Trend detection&lt;/li&gt;
&lt;li&gt;Sequential analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;SQL joins and window functions are core tools for designing efficient and powerful queries.&lt;br&gt;
Joins allow you to combine data from multiple tables using defined relationships while Window functions provide an advanced analytical layer in SQL.&lt;/p&gt;

</description>
      <category>luxdev</category>
      <category>sql</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>The Power of BI; Translating Messy Data, DAX, and Dashboards into Action</title>
      <dc:creator>Lawrence Murithi</dc:creator>
      <pubDate>Mon, 09 Feb 2026 17:59:06 +0000</pubDate>
      <link>https://forem.com/lawrence_murithi/the-power-of-bi-translating-messy-data-dax-and-dashboards-into-action-3kmj</link>
      <guid>https://forem.com/lawrence_murithi/the-power-of-bi-translating-messy-data-dax-and-dashboards-into-action-3kmj</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In the real world, data rarely comes in a clean and perfect format. Most of the time, it comes from multiple systems, created by different people, and maintained with different rules. It may have missing values, inconsistent naming, or outdated records. Databases may store the same information in different ways. This is where analysts come in. Their work is not just about building reports, but about turning raw, messy data into clear insights that drive real business actions. Power BI is one of the main tools that helps them do this effectively.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the Business Before Touching the Data
&lt;/h2&gt;

&lt;p&gt;Good analysts first try to understand what problem they want to solve before embarking on the data analysis journey. They ask questions like: What decision needs to be made? Who will use the report? What actions should the dashboard support? Without this context, even the best-looking dashboard can fail.&lt;br&gt;
Understanding who your target is and what their needs are could help analysts decide what data to use, what calculations matter, and what level of detail is required.&lt;/p&gt;

&lt;h2&gt;
  
  
  Making Sense of Messy Data
&lt;/h2&gt;

&lt;p&gt;Most data comes in a rough and messy state, therefore, a majority of an analyst’s time is spent cleaning and preparing the data to make sure it is reliable. Power BI’s Power Query tool is designed for this task. Analysts use it to load data from many sources such as Excel files, SQL databases, APIs, and cloud platforms. Power Query allows analysts to apply repeatable step-by-step transformations to clean the data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Common data problems analysts handle
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Missing values&lt;/li&gt;
&lt;li&gt;Duplicate records that inflate totals&lt;/li&gt;
&lt;li&gt;Different spellings or codes for the same category&lt;/li&gt;
&lt;li&gt;Incomplete dates or incorrect data types&lt;/li&gt;
&lt;li&gt;Columns that mix multiple values in one field&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Building a Strong Data Model
&lt;/h2&gt;

&lt;p&gt;After cleaning the data, analysts focus on building a proper data model. This includes defining relationships between tables, choosing the correct granularity, and organizing data in a way that supports accurate analysis.&lt;/p&gt;

&lt;h3&gt;
  
  
  Characteristics of a well-designed data model
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Improves performance&lt;/li&gt;
&lt;li&gt;Makes DAX calculations easier&lt;/li&gt;
&lt;li&gt;Reduces confusion for report users&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Analysts often use star schemas, separating fact tables from dimension tables, to keep the model simple and efficient. This step is invisible to most users, but is critical for reliable results.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using DAX to Add Meaning to the Data
&lt;/h2&gt;

&lt;p&gt;Raw data alone does not answer business questions. DAX(Data Analysis Expressions), therefore, helps analysts turn raw numbers into useful metrics/insight. Analysts use DAX to create measures that reflect real performance. For example, instead of showing total sales, DAX can show how sales compare over different months or years thus leading to better decision-making.&lt;/p&gt;

&lt;h3&gt;
  
  
  Examples of Insights obtained using DAX
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Year-over-year growth&lt;/li&gt;
&lt;li&gt;Employee turnover rate&lt;/li&gt;
&lt;li&gt;Running totals and averages&lt;/li&gt;
&lt;li&gt;Comparison of current performance to past periods&lt;/li&gt;
&lt;li&gt;Percentages, ratios, and growth rates&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Building Dashboards That Tell a Story
&lt;/h2&gt;

&lt;p&gt;Once the data and calculations are ready, analysts design dashboards with the end user in mind. Dashboards are not just charts but are tools for communication. They should guide users toward key insights without overwhelming them. The goal is clarity, not complexity. A good dashboard should have the right visuals, avoid clutter and highlight the most important numbers hence telling a clear story. Filters and slicers allow users to explore the data without needing technical skills.&lt;br&gt;
A good dashboard should have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The right chart type for each metric&lt;/li&gt;
&lt;li&gt;Logical layout and flow of information&lt;/li&gt;
&lt;li&gt;Clear labels, titles, and tooltips&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Good Dashboards help users:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Quickly see what is going well&lt;/li&gt;
&lt;li&gt;Spot problems early&lt;/li&gt;
&lt;li&gt;Ask better questions&lt;/li&gt;
&lt;li&gt;Drill down to find the cause&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Turning Insights into Real Business Actions
&lt;/h2&gt;

&lt;p&gt;The real success of a dashboard is determined by the actions it supports. A good dashboard helps teams respond faster and make better decisions. This could include setting targets, tracking performance, or identifying risks early.&lt;br&gt;
Since dashboards can refresh automatically, decisions are based on up-to-date information rather than static reports. Alerts and shared reports also ensure that insights reach the right people at the right time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Measuring Impact and Improving Over Time
&lt;/h2&gt;

&lt;p&gt;Analyst work does not end after publishing a dashboard. They gather feedback, monitor usage, and refine reports over time. As business needs change, dashboards must evolve with them.&lt;/p&gt;

&lt;p&gt;Over time, Power BI reports can show measurable impact, such as reduced costs, improved performance, faster reporting cycles, or better accountability. These outcomes demonstrate how technical skills translate into real business value.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Analysts act as a bridge between raw data and business decisions. They clean messy data, use DAX to add meaning, and design dashboards that help people understand what is happening and what to do next. While Power BI provides the tools, it is the analyst’s understanding of data and business that turns information into action.&lt;/p&gt;

&lt;p&gt;The value of Power BI, therefore, is not in the visuals or formulas alone but lies in how analysts use it to support smarter decisions and create measurable impact across the organization.&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>luxdev</category>
    </item>
    <item>
      <title>Schemas and Data Modelling in Power BI</title>
      <dc:creator>Lawrence Murithi</dc:creator>
      <pubDate>Tue, 03 Feb 2026 15:19:33 +0000</pubDate>
      <link>https://forem.com/lawrence_murithi/schemas-and-data-modelling-in-power-bi-2ja6</link>
      <guid>https://forem.com/lawrence_murithi/schemas-and-data-modelling-in-power-bi-2ja6</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Power BI is a business intelligence tool used to turn raw data into meaningful reports and dashboards. It allows organizations to analyze data, track performance, and make informed decisions. However, the quality of insights produced by Power BI depends heavily on how the data is structured behind the scenes. Good visuals and advanced calculations cannot compensate for poorly designed data.&lt;/p&gt;

&lt;h3&gt;
  
  
  What are Schemas and Data Modelling
&lt;/h3&gt;

&lt;p&gt;Data modelling is the process of organizing data into tables and defining how those tables relate to each other. A schema is the structure or design of this data model. &lt;br&gt;
Schemas and data modelling define how data is organized, connected, and interpreted in Power BI. They determine how tables relate to each other, how filters flow across the model, and how calculations are performed. &lt;/p&gt;
&lt;h4&gt;
  
  
  Characteristics of a data model.
&lt;/h4&gt;

&lt;p&gt;A good data model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Makes reports faster and more responsive&lt;/li&gt;
&lt;li&gt;Produces correct totals and calculations&lt;/li&gt;
&lt;li&gt;Is easy to understand and maintain&lt;/li&gt;
&lt;li&gt;Works naturally with DAX measures and visuals.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A bad data model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Slows down reports&lt;/li&gt;
&lt;li&gt;Produces wrong or inconsistent numbers&lt;/li&gt;
&lt;li&gt;Forces complex and hard-to-read DAX formulas&lt;/li&gt;
&lt;li&gt;Confuses report users and developers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This article explains schemas and data modelling in Power BI, focusing on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Star schema&lt;/li&gt;
&lt;li&gt;Snowflake schema&lt;/li&gt;
&lt;li&gt;Fact and dimension tables&lt;/li&gt;
&lt;li&gt;Relationships&lt;/li&gt;
&lt;li&gt;Importance of good modelling for performance and accurate reporting&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Understanding Data Modelling in Power BI
&lt;/h2&gt;

&lt;p&gt;Data modelling happens after data is loaded from sources such as Excel, databases, or cloud systems. The model is built in the Model view, where tables and relationships are defined.&lt;br&gt;
A Power BI data model usually includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fact tables (business events or measurements)&lt;/li&gt;
&lt;li&gt;Dimension tables (descriptive information)&lt;/li&gt;
&lt;li&gt;Relationships between tables&lt;/li&gt;
&lt;li&gt;A schema design (such as star or snowflake)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Fact Tables and Dimension Tables
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Fact tables&lt;/strong&gt; store measurable business data (i.e) numerical data that you want to analyze and measure.&lt;br&gt;
Characteristics of fact tables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Very large, with many rows&lt;/li&gt;
&lt;li&gt;Contain numeric values used in calculations&lt;/li&gt;
&lt;li&gt;Contain keys that link to dimension tables&lt;/li&gt;
&lt;li&gt;Grow over time as new transactions are added.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples of fact data:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sales amount&lt;/li&gt;
&lt;li&gt;Quantity sold&lt;/li&gt;
&lt;li&gt;Profit.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Dimension tables&lt;/strong&gt; store descriptive information that helps explain the facts. Dimension tables are used for filtering, grouping, and slicing data in reports.&lt;/p&gt;

&lt;p&gt;Characteristics of dimension tables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Smaller than fact tables&lt;/li&gt;
&lt;li&gt;Mostly text and categorical data&lt;/li&gt;
&lt;li&gt;Used for filtering, grouping, and slicing&lt;/li&gt;
&lt;li&gt;Change less frequently than facts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples of dimension data:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Product name&lt;/li&gt;
&lt;li&gt;Customer name&lt;/li&gt;
&lt;li&gt;Region&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Relationships in Power BI
&lt;/h2&gt;

&lt;p&gt;Relationships in Power BI define how tables are connected and how data flows between them. A relationship is usually created between a key column in one table and a matching column in another table. These keys allow Power BI to link descriptive data from dimension tables to numerical data in fact tables. Relationships are mainly defined by cardinality, direction, and status.&lt;/p&gt;
&lt;h3&gt;
  
  
  Types of Relationships
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;One-to-Many&lt;/strong&gt; - This is the most common and recommended relationship where one record in a dimension table matches many records in a fact table.&lt;br&gt;
&lt;strong&gt;One-to-One&lt;/strong&gt; - One row in one table matches one row in another table.&lt;br&gt;
&lt;strong&gt;Many-to-One&lt;/strong&gt; - Many rows in the fact table match one row in the dimension table.&lt;br&gt;
&lt;strong&gt;Many-to-Many&lt;/strong&gt; - multiple rows in one table match multiple rows in another table and is often used when there is no unique key.&lt;/p&gt;
&lt;h3&gt;
  
  
  Why Relationships Matter
&lt;/h3&gt;

&lt;p&gt;Good relationships:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ensure correct totals and aggregations&lt;/li&gt;
&lt;li&gt;Control how slicers and filters behave&lt;/li&gt;
&lt;li&gt;Improve report performance&lt;/li&gt;
&lt;li&gt;Make DAX measures simpler and easier to maintain&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On the contrary, poorly defined relationships often result in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wrong numbers&lt;/li&gt;
&lt;li&gt;Missing data in visuals&lt;/li&gt;
&lt;li&gt;Confusing filter behavior&lt;/li&gt;
&lt;li&gt;Slow reports&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Best Practices for Relationships in Power BI
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Use one-to-many relationships wherever possible&lt;/li&gt;
&lt;li&gt;Connect dimension tables to fact tables, not dimension to dimension&lt;/li&gt;
&lt;li&gt;Use numeric surrogate keys instead of text&lt;/li&gt;
&lt;li&gt;Avoid unnecessary many-to-many relationships&lt;/li&gt;
&lt;li&gt;Use single-direction filtering by default&lt;/li&gt;
&lt;li&gt;Keep the model simple and clear&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Star Schema
&lt;/h2&gt;

&lt;p&gt;A star schema is the recommended data model in Power BI and consists of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One central fact table&lt;/li&gt;
&lt;li&gt;Multiple dimension tables connected directly to the fact table.
A star schema structure looks like a star, with the fact table in the center and dimension tables branching out around it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        Date
         |
Product — Sales — Customer
         |
       Region
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Benefits of Star Schema
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Simple and easy to understand&lt;/li&gt;
&lt;li&gt;Faster query performance&lt;/li&gt;
&lt;li&gt;Fewer relationships&lt;/li&gt;
&lt;li&gt;Easier DAX calculations&lt;/li&gt;
&lt;li&gt;Better compatibility with Power BI’s engine.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Power BI can process queries more efficiently because dimension tables are not connected to each other.&lt;/p&gt;

&lt;h2&gt;
  
  
  Snowflake Schema
&lt;/h2&gt;

&lt;p&gt;A snowflake schema is a more complex version of the star schema. In this structure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dimension tables are normalized&lt;/li&gt;
&lt;li&gt;Dimension tables are connected to other dimension tables.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Sales → Product → Category → Department
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Benefits of Snowflake Schema
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Reduces data redundancy&lt;/li&gt;
&lt;li&gt;Useful for very large or complex dimensions&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Challenges of Snowflake Schema:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;More complex relationships&lt;/li&gt;
&lt;li&gt;Slower performance due to extra joins&lt;/li&gt;
&lt;li&gt;More complex DAX measures&lt;/li&gt;
&lt;li&gt;Harder for users to understand&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why Good Data Modelling is Critical
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Performance&lt;/strong&gt; - Power BI uses an in-memory engine. A clean star schema reduces joins and improves query speed. Poor models can cause reports to load slowly or even fail.&lt;br&gt;
&lt;strong&gt;Accurate Reporting&lt;/strong&gt;  - Correct relationships and proper table design ensure that filters and totals behave correctly. Bad modelling often leads to duplicated values or missing data.&lt;br&gt;
&lt;strong&gt;Simpler DAX&lt;/strong&gt; - DAX formulas are easier to write and maintain when the model is simple. Complex schemas often require complicated formulas, which are harder to debug.&lt;br&gt;
&lt;strong&gt;Better User Experience&lt;/strong&gt; - Business users prefer models that are easy to understand. Clear table names, logical relationships, and simple structures help users create reports without confusion.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Modelling Mistakes
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Using many-to-many relationships unnecessarily&lt;/li&gt;
&lt;li&gt;Mixing transactional and lookup data in one table&lt;/li&gt;
&lt;li&gt;Using bi-directional relationships everywhere&lt;/li&gt;
&lt;li&gt;Not creating a proper date dimension&lt;/li&gt;
&lt;li&gt;Loading unnecessary columns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;NB:&lt;/strong&gt; Avoiding these mistakes improves both performance and reliability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Good data modelling is the foundation of effective Power BI reporting. Visuals and calculations only work well when the underlying model is designed correctly. A clean star schema with clear fact and dimension tables leads to faster performance, accurate results, and reports that are easy to build and maintain. &lt;br&gt;
 Good data modelling improves performance, ensures accurate reporting, simplifies DAX, and makes reports easier to use. Without proper modelling, even the best visuals cannot deliver correct insights.&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>schema</category>
      <category>datamodelling</category>
    </item>
    <item>
      <title>Introduction to Linux for Data Engineers</title>
      <dc:creator>Lawrence Murithi</dc:creator>
      <pubDate>Mon, 26 Jan 2026 16:28:35 +0000</pubDate>
      <link>https://forem.com/lawrence_murithi/introduction-to-linux-for-data-engineers-29jn</link>
      <guid>https://forem.com/lawrence_murithi/introduction-to-linux-for-data-engineers-29jn</guid>
      <description>&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;Linux is one of the most important tools for data engineers. Most data systems today run on Linux servers, including cloud platforms, databases, and big data tools like Hadoop and Spark. Understanding Linux basics is, therefore, a key skill for anyone starting a career in data engineering.&lt;/p&gt;

&lt;p&gt;This article introduces Linux in a simple way. It explains why Linux is important for data engineers, shows basic Linux commands, and demonstrates how to create and edit files using Vi and Nano, which are common Linux text editors.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Linux for Data Engineers
&lt;/h3&gt;

&lt;p&gt;Linux is important for data engineers for several reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Most data pipelines run on Linux servers&lt;/li&gt;
&lt;li&gt;Cloud platforms like AWS, Azure, and Google Cloud use Linux&lt;/li&gt;
&lt;li&gt;Tools such as Hadoop, Spark, Airflow, and Kafka are built for Linux&lt;/li&gt;
&lt;li&gt;Linux is stable, secure, and efficient for large data processing&lt;/li&gt;
&lt;li&gt;Data engineers often work with Log files, Configuration files and Scripts written in Python, SQL, or Bash. Linux makes it easy to manage these files directly from the terminal.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Basic Linux Commands For Beginners
&lt;/h3&gt;

&lt;p&gt;Linux commands are instructions typed in the terminal to tell the operating system what to do, such as creating files, moving between folders, or running programs. They allow users to interact directly with the system in a fast and efficient way. Linux commands help manage files, automate tasks and work effectively on servers, which is essential in data engineering and software development.&lt;br&gt;
Below are some of the beginner linux commands.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ssh root@IP&lt;/strong&gt; - connects to the server
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkgnls3mkhgft5boas1i2.jpg" alt="connect to server" width="800" height="202"&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;pwd&lt;/strong&gt; - Shows current directory&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F587xtbaj2j82213ilg0k.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F587xtbaj2j82213ilg0k.jpg" alt="current directory" width="800" height="66"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ls&lt;/strong&gt; - shows all files and folders in the current directory.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3mpamitb90f2ygfrxgn2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3mpamitb90f2ygfrxgn2.jpg" alt="list of files" width="800" height="79"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;cd&lt;/strong&gt; - Changes directory and &lt;strong&gt;cd ..&lt;/strong&gt; moves back one level&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fev19c7uurtluu6dnt6kr.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fev19c7uurtluu6dnt6kr.jpg" alt="Changes directory" width="800" height="54"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;mkdir&lt;/strong&gt;- Creates a new directory&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6et04w276yxlpbo7uqus.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6et04w276yxlpbo7uqus.jpg" alt="Creates a new folder" width="800" height="88"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;touch&lt;/strong&gt; - Creates an empty file&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzail6y4d6jb2n1dxk3fd.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzail6y4d6jb2n1dxk3fd.jpg" alt="Creates a new file" width="800" height="58"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;cp&lt;/strong&gt; - Copies files and &lt;strong&gt;cp -r&lt;/strong&gt; copies folder&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F16obwrr1ufd3g8ybl85c.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F16obwrr1ufd3g8ybl85c.jpg" alt="copy files" width="800" height="221"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;mv&lt;/strong&gt; - Moves or renames files&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fywehezjr2voj3kb9omxn.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fywehezjr2voj3kb9omxn.jpg" alt="rename file/folder" width="800" height="185"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;rm&lt;/strong&gt; - Deletes a file and &lt;strong&gt;rm -r&lt;/strong&gt; deletes a folder&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F70xipdqhtewyj4jpe7dg.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F70xipdqhtewyj4jpe7dg.jpg" alt="delete file/folder" width="800" height="145"&gt;&lt;/a&gt;&lt;br&gt;
-&lt;strong&gt;cat&lt;/strong&gt; - display file content&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx8kkeo6w9ctdxvc2z6g4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx8kkeo6w9ctdxvc2z6g4.jpg" alt="display file content" width="800" height="60"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Linux Vi and Nano Text Editors&lt;/strong&gt;&lt;br&gt;
Linux editors are programs used to create, open, and edit text files directly from the terminal. They are important because many configuration files, scripts, and logs in Linux are text-based. Data engineers and developers often use Linux editors when working on servers where graphical tools are not available.&lt;br&gt;
Some of the common Linux text editors are Vi and Nano. &lt;br&gt;
&lt;strong&gt;1. Nano Editor&lt;/strong&gt;&lt;br&gt;
Nano is a simple and beginner-friendly editor.&lt;br&gt;
To open or create a file with Nano:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nano filename.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F22el82mc6eivsybispy4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F22el82mc6eivsybispy4.jpg" alt="opening nano" width="800" height="161"&gt;&lt;/a&gt;&lt;br&gt;
The command opens the window below.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0xzfjrwjr1ts9jcapa8f.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0xzfjrwjr1ts9jcapa8f.jpg" alt="nano editor" width="800" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Other nano commands
&lt;/h4&gt;

&lt;h5&gt;
  
  
  Command    What it does
&lt;/h5&gt;

&lt;p&gt;Ctrl + O    Saves the file&lt;br&gt;
Ctrl + X    Exits Nano&lt;br&gt;
Ctrl + G    Shows help&lt;br&gt;
Ctrl + W    Searches for text&lt;br&gt;
Ctrl + K    Cuts (removes) a line&lt;br&gt;
Ctrl + U    Pastes a cut line&lt;br&gt;
Ctrl + A    Moves cursor to start of line&lt;br&gt;
Ctrl + E    Moves cursor to end of line&lt;br&gt;
Ctrl + C    Shows current line and column&lt;br&gt;
Ctrl + _    Go to a specific line number  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Vi/Vim Editor&lt;/strong&gt;&lt;br&gt;
Vi is a powerful editor and widely used in professional environments.&lt;br&gt;
Vi has 3 main modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Normal mode - navigation and commands&lt;/li&gt;
&lt;li&gt;Insert mode - typing text&lt;/li&gt;
&lt;li&gt;Visual mode - selecting text
To open a file in Vi editor:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;vi filename.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0xlo6qvida5587lliorm.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0xlo6qvida5587lliorm.jpg" alt="open vi editor" width="800" height="199"&gt;&lt;/a&gt;&lt;br&gt;
The below window opens when the command is prompted.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fefoqg9hnzcivrrm3ce4t.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fefoqg9hnzcivrrm3ce4t.jpg" alt="vi editor" width="800" height="476"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Other vi commands
&lt;/h4&gt;

&lt;h5&gt;
  
  
  Entering Insert Mode
&lt;/h5&gt;

&lt;p&gt;i - insert before cursor&lt;br&gt;
a - append after cursor&lt;br&gt;
o - open new line below&lt;br&gt;
I - insert at beginning of line&lt;br&gt;
A - append at end of line&lt;/p&gt;

&lt;h5&gt;
  
  
  Saving and Exiting/quiting
&lt;/h5&gt;

&lt;p&gt;:w - save (write)&lt;br&gt;
:q - quit&lt;br&gt;
:wq or ZZ - save and quit&lt;br&gt;
:q! - quit without saving&lt;br&gt;
:w filename - save as new file&lt;/p&gt;

&lt;h5&gt;
  
  
  Navigation Commands
&lt;/h5&gt;

&lt;p&gt;h - Move left&lt;br&gt;
l - Move right&lt;br&gt;
j - Move down&lt;br&gt;
k - Move up&lt;br&gt;
gg - Go to start of file&lt;br&gt;
G - Go to end of file&lt;br&gt;
0 - Start of line&lt;br&gt;
$ - End of line&lt;/p&gt;

&lt;h5&gt;
  
  
  Editing Commands
&lt;/h5&gt;

&lt;p&gt;x - delete character&lt;br&gt;
dd - delete line&lt;br&gt;
yy - copy line&lt;br&gt;
p - paste below&lt;br&gt;
P - paste above&lt;br&gt;
u - undo&lt;br&gt;
Ctrl+r - redo&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;br&gt;
Linux is a core skill for data engineers because it is used in servers, cloud platforms, and data tools. Basic Linux commands help you move around the system and manage files. &lt;br&gt;
Learning Linux early makes it easier to work with data pipelines, scripts, and production systems.&lt;/p&gt;

</description>
      <category>linux</category>
      <category>dataengineering</category>
      <category>vim</category>
      <category>luxdev</category>
    </item>
    <item>
      <title>A Simple beginners Guide to Git &amp; GitHub</title>
      <dc:creator>Lawrence Murithi</dc:creator>
      <pubDate>Sun, 25 Jan 2026 12:37:27 +0000</pubDate>
      <link>https://forem.com/lawrence_murithi/a-simple-beginners-guide-to-git-github-j29</link>
      <guid>https://forem.com/lawrence_murithi/a-simple-beginners-guide-to-git-github-j29</guid>
      <description>&lt;h2&gt;
  
  
  &lt;strong&gt;What is Git, GitHub and why version control is important&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. What is Git &amp;amp; GitHub&lt;/strong&gt;&lt;br&gt;
Git is a distributed version control system (VCS)/tool that helps you save, track, and manage changes in your code. It keeps a history of your work so you can see what changed, when it changed, and who changed it.&lt;br&gt;
Think of Git like a save system for your code, but much better than “Save As”.&lt;br&gt;
GitHub is an online platform where you store Git projects.&lt;br&gt;
GitHub allows you to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Back up your code&lt;/li&gt;
&lt;li&gt;Share code with others&lt;/li&gt;
&lt;li&gt;Work on the same project from different computers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What is version control?&lt;/strong&gt;&lt;br&gt;
Version control is a system that tracks and manages changes to software code and files over time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why is version control important?&lt;/strong&gt;&lt;br&gt;
Version control helps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Track changes in your work&lt;/li&gt;
&lt;li&gt;Go back to an older version if something breaks&lt;/li&gt;
&lt;li&gt;Work safely without fear of losing files&lt;/li&gt;
&lt;li&gt;Collaborate with other people on the same project&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Set Up the Git &amp;amp; GitHub Environments&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Step 1: Create a GitHub Account&lt;/strong&gt;&lt;br&gt;
Sign up on &lt;a href="https://github.com" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; using you credentials.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Install Git&lt;/strong&gt;&lt;br&gt;
Download &lt;a href="https://git-scm.com" rel="noopener noreferrer"&gt;Git&lt;/a&gt;.&lt;br&gt;
Install the application using the downloaded file.&lt;br&gt;
Open &lt;strong&gt;Git Bash&lt;/strong&gt; (Windows) or &lt;strong&gt;Terminal&lt;/strong&gt; (Mac/Linux).&lt;br&gt;
Configure your identity(Name and Email) to help Git Identify who is making the changes any time the changes are made.&lt;br&gt;
&lt;strong&gt;NB:&lt;/strong&gt; Use the email address used to sign up on GitHub.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git config -global user.name "your name"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git config -global user.email "youremail@example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check to ensure your configuration has been set up.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git config -global --list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3: Connect Git to Your GitHub Account&lt;/strong&gt;&lt;br&gt;
One of the easiest ways to connect your Git to your GitHub Account is using an SSH key which provides(digital identity) a secure password-less way to connect both to avoid the need for inputing a password every time.&lt;br&gt;
To generate the SSH key on GitBash, run the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh-keygen -t ed25519 -C "youremail@example.com"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, you need to generate an agent; a helper program that holds your key in memory.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eval "$(ssh-agent -s)"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add the key generated to the agent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh-add ~/.ssh/id_ed25519
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Lastly, print the public key and use it to connect your GitHub account.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat ~/.ssh/id_ed25519.pub
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Copy the key, navigate to the SSH and GHG keys on settings in you GitHub account and add a new SSH key.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8s16vtnmc45jlvs01ohb.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8s16vtnmc45jlvs01ohb.jpg" alt="SSH key" width="800" height="440"&gt;&lt;/a&gt;&lt;br&gt;
Finally, authenticate your GitHub account in Git by running this command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh -T git@github.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. Set up a Repository in GitHub and Project folder in Git&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create a repository on GitHub&lt;/li&gt;
&lt;li&gt;Create a project folder in Git
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mkdir Myproject
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Initialize Git in your project. Tells Git to start tracking this project.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Add files in your folder. Tells Git which files to track.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git add 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Save changes (commit)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git commit -m "your message"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4. Connect your project to GitHub)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git remote add origin https://github.com/yourusername/repositoryname.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Push code to GitHub (send code from your computer to GitHub)
Pushing refers to uploading local commits to a remote server to make your code accessible to others.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git push -u origin main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Pull code from GitHub (send code from GitHub to your computer).
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/username/repositoryname.git

git pull origin main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;clone&lt;/strong&gt; creates/downloads a new copy on your local computer while &lt;strong&gt;pull&lt;/strong&gt; updates your local files with any changes made.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check the status of your file.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Checking status shows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Modified files&lt;/li&gt;
&lt;li&gt;New files&lt;/li&gt;
&lt;li&gt;Files ready to be committed&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Summary Cheatsheet&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fokt7ogphm09i1x7ywgct.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fokt7ogphm09i1x7ywgct.png" alt="Cheat sheet" width="514" height="598"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;br&gt;
Git, GitHub, and version control help developers and data professionals manage their work in a clear and organized way. While Git allows users to track changes, save progress through commits, and return to earlier versions when needed, GitHub provides an online platform to store projects, share code, and collaborate with others. Together, they reduce the risk of losing work, make teamwork easier, and support professional development by keeping code history clear and well documented.&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>github</category>
      <category>git</category>
      <category>luxdev</category>
    </item>
  </channel>
</rss>
