<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ng'ang'a Njongo</title>
    <description>The latest articles on Forem by Ng'ang'a Njongo (@nganga_njongo).</description>
    <link>https://forem.com/nganga_njongo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3709656%2F282aac17-7fad-4fdc-b875-7d4344712f9e.jpg</url>
      <title>Forem: Ng'ang'a Njongo</title>
      <link>https://forem.com/nganga_njongo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/nganga_njongo"/>
    <language>en</language>
    <item>
      <title>ETL vs ELT: Which One Should You Use and Why?</title>
      <dc:creator>Ng'ang'a Njongo</dc:creator>
      <pubDate>Wed, 15 Apr 2026 12:42:33 +0000</pubDate>
      <link>https://forem.com/nganga_njongo/etl-vs-elt-which-one-should-you-use-and-why-44le</link>
      <guid>https://forem.com/nganga_njongo/etl-vs-elt-which-one-should-you-use-and-why-44le</guid>
      <description>&lt;p&gt;The data landscape has undergone a huge shift over the last few years. As organizations move from on-premise servers to cloud architectures, the methods used to move and process data have evolved. At the heart of this evolution is the debate between two fundamental data integration strategies: &lt;strong&gt;ETL (Extract, Transform, Load)&lt;/strong&gt; and &lt;strong&gt;ELT (Extract, Load, Transform)&lt;/strong&gt;. While they share the same three core components, the order in which these steps occur completely changes the architecture, cost, and performance of a data pipeline. This article provides a technical comparison to help you decide which approach is right for your modern data stack.&lt;/p&gt;

&lt;h1&gt;
  
  
  Understanding ETL: The Traditional Workhorse
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;ETL&lt;/strong&gt;, which stands for &lt;strong&gt;Extract, Transform, and Load&lt;/strong&gt;, is the traditional method of data integration that has dominated the industry since the 1970s. In an ETL architecture, data is extracted from one or more source systems, moved to a separate "staging area" or processing server, transformed into a structured format, and finally loaded into a target data warehouse.&lt;/p&gt;

&lt;h2&gt;
  
  
  The ETL Workflow
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Extract&lt;/strong&gt;: Data is pulled from various sources, such as relational databases (SQL Server, Oracle), CRM systems (Salesforce), or flat files (CSV, XML).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Transform&lt;/strong&gt;: This is the most compute-intensive stage. On a dedicated transformation server, the raw data is cleaned, filtered, deduplicated, and formatted. Complex business logic is applied to ensure the data matches the strict schema of the target warehouse.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Load&lt;/strong&gt;: The "clean" and fully transformed data is then loaded into the data warehouse, ready for BI tools and analysts to query.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
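
&lt;p&gt;The three stages above can be sketched as a minimal, self-contained Python pipeline. Everything here is illustrative: the sample records, the deduplication and masking rules, and the in-memory SQLite "warehouse" are stand-ins, not a production design.&lt;/p&gt;

```python
import sqlite3

# Extract: pull raw records from a source (here, a hard-coded sample
# standing in for a database query or API call).
def extract():
    return [
        {"id": 1, "name": "Alice", "email": "alice@example.com", "amount": "120.50"},
        {"id": 2, "name": "Bob", "email": "bob@example.com", "amount": "80.00"},
        {"id": 2, "name": "Bob", "email": "bob@example.com", "amount": "80.00"},  # duplicate
    ]

# Transform: deduplicate, cast types, and mask PII on the processing
# side, before anything reaches the warehouse.
def transform(rows):
    seen, clean = set(), []
    for row in rows:
        if row["id"] in seen:
            continue
        seen.add(row["id"])
        clean.append((row["id"], "***MASKED***", float(row["amount"])))
    return clean

# Load: write only the transformed, PII-free rows into the warehouse.
def load(rows, conn):
    conn.execute("CREATE TABLE orders (id INTEGER, email TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # 2
```

&lt;p&gt;Note the defining property of ETL: the raw, unmasked records never touch the warehouse, only the output of &lt;code&gt;transform&lt;/code&gt; does.&lt;/p&gt;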

&lt;h2&gt;
  
  
  Why Use ETL?
&lt;/h2&gt;

&lt;p&gt;ETL is highly effective for organizations with strict compliance requirements, such as those in healthcare or finance. Because data is transformed before it reaches the warehouse, sensitive information like Personally Identifiable Information (PII) can be masked or removed entirely during the transformation phase. This ensures that sensitive raw data never enters the storage layer. Furthermore, ETL is ideal for legacy on-premise systems where the target data warehouse lacks the processing power to handle large-scale transformations.&lt;/p&gt;

&lt;h1&gt;
  
  
  Understanding ELT: The Cloud-Native Revolution
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;ELT, or Extract, Load, and Transform&lt;/strong&gt;, is a modern approach that has gained massive popularity with the rise of cloud data warehouses like Snowflake, Google BigQuery, and Amazon Redshift. Unlike ETL, which relies on an external processing server, ELT leverages the massive, horizontally scalable compute power of the data warehouse itself to perform transformations.&lt;/p&gt;

&lt;h2&gt;
  
  
  The ELT Workflow
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Extract&lt;/strong&gt;: Just like ETL, data is pulled from source systems.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Load&lt;/strong&gt;: Instead of going to a staging server, the raw data is loaded directly into the target data warehouse. Modern cloud warehouses can ingest vast amounts of raw data (structured, semi-structured, or unstructured) at incredibly high speeds.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Transform&lt;/strong&gt;: Once the raw data is inside the warehouse, it is transformed using SQL or specialized tools. The raw data is often preserved in "bronze" or "staging" tables, while transformed versions are created in "silver" or "gold" tables for analysis.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
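
&lt;p&gt;As a minimal sketch of the same letters in ELT order, with an in-memory SQLite database standing in for the cloud warehouse (the table names, "bronze"/"gold" layering, and sample events are illustrative; a real stack would target Snowflake, BigQuery, or Redshift):&lt;/p&gt;

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")

# Extract + Load: raw records land in a "bronze" staging table untouched.
warehouse.execute("CREATE TABLE bronze_events (user_id INTEGER, event TEXT, amount REAL)")
raw = [(1, "purchase", 120.5), (1, "view", 0.0), (2, "purchase", 80.0)]
warehouse.executemany("INSERT INTO bronze_events VALUES (?, ?, ?)", raw)

# Transform: run inside the warehouse with SQL, materializing a "gold"
# table. The bronze rows are preserved, so this step can be dropped and
# re-run with new business logic at any time.
warehouse.execute("""
    CREATE TABLE gold_revenue AS
    SELECT user_id, SUM(amount) AS revenue
    FROM bronze_events
    WHERE event = 'purchase'
    GROUP BY user_id
""")

print(warehouse.execute("SELECT user_id, revenue FROM gold_revenue ORDER BY user_id").fetchall())
# [(1, 120.5), (2, 80.0)]
```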

&lt;h2&gt;
  
  
  Why Use ELT?
&lt;/h2&gt;

&lt;p&gt;ELT offers unparalleled flexibility and speed. Because the raw data is stored within the warehouse, data scientists and analysts can re-query and re-transform it whenever business requirements change without needing to re-extract it from the source. It is the backbone of the "Modern Data Stack," enabling faster ingestion and better support for Big Data and real-time analytics.&lt;/p&gt;

&lt;h1&gt;
  
  
  Key Differences Between ETL and ELT
&lt;/h1&gt;

&lt;p&gt;While both methods achieve the same end goal—making data available for analysis—the technical trade-offs are significant. The following table summarizes the core differences between these two approaches:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;ETL (Extract, Transform, Load)&lt;/th&gt;
&lt;th&gt;ELT (Extract, Load, Transform)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Transformation Location&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Separate dedicated processing server&lt;/td&gt;
&lt;td&gt;Target data warehouse (Cloud)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Format Support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Primarily structured data&lt;/td&gt;
&lt;td&gt;Structured, semi-structured, and unstructured&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Flexibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Rigid; requires schema-on-write&lt;/td&gt;
&lt;td&gt;Highly flexible; supports schema-on-read&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Loading Speed&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Slower (waits for transformation)&lt;/td&gt;
&lt;td&gt;Faster (direct ingestion of raw data)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scalability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Limited by the staging server's capacity&lt;/td&gt;
&lt;td&gt;Highly scalable via cloud MPP architecture&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Maintenance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High; complex pipelines and server management&lt;/td&gt;
&lt;td&gt;Lower; automated ingestion and SQL-based logic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Compliance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Superior for masking PII before storage&lt;/td&gt;
&lt;td&gt;Requires careful management within the warehouse&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost Model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High upfront hardware/software costs&lt;/td&gt;
&lt;td&gt;Pay-as-you-go compute and storage&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h1&gt;
  
  
  Real-World Use Cases
&lt;/h1&gt;

&lt;p&gt;Choosing between ETL and ELT often depends on the specific industry, data volume, and regulatory environment. Below are common real-world applications for each.&lt;/p&gt;

&lt;h2&gt;
  
  
  ETL Use Cases
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Healthcare Data Integration&lt;/strong&gt;: Healthcare providers often use ETL to merge patient records from fragmented Electronic Health Record (EHR) systems. Before loading this data into a centralized warehouse for clinical research, ETL pipelines must anonymize patient names and other PII.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Financial Fraud Detection&lt;/strong&gt;: Banks use ETL to process transaction logs from legacy mainframes. By transforming this data in a secure staging area, they can detect suspicious patterns and flag anomalies before the data is archived, ensuring that only verified, high-quality data is used for regulatory reporting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Legacy System Modernization&lt;/strong&gt;: Organizations still running on-premise ERP systems often lack the cloud infrastructure for ELT. ETL allows them to extract data from these older systems, clean it on a mid-tier server, and load it into a structured reporting database without overwhelming their existing hardware.&lt;/p&gt;

&lt;h2&gt;
  
  
  ELT Use Cases
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. E-commerce Customer 360&lt;/strong&gt;: Modern retailers like Shopify or Amazon-based sellers use ELT to ingest massive streams of behavioral data (clicks, views, and cart additions). By loading this raw data into BigQuery or Snowflake, they can use tools like &lt;strong&gt;dbt&lt;/strong&gt; to build "Customer 360" profiles that drive real-time product recommendations and personalized marketing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Log Analysis and IoT Monitoring&lt;/strong&gt;: Tech companies and manufacturers deal with millions of log entries and sensor readings per second. ELT allows them to "dump" these logs into a cloud data lake or warehouse immediately. Analysts can then perform transformations on specific subsets of that data only when a security audit or system failure occurs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Marketing Attribution&lt;/strong&gt;: Marketing teams pull data from dozens of disparate APIs, including Google Ads, Facebook, and LinkedIn. ELT is used to ingest all this data in its raw form first. This allows analysts to experiment with different attribution models (first-click, last-click, or linear) by re-transforming the same raw data multiple times.&lt;/p&gt;
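
&lt;p&gt;The re-transformation point can be sketched in a few lines of Python. The touchpoint journey, revenue figure, and the two model implementations are hypothetical, but they show how one set of raw touches supports multiple attribution models without re-extraction:&lt;/p&gt;

```python
# Raw data: the ordered channel touchpoints for one conversion,
# loaded once and kept as-is.
touches = ["google_ads", "facebook", "linkedin"]
revenue = 90.0

# First-click attribution: all credit to the first touchpoint.
def first_click(touches, revenue):
    return {touches[0]: revenue}

# Linear attribution: credit split evenly across every touchpoint.
def linear(touches, revenue):
    share = revenue / len(touches)
    return {t: share for t in touches}

print(first_click(touches, revenue))  # {'google_ads': 90.0}
print(linear(touches, revenue))       # each channel credited 30.0
```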

&lt;h1&gt;
  
  
  The Tooling Landscape
&lt;/h1&gt;

&lt;p&gt;The tools you choose will largely define your architecture. The industry has split into traditional ETL vendors and modern ELT-focused platforms.&lt;/p&gt;

&lt;h2&gt;
  
  
  1) Traditional ETL Tools
&lt;/h2&gt;

&lt;p&gt;These tools are designed for complex, server-side transformations and often feature "drag-and-drop" visual interfaces.&lt;/p&gt;

&lt;p&gt;• &lt;strong&gt;Informatica PowerCenter&lt;/strong&gt;: The enterprise standard for decades, known for its robustness and complex workflow management.&lt;/p&gt;

&lt;p&gt;• &lt;strong&gt;Talend (Qlik)&lt;/strong&gt;: An open-source-based platform that provides extensive connectors for both on-premise and cloud systems.&lt;/p&gt;

&lt;p&gt;• &lt;strong&gt;Microsoft SSIS&lt;/strong&gt;: A popular choice for organizations already deep within the Microsoft SQL Server ecosystem.&lt;/p&gt;

&lt;p&gt;• &lt;strong&gt;IBM InfoSphere DataStage&lt;/strong&gt;: A high-performance ETL tool designed for large-scale enterprise data integration.&lt;/p&gt;

&lt;h2&gt;
  
  
  2) Modern ELT Tools
&lt;/h2&gt;

&lt;p&gt;These tools focus on high-speed ingestion and "in-warehouse" transformation, often using SQL as the primary language.&lt;/p&gt;

&lt;p&gt;• &lt;strong&gt;Fivetran &amp;amp; Airbyte&lt;/strong&gt;: These are the leaders in "automated ingestion." They focus on the E and L of ELT, moving data from hundreds of sources into a warehouse with minimal configuration.&lt;/p&gt;

&lt;p&gt;• &lt;strong&gt;dbt (data build tool)&lt;/strong&gt;: The industry standard for the &lt;strong&gt;T&lt;/strong&gt; in ELT. It allows data analysts to write transformations in SQL and manage them like software code (version control, testing, and documentation).&lt;/p&gt;

&lt;p&gt;• &lt;strong&gt;Matillion&lt;/strong&gt;: A cloud-native tool that provides a visual interface for building ELT pipelines specifically for Snowflake, Redshift, and BigQuery.&lt;/p&gt;

&lt;p&gt;• &lt;strong&gt;AWS Glue &amp;amp; Azure Data Factory&lt;/strong&gt;: These cloud-native services are hybrid; they can perform traditional ETL using Spark or ELT by orchestrating warehouse-native commands.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion: Which One Should You Use?
&lt;/h1&gt;

&lt;p&gt;The decision between ETL and ELT is no longer a simple binary choice, but a strategic one based on your organization's maturity and needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose ETL if&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;• You operate in a highly regulated industry (Finance, Healthcare) and must mask sensitive data before it reaches your storage layer.&lt;/p&gt;

&lt;p&gt;• You are working with legacy on-premise systems that cannot handle the compute load of modern transformations.&lt;/p&gt;

&lt;p&gt;• Your data volumes are relatively small and predictable, and you require highly structured, "clean" data from day one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose ELT if&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;• You are building a "Modern Data Stack" in the cloud and want to leverage the scalability of Snowflake, BigQuery, or Redshift.&lt;/p&gt;

&lt;p&gt;• You deal with high-volume, high-velocity data (Big Data, IoT, or web logs) that requires fast ingestion.&lt;/p&gt;

&lt;p&gt;• Your team values flexibility and wants to retain raw data for future exploration and re-analysis.&lt;/p&gt;

&lt;p&gt;In 2026, the trend is undeniably toward &lt;strong&gt;ELT&lt;/strong&gt;. The cost of cloud storage has plummeted, while the power of cloud compute has skyrocketed. By moving transformations into the warehouse, organizations can empower their analysts, reduce pipeline maintenance, and build a more agile data-driven culture. However, for those with strict security mandates, the tried-and-true ETL approach remains a vital tool in the data engineer's arsenal.&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>elt</category>
      <category>etl</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Connecting Power BI to SQL Databases: A Beginner's Guide</title>
      <dc:creator>Ng'ang'a Njongo</dc:creator>
      <pubDate>Mon, 23 Mar 2026 15:21:45 +0000</pubDate>
      <link>https://forem.com/nganga_njongo/connecting-power-bi-to-sql-databases-a-beginners-guide-41cd</link>
      <guid>https://forem.com/nganga_njongo/connecting-power-bi-to-sql-databases-a-beginners-guide-41cd</guid>
      <description>&lt;p&gt;&lt;strong&gt;Power BI&lt;/strong&gt; is one of the most powerful tools for data analysis and business intelligence. It allows users to visualize their data through interactive dashboards and reports, making it easier for companies to track performance, identify trends, and make informed decisions.&lt;/p&gt;

&lt;p&gt;While Power BI can import data from simple files like Excel or CSVs, most professional organizations store their data in &lt;strong&gt;SQL databases&lt;/strong&gt;. These databases are essential for managing large volumes of structured data efficiently, ensuring data integrity, and providing a "single source of truth" for the entire company. By connecting Power BI directly to a SQL database, analysts can work with real-time data and build scalable reporting solutions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Connecting to a Local PostgreSQL Database
&lt;/h2&gt;

&lt;p&gt;PostgreSQL is a popular open-source relational database. If you have a local instance of PostgreSQL running on your machine, connecting it to Power BI is a straightforward process.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step-by-Step Connection:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Open Power BI Desktop&lt;/strong&gt;: Start by launching the application on your computer.&lt;br&gt;
&lt;strong&gt;2. Select Get Data&lt;/strong&gt;: On the Home ribbon, click the "Get Data" icon.&lt;br&gt;
&lt;strong&gt;3. Choose PostgreSQL Database&lt;/strong&gt;: In the "Get Data" window, select "PostgreSQL" from the list.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft6n7kjgk9snwivubtrxu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft6n7kjgk9snwivubtrxu.png" alt="Get Data" width="800" height="401"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Enter Server Details&lt;/strong&gt;: Enter the server name (in this case, localhost) and the name of the database you want to connect to.&lt;br&gt;
&lt;strong&gt;5. Authentication&lt;/strong&gt;: In the authentication window, choose the Database tab and enter your PostgreSQL username and password.&lt;br&gt;
&lt;strong&gt;6. Load Tables&lt;/strong&gt;: Once connected, the Navigator window will display all available tables. Select the ones you need and click Load.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxex82dptxdji20t904n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxex82dptxdji20t904n.png" alt="DB Connections" width="800" height="413"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc7327rfpdfjq7fdbbth9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc7327rfpdfjq7fdbbth9.png" alt="Load Data" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Connecting to a Cloud Database: Aiven PostgreSQL
&lt;/h2&gt;

&lt;p&gt;Many companies use cloud-managed databases like Aiven for PostgreSQL to handle their production data. Cloud databases offer high availability and security but require a few extra steps to connect.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step-by-Step Connection:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Log in to Aiven&lt;/strong&gt;: Log in to the Aiven console and select "Create service", as shown below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy9wqr44ygo9n5cyskzf6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy9wqr44ygo9n5cyskzf6.png" alt="Aiven Login" width="800" height="365"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Postgres Configuration&lt;/strong&gt;: Select the PostgreSQL service and, on the configuration page, set the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Service Tier&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Region&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Plan&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Create Service&lt;/strong&gt;: Once the above configuration is done, click "Create Service".&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkx35viy095v9dzmav09m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkx35viy095v9dzmav09m.png" alt="Create Service" width="800" height="396"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Obtaining Connection Details
&lt;/h2&gt;

&lt;p&gt;Once you've created your PostgreSQL service on Aiven, gather the following information from your service overview:&lt;br&gt;
• Host&lt;br&gt;
• Port&lt;br&gt;
• Database Name&lt;br&gt;
• Username &amp;amp; Password&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;See example below&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcogqrfguqpo8q4elzk1z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcogqrfguqpo8q4elzk1z.png" alt="Aiven Connection" width="800" height="373"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Role of SSL Certificates
&lt;/h2&gt;

&lt;p&gt;Cloud connections often require an SSL (Secure Sockets Layer) certificate; in practice, modern services use its successor, TLS. SSL/TLS encrypts the data moving between the database and Power BI, preventing unauthorized parties from intercepting sensitive information.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;To include the certificate in Power BI&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Download the ca.pem file from the Aiven console.&lt;/li&gt;
&lt;li&gt; Open Command Prompt or PowerShell as Administrator. Run the following command to add the certificate to the Root Store:
&lt;code&gt;certutil -addstore -f "Root" &amp;lt;path_to_your_ca.pem_file&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt; In Power BI, when prompted for the connection, ensure the "Encrypt connections" option is checked.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fityfn4vc627ogopndvjt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fityfn4vc627ogopndvjt.png" alt="Aiven SSL" width="800" height="368"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F36sf06eqpync2vrp6vgh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F36sf06eqpync2vrp6vgh.png" alt="Install SSL" width="800" height="513"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With the certificate installed, connect from Power BI as before:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Get Data → PostgreSQL database&lt;/li&gt;
&lt;li&gt;Enter your Aiven connection details (host, port, database name, username, and password):&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkp6s3puldqcjx5fmbla.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkp6s3puldqcjx5fmbla.png" alt="Aiven Host" width="800" height="401"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;NB:&lt;/em&gt;&lt;/strong&gt; Under Advanced options, you can also set the SSL parameters directly in Power Query (M):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;let
    Source = PostgreSQL.Database(
        "your-service.aivencloud.com:12345",
        "defaultdb",
        [
            CreateNavigationProperties = true,
            SSLMode = "Require",
            UseSSL = true
        ]
    )
in
    Source
&lt;/code&gt;&lt;/pre&gt;

&lt;h2&gt;
  
  
  Why SQL Skills are Vital for Power BI Analysts
&lt;/h2&gt;

&lt;p&gt;While Power BI provides a user-friendly interface for connecting to data, having SQL (Structured Query Language) skills is a game-changer for any data analyst.&lt;/p&gt;

&lt;p&gt;• &lt;strong&gt;Data Retrieval&lt;/strong&gt;: SQL allows you to write custom queries to pull only the specific columns and rows you need, reducing the load on Power BI and improving performance.&lt;br&gt;
• &lt;strong&gt;Data Filtering and Aggregation&lt;/strong&gt;: Instead of bringing millions of rows into Power BI, you can use SQL to aggregate data at the database level.&lt;br&gt;
• &lt;strong&gt;Data Cleaning&lt;/strong&gt;: SQL is efficient at handling "messy" data—renaming columns, handling null values, and formatting dates—before the data even reaches your dashboard.&lt;br&gt;
• &lt;strong&gt;Complex Logic&lt;/strong&gt;: Some business logic is easier to write in SQL than in Power BI’s DAX language, especially when involving complex joins or window functions.&lt;/p&gt;
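
&lt;p&gt;The aggregation point can be sketched with a plain GROUP BY query, run here through Python's built-in sqlite3 module so the example is self-contained (the sales table and its columns are illustrative):&lt;/p&gt;

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, amount REAL)")
db.executemany("INSERT INTO sales VALUES (?, ?)",
               [("East", 100.0), ("East", 50.0), ("West", 75.0)])

# Aggregate at the database level: only one summary row per region
# reaches the BI tool, instead of every underlying transaction.
summary = db.execute("""
    SELECT region, SUM(amount) AS total_sales
    FROM sales
    GROUP BY region
    ORDER BY region
""").fetchall()
print(summary)  # [('East', 150.0), ('West', 75.0)]
```

&lt;p&gt;The same query pasted into Power BI's "Advanced options" SQL statement box would bring back the two summary rows rather than the full table.&lt;/p&gt;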

&lt;p&gt;By mastering both SQL and Power BI, you become a versatile analyst capable of handling the entire data pipeline—from the raw database to the final executive dashboard.&lt;/p&gt;

</description>
      <category>database</category>
      <category>microsoft</category>
      <category>postgres</category>
      <category>sql</category>
    </item>
    <item>
      <title>A Beginner's Guide to SQL Joins and Window Functions</title>
      <dc:creator>Ng'ang'a Njongo</dc:creator>
      <pubDate>Sat, 07 Mar 2026 09:37:51 +0000</pubDate>
      <link>https://forem.com/nganga_njongo/a-beginners-guide-to-sql-joins-and-window-functions-45db</link>
      <guid>https://forem.com/nganga_njongo/a-beginners-guide-to-sql-joins-and-window-functions-45db</guid>
      <description>&lt;p&gt;Structured Query Language (SQL) is the backbone of data management, enabling us to interact with and extract meaningful insights from relational databases. Two powerful concepts within SQL that are essential for any data professional are Joins and Window Functions. This article will demystify these concepts, providing clear explanations and practical examples based on a hypothetical e-commerce database.&lt;/p&gt;

&lt;p&gt;Our database consists of four tables:&lt;/p&gt;

&lt;p&gt;• &lt;strong&gt;Customers:&lt;/strong&gt; customer_id, first_name, last_name, email, phone_number, registration_date, membership_status&lt;br&gt;
• &lt;strong&gt;Inventory:&lt;/strong&gt; product_id, stock_quantity&lt;br&gt;
• &lt;strong&gt;Products:&lt;/strong&gt; product_id, product_name, category, price, supplier, stock_quantity&lt;br&gt;
• &lt;strong&gt;Sales:&lt;/strong&gt; sale_id, customer_id, product_id, quantity_sold, sale_date, total_amount&lt;/p&gt;

&lt;h1&gt;
  
  
  Understanding SQL Joins: Connecting Related Data
&lt;/h1&gt;

&lt;p&gt;In relational databases, data is often spread across multiple tables to ensure efficiency and reduce redundancy. Joins are SQL clauses that combine rows from two or more tables based on a related column between them. They allow us to retrieve a complete picture by linking disparate pieces of information.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. INNER JOIN
&lt;/h2&gt;

&lt;p&gt;An &lt;strong&gt;INNER JOIN&lt;/strong&gt; returns only the rows that have matching values in both tables. It's the most common type of join and is used when you want to see data where a relationship exists in both datasets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Finding customers who have purchased products with a price greater than 1000.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SELECT&lt;br&gt;
    c.first_name || ' ' || c.last_name AS Cust_Above_1000&lt;br&gt;
FROM&lt;br&gt;
    assignment.Customers c&lt;br&gt;
INNER JOIN&lt;br&gt;
    assignment.Sales s ON c.customer_id = s.customer_id&lt;br&gt;
INNER JOIN&lt;br&gt;
    assignment.Products p ON s.product_id = p.product_id&lt;br&gt;
WHERE&lt;br&gt;
    p.price &amp;gt; 1000;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; This query combines Customers, Sales, and Products tables. It links Customers to Sales using customer_id and Sales to Products using product_id. The WHERE clause then filters these combined results to show only customers who bought products priced over 1000. Only customers who have made a sale are included, and only products that have been sold are considered.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Joining Sales and Products to calculate total sales for each product.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SELECT&lt;br&gt;
    p.product_name,&lt;br&gt;
    SUM(s.quantity_sold) AS product_sales&lt;br&gt;
FROM&lt;br&gt;
    assignment.Sales s&lt;br&gt;
INNER JOIN&lt;br&gt;
    assignment.Products p ON s.product_id = p.product_id&lt;br&gt;
GROUP BY&lt;br&gt;
    p.product_name;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; Here, we join Sales and Products to get the product names associated with each sale. We then use SUM(s.quantity_sold) and GROUP BY p.product_name to calculate the total quantity sold for each product. This query effectively shows how many units of each product have been sold.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. LEFT JOIN (or LEFT OUTER JOIN)
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;LEFT JOIN&lt;/strong&gt; returns all rows from the left table, and the matching rows from the right table. If there's no match in the right table, NULL values are returned for columns from the right table. This is useful when you want to include all records from one table, even if they don't have a corresponding record in another.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; List all customers and any sales they have made.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SELECT&lt;br&gt;
    c.first_name, c.last_name,&lt;br&gt;
    s.sale_id, s.total_amount&lt;br&gt;
FROM&lt;br&gt;
    assignment.Customers c&lt;br&gt;
LEFT JOIN&lt;br&gt;
    assignment.Sales s ON c.customer_id = s.customer_id;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; This query would list every customer from the Customers table. If a customer has made sales, their sales details will appear alongside their name. If a customer has not made any sales, their name will still appear, but the sale_id and total_amount columns will show NULL.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. RIGHT JOIN (or RIGHT OUTER JOIN)
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;RIGHT JOIN&lt;/strong&gt; is the inverse of a &lt;strong&gt;LEFT JOIN&lt;/strong&gt;. It returns all rows from the right table, and the matching rows from the left table. If there's no match in the left table, NULL values are returned for columns from the left table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; List all products and any sales made for them.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SELECT&lt;br&gt;
    p.product_name,&lt;br&gt;
    s.sale_id, s.quantity_sold&lt;br&gt;
FROM&lt;br&gt;
    assignment.Sales s&lt;br&gt;
RIGHT JOIN&lt;br&gt;
    assignment.Products p ON s.product_id = p.product_id;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; This query would list every product from the Products table. If a product has been sold, its sales details will appear. If a product has never been sold, its name will still appear, but sale_id and quantity_sold will be NULL.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. FULL JOIN (or FULL OUTER JOIN)
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;FULL JOIN&lt;/strong&gt; returns all rows when there is a match in either the left or the right table. It essentially combines the results of both &lt;strong&gt;LEFT JOIN&lt;/strong&gt; and &lt;strong&gt;RIGHT JOIN&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Show all customers and all products, linking them by sales where applicable.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SELECT&lt;br&gt;
    c.first_name, c.last_name,&lt;br&gt;
    p.product_name, s.sale_id&lt;br&gt;
FROM&lt;br&gt;
    assignment.Customers c&lt;br&gt;
FULL JOIN&lt;br&gt;
    assignment.Sales s ON c.customer_id = s.customer_id&lt;br&gt;
FULL JOIN&lt;br&gt;
    assignment.Products p ON s.product_id = p.product_id;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; This query would show all customers, all products, and any sales that connect them. If a customer has no sales, their details will appear with NULL for sales and product information. If a product has no sales, its details will appear with NULL for customer and sales information. If both exist and are linked by a sale, all information will be present.&lt;/p&gt;
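&lt;p&gt;Where FULL JOIN is unavailable (again, SQLite only added it in 3.39), the same result can be emulated by combining a LEFT JOIN with the unmatched rows of the reversed join. A minimal sketch with made-up rows, shown here for a single customers-to-sales join:&lt;/p&gt;

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Customers (customer_id INTEGER, first_name TEXT);
CREATE TABLE Sales (sale_id INTEGER, customer_id INTEGER);
INSERT INTO Customers VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO Sales VALUES (100, 1), (101, 99);  -- sale 101 has no matching customer
""")

# FULL JOIN emulated as: LEFT JOIN plus the right-side rows with no left match.
rows = con.execute("""
SELECT c.first_name, s.sale_id
FROM Customers c LEFT JOIN Sales s ON c.customer_id = s.customer_id
UNION ALL
SELECT c.first_name, s.sale_id
FROM Sales s LEFT JOIN Customers c ON c.customer_id = s.customer_id
WHERE c.customer_id IS NULL
""").fetchall()

# Bob has no sale, and sale 101 has no customer, yet both rows survive.
print(sorted(rows, key=str))
```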

&lt;h2&gt;
  
  
  5. SELF JOIN
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;SELF JOIN&lt;/strong&gt; is a regular join, but the table is joined with itself. This is useful for comparing rows within the same table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Find all pairs of customers who have the same membership status.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SELECT&lt;br&gt;
    c1.first_name AS customer1_first_name,&lt;br&gt;
    c1.last_name AS customer1_last_name,&lt;br&gt;
    c2.first_name AS customer2_first_name,&lt;br&gt;
    c2.last_name AS customer2_last_name,&lt;br&gt;
    c1.membership_status&lt;br&gt;
FROM&lt;br&gt;
    assignment.Customers c1&lt;br&gt;
JOIN&lt;br&gt;
    assignment.Customers c2 ON c1.membership_status = c2.membership_status&lt;br&gt;
WHERE&lt;br&gt;
    c1.customer_id &amp;gt; c2.customer_id;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; We join the Customers table to itself, aliasing it as c1 and c2. The join condition c1.membership_status = c2.membership_status finds customers with the same membership status. The &lt;strong&gt;WHERE&lt;/strong&gt; c1.customer_id &amp;gt; c2.customer_id clause is crucial to avoid duplicate pairs (e.g., (Alice, Bob) and (Bob, Alice)) and to prevent a customer from being paired with themselves.&lt;/p&gt;
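&lt;p&gt;The deduplication effect of the customer_id comparison is easy to verify with a tiny invented dataset, again using Python's sqlite3:&lt;/p&gt;

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Customers (customer_id INTEGER, first_name TEXT, membership_status TEXT);
INSERT INTO Customers VALUES (1, 'Alice', 'Gold'), (2, 'Bob', 'Gold'), (3, 'Carol', 'Silver');
""")

# Join the table to itself; the customer_id comparison keeps each pair exactly once
# and stops a customer from being paired with themselves.
rows = con.execute("""
SELECT c1.first_name, c2.first_name, c1.membership_status
FROM Customers c1
JOIN Customers c2 ON c1.membership_status = c2.membership_status
WHERE c1.customer_id > c2.customer_id
""").fetchall()

# Only the (Bob, Alice) Gold pair survives; Carol has no Silver partner.
print(rows)  # [('Bob', 'Alice', 'Gold')]
```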

&lt;h1&gt;
  
  
  Exploring SQL Window Functions: Advanced Analytics
&lt;/h1&gt;

&lt;p&gt;Window Functions perform a calculation across a set of table rows that are somehow related to the current row. Unlike aggregate functions &lt;strong&gt;(SUM, AVG, COUNT)&lt;/strong&gt; which collapse rows into a single summary row, window functions return a value for each row, making them incredibly powerful for analytical tasks like ranking, moving averages, and cumulative sums.&lt;/p&gt;

&lt;p&gt;The key to understanding window functions is the &lt;strong&gt;OVER()&lt;/strong&gt; clause, which defines the window or set of rows on which the function operates. The &lt;strong&gt;OVER()&lt;/strong&gt; clause can include:&lt;/p&gt;

&lt;p&gt;• &lt;strong&gt;PARTITION BY:&lt;/strong&gt; Divides the rows into groups or partitions. The window function is applied independently to each partition.&lt;br&gt;
• &lt;strong&gt;ORDER BY:&lt;/strong&gt; Orders the rows within each partition. This is crucial for functions that depend on the order of rows.&lt;/p&gt;

&lt;p&gt;Common Window Functions and Their Uses:&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Ranking Functions (ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE())
&lt;/h2&gt;

&lt;p&gt;These functions assign a rank to each row within its partition based on the specified ordering. They are invaluable for identifying top performers, most recent entries, or other ordered subsets of data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Rank products by total sales within each category.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SELECT&lt;br&gt;
    p.category,&lt;br&gt;
    p.product_name,&lt;br&gt;
    SUM(s.total_amount) AS total_sales,&lt;br&gt;
    RANK() OVER (PARTITION BY p.category ORDER BY SUM(s.total_amount) DESC) AS sales_rank&lt;br&gt;
FROM&lt;br&gt;
    assignment.Sales s&lt;br&gt;
INNER JOIN&lt;br&gt;
    assignment.Products p ON s.product_id = p.product_id&lt;br&gt;
GROUP BY&lt;br&gt;
    p.category, p.product_name;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; This query first groups sales by product and category to get total_sales. Then, &lt;strong&gt;RANK() OVER (PARTITION BY p.category ORDER BY SUM(s.total_amount) DESC)&lt;/strong&gt; assigns a rank to each product within its category based on its total_sales in descending order. Products with the same total sales within a category will receive the same rank.&lt;/p&gt;
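&lt;p&gt;Here is the same partitioned ranking run end to end on a few invented rows (Python's bundled SQLite supports window functions from version 3.25 onward):&lt;/p&gt;

```python
import sqlite3  # window functions need SQLite 3.25+, bundled with modern Python

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Products (product_id INTEGER, product_name TEXT, category TEXT);
CREATE TABLE Sales (sale_id INTEGER, product_id INTEGER, total_amount REAL);
INSERT INTO Products VALUES (1, 'Maize', 'Grain'), (2, 'Wheat', 'Grain'), (3, 'Tea', 'Beverage');
INSERT INTO Sales VALUES (100, 1, 500.0), (101, 2, 800.0), (102, 3, 300.0);
""")

# RANK() restarts at 1 inside each category partition.
rows = con.execute("""
SELECT p.category, p.product_name,
       SUM(s.total_amount) AS total_sales,
       RANK() OVER (PARTITION BY p.category ORDER BY SUM(s.total_amount) DESC) AS sales_rank
FROM Sales s
JOIN Products p ON s.product_id = p.product_id
GROUP BY p.category, p.product_name
ORDER BY p.category, sales_rank
""").fetchall()

print(rows)
# [('Beverage', 'Tea', 300.0, 1), ('Grain', 'Wheat', 800.0, 1), ('Grain', 'Maize', 500.0, 2)]
```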

&lt;h2&gt;
  
  
  2. Aggregate Window Functions (SUM(), AVG(), COUNT(), MIN(), MAX())
&lt;/h2&gt;

&lt;p&gt;These are standard aggregate functions used as window functions. When used with &lt;strong&gt;OVER()&lt;/strong&gt;, they perform their aggregation over the defined window, but instead of collapsing rows, they return the aggregate value for each row.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Calculate the running total of sales for each customer over time.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SELECT&lt;br&gt;
    c.first_name, c.last_name,&lt;br&gt;
    s.sale_date,&lt;br&gt;
    s.total_amount,&lt;br&gt;
    SUM(s.total_amount) OVER (PARTITION BY c.customer_id ORDER BY s.sale_date) AS running_total_sales&lt;br&gt;
FROM&lt;br&gt;
    assignment.Sales s&lt;br&gt;
INNER JOIN&lt;br&gt;
    assignment.Customers c ON s.customer_id = c.customer_id&lt;br&gt;
ORDER BY&lt;br&gt;
    c.customer_id, s.sale_date;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; This query calculates a running_total_sales for each customer. &lt;strong&gt;PARTITION BY&lt;/strong&gt; c.customer_id ensures the sum restarts for each new customer, and &lt;strong&gt;ORDER BY&lt;/strong&gt; s.sale_date makes it a cumulative sum based on the sale date. Each row will show the total amount spent by that customer up to that specific sale date.&lt;/p&gt;
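&lt;p&gt;A compact runnable sketch of the running-total pattern, with invented sales rows (the table is simplified to the columns the window function actually needs):&lt;/p&gt;

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Sales (customer_id INTEGER, sale_date TEXT, total_amount REAL);
INSERT INTO Sales VALUES (1, '2026-01-01', 100.0), (1, '2026-01-05', 50.0), (2, '2026-01-02', 70.0);
""")

# With ORDER BY inside OVER(), SUM() becomes cumulative; PARTITION BY resets it per customer.
rows = con.execute("""
SELECT customer_id, sale_date, total_amount,
       SUM(total_amount) OVER (PARTITION BY customer_id ORDER BY sale_date) AS running_total
FROM Sales
ORDER BY customer_id, sale_date
""").fetchall()

print(rows)
# [(1, '2026-01-01', 100.0, 100.0), (1, '2026-01-05', 50.0, 150.0), (2, '2026-01-02', 70.0, 70.0)]
```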

&lt;h2&gt;
  
  
  3. Lag/Lead Functions (LAG(), LEAD())
&lt;/h2&gt;

&lt;p&gt;These functions allow you to access data from a previous &lt;strong&gt;(LAG())&lt;/strong&gt; or subsequent &lt;strong&gt;(LEAD())&lt;/strong&gt; row within the same result set without using a self-join. This is particularly useful for comparing values across rows, such as calculating the difference between consecutive sales.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Find the previous sale amount for each customer.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SELECT&lt;br&gt;
    c.first_name, c.last_name,&lt;br&gt;
    s.sale_date,&lt;br&gt;
    s.total_amount,&lt;br&gt;
    LAG(s.total_amount, 1, 0) OVER (PARTITION BY c.customer_id ORDER BY s.sale_date) AS previous_sale_amount&lt;br&gt;
FROM&lt;br&gt;
    assignment.Sales s&lt;br&gt;
INNER JOIN&lt;br&gt;
    assignment.Customers c ON s.customer_id = c.customer_id&lt;br&gt;
ORDER BY&lt;br&gt;
    c.customer_id, s.sale_date;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; &lt;strong&gt;LAG(s.total_amount, 1, 0)&lt;/strong&gt; retrieves the total_amount from the previous row within each customer's partition, ordered by sale_date. If there is no previous row (e.g., the first sale), it defaults to 0 (the third argument).&lt;/p&gt;
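&lt;p&gt;The default-value behaviour of LAG() is worth seeing in action. A minimal sketch with invented rows:&lt;/p&gt;

```python
import sqlite3  # LAG()/LEAD() need SQLite 3.25+, bundled with modern Python

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Sales (customer_id INTEGER, sale_date TEXT, total_amount REAL);
INSERT INTO Sales VALUES (1, '2026-01-01', 100.0), (1, '2026-01-05', 50.0), (2, '2026-01-02', 70.0);
""")

rows = con.execute("""
SELECT customer_id, sale_date, total_amount,
       LAG(total_amount, 1, 0) OVER (PARTITION BY customer_id ORDER BY sale_date) AS previous_sale
FROM Sales
ORDER BY customer_id, sale_date
""").fetchall()

# The first sale in each customer's partition has no previous row, so the
# third argument (0) is used as the default.
print(rows)
```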

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;SQL Joins and Window Functions are indispensable tools for anyone working with relational databases. Joins allow you to combine data from multiple tables, creating a comprehensive view of your information. Window functions, on the other hand, provide a powerful way to perform complex analytical calculations over related sets of rows without aggregating them away. Mastering these concepts will significantly enhance your ability to extract, analyze, and report on data effectively, transforming raw data into actionable insights.&lt;/p&gt;

</description>
      <category>database</category>
      <category>sql</category>
      <category>dataengineering</category>
      <category>datascience</category>
    </item>
    <item>
      <title>How Analysts Translate Messy Data, DAX, and Dashboards into Action Using Power BI</title>
      <dc:creator>Ng'ang'a Njongo</dc:creator>
      <pubDate>Fri, 13 Feb 2026 17:21:37 +0000</pubDate>
      <link>https://forem.com/nganga_njongo/how-analysts-translate-messy-data-dax-and-dashboards-into-action-using-power-bi-2k7n</link>
      <guid>https://forem.com/nganga_njongo/how-analysts-translate-messy-data-dax-and-dashboards-into-action-using-power-bi-2k7n</guid>
      <description>&lt;p&gt;In this article we'll use a real-world example of data from farms in Kenya to show you how it's done. You'll learn how to take a messy file, clean it up, use some simple but powerful formulas, and create insightful charts—all using Power BI.&lt;/p&gt;

&lt;h1&gt;
  
  
  1. Cleaning Up Your Data
&lt;/h1&gt;

&lt;p&gt;The first and most important step in Power BI is to clean and prepare your data. Power BI has a powerful tool for this called the Power Query Editor.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Your Data into Power BI
&lt;/h2&gt;

&lt;p&gt;We're using a simple CSV file with data about crops in Kenya. To pull data into Power BI, we simply use the "Get data" feature and select our CSV file. Once it's loaded, Power Query shows you a preview of your data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2iuh2jo9z6igsrg8z1j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2iuh2jo9z6igsrg8z1j.png" alt="Get Data" width="800" height="408"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Cleaning the Dataset
&lt;/h2&gt;

&lt;p&gt;Our Kenya Crops dataset has a few common problems that we need to fix:&lt;/p&gt;

&lt;p&gt;• "Error" messages: Some cells just say "Error."  We replace these cells with a blank or label it as "Unknown."&lt;br&gt;
• Empty cells: Some cells in our data are just blank. We fill them with a zero, a label like "Unknown".&lt;br&gt;
• Numbers that are text: We need to make sure our numbers are actually numbers.&lt;br&gt;
• Missing information: We notice that some rows are missing the final profit number, even though they have the revenue and cost. &lt;/p&gt;

&lt;p&gt;By taking the time to clean the data, we make sure our final analysis is accurate and trustworthy. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4wj3xmsjw42qazam85tr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4wj3xmsjw42qazam85tr.png" alt="Cleaned_Data" width="800" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  2. Using DAX (Data Analysis Expressions)
&lt;/h1&gt;

&lt;p&gt;DAX (Data Analysis Expressions) is Power BI's formula language for creating custom measures, calculated columns, and tables for advanced data analysis, manipulation, and modeling. It enables dynamic filtering, complex calculations such as time intelligence (e.g., year-over-year comparisons), and row-level security, turning raw data into actionable insights. &lt;/p&gt;

&lt;p&gt;Let's look at some common DAX functions and how they help us understand our Kenya Crops data better.&lt;/p&gt;

&lt;h2&gt;
  
  
  SUM and AVERAGE
&lt;/h2&gt;

&lt;p&gt;• &lt;strong&gt;SUM()&lt;/strong&gt;: This adds up all the numbers in a chosen column. If you want to know the total revenue from all crops, you'd use SUM() on the 'Revenue (KES)' column.&lt;/p&gt;

&lt;p&gt;◦ Example: &lt;code&gt;Total Revenue = SUM('Kenya Crops'[Revenue (KES)])&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk4wc7eej0wyiztxy07qp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk4wc7eej0wyiztxy07qp.png" alt="SUM" width="800" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;• &lt;strong&gt;AVERAGE()&lt;/strong&gt;: This calculates the average of all numbers in a chosen column. To find the average amount of crop harvested (yield), you'd use AVERAGE() on the 'Yield (Kg)' column.&lt;/p&gt;

&lt;p&gt;◦ Example: &lt;code&gt;Average Yield = AVERAGE('Kenya Crops'[Yield (Kg)])&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjd8idz3b9htlkdau40sr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjd8idz3b9htlkdau40sr.png" alt="AVG" width="800" height="387"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  SUMX and AVERAGEX
&lt;/h2&gt;

&lt;p&gt;Sometimes, you need to do a calculation for each individual row before adding or averaging them up. This is where SUMX() and AVERAGEX() are incredibly powerful. &lt;/p&gt;

&lt;p&gt;• &lt;strong&gt;SUMX(Table, Expression)&lt;/strong&gt;: This function goes through each row of a specified Table, performs a calculation (Expression) for that row, and then adds up all those individual results.&lt;/p&gt;

&lt;p&gt;◦ Example: &lt;code&gt;Total Calculated Profit = SUMX('Kenya Crops', 'Kenya Crops'[Revenue (KES)] - 'Kenya Crops'[Cost of Production (KES)])&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4inbwscyoscplawoi1hp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4inbwscyoscplawoi1hp.png" alt="SUMX" width="800" height="380"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;• &lt;strong&gt;AVERAGEX(Table, Expression)&lt;/strong&gt;: Similar to SUMX(), this goes through each row, performs a calculation, and then finds the average of those results.&lt;/p&gt;

&lt;p&gt;◦ Example: &lt;code&gt;Average Profit per Acre = AVERAGEX('Kenya Crops', ('Kenya Crops'[Revenue (KES)] - 'Kenya Crops'[Cost of Production (KES)]) / 'Kenya Crops'[Planted Area (Acres)])&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnkymubmhkhcdm690mdc2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnkymubmhkhcdm690mdc2.png" alt="AVERAGEX" width="800" height="383"&gt;&lt;/a&gt;&lt;/p&gt;
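&lt;p&gt;DAX only runs inside Power BI, but the row-by-row evaluation that SUMX and AVERAGEX perform can be sketched in plain Python as a mental model. The dictionary rows below are invented stand-ins for the 'Kenya Crops' table:&lt;/p&gt;

```python
# Tiny made-up stand-in for the 'Kenya Crops' table.
kenya_crops = [
    {"revenue": 120000, "cost": 80000, "acres": 2.0},
    {"revenue": 90000, "cost": 50000, "acres": 1.0},
]

# SUMX: evaluate the expression for each row, then add the per-row results.
total_profit = sum(row["revenue"] - row["cost"] for row in kenya_crops)

# AVERAGEX: evaluate the expression for each row, then average the per-row results.
profit_per_acre = [(row["revenue"] - row["cost"]) / row["acres"] for row in kenya_crops]
avg_profit_per_acre = sum(profit_per_acre) / len(profit_per_acre)

print(total_profit)         # 80000
print(avg_profit_per_acre)  # 30000.0
```

&lt;p&gt;Note how AVERAGEX averages the per-row ratios rather than dividing the totals, which is exactly why the iterator functions exist.&lt;/p&gt;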

&lt;h2&gt;
  
  
  CALCULATE
&lt;/h2&gt;

&lt;p&gt;CALCULATE() helps you focus your calculations on specific parts of your data by modifying the filter context of an expression.&lt;/p&gt;

&lt;p&gt;◦ Example: What was the total revenue only from 'Potatoes'?&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Total Revenue Potatoes = CALCULATE(SUM('Kenya Crops'[Revenue (KES)]), 'Kenya Crops'[Crop Type] = "Potatoes")&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbq8q5f481bqzloiazvhl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbq8q5f481bqzloiazvhl.png" alt="CALCULATE" width="800" height="375"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;◦ Here, CALCULATE tells Power BI to look only at rows where the 'Crop Type' is 'Potatoes', and then SUM the 'Revenue'.&lt;/p&gt;
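&lt;p&gt;As a rough mental model (not runnable DAX), CALCULATE's "filter first, then aggregate" behaviour looks like this in plain Python, with invented crop rows:&lt;/p&gt;

```python
# Invented stand-in rows for the 'Kenya Crops' table.
kenya_crops = [
    {"crop_type": "Potatoes", "revenue": 120000},
    {"crop_type": "Maize", "revenue": 90000},
    {"crop_type": "Potatoes", "revenue": 60000},
]

# CALCULATE(SUM(...), filter): narrow the rows to the filter first, then aggregate.
total_revenue_potatoes = sum(
    row["revenue"] for row in kenya_crops if row["crop_type"] == "Potatoes"
)

print(total_revenue_potatoes)  # 180000
```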

&lt;h2&gt;
  
  
  Joining Text Together: Concatenation with &amp;amp;
&lt;/h2&gt;

&lt;p&gt;Sometimes you want to combine text from different columns. The ampersand (&amp;amp;) symbol lets you do this easily.&lt;/p&gt;

&lt;p&gt;• Example: To create a clear label like "Potatoes - Organic" by combining the 'Crop Type' and 'Crop Variety' columns:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Crop Identifier = 'Kenya Crops'[Crop Type] &amp;amp; " - " &amp;amp; 'Kenya Crops'[Crop Variety]&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fijy1e9kif5svbykfuy9k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fijy1e9kif5svbykfuy9k.png" alt="Concatenation" width="800" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;• This makes our data easier to read and understand at a glance.&lt;/p&gt;

&lt;h1&gt;
  
  
  3. Making Dashboards That Tell a Story
&lt;/h1&gt;

&lt;p&gt;After cleaning and calculating, we use Power BI's visualizations to turn the numbers into engaging charts and graphs that anyone can understand. These visuals are what help people make smart decisions quickly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1lknpxocusa6m1jw767t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1lknpxocusa6m1jw767t.png" alt="Full_Report" width="800" height="452"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Cards
&lt;/h2&gt;

&lt;p&gt;In Power BI, a card is a type of visual specifically designed to display a single, important data point or a small set of related summary numbers. Its purpose is to provide an immediate, at-a-glance summary of performance, cutting through the complexity of larger charts and tables.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bar and Column Charts
&lt;/h2&gt;

&lt;p&gt;Bar and column charts are fantastic for comparing different things. Column charts usually compare things over time or across different groups, while bar charts are great when you have long names for your categories.&lt;/p&gt;

&lt;h2&gt;
  
  
  Line Charts
&lt;/h2&gt;

&lt;p&gt;Line charts are perfect for showing how something changes over a period, like days, months, or years. They connect the dots to reveal patterns and trends.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;From a jumbled spreadsheet to clear, actionable insights—that's the magic a Power BI analyst performs. By carefully cleaning data, using the powerful DAX language to create smart calculations, and then building engaging visuals, analysts transform raw numbers into compelling stories. These stories help businesses understand what's happening, why it's happening, and what they should do next. Our Kenya Crops example shows how these technical skills aren't just about numbers; they're about making a real difference in the world, helping farmers and businesses thrive.&lt;/p&gt;

</description>
      <category>analytics</category>
      <category>microsoft</category>
      <category>datascience</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Understanding Schemas and Data Modelling in Power BI</title>
      <dc:creator>Ng'ang'a Njongo</dc:creator>
      <pubDate>Tue, 03 Feb 2026 07:55:59 +0000</pubDate>
      <link>https://forem.com/nganga_njongo/understanding-schemas-and-data-modelling-in-power-bi-3odb</link>
      <guid>https://forem.com/nganga_njongo/understanding-schemas-and-data-modelling-in-power-bi-3odb</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;The core of any business intelligence solution lies in its data model. The same applies to Power BI. A quality data model allows you to build solid and powerful solutions that don't break. The speed, reliability and power of a solution all stem from a great data model. Let's have a look at some concepts that Power BI data modelers interact with to build models optimized for performance and usability.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Dimension Tables
&lt;/h3&gt;

&lt;p&gt;Dimension tables describe business entities—the things you model. Entities can include products, people, places, and concepts including time itself. The most consistent table you'll find in a star schema is a date dimension table. A dimension table contains a key column (or columns) that acts as a unique identifier, and other columns. Other columns support filtering and grouping your data.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Fact Tables
&lt;/h3&gt;

&lt;p&gt;A fact table consists of the measurements, metrics or facts of a business process. It is located at the center of a star schema or a snowflake schema surrounded by dimension tables. A fact table typically has two types of columns: those that contain facts and those that are a foreign key to dimension tables.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Relationships
&lt;/h3&gt;

&lt;p&gt;These links between tables allow Power BI to group, filter, and aggregate data correctly. Example: the Sales (fact) table connects to the Products (dimension) table using Product ID.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Star Schema
&lt;/h3&gt;

&lt;p&gt;The star schema or star model is the simplest style of data mart schema and is the approach most widely used to develop data warehouses and dimensional data marts. The star schema consists of one or more fact tables referencing any number of dimension tables.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgviuvd6xl2a6m7rpty5n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgviuvd6xl2a6m7rpty5n.png" alt="Star_Schema" width="800" height="542"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Above is an example of a star schema that has a Sales table (fact table) referencing other dimension tables such as Employee, Date, Product etc.&lt;/p&gt;
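&lt;p&gt;The mechanics behind such a schema can be sketched outside Power BI too: the fact table holds measures plus foreign keys, and a join to a dimension supplies the labels used for grouping. A minimal sketch in Python's sqlite3 with invented table names and rows:&lt;/p&gt;

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE DimProduct (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE FactSales (sale_id INTEGER, product_key INTEGER, amount REAL);
INSERT INTO DimProduct VALUES (1, 'Maize'), (2, 'Tea');
INSERT INTO FactSales VALUES (100, 1, 500.0), (101, 2, 300.0), (102, 1, 200.0);
""")

# The fact table carries the measure (amount) and a foreign key; the dimension
# supplies the human-readable label to group by, much like a Power BI relationship.
rows = con.execute("""
SELECT d.product_name, SUM(f.amount)
FROM FactSales f JOIN DimProduct d ON f.product_key = d.product_key
GROUP BY d.product_name
ORDER BY d.product_name
""").fetchall()

print(rows)  # [('Maize', 700.0), ('Tea', 300.0)]
```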

&lt;h3&gt;
  
  
  5. Snowflake Schema
&lt;/h3&gt;

&lt;p&gt;A snowflake schema or snowflake model is a logical arrangement of tables in a multidimensional database such that the entity relationship diagram resembles a snowflake shape. Centralized fact tables connect to multiple dimensions, and those dimensions are themselves normalized into related sub-dimension tables. See illustration below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1pqiiflgpcabj4yke4t4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1pqiiflgpcabj4yke4t4.png" alt="Snowflake_Schema" width="522" height="315"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;h2&gt;
  
  
  Key Benefits of a Good Data Model
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Faster Performance – Organized data reduces Power BI’s workload, making reports load faster. Example: Removing duplicate data speeds up calculations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Easier Reporting – A clear structure simplifies visualization and calculations. Example: Using fact and dimension tables makes chart creation intuitive.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Accurate Insights – Proper relationships prevent errors in calculations.  Example: Incorrect joins can cause double counting of sales.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Handles Large Data Efficiently – Optimized models process millions of rows smoothly.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;A well-structured Power BI data model ensures better performance, accurate insights, and efficient reporting. By following best practices like Star Schema, reducing complexity, and using proper relationships, you can unlock the full potential of Power BI and make data-driven decisions with confidence.&lt;/p&gt;

</description>
      <category>analytics</category>
      <category>microsoft</category>
      <category>dataengineering</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Introduction to Linux for Data Engineers, Including Practical Use of Vi and Nano with Examples</title>
      <dc:creator>Ng'ang'a Njongo</dc:creator>
      <pubDate>Sun, 25 Jan 2026 13:17:21 +0000</pubDate>
      <link>https://forem.com/nganga_njongo/introduction-to-linux-for-data-engineers-including-practical-use-of-vi-and-nano-with-examples-4bf</link>
      <guid>https://forem.com/nganga_njongo/introduction-to-linux-for-data-engineers-including-practical-use-of-vi-and-nano-with-examples-4bf</guid>
      <description>&lt;h1&gt;
  
  
  Why Linux?
&lt;/h1&gt;

&lt;p&gt;Shifting from Windows to Linux for the first time can be daunting. There is no graphical interface to maneuver around, no icons or folders to click; instead, there's a black screen waiting for you to key in commands. So why is Linux essential in a data engineer's day-to-day?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Servers run Linux&lt;/strong&gt;&lt;br&gt;
Nearly all public cloud workloads and the majority of servers powering data systems run on Linux.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Data engineering tools&lt;/strong&gt;&lt;br&gt;
Hadoop/Spark/Kafka were built for Unix-like systems. These core data engineering tools are designed and optimized to run on Linux. Development, testing, and production deployment naturally happen there.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Performance &amp;amp; Stability&lt;/strong&gt;&lt;br&gt;
Linux servers can run for years without reboots, crucial for long-running data pipelines and streaming jobs.&lt;/p&gt;

&lt;p&gt;Linux is also free &amp;amp; open-source, which is critical for scalable, cost-effective data infrastructure. These are just some of the reasons why Linux is crucial for a data engineer. Let's look at some basic Linux commands and relate them to what we're used to on Windows.&lt;/p&gt;

&lt;h1&gt;
  
  
  Basic Linux Commands
&lt;/h1&gt;

&lt;p&gt;To demonstrate this, we're going to connect to a remote server from Git Bash. We do this by using SSH (Secure Shell) which allows us to access and use the server's resources and run commands. The syntax to connect to the server is: &lt;br&gt;
&lt;code&gt;ssh user@ip&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;See below on connecting to a remote server provisioned on DigitalOcean:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo0zlridtp0vsbjuy6fcx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo0zlridtp0vsbjuy6fcx.png" alt="SSH" width="800" height="279"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After successfully connecting to the remote server, let's run some commands. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;whoami&lt;/code&gt;: This prints out the current user&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy45sg4fik907osk86qoh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy45sg4fik907osk86qoh.png" alt="whoami" width="676" height="127"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;df -h&lt;/code&gt;: Displays disk usage of all mounted filesystems in human-readable units&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhp41rxfnovdaie7n301w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhp41rxfnovdaie7n301w.png" alt="df -h" width="763" height="181"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;pwd&lt;/code&gt;: Prints the current working directory&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjxn0ziwdexke8scqsjti.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjxn0ziwdexke8scqsjti.png" alt="pwd" width="646" height="85"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ls&lt;/code&gt;: Lists all files and folders in your current directory. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3xmjniyuwpzm0wxm2cy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3xmjniyuwpzm0wxm2cy.png" alt="ls" width="800" height="103"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NB&lt;/strong&gt;: Files are highlighted in white, folders in blue, and zipped (compressed) files in orange or red.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;cd&lt;/code&gt;: Changes directory to the folder you specify&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk7jk1p9l6plfqtr07kgt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk7jk1p9l6plfqtr07kgt.png" alt="cd" width="798" height="178"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the illustration above, we changed directory from /root to /root/eveningClass. eveningClass is a folder within /root, and we confirmed the move by printing the current working directory (&lt;code&gt;pwd&lt;/code&gt;) and then listing (&lt;code&gt;ls&lt;/code&gt;) all files and folders within /root/eveningClass.&lt;/p&gt;
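&lt;p&gt;The navigation above can be sketched as a single session. This is a minimal, reproducible example: it first creates the folder it navigates into, and eveningClass is just an illustrative name borrowed from the screenshots:&lt;/p&gt;

```shell
# Create a practice folder first so the session is reproducible
# (eveningClass mirrors the folder name in the screenshots).
mkdir -p eveningClass
pwd                 # print the current working directory
ls                  # list files and folders here, including eveningClass
cd eveningClass     # change into the subfolder
pwd                 # the path now ends in /eveningClass
cd ..               # ".." takes you back up to the parent directory
```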

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;cat&lt;/code&gt;: Allows a user to read / display the contents of a file&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiq9cn7taqxmjuz7mntex.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiq9cn7taqxmjuz7mntex.png" alt="cat" width="800" height="223"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;sudo adduser username&lt;/code&gt;: This creates a new user account with the specified username&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frfwalcny81vkanhu1rmv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frfwalcny81vkanhu1rmv.png" alt="adduser" width="790" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From the above, we've created a new user account 'nganga'. We can verify the user has been added by displaying (&lt;code&gt;cat&lt;/code&gt;) the contents of the file /etc/passwd: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs35dtppxfvjfvxipkuf0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs35dtppxfvjfvxipkuf0.png" alt="cat" width="800" height="332"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2z8hefidqlgeler0e47.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2z8hefidqlgeler0e47.png" alt="user_nganga" width="800" height="293"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;mkdir&lt;/code&gt;: Creates a new directory / folder. You also need to specify the name of the directory you want to create&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj3r47d7mwgnvlmqe5v7i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj3r47d7mwgnvlmqe5v7i.png" alt="mkdir" width="800" height="228"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;touch&lt;/code&gt;: Creates a new, empty file (or updates the timestamp if the file already exists)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F57jlg3o0w3cbdfx9cr0u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F57jlg3o0w3cbdfx9cr0u.png" alt="touch" width="800" height="245"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;echo&lt;/code&gt;: Prints text to the terminal; combined with the &lt;code&gt;&amp;gt;&lt;/code&gt; redirection operator, it can be used to write to a file&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffeb4moj5kjtx8ybf3dc9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffeb4moj5kjtx8ybf3dc9.png" alt="echo" width="800" height="239"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;scp&lt;/code&gt;: Stands for Secure Copy, which allows you to copy files from your local machine to a remote server and vice versa. Let's begin with copying from the local machine:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;1. scp from local machine to remote host&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuwpxqa7aeygtk2f8k15o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuwpxqa7aeygtk2f8k15o.png" alt="scp_local" width="800" height="280"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the illustration above, we have a file SecondFile.txt on the local machine, and we've copied it to the remote host into the directory /root/eveningClass/new_dir, as seen below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvhd3ir5e14rsskez3rgf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvhd3ir5e14rsskez3rgf.png" alt="second_file" width="703" height="109"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. scp from remote host to local machine&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We can copy a file from the remote host as shown below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3qb3y4vhjphnc4m8fgax.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3qb3y4vhjphnc4m8fgax.png" alt="scp_remote" width="800" height="171"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;cp&lt;/code&gt;: This copies a file to a specified destination directory&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F85xhs5xotexg77fwa721.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F85xhs5xotexg77fwa721.png" alt="cp" width="800" height="484"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;The file NewFile.txt has been copied from /root/eveningClass/new_dir/ to /root/eveningClass/.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;mv&lt;/code&gt;: This moves a file from one directory to another&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftvq6ttnm2h14tbf8sy4s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftvq6ttnm2h14tbf8sy4s.png" alt="mv" width="800" height="217"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the example above, we've moved the file NewFile.txt from /root/eveningClass/ to /root/eveningClass/new_dir/.&lt;/p&gt;
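&lt;p&gt;The &lt;code&gt;cp&lt;/code&gt; and &lt;code&gt;mv&lt;/code&gt; steps can be reproduced end to end. This is a self-contained sketch that first creates the folders and file it works on, mirroring the paths from the screenshots under the current directory:&lt;/p&gt;

```shell
mkdir -p eveningClass/new_dir
echo "demo" > eveningClass/new_dir/NewFile.txt
# cp SOURCE DESTINATION: copy NewFile.txt up into eveningClass/ (the original stays put)
cp eveningClass/new_dir/NewFile.txt eveningClass/
ls eveningClass                  # shows both new_dir and NewFile.txt
# mv SOURCE DESTINATION: move the copy back into new_dir/ (removing it from eveningClass/)
mv eveningClass/NewFile.txt eveningClass/new_dir/
ls eveningClass                  # NewFile.txt is gone from here again
```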

&lt;h1&gt;
  
  
  Creating and Editing Files with Nano and Vi
&lt;/h1&gt;

&lt;p&gt;Vi and Nano are text editors used in Linux/Unix terminal environments. Nano is simple and intuitive (like Notepad), while Vi/Vim is more powerful but has a learning curve.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Nano Editor&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Opening/Creating Files&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftgjv9vn62ak2214kktzb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftgjv9vn62ak2214kktzb.png" alt="creating_file" width="595" height="61"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F91n765jamidm30yezyp3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F91n765jamidm30yezyp3.png" alt="nano_editing" width="800" height="532"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn8al1r85bmfpr49n0om5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn8al1r85bmfpr49n0om5.png" alt="nano_save" width="687" height="145"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Type normally&lt;/strong&gt; - just start typing&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Move cursor&lt;/strong&gt; - Arrow keys work as expected&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Save file&lt;/strong&gt; - Ctrl + O (Write Out), then press Enter&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Exit&lt;/strong&gt; - Ctrl + X&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Search&lt;/strong&gt; - Ctrl + W, type word, press Enter&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Vi Editor&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Vi has 3 main modes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Normal mode (default)&lt;/strong&gt; - for navigation and commands&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Insert mode&lt;/strong&gt; - for typing text&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Visual mode&lt;/strong&gt; - for selecting text&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Opening/Creating Files&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9j4j5y1xixbvx2mm7tnw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9j4j5y1xixbvx2mm7tnw.png" alt="Vi_create_file" width="513" height="67"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7vp92kqa2epplu3ir615.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7vp92kqa2epplu3ir615.png" alt="Vi_edit_file" width="800" height="534"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fms0bs7wtrdmifhasvjwr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fms0bs7wtrdmifhasvjwr.png" alt="Vi_save_file" width="800" height="542"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fopcytszp8wya5u1fniya.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fopcytszp8wya5u1fniya.png" alt="Vi_check_content" width="670" height="166"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Essential Vi Commands&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;From Normal mode to Insert mode:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;i - insert before cursor&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;a - append after cursor&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;o - open new line below&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I - insert at beginning of line&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A - append at end of line&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Saving and Quitting (Normal mode):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;:w - save (write)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;:q - quit&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;:wq or ZZ - save and quit&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;:q! - quit without saving&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;:w filename - save as new file&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Navigation (Normal mode):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;h - left&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;j - down&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;k - up&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;l - right&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;0 - beginning of line&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;$ - end of line&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;gg - top of file&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;G - bottom of file&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;:5 - go to line 5&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Editing (Normal mode):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;x - delete character&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;dd - delete line&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;yy - copy line&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;p - paste below&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;P - paste above&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;u - undo&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ctrl+r - redo&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;In this article we have covered the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Explained why Linux is important for data engineers&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Demonstrated basic Linux commands&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Showed practical usage of Vi and Nano (e.g., creating and editing files)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>linux</category>
      <category>dataengineering</category>
      <category>beginners</category>
      <category>terminal</category>
    </item>
    <item>
      <title>Basics of Git and GitHub</title>
      <dc:creator>Ng'ang'a Njongo</dc:creator>
      <pubDate>Sat, 17 Jan 2026 07:45:26 +0000</pubDate>
      <link>https://forem.com/nganga_njongo/basics-of-git-and-github-1bh1</link>
      <guid>https://forem.com/nganga_njongo/basics-of-git-and-github-1bh1</guid>
      <description>&lt;h1&gt;
  
  
  What is Git?
&lt;/h1&gt;

&lt;p&gt;Git is a version control tool that tracks changes to the files in your project folder. Every change made in the project folder is tracked and saved using commits. Each commit saves the state of your files at that moment and records who made the change and why, so each commit corresponds to a new version of the project folder.&lt;/p&gt;

&lt;h1&gt;
  
  
  GitHub
&lt;/h1&gt;

&lt;p&gt;GitHub is a web-based hosting service that safely stores all the versions that have been committed and pushed from Git, making them available to collaborators.&lt;/p&gt;

&lt;h1&gt;
  
  
  Importance of Version Control
&lt;/h1&gt;

&lt;p&gt;From the Git definition earlier, one benefit is that it allows you to track the history of changes to a project folder (who, when, why). Others are listed below:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;You can experiment freely and safely, as you can always revert to a previous version without the fear of breaking things&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It provides a clear, documented history of changes, so you are not caught in the chaos of saving files as "final_project_v7.zip"&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It allows members / developers to collaborate on a project simultaneously without overwriting each other's work&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It avoids a single point of failure, as each developer has a copy of the entire project&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  1. Pushing Code to GitHub
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Create an empty folder&lt;/strong&gt;&lt;br&gt;
We'll begin by creating an empty folder on the local machine from the terminal and navigating into it, as shown below: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb6wbj1r16rbsy4kds5tm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb6wbj1r16rbsy4kds5tm.png" alt="Creating a folder" width="636" height="244"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Initialize Git&lt;/strong&gt;&lt;br&gt;
In the current directory, we then need to initialize Git. This is done with the &lt;code&gt;git init&lt;/code&gt; command:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5a3z5kfubzueqv3svztj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5a3z5kfubzueqv3svztj.png" alt="Initialize Git" width="800" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create a repository on GitHub&lt;/strong&gt;&lt;br&gt;
Before we push code to GitHub, we'll need to create a repository that will store our pushed code from Git. You can do this by navigating to GitHub and selecting "New Repository" as shown below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fst2ecceo6uac0iu5gys3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fst2ecceo6uac0iu5gys3.png" alt="Creating a Repository" width="800" height="365"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Connecting Git to GitHub repository&lt;/strong&gt;&lt;br&gt;
Once the repository is created, we can then connect Git with the specific repository we created using the code below:&lt;br&gt;
&lt;code&gt;git remote add origin https://github.com/Nganga7/LuxDev_Assignments.git&lt;/code&gt;&lt;/p&gt;
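&lt;p&gt;You can confirm the remote was registered with &lt;code&gt;git remote -v&lt;/code&gt;. A minimal sketch (remote-demo is an illustrative folder name; the URL is the repository created above):&lt;/p&gt;

```shell
mkdir -p remote-demo
cd remote-demo
git init -q                   # initialize Git in the folder
git remote add origin https://github.com/Nganga7/LuxDev_Assignments.git
git remote -v                 # lists "origin" with its fetch and push URLs
```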

&lt;p&gt;&lt;strong&gt;Creating a new file&lt;/strong&gt;&lt;br&gt;
In the directory "C:\Nganga\LuxDevHQ\Assignment1\", we created an empty text file "Doc_one.txt" and added the line "First Doc" as shown below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu4awmo7pk8o1f069phvf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu4awmo7pk8o1f069phvf.png" alt="Text File Creation" width="800" height="350"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Writing to the new file&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8s5c3z2i579ljltt1rlt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8s5c3z2i579ljltt1rlt.png" alt="Text File Contents" width="800" height="656"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Track Git changes&lt;/strong&gt; &lt;br&gt;
Since we've made changes to our project folder (creating a new file &amp;amp; writing to the file), Git will track the changes. Initially there was nothing to be tracked. See below:  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before creating the file&lt;/strong&gt;&lt;br&gt;
We use &lt;code&gt;git status&lt;/code&gt; to check for changes made in our project folder&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fivzr75vgrdez7gc4hzmc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fivzr75vgrdez7gc4hzmc.png" alt="Before file creation" width="800" height="157"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After creating the file&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmgdxgs71lti8xxs9qcnb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmgdxgs71lti8xxs9qcnb.png" alt="After file creation" width="800" height="224"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Staging and committing changes&lt;/strong&gt;&lt;br&gt;
After making the changes, we notice that they are untracked. We therefore need to add the files to the staging area and commit these changes before we push our code. We do this using &lt;code&gt;git add Doc_one.txt&lt;/code&gt; and commit (with a message) using &lt;code&gt;git commit -m "First Commit"&lt;/code&gt; as shown below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flr568589360h6c1438nr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flr568589360h6c1438nr.png" alt="Git add &amp;amp; commit" width="800" height="741"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can view what has been committed using &lt;code&gt;git log&lt;/code&gt;. As shown below, we see the commit ID, the author (who), the date &amp;amp; time of the commit, and the message added during the commit:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fww3pwvj3u744vs0d1y9q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fww3pwvj3u744vs0d1y9q.png" alt="git log" width="800" height="403"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pushing to GitHub repository&lt;/strong&gt;&lt;br&gt;
We're now ready to push our file with its contents to GitHub. We do this using the command &lt;code&gt;git push -u origin master&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fga0zj80jiucmzge4lez3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fga0zj80jiucmzge4lez3.png" alt="Pushed from Git" width="800" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;File available on GitHub&lt;/strong&gt;&lt;br&gt;
The file has now been pushed to GitHub and is available to any collaborators who need to pull it:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpg2nv91mmq4xqdlmh92w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpg2nv91mmq4xqdlmh92w.png" alt="Pushed to Git" width="800" height="366"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  2. Pulling Code from GitHub
&lt;/h1&gt;

&lt;p&gt;To pull a file from GitHub, we'll create a new file in our repository called "Doc_two.txt" and we'll write to it "Second Doc" as shown below:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Adding a file&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl9l04x4b0xdgct9155q8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl9l04x4b0xdgct9155q8.png" alt="Adding a file" width="800" height="398"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Creating the file&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F920varl1qv9zdy6e8gd8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F920varl1qv9zdy6e8gd8.png" alt="Creating the file" width="800" height="377"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We then need to commit these changes with a message, similar to what we did when we pushed "Doc_one.txt" to GitHub using the command &lt;code&gt;git commit -m "First Commit"&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj4saraae2uu5ilrcjjwm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj4saraae2uu5ilrcjjwm.png" alt="Second Commit" width="800" height="370"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Our GitHub repository now has both files "Doc_one.txt" and "Doc_two.txt", but our local repository only has the "Doc_one.txt" file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub repository&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffcesy011npnwhp6fevh1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffcesy011npnwhp6fevh1.png" alt="GitHub Repo" width="800" height="369"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Local Git repository&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp6xb80do86qzwldfg17k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp6xb80do86qzwldfg17k.png" alt="Local Repo" width="800" height="258"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To finally pull the new file to our local repository, we use the &lt;code&gt;git pull&lt;/code&gt; command:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F09j9fk77k5nsr6ba6xbr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F09j9fk77k5nsr6ba6xbr.png" alt="Git Pull" width="800" height="579"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As shown above, we have successfully pulled the "Doc_two.txt" file from GitHub to our local Git repository.&lt;/p&gt;
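&lt;p&gt;The push/pull round trip can also be rehearsed entirely offline by letting a local bare repository stand in for GitHub. This is a sketch with hypothetical folder names and identity, not the exact setup above:&lt;/p&gt;

```shell
git init -q --bare hub.git           # plays the role of the GitHub repository
git clone -q hub.git copy_a          # first collaborator's copy
git clone -q hub.git copy_b          # second collaborator's copy

cd copy_a
git config user.name "Example User"  # hypothetical identity for the demo
git config user.email "user@example.com"
echo "Second Doc" > Doc_two.txt
git add Doc_two.txt
git commit -q -m "Second Commit"
branch=$(git symbolic-ref --short HEAD)   # default branch name (master or main)
git push -q origin "$branch"              # "push to GitHub"

cd ../copy_b
git pull -q origin "$branch"         # fetch the new file into the second copy
cat Doc_two.txt                      # prints: Second Doc
```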

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;In this article, we covered the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;What Git is and why version control is important&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How to push code to GitHub&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How to pull code from GitHub&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How to track changes using Git&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>git</category>
      <category>github</category>
      <category>dataengineering</category>
      <category>datascience</category>
    </item>
  </channel>
</rss>
