<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Emmanuel Kiriinya</title>
    <description>The latest articles on Forem by Emmanuel Kiriinya (@emmanuel_kiriinya).</description>
    <link>https://forem.com/emmanuel_kiriinya</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3017148%2F3c2bb168-3223-4c9c-b769-253e56a159c1.jpg</url>
      <title>Forem: Emmanuel Kiriinya</title>
      <link>https://forem.com/emmanuel_kiriinya</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/emmanuel_kiriinya"/>
    <language>en</language>
    <item>
      <title>Pandas vs Polars: Is It Time to Rethink Python’s Trusted DataFrame Library?</title>
      <dc:creator>Emmanuel Kiriinya</dc:creator>
      <pubDate>Sun, 22 Jun 2025 21:30:59 +0000</pubDate>
      <link>https://forem.com/emmanuel_kiriinya/pandas-vs-polars-is-it-time-to-rethink-pythons-trusted-dataframe-library-1a28</link>
      <guid>https://forem.com/emmanuel_kiriinya/pandas-vs-polars-is-it-time-to-rethink-pythons-trusted-dataframe-library-1a28</guid>
      <description>&lt;p&gt;For over a decade, &lt;strong&gt;Pandas&lt;/strong&gt; has been the cornerstone of tabular data manipulation in Python. Its intuitive syntax and rich functionality make it the default choice for analysts, data scientists, and researchers worldwide.&lt;/p&gt;

&lt;p&gt;However, as datasets have grown from megabytes to gigabytes—and now terabytes—the limitations of Pandas are increasingly evident. Enter &lt;strong&gt;Polars&lt;/strong&gt;: a modern, high-performance DataFrame library built for speed and scalability.&lt;/p&gt;

&lt;p&gt;In this article, we’ll cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Why Pandas remains popular
&lt;/li&gt;
&lt;li&gt;What makes Polars different
&lt;/li&gt;
&lt;li&gt;A practical benchmark with a large real-world dataset
&lt;/li&gt;
&lt;li&gt;Whether Pandas might eventually be replaced&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Pandas: A Reliable Workhorse
&lt;/h2&gt;

&lt;p&gt;Since its development began in 2008, Pandas has become the dominant tool for data analysis in Python. Its strengths include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Familiar and expressive API (&lt;code&gt;DataFrame&lt;/code&gt;, &lt;code&gt;Series&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Seamless integration with other Python libraries (NumPy, scikit-learn, matplotlib)&lt;/li&gt;
&lt;li&gt;Extensive tutorials, examples, and community support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, Pandas was designed for single-threaded execution and expects the entire dataset to fit in memory. This often becomes a bottleneck when working with very large datasets on a laptop or single machine.&lt;/p&gt;




&lt;h2&gt;
  
  
  Polars: A Modern Alternative for High-Performance DataFrames
&lt;/h2&gt;

&lt;p&gt;Polars is a newer open-source DataFrame library, written in &lt;strong&gt;Rust&lt;/strong&gt; with Python bindings. It’s designed with performance and scalability in mind:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-threaded execution:&lt;/strong&gt; Polars uses all available CPU cores automatically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lazy evaluation:&lt;/strong&gt; Like Spark, Polars can optimize a query plan before executing it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory efficiency:&lt;/strong&gt; Polars stores data in a columnar format based on Apache Arrow, and its streaming mode can process data in chunks to avoid excessive memory usage.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These design choices often let Polars process large datasets faster, and with less memory, than Pandas.&lt;/p&gt;




&lt;h2&gt;
  
  
  Pandas vs Polars: A Real-World Benchmark
&lt;/h2&gt;

&lt;p&gt;To see the difference in practice, let’s analyze a real dataset: the &lt;strong&gt;NYC Taxi Trip data&lt;/strong&gt;, which typically has over &lt;strong&gt;20 million rows&lt;/strong&gt; and is about &lt;strong&gt;3 GB&lt;/strong&gt; uncompressed.&lt;/p&gt;

&lt;p&gt;Below is a simple benchmark computing the average trip distance grouped by passenger count, using both libraries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Install the libraries if needed:
# pip install pandas polars
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;polars&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pl&lt;/span&gt;

&lt;span class="c1"&gt;# Replace with the path to your CSV file
&lt;/span&gt;&lt;span class="n"&gt;FILE_PATH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;yellow_tripdata_2023-01.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# --- Using Pandas ---
&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;df_pd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FILE_PATH&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result_pd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df_pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;passenger_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trip_distance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result_pd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Pandas execution time:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# --- Using Polars ---
&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;df_pl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FILE_PATH&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result_pl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;df_pl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;group_by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;passenger_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;agg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;col&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trip_distance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result_pl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Polars execution time:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Expected results (typical laptop):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pandas:&lt;/strong&gt; 20–30 seconds, high memory usage
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Polars:&lt;/strong&gt; 3–6 seconds, significantly lower memory footprint&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This highlights how Polars can dramatically speed up large data workflows.&lt;/p&gt;




&lt;h2&gt;
  
  
  When to Use Each Library
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Pandas&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Polars&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Execution Model&lt;/td&gt;
&lt;td&gt;Single-threaded&lt;/td&gt;
&lt;td&gt;Multi-threaded, supports lazy evaluation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Performance&lt;/td&gt;
&lt;td&gt;Good for small to medium data&lt;/td&gt;
&lt;td&gt;Excellent for large data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory Usage&lt;/td&gt;
&lt;td&gt;Entire dataset in RAM&lt;/td&gt;
&lt;td&gt;Efficient chunk processing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API Maturity&lt;/td&gt;
&lt;td&gt;Highly mature&lt;/td&gt;
&lt;td&gt;Rapidly evolving&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Community Support&lt;/td&gt;
&lt;td&gt;Large &amp;amp; established&lt;/td&gt;
&lt;td&gt;Growing rapidly&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Will Pandas Be Replaced?
&lt;/h2&gt;

&lt;p&gt;It’s unlikely that Pandas will be phased out anytime soon. Reasons include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deep integration in the Python ecosystem&lt;/li&gt;
&lt;li&gt;Many libraries (e.g., scikit-learn, statsmodels) expect Pandas DataFrames&lt;/li&gt;
&lt;li&gt;Widely taught in courses, bootcamps, and used in countless notebooks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, many modern data workflows use &lt;strong&gt;both&lt;/strong&gt;:&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Pandas&lt;/strong&gt; for quick exploration and prototyping, &lt;strong&gt;Polars&lt;/strong&gt; for heavy transformations, large datasets, or production-grade pipelines.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaway
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Pandas isn’t going anywhere — but Polars is raising the bar for what’s possible on a single machine.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you work with large CSVs, Parquet files, or complex transformations, try Polars on your next project. It’s an easy way to process more data faster, with less hardware overhead.&lt;/p&gt;




&lt;h2&gt;
  
  
  Next Steps
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Try Polars with your largest dataset&lt;/li&gt;
&lt;li&gt;Experiment with its lazy API for ETL pipelines&lt;/li&gt;
&lt;li&gt;Stay comfortable with Pandas for quick analyses and prototyping&lt;/li&gt;
&lt;/ul&gt;




</description>
      <category>datascience</category>
      <category>pandas</category>
      <category>polars</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Understanding PowerBI</title>
      <dc:creator>Emmanuel Kiriinya</dc:creator>
      <pubDate>Sun, 25 May 2025 10:44:47 +0000</pubDate>
      <link>https://forem.com/emmanuel_kiriinya/understanding-powerbi-1kmb</link>
      <guid>https://forem.com/emmanuel_kiriinya/understanding-powerbi-1kmb</guid>
      <description>&lt;p&gt;In today’s data-driven world, visualization plays a central role in helping people understand and act on information. Whether in business, research, or public service, interpreting data visually is a critical skill. Power BI, Microsoft's powerful business intelligence tool, makes data visualization more accessible to analysts, developers, and decision-makers alike. This article covers the fundamentals of data visualization, the advantages of using Power BI, common use cases, and practical tips for creating effective dashboards.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is Data Visualization?
&lt;/h2&gt;

&lt;p&gt;As discussed in my previous article, &lt;a href="https://dev.to/emmanuel_kiriinya_416fc40/visualization-beyond-aesthetics-5758"&gt;'Visualization Beyond Aesthetics'&lt;/a&gt;, data visualization is about making complex information clear, truthful, and actionable. More concretely, it is the process of representing data through graphical elements such as charts, graphs, and maps. Rather than leaving readers to scan raw rows and columns, a visualization presents patterns, trends, and outliers in an easily digestible form. A well-designed chart makes complex data more intuitive, saving time and reducing the risk of misinterpretation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Common types of visualizations include:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bar and column charts for comparing quantities across categories&lt;/li&gt;
&lt;li&gt;Line charts for showing trends over time&lt;/li&gt;
&lt;li&gt;Pie charts for representing parts of a whole&lt;/li&gt;
&lt;li&gt;Maps for visualizing geographical data&lt;/li&gt;
&lt;li&gt;Scatter plots for exploring relationships between variables&lt;/li&gt;
&lt;li&gt;Heatmaps for identifying intensity patterns across two dimensions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Effective visualizations simplify decision-making and highlight insights that might otherwise go unnoticed.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is Power BI?
&lt;/h2&gt;

&lt;p&gt;Power BI is a business analytics and data visualization platform developed by Microsoft. It allows users to connect to various data sources, prepare and transform data, and create interactive reports and dashboards. Power BI is widely used across industries due to its ease of use, integration with other Microsoft tools, and ability to scale from small projects to enterprise-level deployments.&lt;/p&gt;

&lt;p&gt;There are three main components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Power BI Desktop:&lt;/strong&gt; A free Windows application used for report building and data modeling.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Power BI Service:&lt;/strong&gt; A cloud-based platform for publishing, sharing, and collaborating on reports.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Power BI Mobile:&lt;/strong&gt; An app for accessing dashboards on phones and tablets.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why Use Power BI for Visualization?
&lt;/h2&gt;

&lt;p&gt;Power BI has become a preferred tool for visualization for several reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;User-Friendly Interface&lt;/strong&gt;&lt;br&gt;
Power BI uses a drag-and-drop interface that doesn’t require coding. Users can create visuals by simply selecting fields and choosing how to represent them. This lowers the learning curve for beginners while still supporting more advanced users through features like DAX (Data Analysis Expressions) and custom visuals.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Wide Range of Visualization Types&lt;/strong&gt;&lt;br&gt;
From basic charts to more advanced visuals like gauges, KPIs, decomposition trees, and maps, Power BI offers dozens of built-in options. Users can also import custom visuals from Microsoft AppSource.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Interactivity&lt;/strong&gt;&lt;br&gt;
Dashboards in Power BI are interactive by default. Selecting a data point in one visual automatically cross-filters or cross-highlights the other visuals on the same page. This allows users to explore relationships between data points in real time, without writing new queries.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Real-Time Updates&lt;/strong&gt;&lt;br&gt;
With live connections to databases or APIs, Power BI dashboards can reflect real-time data. This is particularly useful for monitoring key metrics like website traffic, sales performance, or system uptime.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Integration&lt;/strong&gt;&lt;br&gt;
Power BI connects to a wide range of data sources, including Excel, SQL Server, Azure, SharePoint, PostgreSQL, and even web services. Data can be imported into the model or queried live at the source through DirectQuery.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Preparation Tools&lt;/strong&gt;&lt;br&gt;
Power BI includes Power Query, which provides tools for cleaning, reshaping, and merging data. Users can remove duplicates, change data types, split columns, and apply transformations, all without external software.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Common Use Cases
&lt;/h2&gt;

&lt;p&gt;Power BI is used in a variety of domains. Some of the most common include:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Business Performance Tracking&lt;/strong&gt;&lt;br&gt;
Organizations use Power BI dashboards to monitor sales, revenue, profit margins, and market performance. Dashboards often include KPIs, time series charts, and filters to compare performance across regions or products.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Marketing and Customer Analytics&lt;/strong&gt;&lt;br&gt;
Marketing teams track campaign performance, customer segmentation, and conversion rates using Power BI. Data from CRMs or social media platforms can be integrated to provide a complete view of customer behavior.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Finance and Budgeting&lt;/strong&gt;&lt;br&gt;
Financial analysts use Power BI for tracking budgets, forecasting revenue, and comparing planned vs. actual expenditures. Waterfall charts, variance visuals, and custom DAX formulas are commonly used in this context.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Operations and Supply Chain&lt;/strong&gt;&lt;br&gt;
Power BI helps operations teams monitor inventory, shipping times, and production efficiency. Real-time dashboards allow for quick response to delays or supply chain disruptions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;HR and Workforce Analytics&lt;/strong&gt;&lt;br&gt;
HR departments use dashboards to visualize hiring trends, employee turnover, training completion, and diversity metrics. Filters allow reports to be customized by department, region, or time period.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Building a Visualization in Power BI Step-by-Step
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Connect to Data&lt;/strong&gt;&lt;br&gt;
Begin by loading data into Power BI. You can connect to Excel sheets, databases, CSV files, or online services. Choose either Import or DirectQuery mode depending on your needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Clean and Transform&lt;/strong&gt;&lt;br&gt;
Use Power Query to clean the data. This might include renaming columns, filtering rows, changing data types, or merging tables.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build Data Model&lt;/strong&gt;&lt;br&gt;
Define relationships between tables. Use the “Model” view to create links between keys (e.g., CustomerID or ProductID) across tables. This allows data from different sources to be used in the same visual.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create Visuals&lt;/strong&gt;&lt;br&gt;
Drag fields onto the canvas and select the type of visual you want to use. Customize titles, labels, colors, and tooltips. Group related visuals together into dashboards.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Add Filters and Slicers&lt;/strong&gt;&lt;br&gt;
These allow users to interact with the dashboard. For example, a slicer could let someone view data from a specific year or region.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Publish and Share&lt;/strong&gt;&lt;br&gt;
Upload your report to the Power BI Service to share it with others. You can control permissions, schedule data refreshes, or embed reports into apps or websites.&lt;/p&gt;

&lt;h2&gt;
  
  
  Best Practices for Visualization in Power BI
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Be Clear and Focused&lt;/strong&gt;&lt;br&gt;
Every visual should serve a specific purpose. Avoid overcrowding a dashboard with too many elements. Highlight key findings and keep navigation simple.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Appropriate Chart Types&lt;/strong&gt;&lt;br&gt;
Choose the chart type that best fits the data and message. Don’t use pie charts for comparing more than three or four categories. Avoid 3D effects and other visual distortions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Make It Interactive&lt;/strong&gt;&lt;br&gt;
Take advantage of Power BI’s built-in interactivity. Use slicers and drill-throughs to let users explore the data in ways that matter to them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ensure Accessibility&lt;/strong&gt;&lt;br&gt;
Use high-contrast colors and readable fonts. Test how visuals look on different screen sizes, especially if they’ll be viewed on mobile devices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Document and Label Clearly&lt;/strong&gt;&lt;br&gt;
Include titles, axis labels, and tooltips. If you use custom calculations or filters, provide notes or legends to explain them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Learning Power BI
&lt;/h2&gt;

&lt;p&gt;For students and early-career professionals in data science and analytics, Power BI is a practical tool to master. It combines data manipulation, modeling, and visualization in one platform, helping you build both analytical and communication skills.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;To get started:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Use sample datasets (e.g., sales data, COVID-19 data, survey results)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Try recreating dashboards you find online, or use free datasets from websites like &lt;a href="//kaggle.com"&gt;Kaggle&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Participate in Power BI community challenges or forums&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Projects don’t have to be complicated. Even a dashboard tracking your spending can be a great way to practice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Power BI offers a powerful platform for turning data into insight. Its user-friendly interface, wide range of visualization options, and strong integration with other tools help analysts and decision-makers alike work with data more effectively. Whether you're preparing a class assignment, exploring real-world datasets, or building dashboards for&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Visualization Beyond Aesthetics</title>
      <dc:creator>Emmanuel Kiriinya</dc:creator>
      <pubDate>Thu, 15 May 2025 10:38:47 +0000</pubDate>
      <link>https://forem.com/emmanuel_kiriinya/visualization-beyond-aesthetics-5758</link>
      <guid>https://forem.com/emmanuel_kiriinya/visualization-beyond-aesthetics-5758</guid>
      <description>&lt;p&gt;In today’s data-driven world, we’re surrounded by slick dashboards, colorful graphs, and eye-catching infographics. Whether you’re scrolling through LinkedIn or sitting in a boardroom, chances are you’ve seen a chart that looks great—but does it actually help you understand the data?&lt;/p&gt;

&lt;p&gt;Data visualization is often praised for its aesthetic appeal, but effective visualization is about much more than making data look pretty. At its core, it’s about &lt;strong&gt;making complex information clear, truthful, and actionable&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;As Edward Tufte, a pioneer in the field, once said:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Graphical excellence consists of complex ideas communicated with clarity, precision, and efficiency.”&lt;br&gt;&lt;br&gt;
— &lt;em&gt;The Visual Display of Quantitative Information&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Let’s dive into why &lt;strong&gt;visualization should go beyond just the aesthetics&lt;/strong&gt;—and how designers, analysts, and decision-makers can use it to tell better, more honest stories with data.&lt;/p&gt;




&lt;h2&gt;
  
  
  When Pretty Isn’t Enough
&lt;/h2&gt;

&lt;p&gt;It’s easier than ever to create visually appealing charts thanks to tools like Tableau, Power BI, and Flourish. But focusing too much on aesthetics can lead to visuals that are attractive but &lt;strong&gt;misleading or ineffective&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Alberto Cairo, in his book &lt;em&gt;The Truthful Art&lt;/em&gt;, warns:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“A good graphic isn’t just one that looks nice, but one that is based on sound data and communicates clearly.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Take 3D charts, for example. They might look cool, but they often distort how we perceive the data. Similarly, excessive design elements—what Tufte called “chartjunk”—can confuse more than clarify.&lt;/p&gt;

&lt;p&gt;A visualization should &lt;strong&gt;serve a function&lt;/strong&gt;, not just win a design award.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F477i0xpyf3w4k2l0xuwc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F477i0xpyf3w4k2l0xuwc.png" alt="An example of a bad visualization" width="588" height="514"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example: A 3D pie chart with too many categories and distorted proportions.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Visualization as a Thinking Tool
&lt;/h2&gt;

&lt;p&gt;When done right, visualizations become &lt;strong&gt;cognitive tools&lt;/strong&gt;—helping people think, analyze, and decide. They reduce the mental effort required to interpret large datasets or find insights.&lt;/p&gt;

&lt;p&gt;Stephen Few, in &lt;em&gt;Now You See It&lt;/em&gt;, puts it simply:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“A chart should not be judged by its appeal, but by its ability to help users see what they need to see.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The takeaway? Use the &lt;strong&gt;right chart for the right job&lt;/strong&gt;. Want to compare values? Use a bar chart. Want to show change over time? Use a line graph. Good design is about making the data easier to understand—not just easier on the eyes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Storytelling with Data: More Than Just Numbers
&lt;/h2&gt;

&lt;p&gt;It’s not enough to show data—you need to &lt;strong&gt;tell a story&lt;/strong&gt; with it. This is especially true when your audience includes non-technical decision-makers.&lt;/p&gt;

&lt;p&gt;Cole Nussbaumer Knaflic, author of &lt;em&gt;Storytelling with Data&lt;/em&gt;, explains:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Numbers have an important story to tell. Rely on data to tell that story, but augment it with the right visuals and the right context.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Think of your chart as a narrative. What’s the main point? What do you want your audience to take away? Use annotations, highlights, and clean layouts to &lt;strong&gt;guide the viewer&lt;/strong&gt; through the story.&lt;/p&gt;

&lt;p&gt;Pro tip: remove anything that doesn’t serve the message. Decluttering is your best friend.&lt;/p&gt;




&lt;h2&gt;
  
  
  Know Your Audience
&lt;/h2&gt;

&lt;p&gt;The same visualization won’t work for everyone. What’s intuitive to a data analyst might be confusing to an executive or policymaker.&lt;/p&gt;

&lt;p&gt;Kieran Healy, in &lt;em&gt;Data Visualization: A Practical Introduction&lt;/em&gt;, emphasizes:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Design decisions should be guided by empathy for the reader.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Avoiding jargon
&lt;/li&gt;
&lt;li&gt;Using familiar chart types
&lt;/li&gt;
&lt;li&gt;Adding helpful labels or explanations
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Design with &lt;strong&gt;context in mind&lt;/strong&gt;. Who’s looking at this chart? What do they already know? What do they need to learn?&lt;/p&gt;




&lt;h2&gt;
  
  
  Visual Ethics: Telling the Truth with Data
&lt;/h2&gt;

&lt;p&gt;A beautiful but misleading chart is dangerous. Whether it’s a cherry-picked axis or a deceptive color scale, bad visualizations can &lt;strong&gt;manipulate opinions&lt;/strong&gt; and &lt;strong&gt;mislead decisions&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Edward Tufte called out these tactics decades ago, and the problem persists today.&lt;/p&gt;

&lt;p&gt;Alberto Cairo reinforces this in &lt;em&gt;The Truthful Art&lt;/em&gt;, calling for &lt;strong&gt;functional honesty&lt;/strong&gt; in design. The designer’s job isn’t to persuade—it’s to &lt;strong&gt;illuminate&lt;/strong&gt; the truth.&lt;/p&gt;

&lt;p&gt;This is especially critical in fields like journalism, public health, and finance. A good chart can influence policy. A bad one can cause harm.&lt;/p&gt;







&lt;h2&gt;
  
  
  Example of a Good Dashboard
&lt;/h2&gt;

&lt;p&gt;Great dashboards prioritize usability and clarity. They surface key insights at a glance and avoid information overload.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvq0u22ikkya3gt2f9r90.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvq0u22ikkya3gt2f9r90.png" alt="Dashboard" width="800" height="476"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example: A clean executive dashboard showing KPIs with filters, annotations, and intuitive navigation.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts: Show the Data, Tell the Truth
&lt;/h2&gt;

&lt;p&gt;Beautiful charts might win likes on social media—but in real-world decision-making, clarity and integrity are what truly matter.&lt;/p&gt;

&lt;p&gt;As Edward Tufte reminds us:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Above all else, show the data.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Data visualization is a &lt;strong&gt;bridge between numbers and understanding&lt;/strong&gt;. When we design with intention—beyond aesthetics—we unlock its real power: to inform, to inspire, and to drive better decisions.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Thanks for reading! Let me know in the comments—what’s the most misleading chart you’ve seen in the wild?&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>SQL For Data Analytics in a Nutshell</title>
      <dc:creator>Emmanuel Kiriinya</dc:creator>
      <pubDate>Thu, 15 May 2025 10:36:01 +0000</pubDate>
      <link>https://forem.com/emmanuel_kiriinya/sql-for-data-analytics-in-a-nutshell-kci</link>
      <guid>https://forem.com/emmanuel_kiriinya/sql-for-data-analytics-in-a-nutshell-kci</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;An almost incomprehensible amount of data is created every day, and the volume keeps growing each year. To be useful later, this data has to be stored electronically so that it can be analysed and researched to produce actionable insights. In this article, I will explain SQL, one of the most widely used languages for managing data, with a focus on relational databases.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is SQL?
&lt;/h2&gt;

&lt;p&gt;SQL stands for Structured Query Language and is commonly pronounced &lt;code&gt;se-qu-el&lt;/code&gt;. It is the standard language for managing, manipulating, and querying relational databases; in other words, it is the language used to communicate with them. It was developed at IBM in the early 1970s and has been revised many times since to keep up with emerging technologies. Several database management systems implement SQL, including PostgreSQL, MySQL, SQL Server, Oracle, and SQLite. These systems store data in tables made up of columns and rows, and data stored this way is known as structured data.&lt;br&gt;
SQL can be further broken down into 4 sublanguages for tackling different jobs, namely:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data Definition Language (DDL)&lt;/strong&gt;: It is used to define and modify the structure of the database, for example by creating and altering tables and views. The commands used are &lt;code&gt;CREATE&lt;/code&gt;, &lt;code&gt;ALTER&lt;/code&gt;, &lt;code&gt;DROP&lt;/code&gt;, and &lt;code&gt;TRUNCATE&lt;/code&gt;. &lt;em&gt;&lt;strong&gt;Examples&lt;/strong&gt;&lt;/em&gt; 
&lt;code&gt;CREATE TABLE&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE customers (id INT,
 name VARCHAR(255));

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;code&gt;ALTER TABLE&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ALTER TABLE customers ADD COLUMN email VARCHAR(255);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;DROP TABLE&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DROP TABLE customers;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data Query Language (DQL)&lt;/strong&gt;: It is used for querying data. Its main command is the &lt;code&gt;SELECT&lt;/code&gt; statement.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT *
FROM customers;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data Manipulation Language(DML)&lt;/strong&gt;:
It is used to act on the data itself. Commands include &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt;, &lt;code&gt;DELETE&lt;/code&gt;, and &lt;code&gt;MERGE&lt;/code&gt;. &lt;strong&gt;&lt;em&gt;Examples&lt;/em&gt;&lt;/strong&gt;
&lt;code&gt;INSERT&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INSERT INTO customers (id, name, email) 
(1, 'John Doe', 'john@example.com')
(2, 'Marya Akoth', 'marya.akoth@mail.com');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;UPDATE&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;UPDATE customers 
SET name = 'Jane Doe' WHERE id = 1;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;DELETE&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DELETE FROM customers WHERE id = 1;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data Control Language(DCL)&lt;/strong&gt;: It is used to manage access to a database and its data. It is usually used by database administrators to manage access in an organisation. Commands include &lt;code&gt;GRANT&lt;/code&gt; and &lt;code&gt;REVOKE&lt;/code&gt;. &lt;/li&gt;
&lt;/ul&gt;
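
&lt;p&gt;The other sublanguages above come with examples, so here is a minimal sketch for DCL as well. It assumes MySQL syntax and a hypothetical account named &lt;code&gt;'report_reader'@'localhost'&lt;/code&gt; that already exists; the account name and the choice of privilege are illustrative only.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Hypothetical account; substitute one that exists on your server.
-- Allow read-only access to the customers table.
GRANT SELECT ON customers TO 'report_reader'@'localhost';

-- Take the same privilege away again.
REVOKE SELECT ON customers FROM 'report_reader'@'localhost';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;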

&lt;h2&gt;
  
  
  Why is SQL the foundation of Data Analytics?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;In the data universe, data engineers and database administrators will use SQL to ensure that everybody in their organisation can access the data they need, sometimes depending on the permissions granted to the users.&lt;/li&gt;
&lt;li&gt;Data scientists use SQL to query data stored in a database, which they then load into their machine learning models for training.&lt;/li&gt;
&lt;li&gt;Data analysts use SQL to query tables of data and derive insights.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What are the benefits of SQL?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Standard language for databases&lt;/strong&gt;: SQL is the standard language for retrieving data and interacting with relational databases. Most software applications and programming languages used in analytics, such as Python and R, can connect to SQL databases.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability&lt;/strong&gt;: SQL databases are well suited to processing large datasets, unlike spreadsheets, which become slow and unwieldy as the number of rows grows. A relational database can comfortably handle millions of rows, and this scalability is a key benefit for modern data analysis.
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Easy to learn and effective&lt;/strong&gt;: When compared to other programming languages, SQL is simple to learn and use, enabling data analysts to achieve immediate and effective results. Its clear syntax and straightforward commands empower analysts to execute complex computations efficiently.
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Handling structured data&lt;/strong&gt;: Analysts need SQL to work with structured data, which includes creating and managing datasets kept in relational databases such as Oracle, Microsoft SQL Server, and MySQL.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Preparation and Wrangling&lt;/strong&gt;: Data cleaning and preprocessing are among the first steps in data analysis, and SQL is essential for both, especially when working with big data tools. Preparing data in SQL simplifies the later analysis and helps the analyst avoid many errors.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  A practical, slightly deeper understanding of common terms and syntax used in SQL.
&lt;/h2&gt;

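&lt;p&gt;Now that we have defined SQL, its sublanguages, and its benefits, let's deepen our understanding of the basic terms and syntax with a worked example. The statements below use MySQL syntax (for example &lt;code&gt;AUTO_INCREMENT&lt;/code&gt; and &lt;code&gt;USE&lt;/code&gt;), so other database systems may need small adjustments.&lt;/p&gt;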
&lt;h3&gt;
  
  
  Step 1: Create the Schema
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;SCHEMA&lt;/span&gt; &lt;span class="n"&gt;ecommerce_ke&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;USE&lt;/span&gt; &lt;span class="n"&gt;ecommerce_ke&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Clause Tip&lt;/strong&gt;: &lt;code&gt;CREATE SCHEMA&lt;/code&gt; defines a namespace for organizing your tables and other database objects.&lt;/p&gt;


&lt;h3&gt;
  
  
  Step 2: Create the &lt;code&gt;customers&lt;/code&gt; Table
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="n"&gt;AUTO_INCREMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;last_name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;registration_date&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Create the &lt;code&gt;orders&lt;/code&gt; Table
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="n"&gt;AUTO_INCREMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;order_date&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;order_amount&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Insert Sample Data (with explicit &lt;code&gt;customer_id&lt;/code&gt;)
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;registration_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Akinyi'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Otieno'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'akinyi.otieno@example.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Nairobi'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-01-15'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Kamau'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Mwangi'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'kamau.mwangi@example.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Mombasa'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-02-10'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Wanjiru'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Kariuki'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'wanjiru.kariuki@example.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Kisumu'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-03-05'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Mutiso'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Nzomo'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'mutiso.nzomo@example.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Nakuru'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-03-18'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Naliaka'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Wekesa'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'naliaka.wekesa@example.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Eldoret'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-04-01'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order_amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-03-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Completed'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-03-02'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7500&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Completed'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-03-04'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Pending'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-03-10'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Completed'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-04-12'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Cancelled'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-04-14'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4500&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Completed'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 5: Query Examples
&lt;/h2&gt;
&lt;h3&gt;
  
  
  1. &lt;code&gt;SELECT&lt;/code&gt; and &lt;code&gt;FROM&lt;/code&gt;
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Clause Tip&lt;/strong&gt;: &lt;code&gt;SELECT&lt;/code&gt; retrieves specific columns, and &lt;code&gt;FROM&lt;/code&gt; specifies the table source.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;first_name | last_name | city
-----------|-----------|---------
Akinyi     | Otieno    | Nairobi
Kamau      | Mwangi    | Mombasa
Wanjiru    | Kariuki   | Kisumu
Mutiso     | Nzomo     | Nakuru
Naliaka    | Wekesa    | Eldoret
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  2. &lt;code&gt;WHERE&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Completed'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Clause Tip&lt;/strong&gt;: &lt;code&gt;WHERE&lt;/code&gt; filters records based on specified conditions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;order_id | customer_id | order_date | order_amount | status
---------|-------------|------------|--------------|----------
1        | 1           | 2023-03-01 | 5000.00      | Completed
2        | 2           | 2023-03-02 | 7500.00      | Completed
4        | 1           | 2023-03-10 | 2000.00      | Completed
6        | 5           | 2023-04-14 | 4500.00      | Completed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
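
&lt;p&gt;Conditions can also be combined with &lt;code&gt;AND&lt;/code&gt; or &lt;code&gt;OR&lt;/code&gt;. As a small sketch against the sample &lt;code&gt;orders&lt;/code&gt; data from Step 4 (the 5000 threshold is just an illustrative choice), this keeps only completed orders worth at least 5000:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Both conditions must hold for a row to be returned.
SELECT order_id, order_amount, status
FROM orders
WHERE status = 'Completed'
  AND order_amount &amp;gt;= 5000;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the sample rows, only orders 1 (5000.00) and 2 (7500.00) satisfy both conditions.&lt;/p&gt;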






&lt;h3&gt;
  
  
  3. &lt;code&gt;GROUP BY&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_customers&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Clause Tip&lt;/strong&gt;: &lt;code&gt;GROUP BY&lt;/code&gt; aggregates data by one or more columns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;city     | total_customers
---------|------------------
Nairobi  | 1
Mombasa  | 1
Kisumu   | 1
Nakuru   | 1
Eldoret  | 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
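
&lt;p&gt;&lt;code&gt;GROUP BY&lt;/code&gt; is most useful together with aggregate functions. The sketch below, which assumes the sample &lt;code&gt;orders&lt;/code&gt; data from Step 4, uses &lt;code&gt;COUNT&lt;/code&gt; and &lt;code&gt;SUM&lt;/code&gt; to summarise orders per status; the column aliases are illustrative names.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- One output row per distinct status value.
SELECT status,
       COUNT(*)          AS total_orders,
       SUM(order_amount) AS total_amount
FROM orders
GROUP BY status;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For the sample rows this gives 4 completed orders totalling 19000.00, 1 pending order of 3000.00, and 1 cancelled order of 10000.00. Without an &lt;code&gt;ORDER BY&lt;/code&gt;, the order of the groups is not guaranteed.&lt;/p&gt;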






&lt;h3&gt;
  
  
  4. &lt;code&gt;HAVING&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_orders&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="k"&gt;HAVING&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Clause Tip&lt;/strong&gt;: &lt;code&gt;HAVING&lt;/code&gt; filters groups created by &lt;code&gt;GROUP BY&lt;/code&gt;, unlike &lt;code&gt;WHERE&lt;/code&gt; which filters rows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;customer_id | total_orders
-------------|--------------
1            | 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  5. &lt;code&gt;ORDER BY&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;order_amount&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Clause Tip&lt;/strong&gt;: &lt;code&gt;ORDER BY&lt;/code&gt; sorts results by one or more columns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;order_id | customer_id | order_amount | status
---------|-------------|--------------|---------
5        | 4           | 10000.00     | Cancelled
2        | 2           | 7500.00      | Completed
1        | 1           | 5000.00      | Completed
6        | 5           | 4500.00      | Completed
3        | 3           | 3000.00      | Pending
4        | 1           | 2000.00      | Completed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
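
&lt;p&gt;Sorting by more than one column works the same way: later columns break ties in earlier ones. A minimal sketch, again assuming the sample &lt;code&gt;orders&lt;/code&gt; data from Step 4:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Sort by customer first, then largest order first within each customer.
SELECT order_id, customer_id, order_amount
FROM orders
ORDER BY customer_id ASC, order_amount DESC;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Customer 1 is the only customer with two orders, so their rows come first, with the 5000.00 order listed before the 2000.00 one.&lt;/p&gt;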






&lt;h3&gt;
  
  
  6. &lt;code&gt;LIMIT&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;registration_date&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt; &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Clause Tip&lt;/strong&gt;: &lt;code&gt;LIMIT&lt;/code&gt; restricts the number of rows returned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;first_name | registration_date
-----------|-------------------
Naliaka    | 2023-04-01
Mutiso     | 2023-03-18
Wanjiru    | 2023-03-05
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
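
&lt;p&gt;The clauses covered in this step can also be combined in a single query. The following is a sketch rather than a recipe: it assumes the sample &lt;code&gt;orders&lt;/code&gt; data from Step 4, and the alias &lt;code&gt;total_spent&lt;/code&gt; and the 5000 threshold are illustrative choices. It finds the customers whose completed orders add up to more than 5000 and lists the top two spenders.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT customer_id,
       SUM(order_amount) AS total_spent
FROM orders
WHERE status = 'Completed'           -- filter rows before grouping
GROUP BY customer_id                 -- one group per customer
HAVING SUM(order_amount) &amp;gt; 5000   -- filter groups after aggregation
ORDER BY total_spent DESC            -- biggest spenders first
LIMIT 2;                             -- keep only the top two
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the sample rows, customer 2 comes first with 7500.00, followed by customer 1 with 7000.00 (5000.00 + 2000.00). Customer 5 is dropped by &lt;code&gt;HAVING&lt;/code&gt; because their completed total is only 4500.00.&lt;/p&gt;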






&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;SQL is used across many industries and is the backbone of the data industry. Anyone looking to work with data will therefore need to learn SQL.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>sql</category>
      <category>beginners</category>
    </item>
    <item>
      <title>How Are Data Science, Data Analytics, And AI Transforming Industries?</title>
      <dc:creator>Emmanuel Kiriinya</dc:creator>
      <pubDate>Wed, 23 Apr 2025 19:52:04 +0000</pubDate>
      <link>https://forem.com/emmanuel_kiriinya/how-are-data-science-data-analytics-and-ai-transforming-industries-l0p</link>
      <guid>https://forem.com/emmanuel_kiriinya/how-are-data-science-data-analytics-and-ai-transforming-industries-l0p</guid>
      <description>&lt;h2&gt;
  
  
  Are We Living in the Age of Data-Driven Transformation?
&lt;/h2&gt;

&lt;p&gt;In today's internet era, the sheer volume of data generated every day is staggering. From the moment we unlock our smartphones or count steps on a smartwatch to the instant an online transaction is completed, even one worth as little as one Kenyan Shilling, data is being produced, stored, and, more importantly, analyzed. The convergence of Data Science, Analytics, and Artificial Intelligence (AI) isn't only changing how companies operate; it is redefining entire industries.&lt;br&gt;
One might ask: beneath the surface of this data revolution, what’s truly happening? How are organizations turning raw data into refined intelligence and 'prophecy'? And what does this mean for the future of work, business, and innovation? In this article, I will explore some of the possibilities that data has opened up, many of which could not have been imagined a decade ago.&lt;/p&gt;

&lt;h2&gt;
  
  
  How data science has helped in seeing what humans can’t.
&lt;/h2&gt;

&lt;p&gt;The core idea behind data science is finding patterns where others see chaos. Today, businesses operate in complex, competitive environments where every decision made counts. With the development of machine learning algorithms and predictive analytics, data science has helped organizations identify insights that would otherwise have gone unnoticed.&lt;/p&gt;

&lt;p&gt;Consider the financial industry. Hedge funds and trading platforms now use sophisticated algorithms to predict market movements, identify arbitrage opportunities, and assess credit risk more accurately than before, reducing the errors that come with manual calculation. What once took days of manual analysis can now be done in minutes or seconds, depending on the amount of data. And it's not just about speed but also precision.&lt;/p&gt;

&lt;p&gt;Another good example is the healthcare sector. By analyzing patient records, medical literature, and real-time health data, AI-powered systems can predict disease outbreaks, identify early signs of chronic illness, and recommend personalized treatment plans. A human doctor backed by data-driven insights is no longer science fiction; it’s science fact.&lt;/p&gt;

&lt;h2&gt;
  
  
  Is AI the New Backbone of Industry?
&lt;/h2&gt;

&lt;p&gt;Artificial Intelligence is no longer a futuristic concept—it’s the backbone of many modern enterprises. Whether it's Natural Language Processing (NLP) enabling chatbots to understand human queries, or Computer Vision allowing machines to detect defects in a manufacturing line, AI is everywhere.&lt;/p&gt;

&lt;p&gt;In retail, AI is revolutionizing the customer experience. Recommendation engines, powered by deep learning, suggest products based on browsing history, past purchases, and even mood inferred from recent activity. Visual search tools allow customers to upload images and instantly find matching products. The entire shopping experience is becoming more intuitive, efficient, and personalized.&lt;/p&gt;

&lt;p&gt;Meanwhile, logistics and supply chain management are undergoing dramatic transformation. AI helps forecast demand, optimize routes in real time, and predict potential disruptions due to weather, geopolitical issues, or pandemics. The result? Faster delivery times, reduced operational costs, and enhanced customer satisfaction.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Are Traditional Industries Being Reimagined?
&lt;/h2&gt;

&lt;p&gt;Industries traditionally resistant to change are now embracing data and AI to remain competitive. Agriculture, for example, is turning to AI-powered drones and sensors to monitor soil health, crop growth, and pest activity. These insights have enabled precision farming, where every drop of water or ounce of fertilizer is optimized.&lt;/p&gt;

&lt;p&gt;Construction firms are leveraging predictive analytics to estimate project timelines, costs, and risks more accurately. AI-driven safety monitoring tools use video analytics to identify unsafe behavior on job sites, significantly reducing workplace accidents.&lt;/p&gt;

&lt;p&gt;Even the legal profession—long associated with piles of paperwork and manual research—is being reshaped. Legal AI platforms now scan thousands of legal documents in seconds, extracting relevant case laws and identifying inconsistencies. This allows legal teams to focus on strategy rather than data gathering.&lt;/p&gt;

&lt;h2&gt;
  
  
  Can Data Analytics Predict the Future?
&lt;/h2&gt;

&lt;p&gt;One of the most powerful aspects of analytics is its predictive capability. By identifying patterns and trends, analytics can forecast outcomes with remarkable accuracy. In the energy sector, predictive analytics is used to manage grid loads, anticipate maintenance needs for equipment, and even optimize energy consumption patterns for entire cities.&lt;/p&gt;

&lt;p&gt;Marketing teams across industries rely heavily on predictive analytics to understand consumer behavior. By analyzing demographics, transaction histories, and social media interactions, marketers can tailor campaigns that reach the right audience at the right time, boosting engagement and conversions.&lt;/p&gt;

&lt;p&gt;In the public sector, predictive policing is a controversial but growing field. By analyzing crime data, police departments can deploy resources more efficiently and potentially prevent crimes before they happen. While it raises ethical questions, it undeniably showcases the power of data-driven decision-making.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Happens When Humans and Machines Collaborate?
&lt;/h2&gt;

&lt;p&gt;The idea that AI will replace humans is a common fear, but the more likely scenario is collaboration, not competition. In many cases, AI augments human capabilities rather than replacing them. Surgeons use robotic systems to enhance precision during operations. Journalists rely on AI tools to transcribe interviews and even generate first drafts of news articles. Financial analysts use machine learning models to validate their projections.&lt;/p&gt;

&lt;p&gt;This fusion of human intuition and machine intelligence often produces superior outcomes. It enables professionals to focus on high-value, creative tasks while offloading repetitive or data-heavy work to algorithms.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Importance of Ethics in AI
&lt;/h2&gt;

&lt;p&gt;As AI and analytics become more pervasive, ethical considerations have taken center stage. How do we ensure that AI systems stay fair, transparent, and free from bias? How do we protect consumer data from misuse or breaches? And who is accountable when an AI-driven decision causes harm?&lt;/p&gt;

&lt;p&gt;Responsible AI development involves more than just technical accuracy. It requires diverse datasets, transparent algorithms, regular audits, and inclusive development teams. Companies are now establishing AI ethics boards and investing in explainability tools to ensure that AI decisions can be understood and challenged.&lt;/p&gt;

&lt;p&gt;Data privacy is another critical concern. With regulations like GDPR and CCPA in place, businesses must be more transparent about how they collect, store, and use data. Ethical data science practices are not just good PR; they're a business imperative.&lt;/p&gt;

&lt;h2&gt;
  
  
  Are We Ready for the Next Frontier as a Country?
&lt;/h2&gt;

&lt;p&gt;The evolution of data science, analytics, and AI is accelerating. We’re entering an era where edge computing, quantum analytics, and generative AI could redefine what’s possible.&lt;br&gt;
Imagine smart cities where traffic lights adapt to real-time congestion, energy systems self-optimize based on demand, and emergency services are dispatched using predictive modeling. Envision businesses that use digital twins to simulate every aspect of their operations before implementing real-world changes. Or picture personalized education systems where AI tailors content to each student’s learning style and pace.&lt;br&gt;
These aren’t distant dreams. They are already taking shape in labs, startups, and innovation hubs around the world. Let's hope Kenya's 2025 AI plan is implemented so that our country is not left behind in this revolution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion: Are You Embracing the Change or Watching From the Sidelines?
&lt;/h2&gt;

&lt;p&gt;The impact of Data Science, Analytics, and AI is profound and irreversible. Organizations that are leveraging these tools effectively are not just surviving; they are thriving. They make faster decisions, operate more efficiently, and offer better experiences to customers and employees alike.&lt;/p&gt;

&lt;p&gt;The question is no longer whether data-driven technologies will reshape the world. The real question is, will your organization be ready when they do?&lt;/p&gt;

&lt;p&gt;Now is the best time to invest in data skills, infrastructure, and strategy. It’s time to foster a culture that values data-driven thinking and embraces continuous learning. Because in the age of intelligent automation and predictive insights, those who adapt will lead, and those who don’t will surely be left behind.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Microsoft Excel: The Underrated Backbone of Data Analytics</title>
      <dc:creator>Emmanuel Kiriinya</dc:creator>
      <pubDate>Mon, 14 Apr 2025 08:13:05 +0000</pubDate>
      <link>https://forem.com/emmanuel_kiriinya/microsoft-excel-the-underrated-backbone-of-data-analytics-5953</link>
      <guid>https://forem.com/emmanuel_kiriinya/microsoft-excel-the-underrated-backbone-of-data-analytics-5953</guid>
      <description>&lt;p&gt;Daily, it is estimated that the world produces approximately 5 billion gigabytes of data. This data has to be processed using data analytics tools to deduce useful insights. In the age of data science and data analytics, where Python, R, and SQL often steal the spotlight, Microsoft Excel might seem like a relic of the past and an outdated tool. Yet, Excel remains one of the most powerful, versatile, and widely used tools in data analytics. While many seasoned data professionals migrate to more advanced platforms, it's important to acknowledge one undeniable fact: most data analysts began their journey with Excel.&lt;/p&gt;

&lt;p&gt;Contrary to what many people assume, Microsoft Excel isn't just spreadsheet software for data entry and record keeping. It is a full-fledged analytical tool that, in skilled hands, can deliver deep insights, automate processes, and power interactive dashboards, all without writing a single line of code, just functions and formulas.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Gateway Tool: Why Excel is the First Step for Many
&lt;/h2&gt;

&lt;p&gt;If you ask any group of data analysts how they started, a majority will say Excel. It is approachable, often pre-installed on Windows-based laptops and PCs, and doesn't require technical setup or coding knowledge. Whether it’s a financial report, inventory management, a marketing dashboard, or a patient records analysis in healthcare, Excel often serves as the default environment for data manipulation for most beginners.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Most professionals who develop skills in data analytics/data science likely started in Excel. It is typically the first tool we are exposed to to manage data and develop reporting and insight from the data.&lt;br&gt;
 – &lt;em&gt;Don Tomoff, Data Analytics Advisor&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What Makes Excel Powerful for Data Analytics?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Data Exploration and Cleaning Made Easy&lt;/em&gt;&lt;/strong&gt;
From simple filters to Power Query, Excel allows analysts to clean and transform data easily. Tasks that would require multiple lines of Python or R can often be completed in minutes with Excel’s graphical user interface.
To clean data in Excel, follow these steps:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt; Identify and remove duplicates: Use the Remove Duplicates feature to eliminate duplicate entries.&lt;/li&gt;
&lt;li&gt; Handle missing values: Use &lt;code&gt;IFERROR&lt;/code&gt; to replace error values, and &lt;code&gt;IF&lt;/code&gt; combined with &lt;code&gt;ISBLANK&lt;/code&gt; to replace empty cells, with a specific value or a calculated result.&lt;/li&gt;
&lt;li&gt; Standardize data formats: Use the Text to Columns feature to convert text to a standard format for dates, numbers, and other data types.&lt;/li&gt;
&lt;li&gt; Remove unnecessary characters: Use the &lt;code&gt;TRIM&lt;/code&gt; and &lt;code&gt;SUBSTITUTE&lt;/code&gt; functions to remove leading/trailing spaces and unwanted characters.&lt;/li&gt;
&lt;li&gt; Validate data: Use formulas and conditional formatting to identify invalid or inconsistent data.&lt;/li&gt;
&lt;/ul&gt;
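
&lt;p&gt;For comparison, the cleaning steps above can be sketched in plain Python. This is a minimal illustration rather than a recommended pipeline: the sample rows, the column names, and the &lt;code&gt;clean&lt;/code&gt; helper are all assumptions made up for this example.&lt;/p&gt;

```python
# Hypothetical sample rows; the column names are assumptions for illustration.
rows = [
    {"name": "  Alice ", "dept": "HR", "age": "34"},
    {"name": "Bob", "dept": "IT", "age": ""},
    {"name": "  Alice ", "dept": "HR", "age": "34"},     # exact duplicate
    {"name": "Carol", "dept": "Finance", "age": "abc"},  # invalid age
]


def clean(rows, default_age="0"):
    """Trim text, fill blanks, flag invalid ages, and drop duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # Remove leading/trailing spaces (similar to TRIM).
        record = {key: value.strip() for key, value in row.items()}
        # Replace a blank cell with a default (similar to IF + ISBLANK).
        if record["age"] == "":
            record["age"] = default_age
        # Flag values that are not whole numbers (similar to a validation rule).
        record["valid"] = record["age"].isdigit()
        # Skip rows that were already seen (similar to Remove Duplicates).
        key = tuple(sorted(record.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


cleaned_rows = clean(rows)
for record in cleaned_rows:
    print(record)
```

&lt;p&gt;Each step maps to one of the Excel features listed above, which gives a feel for how much the graphical interface does with a few clicks.&lt;/p&gt;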

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;PivotTables: A Non-Technical Data Engine&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
With just a few clicks, users can summarize thousands of rows of data, calculate aggregates, and uncover trends with no programming knowledge required. Below is a sample of pivot tables from an HR dataset.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu6kl3g2twskrvcbr37pn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu6kl3g2twskrvcbr37pn.png" alt="Screenshot of a spreadsheet with different pivot tables" width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Formula and Function Ecosystem&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Excel’s function library rivals many scripting languages. Conditional logic (IF, IFS, AND, OR), lookups (VLOOKUP, HLOOKUP, INDEX + MATCH), and new dynamic array functions (FILTER, UNIQUE) make data wrangling both powerful and readable.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Power Pivot and DAX&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
For users working with multiple tables or large datasets, Power Pivot and DAX (Data Analysis Expressions) enable relational modeling and advanced calculations — similar to what you’d find in SQL or Power BI.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;5. &lt;strong&gt;&lt;em&gt;Dashboards and Visualization&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Slicers, pivot tables, and pivot charts allow users to create dynamic dashboards. These can be shared directly in Excel files or embedded in reports, making insights accessible and easy for stakeholders to understand.&lt;/p&gt;
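
&lt;p&gt;To make the PivotTable idea from item 2 concrete, here is a rough Python sketch of what a simple pivot does behind the scenes: group rows by one field and aggregate another. The records, field names, and the &lt;code&gt;pivot&lt;/code&gt; helper are hypothetical and only illustrate the concept; Excel's own engine is far more capable.&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical HR records; field names and values are made up for illustration.
employees = [
    {"department": "HR", "salary": 50000},
    {"department": "IT", "salary": 70000},
    {"department": "IT", "salary": 90000},
    {"department": "Finance", "salary": 65000},
]


def pivot(records, row_field, value_field):
    """Group records by row_field and report count, sum, and average of value_field."""
    groups = defaultdict(list)
    for record in records:
        groups[record[row_field]].append(record[value_field])
    return {
        key: {
            "count": len(values),
            "sum": sum(values),
            "average": sum(values) / len(values),
        }
        for key, values in groups.items()
    }


summary = pivot(employees, "department", "salary")
for department, stats in sorted(summary.items()):
    print(department, stats)
```

&lt;p&gt;In Excel, the same summary comes from dragging one field to Rows and another to Values.&lt;/p&gt;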

&lt;h2&gt;
  
  
  Excel Meets Artificial Intelligence
&lt;/h2&gt;

&lt;p&gt;As AI takes the world by storm, with Microsoft among the companies driving the shift, Excel is undergoing a major transformation powered by AI, machine learning, and natural language processing.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Excel Copilot (Microsoft Copilot for Excel)&lt;/em&gt;&lt;/strong&gt;
Microsoft Copilot is a game-changer for productivity. It integrates generative AI directly into Excel, enabling users to:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Ask natural language questions like: “What are the top-selling regions in Q1?”&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Automatically generate formulas, charts, and summaries based on your dataset.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Translate technical formulas into plain English (and vice versa).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example use case: instead of writing &lt;code&gt;=IF(AND(A2&amp;gt;12, A2&amp;lt;=19), "Teenager", "")&lt;/code&gt;, you can simply type:&lt;br&gt;
“Label all patients aged between 13 and 19 as Teenagers.”&lt;br&gt;
Copilot will handle the logic and apply it.&lt;/p&gt;
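
&lt;p&gt;Because generated formulas can be wrong at the edges, it helps to spot-check boundary values. The sketch below restates the rule from the example formula in Python and tests the ages on either side of each cut-off. The &lt;code&gt;label_age&lt;/code&gt; function is a hypothetical stand-in for the formula, not something Copilot produces.&lt;/p&gt;

```python
def label_age(age):
    """Return "Teenager" for ages above 12 and up to 19, else an empty string."""
    return "Teenager" if 19 >= age > 12 else ""


# Boundary values are where a generated rule is most likely to be off by one.
checks = {12: "", 13: "Teenager", 19: "Teenager", 20: ""}
for age, expected in checks.items():
    assert label_age(age) == expected
print("all boundary checks passed")
```

&lt;p&gt;The same spot-check works directly in a sheet: type 12, 13, 19, and 20 into a few cells and confirm the labels match.&lt;/p&gt;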

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Suggested Patterns &amp;amp; Smart Fill&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Excel now intelligently recognizes repetitive patterns and suggests completions (beyond Flash Fill), using machine learning to spot logic across columns.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Data Insights with Natural Language Queries&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Analyze Data in Excel lets you ask questions about your data in natural language instead of writing complicated formulas. It also provides high-level visual summaries, trends, and patterns, and it can generate pivot tables or charts in response to plain language prompts. This makes exploratory data analysis more accessible to non-technical users.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;4. &lt;strong&gt;&lt;em&gt;Deep Integration with Power BI and Power Platform&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Excel can now serve as a front-end interface to Power BI datasets and Power Apps. You can pull live data, manipulate it with Excel’s familiar tools, and sync changes back to a shared workspace — bridging self-service analytics and enterprise BI.&lt;/p&gt;

&lt;p&gt;5. &lt;strong&gt;&lt;em&gt;Introduction of Add-Ins/Extensions&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Excel add-ins are like extra tools that help automate tasks, analyze data better, and even connect to other apps or databases. They’re great for saving time, especially with repetitive work or complex analysis that regular formulas can’t handle easily. Plus, they make Excel more user-friendly with custom buttons and features you can reuse across different files or share with your team.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fne56blq4s9i8idjs1q1x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fne56blq4s9i8idjs1q1x.png" alt="A screenshot showing the Add-ins button" width="800" height="449"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Real-World Relevance&lt;/strong&gt;&lt;br&gt;
From hospital administrators tracking diagnoses to small businesses analyzing monthly sales, Excel continues to empower decision-makers in diverse fields. For many professionals, especially in resource-constrained environments, Excel is not just the easiest tool; it’s the only one.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Limitations of Excel
&lt;/h2&gt;

&lt;p&gt;Despite its power, Excel has real limitations:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Scalability&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Excel begins to struggle with performance when handling very large datasets (a worksheet holds at most 1,048,576 rows) or when complex formulas/functions are overused.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Version Compatibility&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Some newer features (like Copilot or dynamic array functions) require a recent version of Excel or a Microsoft 365 subscription, which is not available to all users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Risk of Human Error&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Without strict data governance, Excel sheets can become error-prone, difficult to audit, and risky for enterprise-critical tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Limited Reproducibility&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Analytical workflows in Excel are harder to version control and automate compared to scripting languages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tips for Maximizing Excel's Analytical Capabilities&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Use Named Ranges and Tables for dynamic referencing.&lt;/li&gt;
&lt;li&gt;Learn Power Query for repeatable data transformation.&lt;/li&gt;
&lt;li&gt;Explore Power Pivot + DAX to handle large and relational datasets.&lt;/li&gt;
&lt;li&gt;Take advantage of Analyze Data and Copilot if available, but be aware that AI is sometimes error-prone.&lt;/li&gt;
&lt;li&gt;Use INDEX + MATCH instead of VLOOKUP and HLOOKUP for flexibility and, often, performance. VLOOKUP searches only the first column of a range and HLOOKUP only the first row, while INDEX + MATCH can look up values in any column or row.&lt;/li&gt;
&lt;/ol&gt;
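
&lt;p&gt;The INDEX + MATCH tip is easier to see with a small model of what the two functions do. In this hedged Python sketch, the &lt;code&gt;match&lt;/code&gt; and &lt;code&gt;index&lt;/code&gt; helpers and the sample columns are assumptions that only mimic the behaviour of the Excel functions for exact matches.&lt;/p&gt;

```python
# Hypothetical columns; in a sheet these would be three adjacent ranges.
names = ["Alice", "Bob", "Carol"]
employee_ids = ["E01", "E02", "E03"]
departments = ["HR", "IT", "Finance"]


def match(lookup_value, lookup_column):
    """Return the 0-based position of an exact match (Excel's MATCH is 1-based)."""
    return lookup_column.index(lookup_value)


def index(result_column, position):
    """Return the value stored at a position in a column (similar to INDEX)."""
    return result_column[position]


# Look up by ID and return the name, which sits to the LEFT of the ID column.
# VLOOKUP cannot do this because it only searches the first column of its range.
name = index(names, match("E02", employee_ids))
department = index(departments, match("E02", employee_ids))
print(name, department)
```

&lt;p&gt;Because the search column and the result column are passed separately, the result can sit on either side of the search column.&lt;/p&gt;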

&lt;p&gt;&lt;strong&gt;When to Supplement Excel with Other Tools&lt;/strong&gt;&lt;br&gt;
Excel is great for prototyping, EDA (exploratory data analysis), and reporting. But for production analytics pipelines or real-time dashboards, pair it with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SQL for scalable querying and joins&lt;/li&gt;
&lt;li&gt;Python/R for advanced statistical modeling and automation&lt;/li&gt;
&lt;li&gt;Power BI/Tableau for interactive visualizations&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion: Excel Isn’t Going Anywhere, It’s Evolving
&lt;/h2&gt;

&lt;p&gt;Microsoft Excel has stood the test of time by constantly adapting to the needs of its users. With the integration of AI-powered tools like Copilot, Excel is no longer just a spreadsheet; it’s a smart analytics assistant.&lt;/p&gt;

&lt;p&gt;Whether you’re a beginner working on your first dataset or an advanced user building dynamic dashboards, Excel deserves its place in the data analytics toolbox. It's fast, powerful, and deeply embedded in business workflows. And now, with the rise of AI, it’s smarter than ever.&lt;/p&gt;

</description>
      <category>msexcel</category>
      <category>analyst</category>
      <category>dataanalysis</category>
      <category>data</category>
    </item>
  </channel>
</rss>
