<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: John</title>
    <description>The latest articles on Forem by John (@john-maina).</description>
    <link>https://forem.com/john-maina</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1029063%2F258d3606-3513-438a-a3f5-284bd72143e9.jpeg</url>
      <title>Forem: John</title>
      <link>https://forem.com/john-maina</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/john-maina"/>
    <language>en</language>
    <item>
      <title>Unveiling the Future: Staying Relevant in the Ever-Evolving World of Technical Writing</title>
      <dc:creator>John</dc:creator>
      <pubDate>Wed, 05 Jul 2023 20:55:17 +0000</pubDate>
      <link>https://forem.com/john-maina/unveiling-the-future-staying-relevant-in-the-ever-evolving-world-of-technical-writing-3gp9</link>
      <guid>https://forem.com/john-maina/unveiling-the-future-staying-relevant-in-the-ever-evolving-world-of-technical-writing-3gp9</guid>
      <description>&lt;p&gt;My colleague and I were fascinated by the emergence of Artificial Intelligence (AI) tools that are revolutionizing technical writing with their capabilities. It struck us how much the field had changed and where it was heading. We realized that the importance of staying up to date with emerging trends in technical writing cannot be overstated.&lt;/p&gt;

&lt;h2&gt;
  
  
  Importance of staying updated with emerging trends
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;h3&gt;
  
  
  Staying relevant within the industry
&lt;/h3&gt;

&lt;p&gt;Most of the trends in technical writing are fueled by AI. The emergence of AI tools took the world by storm and raised concerns about whether AI would replace people in the job market. &lt;br&gt;
Most industry experts speculated that AI would not replace people; rather, people who use AI would replace those who do not. &lt;br&gt;
Staying updated with emerging trends enables people to remain relevant within their respective industries.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;h3&gt;
  
  
  Stay up to date on relevant topics
&lt;/h3&gt;

&lt;p&gt;Technical writers need to stay up to date with the topics they write about. This allows them to produce more relevant and sought-after content.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;h3&gt;
  
  
  Improving User Experience (UX)
&lt;/h3&gt;

&lt;p&gt;The goal of most production activities is to deliver value to the end user of the product. For technical writers, the reader's experience is essential. Emerging trends that improve the user experience increase the value the end user receives.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;h3&gt;
  
  
  Automating writing processes
&lt;/h3&gt;

&lt;p&gt;AI has given technical writers tools for researching and editing their content. These tools make both tasks less time-consuming than before. Writers can use the time saved to focus on other aspects of their craft.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;h3&gt;
  
  
  Attracting attention to content
&lt;/h3&gt;

&lt;p&gt;New technology sparks conversations among stakeholders. Take the Apple Vision Pro, an AR/VR product by Apple. The ground-breaking technology behind it has generated a lot of interest. By staying updated with such emerging trends and writing about them, a writer can attract interested readers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;h3&gt;
  
  
  Positioning for the coming trends
&lt;/h3&gt;

&lt;p&gt;As a popular saying goes, &lt;em&gt;Become a student of change, for it is the only thing that remains constant.&lt;/em&gt; New trends in technology are bound to come up. &lt;br&gt;
By staying up to date with current trends, technical writers position themselves to leverage new tools, since new technology will most likely build on existing technology.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Impact of Emerging technology
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Emerging technologies introduce new and often more efficient ways to handle existing tasks. This drives the transformation of industries and economies at large.&lt;/li&gt;
&lt;li&gt;With new technology come new opportunities. Contrary to the popular belief that technology takes roles from people, I believe that technology creates new opportunities that we can embrace and leverage.&lt;/li&gt;
&lt;li&gt;By automating many activities in the technical writing process, creators can focus on the more creative and innovative aspects of their craft.&lt;/li&gt;
&lt;li&gt;New content formats. Emerging technology has introduced new content formats that reduce monotony in writing. A technical writer can choose from a variety of markdown styles as well as blog formats.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Impact of evolving industry standards
&lt;/h3&gt;

&lt;p&gt;Technical writing has evolved as a result of changing industry standards in the following ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;em&gt;Improved User Experience.&lt;/em&gt; Emerging trends have brought about more creative and engaging user interfaces. These add to the value a reader gets from interacting with technical content.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;New research methods.&lt;/em&gt; Trends in technology have brought forth AI tools like OpenAI's ChatGPT. These give technical writers more avenues for looking up information and comparing it with findings from other sources, which can improve the credibility of the information given to end users.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Automating some practices.&lt;/em&gt; New technology has seen the rise of tools like &lt;a href="https://www.grammarly.com/"&gt;Grammarly&lt;/a&gt; and &lt;a href="https://quillbot.com/"&gt;QuillBot&lt;/a&gt;. These tools have revolutionized editing practices and improved the quality of technical writing content.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;A wide range of topics to write on.&lt;/em&gt; New technology brings new opportunities: with the rise of AI, writers can cover AI and other emerging technologies that end users are interested in reading about. &lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>writing</category>
      <category>beginners</category>
      <category>ai</category>
      <category>markdown</category>
    </item>
    <item>
      <title>Cracking the Code: A Data Beginner's Guide to Python Programming</title>
      <dc:creator>John</dc:creator>
      <pubDate>Tue, 04 Jul 2023 20:28:41 +0000</pubDate>
      <link>https://forem.com/john-maina/cracking-the-code-a-data-beginners-guide-to-python-programming-55f3</link>
      <guid>https://forem.com/john-maina/cracking-the-code-a-data-beginners-guide-to-python-programming-55f3</guid>
      <description>&lt;p&gt;Python has a variety of applications and use cases. Its versatile nature allows it to be used for web development, data analysis, machine learning, and artificial intelligence. Data scientists can leverage the capabilities of Python to read, clean, visualize and analyze data. Moreover, Python offers a convenient environment for training and deploying machine learning models. &lt;/p&gt;

&lt;h2&gt;
  
  
  Data visualization with Matplotlib
&lt;/h2&gt;

&lt;p&gt;A picture is worth a thousand words. Data professionals visualize data to identify trends and communicate findings. Python has powerful packages for data visualization, among them Matplotlib. It uses the data provided to create desired plots that allow the data to tell its own story.&lt;br&gt;
Matplotlib has a submodule called &lt;em&gt;pyplot&lt;/em&gt;. To work with pyplot, import it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import matplotlib.pyplot as plt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Matplotlib can create many kinds of visualizations, for example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Line Plot&lt;/em&gt;: Plots data points and connects them with lines to visualize trends and relationships over a continuous variable.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plt.plot(x_values, y_values)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Scatter Plot&lt;/em&gt;: Displays individual data points as dots to observe patterns or relationships between two variables.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plt.scatter(x_values, y_values)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Bar Chart&lt;/em&gt;: Represents categorical data using rectangular bars to compare values across different categories.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plt.bar(x_values, y_values)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Histogram&lt;/em&gt;: Shows the distribution of a continuous variable by dividing its range into intervals called bins and displaying the count of data points within each bin. The height of each bar is determined by the count of data points within that bin.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plt.hist(data, no_of_bins)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Pie Chart&lt;/em&gt;: Displays proportions of different categories as sectors of a circle. It is useful for representing parts of a whole.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plt.pie(data, labels=labels)  # labels: names of the categories
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Box Plot&lt;/em&gt;: Illustrates summary statistics, such as median, quartiles, and outliers, of a numerical variable to understand its distribution and identify potential outliers.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plt.boxplot(data)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Heatmap&lt;/em&gt;: Visualizes a matrix of data using colors to represent values. It is often used for correlation matrices or for showing patterns in two-dimensional data.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plt.imshow(data, cmap='viridis')  # cmap: any Matplotlib colormap name
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Area Plot&lt;/em&gt;: Depicts the cumulative values of multiple variables over time, where the area between the lines represents the cumulative sum.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plt.fill_between(x_values, y_values1, y_values2)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Horizontal Bar Chart&lt;/em&gt;: Similar to a bar chart but with the bars plotted horizontally, useful for comparing values across different categories.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plt.barh(y_values, x_values)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Violin Plot&lt;/em&gt;: Combines a box plot and a kernel density plot to display the distribution of a variable, providing information about both central tendency and density. It is also useful for spotting outliers.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plt.violinplot(data)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Each visualization can be formatted to include:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Axis labels&lt;/li&gt;
&lt;li&gt;Chart name/title&lt;/li&gt;
&lt;li&gt;Data labels&lt;/li&gt;
&lt;li&gt;Grid lines&lt;/li&gt;
&lt;li&gt;Legend&lt;/li&gt;
&lt;li&gt;Trendline&lt;/li&gt;
&lt;li&gt;Error bars&lt;/li&gt;
&lt;/ul&gt;
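&lt;p&gt;As a rough illustration of several of these formatting options, here is a minimal sketch that adds axis labels, a title, grid lines and a legend to a line plot. The data, the labels and the output file name are made up for the example, and the Agg backend is selected only so that the script also runs on a machine without a display.&lt;/p&gt;

```python
import matplotlib

matplotlib.use("Agg")  # headless backend (assumption: no display is needed)
import matplotlib.pyplot as plt

# Hypothetical sample data, for illustration only
months = [1, 2, 3, 4, 5, 6]
sales = [10, 12, 9, 15, 18, 21]

plt.plot(months, sales, marker="o", label="Monthly sales")
plt.xlabel("Month")                 # axis labels
plt.ylabel("Units sold")
plt.title("Sales per month")        # chart title
plt.grid(True)                      # grid lines
plt.legend()                        # legend built from the label= argument
plt.savefig("sales_per_month.png")  # write the figure to an image file
```

&lt;p&gt;In an interactive session you would typically call plt.show() instead of saving the figure to a file.&lt;/p&gt;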
&lt;h2&gt;
  
  
  Common Data Structures in Python
&lt;/h2&gt;

&lt;p&gt;A data structure is a format for organizing and storing data, usually chosen for efficient access to that data. &lt;br&gt;
This systematic organization allows the data to be managed efficiently in the computer's memory.&lt;br&gt;
You can read the identity of the object a variable refers to (in CPython, its memory address) by passing the variable name to the built-in &lt;code&gt;id()&lt;/code&gt; function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(id(name_of_variable))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;A variable is a named storage location that can hold a value&lt;/em&gt;&lt;br&gt;
Common Data Structures in Python include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lists&lt;/li&gt;
&lt;li&gt;Tuples&lt;/li&gt;
&lt;li&gt;Dictionaries&lt;/li&gt;
&lt;li&gt;Sets&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Lists
&lt;/h3&gt;

&lt;p&gt;Say I initialize a variable x with a list of integers enclosed in square brackets:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;x = [1, 2, 3, 4, 5, 6]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;I can check the data type of the variable x, which is determined by the value it holds, in this case, integers enclosed in square brackets:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Use a name other than "type" so the built-in type() is not overwritten
x_type = type(x)
print(x_type)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;The output shows that x is a "list".&lt;/em&gt; &lt;br&gt;
By initializing the variable x, we have created an object of the class "list". &lt;br&gt;
Different data structures offer different operations and capabilities: &lt;em&gt;class determines behavior.&lt;/em&gt; &lt;br&gt;
A list is a mutable data structure. This means that it supports the following operations:&lt;/p&gt;
&lt;h5&gt;
  
  
  - Appending: adding items to the list
&lt;/h5&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;x.append(the_value_to_be_added)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h5&gt;
  
  
  - Replacing: replacing items on a list with other items
&lt;/h5&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;x[2] = 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This replaces the item at index 2 (the third item, since indexing starts at 0) with 5.&lt;/p&gt;
&lt;h5&gt;
  
  
  - Removing: removing items from a list
&lt;/h5&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;x.remove(value_to_remove)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The ability to perform these operations without having to create a new list is called mutability.&lt;br&gt;
Lists can hold values of different data types, e.g. strings, integers, floats, booleans and other lists.&lt;/p&gt;
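&lt;p&gt;To see mutability in action, the sketch below applies the three operations to the same list and uses id() to check that the list object itself stays the same throughout. The values being added and removed are arbitrary choices for the example.&lt;/p&gt;

```python
x = [1, 2, 3, 4, 5, 6]
list_id = id(x)          # identity of the list object before any change

x.append(7)              # appending: adds 7 to the end
x[2] = 5                 # replacing: the item at index 2 becomes 5
x.remove(4)              # removing: deletes the first occurrence of the value 4

print(x)                 # [1, 2, 5, 5, 6, 7]
print(id(x) == list_id)  # True: still the same object, modified in place
```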
&lt;h3&gt;
  
  
  Tuple
&lt;/h3&gt;

&lt;p&gt;Tuples differ slightly in syntax from lists. They use parentheses (round brackets) instead of the square brackets used for lists.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;y = (1, 2, 3, 4, 5, 6)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tuples are immutable: once created, they cannot be altered. They are used to represent a collection of related values that should remain constant, such as coordinates, settings, or database records.&lt;br&gt;
A tuple can hold values of different data types, e.g. strings, integers, floats, booleans, and lists or other tuples.&lt;/p&gt;
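&lt;p&gt;A small sketch of what immutability means in practice: assigning to an index of a tuple raises a TypeError, so a changed version has to be built as a new tuple. The variable z is introduced only for this illustration.&lt;/p&gt;

```python
y = (1, 2, 3, 4, 5, 6)

try:
    y[2] = 5  # item assignment is not supported on tuples
except TypeError as error:
    print("Cannot modify a tuple:", error)

# A changed version has to be built as a new tuple from the old values
z = y[:2] + (5,) + y[3:]
print(z)  # (1, 2, 5, 4, 5, 6)
print(y)  # (1, 2, 3, 4, 5, 6), the original is untouched
```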
&lt;h3&gt;
  
  
  Dictionaries
&lt;/h3&gt;

&lt;p&gt;Take the example of a traditional dictionary. Say you want to know the meaning of the word programmer. You open your dictionary and look up the word &lt;em&gt;programmer.&lt;/em&gt; &lt;br&gt;
Once you find the word, you can read its definition, e.g.&lt;br&gt;
&lt;code&gt;programmer: a person who turns the designs created by software developers and engineers into instructions that a computer can follow&lt;/code&gt;&lt;br&gt;
The word programmer (which we know) is a key that leads us to its definition (which we do not know). &lt;br&gt;
The definition is the value we look up using the key, programmer.&lt;br&gt;
Dictionaries in Python work on a similar principle. You create keys and assign a value to each key. Each key and its value form a key-value pair, and a dictionary is a collection of such pairs.&lt;br&gt;
Dictionaries are enclosed in curly brackets { }.&lt;br&gt;
Syntax:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Keys are strings here, so they must be quoted
students_grade = {
    "Martin": "Not yet",
    "Jacob": "Pass",
    "Hellen": "Pass",
    "Joel": "Not yet",
    "Joylenne": "Not yet"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To access the grade of any student, I can use their name (which I know) to find out their grade (which I may not know).&lt;br&gt;
&lt;code&gt;students_grade["Hellen"]&lt;/code&gt; - This will return "Pass".&lt;/p&gt;

&lt;p&gt;Dictionaries are used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In JSON files, which store data as key-value pairs. &lt;/li&gt;
&lt;li&gt;In data retrieval, by looking up a key to access its value. To view all keys in a dictionary, call the method: 
&lt;code&gt;name_of_dictionary.keys()&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
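&lt;p&gt;Both uses can be sketched in a few lines. The example below looks up a value by key, lists the keys, and round-trips the dictionary through a JSON string with the standard json module. The shortened dictionary and the name "Amina" are made up for the illustration.&lt;/p&gt;

```python
import json

students_grade = {
    "Martin": "Not yet",
    "Jacob": "Pass",
    "Hellen": "Pass",
}

print(students_grade["Hellen"])     # Pass
print(list(students_grade.keys()))  # ['Martin', 'Jacob', 'Hellen']

# .get() returns a default instead of raising KeyError for a missing key
print(students_grade.get("Amina", "No record"))  # No record

# A dictionary maps naturally onto a JSON object of key-value pairs
as_json = json.dumps(students_grade)
restored = json.loads(as_json)
print(restored == students_grade)   # True
```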

&lt;p&gt;As a programmer, you will pick up coding best practices along the way. These will guide you in structuring data and algorithms so that the code you write is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;More readable&lt;/li&gt;
&lt;li&gt;Faster to run&lt;/li&gt;
&lt;li&gt;Lighter on memory&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>python</category>
      <category>beginners</category>
      <category>programming</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Installing Anaconda for Seamless Python Development for Windows 10</title>
      <dc:creator>John</dc:creator>
      <pubDate>Wed, 21 Jun 2023 16:52:59 +0000</pubDate>
      <link>https://forem.com/john-maina/installing-anaconda-for-seamless-python-development-for-windows-10-dkd</link>
      <guid>https://forem.com/john-maina/installing-anaconda-for-seamless-python-development-for-windows-10-dkd</guid>
      <description>&lt;p&gt;Anaconda is an open-source and free Python distribution. A Python distribution is a package that provides a convenient way to install and manage Python and its associated packages on a computer. The package includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Python interpreter&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Standard Python Library&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Third-party packages, e.g. for data analysis, web development, and machine learning&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Development tools, e.g. Integrated Development Environments (IDEs)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This article walks through a step-by-step process to download and install the Anaconda environment on Windows 10. This will help you achieve a seamless flow while using Python to bring creative ideas and solutions to life.&lt;/p&gt;

&lt;p&gt;The prerequisites for this task include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Windows 10 OS on a desktop or laptop&lt;/li&gt;
&lt;li&gt;Internet connection&lt;/li&gt;
&lt;li&gt;Access to a browser&lt;/li&gt;
&lt;li&gt;The download link can be accessed &lt;a href="https://www.anaconda.com/download"&gt;here&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1: Open your preferred browser and search for Anaconda download
&lt;/h2&gt;

&lt;p&gt;Open your preferred browser and type Anaconda download in the search bar. The search results will include the following link, probably as the first suggestion:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--fiugOt4U--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8e7c29t8z9w3zt9i49re.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fiugOt4U--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8e7c29t8z9w3zt9i49re.PNG" alt="Open the link." width="667" height="165"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;Open the link.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Navigate to the download tab
&lt;/h2&gt;

&lt;p&gt;Once the link is open, select the pricing option.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--HpX8y4VF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ieos5x886p1f1p6k4bv8.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--HpX8y4VF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ieos5x886p1f1p6k4bv8.PNG" alt="Image description" width="800" height="49"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Hit the &lt;em&gt;free download&lt;/em&gt; link to start the download.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Run the executable file
&lt;/h2&gt;

&lt;p&gt;Once the download is complete, navigate to the downloads tab and click on the executable file (.exe).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--OBh3qtkw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bdi7o30nblr88voqklp4.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--OBh3qtkw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bdi7o30nblr88voqklp4.PNG" alt="Image description" width="683" height="129"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This opens the following prompt:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--vx4mq3CT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4feh80z8bvgcpdiajyns.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vx4mq3CT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4feh80z8bvgcpdiajyns.PNG" alt="Select Run" width="496" height="383"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Select Run&lt;/em&gt;&lt;br&gt;
Run the file and minimize your browser. Minimizing other applications makes it possible to update relevant systems without having to reboot your machine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Review and accept the terms and permissions
&lt;/h2&gt;

&lt;p&gt;From here onwards, the installer displays a series of dialog boxes that you will have to read through and accept in order to proceed with the installation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5EjSBAun--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6dwkcpb53xwqxwgvum0f.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5EjSBAun--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6dwkcpb53xwqxwgvum0f.PNG" alt="Click Next &amp;gt;" width="499" height="390"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Click Next &amp;gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_tdvUVx---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rjff7bzycps0ipmhqcv5.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_tdvUVx---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rjff7bzycps0ipmhqcv5.PNG" alt="Read through the terms and if satisfied click: I Agree" width="501" height="390"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Read through the terms and if satisfied click: I Agree&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--uvMAcJLZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/m3ldpjwny6r8bwbhsnan.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--uvMAcJLZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/m3ldpjwny6r8bwbhsnan.PNG" alt="Pick a preference. Click Next" width="500" height="388"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Pick a preference. Click Next&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Selecting “Just for me” will install Anaconda and configure it specifically for your user account on the computer.&lt;/li&gt;
&lt;li&gt;Choosing “For all users” will install Anaconda and configure it to be available to all user accounts on the computer.
In my case, I will install it for &lt;em&gt;all users.&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Select the storage path for the application.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--mBB0XRMt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/xqqqmsj3af1w1vvotlph.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--mBB0XRMt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/xqqqmsj3af1w1vvotlph.PNG" alt="Select Next &amp;gt;" width="500" height="391"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Select Next &amp;gt;&lt;/em&gt;&lt;br&gt;
Choose any custom installation options.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yL8wpn17--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/h5swcbpnswl7bmwo3722.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yL8wpn17--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/h5swcbpnswl7bmwo3722.PNG" alt="Click Install" width="501" height="389"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Click Install&lt;/em&gt;&lt;br&gt;
Wait through the installation process.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--VX-Xb0Jx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5r5w446r1m8ijxj7a9ae.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--VX-Xb0Jx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5r5w446r1m8ijxj7a9ae.PNG" alt="After the installation process is complete, click Next &amp;gt;" width="501" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;After the installation process is complete, click Next &amp;gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bO67TquI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0bqd96n0qy3vn00ugybq.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bO67TquI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0bqd96n0qy3vn00ugybq.PNG" alt="Click Next &amp;gt;" width="500" height="389"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Click Next &amp;gt;&lt;/em&gt;&lt;br&gt;
Finish up the installation here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--fHV4G9Lv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qe6h33gqd4sbnmwydzpd.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fHV4G9Lv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qe6h33gqd4sbnmwydzpd.PNG" alt="Click Finish" width="499" height="389"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Click Finish&lt;/em&gt;&lt;br&gt;
With this, the download and installation are complete.&lt;/p&gt;

&lt;p&gt;Now you can launch Anaconda Navigator and leverage the useful tools that the Anaconda environment offers including useful programs and packages.&lt;/p&gt;

</description>
      <category>python</category>
      <category>datascience</category>
      <category>analytics</category>
      <category>data</category>
    </item>
    <item>
      <title>Git and GitHub for Newbies: An Introduction to Version Control</title>
      <dc:creator>John</dc:creator>
      <pubDate>Sun, 28 May 2023 02:56:15 +0000</pubDate>
      <link>https://forem.com/john-maina/git-and-github-for-newbies-an-introduction-to-version-control-59jh</link>
      <guid>https://forem.com/john-maina/git-and-github-for-newbies-an-introduction-to-version-control-59jh</guid>
      <description>&lt;p&gt;Say you are solving a &lt;a href="https://blog.prepscholar.com/how-to-solve-a-rubiks-cube"&gt;Rubik’s cube&lt;/a&gt; and, a couple of steps into the solution, you get stuck. You might need to backtrack a few steps or get assistance from a friend to solve the puzzle. To try out a different solution, you would have to lose all your progress and start afresh.&lt;/p&gt;

&lt;p&gt;Consequently, if the new method failed and you needed to go back to the initial approach, it would be quite difficult unless you could recall all the steps you had made. When working on projects (the Rubik’s cube solution in our analogy), this is where git comes in, and the friend you would ask for assistance would be a colleague you are collaborating with on the project.&lt;/p&gt;

&lt;p&gt;Git solves the challenge of losing previous work and the inability to keep track of changes made when trying to find a solution to a problem e.g. the Rubik’s cube in the analogy. This practice of tracking and managing changes is called &lt;a href="https://www.atlassian.com/git/tutorials/what-is-version-control#:~:text=Version%20control%2C%20also%20known%20as,to%20source%20code%20over%20time."&gt;version control&lt;/a&gt;. Git is a version control system, built to track and manage changes in software code.&lt;/p&gt;

&lt;p&gt;To keep track of changes, git takes a snapshot of the project at a specific point in time. To tell git to take a snapshot, you run a &lt;a href="https://medium.com/@johngithuimaina/git-and-github-essentials-understanding-the-basics-6a91644f2df3"&gt;commit command&lt;/a&gt;. The snapshot may also need to be stored somewhere a colleague can access it when you need their contribution to the project.&lt;/p&gt;

&lt;p&gt;This is where GitHub comes in. GitHub is a platform that hosts the snapshots taken through git commits.&lt;/p&gt;

&lt;p&gt;Through GitHub, one can access projects that are tracked by git and hosted on GitHub. By checking a project from time to time, developers can keep track of the changes made each time a commit was run. Colleagues can also add their own contributions to the code and host them on GitHub. Awesome, right?&lt;/p&gt;

&lt;p&gt;We have mentioned running a commit command in git to take snapshots of a project. Actions in git are performed by running commands in a command-line interface such as Git Bash. Some common git terms and commands are outlined &lt;a href="https://dev.to/john-maina/git-and-github-essentials-understanding-the-basics-28pi"&gt;here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>github</category>
      <category>git</category>
      <category>datascience</category>
      <category>data</category>
    </item>
    <item>
      <title>Git and GitHub Essentials: Understanding the Basics</title>
      <dc:creator>John</dc:creator>
      <pubDate>Sun, 28 May 2023 02:39:02 +0000</pubDate>
      <link>https://forem.com/john-maina/git-and-github-essentials-understanding-the-basics-28pi</link>
      <guid>https://forem.com/john-maina/git-and-github-essentials-understanding-the-basics-28pi</guid>
      <description>&lt;p&gt;Git is a version control system used to manage and keep track of changes in software code. The changes can be hosted on a web platform such as GitHub.&lt;/p&gt;

&lt;p&gt;These two tools are essential for developers because code changes daily. A basic introduction to git is well outlined here. To perform actions in git, commands are run in a command-line interface such as Git Bash.&lt;/p&gt;

&lt;p&gt;By the end of this article, a beginner should be able to make a commit and upload it onto a remote repository such as GitHub.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;These are the basic terminologies used in git:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Repository (repo)&lt;/strong&gt;: a central location where Git stores all the files, folders, and version history of a project.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Remote repository&lt;/strong&gt;: a repository hosted on a server or online (web-based), like GitHub, where the version history can be saved and interacted with.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local repository&lt;/strong&gt;: a copy of the git repository that resides on the local machine. It contains the git directory (a hidden .git folder), the commit history, files, and folders.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There are also some basic commands in git that can be used to make a commit and upload it onto a remote repository, where colleagues can interact with the project. They can interact by reviewing the project, making changes, and saving the changes made.&lt;/p&gt;

&lt;p&gt;Here is a beginner-friendly walkthrough to learn the process of making commits and pushing a project onto a remote repository i.e. GitHub:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Create a folder and save it&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We assume this folder will contain files of our project and our intention is to track the changes we will make to the project and host them on GitHub — where other colleagues can interact with it.&lt;/p&gt;

&lt;p&gt;I will name my folder “First_project”. You can add a file of choice including a .txt file.&lt;/p&gt;

&lt;p&gt;Close the folder and navigate to Git Bash.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Create a git directory (a hidden .git folder) in the saved folder&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To do this, navigate into the folder you created, using the command line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd "the_file_path_to_the_folder"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In my Git Bash environment, it will resemble this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qtEvWVya--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/uh123cyx41lch5cwfvj4.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qtEvWVya--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/uh123cyx41lch5cwfvj4.PNG" alt="Image description" width="484" height="41"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To confirm you are working in the directory, you can use the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pwd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The command prints the current directory.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--PpGMUoDW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/j5yoieuhboix65gshma1.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--PpGMUoDW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/j5yoieuhboix65gshma1.PNG" alt="Image description" width="454" height="52"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the current directory, create the git directory (a hidden .git folder). This is done by initializing git with the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In my command line:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Q2UWwCA_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/s4czssh9d9ywjniyb80u.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Q2UWwCA_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/s4czssh9d9ywjniyb80u.PNG" alt="Image description" width="576" height="60"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Add the files from the folder that you wish to track&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is done with the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git add "name_of_the_file"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In my command line:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2GaA3Rv2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wncbz5sjzqlbdg5evs2a.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2GaA3Rv2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wncbz5sjzqlbdg5evs2a.PNG" alt="Image description" width="464" height="46"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4: Check whether the file(s) have been added and are ready for commit&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Run the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In my command line:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_vBk2gMb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/hw0f796r9w6ik5fxqued.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_vBk2gMb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/hw0f796r9w6ik5fxqued.PNG" alt="Image description" width="530" height="160"&gt;&lt;/a&gt;&lt;/p&gt;

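&lt;p&gt;&lt;strong&gt;Step 5: Commit the added file(s)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To take a snapshot of the added file(s), use the command:&lt;br&gt;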
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git commit -m "include_a_message"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The message is a short description of why the commit is made.&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Jyuh1vr9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/fznrprcd352famw9ruct.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Jyuh1vr9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/fznrprcd352famw9ruct.PNG" alt="Image description" width="471" height="98"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 6: Create an empty remote repository on GitHub&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;On your GitHub account, create a new, empty repository. This is the remote repository where the files will be hosted.&lt;/p&gt;

&lt;p&gt;You will be directed to this window after creating the repo successfully:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--mUbxS_8y--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/21fs0el95805yovldk2s.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--mUbxS_8y--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/21fs0el95805yovldk2s.PNG" alt="Image description" width="800" height="87"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Copy the link displayed. It will be used to connect the local git repository to the remote repository we created.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 7: Create a connection to the remote repo&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git remote add origin &amp;lt;the_link_copied_from_github_repo&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On Git Bash:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--e_z_NIa8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bw5oucec0k6v639kawsv.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--e_z_NIa8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bw5oucec0k6v639kawsv.PNG" alt="Image description" width="523" height="46"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This creates a connection to the remote repository we created on GitHub.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 8: Push the files to GitHub&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After creating the connection, push the files to the remote repository, naming the branch you are on (commonly main or master):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git push -u origin &amp;lt;name_of_your_branch&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In Git Bash, a successful push command will appear as:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--OUmrbpKU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vviegoarhyl2amx4gvg7.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--OUmrbpKU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vviegoarhyl2amx4gvg7.PNG" alt="Image description" width="597" height="166"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The files are now reflected on the remote repository in GitHub. To confirm this, you can open the remote repository on GitHub and view the files &lt;a href="https://github.com/John-Maina/first_project"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;They will appear as:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--EfhtxOkM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jy7z1emjy7qskfiryv58.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--EfhtxOkM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jy7z1emjy7qskfiryv58.PNG" alt="Image description" width="800" height="188"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Follow the steps above to track and manage changes in files through git and GitHub.&lt;/p&gt;
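
&lt;p&gt;As a recap, the steps above can be sketched as one Git Bash session. This is only an illustrative sketch under some assumptions: a local bare repository (remote.git) stands in for the empty GitHub repository so the session runs without a GitHub account, and the file name notes.txt, the name and email passed to the commit, and the temporary folder are all hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Hypothetical sandbox folder so the sketch does not touch real projects
work=$(mktemp -d)

# Stand-in for the empty remote repository created on GitHub in Step 6
git init --bare "$work/remote.git"

# Steps 1 and 2: create the project folder, move into it, initialize git
mkdir "$work/First_project"
cd "$work/First_project"
git init

# Step 3: create a file and add it to the files git tracks
touch notes.txt
git add notes.txt

# Step 4: confirm the file is staged and ready for commit
git status

# Step 5: commit with a short message (the identity values are placeholders)
git -c user.name="Example User" -c user.email="user@example.com" commit -m "Add notes file"

# Step 7: connect to the remote (with GitHub, use the link copied from the repo page)
git remote add origin "$work/remote.git"

# Step 8: push the current branch and set it as the upstream
git push -u origin HEAD
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Pushing HEAD sends whichever branch you are currently on to a branch of the same name on the remote, so the sketch works whether your default branch is called main or master.&lt;/p&gt;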

</description>
      <category>git</category>
      <category>github</category>
      <category>datascience</category>
      <category>data</category>
    </item>
    <item>
      <title>Subqueries Unraveled: Exploring SQL’s Hidden Power</title>
      <dc:creator>John</dc:creator>
      <pubDate>Mon, 22 May 2023 20:39:53 +0000</pubDate>
      <link>https://forem.com/john-maina/subqueries-unraveled-exploring-sqls-hidden-power-cbn</link>
      <guid>https://forem.com/john-maina/subqueries-unraveled-exploring-sqls-hidden-power-cbn</guid>
      <description>&lt;p&gt;As you work with tables and databases, you may sometimes need the output of one query to act as input to another query. This would require writing two queries, where the first gets the figure that the second one uses.&lt;/p&gt;

&lt;p&gt;This process can be lengthy and might not be as reliable, for reasons we are going to look at. This is where subqueries come into play.&lt;/p&gt;

&lt;p&gt;A subquery is an SQL query that is embedded inside another query. This gives us an inner query (the one embedded) and an outer query (the larger query). A subquery is commonly nested within the WHERE clause of another query. However, it can also appear in the SELECT list or the FROM clause, and inside INSERT, UPDATE, or DELETE statements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Let us consider an instance:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You have a table &lt;em&gt;“Jobs_market”&lt;/em&gt; with columns &lt;em&gt;“Salary”&lt;/em&gt; and &lt;em&gt;“Job_group”&lt;/em&gt;.&lt;br&gt;
You want to get a list of job groups where the salary is more than the average salary in the market.&lt;br&gt;
&lt;strong&gt;There are two ways you could do this:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Method 1&lt;/strong&gt;&lt;br&gt;
i ) Write a query to calculate the average salary being paid in the job market:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT AVG(Salary) AS avg_salary
FROM Jobs_market;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will output the average salary in the job market, say $10,000 annually.&lt;/p&gt;

&lt;p&gt;ii ) Write a query that returns the job groups where the salary is more than the average salary, $10,000:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT 
    Job_group, 
    Salary
FROM Jobs_market
WHERE Salary &amp;gt; 10000;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query will give the job groups where the salary is greater than $10,000, which is the average salary we calculated in the first query.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Method 2&lt;/strong&gt;&lt;br&gt;
Here, we will use the idea of subqueries. We will embed the query that outputs the average salary within the query that gives the list of job groups that earn above the average salary.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT 
    Job_group, 
    Salary
FROM Jobs_market
WHERE Salary &amp;gt; (
    SELECT AVG(Salary) AS avg_salary
    FROM Jobs_market
);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output will be the same as in method 1, but we did not have to run the queries independently, copy the figure from the first query, and hard-code it into the second query.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reasons why method 1 might not be as reliable
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A subquery is quicker to write and takes up less space than multiple queries. This saves time and makes the SQL script more readable.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If the salaries paid in the job market change, method 1 will not pick up the change because it uses the hard-coded value of “$10,000”. The subquery in method 2 does not use a hard-coded value; it computes the average each time the query is run, so it always uses the current values in the salary column.&lt;br&gt;
&lt;strong&gt;It is important to note that:&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Subqueries are always enclosed in &lt;em&gt;parentheses ( )&lt;/em&gt;, as you will notice in our illustration above.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A subquery can itself contain another subquery, and this nesting can be repeated as the subqueries build up to form the larger main query.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Subqueries like the one above are executed from the inner-most query towards the outermost query to return the desired output. When querying data, ensure that each subquery works as desired to get the correct final output.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
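
&lt;p&gt;To illustrate the last two points, here is a minimal sketch of a subquery nested inside another subquery, reusing the Jobs_market table from the example above. The question it answers is an assumption made up for illustration: list the job groups whose salary is above the average of the above-average salaries.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Outer query: job groups paid more than the figure from the middle query
SELECT
    Job_group,
    Salary
FROM Jobs_market
WHERE Salary &amp;gt; (
    -- Middle query: average of the salaries that beat the overall average
    SELECT AVG(Salary)
    FROM Jobs_market
    WHERE Salary &amp;gt; (
        -- Inner-most query: overall average salary, evaluated first
        SELECT AVG(Salary)
        FROM Jobs_market
    )
);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Reading from the inside out: the inner-most query produces the overall average, the middle query uses it to average only the higher salaries, and the outer query compares each row against that second figure. Running each level on its own first is an easy way to check that it returns what you expect.&lt;/p&gt;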

</description>
      <category>sql</category>
      <category>datascience</category>
      <category>data</category>
    </item>
    <item>
      <title>Visualizing Data in Excel: Charts and Conditional Formatting</title>
      <dc:creator>John</dc:creator>
      <pubDate>Mon, 22 May 2023 20:29:29 +0000</pubDate>
      <link>https://forem.com/john-maina/visualizing-data-in-excel-charts-and-conditional-formatting-11p8</link>
      <guid>https://forem.com/john-maina/visualizing-data-in-excel-charts-and-conditional-formatting-11p8</guid>
      <description>&lt;p&gt;Charts are visualizations that represent data as bars and graphs. They matter because not every stakeholder or user of data is well-equipped with data skills.&lt;/p&gt;

&lt;p&gt;Charts are well suited to relaying information and findings and to picking up trends in data.&lt;/p&gt;

&lt;p&gt;To create visualization charts:&lt;br&gt;
· Highlight the table that the visualizations will be based on&lt;/p&gt;

&lt;p&gt;· Click the Insert tab&lt;/p&gt;

&lt;p&gt;· Under the insert tab, select a chart of your choice.&lt;/p&gt;

&lt;p&gt;There are different chart styles to choose from, including but not limited to &lt;em&gt;pie charts&lt;/em&gt;, &lt;em&gt;bar graphs&lt;/em&gt;, and &lt;em&gt;line graphs&lt;/em&gt;. There are preferred visualizations for different kinds of data, e.g. categorical data, continuous data, time and trend analysis, network data, and geographical data. A good visualization tool is one that best suits the visualization goals.&lt;/p&gt;

&lt;p&gt;Each chart type is better suited to certain kinds of data:&lt;/p&gt;

&lt;p&gt;· Numerical data — Excel provides a wide range of charts and graphs, such as line plots, scatter plots, histograms, or box plots.&lt;/p&gt;

&lt;p&gt;· Categorical data — These can be visualized using bar charts, pie charts, stacked charts, or categorical heat maps.&lt;/p&gt;

&lt;p&gt;· Time series data — Line graphs and charts are more suitable for trend and time analysis.&lt;/p&gt;

&lt;p&gt;It’s important to match your data types and complexity with the capabilities and features of the visualization tool you choose.&lt;/p&gt;

&lt;p&gt;A new worksheet can be created to host the charts for enhanced readability.&lt;/p&gt;

&lt;p&gt;You can also change the chart style. This is especially useful for matching the visualization to the color scheme of the organization the data is for, and for making the visualization more aesthetically appealing.&lt;/p&gt;

&lt;p&gt;The chart elements button can also be used in Excel to add features such as a data table. These features provide convenience and efficiency for visualizing data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conditional formatting in Excel
&lt;/h2&gt;

&lt;p&gt;Conditional formatting is a way to see patterns and trends in data. It uses &lt;em&gt;bars&lt;/em&gt;, &lt;em&gt;colors&lt;/em&gt;, and &lt;em&gt;icons&lt;/em&gt; to highlight important values so that those trends and patterns are easy to spot.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conditional formatting offers:&lt;/strong&gt;&lt;br&gt;
· Highlight cells rules&lt;/p&gt;

&lt;p&gt;· Top/ Bottom rules&lt;/p&gt;

&lt;p&gt;· Data bars&lt;/p&gt;

&lt;p&gt;· Color scales&lt;/p&gt;

&lt;p&gt;· Icon sets&lt;/p&gt;

&lt;p&gt;· Creating new rules&lt;/p&gt;

&lt;p&gt;· Deleting a rule/ clear rules&lt;/p&gt;

&lt;p&gt;· Managing rules&lt;/p&gt;

&lt;p&gt;The highlight cells rules are among the most commonly used conditional formatting tools. They are useful for pointing out duplicate values in columns that require unique values, e.g. &lt;em&gt;customer IDs&lt;/em&gt;, &lt;em&gt;telephone numbers&lt;/em&gt;, &lt;em&gt;employee IDs&lt;/em&gt;, &lt;em&gt;social security numbers&lt;/em&gt; and &lt;em&gt;serial numbers for products&lt;/em&gt;. These are often primary keys.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Checking for duplicates saves the user time.&lt;/strong&gt;&lt;br&gt;
A dialogue box pops up where a user can choose whether to highlight duplicate values or unique values. This is dependent on how the user wants to use the data. After highlighting the desired cells, simply sort the column by color using the filter function. The highlighted cells will appear at the top.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Text that contains&lt;/strong&gt;&lt;br&gt;
This is an option under the highlight cells rules. It is useful for finding specific keywords. For example, if you had a list of phone numbers and wanted to classify them by country code, you could check for text that contains +254 or +234 after highlighting the telephone numbers column.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Excel Formulas: Mastering Data Manipulation</title>
      <dc:creator>John</dc:creator>
      <pubDate>Mon, 22 May 2023 20:21:14 +0000</pubDate>
      <link>https://forem.com/john-maina/excel-formulas-mastering-data-manipulation-2iam</link>
      <guid>https://forem.com/john-maina/excel-formulas-mastering-data-manipulation-2iam</guid>
      <description>&lt;p&gt;&lt;strong&gt;The key thing to know about formulas is that they begin with an “ = ” sign.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I feel like this is important to begin with because as a beginner I would always forget that, and start wondering why my formula is not working. Remembering this will definitely save you a lot of time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Formulas&lt;/strong&gt; in Excel are expressions that, when executed, perform a specific action on data (the input) and give an output.&lt;/p&gt;

&lt;p&gt;Going a little more in-depth, most formulas are built from Excel functions, each of which performs a specific task based on how it has been defined. You run data through the function as input and get an output based on the function used.&lt;/p&gt;

&lt;p&gt;You can use a formula to &lt;em&gt;sum up data&lt;/em&gt;, &lt;em&gt;get the maximum value from an array of data&lt;/em&gt;, &lt;em&gt;format data&lt;/em&gt;, etc. The input to a function is placed in &lt;em&gt;parentheses ()&lt;/em&gt;. The values in the parentheses are run through the function, and the output is given.&lt;/p&gt;

&lt;h2&gt;
  
  
  There are some commonly used Excel formulas which include:
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Min and Max.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The min and max formulas share the same syntax. Each takes a range of data as input: the max formula gives the maximum number from the range, and the min formula gives the minimum. The syntax appears as:&lt;/p&gt;

&lt;p&gt;=MAX(range)&lt;/p&gt;

&lt;p&gt;=MIN(range)&lt;/p&gt;

&lt;p&gt;An example of a range could be H2:H6, A1:D10, K17:P87, etc. The first example indicates a range from cell H2 to cell H6.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IF and IFS&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;IF allows you to place a condition based on a logical test on the data put in as input. The syntax generally is:&lt;/p&gt;

&lt;p&gt;=IF (logical test, (value if true), (value if false))&lt;/p&gt;

&lt;p&gt;The formula checks whether the logical test is true. If so, it gives the first output defined; if not, it gives the second. The logical test refers to the cell or range of data being checked.&lt;/p&gt;

&lt;p&gt;=IF(H2:J15&amp;gt;15, “Over_threshold”, “Below_threshold”)&lt;/p&gt;

&lt;p&gt;This checks whether each value in the range is greater than 15. For each value greater than 15, Excel will print “Over_threshold”, and for each value of 15 or less, Excel will print “Below_threshold” in the respective cell.&lt;/p&gt;

&lt;p&gt;IFS allows you to apply multiple conditions. It also differs from IF in that it does not take a value to return when a condition is false. The syntax would appear as follows:&lt;/p&gt;

&lt;p&gt;=IFS(logical test 1, value if true, logical test 2, value if true)&lt;/p&gt;

&lt;p&gt;=IFS(Range = “Teacher”, “Job group F”, Range = “Principal”, “Job group E”)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Len(Length)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A function that returns the length of a string. It can be useful for checking phone numbers or social security numbers for entries with missing characters. The syntax will appear as:&lt;/p&gt;

&lt;p&gt;=LEN(A5)&lt;/p&gt;

&lt;p&gt;This will return the number of characters in cell A5.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Left and Right&lt;/strong&gt;&lt;br&gt;
These functions extract part of a text string. In the formula, you specify the number of characters you want returned. The left function counts characters from the start (left) of the string, while the right function counts characters from the end (right).&lt;/p&gt;

&lt;p&gt;Assuming a use case where I have dates with a date format dd-mm-yyyy, I can choose to select only the year from a range of cells using the right function by using a formula:&lt;/p&gt;

&lt;p&gt;=RIGHT(B25:B34, 4)&lt;/p&gt;

&lt;p&gt;B25:B34 will denote the range of cells we want affected by the formula while 4, denotes the number of characters selected in the output. In this case, yyyy.&lt;/p&gt;

&lt;p&gt;=LEFT(B25:B34,2)&lt;/p&gt;

&lt;p&gt;This will output the day only from the range given, by selecting dd&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Date to Text&lt;/strong&gt;&lt;br&gt;
As it states, the formula is used to convert dates to text data types. Left / Right formulas mentioned above work with string data types only. You will thus be required to convert date data types to string data types using the Date to Text formula. The syntax will appear as:&lt;/p&gt;

&lt;p&gt;=TEXT(G5:P5, “dd-mm-yyyy”)&lt;/p&gt;

&lt;p&gt;G5:P5 specifies the range of cells to convert. After it, the formula specifies the format in which the date should be written out as text.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A tip&lt;/em&gt;: &lt;em&gt;to tell if an input value is a date type or a string type, note that by default Excel aligns dates to the right of the cell and text to the left of the cell.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trim&lt;/strong&gt;&lt;br&gt;
It is an essential function that removes unwanted spaces from the start and end of text and reduces repeated spaces between words to a single space. This makes the data cleaner and more readable. The syntax will resemble:&lt;/p&gt;

&lt;p&gt;=TRIM(C2:C7)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CONCATENATE&lt;/strong&gt;&lt;br&gt;
A formula that joins two or more text strings into one text string. Its output is always a text string. The texts being joined are usually in different cells. The syntax:&lt;/p&gt;

&lt;p&gt;=CONCATENATE(D9, “ ”, E9)&lt;/p&gt;

&lt;p&gt;The “ ” creates a space between the two strings. You can place any character between the quotation marks and the character will be included in the output.&lt;/p&gt;

&lt;p&gt;This formula can be used to build email addresses from names held in separate cells.&lt;/p&gt;
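
&lt;p&gt;As a minimal sketch, assume a first name in cell D9, a last name in cell E9, and a made-up domain, example.com (all three are assumptions for illustration):&lt;/p&gt;

&lt;p&gt;=CONCATENATE(D9, ".", E9, "@example.com")&lt;/p&gt;

&lt;p&gt;With Jane in D9 and Doe in E9, this returns Jane.Doe@example.com.&lt;/p&gt;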

&lt;p&gt;&lt;strong&gt;Substitute&lt;/strong&gt;&lt;br&gt;
As the name suggests it is used to substitute values or characters with defined characters. It replaces existing text with new text in a text string.&lt;/p&gt;

&lt;p&gt;You might have a date in the format &lt;em&gt;dd-yyyy-mm&lt;/em&gt; and wish to convert it to &lt;em&gt;dd/yyyy/mm&lt;/em&gt;. The syntax would resemble:&lt;/p&gt;

&lt;p&gt;=SUBSTITUTE(D2:H5, “-”, “/”)&lt;/p&gt;

&lt;p&gt;The formula also accepts an optional instance number that specifies which occurrence of the character to replace. If you do not include an instance number in the brackets, the formula will change the character wherever it appears.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With instance number 1:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;=SUBSTITUTE(D2:H5, “-”, “/”, 1)&lt;/p&gt;

&lt;p&gt;The date format will be output as: dd/yyyy-mm. The formula will have changed only the first instance of “-” that appears.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With instance number 2:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;=SUBSTITUTE(D2:H5, “-”, “/”, 2)&lt;/p&gt;

&lt;p&gt;The date format will be output as dd-yyyy/mm. The formula will have changed only the second instance where “-” appears.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sum and Sumif&lt;/strong&gt;&lt;br&gt;
Sum adds up all the values in the range of cells that are selected.&lt;/p&gt;

&lt;p&gt;=SUM(H2:H15)&lt;/p&gt;

&lt;p&gt;This will sum up all the values in the selected range.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sumif&lt;/strong&gt;&lt;br&gt;
Adding if to the formula creates a condition where the values are summed up if they meet the condition placed.&lt;/p&gt;

&lt;p&gt;=SUMIF(A4:A10,”&amp;gt;3000”)&lt;/p&gt;

&lt;p&gt;This will add up the figures within that range that meet the set criteria.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sumifs&lt;/strong&gt;&lt;br&gt;
Sumifs allows the user to set multiple criteria when working out a sum. They will add up the values in the range that meet all the criteria defined.&lt;/p&gt;

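&lt;p&gt;=SUMIFS(G2:G16, R2:R16, “female”, D2:D16, “&amp;gt;30”)&lt;/p&gt;

&lt;p&gt;For each row, the function checks whether the value in R2 to R16 is “female” and whether the value in D2 to D16 is greater than 30. It adds the matching value from G2 to G16 to the sum only if both criteria are true. The sum range and the criteria ranges must all be the same size.&lt;/p&gt;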

&lt;p&gt;&lt;strong&gt;Count&lt;/strong&gt;&lt;br&gt;
This gives a count of how many cells within a certain range contain numbers.&lt;/p&gt;

&lt;p&gt;=COUNT(G5:H7)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Countif&lt;/strong&gt;&lt;br&gt;
This does the count of the cells that meet certain given criteria.&lt;/p&gt;

&lt;p&gt;=COUNTIF(J4:K7,”&amp;gt;5000”)&lt;/p&gt;

&lt;p&gt;This will give a count of the entries within the range of J4 to K7 that are greater than 5000.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Countifs&lt;/strong&gt;&lt;br&gt;
This will allow for multiple criteria to be checked before counting the cells within the range given.&lt;/p&gt;

&lt;p&gt;=COUNTIFS(G5:G17, “&amp;gt;5000”, E5:E17, “Male”)&lt;/p&gt;

&lt;p&gt;This will count the rows where the value in G5 to G17 is greater than 5000 and the matching cell in E5 to E17 contains “Male”. The criteria ranges must be the same size.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Days&lt;/strong&gt;&lt;br&gt;
The days formula counts the number of days between dates in cells.&lt;/p&gt;

&lt;p&gt;=DAYS(end_date,start_date)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Network days&lt;/strong&gt;&lt;br&gt;
Similar to the days formula, but it excludes weekends and holidays, leaving a count of workdays between two dates. Note that, unlike DAYS, the start date comes first.&lt;/p&gt;

&lt;p&gt;=NETWORKDAYS(start_date,end_date,[holidays])&lt;/p&gt;
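
&lt;p&gt;The difference between the two date functions can be sketched in Python with the standard datetime module. The function names days_between and network_days are hypothetical, the dates are examples, and the sketch assumes a Saturday and Sunday weekend with both the start and end dates included in the workday count.&lt;/p&gt;

```python
from datetime import date, timedelta

def days_between(end_date, start_date):
    # Mirrors DAYS(end_date, start_date): a plain difference in days.
    return (end_date - start_date).days

def network_days(start_date, end_date, holidays=()):
    # Counts Monday-to-Friday dates from start to end inclusive, skipping holidays.
    # Assumes start_date is not after end_date.
    total = (end_date - start_date).days + 1
    count = 0
    for offset in range(total):
        current = start_date + timedelta(days=offset)
        is_weekday = current.weekday() not in (5, 6)  # 5 = Saturday, 6 = Sunday
        if is_weekday and current not in holidays:
            count += 1
    return count

start = date(2023, 5, 1)   # a Monday
end = date(2023, 5, 12)    # the Friday of the following week

print(days_between(end, start))
print(network_days(start, end))
print(network_days(start, end, holidays={date(2023, 5, 1)}))
```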

&lt;p&gt;&lt;strong&gt;Xlookup&lt;/strong&gt;&lt;br&gt;
This function searches a range or an array for a match and returns the corresponding item from a second range.&lt;/p&gt;

&lt;p&gt;Here, an array simply means a range of cells.&lt;/p&gt;

&lt;p&gt;The syntax is:&lt;/p&gt;

&lt;p&gt;=XLOOKUP(lookup_value, lookup_array, return_array)&lt;/p&gt;

&lt;p&gt;Lookup_value is the value to search for, usually given as a reference to the cell that holds it.&lt;/p&gt;

&lt;p&gt;Lookup_array is the range of cells in which the lookup_value is searched for.&lt;/p&gt;

&lt;p&gt;Return_array is the range from which the result is taken, at the position that corresponds to the match in the lookup_array. You can select multiple columns for the return_array, in which case each selected column returns its own value for the matching row.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Xlookup with a not-found value&lt;/strong&gt;: XLOOKUP looks for an exact match by default. An optional fourth argument sets what is returned when the lookup_value is not found in the lookup_array.&lt;/p&gt;

&lt;p&gt;=XLOOKUP(A10, T5:T36, G5:G36,"Not found")&lt;/p&gt;

&lt;p&gt;If the lookup_value is not found, the text "Not found" is returned instead of an error. The lookup_array and the return_array must cover the same number of rows.&lt;/p&gt;
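
&lt;p&gt;The lookup behaviour described above can be sketched in Python as follows. The function xlookup is a hypothetical stand-in written for this illustration, not part of any library, and the names and salaries are made-up data.&lt;/p&gt;

```python
def xlookup(lookup_value, lookup_array, return_array, if_not_found="#N/A"):
    # Return the item of return_array at the position of the first exact match.
    if len(lookup_array) != len(return_array):
        raise ValueError("lookup_array and return_array must be the same size")
    for key, result in zip(lookup_array, return_array):
        if key == lookup_value:
            return result
    return if_not_found

names = ["Amina", "Brian", "Chen"]   # hypothetical lookup column
salaries = [4200, 3900, 5100]        # hypothetical return column

print(xlookup("Brian", names, salaries, "Not found"))
print(xlookup("Dana", names, salaries, "Not found"))
```

&lt;p&gt;The size check reflects the requirement that both ranges line up row for row.&lt;/p&gt;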

&lt;h2&gt;
  
  
  Wildcard in Xlookup.
&lt;/h2&gt;

&lt;p&gt;Suppose the user does not know all the characters or words in the lookup value. They would not be able to specify the exact value to look up.&lt;/p&gt;

&lt;p&gt;To solve this, Excel offers wildcards. The asterisk (*) is a special character that stands for any sequence of characters. It is placed between quotation marks ("*") and joined to the known word with an ampersand (&amp;amp;).&lt;/p&gt;

&lt;p&gt;The wildcard goes before or after the known word in the lookup value. If the unknown text comes before the known word, the wildcard is placed before the known word, in place of the unknown text.&lt;/p&gt;

&lt;p&gt;If the unknown text comes after the known word, the wildcard is placed after it. A wildcard says that some value can occupy its place without stating which one. XLOOKUP only treats the asterisk as a wildcard when its fifth argument, the match mode, is set to 2. Example:&lt;/p&gt;

&lt;p&gt;=XLOOKUP("*"&amp;amp;A4, H2:H10, O2:O10,"Not found", 2)&lt;/p&gt;
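
&lt;p&gt;A similar wildcard lookup can be sketched in Python with the standard fnmatch module, where "*" also matches any run of characters. The function xlookup_wildcard and the product data are hypothetical. Note that fnmatch gives special meaning to a few other characters, such as square brackets, so this is an approximation of Excel's behaviour rather than an exact copy.&lt;/p&gt;

```python
from fnmatch import fnmatchcase

def xlookup_wildcard(pattern, lookup_array, return_array, if_not_found="Not found"):
    # Lower-casing both sides makes the match case-insensitive.
    for key, result in zip(lookup_array, return_array):
        if fnmatchcase(key.lower(), pattern.lower()):
            return result
    return if_not_found

products = ["Green Tea", "Black Coffee", "Iced Tea"]   # made-up lookup column
prices = [3.5, 4.0, 3.0]                               # made-up return column

known_word = "coffee"
print(xlookup_wildcard("*" + known_word, products, prices))
print(xlookup_wildcard("*tea", products, prices))      # first match wins
print(xlookup_wildcard("*juice", products, prices))
```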

</description>
      <category>excel</category>
      <category>datascience</category>
      <category>data</category>
      <category>analyst</category>
    </item>
    <item>
      <title>Excel for Data Science: Analyze, Visualize, and Excel</title>
      <dc:creator>John</dc:creator>
      <pubDate>Mon, 22 May 2023 19:39:07 +0000</pubDate>
      <link>https://forem.com/john-maina/excel-for-data-science-analyze-visualize-and-excel-1cki</link>
      <guid>https://forem.com/john-maina/excel-for-data-science-analyze-visualize-and-excel-1cki</guid>
      <description>&lt;p&gt;Excel is a tool that is useful to people working with data in their own personal projects or in their organizations. It is a most essential tool for data analysts and data scientists.&lt;/p&gt;

&lt;p&gt;Most articles I read when learning to use Excel stated that many renowned data scientists had Excel as their foundational tool. Excel is a great tool for performing analytics. It offers a user-friendly interface for data manipulation, analysis, and visualization.&lt;/p&gt;

&lt;p&gt;Microsoft Excel is a spreadsheet developed by Microsoft. It has a total of 1,048,576 rows and 16,384 columns per worksheet.&lt;/p&gt;

&lt;p&gt;Excel has computational capabilities, graphing tools, pivot tables, and a macro programming language called &lt;em&gt;Visual Basic for Applications (VBA).&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;Excel combines these capabilities to:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Calculate data&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Format data&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Organize data&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Excel offers an all-in-one toolpak, that is;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;A data repository&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Graphical User Interface&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Analysis tool&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Scripting environment with VBA.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Below is common Excel terminology:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Row&lt;/strong&gt; — They run horizontally across cells. They are commonly called observations, records, tuples, and trials.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Column&lt;/strong&gt;— They run vertically across cells. They are also referred to as features, fields, attributes, predictors, and also variables.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cell&lt;/strong&gt;— An intersection between a row and a column. Users enter data in cells.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cell reference&lt;/strong&gt; — A set of coordinates showing the location of a cell. Rows are numbered while Columns are vertical and are assigned letters.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Address bar&lt;/strong&gt; — This bar is located on the upper left side of the home ribbon. It displays the coordinates of the selected cell.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Workbook&lt;/strong&gt; — An Excel file that contains one or more worksheets.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Worksheet tab&lt;/strong&gt; — These are tabs arranged from the bottom left of the spreadsheets that are used to select or navigate between worksheets.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Column and row headers&lt;/strong&gt; — Numbered and lettered cells located outside of the columns and rows. Selecting a header highlights the whole row or column&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Auto-fill&lt;/strong&gt; — A feature that enables users to add more than one cell that occurs in a series automatically by selecting the cell with the value and dragging across the cells they wish to auto-fill with the value.&lt;br&gt;
It is important to note that after getting a value using a formula, one can autofill on other cells.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Auto-sum&lt;/strong&gt; — This enables users to add multiple values. Users can select the cells they want to add and press the Alt and Equal keys. There is also a button to enable this feature on the top right of the home page, above “Fill” and to the left of “Sort and Filter”.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pivot table&lt;/strong&gt; — A data summarization tool that sorts and calculates data automatically. It is located under the insert tab on the far left.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pivot Chart&lt;/strong&gt; — A visual aid to the pivot table, providing graph representations of the data. Located under the middle of the insert page, next to maps.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Source data&lt;/strong&gt; — Information used to create a pivot table&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Formula bar&lt;/strong&gt; — A long input bar located to the right of the address bar. It is used to visualize and enter values and formulas in cells. It is denoted by an FX label&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Formula&lt;/strong&gt; — Mathematical equations with cell references and functions relating to the cells that work together to output a desired value/ computation. Formulas have to be initialized with an “ = ” sign.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Excel can perform a variety of tasks in data analytics. These include:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data entry and storage&lt;/li&gt;
&lt;li&gt;Sorting and filtering data to find specific information&lt;/li&gt;
&lt;li&gt;Statistical analysis&lt;/li&gt;
&lt;li&gt;Strategic analysis&lt;/li&gt;
&lt;li&gt;Accounting and budgeting&lt;/li&gt;
&lt;li&gt;Conditional formatting cells to closely observe trends in data and pull insights&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Excel workbooks are saved with the .xlsx extension by default, while older versions use .xls, the Excel Binary File Format. Both can be created and opened by Excel or other spreadsheet programs. Excel can also open and save .csv files.&lt;/p&gt;

&lt;p&gt;CSV stands for Comma-Separated Values, a plain-text format in which each line is a row and the values are separated by commas. Excel files and CSV files can be loaded into Python for data analysis.&lt;/p&gt;
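
&lt;p&gt;As a small illustration of that last point, the sketch below reads CSV data with Python's standard csv module. The CSV text is made up and held in a string so that the example runs without a file on disk; with a real export you would open the file instead. For .xlsx workbooks, the pandas function read_excel is a common choice.&lt;/p&gt;

```python
import csv
import io

# Made-up CSV text standing in for a file exported from Excel.
csv_text = "name,region,sales\nAmina,East,1200\nBrian,West,950\n"

# DictReader uses the first line as column headers.
reader = csv.DictReader(io.StringIO(csv_text))
records = [dict(row) for row in reader]

# Values arrive as text, so convert them before doing arithmetic.
total_sales = sum(int(r["sales"]) for r in records)

print(records[0])
print(total_sales)
```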

&lt;p&gt;Excel allows the user to copy data, so they can analyze and change the copy while retaining the original dataset. This becomes useful when an analyst needs to start the process all over again or has made changes that they wish to revert.&lt;/p&gt;

&lt;p&gt;Excel has advanced features used to perform complex analyses such as regression analysis, time series analysis, and statistical analysis. Excel computes measures of central tendency, including &lt;strong&gt;&lt;em&gt;mean&lt;/em&gt;&lt;/strong&gt;, &lt;strong&gt;&lt;em&gt;median&lt;/em&gt;&lt;/strong&gt;, and &lt;strong&gt;&lt;em&gt;mode&lt;/em&gt;&lt;/strong&gt;, measures of spread such as &lt;strong&gt;&lt;em&gt;standard deviation&lt;/em&gt;&lt;/strong&gt;, and measures of the relationship between variables, such as &lt;strong&gt;&lt;em&gt;regression&lt;/em&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;em&gt;correlation&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;
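
&lt;p&gt;To make these measures concrete, here is a minimal Python sketch of what Excel functions such as AVERAGE, MEDIAN, MODE, STDEV.P, and CORREL calculate. The numbers are invented, and pearson is a hypothetical helper written out by hand so that each step of the correlation formula is visible.&lt;/p&gt;

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]          # made-up sample

mean = statistics.mean(scores)             # central tendency
median = statistics.median(scores)
mode = statistics.mode(scores)
std_dev = statistics.pstdev(scores)        # population standard deviation

def pearson(xs, ys):
    # Pearson correlation: shared variation divided by the product of spreads.
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    shared = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return shared / (spread_x * spread_y)

hours = [1, 2, 3, 4, 5]                    # made-up paired data
marks = [2, 4, 5, 4, 5]
r = pearson(hours, marks)

print(mean, median, mode, std_dev, round(r, 4))
```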

&lt;p&gt;&lt;strong&gt;Excel for data science can be used to perform:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data cleaning and processing — remove duplicates, fill in missing values, and transform data&lt;/li&gt;
&lt;li&gt;Data transformation — convert data from one format to another, from a source system’s format to the required format of a destination system.&lt;/li&gt;
&lt;li&gt;Data wrangling — This is removing errors and combining complex data sets to make them more accessible and easier for data analysis.&lt;/li&gt;
&lt;li&gt;Data exploration and visualization — Excel has a rich variety of visualization tools including graphs, charts, and pivot tables which are useful in data exploration.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;To perform its tasks excel uses:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Formulas — Mathematical equations that are formed using functions. Desired data is passed through the functions as input and the desired output is derived based on the formula function applied&lt;/li&gt;
&lt;li&gt;Conditional formatting — This is a way to see patterns and trends in data. It helps users and analysts to easily spot trends and patterns in data using bars, colors, and icons to easily highlight important values.&lt;/li&gt;
&lt;li&gt;Charts — Used to visually represent data in bar, line, and pie charts. Each type of chart suits a particular kind of data set. For example, line graphs work well for time and trend analysis.&lt;/li&gt;
&lt;li&gt;Pivot tables — These are features in Excel that summarize data and give a different view of the data set. This is achieved by comparing chosen columns with those that directly relate to them, without changing the structure of the data set.&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>datascience</category>
      <category>data</category>
      <category>analytics</category>
      <category>excel</category>
    </item>
    <item>
      <title>Exploratory Data Analysis Guide</title>
      <dc:creator>John</dc:creator>
      <pubDate>Fri, 24 Feb 2023 15:13:32 +0000</pubDate>
      <link>https://forem.com/john-maina/exploratory-data-analysis-guide-3d08</link>
      <guid>https://forem.com/john-maina/exploratory-data-analysis-guide-3d08</guid>
      <description>&lt;p&gt;Exploratory Data Analysis (EDA) is a technique data scientists and analysts use to analyze and understand data sets. EDA was first developed by the American mathematician John Tukey in the 1970s. Tukey also co-developed the Fast Fourier Transform algorithm. Data scientists are expected to turn raw data into insights and to advise organizations on how to improve performance.&lt;br&gt;
It would be a challenge for data scientists to present and analyze data without proper visualization tools. Without EDA, data scientists would be making assumptions that are not backed by statistics.&lt;br&gt;
EDA applies graphical and statistical techniques to analyze and summarize data sets. Software like Python and Power BI offers platforms for creating useful visualizations. From data visualization, statisticians can discover patterns, spot anomalies, and investigate correlations between variables. EDA provides a better understanding of data sets and an understandable way to present data and findings. EDA helps affirm that the results provided are valid and applicable.&lt;br&gt;
In this article, we will look at various ways to perform Exploratory Data Analysis and at software that can be used for it, such as Excel, Python, and Power BI. We will elaborate on how EDA helps to identify errors, understand patterns within data with graphical aids, detect outliers and anomalous events, and find interesting relations among variables. Outliers are values that fall far outside the range within which the other values fall.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Why we need Exploratory Data Analysis&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;EDA is useful in discovering hidden patterns in data. Information sourced from EDA can be used to build and refine machine learning models that are fine-tuned to the specific needs of the data. Through EDA we build understanding, discover underlying structures, identify outliers, and test assumptions about the data set.&lt;/p&gt;

&lt;h2&gt;
  
  
  Exploratory Data Analysis process
&lt;/h2&gt;

&lt;p&gt;The EDA process includes:&lt;br&gt;
&lt;strong&gt;Data collection&lt;/strong&gt;&lt;br&gt;
This is the process of gathering data that will be useful when drawing insights on the matter being investigated. To collect data an analyst has to define the problem they want to solve. This will lead to the collection of relevant data.&lt;br&gt;
Data can be extracted from public data sources or private data sources. Public data sources include websites like Kaggle which allow data access without restrictions. You can freely acquire data, perform analysis and draw reasonable output from it.&lt;br&gt;
Private data sources however require authorization and authentication to retrieve data from. These could be company records or private shared websites where companies upload their data.&lt;br&gt;
&lt;strong&gt;Data cleaning&lt;/strong&gt;&lt;br&gt;
This process involves checking the data type of each column, removing unused columns and duplicates, handling missing values, checking for outliers, renaming columns (for example, those with spaces in their names), and checking that values have the correct format. Cleaning leaves data that can be used to draw useful insights.&lt;br&gt;
There are often discussions on how long data cleaning should take. There is no fixed duration, since the process depends on several variables: how raw the data is, what information the analyst needs to extract, and which tools are available, since different tools offer different functions.&lt;br&gt;
&lt;strong&gt;Data processing&lt;/strong&gt;&lt;br&gt;
This is the manipulation and transformation of data into useful insight. It involves univariate, bivariate, and multivariate analysis. Through data processing, we identify trends, either manually or with automated tools. In Python, useful libraries for this include pandas and NumPy.&lt;br&gt;
Univariate analysis is the process of analyzing each variable individually, mainly to understand the distribution of the data and identify outliers and anomalies.&lt;br&gt;
Bivariate analysis examines the relationship between pairs of variables. This can be done by plotting graphs and charts from which the correlation between two variables can be read.&lt;br&gt;
Multivariate analysis involves analyzing relationships among three or more variables. This helps to gain a deeper understanding of the data set.&lt;br&gt;
&lt;strong&gt;Data visualization&lt;/strong&gt;&lt;br&gt;
This involves the graphical representation of data. Visual output enables even those without a data background to draw conclusions from what is shown. It also makes it easier to compare relationships between variables. Data can be visualized using, among others, bar graphs, histograms, line graphs, heatmaps for correlation analysis, and pivot tables in Excel. Tools such as Excel, Python, and Power BI offer strong visualization features.&lt;br&gt;
Tools for Exploratory Data Analysis&lt;br&gt;
&lt;strong&gt;Excel&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Excel is a useful tool for data analytics. Excel allows the user to manage data and develop reporting and insight from the data. Excel has tools to perform the necessary EDA functionalities that include data sourcing, data cleaning, data processing, and data visualization. Excel also allows you to view data stored in a worksheet in software like Python.&lt;br&gt;
&lt;em&gt;Data sourcing&lt;/em&gt; - Excel offers options to import data from the web or from the local storage on which you can perform analysis.&lt;br&gt;
&lt;em&gt;Data cleaning&lt;/em&gt; - Excel allows you to clean data by removing duplicates, filling in missing values, and transforming data. This will enable working with a more refined data set.&lt;br&gt;
&lt;em&gt;Data processing&lt;/em&gt; - Excel has formulas and functions that are useful for data processing. A few include INDEX and MATCH, VLOOKUP, IF, COUNTIF, COUNTIFS, SUMIF, and SUMIFS.&lt;br&gt;
&lt;em&gt;Data visualization&lt;/em&gt; - Excel can create charts, line graphs, bar graphs, and pivot tables to graphically visualize data. In pivot tables, you can add slicers to view data from different and preferred perspectives.&lt;br&gt;
Excel, however, limits the amount of data it can handle to 1,048,576 rows and 16,384 columns per worksheet.&lt;br&gt;
&lt;strong&gt;Python&lt;/strong&gt;&lt;br&gt;
Python is a programming language that is widely used for data analysis, data science, machine learning, and artificial intelligence. Python has built-in functions and third-party libraries that allow it to source, clean, process, and visualize data.&lt;br&gt;
&lt;em&gt;Data sourcing&lt;/em&gt; - Python allows you to source data from the web, local storage, or Excel using the pandas library. It reads data in different formats, including .csv, .xlsx, and others.&lt;br&gt;
&lt;em&gt;Data cleaning&lt;/em&gt; - pandas allows you to check for missing values using the isna() method, for example data.isna() on a DataFrame named data. Combining it with sum() gives the number of missing values in each column, which shows which columns have missing values.&lt;br&gt;
You can use the dropna() method to drop rows or columns with missing values. You can combine text columns using the "+" operator and replace existing columns. Python also has the NumPy library, which allows you to create and manipulate arrays and work with numerical data.&lt;br&gt;
&lt;em&gt;Data processing&lt;/em&gt; - The pandas library allows for data processing by calling methods like .describe(), which gives summary statistics such as the count, mean, standard deviation, minimum, and maximum of numeric columns.&lt;br&gt;
&lt;em&gt;Data visualization&lt;/em&gt; - Python has powerful libraries for visualization, including seaborn and Matplotlib. These libraries allow you to create heatmaps for correlation, line charts, bar graphs, histograms, and other visual tools.&lt;br&gt;
Python code can be shared on Git hosting platforms so that other people can access it.&lt;br&gt;
&lt;strong&gt;Power BI&lt;/strong&gt;&lt;br&gt;
Power BI is a powerful tool for data analysis. It is known for its rich visualization tools. In Power BI, you can import CSV and Excel files. It creates relationships among tables in a dataset using primary keys and foreign keys.&lt;br&gt;
After loading and processing data, the tool tries to create a model that relates the tables in the data. A star schema approach is used to create these models. The automatically created model may not match your needs, so you can edit it.&lt;br&gt;
Power BI reports can be published on Microsoft platforms to allow access by various users.&lt;br&gt;
In this article, we have looked at the usefulness of EDA and how to perform EDA on different platforms. EDA allows users to derive useful insights from given data sets and offers understandable graphical visualizations. Data analysts and scientists are encouraged to always be learning as new modules and frameworks come up to help them achieve their purpose of driving organizational growth using data.&lt;/p&gt;
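
&lt;p&gt;To make the cleaning and univariate steps from this article concrete, here is a minimal sketch in plain Python. The readings are invented, and the 1.5 times interquartile range rule used to flag outliers is a common convention assumed here, not something prescribed by EDA itself.&lt;/p&gt;

```python
import statistics

# Made-up readings; None marks a missing value.
raw = [12, 15, None, 14, 13, 16, 15, 14, 95]

# Data cleaning: drop the missing values.
cleaned = [v for v in raw if v is not None]

# Univariate analysis: quartiles describe the distribution of one variable.
q1, q2, q3 = statistics.quantiles(cleaned, n=4)
iqr = q3 - q1

# Values far outside the middle half of the data are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in cleaned if v > upper_fence or lower_fence > v]

print(len(cleaned), outliers)
```

&lt;p&gt;In practice the same steps are usually done with pandas, but the logic is the same: clean first, then describe each variable and look for values that do not fit.&lt;/p&gt;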

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>development</category>
      <category>discuss</category>
    </item>
    <item>
      <title>Python3 101: Introduction to python for Data Science</title>
      <dc:creator>John</dc:creator>
      <pubDate>Sun, 19 Feb 2023 16:45:02 +0000</pubDate>
      <link>https://forem.com/john-maina/python3-101-introduction-to-python-for-data-science-4b11</link>
      <guid>https://forem.com/john-maina/python3-101-introduction-to-python-for-data-science-4b11</guid>
      <description>&lt;p&gt;Python is a high-level, general-purpose programming language that was first released in 1991 by its creator Guido van Rossum. Guido van Rossum began work on Python in the late 1980s while he was working at the National Research Institute for Mathematics and Computer Science in the Netherlands. He wanted to create a language that was easy to read, write and understand, and that was also open-source and available to everyone.&lt;/p&gt;

&lt;p&gt;Python was initially designed as a scripting language to automate system administration tasks and other small programs. Its design philosophy emphasizes code readability, and its syntax is meant to be simple and easy to understand. As a result, Python quickly gained popularity among developers and has since become one of the most widely used programming languages in the world.&lt;/p&gt;

&lt;p&gt;Guido van Rossum continued to lead the development of Python until he stepped down as the project's lead in 2018. Today, Python is maintained by the Python Software Foundation, a non-profit organization that is dedicated to advancing and promoting the use of Python. &lt;br&gt;
Python is a versatile programming language that can be used for a wide variety of applications. Here are some of the most common uses of Python:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Web development: Python is widely used in web development, with frameworks like Django and Flask being popular choices for building web applications.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Data analysis and visualization: Python's rich set of libraries like Pandas, NumPy, and Matplotlib, make it a popular choice for data analysis, machine learning, and data visualization.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Scientific computing: Python has a strong presence in the scientific computing field, with libraries like SciPy and BioPython providing powerful tools for scientific computing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Artificial intelligence and machine learning: Python is used extensively in artificial intelligence and machine learning applications. Frameworks like TensorFlow, PyTorch, and Keras are popular choices for building machine-learning models.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Desktop GUI applications: Python can also be used to create desktop GUI applications with libraries like PyQt and wxPython.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Game development: Python can be used to build games, with libraries like Pygame providing the tools.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Scripting and automation: Python's simplicity and ease of use make it a popular choice for scripting and automation tasks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Education: Python's readability and easy-to-understand syntax make it an ideal language for beginners, and many educational institutions use Python as a teaching language.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This article will focus on data science and data analysis with Python. As mentioned, Python has rich libraries like pandas and NumPy that come in handy when getting statistics from data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why python is preferred for data science and data analysis.
&lt;/h2&gt;

&lt;p&gt;Python has certain features which make it preferable for data science, data analysis, machine learning, and Artificial intelligence. These features include:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Large and active community: Python has a large and active community of developers, which means that there is a wealth of resources, libraries, and tools available to help with data science and data analysis. This makes it easier for developers to get started and find solutions to problems they encounter.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Extensive libraries: Python has several powerful libraries for data science and data analysis, including NumPy, Pandas, SciPy, Scikit-learn, and Matplotlib, to name just a few. These libraries provide a broad range of functionality, such as data manipulation, statistical analysis, and visualization.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Interoperability: Python can be easily integrated with other programming languages, which makes it an ideal choice for data scientists and analysts who work in multi-language environments.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Open-source: Python is an open-source language, which means that it is free to use and can be customized to suit individual needs. This makes it an affordable and flexible option for data scientists and analysts. &lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Python in-built functions
&lt;/h2&gt;

&lt;p&gt;Built-in functions are functions that ship with Python and can be called by name without importing anything. Each built-in function performs a specific task every time it is called.&lt;br&gt;
Some of Python's built-in functions include:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;print(): Prints the specified message to the console.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;input(): Accepts input from the user through the console.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;len(): Returns the length of an object, such as a string, list, or tuple.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;range(): Returns a sequence of numbers that starts at 0 by default, increases by 1 by default, and stops before a specified number.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;type(): Returns the type of an object, such as int, str, or list.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;int(): Converts a string or float to an integer.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;str(): Converts an object to a string.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;float(): Converts a string or integer to a float.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;list(): Converts an object to a list.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;tuple(): Converts an object to a tuple.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;dict(): Creates a dictionary, which stores values under keys.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
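
&lt;p&gt;A short sketch of these built-in functions in use is shown below. The values are arbitrary examples, and input() is left out because it waits for the user to type something.&lt;/p&gt;

```python
name = "Ada"
print(len(name))                      # length of a string

numbers = list(range(5))              # range(5) stops before 5
as_int = int("42")                    # text to integer
as_float = float(3)                   # integer to float
as_text = str(3.5)                    # float to text
as_tuple = tuple(numbers)             # list to tuple

# type() reports the kind of object a name refers to.
kinds = (type(as_int).__name__, type(as_float).__name__, type(as_text).__name__)

person = dict(name="Ada", age=36)     # keys mapped to values

print(numbers, as_int, as_float, as_text, as_tuple, kinds, person)
```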

&lt;h2&gt;
  
  
  Difference between a list, a tuple, and a set.
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;A list holds several items in a single variable. A list is written with square brackets, [], and can be altered after it is created.&lt;/li&gt;
&lt;li&gt;A tuple is similar to a list in that it holds several items in a single variable. A tuple is written with parentheses, (). Unlike a list, the items of a tuple cannot be altered after it is created.&lt;/li&gt;
&lt;li&gt;A set also holds several items in a single variable. A set is written with curly brackets, {}, although an empty set must be created with set() because {} on its own creates an empty dictionary. Sets are useful for removing duplicates from a list, since they do not repeat items: an item that occurs more than once is stored only once.&lt;/li&gt;
&lt;/ol&gt;
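
&lt;p&gt;The differences above can be seen in a few lines of Python. The fruit names and the point are arbitrary sample values.&lt;/p&gt;

```python
fruits = ["apple", "banana", "apple"]
fruits.append("cherry")               # a list can be changed in place

point = (3, 4)
try:
    point[0] = 10                     # a tuple cannot be changed
    tuple_changed = True
except TypeError:
    tuple_changed = False

unique_fruits = set(fruits)           # duplicates collapse to one entry
empty_set = set()                     # {} on its own would be an empty dict

print(fruits, tuple_changed, sorted(unique_fruits), type({}).__name__)
```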

&lt;h2&gt;
  
  
  Operators in python
&lt;/h2&gt;

&lt;p&gt;In Python, operators are special symbols or characters that perform operations on values or variables. Here are the different types of operators in Python:&lt;/p&gt;

&lt;p&gt;Arithmetic operators: These operators are used to perform mathematical operations. Examples include + (addition), - (subtraction), * (multiplication), / (division), % (modulo or remainder), ** (exponentiation), and // (floor division).&lt;/p&gt;

&lt;p&gt;Assignment operators: These operators are used to assign values to variables. Examples include = (simple assignment), += (addition and assignment), -= (subtraction and assignment), *= (multiplication and assignment), /= (division and assignment), %= (modulo and assignment), **= (exponentiation and assignment), and //= (floor division and assignment).&lt;/p&gt;

&lt;p&gt;Comparison operators: These operators are used to compare values or variables. Examples include == (equality), != (inequality), &amp;gt; (greater than), &amp;lt; (less than), &amp;gt;= (greater than or equal to), and &amp;lt;= (less than or equal to).&lt;/p&gt;

&lt;p&gt;Logical operators: These operators are used to perform logical operations on Boolean values. Examples include and (logical AND), or (logical OR), and not (logical NOT).&lt;/p&gt;

&lt;p&gt;Bitwise operators: These operators are used to perform operations on binary values. Examples include &amp;amp; (bitwise AND), | (bitwise OR), ^ (bitwise XOR), ~ (bitwise NOT), &amp;lt;&amp;lt; (left shift), and &amp;gt;&amp;gt; (right shift).&lt;/p&gt;

&lt;p&gt;Membership operators: These operators are used to test whether a value or variable is a member of a sequence or collection. Examples include in (value is in the sequence) and not in (value is not in the sequence).&lt;/p&gt;

&lt;p&gt;Identity operators: These operators are used to test whether two variables or values refer to the same object in memory. Examples include is (variables refer to the same object) and is not (variables do not refer to the same object).&lt;/p&gt;
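
&lt;p&gt;The sketch below shows a selection of these operators at work. The numbers are arbitrary, and only some operators from each group are shown to keep the example short.&lt;/p&gt;

```python
a, b = 7, 2

# Arithmetic: add, subtract, multiply, divide, remainder, power, floor division.
arithmetic = (a + b, a - b, a * b, a / b, a % b, a ** b, a // b)

# Assignment: update a variable in place.
total = 10
total += 5
total *= 2

# Comparison and logical operators produce Boolean values.
comparison = (a == b, a != b, a > b)
logical = (a > 5 and b > 5, a > 5 or b > 5, not a > 5)

# Bitwise OR and XOR work on the binary forms 111 and 010.
bitwise = (a | b, a ^ b)

# Membership tests whether a value is inside a collection.
membership = (3 in [1, 2, 3], "z" not in "abc")

# Identity tests whether two names refer to the same object.
first = [1, 2]
second = first
third = [1, 2]
identity = (first is second, first is third, first == third)

print(arithmetic, total, comparison, logical, bitwise, membership, identity)
```

&lt;p&gt;The last line of the identity example shows why "is" and "==" differ: the first and third lists hold equal values but are separate objects in memory.&lt;/p&gt;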

&lt;h2&gt;
  
  
  Pandas
&lt;/h2&gt;

&lt;p&gt;We mentioned pandas as one of the libraries that Python uses to work with data and produce output that can be used to make informed decisions.&lt;br&gt;
Pandas creates data frames from objects like lists and tuples, but the most common source is a dictionary.&lt;/p&gt;

&lt;p&gt;Pandas is a popular open-source Python library for data manipulation and analysis. It provides high-performance, easy-to-use data structures and data analysis tools for data processing, cleaning, and analysis.&lt;/p&gt;

&lt;p&gt;Pandas provides two primary data structures: Series and DataFrame. A Series is a one-dimensional array-like object that can hold any data type, while a DataFrame is a two-dimensional table-like data structure that consists of columns and rows.&lt;/p&gt;

&lt;p&gt;Pandas also provides a rich set of tools for data manipulation, including merging and joining data sets, pivoting tables, and data reshaping. It also provides tools for handling missing data, time series analysis, and statistical analysis.&lt;br&gt;
Some key features of pandas:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Data manipulation and cleaning: Pandas provides powerful tools for data cleaning and manipulation, such as removing duplicates, filling in missing data, and transforming data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Data aggregation: Pandas provides functions for grouping and aggregating data, which allows users to perform complex data manipulations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Easy data input and output: Pandas supports reading and writing data in a variety of formats, including CSV, Excel, SQL databases, and JSON.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Data visualization: Pandas integrates with other Python libraries such as Matplotlib and Seaborn to create high-quality data visualizations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Fast and efficient: Pandas is designed to be fast and efficient, with vectorized operations that handle large in-memory data sets well.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
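&lt;p&gt;The cleaning and aggregation features listed above can be sketched in a few lines. The sales table below is invented for illustration, and filling missing amounts with zero is an assumption that will not suit every data set.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical sales records with one duplicate row and one missing amount.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "amount": [100.0, 100.0, 250.0, None, 50.0],
})

deduplicated = sales.drop_duplicates()              # removes the repeated East row
cleaned = deduplicated.fillna({"amount": 0.0})      # assumption: missing means zero
totals = cleaned.groupby("region")["amount"].sum()  # aggregate per region

print(totals)
```

&lt;p&gt;Here totals is a Series indexed by region, with 100.0 for East and 300.0 for West.&lt;/p&gt;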

&lt;h2&gt;
  
  
  NumPy
&lt;/h2&gt;

&lt;p&gt;NumPy is a popular Python library for numerical computing that provides support for large, multi-dimensional arrays and matrices, along with a vast collection of high-level mathematical functions to operate on these arrays.&lt;br&gt;
NumPy is a useful library in Python for several reasons, especially in the fields of data science, machine learning, and scientific computing.&lt;br&gt;
Here are some of the ways NumPy is useful:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Efficient mathematical operations: NumPy provides an array object that is much more efficient than Python's built-in list type for mathematical operations on large datasets. This is because NumPy's core is implemented in C and uses vectorization, which applies an operation to a whole array at once instead of looping over elements in Python.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Multi-dimensional arrays: NumPy provides support for multi-dimensional arrays and matrices, which are essential in many areas of data science and scientific computing. These arrays make it easy to store and manipulate large amounts of data, such as images, audio, and time series data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Mathematical functions: NumPy provides a range of mathematical functions such as trigonometric functions, logarithmic functions, and statistical functions. These functions are optimized for use with NumPy arrays, which allows for faster and more efficient computation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Linear algebra: NumPy provides support for linear algebra operations such as matrix multiplication, inversion, and decomposition. These operations are critical in many areas of data science and scientific computing, such as regression analysis and image processing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Fourier transforms: NumPy provides support for Fourier transforms, which are used in many areas of signal processing and image analysis.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Random number generation: NumPy provides support for random number generation, which is important in many areas of data science and scientific computing, such as simulation and modeling.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
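&lt;p&gt;A few of the points above, shown as a minimal sketch: a two-dimensional array, vectorized arithmetic, an element-wise mathematical function, and random number generation. The seed value is arbitrary and only makes the random draw reproducible.&lt;/p&gt;

```python
import numpy as np

# A 2-D array (2 rows, 3 columns) built from nested lists.
grid = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

doubled = grid * 2       # vectorized: applied to every element, no Python loop
roots = np.sqrt(grid)    # element-wise mathematical function

# Seeded generator so the draw is reproducible; the seed itself is arbitrary.
rng = np.random.default_rng(seed=42)
samples = rng.random(5)  # five floats in the half-open interval [0, 1)

print(grid.shape)        # (2, 3)
print(doubled[1, 2])     # 12.0
```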


&lt;p&gt;NumPy is a fundamental package for scientific computing with Python, as many other libraries and tools depend on it for data processing, analysis, and visualization. Popular data science and machine learning libraries that build on or interoperate with NumPy include pandas, scikit-learn, and TensorFlow.&lt;br&gt;
NumPy provides efficient and convenient ways to perform mathematical operations on arrays, such as element-wise addition, subtraction, multiplication, and division, as well as more advanced linear algebra operations, like matrix multiplication, inversion, and eigenvalue decomposition. It also offers tools for statistical analysis, random number generation, and signal processing.&lt;/p&gt;
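&lt;p&gt;The sketch below contrasts element-wise arithmetic with the linear algebra operations mentioned above. The two small matrices are arbitrary examples; the first is diagonal so the results are easy to check by hand.&lt;/p&gt;

```python
import numpy as np

a = np.array([[2.0, 0.0], [0.0, 4.0]])  # diagonal, so easy to verify by hand
b = np.array([[1.0, 2.0], [3.0, 4.0]])

total = a + b                        # element-wise addition
product = a * b                      # element-wise multiplication, not a matrix product
matmul = a @ b                       # matrix multiplication
inverse = np.linalg.inv(a)           # matrix inverse
eigenvalues = np.linalg.eigvals(a)   # eigenvalues of a

print(matmul)
print(inverse)
```

&lt;p&gt;For this diagonal matrix the inverse has 0.5 and 0.25 on its diagonal, and the eigenvalues are the diagonal entries 2 and 4.&lt;/p&gt;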

&lt;h2&gt;
  
  
  Functions using NumPy
&lt;/h2&gt;

&lt;p&gt;NumPy offers a variety of functions to manipulate data and produce useful output. Some functions in the numpy module include:&lt;/p&gt;

&lt;p&gt;numpy.array(): creates an array from a list or tuple.&lt;/p&gt;

&lt;p&gt;numpy.zeros(): creates an array of all zeros.&lt;/p&gt;

&lt;p&gt;numpy.ones(): creates an array of all ones.&lt;/p&gt;

&lt;p&gt;numpy.random.rand(): creates an array of random numbers drawn uniformly from the interval [0, 1).&lt;/p&gt;

&lt;p&gt;numpy.concatenate(): concatenates two or more arrays.&lt;/p&gt;

&lt;p&gt;numpy.sum(): calculates the sum of array elements.&lt;/p&gt;

&lt;p&gt;numpy.mean(): calculates the mean of array elements.&lt;/p&gt;

&lt;p&gt;numpy.std(): calculates the standard deviation of array elements.&lt;/p&gt;

&lt;p&gt;numpy.max(): returns the maximum element in an array.&lt;/p&gt;

&lt;p&gt;numpy.min(): returns the minimum element in an array.&lt;/p&gt;

&lt;p&gt;numpy.exp(): calculates the exponential of array elements.&lt;/p&gt;

&lt;p&gt;numpy.log(): calculates the natural logarithm of array elements.&lt;/p&gt;
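&lt;p&gt;Here is a short sketch that exercises most of the functions listed above on a small array. The input values are arbitrary, and np is the conventional alias for the numpy module.&lt;/p&gt;

```python
import numpy as np

zeros = np.zeros(3)                      # three zeros
ones = np.ones(3)                        # three ones
values = np.array([1.0, 2.0, 3.0, 4.0])  # arbitrary sample values

joined = np.concatenate([zeros, ones])   # six elements: zeros followed by ones
total = np.sum(values)                   # 10.0
average = np.mean(values)                # 2.5
spread = np.std(values)                  # population standard deviation (ddof=0)
largest = np.max(values)                 # 4.0
smallest = np.min(values)                # 1.0
round_trip = np.log(np.exp(values))      # log undoes exp, up to floating-point error

print(joined)
print(total, average, largest, smallest)
```

&lt;p&gt;By default numpy.std() divides by the number of elements; passing ddof=1 gives the sample standard deviation instead.&lt;/p&gt;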

&lt;p&gt;Python is suitable for data analytics, data science, machine learning, and AI because of its extensive libraries and modules. Mastering Python is relatively easy because of its simple syntax.&lt;/p&gt;

</description>
      <category>discuss</category>
    </item>
  </channel>
</rss>
