<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Edina Bwari</title>
    <description>The latest articles on Forem by Edina Bwari (@edina).</description>
    <link>https://forem.com/edina</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1172855%2F1bddc82e-de28-4363-95e2-cccf688763ff.gif</url>
      <title>Forem: Edina Bwari</title>
      <link>https://forem.com/edina</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/edina"/>
    <language>en</language>
    <item>
      <title>Exploratory Data Analysis using Data Visualization Techniques.</title>
      <dc:creator>Edina Bwari</dc:creator>
      <pubDate>Fri, 06 Oct 2023 20:03:47 +0000</pubDate>
      <link>https://forem.com/edina/exploratory-data-analysis-using-data-visualization-techniques-3mnn</link>
      <guid>https://forem.com/edina/exploratory-data-analysis-using-data-visualization-techniques-3mnn</guid>
      <description>&lt;p&gt;The better you know your data, the better your analysis will be. Exploratory data analysis (EDA) is an approach to analyzing and summarizing data in order to gain insights and identify patterns or trends. It is often the first step in data analysis and is used to understand the structure of the data, detect outliers and anomalies, and inform the selection of appropriate statistical models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Objectives of EDA.
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Confirm that the data makes sense in the context of the business problem.&lt;/li&gt;
&lt;li&gt;Uncover and resolve data quality issues such as missing data, duplicates, and incorrect values.&lt;/li&gt;
&lt;li&gt;Ensure that the results data scientists produce are valid and applicable to the desired business outcomes and goals.&lt;/li&gt;
&lt;li&gt;Help stakeholders confirm that they are asking the right questions.&lt;/li&gt;
&lt;li&gt;Help answer questions about standard deviations, categorical variables, and confidence intervals.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Types of exploratory data analysis.
&lt;/h2&gt;

&lt;p&gt;EDA can be classified into two categories, &lt;em&gt;graphical&lt;/em&gt; and &lt;em&gt;non-graphical&lt;/em&gt;, each of which has a univariate and a multivariate type. &lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--LMKPtGQu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gxo0eje6rdqz122xqohd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--LMKPtGQu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gxo0eje6rdqz122xqohd.png" alt="four types of EDA" width="275" height="183"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Univariate non-graphical.&lt;/strong&gt;&lt;br&gt;
The data being analyzed consists of just one variable, so the analysis does not deal with causes or relationships. The main purpose of univariate analysis is to describe the data and find patterns that exist within it.&lt;br&gt;
&lt;strong&gt;Univariate graphical.&lt;/strong&gt;&lt;br&gt;
Graphical methods give a fuller picture of a single variable than summary statistics alone. Common types of univariate graphics include stem-and-leaf plots, histograms, and box plots.&lt;br&gt;
&lt;strong&gt;Multivariate non-graphical.&lt;/strong&gt;&lt;br&gt;
Multivariate data arises from more than one variable. Multivariate non-graphical EDA techniques generally show the relationship between two or more variables of the data through cross-tabulation or statistics.&lt;br&gt;
&lt;strong&gt;Multivariate graphical.&lt;/strong&gt;&lt;br&gt;
Multivariate graphical EDA uses graphics to display relationships between two or more variables. An example is a grouped bar plot or bar chart.&lt;/p&gt;
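
&lt;p&gt;As a minimal sketch of the two non-graphical types, the snippet below computes summary statistics for one variable and a cross-tabulation of two variables. The DataFrame and its column names (&lt;code&gt;age&lt;/code&gt;, &lt;code&gt;gender&lt;/code&gt;, &lt;code&gt;plan&lt;/code&gt;) are invented for illustration and do not come from any real dataset.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame({
    "age": [23, 35, 31, 52, 46, 29],
    "gender": ["F", "M", "F", "M", "F", "M"],
    "plan": ["basic", "premium", "basic", "premium", "premium", "basic"],
})

# Univariate non-graphical: describe a single variable on its own.
age_summary = df["age"].describe()

# Multivariate non-graphical: cross-tabulate two categorical variables.
plan_by_gender = pd.crosstab(df["gender"], df["plan"])

print(age_summary)
print(plan_by_gender)
```

&lt;p&gt;The summary describes one column in isolation, while the cross-tabulation counts how often each combination of two columns occurs.&lt;/p&gt;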
&lt;h2&gt;
  
  
  Exploratory Data Analysis Tools.
&lt;/h2&gt;

&lt;p&gt;This article focuses only on &lt;strong&gt;Python&lt;/strong&gt;. Python offers a variety of libraries for exploratory data analysis, and several of them include strong visualization tools. Visualizing the data makes it easier to produce a clear report.&lt;br&gt;
To use Python for EDA, follow these steps:&lt;br&gt;
&lt;strong&gt;Step 1: Imports and Reading Data.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.pylab&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;seaborn&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sns&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;style&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'ggplot'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_option&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'max_columns'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'filename.data.csv'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With these libraries imported and the data loaded, you are ready to start exploring the data and creating visualizations in your Python environment.&lt;br&gt;
&lt;strong&gt;Step 2: Data Understanding.&lt;/strong&gt;&lt;br&gt;
This involves getting a grasp of the data you're working with: its characteristics, structure, and content. Here are some ways to achieve data understanding using Python code.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dataframe shape
&lt;code&gt;df.shape&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;head and tail
&lt;code&gt;df.head(5)&lt;/code&gt; and &lt;code&gt;df.tail(5)&lt;/code&gt; &lt;/li&gt;
&lt;li&gt;dtypes
&lt;code&gt;df.dtypes&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;describe
&lt;code&gt;df.describe()&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
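
&lt;p&gt;The calls listed above can be tried on a small, self-contained example. This is only a sketch: the DataFrame below is a made-up stand-in for the CSV file loaded in Step 1, and its column names are hypothetical.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical toy data standing in for the CSV loaded in Step 1.
df = pd.DataFrame({
    "product": ["pen", "notebook", "bag", "ruler"],
    "price": [1.5, 3.0, 25.0, 0.8],
    "in_stock": [True, True, False, True],
})

print(df.shape)       # (number of rows, number of columns)
print(df.head(2))     # first two rows
print(df.tail(2))     # last two rows
print(df.dtypes)      # data type of each column
print(df.describe())  # summary statistics for the numeric columns
```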

&lt;p&gt;&lt;strong&gt;Step 3: Data Preparation.&lt;/strong&gt;&lt;br&gt;
In this step you transform and clean the raw data to make it suitable for analysis, for example by dropping irrelevant columns and rows and identifying duplicated columns.&lt;/p&gt;
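
&lt;p&gt;A minimal sketch of typical cleaning operations follows. It assumes a toy DataFrame with hypothetical column names, and it checks for duplicated rows and missing values, which is one common choice rather than the only way to prepare data.&lt;/p&gt;

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with a duplicate row, a missing value, and an unneeded column.
df = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "price": [10.0, 12.5, 12.5, np.nan],
    "notes": ["a", "b", "b", "c"],
})

# Drop a column that is irrelevant to the analysis.
df = df.drop(columns=["notes"])

# Count, then remove, fully duplicated rows.
n_duplicates = int(df.duplicated().sum())
df = df.drop_duplicates()

# Count missing values per column, then drop rows that contain any.
missing_per_column = df.isna().sum()
df = df.dropna().reset_index(drop=True)

print(n_duplicates)
print(missing_per_column)
print(df)
```

&lt;p&gt;Whether to drop or fill missing values depends on the dataset; dropping is used here only because it is the simplest option to show.&lt;/p&gt;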

&lt;p&gt;&lt;strong&gt;Step 4: Feature Understanding.&lt;/strong&gt;&lt;br&gt;
This step is univariate analysis: you examine each feature (variable or attribute) in your dataset on its own to understand its distribution. That understanding informs how you create, select, and transform features to improve the performance and interpretability of machine learning models or the effectiveness of data analysis. Typical tools here are plots of feature distributions: histograms, KDE plots, and box plots.&lt;/p&gt;
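
&lt;p&gt;As a rough sketch of this step, the snippet below draws a histogram and a box plot for one feature using plain matplotlib. The feature name &lt;code&gt;price&lt;/code&gt;, the synthetic values, and the output file name are all assumptions made for illustration.&lt;/p&gt;

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical feature: 200 synthetic values drawn from a normal distribution.
rng = np.random.default_rng(seed=0)
df = pd.DataFrame({"price": rng.normal(loc=50, scale=10, size=200)})

fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: how many values fall into each bin.
ax_hist.hist(df["price"], bins=20)
ax_hist.set_title("Histogram of price")
ax_hist.set_xlabel("price")

# Box plot: median, quartiles, and potential outliers.
ax_box.boxplot(df["price"])
ax_box.set_title("Box plot of price")

fig.tight_layout()
fig.savefig("feature_distribution.png")  # hypothetical output file name
```

&lt;p&gt;A KDE plot of the same feature could be drawn with seaborn's &lt;code&gt;sns.kdeplot(df["price"])&lt;/code&gt;, using the seaborn import from Step 1.&lt;/p&gt;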

&lt;p&gt;&lt;strong&gt;Step 5: Feature Relationships.&lt;/strong&gt;&lt;br&gt;
Here, you focus on understanding how different features (variables) in your dataset relate to each other. This step helps you uncover patterns, dependencies, and interactions between features, which can be valuable for model building, feature selection, and gaining insights from your data. Typical tools for this step are scatter plots, correlation heatmaps, pair plots, and group-by comparisons.&lt;/p&gt;
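
&lt;p&gt;The sketch below illustrates three of these tools on a toy dataset: a correlation matrix, a group-by comparison, and a scatter plot next to a correlation heatmap. The data, column names, and output file name are invented for illustration, and the heatmap is drawn with plain matplotlib rather than seaborn to keep the example short.&lt;/p&gt;

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "exam_score": [52, 55, 61, 68, 74, 79],
    "group": ["A", "A", "A", "B", "B", "B"],
})

# Correlation matrix of the numeric features.
corr = df[["hours_studied", "exam_score"]].corr()

# Group-by comparison: average score per group.
mean_score_by_group = df.groupby("group")["exam_score"].mean()

fig, (ax_scatter, ax_heat) = plt.subplots(1, 2, figsize=(10, 4))

# Scatter plot of one feature against another.
ax_scatter.scatter(df["hours_studied"], df["exam_score"])
ax_scatter.set_xlabel("hours_studied")
ax_scatter.set_ylabel("exam_score")

# Correlation heatmap drawn from the matrix values.
image = ax_heat.imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
ax_heat.set_xticks(range(len(corr.columns)))
ax_heat.set_xticklabels(corr.columns)
ax_heat.set_yticks(range(len(corr.columns)))
ax_heat.set_yticklabels(corr.columns)
fig.colorbar(image, ax=ax_heat)

fig.tight_layout()
fig.savefig("feature_relationships.png")  # hypothetical output file name

print(corr)
print(mean_score_by_group)
```

&lt;p&gt;With seaborn, &lt;code&gt;sns.heatmap(corr, annot=True)&lt;/code&gt; and &lt;code&gt;sns.pairplot(df)&lt;/code&gt; produce a labeled heatmap and a pair plot in one call each.&lt;/p&gt;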

</description>
      <category>python</category>
      <category>analyst</category>
      <category>newbie</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Data science for beginners Detailed Road map for 2023–2024.</title>
      <dc:creator>Edina Bwari</dc:creator>
      <pubDate>Sat, 30 Sep 2023 18:36:31 +0000</pubDate>
      <link>https://forem.com/edina/data-science-for-beginners-in-week-one-detailed-road-map-for-2023-2024-1kpe</link>
      <guid>https://forem.com/edina/data-science-for-beginners-in-week-one-detailed-road-map-for-2023-2024-1kpe</guid>
      <description>&lt;p&gt;I'll discuss the following in this article:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;What is data science?&lt;/li&gt;
&lt;li&gt;Why Data Science? &lt;/li&gt;
&lt;li&gt;How Should a Novice Approach Data Science?&lt;/li&gt;
&lt;li&gt;Job Roles in Data Science&lt;/li&gt;
&lt;li&gt;In summary&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Data science is a field of study that uses statistical tools and techniques to extract meaningful insights from data. Definitions vary, but in the simplest terms, data science is the use of data to find answers to problems. Because it helps organizations make decisions grounded in logic and evidence rather than intuition alone, data science has come to play a significant role in the modern corporate world.&lt;/p&gt;

&lt;p&gt;There is no set curriculum for how data science should be taught, but data science is not difficult to start learning. This article gives you some insight into what to expect from an introductory data science course.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Data Science?
&lt;/h2&gt;

&lt;p&gt;Data science is the study of how to use statistics and machine learning to analyze raw data and draw inferences from it. Data scientists use technology and statistical analysis to extract new insights from data collections. &lt;/p&gt;

&lt;p&gt;In brief, data science involves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Statistics, computer science, mathematics&lt;/li&gt;
&lt;li&gt;Data cleaning and formatting&lt;/li&gt;
&lt;li&gt;Data visualization&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why Data Science?
&lt;/h2&gt;

&lt;p&gt;The world is changing quickly, as the ongoing development of new technology shows. Businesses are increasingly looking to new solutions to stay competitive in this dynamic environment, and big data and data science have become important instruments for fostering their growth. To find important patterns and insights, data scientists analyze enormous, complex datasets, including text and image data. This work has become essential, especially in industries that produce large volumes of data every day, such as healthcare, retail, and finance. With the growth of big data, businesses are gathering vast amounts of customer data, which makes it crucial to use this data for better decision-making. As a result, demand for data scientists is increasing, and more jobs are available to newcomers. &lt;/p&gt;

&lt;h2&gt;
  
  
  How to Start Learning Data Science as a Beginner.
&lt;/h2&gt;

&lt;p&gt;It's essential to have a specific aim in mind before starting your journey into the world of data science. Do you intend to pursue this field as a long-term profession, or are you learning it for academic projects in college? Your goals will shape your learning path. For instance, a basic grasp can be sufficient if your goal is to apply data science in undergraduate projects. However, studying professional and advanced topics becomes crucial if you want to pursue a long-term career. Data scientists come from varied educational and professional backgrounds, but most should be proficient in, and ideally masters of, four key areas.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Domain Knowledge.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Even though some people undervalue domain knowledge in data science, its significance cannot be overstated. Imagine that you want to work in the health care sector as a data scientist. Your credentials become substantially more valuable if you have a solid grasp of ideas relevant to health care, such as clinical decision support. In such circumstances, your subject expertise works to your advantage.&lt;br&gt;
As a novice, therefore, learn the required technologies and tools, understand the underlying principles, and practice applying what you have learned. With perseverance and commitment, you can develop a solid foundation in data science and master the subject. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Math Skills&lt;/strong&gt;&lt;br&gt;
 A solid foundation in math and computer science is often required. This may include training in calculus, linear algebra, statistics, and programming. Calculus, linear algebra, and statistics are crucial because they help us understand the many machine learning methods that are central to data science. Statistics also matters because it is used directly in data analysis, and probability, which underpins statistics, is regarded as a prerequisite for mastering machine learning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Computer Science&lt;/strong&gt;&lt;br&gt;
As you begin your data science journey, you need a solid foundation. The data science field requires skill and experience in computer science or programming. You should learn at least one programming language, such as Python, SQL, Scala, Java, or R.&lt;br&gt;
Here are some of the languages and skills to have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Basics of data structures and algorithms&lt;/li&gt;
&lt;li&gt;A query language such as SQL
&lt;/li&gt;
&lt;li&gt;A programming language such as R or Python&lt;/li&gt;
&lt;li&gt;A visualization tool such as Power BI, Qlik Sense, or QlikView&lt;/li&gt;
&lt;li&gt;Basic statistics for machine learning and deep learning algorithms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Communication Skills&lt;/strong&gt;&lt;br&gt;
This is a soft skill that covers both spoken and written communication. In a data science project, once the analysis has produced findings, they must be explained to others. This can be a report that you provide to your team or employer, a blog post, or, frequently, a presentation to a group of coworkers. Whatever the case, communicating the findings is a necessary part of every data science project. So, to become a data scientist, you must have good communication skills.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Science Job Roles
&lt;/h2&gt;

&lt;p&gt;The roles described below are only a handful of the many employment options available in the data science and analytics industry. Depending on your interests and abilities, you might also consider a career as a statistician, business intelligence analyst, machine learning engineer, natural language processing (NLP) engineer, or computer vision engineer. It’s crucial to pick a path that fits your interests and professional objectives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Scientist.&lt;/strong&gt;&lt;br&gt;
As a data scientist, you'll extract, analyze and interpret large amounts of data from a range of sources, using algorithmic, data mining, artificial intelligence, machine learning and statistical tools, to make it accessible to businesses. Once you've interpreted the data, you'll present your results using clear and engaging language.&lt;br&gt;
You'll use your technical, analytical and communication skills to collect and examine data to help a business find patterns and solve problems. This can be for many purposes, for example, predicting what customers will buy or tackling plastic pollution. Large datasets must be gathered, cleaned, and analyzed by data scientists to yield insightful conclusions and guide decision-making. To create prediction models and address challenging issues, they employ a variety of statistical and machine learning techniques. To find possibilities to use data to fuel business success, data scientists frequently collaborate closely with business stakeholders.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Analyst.&lt;/strong&gt;&lt;br&gt;
Data analysts concentrate on analyzing data to give their organizations useful insights. They analyze data sets to find ways to solve problems relating to a business's customers, and they communicate this information to management and other stakeholders. Data analysts work in many different industries, such as business, finance, criminal justice, science, medicine, and government.&lt;br&gt;
The role of a data analyst can be defined as someone who has the knowledge and skills to turn raw data into information and insight, which can be used to make business decisions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Engineer.&lt;/strong&gt;&lt;br&gt;
Data engineers lay the foundation of a database and its architecture. They assess a wide range of requirements and apply relevant database techniques to create a robust architecture. The data engineer then begins the implementation process and develops the database from scratch. At regular intervals, they also carry out testing to identify any bugs or performance issues. A data engineer is tasked with maintaining the database and ensuring that it works smoothly without causing any disruption, because when a database stops working, it brings the associated IT infrastructure to a halt. The expertise of a data engineer is especially needed to manage large-scale processing systems where performance and scalability issues need continuous maintenance. &lt;br&gt;
Data engineers can also support the data science team by constructing dataset procedures that can help with data mining, modeling, and production. In this way, their participation is crucial in enhancing the quality of data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Architect.&lt;/strong&gt;&lt;br&gt;
A data architect is an expert who formulates the organizational data strategy, including standards of data quality, the flow of data within the organization, and security of data. It's the vision of this data management professional that converts business requirements into technical requirements. Data architects design the overall structure and organization of data within an organization. They create data models, define data standards, and ensure data is stored, integrated, and accessed effectively. Data architects play a critical role in establishing data governance and ensuring data quality.&lt;/p&gt;

&lt;h2&gt;
  
  
  In Conclusion
&lt;/h2&gt;

&lt;p&gt;Data scientists are in high demand and are among the highest-paid professionals in the data science field. As data keeps growing, business organizations have increased their investment in data infrastructure and in data science solutions, so this demand is expected to keep growing over the next decade. If you wish to build a career as a data scientist, you can create a strong learning plan using this guide. After learning the skills, make sure to work on a diverse set of data science projects, because practical experience is preferred over purely theoretical knowledge for a data scientist job.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>newbie</category>
      <category>python</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
