<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Brendah Achieng</title>
    <description>The latest articles on Forem by Brendah Achieng (@archybrendah).</description>
    <link>https://forem.com/archybrendah</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1028159%2F19a4e2ba-fd2a-4bdd-980c-30d17c36907c.jpg</url>
      <title>Forem: Brendah Achieng</title>
      <link>https://forem.com/archybrendah</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/archybrendah"/>
    <language>en</language>
    <item>
      <title>Introduction to Data Version Control</title>
      <dc:creator>Brendah Achieng</dc:creator>
      <pubDate>Sat, 01 Apr 2023 20:22:28 +0000</pubDate>
      <link>https://forem.com/archybrendah/introduction-to-data-version-control-2g1a</link>
      <guid>https://forem.com/archybrendah/introduction-to-data-version-control-2g1a</guid>
      <description>&lt;p&gt;&lt;strong&gt;Data Version Control&lt;/strong&gt; (DVC) is a free, open-source system for managing data, machine learning experiments and machine learning automation. Because it records which model uses which dataset and which steps produced a result, data scientists no longer have to track this by hand. They can manage large datasets with ease, which also makes collaboration better.&lt;/p&gt;

&lt;p&gt;Data Version Control was first released in 2017 as a simple command-line tool. It builds on existing tools and practices such as Git and CI. It tracks the changing versions of data and the changes each commit makes to any tracked file. DVC is therefore like Git for machine learning projects.&lt;/p&gt;

&lt;p&gt;A .dvc file is lightweight, so it is stored alongside the code on GitHub and downloaded together with it. The large datasets and model files themselves are always stored on the DVC remote storage, while the .dvc files that point to them are stored on GitHub.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DVC design principles&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Codification:&lt;/strong&gt; Project aspects such as data and model versions or machine learning experiments are defined in human-readable metafiles.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Versioning:&lt;/strong&gt; DVC metafiles are committed to Git, which enables versioning and sharing of the entire project (datasets, source code, configuration, parameters and metrics) using Git.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Secure Collaboration:&lt;/strong&gt; Access and permissions to the project can be controlled.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Characteristics of DVC&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Data Version Control takes advantage of existing technologies with the aim of bringing the best software engineering practices to the field of data science. Some of the characteristics of DVC include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Easy to use and install:&lt;/strong&gt;&lt;br&gt;
DVC doesn't require special infrastructure or knowledge. Furthermore, it does not depend on any external services. DVC can be easily integrated with existing tools like Git.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Can work on top of Git Repo:&lt;/strong&gt;&lt;br&gt;
DVC follows the familiar Git workflow: commit, branching, pull requests, pull, push, clone and so on. It can also work on its own, without Git, but then without the versioning capabilities.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;DVC doesn't depend on the platform:&lt;/strong&gt;&lt;br&gt;
It can run on all major operating systems. It is independent of programming languages and machine learning libraries.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How to install Data Version Control on Windows&lt;/strong&gt;&lt;br&gt;
DVC can also be installed on Linux and macOS. This article, however, covers the Windows installation.&lt;br&gt;
To use DVC as a Python library, you can install it with conda or with pip.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Installation with choco&lt;/strong&gt;&lt;br&gt;
To install from the command line with Chocolatey, use the choco command:&lt;br&gt;
&lt;strong&gt;$ choco install dvc&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Installation with conda:&lt;/strong&gt;&lt;br&gt;
Requires a Miniconda or Anaconda distribution. Run conda from the Anaconda Prompt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;$ conda install -c conda-forge mamba&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;$ mamba install -c conda-forge dvc&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Installation using pip:&lt;/strong&gt;&lt;br&gt;
Creating a virtual environment is recommended, or use pipx to keep DVC isolated from your local environment. Python 3.8+ is needed to get the latest version of DVC.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;$ pip install dvc&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Windows Installer:&lt;/strong&gt;&lt;br&gt;
Go to the &lt;a href="https://dvc.org/"&gt;https://dvc.org/&lt;/a&gt; homepage and get the self-contained executable installer, which is available from the &lt;strong&gt;Download&lt;/strong&gt; button. You can also get it from the release page on GitHub.&lt;br&gt;
To update DVC, download and run the installer again. Use the Windows uninstaller in case you want to remove the program from your machine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Advantages of Data Version Control&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Organized Machine learning data-&lt;/strong&gt;&lt;br&gt;
DVC uses the data pipeline concept to version data with Git. Because the pipelines are lightweight, they keep workflows organized and reproducible.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Share Models via Cloud Storage-&lt;/strong&gt;&lt;br&gt;
With centralized data storage, scientists find it easy to run experiments on a single shared machine, which leads to better resource utilization.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Reproducibility-&lt;/strong&gt;&lt;br&gt;
DVC repositories store the history of a project, including what changes were made and when. It can also use no-code pulls to update requests with just one commit. The easy-to-use command-line interface allows scientists to reproduce and organize feature stores with the dvc get and dvc import commands.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Track &amp;amp; Visualize ML Models-&lt;/strong&gt;&lt;br&gt;
Versioning is achieved using Git workflows such as pull requests and pushes. DVC's built-in cache stores all the machine learning artifacts, which are then synchronized with remote cloud storage. DVC therefore allows data and models to be tracked and versioned.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Disadvantages of DVC&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;a) Poor Performance in Sloppy Architecture&lt;/strong&gt;&lt;br&gt;
Data Version Control works alongside Git, so team members cannot enjoy the full benefits of the system if some information about the datasets for a given project is missing. Teams may also have to manually develop extra features on top of DVC to meet certain ML demands.&lt;br&gt;
&lt;strong&gt;b) Redundancy&lt;/strong&gt;&lt;br&gt;
DVC has its own pipeline management, so using a separate pipeline tool alongside it leads to redundancy.&lt;br&gt;
&lt;strong&gt;c) Incorrect Configuration Risk&lt;/strong&gt;&lt;br&gt;
If the team forgets to add an output file, there is a risk that the pipeline is configured incorrectly. Furthermore, a version of a project produced with DVC last year may not behave the same way under today's circumstances.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>dataversioncontrol</category>
      <category>datamanagement</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Getting Started With Sentiment Analysis</title>
      <dc:creator>Brendah Achieng</dc:creator>
      <pubDate>Tue, 28 Mar 2023 19:38:55 +0000</pubDate>
      <link>https://forem.com/archybrendah/getting-started-with-sentiment-analysis-152h</link>
      <guid>https://forem.com/archybrendah/getting-started-with-sentiment-analysis-152h</guid>
      <description>&lt;p&gt;Sentiment analysis (opinion mining) is a natural language processing (NLP) technique that focuses on analyzing and finding the intent or emotion behind a given text or speech.&lt;br&gt;
There is always a sentiment behind any written or spoken statement. It could be negative, positive or neutral.&lt;/p&gt;

&lt;p&gt;Sentiment analysis helps automate the processing of large amounts of data in real time. It can be used for analyzing customer feedback, survey responses and product reviews, and for social media monitoring, reputation management and customer experience. Business decisions can then be made after analyzing and understanding people's reactions to a given commodity.&lt;/p&gt;

&lt;p&gt;Sentiment analysis is fast becoming an essential tool for understanding the sentiment behind all types of data. Being able to automatically understand the responses of over 5000 customers to a given survey is a great gain for a business.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Importance of sentiment analysis&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Sorting large amounts of data:&lt;/strong&gt; Manually sorting through thousands of tweets or customer survey responses is very tedious. Sentiment analysis helps analyze large amounts of unstructured data within a short period of time.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Real time analysis:&lt;/strong&gt; With sentiment analysis models, urgent or critical issues can be detected in real time. For example, an angry customer who needs immediate attention can be identified immediately and the situation dealt with.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Consistent criteria:&lt;/strong&gt; Using a centralized sentiment analysis model helps keep the interpretation of data consistent and to a fixed standard. Manual interpretations can be biased, because people are influenced by their own experiences, beliefs and thoughts.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How Does Sentiment Analysis Work?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Using machine learning and natural language processing, sentiment analysis can determine whether a text is neutral, positive or negative.&lt;/p&gt;

&lt;p&gt;The main approaches to sentiment analysis are:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1.Rule-based sentiment analysis.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A set of manually created rules is used for the analysis. NLP techniques such as lexicons (lists of words), stemming, tokenization and parsing are used.&lt;/p&gt;

&lt;p&gt;Lexicons - lists of negative and positive words that are created up front and later used to describe the sentiment.&lt;br&gt;
Tokenization - breaking a text or a sentence into smaller pieces called tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Basic example of how a rule-based system works:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Two lists of polarized words are defined: negative words such as bad and ugly, and positive words such as best and beautiful.&lt;/p&gt;

&lt;p&gt;The text is then prepared, processed and formatted so that the machine can analyze it easily. Tokenization and lemmatization occur here.&lt;/p&gt;

&lt;p&gt;The computer then counts the words in the text that are classified as negative and those classified as positive.&lt;/p&gt;

&lt;p&gt;The overall sentiment score of the text is then calculated on a given scale, such as -100 to 100. If the number of positive words is higher than the number of negative words, the system returns a positive sentiment, and vice versa. If the counts are equal, the system returns a neutral sentiment.&lt;/p&gt;
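
&lt;p&gt;The steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the two word lists, the function names and the scoring formula are hypothetical choices made for this example, and lemmatization is left out to keep it short.&lt;/p&gt;

```python
import re

# Hypothetical mini-lexicons; a real system would use far larger word lists.
POSITIVE_WORDS = {"best", "beautiful", "good", "great"}
NEGATIVE_WORDS = {"bad", "ugly", "worst", "terrible"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Return a score from -100 to 100 based on counts of polarized words."""
    tokens = tokenize(text)
    positives = sum(1 for token in tokens if token in POSITIVE_WORDS)
    negatives = sum(1 for token in tokens if token in NEGATIVE_WORDS)
    total = positives + negatives
    if total == 0:
        # No polarized words at all: treat the text as neutral.
        return 0
    return round(100 * (positives - negatives) / total)


def classify(text):
    """Map the score to a positive, negative or neutral label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score == 0:
        return "neutral"
    return "negative"


print(classify("The best and most beautiful phone, but the battery is bad"))
```

&lt;p&gt;In this sketch the sentence contains two positive words and one negative word, so it is labelled positive even though it also contains a complaint.&lt;/p&gt;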

&lt;p&gt;&lt;strong&gt;Disadvantages of Rule-based sentiment analysis&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;It is limited because it considers individual words rather than whole sentences. Human language is complicated, so the real emotion can sometimes be missed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Automated or Machine Learning Sentiment Analysis&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Machine learning techniques are used. A model is trained on a given dataset to classify the sentiment based on the words in a text and their order. The quality of this approach depends on the quality of the training dataset.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Feature Extraction&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The text data is prepared here. Techniques such as tokenization, lemmatization, stopword removal and vectorization are used to make the text ready for classification by the model. Deep learning can be used to vectorize the text.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Training&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A sentiment-labelled training dataset is used to train the algorithm. The dataset is created manually or generated from reviews.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Predictions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;New text is fed into the trained model, which predicts a label for it. The text is thereby classified as positive, negative or neutral in sentiment. This eliminates the need for the pre-defined lexicon used in rule-based sentiment analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;N/B-&lt;/strong&gt; A hybrid of the rule-based and automated approaches can sometimes be used. Although hybrids are very complex, they can provide the best results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Building Sentiment Analysis Model&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Pre-trained models are publicly available on the Hub, which makes them the best place to get started. The available models use deep learning architectures such as transformers. For more accurate results, it is advisable to fine-tune the chosen model on your own data so that it better fits the case at hand.&lt;/p&gt;

</description>
      <category>database</category>
      <category>sentimentanalysis</category>
      <category>analytics</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Essential SQL Commands For Data Science</title>
      <dc:creator>Brendah Achieng</dc:creator>
      <pubDate>Wed, 15 Mar 2023 16:24:25 +0000</pubDate>
      <link>https://forem.com/archybrendah/essential-sql-commands-for-data-science-5o9</link>
      <guid>https://forem.com/archybrendah/essential-sql-commands-for-data-science-5o9</guid>
      <description>&lt;p&gt;Structured Query Language (SQL) is a simple, easy-to-write language used around the world to query and manipulate databases. Without data there is no data science, hence SQL is very important.&lt;br&gt;
In this post we will talk about some of the important SQL commands used in data science.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Retrieval&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Select Command&lt;/strong&gt;&lt;br&gt;
Together with other clauses, it is used to retrieve specific data from the database. The SELECT clause can specify one or more columns. To retrieve more than one column, separate the column names with a comma and a space. To get all the columns in a given table, use an asterisk (*).&lt;/p&gt;

&lt;p&gt;Syntax of Select Command:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select * from table_name&lt;/li&gt;
&lt;li&gt;Select column_name from table_name&lt;/li&gt;
&lt;li&gt;Select column_name1, column_name2 from table_name&lt;/li&gt;
&lt;li&gt;Select * from table_name where condition&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Distinct Command&lt;/strong&gt;&lt;br&gt;
It is used with the SELECT command to display only the distinct (unique) values from a table that contains duplicate data.&lt;/p&gt;

&lt;p&gt;Syntax of the distinct command:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select distinct column_name from table_name&lt;/li&gt;
&lt;li&gt;Select distinct column_name1, column_name2, ... from table_name&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Data Retrieval With Simple Conditions&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;where&lt;/strong&gt;&lt;br&gt;
This is used to display specific data that meets the given condition.&lt;/p&gt;

&lt;p&gt;Syntax of where statement:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select * from table_name where condition&lt;/li&gt;
&lt;li&gt;Select column_name1, column_name2, ... from table_name where condition&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;order by&lt;/strong&gt;&lt;br&gt;
Used to retrieve data from the database in a specific order, either ascending (the default) or descending.&lt;/p&gt;

&lt;p&gt;Syntax of order by statement:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select * from table_name where condition order by column_name&lt;/li&gt;
&lt;li&gt;Select * from table_name where condition order by column_name DESC&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;limit&lt;/strong&gt;&lt;br&gt;
Used to get a limited number of entries, e.g. the top 10 records.&lt;/p&gt;

&lt;p&gt;Syntax:&lt;br&gt;
SELECT * FROM table_name WHERE condition ORDER BY column_name DESC LIMIT 10&lt;/p&gt;
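
&lt;p&gt;To see these retrieval clauses working together, here is a small sketch that runs them from Python with the standard-library sqlite3 module. The employees table, its columns and its rows are made up for illustration, and SQLite is only one of many databases that accept this syntax.&lt;/p&gt;

```python
import sqlite3

# In-memory database with a hypothetical "employees" table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Amina", "Data", 5200),
        ("Brian", "Data", 4800),
        ("Carol", "Sales", 3900),
        ("David", "Sales", 4100),
    ],
)

# SELECT specific columns from every row.
names_and_salaries = conn.execute("SELECT name, salary FROM employees").fetchall()

# DISTINCT removes repeated values.
departments = conn.execute(
    "SELECT DISTINCT department FROM employees ORDER BY department"
).fetchall()

# WHERE filters rows, ORDER BY ... DESC sorts them, LIMIT caps the row count.
top_earners = conn.execute(
    "SELECT name FROM employees WHERE salary > 4000 ORDER BY salary DESC LIMIT 2"
).fetchall()

print(departments)
print(top_earners)
```

&lt;p&gt;Each query returns a list of tuples, one tuple per row.&lt;/p&gt;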

&lt;p&gt;&lt;strong&gt;Aggregations&lt;/strong&gt;&lt;br&gt;
An aggregate function performs a calculation over multiple values and returns a single value. Aggregate functions in SQL include avg, count, sum, min, max and many others, and they are often combined with the group by clause. NULL values are ignored during the calculations, except by count(*), which counts every row. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GROUP BY&lt;/strong&gt;&lt;br&gt;
The group by clause groups rows that share the same values so that aggregate functions return one result per group.&lt;/p&gt;

&lt;p&gt;Syntax of group by:&lt;br&gt;
Select column_list from table_name where condition group by expression&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;COUNT()&lt;/strong&gt;&lt;br&gt;
Used to count the number of all or distinct values in an expression. &lt;/p&gt;

&lt;p&gt;Syntax of count () function:&lt;br&gt;
SELECT count(column_name) FROM table_name&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SUM()&lt;/strong&gt;&lt;br&gt;
The sum() function returns the total of the values in a numeric column.&lt;/p&gt;

&lt;p&gt;Syntax of sum() function:&lt;br&gt;
SELECT sum(column_name) FROM table_name&lt;/p&gt;
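
&lt;p&gt;The following sketch, again using Python's built-in sqlite3 module, shows COUNT, SUM and GROUP BY on a made-up sales table. The table and its values are assumptions for illustration; one amount is deliberately left as NULL to show how it is treated.&lt;/p&gt;

```python
import sqlite3

# Hypothetical "sales" table; one amount is NULL on purpose.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100), ("East", 250), ("West", 300), ("West", None)],
)

# COUNT(*) counts every row; COUNT(amount) skips rows where amount is NULL.
row_count, amount_count = conn.execute(
    "SELECT COUNT(*), COUNT(amount) FROM sales"
).fetchone()

# SUM() adds the non-NULL values of a numeric column.
(total,) = conn.execute("SELECT SUM(amount) FROM sales").fetchone()

# GROUP BY produces one aggregated row per region.
per_region = conn.execute(
    "SELECT region, SUM(amount), COUNT(amount) FROM sales "
    "GROUP BY region ORDER BY region"
).fetchall()

print(row_count, amount_count, total)
print(per_region)
```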

&lt;p&gt;&lt;strong&gt;JOINS&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;SQL JOINS is used to combine data or rows from two or more tables based on a common field between them.&lt;br&gt;
There are 4 different types of SQL joins:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SQL INNER JOIN (SIMPLE JOIN OR JOIN)&lt;/strong&gt;&lt;br&gt;
Returns rows from multiple tables where the join condition is met, that is, only the data common to both tables (where they intersect).&lt;/p&gt;

&lt;p&gt;Syntax:&lt;br&gt;
SELECT table1.column1,table1.column2,table2.column1,....&lt;br&gt;
FROM table1 &lt;br&gt;
INNER JOIN table2&lt;br&gt;
ON table1.matching_column = table2.matching_column;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SQL LEFT OUTER JOIN (LEFT JOIN)&lt;/strong&gt;&lt;br&gt;
Returns all rows from the left-hand table (the one named before LEFT JOIN) and only those rows from the other table where the join condition is met. Where there is no match, the right-hand columns are NULL.&lt;/p&gt;

&lt;p&gt;Syntax:&lt;br&gt;
SELECT table1.column1,table1.column2,table2.column1,....&lt;br&gt;
FROM table1 &lt;br&gt;
LEFT JOIN table2&lt;br&gt;
ON table1.matching_column = table2.matching_column;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SQL RIGHT OUTER JOIN (RIGHT JOIN)&lt;/strong&gt;&lt;br&gt;
Returns all the rows of the table on the right side of the join and the matching rows from the table on the left side of the join.&lt;/p&gt;

&lt;p&gt;Syntax:&lt;br&gt;
SELECT table1.column1,table1.column2,table2.column1,....&lt;br&gt;
FROM table1 &lt;br&gt;
RIGHT JOIN table2&lt;br&gt;
ON table1.matching_column = table2.matching_column;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SQL FULL OUTER JOIN (FULL JOIN)&lt;/strong&gt;&lt;br&gt;
Returns all the rows from both tables. For rows that have no match in the other table, it returns NULL values in that table's columns.&lt;/p&gt;

&lt;p&gt;Syntax:&lt;br&gt;
SELECT table1.column1,table1.column2,table2.column1,....&lt;br&gt;
FROM table1 &lt;br&gt;
FULL JOIN table2&lt;br&gt;
ON table1.matching_column = table2.matching_column;&lt;/p&gt;
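
&lt;p&gt;As a rough illustration of how the join type changes the result, the sketch below compares INNER JOIN and LEFT JOIN on two made-up tables using Python's sqlite3 module. The customers and orders tables are hypothetical. RIGHT JOIN and FULL JOIN are left out because SQLite only supports them from version 3.39, which may be newer than the one bundled with your Python.&lt;/p&gt;

```python
import sqlite3

# Hypothetical tables: Carol is a customer with no orders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Amina"), (2, "Brian"), (3, "Carol")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "laptop"), (1, "mouse"), (2, "desk")],
)

# INNER JOIN keeps only customers that have at least one matching order.
inner_rows = conn.execute(
    "SELECT customers.name, orders.item FROM customers "
    "INNER JOIN orders ON customers.id = orders.customer_id "
    "ORDER BY customers.name, orders.item"
).fetchall()

# LEFT JOIN keeps every customer; missing orders show up as None (NULL).
left_rows = conn.execute(
    "SELECT customers.name, orders.item FROM customers "
    "LEFT JOIN orders ON customers.id = orders.customer_id "
    "ORDER BY customers.name, orders.item"
).fetchall()

print(inner_rows)
print(left_rows)
```

&lt;p&gt;The only difference between the two results is the extra row for Carol, whose item is None because she has no matching order.&lt;/p&gt;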

&lt;p&gt;&lt;strong&gt;UNION&lt;/strong&gt;&lt;br&gt;
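Combines the results of two queries into a single result set and removes duplicate rows. Both queries must return the same number of columns.&lt;/p&gt;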

&lt;p&gt;Syntax:&lt;br&gt;
SELECT column_name AS Name FROM table_name&lt;br&gt;
UNION&lt;br&gt;
SELECT column_name FROM table_name&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Complex Conditions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CASE Statement&lt;/strong&gt;&lt;br&gt;
This is how SQL handles if/then logic. The CASE keyword is followed by one or more WHEN ... THEN pairs and ends with END. The ELSE branch is optional.&lt;/p&gt;

&lt;p&gt;Syntax:&lt;br&gt;
SELECT CASE Expression&lt;br&gt;
When expression1 Then Result1&lt;br&gt;
When expression2 Then Result2&lt;br&gt;
...&lt;br&gt;
ELSE Result&lt;br&gt;
END&lt;/p&gt;
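
&lt;p&gt;Here is a small sketch of CASE in action, run through Python's sqlite3 module. The scores table and the grade boundaries are invented for this example. Note that it uses the searched form of CASE, where each WHEN holds a full condition, rather than the form shown above that compares one expression against several values.&lt;/p&gt;

```python
import sqlite3

# Hypothetical "scores" table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (student TEXT, mark INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Amina", 82), ("Brian", 55), ("Carol", 38)],
)

# The WHEN branches are checked in order; the first one that matches wins,
# and ELSE catches every row that matched none of them.
graded = conn.execute(
    "SELECT student, "
    "CASE WHEN mark >= 70 THEN 'distinction' "
    "WHEN mark >= 50 THEN 'pass' "
    "ELSE 'fail' END AS grade "
    "FROM scores ORDER BY student"
).fetchall()

print(graded)
```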

&lt;p&gt;&lt;strong&gt;Window Functions&lt;/strong&gt;&lt;br&gt;
Window functions apply aggregate functions and other functions over a particular set of rows related to the current row. The OVER clause defines the window. &lt;/p&gt;

&lt;p&gt;Syntax:&lt;br&gt;
SELECT column_name1, &lt;br&gt;
window_function(column_name2)&lt;br&gt;
OVER([PARTITION BY column_name1] [ORDER BY column_name3]) AS new_column&lt;br&gt;
FROM table_name;&lt;/p&gt;
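
&lt;p&gt;The sketch below shows one possible use of window functions through Python's sqlite3 module. The sales table is hypothetical, and the example assumes the bundled SQLite is version 3.25 or newer, which is when SQLite added window functions. Unlike GROUP BY, the window keeps every input row and adds the computed values beside it.&lt;/p&gt;

```python
import sqlite3

# Hypothetical "sales" table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, rep TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("East", "Amina", 100), ("East", "Brian", 250), ("West", "Carol", 300)],
)

# PARTITION BY splits the rows into one window per region.
# SUM() OVER gives each row its region total; RANK() orders reps within it.
rows = conn.execute(
    "SELECT region, rep, amount, "
    "SUM(amount) OVER (PARTITION BY region) AS region_total, "
    "RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank_in_region "
    "FROM sales ORDER BY region, rep"
).fetchall()

for row in rows:
    print(row)
```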

</description>
      <category>database</category>
      <category>datascience</category>
      <category>sql</category>
    </item>
    <item>
      <title>Introduction to SQL for Data Analysis</title>
      <dc:creator>Brendah Achieng</dc:creator>
      <pubDate>Fri, 17 Feb 2023 08:19:22 +0000</pubDate>
      <link>https://forem.com/archybrendah/introduction-to-sql-for-data-analysis-21d9</link>
      <guid>https://forem.com/archybrendah/introduction-to-sql-for-data-analysis-21d9</guid>
      <description>&lt;p&gt;Structured Query Language (SQL) is a standard language designed in the 1970s for storing, accessing and manipulating data in a relational database. As the name suggests, a relational database is composed of data organized in tables that relate to each other. The table rows and columns represent data characteristics and how the data values relate to each other.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why is SQL so important?&lt;/strong&gt;&lt;br&gt;
SQL is easy to learn, since it uses common English keywords such as "where" in its statements.&lt;br&gt;
SQL is one of the most widely used languages in the world. It is used in almost all types of applications because it integrates well with many programming languages.&lt;br&gt;
It is the standard language for database management systems used in both very large and small businesses. &lt;br&gt;
SQL is powerful, fast and efficient, and many of the database systems that implement it are secure, inexpensive and open source.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How SQL works&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When a query is run, it is processed by a query optimizer. Upon reaching the SQL server, the query is compiled in three stages:&lt;br&gt;
a) parsing - syntax checking&lt;br&gt;
b) binding - semantics checking &lt;br&gt;
c) optimization - query execution plan creation&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SQL Commands&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Definition Language:&lt;/strong&gt; creation, design and modification of the database structure and objects, e.g. the CREATE command &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Query Language:&lt;/strong&gt; retrieval of data from the database, e.g. the SELECT command &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Manipulation Language:&lt;/strong&gt; insertion of new records and modification of existing ones, e.g. the INSERT command &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Control Language:&lt;/strong&gt; access authorization for the database, e.g. the GRANT command, which allows a given user to access a particular section of the database.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Transaction Control Language:&lt;/strong&gt; management of transactions, that is, committing or undoing groups of changes, e.g. the ROLLBACK command &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SQL For Data Analysis&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Data analysis, data science, business intelligence, big data and related fields all manipulate and process large amounts of data using different methods to gain useful insights.&lt;br&gt;
As mentioned earlier, SQL is implemented by all kinds of database management systems: desktop (Access), open source (MySQL) and commercial (Oracle).&lt;br&gt;
Data analysts use SQL to process, manipulate and generally interact with data stored in relational databases.&lt;br&gt;
Businesses and organizations need data analysts to discover useful patterns and trends in their data.&lt;br&gt;
Data analysis therefore involves collecting and organizing data in order to extract useful information that can be used to make critical decisions.&lt;br&gt;
SQL offers great ability to manipulate large amounts of data. It can efficiently support complex models and analyses in a very short time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to use SQL for Data Analysis&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because SQL can communicate complex instructions to the database and manipulate data quickly, it can be used together with reporting tools to create useful dashboards that display data in many ways.&lt;br&gt;
Furthermore, SQL can be used to design and build data warehouses.&lt;br&gt;
SQL can be integrated with different data analytics frameworks and languages such as Python, R and Scala.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Learning SQL for data analysis&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;SQL is easy to learn and use. Sometimes just having an SQL cheat sheet is enough to get a data analyst going. However, to become a better data analyst, one needs to study SQL thoroughly and master all of its skills.&lt;/p&gt;

&lt;p&gt;Finally, data analysts analyze data, but before that they need to retrieve it from the database, and that is where SQL comes in. SQL is therefore a critical language in data analysis.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
      <category>datascience</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
