<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Dan</title>
    <description>The latest articles on Forem by Dan (@dwyp).</description>
    <link>https://forem.com/dwyp</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F180797%2F4dea4545-1582-42b0-a001-5d5ecb5e2f40.png</url>
      <title>Forem: Dan</title>
      <link>https://forem.com/dwyp</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/dwyp"/>
    <language>en</language>
    <item>
      <title>GPT-3 and Article Writing</title>
      <dc:creator>Dan</dc:creator>
      <pubDate>Tue, 15 Sep 2020 21:12:57 +0000</pubDate>
      <link>https://forem.com/leading-edje/gpt-3-and-article-writing-pbe</link>
      <guid>https://forem.com/leading-edje/gpt-3-and-article-writing-pbe</guid>
      <description>&lt;h4&gt;
  
  
  OpenAI’s GPT-3
&lt;/h4&gt;

&lt;p&gt;If you haven't read the news stories about it, a software engineering student at the University of California, Berkeley, set up a blog on Substack under the pen name Adolos. OpenAI currently makes GPT-3 available only to a limited group of developers, and Liam Porr was not one of them, but he was able to ask a Ph.D. student who did have access to run his prompts through GPT-3. &lt;/p&gt;

&lt;p&gt;Essentially, Porr supplied a headline and introduction for the post, and GPT-3 produced a full article. He picked the best of several results from the model and submitted them as blog posts with almost no editing. &lt;/p&gt;

&lt;p&gt;The first post, titled "Feeling unproductive? Maybe you should stop overthinking," reached the top spot on Hacker News with almost 200 upvotes and more than 70 comments. In a single week, the blog received 26,000 views and gained 60 subscribers. According to Porr, only a handful of people suggested that the blog might have been written by an AI. &lt;/p&gt;

&lt;p&gt;Porr ended the blog with a confession and some thoughts on how GPT-3 could change the future of writing.&lt;/p&gt;

&lt;h4&gt;
  
  
  The Guardian’s AI Article
&lt;/h4&gt;

&lt;p&gt;The Guardian followed up by publishing an article written by GPT-3, where the prompt was to explain why humans have nothing to fear from AI. It was also fed the following introduction: “I am not a human. I am Artificial Intelligence. Many people think I am a threat to humanity. Stephen Hawking has warned that AI could “spell the end of the human race.” I am here to convince you not to worry. Artificial Intelligence will not destroy humans. Believe me.” GPT-3 produced 8 different outputs, which the Guardian edited and spliced into a single essay. The Guardian stated that editing GPT-3’s output was no different from editing a human-written article.&lt;/p&gt;

&lt;h4&gt;
  
  
  What is GPT-3?
&lt;/h4&gt;

&lt;p&gt;Generative Pre-trained Transformer 3 (GPT-3) is a new language model created by OpenAI that is able to generate written text of such quality that it is often difficult to distinguish from text written by a human.&lt;/p&gt;

&lt;p&gt;GPT-3 is a deep neural network that uses the attention mechanism to predict the next word in a sequence. It is trained on a corpus of hundreds of billions of words. Unlike earlier encoder-decoder designs, GPT-3 is a decoder-only transformer: each input token is converted into a vector representation, passed through a stack of attention layers, and used to produce a probability distribution over all possible next tokens. GPT-3’s full version has a capacity of 175 billion machine learning parameters, over 10 times the previous largest language model, Microsoft’s Turing NLG.&lt;/p&gt;
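&lt;p&gt;The attention mechanism at the heart of this architecture can be sketched in a few lines of NumPy. This is an illustrative toy, not GPT-3’s actual implementation: the matrices here are random stand-ins for learned projections of the token vectors.&lt;/p&gt;

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: each position mixes information from
    # the other positions, weighted by query-key similarity.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 token positions, 8-dimensional queries
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = attention(Q, K, V)
print(out.shape)  # (4, 8): one mixed vector per position
```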

&lt;h4&gt;
  
  
  How it works
&lt;/h4&gt;

&lt;p&gt;The tech world is abuzz about GPT-3’s release. Huge language models (like GPT-3) are growing larger and larger and are starting to emulate human ability. While they are not yet reliable enough for many businesses to put in front of their customers, these models are showing sparks of cleverness that are guaranteed to accelerate the march of automation and the prospects of intelligent systems. Let’s dig into how GPT-3 is trained and how it works.&lt;/p&gt;

&lt;p&gt;A trained language model generates text. We can optionally pass it some text as input, which influences its output. The output is generated from what the model “learned” during its training period, when it scanned huge amounts of text.&lt;/p&gt;

&lt;p&gt;Training is accomplished by exposing the model to a lot of text. That training has already been completed; all the experiments you see now come from that one trained model. The training has been estimated to have taken 355 GPU-years and cost $4.6 million.&lt;/p&gt;

&lt;p&gt;A dataset of 300 billion tokens of text was used to generate training examples for the model. The model is presented an example: it is given the preceding words as features and asked to predict the following word.&lt;/p&gt;

&lt;p&gt;The model’s prediction will often be wrong. The error in its prediction is calculated and the model is updated so that next time it makes a better prediction. This is repeated over and over again.&lt;/p&gt;
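&lt;p&gt;The predict-compare-update loop above can be sketched with a toy next-token model. This is a hand-rolled illustration, not GPT-3’s training code: a single weight matrix plays the role of the model’s parameters, and the corpus is a made-up sentence.&lt;/p&gt;

```python
import numpy as np

# Toy corpus: learn to predict the next token from the current one.
tokens = "the cat sat on the mat the cat ate".split()
vocab = sorted(set(tokens))
ids = [vocab.index(t) for t in tokens]
V = len(vocab)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(V, V))  # parameters: row = current token, columns = next-token logits

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

lr = 0.5
for epoch in range(200):
    for cur, nxt in zip(ids[:-1], ids[1:]):
        probs = softmax(W[cur])   # predict a distribution over the next token
        grad = probs.copy()
        grad[nxt] -= 1.0          # error signal: prediction minus the observed token
        W[cur] -= lr * grad       # update the parameters toward a better prediction

pred = vocab[int(np.argmax(W[vocab.index("the")]))]
print(pred)  # the model's most likely token after "the"
```

In this corpus "the" is followed by "cat" twice and "mat" once, so repeated error-driven updates push the prediction toward "cat".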

&lt;p&gt;Now let’s follow up on these same steps with a closer look at the details.&lt;/p&gt;

&lt;p&gt;GPT-3 actually generates output one token at a time (let’s assume a token is a word for now). GPT-3 is very large: it encodes what it learns from training in 175 billion numbers (called parameters). These numbers are used to calculate which token to emit at each run. An untrained model starts with random parameters; training finds values that result in better predictions. These numbers are part of many matrices within the model, and prediction is mostly a lot of matrix multiplication.&lt;/p&gt;

&lt;p&gt;To shed light on how these parameters are distributed and used, we’ll have to open the model and peer inside. GPT-3 is 2048 tokens wide; that is its “context window”, meaning it has 2048 tracks along which tokens are processed. &lt;/p&gt;

&lt;p&gt;High-level steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Convert the word to a vector (list of numbers) representing the word&lt;/li&gt;
&lt;li&gt;Calculate the prediction&lt;/li&gt;
&lt;li&gt;Convert the resulting vector back to a word&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The necessary calculations of GPT-3 occur within its stack of 96 transformer decoder layers.&lt;/p&gt;
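&lt;p&gt;These high-level steps can be sketched in NumPy. Everything here is a stand-in: a random embedding table, a single layer in place of the 96-layer stack, and a random output projection, just to show the word-to-vector-to-word flow.&lt;/p&gt;

```python
import numpy as np

vocab = ["I", "like", "cats", "dogs"]
rng = np.random.default_rng(1)
d = 16
E = rng.normal(size=(len(vocab), d))   # step 1: word-to-vector embedding table
layer = rng.normal(size=(d, d))        # stand-in for the stack of decoder layers
U = rng.normal(size=(d, len(vocab)))   # projection back to vocabulary scores

x = E[vocab.index("like")]             # convert the word to a vector
h = np.tanh(x @ layer)                 # calculate the prediction (GPT-3 runs 96 such layers)
logits = h @ U                         # convert the resulting vector to word scores
print(vocab[int(np.argmax(logits))])   # highest-scoring next word
```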

&lt;p&gt;Each of those layers has its own 1.8 billion parameters to perform its calculations. That is where the “magic” happens. It’s spectacular that this works at all. Results should improve dramatically once fine-tuning is extended to GPT-3, and the odds are it will be even more impressive. Fine-tuning updates the model’s weights to make the model better at a particular task.&lt;/p&gt;

&lt;h4&gt;
  
  
  GPT-3 Key Takeaways
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;GPT-3 shows that language model performance scales as a power law of model size, dataset size, and the amount of computation.&lt;/li&gt;
&lt;li&gt;GPT-3 demonstrates that a language model trained on enough data can solve NLP tasks it has never encountered; that is, it positions the model as a general solution for many downstream jobs without fine-tuning.&lt;/li&gt;
&lt;li&gt;The cost of AI is increasing exponentially. Training GPT-3 would cost over $4.6M using a Tesla V100 cloud instance.&lt;/li&gt;
&lt;li&gt;The size of state-of-the-art (SOTA) language models is growing by at least a factor of 10 every year, outpacing the growth of GPU memory. For NLP, the days of "embarrassingly parallel" training are coming to an end; model parallelization will become indispensable.&lt;/li&gt;
&lt;li&gt;Although there is a clear performance gain from increasing model capacity, it is not clear what is really happening under the hood. In particular, it remains an open question whether the model has learned to do reasoning, or simply memorizes training examples in a more intelligent way.&lt;/li&gt;
&lt;/ul&gt;
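&lt;p&gt;The power-law takeaway can be made concrete. The exponent below is the model-size exponent reported in OpenAI’s scaling-law work; the absolute loss values are illustrative, not benchmarks.&lt;/p&gt;

```python
# Illustrative only: a power law means loss falls by a constant factor each
# time model size grows by a constant factor. alpha (0.076) and n_c (8.8e13)
# are the model-size constants from OpenAI's scaling-law paper.
def loss(n_params, alpha=0.076, n_c=8.8e13):
    return (n_c / n_params) ** alpha

for n in [1e9, 17e9, 175e9]:   # 1B, Turing NLG-scale, GPT-3-scale
    print(f"{n:.0e} params: loss {loss(n):.3f}")
```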

&lt;p&gt;&lt;a href="https://dev.to/leading-edje"&gt;&lt;br&gt;
  &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F5uo60qforg9yqdpgzncq.png" alt="Smart EDJE Image"&gt;&lt;br&gt;
&lt;/a&gt;&lt;a&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
    <item>
      <title>Machine Learning and Wine Quality: Finding a good wine using multiple classifications</title>
      <dc:creator>Dan</dc:creator>
      <pubDate>Tue, 15 Sep 2020 02:28:51 +0000</pubDate>
      <link>https://forem.com/leading-edje/machine-learning-and-wine-quality-finding-a-good-wine-using-multiple-classifications-4kho</link>
      <guid>https://forem.com/leading-edje/machine-learning-and-wine-quality-finding-a-good-wine-using-multiple-classifications-4kho</guid>
      <description>&lt;h2&gt;
  
  
  Machine Learning and Wine Quality: Finding a good wine using multiple classifications
&lt;/h2&gt;

&lt;h4&gt;
  
  
  Wine Tasting
&lt;/h4&gt;

&lt;p&gt;Wine tasting is an esoteric process with many ceremonies and customs. Everything from the shape of the glass to the temperature of the wine can affect how a wine is rated. Wine experts examine color, viscosity, smell, taste and secondary aromas. While machines could examine wines in a similar fashion it would be extremely expensive and difficult. A more feasible option is to use gas spectrum analysis along with pH and other chemical indicators to break a wine down into 11 variables. Using these variables along with reviews we can create a model that will predict which of these variables are most important in determining a “good” wine.&lt;/p&gt;

&lt;p&gt;This project will use Kaggle’s &lt;a href="https://www.kaggle.com/uciml/red-wine-quality-cortez-et-al-2009" rel="noopener noreferrer"&gt;Red Wine Quality&lt;/a&gt; dataset to create multiple classification models in an effort to predict if a red wine is “good” or not. The wines in the dataset already have been reviewed and rated from 0 to 10. The following 11 variables were also made available:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fixed acidity&lt;/li&gt;
&lt;li&gt;Volatile acidity&lt;/li&gt;
&lt;li&gt;Citric acid&lt;/li&gt;
&lt;li&gt;Residual sugar&lt;/li&gt;
&lt;li&gt;Chlorides&lt;/li&gt;
&lt;li&gt;Free sulfur dioxide&lt;/li&gt;
&lt;li&gt;Total sulfur dioxide&lt;/li&gt;
&lt;li&gt;Density&lt;/li&gt;
&lt;li&gt;pH&lt;/li&gt;
&lt;li&gt;Sulfates&lt;/li&gt;
&lt;li&gt;Alcohol&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We are going to experiment with several classification models to see which one can return the highest accuracy with this dataset. In doing so we will also get a good idea of which variables are most important in determining a “good” wine.&lt;/p&gt;
&lt;h4&gt;
  
  
  Setup
&lt;/h4&gt;

&lt;p&gt;Import the dataset and the libraries that we will use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read the data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df = pd.read_csv("../input/red-wine-quality-cortez-et-al-2009/winequality-red.csv")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Examine the data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print("Rows, columns: " + str(df.shape))
df.head()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You will see that there are a total of 1599 rows and 12 columns. There appear to be no issues with the data in the first five rows. Let’s check for missing values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(df.isna().sum())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Kaggle has provided a nice clean dataset with no missing values.&lt;/p&gt;

&lt;h4&gt;
  
  
  Visualizing the Variables
&lt;/h4&gt;

&lt;h5&gt;
  
  
  Histogram of the quality variable
&lt;/h5&gt;

&lt;p&gt;To ensure that the quality variable has enough variance and quantity we create a histogram:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;fig = px.histogram(df,x='quality')
fig.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h5&gt;
  
  
  Variable Correlations
&lt;/h5&gt;

&lt;p&gt;In order to visualize the correlations between the variables we will create a correlation matrix. This will enable us to understand the different relationships between the variables and even determine which variables are correlated to good quality wines.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;corr = df.corr()
matplotlib.pyplot.subplots(figsize=(15,10))
sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, annot=True, cmap=sns.diverging_palette(220, 20, as_cmap=True))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Convert to a Classification Problem
&lt;/h4&gt;

&lt;p&gt;Going back to the objective of predicting wine quality, we needed the output variable to be a binary output.&lt;/p&gt;

&lt;p&gt;For this problem, I defined a bottle of wine as ‘good quality’ if it had a quality score of 8 or higher, and if it had a score of less than 8, it was deemed ‘bad quality’.&lt;/p&gt;

&lt;p&gt;Once I converted the output variable to a binary output, I separated my feature variables (X) and the target variable (y) into separate dataframes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Create Classification version of target variable
df['goodquality'] = [1 if x &amp;gt;= 8 else 0 for x in df['quality']]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Separate feature variables and target variable
X = df.drop(['quality','goodquality'], axis = 1)
y = df['goodquality']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Proportion of Good vs Bad Wines
&lt;/h4&gt;

&lt;p&gt;I wanted to make sure that there was a reasonable number of good quality wines. Based on the results below, it seemed like a fair enough number. In some applications, resampling may be required if the data was extremely imbalanced, but I assumed that it was okay for this purpose.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# See proportion of good vs bad wines
df['goodquality'].value_counts()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Preparing Data for Modeling
&lt;/h4&gt;

&lt;h5&gt;
  
  
  Normalizing Feature Variables
&lt;/h5&gt;

&lt;p&gt;Now I was ready to prepare the data for modeling. The first thing I did was normalize the data. Normalizing transforms the data so that its distribution has a mean of 0 and a standard deviation of 1. It’s important to normalize your data in order to balance the ranges of the features. &lt;/p&gt;

&lt;p&gt;For instance, imagine a dataset with two input features: height in millimeters and weight in pounds. Since the values of height are much larger because of its unit of measurement, a greater emphasis will automatically be placed on height than on weight, creating a bias.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Normalize feature variables
from sklearn.preprocessing import StandardScaler
X_features = X
X = StandardScaler().fit_transform(X)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h5&gt;
  
  
  Splitting the Data
&lt;/h5&gt;

&lt;p&gt;Next I split the data into a training and test set so that I could cross-validate my models and determine their effectiveness.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Splitting the data
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.25, random_state=0)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h5&gt;
  
  
  Modeling
&lt;/h5&gt;

&lt;p&gt;For this project, we will compare five different machine learning models: decision trees, random forests, AdaBoost, Gradient Boost, and XGBoost. For the purpose of this project, I wanted to compare these models by their accuracy.&lt;/p&gt;

&lt;p&gt;Model 1: Decision Tree&lt;br&gt;
Decision trees are a popular model, used in operations research, strategic planning, and machine learning. Each split point in the tree is called a node, and the more nodes you have, the more accurate your decision tree will generally be. The final nodes of the decision tree, where a decision is made, are called the leaves of the tree. Decision trees are intuitive and easy to build but fall short when it comes to accuracy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.metrics import classification_report
from sklearn.tree import DecisionTreeClassifier
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model1 = DecisionTreeClassifier(random_state=1)
model1.fit(X_train, y_train)
y_pred1 = model1.predict(X_test)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(classification_report(y_test, y_pred1))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Model 2: Random Forest&lt;br&gt;
Random forests are an ensemble learning technique that builds off of decision trees. Random forests involve creating multiple decision trees using bootstrapped datasets of the original data and randomly selecting a subset of variables at each step of the decision tree. The model then selects the mode of all of the predictions of each decision tree. What’s the point of this? By relying on a “majority wins” model, it reduces the risk of error from an individual tree.&lt;/p&gt;

&lt;p&gt;For example, if one individual decision tree predicted 0 while the majority of the trees predicted 1, relying on the mode means the forest would predict 1. This is the power of random forests.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.ensemble import RandomForestClassifier
model2 = RandomForestClassifier(random_state=1)
model2.fit(X_train, y_train)
y_pred2 = model2.predict(X_test)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(classification_report(y_test, y_pred2))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Model 3: AdaBoost&lt;br&gt;
The next three models are boosting algorithms that take weak learners and turn them into strong ones. I don’t want to get sidetracked and explain the differences between the three because it’s quite complicated and intricate. That being said, I’ll leave some resources where you can learn about AdaBoost, Gradient Boosting, and XGBoosting.&lt;/p&gt;

&lt;p&gt;StatQuest: AdaBoost&lt;br&gt;
StatQuest: Gradient Boost&lt;br&gt;
StatQuest: XGBoost&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.ensemble import AdaBoostClassifier
model3 = AdaBoostClassifier(random_state=1)
model3.fit(X_train, y_train)
y_pred3 = model3.predict(X_test)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(classification_report(y_test, y_pred3))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Model 4: Gradient Boosting&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.ensemble import GradientBoostingClassifier
model4 = GradientBoostingClassifier(random_state=1)
model4.fit(X_train, y_train)
y_pred4 = model4.predict(X_test)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(classification_report(y_test, y_pred4))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Model 5: XGBoost&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import xgboost as xgb
model5 = xgb.XGBClassifier(random_state=1)
model5.fit(X_train, y_train)
y_pred5 = model5.predict(X_test)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(classification_report(y_test, y_pred5))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By comparing the five models, the random forest and XGBoost seem to yield the highest levels of accuracy. However, since XGBoost has a better f1-score for predicting good quality wines (1), XGBoost appears to be the better model.&lt;/p&gt;
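&lt;p&gt;The comparison above can be condensed into a loop. This sketch uses synthetic imbalanced data and only the scikit-learn models, so the numbers it prints are not the wine results, but it shows how accuracy and f1-score are gathered side by side.&lt;/p&gt;

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the wine data: class 1 ("good") is the rare class.
X, y = make_classification(n_samples=400, weights=[0.9], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

results = {}
for name, model in [("RandomForest", RandomForestClassifier(random_state=1)),
                    ("GradientBoost", GradientBoostingClassifier(random_state=1))]:
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    # Accuracy alone can mislead on imbalanced data, so track f1 for class 1 too.
    results[name] = (accuracy_score(y_te, pred), f1_score(y_te, pred))

for name, (acc, f1) in results.items():
    print(f"{name}: accuracy {acc:.3f}, f1(good) {f1:.3f}")
```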

&lt;h5&gt;
  
  
  Feature Importance
&lt;/h5&gt;

&lt;p&gt;Below, the feature importances are graphed for the Random Forest model and the XGBoost model. While they vary slightly, the top 3 features are the same: alcohol, volatile acidity, and sulphates. Below the graphs, the dataset is split into good quality and bad quality wines to compare these variables in more detail.&lt;/p&gt;

&lt;p&gt;Random Forest&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;feat_importances = pd.Series(model2.feature_importances_, index=X_features.columns)
feat_importances.nlargest(25).plot(kind='barh',figsize=(10,10))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;XGBoost&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;feat_importances = pd.Series(model5.feature_importances_, index=X_features.columns)
feat_importances.nlargest(25).plot(kind='barh',figsize=(10,10))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h5&gt;
  
  
  Comparing the Top 4 Features
&lt;/h5&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; # Filtering df for only good quality
df_temp = df[df['goodquality']==1]
df_temp.describe() 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Filtering df for only bad quality
df_temp2 = df[df['goodquality']==0]
df_temp2.describe()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By looking into the details, we can see that good quality wines have higher levels of alcohol on average, have a lower volatile acidity on average, higher levels of sulphates on average, and higher levels of residual sugar on average.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/leading-edje"&gt;&lt;br&gt;
  &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F5uo60qforg9yqdpgzncq.png" alt="Smart EDJE Image"&gt;&lt;br&gt;
&lt;/a&gt;&lt;a&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
    <item>
      <title>Microsoft Azure Machine Learning: Automated Machine Learning and Machine Learning Ops </title>
      <dc:creator>Dan</dc:creator>
      <pubDate>Tue, 15 Sep 2020 01:49:06 +0000</pubDate>
      <link>https://forem.com/leading-edje/microsoft-azure-machine-learning-automated-machine-learning-and-machine-learning-ops-16lb</link>
      <guid>https://forem.com/leading-edje/microsoft-azure-machine-learning-automated-machine-learning-and-machine-learning-ops-16lb</guid>
      <description>&lt;h4&gt;
  
  
  What is machine learning?
&lt;/h4&gt;

&lt;p&gt;Machine learning is a data science technique that allows computers to use existing data to forecast future behaviors, outcomes, and trends. By using machine learning, computers learn without being explicitly programmed.&lt;/p&gt;

&lt;p&gt;Forecasts or predictions from machine learning can make apps and devices smarter. For example, when you shop online, machine learning helps recommend other products you might want based on what you've bought. Or when your credit card is swiped, machine learning compares the transaction to a database of transactions and helps detect fraud. And when your robot vacuum cleaner vacuums a room, machine learning helps it decide whether the job is done.&lt;/p&gt;

&lt;h4&gt;
  
  
  Machine learning tools to fit each task
&lt;/h4&gt;

&lt;p&gt;Azure Machine Learning provides many tools developers and data scientists need for their machine learning workflows, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The Azure Machine Learning designer: drag-n-drop modules to build your experiments and then deploy pipelines.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Jupyter notebooks: use Microsoft’s example notebooks or create your own notebooks to leverage our SDK for Python samples for your machine learning.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;R scripts or notebooks in which you use the SDK for R to write your own code, or use the R modules in the designer.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Many Models Solution Accelerator builds on Azure Machine Learning and enables you to train, operate, and manage hundreds or even thousands of machine learning models.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Visual Studio Code extension&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Machine learning CLI&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Open-source frameworks such as PyTorch, TensorFlow, and scikit-learn and many more&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Reinforcement learning with Ray RLlib&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can even use MLflow to track metrics and deploy models or Kubeflow to build end-to-end workflow pipelines.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Azure Machine Learning
&lt;/h4&gt;

&lt;p&gt;Azure Machine Learning is a group of cloud services, all dealing with machine learning, packaged in the form of a software development kit (SDK). It is designed for:&lt;/p&gt;

&lt;p&gt;Data scientists who build, train and deploy machine learning models at scale&lt;br&gt;
ML engineers who manage, track and automate the machine learning pipelines&lt;br&gt;
Azure Machine Learning comprises the following components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An SDK that plugs into any Python-based IDE, notebook or CLI&lt;/li&gt;
&lt;li&gt;A compute environment that offers both scale up and scale out capabilities with the flexibility of auto-scaling and the agility of CPU or GPU based infrastructure for training&lt;/li&gt;
&lt;li&gt;A centralized model registry to help keep track of models and experiments, irrespective of where and how they are created&lt;/li&gt;
&lt;li&gt;Managed container service integrations with Azure Container Instance, Azure Kubernetes Service and Azure IoT Hub for containerized deployment of models to the cloud and the IoT edge&lt;/li&gt;
&lt;li&gt;A monitoring service that helps track metrics from models that are registered and deployed via Machine Learning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Azure Machine Learning can handle workloads of any scale and complexity.&lt;/p&gt;

&lt;h4&gt;
  
  
  Automated Machine Learning
&lt;/h4&gt;

&lt;p&gt;A lot of the time spent in machine learning goes to data scientists iterating over models during the experimentation phase. Testing different algorithms and parameter combinations can take a lot of time and effort while providing little mental stimulation or actual challenge. Automated machine learning addresses this with an automated pipeline that tries intelligently selected algorithms and parameters based on the makeup of the data being processed. Such a pipeline can reduce the workload and time spent immensely.&lt;/p&gt;

&lt;p&gt;Azure Automated ML supports classification, regression and forecasting, and provides features like handling missing values, early termination, and blacklisting algorithms to reduce the time and resources spent. Automated ML has a newly introduced UI mode that improves usability for non-professional data scientists and beginners. The wizard-like UI now allows them to be valuable contributors on data science teams. By allowing the team to expand beyond highly specialized data scientists, companies can increase the scale at which machine learning benefits them while still using highly qualified people where needed.&lt;/p&gt;

&lt;h4&gt;
  
  
  MLOps: Deploy &amp;amp; lifecycle management
&lt;/h4&gt;

&lt;p&gt;Creating a model is just the beginning of the Machine Learning pipeline. Using the model in production requires the models to be packaged and deployed, tracked and monitored. Metrics must also be collected to allow retraining or gathering of insights. When you have the right model, you can easily use it in a web service, on an IoT device, or from Power BI. Then you can manage your deployed models by using the Azure Machine Learning SDK for Python, Azure Machine Learning studio, or the machine learning CLI.&lt;/p&gt;

&lt;p&gt;These models can be consumed and return predictions in real time or asynchronously on large quantities of data.&lt;/p&gt;

&lt;p&gt;And with advanced machine learning pipelines, you can collaborate on each step from data preparation, model training and evaluation, through deployment. Pipelines allow you to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automate the end-to-end machine learning process in the cloud&lt;/li&gt;
&lt;li&gt;Reuse components and only rerun steps when needed&lt;/li&gt;
&lt;li&gt;Use different compute resources in each step&lt;/li&gt;
&lt;li&gt;Run batch scoring tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to use scripts to automate your machine learning workflow, the machine learning CLI provides command-line tools that perform common tasks, such as submitting a training run or deploying a model.&lt;/p&gt;

&lt;p&gt;Azure Machine Learning provides data scientists an easy way to package their models with simple commands that also keep track of dependencies. Generated code can be version-controlled using GitHub. The models themselves can be stored in a central model repository that also holds model metrics and allows one-click deployment.&lt;/p&gt;

&lt;p&gt;Once a model has been packaged and registered, testing is needed. Azure Container Instances allow easy containerization and are built into Azure Machine Learning. Once the container is deployed, testing can be performed.&lt;/p&gt;

&lt;p&gt;Production environments are synonymous with scale, flexibility and tight monitoring capabilities. This is where Azure Kubernetes Services (AKS) can be very useful for container deployments. It provides scale-out capabilities as it’s a cluster and can be sized to cater to the business’ needs.&lt;/p&gt;

&lt;p&gt;Once your model is deployed, you want to collect metrics on it. You want to ascertain whether the model is drifting from its objective and whether its inferences remain useful to the business. This means capturing a lot of metrics and analyzing them.&lt;/p&gt;
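&lt;p&gt;As a rough sketch of one such drift signal (illustrative only; the function and threshold below are invented, and production monitoring such as Azure ML dataset monitors uses richer statistical tests), you can compare recent prediction statistics against those seen at training time:&lt;/p&gt;

```python
# Flag drift when the mean of live predictions shifts far from the
# mean observed at training time, scaled by the training-time spread.
def mean(xs):
    return sum(xs) / len(xs)

def drift_score(baseline, recent):
    """Absolute shift in the mean, in units of the baseline's std dev."""
    base_mean = mean(baseline)
    spread = mean([(x - base_mean) ** 2 for x in baseline]) ** 0.5
    return abs(mean(recent) - base_mean) / spread if spread else 0.0

training_scores = [0.50, 0.52, 0.48, 0.51, 0.49]  # seen during training
live_scores     = [0.70, 0.72, 0.69, 0.71, 0.68]  # seen in production

score = drift_score(training_scores, live_scores)
alert = score > 3.0  # arbitrary threshold: consider retraining if exceeded
```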

&lt;p&gt;As you collect more metrics and additional data becomes available for training, there may be a need to retrain the model in the hope of improving its accuracy and/or performance. &lt;/p&gt;

&lt;p&gt;Also, since this is a process of continuous integration and deployment (CI/CD), it needs to be automated. This process of retraining and effective CI/CD of ML models is the biggest strength of Azure Machine Learning.&lt;/p&gt;

&lt;p&gt;Azure Machine Learning is integrated with Azure DevOps, so you can create MLOps pipelines inside the DevOps environment. &lt;/p&gt;
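&lt;p&gt;As a sketch of what such a pipeline might look like, here is a hypothetical azure-pipelines.yml (the train.py, evaluate.py, and deploy.py scripts are invented placeholders, not part of any real project):&lt;/p&gt;

```yaml
# Hypothetical azure-pipelines.yml: retrain and redeploy on each push
# to main. Script names are placeholders for illustration only.
trigger:
  branches:
    include: [main]

pool:
  vmImage: ubuntu-latest

steps:
  - script: pip install -r requirements.txt
    displayName: Install dependencies
  - script: python train.py --output model.pkl
    displayName: Retrain model
  - script: python evaluate.py --model model.pkl
    displayName: Gate on evaluation metrics
  - script: python deploy.py --model model.pkl
    displayName: Deploy if metrics pass
```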

&lt;h4&gt;
  
  
  Start with Azure Machine Learning now!
&lt;/h4&gt;

&lt;p&gt;Follow this &lt;a href="https://docs.microsoft.com/en-us/azure/machine-learning/tutorial-1st-experiment-sdk-setup"&gt;guide&lt;/a&gt; to get started on the exciting journey of using Azure Machine Learning!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/leading-edje"&gt;&lt;br&gt;
  &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SfUhPiEd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/5uo60qforg9yqdpgzncq.png" alt="Smart EDJE Image"&gt;&lt;br&gt;
&lt;/a&gt;&lt;/p&gt;

</description>
      <category>azure</category>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
    <item>
      <title>An Enterprise AI Platform</title>
      <dc:creator>Dan</dc:creator>
      <pubDate>Fri, 12 Jun 2020 01:56:41 +0000</pubDate>
      <link>https://forem.com/leading-edje/a-5-layer-enterprise-ai-platform-1ab2</link>
      <guid>https://forem.com/leading-edje/a-5-layer-enterprise-ai-platform-1ab2</guid>
      <description>&lt;p&gt;Today, Artificial Intelligence (AI) is more focused on performing a single task very smartly rather than providing a comprehensive solution covering many areas that require intelligence.&lt;/p&gt;

&lt;p&gt;AI has been around for decades, so why the big push for AI today? Three main factors have created an explosion in the AI industry.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Big Data. There is more data than ever, which allows machine learning to improve and provide more relevant insights.&lt;/li&gt;
&lt;li&gt;Reduced processing costs. The cost to set up infrastructure and build a specialized team used to be extremely high: AI required huge budgets and investment in custom, made-to-order algorithms. The rise of ubiquitous computing, low-cost cloud services, inexpensive storage, and new algorithms has changed all that. Cloud computing and advances in Graphics Processing Units (GPUs) have provided the necessary computational power, while AI algorithms and architectures have progressed rapidly, often enabled by open source software. In fact, today there are many open source options and cloud solutions from Google, Amazon, IBM and more to address infrastructure costs.&lt;/li&gt;
&lt;li&gt;Breakthroughs in deep learning technology. A subset of machine learning, deep learning has structures loosely inspired by the neural connections in the human brain. Most of the big deep learning breakthroughs happened after 2010, but deep learning (neural nets) has already demonstrated the ability to solve highly complex problems that are well beyond the capabilities of a human programmer using if-then statements and decision trees.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So how can an organization adopt AI? Here are some best practices for adopting Artificial Intelligence in your organization.&lt;/p&gt;

&lt;p&gt;Setting Executive Expectations About AI Adoption: AI is Not Like Regular Software&lt;br&gt;
Business leaders should understand that they need to fundamentally change the way they think about these projects compared to conventional software automation. Implementing a software automation project that doesn’t include any AI is usually far simpler and takes significantly less time.&lt;/p&gt;

&lt;p&gt;AI initiatives are time- and resource-intensive. They generally require huge amounts of very specific kinds of data that the business may or may not have.&lt;/p&gt;

&lt;p&gt;They require time from both data scientists and the subject-matter experts who inform their work; a business must be willing to take subject-matter experts away from the routine work that earns the business money so they can collaborate with data scientists to build an AI model. Despite this investment, the ROI of any given AI project may be negligible in the short term, if there is one at all.&lt;/p&gt;

&lt;p&gt;Sector-Specific AI Understanding&lt;br&gt;
Businesses also need to understand conceptually what sorts of problems AI can solve. There are many approaches to artificial intelligence and machine learning, including natural language processing, computer vision, and anomaly detection. All of these have their own specific use cases in business.&lt;/p&gt;

&lt;p&gt;Once business leaders understand what’s doable with AI, they can identify business problems at their organization that AI might solve. There are plenty of AI applications on the market. It’s vital to choose applications that attempt to solve business problems with suitable technology and provide value in measurable ways. Finally, businesses should have adequate and relevant data for training machine learning algorithms to produce correct outputs.&lt;/p&gt;

&lt;p&gt;Clarity on Goals&lt;br&gt;
Setting a long-term AI objective is critical for success. At the same time, business leaders need to understand that any "rethinking" of business workflows through AI is a huge undertaking with varying rates of progress.&lt;/p&gt;

&lt;p&gt;Organizations should concentrate on gaining AI capabilities first and then use those capabilities as a guide for defining long-term objectives. If the objective is upskilling the team’s understanding of AI, organizations should begin with small projects and pick an area where a traditional software solution already exists. Pick something for which you already have a reasonable existing baseline, so that you can at least compare the effects of the AI model against it and know whether you are heading in the right direction.&lt;/p&gt;

&lt;p&gt;While achieving these long-term objectives isn’t a short-term process, the best way to move toward them is to begin with small AI projects that are aligned with the kind of long-term AI capabilities the business needs to gain.&lt;/p&gt;

&lt;p&gt;Ways to Organize AI-Compatible Teams&lt;br&gt;
Once organizations have understood what AI can do and have aligned those capabilities with their business objectives, the next step is to assemble data scientists and subject-matter experts into multidisciplinary teams. SMEs are employees with a deep understanding of business processes in a particular function or division.&lt;/p&gt;

&lt;p&gt;As a rule, assembling such a team is more costly than traditional software projects. The budget allocation for such efforts usually has to be approved by the COO or someone else in the C-suite. Organizing such a team of data scientists may involve the following steps:&lt;/p&gt;

&lt;p&gt;Ensuring that the data scientists working on the solution are clearly aware of the business problem AI is being applied to. This gives them context on how much and what kind of data they need, as well as which other team members’ skills may be required for the project.&lt;/p&gt;

&lt;p&gt;Subject-matter experts need to identify the business problems that should be solved. From there, data scientists are better positioned to determine whether AI can address a specific business problem.&lt;/p&gt;

&lt;p&gt;Artificial intelligence projects are not a one-time investment. As organizations generate new data, the algorithms need to be adjusted to incorporate the additional data while still maintaining accurate results. Maintaining and refreshing AI systems is necessary, and business leaders need to assemble teams that can accomplish this even once the project is largely built and deployed. Again, this process doesn’t involve just data scientists. Just as with the development of AI systems, maintaining and calibrating these systems to improve their accuracy also requires input from subject-matter experts and other team members.&lt;/p&gt;

&lt;p&gt;Data and Data Infrastructure Considerations&lt;br&gt;
While we have covered most of the essential human components needed to adopt AI effectively, none of these steps are beneficial unless they are built around a data-centered strategy. Data is what makes AI projects run, and it must be cleaned, parsed, and tested before it can be used.&lt;/p&gt;

&lt;p&gt;Rethinking how a business collects, stores, and manages data is a decision that should be made after gaining a certain level of data competency.&lt;/p&gt;

&lt;p&gt;Once the data being tested in a pilot proves measurably significant as a proof of concept, organizations can consider the stage where the entire data infrastructure is overhauled. Below are a few pointers on what business leaders can expect when it comes to data and data-infrastructure management in AI projects:&lt;/p&gt;

&lt;p&gt;Organizations will find that accessing data is usually harder than anticipated. Data may be stored in several different formats, or in different geographic regions with different data-transfer regulations.&lt;/p&gt;

&lt;p&gt;Even the data that is available is usually not in a format that makes it easy to use. The data will often require heavy restructuring and preparation to format it and, in some cases, to clean it.&lt;/p&gt;

&lt;p&gt;The storage hardware for this data may also need upgrading. What’s more, organizations may need to reconsider how they currently collect data and what new infrastructure may be required to implement AI economically.&lt;/p&gt;

&lt;p&gt;Picking an Initial AI Project&lt;br&gt;
Start Small, But With A Long-Term View of AI Skills&lt;br&gt;
AI projects involve many individual steps that may take days or weeks to complete. Even so, there are many benefits to beginning small:&lt;/p&gt;

&lt;p&gt;Small projects help businesses focus on building skills instead of looking for outright returns right away. AI projects are technically difficult and need large amounts of initial capital to deploy. They can take two to six months to build, and even then there may not be a successful result in some cases. Beginning with small pilots lets businesses learn which AI skills appear to be working and which aren’t valuable.&lt;/p&gt;

&lt;p&gt;Small projects may not require total data infrastructure overhauls in order to be tested and deployed successfully. For example, deploying a chatbot may not require a business to overhaul its entire data infrastructure, yet it still provides a point of entry into AI. Businesses can keep small AI projects contained in terms of internal data flows, thereby not disrupting existing processes.&lt;/p&gt;

&lt;p&gt;Small projects help build confidence in data management capabilities.&lt;br&gt;
Gaining confidence in working with data supports the development of future AI projects, and building data capability as a core competency allows businesses to stay ahead of the competition.&lt;/p&gt;

&lt;p&gt;Basic Tips for Artificial Intelligence&lt;br&gt;
Understand your data and business&lt;br&gt;
This may sound like common sense, but it’s worth stating anyway, as skipping these steps can have critical consequences for the project.&lt;br&gt;
Exploratory data analysis helps assess data quality and define reasonable expectations for the project’s goals. Moreover, close cooperation with subject-matter experts provides domain insights, which are key to a full understanding of the problem.&lt;br&gt;
This should lead to metrics that help track project progress not only from a machine learning perspective but also against business factors.&lt;/p&gt;

&lt;p&gt;Stand on the shoulders of giants&lt;br&gt;
It’s highly likely that somebody has already faced an issue just like yours and published a solution.&lt;br&gt;
A literature review, blogs, and evaluating available open-source code can help you set an initial direction and shortlist possible approaches for building the product.&lt;/p&gt;

&lt;p&gt;Don’t believe everything stated in the papers&lt;br&gt;
On the other hand, many papers are written to prove a specific model’s superiority over alternatives and don’t address the limitations and downsides of the method. Therefore, it’s good practice to approach each article with a dose of skepticism and common sense.&lt;/p&gt;

&lt;p&gt;Start with a simple approach&lt;br&gt;
Running a simple approach may give you more insight into the problem than a more complicated one, as simple methods and their results are easier to interpret. Moreover, implementing, training, and evaluating a simple model is far less time-consuming than a complicated one.&lt;/p&gt;

&lt;p&gt;There is a trade-off between model interpretability and flexibility: a more flexible model can handle harder tasks, but its results are harder to interpret. Deep learning sits at the high-flexibility, low-interpretability extreme of that spectrum.&lt;/p&gt;

&lt;p&gt;Define your baseline&lt;br&gt;
How do you know that your state-of-the-art billion-parameter model does better than a naive solution? Since sophisticated methods do not always outperform simpler approaches, it’s good practice to keep a simple baseline that helps track the gain offered by complex strategies. Sometimes the benefit is minimal, and a simple method may be preferable for a given task for reasons like inference speed or deployment cost.&lt;/p&gt;
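&lt;p&gt;The baseline check can be as small as this sketch, which pits a majority-class guesser against made-up predictions standing in for a complex model:&lt;/p&gt;

```python
# Compare a trivial majority-class baseline against a "complex" model.
from collections import Counter

def accuracy(preds, labels):
    hits = sum(1 for p, y in zip(preds, labels) if p == y)
    return hits / len(labels)

labels      = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]  # mostly class 0
model_preds = [0, 0, 1, 1, 0, 0, 1, 0, 0, 1]  # stand-in for a complex model

# Baseline: always predict the most common class in the labels.
majority = Counter(labels).most_common(1)[0][0]
baseline_preds = [majority] * len(labels)

baseline_acc = accuracy(baseline_preds, labels)  # 0.8 just by guessing 0
model_acc    = accuracy(model_preds, labels)
gain = model_acc - baseline_acc
```

&lt;p&gt;In this toy case the complex model only ties the baseline, which is precisely the situation where the simpler method should win on inference speed and deployment cost.&lt;/p&gt;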

&lt;p&gt;Plan and track your experiments&lt;br&gt;
Numerous variables can influence the performance of AI algorithms. This is especially true for deep learning models, where one can experiment with model architectures, cost functions, and hyperparameters. Tracking the trials therefore becomes challenging, especially when many people work together.&lt;br&gt;
The solution is simply a lab notebook. Depending on the team size and your needs, it can be as simple as a shared spreadsheet or as sophisticated as MLflow.&lt;/p&gt;
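&lt;p&gt;At its smallest, the lab notebook is just an append-only log of configurations and results, as in this sketch (the column names are invented for illustration):&lt;/p&gt;

```python
# Minimal "lab notebook": one CSV row per run, recording the exact
# configuration and the resulting metric so trials stay comparable.
import csv
import io

log = io.StringIO()  # stands in for a shared CSV file
writer = csv.DictWriter(log, fieldnames=["run", "lr", "layers", "val_acc"])
writer.writeheader()

# Three hypothetical runs with different hyperparameters.
for run, (lr, layers, val_acc) in enumerate(
        [(0.1, 2, 0.81), (0.01, 2, 0.85), (0.01, 4, 0.83)], start=1):
    writer.writerow({"run": run, "lr": lr, "layers": layers,
                     "val_acc": val_acc})

# Later: read the log back and find the best configuration so far.
rows = list(csv.DictReader(io.StringIO(log.getvalue())))
best = max(rows, key=lambda r: float(r["val_acc"]))
```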

&lt;p&gt;Don’t spend an excessive amount of time on fine-tuning&lt;br&gt;
The results presented in papers often reflect pushing the described methods to their limits. Those extra fractions of a percentage point of accuracy may be the product of many time-consuming experiments. Moreover, papers are not step-by-step implementation guides; they focus on describing the essential concepts of the presented method, and the authors don’t mention many nuances that may matter from an implementation perspective. Therefore, implementing a paper from scratch can be very challenging, especially if you are trying to match the reported accuracy.&lt;br&gt;
An AI project is usually time-constrained and requires a wise approach to time management. Hence, if the project has a goal other than replicating a publication precisely, “close enough” results may be sufficient to stop tuning. This matters especially when several approaches require implementation and evaluation.&lt;/p&gt;

&lt;p&gt;Make your experiments reproducible&lt;br&gt;
It doesn’t bring much value to the project if you managed to attain 99% accuracy but are unable to reproduce that result. Therefore, you must ensure that your experiments can be repeated.&lt;br&gt;
First of all, apply version control not only to your code but also to your data. There are several tools for code versioning, and data versioning is also gaining more and more attention, which has led to solutions suitable for data science projects.&lt;br&gt;
Machine learning frameworks can be non-deterministic and depend on pseudo-random number generators, so you may obtain different results on different runs. To make things fully reproducible, store the seed you used to initialize your weights.&lt;/p&gt;
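&lt;p&gt;A minimal seeding sketch with Python’s standard library (framework-specific generators, e.g. in NumPy or PyTorch, are seeded in the same spirit):&lt;/p&gt;

```python
# The same seed yields the same "random" weights, so a run can be
# reproduced exactly; a different seed gives a different draw.
import random

def init_weights(seed, n=5):
    rng = random.Random(seed)  # isolated generator with a fixed seed
    return [rng.uniform(-1, 1) for _ in range(n)]

run_a = init_weights(seed=42)
run_b = init_weights(seed=42)  # identical to run_a
run_c = init_weights(seed=7)   # a different draw
```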

&lt;p&gt;Maintain code quality&lt;br&gt;
There is a common term, “research code,” which is an excuse for poor-quality code that is barely readable. The authors usually say the main focus was to create and evaluate a brand-new method rather than to worry about code quality. That may be a fair excuse as long as nobody else has to reuse the implementation and there is no need for changes or deployment to production. Unfortunately, all of these are inherent parts of a commercial project. Therefore, as soon as you make your code available to others, refactor it and make it human-friendly.&lt;br&gt;
Moreover, sometimes not only is the code quality poor, but the project structure also makes the work hard to grasp. In that case, too, existing tools can help maintain a clear code organization.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/leading-edje"&gt;&lt;br&gt;
  &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SfUhPiEd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/5uo60qforg9yqdpgzncq.png" alt="Smart EDJE Image"&gt;&lt;br&gt;
&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Best Practices for Setting Up Artificial Intelligence in Your Organization
</title>
      <dc:creator>Dan</dc:creator>
      <pubDate>Fri, 12 Jun 2020 01:33:29 +0000</pubDate>
      <link>https://forem.com/leading-edje/best-practices-for-setting-up-artificial-intelligence-in-your-organization-34k7</link>
      <guid>https://forem.com/leading-edje/best-practices-for-setting-up-artificial-intelligence-in-your-organization-34k7</guid>
      <description>&lt;p&gt;Today, Artificial Intelligence (AI) is more focused on performing a single task very smartly rather than providing a comprehensive solution covering many areas that require intelligence.&lt;/p&gt;

&lt;p&gt;AI has been around for decades, so why the big push for AI today?  Three main factors have created an explosion in the AI industry.  &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Big Data. There is more data than ever, which allows machine learning to improve and provide more relevant insights. &lt;/li&gt;
&lt;li&gt;Reduced processing costs. The cost to set up infrastructure and build a specialized team was extremely high. AI required huge budgets and investment for custom made-to-order algorithms. The rise of ubiquitous computing, low-cost cloud services, inexpensive storage, and new algorithms changes all that. Cloud computing and advances in Graphical Processing Units (GPUs) have provided the necessary computational power, while AI algorithms and architectures have progressed rapidly, often enabled by open source software. In fact, today there are many open source options and cloud solutions from Google, Amazon, IBM and more to address infrastructure costs.&lt;/li&gt;
&lt;li&gt;Breakthroughs in deep learning technology. A subset of machine learning, deep learning has structures loosely inspired by the neural connections in the human brain. Most of the big deep learning breakthroughs happened after 2010, but deep learning (neural nets) has already demonstrated the ability to solve highly complex problems that are well beyond the capabilities of a human programmer using if-then statements and decision trees.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So how can an organization adopt AI? Here are some best practices for adopting Artificial Intelligence in your organization.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting Executive Expectations About AI Adoption: AI is Not Like Regular Software
&lt;/h2&gt;

&lt;p&gt;Business leaders should understand that they need to fundamentally change the way they think about these projects compared to conventional software automation. Implementing a software automation project that doesn’t include any AI is usually far simpler and takes significantly less time. &lt;/p&gt;

&lt;p&gt;AI initiatives are time- and resource-intensive. They generally require huge amounts of very specific kinds of data that the business may or may not have.&lt;/p&gt;

&lt;p&gt;They require time from both data scientists and the subject-matter experts who inform their work; a business must be willing to take subject-matter experts away from the routine work that earns the business money so they can collaborate with data scientists to build an AI model. Despite this investment, the ROI of any given AI project may be negligible in the short term, if there is one at all.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sector-Specific AI Understanding
&lt;/h2&gt;

&lt;p&gt;Businesses also need to understand conceptually what sorts of problems AI can solve. There are many approaches to artificial intelligence and machine learning, including natural language processing, computer vision, and anomaly detection. All of these have their own specific use cases in business.&lt;/p&gt;

&lt;p&gt;Once business leaders understand what’s doable with AI, they can identify business problems at their organization that AI might solve. There are plenty of AI applications on the market. It’s vital to choose applications that attempt to solve business problems with suitable technology and provide value in measurable ways. Finally, businesses should have adequate and relevant data for training machine learning algorithms to produce correct outputs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Clarity on Goals
&lt;/h2&gt;

&lt;p&gt;Setting a long-term AI objective is critical for success. At the same time, business leaders need to understand that any "rethinking" of business workflows through AI is a huge undertaking with varying rates of progress. &lt;/p&gt;

&lt;p&gt;Organizations should concentrate on gaining AI capabilities first and then use those capabilities as a guide for defining long-term objectives. If the objective is upskilling the team’s understanding of AI, organizations should begin with small projects and pick an area where a traditional software solution already exists. Pick something for which you already have a reasonable existing baseline, so that you can at least compare the effects of the AI model against it and know whether you are heading in the right direction. &lt;/p&gt;

&lt;p&gt;While achieving these long-term objectives isn’t a short-term process, the best way to move toward them is to begin with small AI projects that are aligned with the kind of long-term AI capabilities the business needs to gain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ways to Organize AI-Compatible Teams
&lt;/h2&gt;

&lt;p&gt;Once organizations have understood what AI can do and have aligned those capabilities with their business objectives, the next step is to assemble data scientists and subject-matter experts into multidisciplinary teams. SMEs are employees with a deep understanding of business processes in a particular function or division. &lt;/p&gt;

&lt;p&gt;As a rule, assembling such a team is more costly than traditional software projects. The budget allocation for such efforts usually has to be approved by the COO or someone else in the C-suite. Organizing such a team of data scientists may involve the following steps: &lt;/p&gt;

&lt;p&gt;Ensuring that the data scientists working on the solution are clearly aware of the business problem AI is being applied to. This gives them context on how much and what kind of data they need, as well as which other team members’ skills may be required for the project. &lt;/p&gt;

&lt;p&gt;Subject-matter experts need to identify the business problems that should be solved. From there, data scientists are better positioned to determine whether AI can address a specific business problem. &lt;/p&gt;

&lt;p&gt;Artificial intelligence projects are not a one-time investment. As organizations generate new data, the algorithms need to be adjusted to incorporate the additional data while still maintaining accurate results. Maintaining and refreshing AI systems is necessary, and business leaders need to assemble teams that can accomplish this even once the project is largely built and deployed. Again, this process doesn’t involve just data scientists. Just as with the development of AI systems, maintaining and calibrating these systems to improve their accuracy also requires input from subject-matter experts and other team members.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data and Data Infrastructure Considerations
&lt;/h2&gt;

&lt;p&gt;While we have covered most of the essential human components needed to adopt AI effectively, none of these steps are beneficial unless they are built around a data-centered strategy. Data is what makes AI projects run, and it must be cleaned, parsed, and tested before it can be used. &lt;/p&gt;

&lt;p&gt;Rethinking how a business collects, stores, and manages data is a decision that should be made after gaining a certain level of data competency. &lt;/p&gt;

&lt;p&gt;Once the data being tested in a pilot proves measurably significant as a proof of concept, organizations can consider the stage where the entire data infrastructure is overhauled. Below are a few pointers on what business leaders can expect when it comes to data and data-infrastructure management in AI projects: &lt;/p&gt;

&lt;p&gt;Organizations will find that accessing data is usually harder than anticipated. Data may be stored in several different formats, or in different geographic regions with different data-transfer regulations. &lt;/p&gt;

&lt;p&gt;Even the data that is available is usually not in a format that makes it easy to use. The data will often require heavy restructuring and preparation to format it and, in some cases, to clean it. &lt;/p&gt;

&lt;p&gt;The storage hardware for this data may also need upgrading. What’s more, organizations may need to reconsider how they currently collect data and what new infrastructure may be required to implement AI economically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Picking an Initial AI Project
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Start Small, But With A Long-Term View of AI Skills
&lt;/h3&gt;

&lt;p&gt;AI projects also involve many individual steps that may take days or weeks to complete. In essence, there are many benefits to beginning small:&lt;/p&gt;

&lt;p&gt;Small projects help businesses focus on building skills instead of looking for outright returns right away. AI projects are technically difficult and need large amounts of initial capital to deploy. They can take two to six months to build, and even then there may not be a successful result in some cases. Beginning with small pilots lets businesses learn which AI skills appear to be working and which aren’t valuable.&lt;/p&gt;

&lt;p&gt;Small projects may not require total data infrastructure overhauls in order to be tested and deployed successfully. For example, deploying a chatbot may not require a business to overhaul its entire data infrastructure, yet it still provides a point of entry into AI. Businesses can keep small AI projects contained in terms of internal data flows, thereby not disrupting existing processes.&lt;/p&gt;

&lt;p&gt;Small projects help build confidence in data management capabilities.&lt;br&gt;
Gaining confidence in working with data supports the development of future AI projects, and building data capability as a core competency allows businesses to stay ahead of the competition.&lt;/p&gt;

&lt;h2&gt;
  
  
  Basic Tips for Artificial Intelligence
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Understand your data and business&lt;br&gt;
This may sound like common sense, but it’s worth stating anyway, as skipping these steps can have critical consequences for the project.&lt;br&gt;
Exploratory data analysis helps assess data quality and define reasonable expectations for the project’s goals. Moreover, close cooperation with subject-matter experts provides domain insights, which are key to a full understanding of the problem.&lt;br&gt;
This should lead to metrics that help track project progress not only from a machine learning perspective but also against business factors.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Stand on the shoulders of giants&lt;br&gt;
It’s highly likely that somebody has already faced an issue just like yours and published a solution.&lt;br&gt;
A literature review, blogs, and evaluating available open-source code can help you set an initial direction and shortlist possible approaches for building the product.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Don’t believe everything stated in the papers&lt;br&gt;
On the other hand, many papers are written to prove a specific model’s superiority over alternatives and don’t address the limitations and downsides of the method. Therefore, it’s good practice to approach each article with a dose of skepticism and common sense.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Start with a simple approach&lt;br&gt;
Running a simple approach may give you more insight into the problem than a more complicated one, as simple methods and their results are easier to interpret. Moreover, implementing, training, and evaluating a simple model is far less time-consuming than a complicated one.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
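&lt;p&gt;The “understand your data and business” step can start as small as this sketch, which profiles made-up records for missing values and basic statistics:&lt;/p&gt;

```python
# Quick exploratory-data-analysis sketch: missing-value rates and basic
# stats per column, enough to set realistic expectations before modeling.
data = [
    {"age": 34,   "income": 52000},
    {"age": None, "income": 61000},
    {"age": 29,   "income": None},
    {"age": 41,   "income": 58000},
]

def column_profile(rows, col):
    # Keep only the present values; the gap between len(rows) and
    # len(values) is the missing-data rate for this column.
    values = [r[col] for r in rows if r[col] is not None]
    return {
        "missing_rate": 1 - len(values) / len(rows),
        "mean": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }

age_profile = column_profile(data, "age")        # 25% of ages are missing
income_profile = column_profile(data, "income")
```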

&lt;p&gt;There is a trade-off between model interpretability and flexibility: a more flexible model can handle harder tasks, but its results are harder to interpret. Deep learning sits at the high-flexibility, low-interpretability extreme of that spectrum.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Define your baseline&lt;br&gt;
How does one know that your state-of-the-art billion parameters model does better than a naive solution? As sophisticated methods do not always outperform more straightforward approaches, it's a decent practice to own a straightforward baseline that helps in tracking the gain offered by complex strategies. Sometimes the benefit is minimal, and a straightforward method may be preferable for a given task for reasons like inference speed or deployment costs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Plan and track your experiments&lt;br&gt;
Numerous variables may influence the performance of AI algorithms. This is especially true for deep learning models, where one can experiment with model architectures, cost functions, and hyper-parameters. Hence, tracking the trials becomes challenging, especially when many people work together.&lt;br&gt;
The solution is simply a lab notebook. Depending on the team size and your needs, it can be as simple as a shared spreadsheet or as sophisticated as MLflow.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Don’t spend an excessive amount of time on fine-tuning&lt;br&gt;
The results presented in papers are often the effect of pushing the described methods to their limits. The additional fractions of a percentage point of accuracy may be the product of many time-consuming experiments. Moreover, papers are not step-by-step implementation guides; they focus on describing the essential concepts of the presented method, and the authors often omit nuances that matter from an implementation perspective. Therefore, implementing a paper from scratch can be a very challenging task, especially if you try to match the reported accuracy.&lt;br&gt;
An AI project is usually time-constrained and requires sensible time management. Hence, if the project has a goal other than replicating some publication precisely, “close enough” results may be sufficient to stop refining the implementation. This remark is crucial if several approaches require implementation and evaluation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Make your experiments reproducible&lt;br&gt;
It doesn’t bring much value to the project if you achieved 99% accuracy but are unable to reproduce that result. Therefore, you should ensure that your experiments can be repeated.&lt;br&gt;
First of all, use version control, not only for your code but also for your data. There are several tools for code versioning, and data versioning is also gaining more and more attention, resulting in solutions suitable for data science projects.&lt;br&gt;
Machine learning frameworks are non-deterministic and depend on pseudo-random number generators, so you may obtain different results on different runs. To make things fully reproducible, store the seeds you used to initialize your weights.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Maintain code quality&lt;br&gt;
There is a common term, “research code,” which is an excuse for poor-quality code that is barely readable. The authors usually say the main focus was to create and evaluate a new method rather than to worry about code quality. That may be a fair excuse as long as nobody else is expected to reuse the implementation and there is no need for changes or deployment to production. Unfortunately, all of these are inherently part of a commercial project. Therefore, as soon as you make your code available to others, refactor it and make it human-friendly.&lt;br&gt;
Moreover, sometimes not only is the code quality poor, but the project structure also makes it hard to grasp. In this case too, existing tools can help maintain a clear code organization.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
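&lt;p&gt;The tracking and reproducibility points above can be sketched together in a few lines (the CSV columns and the run_experiment stand-in below are illustrative assumptions, not a prescribed format):&lt;/p&gt;

```python
import csv
import os
import random
import tempfile

def run_experiment(seed, learning_rate):
    """Stand-in for a training run: seeded, so the 'result' is reproducible."""
    random.seed(seed)  # in a real project, also seed numpy and your ML framework
    return round(random.random(), 4)  # pretend this is a validation score

# Log every trial (seed, parameters, metric) so results can be reproduced later.
log_path = os.path.join(tempfile.gettempdir(), "experiments.csv")
with open(log_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["seed", "learning_rate", "score"])
    for seed, lr in [(0, 0.01), (1, 0.001)]:
        writer.writerow([seed, lr, run_experiment(seed, lr)])

# Re-running with the same seed reproduces the same score.
print(run_experiment(0, 0.01) == run_experiment(0, 0.01))  # True
```

&lt;p&gt;A shared spreadsheet works the same way; tools like MLflow automate the logging once the number of trials grows.&lt;/p&gt;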

&lt;p&gt;&lt;a href="https://dev.to/leading-edje"&gt;&lt;br&gt;
  &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SfUhPiEd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/5uo60qforg9yqdpgzncq.png" alt="Smart EDJE Image"&gt;&lt;br&gt;
&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Tensorflow Developer Certification</title>
      <dc:creator>Dan</dc:creator>
      <pubDate>Fri, 12 Jun 2020 00:21:59 +0000</pubDate>
      <link>https://forem.com/leading-edje/tensorflow-developer-certification-3p0l</link>
      <guid>https://forem.com/leading-edje/tensorflow-developer-certification-3p0l</guid>
      <description>&lt;h2&gt;
  
  
  What I do and why I decided to get certified
&lt;/h2&gt;

&lt;p&gt;As a software developer I’ve always been interested in data science and analytics but preferred code to math.  I did dip my toes in enough to earn a degree specializing in Business Intelligence, but over the years drifted away to more conventional coding.  You may have noticed that the last couple of years have caused a boom in artificial intelligence and machine learning, and I’ve been knocking the rust off some old skills, so when Google offered the TensorFlow Developer Certification I thought it would be a perfect fit for me.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is TensorFlow?
&lt;/h2&gt;

&lt;p&gt;TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state of the art in ML and developers easily build and deploy ML-powered applications.  Created by Google, it powers many of Google’s machine learning services.  TensorFlow programs are usually written in Python or JavaScript, which then run the underlying machine learning operations in optimized C++ for speed.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is TensorFlow Developer Certification?
&lt;/h2&gt;

&lt;p&gt;The first step to finding out what the certification involves is the &lt;a href="//www.tensorflow.org/certificate"&gt;landing page&lt;/a&gt;, followed by the &lt;a href="https://www.tensorflow.org/site-assets/downloads/marketing/cert/TF_Certificate_Candidate_Handbook.pdf"&gt;Candidate Handbook&lt;/a&gt;. It’s only 7 pages, so read it well.  The handbook contains a skills checklist that can help you assess whether you are ready to take the exam, or whether you should polish up areas where you might be rusty.  I definitely needed to polish several areas.&lt;/p&gt;

&lt;p&gt;But for the quick version: the TensorFlow Developer Certification is a way to prove and show off your ability to use TensorFlow, specifically the Python version, to build machine learning models for tasks ranging from regression and computer vision to natural language processing and forecasting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why would you want to be TensorFlow Developer Certified?
&lt;/h2&gt;

&lt;p&gt;For me, it was a way to motivate myself into learning more about modern machine learning and to complete a fun challenge.  For a lot of people however, it’s going to be more about proving and certifying that you are skilled in a major tool of machine learning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Exam requirements
&lt;/h2&gt;

&lt;p&gt;The exam is taken at home, using your personal computer.  The Candidate Handbook lays out how to set up your environment. I found it to be fairly easy. You’ll need to install Python 3.7, PyCharm (a popular Python IDE), and some packages like tensorflow and numpy.  The whole process is detailed in the ‘setting up your environment’ document.&lt;/p&gt;

&lt;p&gt;The exam is five hours long, costs $100, and requires ID verification.  You will be presented with five questions, each requiring you to code a model. You won’t be starting from scratch, and where you have to add your code will be clearly marked with comments. The categories include a basic machine learning model, a model built from a learning dataset, a Convolutional Neural Network with a real-world image dataset, a Natural Language Processing text classification with a real-world text dataset, and a Sequence Model with a real-world numeric dataset. &lt;/p&gt;

&lt;h3&gt;
  
  
  Test requirements
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Foundational Principles Of ML &amp;amp; Deep Learning &lt;br&gt;
You will need to have a clear understanding of building TensorFlow models using Computer Vision techniques, Convolutional Neural Networks (CNN), and Natural Language Processing (NLP), among others. Understanding how to use TensorFlow 2.0 is another prerequisite.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Build &amp;amp; Train Neural Network Models Using TensorFlow 2.0 &lt;br&gt;
You need to have a good grasp of ML and deep learning models using the latest TensorFlow 2.0 version. For this, you will need to know how to use TensorFlow 2.0, build, compile and train ML models using TensorFlow, preprocess data to get it ready for use in a model, and use models to predict results. &lt;br&gt;
You will need to build and train models with multiple layers for binary classification and multi-class categorization, understand how to use callbacks to trigger the end of training cycles, and use datasets from different sources and formats. &lt;br&gt;
You must also know how to identify strategies to prevent overfitting, including augmentation and dropout, plot the loss and accuracy of a trained model, extract features from pre-trained models, and ensure that inputs are in the correct shape so you can match test data to the input shape of a neural network.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Knowledge Of Image Classification&lt;br&gt;
The third requirement is knowledge on how to build image recognition and object detection models with deep neural networks and convolutional neural networks using TensorFlow 2.0. &lt;br&gt;
For this, you will need to understand CNNs with Conv2D and pooling layers, and how to use convolutions to improve your neural network. You will also need to know image augmentation to prevent overfitting, and ImageDataGenerator and how it labels images based on the directory structure. In addition to this, you also need to know how to build and train models to process real-world image datasets.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Knowledge Of Natural Language Processing (NLP)&lt;br&gt;
You need to understand how to use neural networks to solve NLP problems using TensorFlow. For this, you will need to know how to build NLP models using TensorFlow, build models that identify the category of a piece of text using binary and multi-class categorization, use word embeddings and LSTM in the TensorFlow model, use RNNS, LSTMs, GRUs and CNNs to work with text, as well as train LSTMs on existing text to generate text.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Knowledge of Time Series, Sequences and Predictions&lt;br&gt;
This is the last requirement participants need for the certification. You will need a clear understanding of how to solve time series and forecasting problems in TensorFlow. This includes training and tuning models for time series and forecasting problems, and knowledge of Mean Absolute Error (MAE) and how it can be used to evaluate the accuracy of sequence models. Other prerequisites include using RNNs and CNNs for time series, sequence and forecasting models, identifying when to use trailing versus centered windows, identifying and compensating for sequence bias, adjusting the learning rate dynamically in time series, sequence and prediction models, as well as using TensorFlow for forecasting.  &lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
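&lt;p&gt;As a concrete example of the last requirement, Mean Absolute Error for a trailing-window moving-average forecast can be computed by hand (the series and window size below are made up for illustration):&lt;/p&gt;

```python
# Toy daily series and a 3-step trailing moving-average forecast.
series = [10.0, 12.0, 13.0, 12.0, 15.0, 16.0, 18.0]
window = 3

forecasts, actuals = [], []
for i in range(window, len(series)):
    forecasts.append(sum(series[i - window:i]) / window)  # trailing window ends at i-1
    actuals.append(series[i])

# Mean Absolute Error: average magnitude of the forecast errors.
mae = sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)
print(round(mae, 3))  # 2.333
```

&lt;p&gt;A centered window would average points on both sides of the target instead, which is why it cannot be used for forecasting future values.&lt;/p&gt;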

&lt;h2&gt;
  
  
  How to prepare for the exam
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;The TensorFlow Developer Certification Handbook&lt;br&gt;
You should start your journey here.  The handbook outlines the topics that will be covered in the exam.  I suggest reading it a couple of times.  The topics may look daunting, but the resources below will completely cover everything you need to know, it just helps to know what you HAVE to know to pass.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;TensorFlow in Practice Specialization on Coursera&lt;br&gt;
An absolutely wonderful course that I would highly recommend for anyone interested in machine learning. It covers all the exam topics, going from a basic model to image and text classification to time series data in just a few weeks. This is the most important resource for the exam (and for getting started with TensorFlow in general). &lt;br&gt;
It’s taught by Laurence Moroney and Andrew Ng, two luminaries of TensorFlow and machine learning, and if I had to choose only one resource to prepare for the exam, this would be it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Hands-on Machine Learning with Scikit-Learn, Keras and TensorFlow 2nd Edition&lt;br&gt;
I suggested this book in my other blog post about learning machine learning and once again highly recommend it. At 700+ pages, this book covers basically all of machine learning and thus, some topics which aren’t relevant to the exam. But it’s a must-read for anyone interested in setting themselves a solid foundation for a future in machine learning and not just to pass an exam.&lt;br&gt;
If you’re new to machine learning, you’ll probably find this book hard to read (to begin with) and I would suggest starting with the Coursera course first.&lt;br&gt;
If you’re only after relevant chapters to the exam, you’ll want to read:&lt;br&gt;
Chapter 10: Introduction to Artificial Neural Networks with Keras&lt;br&gt;
Chapter 11: Training Deep Neural Networks&lt;br&gt;
Chapter 12: Custom Models and Training with TensorFlow&lt;br&gt;
Chapter 13: Loading and Preprocessing Data with TensorFlow&lt;br&gt;
Chapter 14: Deep Computer Vision Using Convolutional Neural Networks&lt;br&gt;
Chapter 15: Processing Sequences Using RNNs and CNNs&lt;br&gt;
Chapter 16: Natural Language Processing with RNNs and Attention&lt;br&gt;
But for the serious student, I’d suggest the whole book and the exercises (maybe not all of them, but pick and choose the ones which spark your interest most).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Introduction to Deep Learning by MIT&lt;br&gt;
Excellent world-class deep learning course from a world-class university, and it’s free!&lt;br&gt;
The first 3 lectures, covering deep learning in general, Convolutional Neural Networks (usually used for computer vision) and Recurrent Neural Networks (usually used for text processing), are the most relevant to the exam.&lt;br&gt;
But again, for the eager learner, going through the whole course wouldn’t be a bad idea.&lt;br&gt;
Be sure to check out the labs and code they offer on GitHub, especially the Introduction to TensorFlow one. And again, I can’t stress the importance of writing the code yourself.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A PyCharm Tutorial&lt;br&gt;
The exam takes place in PyCharm (a Python IDE). I would suggest becoming familiar with it before the exam and running through several example models to make sure everything in your environment is working well.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What happens after you finish the exam?
&lt;/h2&gt;

&lt;p&gt;Once you complete the exam you’ll be notified by email whether or not you passed. There will be no feedback except “Congratulations you passed” or “Unfortunately you didn’t pass this time”. For the most part, you’ll already know if you passed or not when you finish the exam.&lt;/p&gt;

&lt;p&gt;Be sure to fill out the form in the email to make sure you get added to the TensorFlow Certified Developers network. Once you’ve passed the exam and filled out the form in the email confirmation, in a couple of weeks you’ll be able to join Google’s Global Certification Network. Registering yourself here means anyone who’s looking for skilled TensorFlow developers will be able to search for you based on your certification type, experience and region.&lt;br&gt;
You’ll also be emailed an official TensorFlow Developer Certification and badge.  Congratulations and good luck with learning more about Artificial Intelligence and Machine Learning!&lt;br&gt;
&lt;a href="https://dev.to/leading-edje"&gt;&lt;br&gt;
  &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SfUhPiEd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/5uo60qforg9yqdpgzncq.png" alt="Smart EDJE Image"&gt;&lt;br&gt;
&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Installing an OS and Software for Machine Learning</title>
      <dc:creator>Dan</dc:creator>
      <pubDate>Fri, 13 Mar 2020 19:11:08 +0000</pubDate>
      <link>https://forem.com/leading-edje/installing-an-os-and-software-for-machine-learning-kd5</link>
      <guid>https://forem.com/leading-edje/installing-an-os-and-software-for-machine-learning-kd5</guid>
      <description>&lt;p&gt;This is an article about setting up a new computer with the software needed to do Machine Learning on it.  We’re going to use Ubuntu 18.04 LTS.  If you plan to use Windows on this machine as well, you’ll need to set up dual booting.&lt;/p&gt;

&lt;h2&gt;
  
  
  OS Installation
&lt;/h2&gt;

&lt;p&gt;Install Ubuntu 18.04 LTS by downloading the image from &lt;a href="https://ubuntu.com/download/"&gt;Ubuntu’s website&lt;/a&gt; and imaging it to a USB drive.  &lt;/p&gt;

&lt;p&gt;There are tutorials from Ubuntu for creating the installer from &lt;a href="https://ubuntu.com/tutorials/tutorial-create-a-usb-stick-on-windows#1-overview"&gt;Windows&lt;/a&gt;, &lt;a href="https://ubuntu.com/tutorials/tutorial-create-a-usb-stick-on-macos#1-overview"&gt;Mac&lt;/a&gt; and &lt;a href="https://ubuntu.com/tutorials/tutorial-create-a-usb-stick-on-ubuntu#1-overview"&gt;Ubuntu&lt;/a&gt;.  I like to use &lt;a href="https://rufus.ie/"&gt;Rufus&lt;/a&gt; for imaging on a Windows machine.  Boot up the computer with the USB drive and follow the installation instructions.  You may need to set up the BIOS to boot from the USB drive.  Make sure your computer is connected to the Internet before starting the installation so it can fetch and install any updates while installing.  If your computer was not connected to the internet during installation, be sure to run the following commands once it is connected:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt update
sudo apt upgrade
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Why Ubuntu?
&lt;/h3&gt;

&lt;p&gt;Ubuntu is a distribution of Linux maintained by Canonical.  It is free and open-source and fairly user friendly with lots of community support available on the Internet.&lt;/p&gt;

&lt;h3&gt;
  
  
  Install ssh server
&lt;/h3&gt;

&lt;p&gt;I recommend setting up an ssh server so you can log into your machine remotely.  This will be especially useful if you plan on running the computer headless (no monitor attached).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt update
sudo apt install openssh-server
# Edit the ssh daemon config with sudo rather than loosening its permissions
sudo nano /etc/ssh/sshd_config
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h2&gt;
  
  
  GPU Acceleration
&lt;/h2&gt;

&lt;p&gt;Install the proprietary NVIDIA drivers as well as CUDA and CUDNN to get the full capability of your NVIDIA GPUs.&lt;/p&gt;

&lt;p&gt;You can go to the official NVIDIA website and find which version of the display driver you should use.  You can download the drivers from the website or you can install from the terminal with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt install nvidia-driver-&amp;lt;version number&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Then reboot and verify the drivers are installed by typing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nvidia-smi
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;If you’re using the GUI you can also go to Additional Drivers and select the correct driver for your GPU.&lt;/p&gt;

&lt;h2&gt;
  
  
  Download and Install CUDA
&lt;/h2&gt;

&lt;p&gt;You can go to the CUDA Toolkit website and get the commands for the latest version.  For 10.2 on Ubuntu 18.04 use the following commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
sudo sh cuda_10.2.89_440.33.01_linux.run
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h2&gt;
  
  
  Download and Install CUDNN
&lt;/h2&gt;

&lt;p&gt;Go to the &lt;a href="https://developer.nvidia.com/cudnn"&gt;NVIDIA cuDNN&lt;/a&gt; home page and sign in with your account or create a new one if you don't have one yet.&lt;/p&gt;

&lt;p&gt;Download CUDNN 7.x for CUDA 10.2 and extract the archive; this will create a cuda directory, e.g. in ~/Downloads/cuda.  After extracting, copy the contents to the CUDA folder as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo rsync -rl cuda/ /usr/local/cuda
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h2&gt;
  
  
  Python Virtualenv
&lt;/h2&gt;

&lt;p&gt;We’re going to install Python virtualenv to enable creating isolated Python environments.  This will help with dependency management.  &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;First make sure you have all the prerequisites:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt install python-dev python3-dev python-pip virtualenv
sudo pip install virtualenv virtualenvwrapper
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Then add the following to your bashrc:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# virtualenv and virtualenvwrapper
export WORKON_HOME=$HOME/.virtualenvs
source /usr/local/bin/virtualenvwrapper.sh
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create a new virtual environment named 'virtenv' and specify the Python version you’d like to use in it.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;virtualenv -p python3 virtenv
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Activate the virtual environment
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;source virtenv/bin/activate
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;You can deactivate the virtenv by typing
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;deactivate
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Make sure you are in your virtual environment and install the following with pip.  Pip is a Python package manager and makes installing software very easy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install numpy
pip install scipy
pip install pandas
pip install matplotlib
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Check the installed packages in the virtual environment using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip list --format=columns
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
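&lt;p&gt;A quick sanity check from inside the activated virtual environment confirms the packages import and work (numpy shown here; the same idea applies to the others):&lt;/p&gt;

```python
import numpy as np

# Build a small array and confirm basic operations work.
a = np.arange(6).reshape(2, 3)
print(a.shape, a.sum())  # (2, 3) 15
```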



&lt;h2&gt;
  
  
  Machine Learning Packages
&lt;/h2&gt;

&lt;p&gt;Let's now install some of the more popular Machine Learning packages.&lt;/p&gt;

&lt;h3&gt;
  
  
  Install OpenCV
&lt;/h3&gt;

&lt;p&gt;OpenCV 3.2.0 can be installed from the Ubuntu 18.04 official repository.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt install python3-opencv
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Install Scikit-learn
&lt;/h3&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install scikit-learn
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Install TensorFlow
&lt;/h3&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install --upgrade tensorflow-gpu
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Install Keras
&lt;/h3&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install keras
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  PyTorch Development Environment
&lt;/h3&gt;

&lt;p&gt;Now it is time to set up the Python environment for PyTorch development.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mkvirtualenv pytorch -p python3 --system-site-packages
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Switch to the pytorch virtual environment and then install the packages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install torch torchvision
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Hopefully this guide was useful in getting your Machine Learning Box with Ubuntu set up.  If you run into trouble be sure to ask the Ubuntu community for help, they are usually a friendly group and more than happy to help.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/leading-edje"&gt;&lt;br&gt;
  &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SfUhPiEd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/5uo60qforg9yqdpgzncq.png" alt="Smart EDJE Image"&gt;&lt;br&gt;
&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>software</category>
      <category>linux</category>
    </item>
    <item>
      <title>Picking out the parts for a custom Machine Learning Box</title>
      <dc:creator>Dan</dc:creator>
      <pubDate>Fri, 13 Mar 2020 19:11:00 +0000</pubDate>
      <link>https://forem.com/leading-edje/picking-out-the-parts-for-a-custom-machine-learning-box-1clk</link>
      <guid>https://forem.com/leading-edje/picking-out-the-parts-for-a-custom-machine-learning-box-1clk</guid>
      <description>&lt;h2&gt;
  
  
  A Custom Machine Learning Box
&lt;/h2&gt;

&lt;p&gt;Building a Machine Learning box can be a lot of fun and save a lot of money over cloud solutions.  For developers who want to dive deep into machine learning, and for small companies, it offers a lot of flexibility, customizability, security and performance for the dollar.  Building your own box may not be an option for larger enterprise-level companies that require warranties and service, but they usually have their own methods of acquiring hardware and servers for their on-premise data centers.  For these companies, making a choice between cloud solutions and on-premise servers requires weighing factors such as security, data center capacity and space, sysadmin availability and price.  A small company or developer, however, can usually set up a box or two easily and then supplement with a cloud solution if necessary to get the most for their money.&lt;/p&gt;

&lt;h3&gt;
  
  
  The four most important parts: GPU, CPU, Storage and Memory
&lt;/h3&gt;

&lt;p&gt;Those who haven’t built a computer before may be surprised to find out there are really only 8 components required to put one together: GPU, CPU, Storage, Memory, CPU Cooler, Motherboard, Power Supply and Case.  While all the parts depend on each other (so don’t try to cheap out on the Power Supply or you may have issues!), the ones that matter most for Machine Learning are the GPU, CPU, Storage and Memory.  The GPU will do most of the work with the machine learning model, so you’ll want it to be as fast as possible and have enough on-board memory to hold the model and data.  The CPU has to manage all the GPUs and can be a bottleneck if it is too slow.  Another common bottleneck is memory and storage; they need to be big enough and fast enough to feed the datasets to the GPUs.&lt;/p&gt;

&lt;h4&gt;
  
  
  AMD CPUs vs Intel CPUs
&lt;/h4&gt;

&lt;p&gt;We’ll start with the CPU as it dictates a lot of the rest of the system, such as the motherboard.  Currently AMD CPUs offer far more performance per dollar than Intel CPUs, especially when it comes to CPU threads.  An AMD Threadripper 1920X is $199.99 and offers 12 cores/24 threads, a 3.5 GHz clock and 60 PCIe lanes.  An Intel 9900X has 10 cores/20 threads, a 3.5 GHz clock and only 44 PCIe lanes, and costs $597.87.  In case you were wondering why these CPUs, read on!&lt;/p&gt;

&lt;h5&gt;
  
  
  Make sure your CPU can handle 4 GPUs
&lt;/h5&gt;

&lt;p&gt;It’s hard to be sure how many GPUs you’ll end up with when you’re building your first box.  Starting with 1 GPU can save a lot of money, but if your models take too long to train you’ll want to be able to upgrade.  For Machine Learning a GPU takes 8 PCIe lanes.  Your M.2 SSD storage will take 4 PCIe lanes and you’ll need another 4 PCIe lanes for your Gigabit ethernet.  So at a minimum you’ll need 40 PCIe lanes.  &lt;/p&gt;

&lt;p&gt;You’ll also want enough threads to run 4 experiments per GPU, so you’ll need 16 threads to handle the 4 GPUs possible.  &lt;/p&gt;

&lt;h4&gt;
  
  
  Picking the right GPU
&lt;/h4&gt;

&lt;p&gt;As mentioned before, the model and dataset need to be able to fit in the on-board memory of your GPU.  Too little memory and you just won’t be able to run your model at all.  NVIDIA GPUs have an edge over AMD when it comes to machine learning support and tools, so that narrows the choices down considerably.  You’ll also want to make sure the GPU is a blower style, where the fans blow the hot air out of the case instead of into it.  This is especially important if you’re going to have 4 GPUs, as there will be very little room for air flow inside the case with that many GPUs.&lt;/p&gt;

&lt;p&gt;I would suggest a 2080 Ti (11 GB of RAM, $1079) for a high end build, a 2070 Super (8 GB, $499) for a mid tier build and a 2060 Super (8 GB, $372) for a low end build.  You can mix and match GPUs, though remember, if your model and dataset require more than 8 GB you won’t be able to run them on the lower end GPUs.&lt;/p&gt;

&lt;p&gt;NOTE: Remember to get a blower style GPU, otherwise the heat will be vented right on top of your other GPUs, causing them to overheat.&lt;/p&gt;

&lt;h4&gt;
  
  
  Memory and Storage
&lt;/h4&gt;

&lt;p&gt;One of the big bottlenecks in the machine learning cloud solutions is the Input/Output from storage to the GPU instances.  We don’t want that to happen in our custom box so we’re going to make sure the Memory and Storage is fast enough.  For Storage I recommend a NVMe SSD.  NVMe uses PCIe instead of SATA.  A SATA3 SSD  typically has a read/write speed up to 550MB/second while a NVMe SSD is up to 3500MB/second.  That’s a pretty large difference for the money (only around $20 more if you compare otherwise similar drives).&lt;/p&gt;
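&lt;p&gt;To put those numbers in perspective, here is a back-of-the-envelope estimate of how long loading a 100 GB dataset would take at each speed (real throughput varies with file sizes and access patterns, so treat this as a rough sketch):&lt;/p&gt;

```python
dataset_mb = 100 * 1000  # a 100 GB dataset expressed in MB

# Sequential-read speeds quoted above, in MB/second.
for name, mb_per_s in [("SATA3 SSD", 550), ("NVMe SSD", 3500)]:
    seconds = dataset_mb / mb_per_s
    print(name, round(seconds, 1), "seconds")  # about 181.8 s vs 28.6 s
```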

&lt;h4&gt;
  
  
  CPU Cooler, Motherboard, Power Supply and Case
&lt;/h4&gt;

&lt;p&gt;You’ll want a good quality CPU Cooler to keep your CPU cool; an air cooler is cheaper and easier to deal with than a water cooler and will more than do the job.  Your Motherboard needs to support 40+ PCIe lanes and work with your CPU.  For the Power Supply, make sure it provides enough power for up to 4 GPUs and the CPU working at max.  You’ll want 1600 watts to supply 4 x 250W GPUs, the 180W CPU, and 150W for the rest of the system.  You’ll notice that 1600 watts is more than that total of about 1330W; that’s because power supplies shouldn’t run at their rated limit, and usually run best at around 80%-90% of it.  The case should be large enough to have 8 expansion slots.  I would suggest a full size tower to give yourself the most room and air flow to work with, but some mid size towers will also work.&lt;/p&gt;
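&lt;p&gt;The wattage estimate above works out as follows (a rough sketch; the 85% figure is an assumed mid-range headroom factor, not a spec):&lt;/p&gt;

```python
gpu_w, gpu_count = 250, 4
cpu_w, rest_w = 180, 150

peak_draw = gpu_w * gpu_count + cpu_w + rest_w  # 1330 W at full load
recommended = peak_draw / 0.85                  # leave headroom above peak draw
print(peak_draw, round(recommended))  # 1330 1565, so a 1600 W unit fits
```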

&lt;h3&gt;
  
  
  &lt;a href="https://pcpartpicker.com/list/"&gt;PC Part Picker&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;I suggest using PC Part Picker to pick out your parts.  Not only will it help you find the best prices, it will help ensure that all your parts are compatible with each other.&lt;/p&gt;

&lt;p&gt;Here are two builds I put together as examples.  The only difference is the GPU. &lt;br&gt;
 The first is a budget build with a single low-end GPU (2060 SUPER) for a total of $1650.83.  The second is a single high-tier GPU (2080 Ti) build for a total of $2380.83.  The GPU obviously makes up a large portion of the budget, especially when you scale up to 4 GPUs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://pcpartpicker.com/list/66VGdm"&gt;PCPartPicker Part List&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CPU&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/cRDzK8/amd-threadripper-1920x-35ghz-12-core-processor-yd192xa8aewof"&gt;AMD Threadripper 1920X 3.5 GHz 12-Core Processor&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$199.99 @ Amazon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CPU Cooler&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/nCNypg/noctua-nh-u14s-tr4-sp3-1402-cfm-cpu-cooler-nh-u14s-tr4-sp3"&gt;Noctua NH-U14S TR4-SP3 82.52 CFM CPU Cooler&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$79.90 @ Amazon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Motherboard&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/kydxFT/gigabyte-x399-aorus-pro-atx-tr4-motherboard-x399-aorus-pro"&gt;Gigabyte X399 AORUS PRO ATX sTR4 Motherboard&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$279.99 @ Amazon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/t7QG3C/crucial-ballistix-sport-at-32-gb-4-x-8-gb-ddr4-3200-memory-bls4k8g4d32aestk"&gt;Crucial Ballistix Sport AT 32 GB (4 x 8 GB) DDR4-3200 Memory&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$151.99 @ Newegg&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Storage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/JLdxFT/samsung-970-evo-10tb-m2-2280-solid-state-drive-mz-v7e1t0baw"&gt;Samsung 970 Evo 1 TB M.2-2280 NVME Solid State Drive&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$169.99 @ Amazon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Video Card&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/Ybvqqs/asus-geforce-rtx-2060-super-8-gb-turbo-evo-video-card-turbo-rtx2060s-8g-evo"&gt;Asus GeForce RTX 2060 SUPER 8 GB Turbo EVO Video Card&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$409.99 @ B&amp;amp;H&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Case&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/LRrG3C/fractal-design-case-fdcadefxlr2bl"&gt;Fractal Design Define XL R2 (Black Pearl) ATX Full Tower Case&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$158.99 @ Amazon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Power Supply&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/7xTPxr/rosewill-1600w-80-gold-certified-semi-modular-atx-power-supply-hercules-1600s"&gt;Rosewill 1600 W 80+ Gold Certified Semi-modular ATX Power Supply&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$189.99 @ Newegg&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;em&gt;Prices include shipping, taxes, rebates, and discounts&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$1640.83&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://pcpartpicker.com/list/s4kgCL"&gt;PCPartPicker Part List&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CPU&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/cRDzK8/amd-threadripper-1920x-35ghz-12-core-processor-yd192xa8aewof"&gt;AMD Threadripper 1920X 3.5 GHz 12-Core Processor&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$199.99 @ Amazon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CPU Cooler&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/nCNypg/noctua-nh-u14s-tr4-sp3-1402-cfm-cpu-cooler-nh-u14s-tr4-sp3"&gt;Noctua NH-U14S TR4-SP3 82.52 CFM CPU Cooler&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$79.90 @ Amazon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Motherboard&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/kydxFT/gigabyte-x399-aorus-pro-atx-tr4-motherboard-x399-aorus-pro"&gt;Gigabyte X399 AORUS PRO ATX sTR4 Motherboard&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$279.99 @ Amazon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/t7QG3C/crucial-ballistix-sport-at-32-gb-4-x-8-gb-ddr4-3200-memory-bls4k8g4d32aestk"&gt;Crucial Ballistix Sport AT 32 GB (4 x 8 GB) DDR4-3200 Memory&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$151.99 @ Newegg&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Storage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/JLdxFT/samsung-970-evo-10tb-m2-2280-solid-state-drive-mz-v7e1t0baw"&gt;Samsung 970 Evo 1 TB M.2-2280 NVME Solid State Drive&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$169.99 @ Amazon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Video Card&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/vtqhP6/asus-geforce-rtx-2080-ti-11gb-turbo-video-card-turbo-rtx2080ti-11g"&gt;Asus GeForce RTX 2080 Ti 11 GB Turbo Video Card&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$1149.99 @ Amazon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Case&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/LRrG3C/fractal-design-case-fdcadefxlr2bl"&gt;Fractal Design Define XL R2 (Black Pearl) ATX Full Tower Case&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$158.99 @ Amazon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Power Supply&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://pcpartpicker.com/product/7xTPxr/rosewill-1600w-80-gold-certified-semi-modular-atx-power-supply-hercules-1600s"&gt;Rosewill 1600 W 80+ Gold Certified Semi-modular ATX Power Supply&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;$189.99 @ Newegg&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;em&gt;Prices include shipping, taxes, rebates, and discounts&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$2380.83&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;Once you have your parts picked out, it's easy to use the links provided by PC Part Picker to order them all.  Once your parts arrive, putting them together is fairly simple.  I suggest watching a few videos on YouTube to get a good idea of what to do; this video by &lt;a href="https://www.youtube.com/watch?v=tj-b7A4_5nU"&gt;Linus Tech Tips&lt;/a&gt; is fairly informative.  Put the parts together, install Linux, and enjoy your custom machine learning box!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/leading-edje"&gt;&lt;br&gt;
  &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SfUhPiEd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/5uo60qforg9yqdpgzncq.png" alt="Smart EDJE Image"&gt;&lt;br&gt;
&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>hardware</category>
    </item>
    <item>
      <title>Getting started with Machine Learning</title>
      <dc:creator>Dan</dc:creator>
      <pubDate>Fri, 13 Mar 2020 19:10:49 +0000</pubDate>
      <link>https://forem.com/leading-edje/getting-started-with-machine-learning-5cnf</link>
      <guid>https://forem.com/leading-edje/getting-started-with-machine-learning-5cnf</guid>
      <description>&lt;p&gt;Artificial Intelligence, Machine Learning and Deep Learning are some of the hottest topics right now and have been experiencing an explosion in popularity and usage.  However there are a lot of unfamiliar terms and subject materials to delve into and it can be hard to figure out where to start.  The first step in getting started can be the most difficult to take and when given too many choices in terms of direction it can often be crippling.  This article will cover some basic terms and offer curated resources to further your learning.  The underlying assumption of this article is that you are not an expert in machine learning or python and need a foundation in these subjects.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is machine learning?
&lt;/h2&gt;

&lt;p&gt;Machine learning is a system that can learn from examples, improving itself without being explicitly programmed.  The main idea is that a machine can learn from data to produce accurate results.  Machine learning typically combines data with statistical tools to predict an output, and it is closely related to data mining and Bayesian predictive modeling.  The machine receives data as input and uses an algorithm to come up with answers.  Note that machine learning is distinct from Artificial Intelligence and Deep Learning, even though the terms are sometimes used interchangeably.&lt;/p&gt;

&lt;h3&gt;
  
  
  Types of learning
&lt;/h3&gt;

&lt;p&gt;Machine learning can be grouped into three broad learning methods:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Supervised Learning: Regression and Classification&lt;/li&gt;
&lt;li&gt;Unsupervised Learning: Clustering and Dimensionality Reduction&lt;/li&gt;
&lt;li&gt;Reinforcement Learning: Real time decisions, Game AI, Navigation, Skill Acquisition&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Supervised Learning
&lt;/h4&gt;

&lt;p&gt;An algorithm uses training data and feedback from humans to learn the relationship of given inputs to an output.  You can use supervised learning when the output data is known. The algorithm will predict or label new data once trained.  &lt;/p&gt;
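&lt;p&gt;The idea can be sketched with a toy example: a minimal 1-nearest-neighbour classifier on made-up data (a real project would use a library such as scikit-learn, but the principle is the same):&lt;/p&gt;

```python
# Toy supervised learning: each training example pairs an input
# (a feature vector) with a known label; a new input gets the label
# of its closest training example.
train = [([1.0, 1.0], "A"), ([1.2, 0.8], "A"),
         ([5.0, 5.0], "B"), ([4.8, 5.2], "B")]

def predict(x):
    """Return the label of the training point closest to x."""
    def sq_dist(p):
        return sum((a - b) ** 2 for a, b in zip(x, p))
    features, label = min(train, key=lambda pair: sq_dist(pair[0]))
    return label

print(predict([1.1, 0.9]))  # lands in the "A" cluster
print(predict([5.1, 4.9]))  # lands in the "B" cluster
```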

&lt;p&gt;There are two main categories of supervised learning:&lt;/p&gt;

&lt;h5&gt;
  
  
  Classification task
&lt;/h5&gt;

&lt;p&gt;An example would be identifying the gender of a customer based on sales or other information gathered about the customer.  A classifier is not limited to just two classes; just about any number is possible.&lt;/p&gt;

&lt;h5&gt;
  
  
  Regression task
&lt;/h5&gt;

&lt;p&gt;When the data output is a continuous value it falls into the regression category.  For instance, a financial analyst may need to forecast the value of a stock based on a range of features such as equity, previous performance, and macroeconomic indexes.  The system will be trained to estimate the price of the stocks with the lowest possible error.&lt;/p&gt;
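&lt;p&gt;A regression task can likewise be sketched in a few lines: an ordinary least-squares line fit on made-up numbers, predicting a continuous value for a new input:&lt;/p&gt;

```python
# Toy regression: fit y = m*x + b by ordinary least squares on
# made-up data, then predict a continuous value for a new x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]   # roughly y = 2x

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
     / sum((x - mean_x) ** 2 for x in xs))
b = mean_y - m * mean_x
print(f"slope={m:.2f}, intercept={b:.2f}, prediction at x=6: {m * 6 + b:.2f}")
```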

&lt;h4&gt;
  
  
  Unsupervised Learning
&lt;/h4&gt;

&lt;p&gt;Unsupervised learning is quite different from supervised learning in that it usually has no predefined output.  The learning agent aims to find structures or patterns in the data.  You can use it when you do not know how to classify the data and want the algorithm to find patterns and classify it for you.&lt;/p&gt;
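&lt;p&gt;A minimal sketch of the idea, using k-means with two clusters on made-up one-dimensional data (no labels are provided; the algorithm discovers the two groups on its own):&lt;/p&gt;

```python
# Toy unsupervised learning: k-means with k=2 on unlabeled 1-D data.
data = [1.0, 1.5, 1.2, 8.0, 8.5, 7.9]
centers = [data[0], data[3]]      # naive initial guesses

for _ in range(10):               # a few refinement passes
    groups = [[], []]
    for x in data:
        # assign each point to its nearest current center
        i = min((0, 1), key=lambda j: abs(x - centers[j]))
        groups[i].append(x)
    # move each center to the mean of its assigned points
    centers = [sum(g) / len(g) for g in groups]

print(f"cluster centers: {centers[0]:.2f} and {centers[1]:.2f}")
```

The two centers settle near the means of the low and high groups, even though nothing in the data said there were two groups.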

&lt;h4&gt;
  
  
  Reinforcement Learning
&lt;/h4&gt;

&lt;p&gt;Reinforcement learning is where the learner receives rewards and punishments for its actions.  The reward could simply be a score, and the agent could be told to earn as high a score as possible in order to “win”.&lt;/p&gt;
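&lt;p&gt;A classic minimal example is the two-armed bandit: an epsilon-greedy agent learns, purely from rewards, which of two options pays off more (the payout probabilities and the 10% exploration rate here are made-up numbers):&lt;/p&gt;

```python
# Toy reinforcement learning: epsilon-greedy two-armed bandit.
# The agent only ever sees rewards, never the hidden payout probabilities.
import random

random.seed(42)
true_p = [0.2, 0.8]        # hidden reward probability of each arm
estimates = [0.0, 0.0]     # agent's learned value of each arm
counts = [0, 0]
epsilon = 0.1              # fraction of steps spent exploring

for _ in range(2000):
    if random.random() < epsilon:
        arm = random.randrange(2)                 # explore a random arm
    else:
        arm = estimates.index(max(estimates))     # exploit the best-known arm
    reward = 1 if random.random() < true_p[arm] else 0
    counts[arm] += 1
    # incremental average: nudge the estimate toward the observed reward
    estimates[arm] += (reward - estimates[arm]) / counts[arm]

print(f"learned values: {estimates[0]:.2f} and {estimates[1]:.2f}")
```

After enough steps the learned values approach the hidden probabilities, and the agent spends most of its pulls on the better arm.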

&lt;h2&gt;
  
  
  Challenges and Limitations of Machine learning
&lt;/h2&gt;

&lt;p&gt;The primary challenge of machine learning is a lack of data, or of diversity in the dataset.  A machine cannot learn if there is no data available, and a dataset needs to be diverse enough to give meaningful insight; an algorithm can rarely extract information when there is little or no variation.  It is recommended to have at least 20 observations per group to help the machine learn.  This need for data is why data is referred to as the new oil.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to start learning Machine Learning?
&lt;/h2&gt;

&lt;p&gt;While a Ph.D. degree is not necessary, you do need at least a basic understanding of some of the math behind machine learning.  Without this understanding you will not be able to feed your algorithms meaningful data.&lt;/p&gt;

&lt;h4&gt;
  
  
  Some basic prerequisites:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;Learn Linear Algebra, Multivariate Calculus, and Probability Theory: All are very important in Machine Learning, though the extent to which you’ll need them depends on how much you want to focus on R&amp;amp;D vs using available libraries.  Even if you plan on using existing libraries you will want at least a basic understanding of Linear Algebra, Multivariate Calculus and Probability Theory.&lt;/li&gt;
&lt;li&gt;Learn Statistics: Data plays the most important role in Machine Learning.  A lot of your time will be spent collecting and cleaning data and since statistics is the field that handles collection, analysis and presentation of data you’ll want a solid foundation in it.&lt;/li&gt;
&lt;li&gt;Learn Python: Python has become the standard language for many of the tools in Machine Learning.  While there are other options such as R and Scala, Python is by far the most popular.  Some of the most important Python libraries for Artificial Intelligence and Machine Learning are &lt;a href="https://keras.io/"&gt;Keras&lt;/a&gt;, &lt;a href="https://www.tensorflow.org/"&gt;TensorFlow&lt;/a&gt;, and &lt;a href="https://scikit-learn.org/stable/"&gt;Scikit-learn&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;
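&lt;p&gt;To give a flavour of what working with these libraries looks like, here is a minimal scikit-learn sketch (it assumes scikit-learn is installed; the bundled iris dataset and the k-nearest-neighbours model are just illustrative choices):&lt;/p&gt;

```python
# Minimal scikit-learn workflow: load a bundled dataset, split it,
# train a classifier, and score it on held-out data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)
print(f"test accuracy: {model.score(X_test, y_test):.2f}")
```

Most scikit-learn estimators follow this same fit/score pattern, which is part of why it is such a friendly place to start.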

&lt;p&gt;Some of these prerequisites are covered in the resources below, but you may need to find other resources to fill in any gaps in your knowledge.&lt;/p&gt;

&lt;h3&gt;
  
  
  Resources for Learning Machine Learning
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Recommended readings:
&lt;/h4&gt;

&lt;p&gt;An excellent book that introduces many Supervised and Unsupervised Learning algorithms and methods: &lt;a href="http://statweb.stanford.edu/~tibs/ElemStatLearn/printings/ESLII_print10.pdf"&gt;&lt;em&gt;Elements of Statistical Learning&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you find &lt;em&gt;Elements of Statistical Learning&lt;/em&gt; beyond your ability to start on right away, you can try starting with &lt;a href="http://faculty.marshall.usc.edu/gareth-james/ISL/ISLR%20Seventh%20Printing.pdf"&gt;&lt;em&gt;An Introduction to Statistical Learning&lt;/em&gt;&lt;/a&gt;&lt;br&gt;
Videos with labs can be found &lt;a href="https://www.r-bloggers.com/in-depth-introduction-to-machine-learning-in-15-hours-of-expert-videos/"&gt;here.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;An excellent book for hands on learning after you have mastered some basic concepts: &lt;em&gt;Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems&lt;/em&gt; by Aurélien Géron&lt;/p&gt;

&lt;p&gt;For those interested in Deep Learning: &lt;a href="https://www.deeplearningbook.org/front_matter.pdf"&gt;&lt;em&gt;The Deep Learning Book&lt;/em&gt;&lt;/a&gt; is an excellent free resource.  You can even find virtual study groups to go through the book together.&lt;/p&gt;

&lt;h4&gt;
  
  
  Recommended courses:
&lt;/h4&gt;

&lt;p&gt;Even though it is getting up there in age and uses Octave instead of Python, I highly recommend &lt;a href="https://www.coursera.org/learn/machine-learning/home/info"&gt;Andrew Ng's Coursera Course&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;Another option is &lt;a href="https://www.udacity.com/course/intro-to-machine-learning-nanodegree--nd229"&gt;Udacity’s Intro to Machine Learning with PyTorch.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you choose Andrew Ng’s course, I do recommend re-implementing all the solutions in Python after doing them in Octave and making sure you get the same answers.&lt;/p&gt;

&lt;p&gt;If you have basic Python skills and an interest in deep learning, another option is the Winter 2016 CS231n course from Stanford.  The &lt;a href="https://www.youtube.com/playlist?list=PLkt2uSq6rBVctENoVBg1TpCC7OQi31AlC"&gt;lectures&lt;/a&gt; are top notch, the &lt;a href="http://cs231n.github.io/"&gt;course notes&lt;/a&gt; are incredibly detailed, and the homework assignments really reinforce the lessons.  It progresses from traditional statistical machine learning methods to convolutional and recurrent neural networks, and it is recent enough that everything taught is still mostly relevant.&lt;/p&gt;

&lt;h4&gt;
  
  
  For Fun and Resume Filler:
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://www.kaggle.com"&gt;Kaggle&lt;/a&gt; Competitions&lt;/p&gt;

&lt;p&gt;Kaggle competitions are a great way to become more proficient in machine learning by combining theoretical knowledge with practical implementation in an approachable format.  Two of the better beginner competitions on Kaggle are:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.kaggle.com/c/titanic"&gt;Titanic: Machine Learning from Disaster:&lt;/a&gt;&lt;br&gt;
The Titanic: Machine Learning from Disaster challenge is a very popular beginner project for Machine Learning and has multiple tutorials available. It is a great introduction to concepts such as data exploration, feature engineering, and model tuning.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.kaggle.com/c/digit-recognizer"&gt;Digit Recognizer:&lt;/a&gt; &lt;br&gt;
The Digit Recognizer is a good project to attempt after you have some knowledge of Python and machine learning basics.  It is a great introduction to the exciting world of neural networks, using a classic dataset that includes pre-extracted features.&lt;/p&gt;

&lt;h4&gt;
  
  
  More Resources:
&lt;/h4&gt;

&lt;p&gt;A website that does a great job breaking down the steps you need to learn machine learning including a lot of useful resources can be found here: &lt;a href="https://machinelearningmastery.com/start-here/#getstarted"&gt;https://machinelearningmastery.com/start-here/#getstarted&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;We covered the very basics of machine learning and went over some resources for learning more.  Like any science, the amount of information to learn and understand can feel overwhelming.  The right resources can make the task seem a lot less daunting, which is why I tried to list a variety covering a range of knowledge levels and backgrounds.  Pick one that seems right for you and give it a try!&lt;br&gt;
&lt;a href="https://dev.to/leading-edje"&gt;&lt;br&gt;
  &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SfUhPiEd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/5uo60qforg9yqdpgzncq.png" alt="Smart EDJE Image"&gt;&lt;br&gt;
&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
