<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: anirudhkannan</title>
    <description>The latest articles on Forem by anirudhkannan (@anirudhkannanvp).</description>
    <link>https://forem.com/anirudhkannanvp</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F386269%2F70384555-1ebc-4e61-9c9a-e6b0a8badccd.jpeg</url>
      <title>Forem: anirudhkannan</title>
      <link>https://forem.com/anirudhkannanvp</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/anirudhkannanvp"/>
    <language>en</language>
    <item>
      <title>A compilation of Different Machine Learning Algorithms/Models for beginners in Data Science Competitions (Kaggle)</title>
      <dc:creator>anirudhkannan</dc:creator>
      <pubDate>Tue, 04 Aug 2020 12:26:48 +0000</pubDate>
      <link>https://forem.com/anirudhkannanvp/a-compilation-of-different-machine-learning-algorithms-models-for-beginners-in-data-science-competitons-kaggle-346n</link>
      <guid>https://forem.com/anirudhkannanvp/a-compilation-of-different-machine-learning-algorithms-models-for-beginners-in-data-science-competitons-kaggle-346n</guid>
      <description>&lt;p&gt;Hello,&lt;/p&gt;

&lt;p&gt;Before reading, I want the reader to know that I am not an expert in data science. I am an SDE by profession, and I have started spending quite a lot of my time on Kaggle and learning about data science in general.&lt;/p&gt;

&lt;p&gt;Here I have compiled a list of ML algorithms frequently used by various Kaggle Grandmasters, so that I can look it up quickly and keep adding to it during future competitions. (This post is just meant to be my cache.)&lt;/p&gt;

&lt;p&gt;If you consider yourself an expert, please skip this post.&lt;/p&gt;

&lt;p&gt;1) &lt;em&gt;Linear Models&lt;/em&gt;&lt;br&gt;
      1. Especially good for sparse, high-dimensional data.&lt;br&gt;
      2. Usually split the given space into two subspaces with a line/hyperplane.&lt;br&gt;
      3. Linear models are usually regularized in competitions, which makes feature scaling during preprocessing important.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Logistic Regression&lt;/li&gt;
&lt;li&gt;Support Vector Machines&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Best Implementations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;scikit-learn&lt;/li&gt;
&lt;li&gt;Vowpal Wabbit&lt;/li&gt;
&lt;/ul&gt;
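
&lt;p&gt;To make the points above concrete, here is a minimal sketch of a regularized logistic regression in plain Python. It is not how scikit-learn or Vowpal Wabbit implement it; the toy data, the learning rate and the l2 strength are all made up for illustration.&lt;/p&gt;

```python
import math

def sigmoid(z):
    # Logistic function: maps any real score to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(points, labels, l2=0.01, lr=0.5, epochs=500):
    # Plain batch gradient descent on the L2-regularized log loss.
    n_features = len(points[0])
    n = len(points)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(points, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(score) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        # The l2 term shrinks the weights toward zero (the regularization).
        weights = [w - lr * (g / n + l2 * w) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict(weights, bias, x):
    # One side of the hyperplane (weights . x + bias = 0) is class 1, the other class 0.
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return round(sigmoid(score))

# Toy data (hypothetical): class 1 when both features are large.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (2.0, 2.0), (2.0, 3.0), (3.0, 2.0)]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(points, labels)
predictions = [predict(weights, bias, x) for x in points]
print(predictions)
```

&lt;p&gt;The learned weights and bias describe a single straight line through the plane, which is exactly the line/hyperplane split mentioned in the list above.&lt;/p&gt;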

&lt;p&gt;2) Tree-Based Methods (use decision trees to build models)&lt;/p&gt;

&lt;p&gt;Here we recursively divide the space into subspaces until each subspace is dominated by a single class.&lt;/p&gt;
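
&lt;p&gt;A single step of that division can be sketched as a decision stump: one axis-parallel cut chosen to make both sides as pure as possible. This is a simplified illustration in plain Python with made-up data, using Gini impurity as the (assumed) purity measure; real libraries add many refinements.&lt;/p&gt;

```python
from collections import Counter
from operator import le  # le(a, b) is the function form of "a is less than or equal to b"

def gini(labels):
    # Gini impurity: 0.0 when every label in the group is the same.
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def best_split(points, labels):
    # Try every feature and every midpoint between consecutive sorted values,
    # keeping the axis-parallel cut with the lowest weighted impurity.
    best = None
    for feature in range(len(points[0])):
        order = sorted(range(len(points)), key=lambda i: points[i][feature])
        for cut in range(1, len(order)):
            low = points[order[cut - 1]][feature]
            high = points[order[cut]][feature]
            if low == high:
                continue  # no threshold can separate equal values
            left = [labels[i] for i in order[:cut]]
            right = [labels[i] for i in order[cut:]]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            candidate = (score, feature, (low + high) / 2.0)
            best = candidate if best is None else min(best, candidate)
    return best

def fit_stump(points, labels):
    # Each side of the chosen cut predicts its majority label.
    _, feature, threshold = best_split(points, labels)
    left = [y for x, y in zip(points, labels) if le(x[feature], threshold)]
    right = [y for x, y in zip(points, labels) if not le(x[feature], threshold)]
    left_label = Counter(left).most_common(1)[0][0]
    right_label = Counter(right).most_common(1)[0][0]
    return feature, threshold, left_label, right_label

def predict_stump(stump, x):
    feature, threshold, left_label, right_label = stump
    return left_label if le(x[feature], threshold) else right_label

# Toy data (hypothetical): the class depends only on the second feature.
points = [(1.0, 1.0), (2.0, 1.5), (3.0, 1.2), (1.5, 4.0), (2.5, 4.5), (3.5, 5.0)]
labels = [0, 0, 0, 1, 1, 1]

stump = fit_stump(points, labels)
print(stump)  # (1, 2.75, 0, 1): cut feature 1 at 2.75
```

&lt;p&gt;A full decision tree repeats the same search inside each side of the cut until the subspaces are pure enough.&lt;/p&gt;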

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Random Forest &lt;/li&gt;
&lt;li&gt;Gradient Boosted Decision Trees (each new tree is fit to improve on the summed predictions of the previous trees)&lt;/li&gt;
&lt;li&gt;ExtraTrees Classifier&lt;/li&gt;
&lt;/ol&gt;
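
&lt;p&gt;The boosting idea in the list above can be sketched on a tiny regression problem: every round fits a one-split tree to what the current sum of trees still gets wrong. This is a simplified illustration with squared error and made-up numbers, not the algorithm used by XGBoost or LightGBM; the learning rate and number of rounds are arbitrary.&lt;/p&gt;

```python
from operator import le  # le(a, b) is the function form of "a is less than or equal to b"

def fit_regression_stump(xs, residuals):
    # Pick the threshold whose two side-means leave the smallest squared error.
    best = None
    values = sorted(set(xs))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2.0
        left = [r for x, r in zip(xs, residuals) if le(x, threshold)]
        right = [r for x, r in zip(xs, residuals) if not le(x, threshold)]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        candidate = (error, threshold, left_mean, right_mean)
        best = candidate if best is None else min(best, candidate)
    return best[1:]

def boost(xs, ys, rounds=20, learning_rate=0.5):
    stumps = []
    predictions = [0.0] * len(xs)
    for _ in range(rounds):
        # Residuals: what the sum of the previous stumps still gets wrong.
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_regression_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        # Add a damped copy of the new stump to the running sum.
        predictions = [
            p + learning_rate * (left_mean if le(x, threshold) else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return stumps, predictions

def squared_error(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions))

# Toy data (hypothetical).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 3.0, 3.1, 5.0, 5.2]

_, after_one = boost(xs, ys, rounds=1)
_, after_many = boost(xs, ys, rounds=20)
print(squared_error(ys, after_one), squared_error(ys, after_many))
```

&lt;p&gt;The training error after 20 rounds is lower than after one round, because each stump only has to correct the leftovers of the ones before it.&lt;/p&gt;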

&lt;p&gt;Disadvantages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Hard to capture a linear dependence when one exists, because every split is parallel to an axis&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Best Implementations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;scikit-learn&lt;/li&gt;
&lt;li&gt;XGBoost&lt;/li&gt;
&lt;li&gt;LightGBM&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;3) k-NN (k-nearest neighbours) methods&lt;/p&gt;

&lt;p&gt;Based on the intuition/assumption that points close to each other have similar labels.&lt;/p&gt;

&lt;p&gt;The best implementations are in scikit-learn.&lt;/p&gt;
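
&lt;p&gt;The nearest-neighbour assumption translates almost directly into code. Below is a minimal sketch in plain Python, assuming Euclidean distance and a simple majority vote; the data and the choice of k are made up for illustration.&lt;/p&gt;

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    # Sort the training points by Euclidean distance to the query and keep the k closest.
    nearest = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    # Majority vote among the neighbours' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two well-separated clusters.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.0), (6.5, 5.5)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (1.2, 1.1)))  # a
print(knn_predict(train_points, train_labels, (6.2, 6.1)))  # b
```

&lt;p&gt;There is no training step here: the model is just the stored data, so prediction cost grows with the size of the training set.&lt;/p&gt;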

&lt;p&gt;4) Neural Networks&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;According to a Kaggle Grandmaster, the most used ones are feed-forward neural networks, which produce smooth non-linear decision boundaries.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Best Implementations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;TensorFlow&lt;/li&gt;
&lt;li&gt;Keras&lt;/li&gt;
&lt;li&gt;MXNet&lt;/li&gt;
&lt;li&gt;PyTorch&lt;/li&gt;
&lt;li&gt;Lasagne&lt;/li&gt;
&lt;/ul&gt;
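
&lt;p&gt;To show what a non-linear decision boundary means, here is a sketch of the forward pass of a tiny feed-forward network in plain Python. The weights are hand-picked (not trained) so that the network computes XOR, a pattern no single line can separate; a real project would train the weights with one of the libraries above.&lt;/p&gt;

```python
import math

def sigmoid(z):
    # Smooth activation: this is what makes the decision boundary smooth.
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    # One fully connected layer followed by a sigmoid activation.
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

# Hand-picked, hypothetical weights: the first hidden unit acts like OR,
# the second like AND, and the output fires for "OR but not AND".
HIDDEN_WEIGHTS = [[10.0, 10.0], [10.0, 10.0]]
HIDDEN_BIASES = [-5.0, -15.0]
OUTPUT_WEIGHTS = [[10.0, -20.0]]
OUTPUT_BIASES = [-5.0]

def forward(x):
    hidden = dense(x, HIDDEN_WEIGHTS, HIDDEN_BIASES)
    return dense(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]

inputs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
outputs = [forward(x) for x in inputs]
print([round(o) for o in outputs])  # [0, 1, 1, 0]
```

&lt;p&gt;The raw outputs are probabilities between 0 and 1 that change gradually as the inputs move, unlike the hard steps produced by a single decision tree.&lt;/p&gt;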

&lt;p&gt;Making Inferences from Decision Surfaces&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;If the boundary is made of lines parallel to the axes and the transitions are smooth, then it's probably a Random Forest&lt;/li&gt;
&lt;/ol&gt;
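
&lt;p&gt;One way to see why averaging trees gives smoother, but still axis-parallel, boundaries is the sketch below. It is not a real Random Forest: the thresholds are hypothetical stand-ins for trees trained on different bootstrap samples of the same data.&lt;/p&gt;

```python
from operator import gt  # gt(a, b) is the function form of "a is greater than b"

# Hypothetical cut points on a single feature, one per tree.
THRESHOLDS = [1.5, 2.0, 2.5, 3.0]

def stump_vote(x, threshold):
    # A single tree makes a hard, axis-parallel cut: 0 below, 1 above.
    return 1 if gt(x, threshold) else 0

def forest_probability(x):
    # Averaging the hard votes turns one sharp step into several small ones.
    return sum(stump_vote(x, t) for t in THRESHOLDS) / len(THRESHOLDS)

surface = [forest_probability(x) for x in (1.0, 1.75, 2.25, 2.75, 3.5)]
print(surface)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

&lt;p&gt;With hundreds of trees the steps become small enough that the surface looks smooth, while every individual cut is still parallel to an axis.&lt;/p&gt;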

&lt;p&gt;Important: Choose a model for a particular competition based on the use case, as no model is better than the others in all situations.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>kaggle</category>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
    <item>
      <title>A Beginner's Guide to Data Science: How I started with Data Science, Kaggle and Machine Learning... Solutions, Tips and much more</title>
      <dc:creator>anirudhkannan</dc:creator>
      <pubDate>Sun, 05 Jul 2020 14:36:13 +0000</pubDate>
      <link>https://forem.com/anirudhkannanvp/a-beginners-guide-to-data-science-how-i-started-with-data-science-kaggle-and-machine-learning-solutions-tips-and-much-more-3okp</link>
      <guid>https://forem.com/anirudhkannanvp/a-beginners-guide-to-data-science-how-i-started-with-data-science-kaggle-and-machine-learning-solutions-tips-and-much-more-3okp</guid>
      <description>&lt;p&gt;This is the first post of a series I am writing to help people get started with Data Science. I am not a highly experienced person in Data Science yet, but I keep learning every day, and I am going to keep publishing content to help people get started, since getting started seems like the toughest part.&lt;/p&gt;

&lt;p&gt;Sometimes I may automatically publish work I have done on Kaggle or other websites directly through a web page parser I wrote with Python, Selenium and Beautiful Soup, so pardon me for any markdown errors. I hope to write a parser one day that learns from its mistakes and creates a perfect Medium or dev.to post. But that still has a long way to go haha :)&lt;/p&gt;

&lt;p&gt;The upcoming series is meant to help newbies in Data Science get through the Kaggle Courses. I will discuss solutions to courses, contests and much more here or in upcoming posts.&lt;/p&gt;

&lt;p&gt;Pardon me for any mistakes, since I am new to publishing written content, but I hope to create high-quality work, as I always believe in the highest standards of work.&lt;/p&gt;

&lt;p&gt;Please follow me if you like my content, and feel free to reach out to me with any comments/feedback. I always value comments from anyone, irrespective of experience, as I believe all humans are equal. Let's all grow as humans and create a better world :)&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>ai</category>
      <category>machinelearning</category>
      <category>kaggle</category>
    </item>
  </channel>
</rss>
