<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Dave Parr</title>
    <description>The latest articles on Forem by Dave Parr (@daveparr).</description>
    <link>https://forem.com/daveparr</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F150692%2F22b3fd57-c859-4087-897b-f63d034fa359.jpeg</url>
      <title>Forem: Dave Parr</title>
      <link>https://forem.com/daveparr</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/daveparr"/>
    <language>en</language>
    <item>
      <title>Moving Starpilot to GraphQL</title>
      <dc:creator>Dave Parr</dc:creator>
      <pubDate>Tue, 16 Jan 2024 00:00:00 +0000</pubDate>
      <link>https://forem.com/daveparr/moving-starpilot-to-graphql-4p8n</link>
      <guid>https://forem.com/daveparr/moving-starpilot-to-graphql-4p8n</guid>
      <description>&lt;p&gt;After building &lt;a href="https://github.com/DaveParr/starpilot"&gt;Starpilot&lt;/a&gt; at the end of last year and using it for a little while, I became pretty frustrated with the time it took to create the vectorstore. Specifically, the time it took to read the relevant repo data from GitHub’s REST API via &lt;a href="https://github.com/PyGithub/PyGithub"&gt;PyGithub&lt;/a&gt;. I &lt;a href="https://www.daveparr.info/blog/copilot-for-your-github-stars-1cep/"&gt;outlined the choices I made&lt;/a&gt; in the original blog post, but to summarise: &lt;code&gt;PyGithub&lt;/code&gt; wraps the GitHub REST API and is used to read the repos that are starred by a user on GitHub, and then pass this list to a function that iterates over &lt;em&gt;each&lt;/em&gt; repo for &lt;em&gt;each&lt;/em&gt; piece of data I want to extract. I’m sure that for many (most) use cases this isn’t an issue, but for getting 9 pieces of data for lists of sometimes hundreds or even thousands of repos, it’s the wrong tool for the job. Each call has overhead: setting up the connection, executing the call, and waiting for the response to come back. The total number of calls is roughly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;9[pieces of information] * 1000[number of repos] + 1[list of user starred repos]

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is slow. To be fair, I wasn’t surprised by this. I actually expected it and just accepted it so I could get on with making the tool somewhat useful. Now that I know it is useful, though, it’s time to optimise. For my list of 800 starred repos it takes about 20 minutes on my desktop. Luckily, I’ve found a way to speed this process up 10x, down to 2 minutes for the same list on the same machine, using GraphQL.&lt;/p&gt;
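
&lt;p&gt;As a rough sketch of that arithmetic, the snippet below compares the estimated number of REST calls with the number of GraphQL pages needed for the same list. The function names are made up for illustration, and the page size of 100 is the limit on repos per GraphQL call that comes up later in this post.&lt;/p&gt;

```python
import math


def rest_calls(repos, fields_per_repo=9):
    # One call per piece of data per repo, plus one call to list the stars.
    return fields_per_repo * repos + 1


def graphql_calls(repos, page_size=100):
    # One call per page of results; every field comes back in the same call.
    return math.ceil(repos / page_size)


for repos in (800, 1000):
    print(repos, rest_calls(repos), graphql_calls(repos))
```

&lt;p&gt;This only counts calls. It says nothing about how long each call takes, so treat it as an illustration of the scale of the difference rather than a timing model.&lt;/p&gt;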

&lt;h2&gt;
  
  
  How I did it
&lt;/h2&gt;

&lt;h3&gt;
  
  
  GraphQL
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://graphql.org/"&gt;GraphQL&lt;/a&gt; is a query language for APIs and a runtime for fulfilling those queries with your existing data. It provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools. GraphQL is perfect for the needs of this project as:&lt;/p&gt;

&lt;h3&gt;
  
  
  GitHub GraphQL Explorer
&lt;/h3&gt;

&lt;p&gt;GitHub provides a &lt;a href="https://docs.github.com/en/graphql"&gt;GraphQL API&lt;/a&gt;. The GitHub GraphQL API allows you to structure a query for a specific set of information for a repo, then use that structure to return the same set of structured information for each repo that a specific user has starred, all in one call. GitHub also provides a &lt;a href="https://docs.github.com/en/graphql/overview/explorer"&gt;GraphQL explorer&lt;/a&gt; to help you build your queries. However, after a while I found this limiting, mostly because the page renders so that only around a third of the screen is the playground, with a mad amount of space for the header, footer and sidebar. Therefore I installed:&lt;/p&gt;

&lt;h3&gt;
  
  
  Insomnia
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://dev.to/scottw/insomnia-rest-client-578d-temp-slug-9682618"&gt;Insomnia&lt;/a&gt; is a desktop editor for developing APIs. I liked it because the workspace is more fully featured: it lets you save queries and sync them to your cloud account for storage, and it handles authentication and supplying variables. Because part of the GraphQL specification is that each API can be &lt;a href="https://graphql.org/learn/introspection/"&gt;&lt;em&gt;introspected&lt;/em&gt;&lt;/a&gt;, it also behaved exactly like the GitHub Explorer and provided autocomplete, documentation and error linting for the queries I was writing. This was a huge help in getting the queries right. GitHub’s GraphQL interface is HUUUUGE. With Insomnia I was able to build up the query I needed in stages, testing each part as I went (Insomnia the app, not the sleep problem). However, I still needed to integrate this into my code. Most tutorials seem to suggest that the thing to do is just take the query, wrap it in &lt;code&gt;""" myQuery here"""&lt;/code&gt; and pass it to &lt;code&gt;requests&lt;/code&gt;, but that seemed like a perfect way to accidentally break 90 lines of code with an unlintable typo. I wanted to use a Python library that would allow me to build the query in a more structured way that is harder to break by accident, and also allow me to pass variables to the query. I found:&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;graphql-query&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/denisart/graphql-query"&gt;denisart/graphql-query&lt;/a&gt; is a Python package to build GraphQL queries from the core abstract pieces of the GraphQL language. I found this really useful as I learnt more about GraphQL and how its mental model is formed of fields with potentially optional arguments, nodes, edges and fragments. It also made editing directly in VS Code a lot easier as it supports type hinting, so you can’t put an &lt;code&gt;Argument&lt;/code&gt; or &lt;code&gt;Field&lt;/code&gt; in the wrong place. As I iterated over the last tweaks to the query I found I actually didn’t need to go back to Insomnia as much. However, &lt;code&gt;graphql-query&lt;/code&gt; doesn’t actually act as a client to the GraphQL API, &lt;a href="https://github.com/DaveParr/starpilot/blob/5c688eb70727d860ca9a0659f20fda988b5b686a/starpilot/utils/utils.py#L61-L138"&gt;it just builds the query&lt;/a&gt;. For that I needed:&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;gql&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/graphql-python/gql"&gt;gql&lt;/a&gt; is a Python library for handling GraphQL queries in Python through a client. This became particularly useful as a (reasonable) limitation of GitHub’s GraphQL API is that when returning a &lt;code&gt;repo&lt;/code&gt; type node, you can only return up to 100 per call. Therefore I needed to paginate through the results. I would have liked it if &lt;code&gt;gql&lt;/code&gt; had a way to handle this for me, but I couldn’t find it if it does, so I wrote a small bit of simple logic to handle that. The slightly tricky thing about pagination on this node is that you need to handle it with a GraphQL &lt;code&gt;cursor&lt;/code&gt;. This is a string that is returned with each node that you can use to tell the API where to start the next page of results from. I found that the easiest way to handle this was &lt;a href="https://github.com/DaveParr/starpilot/blob/5c688eb70727d860ca9a0659f20fda988b5b686a/starpilot/utils/utils.py#L172-L184"&gt;to use a &lt;code&gt;while&lt;/code&gt; loop&lt;/a&gt; and pass the cursor back to the query as a variable. When the query returns nothing for the last cursor, the loop breaks, and the results are returned. I did debate a more complex approach that involved counting the number of expected repos first, dividing by 100, and then using that to determine the number of pages to request, but I decided that was overkill for my use case.&lt;/p&gt;
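
&lt;p&gt;A minimal sketch of that loop is below. It is not the code from the linked commit: &lt;code&gt;fetch_page&lt;/code&gt; is a hypothetical stand-in that serves fake data so the example runs without a network connection, and in real code it would wrap one call to the GraphQL client with the cursor passed as a variable.&lt;/p&gt;

```python
def fetch_page(cursor, page_size=100):
    """Hypothetical stand-in for one GraphQL call; returns one page of fake data."""
    all_repos = [{"name": f"repo-{i}", "cursor": f"c{i}"} for i in range(250)]
    # Start just after the node whose cursor we were given (or at the beginning).
    start = 0 if cursor is None else int(cursor[1:]) + 1
    return all_repos[start:start + page_size]


def read_all_starred():
    repos = []
    cursor = None
    while True:
        page = fetch_page(cursor)
        if not page:
            # An empty page means the last cursor pointed at the final repo.
            break
        repos.extend(page)
        # Remember the cursor of the last node so the next page starts after it.
        cursor = page[-1]["cursor"]
    return repos


starred = read_all_starred()
print(len(starred))
```

&lt;p&gt;The trade-off of stopping on an empty page is one extra call at the end, in exchange for not needing to know the total number of repos up front.&lt;/p&gt;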

&lt;h2&gt;
  
  
  Other tweaks
&lt;/h2&gt;

&lt;p&gt;Because the data that is returned from the API comes as a dict for each repo, I modified the old function that iterated over each repo to get all the data for a specific repo. This function also contained the logic to write the data to disk for iteration over by &lt;code&gt;langchain&lt;/code&gt;, so it seemed a suitable change. As I’d also learnt some more about &lt;code&gt;jq&lt;/code&gt; and how it is used in &lt;code&gt;langchain&lt;/code&gt;, I realised that instead of creating a &lt;code&gt;json&lt;/code&gt; file with 2 dicts, one for &lt;code&gt;metadata {...}&lt;/code&gt; and one for &lt;code&gt;content {...}&lt;/code&gt;, I could just &lt;a href="https://github.com/DaveParr/starpilot/blob/5c688eb70727d860ca9a0659f20fda988b5b686a/starpilot/utils/utils.py#L193-L221"&gt;write the data structure as a single dict&lt;/a&gt;, and then use &lt;code&gt;jq&lt;/code&gt; to &lt;a href="https://github.com/DaveParr/starpilot/blob/5c688eb70727d860ca9a0659f20fda988b5b686a/starpilot/utils/utils.py#L256-L277"&gt;extract the data I needed&lt;/a&gt; via the optional &lt;code&gt;metadata_func&lt;/code&gt; argument in &lt;code&gt;langchain.document_loaders.JSONLoader&lt;/code&gt;. I got tripped up for a while as I had mistakenly written a behaviour into &lt;code&gt;metadata_func&lt;/code&gt; that effectively &lt;em&gt;re-imputed&lt;/em&gt; a &lt;code&gt;None&lt;/code&gt; value into the &lt;code&gt;description&lt;/code&gt; value &lt;em&gt;after&lt;/em&gt; the &lt;code&gt;format_repo()&lt;/code&gt; function had taken the value &lt;em&gt;and&lt;/em&gt; key out of the file saved to disk, which was then passed to &lt;code&gt;Chroma&lt;/code&gt; via the &lt;code&gt;JSONLoader&lt;/code&gt;.
I managed to debug that, however, and then I implemented the suggestion from the helpful error message to use &lt;a href="https://github.com/DaveParr/starpilot/blob/5c688eb70727d860ca9a0659f20fda988b5b686a/starpilot/utils/utils.py#L295"&gt;&lt;code&gt;langchain.vectorstores.utils.filter_complex_metadata()&lt;/code&gt;&lt;/a&gt; as a (slightly redundant) safeguard. I’d already got it wrong once so…&lt;/p&gt;
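
&lt;p&gt;To illustrate the kind of mistake described above, here is a small sketch of a &lt;code&gt;metadata_func&lt;/code&gt; that avoids it. It takes the record and the metadata dict and returns the metadata, which is the shape &lt;code&gt;JSONLoader&lt;/code&gt; expects, but the key names are assumptions for the example and this is not the function from the linked commit.&lt;/p&gt;

```python
def metadata_func(record, metadata):
    # Copy over only the keys that are present and not None, so a missing
    # description is never written back into the metadata as None.
    for key in ("name", "url", "description"):
        value = record.get(key)
        if value is not None:
            metadata[key] = value
    return metadata


# A record where the description key was already removed before saving.
record = {"name": "starpilot", "url": "https://github.com/DaveParr/starpilot"}
print(metadata_func(record, {"source": "starpilot.json"}))
```

&lt;p&gt;Writing &lt;code&gt;metadata["description"] = record.get("description")&lt;/code&gt; unconditionally would put the &lt;code&gt;None&lt;/code&gt; straight back in, which is the behaviour to avoid.&lt;/p&gt;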

&lt;h2&gt;
  
  
  The tweaks that didn’t make it
&lt;/h2&gt;

&lt;p&gt;A version of the GraphQL query that I used originally went the whole hog and also returned the &lt;code&gt;README.md/README.rst&lt;/code&gt; from the repo. I’m certain that many of the readmes have great context that will enhance the embedding in the vectorstore, however I found that returning all that data significantly eroded the performance gains. So much so that, though I had sped up reading the data from GitHub 10x, I was now spending 10x longer generating the embedding via &lt;code&gt;langchain.embeddings.GPT4AllEmbeddings&lt;/code&gt;. I know that the OpenAI embedding model seems to be much faster from some trials I’ve done, but I’m not (yet) happy to migrate over as it increases the cost of running &lt;code&gt;starpilot&lt;/code&gt;. The cost isn’t high enough to make it unviable (probably still less than a cup of coffee), but I’m not sure I’m ready to commit to that yet. Maybe as I work towards building an agent architecture for &lt;code&gt;starpilot&lt;/code&gt; I’ll revisit this. When I do, I’ll probably implement a way to summarise the use case from the readmes and strip out the things like installation instructions. That summary would then be the part used in the embedding.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Using GitHub’s GraphQL API has allowed me to &lt;a href="https://github.com/DaveParr/starpilot/commit/5c688eb70727d860ca9a0659f20fda988b5b686a"&gt;10x the speed of reading the data from GitHub&lt;/a&gt;. Developing for GraphQL is maybe even &lt;em&gt;less&lt;/em&gt; of a hassle than for REST APIs, as the introspection allows you to build complex queries in a more structured way with a better linting and autocomplete driven experience up front. I’m not sure that means &lt;em&gt;all&lt;/em&gt; APIs should be GraphQL. It’s pretty clear the benefits come when the data being returned (or the operations being performed) involve data objects with a huge amount of structure. The other benefit is being able to batch up calls on that structure. The introspection aspects of it also help with learning, developer experience and good documentation, though arguably a chunk of this idea is already available on REST APIs via OpenAPI/Swagger, so it’s not a unique selling point.&lt;/p&gt;

&lt;p&gt;Developing GraphQL in Python also seems pretty robust, with a number of other options for GraphQL clients in the language ecosystem that I didn’t cover here, as well as other tools that I didn’t need for starpilot, like GraphQL servers. I had worried it would be pretty under-tooled, and that client-side GraphQL was a very “JavaScript” thing to do. Luckily that wasn’t the case.&lt;/p&gt;

&lt;p&gt;In the future, either on this project or another, I’d like to explore &lt;code&gt;graphql-query&lt;/code&gt;’s support for fragments to dynamically extend the query render at call time (e.g. to include the readme data in the query based on a boolean flag in the function signature). I’d also maybe look at &lt;a href="https://gql.readthedocs.io/en/latest/advanced/dsl_module.html"&gt;&lt;code&gt;gql&lt;/code&gt;’s support for creating a Domain Specific Language (DSL) for a given GraphQL schema&lt;/a&gt;. I only spotted that after I had finished my work with &lt;code&gt;graphql-query&lt;/code&gt; so I didn’t spend time resolving the same problem, but if I need to write more GraphQL in Python I’ll definitely look at that.&lt;/p&gt;

&lt;p&gt;Hopefully this speed-up will make &lt;a href="https://github.com/DaveParr/starpilot"&gt;&lt;code&gt;starpilot&lt;/code&gt;&lt;/a&gt; more approachable for a lot of people. I know I’ve been irritated in my own development cycle iterating over &lt;code&gt;starpilot read&lt;/code&gt; and waiting for the data to come back, so at the very least I’ve saved myself some time and learnt a bunch in the process.&lt;/p&gt;

</description>
      <category>python</category>
      <category>graphql</category>
      <category>github</category>
      <category>cli</category>
    </item>
    <item>
      <title>How do you use your VSCode profile?</title>
      <dc:creator>Dave Parr</dc:creator>
      <pubDate>Fri, 12 Jan 2024 14:03:32 +0000</pubDate>
      <link>https://forem.com/daveparr/how-do-you-use-your-vscode-profile-1ag1</link>
      <guid>https://forem.com/daveparr/how-do-you-use-your-vscode-profile-1ag1</guid>
      <description>&lt;p&gt;I've been using &lt;a href="https://code.visualstudio.com/docs/editor/profiles"&gt;VS Code profiles&lt;/a&gt; for a while, but I feel like my profiles are turning into a mess. &lt;/p&gt;

&lt;p&gt;I started by just making a Python profile. Then I made a profile from that one named data science. Then I made one with markdown tools and cspell for writing blog posts and note taking. Then I made one for Rust. And one for writing Django projects. Plus there's the built-in default one.&lt;/p&gt;

&lt;p&gt;The problems I'm now getting are that I wonder why my editor in data science view isn't type checking the way I expected it to, and it's because I tweaked those settings in Python, not Data Science. I've also found a great tool, added to my Django profile, that helps with lots of Python things, but I can only add it upwards to my default profile, not to my Python profile or the data science one, without having to switch to that profile, re-find and install it, then switch back to what I was doing.&lt;/p&gt;

&lt;p&gt;So the question is: Have you found a way to manage your profiles that you are happy with? &lt;/p&gt;

&lt;p&gt;I feel the problem is that the profile is a 'leaky abstraction'. It allows a lot of customisability to do what I've done, but that ends up with a muddle of profiles split by language OR task. &lt;/p&gt;

&lt;p&gt;I feel like a neat solution is to adopt profiles of&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;default&lt;/li&gt;
&lt;li&gt;python&lt;/li&gt;
&lt;li&gt;rust&lt;/li&gt;
&lt;li&gt;R&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OR &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;web development&lt;/li&gt;
&lt;li&gt;data science&lt;/li&gt;
&lt;li&gt;writing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What do you all think?&lt;/p&gt;

</description>
      <category>vscode</category>
      <category>microsoft</category>
      <category>ide</category>
      <category>editor</category>
    </item>
    <item>
      <title>Copilot for your GitHub stars</title>
      <dc:creator>Dave Parr</dc:creator>
      <pubDate>Sun, 19 Nov 2023 17:08:15 +0000</pubDate>
      <link>https://forem.com/daveparr/copilot-for-your-github-stars-1cep</link>
      <guid>https://forem.com/daveparr/copilot-for-your-github-stars-1cep</guid>
      <description>&lt;p&gt;How do you use your GitHub stars? &lt;/p&gt;

&lt;p&gt;I'd guess if you've been programming for a few years you've probably hit the star button at the top of a few of your favourite repos. I know some people I follow have done it &lt;em&gt;thousands&lt;/em&gt; of times. Do you go back to them though? Do you review them for inspiration for your next project or go to them when you're stuck on a particular problem?&lt;/p&gt;

&lt;h2&gt;
  
  
  Inspiration
&lt;/h2&gt;

&lt;p&gt;I've always assumed I would use them but I never have. I found myself doing some research recently into how to build software that uses LLMs, with the deliberate goal of building an as yet undefined side-project. I wanted to build something I hadn't built before, something that was hopefully a little original, and maybe even &lt;strong&gt;useful&lt;/strong&gt;! So yet again I was starring repos like LangChain and Chroma, swearing this time would be different. &lt;/p&gt;

&lt;p&gt;As I was running through blog posts and diligently smashing the star buttons I realised that I had just hit on exactly what I wanted to try. I wanted to bring my GitHub stars right into my editor. I wanted to be able to have them next to me as I was working and get a sensible set of suggestions on what might be useful for my needs at that moment, and I had just been starring the exact repos that could make this happen!&lt;/p&gt;

&lt;h2&gt;
  
  
  The original idea
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Use a dataset of your personal stars to inform retrieval augmented generation for a question and answer large language model deployed in a command line interface&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I thought this would be useful for a few reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;By having it in the CLI it's available right in my editor, and to every project.&lt;/li&gt;
&lt;li&gt;By having a set of your personal stars the suggestions are already curated by your interests and preferences. Mine are all Python and R libraries, weird databases and charting libraries. Yours might mostly be Ruby gems, or web frameworks, or tools for embedded systems.&lt;/li&gt;
&lt;li&gt;By using a large language model the tool &lt;em&gt;might&lt;/em&gt; be more capable of understanding the intention behind your goals. For instance, given the query "Suggest how to build a web app" it might be able to infer that you'd likely want a front end component, a backend component and a data storage component, and it might even deal with servers and deployment.&lt;/li&gt;
&lt;li&gt;By using large language models the tool &lt;em&gt;might&lt;/em&gt; be more capable of semantic search rather than keyword matching, which suits this problem as there is no strong standard on how a library describes itself through its topics, description and documentation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Semantic vs keyword
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Keyword
&lt;/h4&gt;

&lt;p&gt;A keyword search looks for the exact letters in a string, or potentially a partial match. As an example the query &lt;code&gt;"Data Science"&lt;/code&gt; would find things that exactly matched the characters in the string &lt;code&gt;"Data Science"&lt;/code&gt; and maybe also &lt;code&gt;["Data", "Science", "DS"]&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Semantic
&lt;/h4&gt;

&lt;p&gt;A semantic search looks for the &lt;em&gt;conceptual&lt;/em&gt; similarity between things, so in this context &lt;code&gt;"Data Science"&lt;/code&gt; would find things that matched the vector embedding of &lt;code&gt;"Data Science"&lt;/code&gt; and maybe also things whose vector embeddings are close to it, such as &lt;code&gt;["Machine Learning", "Artificial Intelligence"]&lt;/code&gt;.&lt;/p&gt;
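
&lt;p&gt;The difference can be sketched with a toy example. The three-dimensional vectors below are hand-written for illustration only (real embeddings come from a model and have far more dimensions), and the function names are hypothetical.&lt;/p&gt;

```python
import math

# Toy, hand-written 3-dimensional "embeddings"; real ones come from a model.
embeddings = {
    "Data Science": [0.9, 0.8, 0.1],
    "Machine Learning": [0.8, 0.9, 0.2],
    "Web Framework": [0.1, 0.2, 0.9],
}


def keyword_search(query, documents):
    # Case-insensitive substring match on the text itself.
    return [doc for doc in documents if query.lower() in doc.lower()]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_search(query, vectors, top_k=2):
    # Rank every document by how close its vector is to the query's vector.
    ranked = sorted(
        vectors,
        key=lambda doc: cosine_similarity(vectors[query], vectors[doc]),
        reverse=True,
    )
    return ranked[:top_k]


print(keyword_search("Data Science", embeddings))
print(semantic_search("Data Science", embeddings))
```

&lt;p&gt;With these made-up vectors the keyword search only finds the literal match, while the semantic search also surfaces the conceptually nearby entry.&lt;/p&gt;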

&lt;p&gt;And so I ran &lt;code&gt;poetry new starpilot&lt;/code&gt;:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/DaveParr"&gt;
        DaveParr
      &lt;/a&gt; / &lt;a href="https://github.com/DaveParr/starpilot"&gt;
        starpilot
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Use your GitHub stars for great good!
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h2&gt;
Starpilot is like copilot, but for GitHub stars.&lt;/h2&gt;
&lt;p&gt;I've been starring repos for years thinking "This will definitely be useful later".&lt;/p&gt;
&lt;p&gt;However I never really went back to them.&lt;/p&gt;
&lt;p&gt;Starpilot is a retrieval augmented generation CLI tool for rediscovering your GitHub stars.&lt;/p&gt;
&lt;p&gt;Starpilot helps this problem by allowing you to rediscover GitHub repos you had previously starred that are relevant to your current project.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://dev.to/daveparr/copilot-for-your-github-stars-1cep" rel="nofollow"&gt;Here's some more details about the motivation for and state of the project&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;
Installation&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://github.com/badges/stability-badges"&gt;&lt;img src="https://camo.githubusercontent.com/dac819f88a300cef4d60fc98040a4f9d3b4d5a50821a80fc6917639f767da856/687474703a2f2f6261646765732e6769746875622e696f2f73746162696c6974792d6261646765732f646973742f6578706572696d656e74616c2e737667" alt="experimental"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This project is in early development and is not yet available on PyPi&lt;/p&gt;
&lt;/blockquote&gt;
&lt;ol&gt;
&lt;li&gt;Fork repo&lt;/li&gt;
&lt;li&gt;Clone repo&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cd starpilot&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;poetry install&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;You will need to have a .env file with&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a &lt;a href="https://docs.github.com/en/github/authenticating-to-github/creating-a-personal-access-token"&gt;GitHub personal access token&lt;/a&gt; saved to a &lt;code&gt;.env&lt;/code&gt; file in the root of the project. This should have the user&amp;gt; read:user scope permission.&lt;/li&gt;
&lt;li&gt;an &lt;a href="https://platform.openai.com/api-keys" rel="nofollow"&gt;OpenAI API key&lt;/a&gt; saved to a &lt;code&gt;.env&lt;/code&gt; file in the root of the project.&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="snippet-clipboard-content notranslate position-relative overflow-auto"&gt;&lt;pre class="notranslate"&gt;&lt;code&gt;GITHUB_API_KEY="ghp_..."
OPENAI_API_KEY="sk-..."
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/DaveParr/starpilot"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  Why retrieval augmented generation matters
&lt;/h2&gt;

&lt;p&gt;Retrieval Augmented Generation (RAG) is a technique used with large language models to cope with some of the limitations inherent in what are also sometimes referred to as 'foundation' models. &lt;/p&gt;

&lt;p&gt;When a model like GPT-3 is trained, it is fed large amounts of textual data written by humans. These get translated into 'weights' in a neural net. To oversimplify, these weights tell the model what the most likely next text is, given the text it has already been shown. &lt;/p&gt;

&lt;p&gt;However, these models don't know much about what has happened recently, what other programming resources really exist rather than what just sounds like it should exist, or where to exactly get a specific repo or webpage.&lt;/p&gt;

&lt;p&gt;Retrieval augmented generation solves this by allowing you to feed the large language model with known real, up to date and relevant information. &lt;/p&gt;

&lt;h2&gt;
  
  
  Vectorstores
&lt;/h2&gt;

&lt;p&gt;A type of database called a vectorstore is commonly used for this because it is deliberately optimised towards a similarity search use case. Vectorstores achieve this in a few ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Vectorstores store what you pass them as a 'vector embedding'. A vector embedding takes data (like text or images) and converts it to a list-like representation of numbers.&lt;/li&gt;
&lt;li&gt;Vectorstores keep similar vector embeddings close together in memory. This means that they are as fast as possible at returning lots of documents that have similar semantic meaning, because they are all clustered together.&lt;/li&gt;
&lt;li&gt;Vectorstores have APIs that are specifically designed for these use cases, with querying methods that lean towards semantic searches more than SQL queries, and loading techniques that integrate tightly into other systems that generate these vector embeddings from large language models.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Designing a system
&lt;/h2&gt;

&lt;p&gt;With this set of goals and new knowledge I got to work working out which puzzle pieces I needed and how to fit them together. This time I did go through my stars (and a few other things), though maybe this is for the last time!&lt;/p&gt;

&lt;p&gt;I figured I could get started using 4 main open source repos. &lt;a href="https://github.com/DaveParr/starpilot/commit/4846bcd43b598d0c5d6fb408fbe2d622076b999f#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711"&gt;My first commit to my pyproject.toml&lt;/a&gt; used these projects:&lt;/p&gt;

&lt;h3&gt;
  
  
  Typer
&lt;/h3&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/tiangolo"&gt;
        tiangolo
      &lt;/a&gt; / &lt;a href="https://github.com/tiangolo/typer"&gt;
        typer
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Typer, build great CLIs. Easy to code. Based on Python type hints.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;p&gt;
  &lt;a href="https://typer.tiangolo.com" rel="nofollow"&gt;&lt;img src="https://camo.githubusercontent.com/254bafa7c1d69068ca7bbe2e596c68af31bec2965481c4258a4c286bd2f3d0fc/68747470733a2f2f74797065722e7469616e676f6c6f2e636f6d2f696d672f6c6f676f2d6d617267696e2f6c6f676f2d6d617267696e2d766563746f722e737667" alt="Typer"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
    &lt;em&gt;Typer, build great CLIs. Easy to code. Based on Python type hints.&lt;/em&gt;
&lt;/p&gt;

&lt;p&gt;
&lt;a href="https://github.com/tiangolo/typer/actions?query=workflow%3ATest"&gt;
    &lt;img src="https://github.com/tiangolo/typer/workflows/Test/badge.svg" alt="Test"&gt;
&lt;/a&gt;
&lt;a href="https://github.com/tiangolo/typer/actions?query=workflow%3APublish"&gt;
    &lt;img src="https://github.com/tiangolo/typer/workflows/Publish/badge.svg" alt="Publish"&gt;
&lt;/a&gt;
&lt;a href="https://coverage-badge.samuelcolvin.workers.dev/redirect/tiangolo/typer" rel="nofollow"&gt;
    &lt;img src="https://camo.githubusercontent.com/6bc03d03f71655f088c9d66616582082bf6f3f912cac2e57250eb148f4c2f13c/68747470733a2f2f636f7665726167652d62616467652e73616d75656c636f6c76696e2e776f726b6572732e6465762f7469616e676f6c6f2f74797065722e737667" alt="Coverage"&gt;
&lt;/a&gt;&lt;a href="https://pypi.org/project/typer" rel="nofollow"&gt;
    &lt;img src="https://camo.githubusercontent.com/e295a21444457ee07fb94f83bc381b0f9fb46ce9a1379dd619e2edb6ca996684/68747470733a2f2f696d672e736869656c64732e696f2f707970692f762f74797065723f636f6c6f723d253233333444303538266c6162656c3d707970692532307061636b616765" alt="Package version"&gt;
&lt;/a&gt;
&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Documentation&lt;/strong&gt;: &lt;a href="https://typer.tiangolo.com" rel="nofollow"&gt;https://typer.tiangolo.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Source Code&lt;/strong&gt;: &lt;a href="https://github.com/tiangolo/typer"&gt;https://github.com/tiangolo/typer&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Typer is a library for building CLI applications that users will &lt;strong&gt;love using&lt;/strong&gt; and developers will &lt;strong&gt;love creating&lt;/strong&gt;. Based on Python 3.6+ type hints.&lt;/p&gt;

&lt;p&gt;The key features are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Intuitive to write&lt;/strong&gt;: Great editor support. Completion everywhere. Less time debugging. Designed to be easy to use and learn. Less time reading docs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Easy to use&lt;/strong&gt;: It's easy to use for the final users. Automatic help, and automatic completion for all shells.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Short&lt;/strong&gt;: Minimize code duplication. Multiple features from each parameter declaration. Fewer bugs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Start simple&lt;/strong&gt;: The simplest example adds only 2 lines of code to your app: &lt;strong&gt;1 import, 1 function call&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Grow large&lt;/strong&gt;: Grow in complexity as much as you want, create arbitrarily complex trees of commands and groups of subcommands, with options and…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/tiangolo/typer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;&lt;code&gt;typer&lt;/code&gt; is a pretty trendy framework for building CLI tools in Python right now. It embraces typing, uses function decorators to magically turn your functions into CLI commands, and has relatively clear documentation.&lt;/p&gt;

&lt;p&gt;I chose &lt;code&gt;typer&lt;/code&gt; specifically because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I wanted to see what the hype was about&lt;/li&gt;
&lt;li&gt;I think typing helps write better code&lt;/li&gt;
&lt;li&gt;I found the documentation really helpful to get started easily&lt;/li&gt;
&lt;/ul&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/langchain-ai"&gt;
        langchain-ai
      &lt;/a&gt; / &lt;a href="https://github.com/langchain-ai/langchain"&gt;
        langchain
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      ⚡ Building applications with LLMs through composability ⚡
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
🦜️🔗 LangChain&lt;/h1&gt;
&lt;p&gt;⚡ Building applications with LLMs through composability ⚡&lt;/p&gt;
&lt;p&gt;Looking for the JS/TS library? Check out &lt;a href="https://github.com/langchain-ai/langchainjs"&gt;LangChain.js&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To help you ship LangChain apps to production faster, check out &lt;a href="https://smith.langchain.com" rel="nofollow"&gt;LangSmith&lt;/a&gt;.
&lt;a href="https://smith.langchain.com" rel="nofollow"&gt;LangSmith&lt;/a&gt; is a unified developer platform for building, testing, and monitoring LLM applications.
Fill out &lt;a href="https://airtable.com/appwQzlErAS2qiP0L/shrGtGaVBVAz7NcV2" rel="nofollow"&gt;this form&lt;/a&gt; to get off the waitlist or speak with our sales team.&lt;/p&gt;
&lt;h2&gt;
Quick Install&lt;/h2&gt;
&lt;p&gt;With pip:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;pip install langchain&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;With conda:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;conda install langchain -c conda-forge&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
🤔 What is LangChain?&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;LangChain&lt;/strong&gt; is a framework for developing applications powered by language models. It enables applications that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Are context-aware&lt;/strong&gt;: connect a language model to sources of context (prompt instructions, few shot examples, content to ground its response in, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reason&lt;/strong&gt;: rely on a language model to reason (about how to answer based on provided context, what actions to take, etc.)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This framework consists of several parts.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LangChain Libraries&lt;/strong&gt;: The Python and…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/langchain-ai/langchain"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;&lt;code&gt;langchain&lt;/code&gt; is the most mature and widely adopted large language model orchestration framework. Langchain itself doesn't supply you with any specific LLM, vector store, or embedding approach. Instead, it is deliberately 'vendor agnostic'. It provides a common set of APIs and abstractions across a staggering number of vector databases, large language models, and embedding engines.&lt;/p&gt;
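&lt;p&gt;The 'vendor agnostic' idea can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not Langchain's actual classes: &lt;code&gt;Embedder&lt;/code&gt;, &lt;code&gt;LengthEmbedder&lt;/code&gt;, &lt;code&gt;VowelEmbedder&lt;/code&gt; and &lt;code&gt;index_documents&lt;/code&gt; are hypothetical names, and the two toy 'vendors' stand in for real embedding engines.&lt;/p&gt;

```python
from typing import Protocol


class Embedder(Protocol):
    """Anything that can turn text into a vector (hypothetical interface)."""

    def embed(self, text: str) -> list[float]: ...


class LengthEmbedder:
    """Toy 'vendor' A: embeds text as [character count, word count]."""

    def embed(self, text: str) -> list[float]:
        return [float(len(text)), float(len(text.split()))]


class VowelEmbedder:
    """Toy 'vendor' B: embeds text as one count per vowel."""

    def embed(self, text: str) -> list[float]:
        return [float(text.lower().count(vowel)) for vowel in "aeiou"]


def index_documents(docs: list[str], embedder: Embedder) -> dict[str, list[float]]:
    """Application code depends only on the interface, not on a vendor."""
    return {doc: embedder.embed(doc) for doc in docs}
```

&lt;p&gt;Swapping &lt;code&gt;LengthEmbedder()&lt;/code&gt; for &lt;code&gt;VowelEmbedder()&lt;/code&gt; changes the vectors but not the calling code, which is roughly the freedom I wanted while I was still unsure which suppliers to use.&lt;/p&gt;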

&lt;p&gt;I chose &lt;code&gt;langchain&lt;/code&gt; because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It is the most established tool in a brand new space&lt;/li&gt;
&lt;li&gt;I wasn't really sure which suppliers of vectorstores and large language models made the most sense for my use case&lt;/li&gt;
&lt;li&gt;I found the documentation really helpful to get started&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Chroma
&lt;/h3&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/chroma-core"&gt;
        chroma-core
      &lt;/a&gt; / &lt;a href="https://github.com/chroma-core/chroma"&gt;
        chroma
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      the AI-native open-source embedding database
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;p&gt;
  &lt;a href="https://trychroma.com" rel="nofollow"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--gkmeyorA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://user-images.githubusercontent.com/891664/227103090-6624bf7d-9524-4e05-9d2c-c28d5d451481.png" alt="Chroma logo"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
    &lt;b&gt;Chroma - the open-source embedding database&lt;/b&gt;. &lt;br&gt;
    The fastest way to build Python or JavaScript LLM apps with memory
&lt;/p&gt;

&lt;p&gt;
  &lt;a href="https://discord.gg/MMeYNTmh3x" rel="nofollow"&gt;
      &lt;img src="https://camo.githubusercontent.com/8027d5b46f7a8c527294708e4e86fec5ef1098462cf841e4254ade5d9d194f5c/68747470733a2f2f696d672e736869656c64732e696f2f646973636f72642f31303733323933363435333033373935373432" alt="Discord"&gt;
  &lt;/a&gt; |
  &lt;a href="https://github.com/chroma-core/chroma/blob/master/LICENSE"&gt;
      &lt;img src="https://camo.githubusercontent.com/33347befc196140f65bf772e62f7716eac377242c64f220f08d7d55e528b9431/68747470733a2f2f696d672e736869656c64732e696f2f7374617469632f76313f6c6162656c3d6c6963656e7365266d6573736167653d41706163686520322e3026636f6c6f723d7768697465" alt="License"&gt;
  &lt;/a&gt; |
  &lt;a href="https://docs.trychroma.com/" rel="nofollow"&gt;
      Docs
  &lt;/a&gt; |
  &lt;a href="https://www.trychroma.com/" rel="nofollow"&gt;
      Homepage
  &lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
  &lt;a href="https://github.com/chroma-core/chroma/actions/workflows/chroma-integration-test.yml"&gt;
    &lt;img src="https://github.com/chroma-core/chroma/actions/workflows/chroma-integration-test.yml/badge.svg?branch=main" alt="Integration Tests"&gt;
  &lt;/a&gt; |
  &lt;a href="https://github.com/chroma-core/chroma/actions/workflows/chroma-test.yml"&gt;
    &lt;img src="https://github.com/chroma-core/chroma/actions/workflows/chroma-test.yml/badge.svg?branch=main" alt="Tests"&gt;
  &lt;/a&gt;
&lt;/p&gt;

&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;pip install chromadb &lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; python client&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; for javascript, npm install chromadb!&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; for client-server mode, chroma run --path /chroma_db_path&lt;/span&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;The core API is only 4 functions (run our &lt;a href="https://colab.research.google.com/drive/1QEzFyqnoFxq7LUGyP1vzR4iLt9PpCDXv?usp=sharing" rel="nofollow"&gt;💡 Google Colab&lt;/a&gt; or &lt;a href="https://replit.com/@swyx/BasicChromaStarter?v=1" rel="nofollow"&gt;Replit template&lt;/a&gt;):&lt;/p&gt;
&lt;div class="highlight highlight-source-python notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;chromadb&lt;/span&gt;
&lt;span class="pl-c"&gt;# setup Chroma in-memory, for easy prototyping. Can add persistence easily!&lt;/span&gt;
&lt;span class="pl-s1"&gt;client&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;chromadb&lt;/span&gt;.&lt;span class="pl-v"&gt;Client&lt;/span&gt;()
&lt;span class="pl-c"&gt;# Create collection. get_collection, get_or_create_collection, delete_collection also available!&lt;/span&gt;
&lt;span class="pl-s1"&gt;collection&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;client&lt;/span&gt;.&lt;span class="pl-en"&gt;create_collection&lt;/span&gt;(&lt;span class="pl-s"&gt;"all-my-documents"&lt;/span&gt;)

&lt;span class="pl-c"&gt;# Add docs to the collection. Can also update and delete. Row-based API coming soon!&lt;/span&gt;
&lt;span class="pl-s1"&gt;collection&lt;/span&gt;.&lt;span class="pl-en"&gt;add&lt;/span&gt;(
    &lt;span class="pl-s1"&gt;documents&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;[&lt;span class="pl-s"&gt;"This is document1"&lt;/span&gt;, &lt;span class="pl-s"&gt;"This is document2"&lt;/span&gt;], &lt;span class="pl-c"&gt;# we handle tokenization, embedding, and indexing automatically. You can skip that and add your own embeddings as well&lt;/span&gt;
    &lt;span class="pl-s1"&gt;metadatas&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;[{&lt;span class="pl-s"&gt;"source"&lt;/span&gt;: &lt;/pre&gt;…
&lt;/div&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/chroma-core/chroma"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;&lt;code&gt;chroma&lt;/code&gt; is a vectorstore that has great support from Langchain. There are many others as well, but Chroma won out at this stage because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I can run &lt;code&gt;chroma&lt;/code&gt; as an 'embedded' data store, i.e. it runs locally on the user's machine&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;chroma&lt;/code&gt; was the most often used vectorstore in the Langchain docs for RAG tasks&lt;/li&gt;
&lt;li&gt;It was so trivially easy to set up that, reading the tutorials, I was convinced they had to have made a mistake&lt;/li&gt;
&lt;/ul&gt;
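
&lt;p&gt;For a feel of what an 'embedded' vectorstore means, here is a toy in-process store that ranks documents by cosine similarity. It is a sketch under my own assumptions and not Chroma's API or implementation: &lt;code&gt;TinyVectorStore&lt;/code&gt; and &lt;code&gt;cosine_similarity&lt;/code&gt; are hypothetical names, and the embeddings are supplied by hand rather than by an embedding engine.&lt;/p&gt;

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Hypothetical in-process store: all data lives inside this Python object."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, list[float]]] = []

    def add(self, document: str, embedding: list[float]) -> None:
        """Store a document alongside its precomputed embedding."""
        self._rows.append((document, embedding))

    def query(self, embedding: list[float], n_results: int = 1) -> list[str]:
        """Return the documents whose embeddings are most similar to the query."""
        ranked = sorted(
            self._rows,
            key=lambda row: cosine_similarity(embedding, row[1]),
            reverse=True,
        )
        return [document for document, _ in ranked[:n_results]]
```

&lt;p&gt;There is no server to start and no network hop: the store is created, filled and queried in the same process, which is the property that mattered for a CLI tool running on someone's laptop.&lt;/p&gt;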

&lt;h3&gt;
  
  
  GPT4All
&lt;/h3&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/nomic-ai"&gt;
        nomic-ai
      &lt;/a&gt; / &lt;a href="https://github.com/nomic-ai/gpt4all"&gt;
        gpt4all
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      gpt4all: open-source LLM chatbots that you can run anywhere
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
GPT4All&lt;/h1&gt;
&lt;p&gt;Open-source large language models that run locally on your CPU and nearly any GPU&lt;/p&gt;
&lt;p&gt;
&lt;a href="https://gpt4all.io" rel="nofollow"&gt;GPT4All Website and Models&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
&lt;a href="https://docs.gpt4all.io" rel="nofollow"&gt;GPT4All Documentation&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
&lt;a href="https://discord.gg/mGZE39AS3e" rel="nofollow"&gt;Discord&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
&lt;a href="https://python.langchain.com/en/latest/modules/models/llms/integrations/gpt4all.html" rel="nofollow"&gt;🦜️🔗 Official Langchain Backend&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
GPT4All is made possible by our compute partner &lt;a href="https://www.paperspace.com/" rel="nofollow"&gt;Paperspace&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
  &lt;a rel="noopener noreferrer nofollow" href="https://user-images.githubusercontent.com/13879686/231876409-e3de1934-93bb-4b4b-9013-b491a969ebbc.gif"&gt;&lt;img width="600" height="365" src="https://res.cloudinary.com/practicaldev/image/fetch/s--7bl8XS24--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://user-images.githubusercontent.com/13879686/231876409-e3de1934-93bb-4b4b-9013-b491a969ebbc.gif"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
Run on an M1 macOS Device (not sped up!)
&lt;/p&gt;

&lt;h2&gt;
GPT4All: An ecosystem of open-source on-edge large language models.&lt;/h2&gt;
&lt;div class="markdown-alert markdown-alert-important"&gt;
&lt;p class="markdown-alert-title"&gt;Important&lt;/p&gt;
&lt;p&gt;GPT4All v2.5.0 and newer only supports models in GGUF format (.gguf). Models used with a previous version of GPT4All (.bin extension) will no longer work.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;GPT4All is an ecosystem to run &lt;strong&gt;powerful&lt;/strong&gt; and &lt;strong&gt;customized&lt;/strong&gt; large language models that work locally on consumer grade CPUs and any GPU. Note that your CPU needs to support &lt;a href="https://en.wikipedia.org/wiki/Advanced_Vector_Extensions" rel="nofollow"&gt;AVX or AVX2 instructions&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Learn more in the &lt;a href="https://docs.gpt4all.io" rel="nofollow"&gt;documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A GPT4All model is a 3GB - 8GB file that you can download and plug into the GPT4All open-source ecosystem software. &lt;strong&gt;Nomic AI&lt;/strong&gt; supports and maintains this software ecosystem to…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/nomic-ai/gpt4all"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;&lt;code&gt;gpt4all&lt;/code&gt; provides a set of LLMs and embedding engines that are also well supported by Langchain. &lt;code&gt;gpt4all&lt;/code&gt; was appealing because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It runs locally on 'normal' machines&lt;/li&gt;
&lt;li&gt;It seems well supported and maintained&lt;/li&gt;
&lt;li&gt;It is open about what data it was trained on and what data it will use for future training&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  PyGithub
&lt;/h3&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/PyGithub"&gt;
        PyGithub
      &lt;/a&gt; / &lt;a href="https://github.com/PyGithub/PyGithub"&gt;
        PyGithub
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Typed interactions with the GitHub API v3
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
PyGitHub&lt;/h1&gt;
&lt;p&gt;PyGitHub is a Python library to access the &lt;a href="https://docs.github.com/en/rest"&gt;GitHub REST API&lt;/a&gt;.
This library enables you to manage &lt;a href="https://github.com"&gt;GitHub&lt;/a&gt; resources such as repositories, user profiles, and organizations in your Python applications.&lt;/p&gt;
&lt;h2&gt;
Install&lt;/h2&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;pip install PyGithub&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
Simple Demo&lt;/h2&gt;
&lt;div class="highlight highlight-source-python notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;github&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;Github&lt;/span&gt;

&lt;span class="pl-c"&gt;# Authentication is defined via github.Auth&lt;/span&gt;
&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;github&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;Auth&lt;/span&gt;

&lt;span class="pl-c"&gt;# using an access token&lt;/span&gt;
&lt;span class="pl-s1"&gt;auth&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;Auth&lt;/span&gt;.&lt;span class="pl-v"&gt;Token&lt;/span&gt;(&lt;span class="pl-s"&gt;"access_token"&lt;/span&gt;)

&lt;span class="pl-c"&gt;# First create a Github instance:&lt;/span&gt;

&lt;span class="pl-c"&gt;# Public Web Github&lt;/span&gt;
&lt;span class="pl-s1"&gt;g&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;Github&lt;/span&gt;(&lt;span class="pl-s1"&gt;auth&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;auth&lt;/span&gt;)

&lt;span class="pl-c"&gt;# Github Enterprise with custom hostname&lt;/span&gt;
&lt;span class="pl-s1"&gt;g&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;Github&lt;/span&gt;(&lt;span class="pl-s1"&gt;base_url&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"https://{hostname}/api/v3"&lt;/span&gt;, &lt;span class="pl-s1"&gt;auth&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;auth&lt;/span&gt;)

&lt;span class="pl-c"&gt;# Then play with your Github objects:&lt;/span&gt;
&lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-s1"&gt;repo&lt;/span&gt; &lt;span class="pl-c1"&gt;in&lt;/span&gt; &lt;span class="pl-s1"&gt;g&lt;/span&gt;.&lt;span class="pl-en"&gt;get_user&lt;/span&gt;().&lt;span class="pl-en"&gt;get_repos&lt;/span&gt;():
    &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;repo&lt;/span&gt;.&lt;span class="pl-s1"&gt;name&lt;/span&gt;)

&lt;span class="pl-c"&gt;# To close connections after use&lt;/span&gt;
&lt;span class="pl-s1"&gt;g&lt;/span&gt;.&lt;span class="pl-en"&gt;close&lt;/span&gt;()&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
Documentation&lt;/h2&gt;
&lt;p&gt;More information can be found on the &lt;a href="https://pygithub.readthedocs.io/en/stable/introduction.html" rel="nofollow"&gt;PyGitHub documentation site.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
Development&lt;/h2&gt;
&lt;h3&gt;
Contributing&lt;/h3&gt;
&lt;p&gt;Long-term discussion and bug…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/PyGithub/PyGithub"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Soon after this I realised that &lt;code&gt;PyGithub&lt;/code&gt; would be an easy way to go to GitHub, get the information I needed, and bring it back into &lt;code&gt;starpilot&lt;/code&gt; to load into the vectorstore. I had initially thought I might be able to use the &lt;a href="https://python.langchain.com/docs/integrations/document_loaders/github"&gt;GitHub Document Loader&lt;/a&gt; built into &lt;code&gt;langchain&lt;/code&gt;, but once I sat down to really work it out I realised that it doesn't give access to a user's stars, so I needed an alternative.&lt;/p&gt;
&lt;h2&gt;
  
  
  The other way to build
&lt;/h2&gt;

&lt;p&gt;There were alternatives for each of these choices. I think these are all totally viable parts for building effectively the same system:&lt;/p&gt;

&lt;h3&gt;
  
  
  Click
&lt;/h3&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/pallets"&gt;
        pallets
      &lt;/a&gt; / &lt;a href="https://github.com/pallets/click"&gt;
        click
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Python composable command line interface toolkit
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="rst"&gt;
&lt;h1&gt;
$ click_&lt;/h1&gt;
&lt;p&gt;Click is a Python package for creating beautiful command line interfaces
in a composable way with as little code as necessary. It's the "Command
Line Interface Creation Kit". It's highly configurable but comes with
sensible defaults out of the box.&lt;/p&gt;
&lt;p&gt;It aims to make the process of writing command line tools quick and fun
while also preventing any frustration caused by the inability to
implement an intended CLI API.&lt;/p&gt;
&lt;p&gt;Click in three points:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Arbitrary nesting of commands&lt;/li&gt;
&lt;li&gt;Automatic help page generation&lt;/li&gt;
&lt;li&gt;Supports lazy loading of subcommands at runtime&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
Installing&lt;/h2&gt;
&lt;p&gt;Install and update using &lt;a href="https://pip.pypa.io/en/stable/getting-started/" rel="nofollow"&gt;pip&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;$ pip install -U click
&lt;/pre&gt;
&lt;h2&gt;
A Simple Example&lt;/h2&gt;
&lt;div class="highlight highlight-source-python notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;click&lt;/span&gt;
&lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;click&lt;/span&gt;.&lt;span class="pl-en"&gt;command&lt;/span&gt;()&lt;/span&gt;
&lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;click&lt;/span&gt;.&lt;span class="pl-en"&gt;option&lt;/span&gt;(&lt;span class="pl-s"&gt;"--count"&lt;/span&gt;, &lt;span class="pl-s1"&gt;default&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;1&lt;/span&gt;, &lt;span class="pl-s1"&gt;help&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"Number of greetings."&lt;/span&gt;)&lt;/span&gt;
&lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;click&lt;/span&gt;.&lt;span class="pl-en"&gt;option&lt;/span&gt;(&lt;span class="pl-s"&gt;"--name"&lt;/span&gt;, &lt;span class="pl-s1"&gt;prompt&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"Your name"&lt;/span&gt;, &lt;span class="pl-s1"&gt;help&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"The person to greet."&lt;/span&gt;)&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;hello&lt;/span&gt;&lt;/pre&gt;…
&lt;/div&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/pallets/click"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;I actually am using &lt;code&gt;click&lt;/code&gt;, sort of. &lt;code&gt;typer&lt;/code&gt; is built on top of &lt;code&gt;click&lt;/code&gt;, but to be honest I didn't really know that until I'd mostly decided. &lt;code&gt;click&lt;/code&gt; looks like a really great project, but it wasn't &lt;em&gt;as&lt;/em&gt; clear how to get started.&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/run-llama"&gt;
        run-llama
      &lt;/a&gt; / &lt;a href="https://github.com/run-llama/llama_index"&gt;
        llama_index
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      LlamaIndex (formerly GPT Index) is a data framework for your LLM applications
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
🗂️ LlamaIndex 🦙&lt;/h1&gt;
&lt;p&gt;LlamaIndex (GPT Index) is a data framework for your LLM application.&lt;/p&gt;
&lt;p&gt;PyPI:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;LlamaIndex: &lt;a href="https://pypi.org/project/llama-index/" rel="nofollow"&gt;https://pypi.org/project/llama-index/&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;GPT Index (duplicate): &lt;a href="https://pypi.org/project/gpt-index/" rel="nofollow"&gt;https://pypi.org/project/gpt-index/&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;LlamaIndex.TS (Typescript/Javascript): &lt;a href="https://github.com/run-llama/LlamaIndexTS"&gt;https://github.com/run-llama/LlamaIndexTS&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Documentation: &lt;a href="https://docs.llamaindex.ai/en/stable/" rel="nofollow"&gt;https://docs.llamaindex.ai/en/stable/&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Twitter: &lt;a href="https://twitter.com/llama_index" rel="nofollow"&gt;https://twitter.com/llama_index&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Discord: &lt;a href="https://discord.gg/dGcwcsnxhU" rel="nofollow"&gt;https://discord.gg/dGcwcsnxhU&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;
Ecosystem&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;LlamaHub (community library of data loaders): &lt;a href="https://llamahub.ai" rel="nofollow"&gt;https://llamahub.ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;LlamaLab (cutting-edge AGI projects using LlamaIndex): &lt;a href="https://github.com/run-llama/llama-lab"&gt;https://github.com/run-llama/llama-lab&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
🚀 Overview&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt;: This README is not updated as frequently as the documentation. Please check out the documentation above for the latest updates!&lt;/p&gt;
&lt;h3&gt;
Context&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;LLMs are a phenomenal piece of technology for knowledge generation and reasoning. They are pre-trained on large amounts of publicly available data.&lt;/li&gt;
&lt;li&gt;How do we best augment LLMs with our own private data?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We need a comprehensive toolkit to help perform this data augmentation for LLMs.&lt;/p&gt;
&lt;h3&gt;
Proposed Solution&lt;/h3&gt;
&lt;p&gt;That's where &lt;strong&gt;LlamaIndex&lt;/strong&gt; comes in. LlamaIndex is a "data framework" to help you build LLM apps. It provides the following tools:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Offers &lt;strong&gt;data connectors&lt;/strong&gt; to ingest…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/run-llama/llama_index"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;&lt;code&gt;llama_index&lt;/code&gt; is probably a great project, but I only found it late in my thinking on this project. If I start a different project that it's suitable for any time soon, I'm definitely going to try it out as a comparison.&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/facebookresearch"&gt;
        facebookresearch
      &lt;/a&gt; / &lt;a href="https://github.com/facebookresearch/faiss"&gt;
        faiss
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      A library for efficient similarity search and clustering of dense vectors.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
Faiss&lt;/h1&gt;
&lt;p&gt;Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed primarily at Meta's &lt;a href="https://ai.facebook.com/" rel="nofollow"&gt;Fundamental AI Research&lt;/a&gt; group.&lt;/p&gt;
&lt;h2&gt;
News&lt;/h2&gt;
&lt;p&gt;See &lt;a href="https://github.com/facebookresearch/faiss/blob/main/CHANGELOG.md"&gt;CHANGELOG.md&lt;/a&gt; for detailed information about latest features.&lt;/p&gt;
&lt;h2&gt;
Introduction&lt;/h2&gt;
&lt;p&gt;Faiss contains several methods for similarity search. It assumes that the instances are represented as vectors and are identified by an integer, and that the vectors can be compared with L2 (Euclidean) distances or dot products. Vectors that are similar to a query vector are those that have the lowest L2 distance or the highest dot product with the query vector. It also…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/facebookresearch/faiss"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;I'd used &lt;code&gt;faiss&lt;/code&gt; in a tutorial on vectorstores before. It didn't strike me as hugely intuitive to use or as simple to set up (its recommended installation path is via conda). I also don't particularly like Facebook, so I'm happy to use an alternative.&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/openai"&gt;
        openai
      &lt;/a&gt; / &lt;a href="https://github.com/openai/openai-python"&gt;
        openai-python
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      The official Python library for the OpenAI API
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
OpenAI Python API library&lt;/h1&gt;
&lt;p&gt;&lt;a href="https://pypi.org/project/openai/" rel="nofollow"&gt;&lt;img src="https://camo.githubusercontent.com/de6cf76be0f20f305575261470d9f6819adbdf5c96e7ab494e375a38f82b3a15/68747470733a2f2f696d672e736869656c64732e696f2f707970692f762f6f70656e61692e737667" alt="PyPI version"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The OpenAI Python library provides convenient access to the OpenAI REST API from any Python 3.7+
application. The library includes type definitions for all request params and response fields
and offers both synchronous and asynchronous clients powered by &lt;a href="https://github.com/encode/httpx"&gt;httpx&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It is generated from our &lt;a href="https://github.com/openai/openai-openapi"&gt;OpenAPI specification&lt;/a&gt; with &lt;a href="https://stainlessapi.com/" rel="nofollow"&gt;Stainless&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;
Documentation&lt;/h2&gt;
&lt;p&gt;The API documentation can be found &lt;a href="https://platform.openai.com/docs" rel="nofollow"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;
Installation&lt;/h2&gt;
&lt;div class="markdown-alert markdown-alert-important"&gt;
&lt;p class="markdown-alert-title"&gt;Important&lt;/p&gt;
&lt;p&gt;The SDK was rewritten in v1, which was released November 6th 2023. See the &lt;a href="https://github.com/openai/openai-python/discussions/742"&gt;v1 migration guide&lt;/a&gt;, which includes scripts to automatically update your code.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;pip install openai&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
Usage&lt;/h2&gt;
&lt;p&gt;The full API of this library can be found in &lt;a href="https://www.github.com/openai/openai-python/blob/main/api.md"&gt;api.md&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight highlight-source-python notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;os&lt;/span&gt;
&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;openai&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;OpenAI&lt;/span&gt;
&lt;span class="pl-s1"&gt;client&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;OpenAI&lt;/span&gt;(
    &lt;span class="pl-c"&gt;# This is the default and can be omitted&lt;/span&gt;
    &lt;span class="pl-s1"&gt;api_key&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s1"&gt;os&lt;/span&gt;.&lt;span class="pl-s1"&gt;environ&lt;/span&gt;.&lt;span class="pl-en"&gt;get&lt;/span&gt;(&lt;span class="pl-s"&gt;"OPENAI_API_KEY"&lt;/span&gt;),
)

&lt;span class="pl-s1"&gt;chat_completion&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;client&lt;/span&gt;.&lt;span class="pl-s1"&gt;chat&lt;/span&gt;.&lt;span class="pl-s1"&gt;completions&lt;/span&gt;.&lt;span class="pl-en"&gt;create&lt;/span&gt;(
    &lt;span class="pl-s1"&gt;messages&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;[&lt;/pre&gt;…
&lt;/div&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/openai/openai-python"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;I'd used &lt;code&gt;openai&lt;/code&gt; for a handful of tutorials and notebook experiments already and been very happy with it. However, for a project like this I wasn't really sure what the operational costs would be, or whether they would be worth it for the benefit the tool provides. That, combined with the requirement to have network connectivity while using the tool, pushed me towards experimenting with alternatives. Luckily, with &lt;code&gt;langchain&lt;/code&gt; I should be able to provide it as an optional backend in the future.&lt;/p&gt;

&lt;h2&gt;
  
  
  What state is &lt;code&gt;starpilot&lt;/code&gt; in now?
&lt;/h2&gt;

&lt;p&gt;"actively developed", "v0.1.0", "untested" and "it runs on my machine" are good descriptions of the project right now.&lt;/p&gt;

&lt;p&gt;I've spent a few evenings this month on it, and see myself spending at least a few more on it next month. The API is getting breaking changes almost every time I open the project. It's got 0 real tests. It should get some soon though. It requires a few manual installation steps that are documented in &lt;code&gt;README.md&lt;/code&gt; but haven't yet even been attempted on any machine other than the one I'm on right now.&lt;/p&gt;

&lt;p&gt;It also doesn't yet achieve exactly what I want it to, but I see no reason yet that it can't with some more development time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Current features
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;code&gt;starpilot read MyCoolUserName&lt;/code&gt;
&lt;/h4&gt;

&lt;p&gt;This will connect to GitHub and read the starred repos of the user &lt;code&gt;MyCoolUserName&lt;/code&gt;. Then it will go to each of those repos, get the topics and descriptions (and optionally the readmes), and load these into &lt;code&gt;chroma&lt;/code&gt;, which is persisted on the local hard drive.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;code&gt;starpilot shoot "insert topic here"&lt;/code&gt;
&lt;/h4&gt;

&lt;p&gt;This will spin up the &lt;code&gt;chroma&lt;/code&gt; database and perform a semantic similarity search on the string given in the command, then return the documents that seem to be the most relevant.&lt;/p&gt;
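
&lt;p&gt;To make "semantic similarity search" a bit more concrete, here is a toy sketch of the idea in plain Python. It is not how &lt;code&gt;starpilot&lt;/code&gt; or &lt;code&gt;chroma&lt;/code&gt; implement it: the three-number "embeddings", the repo names and the &lt;code&gt;rank_by_similarity&lt;/code&gt; helper are all made up for illustration. A real vectorstore gets its vectors from an embedding model and typically uses an index rather than scanning everything.&lt;/p&gt;

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, documents, k=2):
    """Return the names of the k documents closest to the query."""
    scored = sorted(
        documents.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]


# Hypothetical, hand-written "embeddings"; a real model produces far more dimensions.
toy_documents = {
    "vector-database": [0.9, 0.1, 0.0],
    "web-framework": [0.0, 0.2, 0.9],
    "embedding-library": [0.8, 0.3, 0.1],
}

# The query vector points in roughly the same direction as the first and third entries.
print(rank_by_similarity([1.0, 0.2, 0.0], toy_documents))
```

&lt;p&gt;The documents whose vectors point in a similar direction to the query come back first, which is the "seem to be the most relevant" part of the command.&lt;/p&gt;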

&lt;h4&gt;
  
  
  &lt;code&gt;starpilot fortuneteller "Insert a question here"&lt;/code&gt;
&lt;/h4&gt;

&lt;p&gt;This will perform the exact same search as the &lt;code&gt;shoot&lt;/code&gt; command, but then spin up a large language model and pass the results into it for processing. It then returns the documents it found as well as the response from the LLM.&lt;/p&gt;
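
&lt;p&gt;The shape of that "search first, then ask the model" flow can be sketched roughly like this. Everything here is a stand-in: &lt;code&gt;build_prompt&lt;/code&gt;, &lt;code&gt;fake_search&lt;/code&gt; and &lt;code&gt;fake_llm&lt;/code&gt; are hypothetical names, the prompt wording is invented, and the real command goes through &lt;code&gt;langchain&lt;/code&gt; rather than hand-rolled functions.&lt;/p&gt;

```python
def build_prompt(question, documents):
    """Combine the retrieved documents and the question into one prompt string."""
    context = "\n".join(f"- {doc}" for doc in documents)
    return (
        "Use only the starred repos listed below to answer.\n"
        f"{context}\n"
        f"Question: {question}"
    )


def fortuneteller(question, search, llm):
    """Retrieve documents first, then ask the model about them."""
    documents = search(question)
    answer = llm(build_prompt(question, documents))
    # Return both, so the caller can see what the answer was based on.
    return documents, answer


# Stand-ins so the sketch runs without a vectorstore or a model.
def fake_search(query):
    return ["repo-one: stores embeddings", "repo-two: builds LLM apps"]


def fake_llm(prompt):
    return f"(model reply to a {len(prompt.splitlines())}-line prompt)"


docs, reply = fortuneteller("What can store embeddings?", fake_search, fake_llm)
print(docs)
print(reply)
```

&lt;p&gt;Passing the search and the model in as plain functions is only a convenience for the sketch; it keeps the example runnable offline.&lt;/p&gt;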

&lt;h2&gt;
  
  
  So....
&lt;/h2&gt;

&lt;p&gt;That's where this project is at. I've learnt a tonne about the available tools and relevant techniques in this space already, which was really the main goal of starting to begin with!&lt;/p&gt;

&lt;p&gt;That said, the progress I've made so far only makes me more curious about what else can be done with this and what else can be solved towards the vision of "Making your GitHub stars more valuable in your daily coding". Here are some ideas that I've found exciting while getting my hands dirty that might show up in the future. These are along with the obvious things like any testing at all, a simpler way to set up the project on your machine, better error handling, a more sensible way to update the vectorstore than drop everything and rebuild each time, etc.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inspecting the current project's description (both its loose goals as well as more specific things like what packages it already uses) so that things that are already used aren't suggested and are instead used to inform the response.&lt;/li&gt;
&lt;li&gt;Dynamically creating a GitHub list of similar starred repos for your user (though that would probably rely on &lt;a href="https://github.com/orgs/community/discussions/8293?sort=new"&gt;this suggestion to extend the GitHub API&lt;/a&gt;) so that you naturally have some way of saving and sharing, between sessions in your terminal, the starred repos that solve a specific problem.&lt;/li&gt;
&lt;li&gt;Building starpilot into a research agent that can perform actions such as installing the selected suggestion into the current project, or be sent to GitHub to find new projects that solve the current goals and that you haven't starred yet.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What do you think?
&lt;/h2&gt;

&lt;p&gt;Does this sound like something interesting to you, maybe even something useful? Did this just spark inspiration in you for a new project? Does this actually already exist somewhere and I'm just being an idiot? Let me know :)&lt;/p&gt;

</description>
      <category>ai</category>
      <category>github</category>
      <category>python</category>
      <category>cli</category>
    </item>
    <item>
      <title>My GitHub profile shows my popular dev.to posts and GitHub repos automatically</title>
      <dc:creator>Dave Parr</dc:creator>
      <pubDate>Wed, 05 Aug 2020 15:41:23 +0000</pubDate>
      <link>https://forem.com/daveparr/my-github-profile-shows-my-popular-dev-to-posts-and-github-repos-automatically-2n05</link>
      <guid>https://forem.com/daveparr/my-github-profile-shows-my-popular-dev-to-posts-and-github-repos-automatically-2n05</guid>
      <description>&lt;p&gt;This is my GitHub profile:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vJ70wriM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/github-logo-ba8488d21cd8ee1fee097b8410db9deaa41d0ca30b004c0c63de0a479114156f.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/DaveParr"&gt;
        DaveParr
      &lt;/a&gt; / &lt;a href="https://github.com/DaveParr/DaveParr"&gt;
        DaveParr
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      My GitHub Readme Profile
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h2&gt;
Popular Repos&lt;/h2&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/daveparr/daveparr/blob/main/graph.png"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Fcg1TfPV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://github.com/daveparr/daveparr/raw/main/graph.png" alt="A plot of David Parrs popular github repos"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
Popular Blogs&lt;/h2&gt;
&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;article&lt;/th&gt;
&lt;th&gt;public_reactions_count&lt;/th&gt;
&lt;th&gt;comments_count&lt;/th&gt;
&lt;th&gt;page_views_count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/daveparr/i-made-my-dev-to-content-into-a-website-to-find-a-new-job-2kn5" rel="nofollow"&gt;I made my dev.to content into a website to find a new job&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;36&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;518&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/daveparr/why-use-aws-lambda-for-data-science-421" rel="nofollow"&gt;Why use AWS Lambda for Data Science?&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;33&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;560&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/daveparr/posting-straight-from-rmd-to-dev-to-1j4p" rel="nofollow"&gt;Posting straight from .Rmd to dev.to (for real this time)&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;173&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/daveparr/gotcha-local-gitlab-runners-no-such-image-docker-and-disk-space-7ei" rel="nofollow"&gt;Local gitlab runners, ‘no such image’, docker and disk space&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;600&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://dev.to/daveparr/writing-r-packages-fast-474c" rel="nofollow"&gt;Writing R packages, fast&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/DaveParr/DaveParr"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;
&lt;br&gt;


&lt;h2&gt;
  
  
  Background
&lt;/h2&gt;

&lt;p&gt;I made it because I saw this repo by zhiiiyang:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vJ70wriM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/github-logo-ba8488d21cd8ee1fee097b8410db9deaa41d0ca30b004c0c63de0a479114156f.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/zhiiiyang"&gt;
        zhiiiyang
      &lt;/a&gt; / &lt;a href="https://github.com/zhiiiyang/zhiiiyang"&gt;
        zhiiiyang
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      It is a self-updating personal README showing my latest tweet and reply. 
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;p&gt;My name is Zhi (pronounced as Z). Welcome to my GitHub page! Here are highlights of my recent projects.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
📱 A Phone Shiny app &lt;a href="https://github.com/zhiiiyang/OTworkout"&gt;OTworkout&lt;/a&gt; to track your workout from GitHub.&lt;/li&gt;
&lt;li&gt;
🤖 A Twitter bot &lt;a href="https://github.com/zhiiiyang/mutSignature_Pubmed_bot"&gt;mutsignatures&lt;/a&gt; using Python and AWS Lambda.&lt;/li&gt;
&lt;li&gt;
📊 #TidyTuesday Data Visualization &lt;a href="https://github.com/zhiiiyang/tidytuesday"&gt;Gallery&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
📦 Two Bioconductor packages: &lt;a href="https://github.com/USCbiostats/HiLDA"&gt;HiLDA&lt;/a&gt; and &lt;a href="https://github.com/USCbiostats/selectKSigs"&gt;selectKSigs&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
👇 You're viewing my lastest tweet deployed on &lt;a href="https://github.com/zhiiiyang/zhiiiyang"&gt;GitHub actions&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;div&gt;
&lt;a rel="noopener noreferrer" href="https://github.com/zhiiiyang/zhiiiyang/blob/master/tweet.png"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--zV0khAWK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://github.com/zhiiiyang/zhiiiyang/raw/master/tweet.png" width="600"&gt;&lt;/a&gt;    
&lt;p&gt;&lt;a href="https://twitter.com/zhiiiyang" rel="nofollow"&gt;Follow me on Twitter&lt;/a&gt; 💬   |   &lt;a href="https://www.linkedin.com/in/zhiiiyang/" rel="nofollow"&gt;Connect me on LinkedIn&lt;/a&gt; 👔   |   &lt;a href="https://zhiyang.netlify.app/" rel="nofollow"&gt;Check out my website&lt;/a&gt; 🔗&lt;/p&gt;
&lt;/div&gt;

&lt;/div&gt;



&lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/zhiiiyang/zhiiiyang"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;
&lt;br&gt;


&lt;p&gt;Thanks to &lt;a class="comment-mentioned-user" href="https://dev.to/mokkapps"&gt;@mokkapps&lt;/a&gt;
 for this article, which uses the Twitter-updating&lt;br&gt;
action and is how I discovered it.&lt;/p&gt;


&lt;div class="ltag__link"&gt;
  &lt;a href="/mokkapps" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--PV2woSQj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://res.cloudinary.com/practicaldev/image/fetch/s--w6M-xuYC--/c_fill%2Cf_auto%2Cfl_progressive%2Ch_150%2Cq_auto%2Cw_150/https://dev-to-uploads.s3.amazonaws.com/uploads/user/profile_image/23105/e2bc4409-7a70-40ee-b801-53bb7d003fe2.jpg" alt="mokkapps image"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="/mokkapps/how-i-built-a-self-updating-readme-on-my-github-profile-418d" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;How I Built A Self-Updating README On My Github Profile&lt;/h2&gt;
      &lt;h3&gt;Michael Hoffmann ・ Jul 15 '20 ・ 2 min read&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#github&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#profile&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#javascript&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#developer&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  Motivation
&lt;/h2&gt;

&lt;p&gt;It took me a little while to decide what to do with my profile. I wanted&lt;br&gt;
something data driven, and I wanted something to show off my programming&lt;br&gt;
successes. So I decided to show my most popular repos, and also to show&lt;br&gt;
my most popular dev.to posts! I noticed that zhiiiyang made extensive&lt;br&gt;
use of &lt;a href="https://github.com/r-lib/actions"&gt;&lt;code&gt;r-lib/actions&lt;/code&gt;&lt;/a&gt;, and I’ve&lt;br&gt;
found that it was really valuable for my project too!&lt;/p&gt;
&lt;h2&gt;
  
  
  Method
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Repos
&lt;/h3&gt;

&lt;p&gt;The repos script was where I started. Building from zhiiiyang’s work, I&lt;br&gt;
built a &lt;a href="https://github.com/DaveParr/DaveParr/blob/5c66bd4a2bd970ec7ad85e6de56fedcc75fbf74f/.github/workflows/main.yml"&gt;GitHub&lt;br&gt;
workflow&lt;/a&gt;&lt;br&gt;
to call a script I had written.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://github.com/DaveParr/DaveParr/blob/5c66bd4a2bd970ec7ad85e6de56fedcc75fbf74f/repos.R"&gt;script was quite&lt;br&gt;
simple&lt;/a&gt;.&lt;br&gt;
It gets my repos from the GitHub API through the&lt;br&gt;
&lt;a href="https://github.com/r-lib/gh"&gt;&lt;code&gt;gh&lt;/code&gt;&lt;/a&gt; package, and tidies the return using&lt;br&gt;
&lt;code&gt;hoist&lt;/code&gt; to grab the important bits. It then filters and pivots the data&lt;br&gt;
into a simple plot.&lt;/p&gt;

&lt;p&gt;The most interesting part was where zhiiiyang added the output as a&lt;br&gt;
commit by the action itself. The authentication for the action is&lt;br&gt;
actually allowed by this section:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;GITHUB_PAT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.GITHUB_TOKEN }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;I did stumble a little over the correct path to the plot image. It turns&lt;br&gt;
out that you can use the relative location (&lt;code&gt;./graph.png&lt;/code&gt;) which will&lt;br&gt;
work if you are viewing the README from &lt;em&gt;within&lt;/em&gt; the repo, but to make&lt;br&gt;
it work when the README is displayed from my &lt;em&gt;user&lt;/em&gt; page you have to use&lt;br&gt;
the absolute path&lt;br&gt;
(&lt;code&gt;https://github.com/daveparr/daveparr/blob/main/graph.png&lt;/code&gt;).&lt;/p&gt;
&lt;h3&gt;
  
  
  Posts
&lt;/h3&gt;

&lt;p&gt;The posts part was actually pretty easy once I’d got used to how GitHub&lt;br&gt;
Actions operate, and also how the GitHub &lt;code&gt;README&lt;/code&gt; profiles work.&lt;/p&gt;

&lt;p&gt;The key part was actually a feature I recently developed for my own&lt;br&gt;
package.&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vJ70wriM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/github-logo-ba8488d21cd8ee1fee097b8410db9deaa41d0ca30b004c0c63de0a479114156f.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/DaveParr"&gt;
        DaveParr
      &lt;/a&gt; / &lt;a href="https://github.com/DaveParr/dev.to.ol"&gt;
        dev.to.ol
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      dev.to.ol helps R users publish to dev.to
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
&lt;a href="https://dev.to/daveparr" rel="nofollow"&gt;
&lt;img src="https://camo.githubusercontent.com/1f3c6413af566c3bdc34d592cb5f299bf014242798daf4854b3c531ad522b904/68747470733a2f2f6432666c746978307632653073622e636c6f756466726f6e742e6e65742f6465762d62616467652e737667" alt="Dave Parr's DEV Profile" height="30" width="30"&gt;
&lt;/a&gt;.to.ol
&lt;/h1&gt;
&lt;p&gt;&lt;a href="https://www.tidyverse.org/lifecycle/#maturing" rel="nofollow"&gt;&lt;img src="https://camo.githubusercontent.com/ae2f538d678a8e76c5493d870c59fbf928b14906e41227a07af5bbf3566b5068/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c6966656379636c652d6d61747572696e672d626c75652e737667" alt="Lifecycle: maturing"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The goal of &lt;code&gt;dev.to.ol&lt;/code&gt; is to help R users publish to dev.to&lt;/p&gt;
&lt;h2&gt;
Installation&lt;/h2&gt;
&lt;p&gt;You can install the dev.to.ol from &lt;a href="https://raw.githubusercontent.com/DaveParr/dev.to.ol/main/www.github.com"&gt;github&lt;/a&gt; with
&lt;code&gt;remotes&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-r js-code-highlight"&gt;
&lt;pre&gt;&lt;span class="pl-e"&gt;remotes&lt;/span&gt;&lt;span class="pl-k"&gt;::&lt;/span&gt;install_github(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;DaveParr/dev.to.ol&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;)&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
Workflow&lt;/h2&gt;
&lt;h3&gt;
Create your article&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;create_new_article&lt;/code&gt; function will give you the front matter
boilerplate for an article &lt;code&gt;.Rmd&lt;/code&gt; file. Optionally supplying a file name
will create a new file with the front matter at the start.&lt;/p&gt;
&lt;div class="highlight highlight-source-r js-code-highlight"&gt;
&lt;pre&gt;create_new_article(&lt;span class="pl-v"&gt;title&lt;/span&gt; &lt;span class="pl-k"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;my title&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;)&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
Write your article!&lt;/h3&gt;
&lt;p&gt;This is the fun bit. Mark your great ideas down in an &lt;code&gt;.Rmd&lt;/code&gt;!&lt;/p&gt;
&lt;h3&gt;
Post your article&lt;/h3&gt;
&lt;p&gt;Once the &lt;code&gt;.Rmd&lt;/code&gt; is written, you can post it to dev.to with
&lt;code&gt;post_new_article&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight highlight-source-r js-code-highlight"&gt;
&lt;pre&gt;post_new_article(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;./my-great-article.Rmd&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;)&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
Check your articles&lt;/h3&gt;
&lt;p&gt;There are two functions to check the posted articles on dev.to,
published and unpublished. Both will return a ‘tidy’ data set by
default.&lt;/p&gt;
&lt;div class="highlight highlight-source-r js-code-highlight"&gt;
&lt;pre&gt;get_users_articles()
&lt;span class="pl-smi"&gt;Using&lt;/span&gt; &lt;span class="pl-smi"&gt;DEVTO&lt;/span&gt; &lt;span class="pl-k"&gt;in&lt;/span&gt; &lt;span class="pl-smi"&gt;.Renviron&lt;/span&gt;
&lt;span class="pl-smi"&gt;The&lt;/span&gt; &lt;span class="pl-smi"&gt;API&lt;/span&gt; &lt;span class="pl-smi"&gt;returned&lt;/span&gt; &lt;span class="pl-smi"&gt;the&lt;/span&gt; &lt;span class="pl-smi"&gt;expected&lt;/span&gt; &lt;span class="pl-smi"&gt;success&lt;/span&gt;&lt;/pre&gt;…
&lt;/div&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/DaveParr/dev.to.ol"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;&lt;code&gt;dev.to.ol&lt;/code&gt; is a package to help R users manage their dev.to content. In&lt;br&gt;
particular I had recently finished the functions that return data from&lt;br&gt;
the API about your published articles. The key to this is to have the&lt;br&gt;
dev.to API key that you want to use set as an encrypted secret in the&lt;br&gt;
repo. Once that is set, my package can read it if it’s set as an&lt;br&gt;
environment variable along with the &lt;code&gt;GITHUB_PAT&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;DEVTO&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.DEVTO }}&lt;/span&gt;
    &lt;span class="na"&gt;GITHUB_PAT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.GITHUB_TOKEN }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because my package returns a tidy data.frame-like object, it was trivial&lt;br&gt;
to munge it down to just what I wanted to show, and then format it&lt;br&gt;
neatly with &lt;code&gt;knitr&lt;/code&gt;. I also went all in on the &lt;code&gt;r-lib/actions&lt;/code&gt; examples,&lt;br&gt;
and now not only generate the new data for both the chart and blogs&lt;br&gt;
during the GitHub action, but also do the full compile from &lt;code&gt;.Rmd&lt;/code&gt; to&lt;br&gt;
&lt;code&gt;.md&lt;/code&gt; using the &lt;a href="https://github.com/DaveParr/DaveParr/blob/1f0d043ead21077879e4ba8bb282d66f9a6e1cb3/.github/workflows/main.yml#L18"&gt;&lt;code&gt;setup-pandoc&lt;/code&gt;&lt;br&gt;
action&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons in automation
&lt;/h2&gt;

&lt;p&gt;I’d been meaning to explore GitHub Actions for a little while, and I&lt;br&gt;
found a few things out that I’m going to be considering in the future as&lt;br&gt;
I develop this and other projects.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pricing vs performance on macOS and Linux
&lt;/h3&gt;

&lt;p&gt;It’s free to a point, and then you need to pay. I had a look at the&lt;br&gt;
&lt;a href="https://docs.github.com/en/github/setting-up-and-managing-billing-and-payments-on-github/about-billing-for-github-actions"&gt;pricing&lt;br&gt;
plan&lt;/a&gt;&lt;br&gt;
and noticed that the runner’s operating system impacts your pricing. Most of the examples&lt;br&gt;
from &lt;code&gt;r-lib/actions&lt;/code&gt; run on &lt;code&gt;macos-latest&lt;/code&gt;, as does zhiiiyang’s project.&lt;br&gt;
In GitHub Actions pricing, a minute of macOS runtime is worth &lt;em&gt;10&lt;br&gt;
minutes&lt;/em&gt; of Linux runtime. I ran on macOS for a while too, but&lt;br&gt;
eventually thought that it might be a smart idea for a long-running&lt;br&gt;
personal project to convert to a Linux runner, though now I’ve done it&lt;br&gt;
I’m debating going back to macOS.&lt;/p&gt;

&lt;p&gt;The ‘problem’ is that I did not write this process to be fast, or light.&lt;br&gt;
It’s a silly hobby project to over-automate because I can. Therefore, on&lt;br&gt;
Linux I am now actually &lt;a href="https://github.com/DaveParr/DaveParr/blob/1f0d043ead21077879e4ba8bb282d66f9a6e1cb3/.github/workflows/main.yml#L21-L29"&gt;compiling the packages on each installation&lt;br&gt;
&lt;em&gt;AND&lt;/em&gt; installing libcurl for the api&lt;br&gt;
calls&lt;/a&gt;.&lt;br&gt;
The cost savings I make from not picking the faster runner in this&lt;br&gt;
case are approximately negated by the increase in actual run time. By&lt;br&gt;
eye, that part of the job runs at about 10 minutes now, whereas on macOS&lt;br&gt;
it was about 10 times faster as the pre-compiled binaries could just be&lt;br&gt;
downloaded and would run ‘out of the box’. I’ll probably change it back&lt;br&gt;
in a little while after I’ve checked how variable it can be.&lt;/p&gt;
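
&lt;p&gt;The trade-off is easier to see as arithmetic. This is a rough sketch that assumes the 10x macOS multiplier and my by-eye timings above (about 10 minutes on Linux, so about 1 minute on macOS). The &lt;code&gt;billed_minutes&lt;/code&gt; helper is made up for illustration and isn’t part of any GitHub tooling.&lt;/p&gt;

```python
# Multipliers as described above: one macOS minute is billed like ten Linux minutes.
BILLING_MULTIPLIER = {"linux": 1, "macos": 10}


def billed_minutes(runner, wall_clock_minutes):
    """Minutes counted against the allowance for a single job."""
    return wall_clock_minutes * BILLING_MULTIPLIER[runner]


# Rough, by-eye timings from this post, not measurements.
linux_job = billed_minutes("linux", 10)  # compiles the packages from source
macos_job = billed_minutes("macos", 1)   # downloads pre-compiled binaries

print(linux_job, macos_job)
```

&lt;p&gt;With those numbers both jobs are billed about the same, which is why the switch to Linux didn’t really save anything.&lt;/p&gt;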

&lt;p&gt;Potential solutions could include some form of caching (which I’ve heard&lt;br&gt;
is maybe supported?) or running the action in my own docker image with&lt;br&gt;
the pre-compiled packages, though TBH that sounds like work, and this is supposed&lt;br&gt;
to be fun :P&lt;/p&gt;

&lt;h3&gt;
  
  
  r-lib actions for the Rmd
&lt;/h3&gt;

&lt;p&gt;I really like the idea of compiling the &lt;code&gt;README.md&lt;/code&gt; for a package from&lt;br&gt;
the &lt;code&gt;README.Rmd&lt;/code&gt; we often use in R. I’ve often forgotten in my own work&lt;br&gt;
to do that key step before a push, and having a relatively simple&lt;br&gt;
automation baked into where my repos live will likely be something I&lt;br&gt;
use in the future. The best part of this trick is that &lt;code&gt;r-lib/actions&lt;/code&gt;&lt;br&gt;
does the most irritating part of ‘making Pandoc work’ for me. So I can&lt;br&gt;
just profit!&lt;/p&gt;

&lt;h2&gt;
  
  
  Success!
&lt;/h2&gt;

&lt;p&gt;I really liked hacking this out. I got to put my &lt;code&gt;dev.to.ol&lt;/code&gt; package to&lt;br&gt;
another practical use and learn about GitHub Actions. Feel free to&lt;br&gt;
re-create this on your profiles, either by grabbing bits or by just&lt;br&gt;
lifting the whole thing. One of the reasons I built it the way I did is&lt;br&gt;
so it could be relatively portable between users, and maybe solve a&lt;br&gt;
problem for more than just me. So if you get this deployed on your&lt;br&gt;
profile, or get stuck, I’d love to hear from you!&lt;/p&gt;

</description>
      <category>github</category>
      <category>rstats</category>
      <category>showdev</category>
      <category>r</category>
    </item>
    <item>
      <title>How to calculate a Pokemons 'power level' using kmeans</title>
      <dc:creator>Dave Parr</dc:creator>
      <pubDate>Fri, 17 Jul 2020 15:29:19 +0000</pubDate>
      <link>https://forem.com/daveparr/how-to-calculate-a-pokemons-power-level-using-kmeans-4m8g</link>
      <guid>https://forem.com/daveparr/how-to-calculate-a-pokemons-power-level-using-kmeans-4m8g</guid>
      <description>&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pokedex&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tidyverse&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tidymodels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;showtext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;font_add_google&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Press Start 2P"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;showtext_auto&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;theme_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;theme_pokedex&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;knitr&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;opts_chunk&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fig.showtext&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
  The Hero’s Journey
&lt;/h2&gt;

&lt;p&gt;Pokemon games have a very familiar cycle. You start with one of 3&lt;br&gt;
Pokemon. You adventure out with your new buddy, facing tougher and&lt;br&gt;
tougher Pokemon, in greater variety. Many of your Pokemon evolve over&lt;br&gt;
time, and eventually you find the end-game legendaries in a climactic&lt;br&gt;
battle of titans!&lt;/p&gt;

&lt;p&gt;You can even see this story in the data. Here is the &lt;code&gt;base_experience&lt;/code&gt;&lt;br&gt;
of all the Pokemon, identified by game generation. The &lt;code&gt;base_experience&lt;/code&gt;&lt;br&gt;
is the basic amount of experience that is gained by the winner of a&lt;br&gt;
battle from a specific species. For example, if you &lt;em&gt;beat&lt;/em&gt; a Bulbasaur, your&lt;br&gt;
Pokemon &lt;em&gt;gains&lt;/em&gt; experience based on a formula which uses Bulbasaur’s&lt;br&gt;
&lt;code&gt;base_experience&lt;/code&gt;. Because of that we can see it as a proxy value for&lt;br&gt;
how &lt;em&gt;powerful&lt;/em&gt; a Pokemon is. If it’s more powerful, it will be harder to&lt;br&gt;
beat, and so reward you with more experience when you win.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;pokemon&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;ggplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;aes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;colour&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;generation_id&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;geom_point&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;labs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Base Experience for each Pokemon"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ea44K4TJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/pokemon-power-levels_files/figure-gfm/plot%2520base%2520experience-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ea44K4TJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/pokemon-power-levels_files/figure-gfm/plot%2520base%2520experience-1.png" alt="" width="672" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In each generation you can see a few attributes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;power level - 3(ish) tiers, grouped in different vertical lanes&lt;/li&gt;
&lt;li&gt;progression - a (generally) increasing trend in the base power value&lt;/li&gt;
&lt;li&gt;the up-ticks at the start and the end of each gen are the starters’
top-tier evolution (Woo Blastoise!) and the Legendaries at the
end-game&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s see if we can group each Pokemon into a &lt;code&gt;power_level&lt;/code&gt;: a&lt;br&gt;
categorical grouping which relates it to other Pokemon with similar&lt;br&gt;
&lt;code&gt;base_experience&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  Grouping and counting
&lt;/h2&gt;

&lt;p&gt;Maybe we can explicitly describe the power levels of each tier of&lt;br&gt;
Pokemon with a simple process? Can we group each Pokemon by evolutionary&lt;br&gt;
chain, and then count each Pokemon’s order within the group?&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;pokemon&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;group_by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;evolution_chain_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;mutate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;row_number&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pokemon_group_count&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;ggplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pokemon_group_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="n"&gt;aes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="n"&gt;colour&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;as_factor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;geom_point&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;labs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Power level by group count"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="n"&gt;colour&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"power_level"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hR35QkEb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/pokemon-power-levels_files/figure-gfm/plot%2520by%2520group%2520count-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hR35QkEb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/pokemon-power-levels_files/figure-gfm/plot%2520by%2520group%2520count-1.png" alt="" width="672" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Sort of? Generally we capture the pattern, but we don’t actually get it&lt;br&gt;
very right. First off, there are more than 3 evolutionary tiers in our&lt;br&gt;
engineered feature. We can also see that some Pokemon are&lt;br&gt;
classified as a &lt;code&gt;power_level&lt;/code&gt; higher than 1, but still sit in the lowest&lt;br&gt;
group in the plot. What might have caused this?&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;pokemon_group_count&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;evolution_chain_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pokemon_group_count_mistakes&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;pokemon_group_count&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;evolution_chain_id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%in%&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pokemon_group_count_mistakes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="n"&gt;generation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="n"&gt;evolution_chain_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;arrange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;evolution_chain_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;evolution_chain_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;knitr&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;kable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;name&lt;/th&gt;
&lt;th&gt;base_experience&lt;/th&gt;
&lt;th&gt;evolution_chain_id&lt;/th&gt;
&lt;th&gt;power_level&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;43&lt;/td&gt;
&lt;td&gt;Oddish&lt;/td&gt;
&lt;td&gt;64&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;44&lt;/td&gt;
&lt;td&gt;Gloom&lt;/td&gt;
&lt;td&gt;138&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;45&lt;/td&gt;
&lt;td&gt;Vileplume&lt;/td&gt;
&lt;td&gt;221&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;182&lt;/td&gt;
&lt;td&gt;Bellossom&lt;/td&gt;
&lt;td&gt;221&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;60&lt;/td&gt;
&lt;td&gt;Poliwag&lt;/td&gt;
&lt;td&gt;60&lt;/td&gt;
&lt;td&gt;26&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;61&lt;/td&gt;
&lt;td&gt;Poliwhirl&lt;/td&gt;
&lt;td&gt;135&lt;/td&gt;
&lt;td&gt;26&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;62&lt;/td&gt;
&lt;td&gt;Poliwrath&lt;/td&gt;
&lt;td&gt;230&lt;/td&gt;
&lt;td&gt;26&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;186&lt;/td&gt;
&lt;td&gt;Politoed&lt;/td&gt;
&lt;td&gt;225&lt;/td&gt;
&lt;td&gt;26&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;106&lt;/td&gt;
&lt;td&gt;Hitmonlee&lt;/td&gt;
&lt;td&gt;159&lt;/td&gt;
&lt;td&gt;47&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;107&lt;/td&gt;
&lt;td&gt;Hitmonchan&lt;/td&gt;
&lt;td&gt;159&lt;/td&gt;
&lt;td&gt;47&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;236&lt;/td&gt;
&lt;td&gt;Tyrogue&lt;/td&gt;
&lt;td&gt;42&lt;/td&gt;
&lt;td&gt;47&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;237&lt;/td&gt;
&lt;td&gt;Hitmontop&lt;/td&gt;
&lt;td&gt;159&lt;/td&gt;
&lt;td&gt;47&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;133&lt;/td&gt;
&lt;td&gt;Eevee&lt;/td&gt;
&lt;td&gt;65&lt;/td&gt;
&lt;td&gt;67&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;134&lt;/td&gt;
&lt;td&gt;Vaporeon&lt;/td&gt;
&lt;td&gt;184&lt;/td&gt;
&lt;td&gt;67&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;135&lt;/td&gt;
&lt;td&gt;Jolteon&lt;/td&gt;
&lt;td&gt;184&lt;/td&gt;
&lt;td&gt;67&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;136&lt;/td&gt;
&lt;td&gt;Flareon&lt;/td&gt;
&lt;td&gt;184&lt;/td&gt;
&lt;td&gt;67&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;196&lt;/td&gt;
&lt;td&gt;Espeon&lt;/td&gt;
&lt;td&gt;184&lt;/td&gt;
&lt;td&gt;67&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;197&lt;/td&gt;
&lt;td&gt;Umbreon&lt;/td&gt;
&lt;td&gt;184&lt;/td&gt;
&lt;td&gt;67&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;470&lt;/td&gt;
&lt;td&gt;Leafeon&lt;/td&gt;
&lt;td&gt;184&lt;/td&gt;
&lt;td&gt;67&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;471&lt;/td&gt;
&lt;td&gt;Glaceon&lt;/td&gt;
&lt;td&gt;184&lt;/td&gt;
&lt;td&gt;67&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;700&lt;/td&gt;
&lt;td&gt;Sylveon&lt;/td&gt;
&lt;td&gt;184&lt;/td&gt;
&lt;td&gt;67&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;265&lt;/td&gt;
&lt;td&gt;Wurmple&lt;/td&gt;
&lt;td&gt;56&lt;/td&gt;
&lt;td&gt;135&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;266&lt;/td&gt;
&lt;td&gt;Silcoon&lt;/td&gt;
&lt;td&gt;72&lt;/td&gt;
&lt;td&gt;135&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;267&lt;/td&gt;
&lt;td&gt;Beautifly&lt;/td&gt;
&lt;td&gt;178&lt;/td&gt;
&lt;td&gt;135&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;268&lt;/td&gt;
&lt;td&gt;Cascoon&lt;/td&gt;
&lt;td&gt;72&lt;/td&gt;
&lt;td&gt;135&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;269&lt;/td&gt;
&lt;td&gt;Dustox&lt;/td&gt;
&lt;td&gt;173&lt;/td&gt;
&lt;td&gt;135&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;280&lt;/td&gt;
&lt;td&gt;Ralts&lt;/td&gt;
&lt;td&gt;40&lt;/td&gt;
&lt;td&gt;140&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;281&lt;/td&gt;
&lt;td&gt;Kirlia&lt;/td&gt;
&lt;td&gt;97&lt;/td&gt;
&lt;td&gt;140&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;282&lt;/td&gt;
&lt;td&gt;Gardevoir&lt;/td&gt;
&lt;td&gt;233&lt;/td&gt;
&lt;td&gt;140&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;475&lt;/td&gt;
&lt;td&gt;Gallade&lt;/td&gt;
&lt;td&gt;233&lt;/td&gt;
&lt;td&gt;140&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;789&lt;/td&gt;
&lt;td&gt;Cosmog&lt;/td&gt;
&lt;td&gt;40&lt;/td&gt;
&lt;td&gt;413&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;790&lt;/td&gt;
&lt;td&gt;Cosmoem&lt;/td&gt;
&lt;td&gt;140&lt;/td&gt;
&lt;td&gt;413&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;791&lt;/td&gt;
&lt;td&gt;Solgaleo&lt;/td&gt;
&lt;td&gt;306&lt;/td&gt;
&lt;td&gt;413&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;792&lt;/td&gt;
&lt;td&gt;Lunala&lt;/td&gt;
&lt;td&gt;306&lt;/td&gt;
&lt;td&gt;413&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;So there are some clear problems with this approach. In Gen 1 we had&lt;br&gt;
some branching evolution with the Eevee family. Not only was this family&lt;br&gt;
expanded in multiple generations to eventually 8 variations, but we also&lt;br&gt;
saw more branching evolution trees as well. We also got ‘baby’ Pokemon:&lt;br&gt;
Pokemon that are actually precursors to other Pokemon, but are listed&lt;br&gt;
later in the Pokedex.&lt;/p&gt;
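
&lt;p&gt;As a rough sketch of why counting rows goes wrong, here is the Tyrogue family typed out by hand from the table above. The &lt;code&gt;tyrogue_chain&lt;/code&gt; name is made up for this illustration and is not part of the original analysis. Tyrogue is the baby of the family, but it sits third in &lt;code&gt;id&lt;/code&gt; order, so &lt;code&gt;row_number()&lt;/code&gt; hands it &lt;code&gt;power_level&lt;/code&gt; 3 even though it has the lowest &lt;code&gt;base_experience&lt;/code&gt;.&lt;/p&gt;

```r
library(dplyr)

# Hand-typed copy of four rows from the table above (illustration only).
tibble(
  id = c(106, 107, 236, 237),
  name = c("Hitmonlee", "Hitmonchan", "Tyrogue", "Hitmontop"),
  base_experience = c(159, 159, 42, 159),
  evolution_chain_id = 47
) %>%
  group_by(evolution_chain_id) %>%
  # row_number() only reflects row order inside the chain,
  # not which Pokemon evolves into which.
  mutate(power_level = row_number()) %>%
  ungroup() -> tyrogue_chain

# Tyrogue lands at position 3 despite being the pre-evolution.
tyrogue_chain %>%
  filter(name == "Tyrogue") %>%
  pull(power_level)
```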

&lt;p&gt;Luckily there is another variable we can use, that should be a whole lot&lt;br&gt;
better.&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;code&gt;evolves_from_species&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Each Pokemon that evolves from another Pokemon stores the &lt;code&gt;id&lt;/code&gt; of the&lt;br&gt;
Pokemon it evolves from as the value in the &lt;code&gt;evolves_from_species_id&lt;/code&gt;&lt;br&gt;
variable. Maybe we can use that to break up the Pokemon into their&lt;br&gt;
‘power levels’.&lt;/p&gt;
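
&lt;p&gt;To make the idea concrete, here is a minimal sketch on a hand-typed toy table of the Oddish family. The &lt;code&gt;id&lt;/code&gt; values come from the table above. The &lt;code&gt;evolves_from_species_id&lt;/code&gt; values are filled in by hand, on the assumption that the real data stores them the same way. The names &lt;code&gt;toy&lt;/code&gt; and &lt;code&gt;toy_levels&lt;/code&gt; are hypothetical, and the loop assumes a chain is never more than 3 stages long.&lt;/p&gt;

```r
library(dplyr)

# Toy rows typed by hand; the real `pokemon` table is not used here.
tibble(
  id = c(43, 44, 45, 182),
  name = c("Oddish", "Gloom", "Vileplume", "Bellossom"),
  evolves_from_species_id = c(NA, 43, 44, 44)
) -> toy

# Pokemon with no parent start at level 1, everything else is unknown.
toy %>%
  mutate(power_level = if_else(is.na(evolves_from_species_id), 1, NA_real_)) -> toy_levels

# Two passes are enough for chains of up to 3 stages (an assumption).
for (step in 1:2) {
  toy_levels %>%
    mutate(
      # Look up the level of the Pokemon each row evolves from.
      parent_level = power_level[match(evolves_from_species_id, id)],
      # Keep levels already known, otherwise use parent level + 1.
      power_level = coalesce(power_level, parent_level + 1)
    ) %>%
    select(-parent_level) -> toy_levels
}

toy_levels
```

&lt;p&gt;On this toy table both Vileplume and Bellossom end up at level 3, because the level follows the parent link rather than the row order.&lt;/p&gt;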

&lt;blockquote&gt;
&lt;p&gt;The following code is not my best work. I spent some time on a&lt;br&gt;
more recursive strategy, but it was honestly miles more confusing. For&lt;br&gt;
the purposes of a silly example for a blog, I think this is&lt;br&gt;
preferable.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;pokemon&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;mutate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;case_when&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;is.na&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;evolves_from_species_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pokemon_1&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;pokemon&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;mutate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;case_when&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;evolves_from_species_id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%in%&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pokemon_1&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pokemon_2&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;pokemon&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;mutate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;case_when&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;evolves_from_species_id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%in%&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pokemon_2&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pokemon_3&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;bind_rows&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pokemon_1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pokemon_2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pokemon_3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;arrange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;mutate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;as_factor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pokemon_evolves_from&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;pokemon_evolves_from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;ggplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;aes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;colour&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;geom_point&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;labs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Power level by evolves_from"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--CTooGERX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/pokemon-power-levels_files/figure-gfm/plot%2520by%2520evolves_from-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--CTooGERX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/pokemon-power-levels_files/figure-gfm/plot%2520by%2520evolves_from-1.png" alt="" width="672" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Hmm, that’s actually worse? Let’s focus on the Pokemon labelled&lt;br&gt;
&lt;code&gt;power_level&lt;/code&gt; 1 that are up where we would expect level 3 Pokemon to be.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;pokemon_evolves_from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;knitr&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;kable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;name&lt;/th&gt;
&lt;th&gt;base_experience&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;144&lt;/td&gt;
&lt;td&gt;Articuno&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;145&lt;/td&gt;
&lt;td&gt;Zapdos&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;146&lt;/td&gt;
&lt;td&gt;Moltres&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;150&lt;/td&gt;
&lt;td&gt;Mewtwo&lt;/td&gt;
&lt;td&gt;306&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;151&lt;/td&gt;
&lt;td&gt;Mew&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;243&lt;/td&gt;
&lt;td&gt;Raikou&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;244&lt;/td&gt;
&lt;td&gt;Entei&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;245&lt;/td&gt;
&lt;td&gt;Suicune&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;249&lt;/td&gt;
&lt;td&gt;Lugia&lt;/td&gt;
&lt;td&gt;306&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;250&lt;/td&gt;
&lt;td&gt;Ho-Oh&lt;/td&gt;
&lt;td&gt;306&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;251&lt;/td&gt;
&lt;td&gt;Celebi&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;377&lt;/td&gt;
&lt;td&gt;Regirock&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;378&lt;/td&gt;
&lt;td&gt;Regice&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;379&lt;/td&gt;
&lt;td&gt;Registeel&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;380&lt;/td&gt;
&lt;td&gt;Latias&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;381&lt;/td&gt;
&lt;td&gt;Latios&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;382&lt;/td&gt;
&lt;td&gt;Kyogre&lt;/td&gt;
&lt;td&gt;302&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;383&lt;/td&gt;
&lt;td&gt;Groudon&lt;/td&gt;
&lt;td&gt;302&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;384&lt;/td&gt;
&lt;td&gt;Rayquaza&lt;/td&gt;
&lt;td&gt;306&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;385&lt;/td&gt;
&lt;td&gt;Jirachi&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;386&lt;/td&gt;
&lt;td&gt;Deoxys&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;480&lt;/td&gt;
&lt;td&gt;Uxie&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;481&lt;/td&gt;
&lt;td&gt;Mesprit&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;482&lt;/td&gt;
&lt;td&gt;Azelf&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;483&lt;/td&gt;
&lt;td&gt;Dialga&lt;/td&gt;
&lt;td&gt;306&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;484&lt;/td&gt;
&lt;td&gt;Palkia&lt;/td&gt;
&lt;td&gt;306&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;485&lt;/td&gt;
&lt;td&gt;Heatran&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;486&lt;/td&gt;
&lt;td&gt;Regigigas&lt;/td&gt;
&lt;td&gt;302&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;487&lt;/td&gt;
&lt;td&gt;Giratina&lt;/td&gt;
&lt;td&gt;306&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;488&lt;/td&gt;
&lt;td&gt;Cresselia&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;489&lt;/td&gt;
&lt;td&gt;Phione&lt;/td&gt;
&lt;td&gt;216&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;490&lt;/td&gt;
&lt;td&gt;Manaphy&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;491&lt;/td&gt;
&lt;td&gt;Darkrai&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;492&lt;/td&gt;
&lt;td&gt;Shaymin&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;493&lt;/td&gt;
&lt;td&gt;Arceus&lt;/td&gt;
&lt;td&gt;324&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;494&lt;/td&gt;
&lt;td&gt;Victini&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;531&lt;/td&gt;
&lt;td&gt;Audino&lt;/td&gt;
&lt;td&gt;390&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;638&lt;/td&gt;
&lt;td&gt;Cobalion&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;639&lt;/td&gt;
&lt;td&gt;Terrakion&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;640&lt;/td&gt;
&lt;td&gt;Virizion&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;641&lt;/td&gt;
&lt;td&gt;Tornadus&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;642&lt;/td&gt;
&lt;td&gt;Thundurus&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;643&lt;/td&gt;
&lt;td&gt;Reshiram&lt;/td&gt;
&lt;td&gt;306&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;644&lt;/td&gt;
&lt;td&gt;Zekrom&lt;/td&gt;
&lt;td&gt;306&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;645&lt;/td&gt;
&lt;td&gt;Landorus&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;646&lt;/td&gt;
&lt;td&gt;Kyurem&lt;/td&gt;
&lt;td&gt;297&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;647&lt;/td&gt;
&lt;td&gt;Keldeo&lt;/td&gt;
&lt;td&gt;261&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;648&lt;/td&gt;
&lt;td&gt;Meloetta&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;649&lt;/td&gt;
&lt;td&gt;Genesect&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;716&lt;/td&gt;
&lt;td&gt;Xerneas&lt;/td&gt;
&lt;td&gt;306&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;717&lt;/td&gt;
&lt;td&gt;Yveltal&lt;/td&gt;
&lt;td&gt;306&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;718&lt;/td&gt;
&lt;td&gt;Zygarde&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;719&lt;/td&gt;
&lt;td&gt;Diancie&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;720&lt;/td&gt;
&lt;td&gt;Hoopa&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;721&lt;/td&gt;
&lt;td&gt;Volcanion&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;785&lt;/td&gt;
&lt;td&gt;Tapu Koko&lt;/td&gt;
&lt;td&gt;257&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;786&lt;/td&gt;
&lt;td&gt;Tapu Lele&lt;/td&gt;
&lt;td&gt;257&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;787&lt;/td&gt;
&lt;td&gt;Tapu Bulu&lt;/td&gt;
&lt;td&gt;257&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;788&lt;/td&gt;
&lt;td&gt;Tapu Fini&lt;/td&gt;
&lt;td&gt;257&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;793&lt;/td&gt;
&lt;td&gt;Nihilego&lt;/td&gt;
&lt;td&gt;257&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;794&lt;/td&gt;
&lt;td&gt;Buzzwole&lt;/td&gt;
&lt;td&gt;257&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;795&lt;/td&gt;
&lt;td&gt;Pheromosa&lt;/td&gt;
&lt;td&gt;257&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;796&lt;/td&gt;
&lt;td&gt;Xurkitree&lt;/td&gt;
&lt;td&gt;257&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;797&lt;/td&gt;
&lt;td&gt;Celesteela&lt;/td&gt;
&lt;td&gt;257&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;798&lt;/td&gt;
&lt;td&gt;Kartana&lt;/td&gt;
&lt;td&gt;257&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;799&lt;/td&gt;
&lt;td&gt;Guzzlord&lt;/td&gt;
&lt;td&gt;257&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;800&lt;/td&gt;
&lt;td&gt;Necrozma&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;801&lt;/td&gt;
&lt;td&gt;Magearna&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;802&lt;/td&gt;
&lt;td&gt;Marshadow&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;805&lt;/td&gt;
&lt;td&gt;Stakataka&lt;/td&gt;
&lt;td&gt;257&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;806&lt;/td&gt;
&lt;td&gt;Blacephalon&lt;/td&gt;
&lt;td&gt;257&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;807&lt;/td&gt;
&lt;td&gt;Zeraora&lt;/td&gt;
&lt;td&gt;270&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;So these Pokemon are nearly all ‘Legendary’. They are big end-game&lt;br&gt;
Pokemon, with real rarity in game. They also don’t evolve from, or to,&lt;br&gt;
anything, so our rule doesn’t classify them effectively.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;pokemon_evolves_from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;power_level&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;knitr&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;kable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;name&lt;/th&gt;
&lt;th&gt;base_experience&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;83&lt;/td&gt;
&lt;td&gt;Farfetch’d&lt;/td&gt;
&lt;td&gt;132&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;115&lt;/td&gt;
&lt;td&gt;Kangaskhan&lt;/td&gt;
&lt;td&gt;172&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;127&lt;/td&gt;
&lt;td&gt;Pinsir&lt;/td&gt;
&lt;td&gt;175&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;128&lt;/td&gt;
&lt;td&gt;Tauros&lt;/td&gt;
&lt;td&gt;172&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;131&lt;/td&gt;
&lt;td&gt;Lapras&lt;/td&gt;
&lt;td&gt;187&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;132&lt;/td&gt;
&lt;td&gt;Ditto&lt;/td&gt;
&lt;td&gt;101&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;142&lt;/td&gt;
&lt;td&gt;Aerodactyl&lt;/td&gt;
&lt;td&gt;180&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;201&lt;/td&gt;
&lt;td&gt;Unown&lt;/td&gt;
&lt;td&gt;118&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;203&lt;/td&gt;
&lt;td&gt;Girafarig&lt;/td&gt;
&lt;td&gt;159&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;206&lt;/td&gt;
&lt;td&gt;Dunsparce&lt;/td&gt;
&lt;td&gt;145&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;213&lt;/td&gt;
&lt;td&gt;Shuckle&lt;/td&gt;
&lt;td&gt;177&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;214&lt;/td&gt;
&lt;td&gt;Heracross&lt;/td&gt;
&lt;td&gt;175&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;222&lt;/td&gt;
&lt;td&gt;Corsola&lt;/td&gt;
&lt;td&gt;144&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;225&lt;/td&gt;
&lt;td&gt;Delibird&lt;/td&gt;
&lt;td&gt;116&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;227&lt;/td&gt;
&lt;td&gt;Skarmory&lt;/td&gt;
&lt;td&gt;163&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;234&lt;/td&gt;
&lt;td&gt;Stantler&lt;/td&gt;
&lt;td&gt;163&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;241&lt;/td&gt;
&lt;td&gt;Miltank&lt;/td&gt;
&lt;td&gt;172&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;302&lt;/td&gt;
&lt;td&gt;Sableye&lt;/td&gt;
&lt;td&gt;133&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;303&lt;/td&gt;
&lt;td&gt;Mawile&lt;/td&gt;
&lt;td&gt;133&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;311&lt;/td&gt;
&lt;td&gt;Plusle&lt;/td&gt;
&lt;td&gt;142&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;312&lt;/td&gt;
&lt;td&gt;Minun&lt;/td&gt;
&lt;td&gt;142&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;313&lt;/td&gt;
&lt;td&gt;Volbeat&lt;/td&gt;
&lt;td&gt;151&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;314&lt;/td&gt;
&lt;td&gt;Illumise&lt;/td&gt;
&lt;td&gt;151&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;324&lt;/td&gt;
&lt;td&gt;Torkoal&lt;/td&gt;
&lt;td&gt;165&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;327&lt;/td&gt;
&lt;td&gt;Spinda&lt;/td&gt;
&lt;td&gt;126&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;335&lt;/td&gt;
&lt;td&gt;Zangoose&lt;/td&gt;
&lt;td&gt;160&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;336&lt;/td&gt;
&lt;td&gt;Seviper&lt;/td&gt;
&lt;td&gt;160&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;337&lt;/td&gt;
&lt;td&gt;Lunatone&lt;/td&gt;
&lt;td&gt;161&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;338&lt;/td&gt;
&lt;td&gt;Solrock&lt;/td&gt;
&lt;td&gt;161&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;351&lt;/td&gt;
&lt;td&gt;Castform&lt;/td&gt;
&lt;td&gt;147&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;352&lt;/td&gt;
&lt;td&gt;Kecleon&lt;/td&gt;
&lt;td&gt;154&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;357&lt;/td&gt;
&lt;td&gt;Tropius&lt;/td&gt;
&lt;td&gt;161&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;359&lt;/td&gt;
&lt;td&gt;Absol&lt;/td&gt;
&lt;td&gt;163&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;369&lt;/td&gt;
&lt;td&gt;Relicanth&lt;/td&gt;
&lt;td&gt;170&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;370&lt;/td&gt;
&lt;td&gt;Luvdisc&lt;/td&gt;
&lt;td&gt;116&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;417&lt;/td&gt;
&lt;td&gt;Pachirisu&lt;/td&gt;
&lt;td&gt;142&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;440&lt;/td&gt;
&lt;td&gt;Happiny&lt;/td&gt;
&lt;td&gt;110&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;441&lt;/td&gt;
&lt;td&gt;Chatot&lt;/td&gt;
&lt;td&gt;144&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;442&lt;/td&gt;
&lt;td&gt;Spiritomb&lt;/td&gt;
&lt;td&gt;170&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;455&lt;/td&gt;
&lt;td&gt;Carnivine&lt;/td&gt;
&lt;td&gt;159&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;479&lt;/td&gt;
&lt;td&gt;Rotom&lt;/td&gt;
&lt;td&gt;154&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;538&lt;/td&gt;
&lt;td&gt;Throh&lt;/td&gt;
&lt;td&gt;163&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;539&lt;/td&gt;
&lt;td&gt;Sawk&lt;/td&gt;
&lt;td&gt;163&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;550&lt;/td&gt;
&lt;td&gt;Basculin&lt;/td&gt;
&lt;td&gt;161&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;556&lt;/td&gt;
&lt;td&gt;Maractus&lt;/td&gt;
&lt;td&gt;161&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;561&lt;/td&gt;
&lt;td&gt;Sigilyph&lt;/td&gt;
&lt;td&gt;172&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;587&lt;/td&gt;
&lt;td&gt;Emolga&lt;/td&gt;
&lt;td&gt;150&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;594&lt;/td&gt;
&lt;td&gt;Alomomola&lt;/td&gt;
&lt;td&gt;165&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;615&lt;/td&gt;
&lt;td&gt;Cryogonal&lt;/td&gt;
&lt;td&gt;180&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;618&lt;/td&gt;
&lt;td&gt;Stunfisk&lt;/td&gt;
&lt;td&gt;165&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;621&lt;/td&gt;
&lt;td&gt;Druddigon&lt;/td&gt;
&lt;td&gt;170&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;626&lt;/td&gt;
&lt;td&gt;Bouffalant&lt;/td&gt;
&lt;td&gt;172&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;631&lt;/td&gt;
&lt;td&gt;Heatmor&lt;/td&gt;
&lt;td&gt;169&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;632&lt;/td&gt;
&lt;td&gt;Durant&lt;/td&gt;
&lt;td&gt;169&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;676&lt;/td&gt;
&lt;td&gt;Furfrou&lt;/td&gt;
&lt;td&gt;165&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;701&lt;/td&gt;
&lt;td&gt;Hawlucha&lt;/td&gt;
&lt;td&gt;175&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;702&lt;/td&gt;
&lt;td&gt;Dedenne&lt;/td&gt;
&lt;td&gt;151&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;707&lt;/td&gt;
&lt;td&gt;Klefki&lt;/td&gt;
&lt;td&gt;165&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;741&lt;/td&gt;
&lt;td&gt;Oricorio&lt;/td&gt;
&lt;td&gt;167&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;764&lt;/td&gt;
&lt;td&gt;Comfey&lt;/td&gt;
&lt;td&gt;170&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;765&lt;/td&gt;
&lt;td&gt;Oranguru&lt;/td&gt;
&lt;td&gt;172&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;766&lt;/td&gt;
&lt;td&gt;Passimian&lt;/td&gt;
&lt;td&gt;172&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;771&lt;/td&gt;
&lt;td&gt;Pyukumuku&lt;/td&gt;
&lt;td&gt;144&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;772&lt;/td&gt;
&lt;td&gt;Type: Null&lt;/td&gt;
&lt;td&gt;107&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;774&lt;/td&gt;
&lt;td&gt;Minior&lt;/td&gt;
&lt;td&gt;154&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;775&lt;/td&gt;
&lt;td&gt;Komala&lt;/td&gt;
&lt;td&gt;168&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;776&lt;/td&gt;
&lt;td&gt;Turtonator&lt;/td&gt;
&lt;td&gt;170&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;777&lt;/td&gt;
&lt;td&gt;Togedemaru&lt;/td&gt;
&lt;td&gt;152&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;778&lt;/td&gt;
&lt;td&gt;Mimikyu&lt;/td&gt;
&lt;td&gt;167&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;779&lt;/td&gt;
&lt;td&gt;Bruxish&lt;/td&gt;
&lt;td&gt;166&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;780&lt;/td&gt;
&lt;td&gt;Drampa&lt;/td&gt;
&lt;td&gt;170&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;781&lt;/td&gt;
&lt;td&gt;Dhelmise&lt;/td&gt;
&lt;td&gt;181&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;803&lt;/td&gt;
&lt;td&gt;Poipole&lt;/td&gt;
&lt;td&gt;189&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These Pokemon are not end game, but they either have very short&lt;br&gt;
evolution trees (2 Pokemon long), or no evolution tree at all.&lt;/p&gt;

&lt;p&gt;So our ‘group count’ process doesn’t work well, and neither does our&lt;br&gt;
‘&lt;code&gt;evolves_from_species&lt;/code&gt;’ process. We’re going to have to learn some&lt;br&gt;
new moves.&lt;/p&gt;
&lt;h2&gt;
  
  
  TM01 (i.e. Tidy Models 01)
&lt;/h2&gt;

&lt;p&gt;Clustering is the process of using machine learning to derive a&lt;br&gt;
categorical variable from data. The simplest form of clustering that&lt;br&gt;
seems relevant to our problem is k-means. Seeing as we have a pretty&lt;br&gt;
good intuition that 3 groups &lt;em&gt;implicitly&lt;/em&gt; exist in this data, and a&lt;br&gt;
clear &lt;em&gt;visualisation&lt;/em&gt; supporting us, let’s cut straight to asking R for 3&lt;br&gt;
groups. k-means clustering aims to divide the data into a &lt;em&gt;known&lt;/em&gt; number&lt;br&gt;
of groups, which is set in the &lt;code&gt;centers&lt;/code&gt; argument, and doesn’t require&lt;br&gt;
any data to be fed to it as examples of what makes up a ‘group’. That&lt;br&gt;
sentence might be a little confusing, as we do obviously give it some&lt;br&gt;
data. What we &lt;em&gt;don’t&lt;/em&gt; give it is a training data set of Pokemon labelled&lt;br&gt;
with the group they are supposed to be in, e.g. Squirtle is in group 1,&lt;br&gt;
Wartortle is in group 2, Blastoise is in group 3. Nor do we then hand it&lt;br&gt;
unlabelled data to classify, e.g. “What group is Charmeleon in?”.&lt;br&gt;
Instead, k-means will &lt;em&gt;figure&lt;br&gt;
out&lt;/em&gt; how to split the continuous variable &lt;code&gt;base_experience&lt;/code&gt; into 3&lt;br&gt;
groups.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;set.seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;68&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;pokemon&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;kmeans&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;centers&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;augment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pokemon&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pokemon_cluster&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  How kmeans works
&lt;/h3&gt;

&lt;p&gt;First, we set centroids to be at random positions in the data. To make&lt;br&gt;
sure this doesn’t affect consistency in my article I’ve used &lt;code&gt;set.seed&lt;/code&gt;&lt;br&gt;
so k-means starts looking for the centres of our clusters from the same&lt;br&gt;
position each time. A ‘centroid’ can be seen as a ‘centre point’ for&lt;br&gt;
each cluster. We have the same number of centroids set as the value set&lt;br&gt;
in the &lt;code&gt;centers&lt;/code&gt; argument. Each data point is then assigned to its&lt;br&gt;
closest centroid.&lt;/p&gt;
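
&lt;p&gt;As a rough sketch of that first step, here is the assignment done by hand.&lt;br&gt;
The values are made up to stand in for &lt;code&gt;base_experience&lt;/code&gt;, and the&lt;br&gt;
variable names are only for illustration. It assumes the starting centroids are&lt;br&gt;
drawn at random from the observations themselves.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;# Toy data standing in for base_experience (values are made up)
base_experience &amp;lt;- c(40, 60, 65, 140, 150, 170, 260, 270, 300)

set.seed(68)
k &amp;lt;- 3

# Pick k observations at random to act as the starting centroids
centroids &amp;lt;- sample(base_experience, k)

# Distance from every observation (rows) to every centroid (columns)
distances &amp;lt;- abs(outer(base_experience, centroids, "-"))

# Each observation is assigned to the index of its nearest centroid
assignment &amp;lt;- apply(distances, 1, which.min)

data.frame(base_experience, assignment)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;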

&lt;blockquote&gt;
&lt;p&gt;The Sum of Squared Errors (SSE) from the centroids is then used as an&lt;br&gt;
&lt;em&gt;objective function&lt;/em&gt; towards a &lt;em&gt;local minimum&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is the core concept of how k-means calculates a solution. The Sum&lt;br&gt;
of Squared Errors is calculated like this:&lt;/p&gt;

&lt;p&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;∑i=1n(xi−xˉ)2
  \sum^n_{i=1}(x_i-\bar{x})^2
&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mop op-limits"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;i&lt;/span&gt;&lt;span class="mrel mtight"&gt;=&lt;/span&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="mop op-symbol large-op"&gt;∑&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;n&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;x&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;i&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord accent"&gt;&lt;span class="vlist-t"&gt;&lt;span 
class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;x&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="accent-body"&gt;&lt;span class="mord"&gt;ˉ&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;



&lt;p&gt;What this means is for each group, the distance from the centroid to&lt;br&gt;
each observation is measured, and then squared. Then each of those&lt;br&gt;
squared distances, one per observation in the group, is totalled.&lt;/p&gt;
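
&lt;p&gt;To make that concrete, here is a small worked example for a single&lt;br&gt;
cluster. The three values are invented for illustration, and the cluster mean&lt;br&gt;
is used as the centroid.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;# One hypothetical cluster of base_experience values (made up)
cluster_values &amp;lt;- c(140, 150, 160)

# Use the mean as the centroid; with one variable the distance is just the difference
centroid &amp;lt;- mean(cluster_values)
squared_distances &amp;lt;- (cluster_values - centroid)^2

# Total the squared distances to get this cluster's SSE
sse &amp;lt;- sum(squared_distances)
sse  # 200
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;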

&lt;p&gt;The centroid is then moved to the &lt;em&gt;average&lt;/em&gt; value of its group. Because&lt;br&gt;
the centroids are now no longer in the same position as when the SSE was&lt;br&gt;
calculated, the SSE is now recalculated for all observations against each of&lt;br&gt;
the &lt;em&gt;new&lt;/em&gt; centroid positions. This means that some observations are now&lt;br&gt;
closer to a &lt;em&gt;different&lt;/em&gt; centroid, and so get assigned to a &lt;em&gt;different&lt;/em&gt;&lt;br&gt;
cluster.&lt;/p&gt;

&lt;p&gt;Then the centroids are moved &lt;em&gt;again&lt;/em&gt; to the &lt;em&gt;new&lt;/em&gt; average of the &lt;em&gt;new&lt;/em&gt;&lt;br&gt;
cluster. Each cluster then gets &lt;em&gt;new distances&lt;/em&gt; calculated. This will go&lt;br&gt;
on until termination, when each centroid is in the average position of&lt;br&gt;
the cluster, and each observation in the cluster is closest to the&lt;br&gt;
&lt;em&gt;centroid&lt;/em&gt; it is currently assigned to.&lt;/p&gt;
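
&lt;p&gt;The loop described above can be sketched in a few lines of R. This is a&lt;br&gt;
simplified illustration and not what &lt;code&gt;kmeans&lt;/code&gt; runs internally:&lt;br&gt;
&lt;code&gt;simple_kmeans&lt;/code&gt; is a hypothetical helper, the data is made up, and the&lt;br&gt;
starting centroids are fixed by hand so the result is repeatable.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;# Simplified k-means on a single variable (illustrative sketch only)
simple_kmeans &amp;lt;- function(x, centroids, max_iter = 100) {
  assignment &amp;lt;- rep(0L, length(x))
  for (i in seq_len(max_iter)) {
    # Assign each observation to its nearest centroid
    distances &amp;lt;- abs(outer(x, centroids, "-"))
    new_assignment &amp;lt;- apply(distances, 1, which.min)
    # Stop once no observation changes cluster
    if (all(new_assignment == assignment)) break
    assignment &amp;lt;- new_assignment
    # Move each centroid to the mean of its current cluster
    for (j in seq_along(centroids)) {
      if (any(assignment == j)) centroids[j] &amp;lt;- mean(x[assignment == j])
    }
  }
  list(cluster = assignment, centers = centroids)
}

# Toy data standing in for base_experience (values are made up)
x &amp;lt;- c(40, 60, 65, 140, 150, 170, 260, 270, 300)
result &amp;lt;- simple_kmeans(x, centroids = c(40, 150, 300))
result$cluster  # cluster number for each observation
result$centers  # final centroid positions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;R’s own &lt;code&gt;kmeans&lt;/code&gt; defaults to a different variant of the algorithm&lt;br&gt;
(Hartigan-Wong), so treat this as a way to follow the idea, not as a&lt;br&gt;
replacement.&lt;/p&gt;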

&lt;p&gt;That’s a slightly wordy description of a complicated process. I&lt;br&gt;
recommend that you have a look at the &lt;a href="https://www.tidymodels.org/learn/statistics/k-means/"&gt;k-means explanation in tidy&lt;br&gt;
models&lt;/a&gt; to really&lt;br&gt;
cement the concept. It also contains &lt;em&gt;the most adorable animation of a&lt;br&gt;
statistical concept in existence&lt;/em&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  How to use it
&lt;/h3&gt;

&lt;p&gt;Here, I’ve simply selected the one column of data, &lt;code&gt;base_experience&lt;/code&gt;,&lt;br&gt;
and piped it into &lt;code&gt;kmeans&lt;/code&gt;, which is part of base R. This returns a&lt;br&gt;
super un-tidy, list-like object of class &lt;code&gt;"kmeans"&lt;/code&gt;. However, with&lt;br&gt;
&lt;code&gt;tidymodels&lt;/code&gt; we can easily make it usable.&lt;/p&gt;
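
&lt;p&gt;To see what that un-tidy object looks like, here is a small sketch using&lt;br&gt;
only base R. The &lt;code&gt;toy&lt;/code&gt; data frame and its values are made up for&lt;br&gt;
illustration; the components inspected are ones that &lt;code&gt;kmeans&lt;/code&gt; returns.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;# Toy data standing in for base_experience (values are made up)
toy &amp;lt;- data.frame(base_experience = c(40, 60, 65, 140, 150, 170, 260, 270, 300))

set.seed(68)
fit &amp;lt;- kmeans(toy, centers = 3)

class(fit)        # "kmeans"
names(fit)        # the list components, including cluster, centers and withinss
fit$cluster       # one cluster number per row, in the same order as the input
fit$centers       # the final centroid of each cluster
fit$tot.withinss  # total within-cluster sum of squares

# The cluster vector lines up with the input rows, so it can be bound back as a column
toy$.cluster &amp;lt;- factor(fit$cluster)
toy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;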

&lt;p&gt;&lt;code&gt;augment&lt;/code&gt; helps us to match the output of the &lt;code&gt;kmeans&lt;/code&gt; function back to&lt;br&gt;
our data for easier processing. The output of &lt;code&gt;kmeans&lt;/code&gt; doesn’t actually&lt;br&gt;
contain any of the other information about our data; it was only given&lt;br&gt;
one column, remember? &lt;code&gt;augment&lt;/code&gt; goes through the return of &lt;code&gt;kmeans&lt;/code&gt;,&lt;br&gt;
finds the relevant bit, and matches it neatly back to our original data&lt;br&gt;
ready for plotting in one step!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;pokemon_cluster&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;ggplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;aes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;colour&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;.cluster&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;geom_point&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;labs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Power level by kmeans clustering"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jOuAZh-H--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/pokemon-power-levels_files/figure-gfm/plot%2520cluster-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jOuAZh-H--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/pokemon-power-levels_files/figure-gfm/plot%2520cluster-1.png" alt="" width="672" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is a &lt;em&gt;lot&lt;/em&gt; better. It gives us really clear groups,&lt;br&gt;
exactly where we expected them. It’s also tonnes simpler code!&lt;/p&gt;

&lt;p&gt;Let’s check some of our boundary positions, just to make sure it makes&lt;br&gt;
sense.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;pokemon_cluster&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;.cluster&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;knitr&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;kable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;name&lt;/th&gt;
&lt;th&gt;base_experience&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;132&lt;/td&gt;
&lt;td&gt;Ditto&lt;/td&gt;
&lt;td&gt;101&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;440&lt;/td&gt;
&lt;td&gt;Happiny&lt;/td&gt;
&lt;td&gt;110&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;699&lt;/td&gt;
&lt;td&gt;Aurorus&lt;/td&gt;
&lt;td&gt;104&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;762&lt;/td&gt;
&lt;td&gt;Steenee&lt;/td&gt;
&lt;td&gt;102&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;772&lt;/td&gt;
&lt;td&gt;Type: Null&lt;/td&gt;
&lt;td&gt;107&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;pokemon_cluster&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;.cluster&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;base_experience&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;knitr&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;kable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;name&lt;/th&gt;
&lt;th&gt;base_experience&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;189&lt;/td&gt;
&lt;td&gt;Jumpluff&lt;/td&gt;
&lt;td&gt;207&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Ditto and Type: Null are just plain weird Pokemon due to their&lt;br&gt;
abilities messing with their type. Happiny is a baby Pokemon, but from a&lt;br&gt;
family with an &lt;em&gt;insane&lt;/em&gt; base_experience. Jumpluff is an awkward edge&lt;br&gt;
case. Technically the 3rd evolution, it’s still got an &lt;em&gt;extremely&lt;/em&gt; low&lt;br&gt;
base_experience. Potentially this is for game balancing as they are&lt;br&gt;
relatively regularly encountered? Aurorus and Steenee I do not have a&lt;br&gt;
good hypothesis for.&lt;/p&gt;

&lt;p&gt;Generally I think that this is a pretty good solution. There are maybe a&lt;br&gt;
few edge cases that are open to interpretation, but that’s just what we&lt;br&gt;
get sometimes with machine learning. Lacking a labelled training data&lt;br&gt;
set, we can’t compute a confusion matrix or ROC-curve.&lt;/p&gt;
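
&lt;p&gt;Without labels, one rough sanity check is to summarise the range of&lt;br&gt;
&lt;code&gt;base_experience&lt;/code&gt; inside each cluster and see how close neighbouring&lt;br&gt;
clusters get. This is only a minimal sketch on made-up data: &lt;code&gt;toy_pokemon&lt;/code&gt;&lt;br&gt;
and its values are hypothetical stand-ins for &lt;code&gt;pokemon_cluster&lt;/code&gt;, and&lt;br&gt;
&lt;code&gt;nstart = 10&lt;/code&gt; is an arbitrary choice.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;library(dplyr)

# Hypothetical toy data standing in for pokemon_cluster
toy_pokemon &amp;lt;- tibble(
  name = c("a", "b", "c", "d", "e", "f"),
  base_experience = c(50, 60, 140, 150, 240, 250)
)

set.seed(42)
# nstart = 10 reruns kmeans from several random starts and keeps the best fit
fit &amp;lt;- kmeans(toy_pokemon$base_experience, centers = 3, nstart = 10)

# One row per cluster: how many members, and the lowest and highest value
cluster_ranges &amp;lt;- toy_pokemon %&amp;gt;%
  mutate(.cluster = factor(fit$cluster)) %&amp;gt;%
  group_by(.cluster) %&amp;gt;%
  summarise(
    n = n(),
    min_experience = min(base_experience),
    max_experience = max(base_experience)
  ) %&amp;gt;%
  arrange(min_experience)

cluster_ranges
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The Pokemon sitting right at a cluster’s minimum or maximum are the ones&lt;br&gt;
worth eyeballing by hand, which is what the two filters above do.&lt;/p&gt;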

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;We’ve found a situation in the ‘real’ world where we know from context&lt;br&gt;
there is a categorical relationship, but from the data available it’s&lt;br&gt;
not possible to classify that precisely. However we can create this&lt;br&gt;
categorisation using machine learning! Even better, we can use the&lt;br&gt;
&lt;code&gt;tidymodels&lt;/code&gt; package to help us do it quickly and cleanly.&lt;/p&gt;

</description>
      <category>rstats</category>
      <category>datascience</category>
      <category>videogame</category>
      <category>pokemon</category>
    </item>
    <item>
      <title>The Missingno Experiment and Multiple Form Pokemon</title>
      <dc:creator>Dave Parr</dc:creator>
      <pubDate>Wed, 01 Jul 2020 17:42:42 +0000</pubDate>
      <link>https://forem.com/daveparr/the-missingno-experiment-and-multiple-form-pokemon-26ga</link>
      <guid>https://forem.com/daveparr/the-missingno-experiment-and-multiple-form-pokemon-26ga</guid>
      <description>&lt;h2&gt;
  
  
  Wild missingno appeared!
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--1DZmFXrU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://media2.giphy.com/media/G7rSYPWEeTjY4/giphy.gif%3Fcid%3Decf05e47f8faf1991ba53b479305d68e25326d15db3d6769%26rid%3Dgiphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--1DZmFXrU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://media2.giphy.com/media/G7rSYPWEeTjY4/giphy.gif%3Fcid%3Decf05e47f8faf1991ba53b479305d68e25326d15db3d6769%26rid%3Dgiphy.gif" alt="Battle entry animation of a ‘wild missingno appeared’ from pokemon red/blue" width="480" height="360"&gt;&lt;/a&gt;&lt;br&gt;
Missingno is the patron Pokemon of data science. You’re just casually&lt;br&gt;
surfing up and down your data, doing some sweet coding, when suddenly a&lt;br&gt;
bunch of missing and corrupted data gets in your way, and you end up&lt;br&gt;
with a bunch of random items in your bag for no reason. OK, well maybe I&lt;br&gt;
just have a messy bag.&lt;/p&gt;

&lt;p&gt;The valuable part of this metaphor is the part where you battle&lt;br&gt;
Missingno, and win. I’ve been doing this with my Pokedex project&lt;br&gt;
recently, to try and iron out what data I can rely on from my data&lt;br&gt;
source, and what’s a bit patchy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pokedex&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tidyverse&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;naniar&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skimr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
  Go, Skimr!
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;skimr&lt;/code&gt; gives us a text-based summary view. As well as the basics on data&lt;br&gt;
set size, it also shows us some summary statistics, but most valuably it&lt;br&gt;
describes how many values are missing, and in which columns.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;pokemon&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;skimr&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;skim&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Name&lt;/td&gt;
&lt;td&gt;Piped data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Number of rows&lt;/td&gt;
&lt;td&gt;807&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Number of columns&lt;/td&gt;
&lt;td&gt;24&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;_______________________&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Column type frequency:&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;character&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;list&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;numeric&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;________________________&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Group variables&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Data summary&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Variable type: character&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;skim_variable&lt;/th&gt;
&lt;th&gt;n_missing&lt;/th&gt;
&lt;th&gt;complete_rate&lt;/th&gt;
&lt;th&gt;min&lt;/th&gt;
&lt;th&gt;max&lt;/th&gt;
&lt;th&gt;empty&lt;/th&gt;
&lt;th&gt;n_unique&lt;/th&gt;
&lt;th&gt;whitespace&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;identifier&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;807&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;type_1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;type_2&lt;/td&gt;
&lt;td&gt;402&lt;/td&gt;
&lt;td&gt;0.50&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;name&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;807&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;genus&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;589&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;color&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;0.98&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;shape&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;0.98&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;habitat&lt;/td&gt;
&lt;td&gt;422&lt;/td&gt;
&lt;td&gt;0.48&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Variable type: list&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;skim_variable&lt;/th&gt;
&lt;th&gt;n_missing&lt;/th&gt;
&lt;th&gt;complete_rate&lt;/th&gt;
&lt;th&gt;n_unique&lt;/th&gt;
&lt;th&gt;min_length&lt;/th&gt;
&lt;th&gt;max_length&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;flavour_text&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;807&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Variable type: numeric&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;skim_variable&lt;/th&gt;
&lt;th&gt;n_missing&lt;/th&gt;
&lt;th&gt;complete_rate&lt;/th&gt;
&lt;th&gt;mean&lt;/th&gt;
&lt;th&gt;sd&lt;/th&gt;
&lt;th&gt;p0&lt;/th&gt;
&lt;th&gt;p25&lt;/th&gt;
&lt;th&gt;p50&lt;/th&gt;
&lt;th&gt;p75&lt;/th&gt;
&lt;th&gt;p100&lt;/th&gt;
&lt;th&gt;hist&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;id&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;404.00&lt;/td&gt;
&lt;td&gt;233.11&lt;/td&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;202.5&lt;/td&gt;
&lt;td&gt;404&lt;/td&gt;
&lt;td&gt;605.5&lt;/td&gt;
&lt;td&gt;807.0&lt;/td&gt;
&lt;td&gt;▇▇▇▇▇&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;species_id&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;404.00&lt;/td&gt;
&lt;td&gt;233.11&lt;/td&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;202.5&lt;/td&gt;
&lt;td&gt;404&lt;/td&gt;
&lt;td&gt;605.5&lt;/td&gt;
&lt;td&gt;807.0&lt;/td&gt;
&lt;td&gt;▇▇▇▇▇&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;height&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;1.16&lt;/td&gt;
&lt;td&gt;1.08&lt;/td&gt;
&lt;td&gt;0.1&lt;/td&gt;
&lt;td&gt;0.6&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.5&lt;/td&gt;
&lt;td&gt;14.5&lt;/td&gt;
&lt;td&gt;▇▁▁▁▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;weight&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;61.77&lt;/td&gt;
&lt;td&gt;111.52&lt;/td&gt;
&lt;td&gt;0.1&lt;/td&gt;
&lt;td&gt;9.0&lt;/td&gt;
&lt;td&gt;27&lt;/td&gt;
&lt;td&gt;63.0&lt;/td&gt;
&lt;td&gt;999.9&lt;/td&gt;
&lt;td&gt;▇▁▁▁▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;base_experience&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;144.85&lt;/td&gt;
&lt;td&gt;74.95&lt;/td&gt;
&lt;td&gt;36.0&lt;/td&gt;
&lt;td&gt;66.0&lt;/td&gt;
&lt;td&gt;151&lt;/td&gt;
&lt;td&gt;179.5&lt;/td&gt;
&lt;td&gt;608.0&lt;/td&gt;
&lt;td&gt;▇▇▁▁▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;is_default&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;0.00&lt;/td&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;▁▁▇▁▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;hp&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;68.75&lt;/td&gt;
&lt;td&gt;26.03&lt;/td&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;50.0&lt;/td&gt;
&lt;td&gt;65&lt;/td&gt;
&lt;td&gt;80.0&lt;/td&gt;
&lt;td&gt;255.0&lt;/td&gt;
&lt;td&gt;▃▇▁▁▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;attack&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;76.09&lt;/td&gt;
&lt;td&gt;29.54&lt;/td&gt;
&lt;td&gt;5.0&lt;/td&gt;
&lt;td&gt;55.0&lt;/td&gt;
&lt;td&gt;75&lt;/td&gt;
&lt;td&gt;95.0&lt;/td&gt;
&lt;td&gt;181.0&lt;/td&gt;
&lt;td&gt;▂▇▆▂▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;defense&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;71.73&lt;/td&gt;
&lt;td&gt;29.73&lt;/td&gt;
&lt;td&gt;5.0&lt;/td&gt;
&lt;td&gt;50.0&lt;/td&gt;
&lt;td&gt;67&lt;/td&gt;
&lt;td&gt;89.0&lt;/td&gt;
&lt;td&gt;230.0&lt;/td&gt;
&lt;td&gt;▅▇▂▁▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;special_attack&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;69.49&lt;/td&gt;
&lt;td&gt;29.44&lt;/td&gt;
&lt;td&gt;10.0&lt;/td&gt;
&lt;td&gt;45.0&lt;/td&gt;
&lt;td&gt;65&lt;/td&gt;
&lt;td&gt;90.0&lt;/td&gt;
&lt;td&gt;173.0&lt;/td&gt;
&lt;td&gt;▃▇▅▂▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;special_defense&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;70.01&lt;/td&gt;
&lt;td&gt;27.29&lt;/td&gt;
&lt;td&gt;20.0&lt;/td&gt;
&lt;td&gt;50.0&lt;/td&gt;
&lt;td&gt;65&lt;/td&gt;
&lt;td&gt;85.0&lt;/td&gt;
&lt;td&gt;230.0&lt;/td&gt;
&lt;td&gt;▇▇▂▁▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;speed&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;td&gt;65.83&lt;/td&gt;
&lt;td&gt;27.74&lt;/td&gt;
&lt;td&gt;5.0&lt;/td&gt;
&lt;td&gt;45.0&lt;/td&gt;
&lt;td&gt;65&lt;/td&gt;
&lt;td&gt;85.0&lt;/td&gt;
&lt;td&gt;160.0&lt;/td&gt;
&lt;td&gt;▃▇▆▂▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;generation_id&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;0.98&lt;/td&gt;
&lt;td&gt;3.67&lt;/td&gt;
&lt;td&gt;1.94&lt;/td&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;2.0&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;5.0&lt;/td&gt;
&lt;td&gt;7.0&lt;/td&gt;
&lt;td&gt;▇▅▃▅▅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;evolves_from_species_id&lt;/td&gt;
&lt;td&gt;426&lt;/td&gt;
&lt;td&gt;0.47&lt;/td&gt;
&lt;td&gt;364.35&lt;/td&gt;
&lt;td&gt;232.43&lt;/td&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;156.0&lt;/td&gt;
&lt;td&gt;345&lt;/td&gt;
&lt;td&gt;570.0&lt;/td&gt;
&lt;td&gt;803.0&lt;/td&gt;
&lt;td&gt;▇▆▅▆▅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;evolution_chain_id&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;0.98&lt;/td&gt;
&lt;td&gt;195.96&lt;/td&gt;
&lt;td&gt;124.57&lt;/td&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;84.0&lt;/td&gt;
&lt;td&gt;187&lt;/td&gt;
&lt;td&gt;303.0&lt;/td&gt;
&lt;td&gt;427.0&lt;/td&gt;
&lt;td&gt;▇▆▅▆▅&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;I was expecting some missing data in &lt;code&gt;type_2&lt;/code&gt; and&lt;br&gt;
&lt;code&gt;evolves_from_species_id&lt;/code&gt;, but I wasn’t expecting only half of &lt;code&gt;habitat&lt;/code&gt;&lt;br&gt;
to be there. Either I broke something in my data pipeline, or the data&lt;br&gt;
wasn’t there to begin with. &lt;code&gt;color&lt;/code&gt;, &lt;code&gt;shape&lt;/code&gt;, &lt;code&gt;generation_id&lt;/code&gt; and&lt;br&gt;
&lt;code&gt;evolution_chain_id&lt;/code&gt; are all missing 20 entries each, which is a bit of&lt;br&gt;
a coincidence. I wonder if they are all missing from the same Pokemon?&lt;/p&gt;
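
&lt;p&gt;One quick way to check that kind of hunch without a plot is to count, for&lt;br&gt;
each row, how many of the suspect columns are missing. If the gaps all sit&lt;br&gt;
in the same rows, every row comes out as either zero or all of them. A&lt;br&gt;
minimal base R sketch on made-up data (&lt;code&gt;toy_pokemon&lt;/code&gt;, &lt;code&gt;suspect_cols&lt;/code&gt;&lt;br&gt;
and the values are hypothetical, only the column names mirror the real table):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;# Hypothetical toy data reusing some of the real column names
toy_pokemon &amp;lt;- data.frame(
  name = c("a", "b", "c", "d"),
  color = c("red", NA, "blue", NA),
  shape = c("ball", NA, "wings", NA),
  generation_id = c(1, NA, 2, NA),
  habitat = c(NA, NA, "cave", "sea")
)

# Missing values per column, the same count skim() reports as n_missing
n_missing &amp;lt;- colSums(is.na(toy_pokemon))

# How many of the suspect columns are missing in each row
suspect_cols &amp;lt;- c("color", "shape", "generation_id")
missing_per_row &amp;lt;- rowSums(is.na(toy_pokemon[suspect_cols]))

# Only 0 and length(suspect_cols) appear if the gaps line up row by row
table(missing_per_row)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;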
&lt;h2&gt;
  
  
  Visdat I choose you!
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;visdat&lt;/code&gt; is a package that helps you visualise missing data and data&lt;br&gt;
types.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;visdat&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vis_dat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pokemon&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--id1o8y8k--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/missingno-experiment_files/figure-gfm/visdat-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--id1o8y8k--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/missingno-experiment_files/figure-gfm/visdat-1.png" alt="" width="672" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This clearly shows us the data types in each column, and where values&lt;br&gt;
are missing in context. It looks like &lt;code&gt;habitat&lt;/code&gt; might just not be&lt;br&gt;
available after a certain point. It also looks like &lt;code&gt;color&lt;/code&gt;, &lt;code&gt;shape&lt;/code&gt;,&lt;br&gt;
&lt;code&gt;generation_id&lt;/code&gt; and &lt;code&gt;evolution_chain_id&lt;/code&gt; are maybe all&lt;br&gt;
missing from the same individual Pokemon.&lt;/p&gt;
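
&lt;p&gt;If the data types are a distraction, &lt;code&gt;visdat&lt;/code&gt; also has &lt;code&gt;vis_miss()&lt;/code&gt;,&lt;br&gt;
which only draws missing versus present. Its &lt;code&gt;cluster = TRUE&lt;/code&gt; argument&lt;br&gt;
reorders the rows so that rows sharing a missingness pattern sit together,&lt;br&gt;
which should make co-missing columns easier to spot. A small sketch, again&lt;br&gt;
on a hypothetical &lt;code&gt;toy_pokemon&lt;/code&gt; data frame rather than the real data:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;library(visdat)

# Hypothetical toy data: color and shape are missing in the same rows
toy_pokemon &amp;lt;- data.frame(
  color = c("red", NA, "blue", NA, "green"),
  shape = c("ball", NA, "wings", NA, "fish"),
  habitat = c(NA, "cave", NA, NA, "sea")
)

# cluster = TRUE groups rows with the same missingness pattern together
missing_plot &amp;lt;- vis_miss(toy_pokemon, cluster = TRUE)
missing_plot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;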
&lt;h2&gt;
  
  
  Go, Naniar!
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;naniar&lt;/code&gt; uses plots to help us check where missing values in one&lt;br&gt;
variable relate to other variables. Let’s first check if&lt;br&gt;
there is a relationship between &lt;code&gt;generation_id&lt;/code&gt; and &lt;code&gt;evolution_chain_id&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;pokemon&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;ggplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;aes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;evolution_chain_id&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;geom_miss_point&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--NlTYD-qm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/missingno-experiment_files/figure-gfm/naniar_missing_all-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--NlTYD-qm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/missingno-experiment_files/figure-gfm/naniar_missing_all-1.png" alt="" width="672" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This plot might need a little explanation. For the blue &lt;code&gt;Not Missing&lt;/code&gt;&lt;br&gt;
values, this is a normal &lt;code&gt;geom_point()&lt;/code&gt;. However, where the values are&lt;br&gt;
marked as &lt;code&gt;Missing&lt;/code&gt; (pink) they are deliberately moved to just below the lowest value&lt;br&gt;
on the &lt;em&gt;axis they are missing values for&lt;/em&gt;, then ‘jittered’, to&lt;br&gt;
avoid over-plotting. The little cluster in a line at the far bottom left&lt;br&gt;
shows that for &lt;em&gt;all&lt;/em&gt; rows where &lt;code&gt;evolution_chain_id&lt;/code&gt; is missing,&lt;br&gt;
&lt;code&gt;generation_id&lt;/code&gt; is also missing. Let’s have a look at the&lt;br&gt;
&lt;code&gt;evolves_from_species_id&lt;/code&gt; variable just to help us understand.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;pokemon&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;ggplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;aes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;evolves_from_species_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;generation_id&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;geom_miss_point&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7WNigYlw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/missingno-experiment_files/figure-gfm/naniar_missing_half-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7WNigYlw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/missingno-experiment_files/figure-gfm/naniar_missing_half-1.png" alt="" width="672" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This shows that in every game generation (Red/Blue, X/Y, etc.)&lt;br&gt;
there are Pokemon that have an &lt;code&gt;evolves_from_species_id&lt;/code&gt;, i.e. they have&lt;br&gt;
a precursor Pokemon, and that there are also Pokemon that &lt;em&gt;don’t&lt;/em&gt; have a&lt;br&gt;
precursor. Just what we see in the games. It also shows that some Pokemon have&lt;br&gt;
neither a &lt;code&gt;generation_id&lt;/code&gt; nor an &lt;code&gt;evolves_from_species_id&lt;/code&gt;.&lt;/p&gt;
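
&lt;p&gt;The same reading can be written down as a table of counts rather than&lt;br&gt;
points. The sketch below uses a hypothetical &lt;code&gt;toy_pokemon&lt;/code&gt; tibble with&lt;br&gt;
invented values, and &lt;code&gt;has_precursor&lt;/code&gt; is a helper column I made up for&lt;br&gt;
the illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;library(dplyr)

# Hypothetical toy data: two rows have no generation_id at all
toy_pokemon &amp;lt;- tibble(
  name = c("a", "b", "c", "d", "e"),
  generation_id = c(1, 1, 2, NA, NA),
  evolves_from_species_id = c(NA, 1, NA, NA, NA)
)

# Count Pokemon with and without a precursor in each generation.
# count() keeps NA as its own group, so rows missing generation_id show up too.
precursor_counts &amp;lt;- toy_pokemon %&amp;gt;%
  mutate(has_precursor = !is.na(evolves_from_species_id)) %&amp;gt;%
  count(generation_id, has_precursor)

precursor_counts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;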
&lt;h2&gt;
  
  
  Who is that Pokemon?
&lt;/h2&gt;

&lt;p&gt;Now that we know the characteristics of the missing data we are interested&lt;br&gt;
in, we can pull those Pokemon out easily, especially with the newly released&lt;br&gt;
&lt;a href="https://dplyr.tidyverse.org/articles/colwise.html"&gt;&lt;code&gt;across()&lt;/code&gt; function&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;missing_cols&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;c&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"color"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"shape"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"generation_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"evolves_from_species_id"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;pokemon&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;across&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;missing_cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;~&lt;/span&gt;&lt;span class="nf"&gt;is.na&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;.x&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;identifier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;missing_cols&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;missing_pokes&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;missing_pokes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;knitr&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;kable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;name&lt;/th&gt;
&lt;th&gt;identifier&lt;/th&gt;
&lt;th&gt;color&lt;/th&gt;
&lt;th&gt;shape&lt;/th&gt;
&lt;th&gt;generation_id&lt;/th&gt;
&lt;th&gt;evolves_from_species_id&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Deoxys&lt;/td&gt;
&lt;td&gt;deoxys-normal&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Wormadam&lt;/td&gt;
&lt;td&gt;wormadam-plant&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Giratina&lt;/td&gt;
&lt;td&gt;giratina-altered&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shaymin&lt;/td&gt;
&lt;td&gt;shaymin-land&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Basculin&lt;/td&gt;
&lt;td&gt;basculin-red-striped&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Darmanitan&lt;/td&gt;
&lt;td&gt;darmanitan-standard&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tornadus&lt;/td&gt;
&lt;td&gt;tornadus-incarnate&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Thundurus&lt;/td&gt;
&lt;td&gt;thundurus-incarnate&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Landorus&lt;/td&gt;
&lt;td&gt;landorus-incarnate&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Keldeo&lt;/td&gt;
&lt;td&gt;keldeo-ordinary&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Meloetta&lt;/td&gt;
&lt;td&gt;meloetta-aria&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Meowstic&lt;/td&gt;
&lt;td&gt;meowstic-male&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aegislash&lt;/td&gt;
&lt;td&gt;aegislash-shield&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pumpkaboo&lt;/td&gt;
&lt;td&gt;pumpkaboo-average&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gourgeist&lt;/td&gt;
&lt;td&gt;gourgeist-average&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Oricorio&lt;/td&gt;
&lt;td&gt;oricorio-baile&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lycanroc&lt;/td&gt;
&lt;td&gt;lycanroc-midday&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Wishiwashi&lt;/td&gt;
&lt;td&gt;wishiwashi-solo&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Minior&lt;/td&gt;
&lt;td&gt;minior-red-meteor&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mimikyu&lt;/td&gt;
&lt;td&gt;mimikyu-disguised&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
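
&lt;p&gt;A note if you run this on a newer &lt;code&gt;dplyr&lt;/code&gt;: as far as I know, using&lt;br&gt;
&lt;code&gt;across()&lt;/code&gt; inside &lt;code&gt;filter()&lt;/code&gt; has since been deprecated in favour of&lt;br&gt;
&lt;code&gt;if_all()&lt;/code&gt; and &lt;code&gt;if_any()&lt;/code&gt;, and a character vector of column names is&lt;br&gt;
meant to be wrapped in &lt;code&gt;all_of()&lt;/code&gt;. A sketch of the same filter in that&lt;br&gt;
style, on a hypothetical &lt;code&gt;toy_pokemon&lt;/code&gt; tibble with invented values:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;library(dplyr)

# Hypothetical toy data: only "b" is missing both columns
toy_pokemon &amp;lt;- tibble(
  name = c("a", "b", "c"),
  color = c("red", NA, NA),
  shape = c("ball", NA, "wings")
)

missing_cols &amp;lt;- c("color", "shape")

# Keep rows where every column named in missing_cols is NA
missing_rows &amp;lt;- toy_pokemon %&amp;gt;%
  filter(if_all(all_of(missing_cols), is.na)) %&amp;gt;%
  select(name, all_of(missing_cols))

missing_rows
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;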

&lt;p&gt;So it looks like in the current version of the package, these Pokemon&lt;br&gt;
all have ‘complex’ identifiers. This is because these Pokemon all have&lt;br&gt;
different forms. Some vary by colour, like&lt;br&gt;
&lt;a href="https://bulbapedia.bulbagarden.net/wiki/List_of_Pok%C3%A9mon_with_form_differences#Basculin"&gt;Basculin&lt;/a&gt;,&lt;br&gt;
which can be Red or Blue striped, others have ability transformations,&lt;br&gt;
like&lt;br&gt;
&lt;a href="https://bulbapedia.bulbagarden.net/wiki/List_of_Pok%C3%A9mon_with_form_differences#Aegislash"&gt;Aegislash&lt;/a&gt;,&lt;br&gt;
and others depend on which game they were caught in, like&lt;br&gt;
&lt;a href="https://bulbapedia.bulbagarden.net/wiki/List_of_Pok%C3%A9mon_with_form_differences#Deoxys"&gt;Deoxys&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;missing_pokes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;stringr&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;str_to_lower&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;.&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;missing_pokes_name_list&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="n"&gt;pokedex&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;pokemon_species&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="n"&gt;stringr&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;str_to_lower&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;identifier&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%in%&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;missing_pokes_name_list&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;identifier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;generation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;evolves_from_species_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;shape_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;color_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;%&amp;gt;%&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="n"&gt;knitr&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;kable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;identifier&lt;/th&gt;
&lt;th&gt;generation_id&lt;/th&gt;
&lt;th&gt;evolves_from_species_id&lt;/th&gt;
&lt;th&gt;shape_id&lt;/th&gt;
&lt;th&gt;color_id&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;deoxys&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;wormadam&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;412&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;giratina&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;shaymin&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;basculin&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;darmanitan&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;554&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;tornadus&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;thundurus&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;landorus&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;keldeo&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;meloetta&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;meowstic&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;677&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;aegislash&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;680&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;pumpkaboo&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gourgeist&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;710&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;oricorio&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;lycanroc&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;744&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;wishiwashi&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;minior&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;mimikyu&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;NA&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If we go back to the raw source data, we can see that the data is&lt;br&gt;
actually there for most cases. It just didn’t join properly because, in&lt;br&gt;
the source data, these Pokemon are identified by the simple name, in lower case,&lt;br&gt;
and in &lt;a href="https://github.com/DaveParr/pokedex/blob/ebe078c291ffa4eb757d09e0641553de63c5a530/data-raw/pokemon.R#L59-L61"&gt;this version of the&lt;br&gt;
package&lt;/a&gt;&lt;br&gt;
this data is joined on &lt;code&gt;id&lt;/code&gt; AND the column that actually has the complex&lt;br&gt;
name. Also, because shape and color link &lt;em&gt;through&lt;/em&gt; this data, they are&lt;br&gt;
missed as well!&lt;/p&gt;
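&lt;p&gt;To make the failure concrete, here is a minimal sketch of that kind of join. The two tiny tables are made up for illustration and are not the package’s real build code: the names and generations are borrowed from the tables above, and the ids are assumed to be the usual National Pokedex numbers. Joining on &lt;code&gt;id&lt;/code&gt; AND the name finds no match, so the species columns come back as &lt;code&gt;NA&lt;/code&gt;, while joining on &lt;code&gt;id&lt;/code&gt; alone matches every row.&lt;/p&gt;

```r
library(dplyr)

# Hypothetical stand-in for the pokemon table, which uses the 'complex' names
tibble::tibble(
  id = c(386L, 681L),
  identifier = c("deoxys-normal", "aegislash-shield")
) -> pokemon

# Hypothetical stand-in for the species table, which uses the simple names
tibble::tibble(
  id = c(386L, 681L),
  identifier = c("deoxys", "aegislash"),
  generation_id = c(3L, 6L)
) -> species

# Joining on id AND identifier: the names never match, so generation_id is NA
left_join(pokemon, species, by = c("id", "identifier")) -> joined_on_both

# Joining on id alone: each row finds its species record.
# The suffix keeps the species name in a separate identifier_species column.
left_join(pokemon, species, by = "id", suffix = c("", "_species")) -> joined_on_id

print(joined_on_both)
print(joined_on_id)
```

&lt;p&gt;Joining on &lt;code&gt;id&lt;/code&gt; alone is only one possible way out, and the actual fix in the package may well look different.&lt;/p&gt;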
&lt;h2&gt;
  
  
  You defeated wild missingno!
&lt;/h2&gt;

&lt;p&gt;This is all based on my Pokedex R data package, which I’m just about to&lt;br&gt;
fix :)&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/DaveParr"&gt;
        DaveParr
      &lt;/a&gt; / &lt;a href="https://github.com/DaveParr/pokedex"&gt;
        pokedex
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      an R data package for pokemon
    &lt;/h3&gt;
  &lt;/div&gt;
&lt;/div&gt;



</description>
      <category>rstats</category>
      <category>datascience</category>
      <category>pokemon</category>
    </item>
    <item>
      <title>Why did I make this dev.to API wrapper?</title>
      <dc:creator>Dave Parr</dc:creator>
      <pubDate>Tue, 30 Jun 2020 14:26:12 +0000</pubDate>
      <link>https://forem.com/daveparr/why-did-i-even-bother-making-this-dev-to-api-wrapper-2nk2</link>
      <guid>https://forem.com/daveparr/why-did-i-even-bother-making-this-dev-to-api-wrapper-2nk2</guid>
      <description>&lt;p&gt;&lt;code&gt;dev.to.ol&lt;/code&gt; is 0.0.1!&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vJ70wriM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/github-logo-ba8488d21cd8ee1fee097b8410db9deaa41d0ca30b004c0c63de0a479114156f.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/DaveParr"&gt;
        DaveParr
      &lt;/a&gt; / &lt;a href="https://github.com/DaveParr/dev.to.ol"&gt;
        dev.to.ol
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      dev.to.ol helps R users publish to dev.to
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
&lt;a href="https://dev.to/daveparr" rel="nofollow"&gt;
&lt;img src="https://camo.githubusercontent.com/1f3c6413af566c3bdc34d592cb5f299bf014242798daf4854b3c531ad522b904/68747470733a2f2f6432666c746978307632653073622e636c6f756466726f6e742e6e65742f6465762d62616467652e737667" alt="Dave Parr's DEV Profile" height="30" width="30"&gt;
&lt;/a&gt;.to.ol
&lt;/h1&gt;
&lt;p&gt;&lt;a href="https://www.tidyverse.org/lifecycle/#maturing" rel="nofollow"&gt;&lt;img src="https://camo.githubusercontent.com/ae2f538d678a8e76c5493d870c59fbf928b14906e41227a07af5bbf3566b5068/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c6966656379636c652d6d61747572696e672d626c75652e737667" alt="Lifecycle: maturing"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The goal of &lt;code&gt;dev.to.ol&lt;/code&gt; is to help R users publish to dev.to&lt;/p&gt;
&lt;h2&gt;
Installation&lt;/h2&gt;
&lt;p&gt;You can install dev.to.ol from &lt;a href="https://raw.githubusercontent.com/DaveParr/dev.to.ol/main/www.github.com"&gt;GitHub&lt;/a&gt; with
&lt;code&gt;remotes&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-r js-code-highlight"&gt;
&lt;pre&gt;&lt;span class="pl-e"&gt;remotes&lt;/span&gt;&lt;span class="pl-k"&gt;::&lt;/span&gt;install.github(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;DaveParr/dev.to.ol&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;)&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
Workflow&lt;/h2&gt;
&lt;h3&gt;
Create your article&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;create_new_article&lt;/code&gt; function will give you the front matter
boilerplate for an article &lt;code&gt;.Rmd&lt;/code&gt; file. Optionally supplying a file name
will create a new file with the front matter at the start.&lt;/p&gt;
&lt;div class="highlight highlight-source-r js-code-highlight"&gt;
&lt;pre&gt;create_new_article(&lt;span class="pl-v"&gt;title&lt;/span&gt; &lt;span class="pl-k"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;my title&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;)&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
Write your article!&lt;/h3&gt;
&lt;p&gt;This is the fun bit. Mark your great ideas down in an &lt;code&gt;.Rmd&lt;/code&gt;!&lt;/p&gt;
&lt;h3&gt;
Post your article&lt;/h3&gt;
&lt;p&gt;Once the &lt;code&gt;.Rmd&lt;/code&gt; is written, you can post it to dev.to with
&lt;code&gt;post_new_article&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight highlight-source-r js-code-highlight"&gt;
&lt;pre&gt;post_new_article(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;./my-great-article.Rmd&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;)&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
Check your articles&lt;/h3&gt;
&lt;p&gt;There are two functions to check the posted articles on dev.to,
published and unpublished. Both will return a ‘tidy’ data set by
default.&lt;/p&gt;
&lt;div class="highlight highlight-source-r js-code-highlight"&gt;
&lt;pre&gt;get_users_articles()
&lt;span class="pl-smi"&gt;Using&lt;/span&gt; &lt;span class="pl-smi"&gt;DEVTO&lt;/span&gt; &lt;span class="pl-k"&gt;in&lt;/span&gt; &lt;span class="pl-smi"&gt;.Renviron&lt;/span&gt;
&lt;span class="pl-smi"&gt;The&lt;/span&gt; &lt;span class="pl-smi"&gt;API&lt;/span&gt; &lt;span class="pl-smi"&gt;returned&lt;/span&gt; &lt;span class="pl-smi"&gt;the&lt;/span&gt; &lt;span class="pl-smi"&gt;expected&lt;/span&gt; &lt;span class="pl-smi"&gt;success&lt;/span&gt;&lt;/pre&gt;…
&lt;/div&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/DaveParr/dev.to.ol"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  What’s in the box?
&lt;/h2&gt;

&lt;p&gt;dev.to.ol has the minimum set of viable functions I think it needs to&lt;br&gt;
operate, and (just) enough testing to keep it stable. The API is&lt;br&gt;
starting to come together and there seems to be at least 1 other person&lt;br&gt;
who cares enough about this project to talk about it to someone else, so&lt;br&gt;
I thought it might be handy to get a little more together about things.&lt;br&gt;
I also wanted to make up a celebration to share a little about why I&lt;br&gt;
made it and what dev.to means to me.&lt;/p&gt;
&lt;h2&gt;
  
  
  Are there other similar boxes?
&lt;/h2&gt;

&lt;p&gt;R is in many ways a literate programming language. Markdown is pretty&lt;br&gt;
baked into most of our ecosystem through &lt;a href="https://rmarkdown.rstudio.com/"&gt;R&lt;br&gt;
Markdown&lt;/a&gt;, as is&lt;br&gt;
&lt;a href="https://en.wikipedia.org/wiki/LaTeX_"&gt;LaTeX&lt;/a&gt;. I think because of this,&lt;br&gt;
and the frequent usage in academic publishing, R users have developed&lt;br&gt;
tooling for blogging pretty heavily. We have&lt;br&gt;
&lt;a href="https://bookdown.org/yihui/blogdown/"&gt;blogdown&lt;/a&gt; and&lt;br&gt;
&lt;a href="https://rstudio.github.io/distill/"&gt;distill&lt;/a&gt; which are becoming very&lt;br&gt;
widely used and feature rich, and &lt;a class="comment-mentioned-user" href="https://dev.to/maelle"&gt;@maelle&lt;/a&gt;
 has written a pretty&lt;br&gt;
comprehensive guide to all the different approaches in the &lt;a href="https://scientific-rmd-blogging.netlify.app/"&gt;Scientific&lt;br&gt;
Blogging with R Markdown&lt;br&gt;
course&lt;/a&gt; including her own&lt;br&gt;
solution for WordPress,&lt;br&gt;
&lt;a href="https://github.com/maelle/goodpress"&gt;goodpress&lt;/a&gt;. As Data Scientists&lt;br&gt;
(emphasis on the &lt;em&gt;science&lt;/em&gt;), we need to share our work reproducibly, and&lt;br&gt;
in a way that encourages understanding and review.&lt;/p&gt;
&lt;h2&gt;
  
  
  So why have another box?
&lt;/h2&gt;

&lt;p&gt;I was looking for something to keep me sane while I was furloughed.&lt;/p&gt;

&lt;p&gt;Also, I had poked around dev.to about a year before, and was impressed&lt;br&gt;
by what I took to be its unofficial mission statement of “We aren’t&lt;br&gt;
Medium, we’re what Medium should have been”.&lt;/p&gt;


&lt;div class="ltag__link"&gt;
  &lt;a href="/ben" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bgwIhvJ3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://res.cloudinary.com/practicaldev/image/fetch/s--1M1qt9Sp--/c_fill%2Cf_auto%2Cfl_progressive%2Ch_150%2Cq_auto%2Cw_150/https://dev-to-uploads.s3.amazonaws.com/uploads/user/profile_image/1/f451a206-11c8-4e3d-8936-143d0a7e65bb.png" alt="ben image"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="/devteam/medium-was-never-meant-to-be-a-part-of-the-developer-ecosystem-25a0" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Medium Was Never Meant to Be a Part of the Developer Ecosystem&lt;/h2&gt;
      &lt;h3&gt;Ben Halpern ・ Jun  3 '19 ・ 5 min read&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#meta&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


&lt;p&gt;I’m sure there is a lot more to it than that, especially with the recent&lt;br&gt;
&lt;a href="https://dev.to/devteam/for-empowering-community-2k6h"&gt;Forem&lt;/a&gt;&lt;br&gt;
announcement, but personally, that was the hook which really&lt;br&gt;
stuck with me.&lt;/p&gt;

&lt;p&gt;However, at the time the built-in editor was “fine”. Fine enough, but&lt;br&gt;
not great. I’m also used to R Notebooks, where I write my code next to&lt;br&gt;
my prose, and compile the whole thing in one. It seemed valuable to&lt;br&gt;
bring that ability to my dev.to posts. I also was aware of all the&lt;br&gt;
alternatives above, but dev.to offered 2 things that were different to&lt;br&gt;
the above options:&lt;/p&gt;
&lt;h3&gt;
  
  
  Hosted and managed service
&lt;/h3&gt;

&lt;p&gt;Don’t get me wrong, I have nothing but love for JAMstack and static&lt;br&gt;
sites in general. I &lt;a href="https://github.com/satRdays/satRday_site_template"&gt;maintain a template for making them for SatRdays&lt;br&gt;
conferences&lt;/a&gt;.&lt;br&gt;
However, I don’t love the process of doing it. In this case I absolutely&lt;br&gt;
believe in the cause, which is why I continue to volunteer on the&lt;br&gt;
project. However, the process of interacting with static sites doesn’t&lt;br&gt;
actually make me &lt;em&gt;happy&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;I was looking for the simplest way I could casually blog, which ideally&lt;br&gt;
required no extra work beyond writing articles (to start with).&lt;/p&gt;
&lt;h3&gt;
  
  
  Community
&lt;/h3&gt;

&lt;p&gt;I wanted at least a few people to read my articles, and I wanted to read&lt;br&gt;
theirs. I’m sure I could have accomplished this with more work in SEO,&lt;br&gt;
cultivating a social media network and all that jazz, but again: I just&lt;br&gt;
wanted to write some stuff that some people might read, and read their&lt;br&gt;
stuff. dev.to had a discovery feed, it had active users, it had&lt;br&gt;
community. This was just what I was looking for. I also kind of hate&lt;br&gt;
Twitter (yes, I still use it when I ‘have’ to).&lt;/p&gt;
&lt;h2&gt;
  
  
  Motivation
&lt;/h2&gt;

&lt;p&gt;Initially my workflow sucked. I wrote an &lt;code&gt;.Rmd&lt;/code&gt; in RStudio, compiled it&lt;br&gt;
to a GitHub flavoured &lt;code&gt;.md&lt;/code&gt;, copied and pasted the output into the editor,&lt;br&gt;
added the meta data, then uploaded all the images. If I found I’d made a&lt;br&gt;
mistake, I’d either do the right thing, which was to edit the &lt;code&gt;.Rmd&lt;/code&gt;,&lt;br&gt;
recompile and re-copy-pasta, or more often do the quick thing, which was to&lt;br&gt;
enter the browser editor and fix the typo.&lt;/p&gt;
&lt;h2&gt;
  
  
  Inception
&lt;/h2&gt;

&lt;p&gt;As I was chewing over the best way to make this work gooder (I am a&lt;br&gt;
programmer after all), a few things happened simultaneously.&lt;/p&gt;
&lt;h3&gt;
  
  
  Most of the R users on the site syndicate
&lt;/h3&gt;

&lt;p&gt;There are some really great R users on here already:&lt;/p&gt;


&lt;div class="ltag__user ltag__user__id__26527"&gt;
  
  
    &lt;a href="/juliasilge" class="ltag__user__link profile-image-link"&gt;
      &lt;div class="ltag__user__pic"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--x98VyAZp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://res.cloudinary.com/practicaldev/image/fetch/s--fT5UdCwE--/c_fill%2Cf_auto%2Cfl_progressive%2Ch_150%2Cq_auto%2Cw_150/https://dev-to-uploads.s3.amazonaws.com/uploads/user/profile_image/26527/05fc758d-6021-4bc7-b092-85b591d4d265.jpg" alt="juliasilge image"&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;div class="ltag__user__content"&gt;
    &lt;h2&gt;
&lt;a class="ltag__user__link" href="/juliasilge"&gt;Julia Silge&lt;/a&gt;
&lt;/h2&gt;
    &lt;div class="ltag__user__summary"&gt;
      &lt;a class="ltag__user__link" href="/juliasilge"&gt;I’m an international keynote speaker and real-world practitioner focused on data analysis and machine learning. I love making beautiful charts, the statistical programming language R, and Jane Austen.&lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;



&lt;div class="ltag__user ltag__user__id__93212"&gt;
  
  
    &lt;a href="/sckott" class="ltag__user__link profile-image-link"&gt;
      &lt;div class="ltag__user__pic"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--QOidfpvA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://res.cloudinary.com/practicaldev/image/fetch/s--tGJS-2zf--/c_fill%2Cf_auto%2Cfl_progressive%2Ch_150%2Cq_auto%2Cw_150/https://dev-to-uploads.s3.amazonaws.com/uploads/user/profile_image/93212/d17bedc3-2fbc-46b6-94c9-b20467dd5fd5.jpeg" alt="sckott image"&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;div class="ltag__user__content"&gt;
    &lt;h2&gt;
&lt;a class="ltag__user__link" href="/sckott"&gt;Scott Chamberlain&lt;/a&gt;
&lt;/h2&gt;
    &lt;div class="ltag__user__summary"&gt;
      &lt;a class="ltag__user__link" href="/sckott"&gt;co-founder of rOpenSci&lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;



&lt;div class="ltag__user ltag__user__id__133005"&gt;
  
  
    &lt;a href="/maelle" class="ltag__user__link profile-image-link"&gt;
      &lt;div class="ltag__user__pic"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--WS4Gywsu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://res.cloudinary.com/practicaldev/image/fetch/s--x4Z0X7OV--/c_fill%2Cf_auto%2Cfl_progressive%2Ch_150%2Cq_auto%2Cw_150/https://dev-to-uploads.s3.amazonaws.com/uploads/user/profile_image/133005/8b0245aa-ca9a-4fcc-aa78-9a42572165c5.jpeg" alt="maelle image"&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;div class="ltag__user__content"&gt;
    &lt;h2&gt;
&lt;a class="ltag__user__link" href="/maelle"&gt;Maëlle Salmon&lt;/a&gt;
&lt;/h2&gt;
    &lt;div class="ltag__user__summary"&gt;
      &lt;a class="ltag__user__link" href="/maelle"&gt;&lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;



&lt;div class="ltag__user ltag__user__id__21644"&gt;
  
  
    &lt;a href="/colinfay" class="ltag__user__link profile-image-link"&gt;
      &lt;div class="ltag__user__pic"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Zy4K-UMV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://res.cloudinary.com/practicaldev/image/fetch/s--XVGQPK5L--/c_fill%2Cf_auto%2Cfl_progressive%2Ch_150%2Cq_auto%2Cw_150/https://dev-to-uploads.s3.amazonaws.com/uploads/user/profile_image/21644/f3966f29-a36b-43dd-9f4a-81687d61ee2a.jpg" alt="colinfay image"&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;div class="ltag__user__content"&gt;
    &lt;h2&gt;
&lt;a class="ltag__user__link" href="/colinfay"&gt;Colin Fay&lt;/a&gt;
&lt;/h2&gt;
    &lt;div class="ltag__user__summary"&gt;
      &lt;a class="ltag__user__link" href="/colinfay"&gt;Nerd. R, JS, &amp;amp; Docker. Loves building things, loves breaking things. &lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;I found out that most of the R users here I follow are actually&lt;br&gt;
re-syndicating through RSS:&lt;/p&gt;


&lt;blockquote class="ltag__twitter-tweet"&gt;

  &lt;div class="ltag__twitter-tweet__main"&gt;
    &lt;div class="ltag__twitter-tweet__header"&gt;
      &lt;img class="ltag__twitter-tweet__profile-image" src="https://res.cloudinary.com/practicaldev/image/fetch/s--GGy17lmm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://pbs.twimg.com/profile_images/1259904402497945606/rHOGSWGD_normal.jpg" alt="Julia Silge profile image"&gt;
      &lt;div class="ltag__twitter-tweet__full-name"&gt;
        Julia Silge
      &lt;/div&gt;
      &lt;div class="ltag__twitter-tweet__username"&gt;
        &lt;a class="comment-mentioned-user" href="https://dev.to/juliasilge"&gt;@juliasilge&lt;/a&gt;

      &lt;/div&gt;
      &lt;div class="ltag__twitter-tweet__twitter-logo"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--P4t6ys1m--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/twitter-f95605061196010f91e64806688390eb1a4dbc9e913682e043eb8b1e06ca484f.svg" alt="twitter logo"&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="ltag__twitter-tweet__body"&gt;
      &lt;a href="https://twitter.com/ma_salmon"&gt;@ma_salmon&lt;/a&gt; &lt;a href="https://twitter.com/DaveParr"&gt;@DaveParr&lt;/a&gt; &lt;a href="https://twitter.com/_ColinFay"&gt;@_ColinFay&lt;/a&gt; &lt;a href="https://twitter.com/ThePracticalDev"&gt;@ThePracticalDev&lt;/a&gt; Under Settings ➡️ Publishing from RSS, you can add the RSS feed from your blog. The posts will go to your Dashboard as drafts. I do need to edit them slightly before publishing (for example, switching out for their shortcodes for a few things).
    &lt;/div&gt;
    &lt;div class="ltag__twitter-tweet__date"&gt;
      14:39 PM - 13 May 2020
    &lt;/div&gt;


    &lt;div class="ltag__twitter-tweet__actions"&gt;
      &lt;a href="https://twitter.com/intent/tweet?in_reply_to=1260580363971317765" class="ltag__twitter-tweet__actions__button"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--WwRENZp4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/twitter-reply-action-238fe0a37991706a6880ed13941c3efd6b371e4aefe288fe8e0db85250708bc4.svg" alt="Twitter reply action"&gt;
      &lt;/a&gt;
      &lt;a href="https://twitter.com/intent/retweet?tweet_id=1260580363971317765" class="ltag__twitter-tweet__actions__button"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--PFD0MJBa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/twitter-retweet-action-632c83532a4e7de573c5c08dbb090ee18b348b13e2793175fea914827bc42046.svg" alt="Twitter retweet action"&gt;
      &lt;/a&gt;
      &lt;a href="https://twitter.com/intent/like?tweet_id=1260580363971317765" class="ltag__twitter-tweet__actions__button"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6wx1BHu3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/twitter-like-action-1ea89f4b87c7d37465b0eb78d51fcb7fe6c03a089805d7ea014ba71365be5171.svg" alt="Twitter like action"&gt;
      &lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/blockquote&gt;


&lt;p&gt;That’s great for them, but I was actually trying to avoid my own site!&lt;/p&gt;

&lt;h3&gt;
  
  
  Dev.to has an API
&lt;/h3&gt;

&lt;p&gt;I was poking around older posts and found out that dev.to has an&lt;br&gt;
&lt;a href="https://docs.dev.to/api"&gt;API&lt;/a&gt;. At the time I thought I might play with&lt;br&gt;
webhooks to boot ‘saved’ articles into Pocket to work on my e-reader (at&lt;br&gt;
some point I still might). However, in this case it was kind of perfect:&lt;br&gt;
an API-driven workflow between .Rmd in RStudio and the hosted dev.to&lt;br&gt;
community. I could see it in my head, and it’s not like I was busy in&lt;br&gt;
May…&lt;/p&gt;


&lt;div class="ltag__link"&gt;
  &lt;a href="/daveparr" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7n3zyASq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://res.cloudinary.com/practicaldev/image/fetch/s--UsMW9OkR--/c_fill%2Cf_auto%2Cfl_progressive%2Ch_150%2Cq_auto%2Cw_150/https://dev-to-uploads.s3.amazonaws.com/uploads/user/profile_image/150692/22b3fd57-c859-4087-897b-f63d034fa359.jpeg" alt="daveparr image"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="/daveparr/posting-from-rmd-to-dev-to-5gld" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Posting from .Rmd to dev.to&lt;/h2&gt;
      &lt;h3&gt;Dave Parr ・ May 20 '20 ・ 4 min read&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#rstats&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#projectbenatar&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#showdev&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#markdown&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  Evolution
&lt;/h2&gt;

&lt;p&gt;After proving it ‘might’ be doable, I did some work fleshing out the&lt;br&gt;
‘best’ way to do it.&lt;/p&gt;


&lt;div class="ltag__link"&gt;
  &lt;a href="/daveparr" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7n3zyASq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://res.cloudinary.com/practicaldev/image/fetch/s--UsMW9OkR--/c_fill%2Cf_auto%2Cfl_progressive%2Ch_150%2Cq_auto%2Cw_150/https://dev-to-uploads.s3.amazonaws.com/uploads/user/profile_image/150692/22b3fd57-c859-4087-897b-f63d034fa359.jpeg" alt="daveparr image"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="/daveparr/posting-straight-from-rmd-to-dev-to-1j4p" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Posting straight from .Rmd to dev.to (for real this time)&lt;/h2&gt;
      &lt;h3&gt;Dave Parr ・ May 24 '20 ・ 4 min read&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#rstats&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#projectbenatar&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#showdev&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#meta&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


&lt;p&gt;So I started some super-basic minimum viable functions. How do I get the&lt;br&gt;
post to turn up on dev.to? How do I make the meta-data work with the&lt;br&gt;
post? What happens if I’ve published and need to correct a typo? All&lt;br&gt;
solvable, all important features, and all now implemented and tested. I got&lt;br&gt;
to learn a lot about testing API wrappers with &lt;code&gt;vcr&lt;/code&gt; thanks to &lt;a class="comment-mentioned-user" href="https://dev.to/sckott"&gt;@sckott&lt;/a&gt;
.&lt;/p&gt;


&lt;div class="ltag__link"&gt;
  &lt;a href="/daveparr" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7n3zyASq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://res.cloudinary.com/practicaldev/image/fetch/s--UsMW9OkR--/c_fill%2Cf_auto%2Cfl_progressive%2Ch_150%2Cq_auto%2Cw_150/https://dev-to-uploads.s3.amazonaws.com/uploads/user/profile_image/150692/22b3fd57-c859-4087-897b-f63d034fa359.jpeg" alt="daveparr image"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="/daveparr/testing-my-dev-to-api-package-with-testthat-webmockr-and-vcr-2dgm" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Testing my dev.to API package with testthat, webmockr and vcr&lt;/h2&gt;
      &lt;h3&gt;Dave Parr ・ Jun  3 '20 ・ 6 min read&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#rstats&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#projectbenatar&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#showdev&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#testing&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


&lt;p&gt;The functions in the package changed a bit as I went through user&lt;br&gt;
testing by using it to write my posts on dev.to that you can see here. I&lt;br&gt;
was also able to generate content pretty quick because I could write&lt;br&gt;
about developing the functions that I was testing at the same time by&lt;br&gt;
writing the articles. Virtuous cycles! (/ black holes)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GNpNkCVQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://media2.giphy.com/media/lKKXOCVviOAXS/giphy.gif%3Fcid%3Decf05e471b5c83569df22ec5aad248c62cb864b40d6efed4%26rid%3Dgiphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GNpNkCVQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://media2.giphy.com/media/lKKXOCVviOAXS/giphy.gif%3Fcid%3Decf05e471b5c83569df22ec5aad248c62cb864b40d6efed4%26rid%3Dgiphy.gif" alt="a robot in a patterned blue suit on a marble floor infinitely crawling&amp;lt;br&amp;gt;
away from a black hole eating&amp;lt;br&amp;gt;
everything"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At this point I was also starting to look for new work, and was&lt;br&gt;
wondering if I had made the choice backwards. Maybe I did need my&lt;br&gt;
own site after all. Somewhere to put my CV, and that looked a bit more&lt;br&gt;
professional, and maybe didn’t have so much clutter of articles mixed&lt;br&gt;
with my own questions and a bunch of content from other people. Then I&lt;br&gt;
discovered the @stackbit integration!&lt;/p&gt;


&lt;div class="ltag__link"&gt;
  &lt;a href="/ben" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bgwIhvJ3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://res.cloudinary.com/practicaldev/image/fetch/s--1M1qt9Sp--/c_fill%2Cf_auto%2Cfl_progressive%2Ch_150%2Cq_auto%2Cw_150/https://dev-to-uploads.s3.amazonaws.com/uploads/user/profile_image/1/f451a206-11c8-4e3d-8936-143d0a7e65bb.png" alt="ben image"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="/devteam/you-can-now-generate-self-hostable-static-blogs-right-from-your-dev-content-via-stackbit-7a5" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;You can now generate self-hostable static blogs right from your DEV content via Stackbit&lt;/h2&gt;
      &lt;h3&gt;Ben Halpern ・ Sep 26 '19 ・ 4 min read&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#meta&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#projectbenatar&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#webdev&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#changelog&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


&lt;p&gt;This meant that I could hand off all the hosting and styling and&lt;br&gt;
management and deploys to them, but still get a hugo repo which I can&lt;br&gt;
tailor how I want, such as removing the &lt;code&gt;#meta&lt;/code&gt; and &lt;code&gt;#help&lt;/code&gt; posts, which&lt;br&gt;
wouldn’t be useful to a recruiter, but also automatically put my content&lt;br&gt;
into a presentable site, with a few contextual links to things like my&lt;br&gt;
GitHub and LinkedIn under my personal &lt;a href="//daveparr.info"&gt;daveparr.info&lt;/a&gt;&lt;br&gt;
domain.&lt;/p&gt;


&lt;div class="ltag__link"&gt;
  &lt;a href="/daveparr" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7n3zyASq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://res.cloudinary.com/practicaldev/image/fetch/s--UsMW9OkR--/c_fill%2Cf_auto%2Cfl_progressive%2Ch_150%2Cq_auto%2Cw_150/https://dev-to-uploads.s3.amazonaws.com/uploads/user/profile_image/150692/22b3fd57-c859-4087-897b-f63d034fa359.jpeg" alt="daveparr image"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="/daveparr/i-made-my-dev-to-content-into-a-website-to-find-a-new-job-2kn5" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;I made my dev.to content into a website to find a new job&lt;/h2&gt;
      &lt;h3&gt;Dave Parr ・ May 25 '20 ・ 2 min read&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#projectbenatar&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#showdev&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#hugo&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#stackbit&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  Future
&lt;/h2&gt;

&lt;p&gt;The primary future goal is finding a good way to reference images in the&lt;br&gt;
article in a way that dev.to can use. It’s likely that this will end up&lt;br&gt;
being GitHub itself. Additionally, better testing is probably on the&lt;br&gt;
cards as I go. The functions and package API have stabilised enough now&lt;br&gt;
that this can be comprehensive, and getting some CI/CD would be nice&lt;br&gt;
too. I’m also planning on working on smarter ways to run analytics on&lt;br&gt;
the data you can get back from the API about your posts. Maybe even an&lt;br&gt;
inbuilt shiny app?&lt;/p&gt;

&lt;p&gt;I hope some others of you might find the package useful, and maybe this&lt;br&gt;
might motivate you to share the work you might already be doing in R, or&lt;br&gt;
even pick up R as a new language!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://unsplash.com/photos/WPTHZkA-M4I"&gt;Photo by Erwan Hesry&lt;/a&gt; via&lt;br&gt;
Unsplash&lt;/p&gt;

</description>
      <category>showdev</category>
      <category>projectbenatar</category>
      <category>meta</category>
      <category>rstats</category>
    </item>
    <item>
      <title>Nativefier is bonkers</title>
      <dc:creator>Dave Parr</dc:creator>
      <pubDate>Fri, 26 Jun 2020 21:26:59 +0000</pubDate>
      <link>https://forem.com/daveparr/nativefire-is-bonkers-4m43</link>
      <guid>https://forem.com/daveparr/nativefire-is-bonkers-4m43</guid>
      <description>&lt;p&gt;I made an electron app in 4 lines...&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nativefier &lt;span class="s2"&gt;"http://musicforprogramming.net/"&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"musicforprogramming"&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;musicforprogramming-linux-x64/
&lt;span class="nb"&gt;sudo chmod&lt;/span&gt; +x musicforprogramming
./musicforprogramming
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h1&gt;
  
  
  Motivation
&lt;/h1&gt;

&lt;p&gt;I like music, but don't like music in browser tabs. Basically, because I have it open all the time, I want to find and control it easily, and I don't want it cluttering up an area that might be solely focused on work. &lt;/p&gt;

&lt;p&gt;I discovered some neat apps for google play, and wanted to see if there was one for my other go to, &lt;a href="http://musicforprogramming.net/"&gt;musicforprogramming&lt;/a&gt;. There wasn't, so I just casually googled how to convert a page into an electron app and OMFG!&lt;/p&gt;
&lt;h1&gt;
  
  
  Solution
&lt;/h1&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/nativefier"&gt;
        nativefier
      &lt;/a&gt; / &lt;a href="https://github.com/nativefier/nativefier"&gt;
        nativefier
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Make any web page a desktop application
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;p&gt;Note: Nativefier is unmaintained, please see &lt;a class="issue-link js-issue-link" href="https://github.com/nativefier/nativefier/issues/1577"&gt;#1577&lt;/a&gt;.&lt;/p&gt;
&lt;h1&gt;
Nativefier&lt;/h1&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/nativefier/nativefier.github/dock-screenshot.png"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--agQLcK2g--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://github.com/nativefier/nativefier.github/dock-screenshot.png" alt="Example of Nativefier app in the macOS dock"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;You want to make a native-looking wrapper for WhatsApp Web (or any web page).&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;nativefier &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;web.whatsapp.com&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/nativefier/nativefier.github/nativefier-walkthrough.gif"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--psnw7af---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://github.com/nativefier/nativefier.github/nativefier-walkthrough.gif" alt="Walkthrough animation"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;You're done.&lt;/p&gt;
&lt;h2&gt;
Introduction&lt;/h2&gt;
&lt;p&gt;Nativefier is a command-line tool to easily create a “desktop app” for any web site
with minimal fuss. Apps are wrapped by &lt;a href="https://www.electronjs.org/" rel="nofollow"&gt;Electron&lt;/a&gt;
(which uses Chromium under the hood) in an OS executable (&lt;code&gt;.app&lt;/code&gt;, &lt;code&gt;.exe&lt;/code&gt;, etc)
usable on Windows, macOS and Linux.&lt;/p&gt;
&lt;p&gt;I built this because I grew tired of having to Alt-Tab to my browser and then search
through numerous open tabs when using Messenger or
Whatsapp Web (&lt;a href="https://news.ycombinator.com/item?id=10930718" rel="nofollow"&gt;HN thread&lt;/a&gt;). Nativefier features:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Automatically retrieval of app icon / name&lt;/li&gt;
&lt;li&gt;Injection of custom JS &amp;amp; CSS&lt;/li&gt;
&lt;li&gt;Many more, see the &lt;a href="https://github.com/nativefier/nativefierAPI.md"&gt;API docs&lt;/a&gt; or &lt;code&gt;nativefier --help&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
Installation&lt;/h2&gt;
&lt;p&gt;Install Nativefier globally with &lt;code&gt;npm install -g nativefier&lt;/code&gt; . Requirements:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;macOS 10.13+ / Windows / Linux&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://nodejs.org/" rel="nofollow"&gt;Node.js&lt;/a&gt; ≥ 16.9…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/nativefier/nativefier"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;&lt;a href="https://www.todesktop.com/guides/nativefier"&gt;This post&lt;/a&gt; and &lt;a href="https://www.addictivetips.com/ubuntu-linux-tips/nativefier-turn-websites-into-linux-apps/"&gt;this one&lt;/a&gt; helped iron out some kinks and now I can launch programming music right from my VS Code terminal!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--YCXuikuw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/i/7dh4uokd2pt0zxgeye0n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YCXuikuw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/i/7dh4uokd2pt0zxgeye0n.png" alt="vs code and an electron app of musicforprogramming made with nativefier" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Extra credit
&lt;/h1&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;alias &lt;/span&gt;&lt;span class="nv"&gt;musicforprogramming&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"~/Dev/musicforprogramming-linux-x64/musicforprogramming"
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>npm</category>
      <category>electron</category>
      <category>music</category>
      <category>app</category>
    </item>
    <item>
      <title>Investigating interactions between dev.to and stackbit</title>
      <dc:creator>Dave Parr</dc:creator>
      <pubDate>Fri, 26 Jun 2020 10:18:47 +0000</pubDate>
      <link>https://forem.com/daveparr/investigating-interactions-between-dev-to-and-stackbit-38ge</link>
      <guid>https://forem.com/daveparr/investigating-interactions-between-dev-to-and-stackbit-38ge</guid>
      <description>&lt;h2&gt;
  
  
  Is the &lt;code&gt;description&lt;/code&gt; also the &lt;code&gt;excerpt&lt;/code&gt;?
&lt;/h2&gt;

&lt;p&gt;I was poking around in the &lt;a href="https://docs.dev.to/api/"&gt;dev.to api&lt;/a&gt;, and&lt;br&gt;
had my &lt;a href="https://dev.to/devteam/you-can-now-generate-self-hostable-static-blogs-right-from-your-dev-content-via-stackbit-7a5"&gt;stackbit cms&lt;br&gt;
integration&lt;/a&gt;&lt;br&gt;
open in another window and I noticed something I hadn’t before. My first&lt;br&gt;
ever post had a neat ‘excerpt’ that wasn’t actually part of the main&lt;br&gt;
body of the post. My other posts didn’t. In the &lt;code&gt;yaml&lt;/code&gt; there is actually&lt;br&gt;
a key called &lt;code&gt;excerpt&lt;/code&gt; with the string in question. The API also has a&lt;br&gt;
value that can be part of the response which is a string labelled&lt;br&gt;
&lt;a href="https://docs.dev.to/api/#operation/createArticle"&gt;“Description”&lt;/a&gt;&lt;br&gt;
which I hadn’t integrated into my&lt;br&gt;
&lt;a href="https://github.com/DaveParr/dev.to.ol"&gt;&lt;code&gt;dev.to.ol&lt;/code&gt;&lt;/a&gt; package yet. I have&lt;br&gt;
now. I must have written that first post through the editor built into&lt;br&gt;
the website. Is this field no longer supported? I seem to be able to set&lt;br&gt;
it through the API, but I can’t see the info anywhere on the current&lt;br&gt;
site.&lt;/p&gt;

&lt;p&gt;So, if I’m correct, this post should show up with a neat little&lt;br&gt;
description on stackbit, but not on dev.to. &lt;/p&gt;

&lt;p&gt;UPDATE: Apparently not. So what does the description field do then?&lt;/p&gt;

&lt;h2&gt;
  
  
  Should liquid tags work in stackbit?
&lt;/h2&gt;

&lt;p&gt;I like the liquid tags, in some cases. I was even debating writing an R&lt;br&gt;
function to convert a url into the relevant liquid tag through string&lt;br&gt;
parsing, but I’ve spotted that liquid tags aren’t actually supported by&lt;br&gt;
stackbit? See the end of this post on dev, and this copy of it on my&lt;br&gt;
stackbit site, where the link seems large and broken. Are there plans to&lt;br&gt;
support this on stack bit? Or do you already, and I’ve just not spotted&lt;br&gt;
how to ‘turn it on’?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;n.b. if you are on my personal site daveparr.info, this might be&lt;br&gt;
confusing, but it’s actually generated from my content on dev.to via&lt;br&gt;
an integration with stackbit. You can see more about this from the&lt;br&gt;
links in the footer VVV&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>projectbenatar</category>
      <category>showdev</category>
      <category>rstats</category>
      <category>meta</category>
    </item>
    <item>
      <title>Why use AWS Lambda for Data Science?</title>
      <dc:creator>Dave Parr</dc:creator>
      <pubDate>Thu, 25 Jun 2020 14:23:43 +0000</pubDate>
      <link>https://forem.com/daveparr/why-use-aws-lambda-for-data-science-421</link>
      <guid>https://forem.com/daveparr/why-use-aws-lambda-for-data-science-421</guid>
      <description>&lt;h2&gt;
  
  
  Motivation
&lt;/h2&gt;

&lt;p&gt;Serverless is a way to deploy code, without having to manage the&lt;br&gt;
infrastructure underneath it. In AWS terms this means there is a&lt;br&gt;
compute instance that runs your code, except you don’t control it, and&lt;br&gt;
that might be a good thing. It only exists when it’s asked for by&lt;br&gt;
something else, and therefore you only pay for the work it does. If you&lt;br&gt;
need to do work concurrently you get a new instance, which also goes&lt;br&gt;
away as soon as you don’t need it, so it’s scalable.&lt;/p&gt;

&lt;p&gt;For a data scientist this is an interesting prospect for a number of&lt;br&gt;
reasons. The first is keeping your hands clean. Not all people in this&lt;br&gt;
role come from an ‘operations’ background. Many of us are analysts first,&lt;br&gt;
and graduate into the role. That shouldn’t mean we don’t ‘own&lt;br&gt;
our deployments’, but it does mean that we might not have the&lt;br&gt;
background, time or inclination to really get into the nitty-gritty.&lt;br&gt;
Managed infrastructure that can scale seamlessly out of the box is a&lt;br&gt;
nice middle ground. We can still manage our own deployments, but there’s&lt;br&gt;
less to worry about than owning your own EC2 instances, let alone a&lt;br&gt;
fleet of them. The way that the instances themselves die off is also&lt;br&gt;
valuable. We may be doing work that requires 24/7 processing, but often,&lt;br&gt;
we aren’t. Why pay for a box which might have 50% required utilisation&lt;br&gt;
time, or even less?&lt;/p&gt;
&lt;h3&gt;
  
  
  Limits
&lt;/h3&gt;

&lt;p&gt;Just like in everything there is a balance. There are &lt;em&gt;physical&lt;/em&gt; limits&lt;br&gt;
to this process. I’ve had success deploying data science assets in this&lt;br&gt;
architecture, but if you can’t fit your job in these limits, this&lt;br&gt;
already isn’t for you. Sure, data science &lt;em&gt;can be&lt;/em&gt; giant machine&lt;br&gt;
learning models on huge hardware with massive data volumes, but we have&lt;br&gt;
to be honest and acknowledge that it isn’t always. K.I.S.S. should apply&lt;br&gt;
to everything.&lt;/p&gt;

&lt;p&gt;If you can get good enough business results with a linear regression,&lt;br&gt;
don’t put in 99% more effort to train the new neural network hotness to&lt;br&gt;
get a 2% increase in performance. Simplicity in calculation, deployment,&lt;br&gt;
and explainability &lt;em&gt;matter&lt;/em&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“No ML is easier to manage than no ML” ©&lt;br&gt;
&lt;a href="https://twitter.com/julsimon/status/1124383078313537536"&gt;@julsimon&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;scipy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;12345678&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.6&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;slope&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;intercept&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r_value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p_value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std_err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;linregress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This is a nonsense linear regression. IMHO it’s a data science ‘Hello&lt;br&gt;
World’. Let’s make it an AWS Lambda serverless function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight diff"&gt;&lt;code&gt;&lt;span class="gi"&gt;+ import json
&lt;/span&gt;&lt;span class="p"&gt;  from scipy import stats
  import numpy as np
&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;span class="gi"&gt;+ def lambda_handler(event, context):
&lt;/span&gt;&lt;span class="p"&gt;      np.random.seed(12345678)
&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;span class="p"&gt;      x = np.random.random(10)
      y = 1.6*x + np.random.random(10)
&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;span class="p"&gt;      slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
&lt;/span&gt;&lt;span class="gi"&gt;+     return_body = {
+         "m": slope, "c": intercept, "r2": r_value ** 2,
+         "p": p_value, "se": std_err
+     }
+     return {"body": json.dumps(return_body)}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;These changes achieve 3 things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Turning a &lt;em&gt;script&lt;/em&gt; into a &lt;em&gt;function&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt; Supplying the function arguments &lt;code&gt;event&lt;/code&gt; and &lt;code&gt;context&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt; Formatting the return as json&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These are required as AWS Lambda needs a &lt;em&gt;function&lt;/em&gt;. This is so that its&lt;br&gt;
&lt;em&gt;event driven architecture&lt;/em&gt; can feed in data through &lt;code&gt;event&lt;/code&gt;, and so&lt;br&gt;
that its &lt;code&gt;json&lt;/code&gt; formatted data can both be received by your function,&lt;br&gt;
and then also the response be returned by that function into the rest of&lt;br&gt;
the system.&lt;/p&gt;
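
&lt;p&gt;To make the role of &lt;code&gt;event&lt;/code&gt; a bit more concrete, here is a minimal sketch of a handler that reads its data from the event instead of generating it. The &lt;code&gt;"x"&lt;/code&gt; and &lt;code&gt;"y"&lt;/code&gt; keys and the &lt;code&gt;fit_line&lt;/code&gt; helper are made up for illustration, and the fit is plain-Python least squares rather than &lt;code&gt;scipy&lt;/code&gt;, so it runs anywhere without extra dependencies.&lt;/p&gt;

```python
import json


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for paired samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lambda_handler(event, context):
    # "x" and "y" are hypothetical event keys chosen for this sketch
    slope, intercept = fit_line(event["x"], event["y"])
    # The body is serialised to a JSON string, as in the example above
    return {"body": json.dumps({"m": slope, "c": intercept})}


if __name__ == "__main__":
    # Local invocation: on AWS, Lambda supplies event and context itself
    fake_event = {"x": [0, 1, 2, 3], "y": [1, 3, 5, 7]}
    print(lambda_handler(fake_event, None))
```

&lt;p&gt;Calling the handler locally with a hand-written dictionary like this is a cheap way to check the input and output shapes before pasting anything into the console.&lt;/p&gt;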

&lt;p&gt;You can then open up the AWS console in a browser, navigate to the&lt;br&gt;
Lambda service, and then copy and paste this into this screen:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JJjFq_mH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/snakes-lambdas_files/basic.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JJjFq_mH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/snakes-lambdas_files/basic.png" alt="aws console lambda editor" width="800" height="451"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can then hit run and…&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--6jR9ZBqA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/snakes-lambdas_files/basic-fail.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6jR9ZBqA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/snakes-lambdas_files/basic-fail.png" alt="aws console lambda editor with an error&amp;lt;br&amp;gt;
message" width="800" height="451"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What happened? Well, because it’s a &lt;em&gt;managed&lt;/em&gt; instance, the function&lt;br&gt;
doesn’t know what &lt;code&gt;scipy&lt;/code&gt; is. It’s not installed on the cloud, it was&lt;br&gt;
installed on your machine…&lt;/p&gt;
&lt;h2&gt;
  
  
  Layers
&lt;/h2&gt;

&lt;p&gt;AWS lambda doesn’t &lt;code&gt;pip install ....&lt;/code&gt;. Seeing as these run on compute&lt;br&gt;
instances that turn up when needed, and are destroyed when not needed,&lt;br&gt;
with no attached storage, you need to find a way to tell AWS what your&lt;br&gt;
dependencies are, or you’ll just have to write super-pure base Python!&lt;br&gt;
Well, that may not be &lt;em&gt;strictly&lt;/em&gt; true. &lt;code&gt;json&lt;/code&gt; is &lt;em&gt;built in by default to&lt;br&gt;
every instance&lt;/em&gt;, so is &lt;code&gt;boto3&lt;/code&gt;, but what about our data science buddies?&lt;br&gt;
&lt;code&gt;numpy&lt;/code&gt;, &lt;code&gt;scipy&lt;/code&gt; are &lt;em&gt;&lt;a href="https://aws.amazon.com/blogs/aws/new-for-aws-lambda-use-any-programming-language-and-share-common-components/"&gt;published by&lt;br&gt;
aws&lt;/a&gt; as layers&lt;/em&gt;. Layers are bundles of code, that contain the dependencies you need to run the functions you write.&lt;br&gt;
So in this case we can open the ‘layers’ view in AWS and attach these to our function.&lt;/p&gt;
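
&lt;p&gt;If you’re not sure whether the layers you attached actually expose the imports you need, one option is to ask Python directly from inside the function. This is only a sketch using the standard library; the &lt;code&gt;"modules"&lt;/code&gt; event key and the &lt;code&gt;missing_modules&lt;/code&gt; helper are hypothetical names, not part of any AWS API.&lt;/p&gt;

```python
import importlib.util
import json


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def lambda_handler(event, context):
    # "modules" is a hypothetical event key; default to the libraries used above
    required = event.get("modules", ["numpy", "scipy"])
    return {"body": json.dumps({"missing": missing_modules(required)})}


if __name__ == "__main__":
    # Run locally to see what your own machine is missing
    print(lambda_handler({}, None))
```

&lt;p&gt;An empty &lt;code&gt;missing&lt;/code&gt; list would suggest the imports can be resolved in that environment; anything listed still needs a layer.&lt;/p&gt;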

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ahnC46BU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/snakes-lambdas_files/layers.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ahnC46BU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/snakes-lambdas_files/layers.png" alt="layers" width="800" height="361"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now that you’ve attached all your dependencies with layers, go ahead and&lt;br&gt;
run your function again.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3bRDbGza--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/snakes-lambdas_files/basic-success.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3bRDbGza--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://raw.githubusercontent.com/DaveParr/dev.to-posts/master/snakes-lambdas_files/basic-success.png" alt="editor" width="800" height="451"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Success! So now you know the basics of how to put some Python data&lt;br&gt;
science into practice on AWS Lambda.&lt;/p&gt;

&lt;p&gt;This is a companion post to my talk on using data science in AWS Lambda.&lt;br&gt;
If you’re keen to know more, and can’t wait for me to write it all up&lt;br&gt;
here, you can get the gist of the whole talk from this repo 😄 &lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9-wwsHG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/DaveParr"&gt;
        DaveParr
      &lt;/a&gt; / &lt;a href="https://github.com/DaveParr/snakes_and_lambdas"&gt;
        snakes_and_lambdas
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Talk for pydata on python datascience lambdas
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="MD"&gt;
&lt;h1&gt;
Snakes and Lambdas&lt;/h1&gt;
&lt;p&gt;A presentation on using aws lambda for data science tasks.&lt;/p&gt;
&lt;h2&gt;
Foundation&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;understand aws ecosystem from a base 'cloud concepts' POV&lt;/li&gt;
&lt;li&gt;some python experience&lt;/li&gt;
&lt;li&gt;data science concepts: linear regression, time series analysis, anomaly detection&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
Solves&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;productionising python data science&lt;/li&gt;
&lt;li&gt;micro-service for data science&lt;/li&gt;
&lt;li&gt;local code -&amp;gt; cloud deployment workflow&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
Requires&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;to generate the slides, use &lt;a href="https://pypi.org/project/revelation/" rel="nofollow"&gt;&lt;code&gt;revelation&lt;/code&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
Delivered at&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.meetup.com/PyData-Cardiff-Meetup/events/261952446/" rel="nofollow"&gt;pydata cardiff&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.alacrityfoundation.co.uk/" rel="nofollow"&gt;alacrity foundation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.meetup.com/PyData-Bristol/events/263898473/" rel="nofollow"&gt;pydata bristol&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
Contributions&lt;/h2&gt;
&lt;p&gt;I wrote and compiled all the material and the experience that drove it, however I have also been able to use a vast wealth of other peoples resources they have shared on the topic. These are clearly linked and I encourage you to go and dive deeper into those resources. Without those resources I would not have been able to implement any of the material described here.&lt;/p&gt;
&lt;/div&gt;

  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/DaveParr/snakes_and_lambdas"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;



</description>
      <category>python</category>
      <category>datascience</category>
      <category>serverless</category>
      <category>aws</category>
    </item>
    <item>
      <title>Building my first Django project with CSS and Static Files</title>
      <dc:creator>Dave Parr</dc:creator>
      <pubDate>Thu, 25 Jun 2020 13:14:11 +0000</pubDate>
      <link>https://forem.com/daveparr/building-my-first-django-project-with-css-and-static-files-12g1</link>
      <guid>https://forem.com/daveparr/building-my-first-django-project-with-css-and-static-files-12g1</guid>
      <description>&lt;p&gt;I'm working through &lt;a href="https://djangoforbeginners.com/"&gt;Django for Beginners by William S. Vincent&lt;/a&gt;. Until now we've had some pretty barebones Times New Roman style UI, however that's all about to change! I think.&lt;/p&gt;

&lt;p&gt;I'm really appreciating this incremental iteration of concepts in each new chapter. Each time I start a new app for the chapter it's ingraining the muscle memory and concept recall into me. I'm able to run through the first 5 lines of code at the CLI nearly from memory. It's also a great mechanism to help you if you get stuck on a weird error and don't know what to do. You know the next chapter will have a clean slate. Great idea William.&lt;/p&gt;

&lt;p&gt;The repetition isn't just the set-up though.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Now we can add the functionality for individual blog pages. How do we do that? We need to create a new view, url, and template. I hope you’re noticing a pattern in development with Django now!”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I sure am. &lt;/p&gt;

&lt;p&gt;I'm glad that the book doesn't wander into non-Django areas, but still points readers to places to look for more info. The last chapter pointed me towards some resources to understand more about databases, and this one does something similar with CSS.&lt;/p&gt;

&lt;p&gt;It was interesting to see that the approach used to identify a blog post for navigation in the URL patterns looked &lt;em&gt;kinda&lt;/em&gt; regex-ish:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;urlpatterns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nf"&gt;path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;post/&amp;lt;int:pk&amp;gt;/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BlogDetailView&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_view&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;post_detail&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BlogListView&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_view&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;home&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;&amp;lt;int:pk&amp;gt;&lt;/code&gt; is the part that identifies which blog post the URL should point to when linking from the ListView to the DetailView. It was interesting to find out that there is a unique identifier for each post baked in behind the scenes. I was wondering how this primary key is treated in bigger projects. Is it common to keep this approach of sequential integer primary keys, or do they get replaced in larger, more complex projects with hashes or other unique IDs?&lt;/p&gt;
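
&lt;p&gt;To make the regex-ish feel a bit more concrete, here is a minimal plain-Python sketch of what a converter like &lt;code&gt;&amp;lt;int:pk&amp;gt;&lt;/code&gt; does: match a run of digits in the path and pass it on as an integer. It is an illustration only, not Django's actual implementation, and &lt;code&gt;match_post_detail&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re

# Rough stand-in for the post detail route: "post/", digits only, trailing slash.
POST_DETAIL = re.compile(r"post/([0-9]+)/")


def match_post_detail(path):
    """Return {'pk': int} if the path looks like a post detail URL, else None."""
    match = POST_DETAIL.fullmatch(path)
    if match is None:
        return None
    # The captured text is a string; convert it so the view receives an int.
    return {"pk": int(match.group(1))}


print(match_post_detail("post/1/"))      # {'pk': 1}
print(match_post_detail("post/hello/"))  # None
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;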

</description>
      <category>python</category>
      <category>django</category>
      <category>css</category>
    </item>
    <item>
      <title>3 minimal features for my dev.to api wrapper</title>
      <dc:creator>Dave Parr</dc:creator>
      <pubDate>Wed, 17 Jun 2020 13:02:10 +0000</pubDate>
      <link>https://forem.com/daveparr/3-minimal-features-for-my-dev-to-api-wrapper-371l</link>
      <guid>https://forem.com/daveparr/3-minimal-features-for-my-dev-to-api-wrapper-371l</guid>
      <description>&lt;p&gt;I’ve made 3 very small features for my &lt;a href="https://github.com/DaveParr/dev.to.ol"&gt;open source R package wrapping&lt;br&gt;
the dev.to API&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;code&gt;create_new_article&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Now that the main requirements for the file to post are stabilising,&lt;br&gt;
I’ve written a quick and dirty function to make a boilerplate article:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;create_new_article&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
           &lt;/span&gt;&lt;span class="n"&gt;series&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;'series'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
           &lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;'["tag1", "tag2"]'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
           &lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="n"&gt;boilerplate_frontmatter&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="n"&gt;glue&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;glue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s1"&gt;'---\ntitle: "{title}"\noutput: github_document\nseries: "{series}"\ntags: {tags}\n---'&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;

    &lt;/span&gt;&lt;span class="n"&gt;cat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;boilerplate_frontmatter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will use the &lt;a href="https://glue.tidyverse.org/"&gt;&lt;code&gt;glue&lt;/code&gt;&lt;/a&gt; package to put&lt;br&gt;
the strings in the function argument into the right place in the&lt;br&gt;
boilerplate YAML front matter. It then uses &lt;code&gt;cat&lt;/code&gt; to either print that&lt;br&gt;
to screen, or to create a new file with it, if a file path is supplied.&lt;/p&gt;
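
&lt;p&gt;For anyone more at home in Python, the same idea can be sketched with an f-string. This is a rough analogue rather than part of the package: it assumes the same front matter fields as the R function, and the &lt;code&gt;None&lt;/code&gt; default for &lt;code&gt;file&lt;/code&gt; and the returned string are choices made purely for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def create_new_article(title, series="series", tags='["tag1", "tag2"]', file=None):
    """Build boilerplate YAML front matter; print it, or write it to a file."""
    frontmatter = (
        "---\n"
        f'title: "{title}"\n'
        "output: github_document\n"
        f'series: "{series}"\n'
        f"tags: {tags}\n"
        "---"
    )
    if file is None:
        # No path supplied: show the boilerplate on screen.
        print(frontmatter)
    else:
        # Path supplied: create (or overwrite) the file with the boilerplate.
        with open(file, "w", encoding="utf-8") as handle:
            handle.write(frontmatter)
    return frontmatter


create_new_article("My new post", series="dev.to.ol", tags='["rstats", "showdev"]')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;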
&lt;h2&gt;
  
  
  &lt;code&gt;main_image&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;If there is a &lt;code&gt;main_image&lt;/code&gt; parameter in the &lt;code&gt;YAML&lt;/code&gt; front matter that is&lt;br&gt;
the URL of an image, that image will be set as the cover image of the&lt;br&gt;
post. I got this one from a photo by &lt;a href="https://unsplash.com/@juliandufort?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText"&gt;Julian&lt;br&gt;
Dufort&lt;/a&gt;&lt;br&gt;
on&lt;br&gt;
&lt;a href="https://unsplash.com/s/photos/3?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText"&gt;Unsplash&lt;/a&gt;.&lt;br&gt;
I &lt;em&gt;believe&lt;/em&gt; that putting an Unsplash URL that goes&lt;br&gt;
&lt;strong&gt;directly to the image&lt;/strong&gt; into this field is within their&lt;br&gt;
&lt;a href="https://unsplash.com/license"&gt;license&lt;/a&gt;, though if anyone knows&lt;br&gt;
something to the contrary please let me know. It’s the first time I have&lt;br&gt;
used this service, despite hearing about it for years.&lt;/p&gt;
&lt;h2&gt;
  
  
  Collapse spaces in tags
&lt;/h2&gt;

&lt;p&gt;Yesterday I fooled myself for a good ten minutes into thinking that there&lt;br&gt;
was a problem with my API code, because I kept getting a 422&lt;br&gt;
response when putting a new article up. In fact, I just had a&lt;br&gt;
space character in one of my tags. Now the &lt;code&gt;post_new_article&lt;/code&gt; function&lt;br&gt;
collapses any spaces it encounters in tags. Achieving this was a breeze&lt;br&gt;
with &lt;a href="https://purrr.tidyverse.org/"&gt;&lt;code&gt;purrr&lt;/code&gt;&lt;/a&gt; and&lt;br&gt;
&lt;a href="https://stringr.tidyverse.org//"&gt;&lt;code&gt;stringr&lt;/code&gt;&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight r"&gt;&lt;code&gt;&lt;span class="n"&gt;purrr&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_frontmatter&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;stringr&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;str_remove_all&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;" "&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This little nugget takes the list of tags, and then maps the function&lt;br&gt;
&lt;code&gt;str_remove_all&lt;/code&gt; across each tag to strip out any spaces. This isn’t at all exposed to the&lt;br&gt;
user, as it’s non-negotiable from the API side anyway :)&lt;/p&gt;

</description>
      <category>projectbenatar</category>
      <category>showdev</category>
      <category>rstats</category>
      <category>functional</category>
    </item>
  </channel>
</rss>
