<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Matt Pogue</title>
    <description>The latest articles on Forem by Matt Pogue (@mattpogue).</description>
    <link>https://forem.com/mattpogue</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F327256%2F81887a69-2185-491f-9608-ddbbea61782d.jpeg</url>
      <title>Forem: Matt Pogue</title>
      <link>https://forem.com/mattpogue</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mattpogue"/>
    <language>en</language>
    <item>
      <title>Users Aren't the Enemy - They're Your First Line of Defense</title>
      <dc:creator>Matt Pogue</dc:creator>
      <pubDate>Tue, 06 Sep 2022 15:47:50 +0000</pubDate>
      <link>https://forem.com/mattpogue/users-arent-the-enemy-theyre-your-first-line-of-defense-5cfb</link>
      <guid>https://forem.com/mattpogue/users-arent-the-enemy-theyre-your-first-line-of-defense-5cfb</guid>
      <description>&lt;p&gt;In IT, security is always at the back of your mind, ceaselessly reminding you how easily everything you've built can come crashing down around you. And if you don't feel this way, there's nothing like a company-wide incident to bring security sharply into focus. Info security in a large corporate environment is a diffuse thing, where audits attempt to determine levels of risk that can be accepted or mitigated, and "why do I care when there's a whole department dedicated to worrying about that stuff", right? As solo admins, we don't have the luxury of offloading security on to someone else's plate; it's a front-and-center responsibility that ranks right up there with making payroll and keeping the lights on. In a large corporation, it would take one hell of a security incident to bring the whole organization down, which is not to say that it can't happen, only that in terms of risk, total organizational compromise ranks fairly low in the hierarchy. For small and medium-sized businesses, a single incident - ransomware, server compromise, data thefts and leaks, etc. - could very well be a knockout blow from which the organization never recovers (&lt;a href="https://www.inc.com/adam-levin/think-ransomware-cant-put-you-out-of-business.html"&gt;this article&lt;/a&gt; from Inc. magazine provides some perspective). In today's fraught environment for the solo admin, a little piece of mind goes a long way, but piece of mind can be expensive and may not be in the budget this year. In this post, I'm going to talk about what I believe is one of the best uses of the limited resources we have available - educating and building a rapport with your users.&lt;/p&gt;

&lt;p&gt;First, an aside - very early in my career in the late 90's, I worked on the internal employee help desk for Southwestern Bell (SBC) in St. Louis, before they became AT&amp;amp;T again. Some of the users were getting computers for the first time in their careers and called in with the absolute simplest of problems - the "plug it in, turn it on" variety. My team - mostly guys in their early to late 20's - had a grand old time laughing at 30-year veteran linemen who didn't know where the power button was. One day my boss Lloyd (a 30-year veteran himself) countered one of my comments by asking how easily I would be able to hop in a phone company truck, climb a pole, and fix a citywide outage. Or take my self-assured ass down to the accounting department and knock out payroll for 50,000 employees, for that matter. My 19-year-old self immediately got his point; all of the users we supported had their areas of specialty and they shouldn't be expected to understand my job any more than I was expected to understand theirs. I've repeated that anecdote to coworkers and subordinates many times over the years because I think it's an important lesson - our end users, for the most part, aren't IT experts and we shouldn't expect them to be (in fact, the ones who think they are can be the most dangerous). That's what we're here for! That doesn't mean they're not going to do stupid things at times, but one of your goals as an admin should be to prepare for the worst while hoping for the best. I honestly believe that having a strong working relationship with your users is one of the single most important security controls you can implement and with that in mind, here are some tips I've learned to help build and maintain that relationship.&lt;/p&gt;

&lt;h2&gt;
  
  
  User Relationship Tip 1 - Open Door Policy
&lt;/h2&gt;

&lt;p&gt;On my first day at my current job, I visited all 35 employees to introduce myself and emphasize the fact that my door was always open. I wanted them to know that I don't consider any question to be dumb or not worth asking. I also let them know that I considered them to be our first line of defense against ransomware, viruses, and all around bad shit and that I took all of those things very seriously. I made it clear that I wanted them to report anything suspicious to me so I could investigate. I've also made myself available for questions about their home setups. That doesn't mean that they all get free PC support (I've had to draw that line a few times), but since my employer doesn't provide home equipment for every user I genuinely need to know what kind of equipment my users are running at home and what types of issues they're running into if and when they're accessing work stuff. Especially during Covid, many of my users needed to work from home and I had to make do with what they had. That also doesn't mean that I'll allow any random home equipment to connect to our VPN, as I'll discuss later, but even if they weren't doing work stuff from home, I didn't want them to be running virus-infected computers if for no other reason than for the good of the Internet at large. Plus, I wanted to get them in the habit of paying attention to what's happening on their computers so they'll notice when something is different. That can be the difference between stopping an infection at a single PC versus cleaning up your entire office. Finally, I tried to be as accessible as possible. If you're not a people person, work on becoming one. I've had users stop by my office to inquire about cryptocurrency, let me know they're now running a Helium mining hot spot, get my thoughts on what type of computer they should get for their kids going to college, and all sorts of tech-related stuff.
By letting them know that I'm not the unapproachable "IT guru on the hill" that some admins turn themselves into, I've been able to be proactive about all kinds of problems - not just security related - because a user thought it was important enough to mention to me. That won't happen if they're afraid to enter your office.&lt;/p&gt;

&lt;h2&gt;
  
  
  User Relationship Tip 2 - Educate, Train, Reinforce
&lt;/h2&gt;

&lt;p&gt;My dad tells a story about when he was asked to teach an adult Continuing Education college course in the late 80's on computer basics. At that point, the computer revolution was just beginning to transform many occupations. This was before user friendliness, in the days of the command line and the floppy disk, and many of the folks attending his class looked at computers with a mixture of suspicion and trepidation. They were watching their jobs being taken over by a technology they barely understood and some of them were nervous and even hostile. The first night of class, my dad had everyone come up to the front of the class where he had a computer (at the time, very likely a shiny &lt;a href="https://en.wikipedia.org/wiki/IBM_Personal_Computer_XT"&gt;IBM XT&lt;/a&gt;) on a desk where all the attendees could walk 360° around it and see how the components attached. He explained to them what each component did, and after everyone had a chance to look, they returned to their desks, each of which was equipped with a computer, powered off. Lesson 1 that night was - very simply - "power it on". His goal was to take the "mystery" out of it by showing his students that while it was a complex machine, it was just a machine (insert car repair analogy here). By doing those simple exercises, he reduced his students' anxiety and made them much more receptive to learning.&lt;/p&gt;

&lt;p&gt;Fast forward to the present and anyone who's worked in a business environment knows at least the basics of how to operate a computer. However, I'm continually amazed that even most high school and college age students have no clue how a computer actually functions - meaning, what's inside and what each part does. By the same token, you'll find that most of your users have almost no understanding of how email works, why ransomware is a threat, or why one web browser is any better or worse than another (or most likely, even what a web browser IS). It's absolutely imperative that you train your users. I've implemented an annual security training program that all users are required to attend and I supplement that training throughout the year by sending out short emails addressing threats that they may have heard about on the news or that I've been asked about. But in each year's class - much like the beginning of the school year for teenagers - I always reinforce the basics. I try not to go too far into the weeds, but I will pick one or two current threats or current news stories (ransomware has dominated for the last few years) and sprinkle them throughout the serious stuff: how to identify spam/phishing emails, how to pick secure passwords and store them in &lt;a href="https://keepassxc.org/"&gt;KeePassXC&lt;/a&gt;, how to spot bad/malicious web content, and what to do when something fishy happens on their computer (UNPLUG!). And the most important rule of all - report any suspicious occurrence to me. I try to keep the class to an hour with questions at the end and I've managed to get even the most meeting-averse users to pay attention. Bottom line: educate your users and train them by reinforcing that education throughout the year (especially informally through emails).&lt;/p&gt;

&lt;h2&gt;
  
  
  User Relationship Tip 3 - Don't Be a Babysitter
&lt;/h2&gt;

&lt;p&gt;For most people (myself included), it feels very creepy to spy on your coworkers. Prior to this wonderful new era where remote work is common, small businesses tended to fall on one side of the fence or the other without a lot of gray area in the middle. Either everyone is in the office all the time or everyone works somewhere else (and maybe there isn't a physical office). My boss is old school - if you're not in the office, you're not working. When Covid hit, it upended this relationship. We were limited to 10 people or fewer in the office for long stretches at a time. People had to quarantine when they weren't sick. My boss and I discussed the situation and I explained it to him exactly as I'm explaining it here - don't be a babysitter. We settled on generating some activity reports from our internal project management application and a couple of Exchange 365 activity reports. Obviously not enough to track a person's every move when they're working at home, but enough so that you could tell if a person had generated any activity on a given day.&lt;/p&gt;

&lt;p&gt;There are a few reasons why this tip is so important:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All that goodwill relationship building you've put in with your users will be worthless if they feel that you're spying on them. The old "just following orders" chestnut won't help you much when an "us versus them" mentality takes hold.&lt;/li&gt;
&lt;li&gt;Trust plays a big role in any relationship - I'm asking my users to be my eyes and ears and report back to me when they see suspicious activity. No one's going to do that if they feel you're looking at them suspiciously.&lt;/li&gt;
&lt;li&gt;Especially in small businesses, when a user isn't doing their job, it becomes obvious pretty quickly. We're trusting our managers to manage their staff, and we're also expecting them to trust their subordinates. If someone is at home and they step away from their computer for a few minutes, how is that any different than a user getting up from their desk to chat with a coworker, get a cup of coffee, etc., in the office? It's not any different and we should acknowledge that.&lt;/li&gt;
&lt;li&gt;In the United States we already work too much. This is &lt;a href="https://biteable.com/blog/work-from-home-statistics/"&gt;one example&lt;/a&gt;, but there are many other polls and studies showing that we work more hours from home, not less. Additionally, we're now invading our users' private lives more than ever. Cory Doctorow has a fantastic &lt;a href="https://pluralistic.net/2022/08/21/great-taylors-ghost/#solidarity-or-bust"&gt;article&lt;/a&gt; on his blog about how surveillance in the guise of "productivity monitoring" has invaded our homes. This needs to be drastically scaled back.&lt;/li&gt;
&lt;li&gt;Finally, people are resourceful. If you go too far with the monitoring and blocking, your users will begin to develop ways to bypass your controls. I would much rather my users browse Facebook on their lunch break on a system I have complete control over than to try and find ways to defeat that control. There's absolutely no need for an adversarial relationship with your users. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All that being said, this doesn't mean we're going to operate like the wild west. I monitor all traffic pretty extensively, my users' computers are locked down, and I aggressively block spam, executable attachments, and other types of malware before it gets to my users. &lt;/p&gt;

&lt;p&gt;Which leads me to tip #4....&lt;/p&gt;

&lt;h2&gt;
  
  
  User Relationship Tip 4 - Guide Behavior with Controls
&lt;/h2&gt;

&lt;p&gt;After reading the previous tip you might be asking whether I'm advocating for a free-rein environment and that's absolutely not the case! I'm hoping that if you're reading this post you have the sentience to distinguish between technical controls and outright surveillance, but if not, here's the guide that I would use: whenever you're considering a new monitoring system or additional product to enforce corporate policy, ask yourself if you're comfortable with it being done to you. If it makes you uncomfortable it will make your users uncomfortable too. As a general rule of thumb, avoid any app that requires a user to be recorded - video, audio, keystrokes. That's creepy. Your controls should be as unobtrusive as possible. We use the wonderful &lt;a href="https://www.barracuda.com/landing/pages/spamfirewall"&gt;Barracuda Spam Firewall&lt;/a&gt; for email filtering and I've been very happy with it. By default, you should block ALL file types and only allow the ones you think you'll use - PDF, DOC/DOCX, XLSX, PNG/JPG/BMP, etc. You'll find that there's generally a small list of extensions that are legitimate. We also spam filter pretty aggressively and I've trained my users how to add exceptions in Barracuda and/or request an addition to the global whitelist. At implementation we had quite a few adds all at once but now I add about 2-3 exceptions per month. I control our desktops and laptops with Group Policy and a hardening script that's part of our standard build. Finally, my firewall allows ports 80 and 443 out from the user IP block along with a couple of exceptions. All other outbound traffic is blocked. When my users connect to our VPN, I route all their outbound traffic through my firewall so I can monitor it. If you're serious about security, your posture should always, always, always start with "default deny" and go from there.
Outside of those controls, I don't feel the need to spy on my users or log their keystrokes because I'm not the morality police and I could never hope to wade through all that data in the first place. I've taken intentional steps to limit the damage that could be done if something nefarious were to slip through and I choose to spend the rest of my time making sure that I could completely recover from a disaster - whether physical or digital - when it happens (not if).&lt;/p&gt;
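
&lt;p&gt;To make the "default deny" idea for attachments concrete, here's a minimal Python sketch. This is only an illustration and not how Barracuda does its filtering; the function name and the allowlist are hypothetical, and the extensions simply mirror the examples above. Anything not explicitly on the list is rejected.&lt;/p&gt;

```python
from pathlib import PurePath

# Hypothetical allowlist; these extensions mirror the examples in the text.
ALLOWED_EXTENSIONS = {".pdf", ".doc", ".docx", ".xlsx", ".png", ".jpg", ".bmp"}


def is_attachment_allowed(filename, allowed=ALLOWED_EXTENSIONS):
    """Default deny: return True only when the final extension is allowlisted."""
    # Only the last suffix counts, so "invoice.pdf.exe" is judged as ".exe".
    suffix = PurePath(filename).suffix.lower()
    return suffix in allowed
```

&lt;p&gt;A file with no extension, or with a double extension that ends in something executable, is denied without needing a rule of its own. That's the whole point of starting from "deny".&lt;/p&gt;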

&lt;p&gt;Which leads me to the final tip....&lt;/p&gt;

&lt;h2&gt;
  
  
  User Relationship Tip 5 - Don't Be a Dick
&lt;/h2&gt;

&lt;p&gt;This is good advice for your personal life too. It's 2022. There are hundreds of thousands of IT people in the world and you're not any more special or entitled than any of the rest of us. I exchanged emails with a PE-certified engineer this week who worked on components of the Apollo spacecraft. Unless you're THAT cool, keep it to yourself. I do my best to be humble and genuinely friendly to everyone I work with because they're my 2nd family. Your office may not be like that but there's no reason for snooty hostility either. In IT our main purpose should be to keep the business's technological parts and pieces running smoothly, utilize that same technology to improve productivity, and solve whatever problems our users and customers may encounter.&lt;/p&gt;

</description>
      <category>security</category>
      <category>training</category>
    </item>
    <item>
      <title>The Solo Developer</title>
      <dc:creator>Matt Pogue</dc:creator>
      <pubDate>Wed, 27 Apr 2022 19:53:02 +0000</pubDate>
      <link>https://forem.com/mattpogue/the-solo-developer-jei</link>
      <guid>https://forem.com/mattpogue/the-solo-developer-jei</guid>
      <description>&lt;p&gt;Throughout my career I've worked on multiple engagements where I was tasked with cleaning up after a fellow developer who - after anything from writing a few lines of code to automate a process to writing a full blown application - disappeared, leaving the client company in desperate straits. While I was tempted at times to "punish" the client for their ignorance, after evaluating the situation I typically realized that it was not the client who was at fault, but rather the developer who had failed to properly set expectations and deliver a finished product to the customer.&lt;/p&gt;

&lt;p&gt;In my current full-time position, I spend more than 50% of my time writing code in support of our in-house project management application. The current version (with some major modifications over the years) was put into production in 2010 and still performs beautifully. However, the world has moved on in the interim, so most of my development efforts at this point are dedicated to rewriting the application to bring it into the modern era.&lt;/p&gt;

&lt;p&gt;The point of all this is that I harbor a very low opinion of those long-gone developers I mentioned in the first paragraph. The ones who couldn't be bothered to deliver source code (where appropriate), use a version control system, or even prepare the most basic documentation; and while I am a team of one, I vowed that anyone who came behind me would not suffer the way I have. So in this post, I will outline the steps I've taken to make my small development world a better place for those who will someday follow in my footsteps.&lt;/p&gt;

&lt;p&gt;Step 1 - Version Control&lt;/p&gt;

&lt;p&gt;Before writing any code, you should decide on a version control system and implement it. Personally, I prefer Subversion as it contains all the features I need - file diff between versions, easy branch/tag capabilities, and compatibility with all operating systems. On Windows, I use the TortoiseSVN client, and on Linux the command line client. If you prefer to host your repositories on Windows, VisualSVN Server provides a nice graphical interface to manage your repositories (although there is a cost associated with it). In addition, the WebSVN project provides a beautiful web interface to your repositories. That being said, any well-established VCS will do the trick, from CVS to Git. Typically, I structure my code repositories with a tags/ directory containing my release versions and a trunk/ directory that contains my current working version. A new release gets its own version-numbered folder under tags/. I've found that this is typically enough; however, feel free to use whatever structure makes sense. The structure you choose is far less important than the fact that you're using version control. I also document each commit, tying as much as possible into my bug/issue tracking software (next section).&lt;/p&gt;
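
&lt;p&gt;As a rough sketch of the trunk/ and tags/ layout described above, cutting a release amounts to copying trunk/ into a new version-numbered folder under tags/ with &lt;code&gt;svn copy&lt;/code&gt;. The helper below only builds that command; the function name, repository URL, and commit message wording are hypothetical and for illustration only.&lt;/p&gt;

```python
def build_tag_command(repo_url, version):
    """Build the `svn copy` argument list that tags the current trunk as a release."""
    base = repo_url.rstrip("/")
    return [
        "svn", "copy",
        base + "/trunk",            # current working version
        base + "/tags/" + version,  # new version-numbered release folder
        "-m", "Tag release " + version,
    ]


# Hypothetical repository URL, for illustration only.
command = build_tag_command("https://svn.example.com/repos/myapp", "1.4.0")
```

&lt;p&gt;The resulting list can be handed to &lt;code&gt;subprocess.run&lt;/code&gt; or simply typed at the command line; either way, the release history lives in the repository rather than in my head.&lt;/p&gt;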

&lt;p&gt;While it's important to document and commit your code, don't hesitate to get creative and use your VCS for anything where version/change history is important. For instance, I have a second repository that contains all database-related code and documentation, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A SQL script to recreate the database schema&lt;/li&gt;
&lt;li&gt;A backup copy of the database with any seed records that are required to perform a "default" install of the application - e.g. an "admin" user account in the users table, pre-populated quantity/measurement tables, etc., and the SQL script(s) to create it&lt;/li&gt;
&lt;li&gt;Any miscellaneous SQL scripts that relate to the project, such as data import scripts, performance/troubleshooting scripts, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally, I also keep a "devops" repository with copies of all my generic server and workstation config files, miscellaneous scripts, server update logs, and anything else I feel is relevant. This gives me an easy way to keep my scripts, files, documentation, and change logs together.&lt;/p&gt;

&lt;p&gt;Step 2 - Issue Tracking&lt;/p&gt;

&lt;p&gt;Even as a solo developer, it's a good idea to maintain an issue tracking database as a supplement to the documentation in your VCS. My personal favorite for the last several years is Redmine; however, I've also had success with Mantis and The Bug Genie (now known as Pachno, so YMMV). Redmine provides easy integration with my Subversion repositories; linking to an issue is as easy as putting "Issue #123" in my commit log, which provides an automatic link to the issue when the repository is viewed in Redmine. Redmine also allows me to track time spent per issue, customize my categories and priorities, and autogenerate change logs and wiki entries. Between Redmine and my repository documentation, I can easily tell when a feature was added and tie that feature to a specific change in my source code. With its support for multiple projects, I can also use a single Redmine installation to track work for multiple clients.&lt;/p&gt;
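
&lt;p&gt;If you ever want to pull those issue references out of your commit history yourself (say, for a quick change log), a small Python sketch might look like this. It assumes the "Issue #123" wording mentioned above; the function name is hypothetical, and this is not Redmine's own parser, whose keywords depend on how your installation is configured.&lt;/p&gt;

```python
import re

# Matches the "Issue #123" convention mentioned above; adjust to your own setup.
ISSUE_PATTERN = re.compile(r"Issue #(\d+)", re.IGNORECASE)


def find_issue_ids(commit_message):
    """Return the issue numbers referenced in a commit message, in order of appearance."""
    return [int(number) for number in ISSUE_PATTERN.findall(commit_message)]
```

&lt;p&gt;A commit message with no reference simply yields an empty list, which makes it easy to spot commits that were never tied to an issue.&lt;/p&gt;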

&lt;p&gt;Step 3 - IDE&lt;/p&gt;

&lt;p&gt;Without starting a holy war over which IDE is preferable, I think it's safe to say that for C#/.NET development, Visual Studio is going to provide the best bang for the buck, especially considering that my version of choice is Visual Studio Community (free). Since my project management application is composed of several C# class libraries on the backend and a .NET MVC web frontend, the full version of Visual Studio is my primary IDE. However, for almost everything else - including Python, PHP, shell scripting, and PowerShell to name a few - I'm now using Visual Studio Code. With its ever expanding universe of plugins and themes, settings sync via Microsoft, support for Git and other version control systems, and cross platform compatibility, it's hard to beat! As a fallback, on my Windows systems, I always have Notepad++ installed as well. From editing config files to viewing code snippets to debugging JSON from the browser, you really can't beat it. I also use it for making quick edits to scripts and HTML.&lt;/p&gt;

&lt;p&gt;Regardless of which IDE you prefer, the bottom line is to find one that does what you want and supports the features you need. Learn it from top to bottom and stick with it. Over the years, I've tried just about everything from Atom to JetBrains and they've all had some features I liked. However, at the end of the day, I just didn't have a reason to switch, especially since most of my coding is in the Microsoft world.&lt;/p&gt;

&lt;p&gt;Step 4 - Miscellaneous Applications&lt;/p&gt;

&lt;p&gt;Finally, I want to mention some of the apps I use that are an indispensable part of my toolkit. There are many, many more tools out there that are just as deserving of praise as these and I encourage you as a solo developer to always be on the lookout for apps that can increase your productivity, automate repetitive tasks, or provide those killer features that you can't get anywhere else. Most - if not all - of these applications have alternatives, some of which are probably better, but I'm a big believer in simplicity and stability. Again, that being said, I'm always keeping an eye out for new stuff; constant learning is one of the main requirements of this job. If you're not willing to put in the time or the effort to continually learn and improve upon your skills, you're in the wrong profession!&lt;/p&gt;

&lt;p&gt;Here are some of the apps I use frequently to support my development tasks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenDBDiff - Diff tool for SQL databases. I use this frequently to make sure that my changes are sync'ed between my Development and Production databases. Quick, easy, and performs as advertised.&lt;/li&gt;
&lt;li&gt;Postman - API testing tool. I recently worked on a project building out an API and Postman was exactly what I needed to test both locally, in Development, and in Production. The free version is enough for me, but the paid version provides extra features.&lt;/li&gt;
&lt;li&gt;pgAdmin - the only GUI you need for working with PostgreSQL databases.&lt;/li&gt;
&lt;li&gt;HTTPToolkit - a relative newcomer to the toolbox, HTTP Toolkit allows you to intercept HTTP and HTTPS traffic from within the browser. Similar to Fiddler Classic, another great tool I've used over the years.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Step 5 - Philosophy&lt;/p&gt;

&lt;p&gt;In conclusion, I want to talk a little bit about the philosophy behind the decisions I make as a solo developer. As I mentioned earlier in the post, I take a dim view of any developer who holds clients hostage by not providing source code or documentation on a project, hijacks domain names, or otherwise operates in a way that gives the rest of us a bad name. In order to differentiate yourself from the bad actors in our profession, here are a few suggestions from my personal experience:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Whether you're in a full time position or a contract/freelance position, determine upfront what's expected of you in terms of timeline, pay scale/rate, and work output. It should be decided upfront who will own the source code you produce; generally, in a freelance situation, documented source code (including VCS repositories) is part of my final deliverable. If that's not going to be the case, then think about offering some type of source code escrow so that future developers can enhance and improve on your original code in the event that you're not available.&lt;/li&gt;
&lt;li&gt;Document, document, document! Utilize your VCS and issue tracker to generate "developer docs" in addition to the usage documentation you provide the client. Future developers who have to work on your stuff will thank you for it.&lt;/li&gt;
&lt;li&gt;Build a solid and stable development environment. This is more of a personal recommendation. As I mentioned, in our profession, we should always be learning and on the lookout for tools that can enhance our productivity or make our lives easier. However, I've found that deciding on a core set of tools and learning them inside and out has made the biggest boost to my productivity. Especially your IDE - figure out how you work best and go with it. If you prefer VIM and the terminal, then go that way. If you're more of a GUI person, then go that direction. Whichever direction you go, commit to it and learn it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are some of the tips and techniques I've developed over the years as a solo developer. I'm sure there are many other ways to get the same job done and that's kind of the point - pick what works best for you and stick with it! If you have any questions about this post, feel free to email me - &lt;a href="mailto:matt@thesoloadmin.com"&gt;matt@thesoloadmin.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>csharp</category>
      <category>methodology</category>
    </item>
    <item>
      <title>Batch Downloading With Python</title>
      <dc:creator>Matt Pogue</dc:creator>
      <pubDate>Mon, 28 Mar 2022 18:46:05 +0000</pubDate>
      <link>https://forem.com/mattpogue/batch-downloading-with-python-46eg</link>
      <guid>https://forem.com/mattpogue/batch-downloading-with-python-46eg</guid>
      <description>&lt;p&gt;Today one of my user teams won their bid for a new project with one of our best customers (yay!). Our projects always start with purchase orders and when the customers don't have EDI or an API available (most of them), our project expediters have to download the customer's purchase orders and manually enter them into our system (boo!). All the reasons why this is so are a topic for another day. Today's problem is that our new project already has 242 purchase orders that need to be downloaded and entered into our project management app, so at the very least I can help retrieve the PDF's from which they'll do the data entry.&lt;br&gt;
First, a little background - this particular customer is using a new system called &lt;a href="https://modernpo.com"&gt;Modern PO&lt;/a&gt; to track their PO's. The site is nice - a simple and straightforward user interface and basically what you'd expect from a modern form-based website. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--f4TXQieQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/mmz334k9wy01lzxfm12n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--f4TXQieQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/mmz334k9wy01lzxfm12n.png" alt="Image description" width="880" height="364"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each PO has the PO ID field linked to its details page and each details page contains a set of tabs along the top. The only tab we're concerned with is "View", as this tab contains a "Download" button which will give us a PDF version of the purchase order.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--zhmbpVxG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/he2o1l6stkwtsbv2lrq5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--zhmbpVxG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/he2o1l6stkwtsbv2lrq5.png" alt="Image description" width="880" height="391"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I thought perhaps there would be a site API that would allow me to easily request the PO's I needed, but no such luck. So the task today is to write a script that can access the PO list page and for each PO, access its details page, "View" tab, and save the file linked to the "Download" button. Not super simple but not rocket science either. For tasks like this, my GOTO language (ha!) is Python - &lt;a href="https://python.org"&gt;https://python.org&lt;/a&gt;, specifically Python 3 as there's no sense writing new code in Python 2 at this point.&lt;/p&gt;

&lt;p&gt;In this article I want to walk through the process that I use to actually develop a script like this, for anyone who hasn't done a whole lot of scripting and/or may be intimidated by this type of project. Rather than just giving you something you can copy-paste (although you will have that also), I want to show the steps that it takes to build up your own scripts. Before you know it you'll have your own script library to draw from. I go through basically the same process regardless of the language. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;NOTE: The remainder of the article assumes at least a working knowledge of Python. If you've never written anything in Python before, the &lt;a href="https://learnpython.org"&gt;learnpython.org&lt;/a&gt; site is a great place to start.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;So first off, go ahead and download the finished script so you can follow along: &lt;a href="https://thesoloadmin.com/content/files/2022/03/batch_downloader.py"&gt;batch_downloader.py&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you're on Windows, go ahead and open up your preferred Linux distro in WSL (for this post, I'm using Debian) and make sure you have Python 3 installed with the command &lt;code&gt;python3 --version&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--UQoMIOlW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ng6uenqhspfwff7p4pkc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--UQoMIOlW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ng6uenqhspfwff7p4pkc.png" alt="Image description" width="880" height="460"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you get a "command not found", you'll need to install Python. In Debian/Ubuntu, a simple &lt;code&gt;sudo apt-get install python3&lt;/code&gt; should do the trick.&lt;/p&gt;

&lt;p&gt;When starting a script from a blank slate, it's good practice to name the interpreter on the first line. Once the file is marked executable (&lt;code&gt;chmod +x&lt;/code&gt;), this lets you run it directly from the command line without typing the path to Python each time. Just to make sure, I use the &lt;code&gt;which&lt;/code&gt; command to verify the path to my executable, in this case:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(debian) mpogue@darkstar [~]$ which python3
/usr/bin/python3
(debian) mpogue@darkstar [~]$
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that we have Python 3's path, the first line of our script should read:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#!/usr/bin/python3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For those of you who don't know, the "#!" symbol at the beginning of the line is called the "shebang" or "hashbang" symbol. An excellent explanation of the symbol and why you should include the interpreter line can be found at linuxhandbook.com - &lt;a href="https://linuxhandbook.com/shebang/"&gt;https://linuxhandbook.com/shebang/&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;Next up, we import the modules the script needs (details of the Python import system &lt;a href="https://docs.python.org/3/reference/import.html"&gt;here&lt;/a&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from bs4 import BeautifulSoup
import urllib3
import re
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://www.crummy.com/software/BeautifulSoup/"&gt;BeautifulSoup&lt;/a&gt; is a FANTASTIC Python library for parsing and dissecting HTML documents. If not for its abilities, this project would no longer have been classified as "little".&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://urllib3.readthedocs.io/en/stable/"&gt;urllib3&lt;/a&gt; is pretty much the de facto HTTP client library for Python 3. It's a third-party package rather than part of the standard library, but it's used by much of the Python ecosystem, including pip and the requests library.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.python.org/3/library/re.html#"&gt;re&lt;/a&gt; is Python's built-in regular expression library. My guilty admission? I've struggled with regular expressions since the day I wrote my first line of code and will continue to struggle with them until I keel over at the keyboard. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Now let's start diving into the script itself. First up, I declared a variable for the site's base URL. Not only does it save me from having to type it multiple times, it also makes it easier to change in the future if/when I end up reusing the script. I also created a variable to hold the "SessionId" cookie that needs to be submitted with each request. Finally, I instantiated a new &lt;code&gt;PoolManager()&lt;/code&gt; object; &lt;a href="https://urllib3.readthedocs.io/en/stable/user-guide.html"&gt;according to the &lt;code&gt;urllib3&lt;/code&gt; docs&lt;/a&gt;, the PoolManager object "handles all of the details of connection pooling and thread safety so that you don’t have to".&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# The site's base URL
base_url = 'https://www.whateverwhatever.com'

# Session Cookie, retrieved from the browser after authenticating successfully. We'll need to submit this with each request.
session_cookie = '_SessionId=cmKMxrYDzpULXcMbwtuYDNjzdRCWdGS9xgOPUQtbyAdjyu4LvlPylF3ICxVj3V7NSs%2BliTKtRCRNbEc8BhzCdMGeHWJyyT8n0NEaJ7DKU20TzsuD9FZtMbH5od4xhKrE96vlqDvuPEYegbPtL14Of%2BZZsCI4jXCRRcSk%2FojgBYg%2Bwf%2FICDk3MM5STbkkLvWFXR8PK0Xvg6DBy0mnzR2t2jBh7mPijOLFiFRiVriwze8Xkci2QDmziMrclTxHCMWCqjERGFs4wxwJ9f%2BiWq1Y7CEZx5W5GmEyrwRRUJwVGu2dk%2Bfr82Jr3L09GKB1--y9bG00G4CVBMovga--3DplyhIxzHXKYRswjnQfBw%3D%3D'

# New instance of the PoolManager()
http = urllib3.PoolManager()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before we go any further, now is a good time to fire up an interactive Python shell in your WSL terminal. From the command line, simply run &lt;code&gt;python3&lt;/code&gt; and you'll be dropped into an interactive shell where you can run Python code. The interactive shell is yet another reason why Python is a fantastic language to work with. After entering the shell and executing the lines from the script so far, you should see something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--lXkjvNVu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ft3etk3zjzoa6ntl5wav.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--lXkjvNVu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ft3etk3zjzoa6ntl5wav.png" alt="Image description" width="880" height="381"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As an aside, I've been using Visual Studio Code to write my Python scripts for the last year or two. It's really hard to beat the syntax highlighting and code completion it provides, but with Python, you can code in anything from Notepad (no idea why you would) to any of the wonderful IDEs out there. Support for the language is pretty much universal across all operating systems, which is another reason why it's great for this type of task.&lt;/p&gt;

&lt;p&gt;If you're looking at what we've done so far with the script and you're not overly familiar with web-related programming, you might be asking, "Where did the &lt;code&gt;session_cookie&lt;/code&gt; value come from?" Here's how I determined the value to pass for my session cookie.&lt;/p&gt;

&lt;p&gt;In Firefox, I installed the &lt;a href="https://github.com/Rob--W/cookie-manager"&gt;Cookie Manager&lt;/a&gt; extension which allows me to quickly and easily view all cookies for a given website. Typically, most sites are going to set a cookie containing session details and the modernpo.com site is no different. After going to the site and logging in, click the Cookie Manager button and choose "Open Cookie Manager for the Current Page". The displayed "Name" and "Value" boxes give you the values you need to use.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ZjDCS6BD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/g092p4uzxj0lofcx6u5e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ZjDCS6BD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/g092p4uzxj0lofcx6u5e.png" alt="Image description" width="880" height="377"&gt;&lt;/a&gt;&lt;/p&gt;
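&lt;p&gt;Rather than pasting the cookie into the script itself, one option is to read it from the environment and build the header once. The following is a minimal sketch of that idea, not part of the finished script; the helper name &lt;code&gt;build_cookie_header&lt;/code&gt; and the environment variable &lt;code&gt;PO_SESSION_ID&lt;/code&gt; are made up for illustration, and the cookie name should be whatever Cookie Manager shows for your site:&lt;/p&gt;

```python
import os


def build_cookie_header(name, value):
    """Return a headers dict carrying a single cookie as name=value."""
    if not value:
        raise ValueError('No session cookie value supplied; log in and copy it from the browser')
    return {'Cookie': name + '=' + value}


if __name__ == '__main__':
    # PO_SESSION_ID is a hypothetical variable name; export it in your shell first.
    session_value = os.environ.get('PO_SESSION_ID', '')
    if session_value:
        print(build_cookie_header('_SessionId', session_value))
    else:
        print('PO_SESSION_ID is not set')
```

&lt;p&gt;The returned dict has the same shape as the &lt;code&gt;headers&lt;/code&gt; argument that &lt;code&gt;urllib3&lt;/code&gt; requests accept, and keeping the value out of the file makes it harder to accidentally share a live session along with the script.&lt;/p&gt;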

&lt;p&gt;Next up in our script, we're going to send a request using our &lt;code&gt;PoolManager&lt;/code&gt; instance and hand the response body to BeautifulSoup, telling it to use Python's built-in &lt;code&gt;html.parser&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Create the request
req = http.request('GET', base_url + '/my-company/purchase_orders?page=1', headers={'Cookie':session_cookie})

# Create the HTML parser
soup = BeautifulSoup(req.data, 'html.parser')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that we're sending the HTTP header "Cookie:" containing the &lt;code&gt;session_cookie&lt;/code&gt; variable we defined previously.&lt;/p&gt;
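&lt;p&gt;The &lt;code&gt;?page=1&lt;/code&gt; at the end of that URL suggests the list is paginated, and the request above fetches only the first page. If you need more than one page, a small helper along these lines could generate each URL. This is a sketch: the function name &lt;code&gt;po_list_url&lt;/code&gt; is my own, and how many pages exist (and how the site signals the last one) is something you'd have to check in the browser:&lt;/p&gt;

```python
from urllib.parse import urlencode


def po_list_url(base_url, page):
    """Build the URL for one page of the purchase order list."""
    # Same path as the first-page request; only the page number changes.
    return base_url + '/my-company/purchase_orders?' + urlencode({'page': page})


if __name__ == '__main__':
    # Print the URLs for the first three pages (three is an arbitrary choice).
    for page in range(1, 4):
        print(po_list_url('https://www.whateverwhatever.com', page))
```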

&lt;p&gt;This is the point where - if you're running the script line-by-line in the Python shell - you can start to do some troubleshooting. For instance, after you create the &lt;code&gt;req&lt;/code&gt; variable, it will hold an &lt;code&gt;HTTPResponse&lt;/code&gt; object containing the full HTML document that you can view by printing the &lt;code&gt;.data&lt;/code&gt; property of the object like so:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; print(req.data)
b'&amp;lt;!DOCTYPE html&amp;gt;\n&amp;lt;html&amp;gt;\n  &amp;lt;head&amp;gt;\n    &amp;lt;title&amp;gt;Modern Purchase Order&amp;lt;/title&amp;gt;\n    &amp;lt;meta name="csrf-param" content="authenticity_token" /&amp;gt;\n&amp;lt;meta name="csrf-token" content="LKF1mX644f7yJTN0z8w6fjjnTsLxtCyThmUxHtpdSFnJASB6nEzkU7cP/Sz2mAZ4CwBEcMDBep/PZSeZeL5UuQ==" /&amp;gt;\n    \n\n    &amp;lt;link rel="stylesheet" media="all" href="/assets/application-a06c283f2b5a0769586f9c825ae0bfea58088976154154d8e6014a3726819503.css" data-turbolinks-track="reload" /&amp;gt;\n    &amp;lt;script src="/packs/js/application-7e37f964ea2fd972be1b.js" data-turbolinks-track="reload"&amp;gt;&amp;lt;/script&amp;gt;\n    &amp;lt;link rel="icon" type="image/png" href="/assets/favicon-16afc7069a527b6ad197481a92db4986127fb92d8de89131ff3088a5821997c1.png" /&amp;gt;\n\n    &amp;lt;link rel="preconnect" href="https://fonts.googleapis.com"&amp;gt;\n    &amp;lt;link rel="preconnect" href="https://fonts.gstatic.com" crossorigin&amp;gt;\n    &amp;lt;link href="https://fonts.googleapis.com/css2?family=Roboto&amp;amp;display=swap" rel="stylesheet"&amp;gt;\n\n  &amp;lt;/head&amp;gt;\n\n  &amp;lt;body class=""&amp;gt;\n      \t&amp;lt;nav class="navbar sticky-top navbar-expand-md navbar-light bg-white mb-sm-3 p-2 border-bottom" id="mainnav"&amp;gt;\n\t\t&amp;lt;div class="container-fluid"&amp;gt;\n\t    &amp;lt;a class="navbar-brand mr-2" href="/"&amp;gt;\n\t      &amp;lt;img class="img-fluid" style="width:30px" alt="Modern PO" src="/assets/po_logo-58ec7e8bff....
&amp;lt;snip&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you get a similar response, you know you're on the right track. If not, you know that you need to step back and see what was missed. Moving forward, once we initialize the &lt;code&gt;soup&lt;/code&gt; variable, it holds a parsed version of the HTML document, which can be viewed the same way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; print(soup)
&amp;lt;!DOCTYPE html&amp;gt;

&amp;lt;html&amp;gt;
&amp;lt;head&amp;gt;
&amp;lt;title&amp;gt;Modern Purchase Order&amp;lt;/title&amp;gt;
&amp;lt;meta content="authenticity_token" name="csrf-param"/&amp;gt;
&amp;lt;meta content="LKF1mX644f7yJTN0z8w6fjjnTsLxtCyThmUxHtpdSFnJASB6nEzkU7cP/Sz2mAZ4CwBEcMDBep/PZSeZeL5UuQ==" name="csrf-token"/&amp;gt;...
&amp;lt;snip&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
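&lt;p&gt;Eyeballing the output works fine in the shell, but once the script runs unattended it helps to check the response in code. Here's a minimal sketch, assuming only that the response object exposes &lt;code&gt;.status&lt;/code&gt; and &lt;code&gt;.data&lt;/code&gt; the way a &lt;code&gt;urllib3&lt;/code&gt; response does; the function name &lt;code&gt;preview_response&lt;/code&gt; is hypothetical, and the demo uses a stand-in object so it runs without touching the network:&lt;/p&gt;

```python
from types import SimpleNamespace


def preview_response(response, limit=200):
    """Return the first `limit` characters of the body, or raise if the status is not 200."""
    if response.status != 200:
        raise RuntimeError('Unexpected HTTP status: ' + str(response.status))
    # .data is bytes, so decode it before slicing for display
    return response.data.decode('utf-8', errors='replace')[:limit]


if __name__ == '__main__':
    # Stand-in object with the same .status and .data attributes as a urllib3 response
    fake = SimpleNamespace(status=200, data=b'hello world')
    print(preview_response(fake, limit=5))
```

&lt;p&gt;Keep in mind that a status check alone won't catch an expired session: &lt;code&gt;urllib3&lt;/code&gt; follows redirects by default, so a bounce to a login page can still come back as a 200. Looking at the preview (or the page title) is the quicker way to spot that.&lt;/p&gt;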



&lt;p&gt;Now that we know we have a valid HTML document in the &lt;code&gt;soup&lt;/code&gt; variable, we need to parse it. Going back to Firefox and the page containing the list of purchase orders, I used Firefox's Web Developer Tools (activated with the key combo "ctrl+shift+i") to determine that the purchase order detail links were in the format "/purchase_orders/&amp;lt;id&amp;gt;/edit", where &amp;lt;id&amp;gt; is the purchase order's numeric ID.&lt;/p&gt;

&lt;p&gt;Using BeautifulSoup's &lt;code&gt;find_all&lt;/code&gt; method (see &lt;a href="https://beautiful-soup-4.readthedocs.io/en/latest/#find-all"&gt;here&lt;/a&gt;), we'll first get the "a" elements on the page, limited by &lt;code&gt;string=True&lt;/code&gt; to links that have a text label:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;for link in soup.find_all('a', string=True):
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In our Python shell, we can test this step with the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; for link in soup.find_all('a', string=True):
...     print(link)
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output from the previous lines should be all the "a" elements on the page. This is also handy to validate the format of the purchase order detail links. However, we now need to search the returned elements and get only the ones we need. This is where the regular expression library comes in. &lt;/p&gt;

&lt;p&gt;Inside our &lt;code&gt;for&lt;/code&gt; loop, we use the &lt;code&gt;re&lt;/code&gt; library to search the &lt;code&gt;href&lt;/code&gt; attribute of each link. For each match, we'll split the link at the "/" character, with the next-to-last element of the resulting list containing the purchase order's unique ID. Again using the Firefox Developer Tools, I was able to determine that the page containing the download link I'm looking for was in the format "/purchase_orders/&amp;lt;id&amp;gt;". Therefore, we need to chop off the final "/edit" portion of the URL, leaving the rest intact. In Python, the expression &lt;code&gt;link.get('href').split('/')[:-1]&lt;/code&gt; says to split the &lt;code&gt;href&lt;/code&gt; attribute of each link at the character "/" and keep every element of the list except the last one (the "edit" portion). We then join the list back together and create the link we need by combining the &lt;code&gt;base_url&lt;/code&gt; with the joined path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
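&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Keep only links shaped like /purchase_orders/&amp;lt;id&amp;gt;/edit (raw string, at least one digit)
if re.search(r'/purchase_orders/\d+/edit$', link.get('href', '')):
    # Drop the trailing "edit" segment and rebuild the details-page URL
    link_parts = link.get('href').split('/')[:-1]
    details_link_after_base = '/'.join(link_parts)
    details_page_link = base_url + details_link_after_base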
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
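&lt;p&gt;To try the matching and trimming logic on its own, here it is pulled out into a small function you can paste into the Python shell. This is a sketch rather than a piece of the finished script: the function name &lt;code&gt;details_link&lt;/code&gt; and the sample ID &lt;code&gt;123&lt;/code&gt; are made up, and it assumes the numeric-ID link format described above:&lt;/p&gt;

```python
import re

# Matches hrefs that end in /purchase_orders/ followed by digits and /edit
PO_EDIT_LINK = re.compile(r'/purchase_orders/\d+/edit$')


def details_link(base_url, href):
    """Turn an edit link into the full details-page URL, or return None if it isn't one."""
    if not href or not PO_EDIT_LINK.search(href):
        return None
    # Split on "/" and drop the last element ("edit"), then stitch the path back together
    parts = href.split('/')[:-1]
    return base_url + '/'.join(parts)


if __name__ == '__main__':
    base = 'https://www.whateverwhatever.com'
    print(details_link(base, '/my-company/purchase_orders/123/edit'))
    print(details_link(base, '/my-company/settings'))
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; for anything that doesn't match means the calling loop can simply skip links it doesn't care about, including "a" elements with no &lt;code&gt;href&lt;/code&gt; at all.&lt;/p&gt;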



&lt;p&gt;Note: As you've probably determined from the previous examples, our basic troubleshooting is to use &lt;code&gt;print&lt;/code&gt; in the Python shell to confirm the format of our variables and to get a better view of each object in general. From here on, I will leave that as an exercise for the reader. However, be aware that any assigned variable in Python can be printed. If you get back an object reference, you are most likely looking for one of the object's properties, and printing is also a handy way to work out which property you need to reference.&lt;/p&gt;

&lt;p&gt;Next up, I loaded the Details page link for the purchase order in Firefox. Ah ha! The bottom of the page has the "Download" button that links to a PDF of the purchase order. We're almost home!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--g5iGN0xj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/umqm12vy46g76te296j0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--g5iGN0xj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/umqm12vy46g76te296j0.png" alt="Image description" width="880" height="453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The parent element's contents - &lt;code&gt;&amp;lt;div class="my-4"&amp;gt;&lt;/code&gt; - contain the purchase order number which we'll use as part of the file name. So we need to get the details page, parse it to find the appropriate div by its CSS class, get the purchase order number from the contents, get the download link, create a filename variable, and download and save the PDF:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        # Get the Details page and assign to a BeautifulSoup variable
        details_page = http.request('GET', details_page_link, headers={'Cookie':session_cookie})
        po_details_html = BeautifulSoup(details_page.data, 'html.parser')

        # Look for the CSS class "my-4" - this is the parent element for our download link
        po_number_div = po_details_html.find('div', class_='my-4')

        # Assign a filename variable using the purchase order number, located in the div's contents
        filename = 'purchase_order_' + po_number_div.contents[0].contents[0] + '.pdf'

        # Locate the download link
        file_url_part = po_details_html.find('a', class_='btn-primary').get('href')
        full_download_link = base_url + file_url_part

        # Print a status message, download the PDF, and save to disk
        print("Downloading PO from link " + full_download_link)
        file_req = http.request('GET', full_download_link)
        with open(filename, 'wb') as f:
            f.write(file_req.data)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
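&lt;p&gt;Two things can go wrong at the save step: the purchase order number may contain characters that don't belong in a file name, and an expired session can hand you an HTML page that gets saved with a &lt;code&gt;.pdf&lt;/code&gt; extension. The sketch below guards against both. It's an optional hardening idea, not what the finished script does; the function names &lt;code&gt;po_filename&lt;/code&gt; and &lt;code&gt;save_pdf&lt;/code&gt; are hypothetical, and the only format assumption is that real PDF files begin with the bytes &lt;code&gt;%PDF-&lt;/code&gt;:&lt;/p&gt;

```python
import re
import tempfile
from pathlib import Path


def po_filename(po_number):
    """Build a filesystem-safe PDF name from the PO number text."""
    # Replace each run of characters outside letters, digits, "_", "." and "-" with "_"
    cleaned = re.sub(r'[^A-Za-z0-9_.-]+', '_', str(po_number).strip())
    return 'purchase_order_' + cleaned + '.pdf'


def save_pdf(directory, po_number, data):
    """Write data into directory if it looks like a PDF; return the path, or None if it doesn't."""
    # PDF files begin with the bytes %PDF-; an HTML login page will not.
    if not data.startswith(b'%PDF-'):
        return None
    target = Path(directory) / po_filename(po_number)
    target.write_bytes(data)
    return target


if __name__ == '__main__':
    # Demo with placeholder bytes in a throwaway directory
    with tempfile.TemporaryDirectory() as tmp:
        print(save_pdf(tmp, ' PO 1001/A ', b'%PDF-1.7 example bytes'))
```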



&lt;p&gt;Again, I would encourage you to use the Python shell to &lt;code&gt;print&lt;/code&gt; each variable as you work your way through the script.&lt;/p&gt;

&lt;p&gt;As a solo admin, learning to write scripts in multiple languages is crucial to your overall success. It will save you and/or your users time in different situations and is a critical tool in your toolbox. It also never hurts to be able to list multiple programming languages on your resume!&lt;/p&gt;

&lt;p&gt;If you're adapting this script for your own use and you run into problems, feel free to leave a comment or send me an email (&lt;a href="mailto:matt@thesoloadmin.com"&gt;matt@thesoloadmin.com&lt;/a&gt;) and I'd be happy to help any way that I can. Thanks for reading!&lt;/p&gt;

</description>
      <category>python</category>
      <category>programming</category>
      <category>scripting</category>
      <category>shell</category>
    </item>
  </channel>
</rss>
