Forem: Mike

Thinking of self-hosting? Here's some tips.

Mike — Wed, 24 Apr 2019 01:49:19 +0000

This was originally published on my blog, where I regularly post tech tutorials, and other awesome content.

This was inspired by a Reddit post in which the author asks for advice on self-hosting things. I weighed in with some points on locking down your self-hosted services.

Why self-host?

It's fun, and it gives you real-world experience. On top of this, if you care about privacy and security, it lets you lock the environment down so you feel more comfortable. Remember, nothing is bulletproof. If a determined threat actor wants in, it only takes one tiny mistake to open the floodgates.

OK, so I'll host. What next?

Welcome to the exciting world of self-hosting! It's a great hobby. System administration rocks.

Let's assume you'll be using a flavour of Linux, such as Debian or Ubuntu. First, I always pick the most recent LTS; in the case of Ubuntu, that's 18.04 LTS. I use DigitalOcean (ref link) for hosting most personal things, including this very blog along with my wife's. They've done a great job at making life super simple for both developers and newcomers.

Go ahead and make an account at DigitalOcean, get some free credit, and spin up a droplet for yourself with Ubuntu 18.04. You'll need an SSH key; you can generate one on Linux or macOS by doing the following:

ssh-keygen -t rsa -b 4096

To retrieve the public key, just run this command and then paste the output into the SSH key field on DigitalOcean:

cat ~/.ssh/id_rsa.pub

Then, finalize and create your droplet! Within about 1 minute your droplet is online and ready to use.

Accessing your droplet

You'll need to use SSH. Just type the following in your terminal (Linux/macOS), or use PuTTY or another SSH client on Windows:

ssh root@$IP_ADDRESS

Replace $IP_ADDRESS with your droplet's IP address. You're in!

Securing your droplet

The first thing you'll want to do is start locking down your droplet. Start by running a simple apt update and installing fail2ban:

apt-get update && apt-get install fail2ban -y

This will update your package index, then install and enable fail2ban. Fail2ban works wonders for SSH and helps ban brute-force attackers. But we really shouldn't keep SSH on the default port 22.
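To make the idea behind that concrete, here is a minimal Python sketch of the kind of bookkeeping a tool like fail2ban does: count failed logins per IP inside a time window, and ban once a threshold is crossed. This is not fail2ban's actual code. The class name `FailureTracker` is hypothetical, and `max_retry` / `find_time` are only loosely modelled on fail2ban's `maxretry` and `findtime` settings, with illustrative values.

```python
from collections import defaultdict, deque


class FailureTracker:
    """Toy model of threshold-based banning (not fail2ban's real code)."""

    def __init__(self, max_retry=5, find_time=600):
        self.max_retry = max_retry  # failures tolerated inside the window
        self.find_time = find_time  # window length in seconds
        self.failures = defaultdict(deque)
        self.banned = set()

    def record_failure(self, ip, timestamp):
        """Record one failed login; return True if the IP is now banned."""
        window = self.failures[ip]
        window.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while window and timestamp - window[0] > self.find_time:
            window.popleft()
        if len(window) >= self.max_retry:
            self.banned.add(ip)
        return ip in self.banned


tracker = FailureTracker(max_retry=3, find_time=60)
for second in (0, 10, 20):
    is_banned = tracker.record_failure("203.0.113.7", second)
print(is_banned)
```

Three quick failures trip the ban here, while the same three failures spread far apart would not, which is why slow, distributed attacks are better handled by the firewall rules further down.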

It is always advisable to either (a) change the SSH port, (b) limit access to SSH via firewall, or, (c) do both. I always pick option C.

Let's change the ssh port! I use vim for this (yes, I know) - but nano is a good alternative. If you don't have nano, run apt-get install nano and you'll have it.

vim /etc/ssh/sshd_config

In the first few lines you should see "Port 22" (on Ubuntu 18.04 it is commented out as "#Port 22", so remove the # as well). Go ahead and change this to a port you'll remember, but pick something that isn't simple to guess (so not, e.g., 2222). In this example I'll use port 9132.

Go ahead and save this file, then restart ssh - you may have to reconnect once this is done.

service ssh restart

To reconnect with the new port, run:

ssh root@$IP_ADDRESS -p $PORT
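If you'd rather script the port change than edit the file by hand, the edit can be sketched in Python roughly like this. It's only a sketch: `set_ssh_port` is a hypothetical helper, it works on the config text rather than touching /etc/ssh/sshd_config itself, and it only rewrites the first Port line it finds.

```python
import re


def set_ssh_port(config_text, port):
    """Return sshd_config text with its Port directive set to `port`."""
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    new_line = f"Port {port}"
    # Match an active or commented-out Port line, e.g. "Port 22" or "#Port 22".
    pattern = re.compile(r"^[ \t]*#?[ \t]*Port[ \t]+\d+[ \t]*$", re.MULTILINE)
    if pattern.search(config_text):
        return pattern.sub(new_line, config_text, count=1)
    # No Port line at all: append one at the end.
    return config_text.rstrip("\n") + "\n" + new_line + "\n"


sample = "#Port 22\nPermitRootLogin yes\n"
print(set_ssh_port(sample, 9132))
```

Whichever way you make the change, running `sshd -t` before restarting is a cheap way to catch a broken config while you still have a working session open.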

Now this is getting more locked down, but let's imagine we have an apache2 server running, and we use Cloudflare. We want to make sure that only Cloudflare and our own IP can access the web server directly, so let's introduce a Cloud Firewall from DigitalOcean. Go ahead and create one, and tag it appropriately.

Here is what your basic configuration should look like (the IP list for Cloudflare is available here: https://www.cloudflare.com/ips/):

As you can see, HTTP is open to Cloudflare only. Perfect. Now if you try to access it directly, you should get rejected by the firewall (unless you whitelisted yourself). I opened 10.0.0.0/8 for this demonstration so that multiple droplets can communicate properly with each other (I'd suggest whitelisting individual IPs manually rather than entire subnets, however). YMMV.
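The rule the firewall applies boils down to "is the source address inside one of the allowed ranges?". Here's a small Python sketch of that check using the standard `ipaddress` module. The two Cloudflare ranges are an illustrative subset (always take the current list from the URL above), and 198.51.100.25 is a made-up stand-in for "my own IP".

```python
import ipaddress

# Illustrative subset only; fetch the current ranges from
# https://www.cloudflare.com/ips/ rather than trusting this list.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("173.245.48.0/20"),
    ipaddress.ip_network("103.21.244.0/22"),
    ipaddress.ip_network("198.51.100.25/32"),  # hypothetical "my own IP"
]


def is_allowed(source_ip):
    """Return True if source_ip falls inside any allowed network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in ALLOWED_NETWORKS)


for candidate in ("173.245.48.10", "198.51.100.25", "8.8.8.8"):
    print(candidate, is_allowed(candidate))
```

Anything that doesn't match a range is simply dropped, which is exactly what you want for a web server that should only ever be reached through Cloudflare.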

Always take backups!

DigitalOcean charges peanuts to do them automatically; this blog costs me $1/mo to back up. I'd recommend you always keep a personal backup as well, just in case.

That's it!

Now you can experiment and install your own software, like NextCloud. For further reading, check out /r/homelab and /r/selfhosted - it's quite addictive.

If you have any questions, feel free to give me a shout!

CD: Configuring my own ISP network to reach 2Gbit+ speeds

Mike — Thu, 18 Apr 2019 09:52:37 +0000

This was originally posted on my blog, where you'll find everything from breakfast to servers catching fire and spaghetti code in production. Come check it out!

I've never blogged about this directly before, but I operate my own ISP locally. I've got stable customers, and I'm profitable! Recently, however, I started getting bogged down on port speed: we managed to exhaust an entire 10G port. Wowsers.

I've added new nodes to my cluster for higher bandwidth, and migrated some hosts over this evening. I use Grafana, Observium, and a few other tools to catalog and monitor everything in production - it's perfect for this sort of thing. Right now the new cluster node is averaging 2G consistently, which is pretty awesome. I didn't have to spend hundreds to spin it up either. Thanks to my extensive work on scaling my business, it took me about 45 minutes plus pushing some configuration files to customers.

How did I manage to burst over 10G?

Because I'm an ISP, my customers have a variety of needs: some stream constantly, some are heavy data hoarders and download tonnes. As you can see from the above screenshot, there's a lot of data being transferred here. I do extensive monitoring of what is being used to ensure there is no abuse.

How'd I start my own ISP?

It started in 2016. I was fed up with the prices the local ISPs were offering, and they had shoddy service at best. If you paid for a 1G pipe, you'd get at best 500Mbit down, 150Mbit up. This wasn't due to any congestion; they just didn't want to allot more bandwidth, despite selling the service as if they did.

I entered into an agreement to lease lines, and colocated gear. I bought bandwidth from transit providers, and acquired an ASN + IP range. Bingo, I began. Selling was easy: in my area of town, I was the best option. I cost about $25/mo more than my competitors, but instead of 1G "symmetrical" with a 5TB cap, I offered 1G symmetrical with a 50TB cap. That's a lot of bandwidth.

Automating this wasn't easy: it's a mixture of FreeRADIUS and Ansible scripts to push config files. It works, it communicates back with my billing appliance, and my bandwidth calculator pumps the data into the billing appliance as well. At the end of the month, my customers get a detailed invoice stating amount used, average speed, and bandwidth credit (if applicable).
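To give a feel for the arithmetic behind the "amount used" and "average speed" lines on an invoice, here's a rough Python sketch. It is not my actual bandwidth calculator: `summarize_usage` is a hypothetical function, it assumes decimal terabytes and a 30-day billing period, and it leaves out bandwidth credits entirely.

```python
def summarize_usage(bytes_transferred, period_days, cap_tb=50):
    """Summarize one customer's transfer for a billing period."""
    seconds = period_days * 24 * 60 * 60
    used_tb = bytes_transferred / 1e12  # decimal terabytes
    avg_mbit = bytes_transferred * 8 / seconds / 1e6
    return {
        "used_tb": round(used_tb, 2),
        "avg_mbit_per_s": round(avg_mbit, 1),
        "cap_used_pct": round(100 * used_tb / cap_tb, 1),
    }


# A customer who uses the full 50TB cap over a 30-day period.
print(summarize_usage(50e12, 30))
```

Under those assumptions, burning through the whole 50TB cap works out to an average of only about 154Mbit/s sustained over the month, which puts the 1G port speed and the cap into perspective.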

And, as of early this year, I am beginning to become profitable. This is a big milestone, it's been my dream to run an ISP, and it's happening every single day.

Any tips for people who want to run their own ISP?

First, it's hard - the logistics of it all are complex. Before you consider it, get a list of at least 10-20 people who would sign up with you immediately. This helps bring in money, and possibly startup funds to acquire equipment.

Start small. Don't immediately buy 50G of transit; start with 1G, sell slowly, and upgrade as you need. If you jump into it big, you'll lose big time.

Don't fold - even if your competitor is cheaper, keep your same rates - focus on quality not quantity.

Have an actual support panel and reply to tickets actively, engage customers.

Don't focus on selling your own hardware, it's OK to resell and put your own scripts on it.

Most importantly, have fun - this is an experience you'll never forget. You don't need lots of money, you just need passion.

Good luck!

Recursion is difficult: PostgreSQL Edition

Mike — Thu, 18 Apr 2019 09:45:48 +0000

This was originally posted on my blog, where you'll find me discussing everything from lunch to engineering, system administration, and running my own ISP!

Arguably one of the most difficult things to get a grasp on, no matter the experience level the developer has, is recursion. When implemented properly, recursion is beautiful, simple, and can be entirely stateless.

Now imagine implementing recursive functions successfully in PostgreSQL that are capable of maintaining themselves via various triggers. This premise is beautiful, a one-deploy function and trigger that can detect updates in tables, and trigger a function. This function accepts a parameter of the OLD state, and NEW state – seems simple enough, or is it?

Mocking up recursive functions

I've always found that mocking the intended recursive function on paper is the best way to get the full picture; in most cases you can't reliably see recursion in your head. I usually default to my most-used language, PHP, when writing the function on paper. For example, if I wanted to take the OLD state and the NEW state and check whether the id_link number changed, it would look like this:

function onUpdate( \psqltable $previous, \psqltable $next ) {
    if( $previous->get('id_link') !== $next->get('id_link') ) {
         // this is now linked to a new parent element, let's rebuild
         rebuild( $previous, $next );
    }
}

Our end goal is, if we're linked to a new "id_link" tag, we want to rebuild an entire tree structure, deleting and re-inserting new nodes as things change, all automatically via triggers. Let's dive into how this may look...

function rebuild( &$previous, &$next ) {
    // Recurse into every child first, so the deepest nodes are handled before their parents
    foreach( $previous->children() as $child ) {
        rebuild( $child, $next ); // recursion one
    }
    // Then re-insert the current node under the new parent link
    $previous->getCurrent()->reinsert()->change('id_link', $next->get('id_link'))->change('last_updated', time())->apply();
}

If we begin to pull apart this function, we can see that we get all the children of our current node. If we have any, we call rebuild again for each of the n children, until we've reached the bottom of the tree. We then get our current index in the object, and change our id_link and last_updated fields to the parent's id_link and the current time respectively.
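The same children-first walk can be sketched in runnable form. This Python version is only an illustration of the traversal order: the `Node` class and its fields are hypothetical stand-ins for the table rows, and the timestamp is just an example value.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    id_link: str
    last_updated: float = 0.0
    children: list = field(default_factory=list)


def rebuild(node, new_id_link, now, visited=None):
    """Relink a node and its whole subtree, children before parents."""
    visited = [] if visited is None else visited
    for child in node.children:
        rebuild(child, new_id_link, now, visited)
    # Only once every descendant is done do we touch the node itself.
    node.id_link = new_id_link
    node.last_updated = now
    visited.append(node.name)
    return visited


leaf_a = Node("leaf_a", "old")
leaf_b = Node("leaf_b", "old")
root = Node("root", "old", children=[leaf_a, leaf_b])
order = rebuild(root, "new", now=1555580748.0)
print(order)  # children are visited before their parent
```

Printing the visit order is a handy trick when mocking recursion: it shows at a glance that the leaves are handled before the node that owns them.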

On the surface this seems pretty simple, so let's give it a kick at the can in PSQL this time!

PSQL Mockup

Disclosure: By no means a PSQL expert!

Starting simple with our most basic triggers:

  -- Define the function first: CREATE TRIGGER fails if the function does not exist yet.
  CREATE OR REPLACE FUNCTION reference_detail_changed() RETURNS trigger language plpgsql AS $$
    BEGIN
        IF TG_OP = 'UPDATE' OR TG_OP = 'INSERT' THEN
            PERFORM adapt_change( OLD.id_link, NEW.id_link );
            RETURN NEW;
        ELSE
            -- DELETE: passing the same id twice tells adapt_change to remove the tree
            PERFORM adapt_change( OLD.id_link, OLD.id_link );
            RETURN OLD;
        END IF;
    END;
  $$;

  DROP TRIGGER IF EXISTS trigger_for_changes ON reference_detail;
  CREATE TRIGGER trigger_for_changes BEFORE INSERT OR UPDATE OR DELETE ON reference_detail FOR EACH ROW EXECUTE PROCEDURE reference_detail_changed();

In the implementation I decided to change things around: instead of passing the full OLD and NEW states, we just pass our ID_LINK values and search in the header table, which contains metadata for all current objects. Let's give our function an attempt:

 CREATE OR REPLACE FUNCTION adapt_change( old_id_link VARCHAR, new_id_link VARCHAR ) returns integer
  language plpgsql
  as $$
    DECLARE
      meta_row RECORD;
    BEGIN
      -- PL/pgSQL compares with a single "="; identical ids mean we were called for a DELETE
      IF old_id_link = new_id_link THEN
        DELETE FROM reference_detail WHERE id_link = old_id_link;
        FOR meta_row IN SELECT id_curr_link FROM reference_meta WHERE id_meta_link = old_id_link LOOP
          DELETE FROM reference_meta WHERE id_meta_link = old_id_link; -- clear out meta record
          PERFORM adapt_change( meta_row.id_curr_link, meta_row.id_curr_link ); -- parent was destroyed, time to delete ourselves!
        END LOOP;
      ELSE
        UPDATE reference_detail SET id_link = new_id_link WHERE id_link = old_id_link;
        UPDATE reference_meta SET id_curr_link = new_id_link, id_meta_link = new_id_link WHERE id_link = old_id_link;
        FOR meta_row IN SELECT rowid, id_curr_link FROM reference_meta WHERE id_meta_link = old_id_link LOOP
          -- qualify with meta_row so we compare against the loop row, not the column itself
          UPDATE reference_detail SET id_link = new_id_link WHERE id_link = meta_row.id_curr_link;
          UPDATE reference_meta SET id_curr_link = new_id_link, id_meta_link = new_id_link WHERE id_curr_link = meta_row.id_curr_link AND rowid = meta_row.rowid;
        END LOOP;
      END IF;
      RETURN 1;
    END;
  $$;

It's a mess, but it's fully recursive. Let me break it down some, starting with our first IF condition: we check if our old id matches our new id. If that's the case, we're a DELETE operation, so we delete ourselves and our child tree, as we no longer need them.

Our ELSE condition gets much more interesting, let's take an isolated look at it here:

UPDATE reference_detail SET id_link = new_id_link WHERE id_link = old_id_link;
UPDATE reference_meta SET id_curr_link = new_id_link, id_meta_link = new_id_link WHERE id_link = old_id_link;
FOR meta_row IN SELECT rowid, id_curr_link FROM reference_meta WHERE id_meta_link = old_id_link LOOP
  UPDATE reference_detail SET id_link = new_id_link WHERE id_link = meta_row.id_curr_link;
  UPDATE reference_meta SET id_curr_link = new_id_link, id_meta_link = new_id_link WHERE id_curr_link = meta_row.id_curr_link AND rowid = meta_row.rowid;
END LOOP;

Note: For those who are unaware, PERFORM is used as we don't care for the returned value in this case.

First, we update our record in the detail and meta tables to match our new parent. Then we begin iterating on our meta_row object, getting the current link (id_curr_link) from the meta table wherever id_meta_link matches the ID we were just a child node of. Then we begin recursion again, and begin to alter that record and all its children.

It's by no means a perfect function, but for the use case I have currently, which is wanting to delete/update trees of data, it seems to just fit the bill. Performance wise, it's pretty snappy, on average takes less than one second from start to finish.

This is a basic implementation of a self-managing database table, run by INSERT/UPDATE/DELETE triggers and recursive functions. Pretty neat, if you ask me. It's nice to have the database maintain itself to a degree; it takes away the headache of using another language (e.g., PHP) to maintain it. PostgreSQL is incredibly fast if done right. Getting it right takes work, but it's worth it!

Engineering yourself to become a better Engineer

Mike — Thu, 18 Apr 2019 09:38:08 +0000

This was originally posted on my blog, where I discuss all things engineering, sysadmin, labbing!

There's this paradox that you need to be as fast as possible when programming - just ship it, they say. I'd like to argue the counter - perfect what you can, then ship it. It's important to get an MVP out there, but what is equally important is making the MVP good enough that potential customers want to remain on the platform. When I find an MVP, I make a point of spending 10 minutes on it. If after 10 minutes I don't understand the ultimate goal, or it's not possible to do what's advertised within reason, it's lacking engineering, not features.

This goes hand in hand with wanting to improve yourself: in order to understand yourself better, make yourself an MVP. You're the customer of your own skillset - how do you face a task? A question I often ask is, "If I showed up at your desk, gave you 20 things to do on your first day, and went 'oh by the way, I have a meeting so good luck, see you tomorrow!', how would you react?" You'd be surprised how many hopefuls tell me, "I'd dive right in and get it all done to perfection before you get back in, I'd go the extra mile and ensure to ..." - not the answer we wanted to hear. Let's be real here, you're only going to get a few things done. Personally, I'd spend the day introducing myself and getting to know everyone first, then get the tasks done the next day.

This is just one example out of many. I've often found that during interviews, the average person overstates their confidence and ability. I'd rather you aim high and miss than shoot low and hit. Tell me your abilities straight up, with no sugarcoating. It won't mean you can't work - in fact, it shows you're willing to be honest with yourself. Everyone wants a junior developer with 10 years of experience, and unfortunately most people give in to it, even if they have excellent experience.

My takeaway for all those who are seeking careers in computer science-related fields:

No is a perfectly acceptable answer
Find the right job for you, don't listen to the people around you - make the decision based on where you want to be in a few years down the road
Don't jump ship because it's tough
Be nice to everyone, despite their attitude towards you
Finally, to err is human - to blame others shows management potential. /s

Super fast full text search with PostgreSQL

Mike — Thu, 18 Apr 2019 02:30:24 +0000

This was originally posted on my blog, where I post all things tech and sysadmin and the like!

What if I told you that searching millions of records didn't have to be complex or take minutes to complete? Most organizations jump ship to a dedicated search engine like Elasticsearch, but they greatly underestimate the power of PostgreSQL.

Let's suppose we have a table with 1.1 million records, and we want to apply a text search vector to it. We care about the author and publisher names. In order to create the vectors, we need to take our old data and put it into a new column that holds a vector.

Time to break it down! Our author and publisher names are just strings. For example, the author is "Jack Sparrow" and the publisher is "A Movie Corporation" - we want to be able to start typing "jac" or "mov cor" and have it pick up Jack Sparrow and/or A Movie Corporation. Makes sense! The issue we face is that we don't want to search two columns, and we need an index so we can search rapidly. So we make a new column in the database table; let's call this column "tsv" for short.

First, add the column, then apply a GIN index on it.

ALTER TABLE searchable ADD COLUMN tsv tsvector;

CREATE INDEX searchable_gin_idx ON searchable USING GIN(tsv);

Now, let's update our table records accordingly, you can use this SQL to create an English TSVector and instantly apply it to the TSV field.

UPDATE searchable SET tsv = ( to_tsvector('english', coalesce(author, '')) || to_tsvector('english', coalesce(publisher, '')) );

This fills the tsv column from our concatenated author and publisher, and the GIN index we created picks the values up as they're written. We will now be able to perform both simple and complex fast search queries. It's the same data, without compromising speed. For example, if we want to find all items matching 'jac spar' as the author, we can do the following (note that to_tsquery needs an operator such as & between the terms):

SELECT * FROM searchable WHERE tsv @@ to_tsquery('english', 'jac:* & spar:*');

Instantly, we're presented with a ton of results. It's interesting to note, that even though we've attached a wildcard to the query, PostgreSQL is still able to rapidly search and get our result set. We could further apply a ts_rank_cd in the ORDER BY clause to rank them according to weights.
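In an application you'd normally build that prefix query from whatever the user typed. Here's a small Python sketch of one way to do it; `build_prefix_tsquery` is a hypothetical helper, and it assumes every typed word should be a prefix match joined with AND.

```python
import re


def build_prefix_tsquery(user_input):
    """Turn free text like 'mov cor' into a prefix tsquery string."""
    # Keep only word characters so stray operators can't break the query syntax.
    terms = re.findall(r"\w+", user_input.lower())
    return " & ".join(f"{term}:*" for term in terms)


print(build_prefix_tsquery("mov cor"))
```

The resulting string should be handed to to_tsquery as a bound parameter through your database driver rather than pasted into the SQL text, and an empty result (the user typed nothing searchable) is best handled before running the query at all.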

Speed and consistency matters.
If we run explain analyze on it, we can see that the query took 0.013ms to plan, and 0.067ms to execute.

If we were to run the same query without a GIN index, we see it took 0.121ms to plan, and 132.14ms to execute.

As you can see, this is a major performance boost: execution drops from 132.14ms to 0.067ms, roughly 2,000 times faster, and we get the same results we did previously. Thus, another reason you may not need a search product - if you optimize your queries and database table with the proper indexes, you can leverage the full power of PostgreSQL.

The beauty in simplicity: Migration in real-time, without data loss.

Mike — Thu, 18 Apr 2019 02:26:19 +0000

This was originally posted on my blog, where I publish all things tech, sysadmin, and more

We've all heard the stories - being able to migrate in real time, without any data loss. I've done it once before for this blog, and I've done it again for my wife and me. In the middle of the day, with about 1k visitors online. Here's a breakdown of how I did it.

The Goal

Not only do I want to migrate providers, I want to migrate my content to a new country, literally continents apart. I'm moving from my old blog host on Hetzner to DigitalOcean. Luckily, both sides have a 1G port (Hetzner unmetered, DO has a few TB cap) - more than enough.

File size isn't an issue: it's about 2.4 GB, and the transfer took about 1 minute in total (which isn't bad considering the distance, though it could be improved).

The real issue is real-time movement. Behind the scenes there are analytics engines, a few headless APIs, etc. that all interact with data live.

In order to fix this, we'll need to set up master/slave replication on PostgreSQL, then go to fully fledged master/master replication. Next, I imagine we'll have to slowly pull traffic off the old host to the new one. Finally, turn off the old box.

Explanation

With our files transferred, I've set up both DO VMs as 1GB / 25GB SSD / 2TB bandwidth instances. I've enabled SSH keys.

While browsing the DO interface, I noticed they have their Cloud Firewalls. This is great - the interface is better than Hetzner's firewall interface (for dedicated servers). I started by creating one that is filtered by tags (production, cloudflare, blogging), so this firewall will be applied to virtual machines with those tags.

I want only SSH from my jump servers (OVH BHS VM or my home servers), HTTP/HTTPS only from Cloudflare, and deny the rest of the world for this instance. Because Cloud Firewalls are so simple, this is how the rules ended up looking:

Outbound, I've allowed to all sources (for good or bad). I should look at tightening this up in the future, but for right now I'll let it be as-is. I want to make sure everything gets up for production first.

I've enabled private networking on my two VMs for our blogs, which helps save on bandwidth count, and I've linked them together in the same DC. Sweet.

Now, for the SQL instance, I created a firewall called "int-production" - this will reject ALL communications on its public IPv4 and IPv6 addresses, but it can speak to the connected VMs that need access (e.g. blogs). I've enabled private networking so I can connect to it directly over the private network, yet again saving bandwidth (as it's metered on DigitalOcean).

I set up master/slave replication, allowed the firewall to accept from the Hetzner IP, and allowed outbound to the Hetzner IP only as well. After 20 minutes, we were synced up and ready to motor. I flipped it over to master/master, and started slowly redirecting the domains over.

15 minutes later, I was confident the hosts were identical, and shut off replication, turned off the Hetzner node. I ran over to my monitoring on UptimeRobot, and saw exactly what I was looking for - 100% uptime.

But wait, all my error logs have CF's IP in them. Before, I had set up some rules on the host dedicated server; now that these are individual VMs, I need to re-configure them. I found this article + script for nginx that worked like a charm, and set up a crontab for it. I set it to run once every week (Sunday, 11 PM GMT), which is more than sufficient for my needs.
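I can't reproduce that script here, so the following is only a Python sketch of the general approach such scripts take: turn a list of Cloudflare ranges into nginx realip directives so the logs show the visitor's address instead of Cloudflare's. `render_real_ip_config` is a hypothetical helper, the two ranges are an illustrative subset of the list at https://www.cloudflare.com/ips/, and it assumes nginx has the realip module available.

```python
import ipaddress


def render_real_ip_config(cidr_ranges):
    """Render nginx realip directives that trust the given proxy ranges."""
    lines = []
    for cidr in cidr_ranges:
        network = ipaddress.ip_network(cidr)  # raises ValueError on bad input
        lines.append(f"set_real_ip_from {network};")
    # Cloudflare passes the original client address in this request header.
    lines.append("real_ip_header CF-Connecting-IP;")
    return "\n".join(lines) + "\n"


sample_ranges = ["173.245.48.0/20", "103.21.244.0/22"]  # illustrative subset
print(render_real_ip_config(sample_ranges))
```

The output would be written to a file that nginx includes, followed by a config test and reload. For the weekly schedule, a cron entry of `0 23 * * 0` runs on Sundays at 23:00 in the server's local time, so it lines up with 11 PM GMT only if the server clock is on GMT/UTC.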

No data was lost, we were online the whole time, and this scales for my growing needs. My wife's and my blogs combined have hit well over 50,000 requests per day, and the old system didn't scale well and would constantly drop legitimate connections. The energy and time lost debugging this outweighed the cost of staying at the old provider.

Thanks to the better flexibility of DigitalOcean, I've been able to apply the same rules to all my VMs via tags, instead of how I used to do it - writing them individually and replicating them across N virtual machines. Alert Policies for memory, disk, and inbound/outbound bandwidth help too.

Future ideas

In the event of traffic spikes, and as my side project network grows, I'll spin up a DO load balancer. It's only $10 per month and seems very easy to use, and I can throw my VMs for blogs and apps behind it. I'd love to get more oversight, possibly migrate my Grafana + Prometheus setup to DO, and use internal networking to save on bandwidth.

Let's see how it holds up. I'd love to be in a position to migrate all my production off my own colocated gear, and revert the colocated gear back to a lab. I'm going to keep testing DO for a few months, and hope to make a full switch. The price point is reasonable, and the location is good (20ms or less from where I live).

That's all for now! I'll keep updating on my experience with DO as I dive into more features. Maybe next I'll try their k8s services.