Forem: Michael Weibel

Deploying next.js on AWS ElasticBeanstalk

Michael Weibel — Wed, 10 Jun 2020 06:04:43 +0000

AWS ElasticBeanstalk (EB) is a service to deploy applications in a simple manner.
AWS EB has quite a range of features. It allows you to configure rolling deployment, monitoring, alerting, database setup, etc. It's generally much easier to use than doing it from scratch.

As with all such systems, this comes at a cost: you initially don't know a lot about the system and figuring out what's wrong might be difficult.
Additionally, AWS EB recently switched to Amazon Linux 2. This new version has a different way to deploy than the previous version "Amazon Linux AMI". As a result, lots of articles and StackOverflow questions/answers are outdated.
The documentation on AWS itself could be a lot better, too. It's not always clear which platform version the docs refer to. For example, the documented way of serving static files does not work on Amazon Linux 2.

I deployed a next.js app on AWS EB recently and learned a few tricks. Here's a quick summary of them.

NODE_ENV

To set the correct NODE_ENV when building and running the application on AWS EB, place the following contents in the file .ebextensions/options.config:

option_settings:
  aws:elasticbeanstalk:application:environment:
    NODE_ENV: production

.ebignore

.ebignore lets you exclude files from the archive that the EB CLI deploys. The format is the same as .gitignore, and if .ebignore is not present, the deployment uses .gitignore instead. Usually some things belong in git but not in the deployed archive, hence the need for a .ebignore file.
Here's my example .ebignore:

# dependencies
node_modules/

# repository/project stuff
.idea/
.git/
.gitlab-ci.yml
README.md

# misc
.DS_Store

# debug
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# local env files
.env.local
.env.development.local
.env.test.local
.env.production.local

# non prod env files
.env.development
.env.test

PORT env variable

Like many other systems, AWS EB sets the PORT environment variable to tell the app which port to listen on. If you don't use a custom server, adjust the start script in package.json as follows:

"start": "next start -p $PORT"

Using yarn instead of npm

If dependencies are not installed correctly (read: weird deployment issues you don't have locally), it might be because you use yarn instead of npm. By default, AWS EB uses npm to install your dependencies. A repository that uses yarn usually has a yarn.lock file instead of a package-lock.json, and npm ignores yarn.lock, so it may resolve different versions than you get locally. Here's how to "switch" to yarn instead:

#!/bin/bash
# place in .platform/hooks/prebuild/yarn.sh (the shebang has to stay on the first line)

# need to install node first to be able to install yarn (as at prebuild no node is present yet)
sudo curl --silent --location https://rpm.nodesource.com/setup_12.x | sudo bash -
sudo yum -y install nodejs

# install yarn
sudo wget https://dl.yarnpkg.com/rpm/yarn.repo -O /etc/yum.repos.d/yarn.repo
sudo yum -y install yarn

# install
cd /var/app/staging/

# debugging..
ls -lah

yarn install --prod

chown -R webapp:webapp node_modules/ || true # allow to fail

Make sure the setup_12.x part of the curl URL matches the node.js version you need.

"switch" is in quotes because after the prebuild hook the EB engine will still run npm install. However, it seems to work quite well regardless.
I'd recommend: If you can avoid it, use npm.
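One detail that is easy to miss with the hook above: at least on the platform versions from the time of writing, Elastic Beanstalk expects platform hook scripts to have the execute bit set (newer platform versions may set it automatically). A minimal sketch of a check to run before deploying, assuming hooks live under .platform/hooks; find_non_executable_hooks is a name made up for this example:

```shell
# Hypothetical helper: list hook scripts that are missing the owner execute bit.
# Prints one path per line; prints nothing if everything is fine.
find_non_executable_hooks() {
  hooks_dir="${1:-.platform/hooks}"
  # No hooks directory means there is nothing to check.
  [ -d "$hooks_dir" ] || return 0
  # -perm -100 matches files with the owner execute bit set; negate it.
  find "$hooks_dir" -type f ! -perm -100
}

# Usage before `eb deploy` (fix findings with: chmod +x <path>):
#   find_non_executable_hooks
```

Any path it prints can be fixed with chmod +x and committed, so the permission survives a fresh checkout.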

Serving static files via nginx

It makes sense to serve static files directly via nginx. This avoids unnecessary load on the node.js server and nginx is generally much faster in serving static content.
Place the following file in .platform/nginx/conf.d/elasticbeanstalk/static.conf:

root /var/app/current/public;

location @backend {
  proxy_pass http://127.0.0.1:8080;
}

location /images/ {
  try_files $uri @backend;

  # perf optimisations
  sendfile           on;
  sendfile_max_chunk 1m;
  tcp_nopush         on;
  tcp_nodelay        on;
}
# add more folders as you need them, using a similar location directive

Additionally you could add caching for the /_next/static path - feel free to try it out. I didn't do it yet to avoid too many changes at once.
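Since every additional folder needs its own near-identical location block, that part of the file can also be generated. A rough sketch, assuming the folders sit directly under public/; gen_static_locations is a hypothetical helper, the perf directives are left out for brevity, and the root and @backend lines from the file above still have to be kept:

```shell
# Hypothetical helper: print one nginx location block per folder name,
# each falling back to the node.js backend when the file is not on disk.
gen_static_locations() {
  for dir in "$@"; do
    # Single quotes keep $uri literal for nginx instead of expanding it here.
    printf 'location /%s/ {\n  try_files $uri @backend;\n}\n' "$dir"
  done
}

# Example: gen_static_locations images fonts
```

The output can be appended to static.conf by hand or from a build step, whichever fits the project better.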

GZIP compression

Enabling gzip Content-Encoding at the nginx level requires you to override the default nginx.conf. Find the default at /etc/nginx/nginx.conf on the instance, copy its contents to .platform/nginx/nginx.conf and replace gzip off; with gzip on;.
Here's the current (June 2020) example:

#Elastic Beanstalk Nginx Configuration File

user                    nginx;
error_log               /var/log/nginx/error.log warn;
pid                     /var/run/nginx.pid;
worker_processes        auto;
worker_rlimit_nofile    32153;

events {
    worker_connections  1024;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    include       conf.d/*.conf;

    map $http_upgrade $connection_upgrade {
        default     "upgrade";
    }

    server {
        listen        80 default_server;
        access_log    /var/log/nginx/access.log main;

        client_header_timeout 60;
        client_body_timeout   60;
        keepalive_timeout     60;
        gzip                  on; # CHANGED(mw): enable gzip compression
        gzip_comp_level       4;
        gzip_types text/plain text/css application/json application/javascript application/x-javascript text/xml application/xml application/xml+rss text/javascript;

        # Include the Elastic Beanstalk generated locations
        include conf.d/elasticbeanstalk/*.conf;
    }
}

Finally, disable gzip compression in next.js to avoid double compressing and reduce load on the node.js server.
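Next.js exposes this as the compress option in next.config.js. A small sketch that writes a minimal config, assuming the project does not have a next.config.js yet (if it does, add the option by hand instead); write_next_config is a hypothetical helper name:

```shell
# Hypothetical helper: create a next.config.js that turns off the built-in
# gzip compression of Next.js. Refuses to overwrite an existing config.
write_next_config() {
  target="${1:-.}/next.config.js"
  if [ -e "$target" ]; then
    echo "$target already exists, add 'compress: false' manually" >&2
    return 1
  fi
  cat > "$target" <<'EOF'
module.exports = {
  // nginx already compresses responses, so skip it in node.js
  compress: false,
}
EOF
}

# Example (run in the project root): write_next_config .
```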

Deployment

Run, in the following order:

$ npm run build
$ eb deploy
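Since the app is built locally and the build output is shipped with the archive, a failed build should stop the deployment. A minimal sketch of a wrapper that enforces this; build_and_deploy is a name made up for this example, and it assumes npm and the EB CLI are on the PATH:

```shell
# Hypothetical wrapper: build first, deploy only if the build succeeded.
build_and_deploy() {
  npm run build || {
    echo "build failed, skipping deploy" >&2
    return 1
  }
  eb deploy
}

# Example: build_and_deploy
```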

Logging/Debugging

Here's a bunch of important files/directories. You might need sudo to see/read those paths.

Path	Contents
`/etc/nginx/`	Nginx configuration
`/var/app/current`	Deployed application files
`/var/app/staging`	Application files staged for a deployment (only exists during a deployment)
`/opt/elasticbeanstalk`	Binaries, configs, ... from AWS EB itself
`/var/proxy/staging`	Nginx config staged for a deployment
`/var/log/eb-engine.log`	Deployment log
`/var/log/web-stdout.log`	App stdout log
`/var/log/nginx`	Nginx logs
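When a deployment fails, the last lines of these logs usually give the quickest overview. A small sketch to run on the instance (for example after eb ssh); show_log_tail is a hypothetical helper and the line count of 50 in the example is arbitrary:

```shell
# Hypothetical helper: print the last N lines of each given log file,
# flagging files that do not exist or are not readable.
show_log_tail() {
  lines="$1"
  shift
  for log in "$@"; do
    if [ -r "$log" ]; then
      echo "==> $log <=="
      tail -n "$lines" "$log"
    else
      echo "==> $log (not readable, try sudo) <=="
    fi
  done
}

# Example:
#   show_log_tail 50 /var/log/eb-engine.log /var/log/web-stdout.log /var/log/nginx/error.log
```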

Other settings

Make sure to configure your AWS EB setup in the web console as well. Set up rolling deployments and configure monitoring/alarms.

Add NVIDIA GPU support to k3s with containerd

Michael Weibel — Fri, 13 Mar 2020 09:21:03 +0000

After some failed attempts at adding GPU support to k3s, this article describes how to boot up a worker node with NVIDIA GPU support.
k3s, for those who are new to it, is a very small Kubernetes distribution.

There are a few reasons why adding GPU support is not that easy. The main reason is that k3s uses containerd as its container runtime. Most tutorials, and also the official NVIDIA k8s device plugin, assume docker as the container runtime. While you can easily switch to docker in k3s, we didn't want to change the runtime itself.
Kubernetes itself has a guide for adding GPU support which outlines the basic steps.

The following recipe has been tested on GCP instances n2-standard-1 with an NVIDIA Tesla T4 GPU attached.
It assumes a running master node. Each worker with an attached GPU needs a few additional steps which are outlined below.

Create device plugin DaemonSet

The device plugin is responsible for advertising the nvidia.com/gpu resource on a node (via kubelet).
This needs to be done only once per Kubernetes cluster. Every node with the label cloud.google.com/gke-accelerator then automatically gets a pod from this DaemonSet.

$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/release-1.14/cluster/addons/device-plugins/nvidia-gpu/daemonset.yaml

The following steps are necessary on each node that needs GPU support. Placing them in a startup script is a good option.

Install drivers

# required kernel module
modprobe ipmi_devintf

# add necessary repositories
add-apt-repository -y ppa:graphics-drivers
curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
  apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
  tee /etc/apt/sources.list.d/nvidia-container-runtime.list
apt-get update

# install graphics driver
apt-get install -y nvidia-driver-440 nvidia-container-runtime nvidia-modprobe

Ensure nvidia driver is loaded and device files ready

From: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-verifications

/sbin/modprobe nvidia

if [ "$?" -eq 0 ]; then
  # Count the number of NVIDIA controllers found.
  NVDEVS=`lspci | grep -i NVIDIA`
  N3D=`echo "$NVDEVS" | grep "3D controller" | wc -l`
  NVGA=`echo "$NVDEVS" | grep "VGA compatible controller" | wc -l`

  N=`expr $N3D + $NVGA - 1`
  for i in `seq 0 $N`; do
    mknod -m 666 /dev/nvidia$i c 195 $i
  done

  mknod -m 666 /dev/nvidiactl c 195 255

else
  exit 1
fi

/sbin/modprobe nvidia-uvm

if [ "$?" -eq 0 ]; then
  # Find out the major device number used by the nvidia-uvm driver
  D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`

  mknod -m 666 /dev/nvidia-uvm c $D 0
else
  exit 1
fi
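To see what the first half of that script would do without creating anything, the counting and the mknod calls can be separated. A sketch under the same assumptions as the NVIDIA snippet (character device major number 195, one /dev/nvidiaN per 3D or VGA controller); both function names are made up for this example:

```shell
# Count NVIDIA 3D/VGA controllers in lspci-style output read from stdin.
count_nvidia_controllers() {
  grep -i nvidia | grep -c -e "3D controller" -e "VGA compatible controller"
}

# Print (not run) the mknod commands for the given number of controllers.
plan_device_nodes() {
  count="$1"
  i=0
  while [ "$i" -lt "$count" ]; do
    echo "mknod -m 666 /dev/nvidia$i c 195 $i"
    i=$((i + 1))
  done
  # The control device is created regardless of the controller count.
  echo "mknod -m 666 /dev/nvidiactl c 195 255"
}

# Example: plan_device_nodes "$(lspci | count_nvidia_controllers)"
```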

Install k3s

curl -sfL https://get.k3s.io | \
    INSTALL_K3S_SKIP_START=true \
    K3S_URL=https://IP_OF_MASTER_ADDRESS:6443 \
    K3S_TOKEN=CONTENT_OF_/var/lib/rancher/k3s/server/node-token_ON_MASTER \
    sh -s - \
    --node-label "cloud.google.com/gke-accelerator=$(curl -fs "http://metadata.google.internal/computeMetadata/v1/instance/attributes/gpu-platform" -H "Metadata-Flavor: Google")"

INSTALL_K3S_SKIP_START prevents k3s from starting, as we first need to change the containerd config (see below).
The node-label key has to be cloud.google.com/gke-accelerator; the value only matters if you want to schedule pods based on the GPU available. The example here uses a metadata attribute on the instance within GCE. Feel free to change that to something else or just true.
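If the gpu-platform metadata attribute is not set on an instance, the inner curl call in the command above fails and the label ends up with an empty value. A small sketch of a fallback to true; gpu_label_value is a hypothetical helper and the 2-second timeout is an arbitrary choice:

```shell
# Hypothetical helper: read the custom gpu-platform attribute from the GCE
# metadata server, falling back to "true" if it is missing or unreachable.
gpu_label_value() {
  value=$(curl -fs --max-time 2 \
    "http://metadata.google.internal/computeMetadata/v1/instance/attributes/gpu-platform" \
    -H "Metadata-Flavor: Google") || value=""
  echo "${value:-true}"
}

# Example: --node-label "cloud.google.com/gke-accelerator=$(gpu_label_value)"
```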

Configure containerd

Containerd needs to be configured to use a different container runtime. k3s regenerates its containerd config.toml on start, so instead of editing that file directly, create a config.toml.tmpl file next to it.


mkdir -p /var/lib/rancher/k3s/agent/etc/containerd/

# why "EOF":
# https://serverfault.com/questions/399428/how-do-you-escape-characters-in-heredoc
# ($ signs would need to be escaped -> use "EOF" instead of EOF)
cat <<"EOF" > /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl
[plugins.opt]
  path = "{{ .NodeConfig.Containerd.Opt }}"

[plugins.cri]
  stream_server_address = "127.0.0.1"
  stream_server_port = "10010"

{{- if .IsRunningInUserNS }}
  disable_cgroup = true
  disable_apparmor = true
  restrict_oom_score_adj = true
{{end}}

{{- if .NodeConfig.AgentConfig.PauseImage }}
  sandbox_image = "{{ .NodeConfig.AgentConfig.PauseImage }}"
{{end}}

{{- if not .NodeConfig.NoFlannel }}
[plugins.cri.cni]
  bin_dir = "{{ .NodeConfig.AgentConfig.CNIBinDir }}"
  conf_dir = "{{ .NodeConfig.AgentConfig.CNIConfDir }}"
{{end}}

[plugins.cri.containerd.runtimes.runc]
  # ---- changed from 'io.containerd.runc.v2' for GPU support
  runtime_type = "io.containerd.runtime.v1.linux"

# ---- added for GPU support
[plugins.linux]
  runtime = "nvidia-container-runtime"

{{ if .PrivateRegistryConfig }}
{{ if .PrivateRegistryConfig.Mirrors }}
[plugins.cri.registry.mirrors]{{end}}
{{range $k, $v := .PrivateRegistryConfig.Mirrors }}
[plugins.cri.registry.mirrors."{{$k}}"]
  endpoint = [{{range $i, $j := $v.Endpoints}}{{if $i}}, {{end}}{{printf "%q" .}}{{end}}]
{{end}}

{{range $k, $v := .PrivateRegistryConfig.Configs }}
{{ if $v.Auth }}
[plugins.cri.registry.configs."{{$k}}".auth]
  {{ if $v.Auth.Username }}username = "{{ $v.Auth.Username }}"{{end}}
  {{ if $v.Auth.Password }}password = "{{ $v.Auth.Password }}"{{end}}
  {{ if $v.Auth.Auth }}auth = "{{ $v.Auth.Auth }}"{{end}}
  {{ if $v.Auth.IdentityToken }}identitytoken = "{{ $v.Auth.IdentityToken }}"{{end}}
{{end}}
{{ if $v.TLS }}
[plugins.cri.registry.configs."{{$k}}".tls]
  {{ if $v.TLS.CAFile }}ca_file = "{{ $v.TLS.CAFile }}"{{end}}
  {{ if $v.TLS.CertFile }}cert_file = "{{ $v.TLS.CertFile }}"{{end}}
  {{ if $v.TLS.KeyFile }}key_file = "{{ $v.TLS.KeyFile }}"{{end}}
{{end}}
{{end}}
{{end}}
EOF
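Once the agent has been started (next step), k3s renders this template to config.toml in the same directory, which makes it easy to check that the change was picked up. A minimal sketch; uses_nvidia_runtime is a hypothetical helper, and the rendered path is assumed to be the template path without the .tmpl suffix:

```shell
# Hypothetical helper: succeed if the given containerd config selects
# nvidia-container-runtime as its runtime.
uses_nvidia_runtime() {
  grep -Eq '^[[:space:]]*runtime[[:space:]]*=[[:space:]]*"nvidia-container-runtime"' "$1"
}

# Example:
#   uses_nvidia_runtime /var/lib/rancher/k3s/agent/etc/containerd/config.toml \
#     && echo "nvidia runtime configured"
```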

Start k3s agent

Start k3s: systemctl start k3s-agent
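Once the agent has joined and the device plugin pod is running on the node, the node should list nvidia.com/gpu among its allocatable resources. A sketch of a check to run from a machine with kubectl access to the cluster; gpu_allocatable is a hypothetical helper and NODE_NAME a placeholder:

```shell
# Hypothetical helper: print the number of allocatable NVIDIA GPUs on a node
# (empty output means the resource is not advertised yet).
gpu_allocatable() {
  # The dots in the resource name must be escaped in kubectl's jsonpath.
  kubectl get node "$1" -o jsonpath='{.status.allocatable.nvidia\.com/gpu}'
}

# Example: gpu_allocatable NODE_NAME
```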

That's it! :)