Skip to main content
Version: v4 (current)

Ubuntu Machine Setup

Steps for manual configuration and provisioning of Ubuntu 22.04 server systems. This guide assumes and recommends that the user is starting from a fresh installation. If you unfamiliar with the installation process for Ubuntu, see the link below before progressing.

Base Packages

  • Log in to your host as the root user.

  • Apply system updates and upgrades

    You need update the package list and upgrade system components prior to installing other software. Often this process results in updates that will require a system reboot.

    apt-get update && \
    apt-get upgrade -y

    reboot
  • Install base utilities.

    The following is a curated set of base packages that are dependencies for steps later in the guide and helpful in general.

    apt-get update && \
    apt-get install -y wireguard \
    ssh-import-id \
    sudo \
    curl \
    tmux \
    netplan.io \
    apt-transport-https \
    ca-certificates \
    software-properties-common \
    htop \
    git-extras \
    rsyslog \
    fail2ban \
    vim \
    gpg \
    open-iscsi \
    nfs-common \
    ncdu \
    zip \
    unzip \
    iotop && \
    sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/bin/yq && \
    sudo chmod +x /usr/bin/yq && \
    sudo systemctl enable fail2ban && \
    sudo systemctl start fail2ban

Admin User Creation

  • Create the user

    export NEW_USER="<some new user here>"
    useradd -s /bin/bash -d /home/$NEW_USER/ -m -G sudo $NEW_USER
  • Grant passwordless sudo permission

    echo "$NEW_USER ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
  • Import an ssh key

    If you have a GitHub, GitLab, or Launchpad account, you can use ssh-import-id to install you ssh public-key into your host.

    If you do not have one of the above, you can always upload your key manually as described in SSH Copy ID for Copying SSH Keys to Servers from ssh.com.

    Example usage for ssh-import-id:

    # GitHub
    sudo -u $NEW_USER ssh-import-id-gh <github username>

    # GitLab
    URL="https://gitlab.exampledomain.com/%s.keys" sudo -u $NEW_USER ssh-import-id <gitlab username>
  • Add the user to relevant groups

    usermod -a -G kvm $NEW_USER
  • Create a password for the user

    passwd $NEW_USER

Install Docker

Setting up docker can get tricky because the version available from apt is usually out of date.

  • Download the docker gpg key

    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
    sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
  • Add the apt package source

    echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
  • Update the apt package list and install Docker

    sudo apt-get update && sudo apt-get install -y docker-ce
  • Add user to the docker group

    usermod -a -G docker $NEW_USER

Install Docker Compose (Optional)

As with Docker, the default apt packages provide a very outdated version of docker compose. Below are the steps required to install a current version.

  • Find the latest version by visiting https://github.com/docker/compose/releases

    export COMPOSE_VERSION="2.17.3"
  • Create a directory for the binary

    mkdir -p ~/.docker/cli-plugins/
  • Download the binary

    curl -SL https://github.com/docker/compose/releases/download/v$COMPOSE_VERSION/docker-compose-linux-x86_64 -o ~/.docker/cli-plugins/docker-compose
  • Make it executable

    chmod +x ~/.docker/cli-plugins/docker-compose

Install NVIDIA GPU Drivers

Ubuntu's built-in driver installation tool is really unreliable. Instructions for its use are included, but readers are advised to get their driver installer directly from Nvidia.

Download and Install from Nvidia

  • Download and install driver dependencies

    sudo apt-get install -y ubuntu-drivers-common \
    linux-headers-generic \
    gcc \
    kmod \
    make \
    pkg-config \
    libvulkan1
  • Find your driver version and download the installer

    You can find both gaming and data-center drivers using the Nvidia web tool, however cloud and vps users will need to refer to their provider's documentation for installation instructions as many providers will require you to use GRID drivers instead.

    Alternatively, you can download a specific driver version using curl

    export DRIVER_VERSION=""

    # GeForce Cards
    curl --progress-bar -fL -O "https://us.download.nvidia.com/XFree86/Linux-x86_64/$DRIVER_VERSION/NVIDIA-Linux-x86_64-$DRIVER_VERSION.run"

    # Datacenter Cards
    curl --progress-bar -fL -O "https://us.download.nvidia.com/tesla/$DRIVER_VERSION/NVIDIA-Linux-x86_64-$DRIVER_VERSION.run"
  • Run the Installer

    It is required to run the driver from the system console, it cannot be installed from an X-session. If you used a desktop image instead of a server image you will need to disable your session manager.

    sudo bash "NVIDIA-Linux-x86_64-*.run"

Download and Install from apt

Automatic install is currently broken for multiple card-types, see #1993019.

  • Automatic install (Broken)

    sudo ubuntu-drivers autoinstall
  • Install specific driver version

    sudo ubuntu-drivers install nvidia:525

Install the Nvidia Container Toolkit (Optional)

The Nvidia Container Toolkit allows containers to access the GPU resources of the underlying host machine. Requires that the GPU drivers are already installed on the host. See the official docs here: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html

  • Get your system distribution name

    distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
  • Download the gpg key and add the repo to your apt sources

    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
    && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
  • Update apt packages and install the container toolkit

    sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
  • Set nvidia the default container runtime

    sudo nvidia-ctk runtime configure --runtime=docker --set-as-default
  • Restart the docker service

    sudo systemctl restart docker
  • Test that it is working

    Run one of Nvidia's sample workloads to verify functionality.

Install the Prometheus metrics exporter (Optional)

  • Log in as, or assume the root user.

  • Download the metrics exporter application

    wget -O /opt/node_exporter-1.6.1.linux-amd64.tar.gz https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz && \
  • Extract the archive and copy into place

    tar -xvf /opt/node_exporter-1.6.1.linux-amd64.tar.gz -C /opt && \
    rm /opt/node_exporter-1.6.1.linux-amd64.tar.gz && \
    ln -s node_exporter-1.6.1.linux-amd64 /opt/node_exporter
  • Download and create the system service

    wget https://raw.githubusercontent.com/small-hack/smol-metal/main/node-exporter.service && \
    sudo mv node-exporter.service /etc/systemd/system/node-exporter.service && \
    systemctl daemon-reload && \
    systemctl enable node-exporter && \
    systemctl restart node-exporter