Run JMH Benchmarks on Hetzner

Provision a dedicated Hetzner cloud server, deploy the current working tree, run JMH benchmarks from any module, download results, and tear down the server.

Prerequisites

hcloud CLI installed and authenticated (hcloud version confirms the install; hcloud server list confirms the active context can reach the API)
SSH key pair at ~/.ssh/id_ed25519 (or ~/.ssh/id_rsa)
The benchmark module compiles locally

Workflow

Step 0: Determine benchmark module and parameters

Ask the user (or infer from context) which benchmark module to run. The project may contain multiple JMH benchmark modules. Common examples:

jmh-ldbc — LDBC SNB read query benchmarks (default if user says "run benchmarks")
Other modules with JMH dependencies — check for jmh-core dependency in pom.xml

Determine:

Module name (-pl <module>)
JMH regex filter (which benchmarks to include/exclude)
JMH parameters (forks, warmup, measurement iterations)

Defaults (good for comparison runs):

-f 1 -wi 3 -w 5s -i 5 -r 10s

For jmh-ldbc specifically:

Expected runtime: ~90 minutes for 40 benchmarks (20 queries x 2 suites) with -f 1 -wi 3 -w 5s -i 5 -r 10s
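As a rough sanity check on that figure, the pure iteration time can be computed from the JMH flags. The helper below is a hypothetical sketch, not part of the project: it multiplies out warmup and measurement time only, so it gives a lower bound. Per-fork JVM startup and @Setup work are not included, which presumably accounts for much of the gap up to ~90 minutes.

```shell
# Lower bound on wall-clock minutes for a JMH run (iteration time only).
# Usage: jmh_min_minutes <benchmarks> <forks> <wi> <w_seconds> <i> <r_seconds>
jmh_min_minutes() {
  # seconds spent per fork: warmup iterations + measurement iterations
  per_fork=$(( $3 * $4 + $5 * $6 ))
  # total seconds across all benchmarks and forks, in whole minutes
  echo $(( $1 * $2 * per_fork / 60 ))
}

# Defaults from this guide: 40 benchmarks, -f 1 -wi 3 -w 5s -i 5 -r 10s
jmh_min_minutes 40 1 3 5 5 10   # prints 43
```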

Step 1: Provision the server

Naming convention: Use jmh-bench-<branch> for the server and jmh-bench-key-<branch> for the SSH key, where <branch> is the current git branch name (sanitized: lowercase, slashes replaced with dashes, truncated to keep total name under 63 chars). This avoids conflicts when multiple benchmark runs execute concurrently on different branches.
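The sanitization rule can also be written as a small function. This is a sketch under one assumption: that Hetzner server names must be valid hostnames, so every character outside a-z, 0-9, and dash (for example an underscore) is replaced, and a trailing dash left by truncation is stripped. The name sanitize_branch is hypothetical.

```shell
# Turn a git branch name into a hostname-safe fragment of at most 40 chars.
sanitize_branch() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -c 'a-z0-9' '-' \
    | cut -c1-40 \
    | sed 's/-*$//'
}

# Usage:
#   BRANCH=$(sanitize_branch "$(git rev-parse --abbrev-ref HEAD)")
```

The commands below use a shorter one-liner that covers uppercase letters and slashes only, which is enough for most branch names.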

# Determine branch-based names
BRANCH=$(git rev-parse --abbrev-ref HEAD | tr '[:upper:]/' '[:lower:]-' | cut -c1-40)
SERVER_NAME="jmh-bench-${BRANCH}"
KEY_NAME="jmh-bench-key-${BRANCH}"

# Upload local SSH public key (use ~/.ssh/id_rsa.pub instead if that is the key pair you have)
hcloud ssh-key create --name "$KEY_NAME" --public-key-from-file ~/.ssh/id_ed25519.pub

# Create CCX33: 8 dedicated AMD vCPUs, 32 GB RAM, Falkenstein DC
hcloud server create --name "$SERVER_NAME" --type ccx33 --image ubuntu-24.04 --location fsn1 --ssh-key "$KEY_NAME"

Record the IPv4 address from the output. Wait ~15 seconds for the server to boot before attempting SSH.
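Instead of a fixed 15-second sleep, the wait can be done by polling until SSH answers. The helper below is a minimal sketch; wait_until is a hypothetical name, and the attempt count and delay are arbitrary choices. hcloud server ip prints the server's IPv4 address, which avoids copying it from the create output.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until <max_attempts> <delay_seconds> <command...>
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage (assumes SERVER_NAME is set as in the commands above):
#   IP=$(hcloud server ip "$SERVER_NAME")
#   wait_until 20 3 ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no root@"$IP" true
```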

If SSH fails with a host key conflict, remove the stale key:

ssh-keygen -f ~/.ssh/known_hosts -R <IP>

Step 2: Install JDK 21

ssh -o StrictHostKeyChecking=no root@<IP> \
  'apt-get update -qq && apt-get install -y -qq openjdk-21-jdk-headless git tmux > /dev/null 2>&1 && java -version'

Step 3: Deploy the project

Rsync the worktree root (the directory containing mvnw, pom.xml, core/, etc.), excluding .git, target, and .idea:

rsync -az --exclude='.git' --exclude='target' --exclude='.idea' <worktree-root>/ root@<IP>:/root/ytdb/

Important: The working directory (e.g. /workspace/ytdb/ldbc-jmh) may be a git worktree — it contains the full project tree with mvnw at its root. Rsync this directory, NOT the parent /workspace/ytdb/.
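One way to avoid picking the wrong directory is to ask git for the root of the current worktree. The sketch below wraps git rev-parse --show-toplevel, which reports the top-level directory of the worktree containing the given path; worktree_root is a hypothetical helper name.

```shell
# Print the root directory of the git worktree that contains the given path
# (defaults to the current directory).
worktree_root() {
  git -C "${1:-.}" rev-parse --show-toplevel
}

# Usage:
#   rsync -az --exclude='.git' --exclude='target' --exclude='.idea' \
#     "$(worktree_root)/" root@<IP>:/root/ytdb/
```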

Then initialize a git repo on the server (required by Spotless):

ssh root@<IP> 'git config --global --add safe.directory /root/ytdb && \
  git config --global user.email "bench@test" && \
  git config --global user.name "bench" && \
  cd /root/ytdb && git init && git add -A && git commit -m "baseline" --quiet'

Step 3b: Download dataset from Hetzner S3 (jmh-ldbc only — MANDATORY)

The LDBC dataset must be pre-downloaded before running benchmarks. The benchmark no longer auto-downloads from SURF (the SURF format is incompatible). Download it from Hetzner Object Storage (S3):

ssh root@<IP> 'apt-get install -y -qq python3-pip zstd > /dev/null 2>&1 && \
  pip install --break-system-packages boto3 -q && \
  mkdir -p /root/ytdb/<module>/target/ldbc-dataset/sf0.1 && \
  python3 -c "
import boto3, os
s3 = boto3.client(\"s3\",
    endpoint_url=os.environ[\"S3_ENDPOINT\"],
    aws_access_key_id=os.environ[\"S3_ACCESS_KEY\"],
    aws_secret_access_key=os.environ[\"S3_SECRET_KEY\"])
print(\"Downloading dataset from S3...\")
s3.download_file(\"bench-cache\", \"ldbc/ldbc-sf0.1-composite-merged-fk.tar.zst\", \"/tmp/dataset.tar.zst\")
print(\"Downloaded\")
" && \
  cd /root/ytdb/<module>/target/ldbc-dataset/sf0.1 && \
  zstd -d /tmp/dataset.tar.zst -o /tmp/dataset.tar && \
  tar xf /tmp/dataset.tar && \
  rm -f /tmp/dataset.tar.zst /tmp/dataset.tar && \
  echo "Dataset ready" && ls static/ dynamic/'

Important: The command above requires S3 credentials as environment variables on the remote server. Pass them via SSH:

ssh root@<IP> "export S3_ENDPOINT='<endpoint>' S3_ACCESS_KEY='<key>' S3_SECRET_KEY='<secret>' && ..."

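Credentials are stored as GitHub secrets: HETZNER_S3_ACCESS_KEY, HETZNER_S3_SECRET_KEY, HETZNER_S3_ENDPOINT. GitHub does not reveal secret values once they are saved, so they are readable only inside a workflow run; outside of one, ask the user for them.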
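Before starting the download it can help to fail fast when a credential is missing, rather than hitting a Python KeyError on the server. The check below is a hypothetical sketch (require_env is not part of the project); it reports every unset or empty variable by name and never prints the values.

```shell
# Return non-zero and list the names of any unset or empty variables.
# Usage: require_env VAR1 VAR2 ...
require_env() {
  missing=""
  for name in "$@"; do
    # indirect lookup of the variable called $name, empty if unset
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing" >&2
    return 1
  fi
}

# Usage:
#   require_env S3_ENDPOINT S3_ACCESS_KEY S3_SECRET_KEY || exit 1
```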

Replace <module> with the benchmark module (e.g. jmh-ldbc).

The dataset uses LDBC datagen v1.0.0 CsvCompositeMergeForeign format (~19 MB). It is stored in Hetzner Object Storage bucket bench-cache at key ldbc/ldbc-sf0.1-composite-merged-fk.tar.zst.

If S3 credentials are unavailable, generate the dataset locally using the LDBC datagen Docker image, then rsync it to the server:

# On the local machine
docker run --rm \
    -v "$(pwd)/jmh-ldbc/target/ldbc-dataset/sf0.1:/out" \
    ldbc/datagen:latest \
    --scale-factor 0.1 --mode raw --format CsvCompositeMergeForeign

# Then rsync the dataset to the server
rsync -az jmh-ldbc/target/ldbc-dataset/ root@<IP>:/root/ytdb/jmh-ldbc/target/ldbc-dataset/

Do not use the SURF repository at repository.surfsara.nl — it provides CsvComposite format (v0.3.5), which is incompatible with the benchmark loaders.

Step 4: Compile

ssh root@<IP> 'cd /root/ytdb && chmod +x mvnw && \
  ./mvnw -pl <module> -am compile -DskipTests -Dspotless.check.skip=true -q'

Replace <module> with the target benchmark module (e.g. jmh-ldbc).

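The build typically takes ~60-90 seconds on CCX33. With -q, Maven prints only errors, so a successful build shows up as silent completion with exit status 0 rather than a BUILD SUCCESS line.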
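Because the compile command runs with -q, its exit status is the most reliable success signal, and ssh passes the remote command's exit status through. A minimal wrapper that makes the outcome explicit might look like this; run_step is a hypothetical helper.

```shell
# Run a command and report success or failure based on its exit status.
# Usage: run_step <label> <command...>
run_step() {
  label=$1
  shift
  if "$@"; then
    echo "$label: OK"
  else
    status=$?
    echo "$label: FAILED (exit $status)" >&2
    return "$status"
  fi
}

# Usage:
#   run_step compile ssh root@<IP> 'cd /root/ytdb && ./mvnw -pl <module> -am compile -DskipTests -Dspotless.check.skip=true -q'
```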

Step 4b: Pre-load LDBC dataset (jmh-ldbc only)

Critical for jmh-ldbc: The LDBC dataset is downloaded and loaded into the database inside JMH's @Setup(Level.Trial) method. This means the first fork's warmup iteration includes dataset download + DB creation time. For multi-threaded benchmarks, threads start executing queries on a partially-loaded database, producing wildly inaccurate results (e.g., 300+ ops/s when the real throughput is ~3 ops/s).

Always pre-load the dataset before running actual benchmarks:

ssh root@<IP> 'cd /root/ytdb && ./mvnw -pl <module> -am verify -P bench -DskipTests -Dspotless.check.skip=true \
  -Djmh.args="ic5_newGroups -f 0 -wi 0 -i 1 -r 1s -t 1" 2>&1 | tail -20'

This runs a single in-process iteration (-f 0) that triggers dataset download and DB creation. Subsequent forked runs will find the existing DB at ./target/ldbc-bench-db and skip loading.

If the dataset was pre-downloaded via Step 3b: The pre-load step is still required — it creates the YouTrackDB database from the CSV files. However, the download phase will be skipped automatically because the dataset files already exist in target/ldbc-dataset/.
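To see which of these cases applies before starting a run, the directories named in this guide can be inspected. The function below is a sketch with a hypothetical name; it only checks that the directories exist, which does not prove that a load finished cleanly.

```shell
# Report how far the pre-load has progressed for a module directory.
# Prints one of: db-ready, dataset-only, empty
preload_state() {
  module_dir=$1
  if [ -d "$module_dir/target/ldbc-bench-db" ]; then
    echo "db-ready"
  elif [ -d "$module_dir/target/ldbc-dataset/sf0.1/static" ] &&
       [ -d "$module_dir/target/ldbc-dataset/sf0.1/dynamic" ]; then
    echo "dataset-only"
  else
    echo "empty"
  fi
}

# Usage (on the server):
#   preload_state /root/ytdb/jmh-ldbc
```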

When comparing two code versions (A/B testing): After running version A, delete the benchmark database before running version B to avoid stale cached data:

ssh root@<IP> 'rm -rf /root/ytdb/jmh-ldbc/target/ldbc-bench-db'

The dataset files (target/ldbc-dataset/) can be kept — only the DB needs to be recreated.

Step 5: Run benchmarks

IMPORTANT: Never run multiple benchmarks concurrently on the same server. Always wait for one benchmark run to complete before starting the next.

Start the benchmark in a tmux session so it survives SSH disconnects.
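Since a detached tmux session closes when its command exits, the presence of a session named bench is a reasonable proxy for "a run is still in progress". The guard below is a sketch built on tmux has-session; bench_busy is a hypothetical name, and any arguments are treated as a command prefix so the same check works locally or over SSH.

```shell
# Succeed if a tmux session named "bench" exists.
# Arguments, if any, are used as a command prefix, e.g.:
#   bench_busy ssh root@<IP>
bench_busy() {
  "$@" tmux has-session -t bench 2>/dev/null
}

# Usage:
#   if bench_busy ssh root@<IP>; then
#     echo "a benchmark is still running; wait for it to finish" >&2
#   fi
```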

If the module has a bench Maven profile (like jmh-ldbc):

ssh root@<IP> 'tmux new-session -d -s bench \
  "cd /root/ytdb && ./mvnw -pl <module> -am verify -P bench -DskipTests -Dspotless.check.skip=true \
  -Djmh.args=\"<jmh-args> -rf json -rff /root/resul
