Run JMH Benchmarks on Hetzner
Provision a dedicated Hetzner cloud server, deploy the current working tree, run JMH benchmarks from any module, download results, and tear down the server.
Prerequisites
- hcloud CLI installed and authenticated (hcloud version to verify)
- SSH key pair at ~/.ssh/id_ed25519 (or ~/.ssh/id_rsa)
- The benchmark module compiles locally
Workflow
Step 0: Determine benchmark module and parameters
Ask the user (or infer from context) which benchmark module to run. The project may contain multiple JMH benchmark modules. Common examples:
- jmh-ldbc: LDBC SNB read query benchmarks (default if user says "run benchmarks")
- Other modules with JMH dependencies: check for a jmh-core dependency in pom.xml
Determine:
- Module name (-pl <module>)
- JMH regex filter (which benchmarks to include/exclude)
- JMH parameters (forks, warmup, measurement iterations)
Defaults (good for comparison runs):
-f 1 -wi 3 -w 5s -i 5 -r 10s
For jmh-ldbc specifically:
- Expected runtime: ~90 minutes for 40 benchmarks (20 queries x 2 suites) with -f 1 -wi 3 -w 5s -i 5 -r 10s
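As a rough cross-check of these defaults, the pure measurement time can be estimated from the JMH flags alone. The sketch below is only a lower bound: it multiplies benchmarks x forks x (warmup + measurement time) and ignores JVM startup, @Setup work, and result writing. The function name jmh_min_seconds is hypothetical and not part of the project.

```shell
# Lower-bound estimate of JMH measurement time, in seconds.
# Arguments: benchmarks forks warmup_iters warmup_secs measure_iters measure_secs
# Ignores JVM startup, @Setup/@TearDown work, and result writing.
jmh_min_seconds() {
  benchmarks=$1; forks=$2; wi=$3; w=$4; i=$5; r=$6
  echo $(( benchmarks * forks * (wi * w + i * r) ))
}

# 40 benchmarks with -f 1 -wi 3 -w 5s -i 5 -r 10s
jmh_min_seconds 40 1 3 5 5 10   # prints 2600 (about 43 minutes)
```

The ~90 minute figure quoted above is roughly twice this lower bound. This document does not break down the difference, so treat the estimate as a floor when planning how long the server has to stay up.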
Step 1: Provision the server
Naming convention: Use jmh-bench-<branch> for the server and jmh-bench-key-<branch> for the SSH key, where <branch> is the current git branch name (sanitized: lowercase, slashes replaced with dashes, truncated to keep total name under 63 chars). This avoids conflicts when multiple benchmark runs execute concurrently on different branches.
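The sanitization rule can also be sketched as a reusable function. This version is stricter than the rule as stated: it assumes Hetzner server names must be valid hostnames (letters, digits, and hyphens only), so it also replaces underscores and other characters, collapses repeated dashes, and trims dashes at either end. The function name sanitize_branch is hypothetical.

```shell
# Sketch: turn a git branch name into a hostname-safe suffix.
# Assumption: only [a-z0-9-] is allowed in the resulting name.
sanitize_branch() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9-]/-/g' -e 's/--*/-/g' -e 's/^-//' \
    | cut -c1-40 \
    | sed -e 's/-$//'
}

# Example: derive the names from an arbitrary branch string.
BRANCH_SAFE=$(sanitize_branch "Feature/Foo_Bar")
echo "jmh-bench-${BRANCH_SAFE}"       # prints jmh-bench-feature-foo-bar
echo "jmh-bench-key-${BRANCH_SAFE}"   # prints jmh-bench-key-feature-foo-bar
```

With the 40-character cap, the longest prefix (jmh-bench-key-, 14 characters) still leaves the full name at 54 characters or fewer, under the 63-character limit.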
# Determine branch-based names
BRANCH=$(git rev-parse --abbrev-ref HEAD | tr '[:upper:]/' '[:lower:]-' | cut -c1-40)
SERVER_NAME="jmh-bench-${BRANCH}"
KEY_NAME="jmh-bench-key-${BRANCH}"
# Upload local SSH public key
hcloud ssh-key create --name "$KEY_NAME" --public-key-from-file ~/.ssh/id_ed25519.pub
# Create CCX33: 8 dedicated AMD vCPUs, 32 GB RAM, Falkenstein DC
hcloud server create --name "$SERVER_NAME" --type ccx33 --image ubuntu-24.04 --location fsn1 --ssh-key "$KEY_NAME"
Record the IPv4 address from the output. Wait ~15 seconds for the server to boot before attempting SSH.
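Instead of a fixed sleep, the wait can be expressed as a bounded retry loop. The following is a minimal sketch; the helper name retry_until, the attempt count, and the delay are assumptions chosen for illustration. The commented usage relies on hcloud server ip to look up the address and on standard OpenSSH options.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: retry_until <attempts> <delay-seconds> <command> [args...]
retry_until() {
  attempts=$1; delay=$2; shift 2
  n=0
  while [ "$n" -lt "$attempts" ]; do
    "$@" && return 0
    n=$((n + 1))
    # Sleep only if another attempt follows.
    [ "$n" -lt "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Usage (not executed here):
#   IP=$(hcloud server ip "$SERVER_NAME")
#   retry_until 12 5 ssh -o BatchMode=yes -o ConnectTimeout=5 \
#     -o StrictHostKeyChecking=no "root@$IP" true
```

The loop returns as soon as SSH answers, and it fails with a non-zero status after about a minute if the server never comes up.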
If SSH fails with a host key conflict, remove the stale key:
ssh-keygen -f ~/.ssh/known_hosts -R <IP>
Step 2: Install JDK 21
ssh -o StrictHostKeyChecking=no root@<IP> \
'apt-get update -qq && apt-get install -y -qq openjdk-21-jdk-headless git tmux > /dev/null 2>&1 && java -version'
Step 3: Deploy the project
Rsync the worktree root (the directory containing mvnw, pom.xml, core/, etc.), excluding .git, target, and .idea:
rsync -az --exclude='.git' --exclude='target' --exclude='.idea' <worktree-root>/ root@<IP>:/root/ytdb/
Important: The working directory (e.g. /workspace/ytdb/ldbc-jmh) may be a git worktree — it contains the full project tree with mvnw at its root. Rsync this directory, NOT the parent /workspace/ytdb/.
Then initialize a git repo on the server (required by Spotless):
ssh root@<IP> 'git config --global --add safe.directory /root/ytdb && \
git config --global user.email "bench@test" && \
git config --global user.name "bench" && \
cd /root/ytdb && git init && git add -A && git commit -m "baseline" --quiet'
Step 3b: Download dataset from Hetzner S3 (jmh-ldbc only — MANDATORY)
The LDBC dataset must be pre-downloaded before running benchmarks. The benchmark no longer auto-downloads from SURF (the SURF format is incompatible). Download it from Hetzner Object Storage (S3):
ssh root@<IP> 'apt-get install -y -qq python3-pip zstd > /dev/null 2>&1 && \
pip install --break-system-packages boto3 -q && \
mkdir -p /root/ytdb/<module>/target/ldbc-dataset/sf0.1 && \
python3 -c "
import boto3, os
s3 = boto3.client(\"s3\",
endpoint_url=os.environ[\"S3_ENDPOINT\"],
aws_access_key_id=os.environ[\"S3_ACCESS_KEY\"],
aws_secret_access_key=os.environ[\"S3_SECRET_KEY\"])
print(\"Downloading dataset from S3...\")
s3.download_file(\"bench-cache\", \"ldbc/ldbc-sf0.1-composite-merged-fk.tar.zst\", \"/tmp/dataset.tar.zst\")
print(\"Downloaded\")
" && \
cd /root/ytdb/<module>/target/ldbc-dataset/sf0.1 && \
zstd -d /tmp/dataset.tar.zst -o /tmp/dataset.tar && \
tar xf /tmp/dataset.tar && \
rm -f /tmp/dataset.tar.zst /tmp/dataset.tar && \
echo "Dataset ready" && ls static/ dynamic/'
Important: The command above requires S3 credentials as environment variables on the remote server. Pass them via SSH:
ssh root@<IP> "export S3_ENDPOINT='<endpoint>' S3_ACCESS_KEY='<key>' S3_SECRET_KEY='<secret>' && ..."
Credentials are stored as GitHub secrets: HETZNER_S3_ACCESS_KEY, HETZNER_S3_SECRET_KEY, HETZNER_S3_ENDPOINT. Retrieve them from GitHub or ask the user.
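Before opening the SSH session, it can help to fail fast when a credential is missing locally. The sketch below assumes the three values have been exported in the local shell under the names S3_ENDPOINT, S3_ACCESS_KEY, and S3_SECRET_KEY; the helper require_env is hypothetical.

```shell
# Return non-zero and name the first variable that is unset or empty.
require_env() {
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      return 1
    fi
  done
}

# Usage (not executed here):
#   require_env S3_ENDPOINT S3_ACCESS_KEY S3_SECRET_KEY || exit 1
```

Only pass variable names you control to require_env, because the lookup uses eval.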
Replace <module> with the benchmark module (e.g. jmh-ldbc).
The dataset uses LDBC datagen v1.0.0 CsvCompositeMergeForeign format (~19 MB). It is stored in Hetzner Object Storage bucket bench-cache at key ldbc/ldbc-sf0.1-composite-merged-fk.tar.zst.
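After extraction, a quick sanity check can confirm that the archive unpacked as expected. This sketch assumes only what the download command lists at the end, namely that the dataset directory contains non-empty static/ and dynamic/ subdirectories. The function name dataset_ready is hypothetical.

```shell
# Succeed only if <dir>/static and <dir>/dynamic exist and are non-empty.
dataset_ready() {
  dir=$1
  for sub in static dynamic; do
    if [ ! -d "$dir/$sub" ] || [ -z "$(ls -A "$dir/$sub" 2>/dev/null)" ]; then
      echo "dataset incomplete: $dir/$sub" >&2
      return 1
    fi
  done
}

# Usage on the server (not executed here):
#   dataset_ready /root/ytdb/<module>/target/ldbc-dataset/sf0.1 || exit 1
```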
If S3 credentials are unavailable, generate the dataset locally using the LDBC datagen Docker image, then rsync it to the server:
# On the local machine
docker run --rm \
-v "$(pwd)/jmh-ldbc/target/ldbc-dataset/sf0.1:/out" \
ldbc/datagen:latest \
--scale-factor 0.1 --mode raw --format CsvCompositeMergeForeign
# Then rsync the dataset to the server
rsync -az jmh-ldbc/target/ldbc-dataset/ root@<IP>:/root/ytdb/jmh-ldbc/target/ldbc-dataset/
Do not use the SURF repository at repository.surfsara.nl — it provides CsvComposite format (v0.3.5), which is incompatible with the benchmark loaders.
Step 4: Compile
ssh root@<IP> 'cd /root/ytdb && chmod +x mvnw && \
./mvnw -pl <module> -am compile -DskipTests -Dspotless.check.skip=true -q'
Replace <module> with the target benchmark module (e.g. jmh-ldbc).
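Wait for the command to exit with status 0 (typically ~60-90 seconds on CCX33). Because of -q, Maven prints nothing on success, so BUILD SUCCESS will not appear in the output.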
Step 4b: Pre-load LDBC dataset (jmh-ldbc only)
Critical for jmh-ldbc: The LDBC dataset is downloaded and loaded into the database inside JMH's @Setup(Level.Trial) method. This means the first fork's warmup iteration includes dataset download + DB creation time. For multi-threaded benchmarks, threads start executing queries on a partially-loaded database, producing wildly inaccurate results (e.g., 300+ ops/s when the real throughput is ~3 ops/s).
Always pre-load the dataset before running actual benchmarks:
ssh root@<IP> 'cd /root/ytdb && ./mvnw -pl <module> -am verify -P bench -DskipTests -Dspotless.check.skip=true \
-Djmh.args="ic5_newGroups -f 0 -wi 0 -i 1 -r 1s -t 1" 2>&1 | tail -20'
This runs a single in-process iteration (-f 0) that triggers dataset download and DB creation. Subsequent forked runs will find the existing DB at ./target/ldbc-bench-db and skip loading.
If the dataset was pre-downloaded via Step 3b: The pre-load step is still required — it creates the YouTrackDB database from the CSV files. However, the download phase will be skipped automatically because the dataset files already exist in target/ldbc-dataset/.
When comparing two code versions (A/B testing): After running version A, delete the benchmark database before running version B to avoid stale cached data:
ssh root@<IP> 'rm -rf /root/ytdb/jmh-ldbc/target/ldbc-bench-db'
The dataset files (target/ldbc-dataset/) can be kept — only the DB needs to be recreated.
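The reset between version A and version B can be wrapped in a small helper so the dataset is never deleted by accident. This is a sketch to run on the server; the function name reset_bench_db is hypothetical, and the paths follow the target/ldbc-bench-db and target/ldbc-dataset locations named above.

```shell
# Remove only the benchmark database under <module-dir>/target,
# leaving the downloaded dataset in place.
reset_bench_db() {
  module_dir=${1:?module directory required}
  rm -rf "$module_dir/target/ldbc-bench-db"
  if [ -d "$module_dir/target/ldbc-dataset" ]; then
    echo "database removed; dataset kept at $module_dir/target/ldbc-dataset"
  else
    echo "database removed; no dataset found under $module_dir/target" >&2
  fi
}

# Usage on the server (not executed here):
#   reset_bench_db /root/ytdb/jmh-ldbc
```

After the reset, repeat the pre-load step for version B so the database is rebuilt before any measured run.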
Step 5: Run benchmarks
IMPORTANT: Never run multiple benchmarks concurrently on the same server. Always wait for one benchmark run to complete before starting the next.
Start the benchmark in a tmux session so it survives SSH disconnects.
If the module has a bench Maven profile (like jmh-ldbc):
ssh root@<IP> 'tmux new-session -d -s bench \
"cd /root/ytdb && ./mvnw -pl <module> -am verify -P bench -DskipTests -Dspotless.check.skip=true \
-Djmh.args=\"<jmh-args> -rf json -rff /root/resul