🏁Tuning a Solana Node [ENG]

Finer tuning of a Solana node

Steps:

  • Determine the best CPU based on base clock frequency, boost frequency, etc. on SolanaHCL Hardware.

  • See the AMD P-State and AMD P-State EPP Scaling Driver Configuration Guide. Check your Linux kernel version: on kernel 6.1 or later, try amd_pstate=passive (add amd_pstate=passive to the kernel command line in /etc/default/grub, run update-grub, and reboot).

  • Configure Linux to use the "performance" CPU frequency scaling governor.

  • Apply sysctl tweaks as described here.

  • Format the NVMe drives as XFS (mounted with noatime) for ledger and snapshots, and as ext4 (mounted with noatime) for accounts.

  • Store your ledger, snapshots, and accounts on at least three different paths (i.e., three separate NVMe SSDs; Gen4 is acceptable, though Gen5 is preferable).

  • Split accounts into three different paths (or more), each connected to a single NVMe SSD.

  • Enable and configure your index and hash caches on a RAM disk. Details here.

  • Enable --block-verification-method unified-scheduler to improve catch-up speed. This is already the default in versions above 2.0.

  • Set --accounts-db-hash-threads to 4 or 2. From v2.1 you can supply --accounts-db-hash-threads to specify the number of threads that perform the accounts hash calculation. If you have snapshots disabled, you can safely set this to 1, and the EAH (epoch accounts hash) calculation will be less invasive.

  • If you use an HA (High Availability) setup, disable snapshot generation on the primary with --snapshot-interval-slots 0; this reduces vote latency spikes.

  • Set up a 'non-voting' hot spare to fail over to if SLA is a concern: Pumpkin Pool blog post here.

  • If you move to Ubuntu 24.04, make sure to disable the monthly restarts, or it will automatically restart your validator systemd unit every month.

  • Select a data centre with good networking and peering.

  • Use mods (at your own risk).
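Several of the steps above can be sanity-checked from a shell. The following is a minimal sketch and not part of the original guide: the function names are hypothetical, and the sysfs layout assumes a Linux system with cpufreq enabled.

```shell
#!/bin/bash
# Hypothetical helpers for checking some of the steps above (names are illustrative).

# Succeeds if a kernel release string (default: the running kernel) is >= MAJOR.MINOR.
kernel_at_least() {
    local want_major="${1%%.*}" want_minor="${1#*.}"
    local release="${2:-$(uname -r)}"
    local major="${release%%.*}"
    local rest="${release#*.}"
    local minor="${rest%%[!0-9]*}"
    if [ "$major" -gt "$want_major" ]; then return 0; fi
    [ "$major" -eq "$want_major" ] && [ "${minor:-0}" -ge "$want_minor" ]
}

# Prints every CPU whose cpufreq governor is not "performance".
# The sysfs root is a parameter only so the function can be tried on a test tree.
non_performance_cpus() {
    local root="${1:-/sys/devices/system/cpu}" f gov
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue
        read -r gov < "$f"
        if [ "$gov" != "performance" ]; then
            echo "${f%/cpufreq/scaling_governor}: $gov"
        fi
    done
}

# Succeeds if a comma-separated mount option string contains the given option.
has_mount_option() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a live system, kernel_at_least 6.1 checks the running kernel, non_performance_cpus prints nothing when every CPU already uses the performance governor, and has_mount_option can be fed the option string of a mount point (for example from findmnt -no OPTIONS on your ledger path) to confirm noatime.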

Important bonus:

  • Configure core isolation for PoH:

    • Find the nearest available core. In most cases it's core 2 (cores 0 and 1 are often used by the kernel). If you have more cores, you can choose another nearby available core.

      Know your topology:

      lstopo

    • Check your cores and hyperthreads:

      Look at the "cores" table to find your core and its hyperthread. For example, if you choose core 2, its hyperthread might be 26 (as in my case).

      lscpu --all -e

    • The easiest way to find the hyperthread for, e.g., core 2:

      cat /sys/devices/system/cpu/cpu2/topology/thread_siblings_list

    • Isolate the core and its hyperthread:

      In my case the hyperthread for core 2 is 26. Edit /etc/default/grub (don't forget to run update-grub and reboot afterwards):

      GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_pstate=passive nohz_full=2,26 isolcpus=domain,managed_irq,2,26 irqaffinity=0-1,3-25,27-47"

    • nohz_full=2,26: enables full dynamic ticks for core 2 and its hyperthread 26 to reduce overhead and latency.

    • isolcpus=domain,managed_irq,2,26: isolates core 2 and hyperthread 26 from the general scheduler.

    • irqaffinity=0-1,3-25,27-47: directs interrupts away from core 2 and hyperthread 26.

    • Pin the PoH thread to core 2:

      Add this flag to your validator command line:

      --experimental-poh-pinned-cpu-core 2 \

    • There is a bug with core_affinity if you isolate your cores: https://github.com/anza-xyz/agave/issues/1968. You can use my bash script to find the thread ID of solPohTickProd and pin it to, e.g., core 2:

    #!/bin/bash
    # Pins the solPohTickProd thread of agave-validator to core 2.
    # Wait until the ledger is loaded, because the thread does not exist until then.
    # Check logs with: journalctl -xe

    # wait for the binary to load
    # sleep 20

    # main pid of agave-validator
    # note: ^ anchors the match to the start of the command line; drop it if the
    # validator is launched via a full path, otherwise pgrep finds nothing
    solana_pid=$(pgrep -f "^agave-validator --identity" | head -n 1)
    if [ -z "$solana_pid" ]; then
        logger "set_affinity: agave_validator_404"
        exit 1
    fi

    # find the thread id of the PoH tick producer
    thread_pid=$(ps -T -p "$solana_pid" -o spid,comm | grep 'solPohTickProd' | awk '{print $1}')
    if [ -z "$thread_pid" ]; then
        logger "set_affinity: solPohTickProd_404"
        exit 1
    fi

    # check whether the affinity is already set
    current_affinity=$(taskset -cp "$thread_pid" 2>&1 | awk '{print $NF}')
    if [ "$current_affinity" == "2" ]; then
        # nothing to do, so this is not treated as an error
        logger "set_affinity: solPohTickProd_already_set"
        exit 0
    else
        # pin PoH to cpu 2
        sudo taskset -cp 2 "$thread_pid"
        logger "set_affinity: set_done"
    fi

    • By following these steps, core 2 will run at full speed without any TDP limits or interrupts. In my example, core 2 runs at 5.9 GHz with overclocking.

    • Note: the pgrep pattern uses ^ (match at the start of the command line). If it finds nothing on your system, remove the ^; without it, pgrep finds the PID.
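The kernel parameters above follow mechanically from the chosen core, its hyperthread, and the total number of logical CPUs. As a rough sketch (the helper names are hypothetical and not part of the original guide), the irqaffinity list is simply every CPU except the isolated pair:

```shell
#!/bin/bash
# Hypothetical helpers (names are illustrative).

# Prints CPUs 0..TOTAL-1, minus the excluded ones, as a compact range list.
cpu_list_excluding() {
    local total="$1" out="" start="" prev="" cpu=0 skip x
    shift
    while [ "$cpu" -le "$total" ]; do
        skip=0
        for x in "$@"; do
            if [ "$x" -eq "$cpu" ]; then skip=1; fi
        done
        if [ "$cpu" -lt "$total" ] && [ "$skip" -eq 0 ]; then
            # extend (or open) the current run of allowed CPUs
            if [ -z "$start" ]; then start=$cpu; fi
            prev=$cpu
        elif [ -n "$start" ]; then
            # close the current run: "a" for a single CPU, "a-b" for a range
            if [ "$start" -eq "$prev" ]; then
                out="${out:+$out,}$start"
            else
                out="${out:+$out,}$start-$prev"
            fi
            start=""
        fi
        cpu=$((cpu + 1))
    done
    echo "$out"
}

# Prints the isolation parameters for CORE, its SIBLING hyperthread, and TOTAL logical CPUs.
isolation_cmdline() {
    local core="$1" sibling="$2" total="$3"
    echo "nohz_full=$core,$sibling isolcpus=domain,managed_irq,$core,$sibling irqaffinity=$(cpu_list_excluding "$total" "$core" "$sibling")"
}

# The 48-CPU machine from the example above:
isolation_cmdline 2 26 48
# prints: nohz_full=2,26 isolcpus=domain,managed_irq,2,26 irqaffinity=0-1,3-25,27-47
```

The output still has to be merged into GRUB_CMDLINE_LINUX_DEFAULT by hand, next to any other parameters such as amd_pstate=passive.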

Discord link.

Verify the changes:

grep GRUB_CMDLINE_LINUX_DEFAULT /etc/default/grub
pgrep agave-validator
ps -o spid,psr,comm -T -p "$(pgrep -o agave-validator)" | grep 'solPohTickProd'
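After the reboot, the running kernel also reports which CPUs it actually isolated, in /sys/devices/system/cpu/isolated and /sys/devices/system/cpu/nohz_full. The sketch below parses those lists; the function names are hypothetical and the script is only an illustration, not part of the original guide.

```shell
#!/bin/bash
# Hypothetical helpers (names are illustrative).

# Succeeds if CPU is covered by LIST, a kernel-style CPU list such as "0-1,3-25".
cpu_in_list() {
    local cpu="$1" list="$2" part lo hi old_ifs="$IFS"
    IFS=,
    set -- $list
    IFS="$old_ifs"
    for part in "$@"; do
        lo=${part%-*}
        hi=${part#*-}
        # non-numeric entries are silently treated as "no match"
        if [ "$cpu" -ge "$lo" ] 2>/dev/null && [ "$cpu" -le "$hi" ] 2>/dev/null; then
            return 0
        fi
    done
    return 1
}

# Prints, for each CPU given, whether the kernel lists it as isolated and as nohz_full.
report_isolation() {
    local sys=/sys/devices/system/cpu isolated nohz cpu
    isolated=$(cat "$sys/isolated" 2>/dev/null || true)
    nohz=$(cat "$sys/nohz_full" 2>/dev/null || true)
    for cpu in "$@"; do
        if cpu_in_list "$cpu" "$isolated"; then
            echo "cpu$cpu: isolated"
        else
            echo "cpu$cpu: NOT isolated"
        fi
        if cpu_in_list "$cpu" "$nohz"; then
            echo "cpu$cpu: nohz_full"
        else
            echo "cpu$cpu: NOT nohz_full"
        fi
    done
}

# Example for the core pair used in this guide:
# report_isolation 2 26
```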

Benchmark the PoH speed (around node startup):

grep "PoH speed check" solana.log
grep "hashes/sec" solana.log

Another user's script:

  1. A slightly updated core-pin script, in case anyone finds it helpful. I tried running it as ExecStartPost in the systemd unit, but it times out, so I am just running it manually for now: https://github.com/RadiantAeon/solana-rpc-deploy/blob/main/core-pin.sh

  2. 21369467 hashes/sec on a non-isolated core with ~20% load from other things, SMT off. No clue what the hash rate on the isolated core is like, because of the above issue with core pinning on isolated cores.
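Because the solPohTickProd thread only appears once the ledger has loaded, a pin script has to wait for it. One way to approach the wait is a bounded polling loop over /proc/<pid>/task, sketched below. The function names and the 600-second timeout are hypothetical, and this is not claimed to resolve the ExecStartPost timeout mentioned above.

```shell
#!/bin/bash
# Hypothetical helpers (names are illustrative).

# Prints the thread id of the first thread of PID whose name equals NAME.
# PROC_ROOT defaults to /proc and is a parameter only to allow testing.
find_thread_id() {
    local pid="$1" name="$2" proc_root="${3:-/proc}" task comm
    for task in "$proc_root/$pid"/task/*; do
        [ -r "$task/comm" ] || continue
        read -r comm < "$task/comm" || continue
        if [ "$comm" = "$name" ]; then
            echo "${task##*/}"
            return 0
        fi
    done
    return 1
}

# Polls once per second until the thread exists or TIMEOUT seconds have passed.
wait_for_thread() {
    local pid="$1" name="$2" timeout="$3" proc_root="${4:-/proc}" waited=0
    until find_thread_id "$pid" "$name" "$proc_root"; do
        if [ "$waited" -ge "$timeout" ]; then return 1; fi
        sleep 1
        waited=$((waited + 1))
    done
}

# Example (left commented out because it blocks until the thread appears):
# tid=$(wait_for_thread "$(pgrep -o agave-validator)" solPohTickProd 600) \
#     && sudo taskset -cp 2 "$tid"
```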

One more script:

You can pin it once the ledger has loaded. Maybe someday we will see a fix for that, allowing us to provide the core index without needing to run the bash script. Here's an example of how I solved that:

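use libc::{cpu_set_t, sched_setaffinity, CPU_SET, CPU_SETSIZE, CPU_ZERO};
use std::process;

/// Set the CPU affinity for the current process to a specific core.
///
/// Note: on Linux, `sched_setaffinity` treats `pid` as a thread ID, so passing
/// the process ID targets the main thread; passing 0 targets the calling thread.
pub fn set_cpu_affinityx(core_id: usize) -> Result<(), String> {
    // CPU_SET indexes a fixed-size bitmask, so reject out-of-range ids up front.
    if core_id >= CPU_SETSIZE as usize {
        return Err(format!("failed to set cpu affinity: invalid core id {}", core_id));
    }

    unsafe {
        // build a CPU mask that contains only the requested core
        let mut cpu_set: cpu_set_t = std::mem::zeroed();
        CPU_ZERO(&mut cpu_set);
        CPU_SET(core_id, &mut cpu_set);

        let pid = process::id() as libc::pid_t;
        let result = sched_setaffinity(pid, std::mem::size_of::<cpu_set_t>(), &cpu_set);

        if result == 0 {
            println!("successfully set cpu affinity to core {}", core_id);
            Ok(())
        } else {
            // translate the most common errno values into readable messages
            let errno = *libc::__errno_location();
            let err_msg = match errno {
                libc::EINVAL => "invalid core id".to_string(),
                libc::ESRCH => "process not found".to_string(),
                libc::EPERM => "permission denied".to_string(),
                _ => format!("unknown error: {}", errno),
            };
            Err(format!("failed to set cpu affinity: {}", err_msg))
        }
    }
}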
