Tuning a Solana Node [ENG]
Finer tuning of a Solana node
Steps:
Determine the best CPU based on base clock frequency, boost clock, etc., using the SolanaHCL hardware list.
AMD P-State and AMD P-State EPP Scaling Driver Configuration Guide: check your Linux kernel version and try amd_pstate=passive on kernel 6.1+ (edit your GRUB config to add amd_pstate=passive, run update-grub, and reboot).
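A sketch of the GRUB change, operating on a copy first. The sed pattern assumes a quoted GRUB_CMDLINE_LINUX_DEFAULT line, which is the Debian/Ubuntu default:

```shell
# Work on a copy first; /etc/default/grub is the Debian/Ubuntu location.
cp /etc/default/grub /tmp/grub.test 2>/dev/null || true
grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=' /tmp/grub.test 2>/dev/null || \
    echo 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"' > /tmp/grub.test

# Append amd_pstate=passive inside the quoted default command line.
sed -i 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 amd_pstate=passive"/' /tmp/grub.test
grep GRUB_CMDLINE_LINUX_DEFAULT /tmp/grub.test

# Once the output looks right:
# sudo cp /tmp/grub.test /etc/default/grub && sudo update-grub && sudo reboot
```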
Configure Linux to use "performance" mode for the CPU scheduler.
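One way to apply the performance governor is via the cpufreq sysfs interface (requires root; `cpupower frequency-set -g performance` is an equivalent one-liner if linux-tools is installed). This is system configuration, so run it on the node itself:

```shell
# Set the "performance" governor on every CPU (requires root).
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    echo performance | sudo tee "$g" > /dev/null
done
# Verify:
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
```

Note this does not persist across reboots on its own; use a systemd unit or your distro's cpufrequtils config to reapply it at boot.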
Apply sysctl tweaks as described here.
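The "here" link is broken in this copy; it most likely pointed at the standard Agave/Solana validator sysctl tweaks. For reference, the commonly recommended set looks like the following (verify the values against the current Agave docs before applying), placed in e.g. /etc/sysctl.d/21-agave-validator.conf and loaded with `sudo sysctl -p /etc/sysctl.d/21-agave-validator.conf`:

```
# Increase UDP buffer sizes
net.core.rmem_default = 134217728
net.core.rmem_max = 134217728
net.core.wmem_default = 134217728
net.core.wmem_max = 134217728

# Increase memory-mapped files limit
vm.max_map_count = 1000000

# Increase number of allowed open file descriptors
fs.nr_open = 1000000
```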
Format NVMe: XFS with noatime for ledger and snapshots, ext4 with noatime for accounts.
Store your ledger, snapshots, and accounts on at least three different paths (i.e., three separate NVMe SSDs; Gen4 is acceptable, though Gen5 is preferable).
Split accounts into three different paths (or more), each connected to a single NVMe SSD.
Enable and configure your index and hash caches in RamDisk. Details here.
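As a sketch of the filesystem layout (device names and mount points are assumptions for illustration; mkfs is destructive, so adapt to your hardware first):

```
sudo mkfs.xfs -f /dev/nvme1n1        # ledger + snapshots
sudo mkfs.ext4 -F /dev/nvme2n1       # accounts
sudo mkdir -p /mnt/ledger /mnt/accounts
sudo mount -o noatime /dev/nvme1n1 /mnt/ledger
sudo mount -o noatime /dev/nvme2n1 /mnt/accounts

# /etc/fstab entries so the mounts survive a reboot:
# /dev/nvme1n1  /mnt/ledger    xfs   noatime  0 0
# /dev/nvme2n1  /mnt/accounts  ext4  noatime  0 0
```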
Enable --block-verification-method unified-scheduler to improve catch-up speed. (Already the default on versions > 2.0.)
--accounts-db-hash-threads 4 or 2. On v2.1 you can supply --accounts-db-hash-threads to specify the number of threads that perform the accounts hash calculation. If you have snapshots disabled, you can safely set this to 1, and the EAH calculation will be less invasive.
If you use an HA (High Availability) setup, disable snapshot generation on the primary with --snapshot-interval-slots 0; it will reduce vote latency spikes.
Set up a 'non-voting' hot spare to fail over to if SLA is a concern: Pumpkin's Pool blog post here.
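Putting the flags above together, a hypothetical excerpt of an agave-validator invocation (keypair path and mount layout are assumptions for illustration; multiple accounts paths are comma-separated):

```
exec agave-validator \
    --identity /home/sol/validator-keypair.json \
    --ledger /mnt/ledger \
    --snapshots /mnt/snapshots \
    --accounts /mnt/accounts1,/mnt/accounts2,/mnt/accounts3 \
    --block-verification-method unified-scheduler \
    --accounts-db-hash-threads 2 \
    --snapshot-interval-slots 0 \
    ...
```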
If you do go to Ubuntu 24.04, make sure to disable the monthly restarts, or it will restart your validator systemd unit automatically every month.
Select a data centre with good networking and peering.
Use mods (at your own risk).
Important bonus:
Configure core isolation for PoH:
The easiest way to find the hyperthread sibling of, e.g., core 2:
cat /sys/devices/system/cpu/cpu2/topology/thread_siblings_list
nohz_full=2,26: enables full dynamic ticks for core 2 and its hyperthread 26, reducing timer-tick overhead and latency.
isolcpus=domain,managed_irq,2,26: isolates core 2 and hyperthread 26 from the general scheduler.
irqaffinity=0-1,3-25,27-47: directs interrupts away from core 2 and hyperthread 26 (this range assumes a 48-thread CPU).
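Put together, the three parameters land on the kernel command line via GRUB (illustrative 48-thread example; keep your existing options and adjust the core numbers to your CPU):

```
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nohz_full=2,26 isolcpus=domain,managed_irq,2,26 irqaffinity=0-1,3-25,27-47"
```

Run update-grub and reboot, then verify with `cat /proc/cmdline` and `cat /sys/devices/system/cpu/isolated`.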
There is a bug with core affinity if you isolate your cores: https://github.com/anza-xyz/agave/issues/1968. You can use my bash script to find the pid of solPohTickProd and pin it to, e.g., core 2:
#!/bin/bash
# Wait until the ledger is loaded, because there will be no thread id until then.
# Check logs with: journalctl -xe
# Wait for the binary to load
# sleep 20

# Main pid of agave-validator
solana_pid=$(pgrep -f "^agave-validator --identity")
if [ -z "$solana_pid" ]; then
    logger "set_affinity: agave_validator_404"
    exit 1
fi

# Find the thread id
thread_pid=$(ps -T -p $solana_pid -o spid,comm | grep 'solPohTickProd' | awk '{print $1}')
if [ -z "$thread_pid" ]; then
    logger "set_affinity: solPohTickProd_404"
    exit 1
fi

# Check if the affinity is already set
current_affinity=$(taskset -cp $thread_pid 2>&1 | awk '{print $NF}')
if [ "$current_affinity" == "2" ]; then
    logger "set_affinity: solPohTickProd_already_set"
    exit 1
else
    # Pin PoH to cpu 2
    sudo taskset -cp 2 $thread_pid
    logger "set_affinity: set_done"   # $thread_pid
fi
By following these steps, core 2 will run at full speed without any TDP limits or interrupts. In my example, core 2 runs at 5.9 GHz with overclocking.
Note: the pgrep above uses ^ (match at the start of the command line); if it fails to find the pid on your setup, remove the ^ and it will find it.
Verify the changes:
cat /etc/default/grub | grep GRUB_CMDLINE_LINUX_DEFAULT
pgrep agave-validator
ps -o spid,psr,comm -T -p $(pgrep agave-validator) | grep 'solPohTickProd'
Benchmark the PoH speed (logged around node startup):
grep "PoH speed check" solana.log
grep "hashes/sec" solana.log
Another user's script:
Slightly updated core-pin script, if anyone finds it helpful. I tried making it run as
ExecStartPost
in the systemd unit, but it times out, so I'm just running it manually for now: https://github.com/RadiantAeon/solana-rpc-deploy/blob/main/core-pin.sh
On a non-isolated core there is ~20% load from other things (SMT off). No clue what the hash rate on the isolated core is like, because of the above issue with core pinning on isolated cores.
One more script:
You can pin it once the ledger has loaded; maybe someday we will see a fix for that, allowing us to provide the core index without needing to run the bash script. Here's an example of how I solved that:
use libc::{cpu_set_t, sched_setaffinity, CPU_SET, CPU_ZERO};
use std::process;

/// Set the CPU affinity for the current process to a specific core.
pub fn set_cpu_affinityx(core_id: usize) -> Result<(), String> {
    unsafe {
        // Build a CPU set containing only the requested core.
        let mut cpu_set: cpu_set_t = std::mem::zeroed();
        CPU_ZERO(&mut cpu_set);
        CPU_SET(core_id, &mut cpu_set);

        let pid = process::id() as libc::pid_t;
        let result = sched_setaffinity(pid, std::mem::size_of::<cpu_set_t>(), &cpu_set);
        if result == 0 {
            println!("successfully set cpu affinity to core {}", core_id);
            Ok(())
        } else {
            let errno = *libc::__errno_location();
            let err_msg = match errno {
                libc::EINVAL => "invalid core id".to_string(),
                libc::ESRCH => "process not found".to_string(),
                libc::EPERM => "permission denied".to_string(),
                _ => format!("unknown error: {}", errno),
            };
            Err(format!("failed to set cpu affinity: {}", err_msg))
        }
    }
}
Contributors
@ghosty3609 / Kiln.fi
Inspired by
Everyone else on #validator-hw-tuning