🏁Tuning a Solana Node [ENG]

Finer tuning of a Solana node

Steps:

  • Determine the best CPU based on base clock frequency, boost clock, etc., using the SolanaHCL Hardware list.

  • Read the AMD P-State and AMD P-State EPP Scaling Driver Configuration Guide. Check your Linux kernel version and, on kernel 6.1+, try amd_pstate=passive: add amd_pstate=passive to the kernel command line in your GRUB configuration, run update-grub, and reboot.

  • Configure Linux to use the "performance" CPU frequency scaling governor.

  • Apply sysctl tweaks as described here.

  • Format the NVMe drives as XFS (mounted with noatime) for the ledger and snapshots, and as ext4 (mounted with noatime) for accounts.

  • Store your ledger, snapshots, and accounts on at least three different paths (i.e., three separate NVMe SSDs; Gen4 is acceptable, though Gen5 is preferable).

  • Split accounts into three different paths (or more), each connected to a single NVMe SSD.

  • Enable and configure your index and hash caches on a RAM disk. Details here.

  • Enable --block-verification-method unified-scheduler to improve catch-up speed. This is already the default in versions > 2.0.

  • Set --accounts-db-hash-threads to 4 or 2. On v2.1 you can supply --accounts-db-hash-threads to specify the number of threads that perform the accounts hash calculation. If you have snapshots disabled, you can safely set this to 1, and the EAH (epoch accounts hash) calculation will be less invasive.

  • If you use an HA (High Availability) setup, disable snapshot generation on the primary with --snapshot-interval-slots 0; this reduces vote latency spikes.

  • Set up a 'non-voting' hot spare to fail over to if SLA is a concern: see the Pumpkin Pool blog post.

  • If you do upgrade to Ubuntu 24.04, make sure to disable the monthly restarts, or it will restart your validator systemd unit automatically every month.

  • Select a data centre with good networking and peering.

  • Use mods (at your own risk).
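
The governor and kernel-parameter steps in the list above can be spot-checked with a small shell sketch. It assumes the standard Linux sysfs cpufreq layout under /sys/devices/system/cpu and reads /proc/cmdline. The function names non_performance_cpus and cmdline_has are illustrative only and are not part of any Solana tooling.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, paths assume a standard Linux sysfs layout.

# Print "cpuN governor" for every CPU whose cpufreq governor is not "performance".
# The sysfs root can be overridden (first argument) to run against a fake tree.
non_performance_cpus() {
    local root="${1:-/sys/devices/system/cpu}"
    local f gov cpu
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue          # skips the unexpanded glob when nothing matches
        gov="$(cat "$f")"
        if [ "$gov" != "performance" ]; then
            cpu="${f#"$root"/}"          # strip the root prefix
            cpu="${cpu%%/*}"             # keep only "cpuN"
            echo "$cpu $gov"
        fi
    done
}

# Succeed if the kernel command line contains the given parameter as a whole word.
# The command line defaults to /proc/cmdline but can be passed as the second argument.
cmdline_has() {
    local param="$1"
    local cmdline="${2:-$(cat /proc/cmdline)}"
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Running non_performance_cpus with no arguments prints one line per CPU that is not in performance mode (no output means every CPU already is). After the reboot, `cmdline_has amd_pstate=passive && echo ok` confirms that the boot parameter took effect.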

Important bonus:

  • Configure core isolation for PoH:

    • Find the nearest available core. In most cases, it's core 2 (cores 0 and 1 are often used by the kernel). If you have more cores, you can choose another nearby available core.

      Know your topology:

    • Check your cores and hyperthreads

      Look at the "cores" table to find your core and its hyperthread. For example, if you choose core 2, its hyperthread might be 26 (in my case).

    • The easiest way to find the hyperthread for, e.g., core 2:

    • By following these steps, core 2 will run at full speed without any TDP limits or interrupts. In my example, core 2 runs at 5.9 GHz with overclocking.

    • Note that the pgrep pattern uses ^ (match at the start of the command line); if I remove the ^, it finds the PID.
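
As a rough illustration of the topology lookup described above, the sketch below reads the kernel's thread_siblings_list file for a logical CPU and prints its hyperthread sibling(s). It assumes the standard sysfs path /sys/devices/system/cpu/cpuN/topology/thread_siblings_list; the function name sibling_threads is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: sibling_threads is a hypothetical helper, not part of any Solana tooling.

# Print the SMT sibling(s) of a logical CPU, excluding the CPU itself.
# The kernel writes the list either as "2,26" or as a range such as "2-3".
# The sysfs root can be overridden (second argument) for testing.
sibling_threads() {
    local cpu="$1"
    local root="${2:-/sys/devices/system/cpu}"
    local file="$root/cpu$cpu/topology/thread_siblings_list"
    local part lo hi i
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    local IFS=','                        # split the list on commas
    for part in $(cat "$file"); do
        lo="${part%-*}"                  # start of range (or the single value)
        hi="${part#*-}"                  # end of range (or the single value)
        i="$lo"
        while [ "$i" -le "$hi" ]; do
            if [ "$i" -ne "$cpu" ]; then
                echo "$i"
            fi
            i=$((i + 1))
        done
    done
    return 0
}
```

For example, `sibling_threads 2` prints the hyperthread paired with core 2 on the local machine. On a system with SMT disabled it prints nothing, since the core has no sibling.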

Discord link.

Check updates:

Benchmark that PoH speed (around node start up):

Another user script:

  1. Slightly updated core-pin script, if anyone finds it helpful. I tried making it run as ExecStartPost in the systemd unit, but it times out, so I'm just running it manually for now: https://github.com/RadiantAeon/solana-rpc-deploy/blob/main/core-pin.sh

  2. 21369467 on a non-isolated core with ~20% load from other things, SMT off. No clue what the hash rate on the isolated core is like, because of the above issue with core pinning on isolated cores.

One more script:

You can pin it once the ledger has loaded. Maybe someday we will see a fix for that, allowing us to provide the core index without needing to run the bash script. Here's an example of how I solved that:
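
The contributor's script itself is not reproduced on this page. A minimal sketch of the same idea, under stated assumptions, looks up a thread by name inside a running process and pins it with taskset. The thread name solPohTickProd and the process name agave-validator used in the comments are assumptions about the validator build in use, and the helper names tid_for_thread and pin_thread are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; the thread and process names
# mentioned below are assumptions and should be checked with `ps -T -p <pid>`.

# Read `ps -T -o spid,comm` output on stdin and print the thread ID
# (first column) of the first thread whose name equals the argument.
tid_for_thread() {
    awk -v name="$1" '$2 == name { print $1; exit }'
}

# Pin the named thread of process <pid> to CPU <core>.
# Changing affinity of another user's process may require root.
pin_thread() {
    local pid="$1" name="$2" core="$3" tid
    tid="$(ps -T -p "$pid" -o spid,comm | tid_for_thread "$name")"
    if [ -z "$tid" ]; then
        echo "thread $name not found in pid $pid (has the ledger finished loading?)" >&2
        return 1
    fi
    taskset -cp "$core" "$tid"
}

# Example (assumed names, run manually once the validator has started up):
#   pin_thread "$(pgrep -o -f agave-validator)" solPohTickProd 2
```

Because the thread only appears once startup has progressed far enough, pin_thread returns a non-zero status when it is called too early, so a wrapper can simply retry until it succeeds.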

Contributors

Inspired by