Khawaja Shams, Co-Founder & CEO at Momento, shared insights on tuning Valkey on a c8g.2xl instance, highlighting the impressive performance of Graviton chips. The setup achieved over 1.1 million requests per second (RPS) on a single 8 vCPU box with consistent tail latencies—without pipelining.
Key Observations:
- IRQ Processing Efficiency – Only 2 Graviton cores handle packet processing to support 1M+ RPS, visible as “red” in monitoring tools.
- Main Valkey Thread – Core 6 was fully saturated, which suggests that the main thread, rather than the network path, was the limiting resource.
- Thermal & Performance Stability – Unlike x86, Graviton maintains consistent tail latencies even at ~100% CPU utilization.
Graviton’s architecture ensures high throughput while keeping latency predictable—ideal for high-performance databases like Valkey.
You Should Know:
Performance Tuning Commands & Tools
To replicate such performance, use these Linux commands and tools:
1. Monitor CPU & IRQ Activity
mpstat -P ALL 1                # Per-core CPU utilization
top -H -p $(pgrep valkey)      # Thread-level CPU usage
cat /proc/interrupts           # Check IRQ distribution
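To see which cores actually absorb the NIC interrupts, the raw `/proc/interrupts` table can be reduced to one total per CPU. The following is a minimal sketch, not part of the original post: the function name `irq_per_cpu` and the interface pattern `ens5` are assumptions chosen for illustration, so substitute the name your NIC's queues show in `/proc/interrupts`.

```shell
#!/usr/bin/env bash
# Sum interrupt counts per CPU for every IRQ line matching PATTERN.
# Usage: irq_per_cpu PATTERN [FILE]   (FILE defaults to /proc/interrupts)
irq_per_cpu() {
    local pattern="$1" file="${2:-/proc/interrupts}"
    awk -v pat="$pattern" '
        NR == 1 { ncpu = NF; next }            # header row: CPU0 CPU1 ...
        $0 ~ pat {
            for (i = 2; i <= ncpu + 1; i++)    # field 1 is the "NN:" IRQ label
                total[i - 2] += $i
        }
        END {
            for (c = 0; c < ncpu; c++)
                printf "CPU%d %.0f\n", c, total[c]
        }
    ' "$file"
}
```

Running `irq_per_cpu ens5` twice a few seconds apart and comparing the totals shows which cores are taking the packet-processing load.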
2. Isolate CPU Cores for IRQ Handling
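sudo systemctl set-property --runtime -- user.slice AllowedCPUs=0,1   # Confine user.slice processes to cores 0-1 (not persisted across reboots)
echo 0 > /proc/irq//smp_affinity_list                                  # Bind an IRQ to core 0; run as root, with the IRQ number placed between /proc/irq/ and /smp_affinity_list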
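Binding IRQs one at a time gets tedious when a NIC exposes several queues. The sketch below looks up every IRQ whose `/proc/interrupts` line matches a pattern and pins it to a CPU list. It is an illustration under assumptions: `pin_irqs`, the `APPLY` switch, and the `ens5` pattern are hypothetical names, not something the original post used. By default it only prints the writes it would perform.

```shell
#!/usr/bin/env bash
# Print (or, with APPLY=1, perform) the writes that pin every IRQ whose
# /proc/interrupts line matches PATTERN to the CPU list CPUS.
# Usage: pin_irqs PATTERN CPUS [FILE]   (FILE defaults to /proc/interrupts)
pin_irqs() {
    local pattern="$1" cpus="$2" file="${3:-/proc/interrupts}" irq
    awk -v pat="$pattern" 'NR > 1 && $0 ~ pat { sub(":", "", $1); print $1 }' "$file" |
    while read -r irq; do
        if [ "${APPLY:-0}" = "1" ]; then
            # Writing the affinity list requires root.
            echo "$cpus" > "/proc/irq/$irq/smp_affinity_list"
        else
            # Dry run: show what would be written.
            echo "echo $cpus > /proc/irq/$irq/smp_affinity_list"
        fi
    done
}
```

For example, `pin_irqs ens5 0-1` previews the changes, and `APPLY=1 pin_irqs ens5 0-1` applies them when run as root. If the `irqbalance` daemon is running, it may rewrite these affinities later, so it is usually stopped first.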
3. Optimize Valkey (Redis-compatible) Configuration
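# In valkey.conf
maxmemory 16gb     # Cap on dataset memory; keep this below the instance's physical RAM
io-threads 4       # I/O threads; fewer than the 8 vCPUs so cores remain for the main thread and IRQ handling
disable-thp yes    # Disable Transparent HugePages for the Valkey process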
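Before relying on `disable-thp`, it can help to check the system-wide Transparent HugePages mode, since the setting only matters when the kernel mode is `always`. The helper below is a small sketch: the name `thp_mode` is made up for illustration, and it simply extracts the bracketed value from the kernel's sysfs file.

```shell
#!/usr/bin/env bash
# Print the active THP mode (always, madvise, or never).
# The kernel marks the active mode with brackets, e.g. "always [madvise] never".
# Usage: thp_mode [FILE]   (FILE defaults to the sysfs THP control file)
thp_mode() {
    local file="${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$file"
}
```

Calling `thp_mode` with no arguments reports the current host setting.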
4. Network Tuning for High RPS
sudo sysctl -w net.core.somaxconn=65535
sudo sysctl -w net.ipv4.tcp_max_syn_backlog=8192
sudo ethtool -C eth0 rx-usecs 10   # Reduce NIC interrupt delay
5. Benchmark with `redis-benchmark`
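redis-benchmark -h 127.0.0.1 -p 6379 -t set,get -n 1000000 -c 32 -P 16
Note that -P 16 pipelines 16 requests per round trip. Drop the -P flag to measure without pipelining, as in the result described above.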
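To compare runs, it is easier to reduce the benchmark's `--csv` output to one throughput figure per test. The sketch below assumes the CSV layout in which the first two quoted columns are the test name and requests per second, with newer versions also printing a header row. The function name `csv_rps` is an illustrative choice, not part of the benchmark tool.

```shell
#!/usr/bin/env bash
# Read redis-benchmark --csv output on stdin and print "TEST RPS" per test.
csv_rps() {
    awk -F'","' '
        { gsub(/"/, "", $1); gsub(/"/, "", $2) }   # strip leftover quotes
        $1 == "test" { next }                      # skip the header row, if any
        NF >= 2 { print $1, $2 }
    '
}
```

A typical invocation would be `redis-benchmark -h 127.0.0.1 -p 6379 -t set,get -n 1000000 -c 32 --csv | csv_rps`, run once without `-P` and once with it to see how much of the throughput comes from pipelining.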
What Undercode Say
In this test, Graviton sustained near-saturation load with stable tail latencies, which makes it a strong candidate for real-time databases. Key takeaways:
- IRQ optimization is critical: dedicate cores to packet processing to avoid contention.
- Thread pinning makes performance more deterministic.
- Network stack tuning reduces bottlenecks at high RPS.
Prediction
As more workloads move to AWS Graviton3/4 instances, expect roughly 30% better price-performance for in-memory databases. Hybrid x86-ARM clusters may emerge for cost-sensitive workloads.
Expected Output:
A high-performance Valkey setup on Graviton with:
✔ 1M+ RPS at sub-millisecond latency
✔ Dedicated IRQ cores for stability
✔ Optimized network & thread scheduling
Reported By: Kshams Valkey – Hackers Feeds


