A few days ago, someone decided to DDoS the entire IP range of my Hong Kong VPS provider. fail2ban did its job a little too enthusiastically — banned 20,000+ IPs, ran the machine out of memory, and triggered a kernel panic. Great. :/

That’s when I realized the problem: fail2ban lets packets hit the kernel networking stack first and only reacts afterwards. Under a real flood, that reaction cost alone is enough to kill the machine. So I went down the XDP/eBPF rabbit hole: XDP hooks in at the NIC driver level, so packets can be dropped before they ever touch the kernel stack.


What is XDP?

XDP (eXpress Data Path) is an eBPF-based, high-performance packet processing path that runs before packets enter the Linux networking stack (at the NIC driver level). This makes it significantly faster than traditional iptables/nftables filtering.
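
To make that concrete, here is a minimal sketch of the kind of program involved (illustrative names only, not the code from the repo): an XDP program that bounds-checks the headers, looks up the TCP destination port in a BPF hash map, and returns XDP_DROP before the packet ever reaches the networking stack.

/* sketch only: a whitelist-style XDP port filter, not basic_xdp's actual program */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, __u16);            /* allowed destination port, host byte order */
    __type(value, __u8);
} allowed_ports SEC(".maps");

SEC("xdp")
int xdp_port_filter(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    /* Every access must be bounds-checked or the verifier rejects the program. */
    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return XDP_DROP;
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;                        /* ARP, IPv6, ... handled elsewhere */

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end || ip->ihl < 5)
        return XDP_DROP;
    if (ip->protocol != IPPROTO_TCP)
        return XDP_PASS;

    struct tcphdr *tcp = (void *)ip + ip->ihl * 4;
    if ((void *)(tcp + 1) > data_end)
        return XDP_DROP;

    __u16 dport = bpf_ntohs(tcp->dest);
    if (bpf_map_lookup_elem(&allowed_ports, &dport))
        return XDP_PASS;                        /* whitelisted port */

    return XDP_DROP;                            /* dropped before the kernel stack sees it */
}

char LICENSE[] SEC("license") = "GPL";

The nice property of this shape is that the port map can be updated from user space at any time; the XDP program itself never needs to be reloaded when a new service starts.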


The other thing that annoyed me was manually managing port rules. Every time I spun up a new service I’d have to remember to add it. So I built a small daemon that watches for new listening ports via Netlink Process Connector and updates the BPF whitelist automatically — usually within a second of a service starting.
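
For anyone curious how the port-watching half can work, here is a rough user-space sketch (assumed structure, not the daemon's actual code): it subscribes to the Netlink Process Connector, and on every exec event a real daemon would rescan listening sockets (e.g. /proc/net/tcp) and push any new ports into the whitelist map.

/* sketch only: subscribe to process events; needs CAP_NET_ADMIN (run as root) */
#include <linux/cn_proc.h>
#include <linux/connector.h>
#include <linux/netlink.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR);

    struct sockaddr_nl sa = { .nl_family = AF_NETLINK,
                              .nl_groups = CN_IDX_PROC,
                              .nl_pid    = getpid() };
    bind(fd, (struct sockaddr *)&sa, sizeof(sa));

    /* Ask the kernel to start multicasting process events to this socket. */
    struct {
        struct nlmsghdr nl;
        struct cn_msg   cn;
        enum proc_cn_mcast_op op;
    } __attribute__((packed)) req = {
        .nl = { .nlmsg_len  = sizeof(req),
                .nlmsg_type = NLMSG_DONE,
                .nlmsg_pid  = getpid() },
        .cn = { .id  = { .idx = CN_IDX_PROC, .val = CN_VAL_PROC },
                .len = sizeof(enum proc_cn_mcast_op) },
        .op = PROC_CN_MCAST_LISTEN,
    };
    send(fd, &req, sizeof(req), 0);

    for (;;) {
        char buf[4096] __attribute__((aligned(4)));
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n <= 0)
            break;

        struct nlmsghdr *nlh = (struct nlmsghdr *)buf;
        struct cn_msg *cn = NLMSG_DATA(nlh);
        struct proc_event *ev = (struct proc_event *)cn->data;

        if (ev->what == PROC_EVENT_EXEC)
            /* New process started: rescan listening ports and update the BPF map here. */
            printf("exec: pid %d\n", ev->event_data.exec.process_pid);
    }
    return 0;
}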

What it does:

  • Drops unwanted traffic at the NIC level (~34–65 ns/packet on KVM VPS)
  • Auto-syncs open ports to the whitelist — no manual rule management
  • Handles IPv6 extension headers to prevent crafted-packet bypasses (see the sketch after this list)
  • One-liner install, sets up a systemd service
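
On the IPv6 point: a naive check of ip6->nexthdr can be fooled by putting Hop-by-Hop, Routing, Destination or Fragment headers in front of the transport header. A minimal sketch of the kind of walk needed (illustrative code, not the repo's), assuming the caller has already verified the Ethernet header and set *off to the start of the IPv6 header:

/* sketch only: return the transport protocol after walking extension headers,
 * or -1 if the packet is truncated / the chain is unreasonably long */
static __always_inline int ipv6_find_transport(void *data, void *data_end, __u64 *off)
{
    struct ipv6hdr *ip6 = data + *off;
    if ((void *)(ip6 + 1) > data_end)
        return -1;

    __u8 nexthdr = ip6->nexthdr;
    *off += sizeof(*ip6);

    /* Bounded loop: the verifier requires a fixed upper limit. */
    for (int i = 0; i < 6; i++) {
        switch (nexthdr) {
        case IPPROTO_HOPOPTS:
        case IPPROTO_ROUTING:
        case IPPROTO_DSTOPTS:
        case IPPROTO_FRAGMENT: {
            struct ipv6_opt_hdr *ext = data + *off;
            if ((void *)(ext + 1) > data_end)
                return -1;
            /* Fragment headers are a fixed 8 bytes; the rest are (hdrlen + 1) * 8. */
            *off += (nexthdr == IPPROTO_FRAGMENT) ? 8 : (ext->hdrlen + 1) * 8;
            nexthdr = ext->nexthdr;
            break;
        }
        default:
            return nexthdr;            /* reached TCP, UDP, ICMPv6, ... */
        }
    }
    return -1;                         /* suspiciously long chain: treat as hostile */
}

The caller then does the same whitelist lookup as in the IPv4 path; without this walk, a packet with a single Hop-by-Hop header in front of TCP would slip past a simple nexthdr check.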

What it won’t do:

  • Won’t save you if your uplink is already saturated by a volumetric attack
  • Not a replacement for a DDoS-protected host or upstream scrubbing

This is mostly a learning project that solved a real problem I had. Sharing it in case it’s useful for others running personal VPS setups.

GitHub: https://github.com/Kookiejarz/basic_xdp


📊 Benchmarks

Measured with bpftool prog run ... repeat N data_in <packet> against the JIT-compiled XDP program. Return value 2 = XDP_DROP (fast-path hit).

| Host  | CPU                                         | vCPUs | Packet input                | Repeat        | Avg latency |
|-------|---------------------------------------------|-------|-----------------------------|---------------|-------------|
| VPS A | Intel Xeon Platinum 8160M @ 2.10 GHz (KVM)  | 2     | synthetic (no data_in)      | 100 000 000   | 65 ns       |
| VPS B | AMD Ryzen 9 3900X @ 2.0 GHz (KVM)           | 1     | 30-byte IPv4 pkt (data_in)  | 1 000 000 000 | 40 ns       |
| VPS C | AMD EPYC 7Y43 @ 2.55 GHz (KVM)              | 1     | 30-byte IPv4 pkt (data_in)  | 1 000 000 000 | 34 ns       |

Note — data_in matters.
Without data_in the runner feeds a zero-length buffer; the BPF program returns immediately at the first bounds check (data_end == data), so the measurement reflects JIT dispatch overhead more than real packet-processing logic.
The 40 ns and 34 ns figures (VPS B and VPS C, measured with a real IPv4 frame) are the more representative numbers.

💭 Theoretical throughput (single core, XDP_DROP fast path)

| Avg latency | Packets / second |
|-------------|------------------|
| 65 ns       | 15.4 Mpps        |
| 40 ns       | 25.0 Mpps        |
| 34 ns       | 29.4 Mpps        |
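
(That’s just the reciprocal of the per-packet latency: 1 000 000 000 ns ÷ 65 ns ≈ 15.4 million packets per second, and likewise for the other rows.)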

Real-world NIC throughput will be the practical ceiling; the XDP program itself is not the bottleneck.

🔍 How to reproduce

git clone https://github.com/Kookiejarz/basic_xdp.git
cd basic_xdp

# Auto-detect interface
sudo bash setup_xdp.sh

# Or specify interface
sudo bash setup_xdp.sh eth0

# Build a minimal IPv4 packet (Ethernet header + IPv4 header, no payload)
python3 -c 'print("00"*14 + "45000028" + "00"*16)' | xxd -r -p > /tmp/pkt.bin

# Get the program ID
PROG_ID=$(bpftool -j prog show pinned /sys/fs/bpf/xdp_fw/prog \
          | python3 -c "import json,sys; print(json.load(sys.stdin)['id'])")

# Run 100 M iterations with a real packet
sudo bpftool prog run id "$PROG_ID" repeat 100000000 data_in /tmp/pkt.bin