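DNS Server Performance Testing

I had the chance to build a caching DNS server for an event NOC, so I decided to try tuning it.

Because the server forwards every query to 8.8.8.8, what I did ended up being closer to a load test of Google Public DNS, so please treat only the method as a reference.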

Server-side environment setup

Specs

  • OS: CentOS Linux release 7.7.1908 (Core)
  • CPU: 4 cores
  • Memory: 4GB
  • Storage: 100GB
    • SWAP: 1.6GB
  • Network: GigabitEthernet x1

Installing Unbound

The packaged version is old, so ideally you should build Unbound from source.

yum update
yum install unbound

Create the configuration file

/etc/unbound/unbound.conf

# The server clause sets the main parameters.
server:
    verbosity: 1
    statistics-interval: 0
    statistics-cumulative: no
    extended-statistics: yes

    interface: 0.0.0.0
    interface: ::0
    interface-automatic: no
    so-reuseport: yes
    ip-transparent: yes

    # use all CPUs
    # equal to `ls /dev/cpu/ | wc -l`
    num-threads: 4

    # power of 2 close to num-threads
    msg-cache-slabs: 4096
    rrset-cache-slabs: 4096
    infra-cache-slabs: 4096
    key-cache-slabs: 4096

    # more cache memory, rrset=msg*2
    rrset-cache-size: 100m
    msg-cache-size: 50m

    # more outgoing connections
    # depends on number of cores: 1024/cores - 50
    outgoing-range: 206

    # Larger socket buffer. OS may need config.
    so-rcvbuf: 4m

    # extend tcp connections
    incoming-num-tcp: 1000
    outgoing-num-tcp: 1000

    # IPv4
    access-control: 0.0.0.0/0 refuse
    access-control: 127.0.0.0/8 allow
    # IPv6
    access-control: ::0/0 refuse
    access-control: ::1 allow
    access-control: ::ffff:127.0.0.1 allow

    username: "unbound"
    directory: "/etc/unbound"
    chroot: ""

    # Log identity to report. if empty, defaults to the name of argv[0]
    # (usually "unbound").
    # log-identity: ""

    # print UTC timestamp in ascii to logfile, default is epoch in seconds.
    log-time-ascii: yes

    # print one line with time, IP, name, type, class for every query.
    log-queries: yes

    # print one line per reply, with time, IP, name, type, class, rcode,
    # timetoresolve, fromcache and responsesize.
    # log-replies: no

    harden-glue: yes
    harden-dnssec-stripped: yes
    harden-below-nxdomain: yes
    harden-referral-path: yes
    unwanted-reply-threshold: 10000000

    prefetch: yes
    prefetch-key: yes
    rrset-roundrobin: yes

    # if yes, Unbound doesn't insert authority/additional sections
    # into response messages when those sections are not required.
    minimal-responses: yes

    module-config: "ipsecmod validator iterator"
    trust-anchor-signaling: yes
    trusted-keys-file: /etc/unbound/keys.d/*.key
    auto-trust-anchor-file: "/var/lib/unbound/root.key"
    val-clean-additional: yes
    val-permissive-mode: no
    val-log-level: 1

    include: /etc/unbound/local.d/*.conf

    ipsecmod-enabled: no
    ipsecmod-hook: "/usr/libexec/ipsec/_unbound-hook"

# Remote control config section.
remote-control:
    control-enable: no

# Stub and Forward zones
include: /etc/unbound/conf.d/*.conf

# Forward zones
forward-zone:
    name: "."
    forward-addr: 8.8.8.8
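The comments in the configuration above encode two rules of thumb: outgoing-range is 1024 divided by the number of cores, minus 50, and rrset-cache-size is twice msg-cache-size. The following is a minimal sketch that recomputes those values for a given core count. The function name suggest_unbound_values is hypothetical and is not part of Unbound; it only repeats the arithmetic from the comments.

```shell
#!/bin/sh
# Hypothetical helper: apply the rules of thumb from the config comments.
# $1 = number of cores (used as num-threads), $2 = msg-cache-size in MB
suggest_unbound_values() {
    cores=$1
    msg_mb=$2
    # "1024/cores - 50" from the config comment (integer division)
    outgoing_range=$((1024 / cores - 50))
    # "rrset=msg*2" from the config comment
    rrset_mb=$((msg_mb * 2))
    echo "num-threads: $cores"
    echo "outgoing-range: $outgoing_range"
    echo "msg-cache-size: ${msg_mb}m"
    echo "rrset-cache-size: ${rrset_mb}m"
}

# The server in this article: 4 cores, 50 MB message cache
suggest_unbound_values 4 50
```

With 4 cores and a 50 MB message cache this reproduces the outgoing-range: 206 and rrset-cache-size: 100m used above.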

Checking file descriptors

Reference: https://tweeeety.hateblo.jp/entry/20131220/1387508776

# cat /proc/sys/fs/file-max
381622

# cat /proc/sys/fs/file-nr
1248    0       381622

# ps aux | grep unb
unbound  13949  0.2  8.2 1066060 321520 ?      Ssl  22:04   0:00 /usr/sbin/unbound -d
root     13958  0.0  0.0 112728    968 pts/0   S+   22:07   0:00 grep --color=auto unb

# cat /proc/13949/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             15065                15065                processes
Max open files            12886                12886                files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       15065                15065                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
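The row that matters for this check is "Max open files" in the limits output. As a rough sketch, the soft limit can be pulled out with awk instead of reading the whole table by eye. The helper name max_open_files is made up for this example, and the sample rows are copied from the output above.

```shell
#!/bin/sh
# Print the soft limit from the "Max open files" row of a
# /proc/<pid>/limits dump read on stdin.
max_open_files() {
    # Fields: Max(1) open(2) files(3) <soft>(4) <hard>(5) files(6)
    awk '/^Max open files/ { print $4 }'
}

# Sample rows copied from the limits output in this article
sample='Max processes             15065                15065                processes
Max open files            12886                12886                files'

printf '%s\n' "$sample" | max_open_files

# On a live server (assuming pgrep is available and unbound is running):
#   max_open_files < "/proc/$(pgrep -o unbound)/limits"
```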

Client-side environment setup

Specs

  • OS: CentOS Linux release 7.7.1908 (Core)
  • CPU: 4 cores
  • Memory: 16GB
  • Storage: 100GB
    • SWAP: 7.8GB
  • Network: GigabitEthernet x1

Installing the load-testing tool

I used dnsperf, a load-testing tool for DNS servers.

https://www.dns-oarc.net/tools/dnsperf

Installing build dependencies

yum install -y bind-devel krb5-devel openssl-devel libcap-devel libxml2-devel json-c-devel GeoIP-devel

Build

git clone https://github.com/DNS-OARC/dnsperf.git
cd dnsperf
./autogen.sh
./configure
make
make install

Creating the query list

I took a list of frequently queried domains from the following repository.

https://github.com/opendns/public-domain-lists

git clone https://github.com/opendns/public-domain-lists
cd public-domain-lists/
# Modify the dataset: append the record type A to every line
sed -i 's/$/ A/g' opendns-top-domains.txt
sed -i 's/$/ A/g' opendns-random-domains.txt
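dnsperf reads one query per line in the form "name type", which is why the sed commands above append " A" to every domain. The sketch below shows the same substitution on a throwaway file so the effect is visible without touching the real lists. The example domains are placeholders, not entries from the OpenDNS lists.

```shell
#!/bin/sh
# Demonstrate the conversion on a temporary file with placeholder domains.
tmp=$(mktemp)
printf '%s\n' example.com example.org > "$tmp"

# Same substitution as above: append " A" at the end of every line.
# The g flag is not needed because $ matches only once per line.
queries=$(sed 's/$/ A/' "$tmp")
rm -f "$tmp"

printf '%s\n' "$queries"
```

Note that the in-place version is not idempotent: running sed -i a second time on the same file appends another " A" to each line.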

Running the load tests

Run 1

$ dnsperf -s 10.1.1.21 -S 1 -d public-domain-lists/opendns-top-domains.txt

Statistics:

  Queries sent:         10000
  Queries completed:    9988 (99.88%)
  Queries lost:         12 (0.12%)

  Response codes:       NOERROR 9107 (91.18%), SERVFAIL 158 (1.58%), NXDOMAIN 723 (7.24%)
  Average packet size:  request 30, response 65
  Run time (s):         34.363365
  Queries per second:   290.658380

  Average Latency (s):  0.299928 (min 0.001580, max 4.893059)
  Latency StdDev (s):   0.487916
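The "Queries per second" figure in these statistics is consistent with completed queries divided by run time. As a small sanity check, the sketch below recomputes it from the other two fields with awk. The function name recompute_qps is invented for this example, and the sample lines are copied from run 1.

```shell
#!/bin/sh
# Recompute QPS from a dnsperf "Statistics" block read on stdin.
recompute_qps() {
    awk '
        /Queries completed:/ { completed = $3 }   # e.g. 9988
        /Run time \(s\):/    { runtime = $4 }     # e.g. 34.363365
        END { if (runtime > 0) printf "%.2f\n", completed / runtime }
    '
}

# Lines copied from the run 1 statistics
stats='  Queries sent:         10000
  Queries completed:    9988 (99.88%)
  Run time (s):         34.363365'

printf '%s\n' "$stats" | recompute_qps
```

The same function can be fed a saved dnsperf log, which makes it easier to compare several runs without copying numbers by hand.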

With an empty cache, the result was 290 qps.

Run 2

$ dnsperf -s 10.1.1.21 -S 1 -d public-domain-lists/opendns-top-domains.txt

Statistics:

  Queries sent:         10000
  Queries completed:    9968 (99.68%)
  Queries lost:         32 (0.32%)

  Response codes:       NOERROR 9121 (91.50%), SERVFAIL 113 (1.13%), NXDOMAIN 734 (7.36%)
  Average packet size:  request 30, response 65
  Run time (s):         8.277772
  Queries per second:   1204.188760

  Average Latency (s):  0.039892 (min 0.000070, max 4.847847)
  Latency StdDev (s):   0.269064

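Thanks to the cache, the result was 1204 qps. To clear the cache, restart Unbound or use unbound-control (for example, unbound-control reload, which flushes the cache; this requires control-enable: yes, which the configuration above does not set).

Run 3

I cleared the cache before this run.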

$ dnsperf -s 10.1.1.21 -q 200 -d public-domain-lists/opendns-top-domains.txt

Statistics:

  Queries sent:         10000
  Queries completed:    9991 (99.91%)
  Queries lost:         9 (0.09%)

  Response codes:       NOERROR 9116 (91.24%), SERVFAIL 144 (1.44%), NXDOMAIN 731 (7.32%)
  Average packet size:  request 30, response 65
  Run time (s):         15.859268
  Queries per second:   629.978635

  Average Latency (s):  0.230750 (min 0.000303, max 4.774134)
  Latency StdDev (s):   0.379951

CPU and memory still had headroom, so I raised the -q option (the maximum number of outstanding queries, default 100) to 200. The result was 629 qps.

Run 4

$ dnsperf -s 10.1.1.21 -q 200 -d public-domain-lists/opendns-top-domains.txt

Statistics:

  Queries sent:         10000
  Queries completed:    9957 (99.57%)
  Queries lost:         43 (0.43%)

  Response codes:       NOERROR 9120 (91.59%), SERVFAIL 103 (1.03%), NXDOMAIN 734 (7.37%)
  Average packet size:  request 30, response 65
  Run time (s):         6.805150
  Queries per second:   1463.156580

  Average Latency (s):  0.047700 (min 0.000070, max 4.923105)
  Latency StdDev (s):   0.261557

Thanks to the cache, the result was 1463 qps.

Run 5

I cleared the cache before this run.

$ dnsperf -s 10.1.1.21 -q 300 -d public-domain-lists/opendns-top-domains.txt

Statistics:

  Queries sent:         10000
  Queries completed:    9958 (99.58%)
  Queries lost:         42 (0.42%)

  Response codes:       NOERROR 9102 (91.40%), SERVFAIL 137 (1.38%), NXDOMAIN 719 (7.22%)
  Average packet size:  request 30, response 65
  Run time (s):         17.690169
  Queries per second:   562.911524

  Average Latency (s):  0.406831 (min 0.004011, max 4.923452)
  Latency StdDev (s):   0.575371

The result was 562 qps, with no improvement over -q 200.

Run 6

$ dnsperf -s 10.1.1.21 -q 300 -d public-domain-lists/opendns-top-domains.txt

Statistics:

  Queries sent:         10000
  Queries completed:    9958 (99.58%)
  Queries lost:         42 (0.42%)

  Response codes:       NOERROR 9120 (91.58%), SERVFAIL 104 (1.04%), NXDOMAIN 734 (7.37%)
  Average packet size:  request 30, response 65
  Run time (s):         5.823530
  Queries per second:   1709.959423

  Average Latency (s):  0.048118 (min 0.000083, max 4.958302)
  Latency StdDev (s):   0.261235

Thanks to the cache, the result was 1709 qps.
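To compare the six runs side by side, the sketch below pairs each cold-cache run with the warm-cache run that followed it and prints the ratio. The QPS figures are rounded from the dnsperf output above; runs 1 and 2 did not pass -q, so they are listed under the default of 100. The function name speedup_table is made up for this example.

```shell
#!/bin/sh
# Columns: -q value, cold-cache QPS, warm-cache QPS (rounded from the runs above)
results='100 290.66 1204.19
200 629.98 1463.16
300 562.91 1709.96'

# Print one summary line per -q value with the warm/cold ratio.
speedup_table() {
    awk '{ printf "q=%s cold=%s warm=%s speedup=%.1fx\n", $1, $2, $3, $3 / $2 }'
}

printf '%s\n' "$results" | speedup_table
```

Each pair is a single measurement against a forwarder on the public internet, so the ratios are only a rough indication.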

Related articles

Material published by someone at SCSK was a helpful reference.