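DNS Server Performance Testing
I had the chance to build a caching DNS server for the NOC of an event, so I decided to try tuning it.
Because the server forwards every query to Google Public DNS, what I measured ended up being close to a load test against Google Public DNS itself. Please treat only the method as a reference, not the numbers.
Server-side setup
Specs
- OS: CentOS Linux release 7.7.1908 (Core)
- CPU: 4 cores
- Memory: 4 GB
- Storage: 100 GB
- Swap: 1.6 GB
- Network: Gigabit Ethernet x1
Installing Unbound
The packaged version is old, so ideally you should build Unbound from source instead.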
yum update
yum install unbound
Create the configuration file
/etc/unbound/unbound.conf
# The server clause sets the main parameters.
server:
verbosity: 1
statistics-interval: 0
statistics-cumulative: no
extended-statistics: yes
interface: 0.0.0.0
interface: ::0
interface-automatic: no
so-reuseport: yes
ip-transparent: yes
# use all CPUs
# equal to `ls /dev/cpu/ | wc -l`
num-threads: 4
# power of 2 close to num-threads (for 4 threads that would be 4; this test ran with 4096)
msg-cache-slabs: 4096
rrset-cache-slabs: 4096
infra-cache-slabs: 4096
key-cache-slabs: 4096
# more cache memory, rrset=msg*2
rrset-cache-size: 100m
msg-cache-size: 50m
# more outgoing connections
# depends on number of cores: 1024/cores - 50
outgoing-range: 206
# Larger socket buffer. OS may need config.
so-rcvbuf: 4m
# extend tcp connections
incoming-num-tcp: 1000
outgoing-num-tcp: 1000
# IPv4
access-control: 0.0.0.0/0 refuse
access-control: 127.0.0.0/8 allow
# IPv6
access-control: ::0/0 refuse
access-control: ::1 allow
access-control: ::ffff:127.0.0.1 allow
username: "unbound"
directory: "/etc/unbound"
chroot: ""
# Log identity to report. if empty, defaults to the name of argv[0]
# (usually "unbound").
# log-identity: ""
# print UTC timestamp in ascii to logfile, default is epoch in seconds.
log-time-ascii: yes
# print one line with time, IP, name, type, class for every query.
log-queries: yes
# print one line per reply, with time, IP, name, type, class, rcode,
# timetoresolve, fromcache and responsesize.
# log-replies: no
harden-glue: yes
harden-dnssec-stripped: yes
harden-below-nxdomain: yes
harden-referral-path: yes
unwanted-reply-threshold: 10000000
prefetch: yes
prefetch-key: yes
rrset-roundrobin: yes
# if yes, Unbound doesn't insert authority/additional sections
# into response messages when those sections are not required.
minimal-responses: yes
module-config: "ipsecmod validator iterator"
trust-anchor-signaling: yes
trusted-keys-file: /etc/unbound/keys.d/*.key
auto-trust-anchor-file: "/var/lib/unbound/root.key"
val-clean-additional: yes
val-permissive-mode: no
val-log-level: 1
include: /etc/unbound/local.d/*.conf
ipsecmod-enabled: no
ipsecmod-hook: "/usr/libexec/ipsec/_unbound-hook"
# Remote control config section.
remote-control:
control-enable: no
# Stub and Forward zones
include: /etc/unbound/conf.d/*.conf
# Forward zones
forward-zone:
name: "."
forward-addr: 8.8.8.8
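The thread-dependent values in the configuration follow rules stated in its own comments: one thread per CPU, a power-of-2 slab count close to the thread count, and an outgoing-range of 1024/cores - 50. The following is a minimal sketch of that arithmetic. The function names are hypothetical helpers, not part of Unbound, and "close to" is interpreted here as the smallest power of 2 that is not below the thread count, which is an assumption.

```shell
#!/bin/sh
# Hypothetical helpers that derive thread-dependent unbound.conf values
# from the rules quoted in the configuration comments.

# Smallest power of 2 that is >= the given thread count (assumed reading
# of "power of 2 close to num-threads").
slabs_for() {
    n=1
    while [ "$n" -lt "$1" ]; do
        n=$((n * 2))
    done
    echo "$n"
}

# outgoing-range per the rule "1024/cores - 50".
outgoing_range_for() {
    echo $((1024 / $1 - 50))
}

# Print suggested settings for a core count (defaults to this host's).
suggest_settings() {
    cores=${1:-$(nproc)}
    echo "num-threads: $cores"
    echo "msg-cache-slabs: $(slabs_for "$cores")"
    echo "outgoing-range: $(outgoing_range_for "$cores")"
}

suggest_settings 4
```

For 4 cores this reproduces the outgoing-range of 206 used above. After editing the file, unbound-checkconf /etc/unbound/unbound.conf can be used to catch syntax errors before restarting.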
Checking file descriptors
Reference: https://tweeeety.hateblo.jp/entry/20131220/1387508776
# cat /proc/sys/fs/file-max
381622
#
# cat /proc/sys/fs/file-nr
1248 0 381622
# ps aux | grep unb
unbound 13949 0.2 8.2 1066060 321520 ? Ssl 22:04 0:00 /usr/sbin/unbound -d
root 13958 0.0 0.0 112728 968 pts/0 S+ 22:07 0:00 grep --color=auto unb
#
# cat /proc/13949/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             15065                15065                processes
Max open files            12886                12886                files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       15065                15065                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
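The line of interest in this output is Max open files, since every socket Unbound opens consumes a file descriptor. As a rough sketch, the soft limit can be pulled out of a limits file with awk. The helper name soft_open_files is hypothetical, and looking the process up with pgrep is just one possible way to find the PID.

```shell
#!/bin/sh
# Hypothetical helper: print the soft "Max open files" limit from a file
# in the /proc/<pid>/limits format.
soft_open_files() {
    # Whitespace-separated fields: Max open files <soft> <hard> <units>
    awk '/^Max open files/ { print $4 }' "$1"
}

# Example: inspect the running Unbound process, if there is one.
pid=$(pgrep -o -x unbound || true)
if [ -n "$pid" ]; then
    echo "unbound (pid $pid) soft nofile limit: $(soft_open_files "/proc/$pid/limits")"
fi
```

Run against the process shown above, the helper would report 12886.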
Client-side setup
Specs
- OS: CentOS Linux release 7.7.1908 (Core)
- CPU: 4 cores
- Memory: 16 GB
- Storage: 100 GB
- Swap: 7.8 GB
- Network: Gigabit Ethernet x1
Installing the load-testing tool
I used dnsperf, a load-testing tool for DNS servers.
https://www.dns-oarc.net/tools/dnsperf
Installing build dependencies
yum install -y bind-devel krb5-devel openssl-devel libcap-devel libxml2-devel json-c-devel GeoIP-devel
Build
git clone https://github.com/DNS-OARC/dnsperf.git
cd dnsperf
./autogen.sh
./configure
make
make install
Creating the query list
I took lists of frequently queried domains from the following repository.
https://github.com/opendns/public-domain-lists
git clone https://github.com/opendns/public-domain-lists
cd public-domain-lists/
# Append the record type (A) to every line to match dnsperf's input format
sed -i 's/$/ A/g' opendns-top-domains.txt
sed -i 's/$/ A/g' opendns-random-domains.txt
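dnsperf reads one query per line in the form "name type", which is why the sed commands append " A" to each domain. Because sed -i edits the file in place, running it a second time would produce "A A". The following sketch writes to a new file instead and leaves lines that already carry a type untouched. The helper name and the output filename are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert a plain domain list (one name per line) on
# stdin into dnsperf's "name type" format on stdout.
# Blank lines are dropped; lines that already have a type keep it.
to_dnsperf_queries() {
    awk 'NF == 1 { print $1, "A" } NF >= 2 { print $1, $2 }'
}

# Example usage against the cloned list (output filename is an assumption):
#   to_dnsperf_queries < opendns-top-domains.txt > top-domains.queries

# Small self-contained demonstration:
printf 'example.com\n\nexample.org A\n' | to_dnsperf_queries
```

The demonstration prints two lines, "example.com A" and "example.org A", so the conversion is safe to repeat.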
Running the load tests
Run 1
$ dnsperf -s 10.1.1.21 -S 1 -d public-domain-lists/opendns-top-domains.txt
Statistics:
Queries sent: 10000
Queries completed: 9988 (99.88%)
Queries lost: 12 (0.12%)
Response codes: NOERROR 9107 (91.18%), SERVFAIL 158 (1.58%), NXDOMAIN 723 (7.24%)
Average packet size: request 30, response 65
Run time (s): 34.363365
Queries per second: 290.658380
Average Latency (s): 0.299928 (min 0.001580, max 4.893059)
Latency StdDev (s): 0.487916
With a cold cache, throughput was 290 qps.
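When comparing several runs, it helps to pull only the throughput figure out of the statistics block. A minimal sketch, assuming the dnsperf output is piped in or saved to a file; qps_of is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: extract the "Queries per second" value from dnsperf
# output read on stdin.
qps_of() {
    # Split on the colon and any spaces that follow it.
    awk -F': *' '/Queries per second/ { print $2 }'
}

# Example with the line reported by the run above:
echo "Queries per second: 290.658380" | qps_of
```

In practice the helper could sit at the end of a pipeline such as dnsperf ... | tee run1.txt | qps_of, which keeps the full report in run1.txt.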
Run 2
$ dnsperf -s 10.1.1.21 -S 1 -d public-domain-lists/opendns-top-domains.txt
Statistics:
Queries sent: 10000
Queries completed: 9968 (99.68%)
Queries lost: 32 (0.32%)
Response codes: NOERROR 9121 (91.50%), SERVFAIL 113 (1.13%), NXDOMAIN 734 (7.36%)
Average packet size: request 30, response 65
Run time (s): 8.277772
Queries per second: 1204.188760
Average Latency (s): 0.039892 (min 0.000070, max 4.847847)
Latency StdDev (s): 0.269064
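With the cache warm, throughput rose to 1204 qps. To clear the cache, restart Unbound or run unbound-control reload. Note that unbound-control only works when control-enable is set to yes, whereas the configuration above sets it to no.
Run 3
I cleared the cache before this run.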
$ dnsperf -s 10.1.1.21 -q 200 -d public-domain-lists/opendns-top-domains.txt
Statistics:
Queries sent: 10000
Queries completed: 9991 (99.91%)
Queries lost: 9 (0.09%)
Response codes: NOERROR 9116 (91.24%), SERVFAIL 144 (1.44%), NXDOMAIN 731 (7.32%)
Average packet size: request 30, response 65
Run time (s): 15.859268
Queries per second: 629.978635
Average Latency (s): 0.230750 (min 0.000303, max 4.774134)
Latency StdDev (s): 0.379951
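CPU and memory on the server still had headroom, so for this run I raised -q (the maximum number of outstanding queries, default 100) to 200. That gave 629 qps with a cold cache.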
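The comparison across -q values can be laid out as a plan before running anything. The sketch below only prints the dnsperf command lines, one cold-cache and one warm-cache run per value, so that the cache can be cleared by hand between pairs. The server address and query file are the ones used in this article; sweep_plan is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the dnsperf command for each -q value, twice
# per value (cold cache first, then warm cache). Nothing is executed.
sweep_plan() {
    server=$1
    queries=$2
    shift 2
    for q in "$@"; do
        for cache in cold warm; do
            echo "# q=$q, $cache cache"
            echo "dnsperf -s $server -q $q -d $queries"
        done
    done
}

sweep_plan 10.1.1.21 public-domain-lists/opendns-top-domains.txt 100 200 300
```

Printing rather than executing is a deliberate choice here: each cold run is only meaningful if the cache was actually emptied first.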
Run 4
$ dnsperf -s 10.1.1.21 -q 200 -d public-domain-lists/opendns-top-domains.txt
Statistics:
Queries sent: 10000
Queries completed: 9957 (99.57%)
Queries lost: 43 (0.43%)
Response codes: NOERROR 9120 (91.59%), SERVFAIL 103 (1.03%), NXDOMAIN 734 (7.37%)
Average packet size: request 30, response 65
Run time (s): 6.805150
Queries per second: 1463.156580
Average Latency (s): 0.047700 (min 0.000070, max 4.923105)
Latency StdDev (s): 0.261557
With the cache warm, throughput was 1463 qps.
Run 5
I cleared the cache before this run.
$ dnsperf -s 10.1.1.21 -q 300 -d public-domain-lists/opendns-top-domains.txt
Statistics:
Queries sent: 10000
Queries completed: 9958 (99.58%)
Queries lost: 42 (0.42%)
Response codes: NOERROR 9102 (91.40%), SERVFAIL 137 (1.38%), NXDOMAIN 719 (7.22%)
Average packet size: request 30, response 65
Run time (s): 17.690169
Queries per second: 562.911524
Average Latency (s): 0.406831 (min 0.004011, max 4.923452)
Latency StdDev (s): 0.575371
Throughput was 562 qps, which is no improvement over -q 200 (629 qps).
Run 6
$ dnsperf -s 10.1.1.21 -q 300 -d public-domain-lists/opendns-top-domains.txt
Statistics:
Queries sent: 10000
Queries completed: 9958 (99.58%)
Queries lost: 42 (0.42%)
Response codes: NOERROR 9120 (91.58%), SERVFAIL 104 (1.04%), NXDOMAIN 734 (7.37%)
Average packet size: request 30, response 65
Run time (s): 5.823530
Queries per second: 1709.959423
Average Latency (s): 0.048118 (min 0.000083, max 4.958302)
Latency StdDev (s): 0.261235
With the cache warm, throughput was 1709 qps.
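The six runs pair up as one cold-cache and one warm-cache run for each -q value (100 by default, then 200 and 300). Dividing the warm figure by the cold one shows how much the cache contributes. A small sketch using the QPS values reported above; speedup is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: ratio of warm-cache QPS to cold-cache QPS,
# rounded to two decimals.
speedup() {
    awk -v cold="$1" -v warm="$2" 'BEGIN { printf "%.2f\n", warm / cold }'
}

# QPS figures copied from the six runs above (cold, warm).
echo "q=100: $(speedup 290.658380 1204.188760)x"
echo "q=200: $(speedup 629.978635 1463.156580)x"
echo "q=300: $(speedup 562.911524 1709.959423)x"
```

The ratios come out at roughly 4.1x, 2.3x, and 3.0x. Since each configuration was measured only once, these should be read as rough indications rather than stable results.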
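Related articles
The slides by an engineer at SCSK were helpful.
- Evaluating and measuring DNS (DNSの評価と計測の話) https://dnsops.jp/event/20130718/20130718-stress-tool-hattori-1.pdf
- Introduction to free DNS stress tools (フリーのDNSストレスツール の紹介): dnsperf (developed by Nominum), dnstcpbench (developed by Nether abs) https://dnsops.jp/event/20130718/20130718-stress-tool-hattori-1.pdf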