
Softirq performance problem on Ingress nginx nodes

Load test

Command

ab -c 1000 -n 300000000 "http://10.xxx.xxx.32/"

The problem machine's QPS cannot go any higher than about 16k, while the control machine reaches 30k.

Load test against the nginx 404 page

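Item | Problem machine | Control machine | Physical machine
qps | 16k | 30k (30k with either 1 or 2 ab instances) | 56k+ (only 2 clients were started; more clients might push it higher)
pps | 40k | 84k | 141k
bps | 40M | 77M | 123M
si | 97% on cpu0 | 93% on cpu0 | 30% on several CPUs
load | 12.6 | 10 | 8 (24 cores)
interrupts | 20k | 33k | 177k/s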
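
The page does not say how the pps figures were collected. As a minimal sketch, assuming the kernel's per-interface counter /sys/class/net/<iface>/statistics/rx_packets is acceptable as a source, a packet rate can be derived from two samples of that counter. The function names rate and sample_pps, the interface name and the sampling interval are illustrative assumptions, not taken from the original notes.

```shell
#!/usr/bin/env bash
# rate BEFORE AFTER SECONDS
# Per-second rate between two counter samples (integer division).
rate() {
    local before=$1 after=$2 seconds=$3
    echo $(( (after - before) / seconds ))
}

# sample_pps IFACE [SECONDS]
# Read the rx_packets counter twice and print received packets per second.
# IFACE (for example eth0) and the 1-second default are assumptions.
sample_pps() {
    local iface=$1 seconds=${2:-1}
    local counter=/sys/class/net/$iface/statistics/rx_packets
    local before after
    before=$(cat "$counter")
    sleep "$seconds"
    after=$(cat "$counter")
    rate "$before" "$after" "$seconds"
}

# Example (not executed here): sample_pps eth0 5
```

Running sample_pps on each machine while ab is active should give numbers comparable to the pps row, although this counts received packets only.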

RPS

With RPS enabled, softirq load is spread across all CPUs (on the 8-core machine, using the command echo ff > /sys/class/net/eth0/queues/rx-0/rps_cpus):

top - 11:55:35 up 16:32,  1 user,  load average: 15.51, 13.07, 10.40
Tasks: 195 total,  11 running, 184 sleeping,   0 stopped,   0 zombie
%Cpu0  : 26.7 us, 19.5 sy,  0.0 ni,  6.9 id,  0.0 wa,  0.0 hi, 44.3 si,  2.7 st
%Cpu1  : 31.3 us, 35.1 sy,  0.0 ni, 12.7 id,  0.0 wa,  0.0 hi, 19.0 si,  1.9 st
%Cpu2  : 39.0 us, 24.3 sy,  0.0 ni, 14.2 id,  0.0 wa,  0.0 hi, 19.1 si,  3.4 st
%Cpu3  : 29.2 us, 23.1 sy,  0.0 ni, 25.0 id,  0.0 wa,  0.0 hi, 20.0 si,  2.7 st
%Cpu4  : 31.5 us, 19.1 sy,  0.0 ni, 28.0 id,  0.0 wa,  0.0 hi, 17.9 si,  3.5 st
%Cpu5  : 19.0 us, 54.3 sy,  0.0 ni,  7.8 id,  0.0 wa,  0.0 hi, 17.8 si,  1.1 st
%Cpu6  : 31.1 us, 20.2 sy,  0.0 ni, 25.3 id,  0.0 wa,  0.0 hi, 17.9 si,  5.4 st
%Cpu7  : 36.1 us, 30.1 sy,  0.0 ni, 11.3 id,  0.0 wa,  0.0 hi, 20.7 si,  1.9 st
KiB Mem : 16267984 total,  7293100 free,   793760 used,  8181124 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 14623460 avail Mem 
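
The value ff in the echo command is a hexadecimal CPU bitmask with the low 8 bits set, one bit per CPU. A minimal sketch of deriving that mask from a CPU count and applying it to every rx queue follows. The helper names rps_mask and enable_rps are hypothetical, and the sketch is limited to 32 CPUs because larger masks need the comma-separated group format.

```shell
#!/usr/bin/env bash
# rps_mask NCPUS
# Print a hex bitmask with the low NCPUS bits set (8 CPUs give ff).
rps_mask() {
    local n=$1
    if [ "$n" -lt 1 ] || [ "$n" -gt 32 ]; then
        echo "rps_mask: only 1..32 CPUs are supported by this sketch" >&2
        return 1
    fi
    printf '%x\n' $(( (1 << n) - 1 ))
}

# enable_rps IFACE NCPUS
# Write the mask to rps_cpus of every rx queue of IFACE. Needs root.
enable_rps() {
    local iface=$1 mask queue
    mask=$(rps_mask "$2") || return 1
    for queue in /sys/class/net/"$iface"/queues/rx-*; do
        echo "$mask" > "$queue/rps_cpus"
    done
}

# Example (not executed here): enable_rps eth0 "$(nproc)"
```

The setting does not survive a reboot, so it would have to be reapplied at boot by whatever mechanism the node already uses.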

The problem machine logs a large number of "kernel: TCP: too many orphaned sockets" messages.

Reference: https://huoding.com/2013/10/30/296
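
To see how close a node is to the orphan limit, the current orphan count in /proc/net/sockstat can be compared with the sysctl net.ipv4.tcp_max_orphans. The following is a small sketch. The function name orphan_count is hypothetical, and it reads sockstat-formatted text from stdin so it can be tried on saved output as well.

```shell
#!/usr/bin/env bash
# orphan_count
# Read /proc/net/sockstat-formatted text on stdin and print the value
# that follows the word "orphan" on the TCP: line.
orphan_count() {
    awk '$1 == "TCP:" {
        for (i = 2; i < NF; i++)
            if ($i == "orphan") print $(i + 1)
    }'
}

# Example (not executed here):
#   orphan_count < /proc/net/sockstat
#   cat /proc/sys/net/ipv4/tcp_max_orphans
```

If the count sits near the limit during the load test, the message is likely a side effect of the test's connection churn and not a separate fault.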

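Symptoms

Update: the observations below are not accurate; the load test generates a higher QPS than the real-world scenario.

Four new virtual machines were brought online as Ingress nodes. Three behaved normally, but one became very slow during peak hours. Inspection showed the ksoftirqd/0 process using a lot of CPU, and compared with the other, normal virtual machines, si on Cpu0 was very high.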

top - 20:33:25 up  1:10,  1 user,  load average: 12.68, 5.32, 2.21
Tasks: 195 total,   7 running, 188 sleeping,   0 stopped,   0 zombie
%Cpu0  :  2.5 us,  1.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi, 95.0 si,  1.2 st
%Cpu1  : 38.3 us, 22.2 sy,  0.0 ni, 30.9 id,  0.0 wa,  0.0 hi,  4.9 si,  3.7 st
%Cpu2  : 38.3 us, 21.0 sy,  0.0 ni, 35.8 id,  0.0 wa,  0.0 hi,  2.5 si,  2.5 st
%Cpu3  : 32.1 us, 26.2 sy,  0.0 ni, 35.7 id,  0.0 wa,  0.0 hi,  2.4 si,  3.6 st
%Cpu4  : 35.4 us, 22.8 sy,  0.0 ni, 34.2 id,  0.0 wa,  0.0 hi,  5.1 si,  2.5 st
%Cpu5  : 29.3 us, 20.7 sy,  0.0 ni, 41.5 id,  0.0 wa,  0.0 hi,  3.7 si,  4.9 st
%Cpu6  : 30.5 us, 30.5 sy,  0.0 ni, 30.5 id,  0.0 wa,  0.0 hi,  4.9 si,  3.7 st
%Cpu7  : 35.7 us, 31.0 sy,  0.0 ni, 23.8 id,  0.0 wa,  0.0 hi,  7.1 si,  2.4 st
KiB Mem : 16267984 total, 13893640 free,   640488 used,  1733856 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 15231168 avail Mem 
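
The si column in top shows the time share but not which softirq type is responsible. As a hedged sketch, the per-CPU NET_RX counters in /proc/softirqs can be listed to check whether network receive processing is concentrated on CPU0. The function name net_rx_per_cpu is hypothetical, and it reads from stdin so it also works on saved output.

```shell
#!/usr/bin/env bash
# net_rx_per_cpu
# Read /proc/softirqs-formatted text on stdin and print one line per CPU
# with its cumulative NET_RX softirq count.
net_rx_per_cpu() {
    awk '$1 == "NET_RX:" {
        for (i = 2; i <= NF; i++)
            printf "CPU%d %s\n", i - 2, $i
    }'
}

# Example (not executed here): net_rx_per_cpu < /proc/softirqs
```

The counters are cumulative since boot, so two runs a few seconds apart are needed to see the current distribution.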

Normal machine

top - 20:35:13 up 7 days,  5:22,  1 user,  load average: 5.71, 5.36, 5.22
Tasks: 198 total,   5 running, 193 sleeping,   0 stopped,   0 zombie
%Cpu0  : 18.1 us,  9.7 sy,  0.0 ni, 18.4 id,  0.0 wa,  0.0 hi, 52.0 si,  1.8 st
%Cpu1  : 15.3 us, 11.1 sy,  0.0 ni, 72.5 id,  0.0 wa,  0.0 hi,  0.7 si,  0.3 st
%Cpu2  : 14.7 us, 11.1 sy,  0.0 ni, 73.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.7 st
%Cpu3  : 14.0 us, 11.6 sy,  0.0 ni, 73.3 id,  0.4 wa,  0.0 hi,  0.4 si,  0.4 st
%Cpu4  : 16.4 us, 10.7 sy,  0.0 ni, 71.9 id,  0.0 wa,  0.0 hi,  0.0 si,  1.1 st
%Cpu5  :  8.3 us,  8.7 sy,  0.0 ni, 82.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.7 st
%Cpu6  :  0.0 us,100.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  : 12.6 us,  9.4 sy,  0.0 ni, 77.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.7 st
KiB Mem : 16267984 total,  1224244 free,   927316 used, 14116424 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 13885024 avail Mem 

Physical machine

top - 20:36:11 up 156 days,  3:12,  1 user,  load average: 5.95, 6.38, 5.89
Tasks: 395 total,   6 running, 389 sleeping,   0 stopped,   0 zombie
%Cpu0  : 10.7 us, 23.4 sy,  0.0 ni, 58.6 id,  0.0 wa,  0.0 hi,  7.2 si,  0.0 st
%Cpu1  : 11.4 us, 29.1 sy,  0.0 ni, 51.2 id,  0.0 wa,  0.0 hi,  8.3 si,  0.0 st
%Cpu2  : 16.8 us, 10.8 sy,  0.0 ni, 64.7 id,  0.0 wa,  0.0 hi,  7.7 si,  0.0 st
%Cpu3  : 18.4 us,  6.9 sy,  0.0 ni, 67.0 id,  0.0 wa,  0.0 hi,  7.6 si,  0.0 st
%Cpu4  : 16.8 us,  6.6 sy,  0.0 ni, 68.9 id,  0.0 wa,  0.0 hi,  7.7 si,  0.0 st
%Cpu5  :  6.7 us,  3.0 sy,  0.0 ni, 90.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  :  3.7 us,  4.1 sy,  0.0 ni, 91.9 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  :  4.1 us,  3.4 sy,  0.0 ni, 91.9 id,  0.7 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu8  :  4.4 us,  3.7 sy,  0.0 ni, 91.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu9  :  4.1 us,  4.1 sy,  0.0 ni, 91.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu10 :  4.4 us,  3.4 sy,  0.0 ni, 91.9 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu11 :  3.4 us, 10.8 sy,  0.0 ni, 85.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu12 : 12.3 us,  7.0 sy,  0.0 ni, 73.9 id,  0.0 wa,  0.0 hi,  6.7 si,  0.0 st
%Cpu13 :  0.0 us, 35.0 sy,  0.0 ni, 65.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu14 :  1.3 us,  2.3 sy,  0.0 ni, 96.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu15 :  1.3 us,  0.3 sy,  0.0 ni, 98.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu16 : 14.9 us,  6.4 sy,  0.0 ni, 71.9 id,  0.0 wa,  0.0 hi,  6.8 si,  0.0 st
%Cpu17 : 15.7 us,  6.4 sy,  0.0 ni, 70.8 id,  0.0 wa,  0.0 hi,  7.1 si,  0.0 st
%Cpu18 :  2.0 us,  3.7 sy,  0.0 ni, 94.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu19 :  2.0 us,  1.7 sy,  0.0 ni, 95.7 id,  0.7 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu20 :  2.0 us,  1.3 sy,  0.0 ni, 96.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu21 :  1.3 us,  2.0 sy,  0.0 ni, 96.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu22 :  1.7 us,  1.0 sy,  0.0 ni, 97.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu23 :  2.6 us,  4.0 sy,  0.0 ni, 93.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 13173760+total, 79440544 free,  1963216 used, 50333840 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 12426852+avail Mem 

02-工程实践/kubernetes/issue/ingress节点ksoftirqd高cpu.txt · Last modified: 2020/04/07 06:34 by annhe