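While tuning performance I ran into the following problem: mechanical hard disks on domestic Chinese machines (Phytium CPU + Kylin OS) are known to perform very poorly, so file reads and writes carry noticeable overhead. How can we trace a program's file activity and eliminate the reads and writes that are not needed? That means finding file hotspots. filetop from the BPF Compiler Collection (BCC) is a good tool for this job, but the BCC tools are not available in the Kylin OS repositories, so I turned to strace instead.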

Tracing system calls

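Strictly speaking, the strace invocation used here does not trace file reads and writes directly; it traces every system call that takes a file name as an argument (open, stat, access and so on). For tuning purposes, though, frequent reads and writes and frequent file-status checks are both candidates for optimization, so no strict distinction between the two is drawn here.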

Trace the file-related system calls:

$ strace -t -e trace=file -o strace.log COMMAND
# --trace=file
# Trace all system calls which take a file name as an argument.  You can think of this as an abbreviation for -e trace=open,stat,chmod,unlink,...  which is useful to seeing what files the process is referencing.

Besides file, trace= also accepts process, network, signal, desc, memory and other classes; see https://man7.org/linux/man-pages/man1/strace.1.html
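Since trace=file only covers calls that take a path, it does not show how many bytes are actually read or written. The following is a minimal sketch of one way to approximate that, assuming a log captured with strace -y -e trace=read,write (the -y option prints the path behind each file descriptor). The function name bytes_per_file and the sample log are made up for illustration, and paths containing '<' or '>' are not handled.

```shell
# Sum the bytes moved by successful read/write calls, grouped by call and path.
# Expects a log from: strace -y -e trace=read,write -o rw.log COMMAND
bytes_per_file() {
    awk '
        /^(read|write)\(/ && / = [0-9]+$/ {
            call = substr($0, 1, index($0, "(") - 1)
            lt = index($0, "<")
            gt = index($0, ">")
            if (lt == 0 || gt < lt) next      # no path annotation on this line
            path = substr($0, lt + 1, gt - lt - 1)
            bytes[call " " path] += $NF       # return value = bytes transferred
        }
        END { for (k in bytes) print bytes[k], k }
    ' "$1" | sort -rn
}

# Hand-written sample in the shape of strace -y output (not a real capture).
sample=$(mktemp)
cat > "$sample" <<'EOF'
read(3</etc/hosts>, "127.0.0.1 localhost\n", 4096) = 20
read(3</etc/hosts>, "", 4096) = 0
read(4</etc/passwd>, "root:x:0:0"..., 4096) = 1500
read(4</etc/passwd>, "daemon:x:1"..., 4096) = 700
write(1</dev/pts/0>, "hello\n", 6) = 6
EOF
bytes_per_file "$sample"
# 2200 read /etc/passwd
# 20 read /etc/hosts
# 6 write /dev/pts/0
```

Failed calls (return value -1) are skipped because only lines ending in a non-negative number are counted.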

Example

$ strace -t -e trace=file -o strace.log fc-list
$ cat strace.log
18:07:24 execve("/home/tanqiao/program/hotspot/hotspot", ["hotspot"], 0x7ffc0b28a138 /* 80 vars */) = 0
18:07:24 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (没有那个文件或目录)
18:07:24 openat(AT_FDCWD, "/usr/local/cuda-11.5/lib64/tls/x86_64/x86_64/libtinfo.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (没有那个文件或目录)
18:07:24 stat("/usr/local/cuda-11.5/lib64/tls/x86_64/x86_64", 0x7ffc75b3d900) = -1 ENOENT (没有那个文件或目录)
18:07:24 openat(AT_FDCWD, "/usr/local/cuda-11.5/lib64/tls/x86_64/libtinfo.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (没有那个文件或目录)
18:07:24 stat("/usr/local/cuda-11.5/lib64/tls/x86_64", 0x7ffc75b3d900) = -1 ENOENT (没有那个文件或目录)
18:07:24 openat(AT_FDCWD, "/usr/local/cuda-11.5/lib64/tls/x86_64/libtinfo.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (没有那个文件或目录)
18:07:24 stat("/usr/local/cuda-11.5/lib64/tls/x86_64", 0x7ffc75b3d900) = -1 ENOENT (没有那个文件或目录)
18:07:24 openat(AT_FDCWD, "/usr/local/cuda-11.5/lib64/tls/libtinfo.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (没有那个文件或目录)
18:07:24 stat("/usr/local/cuda-11.5/lib64/tls", 0x7ffc75b3d900) = -1 ENOENT (没有那个文件或目录)
18:07:24 openat(AT_FDCWD, "/usr/local/cuda-11.5/lib64/x86_64/x86_64/libtinfo.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (没有那个文件或目录)
18:07:24 stat("/usr/local/cuda-11.5/lib64/x86_64/x86_64", 0x7ffc75b3d900) = -1 ENOENT (没有那个文件或目录)

Processing the log

Use grep to drop the calls that failed with ENOENT (the file does not exist):

$ cat strace.log | grep -v 'ENOENT'
# -v, --invert-match  invert the sense of matching, selecting only non-matching lines
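Filtering on ENOENT leaves other failures (for example EACCES) in the log. If the goal is to drop every failed call, one possible variant is to filter on the -1 return value instead. This is only a sketch: successful_calls is a hypothetical helper name, the sample lines are hand-written, and it assumes no path contains the text " = -1 ".

```shell
# Keep only calls that did not return -1, whatever the errno was.
successful_calls() {
    grep -v ' = -1 ' "$1"
}

# Hand-written sample lines in the shape of the log above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
18:07:24 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
18:07:24 openat(AT_FDCWD, "/root/.config", O_RDONLY) = -1 EACCES (Permission denied)
18:07:24 openat(AT_FDCWD, "/etc/fonts/fonts.conf", O_RDONLY|O_CLOEXEC) = 3
EOF
successful_calls "$sample"   # prints only the fonts.conf line
```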

Use awk to extract the first double-quoted value on each line:

$ awk -F '"' '{print $2}'

Combine sort and uniq to collapse duplicates and count them:

$ sort | uniq -c
# uniq
# 	-c, --count	prefix lines by the number of occurrences

The output looks like:

$ cat strace.log | grep -v 'ENOENT' | awk -F '"' '{print $2}' | sort | uniq -c
1 
5 /etc/fonts/conf.avail
5 /etc/fonts/conf.avail/10-antialias.conf
5 /etc/fonts/conf.avail/10-autohint.conf
5 /etc/fonts/conf.avail/10-hinting-full.conf
5 /etc/fonts/conf.avail/10-hinting-medium.conf
5 /etc/fonts/conf.avail/10-hinting-none.conf
5 /etc/fonts/conf.avail/10-hinting-slight.conf
5 /etc/fonts/conf.avail/10-no-sub-pixel.conf
5 /etc/fonts/conf.avail/10-scale-bitmap-fonts.conf
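The blank entry with a count of 1 at the top of that output comes from log lines that contain no double-quoted string at all (such as a final "+++ exited with 0 +++" line), for which $2 is empty. A minimal sketch of one way to skip them follows; quoted_paths is a hypothetical helper name and the sample log is hand-written.

```shell
# Print the first double-quoted string of each line, skipping lines without one.
# Splitting on the quote character yields at least 3 fields when a quoted string exists.
quoted_paths() {
    awk -F '"' 'NF >= 3 && $2 != "" { print $2 }' "$1"
}

# Hand-written sample in the shape of the log above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
18:07:24 openat(AT_FDCWD, "/etc/fonts/fonts.conf", O_RDONLY|O_CLOEXEC) = 3
18:07:24 stat("/etc/fonts/conf.d", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
18:07:24 +++ exited with 0 +++
EOF
quoted_paths "$sample" | sort | uniq -c   # two entries, no blank one
```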

Append sort -n to order the result by occurrence count:

$ cat strace.log | grep -v 'ENOENT' | awk -F '"' '{print $2}' | sort | uniq -c | sort -n
# sort
#	-n	compare according to string numerical value; implies -b

The output looks like:

# ...
5 /etc/fonts/conf.avail/99-language-selector-zh.conf
5 /etc/fonts/conf.d/65-khmer.conf
5 /etc/fonts/fonts.conf
5 /home/tanqiao/.config/fontconfig/conf.d/09-Directories.conf
6 /etc/fonts/conf.d
7 /home/tanqiao/.config/fontconfig/conf.d

With that, the most frequently accessed files stand out at a glance.
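The counts above lump together every call that names a file. When it matters whether a hotspot comes from stat, openat or something else, the syscall name can be kept next to the path. The sketch below assumes the -t log format shown earlier (one timestamp, then the call); calls_per_file is a hypothetical helper name and the sample log is hand-written.

```shell
# Count occurrences per (syscall, path) pair, most frequent first.
calls_per_file() {
    awk -F '"' 'NF >= 3 {
        head = $1
        sub(/^[^ ]* /, "", head)   # drop the leading -t timestamp
        sub(/\(.*/, "", head)      # keep only the syscall name
        print head, $2
    }' "$1" | sort | uniq -c | sort -rn
}

# Hand-written sample in the shape of the log above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
18:07:24 stat("/etc/fonts/conf.d", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
18:07:24 openat(AT_FDCWD, "/etc/fonts/conf.d", O_RDONLY|O_CLOEXEC) = 3
18:07:24 stat("/etc/fonts/conf.d", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
18:07:24 +++ exited with 0 +++
EOF
calls_per_file "$sample"   # stat on /etc/fonts/conf.d (2) is listed before openat (1)
```

Logs taken with other prefix options (for example per-process PIDs) would need a different rule for stripping the prefix.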

Using a pipeline

Alternatively, skip saving the strace output to a file and pipe it directly:

$ strace -t -e trace=file COMMAND 2>&1 | grep -v 'ENOENT' | awk -F '"' '{print $2}' | sort | uniq -c | sort -n

or

$ strace -t -e trace=file COMMAND |& grep -v 'ENOENT' | awk -F '"' '{print $2}' | sort | uniq -c | sort -n

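The key here is 2>&1, which redirects stderr to stdout: without -o, strace writes its trace to stderr, while a pipe only carries stdout. |& is bash shorthand for 2>&1 |.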
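The effect of the redirection can be seen without strace. In the sketch below, fake_strace is a hypothetical stand-in that writes one line to stdout (as the traced program would) and one trace-like line to stderr (as strace does).

```shell
# Stand-in for strace: program output on stdout, trace output on stderr.
fake_strace() {
    echo 'program output on stdout'
    echo 'openat(AT_FDCWD, "/etc/hosts", O_RDONLY) = 3' >&2
}

# Plain pipe: only stdout enters the pipeline, so grep never sees the trace line.
# (stderr is discarded here just to keep the terminal quiet; grep -c exits
# non-zero on zero matches, hence the || true.)
plain=$(fake_strace 2>/dev/null | grep -c 'openat' || true)

# With 2>&1, stderr is merged into stdout before the pipe.
merged=$(fake_strace 2>&1 | grep -c 'openat')

echo "plain=$plain merged=$merged"   # plain=0 merged=1
```

Note that with 2>&1 the traced program's own stdout also flows into the pipeline, so output containing double quotes could skew the counts; writing the trace to a file with -o avoids that.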

References

https://stackoverflow.com/questions/27428150/linux-track-all-files-accessed-by-process

https://unix.stackexchange.com/questions/170043/sort-and-count-number-of-occurrence-of-lines

https://stackoverflow.com/questions/16497317/piping-both-stdout-and-stderr-in-bash