Basic checklist #
uptime
: check load average
df -h
: check filesystem usage
dmesg -H
: check potentially useful kernel messages
sudo journalctl -e
: check service error messages
top or htop or ps aux
: check irregular processes (dead or zombie processes)
free -h
: check free memory
ip a
: check IP addresses
When a system behaves abnormally (for example, it is slow), first identify which aspect of the system is the likely cause: CPU, memory, networking, or I/O.
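As a rough first pass at the CPU question, the 1-minute load average can be compared with the number of CPUs. The sketch below assumes the common rule of thumb that a load above the CPU count suggests saturation; `load_status` is a hypothetical helper name, not a standard tool.

```shell
#!/bin/sh
# load_status LOAD CPUS: print "high" when LOAD exceeds CPUS, otherwise "ok".
# Assumption: "load > CPU count" is only a rule of thumb, not a hard limit.
load_status() {
  awk -v load="$1" -v cpus="$2" \
    'BEGIN { if (load + 0 > cpus + 0) print "high"; else print "ok" }'
}

# Typical use on Linux (the first field of /proc/loadavg is the 1-minute load):
#   load_status "$(cut -d ' ' -f 1 /proc/loadavg)" "$(nproc)"
```

A "high" result only points toward CPU or blocked I/O; `top`, `free -h` and `iostat -x 1` are still needed to tell which.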
Networking #
- See a live graph of network I/O:
nload
- See network I/O per host:
iftop
- List remote IPs of open connections:
ss
- A background service that logs the total network I/O over a period:
vnstat
vnstat --top10
vnstat -d
DNS #
$ cat /etc/resolv.conf # are the server IPs correct?
$ nslookup www.google.com # nslookup is from the bind-utils package
$ nslookup linux3.sa.csie.ntu.edu.tw 10.217.44.1 # check that DNS server 10.217.44.1 is working
$ ping www.google.com # if you don't have `nslookup`, this can also reveal a DNS problem
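When several resolvers are configured, it can help to query each one in turn rather than only the first. A minimal sketch, assuming the usual `nameserver ADDRESS` line format in `/etc/resolv.conf`; `list_nameservers` is a hypothetical helper name.

```shell
#!/bin/sh
# list_nameservers [FILE]: print every nameserver address configured in FILE
# (defaults to /etc/resolv.conf). Comment and search lines are ignored.
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Typical use: ask each configured server the same question and compare.
#   for ns in $(list_nameservers); do
#     echo "== $ns"
#     nslookup www.google.com "$ns"
#   done
```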
Unable to start a service #
lsof -i TCP:[port]
: Check whether its listening port is used by another process
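If `lsof` is not installed, the same check can be sketched with `ss`. The helper below is hypothetical: it assumes the column layout of `ss -ltnH` (TCP listeners, numeric, no header), where the fourth field is the local `address:port`.

```shell
#!/bin/sh
# port_listed PORT: read `ss -ltnH` output on stdin and succeed (exit 0)
# if PORT appears as a local listening port, fail (exit 1) otherwise.
port_listed() {
  awk -v port="$1" '
    { n = split($4, parts, ":"); if (parts[n] == port) found = 1 }
    END { exit !found }'
}

# Typical use (add -p and sudo to `ss` to see the owning process):
#   ss -ltnH | port_listed 8080 && echo "port 8080 is already in use"
```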
Irregular connections #
- When a system has been compromised, the attacker may install malicious programs that open an unusual number of outbound connections, typically to attack other computers.
- A user might be running P2P applications, which are prohibited on TANet.
## Irregular traffic
$ ss -t -a # Any suspicious IP?
$ netstat -tulnp # Any weird process listening?
## Outbound connection slow ...
$ mtr [host] # Are packets dropped at any hop along the route?
$ tcpdump -nn host [xxx] and port [yyy]
$ iftop # Fancy tools XD
$ nload # Also fancy tools
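To make "any suspicious IP?" easier to answer on a busy host, the connection list can be grouped by remote address. A minimal sketch, assuming the column layout of `ss -tnH` (fifth field is the peer `address:port`); `top_peers` is a hypothetical helper name.

```shell
#!/bin/sh
# top_peers: read `ss -tnH` output on stdin and print "COUNT PEER_ADDRESS",
# with the busiest peer first.
top_peers() {
  awk '
    { peer = $5; sub(/:[^:]*$/, "", peer); count[peer]++ }  # strip the port
    END { for (p in count) print count[p], p }' | sort -rn
}

# Typical use:
#   ss -tnH | top_peers | head
```

A single remote address with hundreds of connections, or hundreds of addresses with one connection each, are both worth a closer look with `tcpdump`.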
Disk or I/O #
Sometimes a filesystem becomes full or disk I/O becomes slow.
Once you have identified the offending oversized file, it is not sufficient to just unlink it; you must also make sure that no process is still holding it open, because the space is only released when the last open descriptor is closed.
$ df -h # which filesystem is full?
$ ncdu # check directory space usage
$ du -sh * # check directory space usage
$ lsof [somefile] # check which process has a file open
$ iostat -x 1 # check unusual disk I/O
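The point about unlinking can be seen in a small, harmless demonstration: a deleted file stays alive for as long as some process holds it open. The sketch uses a temporary file and file descriptor 3 of the current shell purely for illustration.

```shell
#!/bin/sh
# Demonstration: deleting a file does not free it while it is still open.
tmp=$(mktemp)
echo "still here" > "$tmp"
exec 3< "$tmp"      # hold the file open on descriptor 3
rm -f "$tmp"        # unlink: the name is gone, the data is not
held=$(cat <&3)     # the content is still readable through the descriptor
exec 3<&-           # closing the descriptor finally releases the space
echo "$held"
```

On a real system, `lsof +L1` lists open files whose link count is below 1, that is, files that were deleted but are still held open; restarting or signalling the owning process is what actually frees the space.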
Broken services #
Sometimes a service can misbehave…
pgrep -al cupsd
: is your service alive?
systemctl status sshd
: is your service alive?
sudo strace -p $(pgrep cupsd) -s 1000 -f -o /tmp/trace-output
: look at what is happening in the cupsd process
vim -c 'set filetype=strace' /tmp/trace-output
: view strace’s output
sudo journalctl -f -e -u sshd
: monitor the SSH server’s live output
sudo tail -f /var/log/sssd/sssd.log
: monitor SSSD’s live log output
lsof [some file]
: is my file opened by any process?