I specifically wanted programming-language-independent ways to
investigate questions like this, and I guess people who follow me on
twitter get me because I got SO MANY GREAT ANSWERS. I’ll give you a list
of all the answers at the end, but first! We’re going to mount an
investigation.
Let’s start! I wrote up 3 example mystery programs, and you can find
them in this
github repository
.
Mystery Program #1
Let’s investigate our first mystery slow program!
First, let’s check
how long it takes
using
time
:
$ time python mystery_1.py
0.09user 0.01system 0:02.11elapsed 5%CPU (0avgtext+0avgdata 52064maxresident)
We can already see by timing the program that a) it takes 2 seconds and
b) it’s only on the CPU for 5% of that time. So we already know it’s
waiting for something! But what is it waiting for?
What is Mystery Program #1 waiting for?
First, we’ll use a rough tool: ps
can tell us what every process is
currently waiting for. It gives you a piece of information called
wchan
, which the internet defines as “wait channel. The address of an
event on which a particular process is waiting.”
So, let’s start the program and then call ps -eo pid,wchan:42,cmd
to
get the PID, current waiting channel, and command for each process. (the
42 is so it doesn’t cut off any information)
I actually ran it in a loop (ps -eopid,wchan:42,cmd; sleep 0.5
so that
I wouldn’t miss anything)
$ ps -eo pid,wchan:42,cmd | grep mystery_1.py
2382 sk_wait_data python mystery_1.py
The internet
tells us that sk_wait_data
means “Wait for some data on a network
socket.“. So it seems likely that it’s slow because it’s waiting for a
network response! AWESOME.
You can see the source for mystery_1.py if you want to know what it’s actually doing.
Mystery Program #2
time
, again, is a great place to start.
$ time python mystery_2.py
2.74user 0.00system 0:02.74elapsed 99%CPU (0avgtext+0avgdata 18032maxresident)k
This program is spending all of its runtime on the CPU (2.74⁄2.74
seconds), and it’s all in user functions. The operating system won’t
have anything new to tell us here (no IO! No network!), and since it’s
Python (and not, say, C++) it’s actually going to be best to just use a
Python profiler.
It’s important to know when you shouldn’t use fancy operating systems
tools to debug performance, and I think this is one of those cases.
You can see the source for mystery_2.py
if you want to know what it’s actually doing.
Mystery program #3
time python mystery_3.py
0.08user 1.03system 0:10.61elapsed 10%CPU (0avgtext+0avgdata 18176maxresident)k
This one is waiting for 9 seconds. What’s going on?! I actually
originally thought I understood this one but I TOTALLY DIDN’T.
To thicken the plot, if we run the program twice in a row, it takes
drastically different amounts of time:
$ time python mystery_3.py
0.01user 0.40system 0:00.61elapsed 69%CPU (0avgtext+0avgdata 18176maxresident)k
0inputs+585944outputs (0major+1239minor)pagefaults 0swaps
$ time python mystery_3.py
0.01user 0.34system 0:10.55elapsed 3%CPU (0avgtext+0avgdata 18176maxresident)k
24inputs+585944outputs (0major+1238minor)pagefaults 0swaps
It just went from 0.6 seconds to 10 seconds! What in the world is going on?
I tried this wchan
trick again, and ran our mystery program a few times
$ for i in `seq 1 100`
ps -eo pid,wchan:42,cmd | grep mystery_3; sleep 0.5
11285 - python mystery_3.py
11285 sleep_on_buffer python mystery_3.py
11285 sleep_on_buffer python mystery_3.py
11285 sleep_on_buffer python mystery_3.py
11410 sleep_on_page python mystery_3.py
11438 sleep_on_shadow_bh python mystery_3.py
All of this buffer and page business is enough for me to conclude that
there’s something going on with IO. But what, and why does it run so
much more slowly the second time? I was actually totally confused about
this, even though I knew what the program was doing.
Here’s the program. It’s writing 287MB to /tmp/fake.txt
.
line = 'a' * 30000
filename = '/tmp/fake.txt'
with open(filename, 'w') as f:
for i in xrange(10000):
f.write(line)
The whole point of this exercise is to debug using our operating system,
so we need to get a better picture of what the OS is doing! Let’s use
dstat
, which gives us snapshots every second of what network, IO, and
CPU activity is happening (SO GREAT)
$ dstat
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
14 5 78 3 0 0| 46k 342k| 0 0 | 0 1629B| 374 1282
3 1 95 1 0 0| 0 232k| 0 0 | 0 0 | 606 2807
3 2 95 0 0 0| 0 0 | 164B 204B| 0 0 | 444 2093
I’m going to just show you the disk stats using dstat -d
for clarity. It prints a new line every second.
$ dstat -d
-dsk/total-
read writ
0 136k
0 56M <-- when I start python mystery_3.py. It finishes immediately
0 46M <-- but it's still writing data!
0 32M <-- still writing...
0 52M <-- still writing...
0 70M <-- still writing...
0 32M <-- still writing...
0 148k <-- okay it's done
0 0
0 144k
0 0
0 144k^
So we see that the OS is writing data even after the program is
finished. For 5 whole seconds! This was when a lightbulb went off in my
head going OH FILESYSTEM CACHES RIGHT. Writing to disks is super
slow, and the kernel loves you and wants you to not have to wait. So
it says “okay, got it!”, but doesn’t actually write the data to disk
until a little later.
When the filesystem cache runs out of space, it says “okay you need to
stop writing now” and actually writes the data to disk and makes you
wait until it’s done. Of course, sometimes you want your data to be
*actually for serious written before you keep on going (for example if
you’re a database!!). This is why the second run of our program takes so
long! It needs to wait for the writes from the previous run to finish,
and also catch up on its own.
Kamal wisely suggested that I could
force the kernel to do finish all the writes before the program
finishes:
import os
line = 'a' * 30000
filename = '/home/bork/fake.txt'
with open(filename, 'w') as f:
for i in xrange(10000):
f.write(line)
f.flush()
os.fsync(f.fileno())
time python writes2.py
0.02user 0.32system 0:06.70elapsed 5%CPU (0avgtext+0avgdata
18192maxresident)k
Surprise: it takes about 6.5 seconds. every time. Which is exactly what
we’d expect from looking at our dstat output above! I have other
Serious Questions about why my hard drive only writes at 40MB/s but that
will be for another time.
I got SO MANY ANSWERS. holy crap you guys. I’m going to write down every
tool someone recommended here so I don’t forget them, though you can
also just read the Twitter replies.
nethogs
, nettop
, and jnettop
for network performance
iotop
is top, but for IO! Awesome!
iostat
and lsof
too for seeing what’s up with IO and files right now
top
and htop
for CPU stats, of course (pro tip: use htop instead of top)
strace
because we <3 strace
perf
is a magical tool that can do anything
atop
which I don’t even understand what it is
pidstat
is an amazing program for looking at both CPU and disk activity which we’re going to explain a little more later
ps xaopid,wchan:42,cmd
is this amazing ps
incantation Aria Stewart told me which tells you what every process is currently doing. whoa.
vmstat which I’m not totally sure what it is yet
dstat
is like iotop and nethogs and top all rolled into one and I’m
super into it.
Brendan Gregg’s great picture of Linux observability tools
which is awesome as a reference but honestly I have a really hard time learning new things from it. I need examples!
Twitter is the bomb and I learned about at least 5 awesome tools I
hadn’t heard of before (nethogs, iotop, pidstat, dstat, and this ps
-eo wchan
business)
that’s all for now!
I’m working on a talk about this for PyCon 2015 next month, so there
should be more posts along these lines coming your way :) :) :)
many many thanks to Aria Stewart, Brendan Gregg, Kamal Marhubi, and
others for telling me about some of these amazing tools!