Today’s dtrace one-liners April 21, 2007
Posted by mgerdts in dtrace, solaris.add a comment
Well, I’ve been spending a bit more time with dtrace lately. I’m now starting to do interesting things with dtrace without looking at the manual or snagging other scripts. Today I found some processes that looked like they weren’t doing much of use. In particular, I saw the following in “prstat -mL”
PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID 15274 oracle 46 53 0.0 0.0 0.0 0.0 0.0 0.0 0 67 .22 0 racgmain/1 15295 oracle 47 53 0.0 0.0 0.0 0.0 0.0 0.0 0 74 .23 0 racgmain/1 16810 oracle 47 53 0.0 0.0 0.0 0.0 0.0 0.0 0 66 .23 0 racgmain/1 25029 oracle 47 53 0.0 0.0 0.0 0.0 0.0 0.0 0 70 .23 0 racgmain/1
First, it is confusing that it says that these processes are doing less than one system call per second (.22 or .23), but spending more than 50% in system calls. Looks like a bug in prstat. To see what system call was being used…
# dtrace -n 'syscall:::entry/execname == "racgmain"/ { @[probefunc] = count() }
dtrace: description 'syscall:::entry' matched 232 probes
^C
read 269039
Ahh, so it is only reads that are causing all of this. Let’s see which file descriptors are involved.
# dtrace -n 'syscall::read:entry/execname == "racgmain"/{ @[pid, probefunc,arg0] = count() }'
^C
22469 read 9 7
22469 read 4 7
22469 read 3 9
22469 read 8 9
22469 read 10 10
22469 read 7 21
22469 read 6 24
15274 read 10 1026462
15295 read 10 1044932
25029 read 10 1051450
16810 read 11 1060720
It seemed rather odd that there would be so much reading on a single file descriptor per file (10 or 11, depending on the process). If this were really the case, I would expect to see some processes showing up in prstat that were high in %sys doing writes. Using pfiles on each of the pids showed that the file descriptor in question was a FIFO, but gave no indication what (if anything) was on the other end of the FIFO.
To help answer the question of what might be on the other end of the FIFO, I tried this (ok, not a one liner…)
#!/usr/sbin/dtrace -qs
syscall::read:return
/execname == "racgmain"/
{
@c[pid, arg0] = count()
}
tick-1s
{
printf("%6s %6s %4s\n", " pid ", " size ", "count");
printa("%6d %6d %@4d\n", @c);
clear(@c);
}
Which gave the following output…
pid size count 15295 0 117524 15274 0 121065 25029 0 121308 16810 0 125358 pid size count 15274 0 108177 25029 0 109639 15295 0 115735 16810 0 117175 pid size count 15295 0 119411 15274 0 119471 25029 0 120273 16810 0 120650
Imagine that… these processes are each doing 100,000+ zero-byte reads per second. Time to tell the DBA’s that they have some real problems with one of their daemons.
reproducible hang with ldom preview April 12, 2007
Posted by mgerdts in Uncategorized.add a comment
This blog entry is because formatting is horribly broken at http://forum.java.sun.com/thread.jspa?threadID=5159568 where I originally posted it.
I configured a T2000 as described in the beginner’s guide (http://www.sun.com/blueprints/0207/820-0832.pdf) with the exception of the device allocated for the root disk. For that I came up with my own variant of http://unixconsole.blogspot.com/2007/04/time-to-build-guest-domain.html.
My variant of using a file involved creating the file with mkfile on a zfs file system. That is…
zpool create zfs mirror c1t0d0s4 c1t1d0s4 zfs create zfs/ldoms zfs set compress=on zfs/ldoms mkfile 32G /zfs/ldoms/root.img
As I install Solaris in the ldom, the server (control domain) dies after extracting a few hundred megabytes of a flash archive. I have traced this down to it running out of memory.
Here’s “vmstat 4″ output on the control domain console:
... 0 0 0 8636536 24776 1 70 0 0 0 0 0 0 0 0 0 1954 311 2244 0 23 76 0 0 0 8645096 22280 1 36 0 0 0 0 0 0 0 0 0 5957 370 9827 0 19 80 0 0 0 8649008 17720 1 40 0 296 313 0 44 0 0 0 0 9975 361 15877 0 24 75 0 0 0 8651104 15944 1 52 0 807 1671 0 700 0 0 0 0 10725 347 17545 0 26 74 0 0 0 8650800 18376 0 60 0 88 239 0 127 0 0 0 0 9816 391 15545 1 33 67 0 0 0 8640432 15936 0 76 0 497 3025 0 3874 0 0 0 0 11367 432 17975 0 35 65 0 0 0 8642968 17032 1 59 0 452 2028 0 842 0 0 0 0 10266 363 16127 0 27 73 kthr memory page disk faults cpu r b w swap free re mf pi po fr de sr m0 m1 m2 m1 in sy cs us sy id 0 0 0 8644768 15744 0 56 0 387 1298 0 126 0 0 0 0 10170 330 16355 0 24 75 0 0 0 8652504 18368 1 113 0 372 2462 0 273 0 0 0 0 11171 321 18613 0 35 65 0 0 0 8652832 15720 1 134 0 411 6081 0 738 0 0 0 0 11541 332 18979 0 34 66 0 0 0 8652232 14312 1 94 0 413 1806 0 7775 0 0 0 0 10718 358 18271 0 38 62 0 0 0 8647360 12592 18 133 9 555 5176 0 17490 1 0 1 0 10394 320 16970 1 37 63 0 0 0 8645248 14408 2 73 22 486 5039 0 3111 2 1 1 0 11749 383 18336 0 40 59 2 0 43 8641800 2784 1 148 99 1070 1517 0 53982 19 9 9 5 8316 356 14226 0 43 57 0 0 116 8647032 800 1 42 127 134 312 3688 76207 14 7 7 1 2153 114 3726 0 29 71
At this point the server froze. Note that 116 processes were swapped and the “de” column is 3688. Very bad news.
My initial thoughts were that I was running into some of the low-memory problems known to happen with the ZFS arc. This does not seem to be the case. According to mdb, the arc size is around 60 MB:
# mdb unix.3 vmcore.3 ... > arc::print -td size uint64_t size = 0t61455360
The control domain is S10 11/06 + 118833-36 + those required for ldoms + many others. The ldom is in the process of being installed is booted from a S10 11/06 netinstall image (118833-33).