hunting bugs in filebench February 16, 2006
Posted by mgerdts in Uncategorized.
I’ve been using filebench a bit at work and decided that I would like to try a few things out at home. My home machine is not quite as beefy as the V40z’s that I have been testing on at work.
Getting filebench to compile in the first place is a bit of work. Probably works really well on someone else’s system, but mine is obviously different. That’s another story though. After compiling filebench, I ran it for the first time and saw this:
$ /opt/filebench/bin/filebench
Segmentation fault (core dumped)
Bummer. Well, let’s see where that is at:
$ gdb /opt/filebench/bin/filebench core
GNU gdb 6.4-debian
. . .
(gdb) where
#0 0x37dd84aa in memset () from /lib/tls/i686/cmov/libc.so.6
#1 0x0807b01e in ?? ()
#2 0x080522da in ipc_init () at ipc.c:264
#3 0x08058bc1 in main (argc=1, argv=0x3f8fdcf4) at parser_gram.y:1140
OK, so let’s go with the assumption that the bug is in the code listed as alpha on the web site, and not libc. So we go up the stack a couple levels.
(gdb) up 2
#2 0x080522da in ipc_init () at ipc.c:264
264 memset(filebench_shm, 0, c2 - c1);
(gdb) print filebench_shm
$1 = (filebench_shm_t *) 0xffffffff
Hmmm… 0x with a bunch of f’s looks like -1. Perhaps some system call on Solaris (presumably where filebench started) returns NULL on error and on Linux it returns -1. Let’s go looking for that system call.
(gdb) list
259 #endif /* USE_PROCESS_MODEL */
260
261 c1 = (caddr_t)filebench_shm;
262 c2 = (caddr_t)&filebench_shm->marker;
263
264 memset(filebench_shm, 0, c2 - c1);
265 filebench_shm->epoch = gethrtime();
266 filebench_shm->debug_level = 2;
267 filebench_shm->string_ptr = &filebench_shm->strings[0];
268 filebench_shm->shm_ptr = (char *)filebench_shm->shm_addr;
Nope, not there. Maybe a bit further up.
(gdb) list 250
245 #endif
246
247 if ((filebench_shm = (filebench_shm_t *)mmap(0, sizeof(filebench_shm_t),
248 PROT_READ | PROT_WRITE,
249 MAP_SHARED, shmfd, 0)) == NULL) {
250 filebench_log(LOG_FATAL, "Cannot mmap shm");
251 exit(1);
252 }
253
254 #else
It looks like mmap may be the culprit. I first asked man, but this is Linux, not Solaris. No man page for mmap! Next I tried Google, which came up with this page that looks a lot like a man page. Why isn’t that found on my system? Another thing for another day. Anyway, it says:
RETURN VALUE
On success, mmap returns a pointer to the mapped area. On error, the value MAP_FAILED (that is, (void *) -1) is returned, and errno is set appropriately. On success, munmap returns 0, on failure -1, and errno is set (probably to EINVAL).
Ok, so it is returning -1 because it doesn’t like something. Let’s see what it is trying to mmap:
(gdb) print sizeof(filebench_shm_t)
$2 = 907368000
(gdb) print sizeof(filebench_shm_t) / 1024 / 1024
$3 = 865
(gdb)
That ’splains it. It looks like it is trying to set up a shared memory segment that is 865 MB. My poor little system only has 512.
FWIW, I have created a patch that addresses this one problem but I haven’t had a chance to test it on Solaris yet. Unfortunately, with the patch, it just tells me that the mmap failed. It doesn’t address the fact that it is trying to allocate a shared memory segment larger than the size of RAM on my system.
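The size check done by hand in gdb can also be done in code. What follows is only a sketch, not part of filebench or of the patch: the function names are made up for illustration, and `_SC_PHYS_PAGES` is an extension available on Linux (glibc) and Solaris rather than something POSIX guarantees. It also takes the working theory above at face value, that a request bigger than physical RAM is the problem.

```c
#include <unistd.h>

/* Whole mebibytes in `bytes`, the same arithmetic as the gdb session. */
unsigned long long bytes_to_mib(unsigned long long bytes)
{
    return bytes / 1024 / 1024;
}

/*
 * Physical RAM in bytes, or 0 if the system will not report it.
 * _SC_PHYS_PAGES is a non-POSIX extension (assumed available here).
 */
unsigned long long physical_ram_bytes(void)
{
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);

    if (pages <= 0 || page_size <= 0)
        return 0;
    return (unsigned long long)pages * (unsigned long long)page_size;
}

/* Returns 1 when `request` bytes is more than `ram` bytes (ram == 0 means unknown). */
int exceeds_ram(unsigned long long request, unsigned long long ram)
{
    return ram != 0 && request > ram;
}
```

Calling exceeds_ram(sizeof(filebench_shm_t), physical_ram_bytes()) before the mmap would at least allow a more useful message than a bare "Cannot mmap shm".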
Update 1:
I have posted several patches to the bug tracking system at sourceforge.net. This particular one is 1432638. It turns out that mmap on Solaris also returns MAP_FAILED so the patch is simpler than I originally expected.
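For the curious, the general shape of such a fix is to compare the result of mmap against MAP_FAILED rather than NULL. The following is a minimal sketch of that idea and not the actual patch; map_shared_region is a hypothetical helper, not a filebench function.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, not filebench code: map `size` bytes of the
 * already-open descriptor `fd` as shared memory. Returns the mapping,
 * or NULL after printing the reason when mmap() fails.
 */
void *map_shared_region(int fd, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* mmap() reports failure as MAP_FAILED ((void *)-1), not NULL. */
    if (p == MAP_FAILED) {
        fprintf(stderr, "Cannot mmap shm: %s\n", strerror(errno));
        return NULL;
    }
    return p;
}
```

Printing strerror(errno) is an addition of this sketch; it shows why the mapping failed instead of only that it did.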
Download and gunzip in one step February 1, 2006
Posted by mgerdts in Uncategorized. Tags: nexenta, opensolaris
I was feeling the need to take a look at Nexenta and decided that I wasn’t terribly interested in waiting for a download, then waiting for a gunzip. Why not do them both at the same time?
$ wget -O /dev/stdout \
http://www.gnusolaris.org/gsmirror/genunix.org/elatte_installcd_alpha2_i386.iso.gz \
| gunzip > elatte_installcd_alpha2_i386.iso
=> '/dev/stdout'
--20:21:47-- http://www.gnusolaris.org/gsmirror/genunix.org/elatte_installcd_alpha2_i386.iso.gz
Resolving www.gnusolaris.org... 216.129.112.21
Connecting to www.gnusolaris.org|216.129.112.21|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.genunix.org/distributions/gnusolaris/elatte_installcd_alpha2_i386.iso.gz [following]
--20:21:48-- http://www.genunix.org/distributions/gnusolaris/elatte_installcd_alpha2_i386.iso.gz
Resolving www.genunix.org... 204.152.191.100
Connecting to www.genunix.org|204.152.191.100|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 567,433,011 (541M) [text/plain]
13% [====> ] 77,025,880 359.25K/s ETA 22:27
Just 22 minutes to go. I guess at this rate I could have piped it through cdrecord with “speed=2”.
patch_order made easy January 14, 2006
Posted by mgerdts in Uncategorized.
Some of my most tedious times as a Solaris administrator have been when I needed to create a patch_order file for a custom patch cluster. For a long time I have intended to just write a script…
But now, I don’t have to do that any longer! Today I discovered that smpatch(1M) now has an order subcommand. This makes it really quite simple for me to create a patch_order file for a very long list of patches. In this example, I create the patch_order file for the patches in the Solaris 10 Update 1 UpgradePatches directory:
# cd /mnt/Solaris_10/UpgradePatches
# ls > /tmp/patches
# smpatch order -d `pwd` -x idlist=/tmp/patches > /tmp/patch_order
Now, if you want to go the full length and create a patch cluster for it:
# mkdir /tmp/10U1_UpgradePatches
# cd /tmp/10U1_UpgradePatches
# mv /tmp/patch_order .
# ln -s /mnt/Solaris_10/UpgradePatches/* .
# cp /somewhere/10_Recommended/install_cluster .
Modify the SUPPLEMENT_NAME="…" line in install_cluster to be more descriptive for this patch cluster. Be sure not to use characters like /, \, |, etc.
# cd /tmp
# zip -rq 10U1_UpgradePatches.zip 10U1_UpgradePatches
At this point, you can copy 10U1_UpgradePatches.zip around to your various machines and use it just like you would a 10_Recommended bundle.
Enjoy!
My first SUPerG April 23, 2005
Posted by mgerdts in Uncategorized.
I had the privilege of attending SUPerG this week. Some of the highlights of the presentations come in the form of short quotes from some of the speakers.
Volume managers are shims that were introduced to work around bugs that should have been fixed in the file system. – Richard McDougall
And finally, someone of authority that has said what I have been trying to get across for a couple years:
IO Wait == Idle – Glenn Fawcett
Overall I was surprised by the amount of emphasis that Sun has put on the future of storage. With the combination of 10 Gigabit Ethernet, FireEngine, and NFSv4, Sun very clearly stated that the future of storage is in NAS. The story is pretty well outlined by Richard McDougall in this blog entry.
I was happy to see that the first Niagara systems seem to be ahead of schedule. Within the next 8 months I should be able to get a single-processor SPARC box that is more powerful than a 12-processor 4800 or any commodity box sporting chips from Intel or AMD. Way cool. The thing with silicon that really blew me away was Rock’s hardware scout. This chip is able to look a thousand or more instructions ahead to optimize cache loads. The net effect is that you can get by with a lot less cache.
And for those of us who are a couple of years into the life of 15k’s, the good news keeps coming. First, the upgrade to UltraSPARC IV boards brings about a 90% performance improvement, and the UltraSPARC IV+ will add another 70%. With an expected July release of the US IV+, this means that through periodic system board (and carrier plate…) upgrades, the 15k (and 4800, 6800, 12k, etc.) is keeping pace with Moore’s Law without a forklift upgrade. Very impressive.
There was also a lot of talk about various Solaris 10 features… however, things like zones, dtrace, zfs, etc. have been covered so consistently, and for so long, in the press and in many data centers that there was not a whole lot new to say. Putting all of the improvements in Solaris together with advances in other technologies (10 Gigabit Ethernet, multiple vendors with NFSv4, CMT, …) really shows that there is a bright future for Solaris.