[lug] q: multithreading on smp weirdness

Chan Kar Heng karheng at softhome.net
Thu Dec 9 10:34:18 MST 2004


i'm back.. only managed to secure the server today. sigh.

>...
>>in my test app, i actually started out without the mmap
>>function calls.
>>i read somewhere that only 1 thread can be in the kernel
>>at 1 time, & was wondering whether these mmap calls
>>would cause these situations.
>
>Couldn't tell you. One rule of thumb I have though is that things telling you "it shouldn't happen" are often wrong, and more theory than reality (or marketing). I would guess that the kernel version and thread library versions make a difference in behavior of multiple threads using one mmap'd memory block.

yeap... agree... and another rule of thumb i have is that computers
don't lie (it must be a user mistake, unless hardware is faulty).
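
for reference, the pattern i'm worried about is basically several
threads touching one mmap'd block. i won't paste the real test app
here; the snippet below is just a minimal sketch of that pattern
(names, sizes & counts are made up, it's not the actual code):

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <stdio.h>
    #include <sys/mman.h>

    #define NTHREADS    2
    #define REGION_SIZE (1 << 20)      /* 1 MB shared block */
    #define ITERS       1000000L

    static char *region;               /* the mmap'd block */
    static long *counter;              /* shared word inside it */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        long id = (long)arg;
        long i;
        /* each thread scribbles over a bit of its own chunk of the
           block, & bumps one shared counter under the lock */
        char *mine = region + sizeof(long)
                     + id * ((REGION_SIZE - sizeof(long)) / NTHREADS);
        for (i = 0; i < ITERS; i++) {
            mine[i % 4096]++;
            pthread_mutex_lock(&lock);
            (*counter)++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tids[NTHREADS];
        long i;

        /* anonymous mapping; the threads already share the address
           space, so MAP_SHARED vs MAP_PRIVATE only matters across
           fork() */
        region = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (region == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        counter = (long *)region;

        for (i = 0; i < NTHREADS; i++)
            pthread_create(&tids[i], NULL, worker, (void *)i);
        for (i = 0; i < NTHREADS; i++)
            pthread_join(tids[i], NULL);

        printf("counter = %ld (expect %ld)\n",
               *counter, NTHREADS * ITERS);
        munmap(region, REGION_SIZE);
        return 0;
    }

(built with gcc -pthread; the real app obviously does more than
this, this is just the shape i mean by threads sharing an mmap'd
block.)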


>Most of the gotchas for thread performance seem to fall into 3 categories: (a) locking/synchronizations (correctly implemented), (b) race conditions (incorrectly implemented locking/synchronizations), and (c) reentrant code eating up huge amounts of memory and/or time for thread-specific data when multiple threads hit the same function at the same time (note that anything just plain thread-safe is not necessarily reentrant, and simple thread-safe locking is part of the correctly implemented synchronization).

agree. thread safe != reentrant. :)
& the rest of the good stuff too.
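
just to spell out that distinction for the archives, here's a
generic illustration (nothing to do with my test app):

    #include <pthread.h>
    #include <stdio.h>

    static int counter;
    static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

    /* thread-safe: the lock serializes concurrent callers.
       NOT reentrant: if the same thread re-enters it (say, from a
       signal handler) while the lock is held, it deadlocks on its
       own non-recursive mutex. */
    int next_id_locked(void)
    {
        int id;
        pthread_mutex_lock(&counter_lock);
        id = ++counter;
        pthread_mutex_unlock(&counter_lock);
        return id;
    }

    /* reentrant: no hidden shared state at all; the caller owns the
       storage, so it can be interrupted & re-entered safely. */
    int next_id_r(int *my_counter)
    {
        return ++(*my_counter);
    }

    int main(void)
    {
        int mine = 0;
        printf("%d %d\n", next_id_locked(), next_id_r(&mine));
        return 0;
    }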


>If you profile, you'll see where the time is involved. See if one of the above 3 can in any way be related to the high time consumer.

i've profiled the app & attached the results to this mail (thrddbg.zip).
they don't point to any particularly unexpected place.
the weird thing is that when i made the threads sync over 6000 times,
the app's elapsed time actually got shorter (12 secs);
when i made it sync only once, the elapsed time of the test app
was 17 secs.
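
if anyone wants to reproduce the shape of the measurement, a
stripped-down stand-in (definitely not my actual test app) could
look something like this: the argument picks how many times each
thread grabs & releases a shared lock, & elapsed time is taken
around thread create/join:

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>

    #define NTHREADS 2
    #define ITERS    60000000L

    static pthread_mutex_t sync_lock = PTHREAD_MUTEX_INITIALIZER;
    static long nsyncs;           /* sync points per thread */
    static volatile double sink;  /* keeps the busy loop alive */

    static void *worker(void *arg)
    {
        long i;
        long stride = 1;
        double x = 0.0;

        if (nsyncs > 0 && nsyncs <= ITERS)
            stride = ITERS / nsyncs;

        for (i = 0; i < ITERS; i++) {
            x += i * 0.5;                        /* stand-in for work */
            if (nsyncs > 0 && i % stride == 0) { /* a "sync" */
                pthread_mutex_lock(&sync_lock);
                pthread_mutex_unlock(&sync_lock);
            }
        }
        sink = x;
        return NULL;
    }

    int main(int argc, char **argv)
    {
        pthread_t tids[NTHREADS];
        struct timeval t0, t1;
        long i;
        double elapsed;

        nsyncs = (argc > 1) ? atol(argv[1]) : 1;

        gettimeofday(&t0, NULL);
        for (i = 0; i < NTHREADS; i++)
            pthread_create(&tids[i], NULL, worker, NULL);
        for (i = 0; i < NTHREADS; i++)
            pthread_join(tids[i], NULL);
        gettimeofday(&t1, NULL);

        elapsed = (t1.tv_sec - t0.tv_sec)
                  + (t1.tv_usec - t0.tv_usec) / 1e6;
        printf("%d threads, %ld syncs each: %.2f s elapsed\n",
               NTHREADS, nsyncs, elapsed);
        return 0;
    }

running it as "./sync_test 1" vs "./sync_test 6000" is the kind of
comparison i mean.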

i've managed to test the app on a few other dual itanium machines
too.

server A:
(it's the one i tested earlier on & the machine where i initially
saw this multithreading weirdness)
has 4 itanium CPUs.
has IRQ requests distributed across CPUs.
runs Red Hat Advanced Server 2.1.
"kernel version 2.4.18-e.12smp #1 SMP Thu Oct 17 15:13:01 EDT 2002 ia64 unknown"

server B:
(the server i tested on after server A; no mthread weirdness;
increasing thread syncs simply increased the overhead,
which was on the order of double-digit milliseconds).
has 2 itanium CPUs.
has IRQ requests distributed across CPUs.
runs RH AS 3.
"kernel version 2.4.21-20.EL #1 SMP Wed Aug 18 20:30:22 EDT 2004 ia64 ia64 ia64 GNU/Linux"

server C:
(a new one i found for testing; no mthread weirdness).
has 2 itanium CPUs.
has IRQ requests distributed across CPUs.
runs RH AS 3.
"kernel version 2.4.21-15.EL #1 SMP Thu Apr 22 00:13:07 EDT 2004 ia64 ia64 ia64 GNU/Linux"

server D:
(another new one i found for testing. the best part is
i managed to test on this server twice: once when it
was running RH AS 2.1, and again after it was upgraded
to RH AS 3; in both cases it had the multithreading
weirdness i described earlier).
has 2 itanium CPUs.
doesn't have IRQ requests distributed across CPUs.
(i didn't think to get the kernel version when it was running
RH AS 2.1)
when running RH AS 3, the kernel version is
"2.4.21-4.EL #1 SMP Fri Oct 3 17:29:39 EDT 2003 ia64 ia64 ia64 GNU/Linux"


so far it seems that on kernels up to and including
"2.4.21-4.EL #1 SMP Fri Oct 3 17:29:39 EDT 2003 ia64 ia64 ia64 GNU/Linux",
the test app doesn't multithread properly...
& on kernels from
"2.4.21-15.EL #1 SMP Thu Apr 22 00:13:07 EDT 2004 ia64 ia64 ia64 GNU/Linux"
onwards, it multithreads correctly.
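
since the earlier advice pointed at kernel & thread library
versions, the snippet below is the sort of thing i can run on each
box to record both in one go (assuming glibc; the
_CS_GNU_LIBPTHREAD_VERSION query only exists on newer glibcs, hence
the guard, so the older boxes will probably just say unknown):

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/utsname.h>

    int main(void)
    {
        struct utsname u;
        char libpthread[128] = "unknown";

        if (uname(&u) == 0)
            printf("kernel: %s %s %s\n",
                   u.sysname, u.release, u.version);

    #ifdef _CS_GNU_LIBPTHREAD_VERSION
        /* glibc reports which thread library is in use */
        confstr(_CS_GNU_LIBPTHREAD_VERSION,
                libpthread, sizeof libpthread);
    #endif
        printf("thread library: %s\n", libpthread);
        return 0;
    }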


it's driving me bonkers.

anyway, this assignment of mine is being put on hold
for another task.

thanks for all the wonderful suggestions & insight.
i learnt a thing or two. :)

rgds,

kh
-------------- next part --------------
A non-text attachment was scrubbed...
Name: thrddbg.zip
Type: application/zip
Size: 13472 bytes
Desc: not available
URL: <http://lists.lug.boulder.co.us/pipermail/lug/attachments/20041210/100d7006/attachment.zip>

