[lug] Horrid NFS performance on linux

Jack Swope jhswope at gmail.com
Thu Apr 18 11:32:23 MDT 2013


We have an NFS server which is performing at a horrid level.  The
filebench_networkfs test run via nfsometer reports 0.2 to 0.3 MB/sec.  A
base non-optimized server easily hits 25 MB/sec.

We have tested using samba to mount the shares, and performance via samba is
25 MB/sec.  The network is all gigabit from machine to server, which is borne
out by the samba performance.  During a test cycle the CPU load only increases
by 0.2 to 0.3.  Memory usage remains flat, and the iostat output reported
below shows that it isn't IO bound.  Not CPU, not IO, not network as far as
we can see.

Problem symptoms:
Performance: 0.2 to 0.3 MB/sec using nfsometer filebench_networkfs; 300 sec of
work takes 2 hours.
Client sign: every rpc call has a matching rpc authrefrsh (see the client
stats from nfsstat below).
Affects all Linux and Solaris machines in the office.
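
To put the symptom in numbers, here is a minimal back-of-the-envelope sketch that uses only the figures quoted above. The variable names are illustrative and not taken from any tool.

```python
# Figures quoted in this post; the names are illustrative only.
observed_mb_s = (0.2, 0.3)      # nfsometer filebench_networkfs over NFS
baseline_mb_s = 25.0            # untuned NFS server, and samba on this server
nominal_run_s = 300             # length of the filebench workload
actual_run_s = 2 * 60 * 60      # wall-clock time actually observed

# How far below the baseline the observed throughput sits.
throughput_gap = tuple(baseline_mb_s / x for x in observed_mb_s)
# How much longer the run takes than its nominal length.
runtime_slowdown = actual_run_s / nominal_run_s

print(f"throughput is {throughput_gap[1]:.0f}x to {throughput_gap[0]:.0f}x below baseline")
print(f"run takes {runtime_slowdown:.0f}x longer than nominal")
```

The two ratios measure different things (bytes moved versus wall-clock time), so they are not expected to agree exactly.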

Attempted solutions:

Changed rsize and wsize to 8192, 32768, 1048576
Changed atime to relatime, atime, noatime
Changed acl to acl and noacl
Changed proto= to tcp and udp

Mixed and matched the above settings, changing one at a time, with no change
in performance.
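
For anyone wanting to repeat the sweep systematically, the following is a rough sketch that enumerates the full cross product of the values listed above. It assumes all four settings were applied as client-side mount options, which the post does not state, and the export and mount point are hypothetical placeholders. It only prints command lines and does not mount anything.

```python
from itertools import product

# Hypothetical placeholders, not the real paths from this setup.
EXPORT = "server:/export"
MOUNTPOINT = "/mnt/nfstest"

# Values tried in this post.
sizes = [8192, 32768, 1048576]            # used for both rsize and wsize
atime_opts = ["relatime", "atime", "noatime"]
acl_opts = ["acl", "noacl"]
protos = ["tcp", "udp"]

def mount_commands():
    """Yield one mount command line per combination of the options above."""
    for size, atime, acl, proto in product(sizes, atime_opts, acl_opts, protos):
        opts = f"rsize={size},wsize={size},{atime},{acl},proto={proto}"
        yield f"mount -t nfs -o {opts} {EXPORT} {MOUNTPOINT}"

commands = list(mount_commands())
print(len(commands), "combinations")
print(commands[0])
```

The full matrix is larger than the one-at-a-time changes described above, so it is only worth running if each benchmark pass is cut short.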

Am at my wit's end and preparing to pour a nice cold Coke into the rack to
see if it solves the problem.

Any further suggestions?

Jack


Stats on the server:

Fedora box uname output:
2.6.31.12-174.2.3.fc12.x86_64 #1 SMP Mon Jan 18 19:52:07 UTC 2010 x86_64
x86_64 x86_64 GNU/Linux

nfs-utils version nfs-utils.x86_64  1:1.2.1-6.fc12
NFS versions 2,3, and 4


/proc/cpuinfo: there are 2 quad-core processors for a total of 8 cores

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping        : 5
cpu MHz         : 2260.663
cache size      : 8192 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp
lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable
nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca
sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4521.32
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

/proc/meminfo

cat /proc/meminfo
MemTotal:       33010388 kB
MemFree:          212380 kB
Buffers:         2845004 kB
Cached:         20695400 kB
SwapCached:        11808 kB
Active:         19020876 kB
Inactive:        4964404 kB
Active(anon):     380308 kB
Inactive(anon):    64928 kB
Active(file):   18640568 kB
Inactive(file):  4899476 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      104856120 kB
SwapFree:       104621616 kB
Dirty:               332 kB
Writeback:             0 kB
AnonPages:        438860 kB
Mapped:            21348 kB
Slab:            8031632 kB
SReclaimable:    7206372 kB
SUnreclaim:       825260 kB
PageTables:        33672 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    121361312 kB
Committed_AS:    1682728 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      115536 kB
VmallocChunk:   34359607531 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        7552 kB
DirectMap2M:    33538048 kB

Disk storage is a software RAID mirror of two 1.5TB Seagate drives.


===============================
Server iostat at 10 second interval on raid device

04/17/2013 11:27:51 AM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.31    0.00    2.28   10.84    0.00   86.57

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
md8             565.70         1.60      4524.00         16      45240

04/17/2013 11:28:01 AM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.06    0.00    1.00   11.81    0.00   87.14

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
md8             445.20         0.00      3561.60          0      35616

04/17/2013 11:28:11 AM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.16    0.01    1.56   12.22    0.00   86.04

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
md8             336.00         0.00      2688.00          0      26880
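
As a sanity check on the iostat samples above, this small sketch converts the block counts into MB/s and an average request size. It assumes the 512-byte block unit that the iostat man page documents for 2.4 and later kernels. The md8 figures cover all clients of the array, not just the test client.

```python
SECTOR_BYTES = 512  # assumption: iostat "Blk" units are 512-byte sectors

# (tps, Blk_wrtn/s) for md8 from the three samples above
samples = [(565.70, 4524.00), (445.20, 3561.60), (336.00, 2688.00)]

results = []
for tps, blk_wrtn_s in samples:
    mb_per_s = blk_wrtn_s * SECTOR_BYTES / 1e6                  # write bandwidth
    kib_per_transfer = blk_wrtn_s * SECTOR_BYTES / tps / 1024   # average request size
    results.append((mb_per_s, kib_per_transfer))
    print(f"{mb_per_s:.2f} MB/s written, {kib_per_transfer:.1f} KiB per transfer")
```

Under that assumption the array is writing somewhere between about 1.4 and 2.3 MB/s, in requests of roughly 4 KiB each.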

===================
Server nfsstat for version 3 at 10 second interval

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
2271       0          0          0          0

Server nfs v3:
null         getattr      setattr      lookup       access       readlink
0         0% 154       6% 5         0% 102       4% 86        3% 0         0%
read         write        create       mkdir        symlink      mknod
10        0% 1680     74% 94        4% 0         0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
1         0% 0         0% 1         0% 0         0% 0         0% 0         0%
fsstat       fsinfo       pathconf     commit
0         0% 0         0% 0         0% 117       5%

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
1970       0          0          0          0

Server nfs v3:
null         getattr      setattr      lookup       access       readlink
0         0% 179       9% 11        0% 82        4% 94        4% 0         0%
read         write        create       mkdir        symlink      mknod
29        1% 1379     70% 76        3% 0         0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
0         0% 0         0% 3         0% 0         0% 0         0% 0         0%
fsstat       fsinfo       pathconf     commit
0         0% 0         0% 0         0% 96        4%

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
1829       0          0          0          0

Server nfs v3:
null         getattr      setattr      lookup       access       readlink
0         0% 178       9% 10        0% 87        4% 171       9% 0         0%
read         write        create       mkdir        symlink      mknod
106       5% 1094     60% 52        2% 0         0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
21        1% 0         0% 3         0% 0         0% 0         0% 0         0%
fsstat       fsinfo       pathconf     commit
2         0% 0         0% 0         0% 83        4%
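
The server counters above can be turned into per-second rates with a few lines of arithmetic. This is a minimal sketch that assumes each sample covers exactly the stated 10 second interval. The tuples are copied from the three samples, and the dictionary keys are illustrative.

```python
INTERVAL_S = 10  # nfsstat sampling interval stated above

# (total rpc calls, write, commit) from the three server samples above
server_samples = [(2271, 1680, 117), (1970, 1379, 96), (1829, 1094, 83)]

rates = []
for calls, writes, commits in server_samples:
    rates.append({
        "calls_per_s": calls / INTERVAL_S,
        "writes_per_s": writes / INTERVAL_S,
        "writes_per_commit": writes / commits,
    })

for r in rates:
    print(f"{r['calls_per_s']:.1f} calls/s, {r['writes_per_s']:.1f} writes/s, "
          f"{r['writes_per_commit']:.1f} writes per commit")
```

These are totals across every client talking to the server, so they are an upper bound on what the test client alone contributes.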

==========================
Client nfsstat for version 3 at 10 second interval
Client rpc stats:
calls      retrans    authrefrsh
461        0          461

Client nfs v3:
null         getattr      setattr      lookup       access       readlink
0         0% 0         0% 0         0% 111      24% 129      27% 0         0%
read         write        create       mkdir        symlink      mknod
0         0% 110      23% 111      24% 0         0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
0         0% 0         0% 0         0% 0         0% 0         0% 0         0%
fsstat       fsinfo       pathconf     commit
0         0% 0         0% 0         0% 0         0%

Client rpc stats:
calls      retrans    authrefrsh
366        0          366

Client nfs v3:
null         getattr      setattr      lookup       access       readlink
0         0% 0         0% 0         0% 90       24% 95       25% 0         0%
read         write        create       mkdir        symlink      mknod
0         0% 91       24% 90       24% 0         0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
0         0% 0         0% 0         0% 0         0% 0         0% 0         0%
fsstat       fsinfo       pathconf     commit
0         0% 0         0% 0         0% 0         0%

Client rpc stats:
calls      retrans    authrefrsh
350        0          350

Client nfs v3:
null         getattr      setattr      lookup       access       readlink
0         0% 0         0% 0         0% 81       23% 106      30% 0         0%
read         write        create       mkdir        symlink      mknod
0         0% 82       23% 81       23% 0         0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
0         0% 0         0% 0         0% 0         0% 0         0% 0         0%
fsstat       fsinfo       pathconf     commit
0         0% 0         0% 0         0% 0         0%
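
The client samples above can be summarized the same way. This sketch checks the authrefrsh-to-calls ratio and estimates the time per RPC. The per-call figure rests on an assumption that is not confirmed anywhere in the post: that the benchmark issues its RPCs one after another rather than in parallel.

```python
INTERVAL_S = 10  # nfsstat sampling interval stated above

# (calls, retrans, authrefrsh) from the three client samples above
client_samples = [(461, 0, 461), (366, 0, 366), (350, 0, 350)]

summary = []
for calls, retrans, authrefrsh in client_samples:
    summary.append({
        "authrefrsh_ratio": authrefrsh / calls,
        "retrans": retrans,
        # Assumption: calls are serial, so interval / calls approximates
        # the average time spent on each call.
        "ms_per_call": INTERVAL_S * 1000 / calls,
    })

for s in summary:
    print(f"authrefrsh/calls = {s['authrefrsh_ratio']:.2f}, "
          f"retrans = {s['retrans']}, ~{s['ms_per_call']:.1f} ms per call")
```

If the serial assumption holds, each call takes somewhere in the 22 to 29 ms range, and the zero retrans count suggests the client is not resending requests.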



=============================
tcpdump from server: a constant stream of these packets

    bot.xxx.xxx.nfs > xxx.xxx.xxx.x.1678097707: reply ok 116 lookup ERROR:
        No such file or directory post dattr: DIR 40755 ids 65534/65534
        sz 4096 nlink 2 rdev 0/0 fsid 7113a8dfa6bdf8bd fileid 49a0065
        a/m/ctime 1366219123.511214381 1366219132.321214505
        1366219132.321214505
    bot.xxx.xxx.nfs > xxx.xxx.xxx.x.2030419243: reply ok 116 lookup ERROR:
        No such file or directory post dattr: DIR 40755 ids 65534/65534
        sz 4096 nlink 2 rdev 0/0 fsid 7113a8dfa6bdf8bd fileid 49a0170
        a/m/ctime 1366219123.176214263 1366219242.654078174
        1366219242.654078174
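
The a/m/ctime fields in the replies above are epoch timestamps. A small sketch for turning them into readable UTC times follows. The dictionary keys are the fileid values from the two packets, and the helper name is made up for illustration.

```python
from datetime import datetime, timezone

# a/m/ctime triples copied from the two tcpdump replies above, keyed by fileid
replies = {
    "49a0065": ("1366219123.511214381", "1366219132.321214505", "1366219132.321214505"),
    "49a0170": ("1366219123.176214263", "1366219242.654078174", "1366219242.654078174"),
}

def to_utc(stamp: str) -> datetime:
    """Convert 'seconds.nanoseconds' to a UTC datetime (whole seconds only)."""
    seconds = int(stamp.split(".")[0])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

for fileid, (atime, mtime, ctime) in replies.items():
    gap = (to_utc(mtime) - to_utc(atime)).total_seconds()
    print(fileid, to_utc(atime).isoformat(), to_utc(mtime).isoformat(),
          f"mtime is {gap:.0f}s after atime")
```

Both directories were last modified on 2013-04-17 UTC, the same day as the iostat samples, which would fit them being working directories of the benchmark run, though the post does not say so.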