[lug] Horrid NFS performance on linux
Jack Swope
jhswope at gmail.com
Thu Apr 18 11:32:23 MDT 2013
We have an NFS server which is performing at a horrid level.
The filebench_networkfs test, run via nfsometer, is reporting 0.2 to 0.3
MB/sec. A base non-optimized server easily hits 25 MB/sec.
We have tested using Samba to mount the shares, and performance via Samba is
25 MB/sec. The network is all gigabit from machine to server, which is borne
out by the Samba performance. During a test cycle the CPU load only increases
by 0.2 to 0.3. Memory usage remains flat, and the iostat output reported
below shows that it isn't I/O bound. Not CPU, not I/O, not network, as far as
we can see.
Problem symptoms:
Performance: 0.2 to 0.3 MB/sec using nfsometer filebench_networkfs; 300 sec
of work takes 2 hours.
Client sign: every rpc call has a matching rpc authrefrsh (see the client
stats from nfsstat below).
Affects all Linux and Solaris machines in the office.
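To put the symptom figures side by side, here is a small sketch that only redoes the arithmetic on the numbers quoted above. The variable names are made up for illustration; nothing here comes from nfsometer itself.

```python
# Figures quoted in the post; variable names are illustrative only.
baseline_mb_s = 25.0            # what a base, non-optimized server reaches
observed_mb_s = (0.2, 0.3)      # range reported by filebench_networkfs
expected_runtime_s = 300        # nominal length of the workload
observed_runtime_s = 2 * 60 * 60  # "takes 2 hours"

# How many times slower than the baseline, for each end of the range.
throughput_ratio = tuple(baseline_mb_s / v for v in observed_mb_s)
# How many times longer the run takes than its nominal length.
runtime_ratio = observed_runtime_s / expected_runtime_s

print(f"throughput is {throughput_ratio[1]:.0f}x to {throughput_ratio[0]:.0f}x below baseline")
print(f"runtime is {runtime_ratio:.0f}x the nominal 300 sec")
```

The two ratios measure different things (sustained MB/sec versus wall-clock time for the whole run), so they need not agree with each other.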
Attempted solutions:
Changed rsize and wsize to 8192, 32768, and 1048576
Changed atime handling: relatime, atime, noatime
Changed acl: acl and noacl
Changed proto: tcp and udp
Mixed and matched the above settings, changing one at a time, with no change
in performance.
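For anyone wanting to repeat the sweep methodically, the settings listed above can be enumerated as a matrix. This is only a sketch: it assumes rsize and wsize are always changed together, and it does not say whether each option belongs on the client mount or on the server's local filesystem, since the post does not state that.

```python
from itertools import product

# Values taken from the list of attempted settings above.
SIZES = (8192, 32768, 1048576)       # assumption: rsize and wsize set to the same value
ATIME = ("relatime", "atime", "noatime")
ACL = ("acl", "noacl")
PROTO = ("tcp", "udp")

def option_sets():
    """Yield every combination of the four settings as a dict."""
    for size, atime, acl, proto in product(SIZES, ATIME, ACL, PROTO):
        yield {"rsize": size, "wsize": size, "atime": atime, "acl": acl, "proto": proto}

combos = list(option_sets())
print(f"{len(combos)} combinations to test")
for combo in combos[:3]:
    print(combo)
```

Recording one benchmark result per dict makes it easy to see afterwards which combinations were actually covered.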
Am at my wit's end and preparing to pour a nice cold Coke into the rack to
see if it solves the problem.
Any further suggestions?
Jack
Stats on the server:
Fedora box uname output:
2.6.31.12-174.2.3.fc12.x86_64 #1 SMP Mon Jan 18 19:52:07 UTC 2010 x86_64
x86_64 x86_64 GNU/Linux
nfs-utils version nfs-utils.x86_64 1:1.2.1-6.fc12
NFS versions 2,3, and 4
/proc/cpuinfo: there are 2 quad-core processors for a total of 8 cores
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
stepping : 5
cpu MHz : 2260.663
cache size : 8192 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp
lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable
nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca
sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips : 4521.32
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
/proc/meminfo
cat /proc/meminfo
MemTotal: 33010388 kB
MemFree: 212380 kB
Buffers: 2845004 kB
Cached: 20695400 kB
SwapCached: 11808 kB
Active: 19020876 kB
Inactive: 4964404 kB
Active(anon): 380308 kB
Inactive(anon): 64928 kB
Active(file): 18640568 kB
Inactive(file): 4899476 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 104856120 kB
SwapFree: 104621616 kB
Dirty: 332 kB
Writeback: 0 kB
AnonPages: 438860 kB
Mapped: 21348 kB
Slab: 8031632 kB
SReclaimable: 7206372 kB
SUnreclaim: 825260 kB
PageTables: 33672 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 121361312 kB
Committed_AS: 1682728 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 115536 kB
VmallocChunk: 34359607531 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 7552 kB
DirectMap2M: 33538048 kB
Disk storage is software RAID, mirrored across two 1.5TB Seagate drives.
===============================
Server iostat at 10 second intervals on the raid device
04/17/2013 11:27:51 AM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.31    0.00    2.28   10.84    0.00   86.57
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
md8             565.70         1.60      4524.00         16      45240
04/17/2013 11:28:01 AM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.06    0.00    1.00   11.81    0.00   87.14
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
md8             445.20         0.00      3561.60          0      35616
04/17/2013 11:28:11 AM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.16    0.01    1.56   12.22    0.00   86.04
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
md8             336.00         0.00      2688.00          0      26880
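The iostat samples above can be turned into byte rates and an average transfer size. The sketch below assumes that iostat "blocks" are 512-byte sectors, which is what the sysstat documentation describes for 2.4 and later kernels; the function name is made up for illustration. On these three samples it works out to roughly 1.4 to 2.3 MB/sec written, at about 4 KiB per transfer.

```python
SECTOR_BYTES = 512  # assumption: iostat "blocks" are 512-byte sectors

# (tps, Blk_read/s, Blk_wrtn/s) for md8, copied from the three samples above
samples = [
    (565.70, 1.60, 4524.00),
    (445.20, 0.00, 3561.60),
    (336.00, 0.00, 2688.00),
]

def summarize(tps, blk_read_s, blk_wrtn_s):
    """Return (write MB/sec, average KiB per transfer) for one sample."""
    write_mb_s = blk_wrtn_s * SECTOR_BYTES / 1_000_000
    kib_per_transfer = (blk_read_s + blk_wrtn_s) / tps * SECTOR_BYTES / 1024
    return write_mb_s, kib_per_transfer

results = [summarize(*s) for s in samples]
for write_mb_s, kib in results:
    print(f"{write_mb_s:.2f} MB/sec written, {kib:.2f} KiB per transfer")
```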
===================
Server nfsstat for version 3 at 10 second intervals
Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
2271       0          0          0          0
Server nfs v3:
null         getattr      setattr      lookup       access       readlink
0         0% 154       6% 5         0% 102       4% 86        3% 0         0%
read         write        create       mkdir        symlink      mknod
10        0% 1680     74% 94        4% 0         0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
1         0% 0         0% 1         0% 0         0% 0         0% 0         0%
fsstat       fsinfo       pathconf     commit
0         0% 0         0% 0         0% 117       5%
Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
1970       0          0          0          0
Server nfs v3:
null         getattr      setattr      lookup       access       readlink
0         0% 179       9% 11        0% 82        4% 94        4% 0         0%
read         write        create       mkdir        symlink      mknod
29        1% 1379     70% 76        3% 0         0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
0         0% 0         0% 3         0% 0         0% 0         0% 0         0%
fsstat       fsinfo       pathconf     commit
0         0% 0         0% 0         0% 96        4%
Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
1829       0          0          0          0
Server nfs v3:
null         getattr      setattr      lookup       access       readlink
0         0% 178       9% 10        0% 87        4% 171       9% 0         0%
read         write        create       mkdir        symlink      mknod
106       5% 1094     60% 52        2% 0         0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
21        1% 0         0% 3         0% 0         0% 0         0% 0         0%
fsstat       fsinfo       pathconf     commit
2         0% 0         0% 0         0% 83        4%
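The server-side counters above are easier to compare as per-second rates. The sketch below assumes each nfsstat sample is the delta for one 10 second interval rather than a running total; the dict keys and function name are illustrative.

```python
INTERVAL_S = 10  # assumption: each sample covers one 10 second interval

# Total rpc calls plus the write and commit counts from the three samples above
samples = [
    {"calls": 2271, "write": 1680, "commit": 117},
    {"calls": 1970, "write": 1379, "commit": 96},
    {"calls": 1829, "write": 1094, "commit": 83},
]

def rates(sample):
    """Convert one interval's counters into per-second rates and a write/commit ratio."""
    return {
        "calls_per_s": sample["calls"] / INTERVAL_S,
        "writes_per_s": sample["write"] / INTERVAL_S,
        "writes_per_commit": sample["write"] / sample["commit"],
    }

server_rates = [rates(s) for s in samples]
for r in server_rates:
    print(f"{r['calls_per_s']:.1f} calls/sec, {r['writes_per_s']:.1f} writes/sec, "
          f"{r['writes_per_commit']:.1f} writes per commit")
```

Note that these are totals for every client talking to the server during the interval, not for the one benchmark client alone.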
==========================
Client nfsstat for version 3 at 10 second intervals
Client rpc stats:
calls      retrans    authrefrsh
461        0          461
Client nfs v3:
null         getattr      setattr      lookup       access       readlink
0         0% 0         0% 0         0% 111      24% 129      27% 0         0%
read         write        create       mkdir        symlink      mknod
0         0% 110      23% 111      24% 0         0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
0         0% 0         0% 0         0% 0         0% 0         0% 0         0%
fsstat       fsinfo       pathconf     commit
0         0% 0         0% 0         0% 0         0%
Client rpc stats:
calls      retrans    authrefrsh
366        0          366
Client nfs v3:
null         getattr      setattr      lookup       access       readlink
0         0% 0         0% 0         0% 90       24% 95       25% 0         0%
read         write        create       mkdir        symlink      mknod
0         0% 91       24% 90       24% 0         0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
0         0% 0         0% 0         0% 0         0% 0         0% 0         0%
fsstat       fsinfo       pathconf     commit
0         0% 0         0% 0         0% 0         0%
Client rpc stats:
calls      retrans    authrefrsh
350        0          350
Client nfs v3:
null         getattr      setattr      lookup       access       readlink
0         0% 0         0% 0         0% 81       23% 106      30% 0         0%
read         write        create       mkdir        symlink      mknod
0         0% 82       23% 81       23% 0         0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
0         0% 0         0% 0         0% 0         0% 0         0% 0         0%
fsstat       fsinfo       pathconf     commit
0         0% 0         0% 0         0% 0         0%
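The "every call has a matching authrefrsh" observation can be checked mechanically against the client samples above. This is a sketch with a made-up helper that parses the two-line header/value layout nfsstat prints; it also confirms that the per-operation counts add up to the rpc call total in each interval.

```python
def parse_rpc_stats(header_line, value_line):
    """Pair the column names of an nfsstat rpc block with their integer values."""
    return dict(zip(header_line.split(), map(int, value_line.split())))

HEADER = "calls retrans authrefrsh"
rpc_samples = [parse_rpc_stats(HEADER, v) for v in ("461 0 461", "366 0 366", "350 0 350")]

# Non-zero NFSv3 operation counts from the same three client samples
op_samples = [
    {"lookup": 111, "access": 129, "write": 110, "create": 111},
    {"lookup": 90, "access": 95, "write": 91, "create": 90},
    {"lookup": 81, "access": 106, "write": 82, "create": 81},
]

authrefrsh_ratios = [s["authrefrsh"] / s["calls"] for s in rpc_samples]
ops_match_calls = [sum(ops.values()) == rpc["calls"]
                   for ops, rpc in zip(op_samples, rpc_samples)]

for rpc, ratio, ok in zip(rpc_samples, authrefrsh_ratios, ops_match_calls):
    print(f"calls={rpc['calls']} authrefrsh/calls={ratio:.2f} ops sum to calls: {ok}")
```

In all three intervals lookup, create, and write counts are within one of each other, so the client appears to issue about one of each per file it touches.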
=============================
tcpdump from the server: a constant stream of these packets
bot.xxx.xxx.nfs > xxx.xxx.xxx.x.1678097707: reply ok 116 lookup ERROR:
No such file or directory post dattr: DIR 40755 ids 65534/65534 sz 4096
nlink 2 rdev 0/0 fsid 7113a8dfa6bdf8bd fileid 49a0065 a/m/ctime
1366219123.511214381 1366219132.321214505 1366219132.321214505
bot.xxx.xxx.nfs > xxx.xxx.xxx.x.2030419243: reply ok 116 lookup ERROR:
No such file or directory post dattr: DIR 40755 ids 65534/65534 sz 4096
nlink 2 rdev 0/0 fsid 7113a8dfa6bdf8bd fileid 49a0170 a/m/ctime
1366219123.176214263 1366219242.654078174 1366219242.654078174
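To tally replies like the two above from a longer capture, something along these lines could work. It is a sketch only: the regular expression is written against the two sample lines shown here (rejoined onto one line each), not against tcpdump's full NFS output grammar, and the names are illustrative.

```python
import re
from datetime import datetime, timezone

# The two sample packets from above, each rejoined onto a single line.
packets = [
    "bot.xxx.xxx.nfs > xxx.xxx.xxx.x.1678097707: reply ok 116 lookup ERROR: "
    "No such file or directory post dattr: DIR 40755 ids 65534/65534 sz 4096 "
    "nlink 2 rdev 0/0 fsid 7113a8dfa6bdf8bd fileid 49a0065 a/m/ctime "
    "1366219123.511214381 1366219132.321214505 1366219132.321214505",
    "bot.xxx.xxx.nfs > xxx.xxx.xxx.x.2030419243: reply ok 116 lookup ERROR: "
    "No such file or directory post dattr: DIR 40755 ids 65534/65534 sz 4096 "
    "nlink 2 rdev 0/0 fsid 7113a8dfa6bdf8bd fileid 49a0170 a/m/ctime "
    "1366219123.176214263 1366219242.654078174 1366219242.654078174",
]

PATTERN = re.compile(
    r"reply ok \d+ (?P<op>\w+) ERROR: (?P<err>.+?) post dattr: .* "
    r"fileid (?P<fileid>[0-9a-f]+) a/m/ctime \d+\.\d+ (?P<mtime>\d+)\.\d+ \d+\.\d+"
)

def parse_reply(line):
    """Extract the operation, error text, directory fileid and directory mtime (UTC)."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return {
        "op": m["op"],
        "err": m["err"],
        "fileid": m["fileid"],
        "mtime": datetime.fromtimestamp(int(m["mtime"]), tz=timezone.utc),
    }

replies = [r for r in map(parse_reply, packets) if r is not None]
failed_lookups = sum(1 for r in replies if r["op"] == "lookup")
print(f"{failed_lookups} failed lookups")
for r in replies:
    print(r["fileid"], r["err"], r["mtime"].isoformat())
```

A failed lookup immediately before a create is normal when new files are being made, so a count like this is mainly useful for comparing against the create rate in nfsstat.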