Todd Pytel
2004-08-02 20:47:03 UTC
Hi misc@,
What originally looked like a Samba problem has led back to my router
running OpenBSD 3.4. The crux of the problem is that, shortly after my
workstations mount an NFS share from a server, other IP traffic to that
same server is briefly denied with an ICMP "Host unreachable" message.
The details: Two workstations (in 192.168.0.x) running Debian Linux
use various services (including NFS and SMB) from an OBSD 3.5 server
(69.x.x.x). Between the workstations and the server is a three-legged
OBSD 3.4 router which manages traffic between the external link, the
69.x.x.x server, and the workstation network (for which it acts as a NAT
gateway).
The behavior I first noticed was that the workstations would sometimes
(roughly 50%) fail to mount the SMB share at bootup, failing with a "No
route to host" error. However, doing a "mount -a -t smbfs" after bootup
worked perfectly every time. Strange. Some more poking and debugging
revealed that traffic between the workstations and the server works
correctly for a while, including DNS lookups, NFS connections, and even
a successful NetBIOS query. But after the NetBIOS query, the SMB connection
attempt is rejected by the router with an ICMP "host unreachable". Here's
some capture output - 69.x.x.x is the server IP throughout this
discussion:
# The successful NetBIOS name query
14:35:12.278208 192.168.0.1.32769 > 69.x.x.x.137: udp 50 (DF)
14:35:12.278544 69.x.x.x.137 > 192.168.0.1.32769: udp 62
# Client attempts the SMB connection
14:35:12.279082 192.168.0.1.32768 > 69.x.x.x.139:
    S 3874542792:3874542792(0) win 5840
    <mss 1460,sackOK,timestamp 4294709239 0,nop,wscale 0> (DF)
# The router (192.168.1.1) returns an ICMP "host unreachable" reply
14:35:12.279155 192.168.1.1 > 192.168.0.1:
    icmp: host 69.x.x.x unreachable
The capture goes on to show a CUPS/IPP connection denied in the same
way at 14:35:13.9. But by 14:35:17.8, DNS requests are routed correctly.
So to summarize, it appears that the router decides there is no route
to the server for a few seconds.
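In case it helps anyone sift through a longer capture, here is a rough
Python sketch for picking out the router's "host unreachable" replies.
It assumes the older tcpdump text format shown above, with each packet
on a single unwrapped line; the names find_unreachable and UNREACH_RE
are just illustrative, not part of any tool.

```python
import re
from datetime import datetime

# Assumed line shape (old tcpdump text output, one packet per line):
#   HH:MM:SS.ffffff <reporter> > <client>: icmp: host <target> unreachable
UNREACH_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d{6}) (?P<src>\S+) > (?P<dst>\S+): "
    r"icmp: host (?P<host>\S+) unreachable"
)


def find_unreachable(lines):
    """Return (timestamp, reporter, target) for each host-unreachable reply."""
    events = []
    for line in lines:
        match = UNREACH_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%H:%M:%S.%f")
            events.append((ts, match.group("src"), match.group("host")))
    return events


# Sample lines taken from the capture above, with the ICMP line unwrapped.
sample = [
    "14:35:12.278208 192.168.0.1.32769 > 69.x.x.x.137: udp 50 (DF)",
    "14:35:12.278544 69.x.x.x.137 > 192.168.0.1.32769: udp 62",
    "14:35:12.279155 192.168.1.1 > 192.168.0.1: icmp: host 69.x.x.x unreachable",
]

events = find_unreachable(sample)
for ts, reporter, target in events:
    print(ts.time(), reporter, "reports", target, "unreachable")
```

Feeding it a whole boot-time capture should show how long the router
keeps answering this way before traffic is routed again.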
Sorry if this was all a bit long-winded, but the whole situation
seems strange to me, and I'm not sure what details are significant.
Since this whole exchange occurs immediately after NFS shares are
mounted on the client, my first thought was that some component just
needed a bit of extra time to get itself together. But adding a 5-second
sleep between the NFS mount and the SMB mount in the workstation's init
script doesn't change anything, not even how often the problem
appears (again, about half the time). And, to repeat, the SMB mount
completes flawlessly once the workstation has booted - I've done at
least a hundred trials without a failure. I've also tried setting some
of the recommended Linux socket options for smbmount (TCP_NODELAY,
SO_SNDBUF, SO_RCVBUF), but that had no effect either.
So at this point, I'm stumped. If I can provide any more details about
the setup, just ask. Many thanks for any help you can provide.
--
Todd Pytel