BGP session cleared by the remote end, when MD5 is configured AND the session was initiated from the OpenBSD side, fails and does not recover.

Discussion:

Daniel Ouellet

2005-10-04 08:38:27 UTC

I am not sure whether it is normal for routers configured with MD5 to
react like this. Both sides can, and should be allowed to, initiate the
BGP session. But when the session is not initiated from bgpd,
unexpected results occur.

OpenBSD <---> Cisco routers.

With MD5.

If the session is initiated from the OpenBSD side (tcp/xxx -> tcp/179
on the remote Cisco router), then any 'bgpctl neighbor x.x.x.x clear'
for that remote router works: the session clears and comes back
instantly. Great!

However, if the session in that condition is cleared from the Cisco side
(clear ip bgp x.x.x.x), then the OpenBSD side doesn't really reset the
session. It continues to expect packets on the same return port tcp/xxx,
as opposed to accepting the new session that the remote side initiates
at that time to port 179 and then replying to the new tcp/xxx request
port.

When the session is reset from the remote side, it should become
Cisco -> OpenBSD (tcp/xxx to tcp/179), so port 179 should be on the
OpenBSD side then, no?
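
To make the direction question concrete: which end opened a session can be read off the port pair that bgpctl prints under "Local host" and "Remote host". A minimal sketch in Python, assuming only that the passive side listens on the well-known BGP port 179; the function name `initiator` is made up for illustration and is not part of bgpd or bgpctl.

```python
BGP_PORT = 179  # well-known BGP port


def initiator(local_port: int, remote_port: int) -> str:
    """Guess which end opened the TCP connection, judged only by the ports.

    The end that connects uses an ephemeral source port and targets 179,
    so the end whose own port is 179 is the passive (listening) side.
    """
    if remote_port == BGP_PORT and local_port != BGP_PORT:
        return "local"    # we connected out to the peer's port 179
    if local_port == BGP_PORT and remote_port != BGP_PORT:
        return "remote"   # the peer connected in to our port 179
    return "unknown"      # ports alone do not decide it


# Seen from the OpenBSD box: local port 179, remote port 56923
print(initiator(179, 56923))
# Seen from the OpenBSD box when bgpd connected out: local tcp/xxx, remote 179
print(initiator(40000, 179))
```

With the first pair the Cisco is the initiator, with the second one bgpd is.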

Then you start to get errors in the log like this:

%TCP-6-BADAUTH: No MD5 digest from OpenBSD(179) to Cisco(48384) (RST)

where OpenBSD stands for the OpenBSD IP address, and Cisco for the Cisco
IP address.

Also, I haven't yet been able to establish a session where the Cisco
side initiates the session and the OpenBSD side is the remote side when
MD5 is configured. It may be possible, and it sure should be, but I
haven't been able to yet.

I can provide more details if need be, or test more as well, but that's
in short what is going on.

It's been many days so far, and that is what I found on why my sessions
with MD5 are not coming up or, when cleared, don't come back to life.

It looks to me like bgpd wants to be the initiator of the connection
every time, and then it works well. Is that the case here?

I started to check deeper when I realized that, without MD5, one side
always resets the session quicker than the other, and that it then gets
stuck when MD5 is in use.

This is on 3.7, and I had what looks like the same problem with 3.6 and
3.8-current (Sep 29).

Am I missing something here? Was that the intention from the start?

Many thanks for shedding some light on this for me.

Daniel

Daniel Ouellet

2005-10-05 22:33:05 UTC

More on this with test results, examples, the setup used, and more
details.

The short of it is that bgpd will never establish an MD5 connection as
slave! So, if you do get an MD5 session in normal operation, it may
well not stay stable at all, depending on BGP flaps and on who tries to
become master after a flap. You may end up with BGP down until human
action is performed on both sides of the session to get it back up.

How did I show that? By checking the various possibilities without MD5
configured, and then ONLY adding MD5 to the working setup.

Test summary: see the results when one side is always forced to be
master or slave, and the impact of that. Also, make sure that after a
reset the master stays the master. Filters are used to accomplish this
and to try to isolate a possible problem.

Please read on, as I think this shows the situation as is.

Daniel

======================

Without MD5 configured.

With bgpd master
Clear session from bgpd side, session comes back up right away.
Clear session from remote side, session comes back up with delay.

With bgpd slave
Clear session from bgpd side, session comes back up with delay.
Clear session from remote side, session comes back up with a possibly
very long delay, much longer than when master.

================

Now with MD5 configured. We only add

tcp md5sig password test on bgpd side and
neighbor 66.63.12.108 password test on the Cisco side.

With bgpd master
Clear session from bgpd side, session comes back up right away.
Clear session from remote side, session comes back up with a possibly
very long delay.

With bgpd slave
Just can't establish a session whatsoever! The Cisco side gets stuck in
the OpenSent state and cycles a few times, all without success.

66.63.12.108 4 65001 0 1 0 0 0 never OpenSent

The OpenBSD side will show an active session, but not up yet obviously:

dev1# bgpctl s neigh 66.63.12.107
BGP neighbor is 66.63.12.107, remote AS 65001
Description: iBGP Test
BGP version 4, remote router-id 0.0.0.0
BGP state = Active
Last read Never, holdtime 240s, keepalive interval 80s

Message statistics:
                Sent  Received
Opens              1         0
Notifications      0         0
Updates            0         0
Keepalives         0         0
Route Refresh      0         0
Total              1         0

Local host: 66.63.12.108, Local port: 179
Remote host: 66.63.12.107, Remote port: 56923

And the Cisco side keeps cycling from active to open and back to
active, then to open again, etc.

66.63.12.108 4 65001 0 2 0 0 0 never Active

Now looking at the logs from each side: OpenBSD tries to use port
tcp/56923, and on the Cisco side we see this error:

000035: *Oct 5 13:38:43.503 EDT: %TCP-6-BADAUTH: No MD5 digest from 66.63.12.108(179) to 66.63.12.107(56923) (RST)
000036: *Oct 5 13:38:44.503 EDT: %TCP-6-BADAUTH: No MD5 digest from 66.63.12.108(179) to 66.63.12.107(56923) (RST)

It looks like the OpenBSD side does not provide the MD5 digest to the
Cisco to establish the session.

It doesn't matter whether I clear the session from the Cisco side or the
bgpd side, in which order, etc. Both sides, many times, whatever. It
will simply not come up!

Even after reloading the Cisco router, killing bgpd and starting it
anew, it will not come up!

Always the same errors in the logs.

It looks like no MD5 digest is received from the OpenBSD side.

===============

Why will bgpd not establish a session as slave when MD5 is configured,
even though the RFC says both sides should be allowed to do so?

Does bgpd want to be the master every time?

Something sure looks weird here.

================

Setup and tests done with results.

OpenBSD 3.7 and Cisco 5350 connected via Fast Ethernet switch.

OpenBSD <-> switch <-> Cisco 5350

BGP minimal configurations used:

================================
OpenBSD side:

dev1# more /etc/bgpd.conf
# Macros
Peer_Test="66.63.12.107"

# Default global configuration
holdtime 30
holdtime min 10
listen on 66.63.12.108
AS 65001
router-id 66.63.12.108

# List of networks to announce from the router.
network 10.0.1.0/24

# neighbors and peers
group "Peering iBGP on AS65001" {
        remote-as 65001
        local-address 66.63.12.108
        announce all
        neighbor $Peer_Test {
                descr "iBGP Test"
        }
}

==================================
Cisco side:

router bgp 65001
no synchronization
bgp log-neighbor-changes
network 10.0.0.0 mask 255.255.255.0
neighbor 66.63.12.108 remote-as 65001
neighbor 66.63.12.108 version 4
neighbor 66.63.12.108 soft-reconfiguration inbound
no auto-summary

===================================
Filters used, applied to the Fast Ethernet interface of the Cisco
router like this:

interface FastEthernet0/0
description Connection to OpenBSD Test Lab
ip address 66.63.12.107 255.255.255.192
ip access-group bgpd-master in
duplex auto
speed auto

or

interface FastEthernet0/0
description Connection to OpenBSD Test Lab
ip address 66.63.12.107 255.255.255.192
ip access-group bgpd-slave in
duplex auto
speed auto

and the filters used are:

ip access-list extended bgpd-master
permit tcp host 66.63.12.108 neq bgp host 66.63.12.107 eq bgp
deny tcp host 66.63.12.108 eq bgp host 66.63.12.107 neq bgp
permit ip any any
ip access-list extended bgpd-slave
permit tcp host 66.63.12.108 eq bgp host 66.63.12.107 neq bgp
deny tcp host 66.63.12.108 neq bgp host 66.63.12.107 eq bgp
permit ip any any
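
The effect of the two filters can be checked with a small first-match evaluation. This is only a sketch in Python under simplifying assumptions: it looks at the TCP source and destination ports of packets arriving from the bgpd host and ignores the host addresses and non-TCP traffic. The names `ACLS` and `evaluate` are made up for illustration.

```python
BGP = 179

# Each rule is (action, src_is_bgp, dst_is_bgp); None means "any port".
# "neq bgp" becomes False, "eq bgp" becomes True.
ACLS = {
    "bgpd-master": [
        ("permit", False, True),   # permit tcp ... neq bgp ... eq bgp
        ("deny", True, False),     # deny   tcp ... eq bgp  ... neq bgp
        ("permit", None, None),    # permit ip any any
    ],
    "bgpd-slave": [
        ("permit", True, False),   # permit tcp ... eq bgp  ... neq bgp
        ("deny", False, True),     # deny   tcp ... neq bgp ... eq bgp
        ("permit", None, None),    # permit ip any any
    ],
}


def evaluate(acl: str, src_port: int, dst_port: int) -> str:
    """Return the action of the first rule matching a packet from bgpd."""
    for action, src_is_bgp, dst_is_bgp in ACLS[acl]:
        if src_is_bgp is not None and (src_port == BGP) != src_is_bgp:
            continue
        if dst_is_bgp is not None and (dst_port == BGP) != dst_is_bgp:
            continue
        return action
    return "deny"  # implicit deny; never reached here because of the last rule


# bgpd opening a session to the Cisco (ephemeral port -> 179)
print(evaluate("bgpd-master", 40000, 179), evaluate("bgpd-slave", 40000, 179))
# bgpd answering a session the Cisco opened (179 -> ephemeral port)
print(evaluate("bgpd-master", 179, 40000), evaluate("bgpd-slave", 179, 40000))
```

Under these assumptions bgpd-master only lets through sessions that bgpd opens, and bgpd-slave only lets through bgpd's answers to sessions the Cisco opens.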

===========================================
So the tests and logic are:
Force BGPd to initiate the session -> Cisco router with the filter:

ip access-group bgpd-master in

So, the traffic will be like this.

bgpd (tcp/!179) -> Cisco (tcp/179)

with reply

Cisco (tcp/179) -> bgpd (tcp/!179)

Session comes up right away.

vng2#sh ip bgp summ
BGP router identifier 66.63.12.107, local AS number 65001
BGP table version is 15, main routing table version 15
2 network entries using 234 bytes of memory
2 path entries using 104 bytes of memory
3/2 BGP path/bestpath attribute entries using 372 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP using 710 total bytes of memory
BGP activity 7/5 prefixes, 8/6 paths, scan interval 60 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
66.63.12.108    4 65001   12357   12374       15    0    0 00:07:30        1

===========================================
Force Cisco router to initiate the session -> BGPd with the filter:

ip access-group bgpd-slave in

So, the traffic will be like this.

Cisco (tcp/!179) -> bgpd (tcp/179)

with reply

bgpd (tcp/179) -> Cisco (tcp/!179)

Session is never established!

==========================================

But with MD5 it should be established for sure, as either side can be
the master in a BGP session.

However, not here?

Comments on this?

I think my tests are valid. Am I doing something I shouldn't be doing
here? I don't think so, but that's what I found so far and why I can't
keep a stable session with MD5 enabled on it.

Claudio Jeker

2005-10-06 09:44:13 UTC

Post by Daniel Ouellet
More on this with test results, example, setup use, and more details.
The short of it is that bgpd will not establish an MD5 connection as
slave ever! So, if you do get an MD5 session in normal operation, it may
well not stay stable at all depending of bgp flap and who will try to
become master after a flap. You may end up with bgp down until human
action is perform to get it back up from both side of the session.
How did I show that. Checking the various possibility without MD5
configure and then ONLY adding the MD5 on the working setup.
Tested summary. Try to see the results when one side is always force to
be master or slave and see the impact of it. Also, make sure that after
a reset the master will stay the master. The use of filter will
accomplish this to try to isolate a possible problem.
Please read on, as I think this show the situation as is.
Daniel
======================
Without MD5 configure.
With bgpd master
Clear session from bgpd side, session comes back up right away.
Clear session from remote side, session comes back up with delay.
With bgpd slave
Clear session from bgpd side, session comes back up with delay.
Clear session from remote side, session comes back up with possible very
long delay. Much bigger then when master.

I think this is fixed in -current. Henning committed something to make
the delays on neighbor clears shorter.

Post by Daniel Ouellet
================
Now with MD5 configure. We only add
tcp md5sig password test on bgpd side and
neighbor 66.63.12.108 password test on the Cisco side.
With bgpd master
Clear session from bgpd side, session comes back up right away.
Clear session from remote side, session comes back up with possible very
long delay.
With bgpd slave
Just can't establish a session what so ever! The Cisco side will get
stuck in the OpenSent mode and cycle a few times all without success.
66.63.12.108 4 65001 0 1 0 0 0 never OpenSent
dev1# bgpctl s neigh 66.63.12.107
BGP neighbor is 66.63.12.107, remote AS 65001
Description: iBGP Test
BGP version 4, remote router-id 0.0.0.0
BGP state = Active
Last read Never, holdtime 240s, keepalive interval 80s
Sent Received
Opens 1 0
Notifications 0 0
Updates 0 0
Keepalives 0 0
Route Refresh 0 0
Total 1 0
Local host: 66.63.12.108, Local port: 179
Remote host: 66.63.12.107, Remote port: 56923
And the Cisco side will keep cycling there from active to open and back
to active to open, etc.
66.63.12.108 4 65001 0 2 0 0 0 never Active
Now looking at the logs from each side. OpenBSD try to use the port
000035: *Oct 5 13:38:43.503 EDT: %TCP-6-BADAUTH: No MD5 digest from
66.63.12.108(179) to 66.63.12.107(56923) (RST)
000036: *Oct 5 13:38:44.503 EDT: %TCP-6-BADAUTH: No MD5 digest from
66.63.12.108(179) to 66.63.12.107(56923) (RST)
Looks like the OpenBSD side do not provide the MD5 to the Cisco to
establish the session.
It doesn't matter if I clean the session from the Cisco side, or the
bgpd side, order, etc. Both side, many times, what ever. It will simply
not come up!
Even reloading the Cisco router and killing the bpgd and starting new,
it will not come up!
Always the same errors in the logs.
No MD5 digest received from the OpenBSD side looks like.

It looks like tcpmd5 is enabled too late when opening a session.
I will try to have a look at it.

Post by Daniel Ouellet
===============
Why is bgpd will not establish a session as slave when MD5 is configure
even if the RFC said both sides should be allow to do so?
bgpd wants to be the master every time?
Something sure looks weird here.

That's more like a bug. Btw, MD5 between two bgpds is working, at least
it works for me.

Post by Daniel Ouellet
==========================================
But it should be establish however for MD5 for sure as any sides can be
the master in a bgp session.
However, not here?
Comments on this?
I think my tests are valid. Am I doing something I should be doing here?
I don't think so, but that's what I found so far and why I can't keep a
stable session with MD5 enable on it.

For me it looks like a bug for now.

--
:wq Claudio

Daniel Ouellet

2005-10-06 18:23:17 UTC

Post by Claudio Jeker

Post by Daniel Ouellet
With bgpd master
Clear session from bgpd side, session comes back up right away.
Clear session from remote side, session comes back up with delay.
With bgpd slave
Clear session from bgpd side, session comes back up with delay.
Clear session from remote side, session comes back up with possible very
long delay. Much bigger then when master.

I think this is fixed in -current. Henning commited something to make the
delays on neighbor clears faster.

My first tests were done with -current (Sep 29), but with a small
difference in the lab setup: they were done on a live network. But I
will sure redo them. It's too important to me not to be 150% sure it's
working well. So far, it just wasn't. I have well over 100+ peer
sessions, of which ~70+ use MD5, and I can't afford not to have them
stable. Plus I have no choice but to either buy bigger Cisco routers,
and hell, I don't want that, or use OpenBSD, and that's what I want. I
am fed up with the CPU power limitations of Cisco and I will kiss them
goodbye!

Post by Claudio Jeker

Post by Daniel Ouellet
Even reloading the Cisco router and killing the bpgd and starting new,
it will not come up!
Always the same errors in the logs.
No MD5 digest received from the OpenBSD side looks like.

It looks like the tcpmd5 is enabled to late when opeining a session.
I try to have a look at it.

You have no idea how much I would appreciate that! I started to look at
the code, but that's a long process for me.

Post by Claudio Jeker

That's more like a bug. Btw. MD5 between to bgpd is working, at least it
works for me.

That's what I thought, but I know better than to start saying there is a
bug. Before I do, I want to be sure, but it does look like one to me so
far. My tests so far show that you can have MD5 as long as OpenBSD is
master, but cleared sessions, depending on which side initiates the
clear, don't come back in one case and are slow in the other. (That was
with 3.7 for my last tests on this one.) Will redo.

Post by Claudio Jeker

For me it looks like a bug for now.

Same thought here.

Daniel

Claudio Jeker

2005-10-06 15:13:42 UTC

Post by Daniel Ouellet
More on this with test results, example, setup use, and more details.
======================
Without MD5 configure.
With bgpd master
Clear session from bgpd side, session comes back up right away.
Clear session from remote side, session comes back up with delay.
With bgpd slave
Clear session from bgpd side, session comes back up with delay.
Clear session from remote side, session comes back up with possible very
long delay. Much bigger then when master.

I see similar delays with my test setup. Most of the time it takes longer
for a session to come back up because of the different timers that are
run. After a clear, a reopen is tried immediately, and that is most often
blocked. In my case the Cisco seems to be too slow to close the session
in time for the reopen.
It also matters where you close the connection, because in one case the
idle timer is run (30s) instead of the connect retry timer (120s).
Also, the idle timer starts to grow if you flap the session often.
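
The interplay of the two timers mentioned above can be sketched numerically. The 30s and 120s values come from the message; the doubling per flap and the one-hour cap are assumptions chosen for illustration, and the names below are hypothetical rather than bgpd's own.

```python
IDLE_HOLD_INITIAL = 30   # seconds, idle timer value from the message
CONNECT_RETRY = 120      # seconds, connect retry timer value from the message
IDLE_HOLD_MAX = 3600     # ASSUMPTION: illustrative upper bound


def idle_hold_after(flaps: int) -> int:
    """Idle timer after `flaps` consecutive flaps.

    ASSUMPTION: the timer doubles on every flap, up to IDLE_HOLD_MAX.
    """
    return min(IDLE_HOLD_INITIAL * 2 ** flaps, IDLE_HOLD_MAX)


for flaps in range(5):
    print(flaps, idle_hold_after(flaps), CONNECT_RETRY)
```

Under that assumed growth, two flaps in a row are enough for the idle timer to reach the connect retry timer, so a session that is cleared repeatedly during testing would come back more slowly each time.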

I can't reproduce this. On my test setup all sessions come back up.

...

Post by Daniel Ouellet
Now looking at the logs from each side. OpenBSD try to use the port
000035: *Oct 5 13:38:43.503 EDT: %TCP-6-BADAUTH: No MD5 digest from
66.63.12.108(179) to 66.63.12.107(56923) (RST)
000036: *Oct 5 13:38:44.503 EDT: %TCP-6-BADAUTH: No MD5 digest from
66.63.12.108(179) to 66.63.12.107(56923) (RST)

This is a Cizzz-coee / RFC feature. They enforce a TCP MD5 digest on TCP
RST packets. Now that's just stupid, because it is not possible to do
that in some cases, since the other side does not know the key at that
time (e.g. to signal that the port is unavailable).
In your case this means that somehow the connection from the Cisco to
your OpenBSD box is blocked or there is nothing listening on port 179.

Post by Daniel Ouellet
Looks like the OpenBSD side do not provide the MD5 to the Cisco to
establish the session.

OpenBSD only omits the MD5 digest on RST packets, and that is actually
OK. RFC 2385 mentions this special case in section 4.1:
A connectionless reset will be ignored by the receiver of the reset,
since the originator of that reset does not know the key, and so
cannot generate the proper signature for the segment. This means,
for example, that connection attempts by a TCP which is generating
signatures to a port with no listener will time out instead of being
refused. Similarly, resets generated by a TCP in response to
segments sent on a stale connection will also be ignored.
Operationally this can be a problem since resets help BGP recover
quickly from peer crashes.
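
Why a stack without the key cannot sign such a reset follows from how RFC 2385 defines the digest: MD5 over the TCP pseudo-header, the fixed TCP header with options excluded and the checksum taken as zero, the segment data, and finally the shared key. A minimal IPv4-only sketch in Python; the function name `tcp_md5_digest` and the sample RST header are illustrative, and this is not code from bgpd or the OpenBSD kernel.

```python
import hashlib
import socket
import struct


def tcp_md5_digest(src_ip, dst_ip, tcp_header, payload, key):
    """16-byte digest carried in TCP option kind 19 (RFC 2385, IPv4)."""
    # Fixed 20-byte header only (options dropped), checksum bytes zeroed.
    fixed = bytearray(tcp_header[:20])
    fixed[16:18] = b"\x00\x00"
    # Pseudo-header: addresses, zero-padded protocol 6, segment length.
    seg_len = len(tcp_header) + len(payload)
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!HH", socket.IPPROTO_TCP, seg_len))
    return hashlib.md5(pseudo + bytes(fixed) + payload + key).digest()


# A bare RST from port 179 to port 56923: sequence/ack 0, data offset 5,
# flags 0x04 (RST), window 0, checksum 0, urgent pointer 0.
rst = struct.pack("!HHIIBBHHH", 179, 56923, 0, 0, 5 << 4, 0x04, 0, 0, 0)
digest = tcp_md5_digest("66.63.12.108", "66.63.12.107", rst, b"", b"test")
print(digest.hex())
```

Since the key is the last input, a host that answers an unexpected segment with a connectionless reset and has no key for that peer has nothing valid to put in the option, which is the case the quoted section describes.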

Post by Daniel Ouellet
It doesn't matter if I clean the session from the Cisco side, or the
bgpd side, order, etc. Both side, many times, what ever. It will simply
not come up!
Even reloading the Cisco router and killing the bpgd and starting new,
it will not come up!
Always the same errors in the logs.
No MD5 digest received from the OpenBSD side looks like.

Does it initially come up? As I said, I cannot reproduce it.

Are you running pf? Perhaps the packets get blocked or modified on the
way in and so the session is reset.
Check with netstat -sptcp for the md5 counters.

BTW. I mostly reused your config. I just disabled soft-reconfig inbound
because my 2500 testbox would probably not survive that.

--
:wq Claudio

Daniel Ouellet

2005-10-06 18:50:12 UTC

Post by Claudio Jeker

Post by Daniel Ouellet
======================
Without MD5 configure.
With bgpd master
Clear session from bgpd side, session comes back up right away.
Clear session from remote side, session comes back up with delay.
With bgpd slave
Clear session from bgpd side, session comes back up with delay.
Clear session from remote side, session comes back up with possible very
long delay. Much bigger then when master.

The interesting fact here for me was how different it was for each
side. I did this many times (10x+) on each setup to see. With bgpd as
master and a clear from the bgpd side, the Cisco session comes back up
instantly. With the Cisco as master and the clear initiated toward bgpd,
it was the slowest by far, I mean much longer. The other two
possibilities are pretty much equal. It was an interesting finding
nevertheless. Why, I am not sure, however.

Post by Claudio Jeker

Post by Daniel Ouellet
Now with MD5 configure. We only add
tcp md5sig password test on bgpd side and
neighbor 66.63.12.108 password test on the Cisco side.
With bgpd master
Clear session from bgpd side, session comes back up right away.
Clear session from remote side, session comes back up with possible very
long delay.
With bgpd slave
Just can't establish a session what so ever! The Cisco side will get
stuck in the OpenSent mode and cycle a few times all without success.
66.63.12.108 4 65001 0 1 0 0 0 never OpenSent

I can't reproduce this. On my test setup all session come back up.

I will try -current again and send even more details on my setup. Or, if
you ever want to check it out, I have no problem whatsoever providing
you access to both boxes directly. Just say the word if interested. I
tried Cisco IOS 12.3x and 12.4x, same results so far.

Post by Claudio Jeker

The last tests, at ~5 AM this morning, still show me this, and nothing
was in the path blocking it at all. I will recheck, as it's been a few
days without sleep so far, so I admit I could be starting to get a bit
fuzzy. Lack of sleep, but I will make sure before saying false things
here. In any case, not that I like it whatsoever, I am not sure about
the Cizzz-coee stuff. The sad thing is that they still make up a huge
portion of the Internet's routers, hopefully changing quickly, but
still, we need to interact with them a lot.

Post by Claudio Jeker

Post by Daniel Ouellet
Looks like the OpenBSD side do not provide the MD5 to the Cisco to
establish the session.

OpenBSD only misses the MD5 digest on the RST packets and that is actually
A connectionless reset will be ignored by the receiver of the reset,
since the originator of that reset does not know the key, and so
cannot generate the proper signature for the segment. This means,
for example, that connection attempts by a TCP which is generating
signatures to a port with no listener will time out instead of being
refused. Similarly, resets generated by a TCP in response to
segments sent on a stale connection will also be ignored.
Operationally this can be a problem since resets help BGP recover
quickly from peer crashes.

I can deal with that delay, and I agree that it makes sense to refuse
the reset, or ignore it. However, it looks like, so far, the session
doesn't reset. Maybe that's because it still receives messages from the
Cisco side on the wrong ports, but somehow sees them as keepalives. I
really don't know what I am saying here, just a weird thought, but so
far the result is that it doesn't reset. I will test in more detail
again. But just know that something is not acting in the best interest
of the session here somewhere.

Post by Claudio Jeker

Does it initially come up? As I said I can not reproduce it.

No, not if you killed it, but that was when trying to have bgpd as
slave, not master. More to come on this as well.

Post by Claudio Jeker

Are you running pf? Perhaps the packet get blocked or modified on the way
in and so the session is reset.
Check with netstat -sptcp for the md5 counters.
BTW. I mostly reused your config. I just disabled soft-reconfig inbound
because my 2500 testbox would probably not survive that.

No pf whatsoever. The setup was described in the first posting.

In short, I have both boxes connected to a switch and I ONLY put a
filter on the Cisco router, on traffic to port tcp/179, to force which
side initiates the session. The filter used was one of the two below,
depending on which side I wanted to force to be master or slave. It was
applied to the FastEthernet interface facing the bgpd side, obviously.
That was/is legal, no? Not in normal operation, but for testing it
should really do what I am trying to do and force one side to become the
master, no? Based on the RFC anyway, it sure should.

I tested that on both a Cisco 5350 and a 7206 VXR. I can most likely
also test this on a 26xx, but I didn't think it was going to be
different. Also, soft-reconfig inbound was enabled in all tests. Maybe I
should check without it, but when you peer, most likely many peers will
have it enabled by default.

ip access-list extended bgpd-master
permit tcp host 66.63.12.108 neq bgp host 66.63.12.107 eq bgp
deny tcp host 66.63.12.108 eq bgp host 66.63.12.107 neq bgp
permit ip any any
ip access-list extended bgpd-slave
permit tcp host 66.63.12.108 eq bgp host 66.63.12.107 neq bgp
deny tcp host 66.63.12.108 neq bgp host 66.63.12.107 eq bgp
permit ip any any

Regards,

Daniel

Daniel Ouellet

2005-10-07 10:10:42 UTC

Post by Claudio Jeker

I can't reproduce this. On my test setup all session come back up.

Configuration with MD5.

Well, let's see if this helps or not. Two examples below. One might not
be very elegant, but I think it may well show the problem. I force bgpd
to try to be slave using a filter on the Cisco router. The filter WILL
be temporary in my case anyway, as I want the session to be stuck in
OpenSent state; at that time I will remove the filter and sit back and
watch. What happens is that the session never comes up. I think it
should, but it doesn't.

Then, when I see OpenSent on the Cisco router, I simply remove the
filter to be 100% sure nothing is blocking the regular traffic, and see
if the session can recover. It doesn't.

So, I use this filter on the interface facing bgpd to force this state.

ip access-list extended bgpd-slave
permit tcp any eq bgp any neq bgp
deny tcp any neq bgp any eq bgp
permit ip any any

and apply it like this

interface FastEthernet0/0
description Connection to OpenBSD Test Lab
ip address 66.63.12.107 255.255.255.192
ip access-group bgpd-slave in

I save my config and, to be ultra sure nothing else interferes, I simply
reload. There is no need to do that and it is stupid anyway, but just to
be paranoid I do it.

Once I can ping the Cisco for a few seconds, I start bgpd on both
versions of OpenBSD and wait until I see the OpenSent state on the Cisco
router, because even though it should establish a slave connection with
this filter, it doesn't. Why, I wish I knew, but anyway it doesn't.
Then, when in OpenSent state, I remove the filter from the interface
entirely to be sure nothing is in the way. Also, remember that no pf is
running, and the two servers are fresh installs with nothing on them
other than the install and the bgpd configuration. That's it.

So, when I see:

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
66.63.12.106    4 65001       0       1        0    0    0 never    OpenSent
66.63.12.108    4 65001       0       1        0    0    0 never    OpenSent

I do

no ip access-group bgpd-slave in

on my Fast Ethernet interface and then sit back. Nothing will ever
happen here. No session will ever come up. Never! It cycles through
close -> idle -> active -> OpenSent, stays there for a few minutes, and
then cycles again to the same point, over and over.

What I see on OpenBSD 3.7 is

# bgpctl s neigh 66.63.12.107
BGP neighbor is 66.63.12.107, remote AS 65001
Description: iBGP Test
BGP version 4, remote router-id 0.0.0.0
BGP state = Active
Last read Never, holdtime 240s, keepalive interval 80s

Message statistics:
                Sent  Received
Opens              1         0
Notifications      0         0
Updates            0         0
Keepalives         0         0
Route Refresh      0         0
Total              1         0

Local host: 66.63.12.106, Local port: 179
Remote host: 66.63.12.107, Remote port: 14670

==========================

and at each cycle of close -> idle -> active -> OpenSent, the port above
changes. In -current, after the first cycle, it shows

Last error: unknown error code

instead, with no port information, and error logs like this:

Oct 7 05:44:42 dev2 bgpd[21803]: startup
Oct 7 05:44:42 dev2 bgpd[14625]: route decision engine ready
Oct 7 05:44:42 dev2 bgpd[16756]: listening on 66.63.12.106
Oct 7 05:44:42 dev2 bgpd[16756]: session engine ready
Oct 7 05:44:42 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): state change None -> Idle, reason: None
Oct 7 05:44:42 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): state change Idle -> Connect, reason: Start
Oct 7 05:44:42 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): state change Connect -> OpenSent, reason: Connection opened
Oct 7 05:44:42 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): write error: Invalid argument
Oct 7 05:44:42 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): state change OpenSent -> Idle, reason: Fatal error
Oct 7 05:44:49 dev2 ntpd[24590]: adjusting local clock by -170.192293s
Oct 7 05:45:12 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): state change Idle -> Connect, reason: Start
Oct 7 05:46:26 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): socket error: No route to host
Oct 7 05:46:26 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): state change Connect -> Active, reason: Connection open failed
Oct 7 05:48:16 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): state change Active -> OpenSent, reason: Connection opened
Oct 7 05:48:16 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): write error: Invalid argument
Oct 7 05:48:16 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): state change OpenSent -> Idle, reason: Fatal error
Oct 7 05:48:34 dev2 ntpd[24590]: adjusting local clock by -169.939425s
Oct 7 05:49:16 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): state change Idle -> Connect, reason: Start
Oct 7 05:49:16 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): socket error: Connection refused
Oct 7 05:49:16 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): state change Connect -> Active, reason: Connection open failed

-------------------
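
The repeating cycle in a log like the one above is easier to see with the state transitions pulled out. A minimal sketch in Python, assuming each log entry sits on a single line (wrapped entries would need joining first); the helper name `transitions` and the regular expression are illustrative and not part of bgpd.

```python
import re

# Matches the "state change A -> B, reason: C" part of a bgpd log entry.
STATE_RE = re.compile(r"state change (\w+) -> (\w+), reason: (.+)$")

SAMPLE = [
    "Oct 7 05:44:42 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): "
    "state change Idle -> Connect, reason: Start",
    "Oct 7 05:44:42 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): "
    "state change Connect -> OpenSent, reason: Connection opened",
    "Oct 7 05:44:42 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): "
    "write error: Invalid argument",
    "Oct 7 05:44:42 dev2 bgpd[16756]: neighbor 66.63.12.107 (iBGP Test): "
    "state change OpenSent -> Idle, reason: Fatal error",
]


def transitions(lines):
    """Return (old_state, new_state, reason) for every state-change entry."""
    found = []
    for line in lines:
        match = STATE_RE.search(line.strip())
        if match:
            found.append(match.groups())
    return found


for old, new, reason in transitions(SAMPLE):
    print(f"{old:>8} -> {new:<8} ({reason})")
```

On the sample entries this lists Idle -> Connect, Connect -> OpenSent and OpenSent -> Idle; the write error entry in between is skipped because it is not a state change.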

-current is no better but, as noted above, the port information after
the first cycle is replaced with:

Last error: unknown error code

# bgpctl s neigh 66.63.12.107
BGP neighbor is 66.63.12.107, remote AS 65001
Description: iBGP Test
BGP version 4, remote router-id 0.0.0.0
BGP state = Active
Last read Never, holdtime 240s, keepalive interval 80s

Message statistics:
                Sent  Received
Opens              2         0
Notifications      0         0
Updates            0         0
Keepalives         0         0
Route Refresh      0         0
Total              2         0

Local host: 66.63.12.108, Local port: 179
Remote host: 66.63.12.107, Remote port: 13386

With error log:

Oct 7 05:41:55 dev1 bgpd[15395]: startup
Oct 7 05:41:55 dev1 bgpd[16398]: route decision engine ready
Oct 7 05:41:55 dev1 bgpd[10475]: listening on 66.63.12.108
Oct 7 05:41:55 dev1 bgpd[10475]: session engine ready
Oct 7 05:41:55 dev1 bgpd[10475]: neighbor 66.63.12.107 (iBGP Test): state change None -> Idle, reason: None
Oct 7 05:41:55 dev1 bgpd[10475]: neighbor 66.63.12.107 (iBGP Test): state change Idle -> Connect, reason: Start
Oct 7 05:41:55 dev1 bgpd[10475]: neighbor 66.63.12.107 (iBGP Test): state change Connect -> OpenSent, reason: Connection opened
Oct 7 05:41:55 dev1 bgpd[10475]: neighbor 66.63.12.107 (iBGP Test): write error: Invalid argument
Oct 7 05:41:55 dev1 bgpd[10475]: neighbor 66.63.12.107 (iBGP Test): state change OpenSent -> Idle, reason: Fatal error
Oct 7 05:42:25 dev1 bgpd[10475]: neighbor 66.63.12.107 (iBGP Test): state change Idle -> Connect, reason: Start
Oct 7 05:43:40 dev1 bgpd[10475]: neighbor 66.63.12.107 (iBGP Test): socket error: No route to host
Oct 7 05:43:40 dev1 bgpd[10475]: neighbor 66.63.12.107 (iBGP Test): state change Connect -> Active, reason: Connection open failed
Oct 7 05:45:31 dev1 bgpd[10475]: neighbor 66.63.12.107 (iBGP Test): state change Active -> OpenSent, reason: Connection opened
Oct 7 05:45:31 dev1 bgpd[10475]: neighbor 66.63.12.107 (iBGP Test): write error: Invalid argument
Oct 7 05:45:31 dev1 bgpd[10475]: neighbor 66.63.12.107 (iBGP Test): state change OpenSent -> Idle, reason: Fatal error
Oct 7 05:46:31 dev1 bgpd[10475]: neighbor 66.63.12.107 (iBGP Test): state change Idle -> Connect, reason: Start
Oct 7 05:46:31 dev1 bgpd[10475]: neighbor 66.63.12.107 (iBGP Test): socket error: Connection refused
Oct 7 05:46:31 dev1 bgpd[10475]: neighbor 66.63.12.107 (iBGP Test): state change Connect -> Active, reason: Connection open failed

=====================
Second example.

Now, I made sure no filters are present on the Cisco, reloaded it, killed
bgpd on both servers and restarted them. What happens then is that I can
establish a session, but it looks like bgpd always has to be the master
(the initiator); once the session is established, if I reset it from the
Cisco side, it never comes back to life. It gets stuck in the OpenSent
state again here too.

I set up two boxes, one with 3.7 and one with current (Oct 6), to see if
there is any difference for this specific event. Same results so far when
MD5 is configured. Same results with Cisco 5350 and 7206, and with IOS
12.3(9), 12.3(16) and 12.4(3) as well. Obviously, I didn't try every
version under the sun, but the idea is there anyway.

I establish a session with MD5 where bgpd initiates the session to the
Cisco box. The "bgpctl show neighbor 66.63.12.107" output clearly shows
that bgpd connected to the remote on port 179. After the session is up, if
I do "clear ip bgp 66.63.12.106" or "clear ip bgp 66.63.12.108" on the
Cisco, both get stuck forever until I also manually clear the session from
the bgpd side. So, if ONLY the Cisco side initiates a session clear, the
session is gone until a manual clear is also done on the bgpd side. I do
see the session on the Cisco go through close, Idle, Active, OpenSent and
then get stuck there. It really looks like the bgpd side is simply not
listening anymore.

The only difference is that on current the port information looks like it
is cleared, and you get an error message that 3.7 doesn't provide.

current:
# bgpctl s neigh 66.63.12.107
BGP neighbor is 66.63.12.107, remote AS 65001
Description: iBGP Test
BGP version 4, remote router-id 66.63.12.107
BGP state = Idle, down for 00:16:06
Last read 00:16:14, holdtime 240s, keepalive interval 80s

Message statistics:
                  Sent       Received
  Opens             17              4
  Notifications      2              0
  Updates            4              4
  Keepalives        34             41
  Route Refresh      0              0
  Total             57             49

Last error: unknown error code

--------------
As opposed to 3.7, where you get this:

# bgpctl s neigh 66.63.12.107
BGP neighbor is 66.63.12.107, remote AS 65001
Description: iBGP Test
BGP version 4, remote router-id 66.63.12.107
BGP state = Active, down for 00:16:17
Last read 00:16:18, holdtime 240s, keepalive interval 80s

Message statistics:
                  Sent       Received
  Opens              8              4
  Notifications      2              0
  Updates            4              4
  Keepalives        34             42
  Route Refresh      0              0
  Total             48             50

Local host: 66.63.12.108, Local port: 14223
Remote host: 66.63.12.107, Remote port: 179

------------------------
and from the router side:

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
66.63.12.106    4 65001      44      73        0    0    0 00:16:23 OpenSent
66.63.12.108    4 65001      44      74        0    0    0 00:16:22 OpenSent

====================

No matter how long I wait, it's stuck there forever.

Now, as far as netstat -sptcp is concerned, here are the results from current.

# netstat -sptcp
tcp:
6392 packets sent
3446 data packets (410744 bytes)
75 data packets (14780 bytes) retransmitted
0 fast retransmitted packets
2774 ack-only packets (3298 delayed)
0 URG only packets
0 window probe packets
7 window update packets
90 control packets
0 packets hardware-checksummed
7564 packets received
2808 acks (for 391996 bytes)
783 duplicate acks
0 acks for unsent data
0 acks for old data
4699 packets (347442 bytes) received in-sequence
435 completely duplicate packets (18997 bytes)
0 old duplicate packets
0 packets with some duplicate data (0 bytes duplicated)
47 out-of-order packets (1404 bytes)
0 packets (0 bytes) of data after window
0 window probes
8 window update packets
37 packets received after close
0 discarded for bad checksums
0 discarded for bad header offset fields
0 discarded because packet too short
0 discarded for missing IPsec protection
0 discarded due to memory shortage
7480 packets hardware-checksummed
2 bad/missing md5 checksums
800 good md5 checksums
29 connection requests
62 connection accepts
76 connections established (including accepts)
112 connections closed (including 3 drops)
0 connections drained
7 embryonic connections dropped
1700 segments updated rtt (of 1664 attempts)
101 retransmit timeouts
3 connections dropped by rexmit timeout
0 persist timeouts
4 keepalive timeouts
0 keepalive probes sent
3 connections dropped by keepalive
1 correct ACK header prediction
2301 correct data packet header predictions
1618 PCB cache misses
0 ECN connections accepted
0 ECE packets received
0 CWR packets received
0 CE packets received
0 ECT packets sent
0 ECE packets sent
0 CWR packets sent
cwr by fastrecovery: 51
cwr by timeout: 101
cwr by ecn: 0
1065 bad connection attempts
245 SYN cache entries added
0 hash collisions
62 completed
0 aborted (no space to build PCB)
172 timed out
0 dropped due to overflow
0 dropped due to bucket overflow
4 dropped due to RST
0 dropped due to ICMP unreachable
725 SYN,ACKs retransmitted
182 duplicate SYNs received for entries already in the cache
0 SYNs dropped (no route or no space)
51 SACK recovery episodes
229 segment rexmits in SACK recovery episodes
18668 byte rexmits in SACK recovery episodes
449 SACK options received
26 SACK options sent
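
The two md5 checksum counters are the lines most relevant to this problem. As a minimal sketch, assuming the output format shown above, the following Python pulls just those two numbers out of saved "netstat -sptcp" text; the function name is hypothetical.

```python
import re

def md5_counters(netstat_text):
    """Return {'bad': n, 'good': n} from `netstat -sptcp` output text."""
    counters = {}
    for line in netstat_text.splitlines():
        m = re.match(r"\s*(\d+) (bad/missing|good) md5 checksums", line)
        if m:
            key = "bad" if m.group(2) == "bad/missing" else "good"
            counters[key] = int(m.group(1))
    return counters

# A few lines copied from the output above.
sample = """\
7480 packets hardware-checksummed
2 bad/missing md5 checksums
800 good md5 checksums
29 connection requests
"""

print(md5_counters(sample))
```

Taking one snapshot before clearing the session from the Cisco side and one after, then comparing the "bad" value, would show whether the OpenBSD side is receiving unsigned or badly signed segments while the session is stuck.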

I hope this helps a bit more. In any case, it's now been more than 30
minutes and neither 3.7 nor current has recovered or established a
session. From this stage, the only way to establish a session is to clear
from the Cisco side and, while the session is in Active mode, before it
gets to the OpenSent stage, clear the bgpd side. The session then comes up
right away, but only if done in that order.

Now I need to get some sleep...

Daniel

Daniel Ouellet

2005-10-18 21:09:48 UTC

Hi all,

Here is my latest update on this one, and a workaround as well. Not
great, but it works for now until this bug is fixed.

To reproduce the problem, you only need to enable:

ip tcp selective-ack

on your Cisco router. As soon as you clear, from the Cisco side, a BGP
session set up with MD5 to your OpenBSD box, regardless of OS version and
even on current, it never comes back to life. The only way to recover is
to clear on your Cisco and, while it is in Idle mode, clear from the
OpenBSD side; then and only then will the session come back up.

However, you will still have LOTS of error messages in your logs
regarding this MD5 session. These don't go away until a reload is done,
so on a busy network this is neither friendly nor practical.

*** This bug ONLY shows up when MD5 is configured WITH ip tcp
selective-ack ***

Without MD5, it works very well, thank you! Maybe the same bug is there
and just doesn't affect the session; that's possible, but I don't know.
My tests haven't shown that to be true so far anyway.

I have been looking at the code for a few days and, I have to admit, I
get lost at times trying to follow it. But it looks to me like the problem
is in either tcp_input.c or tcp_output.c, most likely in tcp_input.c, in
the section that processes a reset received from the remote end. It also
has to involve "TCP_SIGNATURE" being enabled, so I would assume it has to
be something common between the two, but that's just a guess for now.
Looking at the TCP standard from September 1981 (RFC 793), pages 65 to 73,
on how the processing should be done, it looks like it might be there, but
I still haven't fully understood that yet. tcp_input.c follows it very
strictly, but a step has to be omitted someplace and I can't put my finger
on it yet. One possibility is that the reply to the remote reset is an ACK
without the MD5 signature in the packet, but again, I'm not sure of that.
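
One way to test the "ACK without MD5" guess on a captured segment is to walk its TCP options and look for option kind 19, the MD5 signature option from RFC 2385 (always 18 bytes long). The Python below is only a sketch under the assumption that the raw option bytes have already been extracted from a packet capture; the function name is hypothetical and this is not how the kernel does it.

```python
MD5SIG_KIND = 19   # TCP MD5 signature option (RFC 2385)
MD5SIG_LEN = 18    # kind + length + 16-byte digest
EOL, NOP = 0, 1    # single-byte options without a length field

def has_md5_option(options: bytes) -> bool:
    """Return True if the TCP options contain a well-formed MD5 option."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == EOL:
            break                      # end of option list
        if kind == NOP:
            i += 1                     # padding, no length byte
            continue
        if i + 1 >= len(options):
            break                      # truncated option
        length = options[i + 1]
        if length < 2:
            break                      # malformed length, stop parsing
        if kind == MD5SIG_KIND:
            return length == MD5SIG_LEN and i + length <= len(options)
        i += length
    return False

# Made-up option bytes for illustration: NOP, NOP, then one option.
signed = bytes([1, 1, 19, 18]) + bytes(16)     # MD5 signature present
sack_only = bytes([1, 1, 5, 10]) + bytes(8)    # one SACK block, no MD5

print(has_md5_option(signed), has_md5_option(sack_only))
```

If the segments sent in reply to the Cisco reset come back False while the rest of the session comes back True, that would support the guess above; it would not by itself say where in tcp_input.c or tcp_output.c the signature is being left out.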

Why? There is no problem setting up the session at the start; the problem
only shows when a reset is received, at which point the remote end expects
the ACK with MD5, doesn't get it, and stays stuck in FINWAIT1 mode
forever. The OpenBSD side shows a connected state, but the remote end
shows the OpenSent state and stays there.

The workaround I use for now is to compile a kernel with

option TCP_SACK # Selective Acknowledgements for TCP

disabled. Not great, I have to admit, but I do not control the remote end
of multiple peers, and some may actually use the "ip tcp selective-ack"
feature on their routers to get more efficiency out of them. I would be
the one impacted by this, and I can't really see myself telling them not
to use it because I have a bug on my side.

So, for now, I simply compile a kernel with TCP_SACK disabled; no
selective acknowledgment is then in use and peer sessions with MD5 do not
suffer from this bug.

So, if anyone is actually using bgpd on their network AND also uses MD5,
I would recommend using, for now, a kernel without "TCP_SACK" enabled if
they do not want their BGP sessions going dead after a reset from the
remote end and needing manual intervention on both sides to get them back
up. If you are 100% sure that none of your peers use this feature, then
you are home free and don't need to change anything!

Hope this helps some; it sure helped me. I got stuck on this one and
lost a few hairs in the process. (;>

Maybe someone with a better understanding of the process, and especially
of the tcp_input.c file, will find the reason for this. If someone does
find the problem, I would love, if I may ask and if time allows, a bit of
feedback on how it was solved, as I would like to learn from it. I think I
am getting close, but I can't put my finger on it yet. So, learning from
it would be greatly appreciated, if you would be so kind!

Regards,

Daniel