BGP State Machine

Thomas Mangin edited this page Nov 15, 2025 · 3 revisions

BGP Finite State Machine

Understanding BGP's RFC 4271 State Machine

The BGP Finite State Machine (FSM) defines how a BGP session progresses through connection establishment, capability negotiation, and route exchange. Knowing the FSM is essential for troubleshooting BGP sessions and for interpreting ExaBGP's behavior.



Overview

Reference: RFC 4271 Section 8 - BGP Finite State Machine

The BGP FSM consists of six states that a BGP session transitions through:

┌──────┐  Start   ┌─────────┐  TCP success   ┌──────────┐
│ Idle │─────────>│ Connect │───────────────>│ OpenSent │
└──────┘          └─────────┘                └──────────┘
   ^                │     ^                    ^      │
   │       TCP fail │     │ ConnectRetry       │      │ Valid OPEN
   │                v     │ expires            │      v
   │              ┌─────────┐   TCP success    │   ┌─────────────┐
   │              │ Active  │──────────────────┘   │ OpenConfirm │
   │              └─────────┘                      └─────────────┘
   │                                                  │ KEEPALIVE received
   │                                                  v
   │      (any error, from any state)              ┌─────────────┐
   └───────────────────────────────────────────────│ Established │  (routes exchanged)
                                                   └─────────────┘

State Machine States

Summary Table

| State       | Number | Description                                          | Next State (Success) |
|-------------|--------|------------------------------------------------------|----------------------|
| Idle        | 1      | Initial state, awaiting Start event                  | Connect              |
| Connect     | 2      | Attempting TCP connection                            | OpenSent             |
| Active      | 3      | Waiting for a TCP connection after a failed attempt  | OpenSent             |
| OpenSent    | 4      | TCP connected, OPEN message sent                     | OpenConfirm          |
| OpenConfirm | 5      | OPEN received, KEEPALIVE sent                        | Established          |
| Established | 6      | Peering session active, routes exchanged             | (Session active)     |
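
The table above can be captured directly in code. The following is a minimal sketch, not ExaBGP's implementation: the names `State`, `NEXT_ON_SUCCESS`, and `success_path` are hypothetical and chosen only for illustration.

```python
from enum import IntEnum


class State(IntEnum):
    """The six BGP FSM states, numbered as in the summary table."""
    IDLE = 1
    CONNECT = 2
    ACTIVE = 3
    OPENSENT = 4
    OPENCONFIRM = 5
    ESTABLISHED = 6


# Next state on the success path, taken from the table's last column.
# ESTABLISHED has no entry because the session simply stays up.
NEXT_ON_SUCCESS = {
    State.IDLE: State.CONNECT,
    State.CONNECT: State.OPENSENT,
    State.ACTIVE: State.OPENSENT,
    State.OPENSENT: State.OPENCONFIRM,
    State.OPENCONFIRM: State.ESTABLISHED,
}


def success_path(start: State) -> list[State]:
    """Follow success transitions from `start` until Established."""
    path = [start]
    while path[-1] in NEXT_ON_SUCCESS:
        path.append(NEXT_ON_SUCCESS[path[-1]])
    return path
```

Calling `success_path(State.IDLE)` walks Idle, Connect, OpenSent, OpenConfirm, Established; Active never appears because it is only reached after a failure.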

State Descriptions

1. Idle

BGP waits for a Start event, normally initiated by an operator who is establishing a new BGP session or resetting an existing one. After an error, BGP falls back to the Idle state.

Characteristics:

  • No resources allocated
  • No TCP connection
  • Waiting for administrative or automatic start

Transitions: After a Start event, BGP:

  1. Initializes all BGP resources
  2. Starts the ConnectRetryTimer
  3. Initiates TCP transport connection
  4. Listens for connections initiated by remote peer
  5. Moves to Connect state

Triggers:

  • Manual BGP session start
  • Automatic restart after error
  • Configuration change

2. Connect

BGP is waiting for the TCP connection to complete.

Characteristics:

  • TCP connection in progress
  • ConnectRetryTimer running
  • Listening for incoming connections

Possible Transitions:

  • TCP connection succeeds → OpenSent state, send OPEN message
  • TCP connection fails → Active state
  • ConnectRetryTimer expires → Remain in Connect, reset timer, retry connection
  • Other events → Idle state

Typical Duration: Milliseconds to seconds (depending on network latency)


3. Active

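BGP is trying to acquire a peer by listening for, and accepting, an incoming TCP connection. Despite the name, RFC 4271 does not have BGP initiate outbound connections in this state; that resumes once the FSM returns to Connect.

Characteristics:

  • Previous TCP connection attempt failed
  • Waiting for the peer to connect, or for ConnectRetryTimer to expire
  • Still listening for incoming connections
  • ConnectRetryTimer running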

Possible Transitions:

  • TCP connection succeeds → OpenSent state, send OPEN message
  • ConnectRetryTimer expires → Restart timer, fall back to Connect state
  • Incoming connection from peer → OpenSent state
  • Other events → Idle state

Common Causes:

  • Network unreachable
  • Firewall blocking connection
  • Peer not listening
  • Incorrect peer IP/port

4. OpenSent

TCP connection established, OPEN message sent, waiting for OPEN from peer.

Characteristics:

  • TCP session active
  • OPEN message sent to peer
  • Waiting for peer's OPEN message
  • Validating peer's OPEN parameters

OPEN Message Validation: BGP checks:

  • BGP version (must be 4)
  • Peer AS number (matches configuration)
  • BGP Identifier (valid, non-zero)
  • Optional Parameters (capabilities)
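
The checks listed above can be sketched as a small validation function. This is an illustrative sketch under stated assumptions: `OpenMessage` and `validate_open` are hypothetical names, the "same as local" identifier check is one assumed interpretation of a duplicate identifier, and capability negotiation is left out entirely.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass
class OpenMessage:
    """Hypothetical container for the OPEN fields discussed above."""
    version: int
    asn: int
    bgp_identifier: IPv4Address


def validate_open(msg: OpenMessage, expected_asn: int,
                  local_identifier: IPv4Address) -> list[str]:
    """Return a list of problems; an empty list means the OPEN is acceptable."""
    problems = []
    if msg.version != 4:
        problems.append("unsupported version")
    if msg.asn != expected_asn:
        problems.append("bad peer AS")
    if msg.bgp_identifier == IPv4Address("0.0.0.0"):
        problems.append("bad BGP identifier (zero)")
    elif msg.bgp_identifier == local_identifier:
        # Assumption: an identifier equal to our own is treated as a duplicate.
        problems.append("bad BGP identifier (same as local)")
    return problems
```

Returning every problem rather than stopping at the first makes the result easier to log; a real speaker reports a single error in its NOTIFICATION.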

Possible Transitions:

  • Valid OPEN received → OpenConfirm state, send KEEPALIVE
  • Invalid OPEN received → Send NOTIFICATION, return to Idle
  • TCP connection fails → Active state (RFC 4271 restarts ConnectRetryTimer and keeps listening)
  • HoldTimer expires → Send NOTIFICATION, return to Idle

5. OpenConfirm

OPEN messages exchanged, KEEPALIVE sent, waiting for KEEPALIVE from peer.

Characteristics:

  • Both OPEN messages validated
  • KEEPALIVE sent to peer
  • Waiting for peer's KEEPALIVE
  • Almost ready for route exchange

Possible Transitions:

  • KEEPALIVE received → Established state (session UP!)
  • TCP connection fails → Idle state
  • NOTIFICATION received → Idle state
  • HoldTimer expires → Send NOTIFICATION, return to Idle

Typical Duration: Very brief (milliseconds)


6. Established

BGP session is fully established and operational.

Characteristics:

  • Peering session active
  • Routes being exchanged (UPDATE messages)
  • KEEPALIVE messages sent periodically
  • Monitoring peer liveness

Activities:

  • Receive UPDATEs: Process route advertisements
  • Send UPDATEs: Announce local routes
  • Send KEEPALIVEs: Maintain session liveness
  • Monitor HoldTimer: Detect peer failure

Possible Transitions:

  • NOTIFICATION received → Idle state
  • TCP connection fails → Idle state
  • HoldTimer expires → Send NOTIFICATION, return to Idle
  • Administrative shutdown → Send NOTIFICATION, return to Idle
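
Liveness monitoring in the Established state comes down to one timer that is reset whenever the peer sends a KEEPALIVE or UPDATE. The sketch below is a hypothetical `HoldTimer` class, not ExaBGP's code. It takes an injectable clock so it can be exercised without waiting, and it relies on two points from RFC 4271: a hold time of 0 disables the timer, and one third of the hold time is a reasonable KEEPALIVE interval.

```python
import time


class HoldTimer:
    """Tracks peer liveness: expires if nothing arrives within hold_time seconds."""

    def __init__(self, hold_time: float, clock=time.monotonic):
        self.hold_time = hold_time
        self._clock = clock
        self._last_seen = clock()

    def message_received(self) -> None:
        """Call on every KEEPALIVE or UPDATE from the peer."""
        self._last_seen = self._clock()

    def expired(self) -> bool:
        # A hold time of 0 means the timer is disabled.
        if self.hold_time == 0:
            return False
        return self._clock() - self._last_seen >= self.hold_time

    def keepalive_interval(self) -> float:
        """One third of the hold time, the interval RFC 4271 suggests."""
        return self.hold_time / 3
```

With a 90 second hold time this gives a 30 second KEEPALIVE interval, so the peer can miss two KEEPALIVEs before the session drops to Idle.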

State Transitions

State Transition Events

| Event                      | Description                                           |
|----------------------------|-------------------------------------------------------|
| Start                      | Manual/automatic session initialization               |
| TCP Connection Succeeds    | TCP three-way handshake succeeds                      |
| TCP Connection Fails       | TCP connection refused/times out                      |
| BGP Open (valid)           | Valid OPEN message received                           |
| BGP Open (invalid)         | Invalid OPEN message received                         |
| BGP Header Error           | Malformed BGP message header                          |
| Keepalive Message          | KEEPALIVE received                                    |
| Update Message             | UPDATE received                                       |
| Notification Message       | NOTIFICATION received (error)                         |
| Hold Timer Expires         | No KEEPALIVE or UPDATE received within the hold time  |
| ConnectRetry Timer Expires | Connection retry timeout                              |
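
One way to reason about these events is a lookup table keyed by (state, event). The sketch below is a simplification with hypothetical names (`Event`, `TRANSITIONS`, `next_state`, `run`). It covers only the transitions described on this page, treats every unlisted combination as an error that returns to Idle, and follows RFC 4271 in sending a TCP failure in OpenSent back to Active.

```python
from enum import Enum, auto


class Event(Enum):
    START = auto()
    TCP_OK = auto()
    TCP_FAIL = auto()
    OPEN_VALID = auto()
    OPEN_INVALID = auto()
    KEEPALIVE = auto()
    UPDATE = auto()
    NOTIFICATION = auto()
    HOLD_TIMER_EXPIRES = auto()
    CONNECT_RETRY_EXPIRES = auto()


# (state, event) -> next state, following the per-state transition lists.
TRANSITIONS = {
    ("idle", Event.START): "connect",
    ("connect", Event.TCP_OK): "opensent",
    ("connect", Event.TCP_FAIL): "active",
    ("connect", Event.CONNECT_RETRY_EXPIRES): "connect",
    ("active", Event.TCP_OK): "opensent",
    ("active", Event.CONNECT_RETRY_EXPIRES): "connect",
    ("opensent", Event.OPEN_VALID): "openconfirm",
    # RFC 4271: a TCP failure in OpenSent returns to Active, not Idle.
    ("opensent", Event.TCP_FAIL): "active",
    ("openconfirm", Event.KEEPALIVE): "established",
    ("established", Event.KEEPALIVE): "established",
    ("established", Event.UPDATE): "established",
}


def next_state(state: str, event: Event) -> str:
    """Anything not listed is treated as an error and falls back to Idle."""
    return TRANSITIONS.get((state, event), "idle")


def run(events, state="idle"):
    """Apply events in order and return every state visited."""
    visited = [state]
    for event in events:
        state = next_state(state, event)
        visited.append(state)
    return visited
```

Feeding `run` the sequence Start, TCP OK, valid OPEN, KEEPALIVE reproduces the successful establishment path; replacing the valid OPEN with an invalid one ends in Idle.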

Common State Transition Scenarios

Successful Session Establishment

Idle → Connect → OpenSent → OpenConfirm → Established
      (Start)  (TCP OK)    (OPEN OK)     (KEEPALIVE)

Timeline (typical):

  1. Idle: 0ms - Administrator enables BGP
  2. Connect: 0-100ms - TCP SYN, SYN-ACK, ACK
  3. OpenSent: 0-10ms - Send OPEN, receive OPEN
  4. OpenConfirm: 0-5ms - Send KEEPALIVE, receive KEEPALIVE
  5. Established: Session active

Total time: 10ms - 200ms (typical)


Connection Failure (Retry)

Idle → Connect → Active → Connect → Active → ...
      (Start)  (TCP fail) (Retry)  (Fail)

Behavior:

  • Oscillates between Connect and Active states
  • ConnectRetryTimer governs retry interval
  • Continues until a connection succeeds or the session is administratively stopped
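
The oscillation can be made concrete with a small generator. This is a deliberately simplified sketch with a hypothetical name (`connect_attempts`): it models one state change per failed TCP attempt and one per ConnectRetryTimer expiry, and ignores timing and incoming connections.

```python
def connect_attempts(outcomes):
    """Yield the states visited while TCP attempts fail, then succeed.

    `outcomes` is an iterable of booleans, one per outbound TCP attempt
    (True means the connection was established).
    """
    yield "idle"
    yield "connect"
    for ok in outcomes:
        if ok:
            yield "opensent"
            return
        yield "active"   # TCP attempt failed
        yield "connect"  # ConnectRetryTimer expired, try again
```

Two failures followed by a success produce idle, connect, active, connect, active, connect, opensent, which is the pattern seen in logs when a peer comes up late.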

OPEN Message Rejected

Idle → Connect → OpenSent → Idle
      (Start)  (TCP OK)    (Bad OPEN, NOTIFICATION)

Common Reasons:

  • AS number mismatch
  • Unsupported BGP version
  • Bad BGP Identifier (0.0.0.0 or duplicate)
  • Unsupported capabilities

Session Error After Established

Idle → ... → Established → Idle
                         (NOTIFICATION or TCP fail)

Common Causes:

  • Invalid UPDATE message
  • HoldTimer expiration (peer dead)
  • TCP connection lost
  • Administrative shutdown

ExaBGP FSM Behavior

ExaBGP State Machine Features

Standard Compliance:

  • ✅ Fully RFC 4271 compliant FSM
  • ✅ All six states implemented
  • ✅ Proper timer handling (HoldTimer, ConnectRetryTimer)

ExaBGP-Specific Behavior:

  • API Control: Your program can trigger Start events
  • Graceful Restart: RFC 4724 modifies FSM behavior around session restarts
  • Connection Collision Detection: RFC 4271 Section 6.8
  • API Notifications: ExaBGP sends FSM state changes to your program

State Change Notifications (Text API):

neighbor 192.0.2.1 state idle
neighbor 192.0.2.1 state connect
neighbor 192.0.2.1 state active
neighbor 192.0.2.1 state opensent
neighbor 192.0.2.1 state openconfirm
neighbor 192.0.2.1 state established
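
An API program could pick these lines out of its input with a small parser. The sketch below assumes the exact line shape shown above (`neighbor <address> state <name>`); `parse_state_line` is a hypothetical helper, not part of ExaBGP.

```python
import re

# Matches the line shape shown above: "neighbor <address> state <name>".
STATE_LINE = re.compile(r"^neighbor\s+(\S+)\s+state\s+(\w+)$")


def parse_state_line(line: str):
    """Return (peer, state) for a state-change line, or None for anything else."""
    match = STATE_LINE.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)
```

In a real API program the lines would typically come from iterating over `sys.stdin`, with non-matching lines passed on to other handlers.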

JSON API:

{
  "exabgp": "5.0",
  "time": 1699564800,
  "neighbor": {
    "address": {"peer": "192.0.2.1"},
    "state": "established"
  },
  "type": "state"
}
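
The JSON form is easier to consume because no pattern matching is needed. A minimal sketch, assuming messages shaped exactly like the example above; `peer_state` is a hypothetical helper name.

```python
import json


def peer_state(raw: str):
    """Return (peer, state) from a JSON state message, or None for other types."""
    message = json.loads(raw)
    if message.get("type") != "state":
        return None
    neighbor = message["neighbor"]
    return neighbor["address"]["peer"], neighbor["state"]
```

Checking the `type` field first lets the same reader loop ignore messages that carry no state information.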

Troubleshooting with State Machine Knowledge

Session Stuck in "Connect" State

Symptom: BGP stays in Connect state

Possible Causes:

  • Firewall blocking TCP/179
  • Peer not reachable (routing issue)
  • Peer not listening on TCP/179
  • Wrong IP address configured

Debugging:

# Check TCP connectivity
telnet 192.0.2.1 179

# Check ExaBGP logs
env exabgp.log.level=DEBUG exabgp /etc/exabgp/exabgp.conf

# Look for:
# "connecting to peer 192.0.2.1"
# "connection refused" or "connection timeout"
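
The same TCP reachability check can be scripted, which helps when telnet is not installed or when the check needs to run from monitoring. A minimal sketch using only the standard library; `tcp_port_open` is a hypothetical helper name, and a successful connect only proves that something accepts TCP on that port, not that it speaks BGP.

```python
import socket


def tcp_port_open(host: str, port: int = 179, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `tcp_port_open("192.0.2.1")` returning False points at the firewall, routing, or listener causes listed above rather than at a BGP-level problem.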

Session Stuck in "Active" State

Symptom: BGP oscillates between Connect and Active

Possible Causes:

  • Same causes as for the Connect state
  • TCP connection fails repeatedly

Debugging:

# Check if peer is trying to connect to you
tcpdump -i eth0 'tcp port 179'

# Check firewall
iptables -L -n | grep 179

Session Stuck in "OpenSent" State

Symptom: TCP connected, but no OPEN received

Possible Causes:

  • Peer waiting for OPEN from you (version mismatch)
  • Peer rejecting your OPEN (sending NOTIFICATION)
  • HoldTimer too short

Debugging:

# ExaBGP logs show OPEN rejection reason
env exabgp.log.level=DEBUG exabgp /etc/exabgp/exabgp.conf

# Look for:
# "received notification"
# "bad peer as" or "unsupported version"

Common Errors:

  • AS mismatch: peer-as 65001 in config doesn't match peer's AS
  • Router-ID conflict: Duplicate router-id (not unique)
  • Capability mismatch: Peer doesn't support required capability

Session Flapping (Established → Idle)

Symptom: Session goes Established → Idle repeatedly

Possible Causes:

  • HoldTimer expiration (peer not sending KEEPALIVEs)
  • Invalid UPDATE messages
  • TCP connection instability
  • Your API program crashes

Debugging:

# Check for NOTIFICATION messages
env exabgp.log.level=DEBUG exabgp /etc/exabgp/exabgp.conf

# Look for:
# "received notification"
# "hold timer expired"
# "invalid update message"

# Check if your API process is crashing
ps aux | grep your-api-program
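
To quantify flapping, it can help to count how often a peer leaves the Established state. The sketch below works on an ordered list of state names, however they were collected; `count_flaps` is a hypothetical helper, not an ExaBGP feature.

```python
def count_flaps(states):
    """Count transitions out of 'established' in an ordered list of state names."""
    flaps = 0
    previous = None
    for state in states:
        if previous == "established" and state != "established":
            flaps += 1
        previous = state
    return flaps
```

A count that grows at a regular interval close to the hold time suggests HoldTimer expiration, while irregular drops point more toward TCP instability or a crashing API process.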


References

  • RFC 4271 - A Border Gateway Protocol 4 (BGP-4)
    • Section 8: BGP Finite State Machine
    • Section 6.8: BGP Connection Collision Detection
  • RFC 4724 - Graceful Restart Mechanism for BGP
  • Netcraftsmen BGP FSM Article - Classic FSM reference
  • ExaBGP Source: src/exabgp/bgp/neighbor.py - FSM implementation

👻 Ghost written by Claude (Anthropic AI)
