PRP and HSR: Zero-Recovery Industrial Ethernet Protocols

Standard Ethernet redundancy protocols like RSTP or ERPS need time to detect a failure and reconverge – typically tens of milliseconds to seconds. For industrial systems where even a single lost frame is unacceptable (power grid protection relays, railway signaling, process control), IEC 62439-3 defines two protocols that provide zero-recovery-time redundancy: Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR).

Both protocols send duplicate frames over independent paths. The destination receives whichever copy arrives first and discards the duplicate. There is no failover delay because the redundant path is always active and carrying traffic.

How PRP Works

PRP operates on a dual-LAN architecture. Each PRP node (a Doubly Attached Node, or DAN) has two independent Ethernet ports, each connected to a separate physical network.

flowchart LR
    subgraph DAN_A["DAN A"]
        A_APP["Application"]
        A_LRE["Link Redundancy Entity"]
        A_P1["Port 1"]
        A_P2["Port 2"]
    end

    subgraph LAN_A["LAN A"]
        SA1["Switch A1"]
        SA2["Switch A2"]
    end

    subgraph LAN_B["LAN B"]
        SB1["Switch B1"]
        SB2["Switch B2"]
    end

    subgraph DAN_B["DAN B"]
        B_P1["Port 1"]
        B_P2["Port 2"]
        B_LRE["Link Redundancy Entity"]
        B_APP["Application"]
    end

    A_APP --> A_LRE
    A_LRE --> A_P1
    A_LRE --> A_P2

    A_P1 --> SA1
    SA1 --> SA2
    SA2 --> B_P1

    A_P2 --> SB1
    SB1 --> SB2
    SB2 --> B_P2

    B_P1 --> B_LRE
    B_P2 --> B_LRE
    B_LRE --> B_APP

When DAN A sends a frame, the Link Redundancy Entity (LRE) duplicates it. One copy goes through LAN A, the other through LAN B. At DAN B, the LRE accepts the first copy to arrive and drops the second, matching the two by the sequence number carried in the Redundancy Control Trailer (RCT) that the sender appends to each frame.

RCT Structure

PRP appends a 6-byte trailer to every Ethernet frame:

Field	Size	Description
Sequence Number	2 bytes	16-bit counter, incremented per frame
LAN Identifier	4 bits	`0xA` for LAN A, `0xB` for LAN B
LSDU Size	12 bits	Length of the payload
PRP Suffix	2 bytes	Fixed value `0x88FB`
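
As a rough illustration of the layout in the table, the sketch below packs and parses the trailer. It assumes network byte order and that the LAN identifier occupies the upper four bits of the middle 16-bit word; the helper names build_rct and parse_rct are made up for this example and are not part of any library.

```python
import struct

PRP_SUFFIX = 0x88FB
LAN_A, LAN_B = 0xA, 0xB

def build_rct(seq: int, lan_id: int, lsdu_size: int) -> bytes:
    """Pack the 6-byte RCT: sequence number, LAN ID + LSDU size, PRP suffix."""
    if lan_id not in (LAN_A, LAN_B):
        raise ValueError("LAN ID must be 0xA or 0xB")
    if not 0 <= lsdu_size < (1 << 12):
        raise ValueError("LSDU size must fit in 12 bits")
    # Assumption: LAN ID sits in the high 4 bits, LSDU size in the low 12 bits.
    lan_and_size = (lan_id << 12) | lsdu_size
    return struct.pack("!HHH", seq & 0xFFFF, lan_and_size, PRP_SUFFIX)

def parse_rct(frame: bytes):
    """Return (seq, lan_id, lsdu_size) if the frame ends in a PRP suffix, else None."""
    if len(frame) < 6:
        return None
    seq, lan_and_size, suffix = struct.unpack("!HHH", frame[-6:])
    if suffix != PRP_SUFFIX:
        return None
    return seq, lan_and_size >> 12, lan_and_size & 0x0FFF

rct = build_rct(seq=0x1234, lan_id=LAN_A, lsdu_size=52)
print(rct.hex())                     # 1234a03488fb
print(parse_rct(b"payload" + rct))   # (4660, 10, 52)
```

Because the trailer sits at the end of the frame, a device that does not understand PRP sees it as extra payload bytes, which is what makes the scheme compatible with standard equipment.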

The sequence number is the key to duplicate detection. Each receiving node maintains a per-source table of recently received sequence numbers. When a frame arrives, the LRE checks whether a frame with the same source address and sequence number has already been received. If yes, the frame is discarded. If no, it is recorded in the table and forwarded to the application.
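
The duplicate check can be sketched in a few lines. This is a simplified model, not how any particular LRE is implemented: the class name DuplicateFilter is hypothetical, and it bounds the table by entry count, whereas a real implementation would age entries out by time so that a wrapped 16-bit counter is not mistaken for a duplicate.

```python
from collections import OrderedDict

class DuplicateFilter:
    """Remembers recently seen (source MAC, sequence number) pairs."""

    def __init__(self, max_entries: int = 1024):
        self.max_entries = max_entries
        self._seen = OrderedDict()

    def accept(self, src_mac: str, seq: int) -> bool:
        """Return True for the first copy of a frame, False for its duplicate."""
        key = (src_mac, seq & 0xFFFF)
        if key in self._seen:
            return False
        self._seen[key] = True
        if len(self._seen) > self.max_entries:
            self._seen.popitem(last=False)  # forget the oldest entry
        return True

lre = DuplicateFilter()
print(lre.accept("00:11:22:33:44:55", 7))  # first copy (e.g. via LAN A): True
print(lre.accept("00:11:22:33:44:55", 7))  # second copy (via LAN B): False
print(lre.accept("00:11:22:33:44:55", 8))  # next frame: True
```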

Compatibility with Standard Devices

A major advantage of PRP is backward compatibility. Standard Ethernet devices (Singly Attached Nodes, or SANs) can connect to either LAN A or LAN B and communicate normally. They will not receive duplicate frames because they only see traffic from one network. They lose the redundancy benefit, but they remain reachable.

A RedBox (Redundancy Box) can bridge a SAN into both LANs, providing PRP redundancy to devices that do not natively support it.

How HSR Works

HSR operates on a ring topology. Each HSR node has two ports connected to adjacent nodes, forming a closed loop.

flowchart LR
    N1["Node 1 (DAN)"] <--> N2["Node 2 (DAN)"]
    N2 <--> N3["Node 3 (DAN)"]
    N3 <--> N4["Node 4 (DAN)"]
    N4 <--> N1

When Node 1 sends a frame, it injects a copy into each direction of the ring. One copy travels clockwise, the other counterclockwise. Each intermediate node forwards the frame to the next node while also checking if it is the intended destination. By default, a unicast frame is taken off the ring by its destination, while a multicast or broadcast frame completes the full loop and is removed by the source node when it returns.

HSR Tag

HSR inserts a 6-byte tag after the source MAC address:

Field	Size	Description
HSR EtherType	2 bytes	Fixed value `0x892F`
Network / Path	4 bits	Identifies ring and direction
LSDU Size	12 bits	Length of the payload
Sequence Number	2 bytes	16-bit counter for duplicate discard

The duplicate detection mechanism is the same as PRP – destination nodes use the sequence number and source MAC to identify and discard the second copy.
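
For comparison with the PRP trailer, the sketch below inserts and parses an HSR tag. It assumes an untagged frame (no 802.1Q header), network byte order, and the path identifier in the upper four bits of the second word. The function names are illustrative only, and the LSDU size is passed in by the caller rather than computed, since this sketch does not attempt to reproduce the standard's exact counting rule.

```python
import struct

HSR_ETHERTYPE = 0x892F

def insert_hsr_tag(frame: bytes, path_id: int, lsdu_size: int, seq: int) -> bytes:
    """Insert a 6-byte HSR tag after the destination and source MAC addresses."""
    path_and_size = ((path_id & 0xF) << 12) | (lsdu_size & 0x0FFF)
    tag = struct.pack("!HHH", HSR_ETHERTYPE, path_and_size, seq & 0xFFFF)
    # Bytes 0..11 are the two MAC addresses; the original EtherType follows the tag.
    return frame[:12] + tag + frame[12:]

def parse_hsr_tag(frame: bytes):
    """Return (path_id, lsdu_size, seq) for an HSR-tagged frame, else None."""
    if len(frame) < 18:
        return None
    ethertype, path_and_size, seq = struct.unpack("!HHH", frame[12:18])
    if ethertype != HSR_ETHERTYPE:
        return None
    return path_and_size >> 12, path_and_size & 0x0FFF, seq

dst = bytes.fromhex("ffffffffffff")
src = bytes.fromhex("001122334455")
frame = dst + src + b"\x08\x00" + b"payload"
tagged = insert_hsr_tag(frame, path_id=0, lsdu_size=len(frame) - 12, seq=42)
print(len(tagged) - len(frame))   # 6
print(parse_hsr_tag(tagged))
```

Placing the tag at the front of the frame lets a ring node recognize a duplicate as soon as the header has arrived, without waiting for the end of the frame.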

Ring Fault Behavior

If a link in the ring fails, frames traveling in the direction of the break never reach the destination via that path. The copy traveling the other direction still arrives without interruption. From the application perspective, nothing changes – data continues to flow through the surviving path with zero recovery time.

flowchart LR
    N1["Node 1"] <--> N2["Node 2"]
    N2 <--> N3["Node 3"]
    N3 x-.-x|"link failure"| N4["Node 4"]
    N4 <--> N1
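
The fault case in the diagram can be reproduced with a toy model of the ring. This is only a reachability sketch under simple assumptions: nodes are numbered from 0, a link is identified by the pair of nodes it joins, and the function name deliver is invented for the example.

```python
def deliver(ring_size, src, dst, broken_links=frozenset()):
    """Return the directions ('cw', 'ccw') whose copy of a frame reaches dst.

    Nodes are numbered 0..ring_size-1. Each broken link is given as a
    frozenset holding the two node numbers it connects.
    """
    arrived = []
    for name, step in (("cw", 1), ("ccw", -1)):
        node = src
        while node != dst:
            nxt = (node + step) % ring_size
            if frozenset((node, nxt)) in broken_links:
                break              # this copy is lost at the failed link
            node = nxt
        else:
            arrived.append(name)   # loop ended without hitting a broken link
    return arrived

# Four-node ring; Node 1..Node 4 in the diagram map to 0..3 here.
print(deliver(4, src=0, dst=2))                         # ['cw', 'ccw']
print(deliver(4, src=0, dst=2,
              broken_links={frozenset((2, 3))}))        # ['cw']
```

With the Node 3 to Node 4 link down, one copy still arrives, so the receiver simply has no duplicate to discard. A second break elsewhere in the ring would isolate part of it, which is why HSR tolerates a single fault per ring.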

Protocol Comparison

Feature	PRP	HSR
Topology	Dual independent LANs	Ring
Standard	IEC 62439-3 Clause 4	IEC 62439-3 Clause 5
Redundancy information	Trailer (RCT) appended to frame	Tag inserted after source MAC
Infrastructure	Two full networks with switches	Direct node-to-node ring, no switches
Bandwidth overhead	Minimal (trailer only)	Both copies circulate the ring, roughly halving usable bandwidth
Standard device support	SANs attach to either LAN directly	Requires a RedBox for non-HSR devices
Maximum nodes	No protocol limit (LAN capacity)	Practical limit due to ring latency
Cable requirement	Double (two separate LANs)	Single ring, but dual-port nodes
Frame processing	End nodes only	Every node in the ring

Supervision Frames

Both protocols use supervision frames to maintain the node table. Every PRP/HSR node periodically multicasts a supervision frame containing its MAC address and node identity. Other nodes use these frames to:

Build and maintain a list of active nodes in the network.
Detect when a node has left or failed (if no supervision frame is received within a timeout).
Verify that both paths are operational (a node should receive supervision frames from both LANs or both ring directions).

The default supervision interval (LifeCheckInterval) is 2 seconds. A node is removed from the table once nothing has been heard from it for the node forget time, which defaults to 60 seconds.
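
The bookkeeping behind these three uses can be sketched as a small table keyed by MAC address. The class and method names are hypothetical, timestamps are passed in explicitly to keep the example deterministic, and the timeout values in the usage lines are arbitrary rather than taken from the standard.

```python
class NodeTable:
    """Tracks when each node's supervision frame was last seen on port A and B."""

    def __init__(self, forget_after: float):
        self.forget_after = forget_after
        self._last_seen = {}   # mac -> {"A": timestamp, "B": timestamp}

    def on_supervision(self, mac: str, port: str, now: float) -> None:
        """Record a supervision frame from mac received on port 'A' or 'B'."""
        self._last_seen.setdefault(mac, {})[port] = now

    def expire(self, now: float) -> list:
        """Remove and return nodes silent on both ports for longer than forget_after."""
        gone = [mac for mac, seen in self._last_seen.items()
                if now - max(seen.values()) > self.forget_after]
        for mac in gone:
            del self._last_seen[mac]
        return gone

    def one_sided(self, now: float, stale_after: float) -> list:
        """Return (mac, port) for nodes heard recently on only one port."""
        result = []
        for mac, seen in self._last_seen.items():
            fresh = [p for p, t in seen.items() if now - t <= stale_after]
            if len(fresh) == 1:
                result.append((mac, fresh[0]))
        return result

table = NodeTable(forget_after=30.0)          # arbitrary value for the example
table.on_supervision("aa:bb:cc:dd:ee:01", "A", now=0.0)
table.on_supervision("aa:bb:cc:dd:ee:01", "B", now=0.0)
table.on_supervision("aa:bb:cc:dd:ee:02", "A", now=0.0)
print(table.one_sided(now=1.0, stale_after=5.0))  # [('aa:bb:cc:dd:ee:02', 'A')]
print(table.expire(now=31.0))                     # both nodes have timed out
```

A node reported by one_sided is still reachable, which is exactly why this check matters: without it, a failed path goes unnoticed until the second path fails too.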

When to Use PRP

PRP is the better choice when:

The network connects to devices that do not support PRP/HSR. SANs can attach to either LAN and communicate without any protocol awareness. A RedBox extends redundancy to these devices.
The network spans a large campus or multiple buildings. Each LAN can use its own switching infrastructure with independent spanning tree domains. There is no ring latency constraint.
High aggregate bandwidth is needed. Each LAN carries one copy of every frame at full link rate. Both LANs carry identical PRP traffic in normal operation, but, unlike an HSR ring, no capacity on either LAN is spent forwarding other nodes' frames through each end node.
Regulatory requirements demand dual-network isolation. Power utility substations following IEC 61850 commonly mandate physically separate networks.

When to Use HSR

HSR is the better choice when:

Infrastructure cost and cabling must be kept to a minimum. HSR requires no switches. Nodes connect directly to each other in a ring. This is simpler and cheaper for small to medium installations.
Space is constrained. In compact installations (e.g., inside a substation control cabinet), eliminating switches reduces the equipment footprint.
All nodes are HSR-capable. When every device in the network supports HSR natively, there is no need for RedBoxes and the ring topology is straightforward to implement.
The number of nodes is moderate. HSR works well with up to approximately 50 nodes. Beyond that, accumulated ring latency and bandwidth consumption become significant because every frame traverses the entire ring.
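
To get a feel for how ring latency accumulates with node count, the sketch below estimates the worst case, in which one link is broken and the surviving copy has to cross every remaining link. It assumes each node receives a frame completely before forwarding it (store-and-forward); nodes that forward in cut-through mode would be considerably faster. The function names and the residence_us parameter are illustrative.

```python
def serialization_delay_us(frame_bytes: int, link_mbps: float) -> float:
    """Time to clock one frame onto the wire, in microseconds."""
    return frame_bytes * 8 / link_mbps

def worst_case_ring_latency_us(nodes: int, frame_bytes: int, link_mbps: float,
                               residence_us: float = 0.0) -> float:
    """Latency when one link is broken and a frame must cross nodes-1 links.

    Assumes store-and-forward at every hop; residence_us is any extra
    per-node processing delay.
    """
    hops = nodes - 1
    return hops * (serialization_delay_us(frame_bytes, link_mbps) + residence_us)

# 50 nodes, maximum-size tagged frames (1524 bytes), 100 Mbit/s links
print(worst_case_ring_latency_us(50, 1524, 100))
```

Under these assumptions the result is close to 6 ms for a single full-size frame, before any queuing behind other traffic, which gives a sense of why large rings become a problem for time-critical protection traffic.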

Combining PRP and HSR

IEC 62439-3 allows PRP and HSR to coexist in the same network. A common architecture uses HSR rings at the process level (connecting IEDs, merging units, and protection relays) and PRP at the station level (connecting station computers and SCADA interfaces).

flowchart TD
    subgraph station["Station Level (PRP)"]
        SC["Station Computer"]
        HMI["HMI"]
    end

    subgraph LAN_A["LAN A"]
        SW_A["Switch A"]
    end

    subgraph LAN_B["LAN B"]
        SW_B["Switch B"]
    end

    subgraph bay1["Bay 1 (HSR Ring)"]
        RB1["RedBox"]
        IED1["Protection IED"]
        MU1["Merging Unit"]
    end

    subgraph bay2["Bay 2 (HSR Ring)"]
        RB2["RedBox"]
        IED2["Protection IED"]
        MU2["Merging Unit"]
    end

    SC -->|Port 1| SW_A
    SC -->|Port 2| SW_B
    HMI -->|Port 1| SW_A
    HMI -->|Port 2| SW_B

    SW_A --> RB1
    SW_B --> RB1
    SW_A --> RB2
    SW_B --> RB2

    RB1 --> IED1
    IED1 --> MU1
    MU1 --> RB1

    RB2 --> IED2
    IED2 --> MU2
    MU2 --> RB2

The RedBox acts as the bridge between the two domains. It is a DAN on the PRP side (connected to both LANs) and an HSR node on the ring side. Frames crossing the boundary are translated between PRP trailers and HSR tags.

Linux Configuration with ip

The Linux kernel includes an hsr network driver that implements both HSR and PRP. Both protocols are configured through the same ip link interface, with the proto parameter selecting the protocol.

Loading the Kernel Module

modprobe hsr

To load the module automatically at boot, add it to /etc/modules-load.d/:

echo hsr > /etc/modules-load.d/hsr.conf

Configuring HSR

Assume eth0 and eth1 are the two physical ports that will form the HSR ring. Both interfaces must be up before creating the HSR device.

ip link set eth0 up
ip link set eth1 up

ip link add name hsr0 type hsr slave1 eth0 slave2 eth1 \
    supervision 45 version 1 proto 0

ip addr add 192.168.1.10/24 dev hsr0
ip link set hsr0 up

The parameters:

slave1 and slave2 – the two physical interfaces forming the ring.
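supervision 45 – the last byte of the multicast MAC address used for supervision frames (0 to 255, default 0). With 45, supervision frames are sent to 01:15:4E:00:01:2D. This parameter does not set the supervision interval.
version 1 – HSR protocol version. 0 (the default) selects the 2010 edition of the standard and 1 selects the 2012 edition, which is the recommended version.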
proto 0 – selects the HSR protocol. This is the default and can be omitted.

Configuring PRP

PRP uses the same kernel driver. The only difference is proto 1:

ip link set eth0 up
ip link set eth1 up

ip link add name prp0 type hsr slave1 eth0 slave2 eth1 \
    supervision 45 proto 1

ip addr add 192.168.1.10/24 dev prp0
ip link set prp0 up

The version parameter does not apply to PRP. Omit it when proto 1 is set, because the kernel driver may reject the combination instead of ignoring it.

Viewing Interface Status

Show the HSR/PRP device details:

ip -d link show hsr0

Example output (illustrative; the exact fields printed depend on the iproute2 and kernel versions):

5: hsr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
    hsr slave1 eth0 slave2 eth1 supervision 45 version 1 proto 0

Managing the Node Table

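View the list of known nodes learned through supervision frames. iproute2 has no dedicated subcommand for the node table. On kernels built with debugfs support, the hsr driver exposes it as a debugfs file (the path below assumes debugfs is mounted at /sys/kernel/debug):

cat /sys/kernel/debug/hsr/hsr0/node_table

Example output, abridged to the two address columns: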

MAC-Address-A  MAC-Address-B
00:11:22:33:44:55 00:11:22:33:44:55
aa:bb:cc:dd:ee:01 aa:bb:cc:dd:ee:02

Each line shows the MAC addresses seen on Port A and Port B for a given node. If a node appears with a MAC on only one side, the link on the other side may be down.

Removing the Interface

ip link set hsr0 down
ip link del hsr0

This releases both slave interfaces back to normal operation.

Verifying Redundancy

A quick way to confirm that both paths are active is to watch counters while disconnecting one link:

ip -s link show hsr0

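The RX/TX counters on hsr0 should keep incrementing with no frame loss. To see which path is carrying the traffic, run the same command against eth0 and eth1: only the counters of the surviving port keep advancing.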

Practical Considerations

Frame size increase. PRP adds 6 bytes (RCT trailer) and HSR adds 6 bytes (HSR tag) to every frame. If the network runs at the maximum untagged Ethernet frame size (1518 bytes), devices and switches must accept frames of at least 1524 bytes. Ethernet does not fragment at layer 2, so an oversized frame is simply dropped. Most managed industrial switches handle this, but verify the configuration.
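
The arithmetic is simple, but it is easy to forget the VLAN tag. The sketch below adds the 4-byte 802.1Q tag as a further assumption beyond the text above; the constant and function names are illustrative.

```python
ETH_MAX_UNTAGGED = 1518    # 14-byte header + 1500-byte payload + 4-byte FCS
VLAN_TAG = 4               # 802.1Q tag, if the network uses VLANs (assumption)
REDUNDANCY_OVERHEAD = 6    # PRP RCT or HSR tag

def max_frame_size(vlan_tagged: bool) -> int:
    """Largest frame that devices on the redundant network must accept."""
    return ETH_MAX_UNTAGGED + (VLAN_TAG if vlan_tagged else 0) + REDUNDANCY_OVERHEAD

print(max_frame_size(vlan_tagged=False))  # 1524
print(max_frame_size(vlan_tagged=True))   # 1528
```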

Clock synchronization. Industrial networks using PRP or HSR often also carry IEEE 1588 (PTP) precision time synchronization. PTP frames need special treatment: the two copies of a PTP message experience different path delays, so they cannot be treated as interchangeable duplicates the way ordinary traffic is. IEC 62439-3 specifies how PTP is carried over PRP and HSR, and PTP boundary clocks or transparent clocks in the switching infrastructure must be aware of the dual-path topology to avoid timestamp inconsistencies.

Network management. Both protocols expose per-node statistics via standard MIBs defined in IEC 62439-3. Key counters to monitor include:

Duplicate frames discarded (confirms redundancy is active).
Supervision frame timeouts (indicates a node or link failure).
Frames received with wrong LAN ID (PRP only; usually indicates cabling crossed between LAN A and LAN B).

Testing redundancy. To verify zero-recovery behavior, disconnect one path and confirm that the application sees no frame loss. For PRP, disconnect one LAN entirely. For HSR, break one link in the ring. Traffic counters on the remaining path should show all frames delivered without interruption.