PRP and HSR: Zero-Recovery Industrial Ethernet Protocols
Standard Ethernet redundancy protocols like RSTP or ERPS need time to detect a failure and reconverge – typically tens of milliseconds to seconds. For industrial systems where even a single lost frame is unacceptable (power grid protection relays, railway signaling, process control), IEC 62439-3 defines two protocols that provide zero-recovery-time redundancy: Parallel Redundancy Protocol (PRP) and High-availability Seamless Redundancy (HSR).
Both protocols send duplicate frames over independent paths. The destination receives whichever copy arrives first and discards the duplicate. There is no failover delay because the redundant path is always active and carrying traffic.
How PRP Works
PRP operates on a dual-LAN architecture. Each PRP node (a Doubly Attached Node, or DAN) has two independent Ethernet ports, each connected to a separate physical network.
flowchart LR
subgraph DAN_A["DAN A"]
A_APP["Application"]
A_LRE["Link Redundancy Entity"]
A_P1["Port 1"]
A_P2["Port 2"]
end
subgraph LAN_A["LAN A"]
SA1["Switch A1"]
SA2["Switch A2"]
end
subgraph LAN_B["LAN B"]
SB1["Switch B1"]
SB2["Switch B2"]
end
subgraph DAN_B["DAN B"]
B_P1["Port 1"]
B_P2["Port 2"]
B_LRE["Link Redundancy Entity"]
B_APP["Application"]
end
A_APP --> A_LRE
A_LRE --> A_P1
A_LRE --> A_P2
A_P1 --> SA1
SA1 --> SA2
SA2 --> B_P1
A_P2 --> SB1
SB1 --> SB2
SB2 --> B_P2
B_P1 --> B_LRE
B_P2 --> B_LRE
B_LRE --> B_APP
When DAN A sends a frame, the Link Redundancy Entity (LRE) duplicates it. One copy goes through LAN A, the other through LAN B. At DAN B, the LRE accepts the first arriving copy and drops the second based on a sequence number appended as a Redundancy Control Trailer (RCT).
RCT Structure
PRP appends a 6-byte trailer to every Ethernet frame:
| Field | Size | Description |
|---|---|---|
| Sequence Number | 2 bytes | 16-bit counter, incremented per frame |
| LAN Identifier | 4 bits | 0xA for LAN A, 0xB for LAN B |
| LSDU Size | 12 bits | Length of the payload |
| PRP Suffix | 2 bytes | Fixed value 0x88FB |
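The trailer layout in the table can be sketched as a pair of helper functions. This is an illustrative sketch only: the names `pack_rct` and `parse_rct` are hypothetical, fields are assumed to be in network byte order, and a real LRE would also check the LSDU size against the actual frame length before trusting the trailer.

```python
import struct

PRP_SUFFIX = 0x88FB  # fixed suffix from the table above


def pack_rct(seq: int, lan_id: int, lsdu_size: int) -> bytes:
    """Build the 6-byte trailer: sequence number, LAN id + LSDU size, suffix."""
    if not 0 <= seq <= 0xFFFF:
        raise ValueError("sequence number must fit in 16 bits")
    if lan_id not in (0xA, 0xB):
        raise ValueError("LAN identifier must be 0xA or 0xB")
    if not 0 <= lsdu_size <= 0x0FFF:
        raise ValueError("LSDU size must fit in 12 bits")
    # The 4-bit LAN id and the 12-bit size share one 16-bit word.
    return struct.pack("!HHH", seq, (lan_id << 12) | lsdu_size, PRP_SUFFIX)


def parse_rct(frame: bytes):
    """Return (seq, lan_id, lsdu_size) if the frame ends in an RCT, else None."""
    if len(frame) < 6:
        return None
    seq, lan_and_size, suffix = struct.unpack("!HHH", frame[-6:])
    if suffix != PRP_SUFFIX:
        return None  # no PRP suffix: treat as an ordinary frame
    return seq, lan_and_size >> 12, lan_and_size & 0x0FFF
```

Checking the suffix first is what lets a receiver tell PRP frames apart from frames sent by ordinary devices, which carry no trailer.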
The sequence number is the key to duplicate detection. Each receiving node maintains a per-source table of recently received sequence numbers. When a frame arrives, the LRE checks whether a frame with the same source address and sequence number has already been received. If yes, the frame is discarded. If no, it is forwarded to the application.
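A minimal sketch of that lookup, assuming a fixed-depth history per source. The class name `DuplicateFilter` and the `window` size are hypothetical; real implementations also age entries out by time so that a wrapped 16-bit counter is not mistaken for a duplicate.

```python
from collections import OrderedDict


class DuplicateFilter:
    """Remembers the most recent sequence numbers seen from each source MAC."""

    def __init__(self, window: int = 64):
        self.window = window  # hypothetical history depth per source
        self.seen = {}        # source MAC -> OrderedDict of sequence numbers

    def accept(self, src_mac: str, seq: int) -> bool:
        """Return True for the first copy of a frame, False for its duplicate."""
        history = self.seen.setdefault(src_mac, OrderedDict())
        if seq in history:
            return False                 # second copy: discard
        history[seq] = None
        if len(history) > self.window:
            history.popitem(last=False)  # forget the oldest entry
        return True


lre = DuplicateFilter()
print(lre.accept("aa:bb:cc:dd:ee:01", 100))  # first copy, e.g. via LAN A
print(lre.accept("aa:bb:cc:dd:ee:01", 100))  # same frame arriving via LAN B
```

The first call returns True and the second False. It does not matter which LAN delivers first, because the lookup key is only the source address and the sequence number.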
Compatibility with Standard Devices
A major advantage of PRP is backward compatibility. Standard Ethernet devices (Singly Attached Nodes, or SANs) can connect to either LAN A or LAN B and communicate normally. They will not receive duplicate frames because they only see traffic from one network. They lose the redundancy benefit, but they remain reachable.
A RedBox (Redundancy Box) can bridge a SAN into both LANs, providing PRP redundancy to devices that do not natively support it.
How HSR Works
HSR operates on a ring topology. Each HSR node has two ports connected to adjacent nodes, forming a closed loop.
flowchart LR
N1["Node 1 (DAN)"] <--> N2["Node 2 (DAN)"]
N2 <--> N3["Node 3 (DAN)"]
N3 <--> N4["Node 4 (DAN)"]
N4 <--> N1
When Node 1 sends a frame, it injects a copy into each direction of the ring. One copy travels clockwise, the other counterclockwise. Each intermediate node forwards the frame to the next node while also checking if it is the intended destination. When the frame completes the full loop back to the source, the source node removes it.
HSR Tag
HSR inserts a 6-byte tag after the source MAC address:
| Field | Size | Description |
|---|---|---|
| HSR EtherType | 2 bytes | Fixed value 0x892F |
| Network / Path | 4 bits | Identifies ring and direction |
| LSDU Size | 12 bits | Length of the payload |
| Sequence Number | 2 bytes | 16-bit counter for duplicate discard |
The duplicate detection mechanism is the same as PRP – destination nodes use the sequence number and source MAC to identify and discard the second copy.
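The tag layout can be sketched in the same way as the PRP trailer. The function names below are hypothetical, the frame is assumed to carry no 802.1Q VLAN tag, fields are assumed to be in network byte order, and the LSDU size is passed in by the caller rather than computed, since its exact definition is left to the standard.

```python
import struct

HSR_ETHERTYPE = 0x892F  # fixed EtherType from the table above
MAC_HEADER_LEN = 12     # destination MAC (6 bytes) + source MAC (6 bytes)


def insert_hsr_tag(frame: bytes, path: int, lsdu_size: int, seq: int) -> bytes:
    """Insert the 6-byte HSR tag directly after the source MAC address."""
    if len(frame) < MAC_HEADER_LEN + 2:
        raise ValueError("frame too short to be an Ethernet frame")
    if not 0 <= path <= 0xF:
        raise ValueError("path must fit in 4 bits")
    if not 0 <= lsdu_size <= 0x0FFF:
        raise ValueError("LSDU size must fit in 12 bits")
    if not 0 <= seq <= 0xFFFF:
        raise ValueError("sequence number must fit in 16 bits")
    tag = struct.pack("!HHH", HSR_ETHERTYPE, (path << 12) | lsdu_size, seq)
    return frame[:MAC_HEADER_LEN] + tag + frame[MAC_HEADER_LEN:]


def parse_hsr_tag(frame: bytes):
    """Return (path, lsdu_size, seq) for an HSR-tagged frame, else None."""
    if len(frame) < MAC_HEADER_LEN + 6:
        return None
    ethertype, path_and_size, seq = struct.unpack(
        "!HHH", frame[MAC_HEADER_LEN:MAC_HEADER_LEN + 6])
    if ethertype != HSR_ETHERTYPE:
        return None
    return path_and_size >> 12, path_and_size & 0x0FFF, seq
```

Because the tag sits at the front of the frame, a ring node can read the sequence number and start forwarding before the whole frame has arrived, which the PRP trailer does not allow.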
Ring Fault Behavior
If a link in the ring fails, frames traveling in the direction of the break never reach the destination via that path. The copy traveling the other direction still arrives without interruption. From the application perspective, nothing changes – data continues to flow through the surviving path with zero recovery time.
flowchart LR
N1["Node 1"] <--> N2["Node 2"]
N2 <--> N3["Node 3"]
N3 x-.-x|"link failure"| N4["Node 4"]
N4 <--> N1
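The fault behavior can be checked with a small simulation. This is a toy model, not an HSR implementation: nodes are numbered from 0, the function name `deliveries` is hypothetical, and only unicast reachability is modeled (no timing, tags, or duplicate discard).

```python
def deliveries(n_nodes, src, dst, broken_links=frozenset()):
    """Return the directions ("cw", "ccw") whose copy reaches dst.

    Nodes are numbered 0..n_nodes-1 around the ring; a link is a frozenset
    holding the two adjacent node numbers it connects.
    """
    arrived = []
    for name, step in (("cw", 1), ("ccw", -1)):
        node = src
        while True:
            nxt = (node + step) % n_nodes
            if frozenset((node, nxt)) in broken_links:
                break              # this copy is lost at the broken link
            node = nxt
            if node == dst:
                arrived.append(name)
                break
            if node == src:
                break              # full loop completed without reaching dst
    return arrived


# Four-node ring as in the diagram: Node 1..4 map to 0..3.
print(deliveries(4, src=0, dst=2))
print(deliveries(4, src=0, dst=2, broken_links={frozenset((2, 3))}))
```

With the ring intact both copies arrive; with the Node 3 to Node 4 link broken only the clockwise copy does, which is still enough for delivery. A second simultaneous break can leave no surviving path, so the zero-recovery guarantee covers a single fault.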
Protocol Comparison
| Feature | PRP | HSR |
|---|---|---|
| Topology | Dual independent LANs | Ring |
| Standard | IEC 62439-3 Clause 4 | IEC 62439-3 Clause 5 |
| Redundancy information | Trailer (RCT) appended to frame | Tag inserted after source MAC |
| Infrastructure | Two full networks with switches | Direct node-to-node ring, no switches |
| Bandwidth overhead | Minimal (trailer only) | Both copies of each frame circulate the ring, roughly halving usable bandwidth |
| Standard device support | SANs attach to either LAN directly | Requires a RedBox for non-HSR devices |
| Maximum nodes | No protocol limit (LAN capacity) | Practical limit due to ring latency |
| Cable requirement | Double (two separate LANs) | Single ring, but dual-port nodes |
| Frame processing | End nodes only | Every node in the ring |
Supervision Frames
Both protocols use supervision frames to maintain the node table. Every PRP/HSR node periodically multicasts a supervision frame containing its MAC address and node identity. Other nodes use these frames to:
- Build and maintain a list of active nodes in the network.
- Detect when a node has left or failed (if no supervision frame is received within a timeout).
- Verify that both paths are operational (a node should receive supervision frames from both LANs or both ring directions).
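IEC 62439-3 specifies a default supervision interval (LifeCheckInterval) of 2 seconds. A node that stays silent is removed from the node table after NodeForgetTime, which defaults to 60 seconds. Implementations that need faster failure indication can raise an alarm earlier, for example after three missed supervision frames (6 seconds).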
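The bookkeeping behind the three uses listed above can be sketched as follows. The class `NodeTable` and its method names are hypothetical, and the timer values are passed in by the caller rather than assumed.

```python
class NodeTable:
    """Tracks when each node's supervision frame was last seen on each port."""

    def __init__(self, forget_after: float):
        self.forget_after = forget_after  # seconds of silence before removal
        self.last_seen = {}               # MAC -> {"A": time, "B": time}

    def on_supervision(self, mac: str, port: str, now: float) -> None:
        """Record a supervision frame from mac received on port "A" or "B"."""
        self.last_seen.setdefault(mac, {})[port] = now

    def expire(self, now: float) -> list:
        """Remove and return nodes silent on both ports for too long."""
        gone = [mac for mac, ports in self.last_seen.items()
                if now - max(ports.values()) > self.forget_after]
        for mac in gone:
            del self.last_seen[mac]
        return gone

    def one_sided(self, now: float, fresh_within: float) -> list:
        """Return nodes heard recently on exactly one port (one path may be down)."""
        result = []
        for mac, ports in self.last_seen.items():
            fresh = [p for p, t in ports.items() if now - t <= fresh_within]
            if len(fresh) == 1:
                result.append(mac)
        return result
```

Keeping a separate timestamp per port is what allows a single path failure to be reported even though the application never notices it.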
When to Use PRP
PRP is the better choice when:
- The network connects to devices that do not support PRP/HSR. SANs can attach to either LAN and communicate without any protocol awareness. A RedBox extends redundancy to these devices.
- The network spans a large campus or multiple buildings. Each LAN can use its own switching infrastructure with independent spanning tree domains. There is no ring latency constraint.
- High aggregate bandwidth is needed. Each LAN carries one copy of every PRP frame and offers its full link bandwidth to it. Unlike an HSR ring, where both copies of every frame are forwarded over the same links, no capacity is spent relaying other nodes' traffic.
- Regulatory requirements demand dual-network isolation. Power utility substations following IEC 61850 commonly mandate physically separate networks.
When to Use HSR
HSR is the better choice when:
- Infrastructure cost and cabling must be minimized. HSR requires no switches. Nodes connect directly to each other in a ring. This is simpler and cheaper for small to medium installations.
- Space is constrained. In compact installations (e.g., inside a substation control cabinet), eliminating switches reduces the equipment footprint.
- All nodes are HSR-capable. When every device in the network supports HSR natively, there is no need for RedBoxes and the ring topology is straightforward to implement.
- The number of nodes is moderate. HSR works well with up to approximately 50 nodes. Beyond that, accumulated ring latency and bandwidth consumption become significant because every frame traverses the entire ring.
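The effect of ring size on latency can be estimated with simple arithmetic. The function below is a rough sketch: `ring_latency_us` is a hypothetical name, and the per-hop delay of 10 µs used in the example is an assumed figure for illustration, not a value from the standard or from any particular device.

```python
def ring_latency_us(n_nodes: int, per_hop_us: float, ring_intact: bool = True) -> float:
    """Estimate the delay until the first copy reaches the farthest node."""
    if n_nodes < 2:
        raise ValueError("a ring needs at least two nodes")
    # Intact ring: the farthest node is halfway round and is reached first
    # by the copy travelling in the shorter direction.
    # One broken link: in the worst case the only surviving path crosses
    # n_nodes - 1 links.
    links = n_nodes // 2 if ring_intact else n_nodes - 1
    return links * per_hop_us


print(ring_latency_us(50, per_hop_us=10.0))                     # intact ring
print(ring_latency_us(50, per_hop_us=10.0, ring_intact=False))  # one link broken
```

Under these assumptions a 50-node ring gives 250 µs when intact and 490 µs after a single link failure, so the latency budget should be checked against the degraded case, not the normal one.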
Combining PRP and HSR
IEC 62439-3 allows PRP and HSR to coexist in the same network. A common architecture uses HSR rings at the process level (connecting IEDs, merging units, and protection relays) and PRP at the station level (connecting station computers and SCADA interfaces).
flowchart TD
subgraph station["Station Level (PRP)"]
SC["Station Computer"]
HMI["HMI"]
end
subgraph LAN_A["LAN A"]
SW_A["Switch A"]
end
subgraph LAN_B["LAN B"]
SW_B["Switch B"]
end
subgraph bay1["Bay 1 (HSR Ring)"]
RB1["RedBox"]
IED1["Protection IED"]
MU1["Merging Unit"]
end
subgraph bay2["Bay 2 (HSR Ring)"]
RB2["RedBox"]
IED2["Protection IED"]
MU2["Merging Unit"]
end
SC -->|Port 1| SW_A
SC -->|Port 2| SW_B
HMI -->|Port 1| SW_A
HMI -->|Port 2| SW_B
SW_A --> RB1
SW_B --> RB1
SW_A --> RB2
SW_B --> RB2
RB1 --> IED1
IED1 --> MU1
MU1 --> RB1
RB2 --> IED2
IED2 --> MU2
MU2 --> RB2
The RedBox acts as the bridge between both domains. It is a DAN on the PRP side (connected to both LANs) and an HSR node on the ring side. Frames crossing the boundary are translated between PRP trailers and HSR tags.
Linux Configuration with ip
The Linux kernel includes an hsr network driver
that implements both HSR and PRP.
Both protocols are configured through the same ip link interface,
with the proto parameter selecting the protocol.
Loading the Kernel Module
modprobe hsr
To load the module automatically at boot,
add it to /etc/modules-load.d/:
echo hsr > /etc/modules-load.d/hsr.conf
Configuring HSR
Assume eth0 and eth1 are the two physical ports
that will form the HSR ring.
Both interfaces must be up before creating the HSR device.
ip link set eth0 up
ip link set eth1 up
ip link add name hsr0 type hsr slave1 eth0 slave2 eth1 \
supervision 45 version 1 proto 0
ip addr add 192.168.1.10/24 dev hsr0
ip link set hsr0 up
The parameters:
- slave1 and slave2: the two physical interfaces forming the ring.
- supervision 45: the last byte of the multicast MAC address used for supervision frames (0 to 255, default 0). It does not set the supervision interval.
- version 1: the HSR protocol version. 0 (the default) selects the 2010 edition of the standard (HSRv0) and 1 selects the 2012 edition (HSRv1). HSRv1 is the recommended version.
- proto 0: selects the HSR protocol. This is the default and can be omitted.
Configuring PRP
PRP uses the same kernel driver.
The only difference is proto 1:
ip link set eth0 up
ip link set eth1 up
ip link add name prp0 type hsr slave1 eth0 slave2 eth1 \
supervision 45 proto 1
ip addr add 192.168.1.10/24 dev prp0
ip link set prp0 up
The version parameter does not apply to PRP.
Omit it when proto 1 is set; depending on the kernel version, it is either ignored or rejected.
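When these interfaces are created from a provisioning script, it can help to assemble the command in one place. The helper below is a hypothetical convenience wrapper, not part of iproute2. It only builds the argument list shown in the examples above and leaves execution (which needs root) to the caller. It refuses to combine version with PRP, following the note that version applies to HSR only.

```python
def hsr_link_cmd(name, slave1, slave2, proto=0, version=None, supervision=None):
    """Build the `ip link add ... type hsr` argument list for HSR (0) or PRP (1)."""
    if proto not in (0, 1):
        raise ValueError("proto must be 0 (HSR) or 1 (PRP)")
    if proto == 1 and version is not None:
        raise ValueError("version applies to HSR only")
    cmd = ["ip", "link", "add", "name", name, "type", "hsr",
           "slave1", slave1, "slave2", slave2]
    if supervision is not None:
        cmd += ["supervision", str(supervision)]
    if version is not None:
        cmd += ["version", str(version)]
    cmd += ["proto", str(proto)]
    return cmd


print(" ".join(hsr_link_cmd("prp0", "eth0", "eth1", proto=1, supervision=45)))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a host where the hsr module is loaded.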
Viewing Interface Status
Show the HSR/PRP device details:
ip -d link show hsr0
Example output (illustrative; the exact fields depend on the iproute2 version):
5: hsr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
hsr slave1 eth0 slave2 eth1 supervision 45 version 1 proto 0
Managing the Node Table
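The kernel learns remote nodes from their supervision frames. iproute2 has no subcommand for listing them; on kernels built with debugfs support, the node table can be read from debugfs (mounted at /sys/kernel/debug, root required):
cat /sys/kernel/debug/hsr/hsr0/node_table
Example output, abridged to the two address columns (the actual listing may carry additional per-port columns):
MAC-Address-A MAC-Address-B
00:11:22:33:44:55 00:11:22:33:44:55
aa:bb:cc:dd:ee:01 aa:bb:cc:dd:ee:02
Each line shows the MAC addresses seen on Port A and Port B for a given node. If a node appears with a MAC on only one side, the link on the other side may be down.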
Removing the Interface
ip link set hsr0 down
ip link del hsr0
This releases both slave interfaces back to normal operation.
Verifying Redundancy
A quick way to confirm that both paths are active is to watch counters while disconnecting one link:
ip -s link show hsr0
The RX/TX counters on hsr0 should keep incrementing with no frame loss. Running the same command against eth0 and eth1 shows which physical path is still carrying traffic.
Practical Considerations
Frame size increase. PRP adds 6 bytes (RCT trailer) and HSR adds 6 bytes (HSR tag) to every frame. A maximum-size untagged Ethernet frame (1518 bytes) therefore grows to 1524 bytes, and every device and switch on the path must accept frames of at least that size or the largest frames will be dropped. Most managed industrial switches handle this, but verify the configuration.
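The size requirement is simple arithmetic, sketched here so the VLAN case is not overlooked. The function name is hypothetical; the 4-byte figure is the size of an IEEE 802.1Q tag.

```python
ETHERNET_MAX_FRAME = 1518  # untagged maximum frame size, including FCS
REDUNDANCY_BYTES = 6       # PRP trailer or HSR tag
VLAN_TAG_BYTES = 4         # IEEE 802.1Q tag, if the network uses VLANs


def required_frame_size(vlan_tagged: bool = False) -> int:
    """Return the largest frame size the infrastructure must accept."""
    size = ETHERNET_MAX_FRAME + REDUNDANCY_BYTES
    if vlan_tagged:
        size += VLAN_TAG_BYTES
    return size


print(required_frame_size())                  # untagged traffic
print(required_frame_size(vlan_tagged=True))  # VLAN-tagged traffic
```

This gives 1524 bytes for untagged traffic and 1528 bytes when frames also carry a VLAN tag.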
Clock synchronization. Industrial networks using PRP or HSR often also carry IEEE 1588 (PTP) precision time synchronization. PTP frames cannot be treated like ordinary traffic: the two copies of a PTP message travel paths with different delays, so discarding one as a plain duplicate would distort the delay measurement. IEC 62439-3 therefore specifies how doubly attached clocks handle PTP on each port, and PTP boundary clocks or transparent clocks in the switching infrastructure must be aware of the dual-path topology to avoid timestamp inconsistencies.
Network management. Both protocols expose per-node statistics via standard MIBs defined in IEC 62439-3. Key counters to monitor include:
- Duplicate frames discarded (confirms redundancy is active).
- Supervision frame timeouts (indicates a node or link failure).
- Frames received with wrong LAN ID (possible misconfiguration).
Testing redundancy. To verify zero-recovery behavior, disconnect one path and confirm that the application sees no frame loss. For PRP, disconnect one LAN entirely. For HSR, break one link in the ring. Traffic counters on the remaining path should show all frames delivered without interruption.
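One way to make "no frame loss" measurable is to have a test sender stamp each payload with an incrementing counter and check the received stream afterwards. The sketch below assumes such an application-level counter, which is separate from the PRP/HSR sequence number; the function name `check_stream` is hypothetical, and counter wraparound is not handled.

```python
def check_stream(counters):
    """Return (missing, duplicates) for a list of received test counters."""
    seen = set()
    duplicates = []
    for value in counters:
        if value in seen:
            duplicates.append(value)  # the LRE should have discarded this copy
        seen.add(value)
    if not seen:
        return [], []
    # Any value between the first and last counter that never arrived was lost.
    missing = [v for v in range(min(seen), max(seen) + 1) if v not in seen]
    return missing, duplicates


print(check_stream([1, 2, 3, 4, 5]))  # clean run
print(check_stream([1, 2, 4, 5]))     # counter 3 never arrived
```

An empty `missing` list across the moment of disconnection is the evidence of zero recovery time. A non-empty `duplicates` list would point to duplicate discard not working.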