CAN Errors / CAN Error States
What are Error Active, Error Passive, and Bus off of CAN Bus?
Just to give a little background to the answer:
In order to prevent malfunctioning nodes from disturbing, or even blocking, an entire system, the CAN protocol implements a sophisticated fault confinement mechanism. The CAN protocol is intended to be orthogonal, i.e. all nodes address faults in the same manner. Fault confinement is provided where each node constantly monitors its performance with regard to successful and unsuccessful message transactions. A Transmit Error Counter (TEC) and a Receive Error Counter (REC) create a metric for communication quality based on historic performance. Each node will act on its own bus status based on its individual history. As a result, a graceful degradation allows a node to disconnect itself from the bus i.e. stop transmitting. This means that a permanently faulty device will cease to be active on the bus (go into Bus Off state), but communications between other nodes can continue unhindered. If the bus media is severed, shorted or suffers from some other failure mode the ability to continue communications is dependent upon the condition and the physical interface used.
Fault confinement is a checking mechanism that makes it possible to distinguish between short disturbances (e.g. switching noise from a nearby power cable couples into the transmission media) and permanent failures (e.g. a node is malfunctioning and disturbs the bus).
Manipulation of the error counters is asymmetric. On a successful transmission, or reception, of a message, the respective error counter is decremented if it had not been at zero. In the case of a transmit or receive error the counters are incremented, but by a value greater than the value they would be decrement by following a successful message transaction.
If a node detects a local error condition (e.g. due to local conducted noise, application software, etc.), its resulting error flag (primary error flag) will subsequently cause all other nodes to respond with an error flag too (secondary error flags). It is important that a distinction is made between the nodes that detected an error first and the nodes which responded to the primary error flag. If a node transmits an active error frame, and it monitors a dominant bit after the sixth bit of its error flag, it considers itself as the node that has detected the error first. In the case where a node detects errors first too often, it is regarded as malfunctioning, and its impact to the network has to be limited. Therefore, a node can be in one of three possible error states:
Error active Both of its error counters are less than 128. It takes part fully in bus communication and signals an error by transmission of an active error frame.This consists of sequence of 6 dominant bits followed by 8 recessive bits, all other nodes respond with the appropriate error flag, in response to the violation of the bit stuffing rule.
Error passive A node goes into error passive state if at least one of its error counters is greater than 127. It still takes part in bus activities, but it sends a passive error frame only, on errors. Furthermore, an error passive node has to wait an additional time (Suspend Transmission Field, 8 recessive bits after Intermission Field) after transmission of a message, before it can initiate a new data transfer. The primary passive error flag consists of 6 passive bits and thus is “transparent” on the bus and will not “jam” communications.
Bus Off If the Transmit Error Counter of a CAN controller exceeds 255, it goes into the bus off state. It is disconnected from the bus (using internal logic) and does not take part in bus activities anymore. In order to reconnect the protocol controller, a so-called Bus Off recovery sequence has to be executed. This usually involves the re-initialization and configuration of the CAN controller by the host system, after which it will wait for 128 * 11 recessive bit times before it commences communication.
CAN Error Confinement Rules
- When a receiver detects an error, the REC will be increased by 1, except when the detected error was a Bit Error during the sending of an Active error Flag or an Overload Flag.
- When a receiver detects a dominant bit as the first bit after sending an Error Flag, the REC will be increased by 8.
- When a transmitter sends an Error Flag, the TEC is increased by 8. Exception 1: If the transmitter is Error Passive and detects an ACK Error because of not detecting a dominant ACK and does not detect a dominant bit while sending its Passive Error Flag. Exception 2: If the transmitter sends an Error Flag because a Stuff Error occurred during arbitration, and should have been recessive, and has been sent as recessive but monitored as dominant.
- If the transmitter detects a Bit Error while sending an Active Error Flag or an Overload Frame, the TEC is increased by 8.
- If a receiver detects a Bit Error while sending an Active Error Flag or an Overload Flag, the REC is increased by 8.
- Any node tolerates up to 7 consecutive dominant bits after sending an Active Error Flag, Passive Error Flag or Overload Flag. After detecting the fourteenth consecutive dominant bit (in case of an Active Error Flag or an Overload Flag) or after detecting the eighth consecutive dominant bit following a Passive Error Flag, and after each sequence of additional eight consecutive dominant bits, ever y transmitter increases its TEC by 8 and every receiver increases its REC by 8.
- After successful transmission of a frame (getting ACK and no error until EOF is finished), the TEC is decreased by 1 unless it was already 0.
- After the successful reception of a frame (reception without error up to the ACK Slot and the successful sending of the ACK bit), the REC is decreased by 1, if it was between 1 and 127. If the REC was 0, it stays 0, and if it was greater than 127, then it will be set to a value between 119 and 127.
- A node is Error Passive when the TEC equals or exceeds 128, or when the REC equals or exceeds 128. An error condition letting a node become Error Passive causes the node to send an Active Error Flag.
- A node is Bus Off when the TEC is greater than or equal to 256.
- An Error Passive node becomes Error Active again when both the TEC and the REC are less than or equal to 127.
- A node which is Bus Off is permitted to become Error Active (no longer Bus Off) with its error counters both set to 0 after 128 occurrence of 11 consecutive recessive bits have been monitored on the bus.