Question:
What are the most common challenges faced when debugging a PCI Express implementation?
Answer:
The variety of implementations possible in a multi-lane PCI Express system can sometimes make configuration of a PCI Express protocol analyzer a challenge. Here are some of the key parameters that need to be set properly when using the CATC PETracer or PETracer ML for traffic capture and analysis.
There are six main implementation-dependent link configuration settings that can affect how the PETracer and PETracer ML analyzers capture PCI Express link traffic:
- Link Width (PETracer ML only)
- Lane Polarity
- Lane Reversal (PETracer ML only)
- Data Scrambling
- PCI Express Specification Version
- Link Reference Clocking
1) Link Width
A single PETracer ML unit can analyze both directions of a one-lane, two-lane, or four-lane PCI Express link. Two PETracer ML units can be combined to analyze both directions of an eight-lane link. Currently, the PETracer ML analyzer needs to know in advance how many lanes to examine on the link to which it is connected, since data in PCI Express packets are 'striped' across all the lanes in the link.
Since link width is negotiated between PCI Express devices, it is possible for two devices to configure to a link width unexpected by the analyzer. When this occurs, captured data traffic will have the following characteristics:
If the NEGOTIATED link width is less than or greater than expected by the analyzer, link training traffic will appear to be normal, since link training traffic is sent on all lanes of the link in parallel (though the analyzer will only display the lanes that it expects to carry traffic). However, all transaction-layer packets (TLPs) and data link-layer packets (DLLPs) will show errors, since the data striping for these packets will not be processed correctly by the analyzer.
If the PHYSICAL link width is less than expected by the analyzer (for example, single-lane when the analyzer is expecting a four-lane link), the analyzer will appear to not capture any traffic at all, since the PETracer ML analyzer expects to see link training traffic on all configured lanes of the link before it deskews and displays any traffic on the link.
Link width is configured from the 'Link' tab of the PETracer ML Recording Options dialog.
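The effect of a link width mismatch on striped data can be illustrated with a minimal Python sketch. It is a simplified model, not the analyzer's algorithm: it assumes plain round-robin byte striping and ignores framing symbols, and the function names `stripe` and `unstripe` are hypothetical.

```python
def stripe(data: bytes, width: int) -> list[bytes]:
    """Distribute consecutive bytes round-robin across `width` lanes."""
    return [data[lane::width] for lane in range(width)]


def unstripe(lanes: list[bytes], width: int) -> bytes:
    """Reassemble bytes as an observer that assumes the link is `width` lanes wide.

    Only the first `width` lanes are examined, mirroring an analyzer that
    looks only at the lanes it expects to carry traffic.
    """
    used = lanes[:width]
    out = bytearray()
    for i in range(max(len(lane) for lane in used)):
        for lane in used:
            if i < len(lane):
                out.append(lane[i])
    return bytes(out)


packet = bytes(range(8))              # stand-in for TLP/DLLP bytes
lanes = stripe(packet, 4)             # devices negotiated a x4 link

print(unstripe(lanes, 4) == packet)   # matching width: packet is recovered
print(unstripe(lanes, 2).hex())       # analyzer set to x2: bytes are missing
```

With the wrong width, the reassembled byte stream no longer matches what the transmitter sent, which is consistent with every TLP and DLLP showing errors.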
2) Lane Polarity
To simplify PCB layout, it is sometimes desirable to invert the polarity of the differential signals comprising a PCI Express lane (D+/D- from the transmitter routed to D-/D+ of the receiving device). The PETracer and PETracer ML analyzers can compensate for the resulting inversion of bits in symbols received on each lane.
The rule of thumb to use for configuring polarity inversion is that if the differential signals for a lane need to be swapped at the receiver of a PCI Express device, analyzer lane polarity inversion should be specified for that lane in that link direction. For example, if a PCI Express add-in card needs to invert lanes at its receiver to process data from the root complex, polarity inversion should be specified for the downstream direction of the link for those lanes. Usually, all lanes in a link will have the same polarity (it is rare for some of the lanes to be inverted and others not, though PETracer ML does allow this).
If lane polarity is not correctly specified in the analyzer application for the link under analysis, the following will be seen in recorded traces:
- For PETracer and PETracer ML in x1 mode, ALL traffic (TLPs, DLLPs, and ordered sets) will be recorded with errors.
- For PETracer ML in x2/x4/x8 mode, no traffic will appear in the trace, since the analyzer will not correctly follow link training.
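As a rough illustration of what polarity compensation involves, the sketch below inverts every bit of each 10-bit symbol on a lane. It is a simplified model under stated assumptions: symbols are represented as 10-bit integers, and the helper names `invert_polarity` and `compensate` are hypothetical, not part of the analyzer software.

```python
SYMBOL_BITS = 10
SYMBOL_MASK = (1 << SYMBOL_BITS) - 1


def invert_polarity(symbol: int) -> int:
    """Flip all 10 bits of a symbol, as a swapped D+/D- pair would."""
    return ~symbol & SYMBOL_MASK


def compensate(symbols: list[int], lane_inverted: bool) -> list[int]:
    """Undo polarity inversion for one lane if it is configured as inverted."""
    return [invert_polarity(s) if lane_inverted else s for s in symbols]


# Example 10-bit pattern (the two encodings of K28.5 are bitwise complements).
received = [0b0011111010]
print([format(s, "010b") for s in compensate(received, lane_inverted=True)])
```

Applying the inversion twice returns the original symbols, so the setting only helps when it matches the actual routing of that lane and link direction.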
3) Lane Reversal
PCI Express allows lanes to be reversed between the transmitting and the receiving device (for example, transmitter lanes 0-1-2-3 swapped to 3-2-1-0 at the receiver). PETracer ML supports lane reversal for any multi-lane link, as long as physical lane 0 is included in the resulting link. However, lane reversal needs to be manually configured from the "Link" tab of the Recording Options dialog for each link direction.
The rule of thumb to use for configuring lane reversal is that if the lanes need to be swapped at the receiver of a PCI Express device, analyzer lane reversal should be specified for that link direction. For example, if a PCI Express add-in card needs to reverse lanes at its receiver to process data from the root complex, lane reversal should be specified for the downstream direction of the link.
If lane reversal is not correctly specified in the analyzer application for a direction of the link, the following behavior will be seen:
- All link training sequences and ordered sets will appear without errors.
- During the final stages of link training (TS2 ordered set transmission), lane numbers are assigned to each lane participating in the link. If lane reversal is incorrectly specified, these lane numbers will appear in the trace in descending order (3-2-1-0 for a x4 link).
- TLPs and DLLPs will appear with format and CRC errors.
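The lane-number symptom described above lends itself to a simple check. The sketch below is illustrative only: it assumes the lane numbers seen in the TS2 ordered sets have already been extracted from a trace into a list, and the function names are hypothetical.

```python
def reversal_setting_is_wrong(observed_lane_numbers: list[int]) -> bool:
    """Return True if lane numbers run from highest to lowest (e.g. 3-2-1-0).

    Per the symptoms listed above, a descending sequence suggests the
    analyzer's lane reversal setting for this link direction is incorrect.
    """
    n = len(observed_lane_numbers)
    return n > 1 and observed_lane_numbers == list(range(n - 1, -1, -1))


def reorder_lanes(lanes: list, reverse: bool) -> list:
    """Present lanes in logical order, reversing them if configured to."""
    return list(reversed(lanes)) if reverse else list(lanes)


print(reversal_setting_is_wrong([3, 2, 1, 0]))   # descending: setting is wrong
print(reversal_setting_is_wrong([0, 1, 2, 3]))   # ascending: setting is right
```

If the check flags a direction, toggling lane reversal for that direction in the Recording Options dialog is the corresponding fix.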
4) Data Scrambling
Data bytes in PCI Express TLPs and DLLPs are "scrambled" under normal operating conditions before 8b/10b encoding by the transmitter and "descrambled" after 8b/10b decoding by the receiver using a linear feedback shift register (LFSR). However, during link training it is possible for the devices participating in the link to negotiate the disabling of data scrambling. Some CATC customers do disable data scrambling during early product development/debug, and both the PETracer and PETracer ML analyzers will detect the appropriate setting during link training. However, if either of the devices participating in the link does not conform to the negotiated scrambling setting, there is a potential for a scrambling mismatch in the captured traffic. When this occurs, the following will be seen in recorded traces:
- All link training sequences and ordered sets will appear without errors (since these types of traffic are never scrambled). The entire link training sequence will be valid up to the initial exchange of flow control parameters (InitFC DLLPs).
- TLPs and DLLPs will appear with format and CRC errors, potentially in just one link direction (that of the device not conforming to the negotiated link parameters). However, there will specifically NOT be any symbol or running disparity errors, since the 10b symbols sent will be well-formed. There will also be idle data errors indicated (since idle data is scrambled).
Data scrambling may be disabled (overriding the negotiated setting for the link) from the "General" tab of the PETracer Recording Options dialog and the "Link" tab of the PETracer ML Recording Options dialog. The PETracer application also has an additional setting to force scrambling to be always enabled, again overriding the negotiated setting for the link.
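To show why a scrambling mismatch corrupts TLP and DLLP contents without producing symbol errors, here is a minimal LFSR scrambler sketch. It assumes the 16-bit polynomial X^16 + X^5 + X^4 + X^3 + 1 with a seed of 0xFFFF. The bit ordering is illustrative and has not been checked against either specification revision, and the sketch does not model ordered sets or LFSR resets. The `Scrambler` class is hypothetical.

```python
FEEDBACK_TAPS = 0x0039   # X^5 + X^4 + X^3 + 1 (assumed polynomial taps)
SEED = 0xFFFF            # assumed initial LFSR value


class Scrambler:
    """XORs data bytes with an LFSR sequence; the same code descrambles."""

    def __init__(self, enabled: bool = True) -> None:
        self.enabled = enabled
        self.lfsr = SEED

    def process(self, data: bytes) -> bytes:
        out = bytearray()
        for byte in data:
            mask = 0
            for bit in range(8):
                msb = (self.lfsr >> 15) & 1
                mask |= msb << bit
                # Galois-style shift: feed the bit shifted out back into the taps.
                self.lfsr = (self.lfsr << 1) & 0xFFFF
                if msb:
                    self.lfsr ^= FEEDBACK_TAPS
            out.append(byte ^ mask if self.enabled else byte)
        return bytes(out)


payload = bytes([0x00, 0x01, 0x02, 0x03])
on_wire = Scrambler(enabled=True).process(payload)

# Receiver honours the negotiated setting: payload is recovered.
print(Scrambler(enabled=True).process(on_wire) == payload)
# Receiver (or analyzer) has scrambling disabled: payload is corrupted.
print(Scrambler(enabled=False).process(on_wire) == payload)
```

In both cases the bytes on the wire are valid data values, so 8b/10b symbols stay well-formed. Only the decoded packet contents, and therefore the format and CRC checks, are affected.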
5) PCI Express Specification Version
The data scrambling algorithm was changed between the first (1.0) and second (1.0a) releases of the PCI Express Base Specification document. Unfortunately, these two algorithms are NOT compatible with each other. If the analyzer is configured to support 1.0-style descrambling between 1.0a-compatible devices (or vice-versa), the types of errors seen will be identical to those described for data scrambling mismatches above.
The specification version supported is configured from the 'Link' tab of the PETracer ML Recording Options dialog. For PETracer, there are two separate BusEngines provided (PETracer _Rev1_0.rbf and PETracer _Rev1_0a.rbf) to support both specification versions; the appropriate one needs to be loaded into the PETracer analyzer for correct analysis. The version currently loaded into the analyzer may be seen from the Help/About PETracer... menu selection.
6) Link Reference Clocking
PETracer and PETracer ML both support two reference clock sources for the PCI Express link under analysis: a high-precision 250 MHz clock internal to the analyzer and a 100 MHz external reference clock provided by the system under test. In most cases, the internal reference clock should be used; however, if spread-spectrum clocking (SSC) is used by the test system, or if the transmit clocking tolerance in the test system deviates from the +/- 300 ppm standard in the PCI Express specification, it is necessary for the analyzer to use the 100 MHz external reference clock as a reference for link data capture.
The primary symptom of a clocking problem is the appearance of 'bursts' of packet errors in otherwise valid captured traffic, especially in SSC-based systems.
Internal/external reference clocking is configured from the 'General' tab of the PETracer Recording Options dialog and the 'Link' tab of the PETracer ML Recording Options dialog.
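The choice between internal and external reference clocking can be expressed as a small calculation. The sketch below uses the +/- 300 ppm tolerance quoted above. The measured frequencies are made-up example values, and the 0.5% offset stands in for a hypothetical spread-spectrum deviation. The function names are hypothetical.

```python
TOLERANCE_PPM = 300.0    # +/- 300 ppm, as quoted above
NOMINAL_HZ = 100e6       # 100 MHz reference clock


def ppm_offset(actual_hz: float, nominal_hz: float = NOMINAL_HZ) -> float:
    """Frequency deviation from nominal, in parts per million."""
    return (actual_hz - nominal_hz) / nominal_hz * 1e6


def needs_external_refclk(actual_hz: float, ssc_enabled: bool = False) -> bool:
    """True if the analyzer should use the system's reference clock."""
    return ssc_enabled or abs(ppm_offset(actual_hz)) > TOLERANCE_PPM


print(needs_external_refclk(100.02e6))   # about +200 ppm: internal clock is fine
print(needs_external_refclk(99.5e6))     # 0.5% low (about -5000 ppm): use external
print(needs_external_refclk(100e6, ssc_enabled=True))   # SSC in use: use external
```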