Fig 1: General trace infrastructure
Debug and instrumentation resources are critical in any complex architecture, and in multicore designs in particular. Debug instrumentation provides information for post-silicon debug analysis, software optimization, and performance monitoring.
During the debug phase, the design under debug (DUD) executes its normal operations while run-time traces of its observable points are collected. The debug analyzer then examines these traces to detect and localize faults. At the beginning of the debug phase, the host system (which is often the debug analyzer itself) configures the trigger unit through the debug port and the debug bus. In the figure, the JTAG interface acts as the debug port and the configuration channel acts as the debug bus. The trigger configuration specifies the conditions under which a trigger fires; triggering can be periodic or event-based. Whenever the trigger conditions are satisfied, the trigger fires and the traces of the observable points are captured.
The captured traces can either be transferred to the debug analyzer at run time through the trace bus and the trace port, or be stored in an embedded trace buffer (ETB) for later analysis. Often the traces cannot be transferred at run time because of the bandwidth limitations of the trace bus and the trace port, so a store-and-forward approach is widely adopted.
The CoreSight output trace data bus can be 1, 2, 4, 8, or 16 bits wide. Higher output bandwidth can be obtained using the data packet controller (DPC) high-speed debug port (HSDP). ATB trace capture is typically achieved using multiple components, including:
Fig 2: The Arm Cortex M/R/A processor uses the CoreSight for on-chip Debug and Trace capabilities
Advanced Trace Bus (ATB): A bus used by trace devices to share CoreSight capture resources.
Trace sources: The debug logic is distributed to provide real-time trace facilities for the application processor cores. Below is the list of trace sources introduced by Arm.
- ETM (Embedded Trace Macrocell): The ETM captures detailed information about executed instructions and provides a complete picture of program flow, including function calls and returns, which makes it well suited to analyzing complex algorithms or performance bottlenecks.
- ITM (Instrumentation Trace Macrocell): The ITM is an application-driven trace source that captures custom data points inserted by the programmer through software instrumentation, so specific events or variables can be logged at chosen points in the code. It supports printf-style debugging for tracing operating system (OS) and application events, and it emits diagnostic system information. ITM trace is also called software trace. Software writes directly to the ITM stimulus registers, and each write causes the ITM to emit a packet. Timestamps are emitted relative to packets, and the ITM contains a 21-bit counter to generate them.
- DWT (Data Watchpoint and Trace): A watchpoint is a special type of breakpoint that monitors a specific memory location tied to a data item; the application pauses execution whenever that memory is modified. The DWT unit provides comparators that support watchpoints, which cause the processor to enter Debug state or take a DebugMonitor exception. DWT trace is also called hardware trace. The DWT generates the trace packets, and the ITM emits them.
- STM (System Trace Macrocell): The concept behind System Trace Macrocell (STM) trace is that a core can perform data write transactions to a memory-mapped area of the STM, residing on the AXI bus of the processor. This memory-mapped area, called the Stimulus Port, is divided into multiple so-called Channels. A write transaction to such an STM Stimulus Port Channel triggers the STM to emit an STM message via the hardware trace port. The Channel number encoded in the STM message can be used by the trace recording tool to differentiate between different message types. An STM message may contain a data field with a length of up to 64 bits, a timestamp, and also a marker to allow for multi-message protocols, e.g. for sending out strings.
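The stimulus-register write described for the ITM in the list above can be illustrated with a small host-side model. This is a sketch, not target firmware: it assumes the ARMv7-M instrumentation packet layout (stimulus port number in header bits [7:3], bit 2 clear for a software source, payload size code in bits [1:0]) and a 32-port ITM. The names `encode_itm_packet` and `itm_print` are hypothetical, and timestamps are not modeled.

```python
# Size code placed in header bits [1:0] (assumed ARMv7-M packet layout).
_SIZE_CODES = {1: 0b01, 2: 0b10, 4: 0b11}


def encode_itm_packet(port: int, value: int, size: int) -> bytes:
    """Model the packet emitted by one write to an ITM stimulus port."""
    if not 0 <= port <= 31:
        raise ValueError("stimulus port must be in 0..31 (assumed 32-port ITM)")
    if size not in _SIZE_CODES:
        raise ValueError("payload size must be 1, 2, or 4 bytes")
    if not 0 <= value < (1 << (8 * size)):
        raise ValueError("value does not fit in the payload size")
    # Bit 2 stays 0, which marks a software (instrumentation) source.
    header = (port << 3) | _SIZE_CODES[size]
    # The payload follows the header, least significant byte first.
    return bytes([header]) + value.to_bytes(size, "little")


def itm_print(port: int, text: str) -> bytes:
    """printf-style helper: one single-byte packet per character."""
    return b"".join(encode_itm_packet(port, ch, 1) for ch in text.encode("ascii"))
```

A trace tool decoding this stream would use the port number in each header to separate, for example, OS events from application log output.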
Trace Replicator: In Arm CoreSight, a trace replicator is a dedicated hardware component that duplicates an incoming trace stream so that the same trace information can be sent to multiple destinations simultaneously. It acts as a splitter for trace data within the on-chip debug infrastructure and is useful when trace must go to both an on-chip trace buffer and an off-chip debug interface at the same time.
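As a rough behavioral illustration of the replicator, the sketch below fans one incoming stream out to every attached sink. It is a software model only: the class name `TraceReplicator` and the callable-based sink interface are assumptions made for this example and do not correspond to any CoreSight register interface.

```python
from typing import Callable, List

# A sink is modeled as any callable that accepts a chunk of trace bytes.
TraceSink = Callable[[bytes], None]


class TraceReplicator:
    """Forward each incoming chunk of trace data to all attached sinks."""

    def __init__(self, *sinks: TraceSink) -> None:
        self._sinks: List[TraceSink] = list(sinks)

    def push(self, chunk: bytes) -> None:
        for sink in self._sinks:
            sink(chunk)  # every sink receives an identical copy


# Usage: one stream feeds an on-chip buffer and an off-chip port model.
on_chip_buffer: List[bytes] = []
off_chip_port: List[bytes] = []
replicator = TraceReplicator(on_chip_buffer.append, off_chip_port.append)
replicator.push(b"\x01A")
```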
ATB Bridge: The ATB bridge carries the ATB across a power domain boundary.
Trace sinks are the endpoints for trace data collection on the SoC. There are two main types of trace sinks:
On-chip trace sink: An on-chip trace sink is a component within the debug and trace architecture that captures and stores trace data directly on the chip. The main on-chip trace sinks are:
- Embedded Trace Buffer (ETB): The ETB is an on-chip memory buffer that stores trace data. It captures trace data from various sources and stores it in a dedicated RAM. The ETB is useful for capturing trace data without the need for external trace storage.
- Embedded Trace Router (ETR): The ETR is another on-chip trace sink. It writes trace data to memory over an AXI interconnect, which gives flexibility in routing trace data to different on-chip memory locations.
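The trigger-then-store flow that an on-chip sink such as the ETB supports can be sketched as a fixed-depth circular buffer fed by a trigger condition. This is a simplified software model under stated assumptions: the names `TraceBuffer`, `capture`, `periodic_trigger`, and `event_trigger` are hypothetical, samples are plain integers, and the overwrite-oldest policy is one possible buffer mode rather than a description of any specific ETB configuration.

```python
from collections import deque
from typing import Callable, Iterable, List


class TraceBuffer:
    """Fixed-depth circular buffer: the oldest entry is dropped when full."""

    def __init__(self, depth: int) -> None:
        self._entries = deque(maxlen=depth)

    def store(self, sample: int) -> None:
        self._entries.append(sample)

    def drain(self) -> List[int]:
        """Forward the buffered trace for later analysis and empty the buffer."""
        out = list(self._entries)
        self._entries.clear()
        return out


def capture(samples: Iterable[int],
            trigger: Callable[[int, int], bool],
            buffer: TraceBuffer) -> None:
    """Store each sample for which the trigger condition holds."""
    for cycle, sample in enumerate(samples):
        if trigger(cycle, sample):
            buffer.store(sample)


def periodic_trigger(cycle: int, sample: int) -> bool:
    """Periodic triggering: fire on every fourth cycle."""
    return cycle % 4 == 0


def event_trigger(cycle: int, sample: int) -> bool:
    """Event-based triggering: fire when the observed value exceeds 200."""
    return sample > 200
```

With a depth of 3 and the periodic trigger, only the three most recent captures survive until the buffer is drained, which mirrors the trade-off between buffer size and trace history.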
Off-chip trace sink: An off-chip trace sink is a component in the debug and trace architecture that moves trace data off the chip so that it can be captured and stored externally. The main off-chip trace sinks are:
- Trace Port Interface Unit (TPIU): The TPIU is an ATB (Advanced Trace Bus) slave that drains trace data off the chip. It acts as a bridge between the on-chip trace data and a data stream that is captured by a Trace Port Analyzer (TPA). The TPIU supports off-chip port sizes from 2 to 34 pins.
- Serial Wire Output (SWO): The SWO is a trace sink similar to the TPIU, but it can trace only one source, the Instrumentation Trace Macrocell (ITM), and it outputs the data stream off-chip through a single pin.
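Whether trace can be streamed off-chip at run time or must be buffered first comes down to comparing the trace generation rate with the raw capacity of the trace port. The sketch below is a back-of-the-envelope estimate only: the function names are hypothetical, the clock and rate figures are made-up illustrations, and protocol overhead such as framing is ignored.

```python
def port_bandwidth_bps(data_pins: int, clock_hz: float,
                       double_data_rate: bool = False) -> float:
    """Raw port capacity in bits per second (no protocol overhead)."""
    transfers_per_cycle = 2 if double_data_rate else 1
    return data_pins * clock_hz * transfers_per_cycle


def can_stream(trace_rate_bps: float, data_pins: int, clock_hz: float,
               double_data_rate: bool = False) -> bool:
    """True if the port can keep up with the trace source at run time."""
    return trace_rate_bps <= port_bandwidth_bps(
        data_pins, clock_hz, double_data_rate)


# Illustrative numbers only: a 300 Mbit/s trace stream and a 100 MHz port clock.
fits_on_4_pins = can_stream(300e6, data_pins=4, clock_hz=100e6)
fits_on_1_pin = can_stream(300e6, data_pins=1, clock_hz=100e6)
```

When the check fails, as in the single-pin case here, the store-and-forward route through an on-chip buffer is the fallback.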
Fig 5: The Trace is routed to an external trace analyser via HSSTP


