Optimizing Clock Trees to Meet Performance and System Cost Targets
By James Wilson, Director of Marketing, Timing Products, Silicon Laboratories
Hardware design in high-performance applications such as communications systems, wireless infrastructure, servers, broadcast video, and test and measurement equipment is becoming increasingly complex as systems integrate more functionality and require ever-increasing levels of performance. This trend extends to the board-level clock tree that provides reference timing for the system. A “one size fits all” strategy does not apply when it comes to clock tree design. Optimizing clock trees to meet both performance and cost requirements depends on a number of factors, including the system architecture, integrated circuit (IC) timing requirements (frequencies, signal formats, etc.) and the jitter requirements of the end application.
Reference Timing – When to Use a Crystal vs. a Clock
One of the first design considerations is to inventory the hardware design’s reference clock requirements and select the type of reference clocks that will be used for the processors, FPGAs, ASICs, PHYs, DSPs and various other components in the system. Quartz crystals are typically used if the IC has an integrated oscillator and on-chip phase-locked loops (PLLs) for internal timing. Crystals are cost-effective components that exhibit excellent phase noise and are widely available. They can also be placed in close proximity to the IC, simplifying board layout. However, one of the drawbacks of crystals is that their frequency can vary significantly over temperature, exceeding the parts-per-million (ppm) stability requirements of many serializer-deserializer (SerDes) applications. In many stability-sensitive high-speed SerDes applications, crystal oscillators (XOs) are recommended because they guarantee tighter stability than passive crystals.
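To make the ppm stability comparison concrete, the following minimal sketch converts a stability figure in ppm into an absolute frequency offset and checks it against a requirement. The function names, the 156.25 MHz reference, and the 100 ppm, 20 ppm and 50 ppm figures are illustrative assumptions, not values taken from any data sheet.

```python
def ppm_to_hz(nominal_hz: float, ppm: float) -> float:
    """Convert a stability figure in ppm to an absolute frequency offset in Hz."""
    return nominal_hz * ppm * 1e-6


def meets_stability(reference_ppm: float, required_ppm: float) -> bool:
    """Return True if the reference's total stability fits within the requirement."""
    return abs(reference_ppm) <= abs(required_ppm)


if __name__ == "__main__":
    nominal_hz = 156.25e6   # hypothetical SerDes reference frequency
    required_ppm = 50.0     # hypothetical SerDes stability requirement
    # Illustrative total-stability figures only; use data sheet maximums in practice.
    for name, ppm in [("passive crystal", 100.0), ("XO", 20.0)]:
        offset_hz = ppm_to_hz(nominal_hz, ppm)
        ok = meets_stability(ppm, required_ppm)
        print(f"{name}: +/-{ppm} ppm = +/-{offset_hz:.1f} Hz, meets requirement: {ok}")
```

When comparing parts, the ppm figure should include every contributor the data sheet lists (initial tolerance, temperature and aging), since the requirement applies to the total.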
Clock generators and clock buffers are typically used when several reference frequencies are required. In some applications, FPGAs and ASICs have multiple time domains for the data path, control plane and memory controller interface, and therefore require multiple unique reference frequencies. A clock generator or buffer is also preferred when the IC cannot accommodate a crystal input, when the IC must be synchronized to an external reference (source-synchronous application), or when a high-frequency reference not easily generated by a crystal is required.
Free-Running versus Synchronous Clock Trees
Once the hardware design clock inventory has been completed and the crystals have been selected for some of the components, the next step is to select the timing architecture for the remaining clocks: free-running or synchronous. For applications that require one or more independent reference clocks without any special phase-lock or synchronization requirements, XOs, clock generators and clock buffers are the preferred choice. Processors, memory controllers, SoCs and peripheral components (e.g., USB and PCI Express switches) typically use a combination of XOs, clock generators and clock buffers for reference timing in free-running, asynchronous applications. XOs are preferable when the application requires one to two timing sources, while clock generators and buffers are better suited for applications that need several individual clocks. Clock generators can synthesize multiple clocks at different frequencies, but sacrifice some jitter performance in comparison to clock trees built from an XO and a clock buffer. Clock buffers can be used in conjunction with an XO reference to distribute multiple clocks at the same frequency and provide the lowest-jitter implementation for a multi-output clock tree.
Synchronous clocking is used in applications that require continuous communication and network-level synchronization, such as Optical Transport Networking (OTN), SONET/SDH, mobile backhaul, synchronous Ethernet and HD SDI video transmission. These applications require transmitters and receivers to operate at the same frequency. Synchronizing all SerDes reference clocks to a highly accurate network reference clock (e.g., Stratum 3 or GPS) guarantees synchronization across all nodes. In these applications, low-bandwidth PLL-based clocks provide wander and jitter filtering (jitter cleaning) to ensure that network-level synchronization is maintained. In networking line card PLL applications, specialized jitter attenuating clocks or discrete PLLs with voltage-controlled crystal oscillators (VCXOs) are the preferred clock solution for SerDes clocking. For optimal performance, a jitter attenuating clock should be placed at the end of the clock tree, directly driving the SerDes device. Clock generators and buffers can be used to provide other system references.
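The guidelines in the two paragraphs above can be summarized as a small decision helper. This is only a sketch of the rules of thumb as stated in the text; the function name, its parameters and the returned labels are hypothetical, and a real design would also weigh jitter budgets, signal formats and cost.

```python
def suggest_clock_source(num_clocks: int,
                         same_frequency: bool,
                         network_synchronous: bool) -> str:
    """Map the article's rules of thumb to a suggested timing component.

    num_clocks          -- number of reference clocks needed (hypothetical input)
    same_frequency      -- True if all of those clocks share one frequency
    network_synchronous -- True for OTN, SONET/SDH, SyncE and similar systems
    """
    if num_clocks < 1:
        raise ValueError("num_clocks must be at least 1")
    if network_synchronous:
        # Synchronous trees: jitter cleaning at the end of the tree, at the SerDes.
        return "jitter-attenuating clock or discrete PLL with VCXO"
    if num_clocks <= 2:
        # Free-running, one to two timing sources.
        return "XO"
    if same_frequency:
        # Several copies of one frequency: lowest-jitter multi-output option.
        return "XO plus clock buffer"
    # Several different frequencies: synthesize them, trading some jitter.
    return "clock generator"


if __name__ == "__main__":
    print(suggest_clock_source(1, True, False))
    print(suggest_clock_source(6, True, False))
    print(suggest_clock_source(6, False, False))
    print(suggest_clock_source(4, False, True))
```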
Clock jitter is a critical specification for timing components because excessive clock jitter can compromise system performance. There are three common types of clock jitter, and, depending on the application, one type of jitter will be more important than another.
Cycle-to-cycle jitter measures the maximum change in clock period between any two adjacent clock cycles, typically measured over 1,000 clock cycles.
Period jitter is the maximum deviation in clock period with respect to an ideal period over a large number of cycles (10,000 clock cycles typical). Both cycle-to-cycle jitter and period jitter are useful in calculating setup and hold timing margins in digital systems, and are often figures of merit for CPU and SoC devices.
Phase jitter is the figure of merit for high-speed SerDes applications. It is calculated by integrating the clock’s single-sideband phase noise (the ratio of noise power to carrier power at a given offset) across a range of frequencies offset from the carrier, and is usually expressed as an RMS value. Phase jitter is especially critical in FPGA and high-speed SerDes clocking applications in which excessive phase jitter can degrade the bit error rate of the high-speed serial interface.
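The first two definitions translate directly into calculations on a captured series of clock periods. The sketch below assumes the periods have already been measured (for example, exported from an oscilloscope) and are all in the same unit; the function names and sample values are illustrative assumptions.

```python
from typing import Sequence


def cycle_to_cycle_jitter(periods: Sequence[float]) -> float:
    """Maximum absolute change in period between any two adjacent cycles."""
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    return max(abs(b - a) for a, b in zip(periods, periods[1:]))


def peak_period_jitter(periods: Sequence[float], ideal_period: float) -> float:
    """Maximum absolute deviation of any measured period from the ideal period."""
    if not periods:
        raise ValueError("need at least one period")
    return max(abs(p - ideal_period) for p in periods)


if __name__ == "__main__":
    # Hypothetical periods in picoseconds for a nominal 100 MHz clock (10,000 ps).
    samples_ps = [10000.0, 10002.0, 9999.0, 10001.0]
    print(cycle_to_cycle_jitter(samples_ps))        # 3.0 (between 10002 and 9999)
    print(peak_period_jitter(samples_ps, 10000.0))  # 2.0 (the 10002 ps cycle)
```

A real measurement would use far more cycles than this four-sample example, in line with the 1,000 and 10,000 cycle counts mentioned above.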
During clock tree design and component selection, it is important to evaluate devices based on maximum jitter performance. Typical jitter specifications do not guarantee device performance over all conditions, including process, voltage, temperature and frequency variation. Maximum jitter provides a more comprehensive specification inclusive of these additional factors.
In addition, take special care to review jitter test conditions on timing device data sheets. Clock jitter performance varies across a wide range of conditions including device configuration, operating frequency, signal format, input clock slew rate, power supply and power supply noise. Look for devices that fully specify jitter test conditions since they guarantee operation over a wider operating range.
Selection Criteria for Clock and Oscillator Components
Once the basic clock tree architecture is determined, the next step is component selection. Table 1 summarizes the selection criteria that should be used for choosing clock and oscillator components for both free-running and synchronous clock trees. Look for features that simplify clock tree design to minimize bill-of-material (BOM) cost and complexity.
Estimating Clock Tree Jitter
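Before a clock tree design is complete, the total clock tree jitter should be estimated to determine if there is sufficient system-level design margin. It is important to note that total clock tree RMS jitter is less than the simple sum of data sheet jitter specifications from multiple components, because uncorrelated RMS jitter contributions add as a root sum of squares rather than linearly. The clock tree jitter can be defined by the following:
$$J_{\mathrm{total}} = \sqrt{J_1^2 + J_2^2 + \cdots + J_n^2}$$
where each J_n is the RMS jitter contributed by one component in the clock path.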
Note: This equation can be applied to calculating total period jitter and phase jitter, assuming the jitter distributions are Gaussian and uncorrelated. The equation should not be applied to cycle-to-cycle jitter, which is expressed as a peak jitter number and not RMS.
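As a worked illustration, the sketch below combines component RMS jitter values as a root sum of squares, which is the standard way to add Gaussian, uncorrelated RMS contributions, and compares the result with a budget. The function name, the component values and the 0.5 ps budget are assumptions chosen for illustration only.

```python
import math
from typing import Iterable


def total_rms_jitter(component_jitters: Iterable[float]) -> float:
    """Root-sum-square of uncorrelated RMS jitter terms (all in the same unit)."""
    values = list(component_jitters)
    if any(j < 0 for j in values):
        raise ValueError("RMS jitter values must be non-negative")
    return math.sqrt(sum(j * j for j in values))


if __name__ == "__main__":
    # Hypothetical maximum RMS phase jitter per stage, in picoseconds.
    stages_ps = {"XO": 0.30, "clock generator": 0.25, "clock buffer": 0.10}
    budget_ps = 0.5  # hypothetical system-level jitter budget

    total_ps = total_rms_jitter(stages_ps.values())
    linear_sum_ps = sum(stages_ps.values())
    print(f"RSS total:   {total_ps:.3f} ps RMS")
    print(f"Simple sum:  {linear_sum_ps:.3f} ps (overestimates the total)")
    print(f"Margin:      {budget_ps - total_ps:.3f} ps against a {budget_ps} ps budget")
```

As the note above states, this combination is not appropriate for cycle-to-cycle jitter, which is a peak figure rather than an RMS one.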
Component jitter Jn can be estimated using data sheet jitter specifications or calculated from phase noise data. Silicon Labs offers an easy-to-use utility for converting clock phase noise to jitter. See http://www.silabs.com/support/Pages/phase-noise-jitter-calculator.aspx for more details. Be sure to use maximum jitter specifications to generate a conservative estimate of total clock tree jitter.
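For readers who want to see what a phase-noise-to-jitter conversion involves, here is a minimal sketch of the common textbook approach: integrate the single-sideband phase noise over the offset band, double it to account for both sidebands, take the square root to get RMS phase error in radians, and divide by 2πf0 to express it in seconds. This is not a description of the Silicon Labs utility; the function names are hypothetical, and the sketch assumes each segment between data points is a straight line on a log-frequency versus dBc/Hz plot.

```python
import math
from typing import Sequence, Tuple

PhaseNoisePoint = Tuple[float, float]  # (offset frequency in Hz, L(f) in dBc/Hz)


def integrate_phase_noise(points: Sequence[PhaseNoisePoint]) -> float:
    """Integrate L(f) over offset frequency and return a linear power ratio."""
    if len(points) < 2:
        raise ValueError("need at least two (offset_hz, dbc_per_hz) points")
    total = 0.0
    for (f1, l1), (f2, l2) in zip(points, points[1:]):
        if not 0 < f1 < f2:
            raise ValueError("offsets must be positive and strictly increasing")
        s1 = 10.0 ** (l1 / 10.0)                      # linear noise density at f1
        a = (l2 - l1) / (10.0 * math.log10(f2 / f1))  # power-law exponent of the segment
        if math.isclose(a, -1.0):
            total += s1 * f1 * math.log(f2 / f1)      # 1/f segment integrates to a log
        else:
            total += s1 * f1 / (a + 1.0) * ((f2 / f1) ** (a + 1.0) - 1.0)
    return total


def rms_phase_jitter_seconds(points: Sequence[PhaseNoisePoint],
                             carrier_hz: float) -> float:
    """RMS phase jitter in seconds: sqrt(2 * integrated SSB noise) / (2 * pi * f0)."""
    if carrier_hz <= 0:
        raise ValueError("carrier_hz must be positive")
    return math.sqrt(2.0 * integrate_phase_noise(points)) / (2.0 * math.pi * carrier_hz)


if __name__ == "__main__":
    # Hypothetical phase noise profile for a 156.25 MHz clock, 12 kHz to 20 MHz band.
    profile = [(12e3, -130.0), (100e3, -140.0), (1e6, -150.0), (20e6, -155.0)]
    jitter_s = rms_phase_jitter_seconds(profile, 156.25e6)
    print(f"RMS phase jitter: {jitter_s * 1e15:.1f} fs")
```

The integration band matters as much as the phase noise data, so the offsets used here should match the band specified by the SerDes or the relevant standard.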
Simplifying Clock Trees
Many clock trees require special features in addition to basic clock generation and distribution. For example, the application may require format/level translation (e.g. 3.3 V LVPECL to 2.5 V LVDS), switching between two clocks at different frequencies, clock division, pin-selectable output enable control and CMOS drive strength (output impedance) control for electromagnetic interference (EMI) reduction. If designed discretely, implementing these functions adds significant cost and complexity to the clock tree design. Silicon Labs has developed a family of Si5330x universal buffers/translators that integrate format/level translation, clock muxing, clock division and other key clock tree building block functions. These devices replace multiple LVPECL, LVDS, CML, HCSL and LVCMOS buffers with a single clock buffer IC. In addition to simplified clock tree design (see Figure 2), the Si5330x devices minimize BOM cost and complexity, simplify procurement and improve system performance.
Silicon Labs offers a broad portfolio of frequency-flexible clock generators, clock buffers, jitter-cleaning clocks and XO/VCXOs, as well as highly integrated clock tree solutions.