## The 10-ps Wavelet TDC: Improving FPGA TDC Resolution beyond Its Cell Delay

Jinyuan Wu and Zonghan Shi, Fermi National Accelerator Laboratory, Batavia, IL 60510, USA. jywu168@fnal.gov, (630)840-8911

## Abstract:

There are two major issues in delay-chain-based FPGA TDCs, both due to uneven internal delays in the carry chain. (1) In many applications, the TDC resolution is limited by the "ultra-wide bins" that occur where the carry chain crosses the boundaries of the logic array blocks. The apparent widths of these ultra-wide bins can be several times larger than the average bin width. The "wavelet launcher" described in this paper makes multiple measurements with a single delay chain structure, effectively sub-dividing the ultra-wide bins of each raw measurement. (2) The bin widths are uneven and depend on temperature and power supply voltage, so they must be calibrated as frequently as possible. The auto-calibration functional block developed in this work provides semi-continuous calibration that converts the TDC measurements from bins to picoseconds. Several TDC schemes with resolutions in the 10 to 20 picosecond range, implemented in today's low-cost FPGAs, have been tested.

## Summary:

A delay-chain-based TDC can be implemented in an FPGA using the carry chain structure, as shown in Fig. 1(a). The typical raw bin width, based on our measurements in an Altera Cyclone II device (EP2C8T144C6), is about 60ps. However, the logic elements in the FPGA are organized in logic array blocks (LABs), as shown in Fig. 1(b). Where the carry chain crosses a LAB boundary, extra delay is added to the delay chain, resulting in periodic "ultra-wide bins", as shown in the differential non-linearity (DNL) plot in Fig. 1(c). These ultra-wide bins can be as wide as 165ps.





Two major issues must be solved for practical "turn-key" FPGA TDC applications. (1) In many cases, the TDC resolution is limited by the maximum bin width, not the average bin width, so the ultra-wide bins must be sub-divided. (2) The widths of the bins are uneven and depend on the temperature and power supply voltage, so a convenient calibration scheme is needed.
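A short numerical sketch may help show why a few wide bins matter so much. It uses the standard quantization argument, which is not taken from our measurements: if hits are uniformly distributed in time and each hit is reported at the center of its bin, a bin of width w is hit with probability proportional to w and contributes an error variance of w²/12, so the overall variance scales with the sum of w³. The bin widths below are hypothetical values chosen only so that the average is 60ps and the maximum is 165ps.

```python
import math

def quantization_rms(bin_widths_ps):
    """RMS quantization error (ps) for hits uniformly distributed in time,
    assuming each hit is reported at the center of the bin it lands in."""
    total = sum(bin_widths_ps)
    # P(hit lands in bin i) = w_i / total; error variance inside it = w_i**2 / 12
    variance = sum(w ** 3 for w in bin_widths_ps) / (12.0 * total)
    return math.sqrt(variance)

# Hypothetical 16-cell block: 15 ordinary bins plus one ultra-wide bin
# (illustrative values only; average width is 60 ps).
uneven = [53.0] * 15 + [165.0]
# The same total span divided into 16 equal bins.
even = [sum(uneven) / len(uneven)] * len(uneven)

rms_uneven = quantization_rms(uneven)
rms_even = quantization_rms(even)
print(f"uneven bins: {rms_uneven:.1f} ps RMS, equal bins: {rms_even:.1f} ps RMS")
```

In this made-up example the single ultra-wide bin accounts for roughly two thirds of the total error variance, even though it is only one bin out of sixteen.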

A logic scheme called the "wavelet launcher" is designed to sub-divide the ultra-wide bins in the raw TDC measurements. For each input edge, a wavelet launcher creates a pulse train, or "wavelet", with several 0-to-1 or 1-to-0 logic edges and feeds it into the TDC delay chain/register structure, so that multiple measurements are made. One version we tested, the "wavelet launcher A", is shown in Fig. 2.



Fig. 2. The wavelet launcher A: (a) logic block diagram. (b) Screen dump of the test output. (c) Bin widths plot.

When the input is low, a bit pattern is formed in the wavelet launcher, which is implemented in one LAB containing 16 logic elements as shown in Fig. 2(a). Once the TDC input goes high, the bit pattern, or wavelet, is released and propagates down the carry chain/register array structure. At the leading edge of the system clock (400MHz), the bit pattern is recorded in the register array as shown in Fig. 2(b), and its relative position represents the arrival time of the TDC input. The two 1-to-0 edges are decoded to extract the input arrival time. Note that in a regular TDC without a wavelet launcher, the input is connected directly to the delay chain, and only one transition edge is recorded and decoded for the input time. In Fig. 2(c), the sum of the decoded positions of the two edges, marked as "tn1+tn2", is compared with the output of the regular single-edge TDC. The ultra-wide bins are now sub-divided, and the maximum bin width is about 65ps, compared with 165ps in the regular TDC.
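The sub-division effect can be illustrated with a toy model. The sketch below is not our decoder logic: the cell delays, the 450ps edge spacing (`EDGE_SPACING_PS`), and the treatment of the second edge as a time-delayed copy of the first are all assumptions made for illustration. It scans the arrival time and measures the longest interval over which the output code does not change, once for a single edge and once for the sum of two edge positions.

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical cell delays (ps): four 16-cell LABs, each ending in one ultra-wide bin.
cell_delays = ([53.0] * 15 + [165.0]) * 4
boundaries = list(accumulate(cell_delays))  # time at which an edge leaves each cell

EDGE_SPACING_PS = 450.0  # assumed time offset between the two wavelet edges

def edge_position(elapsed_ps):
    """Index of the cell an edge occupies after propagating for elapsed_ps."""
    return bisect_right(boundaries, elapsed_ps)

def decode_single(elapsed_ps):
    """Regular TDC: position of the one recorded edge."""
    return edge_position(elapsed_ps)

def decode_wavelet(elapsed_ps):
    """Wavelet-style decoding: sum of the positions of both edges (tn1 + tn2)."""
    return edge_position(elapsed_ps) + edge_position(elapsed_ps - EDGE_SPACING_PS)

def max_bin_width(decode, start_ps, stop_ps, step_ps=1.0):
    """Scan the arrival time and return the longest span (ps) with an unchanged code."""
    longest = run = 0
    previous = None
    t = start_ps
    while t < stop_ps:
        code = decode(t)
        run = run + 1 if code == previous else 1
        longest = max(longest, run)
        previous = code
        t += step_ps
    return longest * step_ps

# Scan only the region where both edges are inside the chain.
single_max = max_bin_width(decode_single, EDGE_SPACING_PS, boundaries[-1])
wavelet_max = max_bin_width(decode_wavelet, EDGE_SPACING_PS, boundaries[-1])
print(f"max bin width: single edge {single_max} ps, two-edge sum {wavelet_max} ps")
```

Because the two edges cross the LAB boundaries at different times, the code changes whenever either edge advances by one cell. The summed code therefore has about twice as many bin boundaries as the single-edge code, and the ultra-wide bin of one edge is split by the ordinary bins of the other.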

An auto-calibration functional block has been developed in the FPGA, as shown in Fig. 3(a). The calibration process is semi-continuous and runs during the normal operation of the TDC. The random input hits are booked into the DNL histogram, which is implemented with FPGA internal RAM. Each time 16K hits have been booked, the contents of the histogram are integrated and used to update the lookup table (see Fig. 3(b)). The TDC measurements are passed through the lookup table, which outputs the center time of each input bin in picoseconds. The lookup table therefore automatically tracks the net effect of the temperature and the power supply voltage over the past 16K hits.
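The histogram-to-lookup-table step can be sketched in software as follows. This is a minimal model under stated assumptions, not the FPGA implementation: it assumes the hits are uniformly distributed over one 2.5ns clock period, so that each bin's share of the hits is proportional to its width. The function name `build_lookup_table` and the bin widths used in the simulation are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate

CLOCK_PERIOD_PS = 2500.0     # one period of the 400 MHz system clock
HITS_PER_UPDATE = 16 * 1024  # "16K hits" between lookup-table updates

def build_lookup_table(histogram, period_ps=CLOCK_PERIOD_PS):
    """Integrate a DNL histogram into per-bin center times (ps).

    Assumes hits are uniform over one clock period, so the count in a bin
    is proportional to the width of that bin.
    """
    total = sum(histogram)
    lut, below = [], 0
    for count in histogram:
        # Center of the bin: everything below it plus half of its own count.
        lut.append((below + count / 2.0) * period_ps / total)
        below += count
    return lut

# Hypothetical true bin widths (ps); real values vary with device,
# temperature, and supply voltage.
true_widths = ([53.0] * 15 + [165.0]) * 3
boundaries = list(accumulate(true_widths))

rng = random.Random(1)
histogram = [0] * len(true_widths)
for _ in range(HITS_PER_UPDATE):
    t = rng.uniform(0.0, CLOCK_PERIOD_PS)        # hit uncorrelated with the clock
    histogram[bisect_right(boundaries, t)] += 1  # book the raw bin number

lut = build_lookup_table(histogram)
print([round(center) for center in lut[:4]])
```

A measurement is then converted from a bin number to picoseconds with a single lookup, `lut[bin_number]`. Rebuilding the table after every block of hits is what lets the conversion follow slow drifts in the bin widths.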





A delta t measurement test was done with input hits generated by an independent crystal, and the result is shown in Fig. 3(c). The single-edge raw TDC has a delta t RMS error of about 58ps. After calibration, the RMS error is reduced to 40ps. However, the ultra-wide bins still artificially emphasize or de-emphasize some delta t values in Fig. 3(c). With the wavelet TDC, the RMS error is further reduced to 25ps, and the structure caused by the ultra-wide bins is eliminated.

Another version of the wavelet launcher, the "wavelet launcher B", was also tested. The wavelet launcher B is simply a ring oscillator enabled by the input (Fig. 4(a)). After the input arrives, the carry chain/register array structure takes 16 snapshots of the oscillation bit pattern in 16 clock cycles at 400MHz (Fig. 4(b)). The snapshots are processed inside the FPGA to obtain the input arrival time, averaged over 16 calibrated and compensated data samples. The details of the process will be discussed in our full paper.



Fig. 4. The wavelet launcher B: (a) logic block diagram. (b) Screen dump of the test output. (c) Test result.

The test result in Fig. 4(c) represents the arrival time difference between a random input and its delayed version. It shows that the delta t RMS error of the Wavelet TDC B (about 12ps) is significantly smaller than that of the raw TDC (about 40ps). The drawback of the Wavelet TDC B is a longer dead time, since it needs 16 clock cycles to collect all data points. Also, the result given here represents only the measurement capability. More work is needed to reach this resolution in a real system, which will be discussed in our full paper.

The following table compares the main features of the TDC schemes we have studied. The number of logic elements used is the total taken from the FPGA compile report. It includes the delay chain/register array, encoder, auto calibration, analysis histogram booking, serial port interface, etc.

Device: EP2C8T144C6, Price: \$28 (April 2008), Operating Frequency: 400MHz, Total Logic Elements: 8256

| TDC scheme                | Max. bin width | Avg. bin width | $\Delta t$ RMS error | Dead time | Delay chain length | Logic elements used |
|---------------------------|----------------|----------------|----------------------|-----------|--------------------|---------------------|
| Raw TDC, non-calibrated   | 165ps          | 60ps           | 58ps                 | 2.5ns     |                    |                     |
| Raw TDC, calibrated       | 165ps          | 60ps           | 40ps                 | 2.5ns     |                    | 1621 (20%)          |
| Wavelet TDC A, calibrated | 65ps           | 30ps           | 25ps                 | 5ns       | 64                 |                     |
| Wavelet TDC B, calibrated |                |                | 12ps                 | 45ns      |                    | 1988 (24%)          |