# A Three-Dimensional Stacked-Chip Star-Wiring Interconnection for a Digital Noise-Free and Low-Jitter I/O Clock Distribution Network

Chunghyun Ryu, Daehyun Chung, Choonheung Lee, Jinhan Kim, Kicheol Bae, Jiheon Yu, Seungjae Lee, and Joungho Kim

Abstract-Cascaded repeaters are indispensable circuit elements in conventional on-chip clock distribution networks due to heavy loss characteristics of on-chip global interconnections. However, cascaded repeaters cause significant jitter and skew problems in clock distribution networks when they are affected by power supply switching noise generated by digital logic blocks located on the same die. In this letter, we present a new three-dimensional (3-D) stacked-chip star-wiring interconnection scheme to make a clock distribution network free from both on-chip and package-level power supply noise coupling. The proposed clock distribution scheme provides an extremely low-jitter and low-skew clock signal by replacing the cascaded repeaters with lossless star-wiring interconnections on a 3-D stacked-chip package. We have demonstrated a 500-MHz input/output (I/O) clock delivery with 34-ps peak-to-peak jitter and a skew of 11 ps, while a conventional I/O clock scheme exhibited a 146-ps peak-to-peak jitter and a 177-ps skew in the same power supply noise environment.

Index Terms—Low-jitter clock, low-skew clock, noise coupling, power supply noise, repeater, three-dimensional (3-D) stacked-chip star-wiring clock distribution.

## I. INTRODUCTION

In conventional on-chip input/output (I/O) clock distribution schemes, a series of cascaded repeaters is essential to overcome the increase in high-frequency clock signal loss in global on-chip interconnections [1]. However, cascaded repeaters are a source of unacceptable clock jitter and skew when they operate in an excessive on-chip power supply noise environment caused by on-chip digital switching currents [2]. An alternative solution, termed a "chip-package hybrid clock distribution scheme," has been demonstrated to solve the clock noise issues caused by on-chip global interconnections and repeaters using lossless package-level interconnections on a ball grid array (BGA) substrate. This method noticeably reduced clock jitter and delay [3]. However, the clock traces routed on the lossless BGA package layer can suffer from high-frequency noise coupling from the power/ground plane

Manuscript received April 30, 2006; revised July 26, 2006. This work was supported by the Center for Electronic Packaging Materials (ERC), MOST/KOSEF and the Joint Laboratory program for CEPM-Fraunhofer IZM under Grants R11-2000-085-07001-0 and R11-2000-085-00001-0.

C. Ryu, D. Chung, and J. Kim are with Korea Advanced Institute of Science and Technology (KAIST), Department of Electrical Engineering, Daejeon 305-701 Korea (e-mail: sdomain@eeinfo.kaist.ac.kr).

C. Lee, J. Kim, K. Bae, J. Yu, and S. Lee are with Amkor Technology Korea, Seoul 133-120 Korea.

Digital Object Identifier 10.1109/LMWC.2006.885604



Fig. 1. Proposed 3-D stacked-chip star-wiring interconnection scheme for low-jitter and low-skew clock distribution.

cavities in the BGA package substrate, since the clock traces have via transitions and return current discontinuities across the power/ground plane cavities. Changing package layers while routing a signal or clock is a very common design approach in high-density multilayer BGA package substrates, because it enhances the routing capability for a given number of metal layers. This layer change through via transitions is particularly required to implement a lossless clock tree network [4].

In this letter, we propose a new I/O clock distribution scheme using a three-dimensional (3-D) stacked-chip star-wiring interconnection. The proposed scheme is devised to keep the clock signal delivery network free from both on-chip digital switching noise coupling and package-level power/ground plane cavity noise coupling. The star-wiring interconnection also includes a delay-locked loop (DLL) replica loop on the second stacked silicon chip. In the proposed 3-D stacked-chip star-wiring clocking scheme, the clock signal originates from a clock generation circuit, such as a PLL or DLL, implemented on the first stacked silicon chip, and it then jumps to the second stacked silicon chip through wire-bond or flip-chip interconnections, as shown in Fig. 1. The second stacked silicon chip has a star-shaped clock distribution network built from on-chip metal lines and bond pad structures, and the entire clock distribution network is referred to as a "star-wiring interconnection." The clock signal eventually returns to the first stacked silicon chip through the wire-bond interconnection, and



Fig. 2. Schematic diagrams of: (a) a conventional on-chip I/O clocking scheme and (b) proposed 3-D stacked-chip star-wiring I/O clocking scheme.

reaches the destination circuits, such as the I/O clock buffers, at the periphery of the first stacked chip. The proposed star-wiring interconnection has a much lower resistive loss compared with on-chip metal wires, and does not require cascaded repeaters. The improved performance using the 3-D stacked-chip star-wiring interconnection I/O clock scheme was verified through a series of measurements.

## II. IMPLEMENTATION OF PROPOSED 3-D CLOCK DISTRIBUTION NETWORK

To demonstrate the improved performance of clock delivery using a 3-D stacked-chip clock distribution scheme, a test 3-D stacked-chip package with a star-wiring scheme was fabricated, along with a single-chip package with a conventional clocking scheme, as shown in Fig. 2. Each clocking scheme contained a DLL circuit, I/O buffers, and digital noise blocks to generate the power supply noise on the first stacked chip. The digital noise blocks generated random digital switching noise with a peak-to-peak noise voltage in the range of 300–600 mV. The clock driver, which was a part of the DLL circuit, drove 15 I/O buffers and a DLL feedback loop. In the conventional I/O clock scheme shown in Fig. 2(a), the clock signal runs through lossy on-chip interconnections with a total length of 15 mm using 10 repeaters, and the clock network drives 15 I/O buffers and the DLL feedback loop. In contrast, in the 3-D star-wiring scheme shown in Fig. 2(b), a total clock distribution path of 13 mm and the DLL feedback loop were routed through the wire-bond interconnections and the star network on the second stacked chip. As a result, eight repeaters were eliminated from the clock distribution network of the conventional I/O clock scheme.

Fig. 3 shows photographs of the assembled 3-D stacked-chip package and the star-wiring interconnections. The first stacked chip was fabricated using a 0.35-$\mu$m CMOS process, and had a total area of $4 \times 4$ mm². The area of the second stacked chip was $2 \times 2$ mm², and this was fabricated using a 0.18-$\mu$m CMOS process. The second stacked chip had a star-shaped clock tree with wire-bonded interconnections and on-chip metal lines. The on-chip metal lines on the second stacked chip had short line lengths and wide line widths to minimize the resistive line losses. The bonding wires had much less resistive loss and higher transmission bandwidth compared with the on-chip lines for similar interconnection lengths. As a result, the star-wiring



Fig. 3. Photographs of the fabricated 3-D stacked-chip package.



Fig. 4. Power supply voltage noise measured using a PCB with a mounted package for both the conventional on-chip clock scheme and the proposed clock scheme. The package had chips with a clock distribution network and with digital noise blocks for a 500-Mbps PRBS data pattern.

interconnection offered much lower resistive losses than those of the conventional on-chip clock distribution scheme, along with an enhanced transmission bandwidth. In the proposed clock scheme, the intersymbol-interference (ISI) is caused by the parasitic inductance of the wire-bonds and the capacitance of the star-shaped clock tree, and it distorts the clock signal waveform. However, since the clock signal has a periodic pattern, the ISI simply degrades the voltage margin of the clock signal instead of producing timing jitter. The degraded voltage margin is recovered at the local buffer which consists of inverters as in Fig. 2 [3].
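The claim that ISI on a periodic clock costs voltage margin rather than timing can be illustrated numerically. The sketch below is not the authors' model: as a simplifying assumption it replaces the wire-bond inductance and clock-tree capacitance with a single first-order low-pass filter, and the time constant `tau_bits`, the sampling density, and the function names are hypothetical choices made only for illustration. It passes a periodic 1010 pattern and a random bit pattern through the same filter and compares the spread of the threshold-crossing delays, measured in bit periods.

```python
import math
import random


def crossing_delays(bits, samples_per_bit=200, tau_bits=0.6, warmup_edges=20):
    """Filter an NRZ waveform with a first-order low-pass (assumed channel model)
    and return the delay, in bit periods, from each input edge to the point where
    the filtered output crosses the mid-level threshold."""
    alpha = 1.0 - math.exp(-1.0 / (tau_bits * samples_per_bit))
    y = float(bits[0])      # start settled on the first bit level
    pending_edge = None     # sample index of the latest edge not yet crossed
    delays = []
    for i, bit in enumerate(bits):
        if i > 0 and bit != bits[i - 1]:
            pending_edge = i * samples_per_bit
        for k in range(samples_per_bit):
            prev = y
            y += alpha * (bit - y)
            if pending_edge is not None and (prev - 0.5) * (y - 0.5) <= 0.0:
                # Linear interpolation between the two samples around the crossing.
                frac = (0.5 - prev) / (y - prev)
                crossing = i * samples_per_bit + k + frac
                delays.append((crossing - pending_edge) / samples_per_bit)
                pending_edge = None
    # Drop the start-up transient so only steady-state edges are compared.
    return delays[warmup_edges:]


def peak_to_peak(values):
    return max(values) - min(values)


# Periodic clock-like pattern versus random data through the same channel.
clock_bits = [i % 2 for i in range(400)]
rng = random.Random(1)
data_bits = [rng.randint(0, 1) for _ in range(2000)]

clock_delays = crossing_delays(clock_bits)
data_delays = crossing_delays(data_bits)
clock_pp = peak_to_peak(clock_delays)
data_pp = peak_to_peak(data_delays)

print(f"clock pattern: delay spread = {clock_pp:.6f} bit periods")
print(f"random data:   delay spread = {data_pp:.6f} bit periods")
```

Under these assumptions every edge of the periodic pattern sees the same history, so the crossing delay settles to a constant and shows up only as a fixed offset plus a reduced swing. The random pattern produces a history-dependent spread in crossing time, which is the data-dependent jitter that a clock signal avoids.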

## III. EXPERIMENTAL RESULTS

Fig. 4 shows the measured power supply noise voltage waveform generated in the test PCBs with the mounted packages shown in Fig. 2(a) and (b), which have the same power supply noise since they have identical digital noise blocks. The digital noise blocks were switched using a 500-Mbps pseudo random



Fig. 5. Observed clock jitter performance for 2,500 samples at a clock frequency of 500 MHz using a 300-mV power supply noise: (a) a conventional on-chip I/O clock distribution scheme and (b) proposed 3-D stacked-chip star-wiring I/O clock distribution scheme.

bit sequence (PRBS) data pattern employing a 2.5-V power supply. A 300-mV peak-to-peak digital switching noise was observed, which is a typical magnitude of the power supply noise voltage for modern digital chips. The power supply noise waveform, shown in Fig. 4, exhibited a harmonic frequency component from the 500-Mbps PRBS digital data pattern generated in the first stacked chip. In the power supply noise environment, the clock jitter was measured for the packages shown in Fig. 2(a) and (b).

Fig. 5 shows the considerably enhanced clock jitter performance using the proposed clock distribution scheme. For a clock frequency of 500 MHz, the conventional I/O clock distribution scheme showed a peak-to-peak jitter of 146 ps and an RMS jitter of 22 ps using 2,500 samples, while the proposed 3-D stacked-chip star-wiring I/O clock distribution scheme exhibited a peak-to-peak jitter of 34 ps and an RMS jitter of 5 ps. Furthermore, we showed that a significant reduction in the clock skew could be achieved using the proposed scheme. As seen in Fig. 6, the conventional I/O clock distribution scheme produced a clock skew of 177 ps, while the proposed scheme exhibited a clock skew of 11 ps under the same test conditions. The noticeable performance improvement of the proposed clock scheme was obtained by combining the on-chip wire and the star-wiring interconnection, thereby eliminating repeaters from the clock distribution path.
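To put the measured figures in proportion, the short calculation below restates them as improvement ratios and as fractions of the 500-MHz clock period. The jitter and skew values are the ones reported above; the dictionary layout and function names are illustrative assumptions only.

```python
CLOCK_HZ = 500e6
PERIOD_PS = 1e12 / CLOCK_HZ  # one clock period in picoseconds

# Measured values quoted in the text (picoseconds).
measured = {
    "conventional": {"pp_jitter_ps": 146, "rms_jitter_ps": 22, "skew_ps": 177},
    "proposed": {"pp_jitter_ps": 34, "rms_jitter_ps": 5, "skew_ps": 11},
}


def reduction_factor(metric):
    """Ratio of the conventional value to the proposed value for one metric."""
    return measured["conventional"][metric] / measured["proposed"][metric]


def percent_of_period(value_ps):
    """Express a timing quantity as a percentage of the clock period."""
    return 100.0 * value_ps / PERIOD_PS


for metric in ("pp_jitter_ps", "rms_jitter_ps", "skew_ps"):
    conv = measured["conventional"][metric]
    prop = measured["proposed"][metric]
    print(
        f"{metric}: {conv} ps ({percent_of_period(conv):.2f}% of period) -> "
        f"{prop} ps ({percent_of_period(prop):.2f}% of period), "
        f"reduction x{reduction_factor(metric):.1f}"
    )
```

At a 2000-ps clock period, the peak-to-peak jitter falls from 7.3% to 1.7% of the period and the skew from about 8.9% to about 0.6%, corresponding to reductions of roughly 4.3 times in peak-to-peak jitter, 4.4 times in RMS jitter, and 16 times in skew.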



Fig. 6. The observed clock skew waveforms for a clock frequency of 500 MHz: (a) a conventional on-chip I/O clock distribution scheme and (b) proposed 3-D stacked-chip star-wiring I/O clock distribution scheme.

## IV. CONCLUSION

A novel clock distribution scheme that delivers a low-jitter and low-skew clock free from both on-chip and package-level power supply noise coupling has been successfully developed. It is based on low-loss star-wiring interconnections that use wire-bonds in a 3-D stacked package together with metal lines on the stacked die. The improved performance of the proposed clock scheme was verified through a series of measurements demonstrating significantly reduced clock jitter and skew. The frequency of the deliverable clock can be further increased over the GHz range using an advanced CMOS process for the DLL circuit and the clock buffers on the first stacked die. The ultimate limit of the deliverable clock frequency is determined by the transmission bandwidth of the bond-wires in the star-wiring clock network, which is >5 GHz.

## REFERENCES

- [1] V. Adler and E. G. Friedman, "Repeater design to reduce delay and power in resistive interconnect," *IEEE Trans. Circuits Syst. II*, vol. 45, no. 5, pp. 607–616, May 1998.
- [2] P. J. Restle, "A clock distribution network for microprocessors," *IEEE J. Solid-State Circuits*, vol. 36, no. 5, pp. 792–799, May 2001.
- [3] D. Chung, C. Ryu, H. Kim, C. Lee, J. Kim, K. Bae, J. Yu, H. Yoo, and J. Kim, "Chip-package hybrid clock distribution network and DLL for low jitter clock delivery," *IEEE J. Solid-State Circuits*, vol. 41, no. 1, pp. 274–286, Jan. 2006.
- [4] J. Park, H. Kim, Y. Jeong, J. Kim, J. S. Pak, D. G. Kam, and J. Kim, "Modeling and measurement of simultaneous switching noise coupling through signal via transition," *IEEE Trans. Adv. Packag.*, vol. 29, no. 3, pp. 548–559, Aug. 2006.