A 6Gb/s Transceiver Design with Phase-Difference Modulation Signaling for Multi-drop DRAM Interface
Chang Yoon Han
1, Jae Young Seo
2and Byung Sub Kim
aDepartment of Electrical Engineering, Pohang University of Science and Technology E-mail: 1[email protected], 2[email protected]
Abstract - In this paper, we designed the phase-difference modulation (PDM) transceiver for the application of PDM signaling in the multi-drop DRAM interface. Because PDM signaling reduced the effect of the reflected signal by positioning the reflected signal between the clock edges, In addition, PDM transceiver did not increase the hardware cost because it does not demand DFE and FFE circuits. With PDM signaling, we implemented the two amplifiers, which make the design complexity of the clock recovery circuit simple: the clock recovery circuit is a simple interpolator. The proposed PDM transceiver was fabricated in 65 nm CMOS technology and verified the performance by simulations. To verify the performance of the PDM signaling, we compared the simulated 6 Gb/s eye diagram in the multi-drop channel with the NRZ signaling. The simulated vertical and horizontal eye sizes in PDM signaling were increased to 60.5 mV and 63.7 ps, respectively; but the simulated eye was closed in NRZ signaling.
Therefore, with PDM signaling, the multi-drop memory interfaces with high capacity are feasible without increasing the power and hardware cost.
Keywords—Inter-symbol interference, Multi-drop memory interface, Phase difference modulation signaling
I. INTRODUCTION
The required capacity and bandwidth of dynamic random access memory (DRAM) memory interfaces have rapidly increased due to the development of next-generation technology such as big data, machine learning, and artificial intelligence (AI) [1]. For this reason, the DRAM memory interfaces must have high density, high speed, and low power requirements for high-speed data communication of the CPU in the memory industry.
The multi-drop channel in memory interfaces is widely used because it has provided high memory capacity at a low cost. A memory capacity has been increased by increasing the number of drop counts of multi-drop channels. However, the increasing number of drop counts causes reflective inter- symbol interferences (ISIs) due to channel discontinuities at the stubs [2], [3]. Therefore, the reflective ISIs must be
suppressed to increase memory capacity without data distortion in DRAM memory interfaces.
The DDR5 interface adopted advanced techniques to suppress reflective ISI and achieved a high data rate ranging from 3.2 Gb/s to 6.4 Gb/s [4]. However, the advanced techniques increase hardware costs. For example, an advanced dual-inline memory module (DIMM) was adopted in DDR5 interfaces such as registered DIMM (RDIMM) and load-reduced DIMM (LRDIMM) [5], [6]. RDIMM and LRDIMM compensate signal distortion caused by reflective ISIs by placing a register or buffer between the memory controller and DRAM. Also, DDR5 adopted a 1-tap decision feedback equalizer (DFE) to remove 1st post-cursor at RX.
However, DFE requires a large amount of hardware cost to remove irregular reflective ISIs when the number of drops is substantial. To remove a large number of post-cursors, a conventional DFE transceiver requires many DFE taps which significantly increases the chip area and power consumption. When the data rate increases, reflective ISI also increases in the response characteristic of the multi-drop channel, and a large amount of DFE taps is required to compensate for the signal integrity. In a related research paper, NRZ signaling required a 19-tap to obtain 7.8 Gb/s in a multi-drop channel with a 10-cm stub [7].
According to a recent study, the phase-difference modulation (PDM) signaling was presented to reduce reflective ISI instead of non-return-to-zero (NRZ) signaling, which is widely used in multi-drop channels [8]. The PDM signal transmits the data by embedding it on the clock edge.
In PDM signaling, the time-difference ISI (TISI) is defined in the time domain because the PDM signaling transmitted data in the time domain (Fig. 1). Therefore, PDM can reduce the effect of the reflected signal by positioning the reflected signal between the clock edges. In addition, PDM did not increase the hardware cost because PDM does not demand DFE and FFE circuits.
a. Corresponding author; [email protected]
Manuscript Received Apr. 21, 2022, Revised Jun. 20, 2022, Accepted Jun.
26, 2022
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/bync/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Fig. 1. The concept of the PDM signaling [8].
1 0
1 0
1 0
1 0
CKTX
VTX
PDM TX
PDM RX Multi-drop Channel
Multi-drop Channel VTX
VTX/CKTX
VRX
VRX/CKRX
We designed the phase-difference modulation (PDM) transceiver for the application of PDM signaling in the multi- drop DRAM interface.The previous work [8] demonstrated the PDM signaling is the effective method in the highly reflective interconnects. In this paper, we applied the PDM signaling in more irregular and lossy multi-drop channel than [8], and we compared the performance of PDM and NRZ from the simulation results. We designed the PDM transceiver for the multi-drop channel without significantly increasing hardware cost and power consumption.
The rest of this paper is organized as follows. Section II describes the circuit implementation, Section III shows the simulation results. Finally, Section IV concludes this paper.
II. CIRCUIT IMPLEMENTATION
A. PDM Transmitter Architecture
The PDM transmitter consists of a clock control block, 2:1 serializer, PDM block, and a source series terminated (SST) driver (Fig. 2). The PDM transmitter was implemented in both single-end and differential mode. In the differential mode, the PDM transmitter transfers the modulated data VTX
with complementally data VTXB. Also, in the single-end mode, the PDM transmitter transfers the modulated data VTX
with reference clock CKTX.
The skew-corrected clock signal CKskew was generated in the clock control block, which consists of a duty cycle corrector and a skew control block. A half-rate odd and even data (DO and De) are serialized to full-rate data (DSER) in the 2:1 serializer (Fig. 2). The PDM signal CKmod was generated by modulating CKskew depending on DSER in the PDM block.
An external signal digitally controlled the strength of two- foot transistors (F1, F2) to determine the time difference.
Finally, the PDM signal was transmitted to the multi-drop channel through the SST driver, which is segmented into 6- bit binary-weighted slices for 50 impedance matching.
B. PDM Receiver Architecture 1) Overall Architecture
Fig. 3 shows the overall architecture of the PDM receiver.
The PDM receiver consists of peaking amplifiers, second- stage amplifiers, slicers, and a clock recovery circuit. The PDM receiver was designed in the half-rate architecture. The even data path detected time difference at the rising edges of VRX, and the odd data path detected time difference at the falling edges of VRX.
2) Peaking Amplifier & 2nd Stage Amplifier Circuit
At the input stage, the peaking amplifier is used to compensate for the high-frequency components which are decreasing due to the significant signal distortion in the multi-drop channel. Fig. 4 (a) shows the schematic diagram of the peaking amplifier. The peaking amplifier compensated high-frequency components using cross-coupled PMOS load and active-inductor. The strength of the resister of active-inductor and cross-coupled PMOS of peaking amplifier was digitally controlled by an external signal. The cross-coupled PMOSs provide positive feedback and generate a negative transconductance, and the negative transconductance cancels some positive conductance at the output and a high gain is obtained. Also, strength of the active inductor can be controlled by adjusting the strength of resistance.
Next, the 2nd stage amplifier with a conventional slicer recovers the data in phase difference. Fig. 4 (b) shows the schematic diagram of the 2nd stage amplifier. PDM RX utilized the P-type and N-type 2nd stage amplifier working on rising edges and falling edges, respectively. In the case of the P-type 2nd stage amplifier consists of PMOS input pairs and cross-coupled NMOSs. When the received PDM signal is low, P-type 2nd stage amplifier works like an amplifier because the PMOS pairs are dominant. Thus, the outputs of P-type 2nd stage amplifier are split signal depending on the phase difference. After, when the received PDM signal is high, P-type 2nd stage amplifier works like a latch because
Fig. 2. PDM TX overall architecture [9].
Clock Control
Block
SST Driver
VTX / VTXB / CKTX
Mode -Sel SER CKin
Do De
Diff/Sing 1 0
CKSkew DSER
PDM Block
F1
F2 M2
M1
Amp
Amp Clock Recovery Peaking
Amps N-type
Slicer
P-type Slicer VREF
VREF
VRX
VRXB /
CKRX
De, r
Do, r
Fig. 3. PDM RX block diagram.
(a) (b)
Fig. 4. Schematic diagram of (a) the peaking amplifier and the P-typee 2nd stage amplifier at PDM RX
the cross-coupled NMOSs are dominant. Finally, the slicer performs data decisions from the received PDM signal. The slicer was implemented a strong-Arm sense amplifier.
3) Clock Recovery Circuit
A simple clock recovery block with an interpolator is adopted because the clock is embedded in the received signal. Fig. 5 shows the schematic diagram of the phase interpolator and the skew control circuit in the clock recovery circuit. The received signal was not rail-to-rail full swing because the received signal was attenuated by channel loss. Therefore, the phase interpolator circuit based on current mode logic (CML) was adopted in the clock recovery circuit instead of the phase interpolator based on CMOS logic. The output phase was controlled by the tail current factor in the phase interpolator circuit based on CML.
Also, the tail current was digitally configured by using the current mirror and external current Digital-to-Analog Converter (DAC).
Although the circuit based on CML has the advantage of high performance of amplification, DC power consumption is large. Therefore, the delay block was implemented based on CMOS to decrease power consumption. The 2nd stage implemented the CML-to-CMOS circuit and generated the rail-to-rail swing clock signal as the second stage's output. In the 3rd and 4th stages, the delay cell circuits have performed the function of adjusting the clock delay. The delay cell consists of the delay block and skew control block. Delay block was designed for course-tuning of recovered clock delay with resolution 40 ps and skew control block was designed for fine-tuning of recovered clock delay with resolution 5 ps. Delay block and skew control block are controlled by a 3-bit binary. The delay cell based on CMOS was implemented to cover the clock delay range over 1-UI.
III. RESULTS AND DISCUSSIONS
Fig. 6 shows the layout of the fabricated chip. The proposed transceiver was fabricated in TSMC 65 nm CMOS technology. To minimize the parasitic capacitance of the high-speed path, the transceiver core was located close to a high-speed signal pad. To minimize the inductance of wire bonding, the fabricated chip was located close to the PCB pad. Section III-A and III-B show the simulation results of the fabricated chip.
A. PDM Transmitter Simulation
Fig. 7 shows the simulated eye diagram of PDM block output CKmod. The signal CKmodwas modulated depending on the serialized data DSER. The time difference was simulated by controlling the two weak foot-transistor banks (F1, F2). When the clock was rising, the maximum time difference of clock edge was 58 ps; when the clock was falling, the maximum time difference of clock edge was 54 ps.
(a) (b)
Fig. 6. The chip layout of (a) PDM transmitter and (b) PDM receiver core.
(a) SST Driver
Latch & MUX
PDM Block
PDM Block Clock Control Block
Digital Logic Digital Logic
Peaking Amps
Peaking Amps
Resister Bank
Slicer 2nd Amps
Clock Recovery
(b)
Fig. 7. Simulated eye diagram of PDM block output CKmod at PDM TX.
VREF
1-
CML2 CMOS
b
1-b 1 0
1 0
CKext
CKBext
CKBrRX
CKrRX
CML Interpolator
CMOS Intp.
CMOS Intp.
VREF
/CKRX
VRXBPDM
VRXPDM
Skew control Delay Block
40ps 40ps 40ps 40ps
8:1 MUX
Sub Block
Sub Block
Fig. 5. Schematic diagram of clock recovery circuit at PDM RX
(a)
(b)
Fig. 8. Simulated frequency response of (a) channel and (b) peaking amplefier at PDM RX.
B. PDM Receiver Simulation
The gray line in Fig. 8 (a) shows the frequency response of the multi-drop channel. Because the multi-drop channel was designed with many drop counts, the frequency response is irregular and not smooth. Also, the channel loss is significant due to the increased drop count. The black line in Fig. 8 (a) shows the frequency response of the multi-drop channel with a peaking amplifier. The peaking amplifier compensated the channel loss which increased due to the parasitic capacitance of the peaking amplifier at the target data rate of 6 Gb/s. Fig. 8 (b) shows the AC simulation of the peaking amplifier. The peaking amplifier provides 3.23dB peaking to compensate for the channel loss at the target data rate of 6Gb/s.
Fig. 9 shows the transient simulation result of the 2nd stage amplifier. The 2nd stage amplifier has a hold and amplification phases. After that, the output signal of 2nd stage amplifier is transmitted to the slicer, and data was decided.
Because a wide horizontal eye was achieved at the hold phase, the clock timing requirement at the following slicer was much more relaxed. It is noticeable that the simulated horizontal eye size can be larger than 1 UI in half-rate.
In the simulation, the clock delay range is about 200 ps which is wider than 1 UI of 6 Gb/s. Fig. 10 shows the simulated clock delay range of the clock recovery circuit.
The simulated total delay range is 287.44 ps with resolution less than 5ps. Finally, the total delay range covers 1.44 UI at the target data rate 6 Gb/s.
Fig. 11 shows the simulated differential eye diagrams of PDM and NRZ signalings in the multi-drop channel at 6 Gb/s. The simulated vertical and horizontal eye sizes with PDM signaling was increased to 60.5 mV and 63.7 ps, respectivley (Fig. 6 (a)). However, the eye was closed with NRZ signaling (Fig. 6 (b)).
IV. CONCLUSION
We designed the phase-difference modulation (PDM) transceiver for the application of PDM signaling for the multi-drop DRAM interface. We demonstated the PDM signaling is efficiency in more irregular and lossy multi-drop channel than [8]. The PDM transceiver did not use DFE and FFE because PDM signaling reduce the effect of the reflectied signal by adjusting the position of reflectived signal instead of removing reflective ISI. Therefore, the PDM signaling can reduce the power and hardware cost required. In addition, we have fabricated the PDM transceiver chip. In the simulation result, the PDM signaling was improved the eye size than NRZ signaling in multi-drop channel. As a result, the PDM signaling achieved high memory capacity without additional hardware cost. Finally, the application of PDM signaling in a multi-drop channel is expected to increase the total memory capacity of the system at a low cost. In the next study, we will verify our fabricated chip with the multi-drop printed circuit board (PCB) channel.
ACKNOWLEDGMENT
The chip fabrication and EDA tool were supported by IC Design Education Center (IDEC), Korea. This work was supported in part by the Ministry of Science and ICT (MSIT), Korea, under the Information Technology Research Center (ITRC) support program (IITP-2021-0-02052) supervised by the Institute for Information &
Communications Technology Planning & Evaluation (IITP).
Fig. 10. Simulation result of clock recovery Fig. 9. transient simulation result of the 2nd stage amplifier
(a)
(b)
Fig. 11. Simulated differencial eye diagrams at the RX front-end of (a) PDM signaling and (b) NRZ signaling
REFERENCES
[1] Y.-C. Kwon et al., “25.4 A 20nm 6GB Function-In- Memory DRAM, Based on HBM2 with a 1.2 TFLOPS Programmable Computing Unit Using Bank-Level Parallelism, for Machine Learning Applications,” in Proc. IEEE Int. Solid-State Circuits Conf., 2021, pp.
350-352.
[2] W.-Y. Shin et al., “A 4.8 Gb/s impedance-matched bidirectional multi-drop transceiver for high-capacity memory interface,” in Proc. IEEE Int. Solid-State Circuits Conf., Feb. 2011, pp. 494-495.
[3] W. Lee et al., “Parallel Branching of Two 2-DIMM Sections With Write-Direction Impedance Matching for an 8-Drop 6.4-Gb/s SDRAM Interface,” in IEEE Trans.
Compon, Packag Manuf Thchnol., vol. 9, no. 2, pp. 336- 342, Feb. 2019.
[4] DDR5 SDRAM Standard, JEDEC standard, JESD79-5, 2020.
[5] DDR4 SDRAM Registered DIMM Design Specificati- on, JEDEC standard, MODULE4.20.28.B, 2020.
[6] DDR4 SDRAM Load Reduced DIMM Design Specifi- cation, JEDEC standard, MODULE4.20.27.D, 2016.
[7] J. Seo et al., “A 7.8Gb/s 2.9pJ/b Single-Ended Receiver with 20 tap DFE for Highly-Reflective Channels,” in IEEE Trans. Very Large Scale Integr.(TVLSI) Systems., vol. 28, no. 3, pp. 818-822, Mar. 2020.
[8] S. Lee et al., “A 7.8 Gb/s/pin, 1.96pJ/b transceiver with phase-difference-modulation signaling for highly reflective interconnects,” in IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 67, no. 6, pp. 2114-2127, Jun. 2020.
[9] S. Lee, J. Seo, C. Han, J. -Y. Sim, H. -J. Park and B. Kim,
"A DFE-Enhanced Phase-Difference Modulation Signaling for Multi-Drop Memory Interfaces," in IEEE Trans. Circuits and Syst. II: Express Briefs, vol. 68, no.
6, pp. 1862-1866, Jun. 2021.
Chang Yoon Han (S’20) received the B. S. degrees in Electrical Engineering from Pohang University of Science and Technology (POSTECH), Pohang, Korea, in 2020. He is currently pursuing the M.S. degree in the Department of Electrical Engineering from POSTECH, Pohang. Korea. His interests include high-speed low power links, signal/ power integrity, and interconnect modeling.
Jae Young Seo (S’15) received the B.S. and M.S. degrees in Electrical Engineering from Pohang University of Science and Technology (POSTECH), Pohang, Korea, in 2015 and 2017, respectively. He is currently pursuing the Ph.D. degree in the Department of Electrical Engineering from POSTECH, Pohang. Korea. His interests include high-speed low power links, signal/power integrity, and interconnect modeling.
Byung Sub Kim (M’11-SM’16) received the B.S. degree in electrical engineering from the Pohang University of Science and Technology (POSTECH), Pohang, Korea, in 2000, and the M.S. and Ph.D. degrees in electrical engineering and computer science from Massachusetts Institute of Technology (MIT), Cambridge, MA, USA, in 2004 and 2010, respectively.
Dr. Kim received several honorable awards. He was a recipient of the IEEE Journal of Solid-State Circuits Best Paper Award, in 2009, the Analog Device Inc., and the Outstanding Student Designer Award from MIT, in 2009. He was a corecipient of the Beatrice Winner Award for Editorial Excellence at the 2009 IEEE International Solid-State Circuits Conference. For several years, he served or has been serving as the Technical Program Committee Member of the IEEE International Solid-State Circuits Conference and the IEEE Asian Solid-State Circuit Conference.