SerDes-equalization-portfolio

High-Speed PAM-4 SerDes: Adaptive Equalization & GPU Simulation

Overview

This repository contains a system-level simulation of a high-speed PAM-4 Serializer/Deserializer (SerDes) receiver. The project focuses on mitigating Signal Integrity (SI) challenges, specifically Intersymbol Interference (ISI) and additive white Gaussian noise (AWGN), across a dispersive copper channel using adaptive Digital Signal Processing (DSP).
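As a minimal sketch of this physical-layer model (the function name, impulse response, and API below are illustrative assumptions, not the repository's actual `src/channel.py` interface), a PAM-4 stream can be pushed through a short dispersive FIR channel and degraded with AWGN like so:

```python
import numpy as np

def pam4_channel(n_symbols, h, snr_db, seed=0):
    """Generate PAM-4 symbols, apply a dispersive FIR channel, add AWGN."""
    rng = np.random.default_rng(seed)
    levels = np.array([-3.0, -1.0, 1.0, 3.0])      # PAM-4 amplitude levels
    tx = levels[rng.integers(0, 4, n_symbols)]
    rx = np.convolve(tx, h)[:n_symbols]            # ISI from the impulse response
    sig_power = np.mean(rx ** 2)
    noise_sigma = np.sqrt(sig_power / 10 ** (snr_db / 10))
    rx = rx + rng.normal(0.0, noise_sigma, n_symbols)
    return tx, rx

# Illustrative 3-tap channel smearing each symbol across neighbouring UIs
tx, rx = pam4_channel(10_000, h=[0.7, 0.25, 0.1], snr_db=24)
```

The dominant first tap plus decaying post-cursors is the classic low-pass signature of a lossy copper trace; the repository's channel model is likely richer, but this is enough to exercise the equalizer.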

To break the statistical bottleneck of standard CPU-based Monte Carlo simulations, the inference engine is GPU-accelerated via CuPy. This architecture achieves a processing throughput of ~350 Mega-Symbols per second, enabling deep Bit Error Rate (BER) validation down to $10^{-9}$ and proving hardware-grade reliability without arbitrary error floors.

Core Architecture & Engineering Trade-Offs

1. Adaptive Equalization (LMS FFE)

The system employs a 7-tap Feed-Forward Equalizer (FFE). The filter weights are dynamically trained using the Least Mean Squares (LMS) algorithm based on a known preamble sequence.
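A minimal NumPy sketch of this training loop (the 7-tap count and the LMS update match the text, but the centre-spike initialisation, step size, and preamble handling are assumptions, not the repository's `src/equalizers.py` code):

```python
import numpy as np

def lms_ffe_train(rx, tx, n_taps=7, mu=1e-3):
    """Train FFE taps on a known preamble via w[n+1] = w[n] + mu*e[n]*x[n]."""
    w = np.zeros(n_taps)
    w[n_taps // 2] = 1.0                    # centre-spike initialisation
    delay = n_taps // 2                     # FFE group delay of the target
    for n in range(n_taps - 1, len(rx)):
        x = rx[n - n_taps + 1:n + 1][::-1]  # sliding window, newest sample first
        y = w @ x                           # equalizer output
        e = tx[n - delay] - y               # error vs. delayed known symbol
        w += mu * e * x                     # stochastic gradient step
    return w
```

Note the target symbol is delayed by the filter's own group delay; without that shift the loop would try to equalize toward a misaligned reference and never converge.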

Architectural Decision: Why FFE over DFE?

During the design phase, a combined FFE + DFE architecture was heavily evaluated against a standalone FFE. However, the deep GPU Monte Carlo simulation revealed a critical PAM-4 behavior: DFE Error Propagation.

Because the voltage margin between PAM-4 levels is narrow, residual thermal noise at $\sim 24\,\text{dB}$ SNR occasionally causes the hardware Slicer to make an incorrect initial symbol decision. Feeding that incorrect decision back through the DFE's IIR loop generated bursts of sequential errors. The deep Monte Carlo runs demonstrated that the theoretical gain from cancelling residual post-cursors was negated by these noise-induced error avalanches.
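The feedback mechanism behind this failure mode can be sketched with a hypothetical 1-tap DFE loop (an illustration, not the repository's DFE implementation): on clean data it cancels the post-cursor exactly, but any slicer error re-enters the loop and corrupts the ISI estimate for the next symbol.

```python
import numpy as np

LEVELS = np.array([-3.0, -1.0, 1.0, 3.0])

def pam4_slice(y):
    """Hard decision: snap the sample to the nearest PAM-4 level."""
    return LEVELS[np.argmin(np.abs(LEVELS - y))]

def dfe_1tap(rx, b):
    """1-tap DFE: subtract the previous *decision*, scaled by post-cursor b.
    Because the decision re-enters this IIR loop, a single slicer error
    propagates into the ISI estimate of the following symbol."""
    decisions = np.empty(len(rx))
    prev = 0.0
    for i, r in enumerate(rx):
        y = r - b * prev          # cancel the estimated post-cursor
        prev = pam4_slice(y)
        decisions[i] = prev
    return decisions
```

With PAM-4's 1 V-normalized half-eye margin, a wrong decision offsets the next sample by $b \cdot \Delta$, which is exactly the avalanche behaviour the Monte Carlo runs exposed.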

The Verdict: The optimal PPA (Power, Performance, Area) choice is a standalone 7-tap linear FFE. It avoids the severe RTL critical path of closing a 1-UI non-linear feedback loop, saves significant silicon area, and keeps the design on track for the standard $10^{-12}$ BER target.

2. GPU Batched Inference

The recursive nature of the LMS algorithm ($w[n+1] = w[n] + \mu \cdot e[n] \cdot x[n]$) creates a data dependency that destroys SIMD parallelism.

  • The Solution: The architecture splits the data path. Training occurs sequentially on the CPU to reach steady-state Wiener filter weights. These optimal, frozen weights are then transferred to the GPU. The inference phase is executed as highly parallelized convolutions over massive batches ($10^7$ symbols), strictly managing VRAM utilization while fully saturating the GPU cores.
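A sketch of the frozen-weight inference path, written against the NumPy API (CuPy mirrors it closely, so `import cupy as np` moves essentially the same code to the GPU; the function name is hypothetical, and edge overlap between batches is ignored here for brevity):

```python
import numpy as np  # swap for `import cupy as np` to run on the GPU

def batched_ffe_inference(rx, w, batch_size=1_000_000):
    """Apply frozen FFE weights as a plain convolution over large batches,
    then slice to PAM-4 decisions. With the weights frozen, no data
    dependency remains, so every batch is embarrassingly parallel."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    thresholds = (levels[:-1] + levels[1:]) / 2   # decision boundaries -2, 0, 2
    out = []
    for start in range(0, len(rx), batch_size):
        chunk = rx[start:start + batch_size]      # bounded VRAM footprint
        y = np.convolve(chunk, w, mode='same')    # parallel FIR filtering
        out.append(levels[np.searchsorted(thresholds, y)])
    return np.concatenate(out)
```

The batch size is the VRAM/throughput knob: larger batches keep the GPU cores saturated, at the cost of a proportionally larger working set on the device.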

3. Hardware Synchronization (Alignment)

Accurate BER calculation requires exact synchronization between the transmitted symbol and the Slicer's decision. The system dynamically computes the overall group delay by isolating the energy peaks (Main Cursors) of both the channel impulse response and the converged FFE tap weights, preventing false error generation.
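A minimal sketch of this alignment step (the helper names are hypothetical): the overall group delay is taken as the sum of the main-cursor (peak-energy) indices of the channel impulse response and the converged FFE taps, and decisions are shifted by that amount before comparison.

```python
import numpy as np

def total_delay(h, w):
    """Overall group delay: sum of the main-cursor indices of the channel
    impulse response and the converged FFE tap weights."""
    return int(np.argmax(np.abs(h)) + np.argmax(np.abs(w)))

def aligned_ber(tx, decisions, delay):
    """Shift decisions by the computed delay before counting symbol errors,
    so pipeline latency is not miscounted as bit errors."""
    n = len(tx) - delay
    return np.mean(tx[:n] != decisions[delay:delay + n])
```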

Project Structure

  • src/channel.py: Physical layer modeling (PAM-4 generation, dispersion, dynamic AWGN).
  • src/equalizers.py: CPU-based LMS algorithms for FFE/DFE training and hardware comparator logic.
  • src/gpu_accelerator.py: CuPy based inference engine featuring batched processing and Early Stopping logic.
  • main.py: The top-level orchestrator that drives the simulation and generates the Waterfall curve.

Performance Results

The GPU acceleration allowed 34 billion symbols to be processed in ~95 seconds on a consumer-grade GPU (GTX 1660 Ti), cleanly resolving a BER of $2.9 \cdot 10^{-9}$ at 27 dB SNR and demonstrating that the equalizer approaches the $10^{-12}$ industry standard without encountering a residual ISI error floor.

BER Waterfall Curve

Quick Start

  1. Ensure CUDA and CuPy are installed on your machine.
  2. Clone the repository.
  3. Run the orchestrator:
    python main.py
