This repository contains a system-level simulation of a high-speed PAM-4 Serializer/Deserializer (SerDes) receiver. The project focuses on mitigating Signal Integrity (SI) challenges, specifically Intersymbol Interference (ISI) and thermal noise modeled as Additive White Gaussian Noise (AWGN), across a dispersive copper channel using adaptive Digital Signal Processing (DSP).
To break the statistical bottleneck of standard CPU-based Monte Carlo simulations, the inference engine is GPU-accelerated via CuPy. This architecture achieves a processing throughput of ~350 Mega-Symbols per second, enabling deep Bit Error Rate (BER) validation down to
The system employs a 7-tap Feed-Forward Equalizer (FFE). The filter weights are dynamically trained using the Least Mean Squares (LMS) algorithm based on a known preamble sequence.
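The training step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in `src/equalizers.py`: the function name `train_ffe_lms`, the step size `mu`, the `delay` parameter, the centre-spike initialisation, and the toy channel `h` are all assumptions made for the example.

```python
import numpy as np

PAM4_LEVELS = np.array([-3.0, -1.0, 1.0, 3.0])


def train_ffe_lms(rx, preamble, n_taps=7, mu=1e-3, delay=0):
    """Train FFE taps with LMS so that the output at time n tracks preamble[n - delay]."""
    w = np.zeros(n_taps)
    w[n_taps // 2] = 1.0                      # centre-spike initialisation (assumed)
    for n in range(max(n_taps - 1, delay), len(preamble)):
        x = rx[n - n_taps + 1:n + 1][::-1]    # tap-delay line, newest sample first
        e = preamble[n - delay] - w @ x       # error against the known symbol
        w += mu * e * x                       # LMS weight update
    return w


# Toy usage: hypothetical channel with its main cursor at index 1.
rng = np.random.default_rng(0)
h = np.array([0.05, 1.0, 0.35, 0.10])
tx = rng.choice(PAM4_LEVELS, size=20_000)
rx = np.convolve(tx, h)[:len(tx)] + 0.05 * rng.standard_normal(len(tx))

delay = 1 + 7 // 2                            # channel cursor + FFE centre tap
w = train_ffe_lms(rx, tx, delay=delay)

# Mean squared error of the equalised signal once the taps have settled.
y = np.convolve(rx, w)[:len(tx)]
mse = np.mean((y[delay + 5000:] - tx[5000:len(tx) - delay]) ** 2)
```

The loop is deliberately sequential: each iteration reads the weights written by the previous one.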
During the design phase, a combined FFE + DFE architecture was heavily evaluated against a standalone FFE. However, the deep GPU Monte Carlo simulation revealed a critical PAM-4 behavior: DFE Error Propagation.
Because the voltage margin between PAM-4 levels is narrow, residual thermal noise at
The Verdict: The optimal PPA (Power, Performance, Area) choice is a standalone 7-Tap linear FFE. This avoids the severe RTL critical path of closing a 1-UI non-linear feedback loop, saves significant silicon area, and still successfully achieves the standard
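The recursive nature of the LMS algorithm ($\mathbf{w}[n+1] = \mathbf{w}[n] + \mu\, e[n]\, \mathbf{x}[n]$, where each update depends on the previous weights) makes training inherently sequential, so it does not parallelize across GPU threads.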
The Solution: The architecture splits the data path. Training occurs sequentially on the CPU to reach steady-state Wiener filter weights. These optimal, frozen weights are then transferred to the GPU. The inference phase is executed as highly parallelized convolutions over massive batches ($10^7$ symbols), strictly managing VRAM utilization while fully saturating the GPU cores.
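A rough sketch of the frozen-weight inference loop is shown below. It is written against NumPy so it runs anywhere; the `xp` argument is a hypothetical hook for passing an array module, on the assumption that CuPy mirrors the NumPy calls used here (`convolve`, `rint`, `clip`, `random`). The function names, the per-batch seeding, and the toy parameters are illustrative and are not taken from `src/gpu_accelerator.py`.

```python
import numpy as np


def pam4_slice(y, xp=np):
    """Map equalised samples to the nearest PAM-4 level (-3, -1, +1, +3)."""
    return 2.0 * xp.clip(xp.rint((y + 3.0) / 2.0), 0, 3) - 3.0


def run_batches(h, w, delay, sigma, n_batches, batch_size, xp=np, seed=0):
    """Count symbol errors over independent batches using frozen FFE taps w."""
    levels = xp.asarray([-3.0, -1.0, 1.0, 3.0])
    h, w = xp.asarray(h), xp.asarray(w)
    errors = total = 0
    for b in range(n_batches):
        xp.random.seed(seed + b)                        # reproducible per batch
        tx = levels[xp.random.randint(0, 4, size=batch_size)]
        noise = sigma * xp.random.standard_normal(batch_size)
        rx = xp.convolve(tx, h)[:batch_size] + noise    # channel + AWGN
        y = xp.convolve(rx, w)[:batch_size]             # FFE with fixed weights
        dec = pam4_slice(y[delay:], xp)                 # decisions lag tx by `delay`
        errors += int(xp.count_nonzero(dec != tx[:batch_size - delay]))
        total += batch_size - delay
    return errors, total


# Toy usage: a pure one-symbol delay channel and a pass-through equaliser.
errors, total = run_batches(h=[0.0, 1.0], w=[1.0], delay=1, sigma=0.1,
                            n_batches=4, batch_size=100_000)
# On a CUDA machine one would try: import cupy; run_batches(..., xp=cupy)
```

Because the weights never change inside the loop, every batch is independent, so the batch size can be chosen purely to fit the available VRAM.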
Accurate BER calculation requires exact synchronization between the transmitted symbol and the Slicer's decision. The system dynamically computes the overall group delay by isolating the energy peaks (Main Cursors) of both the channel impulse response and the converged FFE tap weights, preventing false error generation.
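The alignment idea can be illustrated with plain Python. The sketch assumes the overall delay is the sum of the two main-cursor indices, which holds for a cascade of two filters with one dominant tap each; the helper names and the tap values are hypothetical.

```python
def main_cursor(taps):
    """Index of the largest-magnitude tap (the main cursor)."""
    return max(range(len(taps)), key=lambda i: abs(taps[i]))


def total_delay(h, w):
    """Assumed overall delay: channel main cursor plus FFE main cursor."""
    return main_cursor(h) + main_cursor(w)


def count_symbol_errors(tx, decisions, delay):
    """Compare decisions[n] with tx[n - delay], skipping the first `delay` outputs."""
    return sum(d != t for d, t in zip(decisions[delay:], tx))


# Toy usage with made-up taps: channel cursor at 1, FFE cursor at 3.
h = [0.05, 1.0, 0.35, 0.10]
w = [0.0, 0.0, -0.05, 1.03, -0.35, 0.02, 0.03]
delay = total_delay(h, w)

tx = [3, -1, 1, -3, 1, 3, -1, -3]
decisions = [9] * delay + tx[:-delay]     # perfect decisions, shifted by `delay`

aligned_errors = count_symbol_errors(tx, decisions, delay)
misaligned_errors = count_symbol_errors(tx, decisions, 0)
```

With the correct offset the comparison reports no errors, while comparing the same decisions without the offset flags correct symbols as errors.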
- `src/channel.py`: Physical layer modeling (PAM-4 generation, dispersion, dynamic AWGN).
- `src/equalizers.py`: CPU-based LMS algorithms for FFE/DFE training and hardware comparator logic.
- `src/gpu_accelerator.py`: CuPy-based inference engine featuring batched processing and Early Stopping logic.
- `main.py`: The top-level orchestrator that drives the simulation and generates the Waterfall curve.
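One common form of early stopping in a Monte Carlo BER loop is to stop once enough errors have been observed for a statistically meaningful estimate, or once a symbol budget is exhausted. The sketch below shows that pattern only; the criterion actually used in `src/gpu_accelerator.py` may differ, and `estimate_ber`, `run_batch`, `min_errors`, and `max_symbols` are hypothetical names.

```python
def estimate_ber(run_batch, min_errors=100, max_symbols=10**9):
    """Accumulate batches until min_errors are seen or max_symbols is reached.

    run_batch is any callable returning (errors, symbols) for one batch.
    """
    errors = symbols = 0
    while errors < min_errors and symbols < max_symbols:
        e, n = run_batch()
        errors += e
        symbols += n
    ber = errors / symbols if symbols else float("nan")
    return ber, errors, symbols


# Toy usage with stand-in batch functions instead of a real simulation.
noisy = estimate_ber(lambda: (5, 1000))                      # stops on error count
clean = estimate_ber(lambda: (0, 1000), max_symbols=5000)    # stops on symbol budget
```

At high noise levels the loop exits after a few batches, so most of the run time is spent on the low-BER points of the waterfall curve.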
The GPU acceleration allowed for the processing of 34 Billion symbols in ~95 seconds on a consumer-grade GPU (GTX 1660 Ti), cleanly resolving a BER of
- Ensure CUDA and `cupy` are installed on your machine.
- Clone the repository.
- Run the orchestrator:
python main.py
