Code Generators for Floating-Point Unit Design in Integrated Circuits
OpenFloat is a parameterized floating-point unit (FPU) design generator developed using Chisel, a hardware construction language embedded in Scala. It generates synthesizable RTL for floating-point arithmetic operations targeting FPGA and ASIC implementations.
OpenFloat provides a library of highly configurable floating-point arithmetic modules that support multiple IEEE 754 precision formats. All modules are parameterized by a FloatingPointFormat trait, which defines the exponent and mantissa widths. This allows the same module to be instantiated for any standard format (FP16, BF16, FP32, FP64, FP128) or a custom format.
OpenFloat/
├── src/main/scala/
│ ├── FloatingPoint/ # Core FPU modules
│ │ ├── FloatingPointFormat.scala # Format definitions and FPModule base class
│ │ └── fpu.scala # All floating-point operation implementations
│ ├── Primitives/ # Low-level building blocks
│ │ ├── primitives.scala # Arithmetic primitives (adders, dividers, CORDIC, etc.)
│ │ └── convert.scala # IEEE 754 format conversion utilities
│ ├── Generate/ # RTL generation utilities
│ │ └── generate.scala # SystemVerilog output generator
│ └── TB/ # Test benches
│ └── testbench.scala # Verification test suites
├── build.sbt # SBT build configuration
├── LICENSE # BSD-style license
└── README.md # This file
OpenFloat supports standard IEEE 754 formats, BFloat16, and custom formats via the FloatingPointFormat trait.
| Format | Object | Total Bits | Sign | Exponent | Mantissa | Bias |
|---|---|---|---|---|---|---|
| Half | FP16 | 16 | 1 | 5 | 10 | 15 |
| BFloat16 | BF16 | 16 | 1 | 8 | 7 | 127 |
| Single | FP32 | 32 | 1 | 8 | 23 | 127 |
| Double | FP64 | 64 | 1 | 11 | 52 | 1023 |
| Quad | FP128 | 128 | 1 | 15 | 112 | 16383 |
You can also define arbitrary custom formats using CustomFormat(exponent, mantissa).
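The table values follow from the two widths alone: the total width is 1 + exponent + mantissa, and the bias is 2^(exponent - 1) - 1. The following is a minimal sketch of that relationship in plain Scala. `FormatSketch` and its members are hypothetical names used for illustration; they are not OpenFloat's `FloatingPointFormat` trait or its `CustomFormat` class.

```scala
// Hypothetical stand-in for a format description (not OpenFloat's actual trait).
final case class FormatSketch(exponent: Int, mantissa: Int) {
  require(exponent >= 2 && exponent <= 30, "exponent width out of range for this sketch")
  require(mantissa >= 1, "mantissa width must be positive")

  val sign: Int = 1                                // one sign bit
  def totalBits: Int = sign + exponent + mantissa  // full word width
  def bias: Int = (1 << (exponent - 1)) - 1        // IEEE 754 style exponent bias
}

object FormatSketch {
  val FP16  = FormatSketch(5, 10)
  val BF16  = FormatSketch(8, 7)
  val FP32  = FormatSketch(8, 23)
  val FP64  = FormatSketch(11, 52)
  val FP128 = FormatSketch(15, 112)
}
```

Under the same assumption, a custom 16-bit format with a 6-bit exponent and 9-bit mantissa would be `FormatSketch(6, 9)`, giving a bias of 31.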
| Module | Description | Parameters |
|---|---|---|
| FP_add | Floating-point addition | FORMAT: FloatingPointFormat, latency: pipeline depth (any value >= 1) |
| FP_mult | Floating-point multiplication | FORMAT: FloatingPointFormat, latency: pipeline depth (any value >= 1) |
| FP_div | Digit-recurrence division | FORMAT: FloatingPointFormat, L: iterations, latency: pipeline stages |
| FP_sqrt | Digit-recurrence square root | FORMAT: FloatingPointFormat, L: iterations, latency: pipeline stages |

| Module | Description | Parameters |
|---|---|---|
| FP_cos | Cosine and Sine (CORDIC-based) | FORMAT: FloatingPointFormat, iters: CORDIC iterations |
| FP_atan | Arctangent (CORDIC-based) | FORMAT: FloatingPointFormat, iters: CORDIC iterations |
| FP_exp | Exponential function (e^x) | FORMAT: FloatingPointFormat |

| Module | Description | Parameters |
|---|---|---|
| FP_acc | Floating-point accumulator | FORMAT: FloatingPointFormat, iters: accumulation count, ExpExp, ExpMSB, LSB |
| FP_floor | Floor function | FORMAT: FloatingPointFormat |
| FloatTOFixed | Float to fixed-point conversion | FORMAT: FloatingPointFormat, ibits: integer bits, fbits: fractional bits |
| FixedTOFloat | Fixed-point to float conversion | FORMAT: FloatingPointFormat, ibits: integer bits, fbits: fractional bits |
FP_add and FP_mult accept any latency >= 1. Pipeline registers are automatically distributed across 10 internal stage boundaries using the same pipe_skip/pipe_map algorithm as the digit-recurrence primitives (divider, frac_sqrt, cordic). Higher latency values shorten the critical path and ease timing closure, at the cost of more clock cycles between input and output; values above 10 stack additional registers at evenly spaced boundaries.
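To make the idea of spreading registers over a fixed set of boundaries concrete, here is one plausible even-distribution rule in plain Scala. This is only an illustration: the actual pipe_skip/pipe_map algorithm is not described here and may place registers differently. `PipeSketch` and `distribute` are hypothetical names.

```scala
// Illustrative only: one way to spread `latency` registers over stage boundaries.
// This is NOT the pipe_skip/pipe_map implementation, whose details are not shown here.
object PipeSketch {
  /** Returns how many registers sit at each of `boundaries` stage boundaries. */
  def distribute(latency: Int, boundaries: Int = 10): Vector[Int] = {
    require(latency >= 1 && boundaries >= 1)
    // Boundary i receives the registers whose cumulative index falls in its slice,
    // so the counts always sum to `latency` and differ by at most one.
    Vector.tabulate(boundaries) { i =>
      ((i + 1) * latency) / boundaries - (i * latency) / boundaries
    }
  }
}
```

With this rule, a latency below 10 leaves some boundaries without a register, a latency of exactly 10 puts one register at every boundary, and larger values stack two or more registers at some boundaries.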
The Primitives package provides low-level building blocks:
| Module | Description |
|---|---|
| LZC | Leading Zero Counter with tree-based reduction |
| full_adder | Parameterized width adder with carry |
| full_subtractor | Parameterized width subtractor with borrow |
| multiplier | Basic integer multiplication |
| divider | Digit-recurrence integer divider (pipelined) |
| frac_sqrt | Fractional square root for normalized numbers |
| cordic | Fixed-point CORDIC processor (rotation/vectoring modes) |
| ucordic | Universal CORDIC (circular, linear, hyperbolic modes) |
| cos, atan, exp | Fixed-point trigonometric/exponential wrappers |
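As an illustration of what tree-based reduction means for a leading zero counter, the following software model splits the input in half at each level and combines the two partial counts. It is a sketch in plain Scala rather than Chisel, it assumes a power-of-two width, and `LzcSketch` is a hypothetical name that does not reflect the LZC module's actual interface.

```scala
// Software model of a tree-structured leading zero count (hypothetical helper,
// not the LZC primitive itself). Bits are ordered most significant first.
object LzcSketch {
  def lzc(bits: Vector[Boolean]): Int = {
    require(bits.nonEmpty && (bits.length & (bits.length - 1)) == 0,
      "this sketch assumes a power-of-two width")
    if (bits.length == 1) {
      if (bits.head) 0 else 1
    } else {
      val (upper, lower) = bits.splitAt(bits.length / 2)
      val upperCount = lzc(upper)
      // If the upper half is all zeros, continue counting into the lower half.
      if (upperCount == upper.length) upper.length + lzc(lower) else upperCount
    }
  }

  /** Expands the low `width` bits of `value` into an MSB-first bit vector. */
  def fromInt(value: Int, width: Int): Vector[Boolean] =
    Vector.tabulate(width)(i => ((value >> (width - 1 - i)) & 1) == 1)
}
```

For example, `LzcSketch.lzc(LzcSketch.fromInt(0x16, 8))` counts the leading zeros of the 8-bit pattern 00010110.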
The convert object provides Scala-side IEEE 754 conversion functions for testbench use:
| Function | Description |
|---|---|
| convert_string_to_IEEE_754(str, fmt) | Converts a decimal string to an IEEE 754 bit pattern for any FloatingPointFormat |
| convert_IEEE754_to_Decimal(num, fmt) | Converts an IEEE 754 bit pattern back to a BigDecimal value |
These accept any FloatingPointFormat (including BF16, CustomFormat, etc.), making it easy to generate test vectors and examine outputs for any format.
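For readers who want to see what such a conversion involves, here is a minimal, independent decoder for an arbitrary exponent/mantissa split. It is a sketch and not the convert object's implementation: `DecodeSketch.decode` is a hypothetical name, it returns a Double rather than a BigDecimal, and it therefore loses precision for formats wider than FP64.

```scala
// Independent sketch of bit-pattern decoding (not OpenFloat's convert object).
object DecodeSketch {
  /** Decodes an IEEE 754 style bit pattern with the given field widths. */
  def decode(bits: BigInt, expWidth: Int, manWidth: Int): Double = {
    val bias     = (1 << (expWidth - 1)) - 1
    val manMask  = (BigInt(1) << manWidth) - 1
    val expMask  = (BigInt(1) << expWidth) - 1
    val mantissa = (bits & manMask).toDouble
    val exponent = ((bits >> manWidth) & expMask).toInt
    val sign     = if (bits.testBit(expWidth + manWidth)) -1.0 else 1.0
    val scale    = math.pow(2.0, manWidth)

    if (exponent == expMask.toInt) {
      // All-ones exponent: infinity or NaN in the IEEE 754 encoding.
      if (mantissa == 0.0) sign * Double.PositiveInfinity else Double.NaN
    } else if (exponent == 0) {
      // Zero and subnormals: no implicit leading one.
      sign * (mantissa / scale) * math.pow(2.0, 1 - bias)
    } else {
      // Normal numbers: implicit leading one.
      sign * (1.0 + mantissa / scale) * math.pow(2.0, exponent - bias)
    }
  }
}
```

The same function handles FP16, BF16, and FP32 patterns simply by changing the two width arguments, which mirrors why a format-parameterized helper is convenient for generating test vectors.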
All modules implement a standard ready-valid handshaking protocol for flow control:
val in_ready = Output(Bool()) // Module can accept new input
val in_valid = Input(Bool()) // Input data is valid
val in_a = Input(UInt()) // Input operand A
val in_b = Input(UInt()) // Input operand B (for binary ops)
val out_ready = Input(Bool()) // Consumer ready to accept output
val out_valid = Output(Bool()) // Output data is valid
val out_s = Output(UInt()) // Result

- A transaction occurs when both valid and ready are high on the same clock edge
- in_ready indicates the module can accept new data
- Backpressure propagates through module chains via the ready signals
- The pipeline stalls when out_valid is high but out_ready is low
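The handshake rules above can be sketched as a small cycle-by-cycle software model of a single pipeline slot. This is plain Scala, not Chisel, and it assumes one common scheme in which the slot accepts new data when it is empty or when its current contents are being consumed in the same cycle; OpenFloat's internal ready logic may differ. `HandshakeSketch` is a hypothetical name.

```scala
import scala.collection.mutable.ArrayBuffer

// Cycle-level software model of one ready/valid pipeline slot (illustrative only).
object HandshakeSketch {
  /** Pushes `inputs` through one slot; `outReady(cycle)` models the consumer. */
  def run(inputs: Seq[Int], outReady: Int => Boolean, maxCycles: Int = 1000): Seq[Int] = {
    var pending = inputs.toList
    var slot: Option[Int] = None
    val received = ArrayBuffer.empty[Int]
    var cycle = 0
    while ((pending.nonEmpty || slot.nonEmpty) && cycle < maxCycles) {
      val outValid = slot.nonEmpty
      val ready    = outReady(cycle)
      val inValid  = pending.nonEmpty
      val inReady  = !outValid || ready   // empty, or draining this cycle
      val outFire  = outValid && ready    // output transaction
      val inFire   = inValid && inReady   // input transaction

      if (outFire) received += slot.get
      // Load new data on an input transaction, clear on output only, else stall.
      slot = if (inFire) Some(pending.head) else if (outFire) None else slot
      if (inFire) pending = pending.tail
      cycle += 1
    }
    received.toSeq
  }
}
```

Calling `HandshakeSketch.run(Seq(1, 2, 3), c => c % 3 == 0)` models a consumer that is ready only every third cycle; the slot holds its data while stalled, and the values still arrive in order.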
// Connect two modules in a chain
module_a.out_ready := module_b.in_ready
module_b.in_valid := module_a.out_valid
module_b.in_a := module_a.out_s

- Scala: 2.13.12
- SBT: 1.7.2 or later
- Chisel: 6.0.0 (managed by SBT)
- Verilator: For simulation (optional)
To generate SystemVerilog RTL for a specific module, modify src/main/scala/Generate/generate.scala:
object generate extends App {
  private def genVerilog(mod: => RawModule): Unit = {
    val gen: () => RawModule = () => mod
    (new ChiselStage).execute(
      Array("--target", "systemverilog"),
      Seq(ChiselGeneratorAnnotation(gen),
        FirtoolOption("--disable-all-randomization"),
        FirtoolOption("-strip-debug-info"),
        FirtoolOption("--disable-annotation-unknown")
      ),
    )
  }
  // Generate desired module
  // Import the formats first: import FloatingPoint.{FP32, FP64, BF16}
  genVerilog(new FP_mult(FP32, 7)) // 32-bit multiplier with 7-stage pipeline
}

Generated SystemVerilog files will be placed in the project root directory.
import FloatingPoint._
import FloatingPoint.fpu._
// 32-bit adder with 7-stage pipeline
val adder = Module(new FP_add(FORMAT = FP32, latency = 7))
adder.io.out_ready := true.B
adder.io.in_valid := input_valid
adder.io.in_a := operand_a
adder.io.in_b := operand_b
val result = adder.io.out_s
val result_valid = adder.io.out_valid

// 32-bit divider with 23 iterations and 23-cycle latency
val divider = Module(new FP_div(FORMAT = FP32, L = 23, latency = 23))
divider.io.out_ready := downstream_ready
divider.io.in_valid := input_valid
divider.io.in_a := dividend
divider.io.in_b := divisor
val quotient = divider.io.out_s

// 32-bit cos/sin with 23 CORDIC iterations
val trig = Module(new FP_cos(FORMAT = FP32, iters = 23))
trig.io.out_ready := true.B
trig.io.in_valid := angle_valid
trig.io.in_angle := angle_ieee754
val cos_result = trig.io.out_cos
val sin_result = trig.io.out_sin

Both FP_div and FP_sqrt use digit-recurrence algorithms, computing one bit of the result per iteration. The L parameter controls the number of iterations (typically equal to mantissa width), while latency controls how iterations are distributed across pipeline stages.
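The one-bit-per-iteration idea can be illustrated with a restoring division loop in plain Scala. This is a generic textbook recurrence under the assumption that the numerator has been pre-scaled to be smaller than the denominator; it is not the divider primitive's implementation, and `DigitRecurrenceSketch` is a hypothetical name.

```scala
// Generic restoring division: one quotient bit per iteration (illustrative only).
object DigitRecurrenceSketch {
  /** Computes floor(n * 2^L / d) for 0 <= n < d using L shift/compare/subtract steps. */
  def divide(n: BigInt, d: BigInt, L: Int): BigInt = {
    require(d > 0 && n >= 0 && n < d, "this sketch assumes 0 <= n < d")
    var rem = n
    var q   = BigInt(0)
    for (_ <- 0 until L) {
      rem = rem << 1                 // bring in the next bit position
      if (rem >= d) {                // quotient bit is 1: subtract the divisor
        rem -= d
        q = (q << 1) + 1
      } else {                       // quotient bit is 0: keep the remainder
        q = q << 1
      }
    }
    q
  }
}
```

Each loop iteration corresponds to one recurrence step, so a hardware version with L = 23 needs 23 such steps, and the latency setting decides how many of them share a pipeline stage.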
Trigonometric and hyperbolic functions use the CORDIC algorithm, which computes results through iterative shift-and-add operations. The ucordic module supports three modes:
- Circular mode (mu = 1): cos, sin, atan
- Linear mode (mu = 0): multiplication, division
- Hyperbolic mode (mu = -1): sinh, cosh, atanh, exp, ln
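The shared iteration behind these modes can be sketched in plain Scala using doubles in place of fixed-point values. The sketch covers only rotation in the circular (mu = 1) and linear (mu = 0) modes; hyperbolic mode needs a different starting index and repeated iterations and is left out. `CordicSketch` and its methods are hypothetical names, not the ucordic module's interface.

```scala
// Double-precision model of rotation-mode CORDIC (illustrative only).
object CordicSketch {
  /** Runs `iters` rotation steps; mu = 1 is circular, mu = 0 is linear. */
  def rotate(x0: Double, y0: Double, z0: Double, mu: Int, iters: Int): (Double, Double, Double) = {
    require(mu == 0 || mu == 1, "hyperbolic mode is omitted from this sketch")
    var x = x0
    var y = y0
    var z = z0
    for (i <- 0 until iters) {
      val shift = math.pow(2.0, -i)                          // a right shift by i in hardware
      val angle = if (mu == 1) math.atan(shift) else shift   // per-mode elementary angle
      val d     = if (z >= 0) 1.0 else -1.0                  // drive z toward zero
      val xNext = x - mu * d * y * shift
      val yNext = y + d * x * shift
      x = xNext
      y = yNext
      z -= d * angle
    }
    (x, y, z)
  }

  /** Accumulated scale factor of the circular mode after `iters` steps. */
  def gain(iters: Int): Double =
    (0 until iters).map(i => math.sqrt(1.0 + math.pow(2.0, -2.0 * i))).product

  /** cos and sin of theta (|theta| up to roughly 1.74 rad) via circular rotation. */
  def cosSin(theta: Double, iters: Int = 23): (Double, Double) = {
    val (c, s, _) = rotate(1.0 / gain(iters), 0.0, theta, mu = 1, iters)
    (c, s)
  }
}
```

In linear mode the same loop performs multiplication: `CordicSketch.rotate(0.75, 0.0, 0.5, mu = 0, iters = 23)` drives z to zero while y converges toward 0.75 * 0.5.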
FP_exp uses range reduction to decompose x / ln(2) = w + f into an integer part w and fractional part f. The fractional part is computed via a hyperbolic CORDIC engine (e^(f * ln(2)) = 2^f), while the integer part becomes an exponent bias adjustment. Constant multiplications by ln(2) and 1/ln(2) are implemented using Canonical Signed Digit (CSD) encoding for multiplierless shift-and-add operations.
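The decomposition and the CSD idea can be sketched in plain Scala as follows. Two assumptions apply: the fractional power 2^f is computed with `math.exp` as a stand-in for the hyperbolic CORDIC engine, and the CSD routine is the generic non-adjacent-form recoding rather than the generator's own code. `ExpSketch` is a hypothetical name.

```scala
// Range reduction for e^x plus a generic CSD recoding (illustrative only).
object ExpSketch {
  private val Ln2 = math.log(2.0)

  /** Splits x / ln(2) into an integer part w and a fraction f in [0, 1). */
  def reduce(x: Double): (Int, Double) = {
    val t = x / Ln2
    val w = math.floor(t).toInt
    (w, t - w)
  }

  /** e^x = 2^w * 2^f; the 2^w factor is a pure exponent adjustment. */
  def exp(x: Double): Double = {
    val (w, f) = reduce(x)
    val twoToF = math.exp(f * Ln2)       // stand-in for the CORDIC result e^(f * ln 2)
    java.lang.Math.scalb(twoToF, w)      // adds w to the binary exponent
  }

  /** CSD digits of n >= 0, least significant first, each in {-1, 0, 1}. */
  def csd(n: BigInt): List[Int] = {
    require(n >= 0)
    var v = n
    val digits = List.newBuilder[Int]
    while (v.signum != 0) {
      // Odd values emit +1 or -1 so that the remaining value is divisible by 4.
      val d = if (!v.testBit(0)) 0 else if (v.testBit(1)) -1 else 1
      digits += d
      v = (v - BigInt(d)) >> 1
    }
    digits.result()
  }
}
```

Each nonzero CSD digit corresponds to one shifted add or subtract, which is why a constant such as 7 (digits -1, 0, 0, +1, that is 8 - 1) needs two operations instead of three.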
All modules implement saturation arithmetic:
- Overflow: Result saturates to maximum representable value
- Underflow: Result saturates to minimum normalized value
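The two saturation rules can be modeled numerically as a clamp on the magnitude. The sketch below assumes the maximum representable value is the IEEE 754 largest finite number for the format and that any nonzero magnitude below the smallest normal number is raised to it; the exact thresholds and rounding used in the RTL are not specified here. `SaturationSketch` is a hypothetical name.

```scala
// Numeric model of saturation for a given format (illustrative only).
object SaturationSketch {
  private def bias(expWidth: Int): Int = (1 << (expWidth - 1)) - 1

  /** Largest finite magnitude, assuming IEEE 754 style encoding. */
  def maxFinite(expWidth: Int, manWidth: Int): Double =
    (2.0 - math.pow(2.0, -manWidth)) * math.pow(2.0, bias(expWidth))

  /** Smallest normalized magnitude. */
  def minNormal(expWidth: Int): Double =
    math.pow(2.0, 1 - bias(expWidth))

  /** Clamps v into the normalized range, preserving sign and exact zero. */
  def saturate(v: Double, expWidth: Int, manWidth: Int): Double = {
    val hi  = maxFinite(expWidth, manWidth)
    val lo  = minNormal(expWidth)
    val mag = math.abs(v)
    if (mag > hi) math.signum(v) * hi                      // overflow
    else if (mag > 0.0 && mag < lo) math.signum(v) * lo    // underflow
    else v
  }
}
```

For FP16 (5 exponent bits, 10 mantissa bits) this gives a ceiling of 65504 and a floor of 2^-14, so `SaturationSketch.saturate(1e6, 5, 10)` returns the ceiling instead of infinity.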
BSD 3-Clause License
Copyright (c) 2025, The Regents of the University of California, through Lawrence Berkeley National Laboratory and University of Houston-Clear Lake.
See LICENSE for full terms.
Code Generators for Floating-Point Unit Design in Integrated Circuits (OpenFloat) Copyright (c) 2025, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy) and University of Houston-Clear Lake, All rights reserved.
If you have questions about your rights to use or distribute this software, please contact Berkeley Lab's Intellectual Property Office at IPO@lbl.gov.
NOTICE. This Software was developed under funding from the U.S. Department of Energy and the U.S. Government consequently retains certain rights. As such, the U.S. Government has been granted for itself and others acting on its behalf a paid-up, nonexclusive, irrevocable, worldwide license in the Software to reproduce, distribute copies to the public, prepare derivative works, and perform publicly and display publicly, and to permit others to do so.
This software was developed under funding from the U.S. Department of Energy.