Releases: imezx/Gradien
v1.4.2
# What's new?
- (Added) `scalarSub`, `scalarDiv`, `neg`, `abs`, `exp`, `log`, `sqrt`, `pow`, `clamp`, `sum`, `mean`, `max` & `min` ops at `Gradien.Ops.Math`.
- (Added) `exponential`, `polynomial`, `constant`, `cosine`, `warmupLinear`, `warmupPolynomial`, `oneCycle` & `reduceOnPlateau` at `Gradien.Optim.Schedulers`.
- (Added) `xavierNormal`, `orthogonal`, `constant` & `uniform` at `Gradien.Initializer`.
- (Added) `ElasticNet` at `Gradien.Classic`.
- (Added) `AttnResBlock` at `Gradien.NN`.
- (Added) `Upsample` at `Gradien.NN`.
- (Added) `DDPG` at `Gradien.RL`.
- (Added) `TD3` at `Gradien.RL`.
# What's changed?
- (Performance) Improved performance by ~20%
- (BugFix) Fixed `SAC` (RL).
- (BugFix) Fixed `Adafactor` (Optim).
- (Changed) Completely removed dtype.
- (Changed) Changed `getPolicy` method to `policy` attribute.
- (Changed) Some type error fixes.
- (Changed) Minor improvement to `Buffer`.
Full Changelog: v1.4.0...v1.4.2
v1.4.0
# v1.4.0-rc5
# What's new?
- (Experimental) KAN (Kolmogorov-Arnold Network) at `Gradien.Experimental.NN`.
- (Added) Tokenizers (inspired by HuggingFace) at `Gradien.Tokenizer`.
- (Added) GroupNorm layer at `Gradien.NN`.
- (Added) Adam optimizer now supports L2 weight decay (applied to gradients) at `Gradien.Optim`.
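
To illustrate what "L2 weight decay applied to gradients" means for Adam, here is a minimal Python sketch for a single scalar parameter. It is not Gradien's Luau implementation; the function name `adam_step` and its defaults are assumptions chosen for the example. The point is that the decay term is added to the gradient before the moment estimates are updated, which differs from the decoupled decay used by AdamW.

```python
import math

def adam_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0):
    """One Adam update for a scalar parameter (hypothetical helper).

    L2 weight decay is folded into the gradient, so it also flows
    through the first and second moment estimates.
    """
    g = g + weight_decay * w             # L2 penalty enters via the gradient
    m = beta1 * m + (1.0 - beta1) * g    # first moment (running mean)
    v = beta2 * v + (1.0 - beta2) * g * g  # second moment (running mean of squares)
    m_hat = m / (1.0 - beta1 ** t)       # bias correction, t starts at 1
    v_hat = v / (1.0 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# With a zero loss gradient, the parameter still shrinks because of the decay term.
w, m, v = adam_step(w=1.0, g=0.0, m=0.0, v=0.0, t=1, weight_decay=0.1)
```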
# What's changed?
- (Changes) Several minor changes & improvements.
  - Linear is now slightly safer with `nil` values and no longer raises a stack error.
  - Other minor changes to strings.
- (BugFix) Fixed a mismatch in `Flatten`.
v1.4.0-rc4
# What's new?
- (Experimental) Hierarchical RL (Feudal Networks) at `Gradien.Experimental.RL`.
- (Added) ConvTranspose2d at `Gradien.NN`.
- (Added) ConvTranspose2d (Ops) at `Gradien.Ops`.
# What's changed?
- (Experimental) Optimized Mish Activation by adding threshold.
- (Changed) A few fixes and improvements to `Visual3D`.
- (Changed) Optimizations.
  - Now uses native operations.
  - (Experimental) Cached global math functions.
  - (Experimental) Some ops are now precalculated.
- (Changed) Updated docs.
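
The notes do not say how the Mish threshold is applied. One common reading, sketched below in Python as an assumption rather than as Gradien's actual code, is to skip the `exp`/`log`/`tanh` chain for large inputs, where `softplus(x)` is effectively `x` and `tanh(x)` is effectively 1. The threshold value of 20 is also an assumption.

```python
import math

def mish(x, threshold=20.0):
    """Mish activation, x * tanh(softplus(x)), with a large-input shortcut.

    Above the threshold, softplus(x) ~ x and tanh(x) rounds to 1.0 in
    double precision, so the result is just x. This also avoids overflow
    in exp(x) for very large inputs.
    """
    if x > threshold:
        return x
    return x * math.tanh(math.log1p(math.exp(x)))
```

Without the shortcut, `math.exp(1000.0)` would raise `OverflowError` in Python; with it, `mish(1000.0)` simply returns the input.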
Full Changelog: v1.4.0-rc3...v1.4.0-rc4
v1.4.0-rc3
# What's new?
- (Experimental) State Space Models (SSMs) / Mamba at `Gradien.NN` & `Gradien.Ops`.
- (Added) Sophia Optimizer (2nd Order Approximation) at `Gradien.Optim`.
- (Added) `is_contiguous`, `contiguous` & `expand` Tensor method.
# What's changed?
- (Changed) Optimized BLAS Matrix Multiplication
  - Switched from the naive IJK loop order to IKJ (often 5-10x faster than IJK for large matrices).
- (Changed) Optimized Pooling
  - Removed redundant index recalculations and duplicate assignments inside the innermost loop of MaxPool2d.
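
The IKJ loop order can be sketched as follows. This is a plain-Python illustration on nested lists, not Gradien's BLAS code, and `matmul_ikj` is a hypothetical name. With row-major storage, the innermost `j` loop walks one row of `B` and one row of `C` sequentially, whereas the IJK order strides down a column of `B` on every inner iteration.

```python
def matmul_ikj(a, b):
    """Multiply a (n x k) by b (k x m) using the i-k-j loop order."""
    n, inner, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        c_row = c[i]
        for k in range(inner):
            a_ik = a[i][k]      # loaded once, reused across the whole j loop
            b_row = b[k]        # row k of B is read sequentially
            for j in range(m):
                c_row[j] += a_ik * b_row[j]
    return c

product = matmul_ikj([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

The result is identical to the IJK order; only the memory access pattern changes, which is where the speedup for large matrices comes from.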
Full Changelog: v1.4.0-rc2...v1.4.0-rc3
v1.4.0-rc2
# What's new?
- (Added) Documentation is finally here!
# What's changed?
- (Changes) Several small fixes & improvements
  - Fixed incorrect `dump` & `load` State behavior for optimizer & trainer.
  - Corrected the `Softmax` backward pass to use the efficient Jacobian-vector product.
  - Added strict shape assertions for `Math` operations.
  - `Softmax` and `Activations` now reuse internal buffers to reduce memory allocation during training loops.
  - Fused Kernels: `NN.Linear` now combines matrix multiplication and bias addition into a single parallel block, reducing thread synchronization overhead significantly.
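
As a sketch of the Jacobian-vector product form of the softmax backward pass (plain Python, hypothetical function names, not Gradien's code): for output `y` and upstream gradient `g`, the input gradient is `y * (g - sum(g * y))`, which needs O(n) work instead of building the full n x n Jacobian. Because the softmax Jacobian is symmetric, the Jacobian-vector and vector-Jacobian products coincide.

```python
import math

def softmax(logits):
    """Softmax with the maximum subtracted first to avoid overflow in exp."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_backward(y, grad_out):
    """Gradient w.r.t. the logits, given softmax output y and upstream gradient."""
    dot = sum(g * p for g, p in zip(grad_out, y))
    return [p * (g - dot) for p, g in zip(y, grad_out)]

y = softmax([1.0, 2.0, 3.0])
grad_in = softmax_backward(y, [1.0, 0.0, 0.0])
```

The entries of `grad_in` always sum to zero (up to rounding), since adding a constant to every logit leaves the softmax output unchanged.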
v1.4.0-rc1
# What's new?
- (Experimental) Quantum-Inspired Metaheuristic Neural Network (QIMHNN) at `Gradien.Experimental.Models`.
- (Experimental) Metaheuristic Optimizer for QIMHNN at `Gradien.Experimental.Optim`.
- (Experimental) SwarmPSO Optimizer for QIMHNN at `Gradien.Experimental.Optim`.
- (Added) Adafactor Optimizer at `Gradien.Optim`.
- (Added) Accumulated Optimizer (Gradient Accumulation) wrapper at `Gradien.Optim`.
# What's changed?
- (Changes) Small fixes
  - Several small fixes. (forgot what they were)
- (Changes) Several small improvements
  - Removed most `metatables`.
  - Slight performance bumps (only micro-optimizations).
v1.4.0-rc0
# What's new?
- Gradien now available on Wally!
- (Added) Prebuilt models at `Gradien.Models`.
  - `MLP`, `ResMLP`, `ConvNet`, `TransformerEncoder`, `SequenceClassifier`, `AutoEncoder`.
- (Added) Multi-head self-attention at `Gradien.NN.Attention`.
  - `Attention.new(embedDim, numHeads, opts)` with support for dropout, custom initializers, causal masking, and `getLastAttention()` inspection.
  - Backed by a numerically stable softmax in `Gradien.Ops.Softmax`.
- (Added) Multi-agent RL wrapper at `Gradien.RL.MultiAgent`.
  - Wraps a list of agents with:
    - `getAgent(i)`, `size()`
    - `act(i, state, step)` and `actAll(states, step)`
    - `parameters()` and `zeroGrad()` that fan out across agents.
- (Added) Buffer utilities at `Gradien.Util.Buffer`.
  - High-level encode/decode of Luau values and tensors into `buffer` objects.
- (Added) Profiler at `Gradien.Util.Profiler`.
  - API: `.start/stop`, `.scope`, `.wrap`, `.instrument`, `.snapshot`, `.report`, `.withEnabled`, `.get`, `.reset/flush`.
- (Added) Stable Softmax at `Gradien.Ops.Softmax`.
  - `SoftmaxOps.forward(logits)` implements a max-shifted, numerically stable softmax used by `nn.Softmax` and `nn.Attention`.
- (Added) Classification-oriented trainer constructor at `Gradien.Trainer`.
  - `Trainer.newClassification(cfg, opts?)`:
    - Default loss: `nn.Losses.cross_entropy_backward` (with optional label smoothing).
    - Default metric: `Metrics.accuracy`.
    - Returns a regular `Trainer` wired for supervised classification.
- (Added) Cosine schedule with warmup in `Gradien.Optim.Schedulers`.
  - `S.linearWarmupThenCosine(lr, warmupSteps, totalSteps, lrMin)`:
    - Linear warmup phase followed by cosine decay toward `lrMin`.
- (Added) Snapshot ↔ buffer helpers at `Gradien.State`.
  - `State.toBuffer(snap): buffer` – serializes a `Types.Snapshot` via `Util.Buffer`.
  - `State.fromBuffer(buf): Snapshot?` – reconstructs snapshots from a `buffer`.
- (Added) New initializers at `Gradien.Init`.
  - `heNormal(W)`, `heUniform(W)`, `lecunNormal(W)`, `lecunUniform(W)` – fan-in/out aware weight initializers.
- (Added) Tensor view & shape helpers at `Gradien.Tensor`.
  - `Tensor.reshape(t, newShape)` – view-style reshape (shared storage) with size checking.
  - `Tensor.slice(t, dim, startIdx, endIdx?, step?)` – strided slices with view-based implementation.
  - `Tensor.transpose(t, dim1?, dim2?)` – generic axis swap; defaults to 2D transpose when dims are omitted.
  - `Tensor.narrow(t, dim, startIdx, length)` – thin wrapper around `slice` for PyTorch-style narrowing.
  - `Tensor.noGrad(t)` – in-place: marks a tensor as non-differentiable and clears `_grad`.
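
A linear-warmup-then-cosine schedule of the kind described above can be sketched in Python as follows. The parameter order mirrors the `linearWarmupThenCosine(lr, warmupSteps, totalSteps, lrMin)` signature from the notes, but the step indexing (zero-based, with warmup reaching `lr` on its last step) and the clamping after `total_steps` are assumptions, not confirmed Gradien behavior.

```python
import math

def linear_warmup_then_cosine(lr, warmup_steps, total_steps, lr_min=0.0):
    """Return a function mapping a zero-based step index to a learning rate."""
    def schedule(step):
        if warmup_steps > 0 and step < warmup_steps:
            # Linear ramp that reaches lr on the final warmup step.
            return lr * (step + 1) / warmup_steps
        span = max(1, total_steps - warmup_steps)
        progress = min(1.0, (step - warmup_steps) / span)
        # Cosine decay from lr (progress = 0) down to lr_min (progress = 1).
        return lr_min + 0.5 * (lr - lr_min) * (1.0 + math.cos(math.pi * progress))
    return schedule

sched = linear_warmup_then_cosine(1.0, 10, 110, 0.1)
rates = [sched(step) for step in (0, 9, 10, 60, 110)]
```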
# What's changed?
- (Changed) Tensor & Autograd
  - `Tensor` now uses explicit `computeStrides` and view objects internally to implement `reshape`, `slice`, and `transpose` without copying storage, while still propagating gradients.
  - `Tensor.detach()` still returns a detached view, but `Tensor.noGrad()` was added for in-place disabling of gradients.
  - `autograd.Tape.matmul`:
    - Allocates `A._grad` / `B._grad` with the correct dtype (`Tensor.zeros(..., x._dtype)`).
    - Accumulates gradients into existing `.grad` buffers instead of overwriting them.
  - `Tape.noGrad(fn)` no longer wraps `fn` in `pcall`.
- (Changed) Initializers
  - All initializer functions now operate on `Types.Tensor` and share the internal `_randn()` normal sampler.
  - Existing initializers (e.g. `xavierUniform`) are updated to use fan-in/fan-out computations consistent with the new He/LeCun variants.
- (Changed) BatchNorm & Metrics
  - `nn.BatchNorm1d`:
    - Running statistics now have shape `{D, 1}` instead of `{1, 1}` and are tracked per-feature.
    - Training mode computes per-channel means and variances and updates `runningMean` / `runningVar` with the configured `momentum`.
    - Eval mode uses the stored per-channel statistics for normalization.
  - `Metrics`:
    - Multi-class precision/recall/F1 now pre-init `tpC`, `fpC`, `fnC` arrays with zeros to avoid `nil` indexing on unseen classes.
    - Confusion matrix allocation uses `table.create(C, 0)` and fills with zeros, fixing edge cases when some classes never appear.
- (Changed) Convolutions & Softmax
  - `ops.Conv2d`:
    - Reimplemented using helper routines (`copyShape`, `makeMatrixView`, `im2col`, `col2im`, `reshapeInPlace`, `transposeMatrix`, `addInto`) and `BLAS.matmul`.
    - Keeps public signature the same (`Conv2d(X, W)`), but forwards now use an im2col + GEMM approach for better performance.
  - `nn.Conv2d`:
    - Continues to delegate to `ops.Conv2d`, inheriting the new, more efficient kernel without changing its module API.
  - `nn.Softmax`:
    - Simplified to delegate to `Ops.Softmax.forward`, consolidating the softmax implementation in `ops/Softmax.luau`.
- (Changed) RL Replay Buffers
  - `Gradien.RL.Replay`:
    - Now requires `t.state` and `t.nextState` to be tensors and asserts their presence.
    - Infers `stateDim` and dtype on first push and stores state vectors as dense arrays in `Util.Buffer` buffers instead of raw tables.
    - `sample(batchSize)` reconstructs batched state / next-state tensors from the underlying buffers.
  - `Gradien.RL.UniformReplay` & `Gradien.RL.PrioritizedReplay`:
    - Similarly updated to serialize state vectors into buffers via `Util.Buffer` and to rebuild batched `S` / `NS` tensors on sampling.
    - Insert logic and bookkeeping (`head`, `size_`) are clarified and wrapped in explicit conditionals.
- (Changed) Schedulers & Trainer
  - `Trainer.fit` now works through a typed `FitOptions` table (`epochs`, `stepsPerEpoch`, `onMetric`), assigning defaults via a local `fitOpts` but remaining backwards compatible with previous usage.
- (Changed) Small fixes
  - `nn.BatchNorm1d`, replay buffers, and metrics all gained more explicit shape checks, zero-initialization, and `assert` messages to catch configuration errors earlier.
- (Changes) Several small improvements
  - Slight performance bumps, especially on heavy ops, compared to previous versions.
  - Small `Types` fix for `Tensor`.
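
The im2col + GEMM approach mentioned for `ops.Conv2d` can be illustrated with a small pure-Python sketch. This is not Gradien's kernel: the function names, the `[C][H][W]` single-image layout, stride 1, and the absence of padding are all simplifying assumptions. The idea is to unfold every receptive field into a column, flatten each filter into a row, and let one matrix product compute all output positions.

```python
def im2col(x, kh, kw):
    """Unfold x[C][H][W] into rows indexed by (channel, kernel offset).

    Each row holds the input value at that offset for every output position.
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = []
    for ch in range(c):
        for di in range(kh):
            for dj in range(kw):
                cols.append([x[ch][i + di][j + dj]
                             for i in range(out_h) for j in range(out_w)])
    return cols, out_h, out_w

def conv2d_im2col(x, weight):
    """Cross-correlate x[C][H][W] with weight[O][C][KH][KW] (stride 1, no padding)."""
    kh, kw = len(weight[0][0]), len(weight[0][0][0])
    cols, out_h, out_w = im2col(x, kh, kw)
    # Flatten each filter to length C*KH*KW, in the same order as the rows of cols.
    w_rows = [[v for ch in f for row in ch for v in row] for f in weight]
    out = []
    for w_row in w_rows:
        # One row of the GEMM: filter row times the unfolded input matrix.
        flat = [sum(wv * col[p] for wv, col in zip(w_row, cols))
                for p in range(out_h * out_w)]
        out.append([flat[i * out_w:(i + 1) * out_w] for i in range(out_h)])
    return out

image = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]   # one channel, 3 x 3
kernel = [[[[1, 0], [0, 1]]]]                 # one filter, 2 x 2
result = conv2d_im2col(image, kernel)
```

The unfolded matrix duplicates overlapping input values, so the approach trades extra memory for the ability to reuse an optimized matrix multiply.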
1.3.0
# What's new?
- (Added) `Debug` (few sanity helpers) at `Gradien.Debug`.
  - `checkTensor`, `checkGradients`, `checkModel`, and `wrapOptimizer` (warns on NaN/Inf and clips via `GradClip`).
- (Added) `GradStats` (min/max/mean/std over parameter grads) at `Gradien.Util`.
- (Added) `Tensor.ones` at `Gradien.Tensor`.
- (Added) `detach()` (to get a non-grad view) at `Gradien.Tensor`.
- (Added) `Fused` linear blocks at `Gradien.NN`
  - Ops: `linearReLU`, `linearGELU`, `linearDropoutReLU`; Modules: `LinearReLU`, `LinearGELU`, `LinearDropoutReLU`.
- (Added) `SwiGLU` at `Gradien.NN` (uses existing `nn.Activations.SwiGLUSplit`).
- (Added) On-policy RL baselines at `Gradien.RL`
  - `A2C`, `PPO`.
- (Added) Optimizer & Trainer snapshots at `Gradien.State`
  - `dumpTrainer(trainer)`, `loadTrainer(trainer, snap)`
  - `dumpOptimizer(optimizer)`, `loadOptimizer(optimizer, snap)`
- (Added) Autograd conveniences at `Gradien.Autograd`
  - `Tape.noGrad(fn)`, `Tape.grad(f, inputs)`.
- (Added) `shape(x)` and traversal helpers: `train(m)`, `eval(m)`, `apply(m, fn)`, `to(m, dtype)` at `Gradien.Util`.
- (Added) `getState()`, `setState(state)` & `setLr(lr)` to SGD, Adam, AdamW at `Gradien.Optim`
# What's changed?
- (Changed) Types
  - Many minor types improvements and fixes.
- (Changed) Tensor
  - `fromArray` now validates element count against `shape` and uses `Util.sizeFromShape`.
- (Changed) Trainer
  - New primary API: `Trainer.new({...})` with callbacks (`onStep`, `onEpochEnd`, `onBest`, …). Prior call-style kept as sugar.
- (Changed) Autograd
  - Tape respects `noGrad`; `grad(f, inputs)` backprops from `Tensor.ones(y.shape)`.
1.2.0
# What's new?
- (Added) `Anomaly` (NaN / Inf / bad-value detectors for tensors) at `Gradien.Util`
- (Added) `RNG` (seedable random helpers + generator wrapper used by data utilities) at `Gradien.Util`
- (Added) `Hooks` (forward hooks that wrap `module.forward` and inspect outputs) at `Gradien.Util`
- (Added) `Parallel` (helpers for running code inside Parallel) at `Gradien.Util`
- (Added) `Flatten` (NCHW `{C,H,W,B} → {C*H*W,B}` flattener, preserving batch dim) at `Gradien.NN`
- (Added) `Conv2d` (2D convolution layer on NCHW tensors) at `Gradien.NN`
- (Added) `MaxPool2d` (2D max-pooling layer backed by `ops.Pool`) at `Gradien.NN`
- (Added) `AvgPool2d` (2D average-pooling layer backed by `ops.AvgPool2d`) at `Gradien.NN`
- (Added) `AvgPool2d` (low-level 2D average-pool op for NCHW tensors) at `Gradien.Ops`
- (Added) `Gradcheck` (numerical gradient checker around `autograd.Tape`) at `Gradien.tools`
- (Added) `oneCycle` (1-Cycle learning-rate schedule) at `Gradien.Optim.Schedulers`
- (Added) `cosineRestarts` (cosine annealing with restarts schedule) at `Gradien.Optim.Schedulers`
# What's changed?
- (Changed) Data pipeline
  - `DataLoader` now:
    - Supports an optional `generator` with `randint(a, b)` for reproducible shuffles
    - Supports optional `drop_last` to skip incomplete batches
    - Calls dataset methods as `dataset:at(...)`, `dataset:slice(...)`, `dataset:batch(...)`.
  - `Split` keeps the same API but now always uses an explicit `rng` (defaulting to a new `Random`) and is fully typed for its `{number}` index outputs.
  - `KFold` now returns a `Types.BatchIter` that yields `{trainIdx, testIdx}` pairs with typed index arrays.
- (Changes) Many types improvements and fixes.
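
As a language-neutral illustration of an iterator that yields train/test index pairs, here is a minimal Python sketch of k-fold splitting. The function name `k_fold`, the optional `seed` argument, and the rule for distributing leftover samples are assumptions for the example and are not taken from Gradien's `KFold`.

```python
import random

def k_fold(n, k, seed=None):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation over n samples."""
    indices = list(range(n))
    if seed is not None:
        # A dedicated generator keeps the shuffle reproducible.
        random.Random(seed).shuffle(indices)
    # The first n % k folds get one extra sample so sizes differ by at most one.
    fold_sizes = [n // k + (1 if f < n % k else 0) for f in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

folds = list(k_fold(10, 3, seed=0))
```

Every sample appears in exactly one test fold, and each pair partitions the full index set.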
1.1.0
# What's new?
- (Added) `RND` (Random Network Distillation) at `Gradien.Extra`
- (Added) `LambdaBuffer` (TD(λ) buffer) at `Gradien.RL`
- (Added) `C51DQN` (C51 / Categorical DQN) at `Gradien.RL`
- (Added) `BDQ` (Branched Dueling DDQN) at `Gradien.RL`
- (Added) `RunningNorm` (Running mean/std (Welford) for scalars) at `Gradien.Util`
- (Added) `Visual2D` (Visualize network at 2D) at `Gradien.Util`
- (Added) `Visual3D` (Visualize network at 3D) at `Gradien.Util`
- (Added) `_layers` (Attribute) in `Gradien.NN.Sequential`
- (Added) `tau` (option) for `Deep Q-Learning Algorithms`
- (Added) `losses_fn` & `losses_args` (option, default: mse) for `Double DQN / DDQN`
- (Added) `getPolicy` (method) for `Deep Q-Learning Algorithms`
- (Added) `loadParameters` (method) for `Deep Q-Learning Algorithms`
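
For readers unfamiliar with the Welford update named in the `RunningNorm` entry, the following Python sketch shows the general technique for scalars. The class layout, method names, and the use of the population variance are assumptions for illustration and do not describe Gradien's actual API.

```python
import math

class RunningNorm:
    """Running mean and standard deviation for scalars via Welford's algorithm."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0   # running sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        # Uses the old and the new mean, which keeps the update numerically stable.
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Population standard deviation; 0.0 before any sample is seen.
        return math.sqrt(self.m2 / self.count) if self.count > 0 else 0.0

    def normalize(self, x, eps=1e-8):
        return (x - self.mean) / (self.std() + eps)

norm = RunningNorm()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    norm.update(value)
```

The single-pass update avoids storing past samples, which suits streaming quantities such as rewards or observations in RL.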