Forge-Bedrock is a long-term, bottom-up project dedicated to "reinventing the wheel" for the fundamental building blocks of Artificial Intelligence.
The philosophy of this project is to bridge the gap between abstract mathematical theorems (Calculus, Linear Algebra, Probability) and functional code. By implementing core algorithms from scratch, this repository serves as a personal laboratory for mastering the "bedrock" of AI—transforming black-box frameworks into transparent, intuitive logic.
- Deep Understanding: Move beyond API calls to understand the mechanical necessity of every optimization and decomposition.
- Mathematical Rigor: Translate formal proofs into robust, vectorized code.
- Modular Architecture: Build a decoupled system where linear algebra solvers, autograd engines, and optimizers work in harmony.
Implement the core routines that power data transformation and dimensionality reduction.
- Basic Matrix Operations
- High-performance Matrix Multiplication (Tiling/Block-based logic).
- Custom Broadcasting engine for tensor alignment.
- Systems of Equations
- Gaussian Elimination with partial pivoting.
- LU Decomposition.
- Cholesky Decomposition (for symmetric positive-definite matrices).
- Eigenvalues & Iterative Methods
- Power Iteration (finding the dominant eigenvalue).
- QR Algorithm for finding all eigenvalues.
- Advanced Matrix Decompositions
- Singular Value Decomposition (SVD): $A = U\Sigma V^T$ implementation from scratch.
- Principal Component Analysis (PCA): Dimensionality reduction using SVD/Covariance.
- Moore-Penrose Pseudoinverse: Solving overdetermined systems via $A^+ = (A^T A)^{-1} A^T$ (valid when $A$ has full column rank).
- Numerical Stability & Performance
- Adaptive Relative Tolerance based on machine epsilon and matrix norms.
- Stable SVD via One-Sided Jacobi rotations (avoiding $A^T A$ precision loss).
- Hessenberg Reduction so that each QR iteration costs $O(n^2)$ instead of $O(n^3)$.
- Shifted QR Algorithm with Wilkinson shifts and deflation logic.
- In-place Householder Storage and implicit $Q$ matrix construction.
- Applications
- Least Squares Regression using the Normal Equation.
- Image compression via Low-Rank Approximation (SVD).
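As an illustration of how two of the items above fit together, here is a minimal sketch that solves the normal equation $X^T X w = X^T y$ with Gaussian elimination and partial pivoting. The function names `solve_gauss` and `least_squares` are hypothetical and not taken from this repository, and the sketch assumes a square, non-singular system with a 1-D right-hand side.

```python
import numpy as np

def solve_gauss(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (hypothetical helper)."""
    A = np.array(A, dtype=float)  # work on copies so the caller's arrays stay intact
    b = np.array(b, dtype=float)
    n = A.shape[0]
    for k in range(n):
        # Partial pivoting: move the largest |entry| of column k onto the diagonal.
        p = k + int(np.argmax(np.abs(A[k:, k])))
        if A[p, k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            A[[k, p]] = A[[p, k]]
            b[[k, p]] = b[[p, k]]
        # Eliminate the entries below the pivot.
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

def least_squares(X, y):
    """Fit w minimizing ||X w - y||^2 via the normal equation X^T X w = X^T y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    return solve_gauss(X.T @ X, X.T @ y)

# Usage: recover the line y = 2x + 1 from noise-free samples.
xs = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([xs, np.ones_like(xs)])  # columns: slope term, intercept term
w = least_squares(X, 2.0 * xs + 1.0)
```

Forming $X^T X$ squares the condition number of $X$, so this route is best suited to small, well-conditioned problems; a QR- or SVD-based solver is the usual choice when conditioning is a concern.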
Implement reverse-mode automatic differentiation (backpropagation) from scratch, inspired by Karpathy's micrograd. The work starts from a scalar-level computational graph, builds a neural network library on top of it, and ends with a regression demo that verifies the full pipeline.
- Autograd Engine (`Value`)
- Dynamic DAG construction via Python operator overloading (`__add__`, `__mul__`, etc.)
- Topological sort for correct backward propagation order
- Reverse-mode automatic differentiation (`.backward()`)
- Gradient accumulation across multiple backward calls
- Computation graph visualization (graphviz or textual DAG rendering)
- Core Operations with Backward Rules
- Arithmetic: `+`, `-`, `*`, `/`, `**` (power), `neg`
- Activations: `relu`, `sigmoid`, `tanh`
- Transcendental: `exp`, `log`, `sqrt`
- (Stretch) `softmax` with stable log-sum-exp trick, `log_softmax`
- Numerical gradient verification via finite differences for every operation
- Neural Network Modules (`nn`)
- `Parameter` class (a `Value` subclass marking trainable parameters)
- `Linear` layer (fully connected: $y = xW^T + b$) with proper shape handling
- Activation wrappers: `ReLU`, `Tanh`, `Sigmoid`
- `Sequential` container for composing multi-layer pipelines
- `Module` base class: `parameters()` iterator, `zero_grad()`, train/eval mode
- Weight initialization: Xavier/Glorot uniform, He/Kaiming uniform
- Loss (minimal, just enough for demos)
- `MSELoss`: Mean Squared Error for regression tasks
- Training Utilities
- Mini-batch iteration helpers (`DataLoader`-style batching)
- SGD optimizer (parameter update with learning rate)
- Applications
- End-to-end: train a 2-layer MLP to convergence on a synthetic regression task
- Polynomial curve fitting via a small MLP
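The engine described above can be sketched in a few dozen lines. This is a micrograd-style illustration rather than the repository's actual `Value` class: it covers only `+`, `*`, and `tanh`, and the private attribute names `_parents` and `_backward` are assumptions made for the example. The final lines check the analytic gradient against a central finite difference, as the roadmap proposes for every operation.

```python
import math

class Value:
    """Scalar node in a dynamically built computation graph (illustrative sketch)."""

    def __init__(self, data, _parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = _parents
        self._backward = lambda: None  # local chain-rule step, set by each operation

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1; += accumulates over reused nodes
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def _backward():
            # d tanh(a)/da = 1 - tanh(a)^2
            self.grad += (1.0 - t * t) * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topological order ensures a node's grad is complete before it propagates.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

def f(x, w):
    # x appears twice, so its gradient is the sum of two paths through the graph.
    return (x * w + x).tanh()

x, w = Value(0.5), Value(-1.5)
y = f(x, w)
y.backward()

# Central finite difference for dy/dx, used to verify the analytic gradient.
h = 1e-6
numeric = (f(Value(0.5 + h), Value(-1.5)).data
           - f(Value(0.5 - h), Value(-1.5)).data) / (2 * h)
```

Because each `_backward` closure uses `+=`, a node that feeds several downstream operations collects the sum of all incoming gradients, which is the multivariate chain rule. The same choice means gradients persist across repeated `backward()` calls until they are reset.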
- Implementation of core distributions (Gaussian, Bernoulli) via sampling.
- Information Theory metrics: Entropy, Cross-Entropy, and KL Divergence.
- Maximum Likelihood Estimation (MLE) simulations.
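The three information-theory metrics are tightly linked: for discrete distributions, $D_{KL}(p \| q) = H(p, q) - H(p)$. A minimal sketch of that relationship follows, assuming probability vectors that sum to 1 and natural logarithms (so results are in nats). The function names are hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(p) = -sum p_i log p_i, in nats."""
    p = np.asarray(p, dtype=float)
    nz = p > 0  # convention: 0 * log 0 = 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

def cross_entropy(p, q):
    """Cross-entropy H(p, q) = -sum p_i log q_i, in nats."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    nz = p > 0  # terms with p_i = 0 contribute nothing
    # If q_i = 0 where p_i > 0, the result is inf (NumPy emits a warning).
    return float(-np.sum(p[nz] * np.log(q[nz])))

def kl_divergence(p, q):
    """KL divergence D(p || q) = H(p, q) - H(p); non-negative, zero iff p == q."""
    return cross_entropy(p, q) - entropy(p)

# Usage: a fair coin compared against a biased model of it.
p = [0.5, 0.5]
q = [0.9, 0.1]
h_p = entropy(p)
kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)  # KL is asymmetric, so this generally differs from kl_pq
```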
- Stochastic Gradient Descent (SGD) and Momentum.
- Adaptive methods: AdaGrad, RMSProp, and Adam.
- Regularization techniques (L1/L2 Weight Decay) from a mathematical constraint perspective.
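To make the update rules concrete, here is a minimal sketch of SGD with momentum and Adam applied to a toy quadratic. The function names, the hyperparameter defaults, and the use of a full (non-stochastic) gradient are assumptions chosen for the example, not the repository's API.

```python
import numpy as np

def sgd_momentum(grad_fn, w, lr=0.1, beta=0.9, steps=300):
    """Heavy-ball momentum: v <- beta * v + g, then w <- w - lr * v."""
    w = np.array(w, dtype=float)
    v = np.zeros_like(w)
    for _ in range(steps):
        v = beta * v + grad_fn(w)
        w = w - lr * v
    return w

def adam(grad_fn, w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Adam: bias-corrected first and second moment estimates of the gradient."""
    w = np.array(w, dtype=float)
    m = np.zeros_like(w)  # first moment (running mean of gradients)
    v = np.zeros_like(w)  # second moment (running mean of squared gradients)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # correct the bias toward zero at early steps
        v_hat = v / (1 - beta2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

# Toy objective: f(w) = ||w - target||^2, whose gradient is 2 * (w - target).
target = np.array([3.0, -2.0])
grad = lambda w: 2.0 * (w - target)

w_momentum = sgd_momentum(grad, np.zeros(2))
w_adam = adam(grad, np.zeros(2))
```

Both optimizers take a gradient callable instead of a loss, which keeps them independent of how gradients are produced (analytically here, or by an autograd engine).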
- Language: Python 3.x
- Core Library: NumPy (used for N-dimensional array storage and basic vectorized arithmetic).
- Visualization: Matplotlib (for convergence plots and decomposition results).
"What I cannot create, I do not understand." — Richard Feynman