Add training lessons and a bigram token-target bridge #5
Code Review
This pull request implements Module 04 ('Learning') by adding new lesson files and the neuron Rust crate, which provides a single-neuron training system with manual backpropagation and SGD. Documentation and CI scripts were updated to integrate the new module. Feedback identifies a numbering error in the README, recommends reverting to a stable Rust edition, and suggests refactoring training functions to reduce code duplication.
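For orientation, here is a minimal sketch of what "manual backpropagation and SGD" means for a single linear neuron under a squared-error loss; the crate's actual neuron, loss, and update rule may differ:

```rust
// Illustrative only: a linear neuron y = w*x + b trained on squared error.
// The gradients are derived by hand with the chain rule, then applied
// directly as an SGD update (parameter -= learning_rate * gradient).
struct TinyNeuron {
    w: f64,
    b: f64,
}

impl TinyNeuron {
    fn sgd_step(&mut self, x: f64, target: f64, lr: f64) {
        let y = self.w * x + self.b;    // forward pass
        let dl_dy = 2.0 * (y - target); // dL/dy for L = (y - target)^2
        self.w -= lr * dl_dy * x;       // dL/dw = dL/dy * dy/dw = dl_dy * x
        self.b -= lr * dl_dy;           // dL/db = dL/dy * dy/db = dl_dy * 1
    }
}
```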
```toml
[package]
name = "rust_ml_neuron"
version = "0.1.0"
edition = "2024"
```
The Rust 2024 edition requires a very recent toolchain (it only stabilized with Rust 1.85), so older stable toolchains and CI images may fail to build this crate. Using the 2021 edition ensures the crate can be built by a wider range of users and CI environments.
```diff
-edition = "2024"
+edition = "2021"
```
```diff
 3. Continue with [03 Neuron](lessons/03-neuron/README.md).
-4. Use [Lessons index](lessons/README.md) to see the full course map and the roadmap modules.
+4. Continue with [04 Learning](lessons/04-learning/README.md).
+5. Use [Lessons index](lessons/README.md) to see the full course map and the roadmap modules.
```
```rust
/// Trains the neuron for one full pass over the dataset.
pub fn train_epoch(neuron: &mut Neuron, dataset: &Dataset, optimizer: Sgd) -> EpochMetrics {
    for example in dataset.iter() {
        let gradients = neuron.backward(*example);
        optimizer.apply(neuron, gradients);
    }

    EpochMetrics {
        epoch: 1,
        average_loss: average_loss(neuron, dataset),
    }
}

/// Trains the neuron for a fixed number of epochs and records the average loss.
pub fn train_epochs(
    neuron: &mut Neuron,
    dataset: &Dataset,
    optimizer: Sgd,
    epochs: usize,
) -> Vec<EpochMetrics> {
    let mut metrics = Vec::with_capacity(epochs);

    for epoch_index in 0..epochs {
        for example in dataset.iter() {
            let gradients = neuron.backward(*example);
            optimizer.apply(neuron, gradients);
        }

        metrics.push(EpochMetrics {
            epoch: epoch_index + 1,
            average_loss: average_loss(neuron, dataset),
        });
    }

    metrics
}
```
The functions `train_epoch` and `train_epochs` contain duplicate logic for iterating over a dataset for one epoch. This can be extracted into a private helper function to follow the DRY (Don't Repeat Yourself) principle, improving maintainability.

Additionally, `train_epochs` can be written more concisely using iterators and `map`:
```rust
// Shared per-epoch loop, extracted so both public functions reuse it.
// Taking the optimizer by reference avoids moving it on repeated calls.
fn run_one_epoch(neuron: &mut Neuron, dataset: &Dataset, optimizer: &Sgd) {
    for example in dataset.iter() {
        let gradients = neuron.backward(*example);
        optimizer.apply(neuron, gradients);
    }
}

/// Trains the neuron for one full pass over the dataset.
pub fn train_epoch(neuron: &mut Neuron, dataset: &Dataset, optimizer: Sgd) -> EpochMetrics {
    run_one_epoch(neuron, dataset, &optimizer);
    EpochMetrics {
        epoch: 1,
        average_loss: average_loss(neuron, dataset),
    }
}

/// Trains the neuron for a fixed number of epochs and records the average loss.
pub fn train_epochs(
    neuron: &mut Neuron,
    dataset: &Dataset,
    optimizer: Sgd,
    epochs: usize,
) -> Vec<EpochMetrics> {
    (1..=epochs)
        .map(|epoch| {
            run_one_epoch(neuron, dataset, &optimizer);
            EpochMetrics {
                epoch,
                average_loss: average_loss(neuron, dataset),
            }
        })
        .collect()
}
```
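For reference, a hypothetical usage sketch of the refactored API; the constructors (`Neuron::default`, `Dataset::from_pairs`, `Sgd::new`) are stand-ins for whatever initializers the crate actually exposes:

```rust
// Hypothetical usage; the constructor names below are assumptions,
// not necessarily the crate's real API.
fn main() {
    let mut neuron = Neuron::default();
    let dataset = Dataset::from_pairs(&[(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]);
    let optimizer = Sgd::new(0.01);

    // Each entry holds the average loss after one full pass over the data,
    // so printing the series shows whether training converges.
    for m in train_epochs(&mut neuron, &dataset, optimizer, 100) {
        println!("epoch {:>3}: average loss {:.6}", m.epoch, m.average_loss);
    }
}
```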
Summary
- Adds Module 04 ("Learning") lessons and the `code/neuron` crate, a single-neuron training system with manual backpropagation and SGD.
- Adds a bigram next-token model that bridges tokens to targets via `token -> embedding -> lm_head -> logits -> cross-entropy`; a minimal sketch of this pipeline follows below.
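A minimal sketch of that pipeline, assuming a tiny vocabulary and embedding width; every name and dimension here is illustrative rather than the crate's actual API:

```rust
// token -> embedding -> lm_head -> logits -> cross-entropy, spelled out.
// VOCAB and DIM are toy sizes chosen only for this example.
const VOCAB: usize = 4;
const DIM: usize = 3;

fn bigram_loss(
    embedding: &[[f64; DIM]; VOCAB], // one embedding row per token id
    lm_head: &[[f64; VOCAB]; DIM],   // projects an embedding to vocab logits
    token: usize,
    target: usize,
) -> f64 {
    // token -> embedding: look up the row for the current token id.
    let emb = embedding[token];

    // embedding -> lm_head -> logits: a single matrix-vector product.
    let mut logits = [0.0_f64; VOCAB];
    for (d, row) in lm_head.iter().enumerate() {
        for (v, w) in row.iter().enumerate() {
            logits[v] += emb[d] * w;
        }
    }

    // logits -> cross-entropy: -log softmax(logits)[target],
    // computed via a numerically stable log-sum-exp.
    let max = logits.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let log_sum_exp = logits.iter().map(|l| (l - max).exp()).sum::<f64>().ln() + max;
    log_sum_exp - logits[target]
}
```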
Testing

- `python3 scripts/check_course_content.py`
- `python3 scripts/check_lesson_rust_snippets.py`
- `cargo fmt --manifest-path code/neuron/Cargo.toml --check`
- `cargo clippy --manifest-path code/neuron/Cargo.toml --all-targets --all-features`
- `cargo test --manifest-path code/neuron/Cargo.toml`
- `cargo fmt --manifest-path code/transformer/Cargo.toml --check`
- `cargo clippy --manifest-path code/transformer/Cargo.toml --all-targets --all-features`
- `cargo test --manifest-path code/transformer/Cargo.toml`