These guidelines define how we write tests across the Shale repository to ensure consistency, readability, and reliability. All 239+ tests follow these patterns — this document reflects our actual validated practices.
- Test Organization Philosophy
- Naming Convention
- File Organization
- Single Behavior Principle
- [Test Structure (Arrange/Act/Assert)](#test-structure-arrangea ctassert)
- Async & Timeout Patterns
- Trait Behavior Tests
- Table-Driven Tests
- Determinism & Isolation
- Assertions
- Negative Testing
- CI Commands
- Quick Reference Checklist
Principle: Following Rust conventions, we distinguish between unit tests and integration tests.
Unit Tests:
- Located in
src/files within#[cfg(test)] mod tests { }blocks - Test single modules in isolation
- Have access to private functions and internal implementation
- Test individual components, functions, structs, and traits
- Run with
cargo test --libIntegration Tests: - Located in the
tests/directory - Test multiple modules working together or the public API
- Only have access to public interfaces
- Test cross-cutting concerns, end-to-end scenarios
- Run with
cargo test --test <name>
src/
├── storage/
│ ├── bloom.rs → Contains #[cfg(test)] mod tests { ... }
│ ├── skiplist.rs → Contains #[cfg(test)] mod tests { ... }
│ ├── memtable.rs → Contains #[cfg(test)] mod tests { ... }
│ └── sst.rs → Contains #[cfg(test)] mod tests { ... }
├── wal/
│ ├── wal.rs → Contains #[cfg(test)] mod tests { ... }
│ ├── wal_fs.rs → Contains #[cfg(test)] mod tests { ... }
│ └── wal_mem.rs → Contains #[cfg(test)] mod tests { ... }
tests/ (Integration tests only)
├── engine.rs → Tests ShaleEngine (multiple modules)
├── backup.rs → Tests backup system (multiple modules)
├── compaction/
│ └── compactor.rs → Tests compaction (multiple modules)
└── api/
├── query.rs → Tests query API (public interface)
└── transaction.rs → Tests transaction API (public interface)
Does the test use ShaleEngine or test multiple modules?
├─ YES → tests/ directory (integration test)
│ Example: Testing backup, compaction, full CRUD flows
│
└─ NO → Is it testing a single module's internal behavior?
├─ YES → src/<module>.rs in #[cfg(test)] mod tests { }
│ Example: Testing skiplist insert, bloom filter contains, codec roundtrip
│
└─ UNSURE → Does it need access to private functions/fields?
├─ YES → Unit test (src/ file)
└─ NO → Could be either; prefer unit test for speed
## Naming Convention
Use **behavior-first** test names that clearly describe the expected outcome:
should*<expected_outcome>_given*when
### Real Examples from Shale
✅ **Good:**
```rust
should_resolve_put_on_get_local()
should_enforce_quota_given_file_registration_when_limit_exceeded()
should_track_operation_counts_given_basic_operations()
should_roundtrip_given_small_input()
should_return_none_given_deleted_key_when_delete()
should_defer_deletion_given_grace_period_when_marked()
❌ Bad:
test_get_local() // What behavior? What's expected?
test_quota() // Too vague
basic_operations() // Not behavior-first
roundtrip_small() // Missing should/given/when
test_1() // Meaningless- Be descriptive: Clarity beats brevity. Long, clear names > short, vague names
- Include context: State preconditions (
given) when they affect the outcome - Omit redundancies: If action is obvious from outcome,
when_<action>can be omitted - Use consistent verbs:
should_return— for getters/queriesshould_enforce— for validation/limitsshould_track— for metrics/countersshould_apply— for state changesshould_defer— for delayed operationsshould_coordinate— for multi-component behavior
What does the test verify?
Getter/Query → should_return_<value>_given_<state>
Example: should_return_none_given_missing_key
Validation → should_enforce_<rule>_given_<condition>
Example: should_enforce_quota_given_limit_exceeded
State Change → should_apply_<change>_when_<trigger>
Example: should_apply_new_rate_given_dynamic_adjustment
Metric/Counter → should_track_<metric>_given_<operation>
Example: should_track_flush_operations_given_memtable_flush
Complex/Other → should_<behavior>_given_<context>_when_<action>
Example: should_defer_deletion_given_grace_period_when_marked
Principle: Unit tests live in src/ files, integration tests in tests/ directory.
// src/storage/skiplist.rs
pub struct SkipList {
// ... implementation ...
}
impl SkipList {
pub fn new() -> Self {
// ... implementation ...
}
pub fn upsert(&mut self, key: Vec<u8>, value: Option<Vec<u8>>, seq: u64) {
// ... implementation ...
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn should_insert_and_get_value() {
// Arrange
let mut sl = SkipList::new();
// Act
sl.upsert(b"key".to_vec(), Some(b"val".to_vec()), 1);
// Assert
assert_eq!(sl.get(b"key"), Some(&Some(b"val".to_vec())));
}
#[test]
fn should_return_none_given_missing_key() {
// Arrange
let sl = SkipList::new();
// Act
let result = sl.get(b"nonexistent");
// Assert
assert_eq!(result, None);
}
}// tests/engine.rs
use shale::{ShaleEngine, ShaleOptions};
use bytes::Bytes;
use tempfile::TempDir;
#[test]
fn should_persist_data_across_restart() {
// Arrange
let dir = TempDir::new().unwrap();
let opts = ShaleOptions {
db_path: dir.path().to_path_buf(),
..Default::default()
};
// Act: Write and close
{
let mut engine = ShaleEngine::open(opts.clone()).unwrap();
engine.put(Bytes::from("key"), Bytes::from("value")).unwrap();
}
// Act: Reopen and read
let engine = ShaleEngine::open(opts).unwrap();
let result = engine.get(&Bytes::from("key")).unwrap();
// Assert
assert_eq!(result, Some(Bytes::from("value")));
}Unit Tests (in src/):
src/
├── index/
│ ├── bloom.rs → 14 tests
│ ├── merge_iterator.rs → 10 tests
│ └── range_tombstone.rs → 21 tests
├── storage/
│ ├── skiplist.rs → 4 tests
│ ├── memtable.rs → 13 tests
│ ├── sparse_index.rs → 5 tests
│ ├── sst.rs → 7 tests
│ ├── sst_mem.rs → 11 tests
│ ├── sst_fs.rs → 6 tests
│ └── file_manager.rs → 11 tests
├── wal/
│ ├── wal.rs → 4 tests
│ ├── wal_fs.rs → 12 tests
│ └── wal_mem.rs → 16 tests
└── utils/
├── tlv.rs → 15 tests
└── codec.rs → 20+ tests
Integration Tests (in tests/):
tests/
├── engine.rs → Full engine integration
├── backup.rs → Backup system
├── manifest.rs → Manifest persistence
├── api/
│ ├── query.rs → Query API
│ ├── transaction.rs → Transaction API
│ └── snapshot.rs → Snapshot isolation
├── compaction/
│ ├── compactor.rs → Compaction logic
│ └── compaction_filter.rs → Custom filters
└── utils/
├── cache.rs → Cache integration with engine
├── metrics.rs → Metrics collection
└── rate_limiter.rs → Rate limiting integration
Unit Tests (same file):
#[cfg(test)]
mod tests {
use super::*; // Import from parent module
use bytes::Bytes;
use crate::error::ShaleResult; // Absolute crate path
#[test]
fn should_behave_correctly() {
let obj = MyStruct::new(); // Can access private items
// ...
}
}Integration Tests (separate file):
// tests/engine.rs
use shale::{ShaleEngine, ShaleOptions}; // Public API only
use shale::error::ShaleError;
use bytes::Bytes;
#[test]
fn should_work_end_to_end() {
let engine = ShaleEngine::open(opts).unwrap();
// Can only access public API
}We recently migrated 70+ tests from tests/ to src/ following Rust conventions:
Migrated to src/ (Unit Tests):
- ✅ bloom.rs (14 tests)
- ✅ merge_iterator.rs (10 tests)
- ✅ range_tombstone.rs (21 tests)
- ✅ tlv.rs (15 tests)
- ✅ skiplist.rs (4 tests)
- ✅ memtable.rs (13 tests)
- ✅ codec.rs (20+ tests)
- ✅ sparse_index.rs (5 tests)
- ✅ sst.rs (7 tests)
- ✅ sst_mem.rs (11 tests)
- ✅ sst_fs.rs (6 tests)
- ✅ wal.rs (4 tests)
- ✅ wal_fs.rs (12 tests)
- ✅ wal_mem.rs (16 tests)
- ✅ file_manager.rs (11 tests)
Kept in
tests/(Integration Tests): - engine.rs (uses ShaleEngine)
- backup.rs (multi-module)
- manifest.rs (persistence integration)
- compaction/* (multi-module compaction)
- api/* (public API tests)
- utils/cache.rs (uses ShaleEngine)
- utils/metrics.rs (uses ShaleEngine)
- utils/rate_limiter.rs (hybrid integration)
Core Rule: Each test verifies exactly one behavior with exactly one Act.
The Act is the single operation/method call you're testing — the subject of verification.
#[test]
fn should_resolve_put_on_get_local() {
// Arrange
let mut tx = Transaction::new(1, 0);
tx.put(Bytes::from_static(b"a"), Bytes::from_static(b"1"), None);
// Act ← This ONE operation is what we're testing
let got = tx.get_local(b"a");
// Assert
assert_eq!(got, Some(Some(Bytes::from_static(b"1"))));
}Setup operations in Arrange (like tx.put() above) are not Acts — they're preconditions.
Multiple assertions are fine if they all verify the same behavior: ✅ Good: All assertions verify "quota enforcement"
#[test]
fn should_enforce_quota_given_file_registration_when_limit_exceeded() {
// Arrange
let fm = FileManager::with_limits(1_000_000, 0);
fm.register_file(Path::new("f1"), 400_000).unwrap();
fm.register_file(Path::new("f2"), 400_000).unwrap();
// Act: Try to exceed quota
let result = fm.register_file(Path::new("f3"), 300_000);
// Assert: All verify the same "quota enforcement" behavior
assert!(result.is_err()); // ✓ Error occurred
if let Err(FileManagerError::QuotaExceeded { requested, available }) = result {
assert_eq!(requested, 300_000); // ✓ Correct request size
assert_eq!(available, 200_000); // ✓ Correct remaining space
}
}✅ Good: Related side-effects of one behavior
#[test]
fn should_track_operation_counts_given_basic_operations() {
// ... arrange ...
// Act
engine.put(k1, v1).unwrap();
engine.put(k2, v2).unwrap();
// Assert: All verify "operation tracking" behavior
let snapshot = metrics.snapshot();
assert_eq!(snapshot.put_count, 2); // ✓ Direct counter
assert_eq!(snapshot.memtable_writes, 2); // ✓ Related metric
assert_eq!(snapshot.total_operations(), 2); // ✓ Derived value
}Split into separate tests when you encounter: | Symptom | Action | | ------------------------ | ------------------------------------------- | -------- | | Multiple Acts | Different operations being tested | ✂️ Split | | Unrelated Assertions | Independent behaviors even with same Act | ✂️ Split | | Vague Name | Can't describe as "should_X" | ✂️ Split | | Failure Ambiguity | When it fails, unclear which behavior broke | ✂️ Split | | Complex Setup | Need different Arrange for different checks | ✂️ Split |
❌ Bad: Multiple unrelated Acts
#[test]
fn should_exercise_many_things() { // ← Red flag: vague name
// Arrange
let mut tx = Transaction::new(1, 0);
tx.put(b"a", b"1", None);
// Act 1: Query
let got = tx.get_local(b"a");
assert_eq!(got, Some(Some(b"1"))); // ← First behavior
// Act 2: Commit
let muts = tx.commit().unwrap();
assert_eq!(muts.len(), 1); // ← Completely different behavior!
}✅ Good: Split into focused tests
#[test]
fn should_resolve_put_on_get_local() {
// Arrange
let mut tx = Transaction::new(1, 0);
tx.put(b"a", b"1", None);
// Act
let got = tx.get_local(b"a");
// Assert
assert_eq!(got, Some(Some(b"1")));
}
#[test]
fn should_return_staged_mutations_in_order_when_commit() {
// Arrange
let mut tx = Transaction::new(1, 0);
tx.put(b"a", b"1", None);
// Act
let muts = tx.commit().unwrap();
// Assert
assert_eq!(muts.len(), 1);
}When reviewing tests, verify:
- Test name clearly describes one expected behavior
- Exactly one Act operation
- All assertions verify the same behavior or its direct side-effects
- If test fails, the behavior that broke is immediately clear
- No "and" in the test name (usually indicates multiple behaviors)
Mandatory Pattern: Every test must have clearly marked sections.
#[test]
fn should_<behavior>() {
// Arrange
// ... setup code ...
// Act
// ... single operation under test ...
// Assert
// ... verify expected outcome ...
}#[test]
fn should_exclude_deleted_keys_given_tombstones_when_scanning() {
// Arrange
let mt = Memtable::new();
mt.put(b"a", b"1");
mt.put(b"b", b"2");
mt.delete(b"a");
// Act
let rows = mt.scan_range(Some(b"a"), Some(b"z"));
// Assert: tombstone 'a' excluded
assert_eq!(
rows,
vec![(Bytes::from_static(b"b"), Bytes::from_static(b"2"))]
);
}Arrange:
- Create test data, objects, setup state
- Use helpers like
tempfile::tempdir()for isolation - Keep setup minimal — only what's needed for this test
- Comment complex setup:
// Arrange: force WAL rotation by...Act: - Exactly one operation under test
- Can span multiple lines if it's one logical operation
- Should be obvious what's being tested Assert:
- Verify the expected behavior
- Add comments for non-obvious expectations
- Use exact matches when deterministic
- Check side-effects that prove correctness
For very simple tests, Arrange and Act can be combined if it aids clarity:
#[test]
fn should_return_none_given_missing_key() {
// Arrange + Act
let mt = Memtable::new();
let result = mt.get(b"nonexistent");
// Assert
assert_eq!(result, None);
}But prefer separate sections when any complexity is involved.
❌ Bad: Arbitrary sleep
std::thread::sleep(Duration::from_secs(1));
// Hope something happened...
assert!(condition);✅ Good: Timeout with fast failure
use tokio::time::{timeout, Duration};
let result = timeout(Duration::from_millis(200), operation).await;
assert!(result.is_ok(), "operation timed out");#[tokio::test]
async fn should_complete_within_timeout() {
// Arrange
let service = Service::new();
// Act
let result = timeout(
Duration::from_millis(100),
service.async_operation()
).await;
// Assert
assert!(result.is_ok(), "operation should complete quickly");
assert_eq!(result.unwrap(), expected_value);
}- Fast tests: Use 50-200ms timeouts for fast operations
- Slow operations: Use minimal timeout needed + margin (e.g., 1s operation → 1.5s timeout)
- Comment delays: If a
sleep()is truly needed, explain why - Prefer polling: For eventual consistency, use retry loops with timeout
use tokio::time::{timeout, Duration};
#[tokio::test]
async fn should_receive_no_messages_given_no_publisher() {
// Arrange
let mut subscriber = create_subscriber();
// Act: Try to receive with timeout
let result = timeout(Duration::from_millis(100), subscriber.next()).await;
// Assert: Timeout means no message (expected)
assert!(result.is_err(), "unexpected message received");
}When a trait has multiple implementations, write tests that enforce the trait's contract, not implementation details.
All implementations must pass the same behavioral tests.
// Shared behavior tests
fn test_compressor_behavior<F, C>(factory: F)
where
F: Fn() -> C,
C: Compressor,
{
// should_roundtrip_given_small_input
{
// Arrange
let codec = factory(); // Fresh instance
let input = b"hello";
// Act
let compressed = codec.compress(input).unwrap();
let output = codec.decompress(&compressed).unwrap();
// Assert
assert_eq!(output.as_slice(), input);
}
// should_roundtrip_given_empty_input
{
// Arrange
let codec = factory(); // Fresh instance
let input: &[u8] = &[];
// Act
let compressed = codec.compress(input).unwrap();
let output = codec.decompress(&compressed).unwrap();
// Assert
assert_eq!(output.as_slice(), input);
}
}
// Apply to each implementation
#[test]
fn noop_codec_behavior() {
test_compressor_behavior(|| NoopCodec::new());
}
#[test]
fn snappy_codec_behavior() {
test_compressor_behavior(|| SnappyCodec::new());
}
#[test]
fn lz4_codec_behavior() {
test_compressor_behavior(|| Lz4Codec::new());
}macro_rules! behavior_tests_for {
($modname:ident, $codec_ty:ty, $ctor:expr) => {
mod $modname {
use super::*;
#[test]
fn should_roundtrip_given_small_input() -> Result<()> {
let codec = $ctor();
let input = b"hello world";
roundtrip(codec, input)
}
#[test]
fn should_roundtrip_given_empty_input() -> Result<()> {
let codec = $ctor();
let input: &[u8] = &[];
roundtrip(codec, input)
}
#[test]
fn should_roundtrip_given_binary_input() -> Result<()> {
let codec = $ctor();
let input = &[0u8, 1, 2, 3, 4, 5, 0xff, 0x00, 0x7f];
roundtrip(codec, input)
}
}
};
}
// Helper function
fn roundtrip<C: Compressor>(codec: C, input: &[u8]) -> Result<()> {
let compressed = codec.compress(input)?;
let decompressed = codec.decompress(&compressed)?;
assert_eq!(decompressed.as_slice(), input);
Ok(())
}
// Generate tests for each implementation
behavior_tests_for!(noop_codec, NoopCodec, NoopCodec::new);
behavior_tests_for!(snappy_codec, SnappyCodec, SnappyCodec::new);
behavior_tests_for!(lz4_codec, Lz4Codec, Lz4Codec::new);✅ Contract enforcement: All implementations must satisfy the same behaviors
✅ Zero duplication: Write tests once, run for all implementations
✅ Easy to extend: New implementation? Add one line
✅ Isolation: Fresh instance per behavior via factory pattern
❌ Don't copy-paste tests for each implementation — use the patterns above
For testing the same behavior with multiple inputs, use table-driven tests.
#[test]
fn should_parse_valid_inputs() {
// Arrange: table of (name, input, expected)
let cases = vec![
("empty", "", 0),
("single", "a", 1),
("many", "abc", 3),
("unicode", "日本語", 3),
];
for (name, input, expected) in cases {
// Act
let result = parse(input);
// Assert
assert_eq!(result, expected, "case '{}' failed", name);
}
}use tokio::time::{timeout, Duration};
#[tokio::test]
async fn should_handle_various_inputs() {
// Arrange
let cases = vec![
("small", vec![1, 2, 3], 6),
("large", vec![10, 20, 30], 60),
("empty", vec![], 0),
];
for (name, input, expected) in cases {
let name = name.to_string(); // Move into async block
// Act
let fut = async move {
let result = async_sum(&input).await;
assert_eq!(result, expected, "case: {}", name);
};
// Assert: Complete within timeout
let res = timeout(Duration::from_millis(100), fut).await;
assert!(res.is_ok(), "case '{}' timed out", name);
}
}✅ Use table-driven when:
- Testing the same behavior with different inputs
- Verifying boundary conditions (empty, min, max, etc.)
- Checking multiple valid/invalid input variations ❌ Don't use when:
- Testing different behaviors (use separate tests instead)
- Each case needs different setup or assertions
- Failure diagnosis would be unclear
Tests must be reliable and independent.
- No Shared State
Each test creates its own resources - No External Dependencies
Tests should not depend on:- Network services (except localhost)
- File system state outside temp dirs
- Environment variables
- Current time (unless explicitly testing time)
- Order Independence
Tests can run in any order, including parallel - Cleanup
Use RAII (tempfile::TempDir) or explicit cleanup
Temp Directories:
#[test]
fn should_persist_data() {
// Arrange: Isolated temp dir, auto-cleaned
let dir = tempfile::tempdir().unwrap();
let engine = ShaleEngine::open(ShaleOptions {
db_path: dir.path().to_path_buf(),
..Default::default()
}).unwrap();
// ... test ...
// Cleanup: automatic when `dir` drops
}In-Memory State:
#[test]
fn should_track_metrics() {
// Arrange: Fresh engine instance (not shared)
let engine = create_test_engine();
// ... test ...
}❌ Global mutable state ❌ Singleton patterns without isolation ❌ Tests that depend on execution order ❌ Tests that modify shared file system paths
When behavior is deterministic, assert exact values: ✅ Good:
assert_eq!(result, Some(Bytes::from_static(b"value")));
assert_eq!(count, 42);
assert_eq!(rows.len(), 3);❌ Bad:
assert!(count > 0); // How many? Why not assert the exact count?Don't over-assert internal details: ✅ Good: Verify observable behavior
let stats = file_manager.stats();
assert_eq!(stats.current_total_bytes, 500_000);
assert_eq!(stats.space_utilization(), 0.5);❌ Bad: Assert internal implementation details
// Don't assert HashMap ordering, private fields, etc.Add messages for non-obvious failures:
assert_eq!(
actual, expected,
"Bloom filter false positive rate too high"
);
assert!(
elapsed < Duration::from_millis(100),
"Operation took too long: {:?}", elapsed
);For complex types, use pattern matching:
match result {
Err(FileManagerError::QuotaExceeded { requested, available }) => {
assert_eq!(requested, 300_000);
assert_eq!(available, 200_000);
}
_ => panic!("Expected QuotaExceeded error"),
}Tests must verify error cases, not just happy paths.
- Validation Errors: Invalid input
- State Errors: Operation not allowed in current state
- Resource Errors: Quota exceeded, file not found, etc.
- Corruption: Malformed data, checksums, etc.
#[test]
fn should_reject_empty_key() {
// Arrange
let engine = create_engine();
// Act
let result = engine.put(Bytes::new(), Bytes::from("value"));
// Assert
assert!(result.is_err());
assert!(matches!(result.unwrap_err(), Error::InvalidKey));
}#[test]
fn should_detect_corrupted_wal() {
// Arrange: Create valid file
let tmp = tempfile::tempdir().unwrap();
let path = tmp.path().join("wal.log");
std::fs::write(&path, valid_wal_bytes()).unwrap();
// Act: Corrupt the file
let mut data = std::fs::read(&path).unwrap();
data[10] ^= 0xFF; // Flip bits
std::fs::write(&path, &data).unwrap();
// Assert: Replay detects corruption
let result = replay_wal_file(&path);
assert!(result.is_err());
assert!(matches!(result.unwrap_err(), Error::Corruption));
}#[test]
fn should_error_when_quota_exceeded() {
// Arrange
let fm = FileManager::with_limits(1_000_000, 0);
fm.register_file(Path::new("f1"), 900_000).unwrap();
// Act: Exceed quota
let result = fm.register_file(Path::new("f2"), 200_000);
// Assert: Error with details
assert!(result.is_err());
if let Err(FileManagerError::QuotaExceeded { requested, available }) = result {
assert_eq!(requested, 200_000);
assert_eq!(available, 100_000);
} else {
panic!("Expected QuotaExceeded error");
}
}Always test:
- Empty collections
- Boundary values (0, MAX, MIN)
- Missing keys
- Duplicate keys
- Concurrent operations (where applicable)
These are our canonical test commands for local development and CI:
# Full test suite
cargo test --all
# Quiet mode (less output)
cargo test --all --quiet
# Show test output even on success
cargo test --all -- --nocapture# Single integration test file
cargo test --test wal
# Single test by name
cargo test should_resolve_put_on_get_local
# All tests in a module
cargo test skiplist::# Generate coverage report (requires cargo-llvm-cov)
cargo llvm-cov --tests
# HTML coverage report
cargo llvm-cov --tests --html# PowerShell: Count total passing tests
cargo test --quiet 2>&1 | Select-String "(\d+) passed" |
ForEach-Object { if ($_ -match '(\d+) passed') { [int]$matches[1] } } |
Measure-Object -Sum |
Select-Object -ExpandProperty Sum# Release mode (for benchmarks)
cargo test --release
# Single-threaded (for debugging)
cargo test -- --test-threads=1Use this checklist when writing or reviewing tests:
- Name:
should_<outcome>_given_<context>_when_<action> - Clear
// Arrange,// Act,// Assertsections - Exactly one Act operation
- All assertions verify the same behavior
- File location mirrors
src/structure - Imports use
use shale::<module>::<Type> - Helper functions (if any) are clearly separated
- Test name clearly describes expected behavior
- Single, focused behavior (not testing multiple things)
- When test fails, immediately clear what broke
- No "and" in test name (indicates multiple behaviors)
- Uses
tempfile::tempdir()for file system tests - No shared global state
- No external dependencies (except localhost)
- Can run in any order, including parallel
- Uses
#[tokio::test]for async - Uses
timeout()instead ofsleep() - Timeout duration is justified (50-200ms for fast ops)
- Factory pattern for fresh instances
- Same tests for all implementations
- Tests trait contract, not implementation details
- Exact matches where deterministic
- Error messages for non-obvious failures
- Asserts observable behavior, not internals
- Tests error paths (not just happy path)
- Tests boundary conditions (empty, max, min)
- Tests invalid input
- Tests resource limits (where applicable)
See our actual test files for reference implementations: Best Examples:
tests/engine.rs— Integration tests, excellent namingtests/transaction.rs— Perfect Arrange/Act/Assert structuretests/codec.rs— Trait behavior tests with macrotests/rate_limiting.rs— Recently updated, follows all guidelinestests/skiplist.rs— Clean, simple, well-structured For Specific Patterns:- Async + Timeout:
tests/rate_limiting.rs - Trait Testing:
tests/codec.rs - Negative Testing:
tests/error.rs - Table-Driven:
tests/bloom.rs(via macro) - Integration:
tests/compactor.rs
- 2025-10-14: Updated to reflect 214-test suite alignment
- 2025-10-14: Added real examples from Shale codebase
- 2025-10-14: Enhanced naming guidelines with decision tree
- 2025-10-14: Added comprehensive trait testing patterns
- 2025-10-14: Improved async/timeout guidance These guidelines evolve with the codebase. When patterns change, update this document to reflect our actual practices.