Test CPU-only mode on Jetson Nano (no CUDA) for reduced memory usage

## Idea

Running Bonsai-8B on the Jetson Nano in CPU-only mode (`-ngl 0`) instead of on the GPU could significantly reduce memory usage, similar to what we see on the Raspberry Pi 4.
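
As a rough sketch of what a CPU-only invocation might look like, the helper below assembles an argument list. The binary path `./llama-cli` and the model filename are placeholders, and `-c`, `-ctk`, and `-ctv` are the upstream llama.cpp spellings; whether the PrismML fork accepts them unchanged is an assumption to verify.

```python
def cpu_only_args(model_path, n_ctx=4096, kv_type=None, binary="./llama-cli"):
    """Build an argv list for a CPU-only run.

    `binary` is a placeholder path; flag names are assumed to match
    upstream llama.cpp and may differ in the fork.
    """
    # -ngl 0 keeps every layer on the CPU (no GPU offload).
    args = [binary, "-m", model_path, "-ngl", "0", "-c", str(n_ctx)]
    if kv_type is not None:
        # Request the same cache type for both K and V.
        args += ["-ctk", kv_type, "-ctv", kv_type]
    return args


# Hypothetical model filename, for illustration only.
print(" ".join(cpu_only_args("bonsai-8b.gguf", n_ctx=8192, kv_type="q8_0")))
```

The list form can be passed directly to `subprocess.run` once the real binary and model paths are known.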

## Expected benefits

| Metric | GPU mode (current) | CPU-only (proposed) |
|--------|--------------------|---------------------|
| RAM used (est.) | 2500 MB | ~1400-1500 MB |
| RAM free (est.) | 980 MB | ~2400 MB |
| Speed | 1.1 tok/s | ~0.4-0.5 tok/s (Cortex-A57 is slower than the RPi 4's A72) |
| KV Q8_0 | SEGFAULT (#2) | Should work |
| Max context | 4096 (tight) | 8K+ possible |

## Why it matters

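- About 1-1.4 GB more free RAM (per the estimates above) for system stability
- KV cache Q8_0 should work, since the crash in #2 has only been seen with the CUDA kernels
- Context could be doubled or more
- Trade-off: roughly 2-3x slower generation (1.1 tok/s vs. an estimated 0.4-0.5 tok/s)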
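
To make the link between Q8_0 and a larger context concrete, here is a minimal sketch of the usual KV cache size estimate. The layer count, KV head count, and head dimension in the example are placeholders, not confirmed values for Bonsai-8B; the only fixed input is the ggml Q8_0 layout, which stores each block of 32 values as 32 int8 plus one fp16 scale (34 bytes).

```python
# Average bytes per cached element for each cache type.
BYTES_PER_ELEMENT = {
    "f16": 2.0,
    "q8_0": 34 / 32,  # 32 int8 values + one fp16 scale per block of 32
}


def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim, cache_type="f16"):
    """Estimate KV cache size: K and V tensors for every layer and position."""
    elements = 2 * n_layers * n_ctx * n_kv_heads * head_dim
    return int(elements * BYTES_PER_ELEMENT[cache_type])


# Placeholder architecture values, for illustration only.
arch = dict(n_layers=36, n_kv_heads=8, head_dim=128)
for n_ctx, cache_type in [(4096, "f16"), (4096, "q8_0"), (8192, "q8_0")]:
    mib = kv_cache_bytes(n_ctx, cache_type=cache_type, **arch) / 2**20
    print(f"ctx={n_ctx} {cache_type}: {mib:.0f} MiB")
```

Because Q8_0 takes 34/64 of the F16 footprint, an 8K context at Q8_0 costs only slightly more than a 4K context at F16, whatever the real model dimensions turn out to be.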

## Questions to investigate

1. Does the PrismML fork compile CPU-only on Jetson Nano (Ubuntu 18.04, GCC 8)?
   - GCC 8 supports most of C++17, but code using `std::filesystem` must link with `-lstdc++fs`
   - The NEON patch from llamita.cpp may still be needed for GCC 8
   - Need to verify which patches are CUDA-specific and which are GCC 8-specific

2. Actual memory usage vs GPU mode

3. Actual speed on Cortex-A57 (slower than A72 on RPi)

4. Does KV Q8_0 work in CPU mode on Jetson?
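
For questions 2 and 3, one possible way to collect comparable numbers is to snapshot `/proc/meminfo` before loading the model and again during generation, and to time a fixed number of generated tokens. The helpers below are a sketch under that assumption; the function names are illustrative and nothing here is specific to the PrismML fork.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values in MB.

    Only entries reported in kB are kept; unitless counters are skipped.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and len(fields) == 2 and fields[1] == "kB" and fields[0].isdigit():
            values[key.strip()] = int(fields[0]) / 1024
    return values


def ram_used_mb(before, after):
    """RAM taken by the run, measured as the drop in MemAvailable."""
    return before["MemAvailable"] - after["MemAvailable"]


def tokens_per_second(n_tokens, seconds):
    """Generation speed from a token count and wall-clock duration."""
    return n_tokens / seconds


# Usage on the device (not executed here):
#   with open("/proc/meminfo") as f:
#       before = parse_meminfo(f.read())
#   ... start the model, wait until it is generating, then snapshot again ...
```

Taking both snapshots the same way in GPU mode and in CPU-only mode keeps the comparison consistent, even if the absolute numbers include unrelated system activity.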

## Related

- #2 KV cache Q8_0 crashes on CUDA
- Raspberry Pi 4 benchmarks in BUILD-RASPBERRY-PI.md show 1433 MB for the same model and context
