1 link tagged with all of: llm + quantization + model-compression + edge-ai + benchmarks
Links
PrismML’s Bonsai 8B is a large language model trained from scratch with 1-bit weights, squeezing 8.2 billion parameters into just 1.15 GB. In benchmarks it matches or outperforms FP16 models like Llama 3.1 and runs at real-time speeds on phones, shifting the size-performance trade-off.
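A quick sanity check on the size claim: 8.2 billion 1-bit weights alone come to about 1.03 GB, so the reported 1.15 GB leaves roughly 0.12 GB of headroom. A minimal sketch of that arithmetic, assuming (this is not stated in the announcement) that the gap is taken up by things like quantization scale factors and higher-precision embeddings:

```python
# Back-of-envelope footprint check for the 1-bit weight claim.
params = 8.2e9      # 8.2 billion parameters, per the announcement
weight_bits = 1     # 1-bit weights as described

# bits -> bytes -> decimal GB
weight_gb = params * weight_bits / 8 / 1e9
print(f"raw 1-bit weights: {weight_gb:.3f} GB")

# Overhead implied by the reported 1.15 GB total; plausibly scale
# factors and embeddings kept at higher precision (an assumption).
overhead_gb = 1.15 - weight_gb
print(f"implied overhead: {overhead_gb:.3f} GB")
```

The numbers line up, which suggests the 1.15 GB figure really is close to pure 1-bit storage rather than a mixed-precision checkpoint.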