1 link tagged with all of: benchmarks + llm + edge-ai + model-compression + quantization
Links
PrismML’s Bonsai 8B is a large language model trained from scratch with 1-bit weights, squeezing 8.2 billion parameters into just 1.15 GB. In benchmarks it matches or outperforms FP16 models such as Llama 3.1, and it runs at real-time speeds on phones, shifting the size-performance trade-off for on-device LLMs.
llm ✓
quantization ✓
edge-ai ✓
model-compression ✓
benchmarks ✓
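A quick back-of-envelope check on the headline numbers (a rough sketch; the 8.2 B parameter count and 1.15 GB file size come from the link above, while treating the remainder as overhead for scales, embeddings, etc. is my own assumption):

```python
# Sanity check: 8.2B parameters at 1 bit per weight.
params = 8.2e9
weight_gb = params / 8 / 1e9          # bits -> bytes -> decimal GB
overhead_gb = 1.15 - weight_gb        # whatever else ships in the file (assumption)

print(f"1-bit weights alone: {weight_gb:.3f} GB")
print(f"implied overhead:    {overhead_gb:.3f} GB")
```

The 1-bit weights alone account for roughly 1.0 GB, so the quoted 1.15 GB leaves a bit over 0.1 GB for everything that typically isn't binarized.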