quantization

2 links tagged with quantization

Links

I Tested the 1-Bit LLM That Fits in 1 GB — It Shouldn't Be This Good | by Chew Loong Nian | in Level Up Coding

PrismML’s Bonsai 8B trains a large language model with 1-bit weights from scratch, squeezing 8.2 billion parameters into just 1.15 GB. In benchmarks it ties or outperforms FP16 models like Llama 3.1 and runs at real-time speeds on phones, shifting the size-performance trade-off.
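The size figures in this summary can be sanity-checked with simple arithmetic. The sketch below is not taken from the article: the helper names are hypothetical, and it assumes "GB" means decimal gigabytes (10^9 bytes) and that only the weights are counted. Under those assumptions, 8.2 billion weights at exactly 1 bit each would take about 1.03 GB, so the quoted 1.15 GB corresponds to roughly 1.12 bits per parameter on average. The gap may reflect extra stored data such as scale factors or higher-precision layers, though the summary does not say.

```python
def weight_storage_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate storage for weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def effective_bits_per_weight(num_params: float, size_gb: float) -> float:
    """Average bits per parameter implied by a file size in decimal gigabytes."""
    return size_gb * 1e9 * 8 / num_params


PARAMS = 8.2e9  # parameter count quoted in the summary

fp16_gb = weight_storage_gb(PARAMS, 16)                  # same model stored at 16 bits per weight
one_bit_gb = weight_storage_gb(PARAMS, 1)                # idealized 1 bit per weight, no overhead
implied_bits = effective_bits_per_weight(PARAMS, 1.15)   # bits per weight implied by 1.15 GB

print(f"FP16 weights:        {fp16_gb:.2f} GB")
print(f"Pure 1-bit weights:  {one_bit_gb:.3f} GB")
print(f"Implied bits/weight: {implied_bits:.2f}")
```

By the same estimate, storing the same parameter count at 16 bits per weight would need about 16.4 GB, which is the scale of saving the summary is pointing at.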

Saved by mark · Last saved April 25, 2026 · 6 min read

Tags: llm, quantization, edge-ai, model-compression, benchmarks

Letter 109: All About Local LLMs

This article walks through why and how to run large language models locally, covering privacy, cost, offline access, and control. It breaks down hardware needs, quantization, PC versus Mac setups, and starter software to get models up and running.

Saved by mark · Last saved April 22, 2026 · 7 min read

Tags: local-llms, hardware, quantization, mac-vs-pc, privacy