The Hidden Speed Lever in LM Studio: MTP Is Doing More Than You Think
You squeezed your 35B model onto 8GB of VRAM. You disabled mmap, locked the model in RAM, and quantized your KV cache.…
Just a guy trying to play God without permission.
Tag
3 posts
You squeezed your 35B model onto 8GB of VRAM. You disabled mmap, locked the model in RAM, and quantized your KV cache.…
A few months ago, during my usual free-time gaming session, I was scrolling through some old chat histories. It hit me—damn,…
The common wisdom says that if you want to run a 35-billion-parameter model, you need a desktop with a massive, expensive GPU.…