The Hidden Speed Lever in LM Studio: MTP Is Doing More Than You Think
You squeezed your 35B model onto 8GB of VRAM. You disabled mmap, locked the model in RAM, and quantized your KV cache.…
Just a guy trying to play God without permission.
A few months ago, during my usual free-time gaming session, I was scrolling through some old chat histories. It hit me—damn,…