The Hidden Speed Lever in LM Studio: MTP Is Doing More Than You Think
You squeezed your 35B model onto 8GB of VRAM. You disabled mmap, locked the model in RAM, and quantized your KV cache.…
Just a guy trying to play God without permission.