When I tried running a 123B-parameter model in dedicated mode, the suggested hardware configuration was, understandably, quite powerful. To limit costs, it would be nice to have a way to run quantized models, preferably through GGUF files (including imatrix-quantized GGUF variants).