Llama 3, GPT, ML-Ops, Ray, MLflow, LoRA, AWQ, GPTQ, LLMOps, Deployment, Generative AI, LLMs, Flash Paged Attention, Cost