Production LLM Deployment: vLLM, FastAPI, Modal, and AI Chatbot

Production-Grade LLM Deployment and High-Load Inference with vLLM, Chatbots with Memory, and a Local Cache of Model Weights

Instructor: Petar Petkanov

Duration: 5h 28m

Students: 278

Rating: 4.1 (19)