Archive: 2024/11 | 爱敲代码の鱼儿

Never really desperate, only the lost of the soul.

爱敲代码の鱼儿 - Blog

2024

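ServerlessLLM: Low-Latency Serverless Inference for Large Language Models

Background: When LLMs are deployed with Serverless, users need a wide variety of models (gpt-4o, openai-1o, Longchat-lite), and keeping all of these models stored locally involves a huge amount of GPU...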

2024-11-12 Research