- MiniCPM-V 2.6 Deployment Tutorial
- vllm
- vllm python code inference
- 1.1 First go to Hugging Face to download the model weights:
- Or: you can also download the quantized AWQ model, which is twice as fast and requires only 7 GB of GPU memory.
- Simple comparison between fp16 and awq using vllm (4090 single gpu)
- 1.2 pip install vllm
- 1.3 Create python code to inference by vllm
- 1.4 Describe the video
- vllm api sever
- llama.cpp inference
- 2. Get the llama.cpp branch of openbmb:
- 3. Install the required packages
- 4. Get the gguf weight of MiniCPM-V 2.6.
- 5. Get the gguf weight of MiniCPM-V 2.6.
- 6. Start inference:
- 6.1 Picture reasoning instructions
- 6.2 Video reasoning instructions
- 6.3 Inference parameter description
- Ollama
- 1. Follow the llama.cpp tutorial above to obtain the gguf model. The quantized language model is recommended.
- 2. Install the required packages
- 3. Get the official ollama branch of openbmb:
- 4. Environment requirements:
- 5. Install large model dependencies:
- 6. Compile ollama
- 7. Once compilation succeeds, start ollama from the ollama root directory:
- 8. Create a Modelfile:
- 9. Create ollama model Instance:
- 10. Run ollama model instance:
- 11. Input the question and image URL separated by a space.
- FAQ
- 1. Q: I get OOM when initializing the model (vllm)
- 2. Q: No available memory for the cache blocks. Try increasing `gpu_memory_utilization` when initializing the engine.(vllm)
- 3. [rank0]: torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.00 GiB. GPU 0 has a total capacity of 23.64 GiB of which 769.69 MiB is free. Including non-PyTorch memory, this process has 22.88 GiB memory in use.
MiniCPM-V 2.6 Deployment Tutorial
Last modified: August 20, 2024
This document is a deployment tutorial for MiniCPM-V 2.6, covering several inference methods and solutions to problems you may encounter. Key points:
1. Model and reference links: the model address, GitHub repository, and Bilibili tutorial video are provided; the tutorial is suitable for people who can modify parameters in Python scripts and use basic Bash.
2. vllm inference: explains how to download the model weights, compares fp16 and AWQ in terms of speed, time, and memory usage, and covers Python code inference, video description, and the vllm API server.
3. llama.cpp inference: the device needs a certain amount of memory; explains how to get the branch, install the packages, obtain the weights, and run inference, and describes the meaning of each inference parameter.
4. Ollama: the device needs a certain amount of memory; covers getting the branch, installing dependencies, compiling, and creating and running a model instance.
5. FAQ: gives solutions for out-of-memory errors during model initialization, abnormal tags in the output, and installation/compilation errors.
Bilibili Accompanying Video: https://www.bilibili.com/video/BV1sM4m1172r/?vd_source=cd29f4e20ef69babd26f4f34cc7c8b3f
Suitable for Individuals: those who can modify simple parameters in Python scripts and use basic Bash.
vllm
vllm python code inference
1.1 First go to Hugging Face to download the model weights:
git clone https://huggingface.co/openbmb/MiniCPM-V-2_6
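Note that a plain git clone needs git-lfs to actually pull the large weight files. If git-lfs is not available, a minimal alternative (not part of the original tutorial) is to fetch the repository with the huggingface_hub Python package; the local_dir below is just an example path:

# Sketch: download the weights with huggingface_hub instead of git clone.
# Assumes: pip install huggingface_hub
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="openbmb/MiniCPM-V-2_6",  # same repository as the git clone above
    local_dir="MiniCPM-V-2_6",        # example target directory
)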
Or: you can also download the quantized AWQ model, which is twice as fast and requires only 7 GB of GPU memory.
# Download the AWQ int4 weights from ModelScope
git clone https://www.modelscope.cn/models/linglingdan/MiniCPM-V_2_6_awq_int4
# Install the AutoAWQ fork in editable mode
git clone https://github.com/LDLINGLINGLING/AutoAWQ.git
cd AutoAWQ
pip install -e .
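As a quick sanity check that the AWQ weights load, here is a minimal text-only sketch with vLLM (it assumes vllm is already installed as in step 1.2; the model path, memory fraction, and sampling settings are placeholders, and the full multimodal inference script is given in section 1.3):

# Sketch: text-only smoke test of the AWQ checkpoint in vLLM.
# Assumes: pip install vllm, and the model path points at the cloned AWQ weights.
from vllm import LLM, SamplingParams

llm = LLM(
    model="MiniCPM-V_2_6_awq_int4",   # local path from the git clone above
    trust_remote_code=True,
    quantization="awq",               # the weights are AWQ int4
    max_model_len=2048,
    gpu_memory_utilization=0.8,
)
outputs = llm.generate(
    ["Briefly introduce yourself."],
    SamplingParams(temperature=0.7, max_tokens=64),
)
print(outputs[0].outputs[0].text)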