Getting Vicuna Weights

Vicuna is an open-source chatbot trained by fine-tuning LLaMA on ShareGPT data.

Please note that, to comply with the LLaMA license, the official Vicuna weights are released as delta weights that must be applied to the original LLaMA weights. Users are responsible for applying these delta weights themselves.

In this tutorial, we will show how to apply the delta weights to LLaMA weights to get Vicuna weights.

Install FastChat

FastChat offers convenient utilities for applying the delta weights to the LLaMA weights. You can install it with pip:

pip install fschat
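
To check that the installation succeeded, a quick sanity check is to import the package (if the command exits without errors, FastChat is installed):

python3 -c "import fastchat"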

Download HuggingFace LLaMA Weights

The HuggingFace LLaMA weights are hosted using Git-LFS. Therefore, it is necessary to install Git-LFS first (you can skip this step if git-lfs is already installed).

conda install git-lfs
git lfs install
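
You can confirm that Git-LFS is available by printing its version:

git lfs version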

Then download both the LLaMA weights and the Vicuna delta weights:

git clone https://huggingface.co/decapoda-research/llama-7b-hf
git clone https://huggingface.co/lmsys/vicuna-7b-delta-v1.1
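
Cloning may take a while because the model files are several gigabytes. To make sure the large files were actually downloaded rather than left as Git-LFS pointer files, you can check the on-disk size of the two folders:

du -sh llama-7b-hf vicuna-7b-delta-v1.1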

There is a naming mismatch between the LLaMA weights and the Vicuna delta weights: the LLaMA config uses the architecture name LLaMAForCausalLM, while the delta weights expect LlamaForCausalLM. Run the following command to fix the “config.json” file of the LLaMA weights:

sed -i 's/LLaMAForCausalLM/LlamaForCausalLM/g' llama-7b-hf/config.json
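
To confirm the substitution took effect, you can grep for the corrected architecture name (the command prints the matching line if the fix was applied):

grep LlamaForCausalLM llama-7b-hf/config.json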

Then use FastChat to apply the delta weights to the LLaMA weights:

python3 -m fastchat.model.apply_delta \
    --base-model-path llama-7b-hf \
    --target-model-path vicuna-7b-v1.1 \
    --delta-path vicuna-7b-delta-v1.1
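
After the merge finishes, you can optionally check that the resulting weights load correctly. The sketch below only loads the config and tokenizer (so it avoids loading the full 7B model into memory) and assumes the transformers package is available, which FastChat installs as a dependency:

python3 -c "from transformers import AutoConfig, AutoTokenizer; AutoConfig.from_pretrained('vicuna-7b-v1.1'); AutoTokenizer.from_pretrained('vicuna-7b-v1.1', use_fast=False); print('Vicuna weights load correctly')"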

You will get the Vicuna weights in the vicuna-7b-v1.1 folder, which can be used as input to MLC-LLM for further model compilation.

(Optional) Move Vicuna Weights to dist folder

The default model path of MLC-LLM is the dist folder, so it is recommended to move the Vicuna weights there:

mkdir -p dist/models
mv vicuna-7b-v1.1 dist/models/vicuna-7b-v1.1
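
You can then verify the final layout by listing the folder:

ls dist/models/vicuna-7b-v1.1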