The total process of setting up Dolly can take a while. You’ll need a good internet connection and around 50GB of free hard drive space.
Install Nvidia CUDA Toolkit
You’ll need to install the CUDA Toolkit to take advantage of your GPU; running the model on a GPU is much faster than running it on the CPU alone.
https://developer.nvidia.com/cuda-downloads
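Once the toolkit is installed, you can confirm it is on your PATH from a command prompt (nvcc ships with the CUDA Toolkit):
nvcc --version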
Install Git
Install Git from the following site.
https://git-scm.com/downloads
Download Dolly
Download Dolly with Git. The repository stores the model weights with Git LFS, so set that up first.
git lfs install
git clone https://huggingface.co/databricks/dolly-v2-12b
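If the clone completes but the weight files show up as small text pointer files (a common Git LFS pitfall), you can fetch the real files from inside the cloned folder:
cd dolly-v2-12b
git lfs pull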
Install Python
We’ll also need Python installed if it is not already.
https://www.python.org/downloads/release/
Next, we’ll need the following packages installed:
py.exe -m pip install numpy
py.exe -m pip install "accelerate>=0.12.0" "transformers[torch]==4.25.1"
(The quotes stop the Windows shell from treating >= as a file redirect.)
py.exe -m pip install numpy --pre torch --force-reinstall --index-url https://download.pytorch.org/whl/nightly/cu117 --user
The last command installs a CUDA-enabled (cu117) nightly build of PyTorch, which is needed to get Dolly to utilize a GPU.
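To check that the CUDA build installed correctly, you can run this quick check in a Python console; it should print True if PyTorch can see your GPU:
import torch
print(torch.cuda.is_available())      # True means PyTorch can use the GPU
print(torch.cuda.get_device_name(0))  # prints your GPU's name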
Run Dolly
Run a Python console. If you run it as administrator, it should be faster.
py.exe
Run the following commands to set up Dolly.
import torch
from transformers import pipeline

generate_text = pipeline(model="databricks/dolly-v2-3b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")

# Or, to use the full model, run:
generate_text = pipeline(model="databricks/dolly-v2-12b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")
Note: if you have issues, you may need to specify an offload folder with offload_folder=".\offloadfolder". An SSD is preferable.
Also, if you have plenty of RAM, you can leave out torch_dtype=torch.bfloat16.
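As a minimal sketch, here is the pipeline call with both of those adjustments applied. The folder name .\offloadfolder is just a placeholder, and passing it via model_kwargs is one way to hand it through to the underlying from_pretrained call; dropping torch_dtype loads the weights in full float32 precision, which needs roughly twice the memory:
generate_text = pipeline(
    model="databricks/dolly-v2-3b",
    trust_remote_code=True,  # torch_dtype omitted: loads in float32 (uses more RAM)
    device_map="auto",
    model_kwargs={"offload_folder": r".\offloadfolder"},  # placeholder path; use a folder on an SSD
)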
Alternatively, if we don’t want to use trust_remote_code, we can run the following instead. This requires saving instruct_pipeline.py from the Dolly repo alongside where you run Python, so the import below resolves.
from instruct_pipeline import InstructionTextGenerationPipeline
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-3b", padding_side="left")
model = AutoModelForCausalLM.from_pretrained("databricks/dolly-v2-3b", device_map="auto")
generate_text = InstructionTextGenerationPipeline(model=model, tokenizer=tokenizer)
Now we can ask Dolly a question.
generate_text("Your question?")
Example:
>>> generate_text("Tell me about Databricks dolly-v2-3b?") 'Dolly is the fully managed open-source engine that allows you to rapidly build, test, and deploy machine learning models, all on your own infrastructure.'
Further information is available at the following two links.
https://github.com/databrickslabs/dolly
https://huggingface.co/databricks/dolly-v2-3b