Accelerate and bitsandbytes need to be installed, but I already did

I'm trying to load a model with 8-bit quantization, like this:

from transformers import LlamaForCausalLM, BitsAndBytesConfig

model_path = '/model/'
model = LlamaForCausalLM.from_pretrained(
    model_path,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)

but I get this error:

ImportError: Using `load_in_8bit=True` requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes: `pip install -i https://test.pypi.org/simple/ bitsandbytes` or `pip install bitsandbytes`

But I've installed both and still get the same error, even after shutting down and restarting the Jupyter kernel I was running this in.
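A quick sanity check, run inside the same Jupyter kernel that raises the error, is to confirm which Python the kernel uses and whether both packages import there (an environment mismatch between the kernel and the pip used for installation is a common cause of this error):

import sys
print(sys.executable)  # should point into the venv where pip installed the packages

import accelerate
import bitsandbytes
print(accelerate.__version__, bitsandbytes.__version__)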

Alagez answered 17/8, 2023 at 18:32 Comment(6)
Able to resolve this? – Hervey
I made another virtual environment with Python 3.11 (the one I was getting these errors with is 3.10), but I get different errors, pertaining to a needed GPU (but not a Mac GPU, which I have). So I don't know. – Alagez
But I still have the venv these errors happened in, so if anyone has ideas to try, it would be good to know in any case. – Alagez
I get this issue in both a 3.11 environment and a 3.10 environment, on macOS. Downgrading to transformers==4.30 (from the default version 4.33) in the Python 3.10 env seems to fix it for now. – Coronel
Does this not happen on other OSes? – Alagez
I ran on macOS 14 with Python 3.11 and it shows "NO GPU found". A GPU is needed for quantization. – Rubinrubina

I downgraded the transformers library to version 4.30 using the following command:

pip install transformers==4.30
Crush answered 25/8, 2023 at 10:59 Comment(2)
Hi, Niels here from the Transformers team. Downgrading your Transformers version is actually not the right solution; the error typically happens when you are running load_in_4bit=True or load_in_8bit=True on CPU. We now return a better error message stating that no GPU was found. – Trajectory
So the solution is to just not use it. – Alagez

I am on an M1 Mac and also have similar problems. After installing accelerate and bitsandbytes I still get:

ImportError: Using load_in_8bit=True requires Accelerate: pip install accelerate and the latest version of bitsandbytes pip install -i https://test.pypi.org/simple/ bitsandbytes or pip install bitsandbytes

I looked around a bit in the Transformers source code and found a function called is_bitsandbytes_available(), which only returns True if bitsandbytes is installed and torch.cuda.is_available() returns True, which is not the case on an Apple Silicon machine. Also, https://github.com/TimDettmers/bitsandbytes/issues/252 confirms that there is no MPS (Metal) support in bitsandbytes. I don't know why Hugging Face's documentation doesn't mention (unless I've missed it somewhere) that their examples won't work on Apple Silicon.
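You can reproduce roughly the same check yourself (a sketch approximating the library's logic, not its exact implementation):

import importlib.util
import torch

# transformers treats bitsandbytes as usable only if both of these hold:
print("bitsandbytes installed:", importlib.util.find_spec("bitsandbytes") is not None)
print("torch.cuda.is_available():", torch.cuda.is_available())  # False on Apple Silicon
print("torch.backends.mps.is_available():", torch.backends.mps.is_available())  # True on M1, but not what is checked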

Sheathe answered 24/11, 2023 at 14:17 Comment(3)
I am trying to fine-tune a LLaMA model (or any LLM for that matter). Do you know which one I could use with my Mac M1? Thanks in advance! – Lifton
@EshitaShukla I'm sorry, but that is beyond my current level of ML knowledge (though it's something I'm also curious about). – Sheathe
Hi, load_in_8bit=True is only supported on GPUs which PyTorch supports (NVIDIA and AMD at the moment). – Trajectory

I tried everything above and nothing worked. After doing some research: change the runtime to GPU (in Colab: Runtime > Change runtime type > GPU) and it will work. It worked for me.

Thank me later!
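To verify the GPU runtime actually took effect, a quick check in a Colab cell (a sketch):

!nvidia-smi  # should list the attached GPU on a GPU runtime

import torch
print(torch.cuda.is_available())  # should now print True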

Crosshead answered 24/11, 2023 at 20:34 Comment(1)
Jesus Christ! Yes, this worked on Google Colab. – Apyretic

I had this same issue. I was running Falcon-7B in Colab to fine-tune it. First, using transformers==4.30 as mentioned above solved the issue.

But that is for running on CPU: change the environment to GPU and the issue will go away anyway. That is, Colab's CPU and GPU runtimes use different transformers versions, and on the GPU runtime there is no need to downgrade during pip install.

Manifesto answered 10/10, 2023 at 9:0 Comment(1)
This actually solved the problem for me, and my use case was quite similar, i.e. fine-tuning Falcon-7B with mental health data on Google Colab. Thanks for sharing! – Gaal

As @niels says in his comment:

Hi, Niels here from the Transformers team - downgrading your Transformers version is actually not the right solution, the error typically happens when you are running load_in_4bit=True or load_in_8bit=True on CPU. We now return a better error message stating that no GPU was found.

So for me, when I downgraded transformers, I just got a KeyError relating to the model I was trying to bring in. When I commented out load_in_4bit=True, things worked.

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", device_map="auto" #, load_in_4bit=True
)

I'm on this tutorial and hit a problem right after this line, but the model object appears to be fine.

Update: got everything to work on CPU. Just needed to remove the .to("cuda") calls from that tutorial.
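For reference, a minimal CPU-only sketch of that load-and-generate step (the prompt and generation settings here are illustrative assumptions):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", device_map="auto"  # no load_in_4bit, no .to("cuda")
)

inputs = tokenizer("Hello, my name is", return_tensors="pt")  # tensors stay on CPU
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))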

Monocular answered 27/2 at 18:29 Comment(0)

If none of the above works, I found that reinstalling bitsandbytes worked:

pip install bitsandbytes

I'm really not sure why, since the error message asks for Accelerate instead:

requires Accelerate: pip install accelerate
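If pip only reports "Requirement already satisfied", forcing the reinstall is one way to make sure the wheel actually gets replaced (a sketch):

pip install --force-reinstall bitsandbytes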

Glossectomy answered 27/3 at 18:13 Comment(0)

You need to configure accelerate after installation. Please see this guide: https://huggingface.co/docs/accelerate/basic_tutorials/install#configuring--accelerate

In short:

python -c "from accelerate.utils import write_basic_config; write_basic_config(mixed_precision='fp16')"
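For context, write_basic_config writes a default_config.yaml under ~/.cache/huggingface/accelerate (assuming the default cache location); you can then inspect the resulting setup with:

accelerate env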
Hildegardehildesheim answered 23/11, 2023 at 7:35 Comment(2)
I tried this from inside a notebook, using !python -c ... to execute it as a shell command. I also tried running the code within the double quotes (from accelerate.utils ...) directly in the notebook, and got the same error as before. – Tymes
Did not work for me. – Colemancolemanite

In my case (I'm currently fine-tuning a model with the Mistral LLM):

1. Check that a GPU is available: torch.cuda.is_available()
2. Check that your CUDA version is compatible with your torch version.
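A quick way to run both checks (a sketch; torch.version.cuda reports the CUDA version this PyTorch build targets):

import torch

print(torch.__version__)          # e.g. 2.1.0+cu118
print(torch.version.cuda)         # None on CPU-only builds
print(torch.cuda.is_available())  # must be True for load_in_8bit / load_in_4bit
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))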

Viviparous answered 17/3 at 11:39 Comment(0)
