How to use a transformers pipeline with multiple GPUs?
ner_model = pipeline('ner', model=model, tokenizer=tokenizer, device=0, grouped_entities=True)

The device=0 argument tells the pipeline to use GPU 0 only (a single GPU). Please show me how to use multiple GPUs.

Asch answered 4/10, 2020 at 9:34 Comment(1)
There is a Pipeline._config.device option; I think that is a good starting point for investigation. There is also an open issue about global configuration for Pipeline settings in the trankit repo. One workaround: create two pipelines, one per GPU, and assign different tasks to them, as sketched below.Fusil
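
A minimal sketch of that two-pipeline workaround, assuming two visible GPUs; the checkpoint dslim/bert-base-NER and the input texts are placeholders for illustration, and the thread pool is only there so the two pipelines actually run concurrently rather than one after the other:

from concurrent.futures import ThreadPoolExecutor
from transformers import pipeline

texts = ["Angela Merkel visited Paris.", "Tim Cook spoke in Cupertino."]  # example inputs

# One pipeline per GPU; device=N pins a pipeline to GPU N.
ner_gpu0 = pipeline("ner", model="dslim/bert-base-NER", device=0, grouped_entities=True)
ner_gpu1 = pipeline("ner", model="dslim/bert-base-NER", device=1, grouped_entities=True)

# Split the workload and run the two halves in parallel threads.
half = len(texts) // 2
with ThreadPoolExecutor(max_workers=2) as pool:
    future0 = pool.submit(ner_gpu0, texts[:half])
    future1 = pool.submit(ner_gpu1, texts[half:])
    results = future0.result() + future1.result()

Note this is data parallelism with a full copy of the model on each GPU, not model sharding; each pipeline must fit on its own card.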

There is an argument called device_map for the pipelines in the transformers library; see here. It comes from the accelerate module; see here. You can specify a custom dispatch of model weights to devices, or have one inferred automatically with device_map="auto". You may eventually need additional configuration for the tokenizer, but the call should look like this:

ner_model = pipeline('ner', model=model, tokenizer=tokenizer, device_map="auto", grouped_entities=True)
Obidiah answered 20/3, 2023 at 13:27 Comment(2)
device_map="auto" worked for me while loading a model on multiple gpus. However when I do the inference, the input is unable to fit on the gpu 0. I can see my gpu 3 have space left. how do i specify the target gpu to store the input while doing the inference?Paymaster
Note: when a model is too big for one GPU, you can set device_map="auto" to automatically determine how to load and store the model weights across devices. If your model fits comfortably on a single GPU, setting it instead may cause unwanted behavior and slower inference; see the contrast sketched below. Find more at huggingface.co/docs/transformers/en/pipeline_tutorial#deviceMargrettmarguerie
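
To make that distinction concrete, a minimal sketch (the model names gpt2 and bigscience/bloom-7b1 are illustrative stand-ins, not prescriptions): pin a model that fits on one GPU with device, and reach for device_map="auto" only when the weights are too large for a single card:

from transformers import pipeline

# Model fits on one GPU: pin it to a single device for the lowest latency.
small_pipe = pipeline("text-generation", model="gpt2", device=0)

# Model too large for one GPU: let accelerate shard the weights across all
# visible GPUs automatically (requires the accelerate package).
big_pipe = pipeline("text-generation", model="bigscience/bloom-7b1", device_map="auto")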
