ner_model = pipeline('ner', model=model, tokenizer=tokenizer, device=0, grouped_entities=True)
The device argument tells the pipeline to use GPU 0 only (i.e., a single GPU). Please show me how to use multiple GPUs.
There is an argument called device_map for the pipelines in the transformers library; it comes from the accelerate module. You can specify a custom model dispatch, but you can also have it inferred automatically with device_map="auto". You might need additional configuration for the tokenizer, but it should look like this:
ner_model = pipeline('ner', model=model, tokenizer=tokenizer, device_map="auto", grouped_entities=True)
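Note that device_map is resolved when the weights are loaded, so if you build the model object yourself instead of passing a checkpoint name, set it in from_pretrained. A minimal sketch, assuming dslim/bert-base-NER as a stand-in checkpoint and that the accelerate package is installed:

from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

model_name = "dslim/bert-base-NER"  # stand-in checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)

# device_map="auto" lets accelerate shard the weights across all visible GPUs
# (spilling to CPU/disk if they do not fit on the GPUs alone).
model = AutoModelForTokenClassification.from_pretrained(model_name, device_map="auto")

# No device argument here: the pipeline picks up the dispatch from the model.
ner_model = pipeline('ner', model=model, tokenizer=tokenizer, grouped_entities=True)
print(ner_model("Hugging Face is based in New York City."))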
device_map="auto"
to automatically determine how to load and store the model weights. Instead if your model can comfortably be in one GPU, by setting this you may encounter unwanted behavior and inference times can go up. find more at, huggingface.co/docs/transformers/en/pipeline_tutorial#device –
Margrettmarguerie © 2022 - 2024 — McMap. All rights reserved.
There is a Pipeline._config.device option; I think this is a good point to start investigating. There is also an issue about global configuration for Pipeline settings in the trankit repo. Alternatively, create two pipelines for two GPUs and assign different tasks to them, as sketched below. – Fusil
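A minimal sketch of that last suggestion, assuming two visible GPUs and dslim/bert-base-NER as a stand-in checkpoint. This is plain data parallelism (one full copy of the model per GPU, inputs split between them), not model sharding:

from concurrent.futures import ThreadPoolExecutor
from transformers import pipeline

model_name = "dslim/bert-base-NER"  # stand-in checkpoint for illustration

# One independent pipeline per GPU: device=0 and device=1.
pipes = [pipeline('ner', model=model_name, device=i, grouped_entities=True) for i in range(2)]

texts = ["Barack Obama visited Paris.", "Angela Merkel spoke in Berlin."]
shards = [texts[0::2], texts[1::2]]  # interleave the inputs across the two GPUs

# Run both pipelines concurrently; PyTorch releases the GIL during the
# forward pass, so the two GPUs actually work in parallel.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(lambda ps: ps[0](ps[1]), zip(pipes, shards)))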