How does masked_lm_labels argument work in BertForMaskedLM?
from transformers import BertTokenizer, BertForMaskedLM
import torch

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForMaskedLM.from_pretrained('bert-base-uncased')

input_ids = torch.tensor(tokenizer.encode("Hello, my dog is cute", add_special_tokens=True)).unsqueeze(0) # Batch size 1
outputs = model(input_ids, masked_lm_labels=input_ids)

loss, prediction_scores = outputs[:2] 

This code is from the Hugging Face Transformers documentation: https://huggingface.co/transformers/model_doc/bert.html#bertformaskedlm

I cannot understand the masked_lm_labels=input_ids argument in the model call. How does it work? Does it mean that the model will automatically mask some of the text when input_ids is passed?

Ere answered 28/4, 2020 at 0:34

The first argument is the masked input; the masked_lm_labels argument is the desired output.
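For example (not part of the original answer, just an illustrative sketch), you can mask one token by hand and pass the original ids as the labels. This assumes the same transformers 2.x API as in the question, where the argument is still called masked_lm_labels (in later versions it was renamed to labels):

import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForMaskedLM.from_pretrained('bert-base-uncased')

ids = tokenizer.encode("Hello, my dog is cute", add_special_tokens=True)
labels = torch.tensor([ids])  # desired output: the original token ids
ids[-2] = tokenizer.convert_tokens_to_ids(tokenizer.mask_token)  # mask "cute"
input_ids = torch.tensor([ids])  # masked input actually fed to the model

outputs = model(input_ids, masked_lm_labels=labels)
loss, prediction_scores = outputs[:2]
# Note: here the loss is still computed over every position; see below
# for how to restrict it to the masked positions with -100 labels.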

The input_ids should already be masked: in general, it is up to you how you do the masking. In the original BERT, 15% of the tokens are chosen, and each of them is handled in one of the following ways:

  • Replace it with the [MASK] token (80% of the time); or
  • Replace it with a random token (10% of the time); or
  • Keep the original token unchanged (10% of the time).

This modifies the input, so you need to tell the model what the original, non-masked input was; that is the masked_lm_labels argument. Note also that you want to compute the loss only for the tokens that were actually chosen for masking: in the labels, every other position should be set to the index -100, which the loss function ignores. A sketch of the whole recipe is below.
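As an illustration (again not from the original answer), here is a minimal sketch of that recipe, adapted from the masking logic in Hugging Face's DataCollatorForLanguageModeling and assuming the transformers 2.x call signature with masked_lm_labels:

import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForMaskedLM.from_pretrained('bert-base-uncased')

inputs = torch.tensor(
    tokenizer.encode("Hello, my dog is cute", add_special_tokens=True)
).unsqueeze(0)
labels = inputs.clone()

# Pick 15% of the tokens for prediction, never the special tokens.
probability_matrix = torch.full(labels.shape, 0.15)
special_tokens_mask = torch.tensor(
    tokenizer.get_special_tokens_mask(labels[0].tolist(),
                                      already_has_special_tokens=True),
    dtype=torch.bool).unsqueeze(0)
probability_matrix.masked_fill_(special_tokens_mask, value=0.0)
masked_indices = torch.bernoulli(probability_matrix).bool()

# Compute the loss only on the chosen positions: everything else gets -100.
labels[~masked_indices] = -100

# 80% of the chosen tokens become [MASK].
indices_replaced = (torch.bernoulli(torch.full(labels.shape, 0.8)).bool()
                    & masked_indices)
inputs[indices_replaced] = tokenizer.convert_tokens_to_ids(tokenizer.mask_token)

# Half of the rest (10% overall) become a random token; the remaining
# 10% keep their original token.
indices_random = (torch.bernoulli(torch.full(labels.shape, 0.5)).bool()
                  & masked_indices & ~indices_replaced)
inputs[indices_random] = torch.randint(len(tokenizer), labels.shape,
                                       dtype=torch.long)[indices_random]

outputs = model(inputs, masked_lm_labels=labels)
loss, prediction_scores = outputs[:2]

With the labels set up like this, the loss in outputs[0] reflects only the masked positions.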

For more details, see the documentation.

Cherice answered 28/4, 2020 at 7:36
