Differentiably generate sentences with Huggingface Library for adversarial training (GANs)

I have the following goal, which I have been trying to achieve with the Huggingface Library, but I have encountered some roadblocks.

The Problem:

I want to generate sentences in a differentiable way at training time. Why am I doing this? I want to apply a discriminator to this output in order to generate sentences with certain properties, which are "enforced" by the discriminator. These sentences will also be conditioned on an input sentence, so I need an encoder-decoder model.

To get around the non-differentiability of argmax, I simply take the softmax output of the decoder and multiply it with my embedding matrix. I then feed this embedded input into a transformer discriminator, which classifies the input as original/fake, and backpropagate through the encoder-decoder, just as one would with a normal GAN.
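
Concretely, the soft-embedding trick I mean looks like this in plain PyTorch (a minimal sketch; the function and tensor names are illustrative, not part of the Huggingface API):

```python
import torch
import torch.nn.functional as F

def soft_embed(logits: torch.Tensor, embedding_matrix: torch.Tensor,
               temperature: float = 1.0) -> torch.Tensor:
    """Replace the non-differentiable argmax with a softmax-weighted
    mixture of embeddings, so gradients can flow back to the generator.

    logits: (batch, seq_len, vocab_size) raw decoder outputs
    embedding_matrix: (vocab_size, hidden_dim)
    returns: (batch, seq_len, hidden_dim) "soft" token embeddings
    """
    probs = F.softmax(logits / temperature, dim=-1)
    # Expected embedding under the softmax distribution over the vocabulary
    return probs @ embedding_matrix

# The discriminator then consumes these soft embeddings directly, e.g.:
# d_score = discriminator(inputs_embeds=soft_embed(decoder_logits, emb))
```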

So far I have tried to use the EncoderDecoderModel from Huggingface. This class has a method named generate, which generates sentences in a non-differentiable way (greedy or beam search). So I dug through the source code and tried to build my own differentiable generate method, but I couldn't get it to work.

Questions:

  • Is there a reasonably easy way to do this with the Huggingface Library, since I really want to use the pretrained models and everything else that comes with it?
  • Is there a way to invoke the forward method of the decoder and generate only one new token, rather than recomputing the whole sequence again? (A sketch of the kind of loop I have in mind follows below.)
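
For reference, the single-step decoding I mean looks roughly like this with the transformers forward pass and past_key_values caching (a hedged sketch assuming a BART checkpoint; the checkpoint name, prompt, and loop length are illustrative):

```python
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# Encode the conditioning sentence once and reuse it every step
enc = tok("An input sentence to condition on.", return_tensors="pt")
encoder_outputs = model.get_encoder()(**enc)

decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])
past = None

for _ in range(20):  # generate up to 20 tokens, one forward pass each
    out = model(
        encoder_outputs=encoder_outputs,
        attention_mask=enc["attention_mask"],
        # With a cache, only the newest decoder token needs to be passed
        decoder_input_ids=decoder_input_ids[:, -1:] if past is not None else decoder_input_ids,
        past_key_values=past,
        use_cache=True,
    )
    past = out.past_key_values
    step_logits = out.logits[:, -1, :]  # logits for the new token only
    # argmax here is non-differentiable; for the GAN setup one would keep
    # step_logits and feed soft embeddings back via decoder_inputs_embeds
    next_token = step_logits.argmax(-1, keepdim=True)
    decoder_input_ids = torch.cat([decoder_input_ids, next_token], dim=-1)
```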

Thanks for your help, I would really appreciate it; I have been stuck on this for quite a while now.

Spelling answered 24/6, 2020 at 23:16

As someone who has published on this question: this has been attempted many times, in many different ways, and it doesn't work.

Please see the ICLR paper on the subject, "Language GANs Falling Short": https://openreview.net/pdf?id=BJgza6VtPB

This is not specific to HuggingFace; it follows from the mathematical fact that words are discrete objects. Unlike an image, you can't slightly modify a word to slightly change the meaning of a sentence.

A way to achieve what you are trying to do would be with plug and play language models (PPLM).

https://www.uber.com/blog/pplm/
https://arxiv.org/abs/1912.02164

What they do is train classifiers for some desirable or undesirable attribute, then add (or subtract) the probability the classifier assigns to that attribute to the score of each potential next token (the score is computed for each candidate sentence including that token). That way, you either increase or decrease the probability of generating text with that characteristic.
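
For intuition, here is a minimal sketch of that re-scoring idea (not the full PPLM algorithm, which also perturbs the model's hidden states via classifier gradients). The `attribute_log_probs` input is a hypothetical per-token attribute score; how you compute it depends on your classifier:

```python
import torch
import torch.nn.functional as F

def rescore_next_token(lm_logits: torch.Tensor,
                       attribute_log_probs: torch.Tensor,
                       strength: float = 1.0) -> torch.Tensor:
    """Bias the language model's next-token distribution toward (strength > 0)
    or away from (strength < 0) a classifier-defined attribute.

    lm_logits: (batch, vocab_size) raw LM logits for the next token
    attribute_log_probs: (batch, vocab_size) log p(attribute | context + token)
    returns: (batch, vocab_size) adjusted log-scores to sample or argmax over
    """
    return F.log_softmax(lm_logits, dim=-1) + strength * attribute_log_probs
```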

Controllable text generation is a whole field; I recommend that you do a literature review.

Idalia answered 14/9, 2022 at 23:5
