I want to implement non-negative matrix factorization (NMF) using PyTorch. Here is my initial implementation:
import torch

def nmf(X, k, lr, epochs):
    # X: input matrix of size (m, n)
    # k: number of latent factors
    # lr: learning rate
    # epochs: number of training epochs
    m, n = X.shape
    W = torch.rand(m, k, requires_grad=True)  # initialize W randomly
    H = torch.rand(k, n, requires_grad=True)  # initialize H randomly
    # training loop
    for i in range(epochs):
        # compute reconstruction error (Frobenius norm of the residual)
        loss = torch.norm(X - torch.matmul(W, H), p='fro')
        # compute gradients
        loss.backward()
        # update parameters using the additive (plain gradient descent) rule
        with torch.no_grad():
            W -= lr * W.grad
            H -= lr * H.grad
            W.grad.zero_()
            H.grad.zero_()
        if i % 10 == 0:
            print(f"Epoch {i}: loss = {loss.item()}")
    return W.detach(), H.detach()
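For reference, this is how I call it on a small synthetic matrix (the shapes and hyperparameters below are just placeholders I picked for testing):

# example call on a small synthetic non-negative matrix
X = torch.rand(100, 50)                # (m, n) non-negative data
W, H = nmf(X, k=10, lr=1e-3, epochs=100)
print(W.shape, H.shape)                # torch.Size([100, 10]) torch.Size([10, 50])
print((W < 0).any(), (H < 0).any())    # with the additive rule, entries can become negative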
In this paper, Lee and Seung proposed multiplicative update rules, derived by choosing an adaptive, element-wise learning rate so that the update no longer involves subtraction and therefore cannot produce negative entries. Here is the stats.SE thread where I got the idea. What I don't know is how to implement the multiplicative update rule for W and H in PyTorch, since it requires separating the positive and negative parts of each gradient. Yes, I could implement that by hand (see the sketch below), but I want to leverage torch autograd for it.
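For comparison, this is roughly what I mean by a hand-written version of the multiplicative updates for the Frobenius loss, where the positive and negative parts of the gradient are written out explicitly (the eps constant is my own addition to avoid division by zero; this is just a sketch, not tested code):

# gradient of ||X - WH||_F^2 w.r.t. W is 2*(W @ H @ H.T - X @ H.T)
#   positive part: W @ H @ H.T, negative part: X @ H.T (analogously for H)
# the multiplicative rule scales each entry by (negative part) / (positive part)
eps = 1e-9  # small constant (my addition) to avoid division by zero
with torch.no_grad():
    W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    H *= (W.T @ X) / ((W.T @ W) @ H + eps)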
Any idea how to achieve this while still relying on autograd? Thanks in advance.