I can create a model using pre-built high-level functions like FullyConnected. For example:
import mxnet as mx

X = mx.sym.Variable('data')
P = mx.sym.FullyConnected(data=X, name='fc1', num_hidden=2)
In this way I get a symbolic variable P that depends on the symbolic variable X. In other words, I have a computational graph that can be used to define a model and execute operations such as fit and predict.
Now, I would like to express P through X in a different way. In more detail, instead of using the high-level functionality (like FullyConnected), I would like to specify the relation between P and X "explicitly", using low-level tensor operations (like matrix multiplication) and symbolic variables representing model parameters (like a weight matrix).

For example, to achieve the same as above, I have tried the following:
W = mx.sym.Variable('W')
B = mx.sym.Variable('B')
P = mx.sym.broadcast_plus(mx.sym.dot(X, W), B)
However, the P obtained this way is not equivalent to the P obtained earlier, and I cannot use it in the same way. In particular, as far as I understand, MXNet complains that W and B do not have values (which makes sense).
I have also tried to declare W and B in another way (so that they do have values):
import numpy as np

w = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.array([7.0, 8.0])
W = mx.nd.array(w)
B = mx.nd.array(b)
This does not work either. I guess MXNet complains because it expects symbolic variables but gets NDArrays instead.
So, my question is: how do I build a model using low-level tensor operations (like matrix multiplication) and explicit objects representing model parameters (like weight matrices)?
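For reference, the computation I want P to represent is just an affine map. A minimal NumPy sketch with the values above (the input x is a made-up example, not part of my actual data):

```python
import numpy as np

w = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # weights, shape (3, 2)
b = np.array([7.0, 8.0])                             # bias, shape (2,)

x = np.array([[1.0, 1.0, 1.0]])  # hypothetical input batch, shape (1, 3)
p = np.dot(x, w) + b             # matrix multiply, then broadcast-add the bias
print(p)                         # [[16. 20.]]
```

This is exactly what I am trying to express symbolically with mx.sym.dot and mx.sym.broadcast_plus.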