How to implement a Maclaurin series in Keras?
I am trying to implement an expandable CNN using the Maclaurin series. The basic idea is that the first input node can be decomposed into multiple nodes with different orders and coefficients. Decomposing a single node into multiple ones generates the different non-linear connections produced by the Maclaurin series. Can anyone give me a possible idea of how to expand a CNN with a Maclaurin-series non-linear expansion? Any thoughts?

I cannot quite understand how to decompose an input node into multiple ones with the different non-linear connections generated by the Maclaurin series. As far as I know, the Maclaurin series is a function approximation, but decomposing a node is not quite intuitive to me in terms of implementation. How can I implement decomposing an input node into multiple ones in Python? How can I make this happen easily? Any ideas?
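To illustrate what I mean by decomposing, here is my rough understanding as a NumPy sketch (decompose_node is just a made-up name, not part of my Keras attempt): a single input value gets replaced by the vector of its powers, and each power would then get its own trainable coefficient.

import numpy as np

def decompose_node(x, order):
    # toy sketch only: expand each input value into its powers x^0, x^1, ..., x^order
    return np.stack([x ** p for p in range(order + 1)], axis=-1)

print(decompose_node(np.array([0.5, 2.0]), order=3))
# shape (2, 4): each of the two input values is expanded into 4 power terms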

my attempt:

import tensorflow as tf
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Dropout, Flatten
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical

(train_imgs, train_label), (test_imgs, test_label)= cifar10.load_data()
output_class = np.unique(train_label)
n_class = len(output_class)

nrows_tr, ncols_tr, ndims_tr = train_imgs.shape[1:]
nrows_ts, ncols_ts, ndims_ts = test_imgs.shape[1:]
train_data = train_imgs.reshape(train_imgs.shape[0], nrows_tr, ncols_tr, ndims_tr)

test_data = test_imgs.reshape(test_imgs.shape[0], nrows_ts, ncols_ts, ndims_ts)
input_shape = (nrows_tr, ncols_tr, ndims_tr)
train_data = train_data.astype('float32')
test_data = test_data.astype('float32')
train_data /= 255
test_data /= 255
train_label_one_hot = to_categorical(train_label)
test_label_one_hot = to_categorical(test_label)

def pown(x, n):
    return x ** n

def expandable_cnn(input_shape, output_shape, approx_order):
    inputs=Input(shape=(input_shape))
    x= Dense(input_shape)(inputs)
    y= Dense(output_shape)(x)
    model = Sequential()
    model.add(Conv2D(filters=32, kernel_size=(3,3), padding='same', activation="relu", input_shape=input_shape))
    model.add(Conv2D(filters=32, kernel_size=(3,3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Dropout(0.25))

    model.add(Flatten())
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(0.5))
    for i in range(2, approx_order+1):
        y=add([y, Dense(output_shape)(Activation(lambda x: pown(x, n=i))(x))])
    model.add(Dense(n_class, activation='softmax')(y))
    return model

But when I ran the above model, I got a bunch of compile errors and dimension errors. I assume that my way of building the Taylor/Maclaurin non-linear expansion for the CNN model may not be correct. Also, I am not sure how to represent the weights. How can I make this work? Any possible idea of how to correct my attempt?

desired output:

I am expecting to extend a CNN with a Maclaurin-series non-linear expansion. How can I make the above implementation correct and efficient? Any possible idea or approach?

Bufordbug asked 2/4, 2020 at 0:59 Comment(0)

Interesting question. I have implemented a Keras model that computes the Taylor expansion as you described:

import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Activation, Dense, Input, Lambda, LocallyConnected2D


def taylor_expansion_network(input_dim, max_pow):
    x = Input((input_dim,))

    # 1. Raise input x_i to power p_i for each i in [0, max_pow].
    def raise_power(x, max_pow):
        x_ = x[..., None]  # Shape=(batch_size, input_dim, 1)
        x_ = tf.tile(x_, multiples=[1, 1, max_pow + 1])  # Shape=(batch_size, input_dim, max_pow+1)
        pows = tf.range(0, max_pow + 1, dtype=tf.float32)  # Shape=(max_pow+1,)
        x_p = tf.pow(x_, pows)  # Shape=(batch_size, input_dim, max_pow+1)
        x_p_ = x_p[..., None]  # Shape=(batch_size, input_dim, max_pow+1, 1)
        return x_p_

    x_p_ = Lambda(lambda x: raise_power(x, max_pow))(x)

    # 2. Multiply by alpha coefficients
    h = LocallyConnected2D(filters=1,
                           kernel_size=1,  # This layer is computing a_i * x^{p_i} for each i in [0, max_pow]
                           use_bias=False)(x_p_)  # Shape=(batch_size, input_dim, max_pow+1, 1)

    # 3. Compute s_i for each i in [0, max_pow]
    def cumulative_sum(h):
        h = tf.squeeze(h, axis=-1)  # Shape=(batch_size, input_dim, max_pow+1)
        s = tf.cumsum(h, axis=-1)  # s_i = sum_{j=0}^i h_j. Shape=(batch_size, input_dim, max_pow+1)
        s_ = s[..., None]  # Shape=(batch_size, input_dim, max_pow+1, 1)
        return s_

    s_ = Lambda(cumulative_sum)(h)

    # 4. Compute sum of w_i * s_i over i in [0, max_pow]
    s_ = LocallyConnected2D(filters=1,  # This layer is computing w_i * s_i for each i in [0, max_pow]
                            kernel_size=1,
                            use_bias=False)(s_)  # Shape=(batch_size, input_dim, max_pow+1, 1)
    y = Lambda(lambda s_: tf.reduce_sum(tf.squeeze(s_, axis=-1), axis=-1))(s_)  # Shape=(batch_size, input_dim)

    # Return Taylor expansion model
    model = Model(inputs=x, outputs=y)
    model.summary()
    return model

The implementation applies the same Taylor expansion to each element of the flattened tensor with shape (batch_size, input_dim=512) coming from the convolutional network.
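For example, a quick shape sanity check could look like the following (the variable names and the dummy batch are just for illustration):

import numpy as np

# the model maps a (batch_size, 512) tensor to a (batch_size, 512) tensor,
# expanding each of the 512 elements up to power max_pow
taylor_model = taylor_expansion_network(input_dim=512, max_pow=3)
dummy = np.random.rand(4, 512).astype('float32')
out = taylor_model.predict(dummy)
print(out.shape)  # expected: (4, 512)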


UPDATE: As we discussed in the comments section, here is some code to show how your function expandable_cnn could be modified to integrate the model defined above:

def expandable_cnn(input_shape, nclass, approx_order):
    inputs = Input(shape=(input_shape))
    h = inputs
    h = Conv2D(filters=32, kernel_size=(3, 3), padding='same', activation='relu', input_shape=input_shape)(h)
    h = Conv2D(filters=32, kernel_size=(3, 3), activation='relu')(h)
    h = MaxPooling2D(pool_size=(2, 2))(h)
    h = Dropout(0.25)(h)
    h = Flatten()(h)
    h = Dense(512, activation='relu')(h)
    h = Dropout(0.5)(h)
    taylor_model = taylor_expansion_network(input_dim=512, max_pow=approx_order)
    h = taylor_model(h)
    h = Activation('relu')(h)
    print(h.shape)
    h = Dense(nclass, activation='softmax')(h)
    model = Model(inputs=inputs, outputs=h)
    return model

Please note that I do not guarantee that your model will work (e.g. that you will get good performance). I just provided a solution based on my interpretation of what you want.
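If it helps, this is roughly how the model could be compiled and trained on the CIFAR-10 arrays prepared in your question (the optimizer, batch size, and epoch count below are arbitrary illustrative choices, not recommendations):

# assumes the CIFAR-10 preprocessing from the question has already been run,
# so train_data, test_data, the one-hot labels, input_shape and n_class are in scope
model = expandable_cnn(input_shape=input_shape, nclass=n_class, approx_order=2)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_data, train_label_one_hot,
          batch_size=64, epochs=5,
          validation_data=(test_data, test_label_one_hot))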

Vex answered 5/4, 2020 at 14:32 Comment(7)
Would you mind first letting me reproduce your answer with the CIFAR-10 dataset? Thank you very much! – Bufordbug
You are welcome :) I did not include the activation g from the equation, but you can easily compute it on the output of this model. Also, I am happy to make any necessary changes. For example, it is not clear whether the model should be applied independently to each element of the input tensor or whether the alphas and ws should be trainable (they are trainable in my implementation). – Vex
I am trying to reproduce your answer with the CIFAR-10 dataset, but I find it difficult to understand these lines: x_p_ = x_p[..., None] and s_ = s[..., None]. Does the ... in s_ = s[..., None] refer to default params, or is it what makes the shape equal to (batch_size, input_dim, max_pow+1, 1)? If the model is going to be applied to each element of the input tensor, how do I implement this? In my pipeline I have 3 conv filters, two hidden layers, and an FC layer; if I want to use the Taylor expansion on a hidden layer with max_pow=2, how can I make your answer reusable in the pipeline? – Bufordbug
Your answer helped me a lot. Could you possibly post an update regarding my above comment? Thanks. – Bufordbug
I am glad it is being helpful. About the ellipsis ...: I am basically using it to expand the last dimension of the tensor, e.g. x_p[..., None] is equivalent to x_p[:, :, :, None] and to tf.expand_dims(x_p, -1). In other words, the dimensions go from (batch_size, input_dim, max_pow+1) to (batch_size, input_dim, max_pow+1, 1). – Vex
Right now, the model is ready to be applied to each element of the input tensor; that is, you can apply the model to the output of the FC network (say a tensor of shape (bs, nb_hidden)), and the Taylor series will be expanded for each element along the second axis. Note that, with the current implementation, the weights alpha and w are shared across elements; it would also be possible to make them independent for each element if that's what you want. – Vex
Let us continue this discussion in chat. – Bufordbug
