If some layers of my TensorFlow model must be in training mode to work properly, what should I do to export the correct onnx #2379

Open
jungyin opened this issue Jan 11, 2025 · 1 comment
Labels: question (an issue, pull request, or discussion that needs more information)


jungyin commented Jan 11, 2025

Ask a Question

My model has several BatchNormalization layers. When I switch the model to inference (testing) mode, the outputs are completely different from those in training mode. How should I export it to ONNX so that the exported model reproduces the training-mode outputs?
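For context, here is a minimal NumPy sketch of why the two modes can disagree, assuming the standard batch-normalization formula. In training mode the layer normalizes with the statistics of the current batch; in inference mode it uses stored moving averages. The helper `batch_norm` and all values below are hypothetical and are not taken from the model in this issue.

```python
import numpy as np

def batch_norm(x, mean, var, gamma=1.0, beta=0.0, eps=1e-3):
    """Apply the batch-normalization formula with the given statistics."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
# Hypothetical activations whose mean is far from zero: (batch, channels).
x = rng.normal(loc=5.0, scale=2.0, size=(64, 4))

# Training mode: statistics come from the current batch.
train_out = batch_norm(x, x.mean(axis=0), x.var(axis=0))

# Inference mode: statistics come from the stored moving averages.
# Assumption: the averages are still at mean 0 and variance 1,
# i.e. they never tracked the real activations.
moving_mean = np.zeros(4)
moving_var = np.ones(4)
eval_out = batch_norm(x, moving_mean, moving_var)

print("training-mode mean per channel: ", train_out.mean(axis=0).round(3))
print("inference-mode mean per channel:", eval_out.mean(axis=0).round(3))
```

If the stored averages match the data, the two outputs stay close. A large gap like the one in this sketch suggests, but does not prove, that the moving statistics were never updated or were updated on differently distributed data.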

Further information

  • Is this issue related to a specific model?
    Model name:
    PhysNet

model code:

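import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class PhysNet(keras.Model):

    def __init__(self, norm='batch'):
        # Call the base constructor before assigning any attributes.
        super().__init__()
        self.norm = norm
        # Each option yields a zero-argument factory for a normalization layer.
        if norm == 'batch':
            norm = layers.BatchNormalization
        if norm == 'layer':
            norm = lambda: layers.LayerNormalization(axis=(1,))
        if norm == 'layer_frozen':
            norm = lambda: layers.LayerNormalization(axis=(1,), trainable=False)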
        self.ConvBlock1 = keras.Sequential([
            layers.Conv3D(16, kernel_size=(1, 5, 5), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock2 = keras.Sequential([
            layers.Conv3D(32, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock3 = keras.Sequential([
            layers.Conv3D(64, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock4 = keras.Sequential([
            layers.Conv3D(64, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock5 = keras.Sequential([
            layers.Conv3D(64, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock6 = keras.Sequential([
            layers.Conv3D(64, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock7 = keras.Sequential([
            layers.Conv3D(64, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock8 = keras.Sequential([
            layers.Conv3D(64, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock9 = keras.Sequential([
            layers.Conv3D(64, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.upsample = keras.Sequential([
            layers.Conv3DTranspose(64, kernel_size=(4, 1, 1), strides=(2, 1, 1), padding='same'),
            norm(),
            layers.Activation('elu')
        ])
        self.upsample2 = keras.Sequential([
            layers.Conv3DTranspose(64, kernel_size=(4, 1, 1), strides=(2, 1, 1), padding='same'),
            norm(),
            layers.Activation('elu')
        ])
        self.convBlock10 = layers.Conv3D(1, kernel_size=(1, 1, 1), strides=1)
        self.MaxpoolSpa = layers.MaxPool3D((1, 2, 2), strides=(1, 2, 2))
        self.MaxpoolSpaTem = layers.MaxPool3D((2, 2, 2), strides=2)
        self.poolspa = layers.AvgPool3D((1, 2, 2))
        self.flatten = layers.Reshape((-1,))

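    def call(self, x):
        # BatchNormalization is always run in training mode (batch statistics),
        # even at inference time; the other norm options run in inference mode.
        training = self.norm == 'batch'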
        x = self.ConvBlock1(x, training=training)
        x = self.MaxpoolSpa(x)
        x = self.ConvBlock2(x, training=training)
        x = self.ConvBlock3(x, training=training)
        x = self.MaxpoolSpaTem(x)
        x = self.ConvBlock4(x, training=training)
        x = self.ConvBlock5(x, training=training)
        x = self.MaxpoolSpaTem(x)
        x = self.ConvBlock6(x, training=training)
        x = self.ConvBlock7(x, training=training)
        x = self.MaxpoolSpa(x)
        x = self.ConvBlock8(x, training=training)
        x = self.ConvBlock9(x, training=training)
        x = self.upsample(x, training=training)
        x = self.upsample2(x, training=training)
        x = self.poolspa(x)
        x = self.convBlock10(x, training=training)
        x = self.flatten(x)
        # Subtract the per-sample mean so each output row is zero-centered.
        x = x - tf.reduce_mean(x, axis=-1, keepdims=True)
        return x
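
For reference, here is a NumPy sketch of what BatchNormalization computes in training mode on a channels-last Conv3D output, which is the behavior an exported graph would need to reproduce. It assumes the default channels-last layout and Keras's default epsilon of 1e-3; `batch_stat_norm` is a hypothetical helper, not part of the model above.

```python
import numpy as np

def batch_stat_norm(x, gamma, beta, eps=1e-3):
    """Normalize a channels-last 5D tensor with per-channel batch statistics."""
    axes = (0, 1, 2, 3)  # batch, depth, height, width
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
# Hypothetical Conv3D output: (batch, depth, height, width, channels).
x = rng.normal(loc=3.0, scale=2.0, size=(2, 8, 16, 16, 4)).astype(np.float32)
gamma = np.ones(4, dtype=np.float32)   # learned scale, one value per channel
beta = np.zeros(4, dtype=np.float32)   # learned offset, one value per channel

y = batch_stat_norm(x, gamma, beta)
print("per-channel mean:", y.mean(axis=(0, 1, 2, 3)))
print("per-channel std: ", y.std(axis=(0, 1, 2, 3)))
```

Because the statistics are taken over the batch axis as well, the output for one sample depends on the other samples in the same batch. In TensorFlow the same arithmetic can be written with `tf.nn.moments`, which does not depend on the `training` flag; whether a graph built that way converts cleanly to ONNX at opset 18 is an assumption that has not been checked here.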

Output (screenshot not preserved): "eval" is the ONNX/TF result in inference mode; "train" is the result from the original TF model in training mode.

Model opset:
18

Notes

jungyin added the question label on Jan 11, 2025

jungyin commented Jan 11, 2025

When I lock the model in inference (validation) mode, the ONNX and TF outputs are exactly the same. However, I now need to export the model in training mode so that I can pinpoint the problem caused by BatchNormalization running in non-training mode.
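
One way to narrow this down is sketched below. It assumes the mismatch comes from stored moving averages that differ from the actual batch statistics, which this issue does not confirm. The helper `stat_gap`, the layer names, and the data are hypothetical. In a real Keras model, the stored values would be the `moving_mean` and `moving_variance` weights of each BatchNormalization layer.

```python
import numpy as np

def stat_gap(activations, moving_mean, moving_var):
    """Return the largest per-channel gap between batch and moving statistics."""
    axes = tuple(range(activations.ndim - 1))  # every axis except channels
    batch_mean = activations.mean(axis=axes)
    batch_var = activations.var(axis=axes)
    return (np.abs(batch_mean - moving_mean).max(),
            np.abs(batch_var - moving_var).max())

rng = np.random.default_rng(0)
# Hypothetical inputs to two normalization layers, channels-last.
layers_in = {
    "bn_matched": rng.normal(0.0, 1.0, size=(4, 8, 8, 8, 16)),
    "bn_shifted": rng.normal(2.0, 3.0, size=(4, 8, 8, 8, 16)),
}

# Hypothetical stored statistics: mean 0 and variance 1 for both layers.
for name, act in layers_in.items():
    mean_gap, var_gap = stat_gap(act, np.zeros(16), np.ones(16))
    print(f"{name}: mean gap {mean_gap:.3f}, variance gap {var_gap:.3f}")
```

Layers with a large gap are the ones whose inference-mode output would be expected to diverge most from the training-mode output.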
