

System information:

  • Have I written custom code (as opposed to using a stock example script provided in Keras): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): MacOS 10.15.7
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): v2.7.0-rc1-69-gc256c071bb2 2.7.0
  • Python version: 3.8.12
  • Bazel version (if compiling from source): N/A
  • GPU model and memory: N/A
  • Exact command to reproduce: see below
    Describe the problem:

    The following code fails because the BatchNormalization layer does not support inputs of type uint8 (see the full stack trace below). You can run this code in this gist.

    import tensorflow as tf
    (X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()  # uint8 pixel data
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(),
        tf.keras.layers.BatchNormalization(),  # receives uint8 inputs and raises a TypeError
        tf.keras.layers.Dense(10, activation="softmax")
    ])
    model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd")
    model.fit(X_train, y_train, epochs=2)

    Describe the current behavior:

    The code above raises the following exception: TypeError: Exception encountered when calling layer "batch_normalization_1" (type BatchNormalization). Input 'y' of 'AddV2' Op has type float32 that does not match type uint8 of argument 'x'.

    See the full stack trace below.

    Describe the expected behavior:

    The BatchNormalization layer should automatically cast integer inputs to floats.

    If you remove the BatchNormalization layer, everything works fine, because the Dense layer casts its inputs to floats automatically. I expect these layers to behave in the same way: either both of them should cast integers to floats when needed, or neither of them should. IMO the first option is preferable.
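    For illustration (not part of the original report), a minimal check of that difference, assuming TF 2.7 as in the system information above:

    import numpy as np
    import tensorflow as tf

    x = np.zeros((1, 784), dtype="uint8")        # uint8 pixels, like flattened MNIST images
    print(tf.keras.layers.Dense(10)(x).dtype)    # float32: Dense casts integer inputs itself
    # tf.keras.layers.BatchNormalization()(x)    # raises the TypeError shown in the stack trace below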

    Contributing:

  • Do you want to contribute a PR? (yes/no): yes
  • Briefly describe your candidate solution (if contributing): in the BatchNormalization layer, cast integer inputs to float32.
    Source code / logs:

    Full stack trace:

    Epoch 1/2
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-2-15a576831687> in <module>()
          8 model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd")
    ----> 9 model.fit(X_train, y_train, epochs=2)
    1 frames
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in autograph_handler(*args, **kwargs)
       1127           except Exception as e:  # pylint:disable=broad-except
       1128             if hasattr(e, "ag_error_metadata"):
    -> 1129               raise e.ag_error_metadata.to_exception(e)
       1130             else:
       1131               raise
    TypeError: in user code:
        File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 878, in train_function  *
            return step_function(self, iterator)
        File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 867, in step_function  **
            outputs = model.distribute_strategy.run(run_step, args=(data,))
        File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 860, in run_step  **
            outputs = model.train_step(data)
        File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 808, in train_step
            y_pred = self(x, training=True)
        File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 67, in error_handler
            raise e.with_traceback(filtered_tb) from None
        TypeError: Exception encountered when calling layer "batch_normalization" (type BatchNormalization).
        Input 'y' of 'AddV2' Op has type float32 that does not match type uint8 of argument 'x'.
        Call arguments received:
          • inputs=tf.Tensor(shape=(32, 784), dtype=uint8)
          • training=True
              

    Initial notes from triage:
    we don't think we typically cast automatically for users, but perhaps this is a mistake.

    @fchollet any thoughts on this?

    Hi @LukeWood,
    Thanks for your feedback. I ran some tests by feeding uint8 inputs to almost every type of layer, and here are the results (see the probe sketch at the end of this comment):

    The following layers do not complain about the uint8 inputs, and they return float32 outputs:

  • Average
  • Dense
  • Embedding
  • LeakyReLU
  • Normalization
  • RandomHeight
  • RandomWidth
  • Rescaling
  • Resizing
    This one returns int64 outputs:

  • Hashing
    These layers return uint8 outputs:

  • Activation
  • AlphaDropout
  • CenterCrop
  • Concatenate
  • Cropping1D
  • Cropping2D
  • Cropping3D
  • Dropout
  • Flatten
  • GaussianDropout
  • GaussianNoise
  • GlobalAveragePooling1D
  • GlobalAveragePooling2D
  • GlobalAveragePooling3D
  • GlobalMaxPooling1D
  • GlobalMaxPooling2D
  • GlobalMaxPooling3D
  • Lambda
  • Maximum
  • MaxPooling1D
  • MaxPooling2D
  • Minimum
  • Multiply
  • Permute
  • RandomContrast
  • RandomCrop
  • RandomFlip
  • RandomRotation
  • RandomTranslation
  • RandomZoom
  • RepeatVector
  • Reshape
  • SpatialDropout1D
  • SpatialDropout2D
  • SpatialDropout3D
  • Subtract
  • ThresholdedReLU
  • TimeDistributed
  • UpSampling1D
  • UpSampling2D
  • UpSampling3D
  • ZeroPadding1D
  • ZeroPadding2D
  • ZeroPadding3D
    These layers reject the uint8 input and raise an exception:

  • AdditiveAttention
  • Attention
  • AveragePooling1D
  • AveragePooling2D
  • AveragePooling3D
  • BatchNormalization
  • Conv1D
  • Conv1DTranspose
  • Conv2D
  • Conv2DTranspose
  • Conv3D
  • Conv3DTranspose
  • ConvLSTM1D
  • ConvLSTM2D
  • ConvLSTM3D
  • DepthwiseConv1D
  • DepthwiseConv2D
  • Discretization
  • LayerNormalization
  • LocallyConnected1D
  • LocallyConnected2D
  • LSTMCell
  • MaxPooling3D
  • PReLU
  • SeparableConv1D
  • SeparableConv2D
  • SimpleRNNCell
  • Softmax
    I did not test the following layers:

  • ActivityRegularization
  • CategoryEncoding
  • DenseFeatures
  • GRUCell
  • IntegerLookup
  • Masking
  • MultiHeadAttention
  • StackedRNNCells
  • StringLookup
  • TextVectorization
    So it looks like not casting is indeed the default, with some important exceptions, including Average, Dense, LeakyReLU, Normalization, RandomHeight, RandomWidth, Rescaling, and Resizing.

    So perhaps the decision should be made on a case-by-case basis. Regarding BatchNormalization, it would be nice to be able to use it as the first layer, with uint8 images as input.

    Wdyt?
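    For reference, here is a rough sketch (my reconstruction, not the exact script used) of the kind of per-layer probe behind the lists above: feed a uint8 tensor to a layer and report either the output dtype or the exception.

    import numpy as np
    import tensorflow as tf

    def probe(layer, shape=(2, 8)):
        x = np.zeros(shape, dtype="uint8")
        try:
            return layer(x).dtype.name        # e.g. 'float32' or 'uint8'
        except Exception as e:                # layers that reject uint8 end up here
            return f"error: {type(e).__name__}"

    print(probe(tf.keras.layers.Dense(4)))              # 'float32'
    print(probe(tf.keras.layers.Flatten()))             # 'uint8'
    print(probe(tf.keras.layers.BatchNormalization()))  # 'error: TypeError'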

    Thanks for the analysis. I think casting int inputs to floats makes sense for the BN layer.

    For those layers that raise an error when the input is an int, I think the majority of them can just cast the input to backend.floatx() (unless we have a good reason to raise an error about the invalid input type).

    Feel free to send a PR for this issue if you would like to contribute, and we can apply it to all the layers that are applicable.

    The BatchNormalization layer should automatically cast integer inputs to floats.

    The way casting currently works in Keras layers is that each layer has a "dtype policy", which contains a "variable dtype" and a "compute dtype". By default both are equal to float32, but they can have different values (e.g. in mixed precision you'd use a policy with a float32 variable dtype and a float16 compute dtype).
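    As a quick illustration (not from the thread; assumes the TF 2.x Keras mixed-precision API), a layer created with the "mixed_float16" policy keeps float32 variables but computes in float16:

    import tensorflow as tf

    layer = tf.keras.layers.Dense(8, dtype="mixed_float16")  # dtype accepts a policy name
    print(layer.dtype)           # 'float32'  (variable dtype)
    print(layer.compute_dtype)   # 'float16'  (compute dtype)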

    All layers will cast their inputs to their compute dtype. BUT this only happens for floating-point inputs (e.g. casting float64 to float32). In your case no casting happens because the input is of integer type.
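    A small sketch of that rule (my addition, assuming TF 2.7): float inputs are autocast, while integer inputs are passed through untouched and the op then fails.

    import numpy as np
    import tensorflow as tf

    bn = tf.keras.layers.BatchNormalization()
    print(bn(np.zeros((2, 4), dtype="float64")).dtype)  # float32: float64 is autocast
    # bn(np.zeros((2, 4), dtype="uint8"))               # raises TypeError: ints are not autocast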

    We probably have two options here:

  1. Extend the rule above to cast all numerical dtypes to the compute dtype. This may well be OK, I think?
  2. Do the casting by hand on a case-by-case basis.

    I'm trying to think if there are cases where 1) would be obviously incorrect. Maybe image preprocessing layers? But even then, the rule "cast to compute dtype" is simple and consistent. @mattdangerw I remember we looked at this in the context of KPL, do you remember what our conclusion was?

    I'd favor doing 1) (at the level of the base layer) unless we find a significant reason why this would be incorrect. However, backwards compatibility constraints might prevent us from doing so for layers where uint8 is currently accepted and returns uint8 outputs.

    Casting all numerical types to compute dtype would be incorrect for layers that need to handle categorical/discrete integer inputs. Layers that would fall into that camp are Embedding, IntegerLookup, Hashing, CategoryEncoding, among others.

    We do have an option to disable the casting entirely (kwargs['autocast'] = False). Embedding is a good example: it turns off the autocast option and casts the outputs, rather than the inputs, to the global compute dtype. That's the general pattern I have been trying to follow in preprocessing layers: cast to the compute dtype as soon as possible, which is often after some initial categorical computations. For image preprocessing, I actually think casting early is generally ok.
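    A hedged sketch of that pattern (a hypothetical layer, not Keras's actual Embedding code): opt out of input autocasting, do the categorical work on the raw integers, then cast the outputs to the compute dtype.

    import tensorflow as tf

    class IdLookupSketch(tf.keras.layers.Layer):
        """Illustrative only: looks up rows of a float table by integer id."""

        def __init__(self, table_size, width, **kwargs):
            kwargs["autocast"] = False  # keep integer inputs exactly as they arrive
            super().__init__(**kwargs)
            self.table = self.add_weight("table", shape=(table_size, width), dtype="float32")

        def call(self, ids):
            out = tf.gather(self.table, ids)         # categorical work on the raw ints
            return tf.cast(out, self.compute_dtype)  # cast outputs, not inputs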

    The place where I would most worry about 1) breaking things is custom embedding layers. I think most of the model garden NLP models would fall into this camp. They use custom embedding layers that do not set the autocast option, but would fail on non-integer inputs.

    Possibly 2) might be the more practical option. There are probably only so many of our layers that will actually be used at the top of a model. We could explicitly cast ints to compute dtype for the layers that only operate on floats.

    If we go with 1), we need to make sure we leave an option to turn it off (autocast), and make sure that is well known and documented for users. I worry it could be frustrating for a developer who has a legitimate need for integer inputs (there are many) and who can't figure out why inputs are magically changing under the hood.

    Thanks for the detailed explainer, Matt!

    Possibly 2) might be the more practical option. There are probably only so many of our layers that will actually be used at the top of a model. We could explicitly cast ints to compute dtype for the layers that only operate on floats.

    Doing 2) is correct in any case and does not preclude doing 1) in the future, so I would recommend doing 2) right now (at least for BatchNorm and possibly a few more layers), and opening a ticket for future investigation of a more generalized behavior.
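    To make option 2) concrete, a hedged sketch (hypothetical, not the eventual PR) of what the explicit cast could look like inside a float-only layer such as BatchNormalization:

    import tensorflow as tf

    class FloatOnlyLayerSketch(tf.keras.layers.Layer):
        """Illustrative stand-in for a float-only layer like BatchNormalization."""

        def call(self, inputs):
            if not inputs.dtype.is_floating:
                inputs = tf.cast(inputs, self.compute_dtype)  # e.g. uint8 -> float32
            return inputs  # a real layer would normalize here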