layer normalization in layer? #44

@PanXiebit

Description
https://github.com/NLPLearn/QANet/blob/8107d223897775d0c3838cb97f93b089908781d4/layers.py#L52

Excuse me, in the paper "Layer Normalization" (Lei Jimmy Ba, Ryan Kiros, and Geoffrey E. Hinton), it is said that the mean and variance are computed over all the hidden units in the same layer, and that different training cases have different normalization terms. So I think the mean should be computed like this:

# normalize over all axes except the batch axis
axes = list(range(1, x.shape.ndims))
mean = tf.reduce_mean(x, axes)

So the shape of mean is [batch,], and likewise the variance is [batch,]; these are then used to compute the normalized x.

In the TensorFlow API for layer normalization, the source code is below, and I think it does the same as mine:
norm_axes = list(range(begin_norm_axis, inputs_rank))
https://github.com/tensorflow/tensorflow/blob/c19e29306ce1777456b2dbb3a14f511edf7883a8/tensorflow/contrib/layers/python/layers/layers.py#L2311
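To illustrate the difference being discussed, here is a minimal NumPy sketch (not code from this repository or from TensorFlow; the `layer_norm` helper is hypothetical) comparing normalization over all non-batch axes, as in the paper, with normalization over the last axis only:

```python
import numpy as np

def layer_norm(x, axes, eps=1e-6):
    # Compute mean/variance over the given axes and normalize.
    # keepdims=True so the statistics broadcast back over x.
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.random.randn(2, 5, 8)  # [batch, length, hidden]

# Over all non-batch axes: one mean/variance per training case,
# matching the description in the Layer Normalization paper.
all_axes = tuple(range(1, x.ndim))  # (1, 2)
y_all = layer_norm(x, all_axes)

# Over the last axis only: one mean/variance per position.
y_last = layer_norm(x, (-1,))
```

In both cases the output shape equals the input shape; what differs is how many positions share one normalization term.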
