I read your paper, Concrete Dropout, and found an inconsistency between the code and the paper.
According to Eq. (3) of the paper, the regularizer on the kernel matrix should be proportional to 1 - p.
In the code, however, it is inversely proportional to 1 - p:
kernel_regularizer = self.weight_regularizer * K.sum(K.square(weight)) / (1. - self.p)
I am not sure whether I have misunderstood the paper or the code.
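To make the difference concrete, here is a minimal sketch in plain Python (no Keras) that compares the two scalings. The function names and sample values are my own and purely illustrative, not taken from the repository. The first function follows my reading of Eq. (3), which may be mistaken, and the second mirrors the line of code quoted above.

```python
def reg_proportional(weights, weight_regularizer, p):
    """Scaling as I read Eq. (3): proportional to (1 - p). (My assumption.)"""
    sum_sq = sum(w * w for w in weights)
    return weight_regularizer * sum_sq * (1.0 - p)


def reg_inverse(weights, weight_regularizer, p):
    """Scaling as in the quoted code: inversely proportional to (1 - p)."""
    sum_sq = sum(w * w for w in weights)
    return weight_regularizer * sum_sq / (1.0 - p)


# Illustrative values only; not from the paper or the repository.
weights = [0.5, -1.0, 2.0]
weight_regularizer = 1e-2

for p in (0.1, 0.5, 0.9):
    print(p,
          reg_proportional(weights, weight_regularizer, p),
          reg_inverse(weights, weight_regularizer, p))
```

As p increases, the first quantity shrinks while the second grows, so taken alone the two versions of this term have gradients of opposite sign with respect to p.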