Hi! I read Yarin Gal's paper and did not understand how the weight regulariser and dropout regulariser are initialized. The author provides a formula, but it is not very clear (e.g. what does "prior length scale" mean, and which value should be assigned to this variable?). Could you explain how you found the values used to initialize the weight regulariser and the dropout regulariser?