Hi,
Thanks for your wonderful work.
I am unsure how you derived the correlation matrix in Figure 2, both in terms of which variables go into the calculation and how the correlation itself is computed.
For instance, does the word-to-word correlation matrix use the correlation between w_i W^{Q,1} and (w_j W^{K,1})^T as the variables for the calculation?
Also, how do you reduce the dimension to get a single value per matrix entry, given that the standard correlation coefficient is defined only for scalar variables?
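To make the question concrete, here is a minimal sketch of the two readings I can think of. Everything in it is my own guess and not taken from your code: the function names, the `reduce` argument, and the toy inputs are hypothetical. Reading (a) takes the plain dot product of w_i W^{Q,1} and w_j W^{K,1}; reading (b) takes the Pearson correlation between the two projected vectors, treating their coordinates as samples.

```python
import math


def project(w, W):
    # Row vector w (length d_in) times matrix W (d_in x d_out).
    return [sum(w[r] * W[r][c] for r in range(len(w))) for c in range(len(W[0]))]


def pearson(x, y):
    # Pearson correlation of two equal-length vectors, treating
    # their coordinates as paired samples (an assumption on my side).
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0  # undefined for constant vectors; 0.0 is a placeholder
    return cov / (sx * sy)


def word_to_word(words, W_q, W_k, reduce="dot"):
    # Hypothetical helper: entry (i, j) compares w_i W_q with w_j W_k.
    q = [project(w, W_q) for w in words]
    k = [project(w, W_k) for w in words]
    if reduce == "dot":
        # Reading (a): scalar score (w_i W_q) (w_j W_k)^T.
        return [[sum(a * b for a, b in zip(qi, kj)) for kj in k] for qi in q]
    # Reading (b): Pearson correlation across the projected dimensions.
    return [[pearson(qi, kj) for kj in k] for qi in q]
```

Is either of these what Figure 2 shows, or is the correlation computed over a different set of samples (for example, across sentences)?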
Thanks!