Amazing work.
I ran into the same problem with Ord not being implemented for floats, and ended up with a solution similar to yours:
let index = output_tensor
    .as_data::<NotNan<f32>>()
    .iter()
    .position_max()
    .unwrap();
But it doesn't look like you expose your impl of ElemTypeOf for NotNan. Can we make something pub to expose that?
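In the meantime, a std-only fallback seems possible. This is only a minimal sketch, assuming the output scores can be read as a plain &[f32]; the argmax helper name is made up for illustration, and it swaps NotNan for f32::total_cmp (std, Rust 1.62+), so a NaN is ordered rather than rejected:

```rust
/// Hypothetical helper: index of the largest value in `scores`,
/// or `None` if the slice is empty.
///
/// `f32` does not implement `Ord`, so `Iterator::max` is unavailable.
/// `max_by` with `f32::total_cmp` supplies a total order instead.
/// Unlike `NotNan`, this does not reject NaN: a positive NaN sorts
/// above +inf under the IEEE 754 total order.
fn argmax(scores: &[f32]) -> Option<usize> {
    scores
        .iter()
        .enumerate()
        // Compare by value, carrying the index along.
        .max_by(|(_, a), (_, b)| a.total_cmp(b))
        .map(|(i, _)| i)
}
```

It returns an Option rather than unwrapping, so an empty output is handled by the caller. If NaN should be treated as an error, the NotNan approach above is still the better fit.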
Thanks!