Open
Description
The training loss looks like this:
{'loss': 8.3281, 'grad_norm': 322.77156161507077, 'learning_rate': 0.0, 'epoch': 0.0} | 19/92 [00:03<00:11, 6.29it/s]
{'loss': 8.5, 'grad_norm': 273.2512527507693, 'learning_rate': 5.0000000000000004e-08, 'epoch': 0.0}
{'loss': 7.9219, 'grad_norm': 246.70768633306642, 'learning_rate': 1.0000000000000001e-07, 'epoch': 0.0}
{'loss': 6.9141, 'grad_norm': 207.61659902097261, 'learning_rate': 1.5000000000000002e-07, 'epoch': 0.0}
{'loss': 6.5938, 'grad_norm': 238.2175499387362, 'learning_rate': 2.0000000000000002e-07, 'epoch': 0.0}
{'loss': 7.0312, 'grad_norm': 215.7555159500225, 'learning_rate': 2.5000000000000004e-07, 'epoch': 0.0}
{'loss': 9.0625, 'grad_norm': 292.1837063871723, 'learning_rate': 3.0000000000000004e-07, 'epoch': 0.0}
{'loss': 7.8125, 'grad_norm': 249.52821941121618, 'learning_rate': 3.5000000000000004e-07, 'epoch': 0.0}
{'loss': 7.9609, 'grad_norm': 281.87461514835957, 'learning_rate': 4.0000000000000003e-07, 'epoch': 0.0}
{'loss': 8.2422, 'grad_norm': 220.2848272714045, 'learning_rate': 4.5e-07, 'epoch': 0.0}
{'loss': 8.8281, 'grad_norm': 309.4026145868916, 'learning_rate': 5.000000000000001e-07, 'epoch': 0.0}
{'loss': 8.375, 'grad_norm': 246.82614914951165, 'learning_rate': 5.5e-07, 'epoch': 0.0}
{'loss': 8.4844, 'grad_norm': 267.56563534299346, 'learning_rate': 6.000000000000001e-07, 'epoch': 0.0}
{'loss': 7.75, 'grad_norm': 254.66540924254167, 'learning_rate': 6.5e-07, 'epoch': 0.0}
{'loss': 8.5703, 'grad_norm': 236.04995774767977, 'learning_rate': 7.000000000000001e-07, 'epoch': 0.0}
{'loss': 7.6055, 'grad_norm': 244.71480797209944, 'learning_rate': 7.5e-07, 'epoch': 0.0}
{'loss': 7.4844, 'grad_norm': 239.66509686659916, 'learning_rate': 8.000000000000001e-07, 'epoch': 0.0}
{'loss': 5.6172, 'grad_norm': 196.99852716316857, 'learning_rate': 8.500000000000001e-07, 'epoch': 0.0}
{'loss': 6.5312, 'grad_norm': 209.76915835265947, 'learning_rate': 9e-07, 'epoch': 0.0}
{'loss': 7.3906, 'grad_norm': 209.8876629848969, 'learning_rate': 9.5e-07, 'epoch': 0.0}
{'loss': 5.9844, 'grad_norm': 197.52531679833206, 'learning_rate': 1.0000000000000002e-06, 'epoch': 0.0}
{'loss': 4.9141, 'grad_norm': 176.5006773319435, 'learning_rate': 1.0500000000000001e-06, 'epoch': 0.0}
{'loss': 4.6719, 'grad_norm': 159.39841595633183, 'learning_rate': 1.1e-06, 'epoch': 0.0}
{'loss': 5.5391, 'grad_norm': 246.40031142756555, 'learning_rate': 1.15e-06, 'epoch': 0.0}
{'loss': 6.8672, 'grad_norm': 313.6043602156962, 'learning_rate': 1.2000000000000002e-06, 'epoch': 0.0}
{'loss': 4.918, 'grad_norm': 170.7606607499666, 'learning_rate': 1.25e-06, 'epoch': 0.0}
{'loss': 4.2773, 'grad_norm': 184.55600488249698, 'learning_rate': 1.3e-06, 'epoch': 0.0}
{'loss': 4.6211, 'grad_norm': 373.14208295733414, 'learning_rate': 1.35e-06, 'epoch': 0.0}
{'loss': 4.4883, 'grad_norm': 346.7172125262856, 'learning_rate': 1.4000000000000001e-06, 'epoch': 0.0}
{'loss': 3.9102, 'grad_norm': 270.9716257650011, 'learning_rate': 1.45e-06, 'epoch': 0.0}
Is that normal? It does not look like a typical VLA training loss.
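To make the trend easier to judge than by reading raw per-step values, here is a minimal sketch that parses log lines like the ones above and smooths the loss with a trailing moving average. It assumes each line contains one Python dict literal, possibly followed by progress-bar text. `parse_log_lines` and `moving_average` are hypothetical helper names written for this example, not part of any training library.

```python
import ast


def parse_log_lines(lines):
    """Extract the dict literal from each log line.

    Anything outside the outermost braces (for example trailing
    progress-bar text) is ignored. Lines without a dict are skipped.
    """
    records = []
    for line in lines:
        start, end = line.find("{"), line.rfind("}")
        if start == -1 or end == -1:
            continue
        records.append(ast.literal_eval(line[start:end + 1]))
    return records


def moving_average(values, window):
    """Trailing moving average; early entries use a shorter window."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Three lines copied from the log above (first, second, and last step).
sample = [
    "{'loss': 8.3281, 'grad_norm': 322.77156161507077, 'learning_rate': 0.0, 'epoch': 0.0} | 19/92 [00:03<00:11, 6.29it/s]",
    "{'loss': 8.5, 'grad_norm': 273.2512527507693, 'learning_rate': 5.0000000000000004e-08, 'epoch': 0.0}",
    "{'loss': 3.9102, 'grad_norm': 270.9716257650011, 'learning_rate': 1.45e-06, 'epoch': 0.0}",
]

records = parse_log_lines(sample)
losses = [r["loss"] for r in records]
grad_norms = [r["grad_norm"] for r in records]

print("smoothed loss:", moving_average(losses, window=2))
print("max grad_norm:", max(grad_norms))
```

Running this over the full log, with a larger window, shows whether the loss is trending down during the learning-rate warmup even though individual steps are noisy.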