
Conversation

constd (Owner) commented Dec 15, 2025

No description provided.

constd changed the base branch from main to conv-generator on December 15, 2025 18:00
discriminator_optimizer.zero_grad()
self.manual_backward(discriminator_loss["loss"], retain_graph=True)
discriminator_optimizer.step()
if self.global_step >= 10000:
cjacoby (Collaborator) commented Dec 15, 2025

I know this is a draft, but can we make this configurable / enableable?

constd (Owner, author) replied:

worse than a draft, this is a hack, and yes, we could/should
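
A minimal sketch of one way to make the threshold configurable (assuming the module is a LightningModule using manual optimization, as the excerpts suggest; num_frozen_steps matches the name used further down, the class name is hypothetical):

import pytorch_lightning as pl

class GanVocoder(pl.LightningModule):  # hypothetical class name
    def __init__(self, num_frozen_steps: int = 10_000):
        super().__init__()
        # Manual optimization: training_step drives both optimizers itself.
        self.automatic_optimization = False
        # Number of global steps to keep the discriminator frozen.
        self.num_frozen_steps = num_frozen_steps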

cjacoby (Collaborator) replied:

A funny note / observation:

global steps are counted per optimizer step, so with three datasets the 10000 threshold shows up in the graph as 10000 / 3 😅

constd (Owner, author) replied:

we should put that in a comment in the docstring
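
One possible wording for that note (hypothetical phrasing and signature, just restating the observation above) in the training_step docstring:

def training_step(self, batch, batch_idx):
    """Run one manual-optimization step for generator and discriminator.

    Note: ``self.global_step`` counts optimizer steps, so with three
    datasets the 10000-step freeze threshold shows up in the loss
    graphs as 10000 / 3.
    """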

discriminator_optimizer.zero_grad()
self.manual_backward(discriminator_loss["loss"], retain_graph=True)
discriminator_optimizer.step()
if self.global_step >= self.num_frozen_steps:
cjacoby (Collaborator) commented Dec 16, 2025

I was looking at the loss graphs, and they didn't seem quite right, so I went back and compared this to how it's done in BigVGAN:

https://github.com/NVIDIA/BigVGAN/blob/main/train.py#L486

In their implementation (different from what's here), discriminator_optimizer.zero_grad() is always called, but backward and step are only called conditionally.

I think the way you have it here, we may be implicitly stepping the discriminator through the generator, because it's not zeroing the gradients.
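
For reference, a minimal sketch of that ordering (not the BigVGAN code itself, reusing the names from the excerpt above), where zero_grad() is unconditional and only backward/step are gated:

# zero_grad() runs every training step, so any gradients left on the
# discriminator parameters (e.g. by the generator's backward pass) are
# cleared even while the discriminator is still frozen.
discriminator_optimizer.zero_grad()
if self.global_step >= self.num_frozen_steps:
    self.manual_backward(discriminator_loss["loss"], retain_graph=True)
    discriminator_optimizer.step()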
