Input scaling, optimiser choice and other changes #3

Open
dquigley533 wants to merge 1 commit into TheodoreWolf:main from dquigley533:main
Conversation

@dquigley533

I came across your Medium post on this - thanks for making the code available. I'm currently learning about PINNs myself and found your tutorial to be a more useful introduction than many things I've read.

In my further reading and experimentation I made some tweaks which you might consider improvements, hence the PR in case you'd like to incorporate them.

  1. Input/Output scaling. The input time is scaled into [0,1], as is the output. This dramatically reduces the number of parameters needed to get a good reproduction of the PDE solution. I'm using a few dozen nodes in a single hidden layer.

  2. Switch optimiser. Your example trains on the whole dataset rather than mini-batches, so AdamW is unnecessary: you have exact (rather than estimated) gradients. With LBFGS, training converges in a handful of steps. Note that I also switched the activation functions from ReLU to GELU to give LBFGS smooth gradients.

  3. With those changes the problem fixed by the L2 regularisation doesn't seem to occur, but I've left that in anyway.

  4. In the examples which use physics_loss I've weighted this massively higher than the MSE loss on the training points to avoid overfitting to the noise on those points.
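The four tweaks above can be illustrated together in a short PyTorch sketch. This is not the PR's actual code: the network size, the toy target function, the placeholder derivative penalty standing in for the real PDE residual, and the 1e3 loss weight are all illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# (1) A few dozen nodes in a single hidden layer, with smooth GELU
# activations (2) so that LBFGS sees smooth gradients.
net = nn.Sequential(nn.Linear(1, 32), nn.GELU(), nn.Linear(32, 1))

# (1) Input time scaled into [0, 1]; the target is rescaled the same way.
# The damped-cosine target here is just an illustrative stand-in.
t_raw = torch.linspace(0.0, 10.0, 50).unsqueeze(1)
t = (t_raw - t_raw.min()) / (t_raw.max() - t_raw.min())
y_raw = torch.exp(-0.5 * t_raw) * torch.cos(2.0 * t_raw)
y = (y_raw - y_raw.min()) / (y_raw.max() - y_raw.min())

# Collocation points for the physics term.
t_phys = torch.rand(100, 1, requires_grad=True)

# (2) LBFGS operates on the full batch and requires a closure that
# re-evaluates the loss each time the optimiser needs it.
opt = torch.optim.LBFGS(net.parameters(), lr=0.1, max_iter=20)

def closure():
    opt.zero_grad()
    mse = nn.functional.mse_loss(net(t), y)
    u = net(t_phys)
    du = torch.autograd.grad(u, t_phys, torch.ones_like(u),
                             create_graph=True)[0]
    phys = (du ** 2).mean()  # placeholder, NOT the tutorial's PDE residual
    # (4) Physics loss weighted far above the data MSE to avoid
    # overfitting to noise on the training points.
    loss = 1e3 * phys + mse
    loss.backward()
    return loss

initial_loss = float(closure())
for _ in range(5):  # converges in a handful of steps on the full batch
    opt.step(closure)
final_loss = float(closure())
print(initial_loss, final_loss)
```

The closure pattern is the non-standard part of the LBFGS API: unlike Adam-family optimisers, `opt.step(closure)` may call the closure several times per step during its line search.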

The whole thing now runs in a handful of seconds on a CPU, which did wonders for my confidence that PINNs would be tractable for problems in higher numbers of dimensions.

Obviously feel free to ignore - just felt compelled to share what I learned by trying to push your example to its limits.

@TheodoreWolf
Owner

Hi @dquigley533 and thanks for the PR! Sorry for not getting back to you sooner.
Implementing LBFGS was something I'd wanted to do for a while, since it's well known to work better with PINNs' bumpy loss landscapes (on top of doing full-batch optimization), so thanks a lot for that!
Judging from the stargazers here and my Medium metrics, most readers arrive via Medium -> GitHub and rarely have extensive NN experience. I'm therefore inclined not to change the code too much, especially since LBFGS has non-standard syntax.

My suggestion is, instead of modifying the existing code, to subclass your improvements as a new class and add the improved versions as an additional section in the notebook. That way readers coming from Medium aren't confused by unfamiliar code, while still seeing that what I've done is clearly not optimal.
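A minimal sketch of that subclassing approach, with hypothetical class names (the repo's actual class names may differ): the original network stays untouched, and the improved variant overrides only what changed.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    """Stand-in for the original tutorial network (ReLU, wider layer)."""
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(1, 100), nn.ReLU(), nn.Linear(100, 1))

    def forward(self, x):
        return self.layers(x)

class ImprovedNet(Net):
    """PR variant: smaller hidden layer with smooth GELU activations,
    intended for training with LBFGS instead of AdamW."""
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(1, 32), nn.GELU(), nn.Linear(32, 1))
```

The improved class could then be demonstrated in its own notebook section, leaving the original walkthrough intact for readers arriving from the Medium post.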
