Input scaling, optimiser choice and other changes #3
dquigley533 wants to merge 1 commit into TheodoreWolf:main from
Conversation
…parameters needed. Switched to LBFGS since the examples use whole-batch training. Adjusted training so that physics_loss dominates. Used smooth activation functions.
Hi @dquigley533 and thanks for the PR! Sorry for not getting back to you sooner. My suggestion is, instead of modifying the existing code, to subclass your improvements as a new class and then simply add the improved versions as an additional section in the notebook. That way, readers coming from Medium won't be confused by different code, while still seeing clearly that what I've done is not optimal.
I came across your Medium post on this - thanks for making the code available. I'm currently learning about PINNs myself and found your tutorial to be a more useful introduction than many things I've read.
In my further reading and experimentation I made some tweaks which you might consider improvements, hence the PR in case you'd like to incorporate them.
Input/output scaling. The input time is scaled into [0, 1], as is the output. This dramatically reduces the number of parameters needed to reproduce the PDE solution well; I'm using a few dozen nodes in a single hidden layer.
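As a rough sketch of that idea (the domain bounds, ranges, and layer width here are illustrative assumptions, not values from the PR):

```python
import torch

# Assumed raw domain and output range; in practice these come from
# the problem setup or the training data.
t_min, t_max = 0.0, 10.0
u_min, u_max = -2.0, 3.0

def scale_t(t):
    """Scale raw time into [0, 1] before feeding it to the network."""
    return (t - t_min) / (t_max - t_min)

def unscale_u(u_hat):
    """Map the network's [0, 1]-scale output back to physical units."""
    return u_min + (u_max - u_min) * u_hat

# With scaled inputs/outputs, a small network suffices:
# a few dozen nodes in a single hidden layer.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32),
    torch.nn.GELU(),
    torch.nn.Linear(32, 1),
)
```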
Switch optimiser. Your example was training on the whole dataset rather than on mini-batches, so AdamW was unnecessary: you have exact (rather than estimated) gradients. With LBFGS, training converges in a handful of steps. Note that I also switched the activation function from ReLU to GELU to get smooth gradients.
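A minimal sketch of full-batch LBFGS training, assuming PyTorch (the model and the sine-curve stand-in data are illustrative). PyTorch's LBFGS requires a closure because it may re-evaluate the loss several times per step:

```python
import torch

torch.manual_seed(0)
t = torch.linspace(0, 1, 100).unsqueeze(1)   # scaled inputs, full batch
u = torch.sin(2 * torch.pi * t)              # stand-in targets

model = torch.nn.Sequential(
    torch.nn.Linear(1, 32),
    torch.nn.GELU(),                          # smooth activation for LBFGS
    torch.nn.Linear(32, 1),
)
opt = torch.optim.LBFGS(model.parameters(), max_iter=20)

def closure():
    # The forward/backward pass lives inside the closure so LBFGS
    # can re-evaluate the loss during its line search.
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(t), u)
    loss.backward()
    return loss

for _ in range(10):   # a handful of outer steps
    opt.step(closure)
```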
With those changes the problem fixed by the L2 regularisation doesn't seem to occur, but I've left that in anyway.
In the examples which use `physics_loss`, I've weighted it massively higher than the MSE loss on the training points, to avoid overfitting to the noise on those points.

The whole thing now runs in a handful of seconds on a CPU, which did wonders for my confidence that PINNs would be tractable for problems in higher numbers of dimensions.
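The weighting scheme above can be sketched as follows; the weight value, function name, and dummy tensors are illustrative assumptions, not the PR's actual code:

```python
import torch

# Assumed weight: large enough that the physics residual dominates
# the data-fitting term, so the fit is anchored by the PDE rather
# than by noise on the training points.
PHYSICS_WEIGHT = 1e4

def total_loss(u_pred, u_data, residual):
    """Combine data MSE with a heavily weighted physics residual."""
    data_loss = torch.mean((u_pred - u_data) ** 2)
    physics_loss = torch.mean(residual ** 2)
    return data_loss + PHYSICS_WEIGHT * physics_loss

# Example with dummy tensors:
u_pred = torch.tensor([1.0, 2.0])
u_data = torch.tensor([1.1, 1.9])
residual = torch.tensor([0.01, -0.02])
loss = total_loss(u_pred, u_data, residual)
```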
Obviously feel free to ignore - just felt compelled to share what I learned by trying to push your example to its limits.