Why is input normalization (mean/std) needed after scaling? Performance drops if removed #158

@JasonAlexTan

Description

Hi, I noticed that in the code, there are two normalization steps for the input data:

  • First, the data is scaled globally (e.g., with MinMaxScaler or a similar scaler).
  • Then each input sequence (of the input length) is standardized: its own mean is subtracted and the result is divided by its own standard deviation.

I’m curious about the reason for applying the second normalization step (subtracting the mean and dividing by the standard deviation) after the initial scaling. When I remove this second, per-sequence normalization, I observe a noticeable drop in model performance. Could you explain why this additional normalization is necessary? What is the intuition or theoretical reason behind it?
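For concreteness, here is a minimal sketch of the two steps as I understand them. The `standardize_window` helper, the `eps` term, the toy data, and the window length of 96 are my own placeholders for illustration, not the repo's actual code:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
series = rng.normal(size=(1000, 3))  # toy series: (time, features)

# Step 1: global scaling, fit once (e.g., on the training split).
scaler = MinMaxScaler()
series_scaled = scaler.fit_transform(series)

# Step 2: per-window standardization across the input length.
def standardize_window(window: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    # Each input sequence is re-centered with its *own* mean/std,
    # computed along the time axis, independently of the global scaler.
    mean = window.mean(axis=0, keepdims=True)
    std = window.std(axis=0, keepdims=True)
    return (window - mean) / (std + eps)

input_length = 96
window = series_scaled[:input_length]
print(standardize_window(window).mean(axis=0))  # ~0 for every feature
```

It is this second step (the per-window re-centering) that I removed in my experiment.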
