Training stability / reproducing results / training datasets #12

@xian-stack

Description

Firstly, I would sincerely like to thank the authors for releasing this code and for advancing the counting field. This is one of the best counters available, and I really appreciate the effort that went into making the implementation public.

I have a few questions regarding the provided weights and training stability.

1. I noticed that the demo weights perform better than the FSC weights. Could you please clarify:
   - What dataset(s) were the demo weights trained on?
   - Were they trained with additional data, longer schedules, or different augmentation strategies compared to the FSC setup?

2. Training (in)stability: I observed variability across different training runs. Below I summarize three checkpoints, reporting only MAE, RMSE, AP, and AP50 (rounded to two decimal places) for both the validation and test sets. A sketch of how I compute the counting metrics follows the results.

| Run | Split      | MAE  | RMSE  | AP    | AP50  |
|-----|------------|------|-------|-------|-------|
| 1   | Validation | 9.00 | 40.58 | 35.45 | 64.47 |
| 1   | Test       | 8.41 | 48.32 | 47.23 | 75.45 |
| 2   | Validation | 9.05 | 34.17 | 34.10 | 62.99 |
| 2   | Test       | 7.32 | 38.99 | 46.74 | 74.58 |
| 3   | Validation | 9.82 | 37.53 | 34.06 | 62.72 |
| 3   | Test       | 9.86 | 56.67 | 47.10 | 75.29 |

In one run, the counting performance matches the published results.
In the other two runs, the counting metrics (MAE/RMSE) degrade while the detection metrics (AP/AP50) improve or remain comparable. All results are still better than other methods even without the resolution upsampling!
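
For reference, this is roughly how I compute the counting metrics above (a minimal sketch of the standard per-image MAE/RMSE; `preds` and `gts` are hypothetical lists of predicted and ground-truth counts, one entry per image, not names from this repo):

```python
import math

def counting_errors(preds, gts):
    """Per-image counting errors: returns (MAE, RMSE).

    preds, gts: sequences of predicted / ground-truth object counts,
    paired by image.
    """
    assert len(preds) == len(gts) and len(preds) > 0
    diffs = [p - g for p, g in zip(preds, gts)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse

# Example: counting_errors([10, 52], [12, 50]) -> (2.0, 2.0)
```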

Validation metrics appear more stable than test metrics. I am aware of the known FSC147 test-set issue involving a problematic image, which may partially explain the higher variance on the test set.
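
To gauge how much that single image contributes to the variance, one can re-evaluate with it excluded. A minimal sketch under my own assumptions (the image id is a deliberate placeholder, and the `results` mapping is a hypothetical structure, not this repo's evaluation API):

```python
import math

# Placeholder only; I am not naming the problematic image here.
SUSPECT_IDS = {"<problematic-image-id>"}

def errors_excluding(results, excluded_ids=SUSPECT_IDS):
    """MAE/RMSE over results (image id -> (pred_count, gt_count)),
    skipping any image ids listed in excluded_ids."""
    pairs = [(p, g) for img_id, (p, g) in results.items()
             if img_id not in excluded_ids]
    mae = sum(abs(p - g) for p, g in pairs) / len(pairs)
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))
    return mae, rmse
```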

Could you please comment on:

1. Whether this apparent trade-off between counting accuracy and detection AP is expected?
2. Known sources of training instability?
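
For context on point 2: a minimal sketch of the determinism settings one can pin when comparing runs (standard Python/NumPy/PyTorch flags; nothing here is specific to this repo, and data-loader worker seeding or nondeterministic CUDA ops may still leave residual variance):

```python
import random
import numpy as np
import torch

def set_global_seed(seed: int = 0) -> None:
    """Pin the common RNG sources and request deterministic cuDNN kernels."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```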

Thank you again for the excellent work and for making the code publicly available. Any insights would be greatly appreciated.
