Model performance substantially lower than reported in the paper

## Description
I attempted to train the DeepMVP phosphorylation site prediction model, but the performance is far below the results reported in the original paper. I am opening this issue to clarify whether there are missing preprocessing steps or other implementation details.

## Steps to Reproduce
1. **Dataset**  
   - Training data: [phosphorylation_st_train.tsv](https://deepmvp.ptmax.org/database/train_test/phosphorylation_sty/phosphorylation_st_train.tsv)  
   - Testing data: [phosphorylation_st_test.tsv](https://deepmvp.ptmax.org/database/train_test/phosphorylation_sty/phosphorylation_st_test.tsv)

2. **Preprocessed UniProt database**  
   FASTA headers were trimmed to retain only the UniProt ID:  
   - [swiss_prot_human_20190214.fasta](https://deepmvp.ptmax.org/download/swiss_prot_human_20190214.fasta)

3. **Model files**  
   - [Pretrained / Model repository](https://deepmvp.ptmax.org/download/models.tar.gz)

4. **Training code**
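```python
# Assumption: model_0.h5 is a Keras HDF5 model, so it is loaded with Keras' load_model.
from tensorflow.keras.models import load_model

from lib.PTModels import train_model

# Start from the released phosphorylation S/T model.
model = load_model('/home/huch/DeepMVP/DeepMVP-main/models/phosphorylation_st/model_0.h5')

train_model(
    input_data='data/raw/phos_st_training.tsv',
    test_file='data/raw/phos_st_testing.tsv',
    db='data/raw/DeepMVP/swiss_prot_human_20190214_processed.fasta',
    out_dir='./mytrain',
    peptide_length=28*2+1,  # 57
    p_model=None,
    model=model,
)
```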
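
The header trimming described in step 2 can be sketched roughly as follows. This is a minimal illustration, not the exact script used: it assumes standard UniProt headers of the form `>sp|ACCESSION|ENTRY_NAME ...`, and the function name `trim_fasta_headers` is hypothetical.

```python
import re

# Standard UniProt header: >sp|ACCESSION|ENTRY_NAME description (assumed format)
_UNIPROT_HEADER = re.compile(r"^>(?:sp|tr)\|([^|]+)\|")


def trim_fasta_headers(lines):
    """Yield FASTA lines with every header reduced to '>ACCESSION'."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.startswith(">"):
            # Sequence lines pass through unchanged.
            yield line
            continue
        match = _UNIPROT_HEADER.match(line)
        if match:
            yield ">" + match.group(1)
        else:
            # Not a UniProt-style header: keep the first token only.
            tokens = line[1:].split()
            yield ">" + (tokens[0] if tokens else "")
```

Applied to an open file handle, the generator can be written back out line by line to produce the processed FASTA passed as `db`.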

## Expected behavior

* According to the paper, the model should achieve substantially higher accuracy (>0.9).

## Observed behavior

* Training log shows:

```
5155/5155 [==============================] - ETA: 0s - loss: 0.2907 - accuracy: 0.8761
Epoch 48: val_accuracy did not improve from 0.83340
```

* This performance is significantly below the reported results, and it does not improve in subsequent epochs.
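
To make the plateau easier to compare across runs, the best validation accuracy can be pulled out of the training log with a small helper. This is a sketch only: the "did not improve" pattern is taken from the log above, while the "improved from ... to ..." pattern is an assumption about the checkpoint callback's verbose output, and `best_val_accuracy` is a hypothetical name.

```python
import re

# Pattern taken from the log above.
_STALLED = re.compile(r"Epoch (\d+): val_accuracy did not improve from ([\d.]+)")
# Assumed format of the corresponding "improved" message.
_IMPROVED = re.compile(r"Epoch (\d+): val_accuracy improved from [-\w.]+ to ([\d.]+)")


def best_val_accuracy(log_lines):
    """Return (best val_accuracy so far, last epoch seen), or (None, None)."""
    best, epoch = None, None
    for line in log_lines:
        match = _IMPROVED.search(line) or _STALLED.search(line)
        if match:
            epoch = int(match.group(1))
            # Both messages report the running best as their last number.
            best = float(match.group(2))
    return best, epoch
```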

## Additional context / Questions

* Are there additional preprocessing steps for the dataset or FASTA file that are not documented?
* Are there hyperparameters or model initialization details missing in the repository that affect performance?
* Has anyone successfully reproduced the reported results? If so, could you share your training logs or parameters?

