Try to reproduce

I tried to reproduce this work, changing only the launch script, but my results are still noticeably blurrier (left) than inference with the official weights (right). I'm not sure whether the difference comes from the training data: I used DF2K, while the paper used LSDIR.

```
# Two-GPU distributed execution
CUDA_VISIBLE_DEVICES="0,1" accelerate launch train_osediff.py \
    --pretrained_model_name_or_path 'stable-diffusion-2-1-base' \
    --ram_path 'preset/models/ram_swin_large_14m.pth' \
    --learning_rate 2.5e-5 \
    --train_batch_size 2 \
    --resolution 512 \
    --gradient_accumulation_steps 1 \
    --checkpointing_steps 500 \
    --mixed_precision 'fp16' \
    --report_to "tensorboard" \
    --seed 123 \
    --output_dir 'experiments/experiments_scratch_512/osediff' \
    --dataset_txt_paths_list 'datasets/DF2K/DF2K_HR_train_p512' \
    --dataset_prob_paths_list 1 \
    --neg_prompt "painting, oil painting, illustration, drawing, art, sketch, cartoon, CG Style, 3D render, unreal engine, blurring, dirty, messy, worst quality, low quality, frames, watermark, signature, jpeg artifacts, deformed, lowres, over-smooth" \
    --cfg_vsd 7.5 \
    --lora_rank 4 \
    --lambda_lpips 2 \
    --lambda_l2 1 \
    --lambda_vsd 1 \
    --lambda_vsd_lora 1 \
    --deg_file_path "params_realesrgan.yml" \
    --tracker_project_name "train_osediff_scratch_512" \
    --gradient_checkpointing \
    --enable_xformers_memory_efficient_attention \
    --dataloader_num_workers 16
```
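
Besides the dataset, the effective batch size may also be worth comparing against the official setup. The sketch below only does the arithmetic for the command above. It assumes `--train_batch_size` is a per-device value, as is common for `accelerate` training scripts; I have not verified this in `train_osediff.py`. The helper name `effective_batch_size` is made up for illustration.

```python
def effective_batch_size(num_gpus: int, per_device_batch: int, grad_accum_steps: int) -> int:
    """Samples contributing to one optimizer step (assumes per-device batch size)."""
    return num_gpus * per_device_batch * grad_accum_steps


# Values taken from the launch command above:
# CUDA_VISIBLE_DEVICES="0,1", --train_batch_size 2, --gradient_accumulation_steps 1
print(effective_batch_size(num_gpus=2, per_device_batch=2, grad_accum_steps=1))
```

If the official run used a larger effective batch, raising `--gradient_accumulation_steps` would be one way to match it without needing more GPU memory.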


<img width="1707" height="857" alt="Image" src="https://github.com/user-attachments/assets/9ab1ccf6-e924-4291-9ce2-ad8cc4646387" />
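
To put a number on "blurry" instead of judging by eye, one common heuristic is the variance of the Laplacian: sharper images give a higher value. The following is a minimal sketch, not a metric used by the paper. It uses only the standard library and works on 2D lists of grayscale values. The helper names and the synthetic test image are assumptions for illustration. For real outputs, the two halves of the comparison would first have to be loaded as grayscale arrays.

```python
import random
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels of a 2D list."""
    h, w = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1] - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def box_blur3(gray):
    """3x3 mean filter (valid region only), used here just to fake a blurry image."""
    h, w = len(gray), len(gray[0])
    return [
        [sum(gray[y + dy][x + dx] for dy in range(3) for dx in range(3)) / 9.0
         for x in range(w - 2)]
        for y in range(h - 2)
    ]


# Synthetic stand-ins for the two results: random texture and a blurred copy of it.
random.seed(0)
sharp = [[random.random() for _ in range(32)] for _ in range(32)]
blurry = box_blur3(sharp)

print("sharp :", laplacian_variance(sharp))
print("blurry:", laplacian_variance(blurry))
```

The score depends on image content, so it is only meaningful when comparing two restorations of the same input at the same resolution.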
