Details of training script #4

Open
dreminm opened this issue Dec 23, 2024 · 2 comments


dreminm commented Dec 23, 2024

Hello!

I'm trying to understand the model training process and noticed the parameter --data_path ./llava/state3_after.json. Could you provide more details about this file? As far as I understand, this script performs stage-3 fine-tuning. Do you have a script for pretraining your model with distortion classification and score classification (stages 1 and 2)?

Collaborator

jzhws commented Dec 24, 2024

The three stages are trained with the same script, with only minor adjustments, so you can use this code to train all three stages. Note that the stage-1 data is private, so you may only be able to train stage 2 and stage 3.

Author

dreminm commented Jan 13, 2025

Hello! I appreciate your response!

The paper mentions that you used 11 distortions in model pretraining: distortions = ["compression artifact", "spatial blur", "motion blur", "noise", "overexposure", "underexposure", "low contrast", "high contrast", "oversaturation", "desaturation", "block effect"]

However, in the study you referenced (LIQE), the artifacts are labeled as follows:

koniq10k = ['other realistic', 'overexposure', 'realistic blur', 'realistic contrast change', 'realistic noise', 'underexposure']
kadid10k = ['brighten', 'color blocking', 'color quantization', 'contrast', 'darken', 'desaturation', 'diffusion', 'gaussian blur', 'impulse', 'jitter', 'jpeg compression', 'jpeg2000 compression', 'lens', 'motion', 'multiplicative', 'noneccentricity patch', 'oversaturation', 'pixelate', 'quantization', 'sharpness', 'shifting', 'shifting the mean', 'white noise', 'white noise with denoise', 'white with color']
Where did you derive the artifacts ['block effect', 'compression artifact', 'high contrast', 'low contrast', 'motion blur', 'noise', 'spatial blur'] from? Is there any mapping between the labeling in LIQE and yours?
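
For reference, here is a small sketch (not taken from the repository) of how the list of unmatched labels above can be reproduced. It assumes that an exact string comparison against the LIQE label lists quoted in this comment is the right check; the variable names are illustrative only.

```python
# Labels copied from this thread; variable names are illustrative.
paper_distortions = [
    "compression artifact", "spatial blur", "motion blur", "noise",
    "overexposure", "underexposure", "low contrast", "high contrast",
    "oversaturation", "desaturation", "block effect",
]
liqe_koniq10k = [
    "other realistic", "overexposure", "realistic blur",
    "realistic contrast change", "realistic noise", "underexposure",
]
liqe_kadid10k = [
    "brighten", "color blocking", "color quantization", "contrast", "darken",
    "desaturation", "diffusion", "gaussian blur", "impulse", "jitter",
    "jpeg compression", "jpeg2000 compression", "lens", "motion",
    "multiplicative", "noneccentricity patch", "oversaturation", "pixelate",
    "quantization", "sharpness", "shifting", "shifting the mean",
    "white noise", "white noise with denoise", "white with color",
]

# Union of both LIQE label sets, compared by exact string match (an assumption).
liqe_labels = set(liqe_koniq10k) | set(liqe_kadid10k)

shared = sorted(d for d in paper_distortions if d in liqe_labels)
unmatched = sorted(d for d in paper_distortions if d not in liqe_labels)

print("shared with LIQE:", shared)
print("no exact match in LIQE:", unmatched)
```

Only four of the 11 labels appear verbatim in LIQE; the remaining seven are the ones asked about above. The check says nothing about how near-matches such as 'contrast' or 'gaussian blur' may have been grouped.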
