--scale_weight_norm being ignored in kohya sd-scripts, maybe due to a difference in apply_max_norm() implementations? #225

Open
thojmr opened this issue Dec 6, 2024 · 2 comments

Comments

@thojmr

thojmr commented Dec 6, 2024

I'm trying to narrow down why --scale_weight_norm=x is ignored when training a LoCon with kohya/sd-scripts and --network_module=lycoris.kohya.
[Screenshot: lycoris kohya]
Average Key Norm is always 0, and the keys count is printed as a tensor instead of a plain value (is a norm.item() call missing, like the one in kohya/sd-scripts?). The main issue, though, is that Average Key Norm never changes and the keys are never scaled, which lets the weight norms grow indefinitely.

It works with the default LoCon implementation in kohya/sd-scripts (--network_module=networks.lora), as seen here:
[Screenshot: networks lora]
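
For anyone following along, here is a minimal sketch of what I understand max-norm scaling to do, and why a working version should report a nonzero average norm. It is not the code from kohya/sd-scripts or LyCORIS: the function name `apply_max_norm`, the dict layout with `up`, `down`, and `scale`, and the use of plain lists instead of tensors are all assumptions made to keep the example dependency-free.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def frobenius_norm(m):
    """Frobenius norm of a matrix stored as nested lists."""
    return math.sqrt(sum(v * v for row in m for v in row))


def apply_max_norm(modules, max_norm):
    """Rescale each module so that ||scale * up @ down|| <= max_norm.

    Hypothetical layout: every module is a dict with "up", "down" (nested
    lists) and "scale" (float). Returns (keys_scaled, average_norm,
    maximum_norm) as plain Python numbers, not tensors.
    """
    norms = []
    keys_scaled = 0
    for mod in modules:
        delta = matmul(mod["up"], mod["down"])
        norm = frobenius_norm(delta) * mod["scale"]
        if norm > max_norm:
            # Split the correction evenly between the two factors.
            factor = math.sqrt(max_norm / norm)
            mod["up"] = [[v * factor for v in row] for row in mod["up"]]
            mod["down"] = [[v * factor for v in row] for row in mod["down"]]
            keys_scaled += 1
            norm = max_norm
        norms.append(norm)
    average = sum(norms) / len(norms) if norms else 0.0
    return keys_scaled, average, max(norms, default=0.0)


# Two toy modules: the first has norm 5 (too large), the second about 0.5.
modules = [
    {"up": [[3.0], [4.0]], "down": [[1.0, 0.0]], "scale": 1.0},
    {"up": [[0.3], [0.4]], "down": [[1.0, 0.0]], "scale": 1.0},
]
keys_scaled, avg_norm, max_norm_seen = apply_max_norm(modules, max_norm=1.0)
print(keys_scaled, avg_norm, max_norm_seen)
```

In this sketch only the first module is rescaled, and the average norm comes out at roughly 0.75. An average of exactly 0 would only happen if every weight delta were zero or if no modules were visited at all, which is why a constant 0 in the training log looks like the scaling step is being skipped.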

I can confirm the issue exists in both the main and dev branches of LyCORIS, but that's as far as I have dug into it, as the math behind it is not my specialty. If you need training configs, just let me know.

Thoughts?

@KohakuBlueleaf
Owner

Should be fixed in latest version (3.1.1)

@thojmr
Author

thojmr commented Dec 9, 2024

I tested the changes in 3.1.1. The tensor value is now an integer, which is progress, but the Average Key Norm never changes from 0.
[Screenshot: still error]

It should be a nonzero value as early as step 2. I tried --network_module=lycoris.kohya together with "algo=locon", and then --network_module=lycoris.kohya by itself, just to see if a different module would work. It still didn't budge from 0.
