I'm trying to narrow down why `--scale_weight_norm=x` is being ignored when using `--network_module=lycoris.kohya` for LoCon training with kohya/sd-scripts. `Average Key Norm` is always 0, and the keys count is printed as a tensor instead of a plain value (missing `norm.item()` like in kohya/sd-scripts?). But the main issue is that `Average Key Norm` never changes and the keys are never scaled, which allows the weight norms to grow indefinitely.
It works with the default LoCon implementation in kohya/sd-scripts, `--network_module=networks.lora`, as seen here.
I can confirm the issue exists in both the `main` and `dev` branches of LyCORIS, but that's as far as I have dug into it, as the math behind it is not my specialty. If you need training configs, just let me know.
Thoughts?
I tested the changes in 3.1.1. The keys count is now an integer instead of a tensor, which is progress, but `Average Key Norm` never changes from 0.
It should be a nonzero value as early as step 2. I tried `--network_module=lycoris.kohya` together with `algo=locon`, and then `--network_module=lycoris.kohya` by itself, just to see if a different module would work. It still didn't budge from 0.
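For anyone following along, here is a minimal sketch of the behavior I'd expect from max-norm scaling. It is plain Python written for illustration only, not the code from kohya/sd-scripts or LyCORIS: the function name `apply_max_norm` and the dict-of-lists weight layout are made up, and the real implementations work on PyTorch tensors. The point is just that once any weight is nonzero, the reported average norm should be nonzero too, and any key above the limit should be scaled back down.

```python
import math


def apply_max_norm(weights, max_norm):
    """Hypothetical helper: clip each key's L2 norm to max_norm in place.

    weights maps a key name to a flat list of floats (an assumed layout).
    Returns (keys_scaled, average_norm, max_norm_seen) as plain Python numbers.
    """
    keys_scaled = 0
    norms = []
    for name, values in weights.items():
        norm = math.sqrt(sum(v * v for v in values))
        if norm > max_norm:
            # Rescale so this key's norm equals max_norm.
            ratio = max_norm / norm
            weights[name] = [v * ratio for v in values]
            keys_scaled += 1
            norm *= ratio
        norms.append(norm)
    average = sum(norms) / len(norms) if norms else 0.0
    largest = max(norms, default=0.0)
    return keys_scaled, average, largest


# Toy weights: key "a" has norm 5 (above the limit), key "b" has norm 0.5.
weights = {"a": [3.0, 4.0], "b": [0.3, 0.4]}
keys_scaled, average, largest = apply_max_norm(weights, max_norm=1.0)
print(keys_scaled, average, largest)
```

With these toy values, one key gets scaled and the average norm comes out nonzero, which is the behavior that seems to be missing with `lycoris.kohya`.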