--scale_weight_norm being ignored in kohya sd-scripts, maybe due to a difference in apply_max_norm() implementations? #225

thojmr · 2024-12-06T21:09:31Z

I'm trying to narrow down why --scale_weight_norm=x is being ignored when training LoCon with kohya/sd-scripts and --network_module=lycoris.kohya.

Average Key Norm is always 0, and the keys count is a tensor instead of a plain value (missing norm.item(), like in kohya/sd-scripts?). But the main issue is that the Average Key Norm never changes and the keys are never scaled, which allows the weight norms to grow indefinitely.
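For reference, here is a rough sketch of the behavior I'd expect from max-norm scaling. This is not the actual apply_max_norm() from LyCORIS or kohya/sd-scripts; the function name apply_max_norm_sketch, the dict layout, and the helper functions are all made up for illustration, and plain Python lists stand in for tensors. It only assumes the general idea: compute each key's effective weight norm, shrink the factors when the norm exceeds the limit, and report a count and an average as plain numbers.

```python
import math


def frobenius_norm(matrix):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(v * v for row in matrix for v in row))


def matmul(a, b):
    """Plain matrix product a @ b for lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def scale_matrix(matrix, factor):
    """Multiply every entry of a matrix by a scalar."""
    return [[v * factor for v in row] for row in matrix]


def apply_max_norm_sketch(modules, max_norm):
    """Hypothetical max-norm pass over low-rank modules.

    modules maps a key name to {"up": matrix, "down": matrix, "scale": float}.
    Returns (keys_scaled, mean_norm, max_norm_seen) as plain Python numbers.
    """
    keys_scaled = 0
    norms = []
    for module in modules.values():
        # Effective weight delta contributed by this key.
        delta = scale_matrix(matmul(module["up"], module["down"]), module["scale"])
        norm = frobenius_norm(delta)
        if norm > max_norm:
            # Split the correction evenly between both factors, so the
            # product (and therefore its norm) shrinks by exactly `ratio`.
            ratio = max_norm / norm
            sqrt_ratio = math.sqrt(ratio)
            module["up"] = scale_matrix(module["up"], sqrt_ratio)
            module["down"] = scale_matrix(module["down"], sqrt_ratio)
            keys_scaled += 1
            norm = max_norm
        norms.append(norm)
    mean_norm = sum(norms) / len(norms) if norms else 0.0
    max_seen = max(norms) if norms else 0.0
    return keys_scaled, mean_norm, max_seen


if __name__ == "__main__":
    example = {
        "big": {"up": [[3.0], [4.0]], "down": [[1.0, 0.0]], "scale": 1.0},
        "small": {"up": [[0.3], [0.4]], "down": [[1.0, 0.0]], "scale": 1.0},
    }
    print(apply_max_norm_sketch(example, max_norm=1.0))
```

With a pass like this, the average norm is nonzero as soon as any key has nonzero weights, and the scaled count is a plain integer, which is what I'd expect to see in the training log.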

It works with the default LoCon implementation in kohya/sd-scripts (--network_module=networks.lora), as seen here

I can confirm the issue exists in both the main and dev branches of LyCORIS, but that's as far as I have dug into it, as the math behind it is not my specialty. If you need training configs, just let me know.

Thoughts?

KohakuBlueleaf · 2024-12-09T13:19:10Z

Should be fixed in the latest version (3.1.1).

thojmr · 2024-12-09T19:35:47Z

I tested the changes in 3.1.1. The tensor value is now an integer, which is progress, but the Average Key Norm never changes from 0.

It should be a nonzero value as early as step 2. I tried with --network_module=lycoris.kohya AND "algo=locon", and then with --network_module=lycoris.kohya by itself, just to see if a different module would work. It still didn't budge from 0.
