How to train a DropoutMLP
#88
-
I can train a normal MLP, but how should I train a DropoutMLP?
-
One way to adapt the vanilla MLP training test to support a `DropoutMLP` is to construct the `RandomStream` inside the loss function using the `rng` argument:

```python
# Imports assumed for a self-contained example (exact module paths may vary
# between Penzai versions):
import jax
import jax.numpy as jnp
import optax
from penzai import pz
from penzai.models import simple_mlp
from penzai.toolshed import basic_training

mlp = simple_mlp.DropoutMLP.from_config(  # <- using a DropoutMLP instead of a regular MLP
    name="mlp",
    init_base_rng=jax.random.key(0),
    feature_sizes=[2, 32, 32, 2],
    drop_rate=0.2,
)

const_xor_inputs = pz.nx.wrap(
    jnp.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=jnp.float32),
    "batch",
    "features",
)
const_xor_labels = jnp.array(
    [[0, 1], [1, 0], [1, 0], [0, 1]], dtype=jnp.float32
)


def loss_fn(model, rng, state, xor_inputs, xor_labels):
  # Change starts here!
  # First build a random stream from the `rng`, then pass it to the model.
  random_stream = pz.RandomStream.from_base_key(rng)
  scale = 1 + jax.random.uniform(random_stream.next_key(), shape=(1,))
  model_out = model(xor_inputs, random_stream=random_stream)
  # (end change)
  log_probs = jax.nn.log_softmax(
      model_out.unwrap("batch", "features"), axis=-1
  )
  losses = -scale * log_probs * xor_labels
  loss = jnp.sum(losses) / 4
  return (loss, state + 1, {"loss": loss, "count": state})


trainer = basic_training.StatefulTrainer.build(
    root_rng=jax.random.key(42),
    model=mlp,
    optimizer_def=optax.adam(0.1),
    loss_fn=loss_fn,
    initial_loss_fn_state=100,
)

outputs = []
for _ in range(100):
  outputs.append(
      trainer.step(xor_inputs=const_xor_inputs, xor_labels=const_xor_labels)
  )
```
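For a quick sanity check afterwards, you can look at the recorded losses. This assumes, as in the vanilla MLP training test, that `trainer.step` returns the auxiliary dictionary produced by `loss_fn`:

```python
# Sketch of a post-training check: `outputs` collects the aux dict
# ({"loss": ..., "count": ...}) returned by loss_fn at every step, so the
# loss after 100 steps should be much smaller than at the start.
print("first loss:", outputs[0]["loss"])
print("final loss:", outputs[-1]["loss"])
assert outputs[-1]["loss"] < outputs[0]["loss"]
```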
It sounds like maybe you were trying to pass a RandomStream through …

Adding this as an additional test is a good suggestion, thanks!