
hotfix LSTM output #2547

Merged: 4 commits merged into master on Dec 11, 2024

Conversation

@CarloLucibello (Member) commented on Dec 11, 2024

The LSTM layer in Flux v0.15.0 returns a tuple (h, c), where size(h) == (out_dim, timesteps, batch_size). This is in contrast with every other framework (PyTorch, Flax, Lux), where only h is returned as the output. Because of this, we cannot stack LSTM layers in a Chain.

With this PR, the LSTM returns only h. While this change is breaking, v0.15 has only been around for a few days, so we can treat the current behavior as a bug introduced during the redesign in #2500, fix it now, and tag v0.15.3 as soon as possible.
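
A minimal sketch (not part of this PR's test suite) of what the fix enables; the dimensions and variable names here are illustrative assumptions:

```julia
using Flux

in_dim, hidden_dim, out_dim = 4, 8, 2
timesteps, batch_size = 10, 3

# With LSTM returning only `h`, recurrent layers compose directly in a `Chain`:
model = Chain(
    LSTM(in_dim => hidden_dim),   # h has size (hidden_dim, timesteps, batch_size)
    LSTM(hidden_dim => out_dim),  # consumes h directly, no tuple unpacking needed
)

x = rand(Float32, in_dim, timesteps, batch_size)
h = model(x)  # size (out_dim, timesteps, batch_size)
```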

cc @MartinuzziFrancesco

codecov bot commented Dec 11, 2024

Codecov Report

Attention: Patch coverage is 80.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 32.20%. Comparing base (428be48) to head (f96bd58).
Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/layers/recurrent.jl | 80.00% | 4 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2547      +/-   ##
==========================================
- Coverage   32.44%   32.20%   -0.25%     
==========================================
  Files          34       34              
  Lines        1991     1972      -19     
==========================================
- Hits          646      635      -11     
+ Misses       1345     1337       -8     

☔ View full report in Codecov by Sentry.

@MartinuzziFrancesco (Contributor)

I think the change makes sense in light of the other implementations. Pushing it as a quick hotfix shouldn't be an issue; as you said, 0.15 is pretty new. This will also probably be superseded by closing #2514 once all the changes are implemented, specifically adding num_layers to bring the implementation in line with PyTorch, Flax, etc.

@CarloLucibello CarloLucibello merged commit 130af41 into master Dec 11, 2024
9 of 11 checks passed