Releases · FluxML/Flux.jl
v0.16.0
Flux v0.16.0
Highlights
This release has a single breaking change:
- The forward pass of the recurrent cells `RNNCell`, `LSTMCell`, and `GRUCell` has changed to $y_t, state_t = cell(x_t, state_{t-1})$ (#2551). Previously, it was $state_t = cell(x_t, state_{t-1})$; see the sketch below.
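A minimal sketch of the new cell interface, assuming Flux v0.16 and the `initialstates` helper introduced in #2541; the cell type, sizes, and data are illustrative:

```julia
using Flux

cell  = LSTMCell(3 => 5)            # input size 3, hidden size 5 (illustrative sizes)
x_t   = rand(Float32, 3, 16)        # one time step for a batch of 16
state = Flux.initialstates(cell)    # initial (h, c) state for the LSTM cell

# New in v0.16: the cell returns the output *and* the updated state.
y_t, state = cell(x_t, state)
```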
Other highlights include:
- Added the `WeightNorm` normalization layer.
- Added the `Recurrence` layer, turning a recurrent layer into a layer that processes an entire sequence at once (see the sketch below).
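A rough sketch of the two new layers, assuming `Recurrence` wraps a recurrent cell and maps it over the time dimension of a `(features, time, batch)` array, and assuming `WeightNorm` takes the wrapped layer plus the name of the parameter to reparametrize; all shapes and arguments here are illustrative:

```julia
using Flux

# Recurrence: apply a recurrent cell across a whole sequence at once.
seq_layer = Recurrence(RNNCell(4 => 8))
x = rand(Float32, 4, 20, 16)        # 4 features, 20 time steps, batch of 16
y = seq_layer(x)                    # outputs for every time step

# WeightNorm: reparametrize a layer's weight as magnitude times direction.
wn = WeightNorm(Dense(4 => 8), :weight)
wn(rand(Float32, 4, 16))
```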
Merged pull requests:
- Recurrence layer (#2549) (@CarloLucibello)
- Add `WeightNorm` reparametrization (#2550) (@pxl-th)
- Change cells' return to `out, state` (#2551) (@CarloLucibello)
- fix: `gpu_device` not defined in `Flux.DistributedUtils` (#2552) (@AntonOresten)
v0.15.2
v0.15.1
Flux v0.15.1
Merged pull requests:
- Re-write "basics" page of docs (#2535) (@mcabbott)
- Adding initialstates function to RNNs (#2541) (@MartinuzziFrancesco)
- Update NEWS.md highlighting breaking changes (#2542) (@CarloLucibello)
- relax identity test for devices (#2544) (@CarloLucibello)
- fix `Flux.@functor` (#2546) (@CarloLucibello)
Closed issues:
- `Flux.@functor` is broken on 0.15 (#2545)
v0.15.0
Flux v0.15.0
Highlights
This release includes two breaking changes:
- The recurrent layers have been thoroughly revised. See below and read the documentation for details.
- Flux now defines and exports its own `gradient` function. Consequently, using `gradient` in an unqualified manner (e.g., after `using Flux, Zygote`) could result in an ambiguity error; see the sketch after this list.
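A minimal sketch of avoiding the ambiguity by qualifying the call; the model, data, and loss are illustrative:

```julia
using Flux, Zygote

m = Dense(2 => 1)
x, y = rand(Float32, 2, 8), rand(Float32, 1, 8)
loss(m, x, y) = Flux.mse(m(x), y)

# After `using Flux, Zygote`, a bare `gradient` may be ambiguous;
# qualify the module explicitly (or drop the extra `using Zygote`):
grads = Flux.gradient(m -> loss(m, x, y), m)
```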
The most significant updates and deprecations are as follows:
- Recurrent layers have undergone a complete redesign in PR #2500:
  - `RNNCell`, `LSTMCell`, and `GRUCell` are now exported and provide functionality for single time-step processing: `rnncell(x_t, h_t) -> h_{t+1}`.
  - `RNN`, `LSTM`, and `GRU` no longer store the hidden state internally; it has to be explicitly passed to the layer. Moreover, they now process entire sequences at once, rather than one element at a time: `rnn(x, h) -> h′`.
  - The `Recur` wrapper has been deprecated and removed.
  - The `reset!` function has also been removed; state management is now entirely up to the user.
- The `Flux.Optimise` module has been deprecated in favor of the Optimisers.jl package. Flux now re-exports the optimisers from Optimisers.jl. Most users will be unaffected by this change. The module is still available for now, but will be removed in a future release.
- Most Flux layers will re-use memory via `NNlib.bias_act!`, when possible.
- Further support for Enzyme.jl, via methods of `Flux.gradient(loss, Duplicated(model))`. Flux now owns & exports `gradient` and `withgradient`, but without `Duplicated` these still default to calling Zygote.jl.
- `Flux.params` has been deprecated. Use Zygote's explicit differentiation instead, `gradient(m -> loss(m, x, y), model)`, or use `Flux.trainables(model)` to get the trainable parameters; see the sketch after this list.
- Flux now requires Functors.jl v0.5. This new release of Functors assumes all types to be functors by default. Therefore, applying `Flux.@layer` or `Functors.@functor` to a type is no longer strictly necessary for Flux's models. However, it is still recommended to use `@layer Model` for additional functionality like pretty printing.
- `@layer Model` now behaves the same as `@layer :expand Model`, which means that the model is expanded into its sublayers (if there are any) when printed. To force compact printing, use `@layer :noexpand Model`.
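A minimal sketch of the explicit-mode replacement for the deprecated `Flux.params` workflow, using the optimisers re-exported from Optimisers.jl; the model, data, and optimiser settings are illustrative:

```julia
using Flux

model = Chain(Dense(2 => 8, relu), Dense(8 => 1))
x, y  = rand(Float32, 2, 32), rand(Float32, 1, 32)

opt_state = Flux.setup(Adam(1e-3), model)             # replaces the old implicit optimiser API

grads = Flux.gradient(m -> Flux.mse(m(x), y), model)  # explicit differentiation, no `params`
Flux.update!(opt_state, model, grads[1])

ps = Flux.trainables(model)                           # flat list of trainable parameter arrays
```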
Merged pull requests:
- Use `NNlib.bias_act!` (#2327) (@mcabbott)
- Allow `Parallel(+, f)(x, y, z)` to work like broadcasting, and enable `Chain(identity, Parallel(+, f))(x, y, z)` (#2393) (@mcabbott)
- Epsilon change in normalise for stability (#2421) (@billera)
- Add more `Duplicated` methods for Enzyme.jl support (#2471) (@mcabbott)
- Export Optimisers and remove params and Optimise from tests (#2495) (@CarloLucibello)
- RNNs redesign (#2500) (@CarloLucibello)
- Adjust docs & `Flux.@functor` for Functors.jl v0.5, plus misc. depwarns (#2509) (@mcabbott)
- GPU docs (#2510) (@mcabbott)
- CompatHelper: bump compat for Optimisers to 0.4, (keep existing compat) (#2520) (@github-actions[bot])
- Distinct init for kernel and recurrent (#2522) (@MartinuzziFrancesco)
- Functors v0.5 + tighter version bounds (#2525) (@CarloLucibello)
- deprecation of params and Optimise (continued) (#2526) (@CarloLucibello)
- Bump codecov/codecov-action from 4 to 5 (#2527) (@dependabot[bot])
- updates for Functors v0.5 (#2528) (@CarloLucibello)
- fix comment (#2529) (@oscardssmith)
- set expand option as default for `@layer` (#2532) (@CarloLucibello)
- misc stuff for v0.15 release (#2534) (@CarloLucibello)
- Tweak quickstart.md (#2536) (@mcabbott)
- Remove usage of global variables in linear and logistic regression tutorial training functions (#2537) (@christiangnrd)
- Fix linear regression example (#2538) (@christiangnrd)
- Update gpu.md (#2539) (@AdamWysokinski)
Closed issues:
- RNN layer to skip certain time steps (like `Masking` layer in keras) (#644)
- Backprop through time (#648)
- Initial state in RNNs should not be learnable by default (#807)
- Bad recurrent layers training performance (#980)
- flip function assumes the input sequence is a Vector or List, it can be Matrix as well. (#1042)
- Regression in package load time (#1155)
- Recurrent layers can't use Zeros() as bias (#1279)
- Flux.destructure doesn't preserve RNN state (#1329)
- RNN design for efficient CUDNN usage (#1365)
- Strange result with gradient (#1547)
- Call of Flux.stack results in StackOverfloxError for approx. 6000 sequence elements of a model output of a LSTM (#1585)
- Gradient dimension mismatch error when training rnns (#1891)
- Deprecate Flux.Optimisers and implicit parameters in favour of Optimisers.jl and explicit parameters (#1986)
- Pull request #2007 causes Flux.params() calls to not get cached (#2040)
- gradient of `Flux.normalise` return NaN when `std` is zero (#2096)
- explicit differentiation for RNN gives wrong results (#2185)
- Make RNNs blocked (and maybe fixing gradients along the way) (#2258)
- Should everything be a functor by default? (#2269)
- Flux new explicit API does not work but old implicit API works for a simple RNN (#2341)
- Adding Simple Recurrent Unit as a recurrent layer (#2408)
- deprecate Flux.params (#2413)
- Implementation of `AdamW` differs from PyTorch (#2433)
- `gpu` should warn if cuDNN is not installed (#2440)
- device movement behavior inconsistent (#2513)
- mark as public any non-exported but documented interface (#2518)
- broken image in the quickstart (#2530)
- Consider making the `:expand` option the default in `@layer` (#2531)
- `Flux.params` is broken (#2533)
v0.14.25
Flux v0.14.25
Merged pull requests:
- reintroduce FluxCUDAAdaptor etc.. to smooth out the transition (#2512) (@CarloLucibello)
v0.14.24
v0.14.23
Flux v0.14.23
Merged pull requests:
- Support for lecun normal weight initialization (#2311) (@RohitRathore1)
- Some small printing upgrades (#2344) (@mcabbott)
- simplify test machinery (#2498) (@CarloLucibello)
- Correct dead link for "quickstart page" in README.md (#2499) (@zengmao)
- make `gpu(x) = gpu_device()(x)` (#2502) (@CarloLucibello)
- some cleanup (#2503) (@CarloLucibello)
- unbreak some data movement cuda tests (#2504) (@CarloLucibello)
Closed issues:
- Add support for lecun normal weight initialization (#2290)
- `using Flux, cuDNN` freezes, but `using Flux, CUDA, cuDNN` works (#2346)
- Problem with RNN and CUDA. (#2352)
- since new version: Flux throws error when for train! / update! even on quick start problem (#2358)
- Cannot take `gradient` of L2 regularization loss (#2441)
- Potential bug of RNN training flow (#2455)
- Problem with documentation (#2485)
- Flux has no Lecun Normalization weight init function? (#2491)
- Zygote fails to differentiate through Flux.params on julia v0.11 (#2497)
- ERROR: UndefVarError: `ADAM` not defined in `Main` in flux (#2507)
v0.14.22
Flux v0.14.22
Merged pull requests:
- Bump actions/checkout from 4.2.0 to 4.2.1 (#2489) (@dependabot[bot])
- handle data movement with MLDataDevices.jl (#2492) (@CarloLucibello)
- remove some v0.13 deprecations (#2493) (@CarloLucibello)
v0.14.21
Flux v0.14.21
Merged pull requests:
- Update ci.yml for macos-latest to use aarch64 (#2481) (@ViralBShah)
- Remove leading empty line in example (#2486) (@blegat)
- Bump actions/checkout from 4.1.7 to 4.2.0 (#2487) (@dependabot[bot])
- fix: CUDA package optional for FluxMPIExt (#2488) (@askorupka)
v0.14.20
Flux v0.14.20
Merged pull requests:
- feat: Distributed data parallel training support (#2464) (@askorupka)
- Run Enzyme tests only on CUDA CI machine (#2478) (@pxl-th)
- Adapt to pending Enzyme breaking change (#2479) (@wsmoses)
- Update TagBot.yml (#2480) (@ViralBShah)
- Bump patch version (#2483) (@wsmoses)