Issues with differentiating with respect to polynomial coefficients #578

vivekbhattacharya opened this issue Jun 27, 2024 · 1 comment


vivekbhattacharya commented Jun 27, 2024

I am interested in taking the derivative of a polynomial with respect to its coefficients. When the trailing coefficients are zero, that is, $\beta_n = 0$ for all $n \ge k$ for some $k$, automatic differentiation packages return 0 as the derivative with respect to those coefficients. Of course, the true derivative of $\beta_0 + \beta_1 x + \cdots + \beta_K x^K$ with respect to $\beta_n$ is $x^n$, regardless of the values of the coefficients.

Here is an example:

using Polynomials
using ForwardDiff

# Evaluate the polynomial with coefficient vector x at the point 0.5
f(x) = Polynomial(x)(0.5)
∇f(x) = ForwardDiff.gradient(f, x)
# Analytic gradient: ∂f/∂βₙ = 0.5^n for every coefficient βₙ
true_∇f(x) = 0.5 .^ (0:length(x)-1)

all_params = [[1.0, 2.0, 3.0], [1.0, 0.0, 3.0], [1.0, 0.0, 0.0]]
for param in all_params
    println("Params are $(param).")
    println("Gradient from ForwardDiff: $(∇f(param))")
    println("True gradient: $(true_∇f(param))")
    println()
end

The output from this example is below; note the discrepancy in the final case, where all coefficients past the constant term are zero.

Params are [1.0, 2.0, 3.0].
Gradient from ForwardDiff: [1.0, 0.5, 0.25]
True gradient: [1.0, 0.5, 0.25]

Params are [1.0, 0.0, 3.0].
Gradient from ForwardDiff: [1.0, 0.5, 0.25]
True gradient: [1.0, 0.5, 0.25]

Params are [1.0, 0.0, 0.0].
Gradient from ForwardDiff: [1.0, 0.0, 0.0]
True gradient: [1.0, 0.5, 0.25]

Is there a way to avoid this issue?
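
(An aside, not from the thread: a minimal workaround sketch, assuming only Base Julia, is to evaluate the polynomial with evalpoly directly on the coefficient vector. No Polynomial object is ever constructed, so nothing gets truncated.)

using ForwardDiff

# Horner evaluation straight from the coefficient vector; since no
# Polynomial object is built, trailing zeros are never chopped.
g(x) = evalpoly(0.5, x)

ForwardDiff.gradient(g, [1.0, 0.0, 0.0])   # returns [1.0, 0.5, 0.25]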


dbstein commented Dec 18, 2024

This issue can cause hard-to-find silent bugs (as I just painfully discovered). In my case, I have a nonlinear function $f(X)$ that looks roughly like

function f(X)
    Ŷ, S = ComplicatedStuff(X)         # problem-specific setup (placeholder)
    Z = ChebyshevT(Ŷ).(S)              # build a Chebyshev series from Ŷ, evaluate at S
    return MoreComplicatedStuff(X, Z)  # problem-specific wrap-up (placeholder)
end

and I'm looking at the Jacobian at an equilibrium, where $f(X) = 0$, which in some cases means that $\hat{Y}$ is exactly zero. The truncation-on-construction behavior then leads to an incorrect Jacobian. Once the bug was found, it was easy enough to fix, but it was quite a headache to track down.
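
For reference, the chopping can be seen at construction time (a quick sketch; printed forms may vary across Polynomials.jl versions):

using Polynomials

p = Polynomial([1.0, 0.0, 0.0])   # trailing zeros are chopped on construction
degree(p)                          # 0, not 2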

For the simple case in the post above (and for my use case), lightly modifying the constructor of MutableDensePolynomial is sufficient to give correct results:

# Treat Dual numbers as never being zero, so differentiated
# coefficients are not chopped during construction:
@inline my_iszero(x) = iszero(x)
@inline my_iszero(x::ForwardDiff.Dual) = false

struct MutableDensePolynomial{B,T,X} <: AbstractDenseUnivariatePolynomial{B,T,X}
    coeffs::Vector{T}
    function MutableDensePolynomial{B,T,X}(::Val{true}, cs::AbstractVector{S}, order::Int=0) where {B,T,X,S}
        if Base.has_offset_axes(cs)
            @warn "ignoring the axis offset of the coefficient vector"
            cs = parent(cs)
        end
        i = findlast(!my_iszero, cs)   # was: findlast(!iszero, cs)
        if i === nothing
            xs = T[]
        else
            xs = T[cs[i] for i ∈ 1:i] # make copy
        end
        if order > 0
            prepend!(xs, zeros(T, order))
        end
        new{B,T,Symbol(X)}(xs)
    end
    function MutableDensePolynomial{B,T,X}(check::Val{false}, cs::AbstractVector{T}, order::Int=0) where {B,T,X}
        new{B,T,Symbol(X)}(cs)
    end
end

The only change here is replacing the line i = findlast(!iszero, cs) with i = findlast(!my_iszero, cs), along with the definition of my_iszero, which basically says "don't truncate if you're differentiating." (On a ForwardDiff.Dual, iszero looks only at the primal value, so a coefficient whose value is zero but whose partials are not still gets chopped.) If truncation is implemented this way everywhere (and I don't know the code for this package at all), then this would be a relatively easy fix for ForwardDiff, but it would make ForwardDiff a dependency.
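
(Editorial sketch, not from the thread: on Julia 1.9+, that dependency could be made weak via a package extension. The predicate name _truncation_iszero below is hypothetical; it stands for whatever internal predicate the package would route truncation through instead of iszero.)

# Project.toml (sketch):
#
# [weakdeps]
# ForwardDiff = "<ForwardDiff's UUID>"
#
# [extensions]
# PolynomialsForwardDiffExt = "ForwardDiff"

# ext/PolynomialsForwardDiffExt.jl (sketch):
module PolynomialsForwardDiffExt

using Polynomials, ForwardDiff

# Never treat a Dual as zero when deciding whether to chop trailing
# coefficients, so differentiated coefficients survive construction.
Polynomials._truncation_iszero(::ForwardDiff.Dual) = false

end # module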
