Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmdstan stops working with Intel MKL #1050

Open
wds15 opened this issue Oct 11, 2021 · 11 comments
Open

cmdstan stops working with Intel MKL #1050

wds15 opened this issue Oct 11, 2021 · 11 comments

Comments

@wds15
Copy link
Contributor

wds15 commented Oct 11, 2021

Summary:

The attached model used to work with previous (pre 2.28.0) MKL installations, but it fails to compile now with 2.28.0.

Description:

With the MKL being used the latest cmdstan 2.28.0 stops compiling with the model below.

make/local:

MKL_CBWR=AVX
CC = gcc
CXX = g++
C_MKL=-I ${MKLROOT}/include -march=ivybridge -DEIGEN_USE_MKL_ALL -Wno-enum-compare
LD_MKL=-L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lm -ldl -Wl,-rpath=${MKLROOT}/lib/intel64,--no-as-needed
# which need to be passed to compiler/linker options
CXXFLAGS += $(C_MKL)
CFLAGS += $(C_MKL)
LDFLAGS += $(LD_MKL)
STAN_THREADS=true

blrm_exnex-combo3_data_R.txt

blrm-exnex-28_stan.txt

Error message during compilation (g++ 8.2.0):

stan/lib/stan_math/stan/math/prim/fun/lb_constrain.hpp:175:49:   required from ‘auto stan::math::lb_constrain(const std::vector<T1>&, const L&, stan::return_type_t<T1, T2>&) [with T = std::vector<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, std::allocator<Eigen::Matrix<stan::math::var_value<double>, -1, 1> > >; L = int; stan::require_not_std_vector_t<T_high>* <anonymous> = 0; stan::return_type_t<T1, T2> = stan::math::var_value<double>]’
stan/src/stan/io/deserializer.hpp:386:38:   required from ‘auto stan::io::deserializer<T>::read_constrain_lb(const LB&, LP&, Sizes ...) [with Ret = std::vector<std::vector<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, std::allocator<Eigen::Matrix<stan::math::var_value<double>, -1, 1> > >, std::allocator<std::vector<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, std::allocator<Eigen::Matrix<stan::math::var_value<double>, -1, 1> > > > >; bool Jacobian = false; LB = int; LP = stan::math::var_value<double>; Sizes = {int, int, int}; T = stan::math::var_value<double>]’
/home/weberse2/t/blrm-exnex-28.hpp:4732:24:   required from ‘stan::scalar_type_t<T2> blrm_exnex_28_model_namespace::blrm_exnex_28_model::log_prob_impl(VecR&, VecI&, std::ostream*) const [with bool propto__ = false; bool jacobian__ = false; VecR = Eigen::Matrix<stan::math::var_value<double>, -1, 1>; VecI = Eigen::Matrix<int, -1, 1>; stan::require_vector_like_t<VecR>* <anonymous> = 0; stan::require_vector_like_vt<std::is_integral, VecI>* <anonymous> = 0; stan::scalar_type_t<T2> = stan::math::var_value<double>; std::ostream = std::basic_ostream<char>]’
/home/weberse2/t/blrm-exnex-28.hpp:6142:77:   required from ‘T_ blrm_exnex_28_model_namespace::blrm_exnex_28_model::log_prob(Eigen::Matrix<T_job_param, -1, 1>&, std::ostream*) const [with bool propto__ = false; bool jacobian__ = false; T_ = stan::math::var_value<double>; std::ostream = std::basic_ostream<char>]’
stan/src/stan/model/model_base_crtp.hpp:96:77:   required from ‘stan::math::var stan::model::model_base_crtp<M>::log_prob(Eigen::Matrix<stan::math::var_value<double>, -1, 1>&, std::ostream*) const [with M = blrm_exnex_28_model_namespace::blrm_exnex_28_model; stan::math::var = stan::math::var_value<double>; std::ostream = std::basic_ostream<char>]’
stan/src/stan/model/model_base_crtp.hpp:93:20:   required from here
stan/lib/stan_math/lib/eigen_3.3.9/Eigen/src/Core/ArrayWrapper.h:80:48: error: passing ‘const NestedExpressionType’ {aka ‘const Eigen::CwiseUnaryView<Eigen::MatrixBase<Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >::val_Op, Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >’} as ‘this’ argument discards qualifiers [-fpermissive]
       return m_expression.coeffRef(rowId, colId);
                                                ^
In file included from stan/lib/stan_math/lib/eigen_3.3.9/Eigen/Core:439,
                 from stan/lib/stan_math/lib/eigen_3.3.9/Eigen/Dense:1,
                 from stan/lib/stan_math/stan/math/prim/fun/Eigen.hpp:22,
                 from stan/lib/stan_math/stan/math/rev.hpp:4,
                 from stan/lib/stan_math/stan/math.hpp:19,
                 from stan/src/stan/model/model_header.hpp:4,
                 from /home/weberse2/t/blrm-exnex-28.hpp:3:
stan/lib/stan_math/lib/eigen_3.3.9/Eigen/src/Core/DenseCoeffsBase.h:340:33: note:   in call to ‘Eigen::DenseCoeffsBase<Derived, 1>::Scalar& Eigen::DenseCoeffsBase<Derived, 1>::coeffRef(Eigen::Index, Eigen::Index) [with Derived = Eigen::CwiseUnaryView<Eigen::MatrixBase<Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >::val_Op, Eigen::Map<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, 0, Eigen::Stride<0, 0> > >; Eigen::DenseCoeffsBase<Derived, 1>::Scalar = double; Eigen::Index = long int]’
     EIGEN_STRONG_INLINE Scalar& coeffRef(Index row, Index col)
                                 ^~~~~~~~

This is with MKL version 2019.1.144

Reproducible Steps:

Current Output:

The current output. Knowing what is the current behavior is useful.

Expected Output:

It should just work as it did with previous versions. The last version we have tested this setup ok is cmdstan 2.21.

Additional Information:

NA

Current Version:

v2.28.0

@rok-cesnovar
Copy link
Member

Just to double check, it is working with 2.27.0? I am not sure from your post as you mention 2.21.

@wds15
Copy link
Contributor Author

wds15 commented Oct 11, 2021

No.. 2.27.0 also fails with the same error.

@wds15
Copy link
Contributor Author

wds15 commented Oct 11, 2021

2.26.0 works OK.

@wds15
Copy link
Contributor Author

wds15 commented Oct 12, 2021

@andrjohns maybe this easy for you: To me it looks as if Eigen integrates with the MKL using the plugin system. This is apparently interfering with the use of the plugin system as coded up for Stan-math. There are comments in stan/math/prim/fun/Eigen.hpp which talk about multiple plugins, but they don't really make sense to me. Maybe you can have a look or comment? Let me know if you need more info. As stated above: Things stopped working 2.27.0 while they were ok with 2.26.0...if that helps. Thanks!

Have a look at: https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/Assign_MKL.h

@andrjohns
Copy link
Contributor

Sounds interesting, will take a look!

@andrjohns
Copy link
Contributor

Alright, I can replicate locally with the MKL when just running the unit tests for the MatrixBase plugins. Will sort out a fix asap

@andrjohns
Copy link
Contributor

@wds15 & @rok-cesnovar, it's an Eigen bug, not Stan. The macros that Eigen are using for generating the MKL code (in the header that Sebastion linked before) are only defined for const inputs on which a CWiseUnaryOp can be called.

So when calling them on a non-const input, it throws the error about discarding attributes. I'll open a bug with Eigen and see if we can patch locally in the interim

@rok-cesnovar
Copy link
Member

Thank you very much for digging into this!

@wds15
Copy link
Contributor Author

wds15 commented Oct 12, 2021

I'd think, that it would be nice to have this in our 2.28.1 release as well... if this is easy for you to handle and you got the time to do it soon.

Thanks!

@andrjohns
Copy link
Contributor

Just a heads up that this is going to be a larger refactor and I don't think should hold up 2.28

@wds15
Copy link
Contributor Author

wds15 commented Oct 14, 2021

Thanks for the heads up…take your time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants