Attribute or other way to distinguish MCMC vs MC #239
Not all formats have iteration information. In fact, in a way, only draws_df has. The other ones just store iteration implicitly through the ordering of the draws. Adding an attribute would not be difficult, I think. The only question is how to proceed with it when the draws objects are transformed somehow. Attributes in R are timid things that tend to vanish into the dark as soon as one lightly touches the object they belong to. For rvars it might be the easiest(?) to maintain an attribute, as all transformations are fully custom there anyway. For the other objects, I am not sure. I didn't want to go through the effort of reimplementing every standard transformation such as +, * etc. for every format to make sure the attributes are kept/altered correctly. Does anybody have other ideas how to differentiate these types of draws?
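To make the "timid attributes" point concrete, here is a minimal base R sketch. The attribute name "sampler" is made up for illustration and is not part of posterior. It shows a tag surviving on the original vector but disappearing after subsetting or concatenation:

```r
# Tag a plain numeric vector of draws with a hypothetical "sampler" attribute.
draws <- structure(c(0.3, -1.2, 0.8, 2.1), sampler = "MC")
attr(draws, "sampler")

# Subsetting and concatenation return fresh vectors that no longer carry the tag.
subset_draws <- draws[1:2]
bound_draws <- c(draws, draws)
attr(subset_draws, "sampler")
attr(bound_draws, "sampler")
```

Any format that wants to keep such a tag therefore needs its own methods for at least these operations.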
Yeah, the annoying stuff to make this work in rvars has already been figured out to track chain information, so it could certainly be done. It does seem like we wouldn't want this feature to be limited to rvars though.
Yeah I feel that. If this feature is desired that may end up being the only feasible way unfortunately. In the end it may not be that hard, since most of those operations can be implemented using group generics instead of one-by-one, because we aren't changing their fundamental functionality, just passing on to the superclass and then making sure the attribute is maintained on the result. The only other mechanism I can think of is a "special" variable like the one used to store weights. Seems wasteful though since presumably it would always hold the same value for every draw.
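A rough sketch of the group-generics idea, in base R only. The class name tagged_draws, the attribute name "sampler", and the helpers new_tagged() and strip_tag() are all hypothetical and do not exist in posterior. Each method strips the tag, lets the base implementation do the work, and then re-attaches the tag:

```r
# Hypothetical S3 class: a numeric vector of draws tagged as "MC" or "MCMC".
new_tagged <- function(x, sampler) {
  stopifnot(sampler %in% c("MC", "MCMC"))
  structure(x, sampler = sampler, class = "tagged_draws")
}

# Drop the class and the tag so the base implementation can do the work.
strip_tag <- function(x) {
  attr(x, "sampler") <- NULL
  unclass(x)
}

# One method for the whole Ops group (+, -, *, /, ^, ==, <, ...).
Ops.tagged_draws <- function(e1, e2) {
  unary <- nargs() == 1L
  tags <- unique(c(attr(e1, "sampler"), if (!unary) attr(e2, "sampler")))
  if (length(tags) > 1L) stop("conflicting sampler tags; coerce explicitly first")
  op <- get(.Generic)
  out <- if (unary) op(strip_tag(e1)) else op(strip_tag(e1), strip_tag(e2))
  new_tagged(out, tags)
}

# One method for the Math group (exp, log, sqrt, abs, ...).
Math.tagged_draws <- function(x, ...) {
  new_tagged(get(.Generic)(strip_tag(x), ...), attr(x, "sampler"))
}

# Subsetting is not part of a group generic, so it needs its own method.
`[.tagged_draws` <- function(x, i) {
  new_tagged(strip_tag(x)[i], attr(x, "sampler"))
}

x <- new_tagged(c(1, 2, 3), "MC")
y <- exp(log(x * 2 + 1))[2:3]
attr(y, "sampler")
```

In this sketch, mixing differently tagged operands raises an error rather than guessing; whether that is the right default is exactly the open question in this thread.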
Now that cmdstanr is getting |
In order to implement this, it might make sense to create some systematic infrastructure for resolving subtype conflicts amongst MC / MCMC / weighted MC / weighted MCMC. This would probably have to include a way for people to do coercion manually if needed, particularly if we decide some of the subtype combinations result in an error that has to be resolved by the user. It could be helpful to fill out a table like this:
Something like a |
I'd be fine with explicit coercion, but if it is automatic then a couple of examples are
Binding MCMC and MC seems less likely, with binding as independent chains maybe a bit more likely. I would coerce to MCMC, and MC draws would lose the information that they were independent also over iterations. Resampling weighted draws with some default is a non-trivial choice. I'm not sure what the use case would be for combining non-weighted and weighted draws. For non-weighted MCMC we could assume the weights are equal. We could also consider the case of two weighted MCMC or weighted MC, but with different weights, which would make the generic math operations also complicated. For variables with equal weights there is no difference whether we do arithmetic first and resampling then or vice versa, except that the diagnostics can be better if we do arithmetic first and keep the weights.
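As a sketch of what two of the rules suggested above could look like in code (the function names resolve_sampler() and weights_or_equal() are invented here, and the rules are only the ones floated in this comment, not a decided design):

```r
# Placeholder rule: identical tags are kept; mixing MC with MCMC falls
# back to "MCMC", which discards the independence information.
resolve_sampler <- function(a, b) {
  stopifnot(a %in% c("MC", "MCMC"), b %in% c("MC", "MCMC"))
  if (identical(a, b)) a else "MCMC"
}

# Unweighted draws (w = NULL) are treated as carrying equal, normalized weights.
weights_or_equal <- function(w, n_draws) {
  if (is.null(w)) rep(1 / n_draws, n_draws) else w / sum(w)
}

resolve_sampler("MC", "MCMC")
weights_or_equal(NULL, 4)
```

Combining two sets of draws that carry different weights is deliberately left out, since that is the complicated case mentioned above.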
As @mjskay mentioned in #331, this attribute could be added to rvars. If rvars are then passed to summary functions in summarise draws, as discussed regarding weight support in #184, then summary functions could use this info too. I think this would mostly affect |
The posterior package started with a focus on multi-chain MCMC and stores chain and iteration ids. These are useful when computing multi-chain Rhat, ESS, and MCSE. It is also possible to set weights for the draws, which is useful for importance sampling. It would be useful to think about the default behavior of some functions and whether the current draws objects contain sufficient information to do the right thing. For example, if we want to compute MCSE we have 4 different cases
I guess we could assume that if iteration information is available, then the draws are from MCMC. But at the moment, we don't have support for independent (weighted) MC draws. Would it make sense to set the iteration to 1 for all independent draws? Other ideas for telling the two apart?
This issue is related to the psis() function in the loo package complaining if the r_eff argument is not set. r_eff is used to pass the earlier computed (MCMC-ESS)/S. If we could determine whether the draws are from MCMC or MC, we would not need to complain in the latter case (and could compute r_eff internally in the first case).
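To illustrate why the distinction matters, here is a small base R sketch for unweighted draws only. The function names relative_eff() and mcse_mean_sketch() and the "sampler" argument are hypothetical; this is not the loo or posterior implementation. For independent MC draws the effective sample size equals S, so r_eff is 1 and nothing needs to be supplied. For MCMC draws an ESS estimate has to come from somewhere:

```r
# r_eff is the ratio of the effective sample size to the number of draws S.
# For independent MC draws the effective sample size is S, so r_eff is 1.
# For MCMC the caller must supply an ESS estimate computed elsewhere.
relative_eff <- function(n_draws, sampler, ess = NULL) {
  if (sampler == "MC") return(1)
  if (is.null(ess)) stop("MCMC draws need an ESS estimate")
  ess / n_draws
}

# Monte Carlo standard error of the mean: sd / sqrt(effective sample size).
mcse_mean_sketch <- function(x, sampler, ess = NULL) {
  s <- length(x)
  sd(x) / sqrt(s * relative_eff(s, sampler, ess))
}

set.seed(1)
x <- rnorm(1000)
mcse_mean_sketch(x, "MC")
mcse_mean_sketch(x, "MCMC", ess = 250)
```

With a tag like this available on the draws object, a function could skip the r_eff complaint for MC draws and only insist on (or compute) an ESS for MCMC draws.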