Should we try to infer a default kernel when none is provided? #338

Open
mgasner opened this issue Apr 3, 2019 · 1 comment

Comments

@mgasner
Contributor

mgasner commented Apr 3, 2019

Right now, when calling execute_notebook, papermill requires either an explicitly specified kernel_name, or a kernelspec to be present in the notebook's metadata.

This complicates situations a) where we are writing notebooks that are compatible across the py2/py3 language barrier and b) where we are distributing notebooks to the public.

In the first case, if we specify either a python2 or python3 kernel in our notebooks, they will work out of the box (on a default install) on one Python version but not on the other, even if the notebook code itself is compatible with both. There are many workarounds, but all are burdensome and distracting.

In the second case, regardless of what kernel we specify in our notebooks, we are not guaranteed that it will be present in the environment to which we are distributing notebooks.

Granted that it is impossible to magically resolve these issues in general, does anyone see a strong downside to trying to infer a kernel when it is not specified, perhaps following the scheme in #262 (looking at the language), or perhaps looking at Jupyter settings such as MappingKernelManager.default_kernel_name (but cf. jupyter/notebook#3338)?
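A rough sketch of what such a fallback might look like, as plain Python over the notebook's metadata dictionary. Everything here is an assumption for illustration: the function name `infer_kernel_name`, the `installed` mapping of kernel name to language, and the order of the fallbacks are hypothetical and are not papermill's behavior.

```python
from typing import Mapping, Optional


def infer_kernel_name(
    nb_metadata: Mapping,
    installed: Mapping[str, str],
    default: Optional[str] = None,
) -> Optional[str]:
    """Pick a kernel name for a notebook, or None if nothing fits.

    `installed` maps kernel name -> language, e.g. {"python3": "python"}.
    (Hypothetical helper; not part of papermill.)
    """
    kernelspec = nb_metadata.get("kernelspec", {})

    # 1. Honor the notebook's own kernelspec when that kernel is installed.
    name = kernelspec.get("name")
    if name in installed:
        return name

    # 2. Otherwise fall back to any installed kernel for the same language.
    language = (
        nb_metadata.get("language_info", {}).get("name")
        or kernelspec.get("language")
    )
    if language:
        for kernel_name in sorted(installed):  # sorted for determinism
            if installed[kernel_name] == language:
                return kernel_name

    # 3. Finally, try a configured default (e.g. taken from Jupyter settings).
    if default in installed:
        return default
    return None
```

With this, a notebook saved against python2 would still resolve to python3 on a machine that only has python3 installed, because both kernels report the language python. The `installed` mapping would have to be built from whatever kernelspecs the environment actually exposes.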

@mgasner mgasner changed the title Default kernel when none is provided Should we try to infer a default kernel when none is provided? Apr 3, 2019
@MSeal
Member

MSeal commented Apr 4, 2019

So think of papermill's kernel assignment as an override tool. It's usually not used and one relies on the notebook's metadata to make a decision.

You're right that the notebook format doesn't have a place to specify that a notebook can be run with multiple kernels, just fallback mechanisms for when the specified kernel is missing.

I'd actually follow up on this conversation at https://github.com/jupyter/nbformat/ and/or https://discourse.jupyter.org/, as this may be something we want to include in nbformat 5.0. Today the 4.4 spec requires that the kernel name and display_name be present in the document: https://github.com/jupyter/nbformat/blob/master/nbformat/v4/nbformat.v4.schema.json#L13-L27 which does force the kernel into 2 vs 3 in the Python case.
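To make the constraint concrete, here is a small sketch of the kernelspec metadata shape, assuming the linked schema lines describe a kernelspec object whose required keys are name and display_name. The helper `missing_kernelspec_keys` and the example values are illustrative only, not part of nbformat.

```python
# Keys assumed to be required on a kernelspec object by the v4 schema.
REQUIRED_KERNELSPEC_KEYS = ("name", "display_name")


def missing_kernelspec_keys(kernelspec):
    """Return the required keys that are absent from a kernelspec dict."""
    return [key for key in REQUIRED_KERNELSPEC_KEYS if key not in kernelspec]


# A typical kernelspec entry: one concrete kernel name, so the document
# itself commits to either python2 or python3.
example_kernelspec = {
    "name": "python3",
    "display_name": "Python 3",
    "language": "python",
}

complete = missing_kernelspec_keys(example_kernelspec)
incomplete = missing_kernelspec_keys({"language": "python"})
```

Because the name field holds a single string, there is no slot in this shape for "python2 or python3"; only the optional language field is version-neutral.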

In your case, is it that you just want to advertise that the notebook supports both versions? Because you could use a generic kernel name (e.g. python) and supply a python kernel in your stack which defaults to the version you see fit but indicates that it only accepts code compatible with both 2 and 3.

Another important question: given that much of the tooling will be dropping support for Python 2 at the end of the year (some already has), how much is it worth indicating 2.7 and 3.x support? Perhaps defaulting to 3, with a metadata indicator in the notebook that it's been tested against 2.7, would be sufficient for dagstermill to either keep the 3 kernel or explicitly override it if the user lives in 2?
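As a sketch of how a caller might act on such an indicator: the metadata key `tested_python_versions` and the function `choose_kernel` below are purely hypothetical (neither nbformat nor papermill defines them), and the python2/python3 kernel names are assumed to be those of a default install.

```python
def choose_kernel(nb_metadata, host_python_major):
    """Keep the notebook's declared kernel unless the host runs Python 2
    and the notebook advertises that it was tested against a 2.x release.
    """
    # Default to the python3 kernel when the notebook declares nothing.
    declared = nb_metadata.get("kernelspec", {}).get("name", "python3")

    # Hypothetical metadata key listing versions the notebook was tested on.
    tested = nb_metadata.get("tested_python_versions", [])

    if host_python_major == 2 and any(v.startswith("2.") for v in tested):
        return "python2"  # explicit override for users living in 2
    return declared


notebook_metadata = {
    "kernelspec": {"name": "python3", "display_name": "Python 3"},
    "tested_python_versions": ["2.7", "3.6"],
}

on_py3 = choose_kernel(notebook_metadata, host_python_major=3)
on_py2 = choose_kernel(notebook_metadata, host_python_major=2)
```

The notebook still carries one concrete kernel in its kernelspec, so it stays valid under the current format, and the override decision lives entirely in the calling tool.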
