Hi, we are considering adopting Papermill for parameterizing and running our notebooks, but the main thing stopping us is the lack of support for Zeppelin. Our notebooks are a mix of Jupyter and Zeppelin, and having the ability to run both with the same library would be invaluable.
I was wondering if that is something that has been discussed before, and if this is something that would be a good fit for Papermill?
If this is something that would be of interest, I would be happy to try contributing something there.
So from what I understand, this is slightly tricky because of the way that Zeppelin thinks about a notebook file. Under the hood there is a note.json. I was not able to track down a spec for that file, so we may have no guarantees about what we can expect to find there.
Because Zeppelin doesn't seem to have a standard, versioned spec that we can adhere to, it can be tricky to parameterise. It would likely require creating a library like nbformat for Zeppelin notebooks that would plug into what we're currently doing with nbformat to parameterise Jupyter notebooks.
Additionally, I'm not sure how the system thinks about metadata…so while it might be possible to apply tags to cells, we may need to figure out a different convention for labeling cells as holding parameters.
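To make the labeling question concrete, here is a minimal sketch of one possible convention. It assumes the note.json holds a `paragraphs` list whose entries carry `title` and `text` fields, which is unverified given the missing spec. Treating a paragraph titled `parameters` as the parameter holder is a hypothetical convention, not something Zeppelin or papermill defines, and the `name = value` lines assume a Python-style interpreter.

```python
import copy

# Hypothetical convention: the paragraph titled "parameters" holds the defaults.
PARAMETERS_TITLE = "parameters"


def inject_parameters(note, params):
    """Return a copy of `note` with an overrides paragraph after the parameters one."""
    new_note = copy.deepcopy(note)
    paragraphs = new_note.get("paragraphs", [])
    for index, paragraph in enumerate(paragraphs):
        if paragraph.get("title") == PARAMETERS_TITLE:
            break
    else:
        raise ValueError("no paragraph titled %r found" % PARAMETERS_TITLE)

    # Reuse the interpreter directive (e.g. "%pyspark") of the parameters paragraph.
    tokens = (paragraph.get("text") or "").split(None, 1)
    lines = [tokens[0]] if tokens and tokens[0].startswith("%") else []
    # Assumes Python-style assignment syntax; other interpreters would need their own.
    lines += ["%s = %r" % (name, value) for name, value in sorted(params.items())]

    injected = {"title": "injected-parameters", "text": "\n".join(lines)}
    paragraphs.insert(index + 1, injected)
    return new_note
```

This mirrors how papermill adds an injected cell after the tagged parameters cell in a Jupyter notebook, so the defaults stay in place and the overrides run right after them.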
I think there's definitely room in papermill for processing Zeppelin notebooks. As M mentioned, Zeppelin definitely uses a different format than Jupyter, so it'd require a few components to get some abstraction upgrades.
The first abstraction that needs adjusting is the node formatting. We'd need something to load the note.json into nbformat or an nbformat-like object for processing. Parameterization would then need to apply to both notebook formats in a similar manner, or we'd need parameterization to be more abstract if an nbformat-like memory store is out. Either way, this might require upgrading parameterization to a more plug-and-play pattern, like we do with other components of papermill.
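As a rough illustration of the loading step, the sketch below maps a parsed note.json onto a plain dict that loosely follows the nbformat v4 layout. The `paragraphs`, `text`, `id`, `title`, and `name` fields are assumptions about the Zeppelin file, and `note_to_notebook_dict` is a hypothetical helper name. The result is not validated against the nbformat schema.

```python
def note_to_notebook_dict(note):
    """Map a parsed Zeppelin note onto an nbformat-v4-shaped dict (unvalidated)."""
    cells = []
    for paragraph in note.get("paragraphs", []):
        text = paragraph.get("text") or ""
        # Assumption: a paragraph may start with an interpreter directive like "%md".
        tokens = text.split(None, 1)
        directive = tokens[0] if tokens and tokens[0].startswith("%") else ""
        metadata = {
            "zeppelin": {
                "id": paragraph.get("id"),
                "title": paragraph.get("title"),
                "interpreter": directive,
            }
        }
        if directive == "%md":
            # Markdown paragraphs become markdown cells, minus the directive.
            body = tokens[1] if len(tokens) > 1 else ""
            cells.append({"cell_type": "markdown", "metadata": metadata, "source": body})
        else:
            # Everything else becomes a code cell with the directive left in the source.
            cells.append({
                "cell_type": "code",
                "metadata": metadata,
                "source": text,
                "execution_count": None,
                "outputs": [],
            })
    return {
        "nbformat": 4,
        "nbformat_minor": 2,
        "metadata": {"zeppelin": {"name": note.get("name")}},
        "cells": cells,
    }
```

Keeping the Zeppelin-specific fields under a `zeppelin` metadata key is one way to make a round trip back to note.json possible later.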
Then we'd want to extend #204 with an --engine=zeppelin option to wrap a Zeppelin executor. This will add a Java dependency for this particular engine, but that's ok: we can just raise an exception if the JRE isn't available inside the engine.
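The JRE check could be as small as the sketch below. `MissingJREError` and `require_jre` are hypothetical names, and looking for a `java` executable on the PATH is only one assumed way to detect a JRE.

```python
import shutil


class MissingJREError(RuntimeError):
    """Raised when the hypothetical Zeppelin engine cannot find a Java runtime."""


def require_jre(which=shutil.which):
    """Return the path to the `java` executable, or raise if none is on the PATH."""
    java_path = which("java")
    if java_path is None:
        raise MissingJREError(
            "The Zeppelin engine needs a Java runtime, but no `java` executable was found."
        )
    return java_path
```

Calling `require_jre()` at the start of the engine's execute step would fail fast, before any notebook is loaded. The `which` argument is there only so the lookup can be swapped out in tests.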
And finally we'd need to figure out how to handle the iorw patterns for a non-Jupyter document. This one would require a little more thought, but I don't see any reason we couldn't solve it there too.
MSeal added the idea label (an idea which is open to discussion rather than a particular issue or bug) on Sep 14, 2018.