Support both local kernels and remote (via kernel gateway) at the same time. #1187
Comments
Hi @ojarjur - thank you for opening this issue. I'm unable to respond to this right now but have spent a fair amount of time thinking about this and sharing some of those ideas with others and can see your general approach is similar. (I don't think number 4 is necessary as that functionality should "just work" via the existing Change Kernel behavior.) I hope to be able to circle back to this in a few days (hopefully sooner).
The general approach I've been mulling over is to introduce the notion of a Kernel Server, where a single Jupyter server could be configured with one or more Kernel Servers. I'm not sure you had joined the Server/Kernels Team Meeting by this time, but I raised the question there about how traits could be configured to apply to multiple instances where each instance had potentially different values; since traits are class-based, specifying per-instance configuration settings is awkward. The handlers would essentially do what they do today, but delegate to the appropriate Kernel Server instance. As you also intimated, a second index would be needed. So I think this becomes a matter of a few steps (at a high level, of course).
At any rate, I think we're on the same page here. At a high level, this is doable and would be a useful addition. By default, the server would behave just as today - supporting only local kernels. I think we could also accomplish backward compatibility for single-gateway configs by keying off the existing gateway configuration. Thoughts?
@kevin-bates thanks for the detailed and thoughtful response, and for bringing the topic up in the weekly meeting. Also, thanks for the class references, which will help if I try to prototype this by converting my existing proof-of-concept into a Jupyter server extension.
I like this approach, and I had considered something along these lines, but I wasn't sure the added complexity would be justified.

Conceptually, if we take it as granted that we always want to support local kernels, then this can be viewed as an instance of the "zero, one, or infinitely-many" design question in terms of kernel gateways: by default there are zero kernel gateways supported, you can currently opt into supporting one kernel gateway, and the approach you've described extends that to infinitely-many kernel gateways. The "infinitely-many" case is the most general, but supporting it inside of jupyter_server itself adds complexity.

The question of configuration that you mentioned is one example of this complexity, but it's not the only one. There's also complexity in how multiple backends are managed, and I don't think a single approach will satisfy all users. For example, the simplest approach would be a fixed set of backends. However, I suspect most users would be better served by some sort of automated discovery mechanism that dynamically finds all of the Jupyter servers available to them. That's inherently specific to the user's environment, so we can't build a one-size-fits-all solution to it.

After you mentioned this at the weekly meeting I had some time to mull it over, and I wanted to present another option: what if, instead of a set of static configs, we defined a base class for providing the set of gateway backends? We could provide canned implementations of this class for the simpler cases. Then, users who wanted to use arbitrarily many backends could provide their own implementation of this base class that took advantage of what they know about their particular environment (e.g. knowing how to look up backends and which configs are common to all of them).

What do you think of that option? Would that still line up with what you wanted?
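To make the base-class idea concrete, here is a rough sketch. All of the names below (`GatewayBackend`, `GatewayBackendProvider`, and so on) are invented for illustration and are not an existing jupyter_server API; the point is only the shape of the interface: one canned static implementation, and room for site-specific discovery subclasses.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GatewayBackend:
    """Connection details for one kernel gateway (illustrative only)."""
    name: str                      # label used to namespace its kernelspecs
    url: str                       # base URL of the gateway's REST API
    headers: Dict[str, str] = field(default_factory=dict)


class GatewayBackendProvider(ABC):
    """Base class that yields the set of gateways the server should use."""

    @abstractmethod
    def get_backends(self) -> List[GatewayBackend]:
        ...


class StaticGatewayBackendProvider(GatewayBackendProvider):
    """Canned implementation: a fixed list supplied via configuration."""

    def __init__(self, backends: List[GatewayBackend]):
        self._backends = list(backends)

    def get_backends(self) -> List[GatewayBackend]:
        return self._backends


# A site-specific subclass could instead query an internal inventory
# service inside get_backends(), implementing the "discovery" variant.
provider = StaticGatewayBackendProvider(
    [GatewayBackend(name="gpu-pool", url="http://gateway.internal:8888")]
)
```

A deployment would then swap in its own provider subclass while the server only ever talks to the `get_backends()` interface.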
Yes, you are right. I forgot to mention it, but that is exactly what I've been doing. I add a suffix on each display name which defaults to " (local)" for local kernelspecs and " (remote)" for remote ones. I wanted these suffixes to be configurable so that the user can change them to their local language.
That sounds good to me.
Hi @ojarjur - thanks for the response. I think there's some alignment here (in the majority) but I'm hoping we could perhaps have a call together because I believe there may be some "terminology disconnects" that I'd like to iron out. Could you please contact me via email (posted on my GH profile) and we can set something up? In the meantime, I would like to respond to some items.
I don't think we should take this for granted. Users configuring the use of a gateway today essentially disable their local kernels, and I don't think we should assume that every installation tomorrow will want local kernel support. Many will, as evidenced by those that have asked those kinds of questions, but I think there's some value to operators in knowing there won't be kernels running locally. Nevertheless, this is still a zero, one, or many proposition with respect to gateway servers. One thing I think we can take for granted is that a given Jupyter Server deployment will have at least one server against which kernels can be run. I also think we can assume that that server (against which kernels are run) will be managed via the existing kernel REST APIs.
I'd like to better understand how this discovery mechanism would work. (I'm assuming by "Jupyter servers" [and elsewhere "backends"] you mean "Gateway servers" - or "Kernel servers".) In this particular instance, I think operators would prefer explicit configurations that are loaded at startup, so they know exactly where their requests are going. But, again, I may not be understanding how discovery would work.
I agree that any class we provide should be extensible and substitutable, provided there's a well-known interface that the server can interact with. However, given the assumption above that all kernel servers will honor the existing REST APIs, I think a single implementation would be sufficient (at least for a vast majority of cases). Today's GatewayClient would be a natural starting point for that implementation.
I think this is driving at the discovery stuff, and I could see an implementation of that approach working. This is an interesting conversation. Please reach out to me via my email. If we're in widely separate TZs I'd still like to have the conversation. If others are interested in joining our sidebar, please let me know.
Just got back from parental leave and catching up on this thread. If y'all have a meeting, I'd love to join (or at least see some notes! 😎).
@ojarjur - could you please send Zach an invite for our 2 pm (PST) chat today? |
This change adds support for kernel spec managers that rename kernel specs based on configured traits. This is a necessary step in the work to support multiplexing between multiple kernel spec managers (jupyter-server#1187), as we need to be able to rename kernel specs in order to prevent collisions between the kernel specs provided by multiple kernel spec managers.
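The renaming described above can be sketched in plain Python. The actual change configures suffixes as traits on the kernel spec manager; everything below (function name, the `local-`/`remote-` key prefixes) is an invented illustration of the collision-avoidance idea, not the PR's real code.

```python
from typing import Dict


def rename_kernelspecs(local: Dict[str, dict],
                       remote: Dict[str, dict],
                       local_suffix: str = " (local)",
                       remote_suffix: str = " (remote)") -> Dict[str, dict]:
    """Merge two sets of kernelspecs, suffixing display names (and
    prefixing keys) so that specs with the same name coming from
    different managers don't collide."""
    merged: Dict[str, dict] = {}
    for name, spec in local.items():
        spec = dict(spec)  # copy so the source manager's spec is untouched
        spec["display_name"] = spec["display_name"] + local_suffix
        merged[f"local-{name}"] = spec
    for name, spec in remote.items():
        spec = dict(spec)
        spec["display_name"] = spec["display_name"] + remote_suffix
        merged[f"remote-{name}"] = spec
    return merged


# Both managers expose a "python3" spec; after merging, neither
# clobbers the other and the display names stay distinguishable.
specs = rename_kernelspecs(
    {"python3": {"display_name": "Python 3"}},
    {"python3": {"display_name": "Python 3"}},
)
```

Making `local_suffix`/`remote_suffix` parameters mirrors the configurability goal mentioned earlier in the thread (e.g. translating the suffixes into the user's language).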
Hey, all! I've been trying to find the answer to this problem for a few months myself and would love to be able to use the local + remote scenario described here. Personally, I'm not in a place where I can contribute to the work involved here due to my extreme lack of knowledge in this area 🥲 But it's been about a year and a half since this was last discussed, so I'm hoping someone here has had more time to either work around this or think about how to better handle this within jupyter_server? I see this PR is still open. Is it integral for getting us across the finish line?
Thanks for posting this @waaffles. I'm sorry I'm unable to help with this (although I certainly have opinions 😄). @ojarjur (I hope you're doing well!) - do you still have momentum for this feature? I'm sure David and you aren't the only ones who would like to enable multiple kernel "environments"(?).
@waaffles @kevin-bates Sorry for the delay in responding; I've been pretty busy with other work. I largely went silent here for a long time because I wound up building a minimal-viable solution via a server extension. That one provides a config helper that is tailored to Google's Cloud Platform, but the rest of the code is actually generic and usable in any other environment. You would just install that package and then add something like this to your Jupyter config Python file:

```python
from jupyter_server.services.sessions.sessionmanager import SessionManager

from kernels_mixer.kernelspecs import MixingKernelSpecManager
from kernels_mixer.kernels import MixingMappingKernelManager
from kernels_mixer.websockets import DelegatingWebsocketConnection

c.ServerApp.kernel_spec_manager_class = MixingKernelSpecManager
c.ServerApp.kernel_manager_class = MixingMappingKernelManager
c.ServerApp.session_manager_class = SessionManager
c.ServerApp.kernel_websocket_connection_class = DelegatingWebsocketConnection

# Add configs for your gateway client below...
```

This is far from ideal for multiple reasons, but it was good enough for the remaining work to fall off of my radar. The big limitations of this are:
However, fixing both of those becomes a much larger-scoped change. I would very much like to get that larger change into Jupyter server itself, but its size makes it a large amount of work, and I have had difficulty finding the time to do it. What this probably needs next is a group discussion on the overall architecture so that we can build a consensus on the right approach. I joined the Jupyter Server Community meeting today to ask about the best way to do that, and @Zsailer suggested using an issue in the team-compass repo to drive that discussion. I will try to write one up today and then will link it to this issue.
@ojarjur, I believe that adopting this approach will eventually lead to the deprecation of the current gateway client. To minimize the impact on the Jupyter Server's architecture, we should explore more modular and isolated solutions, such as leveraging kernel providers. This would help ensure that any changes to the server's behavior remain minimal and well-contained. However, a significant limitation with the current kernel provider implementation is its lack of flexibility in defining different communication mechanisms (e.g., websockets versus ZMQ) based on the kernel provider in use. This constraint makes it challenging to support scenarios where local kernels use ZMQ connections while remote kernels communicate over websockets. Addressing this limitation will be critical to achieving a seamless integration and maintaining compatibility across diverse kernel configurations.
@lresende you mean kernel provisioners instead of "providers", correct? |
Can you explain your reasoning here? All of my implementations of this concept have relied on the GatewayClient, so I don't understand how this would lead to that getting deprecated. The only change I can anticipate to the GatewayClient would be to change it from being a singleton.
I'm having difficulty imagining what you are proposing. If you are suggesting that there is no gateway server at all, and the Jupyter server directly provisions remote resources, then that approach would not work for my use case because we have kernel gateway servers that we want to connect to. Alternatively, are you suggesting that kernel provisioners somehow connect to kernel gateway servers? If so, I don't see how that would work when the server needs to list the remote kernelspecs.
I was able to address that already. I simply defined an intermediary implementation of the kernel websocket connection class that delegates to the appropriate underlying implementation (ZMQ for local kernels, websockets for remote ones) on a per-kernel basis.
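That delegation pattern can be shown with a small stand-alone sketch. The class names below are stand-ins invented for illustration (the extension's real `DelegatingWebsocketConnection` wraps jupyter_server's actual connection classes); the point is that the transport is chosen per kernel at connection time rather than fixed server-wide.

```python
class ZMQChannelsConnection:
    """Stand-in for the local, ZMQ-backed kernel connection class."""
    transport = "zmq"


class GatewayWebsocketConnection:
    """Stand-in for the gateway, websocket-backed connection class."""
    transport = "websocket"


class DelegatingConnection:
    """Picks the right transport for each kernel when the browser's
    websocket is opened, so local kernels keep ZMQ while remote
    kernels are proxied over a websocket to the gateway."""

    def __init__(self, kernel_is_remote: bool):
        self._delegate = (GatewayWebsocketConnection()
                          if kernel_is_remote
                          else ZMQChannelsConnection())

    @property
    def transport(self) -> str:
        return self._delegate.transport
```

In the real server, the "is this kernel remote?" decision would come from the multiplexing kernel manager rather than a constructor flag.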
Thanks @ojarjur so much for your follow-up, the contributions, and for sharing the extension you made! I came across your original GitHub project but not the extension! This also solves my current use case for now, where we only need to connect to a single gateway while keeping local kernels around, without having to battle with toggling. Thanks for also getting it on the docket with the Jupyter compass discussions. I'm going to continue to keep an eye on this so that I can hopefully adopt the blessed path moving forward. I don't know in what capacity I can help here as I'm still learning the ecosystem, but I hope to be involved in whatever capacity I can. Thanks to @Zsailer @lresende and @kevin-bates as well for a lot of the insight along the way 🙏
Problem
I love the option of connecting a Jupyter server to a kernel gateway, but it is currently an all-or-nothing experience; either all of your kernels run locally or they all run using the kernel gateway.
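For reference, the all-or-nothing behavior comes from the fact that the gateway is configured once for the whole server; a sketch of that single setting (the hostname is a placeholder):

```python
# jupyter_server_config.py
# Once this is set, every kernel is started on the gateway and
# local kernels become unavailable.
c.GatewayClient.url = "http://my-gateway-server:8888"
```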
I would like it if I could pick either local or remote when I am selecting a kernelspec.
For example, I want to be able to have two notebooks open in JupyterLab and run one of them using a kernel started by my local server while the other uses a kernel started by a kernel gateway.
Proposed Solution
It is possible to solve this by using some sort of an intermediary proxy as a kernel gateway, which is responsible for deciding whether to run the kernels locally or remotely.
In fact, I have a proof-of-concept implementation of this and was able to verify that it works as you might hope.
However, this approach has a big drawback: you then have to run two separate instances of the Jupyter server locally, one for creating kernels and one for connecting to this proxy, and those two servers have to use different configs (telling them where to run the kernels).
It would be much simpler (both in the sense of being cleaner and easier to use) if the jupyter server was able to do this switching natively instead of relying on an intermediary proxy.
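The native switching described above amounts to routing each start-kernel request to either the local manager or a gateway, based on which kernelspec was chosen. A minimal sketch of that routing, with invented stand-in classes (the real solution would delegate to jupyter_server's actual local and gateway kernel managers), using a name prefix as one possible namespacing scheme:

```python
class LocalManager:
    """Stand-in for the local kernel manager."""
    def start_kernel(self, spec_name: str) -> str:
        return f"local:{spec_name}"


class GatewayManager:
    """Stand-in for a gateway-backed kernel manager."""
    def start_kernel(self, spec_name: str) -> str:
        return f"gateway:{spec_name}"


class MultiplexingManager:
    """Routes kernel starts to the local manager or the gateway based
    on a namespace prefix in the kernelspec name."""

    REMOTE_PREFIX = "remote-"  # illustrative naming convention

    def __init__(self):
        self.local = LocalManager()
        self.remote = GatewayManager()

    def start_kernel(self, spec_name: str) -> str:
        if spec_name.startswith(self.REMOTE_PREFIX):
            return self.remote.start_kernel(
                spec_name[len(self.REMOTE_PREFIX):])
        return self.local.start_kernel(spec_name)


mux = MultiplexingManager()
```

Because the routing decision lives inside one server, the two notebooks from the earlier example can share a single frontend, config, and contents manager.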
Additional context
The proof of concept I linked to above is very specific to my use case and not a general solution to this problem (e.g. it assumes a specific, hard-coded form of auth, etc).
The general approach, however, should be reusable and work in an in-jupyter-server based solution: