Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filtering request using OpenAPI spec #645

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

fcollonval
Copy link
Member

@fcollonval fcollonval commented Dec 22, 2021

Fixes jupyterlab/jupyterlab_server#167

This is a PoC to add allowed and blocked request at runtime.

@fcollonval
Copy link
Member Author

@bollwyvl I created this PoC to filter request at runtime using OpenAPI specification for allowed and blocked routes.

For allowed routes, this sounds great. But for blocked routes (blocking only request matching that subset of rules), it may be too complicated. In particular you can not blocked easily any sub path by simply blocking a common root. The other issue I see is that OpenAPI does not support path parameter with slashes.

So... should I switch to something more trivial to define routes like Regex and list of methods?

setup.cfg Outdated Show resolved Hide resolved
setup.cfg Outdated Show resolved Hide resolved
@@ -96,7 +98,9 @@ def _prepare_templates(self):
self.initialize_templates()
# Add templates to web app settings if extension has templates.
if len(self.template_paths) > 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could just be if self.template_paths

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't change it as it is not modified by this PR.

@bollwyvl
Copy link
Contributor

you can not blocked easily any sub path by simply blocking a common root.

Right: it's only convention that sub-routes are even related. Regexen would be a way to do it, but could even be applied as a class of patch. The key thing is still to end up with an actual, accurate OpenAPI at the end, whether created by JSON Patch, JSON-e, python, or otherwise.

@fcollonval
Copy link
Member Author

Thanks a lot for reviewing this one @bollwyvl

you can not blocked easily any sub path by simply blocking a common root.

Right: it's only convention that sub-routes are even related. Regexen would be a way to do it, but could even be applied as a class of patch. The key thing is still to end up with an actual, accurate OpenAPI at the end, whether created by JSON Patch, JSON-e, python, or otherwise.

Ok then let's keep OpenAPI as expected input format and let interested admin/dev design their tool to generate it. But that does not address the issue with path parameter with slashes that we are using a lot (like for the content API). The way I see it, is to allow to register path transformer that will escape slashes as needed to fulfill OpenAPI expectation. Something like:

/api/contents/path/to/file.ext ➡️ /api/contents/path%2Fto%2Ffile.ext

and 🤞 that openapi library will pick correctly the parameter

@fcollonval fcollonval changed the title PoC filtering using OpenAPI spec Filtering request using OpenAPI spec Dec 28, 2021
@codecov-commenter
Copy link

codecov-commenter commented Dec 28, 2021

Codecov Report

Merging #645 (ccac364) into main (db46446) will increase coverage by 0.19%.
The diff coverage is 90.22%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #645      +/-   ##
==========================================
+ Coverage   77.88%   78.08%   +0.19%     
==========================================
  Files         110      113       +3     
  Lines       10405    10630     +225     
  Branches     1400     1429      +29     
==========================================
+ Hits         8104     8300     +196     
- Misses       1911     1937      +26     
- Partials      390      393       +3     
Impacted Files Coverage Δ
jupyter_server/serverapp.py 65.62% <70.83%> (+0.11%) ⬆️
jupyter_server/specvalidator.py 86.07% <86.07%> (ø)
jupyter_server/tests/extension/test_launch.py 94.11% <89.47%> (-1.34%) ⬇️
jupyter_server/tests/test_spec_validation.py 96.61% <96.61%> (ø)
jupyter_server/extension/application.py 75.22% <100.00%> (+0.90%) ⬆️
...pyter_server/tests/extension/mockextensions/app.py 94.44% <100.00%> (+0.50%) ⬆️
jupyter_server/tests/test_specvalidator.py 100.00% <100.00%> (ø)
jupyter_server/services/kernels/kernelmanager.py 80.63% <0.00%> (-1.91%) ⬇️
jupyter_server/base/handlers.py 65.31% <0.00%> (-0.21%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update db46446...ccac364. Read the comment docs.

@fcollonval fcollonval marked this pull request as ready for review December 28, 2021 15:59
@fcollonval
Copy link
Member Author

I gladly add some documentation about this feature. Could somebody advice me on the best place to do so?

@blink1073
Copy link
Contributor

Maybe these docs?

@fcollonval
Copy link
Member Author

I added to the developers section as it is not configurable. So it probably makes more sense there.


class MyExtensionApp(ExtensionApp):

_allowed_spec = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like these should be public attributes if they are meant to be part of the API

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following the idea of jupyterlab/jupyterlab_server#167, the scenario imagined to design this PR is the following:

James wants to create a SmithApp that inherits from ExtensionApp. That app can be executed either as an Jupyter server extension or as a standalone application. But in the case of the standalone application, he only wants the new handlers of his extension and the get contents endpoint to be available.

To achieve this, James will overrides the protected class attribute _allowed_spec in the child class SmithApp. Users of the SmithApp will never be able to change the endpoints available in standalone mode*.

*of course because this is Python, this is only a convention that protected attribute should not be modified externally.

pinging @bollwyvl to get his opinion

@Zsailer
Copy link
Member

Zsailer commented Dec 31, 2021

@fcollonval and @bollwyvl , great stuff here!

Do you mind waiting to merge this until after we have our next Jupyter Server meeting (January 6th)? I'm hoping to get more folks to weigh-in on this feature since we've discussed this in the past.

For context, we added a way to add/remove whole "services" in Jupyter Server by overriding default_services attribute in the ServerApp:

# A list of services whose handlers will be exposed.
# Subclasses can override this list to
# expose a subset of these handlers.
default_services = (
"api",
"auth",
"config",
"contents",
"files",
"kernels",
"kernelspecs",
"nbconvert",
"security",
"sessions",
"shutdown",
"view",
)

This PR goes a few steps further and allows developers to cherry-pick individual routes (i.e. within individual services), which is really slick.

@fcollonval
Copy link
Member Author

Following the discussion at yesterday meeting,

The authorization layer will definitely have overlap with this proposal as it can be seen as adding authorization policy to all users.
At the functionality level, I see two main differences (compared to #165):

  • This PR enforces the rules on any request - it does not rely of having the handler decorated by tornado.web.authenticated. This is a technical detail but it could be important if you don't control the server extensions loaded within the application.
  • This is not a RBAC approach (in the opposite of Add authorization layer to server request handlers #165). Authorization policy can take lots of forms (RBAC, ABAC, Open Policy Agent, XACML, Amazon Web Service IAM,...). They have their own logic on what should be defined to create a policy. And they allow different granularity of policy - it is one of the reason external services like the Open Policy Agent will thrive in large organizations but are likely too complex for smaller players.
    In particular expressing the rules for this approach is more demanding than what is required for Add authorization layer to server request handlers #165 - admin sys will certainly needs to create their own scripts to generate the OpenAPI specs.

That said I can't formulate the best articulation between this PR and #165. It could make sense to have both simultaneously - although as pointed by @echarles it will likely be confusing.

@echarles
Copy link
Member

echarles commented Jan 7, 2022

This PR enforces the rules on any request - it does not rely of having the handler decorated by tornado.web.authenticated. This is a technical detail but it could be important if you don't control the server extensions loaded within the application.

There have been discussions around the default behavior in case of absence of @authenticated decorator. The same discussion could happen for @authorized. If a developer does not decorate, should the authorization layer reject the request? I like hearing "security by default", so I would be in favor to reject if the developer does not decorate, but this is a discussion to have in #165

This is not a RBAC approach (in the opposite of Add authorization layer to server request handlers #165). Authorization policy can take lots of forms (RBAC, ABAC, Open Policy Agent, XACML, Amazon Web Service IAM,...).

True. This being said, I don't think #165 enforces RBAC approach. I have e.g. given a demo a few months ago during our developer meeting of a Open Policy Agent (OPA) implementation which nicely integrates with the proposed authorization layer.

@fcollonval
Copy link
Member Author

I like hearing "security by default", so I would be in favor to reject if the developer does not decorate, but this is a discussion to have in #165

👍

This is not a RBAC approach (in the opposite of Add authorization layer to server request handlers #165). Authorization policy can take lots of forms (RBAC, ABAC, Open Policy Agent, XACML, Amazon Web Service IAM,...).

True. This being said, I don't think #165 enforces RBAC approach. I have e.g. given a demo a few months ago during our developer meeting of a Open Policy Agent (OPA) implementation which nicely integrates with the proposed authorization layer.

Awesome that the current proposal is so flexible.

So does that means we could transform this PR as an concrete Authorizer implementation (in the sense of the authorization layer) that uses only the request to provide the allow/deny answer?

@echarles
Copy link
Member

echarles commented Jan 7, 2022

So does that means we could transform this PR as an concrete Authorizer implementation (in the sense of the authorization layer) that uses only the request to provide the allow/deny answer?

This would be a nice convergence. Maybe have a a look at #165 where the Authorizer is a class that can be extended, or that you can inherit from something that will implement this PR features.

Not sure if the the deny by default is implemented in that PR, but it should be possible to make it work like that.

@Zsailer Zsailer self-assigned this Feb 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Break up and make add_handlers overrideable
6 participants