Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store matched routes in request #86

Open
wants to merge 11 commits into
base: 2.0
Choose a base branch
from

Conversation

gabegorelick
Copy link
Contributor

Lightly rebased version of #34 with the fixes suggested by @wesleytodd.

Copy link
Contributor

@dougwilson dougwilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a placeholder to remember to incorporate the review comments from PR #34 into the changes in this PR.

@gabegorelick
Copy link
Contributor Author

This is just a placeholder to remember to incorporate the review comments from PR #34 into the changes in this PR.

@dougwilson I think I got everything, but let me know if I missed anything.

@dougwilson
Copy link
Contributor

Hi @gabegorelick just a glance through I don't think you got #34 (comment) ? If you want me to check every one of them one by one, though, please give me a few days to get back on the complete list :)

@gabegorelick
Copy link
Contributor Author

just a glance through I don't think you got #34 (comment)

Thanks, I actually just fixed that. (I guess GitHub didn't display that under Files Changed since it was technically a comment on existing code?)

@dougwilson
Copy link
Contributor

I guess GitHub didn't display that under Files Changed since it was technically a comment on existing code?

I'm not sure, I see it under Files Changed on my end. I wonder if you're not seeing that, what other review comments where missed? We'll definately want to go through them all one-by-one to be sure if that's the case here.

@@ -1129,6 +1129,31 @@ describe('Router', function () {
.expect(200, 'saw GET /bar', done)
})
})

describe('req.matchedRoutes', function () {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One test doesn't seem adequate for this feature. Here are a few more tests cases I can think of off-hand:

  • Check that it works for regexp-defined routes
  • Check that is works as expected with sub routers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I tend to agree. I'll try to add more tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated with some more tests. Let me know if that's what you had in mind. They don't totally match the style of the existing code, so let me know if it should be changed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Can you add a test for what req.matchedRoutes looks like in the top router after exiting the sub router as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dougwilson I think that's failing. I added a test that looks like this:

var router = new Router()
var fooRouter = new Router()
var server = createServer(router)
var matchedFooRoutes
var matchedBarRoutes

router.use('/foo', function (req, res, next) {
    matchedFooRoutes = req.matchedRoutes
    next()
}, fooRouter)
fooRouter.get('/bar', function (req, res, next) {
    matchedBarRoutes = req.matchedRoutes
    next()
})
router.use(saw)

request(server)
    .get('/foo/bar')
    .expect(200, 'saw GET /foo/bar', function (err, res) {
       // matchedFooRoutes is [ '/foo', '/foo', '/bar' ] instead
        assert.deepEqual(matchedFooRoutes, ['/foo']) 
        done(err)
    })

I'll see if I can fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed with my most recent commit, but not totally sure that's the right way to go. Let me know what you think.

index.js Outdated Show resolved Hide resolved
lib/layer.js Outdated Show resolved Hide resolved
@gabegorelick
Copy link
Contributor Author

I'm not sure, I see it under Files Changed on my end. I wonder if you're not seeing that, what other review comments where missed?

User error on my part. I went through the changes again and marked them off. The only thing that should be missing is #34 (comment) ("should change the property name to matchedPaths") since I wasn't sure whether there was consensus for that yet.

this.matchedPath = checkPath
break
}
}
}

if (!match) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this.matchedPath should be reset inside this conditional as well, otherwise it will look like there is a matched path when none of the paths matched, but they did in a previous attempt.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Is there a way to test this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be by creating a specific order of routes and running though them. I can help make one if you need, perhaps this weekend

@dougwilson
Copy link
Contributor

Hi @gabegorelick I appreciate the entheuseam in getting updated pushed, rock on 👍 the down side is that GitHub UI looses comments I am in the middle of writing when the code changes underneath me :( I need to go get dinner, so I will let you make changes for now and come back to continue to review later when we're not both altering the PR at the same time :)

@gabegorelick
Copy link
Contributor Author

Sorry about that. I may need a day or two to add the tests anyway, so there should be more quiet time on this PR :)

@wesleytodd
Copy link
Member

Hey! Thanks so much for picking this up @gabegorelick! I was planning on spending some time cleaning up many of the pending things in the router this weekend (time permitting), and that was on my list. I will take a look at your changes soon, but if possible it would be awesome if we could get this landed before the next 2.x pre-release.

I have a list of PRs and issues I would like to resolve, and if @dougwilson is on board with it I would like to land them all this weekend. I will pull them all into a 2.0.0-alpha.2 branch so that we can test and review. If all works out hopefully we can have the next alpha published with the next [email protected] release in preparation for that major release (later this month hopefully).

@gabegorelick gabegorelick changed the base branch from master to 2.0 January 3, 2020 16:38
@gabegorelick
Copy link
Contributor Author

@wesleytodd That's great news! I'll try to address all feedback this weekend, but feel free to do whatever you need to do to this PR to land it.

@@ -166,7 +166,8 @@ Router.prototype.handle = function handle(req, res, callback) {
// manage inter-router variables
var parentParams = req.params
var parentUrl = req.baseUrl || ''
var done = restore(callback, req, 'baseUrl', 'next', 'params')
var parentMatchedRoutes = req.matchedRoutes
var done = restore(callback, req, 'baseUrl', 'next', 'params', 'matchedRoutes')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like adding matchedRoutes is necessary here, but no tests fail. So we should either try to add a test for it, or remove it if it's not necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be apparent once more tests are added with sub routers. This controls the behavior of what happens on a router exit.

@wesleytodd
Copy link
Member

wesleytodd commented Jan 5, 2020

Also looks like it needs a lint fix due to the update to eslint which this is now rebased on top of. @gabegorelick, if you have time to knock this stuff out that would be awesome. If not, I will try to take a look at it this week.

.get('/foo/bar')
.expect(200, 'saw GET /foo/bar', function (err, res) {
assert.deepEqual(matchedFooRoutes, ['/foo'])
assert.deepEqual(matchedBarRoutes, ['/foo', '/bar'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we think it makes sense for the matched foo routes to contain paths that do not even exist on the foo router? What if the foo router had a foo path, would this mean it had a match? That brings up a question for how this works with sub routers mounted at a path that is no / as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like to think about the use case where this feature would be used, and if you do not get all matched routes at the end it seems much less useful. That said, your counter example makes it clear that what we have here is not correct either.

@dougwilson
Copy link
Contributor

I made a comment above when looking at a test, but it made me think of an even bigger question I think we should think about:

This PR currently add a req.matchedRoutes property that is an array value of... something. It seems to be strings, but sometimes regular expressions. Sometimes it is still undefined according to the tests.

My question is what does this array represent? It is called "routes" but is "/foo" a "route"? Or just a pathname? Like should there be a difference if it was a GET or a POST or should it not matter? When the value is a regular expression, what does that mean, though? Is that useful data for the use case this is being designed for?

@wesleytodd
Copy link
Member

Is that useful data for the use case this is being designed for?

I think it is useful in that it could be stringified and reported on as part of the metrics. I do think the nested router behavior is not good, but the main use case I have thought of for this is to report different metrics depending on which routes matched in the processing of the request. And for this use case all simple examples are really straight forward.

@dougwilson
Copy link
Contributor

Gotcha. So should it be called like "matchedPathnames"? Seems more like what it would be vs "routes". Usually routes include what the method is as part of "route"?

@dougwilson
Copy link
Contributor

P.S. what do we think the array should contain for a request to "GET /foo" given the following?

router.get('/foo', (req, res, next) => next())
router.get('/foo', (req, res) => ...)

Should be be two /foo or just one? Probably could add a test for this as well.

HISTORY.md Outdated Show resolved Hide resolved
@wesleytodd
Copy link
Member

So I think in all reality this conversation points to one of the long existing design flaws in the router. There are other approaches we could take to the internals which would still allow similar user facing API but not incur the issues we have with dynamic and arbitrarily deep trees. I don't think we should shoot for fixing this in 2.0, but I think a new approach for what could be 3.0 should be considered.

@gabegorelick
Copy link
Contributor Author

lint fix due to the update to eslint which this is now rebased on top of. @gabegorelick, if you have time to knock this stuff out that would be awesome. If not, I will try to take a look at it this week.

Fixed the linting. Otherwise I'm staying quiet while all the other questions get sorted out :)

@gabegorelick gabegorelick force-pushed the store-matched-routes branch 2 times, most recently from f2f59a3 to a0afed7 Compare January 5, 2020 19:41
@gabegorelick
Copy link
Contributor Author

gabegorelick commented Jan 5, 2020

what do we think the array should contain for a request to "GET /foo" given the following?

That should be [ '/foo' ]. I've added a failing test to cover that example. I'll try to get it passing.

@gabegorelick
Copy link
Contributor Author

gabegorelick commented Jan 5, 2020

So should it be called like "matchedPathnames"

What about matchedPaths? That would align with req.path. The "name" in "pathname" to me also implies it would always be a string, while "path" is more abstract and so allows for things like regexes.

@dougwilson
Copy link
Contributor

What about matchedPaths? That would align with req.path. The "name" in "pathname" to me also implies it would always be a string, while "path" is more abstract and so allows for things like regexes.

The word "path" and "pathname" are part of the URL syntax. The difference being "path" contains the query string and "pathname" does not. This module has no req.path at all, so there isn't any kind of alignment that would make sense in the context of this module. Node.js also does not have a req.path so, for example, when using this module against all the examples in the README here and tests, there is still nothing that would make sense in the "alignment" argument.

@@ -199,6 +200,7 @@ Router.prototype.handle = function handle(req, res, callback) {
req.baseUrl = parentUrl
req.url = protohost + removed + req.url.substr(protohost.length)
removed = ''
req.matchedRoutes = parentMatchedRoutes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this is trying to restore the req.matchedRoutes back to what it used to be, but since the router is just directly manipulating the array, it still ends up containing the new additions that were made. I'm not 100% sure what the purpose of this restoration of the property is meant to do after every route call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dougwilson This whole approach needs to be tweaked anyway to to deal with the "sibling" routes example you mentioned in #86 (comment).

I'm open to suggestions for the best way to deal with the matchedRoutes "stack". I don't know enough yet about the router internals to know the best place to push and pop from it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, due to the lack of documentation here or a full written explanation for how this all should work, I really have no idea what to suggest for the changes, because I'm still unclear how this feature is actually supposed to operate :) Maybe we should circle back to explaining exactly how this feature should work in text form, so then we can all help on the implementation by knowing what we're trying to achieve. Right now, I'm still unclear on what exactly the plain is, so don't have a way to suggest what changes to make to the code to get the outcome to be whatever it is it should be, if that makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dougwilson Apologies. This PR has moved a number of times between expressjs and pillarjs, so much of the original context may have been lost.

AFAIK, this all got started with expressjs/express#2501 requesting req.originalRoute to complement req.originalUrl. The use case provided there is for logging route names instead of fully resolved paths, which is the same as what I need. Another use case I've seen is for recording metrics. E.g. when recording how many times an endpoint like GET /users/:id is called, you may want to use the full name of the endpoint without path parameters substituted.

As for the technical decisions in this PR, I can't speak to most of them. The original author (@ajfranzoia) seems to have abandoned his efforts, and I'm merely picking what he wrote back up. Perhaps @wesleytodd can provide more insight?

I assume req.originalRoute morphed into an array of matchedRoutes when it was realized that route names aren't always strings and that it wasn't as simple as joining path segments together.

@gabegorelick
Copy link
Contributor Author

The word "path" and "pathname" are part of the URL syntax.

Fair enough. If there's consensus for matchedPathnames, I'll update it to use that.

lib/layer.js Outdated
this.fastStar = paths === '*'
this.fastSlash = paths === '/' && opts.end === false

this.paths = (!Array.isArray(paths) ? [paths] : paths).map(function (path) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I assume that the purpose of pre-expanding the array of paths is to store data long with them? If that is the case path-to-regexp supports arbitrarily-nested arrays like ["/foo",["/bar","/fizz"]] and this implementation would end up only accounting for the first level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the big changes in this PR is to always call path-to-regexp with a single path segment instead of an array of paths. That allows us to identify the matching path element in the array of paths (it's the one that returns a match from the regex). But as you point out, that won't work for deeply nested arrays.

Perhaps a better approach is to use the return value of the regex to indicate which element in the array was the match. I'm not totally sure what that would look like for the deeply nested array case, or if that's feasible without making breaking changes to path-to-regexp, but it would allow us to avoid pre-expansion of the array here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is ["/foo",["/bar","/fizz"]] equivalent to ["/foo", "/bar", "/fizz"]? If it is, then recursively flattening the array of paths should be fine.

However, another issue is if evaluating multiple regexes hurts performance compared to evaluating a single regex with clauses separated by |. Is that something to worry about?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is ["/foo",["/bar","/fizz"]] equivalent to ["/foo", "/bar", "/fizz"]? If it is, then recursively flattening the array of paths should be fine.

Currently this module delegates the argument to path-to-regexp and I don't know the answer off-hand and would have to look. If you're not even sure, then I think the entire way this functions needs to be re-examined, as if the code here is making assumptions for how path-to-regexp works, we should have a good grasp on how it is working.

However, another issue is if evaluating multiple regexes hurts performance compared to evaluating a single regex with clauses separated by |. Is that something to worry about?

It would probably depending on how much of a performance penalty it has. Typically adding features has a performance impact, but I it is really large, we may want to gate the feature. It would depend on (1) what % of users are going to use the feature and (2) how big of a penalty it has. For example, if 99% of users will use it, then it probably doesn't matter. But if only like 10% of users will use it and it will hurt performance by 20%, maybe a gate would be in order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently this module delegates the argument to path-to-regexp and I don't know the answer off-hand and would have to look. If you're not even sure, then I think the entire way this functions needs to be re-examined, as if the code here is making assumptions for how path-to-regexp works, we should have a good grasp on how it is working.

A quick test of path-to-regexp indicates they are functionally equivalent (the only difference being an extra non-capturing group wrapper for each level of nesting):

pathToRegexp(['/', '/']) // /(?:^\/(?:\/)?$|^\/(?:\/)?$)/i
pathToRegexp(['/', ['/']]) // /(?:^\/(?:\/)?$|(?:^\/(?:\/)?$))/i

Looking at the source, path-to-regexp does appear to recursively expand the array, so we should be able to do the same before calling it.

But if only like 10% of users will use it and it will hurt performance by 20%, maybe a gate would be in order.

Any theoretical hit to performance should only affect the use of arrays. So that should help.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the source, path-to-regexp does appear to recursively expand the array, so we should be able to do the same before calling it.

Be carful to look at the proper version of the module :) Your link is about version 6.1.0 module, several major versions ahead of what his module is actually using.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, although looks like the implementation hasn't changed since v3.0.0.

I've updated the PR to call array-flatten on the paths.

@dougwilson
Copy link
Contributor

Fair enough. If there's consensus for matchedPathnames, I'll update it to use that.

TBF, I'm not saying there is consensus or you have to make a change. I am throwing things out for discussion. Maybe I'm unclear of the roles of the folks involved in the conversation :) Typically the PR author is the champion of the change, and is invested it in landing and has an explicit use-case they need in mind. So I am throwing points out for discusion with the author of the PR to get feedback and their own thoughts back, haha.

@gabegorelick
Copy link
Contributor Author

Typically the PR author is the champion of the change, and is invested it in landing and has an explicit use-case they need in mind. So I am throwing points out for discusion with the author of the PR to get feedback and their own thoughts back, haha.

Many people have touched this PR, and my contributions are fairly minimal. I'm mostly just trying to get it over the finish line, so I don't have strong opinions. My use case is pretty much identical to what's been outlined in #34, #85, and expressjs/express#2501. As long as the final form of this PR accomplishes it's original goal, I'm fine no matter what it looks like :)

@mmacai
Copy link

mmacai commented Mar 10, 2020

Definitely useful feature, any update on this?

@gabegorelick
Copy link
Contributor Author

Definitely useful feature, any update on this?

This PR works for simple cases, but there are a ton of edge cases that make this feature complicated (see the failing test, for instance). I sort of ran out of steam, but happy to help anyone else that wants to pick this up.

@wesleytodd
Copy link
Member

wesleytodd commented Mar 16, 2024

Hey, I am finally getting around to cleaning this all up. Are you interested in trying to help get this landed for router@2 and express@5? If so please respond asap so we can get it accounted for in the plan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants