Handling empty tokens or "//" when generating URLs
dantleech opened this issue · 13 comments
If you have a URL schema as follows:
url_schema: /{url-part}/{title}
Where url-part
is a URL part stored in the Document, e.g. store/product
.
It is possible that the url-part
is empty, in which case the URL "//" is generated. In this case it would be required to collapse to a single "/".
We can fix this in a number of ways:
- Automatically collapse "//", "///", "/*" to "/" in the url generator
- Allow an array syntax for the schema:
[ "{url}", "{title}" ]
and the generator will implode on "/" after removing empty values. - Add some option to remove subsequent trailing slashes for tokens which are empty.
I think the first option is the most viable., but it does mean that the user would be responsible for ensuring that the url-part
(in the above example) is not empty in order to ensure not breaking the path scheme they have created.
in ease of use, i agree. it might hide mistakes in some situations by still producing a valid url that is unexpected, but i think thats a trade off we can live with here.
for the third idea, we could use a question-mark syntax similar to regular expressions, something like (/{url-part})?/{title}
- but i would prefer to just collapse consequtive / to a single /
Maybe an option:
url_schema: /{url_part}/{title}
collapse_slashes: true
Would "slash" be the right word?
+1 for a collapse_slashes (i think that name is understandable) and defaulting that to true.
I would rather default to false, as defaulting to true could cause content to appear in unexpected places without warning (e.g. /product1
instead of /products/product-
for example).
or maybe we should even allow filters to be applied to the entire string:
uri_schema: /{foo}/{bar}
token_providers:
- ...
filters: [ "slugify", "collapse_slashes" ]
I'm -1 on filters. It would bring yet another concept to the RoutingAuto and I'm not sure of if there are more use cases than stripping repeated slashes.
I'm leaning towards @dbu's suggestion: (/{url-part})?/{title}
This is actually also the syntax proposed for the Routing component to fix exactly this problem: symfony/symfony#5424 I also like it because it focusses more on the origin of the problem: empty tokens. Stripping slashes is focusing on how to fix problems created by empty tokens.
Btw, this also makes me wonder if we should add an option to say that a token can be optional or should be required. In the latter case, if the token provider returned null for such a token it would throw an exception.
Re. the inline solution: After having a look at that Symfony PR I'm not sure how it addresses our problem here - the syntax is about matching routes, here we are talking about modifying the path when creating a route. i.e. /{token_1}(/{token_2})
doesn't really help as the section in parenthesis will always resolve to /
- is there a variation of this approach that would work?
What we are effectively saying with the ?
syntax is that if the previous token is null (or empty), remove the following slash if one exists. I tend to think this is not very intuitive and obscure (also ?
is potentially a valid part of the URL).
The collapse_slashes
option is quite fool-proof, but introduces a new option that might somehow become redundant in the future (e.g. if we do discover a new filtering case). Also we cannot specify the collapsing behavior on a token-by-token basis, which is a bit of a killer.
The array approach uri_schema: [ '{token}', 'foo-{token_2} ]
is quite straightforward. We just remove empty values from the array before imploding on /
. But then I wonder if we adopt that then the behavior about removing the empty values should also be optional, otherwise this syntax becomes a de-facto way to enable collapsing slashes.
Another method: we could add a global option to the token providers:
uri_schema: /{token_1}/{token_2}
token_providers:
token_1: [ 'content_foo', { remove_trailing_slash_on_empty: true }]
token_2: ..
Thats quite verbose, but this is not something that needs to be done often, and the options resolver provides little room for people to get confused.
Re. the required tokens, I think that makes sense. We could default to throwing an exception on empty tokens (and so avoid the //
scenario) and add an allow_empty
option ? The remove_trailing_slash_on_empty
option would depend on allow_empty
being true.
i think the allow_empty makes a lot of sense. and removing a trailing slash if the token is empty would be the right thing to do (no need for even a config option imo)
as long as /{token_1}{token_2}/{token_3} still works? not every token must be enclosed in /