Customizable namespace name pattern
bandali0 opened this issue · 16 comments
Background: Currently, the Namespaces Extension requires that namespace names match $InterSitePattern, forcing them to start with an uppercase letter and consist only of letters.
Request: It would be nice to allow customization of the namespace URL pattern by introducing a $NamespacePattern.
For my use case, I would like to allow namespace names starting with and (optionally) consisting of only digits. I asked on #oddmuse and @AlexDaniel suggested the following pattern:
$NamespacePattern = '[A-Z0-9\x{0080}-\x{fffd}]+[A-Za-z0-9\x{0080}-\x{fffd}]+'
Note: a-z is omitted from the first character class to avoid matching the keep, modules, page, and temp directories.
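To make the effect of the two character classes concrete, here is a small standalone check of the suggested pattern. The helper name `is_namespace_name` is made up for this illustration and is not part of namespaces.pl. One thing it shows: because both character classes carry a `+`, the pattern as written needs at least two characters, so a one-character name such as `8` does not match.

```perl
use strict;
use warnings;

# The pattern suggested above, copied verbatim.
my $NamespacePattern = '[A-Z0-9\x{0080}-\x{fffd}]+[A-Za-z0-9\x{0080}-\x{fffd}]+';

# Hypothetical helper: true if $name as a whole matches the pattern.
sub is_namespace_name {
  my ($name) = @_;
  return $name =~ /^(?:$NamespacePattern)$/ ? 1 : 0;
}

# Numeric and capitalised names, the four data directories, and a
# single-digit name.
for my $name ('2015', '007', 'Alex', 'keep', 'modules', 'page', 'temp', '8') {
  printf "%-8s %s\n", $name,
    is_namespace_name($name) ? 'namespace' : 'not a namespace';
}
```

If single-character namespaces are wanted, changing the second `+` to `*` would be one way to allow them; that is a suggestion on my part, not something tested against the extension.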
The following is a git diff of the changes I made to namespaces.pl to allow for numeric namespace names, basically replacing every occurrence of InterSitePattern with the new NamespacePattern:
diff --git a/modules/namespaces.pl b/modules/namespaces.pl
index 7ade4caa..3a431ed2 100644
--- a/modules/namespaces.pl
+++ b/modules/namespaces.pl
@@ -42,7 +42,7 @@ AddModuleDescription('namespaces.pl', 'Namespaces Extension');
use File::Glob ':glob';
-our ($q, %Action, %Page, @IndexList, $Now, %InterSite, $SiteName, $ScriptName, $UsePathInfo, $DataDir, $HomePage, @MyInitVariables, @MyAdminCode, $FullUrl, $LinkPattern, $InterSitePattern, $FreeLinks, $FreeLinkPattern, $InterLinkPattern, $FreeInterLinkPattern, $UrlProtocols, $WikiLinks, $FS, $RcFile, $RcOldFile, $RcDefault, $PageDir, $KeepDir, $LockDir, $TempDir, $IndexFile, $VisitorFile, $NoEditFile, $WikiDescription, $LastUpdate, $StaticDir, $StaticUrl, $InterWikiMoniker, $RefererDir, $PermanentAnchorsFile);
+our ($q, %Action, %Page, @IndexList, $Now, %InterSite, $SiteName, $ScriptName, $UsePathInfo, $DataDir, $HomePage, @MyInitVariables, @MyAdminCode, $FullUrl, $LinkPattern, $NamespacePattern, $FreeLinks, $FreeLinkPattern, $InterLinkPattern, $FreeInterLinkPattern, $UrlProtocols, $WikiLinks, $FS, $RcFile, $RcOldFile, $RcDefault, $PageDir, $KeepDir, $LockDir, $TempDir, $IndexFile, $VisitorFile, $NoEditFile, $WikiDescription, $LastUpdate, $StaticDir, $StaticUrl, $InterWikiMoniker, $RefererDir, $PermanentAnchorsFile);
our ($NamespacesMain, $NamespacesSelf, $NamespaceCurrent,
$NamespaceRoot, $NamespaceSlashing, @NamespaceParameters,
%Namespaces);
@@ -80,18 +80,19 @@ $NamespaceSlashing = 0; # affects : decoding NamespaceRcLines
# variables (eg. localnames.pl)
unshift(@MyInitVariables, \&NamespacesInitVariables);
+my $NamespacePattern = '[A-Z0-9\x{0080}-\x{fffd}]+[A-Za-z0-9\x{0080}-\x{fffd}]+';
sub GetNamespace {
my $ns = GetParam('ns', '');
if (not $ns and $UsePathInfo) {
my $path_info = decode_utf8($q->path_info());
# make sure ordinary page names are not matched!
- if ($path_info =~ m|^/($InterSitePattern)(/.*)?|
+ if ($path_info =~ m|^/($NamespacePattern)(/.*)?|
and ($2 or $q->keywords or NamespaceRequiredByParameter())) {
$ns = $1;
}
}
ReportError(Ts('%s is not a legal name for a namespace', $ns))
- if $ns and $ns !~ m/^($InterSitePattern)$/;
+ if $ns and $ns !~ m/^($NamespacePattern)$/;
return $ns;
}
@@ -102,7 +103,7 @@ sub NamespacesInitVariables {
$Namespaces{$NamespacesMain} = $ScriptName . '/';
foreach my $name (Glob("$DataDir/*")) {
if (IsDir($name)
- and $name =~ m|/($InterSitePattern)$|
+ and $name =~ m|/($NamespacePattern)$|
and $name ne $NamespacesMain
and $name ne $NamespacesSelf) {
$Namespaces{$1} = $ScriptName . '/' . $1 . '/';
@@ -311,13 +312,13 @@ sub NewNamespaceScriptUrl {
if ($action =~ /^($UrlProtocols)\%3a/) { # URL-encoded URL
# do nothing (why do we need this?)
} elsif ($action =~ m!(.*?)([^/?&;=]+)%3a(.*)!) {
- # $2 is supposed to match the $InterSitePattern, but it might be
+ # $2 is supposed to match the $NamespacePattern, but it might be
# UrlEncoded in Main:RecentChanges. If $2 contains Umlauts, for
- # example, the encoded $2 will no longer match $InterSitePattern.
+ # example, the encoded $2 will no longer match $NamespacePattern.
# We have a likely candidate -- now perform an additional test.
my ($s1, $s2, $s3) = ($1, $2, $3);
my $s = UrlDecode($s2);
- if ($s =~ /^$InterSitePattern$/) {
+ if ($s =~ /^$NamespacePattern$/) {
if ("$s2:$s3" eq GetParam('oldid', '')) {
if ($s2 eq $NamespacesMain) {
$ScriptName = $NamespaceRoot;
@@ -353,8 +354,8 @@ sub NewNamespaceGetAuthorLink {
}
sub NewNamespaceValidId {
- local $FreeLinkPattern = "($InterSitePattern:)?$FreeLinkPattern";
- local $LinkPattern = "($InterSitePattern:)?$LinkPattern";
+ local $FreeLinkPattern = "($NamespacePattern:)?$FreeLinkPattern";
+ local $LinkPattern = "($NamespacePattern:)?$LinkPattern";
return OldNamespaceValidId(@_);
}
@@ -383,8 +384,8 @@ sub NewNamespaceBrowsePage {
my $text = $revisionPage->{text};
my $oldId = GetParam('oldid', '');
if (not $oldId and not $revision and (substr($text, 0, 10) eq '#REDIRECT ')
- and (($WikiLinks and $text =~ /^\#REDIRECT\s+(($InterSitePattern:)?$InterLinkPattern)/)
- or ($FreeLinks and $text =~ /^\#REDIRECT\s+\[\[(($InterSitePattern:)?$FreeInterLinkPattern)\]\]/))) {
+ and (($WikiLinks and $text =~ /^\#REDIRECT\s+(($NamespacePattern:)?$InterLinkPattern)/)
+ or ($FreeLinks and $text =~ /^\#REDIRECT\s+\[\[(($NamespacePattern:)?$FreeInterLinkPattern)\]\]/))) {
my ($ns, $page) = map { UrlEncode($_) } split(/:/, FreeToNormal($1));
$oldId = ($NamespaceCurrent || $NamespacesMain) . ':' . $id;
local $ScriptName = $NamespaceRoot || $ScriptName;
It may need more work to sand down any rough edges, but it's been working pretty great for me so far.
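For anyone who wants to see what the path-info test in GetNamespace does with numeric names, here is a standalone sketch that reproduces only the regular expression from the diff. `split_path_info` is a hypothetical helper written for this comment. In the real code the candidate is only accepted as a namespace if `$2`, `$q->keywords`, or `NamespaceRequiredByParameter()` is true, which this sketch does not model.

```perl
use strict;
use warnings;

my $NamespacePattern = '[A-Z0-9\x{0080}-\x{fffd}]+[A-Za-z0-9\x{0080}-\x{fffd}]+';

# Hypothetical helper mirroring only the regex test in GetNamespace.
# Returns (namespace candidate, rest of path), or an empty list if the
# path does not start with something that looks like a namespace.
sub split_path_info {
  my ($path_info) = @_;
  if ($path_info =~ m|^/($NamespacePattern)(/.*)?|) {
    return ($1, $2);    # $2 is undef when nothing follows the first segment
  }
  return;
}

for my $path ('/2015/Talks', '/2015/', '/2015', '/page/Foo') {
  my ($ns, $rest) = split_path_info($path);
  printf "%-12s ns=%s rest=%s\n", $path, $ns // '(none)', $rest // '(none)';
}
```

Note that `/2015` without a trailing slash yields a candidate but no rest, so by the regex alone it is indistinguishable from an ordinary page name.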
Thanks! As for moving namespaces to a subdir, good idea, I'm all for it!
Looking at this again, I'm a bit torn: if you can change the namespace pattern independently of the inter site pattern, then you lose the ability to link to other namespaces using a simple prefix. How do you handle that? You just never use it?
I hadn't thought of that :/ I guess like you said I'm just not using it as of now?
I'm all for a better way to do this if there is one 🙂
Well, for campaignwiki.org I've used the following:
# Allow namespaces with spaces: Require upper case first letter and no slashes,
# then word characters or underlines. Spaces will get turned into underlines.
# And because of historic reasons, we also support NO-BREAK SPACE and SPACE.
# $InterSitePattern = '\p{Uppercase}\p{Word}*';
# $InterSitePattern = '\p{Uppercase}[[:alpha:]_ ]*';
$InterSitePattern = '[\p{Uppercase}\d][\w_ ]*';
# Redefine these as well if you change $InterSitePattern since InitLinkPatterns is called before InitConfig!
$InterLinkPattern = "($InterSitePattern:[-a-zA-Z0-9\x{0080}-\x{fffd}_=!?#\$\@~`\%&*+\\/:;.,]*[-a-zA-Z0-9\x{0080}-\x{fffd}_=#\$\@~`\%&*+\\/])$QDelim";
$FreeInterLinkPattern = "($InterSitePattern:[-a-zA-Z0-9\x{0080}-\x{fffd}_=!?#\$\@~`\%&*+\\/:;.,()' ]+)";
It should work for you as well, as I'm using [\p{Uppercase}\d] for the first character.
Having to redefine the two interlink patterns is ugly, of course, so perhaps I should move those two definitions into a separate function which we could then call after setting the intersite pattern.
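A minimal sketch of what such a function might look like. `BuildInterLinkPatterns` is a hypothetical name, not an existing Oddmuse function, and the `$QDelim` value below is an assumed stand-in so that the snippet runs on its own; in a real config the value set by wiki.pl would be used instead.

```perl
use strict;
use warnings;

# Hypothetical helper: derive both interlink patterns from an intersite
# pattern. The character classes are copied from the config above.
sub BuildInterLinkPatterns {
  my ($site, $qdelim) = @_;
  my $inter = "(${site}:[-a-zA-Z0-9\x{0080}-\x{fffd}_=!?#\$\@~`\%&*+\\/:;.,]*"
            . "[-a-zA-Z0-9\x{0080}-\x{fffd}_=#\$\@~`\%&*+\\/])$qdelim";
  my $free  = "(${site}:[-a-zA-Z0-9\x{0080}-\x{fffd}_=!?#\$\@~`\%&*+\\/:;.,()' ]+)";
  return ($inter, $free);
}

my $InterSitePattern = '[\p{Uppercase}\d][\w_ ]*';
my $QDelim = '(?:"")?';    # assumed value, for illustration only
my ($InterLinkPattern, $FreeInterLinkPattern)
  = BuildInterLinkPatterns($InterSitePattern, $QDelim);

# A digit or an uppercase letter is accepted as the first character of
# the prefix; a lowercase first character is not.
for my $link ('007:Bond', 'Alex:HomePage', 'page:Foo') {
  printf "%-14s %s\n", $link,
    $link =~ /^$InterLinkPattern$/ ? 'interlink' : 'no match';
}
```

With a helper like this, a config file would set $InterSitePattern once and call the function, instead of repeating the two long definitions.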
Thanks, I switched over to your settings, and it seems to be working.
I'm facing an issue, however: whether with my previous config or your current one, my changes to namespaces with numerical names don't get saved :( any thoughts as to why?
It seems to work for me. Let me do some more testing.
I use the following config file:
$AdminPass = 'foo';
$ScriptName = 'http://localhost:8080';
$SurgeProtection = 0;
$WikiLinks = 1;
$InterSitePattern = '[\p{Uppercase}\d][\w_ ]*';
$InterLinkPattern = "($InterSitePattern:[-a-zA-Z0-9\x{0080}-\x{fffd}_=!?#\$\@~`\%&*+\\/:;.,]*[-a-zA-Z0-9\x{0080}-\x{fffd}_=#\$\@~`\%&*+\\/])$QDelim";
$FreeInterLinkPattern = "($InterSitePattern:[-a-zA-Z0-9\x{0080}-\x{fffd}_=!?#\$\@~`\%&*+\\/:;.,()' ]+)";
My modules directory contains nothing but a symbolic link to namespaces.pl.
My data dir currently looks as follows:
/home/alex/src/oddmuse/test-data:
total used in directory 72 available 308517392
drwxr-xr-x 12 alex alex 4096 Jul 16 22:49 .
drwxr-xr-x 80 alex alex 12288 Jul 16 22:49 ..
drwxr-xr-x 4 alex alex 4096 Jul 16 22:49 007
drwxr-xr-x 4 alex alex 4096 Jul 16 22:45 Alex
drwxr-xr-x 3 alex alex 4096 Jun 10 13:45 Ford
drwxr-xr-x 6 alex alex 4096 Jun 10 13:54 Muu
drwxr-xr-x 4 alex alex 4096 Jun 10 13:45 Zürich
drwxr-xr-x 4 alex alex 4096 Jun 10 13:45 Zürich♥
-rw-r--r-- 1 alex alex 613 Jul 16 22:44 config~
-rw-r--r-- 1 alex alex 402 Jul 16 22:48 config
drwxr-xr-x 3 alex alex 4096 Jun 10 13:45 keep
drwxr-xr-x 2 alex alex 4096 Jun 10 13:45 modules
drwxr-xr-x 2 alex alex 4096 Jun 10 13:45 page
-rw-r--r-- 1 alex alex 7 Jun 10 13:45 pageidx
-rw-r--r-- 1 alex alex 131 Jun 10 13:45 rc.log
drwxr-xr-x 2 alex alex 4096 Jun 10 13:45 temp
Result looks OK:
I added a test and it still seems to work: fded175
Can you try and reproduce the fact that it works using my minimal example? If that doesn't work, perhaps try the latest wiki.pl and namespaces.pl from the repo and we can then use git bisect to find the commit that prevents it from working. We should at least document it...
Thanks for taking a closer look.
I pinpointed the source of my issue, but not "why" it happens.
The only meaningful differences between my server.pl and the one checked into the repo are that I'd replaced route => '/wiki' with route => '/', and had commented out the get '/' => sub { ... subroutine, so that I could run the wiki and its pages at the root level, without the /wiki prefix. Accordingly, I'd set $FullUrl = 'https://emacsconf.org' and $ScriptName = "$FullUrl" in the config file.
When I added /wiki to the end of $FullUrl and restored route back to /wiki, the edit I made to the numerical namespace's main page was saved again.
Would it be possible to modify namespaces.pl to work with this sort of configuration (without /wiki)? Editing normal pages seems to have worked fine so far, and oddly enough editing namespace pages used to work fine too when I last tried a month or two ago, but it stopped working sometime between then and yesterday, when I tried again.
P.S. the config file is available at https://emacsconf.org/config.
I've used the same setup as before, but the following server.pl script, which is the same as stuff/mojolicious-app.pl:
use Mojolicious::Lite;
plugin CGI => {
support_semicolon_in_query_string => 1,
};
plugin CGI => {
route => '/',
script => 'wiki.pl',
};
app->start;
This works for me.
Config:
$AdminPass = 'foo';
$ScriptName = 'http://localhost:8080';
$SurgeProtection = 0;
$WikiLinks = 1;
$InterSitePattern = '[\p{Uppercase}\d][\w_ ]*';
$InterLinkPattern = "($InterSitePattern:[-a-zA-Z0-9\x{0080}-\x{fffd}_=!?#\$\@~`\%&*+\\/:;.,]*[-a-zA-Z0-9\x{0080}-\x{fffd}_=#\$\@~`\%&*+\\/])$QDelim";
$FreeInterLinkPattern = "($InterSitePattern:[-a-zA-Z0-9\x{0080}-\x{fffd}_=!?#\$\@~`\%&*+\\/:;.,()' ]+)";
Started using:
morbo --listen http://*:8080 \
--watch wiki.pl --watch test-data/config --watch test-data/modules/ \
server.pl
Then I changed it to the following:
use Mojolicious::Lite;
# This needs to be in a different section, sometimes?
plugin CGI => {
support_semicolon_in_query_string => 1,
};
plugin CGI => {
route => '/',
# We need this for older versions of Mojolicious::Plugin::CGI
script => 'wiki.pl',
run => \&OddMuse::DoWikiRequest,
before => sub {
no warnings;
$OddMuse::RunCGI = 0;
# The default data directory is determined by the environment variable
# WikiDataDir and falls back to the following
# $OddMuse::DataDir = '/tmp/oddmuse';
use warnings;
require './wiki.pl' unless defined &OddMuse::DoWikiRequest;
},
env => {},
# path to where STDERR from cgi script goes
errlog => ($ENV{WikiDataDir} || '/tmp/oddmuse')
. "/wiki.log",
};
# get '/' => sub {
# my $self = shift;
# $self->redirect_to('/wiki');
# };
app->start;
And it still works. By "work" I mean I can visit the existing page 007/Bond and edit it and the new content is shown (based on the problem description "changes to namespaces with numerical names don't get saved"). I also visited 8/Trump and edited it and it got saved as a new page.
I'm still unable to reproduce the problem, apparently.
Are you accessing the website directly or via a webserver that acts as a proxy?
Thanks. I'm using it behind nginx like so:
upstream emacsconf {
server 127.0.0.1:11199;
}
server {
...
location / {
proxy_pass http://emacsconf;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
if ($request_uri ~ "^/([\d]{4})$") {
return 302 /$1/;
}
}
}
I found the source of my problems: the if block I use to redirect a page like /2015 to /2015/. It just occurred to me to try commenting it out to see what happens. With it commented out, my changes get saved again. I must have added that rule some time since my last edit a few months back, but it never dawned on me that it could be problematic. *facepalms*
Knowing that, I wonder if it'd be possible to redirect those numeric pages /2015 to their slashed version /2015/ (the namespace), without breaking saving?
I use something like this on Campaign Wiki.
# Fix visiting Main:X if the page doesn't exist but namespace X does.
# Redirect!
push(@MyInitVariables, \&MyNamespacesFix);
sub MyNamespacesFix {
if (not GetParam('title', '')
and GetParam('action', 'browse') eq 'browse') {
my $id = FreeToNormal(GetId());
if (not $NamespaceCurrent
and (not $IndexHash{$id}
or OpenPage($id) and PageMarkedForDeletion())
and $Namespaces{$id}) {
print GetRedirectPage("$id/", NormalToFree($id));
exit;
}
}
};