haltcase/glob

[RFC] IgnoreCase not applied on fixed part of pattern; needs to be documented

timotheecour opened this issue · 5 comments

interesting edge case:

IgnoreCase is currently not applied on fixed part of pattern:

on linux:

find `pwd`
/tmp/d03
/tmp/d03/A1
/tmp/d03/A1/z2.txt
/tmp/d03/A1/Z2.txt
/tmp/d03/a1
/tmp/d03/a1/z1.txt

echo toSeq(walkGlob("/tmp/d03/A1/z*.txt", options = defaultGlobOptions + {IgnoreCase}))
@["/tmp/d03/A1/z2.txt", "/tmp/d03/A1/Z2.txt"]

if IgnoreCase were applied on fixed part of pattern, it would return:
@["/tmp/d03/A1/z2.txt", "/tmp/d03/A1/Z2.txt", "/tmp/d03/a1/z1.txt"]

I'm actually leaning toward preferring the current behavior (e.g. the user probably doesn't want to include "/tmP/d03/A1/Z2.txt" if that file exists, because the fixed part of the pattern is probably only there to specify the directory in which to search). However, this should be documented.

  • There's also the question of how to search for "foo1/foo2/foo3/foo4.md" when the user wants foo1/foo2 to be case-sensitive and foo3/foo4.md to be case-insensitive. Could the following be made to work for exactly that?
    "foo1/foo2/foo3[]/foo4.md"
    Unfortunately, this currently doesn't work: Error: unhandled exception: Missing ']'
    This notation is nice and intuitive: it would specify that the fixed string is foo1/foo2 and the non-fixed string is foo3[]/foo4.md, which is equivalent to foo3/foo4.md. The first [] would mark the start of the non-fixed string.

Hmm I can't replicate:

> find /tmp
/tmp
/tmp/d03
/tmp/d03/A1
/tmp/d03/A1/Z2.txt
/tmp/d03/A1/z2.txt
/tmp/d03/a1
/tmp/d03/a1/z1.txt

import src/glob
import sequtils

let o = defaultGlobOptions + {IgnoreCase}
echo toSeq(walkGlob("/tmp/d03/A1/z*.txt", options = o))

# -> @["/tmp/d03/a1/z1.txt", "/tmp/d03/A1/Z2.txt", "/tmp/d03/A1/z2.txt"]

I'm actually leaning toward preferring the current behavior (e.g. the user probably doesn't want to include "/tmP/d03/A1/Z2.txt" if that file exists, because the fixed part of the pattern is probably only there to specify the directory in which to search)

I disagree. If you want the fixed part to be used as the root of the pattern, pass it as root. If you don't know it in advance, use splitPattern to get the stems (base, magic) and then use these as your root and pattern respectively.
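
A minimal sketch of that suggestion, assuming the package is installed through nimble and imported as glob (the snippets in this thread import src/glob from inside the repository instead). The pattern is the one from the report. Whether the root directory itself is compared case-sensitively is not stated in this thread and probably depends on the filesystem, so treat that part as an assumption.

```nim
import std/sequtils
import glob  # assumption: the haltcase/glob package, installed via nimble

# Split the pattern into its fixed directory prefix (base) and the
# part that contains the wildcards (magic).
let (base, magic) = splitPattern("/tmp/d03/A1/z*.txt")
echo "base:  ", base
echo "magic: ", magic

# Walk from the fixed prefix as an explicit root, so that IgnoreCase
# is only asked to apply to the wildcard part of the pattern.
let opts = defaultGlobOptions + {IgnoreCase}
echo toSeq(walkGlob(magic, root = base, options = opts))
```

The same shape would cover the foo1/foo2 question above: pass "foo1/foo2" as root and "foo3/foo4.md" as the pattern, instead of marking the boundary inside the pattern string.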

could the following be made to work for exactly that ? "foo1/foo2/foo3[]/foo4.md"

This is invalid syntax: you can't have an empty range. Unix's find just says "No such file or directory" when you try to do that.

weird that you can't replicate; what's your system?
on ubuntu (aws)
git clone https://github.com/timotheecour/dsnippet
cd dsnippet/glob_issue_28
nim c -r test.nim
@["d04/A1/z2.txt", "d04/A1/Z2.txt"]

nim --version
Nim Compiler Version 0.18.1 [Linux: amd64]
Compiled at 2018-07-09
Copyright (c) 2006-2018 by Andreas Rumpf

active boot switches: -d:release

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.4 LTS
Release: 16.04
Codename: xenial

Yeah something's weird here. I was able to replicate it in one shell instance and not another. Thanks to choosenim I at least ruled out Nim stable vs devel being the issue. I'm using WSL so I guess it's possible that has something to do with it.

> lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.3 LTS
Release:        16.04
Codename:       xenial

Will keep investigating...

@timotheecour could you let me know if this is still an issue with master? I'm back to not being able to replicate 😄 I tried your repro and my own separate file structure.

> cd dsnippet/glob_issue_28
> nimble install glob@#head
> nim c -r test.nim
@["d04/a1/z1.txt", "d04/A1/Z2.txt", "d04/A1/z2.txt"]

system details:
> nim -v
Nim Compiler Version 0.18.1 [Linux: amd64]
Compiled at 2018-05-15
Copyright (c) 2006-2018 by Andreas Rumpf

active boot switches: -d:release

> lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.3 LTS
Release:        16.04
Codename:       xenial

Closing since I'm pretty sure this is fixed in master.