atom/scandal

Does not handle searching for multiline regexes

benogle opened this issue · 24 comments

PathSearcher runs the regex on each line, not on the text as a whole. So something like [a-z]+\n[0-9]+ will not be matched.
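
To make the limitation concrete, here is a minimal sketch in plain JavaScript. It is not scandal's code; `searchByLine` and `searchWholeText` are hypothetical helpers that only mimic the two strategies.

```javascript
// Hypothetical helpers for illustration; not part of scandal's API.
const pattern = /[a-z]+\n[0-9]+/;

// Mimics a line-oriented search: each line is tested without its newline.
function searchByLine(text, regex) {
  return text.split('\n').filter((line) => regex.test(line));
}

// Runs the regex against the text as a whole.
function searchWholeText(text, regex) {
  const match = regex.exec(text);
  return match ? match[0] : null;
}

const text = 'abc\n123\n';
console.log(searchByLine(text, pattern));    // empty array: no single line contains "\n"
console.log(searchWholeText(text, pattern)); // the two-line match "abc\n123"
```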

This was a design decision for efficiency. We only need to have each line, not the whole file. Right now the file reader only reads 10k at a time, and searches on the lines returned.
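
Roughly, a chunked line reader can look like the sketch below. This is an assumption about the general technique, not scandal's actual reader; `linesFromChunks` and `searchLines` are hypothetical names, and the chunks are plain strings rather than real file reads.

```javascript
// Hypothetical sketch; scandal's real reader is not shown in this thread.
// Yields complete lines from a sequence of string chunks, carrying any
// trailing partial line over to the next chunk.
function* linesFromChunks(chunks) {
  let partial = '';
  for (const chunk of chunks) {
    const pieces = (partial + chunk).split('\n');
    partial = pieces.pop(); // the last piece has no newline yet
    yield* pieces;
  }
  if (partial !== '') yield partial; // final line without a trailing newline
}

// Tests each line on its own and records the line number of every hit.
function searchLines(chunks, regex) {
  const hits = [];
  let lineNumber = 0;
  for (const line of linesFromChunks(chunks)) {
    lineNumber += 1;
    if (regex.test(line)) hits.push({ lineNumber, line });
  }
  return hits;
}

// The chunk boundary falls in the middle of "hello".
const hits = searchLines(['first\nhel', 'lo world\nlast'], /hello/);
console.log(hits); // one hit, on line 2
```

Memory stays bounded by the chunk size plus the longest line, which is why this design scales to large files but never sees a newline inside a line.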

I'm not sure how to handle multiline regexes efficiently. How would it handle a 100MB file? I'm opening this for discussion. As I build out the PathReplacer, it will have the same limitation.

One approach could be to:

  1. set buffer = ''
  2. read 10k chunk, append it onto buffer
  3. run the regex on buffer
  4. if match: buffer = buffer.slice(match.end)
  5. goto 2

But on files with no match, it will read the entire file into memory.
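
The steps above could be sketched like this. It is a rough illustration, not a proposed patch: `searchBuffered` is a hypothetical name and `chunks` is any iterable of strings standing in for the 10k reads. One deliberate deviation from the steps: a match that touches the end of the buffer is held back until more input arrives, since the next chunk could extend it.

```javascript
// Sketch only: hypothetical names, not scandal's API.
// `chunks` is any iterable of strings, e.g. 10k pieces read from a file.
function searchBuffered(chunks, regex) {
  // Copy the regex without g/y so exec() always scans from the buffer start.
  const re = new RegExp(regex.source, regex.flags.replace(/[gy]/g, ''));
  const matches = [];
  let buffer = '';
  let offset = 0; // index of buffer[0] within the whole input

  // Records matches in the buffer and drops the consumed prefix (step 4).
  // While more input may arrive, a match touching the end of the buffer is
  // left alone, because the next chunk could extend it.
  const drain = (isLastChunk) => {
    let match;
    while (buffer.length > 0 && (match = re.exec(buffer)) !== null) {
      const end = match.index + match[0].length;
      if (!isLastChunk && end === buffer.length) break;
      matches.push({ start: offset + match.index, text: match[0] });
      const cut = Math.max(end, 1); // always advance, even on an empty match
      buffer = buffer.slice(cut);
      offset += cut;
    }
  };

  for (const chunk of chunks) {
    buffer += chunk; // step 2
    drain(false);    // step 3
  }
  drain(true); // input exhausted: whatever matches now is final
  return matches;
}

// The match straddles the chunk boundary.
const found = searchBuffered(['xx abc\n12', '3 yy'], /[a-z]+\n[0-9]+/);
console.log(found); // one match, "abc\n123", starting at offset 3
```

The sketch keeps the drawback noted above: when nothing matches, the buffer grows until it holds the whole file. It can also still report a match too early for patterns whose result depends on text beyond the current buffer, such as lookaheads.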

This would be a really useful feature. Is the PathSearcher object interchangeable?

Yeah, it can be. We can subclass it or something. Really, the thing that needs to change is the way we read the file.

Was thinking of checking the regex for multiline search and then running a different codepath that handles multiline.

+1 for different handlers, as not being able to search for \n at all is a huge pain. I am ok with complicated regex searches taking longer.

+1 for different handlers as well

Multiline search/replace is a crucial feature for me – as well as for many other devs, I guess. I’m on the verge of switching from Sublime to Atom – and this is the biggest thing holding me back.

👍

dyoji commented

+1

+1, I'm really missing this useful feature, which I need a lot. It makes me go back to Dreamweaver just for multiline search and replace. I won't mind if it takes a little more time to find.

+1

+1. I would be willing to manually put in the \n to enable it. I need this to be able to wrap a multi-line function call in extra parentheses in JavaScript.

+1 if I were able to do multi-line regex search in Atom...

What if we were to keep the current behavior for single-line searches (because it's fast), and only do something slower if we see a \n in the search string?
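
One way to picture that check, as a hedged sketch: `looksMultiline` and `chooseSearchPath` are hypothetical names, and the heuristic is deliberately conservative. It also flags `\s`, `\D`, `\W` and negated classes like `[^a]`, since those can consume a line break too. False positives only cost speed, because they just send the search down the slower path.

```javascript
// Hypothetical heuristic, not scandal's implementation: guess whether a
// regex source could match across a line break.
function looksMultiline(source) {
  // \n or \r escapes, \s, \D, \W, negated classes such as [^a], and
  // literal line breaks in the source can all consume a newline.
  return /\\[nrsDW]|\[\^|[\n\r]/.test(source);
}

// Picks the fast per-line path unless the pattern might span lines.
function chooseSearchPath(source) {
  return looksMultiline(source) ? 'whole-text' : 'line-by-line';
}

console.log(chooseSearchPath('[a-z]+\\n[0-9]+')); // "whole-text"
console.log(chooseSearchPath('foo[0-9]+'));       // "line-by-line"
```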

I'm working on this. There's some preliminary stuff at https://github.com/marnen/scandal/tree/5-multiline-find-and-replace.

@marnen What's the status on this?

@CharlotteDunois I never got very far, largely because I had higher priorities at the time. You're welcome to look at the branch I created and see if anything is useful; I'll take another look at it as well if I have time.

I'm having a similar issue. I can do a multiline regex search within a single file with no problem, but when I search the folder/project, no results are found.

Find

@if \(\$errors->has\((.*)\)\)(.*|\r?\n)*?@endif

Replace

@include('partials.forms.error', ['input' => $1])

Example to be replaced

@if ($errors->has('name'))
	<span class="help-block">
		<strong>{{ $errors->first('name') }}</strong>
	</span>
@endif
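
For reference, the pattern and replacement above do work when the regex sees the whole text, which points at the line-by-line project search rather than the regex. The sketch below checks this in plain JavaScript; it is not Atom's search code, and `applyToWholeText` is a hypothetical helper.

```javascript
// Sketch in plain JavaScript; `applyToWholeText` is a hypothetical helper.
const find = /@if \(\$errors->has\((.*)\)\)(.*|\r?\n)*?@endif/g;
const replacement = "@include('partials.forms.error', ['input' => $1])";

const template = [
  "@if ($errors->has('name'))",
  '\t<span class="help-block">',
  "\t\t<strong>{{ $errors->first('name') }}</strong>",
  '\t</span>',
  '@endif',
].join('\n');

// Applies the find/replace pair to the text as a whole.
function applyToWholeText(text) {
  return text.replace(find, replacement);
}

console.log(applyToWholeText(template));
// @include('partials.forms.error', ['input' => 'name'])

// No single line contains both "@if" and "@endif", so testing the same
// pattern one line at a time finds nothing.
const anyLineMatches = template.split('\n').some((line) => line.search(find) !== -1);
console.log(anyLineMatches); // false
```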

The devs are well aware at this point. It's just not an easy fix so they haven't gotten to it yet.

I'm not an Atom dev here, so take everything here with a grain of salt.

To add on to what @garrettw said, if you look in the relevant source file itself, you'll find that it reads everything in chunks. JS regexps provide no way to suspend and resume matching, so you'd have to rewrite the regexp dynamically and reparse the result to do it. This, however, is a very non-trivial thing to do, and you really need engine support (and most non-streaming regexp engines don't provide this facility).