Does not handle searching for multiline regexes

Question

benogle opened this issue 11 years ago · 24 comments

PathSearcher runs the regex on each line, not on the text as a whole. So something like [a-z]+\n[0-9]+ will not be matched.
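To illustrate the difference, here is a minimal sketch in plain JavaScript. It does not use PathSearcher itself; the sample text and variable names are made up for the example.

```javascript
// Sample text containing letters on one line and digits on the next.
const text = 'abc\n123\n';
const pattern = /[a-z]+\n[0-9]+/;

// Line-by-line: each line is tested alone, so the \n is never seen.
const perLine = text.split('\n').some((line) => pattern.test(line));

// Whole text: the regex can cross the line break.
const wholeText = pattern.test(text);

console.log({ perLine, wholeText });
```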

This was a design decision for efficiency. We only need to have each line, not the whole file. Right now the file reader only reads 10k at a time, and searches on the lines returned.

I'm not sure how to handle multiline regexes efficiently. How would it handle a 100MB file? I'm opening this for discussion. As I build out the PathReplacer, it will have the same limitation.

One approach could be to:

1. set buffer = ''
2. read 10k chunk, append it onto buffer
3. run the regex on buffer
4. if match: buffer = buffer.slice(match.end)
5. goto 2

But on files with no match, it will read the entire file into memory.
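The steps above could be sketched roughly as follows. This is only an illustration, not scandal's code: `searchChunks` is a hypothetical helper, `chunks` stands in for the 10k reads, and the rule of holding back a match that touches the end of the buffer is an added assumption so a greedy match is not cut short at a chunk boundary. As noted, the buffer still grows without bound when nothing matches.

```javascript
// Hypothetical helper: `chunks` is any iterable of strings (e.g. 10k reads).
function searchChunks(chunks, pattern) {
  // Non-global, non-sticky copy so exec() always scans from the buffer start.
  const regex = new RegExp(pattern.source, pattern.flags.replace(/[gy]/g, ''));
  const matches = [];
  let buffer = '';
  let consumed = 0; // characters already dropped from the front of the buffer

  const drain = (isLast) => {
    let match;
    while (buffer.length > 0 && (match = regex.exec(buffer)) !== null) {
      const end = match.index + match[0].length;
      // Assumption: a match touching the buffer end may grow with more input.
      if (!isLast && end === buffer.length) break;
      matches.push({ index: consumed + match.index, text: match[0] });
      // Always drop at least one character so empty matches cannot loop forever.
      const cut = Math.max(end, 1);
      buffer = buffer.slice(cut);
      consumed += cut;
    }
  };

  for (const chunk of chunks) {
    buffer += chunk; // step 2: append the chunk
    drain(false);    // steps 3-4: match and slice
  }
  drain(true);       // no more input: accept matches at the very end
  return matches;
}

const found = searchChunks(['abc\n1', '23 xyz\n', '45'], /[a-z]+\n[0-9]+/);
console.log(found);
```

Patterns with lookahead or other context-sensitive constructs can still give different results than a search over the whole file, so this is a starting point for discussion rather than a complete answer.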

ShimShamSam commented 9 years ago

+1

benkelaar commented 9 years ago

+1

luccamordente commented 9 years ago

+1

Phylodome commented 9 years ago

👍

dyoji commented 9 years ago

+1

vincentorback commented 8 years ago

+1

pebkacs commented 8 years ago

+1

AndersDJohnson commented 8 years ago

+1

TKasperczyk commented 8 years ago

+1

Answer 1 · 2014-07-11T18:57:25.000Z

This would be a really useful feature. Is the PathSearcher object interchangeable?

Answer 2 · 2014-07-11T19:01:07.000Z

Yeah, it can be. We can subclass it or something. Really, the thing that needs to change is the way we read the file.

Was thinking of checking the regex for multiline search and then running a different codepath that handles multiline.

Answer 3 · 2014-12-19T22:48:12.000Z

+1 for different handlers, as not being able to search for \n at all is a huge pain. I am ok with complicated regex searches taking longer.

Answer 4 · 2014-12-29T07:59:18.000Z

+1 for different handlers as well

Answer 5 · 2015-07-27T13:34:43.000Z

Multiline search/replace is a crucial feature for me – as well as for many other devs, I guess. I’m on the verge of switching from Sublime to Atom – and this is the biggest thing holding me back.

Answer 6 · 2015-11-24T16:47:08.000Z

+1, this is a really useful missing feature that I need a lot. It makes me go back to Dreamweaver just to do multiline search and replace. I won't mind if it takes a little more time to find.

Answer 7 · 2016-02-05T22:22:38.000Z

+1. I would be willing to manually put in the \n to enable it. I need this to be able to wrap a multi-line function call in extra parentheses in JavaScript.

Answer 8 · 2016-07-29T16:07:27.000Z

+1 If only I were able to do multi-line regex search in Atom...

Answer 9 · 2016-12-22T17:50:26.000Z

What if we were to keep the current behavior for single-line searches (because it's fast), and only do something slower if we see a \n in the search string?
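A rough sketch of that check, assuming the search string arrives as regex source text. `mightSpanLines` and `chooseStrategy` are hypothetical names, and the heuristic only looks for an explicit \n or \r; it would miss patterns such as \s or [^x] that can also match a line break.

```javascript
// Hypothetical heuristic: does the regex source mention a line break explicitly?
function mightSpanLines(source) {
  // Either a literal newline character, or the escape \n / \r that is not
  // itself preceded by an escaping backslash (so `\\n` does not count).
  return /[\n\r]/.test(source) || /(?:^|[^\\])(?:\\\\)*\\[nr]/.test(source);
}

// Keep the fast per-line path unless the pattern asks for a line break.
function chooseStrategy(source) {
  return mightSpanLines(source) ? 'whole-buffer' : 'line-by-line';
}

console.log(chooseStrategy('[a-z]+\\n[0-9]+'));
console.log(chooseStrategy('foo.*bar'));
```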

Answer 10 · 2016-12-22T21:45:56.000Z

I'm working on this. There's some preliminary stuff at https://github.com/marnen/scandal/tree/5-multiline-find-and-replace.

Answer 11 · 2018-01-05T09:29:21.000Z

@marnen What's the status on this?

Answer 12 · 2018-01-07T18:28:50.000Z

@CharlotteDunois I never got very far, largely because I had higher priorities at the time. You're welcome to look at the branch I created and see if anything is useful; I'll take another look at it as well if I have time.

Answer 13 · 2018-04-26T18:45:38.000Z

I'm having a similar issue. I can do a multiline regex search within a single file with no problem, but when I try to search the folder/project, no results are found.

Find

@if \(\$errors->has\((.*)\)\)(.*|\r?\n)*?@endif

Replace

@include('partials.forms.error', ['input' => $1])

Example to be replaced

@if ($errors->has('name'))
	<span class="help-block">
		<strong>{{ $errors->first('name') }}</strong>
	</span>
@endif
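For reference, a small sketch of the same Find and Replace in plain JavaScript, run against the whole snippet as one string and then line by line. The variable names are made up for the example, and this uses the JavaScript regex engine directly rather than Atom's project search.

```javascript
// The Find pattern and Replace string quoted above, as a JavaScript regex.
const findPattern = /@if \(\$errors->has\((.*)\)\)(.*|\r?\n)*?@endif/;
const replacement = "@include('partials.forms.error', ['input' => $1])";

// The example block, joined into a single string with real line breaks.
const blade = [
  "@if ($errors->has('name'))",
  '\t<span class="help-block">',
  "\t\t<strong>{{ $errors->first('name') }}</strong>",
  '\t</span>',
  '@endif',
].join('\n');

// Whole-string replace: the pattern can span all five lines.
const replaced = blade.replace(findPattern, replacement);

// Per-line test: no single line contains both @if and @endif.
const anyLineMatches = blade.split('\n').some((line) => findPattern.test(line));

console.log(replaced);
console.log(anyLineMatches);
```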

Answer 14 · 2018-04-26T18:49:57.000Z

The devs are well aware at this point. It's just not an easy fix so they haven't gotten to it yet.

Answer 15 · 2018-07-30T18:00:20.000Z

I'm not an Atom dev, so take everything here with a grain of salt.

To add on to what @garrettw said, if you look in the relevant source file itself, you'll find that it reads everything in chunks. JS regexps provide no way to suspend and resume matching, so you'd have to rewrite the regexp dynamically and reparse the result to do it. This, however, is a very non-trivial thing to do, and you really need engine support (and most non-streaming regexp engines don't provide this facility).