regexp help request thread


They are truly a dark & secret art imo.

If I have: %cats% are nice but %strings% are stringy, how do I match 'cats' and 'strings' but not the bit in the middle?

Gravel Puzzleworth, Sunday, 9 January 2011 22:02 (thirteen years ago)

try "(cats).*?(strings)"

a fucking stove just fell on my foot. (Colonel Poo), Sunday, 9 January 2011 22:11 (thirteen years ago)

Argh I've typed an answer but ILXbbcode keeps eating it!

agrarian gamekeeper (a passing spacecadet), Sunday, 9 January 2011 22:20 (thirteen years ago)

I don't know what language/engine you're using, but you may be able to switch it between "greedy" (use the biggest possible matching chunk of the input) vs "non-greedy" (match as little as possible) - if you're in perl, swapping * or + for *? or +? is one way to force a non-greedy match

or you could just tell it to match anything which isn't a % symbol by matching a character class of not %, i.e. ^% in square brackets

Default greedy match:
perl -e "$s = '%cats% are nice but %strings% are stringy'; @matches = ($s =~ m/%(.*)%/g); print @matches"
cats% are nice but %strings

Forcing non-greedy with *?:
perl -e "$s = '%cats% are nice but %strings% are stringy'; @matches = ($s =~ m/%(.*?)%/g); print @matches"
cats
strings

Using the character class to exclude %s -- ^% in square brackets means any character except %:
perl -e "$s = '%cats% are nice but %strings% are stringy'; @matches = ($s =~ m/%([^%]*)%/g); print @matches"
cats
strings
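
For anyone not in perl: a rough Python equivalent of the three cases above, just as a sketch. The variable names (greedy, lazy, negated) are mine, not anything from the thread, and other engines may spell things differently.

```python
import re

s = "%cats% are nice but %strings% are stringy"

# Greedy: .* grabs as much as it can, so a single match runs
# from the first % to the last % in the string.
greedy = re.findall(r"%(.*)%", s)

# Non-greedy: .*? stops at the first % that lets the match succeed.
lazy = re.findall(r"%(.*?)%", s)

# Negated character class: [^%]* can never cross a % at all.
negated = re.findall(r"%([^%]*)%", s)

print(greedy)   # ['cats% are nice but %strings']
print(lazy)     # ['cats', 'strings']
print(negated)  # ['cats', 'strings']
```

The negated class is probably the more portable of the two fixes, since it doesn't rely on the engine supporting lazy quantifiers.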

agrarian gamekeeper (a passing spacecadet), Sunday, 9 January 2011 22:23 (thirteen years ago)

ILX-code woes made me remove the bit where I set $, to \n for human-readable output, so the code as posted will print the matches run together rather than one per line as shown; posted it and then realised I could just have used any delimiter other than curly braces after my qq

agrarian gamekeeper (a passing spacecadet), Sunday, 9 January 2011 22:26 (thirteen years ago)

I think I misunderstood the % symbols!

a fucking stove just fell on my foot. (Colonel Poo), Sunday, 9 January 2011 23:14 (thirteen years ago)

but yeah "%(.*?)%" works

a fucking stove just fell on my foot. (Colonel Poo), Sunday, 9 January 2011 23:20 (thirteen years ago)

That works, but why doesn't "% are nice but %" count as a match?

Gravel Puzzleworth, Monday, 10 January 2011 00:27 (thirteen years ago)

the bit in the middle u mean? because that % symbol ("% are ...") has already been matched in its "%cats%" incarnation.

iow "%(.*?)%" will match "%cats%", the processor will then go to work on the remaining part, "are nice but %strings% are stringy", and match "%strings%".
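
To see the "already been matched" part in action, here is a small Python sketch (assuming Python's re module; the names spans and overlapping are made up for illustration). It prints where each match starts and ends, then shows one way to get the middle bit as well if you ever did want it, using a lookahead so that no % is consumed.

```python
import re

s = "%cats% are nice but %strings% are stringy"

# Each match consumes its characters; the scan resumes right after
# the closing % of the previous match, so that % can't be reused.
spans = [(m.start(), m.end(), m.group(0)) for m in re.finditer(r"%(.*?)%", s)]
print(spans)  # [(0, 6, '%cats%'), (20, 29, '%strings%')]

# A lookahead matches without consuming anything, so every % gets a
# chance to act as an opening delimiter, including the closing ones.
overlapping = re.findall(r"(?=%([^%]*)%)", s)
print(overlapping)  # ['cats', ' are nice but ', 'strings']
```

The second match starts at index 20 because everything up to index 6, including the % that closes "%cats%", has already been used up.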

nanoflymo (ledge), Monday, 10 January 2011 09:47 (thirteen years ago)

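you can also "force non-greedy" with a negated class like [^\n\r]*, though strictly that is still greedy: it just can't run past the end of the line, so it only helps if each line has a single %...% pair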
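
A quick Python check of what [^\n\r]* does on the example from this thread, as a sketch only (the two_lines string is invented for the test, it isn't from the original question):

```python
import re

pattern = r"%([^\n\r]*)%"

# On a single line the class behaves like a greedy .* and
# swallows everything between the first and last %.
s = "%cats% are nice but %strings% are stringy"
same_line = re.findall(pattern, s)
print(same_line)  # ['cats% are nice but %strings']

# With a line break between the two pairs, the class can't cross
# the newline, so each pair is matched separately.
two_lines = "%cats% are nice\nbut %strings% are stringy"
per_line = re.findall(pattern, two_lines)
print(per_line)  # ['cats', 'strings']
```

So it only looks non-greedy when the delimited bits happen to sit on separate lines; for the original single-line input, [^%]* or .*? is what does the job.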

progressive cuts (Tracer Hand), Monday, 17 January 2011 13:01 (thirteen years ago)

