Split by a character in JavaScript but not contiguous ones

3

6

Here is the case:

var stringExample = "hello=goodbye==hello";
var parts = stringExample.split("=");

Output:

hello,goodbye,,hello

I need this Output:

hello,goodbye==hello

Contiguous (repeated) "=" characters must be ignored; only a single "=" should split the string.

Maybe some regex?

Pericycle answered 8/1, 2013 at 14:44 Comment(1)
Will there always be alphanumeric characters around the =s that you do want to split on? Or could there be something like hello:=!goodbye that should be split into hello: and !goodbye? - Acetal
6

You can use a regex:

var parts = stringExample.split(/\b=\b/);

\b matches at a word boundary, so this only splits on an = that has a word character (letter, digit or underscore) on both sides.
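
As a quick check of how this behaves, here is a minimal sketch run against the question's string. The second input is the hello(=)goodbye case raised in the comments, and the variable names are only illustrative.

```javascript
// Split only on an "=" that sits between two word characters.
const singleEquals = /\b=\b/;

const parts = "hello=goodbye==hello".split(singleEquals);
console.log(parts); // [ 'hello', 'goodbye==hello' ]

// Limitation: nothing is split when the "=" is next to punctuation.
const unsplit = "hello(=)goodbye".split(singleEquals);
console.log(unsplit); // [ 'hello(=)goodbye' ]
```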

Poser answered 8/1, 2013 at 14:45 Comment(5)
I think only equality signs should be ignored, not any non-word character. - Index
@Index I'm not sure I see your problem. Can you come up with an example? - Indigenous
hello(=)goodbye wouldn't be split into hello( and )goodbye, for example. I believe that would be expected (needs clarification by the OP, please). - Index
@Index OK, I see what you mean. I think we should use lookbehind for that. - Indigenous
@dystroy: Yeah, only that JS doesn't support lookbehind and that is what makes the task complicated :-) - Index
3

Most probably, @dystroy's answer is the one you're looking for. But if any characters other than alphanumerics (A-Z, a-z, 0-9 or _) could surround a "splitting =", then his solution won't work. For example, the string

It's=risqué=to=use =Unicode!=See?

would be split into

"It's", "risqué=to", "use =Unicode!=See?"

So if you need to avoid that, you would normally use a lookbehind assertion:

result = subject.split(/(?<!=)=(?!=)/);  // but that doesn't work in JavaScript!

So even though this would only split on single =s, you can't use it because JavaScript doesn't support the (?<!...) lookbehind assertion.
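
That was the situation when this answer was written. Lookbehind assertions were later added to the language in ECMAScript 2018, so on engines that implement them the pattern above can be used as it stands. The sketch below assumes such an engine for the first function; the second is a lookbehind-free fallback of my own (not from this answer) that splits on runs of = with a capturing group and glues runs longer than one back onto the current part. Both function names are hypothetical.

```javascript
// Direct version: needs an engine with ES2018 lookbehind support.
// new RegExp(...) is used so that an older engine throws when the function
// is called rather than when the script is parsed.
function splitWithLookbehind(subject) {
  return subject.split(new RegExp("(?<!=)=(?!=)"));
}

// Fallback without lookbehind. Because of the capturing group, split()
// keeps the "=" runs: even indexes hold text, odd indexes hold the runs.
function splitByEqualsRuns(subject) {
  const tokens = subject.split(/(=+)/);
  const parts = [tokens[0]];
  for (let i = 1; i < tokens.length; i += 2) {
    if (tokens[i] === "=") {
      // A lone "=" is a real delimiter: start a new part.
      parts.push(tokens[i + 1]);
    } else {
      // "==" or longer is ordinary text: append it to the current part.
      parts[parts.length - 1] += tokens[i] + tokens[i + 1];
    }
  }
  return parts;
}

const sample = "It's=risqué=to=use =Unicode!=See?";
console.log(splitWithLookbehind(sample));
// [ "It's", 'risqué', 'to', 'use ', 'Unicode!', 'See?' ]
console.log(splitByEqualsRuns("hello=goodbye==hello"));
// [ 'hello', 'goodbye==hello' ]
```

Unlike a global match(), both versions keep empty fields, so "=a" comes back as an empty string followed by "a".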

Fortunately, you can always transform a split() operation into a global match() operation by matching everything that's allowed between delimiters:

result = subject.match(/(?:={2,}|[^=])*/g);

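will give you, apart from the empty strings that * also matches at each delimiter position and at the end of the input,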

"It's", "risqué", "to", "use ", "Unicode!", "See?"
Acetal answered 8/1, 2013 at 15:3 Comment(4)
+1 I was wondering why the lookbehind in the split I was testing wasn't working. Is the "doesn't work in JavaScript!" documented somewhere? - Indigenous
@dystroy: regular-expressions.info/refflavors.html is my go-to resource for this (or RegexBuddy). - Acetal
match is usually not equivalent to a split in the way it handles strings beginning/ending with delimiters. Also, it could return null. The transformation is not trivial :-) And at least you would need to use * instead of + - Index
@Bergi: Right, * it is. But I think now the result is equivalent to a (fictitious) split on the regex I mentioned above. - Acetal
-1

A first approximation to a possible solution could be:

".*[^=]=[^=].*"

Note that this is just the regex; if you want to use it with egrep, sed, Java regex or whatever, check whether anything needs to be escaped.

BEWARE: this is a first approximation and could be improved. Note that, for instance, this regex won't match the string "=" (nothing, equals sign, nothing).
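
To make the behaviour concrete in JavaScript, here is a small sketch (not part of the original answer) that runs the pattern against the sample lines given in the comments below. It only reports which lines contain a lone = with a non-= character on each side; it does not split anything. The variable names are illustrative.

```javascript
// The pattern from this answer, written as a JavaScript regex literal.
const pattern = /.*[^=]=[^=].*/;

const samples = [
  "hello=world",
  "hell0==w0rld",
  "bye=cruel==world",
  "by3==cru3l=w0rld",
  "=",
  "==",
  "===",
];

// Keep only the lines the pattern matches.
const matching = samples.filter((line) => pattern.test(line));
console.log(matching);
// [ 'hello=world', 'bye=cruel==world', 'by3==cru3l=w0rld' ]
```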

Blastosphere answered 8/1, 2013 at 15:29 Comment(6)
Can you precise how this regex should be used in OP's case? - Indigenous
@dystroy precise isn't a verb, but looking at your profile I see you must speak French. - Heliacal
@dystroy I don't understand you. What does "OP's" mean? If you are asking for an example of how to use that regex in/with a program, let's see it. The content of a file: hello=world hell0==w0rld bye=cruel==world by3==cru3l=w0rld = == === - Blastosphere
OP wants to split a string. Can you produce, using your regex, the needed JavaScript code, or was your answer really just a comment? - Indigenous
OK, now I realize I cannot use code blocks inside comments ¬¬. Also, I have used up my 5 minutes to change the comment. Explain what you are trying to ask and I will answer you :) - Blastosphere
a) This regex does nothing more than match an entire string that contains at least one = with at least one non-= character before and after the =. It doesn't do anything that would help with splitting that text into partitions. b) Even if you used capturing groups to capture the parts before and after the =, you could only split a string into two parts. As soon as there's more than one splitting =, the regex will fail. - Acetal
