C# string.IndexOf() returns unexpected value

This question applies to C#, .NET Compact Framework 2 and Windows CE 5 devices.

I encountered a bug in a .NET DLL that had been in use on very different CE devices for years without showing any problems. Suddenly, on a new Windows CE 5.0 device, the bug appeared in the following code:

string s = "Print revenue receipt"; // has only single space chars 
int i = s.IndexOf("  "); // two space chars

I expect i to be -1. That held until today, when IndexOf suddenly returned 5.

Since this behaviour doesn't occur when using

int i = s.IndexOf("  ", StringComparison.Ordinal);

, I'm quite sure that this is a culture-based phenomenon, but I can't identify what difference this new device makes. It is a mostly identical version of a known device (just a faster CPU and a new board).

Both devices:

  • run Windows CE 5.0 with identical localization
  • System.Environment.Version reports '2.0.7045.0'
  • CultureInfo.CurrentUICulture and CultureInfo.CurrentCulture report 'en-GB' (also tested with 'de-DE')
  • 'all' related registry keys are equal.
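The values compared above can be collected in one line per device, which makes a side-by-side diff easier. This is only a sketch: the class and method names (`EnvironmentReport`, `Describe`) are made up for illustration, and it reports nothing beyond the properties already named in the list.

```csharp
using System;
using System.Globalization;

public static class EnvironmentReport
{
    // Builds a one-line summary of the runtime version and culture settings.
    // Hypothetical helper; it only reads the properties listed in the question.
    public static string Describe()
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "CLR={0}; CurrentCulture={1}; CurrentUICulture={2}",
            Environment.Version,
            CultureInfo.CurrentCulture.Name,
            CultureInfo.CurrentUICulture.Name);
    }

    public static void Main()
    {
        // Run on each device and compare the printed lines.
        Console.WriteLine(Describe());
    }
}
```

Identical output on both devices would only show that these settings match; it says nothing about the collation data the culture-sensitive search actually consults.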

The new device came with CF 3.5 preinstalled. I experimentally renamed its GAC files, with no change in the described behaviour. Since version 2.0.7045.0 is always reported at runtime, I assume these assemblies have no effect.

Although this is not difficult to fix, I cannot stand it when things seem that magical. Any hints as to what I am missing?

Edit: it is getting stranger and stranger, see screenshot: screenshot

One more: screenshot

Sharecrop answered 28/6, 2013 at 11:22 Comment(9)
you run this exact code, and you get 5?Eventempered
not exactly of course, see my screenshot above. I corrected the question also. Interesting points: * s = "Print revenue"; // result -1 * s = "Drucke Beleg aus"; // result -1 (!) pls excuse my frequent edits, I'm new to SO.Sharecrop
i.sstatic.net/iGxNb.pngSharecrop
Did you try looping through each character in the string s to see if there are any characters that are not displayed? For example, in question #4893716 a soft hyphen caused the same issue that you have.Hasa
@ErgibtSinn have you tried to Clean & Rebuild your project?Eventempered
What is CultureInfo.CurrentCulture and CultureInfo.CurrentUICulture on your computer?Pease
@Eren, yes of course. The code is currently isolated in an empty test project. The device has also been reset several times.Sharecrop
@Dmitry, currently en-GB, but it also occurs under de-DE. But only on this new device; all other devices do not show this behaviour.Sharecrop
@Eric, there is no other hidden char, simply only the hand written code you can see in the screenshots (searched SO intensely before posting).Sharecrop
H
4

I believe you already have the answer: use an ordinal search.

    int i = s.IndexOf("  ", StringComparison.Ordinal);

You can read a small section in the documentation for the String Class which has this to say on the subject:

String search methods, such as String.StartsWith and String.IndexOf, also can perform culture-sensitive or ordinal string comparisons. The following example illustrates the differences between ordinal and culture-sensitive comparisons using the IndexOf method. A culture-sensitive search in which the current culture is English (United States) considers the substring "oe" to match the ligature "œ". Because a soft hyphen (U+00AD) is a zero-width character, the search treats the soft hyphen as equivalent to Empty and finds a match at the beginning of the string. An ordinal search, on the other hand, does not find a match in either case.
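A small sketch of the distinction the documentation describes, using the string from the question and a soft-hyphen example. The names `SearchModes` and `OrdinalIndexOf` are made up for illustration. Expected results are noted only for the ordinal calls, since the culture-sensitive results depend on the culture and on the platform's collation data.

```csharp
using System;

public static class SearchModes
{
    // Ordinal search: compares UTF-16 code units, independent of culture.
    public static int OrdinalIndexOf(string text, string value)
    {
        return text.IndexOf(value, StringComparison.Ordinal);
    }

    public static void Main()
    {
        string s = "Print revenue receipt"; // single spaces only
        string twoSpaces = "  ";

        // Ordinal: no two consecutive spaces exist, so this prints -1.
        Console.WriteLine(OrdinalIndexOf(s, twoSpaces));

        // Culture-sensitive: result may differ between platforms and cultures.
        Console.WriteLine(s.IndexOf(twoSpaces, StringComparison.CurrentCulture));

        string withSoftHyphen = "co\u00ADoperative";

        // Ordinal: the soft hyphen is an ordinary code unit at index 2.
        Console.WriteLine(OrdinalIndexOf(withSoftHyphen, "\u00AD"));

        // Culture-sensitive: the soft hyphen may be treated as ignorable.
        Console.WriteLine(withSoftHyphen.IndexOf("\u00AD", StringComparison.CurrentCulture));
    }
}
```

Running this on both the old and the new device and comparing the two culture-sensitive lines would show whether the devices disagree only for this input or more generally.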

Hasa answered 28/6, 2013 at 13:6 Comment(3)
I know that this is the correct answer to the question "how do i fix this?" - but my question is: "why is this happening?".Sharecrop
To find out, I suggest you iterate through each character of your problem string in the debugger. There may be a character in it that you are not seeing.Hasa
this couldn't explain why it works on all other devices. At least the VS Debugger doesn't provide any hidden chars when copy+pasting into a hex editor. Please note the example with the loop over the alphabet.Sharecrop
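For anyone who wants to run the character-by-character check suggested in these comments without a hex editor, a minimal sketch follows. `CharDump` and `ToCodeUnits` are hypothetical names; the helper simply prints every UTF-16 code unit so that an invisible character such as U+00AD would stand out.

```csharp
using System;
using System.Text;

public static class CharDump
{
    // Returns each UTF-16 code unit of the input as "U+XXXX", separated by spaces.
    public static string ToCodeUnits(string text)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.Length; i++)
        {
            if (i > 0)
            {
                sb.Append(' ');
            }
            sb.Append("U+");
            sb.Append(((int)text[i]).ToString("X4"));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ToCodeUnits("a b"));      // U+0061 U+0020 U+0062
        Console.WriteLine(ToCodeUnits("a\u00ADb")); // U+0061 U+00AD U+0062
    }
}
```

If the dump of the problem string shows only U+0020 between the words, hidden characters can be ruled out as the cause.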
J
0

Culture stuff can really appear to be quite magical on some systems. What I came to always do after years of pain is always set the culture information manually to InvariantCulture where I do not explicitly want different behaviour for different cultures. So my suggestion would be: Make that IndexOf check always use the same culture information, like so:

int i = s.IndexOf("  ", StringComparison.InvariantCulture);
Jetsam answered 28/6, 2013 at 11:29 Comment(3)
I tried this also, but the same behaviour appeared. Only StringComparison.Ordinal fixed it. I need to know where the crucial difference is hidden before the weekend starts ;-) It is also very difficult to understand why two spaces could be treated as equal to one, while string.Equals("  ", " ") (two spaces vs. one space) returns false...Sharecrop
String.Equals uses ordinal comparison; try String.Compare("  ", " ") instead.Doggone
String.Compare returns 1, so they are not recognized as equal.Sharecrop
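The checks discussed in these comments can be put side by side as below. This is a sketch with made-up names (`EqualsVersusCompare`, `OrdinalEqual`, `OrdinalSign`). Only the ordinal results are annotated, because they do not depend on the device; the culture-sensitive line is printed without an expected value.

```csharp
using System;

public static class EqualsVersusCompare
{
    // Ordinal equality: true only if both strings hold the same code units.
    public static bool OrdinalEqual(string a, string b)
    {
        return string.Equals(a, b, StringComparison.Ordinal);
    }

    // Sign (-1, 0 or 1) of an ordinal comparison.
    public static int OrdinalSign(string a, string b)
    {
        return Math.Sign(string.Compare(a, b, StringComparison.Ordinal));
    }

    public static void Main()
    {
        string two = "  "; // two spaces
        string one = " ";  // one space

        Console.WriteLine(OrdinalEqual(two, one)); // False
        Console.WriteLine(OrdinalSign(two, one));  // 1

        // Culture-sensitive comparison: value depends on culture and platform.
        Console.WriteLine(string.Compare(two, one, StringComparison.CurrentCulture));
    }
}
```

A non-zero culture-sensitive result here, combined with a culture-sensitive IndexOf that still finds two spaces, would suggest that the comparison and search paths disagree on this device.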
M
0

The reference at http://msdn.microsoft.com/en-us/library/k8b1470s.aspx states:

"Character sets include ignorable characters, which are characters that are not considered when performing a linguistic or culture-sensitive comparison. In a culture-sensitive search, if value contains an ignorable character, the result is equivalent to searching with that character removed."

This is from the 4.5 reference; the references for previous versions don't contain anything like that.

So let me take a guess: they changed the rules from 4.0 to 4.5, and now the second space of a two-space sequence is considered to be an "ignorable character", at least if the engine recognizes your string as English text (like your example string s), otherwise not.

And somehow on your new device, a 4.5 DLL is used instead of the expected 2.0 DLL.

A wild guess, I know :)

Multifaceted answered 28/6, 2013 at 13:16 Comment(1)
A very wild guess but reasonable and educated. System.Environment.Version shows 2.0.7045.0 at run time, so CF2 SP2 is used. Besides this CF2 installation, there are CF3.5 DLLs present additionally.Sharecrop
