What does ----s mean in the context of StringBuilder.ToString()?
Asked Answered
N

4

36

The Reference Source page for stringbuilder.cs has this comment in the ToString method:

if (chunk.m_ChunkLength > 0)
{
    // Copy these into local variables so that they 
    // are stable even in the presence of ----s (hackers might do this)
    char[] sourceArray = chunk.m_ChunkChars;
    int chunkOffset = chunk.m_ChunkOffset;
    int chunkLength = chunk.m_ChunkLength;

What does this mean? Is ----s something a malicious user might insert into a string to be formatted?

Nightmare answered 3/6, 2015 at 22:6 Comment(9)
We need a hacker to anwer this question :)Rheotaxis
I'm no hacker but I'm guessing that whatever ----s means some word or phrase that ends with s, and the programmer just didn't want to give it away that easilyYser
@54l3d, your name looks like L33Tspeak. I bet you'd be a good one to ask.Nightmare
I'm guessing the phrase (which @Jerron Vanneval's answer says was "race conditions") was elided when the Reference Source was released -- and that in the original MS source it was left intact.Improvisation
Duplicate of #30595703 (this is more generic and was asked earlier).Leslie
As leppie points out, I asked this question just a few days ago.Manufacture
Microsoft fell victim to one of the clbuttic blunders! The most famous of which is "never get involved in a land war in Asia," but only slightly less well-known is this: "Never run seek and destroy on the source code!"Kinglet
I bet the answer is 'StackOverflows' ;)Phosphaturia
@Kinglet it took some research to fully figure out that joke.Reign
A
23

In the CoreCLR repository you have a fuller quote:

Copy these into local variables so that they are stable even in the presence of race conditions

Github

Basically: it's a threading consideration.

Abomination answered 3/6, 2015 at 22:11 Comment(11)
For extra credit, someone should find more occurrences of "race condition" in the CoreCLR, and find corresponding occurrences of "----".Hackman
It's not exactly a fuller quote, since it is definitely different by more than just the ----.Broughton
Why definitely different? Different number of characters? Why do you assume character count remains the same over the transformation?Hackman
@hatchet: if you assume that each hyphen represents a single character. I am confident enough to say that both comments refer to the same idea behind it.Abomination
I'm thinking it was redacted to avoid giving a hint about what kind of exploit they were trying to avoid. And why on earth would you tell people how many characters that exploit's name is if you're trying to hide it?Playbill
@DanField It's probably just an overly zealous find-and-replace in an attempt to ensure political correctness (specifically, to remove racism).Empery
@T.J.Crowder And that removes the comment from the source code how exactly?Empery
@immibis: As I said: You fix it in the source. See the bit before the first occurrence of the word "and" in my comment.Jeraldinejeralee
@T.J.Crowder ... well then you appear to be agreeing that Microsoft should have done what they did, while simultaneously complaining about it?Empery
@immibis: You seem intent on misreading the above. 1. Fix the source (not redaction, rephrasing). 2. Deal with the person causing/allowing the issue. 3. No need to then have auto-redaction software.Jeraldinejeralee
I think "race conditions" originally was shortened "races". Even if the filter wouldn't preserve the number of characters, it'd be very weird to gobble up "condition".Kinglet
H
41

The source code for the published Reference Source is pushed through a filter that removes objectionable content from the source. Verboten words are one, Microsoft programmers use profanity in their comments. So are the names of devs, Microsoft wants to hide their identity. Such a word or name is substituted by dashes.

In this case you can tell what used to be there from the CoreCLR, the open-sourced version of the .NET Framework. It is a verboten word:

// Copy these into local variables so that they are stable even in the presence of race conditions

Which was hand-edited from the original that you looked at before being submitted to Github, Microsoft also doesn't want to accuse their customers of being hackers, it originally said races, thus turning into ----s :)

Hippodrome answered 3/6, 2015 at 22:31 Comment(4)
So race is a forbidden word? And Which was hand-edited means "automatically edited"? Because an human editor considering that race to mean human race seems to be a little strangePentose
No, you can tell it is not by looking at the CoreCLR source. The filter was pretty dumb, an early version also had a nasty bug that caused the source to be emitted twice. It just blindly applied a list of words that were considered risky. Sex, religion, race, politics, profanity. Compare to Wikipedia guidelines.Hippodrome
@HansPassant: That is not a guideline... try this one instead.Secern
Hmm, no, that's a guide to their readers, not their content contributors. OpenCLR has a contributing guide but does not mention profanity. Highly unlikely they'll pay attention to the pull request when you swear a lot, too much work.Hippodrome
A
23

In the CoreCLR repository you have a fuller quote:

Copy these into local variables so that they are stable even in the presence of race conditions

Github

Basically: it's a threading consideration.

Abomination answered 3/6, 2015 at 22:11 Comment(11)
For extra credit, someone should find more occurrences of "race condition" in the CoreCLR, and find corresponding occurrences of "----".Hackman
It's not exactly a fuller quote, since it is definitely different by more than just the ----.Broughton
Why definitely different? Different number of characters? Why do you assume character count remains the same over the transformation?Hackman
@hatchet: if you assume that each hyphen represents a single character. I am confident enough to say that both comments refer to the same idea behind it.Abomination
I'm thinking it was redacted to avoid giving a hint about what kind of exploit they were trying to avoid. And why on earth would you tell people how many characters that exploit's name is if you're trying to hide it?Playbill
@DanField It's probably just an overly zealous find-and-replace in an attempt to ensure political correctness (specifically, to remove racism).Empery
@T.J.Crowder And that removes the comment from the source code how exactly?Empery
@immibis: As I said: You fix it in the source. See the bit before the first occurrence of the word "and" in my comment.Jeraldinejeralee
@T.J.Crowder ... well then you appear to be agreeing that Microsoft should have done what they did, while simultaneously complaining about it?Empery
@immibis: You seem intent on misreading the above. 1. Fix the source (not redaction, rephrasing). 2. Deal with the person causing/allowing the issue. 3. No need to then have auto-redaction software.Jeraldinejeralee
I think "race conditions" originally was shortened "races". Even if the filter wouldn't preserve the number of characters, it'd be very weird to gobble up "condition".Kinglet
P
9

In addition to the great answer by @Jeroen, this is more than just a threading consideration. It's to prevent someone from intentionally creating a race condition and causing a buffer overflow in that manner. Later in the code, the length of that local variable is checked. If the code were to check the length of the accessible variable instead, it could have changed on a different thread between the time length was checked and wstrcpy was called:

        // Check that we will not overrun our boundaries. 
        if ((uint)(chunkLength + chunkOffset) <= ret.Length && (uint)chunkLength <= (uint)sourceArray.Length)
        {
            ///
            /// imagine that another thread has changed the chunk.m_ChunkChars array here!
           /// we're now in big trouble, our attempt to prevent a buffer overflow has been thawrted! 
           /// oh wait, we're ok, because we're using a local variable that the other thread can't access anyway.
            fixed (char* sourcePtr = sourceArray)
                string.wstrcpy(destinationPtr + chunkOffset, sourcePtr, chunkLength);
        }
        else
        {
            throw new ArgumentOutOfRangeException("chunkLength", Environment.GetResourceString("ArgumentOutOfRange_Index"));
        }
    }
    chunk = chunk.m_ChunkPrevious;
} while (chunk != null);

Really interesting question though.

Playbill answered 3/6, 2015 at 22:18 Comment(0)
K
2

Don't think that this is the case - the code in question copies to local variables to prevent bad things happening if the string builder instance is mutated on another thread.

I think the ---- may relate to a four letter swear word...

Krueger answered 3/6, 2015 at 22:13 Comment(1)
Genuinely curious what four letter swear word you think is something a hacker might "do" to cause instability in a more naïve implementation?Mello

© 2022 - 2024 — McMap. All rights reserved.