Strings and Garbage Collection

8

23

I have heard conflicting stories on this topic and am looking for a little bit of clarity.

How would one dispose of a string object immediately, or at the very least clear traces of it?

Borodin answered 11/3, 2010 at 6:52 Comment(3)
Is this a security question or a question about the efficient use of memory and how garbage collection works?Roaring
As was mentioned by others, stay away from GC.Collect. Using it will actually hurt performance as it will unnecessarily promote otherwise short-lived objects into longer-living generations. Strings will be collected in Gen0 (the most frequently collected generation) if you declare them as local variables and let them go out of scope.Brittni
I'll say a bit of both really.Borodin
B
11

That depends. Literal strings are interned by default, so even if your application no longer references one, it will not be collected, because it is still referenced by the internal interning structure. Other strings are just like any other managed object: as soon as they are no longer referenced by your application, they are eligible for garbage collection.

More about interning here in this question: Where do Java and .NET string literals reside?
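A minimal sketch of the difference described above, using `string.IsInterned` to check whether a given instance is the one held in the intern pool. The class and method names are made up for illustration, and the result for the literal assumes the default interning behaviour the answer describes.

```csharp
using System;

public static class InterningDemo
{
    // Returns true if this exact string instance is the one stored in the intern pool.
    // string.IsInterned returns the pooled instance, or null if the value is not pooled.
    public static bool IsPooledInstance(string s)
    {
        return object.ReferenceEquals(string.IsInterned(s), s);
    }

    public static void Main()
    {
        // A literal: normally interned when the code that uses it is loaded.
        string literal = "hello";

        // Built at runtime: an ordinary heap object unless string.Intern is called on it.
        string runtime = new string(new[] { 'w', 'o', 'r', 'l', 'd', '!' });

        Console.WriteLine(IsPooledInstance(literal));
        Console.WriteLine(IsPooledInstance(runtime));
    }
}
```

Only the runtime-built string becomes eligible for collection once the last reference to it is gone; the pooled literal stays reachable through the intern pool.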

Balboa answered 11/3, 2010 at 6:57 Comment(11)
Is there any way to make it that certain string literals are not to be interned?Borodin
@Kyle: Only literal strings are interned automatically. So if you have string s = "hello"; it will be interned, whereas any string you create at runtime will not be interned unless you do so yourself.Balboa
Hehe, yes I understand that. I'm asking if there is any way to not have those strings interned?Borodin
Kyle: no. Literal strings are part of your assembly, and your assembly is never garbage collected. But does your assembly really contain so many string literals that they're causing memory pressure? Or if this is about security ("clearing traces"), remember that users can inspect your assembly, including its literal strings, without ever executing it -- more easily in fact than examining strings constructed at runtime!Demisec
@Kyle: Sorry about that. Please see itowlson's comment.Balboa
@Demisec - Thanks! @Brian, np :) Do you know of any method that can "safely" store a string embedded in an application without SecureString'ing a 1100 char string one char at a time? I say "Safely" as it is not heavy security, more like a bonus.Borodin
Yes and no. It depends on how safe is "safely." Whatever option you choose, a knowledgeable bad guy will be able to see 1100 characters of gibberish; and if you have to include the decoding code alongside the "secure" string, then a knowledgeable bad guy can disassemble your code and decipher the "gibberish." So it depends on how much knowledge and motivation you're trying to defend against, and how bad it is if the bad guy succeeds. If all you want to do is discourage the casual eye, then encryption with a hardwired key might suffice -- but this would definitely NOT be heavy security!Demisec
@Demisec - Absolutely, these are deterrents more than anything else. So, if I load a string into a SecureString just to keep it from prying eyes, would this be considered good enough as a deterrent?Borodin
You've said that you don't need "heavy security, more like a bonus," so I'd say that encrypting the embedded string would probably be fine as long as you don't make the decrypt key too obvious to the casual Reflectorista. On the other hand, you really only need to bother with SecureString if you want to make sure that the decrypted string really does get cleared. An attacker too lazy to have a go at the 1100 characters of gibberish is probably also too lazy to grovel through the pagefile or slap a debugger on you, so it's not clear to me if SecureString is buying you anything.Demisec
Incidentally, one possible way around the "obvious attack target" of 1100 characters of gibberish is to use steganography -- e.g. embed those characters as, say, a bitmap resource rather than an encrypted string. This is security through obscurity and won't fool a determined attacker, but if you just wish to deter the idly curious, it might present a less obvious thing for them to poke at!Demisec
@Demisec - Thanks a ton. I was looking into this just the other day and it is definitely going to be done from my point of view ;) Thanks, I'm accepting this answer due to the original answer and following comments.Borodin
L
8

If you need to protect a string and be able to dispose of it when you want, use the System.Security.SecureString class.

Protect sensitive data with .NET 2.0's SecureString class
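A minimal sketch of filling a `SecureString` one character at a time and disposing of it deterministically. The helper name `FromChars` and the sample characters are hypothetical. Note that current Microsoft guidance discourages `SecureString` for new development, so treat this as an illustration of the API rather than a recommendation.

```csharp
using System;
using System.Security;

public static class SecureStringDemo
{
    // Copies the characters into a SecureString one at a time,
    // then zeroes the source buffer so no plain copy is left in the array.
    public static SecureString FromChars(char[] source)
    {
        var secure = new SecureString();
        foreach (char c in source)
        {
            secure.AppendChar(c);
        }
        secure.MakeReadOnly();
        Array.Clear(source, 0, source.Length);
        return secure;
    }

    public static void Main()
    {
        // Hypothetical input; in practice this would come from key presses or a stream.
        char[] input = { 's', '3', 'c', 'r', '3', 't' };

        using (SecureString secret = FromChars(input))
        {
            Console.WriteLine(secret.Length);
        } // Dispose() releases the protected buffer here instead of waiting for the GC.
    }
}
```

The secret only stays protected as long as it is never converted to a System.String along the way.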

Lieberman answered 11/3, 2010 at 7:9 Comment(6)
Sure, all well and good, but the problem comes with parameters and the like. They will be in plain text before they're secured, which makes doing it this way pointless (or of very little value). The memory collection side is great, but isn't it expensive?Borodin
Well, why couldn't one use a SecureString as a parameter? I've never had a need to try it, but it seems like it should work and be secure. If anyone stores anything that needs to be secure in a non-secure variable, you can bet a simple debugger can get the contents as plain text. Look at a simple textbox used as a password field.Lieberman
-cont All one needs is to look at the raw contents using the windows api and all the *'s in the world are worthless. Now if that textbox was an inherited user control that replaces the backing string variable with a securestring instance, and that instance was passed to a safe calling function, then what else would need to be done?Lieberman
What I was getting at is more a question of interoperability. If I have a WebService written in say, PHP, that would be useless then as a SecureString would need to load that in as a plain string (encryption aside here).Borodin
The reason SecureString has operators for working a byte at a time is because that's how it's supposed to be used. Read from the user/file/stream/etc. into the SecureString (byte by byte), then write to the crypto provider/stream a byte at a time. The full string should never be in memory; at worst you're talking about lots of single bytes with no ordering.Holmun
Microsoft discourages using SecureString. It is not considered secure anymore.Patti
U
5

I wrote a little extension method for the string class for situations like this. It's probably the only sure way of ensuring the string itself is unreadable until collected. Obviously, it only works on dynamically generated strings, not literals.

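public static class StringExtensions
{
  // Overwrites the characters of a runtime-created string in place.
  // Requires compiling with /unsafe. Do not call this on literals or interned
  // strings: they are shared, so every other reference would be zeroed too.
  public static unsafe void Clear(this string s)
  {
    if (s == null) return;
    fixed (char* ptr = s)
    {
      for (int i = 0; i < s.Length; i++)
      {
        ptr[i] = '\0';
      }
    }
  }
}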
Upu answered 10/2, 2014 at 11:39 Comment(1)
Note that this won't clean up any spurious copies of your string that were made when the garbage collector compacts the heap or when the operating system moves pages of your process's virtual memory around in RAM.Sfumato
N
3

This is down to the garbage collector to handle for you. You can force it to run a clean-up by calling GC.Collect(). From the docs:

Use this method to try to reclaim all memory that is inaccessible.

All objects, regardless of how long they have been in memory, are considered for collection; however, objects that are referenced in managed code are not collected. Use this method to force the system to try to reclaim the maximum amount of available memory.

That's the closest you'll get, methinks!

Neman answered 11/3, 2010 at 6:56 Comment(1)
Might be worth adding that forcing GC is rarely the 'right' thing to do ... and that unless you can explain how GC works and what the LOH is you probably shouldn't be messing with it!Roaring
I
3

I will answer this question from a security perspective.

If you want to destroy a string for security reasons, then it is probably because you don't want anyone snooping on your secret information, and you expect they might scan the memory, or find it in a page file or something if the computer is stolen or otherwise compromised.

The problem is that once a System.String is created in a managed application, there is not really a lot you can do about it. There may be some sneaky way of doing some unsafe reflection and overwriting the bytes, but I can't imagine that such things would be reliable.

The trick is to never put the info in a string at all.

I had this issue one time with a system that I developed for some company laptops. The hard drives were not encrypted, and I knew that if someone took a laptop, then they could easily scan it for sensitive info. I wanted to protect a password from such attacks.

The way I dealt with it is this: I put the password in a byte array by capturing key press events on the textbox control. The textbox never contained anything but asterisks and single characters. The password never existed as a string at any time. I then hashed the byte array and zeroed the original. The hash was then XORed with a random hard-coded key, and this was used to encrypt all the sensitive data.

After everything was encrypted, then the key was zeroed out.

Naturally, some of the data might exist in the page file as plaintext, and it's also possible that the final key could be inspected as well. But nobody was going to steal the password dang it!
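A rough sketch of the hash, zero, and XOR steps described above. The answer does not say which hash was used, so SHA-256 is an assumption here, and the `Mask` bytes and all names are hypothetical placeholders. The encryption step itself is left out.

```csharp
using System;
using System.Security.Cryptography;

public static class KeyDerivationSketch
{
    // Hypothetical hard-coded mask; the original answer does not give its value.
    private static readonly byte[] Mask = { 0x5A, 0xC3, 0x17, 0x9E };

    // Hashes the password bytes, zeroes the caller's buffer,
    // and XORs the hash with the hard-coded mask to form the key.
    public static byte[] DeriveKey(byte[] password)
    {
        byte[] key;
        using (SHA256 sha = SHA256.Create()) // assumption: hash algorithm not named in the answer
        {
            key = sha.ComputeHash(password);
        }

        // The password never existed as a string; now wipe the byte copy as well.
        Array.Clear(password, 0, password.Length);

        for (int i = 0; i < key.Length; i++)
        {
            key[i] ^= Mask[i % Mask.Length];
        }
        return key;
    }

    public static void Main()
    {
        // Hypothetical bytes captured from key press events.
        byte[] password = { (byte)'p', (byte)'w', (byte)'1' };

        byte[] key = DeriveKey(password);
        Console.WriteLine(key.Length);

        // ... encrypt the sensitive data with the key here ...

        Array.Clear(key, 0, key.Length); // zero the key once everything is encrypted
    }
}
```

A plain hash plus XOR mirrors the answer; a dedicated password-based key derivation function would be the more usual choice where stronger protection is needed.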

Insist answered 25/3, 2010 at 18:6 Comment(4)
Nicely done, however what would you do if the information is passed as a string via a Web Service for instance (all encrypted of course)?Borodin
@Kyle Rozendo - If the encryption is transparent to the application (like SSL for example), then there's probably nothing you can do, but if you are doing the encryption yourself (using the System.Security.Cryptography namespace for example) then it's all done as byte arrays anyway, so there's still no need for generating strings. Of course, once you've shown it to the user, then all bets are off.Insist
Your technique is clever, but I think that any keylogger would have easily bypassed it by storing all the keystrokes from the keyboard.Melissamelisse
@MadTigger - Sure, or someone could mount a small camera on the ceiling and record all the keystrokes being pressed, etc., etc. These laptops were not holding national security secrets or anything like that, fortunately. My main concern was data theft after the laptops were stolen, and I think I got 90% of the way there, which was probably good enough in this case.Insist
E
2

There's no deterministic way to clear all traces of a string (System.String) from memory. Your only options are to use a character array or a SecureString object.

Electrotechnics answered 11/3, 2010 at 7:5 Comment(5)
There's no deterministic way to clear all traces of a character array from memory either, is there?Demisec
I think there is, just set all the array element values to 0.Electrotechnics
Oh, and of course there's no deterministic way to clear all traces of a character array, but if you use that character array to represent a string, then there's a deterministic way to clear all traces of that string from memory by using a character array and setting all its elements to 0.Electrotechnics
But I guess you're right, if the array has been moved by the garbage collector at some point, then an old version of it (with non-0 values) might be lying around in memory somewhere.Electrotechnics
You can stop the array from being moved by the garbage collector by using the fixed statement.Linette
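A minimal sketch of the idea in these comments: keep the secret in a char array, stop the collector from moving it, and zero it when done. It swaps the `fixed` statement for a pinned `GCHandle` so that it compiles without /unsafe. The helper name `UsePinnedBuffer` is made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinnedSecret
{
    // Allocates a char buffer and pins it straight away so the GC cannot relocate it
    // (and leave a stale copy behind), runs the caller's work, then zeroes and unpins.
    public static void UsePinnedBuffer(int length, Action<char[]> work)
    {
        char[] buffer = new char[length];
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            work(buffer);
        }
        finally
        {
            Array.Clear(buffer, 0, buffer.Length); // overwrite every element with '\0'
            handle.Free();                         // allow the GC to move or collect it again
        }
    }

    public static void Main()
    {
        UsePinnedBuffer(4, buffer =>
        {
            // Hypothetical secret characters written directly into the pinned buffer.
            buffer[0] = 'a'; buffer[1] = 'b'; buffer[2] = 'c'; buffer[3] = 'd';
            Console.WriteLine(buffer.Length);
        });
    }
}
```

Pinning only covers copies made by heap compaction; it does nothing about pages the operating system has already written to the page file.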
R
1

One of the best ways to limit the lifetime of string objects in memory is to declare them as local variables in the innermost scope possible and not as private member variables on a class.

It's a common mistake for junior developers to declare their strings 'private string ...' on the class itself.

I've also seen well-meaning experienced developers trying to cache some complex string concatenation (a+b+c+d...) in a private member variable so they don't have to keep calculating it. Big mistake: it takes hardly any time to recalculate it, the temporary strings are garbage collected almost immediately at the next Gen0 collection, and the memory swallowed by caching all those strings just takes available memory away from more important items like cached database records or cached page output.

Roaring answered 11/3, 2010 at 7:13 Comment(1)
If the strings are large, and the concatenation is calculated frequently, caching is a good idea. Joining large strings that are thrown away quickly can cause fragmentation of the Large Object Heap, which isn't compacted during collection. Of course, things like StringBuilders are also useful in these situations to reduce impact on the heap.Delphina
I
-5

Set the string variable to null once you don't need it.

string s = "dispose me!";
...
...
s = null;

and then call GC.Collect() to invoke the garbage collector, but the GC CANNOT guarantee the string will be collected immediately.

Iiette answered 11/3, 2010 at 7:3 Comment(4)
That's an unfortunate example: as per Brian's answer, because dispose me! is a literal in the assembly code, it will be interned and never garbage collected. You're right as far as strings constructed at runtime go though.Demisec
-1 to anybody suggesting the use of GC.Collect to "dispose" of strings. But itowlson is right on about the interning.Brittni
Oh, this really surprises me! ThxIiette
Also, even if you GC... that doesn't mean the data isn't still sitting in RAM, if someone freezes your RAM and dumps it...Stool
