Memory usage of concatenating strings using interpolated vs "+" operator

I see the benefit of using interpolated strings, in terms of readability:

string myString = $"Hello { person.FirstName } { person.LastName }!";

over a concatenation done this way:

string myString = "Hello " + person.FirstName + " " + person.LastName + "!";

The author of this video tutorial claims that the first one makes better use of memory.

How come?

Trapezium answered 21/2, 2017 at 14:51 Comment(3)
Unless this is part of a tight loop, do you really think any difference in memory usage here is going to be significant? Choose to write the code that reads cleanest to you and, if you do care about memory usage, profile your code to find the actual hotspots. "Someone on the internet told me" is a terrible way to make performance decisions.Complicate
I did not expect a massive difference in memory usage :) I was trying to understand how the underlying implementations differ; I thought they were strictly equivalent methods!Trapezium
Here are benchmarks. String concatenation is a little faster for a very small number of arguments, but requires more memory. After 20+ arguments concat is better by time and memory.Nudism
M
25

The author doesn't actually say that one makes better use of memory than the other. They say that one method "makes good use of memory" in the abstract, which, by itself, doesn't really mean much of anything.

But regardless of what they said, the two methods aren't going to be meaningfully different in their implementation. Neither is going to be meaningfully different from the other in terms of memory or time.
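
As a rough illustration of why the two forms end up so close, here is a minimal sketch. The Person class and the GreetingDemo name are assumptions made up for this example. For a chain of + on strings in a single expression, the C# compiler emits one string.Concat call; how an interpolated string is lowered depends on the compiler version (older compilers emit string.Format, newer ones may use other strategies), so treat the comments as a general picture rather than the exact output of any particular compiler.

```csharp
using System;

// Hypothetical type standing in for the `person` object from the question.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class GreetingDemo
{
    // The + form from the question: a fixed number of operands in one expression.
    public static string WithPlus(Person person) =>
        "Hello " + person.FirstName + " " + person.LastName + "!";

    // The interpolated form from the question.
    public static string WithInterpolation(Person person) =>
        $"Hello {person.FirstName} {person.LastName}!";

    // Roughly what the + chain becomes: a single Concat call over all five
    // pieces, so no intermediate strings are built along the way.
    public static string WithConcat(Person person) =>
        string.Concat(new[] { "Hello ", person.FirstName, " ", person.LastName, "!" });

    public static void Main()
    {
        var person = new Person { FirstName = "Ada", LastName = "Lovelace" };
        Console.WriteLine(WithPlus(person));
        Console.WriteLine(WithInterpolation(person));
        Console.WriteLine(WithConcat(person));
    }
}
```

All three methods return the same text; the difference is only in how the call is spelled in source.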

Mcqueen answered 21/2, 2017 at 14:58 Comment(18)
Actually concatenation uses a lot more memory compared to string.format, or using a stringbuilder as it makes an allocation for each concatenated element. These will stay in memory for a while longer too.Stirpiculture
@Stirpiculture No, the compiler is going to compile the entire statement into a single method call to string.concat, passing in all of the strings being concatenated as parameters, and thus requiring no intermediate strings. You could separate the concatenations into statements, or incorporate operations other than the + operator to prevent that compiler optimization, but neither of those things happen in the code being asked about so there will be no intermediate strings. Technically string.format and a string builder would both have additional overhead that this doesn't need, but not by much.Mcqueen
Yes it does that and strings are immutable, so every one of the non-parameter strings in that line is added to the heap. See my answer below.Stirpiculture
@Stirpiculture The code is open source. You can just look at the code yourself and see that it's only allocating a single new string. That there is some code that you didn't show us that you tested using code you also didn't show us, that provided results you've interpreted based on indirect evidence to mean it must be creating additional strings just doesn't really mean much of anything to anyone. As I've said, there are lots of ways to write this code incorrectly such that it does create intermediate strings, but the code in the question doesn't.Mcqueen
@Stirpiculture All of that said, your numbers (despite lacking far too much information to draw many meaningful conclusions from) do show one thing: the differences between those methods are all super tiny, so my assertion that there's not going to be a meaningful difference in a situation like this holds.Mcqueen
@Stirpiculture Your conclusions given your data are also suspect. Creating intermediate strings doesn't make things faster at the cost of more memory, it makes things substantially slower. Creating all of the intermediate strings when concatenating a lot of strings means constantly copying all of the characters from one array to another, over and over again, which takes time. An operation that is doing that is going to be notably slower than one that doesn't (at least if the strings aren't so small the difference is drowned out by noise).Mcqueen
The differences add up when you are iterating over an import loop 200,000 times, but then using smaller strings and putting them into an allocation-free state by using Span<char> would be better in that instance, as you free up both the allocation ns and the memory allocation itself. Also, you miss the point of my argument: the second part with the multiline additions was for my own interest; the first set of vars is literally the const strings added together to result in the same output string as the interpolated one. Yet it takes >170B less data in RAM (~50%).Stirpiculture
This saves 65MB of RAM over an import of approximately 1 second. Now imagine you are importing DB data, and you may have 30+ separate strings or other types of data on each record and 1m+ records. The string system where you use +""+"" may expand out to ~2.5KB per record, and as each one is allocated it will not be garbage collected for a while. When I have a moment I might modify my benchmarking program to do just that. I'll post the results back here.Stirpiculture
@Stirpiculture Again, you've not provided the code that you're testing nor shown your methodology, so your data by itself doesn't actually mean much, but even if we only look at the data, its results are tiny and don't support your conclusions. You're assuming minuscule values will scale up proportionally and not accounting for noise in the test data, neither of which is merited. The code here is doing basically the exact same thing with only trivial differences. Testing code that is this fast and has such minute differences is hard; it's really easy to think there's a difference when there isn't.Mcqueen
Hey man, I have seen this cascade to 4gb previously in about 3 seconds when using these string functions in large datasets, (essentially you end up with a memory leak because the GC cannot keep up), on using the other methods (in my case using a single stringbuilder defined outside the loop) the total app memory stayed <10MB. when I have time I'll do a more comprehensive test and show the results and testing methodology, also the C# developers have previously mentioned the immutable issue adding allocations when using += & +"" at various conferences.Stirpiculture
@Stirpiculture That can happen if you are concatenating a dynamic number of strings, generally because you're concatenating strings in a loop to a variable outside of the loop. And yes, in that situation a stringbuilder would be needed. But that's not the situation asked about in this question. The question is comparing string interpolation to the + operator, which won't behave differently in that situation.Mcqueen
Or if you create the string inside the loop (which is what had to be done). My point is that for each of the non const strings after + in a line I found memory usage going up, when the conference peeps explained about the immutable issues with string operators (where they were being allocated to memory and awaiting garbage collection), the interpolation works better because IME it's only doing the single allocation to the end result string. I'll do some more thorough tests and put everything up. Apparently a lot of string issues have been fixed in C#11 so may have to do multiple versions.Stirpiculture
@Stirpiculture Concatenating a dynamic number of strings in a loop is completely different than what the question is asking about. It has nothing to do with this question. Concatenating a fixed number of strings known at compile time, as shown in the question, does not result in intermediate strings being created. String interpolation doesn't create intermediate strings, but neither does the + operator. It's just a slightly different syntax for what boils down to basically identical code.Mcqueen
A loop is not specified as being used or not being used. I could see the example provided in the OP being used inside a loop to output data or to pull data into a template etc.Stirpiculture
@Stirpiculture The question is very specific about the code it's asking about. You saying that you think the OP is actually asking about something completely different for no reason makes no sense. If they want to know about a completely different situation then that is a completely different question and would need to be asked separately. If there is a completely different question about radically different code, then yes, it could change the answer. But this question isn't asking about code in a loop concatenating strings to each other. It's asking about concatenating exactly 5 strings.Mcqueen
No I don't I said I didn't assume he would use it in or out of a loop, I said I could see it be used in a loop (v big diff in language). I have updated my answer below, and posted the code as well. You are correct there was a bug in my string concat function that created multiple assigns, so interp and inline concat use the same amount of memory. However you will see that format uses less, implying that both interp and concat are doing more assigns than format. This follows through with the multiline and appending tests, at a later date I will add in looped tests to see how GC reacts.Stirpiculture
@Stirpiculture None of your tests are of code that is comparable to the actual code in the question, the code that is being asked about. As such, the tests are, at best, not applicable (although they're also poorly written tests, which results in your conclusions also not being applicable to the situations you think you're attempting to apply them to, even though those situations aren't what the OP is doing). Benchmarking micro-optimizations is really hard. It's extremely difficult to write code that measures what you think you're measuring when your code does so little, and you fell into traps.Mcqueen
@Stirpiculture Your different tests aren't even computing the same strings. Your string format is computing a much smaller string to begin with, so it's unsurprising that it would need less memory to do that. Additionally, as the computed string is never used, and is unused in such a way that the compiler can prove it's unused, the compiler is allowed to omit the operation entirely in many of your tests.Mcqueen
W
23

I made a simple test, see below. If you concatenate constants, don't use "string.Concat": it is an ordinary method call, so the compiler can't concatenate your strings at compile time the way it does for + on constants. If you use variables, the results are effectively the same.

time measure results:

const string interpolation : 4
const string concatenation : 58
const string addition      : 3
var string interpolation   : 53
var string concatenation   : 55
var string addition        : 55
mixed string interpolation : 47
mixed string concatenation : 53
mixed string addition      : 42

the code:

// LINQPad-style script. Outside LINQPad it needs:
// using System; using System.Diagnostics;
void Main()
{
    const int repetitions = 1000000;
    const string part1 = "Part 1";
    const string part2 = "Part 2";
    const string part3 = "Part 3";
    // Runtime values, so the compiler cannot fold them into constants.
    var vPart1 = GetPart(1);
    var vPart2 = GetPart(2);
    var vPart3 = GetPart(3);

    Test("const string interpolation ", () => $"{part1}{part2}{part3}");
    Test("const string concatenation ", () => string.Concat(part1, part2, part3));
    Test("const string addition      ", () => part1 + part2 + part3);
    Test("var string interpolation   ", () => $"{vPart1}{vPart2}{vPart3}");
    Test("var string concatenation   ", () => string.Concat(vPart1, vPart2, vPart3));
    Test("var string addition        ", () => vPart1 + vPart2 + vPart3);
    Test("mixed string interpolation ", () => $"{vPart1}{part2}{part3}");
    Test("mixed string concatenation ", () => string.Concat(vPart1, part2, part3));
    Test("mixed string addition      ", () => vPart1 + part2 + part3);

    // Runs the action `repetitions` times and traces the elapsed milliseconds.
    void Test(string info, Func<string> action)
    {
        var watch = Stopwatch.StartNew();
        for (var i = 0; i < repetitions; i++)
        {
            action();
        }
        watch.Stop();
        Trace.WriteLine($"{info}: {watch.ElapsedMilliseconds}");
    }

    string GetPart(int index)
        => $"Part{index}";
}
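
The point about constants can be sketched separately. This is a minimal example under assumed names (ConstFoldingDemo, Joined, ViaConcat are made up here): + between constant strings is itself a constant expression, so it can initialise a const, whereas string.Concat is an ordinary method call that only runs at run time.

```csharp
using System;

public static class ConstFoldingDemo
{
    const string Part1 = "Part 1";
    const string Part2 = "Part 2";

    // Constant expression: the compiler stores the joined literal directly.
    public const string Joined = Part1 + Part2;

    // A method call is not a constant expression, so this line would not compile:
    // const string Bad = string.Concat(Part1, Part2);

    // The same text produced by a run-time call instead.
    public static readonly string ViaConcat = string.Concat(Part1, Part2);

    public static void Main()
    {
        Console.WriteLine(Joined);              // prints "Part 1Part 2"
        Console.WriteLine(Joined == ViaConcat); // prints "True" (same content)
    }
}
```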
Wirra answered 28/3, 2019 at 9:3 Comment(2)
Wasn't the question about the memory consumption and not timings?Empiric
But we all love timings :) Don't we?!Legion
B
11

Strings are immutable. That means they can't be changed.

When you concatenate strings with a + sign, you end up creating multiple strings to get to the final string.

When you use the interpolation method (or StringBuilder), the .NET runtime optimizes your string use, so it (in theory) has better memory usage.

All that being said, it often depends on WHAT you are doing, and HOW OFTEN you are doing it.

One set of concatenations doesn't offer a lot of performance/memory improvements.

Doing those concatenations in a loop can have a lot of improvement.
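
The loop case is where intermediate strings really do appear, and it can be sketched like this. The class and method names are assumptions for illustration only. With += inside a loop, every iteration builds a new, longer string and copies the previous contents into it; a StringBuilder appends into a growable buffer and creates the final string once.

```csharp
using System;
using System.Text;

public static class LoopConcatDemo
{
    // Each += allocates a new string holding everything accumulated so far.
    public static string WithPlusEquals(string[] parts)
    {
        string result = "";
        foreach (string part in parts)
        {
            result += part;
        }
        return result;
    }

    // Appends go into the builder's buffer; one string is created by ToString().
    public static string WithBuilder(string[] parts)
    {
        var builder = new StringBuilder();
        foreach (string part in parts)
        {
            builder.Append(part);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        string[] parts = { "a", "b", "c" };
        Console.WriteLine(WithPlusEquals(parts)); // prints "abc"
        Console.WriteLine(WithBuilder(parts));    // prints "abc"
    }
}
```

For a fixed handful of operands in one expression, as in the question, neither approach is needed.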

Babbler answered 21/2, 2017 at 15:24 Comment(1)
No intermediate strings are going to be created in either of the two examples shown. Both are able to produce a single final string because there are a fixed number of strings at compile time, so the concatenation will be able to concatenate them all together. This is true of both methods shown here, not just when using string interpolation. Using a StringBuilder here would be both slower, and use more memory than either other option presented.Mcqueen
S
4

I created a memory test program. I had a bug in one of the benchmarks earlier on, so I have fixed that and posted the source below the results. Note that this is using C# 7; if you use .NET Core you will be using a different version of C#, and these results will change.

Further to the immutable arguments above, the allocation is at the point of assignment. So var output = "something"+"something else"+" "+"something other" contains 2 assignments: the variable assignment on the left and the final string on the right (as it is optimised this way by the compiler when a fixed number of vars is used).

As shown below, these assignments happen every time you use this method (string.Format and StringBuilder differ here: Format uses less memory, and the builder has extra overhead initially).

Simple

So if you are simply adding vars into a single string, yes, interp and inline concat use the same amount of RAM. string.Format uses the least RAM, though, so there are evidently some extra allocations occurring with concat and interp that string.Format avoids.

Using the 1 var and assigning to it multiple times

Interestingly, in the multiline assigns (where you assign the same value to the var multiple times), the StringBuilder is the most memory-efficient even with 3 Clear and AppendFormat calls added, and it is still faster in CPU time than Format. Interp and concat are easiest on the CPU; however, their memory is nearing 1MB.

Appending to the var

When constructing a string over successive lines (appending separately in the BuiltByLine tests, as you may for error code messages), String.Format slips behind the others when using += to append to the output var. StringBuilder in this instance is the clear winner.

Multiple runs of the functions

Here we can see the difference in a very simple 20x run of the line concat, such as could be found in a function if you wanted to track progress or the part of the function you are attempting to do. The difference between using a builder vs a string is nearly 25%. If you had even a small number of strings assigned inside a loop over lots of records, then the potential memory impact of using interp/+= could be quite high. For example, if I were importing a relatively small file of records into a database and was using strings, the number of records could easily exceed 50000 in a very short period of time (let alone the 4gb compressed files I used to have to import), which means the system as a whole could easily crash or end up very slow as it is forced to run the GC repeatedly within a very short period of time. In those cases I would use a StringBuilder ref and simply refresh and re-assign that, OR use a Span.

Results

Source code

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Diagnostics.Windows.Configs;
using BenchmarkDotNet.Running;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApp1
{
    [AsciiDocExporter]
    [MemoryDiagnoser]
    public class Program
    {
        private string str1 = "test string";
        private string str2 = "this is another string";
        private string str3 = "helo string 3";
        private string str4 = "a fourth string";
                
        [Benchmark]
        public void TestStringConcatStringsConst()
        {
            var output = str1 + " " + str2 + " " + str3 + " " + str4;
        }


        [Benchmark]
        public void TestStringInterp()
        {
            var output = $"{str1} {str2} {str3} {str4}";
        }

        [Benchmark]
        public void TestStringFormat()
        {            
            var output = String.Format("{0} {1} {2} {3}", str1, str2, str3, str4);
        }

        [Benchmark]
        public void TestStringBuilder()
        {
            var output = new StringBuilder().AppendFormat("{0} {1} {2} {3}", str1, str2, str3, str4);
        }

        [Benchmark]
        public void TestStringConcatStrings_FourMultiLineAssigns()
        {
            var output = str1 + " " + str2 + " " + str3 + " " + str4;
            output = str1 + " " + str2 + " " + str3 + " " + str4;
            output = str1 + " " + str2 + " " + str3 + " " + str4;
            output = str1 + " " + str2 + " " + str3 + " " + str4;
        }

        [Benchmark]
        public void TestStringInterp_FourMultiLineAssigns()
        {
            var output = $"{str1} {str2} {str3} {str4}";
            output = $"{str1} {str2} {str3} {str4}";
            output = $"{str1} {str2} {str3} {str4}";
            output = $"{str1} {str2} {str3} {str4}";
        }

        [Benchmark]
        public void TestStringFormat_FourMultiLineAssigns()
        {
            var output = String.Format("{0} {1} {2} {3}", str1, str2, str3, str4);
            output = String.Format("{0} {1} {2} {3}", str1, str2, str3, str4);
            output = String.Format("{0} {1} {2} {3}", str1, str2, str3, str4);
            output = String.Format("{0} {1} {2} {3}", str1, str2, str3, str4);
        }

        [Benchmark]
        //This also clears and re-assigns the data, I used the stringbuilder until the last line as if you are doing multiple assigns with stringbuilder you do not pull out a string until you need it.
        public void TestStringBuilder_FourMultilineAssigns()
        {
            var output = new StringBuilder().AppendFormat("{0} {1} {2} {3}", str1, str2, str3, str4);
            output = output.Clear().AppendFormat("{0} {1} {2} {3}", str1, str2, str3, str4);
            output = output.Clear().AppendFormat("{0} {1} {2} {3}", str1, str2, str3, str4);
            output = output.Clear().AppendFormat("{0} {1} {2} {3}", str1, str2, str3, str4);
        }

        [Benchmark]
        public void TestStringConcat_BuiltByLine()
        {
            var output = str1;
            output += " " + str2;
            output += " " + str3;
            output += " " + str4;
        }

        [Benchmark]
        public void TestStringInterp_BuiltByLine1()
        {
            var output = str1;
            output = $"{output} {str2}";
            output = $"{output} {str3}";
            output = $"{output} {str4}";
        }

        [Benchmark]
        public void TestStringInterp_BuiltByLine2()
        {
            var output = str1;
            output += $" {str2}";
            output += $" {str3}";
            output += $" {str4}";
        }

        [Benchmark]
        public void TestStringFormat_BuiltByLine1()
        {
            var output = str1;
            output = String.Format("{0} {1}", output, str2);
            output = String.Format("{0} {1}", output, str3);
            output = String.Format("{0} {1}", output, str4);
        }

        [Benchmark]
        public void TestStringFormat_BuiltByLine2()
        {
            var output = str1;
            output += String.Format(" {0}", str2);
            output += String.Format(" {0}", str3);
            output += String.Format(" {0}", str4);
        }

        [Benchmark]
        public void TestStringBuilder_BuiltByLine()
        {
            var output = new StringBuilder(str1);
            output.AppendFormat(" {0}", str2);
            output.AppendFormat(" {0}", str3);
            output.AppendFormat(" {0}", str4);
        }

        [Benchmark]
        public void TestConcatLine20x()
        {
            for (int i = 0; i < 20; i++) {
                TestStringConcat_BuiltByLine();
            }
        }

        [Benchmark]
        public void TestInterpLine20x()
        {
            for (int i = 0; i < 20; i++)
            {
                TestStringInterp_BuiltByLine2();
            }
        }

        [Benchmark]
        public void TestBuilderLine20x()
        {
            for (int i = 0; i < 20; i++)
            {
                TestStringBuilder_BuiltByLine();
            }
        }

        static void Main(string[] args)
        {
            var summary = BenchmarkRunner.Run<Program>(null, args);
            //var summary = BenchmarkRunner.Run())
        }
    }
}
Stirpiculture answered 21/4, 2022 at 23:58 Comment(0)
R
2

Because strings in C# are immutable, the same memory is used again and again, so it does not impact memory much. In terms of performance, though, you are really comparing String.Format and String.Concat, because at compile time your code will look like this:

  string a = "abc";
  string b = "def";

  string.Format("Hello {0} {1}!", a, b);

  string.Concat(new string[] { "Hello ", a, " ", b, "!" });

There is a whole thread about the performance of these two, if you are interested: String output: format or concat in C#
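
Since the question is about memory rather than time, one rough way to compare the two calls is to count the bytes allocated on the current thread around each of them. This is only a sketch: AllocationProbe and Measure are hypothetical names, GC.GetAllocatedBytesForCurrentThread is only available on recent runtimes (not on older .NET Framework versions), and the figures will vary with runtime and compiler version, so a proper benchmark harness is the better tool for anything beyond a quick look.

```csharp
using System;

public static class AllocationProbe
{
    // Returns the bytes allocated on this thread while running `action`
    // `iterations` times. One warm-up call keeps first-call costs out of the count.
    public static long Measure(Func<string> action, int iterations)
    {
        action();
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            // KeepAlive uses the result so the call is not treated as dead code.
            GC.KeepAlive(action());
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        // Built at run time so the compiler cannot fold them into constants.
        string a = new string('a', 3);
        string b = new string('b', 3);

        long format = Measure(() => string.Format("Hello {0} {1}!", a, b), 100_000);
        long concat = Measure(() => string.Concat(new string[] { "Hello ", a, " ", b, "!" }), 100_000);

        Console.WriteLine($"string.Format: {format} bytes");
        Console.WriteLine($"string.Concat: {concat} bytes");
    }
}
```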

Recreate answered 21/2, 2017 at 15:37 Comment(0)
