Programmatically determine if std::string uses Copy-On-Write (COW) mechanism

Following up on the discussion from this question, I was wondering how one can programmatically determine, using native C++, whether the std::string implementation in use employs Copy-On-Write (COW).

I have the following function:

#include <iostream>
#include <string>

bool stdstring_supports_cow()
{
   //make sure the string is longer than the size of potential
   //implementation of small-string.
   std::string s1 = "012345678901234567890123456789"
                    "012345678901234567890123456789"
                    "012345678901234567890123456789"
                    "012345678901234567890123456789"
                    "012345678901234567890123456789";
   std::string s2 = s1;
   std::string s3 = s2;

   bool result1 = (&s1[0]) == (&s2[0]);
   bool result2 = (&s1[0]) == (&s3[0]);

   s2[0] = 'X';

   bool result3 = (&s1[0]) != (&s2[0]);
   bool result4 = (&s1[0]) == (&s3[0]);

   s3[0] = 'X';

   bool result5 = (&s1[0]) != (&s3[0]);

   return result1 && result2 &&
          result3 && result4 &&
          result5;
}

int main()
{
   if (stdstring_supports_cow())
      std::cout << "std::string is COW." << std::endl;
   else
      std::cout << "std::string is NOT COW." << std::endl;
   return 0;
}

The problem is that I can't seem to find a C++ toolchain where it returns true. Is there a flaw in my assumption about how COW is implemented for std::string?

Update: Based on kotlinski's comments, I've replaced the writable references with calls to data() in the function; it now seems to return "true" for some implementations.

bool stdstring_supports_cow()
{
   //make sure the string is longer than the size of potential
   //implementation of small-string.
   std::string s1 = "012345678901234567890123456789"
                    "012345678901234567890123456789"
                    "012345678901234567890123456789"
                    "012345678901234567890123456789"
                    "012345678901234567890123456789";
   std::string s2 = s1;
   std::string s3 = s2;

   bool result1 = s1.data() == s2.data();
   bool result2 = s1.data() == s3.data();

   s2[0] = 'X';

   bool result3 = s1.data() != s2.data();
   bool result4 = s1.data() == s3.data();

   s3[0] = 'X';

   bool result5 = s1.data() != s3.data();

   return result1 && result2 &&
          result3 && result4 &&
          result5;
}

Note: According to N2668, "Concurrency Modifications to Basic String", the COW option will be removed from basic_string in the upcoming C++0x standard. Thanks to James and Beldaz for bringing that up.

Territorial answered 21/12, 2010 at 3:59 Comment(2)
If I recall correctly, C++0x will forbid std::string from being COW. - Especially
@Zenicoder: justsoftwaresolutions.co.uk/cplusplus/… mentions this, referring to open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2668.htm - Ivers

Using &s1[0] to take the address is not what you want: the non-const operator[] returns a writable reference, so a COW implementation has to create a copy at that point.

Use data() instead; it returns a const char*, and your tests may then pass.

Tamaratamarack answered 21/12, 2010 at 4:8 Comment(2)
+1 I made the changes you've suggested and so far it looks good: codepad.org/zDJNcqUx - Territorial
Suggest you use data(), not c_str(). c_str() must return a null-terminated string, so std::string might need to create a new buffer for that call. (Unlikely, but allowed.) - Tamaratamarack

The copy-on-write paradigm depends on knowing when a write happens. The implementation has to assume one whenever the object hands out a writable reference.

If you work with const references to the strings, you may be able to compare the addresses, provided the implementation does not make a copy when returning a const reference or pointer to the data.

Onceover answered 21/12, 2010 at 4:14 Comment(1)
@Zenikoder, perhaps I wasn't clear when I suggested using const references to the strings? Anything returned by them would be unwritable and shouldn't trigger a copy. - Onceover
