How to convert an instance of std::string to lower case

Asked 24/11, 2008 at 11:49 Answered 4/4, 2023 at 19:44

1006

I want to convert a std::string to lowercase. I am aware of the function tolower(). However, in the past I have had issues with this function and it is hardly ideal anyway as using it with a std::string would require iterating over each character.

Is there an alternative which works 100% of the time?

Posology answered 24/11, 2008 at 11:49 Comment(14)

How else would you convert each element of a list of anything to something else, without iterating through the list? A string is just a list of characters; if you need to apply some function to each character, you're going to have to iterate through the string. No way around that. – Banff 24/11, 2008 at 12:14

Why exactly does this question merit a down rating? I don't have a problem with iterating through my string, but I am asking if there are other functions apart from tolower(), toupper() etc. – Posology 24/11, 2008 at 12:24

If you have a C style char array, then I guess you may be able to add 0x20202020 to each block of 4 characters (provided they are ALL already uppercase) to convert 4 characters to lowercase at a time. – Banff 24/11, 2008 at 13:5

@Dan: If they might already be lowercase, but are definitely A-Z or a-z, you can OR with 0x20 instead of adding. One of those so-smart-it's-probably-dumb optimisations that are almost never worth it... – Crystallization 24/11, 2008 at 13:11

I don't know why it would've been down-voted... certainly it's worded a little oddly (because you do have to iterate through every item somehow), but it's a valid question – Wiltz 24/11, 2008 at 13:19

Note: tolower() doesn't work 100% of the time. Lowercase/uppercase operations only apply to characters, and std::string is essentially an array of bytes, not characters. Plain tolower is nice for ASCII string, but it will not lowercase a latin-1 or utf-8 string correctly. You must know string's encoding and probably decode it before you can lowercase its characters. – Landbert 24/11, 2008 at 14:42

When I type questions I just tend to dump what is in my mental buffer at the time. It doesn't always make sense. ;) – Posology 24/11, 2008 at 17:40

@onebyone: Ah, never thought of that! Well, I never really meant this was a useful way of doing it, just that it's possible. Actually, I'd be more interested in trying something like that on large texts on a GPU, just for a laugh. – Banff 26/11, 2008 at 12:41

This is a good question. Most scripting languages handle it just the way you would expect it to be handled. – Tedda 1/11, 2009 at 22:11

Note that the answer you selected potentially has undefined behaviour. Despite all the up-votes, it is unsafe. – Chu 29/5, 2014 at 18:5

I think what is meant by "iterating over each character" is "explicitly iterating over each character", such as to reduce code bloat, or verbose code. – Uprear 28/1, 2015 at 17:18

After reading through all these answers and back-and-forth comments, I'm not so certain that this is something you'd want to directly deal with inside your program. You may want to use a standalone module that takes strings and encoding/locale arguments and gives only a good result if it can be verifiably converted, which seems to require using the ICU library for maximum robustness. Alternatively, you can always play it even safer and remove the requirement for using case-checks as verification unless the app's entire point is getting those letters to lower-case. – Athanor 3/5, 2017 at 22:57

DevSolar gives an excellent answer which contains a very good example of why this can't be solved as a pure software exercise. He seems to agree as well as disagree with me on this and apparently won't include that you must be aware of cultural changes for any solution to work. It cannot be solved perfectly for all time in all cases. – Opec 7/11, 2017 at 13:28

I would not expect in an object-oriented language to be forced to dig into the object to manipulate its inner elements. When I call std::string.clear() I don't have to cycle through inner elements and clear one of them at a time. – Muttonhead 25/6, 2021 at 13:38

1121

Adapted from Not So Frequently Asked Questions:

#include <algorithm>
#include <cctype>
#include <string>

std::string data = "Abc";
std::transform(data.begin(), data.end(), data.begin(),
    [](unsigned char c){ return std::tolower(c); });

You're really not going to get away without iterating through each character. There's no way to know whether the character is lowercase or uppercase otherwise.

If you really hate tolower(), here's a specialized ASCII-only alternative that I don't recommend you use:

char asciitolower(char in) {
    if (in <= 'Z' && in >= 'A')
        return in - ('Z' - 'z');
    return in;
}

std::transform(data.begin(), data.end(), data.begin(), asciitolower);

Be aware that tolower() can only do a per-single-byte-character substitution, which is ill-fitting for many scripts, especially if using a multi-byte-encoding like UTF-8.
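To make that last caveat concrete, here is a minimal sketch (the helper name ascii_to_lower is made up for illustration, and it assumes an ASCII-compatible encoding). It lowercases only the bytes 'A' to 'Z'; every byte of a multi-byte UTF-8 sequence passes through unchanged, so an upper-case 'É' is not converted.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: lowercases ASCII letters only and assumes an
// ASCII-compatible encoding. All other bytes are copied unchanged.
std::string ascii_to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>((c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c);
    });
    return s;
}
```

Called on the UTF-8 bytes of "ÉCOLE", this returns the bytes of "École": the two bytes that encode É are left alone, because no single-byte substitution can know they form one character.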

Frottage answered 24/11, 2008 at 11:59 Comment(41)

That is amazing, I've always wondered what the best way to do it was. I had no idea to use std::transform. :) – Shiverick 24/11, 2008 at 13:40

uberjumper: There's actually a whole lot of overhead associated with the STL calls, especially for small"ish" strings. Solutions using a for loop and tolower are probably much faster. – Frottage 25/11, 2008 at 0:54

(Old it may be, the algorithms in question have changed little) @Stefan Mai: What kind of "whole lot of overhead" is there in calling STL algorithms? The functions are rather lean (i.e. simple for loops) and often inlined as you rarely have many calls to the same function with the same template parameters in the same compile unit. – Gosh 11/11, 2011 at 22:14

@eq Fair point, my benchmarks agree with you when compiling with -O3 (though the STL actually outperforms the more hand-tuned code so I'm wondering whether the compiler is pulling some tricks). Debugging STL code is still a bear though ;). – Frottage 11/11, 2011 at 23:0

This non-portable solution could be faster. You can avoid the branch this way: inChar |= 0x20. I think it is the fastest way to convert ASCII upper to lower. If you want to convert lower to upper, then: inChar &= ~0x20. – Xylem 31/1, 2014 at 11:6

@MichalW This works if you have only letters, which isn't always the case. If you're in that realm, you can probably do even better by using bitmasks on longs -- take on 8 characters at a time ;) – Frottage 1/2, 2014 at 7:20

Every time you assume characters are ASCII, God kills a kitten. :( – Victualage 10/2, 2014 at 20:49

Your first example potentially has undefined behaviour (passing char to ::tolower(int).) You need to ensure you don't pass a negative value. – Chu 29/5, 2014 at 17:30

While this would be the canonical way to do this in a sane world, it has too many problems to recommend it. First, tolower from ctype.h doesn't work with unicode. Secondly, locale.h, which is included by many of the other std library headers, defines a conflicting tolower that causes headaches, see https://mcmap.net/q/54349/-why-can-39-t-quot-transform-s-begin-s-end-s-begin-tolower-quot-be-complied-successfully/339595. It is best to use std::locale or boost::locale::to_lower as other answers suggest. – Hindsight 1/7, 2014 at 17:14

::towlower if you're being international/using wide chars – Moneywort 15/4, 2016 at 0:2

@MichalW Hey, can you explain what you wrote there? Also, why do we use :: in ::tolower ? – Quartered 15/4, 2016 at 13:40

@StefanMai Hi. Why is the "::" needed before "tolower"? I don't understand that. – Db 16/5, 2016 at 1:13

Note that this works for Unicode if you're using a std::u32string and your C locale is compatible with Unicode. – Figured 19/6, 2016 at 9:13

The :: is needed before tolower to indicate that it is in the outermost namespace. If you use this code in another namespace, there may be a different (possibly unrelated) definition of tolower which would end up being preferentially selected without the ::. – Circulation 30/7, 2016 at 16:43

std::transform(data.begin(), data.end(), data.begin(), easytolower); is dangerous, since the behavior of std::tolower is undefined if the input is not representable as unsigned char and is not equal to EOF. – Carin 9/8, 2017 at 5:52

@BrianGordon - But its much easier, and there really are way too many cats in the world already. – Giraffe 15/11, 2017 at 13:39

@BrianGordon That is blatantly false, as proven by the fact that there are still kittens in the world! =) – Hypothesis 12/12, 2017 at 21:40

What makes the 2nd solution non-portable? Can I just do this? pastebin.com/MPRMpQJS – Pishogue 24/3, 2018 at 23:12

@BrianGordon there are also cases when you know that the input is ASCII (e.g. the wire format of domain names). – Palila 17/5, 2018 at 13:54

@Palila I didn't know that. How does DNS handle international domain names which can be in unicode? – Victualage 24/5, 2018 at 4:57

@BrianGordon applications have to convert them into an all-ASCII encoding called "Punycode" (RFC 3492) – Palila 24/5, 2018 at 7:41

@TypicalHog: Because there is no guarantee that 'A' to 'Z' is a continuous range (EBCDIC); but more importantly because there are letters outside that range ('Ü', 'á', ...). It's very, very sad that the authors prefer to harvest more upvotes for answers with non-portable solutions instead of properly pointing out their shortcomings... – Pearlinepearlman 2/10, 2018 at 23:8

@DevSolar: easytolower seems a perfectly valid solution for latin ASCII symbols to me. Going to use it for normalizing HTML tag names. – Dantedanton 4/10, 2018 at 7:52

@Cheersandhth.-Alf c99 doesn't mention that it's UB: it either returns lower char, or unmodified. std::tolower, however, mentions ub – Googly 21/1, 2019 at 22:44

@L.F. I fixed your fix. – Towill 6/7, 2019 at 0:25

@Towill To be honest, I have always had trouble understanding why the char has to be converted to unsigned char first. Isn't the value of a (signed) char supposed to be nonnegative, anyway? What is the point of tolowering a negative char? I guess I am missing the point, so would you mind explaining it to me a little bit please :) – Cupellation 6/7, 2019 at 0:32

@L.F. No, char can be analogous to signed char, and a signed char can be negative. tolower only accepts unsigned char and -1. Anything outside its domain is UB, and you don't want to conflate with -1 either. While all members of the basic execution character set are non-negative, that does not necessarily hold for the (complete) execution character set. See the current draft. – Towill 6/7, 2019 at 0:40

@Towill Thank you! I didn't know a char can validly be negative. But then, doesn't converting to unsigned char just change the value? – Cupellation 6/7, 2019 at 0:41

@L.F. char -> unsigned char (value-preserving, modulo 2**CHAR_BIT) -> implicit to int (value-preserving). Of course, if sizeof(int) == 1, things pretty much fall apart. – Towill 6/7, 2019 at 0:44

@Towill OK ... I think I missed that ... Then the int is converted to char, I think, so the resulting value is implementation-defined before C++20 and guaranteed to be the original value since C++20? – Cupellation 6/7, 2019 at 0:47

@L.F. Converting the result from tolower() (int) back to char is also an interesting story, yes. – Towill 6/7, 2019 at 0:51

I don't understand why the tolower here is wrapped in a lambda rather than just passing it to transform on its own. – Chyou 17/10, 2019 at 19:39

@Chyou 1) to make sure that the character is first converted to unsigned char (see Deduplicator's comments above); 2) to enable overload resolution to select the int tolower( int ch ); overload defined in <cctype> instead of the template< class charT > charT tolower( charT ch, const locale& loc ); overload defined in <clocale>. – Cupellation 21/2, 2020 at 2:30

happily coding in Java and the time comes to switch over to a CPP module... comes along a simple string case issue Me: "I'll just look up the std::string toLower() or whatever the standard has for normalizing text case... Hmm, I wonder how they handle all the encoding and localization complexities a 'simple' task like that could entail when std::string is just raw text data?" finds this question... sad requiring that ingest data follows a case convention noises – Eachern 26/5, 2020 at 20:46

I don't think you need to wrap std::tolower in a lambda. – Parasitism 5/2, 2022 at 17:12

@ccj yeah, the distinct lack of "normal" library functions when I started doing C++ was quite disturbing – Oudh 30/8, 2022 at 4:14

@Cheersandhth.-Alf what is "UB" in "...it's UB for non-ASCII input."? – Alkylation 6/4, 2023 at 21:5

@Milan: The answer has been edited in July 2019 to remove the original problem, by replacing char with unsigned char. For that original problem, cppreference notes about std::tolower: ❝If the value of ch is not representable as unsigned char and does not equal EOF, the behavior is undefined❞. And since most all C++ compilers have char as a signed type by default, any non-ASCII character is in practice encoded with one or more negative char values, which if used directly as argument to std::tolower will encounter the quoted UB. Conversion to unsigned char avoids that problem. – Aileenailene 8/4, 2023 at 19:17

@Cheersandhth.-Alf Thanks for your response. Out of curiosity, what is the full form of 'UB'? – Alkylation 11/4, 2023 at 13:37

@Milan: Undefined Behavior. eel.is/c++draft/intro.defs#defns.undefined en.cppreference.com/w/cpp/language/ub – Aileenailene 12/4, 2023 at 18:19

Visual studio 2019 refused to compile this because of an int to char conversion (warning treated as error). I had to use: std::transform(data.begin(), data.end(), data.begin(), [](const char c) {return static_cast<char>(std::tolower(c)); }); to solve the problem. – Choroiditis 16/2 at 8:39

373

Boost provides a string algorithm for this:

#include <boost/algorithm/string.hpp>

std::string str = "HELLO, WORLD!";
boost::algorithm::to_lower(str); // modifies str

Or, for non-in-place:

#include <boost/algorithm/string.hpp>

const std::string str = "HELLO, WORLD!";
const std::string lower_str = boost::algorithm::to_lower_copy(str);

Neptunian answered 24/11, 2008 at 11:57 Comment(8)

Fails for non-ASCII-7. – Pearlinepearlman 27/2, 2015 at 9:28

This is pretty slow, see this benchmark: godbolt.org/z/neM5jsva1 – Gladiatorial 29/6, 2021 at 10:31

@Gladiatorial slow? Well, slow is to debug code because your own implementation has a bug because it was more complicated than to just call the boost library ;) If the code is critical, like called a lot and provides a bottleneck, then, well, it can be worth to think about slowness – Uta 12/2, 2022 at 12:0

I believe Boost isn't a C++ standard library solution, is it? – Graphophone 13/10, 2022 at 11:0

No, it isn't. It's one of these extremely unfortunate answers you see on EVERY SINGLE C++ question on this website... because adding an entire library just to do something so simple is apparently the most popular route! – Diabolism 27/1, 2023 at 16:42

Unfortunately if you know Unicode you know that you need a library to do it correctly. But this doesn't mean boost is the one, because it also requires ICU. Welcome to transitive dependency monsters (and ICU has very unstable ABI to make it worse). – Marceline 14/2, 2023 at 23:34

I find this answer helpful as I already have Boost in my project, and I do need the non-in-place version to_lower – Externalization 2/6, 2023 at 1:10

Not everyone uses Boost. – Chaucerian 31/8, 2023 at 18:12

348

tl;dr

Use the ICU library. If you don't, your conversion routine will break silently on cases you are probably not even aware of existing.

First you have to answer a question: What is the encoding of your std::string? Is it ISO-8859-1? Or perhaps ISO-8859-8? Or Windows Codepage 1252? Does whatever you're using to convert upper-to-lowercase know that? (Or does it fail miserably for characters over 0x7f?)

If you are using UTF-8 (the only sane choice among the 8-bit encodings) with std::string as container, you are already deceiving yourself if you believe you are still in control of things. You are storing a multibyte character sequence in a container that is not aware of the multibyte concept, and neither are most of the operations you can perform on it! Even something as simple as .substr() could result in invalid (sub-) strings because you split in the middle of a multibyte sequence.
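The .substr() hazard can be illustrated with a small sketch. Both helper names below are hypothetical, and the check assumes the string holds valid UTF-8; it only tests whether a byte offset points at a continuation byte (bit pattern 10xxxxxx), which is where a cut would split a character.

```cpp
#include <string>

// Returns true if the byte is a UTF-8 continuation byte (bit pattern 10xxxxxx).
bool is_utf8_continuation(unsigned char byte) {
    return (byte & 0xC0) == 0x80;
}

// Hypothetical helper: true if cutting `s` at byte offset `pos` would land
// in the middle of a multi-byte UTF-8 sequence. Assumes `s` is valid UTF-8.
bool splits_utf8_sequence(const std::string& s, std::string::size_type pos) {
    return pos < s.size() &&
           is_utf8_continuation(static_cast<unsigned char>(s[pos]));
}
```

For the UTF-8 string "äbc" (four bytes, since ä takes two), s.substr(0, 1) returns only the lead byte 0xC3, which is not valid UTF-8 on its own; splits_utf8_sequence(s, 1) reports that offset as unsafe.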

As soon as you try something like std::toupper( 'ß' ), or std::tolower( 'Σ' ) in any encoding, you are in trouble. Because 1), the standard only ever operates on one character at a time, so it simply cannot turn ß into SS as would be correct. And 2), the standard only ever operates on one character at a time, so it cannot decide whether Σ is in the middle of a word (where σ would be correct), or at the end (ς). Another example would be std::tolower( 'I' ), which should yield different results depending on the locale -- virtually everywhere you would expect i, but in Turkey ı (LATIN SMALL LETTER DOTLESS I) is the correct answer (which, again, is more than one byte in UTF-8 encoding).

So, any case conversion that works on a character at a time, or worse, a byte at a time, is broken by design. This includes all the std:: variants in existence at this time.

Then there is the point that the standard library, for what it is capable of doing, depends on which locales are supported on the machine your software is running on... and what do you do if your target locale is among those not supported on your client's machine?

So what you are really looking for is a string class that is capable of dealing with all this correctly, and that is not any of the std::basic_string<> variants.

(C++11 note: std::u16string and std::u32string are better, but still not perfect. C++20 brought std::u8string, but all these do is specify the encoding. In many other respects they still remain ignorant of Unicode mechanics, like normalization, collation, ...)

While Boost looks nice, API wise, Boost.Locale is basically a wrapper around ICU. If Boost is compiled with ICU support... if it isn't, Boost.Locale is limited to the locale support compiled for the standard library.

And believe me, getting Boost to compile with ICU can be a real pain sometimes. (There are no pre-compiled binaries for Windows that include ICU, so you'd have to supply them together with your application, and that opens a whole new can of worms...)

So personally I would recommend getting full Unicode support straight from the horse's mouth and using the ICU library directly:

#include <unicode/unistr.h>
#include <unicode/ustream.h>
#include <unicode/locid.h>

#include <iostream>

int main()
{
    /*                          "Odysseus" */
    char const * someString = u8"ΟΔΥΣΣΕΥΣ";
    icu::UnicodeString someUString( someString, "UTF-8" );
    // Setting the locale explicitly here for completeness.
    // Usually you would use the user-specified system locale,
    // which *does* make a difference (see ı vs. i above).
    std::cout << someUString.toLower( "el_GR" ) << "\n";
    std::cout << someUString.toUpper( "el_GR" ) << "\n";
    return 0;
}

Compile (with G++ in this example):

g++ -Wall example.cpp -licuuc -licuio

This gives:

ὀδυσσεύς

Note the Σ<->σ conversion in the middle of the word, and the Σ<->ς conversion at the end of the word. No <algorithm>-based solution can give you that.

Pearlinepearlman answered 5/6, 2014 at 15:6 Comment(11)

This is the correct answer in the general case. The standard gives nothing for handling anything except "ASCII" except lies and deception. It makes you think you can maybe deal with maybe UTF-16, but you can't. As this answer says, you cannot get the proper character-length (not byte-length) of a UTF-16 string without doing your own unicode handling. If you have to deal with real text, use ICU. Thanks, @Pearlinepearlman – Managing 25/3, 2015 at 14:0

Is ICU available by default on Ubuntu/Windows, or does it need to be installed separately? Also, how about this answer: https://mcmap.net/q/53206/-how-to-convert-an-instance-of-std-string-to-lower-case? – Eddi 11/5, 2016 at 19:0

icu::UnicodeString::length() is technically also lying to you (although less frequently), as it reports the number of 16bit code units rather than the number of code points. ;-) – Ununa 15/6, 2017 at 2:17

@masaers: To be completely fair, with things like combining characters, zero-width joiners and right-to-left markers, the number of code points is rather meaningless. I will remove that remark. – Pearlinepearlman 15/6, 2017 at 5:26

@Pearlinepearlman Agreed! The concept of length is rather meaningless on text (we could add ligatures to the list of offenders). That said, since people are used to tabs and control chars taking up one length unit, code points would be the more intuitive measure. Oh, and thanks for giving the correct answer, sad to see it so far down :-( – Ununa 15/6, 2017 at 6:51

Actually, std::string not being aware that it contains text in a multi-byte character-encoding is a feature, not a bug. It's the only sane way to do it, which is why just about everyone does it. Not having proper standard apis for handling anything but basic text from days gone by which never really were at all is a problem though, yes. It would have to be optional even in a hosted environment though, as it is quite hefty, and there are many cases where it isn't needed. – Towill 15/12, 2020 at 0:49

@Deduplicator: Sorry, but that's just dodging it in all possible ways. There are standards (Unicode), there are quasi-standard APIs for handling it (ICU), and if your intention is to write code that properly converts text to lowercase, unless you can guarantee your code will only ever see ASCII-7 (which would be a rather special case), all the other "solutions" here are 80--20 at best. – Pearlinepearlman 15/12, 2020 at 7:37

That is why there should be such standard APIs. Doesn't negate the fact that much string-manipulation is best done ignoring all but it being a sequence of code-units. And that many use-cases never need anything more sophisticated. – Towill 15/12, 2020 at 11:30

@Towill And that standard API is currently the ICU library, which is what this answer is about. – Pearlinepearlman 15/12, 2020 at 11:59

@Towill I heard that std::text is underway, perhaps even in time for C++23. Let's not give up all hope yet. – Pearlinepearlman 2/3, 2021 at 15:42

icu::UnicodeString seems to be a good class. QString can also do the job. However, it is a pain to use in big programs with many libraries. I hope std::text will be a real thing soon – Conclave 16/6, 2022 at 9:57

Using the range-based for loop of C++11, simpler code would be:

#include <iostream>       // std::cout
#include <string>         // std::string
#include <locale>         // std::locale, std::tolower

int main ()
{
  std::locale loc;
  std::string str="Test String.\n";

  for(auto elem : str)
    std::cout << std::tolower(elem,loc);
}

Fromenty answered 9/10, 2013 at 8:0 Comment(4)

However, on a French machine, this program doesn't convert non-ASCII characters allowed in the French language. For instance, the string 'Test String123. É Ï\n' will be converted to 'test string123. É Ï\n', although the characters É and Ï and their lower-case counterparts 'é' and 'ï' are allowed in French. It seems that no solution for that was provided by other messages of this thread. – Fromenty 9/10, 2013 at 8:15

I think you need to set a proper locale for that. – Overpower 30/12, 2013 at 8:37

@incises, this then someone posted an answer about ICU and that's certainly the way to go. Easier than most other solutions that would attempt to understand the locale. – Avaria 1/9, 2016 at 21:25

I'd prefer to not use external libraries when possible, personally. – Athanor 11/7, 2017 at 0:54

Another approach using range based for loop with reference variable

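#include <cctype>
#include <iostream>
#include <string>

int main()
{
    std::string test = "Hello World";
    for (auto& c : test)  // cast first: tolower() needs an unsigned char value
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));

    std::cout << test << std::endl;
}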

Astrograph answered 10/1, 2017 at 19:53 Comment(1)

I guess it won't work for UTF-8, will it? – Tracheid 16/4, 2021 at 19:46

If the string contains UTF-8 characters outside of the ASCII range, then boost::algorithm::to_lower will not convert those. Better use boost::locale::to_lower when UTF-8 is involved. See http://www.boost.org/doc/libs/1_51_0/libs/locale/doc/html/conversions.html

Snatchy answered 10/10, 2012 at 7:24 Comment(1)

A working example? – Inaction 2/1, 2022 at 15:43

This is a follow-up to Stefan Mai's response: if you'd like to place the result of the conversion in another string, you need to pre-allocate its storage space prior to calling std::transform. Since STL stores transformed characters at the destination iterator (incrementing it at each iteration of the loop), the destination string will not be automatically resized, and you risk memory stomping.

#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>

int main (int argc, char* argv[])
{
  std::string sourceString = "Abc";
  std::string destinationString;

  // Allocate the destination space
  destinationString.resize(sourceString.size());

  // Convert the source string to lower case
  // storing the result in destination string.
  // The lambda takes unsigned char because std::tolower has undefined
  // behaviour for negative char values.
  std::transform(sourceString.begin(),
                 sourceString.end(),
                 destinationString.begin(),
                 [](unsigned char c){ return std::tolower(c); });

  // Output the result of the conversion
  std::cout << sourceString
            << " -> "
            << destinationString
            << std::endl;
}
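As a variation on the pre-allocation described above, the destination can be left empty and filled through std::back_inserter, which appends each converted character and lets the string grow by itself. This is only a sketch; the function name lowercase_copy is an assumption for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

// Sketch: returns a lowercased copy without pre-sizing the destination.
std::string lowercase_copy(const std::string& source) {
    std::string destination;
    destination.reserve(source.size());  // optional: avoids reallocations
    std::transform(source.begin(), source.end(),
                   std::back_inserter(destination),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return destination;
}
```

Because back_inserter calls push_back, there is no risk of writing past the end of the destination, at the cost of a capacity check per character. The same single-byte limitations of std::tolower still apply.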

Tasiatasiana answered 28/3, 2013 at 6:25 Comment(2)

This did not resize Ä into ä for me – Sancho 23/1, 2016 at 16:12

Could also use a back inserter iterator here instead of manual resize. – Whine 24/4, 2017 at 1:57

The simplest way to convert a string into lowercase without bothering about the std namespace is as follows

1:string with/without spaces

#include <algorithm>
#include <iostream>
#include <string>
using namespace std;
int main(){
    string str;
    getline(cin,str);
//------------function to convert string into lowercase---------------
    transform(str.begin(), str.end(), str.begin(), ::tolower);
//--------------------------------------------------------------------
    cout<<str;
    return 0;
}

2:string without spaces

#include <algorithm>
#include <iostream>
#include <string>
using namespace std;
int main(){
    string str;
    cin>>str;
//------------function to convert string into lowercase---------------
    transform(str.begin(), str.end(), str.begin(), ::tolower);
//--------------------------------------------------------------------
    cout<<str;
    return 0;
}

Lindi answered 12/6, 2015 at 6:50 Comment(2)

This is plain wrong: if you check the documentation, you will see that std::tolower cannot work with char, it only supports unsigned char. So this code is UB if str contains characters outside of 0x00-0x7F. – Bolitho 31/1, 2022 at 14:18

This is also false by virtue of using an identifier starting with str in the global namespace, which is strictly reserved. – Thunderclap 3/11, 2022 at 20:20

I wrote this simple helper function:

#include <cctype>  // std::tolower
#include <string>

std::string to_lower(std::string s) {
    for(char &c : s)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

Usage:

std::string s = "TEST";
std::cout << to_lower("HELLO WORLD"); // output: "hello world"
std::cout << to_lower(s); // won't change the original variable.

Orthodontist answered 29/9, 2020 at 22:52 Comment(0)

My own template functions which perform upper/lower-case conversion.

#include <string>
#include <algorithm>

//
//  Lowercases string
//
template <typename T>
std::basic_string<T> lowercase(const std::basic_string<T>& s)
{
    std::basic_string<T> s2 = s;
    std::transform(s2.begin(), s2.end(), s2.begin(),
        [](const T v){ return static_cast<T>(std::tolower(v)); });
    return s2;
}

//
// Uppercases string
//
template <typename T>
std::basic_string<T> uppercase(const std::basic_string<T>& s)
{
    std::basic_string<T> s2 = s;
    std::transform(s2.begin(), s2.end(), s2.begin(),
        [](const T v){ return static_cast<T>(std::toupper(v)); });
    return s2;
}

Vocal answered 18/5, 2019 at 14:40 Comment(2)

This is what I needed. I just used the towlower for wide characters which supports the UTF-16. – Duvalier 28/4, 2020 at 8:7

::tolower and ::toupper are needed instead of tolower and toupper – Aleen 9/4, 2023 at 21:22

std::ctype::tolower() from the standard C++ Localization library will correctly do this for you. Here is an example extracted from the tolower reference page

#include <locale>
#include <iostream>

int main () {
  std::locale::global(std::locale("en_US.utf8"));
  std::wcout.imbue(std::locale());
  std::wcout << "In US English UTF-8 locale:\n";
  auto& f = std::use_facet<std::ctype<wchar_t>>(std::locale());
  std::wstring str = L"HELLo, wORLD!";
  std::wcout << "Lowercase form of the string '" << str << "' is ";
  f.tolower(&str[0], &str[0] + str.size());
  std::wcout << "'" << str << "'\n";
}
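The same facet exists for narrow characters. The sketch below (the function name facet_to_lower is hypothetical) uses std::ctype<char> and defaults to the classic "C" locale, so it does not depend on en_US.utf8 being installed; with that default it only converts ASCII letters.

```cpp
#include <locale>
#include <string>

// Sketch: lowercases a copy of `s` using the ctype<char> facet of the given
// locale (the classic "C" locale by default, which only handles ASCII letters).
std::string facet_to_lower(std::string s,
                           const std::locale& loc = std::locale::classic()) {
    if (!s.empty()) {
        const auto& facet = std::use_facet<std::ctype<char>>(loc);
        facet.tolower(&s[0], &s[0] + s.size());  // converts the range in place
    }
    return s;
}
```

The argument is taken by value, so a const source string is left untouched, and use_facet is looked up once per call rather than once per character.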

Chablis answered 29/1, 2016 at 2:25 Comment(6)

Nice, as long as you can convert the characters in place. What if your source string is const? That seems to make it a bit more messy (e.g. it doesn't look like you can use f.tolower() ), since you need to put the characters in a new string. Would you use transform() and something like std::bind1st( std::mem_fun() ) for the operator? – Abbie 17/8, 2016 at 6:9

For a const string, we can just make a local copy and then convert it in place. – Chablis 29/8, 2016 at 14:53

Yeah, though, making a copy adds more overhead. – Abbie 4/9, 2016 at 20:49

You could use std::transform with the version of ctype::tolower that does not take pointers. Use a back inserter iterator adapter and you don't even need to worry about pre-sizing your output string. – Whine 24/4, 2017 at 2:11

Great, especially because in libstdc++'s tolower with locale parameter, the implicit call to use_facet appears to be a performance bottleneck. One of my coworkers has achieved a several 100% speed increase by replacing boost::iequals (which has this problem) with a version where use_facet is only called once outside of the loop. – Kinslow 23/5, 2017 at 12:23

This won't work in Windows where you'd have to call std::locale("English_United States.UTF8"). – Bolitho 31/1, 2022 at 14:23

An alternative to Boost is POCO (pocoproject.org).

POCO provides two variants:

The first variant makes a copy without altering the original string.
The second variant changes the original string in place.
"In Place" versions always have "InPlace" in the name.

Both versions are demonstrated below:

#include "Poco/String.h"
using namespace Poco;

std::string hello("Stack Overflow!");

// Copies "STACK OVERFLOW!" into 'newString' without altering 'hello.'
std::string newString(toUpper(hello));

// Changes newString in-place to read "stack overflow!"
toLowerInPlace(newString);

Jamijamie answered 18/9, 2013 at 20:20 Comment(0)

Since none of the answers mentioned the upcoming Ranges library, which is available in the standard library since C++20, and currently separately available on GitHub as range-v3, I would like to add a way to perform this conversion using it.

To modify the string in-place:

str |= action::transform([](unsigned char c){ return std::tolower(c); });

To generate a new string:

auto new_string = original_string
    | view::transform([](unsigned char c){ return std::tolower(c); });

(Don't forget to #include <cctype> and the required Ranges headers.)

Note: the use of unsigned char as the argument to the lambda is inspired by cppreference, which states:

Like all other functions from <cctype>, the behavior of std::tolower is undefined if the argument's value is neither representable as unsigned char nor equal to EOF. To use these functions safely with plain chars (or signed chars), the argument should first be converted to unsigned char:
char my_tolower(char ch)
{
    return static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
}
Similarly, they should not be directly used with standard algorithms when the iterator's value type is char or signed char. Instead, convert the value to unsigned char first:
std::string str_tolower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), 
                // static_cast<int(*)(int)>(std::tolower)         // wrong
                // [](int c){ return std::tolower(c); }           // wrong
                // [](char c){ return std::tolower(c); }          // wrong
                   [](unsigned char c){ return std::tolower(c); } // correct
                  );
    return s;
}

Cupellation answered 15/4, 2019 at 9:36 Comment(0)

On Microsoft platforms you can use the strlwr family of functions: http://msdn.microsoft.com/en-us/library/hkxwh33z.aspx

// crt_strlwr.c
// compile with: /W3
// This program uses _strlwr and _strupr to create
// uppercase and lowercase copies of a mixed-case string.
#include <string.h>
#include <stdio.h>
#include <stdlib.h>   // free

int main( void )
{
   char string[100] = "The String to End All Strings!";
   char * copy1 = _strdup( string ); // make two copies
   char * copy2 = _strdup( string );

   _strlwr( copy1 ); // C4996
   _strupr( copy2 ); // C4996

   printf( "Mixed: %s\n", string );
   printf( "Lower: %s\n", copy1 );
   printf( "Upper: %s\n", copy2 );

   free( copy1 );
   free( copy2 );
}

Seaden answered 29/8, 2014 at 17:18 Comment(0)

There is a way to convert upper case to lower case WITHOUT doing if tests, and it's pretty straightforward. Because isupper() honours the current C locale (see clocale.h), it should take care of locale-specific characters, but if not, you can always tweak UtoL[] to your heart's content.

Given that C's characters are really just 8-bit ints (ignoring the wide character sets for the moment) you can create a 256 byte array holding an alternative set of characters, and in the conversion function use the chars in your string as subscripts into the conversion array.

Instead of a 1-for-1 mapping though, give the upper-case array members the BYTE int values for the lower-case characters. You may find islower() and isupper() useful here.

The code looks like this...

#include <cctype>   // isupper
#include <clocale>  // setlocale, if a non-default locale is wanted
#include <cstring>  // strcpy
static char UtoL[256];
// ----------------------------------------------------------------------------
// Fill the table once: upper-case entries hold their lower-case value,
// every other entry maps to itself.
void InitUtoLMap()  {
    for (int i = 0; i < (int)sizeof(UtoL); i++)  {
        if (isupper(i)) {
            UtoL[i] = (char)(i + 32);
        }   else    {
            UtoL[i] = (char)i;
        }
    }
}
// ----------------------------------------------------------------------------
char *LowerStr(char *szMyStr) {
    char *p = szMyStr;
    // do conversion in-place so as not to require a destination buffer
    while (*p) {        // szMyStr must be null-terminated
        // index as unsigned char: a negative char would read outside the table
        *p = UtoL[(unsigned char)*p];
        p++;
    }
    return szMyStr;
}
// ----------------------------------------------------------------------------
int main() {
    char *Lowered, Upper[128];
    InitUtoLMap();
    strcpy(Upper, "Every GOOD boy does FINE!");

    Lowered = LowerStr(Upper);
    (void)Lowered;      // result is not used further in this demo
    return 0;
}

This approach will, at the same time, allow you to remap any other characters you wish to change.

This approach has one huge advantage when running on modern processors: the per-character conversion contains no if tests, so there is nothing for the branch predictor to get wrong inside the loop body. This saves the CPU's branch prediction logic for other loops, and tends to prevent pipeline stalls.

Some here may recognize this approach as the same one used to convert EBCDIC to ASCII.
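
As a rough sketch of the same table idea applied to std::string, assuming the default "C" locale unless setlocale has been called: the names lower_table and lower_in_place are invented for this example, and the table is filled from std::tolower instead of adding 32.

```cpp
#include <array>
#include <cctype>   // std::tolower
#include <cstddef>  // std::size_t
#include <string>

// Hypothetical helper: builds the 256-entry table once from std::tolower,
// so it reflects whatever C locale is active on first use.
inline const std::array<char, 256>& lower_table() {
    static const std::array<char, 256> table = []() -> std::array<char, 256> {
        std::array<char, 256> t{};
        for (int i = 0; i < 256; ++i) {
            t[static_cast<std::size_t>(i)] = static_cast<char>(std::tolower(i));
        }
        return t;
    }();
    return table;
}

// Lowercases in place. The cast to unsigned char keeps the index in 0..255
// even on platforms where plain char is signed.
inline void lower_in_place(std::string& s) {
    const std::array<char, 256>& table = lower_table();
    for (char& c : s) {
        c = table[static_cast<unsigned char>(c)];
    }
}
```

Whether this is actually faster than calling std::tolower per character depends on the platform and is worth benchmarking before relying on it.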

Ambrogino answered 8/1, 2014 at 17:48 Comment(3)

"There is a way to convert upper case to lower WITHOUT doing if tests" ever heard of lookup tables? – Dithyramb 16/12, 2014 at 0:10

Undefined behavior for negative chars. – Proliferation 21/11, 2017 at 7:6

Modern CPUs are bottlenecked in memory not CPU. Benchmarking would be interesting. – Unwonted 14/4, 2020 at 15:12

Here's a macro technique if you want something simple:

#define STRTOLOWER(x) std::transform((x).begin(), (x).end(), (x).begin(), ::tolower)
#define STRTOUPPER(x) std::transform((x).begin(), (x).end(), (x).begin(), ::toupper)
#define STRTOUCFIRST(x) do { std::transform((x).begin(), (x).begin()+1, (x).begin(), ::toupper); std::transform((x).begin()+1, (x).end(), (x).begin()+1, ::tolower); } while (0)

However, note that @AndreasSpindler's comment on this answer is still an important consideration if you're working on something that isn't just ASCII characters.
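
For readers who prefer functions, the same three helpers could be sketched roughly as follows. The function names are made up here; the lambdas take unsigned char so that a negative char never reaches tolower/toupper, and the empty string is handled explicitly.

```cpp
#include <algorithm>  // std::transform
#include <cctype>     // std::tolower, std::toupper
#include <string>

// Hypothetical function counterparts of the macros above.
inline void str_to_lower(std::string& s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
}

inline void str_to_upper(std::string& s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
}

// Uppercases the first character and lowercases the rest.
// An empty string is left unchanged.
inline void str_to_ucfirst(std::string& s) {
    if (s.empty()) return;
    str_to_lower(s);
    s[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[0])));
}
```

Unlike a macro, each function evaluates its argument exactly once, so passing an expression with side effects behaves as expected.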

Spirit answered 30/1, 2016 at 21:2 Comment(7)

I'm downvoting this for giving macros when a perfectly good solution exists -- you even give those solutions. – Opec 7/11, 2017 at 7:44

The macro technique means less typing of code for something that one would commonly use a lot in programming. Why not use that? Otherwise, why have macros at all? – Spirit 7/11, 2017 at 8:2

Macros are a legacy from C that's being worked hard on to get rid of. If you want to reduce the amount of typing, use a function or a lambda. void strtoupper(std::string& x) { std::transform (x.begin(), x.end(), x.begin(), ::toupper); } – Opec 7/11, 2017 at 12:11

@Opec As I want to be a better coder, can you provide me any ANSI doc links where any ANSI C++ committees say something to the effect of, "We need to call a meeting to get rid of macros out of C++"? Or some other roadmap? – Spirit 7/11, 2017 at 20:47

No, I can't. Bjarne's stance on the topic has been made pretty clear on several occasions though. Besides, there are plenty of reasons to not use macros in C as well as C++. x could be a valid expression, that just happens to compile correctly but will give completely bogus results because of the macros. – Opec 8/11, 2017 at 12:2

good macros! @Opec macros help us so much... I expect they never get rid of it. – Uncoil 24/7, 2018 at 23:50

@AquariusPower I disagree. I have yet to see a macro that could not have been done better as a template or a lambda. – Opec 29/7, 2018 at 16:11

Is there an alternative which works 100% of the time?

There are several questions you need to ask yourself before choosing a lowercasing method.

How is the string encoded? Plain ASCII? UTF-8? Some form of extended-ASCII legacy encoding?
What do you mean by lower case anyway? Case mapping rules vary between languages! Do you want something that is localised to the user's locale? Do you want something that behaves consistently on all systems your software runs on? Do you just want to lowercase ASCII characters and pass through everything else?
What libraries are available?

Once you have answers to those questions you can start looking for a solution that fits your needs. There is no one-size-fits-all approach that works for everyone everywhere!
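
As one illustration of how the answers shape the code: if the decision is "lowercase ASCII letters only, pass everything else through, behave the same on every system", a locale-independent sketch could look like this. The name ascii_to_lower is invented for the example, and an ASCII-compatible execution character set is assumed.

```cpp
#include <string>

// Hypothetical helper: lowercases only the ASCII letters A-Z and copies every
// other byte unchanged, regardless of the current locale.
inline std::string ascii_to_lower(std::string s) {
    for (char& c : s) {
        if (c >= 'A' && c <= 'Z') {
            c = static_cast<char>(c - 'A' + 'a');
        }
    }
    return s;
}
```

Because every byte of a multi-byte UTF-8 sequence has its high bit set, such bytes never fall in the A-Z range, so UTF-8 text passes through intact (with its non-ASCII letters left in their original case).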

Gatian answered 28/1, 2019 at 21:31 Comment(1)

I suggest you scroll up a few answers to the one provided by @DevSolar. He explains in very good detail why only the ICU library is capable of doing text well in C++. It is by the very people who invented and support UTF-8 and other Unicode encodings. It is much more complex than most realize. – Gravy 1/1 at 18:30

C++ doesn't have tolower or toupper methods implemented for std::string, but they are available for char. One can easily read each char of the string, convert it to the required case and put it back into the string. Sample code without using any third-party library:

#include <cctype>    // std::tolower
#include <iostream>
#include <string>

int main(){
    std::string str = std::string("How ARe You");
    for(char &ch : str){
        // cast to unsigned char first: a negative char is undefined behaviour for std::tolower
        ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    }
    std::cout<<str<<std::endl;
    return 0;
}

For character-based operations on a string, see: For every character in string

Winepress answered 17/3, 2019 at 14:35 Comment(0)

// tolower example (C++)
#include <iostream>       // std::cout
#include <string>         // std::string
#include <locale>         // std::locale, std::tolower

int main ()
{
  std::locale loc;
  std::string str="Test String.\n";
  for (std::string::size_type i=0; i<str.length(); ++i)
    std::cout << std::tolower(str[i],loc);
  return 0;
}

For more information: http://www.cplusplus.com/reference/locale/tolower/
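
std::tolower(ch, loc) is specified in terms of std::use_facet<std::ctype<charT>>(loc).tolower(ch), so the loop above looks the facet up once per character. A possible sketch that fetches the facet once and converts the whole buffer with the range overload follows; the function name to_lower_in_place is hypothetical.

```cpp
#include <locale>  // std::locale, std::use_facet, std::ctype
#include <string>

// Hypothetical helper: lowercases s in place according to loc
// (a copy of the global locale by default).
inline void to_lower_in_place(std::string& s, const std::locale& loc = std::locale()) {
    if (s.empty()) return;
    // Look the facet up once instead of once per character.
    const std::ctype<char>& facet = std::use_facet<std::ctype<char>>(loc);
    // The range overload converts every character in [begin, end) in one call.
    facet.tolower(&s[0], &s[0] + s.size());
}
```

Like the original example, this works one char at a time, so it is only suitable for single-byte encodings.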

Pomace answered 20/3, 2017 at 5:20 Comment(0)

An explanation of how this solution works:

string test = "Hello World";
for(auto& c : test)
{
   c = tolower(c);
}

Explanation:

for(auto& c : test) is a range-based for loop of the kind
for (range_declaration : range_expression) loop_statement:

range_declaration: auto& c
Here the auto specifier is used for automatic type deduction, so the type is deduced from the variable's initializer.
range_expression: test
The range in this case consists of the characters of the string test.

The characters of the string test are available by reference inside the for loop through the identifier c.

Burchette answered 17/4, 2018 at 12:20 Comment(1)

I don't see the value of adding this as an answer, or as an edit to the linked answer for that matter. If someone needs an explanation of how the range-for loop works, there are multiple resources for that, e.g. stackoverflow.com/questions/35490236. For this question, I think this explanation is just noise - like adding an explanation of how iterators or standard algorithms work for the answers that use std::transform. – Columbine 23/6, 2023 at 13:31

Try this function :)

string toLowerCase(string str) {
    int str_len = str.length();
    string final_str = "";

    for(int i=0; i<str_len; i++) {
        char character = str[i];

        // ASCII 'A'..'Z' (65..90) maps to 'a'..'z' by adding 32
        if(character>='A' && character<='Z') {
            final_str += static_cast<char>(character+32);
        } else {
            final_str += character;
        }
    }

    return final_str;
}

Jariah answered 19/3, 2020 at 1:12 Comment(1)

This function is slow, shouldn't be used in real-life projects. – Gladiatorial 29/6, 2021 at 10:30

Have a look at the excellent c++17 cpp-unicodelib (GitHub). It's single-file and header-only.


#include <codecvt>
#include <iostream>
#include <locale>   // std::wstring_convert
#include <string>

// cpp-unicodelib, downloaded from GitHub
#include "unicodelib.h"
#include "unicodelib_encodings.h"

using namespace std;
using namespace unicode;

int main() {
    // converter that allows displaying a UTF-32 string as UTF-8
    wstring_convert<codecvt_utf8<char32_t>, char32_t> converter;

    std::u32string  in = U"Je suis là!";
    cout << converter.to_bytes(in) << endl;

    std::u32string  lc = to_lowercase(in);
    cout << converter.to_bytes(lc) << endl;
}

Output

Je suis là!
je suis là!

Obscenity answered 25/4, 2022 at 13:18 Comment(1)

2022, c++17, again and again you have to visit stackoverflow to check for another version of tolower – Pushing 30/4, 2022 at 11:34

Use fplus::to_lower_case() from fplus library.

Search to_lower_case in fplus API Search

Example:

fplus::to_lower_case(std::string("ABC")) == std::string("abc");

Fungicide answered 8/5, 2017 at 7:21 Comment(0)

Google's absl library has absl::AsciiStrToLower / absl::AsciiStrToUpper

Insolent answered 27/5, 2022 at 8:43 Comment(0)

Since you are using std::string, you are using C++. With C++11 or higher, this doesn't need anything fancy. If words is a vector<string>, then:

    for (auto & str : words) {
        for(auto & ch : str)
            ch = tolower(ch);
    }

This has no strange exceptions. You might want to use wchar_t's, but otherwise this does it all in place.

Diacritical answered 27/1, 2023 at 19:41 Comment(0)

For a different perspective, there is a very common use case, which is to perform locale-neutral case folding on Unicode strings. For this case, it is possible to get good case folding performance once you realize that the set of foldable characters is finite and relatively small (< 2000 Unicode code points). This works very well with a generated perfect hash (guaranteed zero collisions), which can be used to convert every input character to its lowercase equivalent.

With UTF-8, you do have to be mindful of multi-byte characters and iterate accordingly. However, UTF-8 has fairly simple encoding rules that make this operation efficient.
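
To make the iteration concrete, here is a deliberately tiny sketch, not the perfect-hash implementation described above. It assumes well-formed UTF-8, and fold_code_point is a hypothetical stand-in for the generated hash that covers only ASCII A-Z and the Latin-1 capitals U+00C0..U+00DE; all function names are invented for the example.

```cpp
#include <cstddef>
#include <string>

// Byte length of the UTF-8 sequence introduced by `lead`. Stray continuation
// bytes and invalid lead bytes report 1 so that they are copied unchanged.
inline std::size_t utf8_sequence_length(unsigned char lead) {
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;
}

// Stand-in for the perfect hash lookup: handles A-Z and U+00C0..U+00DE
// (except U+00D7, the multiplication sign) and returns anything else as is.
inline char32_t fold_code_point(char32_t cp) {
    if (cp >= U'A' && cp <= U'Z') return cp + 0x20;
    if (cp >= 0xC0 && cp <= 0xDE && cp != 0xD7) return cp + 0x20;
    return cp;
}

inline std::string fold_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::size_t len = utf8_sequence_length(lead);
        if (len > in.size() - i) len = in.size() - i;  // truncated tail: copy what is left
        const bool two_byte =
            len == 2 && (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80;
        if (lead < 0x80) {
            out += static_cast<char>(fold_code_point(lead));
        } else if (two_byte) {
            // Decode the two-byte sequence, fold it, and encode it again.
            const unsigned char trail = static_cast<unsigned char>(in[i + 1]);
            const char32_t cp = fold_code_point(((lead & 0x1Fu) << 6) | (trail & 0x3Fu));
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out.append(in, i, len);  // longer or malformed sequences pass through
        }
        i += len;
    }
    return out;
}
```

A real implementation would replace fold_code_point with a lookup covering the full Unicode case folding data and would also decode three- and four-byte sequences, since a folded code point can differ in encoded length from its source.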

For more details, including links to the relevant parts of the Unicode standard and a perfect hash generator, see my answer here, to the question How to achieve unicode-agnostic case insensitive comparison in C++.

Horal answered 4/4, 2023 at 19:44 Comment(0)

-1

Code Snippet

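#include <cctype>    // tolower
#include <iostream>
#include <string>
using namespace std;


int main ()
{
    ios::sync_with_stdio(false);

    string str="String Convert\n";

    for(size_t i=0; i<str.size(); i++)
    {
      // cast to unsigned char: passing a negative char to tolower is undefined
      str[i] = tolower(static_cast<unsigned char>(str[i]));
    }
    cout<<str<<endl;

    return 0;
}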

Directly answered 10/4, 2017 at 19:11 Comment(0)

-1

Here are two more library options for ASCII to_lower. Both are production-grade and micro-optimized, so they are expected to be faster than the existing answers here (TODO: add benchmark results).

Facebook's Folly:

void toLowerAscii(char* str, size_t length);

Google's Abseil:

void AsciiStrToLower(std::string* s);

Gladiatorial answered 22/6, 2021 at 9:49 Comment(0)

-1

I wrote a templated version that works with any string :

#include <type_traits> // std::decay
#include <cctype>     // std::toupper & std::tolower


template <class T = void> struct farg_t { using type = T; };
template <template<typename ...> class T1, 
class T2> struct farg_t <T1<T2>> { using type = T2*; };
//---------------

template<class T, class T2 = 
typename std::decay< typename farg_t<T>::type >::type>
void ToUpper(T& str) { T2 t = &str[0]; 
for (; *t; ++t) *t = std::toupper(*t); }


template<class T, class T2 = typename std::decay< typename 
farg_t<T>::type >::type>
void Tolower(T& str) { T2 t = &str[0]; 
for (; *t; ++t) *t = std::tolower(*t); }

Tested with gcc compiler:

#include <iostream>
#include "upove_code.h"

int main()
{

    std::string str1 = "hEllo ";
    char str2 [] = "wOrld";

    ToUpper(str1);
    ToUpper(str2);
    std::cout << str1 << str2 << '\n'; 
    Tolower(str1);
    Tolower(str2);
    std::cout << str1 << str2 << '\n'; 
    return 0;
}

output:

HELLO WORLD
hello world

Overcome answered 3/2, 2022 at 10:11 Comment(0)

-3

This could be another simple version to convert uppercase to lowercase and vice versa. I used VS2017 community version to compile this source code.

#include <iostream>
#include <string>
using namespace std;

int main()
{
    std::string _input = "lowercasetouppercase";
#if 0
    // My idea is to use the ascii value to convert
    char upperA = 'A';
    char lowerA = 'a';

    cout << (int)upperA << endl; // ASCII value of 'A' -> 65
    cout << (int)lowerA << endl; // ASCII value of 'a' -> 97
    // 97-65 = 32; // Difference of ASCII value of upper and lower a
#endif // 0

    cout << "Input String = " << _input.c_str() << endl;
    for (int i = 0; i < _input.length(); ++i)
    {
        _input[i] -= 32; // To convert lower to upper
#if 0
        _input[i] += 32; // To convert upper to lower
#endif // 0
    }
    cout << "Output String = " << _input.c_str() << endl;

    return 0;
}

Note: if the string contains special characters (anything other than the letters being converted), they need to be handled with a condition check.
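
A minimal sketch of such a condition check, assuming ASCII input; toggle_ascii_case is a made-up name. It flips the case of letters and leaves digits, punctuation and any other bytes alone.

```cpp
#include <string>

// Hypothetical helper: swaps the case of ASCII letters only.
inline std::string toggle_ascii_case(std::string s) {
    for (char& c : s) {
        if (c >= 'a' && c <= 'z') {
            c = static_cast<char>(c - 32);  // lower to upper
        } else if (c >= 'A' && c <= 'Z') {
            c = static_cast<char>(c + 32);  // upper to lower
        }
        // everything else is left untouched
    }
    return s;
}
```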

Winfordwinfred answered 4/6, 2018 at 2:47 Comment(0)

-3

Use this code to change the case of a string in C++.

#include<bits/stdc++.h>

using namespace std;

int main(){
  string a = "sssAAAAAAaaaaDas";
  transform(a.begin(),a.end(),a.begin(),::tolower);
  cout<<a;
}

Kamalakamaria answered 27/5, 2022 at 6:1 Comment(1)

Never recommend using #include <bits/stdc++.h> in an answer on Stack Overflow. You'll get downvoted. – Stadler 27/5, 2022 at 13:50
