How to print special characters explicitly in C?
K

8

13

When I use the code below:

#include <stdio.h>

int main(void)
{
    printf("%s","Hello world\nHello world");
    return 0;
}

it prints as:

 Hello world
 Hello world

How can I prevent this and print it as a raw string literal in C? I mean it should be displayed as is in the terminal window, like below:

Hello world\nHello world

I know I can achieve this by escaping the backslash in the printf call, but is there any other C function or way to do this without adding backslashes? It would be helpful when reading files.

K answered 6/4, 2015 at 18:35 Comment(8)
If you read in files that contain a backslash, you can print them with printf without doing anything different. Have you tried it? In your example, the compiler is interpreting the \n and replacing it with a newline. If you fill in your strings some other way, say by reading a line of a file into a string, this doesn't happen.Iatrochemistry
Thanks. I need the \n character to be displayed raw. By default it is not visible and just starts a new line in the terminal window.K
Oh, you want a newline in a string to be displayed as a \n? So if you read a file that contains a newline, it will replace it with a \n? You could write a function to do that.Iatrochemistry
Oh, isn't there any standard function for that in C? Should I write a function that replaces every escape character with its backslashed form? It sounds a bit awkward.K
Well, once the text is in a string, there are no escape characters left to replace. You just have a literal newline character in your string. I'll write a short function to show what I'm thinking of.Iatrochemistry
if(ch == '\n') fputs("\\n", stdout); else putchar(ch);Christianize
Consider https://mcmap.net/q/506715/-escape-all-special-characters-in-printfB
After this string is printed, what will read it? Just humans or other C code?B
T
6

There is no built-in mechanism to do this. You have to do it manually, character-by-character. However, the functions in ctype.h may help. Specifically, in the "C" locale, the function isprint is guaranteed to be true for all of the graphic characters in the basic execution character set, which is effectively the same as all the graphic characters in 7-bit ASCII, plus space; and it is guaranteed not to be true for all the control characters in 7-bit ASCII, which includes tab, carriage return, etc.

Here is a sketch:

#include <stdio.h>
#include <ctype.h>
#include <locale.h>

int main(void)
{
    int x;
    setlocale(LC_ALL, "C"); // (1)

    while ((x = getchar()) != EOF)
    {
        unsigned int c = (unsigned int)(unsigned char)x; // (2)

        if (isprint(c) && c != '\\')
            putchar(c);
        else
            printf("\\x%02x", c);
    }
    return 0;
}

This does not escape ' nor ", but it does escape \, and it is straightforward to extend that if you need it to.

Printing \n for U+000A, \r for U+000D, etc. is left as an exercise. Dealing with characters outside the basic execution character set (e.g. UTF-8 encoding of U+0080 through U+10FFFF) is also left as an exercise.

This program contains two things which are not necessary with a fully standards-compliant C library, but in my experience have been necessary on real operating systems. They are marked with (1) and (2).

1) This explicitly sets the 'locale' configuration the way it is supposed to be set by default.

2) The value returned from getchar is an int. It is supposed to be either a number in the range representable by unsigned char (normally 0-255 inclusive), or the special value EOF (which is not in the range representable by unsigned char). However, buggy C libraries have been known to return negative numbers for characters with their highest bit set. If that happens, the printf will print (for instance) \xffffffa1 when it should've printed \xa1. Casting x to unsigned char and then back to unsigned int corrects this.

Tia answered 6/4, 2015 at 18:57 Comment(5)
Note: No need for unsigned int c = (unsigned int)(unsigned char)x;, just use x. getchar() returns a value in the unsigned char range or EOF. x is zero extended.B
@chux You are correct as far as the standard goes, but I have personally tripped over at least two C libraries that didn't get that right. It was a long time ago; perhaps the extra defensiveness is no longer necessary.Tia
Scars from earlier battles cover my fingertips too.B
Thanks @zvol for the explanation. But I did not understand the purpose of the zero extension. Why is x cast to a char first and then to an int when x is already a char? It would help me if the code were simpler, please, because I am new to C.K
@K Unfortunately, that line is necessary in practice (although, as pointed out above, not in principle). I have added some explanation.Tia
I
1

Something like this might be what you want. Run myprint(c) to print the character c or a printable representation of it:

#include <ctype.h>
#include <stdio.h>

void myprint(int c)
{
    if (isprint(c))
        putchar(c); // just print printable characters
    else if (c == '\n')
        printf("\\n"); // display newline as \n
    else
        printf("%02x", c); // print everything else as a number
}

If you're using Windows, I think all your newlines will be CRLF (carriage return, linefeed) so they'll print as 0d\n the way I wrote that function.

Iatrochemistry answered 6/4, 2015 at 18:56 Comment(0)
C
1

Just use putchar(specialCharName). It displays the entered special character.

Chinatown answered 22/12, 2019 at 5:30 Comment(0)
E
0

What you're looking for is this:

#include <stdio.h>
int main(void)
{
    printf("%s","Hello world\\nHello world");
    return 0;
}

This would produce the following output: Hello world\nHello world

Englishism answered 6/4, 2015 at 18:47 Comment(5)
I know I can do by backslashing. I need a way without using backslashing if possible.K
There is no other way.Englishism
I agree with @IvoValchev. Do not down vote before giving a chance to explain.Unbind
@GRC There is another way. Take a look at the answers given by yellowantphil or zwol.Livraison
Both of those do almost the same thing: they replace \n with \\n, not in the string but in the output.Unbind
T
0

If I understand the question, if you have a string containing control characters like newline, tab, backspace, etc., you want to print a text representation of those characters, rather than interpret them as control characters.

Unfortunately, there's no built-in printf conversion specifier that will do that for you. You'll have to walk through the string character by character, test each one to see if it's a control character, and write some text equivalent for it.

Here's a quick, lightly tested example:

#include <stdio.h>
#include <limits.h>
#include <ctype.h>
...
char *src="This\nis\ta\btest";

char *lut[CHAR_MAX] = {0};  // look up table for printable equivalents
                            // of non-printable characters
lut['\n'] = "\\n";
lut['\t'] = "\\t";
lut['\b'] = "\\b";
...
for ( char *p = src; *p != 0; p++ )
{
  if ( isprint( *p ) )
    putchar( *p );
  else
    fputs( lut[ (int) *p], stdout ); // puts adds a newline at the end,
                                     // fputs does not.
}
putchar( '\n' );
Trichome answered 6/4, 2015 at 19:17 Comment(2)
If char is signed by default, the above code will fail if src contains any non ASCII characters. The cast (int) *p does not address this issue! You should use UCHAR_MAX for lut size and cast as (unsigned char) in isprint((unsigned char)*p) and lut[(unsigned char)*p].Tidy
In typical implementations, isprint(CHAR_MAX) --> 0 causing lut[ (int) CHAR_MAX] to access out of bounds. char *lut[CHAR_MAX] is certainly off by 1. I'd expect char *lut[CHAR_MAX+1] (and somehow handle negative values) or even better char *lut[UCHAR_MAX+1] and use unsigned char.B
A
0

Thanks to the user @chunk for contributing to the improvement of this answer.


Why not write a general-purpose solution? It would save you from many problems in the future.

char *
str_escape(char str[])
{
    char chr[3];
    char *buffer = malloc(sizeof(char));
    unsigned int len = 1, blk_size;

    while (*str != '\0') {
        blk_size = 2;
        switch (*str) {
            case '\n':
                strcpy(chr, "\\n");
                break;
            case '\t':
                strcpy(chr, "\\t");
                break;
            case '\v':
                strcpy(chr, "\\v");
                break;
            case '\f':
                strcpy(chr, "\\f");
                break;
            case '\a':
                strcpy(chr, "\\a");
                break;
            case '\b':
                strcpy(chr, "\\b");
                break;
            case '\r':
                strcpy(chr, "\\r");
                break;
            default:
                sprintf(chr, "%c", *str);
                blk_size = 1;
                break;
        }
        len += blk_size;
        buffer = realloc(buffer, len * sizeof(char));
        strcat(buffer, chr);
        ++str;
    }
    return buffer;
}

How it works:

int
main(const int argc, const char *argv[])
{
    puts(str_escape("\tAnbms\n"));
    puts(str_escape("\tA\v\fZ\a"));
    puts(str_escape("txt \t\n\r\f\a\v 1 \t\n\r\f\a\v tt"));
    puts(str_escape("dhsjdsdjhs hjd hjds "));
    puts(str_escape(""));
    puts(str_escape("0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!\"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\f\a\v"));
    puts(str_escape("\x0b\x0c\t\n\r\f\a\v"));
    puts(str_escape("\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14"));
}

Output

\tAnbms\n
\tA\v\fZ\a
txt \t\n\r\f\a\v 1 \t\n\r\f\a\v tt
dhsjdsdjhs hjd hjds 

0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ \t\n\r\f\a\v
\v\f\t\n\r\f\a\v
\a\b\t\n\v\f\r

This solution is based on information from Wikipedia (https://en.wikipedia.org/wiki/Escape_sequences_in_C#Table_of_escape_sequences) and on the answers of other users on stackoverflow.com.


Testing environment

$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 8.6 (jessie)
Release:    8.6
Codename:   jessie
$ uname -a
Linux localhost 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19) x86_64 GNU/Linux
$ gcc --version
gcc (Debian 4.9.2-10) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Accusatory answered 22/2, 2017 at 15:13 Comment(6)
1) For a general solution, the answer should also show how it handles printing a char outside the 0-127 range. 2) strcat(buffer, chr); is UB as buffer[] is not initialized.B
Dear @chux, why do you need an answer on "how it handles printing a char outside the 0-127 range", and why do you say "buffer[] is not initialized"? I don't get any errors or warnings about it. I compiled with GCC on GNU/Linux (I updated the answer).Estate
1) What is the value of buffer[0] the first time strcat(buffer, chr); is called? Code does not set it anywhere, so it is uninitialized. 2) The space allocated is 1 too small.B
@chux, maybe you are right. Each string always ends with the character '\0'. I updated the answer and added you as a contributor at the top of the content.Estate
"Each string always contains the last character '\0'", but buffer is not necessarily a string. Code never puts a '\0' in buffer[] before calling strcat(buffer, chr), so that leads to undefined behavior (UB). Simply add buffer[0] = '\0'; after char *buffer = malloc(sizeof(char));.B
@chunk, I think your remark is quite advanced, but I have never used this technique before and have never run into UB or other undesirable side effects. I have read many resources on the web but never heard of "buffer[0] = '\0'". Where did you find it? A link to an article, a book, a page, etc. would help.Estate
H
0
  /// My experience: Windows 10, Code::Blocks, GCC (MinGW)
  /// Note: raw string literals (the R prefix) are not standard C. They are a
  /// C++11 feature that GCC accepts in C as an extension in its GNU modes.

#include <stdio.h>

int main(void)
{
  /// This gives the desired result by turning the string into a raw string.
  /// Each raw string is passed through "%s" so that a '%' inside it is not
  /// treated as a conversion specifier.
  printf("%s\n", R"(Hello world\nHello world)");
  printf("%s\n", R"(Raw string support printing  *&^%$#@!~()_+-=,<.>/?:;"' )");
  printf("%s\n", R"(.C with a Capital C file format does not support raw string )");
  printf("%s\n", R"(.c with a small c file format does support raw string )");
  printf("%s\n", R"( Raw string did not support \n new line )");

  /// More reading material, also printed as raw strings.
  printf("%s\n", R"(More reading material at - https://en.wikipedia.org/wiki/String_literal#Raw_strings;)");
  printf("%s\n", R"(More reading material at - https://en.wikipedia.org/wiki/String_literal;)");
  printf("%s\n", R"(More reading material at - https://mcmap.net/q/506716/-does-c-support-raw-string-literals;)");
  printf("%s\n", R"(More reading material at - https://learn.microsoft.com/en-us/cpp/c-language/c-string-literals?view=vs-2019)");
  printf("%s\n", R"(More reading material at - https://learn.microsoft.com/en-us/cpp/c-language/string-literal-concatenation?view=vs-2019)");
  printf("%s\n", R"(More reading material at - https://www.geeksforgeeks.org/const-qualifier-in-c/;)");

  return 0;
}
Horodko answered 27/7, 2020 at 8:37 Comment(0)
P
0

Bro just use this simple code

#include <stdio.h>
int main()
{
     printf("Hello World\\nHello World");
     return 0;
}

And it should work pretty fine ! Happy Coding 😉

Proudlove answered 13/11, 2023 at 11:47 Comment(0)
