Save all files in Visual Studio project as UTF-8
T

15

95

I wonder if it's possible to save all files in a Visual Studio 2008 project into a specific character encoding. I got a solution with mixed encodings and I want to make them all the same (UTF-8 with signature).

I know how to save single files, but how about all files in a project?

Tabbi answered 11/11, 2008 at 0:31 Comment(2)
You should know that the RC compiler (at least until Visual Studio 2008) does not support UTF-8 files; for these files you have to use UTF-16.Kentigerma
Also, GlobalSuppressions.cs is UTF-16.Socalled
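Before converting anything, it can help to see which files already carry a byte-order mark, since UTF-16 files like the ones mentioned in these comments should probably be left alone. A minimal sketch in Python; the function name `sniff_bom` and the returned labels are made up for illustration:

```python
import codecs
from typing import Optional

# BOM signatures, UTF-32 first: the UTF-32 LE BOM starts with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_bom(data: bytes) -> Optional[str]:
    """Return a label for the BOM at the start of data, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

A result of None only means "no BOM"; the file may still be ASCII, ANSI, or UTF-8 without signature.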
P
78

Since you're already in Visual Studio, why not just simply write the code?

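// Re-save every .cs file under the given folder as UTF-8.
foreach (var f in new DirectoryInfo(@"...").GetFiles("*.cs", SearchOption.AllDirectories)) {
  // Without an explicit encoding, ReadAllText honors a BOM and otherwise assumes UTF-8.
  string s = File.ReadAllText(f.FullName);
  // Encoding.UTF8 writes a BOM ("UTF-8 with signature").
  File.WriteAllText(f.FullName, s, Encoding.UTF8);
}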

Only three lines of code! I'm sure you can write this in less than a minute :-)

Pediment answered 12/5, 2009 at 1:26 Comment(4)
What about subdirectories, eg. the "Properties" subdir with lots of *.cs files?Triptolemus
The "SearchOption.AllDirectories" parameter is all that's necessary to include subdirectories. I've edited the code accordingly.Pediment
I have now tried it and it works great. The only thing I had to modify was to use Encoding.GetEncoding(1252)=Western European (Windows) as the second parameter to ReadAllText to preserve my Swedish characters (åäö).Tabbi
This solution is even better after VS2015 when you can run this snippet through here: View -> Other Windows -> C# InteractiveInterleaf
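The code page issue raised in the comments can also be handled per file: try UTF-8 first and fall back to the legacy code page only when the bytes are not valid UTF-8. A rough Python sketch, assuming every file is either UTF-8 (with or without BOM) or Windows-1252; the function name and the `fallback` parameter are illustrative, not part of any answer here:

```python
from pathlib import Path

def to_utf8_with_bom(path: Path, fallback: str = "cp1252") -> bool:
    """Rewrite path as UTF-8 with BOM. Returns True if the bytes changed."""
    raw = path.read_bytes()
    try:
        # "utf-8-sig" accepts UTF-8 with or without a BOM and strips the BOM.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Not valid UTF-8: assume the legacy code page (an assumption, not detection).
        text = raw.decode(fallback)
    new_raw = text.encode("utf-8-sig")  # encoding with "utf-8-sig" prepends the BOM
    if new_raw == raw:
        return False
    path.write_bytes(new_raw)
    return True
```

UTF-16 files would be mangled by the fallback branch, so they need to be excluded beforehand. Try it on a copy of the project first.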
C
41

This may be of some help.

link removed due to original reference being defaced by spam site.

Short version: open one file and select File -> Advanced Save Options. Instead of changing UTF-8 to ASCII, change the encoding to UTF-8. Edit: Make sure you select the option that says no byte-order mark (BOM).

Set the code page and hit OK. The setting seems to persist beyond the current file.

Clayberg answered 13/4, 2010 at 21:21 Comment(4)
Change it to "Unicode (UTF-8 without signature)", otherwise it will add a BOM to the beginning of the file.Gamal
Agreed as well... somebody set up us the BOM.Maryettamaryjane
If your file only contains 7-bit ASCII characters, or if any 8-bit ANSI character is not near the top, saving it as UTF-8 without BOM will open it in other editors (like VSCode) as ASCII or a local encoding (like ISO-8859-1, or windows-1252, to name a few). To enforce the encoding, you must use the BOM :).Zenda
Also, using UTF-8, there is no codepage...(Windows used CP65001 for UTF8 and CP1200 for UTF-16). Actual codepages are a thing of the past. Have a look at this old, but still relevant, famous Joel Spolsky post. And see this post: #1629937Zenda
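The point about BOM-less files is easy to check: a file containing only 7-bit ASCII has exactly the same bytes in ASCII, Windows-1252 and UTF-8, so an editor has nothing to go on. A small Python illustration (the sample text is arbitrary):

```python
text = "class Program { }"  # 7-bit ASCII only

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
cp1252_bytes = text.encode("cp1252")

# All three encodings produce identical bytes for pure ASCII text.
same_bytes = ascii_bytes == utf8_bytes == cp1252_bytes

# "utf-8-sig" prepends the three-byte signature EF BB BF.
with_bom = text.encode("utf-8-sig")
bom_prefix = with_bom[:3]
```

Only the signature makes the UTF-8 version distinguishable on disk.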
L
12

In case you need to do this in PowerShell, here is my little move:

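# Caution: the default filter '*.*' matches every file, including those under .git; pass e.g. '*.cs'.
Function Write-Utf8([string] $path, [string] $filter='*.*')
{
    [IO.SearchOption] $option = [IO.SearchOption]::AllDirectories;
    [String[]] $files = [IO.Directory]::GetFiles((Get-Item $path).FullName, $filter, $option);
    foreach($file in $files)
    {
        "Writing $file...";
        # ReadAllText without an encoding assumes UTF-8 unless a BOM is present,
        # so non-ASCII characters in ANSI files are lost; pass the source encoding if needed.
        [String]$s = [IO.File]::ReadAllText($file);
        [IO.File]::WriteAllText($file, $s, [Text.Encoding]::UTF8);
    }
}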
Lillie answered 25/8, 2009 at 5:7 Comment(3)
The file stays as UTF8-Signed in visual studio Advanced save optionsWilde
Unicode characters are lost after execution. For example, Ü becomes � and © becomes �.Plains
Be very careful with this code, because it'll destroy your .git directory too (as it just did for me). I suggest changing the wildcard to *.cs. Silly me for trusting the code snippet blindly.Sacttler
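To avoid the .git accident described above, the file walk itself can skip such directories and only yield known source extensions. A sketch in Python; the skip list and the default extension tuple are assumptions to adjust for your own tree:

```python
import os
from typing import Iterator, Tuple

# Directories that should never be rewritten (assumed defaults, adjust as needed).
SKIP_DIRS = {".git", ".vs", "bin", "obj"}

def iter_source_files(root: str, extensions: Tuple[str, ...] = (".cs",)) -> Iterator[str]:
    """Yield paths of files under root with the given extensions, skipping SKIP_DIRS."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if name.lower().endswith(extensions):
                yield os.path.join(dirpath, name)
```

The in-place assignment to `dirnames` is what stops the descent; filtering the results afterwards would still walk .git.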
W
8

I would convert the files programmatically (outside VS), e.g. using a Python script:

import glob, codecs

for f in glob.glob("*.py"):
    # Read the raw bytes so the BOM can be inspected.
    with open(f, "rb") as fh:
        data = fh.read()
    if data.startswith(codecs.BOM_UTF8):
        # Already UTF-8 with signature
        continue
    # else assume ANSI code page ("mbcs" is available on Windows only)
    text = data.decode("mbcs")
    data = codecs.BOM_UTF8 + text.encode("utf-8")
    with open(f, "wb") as fh:
        fh.write(data)

This assumes all files not in "UTF-8 with signature" are in the ANSI code page, which is apparently what VS 2008 assumes as well. If you know that some files use other encodings, you would have to specify what these encodings are.

Wadmal answered 11/11, 2008 at 8:24 Comment(0)
P
6

Using C#:
1) Create a new ConsoleApplication, then install Mozilla Universal Charset Detector
2) Run code:

static void Main(string[] args)
{
    const string targetEncoding = "utf-8";
    foreach (var f in new DirectoryInfo(@"<your project's path>").GetFiles("*.cs", SearchOption.AllDirectories))
    {
        var fileEnc = GetEncoding(f.FullName);
        if (fileEnc != null && !string.Equals(fileEnc, targetEncoding, StringComparison.OrdinalIgnoreCase))
        {
            var str = File.ReadAllText(f.FullName, Encoding.GetEncoding(fileEnc));
            File.WriteAllText(f.FullName, str, Encoding.GetEncoding(targetEncoding));
        }
    }
    Console.WriteLine("Done.");
    Console.ReadKey();
}

private static string GetEncoding(string filename)
{
    using (var fs = File.OpenRead(filename))
    {
        var cdet = new Ude.CharsetDetector();
        cdet.Feed(fs);
        cdet.DataEnd();
        if (cdet.Charset != null)
            Console.WriteLine("Charset: {0}, confidence: {1} : {2}", cdet.Charset, cdet.Confidence, filename);
        else
            Console.WriteLine("Detection failed: " + filename);
        return cdet.Charset;
    }
}
Postulant answered 19/3, 2015 at 7:7 Comment(0)
G
2

The best solution nowadays is to add the following to the [*.cs] section (or the section for whichever file type you want) of your .editorconfig file:

charset = utf-8

For example, my .editorconfig begins with:

[*.cs]

charset = utf-8

You can also use utf-8-bom if you need to.

Next, run the dotnet format command in the folder containing the solution file; it will do the job.

Done!

Grievance answered 4/8, 2023 at 19:3 Comment(3)
Should be marked as answer in 2023! Just tried in a solution with 3-4 different encodings, it solved it all beautifully.Conception
I tried this with .cshtml files it did not work :/ any suggestions?Quickman
Sorry, @d0rf47, I only have a couple of .cshtml files, so I didn't have a problem with them. Looks like it's a known issue, you can track it and boost with reactions/commentsGrievance
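Since some file types apparently get skipped, it may be worth checking afterwards which files are still not valid UTF-8. A small Python sketch; the function name and the default pattern are placeholders, and it only validates the bytes without changing anything:

```python
from pathlib import Path
from typing import List

def find_non_utf8(root: str, pattern: str = "*.cs") -> List[Path]:
    """Return files under root matching pattern whose bytes are not valid UTF-8."""
    bad = []
    for path in sorted(Path(root).rglob(pattern)):
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad.append(path)
    return bad
```

For example, `find_non_utf8(".", "*.cshtml")` would list the Razor files that still need attention. Pure ASCII files pass this check, because ASCII is a subset of UTF-8.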
P
1

Thanks for your solutions, this code has worked for me :

Dim s As String = ""
Dim direc As DirectoryInfo = New DirectoryInfo("Your Directory path")

For Each fi As FileInfo In direc.GetFiles("*.vb", SearchOption.AllDirectories)
    s = File.ReadAllText(fi.FullName, System.Text.Encoding.Default)
    ' Note: Encoding.Unicode writes UTF-16; use System.Text.Encoding.UTF8 for UTF-8 with signature
    File.WriteAllText(fi.FullName, s, System.Text.Encoding.Unicode)
Next
Postprandial answered 22/2, 2010 at 18:38 Comment(0)
E
1

I have created a function, written in ASP.NET, that changes the encoding of files. I searched a lot, and I also used some ideas and code from this page. Thank you.

And here is the function.

  Function ChangeFileEncoding(pPathFolder As String, pExtension As String, pDirOption As IO.SearchOption) As Integer

    Dim Counter As Integer
    Dim s As String
    Dim reader As IO.StreamReader
    Dim gEnc As Text.Encoding
    Dim direc As IO.DirectoryInfo = New IO.DirectoryInfo(pPathFolder)
    For Each fi As IO.FileInfo In direc.GetFiles(pExtension, pDirOption)
        s = ""
        reader = New IO.StreamReader(fi.FullName, Text.Encoding.Default, True)
        s = reader.ReadToEnd
        gEnc = reader.CurrentEncoding
        reader.Close()

        If (gEnc.EncodingName <> Text.Encoding.UTF8.EncodingName) Then
            s = IO.File.ReadAllText(fi.FullName, gEnc)
            IO.File.WriteAllText(fi.FullName, s, System.Text.Encoding.UTF8)
            Counter += 1
            Response.Write("<br>Saved #" & Counter & ": " & fi.FullName & " - <i>Encoding was: " & gEnc.EncodingName & "</i>")
        End If
    Next

    Return Counter
End Function

It can be placed in an .aspx file and then called like:

ChangeFileEncoding("C:\temp\test", "*.ascx", IO.SearchOption.TopDirectoryOnly)
Emlen answered 10/1, 2012 at 7:7 Comment(0)
G
1

If you are using TFS with VS: http://msdn.microsoft.com/en-us/library/1yft8zkw(v=vs.100).aspx Example:

tf checkout -r -type:utf-8 src/*.aspx
Ginglymus answered 13/8, 2013 at 10:31 Comment(0)
D
1

If you want to avoid this type of error:

[image not preserved]

Use the following code:

foreach (var f in new DirectoryInfo(@"....").GetFiles("*.cs", SearchOption.AllDirectories))
            {
                string s = File.ReadAllText(f.FullName, Encoding.GetEncoding(1252));
                File.WriteAllText(f.FullName, s, Encoding.UTF8);
            }

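Code page 1252 (Western European) is the default ANSI encoding on many Windows installations, and Visual Studio uses the system's ANSI code page to save files when no other encoding is specified.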

Decarlo answered 9/2, 2017 at 17:32 Comment(0)
P
1

Convert from UTF-8-BOM to UTF-8

Building on rasx's answer, here is a PowerShell function that assumes your current files are already encoded in UTF-8 (but maybe with BOM) and converts them to UTF-8 without BOM, therefore preserving existing Unicode characters.

Function Write-Utf8([string] $path, [string] $filter='*')
{
    [IO.SearchOption] $option = [IO.SearchOption]::AllDirectories;
    [String[]] $files = [IO.Directory]::GetFiles((Get-Item $path).FullName, $filter, $option);
    foreach($file in $files)
    {
        "Writing $file...";
        [String]$s = [IO.File]::ReadAllText($file, [Text.Encoding]::UTF8);
        [Text.Encoding]$e = New-Object -TypeName Text.UTF8Encoding -ArgumentList ($false);
        [IO.File]::WriteAllText($file, $s, $e);
    }
}
Penal answered 8/10, 2019 at 13:11 Comment(0)
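The same BOM removal can be done at the byte level, without decoding the text at all, which sidesteps any risk of altering characters. A minimal Python sketch (the function name is made up for illustration):

```python
import codecs
from pathlib import Path

def strip_utf8_bom(path: Path) -> bool:
    """Remove a leading UTF-8 BOM from path. Returns True if one was removed."""
    raw = path.read_bytes()
    if not raw.startswith(codecs.BOM_UTF8):
        return False  # nothing to do; the file is left untouched
    path.write_bytes(raw[len(codecs.BOM_UTF8):])
    return True
```

Files without a BOM are never rewritten, so running it twice is harmless.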
W
0

I experienced encoding problems after converting a solution from VS2008 to VS2015. After conversion, all project files were encoded as ANSI, but they contained UTF-8 content and were recognized as ANSI files in VS2015. I tried many conversion tactics, but only this solution worked.

 Encoding encoding = Encoding.Default;
 String original = String.Empty;
 foreach (var f in new DirectoryInfo(path).GetFiles("*.cs", SearchOption.AllDirectories))
 {
    using (StreamReader sr = new StreamReader(f.FullName, Encoding.Default))
    {
       original = sr.ReadToEnd();
       encoding = sr.CurrentEncoding;
       sr.Close();
    }
    if (encoding == Encoding.UTF8)
       continue;
    byte[] encBytes = encoding.GetBytes(original);
    byte[] utf8Bytes = Encoding.Convert(encoding, Encoding.UTF8, encBytes);
    var utf8Text = Encoding.UTF8.GetString(utf8Bytes);

    File.WriteAllText(f.FullName, utf8Text, Encoding.UTF8);
 }
Wyant answered 28/8, 2015 at 7:5 Comment(0)
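The symptom described in this answer (UTF-8 content read as ANSI) is reversible in many cases, whereas the opposite mistake (ANSI read as UTF-8) loses data. A short Python demonstration, assuming Windows-1252 as the ANSI code page:

```python
original = "\u00dc \u00a9"  # the characters Ü and ©

# UTF-8 bytes misread as Windows-1252 produce mojibake, but no information is lost...
utf8_bytes = original.encode("utf-8")
misread = utf8_bytes.decode("cp1252")

# ...so re-encoding with the wrong code page and decoding as UTF-8 repairs it.
repaired = misread.encode("cp1252").decode("utf-8")

# The opposite direction is lossy: a lone cp1252 byte is invalid UTF-8 and
# is replaced by U+FFFD, the replacement character.
lossy = original.encode("cp1252").decode("utf-8", errors="replace")
```

The repair step can fail for characters whose UTF-8 bytes are undefined in Windows-1252 (0x81, for example), so keep a backup before applying it in bulk.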
R
0

The "Advanced Save Options" item was removed from the File menu in Visual Studio 2017. You can still access the functionality through File -> Save As, then clicking the down arrow on the Save button and clicking "Save With Encoding...".

You can also add it back to the File menu through Tools->Customize->Commands if you want to.

Redneck answered 13/12, 2018 at 20:56 Comment(0)
B
0

Pass new UTF8Encoding(true) to both the read and the write call:

foreach (var f in new DirectoryInfo(@"C:\Apps\ClRs\ClrsWeb").GetFiles("*.aspx.vb", SearchOption.AllDirectories))
            {
                string s = File.ReadAllText(f.FullName, new UTF8Encoding(true));
                File.WriteAllText(f.FullName, s,new UTF8Encoding(true));
            }
Bracknell answered 26/3, 2024 at 11:51 Comment(0)
E
-1

I'm only offering this suggestion in case there's no way to automatically do this in Visual Studio (I'm not even sure this would work):

  1. Create a class in your project named 足の不自由なハッキング (or some other unicode text that will force Visual Studio to encode as UTF-8).
  2. Add "using MyProject.足の不自由なハッキング;" to the top of each file. You should be able to do it on everything by doing a global replace of "using System.Text;" with "using System.Text;using MyProject.足の不自由なハッキング;".
  3. Save everything. You may get a long string of "Do you want to save X.cs using UTF-8?" messages or something.
Eventide answered 11/11, 2008 at 0:55 Comment(3)
Duh, if you really want to make it stick just add a comment with those characters. At least it won't get deleted next time someone goes "Remove Unused Usings" in the Edit menu.Triptolemus
Add "using MyProject.足の不自由なハッキング;" to the top of each file. - I think the main reason for the question was, not to have to open each file separately.Prepared
This does not work for files with German Umlaut characters like äöüß. The file content will still be non-UTF.Histone
