Bug with adjusting RTF in Winforms when using Windows-wide beta UTF-8 support feature

I think I've found a bug in Windows or .NET and am looking for a workaround.

To reproduce the problem, first enable the Windows feature "Beta: Use Unicode UTF-8 for worldwide language support".

You may need to reboot the machine.

Now create two RichTextBox controls in a WinForms/C# project, and add this TextChanged handler to the first one:

    private void richTextBox1_TextChanged(object sender, EventArgs e)
    {
        string s = richTextBox1.Rtf;
        richTextBox2.Rtf = s;
    }

Finally, run the program and type something into the first RichTextBox. It crashes with the message "File format is not valid" when it tries to write to richTextBox2.Rtf. It doesn't crash if the Windows feature "Beta: Use Unicode UTF-8 for worldwide language support" is disabled.

I'm thinking of two potential workarounds here:

1: Somehow disable within the C# app the entire "Beta: Use Unicode UTF-8 for worldwide language support" feature and pretend it was never enabled in the first place.

2: Somehow edit the RTF string to comply with whatever unknown requirements the new RTF should have before adjusting the RTF of the other RichTextBox. This seems counter-intuitive considering the first RichTextBox should have exactly the same RTF anyway, but anyway...
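
As a stopgap while neither workaround is in place, the handler can at least avoid crashing. This is a minimal sketch, not a fix: it assumes that losing formatting in the second box is acceptable when the RTF assignment fails, and it relies only on the exception type shown in the exception text below (System.ArgumentException). The form layout here is made up for illustration.

```csharp
using System;
using System.Windows.Forms;

public class Form1 : Form
{
    // Two boxes stacked vertically; the layout is illustrative only.
    private readonly RichTextBox richTextBox1 = new RichTextBox { Dock = DockStyle.Top };
    private readonly RichTextBox richTextBox2 = new RichTextBox { Dock = DockStyle.Fill };

    public Form1()
    {
        Controls.Add(richTextBox2);
        Controls.Add(richTextBox1);
        richTextBox1.TextChanged += richTextBox1_TextChanged;
    }

    private void richTextBox1_TextChanged(object sender, EventArgs e)
    {
        try
        {
            // Normal path: copy the formatted content.
            richTextBox2.Rtf = richTextBox1.Rtf;
        }
        catch (ArgumentException)
        {
            // Assumption: plain text is better than a crash.
            // Formatting is lost on this path.
            richTextBox2.Text = richTextBox1.Text;
        }
    }
}
```

This only hides the symptom; the copied text will be unformatted on affected machines.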


************* Exception Text **************
System.ArgumentException: File format is not valid.
at System.Windows.Forms.RichTextBox.StreamIn(Stream data, Int32 flags)
at System.Windows.Forms.RichTextBox.StreamIn(String str, Int32 flags)
at System.Windows.Forms.RichTextBox.set_Rtf(String value)
at unicodeTesting.Form1.richTextBox1_TextChanged(Object sender, EventArgs e) in D:\Code\c#\_tests\unicodeTesting\Form1.cs:line 30
at System.Windows.Forms.Control.OnTextChanged(EventArgs e)
at System.Windows.Forms.TextBoxBase.OnTextChanged(EventArgs e)
at System.Windows.Forms.TextBoxBase.WmReflectCommand(Message& m)
at System.Windows.Forms.TextBoxBase.WndProc(Message& m)
at System.Windows.Forms.RichTextBox.WmReflectCommand(Message& m)
at System.Windows.Forms.RichTextBox.WndProc(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
Redden answered 30/5, 2019 at 17:15 Comment(10)
What crashes? Is it reading from the first text box, or writing to the second? What is the full stack trace? – Rosinweed
@canton7: Writing to the second. I've edited the question to clarify and added the exception text, if that's what you mean by "full stack trace". – Redden
Well, it is beta. RTF is wonky, one of the last remaining text formats that is codepage-based at its core and can't be done in a Unicode encoding. You might get ahead with a RichTextBox version that was done this century and is used in the Wordpad applet. https://mcmap.net/q/1018547/-windows-forms-richtextbox-loses-table-background-colours – Wheezy
I've seen many bugs relating to the beta UTF-8 locale setting, so this isn't surprising. – Casanova
Have you tried Hans' suggestion? The only way I could reproduce your issue was to target a .NET version < 4.7. .NET 4.7 and above default to the RichEdit50 control versus the RichEdit20 used in prior versions. – Barnes
@TnTinMn: Right, tried Hans' idea, and it works. I worry that there may be side effects from using a different version of the RichTextBox, but I'll give it a go. – Redden
@HansPassant: Okay, got round to trying your idea and it works! I worry about side effects from a different RichTextBox version (as I created my own RTF code manually), but I'll give it a shot. If you make it an answer, I'll award you the 100 bounty! By the way, how do I message two people at once? Using @person1 @person2 didn't seem to register. – Redden
"I worry that there may be side effects from using a different version of the RichTextBox" - it is unlikely that there will be any noticeable differences. As I wrote earlier, if you target .NET 4.7 or higher you will be using the newer version anyway, with no need to create a custom RTB. – Barnes
@DanW why did you use that setting in the first place? Windows and .NET use UTF-16 for strings. .NET's text readers and writers use UTF-8 by default. That setting applies only to non-Unicode programs. You don't need to use UTF-8 as the system codepage to work with Unicode. – Tammany
@DanW the fact that WinForms worked in every country in the world for the last 17 years without problems should prove that the beta UTF-8 codepage isn't needed. Problems were caused by font support, or by reading files using the wrong ASCII/single-byte codepage, not by Unicode issues. – Tammany
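
The RichEdit50 approach discussed in these comments can be sketched roughly as below. This is a hedged illustration rather than the exact code from the linked answer: the class name RichTextBox5 is made up, while msftedit.dll and the window class "RichEdit50W" are the standard names for the newer rich edit control. It assumes a Windows machine where msftedit.dll is present.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Windows.Forms;

// Hypothetical name; any RichTextBox subclass would do.
public class RichTextBox5 : RichTextBox
{
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr LoadLibrary(string lpFileName);

    // Loaded once per process and kept for its lifetime.
    private static IntPtr moduleHandle;

    protected override CreateParams CreateParams
    {
        get
        {
            if (moduleHandle == IntPtr.Zero)
            {
                // msftedit.dll registers the RichEdit50W window class.
                moduleHandle = LoadLibrary("msftedit.dll");
                if (moduleHandle == IntPtr.Zero)
                {
                    throw new Win32Exception(
                        Marshal.GetLastWin32Error(),
                        "Could not load msftedit.dll");
                }
            }

            // Keep the base parameters, but swap the native window class.
            CreateParams cp = base.CreateParams;
            cp.ClassName = "RichEdit50W";
            return cp;
        }
    }
}
```

To try it, replace the RichTextBox instances on the form with RichTextBox5. Hand-written RTF should be re-tested, since the newer control may render some markup differently.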

Microsoft open sourced the WinForms libraries, so you can dig into the source code yourself:

https://github.com/dotnet/winforms/tree/master/src/System.Windows.Forms/src/System/Windows/Forms

The StreamIn method is on line 3140 of https://github.com/dotnet/winforms/blob/master/src/System.Windows.Forms/src/System/Windows/Forms/RichTextBox.cs:

    private void StreamIn(string str, int flags)
    {
        if (str.Length == 0)
        {
            // Destroy the selection if callers was setting
            // selection text
            //
            if ((RichTextBoxConstants.SFF_SELECTION & flags) != 0)
            {
                SendMessage(Interop.WindowMessages.WM_CLEAR, 0, 0);
                ProtectedError = false;
                return;
            }
            // WM_SETTEXT is allowed even if we have protected text
            //
            SendMessage(Interop.WindowMessages.WM_SETTEXT, 0, "");
            return;
        }

        // Rather than work only some of the time with null characters,
        // we're going to be consistent and never work with them.
        int nullTerminatedLength = str.IndexOf((char)0);
        if (nullTerminatedLength != -1)
        {
            str = str.Substring(0, nullTerminatedLength);
        }

        // get the string into a byte array
        byte[] encodedBytes;
        if ((flags & RichTextBoxConstants.SF_UNICODE) != 0)
        {
            encodedBytes = Encoding.Unicode.GetBytes(str);
        }
        else
        {
            encodedBytes = Encoding.Default.GetBytes(str);
        }
        editStream = new MemoryStream(encodedBytes.Length);
        editStream.Write(encodedBytes, 0, encodedBytes.Length);
        editStream.Position = 0;
        StreamIn(editStream, flags);
    }

    private void StreamIn(Stream data, int flags)
    {
        // clear out the selection only if we are replacing all the text
        //
        if ((flags & RichTextBoxConstants.SFF_SELECTION) == 0)
        {
            NativeMethods.CHARRANGE cr = new NativeMethods.CHARRANGE();
            UnsafeNativeMethods.SendMessage(new HandleRef(this, Handle), Interop.EditMessages.EM_EXSETSEL, 0, cr);
        }

        try
        {
            editStream = data;
            Debug.Assert(data != null, "StreamIn passed a null stream");

            // If SF_RTF is requested then check for the RTF tag at the start
            // of the file.  We don't load if the tag is not there
            // 
            if ((flags & RichTextBoxConstants.SF_RTF) != 0)
            {
                long streamStart = editStream.Position;
                byte[] bytes = new byte[SZ_RTF_TAG.Length];
                editStream.Read(bytes, (int)streamStart, SZ_RTF_TAG.Length);
                string str = Encoding.Default.GetString(bytes);
                if (!SZ_RTF_TAG.Equals(str))
                {
                    throw new ArgumentException(SR.InvalidFileFormat);
                }

                // put us back at the start of the file
                editStream.Position = streamStart;
            }

            int cookieVal = 0;
            // set up structure to do stream operation
            NativeMethods.EDITSTREAM es = new NativeMethods.EDITSTREAM();
            if ((flags & RichTextBoxConstants.SF_UNICODE) != 0)
            {
                cookieVal = INPUT | UNICODE;
            }
            else
            {
                cookieVal = INPUT | ANSI;
            }
            if ((flags & RichTextBoxConstants.SF_RTF) != 0)
            {
                cookieVal |= RTF;
            }
            else
            {
                cookieVal |= TEXTLF;
            }
            es.dwCookie = (IntPtr)cookieVal;
            es.pfnCallback = new NativeMethods.EditStreamCallback(EditStreamProc);

            // gives us TextBox compatible behavior, programatic text change shouldn't
            // be limited...
            //
            SendMessage(Interop.EditMessages.EM_EXLIMITTEXT, 0, int.MaxValue);



            // go get the text for the control
            // Needed for 64-bit
            if (IntPtr.Size == 8)
            {
                NativeMethods.EDITSTREAM64 es64 = ConvertToEDITSTREAM64(es);
                UnsafeNativeMethods.SendMessage(new HandleRef(this, Handle), Interop.EditMessages.EM_STREAMIN, flags, es64);

                //Assign back dwError value
                es.dwError = GetErrorValue64(es64);
            }
            else
            {
                UnsafeNativeMethods.SendMessage(new HandleRef(this, Handle), Interop.EditMessages.EM_STREAMIN, flags, es);
            }

            UpdateMaxLength();

            // If we failed to load because of protected
            // text then return protect event was fired so no
            // exception is required for the the error
            if (GetProtectedError())
            {
                return;
            }

            if (es.dwError != 0)
            {
                throw new InvalidOperationException(SR.LoadTextError);
            }

            // set the modify tag on the control
            SendMessage(Interop.EditMessages.EM_SETMODIFY, -1, 0);

            // EM_GETLINECOUNT will cause the RichTextBoxConstants to recalculate its line indexes
            SendMessage(Interop.EditMessages.EM_GETLINECOUNT, 0, 0);


        }
        finally
        {
            // release any storage space held.
            editStream = null;
        }
    }
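
The non-Unicode branch above encodes and decodes with Encoding.Default, which on .NET Framework follows the system ANSI code page. A minimal sketch for checking whether that code page is UTF-8 (65001), which is what the beta setting is expected to switch it to, is below. The class and method names are made up for illustration. Note the assumption: on .NET Core and later, Encoding.Default is always UTF-8, so this check is only meaningful on .NET Framework.

```csharp
using System;
using System.Text;

// Hypothetical helper; not part of WinForms.
public static class AnsiCodePageCheck
{
    // 65001 is the Windows code page identifier for UTF-8.
    private const int Utf8CodePage = 65001;

    public static bool IsUtf8(int codePage)
    {
        return codePage == Utf8CodePage;
    }

    // Assumption: running on .NET Framework, where Encoding.Default
    // reflects the system ANSI code page.
    public static bool IsDefaultEncodingUtf8()
    {
        return IsUtf8(Encoding.Default.CodePage);
    }
}
```

Calling AnsiCodePageCheck.IsDefaultEncodingUtf8() at startup would let an application warn the user or switch to a fallback path before the Rtf setter is ever used.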

It does seem like a bug, and since the feature is in beta, the best course of action would be to log it with Microsoft at https://developercommunity.visualstudio.com

If you replace your RichTextBox control class with the code from the library you will be able to see which line the error occurs at in:

System.Windows.Forms.RichTextBox.StreamIn(Stream data, Int32 flags)

Update:

This is actually a known issue, https://social.msdn.microsoft.com/Forums/en-US/28940162-5f7b-4687-af19-1eeef90d3963/richtextboxrtf-setter-throwing-systemargumentexception-file-format-is-not-valid-in-windows?forum=winforms

It's already been reported to Microsoft: https://developercommunity.visualstudio.com/content/problem/544623/issue-caused-by-unicode-utf-8-for-world-wide-langu.html

Kyle Wang from MSFT has already narrowed it down to an Operating System issue:

PC1 (OS Build .437) can reproduce the issue:

[Screenshots of the environment and the test result]

PC2 (OS Build .348) cannot reproduce the issue:

[Screenshots of the environment and the test result]

Momentum answered 3/6, 2019 at 0:26 Comment(2)
Hmmm, I assume you're referring to extending the RichTextBox class and overriding the StreamIn methods. Problem is I can't get RichTextBoxConstants, NativeMethods, GetErrorValue64, HandleRef, etc. to resolve, and there are many more compile errors as well. – Redden
Check my update; it looks like it's already on Microsoft's radar and they have narrowed it down. It looks like a regression was re-introduced. – Momentum

According to the reference source on MSDN, when you set the Rtf property, the setter checks that the data starts with "{\rtf". With this feature enabled, the RTF starts with "{\urtf" instead, so the check fails and the framework explicitly throws the exception.

MSDN reference:

    string str = Encoding.Default.GetString(bytes);
    if (!SZ_RTF_TAG.Equals(str)) // SZ_RTF_TAG = "{\\rtf"
        throw new ArgumentException(SR.GetString(SR.InvalidFileFormat));

To avoid this, you would need to upgrade to .NET Framework 4.7 or disable the beta feature. The issue occurs on Windows builds 1803 and 1809. A similar thread is below:

RichTextBox.RTF setter throwing System.ArgumentException. File format is not valid in Windows version 1803
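
To confirm which header a given machine actually produces, the string read from the Rtf property can be inspected before it is assigned anywhere. This is a diagnostic sketch only: the helper name is made up, and it does not attempt to rewrite the header, since whether a rewritten string would load correctly is untested here.

```csharp
using System;

// Hypothetical diagnostic helper; not part of WinForms.
public static class RtfHeader
{
    // Returns "rtf", "urtf", or "unknown" depending on how the string starts.
    public static string Describe(string rtf)
    {
        if (rtf == null)
        {
            return "unknown";
        }
        if (rtf.StartsWith("{\\rtf", StringComparison.Ordinal))
        {
            return "rtf";   // the header the setter's check expects
        }
        if (rtf.StartsWith("{\\urtf", StringComparison.Ordinal))
        {
            return "urtf";  // the header reported with the beta feature on
        }
        return "unknown";
    }
}
```

Logging RtfHeader.Describe(richTextBox1.Rtf) just before the failing assignment should show whether the "{\urtf" explanation applies on a particular machine.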

Pacificism answered 24/6, 2019 at 3:22 Comment(6)
Hi there, thank you for your existing research and troubleshooting on this issue. My understanding is that even with .NET 4.7, if the OS version is 1803 or 1809 it will cause the ArgumentException? Are you saying .NET 4.7 will work on 1803 and 1809? Btw, I have bumped the Microsoft Connect case for you; it was a bit dodgy of them to close it on you. – Momentum
Interesting. For now, I'll be using Hans' RichEdit50 solution as I'd like to stay with .NET 3.5 for now, and .NET 4.7 will presumably use RichEdit50 anyway. – Redden
@DanW "I'd like to stay with .NET 3.5 for now" - that runtime went out of support some years ago. The earliest supported version is .NET 4.5.2. If you want to use an unsupported runtime, don't enable beta features. Problems found in that runtime won't be fixed. – Tammany
Yup, although I'm in the top 15 Bounty Hunters, you deserved it! – Momentum
@PanagiotisKanavos: I don't enable or care about this beta UTF-8 feature, but some of my users do, and they're confused when it crashes. That's the problem. .NET 3.5 is available for a wider range of PCs, hence my reasoning to stick with that one. – Redden
@JeremyThompson: Your answer was excellent also. The bounty was auto-awarded, so not sure how that was done. – Redden