Word seems to use a different apostrophe character than Visual Studio and it is causing problems with using Regex.
I am trying to edit some Word documents in C# using OpenXML. I am basically replacing [[COMPANY]] with a company name. This has worked pretty smoothly until I have reached my corner case of companies with names that end in s. I end up with issue s where sometimes it creates a s's.
Example: Company Name: Simmons Text in Doc: The [[COMPANY]]'s business is cars. Result: The Simmons's business is cars.
This is improper English.
I should be able to just use a basic find and replace like I did for [[COMPANY]], but it is not working.
Regex apostropheReplace = new Regex("s\\'s");
docText = apostropheReplace.Replace(docText, "s\'");
This does not. It seems that Word is using an different character for and apostrophe(') than the standard one that is created when I use the key on my keyboard in Visual Studio. If I write a find and replace using my keyboard it will not work, but if I copy and paste the apostrophe from Word it does.
Regex apostrophyReplace = new Regex("s\\’s");
docText = apostrophyReplace.Replace(docText, "s\'");
Notice the different character in the Regex for the second one. I'm confused as to why this is, and also want to know if the is a proper way of doing this. I tried "'" but that does not work. I just want to know if using the copied character from Word is the proper way of doing this, and is there a way to do it so that both characters work so I don't have an issue with docs that may be created with a different program.