Using C#, is there a good way to find and replace a text string in a docx file without having Word installed on that machine?
Yes, using Open XML. Here's an article which addresses your specific question: Creating a Simple Search and Replace Utility for Word 2007 Open XML Format Documents
To work with this file format, one option is to use the Open XML Format Application Programming Interface (API) in the DocumentFormat.OpenXml.Packaging namespace. The classes, methods, and properties in this namespace are located in the DocumentFormat.OpenXml.dll file. You can install this DLL file by installing the Open XML Format SDK version 1.0. The members in this namespace allow you to easily work with the package contents for Excel 2007 workbooks, PowerPoint 2007 presentations, and Word 2007 documents.
...
    Imports System.IO
    Imports System.Xml
    Imports DocumentFormat.OpenXml.Packaging

    ' numChanged, wordmlNamespace, and the txt* controls are assumed to be
    ' members of the enclosing form; they are not shown in this excerpt.
    Private Sub Search_Replace(ByVal file As String)
        ' Open the package for editing; Using closes it when the block ends.
        Using wdDoc As WordprocessingDocument = WordprocessingDocument.Open(file, True)
            ' Manage namespaces to perform XML XPath queries.
            Dim nt As NameTable = New NameTable
            Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(nt)
            nsManager.AddNamespace("w", wordmlNamespace)

            ' Load the XML in the main document part into an XmlDocument instance.
            Dim xdoc As XmlDocument = New XmlDocument(nt)
            xdoc.Load(wdDoc.MainDocumentPart.GetStream)

            ' Get the text nodes in the document.
            Dim nodes As XmlNodeList = xdoc.SelectNodes("//w:t", nsManager)

            ' Make the swap.
            Dim oldText As String = txtOldText.Text
            Dim newText As String = txtNewText.Text
            For Each node As XmlNode In nodes
                Dim nodeText As String = node.InnerText
                If nodeText.Contains(oldText) Then
                    ' Write the replaced text back into the node; without this
                    ' assignment the document itself is never modified.
                    node.InnerText = nodeText.Replace(oldText, newText)
                    ' Increment the occurrences counter.
                    numChanged += 1
                End If
            Next

            ' Write the changes back to the document.
            xdoc.Save(wdDoc.MainDocumentPart.GetStream(FileMode.Create))
        End Using

        ' Display the number of text nodes that changed.
        txtNumChanged.Text = numChanged.ToString()
    End Sub
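Since the question asks for C#, the same idea can be sketched with the strongly typed classes that later SDK releases provide. This is a minimal sketch rather than the article's code: it assumes the DocumentFormat.OpenXml package (Open XML SDK 2.0 or later) is referenced, and the names DocxSearchReplace and ReplaceText are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper; assumes Open XML SDK 2.0 or later is referenced.
public static class DocxSearchReplace
{
    // Replaces oldText with newText in every text node (w:t) of the main
    // document part and returns the number of text nodes that changed.
    public static int ReplaceText(string path, string oldText, string newText)
    {
        if (string.IsNullOrEmpty(oldText))
            throw new ArgumentException("oldText must not be empty.", "oldText");

        int changed = 0;

        // Open the package for editing; Dispose flushes it back to disk.
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, true))
        {
            Document document = doc.MainDocumentPart.Document;

            // Materialize the list first so the tree is not enumerated lazily
            // while its nodes are being edited.
            foreach (Text text in document.Descendants<Text>().ToList())
            {
                if (text.Text.Contains(oldText))
                {
                    text.Text = text.Text.Replace(oldText, newText);
                    changed++;
                }
            }

            // Persist the modified XML into the main document part.
            document.Save();
        }

        return changed;
    }
}
```

A call such as DocxSearchReplace.ReplaceText(@"C:\temp\letter.docx", "old", "new") would edit the file in place (the path is only an example). Like the XPath version, this only looks at one text node at a time, and Word often splits a phrase across several runs, so text that appears contiguous on screen may not match. Headers, footers, and footnotes live in separate parts and are not touched here.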
PresentationML and DrawingML as opposed to Word's WordprocessingML) using only System.IO.Packaging and LINQ to XML. So I'll have to point you to a Ken Getz article: msdn.microsoft.com/en-us/library/bb738371(office.12).aspx. Look for his other articles written in 2006; they all use System.IO.Packaging. After that, he started writing articles with the SDK. You can also check out openxmldeveloper.org. – Doubledecker
You may also try Aspose.Words for .NET to find and replace text in a Word document. This component doesn't require MS Office to be installed. The API is simple and easy to use.
Disclosure: I work as developer evangelist at Aspose.
Or you might try DocxTemplater, an open-source library. It is not as sophisticated as Aspose, but it is open source: https://github.com/Amberg/DocxTemplater