How to read an .RTF file using .NET 4.0

I have seen samples that use the Word 9.0 object library, but I have Office 2010 Beta and .NET 4.0 in VS2010. Any tips on how to work with the new Word DLLs?

All I really want is RTF-to-plain-text conversion with .NET 3.5 or later.

Roselleroselyn answered 19/2, 2010 at 7:21 Comment(0)
R
12

I found a better solution with WPF, using TextRange.

// Namespaces used: System.IO, System.Windows, System.Windows.Documents
FlowDocument document = new FlowDocument();

// 'data' is a byte array holding the file contents (e.g. from File.ReadAllBytes)
// create a TextRange around the entire document
TextRange txtRange = new TextRange(document.ContentStart, document.ContentEnd);

using (MemoryStream stream = new MemoryStream(data))
{
    // parse the RTF bytes into the document through the range
    txtRange.Load(stream, DataFormats.Rtf);
}

Now the extracted text is available in txtRange.Text.

Roselleroselyn answered 19/2, 2010 at 16:34 Comment(0)
P
5

Do you really need to load the .RTF into Word? .NET has a RichTextBox control that can handle .RTF files. See here: http://msdn.microsoft.com/en-us/library/1z7hy77a.aspx (How to: Load Files into the Windows Forms RichTextBox Control)
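
As a rough sketch of that idea, the control can be created in code without ever being shown. The class and method names below (RtfText, FromFile) are made up for illustration; RichTextBox.LoadFile and RichTextBoxStreamType.RichText are the Windows Forms calls the linked how-to relies on. It assumes a reference to System.Windows.Forms.

```csharp
using System.Windows.Forms;

public static class RtfText
{
    // Hypothetical helper: loads an .rtf file into an off-screen RichTextBox
    // and returns the plain text the control extracts from it.
    public static string FromFile(string path)
    {
        using (var box = new RichTextBox())
        {
            // RichText tells the control to parse the file as RTF
            box.LoadFile(path, RichTextBoxStreamType.RichText);
            return box.Text;
        }
    }
}
```

Calling RtfText.FromFile with the path of an .rtf file should then return its text with the formatting stripped.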

Pius answered 19/2, 2010 at 16:18 Comment(0)
S
0
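using System;
using System.IO;
using Microsoft.Office.Interop.Word; // requires a reference to the Word interop assembly

public enum eFileType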
{
    Invalid = -1,
    TextDocument = 0,
    RichTextDocument,
    WordDocument
}

public interface IRead
{
    string Read(string file);
}

public static class FileManager
{
    public static eFileType GetFileType(string extension)
    {
        var type = eFileType.Invalid;
        // normalise case so ".RTF" and ".rtf" are treated the same
        switch ((extension ?? string.Empty).ToLowerInvariant())
        {
            case ".txt": type = eFileType.TextDocument;
                break;
            case ".rtf": type = eFileType.RichTextDocument;
                break;
            case ".docx": type = eFileType.WordDocument;
                break;
        }
        return type;
    }
}


public class TextDocument : IRead
{
    public string Read(string file)
    {
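        try
        {
            // StreamReader honours a byte order mark and otherwise assumes UTF-8;
            // the using block closes the file even if ReadToEnd throws
            using (var reader = new StreamReader(file))
            {
                return reader.ReadToEnd();
            }
        }
        catch
        {
            // any failure (missing file, access denied, ...) is reported as null
            return null;
        }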
    }
}

public class RichTextDocument : IRead
{
    public string Read(string file)
    {
        try
        {
            var wordApp = new Application();
            object path = file;
            object nullobj = System.Reflection.Missing.Value;
            var doc = wordApp.Documents.Open(ref path,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj);
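            // read from the document that was just opened rather than ActiveDocument
            var result = doc.Content.Text;
            var doc_close = (_Document)doc;
            doc_close.Close(ref nullobj, ref nullobj, ref nullobj);
            // quit Word so a hidden WINWORD.EXE process is not left running
            ((_Application)wordApp).Quit(ref nullobj, ref nullobj, ref nullobj);
            return result;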
        }
        catch
        {
            return null;
        }
    }
}

public class WordDocument : IRead
{
    public string Read(string file)
    {
        try
        {
            var wordApp = new Application();
            object path = file;
            object nullobj = System.Reflection.Missing.Value;
            var doc = wordApp.Documents.Open(ref path,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj,
                                                  ref nullobj);
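            // read from the document that was just opened rather than ActiveDocument
            var result = doc.Content.Text;
            var doc_close = (_Document)doc;
            doc_close.Close(ref nullobj, ref nullobj, ref nullobj);
            // quit Word so a hidden WINWORD.EXE process is not left running
            ((_Application)wordApp).Quit(ref nullobj, ref nullobj, ref nullobj);
            return result;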
        }
        catch
        {
            return null;
        }
    }
}

public class Factory
{
    public IRead Get(eFileType type)
    {
        IRead read = null;
        switch (type)
        {
            case eFileType.RichTextDocument: read = new RichTextDocument();
                break;
            case eFileType.WordDocument: read = new WordDocument();
                break;
            case eFileType.TextDocument: read = new TextDocument();
                break;
        }
        return read;
    }
}

public class ResumeReader
{
    IRead _read;
    public ResumeReader(IRead read)
    {
        if (read == null) throw new System.ArgumentNullException("read");

        _read = read;
    }
    public string Read(string file)
    {
        return _read.Read(file);
    }
}    


Subterrane answered 14/8, 2013 at 10:52 Comment(1)
Reading text is actually a fair bit more difficult than that. You don't take text encoding into account at all. - Cyruscyst
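
To illustrate the encoding point for the plain-text case, here is a minimal sketch that reads a file with an explicit fallback encoding while still honouring a byte order mark if one is present. The TextFileReader class and its ReadAllText method are hypothetical names; the StreamReader(string, Encoding, bool) constructor is part of System.IO. It does not try to guess the encoding of BOM-less files, so the caller has to know what to pass.

```csharp
using System.IO;
using System.Text;

public static class TextFileReader
{
    // Hypothetical helper: decodes the file using the BOM when there is one,
    // and the caller-supplied fallback encoding otherwise.
    public static string ReadAllText(string path, Encoding fallback)
    {
        // third argument: detect the encoding from byte order marks
        using (var reader = new StreamReader(path, fallback, true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

For example, TextFileReader.ReadAllText(file, Encoding.Unicode) decodes a BOM-less UTF-16 file correctly, which a plain new StreamReader(file) would misread as UTF-8.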
B
0

If anyone needs a solution for ASP.NET, this approach worked well for me:

Add a reference to System.Windows.Forms (or download the DLL and reference it directly).

Then extract the text by creating a temporary RichTextBox:

string text;
using (RichTextBox box = new RichTextBox())
{
    box.Rtf = File.ReadAllText(path); // 'path' is the location of the .rtf file
    text = box.Text;
}
Bubalo answered 27/5, 2019 at 10:42 Comment(0)
