Transform an NSAttributedString to plain text
C

4

45

I have an instance of NSData containing attributed text (NSAttributedString) originating from an NSTextView. I want to convert the attributed string to a plain string (NSString) without any formatting to do some text analysis (at the moment of conversion I do not have access to the originating NSTextView nor its NSTextStorage instance).

What would be the best way to do this?

EDIT:

Out of curiosity I examined the result of:

[[[self textView] textStorage] words]

which appeared to be a handy thing for doing some text analysis. The resulting array contains instances of NSSubTextStorage (example below of the word "Eastern"):

Eastern{ NSFont = "\"LucidaGrande 11.00 pt. P [] (0x7ffcaae08330) fobj=0x10a8472d0, spc=3.48\""; NSParagraphStyle = "Alignment 0, LineSpacing 0, ParagraphSpacing 0, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n 28L,\n 56L,\n 84L,\n 112L,\n
140L,\n 168L,\n 196L,\n 224L,\n 252L,\n 280L,\n
308L,\n 336L\n), DefaultTabInterval 0, Blocks (null), Lists (null), BaseWritingDirection -1, HyphenationFactor 0, TighteningFactor 0.05, HeaderLevel 0"; }

NSSubTextStorage is probably a private class as I could not find any documentation for it. It also retains all formatting.

Catharina answered 26/1, 2012 at 12:46 Comment(0)
P
70

If I understand you correctly you have an NSData, say data, containing an encoded NSAttributedString. To reverse the process:

NSAttributedString *nas = [[NSAttributedString alloc] initWithData:data
                                                           options:@{}
                                                documentAttributes:NULL
                                                             error:NULL];

and to get the plain text without attributes you then do:

NSString *str = [nas string];
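For anyone doing this from Swift, a rough equivalent might look like the sketch below. It assumes the data holds a document format that AppKit can read (RTF here) and that the code runs on macOS; the helper name plainText(fromEncoded:) and the sample RTF are made up for illustration.

```swift
import AppKit

/// Decodes serialized rich text (for example RTF) and returns only its characters.
/// `plainText(fromEncoded:)` is a hypothetical helper name, not part of AppKit.
func plainText(fromEncoded data: Data) -> String? {
    // With empty options the initializer tries to detect the document type itself.
    guard let attributed = try? NSAttributedString(data: data,
                                                   options: [:],
                                                   documentAttributes: nil) else {
        return nil
    }
    // `string` carries the characters only; fonts and paragraph styles are left behind.
    return attributed.string
}

// Hypothetical sample input: a tiny RTF document with one bold word.
let rtf = #"{\rtf1\ansi Some \b bold\b0  text}"#
let text = plainText(fromEncoded: Data(rtf.utf8))
```

If the data was instead produced by an archiver rather than written as RTF or a similar document format, this initializer is probably not the right tool and the matching unarchiver would be needed.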
Princedom answered 26/1, 2012 at 18:13 Comment(3)
Oh my... I must go out for a bit of fresh air. Thanks for the answer! Can't believe I did not see that myself... [self kick] - Catharina
If the attributed string contains an image attachment, the resulting string property returns a ? in that attachment's place. I'm trying to find a solution for this. Could you please share any ideas you have? - Chapbook
@Chapbook - What you are probably seeing is the NSAttachmentCharacter, and if so you can simply remove any occurrences of it. It's documented as part of NSTextAttachment. - Princedom
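The suggestion in the last comment could be sketched in Swift roughly as follows. This assumes the placeholder is U+FFFC (the object replacement character, which is the value documented for NSAttachmentCharacter); the helper name plainText(from:) and the sample string are hypothetical.

```swift
import Foundation

// U+FFFC (OBJECT REPLACEMENT CHARACTER) is the documented attachment placeholder.
let attachmentCharacter = "\u{FFFC}"

/// Returns the characters of `attributed` with attachment placeholders removed.
/// `plainText(from:)` is a hypothetical helper name used only for this sketch.
func plainText(from attributed: NSAttributedString) -> String {
    attributed.string.replacingOccurrences(of: attachmentCharacter, with: "")
}

// Hypothetical sample: a string that already contains one placeholder character.
let sample = NSAttributedString(string: "Before \u{FFFC}after")
let cleaned = plainText(from: sample)
```

Depending on the text analysis being done, replacing the placeholder with a space instead of an empty string may be preferable so that neighbouring words do not run together.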
C
32

Updating for Swift 5:

attributedText.string
Changeling answered 19/4, 2016 at 21:14 Comment(0)
H
4

With Swift 5 and macOS 10.0+, NSAttributedString has a property called string. string has the following declaration:

var string: String { get }

The character contents of the receiver as an NSString object.

Apple also states about string:

Attachment characters are not removed from the value of this property. [...]


The following Playground code shows how to use NSAttributedString's string property in order to retrieve the string content of an NSAttributedString instance:

import Cocoa

let string = "Some text"
let attributes: [NSAttributedString.Key: Any] = [.underlineStyle: NSUnderlineStyle.single.rawValue]
let attributedString = NSAttributedString(string: string, attributes: attributes)

/* later */

let newString = attributedString.string
print(newString) // prints: Some text
print(type(of: newString)) // prints: String
Hypocotyl answered 3/8, 2017 at 0:46 Comment(0)
A
2

As of Swift 5.7 (or maybe earlier), the new AttributedString struct does not have a string property. The code below (where part is an AttributedString) works, even if it looks silly.

part.characters.map { String($0) }.joined(separator: "")
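A shorter spelling is probably possible, since the characters view is a collection of Character values. The sketch below is not from the answer above: the sample value and attribute are hypothetical, and it assumes a platform where AttributedString is available (macOS 12 / iOS 15 or later).

```swift
import Foundation

// Hypothetical sample value standing in for `part`.
var part = AttributedString("Some text")
part.inlinePresentationIntent = .stronglyEmphasized

// The characters view is a sequence of Character, so String can be built from it directly.
let plain = String(part.characters)

// Alternative: bridge to NSAttributedString and read its `string` property.
let bridged = NSAttributedString(part).string
```

Both forms should give the same result as the map/joined version; the bridging variant may be convenient when the rest of the code already works with NSAttributedString.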
Assuage answered 9/1, 2023 at 10:58 Comment(0)
