Remove HTML tags from a String in Dart
7

59

I’ve been trying to achieve this for a while. I have a string that contains a lot of HTML tags in encoded form, like &lt; and &gt;, scattered throughout the text. Can anyone help me remove those tags so that I get a plain string?

Refit answered 30/7, 2018 at 12:16 Comment(6)
@feeela We are not in JavaScript here though. - Rebbeccarebe
@feeela This won’t work in Dart, I guess. - Refit
pub.dartlang.org/packages/html or pub.dartlang.org/packages/flutter_webview_plugin might help for this task. - Rufe
@GünterZöchbauer Finally I achieved this using the HTML package. - Refit
That was quick. How did you do it? I wasn't sure if the html package supports modifying (never used it). Perhaps you can answer your question with some example code? - Rufe
Sure! Just give me a second, it’s hard to write code on a phone. - Refit
119

I finally achieved this using the html package.

Here’s how I did it:

import 'package:html/parser.dart';

// The first parse decodes entities such as &lt; and &gt; into real tags;
// the second parse strips those tags and returns the plain text.
String _parseHtmlString(String htmlString) {
  final document = parse(htmlString);
  final String decoded = document.body?.text ?? '';
  final String parsedString = parse(decoded).documentElement?.text ?? '';

  return parsedString;
}

I don’t know if there is a cleaner way to do this, but this one worked for me.

Refit answered 30/7, 2018 at 12:45 Comment(2)
It is not built in, perhaps some of your packages already included it? pub.dartlang.org/packages/html#-readme-tab- - Helle
Use this package to import html/parser: pub.dev/packages/html - Biegel
71

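You can simply use a RegExp, without a third-party library, to remove the tags (anything between < and >):

String removeAllHtmlTags(String htmlText) {
  // Matches '<', any run of characters other than '>', then '>'.
  final RegExp exp = RegExp(
    r"<[^>]*>",
    multiLine: true,
    caseSensitive: true,
  );

  return htmlText.replaceAll(exp, '');
}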
Hover answered 18/8, 2019 at 5:18 Comment(1)
Regex is never the way to work with HTML: regex101.com/r/HukWkb/1. When used on the string '<a title="1 < 3, but 3 > 2">Don't use regex to parse HTML</a>', '2">Don't use regex to parse HTML' is left after replacing, instead of 'Don't use regex to parse HTML'. - Hyperopia
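The failure described in this comment can be reproduced in plain Dart. The sketch below keeps the answer's pattern and, for comparison, adds a quote-aware variant. `removeTagsQuoteAware` is a hypothetical helper, not something from the thread, and it is still not a real HTML parser.

```dart
/// The pattern from the answer above: removes every `<...>` run.
String removeAllHtmlTags(String htmlText) =>
    htmlText.replaceAll(RegExp(r'<[^>]*>'), '');

/// Hypothetical variant (an assumption for illustration): treats quoted
/// attribute values as units, so a `<` or `>` inside quotes does not end the tag.
String removeTagsQuoteAware(String htmlText) => htmlText.replaceAll(
    RegExp(r'''<(?:"[^"]*"|'[^']*'|[^'">])*>'''), '');
```

With the input from the comment, `removeAllHtmlTags` leaves the tail of the attribute value behind, while `removeTagsQuoteAware` returns only the link text. Malformed markup (for example an unclosed quote) can still defeat either pattern, which is the usual argument for a parser.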
40

The intl package provides a method, stripHtmlIfNeeded, that strips HTML tags from a string.

It is a static utility method on the package's Bidi class, which provides helpers for working with bidirectional text.

import 'package:intl/intl.dart';

Bidi.stripHtmlIfNeeded("<p>Hello World</p>");

If you don't want to pull in the whole package just for this function, here is the method's implementation:

static String stripHtmlIfNeeded(String text) {
  return text.replaceAll(RegExp(r'<[^>]*>|&[^;]+;'), ' ');
}

Documentation: https://api.flutter.dev/flutter/intl/Bidi/stripHtmlIfNeeded.html
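
Note that the pattern quoted above replaces each tag and each entity with a space instead of removing it, so the result can carry extra whitespace, and entities such as &amp; are dropped rather than decoded. A minimal sketch of tidying that up follows; `stripAndTidy` is a hypothetical helper, not part of intl.

```dart
/// Same pattern as the implementation quoted above, as a top-level function.
String stripHtmlIfNeeded(String text) =>
    text.replaceAll(RegExp(r'<[^>]*>|&[^;]+;'), ' ');

/// Hypothetical helper (an assumption, not part of intl): collapses the
/// whitespace runs left behind by the replacement and trims both ends.
String stripAndTidy(String text) =>
    stripHtmlIfNeeded(text).replaceAll(RegExp(r'\s+'), ' ').trim();
```

For "<p>Hello World</p>" the first function returns the text with a leading and a trailing space; the second returns "Hello World".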

Longicorn answered 24/4, 2021 at 9:56 Comment(0)
2

Using only

import 'package:html/parser.dart';

causes a problem for strings that include <br> and <p> tags: the paragraph information is lost. One option is to first replace each <br> with a paragraph boundary, then collect the paragraphs into a List:

import 'package:html/dom.dart' as dom;
import 'package:html/parser.dart' show parse;

void main() {
  const htmlString = '<p> first ... line.<br>second.....line.<p>';

  // Turn each <br> into a paragraph boundary, then keep the text of
  // every non-empty <p> element.
  final List<String> cleanStrings = [];
  final List<dom.Element> ps =
      parse(htmlString.replaceAll('<br>', '</p><p>')).querySelectorAll('p');
  for (final p in ps) {
    if (p.text != '') cleanStrings.add(p.text);
  }
}
Ramp answered 21/5, 2019 at 15:22 Comment(0)
2

Here is my solution if you are using Flutter web or can't import the parser for any reason. It is configurable.

String formatHtmlString(String string) {
  return string
      .replaceAll("\n\n", "<p>") // Paragraphs
      .replaceAll("\n", "<br>") // Line Breaks
      .replaceAll("\"", "&quot;") // Quote Marks
      .replaceAll("'", "&apos;") // Apostrophe
      .replaceAll(">", "&gt;") // Greater-than Comparator (Strip Tags)
      .replaceAll("<", "&lt;") // Less-than Comparator (Strip Tags)
      .trim(); // Whitespace
}
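
If escaping markup (rather than removing it) is the goal, as in the function above, Dart's core dart:convert library includes an HtmlEscape converter. A minimal sketch, assuming element-mode escaping is sufficient; `escapeMarkup` is a hypothetical name, not from the thread.

```dart
import 'dart:convert';

/// Hypothetical helper (the name is an assumption for illustration).
/// Escapes &, < and > so markup is shown as literal text instead of
/// being interpreted as tags.
String escapeMarkup(String text) =>
    const HtmlEscape(HtmlEscapeMode.element).convert(text);
```

Element mode leaves quote characters alone; the other HtmlEscapeMode values cover attribute contexts.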
Levan answered 18/10, 2021 at 18:25 Comment(0)
1

If you want to decode HTML content to a plain string, follow these steps:

  1. Add this package to pubspec.yaml => HTML Parser - Dart Library
  2. Then add this line to your code =>

    String htmlText = parse("String with HTML tags").body!.text;

Wojcik answered 1/3, 2022 at 13:31 Comment(0)
-2

Three steps.

First, add this to your pubspec.yaml file:

dependencies:
  flutter_html: ^0.8.2

Second, import it in your Dart file:

import 'package:flutter_html_view/flutter_html_view.dart';

Third, simply use:

HtmlView(data: "Your Html Data"),

Whidah answered 30/7, 2018 at 12:17 Comment(1)
I believe the question was about turning HTML into a string with the tags removed, not about displaying formatted HTML? - Electricity
