How to remove leading and trailing rendered white space from a given HTML string? [closed]

7

201

I have the following string containing HTML. What would be sample JavaScript code to remove the leading and trailing white space that would appear when this string is rendered? In other words: how can I obtain a string that shows no leading or trailing white space in the HTML rendering but is otherwise identical?

<p>&nbsp;&nbsp;</p>
<div>&nbsp;</div>
Trimming using JavaScript<br />
<br />
<br />
<br />
all leading and trailing white spaces
<p>&nbsp;&nbsp;</p>
<div>&nbsp;</div>
Flower answered 5/4, 2012 at 16:4 Comment(14)
What is your real problem? Do you want to remove whitespace before inserting nodes in the document?Hilar
I want to remove all leading white spaces as well as all trailing white spaces. Very much like the Trim method in C#, except that it removes even white spaces.Flower
So in my example, I finally should get the following after trimming: Trimming using JavaScript<br /><br /><br /><br />all leading and trailing white spacesFlower
I don't understand. As far as I can tell, this example input does not have leading or trailing whitespace. Therefore the code would not have to do anything.Noe
This question really needs to be updated to be a lot more explicit about what "removing leading and trailing whitespaces" means in this context. Providing the desired output would be a good start. (Some of this information is possibly in comments on the question and answers, but people shouldn't have to dig through those to understand the question.)Andros
This question is being discussed on meta: meta.stackoverflow.com/q/429472Dragline
This may be a duplicate of the earlier (and slightly better written) Trimming whitespace from HTML content?Andros
I think that although the question isn't obvious about it, there is only one reasonable explanation of what could be meant by it, i.e. remove the leading and trailing white spaces that would appear in a rendered version of it. I would edit the question to make this explicit, but somehow most of the answers don't seem to get it, so I wonder what in general went wrong here. The question can be salvaged but the answers likely not. Maybe better start again.Bonus
What could possibly happen for so many people to misinterpret this question? Surely, the question is not very well written, but it's not terrible either and the author's intent is clear from the question alone, at least to me. This is a very good question and should have a clear answer.Jehovah
The question has been opened again. In this case it's better to bring it to shape. If the creator Sunil wants they can check for themselves.Bonus
Why should the <br /> tags be preserved in the expected output, but not e.g. <div></div>?Noe
You need to maybe stop and recalibrate here because you did two major no no's in the last two hours. Don't put "edit" or "update" sections in the post, don't put an answer in the question. Answers exclusively go into an answer below; you are allowed to self-answer.Possession
@KarlKnechtel, the <br /> tag is not removed when trimming because it's neither leading nor trailing whitespace. But if this tag occurred after the last displayable character(s), it would be trailing whitespace and would need to be trimmed. Similarly, if this tag occurred before the first displayable character(s), it would be leading whitespace and would need to be trimmed.Flower
Even after the edits, this is still unclear. For one thing, the question doesn't include desired output for the example document. For another, there are LOTS of potential ambiguities! Should empty tags like <p></p> get removed (if leading or trailing)? What about tags not meant to hold text, like an <img>? Should we only delete entire elements or should we e.g. trim <p>&nbsp;foo</p> to <p>foo</p>? What about literal newline and space characters between tags that with default CSS won't render but could with white-space: pre? What about <pre> elements? Probably there's more...Absent
C
278

See the String method trim() - https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/String/Trim

var myString = '  bunch    of <br> string data with<p>trailing</p> and leading space   ';
myString = myString.trim();
// Note: the static form String.trim(myString) was a non-standard, Firefox-only generic; prefer the instance method above.

Edit

As noted in other comments, it is possible to use the regex approach. The trim method is effectively just an alias for a regex:

if(!String.prototype.trim) {  
  String.prototype.trim = function () {  
    return this.replace(/^\s+|\s+$/g,'');  
  };  
} 

... this will inject the method into the native prototype for those browsers who are still swimming in the shallow end of the pool.

Coussoule answered 5/4, 2012 at 16:6 Comment(5)
I would prefer a regex way, because trim() isn't supported in all browsers (cough cough IE < 9).Chairman
The trim method will not trim white spaces, but only spaces. So it's not what I am looking for.Flower
I'm not sure what you think "white spaces" are, but trim will remove whitespace in general (newline, space, tab etc), not just the space character.Nowise
@Nowise I came to understand he means "white space" in the visual sense, as in "visually blank areas in the results of rendering the HTML", and then he added in a comment above "except newlines caused by br tags".Coussoule
I fundamentally misunderstood his question due to semantics. The only answer that even attempts to address the removal of visual white space excluding br's is Anthony's. If the OP is only dealing with &nbsp;'s making the elements empty, there are a few approaches one could use, and Anthony's is one. I don't support the use of regex on HTML, so I wouldn't use Anthony's approach (I favor DOM manipulation), but he's the only one who correctly understood the OP's problem. I left my answer up since it seems to be helpful, but it isn't the answer to the OP. I'm fine with how it works.Coussoule
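To illustrate the point made in the comments above, here is a small sketch of what trim() does and does not remove. The variable names are made up for illustration. It relies only on the standard behavior of String.prototype.trim, which strips all whitespace characters and line terminators (including a literal no-break space, U+00A0) but treats the text "&nbsp;" as six ordinary characters.

```javascript
// trim() strips tabs, newlines, spaces and a literal no-break space (U+00A0).
var padded = '\t\n \u00A0hello\u00A0 \r\n';
var trimmed = padded.trim();

// The entity "&nbsp;" is plain text until an HTML parser decodes it,
// so trim() leaves it alone.
var withEntities = '&nbsp;hello&nbsp;';
var stillPadded = withEntities.trim();

console.log(JSON.stringify(trimmed));      // "hello"
console.log(stillPadded === withEntities); // true
```

This is why trim() alone cannot solve the question as asked: the blank areas in the example come from entities and empty elements, not from whitespace characters at the ends of the string.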
H
65
var str = "  my awesome string   ";
str = str.trim(); // trim() returns a new string, so assign the result

For old browsers, use a regex:

str = str.replace(/^[ ]+|[ ]+$/g, '');
// str === "my awesome string"
Hinny answered 2/4, 2013 at 23:25 Comment(2)
"[ ]" is exactly the same as " ". A character grouping of exactly one character is.. well... exactly one character.Ochoa
@Flimzy: Yes, it would make more sense to write [\t\n\ ] or just \s instead of [ ] which makes no sense.Mauritius
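As a rough illustration of the difference the two comments above describe, the following sketch compares the space-only character class with \s on an input that is padded with a tab and a newline. The input string and variable names are invented for this example.

```javascript
var input = '\t  my awesome string \n';

// [ ] matches only the space character, so the leading tab and the
// trailing newline block both alternatives and nothing is removed.
var spacesOnly = input.replace(/^[ ]+|[ ]+$/g, '');

// \s matches tabs, newlines and spaces alike.
var anyWhitespace = input.replace(/^\s+|\s+$/g, '');

console.log(JSON.stringify(spacesOnly));    // "\t  my awesome string \n"
console.log(JSON.stringify(anyWhitespace)); // "my awesome string"
```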
P
18

I know this is a very old question but it still doesn't have an accepted answer. I see that you want the following removed from an HTML string: HTML tags that are "empty", and white space.

I have come up with a solution based on your comment for the output you are looking for:

Trimming using JavaScript<br /><br /><br /><br />all leading and trailing white spaces 

var str = "<p>&nbsp;&nbsp;</p><div>&nbsp;</div>Trimming using JavaScript<br /><br /><br /><br />all leading and trailing white spaces<p>&nbsp;&nbsp;</p><div>&nbsp;</div>";
console.log(str.trim().replace(/&nbsp;/g, '').replace(/<[^\/>][^>]*><\/[^>]+>/g, ""));

.trim() removes leading and trailing whitespace characters

.replace(/&nbsp;/g, '') removes every &nbsp; entity, wherever it occurs in the string

.replace(/<[^\/>][^>]*><\/[^>]+>/g, "") removes empty tags

Peon answered 22/2, 2018 at 20:4 Comment(1)
It works on "that" example, but it's not a great piece of code because it erases &nbsp; entities which may not be the leading/trailing ones.Dormie
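One possible way to address the concern in the comment above is to anchor the pattern to the ends of the string, so that &nbsp; entities and <br> tags in the middle are left alone. The following is only a sketch under several assumptions: trimRenderedWhitespace is a hypothetical helper name, "blank" is taken to mean whitespace characters, &nbsp;, attribute-less <br> tags, and elements that contain nothing but whitespace or &nbsp;, and nested empty elements, other entities such as &#160;, and CSS effects are not handled. For anything beyond simple input, parsing the HTML and walking the DOM is likely the more robust route.

```javascript
// Hypothetical helper, not part of the original answer.
function trimRenderedWhitespace(html) {
  // One or more "blank" units: a whitespace character, an &nbsp; entity,
  // a <br> tag, or an element holding only whitespace / &nbsp;.
  var blank = '(?:\\s|&nbsp;|<br\\s*/?>|<([a-z][a-z0-9]*)\\b[^>]*>(?:\\s|&nbsp;)*</\\1\\s*>)+';
  var leading = new RegExp('^' + blank, 'i');
  var trailing = new RegExp(blank + '$', 'i');
  // Only the runs touching the start or end of the string are removed.
  return html.replace(leading, '').replace(trailing, '');
}

var sample = "<p>&nbsp;&nbsp;</p><div>&nbsp;</div>Trimming using JavaScript<br /><br /><br /><br />all leading and trailing white spaces<p>&nbsp;&nbsp;</p><div>&nbsp;</div>";
console.log(trimRenderedWhitespace(sample));
// Trimming using JavaScript<br /><br /><br /><br />all leading and trailing white spaces

console.log(trimRenderedWhitespace('<p>&nbsp;</p>a&nbsp;b<br>'));
// a&nbsp;b
```

Because both patterns are anchored, the <br /> tags between the two sentences and the &nbsp; between "a" and "b" survive, which matches the output the asker described in the comments.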
C
4
string.replace(/^\s+|\s+$/g, "");
Chokefull answered 5/4, 2012 at 16:6 Comment(1)
Read the question, &nbsp; is used instead of an ordinary whitespace. On top of this, the whitespace is contained within a tag.Hilar
V
3

If you're working with a multiline string, like a code file:

<html>
    <title>test</title>
    <body>
        <h1>test</h1>
    </body>
</html>

And you want to strip the leading whitespace from every line, to get this result:

<html>
<title>test</title>
<body>
<h1>test</h1>
</body>
</html>

You must add the multiline flag to your regex, so that ^ and $ match at the start and end of each line:

string.replace(/^\s+|\s+$/gm, '');

Relevant quote from docs:

The "m" flag indicates that a multiline input string should be treated as multiple lines. For example, if "m" is used, "^" and "$" change from matching at only the start or end of the entire string to the start or end of any line within the string.

Valenba answered 28/3, 2017 at 7:22 Comment(0)
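A quick sketch of the effect of the m flag described in the answer above, using the same HTML snippet. The variable names are chosen for this example only.

```javascript
var source = [
  '<html>',
  '    <title>test</title>',
  '    <body>',
  '        <h1>test</h1>',
  '    </body>',
  '</html>'
].join('\n');

// Without "m", ^ and $ match only at the very start and end of the string,
// and this string has no whitespace there, so nothing changes.
var withoutFlag = source.replace(/^\s+|\s+$/g, '');

// With "m", ^ also matches after each newline, so the indentation
// of every line is removed while the newlines themselves stay.
var withFlag = source.replace(/^\s+|\s+$/gm, '');

console.log(withoutFlag === source); // true
console.log(withFlag);
```

Note that because \s also matches newlines, this pattern removes blank lines and a trailing newline as well, which may or may not be what you want.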
P
-1

01). If you need to remove only leading and trailing white space use this:

var address = "  No.255 Colombo  "
address.replace(/^[ ]+|[ ]+$/g,'');

This returns the string "No.255 Colombo".

02). If you need to remove all the white space use this:

var address = "  No.255 Colombo  "
address.replace(/\s/g,"");

This returns the string "No.255Colombo".

Pretermit answered 16/1, 2019 at 8:11 Comment(0)
E
-5
var trim = your_string.replace(/^\s+|\s+$/g, '');
Erde answered 5/4, 2012 at 16:7 Comment(2)
Will it remove white spaces like <br /> or <p>&nbsp;&nbsp;</p> <div>&nbsp;</div> ? Trimming simple spaces is not a problem. It's the white space removal that I am after.Flower
will it remove white spaceStrathspey
