Replace "Shift-Enter" line break with "Enter" in a Word document using the Microsoft Office API

I have a number of Word documents that need to be converted to HTML. Each paragraph in a Word document must become a <p> element.

After some tests with the Microsoft Office API's SaveAs method, I found that lines separated by manual line breaks (entered with "Shift-Enter") are not placed in separate <p> elements; instead, they are grouped in the same <p> element.

To separate them, I have been trying to replace the "Shift-Enter" line breaks with "Enter" (paragraph marks / carriage returns) before doing the conversion. However, I couldn't find a suitable way to do the replacement. I tried the WdLineEndingType parameter of the SaveAs method, but it has no effect on this issue.

Succedaneum answered 5/2, 2013 at 15:18
S
5

The Word object model exposes a Find object on the Range object, which can search for and replace strings.

The following code replaces every manual line break ("^l") with a paragraph mark ("^p").

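// Requires: using Microsoft.Office.Interop.Word;
// oDoc is the open Document; oMissing is an object holding System.Reflection.Missing.Value.
Range r = oDoc.Content;
r.WholeStory();
// Arguments 2-9 (MatchCase through Format) keep their defaults; argument 10 is ReplaceWith, argument 11 is Replace.
r.Find.Execute("^l", ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, "^p", WdReplace.wdReplaceAll);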

Then use SaveAs to convert the Word document to HTML; it will place each line in its own <p> element.
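
For anyone who wants the whole flow in one place, here is a minimal sketch of open, replace, and save as HTML. It assumes C# 4 or later (for named and optional COM arguments), a reference to Microsoft.Office.Interop.Word, and Word installed on the machine. The class name LineBreakConverter, the method ConvertToHtml, and its two path parameters are hypothetical names made up for illustration, and wdFormatFilteredHTML is my choice of HTML format (the question does not say which one is used; wdFormatHTML should work the same way).

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper class, not part of the Office API.
static class LineBreakConverter
{
    // docPath and htmlPath are assumed to be absolute paths.
    public static void ConvertToHtml(string docPath, string htmlPath)
    {
        var app = new Word.Application();
        app.Visible = false;
        Word.Document doc = null;
        try
        {
            doc = app.Documents.Open(docPath);

            // Replace every manual line break (^l) with a paragraph mark (^p).
            Word.Range range = doc.Content;
            range.Find.ClearFormatting();
            range.Find.Replacement.ClearFormatting();
            range.Find.Execute(
                FindText: "^l",
                ReplaceWith: "^p",
                Replace: Word.WdReplace.wdReplaceAll);

            // Save under a new name as HTML (format choice is an assumption).
            doc.SaveAs(
                FileName: htmlPath,
                FileFormat: Word.WdSaveFormat.wdFormatFilteredHTML);
        }
        finally
        {
            // The casts avoid the compiler's method/event ambiguity warning
            // for Close and Quit.
            if (doc != null)
            {
                ((Word._Document)doc).Close(SaveChanges: false);
            }
            ((Word._Application)app).Quit();
        }
    }
}
```

Because the replacement happens in memory and the result is saved under a different file name, the original Word file on disk should stay as it was. A call would look like LineBreakConverter.ConvertToHtml(@"C:\docs\in.docx", @"C:\docs\out.html"), where both paths are placeholders.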

Succedaneum answered 6/2, 2013 at 1:49
K
20

For those looking in MS Word: use Control-H (Find & replace).

Find Special character: manual Line break (^l, lowercase L)
Replace with: Paragraph mark (^p)
Replace All will do the whole document.

Edit: changed to lowercase characters.

Kiki answered 1/2, 2015 at 22:56
O
0

Paragraph mark (¶)

^p (doesn't work in the Find what box when the Use wildcards option is turned on), or ^13

Overwrought answered 22/6, 2018 at 4:57
S
-1

I searched for this, and then someone in my office pointed out that you can just copy the text, open a blank document, and paste it as text. This removes the formatting, so a Shift-Enter becomes an Enter without needing any find and replace.

Shannanshannen answered 14/8, 2024 at 3:19
