Detect which word has been clicked on within a text
E

16

54

I am building a JS script which at some point is able to, on a given page, allow the user to click on any word and store this word in a variable.

I have one solution which is pretty ugly and involves class-parsing using jQuery: I first parse the entire HTML, split everything on each space " ", and re-append every word wrapped in <span class="word">word</span>. Then I add a jQuery click handler on that class and read the clicked word with this.innerHTML (or $(this).html()).

This is slow and ugly in so many ways and I was hoping that someone knows of another way to achieve this.

PS: I might consider running it as a browser extension, so if it doesn't sound possible with plain JS, and if you know a browser API that would allow it, feel free to mention it!

A possible workaround would be to get the user to highlight the word instead of clicking it, but I would really love to be able to achieve the same thing with only a click!

Elstan answered 27/9, 2011 at 1:14 Comment(2)
Is there any particular browser you're targeting? - Checked
Most of them, but I'd be glad to start with the browser offering the most convenient tools to do so. - Elstan
A
68

Here's a solution that will work without adding tons of spans to the document (works on Webkit and Mozilla and IE9+):

https://jsfiddle.net/Vap7C/15/

    $(".clickable").click(function(e){
        var s = window.getSelection();
        var range = s.getRangeAt(0);
        var node = s.anchorNode;

        // Find starting point
        while(range.toString().indexOf(' ') != 0) {
            range.setStart(node, (range.startOffset - 1));
        }
        range.setStart(node, range.startOffset + 1);

        // Find ending point
        do {
            range.setEnd(node, range.endOffset + 1);
        } while(range.toString().indexOf(' ') == -1 && range.toString().trim() != '');

        // Alert result
        var str = range.toString().trim();
        alert(str);
    });
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<p class="clickable">
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris rutrum ante nunc. Proin sit amet sem purus. Aliquam malesuada egestas metus, vel ornare purus sollicitudin at. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer porta turpis ut mi pharetra rhoncus. Ut accumsan, leo quis hendrerit luctus, purus nunc suscipit libero, sit amet lacinia turpis neque gravida sapien. Nulla facilisis neque sit amet lacus ornare consectetur non ac massa. In purus quam, imperdiet eget tempor eu, consectetur eget turpis. Curabitur mauris neque, venenatis a sollicitudin consectetur, hendrerit in arcu.
</p>

In IE8 it has problems because of getSelection. This link ( Is there a cross-browser solution for getSelection()? ) may help with those issues. I haven't tested it on Opera.

I used https://jsfiddle.net/Vap7C/1/ from a similar question as a starting point. It used the Selection.modify function:

s.modify('extend','forward','word');
s.modify('extend','backward','word');

Unfortunately, they don't always get the whole word. As a workaround, I got the Range for the selection and added two loops to find the word boundaries. The first loop moves the start of the range backward one character at a time until it reaches a space. The second loop moves the end of the range forward until it reaches a space.

This will also grab any punctuation at the end of the word, so make sure you trim that out if you need to.
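As a rough illustration of the trimming step mentioned above, here is a minimal sketch. The helper name stripPunctuation is hypothetical (it is not part of the answer), and it assumes that "punctuation" means any leading or trailing character that is neither a letter nor a digit:

```javascript
// Hypothetical helper, not from the original answer: strip leading and
// trailing characters that are neither letters nor digits. Characters inside
// the word (the apostrophe in "don't", the hyphen in "well-known") are kept.
function stripPunctuation(word) {
  // \p{L} = any Unicode letter, \p{N} = any Unicode digit (needs the "u" flag)
  return word.replace(/^[^\p{L}\p{N}]+|[^\p{L}\p{N}]+$/gu, '');
}

console.log(stripPunctuation('elit.'));
console.log(stripPunctuation('(amet),'));
```

In the handler above, this could be applied to the result of range.toString().trim() before using the word.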

Ambrose answered 16/2, 2012 at 3:9 Comment(9)
I actually had to read the DOM documentation on Mozilla to figure this out. - Ambrose
An anonymous user suggested this edit: an improved solution that always gets the proper word, is simpler, and works in IE 4+: jsfiddle.net/Vap7C/80 - Ambrose
Pretty sweet... I have modified it to trigger only when Ctrl is also pressed. It does not seem to want to get the text of <a> elements, though. How come? - Curve
It doesn't work for the first word, at least in Chromium on Linux. - Novotny
In the first code section above, range.setStart(node, (range.startOffset - 1)); crashes when run on the first word in a "node," because it attempts to set range to a negative value. I tried adding logic to prevent that, but then the subsequent range.setStart(node, range.startOffset + 1); returns all but the first letter of the first word. Also, when words are separated by a newline, the last word on the previous line is returned in addition to the clicked-on word. So, this needs some work. - Faience
See my followup below on the range.set* code. - Faience
@Ambrose The example doesn't work if the content width is restricted (e.g. max-width: 500px): if you click outside of the text block, it alerts with the last word in the line. JSFiddle. Also, the modify method is non-standard, so be careful! MDN - Pettitoes
Using var range = s.getRangeAt(0).cloneRange() will prevent you from modifying the selection, which is probably what you want for production code. - Trommel
When audio is playing, the code doesn't give the word the user has clicked on. This issue exists in Firefox, Opera, Safari and Microsoft Edge. In Chrome, the code works properly even when audio is playing. How can I make the code work even when audio is playing? - Hydroxyl
C
14

As far as I know, adding a span for each word is the only way to do this.

You might consider using Lettering.js, which handles the splitting for you. Though this won't really impact performance, unless your "splitting code" is inefficient.

Then, instead of binding .click() to every span, it would be more efficient to bind a single .click() to the container of the spans, and check event.target to see which span has been clicked.
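A minimal sketch of that delegation idea, without jQuery. The class name "word" comes from the question; the container id "text" and the helper name wordFromTarget are assumptions made for illustration:

```javascript
// Hypothetical helper: given the element a click landed on, return the word
// if that element is one of the word spans, otherwise null.
function wordFromTarget(target) {
  if (target && target.classList && target.classList.contains('word')) {
    return target.textContent;
  }
  return null;
}

// Browser-only wiring: ONE listener on the container instead of one per span.
// The id "text" is an assumption; use whatever element wraps the spans.
if (typeof document !== 'undefined') {
  var container = document.getElementById('text');
  if (container) {
    container.addEventListener('click', function (event) {
      var word = wordFromTarget(event.target);
      if (word !== null) console.log(word);
    });
  }
}
```

Because the listener lives on the container, spans added later are handled without rebinding anything.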

Checked answered 27/9, 2011 at 1:28 Comment(2)
I just found this one browsing SO: jsfiddle.net/niklasvh/rD2uE. It is not as accurate as the 'span' hack, which is a problem, but it seems to work... I have to benchmark it now (and try to understand what the code actually does). - Elstan
@Checked But for a spellchecker in a chat, would adding a span for each word be the right approach? If a word is marked as incorrect and needs to be corrected, how can I keep that mark? It would be erased when I convert the text to spans. - Alastair
E
14

Here are improvements for the accepted answer:

$(".clickable").click(function (e) {
    var selection = window.getSelection();
    if (!selection || selection.rangeCount < 1) return true;
    var range = selection.getRangeAt(0);
    var node = selection.anchorNode;
    var word_regexp = /^\w*$/;

    // Extend the range backward until it matches word beginning
    while ((range.startOffset > 0) && range.toString().match(word_regexp)) {
      range.setStart(node, (range.startOffset - 1));
    }
    // Restore the valid word match after overshooting
    if (!range.toString().match(word_regexp)) {
      range.setStart(node, range.startOffset + 1);
    }

    // Extend the range forward until it matches word ending
    while ((range.endOffset < node.length) && range.toString().match(word_regexp)) {
      range.setEnd(node, range.endOffset + 1);
    }
    // Restore the valid word match after overshooting
    if (!range.toString().match(word_regexp)) {
      range.setEnd(node, range.endOffset - 1);
    }

    var word = range.toString();
});
Erv answered 27/12, 2016 at 12:25 Comment(0)
A
11

And another take on @stevendaniel's answer:

$('.clickable').click(function(){
   var sel = window.getSelection();
   var str = sel.anchorNode.nodeValue, len = str.length;
   var a = sel.anchorOffset, b = a;          // declare both (a=b=... created a global b)
   while (a > 0 && str[a - 1] != ' ') a--;   // start of word
   while (b < len && str[b] != ' ') b++;     // end of word+1
   console.log(str.substring(a, b));
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>

<p class="clickable">The objective can also be achieved by simply analysing the
string you get from <code>sel=window.getSelection()</code>. Two simple searches for
the next blank before and after the word, pointed to by the current position
(<code>sel.anchorOffset</code>) and the work is done:</p>

<p>This second paragraph is <em>not</em> clickable. I tested this on Chrome and Internet explorer (IE11)</p>
Aesthetically answered 31/7, 2018 at 14:29 Comment(0)
E
7

The only cross-browser (IE < 8) way that I know of is wrapping in span elements. It's ugly but not really that slow.

This example is straight from the jQuery .css() function documentation, but with a huge block of text to pre-process:

http://jsfiddle.net/kMvYy/

Here's another way of doing it (given here: jquery capture the word value ) on the same block of text that doesn't require wrapping in span. http://jsfiddle.net/Vap7C/1

Euroclydon answered 27/9, 2011 at 1:28 Comment(2)
Sorry, I had forgotten to hit save. - Euroclydon
OK then. I had already seen this technique, which seems to be overlooked by many despite being efficient and very portable; unfortunately it isn't accurate (clicking on the first letter of a word usually returns the previous word). - Elstan
P
4

-EDIT- What about this? It uses getSelection() bound to mouseup:

<script type="text/javascript" src="jquery-1.6.3.min.js"></script>
<script>
$(document).ready(function(){
    var words = [];
    $("#myId").bind("mouseup",function(){
        var word = window.getSelection().toString();
        if(word != ''){
            if( confirm("Add *"+word+"* to array?") ){words.push(word);}
        }
    });
    //just to see what we've got
    $('button').click(function(){alert(words);});
});
</script>

<div id='myId'>
    Some random text in here with many words huh
</div>
<button>See content</button>

I can't think of a way besides splitting. This is what I'd do: a small plugin that splits the text into spans and, when a span is clicked, adds its content to an array for further use:

<script type="text/javascript" src="jquery-1.6.3.min.js"></script>
<script>
//plugin, take it to another file
(function( $ ){
$.fn.splitWords = function(ary) {
    this.html('<span>'+this.html().split(' ').join('</span> <span>')+'</span>');
    this.children('span').click(function(){
        $(this).css("background-color","#C0DEED");
        ary.push($(this).html());
    });
};
})( jQuery );
//plugin, take it to another file

$(document).ready(function(){
    var clicked_words = [];
    $('#myId').splitWords(clicked_words);
    //just to see what we've stored
    $('button').click(function(){alert(clicked_words);});
});
</script>

<div id='myId'>
    Some random text in here with many words huh
</div>
<button>See content</button>
Portable answered 27/9, 2011 at 2:0 Comment(5)
Interestingly enough, this is (scarily) close to the version I have at the moment :p My main concern is that this code might be slow on very long pages (academic works, which are my main target), so I was looking for something more click-driven, but I might go with that. - Elstan
I wonder if doing the split server-side with PHP would be better. Under your circumstances, would that be a valid option? - Portable
No, it is supposed to plug into 3rd-party sites, so only client-side code is possible :) I will try your EDIT, it looks good to me. - Elstan
Unfortunately it does not work: the user has to highlight the text for it to work. It can be improved with the .modify method, though. I'll post the code. - Elstan
jsfiddle.net/zLGre With this minor fix I circumvent the inaccuracy. Should make do :D - Elstan
P
2

Here is a completely different method. I am not sure how practical it is, but it may give you some different ideas. Suppose you have a container element with position: relative that contains only text. You could put a span around each word, record its offsetHeight, offsetWidth, offsetLeft, and offsetTop, and then remove the span. Save those values to an array; then, when there is a click in the area, search the array to find out which word was closest to the click. This would obviously be intensive at the beginning, so it would work best in a situation where the person spends some time perusing the article. The benefit is that you do not need to worry about possibly hundreds of extra elements, but that benefit may be marginal at best.

Note I think you could remove the container element from the DOM to speed up the process and still get the offset distances, but I am not positive.
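The lookup half of this idea could be sketched as follows. The measuring step (wrapping, recording, unwrapping) is left out, and the record shape { word, left, top, width, height } as well as the function name nearestWord are assumptions for illustration only:

```javascript
// Hypothetical lookup: boxes is an array of { word, left, top, width, height }
// recorded once up front; x and y are click coordinates in the same
// coordinate system (relative to the positioned container).
function nearestWord(boxes, x, y) {
  var best = null;
  var bestDist = Infinity;
  for (var i = 0; i < boxes.length; i++) {
    var b = boxes[i];
    // Distance from the click to the box on each axis (0 if inside the box).
    var dx = Math.max(b.left - x, 0, x - (b.left + b.width));
    var dy = Math.max(b.top - y, 0, y - (b.top + b.height));
    var dist = dx * dx + dy * dy;
    if (dist < bestDist) {
      bestDist = dist;
      best = b.word;
    }
  }
  return best; // null when no boxes were recorded
}
```

This is a linear scan; for very long documents the boxes could be grouped by line (by their top value) first so that only one line has to be searched.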

Pheasant answered 27/9, 2011 at 4:15 Comment(2)
It can be completely broken by a resized window, though, because the spans are gone by the time the user clicks :/ - Elstan
@Elstan Yes, I did think of that. If the user resizes the window or changes the font size, then you would have to recalculate the entire thing. Again, this method would have a very narrow use case. I was just throwing it out there. - Pheasant
F
2

This is a followup on my comment to stevendaniels' answer (above):

In the first code section above, range.setStart(node, (range.startOffset - 1)); crashes when run on the first word in a "node," because it attempts to set range to a negative value. I tried adding logic to prevent that, but then the subsequent range.setStart(node, range.startOffset + 1); returns all but the first letter of the first word. Also, when words are separated by a newline, the last word on the previous line is returned in addition to the clicked-on word. So, this needs some work.

Here is my code to make the range expansion code in that answer work reliably:

while (range.startOffset !== 0) {                   // start of node
    range.setStart(node, range.startOffset - 1)     // back up 1 char
    if (range.toString().search(/\s/) === 0) {      // space character
        range.setStart(node, range.startOffset + 1);// move forward 1 char
        break;
    }
}

while (range.endOffset < node.length) {         // end of node
    range.setEnd(node, range.endOffset + 1)     // forward 1 char
    if (range.toString().search(/\s/) !== -1) { // space character
        range.setEnd(node, range.endOffset - 1);// back 1 char
        break;
    }
}
Faience answered 11/1, 2018 at 15:43 Comment(1)
PERFECT! Thank you so much. - Unclean
D
2

For the sake of completeness to the rest of the answers, I am going to add an explanation to the main methods used:

  • window.getSelection(): This is the main method. It is used to get information about a selection you made in text (by pressing the mouse button, dragging and then releasing, not by doing a simple click). It returns a Selection object whose main properties are anchorOffset and focusOffset, which are the position of the first and last characters selected, respectively. In case it doesn't make total sense, this is the description of anchor and focus the MDN website I linked previously offers:

    The anchor is where the user began the selection and the focus is where the user ends the selection

    • toString(): This method returns the selected text.

    • anchorOffset: Starting index of selection in the text of the Node you made the selection.
      If you have this html:

      <div>aaaa<span>bbbb cccc dddd</span>eeee</div>

      and you select 'cccc', then anchorOffset == 5, because the offset is counted (from zero) inside the text node where the selection starts, here the text of the <span>, and 'cccc' begins at index 5 of "bbbb cccc dddd".

    • focusOffset: Final index of selection in the text of the Node you made the selection.
      Following the previous example, focusOffset == 9.

    • getRangeAt(): Returns a Range object. It receives an index as parameter because (I suspect, I actually need confirmation of this) in some browsers such as Firefox you can select multiple independent texts at once.

      • startOffset: This Range's property is analogous to anchorOffset.
      • endOffset: As expected, this one is analogous to focusOffset.
      • toString: Analogous to the toString() method of the Selection object.
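To make the offsets in the example above concrete, here is a small sketch that reproduces them with plain strings. It only assumes that the selection starts and ends inside the same text node; the helper name selectionOffsets is made up for this illustration:

```javascript
// Hypothetical helper: compute the offsets a left-to-right selection of
// `selected` would have inside a single text node whose data is `nodeText`.
function selectionOffsets(nodeText, selected) {
  var anchorOffset = nodeText.indexOf(selected);
  if (anchorOffset === -1) return null; // text not found in this node
  return {
    anchorOffset: anchorOffset,                  // index of the first selected character
    focusOffset: anchorOffset + selected.length  // index just past the last one
  };
}

// The text node inside the <span> of the example:
var offsets = selectionOffsets('bbbb cccc dddd', 'cccc');
console.log(offsets.anchorOffset, offsets.focusOffset); // 5 9
```

If the user drags from right to left, anchor and focus swap, which is one reason the Range properties startOffset and endOffset are often easier to work with.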

Aside from the other solutions, there is also another method nobody seems to have noticed: Document.caretRangeFromPoint()

The caretRangeFromPoint() method of the Document interface returns a Range object for the document fragment under the specified coordinates.

If you follow this link you will see how, in fact, the documentation provides an example that closely resembles what the OP was asking for. This example does not get the particular word the user clicked on, but instead adds a <br> right after the character the user clicked.

function insertBreakAtPoint(e) {
  let range;
  let textNode;
  let offset;

  if (document.caretPositionFromPoint) {
    range = document.caretPositionFromPoint(e.clientX, e.clientY);
    textNode = range.offsetNode;
    offset = range.offset;    
  } else if (document.caretRangeFromPoint) {
    range = document.caretRangeFromPoint(e.clientX, e.clientY);
    textNode = range.startContainer;
    offset = range.startOffset;
  }
  // Only split TEXT_NODEs
  if (textNode && textNode.nodeType == 3) {
    let replacement = textNode.splitText(offset);
    let br = document.createElement('br');
    textNode.parentNode.insertBefore(br, replacement);
  }
}

let paragraphs = document.getElementsByTagName("p");
for (let i = 0; i < paragraphs.length; i++) {
  paragraphs[i].addEventListener('click', insertBreakAtPoint, false);
}
<p>Lorem ipsum dolor sit amet, consetetur sadipscing elitr,
sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat,
sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum.
Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.</p>

From there, it's just a matter of getting the word by taking all the text after the previous blank character and before the next one.
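A minimal sketch of that last step, under the assumption that words are delimited by whitespace only. The helper name wordAt is hypothetical; the browser wiring reuses the same feature detection as the documentation example:

```javascript
// Hypothetical helper: return the whitespace-delimited word that contains,
// or ends at, the given character offset of `text`.
function wordAt(text, offset) {
  var start = offset;
  var end = offset;
  // Walk left while the previous character is not whitespace.
  while (start > 0 && !/\s/.test(text[start - 1])) start--;
  // Walk right while the current character is not whitespace.
  while (end < text.length && !/\s/.test(text[end])) end++;
  return text.slice(start, end);
}

// Browser-only wiring (skipped when no document is available).
if (typeof document !== 'undefined') {
  document.addEventListener('click', function (e) {
    var node, offset;
    if (document.caretPositionFromPoint) {
      var pos = document.caretPositionFromPoint(e.clientX, e.clientY);
      if (pos) { node = pos.offsetNode; offset = pos.offset; }
    } else if (document.caretRangeFromPoint) {
      var range = document.caretRangeFromPoint(e.clientX, e.clientY);
      if (range) { node = range.startContainer; offset = range.startOffset; }
    }
    // Only text nodes (nodeType 3) carry character data.
    if (node && node.nodeType === 3) {
      console.log(wordAt(node.data, offset));
    }
  });
}
```

Unlike the getSelection-based answers, this never touches the user's selection, and the loops cannot run past either end of the text node.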

Duley answered 4/7, 2020 at 2:4 Comment(0)
T
2

As with the accepted answer, this solution uses window.getSelection to infer the cursor position within the text. It uses a regex to reliably find the word boundary, and does not restrict the starting node and ending node to be the same node.

This code has the following improvements over the accepted answer:

  • Works at the beginning of text.
  • Allows selection across multiple nodes.
  • Does not modify selection range.
  • Allows the user to override the range with a custom selection.
  • Detects words even when surrounded by non-spaces (e.g. "\t\n")
  • Uses vanilla JavaScript, only.
  • No alerts!

var getBoundaryPoints = (range) => ({ start: range.startOffset, end: range.endOffset })

function expandTextRange(range) {
    // expand to include a whole word

    var matchesStart = (r) => r.toString().match(/^\s/) // Alternative: /^\W/
    var matchesEnd = (r) => r.toString().match(/\s$/)   // Alternative: /\W$/

    // Find start of word 
    while (!matchesStart(range) && range.startOffset > 0) {
        range.setStart(range.startContainer, range.startOffset - 1)
    }
    if (matchesStart(range)) range.setStart(range.startContainer, range.startOffset + 1)

    // Find end of word
    var length = range.endContainer.length || range.endContainer.childNodes.length
    while (!matchesEnd(range) && range.endOffset < length) {
        range.setEnd(range.endContainer, range.endOffset + 1)
    }
    if (matchesEnd(range) && range.endOffset > 0) range.setEnd(range.endContainer, range.endOffset - 1)

    //console.log(JSON.stringify(getBoundaryPoints(range)))
    //console.log('"' + range.toString() + '"')
    var str = range.toString()
}

function getTextSelectedOrUnderCursor() {
    var sel = window.getSelection()
    var range = sel.getRangeAt(0).cloneRange()

    if (range.startOffset == range.endOffset) expandTextRange(range)
    return range.toString()
}

function onClick() {
    console.info('"' + getTextSelectedOrUnderCursor() + '"')
}

var content = document.body
content.addEventListener("click", onClick)
<div id="text">
<p>Vel consequatur incidunt voluptatem. Sapiente quod qui rem libero ut sunt ratione. Id qui id sit id alias rerum officia non. A rerum sunt repudiandae. Aliquam ut enim libero praesentium quia eum.</p>

<p>Occaecati aut consequuntur voluptatem quae reiciendis et esse. Quis ut sunt quod consequatur quis recusandae voluptas. Quas ut in provident. Provident aut vel ea qui ipsum et nesciunt eum.</p>
</div>

Because it uses arrow functions, this code doesn't work in IE; but that is easy to adjust. Furthermore, because it allows the user selection to span across nodes, it may return text that is usually not visible to the user, such as the contents of a script tag that exists within the user's selection. (Triple-click the last paragraph to demonstrate this flaw.)

You should decide which kinds of nodes the user should see, and filter out the unneeded ones, which I felt was beyond the scope of the question.
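One possible shape for such a filter, as a sketch only. The list of parent elements to skip and the helper name isReadableTextNode are assumptions, and walking the nodes inside the range is left to the caller:

```javascript
// Assumed list of elements whose text content is not shown to the user.
var HIDDEN_PARENTS = ['SCRIPT', 'STYLE', 'NOSCRIPT', 'TEMPLATE'];

// Hypothetical filter: true only for text nodes whose parent is not one of
// the elements listed above.
function isReadableTextNode(node) {
  if (!node || node.nodeType !== 3) return false; // 3 === Node.TEXT_NODE
  var parent = node.parentNode;
  // In HTML documents nodeName is the upper-case tag name, e.g. "SCRIPT".
  return !(parent && HIDDEN_PARENTS.indexOf(parent.nodeName) !== -1);
}
```

A caller could collect the text nodes that intersect the range and concatenate only those that pass this check, instead of relying on range.toString().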

Trommel answered 26/2, 2021 at 19:58 Comment(0)
D
1

Like this: Get user selected text with jquery and its uses?

Dvinsk answered 27/9, 2011 at 1:30 Comment(0)
M
1

The selected solution sometimes does not work on Russian texts (shows error). I would suggest the following solution for Russian and English texts:

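function returnClickedWord(){
    let selection = window.getSelection();
    // Only text nodes (nodeType 3) have a .data string to scan
    if (!selection.anchorNode || selection.anchorNode.nodeType !== 3) return;
    let text = selection.anchorNode.data,
        start = selection.anchorOffset,
        end = start;
    // Was [a-zA-z...]: the range A-z also matches [ \ ] ^ _ and `
    const wordChar = /[a-zA-Z0-9а-яА-Я]/;
    // Walk left while the previous character belongs to the word
    while(start > 0 && wordChar.test(text[start - 1])){
        start--;
    }
    // Walk right while the current character belongs to the word
    while(end < text.length && wordChar.test(text[end])){
        end++;
    }
    alert(text.slice(start, end));
}
document.addEventListener("click", returnClickedWord);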
Mcilroy answered 29/9, 2019 at 12:56 Comment(0)
E
1

Here's an alternative to the accepted answer that works with Cyrillic. I don't understand why checking the word boundaries is necessary, but by default the selection is collapsed for some reason for me.

// Wrapped in a function so that the return statements are valid
function getClickedWord() {
  let selection = window.getSelection()
  if (!selection || selection.rangeCount < 1) return ''
  let node = selection.anchorNode
  let range = selection.getRangeAt(0)

  let text = node.textContent

  let startIndex, endIndex
  startIndex = endIndex = selection.anchorOffset
  const letter = /[A-ZА-Я]/i

  // Move left while the previous character is a letter
  while (startIndex > 0 && letter.test(text[startIndex - 1])) {
    startIndex -= 1
  }

  // Move right while the current character is a letter
  while (endIndex < text.length && letter.test(text[endIndex])) {
    endIndex += 1
  }

  range.setStart(node, startIndex)
  range.setEnd(node, endIndex)

  return range.toString()
}
Elbe answered 3/6, 2021 at 9:4 Comment(0)
S
0

What looks like a slightly simpler solution.

document.addEventListener('selectionchange', () => {
  const selection = window.getSelection();
  const matchingRE = new RegExp(`^.{0,${selection.focusOffset}}\\s+(\\w+)`);
  const clickedWord = (matchingRE.exec(selection.focusNode.textContent) || ['']).pop();
});

I'm testing

Strapped answered 2/8, 2018 at 11:5 Comment(3)
Not working... It returns the first word all the time. - Fun
@Fun To be fair, returning the first word of the selection would meet the criteria set by the OP. The accepted answer even modifies the selection. Try selecting "lor si" from "dolor sit" and you'll see what I mean. - Trommel
The accepted answer alerts the right word you click on. If you click on dolor or sit, the alert will show the right word. This answer always shows Lorem, no matter which word you click on. Here you can see the difference. - Fun
P
0

An anonymous user suggested this edit: an improved solution that always gets the proper word, is simpler, and works in IE 4+:

http://jsfiddle.net/Vap7C/80/

document.body.addEventListener('click', function() {
 // Gets clicked on word (or selected text if text is selected)
 var t = '', sel;
 if (window.getSelection && (sel = window.getSelection()).modify) {
    // Webkit, Gecko
    var s = sel;
    if (s.isCollapsed) {
        s.modify('move', 'forward', 'character');
        s.modify('move', 'backward', 'word');
        s.modify('extend', 'forward', 'word');
        t = s.toString();
        s.modify('move', 'forward', 'character'); //clear selection
    }
    else {
        t = s.toString();
    }
  } else if ((sel = document.selection) && sel.type != "Control") {
    // IE 4+
    var textRange = sel.createRange();
    if (!textRange.text) {
        textRange.expand("word");
    }
    // Remove trailing spaces
    while (/\s$/.test(textRange.text)) {
        textRange.moveEnd("character", -1);
    }
    t = textRange.text;
 }
 alert(t);
});
Parade answered 16/12, 2021 at 18:32 Comment(0)
C
0

Here's an alternative that doesn't visually modify the selection range.

/**
 * Find a string from a selection
 */
export function findStrFromSelection(s: Selection): string {
  const range = s.getRangeAt(0);
  const node = s.anchorNode;
  const content = node?.textContent ?? "";

  let startOffset = range.startOffset;
  let endOffset = range.endOffset;
  const isBreak = (ch: string) => ch === " " || ch === "\n";

  // Find starting point
  // We move the cursor back until we find a space, a line break or the start of the node
  while (startOffset > 0 && !isBreak(content[startOffset - 1])) {
    startOffset--;
  }

  // Find ending point
  // We move the cursor forward until we find a space, a line break or the end of the node
  while (endOffset < content.length && !isBreak(content[endOffset])) {
    endOffset++;
  }

  return content.substring(startOffset, endOffset);
}
Cogwheel answered 31/12, 2021 at 10:51 Comment(0)
