How do you identify duplicate values in a numerical sequence using XPath 2.0?
I have an XPath expression that returns a sequence of values like the one below:

1 2 2 3 4 5 5 6 7

This is easy to convert to the sequence of unique values 1 2 3 4 5 6 7 using distinct-values(). However, what I want to extract is the list of duplicated values, which here is 2 5. I can't think of an easy way to do this. Can anyone help?

Electrometer asked 25/9, 2008 at 12:44
23

Use this simple XPath 2.0 expression:

$vSeq[index-of($vSeq,.)[2]]

where $vSeq is the sequence of values in which we want to find the duplicates.

For an explanation of how this works, see:

http://dnovatchev.wordpress.com/2008/11/16/xpath-2-0-gems-find-all-duplicate-values-in-a-sequence-part-2/

TL;DR: the picture below gives a visual explanation.

If the sequence is:

$vSeq  =  1,   2,   3,   2,   4,   5,   6,   7,   5,   7,   5

Then evaluating the above XPath expression produces: 2, 5, 7
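
As a rough trace of why this works (a sketch that writes the example sequence as a literal, since plain XPath 2.0 cannot declare $vSeq inline): index-of() returns every position at which a value occurs, [2] keeps the second of those positions, and a numeric predicate keeps an item only when the item's own position equals that number. Each line below is a separate expression.

```xpath
(: all positions at which the value 2 occurs :)
index-of((1, 2, 3, 2, 4, 5, 6, 7, 5, 7, 5), 2)       (: 2, 4 :)

(: second position of the value 5, which occurs three times :)
index-of((1, 2, 3, 2, 4, 5, 6, 7, 5, 7, 5), 5)[2]    (: 9 :)

(: a value that occurs once has no second position :)
index-of((1, 2, 3, 2, 4, 5, 6, 7, 5, 7, 5), 1)[2]    (: empty sequence :)

(: the full expression keeps only the items at positions 4, 9 and 10 :)
(1, 2, 3, 2, 4, 5, 6, 7, 5, 7, 5)
  [index-of((1, 2, 3, 2, 4, 5, 6, 7, 5, 7, 5), .)[2]]  (: 2, 5, 7 :)
```

Because only the second occurrence of a value satisfies the predicate, a value that occurs three or more times (5 here) still appears just once in the result, and an empty predicate value counts as false, which drops the non-duplicates.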


[Image: visual explanation of the expression]

Honeywell answered 13/11, 2008 at 16:11 Comment(3)
Very nice. I constantly overlook the index-of() function. - Bladderwort
Would accept this as the answer, but it seems I am no longer able to. - Antoinetteanton
Never mind, thanks @Woody. I am glad I found you. Maybe the moderators can help restore/connect/merge with your previous self? - Honeywell
3

What about:

distinct-values(
  for $item in $seq
  return if (count($seq[. eq $item]) > 1)
         then $item
         else ())

This iterates through the items in the sequence, and returns the item if the number of items in the sequence that are equal to that item is greater than one. You then have to use distinct-values() to remove the duplicates from that list.
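
A possible variant, sketched under the same assumption that $seq is already bound to a sequence of atomic values: iterating over distinct-values($seq) rather than over $seq tests each value only once, so the outer distinct-values() call is no longer needed.

```xpath
(: each distinct value is counted once against the full sequence :)
for $item in distinct-values($seq)
return if (count($seq[. eq $item]) gt 1)
       then $item
       else ()
```

For the sequence in the question this yields 2 and 5. Note that the order in which distinct-values() returns its items is implementation-dependent, so the result is not guaranteed to follow the order of first appearance.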

Bladderwort answered 25/9, 2008 at 18:14 Comment(1)
Hi Jeni, it seems there is a simpler solution :) $vSeq[index-of($vSeq,.)[2]] Cheers, Dimitre - Honeywell
0

Calculate the difference between your original sequence and the sequence of distinct values. What remains is the values that occur more than once. Note that the values in this result are not necessarily distinct: a value that occurs more than twice in the original sequence will still appear more than once, so apply distinct-values() again if you need each duplicate only once.
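
XPath 2.0 has no built-in difference operator for sequences of atomic values (the except operator works on nodes only), so this suggestion has to be spelled out by hand. One possible reading, sketched here with the sequence from the question written as a literal, is to drop the first occurrence of every value and de-duplicate what remains:

```xpath
(: keep every item whose position is not the first position of its value,
   then collapse repeated leftovers :)
distinct-values(
  (1, 2, 2, 3, 4, 5, 5, 6, 7)
    [position() ne index-of((1, 2, 2, 3, 4, 5, 5, 6, 7), .)[1]])
(: the predicate keeps positions 3 and 7, i.e. the values 2 and 5 :)
```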

Chalone answered 25/9, 2008 at 13:23
0

What about XSLT? Would that be an option for you?

    <xsl:for-each select="/r/a">
        <xsl:variable name="cur" select="." />
        <xsl:if test="count(./preceding-sibling::a[. = $cur]) > 0 and count(./following-sibling::a[. = $cur]) = 0">
            <xsl:value-of select="." />
        </xsl:if>
    </xsl:for-each>
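
The same selection can also be written as a single path expression. This is a sketch that assumes an input document shaped the way select="/r/a" implies, for example <r><a>1</a><a>2</a><a>2</a><a>3</a></r>; that sample input is an assumption and not part of the original answer.

```xpath
(: selects each a element that repeats the value of an earlier sibling
   and is the last sibling carrying that value :)
/r/a[. = preceding-sibling::a][not(. = following-sibling::a)]
```

Unlike the XSLT fragment, which writes the matching values out back to back through xsl:value-of, this returns the matching a nodes themselves, one per duplicated value.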

Wisner answered 28/9, 2008 at 20:50
