SCORM 2004 Time Format - Regular Expression?

8

6

I am building a SCORM 2004 JavaScript API for an LMS, and one of the SCORM 2004 requirements is that time intervals passed into it must use the format below. Does anyone know what regular expression would match it? I have been trying to wrap my mind around it, but to no avail. Note: P must always be the first character.

P[yY][mM][dD][T[hH][nM][s[.s]S]] where:

  • y: The number of years (integer, >= 0, not restricted)
  • m: The number of months (integer, >=0, not restricted)
  • d: The number of days (integer, >=0, not restricted)
  • h: The number of hours (integer, >=0, not restricted)
  • n: The number of minutes (integer, >=0, not restricted)
  • s: The number of seconds or fraction of seconds (real or integer, >=0, not restricted). If fractions of a second are used, SCORM further restricts the string to a maximum of 2 digits (e.g., 34.45 – valid, 34.45454545 – not valid).
  • The character literals designators P, Y, M, D, T, H, M and S shall appear if the corresponding non-zero value is present.
  • Zero-padding of the values shall be supported. Zero-padding does not change the integer value of the number being represented by a set of characters. For example, PT05H is equivalent to PT5H and PT000005H.

Example -

  • P1Y3M2DT3H indicates a period of time of 1 year, 3 months, 2 days and 3 hours
  • PT3H5M indicates a period of time of 3 hours and 5 minutes

Any help would be greatly appreciated.

Thanks!

UPDATE:

I have found some additional requirements that must also be met:

  • The designator P shall be present
  • If the value of years, months, days, hours, minutes or seconds is zero, the value and corresponding character literal designation may be omitted, but at least one character literal designator and value shall be present in addition to the designator P
  • The designator T shall be omitted if all of the time components (hours, minutes and seconds) are not used. A zero value may be used with any of the time components (e.g., PT0S)
Blindheim answered 20/8, 2009 at 16:24 Comment(0)
5

Here is the regex I use:

^P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$
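As a quick check, the pattern can be exercised from Python's `re` module. This is only a sketch: the helper name `is_valid_timeinterval` is hypothetical, the sample strings are illustrative, and the pattern is typed out with a plain `\.` before the fractional seconds (no hidden characters). `re.ASCII` is an added assumption so that `\d` and `\w` behave as they do in JavaScript.

```python
import re

# Same pattern as in the answer, typed with a plain "\." before the fraction.
# re.ASCII restricts \d and \w to ASCII (an assumption, to mirror JavaScript).
TIMEINTERVAL_RE = re.compile(
    r"^P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+D|D)?"
    r"(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$",
    re.ASCII,
)

def is_valid_timeinterval(value: str) -> bool:
    """Return True if the whole string matches (hypothetical helper name)."""
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    return TIMEINTERVAL_RE.fullmatch(value) is not None

samples = ["P1Y3M2DT3H", "PT3H5M", "PT0S", "P4DT19H46M18.22S",
           "P", "PTS", "PT34.454S", "P1YT"]
for sample in samples:
    print(sample, is_valid_timeinterval(sample))
```

The `(?=\w*\d)` lookahead is what rejects `P` and `PTS`, since neither contains a digit. A string such as `P1YT` is still accepted, so the rule that T is omitted when no time component is used would need a separate check.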
Nonce answered 14/4, 2011 at 13:32 Comment(2)
I adapted this for ColdFusion, and I found what I think is an error (at least in CF) for the "seconds" part. I think the seconds portion should be (\d+(?:\.\d{1,2})?S|S)? - Dowse
Fails for "P4DT19H46M18.22S" - Heart
1

Use [0-9] to match any numeral. + to match 1 or more repetitions. ? to match 0 or 1 repetitions. () to group and extract the output.

P(([0-9]+Y)?([0-9]+M)?([0-9]+D)?)(T([0-9]+H)?([0-9]+M)?([0-9.]+S)?)?

import re

>>> p = re.compile('P(([0-9]+Y)?([0-9]+M)?([0-9]+D)?)(T([0-9]+H)?([0-9]+M)?([0-9.]+S)?)?')

>>> p.match('P1Y3M2DT3H').groups()
('1Y3M2D', '1Y', '3M', '2D', 'T3H', '3H', None, None)

>>> p.match('P3M2DT3H').groups()
('3M2D', None, '3M', '2D', 'T3H', '3H', None, None)

>>> p.match('PT3H5M').groups()
('', None, None, None, 'T3H5M', '3H', '5M', None)

>>> p.match('P1Y3M4D').groups()
('1Y3M4D', '1Y', '3M', '4D', None, None, None, None)
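
The character class `[0-9.]+` accepts any mix of digits and dots. One way to tighten it, sketched here on the assumption that seconds follow the question's "at most two fractional digits" rule, is to spell out the fraction and use `fullmatch` so that trailing text is rejected. The name `STRICT` is illustrative.

```python
import re

# Seconds are now digits, optionally followed by a dot and one or two digits.
STRICT = re.compile(
    r"P(([0-9]+Y)?([0-9]+M)?([0-9]+D)?)"
    r"(T([0-9]+H)?([0-9]+M)?([0-9]+(?:\.[0-9]{1,2})?S)?)?"
)

# fullmatch requires the whole string to match; match only anchors the start.
print(STRICT.fullmatch("PT1.1.1.1.1S"))       # None: no longer accepted
print(STRICT.fullmatch("PT34.45S").groups())
# ('', None, None, None, 'T34.45S', None, None, '34.45S')
```

This version still accepts a bare `P` or `PT`, so the "at least one designator and value" rule needs a separate check.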
Benzoate answered 20/8, 2009 at 16:32 Comment(4)
This regex is too permissive for seconds and will match values like '1.1.1.1.1'. - Hauptmann
This expression works except for the detail that FM commented on. I modified it a bit to allow a 0 to be in place of 0M for example (as per the standards), so I have now: P((([0-9]+Y)|0)?(([0-9]+M)|0)?(([0-9]+D)|0)?)(T(([0-9]+H)|0)?(([0-9]+M)|0)?(([0-9.]+S)|0)?) I also found further requirements and have updated the original question. - Blindheim
"0 in place of 0M"? Where's that in the spec? - Eudy
Similar to my comment for Alan's regex (see below), I found some examples that break. When using your regex, the following should return true but return false (apologies if I'm mistaken!): P1Y3M2D (the entire time portion may be omitted per the spec), PYMDTH23M15S (letters may be present w/o values), PYMDT2HM15.23S (letters may be present w/o values) - Wizened
1

JavaScript doesn't support /x (free-spacing or comments mode), so remove the whitespace from this regex before using it.

/^P(?=.)
 (?:\d+Y)?
 (?:\d+M)?
 (?:\d+D)?
 (?:T(?=.)
    (?:\d+H)?
    (?:\d+M)?
    (?:\d+
       (?:\.\d{1,2})?
    S)?
 )?$/i

Each (?=.) lookahead asserts that there's at least one character remaining at that point in the match. Combined with the $ anchor, that means at least one of the following groups (i.e., the Y, M, D or T group after the P, and the H, M or S group after the T) has to match, even though they're all optional. That satisfies the second of the added requirements in your updated spec.
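
For comparison, Python does have a free-spacing mode (`re.VERBOSE`), so the same idea can be tried there without stripping whitespace. This is a sketch rather than the answerer's code: it assumes the seconds group is meant to end with the `S` designator, and the name `INTERVAL` is illustrative.

```python
import re

INTERVAL = re.compile(
    r"""
    ^P(?=.)                        # something must follow P
    (?:\d+Y)?
    (?:\d+M)?
    (?:\d+D)?
    (?:T(?=.)                      # something must follow T
        (?:\d+H)?
        (?:\d+M)?
        (?:\d+(?:\.\d{1,2})?S)?    # assumed: seconds end with S
    )?$
    """,
    re.VERBOSE | re.IGNORECASE,    # IGNORECASE mirrors the /i flag
)

for sample in ["P1Y3M2DT3H", "PT0S", "P", "P1YT", "PTS"]:
    print(sample, bool(INTERVAL.match(sample)))
```

Here `P` and `P1YT` are rejected by the lookaheads, and `PTS` is rejected because every designator must carry digits.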

Eudy answered 20/8, 2009 at 19:32 Comment(1)
Thanks very much for your regex, but I found some examples that break. I believe each of the following are valid in the spec and should return true but return false (apologies if I'm mistaken!): PT00S (the entire date portion may be omitted per the spec, and zero values are acceptable for H/M/S.s); PYMDTH23M15S (letters may be present w/o numbers); PYMDT2HM15.23S (letters may be present w/o numbers) - Wizened
1

For what it's worth, I've adapted the accepted answer for use with ColdFusion, and I thought some folks might find it useful, so I figured I'd post it. As noted above, CF bombed on the seconds part of that regex, so I modified it. I'm not sure whether that means there's a general regex error in the example above or that CF and JS have different regex implementations. Anyway, here's the CF regex, complete with comments (because, you know, otherwise regular expressions are complete gibberish):

<cfset regex = "(?x) ## allow for multi-line expression, including comments (oh, glorious comments)
            ^ ## ensure that this expression occurs at the start of the string
            P ## the first character must be a P
            (\d+Y|Y)? ## year (the ? indicates 0 or 1 times)
            (\d+M|M)? ## month
            (\d+D|D)? ## day
            (?:T ## T delineates between day and time information
            (\d+H|H)? ## hour
            (\d+M|M)? ## minute
            (\d+(?:\.\d{1,2})?S|S)? ## seconds, with up to two fractional digits.  The inner ?: ensures that the sub-sub-expression isn't returned as a separate thing
            )? ## closes 'T' subexpression
            $ ## ensure that this expression occurs at the end of the string.  In conjunction with leading ^, this ensures that the string has no extraneous characters">

After that, you run it against your string like this:

<cfset result = reFind(regex,mystring,1,true)>

That returns a structure whose pos and len arrays describe each subexpression, which you can iterate over to get the discrete parts:

<cfloop from=1 to=#arrayLen(result.len)# index=i>
    <cfif result.len[i] GT 0>
    #mid(mystring, result.pos[i], result.len[i])#<br>
    </cfif>
</cfloop>
Dowse answered 19/6, 2013 at 20:40 Comment(0)
0

Our SCORM Engine implementation uses a combination of a regular expression similar to the ones above and some basic JavaScript logic to do further validation.
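
The answer does not show that code. Purely as an illustration of the "regex plus extra logic" approach, and not a reproduction of any particular SCORM engine, a sketch might use a lenient pattern for the overall shape and then apply the remaining rules from the question in plain code. The function name `validate_timeinterval` and the choice of rules checked here are assumptions.

```python
import re

# Lenient shape check: every component is optional but must carry a value.
SHAPE = re.compile(
    r"P(?:\d+Y)?(?:\d+M)?(?:\d+D)?"
    r"(?P<time>T(?:\d+H)?(?:\d+M)?(?:\d+(?:\.\d{1,2})?S)?)?",
    re.ASCII,
)

def validate_timeinterval(value: str) -> bool:
    """Hypothetical two-stage validator: shape first, then the extra rules."""
    m = SHAPE.fullmatch(value)
    if m is None:
        return False
    # Rule: at least one designator and value must follow P.
    if value == "P":
        return False
    # Rule: T is omitted when no time component is used.
    if m.group("time") == "T":
        return False
    return True

for sample in ["PT0S", "P1Y", "P", "PT", "P1YT", "PT1.234S"]:
    print(sample, validate_timeinterval(sample))
```

Keeping the extra rules outside the pattern makes each one easy to read and to report separately, at the cost of a slightly longer function.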

Optimist answered 27/8, 2009 at 15:58 Comment(0)
0

Maybe it's semantics, but this part of the SCORM spec can be interpreted to mean literals are allowed even if a value isn't supplied:

The character literals designators P, Y, M, D, T, H, M and S shall appear if the corresponding non-zero value is present.

"shall appear" meaning a literal MUST be present if the corresponding number is present; it doesn't say "shall ONLY appear" if the corresponding number is present.

I modified Alan's regex to handle this possibility (thanks, Alan):

^P(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$

The only bug I've found so far is a failure to flag a string that has no numeric values specified, such as 'PTS'. The minimum according to the spec is "P" followed by a single value and accompanying designation, such as P1Y (= 1 year) or PT0S (= 0 seconds):

at least one character literal designator and value shall be present in addition to the designator P

There must be a way to add a check for a numeric value to this regex, but my regex-fu is not that strong. :)

Wizened answered 17/8, 2010 at 7:55 Comment(0)
0

I'm using this expression:

^P(\d+Y)?(\d+M)?(\d+D)?(T(((\d+H)(\d+M)?(\d+(\.\d{1,2})?S)?)|((\d+M)(\d+(\.\d{1,2})?S)?)|((\d+(\.\d{1,2})?S))))?$

This expression does not match a value like "PYMDT0H": a digit must accompany each designator for it to match.
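
A quick way to see what this pattern accepts is to run it over a few strings from this thread. The sketch below uses Python's `re`; the sample list and the name `PATTERN` are illustrative.

```python
import re

PATTERN = re.compile(
    r"^P(\d+Y)?(\d+M)?(\d+D)?"
    r"(T(((\d+H)(\d+M)?(\d+(\.\d{1,2})?S)?)"
    r"|((\d+M)(\d+(\.\d{1,2})?S)?)"
    r"|((\d+(\.\d{1,2})?S))))?$"
)

samples = ["P1Y3M2DT3H", "PT3H5M", "PT0S", "PYMDT0H", "P1YT", "PT", "P"]
results = {s: PATTERN.match(s) is not None for s in samples}
for sample, ok in results.items():
    print(sample, ok)
```

The three alternatives after `T` force at least one time component, so `PT` and `P1YT` are rejected. Because every group after `P` is optional, a bare `P` still matches, so the "at least one value" rule would need a separate check.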

Leia answered 3/5, 2011 at 9:41 Comment(0)
0

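Based on the previously accepted answer, I've made this capturing regex for PCRE-style engines (PHP, Ruby, ECMAScript 2018, ...): https://regex101.com/r/KfMs1I/6. It is written in free-spacing form, so it needs the x flag or the whitespace stripped.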

^P (?=\w*\d) (?:(?<years>\d+)Y|Y)? (?:(?<month>\d+)M|M)? (?:(?<days>\d+)D|D)? (?: T (?:(?<hours>\d+)H|H)? (?:(?<minutes>\d+)M|M)? (?: (?<seconds> \d+ (?: \. \d{1,2} )? )S | S )? )?$

Unfortunately I can't find how to do the same in current JS, because the optional groups cannot be accessed in a reliable way without named groups.
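
For readers who can run the check outside the browser, the same capturing idea carries over to Python, where named groups are written `(?P<name>...)` and `re.VERBOSE` allows the spaced-out layout. This is a sketch under those assumptions; the function name `parse_timeinterval` is hypothetical, and components that are absent come back as `None`.

```python
import re

NAMED = re.compile(
    r"""
    ^P (?=\w*\d)
    (?:(?P<years>\d+)Y|Y)?
    (?:(?P<month>\d+)M|M)?
    (?:(?P<days>\d+)D|D)?
    (?: T
        (?:(?P<hours>\d+)H|H)?
        (?:(?P<minutes>\d+)M|M)?
        (?:(?P<seconds>\d+(?:\.\d{1,2})?)S|S)?
    )?$
    """,
    re.VERBOSE | re.ASCII,
)

def parse_timeinterval(value):
    """Return a dict of captured components, or None if there is no match."""
    m = NAMED.match(value)
    return m.groupdict() if m else None

print(parse_timeinterval("P1Y3M2DT3H"))
# {'years': '1', 'month': '3', 'days': '2', 'hours': '3', 'minutes': None, 'seconds': None}
```

The values are returned as strings, so zero-padded input such as `PT05H` yields `'05'` until it is converted with `int`.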

Mendelian answered 13/6, 2018 at 8:6 Comment(0)
