I want to find digits followed by "f", "ff", "f." or "ff." to standardize the spelling following given conventions/rules.
I already tried some regular expressions, but unfortunately I did not find a universal expression that captures all of the cases above (f, ff, f., ff.).
In plain words it seems easy:
- find digits
- followed by an optional whitespace
- then followed by f, ff, f. or ff.
- only whitespace or non-word characters (or the start/end of the line) may come directly before and after the expression
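One possible reading of these rules, sketched in Python's `re` module. This is an assumption-laden sketch, not a definitive answer: Python has no `\h`, so `[ \t]` stands in for horizontal whitespace, and the name `PATTERN` is made up for illustration. The trailing lookahead also rejects a dot, so that the optional dot cannot be given back during backtracking to produce a shorter match.

```python
import re

# Assumption: [ \t] replaces \h, which Python's re module does not support.
PATTERN = re.compile(
    r"(?<!\w)"    # no word character directly before the digits
    r"(\d+)"      # group 1: the digits
    r"[ \t]?"     # optional single space or tab
    r"(ff?)"      # group 2: "f" or "ff"
    r"(\.?)"      # group 3: optional trailing dot
    r"(?![\w.])"  # no word character (and no dot) directly after
)

def find(line):
    """Return the (digits, f-part, dot) groups of the first match, or None."""
    m = PATTERN.search(line)
    return m.groups() if m else None

print(find("+ aaa 00 ff. aaa +"))  # ('00', 'ff', '.')
print(find("00 f.aaa -"))          # None
```

The lookarounds consume no characters, so the surrounding whitespace stays out of the match and does not need to be restored in a replacement.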
The beginning of the regex is quite easy, but I can’t figure out how to handle the different "f" cases and the non-word condition that has to follow them.
My best attempt so far is:
(?<=\b)(\d+(\h|\b)?f{1,2})\.?
but it still finds strings that are followed by a word character.
When I extend the regex to:
(?<=\b)(\d+(\h|\b)?f{1,2})\.?(\W)
the number of false positives decreases, but it is still not the solution.
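The behaviour of both attempts can be reproduced with a short Python sketch. It rests on some assumptions: `[ \t]` replaces `\h` (not available in Python's `re`), `(?<=\b)` is written as the equivalent `\b`, the optional group `(\h|\b)?` is simplified to `[ \t]?`, and the names `first`, `second` and `found` are made up here. It suggests that the first attempt has no condition at all after the "f" part, and that in the second attempt the optional dot can be skipped so that `\W` consumes the dot instead. The second attempt also needs a character after the match, so it misses hits at the end of a line.

```python
import re

# Translations of the two attempts (see the assumptions above).
first = re.compile(r"\b(\d+[ \t]?f{1,2})\.?")
second = re.compile(r"\b(\d+[ \t]?f{1,2})\.?(\W)")

def found(pattern, line):
    """Return the text of the first match in line, or None."""
    m = pattern.search(line)
    return m.group(0) if m else None

print(found(first, "00 faaa -"))   # '00 f'  (unwanted match)
print(found(second, "00f.aaa -"))  # '00f.'  (unwanted: \W took the dot)
print(found(second, "+ aaa 00f"))  # None    (missed: nothing follows the f)
```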
I prepared lines for testing. The lines containing a plus "+" should be found; the ones containing a minus "-" should not be found.
00f aaa +
00f. aaa +
00ff aaa +
00ff. aaa +
00 f aaa +
00 f. aaa +
00 ff aaa +
00 ff. aaa +
+ aaa 00f aaa +
+ aaa 00f. aaa +
+ aaa 00ff aaa +
+ aaa 00ff. aaa +
+ aaa 00 f aaa +
+ aaa 00 f. aaa +
+ aaa 00 ff aaa +
+ aaa 00 ff. aaa +
+ aaa 00f
+ aaa 00f.
+ aaa 00ff
+ aaa 00ff.
+ aaa 00 f
+ aaa 00 f.
+ aaa 00 ff
+ aaa 00 ff.
00 faaa -
00 f.aaa -
00 ffaaa -
00 ff.aaa -
00af aaa -
00af. aaa -
00aff aaa -
00aff. aaa -
- aaa 00 faaa -
- aaa 00 f.aaa -
- aaa 00 ffaaa -
- aaa 00 ff.aaa -
- aaa 00af aaa -
- aaa 00af. aaa -
- aaa 00aff aaa -
- aaa 00aff. aaa -
- aaa00f
- aaa00f.
- aaa00ff
- aaa00ff.
- aaa 00af
- aaa 00af.
- aaa 00aff
- aaa 00aff.
00faaa -
00f.aaa -
00ffaaa -
00ff.aaa -
00af aaa -
00af. aaa -
00aff aaa -
00aff. aaa -
- aaa00 faaa -
- aaa00 f.aaa -
- aaa00 ffaaa -
- aaa00 ff.aaa -
- aaa00af aaa -
- aaa00af. aaa -
- aaa00aff aaa -
- aaa00aff. aaa -
- aaa00af
- aaa00af.
- aaa00aff
- aaa00aff.
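The "+" / "-" convention of these lines makes it possible to check any candidate expression mechanically. A minimal sketch in Python, where the helper name `mismatches` is hypothetical and the sample pattern is only a translation of the first attempt (with `[ \t]` instead of `\h`):

```python
import re

def mismatches(pattern, lines):
    """Return the lines whose result disagrees with their +/- marker."""
    wrong = []
    for line in lines:
        expected = "+" in line  # "+" lines should match, "-" lines should not
        actual = pattern.search(line) is not None
        if actual != expected:
            wrong.append(line)
    return wrong

sample = ["00f aaa +", "+ aaa 00 ff.", "00 faaa -", "- aaa00f"]
attempt = re.compile(r"\b\d+[ \t]?f{1,2}\.?")
print(mismatches(attempt, sample))  # ['00 faaa -']
```

An empty list would mean that the expression agrees with every marker in the lines it was given.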
Further, the aim is to group the digits and the "f" cases in such a way that they can be used in a replacement expression to standardize the spelling to one of these forms:
- 123 ff. (with whitespace, with dot)
- 123 ff (with whitespace, without dot)
- 123ff. (without whitespace, with dot)
- 123ff (without whitespace, without dot)
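A replacement along these lines could be sketched in Python as follows. Everything here is an assumption for illustration: the function name `normalize`, its `space` and `dot` switches, the use of `[ \t]` in place of `\h`, and the choice to keep "f" and "ff" distinct rather than forcing "ff". Group 1 holds the digits and group 2 the "f" part; the whitespace and dot found in the input are discarded and re-emitted according to the chosen form.

```python
import re

# Assumed pattern: digits, optional space/tab, f or ff, optional dot,
# with no word character before and no word character or dot after.
PATTERN = re.compile(r"(?<!\w)(\d+)[ \t]?(ff?)(\.?)(?![\w.])")

def normalize(text, space=True, dot=True):
    """Rewrite each match as digits + optional space + f/ff + optional dot."""
    sep = " " if space else ""
    end = "." if dot else ""
    return PATTERN.sub(lambda m: f"{m.group(1)}{sep}{m.group(2)}{end}", text)

print(normalize("see 123ff and 45 f."))
# see 123 ff. and 45 f.
print(normalize("see 123ff and 45 f.", space=False, dot=False))
# see 123ff and 45f
```

Note that with `dot=False` a dot that also ends the sentence is removed, which may or may not be wanted.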