How to extract a floating number from a string [duplicate]
Z

7

161

I have a number of strings similar to "Current Level: 13.4 db." and I would like to extract just the floating-point number. I say floating-point rather than decimal because the value is sometimes a whole number. Can a regex do this, or is there a better way?

Zippy answered 16/1, 2011 at 2:11 Comment(2)
Will it always have an integer portion? Even if it's 0? Do you need to match 0.4 or .4?Schiff
I would say yes. Input is manually entered, so there is a chance of inconsistency.Zippy
U
323

If your float is always expressed in decimal notation, something like

>>> import re
>>> re.findall(r"\d+\.\d+", "Current Level: 13.4db.")
['13.4']

may suffice.

A more robust version would be:

>>> re.findall(r"[-+]?(?:\d*\.?\d+)", "Current Level: -13.2db or 14.2 or 3")
['-13.2', '14.2', '3']
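
Note that findall returns strings. If you need actual numbers, a minimal sketch along these lines may help. The names FLOAT_RE and extract_floats are made up for illustration, and the pattern is the same kind as above, written with `\.?` so that at most one decimal point is accepted:

```python
import re

# Hypothetical names; optional sign, optional integer part, at most one dot.
FLOAT_RE = re.compile(r"[-+]?(?:\d*\.?\d+)")

def extract_floats(text):
    """Return every number found in text, converted to a Python float."""
    return [float(match) for match in FLOAT_RE.findall(text)]

levels = extract_floats("Current Level: -13.2db or 14.2 or 3")
print(levels)  # [-13.2, 14.2, 3.0]
```

Whole numbers come back as floats too (3 becomes 3.0), which is usually what you want when the value is a level reading.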

If you want to validate user input, you could instead try converting each whitespace-separated token to a float directly:

user_input = "Current Level: 1e100 db"
for token in user_input.split():
    try:
        # if this succeeds, you have your (first) float
        print(float(token), "is a float")
    except ValueError:
        print(token, "is something else")

# => Would print ...
#
# Current is something else
# Level: is something else
# 1e+100 is a float
# db is something else
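
If you only need the first number, the same idea can be wrapped in a small function. This is just a sketch, and first_float is a hypothetical name. It assumes the number is separated from its neighbours by whitespace, so "13.4db" would not be recognised:

```python
def first_float(text):
    """Return the first whitespace-separated token that parses as a float, or None."""
    for token in text.split():
        try:
            return float(token)
        except ValueError:
            continue  # not a number, try the next token
    return None

print(first_float("Current Level: 1e100 db"))  # 1e+100
print(first_float("Current Level: 13.4db"))    # None, because "13.4db" is not a valid float
```

Keep in mind that float() also accepts tokens such as "nan" and "inf", so you may want to filter those out for user input.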
        
Umbilicus answered 16/1, 2011 at 2:16 Comment(9)
it is not always a decimal number.Zippy
re.findall(r"[-+]?\d*\.*\d+", "Current Level: -13.2 db or 14.2 or 3") ['-13.2', '14.2', '3']Specter
how would you handle something like "level:12,25;" ?Harryharsh
what does [-+]? in the code do?Lepidopterous
@Lepidopterous extracts + or - if present.Pion
@Specter looks fine but r"[-+]?\d*\.?\d+" is a little more concise and will not accept 0..4Dominions
that will miss negative integers "-35 um". Should alternation have [-+]? at the beginning: #"[-+]?\d*\.\d+|[-+]?\d+"Sperrylite
Missing thousands separators and scientific notation; a better answer is available on the page.Koren
Thank you! Do you have any idea how to include numbers like this: 1,075.01?Damiendamietta
C
90

You may like to try something like this which covers all the bases, including not relying on whitespace after the number:

>>> import re
>>> numeric_const_pattern = r"""
...     [-+]? # optional sign
...     (?:
...         (?: \d* \. \d+ ) # .1 .12 .123 etc 9.1 etc 98.1 etc
...         |
...         (?: \d+ \.? ) # 1. 12. 123. etc 1 12 123 etc
...     )
...     # followed by optional exponent part if desired
...     (?: [Ee] [+-]? \d+ ) ?
...     """
>>> rx = re.compile(numeric_const_pattern, re.VERBOSE)
>>> rx.findall(".1 .12 9.1 98.1 1. 12. 1 12")
['.1', '.12', '9.1', '98.1', '1.', '12.', '1', '12']
>>> rx.findall("-1 +1 2e9 +2E+09 -2e-9")
['-1', '+1', '2e9', '+2E+09', '-2e-9']
>>> rx.findall("current level: -2.03e+99db")
['-2.03e+99']
>>>

For easy copy-pasting:

import re
numeric_const_pattern = r'[-+]? (?: (?: \d* \. \d+ ) | (?: \d+ \.? ) )(?: [Ee] [+-]? \d+ ) ?'
rx = re.compile(numeric_const_pattern, re.VERBOSE)
rx.findall("Some example: Jr. it. was .23 between 2.3 and 42.31 seconds")
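
For reference, here is a sketch of what that last call should give, plus conversion of the matches to float. It rebuilds the same verbose pattern as a raw string; the variable names text and values are only for illustration:

```python
import re

numeric_const_pattern = r"""
    [-+]?                     # optional sign
    (?:
        (?: \d* \. \d+ )      # .1, 9.1, 98.1
        |
        (?: \d+ \.? )         # 1., 12., 1, 12
    )
    (?: [Ee] [+-]? \d+ )?     # optional exponent
"""
rx = re.compile(numeric_const_pattern, re.VERBOSE)

text = "Some example: Jr. it. was .23 between 2.3 and 42.31 seconds"
print(rx.findall(text))  # ['.23', '2.3', '42.31']

# Every string this pattern matches is also valid input for float().
values = [float(m.group(0)) for m in rx.finditer(text)]
print(values)  # [0.23, 2.3, 42.31]
```

The periods after "Jr" and "it" are skipped because both alternatives require at least one digit.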
Cumquat answered 16/1, 2011 at 2:46 Comment(4)
Very good! Finally I've found a really good pattern!Medicate
Yes, best pattern ever for numbers. Thanks a lot!Salomie
Adding (?:\+\s*|\-\s*)? at the front would also allow for a space between the sign and the number. Even though I admit this is probably not very "standard" I have seen this pattern "floating around" in some files.Naturalist
You probably need an r in front of the pattern string in the very last snippet.Gramarye
W
37

The Python docs have an answer that covers +/- signs and exponent notation:

scanf() Token      Regular Expression
%e, %E, %f, %g     [-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?
%i                 [-+]?(0[xX][\dA-Fa-f]+|0[0-7]*|\d+)

This regular expression does not support international formats where a comma is used as the separator character between the whole and fractional part (3,14159). In that case, replace all \. with [.,] in the above float regex.

                        Regular Expression
International float     [-+]?(\d+([.,]\d*)?|[.,]\d+)([eE][-+]?\d+)?
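
One thing to watch for: these patterns use capturing groups, so re.findall returns tuples of group contents rather than the whole match. A possible way around that, sketched here with hypothetical names (INTL_FLOAT, extract_intl_floats), is to iterate with finditer and take group(0). The sketch assumes a comma is always a decimal mark and never a thousands separator:

```python
import re

# Pattern copied from the table above; its groups are capturing.
INTL_FLOAT = re.compile(r"[-+]?(\d+([.,]\d*)?|[.,]\d+)([eE][-+]?\d+)?")

def extract_intl_floats(text):
    """Find numbers written with '.' or ',' as the decimal mark."""
    results = []
    for m in INTL_FLOAT.finditer(text):
        # group(0) is the full match; normalise the comma before converting.
        results.append(float(m.group(0).replace(",", ".")))
    return results

print(INTL_FLOAT.findall("3,14"))  # [('3,14', ',14', '')], tuples, not whole matches
print(extract_intl_floats("pi is 3,14159 and e is 2.71828"))  # [3.14159, 2.71828]
```

With this assumption, an input like 1,000.5 would be split into two numbers, so it is not suitable for text that mixes both conventions.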
Whopping answered 14/8, 2013 at 17:7 Comment(0)
A
12
re.findall(r"[-+]?\d*\.?\d+|\d+", "Current Level: -13.2 db or 14.2 or 3")

as described above, works really well! One suggestion though:

re.findall(r"[-+]?\d*\.?\d+|[-+]?\d+", "Current Level: -13.2 db or 14.2 or 3 or -3")

will also return negative int values (like -3 at the end of this string).
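
As a quick check (a sketch, not part of the original answer), the two patterns can be compared on the same input. Because `\d*` and `\.?` are both optional, the first alternative `[-+]?\d*\.?\d+` already matches signed integers, so on this input both patterns appear to return the same list:

```python
import re

sample = "Current Level: -13.2 db or 14.2 or 3 or -3"

# The two patterns from this answer, unchanged.
first = re.findall(r"[-+]?\d*\.?\d+|\d+", sample)
second = re.findall(r"[-+]?\d*\.?\d+|[-+]?\d+", sample)

print(first)   # ['-13.2', '14.2', '3', '-3']
print(second)  # ['-13.2', '14.2', '3', '-3']
```

The explicit sign in the second alternative does no harm, but the simpler `[-+]?\d*\.?\d+` on its own seems to be enough here.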

Americanist answered 25/11, 2011 at 10:24 Comment(0)
R
7

You can use the following regex to get integer and floating values from a string:

re.findall(r'[\d\.\d]+', 'hello -34 42 +34.478m 88 cricket -44.3')

['34', '42', '34.478', '88', '44.3']

Thanks Rex

Resinoid answered 21/5, 2014 at 15:14 Comment(1)
This regex will also find non-numeric combinations of periods and digits: '.... 1.2.3.4 ..56..' yields: ['....', '1.2.3.4', '..56..']Raccoon
C
3

I think you'll find useful material in an answer I wrote for a similar earlier question:

https://mcmap.net/q/151935/-regular-expression-to-match-numbers-with-or-without-commas-and-decimals-in-text/551449

In that answer, I proposed a regex pattern intended to catch any kind of number. I have nothing to add to it, and I think it is fairly complete.

Cockroach answered 25/11, 2011 at 11:8 Comment(0)
B
2

Another approach that may be more readable is simple type conversion. I've added a replacement function to cover instances where people may enter European decimals:

>>> for possibility in "Current Level: -13.2 db or 14,2 or 3".split():
...     try:
...         str(float(possibility.replace(',', '.')))
...     except ValueError:
...         pass
'-13.2'
'14.2'
'3.0'

This has disadvantages too, however. If someone types in "1,000", it will be converted to 1.0 rather than one thousand. It also assumes that people put whitespace between words, which is not the case in some languages, such as Chinese.
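
One possible way to soften the "1,000" problem is a heuristic, sketched below with hypothetical names (THOUSANDS, parse_number). It assumes that a comma followed by exactly three digits is a thousands separator and that any other comma is a decimal mark. That assumption is wrong for European input such as "1,234" meant as 1.234, so treat it as an illustration only:

```python
import re

# Assumption: commas followed by exactly three digits group thousands
# ("1,000", "1,075.01"); any other comma is a decimal mark ("14,2").
THOUSANDS = re.compile(r"[-+]?\d{1,3}(?:,\d{3})+(?:\.\d+)?")

def parse_number(token):
    """Convert one token to float, or return None if it is not a number."""
    if THOUSANDS.fullmatch(token):
        cleaned = token.replace(",", "")   # drop group separators
    else:
        cleaned = token.replace(",", ".")  # treat comma as decimal mark
    try:
        return float(cleaned)
    except ValueError:
        return None

tokens = "Current Level: -13.2 db or 14,2 or 1,000 or 1,075.01".split()
numbers = [n for n in map(parse_number, tokens) if n is not None]
print(numbers)  # [-13.2, 14.2, 1000.0, 1075.01]
```

It still relies on whitespace between tokens, so the other disadvantage remains.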

Borscht answered 16/1, 2011 at 2:40 Comment(2)
"4x size AAA 1.5V batteries included" :-)Cumquat
Those terrible users! Always entering silly data. TBH, I've intentionally kept this example demonstrative rather than robust. When I began writing this response, @The MYYN had only provided regular expressions in the accepted answer. I wanted to provide an example of another way to go about things.Borscht

© 2022 - 2025 — McMap. All rights reserved.