How to get bc to handle numbers in scientific (aka exponential) notation?

bc doesn't like numbers expressed in scientific notation (aka exponential notation).

$ echo "3.1e1*2" | bc -l
(standard_in) 1: parse error

but I need to use it to handle a few records that are expressed in this notation. Is there a way to get bc to understand exponential notation? If not, what can I do to translate them into a format that bc will understand?

Intromit answered 14/10, 2012 at 13:19 Comment(0)
I
45

Unfortunately, bc doesn't support scientific notation.

However, it can be translated into a format that bc can handle, using a POSIX extended regular expression (ERE) in sed:

sed -E 's/([+-]?[0-9.]+)[eE]\+?(-?)([0-9]+)/(\1*10^\2\3)/g' <<<"$value"

This replaces the "e" (or "e+", if the exponent is positive) with "*10^", which bc readily understands. It works even if the exponent is negative or if the number is subsequently multiplied by another power, and it preserves the significant digits.

If you need to stick to basic regex (BRE), then this should be used:

sed 's/\([+-]\{0,1\}[0-9]*\.\{0,1\}[0-9]\{1,\}\)[eE]+\{0,1\}\(-\{0,1\}\)\([0-9]\{1,\}\)/(\1*10^\2\3)/g' <<<"$value"
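
As a rough sketch of how the conversion might be packaged, the ERE substitution above can be wrapped in a shell function. The name sci2bc is hypothetical, and printf with a pipe stands in for the here-string so that the function also runs in plain sh; a sed that accepts -E is assumed.

```shell
# sci2bc (hypothetical name): rewrite each number in scientific notation
# as a parenthesized bc expression, using the same ERE as above.
sci2bc() {
    printf '%s\n' "$1" |
        sed -E 's/([+-]?[0-9.]+)[eE]\+?(-?)([0-9]+)/(\1*10^\2\3)/g'
}

sci2bc '3.1e1*2'             # -> (3.1*10^1)*2
sci2bc '7.11e-2 + 323E+34'   # -> (7.11*10^-2) + (323*10^34)

# The converted text can then be handed to bc, for example:
#   sci2bc '3.1e1*2' | bc -l
```

Input without an exponent passes through unchanged, so the function can be applied to mixed expressions.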

From Comments:

  • A simple bash pattern match cannot work (thanks @mklement0), as there is no way to match an e+ and keep the - from an e- at the same time.

  • A correctly working perl solution (thanks @mklement0)

    $ perl -pe 's/([-\d.]+)e(?:\+|(-))?(\d+)/($1*10^$2$3)/gi' <<<"$value"
    
  • Thanks to @jwpat7 and @Paul Tomblin for clarifying aspects of sed's syntax, as well as @isaac and @mklement0 for improving the answer.

Edit:

The answer changed quite a bit over the years. The answer above is the latest iteration as of 17th May 2018. Previous attempts reported here were a solution in pure bash (by @ormaaj) and one in sed (by @me), that fail in at least some cases. I'll keep them here just to make sense of the comments, which contain much nicer explanations of the intricacies of all this than this answer does.

value=${value/[eE]+*/*10^}  ------> Can not work.
value=`echo ${value} | sed -e 's/[eE]+*/\\*10\\^/'` ------> Fail in some conditions
Intromit answered 14/10, 2012 at 13:19 Comment(2)
Two successive bash substitutions will work (i.e. v=${v/e/*10^}; v=${v/^+/^}), provided the result isn't used in an expression with higher precedence than *. - Kaltman
It may be helpful to mention that when the exponent is negative, one has to specify the scale in bc; otherwise one may get an unexpected 0. - Slingshot
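
A brief illustration of the scale remark, as a sketch: without -l, bc starts with scale=0, so a negative power of ten is truncated to zero before the multiplication happens. Setting scale explicitly avoids this (20 here is an arbitrary choice).

```shell
# With the default scale of 0, 10^-2 is truncated to 0, so the product is 0.
echo '3.1*10^-2' | bc

# With an explicit scale, the negative exponent keeps its fractional digits.
echo 'scale=20; 3.1*10^-2' | bc
```

Running bc -l has a similar effect, since it sets scale to 20 on startup.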
K
24

Let me try to summarize the existing answers, with comments on each below:

  • (a) If you indeed need to use bc for arbitrary-precision calculations - as the OP does - use the OP's own clever approach, which textually reformats the scientific notation to an equivalent expression that bc understands.

  • If potentially losing precision is not a concern,

    • (b) consider using awk or perl as bc alternatives; both natively understand scientific notation, as demonstrated in jwpat7's answer for awk.
    • (c) consider using printf '%.<precision>f' to simply textually convert to regular floating point representation (decimal fractions, without the e/E) (a solution proposed in a since-deleted post by ormaaj).

(a) Reformatting scientific notation to an equivalent bc expression

The advantage of this solution is that precision is preserved: the textual representation is transformed into an equivalent textual representation that bc can understand, and bc itself is capable of arbitrary-precision calculations.

See the OP's own answer, whose updated form is now capable of transforming an entire expression containing multiple numbers in exponential notation into an equivalent bc expression.


(b) Using awk or perl instead of bc as the calculator

Note: The following approaches assume use of the built-in support for double-precision floating-point values in awk and perl. As is inherent in floating-point arithmetic,
"given any fixed number of bits, most calculations with real numbers will produce quantities that cannot be exactly represented using that many bits. Therefore the result of a floating-point calculation must often be rounded in order to fit back into its finite representation. This rounding error is the characteristic feature of floating-point computation." (http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html)

That said,

awk

awk natively understands decimal exponential (scientific) notation.
(You should generally only use decimal representation, because awk implementations differ with respect to whether they support number literals with other bases.)

awk 'BEGIN { print 3.1e1 * 2 }'  # -> 62

If you use the default print function, the OFMT variable controls the output format by way of a printf format string; the (POSIX-mandated) default is %.6g, meaning 6 significant digits, which notably includes the digits in the integer part.
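
A small sketch of the effect, assuming a POSIX awk: with the default OFMT, a value with seven integer digits already exceeds the six significant digits and is printed in exponential form, while a wider OFMT prints it in full (%.10g is just an illustrative choice).

```shell
# Default OFMT (%.6g): 6 significant digits in total
awk 'BEGIN { print 1234567.891 }'                    # -> 1.23457e+06

# Wider OFMT (illustrative value): 10 significant digits
awk 'BEGIN { OFMT = "%.10g"; print 1234567.891 }'    # -> 1234567.891
```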

Note that if the number in scientific notation is supplied as input (as opposed to being a literal part of the awk program), you must add +0 to force it to the default output format, if used by itself with print:

awk '{ print $1+0 }' <<<'3.1e1' # -> 31; without `+0`, output would be the same as input

Depending on your locale and the awk implementation you use, you may have to replace the decimal point (.) with the locale-appropriate radix character, such as , in a German locale; this applies to BSD awk, mawk, and to GNU awk with the --posix option.

Modifying variable OFMT changes the default output format (for numbers with fractional parts; (effective) integers are always output as such).
Alternatively, use the printf function with an explicit output format:

awk 'BEGIN { printf "%.4f", 3.1e1 * 2.1234 }' # -> 65.8254

Perl

perl too natively understands decimal exponential (scientific) notation.

Note: Perl, unlike awk, isn't available on all POSIX-like platforms by default; furthermore, it's not as lightweight as awk.
However, it offers more features than awk, such as natively understanding hexadecimal and octal integers.

perl -le 'print 3.1e1 * 2'  # -> 62

I'm unclear on what Perl's default output format is, but it appears to be %.15g. As with awk, you can use printf to choose the desired output format:

perl -e 'printf "%.4f\n", 3.1e1 * 2.1234' # -> 65.8254

(c) Using printf to convert scientific notation to decimal fractions

If you simply want to convert scientific notation (e.g., 1.2e-2) into a decimal fraction (e.g., 0.012), printf '%f' can do that for you. Note that you'll convert one textual representation into another via floating-point arithmetic, which is subject to the same rounding errors as the awk and perl approaches.

printf '%.4f' '1.2e-2' # -> '0.0120'; `.4` specifies 4 decimal digits.
Kneepad answered 4/3, 2015 at 3:10 Comment(1)
Use Perl6/Raku with rational number arithmetic better than any language around today, quora.com/What-can-Perl-6-do-that-Python-cannot. - Nympha
B
12

One can use awk for this; for example,

awk '{ print +$1, +$2, +$3 }' <<< '12345678e-6 0.0314159e2 54321e+13'

produces (via awk's default format %.6g) output like
12.3457 3.14159 543210000000000000
while commands like the following two produce the output shown after each, given that file edata contains data as shown later.

$ awk '{for(i=1;i<=NF;++i)printf"%.13g ",+$i; printf"\n"}' < edata
31 0.0312 314.15 0 
123000 3.1415965 7 0.04343 0 0.1 
1234567890000 -56.789 -30 

$ awk '{for(i=1;i<=NF;++i)printf"%9.13g ",+$i; printf"\n"}' < edata
       31    0.0312    314.15         0 
   123000 3.1415965         7   0.04343         0       0.1 
1234567890000   -56.789       -30 


$ cat edata 
3.1e1 3.12e-2 3.1415e+2 xyz
123e3 0.031415965e2 7 .4343e-1 0e+0 1e-1
.123456789e13 -56789e-3 -30

Also, regarding solutions using sed, it probably is better to delete the plus sign in forms like 45e+3 at the same time as the e, via regex [eE]+*, rather than in a separate sed expression. For example, on my linux machine with GNU sed version 4.2.1 and bash version 4.2.24, commands
sed 's/[eE]+*/*10^/g' <<< '7.11e-2 + 323e+34'
sed 's/[eE]+*/*10^/g' <<< '7.11e-2 + 323e+34' | bc -l
produce output
7.11*10^-2 + 323*10^34
3230000000000000000000000000000000000.07110000000000000000

Bobbybobbye answered 14/10, 2012 at 16:15 Comment(3)
uhm, so awk handles significant digits correctly. That is interesting. The only drawback I can see is that this way you have to set a maximum precision to your numbers, which if exceeded would make the script not work properly. If there was a way to force awk to use arbitrary precision it would be perfect. I like better your version of the sed command rather than my own, I forgot about the possibilities of *. - Intromit
@Ferdinando, yes, awk has the drawbacks you mention, and its real numbers typically are doubles with 16 digit resolution; for example, awk '{printf"%.40g",+$1}' <<< 12345678901234567891234567890123456e-20 produces 123456789012.345672607421875 - Bobbybobbye
Great alternative to bc, if potentially losing precision is not a concern; Note that the portable way to force something into a number in awk is to append +0, not to prepend +. For instance, while awk '{ print +$1 }' <<<1e-1 works fine in mawk and gawk (outputs 0.1), it does not in BSD awk (as used on OS X; outputs the input unmodified). By contrast, awk '{ print $1+0 }' <<<1e-1 should work with all awk implementations. - Kneepad
C
10

You can also define a bash function which calls awk (a good name would be the equal sign "="):

= ()
{
    local in="$(echo "$@" | sed -e 's/\[/(/g' -e 's/\]/)/g')";
    awk -v CONVFMT=%.15g 'BEGIN {print '"$in"' ""}' < /dev/null
}

Then you can use all types of floating point math in the shell. Note that square brackets are used here instead of round brackets, since the latter would have to be protected from bash by quotes.

> = 1+sin[3.14159] + log[1.5] - atan2[1,2] - 1e5 + 3e-10
-99999.058179847

Or in a script to assign the result

a=$(= 1+sin[4])
echo $a   # 0.243197504692072 (0.243198 with awk's default CONVFMT)
Cogitative answered 8/10, 2013 at 15:25 Comment(4)
I like this solution very much, provided I don't find any pitfalls. I have to do basic arithmetic with scientific notation so often and this works a charm so far. For now I have defined your function in my bash_profile and named it scmath. Using the = symbol seems a bit dangerous to me - Alesandrini
I might be wrong here, but the answer to 1+sin[3.14159] + log[1.5] - atan2[1,2] - 1e5 + 3e-10 is for sure not 0.94182 - Rebate
@jo-h python3 -c "from math import *;print(1+sin(3.14159) + log(1.5) - atan2(1,2) - 1e5 + 3e-10)" returns -99999.058179847 and after executing your = function on my Linux it returns -99999.1 (but not 0.94182). I conclude your = function is correct, congratulations :+1: ! - Bac
I like it. Note shell metacharacters like * ( ) MAY need to be escaped or the whole expression quoted. Especially if there are spaces in the expression. 'x' is a non-metachar that does multiplication substituting for *. - Stupidity
T
4

Luckily there is printf, which does the formatting job:

The above example:

printf "%.12f * 2\n" 3.1e1 | bc -l

Or a float comparison:

n=8.1457413437133669e-02
m=8.1456839223809765e-02

n2=$(printf "%.12f" "$n")
m2=$(printf "%.12f" "$m")

if [ "$(echo "$n2 > $m2" | bc -l)" = 1 ]; then
   echo "n is bigger"
else
   echo "m is bigger"
fi
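
If double precision is enough for the comparison, a possible alternative (only a sketch, with the made-up helper name fgt) is to let awk compare the values directly, which avoids the fixed number of decimals in the printf step:

```shell
# fgt (hypothetical helper): succeed if $1 > $2, comparing as doubles.
# awk parses scientific notation itself; +0 forces a numeric comparison.
fgt() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !(a + 0 > b + 0) }'
}

n=8.1457413437133669e-02
m=8.1456839223809765e-02

if fgt "$n" "$m"; then
    echo "n is bigger"
else
    echo "n is not bigger"
fi
```

The exit status carries the result, so the function can be used directly in if or while conditions.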
Tmesis answered 31/3, 2016 at 13:30 Comment(1)
This works, but will fail on small numbers (10^-12) - Rebate
H
1

Piping version of the OP's accepted answer

$ echo 3.82955e-5 | sed 's/[eE]+*/\*10\^/'
3.82955*10^-5

Piping the input to the OP's accepted sed command gave extra backslashes, like

$ echo 3.82955e-5 | sed 's/[eE]+*/\\*10\\^/'
3.82955\*10\^-5
Hemichordate answered 13/4, 2018 at 8:57 Comment(0)
S
1

I managed to do it with a little hack. You can do something like this -

scientific='4.8844221e+002'
base=$(echo $scientific | cut -d 'e' -f1)
exp=$(($(echo $scientific | cut -d 'e' -f2)*1))
converted=$(bc -l <<< "$base*(10^$exp)")
echo $converted 
>> 488.4422100
Semasiology answered 20/10, 2018 at 16:14 Comment(0)
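
A caveat on the cut-based hack above: bash arithmetic treats a number with a leading zero as octal, so an exponent such as +008 makes the $((...)) step fail. A possible variant, sketched below with the hypothetical function name sci_to_expr, splits the string with parameter expansion and strips the sign and leading zeros by hand. It assumes the input always contains an e or E.

```shell
# sci_to_expr (hypothetical name): turn e.g. 4.8844221e+008 into 4.8844221*10^8.
# The body runs in a subshell ( ... ) so the helper variables stay local.
sci_to_expr() (
    mant=${1%[eE]*}          # text before the e/E
    expo=${1#*[eE]}          # text after the e/E, e.g. +008 or -03
    sign=
    case $expo in
        -*) sign=- ;;
    esac
    expo=${expo#[+-]}        # drop the explicit sign
    # drop leading zeros, but keep a lone 0
    while [ "${#expo}" -gt 1 ] && [ "${expo#0}" != "$expo" ]; do
        expo=${expo#0}
    done
    printf '%s*10^%s%s\n' "$mant" "$sign" "$expo"
)

sci_to_expr '4.8844221e+002'    # -> 4.8844221*10^2
sci_to_expr '4.8844221e+008'    # -> 4.8844221*10^8
sci_to_expr '1.5E-03'           # -> 1.5*10^-3

# The result can be piped to bc, for example:
#   sci_to_expr '4.8844221e+008' | bc -l
```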
U
0

Try this (found in an example of CFD input data that is processed with m4):

T0=4e-5
deltaT=2e-6
m4 <<< "esyscmd(perl -e 'printf (${T0} + ${deltaT})')"
Unscrew answered 15/11, 2013 at 12:44 Comment(0)
S
0

Try this: (using bash)

printf "scale=20\n0.17879D-13\n" | sed -e 's/D/*10^/' | bc

or this:

 num="0.17879D-13"; convert="`printf \"scale=20\n$num\n\" | sed -e 's/D/*10^/' | bc`" ; echo $convert
.00000000000001787900
num="1230.17879"; convert="`printf \"scale=20\n$num\n\" | sed -e 's/D/*10^/' | bc`" ; echo $convert
1230.17879

If you have positive exponents you should use this:

num="0.17879D+13"; convert="`printf \"scale=20\n$num\n\" | sed -e 's/D+/*10^/' -e 's/D/*10^/' | bc`" ; echo $convert
1787900000000.00000

That last one would handle every number thrown at it. You can adapt the sed command if you have numbers with 'e' or 'E' as the exponent marker.

You get to choose the scale you want.
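
Along those lines, a single substitution could cover d, D, e and E exponents with or without a plus sign. This is only a sketch: the function name fexp2bc is made up, and it assumes each input holds one bare number, since the pattern would also rewrite any other d or e in the text.

```shell
# fexp2bc (hypothetical name): replace the exponent marker, and an optional
# plus sign after it, with *10^ so that bc can evaluate the number.
fexp2bc() {
    printf '%s\n' "$1" | sed 's/[dDeE]+\{0,1\}/*10^/'
}

fexp2bc '0.17879D-13'    # -> 0.17879*10^-13
fexp2bc '0.17879D+13'    # -> 0.17879*10^13
fexp2bc '1230.17879'     # -> 1230.17879

# To evaluate with a chosen scale, for example:
#   { echo 'scale=20'; fexp2bc '0.17879D-13'; } | bc
```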

Steeplejack answered 23/10, 2014 at 18:19 Comment(0)
R
0

From a floating point perspective, there is a difference between the scientific representation (e.g. 1.1E2), and its seemingly equivalent numeric base-10 computation (1.1*10^2). The main reason is that some numbers cannot be accurately represented as a binary number. Hence, floating point errors will be introduced in the computation (see Is floating point math broken?)

$ awk 'BEGIN{OFMT="%.17f"; print 1.1e2; print 1.1*10**2}'
110
110.00000000000001421

The solution would then be to change the format of the floating point number and not to convert it into a computation. As mentioned in other posts, printf is the solution here, however one has to be careful with small and big numbers as the example shows (based on this):

$ v=3.2e-3
$ printf -- "%.12f" "$v"
0.003200000000
$ v=3.2e-13
$ printf -- "%.12f" "$v"
0.000000000000

So it would be nice to transfer the information of the exponent to printf by defining the precision as an argument. The following conversion does this

$ printf -- "%.*f" $((17-${v#*[eE]})) "$v"

This takes into account that you need 17 digits precision to represent a double-precision floating point number accurately and it exploits the fact that printf converts a negative precision into a default precision. Here are some examples:

for v in 1.2345678901234567e{-2,+2,-10,+10,-20,+20}; do 
   printf -- "%.*f\n" $((17-${v#*[eE]})) "${v}"
done
0.0123456789012345670
123.456789012345670
0.000000000123456789012345670
12345678901.2345670
0.0000000000000000000123456789012345670
123456789012345670000.000000
Rebate answered 13/6, 2022 at 13:3 Comment(0)
B
0

Here is my little perlCalc bash function:

perlCalc ()
{
    set -- ${@/^/**}
    set -- ${@/[/(}
    set -- ${@/]/)}
    \perl -le "print $*"
}

Example:

$ perlCalc 1+sin[3.14159] + log[1.5] - atan2[1,2] - 1e5 + 3e-10
-99999.058179847

It returns the same result as python3:

$ python3 -c "from math import *;print(1+sin(3.14159) + log(1.5) - atan2(1,2) - 1e5 + 3e-10)"
-99999.058179847
Bac answered 18/8, 2022 at 22:19 Comment(0)
