Calculating rounded percentage in Shell Script without using "bc"
A

6

20

I'm trying to calculate percentage of certain items in Shell Script. I would like to round off the value, that is, if the result is 59.5, I should expect 60 and not 59.

item=30
total=70
percent=$((100*$item/$total))

echo $percent

This gives 42.

But actually, the result is 42.8 and I would like to round it off to 43. "bc" does the trick; is there a way without using "bc"?

I'm not authorized to install any new packages. "dc" and "bc" are not present on my system. It should be purely Shell; I cannot use perl or python scripts either.

Agribusiness answered 18/6, 2014 at 11:34 Comment(2)
You can use dc instead. Bash only supports integer arithmetic (true for most shells)Hydantoin
Possible duplicate of How do I use floating-point division in bash?Suzannesuzerain
C
25

Use AWK (no bash-isms):

item=30
total=70
percent=$(awk "BEGIN { pc=100*${item}/${total}; i=int(pc); print (pc-i<0.5)?i:i+1 }")

echo $percent
43
Comorin answered 19/6, 2014 at 5:16 Comment(6)
While it's good to have an awk alternative - even though it contradicts the "purely shell" premise of the question - it is ill-advised to use a double-quoted string with shell-variable expansion as the awk script, because it leads to confusion over what is expanded by the shell up front vs. what awk interprets later. The cleaner solution is to use a single-quoted awk script to which (shell variable) values are passed with awk's -v option. Also, given that floating-point arithmetic is used anyway, using awk's printf function with format string %.0f is simpler.Lexi
@mklelement - using -v is more acceptable and cleaner.... but in this case, unnecessary. True, I did use the loophole that he did not mention awk (only python and perl were listed as unacceptable)... I provided the shell examples later. ;)Comorin
There's nothing to be gained from not using -v in this case - except promoting ill-advised practices, which includes implementing custom rounding after having performed floating-point arithmetic already. As for the loophole: it's perfectly fine to provide alternative solutions, as long as they're declared as such; even though the question doesn't explicitly rule out awk, it does ask for a "purely Shell" solution, so that part of my comment was simply meant to make it explicit that this solution doesn't qualify as such (while potentially still having value) - no more and no less.Lexi
@Lexi - Well... actually... to me there was a gain... I initially used -v, but that made the script extend enough to the right that I wanted to compress the line a bit... I still also maintain that using awk here is probably a good idea to look at... I presume the "user" thought it was cool too and accepted the answer -- thinking it better than the (much faster) "pure shell" answers I submitted a bit later... so, all is cool... :) Happy I could help "user."Comorin
Again, my intent was not to discount an awk answer, but to make it explicit - again, as guidance to future readers - that this answer is not for someone looking for a pure shell solution (for whatever reason). Clearly, at the very least 5-6 people have found value in your answer, and that's great.Lexi
As for the space issue: it's a great idea to avoid horizontal scrolling - there are far too many answers here that try to cram solutions into a single, scrolling line. However, I suggest not letting space concerns guide what solution to offer (my comment was about the fundamental approach); POSIX-like shells support multi-line strings, allowing you to easily spread an awk script across multiple lines for readability.Lexi
C
11

Doubling the original percent calculation and taking that result modulo 2 provides the increment (0 or 1) for rounding.

item=30
total=70
percent=$((200*$item/$total % 2 + 100*$item/$total))

echo $percent
43

(tested with bash, ash, dash and ksh)
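
To see the increment at work, the two terms can be computed separately for a few inputs. This is only a sketch: the helper name round_pct_mod2 is made up for illustration, and it assumes a non-negative item and a positive total.

```shell
#!/bin/sh
# Hypothetical helper (not from the answer): rounds 100*item/total half up
# using the "modulo 2" increment. Assumes $1 >= 0 and $2 > 0.
round_pct_mod2() {
    trunc=$(( 100 * $1 / $2 ))        # truncated percentage
    incr=$(( 200 * $1 / $2 % 2 ))     # 1 if the fractional part is >= 0.5, else 0
    echo $(( trunc + incr ))
}

round_pct_mod2 30 70   # 42.857... -> 43
round_pct_mod2 1 3     # 33.333... -> 33
round_pct_mod2 1 8     # 12.5      -> 13
```

The doubled value is odd exactly when the fractional part of the percentage is at least one half, which is why its remainder modulo 2 can serve as the increment.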

This is a faster implementation than spawning an AWK process for every calculation:

$ pa() { for i in `seq 0 1000`; do pc=$(awk "BEGIN { pc=100*${item}/${total}; i=int(pc); print (pc-i<0.5)?i:i+1 }"); done; }
$ time pa

real    0m24.686s
user    0m0.376s
sys     0m22.828s

$ pb() { for i in `seq 0 1000`; do pc=$((200*$item/$total % 2 + 100*$item/$total)); done; }
$ time pb

real    0m0.035s
user    0m0.000s
sys     0m0.012s
Comorin answered 29/10, 2015 at 8:21 Comment(5)
Not sure why people are so obsessed with benchmarking the built-ins. If code is going to be slow, it will be because of the developer or the algorithm, not this. Right?Grannia
The original answer that I wrote was in awk... which is good because once we start calculating in floating point, awk starts to look like the best tool to get the job done. The point of this particular answer was to show that if this is the limit of the math that is needed in our script, rounding can also be calculated faster with integer math in shell.Comorin
It should be pointed out that "integer math" here means integer arithmetic with truncation toward 0, as defined in the ISO C standard. This code does not show that we can compute a percentage using ordinary integer math with no truncation, only modulo, etc. Of course we can, but that's not what this code does. It uses the truncation toward 0 that is done by the shell.Agency
@Agency -- Good point... But I'm focusing these answers on the practical for the questioner (who was just frustrated with not being able to do floating point math)... with various levels of readability, verbosity and speed. awk is nice if the questioner wants to adapt by using floating point and a fast flexible language. If on the other hand the questioner wants to use shell exclusively and wants a really fast solution... he can likely get by using integer math operations.Comorin
Truncation toward 0 is not a high level concept. It is hidden behind the casting of floats to integers. It's not so natural, because otherwise it would be offered at a high level together with floor and ceil. It's because it's a bit weird from a high level perspective that it's only in 1999 that it became a standard for casting. The idea of using this to define a high level concept might seem like an unreliable hack to many. I would not say that these people are not practical. So, it's important to emphasize that it's a standard.Agency
L
7

A POSIX-compliant shell is only required to support integer arithmetic in the shell language ("only signed long integer arithmetic is required"), so a pure shell solution must emulate floating-point arithmetic[1]:

item=30
total=70

percent=$(( 100 * item / total + (1000 * item / total % 10 >= 5 ? 1 : 0) ))
  • 100 * item / total yields the truncated result of the integer division as a percentage.
  • 1000 * item / total % 10 >= 5 ? 1 : 0 calculates the 1st decimal place, and if it is equal to or greater than 5, adds 1 to the integer result in order to round it up.
  • Note how there's no need to prefix variable references with $ inside an arithmetic expansion $((...)).
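
The expression above can be wrapped in a function and tried on boundary values. This is a sketch only: the name round_pct is invented here, and it assumes a non-negative item and a positive total (negative values would need the sign handled separately).

```shell
#!/bin/sh
# Hypothetical wrapper around the expression above.
# Assumes $1 >= 0 and $2 > 0.
round_pct() {
    echo $(( 100 * $1 / $2 + (1000 * $1 / $2 % 10 >= 5 ? 1 : 0) ))
}

round_pct 30 70     # 42.857... -> 43
round_pct 17 40     # 42.5      -> 43
round_pct 169 400   # 42.25     -> 42
```

The last case shows that only the first decimal place is inspected: 42.25 has a first decimal of 2, so no increment is added.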

If - in contradiction to the premise of the question - use of external utilities is acceptable:


  • awk offers a simple solution, which, however, comes with the caveat that it uses true double-precision binary floating point values and may therefore yield unexpected results in decimal representation (e.g., try printf '%.0f\n' 28.5, which typically yields 28 rather than the expected 29, because an exact tie is rounded to the nearest even integer):
awk -v item=30 -v total=70 'BEGIN { printf "%.0f\n", 100 * item / total }'
  • Note how -v is used to define variables for the awk script, which allows for a clean separation between the single-quoted and therefore literal awk script and any values passed to it from the shell.

  • By contrast, even though bc is a POSIX utility (and can therefore be expected to be present on most Unix-like platforms) and performs arbitrary-precision arithmetic, it invariably truncates the results, so that rounding must be performed by another utility; printf, however, even though it is a POSIX utility in principle, is not required to support floating-point format specifiers (such as used inside awk above), so the following may or may not work (and is not worth the trouble, given the simpler awk solution, and given that precision problems due to floating-point arithmetic are back in the picture):
# !! This MAY work on your platform, but is NOT POSIX-compliant:
# `-l` tells `bc` to set the precision to 20 decimal places, `printf '%.0f\n'`
# then performs the rounding to an integer.
item=20 total=70
printf '%.0f\n' "$(bc -l <<EOF
100 * $item / $total
EOF
)"
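
To get 29 for a value such as 28.5 with awk, one common workaround is to add 0.5 and truncate with awk's int() instead of relying on %.0f. A minimal sketch, assuming awk is available and the percentage is non-negative (int() truncates toward zero, so negative values would need the sign handled separately); the sample values are chosen only so that the result is an exact tie:

```shell
#!/bin/sh
# Sketch: round half up in awk by adding 0.5 and truncating.
# Assumes a non-negative result; int() truncates toward zero.
item=57
total=200
percent=$(awk -v item="$item" -v total="$total" \
    'BEGIN { printf "%d\n", int(100 * item / total + 0.5) }')
echo "$percent"    # 100*57/200 is exactly 28.5, so this prints 29
```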

[1] However, POSIX allows non-integer support: "The shell may use a real-floating type instead of signed long as long as it does not affect the results in cases where there is no overflow." In practice, ksh and zsh support floating-point arithmetic if you request it, but bash and dash do not. If you want to be POSIX-compliant (run via /bin/sh), stick with integer arithmetic. Across all shells, integer division works as usual: the quotient is returned, that is, the result of the division with any fractional part truncated (removed).

Lexi answered 31/5, 2016 at 2:52 Comment(4)
You keep saying what you know/believe shells do when they do integer arithmetic. I don't want to be rude, but my comment-question was: where is this documented? It feels to me that your position is that it is a practical fact, it is the way it is, and it does not need to be documented. Sincerely, I would not build an application on that premise. I have provided documentation that points in the right direction in my answer. If you want to help, please provide additional documentation. I know what you say: all shells truncate ... toward zero, but that does not replace documentation.Agency
@Dominic108: I personally think that the explanation in the answer is sufficient - it describes de facto behavior that should match everyone's expectation of how integer division works. We know that POSIX doesn't spell out the behavior, but that it is aligned with ISO C, whose versions since C99 (1999), from what I understand, mandate the truncate-the-fractional-part ("round toward zero") behavior. The behavior is easily verified, and is highly unlikely to change. If you additionally want to go looking for explicit documentation for each indiv. shell, feel free - I don't see the need.Lexi
There is a big difference between one guy, no matter how important he might be, saying "all (or almost all) shells work that way." and a documentation done by a committee saying "almost all shells work that way ... and we can rely on the fact that all shells claiming compliance with this standard do so."Agency
The fact that POSIX refers to C99, etc., which is exactly what you wrote above, is something you did not say before. It came after my question. I am reasonably happy with that now. But if anyone were to add extra documentation on this subject, it would be welcome.Agency
C
0

The following is based upon my second answer, but with expr and back-ticks -- which (while perhaps abhorrent to most) can be adapted to work in even the most archaic shells natively:

item=30
total=70
percent=`expr 200 \* $item / $total % 2 + 100 \* $item / $total`

echo $percent
43
Comorin answered 30/5, 2016 at 20:39 Comment(8)
Given that arithmetic expansion ($((...))) has been part of the POSIX shell command language since at least 1997 (SUS v2), it's fair to assume that unless truly ancient shells must be supported, expr is not needed.Lexi
@Lexi - I have old Solaris instances still being used at work... and they still use tcsh for everyone's login shell (because setup scripts were never translated to sh). Sue my employer (please).Comorin
I see, but your snippet doesn't actually run in tcsh (you'd have to modify the variable assignment statements), and the shell tag is generally understood to refer to POSIX-like shells.Lexi
And while our comments now cover the cases where you do still need expr (pre-'97 Bourne-like shells and shells that don't support arithmetic, such as tcsh/csh (I'll take your word for it)), I encourage you to add this information directly to your answer.Lexi
@mkelement - I meant "can be adapted to work in all shells natively" to meet your objection(s)... I will try to be more clear in the future.Comorin
Can I offer a shift in perspective? My comments weren't objections; they were meant to give context and offer clarifications to your answer in order to provide additional guidance to future readers. Assuming that you agree with the content, such guidance is better provided as part of the answer itself rather than being buried in a comments thread that much fewer people are likely to read.Lexi
@Lexi - Are you happier with the rewording?Comorin
Yes, thanks for updating; personally, I'd put all the (more detailed) findings we've worked out in these comments directly into the answer, but that's obviously your call.Lexi
A
0

With a natural restriction to positive percentages (which covers almost all applications), we have a much simpler solution:

echo $(( ($item*1000/$total+5)/10 ))

It uses the automatic truncation toward 0 that is done by the shell instead of an explicit modulo 2 as in the second answer of Michael Back, which I have upvoted.

BTW, it might be obvious to many, but it was not obvious to me at first that the truncation toward 0 done in evaluating this code is fixed in POSIX, which says that it must respect the C standard for integer arithmetic:

Arithmetic operators and control flow keywords shall be implemented as equivalent to those in the cited ISO C standard section, as listed in Selected ISO C Standard Operators and Control Flow Keywords. -- see https://pubs.opengroup.org/onlinepubs/9699919799/

For the C standard, see https://mc-stan.org/docs/2_21/functions-reference/int-arithmetic.html or section 6.3.1.4 in http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf. The first reference is for C++, but it, of course, defines the same integer arithmetic as the second reference which refers to the C99 ISO C standard.

Note that a restriction to positive percentages does not mean that we cannot have a percentage of, say, 80%, which is a decrease of 20%. A negative percentage corresponds to a decrease of more than 100%. Of course, it can happen, but not in typical applications.

In accordance with the C standard, to cover negative percentages, we must test whether the intermediate value $item*1000/$total is negative and, in that case, subtract 5 instead of adding 5, but we lose the simplicity.
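
A sketch of that sign-aware variant follows. The function name round_pct_signed is made up for illustration, and it assumes total is non-zero:

```shell
#!/bin/sh
# Hypothetical sign-aware variant of the "+5, then /10" trick.
# Assumes $2 is non-zero; relies on division truncating toward zero.
round_pct_signed() {
    tenths=$(( $1 * 1000 / $2 ))          # percentage in tenths, truncated
    if [ "$tenths" -lt 0 ]; then
        echo $(( (tenths - 5) / 10 ))     # push away from zero before truncating
    else
        echo $(( (tenths + 5) / 10 ))
    fi
}

round_pct_signed 30 70     # 42.857...  -> 43
round_pct_signed -30 70    # -42.857... -> -43
round_pct_signed -29 70    # -41.428... -> -41
```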

Agency answered 18/12, 2019 at 15:19 Comment(0)
A
0

Here is an answer in pure bash, without modulo 2, which also works with negative percentages:

echo "$(( 200*item/total - 100*item/total ))"

We could define a generic integer division that rounds to the nearest integer and then apply it.

function rdiv {
   echo $(( 2 * $1 / $2 - $1 / $2 ))
}
rdiv "$((100 * $item))" "$total"

In general, to compute the nearest integer using float to int conversion, one can use, say in Java : n = (int) (2 * x) - (int) x;

Explanation (based on binary expansion):

Let fbf(x) be the first bit of the fractional part of x, keeping the sign of x. Let int(x) be the integer part, which is x truncated toward 0, again keeping the sign of x. Rounding to the nearest integer is

round(x) = fbf(x) + int(x).

For example, if x = 100*-30/70 = -42.857, then fbf(x) = -1 and int(x) = -42.

We can now understand the second answer of Michael Back, which is based on modulo 2, because:

fbf(x) = int(2*x) % 2 

We can understand the answer here, because:

fbf(x) = int(2*x) - 2*(int(x))

An easy way to look at this formula is to see a multiplication by 2 as shifting the binary representation to the left. When we first shift x, then truncate, we keep the fbf(x) bit that we lose if we first truncate, then shift.

The key point is that, in accordance with POSIX and the C standard, a shell does a "rounding" toward 0, but we want a rounding toward the closest integer. We just need to find a trick to do that, and there is no need for modulo 2.
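
The two expressions for fbf(x) given above can be compared directly in the shell. This is only a sketch: the helper names fbf_mod and fbf_sub are invented here, x is passed as a numerator and a denominator, and the denominator is assumed to be non-zero.

```shell
#!/bin/sh
# Sketch: compare the two expressions for fbf(x), with x = num/den.
# Assumes den is non-zero and division truncates toward zero.
fbf_mod() { echo $(( 2 * $1 / $2 % 2 )); }               # int(2*x) % 2
fbf_sub() { echo $(( 2 * $1 / $2 - 2 * ($1 / $2) )); }   # int(2*x) - 2*int(x)

for num in 3000 -3000 2900 -2900; do
    echo "$num/70: $(fbf_mod "$num" 70) $(fbf_sub "$num" 70)"
done
```

For 3000/70 and -3000/70 (about 42.857 and -42.857) both helpers give 1 and -1 respectively; for 2900/70 and -2900/70 (about 41.43 and -41.43) both give 0.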

Agency answered 19/12, 2019 at 18:14 Comment(0)
