Round to nearest 1000 in pandas
Asked Answered
M

5

13

I've searched the pandas documentation and cookbook recipes and it's clear you can round to the nearest decimal place easily using dataframe.columnName.round(decimalplace).

How do you do this with larger numbers?

Example, I have a column of housing prices and I want them rounded to the nearest 10000 or 1000 or whatever.

df.SalesPrice.WhatDoIDo(1000)? 
Mahaliamahan answered 23/12, 2017 at 1:35 Comment(3)
Possible duplicate of How do I round to the nearest ten?Avocet
Divide by 1000, round, multiply by 1000Refutation
Lots of languages have lots of iterative ways to solve this. While there are no shortage of "how to round" questions on SO, I was looking for a specific pandas way to leverage the efficiency of this framework.Mahaliamahan
R
18

By using the notation df.ColumnName.round(), you are actually calling pandas.Series.round, the documentation of which specifies:

decimals : int

Number of decimal places to round to (default: 0). If decimals is negative, it specifies the number of positions to the left of the decimal point.

So you can do:

df = pd.DataFrame({'val':[1,11,130,670]})
df.val.round(decimals=-2)

This produces the output:

0      0
1      0
2    100
3    700
Name: val, dtype: int64

decimals=-3 rounds to the 1000s, and so on. Notably, it also works using pandas.DataFrame.round(), though the documentation doesn't tell you:

df = pd.DataFrame({'val':[1,11,130,670], 'x':[1,11,150,900]})
df.round({'val':-2})

This will round the column val to the nearest 100, but leave x alone.

Randellrandene answered 23/12, 2017 at 1:59 Comment(1)
I completely missed the "if decimals is negative" part of that. Thanks so much!Mahaliamahan
H
3

Function round does accept negative values for cases in which you want to specify precision to the left of the decimal point:

dataframe.columnName.round(-3)

Example:

>>> pd.Series([1, 500, 500.1, 999, 1500, 1501, 946546]).round(-3)
0         0.0
1         0.0
2      1000.0
3      1000.0
4      2000.0
5      2000.0
6    947000.0
dtype: float64
Hermann answered 23/12, 2017 at 1:58 Comment(1)
This is the correct answer. Should receive more upvotes.Tlingit
O
1

You can try this

df = pd.DataFrame({'val':[1,11,130,670]})
10**df.val.astype(str).str.len()
Out[27]: 
0      10
1     100
2    1000
3    1000
Name: val, dtype: int64
Orientation answered 23/12, 2017 at 1:45 Comment(0)
H
1

Another interesting "hack" is this: Let's say you want to round off to the nearest 100s. You can add 50, then divide by 100, convert to integer, multiply back by 100.

df = pd.DataFrame({'val':[1005,1299,1301,4109]})
df.val.round(-2) # Proper way
((df.val+50)/100).astype(int)*100 # Hack

Gives you this, as desired:

[1000, 1300, 1300, 4100]
Heuer answered 23/12, 2017 at 2:52 Comment(0)
P
0

My favorite, dynamic way to do this:

ds: pd.Series to "round"
x: int/float of the power to round

# Define rounding lambda function:
my_rounder = lambda ds, x: ((ds + 0.5*10**x) // 10**x) * 10**x

# Apply lambda function to "prices" values:
housing_df["rounded_prices"] = my_rounder(housing_df["prices"], 3)

# If you need to force/ensure no decimal:
housing_df["rounded_prices"] = housing_df["rounded_prices"].apply(int)

Alternative floor rounder:

my_floor_rounder = lambda ds, x: (ds // 10**x) * 10**x

Breakdown:

print(housing_df["prices"].head())

year
2010    372560.0
2011    374507.0
2012    376454.0
2013    378401.0
2014    380348.0
Name: prices, dtype: float64
    
# This step can be omitted if you're finding the floor:
step_up = housing_df["prices"] + 0.5*10**3
print(step_up.head())

year
2010    373060.0
2011    375007.0
2012    376954.0
2013    378901.0
2014    380848.0
Name: prices, dtype: float64

thsnd = step_up // 10**3
print(thsnd.head())

year
2010    373.0
2011    375.0
2012    376.0
2013    378.0
2014    380.0
Name: prices, dtype: float64

rounded = thsnd * 10**3
print(rounded.head())

year
2010    373000.0
2011    375000.0
2012    376000.0
2013    378000.0
2014    380000.0
Name: prices, dtype: float64

int_rounded = rounded.apply(int)
print(int_rounded.head())

year
2010    373000
2011    375000
2012    376000
2013    378000
2014    380000
Name: prices, dtype: int64
Pylorectomy answered 6/8, 2021 at 8:32 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.