Japanese Numerals to Arabic Numerals converter in Python

Is there an open source library in Python which does Kanji Numeral to Arabic Numeral conversion/translation?

Input : 10億2千9百万
Output: 1,029,000,000

Input : 1億6,717万2,600
Output: 167,172,600

Input : 3,139百万
Output: 3,139,000,000

Japanese Numeral Systems : http://en.wikipedia.org/wiki/Japanese_numerals

Web Based Converter : http://www.sljfaq.org/cgi/kanjinumbers.cgi

Detach answered 19/2, 2013 at 21:57 Comment(2)
Note that 円 is the character for Yen (currency). - Doehne
EDIT: My bad. I've removed that glyph from the example input lines. - Detach

This should work:

import kanjinums
kanjinums.kanji2num("五百十一")

This requires downloading and installing kanjinums first, which is unfortunately not available through pip.

EDIT: This only works for basic numbers, not for complex cases like the ones mentioned in the question.

With minor modifications this will actually work, for instance:

3139*kanjinums.kanji2num("百万")
3139000000
Lavine answered 19/2, 2013 at 22:3 Comment(3)
Thanks for your response. Unfortunately, that module only does a simple dictionary lookup. It is good for simple Kanji numerals but does not support complex conversions like the examples I mentioned. - Detach
Yes, I am realizing that. Sorry! It appears I would need to understand kanji a little better to provide any useful information. - Lavine
You will need to write a function or class that converts your numbers using this, for example by replacing strings with expressions. - Lavine

This can actually be done relatively easily in a function:

def convert_kanji(zahl):
    japnumber = ("兆", "億", "万")  # ordered from the largest unit to the smallest
    jap_factors = {
        "兆": 1000000000000,
        "億": 100000000,
        "万": 10000,
    }

    # Define the variables
    converted_number = 0
    already_found = False
    found_kanji_previous = 0

    try:  # If the string is already a plain integer (i.e. no kanji in it), return it
        return int(zahl)
    except ValueError:  # If not, take it apart
        for key in japnumber:  # Do it for every kanji
            if key in zahl:  # If it has been found in the original string:
                gef_kanji = zahl.find(key)  # Position of the kanji that was found
                if not already_found:  # If it is the first kanji:
                    # Convert the number in front of the kanji with the appropriate factor
                    intermediate_step = int(zahl[:gef_kanji]) * jap_factors[key]
                    converted_number = intermediate_step
                    already_found = True
                    found_kanji_previous = gef_kanji
                else:  # For all other kanji
                    intermediate_step = int(zahl[found_kanji_previous+1:gef_kanji]) * jap_factors[key]
                    converted_number = converted_number + intermediate_step  # Sum them up
                    found_kanji_previous = gef_kanji

        # Add whatever digits remain after the last kanji
        if len(zahl) > (found_kanji_previous+1):
            converted_number = converted_number + int(zahl[found_kanji_previous+1:])
        return converted_number

This is still relatively simple. It can only accept numbers in the form of 2314兆3424億3422万2342.
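
For the mixed inputs in the question (thousands separators, 千/百/十 inside a group, kanji digits), something along the following lines might work. This is only a rough sketch and not the function above: the names parse_group and kanji_to_int and the lookup tables are made up for illustration, and it assumes well-formed input with the large units appearing in descending order.

```python
import unicodedata

# Large units split the number into groups, listed from largest to smallest.
LARGE_UNITS = (("兆", 10**12), ("億", 10**8), ("万", 10**4))
# Small units multiply the digits that precede them inside one group.
SMALL_UNITS = {"千": 1000, "百": 100, "十": 10}
KANJI_DIGITS = {"〇": 0, "一": 1, "二": 2, "三": 3, "四": 4,
                "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}


def parse_group(text):
    """Convert one group such as '2千9百', '3139百' or '五百十一'."""
    total = 0
    current = 0
    has_digits = False
    for ch in text:
        if ch in "0123456789":
            current = current * 10 + int(ch)
            has_digits = True
        elif ch in KANJI_DIGITS:
            current = current * 10 + KANJI_DIGITS[ch]
            has_digits = True
        elif ch in SMALL_UNITS:
            # A unit without leading digits counts once (十 alone is 10).
            total += (current if has_digits else 1) * SMALL_UNITS[ch]
            current = 0
            has_digits = False
        else:
            raise ValueError("unexpected character: %r" % ch)
    return total + current


def kanji_to_int(text):
    """Convert a mixed kanji/Arabic numeral string to an int."""
    # NFKC turns full-width digits and commas into their ASCII forms.
    rest = unicodedata.normalize("NFKC", text).replace(",", "").strip()
    total = 0
    for unit, factor in LARGE_UNITS:
        head, sep, tail = rest.partition(unit)
        if sep:  # the unit occurs: everything before it is its multiplier
            total += (parse_group(head) if head else 1) * factor
            rest = tail
    return total + parse_group(rest)


print(kanji_to_int("10億2千9百万"))     # 1029000000
print(kanji_to_int("1億6,717万2,600"))  # 167172600
print(kanji_to_int("3,139百万"))        # 3139000000
```

Treating 百 in 3,139百万 as a multiplier inside the 万 group is what makes the third example come out as 3,139,000,000.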

Also the code might be extremely bad, because this was actually my first program in a long long time. But it might be a good starting point for you.

I am currently working on a simple converter that turns Japanese numbers into easy-to-read Western ones (e.g. converting 231億 to “23 billion 100 million”; it actually already does that). I guess there is much to be done, e.g. full-width characters, numbers written completely in Kanji, etc. Once I have tackled all that, I might upload it in a similar way to kanjinums :D
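
To give an idea of the “easy to read” step and of the full-width handling, here is a minimal sketch. It is not the converter mentioned above: the helper names normalize_width and to_western_words and the WESTERN_UNITS table are made up for illustration, and it assumes the number has already been converted to a non-negative int.

```python
import unicodedata

# Western (short scale) units, from largest to smallest.
WESTERN_UNITS = ((10**12, "trillion"), (10**9, "billion"),
                 (10**6, "million"), (10**3, "thousand"))


def normalize_width(text):
    """Turn full-width digits such as '２３１' into ASCII '231'."""
    return unicodedata.normalize("NFKC", text)


def to_western_words(n):
    """Render a non-negative int as e.g. '23 billion 100 million'."""
    parts = []
    for factor, name in WESTERN_UNITS:
        count, n = divmod(n, factor)
        if count:
            parts.append("%d %s" % (count, name))
    # Keep the remainder below 1000, or a lone 0 if nothing else was added.
    if n or not parts:
        parts.append(str(n))
    return " ".join(parts)


print(normalize_width("２３１億"))        # 231億
print(to_western_words(231 * 10**8))    # 23 billion 100 million
```

The regrouping is needed because Japanese groups digits in fours (万, 億, 兆) while the Western names group them in threes.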

Biceps answered 21/6, 2013 at 18:28 Comment(0)
