How to convert a Roman numeral to integer in PHP?

Using PHP, I'd like to convert a string containing a Roman numeral into its integer representation, because I need to perform calculations on the values.

Wikipedia on Roman numerals

It would suffice to only recognize the basic Roman numeral characters, like:

$roman_values=array(
    'I' => 1,
    'V' => 5,
    'X' => 10,
    'L' => 50,
    'C' => 100,
    'D' => 500,
    'M' => 1000,
);

That means the highest possible number is 3999 (MMMCMXCIX). I will use N to represent zero, other than that only positive integers are supported.

I cannot use the PEAR library for Roman numbers.

I found this great question on SO on how to test whether the string contains a valid Roman numeral:

How do you match only valid roman numerals with a regular expression?

What would be the best way of coding this?

Higdon answered 7/6, 2011 at 13:5 Comment(3)
Why can't you use the PEAR library? Surely you could at least look at the code? It's under the same license as PHP. - Sinister
Because PEAR is not widely available; for example, it cannot be installed in a PHP command-line environment. It is also not allowed for security reasons :) - Butterworth
@stereofrog The PEAR Package Manager is not installed on the server and I don't have rights to install it. And to be honest, it is not really worth it for this one simple task. - Higdon
C
48

How about this:

$romans = array(
    'M' => 1000,
    'CM' => 900,
    'D' => 500,
    'CD' => 400,
    'C' => 100,
    'XC' => 90,
    'L' => 50,
    'XL' => 40,
    'X' => 10,
    'IX' => 9,
    'V' => 5,
    'IV' => 4,
    'I' => 1,
);

$roman = 'MMMCMXCIX';
$result = 0;

foreach ($romans as $key => $value) {
    while (strpos($roman, $key) === 0) {
        $result += $value;
        $roman = substr($roman, strlen($key));
    }
}
echo $result;

which should output 3999 for the supplied $roman. It seems to work for my limited testing:

MCMXC = 1990
MM = 2000
MMXI = 2011
MCMLXXV = 1975

You might want to do some validation first as well :-)
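
As a rough sketch of the "validate first" suggestion, the loop above could be wrapped in a function that rejects anything not matching the strict pattern quoted elsewhere in this thread. The function name romanToIntValidated() and the choice to return null for rejected input are assumptions made for illustration, not part of the original answer:

```php
<?php
// Sketch only: romanToIntValidated() is a hypothetical helper name.
function romanToIntValidated(string $roman): ?int
{
    // Strict pattern for canonical numerals up to 3999 (MMMCMXCIX).
    $pattern = '/^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$/';
    if ($roman === '' || !preg_match($pattern, $roman)) {
        return null; // empty string or not a canonical numeral
    }

    // Descending order; each two-letter pair comes before the single
    // letter it starts with (CM and CD before C, and so on).
    $romans = [
        'M' => 1000, 'CM' => 900, 'D' => 500, 'CD' => 400,
        'C' => 100,  'XC' => 90,  'L' => 50,  'XL' => 40,
        'X' => 10,   'IX' => 9,   'V' => 5,   'IV' => 4,
        'I' => 1,
    ];

    $result = 0;
    foreach ($romans as $key => $value) {
        // Consume the token while it sits at the start of the remaining string.
        while (strpos($roman, $key) === 0) {
            $result += $value;
            $roman = substr($roman, strlen($key));
        }
    }
    return $result;
}

echo romanToIntValidated('MCMXC'), "\n"; // 1990
var_dump(romanToIntValidated('MIM'));    // NULL
```

Whether non-canonical forms such as MIM should be rejected is exactly what the comments below argue about, so the pattern is a policy choice rather than a given.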

Crossbench answered 7/6, 2011 at 13:45 Comment(7)
I like how short your solution is, but you need to add a few items to $romans, since, for example, MIM and MDCCCCLXXXXVIIII both could represent 1999 (because there's not a consensus on what constitutes a valid Roman number). - Snot
@Crossbench I could really use this snippet in a project that I'm going to release under the MIT license. Is there any chance you'd license your answer under the MIT? I dislike viral licenses, and don't want to use the SE default CC BY-SA 3.0. - Sperm
Happy to release under MIT. What's the simplest/best way to do this? - Crossbench
@Snot Actually no, they can't, because: 1. The power-of-ten characters (I, X, C, and M) can be repeated up to three times. For a fourth, you need to subtract from the next-highest "five" character instead. So "MDCCCCLXXXXVIIII" is an invalid number (CCCC should be replaced with CD). 2. Greater values should not be followed by lower values, so "MIM" is also invalid. 1999 is written as "MCMXCIX". - Dendrochronology
@Alexandru Guzinschi As mentioned in my comment, there is no consensus on what is valid. XXXX is valid for 40. There is no single "correct" way to write Roman numerals. - Snot
That is partially valid for ancient texts, when no standard was in place, although some rules have existed since then (e.g. only powers of ten can be repeated). In modern times there are a few rules in place for dealing with Roman numerals, and what I said in my previous comment is part of those rules. Although I learned them in 9th grade, I can still remember (most of) them. - Dendrochronology
@HenrikPetterson Which version of PHP are you using? The above code has specific logic to deal with IV, and I just tested it locally and it still works for me. - Crossbench
F
10

I am not sure whether you've got ZF or not, but in case you (or any of you who's reading this) do here is my snippet:

$number = new Zend_Measure_Number('MCMLXXV', Zend_Measure_Number::ROMAN);
$number->convertTo (Zend_Measure_Number::DECIMAL);
echo $number->getValue();
Felicific answered 7/6, 2011 at 13:34 Comment(1)
Zend 2 changed, see NumberFormat. - Mcmillin
H
10

This is the one I came up with, I added the validity check as well.

class RomanNumber {
    //array of roman values
    public static $roman_values=array(
        'I' => 1, 'V' => 5, 
        'X' => 10, 'L' => 50,
        'C' => 100, 'D' => 500,
        'M' => 1000,
    );
    //values that should evaluate as 0
    public static $roman_zero=array('N', 'nulla');
    //Regex - checking for valid Roman numerals
    public static $roman_regex='/^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$/';

    //Roman numeral validation function - is the string a valid Roman Number?
    static function IsRomanNumber($roman) {
         return preg_match(self::$roman_regex, $roman) > 0;
    }

    //Conversion: Roman Numeral to Integer
    static function Roman2Int ($roman) {
        //checking for zero values
        if (in_array($roman, self::$roman_zero)) {
            return 0;
        }
        //validating string
        if (!self::IsRomanNumber($roman)) {
            return false;
        }

        $values=self::$roman_values;
        $result = 0;
        //iterating through characters LTR
        for ($i = 0, $length = strlen($roman); $i < $length; $i++) {
            //getting value of current char
            $value = $values[$roman[$i]];
            //getting value of next char - null if there is no next char
            $nextvalue = !isset($roman[$i + 1]) ? null : $values[$roman[$i + 1]];
            //adding/subtracting value from result based on $nextvalue
            $result += (!is_null($nextvalue) && $nextvalue > $value) ? -$value : $value;
        }
        return $result;
    }
}
Higdon answered 9/6, 2011 at 10:1 Comment(0)
S
4

Quick idea: go through the Roman numeral from right to left. If the value of the current letter (the one further to the left) is smaller than that of the previously visited letter, subtract it from the result; otherwise (equal or larger), add it.

$romanValues=array(
    'I' => 1,
    'V' => 5,
    'X' => 10,
    'L' => 50,
    'C' => 100,
    'D' => 500,
    'M' => 1000,
);
$roman = 'MMMCMXCIX';

// RTL
$arabic = 0;
$prev = null;
for ( $n = strlen($roman) - 1; $n >= 0; --$n ) {
    $curr = $roman[$n];
    if ( is_null($prev) ) {
        $arabic += $romanValues[$roman[$n]];
    } else {
        $arabic += $romanValues[$prev] > $romanValues[$curr] ? -$romanValues[$curr] : +$romanValues[$curr];
    }
    $prev = $curr;
}
echo $arabic, "\n";

// LTR
$arabic = 0;
$romanLength = strlen($roman);
for ( $n = 0; $n < $romanLength; ++$n ) {
    if ( $n === $romanLength - 1 ) {
        $arabic += $romanValues[$roman[$n]];
    } else {
        $arabic += $romanValues[$roman[$n]] < $romanValues[$roman[$n+1]] ? -$romanValues[$roman[$n]] : +$romanValues[$roman[$n]];
    }
}
echo $arabic, "\n";

Some validation of the Roman numeral should also be added, though you said that you have already found out how to do it.
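
For reuse, the right-to-left pass might be wrapped in a function that remembers the previous letter's value instead of the letter itself, which saves one array lookup per step. This is only a sketch: the name romanRtl() is made up for illustration, and the input is assumed to be upper-case and to contain only the seven basic letters.

```php
<?php
// Sketch only: romanRtl() is a hypothetical name; no validation is done.
function romanRtl(string $roman): int
{
    $values = ['I' => 1, 'V' => 5, 'X' => 10, 'L' => 50,
               'C' => 100, 'D' => 500, 'M' => 1000];
    $arabic = 0;
    $prevValue = 0; // value of the letter to the right; 0 before the first step
    for ($n = strlen($roman) - 1; $n >= 0; --$n) {
        $currValue = $values[$roman[$n]];
        // Smaller than the right-hand neighbour means subtract, otherwise add.
        $arabic += ($currValue < $prevValue) ? -$currValue : $currValue;
        $prevValue = $currValue;
    }
    return $arabic;
}

echo romanRtl('MMMCMXCIX'), "\n"; // 3999
echo romanRtl('IVL'), "\n";       // 44, although IVL is not a well-formed numeral
```

The second call shows why a structural check is still needed: the loop happily evaluates strings that are not proper numerals.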

Superpower answered 7/6, 2011 at 13:22 Comment(5)
Yes, in this case it does matter, as the meaning of the "current letter" depends on the value of the "next letter": if the next letter is smaller than or the same as the current one, add the current value to the result; if the next letter is larger, subtract the current value from the result. If we go RTL, we store the "next letter" in the $prev variable, so it is always accessible, with the exception of the first (right-most) letter, where a basic is_null($prev) check is sufficient. If we go LTR, we have to check the value of the next letter as well as the existence of the next letter. - Superpower
Keep in mind though that this might also work for invalid Roman numerals, e.g., IVL will be treated as -1-5+50 and result in 44, which should be written as XLIV. Therefore, validation of the number's structure should be added, as noted in the answer. - Superpower
@Superpower What do you mean by "If we go LTR, we have to check the value of the next letter as well as the existence of the next letter"? You're using RTL, but you still run a check on the value of the next letter (ternary). What else would be necessary LTR? - Higdon
That check just has to be done differently. In RTL, you can use the value from the previous loop iteration, as "current" from the current iteration will be "previous" in the next iteration. In LTR, in every iteration you have to get the value which will be "current" in the next iteration, as it is not stored anywhere yet. I've updated the answer with an LTR version of this code. - Superpower
@Superpower Hm, yes, I see. You could save the Roman value as well then, one less array lookup. Nice solution. - Higdon
B
3

Credit for this goes to this blog (btw!): http://scriptsense.blogspot.com/2010/03/php-function-number-to-roman-and-roman.html

<?php

function roman2number($roman){
    $conv = array(
        array("letter" => 'I', "number" => 1),
        array("letter" => 'V', "number" => 5),
        array("letter" => 'X', "number" => 10),
        array("letter" => 'L', "number" => 50),
        array("letter" => 'C', "number" => 100),
        array("letter" => 'D', "number" => 500),
        array("letter" => 'M', "number" => 1000),
        array("letter" => 0, "number" => 0)
    );
    $arabic = 0;
    $state = 0;
    $sidx = 0;
    $len = strlen($roman);

    while ($len >= 0) {
        $i = 0;
        $sidx = $len;

        while ($conv[$i]['number'] > 0) {
            if (strtoupper(@$roman[$sidx]) == $conv[$i]['letter']) {
                if ($state > $conv[$i]['number']) {
                    $arabic -= $conv[$i]['number'];
                } else {
                    $arabic += $conv[$i]['number'];
                    $state = $conv[$i]['number'];
                }
            }
            $i++;
        }

        $len--;
    }

    return($arabic);
}


function number2roman($num,$isUpper=true) {
    $n = intval($num);
    $res = '';

    /*** roman_numerals array ***/
    $roman_numerals = array(
        'M' => 1000,
        'CM' => 900,
        'D' => 500,
        'CD' => 400,
        'C' => 100,
        'XC' => 90,
        'L' => 50,
        'XL' => 40,
        'X' => 10,
        'IX' => 9,
        'V' => 5,
        'IV' => 4,
        'I' => 1
    );

    foreach ($roman_numerals as $roman => $number)
    {
        /*** divide to get matches ***/
        $matches = intval($n / $number);

        /*** assign the roman char * $matches ***/
        $res .= str_repeat($roman, $matches);

        /*** subtract from the number ***/
        $n = $n % $number;
    }

    /*** return the res ***/
    if($isUpper) return $res;
    else return strtolower($res);
}

/* TEST */
echo $s=number2roman(1965,true);
echo "\n and bacK:\n";
echo roman2number($s);


?>
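
One way to gain some confidence in a pair of converters like this is a round-trip check over the whole supported range, 1 to 3999. The sketch below uses compact stand-ins rather than the two functions above; the names intToRoman() and romanToInt() are hypothetical and chosen only so the snippet runs on its own (it assumes PHP 7 or later for intdiv() and type declarations):

```php
<?php
// Sketch only: intToRoman() and romanToInt() are hypothetical stand-ins.
function intToRoman(int $n): string
{
    $map = ['M' => 1000, 'CM' => 900, 'D' => 500, 'CD' => 400,
            'C' => 100,  'XC' => 90,  'L' => 50,  'XL' => 40,
            'X' => 10,   'IX' => 9,   'V' => 5,   'IV' => 4, 'I' => 1];
    $res = '';
    foreach ($map as $roman => $number) {
        // Emit the token as many times as it fits, then keep the remainder.
        $res .= str_repeat($roman, intdiv($n, $number));
        $n %= $number;
    }
    return $res;
}

function romanToInt(string $roman): int
{
    $values = ['I' => 1, 'V' => 5, 'X' => 10, 'L' => 50,
               'C' => 100, 'D' => 500, 'M' => 1000];
    $total = 0;
    $length = strlen($roman);
    for ($i = 0; $i < $length; $i++) {
        $value = $values[$roman[$i]];
        // Value of the following letter, or 0 at the end of the string.
        $next = ($i + 1 < $length) ? $values[$roman[$i + 1]] : 0;
        $total += ($value < $next) ? -$value : $value;
    }
    return $total;
}

// Convert every supported number to Roman and back, counting mismatches.
$failures = 0;
for ($n = 1; $n <= 3999; $n++) {
    if (romanToInt(intToRoman($n)) !== $n) {
        $failures++;
    }
}
echo "round-trip failures: $failures\n";
```

A clean round trip only shows that the two directions agree on canonical numerals; it says nothing about how malformed input is handled.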
Butterworth answered 7/6, 2011 at 13:10 Comment(1)
Without spending too much time trying to grok the algorithm, it appears flawed: it's valid to write 800 as CCM (though generally considered bad style) as well as DCCC; the method should be that any digit followed by a digit of higher numerical value is subtracted from the latter instead of added. - Aggress
S
2

I'm late to the party, but here's mine. Assumes valid Numerals in the string, but doesn't test for a valid Roman number, whatever that is...there doesn't seem to be a consensus. This function will work for Roman numbers like VC (95), or MIM (1999), or MMMMMM (6000).

function roman2dec( $roman ) {
    $numbers = array(
        'I' => 1,
        'V' => 5,
        'X' => 10,
        'L' => 50,
        'C' => 100,
        'D' => 500,
        'M' => 1000,
    );

    $roman = strtoupper( $roman );
    $length = strlen( $roman );
    $counter = 0;
    $dec = 0;
    while ( $counter < $length ) {
        if ( ( $counter + 1 < $length ) && ( $numbers[$roman[$counter]] < $numbers[$roman[$counter + 1]] ) ) {
            $dec += $numbers[$roman[$counter + 1]] - $numbers[$roman[$counter]];
            $counter += 2;
        } else {
            $dec += $numbers[$roman[$counter]];
            $counter++;
        }
    }
    return $dec;
}
Snot answered 3/2, 2013 at 7:35 Comment(0)
L
2
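function romanToInt($s) {
    $array = ["I"=>1,"V"=>5,"X"=>10,"L"=>50,"C"=>100,"D"=>500,"M"=>1000];
    $sum = 0;
    $length = strlen($s);
    for ($i = 0; $i < $length; $i++){
        $curr = $array[$s[$i]];
        // Value of the following symbol, or 0 when $i is the last position.
        $next = ($i + 1 < $length) ? $array[$s[$i + 1]] : 0;
        if ($curr < $next) {
            // Subtractive pair: count both symbols at once and skip the second.
            $sum += $next - $curr;
            $i++;
        } else {
            $sum += $curr;
        }
    }
    return $sum;
}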
Lakesha answered 25/11, 2022 at 7:23 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. - Heartburning
F
1

Whew! Those are quite a few answers, and many of them are code-heavy! How about we define an algorithm for this first, before I give an answer?

The Basics

  • Don't store multi-digit Roman numerals, like 'CM' => 900, or anything like that in an array. If you know that M - C (1000 - 100) equals 900, then ultimately, you should only be storing the values of 1000 and 100. You wouldn't store multi-digit Roman numerals like CMI for 901, would you? Any answer that does this will be less efficient than one that understands the Roman syntax.

The Algorithm

Example: LIX (59)

  • Do a for loop on the numerals, starting at the end of the string of Roman numerals. In our example: we start on "X".
  • Greater-Than-Or-Equal-To Case: If the value we are looking at is the same as or greater than the last value, simply add it to a cumulative result. In our example: $result += $numeral_values["X"].
  • Less-Than Case: If the value we are looking at is less than the previous number, we subtract it from our cumulative result. In our example IX, I is 1 and X is 10, so, since 1 is less than 10, we subtract it, giving us 9.
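
To make the steps above concrete, here is a small tracing sketch that records what happens at each letter of LIX. It compares each value against the largest value seen so far, which appears to be what $current_pointer tracks in the code further down. The helper name traceRomanRtl() is hypothetical and exists only for this illustration:

```php
<?php
// Sketch only: traceRomanRtl() is a hypothetical helper; no validation is done.
function traceRomanRtl(string $roman): array
{
    $values = ['I' => 1, 'V' => 5, 'X' => 10, 'L' => 50,
               'C' => 100, 'D' => 500, 'M' => 1000];
    $steps = [];
    $result = 0;
    $max = 0; // largest value seen so far while scanning from the right
    for ($i = strlen($roman) - 1; $i >= 0; $i--) {
        $value = $values[$roman[$i]];
        if ($value >= $max) {
            // Greater-than-or-equal case: add and remember the new maximum.
            $result += $value;
            $max = $value;
            $steps[] = "{$roman[$i]}: +{$value} = {$result}";
        } else {
            // Less-than case: subtract.
            $result -= $value;
            $steps[] = "{$roman[$i]}: -{$value} = {$result}";
        }
    }
    return $steps;
}

print_r(traceRomanRtl('LIX'));
// Steps recorded: "X: +10 = 10", "I: -1 = 9", "L: +50 = 59"
```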

The Demo

Full Working Demo Online

The Code

function RomanNumeralValues() {
    return [
        'I'=>1,
        'V'=>5,
        'X'=>10,
        'L'=>50,
        'C'=>100,
        'D'=>500,
        'M'=>1000,
    ];
}

function ConvertRomanNumeralToArabic($input_roman){
    $result = 0;
    $input_length = strlen($input_roman);
    if($input_length === 0) {
        return $result;
    }
    
    $roman_numerals = RomanNumeralValues();
    
    // Largest value seen so far while scanning from the right.
    $current_pointer = 1;
    
    for($i = $input_length - 1; $i > -1; $i--){ 
        $letter = $input_roman[$i];
        $letter_value = $roman_numerals[$letter];
        
        if($letter_value === $current_pointer) {
            $result += $letter_value;
        } elseif ($letter_value < $current_pointer) {
            $result -= $letter_value;
        } else {
            $result += $letter_value;
            $current_pointer = $letter_value;
        }
    }
    
    return $result;
}

print ConvertRomanNumeralToArabic("LIX");
Feltner answered 16/1, 2022 at 21:54 Comment(1)
A few problems here: 1) You're returning $result before defining it; 2) Roman numeral values are static, so it would make more sense to define them in the method (my opinion). - Daberath
J
0

Define your own schema! (optional)

function rom2arab($rom,$letters=array()){
    if(empty($letters)){
        $letters=array('M'=>1000,
                       'D'=>500,
                       'C'=>100,
                       'L'=>50,
                       'X'=>10,
                       'V'=>5,
                       'I'=>1);
    }else{
        arsort($letters);
    }
    $arab=0;
    foreach($letters as $L=>$V){
        while(strpos($rom,$L)!==false){
            $l=$rom[0];
            $rom=substr($rom,1);
            $m=$l==$L?1:-1;
            $arab += $letters[$l]*$m;
        }
    }
    return $arab;
}

Inspired by andyb's answer

Jujube answered 7/6, 2011 at 23:33 Comment(1)
What kind of input is this expecting for $rom? I could not figure it out at all. - Feltner
I
0

I just wrote this in about 10 minutes. It's not perfect, but it seems to work for the few test cases I've given it. I'm not enforcing which values may be subtracted from which; this is just a basic loop that compares the current letter's value with the next one in the sequence (if it exists) and then adds either the value itself or the difference of the pair to the total:

$roman = strtolower($_GET['roman']);

$values = array(
'i' => 1,
'v' => 5,
'x' => 10,
'l' => 50,
'c' => 100,
'd' => 500,
'm' => 1000,
);
$total = 0;
for($i=0; $i<strlen($roman); $i++)
{
    $v = $values[substr($roman, $i, 1)];
    // Look ahead only when a next character actually exists.
    $v2 = ($i + 1 < strlen($roman)) ? $values[substr($roman, $i+1, 1)] : 0;

    if($v2 && $v < $v2)
    {
        $total += ($v2 - $v);
        $i++;
    }
    else
        $total += $v;

}

echo $total;
Imf answered 2/4, 2012 at 16:13 Comment(1)
You should test your code with error reporting on (or suppress it as necessary); it throws a Notice: Undefined offset: 0 on your $v2... line on most executions. - Snot
L
0

Just stumbled across this beauty and have to post it all over:

function roman($N)
{
    $c = 'IVXLCDM';
    for ($a = 5, $b = $s = ''; $N; $b++, $a ^= 7)
    {
        for (
            $o = $N % $a, $N = $N / $a ^ 0;

            $o--;

            $s = $c[$o > 2 ? $b + $N - ($N &= -2) + $o = 1 : $b] . $s
        );
    }
    return $s;
}
Left answered 19/2, 2013 at 10:46 Comment(3)
You should post a link to where you stumbled across this :). It seems to be fun, but not a great example of descriptive variable names. - Higdon
Had to format it for readability; the original was in code, but I just searched for php roman IVXLCDM and actually found the original in the PHP manual (the formatting there is the same as in our code). A shout-out to JR, along with 100 internet points! - Left
This does the opposite of what the question asked, no longer works, and is convoluted code to boot. - Punctuate
L
0
function Romannumeraltonumber($input_roman){
  $di=array('I'=>1,
            'V'=>5,
            'X'=>10,
            'L'=>50,
            'C'=>100,
            'D'=>500,
            'M'=>1000);
  $result=0;
  if($input_roman=='') return $result;
  //LTR
  for($i=0;$i<strlen($input_roman);$i++){ 
    $result=(($i+1)<strlen($input_roman) and 
          $di[$input_roman[$i]]<$di[$input_roman[$i+1]])?($result-$di[$input_roman[$i]]) 
                                                        :($result+$di[$input_roman[$i]]);
   }
 return $result;
}
Lilylivered answered 6/5, 2015 at 23:10 Comment(1)
You should add an explanation. - Feltner
B
0
function rom_to_arabic($number) {

    $symbols = array(
        'M'  => 1000,
        'D'  => 500,
        'C'  => 100,
        'L'  => 50,
        'X'  => 10,
        'V'  => 5,
        'I'  => 1);

    $a = str_split($number);

    $i = 0;
    $temp = 0;
    $value = 0;
    $q = count($a);
    while($i < $q) {

        $thys = $symbols[$a[$i]];
        if(isset($a[$i +1])) {
            $next = $symbols[$a[$i +1]];
        } else {
            $next = 0;
        }

        if($thys < $next) {
            $value -= $thys;
        } else {
            $value += $thys;
        }

        $temp = $thys;
        $i++;
    }

    return $value;

}
Bink answered 23/8, 2015 at 20:2 Comment(0)
D
0
function parseRomanNumerals($input)
{
    $roman_val = 0;
    $roman_length = strlen($input);
    $result_roman = 0;
    // Walk the string from right to left, starting at the last character.
    for ($x = 1; $x <= $roman_length; $x++) {
        $roman_val_prev = $roman_val;
        $roman_numeral = substr($input, $roman_length - $x, 1);

        switch ($roman_numeral) {
            case "M":
                $roman_val = 1000;
                break;
            case "D":
                $roman_val = 500;
                break;
            case "C":
                $roman_val = 100;
                break;
            case "L":
                $roman_val = 50;
                break;
            case "X":
                $roman_val = 10;
                break;
            case "V":
                $roman_val = 5;
                break;
            case "I":
                $roman_val = 1;
                break;
            default:
                $roman_val = 0;
        }
        // A symbol smaller than its right-hand neighbour is subtracted;
        // equal or larger symbols are added, so repeats such as "XX" add up.
        if ($roman_val < $roman_val_prev) {
            $result_roman = $result_roman - $roman_val;
        }
        else {
            $result_roman = $result_roman + $roman_val;
        }
    }
    return $result_roman;
}
Dinar answered 3/5, 2019 at 10:51 Comment(0)
