Highlight the difference between two strings in PHP
E

17

149

What is the easiest way to highlight the difference between two strings in PHP?

I'm thinking along the lines of the Stack Overflow edit history page, where new text is in green and removed text is in red. If there are any pre-written functions or classes available, that would be ideal.

Ebb answered 26/11, 2008 at 16:21 Comment(0)
S
46

You used to be able to use the PHP Horde_Text_Diff package.

However, this package is no longer available.

Spoil answered 26/11, 2008 at 16:32 Comment(8)
The link doesn't work anymore. Is there any other solution now in 2011? ;-) Is it possible to get output like this: tortoisesvn.tigris.org/images/TMerge2Diff.pngWallop
Site is gone, but archive.org has a copy of the site: web.archive.org/web/20080506155528/http://software.zuavra.net/…Knotted
Too bad it requires PEAR. PEAR dependence sucks.Mayotte
From the new web site: "Update: the inline renderer is now a native part of the Text_Diff PEAR package. You don't need to use the hack presented here anymore." So just use Text_Diff now.Inconvertible
GPL isn't just free to use. It forces your module/project to be GPL as well.Phoney
Stumbled across this post whilst trying to implement a diffing mechanism for a project using mongodb document revisions. As @Inconvertible mentioned the link in the answer is old, and will just direct you to here: horde.org/libraries/Horde_Text_Diff/download, but if you use the remi repo, you can do: yum install php-horde-Horde-Text-Diff --enablerepo=remiCharil
what is the usage?Roundhead
I'm completely lost: why is this answer useful, let alone the accepted answer?Allrud
K
81

Just wrote a class to compute smallest (not to be taken literally) number of edits to transform one string into another string:

http://www.raymondhill.net/finediff/

It has a static function to render an HTML version of the diff.

It's a first version, and likely to be improved, but it works just fine as of now, so I am throwing it out there in case someone needs to generate a compact diff efficiently, like I needed.

Edit: It's on Github now: https://github.com/gorhill/PHP-FineDiff

Knotted answered 22/2, 2011 at 19:23 Comment(5)
I'll try the fork at github.com/xrstf/PHP-FineDiff to get multibyte support!Acephalous
@R. Hill - Works beautifully for me too. This really is a better answer than the current one which appears to be defunct.Coveney
Any updates? It says failed to include file "Texts/Diff.php" and it is not in the zip.Gilbreath
Amazing! I mean the online demo with example code. Perfect char-level differences. Just WoW! :O Thank You!Naturalistic
It seems that now the github.com/BillyNate/PHP-FineDiff fork is the most ahead and it supports multibytes with different encodings. github.com/xrstf/PHP-FineDiff is 404ing @AcephalousStodge
A
27

This is a nice one, also http://paulbutler.org/archives/a-simple-diff-algorithm-in-php/

Solving the problem is not as simple as it seems, and the problem bothered me for about a year before I figured it out. I managed to write my algorithm in PHP, in 18 lines of code. It is not the most efficient way to do a diff, but it is probably the easiest to understand.

It works by finding the longest sequence of words common to both strings, and recursively finding the longest sequences of the remainders of the string until the substrings have no words in common. At this point it adds the remaining new words as an insertion and the remaining old words as a deletion.

You can download the source here: PHP SimpleDiff...
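The approach quoted above can be sketched roughly as follows. This is not the author's original code: `simpleDiff()` and `renderSimpleDiff()` are hypothetical names, and the sketch assumes both inputs are zero-indexed lists of words (for example from `explode(' ', ...)`).

```php
<?php
// Sketch of the recursive "longest common run of words" diff described above.
// simpleDiff() is a hypothetical name, not the author's original function.
function simpleDiff(array $old, array $new): array
{
    $maxLen = 0;
    $oldStart = 0;
    $newStart = 0;
    $runs = []; // $runs[$i][$j] = length of the common run ending at $old[$i] / $new[$j]

    foreach ($old as $i => $word) {
        foreach (array_keys($new, $word, true) as $j) {
            $runs[$i][$j] = ($runs[$i - 1][$j - 1] ?? 0) + 1;
            if ($runs[$i][$j] > $maxLen) {
                $maxLen = $runs[$i][$j];
                $oldStart = $i - $maxLen + 1;
                $newStart = $j - $maxLen + 1;
            }
        }
    }

    if ($maxLen === 0) {
        // No words in common: everything old is a deletion, everything new an insertion.
        $ops = [];
        if ($old) { $ops[] = ['del', $old]; }
        if ($new) { $ops[] = ['ins', $new]; }
        return $ops;
    }

    // Recurse on what lies before and after the longest common run.
    return array_merge(
        simpleDiff(array_slice($old, 0, $oldStart), array_slice($new, 0, $newStart)),
        [['same', array_slice($new, $newStart, $maxLen)]],
        simpleDiff(array_slice($old, $oldStart + $maxLen), array_slice($new, $newStart + $maxLen))
    );
}

// Hypothetical renderer: wraps deleted/inserted runs in <del>/<ins>.
function renderSimpleDiff(array $ops): string
{
    $html = [];
    foreach ($ops as [$type, $words]) {
        $text = htmlspecialchars(implode(' ', $words));
        $html[] = $type === 'same' ? $text : "<$type>$text</$type>";
    }
    return implode(' ', $html);
}

echo renderSimpleDiff(simpleDiff(
    explode(' ', 'the quick brown fox'),
    explode(' ', 'the slow brown fox')
)), "\n";
// the <del>quick</del> <ins>slow</ins> brown fox
```

Like the original, this favours readability over speed: every recursion level rescans its two sub-lists.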

Almuce answered 24/9, 2010 at 8:15 Comment(6)
I found this very useful as well! Not as complicated as the Pear stuff.Elate
It gives me an error here: if($matrix[$oindex][$nindex] > $maxlen){ Undefined variable: maxlenYoshida
OK, you posted a comment to solve that. :) Why don't you edit it into the initial code? Thanks anyway, +1 ... hmm, well, you aren't the authorYoshida
here's what appears to be the latest version from 2010: github.com/paulgb/simplediff/blob/master/simplediff.phpUnreserve
Actually, +1 for simplicityCentaurus
This script can now be found here: https://github.com/paulgb/simplediff/blob/master/php/simplediff.phpSepti
H
26

Here is a short function you can use to diff two arrays. It implements the LCS algorithm:

function computeDiff($from, $to)
{
    $diffValues = array();
    $diffMask = array();

    $dm = array();
    $n1 = count($from);
    $n2 = count($to);

    for ($j = -1; $j < $n2; $j++) $dm[-1][$j] = 0;
    for ($i = -1; $i < $n1; $i++) $dm[$i][-1] = 0;
    for ($i = 0; $i < $n1; $i++)
    {
        for ($j = 0; $j < $n2; $j++)
        {
            if ($from[$i] == $to[$j])
            {
                $ad = $dm[$i - 1][$j - 1];
                $dm[$i][$j] = $ad + 1;
            }
            else
            {
                $a1 = $dm[$i - 1][$j];
                $a2 = $dm[$i][$j - 1];
                $dm[$i][$j] = max($a1, $a2);
            }
        }
    }

    $i = $n1 - 1;
    $j = $n2 - 1;
    while (($i > -1) || ($j > -1))
    {
        if ($j > -1)
        {
            if ($dm[$i][$j - 1] == $dm[$i][$j])
            {
                $diffValues[] = $to[$j];
                $diffMask[] = 1;
                $j--;  
                continue;              
            }
        }
        if ($i > -1)
        {
            if ($dm[$i - 1][$j] == $dm[$i][$j])
            {
                $diffValues[] = $from[$i];
                $diffMask[] = -1;
                $i--;
                continue;              
            }
        }
        {
            $diffValues[] = $from[$i];
            $diffMask[] = 0;
            $i--;
            $j--;
        }
    }    

    $diffValues = array_reverse($diffValues);
    $diffMask = array_reverse($diffMask);

    return array('values' => $diffValues, 'mask' => $diffMask);
}

It generates two arrays:

  • values array: a list of elements as they appear in the diff.
  • mask array: contains numbers. 0: unchanged, -1: removed, 1: added.
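To make that return shape concrete, here is a small sketch that renders a values/mask pair as HTML. The `$diff` array is hand-written for illustration rather than produced by the function above, and `renderMask()` is a hypothetical helper. Unlike a plain concatenation, it escapes each element with `htmlspecialchars()` so that markup inside the compared text is not interpreted by the browser.

```php
<?php
// Hand-written sample in the shape described above (an assumption for
// illustration; normally this would be the return value of the diff function).
$diff = [
    'values' => ['a', '<b>', 'c', 'd'],
    'mask'   => [0, -1, 1, 0],
];

// Hypothetical helper: wraps runs of removed/added elements in <del>/<ins>.
function renderMask(array $values, array $mask, string $glue = ''): string
{
    $tags = [-1 => 'del', 1 => 'ins'];
    $out = [];
    $n = count($values);
    for ($i = 0; $i < $n; ) {
        $state = $mask[$i];
        $run = [];
        // Collect consecutive elements that share the same mask value.
        while ($i < $n && $mask[$i] === $state) {
            $run[] = htmlspecialchars($values[$i]);
            $i++;
        }
        $text = implode($glue, $run);
        $out[] = isset($tags[$state])
            ? "<{$tags[$state]}>$text</{$tags[$state]}>"
            : $text;
    }
    return implode($glue, $out);
}

echo renderMask($diff['values'], $diff['mask'], ' '), "\n";
// a <del>&lt;b&gt;</del> <ins>c</ins> d
```

Pass an empty `$glue` for character-level diffs and a space for word-level ones.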

If you populate an array with characters, it can be used to compute inline difference. Now just a single step to highlight the differences:

function diffline($line1, $line2)
{
    $diff = computeDiff(str_split($line1), str_split($line2));
    $diffval = $diff['values'];
    $diffmask = $diff['mask'];

    $n = count($diffval);
    $pmc = 0;
    $result = '';
    for ($i = 0; $i < $n; $i++)
    {
        $mc = $diffmask[$i];
        if ($mc != $pmc)
        {
            switch ($pmc)
            {
                case -1: $result .= '</del>'; break;
                case 1: $result .= '</ins>'; break;
            }
            switch ($mc)
            {
                case -1: $result .= '<del>'; break;
                case 1: $result .= '<ins>'; break;
            }
        }
        $result .= $diffval[$i];

        $pmc = $mc;
    }
    switch ($pmc)
    {
        case -1: $result .= '</del>'; break;
        case 1: $result .= '</ins>'; break;
    }

    return $result;
}

Eg.:

echo diffline('StackOverflow', 'ServerFault');

Will output:

S<del>tackO</del><ins>er</ins>ver<del>f</del><ins>Fau</ins>l<del>ow</del><ins>t</ins> 

Additional notes:

  • The diff matrix requires (m+1)*(n+1) elements. So you can run into out of memory errors if you try to diff long sequences. In this case diff larger chunks (eg. lines) first, then diff their contents in a second pass.
  • The algorithm can be improved if you trim the matching elements from the beginning and the end, then run the algorithm on the differing middle only. A later (more bloated) version contains these modifications too.
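The trimming idea from the second note could look roughly like this. It is only a sketch, not the "later version" mentioned above: `trimCommonEnds()` is a hypothetical helper that assumes two zero-indexed lists and uses strict comparison.

```php
<?php
// Hypothetical helper: strips the shared prefix and suffix of two lists so
// that only the differing middle has to be fed to the diff algorithm.
function trimCommonEnds(array $from, array $to): array
{
    $n1 = count($from);
    $n2 = count($to);

    // Length of the common prefix.
    $start = 0;
    while ($start < $n1 && $start < $n2 && $from[$start] === $to[$start]) {
        $start++;
    }

    // Length of the common suffix, never overlapping the prefix.
    $end = 0;
    while ($end < $n1 - $start && $end < $n2 - $start
        && $from[$n1 - 1 - $end] === $to[$n2 - 1 - $end]) {
        $end++;
    }

    return [
        'prefix' => array_slice($from, 0, $start),
        'from'   => array_slice($from, $start, $n1 - $start - $end),
        'to'     => array_slice($to, $start, $n2 - $start - $end),
        'suffix' => $end > 0 ? array_slice($from, -$end) : [],
    ];
}

$parts = trimCommonEnds(str_split('StackOverflow'), str_split('StackUnderflow'));
echo implode('', $parts['from']), ' -> ', implode('', $parts['to']), "\n";
// Ov -> Und
```

The diff matrix then only needs to cover the two middle parts (here 2 × 3 elements instead of 13 × 14); the prefix and suffix are emitted as unchanged.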
Hawfinch answered 25/2, 2014 at 17:7 Comment(5)
this is simple, effective, and cross-platform; I used this technique with explode() on various boundaries (line or word) to get different output where appropriate. Very nice solution, thanks!Luna
it says computeDiff is not foundRoundhead
@ichimaru Have you pasted both functions?Hawfinch
@Hawfinch I didn't see the other function... I swear! It's working now, thanks!Roundhead
Thanks, This one is quite handy to find out diff than the accepted answer.Printmaking
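As one comment above suggests, the same array diff can work on lines or words instead of characters. A possible way to build such arrays is sketched below; `tokenize()` is a hypothetical helper, and it uses `preg_split()` rather than `explode()` so that whitespace and line breaks survive as tokens and the original text can be rebuilt by joining them.

```php
<?php
// Hypothetical helper: splits text into tokens for a line-level or
// word-level diff. Joining the tokens reproduces the input exactly.
function tokenize(string $text, string $mode = 'word'): array
{
    switch ($mode) {
        case 'line':
            // Split after each line break, keeping it attached to its line.
            return preg_split('/(?<=\n)/', $text, -1, PREG_SPLIT_NO_EMPTY);
        case 'word':
            // Split on whitespace runs, keeping each run as its own token.
            return preg_split('/(\s+)/', $text, -1,
                PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
        default:
            throw new InvalidArgumentException("Unknown mode: $mode");
    }
}

print_r(tokenize('the quick  fox'));
print_r(tokenize("first\nsecond\n", 'line'));
```

The resulting arrays can be passed to an array-based diff in place of `str_split()` output.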
C
25

If you want a robust library, Text_Diff (a PEAR package) looks to be pretty good. It has some pretty cool features.

Campion answered 26/11, 2008 at 16:32 Comment(2)
PHP Inline-Diff, mentioned above, "..uses Text_Diff from PEAR to compute a diff". :)Spoil
The link is broken. Cant find the package. This is the same Diff package used by the latest version of Wordpress.Jaffna
M
6

There is also a PECL extension for xdiff. In particular, xdiff_string_diff() makes a unified diff of two strings.

Example from the PHP Manual:

<?php
$old_article = file_get_contents('./old_article.txt');
$new_article = $_POST['article'];

$diff = xdiff_string_diff($old_article, $new_article, 1);
if (is_string($diff)) {
    echo "Differences between two articles:\n";
    echo $diff;
}
Moll answered 1/5, 2012 at 11:42 Comment(4)
xdiff pecl extension is no longer maintained, apparently a stable release hasn't been done since 2008-07-01, according to pecl.php.net/package/xdiff, I ended up going with the suggestion by accepted answer as it's much newer, horde.org/libraries/Horde_Text_Diff/downloadCharil
Is there a simple install procedure for PHP's XDiff (for Debian Linux)?Judenberg
@MikePurcell, as a matter of fact, it is still maintained. The latest stable version 2.0.1 supporting PHP 7 has been released on 2016-05-16.Bryantbryanty
@PeterKrauss, yes, there is. Take look at this question: serverfault.com/questions/362680/…Bryantbryanty
E
5

I had terrible trouble with both the PEAR-based and the simpler alternatives shown. So here's a solution that leverages the Unix diff command (obviously, you have to be on a Unix system or have a working Windows diff command for it to work). Choose your favourite temporary directory, and change the exceptions to return codes if you prefer.

/**
 * @brief Find the difference between two strings, lines assumed to be separated by "\n"
 * @param $new string The new string
 * @param $old string The old string
 * @return string Human-readable output as produced by the Unix diff command,
 * or "No changes" if the strings are the same.
 * @throws Exception
 */
public static function diff($new, $old) {
  $tempdir = '/var/somewhere/tmp'; // Your favourite temporary directory
  $oldfile = tempnam($tempdir,'OLD');
  $newfile = tempnam($tempdir,'NEW');
  // Compare against false: writing an empty string returns 0, which is not a failure
  if (@file_put_contents($oldfile,$old) === false) {
    throw new Exception('diff failed to write temporary file: ' . 
         print_r(error_get_last(),true));
  }
  if (@file_put_contents($newfile,$new) === false) {
    throw new Exception('diff failed to write temporary file: ' . 
         print_r(error_get_last(),true));
  }
  $answer = array();
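  // Quote the paths in case the temporary directory contains spaces or shell characters
  $cmd = 'diff ' . escapeshellarg($newfile) . ' ' . escapeshellarg($oldfile);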
  exec($cmd, $answer, $retcode);
  unlink($newfile);
  unlink($oldfile);
  // diff exits with 0 (no differences), 1 (differences found) or 2 (trouble)
  if ($retcode > 1) {
    throw new Exception('diff failed with return code ' . $retcode);
  }
  if (empty($answer)) {
    return 'No changes';
  } else {
    return implode("\n", $answer);
  }
}
Etiology answered 30/11, 2011 at 16:22 Comment(0)
P
4

This is the best one I've found.

http://code.stephenmorley.org/php/diff-implementation/

Put answered 8/4, 2015 at 14:29 Comment(4)
Doesn't work properly with UTF-8. It uses array access on strings, which treats each character as one byte wide. Should be easily fixable though with mb_split.Hamill
Here is a quick fix. Just replace $sequence1 = $string1; $sequence2 = $string2; $end1 = strlen($string1) - 1; $end2 = strlen($string2) - 1; with $sequence1 = preg_split('//u', $string1, -1, PREG_SPLIT_NO_EMPTY); $sequence2 = preg_split('//u', $string2, -1, PREG_SPLIT_NO_EMPTY); $end1 = count($sequence1) - 1; $end2 = count($sequence2) - 1;Hamill
This class runs out of memory using character mode in function computeTable.Put
The current link is code.iamkate.com/php/diff-implementation. I've tested it and it doesn't support UTF-8.Stodge
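The byte-versus-character issue raised in these comments can be seen in a few lines. This is only an illustration of the splitting step, not a patch for the linked class; the sample string is made up, and `mb_str_split()` is used only if it is available (PHP 7.4+ with mbstring).

```php
<?php
$text = "na\u{ef}ve"; // "naïve": 5 characters, but 6 bytes in UTF-8

$bytes = str_split($text);                                  // one element per byte
$chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY); // one element per code point

echo count($bytes), ' bytes, ', count($chars), " characters\n";
// 6 bytes, 5 characters

// mb_str_split() gives the same per-character result where it exists.
if (function_exists('mb_str_split')) {
    var_dump(mb_str_split($text, 1, 'UTF-8') === $chars);
}
```

A character-level diff that works on `$bytes` can split a multibyte character across an inserted or deleted run and produce invalid UTF-8 in the output, which is why the per-code-point split matters.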
S
3

A PHP port of Neil Fraser's diff_match_patch (Apache 2.0 licensed)

Sadi answered 25/4, 2012 at 22:28 Comment(1)
Not sure if this is still the best answer in 2023, but it's certainly working fine. You need to wrap the mb_ord and mb_chr functions in if (!function_exists()) {} calls (or delete them) as they're now built-in functions, but it's working a charm. Apache 2.0 licence is very permissive too.Entomb
P
2

What you are looking for is a "diff algorithm". A quick google search led me to this solution. I did not test it, but maybe it will do what you need.

Pulse answered 26/11, 2008 at 16:28 Comment(1)
I've just tested that script and it works well - the diff operation completes very quickly (taking around 10ms to process the short paragraph I tested) and it was able to detect when a line break was added. Running the code as-is generates a couple of PHP notices which you might want to fix, but other than that it's a very good solution if you need to show the differences inline rather than use the traditional side-by-side diff view.Septi
N
1

I would recommend looking at these awesome functions from PHP core:

similar_text — Calculate the similarity between two strings

http://www.php.net/manual/en/function.similar-text.php

levenshtein — Calculate Levenshtein distance between two strings

http://www.php.net/manual/en/function.levenshtein.php

soundex — Calculate the soundex key of a string

http://www.php.net/manual/en/function.soundex.php

metaphone — Calculate the metaphone key of a string

http://www.php.net/manual/en/function.metaphone.php
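A quick look at what these functions return, as a minimal sketch with made-up sample strings. Note that all four produce a similarity score or a phonetic key, not a highlighted diff, so they are better suited to deciding whether two strings are close than to showing where they differ.

```php
<?php
$a = 'kitten';
$b = 'sitting';

// Minimum number of single-character insertions, deletions and substitutions.
echo levenshtein($a, $b), "\n"; // 3

// Number of matching characters; the third argument receives a percentage.
$common = similar_text('World', 'Word', $percent);
echo $common, ' (', round($percent, 1), "%)\n"; // 4 (88.9%)

// Phonetic keys: equal keys suggest a similar English pronunciation.
var_dump(soundex('Robert') === soundex('Rupert'));
var_dump(metaphone('Thompson') === metaphone('Tomson'));
```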

Nonrestrictive answered 24/4, 2014 at 5:18 Comment(0)
D
1

I have tried a simple approach with two text boxes and some color styling. Note: my diff checker only highlights differences between words, not between characters.

    <?php
    $valueOne = $_POST['value'] ?? "";
    $valueTwo = $_POST['valueb'] ?? "" ;
    
    $trimValueOne = trim($valueOne);
    $trimValueTwo = trim($valueTwo);

    $arrayValueOne = explode(" ",$trimValueOne);
    $arrayValueTwo = explode(" ",$trimValueTwo);

    $allDiff = array_merge(array_diff($arrayValueOne, $arrayValueTwo), array_diff($arrayValueTwo, $arrayValueOne));
    if(array_intersect($arrayValueOne,$allDiff) && array_intersect($arrayValueTwo,$allDiff)){

        if(array_intersect($arrayValueOne,$allDiff)){
            $highlightArr = array_intersect($arrayValueOne,$allDiff);
            $highlightArrValue = array_values($highlightArr);
            for ($i=0; $i <count($arrayValueOne) ;$i++) { 
                for ($j=0; $j <count($highlightArrValue) ; $j++) { 
                    if($arrayValueOne[$i] == $highlightArrValue[$j]){
                        $arrayValueOne[$i] = "<span>".$arrayValueOne[$i]."</span>";
                    }
                }
            }
            $strOne = implode(" ",$arrayValueOne);
            echo "<p class = \"one\">{$strOne}</p>";
        }if(array_intersect($arrayValueTwo,$allDiff)){
        $highlightArr = array_intersect($arrayValueTwo,$allDiff);
        $highlightArrValue = array_values($highlightArr);
        for ($i=0; $i <count($arrayValueTwo) ;$i++) { 
            for ($j=0; $j <count($highlightArrValue) ; $j++) { 
                    if($arrayValueTwo[$i] == $highlightArrValue[$j]){
                        $arrayValueTwo[$i] = "<span>".$arrayValueTwo[$i]."</span>";
                    }
                }
        }
        $strTwo = implode(" ",$arrayValueTwo);
        echo "<p class = \"two\">{$strTwo}</p>";
        }
    }elseif(!(array_intersect($arrayValueOne,$allDiff) && array_intersect($arrayValueTwo,$allDiff))){
        if($trimValueOne == $trimValueTwo){
            echo"<p class = \"one green\">$trimValueOne</p>";
            echo"<p class = \"two green\">$trimValueTwo</p>";
        }
        else{
            echo"<p class = \"one \">$trimValueOne</p>";
            echo"<p class = \"two \">$trimValueTwo</p>";
        }

    }
?>


<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
    <link rel="stylesheet" href="./style.css">
</head>
<body>
    <form method="post" action="">
    <textarea type="text" name="value" placeholder="enter first text"></textarea>
    <textarea type="text" name="valueb" placeholder="enter second text"></textarea>
    <input type="submit">
    </form>
</body>
</html>
Daniels answered 27/8, 2021 at 11:42 Comment(0)
M
0

I came across this PHP diff class by Chris Boulton based on Python difflib which could be a good solution:

PHP Diff Lib

Miquelmiquela answered 11/3, 2015 at 20:51 Comment(0)
D
0

Another solution (for side-by-side comparison as opposed to a unified view): https://github.com/danmysak/side-by-side.

Doubledecker answered 15/1, 2020 at 15:30 Comment(0)
T
0

For those just looking for a very simple function to find characters that are in string A but not in string B, I wrote this quick and very simple function.

function strdiff($a,$b){

    $a = str_split($a);
    $b = str_split($b);

    return array_diff($a,$b);

}
Therapy answered 17/1, 2022 at 11:13 Comment(3)
I don't see this working on onlinephp.io/c/53292Kerseymere
All the characters in your lorem ipsum text are present in both strings, thus it returns a diff of 0Therapy
Add some special characters and you'll see that it works. Keep in mind this diffs characters, not words!Therapy
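The behaviour discussed in these comments follows from how `array_diff()` works: it compares sets of values and ignores position. A small demonstration, repeating the function from the answer so the snippet runs on its own (the sample strings are made up):

```php
<?php
// Same function as in the answer above, repeated so this runs standalone.
function strdiff($a, $b)
{
    return array_diff(str_split($a), str_split($b));
}

// Characters of the first string that occur nowhere in the second;
// their original positions are kept as array keys.
print_r(strdiff('hello', 'help')); // Array ( [4] => o )

// Same characters in a different order: nothing is reported.
var_dump(strdiff('listen', 'silent') === []); // bool(true)
```

So this is useful for spotting characters that are missing entirely from the second string, but it cannot tell where text was inserted, removed or moved.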
B
0

Hi this will help you a lot:

$old_data = "We'll of today's hunt we will find inner zen. You are awesome [TEAM_NAME]! Cleveland has a lot more to offer though, so keep on roaming and find some happiness with Let's Roam!;";
$new_data = "We'll of today's hunt we will find inner zen. Great job today, you are freaking super awesome [TEAM_NAME]! though, so keep roaming Cleveland has a lot more to offer and find happiness on www.letsroam.com!;";

if($old_data) {
  $old_words = explode(" " , $old_data);
  $new_words = explode(" ", $new_data);

  $added_words = array();
  $deleted_words = array();
  $unchanged_words = array();
  foreach($new_words as $new_word) {
      $new_word_index = array_search($new_word, $old_words);
      // array_search() returns false when the word is not found (index 0 is a valid hit)
      if ($new_word_index !== false) {
          // word already exists
          array_push($unchanged_words, $new_word);
          unset($old_words[$new_word_index]);
      } else {
          // word does not already exists
          array_push($added_words, $new_word);
      } 
      
   }
 $deleted_words = $old_words;
 $added_word_count = count($added_words);
 $added_word_characters = strlen(implode(" ", $added_words));
}
// print_r() replaces die_r(), a custom debug helper that is not defined in this snippet
print_r(array(
  "old_data"=> $old_data,
  "new_data"=> $new_data,
  "unchanged_words"=> $unchanged_words,
  "added_words"=> $added_words,
  "deleted_words"=> $deleted_words,
  "added_word_count"=>$added_word_count,
  "added_word_characters"=>$added_word_characters
));
Begorra answered 26/5, 2022 at 16:11 Comment(0)
K
0

Try MFStrDiff (https://methodfish.com/Projects/MFStrDiff). Download class.MFStrDiff.php into your path and then use the following code to generate a word-level comparison:

include_once("class.MFStrDiff.php");
$diff=new MFStrDiff();
$diffHtml = $diff->getDiff($tx1, $tx2); // HTML output by default

See demo on https://methodfish.com/Projects/MFStrDiff/demo

Kerseymere answered 22/8, 2023 at 20:53 Comment(0)
