Diff/Patch functions
Unix-like environments have handy diff and patch functions to track and make incremental changes to files. I wanted the same functionality for strings in PHP. Since I couldn't find anything I liked (and it was nearly impossible to get libXdiff installed), I wrote my own.
Download
Click Here for the Source Code. This is saved as a text file so the web server won't mess with the code. You will want to save it with a "php" extension, not "txt".
DIFF
diff( initial_string, changed_string, [minimum_match] )
- initial_string: The initial string to be changed.
- changed_string: The string containing changes.
- minimum_match: (optional) Minimum number of characters to match.
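In the resulting differences, a "-" marks text that has been removed and a "+" marks text that has been added:
- This section has been removed.
+ This section has been added.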
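The optional minimum_match parameter keeps diff from matching up short sequences of letters. Each entry in the returned array is a character position (counted from 0), followed by "-" or "+", followed by the affected text. Examples of minimum_match:
diff("Sam", "Bart", 1) = ("0-S", "0+B", "2-m", "2+rt")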
diff("Sam", "Bart", 2) = ("0-Sam", "0+Bart")
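To make the effect of minimum_match concrete, here is a minimal sketch of how such a diff could work. It is not the downloadable source: the names sketch_diff and sketch_longest_match are hypothetical, the longest-match search is a deliberately naive scan, and the offset convention (removals are positioned in the initial string, additions in the changed string) is an assumption inferred from the two examples above. It works on bytes, so multibyte text is not handled.

```php
<?php
// Sketch only: function names and the offset convention are assumptions,
// not the code from the download.

// Naive longest common substring: returns [start in $a, start in $b, length].
function sketch_longest_match(string $a, string $b): array
{
    $best = [0, 0, 0];
    $lenA = strlen($a);
    $lenB = strlen($b);
    for ($i = 0; $i < $lenA; $i++) {
        for ($j = 0; $j < $lenB; $j++) {
            // Count how far the strings agree starting at ($i, $j).
            $k = 0;
            while ($i + $k < $lenA && $j + $k < $lenB && $a[$i + $k] === $b[$j + $k]) {
                $k++;
            }
            if ($k > $best[2]) {
                $best = [$i, $j, $k];
            }
        }
    }
    return $best;
}

function sketch_diff(string $old, string $new, int $minimumMatch = 1, int $oldOffset = 0, int $newOffset = 0): array
{
    [$posOld, $posNew, $length] = sketch_longest_match($old, $new);

    // No usable match: report the whole old piece as removed and the
    // whole new piece as added.
    if ($length === 0 || $length < $minimumMatch) {
        $entries = [];
        if ($old !== '') {
            $entries[] = $oldOffset . '-' . $old;
        }
        if ($new !== '') {
            $entries[] = $newOffset . '+' . $new;
        }
        return $entries;
    }

    // Keep the matched text and diff whatever lies on either side of it.
    $before = sketch_diff(
        substr($old, 0, $posOld),
        substr($new, 0, $posNew),
        $minimumMatch,
        $oldOffset,
        $newOffset
    );
    $after = sketch_diff(
        substr($old, $posOld + $length),
        substr($new, $posNew + $length),
        $minimumMatch,
        $oldOffset + $posOld + $length,
        $newOffset + $posNew + $length
    );
    return array_merge($before, $after);
}
```

With a minimum of 1, the single shared "a" in "Sam" and "Bart" counts as a match, so the text on either side of it is reported separately. With a minimum of 2, that match is too short and both strings are reported whole.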
PATCH
patch( initial_string, diff_array )
- initial_string: The string to be patched.
- diff_array: Array of differences.
patch("Bart", "0-B") = "art"
patch("Bart", array("0-B", "0+C")) = "Cart"
patch("Bart", array(
    array("0-B", "0+C"),
    array("1-a", "1+ove"))) = "Covert"
Using an array of diff arrays will allow you to store incremental changes and then apply multiple changes at once.
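The three call shapes above (a single entry, one diff array, an array of diff arrays) can be sketched as follows. This is an illustration under assumptions, not the downloadable implementation: sketch_patch and sketch_parse_entry are hypothetical names, and the sketch assumes that "-" positions refer to the string being patched while "+" positions refer to the result. It needs PHP 7.4 or later for the arrow functions.

```php
<?php
// Sketch only: names and the offset convention are assumptions.

// Split an entry such as "2+rt" into [position, operation, text].
function sketch_parse_entry(string $entry): array
{
    if (!preg_match('/^(\d+)([+-])(.*)/s', $entry, $m)) {
        throw new InvalidArgumentException("Malformed diff entry: $entry");
    }
    return [(int) $m[1], $m[2], $m[3]];
}

function sketch_patch(string $text, $diff): string
{
    if (is_string($diff)) {
        $diff = [$diff];                  // a single entry, e.g. "0-B"
    }
    if (is_array(reset($diff))) {         // an array of diff arrays
        foreach ($diff as $step) {
            $text = sketch_patch($text, $step);
        }
        return $text;
    }

    $removals = [];
    $additions = [];
    foreach ($diff as $entry) {
        [$pos, $op, $chunk] = sketch_parse_entry($entry);
        if ($op === '-') {
            $removals[] = [$pos, $chunk];
        } else {
            $additions[] = [$pos, $chunk];
        }
    }

    // Remove from the end first so that earlier positions stay valid.
    usort($removals, fn($a, $b) => $b[0] <=> $a[0]);
    foreach ($removals as [$pos, $chunk]) {
        $text = substr_replace($text, '', $pos, strlen($chunk));
    }

    // Insert from the start so each position refers to a finished prefix.
    usort($additions, fn($a, $b) => $a[0] <=> $b[0]);
    foreach ($additions as [$pos, $chunk]) {
        $text = substr_replace($text, $chunk, $pos, 0);
    }
    return $text;
}
```

Applying each inner array in order is what lets a stored history of changes be replayed in one call.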
UNPATCH
unpatch( final_string, diff_array )
This takes the same kind of arguments as the patch function, but works in reverse: it removes patches and turns a patched string back into the original string.
I wrote this because I found that a wiki is faster if you store the patched text in your database and unpatch it only when viewing the history.
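As a rough illustration of the reverse direction, the sketch below undoes a diff by taking the "+" text back out and putting the "-" text back in. The names sketch_unpatch and sketch_split_entry are hypothetical, and the same assumption applies as for patching: "+" positions refer to the patched string and "-" positions to the original. When given an array of diff arrays, it undoes the most recent one first. It needs PHP 7.4 or later.

```php
<?php
// Sketch only: names and the offset convention are assumptions.

// Split an entry such as "2+rt" into [position, operation, text].
function sketch_split_entry(string $entry): array
{
    if (!preg_match('/^(\d+)([+-])(.*)/s', $entry, $m)) {
        throw new InvalidArgumentException("Malformed diff entry: $entry");
    }
    return [(int) $m[1], $m[2], $m[3]];
}

function sketch_unpatch(string $text, $diff): string
{
    if (is_string($diff)) {
        $diff = [$diff];                  // a single entry
    }
    if (is_array(reset($diff))) {         // an array of diff arrays
        // Undo the most recent set of changes first.
        foreach (array_reverse($diff) as $step) {
            $text = sketch_unpatch($text, $step);
        }
        return $text;
    }

    $removed = [];
    $added = [];
    foreach ($diff as $entry) {
        [$pos, $op, $chunk] = sketch_split_entry($entry);
        if ($op === '-') {
            $removed[] = [$pos, $chunk];
        } else {
            $added[] = [$pos, $chunk];
        }
    }

    // Take the added text back out, last position first.
    usort($added, fn($a, $b) => $b[0] <=> $a[0]);
    foreach ($added as [$pos, $chunk]) {
        $text = substr_replace($text, '', $pos, strlen($chunk));
    }

    // Put the removed text back, first position first.
    usort($removed, fn($a, $b) => $a[0] <=> $b[0]);
    foreach ($removed as [$pos, $chunk]) {
        $text = substr_replace($text, $chunk, $pos, 0);
    }
    return $text;
}
```

In the wiki scenario, the database would hold the current text plus the stored diff arrays, and older revisions would be rebuilt by unpatching only when someone asks for them.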
Longest Substring Match
Finding the diff between two long strings requires you to find the longest substring match between the two strings.
Initially, I used a very slow method. If one string was M characters long and the other was N characters long, it took M*N*N time to complete (in big-O terms, that is O(n^3) time, a very long time).
I then found pseudocode on Wikipedia that is supposed to run in M*N time (which is O(n^2), still slow, but much better).
I wrote it in PHP and it quickly ran out of memory.
Then, I rewrote the function again, this time conserving memory.
Now, I have a cool "longest substring match" function that is both fast and easy on memory.
Feel free to steal it for anything you are working on that needs that function.
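For readers who want to see the idea, here is a minimal sketch of one common way to get M*N time without an M*N table: the dynamic-programming approach only ever looks at the previous row, so two rows of N+1 integers are enough. This is an assumption about how memory could be conserved, not necessarily what the downloadable function does, and the name sketch_longest_common_substring is hypothetical. It compares bytes, so multibyte text is not handled.

```php
<?php
// Sketch only: one memory-saving variant of the standard dynamic-programming
// approach, not the function from the download.
// Returns [start in $a, start in $b, length] of the longest common substring.
function sketch_longest_common_substring(string $a, string $b): array
{
    $lenA = strlen($a);
    $lenB = strlen($b);
    $best = [0, 0, 0];

    // $previous[$j] holds the length of the common run ending at
    // $a[$i - 2] and $b[$j - 1]; only this one earlier row is needed.
    $previous = array_fill(0, $lenB + 1, 0);

    for ($i = 1; $i <= $lenA; $i++) {
        $current = array_fill(0, $lenB + 1, 0);
        for ($j = 1; $j <= $lenB; $j++) {
            if ($a[$i - 1] === $b[$j - 1]) {
                // Extend the run that ended one character earlier in both strings.
                $current[$j] = $previous[$j - 1] + 1;
                if ($current[$j] > $best[2]) {
                    $best = [$i - $current[$j], $j - $current[$j], $current[$j]];
                }
            }
        }
        $previous = $current;
    }
    return $best;
}
```

A full table for two 10,000-character strings would need 100 million cells, while the two-row version needs about 20,000, which is the kind of saving that keeps PHP from running out of memory.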