I think leveraging a standard phonetic algorithm would be a good idea. Soundex might be a bit limited, but Double Metaphone would probably be a good choice.
Get the Metaphone representations of the two words, remove the first character of each, and check whether what remains of the shorter code matches the end of the longer one. With Double Metaphone the approach is the same, except that each word has a primary and a secondary code, so make four comparisons: primary to primary, primary to secondary, secondary to primary, and secondary to secondary.
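As a rough sketch of that comparison step in Python: the functions below assume you already have the phonetic codes from whatever Double Metaphone implementation you use, passed in as a `(primary, secondary)` pair where the secondary may be empty. The names `tails_match` and `may_rhyme` are made up for illustration, and rejecting an empty remainder is my own assumption rather than part of the algorithm.

```python
from itertools import product


def tails_match(code_a: str, code_b: str) -> bool:
    """Drop the first character of each code, then check whether the
    shorter remainder matches the end of the longer one."""
    tail_a, tail_b = code_a[1:], code_b[1:]
    shorter, longer = sorted((tail_a, tail_b), key=len)
    # Assumption: an empty remainder would match anything, so reject it.
    return bool(shorter) and longer.endswith(shorter)


def may_rhyme(codes_a, codes_b) -> bool:
    """Compare two (primary, secondary) code pairs in all four combinations.

    Empty codes (for example a missing secondary) are skipped.
    """
    return any(
        tails_match(a, b)
        for a, b in product(codes_a, codes_b)
        if a and b
    )


# Illustrative codes only; real values come from your encoder.
print(may_rhyme(("KT", ""), ("HT", "")))
```

Because the codes are passed in rather than computed, the same check works with plain Metaphone too: just pass a pair whose second element is empty.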
I think that would be a good starting point.
A note on this and many other phonetic algorithms: they aren't designed to give a precise phonetic representation. Regional pronunciations, common mispronunciations and alternate pronunciations make a single hard-and-fast correct pronunciation impossible to obtain from the spelling alone. Novel spelling and letter usage make it hard to algorithmically obtain even a close pronunciation (care for some hors d'oeuvres?). Also, a major goal of many such algorithms is to match similar-sounding or misheard words or names to each other, so the results are usually intended to be a bit imprecise (which is probably a good thing for this purpose as well).