find the Hamming distance between two DNA strings
I

3

7

I'm just learning Python 3 now. The task is to ask the user for two strings and find the Hamming distance between them. Input sequences should only include the nucleotides 'A', 'T', 'G' and 'C'. The program should ask the user to re-enter the sequence if the user enters an invalid character. The program should also check that the strings are the same length; if they are not, it should ask the user to enter the strings again. The user should be able to enter upper case, lower case, or a mix of both.

The program should print the output in the following format:

please enter string one: GATTACA
please enter string two: GACTATA
GATTACA
|| || |  
GACTATA
The hamming distance of sequence GATTACA and GACTATA is 2
So the Hamming distance is 2.

What I have tried so far is below, but I could not get the answer.

def hamming_distance(string1, string2):
    string1 = input("please enter first sequence")
    string2 = input("please enter second sequence")
    distance = 0
     L = len(string1)
    for i in range(L):
        if string1[i] != string2[i]:
            distance += 1
    return distance
Incurrent answered 15/2, 2018 at 4:28 Comment(2)
What is the error that you are facing? - Rhettrhetta
You have passed string1 and string2 as parameters and then take input from the user again. Is that what you intended to do? Can you clarify what you meant by "could not get an answer"? - Folie
S
4

There is an indentation error on the line L = len(string1): it has one extra leading space. Here is a simpler version:

def hamming_distance(s1, s2):
    if len(s1) != len(s2):
        raise ValueError("Strand lengths are not equal!")
    return sum(ch1 != ch2 for ch1,ch2 in zip(s1,s2))
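The function above only computes the distance. To cover the rest of what the question asks for (validation, re-prompting, mixed case, and the line of `|` match markers), here is a minimal sketch. The helper names `is_valid_sequence`, `match_line`, `read_sequence` and `main` are made up for illustration, and rejecting an empty input is my assumption, since the question does not say what to do with it.

```python
VALID_BASES = set("ATGC")


def is_valid_sequence(seq):
    """Return True if seq is non-empty and contains only A, T, G, C.

    Rejecting the empty string is an assumption; the question does not cover it.
    """
    return len(seq) > 0 and set(seq) <= VALID_BASES


def hamming_distance(s1, s2):
    """Count the positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        raise ValueError("Strand lengths are not equal!")
    return sum(ch1 != ch2 for ch1, ch2 in zip(s1, s2))


def match_line(s1, s2):
    """Build the marker line: '|' where characters match, ' ' where they differ."""
    return "".join("|" if ch1 == ch2 else " " for ch1, ch2 in zip(s1, s2))


def read_sequence(prompt, read=input):
    """Keep asking until the user enters a valid sequence; return it upper-cased."""
    while True:
        seq = read(prompt).strip().upper()
        if is_valid_sequence(seq):
            return seq
        print("Invalid sequence: use only A, T, G and C.")


def main(read=input):
    # Ask for both strings again if their lengths differ.
    while True:
        s1 = read_sequence("please enter string one: ", read)
        s2 = read_sequence("please enter string two: ", read)
        if len(s1) == len(s2):
            break
        print("The sequences must be the same length; please enter them again.")
    print(s1)
    print(match_line(s1, s2))
    print(s2)
    print(f"The hamming distance of sequence {s1} and {s2} is "
          f"{hamming_distance(s1, s2)}")
```

Call `main()` to run it interactively. The `read` parameter defaults to the built-in `input` and is only there so the prompting logic can be exercised without a keyboard.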
Seabee answered 15/2, 2018 at 4:39 Comment(1)
Please explain in plain English the return statement above, thanks. What are ch1 and ch2? - Halfcocked
B
3

Alternatively, you could use this. I also added a check that raises an exception, because the Hamming distance is only defined for sequences of equal length, so an attempt to calculate it between sequences of different lengths should not work.

def distance(str1, str2):
    if len(str1) != len(str2):
        raise ValueError("Strand lengths are not equal!")
    else:
        return sum(1 for (a, b) in zip(str1, str2) if a != b)
Bedcover answered 18/7, 2019 at 8:23 Comment(2)
Please explain in plain English the return statement above, thanks. What are a and b doing there, and what do they represent? - Halfcocked
These are the characters from str1 and str2. If they're different you get 1, and then you sum all of those up to get the number of different characters between str1 and str2. That's your Hamming distance. - Bedcover
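To make the explanation above concrete, here is a sketch of the same logic as an explicit loop. The name `distance_explicit` is made up for illustration. `zip` pairs up the characters at the same position, and the loop adds 1 for each pair that differs.

```python
def distance_explicit(str1, str2):
    """Same result as the one-line version, spelled out step by step."""
    if len(str1) != len(str2):
        raise ValueError("Strand lengths are not equal!")
    count = 0
    # zip yields one (a, b) pair per position: a from str1, b from str2.
    for a, b in zip(str1, str2):
        if a != b:
            count += 1
    return count


print(list(zip("GAT", "GCT")))                  # [('G', 'G'), ('A', 'C'), ('T', 'T')]
print(distance_explicit("GATTACA", "GACTATA"))  # 2
```

The variant that writes `sum(ch1 != ch2 for ...)` works the same way: each comparison is `True` or `False`, and Python counts `True` as 1 and `False` as 0 when summing.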
W
2

The Wikipedia page for Hamming distance has elegant Python and C implementations for computing it. Those implementations treat the Hamming distance as undefined for sequences of different lengths. However, there are two possible ways to compute and report a distance for strings of different lengths:

1) Perform a sequence alignment and then compute the Hamming distance between the two gap-filled character arrays. With an optimal alignment and a cost of 1 for each mismatch or gap, this is the edit distance, also called the Levenshtein distance.

2) Alternatively, one could use the zip_longest function from itertools. The following implementation is equivalent to padding the shorter string with gap characters at the end so that it matches the length of the longer string. [Note: compared to approach 1, the value returned by this method can be an overestimate of the distance, as it doesn't account for alignment.]

import itertools

def hammingDist(str1, str2, fillchar = '-'):
    return sum([ch1 != ch2 for (ch1,ch2) in itertools.zip_longest(str1, str2, fillvalue = fillchar)])


def main():
    # Running test cases:    
    print('Expected value \t Value returned')
    print(0,'\t', hammingDist('ABCD','ABCD'))
    print(1,'\t', hammingDist('ABCD','ABED'))
    print(2,'\t', hammingDist('ABCD','ABCDEF'))
    print(2,'\t', hammingDist('ABCDEF','ABCD'))
    print(4,'\t', hammingDist('ABCD',''))
    print(4,'\t', hammingDist('','ABCD'))
    print(1,'\t', hammingDist('ABCD','ABcD'))

if __name__ == "__main__":
    main()    
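For comparison with approach 1, here is a minimal sketch of the Levenshtein distance using the standard dynamic-programming recurrence with a cost of 1 per insertion, deletion or substitution. The function names `levenshtein` and `padded_hamming` are made up for illustration; `padded_hamming` just repeats the zip_longest idea so that the block runs on its own.

```python
import itertools


def padded_hamming(str1, str2, fillchar="-"):
    """Approach 2: pad the shorter string at the end, then count mismatches."""
    pairs = itertools.zip_longest(str1, str2, fillvalue=fillchar)
    return sum(ch1 != ch2 for ch1, ch2 in pairs)


def levenshtein(s1, s2):
    """Approach 1: minimum number of single-character edits turning s1 into s2."""
    # previous[j] is the distance between the prefix of s1 seen so far and s2[:j].
    previous = list(range(len(s2) + 1))
    for i, ch1 in enumerate(s1, start=1):
        current = [i]  # turning s1[:i] into the empty string takes i deletions
        for j, ch2 in enumerate(s2, start=1):
            cost = 0 if ch1 == ch2 else 1
            current.append(min(
                previous[j] + 1,         # delete ch1
                current[j - 1] + 1,      # insert ch2
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


print(padded_hamming("GATTACA", "ATTACA"))  # 6
print(levenshtein("GATTACA", "ATTACA"))     # 1
```

Dropping the leading G shifts every later character by one position, so the padded comparison reports 6 mismatches while the edit distance is 1. When the extra characters sit at the end, as in 'ABCD' versus 'ABCDEF', both approaches give 2.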
Waylin answered 29/1, 2019 at 19:41 Comment(0)
