How do Markov Chain Chatbots work?
Asked Answered
P

3

75

I was thinking of creating a chatbot using something like markov chains, but I'm not entirely sure how to get it to work. From what I understand, you create a table from data with a given word and then words which follow. Is it possible to attach any sort of probability or counter while training the bot? Is that even a good idea?

The second part of the problem is with keywords. Assuming I can already identify keywords from user input, how do I generate a sentence which uses that keyword? I don't always want to start the sentence with the keyword, so how do I seed the markov chain?

Primus answered 15/3, 2011 at 2:5 Comment(13)
Tabular 2D data is not really useful for Artifical Intelligence, a graph would be much more beneficial. To get the words which follow, Bayesian Network is the best solution. To form sentences, Language Processing is required and you can't do it with basic logic (such as Markov Chains)Pendulous
You can't do it with Markov Chains? Some bots exist which do use Markov Chains, although it really just puts sentences together, they don't always make sense I suppose.Primus
You can, but not language processing. Just figuring out the most probable sentence.Pendulous
@AbiusX: What do you mean "just figuring out the most probably sentence"? What would something like a bayesian network give me that markov chains wouldn't? What would I process? Aren't Markov Chains somewhat like Bayesian networks? I mean, you're still dealing with probabilities and something like a directed graph, it's just stored in a table instead.At each word you're still choosing the next word to follow with a given probability, aren't you? What's the difference? And could I start out with Markov Chains and add additional logic? Any suggestions? Thanks!Primus
Markov Chains are some sort of one dimensional Bayesian Nets, BNs are more general. You could use markov chains but they are not really useful in a language processing application.Pendulous
@AbiusX: What would you suggest I implement using Bayesian nets then? What would be different? Sorry, I'm still a bit shaky on them :) I don't suppose you know of any examples that I can sink my teeth into? I've found some stuff on Markov chains, but most of the Bayesian Network stuff is pretty abstract.Primus
Bayesian Nets would result in senseless sentences which look natural (For someone not knowing English well, it seems those are just words he's not familiar with) many bots use them since they're easy to employ. Read some about natural language processingPendulous
@AbiusX: I have been, but it's such a broad topic that it's somewhat difficult to know where to start. Do you have any pointers? Thanks! Also, if BNs would result in sentences which don't respond to the user, how is that any different than Markov Chaining?Primus
@Jordan: AI is a broad topic. I once did a chatbot with Bayesian Nets, Semantic Webs and some creative ideas, which sounded good. If you dont know much AI, stick to creative ideas. A Bayesian Net would make your bot say things like "are you peruel?" which is not a correct sentence but makes you feel like its smart! (Apply Bayesian Nets on letter choices in words to make new words, this is what CAPTCHA algorithms do as you've seen senseless but nice words in CAPTCHA images)Pendulous
@AbiusX: Right, the thing I'm trying to understand is what benefit BNs woudl give you that Markov Chains wouldn't? What would be implemented differently, aside from having a directed graph instead of keeping it in a database? It might be faster to respond, but what sort of functionality would be different? It would seem to me that it's easier/better to store that on disk rather than keep everything loaded in memory anyway?Primus
@Jordan: Markov Chains can help you choose between a 1D set of things, Bayesian networks help to choose from a set of 2D thing (two factors). In your case Markovs Would doPendulous
Ah, yeah, that makes sense. Thanks! It would be nice to extend this a bit though, in which case you might be right.Primus
@Primus did you manage to create the chatbot based on a markov model?Collectivity
S
143

I made a Markov chain chatbot for IRC in Python a few years back and can shed some light how I did it. The text generated does not necessarily make any sense, but it can be really fun to read. Lets break it down in steps. Assuming you have a fixed input, a text file, (you can use input from chat text or lyrics or just use your imagination)

Loop through the text and make a Dictionary, meaning key - value container. And put all pair of words as keys and the word following as a value. For example: If you have a text "a b c a b k" you start with "a b" as key and "c" as value, then "b c" and "a" as value... the value should be a list or any collection holding 0..many 'items' as you can have more than one value for a given pair of words. In the example above you will have "a b" two times followed fist by "c" then in the end by "k". So in the end you will have a dictionary/hash looking like this: {'a b': ['c','k'], 'b c': ['a'], 'c a': ['b']}

Now you have the needed structure for building your funky text. You can choose to start with a random key or a fixed place! So given the structure we have we can start by saving "a b" then randomly taking a following word from the value, c or k, so the first save in the loop, "a b k" (if "k" was the random value chosen) then you continue by moving one step to the right which in our case is "b k" and save a random value for that pair if you have, in our case no, so you break out of the loop (or you can decide other stuff like start over again). When to loop is done you print your saved text string.

The bigger the input, the more values you will have for you keys (pair of words) and will then have a "smarter bot" so you can "train" your bot by adding more text (perhaps chat input?). If you have a book as input, you can construct some nice random sentences. Please note that you don't have to take only one word that follows a pair as a value, you can take 2 or 10. The difference is that your text will appear more accurate if you use "longer" building blocks. Start with a pair as a key and the following word as a value.

So you see that you basically can have two steps, first make a structure where you randomly choose a key to start with then take that key and print a random value of that key and continue till you do not have a value or some other condition. If you want you can "seed" a pair of words from a chat input from your key-value structure to have a start. Its up to your imagination how to start your chain.

Example with real words:

"hi my name is Al and i live in a box that i like very much and i can live in there as long as i want"

"hi my" -> ["name"]

"my name" -> ["is"]

"name is" -> ["Al"]

"is Al" -> ["and"]

........

"and i" -> ["live", "can"]

........

"i can" -> ["live"]

......

Now construct a loop:

Pick a random key, say "hi my" and randomly choose a value, only one here so its "name" (SAVING "hi my name").
Now move one step to the right taking "my name" as the next key and pick a random value... "is" (SAVING "hi my name is").
Now move and take "name is" ... "Al" (SAVING "hi my name is AL").
Now take "is Al" ... "and" (SAVING "hi my name is Al and").

...

When you come to "and i" you will randomly choose a value, lets say "can", then the word "i can" is made etc... when you come to your stop condition or that you have no values print the constructed string in our case:

"hi my name is Al and i can live in there as long as i want"

If you have more values you can jump to any keys. The more values the more combinations you have and the more random and fun the text will be.

Spondee answered 15/3, 2011 at 3:30 Comment(6)
If you use chat text as input, you have to have a lot to make the bot "smart" When a person mention the botname a response was triggered. I took the whole string written and checked first who wrote it and started my response with that name, like "Jordan: bla bla" then i chose a pair of words from that string and started there. I could also make up two sentences with the second start with "and (random word)" I also randomly chose how long my response was, maybe 5-50 words. I also had more than one 'brain' with more structures i could play with. A brain is basically just a textfile.Spondee
Forgot to say that if you use chat input it can be hard as people tend to make a lot of spelling mistakes and shortcuts, then your bot will be like that. You can also look into natural language processing if you have time, here is something in Python linkSpondee
If you still have the source anywhere, throw it up on github.Bingen
I did something like this where I had a program scan over Oliver Twist, and extract every single 4-word chunk combination, and the frequency of each. It took forever to generate the frequency map, and only output garbage Strings of near meaningless text. I wish I new about this method back then!Pugnacious
i think there is a missing piece here, as Markov chains takes decision based on probabilities, so when it comes to the "and i" the system should be able to re-evaluate the entire string (not just the last key-pair) until there to determine the next word. The system should tell you that "live" should be the next one, but not by chance.Gilly
here is additional info on weights for each state youtube.com/watch?v=L97yQMT0jn8Gilly
P
10

The bot chooses a random word from your input and generates a response by choosing another random word that has been seen to be a successor to its held word. It then repeats the process by finding a successor to that word in turn and carrying on iteratively until it thinks it’s said enough. It reaches that conclusion by stopping at a word that was prior to a punctuation mark in the training text. It then returns to input mode again to let you respond, and so on.

It isn’t very realistic but I hereby challenge anyone to do better in 71 lines of code !! This is a great challenge for any budding Pythonists, and I just wish I could open the challenge to a wider audience than the small number of visitors I get to this blog. To code a bot that is always guaranteed to be grammatical must surely be closer to several hundred lines, I simplified hugely by just trying to think of the simplest rule to give the computer a mere stab at having something to say.

Its responses are rather impressionistic to say the least ! Also you have to put what you say in single quotes.

I used War and Peace for my “corpus” which took a couple of hours for the training run, use a shorter file if you are impatient…

here is the trainer

#lukebot-trainer.py
import pickle
b=open('war&peace.txt')
text=[]
for line in b:
    for word in line.split():
        text.append (word)
b.close()
textset=list(set(text))
follow={}
for l in range(len(textset)):
    working=[]
    check=textset[l]
    for w in range(len(text)-1):
        if check==text[w] and text[w][-1] not in '(),.?!':
            working.append(str(text[w+1]))
    follow[check]=working
a=open('lexicon-luke','wb')
pickle.dump(follow,a,2)
a.close()

Here is the bot:

#lukebot.py
import pickle,random
a=open('lexicon-luke','rb')
successorlist=pickle.load(a)
a.close()
def nextword(a):
    if a in successorlist:
        return random.choice(successorlist[a])
    else:
        return 'the'
speech=''
while speech!='quit':
    speech=raw_input('>')
    s=random.choice(speech.split())
    response=''
    while True:
        neword=nextword(s)
        response+=' '+neword
        s=neword
        if neword[-1] in ',?!.':
            break
    print response

You tend to get an uncanny feeling when it says something that seems partially to make sense.

Peptonize answered 28/5, 2014 at 1:12 Comment(0)
H
0

You could do like this: Make a order 1 markov chain generator, using words and not letters. Everytime someone post something, what he posted is added to bot database. Also bot would save when he gone to chat and when a guy posted the first post (in multiples of 10 seconds), then he would save the amount of time this same guy waited to post again (in multiples of 10 seconds)... This second part would be used to see when the guy will post, so he join the chat and after some amount of time based on a table with "after how many 10 seconds the a guy posted after joining the chat", then he would continue to post with the same table thinking "how was the amount of time used to write the the post that was posted after a post that he used X seconds to think about and write"

Hallucinate answered 2/5, 2014 at 13:31 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.