Determining which inputs to weigh in an evolutionary algorithm

I once wrote a Tetris AI that played Tetris quite well. The algorithm I used (described in this paper) is a two-step process.

In the first step, the programmer picks inputs (features) that seem "interesting" for the problem. In Tetris we might track how many gaps there are in each row, because minimizing gaps could help place future pieces more easily. Another might be the average column height, because it may be a bad idea to take risks when you're close to losing.

The second step is determining weights associated with each input. This is the part where I used a genetic algorithm. Any learning algorithm will do here, as long as the weights are adjusted over time based on the results. The idea is to let the computer decide how the input relates to the solution.
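To make the weight-learning step concrete, here is a minimal sketch of the kind of evolutionary loop described above. It is not the algorithm from the cited paper; the fitness function, population size, and mutation scheme are all illustrative stand-ins. A simple keep-the-best hill climb with Gaussian mutation is enough to show the shape of the idea:

```python
import random

def mutate(weights, sigma=0.1):
    """Produce a child by jittering each weight slightly."""
    return [w + random.gauss(0, sigma) for w in weights]

def evolve(fitness, n_weights, population=20, generations=50, seed=0):
    """Tiny keep-the-best evolutionary loop.

    `fitness` maps a weight vector to a score (in Tetris this would be
    something like lines cleared over a few games).  Each generation we
    mutate the current best individual and keep any child that improves
    on it.
    """
    random.seed(seed)
    best = [random.uniform(-1, 1) for _ in range(n_weights)]
    best_fit = fitness(best)
    for _ in range(generations):
        for _ in range(population):
            child = mutate(best)
            f = fitness(child)
            if f > best_fit:
                best, best_fit = child, f
    return best, best_fit
```

In a real Tetris setting the fitness evaluation dominates the runtime, since each candidate weight vector has to play whole games to be scored.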

Using these inputs and their weights we can score any candidate action. For example, if dropping the straight piece all the way down the right column will fill the gaps in 4 different rows, that action can get a very high score if the corresponding weight is high. Likewise, laying it flat on top might actually create gaps, so that action gets a low score.
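The two features mentioned above can be sketched as follows. The board encoding (a grid of 0/1 rows) and the feature names are my own illustration, not the representation from the cited paper; the scoring is just the weighted sum described above:

```python
def count_gaps(board):
    """Count empty cells that have a filled cell somewhere above them.

    `board` is a list of rows (top row first), each row a list of 0/1.
    """
    gaps = 0
    for col in zip(*board):            # iterate over columns
        seen_filled = False
        for cell in col:               # walk the column top to bottom
            if cell:
                seen_filled = True
            elif seen_filled:
                gaps += 1
    return gaps

def average_height(board):
    """Average over columns of the height of the topmost filled cell."""
    heights = []
    for col in zip(*board):
        height = 0
        for i, cell in enumerate(col):
            if cell:
                height = len(col) - i
                break
        heights.append(height)
    return sum(heights) / len(heights)

FEATURES = [count_gaps, average_height]

def evaluate(board, weights):
    """Score a resulting board as a weighted sum of feature values."""
    return sum(w * f(board) for w, f in zip(weights, FEATURES))
```

To score an action, you would apply it to a copy of the board and call `evaluate` on the result; with negative weights on both features, boards with fewer gaps and lower stacks win.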

I've always wondered if there's a way to apply a learning algorithm to the first step, where we find "interesting" potential inputs. It seems possible to write an algorithm where the computer first learns what inputs might be useful, then applies learning to weigh those inputs. Has anything been done like this before? Is it already being used in any AI applications?

Lithotrity answered 28/10, 2009 at 18:0 Comment(1)
+1 I am trying to get started in this field. I have a couple of pet demo programs but nothing big yet. Interested to see what kind of answers you get back on this.Ethyne

In neural networks, you can select 'interesting' potential inputs by finding the ones that have the strongest correlation, positive or negative, with the classifications you're training for. I imagine you can do similarly in other contexts.
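As a rough sketch of what "strongest correlation" means here: compute the Pearson correlation between each candidate input and the label across your training cases, and keep the inputs whose correlation is far from zero. The toy data below is invented for illustration:

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy training set: (candidate input value, label 1 = "A", 0 = "not A").
samples = [(0.9, 1), (0.8, 1), (0.7, 1), (0.2, 0), (0.3, 0), (0.1, 0)]
inputs = [s[0] for s in samples]
labels = [s[1] for s in samples]

# |r| near 1 means a promising feature; near 0 means useless noise.
r = pearson(inputs, labels)
```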

Curling answered 28/10, 2009 at 18:4 Comment(6)
What do you mean by "correlation with classifications" in this context?Lithotrity
Say you're training a neural net to classify patterns as "the letter A" or "not the letter A". You have a bunch of training cases where you have some data and you know whether or not it's an A. You can slice and dice that data any number of ways, each one of which is a potential input. The best potential inputs are the ones that show a strong numeric correlation with the A-or-not-A state. If a potential input doesn't vary, it's useless. If it varies randomly, it's useless. If it varies in coordination with the A-or-not-Aness of the pattern, it's gold.Curling
Ah, I see! I hadn't thought of using pre-existing sample data (it's hard to imagine in Tetris). In fact, I think that recaptcha (recaptcha.net/learnmore.html) does this. It didn't occur to me until I read your example.Lithotrity
As far as pre-existing data for Tetris goes: once your system has played a game, the record of what situations it faced, what decisions it made (perhaps randomly, to start) and what the outcomes were constitutes a corpus of data you can use for training.Curling
Would it be a good idea to let people play a few games and use those as samples? I'd think it would converge much faster if the actions are purposeful.Lithotrity
Sounds like a great data source, sure.Curling

I think I might approach the problem you're describing by feeding more primitive data to the learning algorithm. For instance, a Tetris game state may be described by the list of occupied cells, so a string of bits encoding that information would be a suitable input for that stage of the learning algorithm. Actually training on it is still challenging, though: how do you know whether the results are useful? I suppose you could roll the whole algorithm into a single blob, where the algorithm is fed the successive states of play and simply outputs the block placements, with higher-scoring algorithms selected for future generations.
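The "primitive data" encoding is simple enough to sketch; the row-major bit ordering here is my own arbitrary choice:

```python
def board_to_bits(board):
    """Flatten a 2-D grid of occupied cells (rows of 0/1) into one
    bit string, row by row -- the raw, featureless input described above."""
    return ''.join('1' if cell else '0' for row in board for cell in row)
```

A fixed-size board always yields a fixed-length string, so the same learner can consume every state without any hand-designed features.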

Another choice might be to use a large corpus of plays from other sources, such as recorded games from human players or a hand-crafted AI, and select the algorithms whose outputs correlate strongly with some interesting fact about the future play, such as the score earned over the next 10 moves.

Sharpshooter answered 28/10, 2009 at 18:33 Comment(2)
I think your first suggestion is to change the model with which the problem is represented. It could be easier to code but I wonder if it will actually help with the learning. I really like the idea of using other sources though.Lithotrity
I'm not totally satisfied with my own answer either. If I had to guess, I would suppose this probably increases the learning time by about 2 orders of magnitude.Sharpshooter

Yes, there is a way.

If you have M candidate features there are 2^M possible subsets, so there is a lot to look at. I would do the following:

For each subset S
   run your code to optimize the weights W
   save S and the corresponding W

Then, for each pair (S, W), you can run G games and save the score L of each one. Now you have a table like this:

feature1    feature2    feature3    featureM   subset_code game_number    scoreL
1           0           1           1           S1         1              10500
1           0           1           1           S1         2              6230
...
0           1           1           0           S2         G + 1          30120
0           1           1           0           S2         G + 2          25900

Now you can run some component-selection algorithm (PCA, for example) on this table and decide which features best explain scoreL.
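The subset-enumeration loop above can be sketched like this. The two placeholder functions stand in for the expensive parts (the weight-optimizing GA and actually playing a game); their bodies here are dummies I made up so the sketch runs, and the feature names are illustrative:

```python
from itertools import combinations
from statistics import mean

FEATURES = ['gaps', 'avg_height', 'bumpiness']  # M = 3 illustrative features
G = 5                                           # games per subset

def train_weights(subset):
    """Placeholder for the weight-optimization step (e.g. a GA)."""
    return {f: 1.0 for f in subset}

def play_game(weights, seed):
    """Placeholder: play one seeded game with these weights and return
    its score.  Seeding means every subset faces the same piece sequence.
    The formula below is a dummy so the sketch is runnable."""
    return seed * 100 + len(weights) * 1000

# Build the table: one row per (subset, game).
rows = []
for m in range(1, len(FEATURES) + 1):
    for subset in combinations(FEATURES, m):
        weights = train_weights(subset)
        for g in range(G):
            rows.append({'subset': subset, 'game': g,
                         'score': play_game(weights, g)})

# Aggregate score per subset -- the raw material for deciding which
# features actually explain the score.
by_subset = {}
for row in rows:
    by_subset.setdefault(row['subset'], []).append(row['score'])
best = max(by_subset, key=lambda s: mean(by_subset[s]))
```

In practice the exhaustive loop over all 2^M subsets is only feasible for small M; beyond that you would sample subsets or grow them greedily.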

A tip: When running the code to optimize W, seed the random number generator, so that each different 'evolving brain' is tested against the same piece sequence.

I hope this helps!

Tympanitis answered 24/2, 2013 at 16:55 Comment(0)