the action memory engine in project lecherous gnomes

ville helin (vhelin#iki.fi)

versions:

26-may-2003
 * first release

28-may-2003
 * small additions here and there


the ai bots need a way to accumulate and store the actions they have
performed or heard about. the action memory engine does this, and is
itself a very small part of the ai, as the overview will show.


system overview

the action memory engine maintains a list of experiences. each time a
new action is executed (or learned from other bots) the system merges
it with the old ones. the action memory engine differs from the rumor
engine in that here we handle information about actions and their
results, not about actions and their participants.


an action memory

here's an example: creature C_a eats a mushroom and gets nutrition
from it. afterwards creature C_a remembers that it got some nutrition
by eating a mushroom of that kind. this information can be valuable in
hard times when there is nothing to eat but these mushrooms. if the ai
engine forces the bots to learn by experimenting, there might be some
bots without this knowledge, and they might starve to death even with
mushrooms in their pockets. but other bots might tell them about it,
or they might discover it themselves.

an action memory should not be confused with the rumor engine's
rumors, because actions themselves are universal while rumors are
always local (in the context of these engines). rumors talk about
persons, action memories talk about actions and their outcomes. no
weights have been assigned to action memories, because actions
themselves are neither good nor evil.


the merging of action memories

an example will clarify this: creature C_a kills a rat. this is the
second time he has killed a rat. last time he got meat and gold from
it, but this time he gets meat and an item (a club). what can C_a
conclude from these two kills? one thing he could do is average his
experiences: after averaging these two cases he would know that
killing a rat gives you meat with 100% probability, gold with 50%
probability and items with 50% probability.

everybody should know how to maintain a running average by remembering
the current average ca and the number of averaged values n, but here
it is again:

new_average = ca*n/(n+1) + ne*1/(n+1)

here ne is the new value to be added to the average.


the outcome of an action

what kind of values should we use to represent the outcome of an
action? if we used 1.0 for getting an item and 0.0 for not getting an
item, our average would be the probability of getting an item, based
on our samples. but here's an example from difficult times: creature
C_a knows that he can get nutrition by eating an apple and by eating a
grilled horse, and both work with 100% probability. he has enough
energy to eat only one of them. which one should he eat? based on the
probability alone he cannot tell them apart. C_a should perhaps
remember the average of the absolute rewards he gets by eating
something, to make future selections a little easier, and this is what
the action memory engine does. one could extend the engine to remember
the average amount of items, gold coins, etc., if that information was
considered useful.
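
below is a minimal sketch in c of the merging formula above, for
illustration only. the names (action_memory, action_memory_merge) and
the reward fields are hypothetical, picked for this example; they are
not the project's actual data structures.

/* a small sketch of merging action memories with
   new_average = ca*n/(n+1) + ne*1/(n+1). names are hypothetical. */

#include <stdio.h>

/* running averages of the outcomes of one action type ("kill a rat",
   "eat a mushroom", ...). n is the number of merged experiences. */
struct action_memory {
  int   action_id;   /* which action this memory describes */
  int   n;           /* number of experiences averaged so far */
  float nutrition;   /* average nutrition gained */
  float gold;        /* average gold gained */
  float items;       /* average number of items gained */
};

/* merge one new experience (ne values) into the running averages */
static void action_memory_merge(struct action_memory *m,
                                float nutrition, float gold, float items)
{
  float w = 1.0f / (float)(m->n + 1);

  m->nutrition = m->nutrition * (float)m->n * w + nutrition * w;
  m->gold      = m->gold      * (float)m->n * w + gold      * w;
  m->items     = m->items     * (float)m->n * w + items     * w;
  m->n++;
}

int main(void)
{
  struct action_memory kill_rat = { 0, 0, 0.0f, 0.0f, 0.0f };

  /* first kill: meat (nutrition 5), one gold coin, no items */
  action_memory_merge(&kill_rat, 5.0f, 1.0f, 0.0f);
  /* second kill: meat again, no gold, one item (a club) */
  action_memory_merge(&kill_rat, 5.0f, 0.0f, 1.0f);

  /* prints 5.0 0.5 0.5: meat every time, gold and items half the time */
  printf("%.1f %.1f %.1f\n", kill_rat.nutrition, kill_rat.gold, kill_rat.items);

  return 0;
}

note that with 1.0/0.0 style values the averaged fields read as
probabilities, while with absolute rewards (like the nutrition field
here) they read as average gains, which is the behaviour described in
"the outcome of an action".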