the action memory engine in project lecherous gnomes

ville helin (vhelin#iki.fi)

versions:

26-may-2003
 * first release

28-may-2003
 * small additions here and there


the ai bots need a way to accumulate and store the actions they have
performed or heard about. the action memory engine does this, and is
itself a very small part of the ai, as the overview will show.


system overview

the action memory engine maintains a list of experiences. each time a
new action is executed (or learned from other bots) the system merges
it with the old ones. the action memory engine differs from the rumor
engine in that here we handle information about actions and their
results, not about actions and their participants.


an action memory

here's an example: creature C_a eats a mushroom and gets nutrition
from it. afterwards creature C_a remembers that it got some nutrition
by eating a mushroom of that kind. this information can be valuable in
hard times when there is nothing to eat but these mushrooms. if the ai
engine forces the bots to learn by experimenting, there might be some
bots without this knowledge, and they might starve to death even with
mushrooms in their pockets. but other bots might tell them about it,
or they might discover it themselves.

an action memory should not be confused with the rumor engine's
rumors, because actions themselves are universal while rumors are
always local (in the context of these engines). rumors talk about
persons, action memories talk about actions and their outcomes. no
weights have been assigned to action memories, because actions
themselves are neither good nor evil.


the merging of action memories

an example will clarify this: creature C_a kills a rat. this is the
second time he has killed a rat. last time he got meat and gold from
it, but this time he gets meat and an item (a club). what can C_a
conclude from these two kills? one thing he could do is average his
experiences: after averaging these two cases he would know that
killing a rat gives you meat with 100% probability, gold with 50%
probability and items with 50% probability.

everybody should know how to maintain a running average by remembering
the current average ca and the number of averaged values n, but here
it is again:

new_average = ca*n/(n+1) + ne*1/(n+1)

here ne is the new value to be added to the average.


the outcome of an action

what kind of values should we use to represent the outcome of an
action? if we used 1.0 for getting an item and 0.0 for not getting an
item, our average would be the probability of getting an item, based
on our samples. but here's an example from difficult times: creature
C_a knows that he can get nutrition by eating an apple and by eating a
grilled horse, and both work with 100% probability. he has enough
energy to eat only one of them. which one should he eat? based on the
probability alone he cannot tell them apart. C_a should perhaps
remember the average of the absolute rewards he gets by eating
something, to make future selections a little easier, and this is what
the action memory engine does. one could extend the engine to remember
the average amount of items, gold coins, etc., if that information was
considered useful.
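
below is a minimal sketch in c of the merging formula above, for
illustration only. the names (action_memory, action_memory_merge) and
the reward fields are hypothetical, picked for this example; they are
not the project's actual data structures.

/* a small sketch of merging action memories with
   new_average = ca*n/(n+1) + ne*1/(n+1). names are hypothetical. */

#include <stdio.h>

/* running averages of the outcomes of one action type ("kill a rat",
   "eat a mushroom", ...). n is the number of merged experiences. */
struct action_memory {
  int   action_id;   /* which action this memory describes */
  int   n;           /* number of experiences averaged so far */
  float nutrition;   /* average nutrition gained */
  float gold;        /* average gold gained */
  float items;       /* average number of items gained */
};

/* merge one new experience (ne values) into the running averages */
static void action_memory_merge(struct action_memory *m,
                                float nutrition, float gold, float items)
{
  float w = 1.0f / (float)(m->n + 1);

  m->nutrition = m->nutrition * (float)m->n * w + nutrition * w;
  m->gold      = m->gold      * (float)m->n * w + gold      * w;
  m->items     = m->items     * (float)m->n * w + items     * w;
  m->n++;
}

int main(void)
{
  struct action_memory kill_rat = { 0, 0, 0.0f, 0.0f, 0.0f };

  /* first kill: meat (nutrition 5), one gold coin, no items */
  action_memory_merge(&kill_rat, 5.0f, 1.0f, 0.0f);
  /* second kill: meat again, no gold, one item (a club) */
  action_memory_merge(&kill_rat, 5.0f, 0.0f, 1.0f);

  /* prints 5.0 0.5 0.5: meat every time, gold and items half the time */
  printf("%.1f %.1f %.1f\n", kill_rat.nutrition, kill_rat.gold, kill_rat.items);

  return 0;
}

note that with 1.0/0.0 style values the averaged fields read as
probabilities, while with absolute rewards (like the nutrition field
here) they read as average gains, which is the behaviour described in
"the outcome of an action".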