From: Tim S. <ts...@ai...> - 2012-01-13 05:20:13
|
Long post coming up here as I catch up on the last week's worth of discussions, so I apologize for the length as I ramble on answering some of the things discussed.

> 2) "learning" in terms of let it do some exercise, tweak parameters,
> redo same/similar exercise, if result better it was a good change
> otherwise not.
>
> The hard part here is, it's "hard" to "redo" same, because e.g. dice
> rolls will be different, opponent might act differently; not to mention
> the "skew" of it would learn to exploit "other Colossus AIs" weaknesses.
>
> Argh, the glibber between my ears is producing too many ideas again!

Actually this isn't going to be as bad as you think. In theory, other than the dice rolls, AI vs. AI should make the exact same moves, or very close. This is the method I favor, and I'll outline further down what I mean.

> Other than the "non-random dice" (which is a fixed sequence, and might
> be unfair nevertheless, depending on who draws how many numbers),
> another approach:
>
> "hits as per expectation":
>
> Let's say:
> 6 rolls for a 4-6: 3 hits.
> 9 rolls for a 6: 1 + 1/2, hm, 1.5.
> 5 rolls for a 3 => 5 x 2/3 = 10/3 = 3.33.
>
> Now, three options: round up, round down, "levelled". Levelled: round
> down or up, whichever is closer, but keep that in mind and put it into
> the calculation for the next roll.

I agree with this too. Limit the randomness as much as possible to get reasonably deterministic dice rolls. What I would favor doing is: for every 6 dice rolled you get EXACTLY one of each number. Then for the fractional leftovers you roll randomly, with no duplicates. So if, for example, a unit rolls 8 dice, it would get 1, 2, 3, 4, 5, 6 and then 2 numbers rolled randomly; but if those 2 rolls were 5, 5, it would re-roll that second 5 so that no number could appear three times. That seems to be as close to expected value as we can get.

> What about reinforcement learning?
>
> I was thinking (but can't find an easy-to-use library) to use the
> creatures/terrain/whatever as the 'input vector', and to try to get a good
> output by:
>
> 1) creating an 'attack' objective for each of the Legion's creatures
> 2) creating a 'preserve' objective for each of the Legion's creatures
> 3) creating a 'destroy' objective for each of the opposite Legion's creatures
>
> and the 'output vector' would be the priorities for the objectives (and
> because the evaluation of each objective has a lot of parameters, we might
> throw a few of those in here as well).
>
> Then for a battle with no other objectives (such as preserving a Titan), a
> simplified final result could be the reward in the reinforcement system (using
> for instance a formula like "my points left minus the other points left").
>
> Then replaying the same battle over and over with randomized parameters could
> be used as a training set, with both 'good' and 'bad' results. Then we'd see if
> the system can suggest a 'good' set of parameters for the battle.

Yes, I agree 100% with what Romain proposes here. This is the method I would use to train the AI. Romain has already done a fantastic job with the battle simulator, where he showed the results of his AI vs. the others with identical units. That, I believe, can be used to do what we want.
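For illustration, a minimal sketch in Java (the language Colossus is written in) of the stratified dice scheme described above. The class name and method are hypothetical, not part of the Colossus code base: every full group of six dice yields each face exactly once, and the leftover dice are rolled randomly but forced to be distinct faces.

#####
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical helper: near-deterministic dice as proposed above.
public final class StratifiedDice
{
    private final Random random = new Random();

    public List<Integer> roll(int numDice)
    {
        List<Integer> rolls = new ArrayList<Integer>();
        // Full groups of six dice: exactly one of each face.
        for (int group = 0; group < numDice / 6; group++)
        {
            for (int face = 1; face <= 6; face++)
            {
                rolls.add(face);
            }
        }
        // Leftover dice: random, but no duplicate faces among the leftovers,
        // so no face can show up more often than (full groups + 1) times.
        Set<Integer> used = new HashSet<Integer>();
        for (int i = 0; i < numDice % 6; i++)
        {
            int face;
            do
            {
                face = random.nextInt(6) + 1;
            }
            while (!used.add(face));
            rolls.add(face);
        }
        Collections.shuffle(rolls, random); // hide the pattern from the player
        return rolls;
    }
}
#####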
Here's what I have been thinking about over the last week while reading these emails and reviewing the lecture notes. There are a lot of different ways to do machine learning, and the professor stressed MANY times that choosing the right one was probably the most important thing you could do in order to avoid wasting a lot of time; one of his recommendations was to start with some simple up-front examples that tell you whether you are on the right track. That way, before you invest more time, you know whether you have a reasonable chance to succeed.

I mentioned a couple of emails back having a lot of variables for the neural network concept. I still think that's the way to go. What I would want to try to do is get an accurate value for every unit in the battle. This value would be calculated each time it was your action (so potentially 14 times in all, twice per turn). Here's how I would initially start with valuing each unit (this would be a function that takes a unit and returns a value).

1) Base value: power * skill. So an Angel would be 6*4=24. No surprises there. However, the base value should be modified downward as the unit takes damage, so an Angel with 2 hits on it is now worth less than 24. How much less? I propose initially to make it scale, with the first hit removing the least amount of value and the last hit removing the most. So for example a Centaur is 3*4=12. Rather than each hit being worth 4 (so that a 1-damage Centaur is worth 2*4=8), I suggest using 2, 4, 6. So a 1-damage Centaur is worth 4+6=10 and a 2-damage Centaur is worth 6. I do it this way so that when a unit is down to its last hits, applying those hits is worth a lot when deciding WHICH unit to strike (i.e. this helps the AI eliminate weak units). During testing we'll let the learning process figure out how much each hit is worth. So the Angel hits would be 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, and so on for other units.

2) Recruit value: whether the unit can recruit anywhere at all (other than the basic 3 tower units). In the actual game it would have to check the caretaker stacks etc. to see what remained to recruit; in the simulation we can set this manually to test scenarios. The recruit value would be equal to the best unit this unit might, in theory, be able to directly recruit in the future, taking into account the other units in the stack at the moment. So if, for example, a stack contained a single Giant, its best recruit is another Giant; if a stack contained 2 Giants, the best recruit would be a Colossus. This is hard to value of course, but again I suggest starting with something like (level in tree * unit value / 2) so that higher-level units are worth more. By level, I mean a Centaur is L1, a Lion L2, a Minotaur L3, a Dragon L4 and a Colossus L5, so a Centaur would be worth 1*12/2=6, a Lion 2*15/2=15, a Minotaur 3*16/2=24, and so on.

3) Recruit here: as in the hex where the battle is taking place. If the answer is yes, double the value obtained in (2).

4) Titan: if the unit is a Titan, multiply its value by 5. Obviously this makes Titans prime targets and prime things to defend. This might need to be increased, of course, if it's not high enough.

Now we may need a few other things in this calculation, but this would be my initial crack at it. This should do what Romain wanted to do above (attack, preserve, destroy) by making units valuable.
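A rough sketch, again in Java, of the valuation function outlined above. All weights are the initial guesses from this mail and would be tuned by the learning pass later; the Unit class and its fields are hypothetical stand-ins for whatever the real creature/legion classes expose, and the per-hit scaling shown is just one increasing scheme whose steps sum to power * skill (it reproduces the 2, 4, 6 Centaur example, though not the exact Angel numbers).

#####
// Hypothetical sketch of the unit-valuation function described above.
public final class UnitValuator
{
    /** Minimal stand-in for one creature in a battle. */
    public static final class Unit
    {
        final int power;
        final int skill;
        final int hits;                    // damage already taken
        final int bestRecruitLevel;        // tree level of best future recruit, 0 if none
        final double bestRecruitBaseValue; // power * skill of that recruit, 0 if none
        final boolean canRecruitHere;      // recruit possible in this battle's terrain
        final boolean titan;

        public Unit(int power, int skill, int hits, int bestRecruitLevel,
            double bestRecruitBaseValue, boolean canRecruitHere, boolean titan)
        {
            this.power = power;
            this.skill = skill;
            this.hits = hits;
            this.bestRecruitLevel = bestRecruitLevel;
            this.bestRecruitBaseValue = bestRecruitBaseValue;
            this.canRecruitHere = canRecruitHere;
            this.titan = titan;
        }
    }

    public static double valueOf(Unit u)
    {
        double value = baseValue(u);

        // (2) Recruit value: tree level * base value of the best recruit / 2.
        double recruit = u.bestRecruitLevel * u.bestRecruitBaseValue / 2.0;
        // (3) Doubled when the recruit could happen in this very hex.
        if (u.canRecruitHere)
        {
            recruit *= 2.0;
        }
        value += recruit;

        // (4) Titans are prime targets and prime things to defend.
        if (u.titan)
        {
            value *= 5.0;
        }
        return value;
    }

    // (1) Base value power * skill, reduced as damage accumulates; the first
    // hit removes the least value and the last hit the most. This scaling
    // makes the h-th hit remove h * step, so a Centaur (12) loses 2, 4, 6.
    private static double baseValue(Unit u)
    {
        int hitsToKill = u.power;          // a creature dies at 'power' hits
        double full = u.power * u.skill;
        double step = 2.0 * full / (hitsToKill * (hitsToKill + 1.0));
        double value = full;
        for (int h = 1; h <= Math.min(u.hits, hitsToKill); h++)
        {
            value -= step * h;
        }
        return Math.max(0.0, value);
    }
}
#####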
So in his example of 2 Ogres and 1 Troll, if we changed the simulation to occur on Hills, for example (where a Minotaur can be recruited by the Ogres), then the AI should try to preserve the Ogres, since they are more valuable.

We'll know how good a job the AI did by doing the following: calculate the value of the units at battle start, then calculate the value of the units at battle end (all damage healed, of course). The closer you are to the initial value, the better you have done, since the AI should attempt to save the most valuable units and kill the most valuable enemy units. The goal would be to figure out the right values for each unit, which means figuring out the value of each of those 4 parts of the calculation.

The advantage of Romain's simulation is that we can simply tell it where the combat is taking place and run a lot of combats to see whether this is working, and adjust the numbers as needed. Once this part is done, the next step will be improving the movement on the board (though Romain's existing experimental AI may already be good enough if we get proper unit valuations).

Tim
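A small sketch of that scoring idea, reusing the hypothetical UnitValuator from the previous sketch; the class and method names are placeholders, and survivors are assumed to be passed in with their damage healed (hits = 0).

#####
import java.util.List;

// Hypothetical scoring helper for the training loop described above.
public final class BattleScore
{
    /** Rewards keeping your own value and destroying the enemy's. */
    public static double score(List<UnitValuator.Unit> friendlyAtStart,
        List<UnitValuator.Unit> friendlySurvivors,
        List<UnitValuator.Unit> enemyAtStart,
        List<UnitValuator.Unit> enemySurvivors)
    {
        double friendlyLost = total(friendlyAtStart) - total(friendlySurvivors);
        double enemyLost = total(enemyAtStart) - total(enemySurvivors);
        return enemyLost - friendlyLost;
    }

    private static double total(List<UnitValuator.Unit> units)
    {
        double sum = 0.0;
        for (UnitValuator.Unit u : units)
        {
            sum += UnitValuator.valueOf(u);
        }
        return sum;
    }
}
#####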
|
From: Clemens K. <lem...@sa...> - 2012-01-11 20:10:38
|
> > from 'bad'? Doesn't it require the learning process to only include
> > 'good'?
> > Doesn't it preclude learning on-the-fly unless you can somehow tell
> > 'good' from 'bad'?
>
> What's important is that a human did not have to categorise the game
> as a win or a loss.

Exactly. I think we have to distinguish between two principal approaches
(AI here meaning "any artificial intelligence"):

1) Let the AI read lots of data and learn / conclude from that.
Letting it analyze the reports of all battles on the server was something
somebody mentioned. (I don't think this approach is feasible for us.)
Too many cases, all different, and no "is this good or bad".
2) "learning" in terms of let it do some exercise, tweak parameters,
redo same/similar exercise, if result better it was a good change
otherwise not.
The hard part here is, it's "hard" to "redo" same, because e.g. dice
rolls will be different, opponent might act differently; not to mention
the "skew" of it would learn to exploit "other Colossus AIs" weaknesses.
Argh, the glibber between my ears is producing too many ideas again!
Can we involve real people? E.g., provide on the server a feature
"do battle practice". We define few situations which we want to train
AI first. If users login to server (and e.g. no other player free),
they can spend their time with fighting one of the scenarios against
the AI. Since it's "few", we can even "define" our best guess or
experience (let two good human players play it 5 times?)
"what is a good outcome and what a bad one".
The AI plays those scenarios against the users, and it might grow
better on it! Users might also just redo same just to "practice for
themselves".
[However, "let it run on the server to store results easily" is here in
conflict with "the server has only one CPU" -- so, not too many of
that at same time, or lower priority ("nice" value") -- hopefully the
"new AI" would not need "30 secs full CPU" as old ones??
Make it possible to re-run same scenarios on user PC, makes it harder
to transmit back the results so server and "update" changed parameters
from server to user PC regularly...]
(( Hey, I just said "I have ideas", not that they are good or feasible
ones ;-))
One more step to level the "dice always different": Tweaked dice rolls.
Other than the "non-rondom dice" (which is a fixed sequence, and might
be unfair nevertheless, depending on who draws how many numbers),
another approach:
"hits as per expectation":
Let's say:
6 rolls for a 4-6: 3 hits.
9 rolls for a 6: 1 + 1/2, hm, 1.5.
5 rolls for a 3 => 5 x 2/3 = 10/3 = 3.33.
Now, three options: round up, round down, "levelled". Levelled: round down or up, whichever is closer, but keep that in mind and put it into the calculation for the next roll.
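A minimal sketch of the "levelled" option in Java; the class is hypothetical and not part of Colossus. It rounds each roll's expected hits to the nearest integer and carries the rounding error into the next roll, so the three example rolls above come out as 3, 2 and 3 hits.

#####
// Hypothetical helper: "hits as per expectation" with levelled rounding.
public final class LevelledHits
{
    private double carry = 0.0; // rounding error owed from earlier rolls

    /**
     * @param dice         number of dice rolled
     * @param strikeNumber face needed to hit (4 means 4, 5 or 6 hit)
     * @return hits to award for this roll
     */
    public int hits(int dice, int strikeNumber)
    {
        double hitChance = (7 - strikeNumber) / 6.0;
        double expected = dice * hitChance + carry;
        int awarded = (int) Math.round(expected);
        awarded = Math.max(0, Math.min(dice, awarded)); // stay in legal range
        carry = expected - awarded;                     // remember the remainder
        return awarded;
    }
}
#####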
I think users might appreciate this approach, to be able to "improve
their own strategy and not depend on dice luck"?
BR,
Clemens
-------- Original Message --------
> Date: Wed, 11 Jan 2012 21:50:43 +1030
> From: Barrie Treloar <bae...@gm...>
> To: Romain Dolbeau <ro...@do...>
> CC: col...@li...
> Subject: Re: [Colossus-developers] (non-colossus specific) Stanford online course (free) for Artificial Intelligence and Machine Learning.
> On Wed, Jan 11, 2012 at 7:02 PM, Romain Dolbeau <ro...@do...>
> wrote:
> > On 01/10/12 21:38, Barrie Treloar wrote:
> >> Reinforcement learning is a type of unsupervised learning.
> >> You don't manually evaluate the battle, but as you point out use some
> >> formula to determine the results.
> >
> > What I don't understand is: if you don't do that, how does the AI learn
> 'good'
> > from 'bad'? Doesn't it requires the learning process to only include
> 'good'?
> > Doesn't it preclude for learning on-the-fly unless you can somehow tell
> 'good'
> > from 'bad'?
> >
> > I should take the course but I don't have the time :-(
>
> I have all the videos and programming exercises :)
>
> An example way to do the reinforcement would be at the end of the
> game, if the AI wins it gets +1, and if it loses it gets -1. (Or +6
> for 6 players or something like that)
>
> But you ideally do not want to wait that long to train your AI.
>
> So we need a way to measure the "goodness" of a single battle.
> I think someone suggested points accrued vs creatures lost.
> Perhaps even getting an angel?
>
> What's important is that a human did not have to categorise the game
> as a win or lose.
>
|
|
From: Barrie T. <bae...@gm...> - 2012-01-11 11:20:55
|
On Wed, Jan 11, 2012 at 7:02 PM, Romain Dolbeau <ro...@do...> wrote:
> On 01/10/12 21:38, Barrie Treloar wrote:
>> Reinforcement learning is a type of unsupervised learning.
>> You don't manually evaluate the battle, but as you point out use some
>> formula to determine the results.
>
> What I don't understand is: if you don't do that, how does the AI learn 'good'
> from 'bad'? Doesn't it require the learning process to only include 'good'?
> Doesn't it preclude learning on-the-fly unless you can somehow tell 'good'
> from 'bad'?
>
> I should take the course but I don't have the time :-(

I have all the videos and programming exercises :)

An example way to do the reinforcement would be that at the end of the game, if the AI wins it gets +1, and if it loses it gets -1 (or +6 for 6 players, or something like that).

But you ideally do not want to wait that long to train your AI.

So we need a way to measure the "goodness" of a single battle. I think someone suggested points accrued vs. creatures lost. Perhaps even getting an angel?

What's important is that a human did not have to categorise the game as a win or a loss.
|
From: Romain D. <ro...@do...> - 2012-01-11 08:33:04
|
On 01/10/12 21:38, Barrie Treloar wrote:
> Reinforcement learning is a type of unsupervised learning.
> You don't manually evaluate the battle, but as you point out use some
> formula to determine the results.

What I don't understand is: if you don't do that, how does the AI learn 'good' from 'bad'? Doesn't it require the learning process to only include 'good'? Doesn't it preclude learning on-the-fly unless you can somehow tell 'good' from 'bad'?

I should take the course but I don't have the time :-(

Cordially,

--
Romain Dolbeau <ro...@do...>
|
From: Romain D. <ro...@do...> - 2012-01-11 08:30:42
|
On 01/10/12 21:23, Clemens Katzer wrote:
> First, I cannot map the word "reinforcement" into this context.

Yeah, on the Colossus mailing list it's definitely an ambiguous term, sorry :-) Barrie explained it well: as in 'reinforcement learning'.

> In this approach, the "other side" must remain constant,
> i.e. can not use for example the same changing AI; but if doing so,
> isn't there the risk that the new AI primarily learns to exploit
> the other AI's weaknesses?

It doesn't have to remain constant, I think, but yes, it will learn to exploit the mistakes that the other AI makes. But once the code 'works', nothing prevents you from injecting human-vs-AI battle results into the training set, I believe.

Cordially,

--
Romain Dolbeau <ro...@do...>
|
From: Barrie T. <bae...@gm...> - 2012-01-10 20:38:59
|
On Wed, Jan 11, 2012 at 6:53 AM, Clemens Katzer <lem...@sa...> wrote:
>
>> > Supervised and Unsupervised.
>> > In supervised you know the correct answer,
>> > in unsupervised you don't.
>>
>> What about reinforcement learning?
>
> First, I cannot map the word "reinforcement" into this context.
> For me, reinforcement translates to: "some army attacks, and if they
> are too weak, they call for reinforcement (some higher-level general
> dispatches additional troops to make them stronger)".
>
> Or: "if you tease me, I call my big brother as reinforcement" :)

It's reinforcement as in carrot/stick. You do well and the mouse gets a piece of cheese; you do badly and you get an electric shock.

> Regarding your approach: I do not see what exactly is the difference
> between this and "unsupervised" - unsupervised learning I would have
> imagined is of the form: one defines a "measurement" and the overall
> goal is that the "artificial intelligence" shall maximize the achieved
> value.

Reinforcement learning is a type of unsupervised learning. You don't manually evaluate the battle, but, as you point out, use some formula to determine the results.

> In this approach, the "other side" must remain constant,
> i.e. can not use for example the same changing AI; but if doing so,
> isn't there the risk that the new AI primarily learns to exploit
> the other AI's weaknesses?
>
> But then, perhaps I got it all wrong :)

You are trying to improve your AI, so it doesn't matter what the opposition does, and it can change. As long as your AI is improving based on the evaluation criteria then it's working; and yes, if you don't have enough different types of input data you can skew your AI to maximise against a strategy that doesn't exist.
|
From: Clemens K. <lem...@sa...> - 2012-01-10 20:23:42
|
> > Supervised and Unsupervised.
> > In supervised you know the correct answer,
> > in unsupervised you don't.
>
> What about reinforcement learning?

First, I cannot map the word "reinforcement" into this context. For me, reinforcement translates to: "some army attacks, and if they are too weak, they call for reinforcement (some higher-level general dispatches additional troops to make them stronger)".

Or: "if you tease me, I call my big brother as reinforcement" :)

Regarding your approach: I do not see what exactly the difference is between this and "unsupervised". Unsupervised learning, I would have imagined, is of the form: one defines a "measurement" and the overall goal is that the "artificial intelligence" shall maximize the achieved value.

In this approach, the "other side" must remain constant, i.e. it cannot use, for example, the same changing AI; but if doing so, isn't there the risk that the new AI primarily learns to exploit the other AI's weaknesses?

But then, perhaps I got it all wrong :)

Just my 5 €-cents (as long as we still have that currency ;-)

BR,
Clemens

> I was thinking (but can't find an easy-to-use library) to use the
> creatures/terrain/whatever as the 'input vector', and to try to get a good
> output by:
>
> 1) creating an 'attack' objective for each of the Legion's creatures
> 2) creating a 'preserve' objective for each of the Legion's creatures
> 3) creating a 'destroy' objective for each of the opposite Legion's creatures
>
> and the 'output vector' would be the priorities for the objectives (and
> because the evaluation of each objective has a lot of parameters, we might
> throw a few of those in here as well).
>
> Then for a battle with no other objectives (such as preserving a Titan), a
> simplified final result could be the reward in the reinforcement system (using
> for instance a formula like "my points left minus the other points left").
>
> Then replaying the same battle over and over with randomized parameters could
> be used as a training set, with both 'good' and 'bad' results. Then we'd see if
> the system can suggest a 'good' set of parameters for the battle.
>
> Could that work?
>
> Cordially,
>
> --
> Romain Dolbeau <rom...@ca...>
|
From: Romain D. <rom...@ca...> - 2012-01-10 15:48:00
|
On 01/10/12 05:36, Barrie Treloar wrote:
> The other thing I was thinking was that there are two types of AI learning:
> Supervised and Unsupervised.
> In supervised you know the correct answer,
> in unsupervised you don't.

What about reinforcement learning?

I was thinking (but can't find an easy-to-use library) to use the creatures/terrain/whatever as the 'input vector', and to try to get a good output by:

1) creating an 'attack' objective for each of the Legion's creatures
2) creating a 'preserve' objective for each of the Legion's creatures
3) creating a 'destroy' objective for each of the opposite Legion's creatures

and the 'output vector' would be the priorities for the objectives (and because the evaluation of each objective has a lot of parameters, we might throw a few of those in here as well).

Then for a battle with no other objectives (such as preserving a Titan), a simplified final result could be the reward in the reinforcement system (using for instance a formula like "my points left minus the other points left").

Then replaying the same battle over and over with randomized parameters could be used as a training set, with both 'good' and 'bad' results. Then we'd see if the system can suggest a 'good' set of parameters for the battle.

Could that work?

Cordially,

--
Romain Dolbeau <rom...@ca...>
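A minimal sketch of how these objectives and the reward might be wired up, assuming hypothetical Objective and priority types (ExperimentalAI has its own objective code, which this does not claim to reproduce): one attack/preserve objective per own creature, one destroy objective per enemy creature, a priority per objective as the learner's output, and the reward computed as "my points left minus the other points left".

#####
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the objective/reward setup proposed above.
public final class ObjectiveSketch
{
    enum Kind { ATTACK, PRESERVE, DESTROY }

    static final class Objective
    {
        final Kind kind;
        final String creatureName;
        double priority;            // the learner's output vector entry

        Objective(Kind kind, String creatureName)
        {
            this.kind = kind;
            this.creatureName = creatureName;
        }
    }

    /** One attack + one preserve objective per own creature,
     *  one destroy objective per enemy creature. */
    static List<Objective> buildObjectives(List<String> ownCreatures,
        List<String> enemyCreatures)
    {
        List<Objective> objectives = new ArrayList<Objective>();
        for (String name : ownCreatures)
        {
            objectives.add(new Objective(Kind.ATTACK, name));
            objectives.add(new Objective(Kind.PRESERVE, name));
        }
        for (String name : enemyCreatures)
        {
            objectives.add(new Objective(Kind.DESTROY, name));
        }
        return objectives;
    }

    /** Reward: "my points left minus the other points left". */
    static int reward(int myPointsLeft, int otherPointsLeft)
    {
        return myPointsLeft - otherPointsLeft;
    }
}
#####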
|
From: Romain D. <rom...@ca...> - 2012-01-10 12:20:30
|
On 01/10/12 11:24, Kim Milvang-Jensen wrote:
> As far as I know (it has been a while since I checked), SimpleAI,
> RationalAI and MilvangAI actually run the exact same battle code.

That's also what I remembered: only ExperimentalAI changed the battle code (and was SimpleAI in disguise for the strategic part), while RationalAI and MilvangAI changed the strategy, but were SimpleAI in disguise on the Battlelands.

Cordially,

--
Romain Dolbeau <rom...@ca...>
|
From: Kim Milvang-J. <ki...@mi...> - 2012-01-10 11:18:41
|
As far as I know (it has been a while since I checked), SimpleAI, RationalAI and MilvangAI actually run the exact same battle code. Maybe I should consider looking at this again :)

On 10/01/2012 09:46 "Romain Dolbeau" <ro...@do...> wrote:
> On 01/09/12 18:44, Clemens Katzer wrote:
>
> > This gives a good start for the "thing around" for those who rather
> > care about "AI stuff" than "how to create or run a battle N times".
>
> I had 100 games run overnight for each possible AI vs. AI combination in
> ExperimentalAI, SimpleAI, RationalAI and MilvangAI. Here it comes, straight
> from SQL:
>
> 2 * Troll + 3 * Ogre vs. same in Plains:
> +----------+----------+------+----------+----------------+----------------+
> | attacker | defender | draw | timeloss | attAI          | defAI          |
> +----------+----------+------+----------+----------------+----------------+
> |       75 |       15 |   10 |        0 | SimpleAI       | SimpleAI       |
> |       39 |       40 |   19 |        2 | SimpleAI       | ExperimentalAI |
> |       72 |       13 |   14 |        1 | SimpleAI       | RationalAI     |
> |       67 |       20 |   13 |        0 | SimpleAI       | MilvangAI      |
> |       57 |       19 |   24 |        0 | ExperimentalAI | SimpleAI       |
> |       50 |       33 |   15 |        2 | ExperimentalAI | ExperimentalAI |
> |       67 |       12 |   21 |        0 | ExperimentalAI | RationalAI     |
> |       65 |       19 |   16 |        0 | ExperimentalAI | MilvangAI      |
> |       75 |       14 |   11 |        0 | RationalAI     | SimpleAI       |
> |       32 |       48 |   17 |        3 | RationalAI     | ExperimentalAI |
> |       72 |       10 |   18 |        0 | RationalAI     | RationalAI     |
> |       63 |       15 |   22 |        0 | RationalAI     | MilvangAI      |
> |       71 |       18 |   11 |        0 | MilvangAI      | SimpleAI       |
> |       43 |       40 |   16 |        1 | MilvangAI      | ExperimentalAI |
> |       69 |       15 |   16 |        0 | MilvangAI      | RationalAI     |
> |       70 |       19 |   11 |        0 | MilvangAI      | MilvangAI      |
> +----------+----------+------+----------+----------------+----------------+
>
> As far as I can tell, SimpleAI/RationalAI/MilvangAI look similar, and they are
> all marginally better at attacking each other than ExperimentalAI is. On the
> other hand, ExperimentalAI has a better defense.
>
> I'm not sure 100 battles is anywhere close to being high enough to be
> significant.
>
> > I think this eval code of yours should be checked in, either in a branch
> > or ideally into trunk (can it be there, i.e. is it inactive
> > as long as not explicitly forced to "do something"?)
>
> It's really ugly, and creates a dependence on java.sql.* ...
>
> Cordially,
|
From: Clemens K. <lem...@sa...> - 2012-01-10 08:57:48
|
> It's really ugly, and creates a dependence on java.sql.* ...

OK, that's not so nice. Let's see.

BR,
Clemens

-------- Original Message --------
> Date: Tue, 10 Jan 2012 09:46:39 +0100
> From: Romain Dolbeau <ro...@do...>
> To: col...@li...
> Subject: Re: [Colossus-developers] Evaluating a battle
>
> On 01/09/12 18:44, Clemens Katzer wrote:
>
> > This gives a good start for the "thing around" for those who rather
> > care about "AI stuff" than "how to create or run a battle N times".
>
> I had 100 games run overnight for each possible AI vs. AI combination in
> ExperimentalAI, SimpleAI, RationalAI and MilvangAI. Here it comes, straight
> from SQL:
>
> 2 * Troll + 3 * Ogre vs. same in Plains:
> +----------+----------+------+----------+----------------+----------------+
> | attacker | defender | draw | timeloss | attAI          | defAI          |
> +----------+----------+------+----------+----------------+----------------+
> |       75 |       15 |   10 |        0 | SimpleAI       | SimpleAI       |
> |       39 |       40 |   19 |        2 | SimpleAI       | ExperimentalAI |
> |       72 |       13 |   14 |        1 | SimpleAI       | RationalAI     |
> |       67 |       20 |   13 |        0 | SimpleAI       | MilvangAI      |
> |       57 |       19 |   24 |        0 | ExperimentalAI | SimpleAI       |
> |       50 |       33 |   15 |        2 | ExperimentalAI | ExperimentalAI |
> |       67 |       12 |   21 |        0 | ExperimentalAI | RationalAI     |
> |       65 |       19 |   16 |        0 | ExperimentalAI | MilvangAI      |
> |       75 |       14 |   11 |        0 | RationalAI     | SimpleAI       |
> |       32 |       48 |   17 |        3 | RationalAI     | ExperimentalAI |
> |       72 |       10 |   18 |        0 | RationalAI     | RationalAI     |
> |       63 |       15 |   22 |        0 | RationalAI     | MilvangAI      |
> |       71 |       18 |   11 |        0 | MilvangAI      | SimpleAI       |
> |       43 |       40 |   16 |        1 | MilvangAI      | ExperimentalAI |
> |       69 |       15 |   16 |        0 | MilvangAI      | RationalAI     |
> |       70 |       19 |   11 |        0 | MilvangAI      | MilvangAI      |
> +----------+----------+------+----------+----------------+----------------+
>
> As far as I can tell, SimpleAI/RationalAI/MilvangAI look similar, and they are
> all marginally better at attacking each other than ExperimentalAI is. On the
> other hand, ExperimentalAI has a better defense.
>
> I'm not sure 100 battles is anywhere close to being high enough to be
> significant.
>
> > I think this eval code of yours should be checked in, either in a branch
> > or ideally into trunk (can it be there, i.e. is it inactive
> > as long as not explicitly forced to "do something"?)
>
> It's really ugly, and creates a dependence on java.sql.* ...
>
> Cordially,
>
> --
> Romain Dolbeau <ro...@do...>
|
From: Romain D. <ro...@do...> - 2012-01-10 08:46:56
|
On 01/09/12 18:44, Clemens Katzer wrote:
> This gives a good start for the "thing around" for those who rather
> care about "AI stuff" than "how to create or run a battle N times".

I had 100 games run overnight for each possible AI vs. AI combination in ExperimentalAI, SimpleAI, RationalAI and MilvangAI. Here it comes, straight from SQL:

2 * Troll + 3 * Ogre vs. same in Plains:
+----------+----------+------+----------+----------------+----------------+
| attacker | defender | draw | timeloss | attAI          | defAI          |
+----------+----------+------+----------+----------------+----------------+
|       75 |       15 |   10 |        0 | SimpleAI       | SimpleAI       |
|       39 |       40 |   19 |        2 | SimpleAI       | ExperimentalAI |
|       72 |       13 |   14 |        1 | SimpleAI       | RationalAI     |
|       67 |       20 |   13 |        0 | SimpleAI       | MilvangAI      |
|       57 |       19 |   24 |        0 | ExperimentalAI | SimpleAI       |
|       50 |       33 |   15 |        2 | ExperimentalAI | ExperimentalAI |
|       67 |       12 |   21 |        0 | ExperimentalAI | RationalAI     |
|       65 |       19 |   16 |        0 | ExperimentalAI | MilvangAI      |
|       75 |       14 |   11 |        0 | RationalAI     | SimpleAI       |
|       32 |       48 |   17 |        3 | RationalAI     | ExperimentalAI |
|       72 |       10 |   18 |        0 | RationalAI     | RationalAI     |
|       63 |       15 |   22 |        0 | RationalAI     | MilvangAI      |
|       71 |       18 |   11 |        0 | MilvangAI      | SimpleAI       |
|       43 |       40 |   16 |        1 | MilvangAI      | ExperimentalAI |
|       69 |       15 |   16 |        0 | MilvangAI      | RationalAI     |
|       70 |       19 |   11 |        0 | MilvangAI      | MilvangAI      |
+----------+----------+------+----------+----------------+----------------+

As far as I can tell, SimpleAI/RationalAI/MilvangAI look similar, and they are all marginally better at attacking each other than ExperimentalAI is. On the other hand, ExperimentalAI has a better defense.

I'm not sure 100 battles is anywhere close to being high enough to be significant.

> I think this eval code of yours should be checked in, either in a branch
> or ideally into trunk (can it be there, i.e. is it inactive
> as long as not explicitly forced to "do something"?)

It's really ugly, and creates a dependence on java.sql.* ...

Cordially,

--
Romain Dolbeau <ro...@do...>
|
From: Barrie T. <bae...@gm...> - 2012-01-10 04:36:58
|
The other thing I was thinking was that there are two types of AI learning: supervised and unsupervised. In supervised learning you know the correct answer; in unsupervised you don't.

If we assume that human players are on average better than the AI players, then we can use real game data to train the AI with moves that humans have made. That would help bootstrap the learning process instead of having an untrained neural net fighting itself. Previously I had assumed unsupervised learning.

This is where Clemens's Colossus server would come in handy for harvesting data. Which would make fixing up the save files more valuable :)
|
From: Clemens K. <lem...@sa...> - 2012-01-09 17:44:15
|
Thanks a lot, Romain! This gives a good start for the "thing around" for those who rather care about "AI stuff" than "how to create or run a battle N times".

I think this eval code of yours should be checked in, either in a branch or ideally into trunk (can it be there, i.e. is it inactive as long as not explicitly forced to "do something"?)

Well, I'll take a look.

BR,
Clemens

-------- Original Message --------
> Date: Mon, 09 Jan 2012 17:56:39 +0100
> From: Romain Dolbeau <ro...@do...>
> To: col...@li...
> Subject: [Colossus-developers] Evaluating a battle (was: Stanford online course ...)
>
> On 01/08/12 12:09, Romain Dolbeau wrote:
> > I think that if we could have a simple battle (4 or 5 pieces, same on
> > both side, in Plains) that a "TrainedAI" could learn to fight
> > "optimally" against both SimpleAI, ExpAI, other AIs and itself
>
> Just to know where we stand now, I'm running an updated version of my
> very old SQL code. I've created a battle like the above (2 Trolls and 3 Ogres
> vs. the same, in Plains), and I'm running ExpAI vs. SimpleAI and SimpleAI
> vs. ExpAI, and storing the results in my database. I have no idea if one has
> an actual edge over the other...
>
> I'm attaching the needed stuff to run the same.
>
> Procedure:
> 1) put BattleRecordFilteredResults.java, BattleRecord.java, BattleRecordSQL.java in the server package (from an SVN checkout)
> 2) apply the patch 'patch'
> 3) get 'mysql-connector-java-5.0.8-bin.jar' from <http://dev.mysql.com/downloads/connector/j/5.0.html> and put it in libs/
> 4) rebuild the code
> 5) rebuild the tools ('ant tools')
> 6) you can regenerate a custom battle with MakeBattle like this (the 2 XML files I use are in 'stuff' already, you can use them directly):
>
> #####
> java -classpath build/ant/classes net.sf.colossus.tools.MakeBattle --dlist=Troll,Troll,Ogre,Ogre,Ogre --alist=Troll,Troll,Ogre,Ogre,Ogre --aAI=ExperimentalAI --dAI=SimpleAI > Test_ExpAI_vs_SimpleAI.xml
>
> java -classpath build/ant/classes net.sf.colossus.tools.MakeBattle --dlist=Troll,Troll,Ogre,Ogre,Ogre --alist=Troll,Troll,Ogre,Ogre,Ogre --dAI=ExperimentalAI --aAI=SimpleAI > Test_SimpleAI_vs_ExpAI.xml
> #####
>
> 7) have a mysql database running on the local machine; you need to create the database inside SQL and to grant the privileges to the proper user:
>
> ##### (inside mysql)
> create database ColossusBattleRecordV2;
>
> GRANT ALL PRIVILEGES ON ColossusBattleRecordV2.* TO 'colossus'@'localhost' identified by 'colossus';
> #####
>
> (you can change a lot of things inside BattleRecordSQL.java)
>
> 8) to test that everything works (from the just rebuilt Colossus SVN checkout):
>
> ./test.sh 1 /absolute/path/to/Test_ExpAI_vs_SimpleAI.xml
>
> should run the battle and store the results in the database. Then you can
> run both a gazillion times to get statistics on your favorite battle with
> your favorite AIs.
>
> So far, I have 27/7/6 and 15/15/10 (attacker win/defender win/draw, no
> timeloss yet) depending on whether ExpAI (first case) or SimpleAI (second
> case) is the attacker.
>
> Cordially,
>
> P.S. beware, the code is ugly, it's a quick'n'dirty hack to save time on
> creating statistics.
>
> --
> Romain Dolbeau <ro...@do...>
|
From: Romain D. <ro...@do...> - 2012-01-09 16:57:00
|
On 01/08/12 12:09, Romain Dolbeau wrote:
> I think that if we could have a simple battle (4 or 5 pieces, same on
> both side, in Plains) that a "TrainedAI" could learn to fight
> "optimally" against both SimpleAI, ExpAI, other AIs and itself

Just to know where we stand now, I'm running an updated version of my very old SQL code. I've created a battle like the above (2 Trolls and 3 Ogres vs. the same, in Plains), and I'm running ExpAI vs. SimpleAI and SimpleAI vs. ExpAI, and storing the results in my database. I have no idea if one has an actual edge over the other...

I'm attaching the needed stuff to run the same.

Procedure:
1) put BattleRecordFilteredResults.java, BattleRecord.java, BattleRecordSQL.java in the server package (from an SVN checkout)
2) apply the patch 'patch'
3) get 'mysql-connector-java-5.0.8-bin.jar' from <http://dev.mysql.com/downloads/connector/j/5.0.html> and put it in libs/
4) rebuild the code
5) rebuild the tools ('ant tools')
6) you can regenerate a custom battle with MakeBattle like this (the 2 XML files I use are in 'stuff' already, you can use them directly):

#####
java -classpath build/ant/classes net.sf.colossus.tools.MakeBattle --dlist=Troll,Troll,Ogre,Ogre,Ogre --alist=Troll,Troll,Ogre,Ogre,Ogre --aAI=ExperimentalAI --dAI=SimpleAI > Test_ExpAI_vs_SimpleAI.xml

java -classpath build/ant/classes net.sf.colossus.tools.MakeBattle --dlist=Troll,Troll,Ogre,Ogre,Ogre --alist=Troll,Troll,Ogre,Ogre,Ogre --dAI=ExperimentalAI --aAI=SimpleAI > Test_SimpleAI_vs_ExpAI.xml
#####

7) have a mysql database running on the local machine; you need to create the database inside SQL and to grant the privileges to the proper user:

##### (inside mysql)
create database ColossusBattleRecordV2;

GRANT ALL PRIVILEGES ON ColossusBattleRecordV2.* TO 'colossus'@'localhost' identified by 'colossus';
#####

(you can change a lot of things inside BattleRecordSQL.java)

8) to test that everything works (from the just rebuilt Colossus SVN checkout):

./test.sh 1 /absolute/path/to/Test_ExpAI_vs_SimpleAI.xml

should run the battle and store the results in the database. Then you can run both a gazillion times to get statistics on your favorite battle with your favorite AIs.

So far, I have 27/7/6 and 15/15/10 (attacker win/defender win/draw, no timeloss yet) depending on whether ExpAI (first case) or SimpleAI (second case) is the attacker.

Cordially,

P.S. beware, the code is ugly, it's a quick'n'dirty hack to save time on creating statistics.

--
Romain Dolbeau <ro...@do...>
|
From: Barrie T. <bae...@gm...> - 2012-01-08 19:30:39
|
On Sun, Jan 8, 2012 at 9:39 PM, Romain Dolbeau <ro...@do...> wrote:
> I think that if we could have a simple battle (4 or 5 pieces, same on
> both side, in Plains) that a "TrainedAI" could learn to fight
> "optimally" against both SimpleAI, ExpAI, other AIs and itself, then we
> would have a starting point. The idea is no Titan (too complicated at
> first), an indecisive outcome (the Battle can go either way and a
> skilled AI is useful to have), enough pieces to have a fine-grained
> result (winning = good, the more pieces left the better) but not too
> many pieces, no reinforcement/summoning at first. That would teach the
> AI placement, defense, offense, ganging up, and avoiding time loss.

Well, and this is where my lack of AI knowledge comes in. You don't need just one AI algorithm; you may need a few for different scenarios. And you can even have a few for the same scenario and then somehow select which answer you like better.

Also, the reinforcement part can come way later than the end of the combat, e.g. the result of the entire game. So as long as there are enough input parameters and hidden units internally to capture different strategies, and enough games played where enough of those actions listed above are occurring, then the weights should be manipulated to start using them. That's where the freakishness of a neural net comes in: you can't really look inside and see what it is doing and why.

The further away the reinforcement, the more memory (or disk space, or time) is needed to generate an AI learning cycle. And this may become prohibitive in time, memory, etc. to actually generate.
|
From: <ro...@do...> - 2012-01-08 11:33:34
|
Romain Dolbeau <ro...@do...> wrote:
> Of course, having neither the time nor the knowledge I'm not going to
> code it, so in the end it's in the hands of whomever is courageous
> enough to jump into it :-)

Apparently some of the work has been done already, and maybe we could reuse it:

<http://en.wikipedia.org/wiki/Encog>
<http://en.wikipedia.org/wiki/Neuroph>

The second seems to have a list of applications using it on the web site; I didn't find one for the first.

Cordially,

--
Romain Dolbeau <ro...@do...>
|
From: <ro...@do...> - 2012-01-08 11:09:35
|
Clemens Katzer <lem...@sa...> wrote:
> I guess we all agree that Titan is slightly more complex than Backgammon.

I guess we do. I'm not sure about getting the agreement of backgammon players ;-)

> In that sense, some of those objectives Romain came up with would
> be rather high-level objectives, imposed from "outside" to the battle.

I agree. I think those are a completely different kind of AI, with a lot of player wisdom thrown in.

> But basic idea: partition the problem and train the AI for specific
> situations first?

I agree 100%. I think that if we could have a simple battle (4 or 5 pieces, the same on both sides, in Plains) that a "TrainedAI" could learn to fight "optimally" against both SimpleAI, ExpAI, other AIs and itself, then we would have a starting point. The idea is no Titan (too complicated at first), an indecisive outcome (the battle can go either way and a skilled AI is useful to have), enough pieces to have a fine-grained result (winning = good, the more pieces left the better) but not too many pieces, and no reinforcement/summoning at first. That would teach the AI placement, defense, offense, ganging up, and avoiding time loss.

Then step 2 could be to vary the pieces, first in type, then in numbers, and finally with different numbers on each side. Then introduce the hard stuff: terrain, Titans, and adding the ability to do "intelligent" stuff like killing good pieces by having externally introduced objectives.

Of course, having neither the time nor the knowledge I'm not going to code it, so in the end it's in the hands of whomever is courageous enough to jump into it :-)

Cordially & happy new year everyone,

--
Romain Dolbeau <ro...@do...>
|
From: Clemens K. <lem...@sa...> - 2012-01-08 11:03:42
|
Hello all, that's a very interesting discussion.

I guess we all agree that Titan is slightly more complex than Backgammon. If I understood right, they defined a set of parameters. The AI would always play a whole game with a chosen set, and perhaps adjust, or adjust based on results in some DB. And most of all, there is a clear "good or bad". In Titan, even the evaluation "was that a good outcome or not" might already be tricky (the pure value of remaining creatures, for example, does not tell much).

But can we partition the problem? For example, whether Titan(s) are involved or not would give clearly different main objectives. In that sense, some of those objectives Romain came up with would be rather high-level objectives, imposed from "outside" onto the battle. And have the AI learn for different overall scenarios independently. I.e., as long as we can easily categorize a given situation up front (own Titan or not, enemy Titan or not, annihilation or tactical surgery (kill possible recruiters)), we can choose "the AI for just that purpose". Start with some of those, or combinations of those. Define one or a few example battle situations, and evaluate a "pretty good outcome". So in a real game, if the situation is classified as one of the "trained" (covered) categories, use a table with parameters fine-tuned for it; in other cases go on with a best-guess general AI (or the one we have)? And e.g. start with simple things like "plains" only, or plains and bog and trees (just movement impact).

I bet those researchers did not come up with the perfect set of parameters at once. They chose some, ran 300K... hm, not so good... some others, ... better... Or well, one can choose 100, and the 50 unimportant ones will just not play a role, or what?

But basic idea: partition the problem and train the AI for specific situations first?

BR,
Clemens

-------- Original Message --------
> Date: Sun, 8 Jan 2012 09:46:03 +0100
> From: ro...@do...
> To: ts...@ai..., col...@li...
> Subject: Re: [Colossus-developers] (non-colossus specific) Stanford online course (free) for Artificial Intelligence and Machine Learning.
>
> Tim Sowden <ts...@ai...> wrote:
>
> > They are offering the ML Class again this semester. Even if you don't
> > want to do the programming assignments it's worth listening to the video
> > lectures (2-3 hrs a week) just to see the latest techniques in use.
>
> D*mn, that's a lot of time :-(
>
> I'm going to comment in a way that may look like criticism, but it's
> really just a way of telling that we have been thinking about it for a
> long time, and that there's a gazillion corner cases that makes the
> problem really, really difficult.
>
> I want this to happen. I just strongly believe it's incredibly
> difficult, and that one should set one's expectations reasonably low
> before starting to code anything, otherwise disappointment will ensue.
> See also my comments in my answer to Barrie.
>
> > So back to battles in Colossus. I'd imagine we'd need at least 20
> > parameters and maybe 30. For example what I'd initially consider
> > important is:
> >
> > 1) Value of each unit on both sides
>
> Why not go for the unit itself, complete with all the parameters?
>
> > You want to
> > max the value of your units and min the value of the enemy units.
>
> I disagree already :-) If the opposing Titan is there and yours isn't,
> you want to kill the Titan; other units are of secondary importance. In
> the reverse case, saving your Titan is the prime concern.
>
> And altering the 'combat value' in opposition to the 'point value' may
> be extremely difficult; killing a lone Min is often more important than
> killing a pair of easily replaced accompanying creatures such as Lio,
> and there's a million ways the value can be changed.
>
> > 2) The number of facings you present. Where facings is where an enemy
> > unit can be next to you to attack you or range strike you. This would be
> > a total for all your men, lower = better obviously as it means fewer
> > targets for the enemy to attack.
>
> ... and for you to attack. When you are on a timer, not necessarily a
> good idea. If you have a fresh Ser versus 2 Ogr (unlikely :-), you want
> to be sure near turns 6-7 that both Ogr are in contact...
>
> > 3) Enemy facings. Higher = better since you want to maximize theirs so
> > you get more potential attacks.
>
> If you have a Col and the enemies are Ogr, you don't need that many
> facings...
>
> Also, Dune and others: better to have one clean facing than two Dune
> against a native, in particular if you can't get two pieces to the facings
> anyway.
>
> Very few things are clear-cut in this game with no 'if' or 'but' :-(
>
> Finding the proper parameters is indeed IMHO the second big problem,
> after the 'did I win or did I lose' problem.
>
> > 4) Number of potential damage you can do. This is the number of dice you
> > roll on attack X chance to hit for each unit you have that can attack.
> > Basically your expected damage. You want to maximize this.
> > 5) Potential damage you can take. You want to minimize this.
>
> Those I agree with :-)
>
> Cordially,
>
> --
> Romain Dolbeau <ro...@do...>
|
From: <ro...@do...> - 2012-01-08 08:46:15
|
Tim Sowden <ts...@ai...> wrote:
> They are offering the ML Class again this semester. Even if you don't
> want to do the programming assignments it's worth listening to the video
> lectures (2-3 hrs a week) just to see the latest techniques in use.

D*mn, that's a lot of time :-(

I'm going to comment in a way that may look like criticism, but it's really just a way of telling that we have been thinking about it for a long time, and that there's a gazillion corner cases that makes the problem really, really difficult.

I want this to happen. I just strongly believe it's incredibly difficult, and that one should set one's expectations reasonably low before starting to code anything, otherwise disappointment will ensue. See also my comments in my answer to Barrie.

> So back to battles in Colossus. I'd imagine we'd need at least 20
> parameters and maybe 30. For example what I'd initially consider
> important is:
>
> 1) Value of each unit on both sides

Why not go for the unit itself, complete with all the parameters?

> You want to
> max the value of your units and min the value of the enemy units.

I disagree already :-) If the opposing Titan is there and yours isn't, you want to kill the Titan; other units are of secondary importance. In the reverse case, saving your Titan is the prime concern.

And altering the 'combat value' in opposition to the 'point value' may be extremely difficult; killing a lone Min is often more important than killing a pair of easily replaced accompanying creatures such as Lio, and there's a million ways the value can be changed.

> 2) The number of facings you present. Where facings is where an enemy
> unit can be next to you to attack you or range strike you. This would be
> a total for all your men, lower = better obviously as it means fewer
> targets for the enemy to attack.

... and for you to attack. When you are on a timer, not necessarily a good idea. If you have a fresh Ser versus 2 Ogr (unlikely :-), you want to be sure near turns 6-7 that both Ogr are in contact...

> 3) Enemy facings. Higher = better since you want to maximize theirs so
> you get more potential attacks.

If you have a Col and the enemies are Ogr, you don't need that many facings...

Also, Dune and others: better to have one clean facing than two Dune against a native, in particular if you can't get two pieces to the facings anyway.

Very few things are clear-cut in this game with no 'if' or 'but' :-(

Finding the proper parameters is indeed IMHO the second big problem, after the 'did I win or did I lose' problem.

> 4) Number of potential damage you can do. This is the number of dice you
> roll on attack X chance to hit for each unit you have that can attack.
> Basically your expected damage. You want to maximize this.
> 5) Potential damage you can take. You want to minimize this.

Those I agree with :-)

Cordially,

--
Romain Dolbeau <ro...@do...>
|
From: <ro...@do...> - 2012-01-08 08:46:15
|
Barrie Treloar <bae...@gm...> wrote:
> The winner was given a +1 and the loser a -1 to reinforce the learning
> and a new game played.

... and unfortunately that's where the trouble begins. That kind of AI research assumes you have an identifiable outcome: good, bad, and potentially something in between. That's how reinforcement works. We don't have that yet.

Also, most of it is either AI-solving-problem, or our case: AI-besting-human-or-AI. Unfortunately, in the second case, it's mostly about symmetrical problems - chess, backgammon - all of those start with the same pieces in the same (or mirrored) positions, and on homogeneous terrain. We don't have that, either: even two identical legions in Plains have different entries, 4-wide vs. 3-wide.

Say that a Min-Ogr-Ogr-Ogr attacks Cen-Cen. What is a win for the attacker; for the defender? If Att crushes Def with no loss, it's easy, but what if Att loses the Min? That likely becomes a win for the Def... And sometimes the win/lose status will depend on the outside world.

> You would run a single combat situation, using the same neural network
> to tell you what to do.
> After the combat is resolved, if you win the values are probably ok
> and don't need changing.
> If you lose then the values need to be back-propagated to change the
> hidden units' values.

Which illustrates my point; that was the point of the "Objective" code I added a couple of years ago to ExpAI: first, you need to be able to tell whether you won or lost or something in between, before you can start to learn anything from a battle.

Some more comments in my answer to the e-mail from Tim, and a disclaimer that also applies to this one.

Cordially,

--
Romain Dolbeau <ro...@do...>
|
From: Tim S. <ts...@ai...> - 2012-01-08 04:52:45
|
David,

They are offering the ML Class again this semester. Even if you don't want to do the programming assignments, it's worth listening to the video lectures (2-3 hrs a week) just to see the latest techniques in use.

As for the Titan applicability, it's very applicable. What the article doesn't explicitly state is that a neural network approach is NOT one that does a min/max board expansion several moves deep, so the number of combinations of moves isn't important. Instead, what it does is collect data from hundreds of thousands of ALREADY played battles and evaluate the current position based on what it's seen in the past and the weights it has given to the important parameters. Choosing the important parameters is the hard part.

But let me give you an idea of how it works. The backgammon AI is evaluating 20 parameters at each move. My guess is those parameters are things like the number of points you have, the number your opponent has, the number of stones that aren't on points for you/opponent, the value of the doubling cube, how close you are to finishing the game, how close the opponent is, etc. From that it decides how to move its pieces, not from guessing what might be the next 2 or 3 rolls of the dice.

So back to battles in Colossus. I'd imagine we'd need at least 20 parameters and maybe 30. For example, what I'd initially consider important is:

1) Value of each unit on both sides (so potentially 14 parameters, one per unit, where a Titan is worth a lot and other units are worth far less depending on whether they might recruit in the future, etc.). This value may change over time during the battle (i.e. if you have 2 identical units you can recruit with but lose 1, the one left is weighted more; or units are worth less when damaged and about to die, etc.). You want to max the value of your units and min the value of the enemy units.

2) The number of facings you present, where a facing is anywhere an enemy unit can be next to you to attack you or range strike you. This would be a total for all your men; lower = better, obviously, as it means fewer targets for the enemy to attack.

3) Enemy facings. Higher = better, since you want to maximize theirs so you get more potential attacks.

4) The amount of potential damage you can do. This is the number of dice you roll on attack times the chance to hit, for each unit you have that can attack. Basically your expected damage. You want to maximize this.

5) Potential damage you can take. You want to minimize this.

If we started with that, it would give between 6 and 18 total parameters depending on whether the combat is 1-on-1 up to 7-on-7. Over time we may add more or fewer if some turn out not to be important or we find other very important ones. Doing this means you don't have to worry much about battle terrain and other things, since the above parameters include all that via facings and damage potential.

Then you simply play a LOT of battles (300K worth). The neural net comes up with a value for each of the parameters at each phase (attacker, defender) in the battle. These get saved. The idea is that the algorithm will, over time, slightly re-weight each of those parameters as less/more important as it goes over the 300K battles until it finds an optimal weight for each parameter. Those optimal weights form the final weights for the AI, so that in an actual game the AI needs only a few microseconds to decide the best move in a turn and doesn't need to generate tons of potential moves into the future. It's really fascinating stuff.
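A small sketch of what that parameter list could look like as a fixed-length input vector, in Java. The class, the seven-unit cap and the argument layout are assumptions made for illustration, not the real Colossus API; the learner would consume the returned array at each phase.

#####
// Hypothetical feature vector builder for the parameters listed above.
public final class BattleFeatures
{
    public static final int MAX_UNITS = 7;

    public static double[] build(double[] myUnitValues,    // up to 7 entries
        double[] enemyUnitValues,                          // up to 7 entries
        int myFacings, int enemyFacings,
        double myExpectedDamage, double enemyExpectedDamage)
    {
        double[] features = new double[2 * MAX_UNITS + 4];
        copyPadded(myUnitValues, features, 0);             // 1) own unit values
        copyPadded(enemyUnitValues, features, MAX_UNITS);  // 1) enemy unit values
        features[2 * MAX_UNITS]     = myFacings;           // 2) own facings (minimize)
        features[2 * MAX_UNITS + 1] = enemyFacings;        // 3) enemy facings (maximize)
        features[2 * MAX_UNITS + 2] = myExpectedDamage;    // 4) expected damage dealt
        features[2 * MAX_UNITS + 3] = enemyExpectedDamage; // 5) expected damage taken
        return features;
    }

    private static void copyPadded(double[] src, double[] dst, int offset)
    {
        for (int i = 0; i < MAX_UNITS; i++)
        {
            dst[offset + i] = i < src.length ? src[i] : 0.0; // dead/absent = 0
        }
    }
}
#####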
This concept is literally being used all over the place now on the web to recommend stuff for you to buy (Amazon and similar purchases concept it has) in what's called a recommender system. Tim > On Sun, Jan 8, 2012 at 7:29 AM, David Ripton<dr...@ri...> wrote: >> This is an interesting article, but it doesn't go into enough detail for >> me to understand how to actually implement the neural networks. Guess I >> have to take the class. > The ML class does have enough detail. > Its a little thin, but I believe I have enough knowledge to give it a > crack - its just on a long list of things TODO... > >>> If you glance at it you'll see that it uses neural networks (something >>> covered in detail in the machine learning course including coding >>> examples) and that it took on order of 300,000 games (self playing) to >>> improve to world class. More games improved it slightly not the major >>> jump was those first 300,000 games (hence my comment about collecting a >>> LOT of data). >>> >>> In Colossus case, you don't really need to play complete games as much >>> as run combats to improve the combat part. >> How practical do you think this is for Titan? >> >> The article says that a well-played backgammon game is about 50-60 moves >> long, and each move means 21 possible dice rolls and about 20 legal >> moves per roll, so about 20 moves to consider for the current known >> roll, and 420 moves to consider for each subsequent unknown roll. >> >> A Titan battle is no more than 7 turns per player, but each move can >> have a huge branching factor. For example, a defending legion with 7 >> different 4-skill creatures entering the plains has over 160 million >> possible moves. You have to narrow that down dramatically to make the >> AI fast enough, but I'm not sure whether that narrowing invalidates the >> rest of the neural network idea. > Yes but they are all "kind of the same". > And there is nothing that says the games have to include a human. > The backgammon example used the same AI to play both sides. > The winner was given a +1 and the loser a -1 to reinforce the learning > and a new game played. > > The hard part would be to work out the feature set to use as the input vector. > It would have to include stuff like attacking units, defending units, > battle map positions, battle map terrain, etc. > Obviously some stuff is linked like units and positions so you need to > make sure they get encoded together. > > The example used of image detection takes a 80x80 grey scale image and > converts it into a single dimensional array. > > Thinking of the top of my head, we would use something like > [ > ATTACKING_UNITS, > DEFENDING_UNITS, > BATTLE_MAP_TERRAIN > ] > Where each of these is a single dimensional array that represents a > position on the battle map, including the starting position of being > outside the battle map. > > The output of this might be (and this is hazy since most of the > examples I remember were classification rather than regression) [ > FROM, TO ] where these are both battle map terrain positions. I'm not > sure whether including a boolean to indicate move or attack is > worthwhile as it should be obvious if TO already has a unit in it. > > You would run a single combat situation, using the same neural network > to tell you what to do. > After the combat is resolved, if you win the values are probably ok > and don't need changing. > If you lose then the values need to be back-propogated to change the > hidden units values. 
> The hard part here is how to do this back-propagation, since the class
> needs to know the "correct" answer and in this case we don't know that.
>
> Selecting the right type of AI algorithm is also part of the challenge.
> Should we be using some sort of search strategy like A*, reinforcement
> learning, or neural networks? Half the fun might be giving each a crack
> and seeing what can be done.
>
>>> Then once that's improved
>>> you'd improve the playing board part of the AI in a similar manner
>>> (moving, splitting, recruiting etc).
>>
>> I think the only part of the masterboard AI that's hard is predicting
>> the results of battles. Given that, the simple heuristics that we've
>> had for a decade are probably good enough.
> Probably :)
> But there are lots of edge cases, like letting a unit die so you have
> room to summon, or focusing on killing so the opponent can no longer
> recruit on that tree again.
>
> And having a simulation of the battle outcome might also help in the
> surrender negotiations.
|
|
From: Barrie T. <bae...@gm...> - 2012-01-07 23:29:19
|
On Sun, Jan 8, 2012 at 7:29 AM, David Ripton <dr...@ri...> wrote:
> This is an interesting article, but it doesn't go into enough detail for
> me to understand how to actually implement the neural networks. Guess I
> have to take the class.

The ML class does have enough detail.
It's a little thin, but I believe I have enough knowledge to give it a crack - it's just on a long list of things TODO...

>> If you glance at it you'll see that it uses neural networks (something
>> covered in detail in the machine learning course including coding
>> examples) and that it took on the order of 300,000 games (self playing) to
>> improve to world class. More games improved it slightly, but the major
>> jump was those first 300,000 games (hence my comment about collecting a
>> LOT of data).
>>
>> In Colossus' case, you don't really need to play complete games as much
>> as run combats to improve the combat part.
>
> How practical do you think this is for Titan?
>
> The article says that a well-played backgammon game is about 50-60 moves
> long, and each move means 21 possible dice rolls and about 20 legal
> moves per roll, so about 20 moves to consider for the current known
> roll, and 420 moves to consider for each subsequent unknown roll.
>
> A Titan battle is no more than 7 turns per player, but each move can
> have a huge branching factor. For example, a defending legion with 7
> different 4-skill creatures entering the plains has over 160 million
> possible moves. You have to narrow that down dramatically to make the
> AI fast enough, but I'm not sure whether that narrowing invalidates the
> rest of the neural network idea.

Yes, but they are all "kind of the same".
And there is nothing that says the games have to include a human.
The backgammon example used the same AI to play both sides.
The winner was given a +1 and the loser a -1 to reinforce the learning, and a new game played.

The hard part would be to work out the feature set to use as the input vector.
It would have to include stuff like attacking units, defending units, battle map positions, battle map terrain, etc.
Obviously some stuff is linked, like units and positions, so you need to make sure they get encoded together.

The image-detection example takes an 80x80 grey-scale image and converts it into a single-dimensional array.

Thinking off the top of my head, we would use something like
[
ATTACKING_UNITS,
DEFENDING_UNITS,
BATTLE_MAP_TERRAIN
]
where each of these is a single-dimensional array that represents a position on the battle map, including the starting position of being outside the battle map.

The output of this might be (and this is hazy since most of the examples I remember were classification rather than regression) [ FROM, TO ], where these are both battle map positions. I'm not sure whether including a boolean to indicate move or attack is worthwhile, as it should be obvious if TO already has a unit in it.

You would run a single combat situation, using the same neural network to tell you what to do.
After the combat is resolved, if you win the values are probably OK and don't need changing.
If you lose then the values need to be back-propagated to change the hidden units' values.

The hard part here is how to do this back-propagation, since the class needs to know the "correct" answer and in this case we don't know that.

Selecting the right type of AI algorithm is also part of the challenge.
Should we be using some sort of search strategy like A*, reinforcement learning, or neural networks?
Half the fun might be giving each a crack and seeing what can be done.

>> Then once that's improved
>> you'd improve the playing board part of the AI in a similar manner
>> (moving, splitting, recruiting etc).
>
> I think the only part of the masterboard AI that's hard is predicting
> the results of battles. Given that, the simple heuristics that we've
> had for a decade are probably good enough.

Probably :)
But there are lots of edge cases, like letting a unit die so you have room to summon, or focusing on killing so the opponent can no longer recruit on that tree again.

And having a simulation of the battle outcome might also help in the surrender negotiations.
|
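To make the flat input vector idea above a bit more concrete, here is one possible encoding - only a sketch, since the hex count, the creature-id scheme and the array layout are all guesses rather than the real Colossus data model:

// Rough sketch of the [ATTACKING_UNITS, DEFENDING_UNITS, BATTLE_MAP_TERRAIN]
// encoding suggested above, flattened into one array a network could take as input.
public class BattleEncoder {

    // Assumed 27 battle hexes plus one pseudo-hex for "still off the board / entering";
    // adjust to the real map size.
    static final int HEXES = 28;

    /**
     * inputs[0..27]  : id of the attacking creature on each hex (0 = none)
     * inputs[28..55] : id of the defending creature on each hex (0 = none)
     * inputs[56..83] : terrain code of each hex (plain, bramble, drift, ...)
     */
    static double[] encode(int[] attackerByHex, int[] defenderByHex, int[] terrainByHex) {
        double[] inputs = new double[3 * HEXES];
        for (int h = 0; h < HEXES; h++) {
            inputs[h]             = attackerByHex[h];
            inputs[HEXES + h]     = defenderByHex[h];
            inputs[2 * HEXES + h] = terrainByHex[h];
        }
        return inputs;
    }

    public static void main(String[] args) {
        int[] att = new int[HEXES], def = new int[HEXES], ter = new int[HEXES];
        att[27] = 5;   // one attacking creature still off-board (the entry pseudo-hex)
        def[10] = 3;   // a defender sitting on hex 10
        ter[10] = 2;   // hex 10 is, say, bramble
        System.out.println("input vector length = " + encode(att, def, ter).length);
    }
}

In practice something like a one-hot encoding (one input per creature type per hex) would probably train better than raw id numbers, which is roughly what the backgammon network did with its per-point checker counts.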
|
From: David R. <dr...@ri...> - 2012-01-07 21:44:16
|
On 01/07/12 13:26, Tim Sowden wrote:
> Here's a link to one of the games discussed in the class. It's a famous
> example of one of the first learning algorithms ever applied to improve
> an AI - in this case for Backgammon. The 'hidden units' represent
> important ideas you want the AI to learn, stuff like initial placement
> of men, using range strikers and magic missile units effectively,
> reducing facing, etc.
>
> http://www.research.ibm.com/massive/tdl.html

This is an interesting article, but it doesn't go into enough detail for me to understand how to actually implement the neural networks. Guess I have to take the class.

> If you glance at it you'll see that it uses neural networks (something
> covered in detail in the machine learning course including coding
> examples) and that it took on the order of 300,000 games (self playing) to
> improve to world class. More games improved it slightly, but the major
> jump was those first 300,000 games (hence my comment about collecting a
> LOT of data).
>
> In Colossus' case, you don't really need to play complete games as much
> as run combats to improve the combat part.

How practical do you think this is for Titan?

The article says that a well-played backgammon game is about 50-60 moves long, and each move means 21 possible dice rolls and about 20 legal moves per roll, so about 20 moves to consider for the current known roll, and 420 moves to consider for each subsequent unknown roll.

A Titan battle is no more than 7 turns per player, but each move can have a huge branching factor. For example, a defending legion with 7 different 4-skill creatures entering the plains has over 160 million possible moves. You have to narrow that down dramatically to make the AI fast enough, but I'm not sure whether that narrowing invalidates the rest of the neural network idea.

> Then once that's improved
> you'd improve the playing board part of the AI in a similar manner
> (moving, splitting, recruiting etc).

I think the only part of the masterboard AI that's hard is predicting the results of battles. Given that, the simple heuristics that we've had for a decade are probably good enough.

--
David Ripton
dr...@ri...
|
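As a rough sanity check on that 160-million figure: if each of the 7 creatures can reach somewhere around 15 hexes on an open map, treating them independently gives about 15^7, roughly 170 million placements. So the order of magnitude looks right, with the exact count depending on the reachable-hex estimate and on forbidding two creatures from sharing a hex.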
|
From: Tim S. <ts...@ai...> - 2012-01-07 18:27:20
|
Romain,

> (Un)fortunately, over the years, I've become convinced programming is
> not the problem with the AIs - I believe we don't even know what we want
> to do/can do. When we do, lots of people will be willing to jump in to
> do the coding bit.
>
> Hopefully those courses will get us closer :-)
>
> I would love to hear the ideas of someone who has some fresh up-to-date
> knowledge on the subject.

I would agree with you. All the basic parts seem to be in place now in the game; it just needs refinement of what's there.

Here's a link to one of the games discussed in the class. It's a famous example of one of the first learning algorithms ever applied to improve an AI - in this case for Backgammon. The 'hidden units' represent important ideas you want the AI to learn, stuff like initial placement of men, using range strikers and magic missile units effectively, reducing facing, etc.

http://www.research.ibm.com/massive/tdl.html

If you glance at it you'll see that it uses neural networks (something covered in detail in the machine learning course including coding examples) and that it took on the order of 300,000 games (self playing) to improve to world class. More games improved it slightly, but the major jump was those first 300,000 games (hence my comment about collecting a LOT of data).

In Colossus' case, you don't really need to play complete games as much as run combats to improve the combat part. Then once that's improved you'd improve the playing board part of the AI in a similar manner (moving, splitting, recruiting etc).

> famous last words :-) Some off-list words on the subject were collected
> here: <https://sourceforge.net/apps/trac/colossus/wiki/AiThoughts>,
> plus obviously the mailing list archive on SF. The subject of AIs has
> been on the table for nearly 10 years now.

I checked out the link. Many of the things discussed there would be the hidden units of the neural network.

Tim
|
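For anyone curious what the "winner gets +1, loser gets -1" learning from that article looks like in code, here is a very small TD(0) sketch with a linear value function standing in for the network. It is purely illustrative - none of it comes from Colossus, and the real TD-Gammon setup used TD(lambda) with eligibility traces and a multi-layer net:

// Tiny TD(0) learner with a linear value function.  Reward is 0 for every
// non-final move and +1 / -1 when the battle ends; the discount factor is 1.
public class TdLearner {

    private final double[] weights;
    private final double alpha;   // learning rate

    public TdLearner(int nFeatures, double alpha) {
        this.weights = new double[nFeatures];
        this.alpha = alpha;
    }

    /** V(s) = w . f(s): the predicted final result, roughly +1 (win) to -1 (loss). */
    public double value(double[] features) {
        double v = 0.0;
        for (int i = 0; i < features.length; i++) {
            v += weights[i] * features[i];
        }
        return v;
    }

    /** One update after a move from state s to sNext (sNext is ignored on the terminal step). */
    public void update(double[] s, double[] sNext, double reward, boolean terminal) {
        double target = terminal ? reward : value(sNext);
        double error = target - value(s);           // the temporal-difference error
        for (int i = 0; i < s.length; i++) {
            weights[i] += alpha * error * s[i];      // nudge weights toward the target
        }
    }

    public static void main(String[] args) {
        TdLearner learner = new TdLearner(6, 0.01);
        double[] before = { 120, 90, 4, 7, 3.3, 2.1 };   // feature vector before a move
        double[] after  = { 120, 75, 3, 8, 3.5, 1.8 };   // feature vector after the move
        learner.update(before, after, 0.0, false);       // ordinary mid-battle move
        learner.update(after, after, +1.0, true);        // terminal step: we won this battle
        System.out.println("V(before) is now " + learner.value(before));
    }
}

A full training run would repeat this over a few hundred thousand AI-vs-AI battles, calling update() after every move. That is also the answer to the back-propagation-without-a-known-answer problem raised earlier in the thread: the "correct" value for each position is simply taken to be the value of the position that followed it, plus the final win/loss reward.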