Publication
Abalearn: a risk-sensitive approach to self-play learning in Abalone
dc.contributor.author | Campos, Pedro | |
dc.contributor.author | Langlois, Thibault | |
dc.date.accessioned | 2022-09-13T11:05:59Z | |
dc.date.available | 2022-09-13T11:05:59Z | |
dc.date.issued | 2003 | |
dc.description.abstract | This paper presents Abalearn, a self-teaching Abalone program capable of automatically reaching an intermediate level of play without needing expert-labeled training examples, deep searches or exposure to competent play. Our approach is based on a reinforcement learning algorithm that is risk seeking, since defensive players in Abalone tend to never end a game. We show that it is the risk-sensitivity that allows a successful self-play training. We also propose a set of features that seem relevant for achieving a good level of play. We evaluate our approach using a fixed heuristic opponent as a benchmark, pitting our agents against human players online and comparing samples of our agents at different times of training. | pt_PT |
dc.description.version | info:eu-repo/semantics/publishedVersion | pt_PT |
dc.identifier.citation | Campos, P., Langlois, T. (2003). Abalearn: A Risk-Sensitive Approach to Self-play Learning in Abalone. In: Lavrač, N., Gamberger, D., Blockeel, H., Todorovski, L. (eds) Machine Learning: ECML 2003. ECML 2003. Lecture Notes in Computer Science, vol 2837. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39857-8_6 | pt_PT |
dc.identifier.doi | 10.1007/978-3-540-39857-8_6 | pt_PT |
dc.identifier.uri | http://hdl.handle.net/10400.13/4603 | |
dc.language.iso | eng | pt_PT |
dc.publisher | Springer | pt_PT |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | pt_PT |
dc.subject | Abalearn | pt_PT |
dc.subject | Self-play learning | pt_PT |
dc.subject | Abalone | pt_PT |
dc.subject | Faculdade de Ciências Exatas e da Engenharia | pt_PT |
dc.title | Abalearn: a risk-sensitive approach to self-play learning in Abalone | pt_PT |
dc.type | conference object | |
dspace.entity.type | Publication | |
oaire.citation.endPage | 46 | pt_PT |
oaire.citation.startPage | 35 | pt_PT |
oaire.citation.title | Machine Learning: ECML 2003. ECML 2003. Lecture Notes in Computer Science, vol 2837. | pt_PT |
oaire.citation.volume | 2837 | pt_PT |
person.familyName | Pereira Campos | |
person.givenName | Pedro Filipe | |
person.identifier.ciencia-id | 7C19-B5E5-01CA | |
person.identifier.orcid | 0000-0001-7706-5038 | |
rcaap.rights | openAccess | pt_PT |
rcaap.type | conferenceObject | pt_PT |
relation.isAuthorOfPublication | fb4a962b-b799-4ba2-8778-3d9d0a64b2b0 | |
relation.isAuthorOfPublication.latestForDiscovery | fb4a962b-b799-4ba2-8778-3d9d0a64b2b0 |