Research article summary (published 8 Jul 2009):
Simple artificial neural networks that match probability and exploit and explore when confronting a multiarmed bandit.
Full Abstract
The matching law (Herrnstein 1961) states that response rates become proportional to reinforcement rates; this is related to the empirical phenomenon called probability matching (Vulkan 2000). Here, we show that a simple artificial neural network generates responses consistent with probability matching. This behavior was then used to create an operant procedure for network learning. We use the multiarmed bandit (Gittins 1989), a classic problem of choice behavior, to illustrate that operant training balances exploiting the bandit arm expected to pay off most frequently with exploring other arms. Perceptrons provide a medium for relating results from neural networks, genetic algorithms, animal learning, contingency theory, reinforcement learning, and theories of choice.
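The operant procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and the details are assumptions made for illustration: each bandit arm is coded as a one-hot input to a single logistic output unit, so one weight per arm suffices. The network picks an arm with probability proportional to its current output for that arm (probability matching), and only the chosen arm's weight is updated with a delta rule. The function name `train_operant`, the learning rate, the trial count, and the payoff probabilities are all hypothetical.

```python
import math
import random


def sigmoid(x):
    """Logistic activation; maps a net input to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def train_operant(reward_probs, trials=5000, lr=0.05, seed=0):
    """Train a one-layer network on a Bernoulli multiarmed bandit.

    reward_probs: true payoff probability of each arm (hidden from the network).
    Returns (estimates, pulls): the network's output for each arm and the
    number of times each arm was chosen.
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    # Assumption: one-hot arm coding, so one weight per arm is enough.
    weights = [0.0] * n_arms
    pulls = [0] * n_arms

    for _ in range(trials):
        estimates = [sigmoid(w) for w in weights]
        # Probability matching: choose an arm with probability proportional
        # to the network's current output for it, not the single best arm.
        arm = rng.choices(range(n_arms), weights=estimates)[0]
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        # Delta rule applied to the chosen arm only (operant feedback).
        weights[arm] += lr * (reward - estimates[arm])
        pulls[arm] += 1

    return [sigmoid(w) for w in weights], pulls


if __name__ == "__main__":
    estimates, pulls = train_operant([0.2, 0.5, 0.8])
    print("estimated payoff probabilities:", [round(e, 2) for e in estimates])
    print("pulls per arm:", pulls)
```

Because choices are sampled in proportion to the outputs rather than taken greedily, the arm with the highest estimated payoff is pulled most often (exploitation) while the other arms continue to be sampled (exploration).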
Author information
Author/s: Dawson, Michael R W (MR); Dupuis, Brian (B); Spetch, Marcia L (ML); Kelly, Debbie M (DM);
Affiliation: Department of Psychology, University of Alberta, Edmonton, AB T6G 2P9, Canada. mdawson(-atsign-)ualberta.ca
Journal and publication information
Publication Type: Journal Article; Research Support, Non-U.S. Gov't
Journal: IEEE transactions on neural networks / a publication of the IEEE Neural Networks Council (IEEE Trans Neural Netw), published in United States. (Language: eng)
Reference: 2009-Aug; vol 20 (issue 8) : pp 1368-71
Dates: Created 2009/08/07; Completed 2009/10/19; Revised 2009/10/28;
PMID: 19596631, status: MEDLINE (last retrieval date: 10/28/2009)
Sourced from the National Library of Medicine. Abstract text and other information may be subject to copyright.