An Algorithm of Policy Gradient Reinforcement Learning with a Fuzzy Controller in Policies
Harukazu Igarashi, Seiji Ishihara
Pages - 17 - 26     |    Revised - 05-04-2013     |    Published - 30-04-2013
Volume - 4   Issue - 1    |    Publication Date - April 2013
Reinforcement Learning, Policy Gradient Method, Fuzzy Inference, Membership Function
Typical fuzzy reinforcement learning algorithms take value-function-based approaches, such as fuzzy Q-learning in Markov Decision Processes (MDPs), and use constant or linear functions in the consequent parts of fuzzy rules. Instead, we propose a fuzzy reinforcement learning algorithm based on the policy gradient approach. Our method can handle fuzzy sets even in the consequent parts and can also learn the weights of fuzzy rules. Specifically, we derive learning rules for the membership functions and rule weights in two cases: when the input/output variables of the control system are discrete and when they are continuous.
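The general idea of learning rule weights by policy gradient can be illustrated with a minimal sketch. It is not the paper's derivation: the softmax policy form, the Gaussian antecedent sets, the two-rule `RULES` table, and all function names are assumptions made for illustration. Only the rule weights are updated here, for a single scalar state and three discrete actions. Learning the membership-function parameters is left out.

```python
import math

def gaussian(x, center, width):
    """Gaussian membership grade of x in a fuzzy set (assumed shape)."""
    return math.exp(-((x - center) / width) ** 2)

# Hypothetical rule base: each rule pairs an antecedent fuzzy set over the
# state with a consequent fuzzy set (grades) over three discrete actions.
RULES = [
    {"center": -1.0, "width": 1.0, "consequent": [1.0, 0.3, 0.0]},
    {"center": 1.0, "width": 1.0, "consequent": [0.0, 0.3, 1.0]},
]

def policy(state, weights):
    """Softmax policy; preference(a) = sum_i w_i * A_i(state) * B_i(a)."""
    n_actions = len(RULES[0]["consequent"])
    prefs = [
        sum(w * gaussian(state, r["center"], r["width"]) * r["consequent"][a]
            for w, r in zip(weights, RULES))
        for a in range(n_actions)
    ]
    top = max(prefs)  # subtract the maximum for numerical stability
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def grad_log_policy(state, action, weights):
    """Gradient of log pi(action | state) with respect to each rule weight."""
    probs = policy(state, weights)
    grads = []
    for r in RULES:
        firing = gaussian(state, r["center"], r["width"])
        expected = sum(p * b for p, b in zip(probs, r["consequent"]))
        grads.append(firing * (r["consequent"][action] - expected))
    return grads

def reinforce_step(state, action, reward, weights, lr=0.1):
    """One REINFORCE-style gradient-ascent update of the rule weights."""
    grads = grad_log_policy(state, action, weights)
    return [w + lr * reward * g for w, g in zip(weights, grads)]
```

For example, `reinforce_step(-1.0, 0, 1.0, [1.0, 1.0])` returns weights under which action 0 becomes more probable in state -1.0, because the positive reward pushes the weights along the gradient of that action's log-probability.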
CITED BY (1)  
1 Sugimoto Masaya, Igarashi Harukazu, Ishihara Seiji, & Tanaka Ichi-ki (2014) fuzzy control strategy gradient method with the difference between the approach expressed by the rule:. Action decision in RoboCup small size league intelligence and information, 26 (3), 647-657.
Professor Harukazu Igarashi
Shibaura Institute of Technology - Japan
Associate Professor Seiji Ishihara
Tokyo Denki University - Japan
