Contents

Series Foreword
Preface
The Problem

    Introduction
        Reinforcement Learning
        Examples
        Elements of Reinforcement Learning
        An Extended Example: Tic-Tac-Toe
        Summary
        History of Reinforcement Learning
        Bibliographical Remarks
    Evaluative Feedback
        An n-Armed Bandit Problem
        Action-Value Methods
        Softmax Action Selection
        Evaluation Versus Instruction
        Incremental Implementation
        Tracking a Nonstationary Problem
        Optimistic Initial Values
        Reinforcement Comparison
        Pursuit Methods
        Associative Search
        Conclusions
        Bibliographical and Historical Remarks
    The Reinforcement Learning Problem
        The Agent-Environment Interface
        Goals and Rewards
        Returns
        Unified Notation for Episodic and Continuing Tasks
        The Markov Property
        Markov Decision Processes
        Value Functions
        Optimal Value Functions
        Optimality and Approximation
        Summary
        Bibliographical and Historical Remarks
Elementary Solution Methods

    Dynamic Programming
        Policy Evaluation
        Policy Improvement
        Policy Iteration
        Value Iteration
        Asynchronous Dynamic Programming
        Generalized Policy Iteration
        Efficiency of Dynamic Programming
        Summary
        Bibliographical and Historical Remarks
    Monte Carlo Methods
        Monte Carlo Policy Evaluation
        Monte Carlo Estimation of Action Values
        Monte Carlo Control
        On-Policy Monte Carlo Control
        Evaluating One Policy While Following Another
        Off-Policy Monte Carlo Control
        Incremental Implementation
        Summary
        Bibliographical and Historical Remarks
    Temporal-Difference Learning
        TD Prediction
        Advantages of TD Prediction Methods
        Optimality of TD(0)
        Sarsa: On-Policy TD Control
        Q-Learning: Off-Policy TD Control
        Actor-Critic Methods
        R-Learning for Undiscounted Continuing Tasks
        Games, Afterstates, and Other Special Cases
        Summary
        Bibliographical and Historical Remarks
A Unified View

    Eligibility Traces
        n-Step TD Prediction
        The Forward View of TD(λ)
        The Backward View of TD(λ)
        Equivalence of Forward and Backward Views
        Sarsa(λ)
        Q(λ)
        Eligibility Traces for Actor-Critic Methods
        Replacing Traces
        Implementation Issues
        Variable λ
        Conclusions
        Bibliographical and Historical Remarks
    Generalization and Function Approximation
        Value Prediction with Function Approximation
        Gradient-Descent Methods
        Linear Methods
        Control with Function Approximation
        Off-Policy Bootstrapping
        Should We Bootstrap?
        Summary
        Bibliographical and Historical Remarks
    Planning and Learning
        Models and Planning
        Integrating Planning, Acting, and Learning
        When the Model Is Wrong
        Prioritized Sweeping
        Full vs. Sample Backups
        Trajectory Sampling
        Heuristic Search
        Summary
        Bibliographical and Historical Remarks
    Dimensions of Reinforcement Learning
        The Unified View
        Other Frontier Dimensions
    Case Studies
        TD-Gammon
        Samuel's Checkers Player
        The Acrobot
        Elevator Dispatching
        Dynamic Channel Allocation
        Job-Shop Scheduling
References
Summary of Notation
Index