Reinforcement Learning: An Introduction

ISBN-10: 0262193981

ISBN-13: 9780262193986

Edition: 1st (1998)

Authors: Richard S. Sutton, Andrew G. Barto

List price: $75.00 Buy it from $50.70 Rent it from $34.09
eBook available
This item qualifies for FREE shipping

*A minimum purchase of $35 is required. Shipping is provided via FedEx SmartPost® and FedEx Express Saver®. Average delivery time is 1–5 business days, but is not guaranteed in that timeframe. Also allow 1–2 days for processing. Free shipping is eligible only in the continental United States and excludes Hawaii, Alaska, and Puerto Rico. FedEx service marks used by permission. "Marketplace" orders are not eligible for free or discounted shipping.

30-day, 100% satisfaction guarantee

If an item you ordered from TextbookRush does not meet your expectations due to an error on our part, simply fill out a return request and return the item by mail within 30 days of ordering for a full refund of the item cost.

Learn more about our returns policy

Description:

This volume offers an account of the key ideas and algorithms of reinforcement learning, with discussion ranging from the history of the field's intellectual foundations to recent developments and applications.

Book details

List price: $75.00
Copyright year: 1998
Publisher: MIT Press
Publication date: 2/26/1998
Binding: Hardcover
Pages: 344
Size: 7.25" wide x 9.50" long x 1.25" tall
Weight: 2.024 lbs
Language: English

Contents
Series Foreword
Preface
The Problem
Introduction
Reinforcement Learning
Examples
Elements of Reinforcement Learning
An Extended Example: Tic-Tac-Toe
Summary
History of Reinforcement Learning
Bibliographical Remarks
Evaluative Feedback
An n-Armed Bandit Problem
Action-Value Methods
Softmax Action Selection
Evaluation Versus Instruction
Incremental Implementation
Tracking a Nonstationary Problem
Optimistic Initial Values
Reinforcement Comparison
Pursuit Methods
Associative Search
Conclusions
Bibliographical and Historical Remarks
The Reinforcement Learning Problem
The Agent-Environment Interface
Goals and Rewards
Returns
Unified Notation for Episodic and Continuing Tasks
The Markov Property
Markov Decision Processes
Value Functions
Optimal Value Functions
Optimality and Approximation
Summary
Bibliographical and Historical Remarks
Elementary Solution Methods
Dynamic Programming
Policy Evaluation
Policy Improvement
Policy Iteration
Value Iteration
Asynchronous Dynamic Programming
Generalized Policy Iteration
Efficiency of Dynamic Programming
Summary
Bibliographical and Historical Remarks
Monte Carlo Methods
Monte Carlo Policy Evaluation
Monte Carlo Estimation of Action Values
Monte Carlo Control
On-Policy Monte Carlo Control
Evaluating One Policy While Following Another
Off-Policy Monte Carlo Control
Incremental Implementation
Summary
Bibliographical and Historical Remarks
Temporal-Difference Learning
TD Prediction
Advantages of TD Prediction Methods
Optimality of TD(0)
Sarsa: On-Policy TD Control
Q-Learning: Off-Policy TD Control
Actor-Critic Methods
R-Learning for Undiscounted Continuing Tasks
Games, Afterstates, and Other Special Cases
Summary
Bibliographical and Historical Remarks
A Unified View
Eligibility Traces
n-Step TD Prediction
The Forward View of TD(λ)
The Backward View of TD(λ)
Equivalence of Forward and Backward Views
Sarsa(λ)
Q(λ)
Eligibility Traces for Actor-Critic Methods
Replacing Traces
Implementation Issues
Variable λ
Conclusions
Bibliographical and Historical Remarks
Generalization and Function Approximation
Value Prediction with Function Approximation
Gradient-Descent Methods
Linear Methods
Control with Function Approximation
Off-Policy Bootstrapping
Should We Bootstrap?
Summary
Bibliographical and Historical Remarks
Planning and Learning
Models and Planning
Integrating Planning, Acting, and Learning
When the Model Is Wrong
Prioritized Sweeping
Full vs. Sample Backups
Trajectory Sampling
Heuristic Search
Summary
Bibliographical and Historical Remarks
Dimensions of Reinforcement Learning
The Unified View
Other Frontier Dimensions
Case Studies
TD-Gammon
Samuel's Checkers Player
The Acrobot
Elevator Dispatching
Dynamic Channel Allocation
Job-Shop Scheduling
References
Summary of Notation
Index

