Algorithms for Reinforcement Learning (Paperback)
Csaba Szepesvari
- Publisher: Morgan & Claypool
- Publication date: 2010-06-25
- List price: $1,430
- Member price: 5% off, $1,359
- Language: English
- Pages: 104
- Binding: Paperback
- ISBN: 1608454924
- ISBN-13: 9781608454921
- Related categories: Reinforcement, DeepLearning, Algorithms-data-structures
- Availability: ships immediately (1 in stock)
Product description
Reinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner's predictions. Further, the predictions may have long-term effects through influencing the future state of the controlled system. Thus, time plays a special role. The goal in reinforcement learning is to develop efficient learning algorithms, as well as to understand the algorithms' merits and limitations. Reinforcement learning is of great interest because of the large number of practical applications that it can be used to address, ranging from problems in artificial intelligence to operations research or control engineering. In this book, we focus on those algorithms of reinforcement learning that build on the powerful theory of dynamic programming. We give a fairly comprehensive catalog of learning problems, describe the core ideas, and note a large number of state-of-the-art algorithms, followed by a discussion of their theoretical properties and limitations.