Universal Coding and Order Identification by Model Selection Methods (Springer Monographs in Mathematics)
Provisional Chinese title: 通用編碼與模型選擇方法的順序識別 (施普林格數學專著)

Élisabeth Gassiat

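  • Publisher: Springer
  • Publication date: 2018-08-09
  • List price: $4,890
  • VIP price: $4,646 (5% off)
  • Language: English
  • Pages: 146
  • Binding: Hardcover
  • ISBN: 3319962612
  • ISBN-13: 9783319962610
  • Overseas special-order title (must be checked out separately)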

Product Description

The purpose of these notes is to highlight the far-reaching connections between Information Theory and Statistics. Universal coding and adaptive compression are indeed closely related to statistical inference concerning processes and using maximum likelihood or Bayesian methods. The book is divided into four chapters, the first of which introduces readers to lossless coding, provides an intrinsic lower bound on the codeword length in terms of Shannon’s entropy, and presents some coding methods that can achieve this lower bound, provided the source distribution is known. In turn, Chapter 2 addresses universal coding on finite alphabets, and seeks to find coding procedures that can achieve the optimal compression rate, regardless of the source distribution. It also quantifies the speed of convergence of the compression rate to the source entropy rate. These powerful results do not extend to infinite alphabets. In Chapter 3, it is shown that there are no universal codes over the class of stationary ergodic sources over a countable alphabet. This negative result prompts at least two different approaches: the introduction of smaller sub-classes of sources known as envelope classes, over which adaptive coding may be feasible, and the redefinition of the performance criterion by focusing on compressing the message pattern. Finally, Chapter 4 deals with the question of order identification in statistics. This question belongs to the class of model selection problems and arises in various practical situations in which the goal is to identify an integer characterizing the model: the length of dependency for a Markov chain, number of hidden states for a hidden Markov chain, and number of populations for a population mixture. The coding ideas and techniques developed in previous chapters allow us to obtain new results in this area. 

This book is accessible to anyone with graduate-level training in mathematics, and will appeal to information theorists and mathematical statisticians alike. Except for Chapter 4, all proofs are detailed and all tools needed to understand the text are reviewed.
