Mathematical Foundations of Infinite-Dimensional Statistical Models


Mathematical Foundations of Infinite-Dimensional Statistical Models

In nonparametric and high-dimensional statistical models, the classical Gauss–Fisher–Le Cam theory of the optimality of maximum likelihood and Bayesian posterior inference does not apply, and new foundations and ideas have been developed in the past several decades. This book gives a coherent account of the statistical theory in infinite-dimensional parameter spaces. The mathematical foundations include self-contained ‘mini-courses’ on the theory of Gaussian and empirical processes, on approximation and wavelet theory and on the basic theory of function spaces. The theory of statistical inference in such models – hypothesis testing, estimation and confidence sets – is then presented within the minimax paradigm of decision theory. This includes the basic theory of convolution kernel and projection estimation, as well as Bayesian nonparametrics and nonparametric maximum likelihood estimation. In the final chapter, the theory of adaptive inference in nonparametric models is developed, including Lepski’s method, wavelet thresholding and adaptive confidence regions for self-similar functions.

EVARIST GINÉ (1944–2015) was the head of the Department of Mathematics at the University of Connecticut. Giné was a distinguished mathematician who worked on mathematical statistics and probability in infinite dimensions. He was the author of two books and more than a hundred articles.

RICHARD NICKL is Reader in Mathematical Statistics in the Statistical Laboratory in the Department of Pure Mathematics and Mathematical Statistics at the University of Cambridge. He is also Fellow in Mathematics at Queens’ College, Cambridge.

CAMBRIDGE SERIES IN STATISTICAL AND PROBABILISTIC MATHEMATICS

Editorial Board
Z. Ghahramani (Department of Engineering, University of Cambridge)
R. Gill (Mathematical Institute, Leiden University)
F. P. Kelly (Department of Pure Mathematics and Mathematical Statistics, University of Cambridge)
B. D. Ripley (Department of Statistics, University of Oxford)
S. Ross (Department of Industrial and Systems Engineering, University of Southern California)
M. Stein (Department of Statistics, University of Chicago)

This series of high-quality upper-division textbooks and expository monographs covers all aspects of stochastic applicable mathematics. The topics range from pure and applied statistics to probability theory, operations research, optimization and mathematical programming. The books contain clear presentations of new developments in the field and of the state of the art in classical methods. While emphasising rigorous treatment of theoretical methods, the books also contain applications and discussions of new techniques made possible by advances in computational practice.

A complete list of books in the series can be found at www.cambridge.org/statistics. Recent titles include:

Statistical Analysis of Stochastic Processes in Time, by J. K. Lindsey
Measure Theory and Filtering, by Lakhdar Aggoun and Robert Elliott
Essentials of Statistical Inference, by G. A. Young and R. L. Smith
Elements of Distribution Theory, by Thomas A. Severini
Statistical Mechanics of Disordered Systems, by Anton Bovier
The Coordinate-Free Approach to Linear Models, by Michael J. Wichura
Random Graph Dynamics, by Rick Durrett
Networks, by Peter Whittle
Saddlepoint Approximations with Applications, by Ronald W. Butler
Applied Asymptotics, by A. R. Brazzale, A. C. Davison and N. Reid
Random Networks for Communication, by Massimo Franceschetti and Ronald Meester
Design of Comparative Experiments, by R. A. Bailey
Symmetry Studies, by Marlos A. G. Viana
Model Selection and Model Averaging, by Gerda Claeskens and Nils Lid Hjort
Bayesian Nonparametrics, edited by Nils Lid Hjort et al.
From Finite Sample to Asymptotic Methods in Statistics, by Pranab K. Sen, Julio M. Singer and Antonio C. Pedrosa de Lima
Brownian Motion, by Peter Mörters and Yuval Peres
Probability (Fourth Edition), by Rick Durrett
Stochastic Processes, by Richard F. Bass
Regression for Categorical Data, by Gerhard Tutz
Exercises in Probability (Second Edition), by Loïc Chaumont and Marc Yor
Statistical Principles for the Design of Experiments, by R. Mead, S. G. Gilmour and A. Mead
Quantum Stochastics, by Mou-Hsiung Chang
Nonparametric Estimation under Shape Constraints, by Piet Groeneboom and Geurt Jongbloed
Large Sample Covariance Matrices, by Jianfeng Yao, Zhidong Bai and Shurong Zheng

Mathematical Foundations of Infinite-Dimensional Statistical Models

Evarist Giné
Richard Nickl
University of Cambridge

32 Avenue of the Americas, New York, NY 10013-2473, USA

Cambridge University Press is part of the University of Cambridge. It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/9781107043169

© Evarist Giné and Richard Nickl 2016

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2016
Printed in the United States of America

A catalog record for this publication is available from the British Library.

Library of Congress Cataloging in Publication Data
Giné, Evarist, 1944–2015
Mathematical foundations of infinite-dimensional statistical models / Evarist Giné, University of Connecticut, Richard Nickl, University of Cambridge.
pages cm. – (Cambridge series in statistical and probabilistic mathematics)
Includes bibliographical references and index.
ISBN 978-1-107-04316-9 (hardback)
1. Nonparametric statistics. 2. Function spaces. I. Nickl, Richard, 1980– II. Title.
QA278.8.G56 2016
519.5′4–dc23
2015021997

ISBN 978-1-107-04316-9 Hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party Internet websites referred to in this publication and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

To my wife, Rosalind
In memory of my mother Reingard, 1940–2010

Contents

Preface

1 Nonparametric Statistical Models
1.1 Statistical Sampling Models
1.1.1 Nonparametric Models for Probability Measures
1.1.2 Indirect Observations
1.2 Gaussian Models
1.2.1 Basic Ideas of Regression
1.2.2 Some Nonparametric Gaussian Models
1.2.3 Equivalence of Statistical Experiments
1.3 Notes

2 Gaussian Processes
2.1 Definitions, Separability, 0-1 Law, Concentration
2.1.1 Stochastic Processes: Preliminaries and Definitions
2.1.2 Gaussian Processes: Introduction and First Properties
2.2 Isoperimetric Inequalities with Applications to Concentration
2.2.1 The Isoperimetric Inequality on the Sphere
2.2.2 The Gaussian Isoperimetric Inequality for the Standard Gaussian Measure on R^N
2.2.3 Application to Gaussian Concentration
2.3 The Metric Entropy Bound for Suprema of Sub-Gaussian Processes
2.4 Anderson’s Lemma, Comparison and Sudakov’s Lower Bound
2.4.1 Anderson’s Lemma
2.4.2 Slepian’s Lemma and Sudakov’s Minorisation
2.5 The Log-Sobolev Inequality and Further Concentration
2.5.1 Some Properties of Entropy: Variational Definition and Tensorisation
2.5.2 A First Instance of the Herbst (or Entropy) Method: Concentration of the Norm of a Gaussian Variable about Its Expectation
2.6 Reproducing Kernel Hilbert Spaces
2.6.1 Definition and Basic Properties
2.6.2 Some Applications of RKHS: Isoperimetric Inequality, Equivalence and Singularity, Small Ball Estimates
2.6.3 An Example: RKHS and Lower Bounds for Small Ball Probabilities of Integrated Brownian Motion
2.7 Asymptotics for Extremes of Stationary Gaussian Processes
2.8 Notes

3 Empirical Processes
3.1 Definitions, Overview and Some Background Inequalities
3.1.1 Definitions and Overview
3.1.2 Exponential and Maximal Inequalities for Sums of Independent Centred and Bounded Real Random Variables
3.1.3 The Lévy and Hoffmann-Jørgensen Inequalities
3.1.4 Symmetrisation, Randomisation, Contraction
3.2 Rademacher Processes
3.2.1 A Comparison Principle for Rademacher Processes
3.2.2 Convex Distance Concentration and Rademacher Processes
3.2.3 A Lower Bound for the Expected Supremum of a Rademacher Process
3.3 The Entropy Method and Talagrand’s Inequality
3.3.1 The Subadditivity Property of the Empirical Process
3.3.2 Differential Inequalities and Bounds for Laplace Transforms of Subadditive Functions and Centred Empirical Processes, λ ≥ 0
3.3.3 Differential Inequalities and Bounds for Laplace Transforms of Centred Empirical Processes, λ < 0
3.3.4 The Entropy Method for Random Variables with Bounded Differences and for Self-Bounding Random Variables
3.3.5 The Upper Tail in Talagrand’s Inequality for Nonidentically Distributed Random Variables*
3.4 First Applications of Talagrand’s Inequality
3.4.1 Moment Inequalities
3.4.2 Data-Driven Inequalities: Rademacher Complexities
3.4.3 A Bernstein-Type Inequality for Canonical U-statistics of Order 2
3.5 Metric Entropy Bounds for Suprema of Empirical Processes
3.5.1 Random Entropy Bounds via Randomisation
3.5.2 Bracketing I: An Expectation Bound
3.5.3 Bracketing II: An Exponential Bound for Empirical Processes over Not Necessarily Bounded Classes of Functions
3.6 Vapnik-Červonenkis Classes of Sets and Functions
3.6.1 Vapnik-Červonenkis Classes of Sets
3.6.2 VC Subgraph Classes of Functions
3.6.3 VC Hull and VC Major Classes of Functions
3.7 Limit Theorems for Empirical Processes
3.7.1 Some Measurability
3.7.2 Uniform Laws of Large Numbers (Glivenko-Cantelli Theorems)
3.7.3 Convergence in Law of Bounded Processes
3.7.4 Central Limit Theorems for Empirical Processes I: Definition and Some Properties of Donsker Classes of Functions
3.7.5 Central Limit Theorems for Empirical Processes II: Metric and Bracketing Entropy Sufficient Conditions for the Donsker Property
3.7.6 Central Limit Theorems for Empirical Processes III: Limit Theorems Uniform in P and Limit Theorems for …

4 Function Spaces and Approximation Theory
4.1 Definitions and Basic Approximation Theory
4.1.1 Notation and Preliminaries
4.1.2 Approximate Identities
4.1.3 Approximation in Sobolev Spaces by General Integral Operators
4.1.4 Littlewood-Paley Decomposition
4.2 Orthonormal Wavelet Bases
4.2.1 Multiresolution Analysis of L2
4.2.2 Approximation with Periodic Kernels
4.2.3 Construction of Scaling Functions
4.3 Besov Spaces
4.3.1 Definitions and Characterisations
4.3.2 Basic Theory of the Spaces Bspq
4.3.3 Relationships to Classical Function Spaces
4.3.4 Periodic Besov Spaces on [0, 1]
4.3.5 Boundary-Corrected Wavelet Bases*
4.3.6 Besov Spaces on Subsets of R^d
4.3.7 Metric Entropy Estimates
4.4 Gaussian and Empirical Processes in Besov Spaces
4.4.1 Random Gaussian Wavelet Series in Besov Spaces
4.4.2 Donsker Properties of Balls in Besov Spaces
4.5 Notes

5 Linear Nonparametric Estimators
5.1 Kernel and Projection-Type Estimators
5.1.1 Moment Bounds
5.1.2 Exponential Inequalities, Higher Moments and Almost-Sure Limit Theorems
5.1.3 A Distributional Limit Theorem for Uniform Deviations*
5.2 Weak and Multiscale Metrics
5.2.1 Smoothed Empirical Processes
5.2.2 Multiscale Spaces
5.3 Some Further Topics
5.3.1 Estimation of Functionals
5.3.2 Deconvolution
5.4 Notes

6 The Minimax Paradigm
6.1 Likelihoods and Information
6.1.1 Infinite-Dimensional Gaussian Likelihoods
6.1.2 Basic Information Theory
6.2 Testing Nonparametric Hypotheses
6.2.1 Construction of Tests for Simple Hypotheses
6.2.2 Minimax Testing of Uniformity on [0, 1]
6.2.3 Minimax Signal-Detection Problems in Gaussian White Noise
6.2.4 Composite Testing Problems
6.3 Nonparametric Estimation
6.3.1 Minimax Lower Bounds via Multiple Hypothesis Testing

6.3.2 Function Estimation in L∞ Loss
6.3.3 Function Estimation in Lp-Loss
6.4 Nonparametric Confidence Sets
6.4.1 Honest Minimax Confidence Sets
6.4.2 Confidence Sets for Nonparametric Estimators
6.5 Notes

7 Likelihood-Based Procedures
7.1 Nonparametric Testing in Hellinger Distance
7.2 Nonparametric Maximum Likelihood Estimators
7.2.1 Rates of Convergence in Hellinger Distance
7.2.2 The Information Geometry of the Likelihood Function
7.2.3 The Maximum Likelihood Estimator over a Sobolev Ball
7.2.4 The Maximum Likelihood Estimator of a Monotone Density
7.3 Nonparametric Bayes Procedures
7.3.1 General Contraction Results for Posterior Distributions
7.3.2 Contraction Results with Gaussian Priors
7.3.3 Product Priors in Gaussian Regression
7.3.4 Nonparametric Bernstein–von Mises Theorems
7.4 Notes

8 Adaptive Inference
8.1 Adaptive Multiple-Testing Problems
8.1.1 Adaptive Testing with L2-Alternatives
8.1.2 Adaptive Plug-in Tests for L∞-Alternatives
8.2 Adaptive Estimation
8.2.1 Adaptive Estimation in L2
8.2.2 Adaptive Estimation in L∞
8.3 Adaptive Confidence Sets
8.3.1 Confidence Sets in Two-Class Adaptation Problems
8.3.2 Confidence Sets for Adaptive Estimators I
8.3.3 Confidence Sets for Adaptive Estimators II: Self-Similar Functions
8.3.4 Some Theory for Self-Similar Functions
8.4 Notes

References
Author Index
Index

Preface

The classical theory of statistics was developed for parametric models with finite-dimensional parameter spaces, building on fundamental ideas of C. F. Gauss, R. A. Fisher and L. Le Cam, among others. It has been successful in providing modern science with a paradigm for making statistical inferences, in particular, in the ‘frequentist large sample size’ scenario. A comprehensive account of the mathematical foundations of this classical theory is given in the monograph by A. van der Vaart, Asymptotic Statistics (Cambridge University Press, 1998).

The last three decades have seen the development of statistical models that are infinite (or ‘high’) dimensional. The principal target of statistical inference in these models is a function or an infinite vector f that itself is not modelled further parametrically. Hence, these models are often called, in some abuse of terminology, nonparametric models, although f itself clearly also is a parameter. In view of modern computational techniques, such models are tractable and in fact attractive in statistical practice. Moreover, a mathematical theory of such nonparametric models has emerged, originally driven by the Russian school in the early 1980s and since then followed by a phase of very high international activity.

This book is an attempt to describe some elements of the mathematical theory of statistical inference in such nonparametric, or infinite-dimensional, models. We will first establish the main probabilistic foundations: the theory of Gaussian and empirical processes, with an emphasis on the ‘nonasymptotic concentration of measure’ perspective on these areas, including the pathbreaking work by M. Talagrand and M. Ledoux on concentration inequalities for product measures. Moreover, since a thorough understanding of infinite-dimensional models requires a solid background in functional analysis and approximation theory, some of the most relevant results from these areas, particularly the theory of wavelets and of Besov spaces, will be developed from first principles in this book.

After these foundations have been laid, we turn to the statistical core of the book. Comparing nonparametric models in a very informal way with classical parametric models, one may think of them as models in which the number of parameters that one estimates from the observations is growing proportionally to sample size n and has to be carefully selected by the statistician, ideally in a data-driven way. In practice, nonparametric modelling is often driven by the honesty of admitting that the traditional assumption that n is large compared to the number of unknown parameters is too strong. From a mathematical point of view, the frequentist theory that validates statistical inferences in such models undergoes a radical shift: leaving the world of finite-dimensional statistical models behind implies that the likelihood function no longer provides ‘automatically optimal’ statistical methods (‘maximum likelihood estimators’) and that extreme care has to be exercised when

constructing inference procedures. In particular, the Gauss–Fisher–Le Cam efficiency theory based on the Fisher information typically yields nothing informative about what optimal procedures are in nonparametric statistics, and a new theoretical framework is required. We will show how the minimax paradigm can serve as a benchmark by which a theory of optimality in nonparametric models can be developed. From this paradigm arises the ‘adaptation’ problem, whose solution has been perhaps one of the major achievements of the theory of nonparametric statistics and which will be presented here for nonparametric function estimation problems. Finally, likelihood-based procedures can be relevant in nonparametric models as well, particularly after some regularisation step that can be incorporated by adopting a ‘Bayesian’ approach or by imposing qualitative a priori shape constraints. How such approaches can be analysed mathematically also will be shown here.

Our presentation of the main statistical materials focusses on function estimation problems, such as density estimation or signal in white-noise models. Many other nonparametric models have similar features but are formally different. Our aim is to present a unified statistical theory for a canonical family of infinite-dimensional models, and this comes at the expense of the breadth of topics that could be covered. However, the mathematical mechanisms described here also can serve as guiding principles for many nonparametric problems not covered in this book.

Throughout this book, we assume familiarity with material from real and functional analysis, measure and probability theory on the level of a US graduate course on the subject. We refer to the monographs by G. Folland, Real Analysis (Wiley, 1999), and R. Dudley, Real Analysis and Probability (Cambridge University Press, 2002), for relevant background. Apart from this, the monograph is self-contained, with a few exceptions and ‘starred sections’ indicated in the text.

This book would not have been possible without the many colleagues and friends from whom we learnt, either in person or through their writings. Among them, we would like to thank P. Bickel, L. Birgé, S. Boucheron, L. Brown, T. Cai, I. Castillo, V. Chernozhukov, P. Dawid, L. Devroye, D. Donoho, R. Dudley, L. Dümbgen, U. Einmahl, X. Fernique, S. Ghosal, A. Goldenshluger, Y. Golubev, M. Hoffmann, I. Ibragimov, Y. Ingster, A. Iouditski, I. Johnstone, G. Kerkyacharian, R. Khasminskii, V. Koltchinskii, R. Latala, M. Ledoux, O. Lepski, M. Low, G. Lugosi, W. Madych, E. Mammen, D. Mason, P. Massart, M. Nussbaum, D. Picard, B. Pötscher, M. Reiß, P. Rigollet, Y. Ritov, R. Samworth, V. Spokoiny, M. Talagrand, A. Tsybakov, S. van de Geer, A. van der Vaart, H. van Zanten, J. Wellner, H. Zhou and J. Zinn.

We are grateful to A. Carpentier, I. Castillo, U. Einmahl, D. Gauthier, D. Heydecker, K. Ray, J. Söhl and B. Szabò for proofreading parts of the manuscript and providing helpful corrections.

Moreover, we are indebted to Diana Gillooly of Cambridge University Press for her support, patience and understanding in the process of this book project since 2011.

R.N. would also like to thank his friends N. Berestycki, C. Damböck, R. Dawid and M. Neuber for uniquely stimulating friendships that have played a large role in the intellectual development that led to this book (and beyond).

Outline and Reading Guide

In principle, all the chapters of this book can be read independently. In particular, the chapters on Gaussian and empirical processes …

