
Robust Bayesian Compressed Sensing

Qian Wan, Huiping Duan, Jun Fang, and Hongbin Li, Senior Member, IEEE

Abstract— We consider the problem of robust compressed sensing, where the objective is to recover a high-dimensional sparse signal from compressed measurements partially corrupted by outliers. A new sparse Bayesian learning method is developed for this purpose. The basic idea of the proposed method is to identify the outliers and exclude them from sparse signal recovery. To automatically identify the outliers, we employ a set of binary indicator variables to indicate which observations are outliers. These indicator variables are assigned a beta-Bernoulli hierarchical prior such that their values are confined to be binary. In addition, a Gaussian-inverse Gamma prior is imposed on the sparse signal to promote sparsity. Based on this hierarchical prior model, we develop a variational Bayesian method to estimate the indicator variables as well as the sparse signal. Simulation results show that the proposed method achieves a substantial performance improvement over existing robust compressed sensing techniques.

Index Terms— Robust Bayesian compressed sensing, variational Bayesian inference, outlier detection.

Qian Wan and Jun Fang are with the National Key Laboratory of Science and Technology on Communications, University of Electronic Science and Technology of China, Chengdu 611731, China. Email: JunFang@uestc.edu.cn
Huiping Duan is with the School of Electronic Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China. Email: huipingduan@uestc.edu.cn
Hongbin Li is with the Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, USA. E-mail: Hongbin.Li@stevens.edu
This work was supported in part by the National Science Foundation of China under Grant 61522104, by the National Science Foundation under Grants ECCS-1408182 and ECCS-1609393, and by the Air Force Office of Scientific Research under Grant FA9550-16-1-0243.

I. INTRODUCTION

Compressed sensing, a new paradigm for data acquisition and reconstruction, has drawn much attention over the past few years [1]–[3]. The main purpose of compressed sensing is to recover a high-dimensional sparse signal from a low-dimensional linear measurement vector. In practice, measurements are inevitably contaminated by noise due to hardware imperfections, quantization errors, or transmission errors. Most existing studies (e.g. [4]–[6]) assume that measurements are corrupted with noise that is evenly distributed across the observations, such as independent and identically distributed (i.i.d.) Gaussian, thermal, or quantization noise. This assumption is valid in many cases. Nevertheless, in some scenarios, measurements may be corrupted by outliers that are significantly different from their nominal values. For example, during the data acquisition process, outliers can be caused by sensor failures or calibration errors [7], [8], and it is usually unknown which measurements have been corrupted. Outliers can also arise as a result of signal clipping/saturation or impulse noise [9], [10]. Conventional compressed sensing techniques may incur severe performance degradation in the presence of outliers. To address this issue, in previous works (e.g. [7]–[10]), outliers are modeled as a sparse error vector, and the observed data are expressed as

    y = A x + e + w                                                          (1)

where A ∈ R^{M×N} is the sampling matrix with M ≪ N, x denotes an N-dimensional sparse vector with only K nonzero coefficients, e ∈ R^M denotes the outlier vector consisting of T ≪ M nonzero entries with arbitrary amplitudes, and w denotes additive multivariate Gaussian noise with zero mean and covariance matrix (1/γ) I. The above model can be formulated as a conventional compressed sensing problem as

    y = [A  I] [x; e] + w  ≜  B u + w                                        (2)

Efficient compressed sensing algorithms can then be employed to estimate the sparse signal as well as the outliers. Recovery guarantees for x and e were also analyzed in [7]–[10].

The rationale behind the above approach is to detect and compensate for the outliers simultaneously. Besides the above method, another more direct approach is to identify and exclude the outliers from sparse signal recovery. Although it may seem preferable to compensate for rather than simply reject outliers, inaccurate estimation of the compensation (i.e. the outlier vector) could have a destructive effect on sparse signal recovery, particularly when the number of measurements is limited. In this case, identifying and rejecting outliers could be a more sensible strategy. Motivated by this insight, we develop a Bayesian framework for robust compressed sensing, in which a set of binary indicator variables is employed to indicate which observations are outliers. These variables are assigned a beta-Bernoulli hierarchical prior such that their values are confined to be binary. Also, a Gaussian-inverse Gamma prior is placed on the sparse signal to promote sparsity. A variational Bayesian method is developed to find the approximate posterior distributions of the indicators, the sparse signal, and the other latent variables. Simulation results show that the proposed method achieves a substantial performance improvement over the compensation-based robust compressed sensing method.

II. HIERARCHICAL PRIOR MODEL

We develop a Bayesian framework which employs a set of indicator variables z ≜ {z_m} to indicate which observations are outliers, i.e. z_m = 1 indicates that y_m is a normal observation, otherwise y_m is an outlier. More precisely, we can write

    y_m = a_m^r x + w_m,            if z_m = 1
    y_m = a_m^r x + w_m + e_m,      if z_m = 0                               (3)

where a_m^r denotes the m-th row of A, and e_m and w_m are the m-th entries of e and w, respectively.
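As a concrete illustration of model (1) and the indicator-variable formulation (3), the following minimal NumPy sketch draws one synthetic problem instance. The function name `generate_data`, the Gaussian sampling matrix, and the default sizes (M = 25, N = 64, K = 3, T = 7, mirroring the simulation settings used later in Section IV) are illustrative assumptions, not part of the paper.

```python
import numpy as np

def generate_data(M=25, N=64, K=3, T=7, noise_var=0.01, rng=None):
    """Draw one synthetic instance of y = A x + e + w (model (1)).

    x is K-sparse, e contains T outliers (z_m = 0 marks the corrupted
    measurements, as in (3)), and w is i.i.d. Gaussian noise with
    variance 1/gamma = noise_var.
    """
    rng = np.random.default_rng(rng)
    A = rng.standard_normal((M, N)) / np.sqrt(M)      # illustrative sampling matrix, M << N

    x = np.zeros(N)                                   # K-sparse signal
    support = rng.choice(N, size=K, replace=False)
    x[support] = rng.standard_normal(K)

    z = np.ones(M, dtype=int)                         # indicator variables of (3)
    corrupted = rng.choice(M, size=T, replace=False)
    z[corrupted] = 0
    e = np.zeros(M)
    e[corrupted] = rng.uniform(-10.0, 10.0, T)        # outliers with arbitrary amplitudes

    w = np.sqrt(noise_var) * rng.standard_normal(M)   # Gaussian noise
    y = A @ x + e + w
    return y, A, x, e, z
```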

The probability of the observed data conditional on these indicator variables can be expressed as

    p(y | x, z, γ) = Π_{m=1}^{M} [ N(y_m | a_m^r x, 1/γ) ]^{z_m}             (4)

in which those "presumed outliers" are automatically disabled when calculating the probability. To infer the indicator variables, a beta-Bernoulli hierarchical prior [11], [12] is placed on z, i.e. each component of z is assumed to be drawn from a Bernoulli distribution parameterized by π_m

    p(z_m | π_m) = Bernoulli(z_m | π_m) = π_m^{z_m} (1 − π_m)^{1 − z_m},  ∀m    (5)

and π_m follows a beta distribution

    p(π_m) = Beta(e, f),  ∀m                                                 (6)

where e and f are parameters characterizing the beta distribution. Note that the beta-Bernoulli prior assumes the random variables {z_m} are mutually independent, and so are the random variables {π_m}.

To encourage a sparse solution, a Gaussian-inverse Gamma hierarchical prior, which has been widely used in sparse Bayesian learning (e.g. [13]–[16]), is employed. Specifically, in the first layer, x is assigned a Gaussian prior distribution

    p(x | α) = Π_{n=1}^{N} p(x_n | α_n)                                      (7)

where p(x_n | α_n) = N(x_n | 0, α_n^{−1}), and α ≜ {α_n} are non-negative hyperparameters controlling the sparsity of the signal x. The second layer specifies Gamma distributions as hyperpriors over the precision parameters {α_n}, i.e.

    p(α) = Π_{n=1}^{N} Gamma(α_n | a, b) = Π_{n=1}^{N} Γ(a)^{−1} b^a α_n^{a−1} e^{−b α_n}    (8)

where the parameters a and b are set to small values (e.g. a = b = 10^{−10}) in order to provide non-informative (over a logarithmic scale) hyperpriors over {α_n}. Also, to estimate the noise variance, we place a Gamma hyperprior over γ, i.e.

    p(γ) = Gamma(γ | c, d) = Γ(c)^{−1} d^c γ^{c−1} e^{−d γ}                  (9)

where the parameters c and d are set to be small, e.g. c = d = 10^{−10}. The graphical model of the proposed hierarchical prior is shown in Fig. 1.

[Fig. 1. Graphical model for robust Bayesian compressed sensing.]
III. VARIATIONAL BAYESIAN INFERENCE

We now proceed to perform Bayesian inference for the proposed hierarchical model. Let θ ≜ {z, x, π, α, γ} denote the hidden variables in our hierarchical model. Our objective is to find the posterior distribution p(θ | y), which is usually computationally intractable. To circumvent this difficulty, observe that the marginal probability of the observed data can be decomposed into two terms

    ln p(y) = L(q) + KL(q ∥ p)                                               (10)

where

    L(q) = ∫ q(θ) ln [ p(y, θ) / q(θ) ] dθ                                   (11)

and

    KL(q ∥ p) = − ∫ q(θ) ln [ p(θ | y) / q(θ) ] dθ                           (12)

where q(θ) is any probability density function and KL(q ∥ p) is the Kullback-Leibler divergence between p(θ | y) and q(θ). Since KL(q ∥ p) ≥ 0, it follows that L(q) is a rigorous lower bound on ln p(y). Moreover, notice that the left-hand side of (10) is independent of q(θ). Therefore maximizing L(q) is equivalent to minimizing KL(q ∥ p), and thus the posterior distribution p(θ | y) can be approximated by q(θ) through maximizing L(q). Specifically, we could assume some specific parameterized functional form for q(θ) and then maximize L(q) with respect to the parameters of the distribution. A particular form of q(θ) that has been widely used with great success is the factorized form over the component variables in θ [17]. For our case, the factorized form of q(θ) can be written as

    q(θ) = q_z(z) q_x(x) q_α(α) q_π(π) q_γ(γ)                                (13)

We can compute the posterior distribution approximation by finding q(θ) of the factorized form that maximizes the lower bound L(q). The maximization can be conducted in an alternating fashion for each latent variable, which leads to [17]

    ln q_x(x) = ⟨ln p(y, θ)⟩_{q_α(α) q_γ(γ) q_z(z) q_π(π)} + constant
    ln q_α(α) = ⟨ln p(y, θ)⟩_{q_x(x) q_γ(γ) q_z(z) q_π(π)} + constant
    ln q_γ(γ) = ⟨ln p(y, θ)⟩_{q_x(x) q_α(α) q_z(z) q_π(π)} + constant
    ln q_z(z) = ⟨ln p(y, θ)⟩_{q_x(x) q_α(α) q_γ(γ) q_π(π)} + constant
    ln q_π(π) = ⟨ln p(y, θ)⟩_{q_x(x) q_α(α) q_γ(γ) q_z(z)} + constant        (14)

where ⟨·⟩ denotes an expectation with respect to the distributions specified in the subscript. More details of the Bayesian inference are provided below.

1) Update of q_x(x): We first consider the calculation of q_x(x). Keeping those terms that are dependent on x, we have

    ln q_x(x) = ⟨ln p(y | x, z, γ) + ln p(x | α)⟩_{q_α(α) q_γ(γ) q_z(z)}
              ∝ − ⟨ Σ_{m=1}^{M} γ z_m (y_m − a_m^r x)^2 / 2 ⟩ − (1/2) Σ_{n=1}^{N} ⟨α_n x_n^2⟩
              = − (⟨γ⟩/2) (y − A x)^T D_z (y − A x) − (1/2) x^T D_α x        (15)

where

    D_z ≜ diag(⟨z⟩),   D_α ≜ diag(⟨α⟩)                                       (16)

and ⟨z⟩ and ⟨α⟩ denote the expectations of z and α, respectively. It is easy to show that q_x(x) follows a Gaussian distribution with its mean and covariance matrix given respectively by

    μ_x = ⟨γ⟩ Φ_x A^T D_z y                                                  (17)
    Φ_x = ( ⟨γ⟩ A^T D_z A + D_α )^{−1}                                       (18)

2) Update of q_α(α): Keeping only the terms that depend on α, the variational optimization of q_α(α) yields

    ln q_α(α) = ⟨ln p(x | α) + ln p(α | a, b)⟩_{q_x(x)}
              ∝ Σ_{n=1}^{N} [ (a − 0.5) ln α_n − (0.5 ⟨x_n^2⟩ + b) α_n ]     (19)

The posterior q_α(α) therefore follows a Gamma distribution

    q_α(α) = Π_{n=1}^{N} Gamma(α_n | ã, b̃_n)                                 (20)

in which ã and b̃_n are given respectively as

    ã = a + 0.5,   b̃_n = b + 0.5 ⟨x_n^2⟩                                     (21)

3) Update of q_γ(γ): The variational approximation of q_γ(γ) can be obtained as

    ln q_γ(γ) = ⟨ln p(y | x, z, γ) + ln p(γ | c, d)⟩_{q_x(x) q_z(z)}
              ∝ (c − 1) ln γ − d γ + Σ_{m=1}^{M} [ 0.5 ⟨z_m⟩ ln γ − 0.5 γ ⟨z_m⟩ ⟨(y_m − a_m^r x)^2⟩ ]
              = ( c + 0.5 Σ_{m=1}^{M} ⟨z_m⟩ − 1 ) ln γ − ( d + 0.5 ⟨(y − A x)^T D_z (y − A x)⟩ ) γ    (22)

Clearly, the posterior q_γ(γ) obeys a Gamma distribution

    q_γ(γ) = Gamma(γ | c̃, d̃)                                                 (23)

where c̃ and d̃ are given respectively as

    c̃ = c + 0.5 Σ_{m=1}^{M} ⟨z_m⟩,   d̃ = d + 0.5 ⟨(y − A x)^T D_z (y − A x)⟩_{q_x(x)}    (24)

in which

    ⟨(y − A x)^T D_z (y − A x)⟩_{q_x(x)} = (y − A μ_x)^T D_z (y − A μ_x) + trace(A^T D_z A Φ_x)    (25)

4) Update of q_z(z): The posterior approximation of q_z(z) yields

    ln q_z(z) = ⟨ln p(y | x, z, γ) + ln p(z | π)⟩_{q_x(x) q_γ(γ) q_π(π)}
              ∝ Σ_{m=1}^{M} ⟨ z_m ( 0.5 ln γ − 0.5 γ (y_m − a_m^r x)^2 + ln π_m ) + (1 − z_m) ln(1 − π_m) ⟩    (26)

Clearly, z_m still follows a Bernoulli distribution, with its probability given by

    P(z_m = 1) = C exp( ⟨ln π_m⟩ + 0.5 ⟨ln γ⟩ − 0.5 ⟨γ⟩ ⟨(y_m − a_m^r x)^2⟩ )
    P(z_m = 0) = C exp( ⟨ln(1 − π_m)⟩ )                                      (27)

where C is a normalizing constant such that P(z_m = 1) + P(z_m = 0) = 1, and

    ⟨(y_m − a_m^r x)^2⟩ = (y_m − a_m^r μ_x)^2 + a_m^r Φ_x (a_m^r)^T
    ⟨ln π_m⟩ = Ψ(e + ⟨z_m⟩) − Ψ(e + f + 1)
    ⟨ln(1 − π_m)⟩ = Ψ(1 + f − ⟨z_m⟩) − Ψ(e + f + 1)                          (28)

The last two equalities can also be found in [12], in which Ψ(·) represents the digamma function.

5) Update of q_π(π): The posterior approximation of q_π(π) can be calculated as

    ln q_π(π) = ⟨ln p(z | π) + ln p(π | e, f)⟩_{q_z(z)}
              ∝ Σ_{m=1}^{M} ⟨ z_m ln π_m + (1 − z_m) ln(1 − π_m) + (e − 1) ln π_m + (f − 1) ln(1 − π_m) ⟩
              = Σ_{m=1}^{M} ⟨ (z_m + e − 1) ln π_m + (f − z_m) ln(1 − π_m) ⟩    (29)

It can be easily verified that q_π(π) follows a Beta distribution, i.e.

    q_π(π) = Π_{m} q(π_m) = Π_{m} Beta( ⟨z_m⟩ + e, 1 + f − ⟨z_m⟩ )            (30)

In summary, the variational Bayesian inference involves updates of the approximate posterior distributions of the hidden variables x, α, z, π, and γ in an alternating fashion. Some of the expectations and moments used during the updates are summarized as

    ⟨α_n⟩ = ã / b̃_n,   ⟨γ⟩ = c̃ / d̃,
    ⟨x_n^2⟩ = ⟨x_n⟩^2 + Φ_x(n, n),
    ⟨z_m⟩ = P(z_m = 1) / ( P(z_m = 1) + P(z_m = 0) )

where Φ_x(n, n) denotes the n-th diagonal element of Φ_x.
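For readers who want to see how the updates (15)–(30) fit together, the sketch below implements one possible version of the alternating scheme in NumPy/SciPy for real-valued data. The initialization, the fixed iteration count, and the function name `bp_rbcs` are illustrative assumptions not specified in the paper, and the DOA experiment of Section IV uses a complex-valued dictionary, which would require the complex-Gaussian counterparts of these formulas.

```python
import numpy as np
from scipy.special import digamma, expit

def bp_rbcs(y, A, a=1e-10, b=1e-10, c=1e-10, d=1e-10, e=0.7, f=0.3, n_iter=200):
    """Alternating variational updates for the beta-Bernoulli robust BCS model
    (real-valued sketch of equations (15)-(30))."""
    M, N = A.shape
    alpha_mean = np.ones(N)        # <alpha_n>, illustrative initialization
    gamma_mean = 1.0               # <gamma>
    z_mean = np.ones(M)            # <z_m>, start by treating every measurement as normal

    for _ in range(n_iter):
        # q_x(x): Gaussian with covariance (18) and mean (17)
        Phi = np.linalg.inv(gamma_mean * (A.T * z_mean) @ A + np.diag(alpha_mean))
        mu = gamma_mean * Phi @ A.T @ (z_mean * y)

        # q_alpha(alpha): Gamma with parameters (21)
        x2_mean = mu**2 + np.diag(Phi)                        # <x_n^2>
        alpha_mean = (a + 0.5) / (b + 0.5 * x2_mean)

        # q_gamma(gamma): Gamma with parameters (23)-(24), using the identity (25)
        resid2 = (y - A @ mu)**2 + np.einsum('mn,nk,mk->m', A, Phi, A)  # <(y_m - a_m^r x)^2>
        c_tilde = c + 0.5 * z_mean.sum()
        d_tilde = d + 0.5 * np.sum(z_mean * resid2)
        gamma_mean = c_tilde / d_tilde
        ln_gamma_mean = digamma(c_tilde) - np.log(d_tilde)    # <ln gamma>

        # q_z(z): Bernoulli probabilities (27), using the expectations (28)
        ln_pi = digamma(e + z_mean) - digamma(e + f + 1.0)            # <ln pi_m>
        ln_1m_pi = digamma(1.0 + f - z_mean) - digamma(e + f + 1.0)   # <ln(1 - pi_m)>
        log_p1 = ln_pi + 0.5 * ln_gamma_mean - 0.5 * gamma_mean * resid2
        log_p0 = ln_1m_pi
        z_mean = expit(log_p1 - log_p0)       # normalized P(z_m = 1); values near 0 flag outliers

    return mu, Phi, z_mean, gamma_mean
```

Paired with the synthetic generator sketched in Section I, `mu` approximates the sparse signal and entries of `z_mean` close to zero point at the measurements flagged as outliers.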

IV. SIMULATION RESULTS

We now carry out experiments to illustrate the performance of our proposed method, which is referred to as the beta-Bernoulli prior model-based robust Bayesian compressed sensing method (BP-RBCS). (Codes are available at http://www.junfang-uestc.net/codes/RBCS.rar.) As discussed earlier, another robust compressed sensing approach is compensation-based and can be formulated as a conventional compressed sensing problem (2). For comparison, the sparse Bayesian learning method [13], [18] is employed to solve (2), and this method is referred to as the compensation-based robust Bayesian compressed sensing method (C-RBCS). Also, we consider an "ideal" method which assumes knowledge of the locations of the outliers. The outliers are then removed and the sparse Bayesian learning method is employed to recover the sparse signal. This ideal method is referred to as RBCS-ideal and serves as a benchmark for the performance of BP-RBCS and C-RBCS. Note that both C-RBCS and RBCS-ideal use the sparse Bayesian learning method for sparse signal recovery. The parameters {a, b, c, d} of the sparse Bayesian learning method are set to a = b = c = d = 10^{−10}. Our proposed method involves the parameters {a, b, c, d, e, f}. The first four are also set to a = b = c = d = 10^{−10}. The beta-Bernoulli parameters {e, f} are set to e = 0.7 and f = 1 − e = 0.3, since we expect that the number of outliers is usually small relative to the total number of measurements. Our simulation results suggest that stable recovery is ensured as long as e is set to a value in the range [0.5, 1].

We consider the problem of direction-of-arrival (DOA) estimation, where K narrowband far-field sources impinge on a uniform linear array of M sensors from different directions. The received signal can be expressed as

    y = A x + w

where w denotes i.i.d. Gaussian observation noise with zero mean and variance 1/γ, and A ∈ C^{M×N} is an overcomplete dictionary constructed from evenly-spaced angular points {θ_n}, with the (m, n)-th entry of A given by a_{m,n} = exp{ −2jπ (m − 1) sin(θ_n) D/λ }, in which D denotes the distance between two adjacent sensors, λ represents the wavelength of the source signal, and {θ_n} are evenly-spaced grid points in the interval [−π/2, π/2]. The signal x contains K nonzero entries that are independently drawn from the unit circle. Suppose that T out of M measurements are corrupted by outliers. For those corrupted measurements {y_m}, their values are chosen uniformly from [−10, 10].
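The overcomplete dictionary described above can be formed directly from the steering-vector expression; the sketch below does so in NumPy. The half-wavelength spacing D = λ/2 and the function names are illustrative assumptions (the paper does not state the spacing used), while the K nonzero coefficients are drawn from the unit circle as described.

```python
import numpy as np

def doa_dictionary(M=25, N=64, spacing_over_wavelength=0.5):
    """Steering dictionary with a_{m,n} = exp(-2j*pi*(m-1)*sin(theta_n)*D/lambda)."""
    theta = np.linspace(-np.pi / 2, np.pi / 2, N)   # evenly-spaced grid points
    m = np.arange(M).reshape(-1, 1)                 # sensor index m-1 = 0, ..., M-1
    A = np.exp(-2j * np.pi * m * np.sin(theta) * spacing_over_wavelength)
    return A, theta

def unit_circle_sparse_signal(N=64, K=3, rng=None):
    """Sparse signal with K nonzero entries drawn independently from the unit circle."""
    rng = np.random.default_rng(rng)
    x = np.zeros(N, dtype=complex)
    support = rng.choice(N, size=K, replace=False)
    x[support] = np.exp(2j * np.pi * rng.uniform(size=K))
    return x
```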
We first consider a noiseless case, i.e. 1/γ = 0. Fig. 2 depicts the success rates of the different methods vs. the number of measurements and the number of outliers, respectively, where we set N = 64, K = 3, T = 7 (the number of outliers) in Fig. 2(a), and M = 25, K = 3, N = 64 in Fig. 2(b). The success rate is computed as the ratio of the number of successful trials to the total number of independent runs. A trial is considered successful if the normalized reconstruction error of the sparse signal x is no greater than 10^{−6}. From Fig. 2, we see that our proposed BP-RBCS achieves a substantial performance improvement over C-RBCS. This result corroborates our claim that rejecting outliers is a better strategy than compensating for outliers, particularly when the number of measurements is small, because inaccurate estimation of the compensation vector could have a destructive, instead of a constructive, effect on sparse signal recovery.

[Fig. 2. (a) Success rates of respective algorithms vs. M; (b) Success rates of respective algorithms vs. T.]

Next, we consider a noisy case with 1/γ = 0.01. Fig. 3 plots the normalized mean square errors (NMSEs) of the sparse signal recovered by the different methods vs. the number of measurements and the number of outliers, respectively, where we set N = 64, K = 3, T = 7 in Fig. 3(a), and M = 25, K = 3, N = 64 in Fig. 3(b). This result, again, demonstrates the superiority of our proposed method over C-RBCS.

[Fig. 3. (a) NMSEs of respective algorithms vs. M; (b) NMSEs of respective algorithms vs. T.]

V. CONCLUSIONS

We proposed a new Bayesian method for robust compressed sensing. The rationale behind the proposed method is to identify the outliers and exclude them from sparse signal recovery. To this end, a set of indicator variables was employed to indicate which observations are outliers. A beta-Bernoulli prior was assigned to these indicator variables. A variational Bayesian inference method was developed to find the approximate posterior distributions of the latent variables. Simulation results show that our proposed method achieves a substantial performance improvement over the compensation-based robust compressed sensing method.

REFERENCES

[1] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by basis pursuit," SIAM J. Sci. Comput., vol. 20, no. 1, pp. 33–61, 1998.
[2] E. Candès and T. Tao, "Decoding by linear programming," IEEE Trans. Information Theory, vol. 51, no. 12, pp. 4203–4215, Dec. 2005.
[3] D. L. Donoho, "Compressed sensing," IEEE Trans. Information Theory, vol. 52, pp. 1289–1306, 2006.
[4] E. Candès, "The restricted isometry property and its implications for compressed sensing," Comptes Rendus de l'Académie des Sciences, Série I, Paris, vol. 346, pp. 589–592, 2008.
[5] M. J. Wainwright, "Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting," IEEE Trans. Information Theory, vol. 55, no. 12, pp. 5728–5741, Dec. 2009.
[6] T. Wimalajeewa and P. K. Varshney, "Performance bounds for sparsity pattern recovery with quantized noisy random projections," IEEE Journal on Selected Topics in Signal Processing, vol. 6, no. 1, pp. 43–57, Feb. 2012.
[7] J. N. Laska, M. A. Davenport, and R. G. Baraniuk, "Exact signal recovery from sparsely corrupted measurements through the pursuit of justice," in Proc. 43rd Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, Nov. 2009.
[8] K. Mitra, A. Veeraraghavan, and R. Chellappa, "Analysis of sparse regularization based robust regression approaches," IEEE Trans. Signal Processing, vol. 61, no. 5, pp. 1249–1257, Mar. 2013.
[9] R. E. Carrillo, K. E. Barner, and T. C. Aysal, "Robust sampling and reconstruction methods for sparse signals in the presence of impulsive noise," IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 2, pp. 392–408, Apr. 2010.
[10] C. Studer, P. Kuppinger, G. Pope, and H. Bölcskei, "Recovery of sparsely corrupted signals," IEEE Trans. Information Theory, vol. 58, no. 5, pp. 3115–3130, May 2012.
[11] L. He and L. Carin, "Exploiting structure in wavelet-based Bayesian compressive sensing," IEEE Trans. Signal Processing, vol. 57, no. 9, pp. 3488–3497, Sept. 2009.
[12] J. Paisley and L. Carin, "Nonparametric factor analysis with beta process priors," in Proc. 26th Annual International Conference on Machine Learning, Montreal, Canada, June 2009.
[13] S. Ji, Y. Xue, and L. Carin, "Bayesian compressive sensing," IEEE Trans. Signal Processing, vol. 56, no. 6, pp. 2346–2356, June 2008.
[14] Z. Zhang and B. D. Rao, "Extension of SBL algorithms for the recovery of block sparse signals with intra-block correlation," IEEE Trans. Signal Processing, vol. 61, no. 8, pp. 2009–2015, Apr. 2013.
[15] Z. Yang, L. Xie, and C. Zhang, "Off-grid direction of arrival estimation using sparse Bayesian inference," IEEE Trans. Signal Processing, vol. 61, no. 1, pp. 38–42, Jan. 2013.
[16] J. Fang, Y. Shen, H. Li, and P. Wang, "Pattern-coupled sparse Bayesian learning for recovery of block-sparse signals," IEEE Trans. Signal Processing, vol. 63, no. 2, pp. 360–372, Jan. 2015.
[17] D. G. Tzikas, A. C. Likas, and N. P. Galatsanos, "The variational approximation for Bayesian inference," IEEE Signal Processing Magazine, vol. 25, no. 6, pp. 131–146, Nov. 2008.
[18] M. Tipping, "Sparse Bayesian learning and the relevance vector machine," Journal of Machine Learning Research, vol. 1, pp. 211–244, 2001.
