Expectation-Maximization Gaussian-Mixture Approximate Message Passing
Philip Schniter and Jeremy Vila (OSU)
Supported in part by NSF-I/UCRC grant IIP-0968910, by NSF grant CCF-1018368, and by DARPA/ONR grant N66001-10-1-4090.
CISS @ Princeton – 3/23/12
Compressive Sensing

Goal: recover a signal x ∈ R^N from noisy sub-Nyquist measurements

    y = Ax + w,    y, w ∈ R^M,    M < N,

where x is K-sparse with K < M, or compressible.

With sufficient sparsity and appropriate conditions on the mixing matrix A (e.g., RIP, nullspace), accurate recovery of x is possible using polynomial-complexity algorithms.

A common approach (LASSO) is to solve the convex problem

    min_x ‖y − Ax‖₂² + α‖x‖₁,

where α can be tuned in accordance with the sparsity and SNR.
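As a concrete illustration (not part of the original slides), here is a minimal MATLAB sketch of solving a LASSO-type problem by iterative soft thresholding (ISTA); the weight alpha, step size mu, iteration count, and problem sizes below are assumptions, hand-picked for the example.

% Minimal ISTA sketch for the LASSO problem (illustrative only).
% Objective: min_x 0.5*||y - A*x||_2^2 + alpha*||x||_1 (same family as above, up to scaling).
rng(0);
N = 1000; M = 400; K = 100;
A = randn(M,N)/sqrt(M);                                 % i.i.d. Gaussian mixing matrix
x = zeros(N,1); x(randperm(N,K)) = sign(randn(K,1));    % K-sparse +/-1 signal
y = A*x + 0.01*randn(M,1);                              % noisy sub-Nyquist measurements

alpha = 0.05;                                 % tuned per sparsity/SNR (hand-picked here)
mu    = 1/norm(A)^2;                          % step size <= 1/||A||_2^2
xhat  = zeros(N,1);
for t = 1:300
    r    = xhat + mu*(A'*(y - A*xhat));        % gradient step on the quadratic term
    xhat = sign(r).*max(abs(r) - mu*alpha, 0); % soft threshold = prox of alpha*||.||_1
end
fprintf('NMSE = %.2f dB\n', 20*log10(norm(xhat - x)/norm(x)));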
Phase Transition Curves (PTC)

The PTC identifies the ratios (M/N, K/M) for which perfect noiseless recovery of a K-sparse x occurs (as M, N, K → ∞ under i.i.d. Gaussian A).

Suppose {x_n} are drawn i.i.d. with

    p_X(x_n) = λ f(x_n) + (1−λ) δ(x_n),

with known λ ≜ K/N.

LASSO's PTC is invariant to f(·); thus, LASSO is robust in the face of unknown f(·). MMSE reconstruction's PTC is far better than LASSO's, but requires knowing f(·).
[Figure: empirical AMP phase transition for MMSE reconstruction vs. the theoretical LASSO PTC; K/M (sparsity) vs. M/N (undersampling).]
Wu and Verdú, "Optimal phase transitions in compressed sensing," arXiv, Nov. 2011.
Motivations

For practical compressive sensing. . .

Want minimal MSE:
– distributions are unknown ⇒ can't formulate the MMSE estimator
– but there is hope: various algorithms are seen to outperform LASSO for specific signal classes
– really, we want a universal algorithm: good for all signal classes

Want fast runtime:
– especially for large signal length N (i.e., scalable)

Want to avoid algorithmic tuning parameters:
– who has the patience to tweak yet another CS algorithm!
Proposed Approach: "EM-GM-GAMP"

Model the signal and noise using flexible distributions:
– i.i.d. Bernoulli Gaussian-mixture (GM) signal

    p(x_n) = λ Σ_{l=1}^{L} ω_l N(x_n; θ_l, φ_l) + (1−λ) δ(x_n),  ∀n

– i.i.d. Gaussian noise with variance ψ

Learn the prior parameters q ≜ {λ, {ω_l, θ_l, φ_l}_{l=1}^{L}, ψ}:
– treat them as deterministic and use expectation-maximization (EM)

Exploit the learned priors in near-MMSE signal reconstruction:
– use generalized approximate message passing (GAMP)
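To make the signal model concrete, here is a small MATLAB sketch (my own, not from the talk) that draws x_n i.i.d. from the Bernoulli Gaussian-mixture prior above and forms y = Ax + w; the particular values of λ, ω, θ, φ, ψ below are illustrative assumptions.

% Draw x_n i.i.d. from the Bernoulli Gaussian-mixture prior and form y = A*x + w.
% All parameter values below are illustrative assumptions, not the talk's settings.
rng(1);
N = 1000; M = 500; L = 3;
lambda = 0.1;                        % activity rate Pr{x_n ~= 0}
omega  = [0.5; 0.3; 0.2];            % GM weights (sum to 1)
theta  = [-1;  0;   2 ];             % GM means
phi    = [0.2; 1.0; 0.5];            % GM variances
psi    = 1e-3;                       % AWGN variance

active = rand(N,1) < lambda;                 % Bernoulli support indicators
cdf    = cumsum(omega)';                     % 1xL cumulative weights
comp   = sum(rand(N,1) > cdf, 2) + 1;        % mixture component of each coefficient
x      = active .* (theta(comp) + sqrt(phi(comp)).*randn(N,1));

A = randn(M,N)/sqrt(M);              % i.i.d. Gaussian mixing matrix
w = sqrt(psi)*randn(M,1);            % i.i.d. Gaussian noise with variance psi
y = A*x + w;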
Approximate Message Passing (AMP)

AMP methods infer x from y = Ax + w using loopy belief propagation with carefully constructed approximations.
– The original AMP [Donoho, Maleki, Montanari '09] solves the LASSO problem (i.e., Laplacian MAP) assuming an i.i.d. matrix A.
– The Bayesian AMP [Donoho, Maleki, Montanari '10] framework tackles MMSE inference under generic signal priors.
– The generalized AMP [Rangan '10] framework tackles MAP or MMSE inference under generic signal & noise priors and generic A.

AMP is a form of iterative thresholding, requiring only two applications of A per iteration and ≈ 25 iterations. Very fast!

Rigorous large-system analyses (under i.i.d. Gaussian A) have established that (G)AMP follows a state-evolution trajectory with optimal properties [Bayati, Montanari '10], [Rangan '10].
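A minimal MATLAB sketch (my own, not the talk's GAMP algorithm) of the soft-thresholding form of AMP, showing the two applications of A per iteration and the Onsager correction term; the threshold schedule and iteration count are simplifying assumptions.

% Minimal soft-thresholding AMP sketch (illustrative; the threshold schedule and
% iteration count are assumptions, not the talk's GAMP algorithm).
% Note: each iteration applies A twice (once as A*xhat, once as A'*z).
rng(2);
N = 1000; M = 500; K = 100;
A = randn(M,N)/sqrt(M);
x = zeros(N,1); x(randperm(N,K)) = randn(K,1);
y = A*x;                                      % noiseless measurements for simplicity

soft = @(u,t) sign(u).*max(abs(u)-t, 0);      % scalar soft-thresholding denoiser
xhat = zeros(N,1);
z    = y;                                     % Onsager-corrected residual
for it = 1:25
    tau  = norm(z)/sqrt(M);                   % threshold from residual energy (heuristic)
    xhat = soft(xhat + A'*z, tau);            % denoise the pseudo-data
    z    = y - A*xhat + (z/M)*nnz(xhat);      % residual + Onsager correction term
end
fprintf('NMSE = %.2f dB\n', 20*log10(norm(xhat - x)/norm(x)));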
AMP Heuristics (Sum-Product)

[Factor graph: prior factors p_X(x_1), …, p_X(x_N) attach to variable nodes x_1, …, x_N, which connect to measurement factors N(y_i; [Ax]_i, ψ) for i = 1, …, M; messages p_{i→j}(x_j) and p_{i←j}(x_j) flow along the edges.]

Message from the y_i node to the x_j node:

    p_{i→j}(x_j) ∝ ∫_{ {x_r}_{r≠j} } N(y_i; Σ_r a_{ir} x_r, ψ) Π_{r≠j} p_{i←r}(x_r)
                 ≈ ∫_{z_i} N(y_i; z_i, ψ) N(z_i; ẑ_i(x_j), ν_i^z(x_j)),

where z_i ≜ Σ_r a_{ir} x_r is approximately Gaussian via the CLT.

To compute ẑ_i(x_j) and ν_i^z(x_j), the means and variances of {p_{i←r}}_{r≠j} suffice, thus Gaussian message passing!

Remaining problem: we have 2MN messages to compute (too many!).

Exploiting the similarity among the messages {p_{i←j}}_{i=1}^{M}, AMP employs a Taylor-series approximation of their difference whose error vanishes as M → ∞ for dense A (and similarly for {p_{i→j}}_{j=1}^{N} as N → ∞). Finally, we need to compute only O(M + N) messages!
Expectation-Maximization

We use expectation-maximization (EM) to learn the signal and noise prior parameters q ≜ {λ, ω, θ, φ, ψ}.

The missing data is chosen to be the signal and noise vectors (x, w). The updates are performed coordinate-wise. For example, updating λ at the ith EM iteration involves

    (E-step)  Q(λ | q^i) = Σ_{n=1}^{N} E[ ln p(x_n; λ, ω^i, θ^i, φ^i) | y; q^i ]
    (M-step)  λ^{i+1} = arg max_{λ∈(0,1)} Q(λ | q^i).

The updates of (ω, θ, φ, ψ) are similar (details in the paper). All quantities needed for the EM updates are provided by GAMP!
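For intuition only (my sketch, not the authors' code): when the E-step expectation is formed from GAMP's marginal posteriors, the λ update reduces to averaging per-coefficient posterior activity probabilities. Assuming GAMP supplies pi_n = Pr{x_n ≠ 0 | y; q^i}, the update might look like:

% EM update of the sparsity rate lambda from GAMP posteriors (illustrative sketch).
% pi_n = Pr{x_n ~= 0 | y; q^i} is assumed to be returned by the GAMP step.
function lambda_new = em_update_lambda(pi_n)
    % Maximizing Q(lambda|q^i) over lambda in (0,1) yields the mean posterior activity.
    lambda_new = mean(pi_n);
    % Keep the estimate strictly inside (0,1) for numerical safety.
    lambda_new = min(max(lambda_new, 1e-6), 1 - 1e-6);
end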
Parameter Initialization

Initialization matters; EM can get stuck in a local maximum. We suggest. . .
– initializing the sparsity λ according to the theoretical LASSO PTC,
– initializing the noise and active-signal variances using the known energies ‖y‖₂², ‖A‖_F² and a user-supplied SNR⁰ (which defaults to 20 dB):

    ψ⁰ = ‖y‖₂² / ((SNR⁰ + 1) M),    (σ²)⁰ = (‖y‖₂² − M ψ⁰) / (λ⁰ ‖A‖_F²),

– fixing L (e.g., L = 3) and initializing the GM parameters (ω, θ, φ) as the best fit to a uniform distribution with variance σ².

We have also developed
– a "splitting" mode that adds one GM component at a time,
– a "heavy-tailed" mode that forces zero-mean GM components.
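A short MATLAB sketch (my own, under the defaults stated above) of these variance initializations; lambda0 is assumed to have been set from the theoretical LASSO PTC beforehand.

% Variance initialization from known energies (sketch of the rule above).
% SNR0 defaults to 20 dB; lambda0 is assumed set from the theoretical LASSO PTC.
function [psi0, sigma2_0] = init_variances(y, A, lambda0, SNR0_dB)
    if nargin < 4, SNR0_dB = 20; end
    SNR0     = 10^(SNR0_dB/10);
    M        = numel(y);
    psi0     = norm(y)^2 / ((SNR0 + 1)*M);                        % noise variance psi^0
    sigma2_0 = (norm(y)^2 - M*psi0) / (lambda0*norm(A,'fro')^2);  % active-signal variance (sigma^2)^0
end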
Examples of Learned Signal pdfs

The following shows the Gaussian-mixture pdf learned by EM-GM-GAMP when the true active-signal pdf was uniform (left) and ±1 (right):

[Figure: true and estimated signal pdfs p(x_n) vs. x_n, for a uniform active-signal pdf (left) and a ±1 active-signal pdf (right).]
Empirical PTCs: Bernoulli-Rademacher (±1) signals

We now evaluate noiseless reconstruction performance via phase-transition curves constructed using length-N = 1000 signals, i.i.d. Gaussian A, and 100 realizations.

We see EM-GM-GAMP performing significantly better than LASSO for this signal class. We also see EM-GM-GAMP performing nearly as well as GM-GAMP under genie-aided parameter settings.

[Figure: empirical noiseless Bernoulli-Rademacher PTCs, K/M vs. M/N, for genie GM-GAMP, EM-GM-GAMP, EM-BG-GAMP, Laplacian-AMP, and the theoretical LASSO curve.]
PTCs for Bernoulli-Gaussian and Bernoulli signals

[Figure: empirical noiseless Bernoulli-Gaussian PTCs (left) and Bernoulli PTCs (right), K/M vs. M/N, for genie GM-GAMP, EM-GM-GAMP, EM-BG-GAMP, Laplacian-AMP, and the theoretical LASSO curve.]

For these signals, we see EM-GM-GAMP performing. . .
– significantly better than LASSO,
– nearly as well as genie-aided GM-GAMP,
– on par with our previous "EM-BG-GAMP" algorithm.
Noisy Recovery: Bernoulli-Rademacher (±1) signals

We now compare the normalized MSE (NMSE) of EM-GM-GAMP to several state-of-the-art algorithms (SL0, T-MSBL, BCS, LASSO via SPGL1) for the task of noisy signal recovery under i.i.d. Gaussian A. For this, we fixed N = 1000, K = 100, SNR = 25 dB and varied M.

For these Bernoulli-Rademacher signals, we see EM-GM-GAMP outperforming the other algorithms for all undersampling ratios M/N. Notice that our previous EM-BG-GAMP algorithm cannot accurately model the Bernoulli-Rademacher prior.

[Figure: noisy Bernoulli-Rademacher recovery NMSE [dB] vs. M/N for SL0, BCS, genie Lasso, EM-BG-GAMP, T-MSBL, and EM-GM-GAMP.]
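For reference, the NMSE reported in these plots is presumably the usual normalized squared error; in MATLAB one might compute it as:

% Normalized MSE in dB, assuming the usual definition NMSE = ||xhat - x||^2 / ||x||^2.
nmse_dB = @(xhat, x) 10*log10(norm(xhat - x)^2 / norm(x)^2);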
Noisy Recovery: Bernoulli-Gaussian and Bernoulli signals

[Figure: noisy Bernoulli-Gaussian recovery NMSE (left) and noisy Bernoulli recovery NMSE (right), NMSE [dB] vs. M/N, for SL0, genie Lasso, BCS, T-MSBL, EM-BG-GAMP, and EM-GM-GAMP.]

For Bernoulli-Gaussian and Bernoulli signals, EM-GM-GAMP again dominates the other algorithms. We attribute the excellent performance of EM-GM-GAMP to its ability to learn and exploit the true signal prior.
Noisy Recovery of Heavy-tailed (Student's-t) signals

[Figure: noisy Student's-t recovery NMSE [dB] vs. M/N for BCS, T-MSBL, EM-BG-GAMP, SL0, genie Lasso, and EM-GM-GAMP.]

Algorithm rankings on heavy-tailed signals are often the reverse of those for sparse signals! In its "heavy-tailed" mode, EM-GM-GAMP performs on par with the best algorithms for all M/N.
Runtime versus signal length N

We fix M/N = 0.5, K/N = 0.1, SNR = 25 dB, and average over 50 trials.

[Figure: noisy Bernoulli-Rademacher recovery time (left) and recovery NMSE (right) vs. N, for SPGL1, SL0, BCS, T-MSBL, EM-BG-GAMP, EM-GM-GAMP, and the fast-operator variants SPGL1 fft, EM-BG-GAMP fft, and EM-GM-GAMP fft.]

For all N > 1000, EM-GM-GAMP has the fastest runtime! EM-GM-GAMP can also leverage fast operators for A (e.g., FFT); a sketch of such an operator follows.
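For example, here is a minimal MATLAB sketch (my own, not the EMGMAMP package's interface) of representing A implicitly by FFT-based function handles, as one might do with a randomly row-sampled DFT matrix:

% Represent A implicitly by fast function handles (illustrative sketch, not the
% EMGMAMP package interface). Here A keeps M random rows of the unitary N-point DFT.
N = 2^16; M = N/4;
rows = sort(randperm(N, M));                 % retained measurement indices
S    = sparse(1:M, rows, 1, M, N);           % row-selection operator

Afun  = @(x) S*(fft(x)/sqrt(N));             % y = A*x,  cost O(N log N)
Ahfun = @(z) sqrt(N)*ifft(S'*z);             % x = A'*z, cost O(N log N)

% Quick check of the adjoint pair on random vectors:
x = randn(N,1); z = randn(M,1);
fprintf('adjoint error = %.2e\n', abs(z'*Afun(x) - Ahfun(z)'*x));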
Extension to structured sparsity (Justin Ziniel)

Recovery of an audio signal sparsified via DCT Ψ and compressively sampled via i.i.d. Gaussian Φ (so that A = ΦΨ). Exploit persistence of support across time via discrete Markov chains and turbo AMP.

[Figure: factor graph coupling the per-timestep AMP blocks through Markov chains on the support variables s_n^(t) and amplitude variables θ_n^(t) across timesteps t = 0, …, T.]

Recovery NMSE and runtime:

algorithm          EM-GM-AMP              turbo EM-GM-AMP
M/N = 1/5          -9.04 dB  (8.77 s)     -12.34 dB (9.37 s)
M/N = 1/3          -12.72 dB (10.26 s)    -16.07 dB (11.05 s)
M/N = 1/2          -17.17 dB (11.92 s)    -20.94 dB (12.96 s)
Conclusions

We proposed a sparse reconstruction algorithm that uses EM to learn GM-signal and AWGN-noise priors, and that uses GAMP to exploit these priors for near-MMSE signal recovery.

Advantages of EM-GM-GAMP:
– State-of-the-art NMSE performance for all tested signal types.
– State-of-the-art complexity for signals of length N ≳ 1000.
– Minimal tuning: choose between "sparse" or "heavy-tailed" modes.

Ongoing related work:
– Theoretical performance guarantees for EM-GM-GAMP.
– Extension to non-Gaussian noise.
– Universal learning/exploitation of structured sparsity.
– Extensions to matrix completion, dictionary learning, and robust PCA.
Matlab code is available at http://ece.osu.edu/~vilaj/EMGMAMP/EMGMAMP.html
Thanks!