Compressed Sensing and Bayesian Experimental Design

Matthias W. Seeger and Hannes Nickisch
Max Planck Institute for Biological Cybernetics, Tübingen, Germany

July 8, 2008


Problem Statement

Measuring Natural Images

Reconstruct natural images u ∈ Rⁿ from m ≪ n noisy linear measurements y = Xu + ε ∈ Rᵐ.
Applications: digital photography, magnetic resonance imaging.

How to choose X?
Compressed sensing theory says: random X.
But image sparsity is highly structured, so random X should not do well (Weiss et al., Snowbird 2007).

Our Contributions
1. A large empirical study on natural images
2. A Bayesian method for optimizing X: learning compressed sensing
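As a minimal sketch of the measurement model above (not the talk's optimized design): numpy code assuming a random Gaussian X, the very default the talk argues can be improved upon; sizes and the noise level are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    n, m, sigma2 = 4096, 300, 0.005   # 64x64 image, m << n, noise level as in the talk
    u = rng.standard_normal(n)        # stand-in for a vectorized natural image
    X = rng.standard_normal((m, n)) / np.sqrt(n)          # random Gaussian design
    y = X @ u + np.sqrt(sigma2) * rng.standard_normal(m)  # y = Xu + eps in R^m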



Our Approach

Compressed Sensing as Bayesian Design

Low-level statistics of natural images: images are sparse → use a sparsity prior distribution (see the figure below).

Sequential measurement optimization: choose the next filter x∗ᵀ along the direction of largest uncertainty. This requires the Bayesian posterior P(u|y).

Approximate inference (Expectation Propagation) drives the optimization of X; a sketch follows the figure.

[Figure: empirical log-histogram of natural image coefficients compared against a sparse and a normal density fit; y-axis log scale from 10⁻⁴ to 10¹, x-axis from −1 to 1. Legend: Histogram, Sparse, Normal.]

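As a hedged illustration of the sequential loop: the talk's method uses a sparsity prior handled by Expectation Propagation, whereas in this sketch a Gaussian prior stands in so the posterior is available in closed form, and the "largest uncertainty" direction is the top eigenvector of the posterior covariance. All parameters are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    n, sigma2 = 64, 0.005
    u = rng.standard_normal(n)          # unknown signal (stand-in for an image)

    mu, Sigma = np.zeros(n), np.eye(n)  # Gaussian prior N(0, I) on u
    for t in range(16):
        # next filter x*: direction of largest posterior uncertainty
        eigvals, eigvecs = np.linalg.eigh(Sigma)
        x = eigvecs[:, -1]
        y = x @ u + np.sqrt(sigma2) * rng.standard_normal()
        # rank-one Gaussian posterior update for the new measurement
        Sx = Sigma @ x
        gain = Sx / (x @ Sx + sigma2)
        mu = mu + gain * (y - x @ mu)
        Sigma = Sigma - np.outer(gain, Sx)
    print("mean squared error of posterior mean:", np.mean((u - mu) ** 2))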

Results

Sequential Algorithm Illustration

[Figure]


Results

Comparison of Different Methods

[Figure: reconstruction error vs. number of measurements (10 to 1024) for SBL (opt), LASSO (rand), L2 (heur), and EP (opt); 75 images, 64 × 64 = 4k pixels, σ² = 0.005.]

Results

A Very Simple Baseline

L2 (wavelet heuristic). I: fixed index set of wavelet coefficients, ordered coarse → fine.

vᵢ ← yᵢ for i ∈ I, vᵢ ← 0 for i ∉ I, û ← Wᵀv

Natural images: how you measure is more important than how you reconstruct. A sketch of this baseline follows.
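A minimal sketch of the baseline, assuming an orthonormal wavelet transform W; a 1-D Haar transform is built here for self-containedness (the talk uses 2-D image wavelets), and all parameter values are illustrative.

    import numpy as np

    def haar_matrix(n):
        # orthonormal 1-D Haar transform, rows ordered coarse -> fine (n a power of 2)
        W = np.array([[1.0]])
        while W.shape[0] < n:
            top = np.kron(W, [1.0, 1.0])                    # scaling (coarse) rows
            bot = np.kron(np.eye(W.shape[0]), [1.0, -1.0])  # detail (fine) rows
            W = np.vstack([top, bot]) / np.sqrt(2.0)
        return W

    rng = np.random.default_rng(0)
    n, m, sigma2 = 64, 16, 0.005
    u = rng.standard_normal(n)   # stand-in for (a row of) an image
    W = haar_matrix(n)

    I = np.arange(m)             # fixed index set: first m coefficients, coarse -> fine
    y = W[I] @ u + np.sqrt(sigma2) * rng.standard_normal(m)

    v = np.zeros(n)              # v_i <- y_i for i in I, 0 otherwise
    v[I] = y
    u_hat = W.T @ v              # reconstruct u_hat = W^T v
    print("reconstruction error:", np.mean((u - u_hat) ** 2))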



Results

The Same for Larger Images

[Figures: reconstruction error vs. number of measurements for LASSO (rand), L2 (heur), and LASSO (heur). Left: 75 images, 64 × 64 = 4k, σ² = 0.005, 10 to 1024 measurements. Right: 75 images, 256 × 256 = 65k, σ² = 0.001, up to 15k measurements.]

Compressed Sensing by Minimax Theory

y = Xu + ε ∈ Rᵐ, u ∈ Rⁿ sparse. What if u is an image?

[Figure: diagram contrasting the set of sparse signals with the much smaller set of natural images]

Upper bounds (Candès, Romberg, Tao, Th. 1.3; Wainwright(a), Th. 1): if m > s log n and P(X) . . . , then for all s-sparse signals u, Lasso reconstruction is exact with probability ≈ 1 (over X).

Lower bounds (this is "tight" because . . . ): "No recovery can be successful for all [s-sparse] signals using significantly fewer observations." (CRT, Sect. 1.4) "We think of the underlying true vector [u] with its support [T] randomly chosen, . . . " (Wainwright(b), Sect. 1.2)


Optimality of Lasso and simple P(X): minimax optimality. Signals sparse, all other things random. A small recovery demo follows.

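A small demo of the upper-bound regime, as a hedged sketch: random Gaussian X, m on the order of s log n, and Lasso recovery via scikit-learn. The constant in m, the regularization weight, and the support threshold are ad-hoc illustrative choices.

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(1)
    n, s = 512, 10
    m = int(4 * s * np.log(n))    # m > s log n, ad-hoc constant

    u = np.zeros(n)               # s-sparse ground truth
    support = rng.choice(n, size=s, replace=False)
    u[support] = rng.standard_normal(s)

    X = rng.standard_normal((m, n)) / np.sqrt(m)   # random Gaussian design
    y = X @ u + 0.01 * rng.standard_normal(m)      # nearly noiseless measurements

    u_hat = Lasso(alpha=1e-3, max_iter=10000).fit(X, y).coef_
    recovered = set(np.flatnonzero(np.abs(u_hat) > 0.05))
    print("support recovered:", recovered == set(support))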

Where is the Energy?

[Figure]


The World according to Minimax

[Figure]


The World according to Minimax

Nyquist.1 → Nyquist.2

Nyquist.1: signal band-limited, otherwise random: you cannot do better than X1.
Nyquist.2: signal band-limited, sparse, otherwise random: but now you really cannot do better than X2 < X1.

Natural images (real-world signals)
✓ are band-limited
✓ are approximately sparse
and have much exploitable structure beyond that!

You can do better than X2 on images, and you can show that this works by sound empirical evaluation.



Conclusions

L1/Lasso/Dantzig/. . . with a minimally coherent P(X) meets the minimax lower bounds for sparse signals (all other things random).

Natural image statistics: sparse, but much else is non-random ⇒ one can robustly choose much better filters X.

Our method uses the same prior knowledge as L1/Lasso; it does more with the posterior than just maximizing it.

You should optimize X for your domain of interest, and you can do that with little firm prior knowledge. This needs experimental design, not just uniform random sampling.

Bayesian experimental design can be scaled up with novel variational inference algorithms (Seeger et al., submitted).
