Due Monday, February 21

In this assignment you will investigate the power spectrum and higher-order statistics of natural images, and you will examine some models for how neurons in the visual system may have been adapted to this structure.

First load an image and convert it to floating point:

    im=double(imread('einstein.jpg'));

It will be easiest for this exercise if the image is square, so if it's not already square you should truncate the larger dimension. Now, to compute the power spectrum of the image you use the Fourier transform, which converts a function in the space domain to a function in the frequency domain. You can compute the Fourier transform of an image in Matlab using

    imf=fftshift(fft2(im));

The function fftshift rearranges the output of fft2 so that the zero-frequency component is at the center of the array. The power spectrum is the squared magnitude of the Fourier transform:

    impf=abs(imf).^2;

To display the power spectrum you will need to define some frequency coordinates. If the image is of size N, then the frequencies run from -N/2 to N/2-1 (assuming N is even):

    f=-N/2:N/2-1;

Then display the log power spectrum using imagesc:

    imagesc(f,f,log10(impf)), axis xy

You will note that the power falls off with frequency as you move away from the center. Ignore the vertical and horizontal streaks for now - they are artifacts due to the boundaries of the image. Now, to get a better idea of how the power falls off, we can do a rotational average of the power spectrum:

    Pf=rotavg(impf);

Now define a frequency coordinate from 0 to N/2:

    f1=0:N/2;

and plot the rotationally averaged spectrum on a log-log plot:

    loglog(f1,Pf)

You should see that the power tends to fall as 1/f^2.
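If you want an independent check of the 1/f^2 fall-off, here is a sketch of the same pipeline in Python/NumPy. Two assumptions are made: a synthetic image with a 1/f amplitude spectrum stands in for einstein.jpg, and a simple integer-radius binning stands in for the course-provided rotavg function.

```python
import numpy as np

N = 128
f = np.arange(-N // 2, N // 2)
fx, fy = np.meshgrid(f, f)
rho = np.sqrt(fx ** 2 + fy ** 2)
rho_safe = np.where(rho == 0, 1.0, rho)   # avoid divide-by-zero at DC

# Synthetic image with amplitude spectrum ~ 1/f (so power ~ 1/f^2)
rng = np.random.default_rng(0)
phase = np.exp(2j * np.pi * rng.random((N, N)))
im = np.real(np.fft.ifft2(np.fft.ifftshift(phase / rho_safe)))

# Same steps as the Matlab code: fftshift(fft2(im)), then squared magnitude
imf = np.fft.fftshift(np.fft.fft2(im))
impf = np.abs(imf) ** 2

# Stand-in for rotavg(): average power in integer-radius bins 0..N/2
r = np.round(rho).astype(int)
Pf = np.array([impf[r == k].mean() for k in range(N // 2 + 1)])

# Slope of log P vs log f should come out close to -2
k = np.arange(1, N // 2)
slope = np.polyfit(np.log(k), np.log(Pf[k]), 1)[0]
```

The fitted log-log slope lands near -2, which is what you should see for a good natural image as well.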

Now let's design a whitening filter to flatten this spectrum. First lay the frequency coordinates out on a grid:

    [fx fy] = meshgrid(f);

Then convert to polar coordinates:

    [theta rho]=cart2pol(fx,fy);

The filter is then

    filtf = rho;

Plot the filter using

    imagesc(f,f,filtf), axis xy

and you will see that it is a function that simply rises with frequency, independent of orientation. But remember that due to noise, we don't want to whiten all the way to the highest spatial frequencies. Also, because we are working in rectangular coordinates, there is a larger range of spatial frequencies along the diagonals, so we'd like to constrict the whitening function to lie within a constant radius just short of the Nyquist frequency. To do this, we multiply by a circular Gaussian as follows:

    filtf = rho.*exp(-0.5*(rho/(0.7*N/2)).^2);

Now plot out the filter again and you will see that it rises linearly to a point and then begins to fall again. Take a rotational average and plot as a function of one dimension and you will see more clearly how the filter is shaped. To see what the filter looks like in the space domain, you need to calculate the inverse Fourier transform:

    filt = fftshift(real(ifft2(fftshift(filtf))));

You can view the filter as follows:

    imagesc(filt,[-1 1]*max(filt(:)))
    axis image, colorbar

You will need to zoom into the center with the magnifying glass to see the filter. Note that it is basically a center-surround type filter. The inhibitory surround may be a bit hard to see because it consists of small negative numbers, but it's basically analogous to what is found in the retina and LGN. To get the full scoop on this story you should read the following paper:
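Here is a NumPy sketch of the same filter, if you want to verify its shape numerically (the image size N = 128 is an assumption; everything else mirrors the Matlab lines above):

```python
import numpy as np

N = 128
f = np.arange(-N // 2, N // 2)
fx, fy = np.meshgrid(f, f)
rho = np.sqrt(fx ** 2 + fy ** 2)

# Ramp times circular Gaussian, as in the Matlab code
s = 0.7 * N / 2
filtf = rho * np.exp(-0.5 * (rho / s) ** 2)

# The radial profile r*exp(-r^2/(2 s^2)) rises to a peak at r = s, then falls
radii = np.arange(N // 2 + 1)
profile = radii * np.exp(-0.5 * (radii / s) ** 2)
peak = int(np.argmax(profile))

# Space-domain filter: a positive center with a negative surround, and
# zero total sum because the filter passes no DC (filtf = 0 at rho = 0)
filt = np.fft.fftshift(np.real(np.fft.ifft2(np.fft.ifftshift(filtf))))
c = N // 2
center = filt[c, c]
```

Because the DC gain is zero, the surround must integrate to exactly minus the center mass - that is why the center-surround shape falls out of whitening.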

Atick JJ (1992) Could information theory provide an ecological theory of sensory processing? Network, 3, 213-251.

Now let's see what happens when we apply our whitening filter to the image. Remember that convolution in the space domain is multiplication in the frequency domain, so we simply multiply the spectrum of the image by the spectrum of the filter and take the inverse fft as follows:

    imwf = filtf.*imf;
    imw = ifft2(fftshift(imwf));

View the image:

    imagesc(imw), axis image
    colormap gray

And look at its spectrum:

    Pwf=rotavg(abs(imwf).^2);
    loglog(f1,Pwf)

Note that the spectrum is now pretty much flat, with simply a lowpass roll-off at the high end of the spectrum.
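To convince yourself that the spectrum really flattens, here is the effect in NumPy on a synthetic image whose amplitude spectrum is exactly 1/f (the synthetic input and the radius-bin stand-in for rotavg are assumptions, as before):

```python
import numpy as np

N = 128
f = np.arange(-N // 2, N // 2)
fx, fy = np.meshgrid(f, f)
rho = np.sqrt(fx ** 2 + fy ** 2)
rho_safe = np.where(rho == 0, 1.0, rho)

# Synthetic spectrum: amplitude exactly 1/f, random phases
rng = np.random.default_rng(1)
imf = np.exp(2j * np.pi * rng.random((N, N))) / rho_safe

# Whiten: multiply by the ramp-times-Gaussian filter
filtf = rho * np.exp(-0.5 * (rho / (0.7 * N / 2)) ** 2)
imwf = filtf * imf

# Rotationally averaged power before vs after whitening
r = np.round(rho).astype(int)
Pf = np.array([(np.abs(imf) ** 2)[r == k].mean() for k in range(N // 2 + 1)])
Pwf = np.array([(np.abs(imwf) ** 2)[r == k].mean() for k in range(N // 2 + 1)])

# Mid-band log-log slopes: ~ -2 before, near 0 after
band = np.arange(2, 20)
slope_before = np.polyfit(np.log(band), np.log(Pf[band]), 1)[0]
slope_after = np.polyfit(np.log(band), np.log(Pwf[band]), 1)[0]
```

The high end still rolls off, of course - that is the Gaussian taper we built into the filter to avoid amplifying noise near the Nyquist frequency.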

There's one thing left to do: contrast normalization. It has been shown that neurons in the retina and LGN already perform some form of local contrast normalization. One simple model of this process is to divide the output of a neuron by some measure of the total activity in its neighborhood - for example, the local standard deviation. Here's one way of doing this using a Gaussian neighborhood with a diameter of 16 samples:

    D=16;
    [x y] = meshgrid(-D/2:D/2-1);
    G = exp(-0.5*((x.^2+y.^2)/(D/2)^2));
    G = G/sum(G(:));
    imv = conv2(imw.^2,G,'same');
    imn = imw./sqrt(imv);

Now look at the contrast normalized image using imagesc as before.
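Here is a NumPy sketch of the same divisive normalization, applied to synthetic noise whose contrast differs by a factor of 5 between the left and right halves. The synthetic input and the padded sliding-window convolution (standing in for conv2 with the 'same' option) are assumptions added for the check:

```python
import numpy as np

rng = np.random.default_rng(2)
N, D = 128, 16
imw = rng.standard_normal((N, N))
imw[:, N // 2:] *= 5.0                  # right half: 5x the contrast

# Gaussian neighborhood, as in the Matlab code
x, y = np.meshgrid(np.arange(-D // 2, D // 2), np.arange(-D // 2, D // 2))
G = np.exp(-0.5 * (x ** 2 + y ** 2) / (D / 2) ** 2)
G /= G.sum()

# Local mean of imw^2 under G, with 'same'-size output via edge padding
pad = D // 2
sq = np.pad(imw ** 2, ((pad, pad - 1), (pad, pad - 1)), mode='edge')
win = np.lib.stride_tricks.sliding_window_view(sq, (D, D))
imv = np.einsum('ijkl,kl->ij', win, G)  # G is symmetric, so this is conv2

imn = imw / np.sqrt(imv)

# After normalization both halves should have roughly unit variance
# (columns near the contrast boundary and the edges are excluded)
left = imn[:, 8:N // 2 - 8].std()
right = imn[:, N // 2 + 8:-8].std()
```

Before normalization the two halves differ in standard deviation by 5x; afterward both sit near 1, which is the point of the operation.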

The goal of projection pursuit is to find a low-dimensional projection of the data that yields a non-Gaussian distribution. In general, a random projection of high-dimensional data will yield a roughly Gaussian distribution. But when the linear subspace defining the projection is aligned with the structure of the data, the projection will be non-Gaussian. Sparse coding can be thought of as a special case of projection pursuit in which the form of non-Gaussianity is sparse - i.e., a distribution peaked at zero with heavy tails relative to a Gaussian. This means that the value of the projection is typically around zero, but not in the trivial sense that it simply has low variance. A population of neurons with such distributions would exhibit a sparse code, because only a small fraction of the neurons will be active at any moment.

We can think of a cortical simple cell as computing a one-dimensional projection of the data if its instantaneous activity is essentially the result of computing the inner product between its weight vector (receptive field) and the image. So, let's try designing different receptive fields for a model cortical neuron and see what their projections look like. For starters, let's make the receptive field an array of weight values with the same size as the Gaussian neighborhood above (D x D), filled with random values:

    w = randn(D);
    w = w - mean(w(:)); % zero mean
    w = w/sqrt(sum(w(:).^2)); % normalize the weight vector to unit length

You can do the convolution as follows:

    imfilt = conv2(imn,w,'same');

Note that we use the contrast-normalized, whitened image, because we are simulating a weighted sum of LGN inputs to the cortex. Look at the resulting image as viewed through the rf of this neuron using imagesc as before. Then compute a histogram of the filter outputs:

    pvar=[-1:.01:1]*max(abs(imfilt(:)));
    H=hist(imfilt(:),pvar);

Next compute a Gaussian with the same mean and variance:

    sigma=std(imfilt(:));
    mu=mean(imfilt(:));
    P=exp(-0.5*((pvar-mu)/sigma).^2);
    P=P/sum(P);

and plot both on a semilog plot as follows:

    semilogy(pvar,H/N^2,pvar,P,'k--')
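The claim that a random projection comes out Gaussian can be checked directly. Here is a sketch that uses iid Laplacian "pixels" as a stand-in for the whitened, contrast-normalized image (an assumption - real image patches are also spatially correlated), and excess kurtosis as the measure of non-Gaussianity:

```python
import numpy as np

rng = np.random.default_rng(3)
D = 16
n = 20000
patches = rng.laplace(size=(n, D * D))    # sparse marginals (high kurtosis)

# Random receptive field: zero mean, unit norm, as in the Matlab code
w = rng.standard_normal(D * D)
w -= w.mean()
w /= np.sqrt((w ** 2).sum())

proj = patches @ w                        # one projection per patch

def excess_kurtosis(v):
    z = (v - v.mean()) / v.std()
    return (z ** 4).mean() - 3.0          # 0 for a Gaussian

k_pixel = excess_kurtosis(patches[:, 0])  # single pixel: clearly non-Gaussian
k_proj = excess_kurtosis(proj)            # random projection: nearly Gaussian
```

Even though every individual input is sparse, the weighted sum of 256 of them is driven toward a Gaussian by the central limit theorem - which is exactly why a random receptive field produces a Gaussian histogram here.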

A random projection of the data (random receptive field) yields a Gaussian distribution, as predicted from projection pursuit. So, what kind of receptive field yields a non-Gaussian distribution? Let's try a Gabor filter model of a cortical simple cell. A Gabor filter is simply a Gaussian modulated sinusoid. We can make a vertically oriented Gabor as follows:

    sigma_x = D/8;
    sigma_y = D/8;
    f_x = 1/(D/4);
    g = exp(-0.5*((x/sigma_x).^2 + (y/sigma_y).^2)).*sin(2*pi*f_x*x);
    g = g/sqrt(sum(g(:).^2)); % normalization

You can see what this looks like using imagesc as before. Now compute the response of this neuron:

    imfilt = conv2(imn,g,'same');

Then look at the resulting image as viewed through the rf of this neuron, plot the histogram, and compare to a Gaussian as before. You should see that it is now distinctly non-Gaussian, and the specific way in which it is non-Gaussian is such that it is peaked at zero with heavy tails. Another way of saying this is that the neuron spends more time at zero than expected from a Gaussian distribution. Try different parameter settings for the Gabor filter to see if you can either accentuate or degrade the degree of non-Gaussianity (sparsity) exhibited by the model neuron.
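Here is a NumPy sketch of the Gabor above, verifying that it comes out zero-mean, unit-norm, and orientation-tuned (the grating probes at the bottom are an assumption added for the check; they are not part of the assignment):

```python
import numpy as np

D = 16
x, y = np.meshgrid(np.arange(-D // 2, D // 2), np.arange(-D // 2, D // 2))
sigma_x = sigma_y = D / 8
f_x = 1 / (D / 4)

# Gaussian-windowed sine, as in the Matlab code
g = np.exp(-0.5 * ((x / sigma_x) ** 2 + (y / sigma_y) ** 2)) \
    * np.sin(2 * np.pi * f_x * x)
g /= np.sqrt((g ** 2).sum())          # unit norm

# The odd (sine) phase makes the filter zero-mean
total = g.sum()

# Orientation tuning: inner product with a matched vertical grating is
# large; with a horizontal grating of the same frequency it vanishes
vert = np.sin(2 * np.pi * f_x * x)
horiz = np.sin(2 * np.pi * f_x * y)
r_vert = (g * vert).sum()
r_horiz = (g * horiz).sum()
```

The strong response to one orientation and frequency, and near-zero response to others, is what makes the Gabor's output histogram sparse: most image regions do not contain its preferred structure, so its response sits near zero most of the time.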

For further reading on these issues see

Simoncelli EP, Olshausen BA (2001) Natural Image Statistics and Neural Representation. Annual Review of Neuroscience, 24, 1193-1215.

http://www.cns.nyu.edu/pub/eero/simoncelli01-reprint.pdf

- A set of images and their corresponding power spectra, with at least one that demonstrates strong deviation from the 1/f^2 trend, along with an explanation of why.
- The whitened, contrast-normalized image computed for a "good" 1/f^2 image above. Use this same image for parts 3-5 below.
- The resulting response histogram, plus super-imposed Gaussian fit, obtained using the random receptive field model.
- The resulting response histogram, plus super-imposed Gaussian fit, obtained using a Gabor filter for which you have optimized the parameters for sparsity. Show the Gabor and report what parameters you used.
- Same thing for Gabor filter parameters that minimize sparsity.