
This directory contains the source code for the sparse coding
simulation.  You are free to use or make chages to this software for
your own experimental purposes.  Please do not distribute (or
especially sell) to others.  The program should always be obtained
from the URL http://redwood.psych.cornell.edu/sparsenet/sparsenet.html.  
The latest version will be made available there.  If you wish to be
notified of changes or corrections as they occur, please send me email
so I can add you to my list.

Bruno Olshausen
bruno@ai.mit.edu


-------------------------------------------


To compile:
-----------

Compilation is dependendent on the existence of two libraries: 1)
numerical recipies conjugate gradient descent procedures, and 2)
TK/TCL.  Makefile should be modified to reflect the location of these
libraries and include files for successful compilation.  For the
numerical recipies routines, I linked together frprmn.o, linmin.o,
f1dim.o, brent.o, mnbrak.o, and nrutil.o to form libnropt.a, but you
could just link to the entire numerical recipies library.

The variable ITMAX in diff_net_cg.c sets the maximum number of
iterations for conjugate gradient descent for determining the a_i.
The numerical recipies procedure frprmn.c defines this as a constant,
so if you wish to have control over this parameter you will need to
change the definition of ITMAX in frprmn.c to "int ITMAX" instead of
"#define ITMAX 200".  If for some reason you can't make changes to
frprmn.c on your system, then simply comment out all references to
ITMAX in diff_net_cg.c.  In this case, you will only be able to
control the halting point through setting ftol (read the numerical
recipies book for description of this parameter).

The following is a description of the source code files:

diff_net_cg.c:	The sparse coding algorithm.
display.c:	Display routines
display_init.c:	Display initialization
histo.c:	Histogram collection routines
sim.c:		TCL/TK interpreter for running the program

gen_data.c:	A program for generating sparse test data


To run:
-------

The simulation is actually run by using the net.tcl script.  Type
"net.tcl" at the prompt and away it goes.  

There are two windows that come up.  One is labeled "net.tcl," which
lets you control the program, set parameters, etc.  The other, labeled
"net," displays the simulation outputs, weights, etc.

The parameter beta controls the sparseness factor (lambda/sigma in the
paper).  Eta controls the learning rate - start this out at 500 and
then creep it down as the solution converges.  Settle determines the
maximum number of iterations used by the conjugate gradient descent
procedure for determining the a_i.  You will see that most of the
"work" is done in the first ten iterations.

The white panel in the simulation display window shows the output of
the network (the a_i) as a bar chart.  The blue and red bars displayed
at the bottom show the std dev (sqrt(var)) and gain factor,
respectively, for each unit.  The phi_i are shown below this.  At the
bottom of the window are shown the input to the network, the
reconstruction, and the residual image, respectively.  At the lower
left is shown the iteration number.  A weight update is performed
every 100 iterations.

The number of inputs and outputs is set in the first command line of
net.tcl - i.e., in the line that begins "initSim."  The first argument
gives the number of inputs, the second the number of outputs.
Dimensions of the input are always assumed to be square, so you should
enter a number that is the square of some integer - i.e., 64, 144,
256, etc.  You can make the number outputs less than, equal to, or
greater than the number of inputs.

The images used as training data are set by the constant IMAGE_FILE in
sim.c.  This will initially be set to trainset.test/image%d, which is
the sparse pixels training set.  To use the preprocessed natural
images, change this to trainset/image%d and also change NUM_IMAGES to
10.

Saving the state of the network will write the phi's to a file called
"weights" and the gain factors for each phi to a file called "gain."
When you load state, it will expect to find both of these files.

The program gen_data was used to generate the test images in
trainset.test.

Please let me know of any bugs you find.
