### This folder includes the code for our monotone deep Boltzmann machine implementation.

-- Required packages:
torch, numpy, matplotlib, tqdm

-- Structure of the code: 
	-- proxsoftmax.py implements our prox_f^alpha for alpha between 0 and 1. Note that alpha=1 itself is not supported here.
	-- multitier_conv.py implements the multi-tier structure of the DEQ in terms of convolutions (note that even dense linear layers can be rewritten as convolutions).
		The most important arguments are:

		sizes = [(num_classes, 28, 28, 1),
		         (40, 14, 14, 10),
		         (80, 7, 7, 20),
		         (10, 1)]

		kernels = np.array([[3, 0, 0, 0],
		                    [3, 3, 0, 0],
		                    [3, 3, 3, 0],
		                    [0, 0, 1, 1]])

		Here 'sizes' specifies (input_channel, input_height, input_width, input_groups) for each convolutional tier, or (input_size, input_groups) for each linear tier. 'kernels' specifies which tiers are connected and the kernel size of each connection.

	-- deq_model.py implements the forward-backward splitting. The top-level class is 'ConvDeqCrf', which is our model.
	-- util.py contains utility functions such as data preparation.
	-- train_and_eval.py contains the training and evaluation code.
	-- main.py is the main entry point for running this code. One can run 'python -i main.py' directly. The variable 'num_classes' is the number of bins used to discretize the input intensity. For example, if num_classes=2 and the intensity of the MNIST image pixel at position (0,0) is x, then the binned output is 0 if x<0.5 and 1 otherwise.
		MON_DEFAULTS = {
		    'alpha': 0.125,
		    'tol': 1e-2,
		    'max_iter': 50
		}
		specifies the arguments for the forward-backward splitting. 'alpha' is the alpha used in the damped iteration and in prox_f^alpha, 'tol' is the relative tolerance at which we stop the splitting, and 'max_iter' is the maximum number of (Anderson) iterations.
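
To make the roles of 'alpha', 'tol', and 'max_iter' concrete, here is a minimal sketch of a plain damped fixed-point iteration on a scalar toy map. It is not the code in deq_model.py: it uses neither prox_f^alpha nor Anderson acceleration, the function name `damped_fixed_point` and the map `T` are hypothetical, and reading 'tol' as a bound on the relative change between iterates is an assumption.

```python
# Same values as MON_DEFAULTS in main.py.
MON_DEFAULTS = {'alpha': 0.125, 'tol': 1e-2, 'max_iter': 50}


def damped_fixed_point(T, z0, alpha, tol, max_iter):
    """Iterate z <- (1 - alpha) * z + alpha * T(z) until the relative change
    between consecutive iterates drops below tol or max_iter is reached.

    Hypothetical helper for illustration only.
    Returns (z, number_of_iterations, converged_flag).
    """
    z = z0
    for it in range(1, max_iter + 1):
        z_new = (1.0 - alpha) * z + alpha * T(z)
        # Assumed stopping rule: relative change of the iterate.
        rel_change = abs(z_new - z) / (abs(z_new) + 1e-12)
        z = z_new
        if rel_change < tol:
            return z, it, True
    return z, max_iter, False


def T(z):
    # Toy contraction with fixed point z* = 2 (an assumption for the demo).
    return 0.5 * z + 1.0


z, n_iter, converged = damped_fixed_point(T, 0.0, **MON_DEFAULTS)
print(z, n_iter, converged)
```

With a small 'alpha' and a loose 'tol', the iteration can stop while the iterate is still some distance from the fixed point; tightening 'tol' and raising 'max_iter' trades run time for accuracy.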
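
The binning controlled by 'num_classes' can be sketched as follows. This is only an illustration consistent with the num_classes=2 example above, assuming intensities scaled to [0, 1] and equal-width bins; the helper name `bin_intensity` is hypothetical and the actual preprocessing in util.py may differ.

```python
def bin_intensity(x, num_classes):
    """Map an intensity x in [0, 1] to an integer bin in 0..num_classes-1.

    Hypothetical helper: equal-width bins are an assumption.
    """
    # x == 1.0 would fall into bin num_classes, so clamp it to the last bin.
    return min(int(x * num_classes), num_classes - 1)


pixels = [0.0, 0.49, 0.5, 1.0]
binned = [bin_intensity(x, num_classes=2) for x in pixels]
print(binned)
```

For num_classes=2 this reproduces the rule stated above: values below 0.5 map to 0 and all others map to 1.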