Commit fc585b1

Adds capability to build and train fully input convex CNNs.

1 parent b75bb78, commit fc585b1

14 files changed: +1099 −51 lines
README.md (+2 −3)

@@ -128,9 +128,8 @@ more robust classification network.
 This repository introduces the following functions that are used throughout the
 examples:

-- [`buildConstrainedNetwork`](conslearn/buildConstrainedNetwork.m) - Build a
-  network with specific constraints induced on the architecture and
-  initialization of the weights.
+- [`buildConstrainedNetwork`](conslearn/buildConstrainedNetwork.m) - Build a multi-layer perceptron (MLP) with constraints on the architecture and initialization of the weights.
+- [`buildConvexCNN`](conslearn/buildConvexCNN.m) - Build a fully-input convex convolutional neural network (CNN).
 - [`trainConstrainedNetwork`](conslearn/trainConstrainedNetwork.m) - Train a
   constrained network and maintain the constraint during training.
 - [`lipschitzUpperBound`](conslearn/lipschitzUpperBound.m) - Compute an upper
buildImageFICCNN.m (new file, +148 lines)

@@ -0,0 +1,148 @@
function net = buildImageFICCNN(inputSize, outputSize, filterSize, numFilters, options)
% BUILDIMAGEFICCNN Construct a fully-input convex convolutional neural
% network for image inputs.
%
% NET = BUILDIMAGEFICCNN(INPUTSIZE, OUTPUTSIZE, FILTERSIZE, NUMFILTERS)
% creates a fully-input convex dlnetwork object, NET.
%
% INPUTSIZE is a row vector of integers [h w c], where h, w, and c
% correspond to the height, width, and number of channels respectively.
%
% OUTPUTSIZE is an integer indicating the number of neurons in the
% output fully connected layer.
%
% FILTERSIZE is a matrix with two columns specifying the height and width
% for each convolutional layer. The network will have as many
% convolutional layers as there are rows in FILTERSIZE. If FILTERSIZE is
% provided as a column vector, it is assumed that the filters are square.
%
% NUMFILTERS is a column vector of integers that specifies the number of
% filters for each convolutional layer. It must have the same number of
% rows as FILTERSIZE.
%
% NET = BUILDIMAGEFICCNN(__, NAME=VALUE) specifies additional options
% using one or more name-value arguments.
%
% Stride                        - Stride for each convolutional
%                                 layer, specified as a two-column
%                                 matrix where the first column is
%                                 the stride height and the second
%                                 column is the stride width. If
%                                 Stride is specified as a column
%                                 vector, a square stride is assumed.
%                                 The default value is 1 for all
%                                 layers.
%
% DilationFactor                - Dilation factor for each
%                                 convolutional layer, specified as a
%                                 two-column matrix where the first
%                                 column is the dilation height and
%                                 the second column is the dilation
%                                 width. If DilationFactor is a column
%                                 vector, a square dilation factor is
%                                 assumed. The default value is 1 for
%                                 all layers.
%
% Padding                       - Padding method for each
%                                 convolutional layer, specified as
%                                 "same". Padding must be a string
%                                 array with the same number of rows
%                                 as FILTERSIZE. The default value is
%                                 "same" for all layers.
%
% PaddingValue                  - Padding value for each convolutional
%                                 layer, specified as a column vector
%                                 with the same number of rows as
%                                 FILTERSIZE. The default value is 0
%                                 for all layers.
%
% ConvexNonDecreasingActivation - Convex, non-decreasing activation
%                                 function, specified as "softplus"
%                                 or "relu". The default value is
%                                 "relu".

% Copyright 2024 The MathWorks, Inc.

arguments
    inputSize (1,:) {mustBeNonempty, mustBeReal, mustBeInteger, mustBePositive, mustBeTwoOrThreeRowVector(inputSize, "inputSize")}
    outputSize (1,1) {mustBeReal, mustBeInteger, mustBePositive}
    filterSize {mustBeNonempty, mustBeReal, mustBeInteger, mustBePositive, mustBeOneOrTwoColumn(filterSize, "filterSize")}
    numFilters (:,1) {mustBeNonempty, mustBeReal, mustBeInteger, mustBePositive, mustBeEqualLength(filterSize, numFilters, "numFilters")}
    options.Stride {mustBeNonempty, mustBeReal, mustBeInteger, mustBePositive, mustBeOneOrTwoColumn(options.Stride, "Stride"), mustBeEqualLength(filterSize, options.Stride, "Stride")} = ones(numel(numFilters), 2)
    options.DilationFactor {mustBeNonempty, mustBeReal, mustBeInteger, mustBePositive, mustBeOneOrTwoColumn(options.DilationFactor, "DilationFactor"), mustBeEqualLength(filterSize, options.DilationFactor, "DilationFactor")} = ones(numel(numFilters), 2)
    options.Padding (:,1) {mustBeNonzeroLengthText, mustBeMember(options.Padding, "same"), mustBeEqualLength(filterSize, options.Padding, "Padding")} = repelem("same", numel(numFilters));
    options.PaddingValue (:,1) {mustBeNonempty, mustBeReal, mustBeEqualLength(filterSize, options.PaddingValue, "PaddingValue")} = zeros(numel(numFilters), 1);
    options.ConvexNonDecreasingActivation {mustBeNonzeroLengthText, mustBeTextScalar, mustBeMember(options.ConvexNonDecreasingActivation, ["relu", "softplus"])} = "relu"
end

% Select the activation function based on user input
switch options.ConvexNonDecreasingActivation
    case "relu"
        activationLayer = @(name) reluLayer(Name=name);
    case "softplus"
        activationLayer = @(name) softplusLayer(Name=name);
end

% Build the input layer
layers = [
    imageInputLayer(inputSize, Name="input", Normalization="none")
    ];

% Build the convolutional layers
for ii = 1:numel(numFilters)
    convLayerName = "conv2d_+_" + ii;
    activationLayerName = "cnd_" + ii;
    batchNormLayerName = "batchnorm_+_" + ii;

    convLayer = convolution2dLayer(filterSize(ii, :), numFilters(ii), ...
        Stride=options.Stride(ii, :), ...
        DilationFactor=options.DilationFactor(ii, :), ...
        Padding=options.Padding(ii), ...
        PaddingValue=options.PaddingValue(ii), ...
        Name=convLayerName);

    layers = [
        layers;
        convLayer;
        activationLayer(activationLayerName);
        batchNormalizationLayer(Name=batchNormLayerName)
        ]; %#ok<AGROW>
end

% Modify the name of the first convolutional layer to remove constraints
layers(2).Name = "conv2d_1";

% Add final pooling and fully connected layers
layers = [
    layers;
    globalAveragePooling2dLayer(Name="global_avg_pool");
    fullyConnectedLayer(outputSize, Name="fc_+_end")
    ];

% Initialize the dlnetwork
net = dlnetwork(layers);

% Make the network convex
net = conslearn.convex.makeNetworkConvex(net);

end

function mustBeTwoOrThreeRowVector(x, name)
if ~(isrow(x) && (numel(x) == 2 || numel(x) == 3))
    error("'%s' must be a row vector with two or three elements.", name);
end
end

function mustBeOneOrTwoColumn(x, name)
if ~(size(x, 2) == 1 || size(x, 2) == 2)
    error("'%s' must be an array with one or two columns.", name);
end
end

function mustBeEqualLength(filterSize, otherVar, otherVarName)
if ~isequal(size(filterSize, 1), size(otherVar, 1))
    error("'%s' must have the same number of rows as the filter size value.", otherVarName);
end
end
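For orientation, a minimal usage sketch of the new image builder, based on the argument block above (the sizes and option values are illustrative; requires Deep Learning Toolbox and this repository on the path):

```matlab
% Build a three-layer fully-input convex CNN for 32x32 RGB images with
% ten output neurons. filterSize is given as a column vector, so square
% filters are assumed; Stride defaults to 1 and Padding to "same".
inputSize  = [32 32 3];
outputSize = 10;
filterSize = [3; 3; 5];     % one row per convolutional layer
numFilters = [16; 32; 64];  % same number of rows as filterSize

net = buildImageFICCNN(inputSize, outputSize, filterSize, numFilters, ...
    ConvexNonDecreasingActivation="softplus");
```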
buildSequenceFICCNN.m (new file, +124 lines)

@@ -0,0 +1,124 @@
function net = buildSequenceFICCNN(inputSize, outputSize, filterSize, numFilters, options)
% BUILDSEQUENCEFICCNN Construct a fully-input convex convolutional
% neural network for sequence inputs.
%
% NET = BUILDSEQUENCEFICCNN(INPUTSIZE, OUTPUTSIZE, FILTERSIZE,
% NUMFILTERS) creates a fully-input convex dlnetwork object, NET.
%
% INPUTSIZE is an integer indicating the number of features.
%
% OUTPUTSIZE is an integer indicating the number of neurons in the
% output fully connected layer.
%
% FILTERSIZE is a column vector of integer filter sizes. The network
% will have as many convolutional layers as there are rows in
% FILTERSIZE.
%
% NUMFILTERS is a column vector of integers that specifies the number of
% filters for each convolutional layer. It must have the same number of
% rows as FILTERSIZE.
%
% NET = BUILDSEQUENCEFICCNN(__, NAME=VALUE) specifies additional options
% using one or more name-value arguments.
%
% Stride                        - Stride for each convolutional
%                                 layer, specified as a column vector
%                                 of integers with the same number of
%                                 rows as FILTERSIZE. The default
%                                 value is 1 for all layers.
%
% DilationFactor                - Dilation factor for each
%                                 convolutional layer, specified as a
%                                 column vector with the same number
%                                 of rows as FILTERSIZE. The default
%                                 value is 1 for all layers.
%
% Padding                       - Padding method for each
%                                 convolutional layer, specified as
%                                 "same" or "causal". Padding must be
%                                 a string array with the same number
%                                 of rows as FILTERSIZE. The default
%                                 value is "causal" for all layers.
%
% PaddingValue                  - Padding value for each convolutional
%                                 layer, specified as a column vector
%                                 with the same number of rows as
%                                 FILTERSIZE. The default value is 0
%                                 for all layers.
%
% ConvexNonDecreasingActivation - Convex, non-decreasing activation
%                                 function, specified as "softplus"
%                                 or "relu". The default value is
%                                 "relu".

% Copyright 2024 The MathWorks, Inc.

arguments
    inputSize (1,1) {mustBeReal, mustBeInteger, mustBePositive}
    outputSize (1,1) {mustBeReal, mustBeInteger, mustBePositive}
    filterSize (:,1) {mustBeNonempty, mustBeReal, mustBeInteger, mustBePositive}
    numFilters (:,1) {mustBeNonempty, mustBeReal, mustBeInteger, mustBePositive, mustBeEqualLength(filterSize, numFilters, "numFilters")}
    options.Stride (:,1) {mustBeNonempty, mustBeReal, mustBeInteger, mustBePositive, mustBeEqualLength(filterSize, options.Stride, "Stride")} = ones(numel(filterSize), 1)
    options.DilationFactor (:,1) {mustBeNonempty, mustBeReal, mustBeInteger, mustBePositive, mustBeEqualLength(filterSize, options.DilationFactor, "DilationFactor")} = ones(numel(filterSize), 1)
    options.Padding (:,1) {mustBeNonzeroLengthText, mustBeMember(options.Padding, ["same", "causal"]), mustBeEqualLength(filterSize, options.Padding, "Padding")} = repelem("causal", numel(filterSize))
    options.PaddingValue (:,1) {mustBeNonempty, mustBeReal, mustBeEqualLength(filterSize, options.PaddingValue, "PaddingValue")} = zeros(numel(filterSize), 1)
    options.ConvexNonDecreasingActivation {mustBeNonzeroLengthText, mustBeTextScalar, mustBeMember(options.ConvexNonDecreasingActivation, ["relu", "softplus"])} = "relu"
end

% Select the activation function based on user input
switch options.ConvexNonDecreasingActivation
    case "relu"
        activationLayer = @(name) reluLayer(Name=name);
    case "softplus"
        activationLayer = @(name) softplusLayer(Name=name);
end

% Build the input layer
layers = [
    sequenceInputLayer(inputSize, Name="input", Normalization="none")
    ];

% Build the convolutional layers
for ii = 1:numel(numFilters)
    convLayerName = "conv1d_+_" + ii;
    activationLayerName = "cnd_" + ii;
    batchNormLayerName = "batchnorm_+_" + ii;

    convLayer = convolution1dLayer(filterSize(ii), numFilters(ii), ...
        Stride=options.Stride(ii), ...
        DilationFactor=options.DilationFactor(ii), ...
        Padding=options.Padding(ii), ...
        PaddingValue=options.PaddingValue(ii), ...
        Name=convLayerName);

    layers = [
        layers;
        convLayer;
        activationLayer(activationLayerName);
        batchNormalizationLayer(Name=batchNormLayerName)
        ]; %#ok<AGROW>
end

% Modify the name of the first convolutional layer to remove constraints
layers(2).Name = "conv1d_1";

% Add final pooling and fully connected layers
layers = [
    layers;
    globalAveragePooling1dLayer(Name="global_avg_pool");
    fullyConnectedLayer(outputSize, Name="fc_+_end")
    ];

% Initialize the dlnetwork
net = dlnetwork(layers);

% Make the network convex
net = conslearn.convex.makeNetworkConvex(net);

end

function mustBeEqualLength(filterSize, otherVar, otherVarName)
if ~isequal(size(filterSize, 1), size(otherVar, 1))
    error("'%s' must have the same number of rows as the filter size value.", otherVarName);
end
end
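A corresponding sketch for the sequence builder, again with illustrative sizes (the defaults give stride 1 and causal padding, which keeps the convolutions autoregressive):

```matlab
% Build a two-layer fully-input convex 1-D CNN for sequences with
% eight features and a scalar output, using the default causal padding.
net = buildSequenceFICCNN(8, 1, [5; 3], [16; 32], ...
    ConvexNonDecreasingActivation="relu");
```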
getConvexParameterIdx.m

@@ -1,8 +1,8 @@
 function convexParameterIdx = getConvexParameterIdx(net)
 % GETCONVEXPARAMETERIDX Returns the indices in the learnable parameter
 % table of the network that correspond to weights with convex constraints.
-% The network *must* be created using the buildConstrainedNetwork function
-% with a convex constraint type.
+% The network *must* be created using the buildConstrainedNetwork or
+% buildConvexCNN function with a convex constraint type.

 % Copyright 2024 The MathWorks, Inc.

@@ -11,6 +11,6 @@
 end

 convexParameterIdx = contains(net.Learnables.Layer,"_+_") & ...
-    contains(net.Learnables.Parameter,"Weight");
+    ( contains(net.Learnables.Parameter,"Weight") | contains(net.Learnables.Parameter,"Scale"));
 end

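The change above widens the selector to include batch-normalization Scale parameters of "_+_"-tagged layers, since a negative scale would break convexity of the pre-activations. A hypothetical usage sketch (the package location of getConvexParameterIdx is assumed to mirror makeNetworkConvex):

```matlab
% Hypothetical sketch: list the convexity-constrained learnables of a
% network built by buildConvexCNN. The "_+_" tag in a layer name marks
% constrained layers; their Weights and (for batchnorm) Scale values
% must stay nonnegative to preserve input convexity.
idx = conslearn.convex.getConvexParameterIdx(net);
constrainedParams = net.Learnables(idx, :);  % rows of the Learnables table
```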
conslearn/+conslearn/+convex/makeNetworkConvex.m (+2 −2)

@@ -1,8 +1,8 @@
 function net = makeNetworkConvex(net)
 % MAKENETWORKCONVEX Constrain the weights of a convex network to ensure
 % convexity of the outputs with respect to the network inputs. The network
-% *must* be created using the buildConstrainedNetwork function with a convex
-% constraint type.
+% *must* be created using the buildConstrainedNetwork or
+% buildConvexCNN function with a convex constraint type.

 % Copyright 2024 The MathWorks, Inc.

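Both builders finish by calling makeNetworkConvex to enforce the constraint at initialization. Conceptually this is the standard input-convex projection; a rough illustration of the idea, not the shipped implementation, might look like:

```matlab
% Illustrative only: input convexity is preserved if every constrained
% learnable (the "_+_" tagged Weights and Scales) is kept nonnegative,
% e.g. by projecting onto [0, Inf) after initialization or each update.
idx = conslearn.convex.getConvexParameterIdx(net);
net.Learnables.Value(idx) = cellfun(@(w) max(w, 0), ...
    net.Learnables.Value(idx), UniformOutput=false);
```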