Commit

Updated README
anordertoreclaim committed Aug 7, 2019
1 parent 0a15d2e commit be3a306
Showing 2 changed files with 2 additions and 2 deletions.
README.md: 1 addition & 1 deletion
@@ -154,7 +154,7 @@ optional arguments:
# Examples of samples
The biggest challenge is to make the network converge to a good set of parameters. I've experimented with hyperparameters and here are the results I've managed to obtain for N-way MNIST using different models.

-Generally, in order for model to converge to a good set of parameters, one needs to go with a small learning rate (in order of 1e-4). I've also found that bigger kernel sizes work best for hidden layers.
+Generally, in order for the model to converge to a good set of parameters, one needs to use a small learning rate (about 1e-4). I've also found that bigger kernel sizes in hidden layers work better.

A very simple model, `python train.py --epochs 2 --color-levels 2 --hidden-fmaps 21 --lr 0.002 --max-norm 2` (all other arguments at their default values), trained for just 2 epochs, managed to produce these samples on binary MNIST:

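To make the README's advice concrete, here is a minimal, illustrative PyTorch sketch, not the repository's actual model code: plain `Conv2d` layers stand in for the gated, masked convolutions the real PixelCNN uses, Adam is chosen purely for illustration, and the class name `TinyStack` and the `hidden_kernel` parameter are placeholders. The point is only where the two recommended knobs (a learning rate around 1e-4 and a larger kernel size in the hidden layers) would plug in.

```python
# Illustrative only: plain Conv2d layers stand in for the gated, masked
# convolutions of the actual PixelCNN; the point is where the hyperparameters go.
import torch
import torch.nn as nn

class TinyStack(nn.Module):
    def __init__(self, hidden_fmaps=21, hidden_layers=6, hidden_kernel=7, color_levels=2):
        super().__init__()
        layers = [nn.Conv2d(1, hidden_fmaps, kernel_size=7, padding=3)]
        for _ in range(hidden_layers):
            # "Bigger kernel sizes in hidden layers work better" -- hidden_kernel
            # is the knob that advice refers to.
            layers += [nn.ReLU(),
                       nn.Conv2d(hidden_fmaps, hidden_fmaps,
                                 kernel_size=hidden_kernel,
                                 padding=hidden_kernel // 2)]
        layers += [nn.ReLU(), nn.Conv2d(hidden_fmaps, color_levels, kernel_size=1)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

model = TinyStack()
# A small learning rate, on the order of 1e-4, per the README's recommendation.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```

With 28x28 MNIST inputs, `model(torch.zeros(1, 1, 28, 28))` yields a `(1, 2, 28, 28)` tensor of per-pixel logits over the two color levels.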
train.py: 1 addition & 1 deletion
@@ -99,7 +99,7 @@ def main():
parser.add_argument('--hidden-layers', type=int, default=6,
help='Number of layers of gated convolutions with mask of type "B"')

-parser.add_argument('--learning-rate', '--lr', type=float, default=0.0002,
+parser.add_argument('--learning-rate', '--lr', type=float, default=0.0001,
help='Learning rate of optimizer')
parser.add_argument('--weight-decay', type=float, default=0.0001,
help='Weight decay rate of optimizer')
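For context on how flags like these are typically consumed, here is a hypothetical, self-contained sketch, assuming a standard argparse-plus-Adam wiring rather than train.py's actual code; the repository may use a different optimizer, and the `Linear` placeholder stands in for the real model. The defaults mirror the values after this commit.

```python
# Hypothetical wiring, not train.py itself: shows the flags from this hunk
# feeding an optimizer, with defaults matching the post-commit values.
import argparse
import torch

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--hidden-layers', type=int, default=6,
                        help='Number of layers of gated convolutions with mask of type "B"')
    parser.add_argument('--learning-rate', '--lr', type=float, default=0.0001,
                        help='Learning rate of optimizer')
    parser.add_argument('--weight-decay', type=float, default=0.0001,
                        help='Weight decay rate of optimizer')
    args = parser.parse_args()

    model = torch.nn.Linear(1, 1)  # placeholder; the real script builds the PixelCNN here
    optimizer = torch.optim.Adam(model.parameters(),
                                 lr=args.learning_rate,
                                 weight_decay=args.weight_decay)
    print(optimizer)

if __name__ == '__main__':
    main()
```

Invoked as `python train.py --lr 0.0001`, the `--lr` alias still lands in `args.learning_rate`, because argparse derives the destination from the first long option name.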
