So, I ran yesterday’s experiment to completion, and it’s a bit hard to evaluate exactly what the performance on the validation set is, because
1) I inadvertently used 20000-25000 as validation instead of 20000-22500.
2) I’m not using enough validation points per epoch, and that introduces variance in the measurements.
I’m not making stopping decisions based on that value, though, so we could just average the performance measurements in the last iterations and that would work.
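A minimal sketch of that averaging, assuming the per-epoch validation error rates have been collected into a plain Python list. The values below are made up for illustration and are not taken from the experiment. The sketch also includes the usual binomial standard-error estimate, to show how the number of validation points affects the noise in a single measurement.

```python
import math

def mean_of_last(values, k):
    """Average the last k measurements (or all of them if fewer than k exist)."""
    tail = values[-k:]
    return sum(tail) / len(tail)

def binomial_standard_error(error_rate, n_examples):
    """Standard error of an error rate estimated from n_examples independent examples."""
    return math.sqrt(error_rate * (1.0 - error_rate) / n_examples)

# Hypothetical per-epoch validation error rates (illustrative only).
valid_errors = [0.30, 0.24, 0.21, 0.19, 0.18, 0.16, 0.19, 0.17]

smoothed = mean_of_last(valid_errors, k=4)

# Noise in one measurement for two hypothetical validation sizes.
se_small = binomial_standard_error(smoothed, n_examples=100)
se_large = binomial_standard_error(smoothed, n_examples=2500)
```

Going from 100 to 2500 examples shrinks the standard error by a factor of 5, since it scales as 1/sqrt(n).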
The YAML file used is on my GitHub at
The experiment was run on a GTX TITAN Black, but it’s only using about 1GB of RAM, so it could be run anywhere else. I picked a number of epochs so that it would run through the whole night.
I’m currently running the thing again, but this time with the proper validation set.
As for the recipe, I used the following intuition:
– You want to have a few convolution layers at first. Then you want some fully-connected layers. Use ReLUs instead of sigmoids or tanh.
– Initialize the parameters to something that’s not too small. This way, the training can actually make some progress instead of being stuck around 0.5 (chance level for two classes) for an eternity.
– Pick a learning rate that’s not too big. I didn’t even get fancy and change the learning rate during training, like the theory would usually ask for. I was shooting for 80% accuracy, not for the best possible performance. For some reason, momentum 0.9 did poorly when I tried it, and momentum 0.1 did great.
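To make the learning-rate and momentum knobs concrete, here is a toy sketch of the classical momentum update on a one-dimensional quadratic. It is not the training code from the experiment (that is configured through the YAML file). The function names, the loss and the hyperparameter values are all hypothetical.

```python
def sgd_momentum(grad_fn, w0, learning_rate, momentum, n_steps):
    """Run classical momentum updates and return the final parameter value."""
    w = w0
    velocity = 0.0
    for _ in range(n_steps):
        # Velocity is a decaying sum of past gradient steps.
        velocity = momentum * velocity - learning_rate * grad_fn(w)
        w = w + velocity
    return w

# Toy loss f(w) = 0.5 * w**2, whose gradient is simply w (minimum at w = 0).
def toy_grad(w):
    return w

w_final = sgd_momentum(toy_grad, w0=1.0, learning_rate=0.1, momentum=0.1, n_steps=200)
```

This only shows where the two settings enter the update. It does not explain why momentum 0.9 did worse than 0.1 in the actual runs.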
I ran Vincent’s script to get an accurate measurement of the validation error using the last model from the training (so the procedure stays honest).
python dogs_vs_cats_error.py /home/gyomalin/ML/tmp/base_conv_06_mlp.pkl
Using gpu device 0: GeForce GTX TITAN Black
Train error rate is 0.1485
Valid error rate is 0.1712
Test error rate is 0.1536
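Since the target was 80% accuracy, it helps to convert the error rates printed above into accuracies. A quick sketch of the arithmetic (the dictionary and variable names are just for illustration):

```python
# Error rates copied from the script output above.
error_rates = {"train": 0.1485, "valid": 0.1712, "test": 0.1536}

target_accuracy = 0.80

# Accuracy is one minus the error rate.
accuracies = {name: round(1.0 - err, 4) for name, err in error_rates.items()}
meets_target = {name: acc >= target_accuracy for name, acc in accuracies.items()}

for name in ("train", "valid", "test"):
    print(f"{name}: accuracy {accuracies[name]:.4f}, meets 80% target: {meets_target[name]}")
```

All three splits come out above the 80% target: 85.15% on train, 82.88% on valid and 84.64% on test.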