I'm trying something new and am wondering whether those more experienced than me think it's a good idea. Every X epochs without improvement, I train my model for Y epochs on the testing data instead of the training data, in the hope of giving it some understanding that's absent from the training set but present in the testing set. This is mostly for small(ish) datasets. I'm using it sparingly, so X is large and Y is small. The batch size for the testing set is much larger than for the training set, so the gradient updates come from very generalized averages. Still, I worry it'll cause overfitting with no symptoms, since evaluation uses the testing set, which the model has now seen before. Does this seem like it could have potential?
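Roughly, the loop I mean looks like this (a toy sketch: `ToyModel`, `train_with_schedule`, and all the numbers are hypothetical stand-ins, not my actual model):

```python
class ToyModel:
    """Stand-in model: training data stops helping after 5 epochs, so the
    loss only keeps improving when the model (illegitimately) sees test data."""
    def __init__(self):
        self.loss = 1.0
        self.train_epochs = 0
        self.saw_test_data = False

    def fit_one_epoch(self, split, batch_size):
        if split == "train":
            self.train_epochs += 1
            if self.train_epochs <= 5:
                self.loss *= 0.9
        else:  # split == "test": the model has now seen the test set
            self.saw_test_data = True
            self.loss *= 0.9
        return self.loss

def train_with_schedule(model, total_epochs, X=3, Y=1,
                        train_batch=32, test_batch=512):
    """Every X epochs without improvement, train Y epochs on the test split
    with a much larger batch size."""
    best = float("inf")
    stale = 0
    for _ in range(total_epochs):
        loss = model.fit_one_epoch("train", batch_size=train_batch)
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
        if stale >= X:
            for _ in range(Y):
                model.fit_one_epoch("test", batch_size=test_batch)
            stale = 0
    return model
```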

Dexter


I wouldn't recommend that sort of approach. It'll very likely improve the performance statistics (accuracy etc.) on your test set,
but the point of the test set is to see how well the model generalizes to data it has never seen before, which should be a proxy for how well it performs on real new data. Training on your test set means it can no longer serve that purpose.
Your final evaluation should always be on data the net has never seen during training.
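For example, a three-way split keeps a final test set untouched, while a validation set handles the during-training decisions (a minimal sketch; the fractions are just illustrative):

```python
import random

def three_way_split(examples, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle once, then carve off validation and test sets.

    The validation set guides decisions made during training (early
    stopping, hyperparameters); the test set is touched exactly once,
    at the very end, so it stays a fair proxy for new data.
    """
    rng = random.Random(seed)
    examples = list(examples)
    rng.shuffle(examples)
    n = len(examples)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = examples[:n_test]
    val = examples[n_test:n_test + n_val]
    train = examples[n_test + n_val:]
    return train, val, test

train, val, test = three_way_split(range(100))
# 70 examples for training, 15 each for validation and final test
```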

Kai


@Stanley If you think your training data may have different characteristics than your test data, you should find something that can support that. If you just think your network is stuck in a local minimum, you can try kicking it by increasing the learning rate for some iterations or by adding some randomness on top of your weights.
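A minimal sketch of both kinds of kick (the noise scale `sigma` and the learning-rate `factor` are illustrative values, not recommendations):

```python
import random

def kick_weights(weights, sigma=0.01, seed=None):
    """Return a copy of `weights` with small Gaussian noise added.

    A cheap way to nudge a network out of a flat region or local
    minimum; `sigma` controls how hard the kick is.
    """
    rng = random.Random(seed)
    return [w + rng.gauss(0.0, sigma) for w in weights]

def bumped_lr(base_lr, epoch, kick_epochs, factor=5.0):
    """Temporarily raise the learning rate during the kick epochs,
    then drop back to the base value."""
    return base_lr * factor if epoch in kick_epochs else base_lr
```

Both leave the training/test separation intact, so the test set still measures generalization honestly.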
