Transformers for Recommender Systems - Part 3
Regularization to improve duplication penalty loss
This is a small update that continues the previous posts, Transformers for Recommender Systems - Part 1 and Transformers for Recommender Systems - Part 2.
In this post, we improve the model further by regularizing the duplication penalty loss. The change is small: we use the median instead of the mean to calculate the penalty loss.
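The median reduces the impact of outliers. It is the value that minimizes the sum of absolute (L1) deviations, whereas the mean minimizes the sum of squared (L2) deviations, so a few extreme penalty values pull the median far less than they pull the mean. In that sense, switching to the median acts like an L1-style regularizer on the penalty.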
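To make the effect concrete, here is a minimal sketch in plain Python. It is not the model's actual loss: the `duplication_penalties` helper and the toy batch are hypothetical, and the penalty is assumed to be the fraction of repeated items in each recommended sequence. It only shows how the two aggregations react to a single outlier.

```python
from statistics import mean, median


def duplication_penalties(batch):
    """Hypothetical per-sequence penalty: the fraction of items that are repeats.

    This is an assumed definition for illustration, not the loss used in the model.
    """
    penalties = []
    for items in batch:
        duplicates = len(items) - len(set(items))
        penalties.append(duplicates / len(items))
    return penalties


# Toy batch of recommended item IDs (made up for this example).
batch = [
    [1, 2, 3, 4],   # no repeats
    [5, 6, 6, 7],   # one repeat
    [8, 9, 9, 10],  # one repeat
    [3, 3, 3, 3],   # outlier: three repeats
    [1, 2, 2, 4],   # one repeat
]

penalties = duplication_penalties(batch)

# The outlier sequence raises the mean but leaves the median unchanged.
mean_penalty = mean(penalties)
median_penalty = median(penalties)

print(f"per-sequence penalties: {penalties}")
print(f"mean penalty:   {mean_penalty:.2f}")
print(f"median penalty: {median_penalty:.2f}")
```

With this toy batch, the one heavily duplicated sequence lifts the mean above the penalty of a typical sequence, while the median stays at the typical value.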
Updated Loss Function
This improved the training loss from 5.33 to 5.31 and the validation loss from 5.26 to 5.24.
All the code is available on GitHub.
This post is licensed under CC BY 4.0 by the author.