[1] AUC Optimization vs. Error Rate Minimization, NIPS 2003. citation: 777
[2] On Loss Functions for Deep Neural Networks in Classification, Feb. 18, 2017. citation: 754
[3] On Optimization Methods for Deep Learning, ICML 2011. citation: 1256
[4] On Discriminative vs. Generative Classifiers: A Comparison of Logistic Regression and Naive Bayes, NIPS 2001. citation: 3425
[5] Neural Relational Inference for Interacting Systems, ICML 2018. citation: 755
[6] FaceNet: A Unified Embedding for Face Recognition and Clustering, CVPR 2015. citation: 14125
[7] On Calibration of Modern Neural Networks, Aug. 3, 2017. citation: 4233
[8] Gaussian Error Linear Units (GELUs), 2016 (2023 v5). citation: 3387
[9] Universal Language Model Fine-tuning for Text Classification, ACL 2018. citation: 3814