Reading Group Materials
On the Versatile Uses of Partial Distance Correlation in Deep Learning
Deep Learning without Shortcuts: Shaping the Kernel with Tailored Rectifiers
Energy-Based Learning for Scene Graph Generation
Do Transformers Really Perform Bad for Graph Representation?
Prototypical Contrastive Learning of Unsupervised Representations
Surrogate Gap Minimization Improves Sharpness-Aware Training
Vision Transformer with Deformable Attention