List of CVPR 2021 best paper candidates
2021. 8. 24. 11:18
In the previous post I looked at this year's CVPR best paper.
This time I want to go through the papers that did not win CVPR 2021 Best Paper but were nominated as candidates.
283 | Privacy-Preserving Image Features via Adversarial Affine Subspace Embeddings | Mihai Dusmanu (ETH Zurich); Johannes L Schönberger (Microsoft); Sudipta Sinha (Microsoft); Marc Pollefeys (ETH Zurich / Microsoft) |
415 | Learning Calibrated Medical Image Segmentation via Multi-Rater Agreement Modeling | Wei Ji (University of Alberta); Shuang Yu (Tencent); Junde Wu (Harbin Institute of Technology); Kai Ma (Tencent); Cheng Bian (Tencent); Qi Bi (University of Amsterdam); |
456 | Diffusion Probabilistic Models for 3D Point Cloud Generation | Shitong Luo (Peking University); Wei Hu (Peking University) |
566 | Task Programming: Learning Data Efficient Behavior Representations | Jennifer J. Sun (Caltech); Ann Kennedy (Northwestern University); Eric Zhan (Caltech); David J. Anderson (Caltech); Yisong Yue (Caltech); Pietro Perona (California Institute of Technology) |
902 | PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation | Kehong Gong (National University of Singapore); Jianfeng Zhang (NUS); Jiashi Feng (NUS) |
1058 | SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks | Shunsuke Saito (Facebook); Jinlong Yang (Max Planck Institute for Intelligent Systems); Qianli Ma (Max Planck Institute for Intelligent Systems); Michael J. Black (Max Planck Institute for Intelligent Systems) |
1078 | On Self-Contact and Human Pose | Lea Müller (Max Planck Institute for Intelligent Systems); Ahmed A A Osman (Max Planck Institute for Intelligent Systems); Siyu Tang (ETH Zurich); Chun-Hao Paul Huang (Max Planck Institute for Intelligent Systems); Michael J. Black (Max Planck Institute for Intelligent Systems) |
1269 | Binary TTC: A Temporal Geofence for Autonomous Navigation | Abhishek Badki (University of California, Santa Barbara); Orazio Gallo (NVIDIA Research); Jan Kautz (NVIDIA); Pradeep Sen (UC Santa Barbara) |
1300 | Rethinking and Improving the Robustness of Image Style Transfer | Pei Wang (UC San Diego); Yijun Li (Adobe Research); Nuno Vasconcelos (UC San Diego) |
1704 | Audio-Visual Instance Discrimination with Cross-Modal Agreement | Pedro Morgado (University of California, San Diego); Nuno Vasconcelos (UCSD, USA); Ishan Misra (Facebook AI Research) |
1824 | Point2Skeleton: Learning Skeletal Representations from Point Clouds | Cheng Lin (The University of Hong Kong); Changjian Li (University College London); Yuan Liu (The University of Hong Kong); Nenglun Chen (The University of Hong Kong); Yi King Choi (The University of Hong Kong); Wenping Wang (The University of Hong Kong) |
1929 | Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-Localization in Large Scenes From Body-Mounted Sensors | Vladimir Guzov (Max Planck Institute for Informatics); Aymen Mir (Max Planck Institute of Informatics); Torsten Sattler (Czech Technical University in Prague); Gerard Pons-Moll (MPII, Germany) |
2551 | Where and What? Examining Interpretable Disentangled Representations | Xinqi Zhu (University of Sydney); Chang Xu (University of Sydney); Dacheng Tao (The University of Sydney) |
https://arxiv.org/pdf/2104.05622.pdf | ||
3225 | Learning To Recover 3D Scene Shape From a Single Image | Wei Yin (University of Adelaide); Jianming Zhang (Adobe Research); Oliver Wang (Adobe Systems Inc); Simon Niklaus (Adobe Research); Long T Mai (Adobe Research); Simon Chen (Adobe Research); Chunhua Shen (University of Adelaide) |
3367 | GIRAFFE: Representing Scenes As Compositional Generative Neural Feature Fields | Michael Niemeyer (Max Planck Institute for Intelligent Systems, Tübingen and University of Tübingen); Andreas Geiger (MPI-IS and University of Tuebingen) |
This one is the best paper. I reviewed it last time, so skipping. | ||
3386 | Polygonal Building Extraction by Frame Field Learning | Nicolas Girard (Inria Sophia-Antipolis); Dmitriy Smirnov (MIT); Justin M Solomon (MIT); Yuliya Tarabalka (Inria Sophia-Antipolis) |
https://arxiv.org/pdf/2004.14875.pdf | It seems to extract buildings with actual downstream use of the segmentation results in mind. | |
3433 | NeuralRecon: Real-Time Coherent 3D Reconstruction From Monocular Video | Jiaming Sun (SenseTime); Yiming Xie (SenseTime); Linghao Chen (Zhejiang University); Xiaowei Zhou (Zhejiang University); Hujun Bao (Zhejiang University) |
3592 | CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation | Xingran Zhou (Zhejiang University); Bo Zhang (Microsoft Research Asia); Ting Zhang (MSRA); Pan Zhang (USTC); Jianmin Bao (Microsoft Research Asia); Dong Chen (Microsoft Research Asia); Zhongfei Zhang (Binghamton University); Fang Wen (Microsoft Research Asia) |
https://openaccess.thecvf.com/content/CVPR2021/papers/Zhou_CoCosNet_v2_Full-Resolution_Correspondence_Learning_for_Image_Translation_CVPR_2021_paper.pdf | Seems to be a paper that does image-to-image translation well. Apparently it uses a patch match technique to refine the correspondence step by step... | |
4263 | Less Is More: ClipBERT for Video-and-Language Learning via Sparse Sampling | Jie Lei (UNC Chapel Hill); Linjie Li (Microsoft); Luowei Zhou (Microsoft); Zhe Gan (Microsoft); Tamara Berg (UNC Chapel Hill, USA); Mohit Bansal (University of North Carolina at Chapel Hill); Jingjing Liu (Microsoft) |
https://arxiv.org/pdf/2102.06183.pdf | As far as I know, this paper was mentioned a lot in areas like VQA that need both vision and NLP, but I'm not interested in NLP, so skipping... | |
4286 | Neural Body: Implicit Neural Representations With Structured Latent Codes for Novel View Synthesis of Dynamic Humans | Sida Peng (Zhejiang University); Yuanqing Zhang (Zhejiang University); Yinghao Xu (Chinese University of Hong Kong); Qianqian Wang (Cornell); Qing Shuai (Zhejiang University); Hujun Bao (Zhejiang University); Xiaowei Zhou (Zhejiang University) |
https://arxiv.org/pdf/2012.15838.pdf | This is in the view synthesis area. To deal with the ill-posed problem, it is said to build a 'structured' latent code shared by all video frames. It proposes something called Neural Body, which learns a structured neural representation of the human body. From a quick skim, there seems to be a model for color and density... |
4418 | Exploring Simple Siamese Representation Learning | Xinlei Chen (FAIR); Kaiming He (Facebook AI Research) |
A teammate said they will present this one next time, so skipping. | ||
4551 | Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps | Yuk Heo (Korea University); Yeong Jun Koh (Chungnam National University); Chang-Su Kim (Korea university) |
4877 | GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving | Yun Chen (Uber ATG); Frieda Rong (Uber ATG); Shivam Duggal (Delhi Technological University); Shenlong Wang (Uber ATG, University of Toronto); Xinchen Yan (Uber ATG); Sivabalan Manivasagam (University of Toronto); Shangjie Xue (MIT); Ersin Yumer (Uber ATG); Raquel Urtasun (Uber ATG) |
4945 | Neural Lumigraph Rendering | Petr Kellnhofer (Stanford University); Lars C Jebe (Raxium); Andrew Jones (Raxium); Ryan Spicer (Raxium); Kari Pulli (University of Oulu); Gordon Wetzstein (Stanford University) |
5291 | Event-Based Synthetic Aperture Imaging With a Hybrid Network | Xiang Zhang (Wuhan University); Wei Liao (WuHan University); Lei Yu (Wuhan University); Wen Yang (Wuhan University); Gui-Song Xia (Wuhan University) |
5562 | Energy-Based Learning for Scene Graph Generation | Mohammed Suhail (University of British Columbia); Abhay Mittal (Amazon); Behjat Siddiquie (Amazon); Christopher Broaddus (Amazon); Jayan Eledath (Amazon); gerard medioni (USC); Leonid Sigal (University of British Columbia) |
6333 | Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos | Yasamin Jafarian (University of Minnesota); Hyun Soo Park (The University of Minnesota) |
7124 | MP3: A Unified Model To Map, Perceive, Predict and Plan | Sergio Casas (Uber ATG / University of Toronto); Abbas Sadat (Uber ATG); Raquel Urtasun (Uber ATG) |
8458 | NeX: Real-Time View Synthesis With Neural Basis Expansion | Suttisak Wizadwongsa (Vidyasirimedhi Institute of Science and Technology); Pakkapon Phongthawee (Vidyasirimedhi Institute of Science and Technology); Jiraphon Yenphraphai (Vidyasirimedhi Institute of Science and Technology); Supasorn Suwajanakorn (Vidyasirimedhi Institute of Science and Technology) |
8866 | NewtonianVAE: Proportional Control and Goal Identification From Pixels via Physical Latent Spaces | Miguel Jaques (University of Edinburgh); Michael Burke (Monash University); Timothy Hospedales (Edinburgh University) |
https://openaccess.thecvf.com/content/CVPR2021/papers/Jaques_NewtonianVAE_Proportional_Control_and_Goal_Identification_From_Pixels_via_Physical_CVPR_2021_paper.pdf | It seems to learn some kind of controllable latent space, but I could not get through the text. | |
10237 | Fast End-to-End Learning on Protein Surfaces | Freyr Sverrisson (EPFL); Jean Feydy (Imperial College London); Bruno Correia (EPFL); Michael Bronstein (Imperial College London / Twitter) |
10509 | Real-Time High-Resolution Background Matting | Shanchuan Lin (University of Washington); Andrey Ryabtsev (University of Washington); Soumyadip Sengupta (University of Washington); Brian Curless (University of Washington); Steve Seitz (University of Washington); Ira Kemelmacher-Shlizerman (University of Washington) |