
CVPR 2021 best paper candidates list

by 포숑은 맛있어 2021. 8. 24.

In my last post I went over this year's CVPR best paper,

so this time I want to take a quick look at the papers that were nominated for CVPR 2021 Best Paper but did not win.

 

283 Privacy-Preserving Image Features via Adversarial Affine Subspace Embeddings Mihai Dusmanu (ETH Zurich); Johannes L Schönberger (Microsoft); Sudipta Sinha (Microsoft); Marc Pollefeys (ETH Zurich / Microsoft)
     
415 Learning Calibrated Medical Image Segmentation via Multi-Rater Agreement Modeling Wei Ji (University of Alberta); Shuang Yu (Tencent); Junde Wu (Harbin Institute of Technology); Kai Ma (Tencent); Cheng Bian (Tencent); Qi Bi (University of Amsterdam); 
     
456 Diffusion Probabilistic Models for 3D Point Cloud Generation Shitong Luo (Peking University); Wei Hu (Peking University)
     
566 Task Programming: Learning Data Efficient Behavior Representations Jennifer J. Sun (Caltech); Ann Kennedy (Northwestern University); Eric Zhan (Caltech); David J. Anderson (Caltech); Yisong Yue (Caltech); Pietro Perona (California Institute of Technology)
     
902 PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation Kehong Gong (National University of Singapore); Jianfeng Zhang (NUS); Jiashi Feng (NUS)
     
1058 SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks Shunsuke Saito (Facebook); Jinlong Yang (Max Planck Institute for Intelligent Systems); Qianli Ma (Max Planck Institute for Intelligent Systems); Michael J. Black (Max Planck Institute for Intelligent Systems)
     
1078 On Self-Contact and Human Pose Lea Müller (Max Planck Institute for Intelligent Systems); Ahmed A A Osman (Max Planck Institute for Intelligent Systems); Siyu Tang (ETH Zurich); Chun-Hao Paul Huang (Max Planck Institute for Intelligent Systems); Michael J. Black (Max Planck Institute for Intelligent Systems)
     
1269 Binary TTC: A Temporal Geofence for Autonomous Navigation Abhishek Badki (University of California, Santa Barbara); Orazio Gallo (NVIDIA Research); Jan Kautz (NVIDIA); Pradeep Sen (UC Santa Barbara)
     
1300 Rethinking and Improving the Robustness of Image Style Transfer Pei Wang (UC San Diego); Yijun Li (Adobe Research); Nuno Vasconcelos (UC San Diego)
     
1704 Audio-Visual Instance Discrimination with Cross-Modal Agreement Pedro Morgado (University of California, San Diego); Nuno Vasconcelos (UCSD, USA); Ishan Misra (Facebook AI Research)
     
1824 Point2Skeleton: Learning Skeletal Representations from Point Clouds Cheng Lin (The University of Hong Kong); Changjian Li (University College London); Yuan Liu (The University of Hong Kong); Nenglun Chen (The University of Hong Kong); Yi King Choi (The University of Hong Kong); Wenping Wang (The University of Hong Kong)
     
1929 Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-Localization in Large Scenes From Body-Mounted Sensors Vladimir Guzov (Max Planck Institute for Informatics); Aymen Mir (Max Planck Institute of Informatics); Torsten Sattler (Czech Technical University in Prague); Gerard Pons-Moll (MPII, Germany)
     
2551 Where and What? Examining Interpretable Disentangled Representations Xinqi Zhu (University of Sydney); Chang Xu (University of Sydney); Dacheng Tao (The University of Sydney)
  https://arxiv.org/pdf/2104.05622.pdf  
3225 Learning To Recover 3D Scene Shape From a Single Image Wei Yin (University of Adelaide); Jianming Zhang (Adobe Research); Oliver Wang (Adobe Systems Inc); Simon Niklaus (Adobe Research); Long T Mai (Adobe Research); Simon Chen (Adobe Research); Chunhua Shen (University of Adelaide)
     
3367 GIRAFFE: Representing Scenes As Compositional Generative Neural Feature Fields Michael Niemeyer (Max Planck Institute for Intelligent Systems, Tübingen and University of Tübingen); Andreas Geiger (MPI-IS and University of Tuebingen)
    This is the best paper. I reviewed it last time, so skipping.
3386 Polygonal Building Extraction by Frame Field Learning Nicolas Girard (Inria Sophia-Antipolis); Dmitriy Smirnov (MIT); Justin M Solomon (MIT); Yuliya Tarabalka (Inria Sophia-Antipolis)
  https://arxiv.org/pdf/2004.14875.pdf It seems to extract buildings with the use of the segmentation results in actual downstream tasks in mind.
3433 NeuralRecon: Real-Time Coherent 3D Reconstruction From Monocular Video Jiaming Sun (SenseTime); Yiming Xie (SenseTime); Linghao Chen (Zhejiang University); Xiaowei Zhou (Zhejiang University); Hujun Bao (Zhejiang University)
     
3592 CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation Xingran Zhou (Zhejiang University); Bo Zhang (Microsoft Research Asia); Ting Zhang (MSRA); Pan Zhang (USTC); Jianmin Bao (Microsoft Research Asia); Dong Chen (Microsoft Research Asia); Zhongfei Zhang (Binghamton University); Fang Wen (Microsoft Research Asia)
  https://openaccess.thecvf.com/content/CVPR2021/papers/Zhou_CoCosNet_v2_Full-Resolution_Correspondence_Learning_for_Image_Translation_CVPR_2021_paper.pdf Seems to be a paper that does image-to-image translation well. Apparently it uses a PatchMatch technique to refine the matching step by step...
4263 Less Is More: ClipBERT for Video-and-Language Learning via Sparse Sampling Jie Lei (UNC Chapel Hill); Linjie Li (Microsoft); Luowei Zhou (Microsoft); Zhe Gan (Microsoft); Tamara Berg (UNC Chapel Hill, USA); Mohit Bansal (University of North Carolina at Chapel Hill); Jingjing Liu (Microsoft)
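  https://arxiv.org/pdf/2102.06183.pdf As far as I know, this paper was mentioned a lot in areas like VQA where vision and NLP have to be handled together, but I'm not interested in NLP, so skipping...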
4286 Neural Body: Implicit Neural Representations With Structured Latent Codes for Novel View Synthesis of Dynamic Humans Sida Peng (Zhejiang University); Yuanqing Zhang (Zhejiang University); Yinghao Xu (Chinese University of Hong Kong); Qianqian Wang (Cornell); Qing Shuai (Zhejiang University); Hujun Bao (Zhejiang University); Xiaowei Zhou (Zhejiang University)
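  https://arxiv.org/pdf/2012.15838.pdf First of all, this is in the view synthesis area.
To deal with the ill-posed problem, it builds a 'structured' latent code that all video frames share. It proposes something called Neural Body, which learns a structured neural representation of the human body. From a quick look, the model seems to predict color and density...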
4418 Exploring Simple Siamese Representation Learning Xinlei Chen (FAIR); Kaiming He (Facebook AI Research)
    A teammate said they will present this one next time, so skipping.
4551 Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps Yuk Heo (Korea University); Yeong Jun Koh (Chungnam National University); Chang-Su Kim (Korea University)
     
4877 GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving Yun Chen (Uber ATG); Frieda Rong (Uber ATG); Shivam Duggal (Delhi Technological University); Shenlong Wang (Uber ATG, University of Toronto); Xinchen Yan (Uber ATG); Sivabalan Manivasagam (University of Toronto); Shangjie Xue (MIT); Ersin Yumer (Uber ATG); Raquel Urtasun (Uber ATG)
     
4945 Neural Lumigraph Rendering Petr Kellnhofer (Stanford University); Lars C Jebe (Raxium); Andrew Jones (Raxium); Ryan Spicer (Raxium); Kari Pulli (University of Oulu); Gordon Wetzstein (Stanford University)
     
5291 Event-Based Synthetic Aperture Imaging With a Hybrid Network Xiang Zhang (Wuhan University); Wei Liao (Wuhan University); Lei Yu (Wuhan University); Wen Yang (Wuhan University); Gui-Song Xia (Wuhan University)
     
5562 Energy-Based Learning for Scene Graph Generation Mohammed Suhail (University of British Columbia); Abhay Mittal (Amazon); Behjat Siddiquie (Amazon); Christopher Broaddus (Amazon); Jayan Eledath (Amazon); Gerard Medioni (USC); Leonid Sigal (University of British Columbia)
     
6333 Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos Yasamin Jafarian (University of Minnesota); Hyun Soo Park (The University of Minnesota)
     
7124 MP3: A Unified Model To Map, Perceive, Predict and Plan Sergio Casas (Uber ATG / University of Toronto); Abbas Sadat (Uber ATG); Raquel Urtasun (Uber ATG)
     
8458 NeX: Real-Time View Synthesis With Neural Basis Expansion Suttisak Wizadwongsa (Vidyasirimedhi Institute of Science and Technology); Pakkapon Phongthawee (Vidyasirimedhi Institute of Science and Technology); Jiraphon Yenphraphai (Vidyasirimedhi Institute of Science and Technology); Supasorn Suwajanakorn (Vidyasirimedhi Institute of Science and Technology)
     
8866 NewtonianVAE: Proportional Control and Goal Identification From Pixels via Physical Latent Spaces Miguel Jaques (University of Edinburgh); Michael Burke (Monash University); Timothy Hospedales (Edinburgh University)
  https://openaccess.thecvf.com/content/CVPR2021/papers/Jaques_NewtonianVAE_Proportional_Control_and_Goal_Identification_From_Pixels_via_Physical_CVPR_2021_paper.pdf It seems to learn some kind of controllable latent space, but I just can't get through the text.
10237 Fast End-to-End Learning on Protein Surfaces Freyr Sverrisson (EPFL); Jean Feydy (Imperial College London); Bruno Correia (EPFL); Michael Bronstein (Imperial College London / Twitter)
     
10509 Real-Time High-Resolution Background Matting Shanchuan Lin (University of Washington); Andrey Ryabtsev (University of Washington); Soumyadip Sengupta (University of Washington); Brian Curless (University of Washington); Steve Seitz (University of Washington); Ira Kemelmacher-Shlizerman (University of Washington)

 

 

 
