Main Conference Provisional Schedule

Tuesday 25th

Orals 1 (Tue. am)

Detection, Recognition, Classification, and Localization in 2D/3D

640 Long-tail Detection with Effective Class-Margins Jang Hyun Cho (The University of Texas at Austin)*; Philipp Kraehenbuehl (UT Austin)
1448 Multimodal Object Detection via Probabilistic Ensembling Yi-Ting Chen (University of Maryland); Jinghao Shi (Carnegie Mellon University); Zelin Ye (CMU); Christoph Mertz (CMU); Deva Ramanan (Carnegie Mellon University); Shu Kong (Carnegie Mellon University)*
1791 Improving Robustness by Enhancing Weak Subnets Yong Guo (Max Planck Institute for Informatics)*; David Stutz (Max Planck Institute for Informatics); Bernt Schiele (MPI Informatics)
2179 Adversarially-Aware Robust Object Detector ZiYi Dong (Sun Yat-Sen University)*; Pengxu Wei (Sun Yat-sen University); Liang Lin (Sun Yat-sen University)
2723 Fine-Grained Scene Graph Generation with Data Transfer Ao Zhang (National University of Singapore)*; Yuan Yao (Tsinghua University); Qianyu Chen (Tsinghua University); Wei Ji (National University of Singapore); Zhiyuan Liu (Tsinghua University); Maosong Sun (Tsinghua University); Tat-Seng Chua (National University of Singapore)
6108 Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial Keypoint Voting Yangzheng Wu (Queen’s University)*; Mohsen Zand (Queen’s University); Ali Etemad (Queen’s University); Michael Alan Greenspan (Queen’s University)

Orals 2 (Tue. am)

Motion and Tracking

561 Particle Video Revisited: Tracking Through Occlusions Using Point Trajectories Adam Harley (Carnegie Mellon University)*; Zhaoyuan Fang (Carnegie Mellon University); Katerina Fragkiadaki (Carnegie Mellon University)
2385 A Perturbation-Constrained Adversarial Attack for Evaluating the Robustness of Optical Flow Jenny Schmalfuss (University of Stuttgart)*; Philipp Scholze (University of Stuttgart); Andrés Bruhn (University of Stuttgart)
2623 Social-SSL: Self-Supervised Cross-Sequence Representation Learning Based on Transformers for Multi-Agent Trajectory Prediction Li-Wu Tsao (National Chiao Tung University)*; Yan-Kai Wang (National Chiao Tung University); Hao-Siang Lin (National Chiao Tung University); Hong-Han Shuai (National Yang Ming Chiao Tung University); Lai-Kuan Wong (Multimedia University); Wen-Huang Cheng (National Chiao Tung University)
2874 Diverse Human Motion Prediction Guided by Multi-Level Spatial-Temporal Anchors Sirui Xu (University of Illinois Urbana-Champaign)*; Yu-Xiong Wang (University of Illinois at Urbana-Champaign); Liangyan Gui (University of Illinois Urbana-Champaign)
4806 TEMOS: Generating diverse human motions from textual descriptions Mathis Petrovich (Ecole des Ponts)*; Michael Black (Max Planck Institute for Intelligent Systems); Gul Varol (Ecole des Ponts ParisTech)
7092 PREF: Predictability Regularized Neural Motion Fields Liangchen Song (University at Buffalo)*; Xuan Gong (University at Buffalo); Benjamin Planche (United Imaging Intelligence); Meng Zheng (United Imaging Intelligence); David Doermann (University at Buffalo); Junsong Yuan (State University of New York at Buffalo, USA); Terrence Chen (United Imaging Intelligence); Ziyan Wu (United Imaging Intelligence)

Posters 1 (Tue. early)

154 Tip-Adapter: Training-free Adaption of CLIP for Few-shot Classification Renrui Zhang (Shanghai AI Lab)*; Zhang Wei (Shanghai AI-Lab); Rongyao Fang (Chinese University of Hong Kong); Peng Gao (Chinese University of Hong Kong); Kunchang Li (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Jifeng Dai (Tsinghua University); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Hongsheng Li (The Chinese University of Hong Kong)
222 Panoptic Scene Graph Generation Jingkang Yang (Nanyang Technological University)*; Yi Zhe Ang (Nanyang Technological University); Zujin GUO (Nanyang Technological University); Kaiyang Zhou (Nanyang Technological University); Wayne Zhang (SenseTime Research); Ziwei Liu (Nanyang Technological University)
458 Online Segmentation of LiDAR Sequences: Dataset and Algorithm Romain Loiseau (École des ponts ParisTech)*; Mathieu Aubry (École des ponts ParisTech); Loic Landrieu (IGN)
514 PolarMOT: How far can geometric relations take us in 3D multi-object tracking? Aleksandr Kim (Technical University of Munich); Guillem Brasó (TUM); Aljosa Osep (TUM Munich)*; Laura Leal-Taixé (TUM)
744 Streamable Neural Fields Junwoo Cho (Sungkyunkwan University)*; Seungtae Nam (Sungkyunkwan University); Daniel Rho (Sungkyunkwan University); Jong Hwan Ko (Sungkyunkwan University); Eunbyung Park (Sungkyunkwan University)
762 Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification Yang Liu (Beihang University); Lei Zhou (Wuhan University of Technology)*; Pengcheng Zhang (Beihang University); Xiao Bai (Beihang University); Lin Gu (RIKEN, AIP / The University of Tokyo); Xiaohan Yu (Griffith University); Jun Zhou (Griffith University); Edwin Hancock (University of York, UK)
776 Mind the Gap in Distilling StyleGANs Guodong Xu (The Chinese University of Hong Kong)*; Yuenan HOU (Shanghai AI Lab); Ziwei Liu (Nanyang Technological University); Chen Change Loy (Nanyang Technological University)
812 DaViT: Dual Attention Vision Transformers Mingyu Ding (The University of Hong Kong)*; Bin Xiao (Microsoft); Noel C Codella (Microsoft); Ping Luo (The University of Hong Kong); Jingdong Wang (Baidu); Lu Yuan (Microsoft)
1049 3D Human Pose Estimation Using Möbius Graph Convolutional Networks Niloofar Azizi (ICG department of TU Graz)*; Horst Possegger (Graz University of Technology); Emanuele Rodola (Sapienza University of Rome); Horst Bischof (Graz University of Technology)
1114 CramNet: Camera-Radar Fusion with Ray-Constrained Cross-Attention for Robust 3D Object Detection Jyh-Jing Hwang (Waymo)*; Henrik Kretzschmar (Waymo); Joshua M Manela (Waymo); Sean Rafferty (Waymo); Nicholas Armstrong-Crews (Waymo); Tiffany Chen (Waymo); Dragomir Anguelov (Waymo)
1119 Latent Discriminant deterministic Uncertainty Gianni Franchi (ENSTA Paris)*; Xuanlong Yu (ENSTA Paris); Andrei Bursuc (valeo.ai); Emanuel Aldea (Paris-Saclay University); Severine Dubuisson (Aix-Marseille University); David Filliat (ENSTA Paris)
1209 Masked Discrimination for Self-Supervised Learning on Point Clouds Haotian Liu (University of Wisconsin-Madison)*; Mu Cai (University of Wisconsin-Madison); Yong Jae Lee (University of Wisconsin-Madison)
1344 AgeTransGAN for Facial Age Transformation with Rectified Performance Metrics Gee-Sern Hsu (National Taiwan University of Science and Technology)*; Rui-Cang Xie (National Taiwan University of Science and Technology); Zhi-Ting Chen (National Taiwan University of Science and Technology); Yu-Hong Lin (National Taiwan University of Science and Technology)
1366 Adaptive Fine-Grained Sketch-Based Image Retrieval Ayan Kumar Bhunia (University of Surrey)*; Aneeshan Sain (University of Surrey); Parth Hiren Shah (Indian Institute of Technology Guwahati); Animesh Gupta (Thapar University); Pinaki Nath Chowdhury (University of Surrey); Tao Xiang (University of Surrey); Yi-Zhe Song (University of Surrey)
1469 ARAH: Animatable Volume Rendering of Articulated Human SDFs Shaofei Wang (ETH Zurich)*; Katja Schwarz (MPI Tuebingen); Andreas Geiger (University of Tuebingen); Siyu Tang (ETH Zurich)
1542 A Perceptual Quality Metric for Video Frame Interpolation Qiqi Hou (Portland State University)*; Abhijay Ghildyal (Portland State University); Feng Liu (Portland State University)
1549 AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction Zerui Chen (Inria Paris); Yana Hasson (Inria); Cordelia Schmid (Inria/Google)*; Ivan Laptev (INRIA Paris)
1596 DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition Yuxuan Liang (National University of Singapore)*; Pan Zhou (Sea AI Lab); Roger Zimmermann (NUS); Shuicheng Yan (Sea AI Labs)
1625 ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO Sanghyuk Chun (NAVER AI Lab)*; Wonjae Kim (NAVER AI Lab); Song Park (NAVER AI Lab); Minsuk Chang (NAVER AI Lab); Seong Joon Oh (Naver AI Lab)
1628 Learning to Detect Every Thing in an Open World Kuniaki Saito (Boston University)*; Ping Hu (Boston University); Trevor Darrell (UC Berkeley); Kate Saenko (Boston University)
1762 MoDA: Map style transfer for self-supervised Domain Adaptation of embodied agents Eun Sun Lee (Seoul National University)*; Junho Kim (Seoul National University); SangWon Park (Seoul Nat’l University); Young Min Kim (Seoul National University)
1980 How Severe is Benchmark-Sensitivity in Video Self-Supervised Learning? Fida Mohammad Thoker (University of Amsterdam)*; Hazel Doughty (University of Amsterdam); Piyush Nitin Bagad (University of Amsterdam); Cees Snoek (University of Amsterdam)
2312 Data Efficient 3D Learner via Knowledge Transferred from 2D Model Ping-Chung Yu (National Tsing Hua University)*; Cheng Sun (National Tsing Hua University); Min Sun (NTHU)
2350 IntereStyle: Encoding an Interest Region for Robust StyleGAN Inversion Seung Jun Moon (KAIST)*; Gyeong-Moon Park (Kyung Hee University)
2379 You Should Look at All Objects Zhenchao Jin (University of Science and Technology of China)*; Dongdong Yu (ByteDance Inc.); Luchuan Song (University of Science and Technology of China); Zehuan Yuan (Bytedance.Inc); Lequan Yu (The University of Hong Kong)
2418 Multi-Person 3D Pose and Shape Estimation via Inverse Kinematics and Refinement Junuk Cha (UNIST)*; Muhammad Saqlain (Ulsan National Institute of Science and Technology); GeonU Kim (UNIST); Mingyu Shin (ULSAN NATIONAL INSTITUTE OF SCIENCE AND TECHNOLOGY); Seungryul Baek (UNIST)
2538 Dual Perspective Network for Audio Visual Event Localization Varshanth Rao (Huawei Technologies)*; Md Ibrahim Khalil (Huawei Noah’s Ark Laboratory); Haoda Li (University of California, Berkeley); Peng Dai (Huawei Technologies Inc.Canada); Juwei Lu (Huawei Noah’s Ark Lab)
2649 Context Enhanced Stereo Transformer Weiyu Guo (University of Chinese Academy of Sciences)*; Zhaoshuo Li (Johns Hopkins University); Yongkui Yang (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences); Zheng Wang (Shenzhen Institutes of Advanced Technology); Russ Taylor (Johns Hopkins University); Mathias Unberath (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Yingwei Li (Johns Hopkins University)
2740 A Style-Based GAN Encoder for High Fidelity Reconstruction of Images and Videos Xu YAO (Telecom ParisTech)*; Alasdair Newson (Telecom Paris); Yann Gousseau (Telecom Paris); PIERRE HELLIER (Interdigital (Technicolor))
2814 Diverse Image Inpainting with Normalizing Flow Cairong Wang (Graduate School at Shenzhen, Tsinghua University)*; Yiming M Zhu (Graduate School at Shenzhen, Tsinghua University); Chun Yuan (Graduate School at Shenzhen, Tsinghua University)
2818 Video Activity Localisation with Uncertainties in Temporal Boundary Jiabo Huang (Queen Mary University of London)*; Hailin Jin (Adobe Research); Shaogang Gong (Queen Mary University of London); Yang Liu (Peking University)
2863 BlobGAN: Spatially Disentangled Scene Representations Dave Epstein (UC Berkeley)*; Taesung Park (Adobe Research); Richard Zhang (Adobe); Eli Shechtman (Adobe Research, US); Alexei A Efros (UC Berkeley)
3116 Towards Interpretable Video Super-Resolution via Alternating Optimization Jiezhang Cao (ETH Zürich)*; Jingyun Liang (ETH Zurich); Kai Zhang (ETH Zurich); Wenguan Wang (Eidgenössische Technische Hochschule Zürich); Qin Wang (ETH Zurich); Yulun Zhang (ETH Zurich); Hao Tang (ETH Zurich); Luc Van Gool (ETH Zurich)
3390 SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness Jindong Gu (University of Munich)*; Hengshuang Zhao (University of Oxford); Volker Tresp (Siemens AG and Ludwig Maximilian University of Munich); Philip Torr (University of Oxford)
3685 VecGAN: Image-to-Image Translation with Interpretable Latent Directions Yusuf Dalva (Bilkent University); Said F Altındiş (Bilkent University); Aysegul Dundar (Bilkent University)*
3719 PartImageNet: A Large, High-Quality Dataset of Parts Ju He (Johns Hopkins University)*; Shuo Yang (University of Technology Sydney); Shaokang Yang (ByteDance); Adam Kortylewski (Max Planck Institute for Informatics); Xiaoding Yuan (Johns Hopkins University); Jie-Neng Chen (Johns Hopkins University); Shuai Liu (ByteDance Inc.); Cheng Yang (ByteDance Inc.); Qihang Yu (Johns Hopkins University); Alan Yuille (Johns Hopkins University)
3870 Point Cloud Compression with Sibling Context and Surface Priors Zhili CHEN (HKUST); Zian Qian (HKUST); Sukai Wang (HKUST); Qifeng Chen (HKUST)*
3904 CANF-VC: Conditional Augmented Normalizing Flows for Video Compression Yung-Han Ho (NCTU); Chih-Peng Chang (National Chiao Tung University); Peng-Yu Chen (NYCU); Alessandro Gnutti (University of Brescia); Wen-Hsiao Peng (National Yang Ming Chiao Tung University)*
4022 MotionCLIP: Exposing Human Motion Generation to CLIP Space Guy Tevet (Tel Aviv University)*; Brian Gordon (Tel Aviv University); Amir Hertz (Tel Aviv University); Amit H Bermano (Tel-Aviv University); Danny Cohen-Or (Tel Aviv University)
4217 Learning Audio-Video Modalities from Image Captions Arsha Nagrani (Google)*; Paul Hongsuck Seo (Google); Bryan Seybold (Google); Anja Hauth (Google AI); Santiago Manen (Google); Chen Sun (Brown University); Cordelia Schmid (Google)
4220 Inverted Pyramid Multi-task Transformer for Dense Scene Understanding Hanrong Ye (The Hong Kong University of Science and Technology)*; Dan Xu (The Hong Kong University of Science and Technology)
4359 A Fast Knowledge Distillation Framework for Visual Recognition Zhiqiang Shen (Carnegie Mellon University)*; Eric Xing (MBZUAI, CMU, and Petuum Inc.)
4593 Cross-Domain Ensemble Distillation for Domain Generalization Kyungmoon Lee (POSTECH)*; Sungyeon Kim (POSTECH); Suha Kwak (POSTECH)
4675 Learning Implicit Feature Alignment Function for Semantic Segmentation Hanzhe Hu (Carnegie Mellon University)*; Yinbo Chen (UC San Diego); Jiarui Xu (University of California San Diego); Shubhankar Borse (Qualcomm AI Research); Hong Cai (Qualcomm AI Research); Fatih Porikli (Qualcomm AI Research); Xiaolong Wang (UCSD)
4715 ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild Wang Zhao (Tsinghua University)*; Shaohui Liu (ETH Zurich); Hengkai Guo (ByteDance AI Lab); Wenping Wang (The University of Hong Kong); Yong-Jin Liu (Tsinghua University)
4832 AdvDO: Realistic Adversarial Attacks for Trajectory Prediction Yulong Cao (University of Michigan, Ann Arbor)*; Chaowei Xiao (NVIDIA); Anima Anandkumar (NVIDIA/Caltech); Danfei Xu (Stanford University); Marco Pavone (Stanford University)
4857 Self-Supervised Classification Network Elad Amrani (IBM / Technion)*; Leonid Karlinsky (IBM-Research); Alex Bronstein (Technion)
4883 DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks Shih-Yang Su (University of British Columbia)*; Timur Bagautdinov (Facebook); Helge Rhodin (UBC)
4927 Learning to Fit Morphable Models Vasileios Choutas (ETH Zurich)*; Federica Bogo (Meta); Jingjing Shen (Microsoft); Julien Valentin (Microsoft)
4998 NeRF for Outdoor Scene Relighting Viktor Rudnev (Max Planck Institute for Informatics)*; Mohamed Elgharib (Max Planck Institute for Informatics); William Smith (University of York); Lingjie Liu (Max Planck Institute for Informatics); Vladislav Golyanik (MPI for Informatics); Christian Theobalt (MPI Informatik)
5001 FusionVAE: A Deep Hierarchical Variational Autoencoder for RGB Image Fusion Fabian Duffhauss (Bosch Center for Artificial Intelligence)*; Vien Anh Ngo (Bosch Center for Artificial Intelligence); Hanna Ziesche (Bosch Center for AI); Gerhard Neumann (Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany)
5032 Instance As Identity: A Generic Online Paradigm for Video Instance Segmentation Feng Zhu (University of Technology Sydney)*; Zongxin Yang (Zhejiang University); Xin Yu (University of Technology Sydney); Yi Yang (Zhejiang University); Yunchao Wei (UTS)
5049 RBC: Rectifying the Biased Context in Continual Semantic Segmentation Hanbin Zhao (Zhejiang University)*; Fengyu Yang (University of Michigan); Xinghe Fu (Zhejiang University); Xi Li (Zhejiang University)
5172 Active Audio-Visual Separation of Dynamic Sound Sources Sagnik Majumder (University of Texas at Austin)*; Kristen Grauman (Facebook AI Research & UT Austin)
5272 The Fish Counting Dataset: A Benchmark for Multiple Object Tracking and Counting Justin Kay (Caltech, Ai.Fish); Peter Kulits (Caltech); Suzanne C Stathatos (Caltech); Siqi Deng (Amazon); Erik Young (Trout Unlimited); Sara M Beery (Caltech); Grant Van Horn (Cornell University)*; Pietro Perona (California Institute of Technology)
5293 DeepMend: Learning Occupancy Functions to Represent Shape for Repair Nikolas Lamb (Clarkson University)*; Sean Banerjee (Clarkson University); Natasha Kholgade Banerjee (Clarkson University)
5315 Intelli-Paint: Towards Developing More Human-Intelligible Painting Agents Jaskirat Singh (Australian National University)*; Cameron Y Smith (Adobe Research); Jose Echevarria (Adobe System Inc.); Liang Zheng (Australian National University)
5317 Rethinking Few-Shot Object Detection on A Multi-Domain Benchmark Kibok Lee (Yonsei University); Hao Yang (Amazon)*; Satyaki Chakraborty (Amazon); Zhaowei Cai (Amazon); Gurumurthy Swaminathan (Amazon); Avinash Ravichandran (Amazon); Onkar Dabeer (Amazon)
5372 A Repulsive Force Unit for Garment Collision Handling in Neural Networks Qingyang Tan (UMD)*; Yi Zhou (Adobe Research); Tuanfeng Wang (Adobe Research); Duygu Ceylan (Adobe Research); Xin Sun (Adobe Research); Dinesh Manocha (University of Maryland at College Park)
5396 Motion Transformer for Unsupervised Image Animation Jiale Tao (University of Electronic Science and Technology of China)*; Biao Wang (Alibaba Group); Tiezheng Ge (Alibaba Group); Yuning Jiang (Alibaba Group); Wen Li (University of Electronic Science and Technology of China); Lixin Duan (University of Electronic Science and Technology of China)
5496 Multi-Granularity Pruning for Model Acceleration on Mobile Devices Tianli Zhao (Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences); Xi Sheryl Zhang (Institute of Automation, Chinese Academy of Sciences); Wentao Zhu (Amazon); Jiaxing Wang (Institute of Automation, Chinese Academy of Sciences); Sen Yang (Kuaishou); Ji Liu (Kwai Inc.); Jian Cheng (Chinese Academy of Sciences, China)*
5549 Decoupled Contrastive Learning Chun-Hsiao Yeh (Academia Sinica / UC Berkeley)*; Cheng-Yao Hong (Academia Sinica); Yen-Chi Hsu (Academia Sinica); Tyng-Luh Liu (Academia Sinica); Yubei Chen (Berkeley AI Research, UC Berkeley); Yann LeCun (Facebook)
5770 ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and Pose Optimization Muhammad Zubair Irshad (Georgia Institute of Technology)*; Sergey Zakharov (Toyota Research Institute); Rareș A Ambruș (Toyota Research Institute); Thomas Kollar (Toyota Research Institute); Zsolt Kira (Georgia Institute of Technology); Adrien Gaidon (Toyota Research Institute)
5907 Domain Knowledge-Informed Self-Supervised Representations for Workout Form Assessment Paritosh Parmar (University of British Columbia)*; Amol Gharat (Flex A.I.); Helge Rhodin (UBC)
5940 SPSN: Superpixel Prototype Sampling Network for RGB-D Salient Object Detection Minhyeok Lee (Yonsei University)*; Chaewon Park (Yonsei University); Suhwan Cho (Yonsei University); Sangyoun Lee (Yonsei University)
5988 Correspondence Reweighted Translation Averaging Lalit Manam (Indian Institute of Science Bengaluru)*; Venu Madhav Govindu (Indian Institute of Science)
6067 3D Equivariant Graph Implicit Functions Yunlu Chen (University of Amsterdam)*; Basura Fernando (Agency for Science, Technology and Research, A*STAR, Singapore); Hakan Bilen (University of Edinburgh); Matthias Niessner (Technical University of Munich); Efstratios Gavves (University of Amsterdam)
6203 PatchRD: Detail-Preserving Shape Completion by Learning Patch Retrieval and Deformation Bo Sun (UT Austin)*; Vladimir Kim (Adobe); Qixing Huang (The University of Texas at Austin); Noam Aigerman (Adobe); Siddhartha Chaudhuri (Adobe Research)
6454 Towards Accurate Network Quantization with Equivalent Smooth Regularizers Kirill Solodskikh (Huawei Noah’s Ark Lab, MSU)*; Vladimir Chikin (Huawei Noah’s Ark Lab); Ruslan Aydarkhanov (Huawei Noah’s Ark Lab); Dehua Song (Huawei Noah’s Ark Lab); Irina Zhelavskaya (Skolkovo Institute of Science and Technology (Skoltech)); Jiansheng Wei (Huawei Technologies Co. Ltd.)
6475 Explicit Model Size Control and Relaxation via Smooth Regularization for Mixed-Precision Quantization Vladimir Chikin (Huawei Noah’s Ark Lab)*; Kirill Solodskikh (Huawei Noah’s Ark Lab, MSU); Irina Zhelavskaya (Skolkovo Institute of Science and Technology (Skoltech))
6512 BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks Han-Byul Kim (Seoul National University)*; Eunhyeok Park (POSTECH); Sungjoo Yoo (Seoul National University)
6655 S3C: Self-Supervised Stochastic Classifiers for Few-Shot Class-Incremental Learning Jayateja Kalla (Indian Institute of Science); Soma Biswas (Indian Institute of Science, Bangalore)*
6921 A Gyrovector Space Approach for Symmetric Positive Semi-definite Matrix Learning Xuan Son Nguyen (Ensea)*
6946 Telepresence Video Quality Assessment Zhenqiang Ying (The University of Texas at Austin)*; Deepti Ghadiyaram (Facebook); Alan Bovik (University of Texas at Austin)
6986 Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos Tanqiu Qiao (Durham University); Qianhui Men (University of Oxford); Frederick W. B. Li (University of Durham); Yoshiki Kubotani (Waseda University); Shigeo Morishima (Waseda Research Institute for Science and Engineering); Hubert P. H. Shum (Durham University)*
7039 Neural Correspondence Field for Object Pose Estimation Lin Huang (University at Buffalo); Tomas Hodan (Facebook Reality Labs)*; Lingni Ma (Facebook Reality Labs); Linguang Zhang (Facebook Reality Labs); Luan Tran (Facebook); Christopher D Twigg (Meta); PO-CHEN WU (Meta Inc.); Junsong Yuan (State University of New York at Buffalo, USA); Cem Keskin (Facebook); Robert Wang (Facebook Reality Labs)
7070 A Comparative Study of Graph Matching Algorithms in Computer Vision Stefan Haller (Heidelberg University)*; Lorenz Feineis (Heidelberg University); Lisa Hutschenreiter (Heidelberg University); Florian Bernard (University of Bonn); Carsten Rother (University of Heidelberg); Dagmar Kainmueller (MDC); Paul Swoboda (MPI fuer Informatik, Saarbruecken); Bogdan Savchynskyy (Heidelberg University)
7110 GigaDepth: Learning Depth from Structured Light with Branching Neural Networks Simon Schreiberhuber (TU Wien)*; Jean-Baptiste Weibel (TU Wien); Timothy Patten (University of Technology Sydney); Markus Vincze (TU Wien)
7346 Learning from Unlabeled 3D Environments for Vision-and-Language Navigation Shizhe Chen (INRIA)*; Pierre-Louis Guhur (Inria); Makarand Tapaswi (Wadhwani AI, IIIT Hyderabad); Cordelia Schmid (Inria/Google); Ivan Laptev (INRIA Paris)
7474 Towards Metrical Reconstruction of Human Faces Wojciech Zielonka (Max Planck Institute for Intelligent Systems); Timo Bolkart (Max Planck Institute for Intelligent Systems); Justus Thies (Max Planck Institute for Intelligent Systems)*
7573 Scaling Adversarial Training to Large Perturbation Bounds Sravanti Addepalli (Indian Institute of Science)*; Samyak Jain (Indian Institute of Technology (BHU), Varanasi); Gaurang Sriramanan (University of Maryland, College Park); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science)
7667 Text-based Temporal Localization of Novel Events Sudipta Paul (University of California, Riverside)*; Niluthpol C Mithun (SRI International); Amit K. Roy-Chowdhury (University of California, Riverside)
7693 Federated Self-supervised Learning for Video Understanding Yasar Rehman (TCL Corporate Research (Hong Kong) Co. Ltd); Yan Gao (University of Cambridge)*; Jiajun Shen (TCL Research); Pedro Gusmao (University of Cambridge); Nicholas Lane (University of Cambridge and Samsung AI)
7719 Towards Efficient and Effective Self-Supervised Learning of Visual Representations Sravanti Addepalli (Indian Institute of Science)*; Kaushal Bhogale (Indian Institute of Technology, Madras); Priyam Dey (Indian Institute of Science); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science)
7756 Dynamic Temporal Filtering in Video Models Fuchen Long (JD.com); Zhaofan Qiu (JD.com); Yingwei Pan (JD AI Research)*; Ting Yao (JD AI Research); Chong-Wah Ngo (Singapore Management University); Tao Mei (AI Research of JD.com)
7879 Grounding Visual Representations with Texts for Domain Generalization Seonwoo Min (LG AI Research)*; Nokyung Park (Korea University); Siwon Kim (Seoul National University); Seunghyun Park (Clova AI Research, NAVER Corp.); Jinkyu Kim (Korea University)
8042 OCR-free Document Understanding Transformer Geewook Kim (NAVER Corporation)*; Teakgyu Hong (Upstage AI); Moonbin Yim (Clova AI Research, NAVER Corp.); Jeongyeon Nam (Naver); Jinyoung Park (TmaxAI); Jinyeong Yim (Google); Wonseok Hwang (LBox); Sangdoo Yun (NAVER AI LAB); Dongyoon Han (NAVER AI Lab); Seunghyun Park (Clova AI Research, NAVER Corp.)
1135 Image-based CLIP-Guided Essence Transfer Hila Chefer (Tel Aviv University)*; Sagie Benaim (University of Copenhagen); Roni Paiss (Tel Aviv University, Google); Lior Wolf (Tel Aviv University, Israel)
1257 Motion Inspired Unsupervised Perception and Prediction in Autonomous Driving Mahyar Najibi (Waymo LLC); Jingwei Ji (Waymo); Yin Zhou (Waymo)*; Charles R. Qi (Waymo); Xinchen Yan (Waymo); Scott Ettinger (Waymo); Dragomir Anguelov (Waymo)
1493 Quantized GAN for Complex Music Generation from Dance Videos Ye Zhu (Illinois Institute of Technology)*; Kyle B Olszewski (Snap Inc.); Yu Wu (Princeton University); Panos Achlioptas (Stanford University); Menglei Chai (Snap Inc.); Yan Yan (Illinois Institute of Technology); Sergey Tulyakov (Snap Inc)
1576 DLCFT: Deep Linear Continual Fine-Tuning for General Incremental Learning Hyounguk Shon (KAIST)*; Janghyeon Lee (LG AI Research); Seung Hwan Kim (LG AI Research); Junmo Kim (KAIST)
1737 A Reliable Online Method for Joint Estimation of Focal Length and Camera Rotation Yiming Qian (Osaka University)*; James Elder (York University)
2016 GIPSO: Geometrically Informed Propagation for Online Adaptation in 3D LiDAR Segmentation Cristiano Saltori (University of Trento)*; Evgeny Krivosheev (University of Trento); Stéphane Lathuilière (Telecom-Paris); Nicu Sebe (University of Trento); Fabio Galasso (Sapienza University); Giuseppe Fiameni (NVIDIA); Elisa Ricci (University of Trento); Fabio Poiesi (Fondazione Bruno Kessler)
2050 CoSMix: Compositional Semantic Mix for Domain Adaptation in 3D LiDAR Segmentation Cristiano Saltori (University of Trento)*; Fabio Galasso (Sapienza University); Giuseppe Fiameni (NVIDIA); Nicu Sebe (University of Trento); Elisa Ricci (University of Trento); Fabio Poiesi (Fondazione Bruno Kessler)
2061 Hierarchical Average Precision Training for Pertinent Image Retrieval Elias Ramzi (Conservatoire National des Arts et Métiers)*; Nicolas Audebert (Cnam); Nicolas Thome (CNAM, Paris); Clément Rambour (Cnam); Xavier B Bitot (Coexya)
2166 Error Compensation Framework for Flow-Guided Video Inpainting Jaeyeon Kang (Yonsei University); Seoung Wug Oh (Adobe Research); Seon Joo Kim (Yonsei University)*
2596 Robust Landmark-based Stent Tracking in X-ray Fluoroscopy Luojie Huang (Johns Hopkins University); Yikang Liu (United Imaging Intelligence America); Li Chen (University of Washington); Eric Z. Chen (United Imaging Intelligence America); Xiao Chen (United Imaging Intelligence America); Shanhui Sun (United Imaging Intelligence America)*
3132 Learning Pedestrian Group Representations for Multi-modal Trajectory Prediction Inhwan Bae (Gwangju Institute of Science and Technology)*; Jin-Hwi Park (GIST); Hae-Gon Jeon (GIST)
3518 FrequencyLowCut pooling – Plug & Play against Catastrophic Overfitting Julia Grabinski (University of Siegen)*; Janis Keuper (Fraunhofer); Margret Keuper (University of Mannheim); Steffen Jung (MPII)
3604 TREND: Truncated Generalized Normal Density Estimation of Inception Embeddings for GAN Evaluation Junghyuk Lee (School of Integrated Technology, Yonsei University); Jong-Seok Lee (Yonsei University, Korea)*
3689 Three things everyone should know about Vision Transformers Hugo Touvron (Facebook AI Research)*; Matthieu Cord (Sorbonne University); Alaaeldin M El-Nouby (Facebook AI Research); Jakob Verbeek (Facebook); Herve Jegou (Facebook AI Research)
3737 NeuMan: Neural Human Radiance Field from a Single Video Wei Jiang (University of British Columbia)*; Kwang Moo Yi (University of British Columbia); Golnoosh Samei (UBC); Oncel Tuzel (Apple); Anurag Ranjan (Apple)
3958 Unsupervised Domain Adaptation for One-Stage Object Detector using Offsets to Bounding Box Jayeon Yoo (Seoul National University); Inseop Chung (Seoul National University); Nojun Kwak (Seoul National University)*
4175 Visual Prompt Tuning Menglin Jia (Cornell University)*; Luming Tang (Cornell University); Bor-Chun Chen (Facebook AI); Claire T Cardie (Cornell University); Serge Belongie (University of Copenhagen); Bharath Hariharan (Cornell University); Ser-Nam Lim (Meta AI)
4252 Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking Boyu Chen (The University of Sydney); Peixia Li (The University of Sydney)*; Lei Bai (Shanghai AI Laboratory); Lei Qiao (SenseTime Group Limited); Qiuhong Shen (Harbin Institute of Technology (Shenzhen)); Bo Li (SenseTime Group Limited); Weihao Gan (SenseTime Group Limited); Wei Wu (SenseTime Group Limited); Wanli Ouyang (The University of Sydney)
4894 CHORE: Contact, Human and Object REconstruction from a single RGB image Xianghui Xie (Saarland University)*; Bharat Lal Bhatnagar (University of Tübingen, MPI informatik); Gerard Pons-Moll (University of Tübingen)
4918 Learned Vertex Descent: A New Direction for 3D Human Model Fitting Enric Corona (IRI)*; Gerard Pons-Moll (University of Tübingen); Guillem Alenyà (IRI); Francesc Moreno (IRI)
5036 3D Clothed Human Reconstruction in the Wild Gyeongsik Moon (Seoul National University); Hyeongjin Nam (Seoul National University); Takaaki Shiratori (Meta Reality Labs Research); Kyoung Mu Lee (Seoul National University)*
5140 RealPatch: A Statistical Matching Framework for Model Patching with Real Samples Sara Romiti (University of Sussex)*; Christopher Inskip (University of Sussex); Viktoriia Sharmanska (University of Sussex and Imperial College London); Novi Quadrianto (University of Sussex, Basque Center for Applied Mathematics, and Monash Indonesia)
5144 GAN Cocktail: mixing GANs without dataset access Omri Avrahami (The Hebrew University of Jerusalem)*; Dani Lischinski (The Hebrew University of Jerusalem); Ohad Fried (IDC Herzliya)
5297 Graph Neural Network for Cell Tracking in Microscopy Videos Tal Ben-Haim (School of Electrical and Computer Engineering, Ben-Gurion University)*; Tammy Riklin Raviv (BGU)
5458 Object Discovery via Contrastive Learning for Weakly Supervised Object Detection Jinhwan Seo (Pohang University of Science and Technology)*; Wonho Bae (University of British Columbia); Danica J. Sutherland (University of British Columbia); Junhyug Noh (Lawrence Livermore National Laboratory); Daijin Kim (Pohang University of Science and Technology)
5536 TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers Oren Nuriel (Amazon)*; Ron Litman (Amazon); Sharon Fogel (Amazon)
5591 Semi-Supervised Learning of Optical Flow by Flow Supervisor Woobin Im (KAIST); Sebin Lee (KAIST); Sungeui Yoon (KAIST)*
5616 Deep ensemble learning by diverse knowledge distillation for fine-grained object classification Naoki Okamoto (Chubu University)*; Tsubasa Hirakawa (Chubu University); Takayoshi Yamashita (Chubu University); Hironobu Fujiyoshi (Chubu University)
5794 Controllable Video Generation through Global and Local Motion Dynamics Aram Davtyan (University of Bern)*; Paolo Favaro (University of Bern)
6063 WISE: Whitebox Image Stylization by Example-based Learning Winfried Lötzsch (Merantix Momentum); Max Reimann (Hasso-Plattner-Institute)*; Martin Büßemeyer (Hasso-Plattner-Institut); Amir Semmo (Digital Masterpieces GmbH); Jürgen Döllner (Hasso-Plattner-Institut); Matthias Trapp (Hasso Plattner Institute, University of Potsdam)
6194 Objects Can Move: 3D Change Detection by Geometric Transformation Consistency Aikaterini Adam (National Technical University of Athens)*; Torsten Sattler (Czech Technical University in Prague); Konstantinos Karantzalos (National Technical University of Athens); Tomas Pajdla (Czech Technical University in Prague)
6615 Personalized Education: Blind Knowledge Distillation Xiang Deng (State University of New York at Binghamton)*; Jian Zheng (Amazon); Zhongfei Zhang (Binghamton University)
6619 Fast Two-View Motion Segmentation Using Christoffel Polynomials Bengisu Ozbay (Northeastern University); Octavia Camps (Northeastern University); Mario Sznaier (Northeastern University)*
6817 Expanded Adaptive Scaling Normalization for End to End Image Compression Chajin Shin (Yonsei University)*; Hyeongmin Lee (Yonsei University); Hanbin Son (Yonsei Univ.); Sangjin Lee (Yonsei University); Dogyoon Lee (Yonsei University); Sangyoun Lee (Yonsei University)
6827 Embedding contrastive unsupervised features to cluster in- and out-of-distribution noise in corrupted image datasets Paul Albert (Insight Centre for Data Analytics (DCU))*; Eric Arazo (Insight Centre for Data Analytics (DCU)); Noel O Connor (Home); Kevin McGuinness (DCU)
6912 PIP: Physical Interaction Prediction via Mental Simulation with Span Selection Jiafei Duan (University of Washington)*; Samson Yu (Agency for Science, Technology and Research); Soujanya Poria (Singapore University of Technology and Design); Bihan Wen (Nanyang Technological University); Cheston Tan (Institute for Infocomm Research, Singapore)
7028 Conditional-Flow NeRF: Accurate 3D Modelling with Reliable Uncertainty Quantification Jianxiong Shen (IRI, CSIC-UPC)*; Antonio Agudo (Institut de Robotica i Informatica Industrial, CSIC-UPC); Francesc Moreno (IRI); Adria Ruiz (Seedtag)
7043 The Missing Link: Finding label relations across datasets Jasper Uijlings (Google Research)*; Thomas Mensink (Google Research); Vittorio Ferrari (Google Research)
7051 Contrasting quadratic assignments for set-based representation learning Artem Moskalev (University of Amsterdam)*; Ivan Sosnovik (University of Amsterdam); Volker Fischer (Bosch Center for Artificial Intelligence); Arnold W.M. Smeulders (University of Amsterdam)
7545 Latent Space Smoothing for Individually Fair Representations Momchil Peychev (ETH Zurich)*; Anian Ruoss (DeepMind); Mislav Balunovic (ETH Zurich); Maximilian Baader (ETH Zürich); Martin Vechev (ETH Zurich)
7741 Parameterized Temperature Scaling for Boosting the Expressive Power in Post-Hoc Uncertainty Calibration Christian Tomani (TUM)*; Daniel Cremers (TU Munich); Florian Buettner (German Cancer Research Center and Frankfurt University)
8001 Bridging the visual semantic gap in VLN via semantically richer instructions Joaquín Ignacio Ossandón (Universidad Catolica de Chile)*; Benjamín Earle (Universidad Católica de Chile); Alvaro Soto (Universidad Catolica de Chile)
640 Long-tail Detection with Effective Class-Margins Jang Hyun Cho (The University of Texas at Austin)*; Philipp Kraehenbuehl (UT Austin)
1448 Multimodal Object Detection via Probabilistic Ensembling Yi-Ting Chen (University of Maryland); Jinghao Shi (Carnegie Mellon University); Zelin Ye (CMU); Mertz Christoph (CMU); Deva Ramanan (Carnegie Mellon University); Shu Kong (Carnegie Mellon University)*
1791 Improving Robustness by Enhancing Weak Subnets Yong Guo (Max Planck Institute for Informatics)*; David Stutz (Max Planck Institute for Informatics); Bernt Schiele (MPI Informatics)
2179 Adversarially-Aware Robust Object Detector ZiYi Dong (Sun Yat-Sen University)*; Pengxu Wei (Sun Yat-sen University); Liang Lin (Sun Yat-sen University)
2723 Fine-Grained Scene Graph Generation with Data Transfer Ao Zhang (National University of Singapore)*; Yuan Yao (Tsinghua University); qianyu chen (Tsinghua University); Wei Ji (National University of Singapore); Zhiyuan Liu (Tsinghua University); Maosong Sun (Tsinghua University); Tat-Seng Chua (National university of Singapore)
6108 Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial Keypoint Voting Yangzheng Wu (Queen’s University)*; Mohsen Zand (Queen’s University); Ali Etemad (Queen’s University); Michael Alan Greenspan (Queen’s University)
561 Particle Video Revisited: Tracking Through Occlusions Using Point Trajectories Adam Harley (Carnegie Mellon University)*; Zhaoyuan Fang (Carnegie Mellon University); Katerina Fragkiadaki (Carnegie Mellon University)
2385 A Perturbation-Constrained Adversarial Attack for Evaluating the Robustness of Optical Flow Jenny Schmalfuss (University of Stuttgart)*; Philipp Scholze (University of Stuttgart); Andrés Bruhn (University of Stuttgart)
2623 Social-SSL: Self-Supervised Cross-Sequence Representation Learning Based on Transformers for Multi-Agent Trajectory Prediction Li-Wu Tsao (National Chiao Tung University)*; Yan-Kai Wang (National Chiao Tung University); Hao-Siang Lin (National Chiao Tung University); Hong-Han Shuai (National Yang Ming Chiao Tung University); Lai-Kuan Wong (Multimedia University); Wen-Huang Cheng (National Chiao Tung University)
2874 Diverse Human Motion Prediction Guided by Multi-Level Spatial-Temporal Anchors Sirui Xu (University of Illinois Urbana-Champaign)*; Yu-Xiong Wang (University of Illinois at Urbana-Champaign); Liangyan Gui (University of Illinois Urbana-Champaign)
4806 TEMOS: Generating diverse human motions from textual descriptions Mathis Petrovich (Ecole des Ponts)*; Michael Black (Max Planck Institute for Intelligent Systems); Gul Varol (Ecole des Ponts ParisTech)
7092 PREF: Predictability Regularized Neural Motion Fields Liangchen Song (University at Buffalo)*; Xuan Gong (University at Buffalo); Benjamin Planche (United Imaging Intelligence); Meng Zheng (United Imaging Intelligence); David Doermann (University at Buffalo); Junsong Yuan (State University of New York at Buffalo, USA); Terrence Chen (United Imaging Intelligence); Ziyan Wu (United Imaging Intelligence)

Orals 3 (Tue. pm)

Architecture, Training, and Optimization

3757 Unpaired Image Translation via Vector Symbolic Architectures Justin Theiss (University of California, Berkeley)*; Jay Leverett (Meta); Daeil Kim (Meta); Aayush Prakash (Meta)
4901 Adaptive Token Sampling For Efficient Vision Transformers Mohsen Fayyaz (Microsoft)*; Soroush Abbasi Koohpayegani (University of Maryland Baltimore County); Farnoush Rezaei Jafari (Technische Universität Berlin); Sunando Sengupta (Microsoft); HAMID VAEZI JOZE (Microsoft); Eric Sommerlade (Microsoft); Hamed Pirsiavash (University of California Davis); Jürgen Gall (University of Bonn)
5271 Cross-Modal Knowledge Transfer Without Task-Relevant Source Data SK MIRAJ AHMED (University of California Riverside); Suhas Lohit (Mitsubishi Electric Research Laboratories)*; Kuan-Chuan Peng (Mitsubishi Electric Research Laboratories (MERL)); Michael J Jones (MERL); Amit K. Roy-Chowdhury (University of California, Riverside)
5667 The Challenges of Continuous Self-Supervised Learning Senthil Purushwalkam (Carnegie Mellon University); Pedro Morgado (University of Wisconsin-Madison)*; Abhinav Gupta (CMU/FAIR)
6326 Identifying Hard Noise in Long-Tailed Sample Distribution Xuanyu Yi (Nanyang Technological University)*; Kaihua Tang (Nanyang Technological University); Xian-Sheng Hua (Damo Academy, Alibaba Group); Joo-Hwee Lim (Institute for Infocomm Research); Hanwang Zhang (Nanyang Technological University)
6568 PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks Nan Ding (Google)*; Xi Chen (Google Research); Tomer Levinboim (Google); Soravit Changpinyo (Google Research); Radu Soricut (Google)
7302 Semidefinite Relaxations of Truncated Least-Squares in Robust Rotation Search: Tight or Not Liangzu Peng (Johns Hopkins University)*; Mahyar Fazlyab (Johns Hopkins University); Rene Vidal (Johns Hopkins University, USA)
7345 Lottery Ticket Hypothesis for Spiking Neural Networks Youngeun Kim (Yale University)*; Yuhang Li (Yale University); Hyoungseob Park (Yale University); Yeshwanth Venkatesha (Yale university); Ruokai Yin (Yale University); Priyadarshini Panda (Yale University)
7464 Cartoon Explanations of Image Classifiers Stefan Kolek (Ludwig Maximilian University of Munich)*; Duc Anh Nguyen (LMU Munich); Ron Levie (Technion); Joan Bruna (Courant Institute of Mathematical Sciences, NYU, USA); Gitta Kutyniok (Ludwig Maximilian University of Munich)

Orals 4 (Tue. pm)

Shape From-X and Applications

1552 Revisiting a kNN-based Image Classification System with High-capacity Storage Kengo Nakata (Kioxia Corporation)*; Youyang Ng (Kioxia Corporation); Daisuke Miyashita (Kioxia Corporation); Asuka Maki (Kioxia Corporation); Yu-Chieh Lin (Kioxia Corporation); Jun Deguchi (Kioxia Corporation)
1742 A Level Set Theory for Neural Implicit Evolution under Explicit Flows Ishit Mehta (University of California San Diego)*; Manmohan Chandraker (UC San Diego); Ravi Ramamoorthi (University of California San Diego)
4720 Organic Priors in Non-Rigid Structure from Motion Suryansh Kumar (ETH Zurich)*; Luc Van Gool (ETH Zurich)
4910 Implicit Field Supervision For Robust Non-Rigid Shape Matching Ramana S Sundararaman (Ecole Polytechnique)*; Gautam Pai (École Polytechnique); Maks Ovsjanikov (Ecole polytechnique)
5713 Shape-Pose Disentanglement using SE(3)-equivariant Vector Neurons Oren Katzir (Tel Aviv University)*; Dani Lischinski (The Hebrew University of Jerusalem); Danny Cohen-Or (Tel Aviv University)
7414 Unsupervised Pose-aware Part Decomposition for Man-made Articulated Objects Yuki Kawana (The University of Tokyo)*; Yusuke Mukuta (The University of Tokyo); Tatsuya Harada (The University of Tokyo / RIKEN)

Posters 2 (Tue. late)

6583 Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features Byeonghu Na (KAIST); Yoonsik Kim (Clova AI Research, NAVER Corp.); Sungrae Park (Upstage AI Research, Upstage AI)*
103 PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation Jing He (Xiamen university)*; Yiyi Zhou (Xiamen University); Qi Zhang (Tencent); Jun Peng (Xiamen University); Yunhang Shen (Xiamen University); Xiaoshuai Sun (Xiamen University); Chao Chen (Youtu Laboratory); Rongrong Ji (Xiamen University, China)
287 Benchmarking Omni-Vision Representation through the Lens of Visual Realms Yuanhan Zhang (Nanyang Technological University); Zhenfei Yin (Sensetime); Jing Shao (Sensetime); Ziwei Liu (Nanyang Technological University)*
303 DecoupleNet: Decoupled Network for Domain Adaptive Semantic Segmentation Xin Lai (The Chinese University of Hong Kong)*; Zhuotao Tian (The Chinese University of Hong Kong); Xiaogang XU (The Chinese University of Hong Kong); Yingcong Chen (Hong Kong University of Science and Technology); Shu Liu (SmartMore); Hengshuang Zhao (University of Oxford); Liwei Wang (CUHK); Jiaya Jia (Chinese University of Hong Kong)
583 Q-FW: A Hybrid Classical-Quantum Frank-Wolfe for Quadratic Binary Optimization Alp Yurtsever (Umeå University); Tolga Birdal (TU Munich)*; Vladislav Golyanik (MPI for Informatics)
656 Structural Causal 3D Reconstruction Weiyang Liu (University of Cambridge)*; Zhen Liu (Mila, University of Montreal); Liam Paull (Université de Montréal); Adrian Weller (University of Cambridge); Bernhard Schölkopf (MPI for Intelligent Systems, Tübingen)
945 Adaptive Face Forgery Detection in Cross Domain Luchuan Song (University of Science and Technology of China)*; Zheng Fang (Beihang University); Xiaodan Li (Alibaba Group); Xiaoyi Dong (University of Science and Technology of China); Zhenchao Jin (University of Science and Technology of China); Yuefeng Chen (Alibaba Group); Siwei Lyu (University at Buffalo)
1103 Skeleton-free Pose Transfer for Stylized 3D Characters Zhouyingcheng Liao (Saarland University)*; Jimei Yang (Adobe); Jun Saito (Adobe); Gerard Pons-Moll (University of Tübingen); Yang Zhou (Adobe Research)
1214 GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval Yuxuan Wang (National University of Singapore); Difei Gao (NUS); Licheng Yu (Facebook); Stan Weixian Lei (National University of Singapore); Matt Feiszli (Facebook Research); Mike Zheng Shou (National University of Singapore)*
1225 FAST-VQA: Efficient End-to-end Video Quality Assessment with Fragment Sampling Haoning Wu (Nanyang Technological University)*; Chaofeng Chen (Nanyang Technological University); Jingwen Hou (Nanyang Technological University); Liang Liao (Nanyang Technological University); Annan Wang (Nanyang Technological University); Wenxiu Sun (SenseTime Research and Tetras.AI); Qiong Yan (SenseTime Group Limited); Weisi Lin (Nanyang Technological University, Singapore)
1243 Long-Tailed Class Incremental Learning Xialei Liu (Nankai University)*; Yusong Hu (Nankai University); Xu-Sheng Cao (Nankai University); Andy Bagdanov (University of Florence, Italy); Ke Li (Tencent); Ming-Ming Cheng (Nankai University)
1336 RC-MVSNet: Unsupervised Multi-View Stereo with Neural Rendering Di Chang (Technical University of Munich)*; Aljaz Bozic (Technical University Munich); Tong Zhang (EPFL); Qingsong Yan (hong kong university of science and technology); Yingcong Chen (Hong Kong University of Science and Technology); Sabine Süsstrunk (EPFL); Matthias Niessner (Technical University of Munich)
1424 ARF: Artistic Radiance Fields Kai Zhang (Cornell University)*; Nicholas I Kolkin (Adobe Research); Sai Bi (Adobe Research); Fujun Luan (Adobe Research); Zexiang Xu (Adobe Research); Eli Shechtman (Adobe Research, US); Noah Snavely (Cornell University and Google AI)
1441 Static and Dynamic Concepts for Self-supervised Video Representation Learning Rui Qian (The Chinese University of Hong Kong)*; Shuangrui Ding (Shanghai Jiao Tong University); Xian Liu (The Chinese University of Hong Kong); Dahua Lin (The Chinese University of Hong Kong)
1471 Learning Hierarchy Aware Features for Reducing Mistake Severity Ashima Garg (IIIT Delhi)*; Depanshu Sani (Indraprastha Institute of Information Technology); Saket Anand (Indraprastha Institute of Information Technology Delhi)
1541 TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency Medhini Narasimhan (UC Berkeley)*; Arsha Nagrani (Google); Chen Sun (Brown University); Michael Rubinstein (Google); Trevor Darrell (UC Berkeley); Anna Rohrbach (UC Berkeley); Cordelia Schmid (Google)
1991 TOCH: Spatio-Temporal Object-to-Hand Correspondence for Motion Refinement Keyang Zhou (University of Tübingen)*; Bharat Lal Bhatnagar (University of Tübingen, MPI informatik); Jan E. Lenssen (TU Dortmund); Gerard Pons-Moll (University of Tübingen)
2105 Reference-based Image Super-Resolution with Deformable Attention Transformer Jiezhang Cao (ETH Zürich)*; Jingyun Liang (ETH Zurich); Kai Zhang (ETH Zurich); Yawei Li (ETH Zurich); Yulun Zhang (ETH Zurich); Wenguan Wang (Eidgenössische Technische Hochschule Zürich); Luc Van Gool (ETH Zurich)
2146 Temporally Consistent Semantic Video Editing Yiran Xu (University of Maryland, College Park)*; Badour A Sh AlBahar (Virginia Tech); Jia-Bin Huang (Facebook)
2293 Meta Spatio-Temporal Debiasing for Video Scene Graph Generation LI XU (Singapore University of Technology and Design)*; Haoxuan Qu (Singapore University of Technology and Design); Jason Kuen (Adobe Research); Jiuxiang Gu (Adobe Research); Jun Liu (Singapore University of Technology and Design)
2511 Real-Time Neural Character Rendering with Pose-Guided Multiplane Images Hao Ouyang (HKUST)*; Bo Zhang (Microsoft Research Asia); Pan Zhang (Shanghai AI Laboratory); Hao Yang (Microsoft Research Asia); Dong Chen (Microsoft Research Asia); Jiaolong Yang (Microsoft Research); Qifeng Chen (HKUST); Fang Wen (Microsoft Research Asia)
2581 Studying Bias in GANs through the Lens of Race Vongani H Maluleke (University of California, Berkeley); Neerja Thakkar (University of California, Berkeley)*; Tim Brooks (UC Berkeley); Ethan Weber (UC Berkeley); Trevor Darrell (UC Berkeley); Alexei A Efros (UC Berkeley); Angjoo Kanazawa (University of California Berkeley); Devin Guillory (UC Berkeley)
2586 Autoregressive 3D Shape Generation via Canonical Mapping An-Chieh Cheng (National Tsing Hua University); Xueting Li (University of California, Merced); Sifei Liu (NVIDIA)*; Min Sun (NTHU); Ming-Hsuan Yang (University of California at Merced)
2663 Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning Yuxiao Chen (Rutgers University)*; Long Zhao (Google Research); Jianbo Yuan (Bytedance); Yu Tian (Rutgers); zhaoyang xia (Rutgers University); Shijie Geng (Rutgers University); Ligong Han (Rutgers University); Dimitris N. Metaxas (Rutgers)
2887 Controllable and Guided Face Synthesis for Unconstrained Face Recognition Feng Liu (Michigan State University)*; Minchul Kim (Michigan State University); Anil Jain (Michigan State University); Xiaoming Liu (Michigan State University)
3035 Point Cloud Domain Adaptation via Masked Local 3D Structure Prediction Hanxue Liang (University of Texas at Austin)*; Hehe Fan (NUS); Zhiwen Fan (University of Texas at Austin); Yi Wang (University of Texas at Austin); Tianlong Chen (University of Texas at Austin); Yu Cheng (Microsoft Research); Zhangyang Wang (University of Texas at Austin)
3155 Manifold Adversarial Learning for Cross-domain 3D Shape Representation Hao Huang (New York University); Cheng Chen (New York University); Yi Fang (New York University)*
3240 Interpretable Image Classification with Differentiable Prototypes Assignment Dawid Damian Rymarczyk (Jagiellonian University)*; Łukasz Struski (Jagiellonian University); Michał Górszczak (Jagiellonian University); Koryna Lewandowska (Jagiellonian University); Jacek Tabor (Jagiellonian University); Bartosz Zieliński (Jagiellonian University)
3250 ConCL: Concept Contrastive Learning for Dense Prediction Pre-training in Pathology Images Jiawei Yang (UCLA)*; Hanbo Chen (Tencent AI Lab); Yuan Liang (UCLA); Junzhou Huang (University of Texas at Arlington); Lei He (UCLA); Jianhua Yao (National Institutes of Health)
3254 Leveraging Action Affinity and Continuity for Semi-supervised Temporal Action Segmentation Guodong Ding (National University of Singapore)*; Angela Yao (National University of Singapore)
3265 Data Association between Event Streams and Intensity Frames under Diverse Baselines Dehao Zhang (Peking University)*; Qiankun Ding (Peking University); Peiqi Duan (Peking University); Chu Zhou (Peking University); Boxin Shi (Peking University)
3387 Spotting Temporally Precise, Fine-Grained Events in Video James Hong (Stanford University)*; Haotian Zhang (Stanford University); Michaël Gharbi (Adobe Research); Matthew Fisher (Adobe Research); Kayvon Fatahalian (Stanford)
3484 Unfolded Deep Kernel Estimation for Blind Image Super-resolution Hongyi Zheng (The Hong Kong Polytechnic University); Hongwei Yong (The Hong Kong Polytechnic University); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China)*
3535 Equivariance and Invariance Inductive Bias for Learning from Insufficient Data Tan Wang (Nanyang Technological University)*; Qianru Sun (Singapore Management University); Sugiri Pranata (Panasonic R&D Center Singapore); Karlekar Jayashree (Panasonic); Hanwang Zhang (Nanyang Technological University)
3591 CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition Shreyank N Gowda (University of Edinburgh)*; Laura Sevilla-Lara (Facebook); Frank Keller (University of Edinburgh); Marcus Rohrbach (Facebook AI Research)
3665 Primitive-based Shape Abstraction via Nonparametric Bayesian Inference Yuwei Wu (National University of Singapore)*; Weixiao Liu (National University of Singapore); Sipu Ruan (National University of Singapore); Gregory S Chirikjian (National University of Singapore)
3758 ConMatch: Semi-Supervised Learning with Confidence-Guided Consistency Regularization Jiwon Kim (Korea University)*; Youngjo Min (Korea University); Daehwan Kim (Samsung electro mechanics); Gyuseong Lee (Korea University); Junyoung Seo (Korea University); Kwangrok Ryoo (Korea University); Seungryong Kim (Korea University)
3965 PreTraM: Self-Supervised Pre-training via Connecting Trajectory and Map Chenfeng Xu (UC Berkeley)*; Tian Li (University of California, San Diego); Chen Tang (UC Berkeley); Lingfeng Sun (UC Berkeley); Kurt Keutzer (EECS, UC Berkeley); Masayoshi TOMIZUKA (MSC Lab); Alireza Fathi (Google); Wei Zhan (University of California, Berkeley)
3998 Relative Pose from SIFT Features Daniel Barath (ETH Zürich)*; Zuzana Kukelova (Czech Technical University in Prague)
3999 Monocular 3D Object Reconstruction with GAN Inversion Junzhe Zhang (Nanyang Technological University)*; Daxuan Ren (Nanyang Technological University); Zhongang Cai (SenseTime International Pte Ltd); Chai Kiat Yeo (Nanyang Technological University); Bo Dai (Shanghai AI Lab); Chen Change Loy (Nanyang Technological University)
4152 Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models Chenfeng Xu (UC Berkeley)*; Shijia Yang (UC Berkeley); Tomer Galanti (Massachusetts Institute of Technology); Bichen Wu (Facebook Research); Xiangyu Yue (University of California, Berkeley); Bohan Zhai (UC Berkeley); Wei Zhan (University of California, Berkeley); Kurt Keutzer (EECS, UC Berkeley); Peter Vajda (Facebook); Masayoshi Tomizuka (University of California, Berkeley)
4284 Translating a Visual LEGO Manual to a Machine-Executable Plan Ruocheng Wang (Stanford University)*; Yunzhi Zhang (Stanford University); Jiayuan Mao (MIT); Chin-Yi Cheng (Google Research); Jiajun Wu (Stanford University)
4288 Monitored Distillation for Positive Congruent Depth Completion Tian Yu Liu (UCLA); Parth Agrawal (UCLA); Allison Y Chen (University of California, Los Angeles); Byung-Woo Hong (Chung-Ang University); Alex Wong (Yale University)*
4293 AirDet: Few-Shot Detection without Fine-tuning for Autonomous Exploration Bowen Li (Tongji University)*; Chen Wang (Carnegie Mellon University); Pranay Reddy Anthireddy (Indian Institute of Information Technology, Design and Manufacturing, Jabalpur); Seungchan Kim (Carnegie Mellon University); Sebastian Scherer (Carnegie Mellon University)
4408 Invariant Feature Learning for Generalized Long-Tailed Classification Kaihua Tang (Nanyang Technological University)*; Mingyuan Tao (Damo Academy, Alibaba Group); Jiaxin Qi (Nanyang Technological University); Zhenguang Liu (Zhejiang University); Hanwang Zhang (Nanyang Technological University)
4477 Panoramic Human Activity Recognition Ruize Han (College of Intelligence and Computing, Tianjin University); Haomin Yan (Tianjin University); Jiacheng Li (College of Intelligence and Computing, Tianjin University); Songmiao Wang (Tianjin University); Wei Feng (College of Intelligence and Computing, Tianjin University, China)*; Song Wang (University of South Carolina)
4596 Cross-Modal 3D Shape Generation and Manipulation Zezhou Cheng (University of Massachusetts, Amherst)*; Menglei Chai (Snap Inc.); Jian Ren (Snap Inc.); Hsin-Ying Lee (Snap Inc); Kyle B Olszewski (Snap Inc.); Zeng Huang (Snap Inc.); Subhransu Maji (University of Massachusetts, Amherst); Sergey Tulyakov (Snap Inc)
4880 Custom Structure Preservation in Face Aging Guillermo Gomez-Trenado (University of Granada)*; Stéphane Lathuilière (Telecom-Paris); Pablo Mesejo (University of Granada); Oscar Cordón García (University of Granada)
4970 LaLaLoc++: Global Floor Plan Comprehension for Layout Localisation in Unvisited Environments Henry Howard-Jenkins (University of Oxford)*; Victor Adrian Prisacariu (University of Oxford)
5042 Zero-Shot Category-Level Object Pose Estimation Walter Goodwin (University of Oxford)*; Sagar Vaze (Visual Geometry Group, University of Oxford); Ioannis Havoutis (Oxford Robotics Institute, University of Oxford); Ingmar Posner (Oxford University)
5044 AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant Benita Wong (National University of Singapore)*; Joya Chen (National University of Singapore); You Wu (Harvard University); Stan Weixian Lei (National University of Singapore); Dongxing Mao (National University of Singapore); Difei Gao (NUS); Mike Zheng Shou (National University of Singapore)
5081 Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes Sam Bond-Taylor (Durham University)*; Peter Hessey (Durham University); Hiroshi Sasaki (Durham University); Toby P Breckon (Durham University); Chris G. Willcocks (Durham University)
5099 Towards Sequence-Level Training for Visual Tracking Minji Kim (Seoul National University)*; Seungkwan Lee (POSTECH); Jungseul Ok (POSTECH); Bohyung Han (Seoul National University); Minsu Cho (POSTECH)
5234 Learn2Augment: Learning to Composite Videos for Data Augmentation in Action Recognition Shreyank N Gowda (University of Edinburgh)*; Marcus Rohrbach (Facebook AI Research); Frank Keller (University of Edinburgh); Laura Sevilla-Lara (Facebook)
5373 CYBORGS: Contrastively Bootstrapping Object Representations by Grounding in Segmentation Renhao Wang (Tsinghua University)*; Hang Zhao (Tsinghua University); Yang Gao (Tsinghua University)
5388 AutoAvatar: Autoregressive Neural Fields for Dynamic Avatar Modeling Ziqian Bai (Simon Fraser University)*; Timur Bagautdinov (Facebook); Javier Romero (Facebook); Michael Zollhöfer (Facebook Reality Labs); Ping Tan (Simon Fraser University); Shunsuke Saito (Facebook)
5478 $\ell_\infty$-Robustness and Beyond: Unleashing Efficient Adversarial Training Hadi Mohaghegh Dolatabadi (University of Melbourne)*; Sarah Erfani (University of Melbourne); Christopher Leckie (University of Melbourne)
5584 Sound-guided Semantic Video Generation Seung Hyun Lee (Korea University)*; Gyeongrok Oh (Korea University); Wonmin Byeon (NVIDIA Research); Jihyun Bae (Korea University); Chanyoung Kim (Korea University); Won Jeong Ryoo (Korea University); Sang Ho Yoon (KAIST); Hyunjun Cho (Korea University); Jinkyu Kim (Korea University); Sangpil Kim (Korea University)
5612 Continual 3D Convolutional Neural Networks for Real-time Processing of Videos Lukas Hedegaard (Aarhus University)*; Alexandros Iosifidis (Aarhus University)
5644 Pose Forecasting in Industrial Human-Robot Collaboration Alessio Sampieri (Sapienza University)*; Guido Maria D’Amely di Melendugno (Sapienza University); ANDREA AVOGARO (University of Verona); Federico Cunico (University of Verona); Francesco Setti (University of Verona); Geri Skenderi (University of Verona); Marco Cristani (University of Verona); Fabio Galasso (Sapienza University)
5712 Overcoming Shortcut Learning in a Target Domain by Generalizing Basic Visual Factors from a Source Domain Piyapat Saranrittichai (Bosch Center for Artificial Intelligence)*; Chaithanya Kumar Mummadi (Bosch Center for Artificial Intelligence); Claudia Blaiotta (Bosch Center for Artificial Intelligence); Mauricio Munoz (Bosch Center for Artificial Intelligence); Volker Fischer (Bosch Center for Artificial Intelligence)
5859 Referring Object Manipulation of Natural Images with Conditional Classifier-Free Guidance Myungsub Choi (Google)*
5905 ClearPose: Large-scale Transparent Object Dataset and Benchmark Xiaotong Chen (University of Michigan, Ann Arbor)*; Huijie Zhang (University of Michigan, Ann Arbor); Zeren Yu (University of Michigan–Ann Arbor); Anthony Opipari (University of Michigan); Odest Chadwicke Jenkins (University of Michigan)
5950 Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer Songwei Ge (University of Maryland)*; Thomas F Hayes (Meta); Harry Yang (Facebook); Xi Yin (Facebook); Guan Pang (Facebook); David Jacobs (University of Maryland, USA); Jia-Bin Huang (Facebook); Devi Parikh (Georgia Tech & Facebook AI Research)
6068 AnimeCeleb: Large-Scale Animation CelebHeads Dataset for Head Reenactment Kangyeol Kim (KAIST)*; Sunghyun Park (KAIST); Jaeseong Lee (KAIST); Sunghyo Chung (Korea University); Junsoo Lee (NAVER WEBTOON Ltd.); Jaegul Choo (Korea Advanced Institute of Science and Technology)
6080 Learning Semantic Segmentation from Multiple Datasets with Label Shifts Dongwan Kim (Seoul National University)*; Yi-Hsuan Tsai (Google); Yumin Suh (NEC Labs America); Masoud Faraki (NEC Labs); Sparsh Garg (NEC Labs America); Manmohan Chandraker (UC San Diego); Bohyung Han (Seoul National University)
6141 NewsStories: Illustrating articles with visual summaries Reuben Tan (Boston University)*; Bryan Plummer (Boston University); Kate Saenko (Boston University); J.P. Lewis (Google Research); Avneesh Sud (Google); Thomas Leung (Google)
6227 SpatialDETR: Robust Scalable Transformer-Based 3D Object Detection from Multi-View Camera Images with Global Cross-Sensor Attention Simon Doll (University of Tübingen)*; Richard Schulz (Mercedes Benz); Lukas Schneider (Daimler); Viviane Benzin (Mercedes-Benz AG); Markus Enzweiler (Esslingen University of Applied Sciences); Hendrik P. A. Lensch (University of Tübingen)
6298 FedVLN: Privacy-preserving Federated Vision-and-Language Navigation Kaiwen Zhou (University of California, Santa Cruz)*; Xin Eric Wang (University of California, Santa Cruz)
6380 ML-BPM: Multi-teacher Learning with Bidirectional Photometric Mixing for Open Compound Domain Adaptation in Semantic Segmentation Fei Pan (KAIST)*; Sungsu Hur (KAIST); Seokju Lee (KENTECH); Junsik Kim (Harvard University); In So Kweon (KAIST)
6394 Order Learning Using Partially Ordered Data via Chainization Seon-Ho Lee (MCL, Korea University); Chang-Su Kim (Korea university)*
6439 MimicME: A Large Scale Diverse 4D Database for Facial Expression Analysis Athanasios Papaioannou (Huawei)*; Baris Gecer (Huawei); Shiyang Cheng (Samsung); Grigorios Chrysos (EPFL); Jiankang Deng (Imperial College London); Eftychia Fotiadou (Imperial College London); Christos Kampouris (ApolloXR); Dimitrios Kollias (Queen Mary University London); Stylianos Moschoglou (Huawei Technologies Co. Ltd); Kritaphat Songsri-In (Imperial College London); Stylianos Ploumpis (Huawei Technologies Co. Ltd); George Trigeorgis (Imperial College London); Panagiotis Tzirakis (Imperial College London); Evangelos Ververas (Imperial College London); Yuxiang Zhou (Deepmind, Google); Allan Ponniah (NHS); Anastasios Roussos (Institute of Computer Science, Foundation for Research and Technology Hellas); Stefanos Zafeiriou (Imperial College London)
6451 Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles Guodong Wang (Beihang University)*; Yunhong Wang (State Key Laboratory of Virtual Reality Technology and System, Beihang University, Beijing 100191, China); Jie Qin (Nanjing University of Aeronautics and Astronautics); Dongming Zhang (National Computer Network Emergency Response Technical Team/Coordination Center of China); Xiuguo Bao (National Computer Network Emergency Response Technical Team/Coordination Center of China); Di Huang (Beihang University, China)
6540 TIPS: Text-Induced Pose Synthesis Prasun Roy (University of Technology Sydney)*; Subhankar Ghosh (University of Technology Sydney); Saumik Bhattacharya (Indian Institute of Technology Kharagpur); Umapada Pal (Indian Statistical Institute, Kolkata); Michael Blumenstein (University of Technology Sydney)
6693 PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting Thomas LUCAS (Naver)*; Fabien Baradel (Naver Labs Europe); Philippe Weinzaepfel (NAVER LABS Europe); Gregory Rogez (NAVER LABS Europe)
6698 Learning Series-Parallel Lookup Tables for Efficient Image Super-Resolution Cheng Ma (Tsinghua University); Jingyi Zhang (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)*
6715 Frozen CLIP Models are Efficient Video Learners Ziyi Lin (The Chinese University of Hong Kong)*; Shijie Geng (Rutgers University); Renrui Zhang (Shanghai AI Lab); Peng Gao (Chinese University of Hong Kong); Gerard de Melo (Hasso Plattner Institute); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Jifeng Dai (Tsinghua University); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Hongsheng Li (The Chinese University of Hong Kong)
6719 Deforming Radiance Fields with Cages Tianhan Xu (The University of Tokyo)*; Tatsuya Harada (The University of Tokyo / RIKEN)
6862 D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights Yuzhen Zhang (Zhengzhou University); Wentong Wang (Zhengzhou University); Weizhi Guo (Zhengzhou University); Pei Lv (Zhengzhou University)*; Mingliang Xu (Zhengzhou University); Wei Chen (State Key Lab of CAD&CG, Zhejiang University); Dinesh Manocha (University of Maryland at College Park)
6971 Decouple-and-Sample: Protecting sensitive information in task agnostic data release Abhishek Singh (MIT)*; Ethan Garza (MIT); Ayush Chopra (MIT); Praneeth Vepakomma (MIT); Vivek Sharma (MIT); Ramesh Raskar (Massachusetts Institute of Technology)
6974 k-SALSA: k-anonymous synthetic averaging of retinal images via local style alignment Minkyu Jeon (Korea University)*; Hyeonjin Park (Korea University); Hyunwoo J Kim (Korea University); Michael G Morley (Ophthalmic Consultants of Boston); Hyunghoon Cho (Broad Institute of MIT and Harvard)
7044 On Label Granularity and Object Localization Elijah Cole (Caltech)*; Kimberly Wilber (Google); Grant Van Horn (Cornell University); Xuan Yang (Google); Marco Fornoni (Google); Pietro Perona (California Institute of Technology); Serge Belongie (University of Copenhagen); Andrew Howard (Google); Oisin Mac Aodha (University of Edinburgh)
7375 On the Angular Update and Hyperparameter Tuning of a Scale-Invariant Network Juseung Yun (KAIST)*; Janghyeon Lee (LG AI Research); Hyounguk Shon (KAIST); Eojindl Yi (KAIST); Seung Hwan Kim (LG AI Research); Junmo Kim (KAIST)
7803 MonteBoxFinder: Detecting and Filtering Primitives to Fit a Noisy Point Cloud Michaël Ramamonjisoa (Ecole des Ponts)*; Sinisa Stekovic (Graz University of Technology); Vincent Lepetit (Ecole des Ponts ParisTech)
7899 In Defense of Image Pre-Training for Spatiotemporal Recognition Xianhang Li (University of California, Santa Cruz)*; Huiyu Wang (JHU); Chen Wei (Johns Hopkins University); Jieru Mei (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Yuyin Zhou (UC Santa Cruz); Cihang Xie (University of California, Santa Cruz)
7964 Multi-Exit Semantic Segmentation Networks Alexandros Kouris (Imperial College London and Samsung AI)*; Stylianos Venieris (Samsung AI); Stefanos Laskaridis (Samsung AI); Nicholas Lane (University of Cambridge and Samsung AI)
8029 PANDORA: Polarization-Aided Neural Decomposition Of Radiance Akshat Dave (Rice University)*; Yongyi Zhao (Rice University); Ashok Veeraraghavan (Rice University)
8170 Online Task-free Continual Learning with Dynamic Sparse Distributed Memory Julien Pourcel (ENSEA)*; Ngoc-Son Vu (ENSEA); Robert M FRENCH (CNRS)
19 Learning Depth from Focus in the Wild Changyeon Won (GIST)*; Hae-Gon Jeon (GIST)
74 AvatarPoser: Articulated Full-Body Pose Tracking from Sparse Motion Sensing Jiaxi Jiang (ETH Zurich)*; Paul Streli (ETH Zurich); Huajian Qiu (EPFL); Andreas R Fender (ETH Zurich); Larissa Laich (Facebook Reality Labs); Patrick Snape (Meta); Christian Holz (ETH Zürich)
267 DVS-Voltmeter: Stochastic Process-based Event Simulator for Dynamic Vision Sensors SongNan Lin (Nanyang Technological University)*; Ye Ma (McGill University); Zhenhua Guo (Alibaba Group); Bihan Wen (Nanyang Technological University)
602 RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild Jason Y Zhang (Carnegie Mellon University)*; Deva Ramanan (Carnegie Mellon University); Shubham Tulsiani (Carnegie Mellon University)
626 R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis Huan Wang (Northeastern University); Jian Ren (Snap Inc.); Zeng Huang (Snap Inc.)*; Kyle B Olszewski (Snap Inc.); Menglei Chai (Snap Inc.); YUN FU (Northeastern University); Sergey Tulyakov (Snap Inc)
650 TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts Chuan Guo (University of Alberta)*; Xinxin Zuo (University of Alberta); Sen Wang (University of Alberta); Li Cheng (ECE dept., University of Alberta)
1026 Improving Test-Time Adaptation via Shift-agnostic Weight Regularization and Nearest Source Prototypes Sungha Choi (Qualcomm AI Research)*; Seunghan Yang (Qualcomm AI Research); Seokeon Choi (Qualcomm AI research); Sungrack Yun (Qualcomm AI Research)
1060 3D-FM GAN: Towards 3D-Controllable Face Manipulation Yuchen Liu (Princeton University)*; Zhixin Shu (Adobe Research); Yijun Li (Adobe Research); Zhe Lin (Adobe Research); Richard Zhang (Adobe); Sun-Yuan Kung (Princeton University)
1356 Prediction-Guided Distillation for Dense Object Detection Chenhongyi Yang (University of Edinburgh)*; Mateusz Ochal (Heriot Watt University); Amos Storkey (U Edinburgh); Elliot J Crowley (University of Edinburgh)
1886 My View is the Best View: Procedure Learning from Egocentric Videos Siddhant Bansal (IIIT, Hyderabad)*; Chetan Arora (Indian Institute of Technology Delhi); C.V. Jawahar (IIIT-Hyderabad)
1999 HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation Lukas Hoyer (ETH Zurich)*; Dengxin Dai (ETH Zurich); Luc Van Gool (ETH Zurich)
2012 Combating Label Distribution Shift for Active Domain Adaptation Sehyun Hwang (POSTECH)*; Sohyun Lee (POSTECH); Sungyeon Kim (POSTECH); Jungseul Ok (POSTECH); Suha Kwak (POSTECH)
2134 Transform your Smartphone into a DSLR Camera: Learning the ISP in the Wild Ardhendu Shekhar Tripathi (ETH Zurich)*; Martin Danelljan (ETH Zurich); Samarth Shukla (ETH Zurich); Radu Timofte (University of Wurzburg & ETH Zurich); Luc Van Gool (ETH Zurich)
2337 Symmetry Regularization and Saturating Nonlinearity for Robust Quantization Sein Park (POSTECH); Yeongsang Jang (POSTECH); Eunhyeok Park (POSTECH)*
2545 Is Appearance Free Action Recognition Possible? Filip Ilic (Graz University of Technology)*; Rick Wildes (York University); Thomas Pock (Graz University of Technology)
2573 Texturify: Generating Textures on 3D Shape Surfaces Yawar Siddiqui (Technical University of Munich)*; Justus Thies (Max Planck Institute for Intelligent Systems); Fangchang Ma (Apple Inc.); Qi Shan (Apple Inc.); Matthias Niessner (Technical University of Munich); Angela Dai (Technical University of Munich)
2686 Selective Query-guided Debiasing for Video Corpus Moment Retrieval Sunjae Yoon (KAIST)*; Ji Woo Hong (KAIST); Eunseop Yoon (KAIST); DaHyun Kim (KAIST); Junyeong Kim (Chung-Ang University); Hee Suk Yoon (KAIST); Chang D. Yoo (KAIST)
2709 Few-shot Image Generation with Mixup-based Distance Learning Chaerin Kong (Seoul National University); Jeesoo Kim (Naver Webtoon AI); Donghoon Han (Seoul National University); Nojun Kwak (Seoul National University)*
2861 A Unified Framework for Domain Adaptive Pose Estimation Donghyun Kim (MIT-IBM Watson AI Lab)*; Kaihong Wang (Boston University); Stan Sclaroff (Boston University); Margrit Betke (Boston University); Kate Saenko (Boston University)
3094 GCISG: Guided Causal Invariant Learning for Improved Syn-to-real Generalization Gilhyun Nam (Agency for Defense Development)*; Gyeongjae Choi (Agency for Defense Development); Kyungmin Lee (Agency for Defense Development)
3429 PlaneFormers: From Sparse View Planes to 3D Reconstruction Samir Agarwala (University of Michigan)*; Linyi Jin (University of Michigan); Chris Rockwell (University of Michigan); David Fouhey (University of Michigan)
4181 Multi-scale and Cross-scale Contrastive Learning for Semantic Segmentation THEODOROS PISSAS (University College London)*; Claudio S Ravasio (King’s College London (KCL)); Lyndon DaCruz (Moorfields Eye Hospital / University College London); Christos Bergeles (Kings College London)
4367 Spectrum-aware and Transferable Architecture Search for Hyperspectral Image Restoration Wei He (Wuhan University)*; Quanming Yao (Tsinghua University); Naoto Yokoya (The University of Tokyo); Tatsumi Uezato (Hitachi, Ltd); Hongyan Zhang (Wuhan University); Liangpei Zhang (Wuhan University)
4513 Visual Cross-View Metric Localization with Dense Uncertainty Estimates Zimin Xia (Delft University of Technology)*; Olaf Booij (TomTom); Marco Manfredi (TomTom); Julian F P Kooij (Delft University of Technology)
4697 Tackling Background Distraction in Video Object Segmentation Suhwan Cho (Yonsei University)*; Heansung Lee (Yonsei University); Minhyeok Lee (Yonsei University); Chaewon Park (Yonsei University); Sungjun Jang (Yonsei University); Minjung Kim (Yonsei University); Sangyoun Lee (Yonsei University)
4705 The Surprisingly Straightforward Scene Text Removal Method With Gated Attention and Region of Interest Generation: A Comprehensive Prominent Model Analysis Hyeonsu Lee (Naver Corporation)*; Chankyu Choi (Naver Corporation)
4721 Free-Viewpoint RGB-D Human Performance Capture and Rendering Phong Ha Nguyen (University of Oulu)*; Nikolaos Sarafianos (Facebook Reality Labs); Christoph Lassner (Meta Reality Labs Research); Janne Heikkila (University of Oulu, Finland); Tony Tung (Facebook)
4934 Learning Self-prior for Mesh Denoising using Dual Graph Convolutional Networks Shota Hattori (The University of Tokyo)*; Tatsuya Yatagawa (The University of Tokyo); Yutaka Ohtake (The University of Tokyo); Hiromasa Suzuki (The University of Tokyo)
5224 SelectionConv: Convolutional Neural Networks for Non-rectilinear Image Data David M Hart (Brigham Young University)*; Michael Whitney (Brigham Young University); Bryan S Morse (Brigham Young University)
5595 Joint Learning of Localized Representations from Medical Images and Reports Philip Müller (Technical University of Munich)*; Georgios Kaissis (Technische Universität München); Congyu Zou (Klinikum Rechts der Isar Technische Universität München); Daniel Rueckert (Technische Universität München)
5619 Source-free Video Domain Adaptation by Learning Temporal Consistency for Action Recognition Yuecong Xu (Institute for Infocomm Research, A*STAR, Singapore)*; Jianfei Yang (Nanyang Technological University); Haozhi Cao (Nanyang Technological University); Keyu Wu (Institute for Infocomm Research, A*STAR, Singapore); Min Wu (Institute for Infocomm Research, A*STAR, Singapore); Zhenghua Chen (Institute for Infocomm Research, A*STAR, Singapore)
5740 A data-centric approach for improving ambiguous labels with combined semi-supervised classification and clustering Lars Schmarje (Kiel University)*; Monty Santarossa (Kiel University); Simon-Martin Schröder (Kiel University); Claudius Zelenka (Kiel University); Rainer Kiko (Laboratoire d’Océanographie de Villefranche-sur-Mer); Jenny Stracke (University of Bonn); Nina Volkmann (University of Veterinary Medicine Hannover); Reinhard Koch (Kiel University)
6011 Learning Where To Look – Generative NAS is Surprisingly Efficient Jovita Lukasik (University of Mannheim)*; Steffen Jung (MPII); Margret Keuper (University of Mannheim)
6418 Zero-Shot Learning for Reflection Removal of Single 360-Degree Image Byeong-Ju Han (Ulsan National Institute of Science and Technology); Jae-Young Sim (Ulsan National Institute of Science and Technology)*
6459 An Impartial Take to the CNN vs Transformer Robustness Contest Francesco Pinto (University of Oxford)*; Philip Torr (University of Oxford); Puneet Dokania (University of Oxford)
6513 AdaNeRF: Adaptive Sampling for Real-time Rendering of Neural Radiance Fields Andreas Kurz (Graz University of Technology)*; Thomas Neff (Graz University of Technology); Zhaoyang Lv (Facebook); Michael Zollhöfer (Facebook Reality Labs); Markus Steinberger (Graz University of Technology)
6565 DProST: Dynamic Projective Spatial Transformer Network for 6D Pose Estimation Jaewoo Park (Seoul National University); Nam Ik Cho (Seoul National University)*
6942 Temporal and cross-modal attention for audio-visual zero-shot learning Otniel-Bogdan Mercea (University of Tübingen)*; Thomas Hummel (University of Tübingen); A. Sophia Koepke (University of Tübingen); Zeynep Akata (University of Tübingen)
6973 Object Detection as Probabilistic Set Prediction Georg Hess (Chalmers University of Technology)*; Christoffer Petersson (Zenseact); Lennart Svensson (Chalmers University of Technology)
7238 Bi-directional Contrastive Learning for Domain Adaptive Semantic Segmentation Geon Lee (Yonsei University); Chanho Eom (Yonsei University); Wonkyung Lee (PS Analytics); Hyekang Park (Yonsei University); Bumsub Ham (Yonsei University)*
7304 Bridging the Domain Gap towards Generalization in Automatic Colorization Hyejin Lee (Kookmin University); Daehee Kim (Naver Corp.); Daeun Lee (Korea University); Jinkyu Kim (Korea University); Jaekoo Lee (Kookmin University)*
7350 A Dataset Generation Framework for Evaluating Megapixel Image Classifiers & their Explanations Gautam B Machiraju (Stanford University)*; Sylvia Plevritis (Stanford University); Parag Mallick (Stanford University)
3757 Unpaired Image Translation via Vector Symbolic Architectures Justin Theiss (University of California, Berkeley)*; Jay Leverett (Meta); Daeil Kim (Meta); Aayush Prakash (Meta)
4901 Adaptive Token Sampling For Efficient Vision Transformers Mohsen Fayyaz (Microsoft)*; Soroush Abbasi Koohpayegani (University of Maryland Baltimore County); Farnoush Rezaei Jafari (Technische Universität Berlin); Sunando Sengupta (Microsoft); HAMID VAEZI JOZE (Microsoft); Eric Sommerlade (Microsoft); Hamed Pirsiavash (University of California Davis); Jürgen Gall (University of Bonn)
5271 Cross-Modal Knowledge Transfer Without Task-Relevant Source Data SK MIRAJ AHMED (University of California Riverside); Suhas Lohit (Mitsubishi Electric Research Laboratories)*; Kuan-Chuan Peng (Mitsubishi Electric Research Laboratories (MERL)); Michael J Jones (MERL); Amit K. Roy-Chowdhury (University of California, Riverside)
5667 The Challenges of Continuous Self-Supervised Learning Senthil Purushwalkam (Carnegie Mellon University); Pedro Morgado (University of Wisconsin-Madison)*; Abhinav Gupta (CMU/FAIR)
6326 Identifying Hard Noise in Long-Tailed Sample Distribution Xuanyu Yi (Nanyang Technological University)*; Kaihua Tang (Nanyang Technological University); Xian-Sheng Hua (Damo Academy, Alibaba Group); Joo-Hwee Lim (Institute for Infocomm Research); Hanwang Zhang (Nanyang Technological University)
6568 PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks Nan Ding (Google)*; Xi Chen (Google Research); Tomer Levinboim (Google); Soravit Changpinyo (Google Research); Radu Soricut (Google)
7302 Semidefinite Relaxations of Truncated Least-Squares in Robust Rotation Search: Tight or Not Liangzu Peng (Johns Hopkins University)*; Mahyar Fazlyab (Johns Hopkins University); Rene Vidal (Johns Hopkins University, USA)
7345 Lottery Ticket Hypothesis for Spiking Neural Networks Youngeun Kim (Yale University)*; Yuhang Li (Yale University); Hyoungseob Park (Yale University); Yeshwanth Venkatesha (Yale university); Ruokai Yin (Yale University); Priyadarshini Panda (Yale University)
7464 Cartoon Explanations of Image Classifiers Stefan Kolek (Ludwig Maximilian University of Munich)*; Duc Anh Nguyen (LMU Munich); Ron Levie (Technion); Joan Bruna (Courant Institute of Mathematical Sciences, NYU, USA); Gitta Kutyniok (Ludwig Maximilian University of Munich)
1552 Revisiting a kNN-based Image Classification System with High-capacity Storage Kengo Nakata (Kioxia Corporation)*; Youyang Ng (Kioxia Corporation); Daisuke Miyashita (Kioxia Corporation); Asuka Maki (Kioxia Corporation); Yu-Chieh Lin (Kioxia Corporation); Jun Deguchi (Kioxia Corporation)
1742 A Level Set Theory for Neural Implicit Evolution under Explicit Flows Ishit Mehta (University of California San Diego)*; Manmohan Chandraker (UC San Diego); Ravi Ramamoorthi (University of California San Diego)
4720 Organic Priors in Non-Rigid Structure from Motion Suryansh Kumar (ETH Zurich)*; Luc Van Gool (ETH Zurich)
4910 Implicit Field Supervision For Robust Non-Rigid Shape Matching Ramana S Sundararaman (Ecole Polytechnique)*; Gautam Pai (École Polytechnique); Maks Ovsjanikov (Ecole polytechnique)
5713 Shape-Pose Disentanglement using SE(3)-equivariant Vector Neurons Oren Katzir (Tel Aviv University)*; Dani Lischinski (The Hebrew University of Jerusalem); Danny Cohen-Or (Tel Aviv University)
7414 Unsupervised Pose-aware Part Decomposition for Man-made Articulated Objects Yuki Kawana (The University of Tokyo)*; Yusuke Mukuta (The University of Tokyo); Tatsuya Harada (The University of Tokyo / RIKEN)

Wednesday 26th

Orals 5 (Wed. am)

Faces, Bodies, Gestures, and Pose

229 D&D: Learning Human Dynamics from Dynamic Camera Jiefeng Li (Shanghai Jiao Tong University)*; Siyuan Bian (Shanghai Jiao Tong University); Chao Xu (Tencent); Gang Liu (Tencent Inc.); Gang Yu (Tencent); Cewu Lu (Shanghai Jiao Tong University)
1413 Pose-NDF: Modelling Human Pose Manifolds with Neural Distance Fields Garvita Tiwari (MPI-INF, University of Tübingen)*; Dimitrije Antic (University of Tuebingen); Jan E. Lenssen (TU Dortmund); Nikolaos Sarafianos (Facebook Reality Labs); Tony Tung (Facebook Reality Labs); Gerard Pons-Moll (University of Tübingen)
1620 CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation Zhihao Li (Huawei Noah’s Ark Lab)*; Jianzhuang Liu (Huawei Noah’s Ark Lab); Zhensong Zhang (Huawei Noah’s Ark Lab); Songcen Xu (Huawei Noah’s Ark Lab); Youliang Yan (Huawei Noah’s Ark Lab)
4591 SimCC: a Simple Coordinate Classification Perspective for Human Pose Estimation Yanjie Li (Tsinghua University)*; Sen Yang (Southeast University); Peidong Liu (Tsinghua University); Shoukui Zhang (Meituan); Yunxiao Wang (Tsinghua University); Zhicheng Wang (Nreal); Wankou Yang (Southeast University); Shu-Tao Xia (Tsinghua University)
5242 Grasp’D: Differentiable Contact-rich Grasp Synthesis for Multi-fingered Hands Dylan Turpin (University of Toronto)*; Liquan Wang (University of Toronto); Eric Heiden (University of Southern California); Yun-Chun Chen (University of Toronto); Miles Macklin (NVIDIA); Stavros Tsogkas (University of Toronto); Sven Dickinson (University of Toronto); Animesh Garg (University of Toronto, Vector Institute, Nvidia)
6515 PressureVision: Estimating Hand Pressure from a Single RGB Image Patrick L Grady (Georgia Institute of Technology)*; Chengcheng Tang (Facebook Reality Labs); Samarth Brahmbhatt (Intel); Christopher D Twigg (Meta); Chengde Wan (Facebook Reality Lab); James Hays (Georgia Institute of Technology, USA); Charlie Kemp (Georgia Institute of Technology)
6672 Pose for Everything: Towards Category-Agnostic Pose Estimation Lumin XU (The Chinese University of Hong Kong)*; Sheng Jin (The University of Hong Kong); Wang ZENG (The Chinese University of Hong Kong); Wentao Liu (Sensetime); Chen Qian (SenseTime); Wanli Ouyang (The University of Sydney); Ping Luo (The University of Hong Kong); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong)
7360 Multi-domain Learning for Updating Face Anti-spoofing Models Xiao Guo (Michigan State University)*; Yaojie Liu (Google Research); Anil Jain (Michigan State University); Xiaoming Liu (Michigan State University)

Orals 6 (Wed. am)

Image/Video Synthesis and Generative Models

631 Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation Xian Liu (The Chinese University of Hong Kong)*; Yinghao Xu (Chinese University of Hong Kong); Qianyi Wu (Monash University); Hang Zhou (The Chinese University of Hong Kong); Wayne Wu (SenseTime Research); Bolei Zhou (UCLA)
993 Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors Oran Gafni (Meta AI Research)*; Adam Polyak (Facebook); Oron Ashual (Facebook AI Research); Shelly Sheynin (Meta); Devi Parikh (Georgia Tech & Facebook AI Research); Yaniv Taigman (Facebook)
2282 RFNet-4D: Joint Object Reconstruction and Flow Estimation from 4D Point Clouds Tuan-Anh Vu (The Hong Kong University of Science and Technology)*; Thanh Nguyen (Deakin University, Australia); Binh-Son Hua (VinAI Research); Quang Hieu Pham (Woven Planet North America); Sai-Kit Yeung (Hong Kong University of Science and Technology)
3181 Discovering Transferable Forensic Features for CNN-generated Images Detection Keshigeyan Chandrasegaran (Singapore University of Technology and Design)*; Ngoc-Trung Tran (Singapore University of Technology and Design); Alexander Binder (University of Oslo); Ngai-Man Cheung (Singapore University of Technology and Design)
3228 Text2LIVE: Text-Driven Layered Image and Video Editing Omer Bar Tal (Weizmann Institute of Science)*; Dolev Ofri-Amar (Weizmann Institute of Science); Rafail Fridman (Weizmann Institute of Science); Yoni Kasten (Weizmann Institute); Tali Dekel (Weizmann Institute of Science)
3673 Exploring Gradient-based Multi-directional Controls in GANs Zikun Chen (ModiFace Inc.)*; Ruowei Jiang (ModiFace Inc.); Brendan Duke (ModiFace Inc); Han Zhao (University of Illinois at Urbana-Champaign); Parham Aarabi (ModiFace Inc.)
4399 3D-Aware Indoor Scene Synthesis with Depth Priors Zifan SHI (HKUST)*; Yujun Shen (Dept. of IE, CUHK); Jiapeng Zhu (HKUST); Dit-Yan Yeung (HKUST); Qifeng Chen (HKUST)
4610 Generative Multiplane Images: Making a 2D GAN 3D-Aware Xiaoming Zhao (University of Illinois at Urbana-Champaign)*; Fangchang Ma (Apple Inc.); David Güera (Apple Inc.); Zhile Ren (Apple Inc.); Alexander Schwing (UIUC); Alex Colburn (Apple Inc.)
4847 Layered Controllable Video Generation Jiahui Huang (University of British Columbia)*; Yuhe Jin (University of British Columbia); Kwang Moo Yi (University of British Columbia); Leonid Sigal (University of British Columbia)

Posters 3 (Wed. early)

16 Generative Domain Adaptation for Face Anti-Spoofing Qianyu Zhou (Shanghai Jiao Tong University)*; Ke-Yue Zhang (YouTu Lab, Tencent); Taiping Yao (Tencent YouTu); Ran Yi (Shanghai Jiao Tong University); Kekai Sheng (Youtu Lab, Tencent Inc.); Shouhong Ding (Tencent); Lizhuang Ma (Shanghai Jiao Tong University)
34 Relighting4D: Neural Relightable Human from Videos Zhaoxi Chen (Nanyang Technological University)*; Ziwei Liu (Nanyang Technological University)
69 Learning-based Point Cloud Registration for 6D Object Pose Estimation in the Real World Zheng Dang (EPFL)*; Lizhou Wang (Xi’an Jiaotong University); Yu Guo (School of Software Engineering, Xi’an Jiaotong University); Mathieu Salzmann (EPFL)
167 MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes Yang Jiao (Fudan University)*; Shaoxiang Chen (Fudan University); Zequn Jie (Meituan inc.); Jingjing Chen (Fudan University); Lin Ma (Meituan); Yu-Gang Jiang (Fudan University)
358 An Embedded Feature Whitening Approach to Deep Neural Network Optimization Hongwei Yong (The Hong Kong Polytechnic University)*; Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China)
408 FurryGAN: High quality foreground-aware image synthesis Jeongmin Bae (Yonsei University); Mingi Kwon (Yonsei University); Youngjung Uh (Yonsei University)*
556 Scene Text Recognition with Permuted Autoregressive Sequence Models Darwin Bautista (University of the Philippines)*; Rowel Atienza (University of the Philippines)
575 SCAM! Transferring humans between images with Semantic Cross Attention Modulation Nicolas Dufour (ENPC)*; David Picard (ENPC); Vicky Kalogeiton (Ecole Polytechnique)
784 End-to-End Active Speaker Detection Juan C Leon (KAUST)*; Moritz Cordes (Leuphana University of Lüneburg); Chen Zhao (KAUST); Bernard Ghanem (KAUST)
833 VTC: Improving Video-Text Retrieval with User Comments Laura Hanu (Unitary)*; James Thewlis (Unitary); Yuki M Asano (University of Amsterdam); Christian Rupprecht (University of Oxford)
839 Less than Few: Self-Shot Video Instance Segmentation Pengwan Yang (University of Amsterdam)*; Yuki M Asano (University of Amsterdam); Pascal Mettes (University of Amsterdam); Cees Snoek (University of Amsterdam)
913 Designing One Unified Framework for High-Fidelity Face Reenactment and Swapping Chao Xu (Zhejiang University)*; Jiangning Zhang (Zhejiang University); Yue Han (Zhejiang University); Guanzhong Tian (Ningbo Research Institute, Zhejiang University); xianfang zeng (Zhejiang University); Ying Tai (Tencent YouTu); Yabiao Wang (Tencent); Chengjie Wang (Tencent; Shanghai Jiao Tong University); Yong Liu (Zhejiang University)
927 Intrinsic Neural Fields: Learning Functions on Manifolds Lukas Koestler (Technical University of Munich)*; Daniel Grittner (Technische Universität München); Michael Moeller (University of Siegen); Daniel Cremers (TU Munich); Zorah Laehner (University of Siegen)
1039 SALVe: Semantic Alignment Verification for Floorplan Reconstruction from Sparse Panoramas John W Lambert (Georgia Institute of Technology)*; Yuguang Li (Zillow Group); Ivaylo Boyadzhiev (Zillow Group); Lambert Wixson (Zillow Group); Manjunath Narayana (Zillow group); Will A Hutchcroft (Zillow Group); James Hays (Georgia Institute of Technology, USA); Frank Dellaert (Georgia Tech); Sing Bing Kang (Zillow Group)
1198 Adaptive Cross-Domain Learning for Generalizable Person Re-Identification Pengyi Zhang (Zhejiang University)*; Huanzhang Dou (Zhejiang University); Yunlong Yu (Zhejiang University); Xi Li (Zhejiang University)
1322 KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints Marko Mihajlovic (ETH Zurich)*; Aayush Bansal (Carnegie Mellon University); Michael Zollhöfer (Facebook Reality Labs); Siyu Tang (ETH Zurich); Shunsuke Saito (Facebook)
1347 Detecting and Recovering Sequential DeepFake Manipulation Rui Shao (Nanyang Technological University)*; Tianxing Wu (Nanyang Technological University); Ziwei Liu (Nanyang Technological University)
1482 Hunting Group Clues with Transformers for Social Group Activity Recognition Masato Tamura (Hitachi America, Ltd.)*; Rahul Vishwakarma (Hitachi America Ltd.); Ravigopal Vennelakanti (Hitachi America, Ltd.)
1511 HIVE: Evaluating the Human Interpretability of Visual Explanations Sunnie S. Y. Kim (Princeton University)*; Nicole Meister (Princeton University); Vikram V. Ramaswamy (Princeton University); Ruth C Fong (Princeton University); Olga Russakovsky (Princeton University)
1525 Waymo Open Dataset: Panoramic Video Panoptic Segmentation Jieru Mei (Johns Hopkins University); Alex Zhu (Waymo)*; Xinchen Yan (Waymo); Hang Yan (Waymo LLC); Siyuan Qiao (Google); Yukun Zhu (Google Inc.); Liang-Chieh Chen (Google Inc.); Henrik Kretzschmar (Waymo)
1735 Video Graph Transformer for Video Question Answering Junbin Xiao (National University of Singapore)*; Pan Zhou (Sea AI Lab); Tat-Seng Chua (National Univ. of Singapore); Shuicheng Yan (Sea AI Labs)
1738 Learning Local Implicit Fourier Representation for Image Warping Jaewon Lee (DGIST)*; Kwang Pyo Choi (Samsung Electronics); Kyong Hwan Jin (DGIST)
1810 NeXT: Towards High Quality Neural Radiance Fields via Multi-Skip Transformer Yunxiao Wang (Tsinghua University); Yanjie Li (Tsinghua University)*; Peidong Liu (Tsinghua University); Tao Dai (Shenzhen University); Shu-Tao Xia (Tsinghua University)
1832 PS-NeRF: Neural Inverse Rendering for Multi-view Photometric Stereo Wenqi Yang (The University of Hong Kong)*; Guanying CHEN (The Chinese University of Hong Kong, Shenzhen); Chaofeng Chen (Nanyang Technological University); Zhenfang Chen (MIT-IBM Watson AI Lab); Kwan-Yee K. Wong (The University of Hong Kong)
2118 Learning to Generate Realistic LiDAR Point Cloud Vlas Zyrianov (University of Illinois Urbana Champaign); Xiyue Zhu (University of Illinois); Shenlong Wang (UIUC)*
2138 Uncertainty-Based Spatial-Temporal Attention for Online Action Detection Hongji Guo (Rensselaer Polytechnic Institute)*; Zhou Ren (Wormpex AI Research); Yi Wu (Wormpex AI Research); Gang Hua (Wormpex AI Research); Qiang Ji (Rensselaer Polytechnic Institute)
2334 Improving GANs for Long-Tailed Data through Group Spectral Regularization Harsh Rangwani (Indian Institute of Science)*; Naman Jaswani (Indian Institute of Science); Tejan Karmali (Indian Institute of Science, Bengaluru); Varun Jampani (Google); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science)
2373 Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis Shuai Shen (Tsinghua University); Wanhua Li (Tsinghua University); Zheng Zhu (Tsinghua University); Yueqi Duan (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)*
2516 SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views Xiaoxiao Long (The University of Hong Kong)*; Cheng Lin (Tencent); Peng Wang (The University of Hong Kong); Taku Komura (The University of Hong Kong); Wenping Wang (The University of Hong Kong)
2568 Learning Visibility for Robust Dense Human Body Estimation Chun-Han Yao (University of California at Merced)*; Jimei Yang (Adobe); Duygu Ceylan (Adobe Research); Yi Zhou (Adobe Research); Yang Zhou (Adobe Research); Ming-Hsuan Yang (University of California at Merced)
2598 Depth Field Networks for Generalizable Multi-view Scene Representation Vitor Guizilini (Toyota Research Institute)*; Igor Vasiljevic (Toyota Research Institute); Jiading Fang (Toyota Technological Institute at Chicago); Rareș A Ambruș (Toyota Research Institute); Greg Shakhnarovich (Toyota Technological Institute at Chicago); Matthew Walter (Toyota Technological Institute at Chicago); Adrien Gaidon (Toyota Research Institute)
2888 2D GANs Meet Unsupervised Single-view 3D Reconstruction Feng Liu (Michigan State University)*; Xiaoming Liu (Michigan State University)
3150 GradAuto: Energy-oriented Attack on Dynamic Neural Networks Jianhong Pan (Singapore University of Technology and Design)*; Qichen Zheng (Singapore University of Technology and Design); Zhipeng Fan (NYU TANDON SCHOOL OF ENGINEERING); Hossein Rahmani (Lancaster University); Qiuhong Ke (Monash University); Jun Liu (Singapore University of Technology and Design)
3438 Domain Adaptive Video Segmentation via Temporal Pseudo Supervision Yun Xing (Nanyang Technological University); Dayan Guan (Mohamed bin Zayed University of Artificial Intelligence); Jiaxing Huang (Nanyang Technological University); Shijian Lu (Nanyang Technological University)*
3500 Weakly-Supervised Stitching Network for Real-World Panoramic Image Generation Dae-Young Song (Chungnam National University); Geonsoo Lee (Chungnam National University); HeeKyung Lee (ETRI (Electronics and Telecommunications Research Institute)); Gi-Mun Um (ETRI (Electronics and Telecommunications Research Institute)); Donghyeon Cho (Chungnam National University)*
3551 Balancing between Forgetting and Acquisition in Incremental Subpopulation Learning Mingfu Liang (Northwestern University)*; JIAHUAN ZHOU (Peking University); Wei Wei (Northwestern University); Ying Wu (Northwestern University)
3693 Any-resolution Training for High-resolution Image Synthesis Lucy Chai (MIT)*; Michaël Gharbi (Adobe Research); Eli Shechtman (Adobe Research, US); Phillip Isola (MIT); Richard Zhang (Adobe)
3780 Multi-Domain Multi-Definition Landmark Localization for Small Datasets David Ferman (AI Foundation); Gaurav Bharaj (AI Foundation)*
4018 High-Fidelity Image Inpainting with GAN Inversion Yongsheng Yu (University of Chinese Academy of Sciences); Libo Zhang (Institute of Software Chinese Academy of Sciences)*; Heng Fan (University of North Texas); Tiejian Luo (University of Chinese Academy of Sciences)
4080 Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoiréing Xin Yu (The University of Hong Kong)*; Peng Dai (The University of Hong Kong); Wenbo Li (The Chinese University of Hong Kong); Lan Ma (TCL Corporate Research); Jiajun Shen (TCL Research); Jia Li (Sun Yat-Sen University); Xiaojuan Qi (The University of Hong Kong)
4136 Long Movie Clip Classification with State-Space Video Models Md Mohaiminul Islam (UNC Chapel Hill)*; Gedas Bertasius (UNC Chapel Hill)
4231 Planes vs. Chairs: Category-guided 3D shape learning without any 3D cues Zixuan Huang (Georgia Institute of Technology)*; Stefan Stojanov (Georgia Institute of Technology); Anh Thai (Georgia Institute of Technology); Varun Jampani (Google); James Rehg (Georgia Institute of Technology)
4239 Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction Maosen Li (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University)*; Siheng Chen (Shanghai Jiao Tong University); Zijing Zhang (Zhejiang University); Lingxi Xie (Huawei Inc.); Qi Tian (Huawei Cloud & AI); Ya Zhang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University)
4300 Constrained Mean Shift Using Distant Yet Related Neighbors for Representation Learning K L Navaneet (University of California, Davis); Soroush Abbasi Koohpayegani (University of Maryland Baltimore County)*; Ajinkya B Tejankar (UMBC); Kossar Pourahmadi Meibodi (University of Maryland, Baltimore County); Akshayvarun Subramanya (UMBC); Hamed Pirsiavash (University of California Davis)
4478 Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution Yushu Wu (Northeastern University)*; Yifan Gong (Northeastern University); Pu Zhao (Northeastern University); Yanyu Li (Northeastern University); Zheng Zhan (Northeastern University); Wei Niu (William & Mary); Hao Tang (ETH Zurich); Minghai Qin (Western Digital Research); Bin Ren (William & Mary); Yanzhi Wang (Northeastern University)
4496 RA-Depth: Resolution Adaptive Self-Supervised Monocular Depth Estimation Mu He (Nanjing University of Science and Technology)*; Le Hui (Nanjing University of Science and Technology); Yikai Bian (Nanjing University of Science and Technology); Jian Ren (Nanjing University of Science and Technology); Jin Xie (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology)
4584 Vector Quantized Image-to-Image Translation Yu-Jie Chen (National Chiao Tung University); Shin-I Cheng (National Chiao Tung University); Wei-Chen Chiu (National Chiao Tung University)*; Hung-Yu Tseng (Facebook); Hsin-Ying Lee (Snap Inc)
4672 K-centered Patch Sampling for Efficient Video Recognition Seong Hyeon Park (KAIST AI)*; Jihoon Tack (KAIST); Byeongho Heo (NAVER AI LAB); Jung-Woo Ha (NAVER CLOVA AI Lab); Jinwoo Shin (KAIST)
4708 FingerprintNet: Synthesized Fingerprints for Generated Image Detection Yonghyun Jeong (NAVER CLOVA)*; Doyeon Kim (Line+); Youngmin Ro (Samsung SDS); pyounggeon kim (SDS); Jongwon Choi (Chung-Ang University)
4736 The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing Dawit Mureja Argaw (KAIST)*; Fabian Caba (Adobe Research); Joon-Young Lee (Adobe Research); Markus Woodson (Adobe); In So Kweon (KAIST)
4762 AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets Zhijun Tu (Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong university)*; Xinghao Chen (Huawei Noah’s Ark Lab); Pengju Ren (Institute of Artificial Intelligence at Xi’an Jiaotong University); Yunhe Wang (Huawei Technologies)
4804 Few-shot Action Recognition with Hierarchical Matching and Contrastive Learning Sipeng Zheng (Renmin University of China)*; Shizhe Chen (INRIA); Qin Jin (Renmin University of China)
4828 Remote Respiration Monitoring of Moving Person Using Radio Signals Jae-Ho Choi (Pohang University of Science and Technology)*; KIBONG KANG (POSTECH); Kyung-Tae Kim (Pohang University of Science and Technology)
4908 4DContrast: Contrastive Learning with Dynamic Correspondences for 3D Scene Understanding Yujin Chen (Technical University of Munich)*; Matthias Niessner (Technical University of Munich); Angela Dai (Technical University of Munich)
4933 Tracking Every Thing in the Wild Siyuan Li (ETH Zurich)*; Martin Danelljan (ETH Zurich); Henghui Ding (ETH Zurich); Thomas E Huang (ETH Zürich); Fisher Yu (ETH Zurich)
4953 STEEX: Steering Counterfactual Explanations with Semantics Paul Jacob (École Polytechnique ); eloi zablocki (Valeo.ai)*; Hedi Ben-younes (Valeo AI); Mickael Chen (valeo.ai); Patrick Pérez (Valeo.ai); Matthieu Cord (Sorbonne University)
5076 Factorizing Knowledge in Neural Networks Xingyi Yang (National University of Singapore)*; Jingwen Ye (National University of Singapore); Xinchao Wang (National University of Singapore)
5157 Learning Unbiased Transferability for Domain Adaptation by Uncertainty Modeling Jian Hu (Queen Mary University of London)*; Haowen Zhong (Zhejiang Lab); Fei Yang (Zhejiang Lab); Shaogang Gong (Queen Mary University of London); Guile Wu (Queen Mary University of London); Junchi Yan (Shanghai Jiao Tong University)
5227 HairNet: Hairstyle Transfer with Pose Changes Peihao Zhu (KAUST)*; Rameen Abdal (KAUST); JOHN C FEMIANI (Miami University); Peter Wonka (KAUST)
5247 Improving Closed and Open-Vocabulary Attribute Prediction using Transformers Khoi Pham (University of Maryland, College Park)*; Kushal Kafle (Adobe Research); Zhe Lin (Adobe Research); Zhihong Ding (Adobe Research); Scott Cohen (Adobe Research); Quan Hung Tran (Adobe Research); Abhinav Shrivastava (University of Maryland)
5252 A Contrastive Objective for Learning Disentangled Representations Jonathan Kahana (Hebrew University of Jerusalem)*; Yedid Hoshen (The Hebrew University of Jerusalem)
5256 Unbiased Multi-Modality Guidance for Image Inpainting Yongsheng Yu (University of Chinese Academy of Sciences); Dawei Du (Kitware, Inc.)*; Libo Zhang (Institute of Software Chinese Academy of Sciences); Tiejian Luo (University of Chinese Academy of Sciences)
5257 Learned Monocular Depth Priors in Visual-Inertial Initialization Yunwen Zhou (Google)*; Abhishek Kar (Google); Eric L Turner (GOOGLE LLC); Adarsh Kowdle (Google); Chao Guo (Google Inc.); Ryan DuToit (Google); Konstantine Tsotsos (Google)
5325 Improving the Intra-class Long-tail in 3D Detection via Rare Example Mining Chiyu Jiang (Waymo)*; Mahyar Najibi (Waymo LLC); Charles R. Qi (Waymo); Yin Zhou (Waymo); Dragomir Anguelov (Waymo)
5362 ParC-Net: Position Aware Circular Convolution with Merits from ConvNets and Transformer Haokui Zhang (Lighthouse Co.Ltd)*; Wenze Hu (Lighthouse Co.Ltd); Xiaoyu Wang (The Chinese University of Hong Kong (Shenzhen))
5443 TDAM: Top-Down Attention Module for Contextually Guided Feature Selection in CNNs Shantanu Jaiswal (Agency for Science, Technology and Research )*; Basura Fernando (Agency for Science, Technology and Research, A*STAR, Singapore); Cheston Tan (Institute for Infocomm Research, Singapore)
5537 Speaker-adaptive Lip Reading with User-dependent Padding Minsu Kim (KAIST)*; Hyunjun Kim (KAIST); Yong Man Ro (KAIST)
5578 Towards Racially Unbiased Skin Tone Estimation via Scene Disambiguation Haiwen Feng (Max Planck Institute for Intelligent Systems); Timo Bolkart (Max Planck Institute for Intelligent Systems); Joachim Tesch (Max Planck Institute for Intelligent Systems); Michael J. Black (Max Planck Institute for Intelligent Systems); Victoria Fernandez Abrevaya (Max Planck Institute)*
5585 Robust Visual Tracking by Segmentation Matthieu Paul (ETH Zurich)*; Martin Danelljan (ETH Zurich); Christoph Mayer (ETH Zurich); Luc Van Gool (ETH Zurich)
5789 Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-Curation Connelly Barnes (Adobe)*; Lingzhi Zhang (University of Pennsylvania); Jianbo Shi (University of Pennsylvania); Zhe Lin (Adobe Research); Eli Shechtman (Adobe Research, US); Sohrab Amirghodsi (Adobe Research); Kevin Wampler (Adobe Systems Inc.)
5901 Object Wake-up: 3D Object Rigging from a Single Image Ji Yang (University of Alberta)*; Xinxin Zuo (University of Alberta); Sen Wang (University of Alberta); Zhenbo Yu (Shanghai Jiao Tong University); Xingyu Li (University of Alberta); Bingbing Ni (Shanghai Jiao Tong University); Minglun Gong (University of Guelph); Li Cheng (ECE dept., University of Alberta)
6000 When Deep Classifiers Agree: Analyzing Correlations between Learning Order and Image Statistics Iuliia Pliushch (Goethe University)*; Martin Mundt (TU Darmstadt); Nicolas Lupp (Goethe University Frankfurt); Visvanathan Ramesh (Goethe University)
6023 Realistic One-shot Mesh-based Head Avatars Taras Khakhulin (Skolkovo Institute of Science and Technology)*; Vanessa Valerievna Skliarova (Skoltech); Victor Lempitsky (Yandex); Egor Zakharov (Skolkovo Institute of Science and Technology)
6056 Responsive Listening Head Generation: A Benchmark Dataset and Baseline Mohan Zhou (Harbin Institute of Technology)*; Yalong Bai (JD AI Research); Wei Zhang (JD AI Research); Ting Yao (JD AI Research); Tiejun Zhao (Harbin Institute of Technology); Tao Mei (AI Research of JD.com)
6127 Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning Ting Yao (JD AI Research); Yingwei Pan (JD AI Research)*; Yehao Li (JD AI Research); Chong-Wah Ngo (Singapore Management University); Tao Mei (AI Research of JD.com)
6178 SSBNet: Improving Visual Recognition Efficiency by Adaptive Sampling Ho Man Kwan (The Hong Kong University of Science and Technology)*; S.H. Song (HKUST)
6187 Neural Space-filling Curves Hanyu Wang (University of Maryland – College Park)*; Kamal Gupta (University of Maryland); Larry Davis (University of Maryland); Abhinav Shrivastava (University of Maryland)
6192 MFIM: Megapixel Facial Identity Manipulation Sanghyeon Na (kakaobrain)*
6218 MaCLR: Motion-aware Contrastive Learning of Representations for Videos Fanyi Xiao (Meta); Joseph Tighe (Amazon); Davide Modolo (Amazon)*
6305 Data-free Backdoor Removal Based on Channel Lipschitzness Runkai Zheng (Chinese University of Hong Kong (Shenzhen)); Rongjun Tang (The Chinese University of Hong Kong, Shenzhen); Jianze Li (Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen); Li Liu (Shenzhen Research Institute of Big Data, the chinese university of hong kong shenzhen)*
6325 Realistic Blur Synthesis for Learning Image Deblurring Jaesung Rim (POSTECH); Geonung Kim (POSTECH); Jungeon Kim (POSTECH); Junyong Lee (POSTECH); Seungyong Lee (POSTECH); Sunghyun Cho (POSTECH)*
6356 FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection Danila Rukhovich (Samsung AI Center Moscow); Anna Vorontsova (Samsung AI Center)*; Anton S. Konushin (Samsung AI Center Moscow)
6421 Towards Ultra Low Latency Spiking Neural Networks for Vision and Sequential Tasks Using Temporal Pruning Sayeed Shafayet Chowdhury (Purdue University)*; Nitin Rathi (Purdue University); Kaushik Roy (Purdue University)
6479 Large scale Real-world Multi Person Tracking Bing Shuai (Amazon)*; Alessandro Bergamo (Amazon); Uta Büchler (Amazon); Andrew G Berneshawi (Amazon); Alyssa Boden (Amazon Web Services); Joseph Tighe (Amazon)
6836 VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer Juan Felipe Montesinos (Universitat Pompeu Fabra)*; Venkatesh Shenoy Kadandale (Universitat Pompeu Fabra); Gloria Haro (Universitat Pompeu Fabra)
6956 Totems: Physical Objects for Verifying Visual Integrity Jingwei Ma (University of Washington)*; Lucy Chai (MIT); Minyoung Huh (MIT); Tongzhou Wang (MIT); Ser-Nam Lim (Meta AI); Phillip Isola (MIT); Antonio Torralba (MIT)
7016 Single-Stream Multi-Level Alignment for Vision-Language Pretraining Zaid Khan (Northeastern University)*; Vijay Kumar B G (NEC Laboratories America); Xiang Yu (NEC Labs); Samuel Schulter (NEC Laboratories America); Manmohan Chandraker (UC San Diego); YUN FU (Northeastern University)
7277 Rayleigh EigenDirections (REDs): Nonlinear GAN latent space traversals for multidimensional features Guha Balakrishnan (Rice University)*; Raghudeep Gadde (Amazon); Aleix M Martinez (Amazon); Pietro Perona (Amazon Web Services (AWS))
7374 SLiDE: Self-supervised LiDAR De-snowing through Reconstruction Difficulty Gwangtak Bae (Seoul National University)*; Byungjun Kim (Seoul National University); Seongyong Ahn (Agency for Defense Development); jihong Min (Agency for Defense Development); Inwook Shim (Inha University)
7388 A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch Patsorn Sangkloy (Georgia Institute of Technology)*; Wittawat Jitkrittum (Google Research); Diyi Yang (Georgia Institute of Technology); James Hays (Georgia Institute of Technology, USA)
7509 Object discovery and representation networks Olivier Henaff (DeepMind)*; Skanda Koppula (DeepMind); Evan Shelhamer (DeepMind); Daniel Zoran (DeepMind); Andrew Jaegle (DeepMind); Andrew Zisserman (Oxford University); Joao Carreira (DeepMind); Relja Arandjelović (DeepMind)
7519 Natural Synthetic Anomalies for Self-Supervised Anomaly Detection and Localization Hannah M Schlueter (Imperial College London)*; Jeremy Tan (Imperial College London); Benjamin Hou (Imperial College London); Bernhard Kainz (Imperial College London, FAU Erlangen-Nürnberg)
7634 AutoTransition: Learning to Recommend Video Transition Effects Yaojie Shen (Institute of Software, Chinese Academy of Sciences); Libo Zhang (Institute of Software Chinese Academy of Sciences); Kai Xu (ByteDance Inc); Xiaojie Jin (Bytedance Inc. USA)*
7691 LWGNet – Learned Wirtinger Gradients for Fourier Ptychographic Phase Retrieval Atreyee Saha (Indian Institute of Technology Madras)*; Salman Siddique Khan (IIT Madras); Sagar Sehrawat (IIT Madras); Sanjana S Prabhu (Indian Institute of Technology Madras); Shanti Bhattacharya (IIT Madras); Kaushik Mitra (IIT Madras)
7718 Exploiting the local parabolic landscapes of adversarial losses to accelerate black-box adversarial attack Hoang Tran (Oak Ridge National Laboratory); Dan Lu (Oak Ridge National Laboratory); Guannan Zhang (Oak Ridge National Laboratory)*
7968 Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks Bernd Prach (IST Austria)*; Christoph H Lampert (IST Austria)
8048 VQGAN-CLIP: Open Domain Image Generation and Manipulation Using Natural Language Katherine B Crowson (EleutherAI); Stella R Biderman (Booz Allen Hamilton)*; daniel kornis (Eleuther.ai); Dashiell Stander (Eleuther AI); Eric Hallahan (EleutherAI); Louis J Castricato (Georgia Tech); Edward Raff (Booz Allen Hamilton)
8106 Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach Jiseok Youn (Seoul National University)*; Jaehun Song (Seoul National University); Hyung-Sin Kim (Seoul National University); Saewoong Bahk (Seoul National University)
86 Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation Yuyang Zhao (National University of Singapore)*; Zhun Zhong (University of Trento); Na Zhao (NUS); Nicu Sebe (University of Trento); Gim Hee Lee (National University of Singapore)
399 Graph-constrained Contrastive Regularization for Semi-weakly Volumetric Segmentation Simon Reiß (Karlsruhe Institute of Technology)*; Constantin Marc Seibold (Karlsruhe Institute of Technology); Alexander Freytag (Carl Zeiss AG, Jena, Germany); Rodner Erik (University of Applied Sciences Berlin); Rainer Stiefelhagen (Karlsruhe Institute of Technology)
541 Cross-modal Prototype Driven Network for Radiology Report Generation Jun Wang (University of Warwick)*; Abhir Bhalerao (University of Warwick); Yulan He (University of Warwick)
800 Masked Autoencoders for Point Cloud Self-supervised Learning Yatian Pang (National University of Singapore); Wenxiao Wang (State Key Lab of CAD&CG, Zhejiang University); Francis EH Tay (National University of Singapore); Wei Liu (Tencent); Yonghong Tian (Peking University); Li Yuan (Peking University)*
977 Proposal-Free Temporal Action Detection via Global Segmentation Mask Learning Sauradip Nag (University of Surrey)*; Xiatian Zhu (University of Surrey); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey)
1003 Semi-Supervised Temporal Action Detection with Proposal-Free Masking Sauradip Nag (University of Surrey)*; Xiatian Zhu (University of Surrey); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey)
1044 Active Learning Strategies for Weakly-Supervised Object Detection Huy V. Vo (Ecole Normale Supérieure – INRIA – Valeo.ai)*; Oriane Siméoni (valeo.ai); Spyros Gidaris (valeo.ai); Andrei Bursuc (valeo.ai); Patrick Pérez (Valeo.ai); Jean Ponce (Inria)
1455 Gradient-based Uncertainty for Monocular Depth Estimation Julia Hornauer (Ulm University)*; Vasileios Belagiannis (Otto von Guericke University Magdeburg)
1561 Self-Supervised Sparse Representation for Video Anomaly Detection Jhih-Ciang Wu (Academia Sinica )*; He-Yen Hsieh (Academia Sinica); Ding-Jie Chen (Academia Sinica); Chiou-Shann Fuh (National Taiwan University); Tyng-Luh Liu (Academia Sinica)
1851 Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency Tom Monnier (École des ponts Paristech)*; Matthew Fisher (Adobe Research); Alexei A Efros (UC Berkeley); Mathieu Aubry (École des ponts ParisTech)
1987 Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation sunghwan hong (Korea University); Seokju Cho (Korea University); Jisu Nam (korea university); Stephen Lin (Microsoft Research); Seungryong Kim (Korea University)*
2772 Editable Indoor Lighting Estimation Henrique Weber (Université Laval)*; Mathieu Garon (Depix); Jean-Francois Lalonde (Université Laval)
2787 Dynamic 3D Scene Analysis by Point Cloud Accumulation Shengyu Huang (ETH Zürich)*; Zan Gojcic (NVIDIA); Jiahui Huang (Tsinghua University); Andreas Wieser (ETH Zürich); Konrad Schindler (ETH Zurich)
2936 QISTA-ImageNet: A Deep Compressive Image Sensing Framework Solving Lq-Norm Optimization Problem Gang-Xuan Lin (Academia Sinica); Shih-Wei Hu (National Taiwan University); Chun-Shien Lu (Academia Sinica)*
3342 Deep Hash Distillation for Image Retrieval Young Kyun Jang (Seoul National University)*; Geonmo Gu (NAVER corp); Byungsoo Ko (NAVER/LINE Corp.); Isaac Kang (Seoul National University); Nam Ik Cho (Seoul National University)
3515 ERA: Enhanced Rational Activations Martin Trimmel (Lund University)*; Mihai Zanfir (Google); Richard I Hartley (Google); Cristian Sminchisescu (Google)
4261 Learning Degradation Representations for Image Deblurring dasong Li (Chinese University of Hong Kong)*; Yi Zhang (CUHK); Ka Chun Cheung (Nvidia); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Hongwei Qin (Sensetime); Hongsheng Li (The Chinese University of Hong Kong)
4402 Optimizing Image Compression via Joint Learning with Denoising Ka Leong Cheng (The Hong Kong University of Science and Technology); Yueqi Xie (The Hong Kong University of Science and Technology); Qifeng Chen (HKUST)*
4633 Equivariant Hypergraph Neural Networks Jinwoo Kim (KAIST); Saeyoon Oh (KAIST); Sungjun Cho (LG AI Research); Seunghoon Hong (KAIST)*
4963 EgoBody: Human Body Shape and Motion of Interacting People from Head-Mounted Devices Siwei Zhang (ETH Zurich)*; Qianli Ma (Max Planck Institute for Intelligent Systems); Yan Zhang (ETH Zurich); Zhiyin Qian (ETH Zürich); Taein Kwon (ETH Zurich); Marc Pollefeys (ETH Zurich / Microsoft); Federica Bogo (Meta); Siyu Tang (ETH Zurich)
4987 Learned Variational Video Color Propagation Markus Hofinger (Graz University of Technology)*; Erich Kobler (University Hospital Bonn); Alexander Effland (University of Bonn); Thomas Pock (Graz University of Technology)
5564 BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis Davide Moltisanti (University of Edinburgh)*; Jinyi Wu (S-Lab Nanyang Technological University); Bo Dai (Shanghai AI Lab); Chen Change Loy (Nanyang Technological University)
5660 Dress Code: High-Resolution Multi-Category Virtual Try-On Davide Morelli (UNIMORE); Matteo Fincato (Università degli Studi di Modena e Reggio Emilia); Marcella Cornia (University of Modena and Reggio Emilia)*; Federico Landi (University of Modena and Reggio Emilia); Fabio Cesari (YOOX Net-A-Porter Group S.p.A.); Rita Cucchiara (Università di Modena e Reggio Emilia)
6024 Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning Seunghyun Lee (Inha University); Byung Cheol Song (Inha University)*
6222 RAWtoBit: A Fully End-to-end Camera ISP Network Wooseok Jeong (Korea University); Seung-Won Jung (Korea University)*
6236 SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds Pei Sun (Waymo)*; Mingxing Tan (Waymo); Weiyue Wang (Waymo); Chenxi Liu (Waymo); Fei Xia (Waymo); Zhaoqi Leng (Waymo); Dragomir Anguelov (Waymo)
6342 Emotion-aware Multi-view Contrastive Learning for Facial Emotion Recognition Daeha Kim (Inha University); Byung Cheol Song (Inha University)*
6730 FLEX: Extrinsic Parameters-free Multi-view 3D Human Motion Reconstruction Brian Gordon (Tel Aviv University); Sigal Raab (Tel Aviv University)*; Guy Azov (Tel Aviv University); Raja Giryes (Tel Aviv University); Danny Cohen-Or (Tel Aviv University)
7027 Supervised Attribute Information Removal and Reconstruction for Image Manipulation Nannan Li (Boston University)*; Bryan Plummer (Boston University)
7048 OIMNet++: Prototypical Normalization and Localization-aware Learning for Person Search Sanghoon Lee (Yonsei University); Youngmin Oh (Yonsei University); Donghyeon Baek (Yonsei University); Junghyup Lee (Yonsei University); Bumsub Ham (Yonsei University)*
7061 How stable are Transferability Metrics evaluations? Andrea Agostinelli (Google)*; Michal Pandy (University of Cambridge); Jasper Uijlings (Google Research); Thomas Mensink (Google Research); Vittorio Ferrari (Google Research)
229 D&D: Learning Human Dynamics from Dynamic Camera Jiefeng Li (Shanghai Jiao Tong University)*; Siyuan Bian (Shanghai Jiao Tong University); Chao Xu (Tencent); Gang Liu (Tencent inc.); Gang Yu (Tencent ); Cewu Lu (Shanghai Jiao Tong University)
1413 Pose-NDF: Modelling Human Pose Manifolds with Neural Distance Fields Garvita Tiwari (MPI-INF, University of Tübingen)*; Dimitrije Antic (University of Tuebingen); Jan E. Lenssen (TU Dortmund); Nikolaos Sarafianos (Facebook Reality Labs); Tony Tung (Facebook Reality Labs); Gerard Pons-Moll (University of Tübingen)
1620 CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation Zhihao Li (Huawei Noah’s Ark Lab)*; Jianzhuang Liu (Huawei Noah’s Ark Lab); Zhensong Zhang (Huawei Noah’s Ark Lab); Songcen Xu (Huawei Noah’s Ark Lab); Youliang Yan (Huawei Noah’s Ark Lab)
4591 SimCC: a Simple Coordinate Classification Perspective for Human Pose Estimation Yanjie Li (Tsinghua University)*; Sen Yang (Southeast University); Peidong Liu (Tsinghua University); 寿奎 张 (meituan); Yunxiao Wang (Tsinghua University); Zhicheng Wang (Nreal); Wankou Yang (Southeast University); Shu-Tao Xia (Tsinghua University)
5242 Grasp’D: Differentiable Contact-rich Grasp Synthesis for Multi-fingered Hands Dylan Turpin (University of Toronto)*; Liquan Wang (University of Toronto); Eric Heiden (University of Southern California); Yun-Chun Chen (University of Toronto ); Miles Macklin (NVIDIA); Stavros Tsogkas (University of Toronto); Sven Dickinson (University of Toronto); Animesh Garg (University of Toronto, Vector Institute, Nvidia)
6515 PressureVision: Estimating Hand Pressure from a Single RGB Image Patrick L Grady (Georgia Institute of Technology)*; Chengcheng Tang (Facebook Reality Labs); Samarth Brahmbhatt (Intel); Christopher D Twigg (Meta); Chengde Wan (Facebook Reality Lab); James Hays (Georgia Institute of Technology, USA); Charlie Kemp (Georgia Institute of Technology)
6672 Pose for Everything: Towards Category-Agnostic Pose Estimation Lumin XU (The Chinese University of Hong Kong)*; Sheng Jin (The University of Hong Kong); Wang ZENG (The Chinese University of Hong Kong); Wentao Liu (Sensetime); Chen Qian (SenseTime); Wanli Ouyang (The University of Sydney); Ping Luo (The University of Hong Kong); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong)
7360 Multi-domain Learning for Updating Face Anti-spoofing Models Xiao Guo (Michigan State University)*; Yaojie Liu (Google Research); Anil Jain (Michigan State University); Xiaoming Liu (Michigan State University)
631 Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation Xian Liu (The Chinese University of Hong Kong)*; Yinghao Xu (Chinese University of Hong Kong); Qianyi Wu (Monash University); Hang Zhou (The Chinese University of Hong Kong); Wayne Wu (SenseTime Research); Bolei Zhou (UCLA)
993 Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors Oran Gafni (Meta AI Research)*; Adam Polyak (Facebook); Oron Ashual (Facebook AI Research); Shelly Sheynin (Meta); Devi Parikh (Georgia Tech & Facebook AI Research); Yaniv Taigman (Facebook)
2282 RFNet-4D: Joint Object Reconstruction and Flow Estimation from 4D Point Clouds Tuan-Anh Vu (The Hong Kong University of Science and Technology)*; Thanh Nguyen (Deakin University, Australia); Binh-Son Hua (VinAI Research); Quang Hieu Pham (Woven Planet North America); Sai-Kit Yeung (Hong Kong University of Science and Technology)
3181 Discovering Transferable Forensic Features for CNN-generated Images Detection Keshigeyan Chandrasegaran (Singapore University of Technology and Design)*; Ngoc-Trung Tran (Singapore University of Technology and Design); Alexander Binder (University of Oslo); Ngai-Man Cheung (Singapore University of Technology and Design)
3228 Text2LIVE: Text-Driven Layered Image and Video Editing Omer Bar Tal (Weizmann Institute of Science )*; Dolev Ofri-Amar (Weizmann Institute of Science); Rafail Fridman (Weizmann Institute of Science); Yoni Kasten (Weizmann Institute); Tali Dekel (Weizmann Institute of Science)
3673 Exploring Gradient-based Multi-directional Controls in GANs Zikun Chen (ModiFace Inc. )*; Ruowei Jiang (ModiFace Inc.); Brendan Duke (ModiFace Inc); Han Zhao (University of Illinois at Urbana-Champaign); Parham Aarabi (ModiFace Inc.)
4399 3D-Aware Indoor Scene Synthesis with Depth Priors Zifan SHI (HKUST)*; Yujun Shen (Dept. of IE, CUHK); Jiapeng Zhu (HKUST); Dit-Yan Yeung (HKUST); Qifeng Chen (HKUST)
4610 Generative Multiplane Images: Making a 2D GAN 3D-Aware Xiaoming Zhao (University of Illinois at Urbana-Champaign)*; Fangchang Ma (Apple Inc.); David Güera (Apple Inc.); Zhile Ren (Apple Inc.); Alexander Schwing (UIUC); Alex Colburn (Apple Inc.)
4847 Layered Controllable Video Generation Jiahui Huang (University of British Columbia)*; Yuhe Jin (University of British Columbia); Kwang Moo Yi (University of British Columbia); Leonid Sigal (University of British Columbia)

Orals 7 (Wed. pm)

Scene, Action, and Video Understanding

735 ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound Yan-Bo Lin (UNC Chapel Hill)*; Jie Lei (UNC Chapel Hill); Mohit Bansal (University of North Carolina at Chapel Hill); Gedas Bertasius (UNC Chapel Hill)
1212 CMD: Self-supervised 3D Action Representation Learning with Cross-modal Mutual Distillation Yunyao Mao (University of Science and Technology of China)*; Wengang Zhou (University of Science and Technology of China); Zhenbo Lu (Institute of Artificial Intelligence, Hefei Comprehensive National Science Center); Jiajun Deng (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China)
4640 Self-supervised Social Relation Representation for Human Group Detection Jiacheng Li (College of Intelligence and Computing, Tianjin University); Ruize Han (College of Intelligence and Computing, Tianjin University)*; Haomin Yan (Tianjin University); Zekun Qian (College of Intelligence and Computing, Tianjin University); Wei Feng (College of Intelligence and Computing, Tianjin University, China); Song Wang (University of South Carolina)
4861 GraphVid: It Only Takes a Few Nodes to Understand a Video Eitan Kosman (Bosch AI)*; Dotan Di Castro (Bosch)
5080 PrivHAR: Recognizing Human Actions From Privacy-preserving Lens Carlos Hinojosa (Universidad Industrial de Santander)*; Miguel A Marquez (UIS Colombia); Henry Arguello (Universidad Industrial Santander); Ehsan Adeli (Stanford University); Li Fei-Fei (Stanford University); Juan Carlos Niebles (Salesforce & Stanford University)
6132 Flow graph to Video Grounding for Weakly-supervised Multi-Step Localization NIKITA DVORNIK (Samsung)*; Isma Hadji (Samsung AI Center – Toronto); Hai X Pham (Samsung AI Center); Dhaivat Bhatt (Samsung); Brais Martinez (Samsung AI Center); Afsaneh Fazly (SAIC Toronto); Allan D Jepson (Samsung Toronto AIC)
7215 Bi-PointFlowNet: Bidirectional Learning for Point Cloud Based Scene Flow Estimation WENCAN CHENG (Sungkyunkwan University); Jong Hwan Ko (Sungkyunkwan University)*
7248 Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration Aditi Basu Bal (Florida State University)*; Ramy A Mounir (University of South Florida); Sathyanarayanan N Aakur (OK State); Sudeep Sarkar (University of South Florida, Tampa); Anuj Srivastava (Florida State University)

Orals 8 (Wed. pm)

Low-level Vision and Segmentation

2021 Adaptive Patch Exiting for Scalable Single Image Super-Resolution Shizun Wang (Beijing University of Posts and Telecommunications)*; Jiaming Liu (Peking University); Kaixin Chen (Beijing University of Posts and Telecommunications); Xiaoqi Li (Columbia University in the City of New York); Ming Lu (Intel Labs China); Yandong Guo (OPPO Research Institute)
2153 Perceptual Artifacts Localization for Inpainting Lingzhi Zhang (University of Pennsylvania)*; Yuqian Zhou (Adobe); Connelly Barnes (Adobe); Zhe Lin (Adobe Research); Eli Shechtman (Adobe Research, US); Sohrab Amirghodsi (Adobe Research); Jianbo Shi (University of Pennsylvania)
4067 Secrets of Event-Based Optical Flow Shintaro Shiba (Keio University)*; Yoshimitsu Aoki (Keio University); Guillermo Gallego (TU Berlin)
4919 KXNet: A Model-Driven Deep Neural Network for Blind Super-Resolution Jiahong Fu (Xi’an Jiaotong University)*; Hong Wang (Jarvis Lab,Tencent ); Qi Xie (Xi’an Jiaotong University); Qian Zhao (Xi’an Jiaotong University); Deyu Meng (Xi’an Jiaotong University); Zongben Xu (Xi’an Jiaotong University)
6180 Learning Topological Interactions for Multi-Class Medical Image Segmentation Saumya Gupta (Stony Brook University)*; Xiaoling Hu (Stony Brook University); James Kaan (Stony Brook University); Michael Jin (Stony Brook University Hospital); Mutshipay Christian Mpoy (SUNY Stony Brook Medicine); Katherine Chung (Stony Brook University Hospital); Gagandeep Singh (RWJBarnabas Health); Mary Saltz (Stony Brook); Tahsin Kurc (Stony Brook University); Joel Saltz (Stony Brook University); APOSTOLOS K TASSIOPOULOS (Stony Brook University); Prateek Prasanna (Stony Brook University); Chao Chen (Stony Brook University)
6193 Unsupervised Segmentation in Real-World Images via Spelke Object Inference Honglin Chen (Stanford University); Rahul M V (Stanford University); Yoni I Friedman (MIT); Jiajun Wu (Stanford University); Joshua Tenenbaum (MIT); Daniel Yamins (Stanford University); Daniel Bear (Stanford University)*

Posters 4 (Wed. early)

4318 One-Trimap Video Matting Hongje Seong (Yonsei University)*; Seoung Wug Oh (Adobe Research); Brian Price (Adobe); Euntai Kim (Yonsei University); Joon-Young Lee (Adobe Research)
180 Tackling Long-Tailed Category Distribution Under Domain Shifts Xiao Gu (Imperial College London)*; Yao Guo (Shanghai Jiao Tong University); Zeju Li (Imperial College London); Jianing Qiu (Imperial College London); DOU QI (The Chinese University of Hong Kong); Yuxuan Liu (Institute of Medical Robotics, Shanghai Jiao Tong University); Benny P L Lo (Imperial College London); Guang-Zhong Yang (SJTU)
248 Boosting Event Stream Super-Resolution with A Recurrent Neural Network Wenming Weng (University of Science and Technology of China)*; Yueyi Zhang (University of Science and Technology of China); Zhiwei Xiong (University of Science and Technology of China)
291 Paint2Pix: Interactive Painting based Progressive Image Synthesis and Editing Jaskirat Singh (Australian National University)*; Liang Zheng (Australian National University); Cameron Y Smith (Adobe Research); Jose Echevarria (Adobe System Inc.)
368 MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection Xuesong Chen (The Chinese University of Hong Kong)*; Shaoshuai Shi (MPI Informatics); Benjin Zhu (MEGVII); Ka Chun Cheung (Nvidia); Hang Xu (Huawei Noah’s Ark Lab); Hongsheng Li (The Chinese University of Hong Kong)
444 Fusing Local Similarities for Retrieval-based 3D Orientation Estimation of Unseen Objects Chen Zhao (EPFL)*; Yinlin Hu (EPFL); Mathieu Salzmann (EPFL)
463 Unsupervised Learning of 3D Semantic Keypoints with Mutual Reconstruction Haocheng Yuan (Northwestern Polytechnical University); Chen Zhao (EPFL); Shichao Fan (Northwestern Polytechnical University); Jiaxi Jiang (Northwestern Polytechnical University); Jiaqi Yang (Northwestern Polytechnical University)*
601 Neuromorphic Data Augmentation for Training Spiking Neural Networks Yuhang Li (Yale University)*; Youngeun Kim (Yale University); Hyoungseob Park (Yale University); Tamar Geller (Yale University); Priyadarshini Panda (Yale University)
609 Human Trajectory Prediction via Neural Social Physics Jiangbei Yue (Leeds University); Dinesh Manocha (University of Maryland at College Park)*; He Wang (Leeds University)
820 Semantic-Sparse Colorization Network for Deep Exemplar-based Colorization Yunpeng Bai (Tsinghua University)*; Chao Dong (SIAT); Zenghao Chai (Tsinghua University); Andong Wang (Tsinghua University); Zhengzhuo Xu (Tsinghua University); Chun Yuan (Graduate school at ShenZhen, Tsinghua university)
841 End-to-End Visual Editing with a Generatively Pre-Trained Artist Andrew Brown (University of Oxford)*; Cheng-Yang Fu (Facebook.com); Omkar M Parkhi (Facebook); Tamara Berg (Facebook AI Research); Andrea Vedaldi (University of Oxford / Facebook AI Research)
852 COUCH: Towards Controllable Human-chair Interactions Xiaohan Zhang (University of Tübingen, MPI Informatics); Bharat Lal Bhatnagar (University of Tübingen, MPI informatik); Sebastian Starke (University of Edinburgh); Vladimir Guzov (University of Tuebingen); Gerard Pons-Moll (University of Tübingen)*
912 Concurrent Subsidiary Supervision for Unsupervised Source-Free Domain Adaptation Jogendra Nath Kundu (Indian Institute of Science)*; Suvaansh Bhambri (Indian Institute of Science); Akshay R Kulkarni (Indian Institute of Science); Hiran Sarkar (Indian Institute of Science); Varun Jampani (Google); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science)
1321 Highly Accurate Dichotomous Image Segmentation Xuebin Qin (University of Alberta); Hang Dai (Mohamed bin Zayed University of Artificial Intelligence); Xiaobin Hu (Technische Universität München); Deng-Ping Fan (ETH Zurich)*; Ling Shao (Terminus Group); Luc Van Gool (ETH Zurich)
1342 StretchBEV: Stretching Future Instance Prediction Spatially and Temporally Kaan Adil Akan (Koc University); Fatma Guney (Koc University)*
1412 Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation Guolei Sun (ETH Zurich); Yun Liu (ETH Zurich)*; Hao Tang (ETH Zurich); Ajad Chhatkuli (ETH Zurich); Le Zhang (University of Electronic Science and Technology of China); Luc Van Gool (ETH Zurich)
1673 Locally Varying Distance Transform for Unsupervised Visual Anomaly Detection Wen-Yan Lin (SMU); Zhonghang Liu (SMU); Siying Liu (I2R Singapore)*
1710 Generative Adversarial Network for Future Hand Segmentation from Egocentric Video Wenqi Jia (Georgia Institute of Technology)*; Miao Liu (Georgia Institute of Technology); James Rehg (Georgia Institute of Technology)
1824 BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen Neural Networks Uddeshya Upadhyay (University of Tübingen)*; Shyamgopal Karthik (University of Tübingen); Massimiliano Mancini (University of Tübingen); Yanbei Chen (University of Tübingen); Zeynep Akata (University of Tübingen)
1860 Learning Online Multi-Sensor Depth Fusion Erik Sandström (ETH Zürich)*; Martin R. Oswald (ETH Zurich); Suryansh Kumar (ETH Zurich); Silvan Weder (ETH Zürich); Fisher Yu (ETH Zurich); Cristian Sminchisescu (Lund University); Luc Van Gool (ETH Zurich)
1946 COO: Comic Onomatopoeia Dataset for Recognizing Arbitrary or Truncated Texts Jeonghun Baek (The University of Tokyo)*; Yusuke Matsui (The University of Tokyo); Kiyoharu Aizawa (The University of Tokyo)
2104 Target-absent Human Attention Zhibo Yang (Stony Brook University)*; Sounak Mondal (Stony Brook University); Seoyoung Ahn (Stony Brook University); Gregory Zelinsky (Stony Brook University); Minh Hoai (Stony Brook University); Dimitris Samaras (Stony Brook University)
2171 CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution Taeho Kim (University of Colorado at Boulder)*; Yongin Kwon (Electronics and Telecommunications Research Institute); Jemin Lee (Electronics and Telecommunications Research Institute); Taeho Kim (Electronics and Telecommunications Research Institute); Sangtae Ha (University of Colorado at Boulder)
2180 Scraping Textures from Natural Images for Synthesis and Editing Xueting Li (University of California, Merced)*; Xiaolong Wang (UCSD); Ming-Hsuan Yang (University of California at Merced); Alexei A Efros (UC Berkeley); Sifei Liu (NVIDIA)
2267 Teaching with Soft Label Smoothing for Mitigating Noisy Labels in Facial Expressions Tohar Lukov (National University of Singapore)*; Na Zhao (NUS); Gim Hee Lee (National University of Singapore); Ser-Nam Lim (Facebook AI)
2336 Hierarchical Semantic Regularization of Latent Spaces in StyleGANs Tejan Karmali (Indian Institute of Science, Bengaluru)*; Rishubh Parihar (Indian Institute of Science, Bangalore); Susmit Agrawal (Indian Institute of Science); Harsh Rangwani (Indian Institute of Science); Varun Jampani (Google); Maneesh K Singh (Motive Technologies ); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science)
2501 Dense Gaussian Processes for Few-Shot Segmentation Joakim Johnander (Linköping University)*; Johan Edstedt (Linköping University); Fahad Shahbaz Khan (MBZUAI); Michael Felsberg (Linköping University); Martin Danelljan (ETH Zurich)
2589 Learning Continuous Implicit Representation for Near-Periodic Patterns Bowei Chen (CMU)*; Tiancheng Zhi (ByteDance); Martial Hebert (cmu); Srinivasa Narasimhan (Carnegie Mellon University, USA)
2631 Dense Siamese Network for Dense Unsupervised Learning Wenwei Zhang (NTU)*; Jiangmiao Pang (CUHK); Kai Chen (SenseTime Research); Chen Change Loy (Nanyang Technological University)
2667 Improving the Reliability for Confidence Estimation Haoxuan Qu (Singapore University of Technology and Design)*; Yanchao Li (Singapore University of Technology and Design); Lin Geng Foo (Singapore University of Technology and Design); Jason Kuen (Adobe Research); Jiuxiang Gu (Adobe Research); Jun Liu (Singapore University of Technology and Design)
2766 Dynamically Transformed Instance Normalization Network for Generalizable Person Re-Identification BingLiang Jiao (Northwestern Polytechnical University); Lingqiao Liu (University of Adelaide); Liying Gao (Northwestern Polytechnical University); Guosheng Lin (Nanyang Technological University); Lu Yang (Northwestern Polytechnical University); Shizhou Zhang (Northwestern Polytechnical University); Peng Wang (Northwestern Polytechnical University)*; Yanning Zhang (Northwestern Polytechnical University)
2829 Exploring Resolution and Degradation Clues as Self-supervised Signal for Low Quality Object Detection Ziteng Cui (The University of Tokyo); Yingying Zhu (University of Texas Arlington); Lin Gu (RIKEN,AIP / The University of Tokyo)*; Guo-Jun Qi (Futurewei Technologies); Xiaoxiao Li (The University of British Columbia); Renrui Zhang (Shanghai AI Lab); Zenghui Zhang (Shanghai Jiao Tong university); Tatsuya Harada (The University of Tokyo / RIKEN)
2856 Robust Category-Level 6D Pose Estimation with Coarse-to-Fine Rendering of Neural Features Wufei Ma (Purdue University)*; Angtian Wang (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Adam Kortylewski (Max Planck Institute for Informatics)
3127 Towards Robust Face Recognition with Comprehensive Search Manyuan Zhang (Sensetime)*; Guanglu Song (Sensetime); Yu Liu (SenseTime Group LTD); Hongsheng Li (The Chinese University of Hong Kong)
3217 Pose2Room: Understanding 3D Scenes from Human Activities Yinyu Nie (Technical University of Munich)*; Angela Dai (Technical University of Munich); Xiaoguang Han (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen)); Matthias Niessner (Technical University of Munich)
3298 Uncertainty Inspired Underwater Image Enhancement Zhenqi Fu (Xiamen University)*; Wu Wang (Xiamen University); Yue Huang (Xiamen University); Xinghao Ding (Xiamen University); Kai-Kuang Ma (Nanyang Technological University, Singapore)
3351 S^2Contact: Graph-based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning Tze Ho Elden Tse (University of Birmingham)*; Zhongqun Zhang (University of Birmingham); Kwang In Kim (UNIST); Ales Leonardis (University of Birmingham); Feng Zheng (SUSTech); Hyung Jin Chang (University of Birmingham)
3511 DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation Songhua Liu (National University of Singapore)*; Jingwen Ye (National University of Singapore); Sucheng Ren (South China University of Technology); Xinchao Wang (National University of Singapore)
3731 TAFIM: Targeted Adversarial Attacks against Facial Image Manipulations Shivangi Aneja (Technical University Of Munich )*; Lev Markhasin (Sony Europe); Matthias Niessner (Technical University of Munich)
3769 EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers Junting Pan (The Chinese University of Hong Kong); Adrian Bulat (Samsung AI Center, Cambridge); Fuwen Tan (Samsung AI Center, Cambridge); Xiatian Zhu (University of Surrey); Lukasz Dudziak (Samsung AI Center Cambridge); Hongsheng Li (The Chinese University of Hong Kong); Georgios Tzimiropoulos (Queen Mary University of London); Brais Martinez (Samsung AI Center)*
4037 TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers Jihao Liu (Sensetime)*; Boxiao Liu (Institute of Computing Technology, Chinese Academy of Sciences); Hang Zhou (The Chinese University of Hong Kong); Hongsheng Li (The Chinese University of Hong Kong); Yu Liu (SenseTime Group LTD)
4043 Master of All: Simultaneous Generalization of Urban-Scene Segmentation to All Adverse Weather Conditions Nikhil Reddy (IIT Delhi)*; Abhinav Singhal (Indian Institute of Technology, Delhi); Abhishek Kumar (IIT Delhi); Mahsa Baktashmotlagh (University of Queensland); Chetan Arora (Indian Institute of Technology Delhi)
4115 DFNet: Enhance Absolute Pose Regression with Direct Feature Matching Shuai Chen (University of Oxford)*; Xinghui Li (University of Oxford); Zirui Wang (University of Oxford); Victor Adrian Prisacariu (University of Oxford)
4124 Towards Regression-Free Neural Networks for Diverse Compute Platforms Rahul Duggal (Georgia Tech); Hao Zhou (Amazon); Shuo Yang (Amazon); Jun Fang (Amazon)*; Yuanjun Xiong (Amazon); Wei Xia (Amazon)
4195 Word-Level Fine-Grained Story Visualization Bowen Li (University of Oxford)*
4222 Image Inpainting with Cascaded Modulation GAN and Object-Aware Training Haitian Zheng (University of Rochester)*; Zhe Lin (Adobe Research); Jingwan Lu (Adobe Research ); Scott Cohen (Adobe Research); Eli Shechtman (Adobe Research, US); Connelly Barnes (Adobe); Jianming Zhang (Adobe Research); Ning Xu (Adobe Research); Sohrab Amirghodsi (Adobe Research); Jiebo Luo (U. Rochester)
4247 Depth Map Decomposition for Monocular Depth Estimation Jinyoung Jun (Korea University)*; Jae-Han Lee (Gauss Labs Inc.); Chul Lee (Dongguk University); Chang-Su Kim (Korea university)
4275 Video Extrapolation in Space and Time Yunzhi Zhang (Stanford University)*; Jiajun Wu (Stanford University)
4304 Learning Visual Styles from Audio-Visual Associations Tingle Li (Tsinghua University)*; Yichen Liu (Tsinghua University); Andrew Owens (U Michigan); Hang Zhao (Tsinghua University)
4336 Learning Mutual Modulation for Self-Supervised Cross-Modal Super-Resolution Xiaoyu Dong (The University of Tokyo / RIKEN AIP); Naoto Yokoya (The University of Tokyo)*; Longguang Wang (National University of Defense Technology); Tatsumi Uezato (Hitachi, Ltd)
4406 DeMFI: Deep Joint Deblurring and Multi-Frame Interpolation with Flow-Guided Attentive Correlation and Recursive Boosting Jihyong Oh (KAIST)*; Munchurl Kim (Korea Advanced Institute of Science and Technology)
4412 Sliced Recursive Transformer Zhiqiang Shen (Carnegie Mellon University)*; Zechun Liu (Carnegie Mellon University); Eric Xing (MBZUAI, CMU, and Petuum Inc.)
4429 UniNet: Unified Architecture Search with Convolution, Transformer, and MLP Jihao Liu (Sensetime)*; Xin Huang (Waseda University); Guanglu Song (Sensetime); Hongsheng Li (The Chinese University of Hong Kong); Yu Liu (SenseTime Group LTD)
4581 Deep Portrait Delighting Joshua William Weir (Victoria University of Wellington)*; Junhong Zhao (CMIC); Andrew Chalmers (CMIC); Taehyun Rhee (Victoria University of Wellington)
4619 Facial Depth and Normal Estimation using Single Dual-Pixel Camera Minjun Kang (KAIST)*; Jaesung Choe (KAIST); Hyowon Ha (Facebook); Hae-Gon Jeon (GIST); Sunghoon Im (DGIST); In So Kweon (KAIST); Kuk-Jin Yoon (KAIST)
4652 Neural Scene Decoration from a Single Photograph Hong Wing Pang (The Hong Kong University of Science and Technology)*; Yingshu Chen ( The Hong Kong University of Science and Technology); Phuoc-Hieu Le (VinAI Research); Binh-Son Hua (VinAI Research); Thanh Nguyen (Deakin University, Australia); Sai-Kit Yeung (Hong Kong University of Science and Technology)
4727 When Active Learning Meets Implicit Semantic Data Augmentation zhuangzhuang chen (shenzhen university); Jin Zhang (Shenzhen University); Pan Wang (Shenzhen University); Jie Chen (Shenzhen University); Jianqiang Li (Shenzhen University)*
4739 Hallucinating Pose-Compatible Scenes Tim Brooks (UC Berkeley)*; Alexei A Efros (UC Berkeley)
4940 Few Zero Level Set-Shot Learning of Shape Signed Distance Functions in Feature Space Amine Ouasfi (IMT Atlantique ); Adnane Boukhayma (Inria)*
4972 diffConv: Analyzing Irregular Point Clouds with an Irregular View Manxi Lin (Technical University of Denmark)*; Aasa Feragen (Technical University of Denmark)
4977 TACS: Taxonomy Adaptive Cross-Domain Semantic Segmentation RUI GONG (ETH Zurich)*; Martin Danelljan (ETH Zurich); Dengxin Dai (ETH Zurich); Danda Pani Paudel (ETH Zürich); Ajad Chhatkuli (ETH Zurich); Fisher Yu (ETH Zurich); Luc Van Gool (ETH Zurich)
5083 Weight Fixing Networks Chris Subia-Waud (University of Southampton)*; Srinandan Dasmahapatra (University of Southampton)
5180 Directed Ray Distance Functions for 3D Scene Reconstruction Nilesh Kulkarni (University of Michigan)*; Justin Johnson (University of Michigan); David Fouhey (University of Michigan)
5251 FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context Pinaki Nath Chowdhury (University of Surrey)*; Aneeshan Sain (University of Surrey); Ayan Kumar Bhunia (University of Surrey); Tao Xiang (University of Surrey); Yulia Gryaditskaya (University of Surrey); Yi-Zhe Song (University of Surrey)
5265 Exploring Fine-grained Audiovisual Categorization with the SSW60 Dataset Grant Van Horn (Cornell University)*; Rui Qian (Cornell University); Kimberly Wilber (Google); Hartwig Adam (Google); Oisin Mac Aodha (University of Edinburgh); Serge Belongie (University of Copenhagen)
5327 A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility Andrea Burns (Boston University)*; Deniz Arsan (University of Illinois at Urbana Champaign); Sanjna Agrawal (Boston University); Ranjitha Kumar (UIUC: CS); Kate Saenko (Boston University); Bryan Plummer (Boston University)
5361 All You Need is RAW: Defending Against Adversarial Attacks with Camera Image Pipelines Yuxuan Zhang (Princeton University)*; Bo Dong (Princeton University); Felix Heide (Princeton University)
5377 Connecting Compression Spaces with Transformer for Approximate Nearest Neighbor Search Haokui Zhang (Lighthouse Co.Ltd)*; Buzhou Tang (Harbin Institute of Technology, China); Wenze Hu (Lighthouse Co.Ltd); Xiaoyu Wang (The Chinese University of Hong Kong (Shenzhen))
5405 PSS: Progressive Sample Selection for Open-World Visual Representation Learning Tianyue Cao (Shanghai Jiao Tong University); Yongxin Wang (Amazon)*; Yifan Xing (AMAZON CORPORATE LLC); Tianjun Xiao (Amazon); Tong He (Amazon); Zheng Zhang (AWS); Hao Zhou (Amazon); Joseph Tighe (Amazon)
5424 Are Vision Transformers Robust to Patch-wise Perturbations? Jindong Gu (University of Munich)*; Volker Tresp (Siemens AG and Ludwig Maximilian University of Munich ); Yao Qin (Google)
5484 Point MixSwap: Attentional Point Cloud Mixing via Swapping Matched Structural Divisions Ardian Umam (NYCU)*; Cheng-Kun Yang (National Taiwan University); Yung-Yu Chuang (National Taiwan University); Jen-Hui Chuang (National Chiao Tung University ); Yen-Yu Lin (National Yang Ming Chiao Tung University)
5661 UC-OWOD: Unknown-Classified Open World Object Detection Zhiheng Wu (Institute of Automation, Chinese Academy of Sciences (CASIA))*; Yue Lu (Institute of Automation, Chinese Academy of Sciences(CASIA)); Xingyu Chen (Xiaobing.AI); Zhengxing Wu (CASIA); Liwen Kang (Institute of Automation, Chinese Academy of Sciences (CASIA)); Junzhi Yu (CASIA)
5669 RayTran: 3D pose estimation and shape reconstruction of multiple objects from videos with ray-traced transformers Michał J Tyszkiewicz (EPFL); Kevis-Kokitsi Maninis (Google Research)*; Stefan Popov (Google Research); Vittorio Ferrari (Google Research)
5702 Space-Partitioning RANSAC Daniel Barath (ETH Zürich)*; Gábor Valasek (ELTE)
5750 SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks Anish J Prabhu (Apple)*; Chien-Yu lin (University of Washington); Thomas Merth (Apple); Sachin Mehta (University of Washington); Anurag Ranjan (Apple); Maxwell C Horton (Apple, Xnor.Ai and University of Washington); Mohammad Rastegari (University of Washington)
5771 Structure and Motion for Casual Videos Zhoutong Zhang (MIT)*; Forrester Cole (Google Research); Zhengqi Li (Google Inc.); Noah Snavely (Google); Michael Rubinstein (Google); William T Freeman (Google)
5932 Controllable Shadow Generation Using Pixel Height Maps Yichen Sheng (Purdue University)*; Yifan Liu (University of Adelaide); Jianming Zhang (Adobe Research); Wei Yin (University of Adelaide); A. Cengiz Oztireli (University of Cambridge, Google); He Zhang (Adobe); Zhe Lin (Adobe Research); Eli Shechtman (Adobe Research, US); Bedrich Benes (Purdue University)
6046 Contextformer: A Transformer with Spatio-Channel Attention for Context Modeling in Learned Image Compression Ahmet Burakhan Koyuncu (Technical University of Munich)*; Han Gao (Tencent America); Atanas Boev (Huawei Technologies Duesseldorf GmbH); Georgii Gaikov (Huawei Moscow Research Center); Elena Alshina (Huawei Technologies); Eckehard Steinbach (TUM)
6078 Dynamic Local Aggregation Network with Adaptive Clusterer for Anomaly Detection Zhiwei Yang (Xidian University)*; Peng Wu (Xidian University); Jing Liu (Xidian University); Xiaotao Liu (Xidian University)
6199 MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration Thomas F Hayes (Meta); Songyang Zhang (University of Rochester)*; Xi Yin (Facebook); Guan Pang (Facebook); Sasha Sheng (Meta Platforms); Harry Yang (Facebook); Songwei Ge (University of Maryland, College Park); Qiyuan Hu (Facebook AI Research); Devi Parikh (Georgia Tech & Facebook AI Research)
6210 Lipschitz Continuity Retained Binary Neural Network Yuzhang Shang (Illinois Institute of Technology)*; Dan Xu (The Hong Kong University of Science and Technology); Bin Duan (Illinois Institute of Technology); Ziliang Zong (Texas State University); Liqiang Nie (Harbin Institute of Technology (Shenzhen)); Yan Yan (Illinois Institute of Technology)
6259 Seeing through a Black Box: Toward High-Quality Terahertz Imaging via Subspace-and-Attention Guided Restoration Weng-Tai Su (National Tsing Hua University); Yi-Chun Hung (University of California, Los Angeles); Po-Jen Yu (National Tsing Hua University); Shang-Hua Yang (National Tsing Hua University); Chia-Wen Lin (National Tsing Hua University)*
6365 Video Dialog as Conversation about Objects Living in Space-Time Hoang-Anh Pham (Deakin University)*; Thao Minh Le (Deakin University); Vuong Le (Deakin University); Tu Minh Phuong (Posts and Telecommunications Institute of Technology); Truyen Tran (Deakin University)
6555 Adversarial Label Poisoning Attack on Graph Neural Networks via Label Propagation Ganlin Liu (The University of Liverpool)*; Xiaowei Huang (Liverpool University); Xinping Yi (University of Liverpool)
6560 Visual Knowledge Tracing Neehar Kondapaneni (Caltech)*; Pietro Perona (California Institute of Technology); Oisin Mac Aodha (University of Edinburgh)
6634 FedLTN: Federated Learning for Sparse and Personalized Lottery Ticket Networks Vaikkunth Mugunthan (DynamoFL)*; Eric Lin (DynamoFL); Vignesh Gokul (University of California San Diego); Christian Lau (DynamoFL); Lalana Kagal (MIT); Steve Pieper (Isomics, Inc.)
6651 Improving the Perceptual Quality of 2D Animation Interpolation Shuhong Chen (University of Maryland – College Park)*; Matthias Zwicker (University of Maryland)
6867 Where in the World is this Image? Transformer-based Geo-localization in the Wild Shraman Pramanick (Johns Hopkins University)*; Ewa M Nowara (Meta Reality Labs); Joshua Gleason (Univ of Maryland); Carlos Castillo (Johns Hopkins University); Rama Chellappa (Johns Hopkins University)
6997 Variance-Aware Weight Initialization for Point Convolutional Neural Networks Pedro Hermosilla Casajus (Ulm University)*; Michael Schelling (Ulm University – Institute of Media Informatics); Tobias Ritschel (UCL); Timo Ropinski (Ulm University)
7090 Learning Omnidirectional Flow in 360° Video via Siamese Representation Keshav Bhandari (Texas State University)*; Bin Duan (Illinois Institute of Technology); Gaowen Liu (Cisco Research); Hugo M Latapie (Cisco); Ziliang Zong (Texas State University); Yan Yan (Illinois Institute of Technology)
7186 BA-Net: Bridge Attention for Deep Convolutional Neural Networks Yue Zhao (Sun Yat-sen University); Junzhou Chen (Sun Yat-sen University)*; Zhang Zirui (Sun Yat-sen University); Ronghui Zhang (Sun Yat-Sen University)
7385 LANA: Latency Aware Network Acceleration Pavlo Molchanov (NVIDIA)*; James B Hall (Microsoft Research); Hongxu Yin (NVIDIA ); Nicolo Fusi (Microsoft Research); Jan Kautz (NVIDIA); Arash Vahdat (NVIDIA)
7955 DSR — A dual subspace re-projection network for surface anomaly detection Vitjan Zavrtanik (University of Ljubljana)*; Matej Kristan (University of Ljubljana); Danijel Skocaj (University of Ljubljana)
8063 Learning to use unlabeled data in data augmentation for 3D detection Zhaoqi Leng (Waymo)*; Shuyang Cheng (Waymo LLC); Ben Caine (Google); Weiyue Wang (Waymo); Xiao Zhang (Cruise); Jonathon Shlens (Google); Mingxing Tan (Waymo); Dragomir Anguelov (Waymo)
46 PPT: token-Pruned Pose Transformer for monocular and multi-view human pose estimation Haoyu Ma (University of California, Irvine)*; Zhe Wang (UC-Irvine); Yifei Chen (Tencent); Deying Kong (university of california, irvine); Liangjian Chen (Reality Labs); Xingwei Liu (University of California Irvine); Xiangyi Yan (University of California, Irvine); Hao Tang (University of California Irvine); Xiaohui Xie (University of California, Irvine)
194 ExtrudeNet: Unsupervised Inverse Sketch-and-Extrude for Shape Parsing Daxuan Ren (Nanyang Technological University)*; Jianmin Zheng (Nanyang Technological University); Jianfei Cai (Monash University); jiatong j li (Sensetime); Junzhe Zhang (Nanyang Technological University)
499 Demystifying Unsupervised Semantic Correspondence Estimation Mehmet Aygün (The University of Edinburgh)*; Oisin Mac Aodha (University of Edinburgh)
893 LiDAL: Inter-frame Uncertainty Based Active Learning for 3D LiDAR Semantic Segmentation ZEYU HU (Hong Kong University of Science and Technology)*; Xuyang Bai (HKUST); Runze Zhang (Tencent); Xin Wang (Tencent); Guangyuan Sun (TENCENT); Hongbo Fu (City University of Hong Kong); Chiew-Lan Tai (Hong Kong University of Science & Technology)
940 FashionViL: Fashion-Focused Vision-and-Language Representation Learning Xiao Han (University of Surrey)*; Licheng Yu (Facebook); Xiatian Zhu (University of Surrey); Li Zhang (Fudan University); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey)
1118 CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video Wei Lin (Graz University of Technology)*; Anna Kukleva (MPII); Kunyang Sun (Southeast University); Horst Possegger (Graz University of Technology); Hilde Kuehne (Goethe University Frankfurt); Horst Bischof (Graz University of Technology)
1415 NeILF: Neural Incident Light Field for Physically-based Material Estimation Yao Yao (Apple Inc.); Jingyang Zhang (The Hong Kong University of Science and Technology)*; Jingbo Liu (Apple Inc.); Yihang Qu (Apple Inc.); Tian Fang (Apple); David N McKinnon (Apple); Yanghai Tsin (Apple Inc); Long Quan (Apple)
1439 What to Hide from Your Students: Attention-Guided Masked Image Modeling Ioannis Kakogeorgiou (National Technical University of Athens)*; Spyros Gidaris (valeo.ai); Bill Psomas (National Technical University of Athens); Yannis Avrithis (IARAI, Athena RC); Andrei Bursuc (valeo.ai); Konstantinos Karantzalos (National Technical University of Athens); Nikos Komodakis (University of Crete)
1567 CPO: Change Robust Panorama to Point Cloud Localization Junho Kim (Seoul National University)*; Hojun Jang (Seoul National University); Changwoon Choi (Seoul National University); Young Min Kim (Seoul National University)
2054 Streaming Multiscale Deep Equilibrium Models Can Ufuk Ertenli (Middle East Technical University)*; Emre Akbas (METU); Ramazan Gokberk Cinbis (METU)
3623 JoJoGAN: One Shot Face Stylization Min Jin Chong (University of Illinois at Urbana-Champaign)*; David Forsyth (University of Illinois at Urbana-Champaign)
3672 Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation Nadine Behrmann (Bosch Center for Artificial Intelligence)*; S. Alireza Golestaneh (Google); Zico Kolter (Carnegie Mellon University); Jürgen Gall (University of Bonn); Mehdi Noroozi (Bosch GmbH)
4021 UnrealEgo: A New Dataset for Robust Egocentric 3D Human Motion Capture Hiroyasu Akada (Max Planck Institute for Informatics, Keio University); Jian Wang (Max Planck Institute for Informatics); Soshi Shimada (MPI for Informatics); Masaki Takahashi (Keio University); Christian Theobalt (MPI Informatik); Vladislav Golyanik (MPI for Informatics)*
4330 Augmentation of rPPG Benchmark Datasets: Learning to Remove and Embed rPPG Signals via Double Cycle Consistent Learning from Unpaired Facial Videos WEI-HAO Chung (National Tsing Hua University)*; CHENG-JU HSIEH (National Tsing Hua University); Chiou-Ting Hsu (National Tsing Hua University)
4343 Learning Instance and Task-Aware Dynamic Kernels for Few-shot Learning Rongkai Ma (Monash University)*; Pengfei Fang (The Australian National University); Gil Avraham (Monash University); Yan Zuo (CSIRO); Tianyu Zhu (Monash University); Tom Drummond (University of Melbourne); Mehrtash Harandi (Monash University)
4471 Neural Image Representations for Multi-Image Fusion and Layer Separation Seonghyeon Nam (York University); Marcus A Brubaker (York University); Michael S Brown (York University)*
4802 Neural Video Compression using GANs for Detail Synthesis and Propagation Fabian Mentzer (Google)*; Eirikur Agustsson (Google); Johannes Ballé (Google); David Minnen (Google Inc.); Nick Johnston (Google); George Toderici (Google Research)
4904 Camera Pose Estimation and Localization with Active Audio Sensing Karren D Yang (MIT); Michael Firman (Niantic); Eric Brachmann (Niantic)*; Clement LJC Godard (Niantic)
4961 HULC: 3D HUman Motion Capture with Pose Manifold SampLing and Dense Contact Guidance Soshi Shimada (MPI for Informatics)*; Vladislav Golyanik (MPI for Informatics); Zhi Li (Max Planck Institute for Informatics); Patrick Pérez (Valeo.ai); Weipeng Xu (Reality Labs Research); Christian Theobalt (MPI Informatik)
5034 Audio-Visual Mismatch-Aware Video Retrieval via Association and Adjustment Sangmin Lee (KAIST)*; Sungjune Park (KAIST); Yong Man Ro (KAIST)
5040 Classification-Regression for Chart Comprehension Matan Levy (The Hebrew University of Jerusalem)*; Rami Ben-Ari (OriginAI); Dani Lischinski (The Hebrew University of Jerusalem)
5082 Contrastive Vicinal Space for Unsupervised Domain Adaptation Jaemin Na (Ajou University)*; Dongyoon Han (NAVER AI Lab); Hyung Jin Chang (University of Birmingham); Wonjun Hwang (Ajou University)
5643 GRIT-VLP: Grouped Mini-batch Sampling for Efficient Vision and Language Pre-training Jaeseok Byun (Seoul National university); Taebaek Hwang (M.IN.D Lab); Jianlong Fu (Microsoft Research); Taesup Moon (Seoul National University)*
5666 Helpful or Harmful: Inter-Task Association in Continual Learning Hyundong Jin (Chung-Ang University ); Eunwoo Kim (Chung-Ang University)*
5754 SAGA: Stochastic Whole-Body Grasping With Contact Yan Wu (ETH Zurich); Jiahao Wang (Max Planck Institute for Informatics); Yan Zhang (ETH Zurich); Siwei Zhang (ETH Zurich); Otmar Hilliges (ETH Zurich); Fisher Yu (ETH Zurich); Siyu Tang (ETH Zurich)*
6395 Style Your Hair: Latent Optimization for Pose-Invariant Hairstyle Transfer via Local-Style-Aware Hair Alignment Chaeyeon Chung ( Korea Advanced Institute of Science and Technology)*; Taewoo Kim (Korea Advanced Institute of Science and Technology ); Yoonseo Kim (KAIST); Sunghyun Park (KAIST); Kangyeol Kim (KAIST); Jaegul Choo (Korea Advanced Institute of Science and Technology)
6895 NashAE: Disentangling Representations through Adversarial Covariance Minimization Eric C Yeats (Duke University)*; Frank Liu (Oak Ridge National Lab); David Womble (Oak Ridge National Laboratory); Hai Li (Duke University)
6959 ManiFest: manifold deformation for few-shot image translation Fabio Pizzati (Inria / Vislab)*; Jean-Francois Lalonde (Université Laval); Raoul de Charette (Inria)
7093 Improving Generalization in Federated Learning by Seeking Flat Minima Debora Caldarola (Politecnico di Torino)*; Barbara Caputo (Politecnico di Torino); Marco Ciccone (Politecnico di Torino)
7102 MultiMAE: Multi-modal Multi-task Masked Autoencoders Roman Bachmann (EPFL)*; David Mizrahi (EPFL); Andrei Atanov (EPFL); Amir Zamir (Swiss Federal Institute of Technology (EPFL))
7174 Panoramic Vision Transformer for Saliency Detection in 360 Videos Heeseung Yun (Seoul National University)*; Sehun Lee (Seoul National University); Gunhee Kim (Seoul National University)
7218 PoserNet: Refining Relative Camera Poses Exploiting Object Detections Matteo Taiana (Istituto Italiano di Tecnologia)*; Matteo Toso (Istituto Italiano di Tecnologia); Stuart James (Istituto Italiano di Tecnologia (IIT)); Alessio Del Bue (Istituto Italiano di Tecnologia (IIT))
7331 Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction YuXuan Liu (Covariant.ai, UC Berkeley)*; Nikhil Mishra (Covariant.ai, UC Berkeley); Maximilian Sieb (Covariant.ai); Fred Shentu (UC Berkeley); Pieter Abbeel (UC Berkeley); Peter Chen (COVARIANT.AI)
7450 VizWiz-FewShot: Locating Objects in Images Taken by People With Visual Impairments Yu-Yun Tseng (University of Colorado Boulder)*; Alexander Bell (IVC Group); Danna Gurari (University of Colorado Boulder)
7950 Transfer without Forgetting Matteo Boschini (University of Modena and Reggio Emilia)*; Lorenzo Bonicelli (Università of Modena and Reggio Emilia); Angelo Porrello (University of Modena and Reggio Emilia); Giovanni Bellitto (University of Catania); Matteo Pennisi (University of Catania); Simone Palazzo (University of Catania); Concetto Spampinato (University of Catania); SIMONE CALDERARA (University of Modena and Reggio Emilia, Italy)
1425 Multiview Stereo with Cascaded Epipolar RAFT Zeyu Ma (Princeton University)*; Zachary Teed (Princeton University); Jia Deng (Princeton University)
735 ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound Yan-Bo Lin (UNC Chapel Hill)*; Jie Lei (UNC Chapel Hill); Mohit Bansal (University of North Carolina at Chapel Hill); Gedas Bertasius (UNC Chapel Hill)
1212 CMD: Self-supervised 3D Action Representation Learning with Cross-modal Mutual Distillation Yunyao Mao (University of Science and Technology of China)*; Wengang Zhou (University of Science and Technology of China); Zhenbo Lu (Institute of Artificial Intelligence, Hefei Comprehensive National Science Center); Jiajun Deng (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China)
4640 Self-supervised Social Relation Representation for Human Group Detection Jiacheng Li (College of Intelligence and Computing, Tianjin University); Ruize Han (College of Intelligence and Computing, Tianjin University)*; Haomin Yan (Tianjin University); Zekun Qian (College of Intelligence and Computing, Tianjin University); Wei Feng (College of Intelligence and Computing, Tianjin University, China); Song Wang (University of South Carolina)
4861 GraphVid: It Only Takes a Few Nodes to Understand a Video Eitan Kosman (Bosch AI)*; Dotan Di Castro (Bosch)
5080 PrivHAR: Recognizing Human Actions From Privacy-preserving Lens Carlos Hinojosa (Universidad Industrial de Santander)*; Miguel A Marquez (UIS Colombia); Henry Arguello (Universidad Industrial Santander); Ehsan Adeli (Stanford University); Li Fei-Fei (Stanford University); Juan Carlos Niebles (Salesforce & Stanford University)
6132 Flow graph to Video Grounding for Weakly-supervised Multi-Step Localization NIKITA DVORNIK (Samsung)*; Isma Hadji (Samsung AI Center – Toronto); Hai X Pham (Samsung AI Center); Dhaivat Bhatt (Samsung); Brais Martinez (Samsung AI Center); Afsaneh Fazly (SAIC Toronto); Allan D Jepson (Samsung Toronto AIC)
7215 Bi-PointFlowNet: Bidirectional Learning for Point Cloud Based Scene Flow Estimation WENCAN CHENG (Sungkyunkwan University); Jong Hwan Ko (Sungkyunkwan University)*
7248 Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration Aditi Basu Bal (Florida State University)*; Ramy A Mounir (University of South Florida); Sathyanarayanan N Aakur (OK State); Sudeep Sarkar (University of South Florida, Tampa); Anuj Srivastava (Florida State University)
2021 Adaptive Patch Exiting for Scalable Single Image Super-Resolution Shizun Wang (Beijing University of Posts and Telecommunications)*; Jiaming Liu (Peking University); Kaixin Chen (Beijing University of Posts and Telecommunications); Xiaoqi Li (Columbia University in the City of New York); Ming Lu (Intel Labs China); Yandong Guo (OPPO Research Institute)
2153 Perceptual Artifacts Localization for Inpainting Lingzhi Zhang (University of Pennsylvania)*; Yuqian Zhou (Adobe); Connelly Barnes (Adobe); Zhe Lin (Adobe Research); Eli Shechtman (Adobe Research, US); Sohrab Amirghodsi (Adobe Research); Jianbo Shi (University of Pennsylvania)
4067 Secrets of Event-Based Optical Flow Shintaro Shiba (Keio University)*; Yoshimitsu Aoki (Keio University); Guillermo Gallego (TU Berlin)
4919 KXNet: A Model-Driven Deep Neural Network for Blind Super-Resolution Jiahong Fu (Xi’an Jiaotong University)*; Hong Wang (Jarvis Lab, Tencent); Qi Xie (Xi’an Jiaotong University); Qian Zhao (Xi’an Jiaotong University); Deyu Meng (Xi’an Jiaotong University); Zongben Xu (Xi’an Jiaotong University)
6180 Learning Topological Interactions for Multi-Class Medical Image Segmentation Saumya Gupta (Stony Brook University)*; Xiaoling Hu (Stony Brook University); James Kaan (Stony Brook University); Michael Jin (Stony Brook University Hospital); Mutshipay Christian Mpoy (SUNY Stony Brook Medicine); Katherine Chung (Stony Brook University Hospital); Gagandeep Singh (RWJBarnabas Health); Mary Saltz (Stony Brook); Tahsin Kurc (Stony Brook University); Joel Saltz (Stony Brook University); APOSTOLOS K TASSIOPOULOS (Stony Brook University); Prateek Prasanna (Stony Brook University); Chao Chen (Stony Brook University)
6193 Unsupervised Segmentation in Real-World Images via Spelke Object Inference Honglin Chen (Stanford University); Rahul M V (Stanford University); Yoni I Friedman (MIT); Jiajun Wu (Stanford University); Joshua Tenenbaum (MIT); Daniel Yamins (Stanford University); Daniel Bear (Stanford University)*

Thursday 27th

Orals 9 (Thu. am)

Stereo and 3D Multiview/Sensors

1407 Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes Julian Chibane (Max Planck Institute for Informatics, University of Wuerzburg)*; Francis Engelmann (ETH AI Center); Anh Tuan Tran (Max Planck Institute for Informatics, Saarland University); Gerard Pons-Moll (University of Tübingen)
2290 Generalizable Patch-Based Neural Rendering Mohammed Suhail (University of British Columbia)*; Carlos Esteves (Google Research); Leonid Sigal (University of British Columbia); Ameesh Makadia (Google Research)
5096 Solution Space Analysis of Essential Matrix based on Algebraic Error Minimization Gaku Nakano (NEC Corporation)*
5285 Approximate Differentiable Rendering with Algebraic Surfaces Leonid Keselman (Carnegie Mellon University)*; Martial Hebert (Carnegie Mellon School of Computer Science)
6571 Beyond Periodicity: Towards a Unifying Framework for Activations in Coordinate-MLPs Sameera Ramasinghe (University of Adelaide)*; Simon Lucey (University of Adelaide)
7838 Gaussian Activated Neural Radiance Fields for High Fidelity Reconstruction & Pose Estimation Shin-Fang Chng (The University of Adelaide)*; Sameera Ramasinghe (University of Adelaide); Jamie Sherrah (AIML); Simon Lucey (University of Adelaide)
7886 Unbiased Gradient Estimation for Differentiable Surface Splatting via Poisson Sampling Jan U. Müller (University of Bonn)*; Michael Weinmann (TU Delft); Reinhard Klein (University of Bonn)

Orals 10 (Thu. am)

Datasets and Evaluation

2808 OccamNets: Mitigating Dataset Bias by Favoring Simpler Hypotheses Robik S Shrestha (Rochester Institute of Technology)*; Kushal Kafle (Adobe Research); Christopher Kanan (University of Rochester)
3239 Event-Based Fusion for Motion Deblurring with Cross-modal Attention Lei Sun (Zhejiang University); Christos Sakaridis (ETH Zurich); Jingyun Liang (ETH Zurich); Qi Jiang (Zhejiang University); Kailun Yang (Karlsruhe Institute of Technology); Peng Sun (Zhejiang University); Yaozu Ye (State Key Laboratory of Modern Optical Instrumentation, Zhejiang University); Kaiwei Wang (State Key Laboratory of Modern Optical Instrumentation, Zhejiang University)*; Luc Van Gool (ETH Zurich)
3631 3D CoMPaT: Composition of Materials on Parts of 3D Things Yuchen Li (King Abdullah University of Science and Technology (KAUST)); Ujjwal Upadhyay (KAUST); Habib Slim (KAUST); Tezuesh Varshney (KAUST); Ahmed Abdelreheem (KAUST); Arpit Prajapati (Poly9); Suhail S Pothigara (Poly9 Inc); Peter Wonka (KAUST); Mohamed Elhoseiny (KAUST)*
4514 ROBIN: A Benchmark for Robustness to Individual Nuisances in Real-World Out-of-Distribution Shifts Bingchen Zhao (University of Edinburgh)*; Shaozuo Yu (Tongji University); Wufei Ma (Purdue University); Mingxin Yu (Peking University); Shenxiao Mei (Johns Hopkins University); Angtian Wang (Johns Hopkins University); Ju He (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Adam Kortylewski (Max Planck Institute for Informatics)
5263 The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning Jack Hessel (Allen Institute for AI)*; Jena D Hwang (Allen Institute for AI); Jae Sung Park (University of Washington); Rowan Zellers (University of Washington); Chandra Bhagavatula (AllenAI); Anna Rohrbach (UC Berkeley); Kate Saenko (Boston University); Yejin Choi (University of Washington)
6185 Look Both Ways: Self-Supervising Driver Gaze Estimation and Road Scene Saliency Isaac H Kasahara (University of Minnesota); Simon Stent (Toyota Research Institute); Hyun Soo Park (The University of Minnesota)*
6243 A Dense Material Segmentation Dataset for Indoor and Outdoor Scene Parsing Paul Upchurch (Apple)*; Ransen Niu (Apple)
8098 “This is my unicorn, Fluffy”: Personalizing frozen vision-language representations Niv Cohen (The Hebrew University of Jerusalem)*; Rinon Gal (Tel Aviv University); Eli Meirom (NVIDIA Research); Gal Chechik (NVIDIA); Yuval Atzmon (NVIDIA Research)
185 HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling Zhongang Cai (SenseTime International Pte Ltd)*; Daxuan Ren (Nanyang Technological University); Ailing Zeng (The Chinese University of Hong Kong); Zhengyu Lin (SenseTime); Tao Yu (Tsinghua University); Wenjia Wang (SenseTime); Xiangyu Fan (Sensetime); Yang Gao (Sensetime); Yifan Yu (ETH Zurich); Liang Pan (Nanyang Technological University); Fangzhou Hong (Nanyang Technological University); Mingyuan Zhang (Nanyang Technological University); Chen Change Loy (Nanyang Technological University); Lei Yang (Sensetime Group Limited); Ziwei Liu (Nanyang Technological University)

Posters 5 (Thu. early)

75 Knowledge Condensation Distillation chenxin li (Xiamen University)*; Mingbao Lin (Xiamen University, China); Zhiyuan Ding (Xiamen University); Nie Lin (Hunan University); Yihong Zhuang (Xiamen University); Yue Huang (Xiamen University); Xinghao Ding (Xiamen University); Liujuan Cao (Xiamen University)
101 Class-incremental Novel Class Discovery Subhankar Roy (University of Trento); Mingxuan Liu (University of Trento); Zhun Zhong (University of Trento)*; Nicu Sebe (University of Trento); Elisa Ricci (University of Trento)
184 WeLSA: Learning To Predict 6D Pose From Weakly Labeled Data Using Shape Alignment Shishir Reddy Vutukur (TU Munich / Siemens Technology)*; Ivan Shugurov (TU Munich / Siemens Corporate Technology); Benjamin Busam (Technical University of Munich); ANDREAS HUTTER (Siemens Corporate Technology, Germany); Slobodan Ilic (TUM)
296 BEAT: A Large-Scale Semantic and Emotional Multi-Modal Dataset for Conversational Gestures Synthesis Haiyang Liu (The University of Tokyo)*; Zihao Zhu (Keio University); Naoya Iwamoto (Huawei Technologies Japan K.K.); Yichen Peng (Japan Advanced Institute of Science and Technology); Zhengqing Li (Huawei Japan K.K.); YOU ZHOU (Tokyo Research Center, Huawei); Elif Bozkurt (Huawei Turkey R&D Center, Istanbul, Turkey); Bo Zheng (Huawei)
537 Implicit Neural Representations for Image Compression Yannick Strümpler (ETH Zürich)*; Janis Postels (ETH Zurich); Ren Yang (ETH Zurich); Luc Van Gool (ETH Zurich); Federico Tombari (Google, TU Munich)
599 Neural Architecture Search for Spiking Neural Networks Youngeun Kim (Yale University)*; Yuhang Li (Yale University); Hyoungseob Park (Yale University); Yeshwanth Venkatesha (Yale University); Priyadarshini Panda (Yale University)
654 Semi-Supervised Monocular 3D Object Detection by Multi-View Consistency Qing Lian (Hong Kong University of Science and Technology)*; Yanbo XU (The Hong Kong University of Science and Technology); Weilong Yao (Shanghai Xiantu Intelligent Technology Co., Ltd.); Yingcong Chen (Hong Kong University of Science and Technology); Tong Zhang (Hong Kong University of Science and Technology)
790 Learn-to-Decompose: Cascaded Decomposition Network for Cross-Domain Few-Shot Facial Expression Recognition Xinyi Zou (Xiamen University); Yan Yan (Xiamen University)*; Jing-Hao Xue (University College London); Si Chen (Xiamen University of Technology); Hanzi Wang (Xiamen University)
798 Learning with Recoverable Forgetting Jingwen Ye (National University of Singapore)*; Fu Yifang (National University of Singapore); Jie Song (Zhejiang University); Xingyi Yang (National University of Singapore); Songhua Liu (National University of Singapore); Xin Jin (University of Science and Technology of China); Mingli Song (Zhejiang University); Xinchao Wang (National University of Singapore)
933 3D Compositional Zero-shot Learning with DeCompositional Consensus Muhammad Ferjad Naeem (ETH Zürich)*; Evin Pınar Örnek (TU Munich); Yongqin Xian (ETH Zurich); Luc Van Gool (ETH Zurich); Federico Tombari (Google, TU Munich)
1055 Real-time Online Video Detection with Temporal Smoothing Transformers Yue Zhao (University of Texas at Austin)*; Philipp Kraehenbuehl (UT Austin)
1105 Differentiable Raycasting for Self-supervised Occupancy Forecasting Tarasha Khurana (Carnegie Mellon University)*; Peiyun Hu (Carnegie Mellon University); Achal D Dave (Amazon); Jason P Ziglar (Argo AI); David Held (); Deva Ramanan (Carnegie Mellon University)
1277 Learning to Drive by Watching YouTube Videos: Action-Conditioned Contrastive Policy Pretraining Qihang Zhang (Chinese University of Hong Kong); Zhenghao Peng (Chinese University of Hong Kong); Bolei Zhou (UCLA)*
1330 Making Heads or Tails: Towards Semantically Consistent Visual Counterfactuals Simon Vandenhende (KU Leuven)*; Dhruv Mahajan (Facebook); Filip Radenovic (Facebook AI); Deepti Ghadiyaram (Facebook)
1358 Towards Generic 3D Tracking in RGBD Videos: Benchmark and Baseline Jinyu Yang (Southern University of Science and Technology)*; Zhongqun Zhang (University of Birmingham); Zhe LI (SUSTech); Hyung Jin Chang (University of Birmingham); Ales Leonardis (University of Birmingham); Feng Zheng (SUSTech)
1417 ViewFormer: NeRF-free Neural Rendering from Few Images Using Transformers Jonáš Kulhánek (Czech Technical University in Prague)*; Erik Derner (CTU CIIRC); Torsten Sattler (Czech Technical University in Prague); Robert Babuska (TU Delft)
1468 Relationformer: A Unified Framework for Image-to-Graph Generation Suprosanna Shit (TUM)*; Rajat Koner (Ludwig Maximilian University of Munich); Bastian Wittmann (Technical University of Munich); Johannes C. Paetzold (TUM); Ivan Ezhov (TUM); Hongwei Li (Technical University of Munich); Jiazhen Pan (Technical University of Munich); Sahand Sharifzadeh (Ludwig Maximilian University of Munich); Georgios Kaissis (Technische Universität München); Volker Tresp (LMU); Bjoern Menze (TUM)
1506 Not Just Streaks: Towards Ground Truth for Single Image Deraining Yunhao Ba (UCLA)*; Howard Zhang (UCLA); Ethan Yang (UCLA); Akira Suzuki (UCLA); Arnold J Pfahnl (University of California, Los Angeles); Chethan Chinder Chandrappa (University of California – Los Angeles); Celso de Melo (Army Research Laboratory); Suya You (US Army Research Laboratory); Stefano Soatto (UCLA); Alex Wong (Yale University); Achuta Kadambi (UCLA)
1534 Self-supervised Human Mesh Recovery with Cross-Representation Alignment Xuan Gong (University at Buffalo); Meng Zheng (United Imaging Intelligence); Benjamin Planche (United Imaging Intelligence); Srikrishna Karanam (Adobe Research); Terrence Chen (United Imaging Intelligence); David Doermann (University at Buffalo); Ziyan Wu (United Imaging Intelligence)*
1761 Neural Density-Distance Fields Itsuki UEDA (University of Tsukuba)*; Yoshihiro Fukuhara (Waseda University); Hirokatsu Kataoka (National Institute of Advanced Industrial Science and Technology (AIST)); Hiroaki Aizawa (Hiroshima University); Hidehiko Shishido (University of Tsukuba); Itaru Kitahara (University of Tsukuba)
1947 BungeeNeRF: Progressive Neural Radiance Field for Extreme Multiscale Scene Rendering Yuanbo Xiangli (Chinese University of Hong Kong)*; Linning Xu (CUHK); Xingang Pan (Max Planck Institute for Informatics); Nanxuan Zhao (University of Bath); Anyi Rao (The Chinese University of Hong Kong); Christian Theobalt (MPI Informatik); Bo Dai (Shanghai AI Lab); Dahua Lin (The Chinese University of Hong Kong)
2039 3D-Aware Semantic-Guided Generative Model for Human Synthesis Jichao Zhang (University of Trento)*; Enver Sangineto (University of Modena and Reggio Emilia); Hao Tang (ETH Zurich); Aliaksandr Siarohin (Snapchat); Zhun Zhong (University of Trento); Nicu Sebe (University of Trento); Wei Wang (EPFL)
2152 Fine-grained Egocentric Hand-Object Segmentation: Dataset, Model, and Applications Lingzhi Zhang (University of Pennsylvania)*; Shenghao Zhou (University of Pennsylvania); Simon Stent (Toyota Research Institute); Jianbo Shi (University of Pennsylvania)
2311 SESS: Saliency Enhancing with Scaling and Sliding Osman Tursun (Queensland University of Technology)*; SIMON DENMAN (Queensland University of Technology, Australia); Sridha Sridharan (QUT); Clinton Fookes (Queensland University of Technology)
2557 Detecting Twenty-thousand Classes using Image-level Supervision Xingyi Zhou (The University of Texas at Austin)*; Rohit Girdhar (Facebook AI Research); Armand Joulin (Facebook AI Research); Philipp Kraehenbuehl (UT Austin); Ishan Misra (Facebook AI Research)
2601 Max Pooling with Vision Transformers reconciles class and shape in weakly supervised semantic segmentation Simone Rossetti (Sapienza University); Damiano Zappia (Deepplants S.r.l.); Marta Sanzari (Sapienza University of Rome); Marco Schaerf (Sapienza University of Rome); fiora pirri (University of Rome, Sapienza)*
2609 Learning Semantic Correspondence with Sparse Annotations Shuaiyi Huang (University of Maryland, College Park)*; Luyu Yang (University of Maryland, College Park); Bo He (University of Maryland); Songyang Zhang (Shanghai AI Laboratory); Xuming He (ShanghaiTech University); Abhinav Shrivastava (University of Maryland)
2624 Context-Aware Streaming Perception in Dynamic Environments Gur-Eyal Sela (UC Berkeley)*; Ionel Gog (UC Berkeley); Justin Wong (UC Berkeley); Kumar Krishna Agrawal (UC Berkeley); Xiangxi Mo (UC Berkeley); Sukrit Kalra (UC Berkeley); Peter Schafhalter (UC Berkeley); Eric Leong (UC Berkeley); Xin Wang (Microsoft Research); Bharathan Balaji (Amazon); Joseph E Gonzalez (UC Berkeley); Ion Stoica (UC Berkeley)
2902 UNIF: United Neural Implicit Functions for Clothed Human Reconstruction and Animation Shenhan Qian (ShanghaiTech University)*; Jiale Xu (ShanghaiTech University); Ziwei Liu (Nanyang Technological University); Liqian Ma (ZMO AI); Shenghua Gao (Shanghaitech University)
3080 Doubly Deformable Aggregation of Covariance Matrices for Few-shot Segmentation Zhitong Xiong (Technical University of Munich)*; Haopeng Li (The University of Melbourne); Xiaoxiang Zhu (Technical University of Munich (TUM); German Aerospace Center (DLR))
3093 MemSAC: Memory Augmented Sample Consistency for Large Scale Domain Adaptation Tarun Kalluri (UC San Diego)*; Astuti Sharma (UCSD); Manmohan Chandraker (UC San Diego)
3247 Efficient One-stage Video Object Detection by Exploiting Temporal Consistency Guanxiong Sun (Queen’s University Belfast); Yang Hua (Queen’s University Belfast)*; Guosheng Hu (Oosto); Neil Robertson (Queen’s University Belfast)
3660 Domain Generalization by Mutual-Information Regularization with Pre-trained Models Junbum Cha (Kakaobrain)*; Kyungjae Lee (Chung-Ang University); Sungrae Park (Upstage AI Research, Upstage AI); Sanghyuk Chun (NAVER AI Lab)
3681 A Closer Look at Invariances in Self-supervised Pre-training for 3D Vision Lanxiao Li (Karlsruher Institut fuer Technologie)*; Michael Heizmann (Karlsruher Institut fuer Technologie)
3686 SNeS: Learning Probably Symmetric Neural Surfaces from Incomplete Data Eldar Insafutdinov (University of Oxford); Dylan Campbell (University of Oxford)*; Joao F Henriques (University of Oxford); Andrea Vedaldi (Oxford University)
3703 HDR-Plenoxels: Self-Calibrating High Dynamic Range Radiance Fields Kim Jun-Seong (POSTECH)*; Kim Yu-Ji (POSTECH); Moon Ye-Bin (POSTECH); Tae-Hyun Oh (POSTECH)
3721 Abstracting Sketches through Simple Primitives Stephan Alaniz (University of Tübingen)*; Massimiliano Mancini (University of Tübingen); Anjan Dutta (University of Surrey); Diego Marcos (Wageningen University); Zeynep Akata (University of Tübingen)
4105 Polarimetric Pose Prediction Daoyi Gao (Technical University of Munich)*; Yitong Li (Technical University of Munich); Patrick Ruhkamp (Technical University of Munich); Iuliia Skobleva (Technical University of Munich); Magdalena Wysocki (Technical University of Munich); HyunJun Jung (Technical University of Munich); Pengyuan Wang (TUM); Arturo Guridi (Technical University of Munich); Benjamin Busam (Technical University of Munich)
4117 A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge Dustin Schwenk (Allen Institute for Artificial Intelligence); Apoorv Khandelwal (Allen Institute for AI); Christopher A Clark (Allen Institute for AI); Kenneth Marino (CMU); Roozbeh Mottaghi (Allen Institute for AI)*
4208 GOCA: Guided Online Cluster Assignment for Self Supervised Video Representation Learning HUSEYIN COSKUN (Technical University of Munich)*; Alireza Zareian (Snap Inc.); Joshua L Moore (Snapchat); Federico Tombari (Google, TU Munich); Chen Wang (Snap Inc.)
4303 SLIP: Self-supervision meets Language-Image Pre-training Norman Mu (University of California, Berkeley)*; Alexander Kirillov (Facebook AI Research); David Wagner (UC Berkeley); Saining Xie (Facebook AI Research)
4346 PillarNet: Real-Time and High-Performance Pillar-based 3D Object Detection Guangsheng Shi (Harbin Institute of Technology)*; Ruifeng Li (Harbin Institute of Technology); Chao Ma (Shanghai Jiao Tong University)
4588 PointMixer: MLP-Mixer for Point Cloud Understanding Jaesung Choe (KAIST)*; Chunghyun Park (POSTECH); Francois Rameau (KAIST); Jaesik Park (POSTECH); In So Kweon (KAIST)
4668 TIDEE: Tidying Up Novel Rooms using Visuo-Semantic Commonsense Priors Gabriel Sarch (Carnegie Mellon University)*; Zhaoyuan Fang (Carnegie Mellon University); Adam Harley (Carnegie Mellon University); Paul Schydlo (Carnegie Mellon University); Michael J Tarr (Carnegie Mellon University); Saurabh Gupta (UIUC); Katerina Fragkiadaki (Carnegie Mellon University)
4761 Motion and Appearance Adaptation for Cross-Domain Motion Transfer Borun Xu (University of Electronic Science and Technology of China)*; Biao Wang (Alibaba Group); Jinhong Deng (University of Electronic Science and Technology of China); Jiale Tao (University of Electronic Science and Technology of China); Tiezheng Ge (Alibaba Group); Yuning Jiang (Alibaba Group); Wen Li (University of Electronic Science and Technology of China); Lixin Duan (University of Electronic Science and Technology of China)
4807 Perspective Flow Aggregation for Data-Limited 6D Object Pose Estimation Yinlin Hu (EPFL)*; Pascal Fua (EPFL, Switzerland); Mathieu Salzmann (EPFL)
5051 Don’t Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context Chongyu Liu (South China University of Technology); Lianwen Jin (South China University of Technology)*; Yuliang Liu (Huazhong University of Science and Technology); Canjie Luo (South China University of Technology); Bangdong Chen (South China University of Technology); Fengjun Guo (IntSig Information Co. Ltd); Kai Ding (IntSig Information Co., Ltd)
5092 ChunkyGAN: Real Image Inversion via Segments Adéla Šubrtová (Czech Technical University); David Futschik (Czech Technical University in Prague, FEE); Jan Čech (Czech Technical University in Prague); Michal Lukáč (Adobe Research); Eli Shechtman (Adobe Research, US); Daniel Sýkora (Czech Technical University in Prague)*
5129 3D-PL: Domain Adaptive Depth Estimation with 3D-aware Pseudo-Labeling Yu-Ting Yen (National Chiao Tung University, Phiar Technologies)*; Chia-Ni Lu (National Chiao Tung University); Wei-Chen Chiu (National Chiao Tung University); Yi-Hsuan Tsai (Google)
5158 Camera Pose Auto-Encoders for Improving Pose Regression Yoli Shavit (Faculty of Engineering, Bar Ilan University); Yosi Keller (Bar Ilan University)*
5175 AU-aware 3D Face Reconstruction through Personalized AU-specific Blendshape Learning Chenyi Kuang (Rensselaer Polytechnic Institute)*; Zijun Cui (Rensselaer Polytechnic Institute); Jeffrey Kephart (IBM Research, USA); Qiang Ji (Rensselaer Polytechnic Institute)
5217 FindIt: Generalized Localization with Natural Language Queries Weicheng Kuo (Google)*; Fred Bertsch (Google); Wei Li (GOOGLE INC); AJ Piergiovanni (Google); Mohammad Saffar (Google); Anelia Angelova (Google)
5235 Action-based Contrastive Learning for Trajectory Prediction Marah Halawa (Technische Universität Berlin)*; Olaf Hellwich (Technical University Berlin); Pia Bideau (TU Berlin)
5240 Scaling Open-vocabulary Image Segmentation with Image-level Labels Golnaz Ghiasi (Google Brain)*; Xiuye Gu (Google); Yin Cui (Google); Tsung-Yi Lin (Nvidia Research)
5299 Anti-Neuron Watermarking: Protecting Personal Data Against Unauthorized Neural Networks Zihang Zou (University of Central Florida)*; Boqing Gong (Google); Liqiang Wang (University of Central Florida)
5418 Adaptive Transformers for Robust Few-shot Cross-domain Face Anti-spoofing Hsin-Ping Huang (University of California, Merced)*; Deqing Sun (Google); Yaojie Liu (Google); Wen-Sheng Chu (Google); Taihong Xiao (University of California at Merced); Jinwei Yuan (Google); Hartwig Adam (Google); Ming-Hsuan Yang (University of California at Merced)
5481 Spatial-Separated Curve Rendering Network for Efficient and High-Resolution Image Harmonization Jingtang Liang (University of Macau)*; Xiaodong Cun (Tencent AI Lab); Chi-Man Pun (University of Macau); Jue Wang (Tencent AI Lab)
5544 An Efficient Person Clustering Algorithm for Open Checkout-free Groceries Junde Morsen Wu (Baidu); Yu Zhang (Harbin Institute of Technology); RAO FU (None); Yuanpei Liu (Beijing Institute of Technology); Jing Gao (Purdue University)*
5559 TDViT: Temporal Dilated Transformer for Dense Video Tasks Guanxiong Sun (Queen’s University Belfast); Yang Hua (Queen’s University Belfast)*; Guosheng Hu (Oosto); Neil Robertson (Queen’s University Belfast)
5721 SimpleRecon: 3D Reconstruction Without 3D Convolutions Mohamed Sayed (University College London)*; John Gibson (Niantic, Inc.); Jamie Watson (Niantic); Victor A Prisacariu (Niantic Labs); Michael Firman (Niantic); Clement LJC Godard (Niantic)
5829 3D Siamese Transformer Network for Single Object Tracking on Point Clouds Le Hui (Nanjing University of Science and Technology)*; Lingpeng Wang (Nanjing University of Science and Technology); Linghua Tang (Nanjing University of Science and Technology); Kaihao Lan (Nanjing University of Science and Technology); Jin Xie (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology)
5937 CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution Cheeun Hong (Seoul National University); Sungyong Baik (Hanyang University); Heewon Kim (Seoul National University); Seungjun Nah (NVIDIA); Kyoung Mu Lee (Seoul National University)*
5972 General Object Pose Transformation Network from Unpaired Data Yukun Su (South China University of Technology)*; Guosheng Lin (Nanyang Technological University); RuiZhou Sun (South China University of Technology); Qingyao Wu (South China University of Technology)
6044 RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation Haodi He (University of Science and Technology of China); Yuhui Yuan (Microsoft Research)*; Xiangyu Yue (University of California, Berkeley); Han Hu (Microsoft Research Asia)
6090 A Kendall Shape Space Approach to 3D Shape Estimation from 2D Landmarks Martha Paskin (Zuse Institute Berlin); Daniel Baum (Zuse Institute Berlin); Mason N Dean (City University of Hong Kong); Christoph von Tycowicz (Zuse Institute Berlin)*
6097 A study of Pre-training strategies and datasets for facial representation learning Adrian Bulat (Samsung AI Center, Cambridge)*; Shiyang Cheng (Samsung); Jing Yang (University of Nottingham); Andrew Garbett (Samsung AI Center); Enrique Sanchez (Samsung AI Centre); Georgios Tzimiropoulos (Queen Mary University of London)
6112 Neural Strands: Learning Hair Geometry and Appearance from Multi-View Images Radu Alexandru Rosu (University of Bonn); Shunsuke Saito (Facebook); Ziyan Wang (Carnegie Mellon University); Chenglei Wu (Facebook Reality Labs); Sven Behnke (University of Bonn); Giljoo Nam (Facebook Inc.)*
6144 Improving Few-Shot Learning through Multi-task Representation Learning Theory Quentin Bouniot (CEA, LIST)*; Ievgen Redko (Aalto University); Romaric Audigier (CEA LIST); Angélique Loesch (CEA LIST); Amaury Habrard (University of St-Etienne, Lab. H. Curien)
6148 Long-tailed Instance Segmentation using Gumbel Optimized Loss Konstantinos P Alexandridis (University of Liverpool)*; Jiankang Deng (Imperial College London); Anh Nguyen (University of Liverpool); Shan Luo (University of Liverpool)
6177 3D Scene Inference from Transient Histograms Sacha Jungerman (University of Wisconsin-Madison)*; Atul N Ingle (University of Wisconsin-Madison); Yin Li (University of Wisconsin-Madison); Mohit Gupta (University of Wisconsin-Madison, USA)
6207 Network Binarization via Contrastive Learning Yuzhang Shang (Illinois Institute of Technology)*; Dan Xu (The Hong Kong University of Science and Technology); Ziliang Zong (Texas State University); Liqiang Nie (Harbin Institute of Technology (Shenzhen)); Yan Yan (Illinois Institute of Technology)
6212 Is Geometry Enough for Matching in Visual Localization? Qunjie Zhou (Technical University of Munich)*; Sérgio Agostinho (Institute for Systems and Robotics, Instituto Superior Técnico, Universidade de Lisboa); Aljosa Osep (TUM Munich); Laura Leal-Taixé (TUM)
6220 Transformers as Meta-Learners for Implicit Neural Representations Yinbo Chen (UC San Diego)*; Xiaolong Wang (UCSD)
6228 3D Face Reconstruction with Dense Landmarks Erroll Wood (Microsoft)*; Tadas Baltrusaitis (Microsoft); Charlie Hewitt (Microsoft); Matthew A Johnson (Microsoft); Jingjing Shen (Microsoft); Nikola Milosavljevic (Microsoft); Daniel S Wilde (Microsoft); Stephan J Garbin (University College London); Toby Sharp (Microsoft); Ivan Stojiljkovic (Microsoft); Tom Cashman (Microsoft); Julien Valentin (Microsoft)
6287 Non-Uniform Step Size Quantization for Accurate Post-Training Quantization Sangyun Oh (UNIST)*; Hyeonuk Sim (UNIST); Jounghyun Kim (UNIST); Jongeun Lee (UNIST)
6328 GLAMD: Global and Local Attention Mask Distillation for Object Detectors YounHo Jang (Kyung Hee University); Wheemyung Shin (Kyung Hee University); Jinbeom Kim (Sungkyunkwan University (SKKU)); Sung-Ho Bae (Kyung Hee University)*; Simon S Woo (Sungkyunkwan University (SKKU))
6493 Language-Grounded Indoor 3D Semantic Segmentation in the Wild Dávid Rozenberszki (Technische Universität München)*; Or Litany (Stanford); Angela Dai (Technical University of Munich)
6538 You Already Have It: A Generator-Free Low-Precision DNN Training Framework using Stochastic Rounding Geng Yuan (Northeastern University)*; Sung-En Chang (Northeastern University); Qing Jin (Northeastern University); Alec Lu (Simon Fraser University); Yanyu Li (Northeastern University); Yushu Wu (Northeastern University); Zhenglun Kong (Northeastern University); Yanyue Xie (Northeastern University); Peiyan Dong (Northeastern University); Minghai Qin (Western Digital Research); Xiaolong Ma (Clemson University); Xulong Tang (University of Pittsburgh); Zhenman Fang (Simon Fraser University); Yanzhi Wang (Northeastern University)
6660 Entry-Flipped Transformer for Inference and Prediction of Participant Behavior BO HU (Nanyang Technological University)*; Tat-Jen Cham (Nanyang Technological University)
6694 CAViT: Contextual Alignment Vision Transformer for Video Object Re-identification Jinlin Wu (Institute of Automation, Chinese Academy of Sciences, Beijing, China)*; He Lingxiao (NLPR, CRIPAC); Wu Liu (AI Research of JD.com); Yang Yang (Institute of Automation, Chinese Academy of Sciences); Zhen Lei (NLPR, CASIA, China); Tao Mei (AI Research of JD.com); Stan Z. Li (Westlake University)
6722 DoodleFormer: Creative Sketch Drawing with Transformers Ankan Kumar Bhunia (MBZUAI)*; Salman Khan (MBZUAI/ANU); Hisham Cholakkal (MBZUAI); Rao Muhammad Anwer (MBZUAI/AALTO); Fahad Shahbaz Khan (MBZUAI); Jorma Laaksonen (Aalto University); Michael Felsberg (Linköping University)
6940 Compositional Visual Generation with Composable Diffusion Models Nan Liu (University of Illinois at Urbana-Champaign); Shuang Li (MIT); Yilun Du (MIT)*; Antonio Torralba (MIT); Joshua Tenenbaum (MIT)
6963 3D Shape Sequence of Human Comparison and Classification using Current and Varifolds Emery Pierson (Université de Lille)*; Mohamed Daoudi (IMT Nord Europe); Sylvain Arguillere (Institute Camille Jordan)
6990 FEAR: Fast, Efficient, Accurate and Robust Visual Tracker Vasyl Borsuk (Ukrainian Catholic University); Roman Vei (Ukrainian Catholic University); Orest Kupyn (Ukrainian Catholic University); Tetiana Martyniuk (Ukrainian Catholic University)*; Igor Krashenyi (Piñata Farms); Jiri Matas (CMP CTU FEE)
7139 Learning Phase Mask for Privacy-Preserving Passive Depth Estimation Zaid Tasneem (Rice University); Giovanni Milione (4 independence Way, Princeton, NJ 08540); Yi-Hsuan Tsai (Google); Xiang Yu (NEC Labs); Ashok Veeraraghavan (Rice University); Manmohan Chandraker (UC San Diego); Francesco Pittaluga (NEC Laboratories America)*
7214 Break and Make: Interactive Structural Understanding Using LEGO Bricks Aaron T Walsman (University of Washington)*; Muru Zhang (University of Washington); Klemen Kotar (Allen Institute for AI); Karthik Desingh (University of Washington); Dieter Fox (NVIDIA Research / University of Washington); Ali Farhadi (University of Washington, Allen Institute for AI, Apple)
7281 Theoretical Understanding of the Information Flow on Continual Learning Performance Joshua J Andle (University of Maine); Salimeh Yasaei Sekeh (University of Maine)*
7288 Pure Transformer with Integrated Experts for Scene Text Recognition Yew Lee Tan (Nanyang Technological University)*; Wai-Kin Adams Kong (Nanyang Technological University); Jung Jae Kim (I2R)
7337 Learning Regional Purity for Instance Segmentation on 3D Point Clouds Shichao Dong (Nanyang Technological University)*; Guosheng Lin (Nanyang Technological University); Tzu-Yi HUNG (Delta Research Center)
7441 A Simple Single-Scale Vision Transformer for Object Detection and Instance Segmentation Wuyang Chen (University of Texas at Austin)*; Xianzhi Du (Google Brain); Fan Yang (Google); Lucas Beyer (Google Brain); Xiaohua Zhai (Google Brain); Tsung-Yi Lin (Google Brain); Huizhong Chen (Google); Jing Li (Google Brain); Xiaodan Song (Google Brain); Zhangyang Wang (University of Texas at Austin); Denny Zhou (Google Brain)
7543 Generating Natural Images with Direct Patch Distributions Matching Ariel Elnekave (Hebrew University of Jerusalem)*; Yair Weiss (Hebrew University)
7561 TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments Shubham Dokania (IIIT Hyderabad)*; Anbumani Subramanian (IIIT-Hyderabad); Manmohan Chandraker (UC San Diego); C.V. Jawahar (IIIT-Hyderabad)
7592 RDO-Q: Extremely Fine-Grained Channel-Wise Quantization via Rate-Distortion Optimization Zhe Wang (Institute for Infocomm Research, Singapore)*; Jie Lin (Institute for Infocomm Research (I2R), Singapore); Xue Geng (I2R, A*STAR); Mohamed M. Sabry Aly (Nanyang Technological University); Vijay R. Chandrasekhar (Institute for Infocomm Research)
7626 Understanding Collapse in Non-Contrastive Siamese Representation Learning Alexander C Li (Carnegie Mellon University)*; Alexei A Efros (UC Berkeley); Deepak Pathak (Carnegie Mellon University)
7651 SPE-Net: Boosting Point Cloud Analysis via Rotation Robustness Enhancement Zhaofan Qiu (JD.com); Yehao Li (JD AI Research); Yu Wang (JD AI Research); Yingwei Pan (JD AI Research); Ting Yao (JD AI Research)*; Tao Mei (AI Research of JD.com)
7771 Trading Positional Complexity vs Deepness in Coordinate Networks Jianqiao Zheng (University of Adelaide)*; Sameera Ramasinghe (University of Adelaide); Xueqian Li (Carnegie Mellon University); Simon Lucey (University of Adelaide)
7802 U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture Search Ahmet Yüzügüler (EPFL)*; Nikolaos Dimitriadis (EPFL); Pascal Frossard (EPFL)
7815 Trapped in texture bias? A large scale comparison of deep instance segmentation Johannes Theodoridis (Hochschule der Medien Stuttgart)*; Jessica Hofmann (Hochschule der Medien); Johannes Maucher (Media University Stuttgart); Andreas G Schilling (University of Tübingen)
8009 StoryDALL-E: Adapting Pretrained Text-to-image Transformers for Story Continuation Adyasha Maharana (UNC Chapel Hill)*; Darryl Hannan (University of North Carolina at Chapel Hill); Mohit Bansal (University of North Carolina at Chapel Hill)
205 Contrast-Phys: Unsupervised Video-based Remote Physiological Measurement via Spatiotemporal Contrast Zhaodong Sun (University of Oulu)*; Xiaobai Li (University of Oulu)
634 Object-Compositional Neural Implicit Surfaces Qianyi Wu (Monash University)*; Xian Liu (The Chinese University of Hong Kong); Yuedong Chen (Monash University); Kejie Li (University of Oxford); Chuanxia Zheng (Monash University); Jianfei Cai (Monash University); Jianmin Zheng (Nanyang Technological University)
636 Sem2NeRF: Converting Single-View Semantic Masks to Neural Radiance Fields Yuedong Chen (Monash University)*; Qianyi Wu (Monash University); Chuanxia Zheng (Monash University); Tat-Jen Cham (Nanyang Technological University); Jianfei Cai (Monash University)
721 Burn After Reading: Online Adaptation for Cross-domain Streaming Data Luyu Yang (University of Maryland, College Park)*; Mingfei Gao (Apple); Zeyuan Chen (Salesforce Research); Ran Xu (Salesforce Research); Abhinav Shrivastava (University of Maryland); Chetan Ramaiah (Salesforce Research)
982 Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression Yeying Jin (National University of Singapore)*; Wenhan Yang (NTU); Robby T. Tan (National University of Singapore)
1010 Zero-Shot Temporal Action Detection via Vision-Language Prompting Sauradip Nag (University of Surrey)*; Xiatian Zhu (University of Surrey); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey)
1028 Automatic dense annotation of large-vocabulary sign language videos Liliane Momeni (University of Oxford)*; Hannah Bull (LIMSI (CNRS)); Prajwal K R (VGG, Oxford); Samuel Albanie (University of Cambridge); Gul Varol (Ecole des Ponts ParisTech); Andrew Zisserman (University of Oxford)
1557 Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing Benedikt Boecking (Carnegie Mellon University); Naoto Usuyama (Microsoft Research); Shruthi J Bannur (Microsoft Research); Daniel Coelho de Castro (Microsoft Research); Anton Schwaighofer (Microsoft Research); Stephanie Hyland (Microsoft Research); Maria Teodora A Wetscherek (Microsoft); Tristan Naumann (Microsoft Research Redmond, US); Aditya Nori (Microsoft Research); Javier Alvarez-Valle (Microsoft Research); Hoifung Poon (Microsoft Research); Ozan Oktay (Microsoft Research)*
1959 MOTCOM: The Multi-Object Tracking Dataset Complexity Metric Malte Pedersen (Aalborg University)*; Joakim Bruslund Haurum (Aalborg University); Patrick Dendorfer (TUM); Thomas B. Moeslund (Aalborg University)
1986 Feature Representation Learning for Unsupervised Cross-domain Image Retrieval Conghui Hu (National University of Singapore)*; Gim Hee Lee (National University of Singapore)
2862 A Broad Study of Pre-training for Domain Generalization and Adaptation Donghyun Kim (MIT-IBM Watson AI Lab)*; Kaihong Wang (Boston University); Stan Sclaroff (Boston University); Kate Saenko (Boston University)
2864 LGV: Boosting Adversarial Example Transferability from Large Geometric Vicinity Martin Gubri (University of Luxembourg)*; Maxime Cordy (University of Luxembourg); Mike Papadakis (University of Luxembourg); Yves Le Traon (University of Luxembourg); Koushik Sen (University of California, Berkeley)
2901 Improved Masked Image Generation with Token-Critic Jose Lezama (Google Research)*; Huiwen Chang (Google); Lu Jiang (Google Research); Irfan Essa (Google)
3151 Semantic-guided Multi-Mask Image Harmonization Xuqian Ren (Watrix Technology); Yifan Liu (University of Adelaide)*
3214 Object-Centric Unsupervised Image Captioning Zihang Meng (University of Wisconsin Madison)*; David Yang (Facebook); Xuefei Cao (Facebook); Ashish Shah (Facebook AI); Ser-Nam Lim (Meta AI)
3751 Event Neural Networks Matthew Dutson (University of Wisconsin-Madison)*; Yin Li (University of Wisconsin-Madison); Mohit Gupta (University of Wisconsin-Madison, USA)
3991 A Non-isotropic Probabilistic Take on Proxy-based Deep Metric Learning Michael Kirchhof (University of Tübingen)*; Karsten Roth (University of Tuebingen); Zeynep Akata (University of Tübingen); Enkelejda Kasneci (University of Tuebingen)
4278 Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation Chuang Lin (Monash University)*; Yi Jiang (Bytedance); Jianfei Cai (Monash University); Lizhen Qu (Monash University); Reza Haffari (Monash University, Australia); Zehuan Yuan (Bytedance.Inc)
4481 Dual Adaptive Transformations for Weakly Supervised Point Cloud Segmentation Zhonghua Wu (Nanyang Technological University)*; Yicheng Wu (Monash University); Guosheng Lin (Nanyang Technological University); Jianfei Cai (Monash University); Chen Qian (SenseTime)
4902 Learning Discriminative Shrinkage Deep Networks for Image Deconvolution Pin-Hung Kuo (National Taiwan University)*; Jinshan Pan (Nanjing University of Science and Technology); Shao-Yi Chien (National Taiwan University); Ming-Hsuan Yang (University of California at Merced)
5648 MeshLoc: Mesh-Based Visual Localization Vojtech Panek (CTU in Prague, FEE, CIIRC)*; Zuzana Kukelova (Czech Technical University in Prague); Torsten Sattler (Czech Technical University in Prague)
6002 S2F2: Single-Stage Flow Forecasting for Future Multiple Trajectories Prediction YU-WEN CHEN (National Tsing Hua University); Hsuan-Kung Yang (National Tsing Hua University); Chu-Chi Chiu (National Tsing Hua University); Chun-Yi Lee (National Tsing Hua University)*
6250 Exposure-Aware Dynamic Weighted Learning for Single-Shot HDR Imaging An Gia Vien (Dongguk University); Chul Lee (Dongguk University)*
6403 High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions SangYun Lee (Soongsil University); Gyojung Gu (Korea Advanced Institute of Science and Technology)*; Sunghyun Park (KAIST); Seunghwan Choi (Korea Advanced Institute of Science and Technology); Jaegul Choo (Korea Advanced Institute of Science and Technology)
6526 PoseScript: 3D Human Poses from Natural Language Ginger Delmas (NAVER LABS EUROPE)*; Philippe Weinzaepfel (NAVER LABS Europe); Thomas LUCAS (Naver); Francesc Moreno (IRI); Gregory Rogez (NAVER LABS Europe)
6917 Generator Knows What Discriminator Should Learn in Unconditional GANs Gayoung Lee (NAVER AI Lab)*; Hyunsu Kim (NAVER AI Lab); Junho Kim (NAVER AI Lab); Seonghyeon Kim (Clova AI Research, NAVER Corp.); Jung-Woo Ha (NAVER CLOVA AI Lab); Yunjey Choi (NAVER AI Lab)
7176 incDFM: Incremental Deep Feature Modeling for Continual Novelty Detection Amanda S Rios (University of Southern California; Intel)*; Nilesh A Ahuja (Intel); Ibrahima Ndiour (Intel); Ergin U Genc (Intel); Laurent Itti (University of Southern California); Omesh Tickoo (Intel)
7301 AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation Efthymios Tzinis (University of Illinois at Urbana-Champaign); Scott Wisdom (Google)*; Tal Remez (Google); John Hershey (Google)
7476 DeepShadow: Neural Shape from Shadow Asaf Karnieli (Reichman University)*; Yacov Hel-Or (The Interdisciplinary Center); Ohad Fried (IDC Herzliya)
7529 Simple Open-Vocabulary Object Detection with Vision Transformers Matthias Minderer (Google Research)*; Alexey Gritsenko (Google Brain); Austin C Stone (Google); Maxim Neumann (Google); Dirk Weißenborn (German Research Center for Artificial Intelligence); Alexey Dosovitskiy (Inceptive); Aravindh Mahendran (Google); Anurag Arnab (Google); Mostafa Dehghani (Google Brain); Zhuoran Shen (Pony.ai); Xiao Wang (Google); Xiaohua Zhai (Google Brain); Thomas Kipf (Google Brain); Neil Houlsby (Google)
1407 Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes Julian Chibane (Max Planck Institute for Informatics, University of Wuerzburg)*; Francis Engelmann (ETH AI Center); Anh Tuan Tran (Max Planck Institute for Informatics, Saarland University); Gerard Pons-Moll (University of Tübingen)
2290 Generalizable Patch-Based Neural Rendering Mohammed Suhail (University of British Columbia)*; Carlos Esteves (Google Research); Leonid Sigal (University of British Columbia); Ameesh Makadia (Google Research)
5096 Solution Space Analysis of Essential Matrix based on Algebraic Error Minimization Gaku Nakano (NEC Corporation)*
5285 Approximate Differentiable Rendering with Algebraic Surfaces Leonid Keselman (Carnegie Mellon University)*; Martial Hebert (Carnegie Mellon School of Computer Science)
6571 Beyond Periodicity: Towards a Unifying Framework for Activations in Coordinate-MLPs Sameera Ramasinghe (University of Adelaide)*; Simon Lucey (University of Adelaide)
7838 Gaussian Activated Neural Radiance Fields for High Fidelity Reconstruction & Pose Estimation Shin-Fang Chng (The University of Adelaide)*; Sameera Ramasinghe (University of Adelaide); Jamie Sherrah (AIML); Simon Lucey (University of Adelaide)
7886 Unbiased Gradient Estimation for Differentiable Surface Splatting via Poisson Sampling Jan U. Müller (University of Bonn)*; Michael Weinmann (TU Delft); Reinhard Klein (University of Bonn)
2808 OccamNets: Mitigating Dataset Bias by Favoring Simpler Hypotheses Robik S Shrestha (Rochester Institute of Technology)*; Kushal Kafle (Adobe Research); Christopher Kanan (University of Rochester)
3239 Event-Based Fusion for Motion Deblurring with Cross-modal Attention Lei Sun (Zhejiang University); Christos Sakaridis (ETH Zurich); Jingyun Liang (ETH Zurich); Qi Jiang (Zhejiang University); Kailun Yang (Karlsruhe Institute of Technology); Peng Sun (Zhejiang University); Yaozu Ye (State Key Laboratory of Modern Optical Instrumentation, Zhejiang University); Kaiwei Wang (State Key Laboratory of Modern Optical Instrumentation, Zhejiang University)*; Luc Van Gool (ETH Zurich)
3631 3D CoMPaT: Composition of Materials on Parts of 3D Things Yuchen Li (King Abdullah University of Science and Technology (KAUST)); Ujjwal Upadhyay (KAUST); Habib Slim (KAUST); Tezuesh Varshney (KAUST); Ahmed Abdelreheem (KAUST); Arpit Prajapati (Poly9); Suhail S Pothigara (Poly9 Inc); Peter Wonka (KAUST); Mohamed Elhoseiny (KAUST)*
4514 ROBIN: A Benchmark for Robustness to Individual Nuisances in Real-World Out-of-Distribution Shifts Bingchen Zhao (University of Edinburgh)*; Shaozuo Yu (Tongji University); Wufei Ma (Purdue University); Mingxin Yu (Peking University); Shenxiao Mei (Johns Hopkins University); Angtian Wang (Johns Hopkins University); Ju He (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Adam Kortylewski (Max Planck Institute for Informatics)
5263 The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning Jack Hessel (Allen Institute for AI)*; Jena D Hwang (Allen Institute for AI); Jae Sung Park (University of Washington); Rowan Zellers (University of Washington); Chandra Bhagavatula (AllenAI); Anna Rohrbach (UC Berkeley); Kate Saenko (Boston University); Yejin Choi (University of Washington)
6185 Look Both Ways: Self-Supervising Driver Gaze Estimation and Road Scene Saliency Isaac H Kasahara (University of Minnesota); Simon Stent (Toyota Research Institute); Hyun Soo Park (The University of Minnesota)*
6243 A Dense Material Segmentation Dataset for Indoor and Outdoor Scene Parsing Paul Upchurch (Apple)*; Ransen Niu (Apple)
8098 “This is my unicorn, Fluffy”: Personalizing frozen vision-language representations Niv Cohen (The Hebrew University of Jerusalem)*; Rinon Gal (Tel Aviv University); Eli Meirom (NVIDIA Research); Gal Chechik (NVIDIA); Yuval Atzmon (NVIDIA Research)
185 HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling Zhongang Cai (SenseTime International Pte Ltd)*; Daxuan Ren (Nanyang Technological University); Ailing Zeng (The Chinese University of Hong Kong); Zhengyu Lin (SenseTime); Tao Yu (Tsinghua University); Wenjia Wang (SenseTime); Xiangyu Fan (Sensetime); Yang Gao (Sensetime); Yifan Yu (ETH Zurich); Liang Pan (Nanyang Technological University); Fangzhou Hong (Nanyang Technological University); Mingyuan Zhang (Nanyang Technological University); Chen Change Loy (Nanyang Technological University); Lei Yang (Sensetime Group Limited); Ziwei Liu (Nanyang Technological University)

Orals 11 (Thu. pm)

Vision, Text, and non-Supervised Learning

1384 GLASS: Global to Local Attention for Scene-Text Spotting Roi Ronen (Technion)*; Shahar Tsiper (Amazon); Oron Anschel (AWS); Inbal Lavi (Amazon); Amir Markovitz (Amazon); R. Manmatha (Amazon)
1637 Pointly-Supervised Panoptic Segmentation Junsong Fan (Chinese Academy of Sciences, China)*; Zhaoxiang Zhang (Chinese Academy of Sciences, China); Tieniu Tan (NLPR, China)
1729 Registration based Few-Shot Anomaly Detection Chaoqin Huang (Shanghai Jiao Tong University)*; Haoyan Guan (King’s College London); Aofan Jiang (Shanghai Jiao Tong University); Ya Zhang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University); Michael W Spratling (King’s College London); Yan-Feng Wang (Cooperative Medianet Innovation Center of Shanghai Jiao Tong University)
4028 Decoupled Adversarial Contrastive Learning for Self-supervised Adversarial Robustness Chaoning Zhang (KAIST)*; Kang Zhang (KAIST); Chenshuang Zhang (KAIST); Axi Niu (Northwestern Polytechnical University); Jiu Feng (Sichuan University); Chang D. Yoo (KAIST); In So Kweon (KAIST)
7402 Towards Realistic Semi-Supervised Learning Mamshad Nayeem Rizve (University of Central Florida)*; Navid Kardan (University of Central Florida); Mubarak Shah (University of Central Florida)
1011 Weakly Supervised Grounding for VQA in Vision-Language Transformers Aisha Urooj (University of Central Florida)*; Hilde Kuehne (Goethe University Frankfurt); Chuang Gan (MIT-IBM Watson AI Lab); Niels da Vitoria Lobo (University of Central Florida); Mubarak Shah (University of Central Florida)

Orals 12 (Thu. pm)

Robots, Vehicles, and Computational Photography

1083 Practical and Scalable Desktop-based High-Quality Facial Capture Alexandros Lattas (Imperial College London)*; Yiming Lin (Imperial College); Jayanth Kannan (Lumirithmic); Ekin Ozturk (Imperial College London); Luca Filipi (Lumirithmic); Giuseppe Claudio Guarnera (University of York); Gaurav Chawla (Lumirithmic Limited); Abhijeet Ghosh (Imperial College London)
1396 Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-modal Distillation Antonin Vobecky (Czech Technical University in Prague)*; David Hurych (Valeo.ai); Oriane Siméoni (valeo.ai); Spyros Gidaris (valeo.ai); Andrei Bursuc (valeo.ai); Patrick Pérez (Valeo.ai); Josef Sivic (Czech Technical University)
2657 SpOT: Spatiotemporal Modeling for 3D Object Tracking Colton Stearns (Stanford University)*; Davis Rempe (Stanford University); Jie Li (Toyota Research Institute); Rareș A Ambruș (Toyota Research Institute); Sergey Zakharov (Toyota Research Institute); Vitor Guizilini (Toyota Research Institute); Yanchao Yang (Stanford University); Leonidas Guibas (Stanford University)
4122 Synthesizing Light Field Video from Monocular Video Shrisudhan Govindarajan (Indian Institute of Technology Madras); Prasan A Shedligeri (Indian Institute of Technology Madras)*; Sarah Sarah (Indian Institute of Technology, Madras); Kaushik Mitra (IIT Madras)
4350 LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds Minghua Liu (UCSD)*; Yin Zhou (Waymo); Charles R. Qi (Waymo); Boqing Gong (Google); Hao Su (UCSD); Dragomir Anguelov (Waymo)
5100 EvAC3D: From Event-based Apparent Contours to 3D Models via Continuous Visual Hulls Ziyun Wang (University of Pennsylvania)*; Kenneth Chaney (University of Pennsylvania); Kostas Daniilidis (University of Pennsylvania)
5303 Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments Jacob Krantz (Oregon State University)*; Stefan Lee (Oregon State University)
6295 Pixel-wise Energy-biased Abstention Learning for Anomaly Segmentation on Complex Urban Driving Scenes Yu Tian (Australian Institute for Machine Learning, University of Adelaide); Yuyuan Liu (University of Adelaide); Guansong Pang (Singapore Management University)*; Fengbei Liu (University of Adelaide); Yuanhong Chen (University of Adelaide); Gustavo Carneiro (University of Adelaide)
843 KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients Niklas Hanselmann (Mercedes-Benz AG)*; Katrin Renz (University of Tuebingen); Kashyap Chitta (MPI-IS and University of Tuebingen); Apratim Bhattacharyya (Max Planck Institute for Informatics); Andreas Geiger (University of Tuebingen)

Posters 6 (Thu. late)

4185 Rethinking Generic Camera Models for Deep Single Image Camera Calibration to Recover Rotation and Fisheye Distortion Nobuhiko Wakai (Panasonic Corporation)*; Satoshi Sato (Panasonic Corporation); Yasunori Ishii (Panasonic Holdings); Takayoshi Yamashita (Chubu University)
145 Saliency Hierarchy Modeling via Generative Kernels for Salient Object Detection Wenhu Zhang (Zhejiang University)*; Liangli Zheng (Zhejiang University); Huanyu Wang (Zhejiang University); Xintian Wu (Zhejiang University); Xi Li (Zhejiang University)
525 MVDECOR: Multi-view Dense Correspondence Learning for Fine-grained 3D Segmentation Gopal Sharma (University of Massachusetts Amherst)*; Kangxue Yin (NVIDIA); Subhransu Maji (University of Massachusetts, Amherst); Evangelos Kalogerakis (UMass Amherst); Or Litany (NVIDIA); Sanja Fidler (University of Toronto, NVIDIA)
584 Revisiting Point Cloud Simplification: A Learnable Feature Preserving Approach Rolandos Alexandros Potamias (Imperial College London)*; Giorgos Bouritsas (Imperial College London); Stefanos Zafeiriou (Imperial College London)
690 PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection Han Wang (Shanghai Jiao Tong University)*; Jun Tang (Hikvision); Xiaodong Liu (Hikvision); Shanyan Guan (Shanghai Jiao Tong University); Rong Xie (Shanghai Jiao Tong University); Li Song (Shanghai Jiao Tong University)
824 SOS! Self-supervised Learning Over Sets Of Handled Objects In Egocentric Action Recognition Victor A Escorcia (Samsung AI Center)*; Ricardo Guerrero (Samsung AI Center Cambridge); Xiatian Zhu (Samsung AI Centre); Brais Martinez (Samsung AI Center)
930 LaMAR: Benchmarking Localization and Mapping for Augmented Reality Paul-Edouard Sarlin (ETH Zurich); Mihai Dusmanu (ETH Zurich)*; Johannes L Schönberger (Microsoft); Pablo Speciale (Microsoft); Lukas Gruber (Microsoft); Viktor Larsson (Lund University); Ondrej Miksik (Microsoft); Marc Pollefeys (ETH Zurich / Microsoft)
1029 Few-shot Class-incremental Learning via Entropy-regularized Data-free Replay Huan Liu (McMaster University)*; Li Gu (Huawei Canada); Zhixiang Chi (Huawei Noah’s Ark Laboratory); Yuanhao Yu (Huawei Noah’s Ark Laboratory); Yang Wang (Concordia University); Jun Chen (McMaster University); Jin Tang (Huawei Noah’s Ark Laboratory)
1076 Identity-aware Hand Mesh Estimation and Personalization from RGB Images Deying Kong (University of California, Irvine)*; Linguang Zhang (Facebook Reality Labs); Liangjian Chen (Reality Labs); Haoyu Ma (University of California, Irvine); Xiangyi Yan (University of California, Irvine); Shanlin Sun (University of California, Irvine); Xingwei Liu (University of California Irvine); Kun Han (University of California Irvine); Xiaohui Xie (University of California, Irvine)
1136 Prune Your Model Before Distill It JinHyuk Park (Hongik University); Albert No (Hongik University)*
1161 PASS: Part-Aware Self-Supervised Pre-Training for Person Re-Identification Kuan Zhu (Institute of Automation, Chinese Academy of Sciences)*; Haiyun Guo (CASIA); Tianyi Yan (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences); Yousong Zhu (Institute of Automation, Chinese Academy of Sciences); Jinqiao Wang (Institute of Automation, Chinese Academy of Sciences); Ming Tang (Institute of Automation, Chinese Academy of Sciences)
1408 Improving Few-Shot Part Segmentation using Coarse Supervision Oindrila Saha (University of Massachusetts Amherst)*; Zezhou Cheng (University of Massachusetts, Amherst); Subhransu Maji (University of Massachusetts, Amherst)
1531 MIME: Minority Inclusion for Majority Group Enhancement of AI Performance Pradyumna Chari (UCLA); Yunhao Ba (UCLA)*; Shreeram Athreya (UCLA); Achuta Kadambi (UCLA)
1586 Perception-Distortion Balanced ADMM Optimization for Single-Image Super-Resolution Yuehan Zhang (National University of Singapore)*; Bo Ji (National University of Singapore); Jia Hao (HiSilicon (Shanghai) Technologies Co., Ltd); Angela Yao (National University of Singapore)
1606 Hierarchical Contrastive Inconsistency Learning for Deepfake Video Detection Zhihao Gu (Shanghai Jiao Tong University)*; Taiping Yao (Tencent YouTu); Yang Chen (Tencent); Shouhong Ding (Tencent); Lizhuang Ma (Shanghai Jiao Tong University)
1687 Fast-Vid2Vid: Spatial-Temporal Compression for Video-to-Video Synthesis Long Zhuo (Shanghai AI Lab)*; Guangcong Wang (Nanyang Technological University); Shikai Li (SenseTime Research); Wayne Wu (SenseTime Research); Ziwei Liu (Nanyang Technological University)
1701 ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer Hongkai Chen (HKUST)*; Zixin Luo (Apple Inc.); Lei Zhou (Apple); Yurun Tian (Apple); Zhen Mingmin (Apple Inc.); Tian Fang (Apple); David N McKinnon (Apple); Yanghai Tsin (Apple Inc); Long Quan (Apple)
1709 Egocentric Activity Recognition and Localization on a 3D Map Miao Liu (Georgia Institute of Technology)*; Lingni Ma (Facebook Reality Labs); Kiran Somasundaram (Facebook Reality Labs); Yin Li (University of Wisconsin-Madison); Kristen Grauman (Facebook AI Research & UT Austin); James Rehg (Georgia Institute of Technology); Chao Li (Facebook Reality Labs)
2144 Video Question Answering with Iterative Video-Text Co-Tokenization AJ Piergiovanni (Google)*; Kairo Morton (Massachusetts Institute of Technology); Weicheng Kuo (Google); Michael S Ryoo (Google; Stony Brook University); Anelia Angelova (Google)
2145 LaTeRF: Label and Text Driven Object Radiance Fields Ashkan Mirzaei (University of Toronto)*; Yash Mukund Kant (University of Toronto); Jonathan Kelly (University of Toronto); Igor Gilitschenski (University of Toronto)
2170 Decomposing The Tangent of Occluding Boundaries According to Curvatures and Torsions Huizong Yang (Georgia Institute of Technology)*; Anthony Yezzi (Georgia Institute of Technology)
2378 StyleLight: HDR Panorama Generation for Lighting Estimation and Editing Guangcong Wang (Nanyang Technological University)*; Yinuo Yang (Nanyang Technological University); Chen Change Loy (Nanyang Technological University); Ziwei Liu (Nanyang Technological University)
2761 SP-Net: Slowly Progressing Dynamic Inference Networks Huanyu Wang (Zhejiang University)*; Wenhu Zhang (Zhejiang University); Shihao Su (Zhejiang University); Hui Wang (Zhejiang University); Zhenwei Miao (DAMO Academy, Alibaba Group); Xin Zhan (DAMO Academy, Alibaba Group); Xi Li (Zhejiang University)
2764 No Token Left Behind: Explainability-Aided Image Classification and Generation Roni Paiss (Tel Aviv University, Google); Hila Chefer (Tel Aviv University)*; Lior Wolf (Tel Aviv University, Israel)
2786 CompNVS: Novel View Synthesis with Scene Completion Zuoyue Li (ETH Zurich)*; Tianxing Fan (Zhejiang University); Zhenqiang Li (The University of Tokyo); Zhaopeng Cui (Zhejiang University); Yoichi Sato (University of Tokyo); Marc Pollefeys (ETH Zurich / Microsoft); Martin R. Oswald (ETH Zurich)
2877 Fast Two-step Blind Optical Aberration Correction Thomas Eboli (ENS Paris-Saclay)*; Jean-Michel Morel (Centre Borelli ENS Paris-Saclay); Gabriele Facciolo (ENS Paris-Saclay)
2903 PseudoClick: Interactive Image Segmentation with Click Imitation Qin Liu (UNC)*; Meng Zheng (United Imaging Intelligence); Benjamin Planche (United Imaging Intelligence); Srikrishna Karanam (Adobe Research); Terrence Chen (United Imaging Intelligence); Marc Niethammer (UNC); Ziyan Wu (United Imaging Intelligence)
2943 Trust, but Verify: Using Self-Supervised Probing to Improve Trustworthiness Ailin Deng (National University of Singapore)*; Shen Li (National University of Singapore); Miao Xiong (National University of Singapore); Zhirui Chen (National University of Singapore); Bryan Hooi (National University of Singapore)
3295 Rethinking IoU-based Optimization for Single-stage 3D Object Detection Hualian Sheng (Zhejiang University)*; Sijia Cai (DAMO Academy, Alibaba Group); Na Zhao (NUS); Bing Deng (Damo Academy, Alibaba Group); Jianqiang Huang (Damo Academy, Alibaba Group); Xian-Sheng Hua (Damo Academy, Alibaba Group); Min-Jian Zhao (Zhejiang University); Gim Hee Lee (National University of Singapore)
3324 ASSISTER: Assistive Navigation via Conditional Instruction Generation Zanming Huang (Boston University); Zhongkai Shangguan (Boston University); Jimuyang Zhang (Boston University); Gilad Bar (Rutgers University – Camden); Matthew Boyd (Boston University); Eshed Ohn-Bar (Boston University)*
3393 Semi-Supervised Vision Transformers Zejia Weng (Fudan University)*; Xitong Yang (University of Maryland); Ang Li (Google DeepMind); Zuxuan Wu (UMD); Yu-Gang Jiang (Fudan University)
3394 Learning an Isometric Surface Parameterization for Texture Unwrapping Sagnik Das (Stony Brook University)*; Ke Ma (Stony Brook University); Zhixin Shu (Adobe Research); Dimitris Samaras (Stony Brook University)
3755 Learning to Censor by Noisy Sampling Ayush Chopra (MIT)*; Abhinav Java (Adobe, MDSR Labs); Abhishek Singh (MIT); Vivek Sharma (MIT); Ramesh Raskar (Massachusetts Institute of Technology)
3781 TAVA: Template-free Animatable Volumetric Actors Ruilong Li (UC Berkeley)*; Julian Tanke (University of Bonn); Minh P Vo (Facebook Reality Labs); Michael Zollhöfer (Facebook Reality Labs); Jürgen Gall (University of Bonn); Angjoo Kanazawa (University of California Berkeley); Christoph Lassner (Meta Reality Labs Research)
3798 DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection Abhinav Kumar (Michigan State University)*; Garrick Brazil (Facebook); Enrique Corona (Ford Motor Company); Armin Parchami (Ford Motor Company); Xiaoming Liu (Michigan State University)
3984 Novel Class Discovery without Forgetting Joseph K J (Indian Institute of Technology, Hyderabad)*; Sujoy Paul (Google Research); Gaurav Aggarwal (Google); Soma Biswas (Indian Institute of Science, Bangalore); Piyush Rai (IIT Kanpur); Kai Han (The University of Hong Kong); Vineeth N Balasubramanian (Indian Institute of Technology, Hyderabad)
4019 Spatially Invariant Unsupervised 3D Object-Centric Learning and Scene Decomposition Tianyu Wang (The Australian National University); Miaomiao Liu (The Australian National University)*; Kee Siong Ng (The Australian National University)
4074 Negative Samples are at Large: Leveraging Hard-distance Elastic Loss for Re-identification Hyungtae Lee (DEVCOM Army Research Laboratory)*; Sungmin Eum (Booz Allen Hamilton Inc.); Heesung Kwon (U.S. Army Research Laboratory)
4119 Sound Localization by Self-Supervised Time Delay Estimation Ziyang Chen (University of Michigan)*; David Fouhey (University of Michigan); Andrew Owens (U Michigan)
4206 Chairs Can be Stood on: Overcoming Object Bias in Human-Object Interaction Detection Guangzhi Wang (National University of Singapore)*; Yangyang Guo (National University of Singapore); Yongkang Wong (National University of Singapore); Mohan Kankanhalli (National University of Singapore)
4269 Aware of the History: Trajectory Forecasting with the Local Behavior Data Yiqi Zhong (University of Southern California)*; Zhenyang Ni (Shanghai Jiao Tong University); Siheng Chen (Shanghai Jiao Tong University); Ulrich Neumann (USC)
4270 FAR: Fourier Aerial Video Recognition Divya Kothandaraman (University of Maryland College Park)*; Tianrui Guan (University of Maryland, College Park); Xijun Wang (University of Maryland, College Park); Shuowen Hu (US Army Research Laboratory); Ming C Lin (UMD-CP & UNC-CH); Dinesh Manocha (University of Maryland at College Park)
4420 PETR: Position Embedding Transformation for Multi-View 3D Object Detection Yingfei Liu (Megvii Technology); Tiancai Wang (Megvii Technology)*; Xiangyu Zhang (Megvii Technology); Jian Sun (Megvii Technology)
4451 S2Net: Stochastic Sequential Pointcloud Forecasting Xinshuo Weng (NVIDIA Research)*; Junyu Nan (Carnegie Mellon University); Kuan-Hui Lee (Toyota Research Institute); Rowan McAllister (Toyota Research Institute); Adrien Gaidon (Toyota Research Institute); Nicholas Rhinehart (UC Berkeley); Kris Kitani (Carnegie Mellon University)
4452 D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding Zhenyu Chen (Technical University of Munich)*; Qirui Wu (Simon Fraser University); Matthias Niessner (Technical University of Munich); Angel X Chang (Simon Fraser University)
4537 TinyViT: Fast Pretraining Distillation for Small Vision Transformers Kan Wu (Sun Yat-sen University); Jinnian Zhang (University of Wisconsin Madison); Houwen Peng (Microsoft Research)*; Mengchen Liu (Microsoft); Bin Xiao (Microsoft); Jianlong Fu (Microsoft Research); Lu Yuan (Microsoft)
4566 D2ADA: Dynamic Density-aware Active Domain Adaptation for Semantic Segmentation Tsung-Han Wu (National Taiwan University)*; Yi-Syuan Liou (National Taiwan University); Shao-Ji Yuan (National Taiwan University); Hsin-Ying Lee (National Taiwan University); Tung-I Chen (National Taiwan University); Kuan-Chih Huang (National Taiwan University); Winston H. Hsu (National Taiwan University)
4614 FILM: Frame Interpolation for Large Motion Fitsum Reda (Google)*; Janne Kontkanen (Google); Eric Tabellion (Google); Deqing Sun (Google); Caroline Pantofaru (Google Research); Brian Curless (University of Washington)
4790 A Deep Moving-camera Background Model Guy Erez (Ben Gurion University)*; Ron A Shapira Weber (Ben-Gurion University); Oren Freifeld (Ben-Gurion University)
4874 Quantum Motion Segmentation Federica Arrigoni (University of Trento)*; Willi Menapace (University of Trento); Marcel Seelbach Benkner (University of Siegen); Elisa Ricci (University of Trento); Vladislav Golyanik (MPI for Informatics)
4888 Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization Jiaxin Qi (Nanyang Technological University)*; Kaihua Tang (Nanyang Technological University); Qianru Sun (Singapore Management University); Xian-Sheng Hua (Damo Academy, Alibaba Group); Hanwang Zhang (Nanyang Technological University)
4929 Few-Shot Classification with Contrastive Learning Zhanyuan Yang (Shenzhen University); Jinghua Wang (Harbin Institute of Technology); Yingying Zhu (Shenzhen University)*
4976 StyleBabel: Artistic Style Tagging and Captioning Dan Ruta (University of Surrey)*; Andrew Gilbert (University of Surrey); Pranav V Aggarwal (Adobe Inc.); Naveen Marri (Adobe Inc); Ajinkya Kale (Adobe); Jo Briggs (University of Northumbria); Chris Speed (University of Edinburgh); Hailin Jin (Adobe Research); Baldo Faieta (Adobe); Alex Filipkowski (Adobe); Zhe Lin (Adobe Research); John Collomosse (Adobe Research)
5017 Detecting Generated Images by Real Images Bo Liu (Chongqing University of Posts and Telecommunications); fan yang (Chongqing University of Posts and Telecommunications); Xiuli Bi (Chongqing University of Posts and Telecommunications); Bin Xiao (Chongqing University of Posts and Telecommunications)*; Weisheng Li (Chongqing University of Posts and Telecommunications); Xinbo Gao (Chongqing University of Posts and Telecommunications)
5018 VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection Joanna Hong (KAIST)*; Minsu Kim (KAIST); Yong Man Ro (KAIST)
5088 Sim-to-Real 6D Object Pose Estimation via Iterative Self-training for Robotic Bin Picking Kai Chen (The Chinese University of Hong Kong); Rui Cao (The Chinese University of Hong Kong); Stephen L James (UC Berkeley); YICHUAN LI (CUHK); Yunhui Liu (CUHK); Pieter Abbeel (UC Berkeley); Qi Dou (The Chinese University of Hong Kong)*
5189 Background-Insensitive Scene Text Recognition with Text Semantic Segmentation Liang Zhao (University of South Carolina)*; Zhenyao Wu (University of South Carolina); Xinyi Wu (University of South Carolina); Greg Wilsbacher (University of South Carolina); Song Wang (University of South Carolina)
5207 MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning David Junhao Zhang (National University of Singapore)*; Kunchang Li (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yali Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yunpeng Chen (National University of Singapore); Shashwat Chandra (National University of Singapore); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Luoqi Liu (meitu); Mike Zheng Shou (National University of Singapore)
5215 Semantic Novelty Detection via Relational Reasoning Francesco Cappio Borlino (Politecnico di Torino); Silvia Bucci (Italian Institute of Technology)*; Tatiana Tommasi (Politecnico di Torino)
5261 DexMV: Imitation Learning for Dexterous Manipulation from Human Videos Yuzhe Qin (University of California San Diego)*; Yueh-Hua Wu (UCSD); Shaowei Liu (UIUC); Hanwen Jiang (UT Austin); Ruihan Yang (UC San Diego); Yang Fu (UCSD); Xiaolong Wang (UCSD)
5330 CoVisPose: Co-Visibility Pose Transformer for Wide-Baseline Relative Pose Estimation in 360 Indoor Panoramas Will A Hutchcroft (Zillow Group)*; Yuguang Li (Zillow Group); Ivaylo Boyadzhiev (Zillow Group); Zhiqiang Wan (Zillow); Haiyan Wang (The City College of New York); Sing Bing Kang (Zillow Group)
5419 GraphFit: Learning Multi-scale Graph-Convolutional Representation for Point Cloud Normal Estimation Keqiang Li (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences)*; Mingyang Zhao (University of Chinese Academy of Sciences & Beijing Academy of Artificial Intelligence); Huaiyu Wu (Institute of Automation, Chinese Academy of Sciences); Dong-Ming Yan (NLPR, CASIA); Zhen Shen (Institute of Automation, Chinese Academy of Sciences/Qingdao Academy of Intelligent Industries); Fei-Yue Wang (Institute of Automation, Chinese Academy of Sciences); gang xiong (CASIA)
5430 EleGANt: Exquisite and Locally Editable GAN for Makeup Transfer Chenyu Yang (Tsinghua University)*; Wanrong He (Tsinghua University); Yingqing Xu (Tsinghua University); Yang Gao (Tsinghua University)
5561 POP: Mining POtential Performance of new fashion products via webly cross-modal query expansion Christian Joppi (Humatics srl)*; Geri Skenderi (University of Verona); Marco Cristani (University of Verona)
5769 Photo-realistic Neural Domain Randomization Sergey Zakharov (Toyota Research Institute)*; Rareș A Ambruș (Toyota Research Institute); Vitor Guizilini (Toyota Research Institute); Wadim Kehl (Woven Planet); Adrien Gaidon (Toyota Research Institute)
5923 What Matters for 3D Scene Flow Network Guangming Wang (Shanghai Jiao Tong University); Yunzhe Hu (Shanghai Jiao Tong University); Zhe Liu (University of Cambridge); Yiyang Zhou (UC Berkeley); Masayoshi TOMIZUKA (MSC Lab); Wei Zhan (University of California, Berkeley); Hesheng Wang (SJTU)*
6039 Video Instance Segmentation via Multi-Scale Spatio-Temporal Split Attention Transformer Omkar Thawakar (MBZUAI)*; Sanath Narayan (Inception Institute of Artificial Intelligence); Jiale Cao (Tianjin University); Hisham Cholakkal (MBZUAI); Rao Muhammad Anwer (MBZUAI/AALTO); Muhammad Haris Khan (Muhammad Bin Zayed University of Artificial Intelligence); Salman Khan (MBZUAI/ANU); Michael Felsberg (Linköping University); Fahad Shahbaz Khan (MBZUAI)
6076 Towards Scale-Aware, Robust, and Generalizable Unsupervised Monocular Depth Estimation by Integrating IMU Motion Dynamics Sen Zhang (The University of Sydney); Jing Zhang (The University of Sydney)*; Dacheng Tao (The University of Sydney)
6086 SecretGen: Privacy Recovery on Pre-trained Models via Distribution Discrimination Zhuowen Yuan (UIUC); Fan Wu (UIUC); Yunhui Long (University of Illinois); Chaowei Xiao (NVIDIA); Bo Li (UIUC)*
6092 Temporally Consistent Transformer for Video Denoising Mingyang Song (ETH Zurich)*; Yang Zhang (Disney Research Studios); Tunç Aydin (Disney Research)
6138 Revisiting Batch Norm Initialization Jim Davis (Ohio State University); Logan Frank (Ohio State University)*
6214 Webly Supervised Concept Expansion for General Purpose Vision Models Amita Kamath (Allen Institute for Artificial Intelligence); Christopher A Clark (Allen Institute for AI)*; Tanmay Gupta (Allen Institute for Artificial Intelligence); Eric Kolve (Allen AI); Derek Hoiem (University of Illinois at Urbana-Champaign); Aniruddha Kembhavi (Allen Institute for Artificial Intelligence)
6216 Compositional Human-Scene Interaction Synthesis with Semantic Control Kaifeng Zhao (ETH Zurich)*; Shaofei wang (ETH Zurich); Yan Zhang (ETH Zurich); Thabo Beeler (Disney Research | Studios); Siyu Tang (ETH Zurich)
6265 SPViT: Enabling Faster Vision Transformers via Soft Token Pruning Zhenglun Kong (Northeastern University)*; Peiyan Dong (Northeastern University); Xiaolong Ma (Clemson University); Xin Meng (Peking university); Wei Niu (William & Mary); Mengshu Sun (Northeastern University); Xuan Shen (Northeastern University); Geng Yuan (Northeastern University); Bin Ren (William & Mary); Hao Tang (ETH Zurich); Minghai Qin (Western Digital Research); Yanzhi Wang (Northeastern University)
6282 Tailoring Self-Supervision for Supervised Learning WonJun Moon (Sungkyunkwan University)*; Jihwan Kim (Sungkyunkwan University); Jae-Pil Heo (Sungkyunkwan University)
6283 Difficulty-Aware Simulator for Open Set Recognition WonJun Moon (Sungkyunkwan University)*; Jun ho Park (Sungkyunkwan university); Hyun Seok Seong (Sungkyunkwan University); Cheol-Ho Cho (Sungkyunkwan University); Jae-Pil Heo (Sungkyunkwan University)
6324 Tomography of Turbulence Strength Based on Scintillation Imaging Nir Shaul (Technion)*; Schechner Yoav (Technion)
6366 Few-Shot Class-Incremental Learning from an Open-Set Perspective Can Peng (the University of Queensland)*; Kun Zhao (Sullivan Nicolaides Pathology); Tianren Wang (The University of Queensland); Meng Li (The University of Queensland); Brian C Lovell (University of Queensland)
6389 DRCNet: Dynamic Image Restoration Contrastive Network Fei Li (China Agricultural University)*; Lingfeng Shen (Tencent AI Lab); YANG MI (China Agricultural University); Zhenbo Li (China Agricultural University)
6541 Unsupervised High-Fidelity Facial Texture Generation and Reconstruction Ron Slossberg (Technion)*; Ibrahim Jubran (The University of Haifa); Ron Kimmel (Technion)
6627 Language-Driven Artistic Style Transfer Tsu-Jui Fu (UCSB)*; Xin Eric Wang (University of California, Santa Cruz); William Yang Wang (UC Santa Barbara)
6639 Transformer with Implicit Edges for Particle-based Physics Simulation Yidi Shao (Nanyang Technological University)*; Chen Change Loy (Nanyang Technological University); Bo Dai (Shanghai AI Lab)
6665 OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning Mamshad Nayeem Rizve (University of Central Florida)*; Navid Kardan (University of Central Florida); Salman Khan (MBZUAI/ANU); Fahad Shahbaz Khan (MBZUAI); Mubarak Shah (University of Central Florida)
6838 DenseHybrid: Hybrid Anomaly Detection for Dense Open-set Recognition Matej Grcić (University of Zagreb, Faculty of Electrical Engineering and Computing)*; Petra Bevandić (Faculty of Electrical Engineering and Computing); Sinisa Segvic (UniZg-FER)
6976 Uncertainty-guided Source-free Domain Adaptation Subhankar Roy (University of Trento)*; Martin Trapp (Aalto University); Andrea Pilzer (NVIDIA); Juho Kannala (Aalto University, Finland); Nicu Sebe (University of Trento); Elisa Ricci (University of Trento); Arno Solin (Aalto University)
7077 HM: Hybrid Masking for Few-Shot Segmentation Seonghyeon Moon (Rutgers University)*; Samuel S Sohn (Rutgers University); Honglu Zhou (Rutgers University); Sejong Yoon (The College of New Jersey); Vladimir Pavlovic (Rutgers University); Muhammad Haris Khan (Muhammad Bin Zayed University of Artificial Intelligence); Mubbasir Kapadia (Rutgers)
7427 Masked Siamese Networks for Label-Efficient Learning Mahmoud Assran (Facebook AI)*; Mathilde Caron (Facebook Artificial Intelligence Research); Ishan Misra (Facebook AI Research); Piotr Bojanowski (Facebook); Florian Bordes (MILA); Pascal Vincent (Facebook FAIR & MILA Université de Montréal); Armand Joulin (Facebook AI Research); Mike Rabbat (Facebook FAIR); Nicolas Ballas (Facebook FAIR)
7746 FairStyle: Debiasing StyleGAN2 with Style Channel Manipulations Cemre Efe Karakas (Bogazici University); Alara Dirik (Bogazici University); Eylül Yalçınkaya (Bogazici University); Pinar Yanardag (Bogazici University)*
7765 Super-resolution 3D Human Shape from a Single Low-Resolution Image Marco Pesavento (University of Surrey)*; Marco Volino (University of Surrey); Adrian Hilton (University of Surrey)
7847 MINER: Multiscale Implicit Neural Representation Vishwanath Saragadam (Rice University)*; Jasper T Tan (Rice University); Guha Balakrishnan (Rice University); Richard Baraniuk (Rice University); Ashok Veeraraghavan (Rice University)
7874 Distilling the Undistillable: Learning from a Nasty Teacher Surgan Jandial (MDSR Labs, Adobe)*; Yash Khasbage (Indian Institute of Technology, Hyderabad); Arghya Pal (Harvard University); Vineeth N Balasubramanian (Indian Institute of Technology, Hyderabad); Balaji Krishnamurthy ()
7883 Towards Accurate Open-Set Recognition via Background-Class Regularization Wonwoo Cho (Korea Advanced Institute of Science and Technology)*; Jaegul Choo (Korea Advanced Institute of Science and Technology)
8081 Towards Learning Neural Representations from Shadows Kushagra Tiwary (MIT)*; Tzofi M Klinghoffer (Massachusetts Institute of Technology); Ramesh Raskar (Massachusetts Institute of Technology)
8086 Augmenting Deep Classifiers with Polynomial Neural Networks Grigorios Chrysos (EPFL)*; Markos Georgopoulos (Imperial College London); Jiankang Deng (Imperial College London); Jean Kossaifi (NVIDIA); Yannis Panagakis (University of Athens); Animashree Anandkumar (Caltech)
107 Minimal Neural Atlas: Parameterizing Complex Surfaces with Minimal Charts and Distortion Weng Fei Low (National University of Singapore)*; Gim Hee Lee (National University of Singapore)
939 Video Mask Transfiner for High-Quality Video Instance Segmentation Lei Ke (HKUST)*; Henghui Ding (ETH Zurich); Martin Danelljan (ETH Zurich); Yu-Wing Tai (Kuaishou Technology / HKUST); Chi-Keung Tang (Hong Kong University of Science and Technology); Fisher Yu (ETH Zurich)
1100 Domain Adaptive Hand Keypoint and Pixel Localization in the Wild Takehiko Ohkawa (The University of Tokyo)*; Yu-Jhe Li (Carnegie Mellon University); Qichen Fu (Carnegie Mellon University); Ryosuke Furuta (The University of Tokyo); Kris Kitani (Carnegie Mellon University); Yoichi Sato (University of Tokyo)
1516 Meta-Sampler: Almost-Universal yet Task-Oriented Sampling for Point Clouds Ta-Ying Cheng (University of Oxford); Qingyong Hu (University of Oxford)*; Qian Xie (University of Oxford); Niki Trigoni (University of Oxford); Andrew Markham (University of Oxford)
1759 MPIB: An MPI-Based Bokeh Rendering Framework for Realistic Partial Occlusion Effects Juewen Peng (Huazhong University of Science and Technology); Jianming Zhang (Adobe Research); Xianrui Luo (Huazhong University of Science and Technology); Hao Lu (Huazhong University of Science and Technology); Ke Xian (Huazhong University of Science and Technology)*; Zhiguo Cao (Huazhong Univ. of Sci.&Tech.)
1790 Housekeep: Tidying Virtual Households using Commonsense Reasoning Yash Mukund Kant (University of Toronto)*; Arun Ramachandran (Georgia Institute of Technology); Sriram Yenamandra (Georgia Institute of Technology); Igor Gilitschenski (University of Toronto); Dhruv Batra (Georgia Tech & Facebook AI Research); Andrew Szot (Georgia Institute of Technology); Harsh Agrawal (Georgia Institute of Technology)
2116 Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers Junhyeong Cho (POSTECH)*; Kim Youwang (POSTECH); Tae-Hyun Oh (POSTECH)
2310 Ultra-high-resolution unpaired stain transformation via Kernelized Instance Normalization Ming-Yang Ho (aetherAI)*; Min-Sheng Wu (aetherAI); Che-Ming Wu (aetherAI)
2576 Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly Spencer Whitehead (Meta AI)*; Suzanne Petryk (UC Berkeley); Vedaad Shakib (UC Berkeley); Joseph E Gonzalez (UC Berkeley); Trevor Darrell (UC Berkeley); Anna Rohrbach (UC Berkeley); Marcus Rohrbach (Facebook AI Research)
2610 A Real World Dataset for Multi-view 3D Reconstruction Rakesh Shrestha (Simon Fraser University)*; Siqi Hu (Alibaba damo academy); Minghao Gou (Shanghai Jiao Tong University); Ziyuan Liu (Huawei group); Ping Tan (Simon Fraser University)
3229 CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes Kim Youwang (POSTECH)*; Ji-Yeon Kim (POSTECH); Tae-Hyun Oh (POSTECH)
3658 NeFSAC: Neurally Filtered Minimal Samples Luca Cavalli (ETH Zurich)*; Marc Pollefeys (ETH Zurich / Microsoft); Daniel Barath (ETH Zürich)
3690 DeiT III: Revenge of the ViT Hugo Touvron (Facebook AI Research)*; Matthieu Cord (Sorbonne University); Herve Jegou (Facebook AI Research)
4029 Map-free Visual Relocalization: Metric Pose Relative to a Single Image Eduardo Arnold (University of Warwick); Jamie M Wynn (Niantic); Sara Vicente (Niantic); Guillermo Garcia-Hernando (Niantic); Aron Monszpart (Niantic); Victor A Prisacariu (Niantic Labs); Daniyar Turmukhambetov (Niantic); Eric Brachmann (Niantic)*
4076 Global-local Motion Transformer for Unsupervised Skeleton-based Action Learning Boeun Kim (Seoul National University)*; Hyung Jin Chang (University of Birmingham); Jungho Kim (KETI); Jin Young Choi (Seoul National University)
4525 The One Where They Reconstructed 3D Humans and Environments in TV Shows Georgios Pavlakos (UC Berkeley)*; Ethan Weber (UC Berkeley); Matthew Tancik (UC Berkeley); Angjoo Kanazawa (University of California Berkeley)
4656 Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds Ayush Jain (Carnegie Mellon University)*; Nikolaos Gkanatsios (Carnegie Mellon University); Ishita Mediratta (Meta AI); Katerina Fragkiadaki (Carnegie Mellon University)
4659 Discovering Deformable Keypoint Pyramids Jianing Qian (University of Pennsylvania)*; Anastasios Panagopoulos (University of Pennsylvania); Dinesh Jayaraman (University of Pennsylvania)
4951 Data Invariants to Understand Unsupervised Out-of-Distribution Detection Lars Doorenbos (University of Bern)*; Raphael Sznitman (University of Bern); Pablo Márquez Neila (University of Bern)
5020 Delta Distillation for Efficient Video Processing Amirhossein Habibian (Qualcomm AI Research)*; Haitam Ben Yahia (Qualcomm AI Research); Davide Abati (Qualcomm AI Research); Efstratios Gavves (University of Amsterdam); Fatih Porikli (Qualcomm AI Research)
5122 Completely Self-Supervised Crowd Counting via Distribution Matching deepak babu sam (Indian Institute of Science)*; Abhinav Agarwalla (Carnegie Mellon University); Jimmy Joseph (Stony Brook University); Vishwanath Sindagi (Johns Hopkins University); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science); Vishal Patel (Johns Hopkins University)
5160 CoGS: Controllable Generation and Search from Sketch and Style Cusuh Ham (Georgia Institute of Technology)*; Gemma Canet Tarrés (CVSSP, University of Surrey); Tu Bui (University of Surrey); James Hays (Georgia Institute of Technology, USA); Zhe Lin (Adobe Research); John Collomosse (Adobe Research)
5318 LidarNAS: Unifying and Searching Neural Architectures for 3D Point Clouds Chenxi Liu (Waymo)*; Zhaoqi Leng (Waymo); Pei Sun (Waymo); Shuyang Cheng (Waymo LLC); Charles R. Qi (Waymo); Yin Zhou (Waymo); Mingxing Tan (Waymo); Dragomir Anguelov (Waymo)
5340 PT4AL: Using Self-Supervised Pretext Tasks for Active Learning John Seon Keun Yi (Georgia Institute of Technology)*; Minseok Seo (si-analytics); Jongchan Park (Lunit); Dong-Geol Choi (Hanbat National University)
5500 Style-Agnostic Reinforcement Learning Juyong Lee (POSTECH); Seokjun Ahn (POSTECH); Jaesik Park (POSTECH)*
5541 Online Domain Adaptation for Semantic Segmentation in Ever-Changing Conditions Theodoros Panagiotakopoulos (KTH Royal Institute of Technology in Stockholm); Pier Luigi Dovesi (Univrses); Linus Härenstam-Nielsen (Artisense); Matteo Poggi (University of Bologna)*
5898 BigColor: Colorization using a Generative Color Prior for Natural Images Geonung Kim (POSTECH); Kyoungkook Kang (POSTECH); Seongtae Kim (POSTECH); Hwayoon Lee (POSTECH); Sehoon Kim (Samsung electronics co. ltd.); Jonghyun Kim (Samsung Electronics); Seung-Hwan Baek (POSTECH); Sunghyun Cho (POSTECH)*
5913 Open Vocabulary Object Detection with Pseudo Bounding-Box Labels Mingfei Gao (Apple)*; Chen Xing (Salesforce Research); Juan Carlos Niebles (Salesforce & Stanford University); Junnan Li (Salesforce); Ran Xu (Salesforce Research); Wenhao Liu (Salesforce Metamind); Caiming Xiong (Salesforce Research)
5914 BoundaryFace: A mining framework with noise label self-correction for Face Recognition Shijie Wu (Southwest Jiaotong University)*; Xun Gong (Southwest Jiaotong University)
5951 Combining Internal and External Constraints for Unrolling Shutter in Videos Eyal Naor (Weizmann Institute)*; Itai Antebi (Weizmann); Shai Bagon (Weizmann Institute of Science); Michal Irani (Weizmann Institute, Israel)
5993 RepMix: Representation Mixing for Robust Attribution of Synthesized Images Tu Bui (University of Surrey)*; Ning Yu (Salesforce Research); John Collomosse (Adobe Research)
6338 CXR Segmentation by AdaIN-based Domain Adaptation and Knowledge Distillation Yujin Oh (Kim Jaechul Graduate School of AI, KAIST, Korea); Jong Chul Ye (Kim Jaechul Graduate School of AI, KAIST, Korea)*
7122 Diverse Generation from a Single Video Made Possible Niv Haim (Weizmann Institute of Science)*; Ben Feinstein (Weizmann Institute of Science); Niv Granot (Weizmann Institute of Science); Assaf Shocher (Weizmann Institute of Science); Shai Bagon (Weizmann Institute of Science); Tali Dekel (Weizmann Institute of Science); Michal Irani (Weizmann Institute, Israel)
7926 BodySLAM: Joint Camera Localisation, Mapping, and Human Motion Tracking Dorian F Henning (Imperial College London)*; Tristan Laidlow (Imperial College London); Stefan Leutenegger (TU Munich)
8070 Differentiable Zooming for Multiple Instance Learning on Whole-Slide Images Kevin Thandiackal (ETH Zurich / IBM Research)*; Boqi Chen (ETH Zurich); Pushpak Pati (IBM Research Zurich); Guillaume Jaume (Harvard); Drew Williamson (Pathology, Brigham and Women’s Hospital, Harvard Medical School); Maria Gabrani (IBM Research); Orcun Goksel (ETH Zurich)
1384 GLASS: Global to Local Attention for Scene-Text Spotting Roi Ronen (Technion)*; Shahar Tsiper (Amazon); Oron Anschel (AWS); Inbal Lavi (Amazon); Amir Markovitz (Amazon); R. Manmatha (Amazon)
1637 Pointly-Supervised Panoptic Segmentation Junsong Fan (Chinese Academy of Sciences, China)*; Zhaoxiang Zhang (Chinese Academy of Sciences, China); Tieniu Tan (NLPR, China)
1729 Registration based Few-Shot Anomaly Detection Chaoqin Huang (Shanghai Jiao Tong University)*; Haoyan Guan (King’s College London); Aofan Jiang (Shanghai Jiao Tong University); Ya Zhang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University); Michael W Spratling (King’s College London); Yan-Feng Wang (Cooperative medianet innovation center of Shanghai Jiao Tong University)
4028 Decoupled Adversarial Contrastive Learning for Self-supervised Adversarial Robustness Chaoning Zhang (KAIST)*; Kang Zhang (KAIST); Chenshuang Zhang (KAIST); Axi Niu (Northwestern Polytechnical University); Jiu Feng (Sichuan University); Chang D. Yoo (KAIST); In So Kweon (KAIST)
7402 Towards Realistic Semi-Supervised Learning Mamshad Nayeem Rizve (University of Central Florida)*; Navid Kardan (University of Central Florida); Mubarak Shah (University of Central Florida)
1011 Weakly Supervised Grounding for VQA in Vision-Language Transformers Aisha Urooj (University of Central Florida)*; Hilde Kuehne (Goethe University Frankfurt); Chuang Gan (MIT-IBM Watson AI Lab); Niels da Vitoria Lobo (University of Central Florida); Mubarak Shah (University of Central Florida)
1083 Practical and Scalable Desktop-based High-Quality Facial Capture Alexandros Lattas (Imperial College London)*; Yiming Lin (Imperial college); Jayanth Kannan (Lumirithmic); Ekin Ozturk (Imperial College London); Luca Filipi (Lumirithmic); Giuseppe Claudio Guarnera (University of York); Gaurav Chawla (Lumirithmic Limited); Abhijeet Ghosh (Imperial College London)
1396 Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-modal Distillation Antonin Vobecky (Czech Technical University in Prague)*; David Hurych (Valeo.ai); Oriane Siméoni (valeo.ai); Spyros Gidaris (valeo.ai); Andrei Bursuc (valeo.ai); Patrick Pérez (Valeo.ai); Josef Sivic (Czech Technical University)
2657 SpOT: Spatiotemporal Modeling for 3D Object Tracking Colton Stearns (Stanford University)*; Davis Rempe (Stanford University); Jie Li (Toyota Research Institute); Rareș A Ambruș (Toyota Research Institute); Sergey Zakharov (Toyota Research Institute); Vitor Guizilini (Toyota Research Institute); Yanchao Yang (Stanford University); Leonidas Guibas (Stanford University)
4122 Synthesizing Light Field Video from Monocular Video Shrisudhan Govindarajan (Indian Institute of Technology Madras); Prasan A Shedligeri (Indian Institute of Technology Madras)*; Sarah Sarah (Indian Institute of Technology, Madras); Kaushik Mitra (IIT Madras)
4350 LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds Minghua Liu (UCSD)*; Yin Zhou (Waymo); Charles R. Qi (Waymo); Boqing Gong (Google); Hao Su (UCSD); Dragomir Anguelov (Waymo)
5100 EvAC3D: From Event-based Apparent Contours to 3D Models via Continuous Visual Hulls Ziyun Wang (University of Pennsylvania)*; Kenneth Chaney (University of Pennsylvania); Kostas Daniilidis (University of Pennsylvania)
5303 Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments Jacob Krantz (Oregon State University)*; Stefan Lee (Oregon State University)
6295 Pixel-wise Energy-biased Abstention Learning for Anomaly Segmentation on Complex Urban Driving Scenes Yu Tian (Australian Institute for Machine Learning, University of Adelaide); Yuyuan Liu (University of Adelaide); Guansong Pang (Singapore Management University)*; Fengbei Liu (University of Adelaide); Yuanhong Chen (University of Adelaide); Gustavo Carneiro (University of Adelaide)
843 KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients Niklas Hanselmann (Mercedes-Benz AG)*; Katrin Renz (University of Tuebingen); Kashyap Chitta (MPI-IS and University of Tuebingen); Apratim Bhattacharyya (Max Planck Institute for Informatics); Andreas Geiger (University of Tuebingen)