Accepted papers

Oral papers

Paper ID Paper Title Authors
59 Contrastive Deep Supervision Linfeng Zhang (Tsinghua University)*; Xin Chen (Intel Corp.); Junbo Zhang (Tsinghua University); Runpei Dong (Xi’an Jiaotong University); Kaisheng Ma (Tsinghua University)
116 Towards Grand Unification of Object Tracking Bin Yan (Dalian University of Technology)*; Yi Jiang (Bytedance); Peize Sun (The University of Hong Kong); Dong Wang (Dalian University of Technology); Zehuan Yuan (Bytedance.Inc); Ping Luo (The University of Hong Kong); Huchuan Lu (Dalian University of Technology)
125 SeqFormer: Sequential Transformer for Video Instance Segmentation Junfeng Wu (Huazhong University of Science and Technology); Yi Jiang (Bytedance); Song Bai (University of Oxford); Wenqing Zhang (Huazhong University of Science and Technology); Xiang Bai (Huazhong University of Science and Technology)*
162 Estimating Spatially-Varying Lighting in Urban Scenes with Disentangled Representation Jiajun Tang (Peking University); Yongjie Zhu (Beijing University of Posts and Telecommunications); Haoyu Wang (Peking University); Jun Hoong Chan (Peking University); Si Li (Beijing University of Posts and Telecommunications); Boxin Shi (Peking University)*
168 In Defense of Online Models for Video Instance Segmentation Junfeng Wu (Huazhong University of Science and Technology); Qihao Liu (Johns Hopkins University); Yi Jiang (Bytedance); Song Bai (University of Oxford); Alan Yuille (Johns Hopkins University); Xiang Bai (Huazhong University of Science and Technology)*
185 HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling Zhongang Cai (SenseTime International Pte Ltd)*; Daxuan Ren (Nanyang Technological University); Ailing Zeng (The Chinese University of Hong Kong); Zhengyu Lin (SenseTime); Tao Yu (Tsinghua University); Wenjia Wang (SenseTime); Xiangyu Fan (Sensetime); Yang Gao (Sensetime); Yifan Yu (ETH Zurich); Liang Pan (Nanyang Technological University); Fangzhou Hong (Nanyang Technological University); Mingyuan Zhang (Nanyang Technological University); Chen Change Loy (Nanyang Technological University); Lei Yang (Sensetime Group Limited); Ziwei Liu (Nanyang Technological University)
193 Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph Honghui Yang (Zhejiang University)*; Zili Liu (ZJU); Xiaopei Wu (Zhejiang University); Wenxiao Wang (State Key Lab of CAD&CG, Zhejiang University); Wei Qian (Fabu Inc.); Xiaofei He (Zhejiang University); Deng Cai (ZJU)
213 PointScatter: Point Set Representation for Tubular Structure Extraction Dong Wang (Peking University)*; Zhao Zhang (Peking University); Ziwei Zhao (Peking University); Yuhang Liu (Yizhun Medical AI Co., Ltd); Yihong Chen (Peking University); Liwei Wang (Peking University)
229 D&D: Learning Human Dynamics from Dynamic Camera Jiefeng Li (Shanghai Jiao Tong University)*; Siyuan Bian (Shanghai Jiao Tong University); Chao Xu (Tencent); Gang Liu (Tencent Inc.); Gang Yu (Tencent); Cewu Lu (Shanghai Jiao Tong University)
413 On Mitigating Hard Clusters for Face Clustering Yingjie Chen (Peking University); Huasong Zhong (Damo Academy, Alibaba Group); Chong Chen (Alibaba Group)*; Chen Shen (Alibaba Group); Jianqiang Huang (Damo Academy, Alibaba Group); Tao Wang (Peking University); Yun Liang (Peking University); Qianru Sun (Singapore Management University)
415 Recurrent Bilinear Optimization for Binary Neural Networks Sheng Xu (Beihang University)*; Yanjing Li (Beihang University); Tiancheng Wang (Beihang University); Teli Ma (Shanghai Artificial Intelligence Laboratory); Baochang Zhang (Beihang University); Peng Gao (Chinese University of Hong Kong); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Jinhu Lu (Beihang University, Beijing, China); Guodong Guo (IDL, Baidu Research)
561 Particle Video Revisited: Tracking Through Occlusions Using Point Trajectories Adam Harley (Carnegie Mellon University)*; Zhaoyuan Fang (Carnegie Mellon University); Katerina Fragkiadaki (Carnegie Mellon University)
617 Open-Set Semi-Supervised Object Detection Yen-Cheng Liu (Georgia Institute of Technology)*; Chih-Yao Ma (Facebook); Xiaoliang Dai (Facebook); Junjiao Tian (Georgia Institute of Technology); Peter Vajda (Facebook); Zijian He (Facebook); Zsolt Kira (Georgia Institute of Technology)
631 Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation Xian Liu (The Chinese University of Hong Kong)*; Yinghao Xu (Chinese University of Hong Kong); Qianyi Wu (Monash University); Hang Zhou (The Chinese University of Hong Kong); Wayne Wu (SenseTime Research); Bolei Zhou (UCLA)
640 Long-tail Detection with Effective Class-Margins Jang Hyun Cho (The University of Texas at Austin)*; Philipp Kraehenbuehl (UT Austin)
669 SeqTR: A Simple yet Universal Network for Visual Grounding Chaoyang Zhu (Xiamen University)*; Yiyi Zhou (Xiamen University); Yunhang Shen (Xiamen University); Gen Luo (Xiamen University); Xingjia Pan (Momenta.ai); Mingbao Lin (Xiamen University, China); Chao Chen (Youtu Laboratory); Liujuan Cao (Xiamen University); Xiaoshuai Sun (Xiamen University); Rongrong Ji (Xiamen University, China)
735 ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound Yan-Bo Lin (UNC Chapel Hill)*; Jie Lei (UNC Chapel Hill); Mohit Bansal (University of North Carolina at Chapel Hill); Gedas Bertasius (UNC Chapel Hill)
843 KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients Niklas Hanselmann (Mercedes-Benz AG)*; Katrin Renz (University of Tuebingen); Kashyap Chitta (MPI-IS and University of Tuebingen); Apratim Bhattacharyya (Max Planck Institute for Informatics); Andreas Geiger (University of Tuebingen)
910 Extract Free Dense Labels from CLIP Chong Zhou (Nanyang Technological University)*; Chen Change Loy (Nanyang Technological University); Bo Dai (Shanghai AI Lab)
974 Frequency Domain Model Augmentation for Adversarial Attack Yuyang Long (University of Electronic Science and Technology of China)*; Qilong Zhang (University of Electronic Science and Technology of China); Boheng Zeng (University of Electronic Science and Technology of China); Lianli Gao (The University of Electronic Science and Technology of China); Xianglong Liu (BUAA); Jian Zhang (College of Computer Science and Electronic Engineering, HNU); Jingkuan Song (UESTC)
993 Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors Oran Gafni (Meta AI Research)*; Adam Polyak (Facebook); Oron Ashual (Facebook AI Research); Shelly Sheynin (Meta); Devi Parikh (Georgia Tech & Facebook AI Research); Yaniv Taigman (Facebook)
1011 Weakly Supervised Grounding for VQA in Vision-Language Transformers Aisha Urooj (University of Central Florida)*; Hilde Kuehne (University of Frankfurt); Chuang Gan (MIT-IBM Watson AI Lab); Niels da Vitoria Lobo (University of Central Florida); Mubarak Shah (University of Central Florida)
1083 Practical and Scalable Desktop-based High-Quality Facial Capture Alexandros Lattas (Imperial College London)*; Yiming Lin (Imperial college); Jayanth Kannan (Lumirithmic); Ekin Ozturk (Imperial College London); Luca Filipi (Lumirithmic); Giuseppe Claudio Guarnera (University of York); Gaurav Chawla (Lumirithmic Limited); Abhijeet Ghosh (Imperial College London)
1185 Tracking Objects as Pixel-wise Distributions Zelin Zhao (The Chinese University of Hong Kong)*; Ze Wu (Megvii); Yueqing Zhuang (Megvii Inc Company); Boxun Li (Megvii Inc.); Jiaya Jia (Chinese University of Hong Kong)
1212 CMD: Self-supervised 3D Action Representation Learning with Cross-modal Mutual Distillation Yunyao Mao (University of Science and Technology of China)*; Wengang Zhou (University of Science and Technology of China); Zhenbo Lu (Institute of Artificial Intelligence, Hefei Comprehensive National Science Center); Jiajun Deng (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China)
1248 Open-Vocabulary DETR with Conditional Matching Yuhang Zang (Nanyang Technological University)*; Wei Li (Nanyang Technological University); Kaiyang Zhou (Nanyang Technological University); Chen Huang (Apple); Chen Change Loy (Nanyang Technological University)
1250 Towards Calibrated Hyper-sphere Representation via Distribution Overlap Coefficient for Long-tailed Learning Hualiang Wang (Zhejiang University)*; Siming FU (Zhejiang University); Xiaoxuan He (Zhejiang University); Hangxiang Fang (Zhejiang University); Zuozhu Liu (Zhejiang-UIUC Institute); Haoji Hu (Zhejiang University, China)
1272 FBNet: Feedback Network for Point Cloud Completion Xuejun Yan (Hikvision Research Institute)*; Hongyu Yan (Sichuan University); Jingjing Wang (Hikvision Research Institute); Hang Du (Hikvision Research Institute); Zhihong Wu (Sichuan University); Di Xie (Hikvision Research Institute); Shiliang Pu (Hikvision Research Institute); Li Lu (Sichuan University)
1276 Physically-Based Editing of Indoor Scene Lighting from a Single Image Zhengqin Li (Meta)*; Jia Shi (Carnegie Mellon University); Sai Bi (Adobe Research); Rui Zhu (University of California San Diego); Kalyan Sunkavalli (Adobe Research); Milos Hasan (Adobe Research); Zexiang Xu (Adobe Research); Ravi Ramamoorthi (University of California San Diego); Manmohan Chandraker (UC San Diego)
1384 GLASS: Global to Local Attention for Scene-Text Spotting Roi Ronen (Technion)*; Shahar Tsiper (Amazon); Oron Anschel (AWS); Inbal Lavi (Amazon); Amir Markovitz (Amazon); R. Manmatha (Amazon)
1396 Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-modal Distillation Antonin Vobecky (Czech Technical University in Prague)*; David Hurych (Valeo.ai); Oriane Siméoni (valeo.ai); Spyros Gidaris (valeo.ai); Andrei Bursuc (valeo.ai); Patrick Pérez (Valeo.ai); Josef Sivic (Czech Technical University)
1398 Expanding Language-Image Pretrained Models for General Video Recognition Bolin Ni (Institute of Automation, Chinese Academy of Sciences); Houwen Peng (Microsoft Research)*; Minghao Chen (Stony Brook University); Songyang Zhang (University of Rochester); Gaofeng Meng (Chinese Academy of Sciences); Jianlong Fu (Microsoft Research); SHIMING XIANG (Chinese Academy of Sciences, China); Haibin Ling (Stony Brook University)
1407 Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes Julian Chibane (Max Planck Institute for Informatics, University of Wuerzburg)*; Francis Engelmann (ETH AI Center); Anh Tuan Tran (Max Planck Institute for Informatics, Saarland University); Gerard Pons-Moll (University of Tübingen)
1413 Pose-NDF: Modelling Human Pose Manifolds with Neural Distance Fields Garvita Tiwari (MPI-INF, University of Tübingen)*; Dimitrije Antic (University of Tuebingen); Jan E. Lenssen (TU Dortmund); Nikolaos Sarafianos (Facebook Reality Labs); Tony Tung (Facebook Reality Labs); Gerard Pons-Moll (University of Tübingen)
1448 Multimodal Object Detection via Probabilistic Ensembling Yi-Ting Chen (University of Maryland); Jinghao Shi (Carnegie Mellon University); Zelin Ye (CMU); Christoph Mertz (CMU); Deva Ramanan (Carnegie Mellon University); Shu Kong (Carnegie Mellon University)*
1545 CenterFormer: Center-based Transformer for 3D Object Detection Zixiang Zhou (University of Central Florida)*; Xiangchen Zhao (TuSimple); Yu Wang (TuSimple); Panqu Wang (TuSimple, Inc); Hassan Foroosh (University of Central Florida)
1552 Revisiting a kNN-based Image Classification System with High-capacity Storage Kengo Nakata (Kioxia Corporation)*; Youyang Ng (Kioxia Corporation); Daisuke Miyashita (Kioxia Corporation); Asuka Maki (Kioxia Corporation); Yu-Chieh Lin (Kioxia Corporation); Jun Deguchi (Kioxia Corporation)
1588 TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation Zhaoyuan Yin (Hunan University)*; Pichao Wang (Alibaba Group); Fan Wang (Alibaba Group); Xianzhe Xu (Alibaba Group); Hanling Zhang (Hunan University); Hao Li (Alibaba Group); Rong Jin (Alibaba Group)
1617 VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder Yu-Chao Gu (Nankai University)*; Xintao Wang (Tencent); Liangbin Xie (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China); Chao Dong (SIAT); Gen LI (Tencent); Ying Shan (Tencent); Ming-Ming Cheng (Nankai University)
1620 CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation Zhihao Li (Huawei Noah’s Ark Lab)*; Jianzhuang Liu (Huawei Noah’s Ark Lab); Zhensong Zhang (Huawei Noah’s Ark Lab); Songcen Xu (Huawei Noah’s Ark Lab); Youliang Yan (Huawei Noah’s Ark Lab)
1637 Pointly-Supervised Panoptic Segmentation Junsong Fan (Chinese Academy of Sciences, China)*; Zhaoxiang Zhang (Chinese Academy of Sciences, China); Tieniu Tan (NLPR, China)
1729 Registration based Few-Shot Anomaly Detection Chaoqin Huang (Shanghai Jiao Tong University)*; Haoyan Guan (King’s College London); Aofan Jiang (Shanghai Jiao Tong University); Ya Zhang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University); Michael W Spratling (King’s College London); Yan-Feng Wang (Cooperative Medianet Innovation Center of Shanghai Jiao Tong University)
1742 A Level Set Theory for Neural Implicit Evolution under Explicit Flows Ishit Mehta (University of California San Diego)*; Manmohan Chandraker (UC San Diego); Ravi Ramamoorthi (University of California San Diego)
1791 Improving Robustness by Enhancing Weak Subnets Yong Guo (Max Planck Institute for Informatics)*; David Stutz (Max Planck Institute for Informatics); Bernt Schiele (MPI Informatics)
1792 TO-Scene: A Large-scale Dataset for Understanding 3D Tabletop Scenes Mutian Xu (The Chinese University of Hong Kong (Shenzhen))*; Pei Chen (the Chinese University of Hong Kong (Shenzhen)); Haolin Liu (The Chinese University of Hong Kong, Shenzhen); Xiaoguang Han (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen))
1817 PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark Li Chen (Shanghai AI Laboratory)*; Chonghao Sima (Purdue University); Yang Li (SenseTime); Zehan Zheng (Shanghai AI Laboratory); Jiajie Xu (Carnegie Mellon University); Xiangwei Geng (SenseTime); Hongyang Li (SenseTime); Conghui He (Shanghai AI Lab); Jianping Shi (Sensetime Group Limited); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Junchi Yan (Shanghai Jiao Tong University)
1958 Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting Chuhui Xue (Nanyang Technological University); Wenqing Zhang (ByteDance); Yu Hao (Bytedance Inc.); Shijian Lu (Nanyang Technological University); Philip Torr (University of Oxford); Song Bai (University of Oxford)*
2021 Adaptive Patch Exiting for Scalable Single Image Super-Resolution Shizun Wang (Beijing University of Posts and Telecommunications)*; Jiaming Liu (Peking University); Kaixin Chen (Beijing University of Posts and Telecommunications); Xiaoqi Li (Columbia University in the City of New York); Ming Lu (Intel Labs China); Yandong Guo (OPPO Research Institute)
2153 Perceptual Artifacts Localization for Inpainting Lingzhi Zhang (University of Pennsylvania)*; Yuqian Zhou (Adobe); Connelly Barnes (Adobe); Zhe Lin (Adobe Research); Eli Shechtman (Adobe Research, US); Sohrab Amirghodsi (Adobe Research); Jianbo Shi (University of Pennsylvania)
2179 Adversarially-Aware Robust Object Detector ZiYi Dong (Sun Yat-Sen University)*; Pengxu Wei (Sun Yat-sen University); Liang Lin (Sun Yat-sen University)
2282 RFNet-4D: Joint Object Reconstruction and Flow Estimation from 4D Point Clouds Tuan-Anh Vu (The Hong Kong University of Science and Technology)*; Thanh Nguyen (Deakin University, Australia); Binh-Son Hua (VinAI Research); Quang Hieu Pham (Woven Planet North America); Sai-Kit Yeung (Hong Kong University of Science and Technology)
2290 Generalizable Patch-Based Neural Rendering Mohammed Suhail (University of British Columbia)*; Carlos Esteves (Google Research); Leonid Sigal (University of British Columbia); Ameesh Makadia (Google Research)
2385 A Perturbation-Constrained Adversarial Attack for Evaluating the Robustness of Optical Flow Jenny Schmalfuss (University of Stuttgart)*; Philipp Scholze (University of Stuttgart); Andrés Bruhn (University of Stuttgart)
2526 Contrastive Monotonic Pixel-Level Modulation Kun Lu (Zhejiang University)*; Rongpeng Li (Zhejiang University); Honggang Zhang (Zhejiang University)
2623 Social-SSL: Self-Supervised Cross-Sequence Representation Learning Based on Transformers for Multi-Agent Trajectory Prediction Li-Wu Tsao (National Chiao Tung University)*; Yan-Kai Wang (National Chiao Tung University); Hao-Siang Lin (National Chiao Tung University); Hong-Han Shuai (National Yang Ming Chiao Tung University); Lai-Kuan Wong (Multimedia University); Wen-Huang Cheng (National Chiao Tung University)
2657 SpOT: Spatiotemporal Modeling for 3D Object Tracking Colton Stearns (Stanford University)*; Davis Rempe (Stanford University); Jie Li (Toyota Research Institute); Rareș A Ambruș (Toyota Research Institute); Sergey Zakharov (Toyota Research Institute); Vitor Guizilini (Toyota Research Institute); Yanchao Yang (Stanford University); Leonidas Guibas (Stanford University)
2688 Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition Xudong Xie (Huazhong University of Science and Technology)*; LING FU (Huazhong University of Science and Technology); Zhifei Zhang (Adobe Research); Zhaowen Wang (Adobe Research); Xiang Bai (Huazhong University of Science and Technology)
2691 Monocular 3D Object Detection with Depth from Motion Tai Wang (The Chinese University of Hong Kong)*; Jiangmiao Pang (CUHK); Dahua Lin (The Chinese University of Hong Kong)
2723 Fine-Grained Scene Graph Generation with Data Transfer Ao Zhang (National University of Singapore)*; Yuan Yao (Tsinghua University); Qianyu Chen (Tsinghua University); Wei Ji (National University of Singapore); Zhiyuan Liu (Tsinghua University); Maosong Sun (Tsinghua University); Tat-Seng Chua (National University of Singapore)
2753 Balancing Stability and Plasticity through Advanced Null Space in Continual Learning Yajing Kong (The University of Sydney)*; Liu Liu (The University of Sydney); Zhen Wang (The University of Sydney); Dacheng Tao (JD.com)
2808 OccamNets: Mitigating Dataset Bias by Favoring Simpler Hypotheses Robik S Shrestha (Rochester Institute of Technology)*; Kushal Kafle (Adobe Research); Christopher Kanan (University of Rochester)
2827 DisCo: Remedying Self-supervised Learning on Lightweight Models with Distilled Contrastive Learning Yuting Gao (Tencent)*; Jia-Xin Zhuang (Sun Yat-sen University); Shaohui Lin (East China Normal University); Hao Cheng (Tencent); Xing Sun (Shopee); Ke Li (Tencent); Chunhua Shen (University of Adelaide, Australia)
2874 Diverse Human Motion Prediction Guided by Multi-Level Spatial-Temporal Anchors Sirui Xu (University of Illinois Urbana-Champaign)*; Yu-Xiong Wang (University of Illinois at Urbana-Champaign); Liangyan Gui (University of Illinois Urbana-Champaign)
2911 InfiniteNature-Zero: Learning Perpetual View Generation of Natural Scenes from Single Images Zhengqi Li (Google Inc.)*; Qianqian Wang (Cornell); Noah Snavely (Google); Angjoo Kanazawa (University of California Berkeley)
3007 CT^2: Colorization Transformer via Color Tokens Shuchen Weng (Peking University)*; Jimeng Sun (Beijing University of Posts and Telecommunications); Yu Li (International Digital Economy Academy); Si Li (Beijing University of Posts and Telecommunications); Boxin Shi (Peking University)
3086 PCW-Net: Pyramid Combination and Warping Cost Volume for Stereo Matching Zhelun Shen (Baidu Research)*; Yuchao Dai (Northwestern Polytechnical University); Xibin Song (Baidu); ZhiBo Rao (Northwestern Polytechnical University); Dingfu Zhou (Baidu); Liangjun Zhang (Baidu Research Institute)
3181 Discovering Transferable Forensic Features for CNN-generated Images Detection Keshigeyan Chandrasegaran (Singapore University of Technology and Design)*; Ngoc-Trung Tran (Singapore University of Technology and Design); Alexander Binder (University of Oslo); Ngai-Man Cheung (Singapore University of Technology and Design)
3187 Domain Adaptive Person Search Junjie Li (Shanghai Jiao Tong University); Yichao Yan (Shanghai Jiao Tong University)*; Guanshuo Wang (Tencent Youtu Lab); Fufu Yu (Tencent Youtu); Qiong Jia (Tencent Youtu Lab); Shouhong Ding (Tencent)
3228 Text2LIVE: Text-Driven Layered Image and Video Editing Omer Bar Tal (Weizmann Institute of Science)*; Dolev Ofri-Amar (Weizmann Institute of Science); Rafail Fridman (Weizmann Institute of Science); Yoni Kasten (Weizmann Institute); Tali Dekel (Weizmann Institute of Science)
3239 Event-Based Fusion for Motion Deblurring with Cross-modal Attention Lei Sun (Zhejiang University); Christos Sakaridis (ETH Zurich); Jingyun Liang (ETH Zurich); Qi Jiang (Zhejiang University); Kailun Yang (Karlsruhe Institute of Technology); Peng Sun (Zhejiang University); Yaozu Ye (State Key Laboratory of Modern Optical Instrumentation, Zhejiang University); Kaiwei Wang (State Key Laboratory of Modern Optical Instrumentation, Zhejiang University)*; Luc Van Gool (ETH Zurich)
3311 AutoMix: Unveiling the Power of Mixup Zicheng Liu (Westlake University)*; Siyuan Li (Westlake University); Di Wu (Westlake University); Zihan Liu (Westlake University); Zhiyuan Chen (Shanghai AI Lab); Lirong Wu (Westlake University); Stan Z. Li (Westlake University)
3332 Synergistic Self-Supervised and Quantization Learning Yunhao Cao (Nanjing University)*; Peiqin Sun (MEGVII Technology); Yechang Huang (MEGVII Technology); Jianxin Wu (Nanjing University); Shuchang Zhou (MEGVII Technology)
3586 Auto-regressive Image Synthesis with Integrated Quantization Fangneng Zhan (Max Planck Institute for Informatics); Yingchen Yu (Nanyang Technological University); Rongliang WU (Nanyang Technological University); Jiahui Zhang (Nanyang Technological University); Kaiwen Cui (Nanyang Technological University); Changgong Zhang (Amazon); Shijian Lu (Nanyang Technological University)*
3601 Event-guided Deblurring of Unknown Exposure Time Videos Taewoo Kim (KAIST)*; Jeongmin Lee (KAIST); Lin Wang (HKUST); Kuk-Jin Yoon (KAIST)
3622 Learning Disentanglement with Decoupled Labels for Vision-Language Navigation Wenhao Cheng (Beijing Institute of Technology); Xingping Dong (Inception Institute of Artificial Intelligence); Salman Khan (MBZUAI/ANU); Jianbing Shen (Inception Institute of Artificial Intelligence)*
3631 3D CoMPaT: Composition of Materials on Parts of 3D Things Yuchen Li (King Abdullah University of Science and Technology (KAUST)); Ujjwal Upadhyay (KAUST); Habib Slim (KAUST); Tezuesh Varshney (KAUST); Ahmed Abdelreheem (KAUST); Arpit Prajapati (Poly9); Suhail S Pothigara (Poly9 Inc); Peter Wonka (KAUST); Mohamed Elhoseiny (KAUST)*
3673 Exploring Gradient-based Multi-directional Controls in GANs Zikun Chen (ModiFace Inc.)*; Ruowei Jiang (ModiFace Inc.); Brendan Duke (ModiFace Inc.); Han Zhao (University of Illinois at Urbana-Champaign); Parham Aarabi (ModiFace Inc.)
3727 OPD: Single-view 3D Openable Part Detection Hanxiao Jiang (Simon Fraser University)*; Yongsen Mao (Simon Fraser University); Manolis Savva (Simon Fraser University); Angel X Chang (SFU)
3757 Unpaired Image Translation via Vector Symbolic Architectures Justin Theiss (University of California, Berkeley)*; Jay Leverett (Meta); Daeil Kim (Meta); Aayush Prakash (Meta)
3887 CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer Zijie Wu (Huazhong University of Science and Technology)*; Zhen Zhu (University of Illinois at Urbana-Champaign); Junping Du (Beijing University of Posts and Telecommunications); Xiang Bai (Huazhong University of Science and Technology)
4028 Decoupled Adversarial Contrastive Learning for Self-supervised Adversarial Robustness Chaoning Zhang (KAIST)*; Kang Zhang (KAIST); Chenshuang Zhang (KAIST); Axi Niu (Northwestern Polytechnical University); Jiu Feng (Sichuan University); Chang D. Yoo (KAIST); In So Kweon (KAIST)
4067 Secrets of Event-Based Optical Flow Shintaro Shiba (Keio University)*; Yoshimitsu Aoki (Keio University); Guillermo Gallego (TU Berlin)
4122 Synthesizing Light Field Video from Monocular Video Shrisudhan Govindarajan (Indian Institute of Technology Madras); Prasan A Shedligeri (Indian Institute of Technology Madras)*; Sarah Sarah (Indian Institute of Technology, Madras); Kaushik Mitra (IIT Madras)
4350 LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds Minghua Liu (UCSD)*; Yin Zhou (Waymo); Charles R. Qi (Waymo); Boqing Gong (Google); Hao Su (UCSD); Dragomir Anguelov (Waymo)
4399 3D-Aware Indoor Scene Synthesis with Depth Priors Zifan SHI (HKUST)*; Yujun Shen (Dept. of IE, CUHK); Jiapeng Zhu (HKUST); Dit-Yan Yeung (HKUST); Qifeng Chen (HKUST)
4417 Restore Globally, Refine Locally: A Mask-Guided Scheme to Accelerate Super-Resolution Networks Xiaotao Hu (Nankai University); Jun Xu (Nankai University)*; Shuhang Gu (ETH Zurich, Switzerland); Ming-Ming Cheng (Nankai University); Li Liu (Inception Institute of Artificial Intelligence)
4507 Modeling Mask Uncertainty in Hyperspectral Image Reconstruction Jiamian Wang (Santa Clara University)*; Yulun Zhang (ETH Zurich); Xin Yuan (Westlake University); Ziyi Meng (Kuaishou Technology); Zhiqiang Tao (Santa Clara University)
4508 Perceiving and Modeling Density for Image Dehazing Tian Ye (Jimei University)*; Yunchen Zhang (China Design Group Ltd.Co); Erkang Chen (Jimei University); MingChao Jiang (JOYY.INC); Yun Liu (Southwest University); Liang Chen (Fujian Normal University); Sixiang Chen (JiMei University)
4514 ROBIN: A Benchmark for Robustness to Individual Nuisances in Real-World Out-of-Distribution Shifts Bingchen Zhao (University of Edinburgh)*; Shaozuo Yu (Tongji University); Wufei Ma (Purdue University); Mingxin Yu (Peking University); Shenxiao Mei (Johns Hopkins University); Angtian Wang (Johns Hopkins University); Ju He (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Adam Kortylewski (Max Planck Institute for Informatics)
4539 Delving into Details: Synopsis-to-Detail Networks for Video Recognition Shuxian Liang (Zhejiang University)*; Xu Shen (Alibaba Group); Jianqiang Huang (Alibaba Group); Xian-Sheng Hua (Alibaba Group)
4547 Bringing Rolling Shutter Images Alive with Dual Reversed Distortion Zhihang Zhong (The University of Tokyo); Mingdeng Cao (Tsinghua University); Xiao Sun (Microsoft Research Asia); Zhirong Wu (Microsoft Research); Zhongyi Zhou (The University of Tokyo); Yinqiang Zheng (The University of Tokyo)*; Stephen Lin (Microsoft Research); Imari Sato (National Institute of Informatics)
4591 SimCC: a Simple Coordinate Classification Perspective for Human Pose Estimation Yanjie Li (Tsinghua University)*; Sen Yang (Southeast University); Peidong Liu (Tsinghua University); Shoukui Zhang (Meituan); Yunxiao Wang (Tsinghua University); Zhicheng Wang (Nreal); Wankou Yang (Southeast University); Shu-Tao Xia (Tsinghua University)
4610 Generative Multiplane Images: Making a 2D GAN 3D-Aware Xiaoming Zhao (University of Illinois at Urbana-Champaign)*; Fangchang Ma (Apple Inc.); David Güera (Apple Inc.); Zhile Ren (Apple Inc.); Alexander Schwing (UIUC); Alex Colburn (Apple Inc.)
4640 Self-supervised Social Relation Representation for Human Group Detection Jiacheng Li (College of Intelligence and Computing, Tianjin University); Ruize Han (College of Intelligence and Computing, Tianjin University)*; Haomin Yan (Tianjin University); Zekun Qian (College of Intelligence and Computing, Tianjin University); Wei Feng (College of Intelligence and Computing, Tianjin University, China); Song Wang (University of South Carolina)
4651 Stripformer: Strip Transformer for Fast Image Deblurring Fu-Jen Tsai (National Tsing Hua University)*; Yan-Tsung Peng (National Chengchi University); Yen-Yu Lin (National Yang Ming Chiao Tung University); Chung-Chi Tsai (Qualcomm Technology); Chia-Wen Lin (National Tsing Hua University)
4678 Deep Fourier-based Exposure Correction Network with Spatial-Frequency Interaction Jie Huang (University of Science and Technology of China); Yajing Liu (USTC); Feng Zhao (University of Science and Technology of China)*; Keyu Yan (University of Science and Technology of China); Jinghao Zhang (University of Science and Technology of China); Yukun Huang (University of Science and Technology of China); Man Zhou (University of Science and Technology of China); Zhiwei Xiong (University of Science and Technology of China)
4720 Organic Priors in Non-Rigid Structure from Motion Suryansh Kumar (ETH Zurich)*; Luc Van Gool (ETH Zurich)
4806 TEMOS: Generating diverse human motions from textual descriptions Mathis Petrovich (Ecole des Ponts)*; Michael Black (Max Planck Institute for Intelligent Systems); Gul Varol (Ecole des Ponts ParisTech)
4824 Semantic-Aware Fine-Grained Correspondence Yingdong Hu (Tsinghua University); Renhao Wang (Tsinghua University); Kaifeng Zhang (Tsinghua University); Yang Gao (Tsinghua University)*
4847 Layered Controllable Video Generation Jiahui Huang (University of British Columbia)*; Yuhe Jin (University of British Columbia); Kwang Moo Yi (University of British Columbia); Leonid Sigal (University of British Columbia)
4861 GraphVid: It Only Takes a Few Nodes to Understand a Video Eitan Kosman (Bosch AI)*; Dotan Di Castro (Bosch)
4878 Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection Yu Hong (Zhejiang University); Hang Dai (Mohamed bin Zayed University of Artificial Intelligence)*; Yong Ding (Zhejiang University)
4901 Adaptive Token Sampling For Efficient Vision Transformers Mohsen Fayyaz (Microsoft)*; Soroush Abbasi Koohpayegani (University of Maryland Baltimore County); Farnoush Rezaei Jafari (Technische Universität Berlin); Sunando Sengupta (Microsoft); HAMID VAEZI JOZE (Microsoft); Eric Sommerlade (Microsoft); Hamed Pirsiavash (University of California Davis); Jürgen Gall (University of Bonn)
4910 Implicit Field Supervision For Robust Non-Rigid Shape Matching Ramana S Sundararaman (Ecole Polytechnique)*; Gautam Pai (École Polytechnique); Maks Ovsjanikov (Ecole polytechnique)
4916 NeuMesh: Learning Disentangled Neural Mesh-based Implicit Field for Geometry and Texture Editing Bangbang Yang (Zhejiang University); Chong Bao (Zhejiang University); Junyi Zeng (Zhejiang University); Hujun Bao (Zhejiang University); Yinda Zhang (Google); Zhaopeng Cui (Zhejiang University); Guofeng Zhang (Zhejiang University)*
4919 KXNet: A Model-Driven Deep Neural Network for Blind Super-Resolution Jiahong Fu (Xi’an Jiaotong University)*; Hong Wang (Jarvis Lab, Tencent); Qi Xie (Xi’an Jiaotong University); Qian Zhao (Xi’an Jiaotong University); Deyu Meng (Xi’an Jiaotong University); Zongben Xu (Xi’an Jiaotong University)
4989 RealFlow: EM-based Realistic Optical Flow Datasets Generation from Videos Yunhui Han (THU;Megvii); Kunming Luo (Megvii); Ao Luo (Megvii); Jiangyu Liu (megvii inc); Haoqiang Fan (Megvii Inc(face++)); Guiming Luo (School of Software, Tsinghua University); Shuaicheng Liu (UESTC; Megvii)*
5010 Semi-supervised Object Detection via Virtual Category Learning Changrui Chen (University of Warwick); Kurt Debattista (University of Warwick, UK); Jungong Han (Aberystwyth University)*
5080 PrivHAR: Recognizing Human Actions From Privacy-preserving Lens Carlos Hinojosa (Universidad Industrial de Santander)*; Miguel A Marquez (UIS Colombia); Henry Arguello (Universidad Industrial Santander); Ehsan Adeli (Stanford University); Li Fei-Fei (Stanford University); Juan Carlos Niebles (Salesforce & Stanford University)
5096 Solution Space Analysis of Essential Matrix based on Algebraic Error Minimization Gaku Nakano (NEC Corporation)*
5100 EvAC3D: From Event-based Apparent Contours to 3D Models via Continuous Visual Hulls Ziyun Wang (University of Pennsylvania)*; Kenneth Chaney (University of Pennsylvania); Kostas Daniilidis (University of Pennsylvania)
5142 DCCF: Deep Comprehensible Color Filter Learning Framework for High-Resolution Image Harmonization Ben Xue (Peking University); Shenghui Ran (Alibaba Group); Quan Chen (Alibaba Group)*; Rongfei Jia (Alibaba Group); Binqiang Zhao (Alibaba); Xing Tang (Alibaba Group)
5226 UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling Zhengyuan Yang (Microsoft)*; Zhe Gan (Microsoft); Jianfeng Wang (Microsoft); Xiaowei Hu (Microsoft); Faisal Ahmed (Microsoft); Zicheng Liu (Microsoft); Yumao Lu (Microsoft); Lijuan Wang (Microsoft)
5242 Grasp’D: Differentiable Contact-rich Grasp Synthesis for Multi-fingered Hands Dylan Turpin (University of Toronto)*; Liquan Wang (University of Toronto); Eric Heiden (University of Southern California); Yun-Chun Chen (University of Toronto ); Miles Macklin (NVIDIA); Stavros Tsogkas (University of Toronto); Sven Dickinson (University of Toronto); Animesh Garg (University of Toronto, Vector Institute, Nvidia)
5263 The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning Jack Hessel (Allen Institute for AI)*; Jena D Hwang (Allen Institute for AI); Jae Sung Park (University of Washington); Rowan Zellers (University of Washington); Chandra Bhagavatula (AllenAI); Anna Rohrbach (UC Berkeley); Kate Saenko (Boston University); Yejin Choi (University of Washington)
5271 Cross-Modal Knowledge Transfer Without Task-Relevant Source Data SK MIRAJ AHMED (University of California Riverside); Suhas Lohit (Mitsubishi Electric Research Laboratories)*; Kuan-Chuan Peng (Mitsubishi Electric Research Laboratories (MERL)); Michael J Jones (MERL); Amit K. Roy-Chowdhury (University of California, Riverside)
5285 Approximate Differentiable Rendering with Algebraic Surfaces Leonid Keselman (Carnegie Mellon University)*; Martial Hebert (Carnegie Mellon School of Computer Science)
5303 Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments Jacob Krantz (Oregon State University)*; Stefan Lee (Oregon State University)
5350 Uncertainty-DTW for Time Series and Sequences Lei Wang (The Australian National University); Piotr Koniusz (ANU College of Engineering and Computer Science)*
5358 Affine Correspondences between Multi-Camera Systems for 6DOF Relative Pose Estimation Banglei Guan (National University of Defense Technology)*; Ji Zhao (Huazhong University of Science and Technology)
5415 Improving Self-supervised Lightweight Model Learning via Hard-aware Metric Distillation Hao Liu (Beijing Institute of Technology); Mang Ye (Wuhan University)*
5422 NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion Chenfei Wu (Microsoft)*; Jian Liang (Peking University); Lei Ji (Microsoft); Fan Yang (MSRA); Yuejian Fang (Peking University); Daxin Jiang (Microsoft, Beijing, China); Nan Duan (Microsoft Research)
5512 BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation Ye Yu (Microsoft)*; Jialin Yuan (Oregon State University); Gaurav Mittal (Microsoft); Li Fuxin (Oregon State University); Mei Chen (Microsoft)
5622 DiffuStereo: High Quality Human Reconstruction via Diffusion-based Stereo Using Sparse Cameras Ruizhi Shao (Tsinghua University); Zerong Zheng (Tsinghua University); Hongwen Zhang (Tsinghua University); Jingxiang Sun (University of Illinois Urbana-Champaign); Yebin Liu (Tsinghua University)*
5667 The Challenges of Continuous Self-Supervised Learning Senthil Purushwalkam (Carnegie Mellon University); Pedro Morgado (CMU)*; Abhinav Gupta (CMU/FAIR)
5670 Deep Radial Embedding for Visual Sequence Learning Yuecong Min (Institute of Computing Technology, Chinese Academy of Sciences); Peiqi Jiao (Institute of Computing Technology, Chinese Academy of Sciences); Yanan Li (Xiaomi); Wang Xiaotao (Xiaomi); LEI LEI (Xiaomi); Xiujuan Chai (Agricultural Information Institute, Chinese); Xilin Chen (Institute of Computing Technology, Chinese Academy of Sciences)*
5713 Shape-Pose Disentanglement using SE(3)-equivariant Vector Neurons Oren Katzir (Tel Aviv University)*; Dani Lischinski (The Hebrew University of Jerusalem); Danny Cohen-Or (Tel Aviv University)
5763 3D Object Detection with a Self-supervised Lidar Scene Flow Backbone Emeç Erçelik (Technical University of Munich)*; Ekim Yurtsever (The Ohio State University); Mingyu Liu (TUM); Zhijie Yang (Technical University of Munich); Hanzhen Zhang (TUM); Pınar Topçam (Technical University of Munich); Maximilian Listl (Technical University of Munich); Yılmaz Kaan Çaylı (Technical University of Munich); Alois C. Knoll (Robotics and Embedded Systems)
5991 FH-Net: A Fast Hierarchical Network for Scene Flow Estimation on Real-world Point Clouds lihe Ding (Beijing Institute of Technology)*; Shaocong Dong (Beijing Institute of Technology); Tingfa Xu (Beijing Institute of Technology); xinli Xu (Beijing Institute of Technology); Jie Wang (Beijing Institute of Technology); Jianan Li (Beijing Institute of Technology)
6108 Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial Keypoint Voting Yangzheng Wu (Queen’s University)*; Mohsen Zand (Queen’s University); Ali Etemad (Queen’s University); Michael Alan Greenspan (Queen’s University)
6132 Flow graph to Video Grounding for Weakly-supervised Multi-Step Localization NIKITA DVORNIK (Samsung)*; Isma Hadji (Samsung AI Center – Toronto); Hai X Pham (Samsung AI Center); Dhaivat Bhatt (Samsung); Brais Martinez (Samsung AI Center); Afsaneh Fazly (SAIC Toronto); Allan D Jepson (Samsung Toronto AIC)
6143 Neural Radiance Transfer Fields for Relightable Novel-view Synthesis with Global Illumination Linjie Lyu (MPII)*; Ayush Tewari (MIT); Thomas Leimkuehler (MPI Informatik); Marc Habermann (Max Planck Institute for Informatics); Christian Theobalt (MPI Informatik)
6180 Learning Topological Interactions for Multi-Class Medical Image Segmentation Saumya Gupta (Stony Brook University)*; Xiaoling Hu (Stony Brook University); James Kaan (Stony Brook University); Michael Jin (Stony Brook University Hospital); Mutshipay Christian Mpoy (SUNY Stony Brook Medicine); Katherine Chung (Stony Brook University Hospital); Gagandeep Singh (RWJBarnabas Health); Mary Saltz (Stony Brook); Tahsin Kurc (Stony Brook University); Joel Saltz (Stony Brook University); APOSTOLOS K TASSIOPOULOS (Stony Brook University); Prateek Prasanna (Stony Brook University); Chao Chen (Stony Brook University)
6185 Look Both Ways: Self-Supervising Driver Gaze Estimation and Road Scene Saliency Isaac H Kasahara (University of Minnesota); Simon Stent (Toyota Research Institute); Hyun Soo Park (The University of Minnesota)*
6191 ObjectBox: From Centers to Boxes for Anchor-Free Object Detection Mohsen Zand (Queen’s University)*; Ali Etemad (Queen’s University); Michael Alan Greenspan (Queen’s University)
6193 Unsupervised Segmentation in Real-World Images via Spelke Object Inference Honglin Chen (Stanford University); Rahul M V (Stanford University); Yoni I Friedman (MIT); Jiajun Wu (Stanford University); Joshua Tenenbaum (MIT); Daniel Yamins (Stanford University); Daniel Bear (Stanford University)*
6243 A Dense Material Segmentation Dataset for Indoor and Outdoor Scene Parsing Paul Upchurch (Apple)*; Ransen Niu (Apple)
6295 Pixel-wise Energy-biased Abstention Learning for Anomaly Segmentation on Complex Urban Driving Scenes Yu Tian (Australian Institute for Machine Learning, University of Adelaide ); Yuyuan Liu (University of Adelaide); Guansong Pang (Singapore Management University)*; Fengbei Liu (University of Adelaide); Yuanhong Chen (University of Adelaide); Gustavo Carneiro (University of Adelaide)
6326 Identifying Hard Noise in Long-Tailed Sample Distribution Xuanyu Yi (Nanyang Technological University)*; Kaihua Tang (Nanyang Technological University); Xian-Sheng Hua (Damo Academy, Alibaba Group); Joo-Hwee Lim (Institute for Infocomm Research); Hanwang Zhang (Nanyang Technological University)
6515 PressureVision: Estimating Hand Pressure from a Single RGB Image Patrick L Grady (Georgia Institute of Technology)*; Chengcheng Tang (Facebook Reality Labs); Samarth Brahmbhatt (Intel); Christopher D Twigg (Meta); Chengde Wan (Facebook Reality Lab); James Hays (Georgia Institute of Technology, USA); Charlie Kemp (Georgia Institute of Technology)
6568 PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks Nan Ding (Google)*; Xi Chen (Google Research); Tomer Levinboim (Google); Soravit Changpinyo (Google Research); Radu Soricut (Google)
6571 Beyond Periodicity: Towards a Unifying Framework for Activations in Coordinate-MLPs Sameera Ramasinghe (University of Adelaide)*; Simon Lucey (University of Adelaide)
6672 Pose for Everything: Towards Category-Agnostic Pose Estimation Lumin XU (The Chinese University of Hong Kong)*; Sheng Jin (The University of Hong Kong); Wang ZENG (The Chinese University of Hong Kong); Wentao Liu (Sensetime); Chen Qian (SenseTime); Wanli Ouyang (The University of Sydney); Ping Luo (The University of Hong Kong); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong)
6739 UIA-ViT: Unsupervised Inconsistency-Aware Method based on Vision Transformer for Face Forgery Detection Wanyi Zhuang (University of Science and Technology of China); Qi Chu (University of Science and Technology of China)*; Zhentao Tan (University of Science and Technology of China); Qiankun Liu (University of Science and Technology of China); Haojie Yuan (University of Science and Technology of China); Changtao Miao (University of Science and Technology of China); Zixiang Luo (University of Science and Technology of China); Nenghai Yu (University of Science and Technology of China)
7092 PREF: Predictability Regularized Neural Motion Fields Liangchen Song (University at Buffalo)*; Xuan Gong (University at Buffalo); Benjamin Planche (United Imaging Intelligence); Meng Zheng (United Imaging Intelligence); David Doermann (University at Buffalo); Junsong Yuan (State University of New York at Buffalo, USA); Terrence Chen (United Imaging Intelligence); Ziyan Wu (United Imaging Intelligence)
7215 Bi-PointFlowNet: Bidirectional Learning for Point Cloud Based Scene Flow Estimation WENCAN CHENG (Sungkyunkwan University); Jong Hwan Ko (Sungkyunkwan University)*
7248 Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration Aditi Basu Bal (Florida State University)*; Ramy A Mounir (University of South Florida); Sathyanarayanan N Aakur (OK State); Sudeep Sarkar (University of South Florida, Tampa); Anuj Srivastava (Florida State University)
7302 Semidefinite Relaxations of Truncated Least-Squares in Robust Rotation Search: Tight or Not Liangzu Peng (Johns Hopkins University)*; Mahyar Fazlyab (Johns Hopkins University); Rene Vidal (Johns Hopkins University, USA)
7345 Lottery Ticket Hypothesis for Spiking Neural Networks Youngeun Kim (Yale University)*; Yuhang Li (Yale University); Hyoungseob Park (Yale University); Yeshwanth Venkatesha (Yale university); Ruokai Yin (Yale University); Priyadarshini Panda (Yale University)
7360 Multi-domain Learning for Updating Face Anti-spoofing Models Xiao Guo (Michigan State University)*; Yaojie Liu (Google Research); Anil Jain (Michigan State University); Xiaoming Liu (Michigan State University)
7402 Towards Realistic Semi-Supervised Learning Mamshad Nayeem Rizve (University of Central Florida)*; Navid Kardan (University of Central Florida); Mubarak Shah (University of Central Florida)
7414 Unsupervised Pose-aware Part Decomposition for Man-made Articulated Objects Yuki Kawana (The University of Tokyo)*; Yusuke Mukuta (The University of Tokyo); Tatsuya Harada (The University of Tokyo / RIKEN)
7464 Cartoon Explanations of Image Classifiers Stefan Kolek (LMU)*; Duc Anh Nguyen (LMU Munich); Ron Levie (Technion); Joan Bruna (Courant Institute of Mathematical Sciences, NYU, USA); Gitta Kutyniok (Ludwig Maximilian University of Munich)
7808 RRSR: Reciprocal Reference-based Image Super-Resolution with Progressive Feature Alignment and Selection Lin Zhang (CASIA); Xin Li (Baidu); Dongliang He (Baidu)*; Fu Li (Baidu); Yili Wang (Tsinghua University); Zhaoxiang Zhang (Chinese Academy of Sciences, China)
7838 Gaussian Activated Neural Radiance Fields for High Fidelity Reconstruction & Pose Estimation Shin-Fang Chng (The University of Adelaide)*; Sameera Ramasinghe (University of Adelaide); Jamie Sherrah (AIML); Simon Lucey (University of Adelaide)
7886 Unbiased Gradient Estimation for Differentiable Surface Splatting via Poisson Sampling Jan U. Müller (University of Bonn)*; Michael Weinmann (TU Delft); Reinhard Klein (University of Bonn)
8098 “This is my unicorn, Fluffy”: Personalizing frozen vision-language representations Niv Cohen (The Hebrew University of Jerusalem)*; Rinon Gal (Tel Aviv University); Eli Meirom (NVIDIA Research); Gal Chechik (NVIDIA); Yuval Atzmon (NVIDIA Research)

Poster papers

Paper ID Paper Title Authors
8 Learning Uncoupled-Modulation CVAE for 3D Action-Conditioned Human Motion Synthesis Chongyang Zhong (Institute of Computing Technology, Chinese Academy of Sciences)*; Lei Hu (Institute of Computing Technology, Chinese Academy of Sciences ); Zihao Zhang (Institute of Computing Technology, Chinese Academy of Sciences); Shihong Xia (institute of computing technology of the Chinese academy of sciences)
16 Generative Domain Adaptation for Face Anti-Spoofing Qianyu Zhou (Shanghai Jiao Tong University)*; Ke-Yue Zhang (YouTu Lab, Tencent); Taiping Yao (Tencent YouTu); Ran Yi (Shanghai Jiao Tong University); Kekai Sheng (Youtu Lab, Tencent Inc.); Shouhong Ding (Tencent); Lizhuang Ma (Shanghai Jiao Tong University)
19 Learning Depth from Focus in the Wild Changyeon Won (GIST)*; Hae-Gon Jeon (GIST)
34 Relighting4D: Neural Relightable Human from Videos Zhaoxi Chen (Nanyang Technological University )*; Ziwei Liu (Nanyang Technological University)
46 PPT: token-Pruned Pose Transformer for monocular and multi-view human pose estimation Haoyu Ma (University of California, Irvine)*; Zhe Wang (UC-Irvine); Yifei Chen (Tencent); Deying Kong (university of california, irvine); Liangjian Chen (Reality Labs); Xingwei Liu (University of California Irvine); Xiangyi Yan (University of California, Irvine); Hao Tang (University of California Irvine); Xiaohui Xie (University of California, Irvine)
52 Understanding the Dynamics of DNNs Using Graph Modularity Yao Lu (Zhejiang University of Technology)*; Wen Yang (Zhejiang University of Technology); Yunzhe Zhang (Zhejiang University of Technology); Zuohui Chen (Zhejiang University of Technology); Jinyin Chen (Zhejiang University of Technology); Qi Xuan (Zhejiang University of Technology); Zhen Wang (Northwestern Polytechnical University); Xiaoniu Yang (Zhejiang University of Technology; Science and Technology on Communication Information Security Control Laboratory)
65 Discriminability-Transferability Trade-Off: An Information-Theoretic Perspective Quan Cui (Waseda University)*; Bingchen Zhao (University of Edinburgh); Zhao-Min Chen (NanJing University); Borui Zhao (Megvii Technology); Renjie Song (Megvii Inc.); Boyan Zhou (ByteDance); Jiajun Liang (Megvii); Osamu Yoshie (Waseda University)
69 Learning-based Point Cloud Registration for 6D Object Pose Estimation in the Real World Zheng Dang (EPFL)*; Lizhou Wang (Xi’an Jiaotong University); Yu Guo (School of Software Engineering, Xi’an Jiaotong University); Mathieu Salzmann (EPFL)
74 AvatarPoser: Articulated Full-Body Pose Tracking from Sparse Motion Sensing Jiaxi Jiang (ETH Zurich)*; Paul Streli (ETH Zurich); Huajian Qiu (EPFL); Andreas R Fender (ETH Zurich); Larissa Laich (Facebook Reality Labs); Patrick Snape (Meta); Christian Holz (ETH Zürich)
75 Knowledge Condensation Distillation chenxin li (Xiamen University)*; Mingbao Lin (Xiamen University, China); Zhiyuan Ding (Xiamen University); Nie Lin (Hunan University); Yihong Zhuang (Xiamen University); Yue Huang (Xiamen University); Xinghao Ding (Xiamen University); Liujuan Cao (Xiamen University)
83 CAR: Class-aware Regularizations for Semantic Segmentation Ye Huang (University of Technology Sydney)*; Di Kang (Tencent); Liang Chen (Fujian Normal University); Xuefei Zhe (Tencent AI lab); Wenjing Jia (University of Technology Sydney); Linchao Bao (Tencent AI Lab); Xiangjian He (University of Nottingham Ningbo China)
86 Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation Yuyang Zhao (National University of Singapore)*; Zhun Zhong (University of Trento); Na Zhao (NUS); Nicu Sebe (University of Trento); Gim Hee Lee (National University of Singapore)
88 Reducing Information Loss for Spiking Neural Networks Yufei Guo (The Second Academy of China Aerospace Science and Industry Corporation)*; Yuanpei Chen (X Lab, The Second Academy of CASIC, Beijing); Liwen Zhang (X Lab, the Second Academy of CASIC, Beijing); YingLei Wang (CASIC); Xiaode Liu (X Lab, The Second Academy of China Aerospace Science and Industry Corporation); Xinyi Tong (The Second Academy of China Aerospace Science and Industry Corporation); Yuanyuan Ou (Chongqing University); Xuhui Huang (X Lab, The Second Academy of CASIC); Zhe Ma (Xlab, the Second Academy of CASIC, Beijing)
95 Real-Time Intermediate Flow Estimation for Video Frame Interpolation Zhewei Huang (MEGVII)*; Tianyuan Zhang (Carnegie Mellon University); Wen Heng (Megvii inc.); Boxin Shi (Peking University); Shuchang Zhou (MEGVII Technology)
101 Class-incremental Novel Class Discovery Subhankar Roy (University of Trento); Mingxuan Liu (University of Trento); Zhun Zhong (University of Trento)*; Nicu Sebe (University of Trento); Elisa Ricci (University of Trento)
103 PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation Jing He (Xiamen university)*; Yiyi Zhou (Xiamen University); Qi Zhang (Tencent); Jun Peng (Xiamen University); Yunhang Shen (Xiamen University); Xiaoshuai Sun (Xiamen University); Chao Chen (Youtu Laboratory); Rongrong Ji (Xiamen University, China)
107 Minimal Neural Atlas: Parameterizing Complex Surfaces with Minimal Charts and Distortion Weng Fei Low (National University of Singapore)*; Gim Hee Lee (National University of Singapore)
121 Contrastive Prototypical Network with Wasserstein Confidence Penalty Haoqing Wang (Peking University)*; Zhi-Hong Deng (Peking University)
123 Privacy-Preserving Face Recognition with Learnable Privacy Budgets in Frequency Domain Jiazhen Ji (Tencent)*; Huan Wang (Xiamen University); Yuge Huang (Tencent YouTu); Jiaxiang Wu (Tencent); Xingkun Xu (Tencent); Shouhong Ding (Tencent); ShengChuan Zhang (Xiamen University); Liujuan Cao (Xiamen University); Rongrong Ji (Xiamen University, China)
127 An End-to-End Transformer Model for Crowd Localization Dingkang Liang (Huazhong University of Science and Technology)*; Wei Xu (Beijing University of Posts and Telecommunications); Xiang Bai (Huazhong University of Science and Technology)
132 Deformable Feature Aggregation for Dynamic Multi-Modal 3D Object Detection Zehui Chen (University of Science and Technology of China); Zhenyu Li (Harbin Institute of Technology); Shiquan Zhang (SenseTime Research); Liangji Fang (Sensetime Research); Qinhong Jiang (SenseTime Research; Shanghai AI Laboratory); Feng Zhao (University of Science and Technology of China)*
140 Masked Generative Distillation Zhendong Yang (Graduate school at ShenZhen, Tsinghua university)*; Zhe Li (Bytedance Inc.); Shao Mingqi (Graduate school at ShenZhen, Tsinghua university); Dachuan Shi (Graduate school at ShenZhen, Tsinghua University); Zehuan Yuan (Bytedance.Inc); Chun Yuan (Graduate school at ShenZhen, Tsinghua university)
145 Saliency Hierarchy Modeling via Generative Kernels for Salient Object Detection Wenhu Zhang (Zhejiang University)*; Liangli Zheng (Zhejiang University); Huanyu Wang (Zhejiang University); Xintian Wu (Zhejiang University); Xi Li (Zhejiang University)
154 Tip-Adapter: Training-free Adaption of CLIP for Few-shot Classification Renrui Zhang (Shanghai AI Lab)*; Zhang Wei (Shanghai AI-Lab); Rongyao Fang (Chinese University of Hong Kong); Peng Gao (Chinese university of hong kong); Kunchang Li (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Jifeng Dai (SenseTime); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Hongsheng Li (The Chinese University of Hong Kong)
160 Temporal Lift Pooling for Continuous Sign Language Recognition Lianyu Hu (Tianjin University)*; Liqing Gao (College of Intelligence and Computing, Tianjin University); Zekang Liu (College of Intelligence and Computing, Tianjin University); Wei Feng (College of Intelligence and Computing, Tianjin University, China)
167 MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes Yang Jiao (Fudan University)*; Shaoxiang Chen (Fudan University); Zequn Jie (Meituan inc.); Jingjing Chen (Fudan University); Lin Ma (Meituan); Yu-Gang Jiang (Fudan University)
171 JPEG Artifacts Removal via Contrastive Representation Learning Xi Wang (University of Science and Technology of China); Xueyang Fu (University of Science and Technology of China)*; Yurui Zhu (University of Science and Technology of China); Zheng-Jun Zha (University of Science and Technology of China)
180 Tackling Long-Tailed Category Distribution Under Domain Shifts Xiao Gu (Imperial College London)*; Yao Guo (Shanghai Jiao Tong University); Zeju Li (Imperial College London); Jianing Qiu (Imperial College London); DOU QI (The Chinese University of Hong Kong); Yuxuan Liu (Institute of Medical Robotics, Shanghai Jiao Tong University); Benny P L Lo (Imperial College London); Guang-Zhong Yang (SJTU)
184 WeLSA: Learning To Predict 6D Pose From Weakly Labeled Data Using Shape Alignment Shishir Reddy Vutukur (TU Munich / Siemens Technology)*; Ivan Shugurov (TU Munich / Siemens Corporate Technology); Benjamin Busam (Technical University of Munich); ANDREAS HUTTER (Siemens Corporate Technology, Germany); Slobodan Ilic (TUM)
190 Fine-grained Data Distribution Alignment for Post-Training Quantization Yunshan Zhong (xiamen university)*; Mingbao Lin (Xiamen University, China); Mengzhao Chen (Xiamen University); Ke Li (Tencent); Yunhang Shen (Xiamen University); Fei Chao (Xiamen University); Yongjian Wu (Tencent Technology (Shanghai) Co.,Ltd); Rongrong Ji (Xiamen University, China)
192 Few-shot Single-view 3D Reconstruction with Memory Prior Contrastive Network Zhen Xing (Fudan University)*; Yijiang Chen (Fudan University); Zhixin Ling (Fudan University); Xiangdong Zhou (Fudan University); Yu Xiang (The University of Texas at Dallas)
194 ExtrudeNet: Unsupervised Inverse Sketch-and-Extrude for Shape Parsing Daxuan Ren (Nanyang Technological University)*; Jianmin Zheng (Nanyang Technological University); Jianfei Cai (Monash University); jiatong j li (Sensetime); Junzhe Zhang (Nanyang Technological University)
196 P-STMO: Pre-Trained Spatial Temporal Many-to-One Model for 3D Human Pose Estimation Wenkang Shan (Peking University)*; Zhenhua Liu (Peking University); xinfeng zhang (University of Chinese Academy of Sciences); Shanshe Wang (Peking University); Siwei Ma (Peking University, China); Wen Gao (PKU)
205 Contrast-Phys: Unsupervised Video-based Remote Physiological Measurement via Spatiotemporal Contrast Zhaodong Sun (University of Oulu)*; Xiaobai Li (University of Oulu)
222 Panoptic Scene Graph Generation Jingkang Yang (Nanyang Technological University)*; Yi Zhe Ang (Nanyang Technological University); Zujin GUO (Nanyang Technological University); Kaiyang Zhou (Nanyang Technological University); Wayne Zhang (SenseTime Research); Ziwei Liu (Nanyang Technological University)
247 StyleSwap: Style-Based Generator Empowers Robust Face Swapping Zhiliang Xu (Baidu Inc.); Hang Zhou (The Chinese University of Hong Kong)*; Zhibin Hong (Baidu Inc.); Ziwei Liu (Nanyang Technological University); Jiaming Liu (Baidu Inc.); zhizhi guo (Department of Computer Vision Technology (VIS), Baidu Inc); Junyu Han (Baidu Inc.); jingtuo liu (baidu); Errui Ding (Baidu Inc.); Jingdong Wang (Baidu)
248 Boosting Event Stream Super-Resolution with A Recurrent Neural Network Wenming Weng (University of Science and Technology of China)*; Yueyi Zhang (University of Science and Technology of China); Zhiwei Xiong (University of Science and Technology of China)
249 Unknown-Oriented Learning for Open Set Domain Adaptation jie liu (City University of Hong Kong)*; Xiaoqing Guo (City University of Hong Kong); Yixuan YUAN (City University of Hong Kong)
255 Unpaired Deep Image Dehazing Using Contrastive Disentanglement Learning Xiang Chen (Nanjing University of Science and Technology)*; Zhentao Fan (Shenyang Aerospace University); Pengpeng Li (Dalian Polytechnic University); Longgang Dai (Shenyang Aerospace University); Caihua Kong (Shenyang Aerospace University); Zhuoran Zheng (Nanjing University of Science and Technology ); Yufeng Huang (Shenyang Aerospace University); Yufeng Li (Shenyang Aerospace University)
263 Check and Link: Pairwise Lesion Correspondence Guides Mammogram Mass Detection Ziwei Zhao (Peking University)*; Dong Wang (Peking University); Yihong Chen (Peking University); Ziteng Wang (Yizhun-ai); Liwei Wang (Peking University)
265 Generative Subgraph Contrast for Self-Supervised Graph Representation Learning yuehui han (njust)*; Le Hui (Nanjing University of Science and Technology); Haobo Jiang (Nanjing University of Science and Technology); Jianjun Qian (Nanjing University of Science and Technology); Jin Xie (Nanjing University of Science and Technology)
267 DVS-Voltmeter: Stochastic Process-based Event Simulator for Dynamic Vision Sensors SongNan Lin (Nanyang Technological University)*; Ye Ma (McGill University); Zhenhua Guo (Alibaba Group); Bihan Wen (Nanyang Technological University)
268 Prototype-Guided Continual Adaptation for Class-Incremental Unsupervised Domain Adaptation Hongbin Lin (South China University of Technology); Yifan Zhang (National University of Singapore); Zhen Qiu (South China University of Technology); Shuaicheng Niu (South China University of Technology); Chuang Gan (MIT-IBM Watson AI Lab); Yanxia Liu (South China University of Technology); Mingkui Tan (South China University of Technology)*
283 SiRi: A Simple Selective Retraining Mechanism for Transformer-based Visual Grounding Mengxue Qu (Beijing Jiaotong University)*; Yu Wu (Princeton University); Wu Liu (AI Research of JD.com); Qiqi Gong (Beijing Jiaotong University); Xiaodan Liang (Sun Yat-sen University); Olga Russakovsky (Princeton University); Yao Zhao (Beijing Jiaotong University); Yunchao Wei (UTS)
287 Benchmarking Omni-Vision Representation through the Lens of Visual Realms Yuanhan Zhang (Nanyang Technological University); Zhenfei Yin (Sensetime); Jing Shao (Sensetime); Ziwei Liu (Nanyang Technological University)*
291 Paint2Pix: Interactive Painting based Progressive Image Synthesis and Editing Jaskirat Singh (Australian National University)*; Liang Zheng (Australian National University); Cameron Y Smith (Adobe Research); Jose Echevarria (Adobe System Inc.)
296 BEAT: A Large-Scale Semantic and Emotional Multi-Modal Dataset for Conversational Gestures Synthesis Haiyang Liu (The University of Tokyo)*; Zihao Zhu (Keio University); Naoya Iwamoto (Huawei Technologies Japan K.K.); Yichen Peng (Japan Advanced Institute of Science and Technology); Zhengqing Li (Huawei Japan K.K.); YOU ZHOU (Tokyo Research Center, Huawei); Elif Bozkurt (Huawei Turkey R&D Center, Istanbul, Turkey); Bo Zheng (Huawei)
300 Active Pointly-Supervised Instance Segmentation Chufeng Tang (Tsinghua University)*; Lingxi Xie (Huawei Inc.); Gang Zhang (Tsinghua University); xiaopeng zhang (Huawei Cloud EI ); Qi Tian (Huawei Cloud & AI); Xiaolin Hu (Tsinghua University)
303 DecoupleNet: Decoupled Network for Domain Adaptive Semantic Segmentation Xin Lai (The Chinese University of Hong Kong)*; Zhuotao Tian (The Chinese University of Hong Kong); Xiaogang XU (The Chinese University of Hong Kong); Yingcong Chen (Hong Kong University of Science and Technology); Shu Liu (SmartMore); Hengshuang Zhao (University of Oxford); Liwei Wang (CUHK); Jiaya Jia (Chinese University of Hong Kong)
315 ByteTrack: Multi-Object Tracking by Associating Every Detection Box Yifu Zhang (Huazhong University of Science and Technology); Peize Sun (The University of Hong Kong); Yi Jiang (Bytedance); Dongdong Yu (ByteDance Inc.); Fucheng Weng (Huazhong University of Science and Technology); Zehuan Yuan (Bytedance.Inc); Ping Luo (The University of Hong Kong); Wenyu Liu (Huazhong University of Science and Technology); Xinggang Wang (Huazhong University of Science and Technology)*
317 Robust Multi-Object Tracking by Marginal Inference Yifu Zhang (Huazhong University of Science and Technology); Chunyu Wang (Microsoft Research asia); Xinggang Wang (Huazhong University of Science and Technology)*; Wenjun Zeng (EIT Institute for Advanced Study); Wenyu Liu (Huazhong University of Science and Technology)
322 Doubly-Fused ViT: Fuse Information from Vision Transformer Doubly with Local Representation Li Gao (Wuhan University)*; Dong Nie (UNC); Bo Li (Alibaba Group); Xiaofeng Ren (alibaba group)
326 CATRE: Iterative Point Clouds Alignment for Category-level Object Pose Refinement Xingyu Liu (Tsinghua University); Gu Wang (JD.COM); Yi Li (University of Washington); Xiangyang Ji (Tsinghua University)*
334 Spatiotemporal Self-attention Modeling with Temporal Patch Shift for Action Recognition Wangmeng Xiang (The Hong Kong Polytechnic University)*; Chao Li (Alibaba); Biao Wang (Alibaba); Xihan Wei (Alibaba); Xian-Sheng Hua (Damo Academy, Alibaba Group); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China)
339 Efficient Long-Range Attention Network for Image Super-resolution Xindong Zhang (The Hong Kong Polytechnic University)*; Hui Zeng (OPPO); Shi Guo (The Hong Kong Polytechnic University); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China)
343 DID-M3D: Decoupling Instance Depth for Monocular 3D Object Detection Liang Peng (ZJU)*; Xiaopei Wu (Zhejiang University); Zheng Yang (FABU); Haifeng Liu (ZJU); Deng Cai (ZJU)
349 FlowFormer: A Transformer Architecture for Optical Flow Zhaoyang Huang (Chinese University of Hong Kong)*; Xiaoyu Shi (CUHK); Chao Zhang (Samsung Telecommunication Research Institute); Qiang Wang (Samsung Research China, Beijing); Ka Chun Cheung (Nvidia); Hongwei Qin (Sensetime); Jifeng Dai (SenseTime); Hongsheng Li (The Chinese University of Hong Kong)
357 Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction Yuanhao Cai (Tsinghua University, Tsinghua Shenzhen International Graduate School); Jing Lin (Tsinghua University, Tsinghua Shenzhen International Graduate School)*; Xiaowan Hu (Tsinghua University, Tsinghua Shenzhen International Graduate School); Haoqian Wang (Tsinghua Shenzhen International Graduate School, Tsinghua University); Xin Yuan (Westlake University); Yulun Zhang (ETH Zurich); Radu Timofte (University of Wurzburg & ETH Zurich); Luc Van Gool (ETH Zurich)
358 An Embedded Feature Whitening Approach to Deep Neural Network Optimization Hongwei Yong (The Hong Kong Polytechnic University)*; Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China)
361 Optimization over Disentangled Encoding: Unsupervised Cross-Domain Point Cloud Completion via Occlusion Factor Manipulation Jingyu Gong (Shanghai Jiao Tong University)*; Fengqi Liu (Shanghai Jiao Tong University); Jiachen Xu (Shanghai Jiao Tong University); Min Wang (Sensetime Group); Xin Tan (Shanghai Jiao Tong University); Zhizhong Zhang (East China Normal University); Ran Yi (Shanghai Jiao Tong University); Haichuan Song (East China Normal University); Yuan Xie (East China Normal University); Lizhuang Ma (Shanghai Jiao Tong University)
362 Source-Free Domain Adaptation with Contrastive Domain Alignment and Self-supervised Exploration for Face Anti-Spoofing Yuchen Liu (Shanghai Jiao Tong university)*; Yabo Chen (Shanghai Jiao Tong University ); Wenrui Dai (Shanghai Jiao Tong University); Mengran Gou (Qualcomm); Chun-Ting Huang (Qualcomm); Hongkai Xiong (Shanghai Jiao Tong University)
368 MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection Xuesong Chen (The Chinese University of Hong Kong)*; Shaoshuai Shi (MPI Informatics); Benjin Zhu (MEGVII); Ka Chun Cheung (Nvidia); Hang Xu (Huawei Noah’s Ark Lab); Hongsheng Li (The Chinese University of Hong Kong)
379 SdAE: Self-distillated Masked Autoencoder Yabo Chen (Shanghai Jiao Tong University); Yuchen Liu (Shanghai Jiao Tong University); Dongsheng Jiang (Huawei Cloud & AI); Xiaopeng Zhang (Huawei Cloud EI)*; Wenrui Dai (Shanghai Jiao Tong University); Hongkai Xiong (Shanghai Jiao Tong University); Qi Tian (Huawei Cloud & AI)
383 A Transformer-based Decoder for Semantic Segmentation with Multi-level Context Mining Bowen Shi (Shanghai Jiao Tong University)*; Dongsheng Jiang (Huawei Cloud & AI); Xiaopeng Zhang (Huawei Cloud EI); Han Li (Shanghai Jiao Tong University); Wenrui Dai (Shanghai Jiao Tong University); Junni Zou (Shanghai Jiao Tong University); Hongkai Xiong (Shanghai Jiao Tong University); Qi Tian (Huawei Cloud & AI)
399 Graph-constrained Contrastive Regularization for Semi-weakly Volumetric Segmentation Simon Reiß (Karlsruhe Institute of Technology)*; Constantin Marc Seibold (Karlsruhe Institute of Technology); Alexander Freytag (Carl Zeiss AG, Jena, Germany); Rodner Erik (University of Applied Sciences Berlin); Rainer Stiefelhagen (Karlsruhe Institute of Technology)
401 Improving Vision Transformers by Revisiting High-frequency Components Jiawang Bai (Tsinghua University)*; Li Yuan (Peking University); Shu-Tao Xia (Tsinghua University); Shuicheng Yan (Sea AI Labs); Zhifeng Li (Tencent AI Lab); Wei Liu (Tencent)
405 Adaptive Co-Teaching for Unsupervised Monocular Depth Estimation Weisong Ren (Dalian University of Technology); Lijun Wang (Dalian University of Technology)*; Yongri Piao (Dalian University of Technology); Miao Zhang (Dalian University of Technology); Huchuan Lu (Dalian University of Technology); Ting Liu (Alibaba)
408 FurryGAN: High quality foreground-aware image synthesis Jeongmin Bae (Yonsei University); Mingi Kwon (Yonsei University); Youngjung Uh (Yonsei University)*
433 An Efficient Spatio-Temporal Pyramid Transformer for Action Detection Yuetian Weng (Monash University); Zizheng Pan (Monash University); Mingfei Han (Monash University; DATA61, CSIRO); Xiaojun Chang (University of Technology Sydney); Bohan Zhuang (Monash University)*
434 LocVTP: Video-Text Pre-training for Temporal Localization Meng Cao (Peking University); Tianyu Yang (Tencent AI Lab); Junwu Weng (Tencent AI Lab); Can Zhang (Peking University); Jue Wang (Tencent AI Lab); Yuexian Zou (Peking University)*
444 Fusing Local Similarities for Retrieval-based 3D Orientation Estimation of Unseen Objects Chen Zhao (EPFL)*; Yinlin Hu (EPFL); Mathieu Salzmann (EPFL)
458 Online Segmentation of LiDAR Sequences: Dataset and Algorithm Romain Loiseau (École des ponts ParisTech)*; Mathieu Aubry (École des ponts ParisTech); Loic Landrieu (IGN)
460 MVSTER: Epipolar Transformer for Efficient Multi-View Stereo Xiaofeng Wang (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences)*; Zheng Zhu (Tsinghua University); Guan Huang (Institute of Automation, Chinese Academy of Sciences); Fangbo Qin (Institute of Automation, Chinese Academy of Sciences); Yun Ye (XForwardAI Technology Co., Ltd, Beijing, China); Yijia He (Beijing Kuaishou Technology Co., Ltd); Xu Chi (Phigent Robotics); Xingang Wang (Institute of Automation, CAS)
463 Unsupervised Learning of 3D Semantic Keypoints with Mutual Reconstruction Haocheng Yuan (Northwestern Polytechnical University); Chen Zhao (EPFL); Shichao Fan (Northwestern Polytechnical University); Jiaxi Jiang (Northwestern Polytechnical University); Jiaqi Yang (Northwestern Polytechnical University)*
482 Generalizable Medical Image Segmentation via Random Amplitude Mixup and Domain-Specific Image Restoration Ziqi Zhou (Nanjing University)*; Lei Qi (Southeast University); Yinghuan Shi (Nanjing University)
499 Demystifying Unsupervised Semantic Correspondence Estimation Mehmet Aygün (The University of Edinburgh)*; Oisin Mac Aodha (University of Edinburgh)
513 Learning Shadow Correspondence for Video Shadow Detection Xinpeng Ding (The Hong Kong University of Science and Technology); Jingwen Yang (The Hong Kong University of Science and Technology); Xiaowei Hu (Shanghai AI Laboratory); Xiaomeng Li (The Hong Kong University of Science and Technology)*
514 PolarMOT: How far can geometric relations take us in 3D multi-object tracking? Aleksandr Kim (Technical University of Munich); Guillem Brasó (TUM); Aljosa Osep (TUM Munich)*; Laura Leal-Taixé (TUM)
516 Few-Shot End-to-End Object Detection via Constantly Concentrated Encoding across Heads Jiawei Ma (Columbia University)*; Guangxing Han (Columbia University); Shiyuan Huang (Columbia University); Yuncong Yang (Columbia University); Shih-Fu Chang (Columbia University)
525 MVDECOR: Multi-view Dense Correspondence Learning for Fine-grained 3D Segmentation Gopal Sharma (University of Massachusetts Amherst)*; Kangxue Yin (NVIDIA); Subhransu Maji (University of Massachusetts, Amherst); Evangelos Kalogerakis (UMass Amherst); Or Litany (NVIDIA); Sanja Fidler (University of Toronto, NVIDIA)
537 Implicit Neural Representations for Image Compression Yannick Strümpler (ETH Zürich)*; Janis Postels (ETH Zurich); Ren Yang (ETH Zurich); Luc Van Gool (ETH Zurich); Federico Tombari (Google, TU Munich)
541 Cross-modal Prototype Driven Network for Radiology Report Generation Jun Wang (University of Warwick)*; Abhir Bhalerao (University of Warwick); Yulan He (University of Warwick)
556 Scene Text Recognition with Permuted Autoregressive Sequence Models Darwin Bautista (University of the Philippines)*; Rowel Atienza (University of the Philippines)
568 XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model Ho Kei Cheng (University of Illinois Urbana-Champaign)*; Alexander Schwing (UIUC)
570 SUPR: A Sparse Unified Part-Based Human Body Model Ahmed A A Osman (Max Planck Institute for Intelligent Systems)*; Michael J. Black (Max Planck Institute for Intelligent Systems); Timo Bolkart (Max Planck Institute for Intelligent Systems); Dimitrios Tzionas (University of Amsterdam)
575 SCAM! Transferring humans between images with Semantic Cross Attention Modulation Nicolas Dufour (ENPC)*; David Picard (ENPC); Vicky Kalogeiton (Ecole Polytechnique)
583 Q-FW: A Hybrid Classical-Quantum Frank-Wolfe for Quadratic Binary Optimization Alp Yurtsever (Umeå University); Tolga Birdal (TU Munich)*; Vladislav Golyanik (MPI for Informatics)
584 Revisiting Point Cloud Simplification: A Learnable Feature Preserving Approach Rolandos Alexandros Potamias (Imperial College London)*; Giorgos Bouritsas (Imperial College London); Stefanos Zafeiriou (Imperial College London)
599 Neural Architecture Search for Spiking Neural Networks Youngeun Kim (Yale University)*; Yuhang Li (Yale University); Hyoungseob Park (Yale University); Yeshwanth Venkatesha (Yale University); Priyadarshini Panda (Yale University)
601 Neuromorphic Data Augmentation for Training Spiking Neural Networks Yuhang Li (Yale University)*; Youngeun Kim (Yale University); Hyoungseob Park (Yale University); Tamar Geller (Yale University); Priyadarshini Panda (Yale University)
602 RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild Jason Y Zhang (Carnegie Mellon University)*; Deva Ramanan (Carnegie Mellon University); Shubham Tulsiani (Carnegie Mellon University)
609 Human Trajectory Prediction via Neural Social Physics Jiangbei Yue (Leeds University); Dinesh Manocha (University of Maryland at College Park)*; He Wang (Leeds University)
615 Explicit Occlusion Reasoning for Multi-person 3D Human Pose Estimation Qihao Liu (Johns Hopkins University); Yi Zhang (Johns Hopkins University); Song Bai (University of Oxford); Alan Yuille (Johns Hopkins University)*
626 R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis Huan Wang (Northeastern University); Jian Ren (Snap Inc.); Zeng Huang (Snap Inc.)*; Kyle B Olszewski (Snap Inc.); Menglei Chai (Snap Inc.); YUN FU (Northeastern University); Sergey Tulyakov (Snap Inc)
629 Towards Open Set Video Anomaly Detection Yuansheng Zhu (Rochester Institute of Technology)*; Wentao Bao (Rochester Institute of Technology); Qi Yu (Rochester Institute of Technology)
634 Object-Compositional Neural Implicit Surfaces Qianyi Wu (Monash University)*; Xian Liu (The Chinese University of Hong Kong); Yuedong Chen (Monash University); Kejie Li (University of Oxford); Chuanxia Zheng (Monash University); Jianfei Cai (Monash University); Jianmin Zheng (Nanyang Technological University)
636 Sem2NeRF: Converting Single-View Semantic Masks to Neural Radiance Fields Yuedong Chen (Monash University)*; Qianyi Wu (Monash University); Chuanxia Zheng (Monash University); Tat-Jen Cham (Nanyang Technological University); Jianfei Cai (Monash University)
641 WaveGAN: Frequency-aware GAN for High-Fidelity Few-shot Image Generation Mengping Yang (East China University of Science and Technology)*; Zhe Wang (East China University of Science and Technology); Ziqiu Chi (East China University of Science and Technology); Wenyi Feng (East China University of Science and Technology)
642 Class-Agnostic Object Counting Robust to Intraclass Diversity Shenjian Gong (Nanjing University of Science and Technology)*; Shanshan Zhang (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology); Dengxin Dai (MPI for Informatics ); Bernt Schiele (MPI Informatics)
650 TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts Chuan Guo (University of Alberta)*; Xinxin Zuo (University of Alberta); Sen Wang (University of Alberta); Li Cheng (ECE dept., University of Alberta)
652 Self-Distillation for Robust LiDAR Semantic Segmentation in Autonomous Driving Jiale Li (Zhejiang University); Hang Dai (Mohamed bin Zayed University of Artificial Intelligence)*; Yong Ding (Zhejiang University)
654 Semi-Supervised Monocular 3D Object Detection by Multi-View Consistency Qing Lian (Hong Kong University of Science and Technology )*; Yanbo XU (The Hong Kong University of Science and Technology); Weilong Yao (Shanghai Xiantu Intelligent Technology Co., Ltd.); Yingcong Chen (Hong Kong University of Science and Technology); Tong Zhang (Hong Kong University of Science and Technology)
655 Lidar Point Cloud Guided Monocular 3D Object Detection Liang Peng (ZJU)*; Fei Liu (Zhejiang University); Zhengxu Yu (Zhejiang University); Senbo Yan (Zhejiang University); Dan Deng (FABU); Zheng Yang (FABU); Haifeng Liu (ZJU); Deng Cai (ZJU)
656 Structural Causal 3D Reconstruction Weiyang Liu (University of Cambridge)*; Zhen Liu (Mila, University of Montreal); Liam Paull (Université de Montréal); Adrian Weller (University of Cambridge); Bernhard Schölkopf (MPI for Intelligent Systems, Tübingen)
671 KD-MVS: Knowledge Distillation Based Self-supervised Learning for Multi-view Stereo Yikang Ding (Tsinghua University)*; Qingtian Zhu (Peking University); Xiangyue Liu (Beihang University); Wentao Yuan (Peking University); Haotian Zhang (Megvii); Chi Zhang (Megvii Inc.)
685 When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition Bohan Li (Huazhong University of Science and Technology)*; Ye Yuan (Tomorrow Advancing Life); Dingkang Liang (Huazhong University of Science and Technology); Xiao Liu (Tencent); zhilong ji (Tomorrow Advancing Life); Jinfeng Bai (TAL); Wenyu Liu (Huazhong University of Science and Technology); Xiang Bai (Huazhong University of Science and Technology)
689 Shape Matters: Deformable Patch Attack Zhaoyu Chen (Fudan University); Bo Li (Nanjing University)*; Shuang Wu (Tencent); Jianghe Xu (Tencent Youtu Lab); Shouhong Ding (Tencent); Wenqiang Zhang (Fudan University)
690 PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection Han Wang (Shanghai Jiao Tong University)*; Jun Tang (hikvision); Xiaodong Liu (Hikvision); Shanyan Guan (Shanghai Jiao Tong University); Rong Xie (Shanghai Jiao Tong University); Li Song (Shanghai Jiao Tong University)
694 BEVFormer: Learning Bird-Eye-View Representations from Multi-View Images via Spatiotemporal Transformer Zhiqi Li (Nanjing University); Wenhai Wang (Nanjing University); Hongyang Li (SenseTime); Enze Xie (The University of Hong Kong); Chonghao Sima (Purdue University); Tong Lu (Nanjing University); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Jifeng Dai (SenseTime)*
696 Detecting Tampered Scene Text in the Wild YuXin Wang (University of Science and Technology of China)*; Hongtao Xie (University of Science and Technology of China); Mengting Xing (University of Science and Technology of China); Jing Wang (Huawei Cloud & AI); Shenggao Zhu (Huawei); Yongdong Zhang (University of Science and Technology of China)
702 Projective Parallel Single-pixel Imaging to Overcome Global Illumination in 3D Structure Light Scanning Yuxi Li (Beihang University)*; Huijie Zhao (Beihang University); Hongzhi Jiang (Beihang University); Xudong Li (Beihang University)
709 CelebV-HQ: A Large-Scale Video Facial Attributes Dataset Hao Zhu (SenseTime Research)*; Wayne Wu (SenseTime Research); Wentao Zhu (Peking University); Liming Jiang (Nanyang Technological University); Siwei Tang (Sensetime research); Li Zhang (Sensetime); Ziwei Liu (Nanyang Technological University); Chen Change Loy (Nanyang Technological University)
710 Open-world Semantic Segmentation for LIDAR Point Clouds Jun CEN (The Hong Kong University of Science and Technology)*; Peng YUN (Hong Kong University of Science and Technology); Shiwei Zhang (DAMO Academy, Alibaba Group); Junhao CAI (HKUST); Di LUAN (Hong Kong University of Science and Technology); Mingqian Tang (Alibaba Group); Michael Yu Wang (HKUST); Ming Liu (HKUST)
721 Burn After Reading: Online Adaptation for Cross-domain Streaming Data Luyu Yang (University of Maryland, College Park)*; Mingfei Gao (Apple); Zeyuan Chen (Salesforce Research); Ran Xu (Salesforce Research); Abhinav Shrivastava (University of Maryland); Chetan Ramaiah (Salesforce Research)
728 CLOSE: Curriculum Learning On the Sharing Extent Towards Better One-shot NAS Zixuan Zhou (Tsinghua University)*; Xuefei Ning (Tsinghua University); Yi Cai (Tsinghua University); Jiashu Han (None); Yiping Deng (Huawei); Yuhan Dong (Tsinghua University); Huazhong Yang (Tsinghua University); Yu Wang (Tsinghua University)
734 RigNet: Repetitive Image Guided Network for Depth Completion Zhiqiang Yan (Nanjing University of Science and Technology)*; Kun Wang (Nanjing University of Science and Technology); Xiang Li (Nanjing University of Science and Technology); Zhenyu Zhang (Tencent); Jun Li (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology)
744 Streamable Neural Fields Junwoo Cho (Sungkyunkwan University)*; Seungtae Nam (Sungkyunkwan University); Daniel Rho (Sungkyunkwan University); Jong Hwan Ko (Sungkyunkwan University); Eunbyung Park (Sungkyunkwan University)
755 2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds Xu Yan (The Chinese University of Hong Kong, Shenzhen); Jiantao Gao (Shanghai University); Chaoda Zheng (The Chinese University of Hong Kong, Shenzhen); Chao Zheng (Tencent); Ruimao Zhang (The Chinese University of Hong Kong, Shenzhen); Shuguang Cui (The Chinese University of Hong Kong, Shenzhen); Zhen Li (The Chinese University of Hong Kong, Shenzhen)*
762 Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification Yang Liu (Beihang University); Lei Zhou (Beihang University)*; Pengcheng Zhang (Beihang University); Xiao Bai (Beihang University); Lin Gu (RIKEN,AIP / The University of Tokyo); Xiaohan Yu (Griffith University); Jun Zhou (Griffith University); Hancock Edwin (University of York, UK)
776 Mind the Gap in Distilling StyleGANs Guodong Xu (The Chinese University of Hong Kong)*; Yuenan HOU (Shanghai AI Lab); Ziwei Liu (Nanyang Technological University); Chen Change Loy (Nanyang Technological University)
784 End-to-End Active Speaker Detection Juan C Leon (KAUST)*; Moritz Cordes (Leuphana University of Lüneburg); Chen Zhao (KAUST); Bernard Ghanem (KAUST)
785 Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing Haoyue Cheng (Nanjing University); Zhaoyang Liu (SenseTime Research); Hang Zhou (The Chinese University of Hong Kong); Chen Qian (SenseTime); Wayne Wu (SenseTime Research); Limin Wang (Nanjing University)*
790 Learn-to-Decompose: Cascaded Decomposition Network for Cross-Domain Few-Shot Facial Expression Recognition Xinyi Zou (Xiamen University); Yan Yan (Xiamen University)*; Jing-Hao Xue (University College London); Si Chen (Xiamen University of Technology); Hanzi Wang (Xiamen University)
798 Learning with Recoverable Forgetting Jingwen Ye (National University of Singapore)*; Fu Yifang (National University of Singapore); Jie Song (Zhejiang University); Xingyi Yang (National University of Singapore); Songhua Liu (National University of Singapore); Xin Jin (University of Science and Technology of China); Mingli Song (Zhejiang University); Xinchao Wang (National University of Singapore)
800 Masked Autoencoders for Point Cloud Self-supervised Learning Yatian Pang (National University of Singapore); Wenxiao Wang (State Key Lab of CAD&CG, Zhejiang University); Francis EH Tay (National University of Singapore); Wei Liu (Tencent); Yonghong Tian (Peking University); Li Yuan (Peking University)*
803 RamGAN: Region Attentive Morphing GAN for Region-Level Makeup Transfer Jianfeng Xiang (ShenZhen University)*; Junliang Chen (Shenzhen University); Wenshuang Liu (Shenzhen University); Xianxu Hou (Shenzhen University); Linlin Shen (Shenzhen University)
807 Efficient One Pass Self-distillation with Zipf’s Label Smoothing Jiajun Liang (Megvii)*; Linze Li (MEGVII Technology); Zhaodong Bing (Megvii Technology); Borui Zhao (Megvii Technology); Yao Tang (Peking University); Bo Lin (MEGVII Technology); Haoqiang Fan (Megvii Inc(face++))
812 DaViT: Dual Attention Vision Transformers Mingyu Ding (The University of Hong Kong)*; Bin Xiao (Microsoft); Noel C Codella (Microsoft); Ping Luo (The University of Hong Kong); Jingdong Wang (Baidu); Lu Yuan (Microsoft)
815 OneFace: One Threshold for All Jiaheng Liu (Beihang University); zhipeng yu (University of Chinese Academy of Sciences); Haoyu Qin (SenseTime); Yichao Wu (Sensetime Group Limited); Ding Liang (Sensetime Group Limited); Gangming Zhao (The University of Hong Kong); Ke Xu (Beihang University)*
820 Semantic-Sparse Colorization Network for Deep Exemplar-based Colorization Yunpeng Bai (Tsinghua University)*; Chao Dong (SIAT); Zenghao Chai (Tsinghua University); Andong Wang (Tsinghua University); Zhengzhuo Xu (Tsinghua University); Chun Yuan (Graduate School at Shenzhen, Tsinghua University)
822 Vibration-based Uncertainty Estimation for Learning from Limited Supervision Hengtong Hu (Hefei University of Technology)*; Lingxi Xie (Huawei Inc.); Xinyue Huo (University of Science and Technology of China); Richang Hong (HeFei University of Technology); Qi Tian (Huawei Cloud & AI)
824 SOS! Self-supervised Learning Over Sets Of Handled Objects In Egocentric Action Recognition Victor A Escorcia (Samsung AI Center)*; Ricardo Guerrero (Samsung AI Center Cambridge); Xiatian Zhu (Samsung AI Centre); Brais Martinez (Samsung AI Center)
829 FADE: Fusing the Assets of Decoder and Encoder for Task-Agnostic Upsampling Hao Lu (Huazhong University of Science and Technology); Wenze Liu (Huazhong University of Science and Technology); Hongtao Fu (Huazhong University of Science and Technology); Zhiguo Cao (Huazhong Univ. of Sci.&Tech.)*
833 VTC: Improving Video-Text Retrieval with User Comments Laura Hanu (Unitary)*; James Thewlis (Unitary); Yuki M Asano (University of Amsterdam); Christian Rupprecht (University of Oxford)
839 Less than Few: Self-Shot Video Instance Segmentation Pengwan Yang (University of Amsterdam)*; Yuki M Asano (University of Amsterdam); Pascal Mettes (University of Amsterdam); Cees Snoek (University of Amsterdam)
841 End-to-End Visual Editing with a Generatively Pre-Trained Artist Andrew Brown (University of Oxford)*; Cheng-Yang Fu (Facebook.com); Omkar M Parkhi (Facebook); Tamara Berg (Facebook AI Research); Andrea Vedaldi (University of Oxford / Facebook AI Research)
852 COUCH: Towards Controllable Human-chair Interactions Xiaohan Zhang (University of Tübingen, MPI Informatics); Bharat Lal Bhatnagar (University of Tübingen, MPI informatik); Sebastian Starke (University of Edinburgh); Vladimir Guzov (University of Tuebingen); Gerard Pons-Moll (University of Tübingen)*
859 MovieCuts: A New Dataset and Benchmark for Cut Type Recognition Alejandro Pardo (KAUST)*; Fabian Caba (Adobe Research); Juan C Leon (KAUST); Ali K Thabet (Facebook); Bernard Ghanem (KAUST)
877 High-fidelity GAN Inversion with Padding Space Qingyan Bai (Tsinghua University)*; Yinghao Xu (Chinese University of Hong Kong); Jiapeng Zhu (HKUST); Weihao Xia (University College London); Yujiu Yang (Tsinghua University); Yujun Shen (Dept. of IE, CUHK)
893 LiDAL: Inter-frame Uncertainty Based Active Learning for 3D LiDAR Semantic Segmentation ZEYU HU (Hong Kong University of Science and Technology)*; Xuyang Bai (HKUST); Runze Zhang (Tencent); Xin Wang (Tencent); Guangyuan Sun (TENCENT); Hongbo Fu (City University of Hong Kong); Chiew-Lan Tai (Hong Kong University of Science & Technology)
897 Optimal Boxes: Boosting End-to-End Scene Text Recognition by Adjusting Annotated Bounding Boxes via Reinforcement Learning Jingqun Tang (Ant Group)*; Wenming Qian (Huazhong University of Science and Technology); Luchuan Song (University of Science and Technology of China); Xiena Dong (Hangzhou Dianzi University); Lan Li (Wuhan University); Xiang Bai (Huazhong University of Science and Technology)
912 Concurrent Subsidiary Supervision for Unsupervised Source-Free Domain Adaptation Jogendra Nath Kundu (Indian Institute of Science)*; Suvaansh Bhambri (Indian Institute of Science); Akshay R Kulkarni (Indian Institute of Science); Hiran Sarkar (Indian Institute of Science); Varun Jampani (Google); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science)
913 Designing One Unified Framework for High-Fidelity Face Reenactment and Swapping Chao Xu (Zhejiang University)*; Jiangning Zhang (Zhejiang University); Yue Han (Zhejiang University); Guanzhong Tian (Ningbo Research Institute, Zhejiang University); xianfang zeng (Zhejiang University); Ying Tai (Tencent YouTu); Yabiao Wang (Tencent); Chengjie Wang (Tencent; Shanghai Jiao Tong University); Yong Liu (Zhejiang University)
919 Category-Level 6D Object Pose and Size Estimation using Self-Supervised Deep Prior Deformation Networks Jiehong Lin (South China University of Technology)*; Zewei Wei (South China University of Technology); Changxing Ding (South China University of Technology); Kui Jia (South China University of Technology)
927 Intrinsic Neural Fields: Learning Functions on Manifolds Lukas Koestler (Technical University of Munich)*; Daniel Grittner (Technische Universität München); Michael Moeller (University of Siegen); Daniel Cremers (TU Munich); Zorah Laehner (University of Siegen)
930 LaMAR: Benchmarking Localization and Mapping for Augmented Reality Paul-Edouard Sarlin (ETH Zurich); Mihai Dusmanu (ETH Zurich)*; Johannes L Schönberger (Microsoft); Pablo Speciale (Microsoft); Lukas Gruber (Microsoft); Viktor Larsson (Lund University); Ondrej Miksik (Microsoft); Marc Pollefeys (ETH Zurich / Microsoft)
933 3D Compositional Zero-shot Learning with DeCompositional Consensus Muhammad Ferjad Naeem (ETH Zürich)*; Evin Pınar Örnek (TU Munich); Yongqin Xian (ETH Zurich); Luc Van Gool (ETH Zurich); Federico Tombari (Google, TU Munich)
939 Video Mask Transfiner for High-Quality Video Instance Segmentation Lei Ke (HKUST)*; Henghui Ding (ETH Zurich); Martin Danelljan (ETH Zurich); Yu-Wing Tai (Kuaishou Technology / HKUST); Chi-Keung Tang (Hong Kong University of Science and Technology); Fisher Yu (ETH Zurich)
940 FashionViL: Fashion-Focused Vision-and-Language Representation Learning Xiao Han (University of Surrey)*; Licheng Yu (Facebook); Xiatian Zhu (University of Surrey); Li Zhang (Fudan University); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey)
945 Adaptive Face Forgery Detection in Cross Domain Luchuan Song (University of Science and Technology of China)*; Zheng Fang (Beihang University); Xiaodan Li (Alibaba Group); Xiaoyi Dong (University of Science and Technology of China); Zhenchao Jin (University of Science and Technology of China); Yuefeng Chen (Alibaba Group); Siwei Lyu (University at Buffalo)
958 LiP-Flow: Learning Inference-time Priors for Codec Avatars via Normalizing Flows in Latent Space Emre Aksan (ETH Zurich)*; Shugao Ma (Facebook); Akin Caliskan (Center for Vision Speech and Signal Processing – University of Surrey); Stanislav Pidhorskyi (Facebook Inc.); Alexander Richard (Facebook Reality Labs); Shih-En Wei (Facebook); Jason Saragih (Facebook); Otmar Hilliges (ETH Zurich)
961 Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection Hongyu Zhou (Megvii)*; Songtao Liu (MEGVII); Zeming Li (Megvii(Face++) Inc); Jian Sun (Megvii Technology); Weixin Mao (Waseda University); Zheng Ge (MEGVII Technology); Haiyan Yu (Harbin Institute of Technology)
968 Metric Learning based Interactive Modulation for Real-World Super-Resolution Chong Mou (Peking University Shenzhen Graduate School)*; Yanze Wu (Tencent); Xintao Wang (Tencent); Chao Dong (SIAT); Jian Zhang (Peking University Shenzhen Graduate School); Ying Shan (Tencent)
971 Optimal Transport for Label-Efficient Visible-Infrared Person Re-Identification Jiangming Wang (East China Normal University); Zhizhong Zhang (East China Normal University); Mingang Chen (Shanghai Development Center of Computer Software Technology); yi zhang (zhejianglab); Cong Wang (Huawei Technologies); Bin Sheng (Shanghai Jiao Tong University); Yanyun Qu (XMU); Yuan Xie (East China Normal University)*
977 Proposal-Free Temporal Action Detection via Global Segmentation Mask Learning Sauradip Nag (University of Surrey)*; Xiatian Zhu (University of Surrey); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey)
979 Sobolev Training for Implicit Neural Representations with Approximated Image Derivatives Wentao Yuan (Peking University)*; Qingtian Zhu (Peking University); Xiangyue Liu (Beihang University); Yikang Ding (Tsinghua University); Haotian Zhang (Megvii); Chi Zhang (Megvii Inc.)
982 Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression Yeying Jin (National University of Singapore)*; Wenhan Yang (NTU); Robby T. Tan (National University of Singapore)
986 Point-to-Box Network for Accurate Object Detection via Single Point Supervision Pengfei Chen (University of Chinese Academy of Sciences); Xuehui Yu (University of Chinese Academy of Sciences); Xumeng Han (University of Chinese Academy of Sciences); Najmul Hassan (University of Oregon); Kai Wang (U of Oregon); Jiachen Li (UIUC); Jian Zhao (Institute of North Electronic Equipment); Humphrey Shi (U of Oregon | UIUC | PAIR); Zhenjun Han (University of Chinese Academy of Sciences)*; Qixiang Ye (University of Chinese Academy of Sciences, China)
989 Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks Yunshan Zhong (Xiamen University)*; Mingbao Lin (Xiamen University, China); Xunchao Li (Xiamen University); Ke Li (Tencent); Yunhang Shen (Xiamen University); Fei Chao (Xiamen University); Yongjian Wu (Tencent Technology (Shanghai) Co., Ltd); Rongrong Ji (Xiamen University, China)
999 Locality Guidance for Improving Vision Transformers on Tiny Datasets Kehan Li (Peking University); Runyi Yu (Peking University); Zhennan Wang (Peng Cheng Laboratory); Li Yuan (Peking University); Guoli Song (Peng Cheng Laboratory); Jie Chen (Peking University)*
1002 Weakly Supervised Object Localization through Inter-class Feature Similarity and Intra-class Appearance Consistency Jun Wei (The Chinese University of Hong Kong, Shenzhen); Sheng Wang (Shanghai Zelixir Biotech); S. Kevin Zhou (USTC); Shuguang Cui (The Chinese University of Hong Kong, Shenzhen ); Zhen Li (The Chinese University of Hong Kong, Shenzhen)*
1003 Semi-Supervised Temporal Action Detection with Proposal-Free Masking Sauradip Nag (University of Surrey)*; Xiatian Zhu (University of Surrey); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey)
1005 Neighborhood Collective Estimation for Noisy Label Identification and Correction Jichang Li (The University of Hong Kong)*; Guanbin Li (Sun Yat-sen University); Feng Liu (Deepwise AI Lab); Yizhou Yu (The University of Hong Kong)
1010 Zero-Shot Temporal Action Detection via Vision-Language Prompting Sauradip Nag (University of Surrey)*; Xiatian Zhu (University of Surrey); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey)
1016 Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval Pandeng Li (University of Science and Technology of China)*; Hongtao Xie (University of Science and Technology of China); Jiannan Ge (University of Science and Technology of China); Lei Zhang (Kuaishou); Shaobo Min (Tencent); Yongdong Zhang (University of Science and Technology of China)
1018 Discover and Mitigate Unknown Biases with Debiasing Alternate Networks Zhiheng Li (University of Rochester)*; Anthony Hoogs (Kitware); Chenliang Xu (University of Rochester)
1020 Hierarchical Memory Learning for Fine-Grained Scene Graph Generation Youming Deng (Wuhan University); Yansheng Li (Wuhan University)*; Yongjun Zhang (Wuhan University); Xiang Xiang (Huazhong University of Science and Technology); Jian Wang (Ant Group); Jingdong Chen (Ant Group); Jiayi Ma (Wuhan University)
1026 Improving Test-Time Adaptation via Shift-agnostic Weight Regularization and Nearest Source Prototypes Sungha Choi (Qualcomm AI Research)*; Seunghan Yang (Qualcomm AI Research); Seokeon Choi (Qualcomm AI Research); Sungrack Yun (Qualcomm AI Research)
1028 Automatic dense annotation of large-vocabulary sign language videos Liliane Momeni (University of Oxford)*; Hannah Bull (LIMSI (CNRS)); Prajwal K R (VGG, Oxford); Samuel Albanie (University of Cambridge); Gul Varol (Ecole des Ponts ParisTech); Andrew Zisserman (University of Oxford)
1029 Few-shot Class-incremental Learning via Entropy-regularized Data-free Replay Huan Liu (McMaster University)*; Li Gu (Huawei Canada); Zhixiang Chi (Huawei Noah’s Ark Laboratory); Yuanhao Yu (Huawei Noah’s Ark Laboratory); Yang Wang (Concordia University); Jun Chen (McMaster University); Jin Tang (Huawei Noah’s Ark Laboratory)
1035 Learning Instance-Specific Adaptation for Cross-Domain Segmentation Yuliang Zou (Virginia Tech)*; Zizhao Zhang (Google); Chun-Liang Li (Google); Han Zhang (Google); Tomas Pfister (Google); Jia-Bin Huang (Facebook)
1039 SALVe: Semantic Alignment Verification for Floorplan Reconstruction from Sparse Panoramas John W Lambert (Georgia Institute of Technology)*; Yuguang Li (Zillow Group); Ivaylo Boyadzhiev (Zillow Group); Lambert Wixson (Zillow Group); Manjunath Narayana (Zillow group); Will A Hutchcroft (Zillow Group); James Hays (Georgia Institute of Technology, USA); Frank Dellaert (Georgia Tech); Sing Bing Kang (Zillow Group)
1044 Active Learning Strategies for Weakly-Supervised Object Detection Huy V. Vo (Ecole Normale Supérieure – INRIA – Valeo.ai)*; Oriane Siméoni (valeo.ai); Spyros Gidaris (valeo.ai); Andrei Bursuc (valeo.ai); Patrick Pérez (Valeo.ai); Jean Ponce (Inria)
1049 3D Human Pose Estimation Using Möbius Graph Convolutional Networks Niloofar Azizi (ICG department of TU Graz)*; Horst Possegger (Graz University of Technology); Emanuele Rodola (Sapienza University of Rome); Horst Bischof (Graz University of Technology)
1055 Real-time Online Video Detection with Temporal Smoothing Transformers Yue Zhao (University of Texas at Austin)*; Philipp Kraehenbuehl (UT Austin)
1060 3D-FM GAN: Towards 3D-Controllable Face Manipulation Yuchen Liu (Princeton University)*; Zhixin Shu (Adobe Research); Yijun Li (Adobe Research); Zhe Lin (Adobe Research); Richard Zhang (Adobe); Sun-Yuan Kung (Princeton University)
1064 SinNeRF: Training Neural Radiance Field on Complex Scene from a Single Image Dejia Xu (University of Texas at Austin)*; Yifan Jiang (University of Texas at Austin); Peihao Wang (University of Texas at Austin); Zhiwen Fan (University of Texas at Austin); Humphrey Shi (U of Oregon | UIUC | PAIR); Zhangyang Wang (University of Texas at Austin)
1069 Entropy-driven Sampling and Training Scheme for Conditional Diffusion Generation Guangcong Zheng (Zhejiang University); Shengming Li (Zhejiang University); Hui Wang (Zhejiang University); Taiping Yao (Tencent YouTu); Yang Chen (Tencent); Shouhong Ding (Tencent); Xi Li (Zhejiang University)*
1076 Identity-aware Hand Mesh Estimation and Personalization from RGB Images Deying Kong (University of California, Irvine)*; Linguang Zhang (Facebook Reality Labs); Liangjian Chen (Reality Labs); Haoyu Ma (University of California, Irvine); Xiangyi Yan (University of California, Irvine); Shanlin Sun (University of California, Irvine); Xingwei Liu (University of California Irvine); Kun Han (University of California Irvine); Xiaohui Xie (University of California, Irvine)
1084 TALLFormer: Temporal Action Localization with a Long-memory Transformer Feng Cheng (University of North Carolina ch); Gedas Bertasius (UNC Chapel Hill)*
1086 Unsupervised and Semi-supervised Bias Benchmarking in Face Recognition Siqi Deng (Amazon)*; Alexandra Chouldechova (CMU); Yongxin Wang (Amazon); Wei Xia (Amazon); Pietro Perona (California Institute of Technology)
1100 Domain Adaptive Hand Keypoint and Pixel Localization in the Wild Takehiko Ohkawa (The University of Tokyo)*; Yu-Jhe Li (Carnegie Mellon University); Qichen Fu (Carnegie Mellon University); Ryosuke Furuta (The University of Tokyo); Kris Kitani (Carnegie Mellon University); Yoichi Sato (University of Tokyo)
1103 Skeleton-free Pose Transfer for Stylized 3D Characters Zhouyingcheng Liao (Saarland University)*; Jimei Yang (Adobe); Jun Saito (Adobe); Gerard Pons-Moll (University of Tübingen); Yang Zhou (Adobe Research)
1105 Differentiable Raycasting for Self-supervised Occupancy Forecasting Tarasha Khurana (Carnegie Mellon University)*; Peiyun Hu (Carnegie Mellon University); Achal D Dave (Amazon); Jason P Ziglar (Argo AI); David Held (); Deva Ramanan (Carnegie Mellon University)
1109 InAction: Interpretable Action Decision Making for Autonomous Driving Taotao Jing (Tulane University)*; Haifeng Xia (Tulane University); Renran Tian (Indiana University-Purdue University Indianapolis); Haoran Ding (IUPUI); Xiao Luo (IUPUI); Joshua E Domeyer (Toyota Motor North America); Rini Sherony (Toyota CSRC); Zhengming Ding (Tulane University)
1114 CramNet: Camera-Radar Fusion with Ray-Constrained Cross-Attention for Robust 3D Object Detection Jyh-Jing Hwang (Waymo)*; Henrik Kretzschmar (Waymo); Joshua M Manela (Waymo); Sean Rafferty (Waymo); Nicholas Armstrong-Crews (Waymo); Tiffany Chen (Waymo); Dragomir Anguelov (Waymo)
1118 CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video Wei Lin (Graz University of Technology)*; Anna Kukleva (MPII); Kunyang Sun (Southeast University); Horst Possegger (Graz University of Technology); Hilde Kuehne (University of Frankfurt); Horst Bischof (Graz University of Technology)
1119 Latent Discriminant deterministic Uncertainty Gianni Franchi (ENSTA Paris)*; Xuanlong Yu (ENSTA Paris); Andrei Bursuc (valeo.ai); Emanuel Aldea (Paris-Saclay University); Severine Dubuisson (Aix-Marseille University); David Filliat (ENSTA Paris)
1129 Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation Pengfei Guo (Johns Hopkins University)*; Dong Yang (NVIDIA Corporation); Ali Hatamizadeh (NVIDIA Corporation); An Xu (University of Pittsburgh); Ziyue Xu (NVIDIA); Wenqi Li (NVIDIA); Can Zhao (Nvidia); Daguang Xu (NVIDIA Corporation); Stephanie Anne Harmon (National Cancer Institute); Evrim Turkbey (NIH); Baris Turkbey (National Cancer Institute); Bradford J Wood (National Institutes of Health); Francesca Patella (ASST Santi Paolo e Carlo); Elvira Stellato (University of Milan); Gianpaolo Carrafiello (University of Milan); Vishal Patel (Johns Hopkins University); Holger R Roth (NVIDIA)
1135 Image-based CLIP-Guided Essence Transfer Hila Chefer (Tel Aviv University)*; Sagie Benaim (University of Copenhagen); Roni Paiss (Tel Aviv University, Google); Lior Wolf (Tel Aviv University, Israel)
1136 Prune Your Model Before Distill It JinHyuk Park (Hongik University); Albert No (Hongik University)*
1155 S2N: Suppression-Strengthen Network for Event-based Recognition under Variant Illuminations Zengyu Wan (University of Science and Technology of China)*; Yang Wang (University of Science and Technology of China); Ganchao Tan (University of Science and Technology of China); Yang Cao (University of Science and Technology of China); Zheng-Jun Zha (University of Science and Technology of China)
1159 MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval Yuying Ge (The University of Hong Kong)*; Yixiao Ge (Tencent); Xihui Liu (UC Berkeley); Jinpeng Wang (National University of Singapore); Jianping Wu (Tsinghua University); Ying Shan (Tencent); Xiaohu Qie (Tencent); Ping Luo (The University of Hong Kong)
1161 PASS: Part-Aware Self-Supervised Pre-Training for Person Re-Identification Kuan Zhu (Institute of Automation, Chinese Academy of Sciences)*; Haiyun Guo (CASIA); Tianyi Yan (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences); Yousong Zhu (Institute of Automation, Chinese Academy of Sciences); Jinqiao Wang (Institute of Automation, Chinese Academy of Sciences); Ming Tang (Institute of Automation, Chinese Academy of Sciences)
1165 RegionCL: Exploring Contrastive Region Pairs for Self-supervised Representation Learning Yufei Xu (University of Sydney)*; Qiming Zhang (The University of Sydney); Jing Zhang (The University of Sydney); Dacheng Tao (JD.com)
1174 Towards Data-Efficient Detection Transformers Wen Wang (University of Science and Technology of China)*; Jing Zhang (The University of Sydney); Yang Cao (University of Science and Technology of China); Yongliang Shen (Zhejiang University); Dacheng Tao (JD.com)
1175 Label2Label: A Language Modeling Framework for Multi-Attribute Learning Wanhua Li (Tsinghua University); Zhexuan Cao (Tsinghua University); Jianjiang Feng (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)*
1179 Anti-Retroactive Interference for Lifelong Learning Runqi Wang (Beihang University); Yuxiang Bao (Beihang University); Baochang Zhang (Beihang University)*; Jianzhuang Liu (Huawei Noah’s Ark Lab); Wentao Zhu (Amazon); Guodong Guo (IDL, Baidu Research)
1181 Emotion Recognition for Multiple Context Awareness Dingkang Yang (Fudan University); Shuai Huang (Fudan University); Shunli Wang (Fudan University); Yang Liu (Fudan University); Peng Zhai (Fudan University); Liuzhen Su (Fudan University); Mingcheng Li (Fudan University); Lihua Zhang (Fudan University)*
1182 Box-supervised Instance Segmentation with Level Set Evolution Wentong Li (Zhejiang University); Wenyu Liu (Zhejiang University); Jianke Zhu (Zhejiang University)*; Miaomiao Cui (Alibaba-inc); Xian-Sheng Hua (Damo Academy, Alibaba Group); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China)
1197 mc-BEiT: Multi-choice Discretization for Image BERT Pre-training Xiaotong Li (Peking University)*; Yixiao Ge (Tencent); Kun Yi (Nanjing University); Zixuan Hu (Peking University); Ying Shan (Tencent); Lingyu Duan (Peking University)
1198 Adaptive Cross-Domain Learning for Generalizable Person Re-Identification Pengyi Zhang (Zhejiang University)*; Huanzhang Dou (Zhejiang University); Yunlong Yu (Zhejiang University); Xi Li (Zhejiang University)
1202 MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition Huanzhang Dou (Zhejiang University)*; Pengyi Zhang (Zhejiang University); Wei Su (Zhejiang University); Yunlong Yu (Zhejiang University); Xi Li (Zhejiang University)
1203 Bootstrapped Masked Autoencoders for Vision BERT Pretraining Xiaoyi Dong (University of Science and Technology of China)*; Jianmin Bao (Microsoft Research Asia); Ting Zhang (MSRA); Dongdong Chen (Microsoft Cloud AI); Weiming Zhang (University of Science and Technology of China); Lu Yuan (Microsoft); Dong Chen (Microsoft Research Asia); Fang Wen (Microsoft Research Asia); Nenghai Yu (University of Science and Technology of China)
1209 Masked Discrimination for Self-Supervised Learning on Point Clouds Haotian Liu (University of Wisconsin-Madison)*; Mu Cai (University of Wisconsin-Madison); Yong Jae Lee (University of Wisconsin-Madison)
1214 GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval Yuxuan Wang (National University of Singapore); Difei Gao (NUS); Licheng Yu (Facebook); Stan Weixian Lei (National University of Singapore); Matt Feiszli (Facebook Research); Mike Zheng Shou (National University of Singapore)*
1225 FAST-VQA: Efficient End-to-end Video Quality Assessment with Fragment Sampling Haoning Wu (Nanyang Technological University)*; Chaofeng Chen (Nanyang Technological University); Jingwen Hou (Nanyang Technological University); Liang Liao (Nanyang Technological University); Annan Wang (Nanyang Technological University); Wenxiu Sun (SenseTime Research and Tetras.AI); Qiong Yan (SenseTime Group Limited); Weisi Lin (Nanyang Technological University, Singapore)
1235 Learning to train a point cloud reconstruction network without matching Tianxin Huang (Zhejiang University)*; Xuemeng Yang (Zhejiang University); Jiangning Zhang (Zhejiang University); Jinhao Cui (Zhejiang University); Hao Zou (Zhejiang University); Jun Chen (Zhejiang University); Xiangrui Zhao (Zhejiang University); Yong Liu (Zhejiang University)
1243 Long-Tailed Class Incremental Learning Xialei Liu (Nankai University)*; Yusong Hu (Nankai University); Xu-Sheng Cao (Nankai University); Andy Bagdanov (University of Florence, Italy); Ke Li (Tencent); Ming-Ming Cheng (Nankai University)
1247 CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving Kaican Li (Huawei Noah’s Ark Lab)*; Kai Chen (HKUST); Haoyu Wang (Purdue University); Lanqing Hong (Huawei Noah’s Ark Lab); Chaoqiang Ye (Huawei); Jianhua Han (Huawei Noah’s Ark Lab); Yukuai Chen (Huawei Intelligent Automotive Solution BU); Wei Zhang (Noah’s Ark Lab, Huawei Technologies); Chunjing Xu (Huawei Noah’s Ark Lab); Dit-Yan Yeung (HKUST); Xiaodan Liang (Sun Yat-sen University); Zhenguo Li (Huawei Noah’s Ark Lab); Hang Xu (Huawei Noah’s Ark Lab)
1253 CMT: Context-Matching-Guided Transformer for 3D Tracking in Point Clouds Zhiyang Guo (University of Science and Technology of China)*; Yunyao Mao (University of Science and Technology of China); Wengang Zhou (University of Science and Technology of China); Min Wang (Institute of Artificial Intelligence, Hefei Comprehensive National Science Center); Houqiang Li (University of Science and Technology of China)
1257 Motion Inspired Unsupervised Perception and Prediction in Autonomous Driving Mahyar Najibi (Waymo LLC); Jingwei Ji (Waymo); Yin Zhou (Waymo)*; Charles R. Qi (Waymo); Xinchen Yan (Waymo); Scott Ettinger (Waymo); Dragomir Anguelov (Waymo)
1259 Unitail: Detecting, Reading, and Matching in Retail Scene Fangyi Chen (Carnegie Mellon University)*; Han Zhang (CMU); Zaiwang Li (Pitt); Jiachen Dou (Carnegie Mellon University); Shentong Mo (Carnegie Mellon University); Hao Chen (Carnegie Mellon University); Yong-Xin Zhang (Tsinghua University); Uzair Ahmed (Carnegie Mellon University); Chenchen Zhu (Meta AI); Marios Savvides (Carnegie Mellon University)
1275 DODA: Data-oriented Sim-to-Real Domain Adaptation for 3D Semantic Segmentation Runyu Ding (The University of Hong Kong)*; Jihan Yang (The University of Hong Kong); Li Jiang (Max Planck Institute for Informatics); Xiaojuan Qi (The University of Hong Kong)
1277 Learning to Drive by Watching YouTube Videos: Action-Conditioned Contrastive Policy Pretraining Qihang Zhang (Chinese University of Hong Kong); Zhenghao Peng (Chinese University of Hong Kong); Bolei Zhou (UCLA)*
1278 Multi-Curve Translator for High-Resolution Photorealistic Image Translation Yuda Song (Zhejiang University); Hui Qian (Zhejiang University); Xin Du (Zhejiang University)*
1280 Dynamic Metric Learning with Cross-Level Concept Distillation Wenzhao Zheng (Tsinghua University)*; Yuanhui Huang (Tsinghua University); Borui Zhang (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)
1287 Deep Bayesian Video Frame Interpolation Zhiyang Yu (Harbin Institute of Technology)*; Yu Zhang (Beihang University); Xujie Xiang (Beihang University); Dongqing Zou (SenseTime Research; Qing Yuan Research Institute, Shanghai Jiao Tong University); Xijun Chen (Harbin Institute of Technology); Jimmy Ren (SenseTime Research; Qing Yuan Research Institute, Shanghai Jiao Tong University)
1300 PanoFormer: Panorama Transformer for Indoor 360° Depth Estimation Zhijie Shen (Beijing Jiaotong University); Chunyu Lin (Beijing Jiaotong University)*; Kang Liao (Beijing Jiaotong University); Lang Nie (Beijing Jiaotong University); Zishuo Zheng (Beijing Jiaotong University); Yao Zhao (Beijing Jiaotong University)
1312 Cross Attention Based Style Distribution for Controllable Person Image Synthesis Xinyue Zhou (East China Normal University); Mingyu Yin (East China Normal University); Xinyuan Chen (Shanghai AI Laboratory); Li Sun (East China Normal University)*; Changxin Gao (Huazhong University of Science and Technology); Qingli Li (East China Normal University)
1315 Generative Meta-Adversarial Network for Unseen Object Navigation Sixian Zhang (ICT, China Academy of Science)*; Weijie Li (ICT, China Academy of Sciences); Xinhang Song (ICT); Yubing Bai (ICT, China Academy of Science); Shuqiang Jiang (ICT, China Academy of Science)
1316 Unsupervised Visual Representation Learning by Synchronous Momentum Grouping Bo Pang (Shanghai Jiao Tong University)*; Yifan Zhang (Shanghai Jiao Tong University); Yaoyi Li (Huawei); Jia Cai (Huawei); Cewu Lu (Shanghai Jiao Tong University)
1317 OSFormer: One-Stage Camouflaged Instance Segmentation with Transformers Jialun Pei (Huazhong University of Science and Technology); Tianyang Cheng (Huazhong University of Science and Technology); Deng-Ping Fan (ETH Zurich)*; He Tang (Huazhong University of Science and Technology); Chuanbo Chen (Huazhong University of Science and Technology); Luc Van Gool (ETH Zürich)
1321 Highly Accurate Dichotomous Image Segmentation Xuebin Qin (University of Alberta); Hang Dai (Mohamed bin Zayed University of Artificial Intelligence); Xiaobin Hu (Technische Universität München); Deng-Ping Fan (ETH Zurich)*; Ling Shao (Terminus Group); Luc Van Gool (ETH Zurich)
1322 KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints Marko Mihajlovic (ETH Zurich)*; Aayush Bansal (Carnegie Mellon University); Michael Zollhöfer (Facebook Reality Labs); Siyu Tang (ETH Zurich); Shunsuke Saito (Facebook)
1326 MENet: a Memory-Based Network with Dual-Branch for Efficient Event Stream Processing Linhui Sun (CASIA)*; Yifan Zhang (Institute of Automation, Chinese Academy of Sciences); Ke Cheng (Institute of Automation, Chinese Academy of Sciences); Jian Cheng (Chinese Academy of Sciences, China); Hanqing Lu (NLPR, Institute of Automation, CAS)
1330 Making Heads or Tails: Towards Semantically Consistent Visual Counterfactuals Simon Vandenhende (KU Leuven)*; Dhruv Mahajan (Facebook); Filip Radenovic (Facebook AI); Deepti Ghadiyaram (Facebook)
1331 LEDNet: Joint Low-light Enhancement and Deblurring in the Dark Shangchen Zhou (Nanyang Technological University)*; Chongyi Li (Nanyang Technological University); Chen Change Loy (Nanyang Technological University)
1336 RC-MVSNet: Unsupervised Multi-View Stereo with Neural Rendering Di Chang (Technical University of Munich)*; Aljaz Bozic (Technical University Munich); Tong Zhang (EPFL); Qingsong Yan (Hong Kong University of Science and Technology); Yingcong Chen (Hong Kong University of Science and Technology); Sabine Süsstrunk (EPFL); Matthias Niessner (Technical University of Munich)
1342 StretchBEV: Stretching Future Instance Prediction Spatially and Temporally Kaan Adil Akan (Koc University); Fatma Guney (Koc University)*
1344 AgeTransGAN for Facial Age Transformation with Rectified Performance Metrics Gee-Sern Hsu (National Taiwan University of Science and Technology)*; Rui-Cang Xie (National Taiwan University of Science and Technology); Zhi-Ting Chen (National Taiwan University of Science and Technology); Yu-Hong Lin (National Taiwan University of Science and Technology)
1346 Boosting Supervised Dehazing Methods via Bi-level Patch Reweighting Xingyu Jiang (Beihang); Hongkun Dou (Beihang University); Chengwei Fu (Beihang); Bingquan Dai (Beihang); Tianrun Xu (North China University of Technology); Yue Deng (Samsung Research America)*
1347 Detecting and Recovering Sequential DeepFake Manipulation Rui Shao (Nanyang Technological University)*; Tianxing Wu (Nanyang Technological University); Ziwei Liu (Nanyang Technological University)
1353 MTFormer: Multi-Task Learning via Transformer and Cross-Task Reasoning Xiaogang Xu (The Chinese University of Hong Kong)*; Hengshuang Zhao (University of Oxford); Vibhav Vineet (Microsoft Research); Ser-Nam Lim (Meta AI); Antonio Torralba (MIT)
1356 Prediction-Guided Distillation for Dense Object Detection Chenhongyi Yang (University of Edinburgh)*; Mateusz Ochal (Heriot Watt University); Amos Storkey (U Edinburgh); Elliot J Crowley (University of Edinburgh)
1358 Towards Generic 3D Tracking in RGBD Videos: Benchmark and Baseline Jinyu Yang (Southern University of Science and Technology)*; Zhongqun Zhang (University of Birmingham); Zhe Li (SUSTech); Hyung Jin Chang (University of Birmingham); Ales Leonardis (University of Birmingham); Feng Zheng (SUSTech)
1364 C3P: Cross-domain Pose Prior Propagation for Weakly Supervised 3D Human Pose Estimation Cunlin Wu (Huazhong University of Science and Technology); Yang Xiao (Huazhong Univ. of Sci.&Tech.)*; Boshen Zhang (Tencent); Mingyang Zhang (Huazhong Univ. of Sci.&Tech.); Zhiguo Cao (Huazhong Univ. of Sci.&Tech.); Joey Tianyi Zhou (A*STAR Centre for Frontier AI Research (CFAR))
1366 Adaptive Fine-Grained Sketch-Based Image Retrieval Ayan Kumar Bhunia (University of Surrey)*; Aneeshan Sain (University of Surrey); Parth Hiren Shah (Indian Institute of Technology Guwahati); Animesh Gupta (Thapar University); Pinaki Nath Chowdhury (University of Surrey); Tao Xiang (University of Surrey); Yi-Zhe Song (University of Surrey)
1376 Learning Ego 3D Representation as Ray Tracing Jiachen Lu (Fudan University); Zheyuan Zhou (Fudan University); Xiatian Zhu (University of Surrey); Hang Xu (Huawei Noah’s Ark Lab); Li Zhang (Fudan University)*
1380 Accelerating Score-based Generative Models with Preconditioned Diffusion Sampling Hengyuan Ma (Fudan University); Li Zhang (Fudan University)*; Xiatian Zhu (University of Surrey); Jianfeng Feng (Fudan University)
1382 RCLane: Relay Chain Prediction for Lane Detection Shenghua Xu (Fudan University); Xinyue Cai (Huawei Noah’s Ark Lab); Bin Zhao (Fudan University); Li Zhang (Fudan University)*; Hang Xu (Huawei Noah’s Ark Lab); Yanwei Fu (Fudan University); Xiangyang Xue (Fudan University)
1394 Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding Hao Wen (Tsinghua University); Yunze Liu (Tsinghua University)*; Jingwei Huang (Huawei); Bo Duan (Huawei); Li Yi (Tsinghua University)
1395 Towards Efficient Adversarial Training on Vision Transformers Boxi Wu (Zhejiang University)*; Jindong Gu (University of Munich); Zhifeng Li (Tencent AI Lab); Deng Cai (ZJU); Xiaofei He (Zhejiang University); Wei Liu (Tencent)
1397 Adaptive Agent Transformer for Few-shot Segmentation Yuan Wang (University of Science and Technology of China)*; Rui Sun (University of Science and Technology of China); Zhe Zhang (Lunar Exploration and Space Engineering Center of CNSA); Tianzhu Zhang (University of Science and Technology of China)
1408 Improving Few-Shot Part Segmentation using Coarse Supervision Oindrila Saha (University of Massachusetts Amherst)*; Zezhou Cheng (University of Massachusetts, Amherst); Subhransu Maji (University of Massachusetts, Amherst)
1412 Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation Guolei Sun (ETH Zurich); Yun Liu (ETH Zurich)*; Hao Tang (ETH Zurich); Ajad Chhatkuli (ETH Zurich); Le Zhang (University of Electronic Science and Technology of China); Luc Van Gool (ETH Zurich)
1414 Out-of-distribution Detection with Boundary Aware Learning Sen Pei (Institute of Automation, Chinese Academy of Sciences)*; Xin Zhang (Institute of Automation, Chinese Academy of Sciences, University of Chinese Academy of Sciences); Bin Fan (University of Science and Technology Beijing); Gaofeng Meng (Chinese Academy of Sciences)
1415 NeILF: Neural Incident Light Field for Physically-based Material Estimation Yao Yao (Apple Inc.); Jingyang Zhang (The Hong Kong University of Science and Technology)*; Jingbo Liu (Apple Inc.); Yihang Qu (Apple Inc.); Tian Fang (Apple); David N McKinnon (Apple); Yanghai Tsin (Apple Inc); Long Quan (Apple)
1417 ViewFormer: NeRF-free Neural Rendering from Few Images Using Transformers Jonáš Kulhánek (Czech Technical University in Prague)*; Erik Derner (CTU CIIRC); Torsten Sattler (Czech Technical University in Prague); Robert Babuska (TU Delft)
1421 L-Tracing: Fast Light Visibility Estimation on Neural Surfaces by Sphere Tracing Ziyu Chen (Shanghai Jiao Tong University)*; Chenjing Ding (Sensetime Group Limited); Jianfei Guo (Shanghai AI Laboratory); Dongliang Wang (SenseTime Group Limited); Yikang Li (Shanghai AI Lab); Xuan Xiao (SenseTime Group Limited); Wei Wu (SenseTime Group Limited); Li Song (Shanghai Jiao Tong University)
1424 ARF: Artistic Radiance Fields Kai Zhang (Cornell University)*; Nicholas I Kolkin (Adobe Research); Sai Bi (Adobe Research); Fujun Luan (Adobe Research); Zexiang Xu (Adobe Research); Eli Shechtman (Adobe Research, US); Noah Snavely (Cornell University and Google AI)
1425 Multiview Stereo with Cascaded Epipolar RAFT Zeyu Ma (Princeton University)*; Zachary Teed (Princeton University); Jia Deng (Princeton University)
1439 What to Hide from Your Students: Attention-Guided Masked Image Modeling Ioannis Kakogeorgiou (National Technical University of Athens)*; Spyros Gidaris (valeo.ai); Bill Psomas (National Technical University of Athens); Yannis Avrithis (IARAI, Athena RC); Andrei Bursuc (valeo.ai); Konstantinos Karantzalos (National Technical University of Athens); Nikos Komodakis (University of Crete)
1441 Static and Dynamic Concepts for Self-supervised Video Representation Learning Rui Qian (The Chinese University of Hong Kong)*; Shuangrui Ding (Shanghai Jiao Tong University); Xian Liu (The Chinese University of Hong Kong); Dahua Lin (The Chinese University of Hong Kong)
1447 Deep Partial Updating: Towards Communication Efficient Updating for On-device Inference Zhongnan Qu (ETH Zurich)*; Cong Liu (University of Texas at Dallas); Lothar Thiele (ETH Zürich)
1455 Gradient-based Uncertainty for Monocular Depth Estimation Julia Hornauer (Ulm University)*; Vasileios Belagiannis (Otto von Guericke University Magdeburg)
1456 Flow-Guided Transformer for Video Inpainting Kaidong Zhang (University of Science and Technology of China); Jingjing Fu (Microsoft)*; Dong Liu (University of Science and Technology of China)
1468 Relationformer: A Unified Framework for Image-to-Graph Generation Suprosanna Shit (TUM)*; Rajat Koner (Ludwig Maximilian University of Munich); Bastian Wittmann (Technical University of Munich); Johannes C. Paetzold (TUM); Ivan Ezhov (TUM); Hongwei Li (Technical University of Munich); Jiazhen Pan (Technical University of Munich); Sahand Sharifzadeh (Ludwig Maximilian University of Munich); Georgios Kaissis (Technische Universität München); Volker Tresp (LMU); Bjoern Menze (TUM)
1469 ARAH: Animatable Volume Rendering of Articulated Human SDFs Shaofei Wang (ETH Zurich)*; Katja Schwarz (MPI Tuebingen); Andreas Geiger (University of Tuebingen); Siyu Tang (ETH Zurich)
1471 Learning Hierarchy Aware Features for Reducing Mistake Severity Ashima Garg (IIIT Delhi)*; Depanshu Sani (Indraprastha Institute of Information Technology); Saket Anand (Indraprastha Institute of Information Technology Delhi)
1474 Exploiting Unlabeled Data with Vision and Language Models for Object Detection Shiyu Zhao (Rutgers University)*; Zhixing Zhang (Rutgers University); Samuel Schulter (NEC Laboratories America); Long Zhao (Google Research); Vijay Kumar B G (NEC Laboratories America); Anastasis Stathopoulos (Rutgers University); Manmohan Chandraker (UC San Diego); Dimitris N. Metaxas (Rutgers)
1479 A Simple and Robust Correlation Filtering method for text-based person search Wei Suo (Northwestern Polytechnical University); MengYang Sun (Northwestern Polytechnical University); Kai Niu (Northwestern Polytechnical University); Yiqi Gao (Northwestern Polytechnical University); Peng Wang (Northwestern Polytechnical University); Yanning Zhang (Northwestern Polytechnical University)*; Qi Wu (University of Adelaide)
1482 Hunting Group Clues with Transformers for Social Group Activity Recognition Masato Tamura (Hitachi America, Ltd.)*; Rahul Vishwakarma (Hitachi America Ltd.); Ravigopal Vennelakanti (Hitachi America, Ltd.)
1493 Quantized GAN for Complex Music Generation from Dance Videos Ye Zhu (Illinois Institute of Technology)*; Kyle B Olszewski (Snap Inc.); Yu Wu (Princeton University); Panos Achlioptas (Stanford University); Menglei Chai (Snap Inc.); Yan Yan (Illinois Institute of Technology); Sergey Tulyakov (Snap Inc)
1506 Not Just Streaks: Towards Ground Truth for Single Image Deraining Yunhao Ba (UCLA)*; Howard Zhang (UCLA); Ethan Yang (UCLA); Akira Suzuki (UCLA); Arnold J Pfahnl (University of California, Los Angeles); Chethan Chinder Chandrappa (University of California – Los Angeles); Celso de Melo (Army Research Laboratory); Suya You (US Army Research Laboratory); Stefano Soatto (UCLA); Alex Wong (Yale University); Achuta Kadambi (UCLA)
1511 HIVE: Evaluating the Human Interpretability of Visual Explanations Sunnie S. Y. Kim (Princeton University)*; Nicole Meister (Princeton University); Vikram V. Ramaswamy (Princeton University); Ruth C Fong (Princeton University); Olga Russakovsky (Princeton University)
1512 GAMa: Cross-view Video Geo-localization Shruti Vyas (University of Central Florida)*; Chen Chen (University of Central Florida); Mubarak Shah (University of Central Florida)
1516 Meta-Sampler: Almost-Universal yet Task-Oriented Sampling for Point Clouds Ta-Ying Cheng (University of Oxford); Qingyong Hu (University of Oxford)*; Qian Xie (University of Oxford); Niki Trigoni (University of Oxford); Andrew Markham (University of Oxford)
1517 Multi-Query Video Retrieval Zeyu Wang (Princeton University)*; Yu Wu (Princeton University); Karthik Narasimhan (Princeton University); Olga Russakovsky (Princeton University)
1525 Waymo Open Dataset: Panoramic Video Panoptic Segmentation Jieru Mei (Johns Hopkins University); Alex Zhu (Waymo)*; Xinchen Yan (Waymo); Hang Yan (Waymo LLC); Siyuan Qiao (Google); Yukun Zhu (Google Inc.); Liang-Chieh Chen (Google Inc.); Henrik Kretzschmar (Waymo)
1531 MIME: Minority Inclusion for Majority Group Enhancement of AI Performance Pradyumna Chari (UCLA); Yunhao Ba (UCLA)*; Shreeram Athreya (UCLA); Achuta Kadambi (UCLA)
1534 Self-supervised Human Mesh Recovery with Cross-Representation Alignment Xuan Gong (University at Buffalo); Meng Zheng (United Imaging Intelligence); Benjamin Planche (United Imaging Intelligence); Srikrishna Karanam (Adobe Research); Terrence Chen (United Imaging Intelligence); David Doermann (University at Buffalo); Ziyan Wu (United Imaging Intelligence)*
1541 TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency Medhini Narasimhan (UC Berkeley)*; Arsha Nagrani (Google); Chen Sun (Brown University); Michael Rubinstein (Google); Trevor Darrell (UC Berkeley); Anna Rohrbach (UC Berkeley); Cordelia Schmid (Google)
1542 A Perceptual Quality Metric for Video Frame Interpolation Qiqi Hou (Portland State University)*; Abhijay Ghildyal (Portland State University); Feng Liu (Portland State University)
1543 Adaptive Feature Interpolation for Low-Shot Image Generation Mengyu Dai (Microsoft Corporation)*; Haibin Hang (Amazon.com); Xiaoyang Guo (Facebook)
1544 Rethinking Learning Approaches for Long-Term Action Anticipation Megha Nawhal (Simon Fraser University)*; Akash Abdu Jyothi (Simon Fraser University); Greg Mori (Simon Fraser University / Borealis AI)
1546 Object Manipulation via Visual Target Localization Kiana Ehsani (Allen Institute for Artificial Intelligence)*; Ali Farhadi (University of Washington, Apple); Aniruddha Kembhavi (Allen Institute for Artificial Intelligence); Roozbeh Mottaghi (Allen Institute for AI)
1549 AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction Zerui Chen (Inria Paris); Yana Hasson (Inria); Cordelia Schmid (Inria/Google)*; Ivan Laptev (INRIA Paris)
1551 Shift-tolerant Perceptual Similarity Metric Abhijay Ghildyal (Portland State University)*; Feng Liu (Portland State University)
1557 Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing Benedikt Boecking (Carnegie Mellon University); Naoto Usuyama (Microsoft Research); Shruthi J Bannur (Microsoft Research); Daniel Coelho de Castro (Microsoft Research); Anton Schwaighofer (Microsoft Research); Stephanie Hyland (Microsoft Research); Maria Teodora A Wetscherek (Microsoft); Tristan Naumann (Microsoft Research Redmond, US); Aditya Nori (Microsoft Research); Javier Alvarez-Valle (Microsoft Research); Hoifung Poon (Microsoft Research); Ozan Oktay (Microsoft Research)*
1561 Self-Supervised Sparse Representation for Video Anomaly Detection Jhih-Ciang Wu (Academia Sinica)*; He-Yen Hsieh (Academia Sinica); Ding-Jie Chen (Academia Sinica); Chiou-Shann Fuh (National Taiwan University); Tyng-Luh Liu (Academia Sinica)
1567 CPO: Change Robust Panorama to Point Cloud Localization Junho Kim (Seoul National University)*; Hojun Jang (Seoul National University); Changwoon Choi (Seoul National University); Young Min Kim (Seoul National University)
1569 MonoPLFlowNet: Permutohedral Lattice FlowNet for Real-Scale 3D Scene Flow Estimation with Monocular Images Runfa Li (UC San Diego)*; Truong Nguyen (UC San Diego)
1576 DLCFT: Deep Linear Continual Fine-Tuning for General Incremental Learning Hyounguk Shon (KAIST)*; Janghyeon Lee (LG AI Research); Seung Hwan Kim (LG AI Research); Junmo Kim (KAIST)
1578 Contrastive Positive Mining for Unsupervised 3D Action Representation Learning Haoyuan Zhang (Tianjin University)*; Yonghong Hou (Tianjin University); Wenjing Zhang (Tianjin University); Wanqing Li (University of Wollongong)
1580 Patch Similarity Aware Data-Free Quantization for Vision Transformers Zhikai Li (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences); Liping Ma (Institute of Automation, Chinese Academy of Sciences); Mengjuan Chen (Center of Precision Sensing and Control, Institute of Automation, Chinese Academy of Sciences); Junrui Xiao (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences); Qingyi Gu (Institute of Automation, Chinese Academy of Sciences)*
1586 Perception-Distortion Balanced ADMM Optimization for Single-Image Super-Resolution Yuehan Zhang (National University of Singapore)*; Bo Ji (National University of Singapore); Jia Hao (HiSilicon (Shanghai) Technologies Co., Ltd); Angela Yao (National University of Singapore)
1596 DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition Yuxuan Liang (National University of Singapore)*; Pan Zhou (Sea AI Lab); Roger Zimmermann (NUS); Shuicheng Yan (Sea AI Labs)
1606 Hierarchical Contrastive Inconsistency Learning for Deepfake Video Detection Zhihao Gu (Shanghai Jiao Tong University)*; Taiping Yao (Tencent YouTu); Yang Chen (Tencent); Shouhong Ding (Tencent); Lizhuang Ma (Shanghai Jiao Tong University)
1616 Watermark Vaccine: Adversarial Attacks to Prevent Watermark Removal Xinwei Liu (Institute of Information Engineering, Chinese Academy of Sciences)*; Jian Liu (Ant Group); Yang Bai (Tsinghua); Jindong Gu (University of Munich); Tao Chen (Ant Group); Xiaojun Jia (Institute of Information Engineering, Chinese Academy of Sciences); Xiaochun Cao (Sun Yat-sen University)
1625 ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO Sanghyuk Chun (NAVER AI Lab)*; Wonjae Kim (NAVER AI Lab); Song Park (NAVER AI Lab); Minsuk Chang (NAVER AI Lab); Seong Joon Oh (Naver AI Lab)
1626 Personalizing Federated Medical Image Segmentation via Local Calibration Jiacheng Wang (Xiamen University); Yueming Jin (The Chinese University of Hong Kong); Liansheng Wang (Xiamen University)*
1628 Learning to Detect Every Thing in an Open World Kuniaki Saito (Boston University)*; Ping Hu (Boston University); Trevor Darrell (UC Berkeley); Kate Saenko (Boston University)
1648 MVP: Multimodality-guided Visual Pre-training Longhui Wei (University of Science and Technology of China)*; Lingxi Xie (Huawei Inc.); Wengang Zhou (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China); Qi Tian (Huawei Cloud & AI)
1649 Uncertainty Learning in Kernel Estimation for Multi-Stage Blind Image Super-Resolution Zhenxuan Fang (Xidian University); Weisheng Dong (Xidian University)*; Xin Li (West Virginia University); Jinjian Wu (Xidian University); Leida Li (Xidian University); Guangming Shi (Xidian University)
1666 Physical Attack on Monocular Depth Estimation in Autonomous Driving with Optimal Adversarial Patches Zhiyuan Cheng (Purdue University)*; James C Liang (Rochester Institute of Technology); Hongjun Choi (Purdue University); Guanhong Tao (Purdue University); Zhiwen Cao (Purdue University); Dongfang Liu (Rochester Institute of Technology); Xiangyu Zhang (Purdue University)
1670 KVT: k-NN Attention for Boosting Vision Transformers Pichao Wang (Alibaba Group)*; Xue Wang (Alibaba DAMO Academy); Fan Wang (Alibaba Group); Ming Lin (Alibaba Group); Shuning Chang (Alibaba Group); Hao Li (Alibaba Group); Rong Jin (Alibaba Group)
1673 Locally Varying Distance Transform for Unsupervised Visual Anomaly Detection Wen-Yan Lin (SMU); Zhonghang Liu (SMU); Siying Liu (I2R Singapore)*
1676 Hierarchical Feature Alignment Network for Unsupervised Video Object Segmentation Gensheng Pei (Nanjing University of Science and Technology)*; Fumin Shen (UESTC); Yazhou Yao (Nanjing University of Science and Technology); Guo-Sen Xie (Nanjing University of Science and Technology); Zhenmin Tang (Nanjing University of Science and Technology); Jinhui Tang (Nanjing University of Science and Technology)
1677 PalGAN: Image Colorization with Palette Generative Adversarial Networks Yi Wang (Shanghai AI Laboratory)*; Menghan Xia (Tencent AI lab); Lu Qi (The Chinese University of Hong Kong); Jing Shao (Sensetime); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)
1687 Fast-Vid2Vid: Spatial-Temporal Compression for Video-to-Video Synthesis Long Zhuo (Shanghai AI Lab)*; Guangcong Wang (Nanyang Technological University); Shikai Li (SenseTime Research); Wayne Wu (SenseTime Research); Ziwei Liu (Nanyang Technological University)
1693 Generative Negative Text Replay for Continual Vision-Language Pretraining Shipeng Yan (ShanghaiTech University)*; Lanqing Hong (Huawei Noah’s Ark Lab); Hang Xu (Huawei Noah’s Ark Lab); Jianhua Han (Huawei Noah’s Ark Lab); Tinne Tuytelaars (KU Leuven); Zhenguo Li (Huawei Noah’s Ark Lab); Xuming He (ShanghaiTech University)
1697 Learning Spatio-Temporal Downsampling for Effective Video Upscaling Xiaoyu Xiang (Meta Platforms Inc.)*; Yapeng Tian (University of Texas at Dallas); Vijay Rengarajan (Meta Platforms Inc.); Lucas D Young (Facebook); Bo Zhu (Meta Platforms, Inc.); Rakesh Ranjan (Facebook)
1698 Geometric Representation Learning for Document Image Rectification Hao Feng (University of Science and Technology of China)*; Wengang Zhou (University of Science and Technology of China); Jiajun Deng (University of Science and Technology of China); Yuechen Wang (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China)
1701 ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer Hongkai Chen (HKUST)*; Zixin Luo (Apple Inc.); Lei Zhou (Apple); Yurun Tian (Apple); Zhen Mingmin (Apple Inc.); Tian Fang (Apple); David N McKinnon (Apple); Yanghai Tsin (Apple Inc); Long Quan (Apple)
1709 Egocentric Activity Recognition and Localization on a 3D Map Miao Liu (Georgia Institute of Technology)*; Lingni Ma (Facebook Reality Labs); Kiran Somasundaram (Facebook Reality Labs); Yin Li (University of Wisconsin-Madison); Kristen Grauman (Facebook AI Research & UT Austin); James Rehg (Georgia Institute of Technology); Chao Li (Facebook Reality Labs)
1710 Generative Adversarial Network for Future Hand Segmentation from Egocentric Video Wenqi Jia (Georgia Institute of Technology)*; Miao Liu (Georgia Institute of Technology); James Rehg (Georgia Institute of Technology)
1712 One-Shot Medical Landmark Localization by Edge-Guided Transform and Noisy Landmark Refinement Zihao Yin (Center for Data Science, Peking University); Ping Gong (Deepwise AI Lab); Chunyu Wang (Microsoft Research Asia); Yizhou Yu (The University of Hong Kong); Yizhou Wang (PKU)*
1721 Learning Prior Feature and Attention Enhanced Image Inpainting Chenjie Cao (Fudan University)*; Qiaole Dong (Fudan University); Yanwei Fu (Fudan University)
1730 AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated Objects via Few-shot Interactions Yian Wang (Peking University); Ruihai Wu (Peking University); Kaichun Mo (Stanford); Jiaqi Ke (Peking University); Qingnan Fan (Tencent AI Lab); Leonidas Guibas (Stanford University); Hao Dong (Peking University)*
1735 Video Graph Transformer for Video Question Answering Junbin Xiao (National University of Singapore)*; Pan Zhou (Sea AI Lab); Tat-Seng Chua (National Univ. of Singapore); Shuicheng Yan (Sea AI Labs)
1737 A Reliable Online Method for Joint Estimation of Focal Length and Camera Rotation Yiming Qian (Osaka University)*; James Elder (York University)
1738 Learning Local Implicit Fourier Representation for Image Warping Jaewon Lee (DGIST)*; Kwang Pyo Choi (Samsung Electronics); Kyong Hwan Jin (DGIST)
1740 SepLUT: Separable Image-adaptive Lookup Tables for Real-time Image Enhancement Canqian Yang (Shanghai Jiao Tong University); Meiguang Jin (Alibaba Group); Yi Xu (Shanghai Jiao Tong University)*; Rui Zhang (Shanghai Jiao Tong University); Ying Chen (Alibaba Group); Huaida Liu (Alibaba)
1744 Temporal-MPI: Enabling Multi-Plane Images for Dynamic Scene Modelling via Temporal Basis Learning Wenpeng Xing (Hong Kong Baptist University); Jie Chen (Hong Kong Baptist University)*
1746 Blind Image Decomposition Junlin Han (CSIRO)*; Weihao Li (Data61, CSIRO); Pengfei Fang (The Australian National University); Chunyi Sun (Australian National University); Jie Hong (Australian National University); Mohammad Ali Armin (CSIRO (Data61)); Lars Petersson (Data61/CSIRO); Hongdong Li (Australian National University, Australia)
1751 INT: Towards Infinite-frames 3D Detection with An Efficient Framework Jianyun Xu (DAMO Academy, Alibaba Group)*; Zhenwei Miao (DAMO Academy, Alibaba Group); Da Zhang (UC Santa Barbara); Hongyu Pan (DAMO Academy, Alibaba Group); Kaixuan Liu (DAMO Academy, Alibaba Group); Peihan Hao (DAMO Academy, Alibaba Group); Jun Zhu (DAMO Academy, Alibaba Group); Zhengyang Sun (Tsinghua University); Li Hongmin (Huawei TCS lab); Xin Zhan (DAMO Academy, Alibaba Group)
1756 MuLUT: Cooperating Multiple Look-Up Tables for Efficient Image Super-Resolution Jiacheng Li (University of Science and Technology of China); Chang Chen (Huawei Noah’s Ark Lab); Zhen Cheng (University of Science and Technology of China); Zhiwei Xiong (University of Science and Technology of China)*
1757 NDF: Neural Deformable Fields for Dynamic Human Modelling Ruiqi Zhang (Hong Kong Baptist University); Jie Chen (Hong Kong Baptist University)*
1759 MPIB: An MPI-Based Bokeh Rendering Framework for Realistic Partial Occlusion Effects Juewen Peng (Huazhong University of Science and Technology); Jianming Zhang (Adobe Research); Xianrui Luo (Huazhong University of Science and Technology); Hao Lu (Huazhong University of Science and Technology); Ke Xian (Huazhong University of Science and Technology)*; Zhiguo Cao (Huazhong Univ. of Sci.&Tech.)
1761 Neural Density-Distance Fields Itsuki Ueda (University of Tsukuba)*; Yoshihiro Fukuhara (Waseda University); Hirokatsu Kataoka (National Institute of Advanced Industrial Science and Technology (AIST)); Hiroaki Aizawa (Hiroshima University); Hidehiko Shishido (University of Tsukuba); Itaru Kitahara (University of Tsukuba)
1762 MoDA: Map style transfer for self-supervised Domain Adaptation of embodied agents Eun Sun Lee (Seoul National University)*; Junho Kim (Seoul National University); Sangwon Park (Seoul Nat’l University); Young Min Kim (Seoul National University)
1766 L3: Accelerator-Friendly Lossless Image Format for High-Resolution, High-Throughput DNN Training Jonghyun Bae (Seoul National University)*; Woohyeon Baek (Seoul National University); Tae Jun Ham (Seoul National University); Jae W. Lee (Seoul National University)
1780 Prior-Guided Adversarial Initialization for Fast Adversarial Training Xiaojun Jia (Institute of Information Engineering, Chinese Academy of Sciences)*; Yong Zhang (Tencent AI Lab); Xingxing Wei (Beihang University); Baoyuan Wu (The Chinese University of Hong Kong, Shenzhen; Shenzhen Research Institute of Big Data); Ke Ma (UCAS); Jue Wang (Tencent AI Lab); Xiaochun Cao (Sun Yat-sen University)
1790 Housekeep: Tidying Virtual Households using Commonsense Reasoning Yash Mukund Kant (University of Toronto)*; Arun Ramachandran (Georgia Institute of Technology); Sriram Yenamandra (Georgia Institute of Technology); Igor Gilitschenski (University of Toronto); Dhruv Batra (Georgia Tech & Facebook AI Research); Andrew Szot (Georgia Institute of Technology); Harsh Agrawal (Georgia Institute of Technology)
1804 Real-RawVSR: Real-World Raw Video Super-Resolution with a Benchmark Dataset Huanjing Yue (Tianjin University)*; Zhiming Zhang (Tianjin University); Jingyu Yang (Tianjin University)
1807 ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning Shengchao Hu (Shanghai Jiao Tong University)*; Li Chen (Shanghai AI Laboratory); Penghao Wu (Shanghai Jiao Tong University); Hongyang Li (SenseTime); Junchi Yan (Shanghai Jiao Tong University); Dacheng Tao (JD.com)
1810 NeXT: Towards High Quality Neural Radiance Fields via Multi-Skip Transformer Yunxiao Wang (Tsinghua University); Yanjie Li (Tsinghua University)*; Peidong Liu (Tsinghua University); Tao Dai (Shenzhen University); Shu-Tao Xia (Tsinghua University)
1814 Learning Spatiotemporal Frequency-Transformer for Compressed Video Super-Resolution Zhongwei Qiu (University of Science and Technology Beijing); Huan Yang (Microsoft Research)*; Jianlong Fu (Microsoft Research); Dongmei Fu (University of Science and Technology Beijing)
1819 Adversarial Partial Domain Adaptation by Cycle Inconsistency Kun-Yu Lin (Sun Yat-sen University); Jiaming Zhou (Sun Yat-sen University); Yukun Qiu (Sun Yat-sen University); Wei-Shi Zheng (Sun Yat-sen University, China)*
1824 BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen Neural Networks Uddeshya Upadhyay (University of Tübingen)*; Shyamgopal Karthik (University of Tübingen); Massimiliano Mancini (University of Tübingen); Yanbei Chen (University of Tübingen); Zeynep Akata (University of Tübingen)
1831 Domain Randomization-Enhanced Depth Simulation and Restoration for Perceiving and Grasping Specular and Transparent Objects Qiyu Dai (Peking University); Jiyao Zhang (Xi’an Jiaotong University); Qiwei Li (Peking University); Tianhao Wu (Peking University); Hao Dong (Peking University); Ziyuan Liu (Huawei group); Ping Tan (Simon Fraser University); He Wang (Peking University)*
1832 PS-NeRF: Neural Inverse Rendering for Multi-view Photometric Stereo Wenqi Yang (The University of Hong Kong)*; Guanying Chen (The Chinese University of Hong Kong, Shenzhen); Chaofeng Chen (Nanyang Technological University); Zhenfang Chen (MIT-IBM Watson AI Lab); Kwan-Yee K. Wong (The University of Hong Kong)
1845 DeciWatch: A Simple Baseline for 10× Efficient 2D and 3D Pose Estimation Ailing Zeng (The Chinese University of Hong Kong)*; Xuan Ju (The Chinese University of Hong Kong); Lei Yang (Sensetime Group Limited); Ruiyuan Gao (The Chinese University of Hong Kong); Xizhou Zhu (SenseTime); Bo Dai (Shanghai AI Lab); Qiang Xu (The Chinese University of Hong Kong)
1846 Hierarchical Latent Structure for Multi-Modal Vehicle Trajectory Forecasting Dooseop Choi (ETRI)*; KyoungWook Min (ETRI)
1848 SmoothNet: A Plug-and-Play Network for Refining Human Poses in Videos Ailing Zeng (The Chinese University of Hong Kong)*; Lei Yang (Sensetime Group Limited); Xuan Ju (The Chinese University of Hong Kong); Jiefeng Li (Shanghai Jiao Tong University); Jianyi Wang (Nanyang Technological University); Qiang Xu (The Chinese University of Hong Kong)
1851 Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency Tom Monnier (École des ponts ParisTech)*; Matthew Fisher (Adobe Research); Alexei A Efros (UC Berkeley); Mathieu Aubry (École des ponts ParisTech)
1852 End-to-End Weakly Supervised Object Detection with Sparse Proposal Evolution Mingxiang Liao (University of Chinese Academy of Sciences); Fang Wan (University of Chinese Academy of Sciences)*; Yuan Yao (University of Chinese Academy of Sciences); Zhenjun Han (University of Chinese Academy of Sciences); Zou Jialing (University of Chinese Academy of Sciences); Yuze Wang (Huawei Noah’s Ark Lab); Bailan Feng (Huawei Noah’s Ark Lab); Peng Yuan (Huawei Noah’s Ark Lab); Qixiang Ye (University of Chinese Academy of Sciences, China)
1853 PAC-Net: Highlight Your Video via History Preference Modeling Hang Wang (Huawei HiSilicon)*; Penghao Zhou (ByteDance); Chong Zhou (Nanyang Technological University); Zhao Zhang (Nankai University); Xing Sun (Shopee)
1859 Efficient Point Cloud Analysis Using Hilbert Curve Wanli Chen (CUHK)*; Xinge Zhu (The Chinese University of Hong Kong); Guojin Chen (The Chinese University of Hong Kong); Bei Yu (CUHK)
1860 Learning Online Multi-Sensor Depth Fusion Erik Sandström (ETH Zürich)*; Martin R. Oswald (ETH Zurich); Suryansh Kumar (ETH Zurich); Silvan Weder (ETH Zürich); Fisher Yu (ETH Zurich); Cristian Sminchisescu (Lund University); Luc Van Gool (ETH Zurich)
1866 Self-Support Few-Shot Semantic Segmentation Qi Fan (HKUST)*; Wenjie Pei (Harbin Institute of Technology, Shenzhen); Yu-Wing Tai (Kuaishou Technology / HKUST); Chi-Keung Tang (Hong Kong University of Science and Technology)
1868 Few-Shot Object Detection with Model Calibration Qi Fan (HKUST)*; Chi-Keung Tang (Hong Kong University of Science and Technology); Yu-Wing Tai (Kuaishou Technology / HKUST)
1870 S2-VER: Semi-Supervised Visual Emotion Recognition Guoli Jia (Nankai University); Jufeng Yang (Nankai University)*
1882 Self-Supervision Can Be a Good Few-Shot Learner Yuning Lu (USTC); Liangjian Wen (Noah’s Ark Lab, Huawei Technologies Company Limited); Jianzhuang Liu (Huawei Noah’s Ark Lab); Yajing Liu (USTC); Xinmei Tian (USTC)*
1886 My View is the Best View: Procedure Learning from Egocentric Videos Siddhant Bansal (IIIT, Hyderabad)*; Chetan Arora (Indian Institute of Technology Delhi); C.V. Jawahar (IIIT-Hyderabad)
1894 Trace Controlled Text to Image Generation Kun Yan (Beihang University)*; Lei Ji (Microsoft); Chenfei Wu (Microsoft); Jianmin Bao (Microsoft); Ming Zhou (Sinovation Ventures); Nan Duan (Microsoft Research); Shuai Ma (Beihang University)
1925 Towards Comprehensive Representation Enhancement in Semantics-guided Self-supervised Monocular Depth Estimation Jingyuan Ma (Hikvision Research Institute)*; Xiangyu Lei (Hikvision Research Institute); Nan Liu (Hikvision); Zhao Xian (Hikvision); Shiliang Pu (Hikvision Research Institute)
1929 Calibration-free Multi-view Crowd Counting Qi Zhang (City University of Hong Kong, Hong Kong)*; Antoni Chan (City University of Hong Kong, Hong Kong)
1930 Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training Zhenyu Li (Harbin Institute of Technology)*; Zehui Chen (University of Science and Technology of China); Ang Li (SenseTime Research); Liangji Fang (Sensetime Research); Qinhong Jiang (SenseTime Research; Shanghai AI Laboratory); Xianming Liu (Harbin Institute of Technology); Junjun Jiang (Harbin Institute of Technology)
1940 Online Continual Learning with Contrastive Vision Transformer Zhen Wang (The University of Sydney)*; Liu Liu (The University of Sydney); Yajing Kong (The University of Sydney); Jiaxian Guo (The University of Sydney); Dacheng Tao (JD.com)
1946 COO: Comic Onomatopoeia Dataset for Recognizing Arbitrary or Truncated Texts Jeonghun Baek (The University of Tokyo)*; Yusuke Matsui (The University of Tokyo); Kiyoharu Aizawa (The University of Tokyo)
1947 BungeeNeRF: Progressive Neural Radiance Field for Extreme Multiscale Scene Rendering Yuanbo Xiangli (Chinese University of Hong Kong)*; Linning Xu (CUHK); Xingang Pan (Max Planck Institute for Informatics); Nanxuan Zhao (University of Bath); Anyi Rao (The Chinese University of Hong Kong); Christian Theobalt (MPI Informatik); Bo Dai (Shanghai AI Lab); Dahua Lin (The Chinese University of Hong Kong)
1951 AiATrack: Attention in Attention for Transformer Visual Tracking Shenyuan Gao (Huazhong University of Science and Technology)*; Chunluan Zhou (Wormpex AI Research); Chao Ma (Shanghai Jiao Tong University); Xinggang Wang (Huazhong University of Science and Technology); Junsong Yuan (State University of New York at Buffalo, USA)
1952 Learning Invariant Visual Representations for Compositional Zero-Shot Learning Tian Zhang (Beijing University of Posts and Telecommunications); Kongming Liang (Beijing University of Posts and Telecommunications)*; Ruoyi Du (Beijing University of Posts and Telecommunications); Xian Sun (Aerospace Information Research Institute, Chinese Academy of Sciences); Zhanyu Ma (Beijing University of Posts and Telecommunications); Jun Guo (Beijing University of Posts and Telecommunications)
1954 Image Coding for Machines with Omnipotent Feature Learning Ruoyu Feng (University of Science and Technology of China)*; Xin Jin (University of Science and Technology of China); Zongyu Guo (University of Science and Technology of China); Runsen Feng (University of Science and Technology of China); Yixin Gao (University of Science and Technology of China); Tianyu He (Microsoft Research Asia); Zhizheng Zhang (Microsoft Research); Simeng Sun (University of Science and Technology of China); Zhibo Chen (University of Science and Technology of China)
1959 MOTCOM: The Multi-Object Tracking Dataset Complexity Metric Malte Pedersen (Aalborg University)*; Joakim Bruslund Haurum (Aalborg University); Patrick Dendorfer (TUM); Thomas B. Moeslund (Aalborg University)
1980 How Severe is Benchmark-Sensitivity in Video Self-Supervised Learning? Fida Mohammad Thoker (University of Amsterdam)*; Hazel Doughty (University of Amsterdam); Piyush Nitin Bagad (University of Amsterdam); Cees Snoek (University of Amsterdam)
1982 Rethinking Robust Representation Learning Under Fine-grained Noisy Faces Bingqi Ma (Sensetime Group Limited)*; Guanglu Song (Sensetime); Boxiao Liu (Institute of Computing Technology, Chinese Academy of Sciences); Yu Liu (SenseTime Group LTD)
1986 Feature Representation Learning for Unsupervised Cross-domain Image Retrieval Conghui Hu (National University of Singapore)*; Gim Hee Lee (National University of Singapore)
1987 Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation Sunghwan Hong (Korea University); Seokju Cho (Korea University); Jisu Nam (Korea University); Stephen Lin (Microsoft Research); Seungryong Kim (Korea University)*
1988 Spatial-Frequency Domain Information Integration for Pan-sharpening Man Zhou (University of Science and Technology of China); Jie Huang (University of Science and Technology of China); Keyu Yan (University of Science and Technology of China); Hu Yu (University of Science and Technology of China); Xueyang Fu (University of Science and Technology of China); Aiping Liu (University of Science and Technology of China); Xian Wei (East China Normal University); Feng Zhao (University of Science and Technology of China)*
1991 TOCH: Spatio-Temporal Object-to-Hand Correspondence for Motion Refinement Keyang Zhou (University of Tübingen)*; Bharat Lal Bhatnagar (University of Tübingen, MPI informatik); Jan E. Lenssen (TU Dortmund); Gerard Pons-Moll (University of Tübingen)
1999 HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation Lukas Hoyer (ETH Zurich)*; Dengxin Dai (ETH Zurich); Luc Van Gool (ETH Zurich)
2012 Combating Label Distribution Shift for Active Domain Adaptation Sehyun Hwang (POSTECH)*; Sohyun Lee (POSTECH); Sungyeon Kim (POSTECH); Jungseul Ok (POSTECH); Suha Kwak (POSTECH)
2016 GIPSO: Geometrically Informed Propagation for Online Adaptation in 3D LiDAR Segmentation Cristiano Saltori (University of Trento)*; Evgeny Krivosheev (University of Trento); Stéphane Lathuilière (Telecom-Paris); Nicu Sebe (University of Trento); Fabio Galasso (Sapienza University); Giuseppe Fiameni (NVIDIA); Elisa Ricci (University of Trento); Fabio Poiesi (Fondazione Bruno Kessler)
2025 SuperLine3D: Self-supervised Line Segmentation and Description for LiDAR Point Cloud Xiangrui Zhao (Zhejiang University)*; Sheng Yang (Alibaba Group); Tianxin Huang (Zhejiang University); Jun Chen (Zhejiang University); Teng Ma (Alibaba Group); Mingyang Li (Alibaba A.I. Labs); Yong Liu (Zhejiang University)
2031 Efficient Meta-Tuning for Content-aware Neural Video Delivery Xiaoqi Li (Columbia University in the City of New York)*; Jiaming Liu (Peking University); Shizun Wang (Beijing University of Posts and Telecommunications); Cheng Lyu (Beijing University of Posts and Telecommunications); Ming Lu (Intel Labs China); Yurong Chen (Intel Labs China); Anbang Yao (Intel Labs China); Yandong Guo (OPPO Research Institute); Shanghang Zhang (University of California, Berkeley)
2033 PoseTrans: A Simple Yet Effective Pose Transformation Augmentation for Human Pose Estimation Wentao Jiang (Beihang University)*; Sheng Jin (The University of Hong Kong); Wentao Liu (Sensetime); Chen Qian (SenseTime); Ping Luo (The University of Hong Kong); Si Liu (Beihang University)
2039 3D-Aware Semantic-Guided Generative Model for Human Synthesis Jichao Zhang (University of Trento)*; Enver Sangineto (University of Modena and Reggio Emilia); Hao Tang (ETH Zurich); Aliaksandr Siarohin (Snapchat); Zhun Zhong (University of Trento); Nicu Sebe (University of Trento); Wei Wang (EPFL)
2041 Improving Covariance Conditioning of the SVD Meta-layer by Orthogonality Yue Song (University of Trento)*; Nicu Sebe (University of Trento); Wei Wang (EPFL)
2050 CoSMix: Compositional Semantic Mix for Domain Adaptation in 3D LiDAR Segmentation Cristiano Saltori (University of Trento)*; Fabio Galasso (Sapienza University); Giuseppe Fiameni (NVIDIA); Nicu Sebe (University of Trento); Elisa Ricci (University of Trento); Fabio Poiesi (Fondazione Bruno Kessler)
2054 Streaming Multiscale Deep Equilibrium Models Can Ufuk Ertenli (Middle East Technical University)*; Emre Akbas (METU); Ramazan Gokberk Cinbis (METU)
2057 AvatarCap: Animatable Avatar Conditioned Monocular Human Volumetric Capture Zhe Li (Tsinghua University)*; Zerong Zheng (Tsinghua University); Hongwen Zhang (Tsinghua University); Chaonan Ji (Tsinghua University); Yebin Liu (Tsinghua University)
2061 Hierarchical Average Precision Training for Pertinent Image Retrieval Elias Ramzi (Conservatoire National des Arts et Métiers)*; Nicolas Audebert (Cnam); Nicolas Thome (CNAM, Paris); Clément Rambour (Cnam); Xavier B Bitot (Coexya)
2087 Fashionformer: A Simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition Shilin Xu (Peking University); Xiangtai Li (Peking University)*; Jingbo Wang (The Chinese University of Hong Kong); Guangliang Cheng (Sensetime Group Limited); Yunhai Tong (Peking University); Dacheng Tao (JD.com)
2088 Out-of-Distribution Detection with Semantic Mismatch under Masking Yijun Yang (The Chinese University of Hong Kong)*; Ruiyuan Gao (The Chinese University of Hong Kong); Qiang Xu (The Chinese University of Hong Kong)
2104 Target-absent Human Attention Zhibo Yang (Stony Brook University)*; Sounak Mondal (Stony Brook University); Seoyoung Ahn (Stony Brook University); Gregory Zelinsky (Stony Brook University); Minh Hoai (Stony Brook University); Dimitris Samaras (Stony Brook University)
2105 Reference-based Image Super-Resolution with Deformable Attention Transformer Jiezhang Cao (ETH Zürich)*; Jingyun Liang (ETH Zurich); Kai Zhang (ETH Zurich); Yawei Li (ETH Zurich); Yulun Zhang (ETH Zurich); Wenguan Wang (Eidgenössische Technische Hochschule Zürich); Luc Van Gool (ETH Zurich)
2116 Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers Junhyeong Cho (POSTECH)*; Kim Youwang (POSTECH); Tae-Hyun Oh (POSTECH)
2118 Learning to Generate Realistic LiDAR Point Cloud Vlas Zyrianov (University of Illinois Urbana Champaign); Xiyue Zhu (University of Illinois); Shenlong Wang (UIUC)*
2124 GeoRefine: Self-Supervised Online Depth Refinement for Accurate Dense Mapping Pan Ji (OPPO US Research Center)*; Qingan Yan (OPPO US Research Center); Yuxin Ma (Wing LLC); Yi Xu (OPPO US Research Center)
2134 Transform your Smartphone into a DSLR Camera: Learning the ISP in the Wild Ardhendu Shekhar Tripathi (ETH Zurich)*; Martin Danelljan (ETH Zurich); Samarth Shukla (ETH Zurich); Radu Timofte (University of Wurzburg & ETH Zurich); Luc Van Gool (ETH Zurich)
2138 Uncertainty-Based Spatial-Temporal Attention for Online Action Detection Hongji Guo (Rensselaer Polytechnic Institute)*; Zhou Ren (Wormpex AI Research); Yi Wu (Wormpex AI Research); Gang Hua (Wormpex AI Research); Qiang Ji (Rensselaer Polytechnic Institute)
2144 Video Question Answering with Iterative Video-Text Co-Tokenization AJ Piergiovanni (Google)*; Kairo Morton (Massachusetts Institute of Technology); Weicheng Kuo (Google); Michael S Ryoo (Google; Stony Brook University); Anelia Angelova (Google)
2145 LaTeRF: Label and Text Driven Object Radiance Fields Ashkan Mirzaei (University of Toronto)*; Yash Mukund Kant (University of Toronto); Jonathan Kelly (University of Toronto); Igor Gilitschenski (University of Toronto)
2146 Temporally Consistent Semantic Video Editing Yiran Xu (University of Maryland, College Park)*; Badour A Sh AlBahar (Virginia Tech); Jia-Bin Huang (Facebook)
2149 SPot-the-Difference Self-Supervised Pre-training for Anomaly Detection and Segmentation Yang Zou (Amazon AI)*; Jongheon Jeong (KAIST); Latha Pemula (Amazon); Dongqing Zhang (Amazon); Onkar Dabeer (Amazon)
2151 Exploring Plain Vision Transformer Backbones for Object Detection Yanghao Li (Facebook AI Research)*; Hanzi Mao (Facebook AI Research); Ross Girshick (FAIR); Kaiming He (Facebook AI Research)
2152 Fine-grained Egocentric Hand-Object Segmentation: Dataset, Model, and Applications Lingzhi Zhang (University of Pennsylvania)*; Shenghao Zhou (University of Pennsylvania); Simon Stent (Toyota Research Institute); Jianbo Shi (University of Pennsylvania)
2154 Is It Necessary to Transfer Temporal Knowledge for Domain Adaptive Video Semantic Segmentation? Xinyi Wu (University of South Carolina); Zhenyao Wu (University of South Carolina)*; Jin Wan (Beijing Jiaotong University); Lili Ju (University of South Carolina); Song Wang (University of South Carolina)
2162 GIMO: Gaze-Informed Human Motion Prediction in Context Yang Zheng (Tsinghua University); Yanchao Yang (Stanford University)*; Kaichun Mo (Stanford); Jiaman Li (University of Southern California); Tao Yu (Tsinghua University); Yebin Liu (Tsinghua University); Karen Liu (Stanford); Leonidas Guibas (Stanford University)
2166 Error Compensation Framework for Flow-Guided Video Inpainting Jaeyeon Kang (Yonsei University); Seoung Wug Oh (Adobe Research); Seon Joo Kim (Yonsei University)*
2170 Decomposing The Tangent of Occluding Boundaries According to Curvatures and Torsions Huizong Yang (Georgia Institute of Technology)*; Anthony Yezzi (Georgia Institute of Technology)
2171 CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution Taeho Kim (University of Colorado at Boulder)*; Yongin Kwon (Electronics and Telecommunications Research Institute); Jemin Lee (Electronics and Telecommunications Research Institute); Taeho Kim (Electronics and Telecommunications Research Institute); Sangtae Ha (University of Colorado at Boulder)
2180 Scraping Textures from Natural Images for Synthesis and Editing Xueting Li (University of California, Merced)*; Xiaolong Wang (UCSD); Ming-Hsuan Yang (University of California at Merced); Alexei A Efros (UC Berkeley); Sifei Liu (NVIDIA)
2203 Self-supervised Learning of Visual Graph Matching Chang Liu (Shanghai Jiao Tong University); Shaofeng Zhang (Shanghai Jiao Tong University); Xiaokang Yang (Shanghai Jiao Tong University of China); Junchi Yan (Shanghai Jiao Tong University)*
2206 Disentangling Architecture and Training for Optical Flow Deqing Sun (Google)*; Charles Herrmann (Google); Fitsum Reda (Google); Michael Rubinstein (Google); David J Fleet (University of Toronto); William T Freeman (Google)
2217 PointFix: Learning to Fix Domain Bias for Robust Online Stereo Adaptation Kwonyoung Kim (Yonsei University); JungIn Park (Yonsei University); Jiyoung Lee (NAVER AI Lab); Dongbo Min (Ewha Womans University); Kwanghoon Sohn (Yonsei Univ.)*
2218 Teaching Where to Look: Attention Similarity Knowledge Distillation for Low Resolution Face Recognition Sungho Shin (Gwangju Institute of Science and Technology); Joosoon Lee (Gwangju Institute of Science and Technology); Junseok Lee (Gwangju Institute of Science and Technology); Yeonguk Yu (Gwangju Institute of Science and Technology); Kyoobin Lee (Gwangju Institute of Science and Technology)*
2219 Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows Danyang Tu (Shanghai Jiao Tong University)*; Xiongkuo Min (Shanghai Jiao Tong University); Huiyu Duan (Shanghai Jiao Tong University); Guodong Guo (Baidu); Guangtao Zhai (Shanghai Jiao Tong University); Wei Shen (Shanghai Jiao Tong University)
2221 Single Stage Virtual Try-on via Deformable Attention Flows Shuai Bai (Alibaba Group)*; Huiling Zhou (Alibaba); Zhikang Li (DAMO Academy, Alibaba Group); Chang Zhou (Alibaba Group); Hongxia Yang (Alibaba Group)
2222 Learning Deep Non-Blind Image Deconvolution Without Ground Truths Yuhui Quan (South China University of Technology)*; Zhuojie Chen (South China University of Technology); Huan Zheng (National University of Singapore); Hui Ji (National University of Singapore)
2233 Rethinking Zero-shot Action Recognition: Learning from Latent Atomic Actions Yijun Qian (Carnegie Mellon University)*; Lijun Yu (Carnegie Mellon University); Wenhe Liu (Carnegie Mellon University); Alexander Hauptmann (Carnegie Mellon University)
2234 NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors Jiepeng Wang (The University of Hong Kong); Peng Wang (The University of Hong Kong); Xiaoxiao Long (The University of Hong Kong); Christian Theobalt (MPI Informatik); Taku Komura (The University of Hong Kong); Lingjie Liu (Max Planck Institute for Informatics); Wenping Wang (The University of Hong Kong)*
2237 Rethinking Data Augmentation for Robust Visual Question Answering Long Chen (Columbia University)*; Yuhang Zheng (Zhejiang University); Jun Xiao (Zhejiang University)
2240 Dual-Domain Self-Supervised Learning and Model Adaption for Deep Compressive Imaging Yuhui Quan (South China University of Technology)*; Xinran Qin (South China University of Technology); Tongyao Pang (National University of Singapore); Hui Ji (National University of Singapore)
2242 Explicit Image Caption Editing Zhen Wang (Zhejiang University); Long Chen (Columbia University)*; Wenbo Ma (Zhejiang University); Guangxing Han (Columbia University); Yulei Niu (Columbia University); Jian Shao (Zhejiang University); Jun Xiao (Zhejiang University)
2255 SphereFed: Hyperspherical Federated Learning Xin Dong (Harvard University)*; Sai Qian Zhang (Harvard University); Ang Li (Google DeepMind); H.T. Kung (Harvard University)
2257 Local Color Distributions Prior for Image Enhancement Haoyuan Wang (City University of Hong Kong)*; Ke Xu (City University of Hong Kong); Rynson W.H. Lau (City University of Hong Kong)
2267 Teaching with Soft Label Smoothing for Mitigating Noisy Labels in Facial Expressions Tohar Lukov (National University of Singapore)*; Na Zhao (NUS); Gim Hee Lee (National University of Singapore); Ser-Nam Lim (Facebook AI)
2269 Multi-Modal Masked Pre-Training for Monocular Panoramic Depth Completion Zhiqiang Yan (Nanjing University of Science and Technology)*; Xiang Li (Nanjing University of Science and Technology); Kun Wang (Nanjing University of Science and Technology); Zhenyu Zhang (Tencent); Jun Li (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology)
2272 2D Amodal Instance Segmentation Guided by 3D Shape Prior Zhixuan Li (Peking University); Weining Ye (Peking University); Tingting Jiang (Peking University)*; Tiejun Huang (Peking University)
2280 How to Synthesize a Large-Scale and Trainable Micro-Expression Dataset? Yuchi Liu (Australian National University)*; Zhongdao Wang (Tsinghua University); Tom Gedeon (The Australian National University); Liang Zheng (Australian National University)
2285 HEAD: HEtero-Assists Distillation for Heterogeneous Object Detectors Luting Wang (Beihang University)*; Xiaojie Li (sensetime); Yue Liao (Beihang University); Zeren Jiang (ETH Zurich); Jianlong Wu (Shandong University); Fei Wang (University of Science and Technology of China); Chen Qian (SenseTime); Si Liu (Beihang University)
2293 Meta Spatio-Temporal Debiasing for Video Scene Graph Generation LI XU (Singapore University of Technology and Design)*; Haoxuan Qu (Singapore University of Technology and Design); Jason Kuen (Adobe Research); Jiuxiang Gu (Adobe Research); Jun Liu (Singapore University of Technology and Design)
2307 A Sliding Window Scheme for Online Temporal Action Localization Young Hwi Kim (Yonsei University); Hyolim Kang (Yonsei University); Seon Joo Kim (Yonsei University)*
2310 Ultra-high-resolution unpaired stain transformation via Kernelized Instance Normalization Ming-Yang Ho (aetherAI)*; Min-Sheng Wu (aetherAI); Che-Ming Wu (aetherAI)
2311 SESS: Saliency Enhancing with Scaling and Sliding Osman Tursun (Queensland University of Technology)*; SIMON DENMAN (Queensland University of Technology, Australia); Sridha Sridharan (QUT); Clinton Fookes (Queensland University of Technology)
2312 Data Efficient 3D Learner via Knowledge Transferred from 2D Model Ping-Chung Yu (National Tsing Hua University)*; Cheng Sun (National Tsing Hua University); Min Sun (NTHU)
2319 MeshMAE: Masked Autoencoders for 3D Mesh Data Analysis Yaqian Liang (Wuhan University); Shanshan Zhao (JD.COM); Baosheng Yu (The University of Sydney); Jing Zhang (The University of Sydney); Fazhi He (Wuhan University)*
2327 ERA: Expert Retrieval and Assembly for Early Action Prediction Lin Geng Foo (Singapore University of Technology and Design)*; Tianjiao Li (Singapore University of Technology and Design); Hossein Rahmani (Lancaster University); Qiuhong Ke (Monash University); Jun Liu (Singapore University of Technology and Design)
2328 Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI Detection Xiaoqian Wu (Shanghai Jiao Tong University); Yong-Lu Li (Shanghai Jiao Tong University); Xinpeng Liu (Shanghai Jiao Tong University); Junyi Zhang (Shanghai Jiao Tong University); Yuzhe Wu (DongHua University); Cewu Lu (Shanghai Jiao Tong University)*
2334 Improving GANs for Long-Tailed Data through Group Spectral Regularization Harsh Rangwani (Indian Institute of Science)*; Naman Jaswani (Indian Institute of Science); Tejan Karmali (Indian Institute of Science, Bengaluru); Varun Jampani (Google); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science)
2336 Hierarchical Semantic Regularization of Latent Spaces in StyleGANs Tejan Karmali (Indian Institute of Science, Bengaluru)*; Rishubh Parihar (Indian Institute of Science, Bangalore); Susmit Agrawal (Indian Institute of Science); Harsh Rangwani (Indian Institute of Science); Varun Jampani (Google); Maneesh K Singh (Motive Technologies); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science)
2337 Symmetry Regularization and Saturating Nonlinearity for Robust Quantization SEIN PARK (POSTECH); Yeongsang Jang (POSTECH); Eunhyeok Park (POSTECH)*
2350 IntereStyle: Encoding an Interest Region for Robust StyleGAN Inversion Seung Jun Moon (KAIST)*; Gyeong-Moon Park (Kyung Hee University)
2369 Improving RGB-D Point Cloud Registration by Learning Multi-scale Local Linear Transformation Ziming Wang (Beihang University); Xiaoliang Huo (Beihang University); Zhenghao Chen (University of Sydney); Jing Zhang (Beihang University); Lu Sheng (Beihang University)*; Dong Xu (The University of Hong Kong)
2373 Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis Shuai Shen (Tsinghua University); Wanhua Li (Tsinghua University); Zheng Zhu (Tsinghua University); Yueqi Duan (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)*
2378 StyleLight: HDR Panorama Generation for Lighting Estimation and Editing Guangcong Wang (Nanyang Technological University)*; Yinuo Yang (Nanyang Technological University); Chen Change Loy (Nanyang Technological University); Ziwei Liu (Nanyang Technological University)
2379 You Should Look at All Objects Zhenchao Jin (University of Science and Technology of China)*; Dongdong Yu (ByteDance Inc.); Luchuan Song (University of Science and Technology of China); Zehuan Yuan (Bytedance.Inc); Lequan Yu (The University of Hong Kong)
2384 BRNet: Exploring Comprehensive Features for Monocular Depth Estimation Wencheng Han (Beijing Institute of Technology)*; Junbo Yin (Beijing Institute of Technology); Xiaogang Jin (Zhejiang University); dai xiangdong (oppo); Jianbing Shen (Inception Institute of Artificial Intelligence)
2403 CoupleFace: Relation Matters for Face Recognition Distillation Jiaheng Liu (Beihang University)*; Haoyu Qin (SenseTime); Yichao Wu (Sensetime Group Limited); Jinyang Guo (The University of Sydney); Ding Liang (Sensetime Group Limited); Ke Xu (Beihang University)
2404 Collaborating Domain-shared and Target-specific Feature Clustering for Cross-domain 3D Action Recognition Qinying Liu (University of Science and Technology of China); Zilei Wang (University of Science and Technology of China)*
2406 Adaptive Spatial-BCE Loss for Weakly Supervised Semantic Segmentation Tong Wu (Beijing Institute of Technology); Guangyu Ryan Gao (Beijing Institute of Technology)*; junshi huang (Meituan); Xiaolin Wei (Meituan); Xiaoming Wei (Meituan); Chi Harold Liu (Beijing Institute of Technology)
2418 Multi-Person 3D Pose and Shape Estimation via Inverse Kinematics and Refinement Junuk Cha (UNIST)*; Muhammad Saqlain (Ulsan National Institute of Science and Technology); GeonU Kim (UNIST); Mingyu Shin (ULSAN NATIONAL INSTITUTE OF SCIENCE AND TECHNOLOGY); Seungryul Baek (UNIST)
2423 Explaining Deepfake Detection by Analysing Image Matching Shichao Dong (Megvii); Jin Wang (Megvii); Haoqiang Fan (Megvii Inc(face++)); Jiajun Liang (Megvii); Renhe Ji (Megvii)*
2424 L-CoDer: Language-based Colorization with Color-object Decoupling Transformer Zheng Chang (Beijing University of Posts and Telecommunications); Shuchen Weng (Peking University)*; Yu Li (International Digital Economy Academy); Si Li (Beijing University of Posts and Telecommunications); Boxin Shi (Peking University)
2449 GitNet: Geometric Prior-based Transformation for Birds-Eye-View Segmentation Shi Gong (Huazhong University of Science and Technology); Xiaoqing Ye (Baidu Inc.); Xiao Tan (Baidu Inc.); Jingdong Wang (Baidu); Errui Ding (Baidu Inc.); Yu Zhou (Huazhong University of Science and Technology)*; Xiang Bai (Huazhong University of Science and Technology)
2459 Unsupervised Deep Multi-Shape Matching Dongliang Cao (Technical University of Munich); Florian Bernard (University of Bonn)*
2463 GaitEdge: Beyond Plain End-to-end Gait Recognition for Better Practicality Junhao Liang (Southern University of Science and Technology in China)*; Chao Fan (SUSTech); Saihui Hou (Beijing Normal University); Chuanfu Shen (Southern University of Science and Technology); Yongzhen Huang (School of Artificial Intelligence, Beijing Normal University); Shiqi Yu (Southern University of Science and Technology)
2483 EAutoDet: Efficient Architecture Search for Object Detection Xiaoxing Wang (Shanghai Jiao Tong University); Jiale Lin (Shanghai Jiao Tong University); Juanping Zhao (Guangdong OPPO Mobile Telecommunications Co., Ltd.); Xiaokang Yang (Shanghai Jiao Tong University of China); Junchi Yan (Shanghai Jiao Tong University)*
2485 A Max-Flow based Approach for Neural Architecture Search Chao Xue (beijing university of posts and telecommunications)*; Xiaoxing Wang (Shanghai Jiao Tong University); Junchi Yan (Shanghai Jiao Tong University); Chun-Guang Li (Beijing University of Posts & Telecommunications)
2488 Can Shuffling Video Benefit Temporal Bias Problem: A Novel Training Framework for Temporal Grounding Jiachang Hao (Beijing University of Posts and Telecommunications)*; Haifeng Sun (Beijing university of posts and telecommunications); Pengfei Ren (Beijing University of Posts and Telecommunications); Jingyu Wang (Beijing University of Posts and Telecommunications); Qi Qi (Beijing University of Posts and Telecommunications); Jianxin Liao (beijing university of posts and telecommunications)
2494 tSF: Transformer-based Semantic Filter for Few-Shot Learning Jinxiang Lai (Tencent)*; Siqian Yang (Tencent); Wenlong Liu (Tencent); Yi Zeng (Tencent); Zhongyi Huang (Tencent); Wenlong Wu (Tencent); Jun Liu (Tencent); Bin-Bin Gao (Tencent); Chengjie Wang (Tencent; Shanghai Jiao Tong University)
2501 Dense Gaussian Processes for Few-Shot Segmentation Joakim Johnander (Linköping University)*; Johan Edstedt (Linköping University); Fahad Shahbaz Khan (MBZUAI); Michael Felsberg (Linköping University); Martin Danelljan (ETH Zurich)
2507 Adversarial Feature Augmentation for Cross-domain Few-shot Classification Yanxu Hu (Sun Yat-sen University); Andy J Ma (Sun Yat-sen University)*
2511 Real-Time Neural Character Rendering with Pose-Guided Multiplane Images Hao Ouyang (HKUST)*; Bo Zhang (Microsoft Research Asia); Pan Zhang (Shanghai AI Laboratory); Hao Yang (Microsoft Research Asia); Dong Chen (Microsoft Research Asia); Jiaolong Yang (Microsoft Research); Qifeng Chen (HKUST); Fang Wen (Microsoft Research Asia)
2512 Constructing Balance from Imbalance for Long-tailed Image Recognition Yue Xu (Shanghai Jiao Tong University); Yong-Lu Li (Shanghai Jiao Tong University); Jiefeng Li (Shanghai Jiao Tong University); Cewu Lu (Shanghai Jiao Tong University)*
2516 SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views Xiaoxiao Long (The University of Hong Kong)*; Cheng Lin (Tencent); Peng Wang (The University of Hong Kong); Taku Komura (The University of Hong Kong); Wenping Wang (The University of Hong Kong)
2538 Dual Perspective Network for Audio Visual Event Localization Varshanth Rao (Huawei Technologies)*; Md Ibrahim Khalil (Huawei Noah’s Ark Laboratory); Haoda Li (University of California, Berkeley); Peng Dai (Huawei Technologies Inc.Canada); Juwei Lu (Huawei Noah’s Ark Lab)
2542 SiamDoGe: Domain Generalizable Semantic Segmentation using Siamese Network Zhenyao Wu (University of South Carolina)*; Xinyi Wu (University of South Carolina); Xiaoping Zhang (Wuhan University); Song Wang (University of South Carolina); Lili Ju (University of South Carolina)
2545 Is Appearance Free Action Recognition Possible? Filip Ilic (Graz University of Technology)*; Rick Wildes (York University); Thomas Pock (Graz University of Technology)
2557 Detecting Twenty-thousand Classes using Image-level Supervision Xingyi Zhou (The University of Texas at Austin)*; Rohit Girdhar (Facebook AI Research); Armand Joulin (Facebook AI Research); Philipp Kraehenbuehl (UT Austin); Ishan Misra (Facebook AI Research)
2558 DCL-Net: Deep Correspondence Learning Network for 6D Pose Estimation Hongyang Li (South China University of Technology)*; Jiehong Lin (South China University of Technology); Kui Jia (South China University of Technology)
2565 Learning Cross-Video Neural Representations for High-Quality Frame Interpolation Wentao Shangguan (Washington University in St Louis); Yu Sun (Washington University in St. Louis); Weijie Gan (Washington University in St. Louis); Ulugbek S. Kamilov (Washington University in St. Louis)*
2568 Learning Visibility for Robust Dense Human Body Estimation Chun-Han Yao (University of California at Merced)*; Jimei Yang (Adobe); Duygu Ceylan (Adobe Research); Yi Zhou (Adobe Research); Yang Zhou (Adobe Research); Ming-Hsuan Yang (University of California at Merced)
2573 Texturify: Generating Textures on 3D Shape Surfaces Yawar Siddiqui (Technical University of Munich)*; Justus Thies (Max Planck Institute for Intelligent Systems); Fangchang Ma (Apple Inc.); Qi Shan (Apple Inc.); Matthias Niessner (Technical University of Munich); Angela Dai (Technical University of Munich)
2575 Unsupervised Selective Labeling for More Effective Semi-Supervised Learning Xudong Wang (UC Berkeley / ICSI); Long Lian (UC Berkeley / ICSI); Stella X Yu (UC Berkeley / ICSI)*
2576 Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly Spencer Whitehead (Meta AI)*; Suzanne Petryk (UC Berkeley); Vedaad Shakib (UC Berkeley); Joseph E Gonzalez (UC Berkeley); Trevor Darrell (UC Berkeley); Anna Rohrbach (UC Berkeley); Marcus Rohrbach (Facebook AI Research)
2581 Studying Bias in GANs through the Lens of Race Vongani H Maluleke (University of California, Berkeley); Neerja Thakkar (University of California, Berkeley)*; Tim Brooks (UC Berkeley); Ethan Weber (UC Berkeley); Trevor Darrell (UC Berkeley); Alexei A Efros (UC Berkeley); Angjoo Kanazawa (University of California Berkeley); Devin Guillory (UC Berkeley)
2583 On Multi-Domain Long-Tailed Recognition, Imbalanced Domain Generalization and Beyond Yuzhe Yang (MIT)*; Hao Wang (Rutgers University); Dina Katabi (Massachusetts Institute of Technology)
2584 Disentangling Object Motion and Occlusion for Unsupervised Multi-frame Monocular Depth Ziyue Feng (Clemson University)*; Liang Yang (Apple Inc); Longlong Jing (Waymo LLC); Haiyan Wang (The City College of New York); YingLi Tian (City University of New York); Bing Li (Clemson University)
2586 Autoregressive 3D Shape Generation via Canonical Mapping An-Chieh Cheng (National Tsing Hua University); Xueting Li (University of California, Merced); Sifei Liu (NVIDIA)*; Min Sun (NTHU); Ming-Hsuan Yang (University of California at Merced)
2589 Learning Continuous Implicit Representation for Near-Periodic Patterns Bowei Chen (CMU)*; Tiancheng Zhi (ByteDance); Martial Hebert (cmu); Srinivasa Narasimhan (Carnegie Mellon University, USA)
2596 Robust Landmark-based Stent Tracking in X-ray Fluoroscopy Luojie Huang (Johns Hopkins University); Yikang Liu (United Imaging Intelligence America); Li Chen (University of Washington); Eric Z. Chen (United Imaging Intelligence America); Xiao Chen (United Imaging Intelligence America); Shanhui Sun (United Imaging Intelligence America)*
2598 Depth Field Networks for Generalizable Multi-view Scene Representation Vitor Guizilini (Toyota Research Institute)*; Igor Vasiljevic (Toyota Research Institute); Jiading Fang (Toyota Technological Institute at Chicago); Rareș A Ambruș (Toyota Research Institute); Greg Shakhnarovich (Toyota Technological Institute at Chicago); Matthew Walter (Toyota Technological Institute at Chicago); Adrien Gaidon (Toyota Research Institute)
2601 Max Pooling with Vision Transformers reconciles class and shape in weakly supervised semantic segmentation Simone Rossetti (Sapienza University); Damiano Zappia (Deepplants S.r.l.); Marta Sanzari (Sapienza University of Rome); Marco Schaerf (Sapienza University of Rome); fiora pirri (University of Rome, Sapienza)*
2605 GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features Van-Quang Nguyen (Tohoku University)*; Masanori Suganuma (Tohoku University / RIKEN AIP); Takayuki Okatani (Tohoku University/RIKEN AIP)
2609 Learning Semantic Correspondence with Sparse Annotations Shuaiyi Huang (University of Maryland, College Park)*; Luyu Yang (University of Maryland, College Park); Bo He (University of Maryland); Songyang Zhang (Shanghai AI Laboratory); Xuming He (ShanghaiTech University); Abhinav Shrivastava (University of Maryland)
2610 A Real World Dataset for Multi-view 3D Reconstruction Rakesh Shrestha (Simon Fraser University)*; Siqi Hu (Alibaba damo academy); Minghao Gou (Shanghai Jiao Tong University); Ziyuan Liu (Huawei group); Ping Tan (Simon Fraser University)
2620 Social ODE: Multi-Agent Trajectory Forecasting with Neural Ordinary Differential Equations Song Wen (Rutgers University)*; Hao Wang (Rutgers University); Dimitris N. Metaxas (Rutgers)
2621 3D Instances as 1D Kernels Yizheng Wu (Huazhong Univ. of Sci.&Tech.); Min Shi (Huazhong University of Science and Technology); Shuaiyuan Du (Huazhong Univ. of Sci.&Tech.); Hao Lu (Huazhong University of Science and Technology); Zhiguo Cao (Huazhong Univ. of Sci.&Tech.)*; Weicai Zhong (Huawei CBG Consumer Cloud Service Big Data Platform Dept.)
2624 Context-Aware Streaming Perception in Dynamic Environments Gur-Eyal Sela (UC Berkeley)*; Ionel Gog (UC Berkeley); Justin Wong (UC Berkeley); Kumar Krishna Agrawal (UC Berkeley); Xiangxi Mo (UC Berkeley); Sukrit Kalra (UC Berkeley); Peter Schafhalter (UC Berkeley); Eric Leong (UC Berkeley); Xin Wang (Microsoft Research); Bharathan Balaji (Amazon); Joseph E Gonzalez (UC Berkeley); Ion Stoica (UC Berkeley)
2625 PointTree: Transformation-Robust Point Cloud Encoder with Relaxed K-D Trees Jun-Kun Chen (University of Illinois at Urbana-Champaign)*; Yu-Xiong Wang (University of Illinois at Urbana-Champaign)
2631 Dense Siamese Network for Dense Unsupervised Learning Wenwei Zhang (NTU)*; Jiangmiao Pang (CUHK); Kai Chen (SenseTime Research); Chen Change Loy (Nanyang Technological University)
2633 Uncertainty-aware Multi-modal Learning via Cross-modal Random Network Prediction Hu Wang (the University of Adelaide)*; Jianpeng Zhang (Northwestern Polytechnical University); Yuanhong Chen (University of Adelaide); Congbo Ma (The University of Adelaide); Jodie C Avery (University of Adelaide); Mary L Hull (University of Adelaide); Gustavo Carneiro (University of Adelaide)
2638 Enhanced Accuracy and Robustness via Multi-Teacher Adversarial Distillation Shiji Zhao (Beihang University); Jie Yu (Beihang University); Zhenlong Sun (Tencent Technology Co.Ltd); Bo Zhang (WeChat Search Application Department, Tencent); Xingxing Wei (Beihang University)*
2645 End-to-end graph-constrained vectorized floorplan generation with panoptic refinement Jiachen Liu (Pennsylvania State University)*; Yuan Xue (Johns Hopkins University); Jose P. Duarte (Penn State University); Krishnendra Shekhawat (BITS Pilani); Zihan Zhou (Manycore Tech Inc.); Sharon Xiaolei Huang (The Pennsylvania State University)
2649 Context Enhanced Stereo Transformer weiyu Guo (University of Chinese Academy of Sciences)*; Zhaoshuo Li (Johns Hopkins University); Yongkui Yang (Shenzhen Institute of Advanced Technology,Chinese Academy of Sciences); Zheng Wang (Shenzhen Institutes of Advanced Technology); Russ Taylor (Johns Hopkins University); Mathias Unberath (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Yingwei Li (Johns Hopkins University)
2652 NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition Boyang Xia (Institute of Computing Technology, Chinese Academy of Sciences); Wenhao Wu (Baidu)*; Haoran Wang (Baidu); RUI SU (the University of Sydney); Dongliang He (Baidu); Haosen Yang (Harbin Institute of Technology); Xiaoran Fan (Institute of Computing Technology, Chinese Academy of Sciences); Wanli Ouyang (The University of Sydney)
2663 Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning Yuxiao Chen (Rutgers University)*; Long Zhao (Google Research); Jianbo Yuan (Bytedance); Yu Tian (Rutgers); zhaoyang xia (Rutgers University); Shijie Geng (Rutgers University); Ligong Han (Rutgers University); Dimitris N. Metaxas (Rutgers)
2666 Few-Shot Video Object Detection Qi Fan (HKUST)*; Chi-Keung Tang (Hong Kong University of Science and Technology); Yu-Wing Tai (Kuaishou Technology / HKUST)
2667 Improving the Reliability for Confidence Estimation Haoxuan Qu (Singapore University of Technology and Design)*; Yanchao Li (Singapore University of Technology and Design); Lin Geng Foo (Singapore University of Technology and Design); Jason Kuen (Adobe Research); Jiuxiang Gu (Adobe Research); Jun Liu (Singapore University of Technology and Design)
2686 Selective Query-guided Debiasing for Video Corpus Moment Retrieval Sunjae Yoon (KAIST)*; Ji Woo Hong (KAIST); Eunseop Yoon (KAIST); DaHyun Kim (KAIST); Junyeong Kim (Chung-Ang University); Hee Suk Yoon (KAIST); Chang D. Yoo (KAIST)
2701 Posterior Refinement on Metric Matrix Improves Generalization in Metric Learning Mingda Wang (Shanghai Jiao Tong University); Canqian Yang (Shanghai Jiao Tong University); Yi Xu (Shanghai Jiao Tong University)*
2707 DISP6D: Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation Yilin Wen (The University of Hong Kong)*; Xiangyu Li (Brown University); Hao Pan (Microsoft Research); Lei Yang (The University of Hong Kong); Zheng Wang (SUSTech); Taku Komura (The University of Hong Kong); Wenping Wang (The University of Hong Kong)
2709 Few-shot Image Generation with Mixup-based Distance Learning Chaerin Kong (Seoul National University); Jeesoo Kim (Naver Webtoon AI); Donghoon Han (Seoul National University); Nojun Kwak (Seoul National University)*
2715 Data-Free Neural Architecture Search via Recursive Label Calibration Zechun Liu (Carnegie Mellon University); Zhiqiang Shen (Carnegie Mellon University)*; Yun Long (Google); Eric Xing (MBZUAI, CMU, and Petuum Inc.); Kwang-Ting Cheng (Hong Kong University of Science and Technology); Chas H Leichner (Google)
2717 Distilling Object Detectors With Global Knowledge Sanli Tang (Hikvision Research Institute); Zhongyu Zhang (Hikvision Research Institute); Zhanzhan Cheng (Zhejiang University & Hikvision Research Institute)*; Jing Lu (Hikvision Research Institute); Yunlu Xu (Hikvision Research Institute); Yi Niu (Hikvision Research Institute); Fan He (Shanghai Jiao Tong University)
2730 NEST: Neural Event Stack for Event-based Image Enhancement Minggui Teng (Peking University)*; Chu Zhou (Peking University); Hanyue Lou (Peking University); Boxin Shi (Peking University)
2732 Multi-Granularity Distillation Scheme Towards Lightweight Semi-Supervised Semantic Segmentation Jie Qin (School of Artificial Intelligence, University of Chinese Academy of Sciences; Institute of Automation,Chinese Academy of Sciences)*; Jie Wu (ByteDance Inc); Ming Li (Xiamen University); Xuefeng Xiao (ByteDance Inc); Min Zheng (ByteDance); Xingang Wang (Institute of Automation, CAS)
2740 A Style-Based GAN Encoder for High Fidelity Reconstruction of Images and Videos Xu YAO (Telecom ParisTech)*; Alasdair Newson (Telecom Paris); Yann Gousseau (Telecom Paris); PIERRE HELLIER (Interdigital (Technicolor))
2746 Unifying Visual Perception by Dispersible Points Learning Jianming Liang (Beihang University)*; Guanglu Song (Sensetime); Biao Leng (Beihang University); Yu Liu (SenseTime Group LTD)
2747 Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes Haolin Liu (The Chinese University of Hong Kong, Shenzhen)*; Yujian Zheng (The Chinese University of Hong Kong, Shenzhen); Guanying CHEN (The Chinese University of Hong Kong, Shenzhen); Shuguang Cui (The Chinese University of Hong Kong, Shenzhen); Xiaoguang Han (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen))
2756 Multimodal Transformer for Automatic 3D Annotation and Object Detection Chang Liu (The University of Hong Kong)*; Xiaoyan QIAN (The University of Hong Kong); Binxiao Huang (The University of Hong Kong); Xiaojuan Qi (The University of Hong Kong); Edmund Lam (The University of Hong Kong); Siew-Chong Tan (Nil); Ngai Wong (The University of Hong Kong)
2761 SP-Net: Slowly Progressing Dynamic Inference Networks Huanyu Wang (Zhejiang University)*; Wenhu Zhang (Zhejiang University); Shihao Su (Zhejiang University); Hui Wang (Zhejiang University); Zhenwei Miao (DAMO Academy, Alibaba Group); Xin Zhan (DAMO Academy, Alibaba Group); Xi Li (Zhejiang University)
2764 No Token Left Behind: Explainability-Aided Image Classification and Generation Roni Paiss (Tel Aviv University, Google); Hila Chefer (Tel Aviv University)*; Lior Wolf (Tel Aviv University, Israel)
2766 Dynamically Transformed Instance Normalization Network for Generalizable Person Re-Identification BingLiang Jiao (Northwestern Polytechnical University); Lingqiao Liu (University of Adelaide); Liying Gao (Northwestern Polytechnical University); Guosheng Lin (Nanyang Technological University); Lu Yang (Northwestern Polytechnical University); Shizhou Zhang (Northwestern Polytechnical University); Peng Wang (Northwestern Polytechnical University)*; Yanning Zhang (Northwestern Polytechnical University)
2772 Editable Indoor Lighting Estimation Henrique Weber (Université Laval)*; Mathieu Garon (Depix); Jean-Francois Lalonde (Université Laval)
2783 PseCo: Pseudo Labeling and Consistency Training for Semi-Supervised Object Detection Gang Li (Nanjing University of Science and Technology)*; Xiang Li (Nanjing University of Science and Technology); Yujie Wang (Sensetime Research); Yichao Wu (Sensetime Group Limited); Ding Liang (Sensetime Group Limited); Shanshan Zhang (Max Planck Institute for Informatics)
2786 CompNVS: Novel View Synthesis with Scene Completion Zuoyue Li (ETH Zurich)*; Tianxing Fan (Zhejiang University); Zhenqiang Li (The University of Tokyo); Zhaopeng Cui (Zhejiang University); Yoichi Sato (University of Tokyo); Marc Pollefeys (ETH Zurich / Microsoft); Martin R. Oswald (ETH Zurich)
2787 Dynamic 3D Scene Analysis by Point Cloud Accumulation Shengyu Huang (ETH Zürich)*; Zan Gojcic (NVIDIA); Jiahui Huang (Tsinghua University); Andreas Wieser (ETH Zürich); Konrad Schindler (ETH Zurich)
2798 FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs Ziqiang Li (University of Science and Technology of China)*; Chaoyue Wang (JD.com); Heliang Zheng (JD Explore Academy, JD.com); Jing Zhang (The University of Sydney); Bin Li (University of Science and Technology of China)
2802 Resolving Copycat Problems in Visual Imitation Learning via Residual Action Prediction Chia-Chi Chuang (Tsinghua University); Donglin Yang (Tsinghua University); Chuan Wen (Tsinghua University)*; Yang Gao (Tsinghua University)
2804 REALY: Rethinking the Evaluation of 3D Face Reconstruction Zenghao Chai (Tsinghua University); Haoxian Zhang (Tencent); Jing Ren (ETH Zurich); Di Kang (Tencent); Zhengzhuo Xu (Tsinghua University); Xuefei Zhe (Tencent AI lab); Chun Yuan (Graduate school at ShenZhen,Tsinghua university); Linchao Bao (Tencent AI Lab)*
2806 TransMatting: Enhancing Transparent Objects Matting with Transformers huanqia cai (University of Chinese Academy of Sciences)*; Fanglei Xue (University of Chinese Academy of Sciences); Lele Xu (Key Laboratory of Space Utilization, Technology and Engineering Center for space Utilization, Chinese Academy of Sciences.); lili guo (Key Laboratory of Space Utilization, Technology and Engineering Center for space Utilization, Chinese Academy of Sciences.)
2814 Diverse Image Inpainting with Normalizing Flow Cairong Wang (Graduate school at Shenzhen, Tsinghua University)*; Yiming M Zhu (Graduate school at ShenZhen,Tsinghua university); Chun Yuan (Graduate school at ShenZhen,Tsinghua university)
2818 Video Activity Localisation with Uncertainties in Temporal Boundary Jiabo Huang (Queen Mary University of London)*; Hailin Jin (Adobe Research); Shaogang Gong (Queen Mary University of London); Yang Liu (Peking University)
2822 SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling Chenjian Gao (Beihang University); Qian Yu (Beihang University)*; Lu Sheng (Beihang University); Yi-Zhe Song (University of Surrey); Dong Xu (The University of Hong Kong)
2829 Exploring Resolution and Degradation Clues as Self-supervised Signal for Low Quality Object Detection Ziteng Cui (The University of Tokyo); Yingying Zhu (University of Texas Arlington); Lin Gu (RIKEN,AIP / The University of Tokyo)*; Guo-Jun Qi (Futurewei Technologies); Xiaoxiao Li (The University of British Columbia); Renrui Zhang (Shanghai AI Lab); Zenghui Zhang (Shanghai Jiao Tong university); Tatsuya Harada (The University of Tokyo / RIKEN)
2840 CP2: Copy-Paste Contrastive Pretraining for Semantic Segmentation Feng Wang (Tsinghua University)*; Huiyu Wang (JHU); Chen Wei (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Wei Shen (Shanghai Jiao Tong University)
2852 Learning from Multiple Annotator Noisy Labels via Sample-wise Label Fusion Zhengqi Gao (MIT)*; Fan-Keng Sun (MIT); Mingran Yang (MIT); Sucheng Ren (South China University of Technology); Zikai Xiong (Massachusetts Institute of Technology); Marc Engeler (Takeda); Antonio Burazer (Takeda); Linda Wildling (Takeda Pharmaceuticals International AG); Luca Daniel (Massachusetts Institute of Technology); Duane Boning (MIT)
2856 Robust Category-Level 6D Pose Estimation with Coarse-to-Fine Rendering of Neural Features Wufei Ma (Purdue University)*; Angtian Wang (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Adam Kortylewski (Max Planck Institute for Informatics)
2861 A Unified Framework for Domain Adaptive Pose Estimation Donghyun Kim (MIT-IBM Watson AI Lab)*; Kaihong Wang (Boston University); Stan Sclaroff (Boston University); Margrit Betke (Boston University); Kate Saenko (Boston University)
2862 A Broad Study of Pre-training for Domain Generalization and Adaptation Donghyun Kim (MIT-IBM Watson AI Lab)*; Kaihong Wang (Boston University); Stan Sclaroff (Boston University); Kate Saenko (Boston University)
2863 BlobGAN: Spatially Disentangled Scene Representations Dave Epstein (UC Berkeley)*; Taesung Park (Adobe Research); Richard Zhang (Adobe); Eli Shechtman (Adobe Research, US); Alexei A Efros (UC Berkeley)
2864 LGV: Boosting Adversarial Example Transferability from Large Geometric Vicinity Martin Gubri (University of Luxembourg)*; Maxime Cordy (University of Luxembourg); Mike Papadakis (University of Luxembourg); Yves Le Traon (University of Luxembourg); Koushik Sen (University of California, Berkeley)
2871 LocalBins: Improving Depth Estimation by Learning Local Distributions Shariq F Bhat (KAUST)*; Ibraheem Alhashim (National Center for Artificial Intelligence (NCAI), Saudi Data and Artificial Intelligence Authority (SDAIA), Riyadh, Kingdom of Saudi Arabia); Peter Wonka (KAUST)
2872 Prior Knowledge Guided Unsupervised Domain Adaptation Tao Sun (Stony Brook University)*; Cheng Lu (Xiaopeng); Haibin Ling (Stony Brook University)
2877 Fast Two-step Blind Optical Aberration Correction Thomas Eboli (ENS Paris-Saclay)*; Jean-Michel Morel (Centre Borelli ENS Paris-Saclay); Gabriele Facciolo (ENS Paris – Saclay)
2887 Controllable and Guided Face Synthesis for Unconstrained Face Recognition Feng Liu (Michigan State University)*; Minchul Kim (Michigan State University); Anil Jain (Michigan State University); Xiaoming Liu (Michigan State University)
2888 2D GANs Meet Unsupervised Single-view 3D Reconstruction Feng Liu (Michigan State University)*; Xiaoming Liu (Michigan State University)
2891 Seeing Far in the Dark with Patterned Flash Zhanghao Sun (Stanford University)*; Jian Wang (Snap); Yicheng Wu (Snap Inc.); Shree Nayar (Snap)
2900 Unified Implicit Neural Stylization Zhiwen Fan (University of Texas at Austin)*; Yifan Jiang (University of Texas at Austin); Peihao Wang (University of Texas at Austin); Xinyu Gong (University of Texas at Austin); Dejia Xu (University of Texas at Austin); Zhangyang Wang (University of Texas at Austin)
2901 Improved Masked Image Generation with Token-Critic Jose Lezama (Google Research)*; Huiwen Chang (Google); Lu Jiang (Google Research); Irfan Essa (Google)
2902 UNIF: United Neural Implicit Functions for Clothed Human Reconstruction and Animation Shenhan Qian (ShanghaiTech University)*; Jiale Xu (ShanghaiTech University); Ziwei Liu (Nanyang Technological University); Liqian Ma (ZMO AI); Shenghua Gao (Shanghaitech University)
2903 PseudoClick: Interactive Image Segmentation with Click Imitation Qin Liu (UNC)*; Meng Zheng (United Imaging Intelligence); Benjamin Planche (United Imaging Intelligence); Srikrishna Karanam (Adobe Research); Terrence Chen (United Imaging Intelligence); Marc Niethammer (UNC); Ziyan Wu (United Imaging Intelligence)
2904 CoSCL: Cooperation of Small Continual Learners is Stronger than a Big One Liyuan Wang (Tsinghua University)*; Xingxing Zhang (Tsinghua University); Qian Li (Tsinghua University); Jun Zhu (Tsinghua University); Yi Zhong (Tsinghua University)
2909 Scalable Learning to Optimize: A Learned Optimizer Can Train Big Models Xuxi Chen (University of Texas at Austin)*; Tianlong Chen (University of Texas at Austin); Yu Cheng (Microsoft Research); Weizhu Chen (Microsoft); Ahmed Awadallah (Microsoft); Zhangyang Wang (University of Texas at Austin)
2921 PRIF: Primary Ray-based Implicit Function Brandon Yushan Feng (University of Maryland, College Park)*; Yinda Zhang (Google); Danhang Tang (Google); Ruofei Du (Google); Amitabh Varshney (University of Maryland)
2925 From Face to Natural Image: Learning Real Degradation for Blind Image Super-Resolution Xiaoming Li (Harbin Institute of Technology); Chaofeng Chen (Nanyang Technological University); Xianhui Lin (Alibaba Group); Wangmeng Zuo (Harbin Institute of Technology, China)*; Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China)
2936 QISTA-ImageNet: A Deep Compressive Image Sensing Framework Solving Lq-Norm Optimization Problem Gang-Xuan Lin (Academia Sinica); Shih-Wei Hu (National Taiwan University); Chun-Shien Lu (Academia Sinica)*
2943 Trust, but Verify: Using Self-Supervised Probing to Improve Trustworthiness Ailin Deng (National University of Singapore)*; Shen Li (National University of Singapore); Miao Xiong (National University of Singapore); Zhirui Chen (National University of Singapore); Bryan Hooi (National University of Singapore)
2948 Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding Cheng Shi (ShanghaiTech University); Sibei Yang (ShanghaiTech University)*
2953 Med-DANet: Dynamic Architecture Network for Efficient Medical Volumetric Segmentation Wenxuan Wang (University of Science and Technology Beijing)*; Chen Chen (University of Central Florida); Jing Wang (University of Science and Technology Beijing); Sen Zha (University of Science and Technology Beijing); Yan Zhang (University of Science and Technology Beijing); Jiangyun Li (University of Science and Technology Beijing)
3005 Worst Case Matters for Few-Shot Recognition Minghao Fu (Nanjing University); Yunhao Cao (Nanjing University); Jianxin Wu (Nanjing University)*
3017 Self-Filtering: A Noise-Aware Sample Selection for Label Noise with Confidence Penalization Qi Wei (Shandong University)*; Haoliang Sun (Shandong University); Xiankai Lu (Shandong University); Yilong Yin (Shandong University)
3035 Point Cloud Domain Adaptation via Masked Local 3D Structure Prediction Hanxue Liang (University of Texas at Austin)*; Hehe Fan (NUS); Zhiwen Fan (University of Texas at Austin); Yi Wang (University of Texas at Austin); Tianlong Chen (University of Texas at Austin); Yu Cheng (Microsoft Research); Zhangyang Wang (University of Texas at Austin)
3041 Translation, Scale and Rotation: Cross-Modal Alignment Meets RGB-Infrared Vehicle Detection Maoxun Yuan (Beihang University); Yinyan Wang (Beihang University); Xingxing Wei (Beihang University)*
3043 Simple Baselines for Image Restoration Liangyu Chen (Megvii Technology)*; Xiaojie Chu (Megvii Technology); Xiangyu Zhang (Megvii Technology); Jian Sun (Megvii Technology)
3058 RDA: Reciprocal Distribution Alignment for Robust Semi-supervised Learning Yue Duan (Nanjing University)*; Lei Qi (Southeast University); Lei Wang (University of Wollongong, Australia); Luping Zhou (University of Sydney); Yinghuan Shi (Nanjing University)
3060 Exploring Hierarchical Graph Representation for Large-Scale Zero-Shot Image Classification Kai Yi (King Abdullah University of Science and Technology)*; Xiaoqian Shen (King Abdullah University of Science and Technology); Yunhao Gou (Hong Kong University of Science and Technology); Mohamed Elhoseiny (KAUST)
3080 Doubly Deformable Aggregation of Covariance Matrices for Few-shot Segmentation Zhitong Xiong (Technical University of Munich)*; Haopeng Li (The University of Melbourne); Xiaoxiang Zhu (Technical University of Munich (TUM); German Aerospace Center (DLR))
3093 MemSAC: Memory Augmented Sample Consistency for Large Scale Domain Adaptation Tarun Kalluri (UC San Diego)*; Astuti Sharma (UCSD); Manmohan Chandraker (UC San Diego)
3094 GCISG: Guided Causal Invariant Learning for Improved Syn-to-real Generalization Gilhyun Nam (Agency for Defense Development)*; Gyeongjae Choi (Agency for Defense Development); Kyungmin Lee (Agency for Defense Development)
3101 Temporal Saliency Query Network for Efficient Video Recognition Boyang Xia (Institute of Computing Technology, Chinese Academy of Science); Zhihao Wang (Institute of Computing Technology, Chinese Academy of Sciences); Wenhao Wu (Baidu)*; Haoran Wang (Baidu); Jungong Han (Aberystwyth University)
3116 Towards Interpretable Video Super-Resolution via Alternating Optimization Jiezhang Cao (ETH Zürich)*; Jingyun Liang (ETH Zurich); Kai Zhang (ETH Zurich); Wenguan Wang (Eidgenössische Technische Hochschule Zürich); Qin Wang (ETH Zurich); Yulun Zhang (ETH Zurich); Hao Tang (ETH Zurich); Luc Van Gool (ETH Zurich)
3118 R-DFCIL: Relation-Guided Representation Learning for Data-Free Class Incremental Learning Qiankun Gao (Peking University Shenzhen Graduate School)*; Chen Zhao (KAUST); Bernard Ghanem (KAUST); Jian Zhang (Peking University Shenzhen Graduate School)
3125 Spike Transformer: Monocular Depth Estimation for Spiking Camera Jiyuan Zhang (Peking University)*; Lulu Tang (Tsinghua University); Zhaofei Yu (Peking University); Jiwen Lu (Tsinghua University); Tiejun Huang (Peking University)
3127 Towards Robust Face Recognition with Comprehensive Search Manyuan Zhang (Sensetime)*; Guanglu Song (Sensetime); Yu Liu (SenseTime Group LTD); Hongsheng Li (The Chinese University of Hong Kong)
3129 Improving Image Restoration by Revisiting Global Information Aggregation Xiaojie Chu (Megvii Technology)*; Liangyu Chen (Megvii Technology); Chengpeng Chen (Megvii); Xin Lu (Megvii Technology)
3132 Learning Pedestrian Group Representations for Multi-modal Trajectory Prediction Inhwan Bae (Gwangju Institute of Science and Technology)*; Jin-Hwi Park (GIST); Hae-Gon Jeon (GIST)
3138 RFLA: Gaussian Receptive Field based Label Assignment for Tiny Object Detection Chang Xu (Wuhan University); Jinwang Wang (Huawei Technology); Wen Yang (Wuhan University)*; Huai Yu (Wuhan University); Lei Yu (Wuhan University); Gui-Song Xia (Wuhan University)
3139 Semi-supervised Single-view 3D Reconstruction via Prototype Shape Priors Zhen Xing (Fudan University)*; Hengduo Li (University of Maryland, College Park); Zuxuan Wu (UMD); Yu-Gang Jiang (Fudan University)
3145 Sequential Multi-View Fusion Network for Fast LiDAR Point Motion Estimation Gang Zhang (Damo Academy, Alibaba Group)*; Xiaoyan Li (Beijing University of Technology); Zhenhua Wang (DAMO Academy, Alibaba Group)
3147 A Large-scale Multiple-objective Method for Black-box Attack against Object Detection Siyuan Liang (Chinese Academy of Sciences); Longkang Li (Mohamed bin Zayed University of Artificial Intelligence); Yanbo Fan (Tencent AI Lab); Xiaojun Jia (Institute of Information Engineering, Chinese Academy of Sciences); Jingzhi Li (Institute of information engineering, CAS); Baoyuan Wu (The Chinese University of Hong Kong, Shenzhen)*; Xiaochun Cao (Sun Yat-sen University)
3150 GradAuto: Energy-oriented Attack on Dynamic Neural Networks Jianhong Pan (Singapore University of Technology and Design)*; Qichen Zheng (Singapore University of Technology and Design); Zhipeng Fan (NYU TANDON SCHOOL OF ENGINEERING); Hossein Rahmani (Lancaster University); Qiuhong Ke (Monash University); Jun Liu (Singapore University of Technology and Design)
3151 Semantic-guided Multi-Mask Image Harmonization Xuqian Ren (Watrix Technology); Yifan Liu (University of Adelaide)*
3155 Manifold Adversarial Learning for Cross-domain 3D Shape Representation Hao Huang (New York University); Cheng Chen (New York University); Yi Fang (New York University)*
3167 GAN with Multivariate Disentangling for Controllable Hair Editing Xuyang Guo (Institute of Computing Technology, Chinese Academy of Sciences); Meina Kan (Institute of Computing Technology, Chinese Academy of Sciences); Tianle Chen (Institute of Computing Technology, Chinese Academy of Sciences); Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences)*
3169 Fast-MoCo: Boost Momentum-based Contrastive Learning with Combinatorial Patches Yuanzheng Ci (The University of Sydney)*; Chen Lin (University of Oxford); Lei Bai (Shanghai AI Laboratory); Wanli Ouyang (The University of Sydney)
3179 Dense Cross-Query-and-Support Attention Weighted Mask Aggregation for Few-Shot Segmentation Xinyu Shi (School of Computer Science and Engineering, Southeast University); DONG WEI (Tencent Jarvis Lab)*; Yu Zhang (Southeast University); Donghuan Lu (Tencent); Munan Ning (Tencent); Jiashun Chen (School of Computer Science and Engineering, Southeast University); Kai Ma (Tencent); Yefeng Zheng (Tencent)
3180 Acknowledging the Unknown for Multi-label Learning with Single Positive Labels Donghao Zhou (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)*; Pengfei Chen (The Chinese University of Hong Kong); Qiong Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Guangyong Chen (Shenzhen Institutes of Advanced Technology); Pheng-Ann Heng (The Chinese University of Hong Kong)
3200 LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling Boyan Jiang (Fudan University)*; Xinlin Ren (Fudan University); Mingsong Dou (Google Inc.); Xiangyang Xue (Fudan University); Yanwei Fu (Fudan University); Yinda Zhang (Google)
3202 Bilateral Normal Integration Xu Cao (Osaka University)*; Hiroaki Santo (Osaka University); Boxin Shi (Peking University); Fumio Okura (Osaka University); Yasuyuki Matsushita (Osaka University)
3203 Harmonizer: Learning to Perform White-Box Image and Video Harmonization Zhanghan Ke (City University of Hong Kong)*; Chunyi Sun (Australian National University); Lei ZHU (City University of Hong Kong); Ke Xu (City University of Hong Kong); Rynson W.H. Lau (City University of Hong Kong)
3213 On the Versatile Uses of Partial Distance Correlation in Deep Learning Xingjian Zhen (University of Wisconsin-Madison)*; Zihang Meng (University of Wisconsin Madison); Rudrasis Chakraborty (Butlr); Vikas Singh (University of Wisconsin Madison)
3214 Object-Centric Unsupervised Image Captioning Zihang Meng (University of Wisconsin Madison)*; David Yang (Facebook); Xuefei Cao (Facebook); Ashish Shah (Facebook AI); Ser-Nam Lim (Meta AI)
3217 Pose2Room: Understanding 3D Scenes from Human Activities Yinyu Nie (Technical University of Munich)*; Angela Dai (Technical University of Munich); Xiaoguang Han (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen)); Matthias Niessner (Technical University of Munich)
3218 Capturing, Reconstructing, and Simulating: the UrbanScene3D Dataset Liqiang Lin (Shenzhen University); Yilin Liu (Shenzhen University); Yue Hu (Shenzhen University); Xingguang Yan (Shenzhen University); Ke Xie (Shenzhen University); Hui Huang (Shenzhen University)*
3225 A Spectral View of Randomized Smoothing under Common Corruptions: Benchmarking and Improving Certified Robustness Jiachen Sun (University of Michigan)*; Akshay Mehra (Tulane University); Bhavya Kailkhura (Lawrence Livermore National Laboratory); Pin-Yu Chen (IBM Research); Dan Hendrycks (UC Berkeley); Jihun Hamm (Tulane University); Zhuoqing Morley Mao (University of Michigan)
3229 CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes Kim Youwang (POSTECH)*; Ji-Yeon Kim (POSTECH); Tae-Hyun Oh (POSTECH)
3240 Interpretable Image Classification with Differentiable Prototypes Assignment Dawid Damian Rymarczyk (Jagiellonian University)*; Łukasz Struski (Jagiellonian University); Michał Górszczak (Jagiellonian University); Koryna Lewandowska (Jagiellonian University); Jacek Tabor (Jagiellonian University); Bartosz Zieliński (Jagiellonian University)
3247 Efficient One-stage Video Object Detection by Exploiting Temporal Consistency Guanxiong Sun (Queen’s University Belfast); Yang Hua (Queen’s University Belfast)*; Guosheng Hu (Oosto); Neil Robertson (Queen’s University Belfast)
3250 ConCL: Concept Contrastive Learning for Dense Prediction Pre-training in Pathology Images Jiawei Yang (UCLA)*; Hanbo Chen (Tencent AI Lab); Yuan Liang (UCLA); Junzhou Huang (University of Texas at Arlington); Lei He (UCLA); Jianhua Yao (National Institutes of Health)
3254 Leveraging Action Affinity and Continuity for Semi-supervised Temporal Action Segmentation Guodong Ding (National University of Singapore)*; Angela Yao (National University of Singapore)
3257 Fast and High Quality Image Denoising via Malleable Convolution Yifan Jiang (University of Texas at Austin)*; Bartlomiej Wronski (Google Research); Ben Mildenhall (Google Research); Jonathan T Barron (Google Research); Zhangyang Wang (University of Texas at Austin); Tianfan Xue (Google)
3265 Data Association between Event Streams and Intensity Frames under Diverse Baselines Dehao Zhang (Peking University)*; Qiankun Ding (Peking University); Peiqi Duan (Peking University); Chu Zhou (Peking University); Boxin Shi (Peking University)
3287 Self-Regulated Feature Learning via Teacher-free Feature Distillation Lujun Li (Chinese Academy of Science)*
3289 TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval Yuqi Liu (Renmin University of China)*; Pengfei Xiong (Shopee); Luhui Xu (Tencent); Cao Shengming (Tencent); Qin Jin (Renmin University of China)
3292 TAPE: Task-Agnostic Prior Embedding for Image Restoration Lin Liu (University of Science and Technology of China)*; Lingxi Xie (Huawei Inc.); Xiaopeng Zhang (Noah’s Ark Lab, Huawei Inc.); Shanxin Yuan (Huawei Noah’s Ark Lab); Xiangyu Chen (University of Macau; SIAT); Wengang Zhou (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China); Qi Tian (Huawei Cloud & AI)
3293 MVSalNet: Multi-View Augmentation for RGB-D Salient Object Detection JiaYuan Zhou (Dalian University of Technology)*; Lijun Wang (Dalian University of Technology); Huchuan Lu (Dalian University of Technology); Kaining Huang (huang kaining); Xinchu Shi (Meituan Group); Bocong Liu (Meituan)
3295 Rethinking IoU-based Optimization for Single-stage 3D Object Detection Hualian Sheng (College of Information Science and Electronic Engineering, Zhejiang University; DAMO Academy, Alibaba Group)*; Sijia Cai (DAMO Academy, Alibaba Group); Na Zhao (NUS); Bing Deng (Damo Academy, Alibaba Group); Jianqiang Huang (Damo Academy, Alibaba Group); Xian-Sheng Hua (Damo Academy, Alibaba Group); Min-Jian Zhao (Zhejiang University); Gim Hee Lee (National University of Singapore)
3298 Uncertainty Inspired Underwater Image Enhancement Zhenqi Fu (Xiamen University)*; Wu Wang (Xiamen University); Yue Huang (Xiamen University); Xinghao Ding (Xiamen University); Kai-Kuang Ma (Nanyang Technological University, Singapore)
3300 k-means Mask Transformer Qihang Yu (Johns Hopkins University)*; Huiyu Wang (JHU); Siyuan Qiao (Google); Maxwell D Collins (Google Inc.); Yukun Zhu (Google Inc.); Hartwig Adam (Google); Alan Yuille (Johns Hopkins University); Liang-Chieh Chen (Google Inc.)
3302 Contrastive Vision-Language Pre-training with Limited Resources Quan Cui (Waseda University)*; Boyan Zhou (ByteDance); Yu Guo (Fudan University); Weidong Yin (UBC); Hao Wu (Bytedance Inc.); Osamu Yoshie (Waseda University); Yubo Chen (Bytedance)
3305 Learning Linguistic Association Towards Efficient Text-Video Retrieval Sheng Fang (ICT); Shuhui Wang (VIPL, ICT, Chinese Academy of Sciences)*; Junbao Zhuo (ICT CAS); Xinzhe Han (University of Chinese Academy of Sciences); Qingming Huang (University of Chinese Academy of Sciences)
3308 United Defocus Blur Detection and Deblurring via Adversarial Promoting Learning Wenda Zhao (Dalian University of Technology)*; Fei Wei (Dalian University of Technology); You He (Naval Aviation University); Huchuan Lu (Dalian University of Technology)
3314 Unstructured Feature Decoupling for Vehicle Re-Identification Wen Qian (Institute of Automation, Chinese Academy of Sciences)*; Hao Luo (Alibaba group); Silong Peng (The Chinese academy of science); Fan Wang (Alibaba Group); Chen Chen (The Chinese academy of science); Hao Li (Alibaba Group)
3322 Improving Adversarial Robustness of 3D Point Cloud Classification Models Guanlin Li (Nanyang Technological University)*; Guowen Xu (Nanyang Technological University); Han Qiu (Tsinghua University); Ruan HE (Tencent); Jiwei Li (Shannon.AI); Tianwei Zhang (Nanyang Technological University)
3324 ASSISTER: Assistive Navigation via Conditional Instruction Generation Zanming Huang (Boston University); Zhongkai Shangguan (Boston University); Jimuyang Zhang (Boston University); Gilad Bar (Rutgers University – Camden); Matthew Boyd (Boston University); Eshed Ohn-Bar (Boston University)*
3342 Deep Hash Distillation for Image Retrieval Young Kyun Jang (Seoul National University)*; Geonmo Gu (NAVER corp); Byungsoo Ko (NAVER/LINE Corp.); Isaac Kang (Seoul National University); Nam Ik Cho (Seoul National University)
3345 Learning Spatial-Preserved Skeleton Representations for Few-Shot Action Recognition Ning Ma (Zhejiang University)*; Hongyi Zhang (Zhejiang University); Xuhui Li (Zhejiang University); Sheng Zhou (Zhejiang University); Zhen Zhang (National University of Singapore); Jun Wen (Harvard University); Haifeng Li (Zhejiang University); Jingjun Gu (Zhejiang University); Jiajun Bu (Zhejiang University)
3346 Digging into Radiance Grid for Real-Time View Synthesis with Detail Preservation Jian Zhang (Alibaba Group); Jinchi Huang (Alibaba Group); Bowen Cai (Alibaba Group); Huan Fu (Alibaba Group)*; Mingming Gong (University of Melbourne); Chaohui Wang (Laboratoire d’Informatique Gaspard Monge, Université Paris-Est); Jiaming Wang (Alibaba Group); Hongchen Luo (Alibaba Group); Rongfei Jia (Alibaba Group); Binqiang Zhao (Alibaba); Xing Tang (Alibaba Group)
3351 S^2Contact: Graph-based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning Tze Ho Elden Tse (University of Birmingham)*; Zhongqun Zhang (University of Birmingham); Kwang In Kim (UNIST); Ales Leonardis (University of Birmingham); Feng Zheng (SUSTech); Hyung Jin Chang (University of Birmingham)
3359 TD-Road: Top-Down Road Network Extraction with Holistic Graph Construction Yang He (Amazon)*; Ravi Garg (Amazon com services inc); Amber Roy Chowdhury (Amazon)
3366 StyleGAN-Human: A Data-Centric Odyssey of Human Generation Jianglin Fu (SenseTime)*; Shikai Li (SenseTime Research); Yuming Jiang (Nanyang Technological University); Kwan-Yee Lin (SenseTime Research); Chen Qian (SenseTime); Chen Change Loy (Nanyang Technological University); Wayne Wu (SenseTime Research); Ziwei Liu (Nanyang Technological University)
3369 Hourglass Attention Network for Image Inpainting Ye Deng (Xi’an Jiaotong University)*; Siqi Hui (Xi’an Jiaotong University); Rongye Meng (IAIR, Xi’an Jiaotong University); Sanping Zhou (Xi’an Jiaotong University); Jinjun Wang (Xi’an Jiaotong University)
3370 MaxViT: Multi-Axis Vision Transformer Zhengzhong Tu (University of Texas at Austin)*; Hossein Talebi (Google); Han Zhang (Google); Feng Yang (Google Research); Peyman Milanfar (Google); Alan Bovik (University of Texas at Austin); Yinxiao Li (Google)
3378 Gen6D: Generalizable Model-Free 6-DoF Object Pose Estimation from RGB Images Yuan Liu (The University of Hong Kong)*; Yilin Wen (The University of Hong Kong); Sida Peng (Zhejiang University); Cheng Lin (Tencent); Xiaoxiao Long (The University of Hong Kong); Taku Komura (The University of Hong Kong); Wenping Wang (The University of Hong Kong)
3385 ColorFormer: Image Colorization via Color Memory assisted Hybrid-attention Transformer Xiaozhong Ji (Tencent)*; Boyuan Jiang (Tencent Youtu Lab); Donghao Luo (Tencent); Guangpin Tao (Nanjing University); Wenqing Chu (Tencent); Zhifeng Xie (Shanghai University); Chengjie Wang (Tencent; Shanghai Jiao Tong University); Ying Tai (Tencent YouTu)
3387 Spotting Temporally Precise, Fine-Grained Events in Video James Hong (Stanford University)*; Haotian Zhang (Stanford University); Michaël Gharbi (Adobe Research); Matthew Fisher (Adobe Research); Kayvon Fatahalian (Stanford)
3390 SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness Jindong Gu (University of Munich)*; Hengshuang Zhao (University of Oxford); Volker Tresp (Siemens AG and Ludwig Maximilian University of Munich); Philip Torr (University of Oxford)
3391 Adversarial Erasing Framework via Triplet with Gated Pyramid Pooling Layer for Weakly Supervised Semantic Segmentation Sung-Hoon Yoon (KAIST)*; Hyeokjun Kweon (KAIST); Jegyeong Cho (KAIST); Shinjeong Kim (KAIST); Kuk-Jin Yoon (KAIST)
3393 Semi-Supervised Vision Transformers Zejia Weng (Fudan University)*; Xitong Yang (University of Maryland); Ang Li (Google DeepMind); Zuxuan Wu (UMD); Yu-Gang Jiang (Fudan University)
3394 Learning an Isometric Surface Parameterization for Texture Unwrapping Sagnik Das (Stony Brook University)*; Ke Ma (Stony Brook University); Zhixin Shu (Adobe Research); Dimitris Samaras (Stony Brook University)
3409 Mimic Embedding via Adaptive Aggregation: Learning Generalizable Person Re-identification BOQIANG XU (University of Chinese Academy of Sciences; Institute of Automation, Chinese Academy of Sciences)*; Jian Liang (CASIA); He Lingxiao (nlpr,cripac); Zhenan Sun (Chinese Academy of Sciences)
3418 CryoAI: Amortized Inference of Poses for Ab Initio Reconstruction of 3D Molecular Volumes from Real Cryo-EM Images Axel Levy (Stanford University); Frederic Poitevin (SLAC National Accelerator Laboratory); Julien N. P. Martel (Stanford University); Youssef Nashed (SLAC National Accelerator Laboratory); Ariana Peck (SLAC National Accelerator Laboratory); Nina Miolane (UCSB); Daniel Ratner (Stanford University); Mike Dunne (SLAC National Accelerator Laboratory); Gordon Wetzstein (Stanford University)*
3419 EAGAN: Efficient Two-stage Evolutionary Architecture Search for GANs Guohao Ying (University of Southern California); Xin He (Hong Kong Baptist University); Bin Gao (National University of Singapore); Bo Han (HKBU / RIKEN); Xiaowen Chu (Hong Kong University of Science and Technology)*
3428 ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer Rui Yang (Tsinghua University)*; Hailong Ma (ByteDance Inc); Jie Wu (ByteDance Inc); Yansong Tang (Tsinghua University); Xuefeng Xiao (ByteDance Inc); Min Zheng (ByteDance); Xiu Li (Tsinghua University)
3429 PlaneFormers: From Sparse View Planes to 3D Reconstruction Samir Agarwala (University of Michigan)*; Linyi Jin (University of Michigan); Chris Rockwell (University of Michigan); David Fouhey (University of Michigan)
3438 Domain Adaptive Video Segmentation via Temporal Pseudo Supervision Yun Xing (Nanyang Technological University); Dayan Guan (Mohamed bin Zayed University of Artificial Intelligence); Jiaxing Huang (Nanyang Technological University); Shijian Lu (Nanyang Technological University)*
3442 Diverse Learner: Exploring Diverse Supervision for Semi-supervised Object Detection Linfeng Li (Baidu)*; Minyue Jiang (Baidu Inc.); Yue Yu (Baidu.Inc.); Wei Zhang (Baidu Inc); Xiangru Lin (Baidu Inc.); Yingying Li (Baidu); Xiao Tan (Baidu Inc.); Jingdong Wang (Baidu); Errui Ding (Baidu Inc.)
3452 Overlooked Poses Actually Make Sense: Distilling Privileged Knowledge for Human Motion Prediction Xiaoning Sun (Nanjing University of Science and Technology)*; Qiongjie Cui (Nanjing University of Science and Technology); Huaijiang Sun (Nanjing University of Science and Technology); Bin Li (Tianjin AiForward Science and Technology); Weiqing Li (Nanjing University of Science and Technology); Jianfeng Lu (Nanjing University of Science and Technology)
3455 Towards Hard-Positive Query Mining for DETR-based Human-Object Interaction Detection Xubin Zhong (South China University of Technology); Changxing Ding (South China University of Technology)*; Zijian Li (South China University of Technology); Shaoli Huang (Tencent AI-Lab)
3458 Learning Extremely Lightweight and Robust Model with Differentiable Constraints on Sparsity and Condition Number Xian Wei (East China Normal University); Yangyu Xu (Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences; University of Chinese Academy of Sciences); Yanhui Huang (Fuzhou University); Hairong Lv (Tsinghua University); Hai Lan (Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences); Mingsong Chen (East China Normal University); XUAN TANG (East China Normal University)*
3470 Structural Triangulation: A Closed-Form Solution to Constrained 3D Human Pose Estimation Zhuo Chen (Shanghai Jiao Tong University)*; Xu Zhao (Shanghai Jiao Tong University); Xiaoyue Wan (Shanghai Jiao Tong University)
3474 Latency-Aware Collaborative Perception Zixing Lei (Shanghai Jiao Tong University)*; Shunli Ren (Shanghai Jiao Tong University); Yue Hu (Shanghai Jiao Tong University); Wenjun Zhang (Shanghai Jiao Tong University); Siheng Chen (Shanghai Jiao Tong University)
3475 Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection Xin Li (East China Normal University)*; Botian Shi (Shanghai AI Lab); Yuenan HOU (Shanghai AI Lab); Xingjiao Wu (East China Normal University); Tianlong Ma (East China Normal University); Yikang Li (Shanghai AI Lab); Liang He (ECNU)
3484 Unfolded Deep Kernel Estimation for Blind Image Super-resolution Hongyi Zheng (The Hong Kong Polytechnic University); Hongwei Yong (The Hong Kong Polytechnic University); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China)*
3487 Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning Xingping Dong (Inception Institute of Artificial Intelligence)*; Jianbing Shen (Inception Institute of Artificial Intelligence); Ling Shao (Terminus Group)
3489 Continual Semantic Segmentation via Structure Preserving and Projected Feature Alignment Zihan Lin (University of Science and Technology of China); Zilei Wang (University of Science and Technology of China)*; Yixin Zhang (University of Science and Technology of China)
3498 SC-wLS: Towards Interpretable Feed-forward Camera Re-localization Xin Wu (Peking University)*; Hao Zhao (Intel Labs China); Shunkai Li (Peking University); Yingdian Cao (Peking University); Hongbin Zha (Peking University, China)
3500 Weakly-Supervised Stitching Network for Real-World Panoramic Image Generation Dae-Young Song (Chungnam National University); Geonsoo Lee (Chungnam National University); HeeKyung Lee (ETRI (Electronics and Telecommunications Research Institute)); Gi-Mun Um (ETRI (Electronics and Telecommunications Research Institute)); Donghyeon Cho (Chungnam National University)*
3503 FloatingFusion: Depth from ToF and Image-stabilized Stereo Cameras Andreas Meuleman (KAIST); Hakyeong Kim (KAIST); James Tompkin (Brown University); Min H. Kim (KAIST)*
3504 Dual-Evidential Learning for Weakly-supervised Temporal Action Localization Mengyuan Chen (Institute of Automation, Chinese Academy of Sciences)*; Junyu Gao (CASIA); Shicai Yang (Hikvision Research Institute); Changsheng Xu (CASIA)
3511 DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation Songhua Liu (National University of Singapore)*; Jingwen Ye (National University of Singapore); Sucheng Ren (South China University of Technology); Xinchao Wang (National University of Singapore)
3512 D2HNet: Joint Denoising and Deblurring with Hierarchical Network for Robust Night Image Restoration Yuzhi Zhao (City University of Hong Kong)*; Yongzhe Xu (SenseTime Group Limited); Qiong Yan (SenseTime Group Limited); DINGDONG YANG (University of Michigan); Xuehui Wang (Shanghai Jiao Tong University); Lai-Man Po (CITY UNIVERSITY OF HONG KONG)
3514 DELTAR: Depth Estimation from a Light-weight ToF Sensor and RGB Image Yijin Li (Zhejiang University); Yinda Zhang (Google); Xinyang Liu (Zhejiang University); Wenqi Dong (Zhejiang University); Han Zhou (Zhejiang University); Hujun Bao (Zhejiang University); Guofeng Zhang (Zhejiang University); Zhaopeng Cui (Zhejiang University)*
3515 ERA: Enhanced Rational Activations Martin Trimmel (Lund University)*; Mihai Zanfir (Google); Richard I Hartley (Google); Cristian Sminchisescu (Google)
3518 FrequencyLowCut pooling – Plug & Play against Catastrophic Overfitting Julia Grabinski (University of Siegen)*; Janis Keuper (Fraunhofer); Margret Keuper (University of Mannheim); Steffen Jung (MPII)
3520 Interclass Prototype Relation for Few-Shot Segmentation Atsuro Okazawa (SoftBank Corp.)*
3523 Multi-Faceted Distillation of Base-Novel Commonality for Few-shot Object Detection Shuang Wu (Harbin Institute of Technology, Shenzhen); Wenjie Pei (Harbin Institute of Technology, Shenzhen); Dianwen Mei (Harbin Institute of Technology, Shenzhen); Fanglin Chen (Harbin Institute of Technology, Shenzhen); Jiandong Tian (CAS); Guangming Lu (Harbin Institute of Technology, Shenzhen)*
3525 X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks Zhaowei Cai (Amazon)*; Gukyeong Kwon (Amazon); Avinash Ravichandran (Amazon); Erhan Bas (Amazon); Zhuowen Tu (UC San Diego); Rahul Bhotika (Amazon); Stefano Soatto (UCLA)
3535 Equivariance and Invariance Inductive Bias for Learning from Insufficient Data Tan Wang (Nanyang Technological University)*; Qianru Sun (Singapore Management University); Sugiri Pranata (Panasonic R&D Center Singapore); Karlekar Jayashree (Panasonic); Hanwang Zhang (Nanyang Technological University)
3539 Multimodal Conditional Image Synthesis with Product-of-Experts GANs Xun Huang (NVIDIA)*; Arun Mallya (NVIDIA); Ting-Chun Wang (NVIDIA); Ming-Yu Liu (NVIDIA)
3551 Balancing between Forgetting and Acquisition in Incremental Subpopulation Learning Mingfu Liang (Northwestern University)*; JIAHUAN ZHOU (Peking University); Wei Wei (Northwestern University); Ying Wu (Northwestern University)
3555 TensoRF: Tensorial Radiance Fields Anpei Chen (ShanghaiTech University)*; Zexiang Xu (Adobe Research); Andreas Geiger (University of Tuebingen); Jingyi Yu (Shanghai Tech University); Hao Su (UCSD)
3580 PointCLM: A Contrastive Learning-based Framework for Multi-instance Point Cloud Registration Mingzhi Yuan (Fudan University)*; Zhihao Li (Fudan); Qiuye Jin (Fudan University); Xinrong Chen (Fudan University); Manning Wang (Fudan University)
3581 Slim Scissors: Segmenting Thin Object from Synthetic Background Kunyang Han (Beijing Jiaotong University)*; Jun Hao Liew (ByteDance); Jiashi Feng (ByteDance); Huawei Tian (People’s Public Security University of China); Yao Zhao (Beijing Jiaotong University); Yunchao Wei (UTS)
3591 CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition Shreyank N Gowda (University of Edinburgh)*; Laura Sevilla-Lara (Facebook); Frank Keller (University of Edinburgh); Marcus Rohrbach (Facebook AI Research)
3593 Discovering Human-Object Interaction Concepts via Self-Compositional Learning Zhi Hou (The University of Sydney)*; Baosheng Yu (The University of Sydney); Dacheng Tao (The University of Sydney)
3598 Mixed-Precision Neural Network Quantization via Learned Layer-wise Importance Chen Tang (Tsinghua University)*; Kai Ouyang (Tsinghua University); Zhi Wang (Tsinghua University); Yifei Zhu (Shanghai Jiao Tong University); Wen Ji (Institute of Computing Technology, Chinese Academy of Sciences); Yaowei Wang (PengCheng Laboratory); Wenwu Zhu (Tsinghua University)
3604 TREND: Truncated Generalized Normal Density Estimation of Inception Embeddings for GAN Evaluation Junghyuk Lee (School of Integrated Technology, Yonsei University); Jong-Seok Lee (Yonsei University, Korea)*
3606 3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform Yining Zhao (Tsinghua University); Chao Wen (Bytedance); Zhou Xue (Bytedance); Yue Gao (Tsinghua University)*
3623 JoJoGAN: One Shot Face Stylization Min Jin Chong (University of Illinois at Urbana-Champaign)*; David Forsyth (University of Illinois at Urbana-Champaign)
3627 Convolutional Embedding Makes Hierarchical Vision Transformer Stronger Cong Wang (OPPO); Hongmin Xu (OPPO)*; Xiong Zhang (Neolix Autonomous Vehicle); Li Wang (North China University of Technology); Zhitong Zheng (OPPO); Haifeng Liu (OPPO)
3632 Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration Haotian Bai (The Chinese University of Hong Kong, Shenzhen); Ruimao Zhang (The Chinese University of Hong Kong, Shenzhen)*; Jiong WANG (The Chinese University of Hong Kong, Shenzhen); Xiang Wan (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen))
3641 Few-shot Class-incremental Learning for 3D Point Cloud Objects Townim Faisal Chowdhury (North South University); Ali Cheraghian (Australian National University (ANU)); Sameera Chandimal Ramasinghe (Australian National University); Sahar Ahmadi (University of Technology Sydney); Morteza Saberi (University of Technology, Sydney); Shafin Rahman (North South University)*
3643 Learning Graph Neural Networks for Image Style Transfer Yongcheng Jing (The University of Sydney); Yining Mao (Zhejiang University); Yiding Yang (Wormpex AI Research); Yibing Zhan (JD Explore Academy); Mingli Song (Zhejiang University); Xinchao Wang (National University of Singapore)*; Dacheng Tao (JD.com)
3644 JPerceiver: Joint Perception Network for Depth, Pose and Layout Estimation in Driving Scenes Haimei Zhao (The University of Sydney)*; Jing Zhang (The University of Sydney); Sen Zhang (The University of Sydney); Dacheng Tao (JD.com)
3645 Meta-Learning with Less Forgetting on Large-Scale Non-Stationary Task Distributions Zhenyi Wang (University at Buffalo)*; Li Shen (JD Explore Academy); Le Fang (University at Buffalo); Qiuling Suo (State University of New York at Buffalo); Donglin Zhan (Columbia University); Tiehang Duan (Facebook); Mingchen Gao (University at Buffalo, SUNY)
3655 Semi-supervised 3D Object Detection with Proficient Teachers Junbo Yin (Beijing Institute of Technology); Jin Fang (Baidu); Dingfu Zhou (Baidu); Wenguan Wang (Eidgenössische Technische Hochschule Zürich); Liangjun Zhang (Baidu); Cheng-Zhong Xu (University of Macau); Jianbing Shen (Inception Institute of Artificial Intelligence)*
3658 NeFSAC: Neurally Filtered Minimal Samples Luca Cavalli (ETH Zurich)*; Marc Pollefeys (ETH Zurich / Microsoft); Daniel Barath (ETH Zürich)
3660 Domain Generalization by Mutual-Information Regularization with Pre-trained Models Junbum Cha (Kakaobrain)*; Kyungjae Lee (Chung-Ang University); Sungrae Park (Upstage AI Research, Upstage AI); Sanghyuk Chun (NAVER AI Lab)
3661 AcroFOD: An Adaptive Method for Cross-domain Few-shot Object Detection Yipeng Gao (Sun Yat-sen University, China); Lingxiao YANG (Sun Yat-sen University); Yunmu Huang (Huawei Technologies Co., Ltd.); Song Xie (Huawei Technologies Co., Ltd.); Shiyong Li (AI Application Research Center, Huawei Technologies Co., Ltd); WEI-SHI ZHENG (Sun Yat-sen University, China)*
3665 Primitive-based Shape Abstraction via Nonparametric Bayesian Inference Yuwei Wu (National University of Singapore)*; Weixiao Liu (National University of Singapore); Sipu Ruan (National University of Singapore); Gregory S Chirikjian (National University of Singapore)
3670 Active label correction using robust parameter update and entropy propagation Kwang In Kim (UNIST)*
3671 E-Graph: Minimal Solution for Rigid Rotation with Extensibility Graphs Yanyan Li (tum)*; Federico Tombari (Google, TU Munich)
3672 Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation Nadine Behrmann (Bosch Center for Artificial Intelligence)*; S. Alireza Golestaneh (Google); Zico Kolter (Carnegie Mellon University); Jürgen Gall (University of Bonn); Mehdi Noroozi (Bosch GmbH)
3677 Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification Xulin Li (University of Science and Technology of China); Yan Lu (University of Sydney); Bin Liu (University of Science and Technology of China)*; Yating Liu (USTC); Guojun Yin (University of Science and Technology of China); Qi Chu (University of Science and Technology of China); Jinyang Huang (University Of Science And Technology Of China); Feng Zhu (University of Science and Technology of China); Rui Zhao (SenseTime Group Limited); Nenghai Yu (University of Science and Technology of China)
3681 A Closer Look at Invariances in Self-supervised Pre-training for 3D Vision Lanxiao Li (Karlsruher Institut fuer Technologie)*; Michael Heizmann (Karlsruher Institut fuer Technologie)
3685 VecGAN: Image-to-Image Translation with Interpretable Latent Directions Yusuf Dalva (Bilkent University); Said F Altındiş (Bilkent University); Aysegul Dundar (Bilkent University)*
3686 SNeS: Learning Probably Symmetric Neural Surfaces from Incomplete Data Eldar Insafutdinov (University of Oxford); Dylan Campbell (University of Oxford)*; Joao F Henriques (University of Oxford); Andrea Vedaldi (Oxford University)
3689 Three things everyone should know about Vision Transformers Hugo Touvron (Facebook AI Research)*; Matthieu Cord (Sorbonne University); Alaaeldin M El-Nouby (Facebook AI Research); Jakob Verbeek (Facebook); Herve Jegou (Facebook AI Research)
3690 DeiT III: Revenge of the ViT Hugo Touvron (Facebook AI Research)*; Matthieu Cord (Sorbonne University); Herve Jegou (Facebook AI Research)
3693 Any-resolution Training for High-resolution Image Synthesis Lucy Chai (MIT)*; Michaël Gharbi (Adobe Research); Eli Shechtman (Adobe Research, US); Phillip Isola (MIT); Richard Zhang (Adobe)
3703 HDR-Plenoxels: Self-Calibrating High Dynamic Range Radiance Fields Kim Jun-Seong (POSTECH)*; Kim Yu-Ji (POSTECH); Moon Ye-Bin (POSTECH); Tae-Hyun Oh (POSTECH)
3719 PartImageNet: A Large, High-Quality Dataset of Parts Ju He (Johns Hopkins University)*; Shuo Yang (University of Technology Sydney); Shaokang Yang (ByteDance); Adam Kortylewski (Max Planck Institute for Informatics); Xiaoding Yuan (Johns Hopkins University); Jie-Neng Chen (Johns Hopkins University); shuai liu (ByteDance Inc.); Cheng Yang (ByteDance Inc.); Qihang Yu (Johns Hopkins University); Alan Yuille (Johns Hopkins University)
3721 Abstracting Sketches through Simple Primitives Stephan Alaniz (University of Tübingen)*; Massimiliano Mancini (University of Tübingen); Anjan Dutta (University of Surrey); Diego Marcos (Wageningen University); Zeynep Akata (University of Tübingen)
3723 MTTrans: Cross-Domain Object Detection with Mean Teacher Transformer Jinze Yu (Beihang University); Jiaming Liu (Peking University); Xiaobao Wei (Beihang University); Haoyi Zhou (Beihang University); Yohei Nakata (Panasonic Corporation); Denis A Gudovskiy (Panasonic); Tomoyuki Okuno (Panasonic); Jianxin Li (Beihang University); Kurt Keutzer (UC Berkeley); Shanghang Zhang (University of California, Berkeley)*
3731 TAFIM: Targeted Adversarial Attacks against Facial Image Manipulations Shivangi Aneja (Technical University of Munich)*; Lev Markhasin (Sony Europe); Matthias Niessner (Technical University of Munich)
3737 NeuMan: Neural Human Radiance Field from a Single Video Wei Jiang (University of British Columbia)*; Kwang Moo Yi (University of British Columbia); Golnoosh Samei (UBC); Oncel Tuzel (Apple); Anurag Ranjan (Apple)
3747 Learning Implicit Templates for Point-Based Clothed Human Modeling Siyou Lin (Tsinghua University)*; Hongwen Zhang (Tsinghua University); Zerong Zheng (Tsinghua University); Ruizhi Shao (Tsinghua University); Yebin Liu (Tsinghua University)
3751 Event Neural Networks Matthew Dutson (University of Wisconsin-Madison)*; Yin Li (University of Wisconsin-Madison); Mohit Gupta (University of Wisconsin-Madison, USA)
3755 Learning to Censor by Noisy Sampling Ayush Chopra (MIT)*; Abhinav Java (Adobe, MDSR Labs); Abhishek Singh (MIT); Vivek Sharma (MIT); Ramesh Raskar (Massachusetts Institute of Technology)
3758 ConMatch: Semi-Supervised Learning with Confidence-Guided Consistency Regularization Jiwon Kim (Korea University)*; Youngjo Min (Korea University); Daehwan Kim (Samsung electro mechanics); Gyuseong Lee (Korea University); Junyoung Seo (Korea University); Kwangrok Ryoo (Korea University); Seungryong Kim (Korea University)
3760 Granularity-aware Adaptation for Image Retrieval over Multiple Tasks Jon Almazan (Naver Labs); Byungsoo Ko (NAVER/LINE Corp.); Geonmo Gu (NAVER corp); Diane Larlus (Naver Labs Europe); Yannis Kalantidis (NAVER LABS Europe)*
3769 EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers Junting Pan (The Chinese University of Hong Kong); Adrian Bulat (Samsung AI Center, Cambridge); Fuwen Tan (Samsung AI Center, Cambridge); Xiatian Zhu (University of Surrey); Lukasz Dudziak (Samsung AI Center Cambridge); Hongsheng Li (The Chinese University of Hong Kong); Georgios Tzimiropoulos (Queen Mary University of London); Brais Martinez (Samsung AI Center)*
3780 Multi-Domain Multi-Definition Landmark Localization for Small Datasets David Ferman (AI Foundation); Gaurav Bharaj (AI Foundation)*
3781 TAVA: Template-free Animatable Volumetric Actors Ruilong Li (UC Berkeley)*; Julian Tanke (University of Bonn); Minh P Vo (Facebook Reality Labs); Michael Zollhöfer (Facebook Reality Labs); Jürgen Gall (University of Bonn); Angjoo Kanazawa (University of California Berkeley); Christoph Lassner (Meta Reality Labs Research)
3792 Stereo Depth Estimation with Echoes Chenghao Zhang (National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China)*; Kun Tian (Institute of Automation, Chinese Academy of Sciences); Bolin Ni (Institute of Automation, Chinese Academy of Sciences); Gaofeng Meng (Chinese Academy of Sciences); Bin Fan (University of Science and Technology Beijing); Zhaoxiang Zhang (Chinese Academy of Sciences, China); Chunhong Pan (Institute of Automation, Chinese Academy of Sciences)
3794 EASNet: Searching Elastic and Accurate Network Architecture for Stereo Matching Qiang Wang (Harbin Institute of Technology (Shenzhen))*; Shaohuai Shi (The Hong Kong University of Science and Technology); Kaiyong Zhao (Hong Kong Baptist University); Xiaowen Chu (Hong Kong University of Science and Technology)
3798 DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection Abhinav Kumar (Michigan State University)*; Garrick Brazil (Facebook); Enrique Corona (Ford Motor Company); Armin Parchami (Ford Motor Company); Xiaoming Liu (Michigan State University)
3809 RBP-Pose: Residual Bounding Box Projection for Category-Level Pose Estimation Ruida Zhang (Tsinghua University)*; Yan Di (Technical University of Munich); Zhiqiang Lou (Tsinghua University); Fabian Manhardt (Google); Federico Tombari (Google, TU Munich); Xiangyang Ji (Tsinghua University)
3820 Levenshtein OCR Cheng Da (Alibaba DAMO Academy)*; Wang Peng (Alibaba DAMO Academy); Cong Yao (Alibaba DAMO Academy)
3821 Multi-Granularity Prediction for Scene Text Recognition Wang Peng (Alibaba DAMO Academy); Cheng Da (Alibaba DAMO Academy)*; Cong Yao (Alibaba DAMO Academy)
3827 MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition Chuanguang Yang (Institute of Computing Technology, Chinese Academy of Sciences)*; Zhulin An (Institute of Computing Technology, Chinese Academy of Sciences); Helong Zhou (Beijing Horizon Information Technology Co.,Ltd); linhang cai (Institute of Computing Technology, Chinese Academy of Sciences); Xiang Zhi (Institute of Computing Technology, Chinese Academy of Sciences); Jiwen Wu (Institute of Computing Technology, Chinese Academy of Sciences); yongjun xu (Institute of Computing Technology, Chinese Academy of Sciences); Qian Zhang (Horizon Robotics)
3834 Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input Qingpei Guo (Ant Financial Services Group)*; Kaisheng Yao (Amazon); Wei Chu (Ant Group)
3837 Efficient Video Transformers with Spatial-temporal Token Selection Junke Wang (Fudan University)*; Xitong Yang (University of Maryland); Hengduo Li (University of Maryland, College Park); Li Liu (BirenTech Research); Zuxuan Wu (UMD); Yu-Gang Jiang (Fudan University)
3844 DAS: Densely-Anchored Sampling for Deep Metric Learning Lizhao Liu (South China University of Technology); Shangxin Huang (South China University of Technology); Zhuangwei Zhuang (South China University of Technology); Ran Yang (South China University of Technology); Mingkui Tan (South China University of Technology)*; Yaowei Wang (PengCheng Laboratory)
3864 ReCoNet: Recurrent Correction Network for Fast and Efficient Multi-modality Image Fusion Zhanbo Huang (Dalian University of Technology); Jinyuan Liu (Dalian University of Technology); Xin Fan (Dalian University of Technology)*; Risheng Liu (Dalian University of Technology); Wei Zhong (Dalian University of Technology); Zhongxuan Luo (DALIAN UNIVERSITY OF TECHNOLOGY)
3867 RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN Huy Phan (Rutgers University)*; Cong Shi (Rutgers University); Yi Xie (Rutgers University); Tianfang Zhang (Rutgers University, New Brunswick); Zhuohang Li (University of Tennessee, Knoxville); Tianming Zhao (Temple University); Jian Liu (The University of Tennessee, Knoxville); Yan Wang (Temple University); Yingying Chen (Rutgers University); bo yuan (rutgers university)
3870 Point Cloud Compression with Sibling Context and Surface Priors Zhili CHEN (HKUST); Zian Qian (HKUST); Sukai Wang (HKUST); Qifeng Chen (HKUST)*
3874 Self-Feature Distillation with Uncertainty Modeling for Degraded Image Recognition zhou yang (Xidian University); Weisheng Dong (Xidian University)*; Xin Li (West Virginia University); Jinjian Wu (Xidian University); Leida Li (Xidian University); Guangming Shi (Xidian University)
3885 Point Cloud Compression using Range Image-based Entropy Model for Autonomous Driving Sukai Wang (HKUST)*; Ming Liu (HKUST)
3904 CANF-VC: Conditional Augmented Normalizing Flows for Video Compression Yung-Han Ho (NCTU); Chih-Peng Chang (National Chiao Tung University); Peng-Yu Chen (NYCU); Alessandro Gnutti (University of Brescia); Wen-Hsiao Peng (National Yang Ming Chiao Tung University)*
3912 Bi-level Feature Alignment for Versatile Image Translation and Manipulation Fangneng Zhan (Max Planck Institute for Informatics); Yingchen Yu (Nanyang Technological University); Rongliang WU (Nanyang Technological University); Jiahui Zhang (Nanyang Technological University); Kaiwen Cui (Nanyang Technological University); Aoran Xiao (Nanyang Technological University); Shijian Lu (Nanyang Technological University)*; Chunyan Miao (NTU)
3918 Lane Detection Transformer based on Multi-frame Horizontal and Vertical Attention and Visual Transformer Module Han Zhang (Beihang University)*; Yunchao Gu (BUAA); Xinliang Wang (BUAA); Junjun Pan (Beihang University); Minghui Wang (Beihang University)
3921 Label-Guided Auxiliary Training Improves 3D Object Detector yaomin huang (East China Normal University); Xinmei Liu (East China Normal University)*; Yichen Zhu (Midea Group); Zhiyuan Xu (Midea Group); Chaomin Shen (East China Normal University); Zhengping Che (Midea Group); Guixu Zhang (East China Normal University); Yaxin Peng (Department of Mathematics, School of Science, Shanghai University); Feifei Feng (Midea Group); Jian Tang (Midea Group)
3932 FedX: Unsupervised Federated Learning with Cross Knowledge Distillation Sungwon Han (KAIST)*; Sungwon Park (KAIST); Fangzhao Wu (MSRA); Sundong Kim (Institute for Basic Science); Chuhan Wu (Tsinghua University); Xing Xie (Microsoft Research Asia); Meeyoung Cha (Institute for Basic Science)
3936 ProposalContrast: Unsupervised Pre-training for LiDAR-based 3D Object Detection Junbo Yin (Beijing Institute of Technology); Wenguan Wang (Eidgenössische Technische Hochschule Zürich); Dingfu Zhou (Baidu); Jin Fang (Baidu); Liangjun Zhang (Baidu); Cheng-Zhong Xu (University of Macau); Jianbing Shen (Inception Institute of Artificial Intelligence)*
3948 Audio-Driven Stylized Gesture Generation with Flow-Based Model Sheng Ye (Tsinghua University)*; Yu-Hui Wen (Tsinghua University); Yanan Sun (Tsinghua University); Ying He (Nanyang Technological University); Ziyang Zhang (HUAWEI TECHNOLOGIES CO.LTD); Yaoyuan Wang (Huawei Technologies Co., Ltd.); Weihua He (Tsinghua University); Yong-Jin Liu (Tsinghua University)
3958 Unsupervised Domain Adaptation for One-Stage Object Detector using Offsets to Bounding Box Jayeon Yoo (Seoul National University); Inseop Chung (Seoul National University); Nojun Kwak (Seoul National University)*
3964 Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework Botao Ye (Institute of Computing Technology, Chinese Academy of Sciences)*; Hong Chang (Chinese Academy of Sciences); Bingpeng MA (University of Chinese Academy of Sciences); Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences); Xilin Chen (Institute of Computing Technology, Chinese Academy of Sciences)
3965 PreTraM: Self-Supervised Pre-training via Connecting Trajectory and Map Chenfeng Xu (UC Berkeley)*; Tian Li (University of California, San Diego); Chen Tang (UC Berkeley); Lingfeng Sun (UC Berkeley); Kurt Keutzer (EECS, UC Berkeley); Masayoshi TOMIZUKA (MSC Lab); Alireza Fathi (Google); Wei Zhan (University of California, Berkeley)
3966 DeepPS2: Revisiting Photometric Stereo using Two Differently Illuminated Images Ashish Tiwari (Indian Institute of Technology Gandhinagar)*; Shanmuganathan Raman (Indian Institute of Technology (IIT) Gandhinagar)
3977 Learn From All: Erasing Attention Consistency for Noisy Label Facial Expression Recognition Yuhang Zhang (Beijing University of Posts and Telecommunications); Chengrui Wang (Beijing University of Posts and Telecommunications); Xu Ling (Beijing University of Posts and Telecommunications); Weihong Deng (Beijing University of Posts and Telecommunications)*
3984 Novel Class Discovery without Forgetting Joseph K J (Indian Institute of Technology, Hyderabad)*; Sujoy Paul (Google Research); Gaurav Aggarwal (Google); Soma Biswas (Indian Institute of Science, Bangalore); Piyush Rai (IIT Kanpur); Kai Han (The University of Hong Kong); Vineeth N Balasubramanian (Indian Institute of Technology, Hyderabad)
3985 Self-Constrained Inference Optimization on Structural Groups for Human Pose Estimation ZheHan Kan (Southern University of Science and Technology); Shuoshuo Chen (Southern University of Science and Technology); Zeng Li (Southern University of Science and Technology); Zhihai He (Southern University of Science and Technology)*
3989 Predicting is not Understanding: Recognizing and Addressing Underspecification in Machine Learning Damien Teney (University of Adelaide)*; Maxime Peyrard (EPFL); Ehsan M Abbasnejad (The University of Adelaide)
3991 A Non-isotropic Probabilistic Take on Proxy-based Deep Metric Learning Michael Kirchhof (University of Tübingen)*; Karsten Roth (University of Tuebingen); Zeynep Akata (University of Tübingen); Enkelejda Kasneci (University of Tuebingen)
3998 Relative Pose from SIFT Features Daniel Barath (ETH Zürich)*; Zuzana Kukelova (Czech Technical University in Prague)
3999 Monocular 3D Object Reconstruction with GAN Inversion Junzhe Zhang (Nanyang Technological University)*; Daxuan Ren (Nanyang Technological University); Zhongang Cai (SenseTime International Pte Ltd); Chai Kiat Yeo (Nanyang Technological University); Bo Dai (Shanghai AI Lab); Chen Change Loy (Nanyang Technological University)
4001 PromptDet: Towards Open-vocabulary Detection using Uncurated Images Chengjian Feng (Meituan inc.)*; Yujie Zhong (University of Oxford); Zequn Jie (Meituan inc.); Xiangxiang Chu (Meituan); Haibing Ren (Meituan Inc.); Xiaolin Wei (Meituan); Weidi Xie (Shanghai Jiao Tong University); Lin Ma (Meituan)
4005 Densely Constrained Depth Estimator for Monocular 3D Object Detection Yingyan Li (CASIA)*; Yuntao Chen (TuSimple); Jiawei He (Institute of Automation, Chinese Academy of Sciences); Zhaoxiang Zhang (Chinese Academy of Sciences, China)
4016 Content Adaptive Latents and Decoder for Neural Image Compression Guanbo Pan (Beihang University)*; Guo Lu (Beijing Institute of Technology); Zhihao Hu (Beihang University); Dong Xu (The University of Hong Kong)
4018 High-Fidelity Image Inpainting with GAN Inversion Yongsheng YU (University of Chinese Academy of Sciences); Libo Zhang (Institute of Software Chinese Academy of Sciences)*; Heng Fan (University of North Texas); Tiejian Luo (University of Chinese Academy of Sciences)
4019 Spatially Invariant Unsupervised 3D Object-Centric Learning and Scene Decomposition Tianyu Wang (The Australian National University); Miaomiao Liu (The Australian National University)*; Kee Siong Ng (The Australian National University)
4020 W2N: Switching From Weak Supervision to Noisy Supervision for Object Detection Zitong Huang (Harbin Institute of Technology); Yiping Bao (Megvii(Face++) Inc); Bowen Dong (Harbin Institute of Technology); erjin zhou (megvii); Wangmeng Zuo (Harbin Institute of Technology, China)*
4021 UnrealEgo: A New Dataset for Robust Egocentric 3D Human Motion Capture Hiroyasu Akada (Max Planck Institute for Informatics, Keio University); Jian Wang (Max Planck Institute for Informatics); Soshi Shimada (MPI for Informatics); Masaki Takahashi (Keio University); Christian Theobalt (MPI Informatik); Vladislav Golyanik (MPI for Informatics)*
4022 MotionCLIP: Exposing Human Motion Generation to CLIP Space Guy Tevet (Tel Aviv University)*; Brian Gordon (Tel Aviv University); Amir Hertz (Tel Aviv University); Amit H Bermano (Tel-Aviv University); Danny Cohen-Or (Tel Aviv University)
4023 Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution Jie Liang (The Hong Kong Polytechnic University)*; Hui Zeng (OPPO); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China)
4024 Unidirectional Video Denoising by Mimicking Backward Recurrent Modules with Look-ahead Forward Ones Junyi Li (Harbin Institute of Technology); Xiaohe Wu (Harbin Institute of technology); zhenxing niu (Alibaba Group-Machine Intelligence Technology); Wangmeng Zuo (Harbin Institute of Technology, China)*
4029 Map-free Visual Relocalization: Metric Pose Relative to a Single Image Eduardo Arnold (University of Warwick); Jamie M Wynn (Niantic); Sara Vicente (Niantic); Guillermo Garcia-Hernando (Niantic); Aron Monszpart (Niantic); Victor A Prisacariu (Niantic Labs); Daniyar Turmukhambetov (Niantic); Eric Brachmann (Niantic)*
4032 DeltaGAN: Towards Diverse Few-shot Image Generation with Sample-Specific Delta Yan Hong (Shanghai Jiao Tong University); Li Niu (Shanghai Jiao Tong University)*; Jianfu Zhang (Shanghai Jiao Tong University); Liqing Zhang (Shanghai Jiao Tong University)
4035 Sample-Adaptive Augmentation for Long-Tailed Image Classification Yan Hong (Shanghai Jiao Tong University); Jianfu Zhang (Shanghai Jiao Tong University)*; Zhongyi Sun (Tencent); Ke Yan (Tencent)
4037 TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers Jihao Liu (Sensetime)*; Boxiao Liu (Institute of Computing Technology, Chinese Academy of Sciences); Hang Zhou (The Chinese University of Hong Kong); Hongsheng Li (The Chinese University of Hong Kong); Yu Liu (SenseTime Group LTD)
4041 UFO: Unified Feature Optimization Teng Xi (Baidu Inc.)*; Yifan Sun (Baidu Research); Deli Yu (Baidu Inc.); Bi Li (Baidu Inc.); Nan Peng (Baidu Inc.); gang zhang (Baidu Inc.); Xinyu Zhang (Baidu Inc.); Zhigang Wang (shanghai AI lab); jinwen chen (Baidu Inc.); Jian Wang (Baidu Inc.); liu lufei (Baidu Inc); Haocheng Feng (Baidu Inc.); Junyu Han (Baidu Inc.); jingtuo liu (baidu); Errui Ding (Baidu Inc.); Jingdong Wang (Baidu)
4043 Master of All: Simultaneous Generalization of Urban-Scene Segmentation to All Adverse Weather Conditions Nikhil Reddy (IIT Delhi)*; Abhinav Singhal (Indian Institute of Technology, Delhi); Abhishek Kumar (IIT Delhi); Mahsa Baktashmotlagh (University of Queensland); Chetan Arora (Indian Institute of Technology Delhi)
4047 PalQuant: Accelerating High-precision Networks on Low-precision Accelerators Qinghao Hu (Institute of Automation, Chinese Academy of Sciences)*; gang li (shanghai jiao tong university); Qiman Wu (Baidu Inc.); Jian Cheng (Chinese Academy of Sciences, China)
4057 Self-Supervised Learning for Real-World Super-Resolution from Dual Zoomed Observations Zhilu Zhang (Harbin Institute of Technology); Ruohao Wang (Harbin Institute of Technology); Hongzhi Zhang (Harbin Institute of Technology); Yunjin Chen (ULSee Inc.); Wangmeng Zuo (Harbin Institute of Technology, China)*
4059 UniMiSS: Universal Medical Self-Supervised Learning via Breaking Dimensionality Barrier Yutong Xie (University of Adelaide)*; Jianpeng Zhang (Northwestern Polytechnical University); Yong Xia (Northwestern Polytechnical University, Research & Development Institute of Northwestern Polytechnical University in Shenzhen); Qi Wu (University of Adelaide)
4073 Self-distilled Feature Aggregation for Self-supervised Monocular Depth Estimation Zhengming Zhou (NLPR-IA-CAS); Qiulei Dong (NLPR-IA-CAS)*
4074 Negative Samples are at Large: Leveraging Hard-distance Elastic Loss for Re-identification Hyungtae Lee (DEVCOM Army Research Laboratory)*; Sungmin Eum (Booz Allen Hamilton Inc.); Heesung Kwon (U.S. Army Research Laboratory)
4076 Global-local Motion Transformer for Unsupervised Skeleton-based Action Learning Boeun Kim (Seoul National University)*; Hyung Jin Chang (University of Birmingham); Jungho Kim (KETI); Jin Young Choi (Seoul National University)
4080 Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoiréing Xin Yu (The University of Hong Kong)*; Peng Dai (The University of Hong Kong); Wenbo Li (The Chinese University of Hong Kong); Lan Ma (TCL Corporate Research); Jiajun Shen (TCL Research); Jia Li (Sun Yat-Sen University); Xiaojuan Qi (The University of Hong Kong)
4084 Instance Contour Adjustment via Structure-driven CNN Shuchen Weng (Peking University)*; Yi Wei (Samsung Research America Inc.); Ming-Ching Chang (University at Albany – SUNY); Boxin Shi (Peking University)
4085 ERDN: Equivalent Receptive Field Deformable Network for Video Deblurring Bangrui Jiang (Tsinghua University)*; zhihuai xie (Tencent); Zhen Xia (Tencent); Songnan Li (Tencent); Shan Liu (Tencent America)
4090 Localizing Visual Sounds the Easy Way Shentong Mo (Carnegie Mellon University); Pedro Morgado (CMU)*
4105 Polarimetric Pose Prediction Daoyi Gao (Technical University of Munich)*; Yitong Li (Technical University of Munich); Patrick Ruhkamp (Technical University of Munich); Iuliia Skobleva (Technical University of Munich); Magdalena Wysocki (Technical University of Munich); HyunJun Jung (Technical University of Munich); Pengyuan Wang (TUM); Arturo Guridi (Technical University of Munich); Benjamin Busam (Technical University of Munich)
4115 DFNet: Enhance Absolute Pose Regression with Direct Feature Matching Shuai Chen (University of Oxford)*; Xinghui Li (University of Oxford); Zirui Wang (University of Oxford); Victor Adrian Prisacariu (University of Oxford)
4117 A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge Dustin Schwenk (Allen Institute for Artificial Intelligence); Apoorv Khandelwal (Allen Institute for AI); Christopher A Clark (Allen Institute for AI); Kenneth Marino (CMU); Roozbeh Mottaghi (Allen Institute for AI)*
4119 Sound Localization by Self-Supervised Time Delay Estimation Ziyang Chen (University of Michigan)*; David Fouhey (University of Michigan); Andrew Owens (U Michigan)
4120 AdaFocus V3: On Unified Spatial-temporal Dynamic Video Recognition Yulin Wang (Tsinghua University); Yang Yue (Tsinghua University); Xinhong Xu (Tsinghua University); Ali Hassani (University of Oregon); Victor Kulikov (Picsart); Nikita Orlov (PicsArt); Shiji Song (Department of Automation, Tsinghua University); Humphrey Shi (U of Oregon | UIUC | PAIR); Gao Huang (Tsinghua)*
4123 Discrete-Constrained Regression for Local Counting Models Haipeng Xiong (National University of Singapore)*; Angela Yao (National University of Singapore)
4124 Towards Regression-Free Neural Networks for Diverse Compute Platforms Rahul Duggal (Georgia Tech); Hao Zhou (Amazon); Shuo Yang (Amazon); Jun Fang (Amazon)*; Yuanjun Xiong (Amazon); Wei Xia (Amazon)
4130 Selection and Cross Similarity for Event-Image Deep Stereo Hoonhee Cho (KAIST)*; Kuk-Jin Yoon (KAIST)
4136 Long Movie Clip Classification with State-Space Video Models Md Mohaiminul Islam (UNC Chapel Hill)*; Gedas Bertasius (UNC Chapel Hill)
4145 Relationship Spatialization for Depth Estimation xiaoyu xu (University of Waterloo)*; Jiayan Qiu (University of Waterloo); Xinchao Wang (National University of Singapore); Zhou Wang (University of Waterloo)
4150 Breadcrumbs: Adversarial Class-Balanced Sampling for Long-tailed Recognition Bo Liu (Wormpex AI Research)*; Haoxiang Li (Wormpex AI Research); Hao Kang (Wormpex AI Research); Gang Hua (Wormpex AI Research); Nuno Vasconcelos (UCSD, USA)
4152 Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models Chenfeng Xu (UC Berkeley)*; Shijia Yang (UC Berkeley); Tomer Galanti (Massachusetts Institute of Technology); Bichen Wu (Facebook Research); Xiangyu Yue (University of California, Berkeley); Bohan Zhai (UC Berkeley); Wei Zhan (University of California, Berkeley); Kurt Keutzer (EECS, UC Berkeley); Peter Vajda (Facebook); Masayoshi Tomizuka (University of California, Berkeley)
4175 Visual Prompt Tuning Menglin Jia (Cornell University)*; Luming Tang (Cornell University); Bor-Chun Chen (Facebook AI); Claire T Cardie (Cornell University); Serge Belongie (University of Copenhagen); Bharath Hariharan (Cornell University); Ser-Nam Lim (Meta AI)
4181 Multi-scale and Cross-scale Contrastive Learning for Semantic Segmentation THEODOROS PISSAS (University College London)*; Claudio S Ravasio (King’s College London (KCL)); Lyndon DaCruz (Moorfields Eye Hospital / University College London); Christos Bergeles (Kings College London)
4185 Rethinking Generic Camera Models for Deep Single Image Camera Calibration to Recover Rotation and Fisheye Distortion Nobuhiko Wakai (Panasonic Corporation)*; Satoshi Sato (Panasonic Corporation); Yasunori Ishii (Panasonic Holdings); Takayoshi Yamashita (Chubu University)
4188 Neural-Sim: Learning to Generate Training Data with NeRF Yunhao Ge (University of Southern California)*; Harkirat Behl (University of Oxford); Jiashu Xu (USC); Suriya Gunasekar (Microsoft Research); Neel Joshi (MICROSOFT RESEARCH); Yale Song (FAIR); Xin Wang (Microsoft Research); Laurent Itti (University of Southern California); Vibhav Vineet (Microsoft Research)
4195 Word-Level Fine-Grained Story Visualization Bowen Li (University of Oxford)*
4206 Chairs Can be Stood on: Overcoming Object Bias in Human-Object Interaction Detection Guangzhi Wang (National University of Singapore)*; Yangyang Guo (National University of Singapore); Yongkang Wong (National University of Singapore); Mohan Kankanhalli (National University of Singapore)
4208 GOCA: Guided Online Cluster Assignment for Self Supervised Video Representation Learning HUSEYIN COSKUN (Technical University of Munich)*; Alireza Zareian (Snap Inc.); Joshua L Moore (Snapchat); Federico Tombari (Google, TU Munich); Chen Wang (Snap Inc.)
4217 Learning Audio-Video Modalities from Image Captions Arsha Nagrani (Google)*; Paul Hongsuck Seo (Google); Bryan Seybold (Google); Anja Hauth (Google AI); Santiago Manen (Google); Chen Sun (Brown University); Cordelia Schmid (Google)
4220 Inverted Pyramid Multi-task Transformer for Dense Scene Understanding Hanrong Ye (The Hong Kong University of Science and Technology)*; Dan Xu (The Hong Kong University of Science and Technology)
4222 Image Inpainting with Cascaded Modulation GAN and Object-Aware Training Haitian Zheng (University of Rochester)*; Zhe Lin (Adobe Research); Jingwan Lu (Adobe Research); Scott Cohen (Adobe Research); Eli Shechtman (Adobe Research, US); Connelly Barnes (Adobe); Jianming Zhang (Adobe Research); Ning Xu (Adobe Research); Sohrab Amirghodsi (Adobe Research); Jiebo Luo (U. Rochester)
4231 Planes vs. Chairs: Category-guided 3D shape learning without any 3D cues Zixuan Huang (Georgia Institute of Technology)*; Stefan Stojanov (Georgia Institute of Technology); Anh Thai (Georgia Institute of Technology); Varun Jampani (Google); James Rehg (Georgia Institute of Technology)
4237 ART-SS: An Adaptive Rejection Technique for Semi-Supervised restoration for adverse weather-affected images Rajeev Yasarla (AIBEE)*; Carey E Priebe (Johns Hopkins University); Vishal Patel (Johns Hopkins University)
4239 Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction Maosen Li (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University)*; Siheng Chen (Shanghai Jiao Tong University); Zijing Zhang (Zhejiang University); Lingxi Xie (Huawei Inc.); Qi Tian (Huawei Cloud & AI); Ya Zhang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University)
4241 MHR-Net: Multiple-Hypothesis Reconstruction of Non-Rigid Shapes from 2D Views Haitian Zeng (University of Technology Sydney)*; Xin Yu (University of Technology Sydney); Jiaxu Miao (Zhejiang University); Yi Yang (Zhejiang University)
4243 Unifying Event Detection and Captioning as Sequence Generation via Pre-Training Qi Zhang (Renmin University of China)*; Yuqing Song (Renmin University of China); Qin Jin (Renmin University of China)
4247 Depth Map Decomposition for Monocular Depth Estimation Jinyoung Jun (Korea University)*; Jae-Han Lee (Gauss Labs Inc.); Chul Lee (Dongguk University); Chang-Su Kim (Korea University)
4249 Human-centric Image Cropping with Partition-aware and Content-preserving Features Bo Zhang (Shanghai Jiao Tong University)*; Li Niu (Shanghai Jiao Tong University); Xing Zhao (Shanghai Jiao Tong University); Liqing Zhang (Shanghai Jiao Tong University)
4252 Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking Boyu Chen (The University of Sydney); Peixia Li (The University of Sydney)*; Lei Bai (Shanghai AI Laboratory); Lei Qiao (SenseTime Group Limited); Qiuhong Shen (Harbin Institute of Technology (Shenzhen)); Bo Li (SenseTime Group Limited); Weihao Gan (SenseTime Group Limited); Wei Wu (SenseTime Group Limited); Wanli Ouyang (The University of Sydney)
4255 StyleFace: Towards Identity-Disentangled Face Generation on Megapixels Yuchen Luo (Shanghai Jiao Tong University)*; Junwei Zhu (Tencent); Keke He (Tencent); Wenqing Chu (Tencent); Ying Tai (Tencent YouTu); Junchi Yan (Shanghai Jiao Tong University); Chengjie Wang (Tencent; Shanghai Jiao Tong University)
4260 Fusion from Decomposition: A Self-Supervised Decomposition Approach for Image Fusion Pengwei Liang (Harbin Institute of Technology)*; Junjun Jiang (Harbin Institute of Technology); Xianming Liu (Harbin Institute of Technology); Jiayi Ma (Wuhan University)
4261 Learning Degradation Representations for Image Deblurring dasong Li (Chinese University of Hong Kong)*; Yi Zhang (CUHK); Ka Chun Cheung (Nvidia); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Hongwei Qin (Sensetime); Hongsheng Li (The Chinese University of Hong Kong)
4269 Aware of the History: Trajectory Forecasting with the Local Behavior Data Yiqi Zhong (University of Southern California)*; Zhenyang Ni (Shanghai Jiao Tong University); Siheng Chen (Shanghai Jiao Tong University); Ulrich Neumann (USC)
4270 FAR: Fourier Aerial Video Recognition Divya Kothandaraman (University of Maryland College Park)*; Tianrui Guan (University of Maryland, College Park); Xijun Wang (University of Maryland, College Park); Shuowen Hu (US Army Research Laboratory); Ming C Lin (UMD-CP & UNC-CH); Dinesh Manocha (University of Maryland at College Park)
4271 X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation Yinan He (Beijing University of Posts and Telecommunications)*; Gengshi Huang (School of Electronics and Information Technology, Sun Yat-sen University); Siyu Chen (Carnegie Mellon University); Jianing Teng (sensetime); Kun Wang (SenseTime Group Limited); Zhenfei Yin (Sensetime); Lu Sheng (Beihang University); Ziwei Liu (Nanyang Technological University); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Jing Shao (Sensetime)
4273 Disentangled Differentiable Network Pruning Shangqian Gao (University of Pittsburgh)*; Feihu Huang (University of Pittsburgh); Yanfu Zhang (University of Pittsburgh); Heng Huang (University of Pittsburgh)
4275 Video Extrapolation in Space and Time Yunzhi Zhang (Stanford University)*; Jiajun Wu (Stanford University)
4277 IDa-Det: An Information Discrepancy-aware Distillation for 1-bit Detectors Sheng Xu (Beihang University)*; Yanjing Li (Beihang University); Bohan Zeng (Beihang University); Teli Ma (Shanghai Artificial Intelligence Laboratory); Baochang Zhang (Beihang University); Xianbin Cao (Beihang University, China); Peng Gao (Chinese University of Hong Kong); Jinhu Lu (Beihang University, Beijing, China)
4278 Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation chuang lin (Monash University)*; Yi Jiang (Bytedance); Jianfei Cai (Monash University); Lizhen Qu (Monash University); Reza Haffari (Monash University, Australia); Zehuan Yuan (Bytedance.Inc)
4282 DnA: Improving Few-shot Transfer Learning with Low-Rank Decomposition and Alignment Ziyu Jiang (Texas A&M University)*; Tianlong Chen (University of Texas at Austin); Xuxi Chen (University of Texas at Austin); Yu Cheng (Microsoft Research); Luowei Zhou (Microsoft); Lu Yuan (Microsoft); Ahmed Awadallah (Microsoft); Zhangyang Wang (University of Texas at Austin)
4284 Translating a Visual LEGO Manual to a Machine-Executable Plan Ruocheng Wang (Stanford University)*; Yunzhi Zhang (Stanford University); Jiayuan Mao (MIT); Chin-Yi Cheng (Google Research); Jiajun Wu (Stanford University)
4286 Cornerformer: Purifying Instances for Corner-based Detectors Haoran Wei (University of Chinese Academy of Sciences)*; Xin Chen (Huawei Inc.); Lingxi Xie (Huawei Inc.); Qi Tian (Huawei Cloud & AI)
4287 Contributions of Shape, Texture, and Color in Visual Recognition Yunhao Ge (University of Southern California)*; Yao Xiao (University of Southern California); Zhi Xu (University of Southern California); Xingrui Wang (University of Southern California); Laurent Itti (University of Southern California)
4288 Monitored Distillation for Positive Congruent Depth Completion Tian Yu Liu (UCLA); Parth Agrawal (UCLA); Allison Y Chen (University of California, Los Angeles); Byung-Woo Hong (Chung-Ang University); Alex Wong (Yale University)*
4292 Towards Unbiased Label Distribution Learning for Facial Pose Estimation Using Anisotropic Spherical Gaussian Zhiwen Cao (Purdue University); Dongfang Liu (Rochester Institute of Technology)*; Qifan Wang (Meta AI); Yingjie Victor Chen (Purdue University)
4293 AirDet: Few-Shot Detection without Fine-tuning for Autonomous Exploration Bowen Li (Tongji University)*; Chen Wang (Carnegie Mellon University); Pranay Reddy Anthireddy (Indian Institute of Information Technology, Design and Manufacturing, Jabalpur); Seungchan Kim (Carnegie Mellon University); Sebastian Scherer (Carnegie Mellon University)
4295 Learning to Weight Samples for Dynamic Early-exiting Networks Yizeng Han (Tsinghua University); Yifan Pu (Tsinghua University); Zihang Lai (CMU); Chaofei Wang (Tsinghua University); Shiji Song (Department of Automation, Tsinghua University); cao junfeng (CMRI); Wenhui Huang (CMRI); Chao Deng (China Mobile Research Institute); Gao Huang (Tsinghua)*
4300 Constrained Mean Shift Using Distant Yet Related Neighbors for Representation Learning K L Navaneet (University of California, Davis); Soroush Abbasi Koohpayegani (University of Maryland Baltimore County)*; Ajinkya B Tejankar (UMBC); Kossar Pourahmadi Meibodi (University of Maryland, Baltimore County); Akshayvarun Subramanya (UMBC); Hamed Pirsiavash (University of California Davis)
4303 SLIP: Self-supervision meets Language-Image Pre-training Norman Mu (University of California, Berkeley)*; Alexander Kirillov (Facebook AI Research); David Wagner (UC Berkeley); Saining Xie (Facebook AI Research)
4304 Learning Visual Styles from Audio-Visual Associations Tingle Li (Tsinghua University)*; Yichen Liu (Tsinghua University); Andrew Owens (U Michigan); Hang Zhao (Tsinghua University)
4305 Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting Ying Chen (Hikvision Research Institute); Liang Qiao (Zhejiang University & Hikvision Research Institute)*; Zhanzhan Cheng (Zhejiang University & Hikvision Research Institute); Shiliang Pu (Hikvision Research Institute); Yi Niu (Hikvision Research Institute); Xi Li (Zhejiang University)
4310 Prompting Visual-Language Models for Efficient Video Understanding Chen Ju (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University); Tengda Han (University of Oxford); Kunhao Zheng (Shanghai Jiaotong University); Ya Zhang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University); Weidi Xie (Shanghai Jiao Tong University)*
4318 One-Trimap Video Matting Hongje Seong (Yonsei University)*; Seoung Wug Oh (Adobe Research); Brian Price (Adobe); Euntai Kim (Yonsei University); Joon-Young Lee (Adobe Research)
4323 Contrastive Learning for Diverse Disentangled Foreground Generation Yuheng Li (UW Madison)*; Yijun Li (Adobe Research); Jingwan Lu (Adobe Research); Eli Shechtman (Adobe Research, US); Yong Jae Lee (University of Wisconsin-Madison); Krishna Kumar Singh (Adobe Research)
4326 Resolution-free Point Cloud Sampling Network with Data Distillation Tianxin Huang (Zhejiang University)*; Jiangning Zhang (Zhejiang University); Jun Chen (Zhejiang University); Yuang Liu (Zhejiang University); Yong Liu (Zhejiang University)
4327 BIPS: Bi-modal Indoor Panorama Synthesis via Residual Depth-aided Adversarial Learning Changgyoon Oh (KAIST)*; Wonjune Cho (NAVER LABS); Yujeong Chae (KAIST); Daehee Park (KAIST); Lin Wang (HKUST); Kuk-Jin Yoon (KAIST)
4330 Augmentation of rPPG Benchmark Datasets: Learning to Remove and Embed rPPG Signals via Double Cycle Consistent Learning from Unpaired Facial Videos WEI-HAO Chung (National Tsing Hua University)*; CHENG-JU HSIEH (National Tsing Hua University); Chiou-Ting Hsu (National Tsing Hua University)
4331 Fabric Material Recovery from Video Using Multi-Scale Geometric Auto-Encoder Junbang Liang (University of Maryland, College Park)*; Ming C Lin (UMD-CP & UNC-CH)
4333 An Invisible Black-box Backdoor Attack through Frequency Domain Tong Wang (Nanjing University); Yuan Yao (Nanjing University)*; Feng Xu (Nanjing University); Shengwei An (Purdue University); Hanghang Tong (University of Illinois at Urbana-Champaign); Ting Wang (Penn State)
4336 Learning Mutual Modulation for Self-Supervised Cross-Modal Super-Resolution Xiaoyu Dong (The University of Tokyo / RIKEN AIP); Naoto Yokoya (The University of Tokyo)*; Longguang Wang (National University of Defense Technology); Tatsumi Uezato (Hitachi, Ltd)
4338 TransGrasp: Grasp Pose Estimation of a Category of Objects by Transferring Grasps from Only One Labeled Instance Hongtao Wen (Dalian University of Technology); Jianhang Yan (Dalian University of Technology); Wanli Peng (Dalian University of Technology)*; Yi Sun (Dalian University of Technology)
4343 Learning Instance and Task-Aware Dynamic Kernels for Few-shot Learning Rongkai Ma (Monash University)*; Pengfei Fang (The Australian National University); Gil Avraham (Monash University); Yan Zuo (CSIRO); Tianyu Zhu (Monash University); Tom Drummond (University of Melbourne); Mehrtash Harandi (Monash University)
4346 PillarNet: Real-Time and High-Performance Pillar-based 3D Object Detection Guangsheng Shi (Harbin Institute of Technology)*; Ruifeng Li (Harbin Institute of Technology); Chao Ma (Shanghai Jiao Tong University)
4348 Robust Object Detection With Inaccurate Bounding Boxes Chengxin Liu (Huazhong University of Science and Technology); Kewei Wang (Huazhong Univ. of Sci.&Tech.); Hao Lu (Huazhong University of Science and Technology); Zhiguo Cao (Huazhong Univ. of Sci.&Tech.)*; Ziming Zhang (Worcester Polytechnic Institute)
4349 Revisiting the Critical Factors of Augmentation-Invariant Representation Learning Junqiang Huang (MEGVII Technology)*; Xiangwen Kong (MEGVII Technology); Xiangyu Zhang (Megvii Technology)
4359 A Fast Knowledge Distillation Framework for Visual Recognition Zhiqiang Shen (Carnegie Mellon University)*; Eric Xing (MBZUAI, CMU, and Petuum Inc.)
4366 MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment Jie Ren (Megvii Inc.); Wenteng Liang (Megvii); Ran Yan (Megvii)*; Luo Mai (University of Edinburgh); Shiwen Liu (Megvii); Xiao Liu (Megvii Inc)
4367 Spectrum-aware and Transferable Architecture Search for Hyperspectral Image Restoration Wei He (Wuhan University)*; Quanming Yao (Tsinghua University); Naoto Yokoya (The University of Tokyo); Tatsumi Uezato (Hitachi, Ltd); Hongyan Zhang (Wuhan University); Liangpei Zhang (Wuhan University)
4374 Boosting Transferability of Targeted Adversarial Examples via Hierarchical Generative Networks Xiao Yang (Tsinghua University)*; Yinpeng Dong (Tsinghua University); Tianyu Pang (Sea AI Lab); Hang Su (Tsinghua University); Jun Zhu (Tsinghua University)
4378 Exploring the Devil in Graph Spectral Domain for 3D Point Cloud Attacks Qianjiang Hu (Peking University); Daizong Liu (Peking University); Wei Hu (Peking University)*
4385 Geometry-aware Single-image Full-body Human Relighting Chaonan Ji (Tsinghua University); Tao Yu (Tsinghua University); Kaiwen Guo (Google); JINGXIN LIU (OPPO); Yebin Liu (Tsinghua University)*
4388 Optical Flow Training under Limited Label Budget via Active Learning Shuai Yuan (Duke University)*; Xian Sun (Duke University); Hannah H Kim (Duke University); Shuzhi Yu (Duke University); Carlo Tomasi (Duke University)
4395 RVSL: Robust Vehicle Similarity Learning in Real Hazy Scenes Based on Semi-supervised Learning Wei-Ting Chen (National Taiwan University)*; I-HSIANG CHEN (National Taiwan University); CHIH-YUAN YEH (National Taiwan University); Hao-Hsiang Yang (National Taiwan University); Hua-En Chang (National Taiwan University); Jian-Jiun Ding (National Taiwan University); Sy-Yen Kuo (National Taiwan University)
4400 Hierarchical Feature Embedding for Visual Tracking Zhixiong Pi (Huazhong University of Science and Technology)*; Weitao Wan (Tencent); Chong Sun (Tencent Wechat); Changxin Gao (Huazhong University of Science and Technology); Nong Sang (Huazhong University of Science and Technology); Chen Li (Tencent)
4401 Neural Color Operators for Sequential Image Retouching YILI WANG (Tsinghua University); Xin Li (Baidu); Kun Xu (Tsinghua University)*; Dongliang He (Baidu); Qi Zhang (baidu); Fu Li (Baidu); Errui Ding (Baidu Inc.)
4402 Optimizing Image Compression via Joint Learning with Denoising Ka Leong Cheng (The Hong Kong University of Science and Technology); Yueqi Xie (The Hong Kong University of Science and Technology); Qifeng Chen (HKUST)*
4405 DICE: Leveraging Sparsification for Out-of-Distribution Detection Yiyou Sun (University of Wisconsin Madison); Yixuan Li (University of Wisconsin-Madison)*
4406 DeMFI: Deep Joint Deblurring and Multi-Frame Interpolation with Flow-Guided Attentive Correlation and Recursive Boosting Jihyong Oh (KAIST)*; Munchurl Kim (Korea Advanced Institute of Science and Technology)
4408 Invariant Feature Learning for Generalized Long-Tailed Classification Kaihua Tang (Nanyang Technological University)*; Mingyuan Tao (Damo Academy, Alibaba Group); Jiaxin Qi (Nanyang Technological University); Zhenguang Liu (Zhejiang University); Hanwang Zhang (Nanyang Technological University)
4411 Fine-Grained Visual Entailment Christopher L Thomas (Columbia University)*; Yipeng Zhang (Columbia University); Shih-Fu Chang (Columbia University)
4412 Sliced Recursive Transformer Zhiqiang Shen (Carnegie Mellon University)*; Zechun Liu (Carnegie Mellon University); Eric Xing (MBZUAI, CMU, and Petuum Inc.)
4413 Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval Fan Hu (Renmin University of China); Aozhu Chen (Renmin University of China); Ziyue Wang (Renmin University of China); Fangming Zhou (Renmin University of China); Jianfeng Dong (Zhejiang Gongshang University); Xirong Li (Renmin University of China)*
4416 Asymmetric Relation Consistency Reasoning for Video Relation Grounding Huan Li (Xi’an Jiaotong University); Ping Wei (Xi’an Jiaotong University)*; Jiapeng Li (Xi’an Jiaotong University); Zeyu Ma (Xi’an Jiaotong University); Jiahui Shang (Xi’an Jiaotong University); Nanning Zheng (Xi’an Jiaotong University)
4420 PETR: Position Embedding Transformation for Multi-View 3D Object Detection Yingfei Liu (Megvii Technology); Tiancai Wang (Megvii Technology)*; Xiangyu Zhang (Megvii Technology); Jian Sun (Megvii Technology)
4422 Contextual Text Block Detection towards Scene Text Understanding Chuhui Xue (Nanyang Technological University); Jiaxing Huang (Nanyang Technological University); Wenqing Zhang (ByteDance); Shijian Lu (Nanyang Technological University)*; Changhu Wang (ByteDance.Inc); Song Bai (University of Oxford)
4426 Structure-aware Editable Morphable Model for 3D Facial Detail Animation and Manipulation Jingwang Ling (Tsinghua University); Zhibo Wang (Tsinghua University); Ming Lu (Intel Labs China); Quan Wang (Sensetime); Chen Qian (SenseTime); Feng Xu (Tsinghua University)*
4429 UniNet: Unified Architecture Search with Convolution, Transformer, and MLP Jihao Liu (Sensetime)*; Xin Huang (Waseda University); Guanglu Song (Sensetime); Hongsheng Li (The Chinese University of Hong Kong); Yu Liu (SenseTime Group LTD)
4433 Efficient Decoder-free Object Detection with Transformers Peixian Chen (Youtu Tencent); mengdan zhang (Youtu, Tencent); Yunhang Shen (Xiamen University); Kekai Sheng (Youtu Lab, Tencent Inc.); Yuting Gao (tencent); Xing Sun (Shopee); Ke Li (Tencent)*; Chunhua Shen (University of Adelaide, Australia)
4439 Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation William McNally (University of Waterloo)*; Kanav Vats (University of Waterloo); Alexander Wong (University of Waterloo); John McPhee (University of Waterloo)
4440 CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation Lu Qi (The Chinese University of Hong Kong)*; Jason Kuen (Adobe Research); Zhe Lin (Adobe Research); Jiuxiang Gu (Adobe Research); Fengyun Rao (Tencent); Dian Li (Tencent.com); Weidong Guo (Tencent); Zhen Wen (Tencent Technology (Shenzhen) Co., Ltd); Ming-Hsuan Yang (University of California at Merced); Jiaya Jia (Chinese University of Hong Kong)
4447 StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning Jinghuan Shang (Stony Brook University)*; Kumara Kahatapitiya (Stony Brook University); Xiang Li (Stony Brook University); Michael S Ryoo (Stony Brook/Google)
4451 S2Net: Stochastic Sequential Pointcloud Forecasting Xinshuo Weng (NVIDIA Research)*; Junyu Nan (Carnegie Mellon University); Kuan-Hui Lee (Toyota Research Institute); Rowan McAllister (Toyota Research Institute); Adrien Gaidon (Toyota Research Institute); Nicholas Rhinehart (UC Berkeley); Kris Kitani (Carnegie Mellon University)
4452 D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding Zhenyu Chen (Technical University of Munich)*; Qirui Wu (Simon Fraser University); Matthias Niessner (Technical University of Munich); Angel X Chang (Simon Fraser University)
4464 AMixer: Adaptive Weight Mixing for Self-Attention Free Vision Transformers Yongming Rao (Tsinghua University); Wenliang Zhao (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)*
4471 Neural Image Representations for Multi-Image Fusion and Layer Separation Seonghyeon Nam (York University); Marcus A Brubaker (York University); Michael S Brown (York University)*
4477 Panoramic Human Activity Recognition Ruize Han (College of Intelligence and Computing, Tianjin University); Haomin Yan (Tianjin University); Jiacheng Li (College of Intelligence and Computing, Tianjin University); Songmiao Wang (Tianjin University); Wei Feng (College of Intelligence and Computing, Tianjin University, China)*; Song Wang (University of South Carolina)
4478 Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution Yushu Wu (Northeastern University)*; Yifan Gong (Northeastern University); Pu Zhao (Northeastern University); Yanyu Li (Northeastern University); Zheng Zhan (Northeastern University); Wei Niu (William & Mary); Hao Tang (ETH Zurich); Minghai Qin (Western Digital Research); Bin Ren (William & Mary); Yanzhi Wang (Northeastern University)
4481 Dual Adaptive Transformations for Weakly Supervised Point Cloud Segmentation Zhonghua Wu (Nanyang Technological University)*; Yicheng Wu (Monash University); Guosheng Lin (Nanyang Technological University); Jianfei Cai (Monash University); Chen Qian (SenseTime)
4495 Modality Synergy Complement Learning with Cascaded Aggregation for Visible-Infrared Person Re-Identification Yiyuan Zhang (Beijing Institute of Technology); Sanyuan Zhao (Beijing Institute of Technology)*; Yuhao Kang (Beijing Institute of Technology); Jianbing Shen (Inception Institute of Artificial Intelligence)
4496 RA-Depth: Resolution Adaptive Self-Supervised Monocular Depth Estimation Mu He (Nanjing University of Science and Technology)*; Le Hui (Nanjing University of Science and Technology); Yikai Bian (Nanjing University of Science and Technology); Jian Ren (Nanjing University of Science and Technology); Jin Xie (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology)
4505 MoFaNeRF: Morphable Facial Neural Radiance Field Yiyu Zhuang (Nanjing University); Hao Zhu (Nanjing University)*; Xusen Sun (Nanjing University); Xun Cao (Nanjing University)
4513 Visual Cross-View Metric Localization with Dense Uncertainty Estimates Zimin Xia (Delft University of Technology)*; Olaf Booij (TomTom); Marco Manfredi (TomTom); Julian F P Kooij (Delft University of Technology)
4525 The One Where They Reconstructed 3D Humans and Environments in TV Shows Georgios Pavlakos (UC Berkeley)*; Ethan Weber (UC Berkeley); Matthew Tancik (UC Berkeley); Angjoo Kanazawa (University of California Berkeley)
4530 PointInst3D: Segmenting 3D Instances by Points Tong He (University of Adelaide)*; Wei Yin (University of Adelaide); Chunhua Shen (University of Adelaide, Australia); Anton van den Hengel (University of Adelaide)
4533 PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation Haobo Yuan (Wuhan University)*; Xiangtai Li (Peking University); Yibo Yang (Peking University); Guangliang Cheng (Sensetime Group Limited); Jing Zhang (The University of Sydney); Yunhai Tong (Peking University); Lefei Zhang (Wuhan University); Dacheng Tao (JD.com)
4534 Quasi-Balanced Self-Training on Noise-Aware Synthesis of Object Point Clouds for Closing Domain Gap Yongwei Chen (South China University of Technology); ZiHao Wang (South China University of Technology); Longkun Zou (South China University of Technology); Ke Chen (South China University of Technology); Kui Jia (South China University of Technology)*
4537 TinyViT: Fast Pretraining Distillation for Small Vision Transformers Kan Wu (Sun Yat-sen University); Jinnian Zhang (University of Wisconsin Madison); Houwen Peng (Microsoft Research)*; Mengchen Liu (Microsoft); Bin Xiao (Microsoft); Jianlong Fu (Microsoft Research); Lu Yuan (Microsoft)
4551 VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data Jiajun Su (Peking University)*; Chunyu Wang (Microsoft Research Asia); Xiaoxuan Ma (Peking University); Wenjun Zeng (EIT Institute for Advanced Study); Yizhou Wang (PKU)
4552 Poseur: Direct Human Pose Regression with Transformers Weian Mao (The University of Adelaide)*; Yongtao Ge (The University of Adelaide); Chunhua Shen (University of Adelaide, Australia); Xinlong Wang (University of Adelaide); Zhi Tian (Meituan); Zhibin Wang (Alibaba Group); Anton van den Hengel (University of Adelaide)
4557 Adaptive Image Transformations for Transfer-based Adversarial Attack Zheng Yuan (Institute of Computing Technology, Chinese Academy of Sciences); Jie Zhang (ICT, CAS)*; Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences)
4566 D2ADA: Dynamic Density-aware Active Domain Adaptation for Semantic Segmentation Tsung-Han Wu (National Taiwan University)*; Yi-Syuan Liou (National Taiwan University); Shao-Ji Yuan (National Taiwan University); Hsin-Ying Lee (National Taiwan University); Tung-I Chen (National Taiwan University); Kuan-Chih Huang (National Taiwan University); Winston H. Hsu (National Taiwan University)
4568 SQN: Weakly-Supervised Semantic Segmentation of Large-Scale 3D Point Clouds Qingyong Hu (University of Oxford); Bo Yang (The Hong Kong Polytechnic University)*; Guangchi Fang (Sun Yat-sen University); Yulan Guo (Sun Yat-sen University); Ales Leonardis (University of Birmingham); Niki Trigoni (University of Oxford); Andrew Markham (University of Oxford)
4581 Deep Portrait Delighting Joshua William Weir (Victoria University of Wellington)*; Junhong Zhao (CMIC); Andrew Chalmers (CMIC); Taehyun Rhee (Victoria University of Wellington)
4584 Vector Quantized Image-to-Image Translation Yu-Jie Chen (National Chiao Tung University); Shin-I Cheng (National Chiao Tung University); Wei-Chen Chiu (National Chiao Tung University)*; Hung-Yu Tseng (Facebook); Hsin-Ying Lee (Snap Inc)
4588 PointMixer: MLP-Mixer for Point Cloud Understanding Jaesung Choe (KAIST)*; Chunghyun Park (POSTECH); Francois Rameau (KAIST); Jaesik Park (POSTECH); In So Kweon (KAIST)
4589 V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer Runsheng Xu (University of California, Los Angeles); Hao Xiang (University of California, Los Angeles); Zhengzhong Tu (University of Texas at Austin); Xin Xia (University of California, Los Angeles); Ming-Hsuan Yang (University of California at Merced); Jiaqi Ma (University of California, Los Angeles)*
4593 Cross-Domain Ensemble Distillation for Domain Generalization Kyungmoon Lee (POSTECH)*; Sungyeon Kim (POSTECH); Suha Kwak (POSTECH)
4596 Cross-Modal 3D Shape Generation and Manipulation Zezhou Cheng (University of Massachusetts, Amherst)*; Menglei Chai (Snap Inc.); Jian Ren (Snap Inc.); Hsin-Ying Lee (Snap Inc); Kyle B Olszewski (Snap Inc.); Zeng Huang (Snap Inc.); Subhransu Maji (University of Massachusetts, Amherst); Sergey Tulyakov (Snap Inc)
4607 Latent Partition Implicit with Surface Codes for 3D Representation Chao Chen (Tsinghua University); Yu-Shen Liu (Tsinghua University)*; Zhizhong Han (Wayne State University)
4614 FILM: Frame Interpolation for Large Motion Fitsum Reda (Google)*; Janne Kontkanen (Google); Eric Tabellion (Google); Deqing Sun (Google); Caroline Pantofaru (Google Research); Brian Curless (University of Washington)
4619 Facial Depth and Normal Estimation using Single Dual-Pixel Camera Minjun Kang (KAIST)*; Jaesung Choe (KAIST); Hyowon Ha (Facebook); Hae-Gon Jeon (GIST); Sunghoon Im (DGIST); In So Kweon (KAIST); Kuk-Jin Yoon (KAIST)
4622 Initialization and Alignment for Adversarial Texture Optimization Xiaoming Zhao (University of Illinois at Urbana-Champaign)*; Zhizhen Zhao (University of Illinois at Urbana-Champaign); Alexander Schwing (UIUC)
4631 Regularizing Vector Embedding in Bottom-Up Human Pose Estimation Haixin Wang (School of Artificial Intelligence, University of Chinese Academy of Sciences)*; lu zhou (CASIA); Yingying Chen (CASIA); Ming Tang (Institute of Automation, Chinese Academy of Sciences); Jinqiao Wang (Institute of Automation, Chinese Academy of Sciences)
4633 Equivariant Hypergraph Neural Networks Jinwoo Kim (KAIST); Saeyoon Oh (KAIST); Sungjun Cho (LG AI Research); Seunghoon Hong (KAIST)*
4636 Learning Quality-aware Dynamic Memory for Video Object Segmentation Yong Liu (Tsinghua University)*; Ran Yu (Tsinghua university); Fei Yin (Tsinghua University); Xinyuan Zhao (Huawei); Wei Zhao (Huawei); Weihao Xia (University College London); Yujiu Yang (Tsinghua University)
4652 Neural Scene Decoration from a Single Photograph Hong Wing Pang (The Hong Kong University of Science and Technology)*; Yingshu Chen (The Hong Kong University of Science and Technology); Phuoc-Hieu T. Le (VinAI Research); Binh-Son Hua (VinAI Research); Thanh Nguyen (Deakin University, Australia); Sai-Kit Yeung (Hong Kong University of Science and Technology)
4656 Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds Ayush Jain (Carnegie Mellon University)*; Nikolaos Gkanatsios (Carnegie Mellon University); Ishita Mediratta (Meta AI); Katerina Fragkiadaki (Carnegie Mellon University)
4658 CIRCLE: Convolutional Implicit Reconstruction and Completion for Large-scale Indoor Scene Hao-Xiang Chen (Tsinghua University)*; Jiahui Huang (Tsinghua University); Tai-Jiang Mu (Tsinghua University); Shi-Min Hu (Tsinghua University)
4659 Discovering Deformable Keypoint Pyramids Jianing Qian (University of Pennsylvania)*; Anastasios Panagopoulos (University of Pennsylvania); Dinesh Jayaraman (University of Pennsylvania)
4668 TIDEE: Tidying Up Novel Rooms using Visuo-Semantic Commonsense Priors Gabriel Sarch (Carnegie Mellon University)*; Zhaoyuan Fang (Carnegie Mellon University); Adam Harley (Carnegie Mellon University); Paul Schydlo (Carnegie Mellon University); Michael J Tarr (Carnegie Mellon University); Saurabh Gupta (UIUC); Katerina Fragkiadaki (Carnegie Mellon University)
4669 MOTR: End-to-End Multiple-Object Tracking with TRansformer Fangao Zeng (Megvii Technology); Bin Dong (Megvii Technology); Yuang Zhang (Shanghai Jiao Tong University); Tiancai Wang (Megvii Technology)*; Xiangyu Zhang (Megvii Technology); Yichen Wei (Megvii Research Shanghai)
4672 K-centered Patch Sampling for Efficient Video Recognition Seong Hyeon Park (KAIST AI)*; Jihoon Tack (KAIST); Byeongho Heo (NAVER AI LAB); Jung-Woo Ha (NAVER CLOVA AI Lab); Jinwoo Shin (KAIST)
4675 Learning Implicit Feature Alignment Function for Semantic Segmentation Hanzhe Hu (Peking University)*; Yinbo Chen (UC San Diego); Jiarui Xu (University of California San Diego); Shubhankar Borse (Qualcomm AI Research ); Hong Cai (Qualcomm AI Research); Fatih Porikli (Qualcomm AI Research); Xiaolong Wang (UCSD)
4677 A Visual Navigation Perspective for Category-Level Object Pose Estimation Jiaxin Guo (Zhejiang University)*; Yiyi Liao (MPI-IS and University of Tübingen); Zhong Fangxun (CUHK); Rong Xiong (Zhejiang University); Yunhui Liu (CUHK); Yue Wang (Zhejiang University)
4681 ScaleNet: Searching for the Model to Scale Jiyang Xie (Huawei Noah’s Ark Lab); Xiu Su (University of Sydney); Shan You (SenseTime); Zhanyu Ma (Beijing University of Posts and Telecommunications)*; Fei Wang (University of Science and Technology of China); Chen Qian (SenseTime)
4684 Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels Ganlong Zhao (The University of Hong Kong); Guanbin Li (Sun Yat-sen University)*; Yipeng Qin (Cardiff University); Feng Liu (Deepwise AI Lab); Yizhou Yu (The University of Hong Kong)
4685 GALA: Toward Geometry-and-Lighting-Aware Object Search for Compositing Sijie Zhu (University of Central Florida)*; Zhe Lin (Adobe Research); Scott Cohen (Adobe Research); Jason Kuen (Adobe Research); Zhifei Zhang (Adobe Research); Chen Chen (University of Central Florida)
4688 FairGRAPE: Fairness-aware GRAdient Pruning mEthod for Face Attribute Classification Xiaofeng Lin (University of California – Los Angeles); Seungbae Kim (University of South Florida); Jungseock Joo (University of California Los Angeles)*
4697 Tackling Background Distraction in Video Object Segmentation Suhwan Cho (Yonsei University)*; Heansung Lee (Yonsei University); Minhyeok Lee (Yonsei University); Chaewon Park (Yonsei University); Sungjun Jang (Yonsei University); Minjung Kim (Yonsei University); Sangyoun Lee (Yonsei University)
4700 Hyperspherical Learning in Multi-Label Classification Bo Ke (Tencent Youtu Lab)*; Yunquan Zhu (Tencent YouTu Lab); Mengtian Li (East China Normal University); Xiujun Shu (Tencent Youtu Lab); Ruizhi Qiao (Tencent Youtu Lab); Bo Ren (Tencent)
4705 The Surprisingly Straightforward Scene Text Removal Method With Gated Attention and Region of Interest Generation: A Comprehensive Prominent Model Analysis Hyeonsu Lee (Naver Corporation)*; Chankyu Choi (Naver Corporation)
4708 FingerprintNet: Synthesized Fingerprints for Generated Image Detection Yonghyun Jeong (NAVER CLOVA)*; Doyeon Kim (Line+); Youngmin Ro (Samsung SDS); pyounggeon kim (SDS); Jongwon Choi (Chung-Ang University)
4715 ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild Wang Zhao (Tsinghua University)*; Shaohui Liu (ETH Zurich); Hengkai Guo (ByteDance AI Lab); Wenping Wang (The University of Hong Kong); Yong-Jin Liu (Tsinghua University)
4721 Free-Viewpoint RGB-D Human Performance Capture and Rendering Phong Ha Nguyen (University of Oulu)*; Nikolaos Sarafianos (Facebook Reality Labs); Christoph Lassner (Meta Reality Labs Research); Janne Heikkila (University of Oulu, Finland); Tony Tung (Facebook)
4727 When Active Learning Meets Implicit Semantic Data Augmentation zhuangzhuang chen (shenzhen university); Jin Zhang (Shenzhen University); Pan Wang (Shenzhen University); Jie Chen (Shenzhen University); Jianqiang Li (Shenzhen University)*
4733 Multiview Regenerative Morphing with Dual Flows Chih-Jung Tsai (National Tsing Hua University); Cheng Sun (National Tsing Hua University); Hwann-Tzong Chen (National Tsing Hua University)*
4734 Frequency and Spatial Dual Guidance for Image Dehazing Hu Yu (University of Science and Technology of China); Naishan Zheng (University of Science and Technology of China); man zhou (University of Science and Technology of China); Jie Huang (University of Science and Technology of China); Zeyu Xiao (University of Science and Technology of China); Feng Zhao (University of Science and Technology of China)*
4736 The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing Dawit Mureja Argaw (KAIST)*; Fabian Caba (Adobe Research); Joon-Young Lee (Adobe Research); Markus Woodson (Adobe); In So Kweon (KAIST)
4739 Hallucinating Pose-Compatible Scenes Tim Brooks (UC Berkeley)*; Alexei A Efros (UC Berkeley)
4748 Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection Hang Ye (Peking University); Wentao Zhu (Peking University)*; Chunyu Wang (Microsoft Research asia); Rujie Wu (Peking University); Yizhou Wang (PKU)
4754 Video Interpolation by Event-driven Anisotropic Adjustment of Optical Flow Song Wu (Huawei Technologies Co., Ltd.); Kaichao You (Tsinghua Univ); Weihua He (Tsinghua University)*; Chen Yang (Peking University); Yang Tian (Tsinghua University); Yaoyuan Wang (Huawei Technologies Co., Ltd.); Jianxing Liao (HUAWEI TECHNOLOGIES CO.LTD); Ziyang Zhang (HUAWEI TECHNOLOGIES CO.LTD)
4761 Motion and Appearance Adaptation for Cross-Domain Motion Transfer Borun Xu (University of Electronic Science and Technology of China)*; Biao Wang (Alibaba Group); Jinhong Deng (University of Electronic Science and Technology of China); Jiale Tao (University of Electronic Science and Technology of China); Tiezheng Ge (Alibaba Group); Yuning Jiang (Alibaba Group); Wen Li (University of Electronic Science and Technology of China); Lixin Duan (University of Electronic Science and Technology of China)
4762 AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets Zhijun Tu (Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong university)*; Xinghao Chen (Huawei Noah’s Ark Lab); Pengju Ren (Institute of Artificial Intelligence at Xi’an Jiaotong University); Yunhe Wang (Huawei Technologies)
4781 Social-Implicit: Rethinking Trajectory Prediction Evaluation and The Effectiveness of Implicit Maximum Likelihood Estimation Abduallah A Mohamed (Meta)*; Deyao Zhu (King Abdullah University of Science and Technology); Warren Vu (The University of Texas at Austin); Mohamed Elhoseiny (KAUST); Christian Claudel (The university of Texas at Austin)
4788 A Generalized & Robust Framework For Timestamp Supervision in Temporal Action Segmentation Rahul Rahaman (National University of Singapore)*; Dipika Singhania (National University of Singapore); Alex Thiery (National University of Singapore); Angela Yao (National University of Singapore)
4790 A Deep Moving-camera Background Model Guy Erez (Ben Gurion University)*; Ron A Shapira Weber (Ben-Gurion University); Oren Freifeld (Ben-Gurion University)
4800 DLME: Deep Local-flatness Manifold Embedding Zelin Zang (Zhejiang University & Westlake University)*; Siyuan Li (Westlake University); di wu (Westlake University); Ge Wang (Westlake University); Kai Wang (National University of Singapore); Lei Shang (Alibaba Group); Baigui Sun (Alibaba Group); Hao Li (Alibaba Group); Stan Z. Li (Westlake University)
4802 Neural Video Compression using GANs for Detail Synthesis and Propagation Fabian Mentzer (Google)*; Eirikur Agustsson (Google); Johannes Ballé (Google); David Minnen (Google Inc.); Nick Johnston (Google); George Toderici (Google Research)
4804 Few-shot Action Recognition with Hierarchical Matching and Contrastive Learning Sipeng Zheng (Renmin University of China)*; Shizhe Chen (INRIA); Qin Jin (Renmin University of China)
4807 Perspective Flow Aggregation for Data-Limited 6D Object Pose Estimation Yinlin Hu (EPFL)*; Pascal Fua (EPFL, Switzerland); Mathieu Salzmann (EPFL)
4820 TALISMAN: Targeted Active Learning for Object Detection with Rare Classes and Slices using Submodular Mutual Information Suraj Kothawade (UT Dallas)*; Saikat Ghosh (University of Texas at Dallas); Sumit Shekhar (Adobe Research); Yu Xiang (The University of Texas at Dallas); Rishabh Iyer (University of Texas at Dallas)
4826 New Datasets and Models for Contextual Reasoning in Visual Dialog Yifeng Zhang (University of Minnesota, Twin Cities); Ming Jiang (University of Minnesota); Qi Zhao (University of Minnesota)*
4828 Remote Respiration Monitoring of Moving Person Using Radio Signals Jae-Ho Choi (Pohang University of Science and Technology)*; KIBONG KANG (POSTECH); Kyung-Tae Kim (Pohang University of Science and Technology)
4832 AdvDO: Realistic Adversarial Attacks for Trajectory Prediction Yulong Cao (University of Michigan, Ann Arbor )*; Chaowei Xiao (NVIDIA); Anima Anandkumar (NVIDIA/Caltech); Danfei Xu (Stanford University); Marco Pavone (Stanford University)
4836 Cross-Modality Transformer for Visible-Infrared Person Re-Identification Kongzhu Jiang (University of Science and Technology of China)*; Tianzhu Zhang (University of Science and Technology of China); Xiang Liu (Dongguan University of Technology); Bingqiao Qian (University of Science and Technology of China); Yongdong Zhang (University of Science and Technology of China); Feng Wu (University of Science and Technology of China)
4849 VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition Changyao Tian (Chinese University of Hong Kong); Wenhai Wang (Nanjing University); Xizhou Zhu (SenseTime); Jifeng Dai (SenseTime)*; Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)
4857 Self-Supervised Classification Network Elad Amrani (IBM / Technion)*; Leonid Karlinsky (IBM-Research); Alex Bronstein (Technion)
4865 DevNet: Self-supervised Monocular Depth Learning via Density Volume Construction Kaichen Zhou (University of Oxford)*; Lanqing Hong (Huawei Noah’s Ark Lab); Changhao Chen (National University of Defense Technology); Hang Xu (Huawei Noah’s Ark Lab); Chaoqiang Ye (Huawei); Qingyong Hu (University of Oxford); Zhenguo Li (Huawei Noah’s Ark Lab)
4872 Bayesian Optimization with Clustering and Rollback for CNN Auto Pruning Hanwei FAN (HKUST)*; Jiandong MU (HKUST); Wei Zhang (Hong Kong University of Science and Technology)
4873 Towards Real-World HDRTV Reconstruction: A Data Synthesis-based Approach Zhen Cheng (University of Science and Technology of China)*; Tao Wang (Huawei Noah’s Ark Lab); Yong Li (Huawei Noah’s Ark Lab); Fenglong Song (Huawei Noah’s Ark Lab); Chang Chen (Huawei Noah’s Ark Lab); Zhiwei Xiong (University of Science and Technology of China)
4874 Quantum Motion Segmentation Federica Arrigoni (University of Trento)*; Willi Menapace (University of Trento); Marcel Seelbach Benkner (University of Siegen); Elisa Ricci (University of Trento); Vladislav Golyanik (MPI for Informatics)
4879 Open-world Semantic Segmentation via Contrasting and Clustering Vision-language Embedding Quande Liu (The Chinese University of Hong Kong)*; Youpeng Wen (Dalian University of Technology); Jianhua Han (Huawei Noah’s Ark Lab); Chunjing Xu (Huawei Noah’s Ark Lab); Hang Xu (Huawei Noah’s Ark Lab); Xiaodan Liang (Sun Yat-sen University)
4880 Custom Structure Preservation in Face Aging Guillermo Gomez-Trenado (University of Granada)*; Stéphane Lathuilière (Telecom-Paris); Pablo Mesejo (University of Granada); Oscar Cordón García (University of Granada)
4883 DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks Shih-Yang Su (University of British Columbia)*; Timur Bagautdinov (Facebook); Helge Rhodin (UBC)
4888 Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization Jiaxin Qi (Nanyang Technological University)*; Kaihua Tang (Nanyang Technological University); Qianru Sun (Singapore Management University); Xian-Sheng Hua (Damo Academy, Alibaba Group); Hanwang Zhang (Nanyang Technological University)
4891 Spatio-Temporal Deformable Attention Network for Video Deblurring Huicong Zhang (Harbin Institute of Technology)*; Haozhe Xie (Tencent AI Lab); Hongxun Yao (Harbin Institute of Technology)
4894 CHORE: Contact, Human and Object REconstruction from a single RGB image Xianghui Xie (Saarland University )*; Bharat Lal Bhatnagar (University of Tübingen, MPI informatik); Gerard Pons-Moll (University of Tübingen)
4899 Complementing Brightness Constancy with Deep Networks for Optical Flow Prediction Vincent LE GUEN (EDF R&D, CNAM)*; Clément Rambour (Cnam); Nicolas Thome (CNAM, Paris)
4902 Learning Discriminative Shrinkage Deep Networks for Image Deconvolution Pin-Hung Kuo (National Taiwan University)*; Jinshan Pan (Nanjing University of Science and Technology); Shao-Yi Chien (National Taiwan University); Ming-Hsuan Yang (University of California at Merced)
4904 Camera Pose Estimation and Localization with Active Audio Sensing Karren D Yang (MIT); Michael Firman (Niantic); Eric Brachmann (Niantic)*; Clement LJC Godard (Niantic)
4906 Learning Efficient Multi-Agent Cooperative Visual Exploration Chao Yu (Tsinghua University); Xinyi Yang (Tsinghua University)*; Jiaxuan Gao (Tsinghua University); Huazhong Yang (Tsinghua University); Yu Wang (Tsinghua University); Yi Wu (Tsinghua University)
4908 4DContrast: Contrastive Learning with Dynamic Correspondences for 3D Scene Understanding Yujin Chen (Technical University of Munich)*; Matthias Niessner (Technical University of Munich); Angela Dai (Technical University of Munich)
4918 Learned Vertex Descent: A New Direction for 3D Human Model Fitting Enric Corona (IRI)*; Gerard Pons-Moll (University of Tübingen); Guillem Alenyà (IRI); Francesc Moreno (IRI)
4921 Hierarchical Semi-Supervised Contrastive Learning for Contamination-Resistant Anomaly Detection Gaoang Wang (Zhejiang University); Yibing Zhan (JD Explore Academy); Xinchao Wang (National University of Singapore); Mingli Song (Zhejiang University)*; Klara Nahrstedt (University of Illinois at Urbana-Champaign)
4927 Learning to Fit Morphable Models Vasileios Choutas (ETH Zurich)*; Federica Bogo (Meta); Jingjing Shen (Microsoft); Julien Valentin (Microsoft)
4929 Few-Shot Classification with Contrastive Learning Zhanyuan Yang (Shenzhen University); Jinghua Wang (Harbin Institute of Technology); Yingying Zhu (Shenzhen University)*
4931 ARM: Any-Time Super-Resolution Method Bohong Chen (Xiamen University)*; Mingbao Lin (Xiamen University, China); Kekai Sheng (Youtu Lab, Tencent Inc.); mengdan zhang (Youtu, Tencent); Peixian Chen (Youtu Tencent); Ke Li (Tencent); Liujuan Cao (Xiamen University); Rongrong Ji (Xiamen University, China)
4933 Tracking Every Thing in the Wild Siyuan Li (ETH Zurich)*; Martin Danelljan (ETH Zurich); Henghui Ding (ETH Zurich); Thomas E Huang (ETH Zürich); Fisher Yu (ETH Zurich)
4934 Learning Self-prior for Mesh Denoising using Dual Graph Convolutional Networks Shota Hattori (The University of Tokyo)*; Tatsuya Yatagawa (The University of Tokyo); Yutaka Ohtake (The University of Tokyo); Suzuki Hiromasa (The University of Tokyo)
4940 Few Zero Level Set-Shot Learning of Shape Signed Distance Functions in Feature Space Amine Ouasfi (IMT Atlantique ); Adnane Boukhayma (Inria)*
4948 Attention-aware Learning for Hyperparameters Prediction in Image Processing Pipelines Haina Qin (University of Chinese Academy of Sciences); Longfei Han (Beijing Technology and Business University); Juan Wang (Institute of Automation, Chinese Academy of Sciences); Congxuan Zhang (Nanchang Hangkong University); Bing Li (National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences)*; Weiming Hu (Institute of Automation, Chinese Academy of Sciences); Yanwei Li (Zeku Technology (Shanghai) Corp., Ltd.)
4950 Attaining Class-level Forgetting in Pretrained Model using Few Samples Pravendra Singh (IIT Roorkee); Pratik Mazumder (Indian Institute of Technology Jodhpur)*; Mohammed Asad Karim (Carnegie Mellon University)
4951 Data Invariants to Understand Unsupervised Out-of-Distribution Detection Lars Doorenbos (University of Bern)*; Raphael Sznitman (University of Bern); Pablo Márquez Neila (University of Bern)
4953 STEEX: Steering Counterfactual Explanations with Semantics Paul Jacob (École Polytechnique ); eloi zablocki (Valeo.ai)*; Hedi Ben-younes (Valeo AI); Mickael Chen (valeo.ai); Patrick Pérez (Valeo.ai); Matthieu Cord (Sorbonne University)
4958 Outpainting by Queries Kai Yao (Xi’an Jiaotong-liverpool University); Penglei Gao (Xi’an Jiaotong-Liverpool University); Xi Yang (Xi’an Jiaotong Liverpool University ); jie Sun (Xi’an Jiaotong-Liverpool University ); Rui Zhang (Xi’an Jiaotong-Liverpool University); Kaizhu Huang (Duke Kunshan University)*
4961 HULC: 3D HUman Motion Capture with Pose Manifold SampLing and Dense Contact Guidance Soshi Shimada (MPI for Informatics)*; Vladislav Golyanik (MPI for Informatics); Zhi Li (Max Planck Institute for Informatics); Patrick Pérez (Valeo.ai); Weipeng Xu (Reality Labs Research); Christian Theobalt (MPI Informatik)
4962 Interpretable Open-Set Domain Adaptation via Angular Margin Separation Xinhao Li (University of Electronic Science and Technology of China); Jingjing Li (University of Electronic Science and Technology of China)*; Zhekai Du (University of Electronic Science and Technology of China); Lei Zhu (Shandong Normal University); Wen Li (University of Electronic Science and Technology of China)
4963 EgoBody: Human Body Shape and Motion of Interacting People from Head-Mounted Devices Siwei Zhang (ETH Zurich)*; Qianli Ma (Max Planck Institute for Intelligent Systems); Yan Zhang (ETH Zurich); Zhiyin Qian (ETH Zürich); Taein Kwon (ETH Zurich); Marc Pollefeys (ETH Zurich / Microsoft); Federica Bogo (Meta); Siyu Tang (ETH Zurich)
4966 ViTAS: Vision Transformer Architecture Search Xiu Su (University of Sydney); Shan You (SenseTime)*; Jiyang Xie (Huawei Noah’s Ark Lab); Mingkai Zheng (The University of Sydney); Fei Wang (University of Science and Technology of China); Chen Qian (SenseTime); Changshui Zhang (Tsinghua University); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Chang Xu (University of Sydney)
4970 LaLaLoc++: Global Floor Plan Comprehension for Layout Localisation in Unvisited Environments Henry Howard-Jenkins (University of Oxford)*; Victor Adrian Prisacariu (University of Oxford)
4972 diffConv: Analyzing Irregular Point Clouds with an Irregular View Manxi Lin (Technical University of Denmark)*; Aasa Feragen (Technical University of Denmark)
4975 ReAct: Temporal Action Detection with Relational Action Queries Dingfeng Shi (Beihang University)*; Yujie Zhong (University of Oxford); Qiong Cao (JD.com); Jing Zhang (The University of Sydney); Lin Ma (Meituan); Jia Li (Beihang University); Dacheng Tao (JD.com)
4976 StyleBabel: Artistic Style Tagging and Captioning Dan Ruta (University of Surrey)*; Andrew Gilbert (University of Surrey); Pranav V Aggarwal (Adobe Inc.); Naveen Marri (Adobe Inc); Ajinkya Kale (Adobe); Jo Briggs (University of Northumbria); Chris Speed (University of Edinburgh); Hailin Jin (Adobe Research); Baldo Faieta (Adobe); Alex Filipkowski (Adobe); Zhe Lin (Adobe Research); John Collomosse (Adobe Research)
4977 TACS: Taxonomy Adaptive Cross-Domain Semantic Segmentation RUI GONG (ETH Zurich)*; Martin Danelljan (ETH Zurich); Dengxin Dai (ETH Zurich); Danda Pani Paudel (ETH Zürich); Ajad Chhatkuli (ETH Zurich); Fisher Yu (ETH Zurich); Luc Van Gool (ETH Zurich)
4983 Domain Invariant Autoencoders for Self-supervised Learning from Multi-domains Haiyang Yang (Nanjing University)*; Shixiang Tang (The University of Sydney); Meilin Chen (Zhejiang University); Yizhou Wang (Zhejiang University); Feng Zhu (University of Science and Technology of China); Lei Bai (Shanghai AI Laboratory); Rui Zhao (SenseTime Group Limited); Wanli Ouyang (The University of Sydney)
4987 Learned Variational Video Color Propagation Markus Hofinger (Graz University of Technology)*; Erich Kobler (University Hospital Bonn); Alexander Effland (University of Bonn); Thomas Pock (Graz University of Technology)
4988 PD-Flow: A Point Cloud Denoising Framework with Normalizing Flows Aihua Mao (South China University of Technology)*; Zihui Du (South China University of Technology); Yu-Hui Wen (Tsinghua University); Jun Xuan (South China University of Technology); Yong-Jin Liu (Tsinghua University)
4992 Prototypical Contrast Adaptation for Domain Adaptive Semantic Segmentation ZhengKai Jiang (Tencent Youtu Lab)*; Yuxi Li (Tencent); Ceyuan Yang (Chinese University of Hong Kong); Peng Gao (Chinese university of hong kong); Yabiao Wang (Tencent); Ying Tai (Tencent YouTu); Chengjie Wang (Tencent; Shanghai Jiao Tong University)
4996 Adversarial Contrastive Learning via Asymmetric InfoNCE Qiying Yu (Tsinghua University)*; Jieming Lou (Harbin Institute of Technology); Xianyuan Zhan (Tsinghua University); Qizhang Li (Harbin Institute of Technology); Wangmeng Zuo (Harbin Institute of Technology, China); Yang Liu (Tsinghua University); Jingjing Liu (Tsinghua University)
4998 NeRF for Outdoor Scene Relighting Viktor Rudnev (Max Planck Institute for Informatics)*; Mohamed Elgharib (Max Planck Institute for Informatics); William Smith (University of York); Lingjie Liu (Max Planck Institute for Informatics ); Vladislav Golyanik (MPI for Informatics); Christian Theobalt (MPI Informatik)
5001 FusionVAE: A Deep Hierarchical Variational Autoencoder for RGB Image Fusion Fabian Duffhauss (Bosch Center for Artificial Intelligence)*; Vien Anh Ngo (Bosch Center for Artificial Intelligence); Hanna Ziesche (Bosch Center for AI); Gerhard Neumann (Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany)
5007 Self-calibrating Photometric Stereo by Neural Inverse Rendering Junxuan Li (Australian National University)*; HONGDONG LI (Australian National University, Australia)
5009 Time-rEversed diffusioN tEnsor Transformer: A new TENET of Few-Shot Object Detection Shan Zhang (Australian National University); Naila Murray (Naver Labs); Lei Wang (University of Wollongong, Australia); Piotr Koniusz (ANU College of Engineering and Computer Science)*
5017 Detecting Generated Images by Real Images Bo Liu (Chongqing University of Posts and Telecommunications); fan yang (Chongqing University of Posts and Telecommunications); Xiuli Bi (Chongqing University of Posts and Telecommunications); bin xiao (Chongqing University of Posts and Telecommunications)*; Weisheng Li (Chongqing University of Posts and Telecommunications); Xinbo Gao (Chongqing University of Posts and Telecommunications)
5018 VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection Joanna Hong (KAIST)*; Minsu Kim (KAIST); Yong Man Ro (KAIST)
5020 Delta Distillation for Efficient Video Processing Amirhossein Habibian (Qualcomm AI Research)*; Haitam Ben Yahia (Qualcomm AI Research); Davide Abati (Qualcomm AI Research); Efstratios Gavves (University of Amsterdam ); Fatih Porikli (Qualcomm AI Research)
5026 PANDORA: A Panoramic Detection Dataset for Object with Orientation Hang Xu (Hangzhou Dianzi University;The Institute of Computing Technology of the Chinese Academy of Sciences); Qiang Zhao (The Institute of Computing Technology of the Chinese Academy of Sciences); Yike Ma (Institute of Computing Technology, Chinese Academy of Sciences); Xiaodong Li (Huawei Noah’s Ark Lab); Peng Yuan (Huawei Noah’s Ark Lab); Bailan Feng (Huawei Noah’s Ark Lab); Chenggang Yan (Hangzhou Dianzi University); Feng Dai (Institute of Computing Technology, Chinese Academy of Sciences)*
5032 Instance As Identity: A Generic Online Paradigm for Video Instance Segmentation Feng Zhu (University of Technology Sydney)*; Zongxin Yang (Zhejiang University); Xin Yu (University of Technology Sydney); Yi Yang (Zhejiang University); Yunchao Wei (UTS)
5034 Audio-Visual Mismatch-Aware Video Retrieval via Association and Adjustment Sangmin Lee (KAIST)*; Sungjune Park (KAIST); Yong Man Ro (KAIST)
5036 3D Clothed Human Reconstruction in the Wild Gyeongsik Moon (Seoul National University); Hyeongjin Nam (Seoul National University); Takaaki Shiratori (Meta Reality Labs Research); Kyoung Mu Lee (Seoul National University)*
5040 Classification-Regression for Chart Comprehension Matan Levy (The Hebrew University of Jerusalem)*; Rami Ben-Ari (OriginAI); Dani Lischinski (The Hebrew University of Jerusalem)
5042 Zero-Shot Category-Level Object Pose Estimation Walter Goodwin (University of Oxford)*; Sagar Vaze (Visual Geometry Group, University of Oxford); Ioannis Havoutis (Oxford Robotics Institute, University of Oxford); Ingmar Posner (Oxford University)
5044 AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant Benita Wong (National University of Singapore)*; Joya Chen (National University of Singapore); You Wu (Harvard University); Stan Weixian Lei (National University of Singapore); Dongxing Mao (National University of Singapore); Difei Gao (NUS); Mike Zheng Shou (National University of Singapore)
5047 Laplace Mesh Transformer: Dual Attention and Topology Aware Network for 3D mesh Classification and Segmentation Xiao-Juan Li (Institute of Computing Technology, Chinese Academy of Sciences); Jie Yang (Institute of Computing Technology, Chinese Academy of Sciences)*; Fang-Lue Zhang (Victoria University of Wellington)
5048 CoMER: Modeling Coverage for Transformer-based Handwritten Mathematical Expression Recognition Wenqi Zhao (Peking University)*; Liangcai Gao (Peking University)
5049 RBC: Rectifying the Biased Context in Continual Semantic Segmentation Hanbin Zhao (Zhejiang University)*; Fengyu Yang (University of Michigan); Xinghe Fu (Zhejiang University); Xi Li (Zhejiang University)
5051 Don’t Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context Chongyu Liu (South China University of Technology); Lianwen Jin (South China University of Technology)*; Yuliang Liu (Huazhong University of Science and Technology); Canjie Luo (South China University of Technology); Bangdong Chen (South China University of Technology); Fengjun Guo (IntSig Information Co. Ltd); Kai Ding (IntSig Information Co., Ltd)
5066 Semi-Supervised Keypoint Detector and Descriptor for Retinal Image Matching Jiazhen Liu (Renmin University of China); Xirong Li (Renmin University of China)*; Qijie Wei (Vistel Inc.); Jie Xu (Beijing Tongren Hospital); Dayong Ding (Vistel Inc.)
5069 Memory-Augmented Model-Driven Network for Pansharpening Keyu Yan (Hefei Institutes of Physical Science, Chinese Academy of Sciences)*; Man Zhou (Chinese Academy of Sciences); Li Zhang (Chinese Academy of Sciences); Chengjun Xie (Institute of Intelligent Machines, Chinese Academy of Sciences, China)
5076 Factorizing Knowledge in Neural Networks Xingyi Yang (National University of Singapore)*; Jingwen Ye (National University of Singapore); Xinchao Wang (National University of Singapore)
5081 Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes Sam Bond-Taylor (Durham University)*; Peter Hessey (Durham University); Hiroshi Sasaki (Durham University); Toby P Breckon (Durham University); Chris G. Willcocks (Durham University)
5082 Contrastive Vicinal Space for Unsupervised Domain Adaptation Jaemin Na (Ajou University)*; Dongyoon Han (NAVER AI Lab); Hyung Jin Chang (University of Birmingham); Wonjun Hwang (Ajou University)
5083 Weight Fixing Networks Chris Subia-Waud (University of Southampton)*; Srinandan Dasmahapatra (University of Southampton)
5088 Sim-to-Real 6D Object Pose Estimation via Iterative Self-training for Robotic Bin Picking Kai Chen (The Chinese University of Hong Kong); Rui Cao (The Chinese University of Hong Kong); Stephen L James (UC Berkeley); YICHUAN LI (CUHK); Yunhui Liu (CUHK); Pieter Abbeel (UC Berkeley); Qi Dou (The Chinese University of Hong Kong)*
5092 ChunkyGAN: Real Image Inversion via Segments Adéla Šubrtová (Czech Technical University); David Futschik (Czech Technical University in Prague, FEE); Jan Čech (Czech Technical University in Prague); Michal Lukáč (Adobe Research); Eli Shechtman (Adobe Research, US); Daniel Sýkora (Czech Technical University in Prague)*
5099 Towards Sequence-Level Training for Visual Tracking Minji Kim (Seoul National University)*; Seungkwan Lee (POSTECH); Jungseul Ok (POSTECH); Bohyung Han (Seoul National University); Minsu Cho (POSTECH)
5111 Scale-aware Spatio-temporal Relation Learning for Video Anomaly Detection Guoqiu Li (Tsinghua Shenzhen International Graduate School, Tsinghua University)*; Guanxiong Cai (Shenzhen SenseTime Technology Co., Ltd); Xingyu ZENG (SenseTime Group Limited); Rui Zhao (SenseTime Group Limited)
5114 Tracking by Associating Clips Sanghyun Woo (KAIST)*; Kwanyong Park (KAIST); Seoung Wug Oh (Adobe Research); In So Kweon (KAIST); Joon-Young Lee (Adobe Research)
5117 An Information Theoretic Approach for Attention-Driven Face Forgery Detection Ke Sun (Xiamen University)*; Hong Liu (National Institute of Informatics); Taiping Yao (Tencent YouTu); Xiaoshuai Sun (Xiamen University); Shen Chen (Tencent YouTu Lab); Shouhong Ding (Tencent); Rongrong Ji (Xiamen University, China)
5118 Compound Prototype Matching for Few-shot Action Recognition Yifei Huang (The University of Tokyo)*; Lijin Yang (The University of Tokyo); Yoichi Sato (University of Tokyo)
5119 Self-Promoted Supervision for Few-Shot Transformer Bowen Dong (Harbin Institute of Technology); Pan Zhou (NUS); Shuicheng Yan (National University of Singapore, Department of Electrical and Computer Engineering); Wangmeng Zuo (Harbin Institute of Technology, China)*
5122 Completely Self-Supervised Crowd Counting via Distribution Matching deepak babu sam (Indian Institute of Science)*; Abhinav Agarwalla (Carnegie Mellon University); Jimmy Joseph (Stony Brook University); Vishwanath Sindagi (Johns Hopkins University); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science); Vishal Patel (Johns Hopkins University)
5123 Geodesic-Former: a Geodesic-Guided Few-shot 3D Point Cloud Instance Segmenter Tuan Duc Ngo (VinAI Research)*; Khoi Nguyen (VinAI Research)
5127 SeedFormer: Patch Seeds based Point Cloud Completion with Upsample Transformer Haoran Zhou (Nanjing University)*; Yun Cao (Tencent); Wenqing Chu (Tencent); Junwei Zhu (Tencent); Tong Lu (Nanjing University); Ying Tai (Tencent YouTu); Chengjie Wang (Tencent; Shanghai Jiao Tong University)
5129 3D-PL: Domain Adaptive Depth Estimation with 3D-aware Pseudo-Labeling Yu-Ting Yen (National Chiao Tung University, Phiar Technologies)*; Chia-Ni Lu (National Chiao Tung University ); Wei-Chen Chiu (National Chiao Tung University); Yi-Hsuan Tsai (Phiar Technologies)
5136 Towards Accurate Active Camera Localization Qihang Fang (Shandong University); Yingda Yin (Peking University); Qingnan Fan (Tencent AI Lab)*; Fei Xia (Google Inc); Siyan Dong (Shandong University); Sheng Wang (3vjia); Jue Wang (Tencent AI Lab); Leonidas Guibas (Stanford University); Baoquan Chen (Peking University)
5138 Few-shot Object Counting and Detection Thanh Van Nguyen (VinAI Research)*; Chau Hai Pham (VinAI Research); Khoi Nguyen (VinAI Research); Minh Hoai (Stony Brook University)
5140 RealPatch: A Statistical Matching Framework for Model Patching with Real Samples Sara Romiti (University of Sussex)*; Christopher Inskip (University of Sussex); Viktoriia Sharmanska (University of Sussex and Imperial College London); Novi Quadrianto (University of Sussex and Basque Center for Applied Mathematics)
5144 GAN Cocktail: mixing GANs without dataset access Omri Avrahami (The Hebrew University of Jerusalem)*; Dani Lischinski (The Hebrew University of Jerusalem); Ohad Fried (IDC Herzliya)
5156 Coarse-To-Fine Incremental Few-Shot Learning Xiang Xiang (Huazhong University of Science and Technology)*; Yuwen Tan (Huazhong University of Science and Technology); Qian Wan (Wuhan Research Institute of Posts and Telecommunications); Jing Ma (Huazhong University of Science and Technology); Alan Yuille (Johns Hopkins University); Gregory D. Hager (The Johns Hopkins University)
5157 Learning Unbiased Transferability for Domain Adaptation by Uncertainty Modeling Jian Hu (Queen Mary University of London)*; Haowen Zhong (Zhejiang Lab); Fei Yang (Zhejiang Lab); Shaogang Gong (Queen Mary University of London); Guile Wu (Queen Mary University of London); Junchi Yan (Shanghai Jiao Tong University)
5158 Camera Pose Auto-Encoders for Improving Pose Regression Yoli Shavit (Faculty of Engineering, Bar Ilan University); Yosi Keller (Bar Ilan University)*
5160 CoGS: Controllable Generation and Search from Sketch and Style Cusuh Ham (Georgia Institute of Technology)*; Gemma Canet Tarrés (CVSSP, University of Surrey); Tu Bui (University of Surrey); James Hays (Georgia Institute of Technology, USA); Zhe Lin (Adobe Research); John Collomosse (Adobe Research)
5172 Active Audio-Visual Separation of Dynamic Sound Sources Sagnik Majumder (University of Texas at Austin)*; Kristen Grauman (Facebook AI Research & UT Austin)
5175 AU-aware 3D Face Reconstruction through Personalized AU-specific Blendshape Learning Chenyi Kuang (Rensselaer Polytechnic Institute)*; Zijun Cui (Rensselaer Polytechnic Institute); Jeffrey Kephart (IBM Research, USA); Qiang Ji (Rensselaer Polytechnic Institute)
5180 Directed Ray Distance Functions for 3D Scene Reconstruction Nilesh Kulkarni (University of Michigan)*; Justin Johnson (University of Michigan); David Fouhey (University of Michigan)
5189 Background-Insensitive Scene Text Recognition with Text Semantic Segmentation Liang Zhao (University of South Carolina)*; Zhenyao Wu (University of South Carolina); Xinyi Wu (University of South Carolina); Greg Wilsbacher (University of South Carolina); Song Wang (University of South Carolina)
5198 Geometry-Guided Progressive NeRF for Generalizable and Efficient Neural Human Rendering Mingfei Chen (University of Washington)*; Jianfeng Zhang (NUS); Xiangyu Xu (Sea AI Lab); Lijuan Liu (SEA AI LAB); Yujun Cai (Nanyang Technological University); Jiashi Feng (ByteDance); Shuicheng Yan (Sea AI Labs)
5207 MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning David Junhao Zhang (National University of Singapore)*; Kunchang Li (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yali Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yunpeng Chen (National University of Singapore); Shashwat Chandra (National University of Singapore); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Luoqi Liu (meitu); Mike Zheng Shou (National University of Singapore)
5211 Continual Variational Autoencoder Learning via Online Cooperative Memorization Fei Ye (University of York)*; Adrian Bors (University of York)
5215 Semantic Novelty Detection via Relational Reasoning Francesco Cappio Borlino (Politecnico di Torino); Silvia Bucci (Italian Institute of Technology)*; Tatiana Tommasi (Politecnico di Torino)
5217 FindIt: Generalized Localization with Natural Language Queries Weicheng Kuo (Google)*; Fred Bertsch (Google); Wei Li (GOOGLE INC); AJ Piergiovanni (Google); Mohammad Saffar (Google); Anelia Angelova (Google)
5224 SelectionConv: Convolutional Neural Networks for Non-rectilinear Image Data David M Hart (Brigham Young University)*; Michael Whitney (Brigham Young University); Bryan S Morse (Brigham Young University)
5227 HairNet: Hairstyle Transfer with Pose Changes Peihao Zhu (KAUST)*; Rameen Abdal (KAUST); JOHN C FEMIANI (Miami University); Peter Wonka (KAUST)
5234 Learn2Augment: Learning to Composite Videos for Data Augmentation in Action Recognition Shreyank N Gowda (University of Edinburgh)*; Marcus Rohrbach (Facebook AI Research); Frank Keller (University of Edinburgh); Laura Sevilla-Lara (Facebook)
5235 Action-based Contrastive Learning for Trajectory Prediction Marah Halawa (Technische Universität Berlin)*; Olaf Hellwich (Technical University Berlin); Pia Bideau (TU Berlin)
5240 Scaling Open-vocabulary Image Segmentation with Image-level Labels Golnaz Ghiasi (Google Brain)*; Xiuye Gu (Google); Yin Cui (Google); Tsung-Yi Lin (Nvidia Research)
5247 Improving Closed and Open-Vocabulary Attribute Prediction using Transformers Khoi Pham (University of Maryland, College Park)*; Kushal Kafle (Adobe Research); Zhe Lin (Adobe Research); Zhihong Ding (Adobe Research); Scott Cohen (Adobe Research); Quan Hung Tran (Adobe Research); Abhinav Shrivastava (University of Maryland)
5251 FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context Pinaki Nath Chowdhury (University of Surrey)*; Aneeshan Sain (University of Surrey); Ayan Kumar Bhunia (University of Surrey); Tao Xiang (University of Surrey); Yulia Gryaditskaya (University of Surrey); Yi-Zhe Song (University of Surrey)
5252 A Contrastive Objective for Learning Disentangled Representations Jonathan Kahana (Hebrew University of Jerusalem)*; Yedid Hoshen (The Hebrew University of Jerusalem)
5256 Unbiased Multi-Modality Guidance for Image Inpainting Yongsheng YU (University of Chinese Academy of Sciences); Dawei Du (Kitware, Inc.)*; Libo Zhang (Institute of Software Chinese Academy of Sciences); Tiejian Luo (University of Chinese Academy of Sciences)
5257 Learned Monocular Depth Priors in Visual-Inertial Initialization Yunwen Zhou (Google)*; Abhishek Kar (Google); Eric L Turner (GOOGLE LLC); Adarsh Kowdle (Google); Chao Guo (Google Inc.); Ryan DuToit (Google); Konstantine Tsotsos (Google)
5261 DexMV: Imitation Learning for Dexterous Manipulation from Human Videos Yuzhe Qin (University of California San Diego)*; Yueh-Hua Wu (UCSD); Shaowei Liu (UIUC); Hanwen Jiang (UT Austin); Ruihan Yang (UC San Diego); Yang Fu (UCSD); Xiaolong Wang (UCSD)
5265 Exploring Fine-grained Audiovisual Categorization with the SSW60 Dataset Grant Van Horn (Cornell University)*; Rui Qian (Cornell University); Kimberly Wilber (Google); Hartwig Adam (Google); Oisin Mac Aodha (University of Edinburgh); Serge Belongie (University of Copenhagen)
5266 Radatron: Accurate Detection Using Multi-Resolution Cascaded MIMO Radar Sohrab Madani (UIUC)*; Junfeng Guan (UIUC); Waleed Ahmed (UIUC); Saurabh Gupta (UIUC); Haitham Hassanieh (UIUC)
5270 COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality Honglu Zhou (Rutgers University)*; Asim Kadav (NEC Labs); Aviv Shamsian (Bar Ilan University); Shijie Geng (Rutgers University); Farley Lai (NEC Laboratories America, Inc.); Long Zhao (Google Research); Ting Liu (Google Research); Mubbasir Kapadia (Rutgers University); Hans Peter Graf (NEC Labs)
5272 The Fish Counting Dataset: A Benchmark for Multiple Object Tracking and Counting Justin Kay (Caltech, Ai.Fish); Peter Kulits (Caltech); Suzanne C Stathatos (Caltech); Siqi Deng (Amazon); Erik Young (Trout Unlimited); Sara M Beery (Caltech); Grant Van Horn (Cornell University)*; Pietro Perona (California Institute of Technology)
5287 Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation From Monocular RGB Image Zhaoxin Fan (Renmin University of China)*; Zhenbo Song (Nanjing University of Science and Technology); Jian Xu (Nreal); Zhicheng Wang (Nreal); Kejian Wu (Nreal); Hongyan Liu (Tsinghua University); Jun He (Renmin University of China)
5293 DeepMend: Learning Occupancy Functions to Represent Shape for Repair Nikolas Lamb (Clarkson University)*; Sean Banerjee (Clarkson University); Natasha Kholgade Banerjee (Clarkson University)
5297 Graph Neural Network for Cell Tracking in Microscopy Videos Tal Ben-Haim (School of Electrical and Computer Engineering, Ben-Gurion University)*; Tammy Riklin Raviv (BGU)
5299 Anti-Neuron Watermarking: Protecting Personal Data Against Unauthorized Neural Networks Zihang Zou (University of Central Florida)*; Boqing Gong (Google); Liqiang Wang (University of Central Florida)
5310 PACS: A Dataset for Physical Audiovisual Commonsense Reasoning Samuel Yu (Carnegie Mellon University)*; Peter Wu (UC Berkeley); Paul Pu Liang (Carnegie Mellon University); Ruslan Salakhutdinov (Carnegie Mellon University); Louis-Philippe Morency (Carnegie Mellon University)
5315 Intelli-Paint: Towards Developing More Human-Intelligible Painting Agents Jaskirat Singh (Australian National University)*; Cameron Y Smith (Adobe Research); Jose Echevarria (Adobe System Inc.); Liang Zheng (Australian National University)
5317 Rethinking Few-Shot Object Detection on A Multi-Domain Benchmark Kibok Lee (Yonsei University); Hao Yang (Amazon)*; Satyaki Chakraborty (Amazon ); Zhaowei Cai (Amazon); Gurumurthy Swaminathan (Amazon); Avinash Ravichandran (Amazon); Onkar Dabeer (Amazon)
5318 LidarNAS: Unifying and Searching Neural Architectures for 3D Point Clouds Chenxi Liu (Waymo)*; Zhaoqi Leng (Waymo); Pei Sun (Waymo); Shuyang Cheng (Waymo LLC); Charles R. Qi (Waymo); Yin Zhou (Waymo); Mingxing Tan (Waymo); Dragomir Anguelov (Waymo)
5325 Improving the Intra-class Long-tail in 3D Detection via Rare Example Mining Chiyu Jiang (Waymo)*; Mahyar Najibi (Waymo LLC); Charles R. Qi (Waymo); Yin Zhou (Waymo); Dragomir Anguelov (Waymo)
5326 Learning to Learn with Smooth Regularization Yuanhao Xiong (UCLA)*; Cho-Jui Hsieh (UCLA)
5327 A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility Andrea Burns (Boston University)*; Deniz Arsan (University of Illinois at Urbana Champaign); Sanjna Agrawal (Boston University); Ranjitha Kumar (UIUC: CS); Kate Saenko (Boston University); Bryan Plummer (Boston University)
5330 CoVisPose: Co-Visibility Pose Transformer for Wide-Baseline Relative Pose Estimation in 360 Indoor Panoramas Will A Hutchcroft (Zillow Group)*; Yuguang Li (Zillow Group); Ivaylo Boyadzhiev (Zillow Group); Zhiqiang Wan (Zillow); Haiyan Wang (The City College of New York); Sing Bing Kang (Zillow Group)
5340 PT4AL: Using Self-Supervised Pretext Tasks for Active Learning John Seon Keun Yi (Georgia Institute of Technology)*; Minseok Seo (si-analytics); Jongchan Park (Lunit); Dong-Geol Choi (Hanbat National University)
5351 Uncertainty Quantification in Depth Estimation via Constrained Ordinal Regression Dongting Hu (The University of Melbourne); Liuhua Peng (The University of Melbourne); Tingjin Chu (University of Melbourne); Xiaoxing Zhang (Meituan); Yinian Mao (Meituan-Dianping Group ); Howard Bondell (University of Melbourne); Mingming Gong (University of Melbourne)*
5361 All You Need is RAW: Defending Against Adversarial Attacks with Camera Image Pipelines Yuxuan Zhang (Princeton University)*; Bo Dong (Princeton University); Felix Heide (Princeton University)
5362 ParC-Net: Position Aware Circular Convolution with Merits from ConvNets and Transformer Haokui Zhang (Lighthouse Co.Ltd)*; Wenze Hu (Lighthouse Co.Ltd); Xiaoyu Wang (The Chinese University of Hong Kong (Shenzhen))
5369 BézierPalm: A Free lunch for Palmprint Recognition KAI ZHAO (UCLA)*; Lei Shen (Tencent); Yingyi Zhang (Tencent); Chuhan Zhou (Tencent & VIA University College); Tao Wang (Tencent YouTu Lab); Ruixin Zhang (Tencent); Shouhong Ding (Tencent); Wei Jia (Hefei University of Technology); Wei Shen (Shanghai Jiao Tong University)
5372 A Repulsive Force Unit for Garment Collision Handling in Neural Networks Qingyang Tan (UMD)*; Yi Zhou (Adobe Research); Tuanfeng Wang (adobe research); Duygu Ceylan (Adobe Research); Xin Sun (Adobe Research); Dinesh Manocha (University of Maryland at College Park)
5373 CYBORGS: Contrastively Bootstrapping Object Representations by Grounding in Segmentation Renhao Wang (Tsinghua University)*; Hang Zhao (Tsinghua University); Yang Gao (Tsinghua University)
5377 Connecting Compression Spaces with Transformer for Approximate Nearest Neighbor Search Haokui Zhang (Lighthouse Co.Ltd)*; Buzhou Tang (Harbin Institute of Technology, China); Wenze Hu (Lighthouse Co.Ltd); Xiaoyu Wang (The Chinese University of Hong Kong (Shenzhen))
5381 Training Vision Transformers with Only 2040 Images Yunhao Cao (Nanjing University); Hao Yu (Nanjing University); Jianxin Wu (Nanjing University)*
5384 Black-box Few-shot Knowledge Distillation Dang Nguyen (Deakin University)*; Sunil Gupta (Deakin University, Australia); Kien Duc Do (Deakin University); Svetha Venkatesh (Deakin University)
5388 AutoAvatar: Autoregressive Neural Fields for Dynamic Avatar Modeling Ziqian Bai (Simon Fraser University)*; Timur Bagautdinov (Facebook); Javier Romero (Facebook); Michael Zollhöfer (Facebook Reality Labs); Ping Tan (Simon Fraser University); Shunsuke Saito (Facebook)
5392 Ghost-free High Dynamic Range Imaging with Context-aware Transformer Zhen Liu (Sichuan University; Megvii ); Yinglong Wang (Huawei Noah’s Ark Lab); Bing Zeng (University of Electronic Science and Technology of China); Shuaicheng Liu (UESTC; Megvii)*
5393 Cross-Domain Cross-Set Few-Shot Learning via Learning Compact and Aligned Representations Wentao Chen (University of Science and Technology of China)*; Zhang Zhang (Institute of Automation, Chinese Academy of Sciences); Wei Wang (Institute of Automation Chinese Academy of Sciences); Liang Wang (NLPR, China); Zilei Wang (University of Science and Technology of China); Tieniu Tan (NLPR, China)
5396 Motion Transformer for Unsupervised Image Animation Jiale Tao (University of Electronic Science and Technology of China)*; Biao Wang (Alibaba Group); Tiezheng Ge (Alibaba Group); Yuning Jiang (Alibaba Group); Wen Li (University of Electronic Science and Technology of China); Lixin Duan (University of Electronic Science and Technology of China)
5404 LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection Yi Wei (Tsinghua University)*; Zibu Wei (Tsinghua University); Yongming Rao (Tsinghua University); Jiaxin Li (Gaussian Robotics); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)
5405 PSS: Progressive Sample Selection for Open-World Visual Representation Learning Tianyue Cao (Shanghai Jiao Tong University); Yongxin Wang (Amazon)*; Yifan Xing (AMAZON CORPORATE LLC); Tianjun Xiao (Amazon); Tong He (Amazon); Zheng Zhang (AWS); Hao Zhou (Amazon); Joseph Tighe (Amazon)
5408 Self-slimmed Vision Transformer Zhuofan Zong (Beihang University)*; Kunchang Li (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Guanglu Song (Sensetime); Yali Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Biao Leng (Beihang University); Yu Liu (SenseTime Group LTD)
5410 Switchable Online Knowledge Distillation Biao Qian (Hefei University of Technology); Yang Wang (Hefei University of Technology)*; Hongzhi Yin (The University of Queensland); Richang Hong (Hefei University of Technology); Meng Wang (Hefei University of Technology)
5418 Adaptive Transformers for Robust Few-shot Cross-domain Face Anti-spoofing Hsin-Ping Huang (University of California, Merced)*; Deqing Sun (Google); Yaojie Liu (Google); Wen-Sheng Chu (Google); Taihong Xiao (University of California at Merced); Jinwei Yuan (Google); Hartwig Adam (Google); Ming-Hsuan Yang (University of California at Merced)
5419 GraphFit: Learning Multi-scale Graph-Convolutional Representation for Point Cloud Normal Estimation Keqiang Li (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences)*; Mingyang Zhao (University of Chinese Academy of Sciences & Beijing Academy of Artificial Intelligence); Huaiyu Wu (Institute of Automation, Chinese Academy of Sciences); Dong-Ming Yan (NLPR, CASIA); Zhen Shen (Institute of Automation, Chinese Academy of Sciences/Qingdao Academy of Intelligent Industries); Fei-Yue Wang (Institute of Automation, Chinese Academy of Sciences); Gang Xiong (CASIA)
5424 Are Vision Transformers Robust to Patch-wise Perturbations? Jindong Gu (University of Munich)*; Volker Tresp (Siemens AG and Ludwig Maximilian University of Munich ); Yao Qin (Google)
5428 DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning Zifeng Wang (Northeastern University)*; Zizhao Zhang (Google); Sayna Ebrahimi (Google); Ruoxi Sun (Google); Han Zhang (Google); Chen-Yu Lee (Google); Xiaoqi Ren (Google); Guolong Su (Google); Vincent Perot (Google AI); Jennifer Dy (Northeastern); Tomas Pfister (Google)
5430 EleGANt: Exquisite and Locally Editable GAN for Makeup Transfer Chenyu Yang (Tsinghua University)*; Wanrong He (Tsinghua University); Yingqing Xu (Tsinghua University); Yang Gao (Tsinghua University)
5436 Union-set Multi-source Model Adaptation for Semantic Segmentation Zongyao Li (Hokkaido University)*; Ren Togo (Hokkaido University); Takahiro Ogawa (Hokkaido University); Miki Haseyama (Hokkaido University)
5441 Bridging Images and Videos: A Simple Learning Framework for Large Vocabulary Video Object Detection Sanghyun Woo (KAIST)*; Kwanyong Park (KAIST); Seoung Wug Oh (Adobe Research); In So Kweon (KAIST); Joon-Young Lee (Adobe Research)
5443 TDAM: Top-Down Attention Module for Contextually Guided Feature Selection in CNNs Shantanu Jaiswal (Agency for Science, Technology and Research )*; Basura Fernando (Agency for Science, Technology and Research, A*STAR, Singapore); Cheston Tan (Institute for Infocomm Research, Singapore)
5451 Exploring Disentangled Content Information for Face Forgery Detection Jiahao Liang (Beijing University of Posts and Telecommunications)*; Huafeng Shi (SenseTime Group Limited); Weihong Deng (Beijing University of Posts and Telecommunications)
5458 Object Discovery via Contrastive Learning for Weakly Supervised Object Detection Jinhwan Seo (Pohang University of Science and Technology)*; Wonho Bae (University of British Columbia); Danica J. Sutherland (University of British Columbia); Junhyug Noh (Lawrence Livermore National Laboratory); Daijin Kim (Pohang University of Science and Technology)
5460 Unifying Vision Unsupervised Contrastive Learning from a Graph Perspective Shixiang Tang (The University of Sydney)*; Feng Zhu (University of Science and Technology of China); Lei Bai (Shanghai AI Laboratory); Rui Zhao (SenseTime Group Limited); Chenyu Wang (University of Sydney, Sydney Neuroimaging Analysis Centre); Wanli Ouyang (The University of Sydney)
5463 E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context Zizhang Li (Zhejiang University)*; Mengmeng Wang (Zhejiang University); Huaijin Pi (Zhejiang University); Kechun Xu (Zhejiang University); Jianbiao Mei (Zhejiang University); Yong Liu (Zhejiang University)
5478 $\ell_\infty$-Robustness and Beyond: Unleashing Efficient Adversarial Training Hadi Mohaghegh Dolatabadi (University of Melbourne)*; Sarah Erfani (University of Melbourne); Christopher Leckie (University of Melbourne)
5481 Spatial-Separated Curve Rendering Network for Efficient and High-Resolution Image Harmonization Jingtang Liang (University of Macau)*; Xiaodong Cun (Tencent AI Lab); Chi-Man Pun (University of Macau); Jue Wang (Tencent AI Lab)
5484 Point MixSwap: Attentional Point Cloud Mixing via Swapping Matched Structural Divisions Ardian Umam (NYCU)*; Cheng-Kun Yang (National Taiwan University); Yung-Yu Chuang (National Taiwan University); Jen-Hui Chuang (National Chiao Tung University ); Yen-Yu Lin (National Yang Ming Chiao Tung University)
5491 One Size Does NOT Fit All: Data-Adaptive Adversarial Training Shuo Yang (University of Sydney)*; Chang Xu (University of Sydney)
5494 IS-MVSNet: Importance Sampling-based MVSNet Likang Wang (HKUST)*; Yue Gong (Huawei Technologies Co., Ltd.); Xinjun Ma (Huawei); Qirui Wang (Huawei Technologies Co., Ltd.); Kaixuan Zhou (Huawei ); Lei Chen (Hong Kong University of Science and Technology)
5496 Multi-Granularity Pruning for Model Acceleration on Mobile Devices Tianli Zhao (Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences); Xi Sheryl Zhang (Institute of Automation, Chinese Academy of Sciences); Wentao Zhu (Amazon); Jiaxing Wang (Institute of Automation, Chinese Academy of Sciences); Sen Yang (Kuaishou); Ji Liu (Kwai Inc.); Jian Cheng (Chinese Academy of Sciences, China)*
5500 Style-Agnostic Reinforcement Learning Juyong Lee (POSTECH); Seokjun Ahn (POSTECH); Jaesik Park (POSTECH)*
5504 Editing Out-of-domain GAN Inversion via Differential Activations Haorui Song (South China University of Technology); Yong Du (Ocean University of China); Tianyi Xiang (South China University of Technology); Junyu Dong (Ocean University of China); Jing Qin (The Hong Kong Polytechnic University); Shengfeng He (South China University of Technology)*
5508 Bagging Regional Classification Activation Maps for Weakly Supervised Object Localization Lei Zhu (Beijing University of Posts and Telecommunications); Qian Chen (University of Science and Technology of China); Lujia Jin (Peking University); yunfei you (Peking University); Yanye Lu (Peking University)*
5518 Mutually Reinforcing Structure with Proposal Contrastive Consistency for Few-Shot Object Detection TianXue Ma (East China Normal University)*; Mingwei Bi (Tencent); Jian Zhang (Tencent Youtu); Wang Yuan (East China Normal University); Zhizhong Zhang (East China Normal University); Yuan Xie (East China Normal University); Shouhong Ding (Tencent); Lizhuang Ma (Shanghai Jiao Tong University)
5523 Panoptic-PartFormer: Learning a Unified model for Panoptic Part Segmentation Xiangtai Li (Peking University)*; Shilin Xu (Peking University); Yibo Yang (Peking University); Guangliang Cheng (Sensetime Group Limited); Yunhai Tong (Peking University); Dacheng Tao (JD.com)
5536 TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers Oren Nuriel (Amazon)*; Ron Litman (Amazon); Sharon Fogel (Amazon)
5537 Speaker-adaptive Lip Reading with User-dependent Padding Minsu Kim (KAIST)*; Hyunjun Kim (KAIST); Yong Man Ro (KAIST)
5541 Online Domain Adaptation for Semantic Segmentation in Ever-Changing Conditions Theodoros Panagiotakopoulos (KTH Royal Institute of Technology in Stockholm); Pier Luigi Dovesi (Univrses); Linus Härenstam-Nielsen (Artisense); Matteo Poggi (University of Bologna)*
5542 Point Scene Understanding via Disentangled Instance Mesh Reconstruction Jiaxiang Tang (Peking University)*; Xiaokang Chen (Peking University); Jingbo Wang (The Chinese University of Hong Kong); Gang Zeng (Peking University)
5543 Dual Contrastive Learning with Anatomical Auxiliary Supervision for Few-shot Medical Image Segmentation Huisi Wu (Shenzhen University)*; Fangyan Xiao (Shenzhen University); Chongxin Liang (Shenzhen University)
5544 An Efficient Person Clustering Algorithm for Open Checkout-free Groceries Junde Morsen Wu (Purdue University); Yu Zhang (Harbin Institute of Technology); RAO FU (None); Yuanpei Liu (Beijing Institute of Technology); Jing Gao (Purdue University)*
5548 Face2Face^ρ: Real-Time High-Resolution One-Shot Face Reenactment Kewei Yang (NetEase Games AI Lab)*; Kang Chen (NetEase Games AI Lab); Daoliang Guo (NetEase Games AI Lab); Song-Hai Zhang (Tsinghua University); Yuan-Chen Guo (Tsinghua University); Weidong Zhang (Netease Games AI Lab)
5549 Decoupled Contrastive Learning Chun-Hsiao Yeh (Academia Sinica / UC Berkeley)*; Cheng-Yao Hong (Academia Sinica); Yen-Chi Hsu (Academia Sinica); Tyng-Luh Liu (Academia Sinica); Yubei Chen (Berkeley AI Research, UC Berkeley); yann lecun (Facebook)
5555 Learning Algebraic Representation for Systematic Generalization in Abstract Reasoning Chi Zhang (University of California, Los Angeles)*; Sirui Xie (UCLA); Baoxiong Jia (UCLA); Ying Nian Wu (University of California, Los Angeles); Song-Chun Zhu (UCLA); Yixin Zhu (Peking University)
5556 On the Robustness of Quality Measures for GANs Motasem Alfarra (KAUST)*; Juan C Perez (KAUST); Anna Fruehstueck (KAUST); Philip Torr (University of Oxford); Peter Wonka (KAUST); Bernard Ghanem (KAUST)
5557 Automatic Check-Out via Prototype-based Classifier Learning from Single-Product Exemplars Hao Chen (Nanjing University of Science and Technology)*; Xiu-Shen Wei (Nanjing University of Science and Technology); Faen Zhang (AInnovation Co. Ltd.); Yang Shen (Nanjing University of Science and Technology); Hui Xu (QINGDAO AINNOVATION TECHNOLOGY GROUP CO., LTD); liang xiao (nanjing university of science and technology)
5559 TDViT: Temporal Dilated Transformer for Dense Video Tasks Guanxiong Sun (Queen’s University Belfast); Yang Hua (Queen’s University Belfast)*; Guosheng Hu (Oosto); Neil Robertson (Queen’s University Belfast)
5561 POP: Mining POtential Performance of new fashion products via webly cross-modal query expansion Christian Joppi (Humatics srl)*; Geri Skenderi (University of Verona); Marco Cristani (University of Verona)
5564 BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis Davide Moltisanti (University of Edinburgh)*; Jinyi Wu (S-Lab Nanyang Technological University); Bo Dai (Shanghai AI Lab); Chen Change Loy (Nanyang Technological University)
5578 Towards Racially Unbiased Skin Tone Estimation via Scene Disambiguation Haiwen Feng (Max Planck Institute for Intelligent Systems); Timo Bolkart (Max Planck Institute for Intelligent Systems); Joachim Tesch (Max Planck Institute for Intelligent Systems); Michael J. Black (Max Planck Institute for Intelligent Systems); Victoria Fernandez Abrevaya (Max Planck Institute)*
5580 Style-Guided Shadow Removal Jin Wan (Beijing Jiaotong University); Hui Yin (Beijing Jiaotong University)*; Zhenyao Wu (University of South Carolina); Xinyi Wu (University of South Carolina); Yanting Liu (Yanting Liu); Song Wang (University of South Carolina)
5584 Sound-guided Semantic Video Generation Seung Hyun Lee (Korea University)*; Gyeongrok Oh (Korea University); Wonmin Byeon (NVIDIA Research); Jihyun Bae (Korea University); Chanyoung Kim (Korea University); Won Jeong Ryoo (Korea University); Sang Ho Yoon (KAIST); Hyunjun Cho (Korea University); Jinkyu Kim (Korea University); Sangpil Kim (Korea University)
5585 Robust Visual Tracking by Segmentation Matthieu Paul (ETH Zurich)*; Martin Danelljan (ETH Zurich); Christoph Mayer (ETH Zurich); Luc Van Gool (ETH Zurich)
5591 Semi-Supervised Learning of Optical Flow by Flow Supervisor Woobin Im (KAIST); Sebin Lee (KAIST); Sungeui Yoon (KAIST)*
5595 Joint Learning of Localized Representations from Medical Images and Reports Philip Müller (Technical University of Munich)*; Georgios Kaissis (Technische Universität München); congyu zou (Klinikum Rechts der Isar Technische Universität München ); Daniel Rueckert (Technische Universität München)
5599 D2C-SR: A Divergence to Convergence Approach for Real-World Image Super-Resolution Youwei Li (Megvii); Haibin Huang (Kuaishou Technology); lanpeng jia (GWM); Haoqiang Fan (Megvii Inc(face++)); Shuaicheng Liu (UESTC; Megvii)*
5612 Continual 3D Convolutional Neural Networks for Real-time Processing of Videos Lukas Hedegaard (Aarhus University)*; Alexandros Iosifidis (Aarhus University)
5613 Salient Object Detection for Point Clouds Songlin Fan (Peking University ); Wei Gao (SECE, Shenzhen Graduate School, Peking University)*; Ge Li (Peking University)
5616 Deep ensemble learning by diverse knowledge distillation for fine-grained object classification Naoki Okamoto (Chubu University)*; Tsubasa Hirakawa (Chubu University); Takayoshi Yamashita (Chubu University); Hironobu Fujiyoshi (Chubu University)
5619 Source-free Video Domain Adaptation by Learning Temporal Consistency for Action Recognition Yuecong Xu (Institute for Infocomm Research, A*STAR, Singapore)*; Jianfei Yang (Nanyang Technological University); Haozhi Cao (Nanyang Technological University); Keyu Wu (Institute for Infocomm Research, A*STAR, Singapore); Min Wu (Institute for Infocomm Research, A*STAR, Singapore); Zhenghua Chen (Institute for Infocomm Research, A*STAR, Singapore)
5643 GRIT-VLP: Grouped Mini-batch Sampling for Efficient Vision and Language Pre-training Jaeseok Byun (Seoul National University); Taebaek Hwang (M.IN.D Lab); Jianlong Fu (Microsoft Research); Taesup Moon (Seoul National University)*
5644 Pose Forecasting in Industrial Human-Robot Collaboration Alessio Sampieri (Sapienza University)*; Guido Maria D’Amely di Melendugno (Sapienza University); ANDREA AVOGARO (University of Verona); Federico Cunico (University of Verona); Francesco Setti (University of Verona); Geri Skenderi (University of Verona); Marco Cristani (University of Verona); Fabio Galasso (Sapienza University)
5648 MeshLoc: Mesh-Based Visual Localization Vojtech Panek (CTU in Prague, FEE, CIIRC)*; Zuzana Kukelova (Czech Technical University in Prague); Torsten Sattler (Czech Technical University in Prague)
5660 Dress Code: High-Resolution Multi-Category Virtual Try-On Davide Morelli (UNIMORE); Matteo Fincato (Università degli Studi di Modena e Reggio Emilia); Marcella Cornia (University of Modena and Reggio Emilia)*; Federico Landi (University of Modena and Reggio Emilia); Fabio Cesari (YOOX Net-A-Porter Group S.p.A.); Rita Cucchiara (Università di Modena e Reggio Emilia)
5661 UC-OWOD: Unknown-Classified Open World Object Detection Zhiheng Wu (Institute of Automation, Chinese Academy of Sciences (CASIA))*; Yue Lu (Institute of Automation, Chinese Academy of Sciences(CASIA)); Xingyu Chen (Xiaobing.AI); Zhengxing Wu (CASIA); Liwen Kang (Institute of Automation, Chinese Academy of Sciences (CASIA)); Junzhi Yu (CASIA)
5666 Helpful or Harmful: Inter-Task Association in Continual Learning Hyundong Jin (Chung-Ang University ); Eunwoo Kim (Chung-Ang University)*
5669 RayTran: 3D pose estimation and shape reconstruction of multiple objects from videos with ray-traced transformers Michał J Tyszkiewicz (EPFL); Kevis-Kokitsi Maninis (Google Research)*; Stefan Popov (Google Research); Vittorio Ferrari (Google Research)
5673 Efficient Point Cloud Segmentation with Geometry-aware Sparse Networks Maosheng Ye (HKUST)*; Rui Wan (Deeproute.ai); Shuangjie Xu (HKUST); Tongyi Cao (Deeproute.ai); Qifeng Chen (HKUST)
5677 Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition Tianjiao Li (Singapore University of Technology and Design)*; Lin Geng Foo (Singapore University of Technology and Design); Qiuhong Ke (Monash University); Hossein Rahmani (Lancaster University); Anran Wang (Bytedance); Jinghua Wang (Harbin Institute of Technology); Jun Liu (Singapore University of Technology and Design)
5685 TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation Tan Minh Dinh (VinAI Research)*; Rang NGUYEN (VinAI Research); Binh-Son Hua (VinAI Research)
5688 CostDCNet: Cost Volume based Depth Completion for a Single RGB-D Image Jaewon Kam (POSTECH); Jungeon Kim (POSTECH); Soongjin Kim (POSTECH); Jaesik Park (POSTECH); Seungyong Lee (POSTECH)*
5697 Efficient Video Deblurring Guided by Motion Magnitude Yusheng Wang (The University of Tokyo)*; Yunfan Lu (Hong Kong University of Science and Technology); Ye Gao (Honor Technologies Japan); Lin Wang (HKUST); Zhihang Zhong (The University of Tokyo); Yinqiang Zheng (The University of Tokyo); Atsushi Yamashita (The University of Tokyo)
5702 Space-Partitioning RANSAC Daniel Barath (ETH Zürich)*; Gábor Valasek (ELTE)
5704 Towards Accurate Binary Neural Networks via Modeling Contextual Dependencies Xingrun Xing (Beihang University); Yangguang Li (SenseTime Group Limited); Wei Li (Nanyang Technological University); Wenrui Ding (Beihang University); Yalong Jiang (Beihang University)*; Yufeng Wang (Beihang University); Jing Shao (Sensetime); Chunlei Liu (Beihang University); Xianglong Liu (BUAA)
5712 Overcoming Shortcut Learning in a Target Domain by Generalizing Basic Visual Factors from a Source Domain Piyapat Saranrittichai (Bosch Center for Artificial Intelligence)*; Chaithanya Kumar Mummadi (Bosch Center for Artificial Intelligence); Claudia Blaiotta (Bosch Center for Artificial Intelligence); Mauricio Munoz (Bosch Center for Artificial Intelligence); Volker Fischer (Bosch Center for Artificial Intelligence)
5721 SimpleRecon: 3D Reconstruction Without 3D Convolutions Mohamed Sayed (University College London)*; John Gibson (Niantic, Inc.); Jamie Watson (Niantic); Victor A Prisacariu (Niantic Labs); Michael Firman (Niantic); Clement LJC Godard (Niantic)
5739 SemAug: Semantically Meaningful Image Augmentations for Object Detection Through Language Grounding Morgan L Heisler (Huawei Technologies Canada Co., Ltd.)*; Amin Banitalebi-Dehkordi (Huawei Technologies Canada Co., Ltd.); Yong Zhang (Huawei Technologies Canada Co., Ltd.)
5740 A data-centric approach for improving ambiguous labels with combined semi-supervised classification and clustering Lars Schmarje (Kiel University)*; Monty Santarossa (Kiel University); Simon-Martin Schröder (Kiel University); Claudius Zelenka (Kiel University); Rainer Kiko (Laboratoire d’Océanographie de Villefranche-sur-Mer); Jenny Stracke (University of Bonn); Nina Volkmann (University of Veterinary Medicine Hannover); Reinhard Koch (Kiel University)
5750 SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks Anish J Prabhu (Apple)*; Chien-Yu Lin (University of Washington); Thomas Merth (Apple); Sachin Mehta (University of Washington); Anurag Ranjan (Apple); Maxwell C Horton (Apple, Xnor.Ai and University of Washington); Mohammad Rastegari (University of Washington)
5754 SAGA: Stochastic Whole-Body Grasping With Contact Yan Wu (ETH Zurich); Jiahao Wang (Max Planck Institute for Informatics); Yan Zhang (ETH Zurich); Siwei Zhang (ETH Zurich); Otmar Hilliges (ETH Zurich); Fisher Yu (ETH Zurich); Siyu Tang (ETH Zurich)*
5761 GTCaR: Graph Transformer for Camera Re-localization Xinyi Li (Magic Leap)*; Haibin Ling (Stony Brook University)
5764 Actor-centered Representations for Action Localization in Streaming Videos Sathyanarayanan N Aakur (OK State)*; Sudeep Sarkar (University of South Florida, Tampa)
5769 Photo-realistic Neural Domain Randomization Sergey Zakharov (Toyota Research Institute)*; Rareș A Ambruș (Toyota Research Institute); Vitor Guizilini (Toyota Research Institute); Wadim Kehl (Woven Planet); Adrien Gaidon (Toyota Research Institute)
5770 ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and Pose Optimization Muhammad Zubair Irshad (Georgia Institute of Technology)*; Sergey Zakharov (Toyota Research Institute); Rareș A Ambruș (Toyota Research Institute); Thomas Kollar (Toyota Research Institute); Zsolt Kira (Georgia Institute of Technology); Adrien Gaidon (Toyota Research Institute)
5771 Structure and Motion for Casual Videos Zhoutong Zhang (MIT)*; Forrester Cole (Google Research); Zhengqi Li (Google Inc.); Noah Snavely (Google); Michael Rubinstein (Google); William T Freeman (Google)
5775 Single Frame Atmospheric Turbulence Mitigation: A Benchmark Study and A New Physics-Inspired Transformer Model Zhiyuan Mao (Purdue University)*; AJAY KUMAR JAISWAL (UT Austin); Zhangyang Wang (University of Texas at Austin); Stanley Chan (Purdue University, USA)
5778 Incremental Task Learning with Incremental Rank Updates Rakib Hyder (University of California, Riverside)*; Ken Shao (UCR); Boyu Hou (The University of California, Riverside); Panagiotis Markopoulos (RIT); Ashley Prater-Bennette (Air Force Research Laboratory); M. Salman Asif (University of California, Riverside)
5787 Bandwidth-Aware Adaptive Codec for DNN Inference Offloading in IoT Xiufeng Xie (Kwai Inc.)*; Ning Zhou (Amazon); Wentao Zhu (Amazon); Ji Liu (Kwai Inc.)
5789 Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-Curation Connelly Barnes (Adobe)*; Lingzhi Zhang (University of Pennsylvania); Jianbo Shi (University of Pennsylvania); Zhe Lin (Adobe Research); Eli Shechtman (Adobe Research, US); Sohrab Amirghodsi (Adobe Research); Kevin Wampler (Adobe Systems Inc.)
5794 Controllable Video Generation through Global and Local Motion Dynamics Aram Davtyan (University of Bern)*; Paolo Favaro (University of Bern)
5812 UniCR: Universally Approximated Certified Robustness via Randomized Smoothing Hanbin Hong (University of Connecticut)*; Binghui Wang (Illinois Institute of Technology); Yuan Hong (University of Connecticut)
5829 3D Siamese Transformer Network for Single Object Tracking on Point Clouds Le Hui (Nanjing University of Science and Technology)*; Lingpeng Wang (Nanjing University of Science and Technology); Linghua Tang (Nanjing University of Science and Technology); Kaihao Lan (Nanjing University of Science and Technology); Jin Xie (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology)
5837 Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips Jiawang Bai (Tsinghua University)*; Kuofeng Gao (Tsinghua University); Dihong Gong (Tencent AI Lab); Shu-Tao Xia (Tsinghua University); Zhifeng Li (Tencent AI Lab); Wei Liu (Tencent)
5856 StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN Fei Yin (Tsinghua University)*; Yong Zhang (Tencent AI Lab); Xiaodong Cun (Tencent AI Lab); Mingdeng Cao (Tsinghua University); Yanbo Fan (Tencent AI Lab); Xuan Wang (Tencent AI Lab); Qingyan Bai (Tsinghua University); Baoyuan Wu (The Chinese University of Hong Kong, Shenzhen); Jue Wang (Tencent AI Lab); Yujiu Yang (Tsinghua University)
5859 Referring Object Manipulation of Natural Images with Conditional Classifier-Free Guidance Myungsub Choi (Google)*
5880 Self-Supervised Interactive Object Segmentation Through a Singulation-and-Grasping Approach Houjian Yu (University of Minnesota, Twin Cities)*; Changhyun Choi (University of Minnesota, Twin Cities)
5898 BigColor: Colorization using a Generative Color Prior for Natural Images Geonung Kim (POSTECH); Kyoungkook Kang (POSTECH); Seongtae Kim (POSTECH); Hwayoon Lee (POSTECH); Sehoon Kim (Samsung electronics co. ltd.); Jonghyun Kim (Samsung Electronics); Seung-Hwan Baek (POSTECH); Sunghyun Cho (POSTECH)*
5901 Object Wake-up: 3D Object Rigging from a Single Image Ji Yang (University of Alberta)*; Xinxin Zuo (University of Alberta); Sen Wang (University of Alberta); Zhenbo Yu (Shanghai Jiao Tong University); Xingyu Li (University of Alberta); Bingbing Ni (Shanghai Jiao Tong University); Minglun Gong (University of Guelph); Li Cheng (ECE dept., University of Alberta)
5905 ClearPose: Large-scale Transparent Object Dataset and Benchmark Xiaotong Chen (University of Michigan, Ann Arbor)*; Huijie Zhang (University of Michigan, Ann Arbor); Zeren Yu (University of Michigan–Ann Arbor); Anthony Opipari (University of Michigan); Odest Chadwicke Jenkins (University of Michigan)
5907 Domain Knowledge-Informed Self-Supervised Representations for Workout Form Assessment Paritosh Parmar (University of British Columbia)*; Amol Gharat (Flex A.I.); Helge Rhodin (UBC)
5908 Neural Capture of Animatable 3D Human from Monocular Video Gusi Te (Peking University); Xiu Li (Tencent); Xiao Li (Microsoft Research Asia)*; Jinglu Wang (Microsoft Research Asia); Wei Hu (Peking University); Yan Lu (Microsoft Research Asia)
5913 Open Vocabulary Object Detection with Pseudo Bounding-Box Labels Mingfei Gao (Apple)*; Chen Xing (Salesforce Research); Juan Carlos Niebles (Salesforce & Stanford University); Junnan Li (Salesforce); Ran Xu (Salesforce Research); Wenhao Liu (Salesforce Metamind); Caiming Xiong (Salesforce Research)
5914 BoundaryFace: A mining framework with noise label self-correction for Face Recognition Shijie Wu (Southwest Jiaotong University)*; Xun Gong (Southwest Jiaotong University)
5915 IntegratedPIFu: Integrated Pixel Aligned Implicit Function for Single-view Human Reconstruction Kennard Chan Yanting (Nanyang Technological University)*; Guosheng Lin (Nanyang Technological University); Haiyu Zhao (SenseTime International Pte Ltd); Weisi Lin (Nanyang Technological University, Singapore)
5922 BMD: A General Class-balanced Multicentric Dynamic Prototype Strategy for Source-free Domain Adaptation Sanqing Qu (Tongji University); Guang Chen (Tongji University)*; Jing Zhang (The University of Sydney); Zhijun Li (University of Science and Technology of China); Wei He (University of Science and Technology Beijing); Dacheng Tao (JD.com)
5923 What Matters for 3D Scene Flow Network Guangming Wang (Shanghai Jiao Tong University); Yunzhe Hu (Shanghai Jiao Tong University); Zhe Liu (University of Cambridge); Yiyang Zhou (UC Berkeley); Masayoshi TOMIZUKA (MSC Lab); Wei Zhan (University of California, Berkeley); Hesheng Wang (SJTU)*
5932 Controllable Shadow Generation Using Pixel Height Maps Yichen Sheng (Purdue University)*; Yifan Liu (University of Adelaide); Jianming Zhang (Adobe Research); Wei Yin (University of Adelaide); A. Cengiz Oztireli (University of Cambridge, Google); He Zhang (Adobe); Zhe Lin (Adobe Research); Eli Shechtman (Adobe Research, US); Bedrich Benes (Purdue University)
5937 CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution Cheeun Hong (Seoul National University); Sungyong Baik (Hanyang University); Heewon Kim (Seoul National University); Seungjun Nah (NVIDIA); Kyoung Mu Lee (Seoul National University)*
5940 SPSN: Superpixel Prototype Sampling Network for RGB-D Salient Object Detection Minhyeok Lee (Yonsei University)*; Chaewon Park (Yonsei University); Suhwan Cho (Yonsei University); Sangyoun Lee (Yonsei University)
5950 Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer Songwei Ge (University of Maryland)*; Thomas F Hayes (Meta); Harry Yang (Facebook); Xi Yin (Facebook); Guan Pang (Facebook); David Jacobs (University of Maryland, USA); Jia-Bin Huang (Facebook); Devi Parikh (Georgia Tech & Facebook AI Research)
5951 Combining Internal and External Constraints for Unrolling Shutter in Videos Eyal Naor (Weizmann Institute)*; Itai Antebi (Weizmann); Shai Bagon (Weizmann Institute of Science); Michal Irani (Weizmann Institute, Israel)
5961 Global Spectral Filter Memory Network for Video Object Segmentation Yong Liu (Tsinghua University)*; Ran Yu (Tsinghua University); Jiahao Wang (Tsinghua University); Xinyuan Zhao (Huawei); Yitong Wang (Bytedance); Yansong Tang (Tsinghua University); Yujiu Yang (Tsinghua University)
5964 SEMICON: A Learning-to-hash Solution for Large-scale Fine-grained Image Retrieval Yang Shen (Nanjing University of Science and Technology); Xu Hao XH SUN (Nanjing University of Science and Technology); Xiu-Shen Wei (Nanjing University of Science and Technology)*; Qing-Yuan Jiang (HuaWei); Jian Yang (Nanjing University of Science and Technology)
5966 Batch-efficient EigenDecomposition for Small and Medium Matrices Yue Song (University of Trento)*; Nicu Sebe (University of Trento); Wei Wang (EPFL)
5972 General Object Pose Transformation Network from Unpaired Data Yukun Su (South China University of Technology)*; Guosheng Lin (Nanyang Technological University); RuiZhou Sun (South China University of Technology); Qingyao Wu (South China University of Technology)
5974 Robust Network Architecture Search via Feature Distortion Restraining Yaguan QIAN (Zhejiang University of Science and Technology)*; Shenghui Huang (Zhejiang University of Science and Technology); Bin WANG (Network and Information Security Laboratory of Hangzhou Hikvision Digital Technology Co.); Xiang Ling (Institute of Software, Chinese Academy of Sciences); Xiaohui Guan (Zhejiang University of Water Resources and Electric Power); Zhaoquan Gu (Guangzhou University); Shaoning Zeng (Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China); Wujie Zhou (Zhejiang University of Science and Technology); Haijiang Wang (Zhejiang University of Science and Technology)
5988 Correspondence Reweighted Translation Averaging Lalit Manam (Indian Institute of Science Bengaluru)*; Venu Madhav Govindu (Indian Institute of Science)
5993 RepMix: Representation Mixing for Robust Attribution of Synthesized Images Tu Bui (University of Surrey)*; Ning Yu (Salesforce Research); John Collomosse (Adobe Research)
6000 When Deep Classifiers Agree: Analyzing Correlations between Learning Order and Image Statistics Iuliia Pliushch (Goethe University)*; Martin Mundt (TU Darmstadt); Nicolas Lupp (Goethe University Frankfurt); Visvanathan Ramesh (Goethe University)
6002 S2F2: Single-Stage Flow Forecasting for Future Multiple Trajectories Prediction YU-WEN CHEN (National Tsing Hua University); Hsuan-Kung Yang (National Tsing Hua University); Chu-Chi Chiu (National Tsing Hua University); Chun-Yi Lee (National Tsing Hua University)*
6004 Few-Shot Object Detection by Knowledge Distillation Using Bag-of-Visual-Words Representations Wenjie Pei (Harbin Institute of Technology, Shenzhen); Shuang Wu (Harbin Institute of Technology, Shenzhen); Dianwen Mei (Harbin Institute of Technology, Shenzhen); Fanglin Chen (Harbin Institute of Technology, Shenzhen); Jiandong Tian (CAS); Guangming Lu (Harbin Institute of Technology, Shenzhen)*
6009 Stochastic Consensus: Enhancing Semi-Supervised Learning with Consistency of Stochastic Classifiers Hui Tang (South China University of Technology)*; Kui Jia (South China University of Technology); Lin Sun (Magic Leap)
6011 Learning Where To Look – Generative NAS is Surprisingly Efficient Jovita Lukasik (University of Mannheim)*; Steffen Jung (MPII); Margret Keuper (University of Mannheim)
6023 Realistic One-shot Mesh-based Head Avatars Taras Khakhulin (Skolkovo Institute of Science and Technology)*; Vanessa Valerievna Skliarova (Skoltech); Victor Lempitsky (Yandex); Egor Zakharov (Skolkovo Institute of Science and Technology)
6024 Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning Seunghyun Lee (Inha University); Byung Cheol Song (Inha University)*
6037 SALISA: Saliency-based Input Sampling for Efficient Video Object Detection Babak Ehteshami Bejnordi (Qualcomm AI Research)*; Amir Ghodrati (Qualcomm AI Research); Fatih Porikli (Qualcomm AI Research); Amirhossein Habibian (Qualcomm AI Research)
6039 Video Instance Segmentation via Multi-Scale Spatio-Temporal Split Attention Transformer Omkar Thawakar (MBZUAI)*; Sanath Narayan (Inception Institute of Artificial Intelligence); Jiale Cao (Tianjin University); Hisham Cholakkal (MBZUAI); Rao Muhammad Anwer (MBZUAI/AALTO); Muhammad Haris Khan (Muhammad Bin Zayed University of Artificial Intelligence); Salman Khan (MBZUAI/ANU); Michael Felsberg (Linköping University); Fahad Shahbaz Khan (MBZUAI)
6044 RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation Haodi He (University of Science and Technology of China); Yuhui Yuan (Microsoft Research)*; Xiangyu Yue (University of California, Berkeley); Han Hu (Microsoft Research Asia)
6046 Contextformer: A Transformer with Spatio-Channel Attention for Context Modeling in Learned Image Compression Ahmet Burakhan Koyuncu (Technical University of Munich)*; Han Gao (Tencent America); Atanas Boev (Huawei Technologies Duesseldorf GmbH); Georgii Gaikov (Huawei Moscow Research Center); Elena Alshina (Huawei Technologies); Eckehard Steinbach (TUM)
6048 Image Super-Resolution with Deep Dictionary Shunta Maeda (Navier Inc.)*
6054 ECO-TR: Efficient Correspondences Finding Via Coarse-to-Fine Refinement Dongli Tan (Xiamen University)*; Jiang-Jiang Liu (Nankai University); Xingyu Chen (Youtu Lab); Chao Chen (Youtu Laboratory); Ruixin Zhang (Tencent); Yunhang Shen (Xiamen University); Shouhong Ding (Tencent); Rongrong Ji (Xiamen University, China)
6056 Responsive Listening Head Generation: A Benchmark Dataset and Baseline Mohan Zhou (Harbin Institute of Technology)*; Yalong Bai (JD AI Research); Wei Zhang (JD AI Research); Ting Yao (JD AI Research); Tiejun Zhao (Harbin Institute of Technology); Tao Mei (AI Research of JD.com)
6063 WISE: Whitebox Image Stylization by Example-based Learning Winfried Lötzsch (Merantix Momentum); Max Reimann (Hasso-Plattner-Institute)*; Martin Büßemeyer (Hasso-Plattner-Institut); Amir Semmo (Digital Masterpieces GmbH); Jürgen Döllner (Hasso-Plattner-Institut); Matthias Trapp (Hasso Plattner Institute, University of Potsdam)
6067 3D Equivariant Graph Implicit Functions Yunlu Chen (University of Amsterdam)*; Basura Fernando (Agency for Science, Technology and Research, A*STAR, Singapore); Hakan Bilen (University of Edinburgh); Matthias Niessner (Technical University of Munich); Efstratios Gavves (University of Amsterdam)
6068 AnimeCeleb: Large-Scale Animation CelebHeads Dataset for Head Reenactment Kangyeol Kim (KAIST)*; Sunghyun Park (KAIST); Jaeseong Lee (KAIST); Sunghyo Chung (Korea University); Junsoo Lee (NAVER WEBTOON Ltd.); Jaegul Choo (Korea Advanced Institute of Science and Technology)
6076 Towards Scale-Aware, Robust, and Generalizable Unsupervised Monocular Depth Estimation by Integrating IMU Motion Dynamics Sen Zhang (The University of Sydney); Jing Zhang (The University of Sydney)*; Dacheng Tao (The University of Sydney)
6078 Dynamic Local Aggregation Network with Adaptive Clusterer for Anomaly Detection Zhiwei Yang (Xidian University)*; Peng Wu (Xidian University); Jing Liu (Xidian University); Xiaotao Liu (Xidian University)
6080 Learning Semantic Segmentation from Multiple Datasets with Label Shifts Dongwan Kim (Seoul National University)*; Yi-Hsuan Tsai (Phiar Technologies); Yumin Suh (NEC Labs America); Masoud Faraki (NEC Labs); Sparsh Garg (NEC Labs America); Manmohan Chandraker (UC San Diego); Bohyung Han (Seoul National University)
6086 SecretGen: Privacy Recovery on Pre-trained Models via Distribution Discrimination Zhuowen Yuan (UIUC); Fan Wu (UIUC); Yunhui Long (University of Illinois); Chaowei Xiao (NVIDIA); Bo Li (UIUC)*
6090 A Kendall Shape Space Approach to 3D Shape Estimation from 2D Landmarks Martha Paskin (Zuse Institute Berlin); Daniel Baum (Zuse Institute Berlin); Mason N Dean (City University of Hong Kong); Christoph von Tycowicz (Zuse Institute Berlin)*
6092 Temporally Consistent Transformer for Video Denoising Mingyang Song (ETH Zurich)*; Yang Zhang (Disney Research Studios); Tunç Aydin (Disney Research)
6093 Action Quality Assessment with Temporal Parsing Transformer Yang Bai (Durham University); Desen Zhou (Baidu, Inc.)*; Songyang Zhang (Shanghai AI Laboratory); Jian Wang (Baidu); Errui Ding (Baidu Inc.); Yu Guan (University of Warwick); Yang Long (Durham University); Jingdong Wang (Baidu)
6097 A study of Pre-training strategies and datasets for facial representation learning Adrian Bulat (Samsung AI Center, Cambridge)*; Shiyang Cheng (Samsung); Jing Yang (University of Nottingham); Andrew Garbett (Samsung AI Center); Enrique Sanchez (Samsung AI Centre); Georgios Tzimiropoulos (Queen Mary University of London)
6112 Neural Strands: Learning Hair Geometry and Appearance from Multi-View Images Radu Alexandru Rosu (University of Bonn); Shunsuke Saito (Facebook); Ziyan Wang (Carnegie Mellon University); Chenglei Wu (Facebook Reality Labs); Sven Behnke (University of Bonn); Giljoo Nam (Facebook Inc.)*
6114 Conditional Stroke Recovery for Fine-Grained Sketch-Based Image Retrieval Zhixin Ling (Fudan University)*; Zhen Xing (Fudan University); Jian Zhou (Fudan University); Xiangdong Zhou (Fudan University)
6123 Generalized Brain Image Synthesis with Transferable Convolutional Sparse Coding Networks Yawen Huang (Tencent)*; Feng Zheng (SUSTech); Xu Sun (Tencent); Yuexiang Li (Jarvis Lab, Tencent); Ling Shao (Terminus Group); Yefeng Zheng (Tencent)
6127 Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning Ting Yao (JD AI Research); Yingwei Pan (JD AI Research)*; Yehao Li (JD AI Research); Chong-Wah Ngo (Singapore Management University); Tao Mei (AI Research of JD.com)
6129 GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs Xin Liu (Tsinghua University)*; Xiaofei Shao (Deptrum); Bo Wang (Deptrum); Ya-Li Li (Tsinghua University); Shengjin Wang (Tsinghua University)
6138 Revisiting Batch Norm Initialization Jim Davis (Ohio State University); Logan Frank (Ohio State University)*
6141 NewsStories: Illustrating articles with visual summaries Reuben Tan (Boston University)*; Bryan Plummer (Boston University); Kate Saenko (Boston University); J.P. Lewis (Google Research); Avneesh Sud (Google); Thomas Leung (Google)
6144 Improving Few-Shot Learning through Multi-task Representation Learning Theory Quentin Bouniot (CEA, LIST)*; Ievgen Redko (Laboratoire Hubert Curien); Romaric Audigier (CEA LIST); Angélique Loesch (CEA LIST); Amaury Habrard (University of St-Etienne, Lab. H. Curien)
6145 Deep Semantic Statistics Matching (D2SM) Denoising Network Kangfu Mei (Johns Hopkins University)*; Vishal Patel (Johns Hopkins University); Rui Huang (The Chinese University of Hong Kong, Shenzhen)
6148 Long-tailed Instance Segmentation using Gumbel Optimized Loss Konstantinos P Alexandridis (University of Liverpool)*; Jiankang Deng (Imperial College London); Anh Nguyen (University of Liverpool); Shan Luo (University of Liverpool)
6162 DetMatch: Two Teachers are Better Than One for Joint 2D and 3D Semi-Supervised Object Detection Jinhyung Park (Carnegie Mellon University)*; Chenfeng Xu (UC Berkeley); Yiyang Zhou (UC Berkeley); Masayoshi TOMIZUKA (MSC Lab); Wei Zhan (University of California, Berkeley)
6177 3D Scene Inference from Transient Histograms Sacha Jungerman (University of Wisconsin-Madison)*; Atul N Ingle (University of Wisconsin-Madison); Yin Li (University of Wisconsin-Madison); Mohit Gupta (University of Wisconsin-Madison, USA)
6178 SSBNet: Improving Visual Recognition Efficiency by Adaptive Sampling Ho Man Kwan (The Hong Kong University of Science and Technology)*; S.H. Song (HKUST)
6182 Deep 360° Optical Flow Estimation by Multi-Projection Fusion Yiheng Li (Victoria University of Wellington); Connelly Barnes (Adobe); Kun Huang (Victoria University of Wellington); Fang-Lue Zhang (Victoria University of Wellington)*
6187 Neural Space-filling Curves Hanyu Wang (University of Maryland – College Park)*; Kamal Gupta (University of Maryland); Larry Davis (University of Maryland); Abhinav Shrivastava (University of Maryland)
6192 MFIM: Megapixel Facial Identity Manipulation Sanghyeon Na (kakaobrain)*
6194 Objects Can Move: 3D Change Detection by Geometric Transformation Consistency Aikaterini Adam (National Technical University of Athens)*; Torsten Sattler (Czech Technical University in Prague); Konstantinos Karantzalos (National Technical University of Athens); Tomas Pajdla (Czech Technical University in Prague)
6199 MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration Thomas F Hayes (Meta); Songyang Zhang (University of Rochester)*; Xi Yin (Facebook); Guan Pang (Facebook); Sasha Sheng (Meta Platforms); Harry Yang (Facebook); Songwei Ge (University of Maryland, College Park); Qiyuan Hu (Facebook AI Research); Devi Parikh (Georgia Tech & Facebook AI Research)
6203 PatchRD: Detail-Preserving Shape Completion by Learning Patch Retrieval and Deformation Bo Sun (UT Austin)*; Vladimir Kim (Adobe); Qixing Huang (The University of Texas at Austin); Noam Aigerman (Adobe); Siddhartha Chaudhuri (Adobe Research)
6207 Network Binarization via Contrastive Learning Yuzhang Shang (Illinois Institute of Technology)*; Dan Xu (The Hong Kong University of Science and Technology); Ziliang Zong (Texas State University); Liqiang Nie (Harbin Institute of Technology (Shenzhen)); Yan Yan (Illinois Institute of Technology)
6210 Lipschitz Continuity Retained Binary Neural Network Yuzhang Shang (Illinois Institute of Technology)*; Dan Xu (The Hong Kong University of Science and Technology); Bin Duan (Illinois Institute of Technology); Ziliang Zong (Texas State University); Liqiang Nie (Harbin Institute of Technology (Shenzhen)); Yan Yan (Illinois Institute of Technology)
6212 Is Geometry Enough for Matching in Visual Localization? Qunjie Zhou (Technical University of Munich)*; Sérgio Agostinho (Institute for Systems and Robotics, Instituto Superior Técnico, Universidade de Lisboa); Aljosa Osep (TUM Munich); Laura Leal-Taixé (TUM)
6214 Webly Supervised Concept Expansion for General Purpose Vision Models Amita Kamath (Allen Institute for Artificial Intelligence); Christopher A Clark (Allen Institute for AI)*; Tanmay Gupta (Allen Institute for Artificial Intelligence); Eric Kolve (Allen AI); Derek Hoiem (University of Illinois at Urbana-Champaign); Aniruddha Kembhavi (Allen Institute for Artificial Intelligence)
6216 Compositional Human-Scene Interaction Synthesis with Semantic Control Kaifeng Zhao (ETH Zurich)*; Shaofei Wang (ETH Zurich); Yan Zhang (ETH Zurich); Thabo Beeler (Disney Research | Studios); Siyu Tang (ETH Zurich)
6218 MaCLR: Motion-aware Contrastive Learning of Representations for Videos Fanyi Xiao (Meta); Joseph Tighe (Amazon); Davide Modolo (Amazon)*
6220 Transformers as Meta-Learners for Implicit Neural Representations Yinbo Chen (UC San Diego)*; Xiaolong Wang (UCSD)
6222 RAWtoBit: A Fully End-to-end Camera ISP Network Wooseok Jeong (Korea University); Seung-Won Jung (Korea University)*
6227 SpatialDETR: Robust Scalable Transformer-Based 3D Object Detection from Multi-View Camera Images with Global Cross-Sensor Attention Simon Doll (University of Tübingen)*; Richard Schulz (Mercedes Benz); Lukas Schneider (Daimler); Viviane Benzin (Mercedes-Benz AG); Markus Enzweiler (Esslingen University of Applied Sciences); Hendrik P. A. Lensch (University of Tübingen)
6228 3D Face Reconstruction with Dense Landmarks Erroll Wood (Microsoft)*; Tadas Baltrusaitis (Microsoft); Charlie Hewitt (Microsoft); Matthew A Johnson (Microsoft); Jingjing Shen (Microsoft); Nikola Milosavljevic (Microsoft); Daniel S Wilde (Microsoft); Stephan J Garbin (University College London); Toby Sharp (Microsoft); Ivan Stojiljkovic (Microsoft); Tom Cashman (Microsoft); Julien Valentin (Microsoft)
6236 SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds Pei Sun (Waymo)*; Mingxing Tan (Waymo); Weiyue Wang (Waymo); Chenxi Liu (Waymo); Fei Xia (Waymo); Zhaoqi Leng (Waymo); Dragomir Anguelov (Waymo)
6247 Incomplete Multi-view Domain Adaptation via Channel Enhancement and Knowledge Transfer Haifeng Xia (Tulane University)*; Pu Wang (MERL); Zhengming Ding (Tulane University)
6250 Exposure-Aware Dynamic Weighted Learning for Single-Shot HDR Imaging An Gia Vien (Dongguk University); Chul Lee (Dongguk University)*
6259 Seeing through a Black Box: Toward High-Quality Terahertz Imaging via Subspace-and-Attention Guided Restoration Weng-Tai Su (National Tsing Hua University); Yi-Chun Hung (University of California, Los Angeles); Po-Jen Yu (National Tsing Hua University); Shang-Hua Yang (National Tsing Hua University); Chia-Wen Lin (National Tsing Hua University)*
6265 SPViT: Enabling Faster Vision Transformers via Soft Token Pruning Zhenglun Kong (Northeastern University)*; Peiyan Dong (Northeastern University); Xiaolong Ma (Clemson University); Xin Meng (Peking University); Wei Niu (William & Mary); Mengshu Sun (Northeastern University); Xuan Shen (Northeastern University); Geng Yuan (Northeastern University); Bin Ren (William & Mary); Hao Tang (ETH Zurich); Minghai Qin (Western Digital Research); Yanzhi Wang (Northeastern University)
6269 Soft Masking for Cost-Constrained Channel Pruning Ryan Humble (Stanford University)*; Maying Shen (NVIDIA); Jorge Albericio Latorre (NVIDIA); Eric Darve (Stanford University); Jose M. Alvarez (NVIDIA)
6271 Ensemble Learning Priors Driven Deep Unfolding for Scalable Snapshot Compressive Imaging Chengshuai Yang (Westlake University)*; Shiyu Zhang (Westlake University); Xin Yuan (Westlake University)
6275 A Simple Baseline for Open Vocabulary Semantic Segmentation with Pre-trained Vision-language Model Mengde Xu (Huazhong University of Science and Tech.); Zheng Zhang (MSRA)*; Fangyun Wei (Microsoft Research Asia); Yutong Lin (Xi’an Jiaotong University); Yue Cao (Microsoft Research); Han Hu (Microsoft Research Asia); Xiang Bai (Huazhong University of Science and Technology)
6276 Triangle Attack: A Query-efficient Decision-based Adversarial Attack Xiaosen Wang (Huazhong University of Science and Technology)*; Zeliang Zhang (Huazhong University of Sci. & Technology); Kangheng Tong (Huazhong University of Science and Technology); Dihong Gong (Tencent AI Lab); Kun He (Huazhong University of Science and Technology); Zhifeng Li (Tencent AI Lab); Wei Liu (Tencent)
6282 Tailoring Self-Supervision for Supervised Learning WonJun Moon (Sungkyunkwan University)*; Jihwan Kim (Sungkyunkwan University); Jae-Pil Heo (Sungkyunkwan University)
6283 Difficulty-Aware Simulator for Open Set Recognition WonJun Moon (Sungkyunkwan University)*; Jun ho Park (Sungkyunkwan University); Hyun Seok Seong (Sungkyunkwan University); Cheol-Ho Cho (Sungkyunkwan University); Jae-Pil Heo (Sungkyunkwan University)
6287 Non-Uniform Step Size Quantization for Accurate Post-Training Quantization Sangyun Oh (UNIST)*; Hyeonuk Sim (UNIST); Jounghyun Kim (UNIST); Jongeun Lee (UNIST)
6298 FedVLN: Privacy-preserving Federated Vision-and-Language Navigation Kaiwen Zhou (University of California, Santa Cruz)*; Xin Eric Wang (University of California, Santa Cruz)
6305 Data-free Backdoor Removal Based on Channel Lipschitzness Runkai Zheng (Chinese University of Hong Kong (Shenzhen)); Rongjun Tang (The Chinese University of Hong Kong, Shenzhen); Jianze Li (Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen); Li Liu (Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen)*
6312 SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning Haoran You (Rice University)*; Baopu Li (Baidu); Zhanyi Sun (Rice University); Xu Ouyang (Rice University); Yingyan Lin (Rice University)
6316 PCR-CG: Point Cloud Registration via Deep Explicit Color and Geometry Yu Zhang (Shanghai Jiaotong University)*; Yu Junle (Hangzhou Dianzi University); Xiaolin Huang (Shanghai Jiao Tong University); Wenhui Zhou (Hangzhou Dianzi University); Ji Hou (Meta Reality Labs)
6323 DistPro: Searching A Fast Knowledge Distillation Process via Meta Optimization Xueqing Deng (University of California, Merced); Dawei Sun (University of Illinois Urbana-Champaign); Shawn Newsam (UC Merced); Peng Wang (Bytedance USA LLC.)*
6324 Tomography of Turbulence Strength Based on Scintillation Imaging Nir Shaul (Technion)*; Schechner Yoav (Technion)
6325 Realistic Blur Synthesis for Learning Image Deblurring Jaesung Rim (POSTECH); Geonung Kim (POSTECH); Jungeon Kim (POSTECH); Junyong Lee (POSTECH); Seungyong Lee (POSTECH); Sunghyun Cho (POSTECH)*
6328 GLAMD: Global and Local Attention Mask Distillation for Object Detectors YounHo Jang (Kyung Hee University); Wheemyung Shin (Kyung Hee University); Jinbeom Kim (Sungkyunkwan University (SKKU)); Sung-Ho Bae (Kyung Hee University)*; Simon S Woo (Sungkyunkwan University (SKKU))
6337 Meta-GF: Training Dynamic-Depth Neural Networks Harmoniously Yi Sun (National University of Defense Technology); Jian Li (NUDT); Xin Xu (National University of Defense Technology)*
6338 CXR Segmentation by AdaIN-based Domain Adaptation and Knowledge Distillation Yujin Oh (Kim Jaechul Graduate School of AI, KAIST, Korea); Jong Chul Ye (Kim Jaechul Graduate School of AI, KAIST, Korea)*
6342 Emotion-aware Multi-view Contrastive Learning for Facial Emotion Recognition Daeha Kim (Inha University); Byung Cheol Song (Inha University)*
6356 FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection Danila Rukhovich (Samsung AI Center Moscow); Anna Vorontsova (Samsung AI Center)*; Anton S. Konushin (Samsung AI Center Moscow)
6365 Video Dialog as Conversation about Objects Living in Space-Time Hoang-Anh Pham (Deakin University)*; Thao Minh Le (Deakin University); Vuong Le (Deakin University); Tu Minh Phuong (Posts and Telecommunications Institute of Technology); Truyen Tran (Deakin University)
6366 Few-Shot Class-Incremental Learning from an Open-Set Perspective Can Peng (The University of Queensland)*; Kun Zhao (Sullivan Nicolaides Pathology); Tianren Wang (The University of Queensland); Meng Li (The University of Queensland); Brian C Lovell (University of Queensland)
6380 ML-BPM: Multi-teacher Learning with Bidirectional Photometric Mixing for Open Compound Domain Adaptation in Semantic Segmentation Fei Pan (KAIST)*; Sungsu Hur (KAIST); Seokju Lee (KENTECH); Junsik Kim (Harvard University); In So Kweon (KAIST)
6389 DRCNet: Dynamic Image Restoration Contrastive Network Fei Li (China Agricultural University)*; Lingfeng Shen (Tencent AI Lab); YANG MI (China Agricultural University); Zhenbo Li (China Agricultural University)
6394 Order Learning Using Partially Ordered Data via Chainization Seon-Ho Lee (MCL, Korea University); Chang-Su Kim (Korea University)*
6395 Style Your Hair: Latent Optimization for Pose-Invariant Hairstyle Transfer via Local-Style-Aware Hair Alignment Chaeyeon Chung (Korea Advanced Institute of Science and Technology)*; Taewoo Kim (Korea Advanced Institute of Science and Technology); Yoonseo Kim (KAIST); Sunghyun Park (KAIST); Kangyeol Kim (KAIST); Jaegul Choo (Korea Advanced Institute of Science and Technology)
6403 High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions SangYun Lee (Soongsil University); Gyojung Gu (Korea Advanced Institute of Science and Technology)*; Sunghyun Park (KAIST); Seunghwan Choi (Korea Advanced Institute of Science and Technology); Jaegul Choo (Korea Advanced Institute of Science and Technology)
6418 Zero-Shot Learning for Reflection Removal of Single 360-Degree Image Byeong-Ju Han (Ulsan National Institute of Science and Technology); Jae-Young Sim (Ulsan National Institute of Science and Technology)*
6420 A Codec Information Assisted Framework for Efficient Compressed Video Super-Resolution Hengsheng Zhang (Shanghai Jiao Tong University)*; Xueyi Zou (Huawei Noah’s Ark Lab); Jiaming Guo (Huawei Noah’s Ark Lab); Youliang Yan (Huawei Noah’s Ark Lab); Rong Xie (Shanghai Jiao Tong University); Li Song (Shanghai Jiao Tong University)
6421 Towards Ultra Low Latency Spiking Neural Networks for Vision and Sequential Tasks Using Temporal Pruning Sayeed Shafayet Chowdhury (Purdue University)*; Nitin Rathi (Purdue University); Kaushik Roy (Purdue University)
6439 MimicME: A Large Scale Diverse 4D Database for Facial Expression Analysis Athanasios Papaioannou (Huawei)*; Baris Gecer (Huawei); Shiyang Cheng (Samsung); Grigorios Chrysos (EPFL); Jiankang Deng (Imperial College London); Eftychia Fotiadou (Imperial College London); Christos Kampouris (ApolloXR); Dimitrios Kollias (Queen Mary University London); Stylianos Moschoglou (Huawei Technologies Co. Ltd); Kritaphat Songsri-In (Imperial College London); Stylianos Ploumpis (Huawei Technologies Co. Ltd); George Trigeorgis (Imperial College London); Panagiotis Tzirakis (Imperial College London); Evangelos Ververas (Imperial College London); Yuxiang Zhou (Deepmind, Google); Allan Ponniah (NHS); Anastasios Roussos (Institute of Computer Science, Foundation for Research and Technology Hellas); Stefanos Zafeiriou (Imperial College London)
6441 Black-Box Dissector: Towards Erasing-based Hard-Label Model Stealing Attack Yixu Wang (Xiamen University)*; Jie Li (Xiamen University); Hong Liu (National Institute of Informatics); Yan Wang (Pinterest); Yongjian Wu (Tencent Technology (Shanghai) Co., Ltd); Feiyue Huang (Tencent); Rongrong Ji (Xiamen University, China)
6451 Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles Guodong Wang (Beihang University)*; Yunhong Wang (State Key Laboratory of Virtual Reality Technology and System, Beihang University, Beijing 100191, China); Jie Qin (Nanjing University of Aeronautics and Astronautics); Dongming Zhang (National Computer Network Emergency Response Technical Team/Coordination Center of China); Xiuguo Bao (National Computer Network Emergency Response Technical Team/Coordination Center of China); Di Huang (Beihang University, China)
6454 Towards Accurate Network Quantization with Equivalent Smooth Regularizers Kirill Solodskikh (Huawei Noah’s Ark Lab, MSU)*; Vladimir Chikin (Huawei Noah’s Ark Lab); Ruslan Aydarkhanov (Huawei Noah’s Ark Lab); Dehua Song (Huawei Noah’s Ark Lab); Irina Zhelavskaya (Skolkovo Institute of Science and Technology (Skoltech)); Jiansheng Wei (Huawei Technologies Co. Ltd.)
6455 DiffuseMorph: Unsupervised Deformable Image Registration Using Diffusion Model Boah Kim (KAIST)*; Inhwa Han (KAIST); Jong Chul Ye (Kim Jaechul Graduate School of AI, KAIST, Korea)
6459 An Impartial Take to the CNN vs Transformer Robustness Contest Francesco Pinto (University of Oxford)*; Philip Torr (University of Oxford); Puneet Dokania (University of Oxford)
6460 CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval Haoran Wang (Baidu)*; Dongliang He (Baidu); Wenhao Wu (Baidu); Boyang Xia (Institute of Computing Technology, Chinese Academy of Science); Min Yang (Baidu); Fu Li (Baidu); Yunlong Yu (Zhejiang University); Zhong Ji (Tianjin University); Errui Ding (Baidu Inc.); Jingdong Wang (Baidu)
6463 Weakly Supervised 3D Scene Segmentation with Region-Level Boundary Awareness and Instance Discrimination Kangcheng LIU (The Chinese University of Hong Kong)*; Yuzhi Zhao (City University of Hong Kong); Qiang Nie (Tencent Youtu Lab); Zhi Gao (NUS); Ben M. Chen (Chinese University of Hong Kong)
6471 FOSTER: Feature Boosting and Compression for Class-Incremental Learning Fu-Yun Wang (Nanjing University)*; Da-Wei Zhou (Nanjing University); Han-Jia Ye (Nanjing University); De-Chuan Zhan (Nanjing University)
6472 Delving into Universal Lesion Segmentation: Method, Dataset, and Benchmark Yu Qiu (Nankai University)*; Jing Xu (Nankai University)
6475 Explicit Model Size Control and Relaxation via Smooth Regularization for Mixed-Precision Quantization Vladimir Chikin (Huawei Noah’s Ark Lab)*; Kirill Solodskikh (Huawei Noah’s Ark Lab, MSU); Irina Zhelavskaya (Skolkovo Institute of Science and Technology (Skoltech))
6479 Large scale Real-world Multi Person Tracking Bing Shuai (Amazon)*; Alessandro Bergamo (Amazon); Uta Büchler (Amazon); Andrew G Berneshawi (Amazon); Alyssa Boden (Amazon Web Services); Joseph Tighe (Amazon)
6491 Class-agnostic Object Detection with Multi-modal Transformer Muhammad Maaz (MBZUAI)*; Hanoona Abdul Rasheed (MBZUAI); Salman Khan (MBZUAI/ANU); Fahad Shahbaz Khan (MBZUAI); Rao Muhammad Anwer (MBZUAI/AALTO); Ming-Hsuan Yang (University of California at Merced)
6493 Language-Grounded Indoor 3D Semantic Segmentation in the Wild Dávid Rozenberszki (Technische Universitat Munchen)*; Or Litany (Stanford); Angela Dai (Technical University of Munich)
6505 Injecting 3D Perception of Controllable NeRF-GAN into StyleGAN for Editable Portrait Image Synthesis Jeong-gi Kwak (Korea University); Yuanming Li (Korea University); Dongsik Yoon (Korea University); Donghyeon Kim (Korea University); David K Han (Drexel University); Hanseok Ko (Korea University)*
6512 BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks Han-Byul Kim (Seoul National University)*; Eunhyeok Park (POSTECH); Sungjoo Yoo (Seoul National University)
6513 AdaNeRF: Adaptive Sampling for Real-time Rendering of Neural Radiance Fields Andreas Kurz (Graz University of Technology)*; Thomas Neff (Graz University of Technology); Zhaoyang Lv (Facebook); Michael Zollhöfer (Facebook Reality Labs); Markus Steinberger (Graz University of Technology)
6516 Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion Zian Wang (University of Toronto)*; Wenzheng Chen (University of Toronto); David Acuna (University of Toronto, NVIDIA); Jan Kautz (NVIDIA); Sanja Fidler (University of Toronto, NVIDIA)
6519 Tree Structure-Aware Few-Shot Image Classification via Hierarchical Aggregation Min Zhang (Zhejiang University)*; Siteng Huang (Westlake University); Wenbin Li (Nanjing University); Donglin Wang (Westlake University)
6526 PoseScript: 3D Human Poses from Natural Language Ginger Delmas (NAVER LABS EUROPE)*; Philippe Weinzaepfel (NAVER LABS Europe); Thomas LUCAS (Naver); Francesc Moreno (IRI); Gregory Rogez (NAVER LABS Europe)
6532 Learning Energy-Based Models With Adversarial Training Xuwang Yin (University of Virginia)*; Shiying Li (University of North Carolina, Chapel Hill); Gustavo Rohde (University of Virginia)
6538 You Already Have It: A Generator-Free Low-Precision DNN Training Framework using Stochastic Rounding Geng Yuan (Northeastern University)*; Sung-En Chang (Northeastern University); Qing Jin (Northeastern University); Alec Lu (Simon Fraser University); Yanyu Li (Northeastern University); Yushu Wu (Northeastern University); Zhenglun Kong (Northeastern University); Yanyue Xie (Northeastern University); Peiyan Dong (Northeastern University); Minghai Qin (Western Digital Research); Xiaolong Ma (Clemson University); Xulong Tang (University of Pittsburgh); Zhenman Fang (Simon Fraser University); Yanzhi Wang (Northeastern University)
6540 TIPS: Text-Induced Pose Synthesis Prasun Roy (University of Technology Sydney)*; Subhankar Ghosh (University of Technology Sydney); Saumik Bhattacharya (Indian Institute of Technology Kharagpur); Umapada Pal (Indian Statistical Institute, Kolkata); Michael Blumenstein (University of Technology Sydney)
6541 Unsupervised High-Fidelity Facial Texture Generation and Reconstruction Ron Slossberg (Technion)*; Ibrahim Jubran (The University of Haifa); Ron Kimmel (Technion)
6551 Addressing Heterogeneity in Federated Learning via Distributional Transformation Haolin Yuan (Johns Hopkins University); Bo Hui (Johns Hopkins University); Yuchen Yang (Johns Hopkins University); Philippe Burlina (JHU/APL/CS/SOM); Neil Zhenqiang Gong (Duke University); Yinzhi Cao (JHU)*
6555 Adversarial Label Poisoning Attack on Graph Neural Networks via Label Propagation Ganlin Liu (The University of Liverpool)*; Xiaowei Huang (Liverpool University); Xinping Yi (University of Liverpool)
6559 Approximate Discrete Optimal Transport Plan with Auxiliary Measure Method Dongsheng An (Stony Brook University)*; Na Lei (Dalian University of Technology); Xianfeng GU (Stony Brook University)
6560 Visual Knowledge Tracing Neehar Kondapaneni (Caltech)*; Pietro Perona (California Institute of Technology); Oisin Mac Aodha (University of Edinburgh)
6562 Semi-Leak: Membership Inference Attacks Against Semi-supervised Learning Xinlei He (CISPA Helmholtz Center for Information Security)*; Hongbin Liu (Duke University); Neil Zhenqiang Gong (Duke University); Yang Zhang (CISPA Helmholtz Center for Information Security)
6565 DProST: Dynamic Projective Spatial Transformer Network for 6D Pose Estimation Jaewoo Park (Seoul National University); Nam Ik Cho (Seoul National University)*
6567 Accurate Detection of Proteins in Cryo-Electron Tomograms from Sparse Labels Qinwen Huang (Duke University)*; Alberto Bartesaghi (Duke University); Ye Zhou (Duke University); Hsuan-Fu Liu (Duke University)
6576 Subspace Diffusion Generative Models Bowen Jing (Massachusetts Institute of Technology)*; Gabriele Corso (MIT); Renato Berlinghieri (MIT); Tommi Jaakkola (MIT)
6583 Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features Byeonghu Na (KAIST); Yoonsik Kim (Clova AI Research, NAVER Corp.); Sungrae Park (Upstage AI Research, Upstage AI)*
6592 Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments Khoi D. Nguyen (VinAI Research)*; Quoc-Huy Tran (Retrocausal, Inc.); Khoi Nguyen (VinAI Research); Binh-Son Hua (VinAI Research); Rang NGUYEN (VinAI Research)
6599 Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection Kyle Min (Intel Labs); Sourya Roy (University of California, Riverside); Subarna Tripathi (Intel Labs)*; Tanaya Guha (University of Glasgow); Somdeb Majumdar (Intel Labs)
6602 Relative Contrastive Loss for Unsupervised Representation Learning Shixiang Tang (The University of Sydney)*; Feng Zhu (University of Science and Technology of China); Lei Bai (Shanghai AI Laboratory); Rui Zhao (SenseTime Group Limited); Wanli Ouyang (The University of Sydney)
6615 Personalized Education: Blind Knowledge Distillation Xiang Deng (State University of New York at Binghamton)*; Jian Zheng (Amazon); Zhongfei Zhang (Binghamton University)
6619 Fast Two-View Motion Segmentation Using Christoffel Polynomials Bengisu Ozbay (Northeastern University); Octavia Camps (Northeastern University); Mario Sznaier (Northeastern University)*
6623 Real Spike: Learning Real-valued Spikes for Spiking Neural Networks Yufei Guo (The Second Academy of China Aerospace Science and Industry Corporation)*; Liwen Zhang (X Lab, the Second Academy of CASIC, Beijing); Yuanpei Chen (X Lab, The Second Academy of CASIC, Beijing); Xinyi Tong (The Second Academy of China Aerospace Science and Industry Corporation); Xiaode Liu (X Lab, The Second Academy of China Aerospace Science and Industry Corporation); YingLei Wang (CASIC); Xuhui Huang (X Lab, The Second Academy of CASIC); Zhe Ma (X Lab, the Second Academy of CASIC, Beijing)
6627 Language-Driven Artistic Style Transfer Tsu-Jui Fu (UCSB)*; Xin Eric Wang (University of California, Santa Cruz); William Yang Wang (UC Santa Barbara)
6634 FedLTN: Federated Learning for Sparse and Personalized Lottery Ticket Networks Vaikkunth Mugunthan (DynamoFL)*; Eric Lin (DynamoFL); Vignesh Gokul (University of California San Diego); Christian Lau (DynamoFL); Lalana Kagal (MIT); Steve Pieper (Isomics, Inc.)
6639 Transformer with Implicit Edges for Particle-based Physics Simulation Yidi Shao (Nanyang Technological University)*; Chen Change Loy (Nanyang Technological University); Bo Dai (Shanghai AI Lab)
6651 Improving the Perceptual Quality of 2D Animation Interpolation Shuhong Chen (University of Maryland – College Park)*; Matthias Zwicker (University of Maryland)
6652 Towards Open-vocabulary Scene Graph Generation with Prompt-based Finetuning Tao He (Monash University); Lianli Gao (The University of Electronic Science and Technology of China); Jingkuan Song (UESTC); Yuan-Fang Li (Monash University)*
6655 S3C: Self-Supervised Stochastic Classifiers for Few-Shot Class-Incremental Learning Jayateja Kalla (Indian Institute of Science); Soma Biswas (Indian Institute of Science, Bangalore)*
6660 Entry-Flipped Transformer for Inference and Prediction of Participant Behavior BO HU (Nanyang Technological University)*; Tat-Jen Cham (Nanyang Technological University)
6665 OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning Mamshad Nayeem Rizve (University of Central Florida)*; Navid Kardan (University of Central Florida); Salman Khan (MBZUAI/ANU); Fahad Shahbaz Khan (MBZUAI); Mubarak Shah (University of Central Florida)
6666 Fine-grained Fashion Representation Learning by Online Deep Clustering Yang Jiao (Amazon)*; Ning Xie (Amazon); Yan Gao (Amazon); Chien-Chih Wang (Amazon); Yi Sun (Amazon)
6667 Perspective Phase Angle Model for Polarimetric 3D Reconstruction Guangcheng Chen (Guangdong University of Technology)*; Li He (Southern University of Science and Technology); Yisheng Guan (Guangdong University of Technology); Hong Zhang (University of Alberta)
6670 Selective TransHDR: Transformer-based selective HDR Imaging using Ghost Region Mask Jou Won Song (Sogang University); Ye-In Park (Sogang University); Kyeongbo Kong (Pukyong National University); Jaeho Kwak (Sogang University); Suk-Ju Kang (Sogang University)*
6671 3D Interacting Hand Pose Estimation by Hand De-occlusion and Removal Hao Meng (BeiHang University); Sheng Jin (The University of Hong Kong)*; Wentao Liu (Sensetime); Chen Qian (SenseTime); Mengxiang Lin (Beihang University); Wanli Ouyang (The University of Sydney); Ping Luo (The University of Hong Kong)
6678 Recover Fair Deep Classification Models via Altering Pre-trained Structure Yanfu Zhang (University of Pittsburgh)*; Shangqian Gao (University of Pittsburgh); Heng Huang (University of Pittsburgh)
6680 Improving Fine-Grained Visual Recognition in Low Data Regimes via Self-Boosting Attention Mechanism Yangyang Shu (University of Adelaide); Lingqiao Liu (University of Adelaide)*; Baosheng Yu (The University of Sydney); Haiming Xu (The University of Adelaide)
6686 VSA: Learning Varied-Size Window Attention in Vision Transformers Qiming Zhang (The University of Sydney)*; Yufei Xu (University of Sydney); Jing Zhang (The University of Sydney); Dacheng Tao (JD.com)
6693 PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting Thomas LUCAS (Naver)*; Fabien Baradel (Naver Labs Europe); Philippe Weinzaepfel (NAVER LABS Europe); Gregory Rogez (NAVER LABS Europe)
6694 CAViT: Contextual Alignment Vision Transformer for Video Object Re-identification jinlin wu (Institute of Automation, Chinese Academy of Sciences, Beijing, China)*; He Lingxiao (nlpr,cripac); Wu Liu (AI Research of JD.com); Yang Yang (Institute of Automation, Chinese Academy of Sciences); Zhen Lei (NLPR, CASIA, China); Tao Mei (AI Research of JD.com); Stan Z. Li (Westlake University)
6698 Learning Series-Parallel Lookup Tables for Efficient Image Super-Resolution Cheng Ma (Tsinghua University); Jingyi Zhang (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)*
6715 Frozen CLIP Models are Efficient Video Learners Ziyi Lin (The Chinese University of Hong Kong)*; Shijie Geng (Rutgers University); Renrui Zhang (Shanghai AI Lab); Peng Gao (Chinese University of Hong Kong); Gerard de Melo (Hasso Plattner Institute); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Jifeng Dai (SenseTime); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Hongsheng Li (The Chinese University of Hong Kong)
6719 Deforming Radiance Fields with Cages Tianhan Xu (The University of Tokyo)*; Tatsuya Harada (The University of Tokyo / RIKEN)
6720 GeoAug: Data Augmentation for Few-Shot NeRF with Geometry Constraints Di Chen (Alibaba Group)*; Yu Liu (Alibaba Group); Lianghua Huang (Alibaba Group); Bin Wang (Alibaba Group); Pan Pan (Alibaba Group)
6722 DoodleFormer: Creative Sketch Drawing with Transformers Ankan Kumar Bhunia (MBZUAI)*; Salman Khan (MBZUAI/ANU); Hisham Cholakkal (MBZUAI); Rao Muhammad Anwer (MBZUAI/AALTO); Fahad Shahbaz Khan (MBZUAI); Jorma Laaksonen (Aalto University); Michael Felsberg (Linköping University)
6727 Implicit Neural Representations for Variable Length Human Motion Generation Pablo Alberto Cervantes Baque (Tokyo Institute of Technology)*; Yusuke Sekikawa (Denso IT Laboratory); Ikuro Sato (Tokyo Institute of Technology / Denso IT Laboratory); Koichi SHINODA (Tokyo Institute of Technology)
6730 FLEX: Extrinsic Parameters-free Multi-view 3D Human Motion Reconstruction Brian Gordon (Tel Aviv University); Sigal Raab (Tel Aviv University)*; Guy Azov (Tel Aviv University); Raja Giryes (Tel Aviv University); Danny Cohen-Or (Tel Aviv University)
6731 Pairwise Contrastive Learning Network for Action Quality Assessment Mingzhe Li (Huaqiao University); Hong-Bo Zhang (Huaqiao University)*; Qing Lei (Huaqiao University); Zongwen Fan (Huaqiao University); Jinghua Liu (Huaqiao University); Ji-Xiang Du (Huaqiao University)
6742 Large-displacement 3D Object Tracking with Hybrid Non-local Optimization Xuhui Tian (Shandong University)*; Xinran Lin (Shandong University); Fan Zhong (Shandong University); Xueying Qin (Shandong University)
6745 Learning Object Placement via Dual-path Graph Completion Siyuan Zhou (Shanghai Jiao Tong University)*; Liu Liu (Shanghai Jiao Tong University); Li Niu (Shanghai Jiao Tong University); Liqing Zhang (Shanghai Jiao Tong University)
6777 Unbiased Manifold Augmentation for Coarse Class Subdivision Baoming Yan (Alibaba Group)*; KE GAO (alibaba-inc); Bo Gao (Alibaba Group); Lin Wang (Alibaba-inc); Jiang Yang (Alibaba Group); Xiaobo Li (Alibaba)
6798 Rethinking Video Rain Streak Removal: A New Synthesis Model and A Deraining Network with Video Rain Prior Shuai Wang (College of Intelligence and Computing, Tianjin University); Lei Zhu (The Hong Kong University of Science and Technology (Guangzhou))*; Huazhu Fu (IHPC, ASTAR); Jing Qin (The Hong Kong Polytechnic University); Carola-Bibiane B Schönlieb (Cambridge University); Wei Feng (School of Computer Science and Technology, Tianjin University); Song Wang (University of South Carolina)
6817 Expanded Adaptive Scaling Normalization for End to End Image Compression Chajin Shin (Yonsei University)*; Hyeongmin Lee (Yonsei University); Hanbin Son (Yonsei Univ.); Sangjin Lee (Yonsei University); Dogyoon Lee (Yonsei University); Sangyoun Lee (Yonsei University)
6827 Embedding contrastive unsupervised features to cluster in- and out-of-distribution noise in corrupted image datasets Paul Albert (Insight Centre for Data Analytics (DCU))*; Eric Arazo (Insight Centre for Data Analytics (DCU)); Noel O Connor (Home); Kevin McGuinness (DCU)
6835 Filter Pruning via Feature Discrimination in Deep Neural Networks Zhiqiang He (Zhejiang University of Science and Technology)*; Yaguan QIAN (Zhejiang University of Science and Technology); Yuqi Wang (Zhejiang University of Science and Technology); Bin WANG (Network and Information Security Laboratory of Hangzhou Hikvision Digital Technology Co.); Xiaohui Guan (Zhejiang University of Water Resources and Electric Power); Zhaoquan Gu (Guangzhou University); Xiang Ling (Institute of Software, Chinese Academy of Sciences); Shaoning Zeng (Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China); Haijiang Wang (Zhejiang University of Science and Technology); Wujie Zhou (Zhejiang University of Science and Technology)
6836 VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer Juan Felipe Montesinos (Universitat Pompeu Fabra)*; Venkatesh Shenoy Kadandale (Universitat Pompeu Fabra); Gloria Haro (Universitat Pompeu Fabra)
6837 SGBANet: Semantic GAN and Balanced Attention Network for Arbitrarily Oriented Scene Text Recognition Dajian Zhong (East China Normal University)*; Shujing Lv (East China Normal University); Palaiahnakote Shivakumara (University of Malaya); Bing Yin (IFLYTEK Co.,Ltd); Jiajia Wu (IFLYTEK Co.,Ltd); Umapada Pal (Indian Statistical Institute, Kolkata); Yue Lu (East China Normal University)
6838 DenseHybrid: Hybrid Anomaly Detection for Dense Open-set Recognition Matej Grcić (University of Zagreb, Faculty of Electrical Engineering and Computing)*; Petra Bevandić (Faculty of Electrical Engineering and Computing); Sinisa Segvic (UniZg-FER)
6862 D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights Yuzhen Zhang (Zhengzhou University); Wentong Wang (Zhengzhou University); Weizhi Guo (Zhengzhou University); Pei Lv (Zhengzhou University)*; Mingliang Xu (Zhengzhou University); Wei Chen (State Key Lab of CAD&CG, Zhejiang University); Dinesh Manocha (University of Maryland at College Park)
6867 Where in the World is this Image? Transformer-based Geo-localization in the Wild Shraman Pramanick (Johns Hopkins University)*; Ewa M Nowara (Meta Reality Labs); Joshua Gleason (Univ of Maryland); Carlos Castillo (Johns Hopkins University); Rama Chellappa (Johns Hopkins University)
6884 MODE: Multi-view Omnidirectional Depth Estimation with 360-degree Cameras Ming Li (NanJing University)*; Xueqian Jin (Nanjing University); Xuejiao Hu (Nanjing University); Jingzhao Dai (Nanjing University); Sidan Du (Nanjing University); Yang Li (NanJing University)
6895 NashAE: Disentangling Representations through Adversarial Covariance Minimization Eric C Yeats (Duke University)*; Frank Liu (Oak Ridge National Lab); David Womble (Oak Ridge National Laboratory); Hai Li (Duke University)
6900 Rethinking Confidence Calibration for Failure Prediction Fei Zhu (Institute of Automation of Chinese Academy of Sciences)*; Zhen Cheng (Institute of Automation of Chinese Academy of Sciences); Xu-Yao Zhang (Institute of Automation of Chinese Academy of Sciences); Cheng-Lin Liu (Institute of Automation of Chinese Academy of Sciences)
6905 Colorization for in situ marine plankton images Guannan Guo (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences); Qi Lin (Xiamen University); Tao Chen (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences); Zhenghui Feng (Harbin Institute of Technology, Shenzhen); Zheng Wang (Shenzhen Institutes of Advanced Technology); Jianping Li (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences)*
6912 PIP: Physical Interaction Prediction via Mental Simulation with Span Selection Jiafei Duan (University of Washington, Seattle)*; Samson Yu (Agency for Science, Technology and Research); Soujanya Poria (Singapore University of Technology and Design); Bihan Wen (Nanyang Technological University); Cheston Tan (Institute for Infocomm Research, Singapore)
6917 Generator Knows What Discriminator Should Learn in Unconditional GANs Gayoung Lee (NAVER AI Lab)*; Hyunsu Kim (NAVER AI Lab); Junho Kim (NAVER AI Lab); Seonghyeon Kim (Clova AI Research, NAVER Corp.); Jung-Woo Ha (NAVER CLOVA AI Lab); Yunjey Choi (NAVER AI Lab)
6921 A Gyrovector Space Approach for Symmetric Positive Semi-definite Matrix Learning Xuan Son Nguyen (Ensea)*
6940 Compositional Visual Generation with Composable Diffusion Models Nan Liu (University of Illinois at Urbana-Champaign); Shuang Li (MIT); Yilun Du (MIT)*; Antonio Torralba (MIT); Joshua Tenenbaum (MIT)
6942 Temporal and cross-modal attention for audio-visual zero-shot learning Otniel-Bogdan Mercea (University of Tübingen)*; Thomas Hummel (University of Tübingen); A. Sophia Koepke (University of Tübingen); Zeynep Akata (University of Tübingen)
6946 Telepresence Video Quality Assessment Zhenqiang Ying (The University of Texas at Austin)*; Deepti Ghadiyaram (Facebook); Alan Bovik (University of Texas at Austin)
6955 Enhancing Multi-modal Features Using Local Self-attention for 3D Object Detection Hao Li (Hikvision Digital Technology Co. Ltd)*; Zehan Zhang (Shanghai Jiao Tong University & Hangzhou Hikvision Digital Technology Co. Ltd); Zhao Xian (Hikvision); Yulong Wang (Hikvision Digital Technology Co. Ltd); Yuxi Shen (Hikvision); Shiliang Pu (Hikvision Research Institute); Hui Mao (Hangzhou Hikvision Digital Technology Co., Ltd)
6956 Totems: Physical Objects for Verifying Visual Integrity Jingwei Ma (University of Washington)*; Lucy Chai (MIT); Minyoung Huh (MIT); Tongzhou Wang (MIT); Ser-Nam Lim (Meta AI); Phillip Isola (MIT); Antonio Torralba (MIT)
6959 ManiFest: manifold deformation for few-shot image translation Fabio Pizzati (Inria / Vislab)*; Jean-Francois Lalonde (Université Laval); Raoul de Charette (Inria)
6963 3D Shape Sequence of Human Comparison and Classification using Current and Varifolds Emery Pierson (Université de Lille)*; Mohamed Daoudi (IMT Lille Douai); Sylvain Arguillere (Institute Camille Jordan)
6971 Decouple-and-Sample: Protecting sensitive information in task agnostic data release Abhishek Singh (MIT)*; Ethan Garza (MIT); Ayush Chopra (MIT); Praneeth Vepakomma (MIT); Vivek Sharma (MIT); Ramesh Raskar (Massachusetts Institute of Technology)
6972 Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space Wenqi Shao (The Chinese University of Hong Kong)*; Xun Zhao (Tencent Company); Yixiao Ge (Tencent); Zhaoyang Zhang (The Chinese University of Hong Kong); Lei Yang (Tencent); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Ying Shan (Tencent); Ping Luo (The University of Hong Kong)
6973 Object Detection as Probabilistic Set Prediction Georg Hess (Chalmers University of Technology)*; Christoffer Petersson (Zenseact); Lennart Svensson (Chalmers University of Technology)
6974 k-SALSA: k-anonymous synthetic averaging of retinal images via local style alignment Minkyu Jeon (Korea University)*; Hyeonjin Park (Korea University); Hyunwoo J Kim (Korea University); Michael G Morley (Ophthalmic Consultants of Boston); Hyunghoon Cho (Broad Institute of MIT and Harvard)
6976 Uncertainty-guided Source-free Domain Adaptation Subhankar Roy (University of Trento)*; Martin Trapp (Aalto University); Andrea Pilzer (Aalto University); Juho Kannala (Aalto University, Finland); Nicu Sebe (University of Trento); Elisa Ricci (University of Trento); Arno Solin (Aalto University)
6978 LA3: Efficient Label-Aware AutoAugment Mingjun Zhao (University of Alberta)*; Shan Lu (University of Alberta); Zixuan Wang (Tencent Inc.); Xiaoli Wang (Tencent); Di Niu (University of Alberta)
6982 Weakly-Supervised Temporal Action Detection for Fine-Grained Videos with Hierarchical Atomic Actions Zhi Li (University of California, Berkeley)*; Lu He (Tencent America); Huijuan Xu (Pennsylvania State University)
6986 Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos Tanqiu Qiao (Durham University); Qianhui Men (University of Oxford); Frederick W. B. Li (University of Durham); Yoshiki Kubotani (Waseda University); Shigeo Morishima (Waseda Research Institute for Science and Engineering); Hubert P. H. Shum (Durham University)*
6990 FEAR: Fast, Efficient, Accurate and Robust Visual Tracker Vasyl Borsuk (Ukrainian Catholic University); Roman Vei (Ukrainian Catholic University); Orest Kupyn (Ukrainian Catholic University); Tetiana Martyniuk (Ukrainian Catholic University)*; Igor Krashenyi (Piñata Farms); Jiri Matas (CMP CTU FEE)
6997 Variance-Aware Weight Initialization for Point Convolutional Neural Networks Pedro Hermosilla Casajus (Ulm University)*; Michael Schelling (Ulm University – Institute of Media Informatics); Tobias Ritschel (UCL); Timo Ropinski (Ulm University)
7004 Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training Haoxuan You (Columbia University)*; Luowei Zhou (Microsoft); Bin Xiao (Microsoft); Noel C Codella (Microsoft); Yu Cheng (Microsoft Research); Ruochen Xu (Microsoft); Shih-Fu Chang (Columbia University); Lu Yuan (Microsoft)
7016 Single-Stream Multi-Level Alignment for Vision-Language Pretraining Zaid Khan (Northeastern University)*; Vijay Kumar B G (NEC Laboratories America); Xiang Yu (NEC Labs); Samuel Schulter (NEC Laboratories America); Manmohan Chandraker (UC San Diego); YUN FU (Northeastern University)
7022 Revisiting Outer Optimization in Adversarial Training Ali Dabouei (West Virginia University)*; Fariborz Taherkhani (Carnegie Mellon University); Sobhan Soleymani (West Virginia University); Nasser Nasrabadi (West Virginia University)
7027 Supervised Attribute Information Removal and Reconstruction for Image Manipulation Nannan Li (Boston University)*; Bryan Plummer (Boston University)
7028 Conditional-Flow NeRF: Accurate 3D Modelling with Reliable Uncertainty Quantification Jianxiong Shen (IRI, CSIC-UPC)*; Antonio Agudo (Institut de Robotica i Informatica Industrial, CSIC-UPC); Francesc Moreno (IRI); Adria Ruiz (Seedtag)
7035 BLT: Bidirectional Layout Transformer for Controllable Layout Generation Xiang Kong (Carnegie Mellon University)*; Lu Jiang (Google Research); Huiwen Chang (Google); Han Zhang (Google); Yuan Hao (Google); Haifeng Gong (Google Inc.); Irfan Essa (Google)
7039 Neural Correspondence Field for Object Pose Estimation Lin Huang (University at Buffalo); Tomas Hodan (Facebook Reality Labs)*; Lingni Ma (Facebook Reality Labs); Linguang Zhang (Facebook Reality Labs); Luan Tran (Facebook); Christopher D Twigg (Meta); Po-Chen Wu (Meta Inc.); Junsong Yuan (State University of New York at Buffalo, USA); Cem Keskin (Facebook); Robert Wang (Facebook Reality Labs)
7043 The Missing Link: Finding label relations across datasets Jasper Uijlings (Google Research)*; Thomas Mensink (Google Research); Vittorio Ferrari (Google Research)
7044 On Label Granularity and Object Localization Elijah Cole (Caltech)*; Kimberly Wilber (Google); Grant Van Horn (Cornell University); Xuan Yang (Google); Marco Fornoni (Google); Pietro Perona (California Institute of Technology); Serge Belongie (University of Copenhagen); Andrew Howard (Google); Oisin Mac Aodha (University of Edinburgh)
7045 RadioTransformer: A Cascaded Global-Focal Transformer for Visual Attention-guided Disease Classification Moinak Bhattacharya (Stony Brook University)*; Shubham Jain (Stony Brook University); Prateek Prasanna (Stony Brook University)
7048 OIMNet++: Prototypical Normalization and Localization-aware Learning for Person Search Sanghoon Lee (Yonsei University); Youngmin Oh (Yonsei University); Donghyeon Baek (Yonsei University); Junghyup Lee (Yonsei University); Bumsub Ham (Yonsei University)*
7050 Most and Least Retrievable Images in Visual-Language Query Systems Liuwan Zhu (Old Dominion University)*; Rui Ning (Old Dominion University); Jiang Li (Old Dominion University); Chunsheng Xin (Old Dominion University); Hongyi Wu (University of Arizona)
7051 Contrasting quadratic assignments for set-based representation learning Artem Moskalev (University of Amsterdam)*; Ivan Sosnovik (University of Amsterdam); Volker Fischer (Bosch Center for Artificial Intelligence); Arnold W.M. Smeulders (University of Amsterdam)
7061 How stable are Transferability Metrics evaluations? Andrea Agostinelli (Google)*; Michal Pandy (University of Cambridge); Jasper Uijlings (Google Research); Thomas Mensink (Google Research); Vittorio Ferrari (Google Research)
7070 A Comparative Study of Graph Matching Algorithms in Computer Vision Stefan Haller (Heidelberg University)*; Lorenz Feineis (Heidelberg University); Lisa Hutschenreiter (Heidelberg University); Florian Bernard (University of Bonn); Carsten Rother (University of Heidelberg); Dagmar Kainmueller (MDC); Paul Swoboda (MPI fuer Informatik, Saarbruecken); Bogdan Savchynskyy (Heidelberg University)
7077 HM: Hybrid Masking for Few-Shot Segmentation Seonghyeon Moon (Rutgers University)*; Samuel S Sohn (Rutgers University); Honglu Zhou (Rutgers University); Sejong Yoon (The College of New Jersey); Vladimir Pavlovic (Rutgers University); Muhammad Haris Khan (Muhammad Bin Zayed University of Artificial Intelligence); Mubbasir Kapadia (Rutgers)
7082 UCTNet: Uncertainty-aware Cross-modal Transformer Network for Indoor RGB-D Semantic Segmentation Xiaowen Ying (Lehigh University)*; Mooi Choo Chuah (Lehigh University)
7090 Learning Omnidirectional Flow in 360° Video via Siamese Representation Keshav Bhandari (Texas State University)*; Bin Duan (Illinois Institute of Technology); Gaowen Liu (Cisco Research); Hugo M Latapie (Cisco); Ziliang Zong (Texas State University); Yan Yan (Illinois Institute of Technology)
7093 Improving Generalization in Federated Learning by Seeking Flat Minima Debora Caldarola (Politecnico di Torino)*; Barbara Caputo (Politecnico di Torino); Marco Ciccone (Politecnico di Torino)
7099 Efficient Deep Visual and Inertial Odometry with Adaptive Visual Modality Selection Mingyu Yang (University of Michigan)*; Yu Chen (University of Michigan); Hun Seok Kim (Nil)
7102 MultiMAE: Multi-modal Multi-task Masked Autoencoders Roman Bachmann (EPFL)*; David Mizrahi (EPFL); Andrei Atanov (EPFL); Amir Zamir (Swiss Federal Institute of Technology (EPFL))
7110 GigaDepth: Learning Depth from Structured Light with Branching Neural Networks Simon Schreiberhuber (TU Wien)*; Jean-Baptiste Weibel (TU Wien); Timothy Patten (University of Technology Sydney); Markus Vincze (TU Wien)
7122 Diverse Generation from a Single Video Made Possible Niv Haim (Weizmann Institute of Science)*; Ben Feinstein (Weizmann Institute of Science); Niv Granot (Weizmann Institute of Science); Assaf Shocher (Weizmann Institute of Science); Shai Bagon (Weizmann Institute of Science); Tali Dekel (Weizmann Institute of Science); Michal Irani (Weizmann Institute, Israel)
7127 Privacy-Preserving Action Recognition via Motion Difference Quantization Sudhakar Kumawat (Osaka University)*; Hajime Nagahara (Osaka University)
7139 Learning Phase Mask for Privacy-Preserving Passive Depth Estimation Zaid Tasneem (Rice University); Giovanni Milione (4 independence Way, Princeton, NJ 08540); Yi-Hsuan Tsai (Phiar Technologies); Xiang Yu (NEC Labs); Ashok Veeraraghavan (Rice University); Manmohan Chandraker (UC San Diego); Francesco Pittaluga (NEC Laboratories America)*
7143 DuelGAN: A Duel Between Two Discriminators Stabilizes the GAN Training Jiaheng Wei (UCSC)*; Minghao Liu (UCSC); Jiahao Luo (UCSC); Andrew Zhu (UCSC); James E Davis (UC Santa Cruz); Yang Liu (UC Santa Cruz)
7151 Should All Proposals be Treated Equally in Object Detection? Yunsheng Li (UCSD)*; Yinpeng Chen (Microsoft); Xiyang Dai (Microsoft); DongDong Chen (Microsoft Cloud AI); Mengchen Liu (Microsoft); Pei Yu (); Ying Jin (Microsoft); Lu Yuan (Microsoft); Zicheng Liu (Microsoft); Nuno Vasconcelos (UC San Diego)
7153 Interpretations Steered Network Pruning via Amortized Inferred Saliency Maps Alireza Ganjdanesh (University of Pittsburgh); Shangqian Gao (University of Pittsburgh); Heng Huang (University of Pittsburgh)*
7158 Out-of-Distribution Identification: Let Detector Tell Which I Am Not Sure Ruoqi Li (SJTU); Chongyang Zhang (Shanghai Jiao Tong University)*; Hao Zhou (Shanghai Jiao Tong University); Chao Shi (Shanghai Jiao Tong University); Yan Luo (Shanghai Jiao Tong University)
7167 Unsupervised Few-Shot Image Classification by Learning Features into Clustering Space Shuo Li (Xidian University); Fang Liu (Xidian University)*; Zehua Hao (Xidian University); Kaibo Zhao (Xidian University); Licheng Jiao (Xidian University)
7173 ViP: Unified Certified Detection and Recovery for Patch Attack with Vision Transformers Junbo Li (UC Santa Cruz); Huan Zhang (UCLA); Cihang Xie (University of California, Santa Cruz)*
7174 Panoramic Vision Transformer for Saliency Detection in 360 Videos Heeseung Yun (Seoul National University)*; Sehun Lee (Seoul National University); Gunhee Kim (Seoul National University)
7175 ActiveNeRF: Learning where to See with Uncertainty Estimation Xuran Pan (Tsinghua University); Zihang Lai (CMU); Shiji Song (Department of Automation, Tsinghua University); Gao Huang (Tsinghua)*
7176 incDFM: Incremental Deep Feature Modeling for Continual Novelty Detection Amanda S Rios (University of Southern California; Intel )*; Nilesh A Ahuja (Intel); Ibrahima Ndiour (Intel); Ergin U Genc (Intel); Laurent Itti (University of Southern California); Omesh Tickoo (Intel)
7186 BA-Net: Bridge Attention for Deep Convolutional Neural Networks Yue Zhao (Sun Yat-sen University); Junzhou Chen (Sun Yat-sen University)*; Zhang Zirui (Sun Yat-sen University); Ronghui Zhang (Sun Yat-Sen University)
7199 Super-Resolution by Predicting Offsets: An Ultra-Efficient Super-Resolution Network for Rasterized Images Jinjin Gu (The University of Sydney)*; Haoming CAI (University of Maryland, College Park); Chenyu Dong (Graduate school at Shenzhen, Tsinghua University); Ruofan Zhang (Tsinghua University); Yulun Zhang (ETH Zurich); Wenming Yang (Tsinghua University); Chun Yuan (Graduate school at ShenZhen, Tsinghua university)
7210 Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance Zhihang Zhong (The University of Tokyo); Xiao Sun (Microsoft Research Asia); Zhirong Wu (Microsoft Research); Yinqiang Zheng (The University of Tokyo); Stephen Lin (Microsoft Research)*; Imari Sato (National Institute of Informatics)
7211 Zero-Shot Attribute Attacks on Fine-Grained Recognition Models Nasim Shafiee (Northeastern University)*; Ehsan Elhamifar (Northeastern University)
7214 Break and Make: Interactive Structural Understanding Using LEGO Bricks Aaron T Walsman (University of Washington)*; Muru Zhang (University of Washington); Klemen Kotar (Allen Institute for AI); Karthik Desingh (University of Washington); Dieter Fox (NVIDIA Research / University of Washington); Ali Farhadi (University of Washington, Allen Institute for AI, Apple)
7218 PoserNet: Refining Relative Camera Poses Exploiting Object Detections Matteo Taiana (Istituto Italiano di Tecnologia)*; Matteo Toso (Istituto Italiano di Tecnologia); Stuart James (Istituto Italiano di Tecnologia (IIT)); Alessio Del Bue (Istituto Italiano di Tecnologia (IIT))
7224 Towards Effective and Robust Neural Trojan Defenses via Input Filtering Kien Duc Do (Deakin University)*; Haripriya Harikumar (Deakin University); Hung Le (Deakin University); Dung Nguyen (Deakin University); Truyen Tran (Deakin University); Santu Rana (Deakin University, Australia); Dang Nguyen (Deakin University); Willy Susilo (University of Wollongong); Svetha Venkatesh (Deakin University)
7230 View Vertically: A Hierarchical Network for Trajectory Prediction via Fourier Spectrums Conghao Wong (Huazhong University of Science and Technology); Beihao Xia (Huazhong University of Science and Technology); Ziming Hong (Huazhong University of Science and Technology); Qinmu Peng (Huazhong University of Science and Technology); Wei Yuan (Huazhong University of Science and Technology); Qiong Cao (JD.com); Yibo Yang (Peking University); Xinge YOU (Huazhong University of Science and Technology)*
7238 Bi-directional Contrastive Learning for Domain Adaptive Semantic Segmentation Geon Lee (Yonsei University); Chanho Eom (Yonsei University); Wonkyung Lee (PS Analytics); Hyekang Park (Yonsei University); Bumsub Ham (Yonsei University)*
7277 Rayleigh EigenDirections (REDs): Nonlinear GAN latent space traversals for multidimensional features Guha Balakrishnan (Rice University)*; Raghudeep Gadde (Amazon); Aleix M Martinez (Amazon); Pietro Perona (Amazon Web Services (AWS))
7278 ActionFormer: Localizing Moments of Actions with Transformers Chen-Lin Zhang (4Paradigm, Inc); Jianxin Wu (Nanjing University); Yin Li (University of Wisconsin-Madison)*
7281 Theoretical Understanding of the Information Flow on Continual Learning Performance Joshua J Andle (University of Maine); Salimeh Yasaei Sekeh (University of Maine)*
7283 3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching Runyu Mao (Purdue University)*; Chen Bai (Xpeng Motors); yatong an (xm); Fengqing Maggie Zhu (Purdue University, USA); Cheng Lu (Xiaopeng)
7288 Pure Transformer with Integrated Experts for Scene Text Recognition Yew Lee Tan (Nanyang Technological University)*; Wai-Kin Adams Kong (Nanyang Technological University); Jung Jae Kim (I2R)
7301 AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation Efthymios Tzinis (University of Illinois at Urbana-Champaign); Scott Wisdom (Google)*; Tal Remez (Google); John Hershey (Google)
7304 Bridging the Domain Gap towards Generalization in Automatic Colorization Hyejin Lee (Kookmin University); Daehee Kim (Naver Corp.); Daeun Lee (Korea university); Jinkyu Kim (Korea University); Jaekoo Lee (Kookmin University)*
7311 Learning with Free Object Segments for Long-Tailed Instance Segmentation Cheng Zhang (Carnegie Mellon University)*; Tai-Yu Pan (The Ohio State University); tianle chen (The Ohio State University); Jike Zhong (The Ohio State University); Wenjin Fu (The Ohio State University); Wei-Lun Chao (The Ohio State University)
7315 Rethinking Closed-loop Training for Autonomous Driving Chris Zhang (Waabi / University of Toronto)*; Runsheng Guo (University of Waterloo); Wenyuan Zeng (Waabi, University of Toronto); Yuwen Xiong (University of Toronto); Binbin Dai (Waabi); Rui Hu (Waabi); Mengye Ren (NYU / Google); Raquel Urtasun (Uber ATG)
7331 Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction YuXuan Liu (Covariant.ai, UC Berkeley)*; Nikhil Mishra (Covariant.ai, UC Berkeley); Maximilian Sieb (Covariant.ai); Fred Shentu (UC Berkeley); Pieter Abbeel (UC Berkeley); Peter Chen (COVARIANT.AI)
7337 Learning Regional Purity for Instance Segmentation on 3D Point Clouds Shichao Dong (Nanyang Technological University)*; Guosheng Lin (Nanyang Technological University); Tzu-Yi HUNG (Delta Research Center)
7346 Learning from Unlabeled 3D Environments for Vision-and-Language Navigation Shizhe Chen (INRIA)*; Pierre-Louis Guhur (Inria); Makarand Tapaswi (Wadhwani AI, IIIT Hyderabad); Cordelia Schmid (Inria/Google); Ivan Laptev (INRIA Paris)
7350 A Dataset Generation Framework for Evaluating Megapixel Image Classifiers & their Explanations Gautam B Machiraju (Stanford University)*; Sylvia Plevritis (Stanford University); Parag Mallick (Stanford University)
7351 Sports Video Analysis on Large-Scale Data Dekun Wu (University of Pittsburgh)*; He Zhao (York University); Xingce Bao (EPFL); Rick Wildes (York University)
7368 Audio-Visual Segmentation Jinxing Zhou (Hefei University of Technology); Jianyuan Wang (Chinese University of Hong Kong); Jiayi Zhang (BeiHang University); Weixuan Sun (Australian National University); Jing Zhang (Australian National University); Stan Birchfield (NVIDIA); Dan Guo (Hefei University of Technology); Lingpeng Kong (The University of Hong Kong); Meng Wang (Hefei University of Technology); Yiran Zhong (Australian National University)*
7374 SLiDE: Self-supervised LiDAR De-snowing through Reconstruction Difficulty Gwangtak Bae (Seoul National University)*; Byungjun Kim (Seoul National University); Seongyong Ahn (Agency for Defense Development); jihong Min (Agency for Defense Development); Inwook Shim (Inha University)
7375 On the Angular Update and Hyperparameter Tuning of a Scale-Invariant Network Juseung Yun (KAIST)*; Janghyeon Lee (LG AI Research); Hyounguk Shon (KAIST); Eojindl Yi (KAIST); Seung Hwan Kim (LG AI Research); Junmo Kim (KAIST)
7384 IGFormer: Interaction Graph Transformer for Skeleton-based Human Interaction Recognition Yunsheng Pang (University of Melbourne)*; Qiuhong Ke (Monash University); Hossein Rahmani (Lancaster University); James Bailey (THE UNIVERSITY OF MELBOURNE); Jun Liu (Singapore University of Technology and Design)
7385 LANA: Latency Aware Network Acceleration Pavlo Molchanov (NVIDIA)*; James B Hall (Microsoft Research); Hongxu Yin (NVIDIA ); Nicolo Fusi (Microsoft Research); Jan Kautz (NVIDIA); Arash Vahdat (NVIDIA)
7388 A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch Patsorn Sangkloy (Georgia Institute of Technology)*; Wittawat Jitkrittum (Google Research); Diyi Yang (Georgia Institute of Technology); James Hays (Georgia Institute of Technology, USA)
7396 HVC-Net: Unifying Homography, Visibility, and Confidence Learning for Planar Object Tracking Haoxian Zhang (Tencent)*; Yonggen Ling (Tencent)
7417 3D Random Occlusion and Multi-Layer Projection for Deep Multi-Camera Pedestrian Localization Rui Qiu (Xi’an Jiaotong-Liverpool University, University of Liverpool); Ming Xu (Xi’an Jiaotong-Liverpool University)*; Yuyao Yan (Xi’an Jiaotong-Liverpool University); Jeremy S Smith (University of Liverpool); Xi Yang (Xi’an Jiaotong-Liverpool University)
7427 Masked Siamese Networks for Label-Efficient Learning Mahmoud Assran (Facebook AI)*; Mathilde Caron (Facebook Artificial Intelligence Research); Ishan Misra (Facebook AI Research); Piotr Bojanowski (Facebook); Florian Bordes (MILA); Pascal Vincent (Facebook FAIR & MILA Université de Montréal); Armand Joulin (Facebook AI Research); Mike Rabbat (Facebook FAIR); Nicolas Ballas (Facebook FAIR)
7441 A Simple Single-Scale Vision Transformer for Object Detection and Instance Segmentation Wuyang Chen (University of Texas at Austin)*; Xianzhi Du (Google Brain); Fan Yang (Google); Lucas Beyer (Google Brain); Xiaohua Zhai (Google Brain); Tsung-Yi Lin (Google Brain); Huizhong Chen (Google); Jing Li (Google Brain); Xiaodan Song (Google Brain); Zhangyang Wang (University of Texas at Austin); Denny Zhou (Google Brain)
7443 A Cloud 3D Dataset and Application-Specific Learned Image Compression in Cloud 3D Tianyi Liu (The University of Texas at San Antonio)*; Sen He (The University of Texas at San Antonio); Vinodh Kumaran Jayakumar (UTSA); Wei Wang (The University of Texas at San Antonio)
7449 Cross-Domain Few-Shot Semantic Segmentation Shuo Lei (Virginia Tech)*; Xuchao Zhang (NEC Labs America); Jianfeng He (Virginia Tech); Fanglan Chen (Virginia Tech); Bowen Du (Beihang University); Chang-Tien Lu (Virginia Tech, USA)
7450 VizWiz-FewShot: Locating Objects in Images Taken by People With Visual Impairments Yu-Yun Tseng (University of Colorado Boulder)*; Alexander Bell (IVC Group); Danna Gurari (University of Colorado Boulder)
7474 Towards Metrical Reconstruction of Human Faces Wojciech Zielonka (Max Planck Institute for Intelligent Systems); Timo Bolkart (Max Planck Institute for Intelligent Systems); Justus Thies (Max Planck Institute for Intelligent Systems)*
7476 DeepShadow: Neural Shape from Shadow Asaf Karnieli (Reichman University)*; Yacov Hel-Or (The Interdisciplinary Center); Ohad Fried (IDC Herzliya)
7500 Class-Incremental Learning with Cross-Space Clustering and Controlled Transfer Arjun Ashok (Indian Institute of Technology, Hyderabad)*; Joseph K J (Indian Institute of Technology, Hyderabad); Vineeth N Balasubramanian (Indian Institute of Technology, Hyderabad)
7509 Object discovery and representation networks Olivier Henaff (DeepMind)*; Skanda Koppula (DeepMind); Evan Shelhamer (DeepMind); Daniel Zoran (DeepMind); Andrew Jaegle (DeepMind); Andrew Zisserman (Oxford University); Joao Carreira (DeepMind); Relja Arandjelović (DeepMind)
7511 MeshUDF: Fast and Differentiable Meshing of Unsigned Distance Field Networks Benoit Guillard (EPFL)*; Federico Stella (EPFL); Pascal Fua (EPFL, Switzerland)
7519 Natural Synthetic Anomalies for Self-Supervised Anomaly Detection and Localization Hannah M Schlueter (Imperial College London)*; Jeremy Tan (Imperial College London); Benjamin Hou (Imperial College London); Bernhard Kainz (Imperial College London, FAU Erlangen-Nürnberg)
7522 Shap-CAM: Visual Explanations for Convolutional Neural Networks based on Shapley Value Quan Zheng (Tsinghua University); Ziwei Wang (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)*
7529 Simple Open-Vocabulary Object Detection with Vision Transformers Matthias Minderer (Google Research)*; Alexey Gritsenko (Google Brain); Austin C Stone (Google); Maxim Neumann (Google); Dirk Weißenborn (German Research Center for Artificial Intelligence); Alexey Dosovitskiy (Inceptive); Aravindh Mahendran (Google); Anurag Arnab (Google); Mostafa Dehghani (Google Brain); Zhuoran Shen (Pony.ai); Xiao Wang (Google); Xiaohua Zhai (Google Brain); Thomas Kipf (Google Brain); Neil Houlsby (Google)
7533 Video Restoration Framework and its Meta-adaptations to Data-poor Conditions Prashant W Patil (Deakin University)*; Sunil Gupta (Deakin University, Australia); Santu Rana (Deakin University, Australia); Svetha Venkatesh (Deakin University)
7539 PRIME: A Few Primitives Can Boost Robustness to Common Corruptions Apostolos Modas (EPFL)*; Rahul Shekhar Rade (EthonAI); Guillermo Ortiz-Jimenez (EPFL); Seyed-Mohsen Moosavi-Dezfooli (Imperial College London); Pascal Frossard (EPFL)
7541 AlphaVC: High-Performance and Efficient Learned Video Compression Yibo Shi (Huawei); Yunying Ge (Huawei Technologies); Jing Wang (Huawei)*; Jue Mao (Huawei technologies)
7542 Content-Oriented Learned Image Compression Meng Li (Huawei); Shangyin Gao (Huawei); Yihui Feng (HUAWEI Technology Co., Ltd); Yibo Shi (Huawei); Jing Wang (Huawei)*
7543 Generating Natural Images with Direct Patch Distributions Matching Ariel Elnekave (Hebrew University of Jerusalem)*; Yair Weiss (Hebrew University)
7545 Latent Space Smoothing for Individually Fair Representations Momchil Peychev (ETH Zurich)*; Anian Ruoss (DeepMind); Mislav Balunovic (ETH Zurich); Maximilian Baader (ETH Zürich); Martin Vechev (ETH Zurich)
7555 SAU: Smooth activation function using convolution with approximate identities Koushik Biswas (Indraprastha Institute of Information Technology, New Delhi, India)*; Sandeep Kumar (Shaheed Bhagat Singh College, University of Delhi, Delhi); Shilpak Banerjee (Indian Institute of Technology Tirupati); Ashish Kumar Pandey (Indraprastha Institute of Information Technology, New Delhi, India)
7561 TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments Shubham Dokania (IIIT Hyderabad)*; Anbumani Subramanian (IIIT-Hyderabad); Manmohan Chandraker (UC San Diego); C.V. Jawahar (IIIT-Hyderabad)
7562 Motion Sensitive Contrastive Learning for Self-supervised Video Representation JingCheng Ni (Beihang University)*; Nan Zhou (Beihang University); Jie Qin (Nanjing University of Aeronautics and Astronautics); Qian Wu (Megvii); Junqi Liu (Megvii); Boxun Li (Megvii Inc.); Di Huang (Beihang University, China)
7573 Scaling Adversarial Training to Large Perturbation Bounds Sravanti Addepalli (Indian Institute of Science)*; Samyak Jain (Indian Institute of Technology (BHU), Varanasi); Gaurang Sriramanan (University of Maryland, College Park); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science)
7592 RDO-Q: Extremely Fine-Grained Channel-Wise Quantization via Rate-Distortion Optimization Zhe Wang (Institute for Infocomm Research, Singapore)*; Jie Lin (Institute for Infocomm Research (I2R), Singapore); Xue Geng (I2R, A*STAR); Mohamed M. Sabry Aly (Nanyang Technological University); Vijay R. Chandrasekhar (Institute for Infocomm Research)
7605 Camera Auto-calibration from the Steiner Conic of the Fundamental Matrix Yu LIU (United International College, BNU-HKBU)*; Hui Zhang (UIC)
7626 Understanding Collapse in Non-Contrastive Siamese Representation Learning Alexander C Li (Carnegie Mellon University)*; Alexei A Efros (UC Berkeley); Deepak Pathak (Carnegie Mellon University)
7634 AutoTransition: Learning to Recommend Video Transition Effects Yaojie Shen (Institute of Software, Chinese Academy of Sciences); Libo Zhang (Institute of Software Chinese Academy of Sciences); Kai Xu (ByteDance Inc); Xiaojie Jin (Bytedance Inc. USA)*
7651 SPE-Net: Boosting Point Cloud Analysis via Rotation Robustness Enhancement Zhaofan Qiu (JD.com); Yehao Li (JD AI Research); Yu Wang (JD AI Research); Yingwei Pan (JD AI Research); Ting Yao (JD AI Research)*; Tao Mei (AI Research of JD.com)
7667 Text-based Temporal Localization of Novel Events Sudipta Paul (University of California, Riverside)*; Niluthpol C Mithun (SRI International); Amit K. Roy-Chowdhury (University of California, Riverside)
7687 Effective Presentation Attack Detection Driven by Face Related Task Wentian Zhang (Shenzhen University); Haozhe Liu (King Abdullah University of Science and Technology); Feng Liu (Shenzhen University)*; Raghavendra Ramachandra (NTNU, Norway); Christoph Busch (Norwegian University of Science and Technology)
7691 LWGNet – Learned Wirtinger Gradients for Fourier Ptychographic Phase Retrieval Atreyee Saha (Indian Institute of Technology Madras)*; Salman Siddique Khan (IIT Madras); Sagar Sehrawat (IIT Madras); Sanjana S Prabhu (Indian Institute of Technology Madras); Shanti Bhattacharya (IIT Madras); Kaushik Mitra (IIT Madras)
7693 Federated Self-supervised Learning for Video Understanding Yasar Rehman (TCL Corporate Research(Hong Kong) Co. Ltd); Yan Gao (University of Cambridge)*; Jiajun Shen (TCL Research); Pedro Gusmao (University of Cambridge); Nicholas Lane (University of Cambridge and Samsung AI)
7694 Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval Zhaopeng Dou (Tsinghua University)*; Zhongdao Wang (Tsinghua University); Weihua Chen (alibaba group); Ya-Li Li (Tsinghua University); Shengjin Wang (Tsinghua University)
7704 The Shape Part Slot Machine: Contact-based Reasoning for Generating 3D Shapes from Parts Kai Wang (Brown University)*; Paul Guerrero (Adobe); Vladimir Kim (Adobe); Siddhartha Chaudhuri (Adobe Research); Minhyuk Sung (KAIST); Daniel Ritchie (Brown University)
7710 Attention Diversification for Domain Generalization Rang Meng (Hikvision Research Institute)*; Xianfeng Li (Hikvision Research Institute); Weijie Chen (Zhejiang University); Shicai Yang (Hikvision Research Institute); Jie Song (Zhejiang University); Xinchao Wang (National University of Singapore); Lei Zhang (Chongqing University); Mingli Song (Zhejiang University); Di Xie (Hikvision Research Institute); Shiliang Pu (Hikvision Research Institute)
7718 Exploiting the local parabolic landscapes of adversarial losses to accelerate black-box adversarial attack Hoang Tran (Oak Ridge National Laboratory); Dan Lu (Oak Ridge National Laboratory); Guannan Zhang (Oak Ridge National Laboratory)*
7719 Towards Efficient and Effective Self-Supervised Learning of Visual Representations Sravanti Addepalli (Indian Institute of Science)*; Kaushal Bhogale (Indian Institute of Technology, Madras); Priyam Dey (Indian Institute of Science); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science)
7722 TransVLAD: Focusing on Locally Aggregated Descriptors for Few-Shot Learning Haoquan Li (Southern University of Science and Technology)*; Laoming Zhang (Southern University of Science and Technology); Daoan Zhang (Southern University of Science and Technology); Lang Fu (Southern University of Science and Technology); Peng Yang (Southern University of Science and Technology); Jianguo Zhang (Southern University of Science and Technology)
7735 Rotation Regularization Without Rotation Takumi Kobayashi (National Institute of Advanced Industrial Science and Technology)*
7741 Parameterized Temperature Scaling for Boosting the Expressive Power in Post-Hoc Uncertainty Calibration Christian Tomani (TUM)*; Daniel Cremers (TU Munich); Florian Buettner (German Cancer Research Center and Frankfurt University)
7746 FairStyle: Debiasing StyleGAN2 with Style Channel Manipulations Cemre Efe Karakas (Bogazici University); Alara Dirik (Bogazici University); Eylül Yalçınkaya (Bogazici University); Pinar Yanardag (Bogazici University)*
7756 Dynamic Temporal Filtering in Video Models Fuchen Long (JD.com); Zhaofan Qiu (JD.com); Yingwei Pan (JD AI Research)*; Ting Yao (JD AI Research); Chong-Wah Ngo (Singapore Management University); Tao Mei (AI Research of JD.com)
7764 DH-AUG: DH Forward Kinematics Model Driven Augmentation for 3D Human Pose Estimation linzhi huang (Beijing University of Posts and Telecommunications)*; Jiahao Liang (Beijing University of Posts and Telecommunications); Weihong Deng (Beijing University of Posts and Telecommunications)
7765 Super-resolution 3D Human Shape from a Single Low-Resolution Image Marco Pesavento (University of Surrey)*; Marco Volino (University of Surrey); Adrian Hilton (University of Surrey)
7771 Trading Positional Complexity vs Deepness in Coordinate Networks Jianqiao Zheng (University of Adelaide)*; Sameera Ramasinghe (University of Adelaide); Xueqian Li (Carnegie Mellon University); Simon Lucey (University of Adelaide)
7785 ESS: Learning Event-based Semantic Segmentation from Still Images Zhaoning Sun (ETH Zürich); Nico Messikommer (University of Zurich & ETH Zurich)*; Daniel Gehrig (University of Zurich & ETH Zurich); Davide Scaramuzza (University of Zurich & ETH Zurich, Switzerland)
7802 U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture Search Ahmet Yüzügüler (EPFL)*; Nikolaos Dimitriadis (EPFL); Pascal Frossard (EPFL)
7803 MonteBoxFinder: Detecting and Filtering Primitives to Fit a Noisy Point Cloud Michaël Ramamonjisoa (Ecole des Ponts)*; Sinisa Stekovic (Graz University of Technology); Vincent Lepetit (Ecole des Ponts ParisTech)
7815 Trapped in texture bias? A large scale comparison of deep instance segmentation Johannes Theodoridis (Hochschule der Medien Stuttgart)*; Jessica Hofmann (Hochschule der Medien); Johannes Maucher (Media University Stuttgart); Andreas G Schilling (University of Tübingen)
7845 MVDG: A Unified Multi-view Framework for Domain Generalization Jian Zhang (Nanjing University)*; Lei Qi (Southeast University); Yinghuan Shi (Nanjing University); Yang Gao (Nanjing University)
7847 MINER: Multiscale Implicit Neural Representation Vishwanath Saragadam (Rice University)*; Jasper T Tan (Rice University); Guha Balakrishnan (Rice University); Richard Baraniuk (Rice University); Ashok Veeraraghavan (Rice University)
7856 PTQ4ViT: Post-Training Quantization for Vision Transformers with Twin Uniform Quantization Zhihang Yuan (Peking University)*; Chenhao Xue (Peking University); Yiqi Chen (Peking University); Qiang Wu (HOUMO.AI); Guangyu Sun (Peking University)
7865 Context-Consistent Semantic Image Editing with Style-Preserved Modulation Wuyang Luo (School of Computer Science, Fudan University); Su Yang (School of Computer Science, Fudan University)*; Hong Wang (School of Computer Science, Fudan University); Bo Long (School of Computer Science, Fudan University ); Weishan Zhang (Department of Software Engineering, China University of Petroleum)
7874 Distilling the Undistillable: Learning from a Nasty Teacher Surgan Jandial (MDSR Labs, Adobe)*; Yash Khasbage (Indian Institute of Technology, Hyderabad); Arghya Pal (Harvard University); Vineeth N Balasubramanian (Indian Institute of Technology, Hyderabad); Balaji Krishnamurthy ()
7879 Grounding Visual Representations with Texts for Domain Generalization Seonwoo Min (LG AI Research)*; Nokyung Park (Korea University); Siwon Kim (Seoul National University); Seunghyun Park (Clova AI Research, NAVER Corp.); Jinkyu Kim (Korea University)
7883 Towards Accurate Open-Set Recognition via Background-Class Regularization Wonwoo Cho (Korea Advanced Institute of Science and Technology)*; Jaegul Choo (Korea Advanced Institute of Science and Technology)
7899 In Defense of Image Pre-Training for Spatiotemporal Recognition Xianhang Li (University of California, Santa Cruz)*; Huiyu Wang (JHU); Chen Wei (Johns Hopkins University); Jieru Mei (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Yuyin Zhou (UC Santa Cruz); Cihang Xie (University of California, Santa Cruz)
7925 SocialVAE: Human Trajectory Prediction using Timewise Latents Pei Xu (Clemson University)*; Jean-Bernard Hayet (CIMAT); Ioannis Karamouzas (Clemson University)
7926 BodySLAM: Joint Camera Localisation, Mapping, and Human Motion Tracking Dorian F Henning (Imperial College London)*; Tristan Laidlow (Imperial College London); Stefan Leutenegger (TU Munich)
7935 Eliminating Gradient Conflict in Reference-based Line-Art Colorization zekun li (University of Electronic Science and Technology of China)*; Zhengyang Geng (Peking University); Zhao Kang (University of Electronic Science and Technology of China); Wenyu Chen (University of Electronic Science and Technology of China); Yibo Yang (Peking University)
7950 Transfer without Forgetting Matteo Boschini (University of Modena and Reggio Emilia)*; Lorenzo Bonicelli (University of Modena and Reggio Emilia); Angelo Porrello (University of Modena and Reggio Emilia); Giovanni Bellitto (University of Catania); Matteo Pennisi (University of Catania); Simone Palazzo (University of Catania); Concetto Spampinato (University of Catania); SIMONE CALDERARA (University of Modena and Reggio Emilia, Italy)
7955 DSR — A dual subspace re-projection network for surface anomaly detection Vitjan Zavrtanik (University of Ljubljana)*; Matej Kristan (University of Ljubljana); Danijel Skocaj (University of Ljubljana)
7964 Multi-Exit Semantic Segmentation Networks Alexandros Kouris (Imperial College London and Samsung AI)*; Stylianos Venieris (Samsung AI); Stefanos Laskaridis (Samsung AI); Nicholas Lane (University of Cambridge and Samsung AI)
7968 Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks Bernd Prach (IST Austria)*; Christoph H Lampert (IST Austria)
8001 Bridging the visual semantic gap in VLN via semantically richer instructions Joaquín Ignacio Ossandón (Universidad Catolica de Chile)*; Benjamín Earle (Universidad Católica de Chile); Alvaro Soto (Universidad Catolica de Chile)
8003 Kernel Relative-prototype Spectral Filtering for Few-shot Learning Tao Zhang (Chengdu Techman Software Co., Ltd.)*; Wu Huang (Sichuan University)
8009 StoryDALL-E: Adapting Pretrained Text-to-image Transformers for Story Continuation Adyasha Maharana (UNC Chapel Hill)*; Darryl Hannan (University of North Carolina at Chapel Hill); Mohit Bansal (University of North Carolina at Chapel Hill)
8026 Unsupervised Learning of Efficient Geometry-Aware Neural Articulated Representations Atsuhiro Noguchi (The University of Tokyo)*; Xiao Sun (Microsoft Research Asia); Stephen Lin (Microsoft Research); Tatsuya Harada (The University of Tokyo / RIKEN)
8029 PANDORA: Polarization-Aided Neural Decomposition Of Radiance Akshat Dave (Rice University)*; Yongyi Zhao (Rice University); Ashok Veeraraghavan (Rice University)
8042 OCR-free Document Understanding Transformer Geewook Kim (NAVER Corporation)*; Teakgyu Hong (Upstage AI); Moonbin Yim (Clova AI Research, NAVER Corp.); Jeongyeon Nam (Naver); Jinyoung Park (TmaxAI); Jinyeong Yim (Google); Wonseok Hwang (LBox); Sangdoo Yun (NAVER AI LAB); Dongyoon Han (NAVER AI Lab); Seunghyun Park (Clova AI Research, NAVER Corp.)
8048 VQGAN-CLIP: Open Domain Image Generation and Manipulation Using Natural Language Katherine B Crowson (EleutherAI); Stella R Biderman (Booz Allen Hamilton)*; Daniel Kornis (Eleuther.ai); Dashiell Stander (Eleuther AI); Eric Hallahan (EleutherAI); Louis J Castricato (Georgia Tech); Edward Raff (Booz Allen Hamilton)
8063 Learning to use unlabeled data in data augmentation for 3D detection Zhaoqi Leng (Waymo)*; Shuyang Cheng (Waymo LLC); Ben Caine (Google); Weiyue Wang (Waymo); Xiao Zhang (Cruise); Jonathon Shlens (Google); Mingxing Tan (Waymo); Dragomir Anguelov (Waymo)
8070 Differentiable Zooming for Multiple Instance Learning on Whole-Slide Images Kevin Thandiackal (ETH Zurich / IBM Research)*; Boqi Chen (ETH Zurich); Pushpak Pati (IBM Research Zurich); Guillaume Jaume (Harvard); Drew Williamson (Pathology, Brigham and Women’s Hospital, Harvard Medical School); Maria Gabrani (IBM Research); Orcun Goksel (ETH Zurich)
8081 Towards Learning Neural Representations from Shadows Kushagra Tiwary (MIT)*; Tzofi M Klinghoffer (Massachusetts Institute of Technology); Ramesh Raskar (Massachusetts Institute of Technology)
8086 Augmenting Deep Classifiers with Polynomial Neural Networks Grigorios Chrysos (EPFL)*; Markos Georgopoulos (Imperial College London); Jiankang Deng (Imperial College London); Jean Kossaifi (NVIDIA); Yannis Panagakis (University of Athens); Animashree Anandkumar (Caltech)
8092 AdaBest: Minimizing Client Drift in Federated Learning via Adaptive Bias Estimation Farshid Varno (Dalhousie/Imagia)*; Marzie Saghayi (Dalhousie University); Laya Rafiee Sevyeri (Concordia); Sharut Gupta (MILA, Imagia, Indian Institute of Technology Delhi (IIT Delhi)); Stan Matwin (Dalhousie University); Mohammad Havaei (Imagia)
8094 A Simple Approach and Benchmark for 21,000-Category Object Detection Yutong Lin (Xi’an Jiaotong University); Chen Li (Xi’an Jiaotong University); Yue Cao (Microsoft Research); Zheng Zhang (MSRA); Jianfeng Wang (Microsoft); Lijuan Wang (Microsoft); Zicheng Liu (Microsoft); Han Hu (Microsoft Research Asia)*
8106 Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach Jiseok Youn (Seoul National University)*; Jaehun Song (Seoul National University); Hyung-Sin Kim (Seoul National University); Saewoong Bahk (Seoul National University)
8140 Learning with Noisy Labels by Efficient Transition Matrix Estimation to Combat Label Miscorrection Seong Min Kye (KAIST); Kwanghee Choi (Sogang University); Joonyoung Yi (Hyperconnect); Buru Chang (Hyperconnect)*
8170 Online Task-free Continual Learning with Dynamic Sparse Distributed Memory Julien Pourcel (ENSEA)*; Ngoc-Son Vu (ETIS/Université Paris Seine, Université Cergy-Pontoise, ENSEA, CNRS/ 95000-Cergy); Robert M French (CNRS)