Paper ID |
Paper Title |
Authors |
8 |
Learning Uncoupled-Modulation CVAE for 3D Action-Conditioned Human Motion Synthesis |
Chongyang Zhong (Institute of Computing Technology, Chinese Academy of Sciences)*; Lei Hu (Institute of Computing Technology, Chinese Academy of Sciences); Zihao Zhang (Institute of Computing Technology, Chinese Academy of Sciences); Shihong Xia (Institute of Computing Technology of the Chinese Academy of Sciences) |
16 |
Generative Domain Adaptation for Face Anti-Spoofing |
Qianyu Zhou (Shanghai Jiao Tong University)*; Ke-Yue Zhang (YouTu Lab, Tencent); Taiping Yao (Tencent YouTu); Ran Yi (Shanghai Jiao Tong University); Kekai Sheng (Youtu Lab, Tencent Inc.); Shouhong Ding (Tencent); Lizhuang Ma (Shanghai Jiao Tong University) |
19 |
Learning Depth from Focus in the Wild |
Changyeon Won (GIST)*; Hae-Gon Jeon (GIST) |
34 |
Relighting4D: Neural Relightable Human from Videos |
Zhaoxi Chen (Nanyang Technological University)*; Ziwei Liu (Nanyang Technological University) |
46 |
PPT: token-Pruned Pose Transformer for monocular and multi-view human pose estimation |
Haoyu Ma (University of California, Irvine)*; Zhe Wang (UC-Irvine); Yifei Chen (Tencent); Deying Kong (university of california, irvine); Liangjian Chen (Reality Labs); Xingwei Liu (University of California Irvine); Xiangyi Yan (University of California, Irvine); Hao Tang (University of California Irvine); Xiaohui Xie (University of California, Irvine) |
52 |
Understanding the Dynamics of DNNs Using Graph Modularity |
Yao Lu (Zhejiang University of Technology)*; Wen Yang (Zhejiang University of Technology); Yunzhe Zhang (Zhejiang University of Technology); Zuohui Chen (Zhejiang University of Technology); Jinyin Chen (Zhejiang University of Technology); Qi Xuan (Zhejiang University of Technology); Zhen Wang (Northwestern Polytechnical University); Xiaoniu Yang (Zhejiang University of Technology; Science and Technology on Communication Information Security Control Laboratory) |
65 |
Discriminability-Transferability Trade-Off: An Information-Theoretic Perspective |
Quan Cui (Waseda University)*; Bingchen Zhao (University of Edinburgh); Zhao-Min Chen (NanJing University); Borui Zhao (Megvii Technology); Renjie Song (Megvii Inc.); Boyan Zhou (ByteDance); Jiajun Liang (Megvii); Osamu Yoshie (Waseda University) |
69 |
Learning-based Point Cloud Registration for 6D Object Pose Estimation in the Real World |
Zheng Dang (EPFL)*; Lizhou Wang (Xi’an Jiaotong University); Yu Guo (School of Software Engineering, Xi’an Jiaotong University); Mathieu Salzmann (EPFL) |
74 |
AvatarPoser: Articulated Full-Body Pose Tracking from Sparse Motion Sensing |
Jiaxi Jiang (ETH Zurich)*; Paul Streli (ETH Zurich); Huajian Qiu (EPFL); Andreas R Fender (ETH Zurich); Larissa Laich (Facebook Reality Labs); Patrick Snape (Meta); Christian Holz (ETH Zürich) |
75 |
Knowledge Condensation Distillation |
chenxin li (Xiamen University)*; Mingbao Lin (Xiamen University, China); Zhiyuan Ding (Xiamen University); Nie Lin (Hunan University); Yihong Zhuang (Xiamen University); Yue Huang (Xiamen University); Xinghao Ding (Xiamen University); Liujuan Cao (Xiamen University) |
83 |
CAR: Class-aware Regularizations for Semantic Segmentation |
Ye Huang (University of Technology Sydney)*; Di Kang (Tencent); Liang Chen (Fujian Normal University); Xuefei Zhe (Tencent AI lab); Wenjing Jia (University of Technology Sydney); Linchao Bao (Tencent AI Lab); Xiangjian He (University of Nottingham Ningbo China) |
86 |
Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation |
Yuyang Zhao (National University of Singapore)*; Zhun Zhong (University of Trento); Na Zhao (NUS); Nicu Sebe (University of Trento); Gim Hee Lee (National University of Singapore) |
88 |
Reducing Information Loss for Spiking Neural Networks |
Yufei Guo (The Second Academy of China Aerospace Science and Industry Corporation)*; Yuanpei Chen (X LAB, The Second Academy of CASIC, Beijing); Liwen Zhang (X Lab, the Second Academy of CASIC, Beijing); YingLei Wang (CASIC); Xiaode Liu (X Lab, The Second Academy of China Aerospace Science and Industry Corporation); Xinyi Tong (The Second Academy of China Aerospace Science and Industry Corporation); Yuanyuan Ou (Chongqing University); Xuhui Huang (X Lab, The Second Academy of CASIC); Zhe Ma (Xlab, the Second Academy of CASIC, Beijing) |
95 |
Real-Time Intermediate Flow Estimation for Video Frame Interpolation |
Zhewei Huang (MEGVII)*; Tianyuan Zhang (Carnegie Mellon University); Wen Heng (Megvii inc.); Boxin Shi (Peking University); Shuchang Zhou (MEGVII Technology) |
101 |
Class-incremental Novel Class Discovery |
Subhankar Roy (University of Trento); Mingxuan Liu (University of Trento); Zhun Zhong (University of Trento)*; Nicu Sebe (University of Trento); Elisa Ricci (University of Trento) |
103 |
PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation |
Jing He (Xiamen university)*; Yiyi Zhou (Xiamen University); Qi Zhang (Tencent); Jun Peng (Xiamen University); Yunhang Shen (Xiamen University); Xiaoshuai Sun (Xiamen University); Chao Chen (Youtu Laboratory); Rongrong Ji (Xiamen University, China) |
107 |
Minimal Neural Atlas: Parameterizing Complex Surfaces with Minimal Charts and Distortion |
Weng Fei Low (National University of Singapore)*; Gim Hee Lee (National University of Singapore) |
121 |
Contrastive Prototypical Network with Wasserstein Confidence Penalty |
Haoqing Wang (Peking University)*; Zhi-Hong Deng (Peking University) |
123 |
Privacy-Preserving Face Recognition with Learnable Privacy Budgets in Frequency Domain |
Jiazhen Ji (Tencent)*; Huan Wang (Xiamen University); Yuge Huang (Tencent YouTu); Jiaxiang Wu (Tencent); Xingkun Xu (Tencent); Shouhong Ding (Tencent); ShengChuan Zhang (Xiamen University); Liujuan Cao (Xiamen University); Rongrong Ji (Xiamen University, China) |
127 |
An End-to-End Transformer Model for Crowd Localization |
Dingkang Liang (Huazhong University of Science and Technology)*; Wei Xu (Beijing University of Posts and Telecommunications); Xiang Bai (Huazhong University of Science and Technology) |
132 |
Deformable Feature Aggregation for Dynamic Multi-Modal 3D Object Detection |
Zehui Chen (University of Science and Technology of China); Zhenyu Li (Harbin Institute of Technology); Shiquan Zhang (SenseTime Research); Liangji Fang (Sensetime Research); Qinhong Jiang (SenseTime Research; Shanghai AI Laboratory); Feng Zhao (University of Science and Technology of China)* |
140 |
Masked Generative Distillation |
Zhendong Yang (Graduate school at ShenZhen, Tsinghua university)*; Zhe Li (Bytedance Inc.); Shao Mingqi (Graduate school at ShenZhen, Tsinghua university); Dachuan Shi (Graduate school at ShenZhen, Tsinghua University); Zehuan Yuan (Bytedance.Inc); Chun Yuan (Graduate school at ShenZhen, Tsinghua university) |
145 |
Saliency Hierarchy Modeling via Generative Kernels for Salient Object Detection |
Wenhu Zhang (Zhejiang University)*; Liangli Zheng (Zhejiang University); Huanyu Wang (Zhejiang University); Xintian Wu (Zhejiang University); Xi Li (Zhejiang University) |
154 |
Tip-Adapter: Training-free Adaption of CLIP for Few-shot Classification |
Renrui Zhang (Shanghai AI Lab)*; Zhang Wei (Shanghai AI-Lab); Rongyao Fang (Chinese University of Hong Kong); Peng Gao (Chinese university of hong kong); Kunchang Li (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Jifeng Dai (SenseTime); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Hongsheng Li (The Chinese University of Hong Kong) |
160 |
Temporal Lift Pooling for Continuous Sign Language Recognition |
Lianyu Hu (Tianjin University)*; Liqing Gao (College of Intelligence and Computing, Tianjin University); Zekang Liu (College of Intelligence and Computing, Tianjin University); Wei Feng (College of Intelligence and Computing, Tianjin University, China) |
167 |
MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes |
Yang Jiao (Fudan University)*; Shaoxiang Chen (Fudan University); Zequn Jie (Meituan inc.); Jingjing Chen (Fudan University); Lin Ma (Meituan); Yu-Gang Jiang (Fudan University) |
171 |
JPEG Artifacts Removal via Contrastive Representation Learning |
Xi Wang (University of Science and Technology of China); Xueyang Fu (University of Science and Technology of China)*; Yurui Zhu (University of Science and Technology of China); Zheng-Jun Zha (University of Science and Technology of China) |
180 |
Tackling Long-Tailed Category Distribution Under Domain Shifts |
Xiao Gu (Imperial College London)*; Yao Guo (Shanghai Jiao Tong University); Zeju Li (Imperial College London); Jianing Qiu (Imperial College London); DOU QI (The Chinese University of Hong Kong); Yuxuan Liu (Institute of Medical Robotics, Shanghai Jiao Tong University); Benny P L Lo (Imperial College London); Guang-Zhong Yang (SJTU) |
184 |
WeLSA: Learning To Predict 6D Pose From Weakly Labeled Data Using Shape Alignment |
Shishir Reddy Vutukur (TU Munich / Siemens Technology)*; Ivan Shugurov (TU Munich / Siemens Corporate Technology); Benjamin Busam (Technical University of Munich); ANDREAS HUTTER (Siemens Corporate Technology, Germany); Slobodan Ilic (TUM) |
190 |
Fine-grained Data Distribution Alignment for Post-Training Quantization |
Yunshan Zhong (xiamen university)*; Mingbao Lin (Xiamen University, China); Mengzhao Chen (Xiamen University); Ke Li (Tencent); Yunhang Shen (Xiamen University); Fei Chao (Xiamen University); Yongjian Wu (Tencent Technology (Shanghai) Co.,Ltd); Rongrong Ji (Xiamen University, China) |
192 |
Few-shot Single-view 3D Reconstruction with Memory Prior Contrastive Network |
Zhen Xing (Fudan University)*; Yijiang Chen (Fudan University); Zhixin Ling (Fudan University); Xiangdong Zhou (Fudan University); Yu Xiang (The University of Texas at Dallas) |
194 |
ExtrudeNet: Unsupervised Inverse Sketch-and-Extrude for Shape Parsing |
Daxuan Ren (Nanyang Technological University)*; Jianmin Zheng (Nanyang Technological University); Jianfei Cai (Monash University); jiatong j li (Sensetime); Junzhe Zhang (Nanyang Technological University) |
196 |
P-STMO: Pre-Trained Spatial Temporal Many-to-One Model for 3D Human Pose Estimation |
Wenkang Shan (Peking University)*; Zhenhua Liu (Peking University); xinfeng zhang (University of Chinese Academy of Sciences); Shanshe Wang (Peking University); Siwei Ma (Peking University, China); Wen Gao (PKU) |
205 |
Contrast-Phys: Unsupervised Video-based Remote Physiological Measurement via Spatiotemporal Contrast |
Zhaodong Sun (University of Oulu)*; Xiaobai Li (University of Oulu) |
222 |
Panoptic Scene Graph Generation |
Jingkang Yang (Nanyang Technological University)*; Yi Zhe Ang (Nanyang Technological University); Zujin GUO (Nanyang Technological University); Kaiyang Zhou (Nanyang Technological University); Wayne Zhang (SenseTime Research); Ziwei Liu (Nanyang Technological University) |
247 |
StyleSwap: Style-Based Generator Empowers Robust Face Swapping |
Zhiliang Xu (Baidu Inc.); Hang Zhou (The Chinese University of Hong Kong)*; Zhibin Hong (Baidu Inc.); Ziwei Liu (Nanyang Technological University); Jiaming Liu (Baidu Inc.); zhizhi guo (Department of Computer Vision Technology (VIS), Baidu Inc); Junyu Han (Baidu Inc.); jingtuo liu (baidu); Errui Ding (Baidu Inc.); Jingdong Wang (Baidu) |
248 |
Boosting Event Stream Super-Resolution with A Recurrent Neural Network |
Wenming Weng (University of Science and Technology of China)*; Yueyi Zhang (University of Science and Technology of China); Zhiwei Xiong (University of Science and Technology of China) |
249 |
Unknown-Oriented Learning for Open Set Domain Adaptation |
jie liu (City University of Hong Kong)*; Xiaoqing Guo (City University of Hong Kong); Yixuan YUAN (City University of Hong Kong) |
255 |
Unpaired Deep Image Dehazing Using Contrastive Disentanglement Learning |
Xiang Chen (Nanjing University of Science and Technology)*; Zhentao Fan (Shenyang Aerospace University); Pengpeng Li (Dalian Polytechnic University); Longgang Dai (Shenyang Aerospace University); Caihua Kong (Shenyang Aerospace University); Zhuoran Zheng (Nanjing University of Science and Technology); Yufeng Huang (Shenyang Aerospace University); Yufeng Li (Shenyang Aerospace University) |
263 |
Check and Link: Pairwise Lesion Correspondence Guides Mammogram Mass Detection |
Ziwei Zhao (Peking University)*; Dong Wang (Peking University); Yihong Chen (Peking University); Ziteng Wang (Yizhun-ai); Liwei Wang (Peking University) |
265 |
Generative Subgraph Contrast for Self-Supervised Graph Representation Learning |
yuehui han (njust)*; Le Hui (Nanjing University of Science and Technology); Haobo Jiang (Nanjing University of Science and Technology); Jianjun Qian (Nanjing University of Science and Technology); Jin Xie (Nanjing University of Science and Technology) |
267 |
DVS-Voltmeter: Stochastic Process-based Event Simulator for Dynamic Vision Sensors |
SongNan Lin (Nanyang Technological University)*; Ye Ma (McGill University); Zhenhua Guo (Alibaba Group); Bihan Wen (Nanyang Technological University) |
268 |
Prototype-Guided Continual Adaptation for Class-Incremental Unsupervised Domain Adaptation |
Hongbin Lin (South China University of Technology); Yifan Zhang (National University of Singapore); Zhen Qiu (South China University of Technology); Shuaicheng Niu (South China University of Technology); Chuang Gan (MIT-IBM Watson AI Lab); Yanxia Liu (South China University of Technology); Mingkui Tan (South China University of Technology)* |
283 |
SiRi: A Simple Selective Retraining Mechanism for Transformer-based Visual Grounding |
Mengxue Qu (Beijing Jiaotong University)*; Yu Wu (Princeton University); Wu Liu (AI Research of JD.com); Qiqi Gong (Beijing Jiaotong University); Xiaodan Liang (Sun Yat-sen University); Olga Russakovsky (Princeton University); Yao Zhao (Beijing Jiaotong University); Yunchao Wei (UTS) |
287 |
Benchmarking Omni-Vision Representation through the Lens of Visual Realms |
Yuanhan Zhang (Nanyang Technological University); Zhenfei Yin (Sensetime); Jing Shao (Sensetime); Ziwei Liu (Nanyang Technological University)* |
291 |
Paint2Pix: Interactive Painting based Progressive Image Synthesis and Editing |
Jaskirat Singh (Australian National University)*; Liang Zheng (Australian National University); Cameron Y Smith (Adobe Research); Jose Echevarria (Adobe System Inc.) |
296 |
BEAT: A Large-Scale Semantic and Emotional Multi-Modal Dataset for Conversational Gestures Synthesis |
Haiyang Liu (The University of Tokyo)*; Zihao Zhu (Keio University); Naoya Iwamoto (Huawei Technologies Japan K.K.); Yichen Peng (Japan Advanced Institute of Science and Technology); Zhengqing Li (Huawei Japan K.K.); YOU ZHOU (Tokyo Research Center, Huawei); Elif Bozkurt (Huawei Turkey R&D Center, Istanbul, Turkey); Bo Zheng (Huawei) |
300 |
Active Pointly-Supervised Instance Segmentation |
Chufeng Tang (Tsinghua University)*; Lingxi Xie (Huawei Inc.); Gang Zhang (Tsinghua University); xiaopeng zhang (Huawei Cloud EI); Qi Tian (Huawei Cloud & AI); Xiaolin Hu (Tsinghua University) |
303 |
DecoupleNet: Decoupled Network for Domain Adaptive Semantic Segmentation |
Xin Lai (The Chinese University of Hong Kong)*; Zhuotao Tian (The Chinese University of Hong Kong); Xiaogang XU (The Chinese University of Hong Kong); Yingcong Chen (Hong Kong University of Science and Technology); Shu Liu (SmartMore); Hengshuang Zhao (University of Oxford); Liwei Wang (CUHK); Jiaya Jia (Chinese University of Hong Kong) |
315 |
ByteTrack: Multi-Object Tracking by Associating Every Detection Box |
Yifu Zhang (Huazhong University of Science and Technology); Peize Sun (The University of Hong Kong); Yi Jiang (Bytedance); Dongdong Yu (ByteDance Inc.); Fucheng Weng (Huazhong University of Science and Technology); Zehuan Yuan (Bytedance.Inc); Ping Luo (The University of Hong Kong); Wenyu Liu (Huazhong University of Science and Technology); Xinggang Wang (Huazhong University of Science and Technology)* |
317 |
Robust Multi-Object Tracking by Marginal Inference |
Yifu Zhang (Huazhong University of Science and Technology); Chunyu Wang (Microsoft Research asia); Xinggang Wang (Huazhong University of Science and Technology)*; Wenjun Zeng (EIT Institute for Advanced Study); Wenyu Liu (Huazhong University of Science and Technology) |
322 |
Doubly-Fused ViT: Fuse Information from Vision Transformer Doubly with Local Representation |
Li Gao (Wuhan University)*; Dong Nie (UNC); Bo Li (Alibaba Group); Xiaofeng Ren (alibaba group) |
326 |
CATRE: Iterative Point Clouds Alignment for Category-level Object Pose Refinement |
Xingyu Liu (Tsinghua University); Gu Wang (JD.COM); Yi Li (University of Washington); Xiangyang Ji (Tsinghua University)* |
334 |
Spatiotemporal Self-attention Modeling with Temporal Patch Shift for Action Recognition |
Wangmeng Xiang (The Hong Kong Polytechnic University)*; Chao Li (Alibaba); Biao Wang (Alibaba); Xihan Wei (Alibaba); Xian-Sheng Hua (Damo Academy, Alibaba Group); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China) |
339 |
Efficient Long-Range Attention Network for Image Super-resolution |
Xindong Zhang (The Hong Kong Polytechnic University)*; Hui Zeng (OPPO); Shi Guo (The Hong Kong Polytechnic University); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China) |
343 |
DID-M3D: Decoupling Instance Depth for Monocular 3D Object Detection |
Liang Peng (ZJU)*; Xiaopei Wu (Zhejiang University); Zheng Yang (FABU); Haifeng Liu (ZJU); Deng Cai (ZJU) |
349 |
FlowFormer: A Transformer Architecture for Optical Flow |
Zhaoyang Huang (Chinese University of Hong Kong)*; Xiaoyu Shi (CUHK); Chao Zhang (Samsung Telecommunication Research Institute); Qiang Wang (Samsung Research China, Beijing); Ka Chun Cheung (Nvidia); Hongwei Qin (Sensetime); Jifeng Dai (SenseTime); Hongsheng Li (The Chinese University of Hong Kong) |
357 |
Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction |
Yuanhao Cai (Tsinghua University, Tsinghua Shenzhen International Graduate School); Jing Lin (Tsinghua University, Tsinghua Shenzhen International Graduate School)*; Xiaowan Hu (Tsinghua University, Tsinghua Shenzhen International Graduate School); Haoqian Wang (Tsinghua Shenzhen International Graduate School, Tsinghua University); Xin Yuan (Westlake University); Yulun Zhang (ETH Zurich); Radu Timofte (University of Wurzburg & ETH Zurich); Luc Van Gool (ETH Zurich) |
358 |
An Embedded Feature Whitening Approach to Deep Neural Network Optimization |
Hongwei Yong (The Hong Kong Polytechnic University)*; Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China) |
361 |
Optimization over Disentangled Encoding: Unsupervised Cross-Domain Point Cloud Completion via Occlusion Factor Manipulation |
Jingyu Gong (Shanghai Jiao Tong University)*; Fengqi Liu (Shanghai Jiao Tong University); Jiachen Xu (Shanghai Jiao Tong University); Min Wang (Sensetime Group); Xin Tan (Shanghai Jiao Tong University); Zhizhong Zhang (East China Normal University); Ran Yi (Shanghai Jiao Tong University); Haichuan Song (East China Normal University); Yuan Xie (East China Normal University); Lizhuang Ma (Shanghai Jiao Tong University) |
362 |
Source-Free Domain Adaptation with Contrastive Domain Alignment and Self-supervised Exploration for Face Anti-Spoofing |
Yuchen Liu (Shanghai Jiao Tong University)*; Yabo Chen (Shanghai Jiao Tong University); Wenrui Dai (Shanghai Jiao Tong University); Mengran Gou (Qualcomm); Chun-Ting Huang (Qualcomm); Hongkai Xiong (Shanghai Jiao Tong University) |
368 |
MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection |
Xuesong Chen (The Chinese University of Hong Kong)*; Shaoshuai Shi (MPI Informatics); Benjin Zhu (MEGVII); Ka Chun Cheung (Nvidia); Hang Xu (Huawei Noah’s Ark Lab); Hongsheng Li (The Chinese University of Hong Kong) |
379 |
SdAE: Self-distillated Masked Autoencoder |
Yabo Chen (Shanghai Jiao Tong University); Yuchen Liu (Shanghai Jiao Tong University); Dongsheng Jiang (Huawei Cloud & AI); xiaopeng zhang (Huawei Cloud EI)*; Wenrui Dai (Shanghai Jiao Tong University); Hongkai Xiong (Shanghai Jiao Tong University); Qi Tian (Huawei Cloud & AI) |
383 |
A Transformer-based Decoder for Semantic Segmentation with Multi-level Context Mining |
Bowen Shi (Shanghai Jiao Tong University)*; Dongsheng Jiang (Huawei Cloud & AI); xiaopeng zhang (Huawei Cloud EI); Han Li (Shanghai Jiao Tong University); Wenrui Dai (Shanghai Jiao Tong University); Junni Zou (Shanghai Jiao Tong University); Hongkai Xiong (Shanghai Jiao Tong University); Qi Tian (Huawei Cloud & AI) |
399 |
Graph-constrained Contrastive Regularization for Semi-weakly Volumetric Segmentation |
Simon Reiß (Karlsruhe Institute of Technology)*; Constantin Marc Seibold (Karlsruhe Institute of Technology); Alexander Freytag (Carl Zeiss AG, Jena, Germany); Rodner Erik (University of Applied Sciences Berlin); Rainer Stiefelhagen (Karlsruhe Institute of Technology) |
401 |
Improving Vision Transformers by Revisiting High-frequency Components |
Jiawang Bai (Tsinghua University)*; Li Yuan (Peking University); Shu-Tao Xia (Tsinghua University); Shuicheng Yan (Sea AI Labs); Zhifeng Li (Tencent AI Lab); Wei Liu (Tencent) |
405 |
Adaptive Co-Teaching for Unsupervised Monocular Depth Estimation |
Weisong Ren (Dalian University of Technology); Lijun Wang (Dalian University of Technology)*; Yongri Piao (Dalian University of Technology); Miao Zhang (Dalian University of Technology); Huchuan Lu (Dalian University of Technology); Ting Liu (Alibaba) |
408 |
FurryGAN: High quality foreground-aware image synthesis |
Jeongmin Bae (Yonsei University); Mingi Kwon (Yonsei University); Youngjung Uh (Yonsei University)* |
433 |
An Efficient Spatio-Temporal Pyramid Transformer for Action Detection |
Yuetian Weng (Monash University); Zizheng Pan (Monash University); Mingfei Han (Monash University; DATA61, CSIRO); Xiaojun Chang (University of Technology Sydney); Bohan Zhuang (Monash University)* |
434 |
LocVTP: Video-Text Pre-training for Temporal Localization |
Meng Cao (Peking University); Tianyu Yang (Tencent AI Lab); Junwu Weng (Tencent AI Lab); Can Zhang (Peking University); Jue Wang (Tencent AI Lab); Yuexian Zou (Peking University)* |
444 |
Fusing Local Similarities for Retrieval-based 3D Orientation Estimation of Unseen Objects |
Chen Zhao (EPFL)*; Yinlin Hu (EPFL); Mathieu Salzmann (EPFL) |
458 |
Online Segmentation of LiDAR Sequences: Dataset and Algorithm |
Romain Loiseau (École des ponts ParisTech)*; Mathieu Aubry (École des ponts ParisTech); loic landrieu (IGN) |
460 |
MVSTER: Epipolar Transformer for Efficient Multi-View Stereo |
Xiaofeng Wang (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences)*; Zheng Zhu (Tsinghua University); Guan Huang (Institute of Automation, Chinese Academy of Sciences); Fangbo Qin (Institute of Automation, Chinese Academy of Sciences); Yun Ye (XForwardAI Technology Co., Ltd, Beijing, China); Yijia He (Beijing Kuaishou Technology Co., Ltd); Xu Chi (Phigent Robotics); Xingang Wang (Institute of Automation, CAS) |
463 |
Unsupervised Learning of 3D Semantic Keypoints with Mutual Reconstruction |
Haocheng Yuan (Northwestern Polytechnical University); Chen Zhao (EPFL); Shichao Fan (Northwestern Polytechnical University); Jiaxi Jiang (Northwestern Polytechnical University); Jiaqi Yang (Northwestern Polytechnical University)* |
482 |
Generalizable Medical Image Segmentation via Random Amplitude Mixup and Domain-Specific Image Restoration |
Ziqi Zhou (Nanjing University)*; Lei Qi (Southeast University); Yinghuan Shi (Nanjing University) |
499 |
Demystifying Unsupervised Semantic Correspondence Estimation |
Mehmet Aygün (The University of Edinburgh)*; Oisin Mac Aodha (University of Edinburgh) |
513 |
Learning Shadow Correspondence for Video Shadow Detection |
Xinpeng Ding (The Hong Kong University of Science and Technology); Jingwen Yang (The Hong Kong University of Science and Technology); Xiaowei Hu (Shanghai AI Laboratory); Xiaomeng Li (The Hong Kong University of Science and Technology)* |
514 |
PolarMOT: How far can geometric relations take us in 3D multi-object tracking? |
Aleksandr Kim (Technical University of Munich); Guillem Brasó (TUM); Aljosa Osep (TUM Munich)*; Laura Leal-Taixé (TUM) |
516 |
Few-Shot End-to-End Object Detection via Constantly Concentrated Encoding across Heads |
Jiawei Ma (Columbia University)*; Guangxing Han (Columbia University); Shiyuan Huang (Columbia University); Yuncong Yang (Columbia University); Shih-Fu Chang (Columbia University) |
525 |
MVDECOR: Multi-view Dense Correspondence Learning for Fine-grained 3D Segmentation |
Gopal Sharma (University of Massachusetts Amherst)*; Kangxue Yin (NVIDIA); Subhransu Maji (University of Massachusetts, Amherst); Evangelos Kalogerakis (UMass Amherst); Or Litany (NVIDIA); Sanja Fidler (University of Toronto, NVIDIA) |
537 |
Implicit Neural Representations for Image Compression |
Yannick Strümpler (ETH Zürich)*; Janis Postels (ETH Zurich); Ren Yang (ETH Zurich); Luc Van Gool (ETH Zurich); Federico Tombari (Google, TU Munich) |
541 |
Cross-modal Prototype Driven Network for Radiology Report Generation |
Jun Wang (University of Warwick)*; Abhir Bhalerao (University of Warwick); Yulan He (University of Warwick) |
556 |
Scene Text Recognition with Permuted Autoregressive Sequence Models |
Darwin Bautista (University of the Philippines)*; Rowel Atienza (University of the Philippines) |
568 |
XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model |
Ho Kei Cheng (University of Illinois Urbana-Champaign)*; Alexander Schwing (UIUC) |
570 |
SUPR: A Sparse Unified Part-Based Human Body Model |
Ahmed A A Osman (Max Planck Institute for Intelligent Systems)*; Michael J. Black (Max Planck Institute for Intelligent Systems); Timo Bolkart (Max Planck Institute for Intelligent Systems); Dimitrios Tzionas (University of Amsterdam) |
575 |
SCAM! Transferring humans between images with Semantic Cross Attention Modulation |
Nicolas Dufour (ENPC)*; David Picard (ENPC); Vicky Kalogeiton (Ecole Polytechnique) |
583 |
Q-FW: A Hybrid Classical-Quantum Frank-Wolfe for Quadratic Binary Optimization |
Alp Yurtsever (Umeå University); Tolga Birdal (TU Munich)*; Vladislav Golyanik (MPI for Informatics) |
584 |
Revisiting Point Cloud Simplification: A Learnable Feature Preserving Approach |
Rolandos Alexandros Potamias (Imperial College London)*; Giorgos Bouritsas (Imperial College London); Stefanos Zafeiriou (Imperial College London) |
599 |
Neural Architecture Search for Spiking Neural Networks |
Youngeun Kim (Yale University)*; Yuhang Li (Yale University); Hyoungseob Park (Yale University); Yeshwanth Venkatesha (Yale university); Priyadarshini Panda (Yale University) |
601 |
Neuromorphic Data Augmentation for Training Spiking Neural Networks |
Yuhang Li (Yale University)*; Youngeun Kim (Yale University); Hyoungseob Park (Yale University); Tamar Geller (Yale University); Priyadarshini Panda (Yale University) |
602 |
RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild |
Jason Y Zhang (Carnegie Mellon University)*; Deva Ramanan (Carnegie Mellon University); Shubham Tulsiani (Carnegie Mellon University) |
609 |
Human Trajectory Prediction via Neural Social Physics |
Jiangbei Yue (Leeds University); Dinesh Manocha (University of Maryland at College Park)*; He Wang (Leeds University) |
615 |
Explicit Occlusion Reasoning for Multi-person 3D Human Pose Estimation |
Qihao Liu (Johns Hopkins University); Yi Zhang (Johns Hopkins University); Song Bai (University of Oxford); Alan Yuille (Johns Hopkins University)* |
626 |
R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis |
Huan Wang (Northeastern University); Jian Ren (Snap Inc.); Zeng Huang (Snap Inc.)*; Kyle B Olszewski (Snap Inc.); Menglei Chai (Snap Inc.); YUN FU (Northeastern University); Sergey Tulyakov (Snap Inc) |
629 |
Towards Open Set Video Anomaly Detection |
Yuansheng Zhu (Rochester Institute of Technology)*; Wentao Bao (Rochester Institute of Technology); Qi Yu (Rochester Institute of Technology) |
634 |
Object-Compositional Neural Implicit Surfaces |
Qianyi Wu (Monash University)*; Xian Liu (The Chinese University of Hong Kong); Yuedong Chen (Monash University); Kejie Li (University of Oxford); Chuanxia Zheng (Monash University); Jianfei Cai (Monash University); Jianmin Zheng (Nanyang Technological University) |
636 |
Sem2NeRF: Converting Single-View Semantic Masks to Neural Radiance Fields |
Yuedong Chen (Monash University)*; Qianyi Wu (Monash University); Chuanxia Zheng (Monash University); Tat-Jen Cham (Nanyang Technological University); Jianfei Cai (Monash University) |
641 |
WaveGAN: Frequency-aware GAN for High-Fidelity Few-shot Image Generation |
Mengping Yang (East China University of Science and Technology)*; Zhe Wang (East China University of Science and Technology); Ziqiu Chi (East China University of Science and Technology); Wenyi Feng (East China University of Science and Technology) |
642 |
Class-Agnostic Object Counting Robust to Intraclass Diversity |
Shenjian Gong (Nanjing University of Science and Technology)*; Shanshan Zhang (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology); Dengxin Dai (MPI for Informatics); Bernt Schiele (MPI Informatics) |
650 |
TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts |
Chuan Guo (University of Alberta)*; Xinxin Zuo (University of Alberta); Sen Wang (University of Alberta); Li Cheng (ECE dept., University of Alberta) |
652 |
Self-Distillation for Robust LiDAR Semantic Segmentation in Autonomous Driving |
Jiale Li (Zhejiang University); Hang Dai (Mohamed bin Zayed University of Artificial Intelligence)*; Yong Ding (Zhejiang University) |
654 |
Semi-Supervised Monocular 3D Object Detection by Multi-View Consistency |
Qing Lian (Hong Kong University of Science and Technology)*; Yanbo XU (The Hong Kong University of Science and Technology); Weilong Yao (Shanghai Xiantu Intelligent Technology Co., Ltd.); Yingcong Chen (Hong Kong University of Science and Technology); Tong Zhang (Hong Kong University of Science and Technology) |
655 |
Lidar Point Cloud Guided Monocular 3D Object Detection |
Liang Peng (ZJU)*; Fei Liu (Zhejiang University); Zhengxu Yu (Zhejiang University); Senbo Yan (Zhejiang University); Dan Deng (FABU); Zheng Yang (FABU); Haifeng Liu (ZJU); Deng Cai (ZJU) |
656 |
Structural Causal 3D Reconstruction |
Weiyang Liu (University of Cambridge)*; Zhen Liu (Mila, University of Montreal); Liam Paull (Université de Montréal); Adrian Weller (University of Cambridge); Bernhard Schölkopf (MPI for Intelligent Systems, Tübingen) |
671 |
KD-MVS: Knowledge Distillation Based Self-supervised Learning for Multi-view Stereo |
Yikang Ding (Tsinghua University)*; Qingtian Zhu (Peking University); Xiangyue Liu (Beihang University); Wentao Yuan (Peking University); Haotian Zhang (Megvii); Chi Zhang (Megvii Inc.) |
685 |
When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition |
Bohan Li (Huazhong University of Science and Technology)*; Ye Yuan (Tomorrow Advancing Life); Dingkang Liang (Huazhong University of Science and Technology); Xiao Liu (Tencent); zhilong ji (Tomorrow Advancing Life); Jinfeng Bai (TAL); Wenyu Liu (Huazhong University of Science and Technology); Xiang Bai (Huazhong University of Science and Technology) |
689 |
Shape Matters: Deformable Patch Attack |
Zhaoyu Chen (Fudan University); Bo Li (Nanjing University)*; Shuang Wu (Tencent); Jianghe Xu (Tencent Youtu Lab); Shouhong Ding (Tencent); Wenqiang Zhang (Fudan University) |
690 |
PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection |
Han Wang (Shanghai Jiao Tong University)*; Jun Tang (hikvision); Xiaodong Liu (Hikvision); Shanyan Guan (Shanghai Jiao Tong University); Rong Xie (Shanghai Jiao Tong University); Li Song (Shanghai Jiao Tong University) |
694 |
BEVFormer: Learning Bird-Eye-View Representations from Multi-View Images via Spatiotemporal Transformer |
Zhiqi Li (Nanjing University); Wenhai Wang (Nanjing University); Hongyang Li (SenseTime); Enze Xie (The University of Hong Kong); Chonghao Sima (Purdue University); Tong Lu (Nanjing University); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Jifeng Dai (SenseTime)* |
696 |
Detecting Tampered Scene Text in the Wild |
YuXin Wang (University of Science and Technology of China)*; Hongtao Xie (University of Science and Technology of China); Mengting Xing (University of Science and Technology of China); Jing Wang (Huawei Cloud & AI); Shenggao Zhu (Huawei); Yongdong Zhang (University of Science and Technology of China) |
702 |
Projective Parallel Single-pixel Imaging to Overcome Global Illumination in 3D Structure Light Scanning |
Yuxi Li (Beihang University)*; Huijie Zhao (Beihang University); Hongzhi Jiang (Beihang University); Xudong Li (Beihang University) |
709 |
CelebV-HQ: A Large-Scale Video Facial Attributes Dataset |
Hao Zhu (SenseTime Research)*; Wayne Wu (SenseTime Research); Wentao Zhu (Peking University); Liming Jiang (Nanyang Technological University); Siwei Tang (Sensetime research); Li Zhang (Sensetime); Ziwei Liu (Nanyang Technological University); Chen Change Loy (Nanyang Technological University) |
710 |
Open-world Semantic Segmentation for LIDAR Point Clouds |
Jun CEN (The Hong Kong University of Science and Technology)*; Peng YUN (Hong Kong University of Science and Technology); Shiwei Zhang (DAMO Academy, Alibaba Group); Junhao CAI (HKUST); Di LUAN (Hong Kong University of Science and Technology); Mingqian Tang (Alibaba Group); Michael Yu Wang (HKUST); Ming Liu (HKUST) |
721 |
Burn After Reading: Online Adaptation for Cross-domain Streaming Data |
Luyu Yang (University of Maryland, College Park)*; Mingfei Gao (Apple); Zeyuan Chen (Salesforce Research); Ran Xu (Salesforce Research); Abhinav Shrivastava (University of Maryland); Chetan Ramaiah (Salesforce Research) |
728 |
CLOSE: Curriculum Learning On the Sharing Extent Towards Better One-shot NAS |
Zixuan Zhou (Tsinghua University)*; Xuefei Ning (Tsinghua University); Yi Cai (Tsinghua University); Jiashu Han (None); Yiping Deng (Huawei); Yuhan Dong (Tsinghua University); Huazhong Yang (Tsinghua University); Yu Wang (Tsinghua University) |
734 |
RigNet: Repetitive Image Guided Network for Depth Completion |
Zhiqiang Yan (Nanjing University of Science and Technology)*; Kun Wang (Nanjing University of Science and Technology); Xiang Li (Nanjing University of Science and Technology); Zhenyu Zhang (Tencent); Jun Li (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology) |
744 |
Streamable Neural Fields |
Junwoo Cho (Sungkyunkwan University)*; Seungtae Nam (Sungkyunkwan University); Daniel Rho (Sungkyunkwan University); Jong Hwan Ko (Sungkyunkwan University); Eunbyung Park (Sungkyunkwan University) |
755 |
2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds |
Xu Yan (The Chinese University of Hong Kong, Shenzhen); Jiantao Gao (Shanghai University); Chaoda Zheng (The Chinese University of Hong Kong, Shen Zhen); chao zheng (Tencent); Ruimao Zhang (The Chinese University of Hong Kong, Shenzhen); Shuguang Cui (The Chinese University of Hong Kong, Shenzhen ); Zhen Li (The Chinese University of Hong Kong, Shenzhen)* |
762 |
Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification |
Yang Liu (Beihang University); Lei Zhou (Beihang University)*; Pengcheng Zhang (Beihang University); Xiao Bai (Beihang University); Lin Gu (RIKEN,AIP / The University of Tokyo); Xiaohan Yu (Griffith University); Jun Zhou (Griffith University); Hancock Edwin (University of York, UK) |
776 |
Mind the Gap in Distilling StyleGANs |
Guodong Xu (The Chinese University of Hong Kong)*; Yuenan HOU (Shanghai AI Lab); Ziwei Liu (Nanyang Technological University); Chen Change Loy (Nanyang Technological University) |
784 |
End-to-End Active Speaker Detection |
Juan C Leon (KAUST)*; Moritz Cordes (Leuphana University of Lüneburg); Chen Zhao (KAUST); Bernard Ghanem (KAUST) |
785 |
Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing |
Haoyue Cheng (Nanjing University); Zhaoyang Liu (SenseTime Research); Hang Zhou (The Chinese University of Hong Kong); Chen Qian (SenseTime); Wayne Wu (SenseTime Research); Limin Wang (Nanjing University)* |
790 |
Learn-to-Decompose: Cascaded Decomposition Network for Cross-Domain Few-Shot Facial Expression Recognition |
Xinyi Zou (Xiamen University); Yan Yan (Xiamen University)*; Jing-Hao Xue (University College London); Si Chen (Xiamen University of Technology); Hanzi Wang (Xiamen University) |
798 |
Learning with Recoverable Forgetting |
Jingwen Ye (National University of Singapore)*; Fu Yifang (National University of Singapore); Jie Song (Zhejiang University); Xingyi Yang (National University of Singapore); Songhua Liu (National University of Singapore); Xin Jin (University of Science and Technology of China); Mingli Song (Zhejiang University); Xinchao Wang (National University of Singapore) |
800 |
Masked Autoencoders for Point Cloud Self-supervised Learning |
Yatian Pang (National University of Singapore); Wenxiao Wang (State Key Lab of CAD&CG, Zhejiang University); Francis EH Tay (National University of Singapore); Wei Liu (Tencent); Yonghong Tian (Peking University); Li Yuan (Peking University)* |
803 |
RamGAN: Region Attentive Morphing GAN for Region-Level Makeup Transfer |
Jianfeng Xiang (ShenZhen University)*; Junliang Chen (Shenzhen University); Wenshuang Liu (Shenzhen University); Xianxu Hou (Shenzhen University); Linlin Shen (Shenzhen University) |
807 |
Efficient One Pass Self-distillation with Zipf’s Label Smoothing |
Jiajun Liang (Megvii)*; Linze Li (MEGVII Technology); Zhaodong Bing (Megvii Technology); Borui Zhao (Megvii Technology); Yao Tang (Peking University); Bo Lin (MEGVII Technology); Haoqiang Fan (Megvii Inc(face++)) |
812 |
DaViT: Dual Attention Vision Transformers |
Mingyu Ding (The University of Hong Kong)*; Bin Xiao (Microsoft); Noel C Codella (Microsoft); Ping Luo (The University of Hong Kong); Jingdong Wang (Baidu); Lu Yuan (Microsoft) |
815 |
OneFace: One Threshold for All |
Jiaheng Liu (Beihang University); zhipeng yu (University of Chinese Academy of Sciences); Haoyu Qin (SenseTime); Yichao Wu (Sensetime Group Limited); Ding Liang (Sensetime Group Limited); Gangming Zhao (The University of Hong Kong); Ke Xu (Beihang University)* |
820 |
Semantic-Sparse Colorization Network for Deep Exemplar-based Colorization |
Yunpeng Bai (Tsinghua University )*; Chao Dong (SIAT); Zenghao Chai (Tsinghua University); Andong Wang (Tsinghua University); Zhengzhuo Xu (Tsinghua University); Chun Yuan (Graduate school at ShenZhen,Tsinghua university) |
822 |
Vibration-based Uncertainty Estimation for Learning from Limited Supervision |
Hengtong Hu (Hefei University of Technology)*; Lingxi Xie (Huawei Inc.); Xinyue Huo (University of Science and Technology of China); Richang Hong (HeFei University of Technology); Qi Tian (Huawei Cloud & AI) |
824 |
SOS! Self-supervised Learning Over Sets Of Handled Objects In Egocentric Action Recognition |
Victor A Escorcia (Samsung AI Center)*; Ricardo Guerrero (Samsung AI Center Cambridge); Xiatian Zhu (Samsung AI Centre); Brais Martinez (Samsung AI Center) |
829 |
FADE: Fusing the Assets of Decoder and Encoder for Task-Agnostic Upsampling |
Hao Lu (Huazhong University of Science and Technology); Wenze Liu (Huazhong university of science and technology); Hongtao Fu (Huazhong university of Science and Technology); Zhiguo Cao (Huazhong Univ. of Sci.&Tech.)* |
833 |
VTC: Improving Video-Text Retrieval with User Comments |
Laura Hanu (Unitary)*; James Thewlis (Unitary); Yuki M Asano (University of Amsterdam); Christian Rupprecht (University of Oxford) |
839 |
Less than Few: Self-Shot Video Instance Segmentation |
Pengwan Yang (University of Amsterdam)*; Yuki M Asano (University of Amsterdam); Pascal Mettes (University of Amsterdam); Cees Snoek (University of Amsterdam) |
841 |
End-to-End Visual Editing with a Generatively Pre-Trained Artist |
Andrew Brown (University of Oxford)*; Cheng-Yang Fu (Facebook.com); Omkar M Parkhi (Facebook); Tamara Berg (Facebook AI Research); Andrea Vedaldi (University of Oxford / Facebook AI Research) |
852 |
COUCH: Towards Controllable Human-chair Interactions |
Xiaohan Zhang (University of Tübingen, MPI Informatics); Bharat Lal Bhatnagar (University of Tübingen, MPI informatik); Sebastian Starke (University of Edinburgh); Vladimir Guzov (University of Tuebingen); Gerard Pons-Moll (University of Tübingen)* |
859 |
MovieCuts: A New Dataset and Benchmark for Cut Type Recognition |
Alejandro Pardo (KAUST)*; Fabian Caba (Adobe Research); Juan C Leon (KAUST); Ali K Thabet (Facebook); Bernard Ghanem (KAUST) |
877 |
High-fidelity GAN Inversion with Padding Space |
Qingyan Bai (Tsinghua University)*; Yinghao Xu (Chinese University of Hong Kong); Jiapeng Zhu (HKUST); Weihao Xia (University College London); Yujiu Yang (Tsinghua University); Yujun Shen (Dept. of IE, CUHK) |
893 |
LiDAL: Inter-frame Uncertainty Based Active Learning for 3D LiDAR Semantic Segmentation |
ZEYU HU (Hong Kong University of Science and Technology)*; Xuyang Bai (HKUST); Runze Zhang (Tencent); Xin Wang (Tencent); Guangyuan Sun (TENCENT); Hongbo Fu (City University of Hong Kong); Chiew-Lan Tai (Hong Kong University of Science & Technology) |
897 |
Optimal Boxes: Boosting End-to-End Scene Text Recognition by Adjusting Annotated Bounding Boxes via Reinforcement Learning |
Jingqun Tang (Ant Group)*; wenming qian (Huazhong University of Science and Technology); Luchuan Song (University of Science and Technology of China); Xiena Dong (Hangzhou Dianzi University); lan li (Whu Han University); Xiang Bai (Huazhong University of Science and Technology) |
912 |
Concurrent Subsidiary Supervision for Unsupervised Source-Free Domain Adaptation |
Jogendra Nath Kundu (Indian Institute of Science)*; Suvaansh Bhambri (Indian Institute of Science); Akshay R Kulkarni (Indian Institute of Science); Hiran Sarkar (Indian Institute of Science); Varun Jampani (Google); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science) |
913 |
Designing One Unified Framework for High-Fidelity Face Reenactment and Swapping |
Chao Xu (Zhejiang University)*; Jiangning Zhang (Zhejiang University); Yue Han (Zhejiang University); Guanzhong Tian (Ningbo Research Institute, Zhejiang University); xianfang zeng (Zhejiang University); Ying Tai (Tencent YouTu); Yabiao Wang (Tencent); Chengjie Wang (Tencent; Shanghai Jiao Tong University); Yong Liu (Zhejiang University) |
919 |
Category-Level 6D Object Pose and Size Estimation using Self-Supervised Deep Prior Deformation Networks |
Jiehong Lin (South China University of Technology)*; Zewei Wei (South China University of Technology); Changxing Ding (South China University of Technology); Kui Jia (South China University of Technology) |
927 |
Intrinsic Neural Fields: Learning Functions on Manifolds |
Lukas Koestler (Technical University of Munich)*; Daniel Grittner (Technische Universität München); Michael Moeller (University of Siegen); Daniel Cremers (TU Munich); Zorah Laehner (University of Siegen) |
930 |
LaMAR: Benchmarking Localization and Mapping for Augmented Reality |
Paul-Edouard Sarlin (ETH Zurich); Mihai Dusmanu (ETH Zurich)*; Johannes L Schönberger (Microsoft); Pablo Speciale (Microsoft); Lukas Gruber (Microsoft); Viktor Larsson (Lund University); Ondrej Miksik (Microsoft); Marc Pollefeys (ETH Zurich / Microsoft) |
933 |
3D Compositional Zero-shot Learning with DeCompositional Consensus |
Muhammad Ferjad Naeem (ETH Zürich)*; Evin Pınar Örnek (TU Munich); Yongqin Xian (ETH Zurich); Luc Van Gool (ETH Zurich); Federico Tombari (Google, TU Munich) |
939 |
Video Mask Transfiner for High-Quality Video Instance Segmentation |
Lei Ke (HKUST)*; Henghui Ding (ETH Zurich); Martin Danelljan (ETH Zurich); Yu-Wing Tai (Kuaishou Technology / HKUST); Chi-Keung Tang (Hong Kong University of Science and Technology); Fisher Yu (ETH Zurich) |
940 |
FashionViL: Fashion-Focused Vision-and-Language Representation Learning |
Xiao Han (University of Surrey)*; Licheng Yu (Facebook); Xiatian Zhu (University of Surrey); Li Zhang (Fudan University); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey) |
945 |
Adaptive Face Forgery Detection in Cross Domain |
Luchuan Song (University of Science and Technology of China)*; Zheng Fang (Beihang University); Xiaodan Li (Alibaba Group); Xiaoyi Dong (University of Science and Technology of China); Zhenchao Jin (University of Science and Technology of China); Yuefeng Chen (Alibaba Group); Siwei Lyu (University at Buffalo) |
958 |
LiP-Flow: Learning Inference-time Priors for Codec Avatars via Normalizing Flows in Latent Space |
Emre Aksan (ETH Zurich)*; Shugao Ma (Facebook); Akin Caliskan (Center for Vision Speech and Signal Processing – University of Surrey); Stanislav Pidhorskyi (Facebook Inc.); Alexander Richard (Facebook Reality Labs); Shih-En Wei (Facebook); Jason Saragih (Facebook); Otmar Hilliges (ETH Zurich) |
961 |
Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection |
Hongyu Zhou (Megvii)*; Songtao Liu (MEGVII); Zeming Li (Megvii(Face++) Inc); Jian Sun (Megvii Technology); Weixin Mao (waseda university); Zheng Ge (MEGVII Technology); haiyan yu (Harbin Institute of Technology) |
968 |
Metric Learning based Interactive Modulation for Real-World Super-Resolution |
Chong Mou (Peking University Shenzhen Graduate School)*; Yanze Wu (Tencent); Xintao Wang (Tencent); Chao Dong (SIAT); Jian Zhang (Peking University Shenzhen Graduate School); Ying Shan (Tencent) |
971 |
Optimal Transport for Label-Efficient Visible-Infrared Person Re-Identification |
Jiangming Wang (East China Normal University); Zhizhong Zhang (East China Normal University); Mingang Chen (Shanghai Development Center of Computer Software Technology); yi zhang (zhejianglab); Cong Wang (Huawei Technologies); Bin Sheng (Shanghai Jiao Tong University); Yanyun Qu (XMU); Yuan Xie (East China Normal University)* |
977 |
Proposal-Free Temporal Action Detection via Global Segmentation Mask Learning |
Sauradip Nag (University of Surrey)*; Xiatian Zhu (University of Surrey); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey) |
979 |
Sobolev Training for Implicit Neural Representations with Approximated Image Derivatives |
Wentao Yuan (Peking University)*; Qingtian Zhu (Peking University); Xiangyue Liu (Beihang University); Yikang Ding (Tsinghua University); Haotian Zhang (Megvii); Chi Zhang (Megvii Inc.) |
982 |
Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression |
Yeying Jin (National University of Singapore)*; Wenhan Yang (NTU); Robby T. Tan (National University of Singapore) |
986 |
Point-to-Box Network for Accurate Object Detection via Single Point Supervision |
Pengfei Chen (University of Chinese Academy of Sciences); Xuehui Yu (University of Chinese Academy of Sciences); Xumeng Han (University of Chinese Academy of Sciences); Najmul Hassan (University of Oregon); Kai Wang (U of Oregon); Jiachen Li (UIUC); Jian Zhao (Institute of North Electronic Equipment); Humphrey Shi (U of Oregon | UIUC | PAIR); Zhenjun Han (University of Chinese Academy of Sciences)*; Qixiang Ye (University of Chinese Academy of Sciences, China) |
989 |
Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks |
Yunshan Zhong (xiamen university)*; Mingbao Lin (Xiamen University, China); xunchao li (Xiamen University); Ke Li (Tencent); Yunhang Shen (Xiamen University); Fei Chao (Xiamen University); Yongjian Wu (Tencent Technology (Shanghai) Co.,Ltd); Rongrong Ji (Xiamen University, China) |
999 |
Locality Guidance for Improving Vision Transformers on Tiny Datasets |
Kehan Li (Peking University); Runyi Yu (Peking University); Zhennan Wang (Peng Cheng Laboratory); Li Yuan (Peking University); Guoli Song (Peng Cheng Laboratory); Jie Chen (Peking University)* |
1002 |
Weakly Supervised Object Localization through Inter-class Feature Similarity and Intra-class Appearance Consistency |
Jun Wei (The Chinese University of Hong Kong, Shenzhen); Sheng Wang (Shanghai Zelixir Biotech); S. Kevin Zhou (USTC); Shuguang Cui (The Chinese University of Hong Kong, Shenzhen ); Zhen Li (The Chinese University of Hong Kong, Shenzhen)* |
1003 |
Semi-Supervised Temporal Action Detection with Proposal-Free Masking |
Sauradip Nag (University of Surrey)*; Xiatian Zhu (University of Surrey); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey) |
1005 |
Neighborhood Collective Estimation for Noisy Label Identification and Correction |
Jichang Li (The University of Hong Kong)*; Guanbin Li (Sun Yat-sen University); Feng Liu (Deepwise AI Lab); Yizhou Yu (The University of Hong Kong) |
1010 |
Zero-Shot Temporal Action Detection via Vision-Language Prompting |
Sauradip Nag (University of Surrey)*; Xiatian Zhu (University of Surrey); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey) |
1016 |
Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval |
Pandeng Li (University of Science and Technology of China)*; Hongtao Xie (University of Science and Technology of China); Jiannan Ge (University of Science and Technology of China); Lei Zhang (Kuaishou); Shaobo Min (tencent); Yongdong Zhang (University of Science and Technology of China) |
1018 |
Discover and Mitigate Unknown Biases with Debiasing Alternate Networks |
Zhiheng Li (University of Rochester)*; Anthony Hoogs (Kitware); Chenliang Xu (University of Rochester) |
1020 |
Hierarchical Memory Learning for Fine-Grained Scene Graph Generation |
Youming Deng (Wuhan University); Yansheng Li (Wuhan University)*; Yongjun Zhang (Wuhan University); Xiang Xiang (Huazhong University of Science and Technology); Jian Wang (Ant Group); Jingdong Chen (Ant Group); Jiayi Ma (Wuhan University) |
1026 |
Improving Test-Time Adaptation via Shift-agnostic Weight Regularization and Nearest Source Prototypes |
Sungha Choi (Qualcomm AI Research)*; Seunghan Yang (Qualcomm AI Research); Seokeon Choi (Qualcomm AI research); Sungrack Yun (Qualcomm AI Research) |
1028 |
Automatic dense annotation of large-vocabulary sign language videos |
Liliane Momeni (University of Oxford)*; Hannah Bull (LIMSI (CNRS)); Prajwal K R (VGG, Oxford); Samuel Albanie (University of Cambridge); Gul Varol (Ecole des Ponts ParisTech); Andrew Zisserman (University of Oxford) |
1029 |
Few-shot Class-incremental Learning via Entropy-regularized Data-free Replay |
Huan Liu (McMaster University)*; Li Gu (Huawei Canada); Zhixiang Chi (Huawei Noah’s Ark Laboratory); Yuanhao Yu (Huawei Noah’s Ark Laboratory); Yang Wang (Concordia University); Jun Chen (McMaster University); Jin Tang (Huawei Noah’s Ark Laboratory) |
1035 |
Learning Instance-Specific Adaptation for Cross-Domain Segmentation |
Yuliang Zou (Virginia Tech)*; Zizhao Zhang (Google); Chun-Liang Li (Google); Han Zhang (Google); Tomas Pfister (Google); Jia-Bin Huang (Facebook ) |
1039 |
SALVe: Semantic Alignment Verification for Floorplan Reconstruction from Sparse Panoramas |
John W Lambert (Georgia Institute of Technology)*; Yuguang Li (Zillow Group); Ivaylo Boyadzhiev (Zillow Group); Lambert Wixson (Zillow Group); Manjunath Narayana (Zillow group); Will A Hutchcroft (Zillow Group); James Hays (Georgia Institute of Technology, USA); Frank Dellaert (Georgia Tech); Sing Bing Kang (Zillow Group) |
1044 |
Active Learning Strategies for Weakly-Supervised Object Detection |
Huy V. Vo (Ecole Normale Supérieure – INRIA – Valeo.ai)*; Oriane Siméoni (valeo.ai); Spyros Gidaris (valeo.ai); Andrei Bursuc (valeo.ai); Patrick Pérez (Valeo.ai); Jean Ponce (Inria) |
1049 |
3D Human Pose Estimation Using Möbius Graph Convolutional Networks |
Niloofar Azizi (ICG department of TU Graz)*; Horst Possegger (Graz University of Technology); Emanuele Rodola (Sapienza University of Rome); Horst Bischof (Graz University of Technology) |
1055 |
Real-time Online Video Detection with Temporal Smoothing Transformers |
Yue Zhao (University of Texas at Austin)*; Philipp Kraehenbuehl (UT Austin) |
1060 |
3D-FM GAN: Towards 3D-Controllable Face Manipulation |
Yuchen Liu (Princeton University)*; Zhixin Shu (Adobe Research); Yijun Li (Adobe Research); Zhe Lin (Adobe Research); Richard Zhang (Adobe); Sun-Yuan Kung (Princeton University) |
1064 |
SinNeRF: Training Neural Radiance Field on Complex Scene from a Single Image |
Dejia Xu (University of Texas at Austin)*; Yifan Jiang (University of Texas at Austin); Peihao Wang (University of Texas at Austin); Zhiwen Fan (University of Texas at Austin); Humphrey Shi (U of Oregon | UIUC | PAIR); Zhangyang Wang (University of Texas at Austin) |
1069 |
Entropy-driven Sampling and Training Scheme for Conditional Diffusion Generation |
Guangcong Zheng (Zhejiang University); Shengming Li (Zhejiang University); Hui Wang (Zhejiang University); Taiping Yao (Tencent YouTu); Yang Chen (Tencent); Shouhong Ding (Tencent); Xi Li (Zhejiang University)* |
1076 |
Identity-aware Hand Mesh Estimation and Personalization from RGB Images |
Deying Kong (university of california, irvine)*; Linguang Zhang (Facebook Reality Labs); Liangjian Chen (Reality Labs); Haoyu Ma (University of California, Irvine); Xiangyi Yan (University of California, Irvine); shanlin sun (University of California, Irvine); Xingwei Liu (University of California Irvine); Kun Han (University of California Irvine); Xiaohui Xie (University of California, Irvine) |
1084 |
TALLFormer: Temporal Action Localization with a Long-memory Transformer |
Feng Cheng (University of North Carolina ch); Gedas Bertasius (UNC Chapel Hill)* |
1086 |
Unsupervised and Semi-supervised Bias Benchmarking in Face Recognition |
Siqi Deng (Amazon)*; Alexandra Chouldechova (CMU); Yongxin Wang (Amazon); Wei Xia (Amazon); Pietro Perona (California Institute of Technology) |
1100 |
Domain Adaptive Hand Keypoint and Pixel Localization in the Wild |
Takehiko Ohkawa (The University of Tokyo)*; Yu-Jhe Li (Carnegie Mellon University); Qichen Fu (Carnegie Mellon University); Ryosuke Furuta (The University of Tokyo); Kris Kitani (Carnegie Mellon University); Yoichi Sato (University of Tokyo) |
1103 |
Skeleton-free Pose Transfer for Stylized 3D Characters |
Zhouyingcheng Liao (Saarland University)*; Jimei Yang (Adobe); Jun Saito (Adobe); Gerard Pons-Moll (University of Tübingen); Yang Zhou (Adobe Research) |
1105 |
Differentiable Raycasting for Self-supervised Occupancy Forecasting |
Tarasha Khurana (Carnegie Mellon University)*; Peiyun Hu (Carnegie Mellon University); Achal D Dave (Amazon); Jason P Ziglar (Argo AI); David Held (); Deva Ramanan (Carnegie Mellon University) |
1109 |
InAction: Interpretable Action Decision Making for Autonomous Driving |
Taotao Jing (Tulane University)*; Haifeng Xia (Tulane University); Renran Tian (Indiana University-Purdue University Indianapolis); Haoran Ding (IUPUI); Xiao Luo (IUPUI); Joshua E Domeyer (Toyota Motor North America); Rini Sherony (Toyota CSRC); Zhengming Ding (Tulane University) |
1114 |
CramNet: Camera-Radar Fusion with Ray-Constrained Cross-Attention for Robust 3D Object Detection |
Jyh-Jing Hwang (Waymo)*; Henrik Kretzschmar (Waymo); Joshua M Manela (Waymo); Sean Rafferty (Waymo); Nicholas Armstrong-Crews (Waymo); Tiffany Chen (Waymo); Dragomir Anguelov (Waymo) |
1118 |
CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video |
Wei Lin (Graz University of Technology)*; Anna Kukleva (MPII); Kunyang Sun (Southeast University); Horst Possegger (Graz University of Technology); Hilde Kuehne (University of Frankfurt); Horst Bischof (Graz University of Technology) |
1119 |
Latent Discriminant deterministic Uncertainty |
Gianni Franchi (ENSTA Paris)*; Xuanlong Yu (ENSTA Paris); Andrei Bursuc (valeo.ai); Emanuel Aldea (Paris-Saclay University); Severine Dubuisson (Aix-Marseille University); David Filliat (ENSTA Paris) |
1129 |
Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation |
Pengfei Guo (Johns Hopkins University)*; Dong Yang (NVIDIA Corporation); Ali Hatamizadeh (NVIDIA Corporation); An Xu (University of Pittsburgh); Ziyue Xu (NVIDIA); Wenqi Li (NVIDIA); Can Zhao (Nvidia); Daguang Xu (NVIDIA Corporation); Stephanie Anne Harmon (National Cancer Institute); Evrim Turkbey (NIH); Baris Turkbey (National Cancer Institute); Bradford J Wood (National Institutes of Health); Francesca Patella (ASST Santi Paolo e Carlo); Elvira Stellato (University of Milan); Gianpaolo Carrafiello (University of Milan); Vishal Patel (Johns Hopkins University); Holger R Roth (NVIDIA) |
1135 |
Image-based CLIP-Guided Essence Transfer |
Hila Chefer (Tel Aviv University)*; Sagie Benaim (University of Copenhagen); Roni Paiss (Tel Aviv University, Google); Lior Wolf (Tel Aviv University, Israel) |
1136 |
Prune Your Model Before Distill It |
JinHyuk Park (Hongik University); Albert No (Hongik University)* |
1155 |
S2N: Suppression-Strengthen Network for Event-based Recognition under Variant Illuminations |
zengyu wan (University of Science and Technology of China)*; Yang Wang (University of Science and Technology of China); Ganchao Tan (University of Science and Technology of China); Yang Cao (University of Science and Technology of China); Zheng-Jun Zha (University of Science and Technology of China) |
1159 |
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval |
Yuying Ge (The University of Hong Kong)*; Yixiao Ge (Tencent); Xihui Liu (UC Berkeley); Jinpeng Wang (National University of Singapore); Jianping Wu (Tsinghua University); Ying Shan (Tencent); Xiaohu Qie (Tencent); Ping Luo (The University of Hong Kong) |
1161 |
PASS: Part-Aware Self-Supervised Pre-Training for Person Re-Identification |
Kuan Zhu (Institute of Automation, Chinese Academy of Sciences)*; Haiyun Guo (CASIA); Tianyi Yan (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences); Yousong Zhu (Institute of Automation, Chinese Academy of Sciences); Jinqiao Wang (Institute of Automation, Chinese Academy of Sciences); Ming Tang (Institute of Automation, Chinese Academy of Sciences) |
1165 |
RegionCL: Exploring Contrastive Region Pairs for Self-supervised Representation Learning |
YUFEI XU (University of sydney)*; Qiming Zhang (The University of Sydney); Jing Zhang (The University of Sydney); Dacheng Tao (JD.com) |
1174 |
Towards Data-Efficient Detection Transformers |
Wen Wang (University of Science and Technology of China)*; Jing Zhang (The University of Sydney); Yang Cao (University of Science and Technology of China); Yongliang Shen (Zhejiang University); Dacheng Tao (JD.com) |
1175 |
Label2Label: A Language Modeling Framework for Multi-Attribute Learning |
Wanhua Li (Tsinghua University); Zhexuan Cao (Tsinghua University); Jianjiang Feng (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)* |
1179 |
Anti-Retroactive Interference for Lifelong Learning |
Runqi Wang (Beihang University); Yuxiang Bao (Beihang University); Baochang Zhang (Beihang University)*; Jianzhuang Liu (Huawei Noah’s Ark Lab); Wentao Zhu (Amazon); Guodong Guo (IDL, Baidu Research) |
1181 |
Emotion Recognition for Multiple Context Awareness |
Dingkang Yang (Fudan University); shuai huang (Fudan university); Shunli Wang (Fudan University); Yang Liu (Fudan University); Peng Zhai (Fudan university); Liuzhen Su (Fudan University); Mingcheng Li (Fudan University); Lihua Zhang (Fudan University)* |
1182 |
Box-supervised Instance Segmentation with Level Set Evolution |
Wentong Li (Zhejiang University); Wenyu Liu (Zhejiang University); Jianke Zhu (Zhejiang University)*; Miaomiao Cui (Alibaba-inc); Xian-Sheng Hua (Damo Academy, Alibaba Group); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China) |
1197 |
mc-BEiT: Multi-choice Discretization for Image BERT Pre-training |
Xiaotong Li (Peking University)*; Yixiao Ge (Tencent); Kun Yi (Nanjing University); Zixuan Hu (Peking University); Ying Shan (Tencent); Lingyu Duan (Peking University) |
1198 |
Adaptive Cross-Domain Learning for Generalizable Person Re-Identification |
Pengyi Zhang (Zhejiang University)*; Huanzhang Dou (Zhejiang University); Yunlong Yu (Zhejiang University); Xi Li (Zhejiang University) |
1202 |
MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition |
Huanzhang Dou (Zhejiang University)*; Pengyi Zhang (Zhejiang University); Wei Su (Zhejiang University); Yunlong Yu (Zhejiang University); Xi Li (Zhejiang University) |
1203 |
Bootstrapped Masked Autoencoders for Vision BERT Pretraining |
Xiaoyi Dong (University of Science and Technology of China)*; Jianmin Bao (Microsoft Research Asia); Ting Zhang (MSRA); Dongdong Chen (Microsoft Cloud AI); Weiming Zhang (University of Science and Technology of China); Lu Yuan (Microsoft); Dong Chen (Microsoft Research Asia); Fang Wen (Microsoft Research Asia ); Nenghai Yu (University of Science and Technology of China) |
1209 |
Masked Discrimination for Self-Supervised Learning on Point Clouds |
Haotian Liu (University of Wisconsin-Madison)*; Mu Cai (University of Wisconsin-Madison); Yong Jae Lee (University of Wisconsin-Madison) |
1214 |
GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval |
Yuxuan Wang (National University of Singapore); Difei Gao (NUS); Licheng Yu (Facebook); Stan Weixian Lei (National University of Singapore); Matt Feiszli (Facebook Research); Mike Zheng Shou (National University of Singapore)* |
1225 |
FAST-VQA: Efficient End-to-end Video Quality Assessment with Fragment Sampling |
Haoning Wu (Nanyang Technological University)*; Chaofeng Chen (Nanyang Technological University); Jingwen Hou (Nanyang Technological University); Liang Liao (Nanyang Technological University); Annan Wang (Nanyang Technological University); Wenxiu Sun (SenseTime Research and Tetras.AI); Qiong Yan (SenseTime Group Limited); Weisi Lin (Nanyang Technological University, Singapore) |
1235 |
Learning to train a point cloud reconstruction network without matching |
Tianxin Huang (Zhejiang University)*; Xuemeng Yang (Zhejiang University); Jiangning Zhang (Zhejiang University); Jinhao Cui (Zhejiang University); Hao Zou (Zhejiang University); Jun Chen (Zhejiang University); Xiangrui Zhao (Zhejiang University); Yong Liu (Zhejiang University) |
1243 |
Long-Tailed Class Incremental Learning |
Xialei Liu (Nankai University)*; Yusong Hu (Nankai University); Xu-Sheng Cao (Nankai University); Andy Bagdanov (University of Florence, Italy); Ke Li (Tencent); Ming-Ming Cheng (Nankai University) |
1247 |
CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving |
Kaican Li (Huawei Noah’s Ark Lab)*; Kai Chen (HKUST); Haoyu Wang (Purdue University); Lanqing Hong (Huawei Noah’s Ark Lab); Chaoqiang Ye (Huawei); Jianhua Han (Huawei Noah’s Ark Lab); Yukuai Chen (Huawei Intelligent Automotive Solution BU); Wei Zhang (Noah’s Ark Lab, Huawei Technologies); Chunjing Xu (Huawei Noah’s Ark Lab); Dit-Yan Yeung (HKUST); Xiaodan Liang (Sun Yat-sen University); Zhenguo Li (Huawei Noah’s Ark Lab); Hang Xu (Huawei Noah’s Ark Lab) |
1253 |
CMT: Context-Matching-Guided Transformer for 3D Tracking in Point Clouds |
Zhiyang Guo (University of Science and Technology of China)*; Yunyao Mao (University of Science and Technology of China); Wengang Zhou (University of Science and Technology of China); Min Wang (Institute of Artificial Intelligence, Hefei Comprehensive National Science Center); Houqiang Li (University of Science and Technology of China) |
1257 |
Motion Inspired Unsupervised Perception and Prediction in Autonomous Driving |
Mahyar Najibi (Waymo LLC); Jingwei Ji (Waymo); Yin Zhou (Waymo)*; Charles R. Qi (Waymo); Xinchen Yan (Waymo); Scott Ettinger (Waymo); Dragomir Anguelov (Waymo) |
1259 |
Unitail: Detecting, Reading, and Matching in Retail Scene |
Fangyi Chen (Carnegie Mellon University)*; Han Zhang (CMU); Zaiwang Li (Pitt); Jiachen Dou (Carnegie Mellon University); Shentong Mo (Carnegie Mellon University); Hao Chen (Carnegie Mellon University); Yong-Xin Zhang (Tsinghua University); Uzair Ahmed (Carnegie Mellon University); Chenchen Zhu (Meta AI); Marios Savvides (Carnegie Mellon University) |
1275 |
DODA: Data-oriented Sim-to-Real Domain Adaptation for 3D Semantic Segmentation |
Runyu Ding (The University of Hong Kong)*; Jihan Yang (The University of Hong Kong); Li Jiang (Max Planck Institute for Informatics); Xiaojuan Qi (The University of Hong Kong) |
1277 |
Learning to Drive by Watching YouTube Videos: Action-Conditioned Contrastive Policy Pretraining |
Qihang Zhang (Chinese University of Hong Kong); Zhenghao Peng (Chinese University of Hong Kong); Bolei Zhou (UCLA)* |
1278 |
Multi-Curve Translator for High-Resolution Photorealistic Image Translation |
Yuda Song (Zhejiang University); Hui Qian (Zhejiang University); Xin Du (Zhejiang University)* |
1280 |
Dynamic Metric Learning with Cross-Level Concept Distillation |
Wenzhao Zheng (Tsinghua University)*; Yuanhui Huang (Tsinghua University); Borui Zhang (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University) |
1287 |
Deep Bayesian Video Frame Interpolation |
Zhiyang Yu (Harbin Institute of Technology)*; Yu Zhang (Beihang University); Xujie Xiang (Beihang University); Dongqing Zou (SenseTime Research; Qing Yuan Research Institute, Shanghai Jiao Tong University); Xijun Chen (Harbin Institute of Technology); Jimmy Ren (SenseTime Research; Qing Yuan Research Institute, Shanghai Jiao Tong University) |
1300 |
PanoFormer: Panorama Transformer for Indoor 360° Depth Estimation |
Zhijie Shen (Beijing Jiaotong University); Chunyu Lin (Beijing Jiaotong University)*; Kang Liao (Beijing Jiaotong University); Lang Nie (Beijing Jiaotong University); Zishuo Zheng (Beijing Jiaotong University); Yao Zhao (Beijing Jiaotong University) |
1312 |
Cross Attention Based Style Distribution for Controllable Person Image Synthesis |
Xinyue Zhou (East China Normal University); Mingyu Yin (East China Normal University); Xinyuan Chen (Shanghai AI Laboratory); Li Sun (East China Normal University)*; Changxin Gao (Huazhong University of Science and Technology); Qingli Li (East China Normal University) |
1315 |
Generative Meta-Adversarial Network for Unseen Object Navigation |
Sixian Zhang (ICT, China Academy of Science)*; Weijie Li (ICT, China Academy of Sciences); Xinhang Song (ICT); Yubing Bai (ICT, China Academy of Science); Shuqiang Jiang (ICT, China Academy of Science) |
1316 |
Unsupervised Visual Representation Learning by Synchronous Momentum Grouping |
Bo Pang (Shanghai Jiao Tong University)*; Yifan Zhang (Shanghai Jiao Tong University); Yaoyi Li (Huawei); Jia Cai (Huawei); Cewu Lu (Shanghai Jiao Tong University) |
1317 |
OSFormer: One-Stage Camouflaged Instance Segmentation with Transformers |
Jialun Pei (Huazhong University of Science and Technology); Tianyang Cheng (Huazhong University of Science and Technology); Deng-Ping Fan (ETH Zurich)*; He Tang (Huazhong University of Science and Technology); Chuanbo Chen (Huazhong University of Science and Technology); Luc Van Gool (ETH Zürich) |
1321 |
Highly Accurate Dichotomous Image Segmentation |
Xuebin Qin (University of Alberta); Hang Dai (Mohamed bin Zayed University of Artificial Intelligence); Xiaobin Hu (Technische Universität München); Deng-Ping Fan (ETH Zurich)*; Ling Shao (Terminus Group); Luc Van Gool (ETH Zurich) |
1322 |
KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints |
Marko Mihajlovic (ETH Zurich)*; Aayush Bansal (Carnegie Mellon University); Michael Zollhöfer (Facebook Reality Labs); Siyu Tang (ETH Zurich); Shunsuke Saito (Facebook) |
1326 |
MENet: a Memory-Based Network with Dual-Branch for Efficient Event Stream Processing |
Linhui Sun (CASIA)*; Yifan Zhang (Institute of Automation, Chinese Academy of Sciences); Ke Cheng (Institute of Automation, Chinese Academy of Sciences); Jian Cheng (Chinese Academy of Sciences, China); Hanqing Lu (NLPR, Institute of Automation, CAS) |
1330 |
Making Heads or Tails: Towards Semantically Consistent Visual Counterfactuals |
Simon Vandenhende (KU Leuven)*; Dhruv Mahajan (Facebook); Filip Radenovic (Facebook AI); Deepti Ghadiyaram (Facebook) |
1331 |
LEDNet: Joint Low-light Enhancement and Deblurring in the Dark |
Shangchen Zhou (Nanyang Technological University)*; Chongyi Li (Nanyang Technological University); Chen Change Loy (Nanyang Technological University) |
1336 |
RC-MVSNet: Unsupervised Multi-View Stereo with Neural Rendering |
Di Chang (Technical University of Munich)*; Aljaz Bozic (Technical University Munich); Tong Zhang (EPFL); Qingsong Yan (Hong Kong University of Science and Technology); Yingcong Chen (Hong Kong University of Science and Technology); Sabine Süsstrunk (EPFL); Matthias Niessner (Technical University of Munich) |
1342 |
StretchBEV: Stretching Future Instance Prediction Spatially and Temporally |
Kaan Adil Akan (Koc University); Fatma Guney (Koc University)* |
1344 |
AgeTransGAN for Facial Age Transformation with Rectified Performance Metrics |
Gee-Sern Hsu (National Taiwan University of Science and Technology)*; Rui-Cang Xie (National Taiwan University of Science and Technology); Zhi-Ting Chen (National Taiwan University of Science and Technology); Yu-Hong Lin (National Taiwan University of Science and Technology) |
1346 |
Boosting Supervised Dehazing Methods via Bi-level Patch Reweighting |
Xingyu Jiang (Beihang); Hongkun Dou (Beihang University); Chengwei Fu (Beihang); Bingquan Dai (Beihang); Tianrun Xu (North China University of Technology); Yue Deng (Samsung Research America)* |
1347 |
Detecting and Recovering Sequential DeepFake Manipulation |
Rui Shao (Nanyang Technological University)*; Tianxing Wu (Nanyang Technological University); Ziwei Liu (Nanyang Technological University) |
1353 |
MTFormer: Multi-Task Learning via Transformer and Cross-Task Reasoning |
Xiaogang Xu (The Chinese University of Hong Kong)*; Hengshuang Zhao (University of Oxford); Vibhav Vineet (Microsoft Research); Ser-Nam Lim (Meta AI); Antonio Torralba (MIT) |
1356 |
Prediction-Guided Distillation for Dense Object Detection |
Chenhongyi Yang (University of Edinburgh)*; Mateusz Ochal (Heriot Watt University); Amos Storkey (U Edinburgh); Elliot J Crowley (University of Edinburgh) |
1358 |
Towards Generic 3D Tracking in RGBD Videos: Benchmark and Baseline |
Jinyu Yang (Southern University of Science and Technology)*; Zhongqun Zhang (University of Birmingham); Zhe LI (SUSTech); Hyung Jin Chang (University of Birmingham); Ales Leonardis (University of Birmingham); Feng Zheng (SUSTech) |
1364 |
C3P: Cross-domain Pose Prior Propagation for Weakly Supervised 3D Human Pose Estimation |
Cunlin Wu (Huazhong University of Science and Technology); Yang Xiao (Huazhong Univ. of Sci.&Tech.)*; Boshen Zhang (Tencent); Mingyang Zhang (Huazhong Univ. of Sci.&Tech.); Zhiguo Cao (Huazhong Univ. of Sci.&Tech.); Joey Tianyi Zhou (A*STAR Centre for Frontier AI Research (CFAR)) |
1366 |
Adaptive Fine-Grained Sketch-Based Image Retrieval |
Ayan Kumar Bhunia (University of Surrey)*; Aneeshan Sain (University of Surrey); Parth Hiren Shah (Indian Institute of Technology Guwahati); Animesh Gupta (Thapar University); Pinaki Nath Chowdhury (University of Surrey); Tao Xiang (University of Surrey); Yi-Zhe Song (University of Surrey) |
1376 |
Learning Ego 3D Representation as Ray Tracing |
Jiachen Lu (Fudan University); Zheyuan Zhou (Fudan University); Xiatian Zhu (University of Surrey); Hang Xu (Huawei Noah’s Ark Lab); Li Zhang (Fudan University)* |
1380 |
Accelerating Score-based Generative Models with Preconditioned Diffusion Sampling |
Hengyuan Ma (Fudan University); Li Zhang (Fudan University)*; Xiatian Zhu (University of Surrey); Jianfeng Feng (Fudan University) |
1382 |
RCLane: Relay Chain Prediction for Lane Detection |
Shenghua Xu (Fudan University); Xinyue Cai (Huawei Noah’s Ark Lab); Bin Zhao (Fudan University); Li Zhang (Fudan University)*; Hang Xu (Huawei Noah’s Ark Lab); Yanwei Fu (Fudan University); Xiangyang Xue (Fudan University) |
1394 |
Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding |
Hao Wen (Tsinghua University); Yunze Liu (Tsinghua University)*; Jingwei Huang (Huawei); Bo Duan (Huawei); Li Yi (Tsinghua University) |
1395 |
Towards Efficient Adversarial Training on Vision Transformers |
Boxi Wu (Zhejiang University)*; Jindong Gu (University of Munich); Zhifeng Li (Tencent AI Lab); Deng Cai (ZJU); Xiaofei He (Zhejiang University); Wei Liu (Tencent) |
1397 |
Adaptive Agent Transformer for Few-shot Segmentation |
Yuan Wang (University of Science and Technology of China)*; Rui Sun (University of Science and Technology of China); Zhe Zhang (Lunar Exploration and Space Engineering Center of CNSA); Tianzhu Zhang (University of Science and Technology of China) |
1408 |
Improving Few-Shot Part Segmentation using Coarse Supervision |
Oindrila Saha (University of Massachusetts Amherst)*; Zezhou Cheng (University of Massachusetts, Amherst); Subhransu Maji (University of Massachusetts, Amherst) |
1412 |
Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation |
Guolei Sun (ETH Zurich); Yun Liu (ETH Zurich)*; Hao Tang (ETH Zurich); Ajad Chhatkuli (ETH Zurich); Le Zhang (University of Electronic Science and Technology of China); Luc Van Gool (ETH Zurich) |
1414 |
Out-of-distribution Detection with Boundary Aware Learning |
Sen Pei (Institute of Automation, Chinese Academy of Sciences)*; Xin Zhang (Institute of Automation, Chinese Academy of Sciences, University of Chinese Academy of Sciences); Bin Fan (University of Science and Technology Beijing); Gaofeng Meng (Chinese Academy of Sciences) |
1415 |
NeILF: Neural Incident Light Field for Physically-based Material Estimation |
Yao Yao (Apple Inc.); Jingyang Zhang (The Hong Kong University of Science and Technology)*; Jingbo Liu (Apple Inc.); Yihang Qu (Apple Inc.); Tian Fang (Apple); David N McKinnon (Apple); Yanghai Tsin (Apple Inc); Long Quan (Apple) |
1417 |
ViewFormer: NeRF-free Neural Rendering from Few Images Using Transformers |
Jonáš Kulhánek (Czech Technical University in Prague)*; Erik Derner (CTU CIIRC); Torsten Sattler (Czech Technical University in Prague); Robert Babuska (TU Delft) |
1421 |
L-Tracing: Fast Light Visibility Estimation on Neural Surfaces by Sphere Tracing |
Ziyu Chen (Shanghai Jiao Tong University)*; Chenjing Ding (Sensetime Group Limited); Jianfei Guo (Shanghai AI Laboratory); Dongliang Wang (SenseTime Group Limited); Yikang Li (Shanghai AI Lab); Xuan Xiao (SenseTime Group Limited); Wei Wu (SenseTime Group Limited); Li Song (Shanghai Jiao Tong University) |
1424 |
ARF: Artistic Radiance Fields |
Kai Zhang (Cornell University)*; Nicholas I Kolkin (Adobe Research); Sai Bi (Adobe Research); Fujun Luan (Adobe Research); Zexiang Xu (Adobe Research); Eli Shechtman (Adobe Research, US); Noah Snavely (Cornell University and Google AI) |
1425 |
Multiview Stereo with Cascaded Epipolar RAFT |
Zeyu Ma (Princeton University)*; Zachary Teed (Princeton University); Jia Deng (Princeton University) |
1439 |
What to Hide from Your Students: Attention-Guided Masked Image Modeling |
Ioannis Kakogeorgiou (National Technical University of Athens)*; Spyros Gidaris (valeo.ai); Bill Psomas (National Technical University of Athens); Yannis Avrithis (IARAI, Athena RC); Andrei Bursuc (valeo.ai); Konstantinos Karantzalos (National Technical University of Athens); Nikos Komodakis (University of Crete) |
1441 |
Static and Dynamic Concepts for Self-supervised Video Representation Learning |
Rui Qian (The Chinese University of Hong Kong)*; Shuangrui Ding (Shanghai Jiao Tong University); Xian Liu (The Chinese University of Hong Kong); Dahua Lin (The Chinese University of Hong Kong) |
1447 |
Deep Partial Updating: Towards Communication Efficient Updating for On-device Inference |
Zhongnan Qu (ETH Zurich)*; Cong Liu (University of Texas at Dallas); Lothar Thiele (ETH Zürich) |
1455 |
Gradient-based Uncertainty for Monocular Depth Estimation |
Julia Hornauer (Ulm University)*; Vasileios Belagiannis (Otto von Guericke University Magdeburg) |
1456 |
Flow-Guided Transformer for Video Inpainting |
Kaidong Zhang (University of Science and Technology of China); Jingjing Fu (Microsoft)*; Dong Liu (University of Science and Technology of China) |
1468 |
Relationformer: A Unified Framework for Image-to-Graph Generation |
Suprosanna Shit (TUM)*; Rajat Koner (Ludwig Maximilian University of Munich); Bastian Wittmann (Technical University of Munich); Johannes C. Paetzold (TUM); Ivan Ezhov (TUM); Hongwei Li (Technical University of Munich); Jiazhen Pan (Technical University of Munich); Sahand Sharifzadeh (Ludwig Maximilian University of Munich); Georgios Kaissis (Technische Universität München); Volker Tresp (LMU); Bjoern Menze (TUM) |
1469 |
ARAH: Animatable Volume Rendering of Articulated Human SDFs |
Shaofei Wang (ETH Zurich)*; Katja Schwarz (MPI Tuebingen); Andreas Geiger (University of Tuebingen); Siyu Tang (ETH Zurich) |
1471 |
Learning Hierarchy Aware Features for Reducing Mistake Severity |
Ashima Garg (IIIT Delhi)*; Depanshu Sani (Indraprastha Institute of Information Technology); Saket Anand (Indraprastha Institute of Information Technology Delhi) |
1474 |
Exploiting Unlabeled Data with Vision and Language Models for Object Detection |
Shiyu Zhao (Rutgers University)*; Zhixing Zhang (Rutgers University); Samuel Schulter (NEC Laboratories America); Long Zhao (Google Research); Vijay Kumar B G (NEC Laboratories America); Anastasis Stathopoulos (Rutgers University); Manmohan Chandraker (UC San Diego); Dimitris N. Metaxas (Rutgers) |
1479 |
A Simple and Robust Correlation Filtering method for text-based person search |
Wei Suo (Northwestern Polytechnical University); MengYang Sun (Northwestern Polytechnical University); Kai Niu (Northwestern Polytechnical University); Yiqi Gao (Northwestern Polytechnical University); Peng Wang (Northwestern Polytechnical University); Yanning Zhang (Northwestern Polytechnical University)*; Qi Wu (University of Adelaide) |
1482 |
Hunting Group Clues with Transformers for Social Group Activity Recognition |
Masato Tamura (Hitachi America, Ltd.)*; Rahul Vishwakarma (Hitachi America Ltd.); Ravigopal Vennelakanti (Hitachi America, Ltd.) |
1493 |
Quantized GAN for Complex Music Generation from Dance Videos |
Ye Zhu (Illinois Institute of Technology)*; Kyle B Olszewski (Snap Inc.); Yu Wu (Princeton University); Panos Achlioptas (Stanford University); Menglei Chai (Snap Inc.); Yan Yan (Illinois Institute of Technology); Sergey Tulyakov (Snap Inc) |
1506 |
Not Just Streaks: Towards Ground Truth for Single Image Deraining |
Yunhao Ba (UCLA)*; Howard Zhang (UCLA); Ethan Yang (UCLA); Akira Suzuki (UCLA); Arnold J Pfahnl (University of California, Los Angeles); Chethan Chinder Chandrappa (University of California – Los Angeles); Celso de Melo (Army Research Laboratory); Suya You (US Army Research Laboratory); Stefano Soatto (UCLA); Alex Wong (Yale University); Achuta Kadambi (UCLA) |
1511 |
HIVE: Evaluating the Human Interpretability of Visual Explanations |
Sunnie S. Y. Kim (Princeton University)*; Nicole Meister (Princeton University); Vikram V. Ramaswamy (Princeton University); Ruth C Fong (Princeton University); Olga Russakovsky (Princeton University) |
1512 |
GAMa: Cross-view Video Geo-localization |
Shruti Vyas (University of Central Florida)*; Chen Chen (University of Central Florida); Mubarak Shah (University of Central Florida) |
1516 |
Meta-Sampler: Almost-Universal yet Task-Oriented Sampling for Point Clouds |
Ta-Ying Cheng (University of Oxford); Qingyong Hu (University of Oxford)*; Qian Xie (University of Oxford); Niki Trigoni (University of Oxford); Andrew Markham (University of Oxford) |
1517 |
Multi-Query Video Retrieval |
Zeyu Wang (Princeton University)*; Yu Wu (Princeton University); Karthik Narasimhan (Princeton University); Olga Russakovsky (Princeton University) |
1525 |
Waymo Open Dataset: Panoramic Video Panoptic Segmentation |
Jieru Mei (Johns Hopkins University); Alex Zhu (Waymo)*; Xinchen Yan (Waymo); Hang Yan (Waymo LLC); Siyuan Qiao (Google); Yukun Zhu (Google Inc.); Liang-Chieh Chen (Google Inc.); Henrik Kretzschmar (Waymo) |
1531 |
MIME: Minority Inclusion for Majority Group Enhancement of AI Performance |
Pradyumna Chari (UCLA); Yunhao Ba (UCLA)*; Shreeram Athreya (UCLA); Achuta Kadambi (UCLA) |
1534 |
Self-supervised Human Mesh Recovery with Cross-Representation Alignment |
Xuan Gong (University at Buffalo); Meng Zheng (United Imaging Intelligence); Benjamin Planche (United Imaging Intelligence); Srikrishna Karanam (Adobe Research); Terrence Chen (United Imaging Intelligence); David Doermann (University at Buffalo); Ziyan Wu (United Imaging Intelligence)* |
1541 |
TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency |
Medhini Narasimhan (UC Berkeley)*; Arsha Nagrani (Google); Chen Sun (Brown University); Michael Rubinstein (Google); Trevor Darrell (UC Berkeley); Anna Rohrbach (UC Berkeley); Cordelia Schmid (Google) |
1542 |
A Perceptual Quality Metric for Video Frame Interpolation |
Qiqi Hou (Portland State University)*; Abhijay Ghildyal (Portland State University); Feng Liu (Portland State University) |
1543 |
Adaptive Feature Interpolation for Low-Shot Image Generation |
Mengyu Dai (Microsoft Corporation)*; Haibin Hang (Amazon.com); Xiaoyang Guo (Facebook) |
1544 |
Rethinking Learning Approaches for Long-Term Action Anticipation |
Megha Nawhal (Simon Fraser University)*; Akash Abdu Jyothi (Simon Fraser University); Greg Mori (Simon Fraser University / Borealis AI) |
1546 |
Object Manipulation via Visual Target Localization |
Kiana Ehsani (Allen Institute for Artificial Intelligence)*; Ali Farhadi (University of Washington, Apple); Aniruddha Kembhavi (Allen Institute for Artificial Intelligence); Roozbeh Mottaghi (Allen Institute for AI) |
1549 |
AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction |
Zerui Chen (Inria Paris); Yana Hasson (Inria); Cordelia Schmid (Inria/Google)*; Ivan Laptev (INRIA Paris) |
1551 |
Shift-tolerant Perceptual Similarity Metric |
Abhijay Ghildyal (Portland State University)*; Feng Liu (Portland State University) |
1557 |
Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing |
Benedikt Boecking (Carnegie Mellon University); Naoto Usuyama (Microsoft Research); Shruthi J Bannur (Microsoft Research); Daniel Coelho de Castro (Microsoft Research); Anton Schwaighofer (Microsoft Research); Stephanie Hyland (Microsoft Research); Maria Teodora A Wetscherek (Microsoft); Tristan Naumann (Microsoft Research Redmond, US); Aditya Nori (Microsoft Research); Javier Alvarez-Valle (Microsoft Research); Hoifung Poon (Microsoft Research); Ozan Oktay (Microsoft Research)* |
1561 |
Self-Supervised Sparse Representation for Video Anomaly Detection |
Jhih-Ciang Wu (Academia Sinica)*; He-Yen Hsieh (Academia Sinica); Ding-Jie Chen (Academia Sinica); Chiou-Shann Fuh (National Taiwan University); Tyng-Luh Liu (Academia Sinica) |
1567 |
CPO: Change Robust Panorama to Point Cloud Localization |
Junho Kim (Seoul National University)*; Hojun Jang (Seoul National University); Changwoon Choi (Seoul National University); Young Min Kim (Seoul National University) |
1569 |
MonoPLFlowNet: Permutohedral Lattice FlowNet for Real-Scale 3D Scene Flow Estimation with Monocular Images |
Runfa Li (UC San Diego)*; Truong Nguyen (UC San Diego) |
1576 |
DLCFT: Deep Linear Continual Fine-Tuning for General Incremental Learning |
Hyounguk Shon (KAIST)*; Janghyeon Lee (LG AI Research); Seung Hwan Kim (LG AI Research); Junmo Kim (KAIST) |
1578 |
Contrastive Positive Mining for Unsupervised 3D Action Representation Learning |
Haoyuan Zhang (Tianjin University)*; Yonghong Hou (Tianjin University); Wenjing Zhang (Tianjin University); Wanqing Li (University of Wollongong) |
1580 |
Patch Similarity Aware Data-Free Quantization for Vision Transformers |
Zhikai Li (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences); Liping Ma (Institute of Automation, Chinese Academy of Sciences); Mengjuan Chen (Center of Precision Sensing and Control, Institute of Automation, Chinese Academy of Sciences); Junrui Xiao (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences); Qingyi Gu (Institute of Automation, Chinese Academy of Sciences)* |
1586 |
Perception-Distortion Balanced ADMM Optimization for Single-Image Super-Resolution |
Yuehan Zhang (National University of Singapore)*; Bo Ji (National University of Singapore); Jia Hao (HiSilicon (Shanghai) Technologies Co., Ltd); Angela Yao (National University of Singapore) |
1596 |
DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition |
Yuxuan Liang (National University of Singapore)*; Pan Zhou (Sea AI Lab); Roger Zimmermann (NUS); Shuicheng Yan (Sea AI Labs) |
1606 |
Hierarchical Contrastive Inconsistency Learning for Deepfake Video Detection |
Zhihao Gu (Shanghai Jiao Tong University)*; Taiping Yao (Tencent YouTu); Yang Chen (Tencent); Shouhong Ding (Tencent); Lizhuang Ma (Shanghai Jiao Tong University) |
1616 |
Watermark Vaccine: Adversarial Attacks to Prevent Watermark Removal |
Xinwei Liu (Institute of Information Engineering, Chinese Academy of Sciences)*; Jian Liu (Ant Group); Yang Bai (Tsinghua); Jindong Gu (University of Munich); Tao Chen (Ant Group); Xiaojun Jia (Institute of Information Engineering, Chinese Academy of Sciences); Xiaochun Cao (Sun Yat-sen University) |
1625 |
ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO |
Sanghyuk Chun (NAVER AI Lab)*; Wonjae Kim (NAVER AI Lab); Song Park (NAVER AI Lab); Minsuk Chang (NAVER AI Lab); Seong Joon Oh (Naver AI Lab) |
1626 |
Personalizing Federated Medical Image Segmentation via Local Calibration |
Jiacheng Wang (Xiamen University); Yueming Jin (The Chinese University of Hong Kong); Liansheng Wang (Xiamen University)* |
1628 |
Learning to Detect Every Thing in an Open World |
Kuniaki Saito (Boston University)*; Ping Hu (Boston University); Trevor Darrell (UC Berkeley); Kate Saenko (Boston University) |
1648 |
MVP: Multimodality-guided Visual Pre-training |
Longhui Wei (University of Science and Technology of China)*; Lingxi Xie (Huawei Inc.); Wengang Zhou (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China); Qi Tian (Huawei Cloud & AI) |
1649 |
Uncertainty Learning in Kernel Estimation for Multi-Stage Blind Image Super-Resolution |
Zhenxuan Fang (Xidian University); Weisheng Dong (Xidian University)*; Xin Li (West Virginia University); Jinjian Wu (Xidian University); Leida Li (Xidian University); Guangming Shi (Xidian University) |
1666 |
Physical Attack on Monocular Depth Estimation in Autonomous Driving with Optimal Adversarial Patches |
Zhiyuan Cheng (Purdue University)*; James C Liang (Rochester Institute of Technology); Hongjun Choi (Purdue University); Guanhong Tao (Purdue University); Zhiwen Cao (Purdue University); Dongfang Liu (Rochester Institute of Technology); Xiangyu Zhang (Purdue University) |
1670 |
KVT: $k$-NN Attention for Boosting Vision Transformers |
Pichao Wang (Alibaba Group)*; Xue Wang (Alibaba DAMO Academy); Fan Wang (Alibaba Group); Ming Lin (Alibaba Group); Shuning Chang (Alibaba Group); Hao Li (Alibaba Group); Rong Jin (Alibaba Group) |
1673 |
Locally Varying Distance Transform for Unsupervised Visual Anomaly Detection |
Wen-Yan Lin (SMU); Zhonghang Liu (SMU); Siying Liu (I2R Singapore)* |
1676 |
Hierarchical Feature Alignment Network for Unsupervised Video Object Segmentation |
Gensheng Pei (Nanjing University of Science and Technology)*; Fumin Shen (UESTC); Yazhou Yao (Nanjing University of Science and Technology); Guo-Sen Xie (Nanjing University of Science and Technology); Zhenmin Tang (Nanjing University of Science and Technology); Jinhui Tang (Nanjing University of Science and Technology) |
1677 |
PalGAN: Image Colorization with Palette Generative Adversarial Networks |
Yi Wang (Shanghai AI Laboratory)*; Menghan Xia (Tencent AI lab); Lu Qi (The Chinese University of Hong Kong); Jing Shao (Sensetime); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences) |
1687 |
Fast-Vid2Vid: Spatial-Temporal Compression for Video-to-Video Synthesis |
Long Zhuo (Shanghai AI Lab)*; Guangcong Wang (Nanyang Technological University); Shikai Li (SenseTime Research); Wayne Wu (SenseTime Research); Ziwei Liu (Nanyang Technological University) |
1693 |
Generative Negative Text Replay for Continual Vision-Language Pretraining |
Shipeng Yan (ShanghaiTech University)*; Lanqing Hong (Huawei Noah’s Ark Lab); Hang Xu (Huawei Noah’s Ark Lab); Jianhua Han (Huawei Noah’s Ark Lab); Tinne Tuytelaars (KU Leuven); Zhenguo Li (Huawei Noah’s Ark Lab); Xuming He (ShanghaiTech University) |
1697 |
Learning Spatio-Temporal Downsampling for Effective Video Upscaling |
Xiaoyu Xiang (Meta Platforms Inc.)*; Yapeng Tian (University of Texas at Dallas); Vijay Rengarajan (Meta Platforms Inc.); Lucas D Young (Facebook); Bo Zhu (Meta Platforms, Inc.); Rakesh Ranjan (Facebook) |
1698 |
Geometric Representation Learning for Document Image Rectification |
Hao Feng (University of Science and Technology of China)*; Wengang Zhou (University of Science and Technology of China); Jiajun Deng (University of Science and Technology of China); Yuechen Wang (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China) |
1701 |
ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer |
Hongkai Chen (HKUST)*; Zixin Luo (Apple Inc.); Lei Zhou (Apple); Yurun Tian (Apple); Zhen Mingmin (Apple Inc.); Tian Fang (Apple); David N McKinnon (Apple); Yanghai Tsin (Apple Inc); Long Quan (Apple) |
1709 |
Egocentric Activity Recognition and Localization on a 3D Map |
Miao Liu (Georgia Institute of Technology)*; Lingni Ma (Facebook Reality Labs); Kiran Somasundaram (Facebook Reality Labs); Yin Li (University of Wisconsin-Madison); Kristen Grauman (Facebook AI Research & UT Austin); James Rehg (Georgia Institute of Technology); Chao Li (Facebook Reality Labs) |
1710 |
Generative Adversarial Network for Future Hand Segmentation from Egocentric Video |
Wenqi Jia (Georgia Institute of Technology)*; Miao Liu (Georgia Institute of Technology); James Rehg (Georgia Institute of Technology) |
1712 |
One-Shot Medical Landmark Localization by Edge-Guided Transform and Noisy Landmark Refinement |
Zihao Yin (Center for Data Science, Peking University); Ping Gong (Deepwise AI Lab); Chunyu Wang (Microsoft Research Asia); Yizhou Yu (The University of Hong Kong); Yizhou Wang (PKU)* |
1721 |
Learning Prior Feature and Attention Enhanced Image Inpainting |
Chenjie Cao (fudan.edu.cn)*; Qiaole Dong (Fudan University); Yanwei Fu (Fudan University) |
1730 |
AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated Objects via Few-shot Interactions |
Yian Wang (Peking University); Ruihai Wu (Peking University); Kaichun Mo (Stanford); Jiaqi Ke (Peking University); Qingnan Fan (Tencent AI Lab); Leonidas Guibas (Stanford University); Hao Dong (Peking University)* |
1735 |
Video Graph Transformer for Video Question Answering |
Junbin Xiao (National University of Singapore)*; Pan Zhou (Sea AI Lab); Tat-Seng Chua (National Univ. of Singapore); Shuicheng Yan (Sea AI Labs) |
1737 |
A Reliable Online Method for Joint Estimation of Focal Length and Camera Rotation |
Yiming Qian (Osaka University)*; James Elder (York University) |
1738 |
Learning Local Implicit Fourier Representation for Image Warping |
Jaewon Lee (DGIST)*; Kwang Pyo Choi (Samsung Electronics); Kyong Hwan Jin (DGIST) |
1740 |
SepLUT: Separable Image-adaptive Lookup Tables for Real-time Image Enhancement |
Canqian Yang (Shanghai Jiao Tong University); Meiguang Jin (Alibaba Group); Yi Xu (Shanghai Jiao Tong University)*; Rui Zhang (Shanghai Jiao Tong University); Ying Chen (Alibaba Group); Huaida Liu (Alibaba) |
1744 |
Temporal-MPI: Enabling Multi-Plane Images for Dynamic Scene Modelling via Temporal Basis Learning |
Wenpeng Xing (Hong Kong Baptist University); Jie Chen (Hong Kong Baptist University)* |
1746 |
Blind Image Decomposition |
Junlin Han (CSIRO)*; Weihao Li (Data61, CSIRO); Pengfei Fang (The Australian National University); Chunyi Sun (Australian National University); Jie Hong (Australian National University); Mohammad Ali Armin (CSIRO(Data61)); Lars Petersson (Data61/CSIRO); Hongdong Li (Australian National University, Australia) |
1751 |
INT: Towards Infinite-frames 3D Detection with An Efficient Framework |
Jianyun Xu (DAMO Academy, Alibaba Group)*; Zhenwei Miao (DAMO Academy, Alibaba Group); Da Zhang (UC Santa Barbara); Hongyu Pan (DAMO Academy, Alibaba Group); Kaixuan Liu (DAMO Academy, Alibaba Group); Peihan Hao (DAMO Academy, Alibaba Group); Jun Zhu (DAMO Academy, Alibaba Group); Zhengyang Sun (Tsinghua University); Li Hongmin (Huawei TCS lab); Xin Zhan (DAMO Academy, Alibaba Group) |
1756 |
MuLUT: Cooperating Multiple Look-Up Tables for Efficient Image Super-Resolution |
Jiacheng Li (University of Science and Technology of China); Chang Chen (Huawei Noah’s Ark Lab); Zhen Cheng (University of Science and Technology of China); Zhiwei Xiong (University of Science and Technology of China)* |
1757 |
NDF: Neural Deformable Fields for Dynamic Human Modelling |
Ruiqi Zhang (Hong Kong Baptist University); Jie Chen (Hong Kong Baptist University)* |
1759 |
MPIB: An MPI-Based Bokeh Rendering Framework for Realistic Partial Occlusion Effects |
Juewen Peng (Huazhong University of Science and Technology); Jianming Zhang (Adobe Research); Xianrui Luo (Huazhong University of Science and Technology); Hao Lu (Huazhong University of Science and Technology); Ke Xian (Huazhong University of Science and Technology)*; Zhiguo Cao (Huazhong Univ. of Sci.&Tech.) |
1761 |
Neural Density-Distance Fields |
Itsuki UEDA (University of Tsukuba)*; Yoshihiro Fukuhara (Waseda University); Hirokatsu Kataoka (National Institute of Advanced Industrial Science and Technology (AIST)); Hiroaki Aizawa (Hiroshima University); Hidehiko Shishido (University of Tsukuba); Itaru Kitahara (University of Tsukuba) |
1762 |
MoDA: Map style transfer for self-supervised Domain Adaptation of embodied agents |
Eun Sun Lee (Seoul National University)*; Junho Kim (Seoul National University); Sangwon Park (Seoul Nat’l University); Young Min Kim (Seoul National University) |
1766 |
L3: Accelerator-Friendly Lossless Image Format for High-Resolution, High-Throughput DNN Training |
Jonghyun Bae (Seoul National University)*; Woohyeon Baek (Seoul National University); Tae Jun Ham (Seoul National University); Jae W. Lee (Seoul National University) |
1780 |
Prior-Guided Adversarial Initialization for Fast Adversarial Training |
Xiaojun Jia (Institute of Information Engineering, Chinese Academy of Sciences)*; Yong Zhang (Tencent AI Lab); Xingxing Wei (Beihang University); Baoyuan Wu (The Chinese University of Hong Kong, Shenzhen; Shenzhen Research Institute of Big Data); Ke Ma (UCAS); Jue Wang (Tencent AI Lab); Xiaochun Cao (Sun Yat-sen University) |
1790 |
Housekeep: Tidying Virtual Households using Commonsense Reasoning |
Yash Mukund Kant (University of Toronto)*; Arun Ramachandran (Georgia Institute of Technology); Sriram Yenamandra (Georgia Institute of Technology); Igor Gilitschenski (University of Toronto); Dhruv Batra (Georgia Tech & Facebook AI Research); Andrew Szot (Georgia Institute of Technology); Harsh Agrawal (Georgia Institute of Technology) |
1804 |
Real-RawVSR: Real-World Raw Video Super-Resolution with a Benchmark Dataset |
Huanjing Yue (Tianjin University)*; Zhiming Zhang (Tianjin University); Jingyu Yang (Tianjin University) |
1807 |
ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning |
Shengchao Hu (Shanghai Jiao Tong University)*; Li Chen (Shanghai AI Laboratory); Penghao Wu (Shanghai Jiao Tong University); Hongyang Li (SenseTime); Junchi Yan (Shanghai Jiao Tong University); Dacheng Tao (JD.com) |
1810 |
NeXT: Towards High Quality Neural Radiance Fields via Multi-Skip Transformer |
Yunxiao Wang (Tsinghua University); Yanjie Li (Tsinghua University)*; Peidong Liu (Tsinghua University); Tao Dai (Shenzhen University); Shu-Tao Xia (Tsinghua University) |
1814 |
Learning Spatiotemporal Frequency-Transformer for Compressed Video Super-Resolution |
Zhongwei Qiu (University of Science and Technology Beijing); Huan Yang (Microsoft Research)*; Jianlong Fu (Microsoft Research); Dongmei Fu (University of Science and Technology Beijing) |
1819 |
Adversarial Partial Domain Adaptation by Cycle Inconsistency |
Kun-Yu Lin (Sun Yat-sen University); Jiaming Zhou (Sun Yat-sen University); Yukun Qiu (Sun Yat-sen University); Wei-Shi Zheng (Sun Yat-sen University, China)* |
1824 |
BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen Neural Networks |
Uddeshya Upadhyay (University of Tübingen)*; Shyamgopal Karthik (University of Tübingen); Massimiliano Mancini (University of Tübingen); Yanbei Chen (University of Tübingen); Zeynep Akata (University of Tübingen) |
1831 |
Domain Randomization-Enhanced Depth Simulation and Restoration for Perceiving and Grasping Specular and Transparent Objects |
Qiyu Dai (Peking University); Jiyao Zhang (Xi’an Jiaotong University); Qiwei Li (Peking University); Tianhao Wu (Peking University); Hao Dong (Peking University); Ziyuan Liu (Huawei group); Ping Tan (Simon Fraser University); He Wang (Peking University)* |
1832 |
PS-NeRF: Neural Inverse Rendering for Multi-view Photometric Stereo |
Wenqi Yang (The University of Hong Kong)*; Guanying Chen (The Chinese University of Hong Kong, Shenzhen); Chaofeng Chen (Nanyang Technological University); Zhenfang Chen (MIT-IBM Watson AI Lab); Kwan-Yee K. Wong (The University of Hong Kong) |
1845 |
DeciWatch: A Simple Baseline for 10× Efficient 2D and 3D Pose Estimation |
Ailing Zeng (The Chinese University of Hong Kong)*; Xuan Ju (The Chinese University of Hong Kong); Lei Yang (Sensetime Group Limited); Ruiyuan Gao (The Chinese University of Hong Kong); Xizhou Zhu (SenseTime); Bo Dai (Shanghai AI Lab); Qiang Xu (The Chinese University of Hong Kong) |
1846 |
Hierarchical Latent Structure for Multi-Modal Vehicle Trajectory Forecasting |
Dooseop Choi (ETRI)*; KyoungWook Min (ETRI) |
1848 |
SmoothNet: A Plug-and-Play Network for Refining Human Poses in Videos |
Ailing Zeng (The Chinese University of Hong Kong)*; Lei Yang (Sensetime Group Limited); Xuan Ju (The Chinese University of Hong Kong); Jiefeng Li (Shanghai Jiao Tong University); Jianyi Wang (Nanyang Technological University); Qiang Xu (The Chinese University of Hong Kong) |
1851 |
Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency |
Tom Monnier (École des ponts Paristech)*; Matthew Fisher (Adobe Research); Alexei A Efros (UC Berkeley); Mathieu Aubry (École des ponts ParisTech) |
1852 |
End-to-End Weakly Supervised Object Detection with Sparse Proposal Evolution |
Mingxiang Liao (University of Chinese Academy of Sciences); Fang Wan (University of Chinese Academy of Sciences)*; Yuan Yao (University of Chinese Academy of Sciences); Zhenjun Han (University of Chinese Academy of Sciences); Zou Jialing (University of Chinese Academy of Sciences); Yuze Wang (Huawei Noah’s Ark Lab); Bailan Feng (Huawei Noah’s Ark Lab); Peng Yuan (Huawei Noah’s Ark Lab); Qixiang Ye (University of Chinese Academy of Sciences, China) |
1853 |
PAC-Net: Highlight Your Video via History Preference Modeling |
Hang Wang (Huawei HiSilicon)*; Penghao Zhou (ByteDance); Chong Zhou (Nanyang Technological University); Zhao Zhang (Nankai University); Xing Sun (Shopee) |
1859 |
Efficient Point Cloud Analysis Using Hilbert Curve |
Wanli Chen (CUHK)*; Xinge Zhu (The Chinese University of Hong Kong); Guojin Chen (The Chinese University of Hong Kong); Bei Yu (CUHK) |
1860 |
Learning Online Multi-Sensor Depth Fusion |
Erik Sandström (ETH Zürich)*; Martin R. Oswald (ETH Zurich); Suryansh Kumar (ETH Zurich); Silvan Weder (ETH Zürich); Fisher Yu (ETH Zurich); Cristian Sminchisescu (Lund University); Luc Van Gool (ETH Zurich) |
1866 |
Self-Support Few-Shot Semantic Segmentation |
Qi Fan (HKUST)*; Wenjie Pei (Harbin Institute of Technology, Shenzhen); Yu-Wing Tai (Kuaishou Technology / HKUST); Chi-Keung Tang (Hong Kong University of Science and Technology) |
1868 |
Few-Shot Object Detection with Model Calibration |
Qi Fan (HKUST)*; Chi-Keung Tang (Hong Kong University of Science and Technology); Yu-Wing Tai (Kuaishou Technology / HKUST) |
1870 |
S2-VER: Semi-Supervised Visual Emotion Recognition |
Guoli Jia (Nankai University); Jufeng Yang (Nankai University)* |
1882 |
Self-Supervision Can Be a Good Few-Shot Learner |
Yuning Lu (USTC); Liangjian Wen (the Noah’s Ark Lab, Huawei Technologies Company Limited); Jianzhuang Liu (Huawei Noah’s Ark Lab); Yajing Liu (USTC); Xinmei Tian (USTC)* |
1886 |
My View is the Best View: Procedure Learning from Egocentric Videos |
Siddhant Bansal (IIIT, Hyderabad)*; Chetan Arora (Indian Institute of Technology Delhi); C.V. Jawahar (IIIT-Hyderabad) |
1894 |
Trace Controlled Text to Image Generation |
Kun Yan (Beihang University)*; Lei Ji (Microsoft); Chenfei Wu (Microsoft); Jianmin Bao (microsoft.com); Ming Zhou (SINOVATION VENTURES); Nan Duan (Microsoft Research); Shuai Ma (Beihang University) |
1925 |
Towards Comprehensive Representation Enhancement in Semantics-guided Self-supervised Monocular Depth Estimation |
Jingyuan Ma (Hikvision Research Institute)*; Xiangyu Lei (Hikvision Research Institute); Nan Liu (Hikvision); Zhao Xian (Hikvision); Shiliang Pu (Hikvision Research Institute) |
1929 |
Calibration-free Multi-view Crowd Counting |
Qi Zhang (City University of Hong Kong, Hong Kong)*; Antoni Chan (City University of Hong Kong, Hong Kong) |
1930 |
Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training |
Zhenyu Li (Harbin Institute of Technology)*; Zehui Chen (University of Science and Technology of China); Ang Li (SenseTime Research); Liangji Fang (Sensetime Research); Qinhong Jiang (SenseTime Research; Shanghai AI Laboratory); Xianming Liu (Harbin Institute of Technology); Junjun Jiang (Harbin Institute of Technology) |
1940 |
Online Continual Learning with Contrastive Vision Transformer |
Zhen Wang (The University of Sydney)*; Liu Liu (The University of Sydney); Yajing Kong (The University of Sydney); Jiaxian Guo (The University of Sydney); Dacheng Tao (JD.com) |
1946 |
COO: Comic Onomatopoeia Dataset for Recognizing Arbitrary or Truncated Texts |
Jeonghun Baek (The University of Tokyo)*; Yusuke Matsui (The University of Tokyo); Kiyoharu Aizawa (The University of Tokyo) |
1947 |
BungeeNeRF: Progressive Neural Radiance Field for Extreme Multiscale Scene Rendering |
Yuanbo Xiangli (Chinese University of Hong Kong)*; Linning Xu (CUHK); Xingang Pan (Max Planck Institute for Informatics); Nanxuan Zhao (University of Bath); Anyi Rao (The Chinese University of Hong Kong); Christian Theobalt (MPI Informatik); Bo Dai (Shanghai AI Lab); Dahua Lin (The Chinese University of Hong Kong) |
1951 |
AiATrack: Attention in Attention for Transformer Visual Tracking |
Shenyuan Gao (Huazhong University of Science and Technology)*; Chunluan Zhou (Wormpex AI Research); Chao Ma (Shanghai Jiao Tong University); Xinggang Wang (Huazhong University of Science and Technology); Junsong Yuan (State University of New York at Buffalo, USA) |
1952 |
Learning Invariant Visual Representations for Compositional Zero-Shot Learning |
Tian Zhang (Beijing University of Posts and Telecommunications); Kongming Liang (Beijing University of Posts and Telecommunications)*; Ruoyi Du (Beijing University of Posts and Telecommunications); Xian Sun (Aerospace Information Research Institute, Chinese Academy of Sciences); Zhanyu Ma (Beijing University of Posts and Telecommunications); Jun Guo (Beijing University of Posts and Telecommunications) |
1954 |
Image Coding for Machines with Omnipotent Feature Learning |
Ruoyu Feng (University of Science and Technology of China)*; Xin Jin (University of Science and Technology of China); Zongyu Guo (University of Science and Technology of China); Runsen Feng (University of Science and Technology of China); Yixin Gao (University of Science and Technology of China); Tianyu He (Microsoft Research Asia); Zhizheng Zhang (Microsoft Research); Simeng Sun (University of Science and Technology of China); Zhibo Chen (University of Science and Technology of China) |
1959 |
MOTCOM: The Multi-Object Tracking Dataset Complexity Metric |
Malte Pedersen (Aalborg University)*; Joakim Bruslund Haurum (Aalborg University); Patrick Dendorfer (TUM); Thomas B. Moeslund (Aalborg University) |
1980 |
How Severe is Benchmark-Sensitivity in Video Self-Supervised Learning? |
Fida Mohammad Thoker (University of Amsterdam)*; Hazel Doughty (University of Amsterdam); Piyush Nitin Bagad (University of Amsterdam); Cees Snoek (University of Amsterdam) |
1982 |
Rethinking Robust Representation Learning Under Fine-grained Noisy Faces |
Bingqi Ma (Sensetime Group Limited)*; Guanglu Song (Sensetime); Boxiao Liu (Institute of Computing Technology, Chinese Academy of Sciences); Yu Liu (SenseTime Group LTD) |
1986 |
Feature Representation Learning for Unsupervised Cross-domain Image Retrieval |
Conghui Hu (National University of Singapore)*; Gim Hee Lee (National University of Singapore) |
1987 |
Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation |
Sunghwan Hong (Korea University); Seokju Cho (Korea University); Jisu Nam (Korea University); Stephen Lin (Microsoft Research); Seungryong Kim (Korea University)* |
1988 |
Spatial-Frequency Domain Information Integration for Pan-sharpening |
Man Zhou (University of Science and Technology of China); Jie Huang (University of Science and Technology of China); Keyu Yan (University of Science and Technology of China); Hu Yu (University of Science and Technology of China); Xueyang Fu (University of Science and Technology of China); Aiping Liu (University of Science and Technology of China); Xian Wei (East China Normal University); Feng Zhao (University of Science and Technology of China)* |
1991 |
TOCH: Spatio-Temporal Object-to-Hand Correspondence for Motion Refinement |
Keyang Zhou (University of Tübingen)*; Bharat Lal Bhatnagar (University of Tübingen, MPI Informatik); Jan E. Lenssen (TU Dortmund); Gerard Pons-Moll (University of Tübingen) |
1999 |
HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation |
Lukas Hoyer (ETH Zurich)*; Dengxin Dai (ETH Zurich); Luc Van Gool (ETH Zurich) |
2012 |
Combating Label Distribution Shift for Active Domain Adaptation |
Sehyun Hwang (POSTECH)*; Sohyun Lee (POSTECH); Sungyeon Kim (POSTECH); Jungseul Ok (POSTECH); Suha Kwak (POSTECH) |
2016 |
GIPSO: Geometrically Informed Propagation for Online Adaptation in 3D LiDAR Segmentation |
Cristiano Saltori (University of Trento)*; Evgeny Krivosheev (University of Trento); Stéphane Lathuilière (Telecom-Paris); Nicu Sebe (University of Trento); Fabio Galasso (Sapienza University); Giuseppe Fiameni (NVIDIA); Elisa Ricci (University of Trento); Fabio Poiesi (Fondazione Bruno Kessler) |
2025 |
SuperLine3D: Self-supervised Line Segmentation and Description for LiDAR Point Cloud |
Xiangrui Zhao (Zhejiang University)*; Sheng Yang (Alibaba Group); Tianxin Huang (Zhejiang University); Jun Chen (Zhejiang University); Teng Ma (Alibaba Group); Mingyang Li (Alibaba A.I. Labs); Yong Liu (Zhejiang University) |
2031 |
Efficient Meta-Tuning for Content-aware Neural Video Delivery |
Xiaoqi Li (Columbia University in the City of New York)*; Jiaming Liu (Peking University); Shizun Wang (Beijing University of Posts and Telecommunications); Cheng Lyu (Beijing University of Posts and Telecommunications); Ming Lu (Intel Labs China); Yurong Chen (Intel Labs China); Anbang Yao (Intel Labs China); Yandong Guo (OPPO Research Institute); Shanghang Zhang (University of California, Berkeley) |
2033 |
PoseTrans: A Simple Yet Effective Pose Transformation Augmentation for Human Pose Estimation |
Wentao Jiang (Beihang University)*; Sheng Jin (The University of Hong Kong); Wentao Liu (Sensetime); Chen Qian (SenseTime); Ping Luo (The University of Hong Kong); Si Liu (Beihang University) |
2039 |
3D-Aware Semantic-Guided Generative Model for Human Synthesis |
Jichao Zhang (University of Trento)*; Enver Sangineto (University of Modena and Reggio Emilia); Hao Tang (ETH Zurich); Aliaksandr Siarohin (Snapchat); Zhun Zhong (University of Trento); Nicu Sebe (University of Trento); Wei Wang (EPFL) |
2041 |
Improving Covariance Conditioning of the SVD Meta-layer by Orthogonality |
Yue Song (University of Trento)*; Nicu Sebe (University of Trento); Wei Wang (EPFL) |
2050 |
CoSMix: Compositional Semantic Mix for Domain Adaptation in 3D LiDAR Segmentation |
Cristiano Saltori (University of Trento)*; Fabio Galasso (Sapienza University); Giuseppe Fiameni (NVIDIA); Nicu Sebe (University of Trento); Elisa Ricci (University of Trento); Fabio Poiesi (Fondazione Bruno Kessler) |
2054 |
Streaming Multiscale Deep Equilibrium Models |
Can Ufuk Ertenli (Middle East Technical University)*; Emre Akbas (METU); Ramazan Gokberk Cinbis (METU) |
2057 |
AvatarCap: Animatable Avatar Conditioned Monocular Human Volumetric Capture |
Zhe Li (Tsinghua University)*; Zerong Zheng (Tsinghua University); Hongwen Zhang (Tsinghua University); Chaonan Ji (Tsinghua University); Yebin Liu (Tsinghua University) |
2061 |
Hierarchical Average Precision Training for Pertinent Image Retrieval |
Elias Ramzi (Conservatoire National des Arts et Métiers)*; Nicolas Audebert (Cnam); Nicolas Thome (CNAM, Paris); Clément Rambour (Cnam); Xavier B Bitot (Coexya) |
2087 |
Fashionformer: A Simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition |
Shilin Xu (Peking University); Xiangtai Li (Peking University)*; Jingbo Wang (The Chinese University of Hong Kong); Guangliang Cheng (Sensetime Group Limited); Yunhai Tong (Peking University); Dacheng Tao (JD.com) |
2088 |
Out-of-Distribution Detection with Semantic Mismatch under Masking |
Yijun Yang (The Chinese University of Hong Kong)*; Ruiyuan Gao (The Chinese University of Hong Kong); Qiang Xu (The Chinese University of Hong Kong) |
2104 |
Target-absent Human Attention |
Zhibo Yang (Stony Brook University)*; Sounak Mondal (Stony Brook University); Seoyoung Ahn (Stony Brook University); Gregory Zelinsky (Stony Brook University); Minh Hoai (Stony Brook University); Dimitris Samaras (Stony Brook University) |
2105 |
Reference-based Image Super-Resolution with Deformable Attention Transformer |
Jiezhang Cao (ETH Zürich)*; Jingyun Liang (ETH Zurich); Kai Zhang (ETH Zurich); Yawei Li (ETH Zurich); Yulun Zhang (ETH Zurich); Wenguan Wang (Eidgenössische Technische Hochschule Zürich); Luc Van Gool (ETH Zurich) |
2116 |
Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers |
Junhyeong Cho (POSTECH)*; Kim Youwang (POSTECH); Tae-Hyun Oh (POSTECH) |
2118 |
Learning to Generate Realistic LiDAR Point Cloud |
Vlas Zyrianov (University of Illinois Urbana Champaign); Xiyue Zhu (University of Illinois); Shenlong Wang (UIUC)* |
2124 |
GeoRefine: Self-Supervised Online Depth Refinement for Accurate Dense Mapping |
Pan Ji (OPPO US Research Center)*; Qingan Yan (OPPO US Research Center); Yuxin Ma (Wing LLC); Yi Xu (OPPO US Research Center) |
2134 |
Transform your Smartphone into a DSLR Camera: Learning the ISP in the Wild |
Ardhendu Shekhar Tripathi (ETH Zurich)*; Martin Danelljan (ETH Zurich); Samarth Shukla (ETH Zurich); Radu Timofte (University of Wurzburg & ETH Zurich); Luc Van Gool (ETH Zurich) |
2138 |
Uncertainty-Based Spatial-Temporal Attention for Online Action Detection |
Hongji Guo (Rensselaer Polytechnic Institute)*; Zhou Ren (Wormpex AI Research); Yi Wu (Wormpex AI Research); Gang Hua (Wormpex AI Research); Qiang Ji (Rensselaer Polytechnic Institute) |
2144 |
Video Question Answering with Iterative Video-Text Co-Tokenization |
AJ Piergiovanni (Google)*; Kairo Morton (Massachusetts Institute of Technology); Weicheng Kuo (Google); Michael S Ryoo (Google; Stony Brook University); Anelia Angelova (Google) |
2145 |
LaTeRF: Label and Text Driven Object Radiance Fields |
Ashkan Mirzaei (University of Toronto)*; Yash Mukund Kant (University of Toronto); Jonathan Kelly (University of Toronto); Igor Gilitschenski (University of Toronto) |
2146 |
Temporally Consistent Semantic Video Editing |
Yiran Xu (University of Maryland, College Park)*; Badour A Sh AlBahar (Virginia Tech); Jia-Bin Huang (Facebook) |
2149 |
SPot-the-Difference Self-Supervised Pre-training for Anomaly Detection and Segmentation |
Yang Zou (Amazon AI)*; Jongheon Jeong (KAIST); Latha Pemula (Amazon); Dongqing Zhang (Amazon); Onkar Dabeer (Amazon) |
2151 |
Exploring Plain Vision Transformer Backbones for Object Detection |
Yanghao Li (Facebook AI Research)*; Hanzi Mao (Facebook AI Research); Ross Girshick (FAIR); Kaiming He (Facebook AI Research) |
2152 |
Fine-grained Egocentric Hand-Object Segmentation: Dataset, Model, and Applications |
Lingzhi Zhang (University of Pennsylvania)*; Shenghao Zhou (University of Pennsylvania); Simon Stent (Toyota Research Institute); Jianbo Shi (University of Pennsylvania) |
2154 |
Is It Necessary to Transfer Temporal Knowledge for Domain Adaptive Video Semantic Segmentation? |
Xinyi Wu (University of South Carolina); Zhenyao Wu (University of South Carolina)*; Jin Wan (Beijing Jiaotong University); Lili Ju (University of South Carolina); Song Wang (University of South Carolina) |
2162 |
GIMO: Gaze-Informed Human Motion Prediction in Context |
Yang Zheng (Tsinghua University); Yanchao Yang (Stanford University)*; Kaichun Mo (Stanford); Jiaman Li (University of Southern California); Tao Yu (Tsinghua University); Yebin Liu (Tsinghua University); Karen Liu (Stanford); Leonidas Guibas (Stanford University) |
2166 |
Error Compensation Framework for Flow-Guided Video Inpainting |
Jaeyeon Kang (Yonsei University); Seoung Wug Oh (Adobe Research); Seon Joo Kim (Yonsei University)* |
2170 |
Decomposing The Tangent of Occluding Boundaries According to Curvatures and Torsions |
Huizong Yang (Georgia Institute of Technology)*; Anthony Yezzi (Georgia Institute of Technology) |
2171 |
CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution |
Taeho Kim (University of Colorado at Boulder)*; Yongin Kwon (Electronics and Telecommunications Research Institute); Jemin Lee (Electronics and Telecommunications Research Institute); Taeho Kim (Electronics and Telecommunications Research Institute); Sangtae Ha (University of Colorado at Boulder) |
2180 |
Scraping Textures from Natural Images for Synthesis and Editing |
Xueting Li (University of California, Merced)*; Xiaolong Wang (UCSD); Ming-Hsuan Yang (University of California at Merced); Alexei A Efros (UC Berkeley); Sifei Liu (NVIDIA) |
2203 |
Self-supervised Learning of Visual Graph Matching |
Chang Liu (Shanghai Jiao Tong University); Shaofeng Zhang (Shanghai Jiao Tong University); Xiaokang Yang (Shanghai Jiao Tong University of China); Junchi Yan (Shanghai Jiao Tong University)* |
2206 |
Disentangling Architecture and Training for Optical Flow |
Deqing Sun (Google)*; Charles Herrmann (Google); Fitsum Reda (Google); Michael Rubinstein (Google); David J Fleet (University of Toronto); William T Freeman (Google) |
2217 |
PointFix: Learning to Fix Domain Bias for Robust Online Stereo Adaptation |
Kwonyoung Kim (Yonsei University); JungIn Park (Yonsei University); Jiyoung Lee (NAVER AI Lab); Dongbo Min (Ewha Womans University); Kwanghoon Sohn (Yonsei Univ.)* |
2218 |
Teaching Where to Look: Attention Similarity Knowledge Distillation for Low Resolution Face Recognition |
Sungho Shin (Gwangju Institute of Science and Technology); Joosoon Lee (Gwangju Institute of Science and Technology); Junseok Lee (GIST (Gwangju Institute of Science and Technology)); Yeonguk Yu (Gwangju Institute of Science and Technology); Kyoobin Lee (Gwangju Institute of Science and Technology)* |
2219 |
Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows |
Danyang Tu (Shanghai Jiao Tong University)*; Xiongkuo Min (Shanghai Jiao Tong University); Huiyu Duan (Shanghai Jiao Tong University); Guodong Guo (Baidu); Guangtao Zhai (Shanghai Jiao Tong University); Wei Shen (Shanghai Jiao Tong University) |
2221 |
Single Stage Virtual Try-on via Deformable Attention Flows |
Shuai Bai (Alibaba Group)*; Huiling Zhou (Alibaba); Zhikang Li (DAMO Academy, Alibaba Group); Chang Zhou (Alibaba Group); Hongxia Yang (Alibaba Group) |
2222 |
Learning Deep Non-Blind Image Deconvolution Without Ground Truths |
Yuhui Quan (South China University of Technology)*; Zhuojie Chen (South China University of Technology); Huan Zheng (National University of Singapore); Hui Ji (National University of Singapore) |
2233 |
Rethinking Zero-shot Action Recognition: Learning from Latent Atomic Actions |
Yijun Qian (Carnegie Mellon University)*; Lijun Yu (Carnegie Mellon University); Wenhe Liu (Carnegie Mellon University); Alexander Hauptmann (Carnegie Mellon University) |
2234 |
NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors |
Jiepeng Wang (The University of Hong Kong); Peng Wang (The University of Hong Kong); Xiaoxiao Long (The University of Hong Kong); Christian Theobalt (MPI Informatik); Taku Komura (The University of Hong Kong); Lingjie Liu (Max Planck Institute for Informatics); Wenping Wang (The University of Hong Kong)* |
2237 |
Rethinking Data Augmentation for Robust Visual Question Answering |
Long Chen (Columbia University)*; Yuhang Zheng (Zhejiang University); Jun Xiao (Zhejiang University) |
2240 |
Dual-Domain Self-Supervised Learning and Model Adaption for Deep Compressive Imaging |
Yuhui Quan (South China University of Technology)*; Xinran Qin (South China University of Technology); Tongyao Pang (National University of Singapore); Hui Ji (National University of Singapore) |
2242 |
Explicit Image Caption Editing |
Zhen Wang (Zhejiang University); Long Chen (Columbia University)*; Wenbo Ma (Zhejiang University); Guangxing Han (Columbia University); Yulei Niu (Columbia University); Jian Shao (Zhejiang University); Jun Xiao (Zhejiang University) |
2255 |
SphereFed: Hyperspherical Federated Learning |
Xin Dong (Harvard University)*; Sai Qian Zhang (Harvard University); Ang Li (Google DeepMind); H.T. Kung (Harvard University) |
2257 |
Local Color Distributions Prior for Image Enhancement |
Haoyuan Wang (City University of Hong Kong)*; Ke Xu (City University of Hong Kong); Rynson W.H. Lau (City University of Hong Kong) |
2267 |
Teaching with Soft Label Smoothing for Mitigating Noisy Labels in Facial Expressions |
Tohar Lukov (National University of Singapore)*; Na Zhao (NUS); Gim Hee Lee (National University of Singapore); Ser-Nam Lim (Facebook AI) |
2269 |
Multi-Modal Masked Pre-Training for Monocular Panoramic Depth Completion |
Zhiqiang Yan (Nanjing University of Science and Technology)*; Xiang Li (Nanjing University of Science and Technology); Kun Wang (Nanjing University of Science and Technology); Zhenyu Zhang (Tencent); Jun Li (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology) |
2272 |
2D Amodal Instance Segmentation Guided by 3D Shape Prior |
Zhixuan Li (Peking University); Weining Ye (Peking University); Tingting Jiang (Peking University)*; Tiejun Huang (Peking University) |
2280 |
How to Synthesize a Large-Scale and Trainable Micro-Expression Dataset? |
Yuchi Liu (Australian National University)*; Zhongdao Wang (Tsinghua University); Tom Gedeon (The Australian National University); Liang Zheng (Australian National University) |
2285 |
HEAD: HEtero-Assists Distillation for Heterogeneous Object Detectors |
Luting Wang (Beihang University)*; Xiaojie Li (SenseTime); Yue Liao (Beihang University); Zeren Jiang (ETH Zurich); Jianlong Wu (Shandong University); Fei Wang (University of Science and Technology of China); Chen Qian (SenseTime); Si Liu (Beihang University) |
2293 |
Meta Spatio-Temporal Debiasing for Video Scene Graph Generation |
Li Xu (Singapore University of Technology and Design)*; Haoxuan Qu (Singapore University of Technology and Design); Jason Kuen (Adobe Research); Jiuxiang Gu (Adobe Research); Jun Liu (Singapore University of Technology and Design) |
2307 |
A Sliding Window Scheme for Online Temporal Action Localization |
Young Hwi Kim (Yonsei University); Hyolim Kang (Yonsei University); Seon Joo Kim (Yonsei University)* |
2310 |
Ultra-high-resolution unpaired stain transformation via Kernelized Instance Normalization |
Ming-Yang Ho (aetherAI)*; Min-Sheng Wu (aetherAI); Che-Ming Wu (aetherAI) |
2311 |
SESS: Saliency Enhancing with Scaling and Sliding |
Osman Tursun (Queensland University of Technology)*; Simon Denman (Queensland University of Technology, Australia); Sridha Sridharan (QUT); Clinton Fookes (Queensland University of Technology) |
2312 |
Data Efficient 3D Learner via Knowledge Transferred from 2D Model |
Ping-Chung Yu (National Tsing Hua University)*; Cheng Sun (National Tsing Hua University); Min Sun (NTHU) |
2319 |
MeshMAE: Masked Autoencoders for 3D Mesh Data Analysis |
Yaqian Liang (Wuhan University); Shanshan Zhao (JD.COM); Baosheng Yu (The University of Sydney); Jing Zhang (The University of Sydney); Fazhi He (Wuhan University)* |
2327 |
ERA: Expert Retrieval and Assembly for Early Action Prediction |
Lin Geng Foo (Singapore University of Technology and Design)*; Tianjiao Li (Singapore University of Technology and Design); Hossein Rahmani (Lancaster University); Qiuhong Ke (Monash University); Jun Liu (Singapore University of Technology and Design) |
2328 |
Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI Detection |
Xiaoqian Wu (Shanghai Jiao Tong University); Yong-Lu Li (Shanghai Jiao Tong University); Xinpeng Liu (Shanghai Jiao Tong University); Junyi Zhang (Shanghai Jiao Tong University); Yuzhe Wu (DongHua University); Cewu Lu (Shanghai Jiao Tong University)* |
2334 |
Improving GANs for Long-Tailed Data through Group Spectral Regularization |
Harsh Rangwani (Indian Institute of Science)*; Naman Jaswani (Indian Institute of Science); Tejan Karmali (Indian Institute of Science, Bengaluru); Varun Jampani (Google); Venkatesh Babu Radhakrishnan (Indian Institute of Science) |
2336 |
Hierarchical Semantic Regularization of Latent Spaces in StyleGANs |
Tejan Karmali (Indian Institute of Science, Bengaluru)*; Rishubh Parihar (Indian Institute of Science, Bangalore); Susmit Agrawal (Indian Institute of Science); Harsh Rangwani (Indian Institute of Science); Varun Jampani (Google); Maneesh K Singh (Motive Technologies); Venkatesh Babu Radhakrishnan (Indian Institute of Science) |
2337 |
Symmetry Regularization and Saturating Nonlinearity for Robust Quantization |
Sein Park (POSTECH); Yeongsang Jang (POSTECH); Eunhyeok Park (POSTECH)* |
2350 |
IntereStyle: Encoding an Interest Region for Robust StyleGAN Inversion |
Seung Jun Moon (KAIST)*; Gyeong-Moon Park (Kyung Hee University) |
2369 |
Improving RGB-D Point Cloud Registration by Learning Multi-scale Local Linear Transformation |
Ziming Wang (Beihang University); Xiaoliang Huo (Beihang University); Zhenghao Chen (University of Sydney); Jing Zhang (Beihang University); Lu Sheng (Beihang University)*; Dong Xu (The University of Hong Kong) |
2373 |
Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis |
Shuai Shen (Tsinghua University); Wanhua Li (Tsinghua University); Zheng Zhu (Tsinghua University); Yueqi Duan (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)* |
2378 |
StyleLight: HDR Panorama Generation for Lighting Estimation and Editing |
Guangcong Wang (Nanyang Technological University)*; Yinuo Yang (Nanyang Technological University); Chen Change Loy (Nanyang Technological University); Ziwei Liu (Nanyang Technological University) |
2379 |
You Should Look at All Objects |
Zhenchao Jin (University of Science and Technology of China)*; Dongdong Yu (ByteDance Inc.); Luchuan Song (University of Science and Technology of China); Zehuan Yuan (ByteDance Inc.); Lequan Yu (The University of Hong Kong) |
2384 |
BRNet: Exploring Comprehensive Features for Monocular Depth Estimation |
Wencheng Han (Beijing Institute of Technology)*; Junbo Yin (Beijing Institute of Technology); Xiaogang Jin (Zhejiang University); Dai Xiangdong (OPPO); Jianbing Shen (Inception Institute of Artificial Intelligence) |
2403 |
CoupleFace: Relation Matters for Face Recognition Distillation |
Jiaheng Liu (Beihang University)*; Haoyu Qin (SenseTime); Yichao Wu (Sensetime Group Limited); Jinyang Guo (The University of Sydney); Ding Liang (Sensetime Group Limited); Ke Xu (Beihang University) |
2404 |
Collaborating Domain-shared and Target-specific Feature Clustering for Cross-domain 3D Action Recognition |
Qinying Liu (University of Science and Technology of China); Zilei Wang (University of Science and Technology of China)* |
2406 |
Adaptive Spatial-BCE Loss for Weakly Supervised Semantic Segmentation |
Tong Wu (Beijing Institute of Technology); Guangyu Ryan Gao (Beijing Institute of Technology)*; Junshi Huang (Meituan); Xiaolin Wei (Meituan); Xiaoming Wei (Meituan); Chi Harold Liu (Beijing Institute of Technology) |
2418 |
Multi-Person 3D Pose and Shape Estimation via Inverse Kinematics and Refinement |
Junuk Cha (UNIST)*; Muhammad Saqlain (Ulsan National Institute of Science and Technology); GeonU Kim (UNIST); Mingyu Shin (Ulsan National Institute of Science and Technology); Seungryul Baek (UNIST) |
2423 |
Explaining Deepfake Detection by Analysing Image Matching |
Shichao Dong (Megvii); Jin Wang (Megvii); Haoqiang Fan (Megvii Inc. (Face++)); Jiajun Liang (Megvii); Renhe Ji (Megvii)* |
2424 |
L-CoDer: Language-based Colorization with Color-object Decoupling Transformer |
Zheng Chang (Beijing University of Posts and Telecommunications); Shuchen Weng (Peking University)*; Yu Li (International Digital Economy Academy); Si Li (Beijing University of Posts and Telecommunications); Boxin Shi (Peking University) |
2449 |
GitNet: Geometric Prior-based Transformation for Birds-Eye-View Segmentation |
Shi Gong (Huazhong University of Science and Technology); Xiaoqing Ye (Baidu Inc.); Xiao Tan (Baidu Inc.); Jingdong Wang (Baidu); Errui Ding (Baidu Inc.); Yu Zhou (Huazhong University of Science and Technology)*; Xiang Bai (Huazhong University of Science and Technology) |
2459 |
Unsupervised Deep Multi-Shape Matching |
Dongliang Cao (Technical University of Munich); Florian Bernard (University of Bonn)* |
2463 |
GaitEdge: Beyond Plain End-to-end Gait Recognition for Better Practicality |
Junhao Liang (Southern University of Science and Technology in China)*; Chao Fan (SUSTech); Saihui Hou (Beijing Normal University); Chuanfu Shen (Southern University of Science and Technology); Yongzhen Huang (School of Artificial Intelligence, Beijing Normal University); Shiqi Yu (Southern University of Science and Technology) |
2483 |
EAutoDet: Efficient Architecture Search for Object Detection |
Xiaoxing Wang (Shanghai Jiao Tong University); Jiale Lin (Shanghai Jiao Tong University); Juanping Zhao (Guangdong OPPO Mobile Telecommunications Co., Ltd.); Xiaokang Yang (Shanghai Jiao Tong University of China); Junchi Yan (Shanghai Jiao Tong University)* |
2485 |
A Max-Flow based Approach for Neural Architecture Search |
Chao Xue (beijing university of posts and telecommunications)*; Xiaoxing Wang (Shanghai Jiao Tong University); Junchi Yan (Shanghai Jiao Tong University); Chun-Guang Li (Beijing University of Posts & Telecommunications) |
2488 |
Can Shuffling Video Benefit Temporal Bias Problem: A Novel Training Framework for Temporal Grounding |
Jiachang Hao (Beijing University of Posts and Telecommunications)*; Haifeng Sun (Beijing university of posts and telecommunications); Pengfei Ren (Beijing University of Posts and Telecommunications); Jingyu Wang (Beijing University of Posts and Telecommunications); Qi Qi (Beijing University of Posts and Telecommunications); Jianxin Liao (beijing university of posts and telecommunications) |
2494 |
tSF: Transformer-based Semantic Filter for Few-Shot Learning |
Jinxiang Lai (Tencent)*; Siqian Yang (Tencent); Wenlong Liu (Tencent); Yi Zeng (Tencent); Zhongyi Huang (Tencent); Wenlong Wu (Tencent); Jun Liu (Tencent); Bin-Bin Gao (Tencent); Chengjie Wang (Tencent; Shanghai Jiao Tong University) |
2501 |
Dense Gaussian Processes for Few-Shot Segmentation |
Joakim Johnander (Linköping University)*; Johan Edstedt (Linköping University); Fahad Shahbaz Khan (MBZUAI); Michael Felsberg (Linköping University); Martin Danelljan (ETH Zurich) |
2507 |
Adversarial Feature Augmentation for Cross-domain Few-shot Classification |
Yanxu Hu (Sun Yat-sen University); Andy J Ma (Sun Yat-sen University)* |
2511 |
Real-Time Neural Character Rendering with Pose-Guided Multiplane Images |
Hao Ouyang (HKUST)*; Bo Zhang (Microsoft Research Asia); Pan Zhang (Shanghai AI Laboratory); Hao Yang (Microsoft Research Asia); Dong Chen (Microsoft Research Asia); Jiaolong Yang (Microsoft Research); Qifeng Chen (HKUST); Fang Wen (Microsoft Research Asia ) |
2512 |
Constructing Balance from Imbalance for Long-tailed Image Recognition |
Yue Xu (Shanghai Jiao Tong University); Yong-Lu Li (Shanghai Jiao Tong University); Jiefeng Li (Shanghai Jiao Tong University); Cewu Lu (Shanghai Jiao Tong University)* |
2516 |
SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views |
Xiaoxiao Long (The University of Hong Kong)*; Cheng Lin (Tencent); Peng Wang (The University of Hong Kong); Taku Komura (The University of Hong Kong); Wenping Wang (The University of Hong Kong) |
2538 |
Dual Perspective Network for Audio Visual Event Localization |
Varshanth Rao (Huawei Technologies)*; Md Ibrahim Khalil (Huawei Noah’s Ark Laboratory); Haoda Li (University of California, Berkeley); Peng Dai (Huawei Technologies Inc. Canada); Juwei Lu (Huawei Noah’s Ark Lab) |
2542 |
SiamDoGe: Domain Generalizable Semantic Segmentation using Siamese Network |
Zhenyao Wu (University of South Carolina)*; Xinyi Wu (University of South Carolina); Xiaoping Zhang (Wuhan University); Song Wang (University of South Carolina); Lili Ju (University of South Carolina) |
2545 |
Is Appearance Free Action Recognition Possible? |
Filip Ilic (Graz University of Technology)*; Rick Wildes (York University); Thomas Pock (Graz University of Technology) |
2557 |
Detecting Twenty-thousand Classes using Image-level Supervision |
Xingyi Zhou (The University of Texas at Austin)*; Rohit Girdhar (Facebook AI Research); Armand Joulin (Facebook AI Research); Philipp Kraehenbuehl (UT Austin); Ishan Misra (Facebook AI Research) |
2558 |
DCL-Net: Deep Correspondence Learning Network for 6D Pose Estimation |
Hongyang Li (South China University of Technology)*; Jiehong Lin (South China University of Technology); Kui Jia (South China University of Technology) |
2565 |
Learning Cross-Video Neural Representations for High-Quality Frame Interpolation |
Wentao Shangguan (Washington University in St Louis); Yu Sun (Washington University in St. Louis); Weijie Gan (Washington University in St. Louis); Ulugbek S. Kamilov (Washington University in St. Louis)* |
2568 |
Learning Visibility for Robust Dense Human Body Estimation |
Chun-Han Yao (University of California at Merced)*; Jimei Yang (Adobe); Duygu Ceylan (Adobe Research); Yi Zhou (Adobe Research); Yang Zhou (Adobe Research); Ming-Hsuan Yang (University of California at Merced) |
2573 |
Texturify: Generating Textures on 3D Shape Surfaces |
Yawar Siddiqui (Technical University of Munich)*; Justus Thies (Max Planck Institute for Intelligent Systems); Fangchang Ma (Apple Inc.); Qi Shan (Apple Inc.); Matthias Niessner (Technical University of Munich); Angela Dai (Technical University of Munich) |
2575 |
Unsupervised Selective Labeling for More Effective Semi-Supervised Learning |
Xudong Wang (UC Berkeley / ICSI); Long Lian (UC Berkeley / ICSI); Stella X Yu (UC Berkeley / ICSI)* |
2576 |
Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly |
Spencer Whitehead (Meta AI)*; Suzanne Petryk (UC Berkeley); Vedaad Shakib (UC Berkeley); Joseph E Gonzalez (UC Berkeley); Trevor Darrell (UC Berkeley); Anna Rohrbach (UC Berkeley); Marcus Rohrbach (Facebook AI Research) |
2581 |
Studying Bias in GANs through the Lens of Race |
Vongani H Maluleke (University of California, Berkeley); Neerja Thakkar (University of California, Berkeley)*; Tim Brooks (UC Berkeley); Ethan Weber (UC Berkeley); Trevor Darrell (UC Berkeley); Alexei A Efros (UC Berkeley); Angjoo Kanazawa (University of California Berkeley); Devin Guillory (UC Berkeley) |
2583 |
On Multi-Domain Long-Tailed Recognition, Imbalanced Domain Generalization and Beyond |
Yuzhe Yang (MIT)*; Hao Wang (Rutgers University); Dina Katabi (Massachusetts Institute of Technology) |
2584 |
Disentangling Object Motion and Occlusion for Unsupervised Multi-frame Monocular Depth |
Ziyue Feng (Clemson University)*; Liang Yang (Apple Inc); Longlong Jing (Waymo LLC); Haiyan Wang (The City College of New York); YingLi Tian (City University of New York); Bing Li (Clemson University) |
2586 |
Autoregressive 3D Shape Generation via Canonical Mapping |
An-Chieh Cheng (National Tsing Hua University); Xueting Li (University of California, Merced); Sifei Liu (NVIDIA)*; Min Sun (NTHU); Ming-Hsuan Yang (University of California at Merced) |
2589 |
Learning Continuous Implicit Representation for Near-Periodic Patterns |
Bowei Chen (CMU)*; Tiancheng Zhi (ByteDance); Martial Hebert (cmu); Srinivasa Narasimhan (Carnegie Mellon University, USA) |
2596 |
Robust Landmark-based Stent Tracking in X-ray Fluoroscopy |
Luojie Huang (Johns Hopkins University); Yikang Liu (United Imaging Intelligence America); Li Chen (University of Washington); Eric Z. Chen (United Imaging Intelligence America); Xiao Chen (United Imaging Intelligence America); Shanhui Sun (United Imaging Intelligence America)* |
2598 |
Depth Field Networks for Generalizable Multi-view Scene Representation |
Vitor Guizilini (Toyota Research Institute)*; Igor Vasiljevic (Toyota Research Institute); Jiading Fang (Toyota Technological Institute at Chicago); Rareș A Ambruș (Toyota Research Institute); Greg Shakhnarovich (Toyota Technological Institute at Chicago); Matthew Walter (Toyota Technological Institute at Chicago); Adrien Gaidon (Toyota Research Institute) |
2601 |
Max Pooling with Vision Transformers reconciles class and shape in weakly supervised semantic segmentation |
Simone Rossetti (Sapienza University); Damiano Zappia (Deepplants S.r.l.); Marta Sanzari (Sapienza University of Rome); Marco Schaerf (Sapienza University of Rome); fiora pirri (University of Rome, Sapienza)* |
2605 |
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features |
Van-Quang Nguyen (Tohoku University)*; Masanori Suganuma (Tohoku University / RIKEN AIP); Takayuki Okatani (Tohoku University/RIKEN AIP) |
2609 |
Learning Semantic Correspondence with Sparse Annotations |
Shuaiyi Huang (University of Maryland, College Park)*; Luyu Yang (University of Maryland, College Park); Bo He (University of Maryland); Songyang Zhang (Shanghai AI Laboratory); Xuming He (ShanghaiTech University); Abhinav Shrivastava (University of Maryland) |
2610 |
A Real World Dataset for Multi-view 3D Reconstruction |
Rakesh Shrestha (Simon Fraser University)*; Siqi Hu (Alibaba damo academy); Minghao Gou (Shanghai Jiao Tong University); Ziyuan Liu (Huawei group); Ping Tan (Simon Fraser University) |
2620 |
Social ODE: Multi-Agent Trajectory Forecasting with Neural Ordinary Differential Equations |
Song Wen (Rutgers University)*; Hao Wang (Rutgers University); Dimitris N. Metaxas (Rutgers) |
2621 |
3D Instances as 1D Kernels |
Yizheng Wu (Huazhong Univ. of Sci.&Tech.); Min Shi (Huazhong University of Science and Technology); Shuaiyuan Du (Huazhong Univ. of Sci.&Tech. ); Hao Lu (Huazhong University of Science and Technology); Zhiguo Cao (Huazhong Univ. of Sci.&Tech.)*; Weicai Zhong (Huawei CBG Consumer Cloud Service Big Data Platform Dept.) |
2624 |
Context-Aware Streaming Perception in Dynamic Environments |
Gur-Eyal Sela (UC Berkeley)*; Ionel Gog (UC Berkeley); Justin Wong (UC Berkeley); Kumar Krishna Agrawal (UC Berkeley); Xiangxi Mo (UC Berkeley); Sukrit Kalra (UC Berkeley); Peter Schafhalter (UC Berkeley); Eric Leong (UC Berkeley); Xin Wang (Microsoft Research); Bharathan Balaji (Amazon); Joseph E Gonzalez (UC Berkeley); Ion Stoica (UC Berkeley) |
2625 |
PointTree: Transformation-Robust Point Cloud Encoder with Relaxed K-D Trees |
Jun-Kun Chen (University of Illinois at Urbana-Champaign)*; Yu-Xiong Wang (University of Illinois at Urbana-Champaign) |
2631 |
Dense Siamese Network for Dense Unsupervised Learning |
Wenwei Zhang (NTU)*; Jiangmiao Pang (CUHK); Kai Chen (SenseTime Research); Chen Change Loy (Nanyang Technological University) |
2633 |
Uncertainty-aware Multi-modal Learning via Cross-modal Random Network Prediction |
Hu Wang (the University of Adelaide)*; Jianpeng Zhang (Northwestern Polytechnical University); Yuanhong Chen (University of Adelaide); Congbo Ma (The University of Adelaide); Jodie C Avery (University of Adelaide); Mary L Hull (University of Adelaide); Gustavo Carneiro (University of Adelaide) |
2638 |
Enhanced Accuracy and Robustness via Multi-Teacher Adversarial Distillation |
Shiji Zhao (Beihang University); Jie Yu (Beihang University); Zhenlong Sun (Tencent Technology Co.Ltd); Bo Zhang (WeChat Search Application Department, Tencent); Xingxing Wei (Beihang University)* |
2645 |
End-to-end graph-constrained vectorized floorplan generation with panoptic refinement |
Jiachen Liu (Pennsylvania State University)*; Yuan Xue (Johns Hopkins University); Jose P. Duarte (Penn State University); Krishnendra Shekhawat (BITS Pilani); Zihan Zhou (Manycore Tech Inc.); Sharon Xiaolei Huang (The Pennsylvania State University) |
2649 |
Context Enhanced Stereo Transformer |
weiyu Guo (University of Chinese Academy of Sciences)*; Zhaoshuo Li (Johns Hopkins University); Yongkui Yang (Shenzhen Institute of Advanced Technology,Chinese Academy of Sciences); Zheng Wang (Shenzhen Institutes of Advanced Technology); Russ Taylor (Johns Hopkins University); Mathias Unberath (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Yingwei Li (Johns Hopkins University) |
2652 |
NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition |
Boyang Xia (Institute of Computing Technology, Chinese Academy of Science); Wenhao Wu (Baidu)*; Haoran Wang (Baidu); RUI SU (the University of Sydney); Dongliang He (Baidu); Haosen Yang (Harbin Institute of Technology); Xiaoran Fan (Institute of Computing Technology, Chinese Academy of Sciences); Wanli Ouyang (The University of Sydney) |
2663 |
Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning |
Yuxiao Chen (Rutgers University)*; Long Zhao (Google Research); Jianbo Yuan (Bytedance); Yu Tian (Rutgers); zhaoyang xia (Rutgers University); Shijie Geng (Rutgers University); Ligong Han (Rutgers University); Dimitris N. Metaxas (Rutgers) |
2666 |
Few-Shot Video Object Detection |
Qi Fan (HKUST)*; Chi-Keung Tang (Hong Kong University of Science and Technology); Yu-Wing Tai (Kuaishou Technology / HKUST) |
2667 |
Improving the Reliability for Confidence Estimation |
Haoxuan Qu (Singapore University of Technology and Design)*; Yanchao Li (Singapore University of Technology and Design); Lin Geng Foo (Singapore University of Technology and Design); Jason Kuen (Adobe Research); Jiuxiang Gu (Adobe Research); Jun Liu (Singapore University of Technology and Design) |
2686 |
Selective Query-guided Debiasing for Video Corpus Moment Retrieval |
Sunjae Yoon (KAIST)*; Ji Woo Hong (KAIST); Eunseop Yoon (KAIST); DaHyun Kim (KAIST); Junyeong Kim (Chung-Ang University); Hee Suk Yoon (KAIST); Chang D. Yoo (KAIST) |
2701 |
Posterior Refinement on Metric Matrix Improves Generalization in Metric Learning |
Mingda Wang (Shanghai Jiao Tong University); Canqian Yang (Shanghai Jiao Tong University); Yi Xu (Shanghai Jiao Tong University)* |
2707 |
DISP6D: Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation |
Yilin Wen (The University of Hong Kong)*; Xiangyu Li (Brown University); Hao Pan (Microsoft Research); Lei Yang (The University of Hong Kong); Zheng Wang (SUSTech); Taku Komura (The University of Hong Kong); Wenping Wang (The University of Hong Kong) |
2709 |
Few-shot Image Generation with Mixup-based Distance Learning |
Chaerin Kong (Seoul National University); Jeesoo Kim (Naver Webtoon AI); Donghoon Han (Seoul National University); Nojun Kwak (Seoul National University)* |
2715 |
Data-Free Neural Architecture Search via Recursive Label Calibration |
Zechun Liu (Carnegie Mellon University); Zhiqiang Shen (Carnegie Mellon University)*; Yun Long (Google); Eric Xing (MBZUAI, CMU, and Petuum Inc.); Kwang-Ting Cheng (Hong Kong University of Science and Technology); Chas H Leichner (Google) |
2717 |
Distilling Object Detectors With Global Knowledge |
Sanli Tang (Hikvision Research Institute); Zhongyu Zhang (Hikvision Research Institute); Zhanzhan Cheng (Zhejiang University & Hikvision Research Institute)*; Jing Lu (Hikvision Research Institute); Yunlu Xu (Hikvision Research Institute); Yi Niu (Hikvision Research Institute); Fan He (Shanghai Jiao Tong University) |
2730 |
NEST: Neural Event Stack for Event-based Image Enhancement |
Minggui Teng (Peking University)*; Chu Zhou (Peking University); Hanyue Lou (Peking University); Boxin Shi (Peking University) |
2732 |
Multi-Granularity Distillation Scheme Towards Lightweight Semi-Supervised Semantic Segmentation |
Jie Qin (School of Artificial Intelligence, University of Chinese Academy of Sciences; Institute of Automation,Chinese Academy of Sciences)*; Jie Wu (ByteDance Inc); Ming Li (Xiamen University); Xuefeng Xiao (ByteDance Inc); Min Zheng (ByteDance); Xingang Wang (Institute of Automation, CAS) |
2740 |
A Style-Based GAN Encoder for High Fidelity Reconstruction of Images and Videos |
Xu YAO (Telecom ParisTech)*; Alasdair Newson (Telecom Paris); Yann Gousseau (Telecom Paris); PIERRE HELLIER (Interdigital (Technicolor)) |
2746 |
Unifying Visual Perception by Dispersible Points Learning |
Jianming Liang (Beihang University)*; Guanglu Song (Sensetime); Biao Leng (Beihang University); Yu Liu (SenseTime Group LTD) |
2747 |
Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes |
Haolin Liu (The Chinese University of Hong Kong, Shenzhen)*; Yujian Zheng (The Chinese University of Hong Kong, Shenzhen); Guanying CHEN (The Chinese University of Hong Kong, Shenzhen); Shuguang Cui (The Chinese University of Hong Kong, Shenzhen ); Xiaoguang Han (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen)) |
2756 |
Multimodal Transformer for Automatic 3D Annotation and Object Detection |
Chang Liu (The University of Hong Kong)*; Xiaoyan QIAN (The University of Hong Kong); Binxiao Huang (The University of Hong Kong); Xiaojuan Qi (The University of Hong Kong); Edmund Lam (The University of Hong Kong); Siew-Chong Tan (Nil); Ngai Wong (The University of Hong Kong) |
2761 |
SP-Net: Slowly Progressing Dynamic Inference Networks |
Huanyu Wang (Zhejiang University)*; Wenhu Zhang (Zhejiang University); Shihao Su (Zhejiang University); Hui Wang (Zhejiang University); Zhenwei Miao (DAMO Academy, Alibaba Group); Xin Zhan (DAMO Academy, Alibaba Group); Xi Li (Zhejiang University) |
2764 |
No Token Left Behind: Explainability-Aided Image Classification and Generation |
Roni Paiss (Tel Aviv University, Google); Hila Chefer (Tel Aviv University)*; Lior Wolf (Tel Aviv University, Israel) |
2766 |
Dynamically Transformed Instance Normalization Network for Generalizable Person Re-Identification |
BingLiang Jiao (Northwestern Polytechnical University); Lingqiao Liu (University of Adelaide); Liying Gao (Northwestern Polytechnical University); Guosheng Lin (Nanyang Technological University); Lu Yang (Northwestern Polytechnical University); Shizhou Zhang (Northwestern Polytechnical University); Peng Wang (Northwestern Polytechnical University)*; Yanning Zhang (Northwestern Polytechnical University) |
2772 |
Editable Indoor Lighting Estimation |
Henrique Weber (Université Laval)*; Mathieu Garon (Depix); Jean-Francois Lalonde (Université Laval) |
2783 |
PseCo: Pseudo Labeling and Consistency Training for Semi-Supervised Object Detection |
Gang Li (Nanjing University of Science and Technology)*; Xiang Li (Nanjing University of Science and Technology); Yujie Wang (Sensetime Research); Yichao Wu (Sensetime Group Limited); Ding Liang (Sensetime Group Limited); Shanshan Zhang (Max Planck Institute for Informatics) |
2786 |
CompNVS: Novel View Synthesis with Scene Completion |
Zuoyue Li (ETH Zurich)*; Tianxing Fan (Zhejiang University); Zhenqiang Li (The University of Tokyo); Zhaopeng Cui (Zhejiang University); Yoichi Sato (University of Tokyo); Marc Pollefeys (ETH Zurich / Microsoft); Martin R. Oswald (ETH Zurich) |
2787 |
Dynamic 3D Scene Analysis by Point Cloud Accumulation |
Shengyu Huang (ETH Zürich)*; Zan Gojcic (NVIDIA); Jiahui Huang (Tsinghua University); Andreas Wieser (ETH Zürich); Konrad Schindler (ETH Zurich) |
2798 |
FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs |
Ziqiang Li (University of Science and Technology of China)*; Chaoyue Wang (JD.com); Heliang Zheng (JD Explore Academy, JD.com); Jing Zhang (The University of Sydney); Bin Li (University of Science and Technology of China) |
2802 |
Resolving Copycat Problems in Visual Imitation Learning via Residual Action Prediction |
Chia-Chi Chuang (Tsinghua University); Donglin Yang (Tsinghua University); Chuan Wen (Tsinghua University)*; Yang Gao (Tsinghua University) |
2804 |
REALY: Rethinking the Evaluation of 3D Face Reconstruction |
Zenghao Chai (Tsinghua University); Haoxian Zhang (Tencent); Jing Ren (ETH Zurich); Di Kang (Tencent); Zhengzhuo Xu (Tsinghua University); Xuefei Zhe (Tencent AI lab); Chun Yuan (Graduate school at ShenZhen,Tsinghua university); Linchao Bao (Tencent AI Lab)* |
2806 |
TransMatting: Enhancing Transparent Objects Matting with Transformers |
huanqia cai (University of Chinese Academy of Sciences)*; Fanglei Xue (University of Chinese Academy of Sciences); Lele Xu (Key Laboratory of Space Utilization, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences); lili guo (Key Laboratory of Space Utilization, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences) |
2814 |
Diverse Image Inpainting with Normalizing Flow |
Cairong Wang (Graduate school at Shenzhen, Tsinghua University)*; Yiming M Zhu (Graduate school at ShenZhen,Tsinghua university); Chun Yuan (Graduate school at ShenZhen,Tsinghua university) |
2818 |
Video Activity Localisation with Uncertainties in Temporal Boundary |
Jiabo Huang (Queen Mary University of London)*; Hailin Jin (Adobe Research); Shaogang Gong (Queen Mary University of London); Yang Liu (Peking University) |
2822 |
SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling |
Chenjian Gao (Beihang University); Qian Yu (Beihang University)*; Lu Sheng (Beihang University); Yi-Zhe Song (University of Surrey); Dong Xu (The University of Hong Kong) |
2829 |
Exploring Resolution and Degradation Clues as Self-supervised Signal for Low Quality Object Detection |
Ziteng Cui (The University of Tokyo); Yingying Zhu (University of Texas Arlington); Lin Gu (RIKEN,AIP / The University of Tokyo)*; Guo-Jun Qi (Futurewei Technologies); Xiaoxiao Li (The University of British Columbia); Renrui Zhang (Shanghai AI Lab); Zenghui Zhang (Shanghai Jiao Tong university); Tatsuya Harada (The University of Tokyo / RIKEN) |
2840 |
CP2: Copy-Paste Contrastive Pretraining for Semantic Segmentation |
Feng Wang (Tsinghua University)*; Huiyu Wang (JHU); Chen Wei (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Wei Shen (Shanghai Jiao Tong University) |
2852 |
Learning from Multiple Annotator Noisy Labels via Sample-wise Label Fusion |
Zhengqi Gao (MIT)*; Fan-Keng Sun (MIT); Mingran Yang (MIT); Sucheng Ren (South China University of Technology); Zikai Xiong (Massachusetts Institute of Technology); Marc Engeler (Takeda); Antonio Burazer (Takeda); Linda Wildling (Takeda Pharmaceuticals International AG); Luca Daniel (Massachusetts Institute of Technology); Duane Boning (MIT) |
2856 |
Robust Category-Level 6D Pose Estimation with Coarse-to-Fine Rendering of Neural Features |
Wufei Ma (Purdue University)*; Angtian Wang (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Adam Kortylewski (Max Planck Institute for Informatics) |
2861 |
A Unified Framework for Domain Adaptive Pose Estimation |
Donghyun Kim (MIT-IBM Watson AI Lab)*; Kaihong Wang (Boston University); Stan Sclaroff (Boston University); Margrit Betke (Boston University); Kate Saenko (Boston University) |
2862 |
A Broad Study of Pre-training for Domain Generalization and Adaptation |
Donghyun Kim (MIT-IBM Watson AI Lab)*; Kaihong Wang (Boston University); Stan Sclaroff (Boston University); Kate Saenko (Boston University) |
2863 |
BlobGAN: Spatially Disentangled Scene Representations |
Dave Epstein (UC Berkeley)*; Taesung Park (Adobe Research); Richard Zhang (Adobe); Eli Shechtman (Adobe Research, US); Alexei A Efros (UC Berkeley) |
2864 |
LGV: Boosting Adversarial Example Transferability from Large Geometric Vicinity |
Martin Gubri (University of Luxembourg)*; Maxime Cordy (University of Luxembourg); Mike Papadakis (University of Luxembourg); Yves Le Traon (University of Luxembourg); Koushik Sen (University of California, Berkeley) |
2871 |
LocalBins: Improving Depth Estimation by Learning Local Distributions |
Shariq F Bhat (KAUST)*; Ibraheem Alhashim (National Center for Artificial Intelligence (NCAI), Saudi Data and Artificial Intelligence Authority (SDAIA), Riyadh, Kingdom of Saudi Arabia); Peter Wonka (KAUST) |
2872 |
Prior Knowledge Guided Unsupervised Domain Adaptation |
Tao Sun (Stony Brook University)*; Cheng Lu (Xiaopeng); Haibin Ling (Stony Brook University) |
2877 |
Fast Two-step Blind Optical Aberration Correction |
Thomas Eboli (ENS Paris-Saclay)*; Jean-Michel Morel (Centre Borelli ENS Paris-Saclay); Gabriele Facciolo (ENS Paris – Saclay) |
2887 |
Controllable and Guided Face Synthesis for Unconstrained Face Recognition |
Feng Liu (Michigan State University)*; Minchul Kim (Michigan State University); Anil Jain (Michigan State University); Xiaoming Liu (Michigan State University) |
2888 |
2D GANs Meet Unsupervised Single-view 3D Reconstruction |
Feng Liu (Michigan State University)*; Xiaoming Liu (Michigan State University) |
2891 |
Seeing Far in the Dark with Patterned Flash |
Zhanghao Sun (Stanford University)*; Jian Wang (Snap); Yicheng Wu (Snap Inc.); Shree Nayar (Snap) |
2900 |
Unified Implicit Neural Stylization |
Zhiwen Fan (University of Texas at Austin)*; Yifan Jiang (University of Texas at Austin); Peihao Wang (University of Texas at Austin); Xinyu Gong (University of Texas at Austin); Dejia Xu (University of Texas at Austin); Zhangyang Wang (University of Texas at Austin) |
2901 |
Improved Masked Image Generation with Token-Critic |
Jose Lezama (Google Research)*; Huiwen Chang (Google); Lu Jiang (Google Research); Irfan Essa (Google) |
2902 |
UNIF: United Neural Implicit Functions for Clothed Human Reconstruction and Animation |
Shenhan Qian (ShanghaiTech University)*; Jiale Xu (ShanghaiTech University); Ziwei Liu (Nanyang Technological University); Liqian Ma (ZMO AI); Shenghua Gao (Shanghaitech University) |
2903 |
PseudoClick: Interactive Image Segmentation with Click Imitation |
Qin Liu (UNC)*; Meng Zheng (United Imaging Intelligence); Benjamin Planche (United Imaging Intelligence); Srikrishna Karanam (Adobe Research); Terrence Chen (United Imaging Intelligence); Marc Niethammer (UNC); Ziyan Wu (United Imaging Intelligence) |
2904 |
CoSCL: Cooperation of Small Continual Learners is Stronger than a Big One |
Liyuan Wang (Tsinghua University)*; Xingxing Zhang (Tsinghua University); Qian Li (Tsinghua University); Jun Zhu (Tsinghua University); Yi Zhong (Tsinghua University) |
2909 |
Scalable Learning to Optimize: A Learned Optimizer Can Train Big Models |
Xuxi Chen (University of Texas at Austin)*; Tianlong Chen (University of Texas at Austin); Yu Cheng (Microsoft Research); Weizhu Chen (Microsoft); Ahmed Awadallah (Microsoft); Zhangyang Wang (University of Texas at Austin) |
2921 |
PRIF: Primary Ray-based Implicit Function |
Brandon Yushan Feng (University of Maryland, College Park)*; Yinda Zhang (Google); Danhang Tang (Google); Ruofei Du (Google); Amitabh Varshney (University of Maryland) |
2925 |
From Face to Natural Image: Learning Real Degradation for Blind Image Super-Resolution |
Xiaoming Li (Harbin Institute of Technology); Chaofeng Chen (Nanyang Technological University); Xianhui Lin (Alibaba Group); Wangmeng Zuo (Harbin Institute of Technology, China)*; Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China) |
2936 |
QISTA-ImageNet: A Deep Compressive Image Sensing Framework Solving Lq-Norm Optimization Problem |
Gang-Xuan Lin (Academia Sinica); Shih-Wei Hu (National Taiwan University); Chun-Shien Lu (Academia Sinica)* |
2943 |
Trust, but Verify: Using Self-Supervised Probing to Improve Trustworthiness |
Ailin Deng (National University of Singapore)*; Shen Li (National University of Singapore); Miao Xiong (National University of Singapore); Zhirui Chen (National University of Singapore); Bryan Hooi (National University of Singapore) |
2948 |
Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding |
Cheng Shi (ShanghaiTech University); Sibei Yang (ShanghaiTech University)* |
2953 |
Med-DANet: Dynamic Architecture Network for Efficient Medical Volumetric Segmentation |
Wenxuan Wang (University of Science and Technology Beijing)*; Chen Chen (University of Central Florida); Jing Wang (University of Science and Technology Beijing); Sen Zha (University of Science and Technology Beijing); Yan Zhang (University of Science and Technology Beijing); Jiangyun Li (University of Science and Technology Beijing) |
3005 |
Worst Case Matters for Few-Shot Recognition |
Minghao Fu (Nanjing University); Yunhao Cao (Nanjing University); Jianxin Wu (Nanjing University)* |
3017 |
Self-Filtering: A Noise-Aware Sample Selection for Label Noise with Confidence Penalization |
Qi Wei (Shandong University)*; Haoliang Sun (Shandong University); Xiankai Lu (Shandong University); Yilong Yin (Shandong University) |
3035 |
Point Cloud Domain Adaptation via Masked Local 3D Structure Prediction |
hanxue liang (University of Texas at Austin)*; Hehe Fan (NUS); Zhiwen Fan (University of Texas at Austin); Yi Wang (University of Texas at Austin); Tianlong Chen (University of Texas at Austin); Yu Cheng (Microsoft Research); Zhangyang Wang (University of Texas at Austin) |
3041 |
Translation, Scale and Rotation: Cross-Modal Alignment Meets RGB-Infrared Vehicle Detection |
Maoxun Yuan (Beihang University); Yinyan Wang (Beihang University); Xingxing Wei (Beihang University)* |
3043 |
Simple Baselines for Image Restoration |
Liangyu Chen (Megvii Technology)*; Xiaojie Chu (Megvii Technology); Xiangyu Zhang (Megvii Technology); Jian Sun (Megvii Technology) |
3058 |
RDA: Reciprocal Distribution Alignment for Robust Semi-supervised Learning |
Yue Duan (Nanjing University)*; Lei Qi (Southeast University); Lei Wang (University of Wollongong, Australia); Luping Zhou (University of Sydney); Yinghuan Shi (Nanjing University) |
3060 |
Exploring Hierarchical Graph Representation for Large-Scale Zero-Shot Image Classification |
Kai Yi (King Abdullah University of Science and Technology)*; xiaoqian shen (King Abdullah University of Science and Technology); Yunhao Gou (Hong Kong University of Science and Technology); Mohamed Elhoseiny (KAUST) |
3080 |
Doubly Deformable Aggregation of Covariance Matrices for Few-shot Segmentation |
Zhitong Xiong (Technical University of Munich)*; Haopeng Li (The University of Melbourne); Xiaoxiang Zhu (Technical University of Munich (TUM); German Aerospace Center (DLR)) |
3093 |
MemSAC: Memory Augmented Sample Consistency for Large Scale Domain Adaptation |
Tarun Kalluri (UC San Diego)*; Astuti Sharma (UCSD); Manmohan Chandraker (UC San Diego) |
3094 |
GCISG: Guided Causal Invariant Learning for Improved Syn-to-real Generalization |
Gilhyun Nam (Agency for Defense Development)*; Gyeongjae Choi (Agency for Defense Development); Kyungmin Lee (Agency for Defense Development) |
3101 |
Temporal Saliency Query Network for Efficient Video Recognition |
Boyang Xia (Institute of Computing Technology, Chinese Academy of Science); Zhihao Wang (Institute of Computing Technology, Chinese Academy of Sciences); Wenhao Wu (Baidu)*; Haoran Wang (Baidu); Jungong Han (Aberystwyth University) |
3116 |
Towards Interpretable Video Super-Resolution via Alternating Optimization |
Jiezhang Cao (ETH Zürich)*; Jingyun Liang (ETH Zurich); Kai Zhang (ETH Zurich); Wenguan Wang (Eidgenössische Technische Hochschule Zürich); Qin Wang (ETH Zurich); Yulun Zhang (ETH Zurich); Hao Tang (ETH Zurich); Luc Van Gool (ETH Zurich) |
3118 |
R-DFCIL: Relation-Guided Representation Learning for Data-Free Class Incremental Learning |
Qiankun Gao (Peking University Shenzhen Graduate School)*; Chen Zhao (KAUST); Bernard Ghanem (KAUST); Jian Zhang (Peking University Shenzhen Graduate School) |
3125 |
Spike Transformer: Monocular Depth Estimation for Spiking Camera |
Jiyuan Zhang (Peking University)*; Lulu Tang (Tsinghua University); Zhaofei Yu (Peking University); Jiwen Lu (Tsinghua University); Tiejun Huang (Peking University) |
3127 |
Towards Robust Face Recognition with Comprehensive Search |
Manyuan Zhang (Sensetime)*; Guanglu Song (Sensetime); Yu Liu (SenseTime Group LTD); Hongsheng Li (The Chinese University of Hong Kong) |
3129 |
Improving Image Restoration by Revisiting Global Information Aggregation |
Xiaojie Chu (Megvii Technology)*; Liangyu Chen (Megvii Technology); Chengpeng Chen (Megvii); Xin Lu (Megvii Technology) |
3132 |
Learning Pedestrian Group Representations for Multi-modal Trajectory Prediction |
Inhwan Bae (Gwangju Institute of Science and Technology)*; Jin-Hwi Park (GIST); Hae-Gon Jeon (GIST) |
3138 |
RFLA: Gaussian Receptive Field based Label Assignment for Tiny Object Detection |
Chang Xu (Wuhan University); Jinwang Wang (Huawei Technology); Wen Yang (Wuhan University)*; Huai Yu (Wuhan University); Lei Yu (Wuhan University); Gui-Song Xia (Wuhan University) |
3139 |
Semi-supervised Single-view 3D Reconstruction via Prototype Shape Priors |
Zhen Xing (Fudan University)*; Hengduo Li (University of Maryland, College Park ); Zuxuan Wu (UMD); Yu-Gang Jiang (Fudan University) |
3145 |
Sequential Multi-View Fusion Network for Fast LiDAR Point Motion Estimation |
Gang Zhang (Damo Academy, Alibaba Group)*; Xiaoyan Li (Beijing University of Technology); Zhenhua Wang (DAMO Academy, Alibaba Group) |
3147 |
A Large-scale Multiple-objective Method for Black-box Attack against Object Detection |
Siyuan Liang (Chinese Academy of Sciences); Longkang Li (Mohamed bin Zayed University of Artificial Intelligence); Yanbo Fan (Tencent AI Lab); Xiaojun Jia (Institute of Information Engineering,Chinese Academy of Sciences); Jingzhi Li (Institute of information engineering, CAS); Baoyuan Wu (The Chinese University of Hong Kong, Shenzhen)*; Xiaochun Cao (Sun Yat-sen University) |
3150 |
GradAuto: Energy-oriented Attack on Dynamic Neural Networks |
Jianhong Pan (Singapore University of Technology and Design)*; Qichen Zheng (Singapore University of Technology and Design); Zhipeng Fan (NYU TANDON SCHOOL OF ENGINEERING); Hossein Rahmani (Lancaster University); Qiuhong Ke (Monash University); Jun Liu (Singapore University of Technology and Design) |
3151 |
Semantic-guided Multi-Mask Image Harmonization |
Xuqian Ren (Watrix Technology); Yifan Liu (University of Adelaide)* |
3155 |
Manifold Adversarial Learning for Cross-domain 3D Shape Representation |
Hao Huang (New York University); Cheng Chen (New York University); Yi Fang (New York University)* |
3167 |
GAN with Multivariate Disentangling for Controllable Hair Editing |
Xuyang Guo (Institute of Computing Technology, Chinese Academy of Sciences); Meina Kan (Institute of Computing Technology, Chinese Academy of Sciences); Tianle Chen (Institute of Computing Technology, Chinese Academy of Sciences); Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences)* |
3169 |
Fast-MoCo: Boost Momentum-based Contrastive Learning with Combinatorial Patches |
Yuanzheng Ci (The University of Sydney)*; Chen Lin (University of Oxford); Lei Bai (Shanghai AI Laboratory); Wanli Ouyang (The University of Sydney) |
3179 |
Dense Cross-Query-and-Support Attention Weighted Mask Aggregation for Few-Shot Segmentation |
Xinyu Shi (School of Computer Science and Engineering, Southeast University); DONG WEI (Tencent Jarvis Lab)*; Yu Zhang (Southeast University); Donghuan Lu (Tencent); Munan Ning (Tencent); Jiashun Chen (School of Computer Science and Engineering, Southeast University); Kai Ma (Tencent); Yefeng Zheng (Tencent) |
3180 |
Acknowledging the Unknown for Multi-label Learning with Single Positive Labels |
Donghao Zhou (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)*; Pengfei Chen (The Chinese University of Hong Kong); Qiong Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Guangyong Chen (Shenzhen Institutes of Advanced Technology); Pheng-Ann Heng (The Chinese University of Hong Kong) |
3200 |
LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling |
Boyan Jiang (Fudan University)*; Xinlin Ren (Fudan University); Mingsong Dou (Google Inc.); Xiangyang Xue (Fudan University); Yanwei Fu (Fudan University); Yinda Zhang (Google) |
3202 |
Bilateral Normal Integration |
Xu Cao (Osaka University)*; Hiroaki Santo (Osaka University); Boxin Shi (Peking University); Fumio Okura (Osaka University); Yasuyuki Matsushita (Osaka University) |
3203 |
Harmonizer: Learning to Perform White-Box Image and Video Harmonization |
Zhanghan Ke (City University of Hong Kong)*; Chunyi Sun (Australian National University); Lei ZHU (City University of Hong Kong); Ke Xu (City University of Hong Kong); Rynson W.H. Lau (City University of Hong Kong) |
3213 |
On the Versatile Uses of Partial Distance Correlation in Deep Learning |
Xingjian Zhen (University of Wisconsin-Madison)*; Zihang Meng (University of Wisconsin Madison); Rudrasis Chakraborty (Butlr); Vikas Singh (University of Wisconsin Madison) |
3214 |
Object-Centric Unsupervised Image Captioning |
Zihang Meng (University of Wisconsin Madison)*; David Yang (Facebook); Xuefei Cao (Facebook); Ashish Shah (Facebook AI); Ser-Nam Lim (Meta AI) |
3217 |
Pose2Room: Understanding 3D Scenes from Human Activities |
Yinyu Nie (Technical University of Munich)*; Angela Dai (Technical University of Munich); Xiaoguang Han (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen)); Matthias Niessner (Technical University of Munich) |
3218 |
Capturing, Reconstructing, and Simulating: the UrbanScene3D Dataset |
Liqiang Lin (Shenzhen University); Yilin Liu (Shenzhen University); Yue Hu (Shenzhen University); Xingguang Yan (Shenzhen University); Ke Xie (Shenzhen University); Hui Huang (Shenzhen University)* |
3225 |
A Spectral View of Randomized Smoothing under Common Corruptions: Benchmarking and Improving Certified Robustness |
Jiachen Sun (University of Michigan)*; Akshay Mehra (Tulane University); Bhavya Kailkhura (Lawrence Livermore National Laboratory); Pin-Yu Chen (IBM Research); Dan Hendrycks (UC Berkeley); Jihun Hamm (Tulane University); Zhuoqing Morley Mao (University of Michigan) |
3229 |
CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes |
Kim Youwang (POSTECH)*; Ji-Yeon Kim (POSTECH); Tae-Hyun Oh (POSTECH) |
3240 |
Interpretable Image Classification with Differentiable Prototypes Assignment |
Dawid Damian Rymarczyk (Jagiellonian University)*; Łukasz Struski (Jagiellonian University); Michał Górszczak (Jagiellonian University); Koryna Lewandowska (Jagiellonian University); Jacek Tabor (Jagiellonian University); Bartosz Zieliński (Jagiellonian University) |
3247 |
Efficient One-stage Video Object Detection by Exploiting Temporal Consistency |
Guanxiong Sun (Queen’s University Belfast); Yang Hua (Queen’s University Belfast)*; Guosheng Hu (Oosto); Neil Robertson (Queen’s University Belfast) |
3250 |
ConCL: Concept Contrastive Learning for Dense Prediction Pre-training in Pathology Images |
Jiawei Yang (UCLA)*; Hanbo Chen (Tencent AI Lab); Yuan Liang (UCLA); Junzhou Huang (University of Texas at Arlington); Lei He (UCLA); Jianhua Yao (National Institutes of Health) |
3254 |
Leveraging Action Affinity and Continuity for Semi-supervised Temporal Action Segmentation |
Guodong Ding (National University of Singapore)*; Angela Yao (National University of Singapore) |
3257 |
Fast and High Quality Image Denoising via Malleable Convolution |
Yifan Jiang (University of Texas at Austin)*; Bartlomiej Wronski (Google Research); Ben Mildenhall (Google Research); Jonathan T Barron (Google Research); Zhangyang Wang (University of Texas at Austin); Tianfan Xue (Google) |
3265 |
Data Association between Event Streams and Intensity Frames under Diverse Baselines |
Dehao Zhang (Peking University)*; Qiankun Ding (Peking University); Peiqi Duan (Peking University); Chu Zhou (Peking University); Boxin Shi (Peking University) |
3287 |
Self-Regulated Feature Learning via Teacher-free Feature Distillation |
Lujun Li (Chinese Academy of Science)* |
3289 |
TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval |
Yuqi Liu (Renmin University of China)*; Pengfei Xiong (Shopee); luhui xu (tencent); Cao Shengming (Tencent); Qin Jin (Renmin University of China) |
3292 |
TAPE: Task-Agnostic Prior Embedding for Image Restoration |
Lin Liu (University of Science and Technology of China)*; Lingxi Xie (Huawei Inc.); Xiaopeng Zhang (Noah’s Ark Lab, Huawei Inc.); Shanxin Yuan (Huawei Noah’s Ark Lab); Xiangyu Chen (University of Macau; SIAT); Wengang Zhou (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China); Qi Tian (Huawei Cloud & AI) |
3293 |
MVSalNet: Multi-View Augmentation for RGB-D Salient Object Detection |
JiaYuan Zhou (Dalian University of Technology)*; Lijun Wang (Dalian University of Technology); Huchuan Lu (Dalian University of Technology); Kaining Huang (huang kaining); Xinchu Shi (Meituan Group); Bocong Liu (Meituan) |
3295 |
Rethinking IoU-based Optimization for Single-stage 3D Object Detection |
Hualian Sheng (College of Information Science and Electronic Engineering, Zhejiang University; DAMO Academy, Alibaba Group)*; Sijia Cai (DAMO Academy, Alibaba Group); Na Zhao (NUS); Bing Deng (Damo Academy, Alibaba Group); Jianqiang Huang (Damo Academy, Alibaba Group); Xian-Sheng Hua (Damo Academy, Alibaba Group); Min-Jian Zhao (Zhejiang University); Gim Hee Lee (National University of Singapore) |
3298 |
Uncertainty Inspired Underwater Image Enhancement |
Zhenqi Fu (Xiamen University)*; Wu Wang (Xiamen University); Yue Huang (Xiamen University); Xinghao Ding (Xiamen University); Kai-Kuang Ma (Nanyang Technological University, Singapore) |
3300 |
k-means Mask Transformer |
Qihang Yu (Johns Hopkins University)*; Huiyu Wang (JHU); Siyuan Qiao (Google); Maxwell D Collins (Google Inc.); Yukun Zhu (Google Inc.); Hartwig Adam (Google); Alan Yuille (Johns Hopkins University); Liang-Chieh Chen (Google Inc.) |
3302 |
Contrastive Vision-Language Pre-training with Limited Resources |
Quan Cui (Waseda University)*; Boyan Zhou (ByteDance); Yu Guo (Fudan University); Weidong Yin (UBC); Hao Wu (Bytedance Inc.); Osamu Yoshie (Waseda University); Yubo Chen (Bytedance) |
3305 |
Learning Linguistic Association Towards Efficient Text-Video Retrieval |
Sheng Fang (ICT); Shuhui Wang (VIPL, ICT, Chinese Academy of Sciences)*; Junbao Zhuo (ICT CAS); Xinzhe Han (University of Chinese Academy of Sciences); Qingming Huang (University of Chinese Academy of Sciences) |
3308 |
United Defocus Blur Detection and Deblurring via Adversarial Promoting Learning |
Wenda Zhao (Dalian University of Technology)*; Fei Wei (Dalian University of Technology); You He (Naval Aviation University); Huchuan Lu (Dalian University of Technology) |
3314 |
Unstructured Feature Decoupling for Vehicle Re-Identification |
Wen Qian (Institute of Automation, Chinese Academy of Sciences)*; Hao Luo (Alibaba group); Silong Peng (The Chinese academy of science); Fan Wang (Alibaba Group); Chen Chen (The Chinese academy of science); Hao Li (Alibaba Group) |
3322 |
Improving Adversarial Robustness of 3D Point Cloud Classification Models |
Guanlin Li (Nanyang Technological University)*; Guowen Xu (Nanyang Technological University); Han Qiu (Tsinghua University); Ruan HE (Tencent); Jiwei Li (Shannon.AI); Tianwei Zhang (Nanyang Technological University) |
3324 |
ASSISTER: Assistive Navigation via Conditional Instruction Generation |
Zanming Huang (Boston University); Zhongkai Shangguan (Boston University); Jimuyang Zhang (Boston University); Gilad Bar (Rutgers University – Camden); Matthew Boyd (Boston University); Eshed Ohn-Bar (Boston University)* |
3342 |
Deep Hash Distillation for Image Retrieval |
Young Kyun Jang (Seoul National University)*; Geonmo Gu (NAVER corp); Byungsoo Ko (NAVER/LINE Corp.); Isaac Kang (Seoul National University); Nam Ik Cho (Seoul National University) |
3345 |
Learning Spatial-Preserved Skeleton Representations for Few-Shot Action Recognition |
Ning Ma (Zhejiang University)*; Hongyi Zhang (Zhejiang University); Xuhui Li (Zhejiang University); Sheng Zhou (Zhejiang University); Zhen Zhang (National University of Singapore); Jun Wen (Harvard University); Haifeng Li (Zhejiang University); Jingjun Gu (Zhejiang University); Jiajun Bu (Zhejiang University) |
3346 |
Digging into Radiance Grid for Real-Time View Synthesis with Detail Preservation |
Jian Zhang (Alibaba Group); Jinchi Huang (Alibaba Group); Bowen Cai (Alibaba Group); Huan Fu (Alibaba Group)*; Mingming Gong (University of Melbourne); Chaohui Wang (Laboratoire d’Informatique Gaspard Monge, Université Paris-Est); Jiaming Wang (Alibaba Group); Hongchen Luo (Alibaba Group); Rongfei Jia (Alibaba Group); Binqiang Zhao (Alibaba); Xing Tang (Alibaba Group) |
3351 |
S^2Contact: Graph-based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning |
Tze Ho Elden Tse (University of Birmingham)*; Zhongqun Zhang (University of Birmingham); Kwang In Kim (UNIST); Ales Leonardis (University of Birmingham); Feng Zheng (SUSTech); Hyung Jin Chang (University of Birmingham) |
3359 |
TD-Road: Top-Down Road Network Extraction with Holistic Graph Construction |
Yang He (Amazon)*; Ravi Garg (Amazon com services inc); Amber Roy Chowdhury (Amazon) |
3366 |
StyleGAN-Human: A Data-Centric Odyssey of Human Generation |
Jianglin Fu (SenseTime)*; Shikai Li (SenseTime Research); Yuming Jiang (Nanyang Technological University); Kwan-Yee Lin (SenseTime Research); Chen Qian (SenseTime); Chen Change Loy (Nanyang Technological University); Wayne Wu (SenseTime Research); Ziwei Liu (Nanyang Technological University) |
3369 |
Hourglass Attention Network for Image Inpainting |
Ye Deng (Xi’an Jiaotong University)*; Siqi Hui (Xi’an Jiaotong University); Rongye Meng (IAIR, Xi’an Jiaotong University); Sanping Zhou (Xi’an Jiaotong University); Jinjun Wang (Xi’an Jiaotong University) |
3370 |
MaxViT: Multi-Axis Vision Transformer |
Zhengzhong Tu (University of Texas at Austin)*; Hossein Talebi (Google); Han Zhang (Google); Feng Yang (Google Research); Peyman Milanfar (Google); Alan Bovik (University of Texas at Austin); Yinxiao Li (Google) |
3378 |
Gen6D: Generalizable Model-Free 6-DoF Object Pose Estimation from RGB Images |
Yuan Liu (The University of Hong Kong)*; Yilin Wen (The University of Hong Kong); Sida Peng (Zhejiang University); Cheng Lin (Tencent); Xiaoxiao Long (The University of Hong Kong); Taku Komura (The University of Hong Kong); Wenping Wang (The University of Hong Kong) |
3385 |
ColorFormer: Image Colorization via Color Memory assisted Hybrid-attention Transformer |
Xiaozhong Ji (Tencent)*; Boyuan Jiang (Tencent Youtu Lab); Donghao Luo (Tencent); Guangpin Tao (Nanjing University); Wenqing Chu (Tencent); Zhifeng Xie (Shanghai University); Chengjie Wang (Tencent; Shanghai Jiao Tong University); Ying Tai (Tencent YouTu) |
3387 |
Spotting Temporally Precise, Fine-Grained Events in Video |
James Hong (Stanford University)*; Haotian Zhang (Stanford University); Michaël Gharbi (Adobe Research); Matthew Fisher (Adobe Research); Kayvon Fatahalian (Stanford) |
3390 |
SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness |
Jindong Gu (University of Munich)*; Hengshuang Zhao (University of Oxford); Volker Tresp (Siemens AG and Ludwig Maximilian University of Munich); Philip Torr (University of Oxford) |
3391 |
Adversarial Erasing Framework via Triplet with Gated Pyramid Pooling Layer for Weakly Supervised Semantic Segmentation |
Sung-Hoon Yoon (KAIST)*; Hyeokjun Kweon (KAIST); Jegyeong Cho (KAIST); Shinjeong Kim (KAIST); Kuk-Jin Yoon (KAIST) |
3393 |
Semi-Supervised Vision Transformers |
Zejia Weng (Fudan University)*; Xitong Yang (University of Maryland); Ang Li (Google DeepMind); Zuxuan Wu (UMD); Yu-Gang Jiang (Fudan University) |
3394 |
Learning an Isometric Surface Parameterization for Texture Unwrapping |
Sagnik Das (Stony Brook University)*; Ke Ma (Stony Brook University); Zhixin Shu (Adobe Research); Dimitris Samaras (Stony Brook University) |
3409 |
Mimic Embedding via Adaptive Aggregation: Learning Generalizable Person Re-identification |
BOQIANG XU (University of Chinese Academy of Sciences; Institute of Automation, Chinese Academy of Sciences)*; Jian Liang (CASIA); He Lingxiao (nlpr,cripac); Zhenan Sun (Chinese Academy of Sciences) |
3418 |
CryoAI: Amortized Inference of Poses for Ab Initio Reconstruction of 3D Molecular Volumes from Real Cryo-EM Images |
Axel Levy (Stanford University); Frederic Poitevin (SLAC National Accelerator Laboratory); Julien N. P. Martel (Stanford University); Youssef Nashed (SLAC National Accelerator Laboratory); Ariana Peck (SLAC National Accelerator Laboratory); Nina Miolane (UCSB); Daniel Ratner (Stanford University); Mike Dunne (SLAC National Accelerator Laboratory); Gordon Wetzstein (Stanford University)* |
3419 |
EAGAN: Efficient Two-stage Evolutionary Architecture Search for GANs |
Guohao Ying (University of Southern California); Xin He (Hong Kong Baptist University); Bin Gao (National University of Singapore); Bo Han (HKBU / RIKEN); Xiaowen Chu (Hong Kong University of Science and Technology)* |
3428 |
ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer |
Rui Yang (Tsinghua University)*; Hailong Ma (ByteDance Inc); Jie Wu (ByteDance Inc); Yansong Tang (Tsinghua University); Xuefeng Xiao (ByteDance Inc); Min Zheng (ByteDance); Xiu Li (Tsinghua University) |
3429 |
PlaneFormers: From Sparse View Planes to 3D Reconstruction |
Samir Agarwala (University of Michigan)*; Linyi Jin (University of Michigan); Chris Rockwell (University of Michigan); David Fouhey (University of Michigan) |
3438 |
Domain Adaptive Video Segmentation via Temporal Pseudo Supervision |
Yun Xing (Nanyang Technological University); Dayan Guan (Mohamed bin Zayed University of Artificial Intelligence); Jiaxing Huang (Nanyang Technological University); Shijian Lu (Nanyang Technological University)* |
3442 |
Diverse Learner: Exploring Diverse Supervision for Semi-supervised Object Detection |
Linfeng Li (Baidu)*; Minyue Jiang (Baidu Inc.); Yue Yu (Baidu.Inc.); Wei Zhang (Baidu Inc); Xiangru Lin (Baidu Inc.); Yingying Li (Baidu); Xiao Tan (Baidu Inc.); Jingdong Wang (Baidu); Errui Ding (Baidu Inc.) |
3452 |
Overlooked Poses Actually Make Sense: Distilling Privileged Knowledge for Human Motion Prediction |
Xiaoning Sun (Nanjing University of Science and Technology)*; Qiongjie Cui (Nanjing University of Science and Technology); Huaijiang Sun (Nanjing University of Science and Technology); Bin Li (Tianjin AiForward Science and Technology); Weiqing Li (Nanjing University of Science and Technology); Jianfeng Lu (Nanjing University of Science and Technology) |
3455 |
Towards Hard-Positive Query Mining for DETR-based Human-Object Interaction Detection |
Xubin Zhong (South China University of Technology); Changxing Ding (South China University of Technology)*; Zijian Li (South China University of Technology); Shaoli Huang (Tencent AI-Lab) |
3458 |
Learning Extremely Lightweight and Robust Model with Differentiable Constraints on Sparsity and Condition Number |
Xian Wei (East China Normal University); Yangyu Xu (Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences; University of Chinese Academy of Sciences); yanhui huang (Fuzhou University); Hairong Lv (Tsinghua University); Hai Lan (Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences); Mingsong Chen (East China Normal University); XUAN TANG (East China Normal University)* |
3470 |
Structural Triangulation: A Closed-Form Solution to Constrained 3D Human Pose Estimation |
Zhuo Chen (Shanghai Jiao Tong University)*; Xu Zhao (Shanghai Jiao Tong University); Xiaoyue Wan (Shanghai Jiao Tong University) |
3474 |
Latency-Aware Collaborative Perception |
Zixing Lei (Shanghai Jiao Tong University)*; Shunli Ren (Shanghai Jiao Tong University); Yue Hu (Shanghai Jiao Tong University); Wenjun Zhang (Shanghai Jiao Tong University); Siheng Chen (Shanghai Jiao Tong University) |
3475 |
Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection |
Xin Li (East China Normal University)*; Botian Shi (Shanghai AI Lab); Yuenan HOU (Shanghai AI Lab); Xingjiao Wu (East China Normal University); Tianlong Ma (East China Normal University); Yikang Li (Shanghai AI Lab); Liang He (ECNU) |
3484 |
Unfolded Deep Kernel Estimation for Blind Image Super-resolution |
Hongyi Zheng (The Hong Kong Polytechnic University); Hongwei Yong (The Hong Kong Polytechnic University); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China)* |
3487 |
Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning |
Xingping Dong (Inception Institute of Artificial Intelligence)*; Jianbing Shen (Inception Institute of Artificial Intelligence); Ling Shao (Terminus Group) |
3489 |
Continual Semantic Segmentation via Structure Preserving and Projected Feature Alignment |
Zihan Lin (University of Science and Technology of China); Zilei Wang (University of Science and Technology of China)*; Yixin Zhang (University of Science and Technology of China) |
3498 |
SC-wLS: Towards Interpretable Feed-forward Camera Re-localization |
Xin Wu (Peking University)*; Hao Zhao (Intel Labs China); Shunkai Li (Peking University); Yingdian Cao (Peking University); Hongbin Zha (Peking University, China) |
3500 |
Weakly-Supervised Stitching Network for Real-World Panoramic Image Generation |
Dae-Young Song (Chungnam National University); Geonsoo Lee (Chungnam National University); HeeKyung Lee (ETRI (Electronics and Telecommunications Research Institute)); Gi-Mun Um (ETRI (Electronics and Telecommunications Research Institute)); Donghyeon Cho (Chungnam National University)* |
3503 |
FloatingFusion: Depth from ToF and Image-stabilized Stereo Cameras |
Andreas Meuleman (KAIST); Hakyeong Kim (KAIST); James Tompkin (Brown University); Min H. Kim (KAIST)* |
3504 |
Dual-Evidential Learning for Weakly-supervised Temporal Action Localization |
Mengyuan Chen (Institute of Automation, Chinese Academy of Sciences)*; Junyu Gao (CASIA); Shicai Yang (Hikvision Research Institute); Changsheng Xu (CASIA) |
3511 |
DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation |
Songhua Liu (National University of Singapore)*; Jingwen Ye (National University of Singapore); Sucheng Ren (South China University of Technology); Xinchao Wang (National University of Singapore) |
3512 |
D2HNet: Joint Denoising and Deblurring with Hierarchical Network for Robust Night Image Restoration |
Yuzhi Zhao (City University of Hong Kong)*; Yongzhe Xu (SenseTime Group Limited); Qiong Yan (SenseTime Group Limited); DINGDONG YANG (University of Michigan); Xuehui Wang (Shanghai Jiao Tong University); Lai-Man Po (CITY UNIVERSITY OF HONG KONG) |
3514 |
DELTAR: Depth Estimation from a Light-weight ToF Sensor and RGB Image |
Yijin Li (Zhejiang University); Yinda Zhang (Google); Xinyang Liu (Zhejiang University); Wenqi Dong (Zhejiang University); Han Zhou (Zhejiang University); Hujun Bao (Zhejiang University); Guofeng Zhang (Zhejiang University); Zhaopeng Cui (Zhejiang University)* |
3515 |
ERA: Enhanced Rational Activations |
Martin Trimmel (Lund University)*; Mihai Zanfir (Google); Richard I Hartley (google); Cristian Sminchisescu (Google) |
3518 |
FrequencyLowCut pooling – Plug & Play against Catastrophic Overfitting |
Julia Grabinski (University of Siegen)*; Janis Keuper (Fraunhofer); Margret Keuper (University of Mannheim); Steffen Jung (MPII) |
3520 |
Interclass Prototype Relation for Few-Shot Segmentation |
Atsuro Okazawa (SoftBank Corp.)* |
3523 |
Multi-Faceted Distillation of Base-Novel Commonality for Few-shot Object Detection |
Shuang Wu (Harbin Institute of Technology, Shenzhen); Wenjie Pei (Harbin Institute of Technology, Shenzhen); Dianwen Mei (Harbin Institute of Technology, Shenzhen); Fanglin Chen (Harbin Institute of Technology, Shenzhen); Jiandong Tian (CAS); Guangming Lu (Harbin Institute of Technology, Shenzhen)* |
3525 |
X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks |
Zhaowei Cai (Amazon)*; Gukyeong Kwon (Amazon); Avinash Ravichandran (Amazon); Erhan Bas (Amazon); Zhuowen Tu (UC San Diego); Rahul Bhotika (Amazon); Stefano Soatto (UCLA) |
3535 |
Equivariance and Invariance Inductive Bias for Learning from Insufficient Data |
Tan Wang (Nanyang Technological University)*; Qianru Sun (Singapore Management University); Sugiri Pranata (Panasonic R&D Center Singapore); Karlekar Jayashree (Panasonic); Hanwang Zhang (Nanyang Technological University) |
3539 |
Multimodal Conditional Image Synthesis with Product-of-Experts GANs |
Xun Huang (NVIDIA)*; Arun Mallya (NVIDIA); Ting-Chun Wang (NVIDIA); Ming-Yu Liu (NVIDIA) |
3551 |
Balancing between Forgetting and Acquisition in Incremental Subpopulation Learning |
Mingfu Liang (Northwestern University)*; JIAHUAN ZHOU (Peking University); Wei Wei (Northwestern University); Ying Wu (Northwestern University) |
3555 |
TensoRF: Tensorial Radiance Fields |
Anpei Chen (ShanghaiTech University)*; Zexiang Xu (Adobe Research); Andreas Geiger (University of Tuebingen); Jingyi Yu (Shanghai Tech University); Hao Su (UCSD) |
3580 |
PointCLM: A Contrastive Learning-based Framework for Multi-instance Point Cloud Registration |
Mingzhi Yuan (Fudan University)*; Zhihao Li (Fudan); Qiuye Jin (Fudan University); Xinrong Chen (Fudan University); Manning Wang (Fudan University) |
3581 |
Slim Scissors: Segmenting Thin Object from Synthetic Background |
Kunyang Han (Beijing Jiaotong University)*; Jun Hao Liew (ByteDance); Jiashi Feng (ByteDance); Huawei Tian (People’s Public Security University of China); Yao Zhao (Beijing Jiaotong University); Yunchao Wei (UTS) |
3591 |
CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition |
Shreyank N Gowda (University of Edinburgh)*; Laura Sevilla-Lara (Facebook); Frank Keller (University of Edinburgh); Marcus Rohrbach (Facebook AI Research) |
3593 |
Discovering Human-Object Interaction Concepts via Self-Compositional Learning |
Zhi Hou (The University of Sydney)*; Baosheng Yu (The University of Sydney); Dacheng Tao (The University of Sydney) |
3598 |
Mixed-Precision Neural Network Quantization via Learned Layer-wise Importance |
Chen Tang (Tsinghua University)*; Kai Ouyang (Tsinghua University); Zhi Wang (Tsinghua University); Yifei Zhu (Shanghai Jiao Tong University); Wen Ji (Institute of Computing Technology, Chinese Academy of Sciences); Yaowei Wang (PengCheng Laboratory); Wenwu Zhu (Tsinghua University) |
3604 |
TREND: Truncated Generalized Normal Density Estimation of Inception Embeddings for GAN Evaluation |
Junghyuk Lee (School of Integrated Technology, Yonsei University); Jong-Seok Lee (Yonsei University, Korea)* |
3606 |
3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform |
Yining Zhao (Tsinghua University); Chao Wen (Bytedance); Zhou Xue (Bytedance); Yue Gao (Tsinghua University)* |
3623 |
JoJoGAN: One Shot Face Stylization |
Min Jin Chong (University of Illinois at Urbana-Champaign)*; David Forsyth (University of Illinois at Urbana-Champaign) |
3627 |
Convolutional Embedding Makes Hierarchical Vision Transformer Stronger |
Cong Wang (OPPO); Hongmin Xu (OPPO)*; Xiong Zhang (Neolix Autonomous Vehicle); Li Wang (North China University of Technology); Zhitong Zheng (OPPO); Haifeng Liu (OPPO) |
3632 |
Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration |
Haotian Bai (The Chinese University of Hong Kong, Shenzhen); Ruimao Zhang (The Chinese University of Hong Kong, Shenzhen)*; Jiong WANG (The Chinese University of Hong Kong, Shenzhen); Xiang Wan (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen)) |
3641 |
Few-shot Class-incremental Learning for 3D Point Cloud Objects |
Townim Faisal Chowdhury (North South University); Ali Cheraghian (Australian National University (ANU)); Sameera Chandimal Ramasinghe (Australian National University); Sahar Ahmadi (University of Technology Sydney); Morteza Saberi (University of Technology, Sydney); Shafin Rahman (North South University)* |
3643 |
Learning Graph Neural Networks for Image Style Transfer |
Yongcheng Jing (The University of Sydney); Yining Mao (Zhejiang University); Yiding Yang (Wormpex AI Research); Yibing Zhan (JD Explore Academy); Mingli Song (Zhejiang University); Xinchao Wang (National University of Singapore)*; Dacheng Tao (JD.com) |
3644 |
JPerceiver: Joint Perception Network for Depth, Pose and Layout Estimation in Driving Scenes |
Haimei Zhao (The University of Sydney)*; Jing Zhang (The University of Sydney); Sen Zhang (The University of Sydney); Dacheng Tao (JD.com) |
3645 |
Meta-Learning with Less Forgetting on Large-Scale Non-Stationary Task Distributions |
Zhenyi Wang (University at Buffalo)*; Li Shen (JD Explore Academy); Le Fang (University at Buffalo); Qiuling Suo (State University of New York at Buffalo); Donglin Zhan (Columbia University); Tiehang Duan (Facebook); Mingchen Gao (University at Buffalo, SUNY) |
3655 |
Semi-supervised 3D Object Detection with Proficient Teachers |
Junbo Yin (Beijing Institute of Technology); Jin Fang (Baidu); Dingfu Zhou (Baidu); Wenguan Wang (Eidgenössische Technische Hochschule Zürich); Liangjun Zhang (baidu); Cheng-Zhong Xu (University of Macau); Jianbing Shen (Inception Institute of Artificial Intelligence)* |
3658 |
NeFSAC: Neurally Filtered Minimal Samples |
Luca Cavalli (ETH Zurich)*; Marc Pollefeys (ETH Zurich / Microsoft); Daniel Barath (ETH Zürich) |
3660 |
Domain Generalization by Mutual-Information Regularization with Pre-trained Models |
Junbum Cha (Kakaobrain)*; Kyungjae Lee (Chung-Ang University); Sungrae Park (Upstage AI Research, Upstage AI); Sanghyuk Chun (NAVER AI Lab) |
3661 |
AcroFOD: An Adaptive Method for Cross-domain Few-shot Object Detection |
Yipeng Gao (Sun Yat-sen University, China); Lingxiao YANG (Sun Yat-sen University); Yunmu Huang (Huawei Technologies Co., Ltd.); Song Xie (Huawei Technologies Co., Ltd.); Shiyong Li (AI Application Research Center, Huawei Technologies Co., Ltd); WEI-SHI ZHENG (Sun Yat-sen University, China)* |
3665 |
Primitive-based Shape Abstraction via Nonparametric Bayesian Inference |
Yuwei Wu (National University of Singapore)*; Weixiao Liu (National University of Singapore); Sipu Ruan (National University of Singapore); Gregory S Chirikjian (National University of Singapore) |
3670 |
Active label correction using robust parameter update and entropy propagation |
Kwang In Kim (UNIST)* |
3671 |
E-Graph: Minimal Solution for Rigid Rotation with Extensibility Graphs |
Yanyan Li (tum)*; Federico Tombari (Google, TU Munich) |
3672 |
Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation |
Nadine Behrmann (Bosch Center for Artificial Intelligence)*; S. Alireza Golestaneh (Google); Zico Kolter (Carnegie Mellon University); Jürgen Gall (University of Bonn); Mehdi Noroozi (Bosch GmbH) |
3677 |
Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification |
Xulin Li (University of Science and Technology of China); Yan Lu (University of Sydney); Bin Liu (University of Science and Technology of China)*; Yating Liu (USTC); Guojun Yin (University of Science and Technology of China); Qi Chu (University of Science and Technology of China); Jinyang Huang (University Of Science And Technology Of China); Feng Zhu (University of Science and Technology of China); Rui Zhao (SenseTime Group Limited); Nenghai Yu (University of Science and Technology of China) |
3681 |
A Closer Look at Invariances in Self-supervised Pre-training for 3D Vision |
Lanxiao Li (Karlsruher Institut fuer Technologie)*; Michael Heizmann (Karlsruher Institut fuer Technologie) |
3685 |
VecGAN: Image-to-Image Translation with Interpretable Latent Directions |
Yusuf Dalva (Bilkent University); Said F Altındiş (Bilkent University); Aysegul Dundar (Bilkent University)* |
3686 |
SNeS: Learning Probably Symmetric Neural Surfaces from Incomplete Data |
Eldar Insafutdinov (University of Oxford); Dylan Campbell (University of Oxford)*; Joao F Henriques (University of Oxford); Andrea Vedaldi (Oxford University) |
3689 |
Three things everyone should know about Vision Transformers |
Hugo Touvron (Facebook AI Research)*; Matthieu Cord (Sorbonne University); Alaaeldin M El-Nouby (Facebook AI Research); Jakob Verbeek (Facebook); Herve Jegou (Facebook AI Research) |
3690 |
DeiT III: Revenge of the ViT |
Hugo Touvron (Facebook AI Research)*; Matthieu Cord (Sorbonne University); Herve Jegou (Facebook AI Research) |
3693 |
Any-resolution Training for High-resolution Image Synthesis |
Lucy Chai (MIT)*; Michaël Gharbi (Adobe Research); Eli Shechtman (Adobe Research, US); Phillip Isola (MIT); Richard Zhang (Adobe) |
3703 |
HDR-Plenoxels: Self-Calibrating High Dynamic Range Radiance Fields |
Kim Jun-Seong (POSTECH)*; Kim Yu-Ji (POSTECH); Moon Ye-Bin (POSTECH); Tae-Hyun Oh (POSTECH) |
3719 |
PartImageNet: A Large, High-Quality Dataset of Parts |
Ju He (Johns Hopkins University)*; Shuo Yang (University of Technology Sydney); Shaokang Yang (ByteDance); Adam Kortylewski (Max Planck Institute for Informatics); Xiaoding Yuan (Johns Hopkins University); Jie-Neng Chen (Johns Hopkins University); shuai liu (ByteDance Inc.); Cheng Yang (ByteDance Inc.); Qihang Yu (Johns Hopkins University); Alan Yuille (Johns Hopkins University) |
3721 |
Abstracting Sketches through Simple Primitives |
Stephan Alaniz (University of Tübingen)*; Massimiliano Mancini (University of Tübingen); Anjan Dutta (University of Surrey); Diego Marcos (Wageningen University); Zeynep Akata (University of Tübingen) |
3723 |
MTTrans: Cross-Domain Object Detection with Mean Teacher Transformer |
Jinze Yu (Beihang University); Jiaming Liu (Peking University); Xiaobao Wei (Beihang University); Haoyi Zhou (Beihang University); Yohei Nakata (Panasonic Corporation); Denis A Gudovskiy (Panasonic); Tomoyuki Okuno (Panasonic); Jianxin Li (Beihang University); Kurt Keutzer (UC Berkeley); Shanghang Zhang (University of California, Berkeley)* |
3731 |
TAFIM: Targeted Adversarial Attacks against Facial Image Manipulations |
Shivangi Aneja (Technical University of Munich)*; Lev Markhasin (Sony Europe); Matthias Niessner (Technical University of Munich) |
3737 |
NeuMan: Neural Human Radiance Field from a Single Video |
Wei Jiang (University of British Columbia)*; Kwang Moo Yi (University of British Columbia); Golnoosh Samei (UBC); Oncel Tuzel (Apple); Anurag Ranjan (Apple) |
3747 |
Learning Implicit Templates for Point-Based Clothed Human Modeling |
Siyou Lin (Tsinghua University)*; Hongwen Zhang (Tsinghua University); Zerong Zheng (Tsinghua University); Ruizhi Shao (Tsinghua University); Yebin Liu (Tsinghua University) |
3751 |
Event Neural Networks |
Matthew Dutson (University of Wisconsin-Madison)*; Yin Li (University of Wisconsin-Madison); Mohit Gupta (University of Wisconsin-Madison, USA) |
3755 |
Learning to Censor by Noisy Sampling |
Ayush Chopra (MIT)*; Abhinav Java (Adobe, MDSR Labs); Abhishek Singh (MIT); Vivek Sharma (MIT); Ramesh Raskar (Massachusetts Institute of Technology) |
3758 |
ConMatch: Semi-Supervised Learning with Confidence-Guided Consistency Regularization |
Jiwon Kim (Korea University)*; Youngjo Min (Korea University); Daehwan Kim (Samsung electro mechanics); Gyuseong Lee (Korea University); Junyoung Seo (Korea University); Kwangrok Ryoo (Korea University); Seungryong Kim (Korea University) |
3760 |
Granularity-aware Adaptation for Image Retrieval over Multiple Tasks |
Jon Almazan (Naver Labs); Byungsoo Ko (NAVER/LINE Corp.); Geonmo Gu (NAVER corp); Diane Larlus (Naver Labs Europe); Yannis Kalantidis (NAVER LABS Europe)* |
3769 |
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers |
Junting Pan (The Chinese University of Hong Kong); Adrian Bulat (Samsung AI Center, Cambridge); Fuwen Tan (Samsung AI Center, Cambridge); Xiatian Zhu (University of Surrey); Lukasz Dudziak (Samsung AI Center Cambridge); Hongsheng Li (The Chinese University of Hong Kong); Georgios Tzimiropoulos (Queen Mary University of London); Brais Martinez (Samsung AI Center)* |
3780 |
Multi-Domain Multi-Definition Landmark Localization for Small Datasets |
David Ferman (AI Foundation); Gaurav Bharaj (AI Foundation)* |
3781 |
TAVA: Template-free Animatable Volumetric Actors |
Ruilong Li (UC Berkeley)*; Julian Tanke (University of Bonn); Minh P Vo (Facebook Reality Labs); Michael Zollhöfer (Facebook Reality Labs); Jürgen Gall (University of Bonn); Angjoo Kanazawa (University of California Berkeley); Christoph Lassner (Meta Reality Labs Research) |
3792 |
Stereo Depth Estimation with Echoes |
Chenghao Zhang (National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China)*; Kun Tian (Institute of Automation, Chinese Academy of Sciences); Bolin Ni (Institute of Automation, Chinese Academy of Sciences); Gaofeng Meng (Chinese Academy of Sciences); Bin Fan (University of Science and Technology Beijing); Zhaoxiang Zhang (Chinese Academy of Sciences, China); Chunhong Pan (Institute of Automation, Chinese Academy of Sciences) |
3794 |
EASNet: Searching Elastic and Accurate Network Architecture for Stereo Matching |
Qiang Wang (Harbin Institute of Technology (Shenzhen))*; Shaohuai Shi (The Hong Kong University of Science and Technology); Kaiyong Zhao (Hong Kong Baptist University); Xiaowen Chu (Hong Kong University of Science and Technology) |
3798 |
DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection |
Abhinav Kumar (Michigan State University)*; Garrick Brazil (Facebook); Enrique Corona (Ford Motor Company); Armin Parchami (Ford Motor Company); Xiaoming Liu (Michigan State University) |
3809 |
RBP-Pose: Residual Bounding Box Projection for Category-Level Pose Estimation |
Ruida Zhang (Tsinghua University)*; Yan Di (Technical University of Munich); Zhiqiang Lou (Tsinghua University); Fabian Manhardt (Google); Federico Tombari (Google, TU Munich); Xiangyang Ji (Tsinghua University) |
3820 |
Levenshtein OCR |
Cheng Da (Alibaba DAMO Academy)*; Wang Peng (Alibaba DAMO Academy); Cong Yao (Alibaba DAMO Academy) |
3821 |
Multi-Granularity Prediction for Scene Text Recognition |
Wang Peng (Alibaba DAMO Academy); Cheng Da (Alibaba DAMO Academy)*; Cong Yao (Alibaba DAMO Academy) |
3827 |
MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition |
Chuanguang Yang (Institute of Computing Technology, Chinese Academy of Sciences)*; Zhulin An (Institute of Computing Technology, Chinese Academy of Sciences); Helong Zhou (Beijing Horizon Information Technology Co.,Ltd); linhang cai (Institute of Computing Technology, Chinese Academy of Sciences); Xiang Zhi (Institute of Computing Technology, Chinese Academy of Sciences); Jiwen Wu (Institute of Computing Technology, Chinese Academy of Sciences); yongjun xu (Institute of Computing Technology, Chinese Academy of Sciences); Qian Zhang (Horizon Robotics) |
3834 |
Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input |
Qingpei Guo (Ant Financial Services Group)*; Kaisheng Yao (Amazon); Wei Chu (Ant Group) |
3837 |
Efficient Video Transformers with Spatial-temporal Token Selection |
Junke Wang (Fudan University)*; Xitong Yang (University of Maryland); Hengduo Li (University of Maryland, College Park); Li Liu (BirenTech Research); Zuxuan Wu (UMD); Yu-Gang Jiang (Fudan University) |
3844 |
DAS: Densely-Anchored Sampling for Deep Metric Learning |
Lizhao Liu (South China University of Technology); Shangxin Huang (South China University of Technology); Zhuangwei Zhuang (South China University of Technology); Ran Yang (South China University of Technology); Mingkui Tan (South China University of Technology)*; Yaowei Wang (PengCheng Laboratory) |
3864 |
ReCoNet: Recurrent Correction Network for Fast and Efficient Multi-modality Image Fusion |
Zhanbo Huang (Dalian University of Technology); Jinyuan Liu (Dalian University of Technology); Xin Fan (Dalian University of Technology)*; Risheng Liu (Dalian University of Technology); Wei Zhong (Dalian University of Technology); Zhongxuan Luo (DALIAN UNIVERSITY OF TECHNOLOGY) |
3867 |
RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN |
Huy Phan (Rutgers University)*; Cong Shi (Rutgers University); Yi Xie (Rutgers University); Tianfang Zhang (Rutgers University, New Brunswick); Zhuohang Li (University of Tennessee, Knoxville); Tianming Zhao (Temple University); Jian Liu (The University of Tennessee, Knoxville); Yan Wang (Temple University); Yingying Chen (Rutgers University); bo yuan (rutgers university) |
3870 |
Point Cloud Compression with Sibling Context and Surface Priors |
Zhili CHEN (HKUST); Zian Qian (HKUST); Sukai Wang (HKUST); Qifeng Chen (HKUST)* |
3874 |
Self-Feature Distillation with Uncertainty Modeling for Degraded Image Recognition |
zhou yang (Xidian University); Weisheng Dong (Xidian University)*; Xin Li (West Virginia University); Jinjian Wu (Xidian University); Leida Li (Xidian University); Guangming Shi (Xidian University) |
3885 |
Point Cloud Compression using Range Image-based Entropy Model for Autonomous Driving |
Sukai Wang (HKUST)*; Ming Liu (HKUST) |
3904 |
CANF-VC: Conditional Augmented Normalizing Flows for Video Compression |
Yung-Han Ho (NCTU); Chih-Peng Chang (National Chiao Tung University); Peng-Yu Chen (NYCU); Alessandro Gnutti (University of Brescia); Wen-Hsiao Peng (National Yang Ming Chiao Tung University)* |
3912 |
Bi-level Feature Alignment for Versatile Image Translation and Manipulation |
Fangneng Zhan (Max Planck Institute for Informatics); Yingchen Yu (Nanyang Technological University); Rongliang WU (Nanyang Technological University); Jiahui Zhang (Nanyang Technological University); Kaiwen Cui (Nanyang Technological University); Aoran Xiao (Nanyang Technological University); Shijian Lu (Nanyang Technological University)*; Chunyan Miao (NTU) |
3918 |
Lane Detection Transformer based on Multi-frame Horizontal and Vertical Attention and Visual Transformer Module |
Han Zhang (Beihang University)*; Yunchao Gu (BUAA); Xinliang Wang (BUAA); Junjun Pan (Beihang University); Minghui Wang (Beihang University) |
3921 |
Label-Guided Auxiliary Training Improves 3D Object Detector |
yaomin huang (East China Normal University); Xinmei Liu (East China Normal University)*; Yichen Zhu (Midea Group); Zhiyuan Xu (Midea Group); Chaomin Shen (East China Normal University); Zhengping Che (Midea Group); Guixu Zhang (East China Normal University); Yaxin Peng (Department of Mathematics, School of Science, Shanghai University); Feifei Feng (Midea Group); Jian Tang (Midea Group) |
3932 |
FedX: Unsupervised Federated Learning with Cross Knowledge Distillation |
Sungwon Han (KAIST)*; Sungwon Park (KAIST); Fangzhao Wu (MSRA); Sundong Kim (Institute for Basic Science); Chuhan Wu (Tsinghua University); Xing Xie (Microsoft Research Asia); Meeyoung Cha (Institute for Basic Science) |
3936 |
ProposalContrast: Unsupervised Pre-training for LiDAR-based 3D Object Detection |
Junbo Yin (Beijing Institute of Technology); Wenguan Wang (Eidgenössische Technische Hochschule Zürich); Dingfu Zhou (Baidu); Jin Fang (Baidu); Liangjun Zhang (Baidu); Cheng-Zhong Xu (University of Macau); Jianbing Shen (Inception Institute of Artificial Intelligence)* |
3948 |
Audio-Driven Stylized Gesture Generation with Flow-Based Model |
Sheng Ye (Tsinghua University)*; Yu-Hui Wen (Tsinghua University); Yanan Sun (Tsinghua University); Ying He (Nanyang Technological University); Ziyang Zhang (HUAWEI TECHNOLOGIES CO.LTD); Yaoyuan Wang (Huawei Technologies Co., Ltd.); Weihua He (Tsinghua University); Yong-Jin Liu (Tsinghua University) |
3958 |
Unsupervised Domain Adaptation for One-Stage Object Detector using Offsets to Bounding Box |
Jayeon Yoo (Seoul National University); Inseop Chung (Seoul National University); Nojun Kwak (Seoul National University)* |
3964 |
Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework |
Botao Ye (Institute of Computing Technology, Chinese Academy of Sciences)*; Hong Chang (Chinese Academy of Sciences); Bingpeng MA (University of Chinese Academy of Sciences); Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences); Xilin Chen (Institute of Computing Technology, Chinese Academy of Sciences) |
3965 |
PreTraM: Self-Supervised Pre-training via Connecting Trajectory and Map |
Chenfeng Xu (UC Berkeley)*; Tian Li (University of California, San Diego); Chen Tang (UC Berkeley); Lingfeng Sun (UC Berkeley); Kurt Keutzer (EECS, UC Berkeley); Masayoshi TOMIZUKA (MSC Lab); Alireza Fathi (Google); Wei Zhan (University of California, Berkeley) |
3966 |
DeepPS2: Revisiting Photometric Stereo using Two Differently Illuminated Images |
Ashish Tiwari (Indian Institute of Technology Gandhinagar)*; Shanmuganathan Raman (Indian Institute of Technology (IIT) Gandhinagar) |
3977 |
Learn From All: Erasing Attention Consistency for Noisy Label Facial Expression Recognition |
Yuhang Zhang (Beijing University of Posts and Telecommunications); Chengrui Wang (Beijing University of Posts and Telecommunications); Xu Ling (Beijing University of Posts and Telecommunications); Weihong Deng (Beijing University of Posts and Telecommunications)* |
3984 |
Novel Class Discovery without Forgetting |
Joseph K J (Indian Institute of Technology, Hyderabad)*; Sujoy Paul (Google Research); Gaurav Aggarwal (Google); Soma Biswas (Indian Institute of Science, Bangalore); Piyush Rai (IIT Kanpur); Kai Han (The University of Hong Kong); Vineeth N Balasubramanian (Indian Institute of Technology, Hyderabad) |
3985 |
Self-Constrained Inference Optimization on Structural Groups for Human Pose Estimation |
ZheHan Kan (Southern University of Science and Technology); Shuoshuo Chen (Southern University of Science and Technology); Zeng Li (Southern University of Science and Technology); Zhihai He (Southern University of Science and Technology)* |
3989 |
Predicting is not Understanding: Recognizing and Addressing Underspecification in Machine Learning |
Damien Teney (University of Adelaide)*; Maxime Peyrard (EPFL); Ehsan M Abbasnejad (The University of Adelaide) |
3991 |
A Non-isotropic Probabilistic Take on Proxy-based Deep Metric Learning |
Michael Kirchhof (University of Tübingen)*; Karsten Roth (University of Tuebingen); Zeynep Akata (University of Tübingen); Enkelejda Kasneci (University of Tuebingen) |
3998 |
Relative Pose from SIFT Features |
Daniel Barath (ETH Zürich)*; Zuzana Kukelova (Czech Technical University in Prague) |
3999 |
Monocular 3D Object Reconstruction with GAN Inversion |
Junzhe Zhang (Nanyang Technological University)*; Daxuan Ren (Nanyang Technological University); Zhongang Cai (SenseTime International Pte Ltd); Chai Kiat Yeo (Nanyang Technological University); Bo Dai (Shanghai AI Lab); Chen Change Loy (Nanyang Technological University) |
4001 |
PromptDet: Towards Open-vocabulary Detection using Uncurated Images |
Chengjian Feng (Meituan inc.)*; Yujie Zhong (University of Oxford); Zequn Jie (Meituan inc.); Xiangxiang Chu (Meituan); Haibing Ren (Meituan Inc.); Xiaolin Wei (Meituan); Weidi Xie (Shanghai Jiao Tong University); Lin Ma (Meituan) |
4005 |
Densely Constrained Depth Estimator for Monocular 3D Object Detection |
Yingyan Li (CASIA)*; Yuntao Chen (TuSimple); Jiawei He (Institute of Automation, Chinese Academy of Sciences); Zhaoxiang Zhang (Chinese Academy of Sciences, China) |
4016 |
Content Adaptive Latents and Decoder for Neural Image Compression |
Guanbo Pan (Beihang University)*; Guo Lu (Beijing Institute of Technology); Zhihao Hu (Beihang University); Dong Xu (The University of Hong Kong) |
4018 |
High-Fidelity Image Inpainting with GAN Inversion |
Yongsheng YU (University of Chinese Academy of Sciences); Libo Zhang (Institute of Software Chinese Academy of Sciences)*; Heng Fan (University of North Texas); Tiejian Luo (University of Chinese Academy of Sciences) |
4019 |
Spatially Invariant Unsupervised 3D Object-Centric Learning and Scene Decomposition |
Tianyu Wang (The Australian National University); Miaomiao Liu (The Australian National University)*; Kee Siong Ng (The Australian National University) |
4020 |
W2N: Switching From Weak Supervision to Noisy Supervision for Object Detection |
Zitong Huang (Harbin Institute of Technology); Yiping Bao (Megvii(Face++) Inc); Bowen Dong (Harbin Institute of Technology); erjin zhou (megvii); Wangmeng Zuo (Harbin Institute of Technology, China)* |
4021 |
UnrealEgo: A New Dataset for Robust Egocentric 3D Human Motion Capture |
Hiroyasu Akada (Max Planck Institute for Informatics, Keio University); Jian Wang (Max Planck Institute for Informatics); Soshi Shimada (MPI for Informatics); Masaki Takahashi (Keio University); Christian Theobalt (MPI Informatik); Vladislav Golyanik (MPI for Informatics)* |
4022 |
MotionCLIP: Exposing Human Motion Generation to CLIP Space |
Guy Tevet (Tel Aviv University)*; Brian Gordon (Tel Aviv University); Amir Hertz (Tel Aviv University); Amit H Bermano (Tel-Aviv University); Danny Cohen-Or (Tel Aviv University) |
4023 |
Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution |
Jie Liang (The Hong Kong Polytechnic University)*; Hui Zeng (OPPO); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China) |
4024 |
Unidirectional Video Denoising by Mimicking Backward Recurrent Modules with Look-ahead Forward Ones |
Junyi Li (Harbin Institute of Technology); Xiaohe Wu (Harbin Institute of Technology); zhenxing niu (Alibaba Group-Machine Intelligence Technology); Wangmeng Zuo (Harbin Institute of Technology, China)* |
4029 |
Map-free Visual Relocalization: Metric Pose Relative to a Single Image |
Eduardo Arnold (University of Warwick); Jamie M Wynn (Niantic); Sara Vicente (Niantic); Guillermo Garcia-Hernando (Niantic); Aron Monszpart (Niantic); Victor A Prisacariu (Niantic Labs); Daniyar Turmukhambetov (Niantic); Eric Brachmann (Niantic)* |
4032 |
DeltaGAN: Towards Diverse Few-shot Image Generation with Sample-Specific Delta |
Yan Hong (Shanghai Jiao Tong University); Li Niu (Shanghai Jiao Tong University)*; Jianfu Zhang (Shanghai Jiao Tong University); Liqing Zhang (Shanghai Jiao Tong University) |
4035 |
Sample-Adaptive Augmentation for Long-Tailed Image Classification |
Yan Hong (Shanghai Jiao Tong University); Jianfu Zhang (Shanghai Jiao Tong University)*; Zhongyi Sun (Tencent); Ke Yan (Tencent) |
4037 |
TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers |
Jihao Liu (Sensetime)*; Boxiao Liu (Institute of Computing Technology, Chinese Academy of Sciences); Hang Zhou (The Chinese University of Hong Kong); Hongsheng Li (The Chinese University of Hong Kong); Yu Liu (SenseTime Group LTD) |
4041 |
UFO: Unified Feature Optimization |
Teng Xi (Baidu Inc.)*; Yifan Sun (Baidu Research); Deli Yu (Baidu Inc.); Bi Li (Baidu Inc.); Nan Peng (Baidu Inc.); gang zhang (Baidu Inc.); Xinyu Zhang (Baidu Inc.); Zhigang Wang (shanghai AI lab); jinwen chen (Baidu Inc.); Jian Wang (Baidu Inc.); liu lufei (Baidu Inc); Haocheng Feng (Baidu Inc.); Junyu Han (Baidu Inc.); jingtuo liu (baidu); Errui Ding (Baidu Inc.); Jingdong Wang (Baidu) |
4043 |
Master of All: Simultaneous Generalization of Urban-Scene Segmentation to All Adverse Weather Conditions |
Nikhil Reddy (IIT Delhi)*; Abhinav Singhal (Indian Institute of Technology, Delhi); Abhishek Kumar (IIT Delhi); Mahsa Baktashmotlagh (University of Queensland); Chetan Arora (Indian Institute of Technology Delhi) |
4047 |
PalQuant: Accelerating High-precision Networks on Low-precision Accelerators |
Qinghao Hu (Institute of Automation, Chinese Academy of Sciences)*; gang li (shanghai jiao tong university); Qiman Wu (Baidu Inc.); Jian Cheng (Chinese Academy of Sciences, China) |
4057 |
Self-Supervised Learning for Real-World Super-Resolution from Dual Zoomed Observations |
Zhilu Zhang (Harbin Institute of Technology); Ruohao Wang (Harbin Institute of Technology); Hongzhi Zhang (Harbin Institute of Technology); Yunjin Chen (ULSee Inc.); Wangmeng Zuo (Harbin Institute of Technology, China)* |
4059 |
UniMiSS: Universal Medical Self-Supervised Learning via Breaking Dimensionality Barrier |
Yutong Xie (University of Adelaide)*; Jianpeng Zhang (Northwestern Polytechnical University); Yong Xia (Northwestern Polytechnical University, Research & Development Institute of Northwestern Polytechnical University in Shenzhen); Qi Wu (University of Adelaide) |
4073 |
Self-distilled Feature Aggregation for Self-supervised Monocular Depth Estimation |
Zhengming Zhou (NLPR-IA-CAS); Qiulei Dong (NLPR-IA-CAS)* |
4074 |
Negative Samples are at Large: Leveraging Hard-distance Elastic Loss for Re-identification |
Hyungtae Lee (DEVCOM Army Research Laboratory)*; Sungmin Eum (Booz Allen Hamilton Inc.); Heesung Kwon (U.S. Army Research Laboratory) |
4076 |
Global-local Motion Transformer for Unsupervised Skeleton-based Action Learning |
Boeun Kim (Seoul National University)*; Hyung Jin Chang (University of Birmingham); Jungho Kim (KETI); Jin Young Choi (Seoul National University) |
4080 |
Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoiréing |
Xin Yu (The University of Hong Kong)*; Peng Dai (The University of Hong Kong); Wenbo Li (The Chinese University of Hong Kong); Lan Ma (TCL Corporate Research); Jiajun Shen (TCL Research); Jia Li (Sun Yat-Sen University); Xiaojuan Qi (The University of Hong Kong) |
4084 |
Instance Contour Adjustment via Structure-driven CNN |
Shuchen Weng (Peking University)*; Yi Wei (Samsung Research America Inc.); Ming-Ching Chang (University at Albany – SUNY); Boxin Shi (Peking University) |
4085 |
ERDN: Equivalent Receptive Field Deformable Network for Video Deblurring |
Bangrui Jiang (Tsinghua University)*; zhihuai xie (Tencent); Zhen Xia (Tencent); Songnan Li (Tencent); Shan Liu (Tencent America) |
4090 |
Localizing Visual Sounds the Easy Way |
Shentong Mo (Carnegie Mellon University); Pedro Morgado (CMU)* |
4105 |
Polarimetric Pose Prediction |
Daoyi Gao (Technical University of Munich)*; Yitong Li (Technical University of Munich); Patrick Ruhkamp (Technical University of Munich); Iuliia Skobleva (Technical University of Munich); Magdalena Wysocki (Technical University of Munich); HyunJun Jung (Technical University of Munich); Pengyuan Wang (TUM); Arturo Guridi (Technical University of Munich); Benjamin Busam (Technical University of Munich) |
4115 |
DFNet: Enhance Absolute Pose Regression with Direct Feature Matching |
Shuai Chen (University of Oxford)*; Xinghui Li (University of Oxford); Zirui Wang (University of Oxford); Victor Adrian Prisacariu (University of Oxford) |
4117 |
A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge |
Dustin Schwenk (Allen Institute for Artificial Intelligence); Apoorv Khandelwal (Allen Institute for AI); Christopher A Clark (Allen Institute for AI); Kenneth Marino (CMU); Roozbeh Mottaghi (Allen Institute for AI)* |
4119 |
Sound Localization by Self-Supervised Time Delay Estimation |
Ziyang Chen (University of Michigan)*; David Fouhey (University of Michigan); Andrew Owens (U Michigan) |
4120 |
AdaFocus V3: On Unified Spatial-temporal Dynamic Video Recognition |
Yulin Wang (Tsinghua University); Yang Yue (Tsinghua University); Xinhong Xu (Tsinghua University); Ali Hassani (University of Oregon); Victor Kulikov (Picsart); Nikita Orlov (PicsArt); Shiji Song (Department of Automation, Tsinghua University); Humphrey Shi (U of Oregon | UIUC | PAIR); Gao Huang (Tsinghua)* |
4123 |
Discrete-Constrained Regression for Local Counting Models |
Haipeng Xiong (National University of Singapore)*; Angela Yao (National University of Singapore) |
4124 |
Towards Regression-Free Neural Networks for Diverse Compute Platforms |
Rahul Duggal (Georgia Tech); Hao Zhou (Amazon); Shuo Yang (Amazon); Jun Fang (Amazon)*; Yuanjun Xiong (Amazon); Wei Xia (Amazon) |
4130 |
Selection and Cross Similarity for Event-Image Deep Stereo |
Hoonhee Cho (KAIST)*; Kuk-Jin Yoon (KAIST) |
4136 |
Long Movie Clip Classification with State-Space Video Models |
Md Mohaiminul Islam (UNC Chapel Hill)*; Gedas Bertasius (UNC Chapel Hill) |
4145 |
Relationship Spatialization for Depth Estimation |
xiaoyu xu (University of Waterloo)*; Jiayan Qiu (University of Waterloo); Xinchao Wang (National University of Singapore); Zhou Wang (University of Waterloo) |
4150 |
Breadcrumbs: Adversarial Class-Balanced Sampling for Long-tailed Recognition |
Bo Liu (Wormpex AI Research)*; Haoxiang Li (Wormpex AI Research); Hao Kang (Wormpex AI Research); Gang Hua (Wormpex AI Research); Nuno Vasconcelos (UCSD, USA) |
4152 |
Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models |
Chenfeng Xu (UC Berkeley)*; Shijia Yang (UC Berkeley); Tomer Galanti (Massachusetts Institute of Technology); Bichen Wu (Facebook Research); Xiangyu Yue (University of California, Berkeley); Bohan Zhai (UC Berkeley); Wei Zhan (University of California, Berkeley); Kurt Keutzer (EECS, UC Berkeley); Peter Vajda (Facebook); Masayoshi Tomizuka (University of California, Berkeley) |
4175 |
Visual Prompt Tuning |
Menglin Jia (Cornell University)*; Luming Tang (Cornell University); Bor-Chun Chen (Facebook AI); Claire T Cardie (Cornell University); Serge Belongie (University of Copenhagen); Bharath Hariharan (Cornell University); Ser-Nam Lim (Meta AI) |
4181 |
Multi-scale and Cross-scale Contrastive Learning for Semantic Segmentation |
THEODOROS PISSAS (University College London)*; Claudio S Ravasio (King’s College London (KCL)); Lyndon DaCruz (Moorfields Eye Hospital / University College London); Christos Bergeles (King’s College London) |
4185 |
Rethinking Generic Camera Models for Deep Single Image Camera Calibration to Recover Rotation and Fisheye Distortion |
Nobuhiko Wakai (Panasonic Corporation)*; Satoshi Sato (Panasonic Corporation); Yasunori Ishii (Panasonic Holdings); Takayoshi Yamashita (Chubu University) |
4188 |
Neural-Sim: Learning to Generate Training Data with NeRF |
Yunhao Ge (University of Southern California)*; Harkirat Behl (University of Oxford); Jiashu Xu (USC); Suriya Gunasekar (Microsoft Research); Neel Joshi (MICROSOFT RESEARCH); Yale Song (FAIR); Xin Wang (Microsoft Research); Laurent Itti (University of Southern California); Vibhav Vineet (Microsoft Research) |
4195 |
Word-Level Fine-Grained Story Visualization |
Bowen Li (University of Oxford)* |
4206 |
Chairs Can be Stood on: Overcoming Object Bias in Human-Object Interaction Detection |
Guangzhi Wang (National University of Singapore)*; Yangyang Guo (National University of Singapore); Yongkang Wong (National University of Singapore); Mohan Kankanhalli (National University of Singapore) |
4208 |
GOCA: Guided Online Cluster Assignment for Self Supervised Video Representation Learning |
HUSEYIN COSKUN (Technical University of Munich)*; Alireza Zareian (Snap Inc.); Joshua L Moore (Snapchat); Federico Tombari (Google, TU Munich); Chen Wang (Snap Inc.) |
4217 |
Learning Audio-Video Modalities from Image Captions |
Arsha Nagrani (Google)*; Paul Hongsuck Seo (Google); Bryan Seybold (Google); Anja Hauth (Google AI); Santiago Manen (Google); Chen Sun (Brown University); Cordelia Schmid (Google) |
4220 |
Inverted Pyramid Multi-task Transformer for Dense Scene Understanding |
Hanrong Ye (The Hong Kong University of Science and Technology)*; Dan Xu (The Hong Kong University of Science and Technology) |
4222 |
Image Inpainting with Cascaded Modulation GAN and Object-Aware Training |
Haitian Zheng (University of Rochester)*; Zhe Lin (Adobe Research); Jingwan Lu (Adobe Research); Scott Cohen (Adobe Research); Eli Shechtman (Adobe Research, US); Connelly Barnes (Adobe); Jianming Zhang (Adobe Research); Ning Xu (Adobe Research); Sohrab Amirghodsi (Adobe Research); Jiebo Luo (U. Rochester) |
4231 |
Planes vs. Chairs: Category-guided 3D shape learning without any 3D cues |
Zixuan Huang (Georgia Institute of Technology)*; Stefan Stojanov (Georgia Institute of Technology); Anh Thai (Georgia Institute of Technology); Varun Jampani (Google); James Rehg (Georgia Institute of Technology) |
4237 |
ART-SS: An Adaptive Rejection Technique for Semi-Supervised restoration for adverse weather-affected images |
Rajeev Yasarla (AIBEE)*; Carey E Priebe (Johns Hopkins University); Vishal Patel (Johns Hopkins University) |
4239 |
Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction |
Maosen Li (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University)*; Siheng Chen (Shanghai Jiao Tong University); Zijing Zhang (Zhejiang University); Lingxi Xie (Huawei Inc.); Qi Tian (Huawei Cloud & AI); Ya Zhang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University) |
4241 |
MHR-Net: Multiple-Hypothesis Reconstruction of Non-Rigid Shapes from 2D Views |
Haitian Zeng (University of Technology Sydney)*; Xin Yu (University of Technology Sydney); Jiaxu Miao (Zhejiang University); Yi Yang (Zhejiang University) |
4243 |
Unifying Event Detection and Captioning as Sequence Generation via Pre-Training |
Qi Zhang (Renmin University of China)*; Yuqing Song (Renmin University of China); Qin Jin (Renmin University of China) |
4247 |
Depth Map Decomposition for Monocular Depth Estimation |
Jinyoung Jun (Korea University)*; Jae-Han Lee (Gauss Labs Inc.); Chul Lee (Dongguk University); Chang-Su Kim (Korea University) |
4249 |
Human-centric Image Cropping with Partition-aware and Content-preserving Features |
Bo Zhang (Shanghai Jiao Tong University)*; Li Niu (Shanghai Jiao Tong University); Xing Zhao (Shanghai Jiao Tong University); Liqing Zhang (Shanghai Jiao Tong University) |
4252 |
Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking |
Boyu Chen (The University of Sydney); Peixia Li (The University of Sydney)*; Lei Bai (Shanghai AI Laboratory); Lei Qiao (SenseTime Group Limited); Qiuhong Shen (Harbin Institute of Technology (Shenzhen)); Bo Li (SenseTime Group Limited); Weihao Gan (SenseTime Group Limited); Wei Wu (SenseTime Group Limited); Wanli Ouyang (The University of Sydney) |
4255 |
StyleFace: Towards Identity-Disentangled Face Generation on Megapixels |
Yuchen Luo (Shanghai Jiao Tong University)*; Junwei Zhu (Tencent); Keke He (Tencent); Wenqing Chu (Tencent); Ying Tai (Tencent YouTu); Junchi Yan (Shanghai Jiao Tong University); Chengjie Wang (Tencent; Shanghai Jiao Tong University) |
4260 |
Fusion from Decomposition: A Self-Supervised Decomposition Approach for Image Fusion |
Pengwei Liang (Harbin Institute of Technology)*; Junjun Jiang (Harbin Institute of Technology); Xianming Liu (Harbin Institute of Technology); Jiayi Ma (Wuhan University) |
4261 |
Learning Degradation Representations for Image Deblurring |
dasong Li (Chinese University of Hong Kong)*; Yi Zhang (CUHK); Ka Chun Cheung (Nvidia); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Hongwei Qin (Sensetime); Hongsheng Li (The Chinese University of Hong Kong) |
4269 |
Aware of the History: Trajectory Forecasting with the Local Behavior Data |
Yiqi Zhong (University of Southern California)*; Zhenyang Ni (Shanghai Jiao Tong University); Siheng Chen (Shanghai Jiao Tong University); Ulrich Neumann (USC) |
4270 |
FAR: Fourier Aerial Video Recognition |
Divya Kothandaraman (University of Maryland College Park)*; Tianrui Guan (University of Maryland, College Park); Xijun Wang (University of Maryland, College Park); Shuowen Hu (US Army Research Laboratory); Ming C Lin (UMD-CP & UNC-CH); Dinesh Manocha (University of Maryland at College Park) |
4271 |
X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation |
Yinan He (Beijing University of Posts and Telecommunications)*; Gengshi Huang (School of Electronics and Information Technology, Sun Yat-sen University); Siyu Chen (Carnegie Mellon University); Jianing Teng (sensetime); Kun Wang (SenseTime Group Limited); Zhenfei Yin (Sensetime); Lu Sheng (Beihang University); Ziwei Liu (Nanyang Technological University); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Jing Shao (Sensetime) |
4273 |
Disentangled Differentiable Network Pruning |
Shangqian Gao (University of Pittsburgh)*; Feihu Huang (University of Pittsburgh); Yanfu Zhang (University of Pittsburgh); Heng Huang (University of Pittsburgh) |
4275 |
Video Extrapolation in Space and Time |
Yunzhi Zhang (Stanford University)*; Jiajun Wu (Stanford University) |
4277 |
IDa-Det: An Information Discrepancy-aware Distillation for 1-bit Detectors |
Sheng Xu (Beihang University)*; Yanjing Li (Beihang University); Bohan Zeng (Beihang University); Teli Ma (Shanghai Artificial Intelligence Laboratory); Baochang Zhang (Beihang University); Xianbin Cao (Beihang University, China); Peng Gao (Chinese university of hong kong); Jinhu Lu (Beihang University, Beijing, China) |
4278 |
Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation |
chuang lin (Monash University)*; Yi Jiang (Bytedance); Jianfei Cai (Monash University); Lizhen Qu (Monash University); Reza Haffari (Monash University, Australia); Zehuan Yuan (Bytedance.Inc) |
4282 |
DnA: Improving Few-shot Transfer Learning with Low-Rank Decomposition and Alignment |
Ziyu Jiang (Texas A&M University)*; Tianlong Chen (University of Texas at Austin); Xuxi Chen (University of Texas at Austin); Yu Cheng (Microsoft Research); Luowei Zhou (Microsoft); Lu Yuan (Microsoft); Ahmed Awadallah (Microsoft); Zhangyang Wang (University of Texas at Austin) |
4284 |
Translating a Visual LEGO Manual to a Machine-Executable Plan |
Ruocheng Wang (Stanford University)*; Yunzhi Zhang (Stanford University); Jiayuan Mao (MIT); Chin-Yi Cheng (Google Research); Jiajun Wu (Stanford University) |
4286 |
Cornerformer: Purifying Instances for Corner-based Detectors |
Haoran Wei (University of Chinese Academy of Sciences)*; Xin Chen (Huawei Inc.); Lingxi Xie (Huawei Inc.); Qi Tian (Huawei Cloud & AI) |
4287 |
Contributions of Shape, Texture, and Color in Visual Recognition |
Yunhao Ge (University of Southern California)*; Yao Xiao (University of Southern California); Zhi Xu (University of Southern California); Xingrui Wang (University of Southern California); Laurent Itti (University of Southern California) |
4288 |
Monitored Distillation for Positive Congruent Depth Completion |
Tian Yu Liu (UCLA); Parth Agrawal (UCLA); Allison Y Chen (University of California, Los Angeles); Byung-Woo Hong (Chung-Ang University); Alex Wong (Yale University)* |
4292 |
Towards Unbiased Label Distribution Learning for Facial Pose Estimation Using Anisotropic Spherical Gaussian |
Zhiwen Cao (Purdue University); Dongfang Liu (Rochester Institute of Technology)*; Qifan Wang (Meta AI); Yingjie Victor Chen (Purdue University) |
4293 |
AirDet: Few-Shot Detection without Fine-tuning for Autonomous Exploration |
Bowen Li (Tongji University)*; Chen Wang (Carnegie Mellon University); Pranay Reddy Anthireddy (Indian Institute of Information Technology, Design and Manufacturing, Jabalpur); Seungchan Kim (Carnegie Mellon University); Sebastian Scherer (Carnegie Mellon University) |
4295 |
Learning to Weight Samples for Dynamic Early-exiting Networks |
Yizeng Han (Tsinghua University); Yifan Pu (Tsinghua University); Zihang Lai (CMU); Chaofei Wang (Tsinghua University); Shiji Song (Department of Automation, Tsinghua University); Cao Junfeng (CMRI); Wenhui Huang (CMRI); Chao Deng (China Mobile Research Institute); Gao Huang (Tsinghua)* |
4300 |
Constrained Mean Shift Using Distant Yet Related Neighbors for Representation Learning |
K L Navaneet (University of California, Davis); Soroush Abbasi Koohpayegani (University of Maryland Baltimore County)*; Ajinkya B Tejankar (UMBC); Kossar Pourahmadi Meibodi (University of Maryland, Baltimore County); Akshayvarun Subramanya (UMBC); Hamed Pirsiavash (University of California Davis) |
4303 |
SLIP: Self-supervision meets Language-Image Pre-training |
Norman Mu (University of California, Berkeley)*; Alexander Kirillov (Facebook AI Research); David Wagner (UC Berkeley); Saining Xie (Facebook AI Research) |
4304 |
Learning Visual Styles from Audio-Visual Associations |
Tingle Li (Tsinghua University)*; Yichen Liu (Tsinghua University); Andrew Owens (U Michigan); Hang Zhao (Tsinghua University) |
4305 |
Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting |
Ying Chen (Hikvision Research Institute); Liang Qiao (Zhejiang University & Hikvision Research Institute)*; Zhanzhan Cheng (Zhejiang University & Hikvision Research Institute); Shiliang Pu (Hikvision Research Institute); Yi Niu (Hikvision Research Institute); Xi Li (Zhejiang University) |
4310 |
Prompting Visual-Language Models for Efficient Video Understanding |
Chen Ju (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University); Tengda Han (University of Oxford); Kunhao Zheng (Shanghai Jiaotong University); Ya Zhang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University); Weidi Xie (Shanghai Jiao Tong University)* |
4318 |
One-Trimap Video Matting |
Hongje Seong (Yonsei University)*; Seoung Wug Oh (Adobe Research); Brian Price (Adobe); Euntai Kim (Yonsei University); Joon-Young Lee (Adobe Research) |
4323 |
Contrastive Learning for Diverse Disentangled Foreground Generation |
Yuheng Li (UW Madison)*; Yijun Li (Adobe Research); Jingwan Lu (Adobe Research); Eli Shechtman (Adobe Research, US); Yong Jae Lee (University of Wisconsin-Madison); Krishna Kumar Singh (Adobe Research) |
4326 |
Resolution-free Point Cloud Sampling Network with Data Distillation |
Tianxin Huang (Zhejiang University)*; Jiangning Zhang (Zhejiang University); Jun Chen (Zhejiang University); Yuang Liu (Zhejiang University); Yong Liu (Zhejiang University) |
4327 |
BIPS: Bi-modal Indoor Panorama Synthesis via Residual Depth-aided Adversarial Learning |
Changgyoon Oh (KAIST)*; Wonjune Cho (NAVER LABS); Yujeong Chae (KAIST); Daehee Park (KAIST); Lin Wang (HKUST); Kuk-Jin Yoon (KAIST) |
4330 |
Augmentation of rPPG Benchmark Datasets: Learning to Remove and Embed rPPG Signals via Double Cycle Consistent Learning from Unpaired Facial Videos |
Wei-Hao Chung (National Tsing Hua University)*; Cheng-Ju Hsieh (National Tsing Hua University); Chiou-Ting Hsu (National Tsing Hua University) |
4331 |
Fabric Material Recovery from Video Using Multi-Scale Geometric Auto-Encoder |
Junbang Liang (University of Maryland, College Park)*; Ming C Lin (UMD-CP & UNC-CH) |
4333 |
An Invisible Black-box Backdoor Attack through Frequency Domain |
Tong Wang (Nanjing University); Yuan Yao (Nanjing University)*; Feng Xu (Nanjing University); Shengwei An (Purdue University); Hanghang Tong (University of Illinois at Urbana-Champaign); Ting Wang (Penn State) |
4336 |
Learning Mutual Modulation for Self-Supervised Cross-Modal Super-Resolution |
Xiaoyu Dong (The University of Tokyo / RIKEN AIP); Naoto Yokoya (The University of Tokyo)*; Longguang Wang (National University of Defense Technology); Tatsumi Uezato (Hitachi, Ltd) |
4338 |
TransGrasp: Grasp Pose Estimation of a Category of Objects by Transferring Grasps from Only One Labeled Instance |
Hongtao Wen (Dalian University of Technology); Jianhang Yan (Dalian University of Technology); Wanli Peng (Dalian University of Technology)*; Yi Sun (Dalian University of Technology) |
4343 |
Learning Instance and Task-Aware Dynamic Kernels for Few-shot Learning |
Rongkai Ma (Monash University)*; Pengfei Fang (The Australian National University); Gil Avraham (Monash University); Yan Zuo (CSIRO); Tianyu Zhu (Monash University); Tom Drummond (University of Melbourne); Mehrtash Harandi (Monash University) |
4346 |
PillarNet: Real-Time and High-Performance Pillar-based 3D Object Detection |
Guangsheng Shi (Harbin Institute of Technology)*; Ruifeng Li (Harbin Institute of Technology); Chao Ma (Shanghai Jiao Tong University) |
4348 |
Robust Object Detection With Inaccurate Bounding Boxes |
Chengxin Liu (Huazhong University of Science and Technology); Kewei Wang (Huazhong Univ. of Sci.&Tech.); Hao Lu (Huazhong University of Science and Technology); Zhiguo Cao (Huazhong Univ. of Sci.&Tech.)*; Ziming Zhang (Worcester Polytechnic Institute) |
4349 |
Revisiting the Critical Factors of Augmentation-Invariant Representation Learning |
Junqiang Huang (MEGVII Technology)*; Xiangwen Kong (MEGVII Technology); Xiangyu Zhang (Megvii Technology) |
4359 |
A Fast Knowledge Distillation Framework for Visual Recognition |
Zhiqiang Shen (Carnegie Mellon University)*; Eric Xing (MBZUAI, CMU, and Petuum Inc.) |
4366 |
MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment |
Jie Ren (Megvii Inc.); Wenteng Liang (Megvii); Ran Yan (Megvii)*; Luo Mai (University of Edinburgh); Shiwen Liu (Megvii); Xiao Liu (Megvii Inc) |
4367 |
Spectrum-aware and Transferable Architecture Search for Hyperspectral Image Restoration |
Wei He (Wuhan University)*; Quanming Yao (Tsinghua University); Naoto Yokoya (The University of Tokyo); Tatsumi Uezato (Hitachi, Ltd); Hongyan Zhang (Wuhan University); Liangpei Zhang (Wuhan University) |
4374 |
Boosting Transferability of Targeted Adversarial Examples via Hierarchical Generative Networks |
Xiao Yang (Tsinghua University)*; Yinpeng Dong (Tsinghua University); Tianyu Pang (Sea AI Lab); Hang Su (Tsinghua University); Jun Zhu (Tsinghua University) |
4378 |
Exploring the Devil in Graph Spectral Domain for 3D Point Cloud Attacks |
Qianjiang Hu (Peking University); Daizong Liu (Peking University); Wei Hu (Peking University)* |
4385 |
Geometry-aware Single-image Full-body Human Relighting |
Chaonan Ji (Tsinghua University); Tao Yu (Tsinghua University); Kaiwen Guo (Google); Jingxin Liu (OPPO); Yebin Liu (Tsinghua University)* |
4388 |
Optical Flow Training under Limited Label Budget via Active Learning |
Shuai Yuan (Duke University)*; Xian Sun (Duke University); Hannah H Kim (Duke University); Shuzhi Yu (Duke University); Carlo Tomasi (Duke University) |
4395 |
RVSL: Robust Vehicle Similarity Learning in Real Hazy Scenes Based on Semi-supervised Learning |
Wei-Ting Chen (National Taiwan University)*; I-Hsiang Chen (National Taiwan University); Chih-Yuan Yeh (National Taiwan University); Hao-Hsiang Yang (National Taiwan University); Hua-En Chang (National Taiwan University); Jian-Jiun Ding (National Taiwan University); Sy-Yen Kuo (National Taiwan University) |
4400 |
Hierarchical Feature Embedding for Visual Tracking |
Zhixiong Pi (Huazhong University of Science and Technology)*; Weitao Wan (Tencent); Chong Sun (Tencent Wechat); Changxin Gao (Huazhong University of Science and Technology); Nong Sang (Huazhong University of Science and Technology); Chen Li (Tencent) |
4401 |
Neural Color Operators for Sequential Image Retouching |
Yili Wang (Tsinghua University); Xin Li (Baidu); Kun Xu (Tsinghua University)*; Dongliang He (Baidu); Qi Zhang (Baidu); Fu Li (Baidu); Errui Ding (Baidu Inc.) |
4402 |
Optimizing Image Compression via Joint Learning with Denoising |
Ka Leong Cheng (The Hong Kong University of Science and Technology); Yueqi Xie (The Hong Kong University of Science and Technology); Qifeng Chen (HKUST)* |
4405 |
DICE: Leveraging Sparsification for Out-of-Distribution Detection |
Yiyou Sun (University of Wisconsin Madison); Yixuan Li (University of Wisconsin-Madison)* |
4406 |
DeMFI: Deep Joint Deblurring and Multi-Frame Interpolation with Flow-Guided Attentive Correlation and Recursive Boosting |
Jihyong Oh (KAIST)*; Munchurl Kim (Korea Advanced Institute of Science and Technology) |
4408 |
Invariant Feature Learning for Generalized Long-Tailed Classification |
Kaihua Tang (Nanyang Technological University)*; Mingyuan Tao (Damo Academy, Alibaba Group); Jiaxin Qi (Nanyang Technological University); Zhenguang Liu (Zhejiang University); Hanwang Zhang (Nanyang Technological University) |
4411 |
Fine-Grained Visual Entailment |
Christopher L Thomas (Columbia University)*; Yipeng Zhang (Columbia University); Shih-Fu Chang (Columbia University) |
4412 |
Sliced Recursive Transformer |
Zhiqiang Shen (Carnegie Mellon University)*; Zechun Liu (Carnegie Mellon University); Eric Xing (MBZUAI, CMU, and Petuum Inc.) |
4413 |
Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval |
Fan Hu (Renmin University of China); Aozhu Chen (Renmin University of China); Ziyue Wang (Renmin University of China); Fangming Zhou (Renmin University of China); Jianfeng Dong (Zhejiang Gongshang University); Xirong Li (Renmin University of China)* |
4416 |
Asymmetric Relation Consistency Reasoning for Video Relation Grounding |
Huan Li (Xi’an Jiaotong University); Ping Wei (Xi’an Jiaotong University)*; Jiapeng Li (Xi’an Jiaotong University); Zeyu Ma (Xi’an Jiaotong University); Jiahui Shang (Xi’an Jiaotong University); Nanning Zheng (Xi’an Jiaotong University) |
4420 |
PETR: Position Embedding Transformation for Multi-View 3D Object Detection |
Yingfei Liu (Megvii Technology); Tiancai Wang (Megvii Technology)*; Xiangyu Zhang (Megvii Technology); Jian Sun (Megvii Technology) |
4422 |
Contextual Text Block Detection towards Scene Text Understanding |
Chuhui Xue (Nanyang Technological University); Jiaxing Huang (Nanyang Technological University); Wenqing Zhang (ByteDance); Shijian Lu (Nanyang Technological University)*; Changhu Wang (ByteDance.Inc); Song Bai (University of Oxford) |
4426 |
Structure-aware Editable Morphable Model for 3D Facial Detail Animation and Manipulation |
Jingwang Ling (Tsinghua University); Zhibo Wang (Tsinghua University); Ming Lu (Intel Labs China); Quan Wang (Sensetime); Chen Qian (SenseTime); Feng Xu (Tsinghua University)* |
4429 |
UniNet: Unified Architecture Search with Convolution, Transformer, and MLP |
Jihao Liu (Sensetime)*; Xin Huang (Waseda University); Guanglu Song (Sensetime); Hongsheng Li (The Chinese University of Hong Kong); Yu Liu (SenseTime Group LTD) |
4433 |
Efficient Decoder-free Object Detection with Transformers |
Peixian Chen (Youtu Tencent); Mengdan Zhang (Youtu, Tencent); Yunhang Shen (Xiamen University); Kekai Sheng (Youtu Lab, Tencent Inc.); Yuting Gao (Tencent); Xing Sun (Shopee); Ke Li (Tencent)*; Chunhua Shen (University of Adelaide, Australia) |
4439 |
Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation |
William McNally (University of Waterloo)*; Kanav Vats (University of Waterloo); Alexander Wong (University of Waterloo); John McPhee (University of Waterloo) |
4440 |
CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation |
Lu Qi (The Chinese University of Hong Kong)*; Jason Kuen (Adobe Research); Zhe Lin (Adobe Research); Jiuxiang Gu (Adobe Research); Fengyun Rao (Tencent); Dian Li (Tencent.com); Weidong Guo (Tencent); Zhen Wen (Tencent Technology (Shenzhen) Co., Ltd); Ming-Hsuan Yang (University of California at Merced); Jiaya Jia (Chinese University of Hong Kong) |
4447 |
StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning |
Jinghuan Shang (Stony Brook University)*; Kumara Kahatapitiya (Stony Brook University); Xiang Li (Stony Brook University); Michael S Ryoo (Stony Brook/Google) |
4451 |
S2Net: Stochastic Sequential Pointcloud Forecasting |
Xinshuo Weng (NVIDIA Research)*; Junyu Nan (Carnegie Mellon University); Kuan-Hui Lee (Toyota Research Institute); Rowan McAllister (Toyota Research Institute); Adrien Gaidon (Toyota Research Institute); Nicholas Rhinehart (UC Berkeley); Kris Kitani (Carnegie Mellon University) |
4452 |
D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding |
Zhenyu Chen (Technical University of Munich)*; Qirui Wu (Simon Fraser University); Matthias Niessner (Technical University of Munich); Angel X Chang (Simon Fraser University) |
4464 |
AMixer: Adaptive Weight Mixing for Self-Attention Free Vision Transformers |
Yongming Rao (Tsinghua University); Wenliang Zhao (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)* |
4471 |
Neural Image Representations for Multi-Image Fusion and Layer Separation |
Seonghyeon Nam (York University); Marcus A Brubaker (York University); Michael S Brown (York University)* |
4477 |
Panoramic Human Activity Recognition |
Ruize Han (College of Intelligence and Computing, Tianjin University); Haomin Yan (Tianjin University); Jiacheng Li (College of Intelligence and Computing, Tianjin University); Songmiao Wang (Tianjin University); Wei Feng (College of Intelligence and Computing, Tianjin University, China)*; Song Wang (University of South Carolina) |
4478 |
Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution |
Yushu Wu (Northeastern University)*; Yifan Gong (Northeastern University); Pu Zhao (Northeastern University); Yanyu Li (Northeastern University); Zheng Zhan (Northeastern University); Wei Niu (William & Mary); Hao Tang (ETH Zurich); Minghai Qin (Western Digital Research); Bin Ren (William & Mary); Yanzhi Wang (Northeastern University) |
4481 |
Dual Adaptive Transformations for Weakly Supervised Point Cloud Segmentation |
Zhonghua Wu (Nanyang Technological University)*; Yicheng Wu (Monash University); Guosheng Lin (Nanyang Technological University); Jianfei Cai (Monash University); Chen Qian (SenseTime) |
4495 |
Modality Synergy Complement Learning with Cascaded Aggregation for Visible-Infrared Person Re-Identification |
Yiyuan Zhang (Beijing Institute of Technology); Sanyuan Zhao (Beijing Institute of Technology)*; Yuhao Kang (Beijing Institute of Technology); Jianbing Shen (Inception Institute of Artificial Intelligence) |
4496 |
RA-Depth: Resolution Adaptive Self-Supervised Monocular Depth Estimation |
Mu He (Nanjing University of Science and Technology)*; Le Hui (Nanjing University of Science and Technology); Yikai Bian (Nanjing University of Science and Technology); Jian Ren (Nanjing University of Science and Technology); Jin Xie (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology) |
4505 |
MoFaNeRF: Morphable Facial Neural Radiance Field |
Yiyu Zhuang (Nanjing University); Hao Zhu (Nanjing University)*; Xusen Sun (Nanjing University); Xun Cao (Nanjing University) |
4513 |
Visual Cross-View Metric Localization with Dense Uncertainty Estimates |
Zimin Xia (Delft University of Technology)*; Olaf Booij (TomTom); Marco Manfredi (TomTom); Julian F P Kooij (Delft University of Technology) |
4525 |
The One Where They Reconstructed 3D Humans and Environments in TV Shows |
Georgios Pavlakos (UC Berkeley)*; Ethan Weber (UC Berkeley); Matthew Tancik (UC Berkeley); Angjoo Kanazawa (University of California Berkeley) |
4530 |
PointInst3D: Segmenting 3D Instances by Points |
Tong He (University of Adelaide)*; Wei Yin (University of Adelaide); Chunhua Shen (University of Adelaide, Australia); Anton van den Hengel (University of Adelaide) |
4533 |
PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation |
Haobo Yuan (Wuhan University)*; Xiangtai Li (Peking University); Yibo Yang (Peking University); Guangliang Cheng (Sensetime Group Limited); Jing Zhang (The University of Sydney); Yunhai Tong (Peking University); Lefei Zhang (Wuhan University); Dacheng Tao (JD.com) |
4534 |
Quasi-Balanced Self-Training on Noise-Aware Synthesis of Object Point Clouds for Closing Domain Gap |
Yongwei Chen (South China University of Technology); ZiHao Wang (South China University of Technology); Longkun Zou (South China University of Technology); Ke Chen (South China University of Technology); Kui Jia (South China University of Technology)* |
4537 |
TinyViT: Fast Pretraining Distillation for Small Vision Transformers |
Kan Wu (Sun Yat-sen University); Jinnian Zhang (University of Wisconsin Madison); Houwen Peng (Microsoft Research)*; Mengchen Liu (Microsoft); Bin Xiao (Microsoft); Jianlong Fu (Microsoft Research); Lu Yuan (Microsoft) |
4551 |
VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data |
Jiajun Su (Peking University)*; Chunyu Wang (Microsoft Research Asia); Xiaoxuan Ma (Peking University); Wenjun Zeng (EIT Institute for Advanced Study); Yizhou Wang (PKU) |
4552 |
Poseur: Direct Human Pose Regression with Transformers |
Weian Mao (The University of Adelaide)*; Yongtao Ge (The University of Adelaide); Chunhua Shen (University of Adelaide, Australia); Xinlong Wang (University of Adelaide); Zhi Tian (Meituan); Zhibin Wang (Alibaba Group); Anton van den Hengel (University of Adelaide) |
4557 |
Adaptive Image Transformations for Transfer-based Adversarial Attack |
Zheng Yuan (Institute of Computing Technology, Chinese Academy of Sciences); Jie Zhang (ICT, CAS)*; Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences) |
4566 |
D2ADA: Dynamic Density-aware Active Domain Adaptation for Semantic Segmentation |
Tsung-Han Wu (National Taiwan University)*; Yi-Syuan Liou (National Taiwan University); Shao-Ji Yuan (National Taiwan University); Hsin-Ying Lee (National Taiwan University); Tung-I Chen (National Taiwan University); Kuan-Chih Huang (National Taiwan University); Winston H. Hsu (National Taiwan University) |
4568 |
SQN: Weakly-Supervised Semantic Segmentation of Large-Scale 3D Point Clouds |
Qingyong Hu (University of Oxford); Bo Yang (The Hong Kong Polytechnic University)*; Guangchi Fang (Sun Yat-sen University); Yulan Guo (Sun Yat-sen University); Ales Leonardis (University of Birmingham); Niki Trigoni (University of Oxford); Andrew Markham (University of Oxford) |
4581 |
Deep Portrait Delighting |
Joshua William Weir (Victoria University of Wellington)*; Junhong Zhao (CMIC); Andrew Chalmers (CMIC); Taehyun Rhee (Victoria University of Wellington) |
4584 |
Vector Quantized Image-to-Image Translation |
Yu-Jie Chen (National Chiao Tung University); Shin-I Cheng (National Chiao Tung University); Wei-Chen Chiu (National Chiao Tung University)*; Hung-Yu Tseng (Facebook); Hsin-Ying Lee (Snap Inc) |
4588 |
PointMixer: MLP-Mixer for Point Cloud Understanding |
Jaesung Choe (KAIST)*; Chunghyun Park (POSTECH); Francois Rameau (KAIST); Jaesik Park (POSTECH); In So Kweon (KAIST) |
4589 |
V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer |
Runsheng Xu (University of California, Los Angeles); Hao Xiang (University of California, Los Angeles); Zhengzhong Tu (University of Texas at Austin); Xin Xia (University of California, Los Angeles); Ming-Hsuan Yang (University of California at Merced); Jiaqi Ma (University of California, Los Angeles)* |
4593 |
Cross-Domain Ensemble Distillation for Domain Generalization |
Kyungmoon Lee (POSTECH)*; Sungyeon Kim (POSTECH); Suha Kwak (POSTECH) |
4596 |
Cross-Modal 3D Shape Generation and Manipulation |
Zezhou Cheng (University of Massachusetts, Amherst)*; Menglei Chai (Snap Inc.); Jian Ren (Snap Inc.); Hsin-Ying Lee (Snap Inc); Kyle B Olszewski (Snap Inc.); Zeng Huang (Snap Inc.); Subhransu Maji (University of Massachusetts, Amherst); Sergey Tulyakov (Snap Inc) |
4607 |
Latent Partition Implicit with Surface Codes for 3D Representation |
Chao Chen (Tsinghua University); Yu-Shen Liu (Tsinghua University)*; Zhizhong Han (Wayne State University) |
4614 |
FILM: Frame Interpolation for Large Motion |
Fitsum Reda (Google)*; Janne Kontkanen (Google); Eric Tabellion (Google); Deqing Sun (Google); Caroline Pantofaru (Google Research); Brian Curless (University of Washington) |
4619 |
Facial Depth and Normal Estimation using Single Dual-Pixel Camera |
Minjun Kang (KAIST)*; Jaesung Choe (KAIST); Hyowon Ha (Facebook); Hae-Gon Jeon (GIST); Sunghoon Im (DGIST); In So Kweon (KAIST); Kuk-Jin Yoon (KAIST) |
4622 |
Initialization and Alignment for Adversarial Texture Optimization |
Xiaoming Zhao (University of Illinois at Urbana-Champaign)*; Zhizhen Zhao (University of Illinois at Urbana-Champaign); Alexander Schwing (UIUC) |
4631 |
Regularizing Vector Embedding in Bottom-Up Human Pose Estimation |
Haixin Wang (School of Artificial Intelligence, University of Chinese Academy of Sciences)*; Lu Zhou (CASIA); Yingying Chen (CASIA); Ming Tang (Institute of Automation, Chinese Academy of Sciences); Jinqiao Wang (Institute of Automation, Chinese Academy of Sciences) |
4633 |
Equivariant Hypergraph Neural Networks |
Jinwoo Kim (KAIST); Saeyoon Oh (KAIST); Sungjun Cho (LG AI Research); Seunghoon Hong (KAIST)* |
4636 |
Learning Quality-aware Dynamic Memory for Video Object Segmentation |
Yong Liu (Tsinghua University)*; Ran Yu (Tsinghua University); Fei Yin (Tsinghua University); Xinyuan Zhao (Huawei); Wei Zhao (Huawei); Weihao Xia (University College London); Yujiu Yang (Tsinghua University) |
4652 |
Neural Scene Decoration from a Single Photograph |
Hong Wing Pang (The Hong Kong University of Science and Technology)*; Yingshu Chen (The Hong Kong University of Science and Technology); Phuoc-Hieu T. Le (VinAI Research); Binh-Son Hua (VinAI Research); Thanh Nguyen (Deakin University, Australia); Sai-Kit Yeung (Hong Kong University of Science and Technology) |
4656 |
Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds |
Ayush Jain (Carnegie Mellon University)*; Nikolaos Gkanatsios (Carnegie Mellon University); Ishita Mediratta (Meta AI); Katerina Fragkiadaki (Carnegie Mellon University) |
4658 |
CIRCLE: Convolutional Implicit Reconstruction and Completion for Large-scale Indoor Scene |
Hao-Xiang Chen (Tsinghua University)*; Jiahui Huang (Tsinghua University); Tai-Jiang Mu (Tsinghua University); Shi-Min Hu (Tsinghua University) |
4659 |
Discovering Deformable Keypoint Pyramids |
Jianing Qian (University of Pennsylvania)*; Anastasios Panagopoulos (University of Pennsylvania); Dinesh Jayaraman (University of Pennsylvania) |
4668 |
TIDEE: Tidying Up Novel Rooms using Visuo-Semantic Commonsense Priors |
Gabriel Sarch (Carnegie Mellon University)*; Zhaoyuan Fang (Carnegie Mellon University); Adam Harley (Carnegie Mellon University); Paul Schydlo (Carnegie Mellon University); Michael J Tarr (Carnegie Mellon University); Saurabh Gupta (UIUC); Katerina Fragkiadaki (Carnegie Mellon University) |
4669 |
MOTR: End-to-End Multiple-Object Tracking with TRansformer |
Fangao Zeng (Megvii Technology); Bin Dong (Megvii Technology); Yuang Zhang (Shanghai Jiao Tong University); Tiancai Wang (Megvii Technology)*; Xiangyu Zhang (Megvii Technology); Yichen Wei (Megvii Research Shanghai) |
4672 |
K-centered Patch Sampling for Efficient Video Recognition |
Seong Hyeon Park (KAIST AI)*; Jihoon Tack (KAIST); Byeongho Heo (NAVER AI LAB); Jung-Woo Ha (NAVER CLOVA AI Lab); Jinwoo Shin (KAIST) |
4675 |
Learning Implicit Feature Alignment Function for Semantic Segmentation |
Hanzhe Hu (Peking University)*; Yinbo Chen (UC San Diego); Jiarui Xu (University of California San Diego); Shubhankar Borse (Qualcomm AI Research); Hong Cai (Qualcomm AI Research); Fatih Porikli (Qualcomm AI Research); Xiaolong Wang (UCSD) |
4677 |
A Visual Navigation Perspective for Category-Level Object Pose Estimation |
Jiaxin Guo (Zhejiang University)*; Yiyi Liao (MPI-IS and University of Tübingen); Zhong Fangxun (CUHK); Rong Xiong (Zhejiang University); Yunhui Liu (CUHK); Yue Wang (Zhejiang University) |
4681 |
ScaleNet: Searching for the Model to Scale |
Jiyang Xie (Huawei Noah’s Ark Lab); Xiu Su (University of Sydney); Shan You (SenseTime); Zhanyu Ma (Beijing University of Posts and Telecommunications)*; Fei Wang (University of Science and Technology of China); Chen Qian (SenseTime) |
4684 |
Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels |
Ganlong Zhao (The University of Hong Kong); Guanbin Li (Sun Yat-sen University)*; Yipeng Qin (Cardiff University); Feng Liu (Deepwise AI Lab); Yizhou Yu (The University of Hong Kong) |
4685 |
GALA: Toward Geometry-and-Lighting-Aware Object Search for Compositing |
Sijie Zhu (University of Central Florida)*; Zhe Lin (Adobe Research); Scott Cohen (Adobe Research); Jason Kuen (Adobe Research); Zhifei Zhang (Adobe Research); Chen Chen (University of Central Florida) |
4688 |
FairGRAPE: Fairness-aware GRAdient Pruning mEthod for Face Attribute Classification |
Xiaofeng Lin (University of California – Los Angeles); Seungbae Kim (University of South Florida); Jungseock Joo (University of California Los Angeles)* |
4697 |
Tackling Background Distraction in Video Object Segmentation |
Suhwan Cho (Yonsei University)*; Heansung Lee (Yonsei University); Minhyeok Lee (Yonsei University); Chaewon Park (Yonsei University); Sungjun Jang (Yonsei University); Minjung Kim (Yonsei University); Sangyoun Lee (Yonsei University) |
4700 |
Hyperspherical Learning in Multi-Label Classification |
Bo Ke (Tencent Youtu Lab)*; Yunquan Zhu (Tencent YouTu Lab); Mengtian Li (East China Normal University); Xiujun Shu (Tencent Youtu Lab); Ruizhi Qiao (Tencent Youtu Lab); Bo Ren (Tencent) |
4705 |
The Surprisingly Straightforward Scene Text Removal Method With Gated Attention and Region of Interest Generation: A Comprehensive Prominent Model Analysis |
Hyeonsu Lee (Naver Corporation)*; Chankyu Choi (Naver Corporation) |
4708 |
FingerprintNet: Synthesized Fingerprints for Generated Image Detection |
Yonghyun Jeong (NAVER CLOVA)*; Doyeon Kim (Line+); Youngmin Ro (Samsung SDS); Pyounggeon Kim (SDS); Jongwon Choi (Chung-Ang University) |
4715 |
ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild |
Wang Zhao (Tsinghua University)*; Shaohui Liu (ETH Zurich); Hengkai Guo (ByteDance AI Lab); Wenping Wang (The University of Hong Kong); Yong-Jin Liu (Tsinghua University) |
4721 |
Free-Viewpoint RGB-D Human Performance Capture and Rendering |
Phong Ha Nguyen (University of Oulu)*; Nikolaos Sarafianos (Facebook Reality Labs); Christoph Lassner (Meta Reality Labs Research); Janne Heikkila (University of Oulu, Finland); Tony Tung (Facebook) |
4727 |
When Active Learning Meets Implicit Semantic Data Augmentation |
Zhuangzhuang Chen (Shenzhen University); Jin Zhang (Shenzhen University); Pan Wang (Shenzhen University); Jie Chen (Shenzhen University); Jianqiang Li (Shenzhen University)* |
4733 |
Multiview Regenerative Morphing with Dual Flows |
Chih-Jung Tsai (National Tsing Hua University); Cheng Sun (National Tsing Hua University); Hwann-Tzong Chen (National Tsing Hua University)* |
4734 |
Frequency and Spatial Dual Guidance for Image Dehazing |
Hu Yu (University of Science and Technology of China); Naishan Zheng (University of Science and Technology of China); Man Zhou (University of Science and Technology of China); Jie Huang (University of Science and Technology of China); Zeyu Xiao (University of Science and Technology of China); Feng Zhao (University of Science and Technology of China)* |
4736 |
The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing |
Dawit Mureja Argaw (KAIST)*; Fabian Caba (Adobe Research); Joon-Young Lee (Adobe Research); Markus Woodson (Adobe); In So Kweon (KAIST) |
4739 |
Hallucinating Pose-Compatible Scenes |
Tim Brooks (UC Berkeley)*; Alexei A Efros (UC Berkeley) |
4748 |
Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection |
Hang Ye (Peking University); Wentao Zhu (Peking University)*; Chunyu Wang (Microsoft Research Asia); Rujie Wu (Peking University); Yizhou Wang (PKU) |
4754 |
Video Interpolation by Event-driven Anisotropic Adjustment of Optical Flow |
Song Wu (Huawei Technologies Co., Ltd.); Kaichao You (Tsinghua Univ); Weihua He (Tsinghua University)*; Chen Yang (Peking University); Yang Tian (Tsinghua University); Yaoyuan Wang (Huawei Technologies Co., Ltd.); Jianxing Liao (HUAWEI TECHNOLOGIES CO.LTD); Ziyang Zhang (HUAWEI TECHNOLOGIES CO.LTD) |
4761 |
Motion and Appearance Adaptation for Cross-Domain Motion Transfer |
Borun Xu (University of Electronic Science and Technology of China)*; Biao Wang (Alibaba Group); Jinhong Deng (University of Electronic Science and Technology of China); Jiale Tao (University of Electronic Science and Technology of China); Tiezheng Ge (Alibaba Group); Yuning Jiang (Alibaba Group); Wen Li (University of Electronic Science and Technology of China); Lixin Duan (University of Electronic Science and Technology of China) |
4762 |
AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets |
Zhijun Tu (Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University)*; Xinghao Chen (Huawei Noah’s Ark Lab); Pengju Ren (Institute of Artificial Intelligence at Xi’an Jiaotong University); Yunhe Wang (Huawei Technologies) |
4781 |
Social-Implicit: Rethinking Trajectory Prediction Evaluation and The Effectiveness of Implicit Maximum Likelihood Estimation |
Abduallah A Mohamed (Meta)*; Deyao Zhu (King Abdullah University of Science and Technology); Warren Vu (The University of Texas at Austin); Mohamed Elhoseiny (KAUST); Christian Claudel (The University of Texas at Austin) |
4788 |
A Generalized & Robust Framework For Timestamp Supervision in Temporal Action Segmentation |
Rahul Rahaman (National University of Singapore)*; Dipika Singhania (National University of Singapore); Alex Thiery (National University of Singapore); Angela Yao (National University of Singapore) |
4790 |
A Deep Moving-camera Background Model |
Guy Erez (Ben Gurion University)*; Ron A Shapira Weber (Ben-Gurion University); Oren Freifeld (Ben-Gurion University) |
4800 |
DLME: Deep Local-flatness Manifold Embedding |
Zelin Zang (Zhejiang University & Westlake University)*; Siyuan Li (Westlake University); Di Wu (Westlake University); Ge Wang (Westlake University); Kai Wang (National University of Singapore); Lei Shang (Alibaba Group); Baigui Sun (Alibaba Group); Hao Li (Alibaba Group); Stan Z. Li (Westlake University) |
4802 |
Neural Video Compression using GANs for Detail Synthesis and Propagation |
Fabian Mentzer (Google)*; Eirikur Agustsson (Google); Johannes Ballé (Google); David Minnen (Google Inc.); Nick Johnston (Google); George Toderici (Google Research) |
4804 |
Few-shot Action Recognition with Hierarchical Matching and Contrastive Learning |
Sipeng Zheng (Renmin University of China)*; Shizhe Chen (INRIA); Qin Jin (Renmin University of China) |
4807 |
Perspective Flow Aggregation for Data-Limited 6D Object Pose Estimation |
Yinlin Hu (EPFL)*; Pascal Fua (EPFL, Switzerland); Mathieu Salzmann (EPFL) |
4820 |
TALISMAN: Targeted Active Learning for Object Detection with Rare Classes and Slices using Submodular Mutual Information |
Suraj Kothawade (UT Dallas)*; Saikat Ghosh (University of Texas at Dallas); Sumit Shekhar (Adobe Research); Yu Xiang (The University of Texas at Dallas); Rishabh Iyer (University of Texas at Dallas) |
4826 |
New Datasets and Models for Contextual Reasoning in Visual Dialog |
Yifeng Zhang (University of Minnesota, Twin Cities); Ming Jiang (University of Minnesota); Qi Zhao (University of Minnesota)* |
4828 |
Remote Respiration Monitoring of Moving Person Using Radio Signals |
Jae-Ho Choi (Pohang University of Science and Technology)*; KIBONG KANG (POSTECH); Kyung-Tae Kim (Pohang University of Science and Technology) |
4832 |
AdvDO: Realistic Adversarial Attacks for Trajectory Prediction |
Yulong Cao (University of Michigan, Ann Arbor)*; Chaowei Xiao (NVIDIA); Anima Anandkumar (NVIDIA/Caltech); Danfei Xu (Stanford University); Marco Pavone (Stanford University) |
4836 |
Cross-Modality Transformer for Visible-Infrared Person Re-Identification |
Kongzhu Jiang (University of Science and Technology of China)*; Tianzhu Zhang (University of Science and Technology of China); Xiang Liu (Dongguan University of Technology); Bingqiao Qian (University of Science and Technology of China); Yongdong Zhang (University of Science and Technology of China); Feng Wu (University of Science and Technology of China) |
4849 |
VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition |
Changyao Tian (Chinese University of Hong Kong); Wenhai Wang (Nanjing University); Xizhou Zhu (SenseTime); Jifeng Dai (SenseTime)*; Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences) |
4857 |
Self-Supervised Classification Network |
Elad Amrani (IBM / Technion)*; Leonid Karlinsky (IBM-Research); Alex Bronstein (Technion) |
4865 |
DevNet: Self-supervised Monocular Depth Learning via Density Volume Construction |
Kaichen Zhou (University of Oxford)*; Lanqing Hong (Huawei Noah’s Ark Lab); Changhao Chen (National University of Defense Technology); Hang Xu (Huawei Noah’s Ark Lab); Chaoqiang Ye (Huawei); Qingyong Hu (University of Oxford); Zhenguo Li (Huawei Noah’s Ark Lab) |
4872 |
Bayesian Optimization with Clustering and Rollback for CNN Auto Pruning |
Hanwei FAN (HKUST)*; Jiandong MU (HKUST); Wei Zhang (Hong Kong University of Science and Technology) |
4873 |
Towards Real-World HDRTV Reconstruction: A Data Synthesis-based Approach |
Zhen Cheng (University of Science and Technology of China)*; Tao Wang (Huawei Noah’s Ark Lab); Yong Li (Huawei Noah’s Ark Lab); Fenglong Song (Huawei Noah’s Ark Lab); Chang Chen (Huawei Noah’s Ark Lab); Zhiwei Xiong (University of Science and Technology of China) |
4874 |
Quantum Motion Segmentation |
Federica Arrigoni (University of Trento)*; Willi Menapace (University of Trento); Marcel Seelbach Benkner (University of Siegen); Elisa Ricci (University of Trento); Vladislav Golyanik (MPI for Informatics) |
4879 |
Open-world Semantic Segmentation via Contrasting and Clustering Vision-language Embedding |
Quande Liu (The Chinese University of Hong Kong)*; Youpeng Wen (Dalian University of Technology); Jianhua Han (Huawei Noah’s Ark Lab); Chunjing Xu (Huawei Noah’s Ark Lab); Hang Xu (Huawei Noah’s Ark Lab); Xiaodan Liang (Sun Yat-sen University) |
4880 |
Custom Structure Preservation in Face Aging |
Guillermo Gomez-Trenado (University of Granada)*; Stéphane Lathuilière (Telecom-Paris); Pablo Mesejo (University of Granada); Oscar Cordón García (University of Granada) |
4883 |
DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks |
Shih-Yang Su (University of British Columbia)*; Timur Bagautdinov (Facebook); Helge Rhodin (UBC) |
4888 |
Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization |
Jiaxin Qi (Nanyang Technological University)*; Kaihua Tang (Nanyang Technological University); Qianru Sun (Singapore Management University); Xian-Sheng Hua (Damo Academy, Alibaba Group); Hanwang Zhang (Nanyang Technological University) |
4891 |
Spatio-Temporal Deformable Attention Network for Video Deblurring |
Huicong Zhang (Harbin Institute of Technology)*; Haozhe Xie (Tencent AI Lab); Hongxun Yao (Harbin Institute of Technology) |
4894 |
CHORE: Contact, Human and Object REconstruction from a single RGB image |
Xianghui Xie (Saarland University)*; Bharat Lal Bhatnagar (University of Tübingen, MPI informatik); Gerard Pons-Moll (University of Tübingen) |
4899 |
Complementing Brightness Constancy with Deep Networks for Optical Flow Prediction |
Vincent LE GUEN (EDF R&D, CNAM)*; Clément Rambour (Cnam); Nicolas Thome (CNAM, Paris) |
4902 |
Learning Discriminative Shrinkage Deep Networks for Image Deconvolution |
Pin-Hung Kuo (National Taiwan University)*; Jinshan Pan (Nanjing University of Science and Technology); Shao-Yi Chien (National Taiwan University); Ming-Hsuan Yang (University of California at Merced) |
4904 |
Camera Pose Estimation and Localization with Active Audio Sensing |
Karren D Yang (MIT); Michael Firman (Niantic); Eric Brachmann (Niantic)*; Clement LJC Godard (Niantic) |
4906 |
Learning Efficient Multi-Agent Cooperative Visual Exploration |
Chao Yu (Tsinghua University); Xinyi Yang (Tsinghua University)*; Jiaxuan Gao (Tsinghua University); Huazhong Yang (Tsinghua University); Yu Wang (Tsinghua University); Yi Wu (Tsinghua University) |
4908 |
4DContrast: Contrastive Learning with Dynamic Correspondences for 3D Scene Understanding |
Yujin Chen (Technical University of Munich)*; Matthias Niessner (Technical University of Munich); Angela Dai (Technical University of Munich) |
4918 |
Learned Vertex Descent: A New Direction for 3D Human Model Fitting |
Enric Corona (IRI)*; Gerard Pons-Moll (University of Tübingen); Guillem Alenyà (IRI); Francesc Moreno (IRI) |
4921 |
Hierarchical Semi-Supervised Contrastive Learning for Contamination-Resistant Anomaly Detection |
Gaoang Wang (Zhejiang University); Yibing Zhan (JD Explore Academy); Xinchao Wang (National University of Singapore); Mingli Song (Zhejiang University)*; Klara Nahrstedt (University of Illinois at Urbana-Champaign) |
4927 |
Learning to Fit Morphable Models |
Vasileios Choutas (ETH Zurich)*; Federica Bogo (Meta); Jingjing Shen (Microsoft); Julien Valentin (Microsoft) |
4929 |
Few-Shot Classification with Contrastive Learning |
Zhanyuan Yang (Shenzhen University); Jinghua Wang (Harbin Institute of Technology); Yingying Zhu (Shenzhen University)* |
4931 |
ARM: Any-Time Super-Resolution Method |
Bohong Chen (Xiamen University)*; Mingbao Lin (Xiamen University, China); Kekai Sheng (Youtu Lab, Tencent Inc.); Mengdan Zhang (Youtu, Tencent); Peixian Chen (Youtu Tencent); Ke Li (Tencent); Liujuan Cao (Xiamen University); Rongrong Ji (Xiamen University, China) |
4933 |
Tracking Every Thing in the Wild |
Siyuan Li (ETH Zurich)*; Martin Danelljan (ETH Zurich); Henghui Ding (ETH Zurich); Thomas E Huang (ETH Zürich); Fisher Yu (ETH Zurich) |
4934 |
Learning Self-prior for Mesh Denoising using Dual Graph Convolutional Networks |
Shota Hattori (The University of Tokyo)*; Tatsuya Yatagawa (The University of Tokyo); Yutaka Ohtake (The University of Tokyo); Suzuki Hiromasa (The University of Tokyo) |
4940 |
Few Zero Level Set-Shot Learning of Shape Signed Distance Functions in Feature Space |
Amine Ouasfi (IMT Atlantique); Adnane Boukhayma (Inria)* |
4948 |
Attention-aware Learning for Hyperparameters Prediction in Image Processing Pipelines |
Haina Qin (University of Chinese Academy of Sciences); Longfei Han (Beijing Technology and Business University); Juan Wang (Institute of Automation, Chinese Academy of Sciences); Congxuan Zhang (Nanchang Hangkong University); Bing Li (National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences)*; Weiming Hu (Institute of Automation, Chinese Academy of Sciences); Yanwei Li (Zeku Technology (Shanghai) Corp., Ltd.) |
4950 |
Attaining Class-level Forgetting in Pretrained Model using Few Samples |
Pravendra Singh (IIT Roorkee); Pratik Mazumder (Indian Institute of Technology Jodhpur)*; Mohammed Asad Karim (Carnegie Mellon University) |
4951 |
Data Invariants to Understand Unsupervised Out-of-Distribution Detection |
Lars Doorenbos (University of Bern)*; Raphael Sznitman (University of Bern); Pablo Márquez Neila (University of Bern) |
4953 |
STEEX: Steering Counterfactual Explanations with Semantics |
Paul Jacob (École Polytechnique); Eloi Zablocki (Valeo.ai)*; Hedi Ben-younes (Valeo AI); Mickael Chen (valeo.ai); Patrick Pérez (Valeo.ai); Matthieu Cord (Sorbonne University) |
4958 |
Outpainting by Queries |
Kai Yao (Xi’an Jiaotong-Liverpool University); Penglei Gao (Xi’an Jiaotong-Liverpool University); Xi Yang (Xi’an Jiaotong-Liverpool University); Jie Sun (Xi’an Jiaotong-Liverpool University); Rui Zhang (Xi’an Jiaotong-Liverpool University); Kaizhu Huang (Duke Kunshan University)* |
4961 |
HULC: 3D HUman Motion Capture with Pose Manifold SampLing and Dense Contact Guidance |
Soshi Shimada (MPI for Informatics)*; Vladislav Golyanik (MPI for Informatics); Zhi Li (Max Planck Institute for Informatics); Patrick Pérez (Valeo.ai); Weipeng Xu (Reality Labs Research); Christian Theobalt (MPI Informatik) |
4962 |
Interpretable Open-Set Domain Adaptation via Angular Margin Separation |
Xinhao Li (University of Electronic Science and Technology of China); Jingjing Li (University of Electronic Science and Technology of China)*; Zhekai Du (University of Electronic Science and Technology of China); Lei Zhu (Shandong Normal University); Wen Li (University of Electronic Science and Technology of China) |
4963 |
EgoBody: Human Body Shape and Motion of Interacting People from Head-Mounted Devices |
Siwei Zhang (ETH Zurich)*; Qianli Ma (Max Planck Institute for Intelligent Systems); Yan Zhang (ETH Zurich); Zhiyin Qian (ETH Zürich); Taein Kwon (ETH Zurich); Marc Pollefeys (ETH Zurich / Microsoft); Federica Bogo (Meta); Siyu Tang (ETH Zurich) |
4966 |
ViTAS: Vision Transformer Architecture Search |
Xiu Su (University of Sydney); Shan You (SenseTime)*; Jiyang Xie (Huawei Noah’s Ark Lab); Mingkai Zheng (The University of Sydney); Fei Wang (University of Science and Technology of China); Chen Qian (SenseTime); Changshui Zhang (Tsinghua University); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Chang Xu (University of Sydney) |
4970 |
LaLaLoc++: Global Floor Plan Comprehension for Layout Localisation in Unvisited Environments |
Henry Howard-Jenkins (University of Oxford)*; Victor Adrian Prisacariu (University of Oxford) |
4972 |
diffConv: Analyzing Irregular Point Clouds with an Irregular View |
Manxi Lin (Technical University of Denmark)*; Aasa Feragen (Technical University of Denmark) |
4975 |
ReAct: Temporal Action Detection with Relational Action Queries |
Dingfeng Shi (Beihang University)*; Yujie Zhong (University of Oxford); Qiong Cao (JD.com); Jing Zhang (The University of Sydney); Lin Ma (Meituan); Jia Li (Beihang University); Dacheng Tao (JD.com) |
4976 |
StyleBabel: Artistic Style Tagging and Captioning |
Dan Ruta (University of Surrey)*; Andrew Gilbert (University of Surrey); Pranav V Aggarwal (Adobe Inc.); Naveen Marri (Adobe Inc); Ajinkya Kale (Adobe); Jo Briggs (University of Northumbria); Chris Speed (University of Edinburgh); Hailin Jin (Adobe Research); Baldo Faieta (Adobe); Alex Filipkowski (Adobe); Zhe Lin (Adobe Research); John Collomosse (Adobe Research) |
4977 |
TACS: Taxonomy Adaptive Cross-Domain Semantic Segmentation |
RUI GONG (ETH Zurich)*; Martin Danelljan (ETH Zurich); Dengxin Dai (ETH Zurich); Danda Pani Paudel (ETH Zürich); Ajad Chhatkuli (ETH Zurich); Fisher Yu (ETH Zurich); Luc Van Gool (ETH Zurich) |
4983 |
Domain Invariant Autoencoders for Self-supervised Learning from Multi-domains |
Haiyang Yang (Nanjing University)*; Shixiang Tang (The University of Sydney); Meilin Chen (Zhejiang University); Yizhou Wang (Zhejiang University); Feng Zhu (University of Science and Technology of China); Lei Bai (Shanghai AI Laboratory); Rui Zhao (SenseTime Group Limited); Wanli Ouyang (The University of Sydney) |
4987 |
Learned Variational Video Color Propagation |
Markus Hofinger (Graz University of Technology)*; Erich Kobler (University Hospital Bonn); Alexander Effland (University of Bonn); Thomas Pock (Graz University of Technology) |
4988 |
PD-Flow: A Point Cloud Denoising Framework with Normalizing Flows |
Aihua Mao (South China University of Technology)*; Zihui Du (South China University of Technology); Yu-Hui Wen (Tsinghua University); Jun Xuan (South China University of Technology); Yong-Jin Liu (Tsinghua University) |
4992 |
Prototypical Contrast Adaptation for Domain Adaptive Semantic Segmentation |
ZhengKai Jiang (Tencent Youtu Lab)*; Yuxi Li (Tencent); Ceyuan Yang (Chinese University of Hong Kong); Peng Gao (Chinese University of Hong Kong); Yabiao Wang (Tencent); Ying Tai (Tencent YouTu); Chengjie Wang (Tencent; Shanghai Jiao Tong University) |
4996 |
Adversarial Contrastive Learning via Asymmetric InfoNCE |
Qiying Yu (Tsinghua University)*; Jieming Lou (Harbin Institute of Technology); Xianyuan Zhan (Tsinghua University); Qizhang Li (Harbin Institute of Technology); Wangmeng Zuo (Harbin Institute of Technology, China); Yang Liu (Tsinghua University); Jingjing Liu (Tsinghua University) |
4998 |
NeRF for Outdoor Scene Relighting |
Viktor Rudnev (Max Planck Institute for Informatics)*; Mohamed Elgharib (Max Planck Institute for Informatics); William Smith (University of York); Lingjie Liu (Max Planck Institute for Informatics); Vladislav Golyanik (MPI for Informatics); Christian Theobalt (MPI Informatik) |
5001 |
FusionVAE: A Deep Hierarchical Variational Autoencoder for RGB Image Fusion |
Fabian Duffhauss (Bosch Center for Artificial Intelligence)*; Vien Anh Ngo (Bosch Center for Artificial Intelligence); Hanna Ziesche (Bosch Center for AI); Gerhard Neumann (Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany) |
5007 |
Self-calibrating Photometric Stereo by Neural Inverse Rendering |
Junxuan Li (Australian National University)*; HONGDONG LI (Australian National University, Australia) |
5009 |
Time-rEversed diffusioN tEnsor Transformer: A new TENET of Few-Shot Object Detection |
Shan Zhang (Australian National University); Naila Murray (Naver Labs); Lei Wang (University of Wollongong, Australia); Piotr Koniusz (ANU College of Engineering and Computer Science)* |
5017 |
Detecting Generated Images by Real Images |
Bo Liu (Chongqing University of Posts and Telecommunications); Fan Yang (Chongqing University of Posts and Telecommunications); Xiuli Bi (Chongqing University of Posts and Telecommunications); Bin Xiao (Chongqing University of Posts and Telecommunications)*; Weisheng Li (Chongqing University of Posts and Telecommunications); Xinbo Gao (Chongqing University of Posts and Telecommunications) |
5018 |
VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection |
Joanna Hong (KAIST)*; Minsu Kim (KAIST); Yong Man Ro (KAIST) |
5020 |
Delta Distillation for Efficient Video Processing |
Amirhossein Habibian (Qualcomm AI Research)*; Haitam Ben Yahia (Qualcomm AI Research); Davide Abati (Qualcomm AI Research); Efstratios Gavves (University of Amsterdam); Fatih Porikli (Qualcomm AI Research) |
5026 |
PANDORA: A Panoramic Detection Dataset for Object with Orientation |
Hang Xu (Hangzhou Dianzi University; The Institute of Computing Technology of the Chinese Academy of Sciences); Qiang Zhao (The Institute of Computing Technology of the Chinese Academy of Sciences); Yike Ma (Institute of Computing Technology, Chinese Academy of Sciences); Xiaodong Li (Huawei Noah’s Ark Lab); Peng Yuan (Huawei Noah’s Ark Lab); Bailan Feng (Huawei Noah’s Ark Lab); Chenggang Yan (Hangzhou Dianzi University); Feng Dai (Institute of Computing Technology, Chinese Academy of Sciences)* |
5032 |
Instance As Identity: A Generic Online Paradigm for Video Instance Segmentation |
Feng Zhu (University of Technology Sydney)*; Zongxin Yang (Zhejiang University); Xin Yu (University of Technology Sydney); Yi Yang (Zhejiang University); Yunchao Wei (UTS) |
5034 |
Audio-Visual Mismatch-Aware Video Retrieval via Association and Adjustment |
Sangmin Lee (KAIST)*; Sungjune Park (KAIST); Yong Man Ro (KAIST) |
5036 |
3D Clothed Human Reconstruction in the Wild |
Gyeongsik Moon (Seoul National University); Hyeongjin Nam (Seoul National University); Takaaki Shiratori (Meta Reality Labs Research); Kyoung Mu Lee (Seoul National University)* |
5040 |
Classification-Regression for Chart Comprehension |
Matan Levy (The Hebrew University of Jerusalem)*; Rami Ben-Ari (OriginAI); Dani Lischinski (The Hebrew University of Jerusalem) |
5042 |
Zero-Shot Category-Level Object Pose Estimation |
Walter Goodwin (University of Oxford)*; Sagar Vaze (Visual Geometry Group, University of Oxford); Ioannis Havoutis (Oxford Robotics Institute, University of Oxford); Ingmar Posner (Oxford University) |
5044 |
AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant |
Benita Wong (National University of Singapore)*; Joya Chen (National University of Singapore); You Wu (Harvard University); Stan Weixian Lei (National University of Singapore); Dongxing Mao (National University of Singapore); Difei Gao (NUS); Mike Zheng Shou (National University of Singapore) |
5047 |
Laplace Mesh Transformer: Dual Attention and Topology Aware Network for 3D mesh Classification and Segmentation |
Xiao-Juan Li (Institute of Computing Technology, Chinese Academy of Sciences); Jie Yang (Institute of Computing Technology, Chinese Academy of Sciences)*; Fang-Lue Zhang (Victoria University of Wellington) |
5048 |
CoMER: Modeling Coverage for Transformer-based Handwritten Mathematical Expression Recognition |
Wenqi Zhao (Peking University)*; Liangcai Gao (Peking University) |
5049 |
RBC: Rectifying the Biased Context in Continual Semantic Segmentation |
Hanbin Zhao (Zhejiang University)*; Fengyu Yang (University of Michigan); Xinghe Fu (Zhejiang University); Xi Li (Zhejiang University) |
5051 |
Don’t Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context |
Chongyu Liu (South China University of Technology); Lianwen Jin (South China University of Technology)*; Yuliang Liu (Huazhong University of Science and Technology); Canjie Luo (South China University of Technology); Bangdong Chen (South China University of Technology); Fengjun Guo (IntSig Information Co. Ltd); Kai Ding (IntSig Information Co., Ltd) |
5066 |
Semi-Supervised Keypoint Detector and Descriptor for Retinal Image Matching |
Jiazhen Liu (Renmin University of China); Xirong Li (Renmin University of China)*; Qijie Wei (Vistel Inc.); Jie Xu (Beijing Tongren Hospital); Dayong Ding (Vistel Inc.) |
5069 |
Memory-Augmented Model-Driven Network for Pansharpening |
Keyu Yan (Hefei Institutes of Physical Science, Chinese Academy of Sciences)*; Man Zhou (Chinese Academy of Sciences); Li Zhang (Chinese Academy of Sciences); Chengjun Xie (Institute of Intelligent Machines, Chinese Academy of Sciences, China) |
5076 |
Factorizing Knowledge in Neural Networks |
Xingyi Yang (National University of Singapore)*; Jingwen Ye (National University of Singapore); Xinchao Wang (National University of Singapore) |
5081 |
Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes |
Sam Bond-Taylor (Durham University)*; Peter Hessey (Durham University); Hiroshi Sasaki (Durham University); Toby P Breckon (Durham University); Chris G. Willcocks (Durham University) |
5082 |
Contrastive Vicinal Space for Unsupervised Domain Adaptation |
Jaemin Na (Ajou University)*; Dongyoon Han (NAVER AI Lab); Hyung Jin Chang (University of Birmingham); Wonjun Hwang (Ajou University) |
5083 |
Weight Fixing Networks |
Chris Subia-Waud (University of Southampton)*; Srinandan Dasmahapatra (University of Southampton) |
5088 |
Sim-to-Real 6D Object Pose Estimation via Iterative Self-training for Robotic Bin Picking |
Kai Chen (The Chinese University of Hong Kong); Rui Cao (The Chinese University of Hong Kong); Stephen L James (UC Berkeley); YICHUAN LI (CUHK); Yunhui Liu (CUHK); Pieter Abbeel (UC Berkeley); Qi Dou (The Chinese University of Hong Kong)* |
5092 |
ChunkyGAN: Real Image Inversion via Segments |
Adéla Šubrtová (Czech Technical University); David Futschik (Czech Technical University in Prague, FEE); Jan Čech (Czech Technical University in Prague); Michal Lukáč (Adobe Research); Eli Shechtman (Adobe Research, US); Daniel Sýkora (Czech Technical University in Prague)* |
5099 |
Towards Sequence-Level Training for Visual Tracking |
Minji Kim (Seoul National University)*; Seungkwan Lee (POSTECH); Jungseul Ok (POSTECH); Bohyung Han (Seoul National University); Minsu Cho (POSTECH) |
5111 |
Scale-aware Spatio-temporal Relation Learning for Video Anomaly Detection |
Guoqiu Li (Tsinghua Shenzhen International Graduate School, Tsinghua University)*; Guanxiong Cai (Shenzhen SenseTime Technology Co., Ltd); Xingyu ZENG (SenseTime Group Limited); Rui Zhao (SenseTime Group Limited) |
5114 |
Tracking by Associating Clips |
Sanghyun Woo (KAIST)*; Kwanyong Park (KAIST); Seoung Wug Oh (Adobe Research); In So Kweon (KAIST); Joon-Young Lee (Adobe Research) |
5117 |
An Information Theoretic Approach for Attention-Driven Face Forgery Detection |
Ke Sun (Xiamen University)*; Hong Liu (National Institute of Informatics); Taiping Yao (Tencent YouTu); Xiaoshuai Sun (Xiamen University); Shen Chen (Tencent YouTu Lab); Shouhong Ding (Tencent); Rongrong Ji (Xiamen University, China) |
5118 |
Compound Prototype Matching for Few-shot Action Recognition |
Yifei Huang (The University of Tokyo)*; Lijin Yang (The University of Tokyo); Yoichi Sato (University of Tokyo) |
5119 |
Self-Promoted Supervision for Few-Shot Transformer |
Bowen Dong (Harbin Institute of Technology); Pan Zhou (NUS); Shuicheng Yan (National University of Singapore, Department of Electrical and Computer Engineering); Wangmeng Zuo (Harbin Institute of Technology, China)* |
5122 |
Completely Self-Supervised Crowd Counting via Distribution Matching |
Deepak Babu Sam (Indian Institute of Science)*; Abhinav Agarwalla (Carnegie Mellon University); Jimmy Joseph (Stony Brook University); Vishwanath Sindagi (Johns Hopkins University); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science); Vishal Patel (Johns Hopkins University) |
5123 |
Geodesic-Former: a Geodesic-Guided Few-shot 3D Point Cloud Instance Segmenter |
Tuan Duc Ngo (VinAI Research)*; Khoi Nguyen (VinAI Research) |
5127 |
SeedFormer: Patch Seeds based Point Cloud Completion with Upsample Transformer |
Haoran Zhou (Nanjing University)*; Yun Cao (Tencent); Wenqing Chu (Tencent); Junwei Zhu (Tencent); Tong Lu (Nanjing University); Ying Tai (Tencent YouTu); Chengjie Wang (Tencent; Shanghai Jiao Tong University) |
5129 |
3D-PL: Domain Adaptive Depth Estimation with 3D-aware Pseudo-Labeling |
Yu-Ting Yen (National Chiao Tung University, Phiar Technologies)*; Chia-Ni Lu (National Chiao Tung University); Wei-Chen Chiu (National Chiao Tung University); Yi-Hsuan Tsai (Phiar Technologies) |
5136 |
Towards Accurate Active Camera Localization |
Qihang Fang (Shandong University); Yingda Yin (Peking University); Qingnan Fan (Tencent AI Lab)*; Fei Xia (Google Inc); Siyan Dong (Shandong University); Sheng Wang (3vjia); Jue Wang (Tencent AI Lab); Leonidas Guibas (Stanford University); Baoquan Chen (Peking University) |
5138 |
Few-shot Object Counting and Detection |
Thanh Van Nguyen (VinAI Research)*; Chau Hai Pham (VinAI Research); Khoi Nguyen (VinAI Research); Minh Hoai (Stony Brook University) |
5140 |
RealPatch: A Statistical Matching Framework for Model Patching with Real Samples |
Sara Romiti (University of Sussex)*; Christopher Inskip (University of Sussex); Viktoriia Sharmanska (University of Sussex and Imperial College London); Novi Quadrianto (University of Sussex and Basque Center for Applied Mathematics) |
5144 |
GAN Cocktail: mixing GANs without dataset access |
Omri Avrahami (The Hebrew University of Jerusalem)*; Dani Lischinski (The Hebrew University of Jerusalem); Ohad Fried (IDC Herzliya) |
5156 |
Coarse-To-Fine Incremental Few-Shot Learning |
Xiang Xiang (Huazhong University of Science and Technology)*; Yuwen Tan (Huazhong University of Science and Technology); Qian Wan (Wuhan Research Institute of Posts and Telecommunications); Jing Ma (Huazhong University of Science and Technology); Alan Yuille (Johns Hopkins University); Gregory D. Hager (The Johns Hopkins University) |
5157 |
Learning Unbiased Transferability for Domain Adaptation by Uncertainty Modeling |
Jian Hu (Queen Mary University of London)*; Haowen Zhong (Zhejiang Lab); Fei Yang (Zhejiang Lab); Shaogang Gong (Queen Mary University of London); Guile Wu (Queen Mary University of London); Junchi Yan (Shanghai Jiao Tong University) |
5158 |
Camera Pose Auto-Encoders for Improving Pose Regression |
Yoli Shavit (Faculty of Engineering, Bar Ilan University); Yosi Keller (Bar Ilan University)* |
5160 |
CoGS: Controllable Generation and Search from Sketch and Style |
Cusuh Ham (Georgia Institute of Technology)*; Gemma Canet Tarrés (CVSSP, University of Surrey); Tu Bui (University of Surrey); James Hays (Georgia Institute of Technology, USA); Zhe Lin (Adobe Research); John Collomosse (Adobe Research) |
5172 |
Active Audio-Visual Separation of Dynamic Sound Sources |
Sagnik Majumder (University of Texas at Austin)*; Kristen Grauman (Facebook AI Research & UT Austin) |
5175 |
AU-aware 3D Face Reconstruction through Personalized AU-specific Blendshape Learning |
Chenyi Kuang (Rensselaer Polytechnic Institute)*; Zijun Cui (Rensselaer Polytechnic Institute); Jeffrey Kephart (IBM Research, USA); Qiang Ji (Rensselaer Polytechnic Institute) |
5180 |
Directed Ray Distance Functions for 3D Scene Reconstruction |
Nilesh Kulkarni (University of Michigan)*; Justin Johnson (University of Michigan); David Fouhey (University of Michigan) |
5189 |
Background-Insensitive Scene Text Recognition with Text Semantic Segmentation |
Liang Zhao (University of South Carolina)*; Zhenyao Wu (University of South Carolina); Xinyi Wu (University of South Carolina); Greg Wilsbacher (University of South Carolina); Song Wang (University of South Carolina) |
5198 |
Geometry-Guided Progressive NeRF for Generalizable and Efficient Neural Human Rendering |
Mingfei Chen (University of Washington)*; Jianfeng Zhang (NUS); Xiangyu Xu (Sea AI Lab); Lijuan Liu (SEA AI LAB); Yujun Cai (Nanyang Technological University); Jiashi Feng (ByteDance); Shuicheng Yan (Sea AI Labs) |
5207 |
MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning |
David Junhao Zhang (National University of Singapore)*; Kunchang Li (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yali Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yunpeng Chen (National University of Singapore); Shashwat Chandra (National University of Singapore); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Luoqi Liu (meitu); Mike Zheng Shou (National University of Singapore) |
5211 |
Continual Variational Autoencoder Learning via Online Cooperative Memorization |
Fei Ye (University of York)*; Adrian Bors (University of York) |
5215 |
Semantic Novelty Detection via Relational Reasoning |
Francesco Cappio Borlino (Politecnico di Torino); Silvia Bucci (Italian Institute of Technology)*; Tatiana Tommasi (Politecnico di Torino) |
5217 |
FindIt: Generalized Localization with Natural Language Queries |
Weicheng Kuo (Google)*; Fred Bertsch (Google); Wei Li (GOOGLE INC); AJ Piergiovanni (Google); Mohammad Saffar (Google); Anelia Angelova (Google) |
5224 |
SelectionConv: Convolutional Neural Networks for Non-rectilinear Image Data |
David M Hart (Brigham Young University)*; Michael Whitney (Brigham Young University); Bryan S Morse (Brigham Young University) |
5227 |
HairNet: Hairstyle Transfer with Pose Changes |
Peihao Zhu (KAUST)*; Rameen Abdal (KAUST); JOHN C FEMIANI (Miami University); Peter Wonka (KAUST) |
5234 |
Learn2Augment: Learning to Composite Videos for Data Augmentation in Action Recognition |
Shreyank N Gowda (University of Edinburgh)*; Marcus Rohrbach (Facebook AI Research); Frank Keller (University of Edinburgh); Laura Sevilla-Lara (Facebook) |
5235 |
Action-based Contrastive Learning for Trajectory Prediction |
Marah Halawa (Technische Universität Berlin)*; Olaf Hellwich (Technical University Berlin); Pia Bideau (TU Berlin) |
5240 |
Scaling Open-vocabulary Image Segmentation with Image-level Labels |
Golnaz Ghiasi (Google Brain)*; Xiuye Gu (Google); Yin Cui (Google); Tsung-Yi Lin (Nvidia Research) |
5247 |
Improving Closed and Open-Vocabulary Attribute Prediction using Transformers |
Khoi Pham (University of Maryland, College Park)*; Kushal Kafle (Adobe Research); Zhe Lin (Adobe Research); Zhihong Ding (Adobe Research); Scott Cohen (Adobe Research); Quan Hung Tran (Adobe Research); Abhinav Shrivastava (University of Maryland) |
5251 |
FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context |
Pinaki Nath Chowdhury (University of Surrey)*; Aneeshan Sain (University of Surrey); Ayan Kumar Bhunia (University of Surrey); Tao Xiang (University of Surrey); Yulia Gryaditskaya (University of Surrey); Yi-Zhe Song (University of Surrey) |
5252 |
A Contrastive Objective for Learning Disentangled Representations |
Jonathan Kahana (Hebrew University of Jerusalem)*; Yedid Hoshen (The Hebrew University of Jerusalem) |
5256 |
Unbiased Multi-Modality Guidance for Image Inpainting |
Yongsheng YU (University of Chinese Academy of Sciences); Dawei Du (Kitware, Inc.)*; Libo Zhang (Institute of Software Chinese Academy of Sciences); Tiejian Luo (University of Chinese Academy of Sciences) |
5257 |
Learned Monocular Depth Priors in Visual-Inertial Initialization |
Yunwen Zhou (Google)*; Abhishek Kar (Google); Eric L Turner (GOOGLE LLC); Adarsh Kowdle (Google); Chao Guo (Google Inc.); Ryan DuToit (Google); Konstantine Tsotsos (Google) |
5261 |
DexMV: Imitation Learning for Dexterous Manipulation from Human Videos |
Yuzhe Qin (University of California San Diego)*; Yueh-Hua Wu (UCSD); Shaowei Liu (UIUC); Hanwen Jiang (UT Austin); Ruihan Yang (UC San Diego); Yang Fu (UCSD); Xiaolong Wang (UCSD) |
5265 |
Exploring Fine-grained Audiovisual Categorization with the SSW60 Dataset |
Grant Van Horn (Cornell University)*; Rui Qian (Cornell University); Kimberly Wilber (Google); Hartwig Adam (Google); Oisin Mac Aodha (University of Edinburgh); Serge Belongie (University of Copenhagen) |
5266 |
Radatron: Accurate Detection Using Multi-Resolution Cascaded MIMO Radar |
Sohrab Madani (UIUC)*; Junfeng Guan (UIUC); Waleed Ahmed (UIUC); Saurabh Gupta (UIUC); Haitham Hassanieh (UIUC) |
5270 |
COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality |
Honglu Zhou (Rutgers University)*; Asim Kadav (NEC Labs); Aviv Shamsian (Bar Ilan University); Shijie Geng (Rutgers University); Farley Lai (NEC Laboratories America, Inc.); Long Zhao (Google Research); Ting Liu (Google Research); Mubbasir Kapadia (Rutgers University); Hans Peter Graf (NEC Labs) |
5272 |
The Fish Counting Dataset: A Benchmark for Multiple Object Tracking and Counting |
Justin Kay (Caltech, Ai.Fish); Peter Kulits (Caltech); Suzanne C Stathatos (Caltech); Siqi Deng (Amazon); Erik Young (Trout Unlimited); Sara M Beery (Caltech); Grant Van Horn (Cornell University)*; Pietro Perona (California Institute of Technology) |
5287 |
Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation From Monocular RGB Image |
Zhaoxin Fan (Renmin University of China)*; Zhenbo Song (Nanjing University of Science and Technology); Jian Xu (Nreal); Zhicheng Wang (Nreal); Kejian Wu (Nreal); Hongyan Liu (Tsinghua University); Jun He (Renmin University of China) |
5293 |
DeepMend: Learning Occupancy Functions to Represent Shape for Repair |
Nikolas Lamb (Clarkson University)*; Sean Banerjee (Clarkson University); Natasha Kholgade Banerjee (Clarkson University) |
5297 |
Graph Neural Network for Cell Tracking in Microscopy Videos |
Tal Ben-Haim (School of Electrical and Computer Engineering, Ben-Gurion University)*; Tammy Riklin Raviv (BGU) |
5299 |
Anti-Neuron Watermarking: Protecting Personal Data Against Unauthorized Neural Networks |
Zihang Zou (University of Central Florida)*; Boqing Gong (Google); Liqiang Wang (University of Central Florida) |
5310 |
PACS: A Dataset for Physical Audiovisual Commonsense Reasoning |
Samuel Yu (Carnegie Mellon University)*; Peter Wu (UC Berkeley); Paul Pu Liang (Carnegie Mellon University); Ruslan Salakhutdinov (Carnegie Mellon University); Louis-Philippe Morency (Carnegie Mellon University) |
5315 |
Intelli-Paint: Towards Developing More Human-Intelligible Painting Agents |
Jaskirat Singh (Australian National University)*; Cameron Y Smith (Adobe Research); Jose Echevarria (Adobe System Inc.); Liang Zheng (Australian National University) |
5317 |
Rethinking Few-Shot Object Detection on A Multi-Domain Benchmark |
Kibok Lee (Yonsei University); Hao Yang (Amazon)*; Satyaki Chakraborty (Amazon ); Zhaowei Cai (Amazon); Gurumurthy Swaminathan (Amazon); Avinash Ravichandran (Amazon); Onkar Dabeer (Amazon) |
5318 |
LidarNAS: Unifying and Searching Neural Architectures for 3D Point Clouds |
Chenxi Liu (Waymo)*; Zhaoqi Leng (Waymo); Pei Sun (Waymo); Shuyang Cheng (Waymo LLC); Charles R. Qi (Waymo); Yin Zhou (Waymo); Mingxing Tan (Waymo); Dragomir Anguelov (Waymo) |
5325 |
Improving the Intra-class Long-tail in 3D Detection via Rare Example Mining |
Chiyu Jiang (Waymo)*; Mahyar Najibi (Waymo LLC); Charles R. Qi (Waymo); Yin Zhou (Waymo); Dragomir Anguelov (Waymo) |
5326 |
Learning to Learn with Smooth Regularization |
Yuanhao Xiong (UCLA)*; Cho-Jui Hsieh (UCLA) |
5327 |
A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility |
Andrea Burns (Boston University)*; Deniz Arsan (University of Illinois at Urbana Champaign); Sanjna Agrawal (Boston University); Ranjitha Kumar (UIUC: CS); Kate Saenko (Boston University); Bryan Plummer (Boston University) |
5330 |
CoVisPose: Co-Visibility Pose Transformer for Wide-Baseline Relative Pose Estimation in 360 Indoor Panoramas |
Will A Hutchcroft (Zillow Group)*; Yuguang Li (Zillow Group); Ivaylo Boyadzhiev (Zillow Group); Zhiqiang Wan (Zillow); Haiyan Wang (The City College of New York); Sing Bing Kang (Zillow Group) |
5340 |
PT4AL: Using Self-Supervised Pretext Tasks for Active Learning |
John Seon Keun Yi (Georgia Institute of Technology)*; Minseok Seo (si-analytics); Jongchan Park (Lunit); Dong-Geol Choi (Hanbat National University) |
5351 |
Uncertainty Quantification in Depth Estimation via Constrained Ordinal Regression |
Dongting Hu (The University of Melbourne); Liuhua Peng (The University of Melbourne); Tingjin Chu (University of Melbourne); Xiaoxing Zhang (Meituan); Yinian Mao (Meituan-Dianping Group ); Howard Bondell (University of Melbourne); Mingming Gong (University of Melbourne)* |
5361 |
All You Need is RAW: Defending Against Adversarial Attacks with Camera Image Pipelines |
Yuxuan Zhang (Princeton University)*; Bo Dong (Princeton University); Felix Heide (Princeton University) |
5362 |
ParC-Net: Position Aware Circular Convolution with Merits from ConvNets and Transformer |
Haokui Zhang (Lighthouse Co.Ltd)*; Wenze Hu (Lighthouse Co.Ltd); Xiaoyu Wang (The Chinese University of Hong Kong (Shenzhen)) |
5369 |
BézierPalm: A Free lunch for Palmprint Recognition |
KAI ZHAO (UCLA)*; Lei Shen (Tencent); Yingyi Zhang (Tencent); Chuhan Zhou (Tencent & VIA University College); Tao Wang (Tencent YouTu Lab); Ruixin Zhang (Tencent); Shouhong Ding (Tencent); Wei Jia (Hefei University of Technology); Wei Shen (Shanghai Jiao Tong University) |
5372 |
A Repulsive Force Unit for Garment Collision Handling in Neural Networks |
Qingyang Tan (UMD)*; Yi Zhou (Adobe Research); Tuanfeng Wang (Adobe Research); Duygu Ceylan (Adobe Research); Xin Sun (Adobe Research); Dinesh Manocha (University of Maryland at College Park) |
5373 |
CYBORGS: Contrastively Bootstrapping Object Representations by Grounding in Segmentation |
Renhao Wang (Tsinghua University)*; Hang Zhao (Tsinghua University); Yang Gao (Tsinghua University) |
5377 |
Connecting Compression Spaces with Transformer for Approximate Nearest Neighbor Search |
Haokui Zhang (Lighthouse Co.Ltd)*; Buzhou Tang (Harbin Institute of Technology, China); Wenze Hu (Lighthouse Co.Ltd); Xiaoyu Wang (The Chinese University of Hong Kong (Shenzhen)) |
5381 |
Training Vision Transformers with Only 2040 Images |
Yunhao Cao (Nanjing University); Hao Yu (Nanjing University); Jianxin Wu (Nanjing University)* |
5384 |
Black-box Few-shot Knowledge Distillation |
Dang Nguyen (Deakin University)*; Sunil Gupta (Deakin University, Australia); Kien Duc Do (Deakin University); Svetha Venkatesh (Deakin University) |
5388 |
AutoAvatar: Autoregressive Neural Fields for Dynamic Avatar Modeling |
Ziqian Bai (Simon Fraser University)*; Timur Bagautdinov (Facebook); Javier Romero (Facebook); Michael Zollhöfer (Facebook Reality Labs); Ping Tan (Simon Fraser University); Shunsuke Saito (Facebook) |
5392 |
Ghost-free High Dynamic Range Imaging with Context-aware Transformer |
Zhen Liu (Sichuan University; Megvii ); Yinglong Wang (Huawei Noah’s Ark Lab); Bing Zeng (University of Electronic Science and Technology of China); Shuaicheng Liu (UESTC; Megvii)* |
5393 |
Cross-Domain Cross-Set Few-Shot Learning via Learning Compact and Aligned Representations |
Wentao Chen (University of Science and Technology of China)*; Zhang Zhang (Institute of Automation, Chinese Academy of Sciences); Wei Wang (Institute of Automation Chinese Academy of Sciences); Liang Wang (NLPR, China); Zilei Wang (University of Science and Technology of China); Tieniu Tan (NLPR, China) |
5396 |
Motion Transformer for Unsupervised Image Animation |
Jiale Tao (University of Electronic Science and Technology of China)*; Biao Wang (Alibaba Group); Tiezheng Ge (Alibaba Group); Yuning Jiang (Alibaba Group); Wen Li (University of Electronic Science and Technology of China); Lixin Duan (University of Electronic Science and Technology of China) |
5404 |
LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection |
Yi Wei (Tsinghua University)*; Zibu Wei (Tsinghua University); Yongming Rao (Tsinghua University); Jiaxin Li (Gaussian Robotics); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University) |
5405 |
PSS: Progressive Sample Selection for Open-World Visual Representation Learning |
Tianyue Cao (Shanghai Jiao Tong University); Yongxin Wang (Amazon)*; Yifan Xing (AMAZON CORPORATE LLC); Tianjun Xiao (Amazon); Tong He (Amazon); Zheng Zhang (AWS); Hao Zhou (Amazon); Joseph Tighe (Amazon) |
5408 |
Self-slimmed Vision Transformer |
Zhuofan Zong (Beihang University)*; Kunchang Li (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Guanglu Song (Sensetime); Yali Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Biao Leng (Beihang University); Yu Liu (SenseTime Group LTD) |
5410 |
Switchable Online Knowledge Distillation |
Biao Qian (Hefei University of Technology); Yang Wang (Hefei University of Technology)*; Hongzhi Yin (The University of Queensland); Richang Hong (Hefei University of Technology); Meng Wang (Hefei University of Technology) |
5418 |
Adaptive Transformers for Robust Few-shot Cross-domain Face Anti-spoofing |
Hsin-Ping Huang (University of California, Merced)*; Deqing Sun (Google); Yaojie Liu (Google); Wen-Sheng Chu (Google); Taihong Xiao (University of California at Merced); Jinwei Yuan (Google); Hartwig Adam (Google); Ming-Hsuan Yang (University of California at Merced) |
5419 |
GraphFit: Learning Multi-scale Graph-Convolutional Representation for Point Cloud Normal Estimation |
Keqiang Li (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences)*; Mingyang Zhao (University of Chinese Academy of Sciences & Beijing Academy of Artificial Intelligence); Huaiyu Wu (Institute of Automation, Chinese Academy of Sciences); Dong-Ming Yan (NLPR, CASIA); Zhen Shen (Institute of Automation, Chinese Academy of Sciences/Qingdao Academy of Intelligent Industries); Fei-Yue Wang (Institute of Automation, Chinese Academy of Sciences); Gang Xiong (CASIA) |
5424 |
Are Vision Transformers Robust to Patch-wise Perturbations? |
Jindong Gu (University of Munich)*; Volker Tresp (Siemens AG and Ludwig Maximilian University of Munich ); Yao Qin (Google) |
5428 |
DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning |
Zifeng Wang (Northeastern University)*; Zizhao Zhang (Google); Sayna Ebrahimi (Google); Ruoxi Sun (Google); Han Zhang (Google); Chen-Yu Lee (Google); Xiaoqi Ren (Google); Guolong Su (Google); Vincent Perot (Google AI); Jennifer Dy (Northeastern); Tomas Pfister (Google) |
5430 |
EleGANt: Exquisite and Locally Editable GAN for Makeup Transfer |
Chenyu Yang (Tsinghua University)*; Wanrong He (Tsinghua University); Yingqing Xu (Tsinghua University); Yang Gao (Tsinghua University) |
5436 |
Union-set Multi-source Model Adaptation for Semantic Segmentation |
Zongyao Li (Hokkaido University)*; Ren Togo (Hokkaido University); Takahiro Ogawa (Hokkaido University); Miki Haseyama (Hokkaido University) |
5441 |
Bridging Images and Videos: A Simple Learning Framework for Large Vocabulary Video Object Detection |
Sanghyun Woo (KAIST)*; Kwanyong Park (KAIST); Seoung Wug Oh (Adobe Research); In So Kweon (KAIST); Joon-Young Lee (Adobe Research) |
5443 |
TDAM: Top-Down Attention Module for Contextually Guided Feature Selection in CNNs |
Shantanu Jaiswal (Agency for Science, Technology and Research )*; Basura Fernando (Agency for Science, Technology and Research, A*STAR, Singapore); Cheston Tan (Institute for Infocomm Research, Singapore) |
5451 |
Exploring Disentangled Content Information for Face Forgery Detection |
Jiahao Liang (Beijing University of Posts and Telecommunications)*; Huafeng Shi (SenseTime Group Limited); Weihong Deng (Beijing University of Posts and Telecommunications) |
5458 |
Object Discovery via Contrastive Learning for Weakly Supervised Object Detection |
Jinhwan Seo (Pohang University of Science and Technology)*; Wonho Bae (University of British Columbia); Danica J. Sutherland (University of British Columbia); Junhyug Noh (Lawrence Livermore National Laboratory); Daijin Kim (Pohang University of Science and Technology) |
5460 |
Unifying Vision Unsupervised Contrastive Learning from a Graph Perspective |
Shixiang Tang (The University of Sydney)*; Feng Zhu (University of Science and Technology of China); Lei Bai (Shanghai AI Laboratory); Rui Zhao (SenseTime Group Limited); Chenyu Wang (University of Sydney, Sydney Neuroimaging Analysis Centre); Wanli Ouyang (The University of Sydney) |
5463 |
E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context |
Zizhang Li (Zhejiang University)*; Mengmeng Wang (Zhejiang University); Huaijin Pi (Zhejiang University); Kechun Xu (Zhejiang University); Jianbiao Mei (Zhejiang University); Yong Liu (Zhejiang University) |
5478 |
$\ell_\infty$-Robustness and Beyond: Unleashing Efficient Adversarial Training |
Hadi Mohaghegh Dolatabadi (University of Melbourne)*; Sarah Erfani (University of Melbourne); Christopher Leckie (University of Melbourne) |
5481 |
Spatial-Separated Curve Rendering Network for Efficient and High-Resolution Image Harmonization |
Jingtang Liang (University of Macau)*; Xiaodong Cun (Tencent AI Lab); Chi-Man Pun (University of Macau); Jue Wang (Tencent AI Lab) |
5484 |
Point MixSwap: Attentional Point Cloud Mixing via Swapping Matched Structural Divisions |
Ardian Umam (NYCU)*; Cheng-Kun Yang (National Taiwan University); Yung-Yu Chuang (National Taiwan University); Jen-Hui Chuang (National Chiao Tung University ); Yen-Yu Lin (National Yang Ming Chiao Tung University) |
5491 |
One Size Does NOT Fit All: Data-Adaptive Adversarial Training |
Shuo Yang (University of Sydney)*; Chang Xu (University of Sydney) |
5494 |
IS-MVSNet: Importance Sampling-based MVSNet |
Likang Wang (HKUST)*; Yue Gong (Huawei Technologies Co., Ltd.); Xinjun Ma (Huawei); Qirui Wang (Huawei Technologies Co., Ltd.); Kaixuan Zhou (Huawei ); Lei Chen (Hong Kong University of Science and Technology) |
5496 |
Multi-Granularity Pruning for Model Acceleration on Mobile Devices |
Tianli Zhao (Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences); Xi Sheryl Zhang (Institute of Automation, Chinese Academy of Sciences); Wentao Zhu (Amazon); Jiaxing Wang (Institute of Automation, Chinese Academy of Sciences); Sen Yang (Kuaishou); Ji Liu (Kwai Inc.); Jian Cheng (Chinese Academy of Sciences, China)* |
5500 |
Style-Agnostic Reinforcement Learning |
Juyong Lee (POSTECH); Seokjun Ahn (POSTECH); Jaesik Park (POSTECH)* |
5504 |
Editing Out-of-domain GAN Inversion via Differential Activations |
Haorui Song (South China University of Technology); Yong Du (Ocean University of China); Tianyi Xiang (South China University of Technology); Junyu Dong (Ocean University of China); Jing Qin (The Hong Kong Polytechnic University); Shengfeng He (South China University of Technology)* |
5508 |
Bagging Regional Classification Activation Maps for Weakly Supervised Object Localization |
Lei Zhu (Beijing University of Posts and Telecommunications); Qian Chen (University of Science and Technology of China); Lujia Jin (Peking University); Yunfei You (Peking University); Yanye Lu (Peking University)* |
5518 |
Mutually Reinforcing Structure with Proposal Contrastive Consistency for Few-Shot Object Detection |
TianXue Ma (East China Normal University)*; Mingwei Bi (Tencent); Jian Zhang (Tencent Youtu); Wang Yuan (East China Normal University); Zhizhong Zhang (East China Normal University); Yuan Xie (East China Normal University); Shouhong Ding (Tencent); Lizhuang Ma (Shanghai Jiao Tong University) |
5523 |
Panoptic-PartFormer: Learning a Unified model for Panoptic Part Segmentation |
Xiangtai Li (Peking University)*; Shilin Xu (Peking University); Yibo Yang (Peking University); Guangliang Cheng (Sensetime Group Limited); Yunhai Tong (Peking University); Dacheng Tao (JD.com) |
5536 |
TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers |
Oren Nuriel (Amazon)*; Ron Litman (Amazon); Sharon Fogel (Amazon) |
5537 |
Speaker-adaptive Lip Reading with User-dependent Padding |
Minsu Kim (KAIST)*; Hyunjun Kim (KAIST); Yong Man Ro (KAIST) |
5541 |
Online Domain Adaptation for Semantic Segmentation in Ever-Changing Conditions |
Theodoros Panagiotakopoulos (KTH Royal Institute of Technology in Stockholm); Pier Luigi Dovesi (Univrses); Linus Härenstam-Nielsen (Artisense); Matteo Poggi (University of Bologna)* |
5542 |
Point Scene Understanding via Disentangled Instance Mesh Reconstruction |
Jiaxiang Tang (Peking University)*; Xiaokang Chen (Peking University); Jingbo Wang (The Chinese University of Hong Kong); Gang Zeng (Peking University) |
5543 |
Dual Contrastive Learning with Anatomical Auxiliary Supervision for Few-shot Medical Image Segmentation |
Huisi Wu (Shenzhen University)*; Fangyan Xiao (Shenzhen University); Chongxin Liang (Shenzhen University) |
5544 |
An Efficient Person Clustering Algorithm for Open Checkout-free Groceries |
Junde Morsen Wu (Purdue University); Yu Zhang (Harbin Institute of Technology); RAO FU (None); Yuanpei Liu (Beijing Institute of Technology); Jing Gao (Purdue University)* |
5548 |
Face2Face^ρ: Real-Time High-Resolution One-Shot Face Reenactment |
Kewei Yang (NetEase Games AI Lab)*; Kang Chen (NetEase Games AI Lab); Daoliang Guo (NetEase Games AI Lab); Song-Hai Zhang (Tsinghua University); Yuan-Chen Guo (Tsinghua University); Weidong Zhang (Netease Games AI Lab) |
5549 |
Decoupled Contrastive Learning |
Chun-Hsiao Yeh (Academia Sinica / UC Berkeley)*; Cheng-Yao Hong (Academia Sinica); Yen-Chi Hsu (Academia Sinica); Tyng-Luh Liu (Academia Sinica); Yubei Chen (Berkeley AI Research, UC Berkeley); Yann LeCun (Facebook) |
5555 |
Learning Algebraic Representation for Systematic Generalization in Abstract Reasoning |
Chi Zhang (University of California, Los Angeles)*; Sirui Xie (UCLA); Baoxiong Jia (UCLA); Ying Nian Wu (University of California, Los Angeles); Song-Chun Zhu (UCLA); Yixin Zhu (Peking University) |
5556 |
On the Robustness of Quality Measures for GANs |
Motasem Alfarra (KAUST)*; Juan C Perez (KAUST); Anna Fruehstueck (KAUST); Philip Torr (University of Oxford); Peter Wonka (KAUST); Bernard Ghanem (KAUST) |
5557 |
Automatic Check-Out via Prototype-based Classifier Learning from Single-Product Exemplars |
Hao Chen (Nanjing University of Science and Technology)*; Xiu-Shen Wei (Nanjing University of Science and Technology); Faen Zhang (AInnovation Co. Ltd.); Yang Shen (Nanjing University of Science and Technology); Hui Xu (QINGDAO AINNOVATION TECHNOLOGY GROUP CO., LTD); Liang Xiao (Nanjing University of Science and Technology) |
5559 |
TDViT: Temporal Dilated Transformer for Dense Video Tasks |
Guanxiong Sun (Queen’s University Belfast); Yang Hua (Queen’s University Belfast)*; Guosheng Hu (Oosto); Neil Robertson (Queen’s University Belfast) |
5561 |
POP: Mining POtential Performance of new fashion products via webly cross-modal query expansion |
Christian Joppi (Humatics srl)*; Geri Skenderi (University of Verona); Marco Cristani (University of Verona) |
5564 |
BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis |
Davide Moltisanti (University of Edinburgh)*; Jinyi Wu (S-Lab Nanyang Technological University); Bo Dai (Shanghai AI Lab); Chen Change Loy (Nanyang Technological University) |
5578 |
Towards Racially Unbiased Skin Tone Estimation via Scene Disambiguation |
Haiwen Feng (Max Planck Institute for Intelligent Systems); Timo Bolkart (Max Planck Institute for Intelligent Systems); Joachim Tesch (Max Planck Institute for Intelligent Systems); Michael J. Black (Max Planck Institute for Intelligent Systems); Victoria Fernandez Abrevaya (Max Planck Institute)* |
5580 |
Style-Guided Shadow Removal |
Jin Wan (Beijing Jiaotong University); Hui Yin (Beijing Jiaotong University)*; Zhenyao Wu (University of South Carolina); Xinyi Wu (University of South Carolina); Yanting Liu (Yanting Liu); Song Wang (University of South Carolina) |
5584 |
Sound-guided Semantic Video Generation |
Seung Hyun Lee (Korea University)*; Gyeongrok Oh (Korea University); Wonmin Byeon (NVIDIA Research); Jihyun Bae (Korea University); Chanyoung Kim (Korea University); Won Jeong Ryoo (Korea University); Sang Ho Yoon (KAIST); Hyunjun Cho (Korea University); Jinkyu Kim (Korea University); Sangpil Kim (Korea University) |
5585 |
Robust Visual Tracking by Segmentation |
Matthieu Paul (ETH Zurich)*; Martin Danelljan (ETH Zurich); Christoph Mayer (ETH Zurich); Luc Van Gool (ETH Zurich) |
5591 |
Semi-Supervised Learning of Optical Flow by Flow Supervisor |
Woobin Im (KAIST); Sebin Lee (KAIST); Sungeui Yoon (KAIST)* |
5595 |
Joint Learning of Localized Representations from Medical Images and Reports |
Philip Müller (Technical University of Munich)*; Georgios Kaissis (Technische Universität München); Congyu Zou (Klinikum Rechts der Isar Technische Universität München); Daniel Rueckert (Technische Universität München) |
5599 |
D2C-SR: A Divergence to Convergence Approach for Real-World Image Super-Resolution |
Youwei Li (Megvii); Haibin Huang (Kuaishou Technology); Lanpeng Jia (GWM); Haoqiang Fan (Megvii Inc(face++)); Shuaicheng Liu (UESTC; Megvii)* |
5612 |
Continual 3D Convolutional Neural Networks for Real-time Processing of Videos |
Lukas Hedegaard (Aarhus University)*; Alexandros Iosifidis (Aarhus University) |
5613 |
Salient Object Detection for Point Clouds |
Songlin Fan (Peking University ); Wei Gao (SECE, Shenzhen Graduate School, Peking University)*; Ge Li (Peking University) |
5616 |
Deep ensemble learning by diverse knowledge distillation for fine-grained object classification |
Naoki Okamoto (Chubu University)*; Tsubasa Hirakawa (Chubu University); Takayoshi Yamashita (Chubu University); Hironobu Fujiyoshi (Chubu University) |
5619 |
Source-free Video Domain Adaptation by Learning Temporal Consistency for Action Recognition |
Yuecong Xu (Institute for Infocomm Research, A*STAR, Singapore)*; Jianfei Yang (Nanyang Technological University); Haozhi Cao (Nanyang Technological University); Keyu Wu (Institute for Infocomm Research, A*STAR, Singapore); Min Wu (Institute for Infocomm Research, A*STAR, Singapore); Zhenghua Chen (Institute for Infocomm Research, A*STAR, Singapore) |
5643 |
GRIT-VLP: Grouped Mini-batch Sampling for Efficient Vision and Language Pre-training |
Jaeseok Byun (Seoul National University); Taebaek Hwang (M.IN.D Lab); Jianlong Fu (Microsoft Research); Taesup Moon (Seoul National University)* |
5644 |
Pose Forecasting in Industrial Human-Robot Collaboration |
Alessio Sampieri (Sapienza University)*; Guido Maria D’Amely di Melendugno (Sapienza University); ANDREA AVOGARO (University of Verona); Federico Cunico (University of Verona); Francesco Setti (University of Verona); Geri Skenderi (University of Verona); Marco Cristani (University of Verona); Fabio Galasso (Sapienza University) |
5648 |
MeshLoc: Mesh-Based Visual Localization |
Vojtech Panek (CTU in Prague, FEE, CIIRC)*; Zuzana Kukelova (Czech Technical University in Prague); Torsten Sattler (Czech Technical University in Prague) |
5660 |
Dress Code: High-Resolution Multi-Category Virtual Try-On |
Davide Morelli (UNIMORE); Matteo Fincato (Università degli Studi di Modena e Reggio Emilia); Marcella Cornia (University of Modena and Reggio Emilia)*; Federico Landi (University of Modena and Reggio Emilia); Fabio Cesari (YOOX Net-A-Porter Group S.p.A.); Rita Cucchiara (Università di Modena e Reggio Emilia) |
5661 |
UC-OWOD: Unknown-Classified Open World Object Detection |
Zhiheng Wu (Institute of Automation, Chinese Academy of Sciences (CASIA))*; Yue Lu (Institute of Automation, Chinese Academy of Sciences(CASIA)); Xingyu Chen (Xiaobing.AI); Zhengxing Wu (CASIA); Liwen Kang (Institute of Automation, Chinese Academy of Sciences (CASIA)); Junzhi Yu (CASIA) |
5666 |
Helpful or Harmful: Inter-Task Association in Continual Learning |
Hyundong Jin (Chung-Ang University ); Eunwoo Kim (Chung-Ang University)* |
5669 |
RayTran: 3D pose estimation and shape reconstruction of multiple objects from videos with ray-traced transformers |
Michał J Tyszkiewicz (EPFL); Kevis-Kokitsi Maninis (Google Research)*; Stefan Popov (Google Research); Vittorio Ferrari (Google Research) |
5673 |
Efficient Point Cloud Segmentation with Geometry-aware Sparse Networks |
Maosheng Ye (HKUST)*; Rui Wan (Deeproute.ai); Shuangjie Xu (HKUST); Tongyi Cao (Deeproute.ai); Qifeng Chen (HKUST) |
5677 |
Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition |
Tianjiao Li (Singapore University of Technology and Design)*; Lin Geng Foo (Singapore University of Technology and Design); Qiuhong Ke (Monash University); Hossein Rahmani (Lancaster University); Anran Wang (Bytedance); Jinghua Wang (Harbin Institute of Technology); Jun Liu (Singapore University of Technology and Design) |
5685 |
TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation |
Tan Minh Dinh (VinAI Research)*; Rang NGUYEN (VinAI Research); Binh-Son Hua (VinAI Research) |
5688 |
CostDCNet: Cost Volume based Depth Completion for a Single RGB-D Image |
Jaewon Kam (POSTECH); Jungeon Kim (POSTECH); Soongjin Kim (POSTECH); Jaesik Park (POSTECH); Seungyong Lee (POSTECH)* |
5697 |
Efficient Video Deblurring Guided by Motion Magnitude |
Yusheng Wang (The University of Tokyo)*; Yunfan Lu (Hong Kong University of Science and Technology); Ye Gao (Honor Technologies Japan); Lin Wang (HKUST); Zhihang Zhong (The University of Tokyo); Yinqiang Zheng (The University of Tokyo); Atsushi Yamashita (The University of Tokyo) |
5702 |
Space-Partitioning RANSAC |
Daniel Barath (ETH Zürich)*; Gábor Valasek (ELTE) |
5704 |
Towards Accurate Binary Neural Networks via Modeling Contextual Dependencies |
Xingrun Xing (Beihang University); Yangguang Li (SenseTime Group Limited); Wei Li (Nanyang Technological University); Wenrui Ding (Beihang University); Yalong Jiang (Beihang University)*; Yufeng Wang (Beihang University); Jing Shao (Sensetime); Chunlei Liu (Beihang University); Xianglong Liu (BUAA) |
5712 |
Overcoming Shortcut Learning in a Target Domain by Generalizing Basic Visual Factors from a Source Domain |
Piyapat Saranrittichai (Bosch Center for Artificial Intelligence)*; Chaithanya Kumar Mummadi (Bosch Center for Artificial Intelligence); Claudia Blaiotta (Bosch Center for Artificial Intelligence); Mauricio Munoz (Bosch Center for Artificial Intelligence); Volker Fischer (Bosch Center for Artificial Intelligence) |
5721 |
SimpleRecon: 3D Reconstruction Without 3D Convolutions |
Mohamed Sayed (University College London)*; John Gibson (Niantic, Inc.); Jamie Watson (Niantic); Victor A Prisacariu (Niantic Labs); Michael Firman (Niantic); Clement LJC Godard (Niantic) |
5739 |
SemAug: Semantically Meaningful Image Augmentations for Object Detection Through Language Grounding |
Morgan L Heisler (Huawei Technologies Canada Co., Ltd.)*; Amin Banitalebi-Dehkordi (Huawei Technologies Canada Co., Ltd.); Yong Zhang (Huawei Technologies Canada Co., Ltd.) |
5740 |
A data-centric approach for improving ambiguous labels with combined semi-supervised classification and clustering |
Lars Schmarje (Kiel University)*; Monty Santarossa (Kiel University); Simon-Martin Schröder (Kiel University); Claudius Zelenka (Kiel University); Rainer Kiko (Laboratoire d’Océanographie de Villefranche-sur-Mer); Jenny Stracke (University of Bonn); Nina Volkmann (University of Veterinary Medicine Hannover); Reinhard Koch (Kiel University) |
5750 |
SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks |
Anish J Prabhu (Apple)*; Chien-Yu Lin (University of Washington); Thomas Merth (Apple); Sachin Mehta (University of Washington); Anurag Ranjan (Apple); Maxwell C Horton (Apple, Xnor.Ai and University of Washington); Mohammad Rastegari (University of Washington) |
5754 |
SAGA: Stochastic Whole-Body Grasping With Contact |
Yan Wu (ETH Zurich); Jiahao Wang (Max Planck Institute for Informatics); Yan Zhang (ETH Zurich); Siwei Zhang (ETH Zurich); Otmar Hilliges (ETH Zurich); Fisher Yu (ETH Zurich); Siyu Tang (ETH Zurich)* |
5761 |
GTCaR: Graph Transformer for Camera Re-localization |
Xinyi Li (Magic Leap)*; Haibin Ling (Stony Brook University) |
5764 |
Actor-centered Representations for Action Localization in Streaming Videos |
Sathyanarayanan N Aakur (OK State)*; Sudeep Sarkar (University of South Florida, Tampa) |
5769 |
Photo-realistic Neural Domain Randomization |
Sergey Zakharov (Toyota Research Institute)*; Rareș A Ambruș (Toyota Research Institute); Vitor Guizilini (Toyota Research Institute); Wadim Kehl (Woven Planet); Adrien Gaidon (Toyota Research Institute) |
5770 |
ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and Pose Optimization |
Muhammad Zubair Irshad (Georgia Institute of Technology)*; Sergey Zakharov (Toyota Research Institute); Rareș A Ambruș (Toyota Research Institute); Thomas Kollar (Toyota Research Institute); Zsolt Kira (Georgia Institute of Technology); Adrien Gaidon (Toyota Research Institute) |
5771 |
Structure and Motion for Casual Videos |
Zhoutong Zhang (MIT)*; Forrester Cole (Google Research); Zhengqi Li (Google Inc.); Noah Snavely (Google); Michael Rubinstein (Google); William T Freeman (Google) |
5775 |
Single Frame Atmospheric Turbulence Mitigation: A Benchmark Study and A New Physics-Inspired Transformer Model |
Zhiyuan Mao (Purdue University)*; AJAY KUMAR JAISWAL (UT Austin); Zhangyang Wang (University of Texas at Austin); Stanley Chan (Purdue University, USA) |
5778 |
Incremental Task Learning with Incremental Rank Updates |
Rakib Hyder (University of California, Riverside)*; Ken Shao (UCR); Boyu Hou (The University of California, Riverside ); Panagiotis Markopoulos (RIT); Ashley Prater-Bennette (Air Force Research Laboratory); M. Salman Asif (University of California, Riverside) |
5787 |
Bandwidth-Aware Adaptive Codec for DNN Inference Offloading in IoT |
Xiufeng Xie (Kwai Inc.)*; Ning Zhou (Amazon); Wentao Zhu (Amazon); Ji Liu (Kwai Inc.) |
5789 |
Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-Curation |
Connelly Barnes (Adobe)*; Lingzhi Zhang (University of Pennsylvania); Jianbo Shi (University of Pennsylvania); Zhe Lin (Adobe Research); Eli Shechtman (Adobe Research, US); Sohrab Amirghodsi (Adobe Research); Kevin Wampler (Adobe Systems Inc.) |
5794 |
Controllable Video Generation through Global and Local Motion Dynamics |
Aram Davtyan (University of Bern)*; Paolo Favaro (University of Bern) |
5812 |
UniCR: Universally Approximated Certified Robustness via Randomized Smoothing |
Hanbin Hong (University of Connecticut)*; Binghui Wang (Illinois Institute of Technology); Yuan Hong (University of Connecticut) |
5829 |
3D Siamese Transformer Network for Single Object Tracking on Point Clouds |
Le Hui (Nanjing University of Science and Technology)*; Lingpeng Wang (Nanjing University of Science and Technology); Linghua Tang (Nanjing University of Science and Technology); Kaihao Lan (Nanjing University of Science and Technology); Jin Xie (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology) |
5837 |
Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips |
Jiawang Bai (Tsinghua University)*; Kuofeng Gao (Tsinghua University); Dihong Gong (Tencent AI Lab); Shu-Tao Xia (Tsinghua University); Zhifeng Li (Tencent AI Lab); Wei Liu (Tencent) |
5856 |
StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN |
Fei Yin (Tsinghua University)*; Yong Zhang (Tencent AI Lab); Xiaodong Cun (Tencent AI Lab); Mingdeng Cao (Tsinghua University); Yanbo Fan (Tencent AI Lab); Xuan Wang (Tencent AI Lab); Qingyan Bai (Tsinghua University); Baoyuan Wu (The Chinese University of Hong Kong, Shenzhen); Jue Wang (Tencent AI Lab); Yujiu Yang (Tsinghua University) |
5859 |
Referring Object Manipulation of Natural Images with Conditional Classifier-Free Guidance |
Myungsub Choi (Google)* |
5880 |
Self-Supervised Interactive Object Segmentation Through a Singulation-and-Grasping Approach |
Houjian Yu (University of Minnesota, Twin Cities)*; Changhyun Choi (University of Minnesota, Twin Cities) |
5898 |
BigColor: Colorization using a Generative Color Prior for Natural Images |
Geonung Kim (POSTECH); Kyoungkook Kang (POSTECH); Seongtae Kim (POSTECH); Hwayoon Lee (POSTECH); Sehoon Kim (Samsung Electronics Co. Ltd.); Jonghyun Kim (Samsung Electronics); Seung-Hwan Baek (POSTECH); Sunghyun Cho (POSTECH)* |
5901 |
Object Wake-up: 3D Object Rigging from a Single Image |
Ji Yang (University of Alberta)*; Xinxin Zuo (University of Alberta); Sen Wang (University of Alberta); Zhenbo Yu (Shanghai Jiao Tong University); Xingyu Li (University of Alberta); Bingbing Ni (Shanghai Jiao Tong University); Minglun Gong (University of Guelph); Li Cheng (ECE dept., University of Alberta) |
5905 |
ClearPose: Large-scale Transparent Object Dataset and Benchmark |
Xiaotong Chen (University of Michigan, Ann Arbor)*; Huijie Zhang (University of Michigan, Ann Arbor); Zeren Yu (University of Michigan–Ann Arbor); Anthony Opipari (University of Michigan); Odest Chadwicke Jenkins (University of Michigan) |
5907 |
Domain Knowledge-Informed Self-Supervised Representations for Workout Form Assessment |
Paritosh Parmar (University of British Columbia)*; Amol Gharat (Flex A.I.); Helge Rhodin (UBC) |
5908 |
Neural Capture of Animatable 3D Human from Monocular Video |
Gusi Te (Peking University); Xiu Li (Tencent); Xiao Li (Microsoft Research Asia)*; Jinglu Wang (Microsoft Research Asia); Wei Hu (Peking University); Yan Lu (Microsoft Research Asia) |
5913 |
Open Vocabulary Object Detection with Pseudo Bounding-Box Labels |
Mingfei Gao (Apple)*; Chen Xing (Salesforce Research); Juan Carlos Niebles (Salesforce & Stanford University); Junnan Li (Salesforce); Ran Xu (Salesforce Research); Wenhao Liu (Salesforce Metamind); Caiming Xiong (Salesforce Research) |
5914 |
BoundaryFace: A mining framework with noise label self-correction for Face Recognition |
Shijie Wu (Southwest Jiaotong University)*; Xun Gong (Southwest Jiaotong University) |
5915 |
IntegratedPIFu: Integrated Pixel Aligned Implicit Function for Single-view Human Reconstruction |
Kennard Chan Yanting (Nanyang Technological University)*; Guosheng Lin (Nanyang Technological University); Haiyu Zhao (SenseTime International Pte Ltd); Weisi Lin (Nanyang Technological University, Singapore) |
5922 |
BMD: A General Class-balanced Multicentric Dynamic Prototype Strategy for Source-free Domain Adaptation |
Sanqing Qu (Tongji University); Guang Chen (Tongji University)*; Jing Zhang (The University of Sydney); Zhijun Li (University of Science and Technology of China); Wei He (University of Science and Technology Beijing); Dacheng Tao (JD.com) |
5923 |
What Matters for 3D Scene Flow Network |
Guangming Wang (Shanghai Jiao Tong University); Yunzhe Hu (Shanghai Jiao Tong University); Zhe Liu (University of Cambridge); Yiyang Zhou (UC Berkeley); Masayoshi TOMIZUKA (MSC Lab); Wei Zhan (University of California, Berkeley); Hesheng Wang (SJTU)* |
5932 |
Controllable Shadow Generation Using Pixel Height Maps |
Yichen Sheng (Purdue University)*; Yifan Liu (University of Adelaide); Jianming Zhang (Adobe Research); Wei Yin (University of Adelaide); A. Cengiz Oztireli (University of Cambridge, Google); He Zhang (Adobe); Zhe Lin (Adobe Research); Eli Shechtman (Adobe Research, US); Bedrich Benes (Purdue University) |
5937 |
CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution |
Cheeun Hong (Seoul National University); Sungyong Baik (Hanyang University); Heewon Kim (Seoul National University); Seungjun Nah (NVIDIA); Kyoung Mu Lee (Seoul National University)* |
5940 |
SPSN: Superpixel Prototype Sampling Network for RGB-D Salient Object Detection |
Minhyeok Lee (Yonsei University)*; Chaewon Park (Yonsei University); Suhwan Cho (Yonsei University); Sangyoun Lee (Yonsei University) |
5950 |
Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer |
Songwei Ge (University of Maryland)*; Thomas F Hayes (Meta); Harry Yang (Facebook); Xi Yin (Facebook); Guan Pang (Facebook); David Jacobs (University of Maryland, USA); Jia-Bin Huang (Facebook); Devi Parikh (Georgia Tech & Facebook AI Research) |
5951 |
Combining Internal and External Constraints for Unrolling Shutter in Videos |
Eyal Naor (Weizmann Institute)*; Itai Antebi (Weizmann); Shai Bagon (Weizmann Institute of Science); Michal Irani (Weizmann Institute, Israel) |
5961 |
Global Spectral Filter Memory Network for Video Object Segmentation |
Yong Liu (Tsinghua University)*; Ran Yu (Tsinghua University); Jiahao Wang (Tsinghua University); Xinyuan Zhao (Huawei); Yitong Wang (Bytedance); Yansong Tang (Tsinghua University); Yujiu Yang (Tsinghua University) |
5964 |
SEMICON: A Learning-to-hash Solution for Large-scale Fine-grained Image Retrieval |
Yang Shen (Nanjing University of Science and Technology); Xu Hao XH SUN (Nanjing University of Science and Technology); Xiu-Shen Wei (Nanjing University of Science and Technology)*; Qing-Yuan Jiang (Huawei); Jian Yang (Nanjing University of Science and Technology) |
5966 |
Batch-efficient EigenDecomposition for Small and Medium Matrices |
Yue Song (University of Trento)*; Nicu Sebe (University of Trento); Wei Wang (EPFL) |
5972 |
General Object Pose Transformation Network from Unpaired Data |
Yukun Su (South China University of Technology)*; Guosheng Lin (Nanyang Technological University); RuiZhou Sun (South China University of Technology); Qingyao Wu (South China University of Technology) |
5974 |
Robust Network Architecture Search via Feature Distortion Restraining |
Yaguan QIAN (Zhejiang University of Science and Technology)*; Shenghui Huang (Zhejiang University of Science and Technology); Bin WANG (Network and Information Security Laboratory of Hangzhou Hikvision Digital Technology Co.); Xiang Ling (Institute of Software, Chinese Academy of Sciences); Xiaohui Guan (Zhejiang University of Water Resources and Electric Power); Zhaoquan Gu (Guangzhou University); Shaoning Zeng (Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China); Wujie Zhou (Zhejiang University of Science and Technology); Haijiang Wang (Zhejiang University of Science and Technology) |
5988 |
Correspondence Reweighted Translation Averaging |
Lalit Manam (Indian Institute of Science Bengaluru)*; Venu Madhav Govindu (Indian Institute of Science) |
5993 |
RepMix: Representation Mixing for Robust Attribution of Synthesized Images |
Tu Bui (University of Surrey)*; Ning Yu (Salesforce Research); John Collomosse (Adobe Research) |
6000 |
When Deep Classifiers Agree: Analyzing Correlations between Learning Order and Image Statistics |
Iuliia Pliushch (Goethe University)*; Martin Mundt (TU Darmstadt); Nicolas Lupp (Goethe University Frankfurt); Visvanathan Ramesh (Goethe University) |
6002 |
S2F2: Single-Stage Flow Forecasting for Future Multiple Trajectories Prediction |
YU-WEN CHEN (National Tsing Hua University); Hsuan-Kung Yang (National Tsing Hua University); Chu-Chi Chiu (National Tsing Hua University); Chun-Yi Lee (National Tsing Hua University)* |
6004 |
Few-Shot Object Detection by Knowledge Distillation Using Bag-of-Visual-Words Representations |
Wenjie Pei (Harbin Institute of Technology, Shenzhen); Shuang Wu (Harbin Institute of Technology, Shenzhen); Dianwen Mei (Harbin Institute of Technology, Shenzhen); Fanglin Chen (Harbin Institute of Technology, Shenzhen); Jiandong Tian (CAS); Guangming Lu (Harbin Institute of Technology, Shenzhen)* |
6009 |
Stochastic Consensus: Enhancing Semi-Supervised Learning with Consistency of Stochastic Classifiers |
Hui Tang (South China University of Technology)*; Kui Jia (South China University of Technology); Lin Sun (Magic Leap) |
6011 |
Learning Where To Look – Generative NAS is Surprisingly Efficient |
Jovita Lukasik (University of Mannheim)*; Steffen Jung (MPII); Margret Keuper (University of Mannheim) |
6023 |
Realistic One-shot Mesh-based Head Avatars |
Taras Khakhulin (Skolkovo Institute of Science and Technology)*; Vanessa Valerievna Skliarova (Skoltech); Victor Lempitsky (Yandex); Egor Zakharov (Skolkovo Institute of Science and Technology) |
6024 |
Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning |
Seunghyun Lee (Inha University); Byung Cheol Song (Inha University)* |
6037 |
SALISA: Saliency-based Input Sampling for Efficient Video Object Detection |
Babak Ehteshami Bejnordi (Qualcomm AI Research)*; Amir Ghodrati (Qualcomm AI Research); Fatih Porikli (Qualcomm AI Research); Amirhossein Habibian (Qualcomm AI Research) |
6039 |
Video Instance Segmentation via Multi-Scale Spatio-Temporal Split Attention Transformer |
Omkar Thawakar (MBZUAI)*; Sanath Narayan (Inception Institute of Artificial Intelligence); Jiale Cao (Tianjin University); Hisham Cholakkal (MBZUAI); Rao Muhammad Anwer (MBZUAI/AALTO); Muhammad Haris Khan (Muhammad Bin Zayed University of Artificial Intelligence); Salman Khan (MBZUAI/ANU); Michael Felsberg (Linköping University); Fahad Shahbaz Khan (MBZUAI) |
6044 |
RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation |
Haodi He (University of Science and Technology of China); Yuhui Yuan (Microsoft Research)*; Xiangyu Yue (University of California, Berkeley); Han Hu (Microsoft Research Asia) |
6046 |
Contextformer: A Transformer with Spatio-Channel Attention for Context Modeling in Learned Image Compression |
Ahmet Burakhan Koyuncu (Technical University of Munich)*; Han Gao (Tencent America); Atanas Boev (Huawei Technologies Duesseldorf GmbH); Georgii Gaikov (Huawei Moscow Research Center); Elena Alshina (Huawei Technologies); Eckehard Steinbach (TUM) |
6048 |
Image Super-Resolution with Deep Dictionary |
Shunta Maeda (Navier Inc.)* |
6054 |
ECO-TR: Efficient Correspondences Finding Via Coarse-to-Fine Refinement |
Dongli Tan (Xiamen University)*; Jiang-Jiang Liu (Nankai University); Xingyu Chen (Youtu Lab); Chao Chen (Youtu Laboratory); Ruixin Zhang (Tencent); Yunhang Shen (Xiamen University); Shouhong Ding (Tencent); Rongrong Ji (Xiamen University, China) |
6056 |
Responsive Listening Head Generation: A Benchmark Dataset and Baseline |
Mohan Zhou (Harbin Institute of Technology)*; Yalong Bai (JD AI Research); Wei Zhang (JD AI Research); Ting Yao (JD AI Research); Tiejun Zhao (Harbin Institute of Technology); Tao Mei (AI Research of JD.com) |
6063 |
WISE: Whitebox Image Stylization by Example-based Learning |
Winfried Lötzsch (Merantix Momentum); Max Reimann (Hasso-Plattner-Institute)*; Martin Büßemeyer (Hasso-Plattner-Institut); Amir Semmo (Digital Masterpieces GmbH); Jürgen Döllner (Hasso-Plattner-Institut); Matthias Trapp (Hasso Plattner Institute, University of Potsdam) |
6067 |
3D Equivariant Graph Implicit Functions |
Yunlu Chen (University of Amsterdam)*; Basura Fernando (Agency for Science, Technology and Research, A*STAR, Singapore); Hakan Bilen (University of Edinburgh); Matthias Niessner (Technical University of Munich); Efstratios Gavves (University of Amsterdam) |
6068 |
AnimeCeleb: Large-Scale Animation CelebHeads Dataset for Head Reenactment |
Kangyeol Kim (KAIST)*; Sunghyun Park (KAIST); Jaeseong Lee (KAIST); Sunghyo Chung (Korea University); Junsoo Lee (NAVER WEBTOON Ltd.); Jaegul Choo (Korea Advanced Institute of Science and Technology) |
6076 |
Towards Scale-Aware, Robust, and Generalizable Unsupervised Monocular Depth Estimation by Integrating IMU Motion Dynamics |
Sen Zhang (The University of Sydney); Jing Zhang (The University of Sydney)*; Dacheng Tao (The University of Sydney) |
6078 |
Dynamic Local Aggregation Network with Adaptive Clusterer for Anomaly Detection |
Zhiwei Yang (Xidian University)*; Peng Wu (Xidian University); Jing Liu (Xidian University); Xiaotao Liu (Xidian University) |
6080 |
Learning Semantic Segmentation from Multiple Datasets with Label Shifts |
Dongwan Kim (Seoul National University)*; Yi-Hsuan Tsai (Phiar Technologies); Yumin Suh (NEC Labs America); Masoud Faraki (NEC Labs); Sparsh Garg (NEC Labs America); Manmohan Chandraker (UC San Diego); Bohyung Han (Seoul National University) |
6086 |
SecretGen: Privacy Recovery on Pre-trained Models via Distribution Discrimination |
Zhuowen Yuan (UIUC); Fan Wu (UIUC); Yunhui Long (University of Illinois); Chaowei Xiao (NVIDIA); Bo Li (UIUC)* |
6090 |
A Kendall Shape Space Approach to 3D Shape Estimation from 2D Landmarks |
Martha Paskin (Zuse Institute Berlin); Daniel Baum (Zuse Institute Berlin); Mason N Dean (City University of Hong Kong); Christoph von Tycowicz (Zuse Institute Berlin)* |
6092 |
Temporally Consistent Transformer for Video Denoising |
Mingyang Song (ETH Zurich)*; Yang Zhang (Disney Research Studios); Tunç Aydin (Disney Research) |
6093 |
Action Quality Assessment with Temporal Parsing Transformer |
Yang Bai (Durham University); Desen Zhou (Baidu, Inc.)*; Songyang Zhang (Shanghai AI Laboratory); Jian Wang (Baidu); Errui Ding (Baidu Inc.); Yu Guan (University of Warwick); Yang Long (Durham University); Jingdong Wang (Baidu) |
6097 |
A study of Pre-training strategies and datasets for facial representation learning |
Adrian Bulat (Samsung AI Center, Cambridge)*; Shiyang Cheng (Samsung); Jing Yang (University of Nottingham); Andrew Garbett (Samsung AI Center); Enrique Sanchez (Samsung AI Centre); Georgios Tzimiropoulos (Queen Mary University of London) |
6112 |
Neural Strands: Learning Hair Geometry and Appearance from Multi-View Images |
Radu Alexandru Rosu (University of Bonn); Shunsuke Saito (Facebook); Ziyan Wang (Carnegie Mellon University); Chenglei Wu (Facebook Reality Labs); Sven Behnke (University of Bonn); Giljoo Nam (Facebook Inc.)* |
6114 |
Conditional Stroke Recovery for Fine-Grained Sketch-Based Image Retrieval |
Zhixin Ling (Fudan University)*; Zhen Xing (Fudan University); Jian Zhou (Fudan University); Xiangdong Zhou (Fudan University) |
6123 |
Generalized Brain Image Synthesis with Transferable Convolutional Sparse Coding Networks |
Yawen Huang (Tencent)*; Feng Zheng (SUSTech); Xu Sun (Tencent); Yuexiang Li (Jarvis Lab, Tencent); Ling Shao (Terminus Group); Yefeng Zheng (Tencent) |
6127 |
Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning |
Ting Yao (JD AI Research); Yingwei Pan (JD AI Research)*; Yehao Li (JD AI Research); Chong-Wah Ngo (Singapore Management University); Tao Mei (AI Research of JD.com) |
6129 |
GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs |
Xin Liu (Tsinghua University)*; Xiaofei Shao (Deptrum); Bo Wang (Deptrum); Ya-Li Li (Tsinghua University); Shengjin Wang (Tsinghua University) |
6138 |
Revisiting Batch Norm Initialization |
Jim Davis (Ohio State University); Logan Frank (Ohio State University)* |
6141 |
NewsStories: Illustrating articles with visual summaries |
Reuben Tan (Boston University)*; Bryan Plummer (Boston University); Kate Saenko (Boston University); J.P. Lewis (Google Research); Avneesh Sud (Google); Thomas Leung (Google) |
6144 |
Improving Few-Shot Learning through Multi-task Representation Learning Theory |
Quentin Bouniot (CEA, LIST)*; Ievgen Redko (Laboratoire Hubert Curien); Romaric Audigier (CEA LIST); Angélique Loesch (CEA LIST); Amaury Habrard (University of St-Etienne, Lab. H. Curien) |
6145 |
Deep Semantic Statistics Matching (D2SM) Denoising Network |
Kangfu Mei (Johns Hopkins University)*; Vishal Patel (Johns Hopkins University); Rui Huang (The Chinese University of Hong Kong, Shenzhen) |
6148 |
Long-tailed Instance Segmentation using Gumbel Optimized Loss |
Konstantinos P Alexandridis (University of Liverpool)*; Jiankang Deng (Imperial College London); Anh Nguyen (University of Liverpool); Shan Luo (University of Liverpool) |
6162 |
DetMatch: Two Teachers are Better Than One for Joint 2D and 3D Semi-Supervised Object Detection |
Jinhyung Park (Carnegie Mellon University)*; Chenfeng Xu (UC Berkeley); Yiyang Zhou (UC Berkeley); Masayoshi TOMIZUKA (MSC Lab); Wei Zhan (University of California, Berkeley) |
6177 |
3D Scene Inference from Transient Histograms |
Sacha Jungerman (University of Wisconsin-Madison)*; Atul N Ingle (University of Wisconsin-Madison); Yin Li (University of Wisconsin-Madison); Mohit Gupta (University of Wisconsin-Madison, USA) |
6178 |
SSBNet: Improving Visual Recognition Efficiency by Adaptive Sampling |
Ho Man Kwan (The Hong Kong University of Science and Technology)*; S.H. Song (HKUST) |
6182 |
Deep 360° Optical Flow Estimation by Multi-Projection Fusion |
Yiheng Li (Victoria University of Wellington); Connelly Barnes (Adobe); Kun Huang (Victoria University of Wellington); Fang-Lue Zhang (Victoria University of Wellington)* |
6187 |
Neural Space-filling Curves |
Hanyu Wang (University of Maryland – College Park)*; Kamal Gupta (University of Maryland); Larry Davis (University of Maryland); Abhinav Shrivastava (University of Maryland) |
6192 |
MFIM: Megapixel Facial Identity Manipulation |
Sanghyeon Na (kakaobrain)* |
6194 |
Objects Can Move: 3D Change Detection by Geometric Transformation Consistency |
Aikaterini Adam (National Technical University of Athens)*; Torsten Sattler (Czech Technical University in Prague); Konstantinos Karantzalos (National Technical University of Athens); Tomas Pajdla (Czech Technical University in Prague) |
6199 |
MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration |
Thomas F Hayes (Meta); Songyang Zhang (University of Rochester)*; Xi Yin (Facebook); Guan Pang (Facebook); Sasha Sheng (Meta Platforms); Harry Yang (Facebook); Songwei Ge (University of Maryland, College Park); Qiyuan Hu (Facebook AI Research); Devi Parikh (Georgia Tech & Facebook AI Research) |
6203 |
PatchRD: Detail-Preserving Shape Completion by Learning Patch Retrieval and Deformation |
Bo Sun (UT Austin)*; Vladimir Kim (Adobe); Qixing Huang (The University of Texas at Austin); Noam Aigerman (Adobe); Siddhartha Chaudhuri (Adobe Research) |
6207 |
Network Binarization via Contrastive Learning |
Yuzhang Shang (Illinois Institute of Technology)*; Dan Xu (The Hong Kong University of Science and Technology); Ziliang Zong (Texas State University); Liqiang Nie (Harbin Institute of Technology (Shenzhen)); Yan Yan (Illinois Institute of Technology) |
6210 |
Lipschitz Continuity Retained Binary Neural Network |
Yuzhang Shang (Illinois Institute of Technology)*; Dan Xu (The Hong Kong University of Science and Technology); Bin Duan (Illinois Institute of Technology); Ziliang Zong (Texas State University); Liqiang Nie (Harbin Institute of Technology (Shenzhen)); Yan Yan (Illinois Institute of Technology) |
6212 |
Is Geometry Enough for Matching in Visual Localization? |
Qunjie Zhou (Technical University of Munich)*; Sérgio Agostinho (Institute for Systems and Robotics, Instituto Superior Técnico, Universidade de Lisboa); Aljosa Osep (TUM Munich); Laura Leal-Taixé (TUM) |
6214 |
Webly Supervised Concept Expansion for General Purpose Vision Models |
Amita Kamath (Allen Institute for Artificial Intelligence); Christopher A Clark (Allen Institute for AI)*; Tanmay Gupta (Allen Institute for Artificial Intelligence); Eric Kolve (Allen AI); Derek Hoiem (University of Illinois at Urbana-Champaign); Aniruddha Kembhavi (Allen Institute for Artificial Intelligence) |
6216 |
Compositional Human-Scene Interaction Synthesis with Semantic Control |
Kaifeng Zhao (ETH Zurich)*; Shaofei Wang (ETH Zurich); Yan Zhang (ETH Zurich); Thabo Beeler (Disney Research Studios); Siyu Tang (ETH Zurich) |
6218 |
MaCLR: Motion-aware Contrastive Learning of Representations for Videos |
Fanyi Xiao (Meta); Joseph Tighe (Amazon); Davide Modolo (Amazon)* |
6220 |
Transformers as Meta-Learners for Implicit Neural Representations |
Yinbo Chen (UC San Diego)*; Xiaolong Wang (UCSD) |
6222 |
RAWtoBit: A Fully End-to-end Camera ISP Network |
Wooseok Jeong (Korea University); Seung-Won Jung (Korea University)* |
6227 |
SpatialDETR: Robust Scalable Transformer-Based 3D Object Detection from Multi-View Camera Images with Global Cross-Sensor Attention |
Simon Doll (University of Tübingen)*; Richard Schulz (Mercedes Benz); Lukas Schneider (Daimler); Viviane Benzin (Mercedes-Benz AG); Markus Enzweiler (Esslingen University of Applied Sciences); Hendrik P. A. Lensch (University of Tübingen) |
6228 |
3D Face Reconstruction with Dense Landmarks |
Erroll Wood (Microsoft)*; Tadas Baltrusaitis (Microsoft); Charlie Hewitt (Microsoft); Matthew A Johnson (Microsoft); Jingjing Shen (Microsoft); Nikola Milosavljevic (Microsoft); Daniel S Wilde (Microsoft); Stephan J Garbin (University College London); Toby Sharp (Microsoft); Ivan Stojiljkovic (Microsoft); Tom Cashman (Microsoft); Julien Valentin (Microsoft) |
6236 |
SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds |
Pei Sun (Waymo)*; Mingxing Tan (Waymo); Weiyue Wang (Waymo); Chenxi Liu (Waymo); Fei Xia (Waymo); Zhaoqi Leng (Waymo); Dragomir Anguelov (Waymo) |
6247 |
Incomplete Multi-view Domain Adaptation via Channel Enhancement and Knowledge Transfer |
Haifeng Xia (Tulane University)*; Pu Wang (MERL); Zhengming Ding (Tulane University) |
6250 |
Exposure-Aware Dynamic Weighted Learning for Single-Shot HDR Imaging |
An Gia Vien (Dongguk University); Chul Lee (Dongguk University)* |
6259 |
Seeing through a Black Box: Toward High-Quality Terahertz Imaging via Subspace-and-Attention Guided Restoration |
Weng-Tai Su (National Tsing Hua University); Yi-Chun Hung (University of California, Los Angeles); Po-Jen Yu (National Tsing Hua University); Shang-Hua Yang (National Tsing Hua University); Chia-Wen Lin (National Tsing Hua University)* |
6265 |
SPViT: Enabling Faster Vision Transformers via Soft Token Pruning |
Zhenglun Kong (Northeastern University)*; Peiyan Dong (Northeastern University); Xiaolong Ma (Clemson University); Xin Meng (Peking University); Wei Niu (William & Mary); Mengshu Sun (Northeastern University); Xuan Shen (Northeastern University); Geng Yuan (Northeastern University); Bin Ren (William & Mary); Hao Tang (ETH Zurich); Minghai Qin (Western Digital Research); Yanzhi Wang (Northeastern University) |
6269 |
Soft Masking for Cost-Constrained Channel Pruning |
Ryan Humble (Stanford University)*; Maying Shen (NVIDIA); Jorge Albericio Latorre (NVIDIA); Eric Darve (Stanford University); Jose M. Alvarez (NVIDIA) |
6271 |
Ensemble Learning Priors Driven Deep Unfolding for Scalable Snapshot Compressive Imaging |
Chengshuai Yang (Westlake University)*; Shiyu Zhang (Westlake University); Xin Yuan (Westlake University) |
6275 |
A Simple Baseline for Open Vocabulary Semantic Segmentation with Pre-trained Vision-language Model |
Mengde Xu (Huazhong University of Science and Tech.); Zheng Zhang (MSRA)*; Fangyun Wei (Microsoft Research Asia); Yutong Lin (Xi’an Jiaotong University); Yue Cao (Microsoft Research); Han Hu (Microsoft Research Asia); Xiang Bai (Huazhong University of Science and Technology) |
6276 |
Triangle Attack: A Query-efficient Decision-based Adversarial Attack |
Xiaosen Wang (Huazhong University of Science and Technology)*; Zeliang Zhang (Huazhong University of Sci. & Technology); Kangheng Tong (Huazhong University of Science and Technology); Dihong Gong (Tencent AI Lab); Kun He (Huazhong University of Science and Technology); Zhifeng Li (Tencent AI Lab); Wei Liu (Tencent) |
6282 |
Tailoring Self-Supervision for Supervised Learning |
WonJun Moon (Sungkyunkwan University)*; Jihwan Kim (Sungkyunkwan University); Jae-Pil Heo (Sungkyunkwan University) |
6283 |
Difficulty-Aware Simulator for Open Set Recognition |
WonJun Moon (Sungkyunkwan University)*; Jun ho Park (Sungkyunkwan University); Hyun Seok Seong (Sungkyunkwan University); Cheol-Ho Cho (Sungkyunkwan University); Jae-Pil Heo (Sungkyunkwan University) |
6287 |
Non-Uniform Step Size Quantization for Accurate Post-Training Quantization |
Sangyun Oh (UNIST)*; Hyeonuk Sim (UNIST); Jounghyun Kim (UNIST); Jongeun Lee (UNIST) |
6298 |
FedVLN: Privacy-preserving Federated Vision-and-Language Navigation |
Kaiwen Zhou (University of California, Santa Cruz)*; Xin Eric Wang (University of California, Santa Cruz) |
6305 |
Data-free Backdoor Removal Based on Channel Lipschitzness |
Runkai Zheng (Chinese University of Hong Kong (Shenzhen)); Rongjun Tang (The Chinese University of Hong Kong, Shenzhen); Jianze Li (Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen); Li Liu (Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen)* |
6312 |
SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning |
Haoran You (Rice University)*; Baopu Li (Baidu); Zhanyi Sun (Rice University); Xu Ouyang (Rice University); Yingyan Lin (Rice University) |
6316 |
PCR-CG: Point Cloud Registration via Deep Explicit Color and Geometry |
Yu Zhang (Shanghai Jiao Tong University)*; Yu Junle (Hangzhou Dianzi University); Xiaolin Huang (Shanghai Jiao Tong University); Wenhui Zhou (Hangzhou Dianzi University); Ji Hou (Meta Reality Labs) |
6323 |
DistPro: Searching A Fast Knowledge Distillation Process via Meta Optimization |
Xueqing Deng (University of California, Merced); Dawei Sun (University of Illinois Urbana-Champaign); Shawn Newsam (UC Merced); Peng Wang (Bytedance USA LLC.)* |
6324 |
Tomography of Turbulence Strength Based on Scintillation Imaging |
Nir Shaul (Technion)*; Schechner Yoav (Technion) |
6325 |
Realistic Blur Synthesis for Learning Image Deblurring |
Jaesung Rim (POSTECH); Geonung Kim (POSTECH); Jungeon Kim (POSTECH); Junyong Lee (POSTECH); Seungyong Lee (POSTECH); Sunghyun Cho (POSTECH)* |
6328 |
GLAMD: Global and Local Attention Mask Distillation for Object Detectors |
YounHo Jang (Kyung Hee University); Wheemyung Shin (Kyung Hee University); Jinbeom Kim (Sungkyunkwan University (SKKU)); Sung-Ho Bae (Kyung Hee University)*; Simon S Woo (Sungkyunkwan University (SKKU)) |
6337 |
Meta-GF: Training Dynamic-Depth Neural Networks Harmoniously |
Yi Sun (National University of Defense Technology); Jian Li (NUDT); Xin Xu (National University of Defense Technology)* |
6338 |
CXR Segmentation by AdaIN-based Domain Adaptation and Knowledge Distillation |
Yujin Oh (Kim Jaechul Graduate School of AI, KAIST, Korea); Jong Chul Ye (Kim Jaechul Graduate School of AI, KAIST, Korea)* |
6342 |
Emotion-aware Multi-view Contrastive Learning for Facial Emotion Recognition |
Daeha Kim (Inha University); Byung Cheol Song (Inha University)* |
6356 |
FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection |
Danila Rukhovich (Samsung AI Center Moscow); Anna Vorontsova (Samsung AI Center)*; Anton S. Konushin (Samsung AI Center Moscow) |
6365 |
Video Dialog as Conversation about Objects Living in Space-Time |
Hoang-Anh Pham (Deakin University)*; Thao Minh Le (Deakin University); Vuong Le (Deakin University); Tu Minh Phuong (Posts and Telecommunications Institute of Technology); Truyen Tran (Deakin University) |
6366 |
Few-Shot Class-Incremental Learning from an Open-Set Perspective |
Can Peng (The University of Queensland)*; Kun Zhao (Sullivan Nicolaides Pathology); Tianren Wang (The University of Queensland); Meng Li (The University of Queensland); Brian C Lovell (University of Queensland) |
6380 |
ML-BPM: Multi-teacher Learning with Bidirectional Photometric Mixing for Open Compound Domain Adaptation in Semantic Segmentation |
Fei Pan (KAIST)*; Sungsu Hur (KAIST); Seokju Lee (KENTECH); Junsik Kim (Harvard University); In So Kweon (KAIST) |
6389 |
DRCNet: Dynamic Image Restoration Contrastive Network |
Fei Li (China Agricultural University)*; Lingfeng Shen (Tencent AI Lab); YANG MI (China Agricultural University); Zhenbo Li (China Agricultural University) |
6394 |
Order Learning Using Partially Ordered Data via Chainization |
Seon-Ho Lee (MCL, Korea University); Chang-Su Kim (Korea University)* |
6395 |
Style Your Hair: Latent Optimization for Pose-Invariant Hairstyle Transfer via Local-Style-Aware Hair Alignment |
Chaeyeon Chung (Korea Advanced Institute of Science and Technology)*; Taewoo Kim (Korea Advanced Institute of Science and Technology); Yoonseo Kim (KAIST); Sunghyun Park (KAIST); Kangyeol Kim (KAIST); Jaegul Choo (Korea Advanced Institute of Science and Technology) |
6403 |
High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions |
SangYun Lee (Soongsil University); Gyojung Gu (Korea Advanced Institute of Science and Technology)*; Sunghyun Park (KAIST); Seunghwan Choi (Korea Advanced Institute of Science and Technology); Jaegul Choo (Korea Advanced Institute of Science and Technology) |
6418 |
Zero-Shot Learning for Reflection Removal of Single 360-Degree Image |
Byeong-Ju Han (Ulsan National Institute of Science and Technology); Jae-Young Sim (Ulsan National Institute of Science and Technology)* |
6420 |
A Codec Information Assisted Framework for Efficient Compressed Video Super-Resolution |
Hengsheng Zhang (Shanghai Jiao Tong University)*; Xueyi Zou (Huawei Noah’s Ark Lab); Jiaming Guo (Huawei Noah’s Ark Lab); Youliang Yan (Huawei Noah’s Ark Lab); Rong Xie (Shanghai Jiao Tong University); Li Song (Shanghai Jiao Tong University) |
6421 |
Towards Ultra Low Latency Spiking Neural Networks for Vision and Sequential Tasks Using Temporal Pruning |
Sayeed Shafayet Chowdhury (Purdue University)*; Nitin Rathi (Purdue University); Kaushik Roy (Purdue University) |
6439 |
MimicME: A Large Scale Diverse 4D Database for Facial Expression Analysis |
Athanasios Papaioannou (Huawei)*; Baris Gecer (Huawei); Shiyang Cheng (Samsung); Grigorios Chrysos (EPFL); Jiankang Deng (Imperial College London); Eftychia Fotiadou (Imperial College London); Christos Kampouris (ApolloXR); Dimitrios Kollias (Queen Mary University London); Stylianos Moschoglou (Huawei Technologies Co. Ltd); Kritaphat Songsri-In (Imperial College London); Stylianos Ploumpis (Huawei Technologies Co. Ltd); George Trigeorgis (Imperial College London); Panagiotis Tzirakis (Imperial College London); Evangelos Ververas (Imperial College London); Yuxiang Zhou (DeepMind, Google); Allan Ponniah (NHS); Anastasios Roussos (Institute of Computer Science, Foundation for Research and Technology Hellas); Stefanos Zafeiriou (Imperial College London) |
6441 |
Black-Box Dissector: Towards Erasing-based Hard-Label Model Stealing Attack |
Yixu Wang (Xiamen University)*; Jie Li (Xiamen University); Hong Liu (National Institute of Informatics); Yan Wang (Pinterest); Yongjian Wu (Tencent Technology (Shanghai) Co., Ltd); Feiyue Huang (Tencent); Rongrong Ji (Xiamen University, China) |
6451 |
Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles |
Guodong Wang (Beihang University)*; Yunhong Wang (State Key Laboratory of Virtual Reality Technology and System, Beihang University, Beijing 100191, China); Jie Qin (Nanjing University of Aeronautics and Astronautics); Dongming Zhang (National Computer Network Emergency Response Technical Team/Coordination Center of China); Xiuguo Bao (National Computer Network Emergency Response Technical Team/Coordination Center of China); Di Huang (Beihang University, China) |
6454 |
Towards Accurate Network Quantization with Equivalent Smooth Regularizers |
Kirill Solodskikh (Huawei Noah’s Ark Lab, MSU)*; Vladimir Chikin (Huawei Noah’s Ark Lab); Ruslan Aydarkhanov (Huawei Noah’s Ark Lab); Dehua Song (Huawei Noah’s Ark Lab); Irina Zhelavskaya (Skolkovo Institute of Science and Technology (Skoltech)); Jiansheng Wei (Huawei Technologies Co. Ltd.) |
6455 |
DiffuseMorph: Unsupervised Deformable Image Registration Using Diffusion Model |
Boah Kim (KAIST)*; Inhwa Han (KAIST); Jong Chul Ye (Kim Jaechul Graduate School of AI, KAIST, Korea) |
6459 |
An Impartial Take to the CNN vs Transformer Robustness Contest |
Francesco Pinto (University of Oxford)*; Philip Torr (University of Oxford); Puneet Dokania (University of Oxford) |
6460 |
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval |
Haoran Wang (Baidu)*; Dongliang He (Baidu); Wenhao Wu (Baidu); Boyang Xia (Institute of Computing Technology, Chinese Academy of Science); Min Yang (Baidu); Fu Li (Baidu); Yunlong Yu (Zhejiang University); Zhong Ji (Tianjin University); Errui Ding (Baidu Inc.); Jingdong Wang (Baidu) |
6463 |
Weakly Supervised 3D Scene Segmentation with Region-Level Boundary Awareness and Instance Discrimination |
Kangcheng LIU (The Chinese University of Hong Kong)*; Yuzhi Zhao (City University of Hong Kong); Qiang Nie (Tencent Youtu Lab); Zhi Gao (NUS); Ben M. Chen (Chinese University of Hong Kong) |
6471 |
FOSTER: Feature Boosting and Compression for Class-Incremental Learning |
Fu-Yun Wang (Nanjing University)*; Da-Wei Zhou (Nanjing University); Han-Jia Ye (Nanjing University); De-Chuan Zhan (Nanjing University) |
6472 |
Delving into Universal Lesion Segmentation: Method, Dataset, and Benchmark |
Yu Qiu (Nankai University)*; Jing Xu (Nankai University) |
6475 |
Explicit Model Size Control and Relaxation via Smooth Regularization for Mixed-Precision Quantization |
Vladimir Chikin (Huawei Noah’s Ark Lab)*; Kirill Solodskikh (Huawei Noah’s Ark Lab, MSU); Irina Zhelavskaya (Skolkovo Institute of Science and Technology (Skoltech)) |
6479 |
Large scale Real-world Multi Person Tracking |
Bing Shuai (Amazon)*; Alessandro Bergamo (Amazon); Uta Büchler (Amazon); Andrew G Berneshawi (Amazon); Alyssa Boden (Amazon Web Services); Joseph Tighe (Amazon) |
6491 |
Class-agnostic Object Detection with Multi-modal Transformer |
Muhammad Maaz (MBZUAI)*; Hanoona Abdul Rasheed (MBZUAI); Salman Khan (MBZUAI/ANU); Fahad Shahbaz Khan (MBZUAI); Rao Muhammad Anwer (MBZUAI/AALTO); Ming-Hsuan Yang (University of California at Merced) |
6493 |
Language-Grounded Indoor 3D Semantic Segmentation in the Wild |
Dávid Rozenberszki (Technische Universität München)*; Or Litany (Stanford); Angela Dai (Technical University of Munich) |
6505 |
Injecting 3D Perception of Controllable NeRF-GAN into StyleGAN for Editable Portrait Image Synthesis |
Jeong-gi Kwak (Korea University); Yuanming Li (Korea University); Dongsik Yoon (Korea University); Donghyeon Kim (Korea University); David K Han (Drexel University); Hanseok Ko (Korea University)* |
6512 |
BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks |
Han-Byul Kim (Seoul National University)*; Eunhyeok Park (POSTECH); Sungjoo Yoo (Seoul National University) |
6513 |
AdaNeRF: Adaptive Sampling for Real-time Rendering of Neural Radiance Fields |
Andreas Kurz (Graz University of Technology)*; Thomas Neff (Graz University of Technology); Zhaoyang Lv (Facebook); Michael Zollhöfer (Facebook Reality Labs); Markus Steinberger (Graz University of Technology) |
6516 |
Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion |
Zian Wang (University of Toronto)*; Wenzheng Chen (University of Toronto); David Acuna (University of Toronto, NVIDIA); Jan Kautz (NVIDIA); Sanja Fidler (University of Toronto, NVIDIA) |
6519 |
Tree Structure-Aware Few-Shot Image Classification via Hierarchical Aggregation |
Min Zhang (Zhejiang University)*; Siteng Huang (Westlake University); Wenbin Li (Nanjing University); Donglin Wang (Westlake University) |
6526 |
PoseScript: 3D Human Poses from Natural Language |
Ginger Delmas (NAVER LABS EUROPE)*; Philippe Weinzaepfel (NAVER LABS Europe); Thomas LUCAS (Naver); Francesc Moreno (IRI); Gregory Rogez (NAVER LABS Europe) |
6532 |
Learning Energy-Based Models With Adversarial Training |
Xuwang Yin (University of Virginia)*; Shiying Li (University of North Carolina, Chapel Hill); Gustavo Rohde (University of Virginia) |
6538 |
You Already Have It: A Generator-Free Low-Precision DNN Training Framework using Stochastic Rounding |
Geng Yuan (Northeastern University)*; Sung-En Chang (Northeastern University); Qing Jin (Northeastern University); Alec Lu (Simon Fraser University); Yanyu Li (Northeastern University); Yushu Wu (Northeastern University); Zhenglun Kong (Northeastern University); Yanyue Xie (Northeastern University); Peiyan Dong (Northeastern University); Minghai Qin (Western Digital Research); Xiaolong Ma (Clemson University); Xulong Tang (University of Pittsburgh); Zhenman Fang (Simon Fraser University); Yanzhi Wang (Northeastern University) |
6540 |
TIPS: Text-Induced Pose Synthesis |
Prasun Roy (University of Technology Sydney)*; Subhankar Ghosh (University of Technology Sydney); Saumik Bhattacharya (Indian Institute of Technology Kharagpur); Umapada Pal (Indian Statistical Institute, Kolkata); Michael Blumenstein (University of Technology Sydney) |
6541 |
Unsupervised High-Fidelity Facial Texture Generation and Reconstruction |
Ron Slossberg (Technion)*; Ibrahim Jubran (The University of Haifa); Ron Kimmel (Technion) |
6551 |
Addressing Heterogeneity in Federated Learning via Distributional Transformation |
Haolin Yuan (Johns Hopkins University); Bo Hui (Johns Hopkins University); Yuchen Yang (Johns Hopkins University); Philippe Burlina (JHU/APL/CS/SOM); Neil Zhenqiang Gong (Duke University); Yinzhi Cao (JHU)* |
6555 |
Adversarial Label Poisoning Attack on Graph Neural Networks via Label Propagation |
Ganlin Liu (The University of Liverpool)*; Xiaowei Huang (Liverpool University); Xinping Yi (University of Liverpool) |
6559 |
Approximate Discrete Optimal Transport Plan with Auxiliary Measure Method |
Dongsheng An (Stony Brook University)*; Na Lei (Dalian University of Technology); Xianfeng GU (Stony Brook University) |
6560 |
Visual Knowledge Tracing |
Neehar Kondapaneni (Caltech)*; Pietro Perona (California Institute of Technology); Oisin Mac Aodha (University of Edinburgh) |
6562 |
Semi-Leak: Membership Inference Attacks Against Semi-supervised Learning |
Xinlei He (CISPA Helmholtz Center for Information Security)*; Hongbin Liu (Duke University); Neil Zhenqiang Gong (Duke University); Yang Zhang (CISPA Helmholtz Center for Information Security) |
6565 |
DProST: Dynamic Projective Spatial Transformer Network for 6D Pose Estimation |
Jaewoo Park (Seoul National University); Nam Ik Cho (Seoul National University)* |
6567 |
Accurate Detection of Proteins in Cryo-Electron Tomograms from Sparse Labels |
Qinwen Huang (Duke University)*; Alberto Bartesaghi (Duke University); Ye Zhou (Duke University); Hsuan-Fu Liu (Duke University) |
6576 |
Subspace Diffusion Generative Models |
Bowen Jing (Massachusetts Institute of Technology)*; Gabriele Corso (MIT); Renato Berlinghieri (MIT); Tommi Jaakkola (MIT) |
6583 |
Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features |
Byeonghu Na (KAIST); Yoonsik Kim (Clova AI Research, NAVER Corp.); Sungrae Park (Upstage AI Research, Upstage AI)* |
6592 |
Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments |
Khoi D. Nguyen (VinAI Research)*; Quoc-Huy Tran (Retrocausal, Inc.); Khoi Nguyen (VinAI Research); Binh-Son Hua (VinAI Research); Rang NGUYEN (VinAI Research) |
6599 |
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection |
Kyle Min (Intel Labs); Sourya Roy (University of California, Riverside); Subarna Tripathi (Intel Labs)*; Tanaya Guha (University of Glasgow); Somdeb Majumdar (Intel Labs) |
6602 |
Relative Contrastive Loss for Unsupervised Representation Learning |
Shixiang Tang (The University of Sydney)*; Feng Zhu (University of Science and Technology of China); Lei Bai (Shanghai AI Laboratory); Rui Zhao (SenseTime Group Limited); Wanli Ouyang (The University of Sydney) |
6615 |
Personalized Education: Blind Knowledge Distillation |
Xiang Deng (State University of New York at Binghamton)*; Jian Zheng (Amazon); Zhongfei Zhang (Binghamton University) |
6619 |
Fast Two-View Motion Segmentation Using Christoffel Polynomials |
Bengisu Ozbay (Northeastern University); Octavia Camps (Northeastern University); Mario Sznaier (Northeastern University)* |
6623 |
Real Spike: Learning Real-valued Spikes for Spiking Neural Networks |
Yufei Guo (The Second Academy of China Aerospace Science and Industry Corporation)*; Liwen Zhang (X Lab, the Second Academy of CASIC, Beijing); Yuanpei Chen (X Lab, The Second Academy of CASIC, Beijing); Xinyi Tong (The Second Academy of China Aerospace Science and Industry Corporation); Xiaode Liu (X Lab, The Second Academy of China Aerospace Science and Industry Corporation); YingLei Wang (CASIC); Xuhui Huang (X Lab, The Second Academy of CASIC); Zhe Ma (X Lab, the Second Academy of CASIC, Beijing) |
6627 |
Language-Driven Artistic Style Transfer |
Tsu-Jui Fu (UCSB)*; Xin Eric Wang (University of California, Santa Cruz); William Yang Wang (UC Santa Barbara) |
6634 |
FedLTN: Federated Learning for Sparse and Personalized Lottery Ticket Networks |
Vaikkunth Mugunthan (DynamoFL)*; Eric Lin (DynamoFL); Vignesh Gokul (University of California San Diego); Christian Lau (DynamoFL); Lalana Kagal (MIT); Steve Pieper (Isomics, Inc.) |
6639 |
Transformer with Implicit Edges for Particle-based Physics Simulation |
Yidi Shao (Nanyang Technological University)*; Chen Change Loy (Nanyang Technological University); Bo Dai (Shanghai AI Lab) |
6651 |
Improving the Perceptual Quality of 2D Animation Interpolation |
Shuhong Chen (University of Maryland – College Park)*; Matthias Zwicker (University of Maryland) |
6652 |
Towards Open-vocabulary Scene Graph Generation with Prompt-based Finetuning |
Tao He (Monash University); Lianli Gao (The University of Electronic Science and Technology of China); Jingkuan Song (UESTC); Yuan-Fang Li (Monash University)* |
6655 |
S3C: Self-Supervised Stochastic Classifiers for Few-Shot Class-Incremental Learning |
Jayateja Kalla (Indian Institute of Science); Soma Biswas (Indian Institute of Science, Bangalore)* |
6660 |
Entry-Flipped Transformer for Inference and Prediction of Participant Behavior |
BO HU (Nanyang Technological University)*; Tat-Jen Cham (Nanyang Technological University) |
6665 |
OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning |
Mamshad Nayeem Rizve (University of Central Florida)*; Navid Kardan (University of Central Florida); Salman Khan (MBZUAI/ANU); Fahad Shahbaz Khan (MBZUAI); Mubarak Shah (University of Central Florida) |
6666 |
Fine-grained Fashion Representation Learning by Online Deep Clustering |
Yang Jiao (Amazon)*; Ning Xie (Amazon); Yan Gao (Amazon); Chien-Chih Wang (Amazon); Yi Sun (Amazon) |
6667 |
Perspective Phase Angle Model for Polarimetric 3D Reconstruction |
Guangcheng Chen (Guangdong University of Technology)*; Li He (Southern University of Science and Technology); Yisheng Guan (Guangdong University of Technology); Hong Zhang (University of Alberta) |
6670 |
Selective TransHDR: Transformer-based selective HDR Imaging using Ghost Region Mask |
Jou Won Song (Sogang University); Ye-In Park (Sogang University); Kyeongbo Kong (Pukyong National University); Jaeho Kwak (Sogang University); Suk-Ju Kang (Sogang University)* |
6671 |
3D Interacting Hand Pose Estimation by Hand De-occlusion and Removal |
Hao Meng (Beihang University); Sheng Jin (The University of Hong Kong)*; Wentao Liu (SenseTime); Chen Qian (SenseTime); Mengxiang Lin (Beihang University); Wanli Ouyang (The University of Sydney); Ping Luo (The University of Hong Kong) |
6678 |
Recover Fair Deep Classification Models via Altering Pre-trained Structure |
Yanfu Zhang (University of Pittsburgh)*; Shangqian Gao (University of Pittsburgh); Heng Huang (University of Pittsburgh) |
6680 |
Improving Fine-Grained Visual Recognition in Low Data Regimes via Self-Boosting Attention Mechanism |
Yangyang Shu (University of Adelaide); Lingqiao Liu (University of Adelaide)*; Baosheng Yu (The University of Sydney); Haiming Xu (The University of Adelaide) |
6686 |
VSA: Learning Varied-Size Window Attention in Vision Transformers |
Qiming Zhang (The University of Sydney)*; Yufei Xu (University of Sydney); Jing Zhang (The University of Sydney); Dacheng Tao (JD.com) |
6693 |
PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting |
Thomas LUCAS (Naver)*; Fabien Baradel (Naver Labs Europe); Philippe Weinzaepfel (NAVER LABS Europe); Gregory Rogez (NAVER LABS Europe) |
6694 |
CAViT: Contextual Alignment Vision Transformer for Video Object Re-identification |
Jinlin Wu (Institute of Automation, Chinese Academy of Sciences, Beijing, China)*; He Lingxiao (NLPR, CRIPAC); Wu Liu (AI Research of JD.com); Yang Yang (Institute of Automation, Chinese Academy of Sciences); Zhen Lei (NLPR, CASIA, China); Tao Mei (AI Research of JD.com); Stan Z. Li (Westlake University) |
6698 |
Learning Series-Parallel Lookup Tables for Efficient Image Super-Resolution |
Cheng Ma (Tsinghua University); Jingyi Zhang (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)* |
6715 |
Frozen CLIP Models are Efficient Video Learners |
Ziyi Lin (The Chinese University of Hong Kong)*; Shijie Geng (Rutgers University); Renrui Zhang (Shanghai AI Lab); Peng Gao (Chinese University of Hong Kong); Gerard de Melo (Hasso Plattner Institute); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Jifeng Dai (SenseTime); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Hongsheng Li (The Chinese University of Hong Kong) |
6719 |
Deforming Radiance Fields with Cages |
Tianhan Xu (The University of Tokyo)*; Tatsuya Harada (The University of Tokyo / RIKEN) |
6720 |
GeoAug: Data Augmentation for Few-Shot NeRF with Geometry Constraints |
Di Chen (Alibaba Group)*; Yu Liu (Alibaba Group); Lianghua Huang (Alibaba Group); Bin Wang (Alibaba Group); Pan Pan (Alibaba Group) |
6722 |
DoodleFormer: Creative Sketch Drawing with Transformers |
Ankan Kumar Bhunia (MBZUAI)*; Salman Khan (MBZUAI/ANU); Hisham Cholakkal (MBZUAI); Rao Muhammad Anwer (MBZUAI/AALTO); Fahad Shahbaz Khan (MBZUAI); Jorma Laaksonen (Aalto University); Michael Felsberg (Linköping University) |
6727 |
Implicit Neural Representations for Variable Length Human Motion Generation |
Pablo Alberto Cervantes Baque (Tokyo Institute of Technology)*; Yusuke Sekikawa (Denso IT Laboratory); Ikuro Sato (Tokyo Institute of Technology / Denso IT Laboratory); Koichi SHINODA (Tokyo Institute of Technology) |
6730 |
FLEX: Extrinsic Parameters-free Multi-view 3D Human Motion Reconstruction |
Brian Gordon (Tel Aviv University); Sigal Raab (Tel Aviv University)*; Guy Azov (Tel Aviv University); Raja Giryes (Tel Aviv University); Danny Cohen-Or (Tel Aviv University) |
6731 |
Pairwise Contrastive Learning Network for Action Quality Assessment |
Mingzhe Li (Huaqiao University); Hong-Bo Zhang (Huaqiao University)*; Qing Lei (Huaqiao University); Zongwen Fan (Huaqiao University); Jinghua Liu (Huaqiao University); Ji-Xiang Du (Huaqiao University) |
6742 |
Large-displacement 3D Object Tracking with Hybrid Non-local Optimization |
Xuhui Tian (Shandong University)*; Xinran Lin (Shandong University); Fan Zhong (Shandong University); Xueying Qin (Shandong University) |
6745 |
Learning Object Placement via Dual-path Graph Completion |
Siyuan Zhou (Shanghai Jiao Tong University)*; Liu Liu (Shanghai Jiao Tong University); Li Niu (Shanghai Jiao Tong University); Liqing Zhang (Shanghai Jiao Tong University) |
6777 |
Unbiased Manifold Augmentation for Coarse Class Subdivision |
Baoming Yan (Alibaba Group)*; KE GAO (alibaba-inc); Bo Gao (Alibaba Group); Lin Wang (Alibaba-inc); Jiang Yang (Alibaba Group); Xiaobo Li (Alibaba) |
6798 |
Rethinking Video Rain Streak Removal: A New Synthesis Model and A Deraining Network with Video Rain Prior |
Shuai Wang (College of Intelligence and Computing, Tianjin University); Lei Zhu (The Hong Kong University of Science and Technology (Guangzhou))*; Huazhu Fu (IHPC, ASTAR); Jing Qin (The Hong Kong Polytechnic University); Carola-Bibiane B Schönlieb (Cambridge University); Wei Feng (School of Computer Science and Technology, Tianjin University); Song Wang (University of South Carolina) |
6817 |
Expanded Adaptive Scaling Normalization for End to End Image Compression |
Chajin Shin (Yonsei University)*; Hyeongmin Lee (Yonsei University); Hanbin Son (Yonsei Univ.); Sangjin Lee (Yonsei University); Dogyoon Lee (Yonsei University); Sangyoun Lee (Yonsei University) |
6827 |
Embedding contrastive unsupervised features to cluster in- and out-of-distribution noise in corrupted image datasets |
Paul Albert (Insight Centre for Data Analytics (DCU))*; Eric Arazo (Insight Centre for Data Analytics (DCU)); Noel O Connor (Home); Kevin McGuinness (DCU) |
6835 |
Filter Pruning via Feature Discrimination in Deep Neural Networks |
Zhiqiang He (Zhejiang University of Science and Technology)*; Yaguan QIAN (Zhejiang University of Science and Technology); Yuqi Wang (Zhejiang University of Science and Technology); Bin WANG (Network and Information Security Laboratory of Hangzhou Hikvision Digital Technology Co.); Xiaohui Guan (Zhejiang University of Water Resources and Electric Power); Zhaoquan Gu (Guangzhou University); Xiang Ling (Institute of Software, Chinese Academy of Sciences); Shaoning Zeng (Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China); Haijiang Wang (Zhejiang University of Science and Technology); Wujie Zhou (Zhejiang University of Science and Technology) |
6836 |
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer |
Juan Felipe Montesinos (Universitat Pompeu Fabra)*; Venkatesh Shenoy Kadandale (Universitat Pompeu Fabra); Gloria Haro (Universitat Pompeu Fabra) |
6837 |
SGBANet: Semantic GAN and Balanced Attention Network for Arbitrarily Oriented Scene Text Recognition |
Dajian Zhong (East China Normal University)*; Shujing Lv (East China Normal University); Palaiahnakote Shivakumara (University of Malaya); Bing Yin (IFLYTEK Co.,Ltd); Jiajia Wu (IFLYTEK Co.,Ltd); Umapada Pal (Indian Statistical Institute, Kolkata); Yue Lu (East China Normal University) |
6838 |
DenseHybrid: Hybrid Anomaly Detection for Dense Open-set Recognition |
Matej Grcić (University of Zagreb, Faculty of Electrical Engineering and Computing)*; Petra Bevandić (Faculty of Electrical Engineering and Computing); Sinisa Segvic (UniZg-FER) |
6862 |
D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights |
Yuzhen Zhang (Zhengzhou University); Wentong Wang (Zhengzhou University); Weizhi Guo (Zhengzhou University); Pei Lv (Zhengzhou University)*; Mingliang Xu (Zhengzhou University); Wei Chen (State Key Lab of CAD&CG, Zhejiang University); Dinesh Manocha (University of Maryland at College Park) |
6867 |
Where in the World is this Image? Transformer-based Geo-localization in the Wild |
Shraman Pramanick (Johns Hopkins University)*; Ewa M Nowara (Meta Reality Labs); Joshua Gleason (Univ of Maryland); Carlos Castillo (Johns Hopkins University); Rama Chellappa (Johns Hopkins University) |
6884 |
MODE: Multi-view Omnidirectional Depth Estimation with 360-degree Cameras |
Ming Li (Nanjing University)*; Xueqian Jin (Nanjing University); Xuejiao Hu (Nanjing University); Jingzhao Dai (Nanjing University); Sidan Du (Nanjing University); Yang Li (Nanjing University) |
6895 |
NashAE: Disentangling Representations through Adversarial Covariance Minimization |
Eric C Yeats (Duke University)*; Frank Liu (Oak Ridge National Lab); David Womble (Oak Ridge National Laboratory); Hai Li (Duke University) |
6900 |
Rethinking Confidence Calibration for Failure Prediction |
Fei Zhu (Institute of Automation of Chinese Academy of Sciences)*; Zhen Cheng (Institute of Automation of Chinese Academy of Sciences); Xu-Yao Zhang (Institute of Automation of Chinese Academy of Sciences); Cheng-Lin Liu (Institute of Automation of Chinese Academy of Sciences) |
6905 |
Colorization for in situ marine plankton images |
Guannan Guo (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences); Qi Lin (Xiamen University); Tao Chen (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences); Zhenghui Feng (Harbin Institute of Technology, Shenzhen); Zheng Wang (Shenzhen Institutes of Advanced Technology); Jianping Li (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences)* |
6912 |
PIP: Physical Interaction Prediction via Mental Simulation with Span Selection |
Jiafei Duan (University of Washington, Seattle)*; Samson Yu (Agency for Science, Technology and Research); Soujanya Poria (Singapore University of Technology and Design); Bihan Wen (Nanyang Technological University); Cheston Tan (Institute for Infocomm Research, Singapore) |
6917 |
Generator Knows What Discriminator Should Learn in Unconditional GANs |
Gayoung Lee (NAVER AI Lab)*; Hyunsu Kim (NAVER AI Lab); Junho Kim (NAVER AI Lab); Seonghyeon Kim (Clova AI Research, NAVER Corp.); Jung-Woo Ha (NAVER CLOVA AI Lab); Yunjey Choi (NAVER AI Lab) |
6921 |
A Gyrovector Space Approach for Symmetric Positive Semi-definite Matrix Learning |
Xuan Son Nguyen (Ensea)* |
6940 |
Compositional Visual Generation with Composable Diffusion Models |
Nan Liu (University of Illinois at Urbana-Champaign); Shuang Li (MIT); Yilun Du (MIT)*; Antonio Torralba (MIT); Joshua Tenenbaum (MIT) |
6942 |
Temporal and cross-modal attention for audio-visual zero-shot learning |
Otniel-Bogdan Mercea (University of Tübingen)*; Thomas Hummel (University of Tübingen); A. Sophia Koepke (University of Tübingen); Zeynep Akata (University of Tübingen) |
6946 |
Telepresence Video Quality Assessment |
Zhenqiang Ying (The University of Texas at Austin)*; Deepti Ghadiyaram (Facebook); Alan Bovik (University of Texas at Austin) |
6955 |
Enhancing Multi-modal Features Using Local Self-attention for 3D Object Detection |
Hao Li (Hikvision Digital Technology Co. Ltd)*; Zehan Zhang (Shanghai Jiao Tong University & Hangzhou Hikvision Digital Technology Co. Ltd); Zhao Xian (Hikvision); Yulong Wang (Hikvision Digital Technology Co. Ltd); Yuxi Shen (Hikvision); Shiliang Pu (Hikvision Research Institute); Hui Mao (Hangzhou Hikvision Digital Technology Co., Ltd) |
6956 |
Totems: Physical Objects for Verifying Visual Integrity |
Jingwei Ma (University of Washington)*; Lucy Chai (MIT); Minyoung Huh (MIT); Tongzhou Wang (MIT); Ser-Nam Lim (Meta AI); Phillip Isola (MIT); Antonio Torralba (MIT) |
6959 |
ManiFest: manifold deformation for few-shot image translation |
Fabio Pizzati (Inria / Vislab)*; Jean-Francois Lalonde (Université Laval); Raoul de Charette (Inria) |
6963 |
3D Shape Sequence of Human Comparison and Classification using Current and Varifolds |
Emery Pierson (Université de Lille)*; Mohamed Daoudi (IMT Lille Douai); Sylvain Arguillere (Institute Camille Jordan) |
6971 |
Decouple-and-Sample: Protecting sensitive information in task agnostic data release |
Abhishek Singh (MIT)*; Ethan Garza (MIT); Ayush Chopra (MIT); Praneeth Vepakomma (MIT); Vivek Sharma (MIT); Ramesh Raskar (Massachusetts Institute of Technology) |
6972 |
Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space |
Wenqi Shao (The Chinese University of Hong Kong)*; Xun Zhao (Tencent Company); Yixiao Ge (Tencent); Zhaoyang Zhang (The Chinese University of Hong Kong); Lei Yang (Tencent); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Ying Shan (Tencent); Ping Luo (The University of Hong Kong) |
6973 |
Object Detection as Probabilistic Set Prediction |
Georg Hess (Chalmers University of Technology)*; Christoffer Petersson (Zenseact); Lennart Svensson (Chalmers University of Technology) |
6974 |
k-SALSA: k-anonymous synthetic averaging of retinal images via local style alignment |
Minkyu Jeon (Korea University)*; Hyeonjin Park (Korea University); Hyunwoo J Kim (Korea University); Michael G Morley (Ophthalmic Consultants of Boston); Hyunghoon Cho (Broad Institute of MIT and Harvard) |
6976 |
Uncertainty-guided Source-free Domain Adaptation |
Subhankar Roy (University of Trento)*; Martin Trapp (Aalto University); Andrea Pilzer (Aalto University); Juho Kannala (Aalto University, Finland); Nicu Sebe (University of Trento); Elisa Ricci (University of Trento); Arno Solin (Aalto University) |
6978 |
LA3: Efficient Label-Aware AutoAugment |
Mingjun Zhao (University of Alberta)*; Shan Lu (University of Alberta); Zixuan Wang (Tencent Inc.); Xiaoli Wang (Tencent); Di Niu (University of Alberta) |
6982 |
Weakly-Supervised Temporal Action Detection for Fine-Grained Videos with Hierarchical Atomic Actions |
Zhi Li (University of California, Berkeley)*; Lu He (Tencent America); Huijuan Xu (Pennsylvania State University) |
6986 |
Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos |
Tanqiu Qiao (Durham University); Qianhui Men (University of Oxford); Frederick W. B. Li (University of Durham); Yoshiki Kubotani (Waseda University); Shigeo Morishima (Waseda Research Institute for Science and Engineering); Hubert P. H. Shum (Durham University)* |
6990 |
FEAR: Fast, Efficient, Accurate and Robust Visual Tracker |
Vasyl Borsuk (Ukrainian Catholic University); Roman Vei (Ukrainian Catholic University); Orest Kupyn (Ukrainian Catholic University); Tetiana Martyniuk (Ukrainian Catholic University)*; Igor Krashenyi (Piñata Farms); Jiri Matas (CMP CTU FEE) |
6997 |
Variance-Aware Weight Initialization for Point Convolutional Neural Networks |
Pedro Hermosilla Casajus (Ulm University)*; Michael Schelling (Ulm University – Institute of Media Informatics); Tobias Ritschel (UCL); Timo Ropinski (Ulm University) |
7004 |
Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training |
Haoxuan You (Columbia University)*; Luowei Zhou (Microsoft); Bin Xiao (Microsoft); Noel C Codella (Microsoft); Yu Cheng (Microsoft Research); Ruochen Xu (Microsoft); Shih-Fu Chang (Columbia University); Lu Yuan (Microsoft) |
7016 |
Single-Stream Multi-Level Alignment for Vision-Language Pretraining |
Zaid Khan (Northeastern University)*; Vijay Kumar B G (NEC Laboratories America); Xiang Yu (NEC Labs); Samuel Schulter (NEC Laboratories America); Manmohan Chandraker (UC San Diego); YUN FU (Northeastern University) |
7022 |
Revisiting Outer Optimization in Adversarial Training |
Ali Dabouei (West Virginia University)*; Fariborz Taherkhani (Carnegie Mellon University); Sobhan Soleymani (West Virginia University); Nasser Nasrabadi (West Virginia University) |
7027 |
Supervised Attribute Information Removal and Reconstruction for Image Manipulation |
Nannan Li (Boston University)*; Bryan Plummer (Boston University) |
7028 |
Conditional-Flow NeRF: Accurate 3D Modelling with Reliable Uncertainty Quantification |
Jianxiong Shen (IRI, CSIC-UPC)*; Antonio Agudo (Institut de Robotica i Informatica Industrial, CSIC-UPC); Francesc Moreno (IRI); Adria Ruiz (Seedtag) |
7035 |
BLT: Bidirectional Layout Transformer for Controllable Layout Generation |
Xiang Kong (Carnegie Mellon University)*; Lu Jiang (Google Research); Huiwen Chang (Google); Han Zhang (Google); Yuan Hao (Google); Haifeng Gong (Google Inc.); Irfan Essa (Google) |
7039 |
Neural Correspondence Field for Object Pose Estimation |
Lin Huang (University at Buffalo); Tomas Hodan (Facebook Reality Labs)*; Lingni Ma (Facebook Reality Labs); Linguang Zhang (Facebook Reality Labs); Luan Tran (Facebook); Christopher D Twigg (Meta); PO-CHEN WU (Meta Inc.); Junsong Yuan (State University of New York at Buffalo, USA); Cem Keskin (Facebook); Robert Wang (Facebook Reality Labs) |
7043 |
The Missing Link: Finding label relations across datasets |
Jasper Uijlings (Google Research)*; Thomas Mensink (Google Research); Vittorio Ferrari (Google Research) |
7044 |
On Label Granularity and Object Localization |
Elijah Cole (Caltech)*; Kimberly Wilber (Google); Grant Van Horn (Cornell University); Xuan Yang (Google); Marco Fornoni (Google); Pietro Perona (California Institute of Technology); Serge Belongie (University of Copenhagen); Andrew Howard (Google); Oisin Mac Aodha (University of Edinburgh) |
7045 |
RadioTransformer: A Cascaded Global-Focal Transformer for Visual Attention-guided Disease Classification |
Moinak Bhattacharya (Stony Brook University)*; Shubham Jain (Stony Brook University); Prateek Prasanna (Stony Brook University) |
7048 |
OIMNet++: Prototypical Normalization and Localization-aware Learning for Person Search |
Sanghoon Lee (Yonsei University); Youngmin Oh (Yonsei University); Donghyeon Baek (Yonsei University); Junghyup Lee (Yonsei University); Bumsub Ham (Yonsei University)* |
7050 |
Most and Least Retrievable Images in Visual-Language Query Systems |
Liuwan Zhu (Old Dominion University)*; Rui Ning (Old Dominion University); Jiang Li (Old Dominion University); Chunsheng Xin (Old Dominion University); Hongyi Wu (University of Arizona) |
7051 |
Contrasting quadratic assignments for set-based representation learning |
Artem Moskalev (University of Amsterdam)*; Ivan Sosnovik (University of Amsterdam); Volker Fischer (Bosch Center for Artificial Intelligence); Arnold W.M. Smeulders (University of Amsterdam) |
7061 |
How stable are Transferability Metrics evaluations? |
Andrea Agostinelli (Google)*; Michal Pandy (University of Cambridge); Jasper Uijlings (Google Research); Thomas Mensink (Google Research); Vittorio Ferrari (Google Research) |
7070 |
A Comparative Study of Graph Matching Algorithms in Computer Vision |
Stefan Haller (Heidelberg University)*; Lorenz Feineis (Heidelberg University); Lisa Hutschenreiter (Heidelberg University); Florian Bernard (University of Bonn); Carsten Rother (University of Heidelberg); Dagmar Kainmueller (MDC); Paul Swoboda (MPI fuer Informatik, Saarbruecken); Bogdan Savchynskyy (Heidelberg University) |
7077 |
HM: Hybrid Masking for Few-Shot Segmentation |
Seonghyeon Moon (Rutgers University)*; Samuel S Sohn (Rutgers University); Honglu Zhou (Rutgers University); Sejong Yoon (The College of New Jersey); Vladimir Pavlovic (Rutgers University); Muhammad Haris Khan (Muhammad Bin Zayed University of Artificial Intelligence); Mubbasir Kapadia (Rutgers) |
7082 |
UCTNet: Uncertainty-aware Cross-modal Transformer Network for Indoor RGB-D Semantic Segmentation |
Xiaowen Ying (Lehigh University)*; Mooi Choo Chuah (Lehigh University) |
7090 |
Learning Omnidirectional Flow in 360° Video via Siamese Representation |
Keshav Bhandari (Texas State University)*; Bin Duan (Illinois Institute of Technology); Gaowen Liu (Cisco Research); Hugo M Latapie (Cisco); Ziliang Zong (Texas State University); Yan Yan (Illinois Institute of Technology) |
7093 |
Improving Generalization in Federated Learning by Seeking Flat Minima |
Debora Caldarola (Politecnico di Torino)*; Barbara Caputo (Politecnico di Torino); Marco Ciccone (Politecnico di Torino) |
7099 |
Efficient Deep Visual and Inertial Odometry with Adaptive Visual Modality Selection |
Mingyu Yang (University of Michigan)*; Yu Chen (University of Michigan); Hun Seok Kim (Nil) |
7102 |
MultiMAE: Multi-modal Multi-task Masked Autoencoders |
Roman Bachmann (EPFL)*; David Mizrahi (EPFL); Andrei Atanov (EPFL); Amir Zamir (Swiss Federal Institute of Technology (EPFL)) |
7110 |
GigaDepth: Learning Depth from Structured Light with Branching Neural Networks |
Simon Schreiberhuber (TU Wien)*; Jean-Baptiste Weibel (TU Wien); Timothy Patten (University of Technology Sydney); Markus Vincze (TU Wien) |
7122 |
Diverse Generation from a Single Video Made Possible |
Niv Haim (Weizmann Institute of Science)*; Ben Feinstein (Weizmann Institute of Science); Niv Granot (Weizmann Institute of Science); Assaf Shocher (Weizmann Institute of Science); Shai Bagon (Weizmann Institute of Science); Tali Dekel (Weizmann Institute of Science); Michal Irani (Weizmann Institute, Israel) |
7127 |
Privacy-Preserving Action Recognition via Motion Difference Quantization |
Sudhakar Kumawat (Osaka University)*; Hajime Nagahara (Osaka University) |
7139 |
Learning Phase Mask for Privacy-Preserving Passive Depth Estimation |
Zaid Tasneem (Rice University); Giovanni Milione (4 independence Way, Princeton, NJ 08540); Yi-Hsuan Tsai (Phiar Technologies); Xiang Yu (NEC Labs); Ashok Veeraraghavan (Rice University); Manmohan Chandraker (UC San Diego); Francesco Pittaluga (NEC Laboratories America)* |
7143 |
DuelGAN: A Duel Between Two Discriminators Stabilizes the GAN Training |
Jiaheng Wei (UCSC)*; Minghao Liu (UCSC); Jiahao Luo (UCSC); Andrew Zhu (UCSC); James E Davis (UC Santa Cruz); Yang Liu (UC Santa Cruz) |
7151 |
Should All Proposals be Treated Equally in Object Detection? |
Yunsheng Li (UCSD)*; Yinpeng Chen (Microsoft); Xiyang Dai (Microsoft); DongDong Chen (Microsoft Cloud AI); Mengchen Liu (Microsoft); Pei Yu (); Ying Jin (Microsoft); Lu Yuan (Microsoft); Zicheng Liu (Microsoft); Nuno Vasconcelos (UC San Diego) |
7153 |
Interpretations Steered Network Pruning via Amortized Inferred Saliency Maps |
Alireza Ganjdanesh (University of Pittsburgh); Shangqian Gao (University of Pittsburgh); Heng Huang (University of Pittsburgh)* |
7158 |
Out-of-Distribution Identification: Let Detector Tell Which I Am Not Sure |
Ruoqi Li (SJTU); Chongyang Zhang (Shanghai Jiao Tong University)*; Hao Zhou (Shanghai Jiao Tong University); Chao Shi (Shanghai Jiao Tong University); Yan Luo (Shanghai Jiao Tong University) |
7167 |
Unsupervised Few-Shot Image Classification by Learning Features into Clustering Space |
Shuo Li (Xidian University); Fang Liu (Xidian University)*; Zehua Hao (Xidian University); Kaibo Zhao (Xidian University); Licheng Jiao (Xidian University) |
7173 |
ViP: Unified Certified Detection and Recovery for Patch Attack with Vision Transformers |
Junbo Li (UC Santa Cruz); Huan Zhang (UCLA); Cihang Xie (University of California, Santa Cruz)* |
7174 |
Panoramic Vision Transformer for Saliency Detection in 360 Videos |
Heeseung Yun (Seoul National University)*; Sehun Lee (Seoul National University); Gunhee Kim (Seoul National University) |
7175 |
ActiveNeRF: Learning where to See with Uncertainty Estimation |
Xuran Pan (Tsinghua University); Zihang Lai (CMU); Shiji Song (Department of Automation, Tsinghua University); Gao Huang (Tsinghua)* |
7176 |
incDFM: Incremental Deep Feature Modeling for Continual Novelty Detection |
Amanda S Rios (University of Southern California; Intel)*; Nilesh A Ahuja (Intel); Ibrahima Ndiour (Intel); Ergin U Genc (Intel); Laurent Itti (University of Southern California); Omesh Tickoo (Intel) |
7186 |
BA-Net: Bridge Attention for Deep Convolutional Neural Networks |
Yue Zhao (Sun Yat-sen University); Junzhou Chen (Sun Yat-sen University)*; Zhang Zirui (Sun Yat-sen University); Ronghui Zhang (Sun Yat-Sen University) |
7199 |
Super-Resolution by Predicting Offsets: An Ultra-Efficient Super-Resolution Network for Rasterized Images |
Jinjin Gu (The University of Sydney)*; Haoming CAI (University of Maryland, College Park); Chenyu Dong (Graduate School at Shenzhen, Tsinghua University); Ruofan Zhang (Tsinghua University); Yulun Zhang (ETH Zurich); Wenming Yang (Tsinghua University); Chun Yuan (Graduate School at Shenzhen, Tsinghua University) |
7210 |
Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance |
Zhihang Zhong (The University of Tokyo); Xiao Sun (Microsoft Research Asia); Zhirong Wu (Microsoft Research); Yinqiang Zheng (The University of Tokyo); Stephen Lin (Microsoft Research)*; Imari Sato (National Institute of Informatics) |
7211 |
Zero-Shot Attribute Attacks on Fine-Grained Recognition Models |
Nasim Shafiee (Northeastern University)*; Ehsan Elhamifar (Northeastern University) |
7214 |
Break and Make: Interactive Structural Understanding Using LEGO Bricks |
Aaron T Walsman (University of Washington)*; Muru Zhang (University of Washington); Klemen Kotar (Allen Institute for AI); Karthik Desingh (University of Washington); Dieter Fox (NVIDIA Research / University of Washington); Ali Farhadi (University of Washington, Allen Institute for AI, Apple) |
7218 |
PoserNet: Refining Relative Camera Poses Exploiting Object Detections |
Matteo Taiana (Istituto Italiano di Tecnologia)*; Matteo Toso (Istituto Italiano di Tecnologia); Stuart James (Istituto Italiano di Tecnologia (IIT)); Alessio Del Bue (Istituto Italiano di Tecnologia (IIT)) |
7224 |
Towards Effective and Robust Neural Trojan Defenses via Input Filtering |
Kien Duc Do (Deakin University)*; Haripriya Harikumar (Deakin University); Hung Le (Deakin University); Dung Nguyen (Deakin University); Truyen Tran (Deakin University); Santu Rana (Deakin University, Australia); Dang Nguyen (Deakin University); Willy Susilo (University of Wollongong); Svetha Venkatesh (Deakin University) |
7230 |
View Vertically: A Hierarchical Network for Trajectory Prediction via Fourier Spectrums |
Conghao Wong (Huazhong University of Science and Technology); Beihao Xia (Huazhong University of Science and Technology); Ziming Hong (Huazhong University of Science and Technology); Qinmu Peng (Huazhong University of Science and Technology); Wei Yuan (Huazhong University of Science and Technology); Qiong Cao (JD.com); Yibo Yang (Peking University); Xinge YOU (Huazhong University of Science and Technology)* |
7238 |
Bi-directional Contrastive Learning for Domain Adaptive Semantic Segmentation |
Geon Lee (Yonsei University); Chanho Eom (Yonsei University); Wonkyung Lee (PS Analytics); Hyekang Park (Yonsei University); Bumsub Ham (Yonsei University)* |
7277 |
Rayleigh EigenDirections (REDs): Nonlinear GAN latent space traversals for multidimensional features |
Guha Balakrishnan (Rice University)*; Raghudeep Gadde (Amazon); Aleix M Martinez (Amazon); Pietro Perona (Amazon Web Services (AWS)) |
7278 |
ActionFormer: Localizing Moments of Actions with Transformers |
Chen-Lin Zhang (4Paradigm, Inc); Jianxin Wu (Nanjing University); Yin Li (University of Wisconsin-Madison)* |
7281 |
Theoretical Understanding of the Information Flow on Continual Learning Performance |
Joshua J Andle (University of Maine); Salimeh Yasaei Sekeh (University of Maine)* |
7283 |
3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching |
Runyu Mao (Purdue University)*; Chen Bai (Xpeng Motors); yatong an (xm); Fengqing Maggie Zhu (Purdue University, USA); Cheng Lu (Xiaopeng) |
7288 |
Pure Transformer with Integrated Experts for Scene Text Recognition |
Yew Lee Tan (Nanyang Technological University)*; Wai-Kin Adams Kong (Nanyang Technological University); Jung Jae Kim (I2R) |
7301 |
AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation |
Efthymios Tzinis (University of Illinois at Urbana-Champaign); Scott Wisdom (Google)*; Tal Remez (Google); John Hershey (Google) |
7304 |
Bridging the Domain Gap towards Generalization in Automatic Colorization |
Hyejin Lee (Kookmin University); Daehee Kim (Naver Corp.); Daeun Lee (Korea university); Jinkyu Kim (Korea University); Jaekoo Lee (Kookmin University)* |
7311 |
Learning with Free Object Segments for Long-Tailed Instance Segmentation |
Cheng Zhang (Carnegie Mellon University)*; Tai-Yu Pan (The Ohio State University); tianle chen (The Ohio State University); Jike Zhong (The Ohio State University); Wenjin Fu (The Ohio State University); Wei-Lun Chao (The Ohio State University) |
7315 |
Rethinking Closed-loop Training for Autonomous Driving |
Chris Zhang (Waabi / University of Toronto)*; Runsheng Guo (University of Waterloo); Wenyuan Zeng (Waabi, University of Toronto); Yuwen Xiong (University of Toronto); Binbin Dai (Waabi); Rui Hu (Waabi); Mengye Ren (NYU / Google); Raquel Urtasun (Uber ATG) |
7331 |
Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction |
YuXuan Liu (Covariant.ai, UC Berkeley)*; Nikhil Mishra (Covariant.ai, UC Berkeley); Maximilian Sieb (Covariant.ai); Fred Shentu (UC Berkeley); Pieter Abbeel (UC Berkeley); Peter Chen (COVARIANT.AI) |
7337 |
Learning Regional Purity for Instance Segmentation on 3D Point Clouds |
Shichao Dong (Nanyang Technological University)*; Guosheng Lin (Nanyang Technological University); Tzu-Yi HUNG (Delta Research Center) |
7346 |
Learning from Unlabeled 3D Environments for Vision-and-Language Navigation |
Shizhe Chen (INRIA)*; Pierre-Louis Guhur (Inria); Makarand Tapaswi (Wadhwani AI, IIIT Hyderabad); Cordelia Schmid (Inria/Google); Ivan Laptev (INRIA Paris) |
7350 |
A Dataset Generation Framework for Evaluating Megapixel Image Classifiers & their Explanations |
Gautam B Machiraju (Stanford University)*; Sylvia Plevritis (Stanford University); Parag Mallick (Stanford University) |
7351 |
Sports Video Analysis on Large-Scale Data |
Dekun Wu (University of Pittsburgh)*; He Zhao (York University); Xingce Bao (EPFL); Rick Wildes (York University) |
7368 |
Audio-Visual Segmentation |
Jinxing Zhou (Hefei University of Technology); Jianyuan Wang (Chinese University of Hong Kong); Jiayi Zhang (BeiHang University); Weixuan Sun (Australian National University); Jing Zhang (Australian National University); Stan Birchfield (NVIDIA); Dan Guo (Hefei University of Technology); Lingpeng Kong (The University of Hong Kong); Meng Wang (Hefei University of Technology); Yiran Zhong (Australian National University)* |
7374 |
SLiDE: Self-supervised LiDAR De-snowing through Reconstruction Difficulty |
Gwangtak Bae (Seoul National University)*; Byungjun Kim (Seoul National University); Seongyong Ahn (Agency for Defense Development); jihong Min (Agency for Defense Development); Inwook Shim (Inha University) |
7375 |
On the Angular Update and Hyperparameter Tuning of a Scale-Invariant Network |
Juseung Yun (KAIST)*; Janghyeon Lee (LG AI Research); Hyounguk Shon (KAIST); Eojindl Yi (KAIST); Seung Hwan Kim (LG AI Research); Junmo Kim (KAIST) |
7384 |
IGFormer: Interaction Graph Transformer for Skeleton-based Human Interaction Recognition |
Yunsheng Pang (University of Melbourne)*; Qiuhong Ke (Monash University); Hossein Rahmani (Lancaster University); James Bailey (THE UNIVERSITY OF MELBOURNE); Jun Liu (Singapore University of Technology and Design) |
7385 |
LANA: Latency Aware Network Acceleration |
Pavlo Molchanov (NVIDIA)*; James B Hall (Microsoft Research); Hongxu Yin (NVIDIA); Nicolo Fusi (Microsoft Research); Jan Kautz (NVIDIA); Arash Vahdat (NVIDIA) |
7388 |
A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch |
Patsorn Sangkloy (Georgia Institute of Technology)*; Wittawat Jitkrittum (Google Research); Diyi Yang (Georgia Institute of Technology); James Hays (Georgia Institute of Technology, USA) |
7396 |
HVC-Net: Unifying Homography, Visibility, and Confidence Learning for Planar Object Tracking |
Haoxian Zhang (Tencent)*; Yonggen Ling (Tencent) |
7417 |
3D Random Occlusion and Multi-Layer Projection for Deep Multi-Camera Pedestrian Localization |
Rui Qiu (Xi’an Jiaotong-Liverpool University, University of Liverpool); Ming Xu (Xi’an Jiaotong-Liverpool University)*; Yuyao Yan (Xi’an Jiaotong-Liverpool University); Jeremy S Smith (University of Liverpool); Xi Yang (Xi’an Jiaotong-Liverpool University) |
7427 |
Masked Siamese Networks for Label-Efficient Learning |
Mahmoud Assran (Facebook AI)*; Mathilde Caron (Facebook Artificial Intelligence Research); Ishan Misra (Facebook AI Research); Piotr Bojanowski (Facebook); Florian Bordes (MILA); Pascal Vincent (Facebook FAIR & MILA Université de Montréal); Armand Joulin (Facebook AI Research); Mike Rabbat (Facebook FAIR); Nicolas Ballas (Facebook FAIR) |
7441 |
A Simple Single-Scale Vision Transformer for Object Detection and Instance Segmentation |
Wuyang Chen (University of Texas at Austin)*; Xianzhi Du (Google Brain); Fan Yang (Google); Lucas Beyer (Google Brain); Xiaohua Zhai (Google Brain); Tsung-Yi Lin (Google Brain); Huizhong Chen (Google); Jing Li (Google Brain); Xiaodan Song (Google Brain); Zhangyang Wang (University of Texas at Austin); Denny Zhou (Google Brain) |
7443 |
A Cloud 3D Dataset and Application-Specific Learned Image Compression in Cloud 3D |
Tianyi Liu (The University of Texas at San Antonio)*; Sen He (The University of Texas at San Antonio); Vinodh Kumaran Jayakumar (UTSA); Wei Wang (The University of Texas at San Antonio) |
7449 |
Cross-Domain Few-Shot Semantic Segmentation |
Shuo Lei (Virginia Tech)*; Xuchao Zhang (NEC Labs America); Jianfeng He (Virginia Tech); Fanglan Chen (Virginia Tech); Bowen Du (Beihang University); Chang-Tien Lu (Virginia Tech, USA) |
7450 |
VizWiz-FewShot: Locating Objects in Images Taken by People With Visual Impairments |
Yu-Yun Tseng (University of Colorado Boulder)*; Alexander Bell (IVC Group); Danna Gurari (University of Colorado Boulder) |
7474 |
Towards Metrical Reconstruction of Human Faces |
Wojciech Zielonka (Max Planck Institute for Intelligent Systems); Timo Bolkart (Max Planck Institute for Intelligent Systems); Justus Thies (Max Planck Institute for Intelligent Systems)* |
7476 |
DeepShadow: Neural Shape from Shadow |
Asaf Karnieli (Reichman University)*; Yacov Hel-Or (The Interdisciplinary Center); Ohad Fried (IDC Herzliya) |
7500 |
Class-Incremental Learning with Cross-Space Clustering and Controlled Transfer |
Arjun Ashok (Indian Institute of Technology, Hyderabad)*; Joseph K J (Indian Institute of Technology, Hyderabad); Vineeth N Balasubramanian (Indian Institute of Technology, Hyderabad) |
7509 |
Object discovery and representation networks |
Olivier Henaff (DeepMind)*; Skanda Koppula (DeepMind); Evan Shelhamer (DeepMind); Daniel Zoran (DeepMind); Andrew Jaegle (DeepMind); Andrew Zisserman (Oxford University); Joao Carreira (DeepMind); Relja Arandjelović (DeepMind) |
7511 |
MeshUDF: Fast and Differentiable Meshing of Unsigned Distance Field Networks |
Benoit Guillard (EPFL)*; Federico Stella (EPFL); Pascal Fua (EPFL, Switzerland) |
7519 |
Natural Synthetic Anomalies for Self-Supervised Anomaly Detection and Localization |
Hannah M Schlueter (Imperial College London)*; Jeremy Tan (Imperial College London); Benjamin Hou (Imperial College London); Bernhard Kainz (Imperial College London, FAU Erlangen-Nürnberg) |
7522 |
Shap-CAM: Visual Explanations for Convolutional Neural Networks based on Shapley Value |
Quan Zheng (Tsinghua University); Ziwei Wang (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)* |
7529 |
Simple Open-Vocabulary Object Detection with Vision Transformers |
Matthias Minderer (Google Research)*; Alexey Gritsenko (Google Brain); Austin C Stone (Google); Maxim Neumann (Google); Dirk Weißenborn (German Research Center for Artificial Intelligence); Alexey Dosovitskiy (Inceptive); Aravindh Mahendran (Google); Anurag Arnab (Google); Mostafa Dehghani (Google Brain); Zhuoran Shen (Pony.ai); Xiao Wang (Google); Xiaohua Zhai (Google Brain); Thomas Kipf (Google Brain); Neil Houlsby (Google) |
7533 |
Video Restoration Framework and its Meta-adaptations to Data-poor Conditions |
Prashant W Patil (Deakin University)*; Sunil Gupta (Deakin University, Australia); Santu Rana (Deakin University, Australia); Svetha Venkatesh (Deakin University) |
7539 |
PRIME: A Few Primitives Can Boost Robustness to Common Corruptions |
Apostolos Modas (EPFL)*; Rahul Shekhar Rade (EthonAI); Guillermo Ortiz-Jimenez (EPFL); Seyed-Mohsen Moosavi-Dezfooli (Imperial College London); Pascal Frossard (EPFL) |
7541 |
AlphaVC: High-Performance and Efficient Learned Video Compression |
Yibo Shi (Huawei); Yunying Ge (Huawei Technologies); Jing Wang (Huawei)*; Jue Mao (Huawei technologies) |
7542 |
Content-Oriented Learned Image Compression |
Meng Li (Huawei); Shangyin Gao (Huawei); Yihui Feng (HUAWEI Technology Co., Ltd); Yibo Shi (Huawei); Jing Wang (Huawei)* |
7543 |
Generating Natural Images with Direct Patch Distributions Matching |
Ariel Elnekave (Hebrew University of Jerusalem)*; Yair Weiss (Hebrew University) |
7545 |
Latent Space Smoothing for Individually Fair Representations |
Momchil Peychev (ETH Zurich)*; Anian Ruoss (DeepMind); Mislav Balunovic (ETH Zurich); Maximilian Baader (ETH Zürich); Martin Vechev (ETH Zurich) |
7555 |
SAU: Smooth activation function using convolution with approximate identities |
Koushik Biswas (Indraprastha Institute of Information Technology, New Delhi, India)*; Sandeep Kumar (Shaheed Bhagat Singh College, University of Delhi, Delhi); Shilpak Banerjee (Indian Institute of Technology Tirupati); Ashish Kumar Pandey (Indraprastha Institute of Information Technology, New Delhi, India) |
7561 |
TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments |
Shubham Dokania (IIIT Hyderabad)*; Anbumani Subramanian (IIIT-Hyderabad); Manmohan Chandraker (UC San Diego); C.V. Jawahar (IIIT-Hyderabad) |
7562 |
Motion Sensitive Contrastive Learning for Self-supervised Video Representation |
JingCheng Ni (Beihang University)*; Nan Zhou (Beihang University); Jie Qin (Nanjing University of Aeronautics and Astronautics); Qian Wu (Megvii); Junqi Liu (Megvii); Boxun Li (Megvii Inc.); Di Huang (Beihang University, China) |
7573 |
Scaling Adversarial Training to Large Perturbation Bounds |
Sravanti Addepalli (Indian Institute of Science)*; Samyak Jain (Indian Institute of Technology (BHU), Varanasi); Gaurang Sriramanan (University of Maryland, College Park); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science) |
7592 |
RDO-Q: Extremely Fine-Grained Channel-Wise Quantization via Rate-Distortion Optimization |
Zhe Wang (Institute for Infocomm Research, Singapore)*; Jie Lin (Institute for Infocomm Research (I2R), Singapore); Xue Geng (I2R, A*STAR); Mohamed M. Sabry Aly (Nanyang Technological University); Vijay R. Chandrasekhar (Institute for Infocomm Research) |
7605 |
Camera Auto-calibration from the Steiner Conic of the Fundamental Matrix |
Yu LIU (United International College, BNU-HKBU)*; Hui Zhang (UIC) |
7626 |
Understanding Collapse in Non-Contrastive Siamese Representation Learning |
Alexander C Li (Carnegie Mellon University)*; Alexei A Efros (UC Berkeley); Deepak Pathak (Carnegie Mellon University) |
7634 |
AutoTransition: Learning to Recommend Video Transition Effects |
Yaojie Shen (Institute of Software, Chinese Academy of Sciences); Libo Zhang (Institute of Software Chinese Academy of Sciences); Kai Xu (ByteDance Inc); Xiaojie Jin (Bytedance Inc. USA)* |
7651 |
SPE-Net: Boosting Point Cloud Analysis via Rotation Robustness Enhancement |
Zhaofan Qiu (JD.com); Yehao Li (JD AI Research); Yu Wang (JD AI Research); Yingwei Pan (JD AI Research); Ting Yao (JD AI Research)*; Tao Mei (AI Research of JD.com) |
7667 |
Text-based Temporal Localization of Novel Events |
Sudipta Paul (University of California, Riverside)*; Niluthpol C Mithun (SRI International); Amit K. Roy-Chowdhury (University of California, Riverside) |
7687 |
Effective Presentation Attack Detection Driven by Face Related Task |
Wentian Zhang (Shenzhen University); Haozhe Liu (King Abdullah University of Science and Technology); Feng Liu (Shenzhen University)*; Raghavendra Ramachandra (NTNU, Norway); Christoph Busch (Norwegian University of Science and Technology) |
7691 |
LWGNet – Learned Wirtinger Gradients for Fourier Ptychographic Phase Retrieval |
Atreyee Saha (Indian Institute of Technology Madras)*; Salman Siddique Khan (IIT Madras); Sagar Sehrawat (IIT Madras); Sanjana S Prabhu (Indian Institute of Technology Madras); Shanti Bhattacharya (IIT Madras); Kaushik Mitra (IIT Madras) |
7693 |
Federated Self-supervised Learning for Video Understanding |
Yasar Rehman (TCL Corporate Research (Hong Kong) Co. Ltd); Yan Gao (University of Cambridge)*; Jiajun Shen (TCL Research); Pedro Gusmao (University of Cambridge); Nicholas Lane (University of Cambridge and Samsung AI) |
7694 |
Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval |
Zhaopeng Dou (Tsinghua University)*; Zhongdao Wang (Tsinghua University); Weihua Chen (alibaba group); Ya-Li Li (Tsinghua University); Shengjin Wang (Tsinghua University) |
7704 |
The Shape Part Slot Machine: Contact-based Reasoning for Generating 3D Shapes from Parts |
Kai Wang (Brown University)*; Paul Guerrero (Adobe); Vladimir Kim (Adobe); Siddhartha Chaudhuri (Adobe Research); Minhyuk Sung (KAIST); Daniel Ritchie (Brown University) |
7710 |
Attention Diversification for Domain Generalization |
Rang Meng (Hikvision Research Institute)*; Xianfeng Li (Hikvision Research Institute); Weijie Chen (Zhejiang University); Shicai Yang (Hikvision Research Institute); Jie Song (Zhejiang University); Xinchao Wang (National University of Singapore); Lei Zhang (Chongqing University); Mingli Song (Zhejiang University); Di Xie (Hikvision Research Institute); Shiliang Pu (Hikvision Research Institute) |
7718 |
Exploiting the local parabolic landscapes of adversarial losses to accelerate black-box adversarial attack |
Hoang Tran (Oak Ridge National Laboratory); Dan Lu (Oak Ridge National Laboratory); Guannan Zhang (Oak Ridge National Laboratory)* |
7719 |
Towards Efficient and Effective Self-Supervised Learning of Visual Representations |
Sravanti Addepalli (Indian Institute of Science)*; Kaushal Bhogale (Indian Institute of Technology, Madras); Priyam Dey (Indian Institute of Science); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science) |
7722 |
TransVLAD: Focusing on Locally Aggregated Descriptors for Few-Shot Learning |
Haoquan Li (Southern University of Science and Technology)*; Laoming Zhang (Southern University of Science and Technology); Daoan Zhang (Southern University of Science and Technology); Lang Fu (Southern University of Science and Technology); Peng Yang (Southern University of Science and Technology); Jianguo Zhang (Southern University of Science and Technology) |
7735 |
Rotation Regularization Without Rotation |
Takumi Kobayashi (National Institute of Advanced Industrial Science and Technology)* |
7741 |
Parameterized Temperature Scaling for Boosting the Expressive Power in Post-Hoc Uncertainty Calibration |
Christian Tomani (TUM)*; Daniel Cremers (TU Munich); Florian Buettner (German Cancer Research Center and Frankfurt University) |
7746 |
FairStyle: Debiasing StyleGAN2 with Style Channel Manipulations |
Cemre Efe Karakas (Bogazici University); Alara Dirik (Bogazici University); Eylül Yalçınkaya (Bogazici University); Pinar Yanardag (Bogazici University)* |
7756 |
Dynamic Temporal Filtering in Video Models |
Fuchen Long (JD.com); Zhaofan Qiu (JD.com); Yingwei Pan (JD AI Research)*; Ting Yao (JD AI Research); Chong-Wah Ngo (Singapore Management University); Tao Mei (AI Research of JD.com) |
7764 |
DH-AUG: DH Forward Kinematics Model Driven Augmentation for 3D Human Pose Estimation |
linzhi huang (Beijing University of Posts and Telecommunications)*; Jiahao Liang (Beijing University of Posts and Telecommunications); Weihong Deng (Beijing University of Posts and Telecommunications) |
7765 |
Super-resolution 3D Human Shape from a Single Low-Resolution Image |
Marco Pesavento (University of Surrey)*; Marco Volino (University of Surrey); Adrian Hilton (University of Surrey) |
7771 |
Trading Positional Complexity vs Deepness in Coordinate Networks |
Jianqiao Zheng (University of Adelaide)*; Sameera Ramasinghe (University of Adelaide); Xueqian Li (Carnegie Mellon University); Simon Lucey (University of Adelaide) |
7785 |
ESS: Learning Event-based Semantic Segmentation from Still Images |
Zhaoning Sun (ETH Zürich); Nico Messikommer (University of Zurich & ETH Zurich)*; Daniel Gehrig (University of Zurich & ETH Zurich); Davide Scaramuzza (University of Zurich & ETH Zurich, Switzerland) |
7802 |
U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture Search |
Ahmet Yüzügüler (EPFL)*; Nikolaos Dimitriadis (EPFL); Pascal Frossard (EPFL) |
7803 |
MonteBoxFinder: Detecting and Filtering Primitives to Fit a Noisy Point Cloud |
Michaël Ramamonjisoa (Ecole des Ponts)*; Sinisa Stekovic (Graz University of Technology); Vincent Lepetit (Ecole des Ponts ParisTech) |
7815 |
Trapped in texture bias? A large scale comparison of deep instance segmentation |
Johannes Theodoridis (Hochschule der Medien Stuttgart)*; Jessica Hofmann (Hochschule der Medien); Johannes Maucher (Media University Stuttgart); Andreas G Schilling (University of Tübingen) |
7845 |
MVDG: A Unified Multi-view Framework for Domain Generalization |
Jian Zhang (Nanjing University)*; Lei Qi (Southeast University); Yinghuan Shi (Nanjing University); Yang Gao (Nanjing University) |
7847 |
MINER: Multiscale Implicit Neural Representation |
Vishwanath Saragadam (Rice University)*; Jasper T Tan (Rice University); Guha Balakrishnan (Rice University); Richard Baraniuk (Rice University); Ashok Veeraraghavan (Rice University) |
7856 |
PTQ4ViT: Post-Training Quantization for Vision Transformers with Twin Uniform Quantization |
Zhihang Yuan (Peking University)*; Chenhao Xue (Peking University); Yiqi Chen (Peking University); Qiang Wu (HOUMO.AI); Guangyu Sun (Peking University) |
7865 |
Context-Consistent Semantic Image Editing with Style-Preserved Modulation |
Wuyang Luo (School of Computer Science, Fudan University); Su Yang (School of Computer Science, Fudan University)*; Hong Wang (School of Computer Science, Fudan University); Bo Long (School of Computer Science, Fudan University); Weishan Zhang (Department of Software Engineering, China University of Petroleum) |
7874 |
Distilling the Undistillable: Learning from a Nasty Teacher |
Surgan Jandial (MDSR Labs, Adobe)*; Yash Khasbage (Indian Institute of Technology, Hyderabad); Arghya Pal (Harvard University); Vineeth N Balasubramanian (Indian Institute of Technology, Hyderabad); Balaji Krishnamurthy () |
7879 |
Grounding Visual Representations with Texts for Domain Generalization |
Seonwoo Min (LG AI Research)*; Nokyung Park (Korea University); Siwon Kim (Seoul National University); Seunghyun Park (Clova AI Research, NAVER Corp.); Jinkyu Kim (Korea University) |
7883 |
Towards Accurate Open-Set Recognition via Background-Class Regularization |
Wonwoo Cho (Korea Advanced Institute of Science and Technology)*; Jaegul Choo (Korea Advanced Institute of Science and Technology) |
7899 |
In Defense of Image Pre-Training for Spatiotemporal Recognition |
Xianhang Li (University of California, Santa Cruz)*; Huiyu Wang (JHU); Chen Wei (Johns Hopkins University); Jieru Mei (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Yuyin Zhou (UC Santa Cruz); Cihang Xie (University of California, Santa Cruz) |
7925 |
SocialVAE: Human Trajectory Prediction using Timewise Latents |
Pei Xu (Clemson University)*; Jean-Bernard Hayet (CIMAT); Ioannis Karamouzas (Clemson University) |
7926 |
BodySLAM: Joint Camera Localisation, Mapping, and Human Motion Tracking |
Dorian F Henning (Imperial College London)*; Tristan Laidlow (Imperial College London); Stefan Leutenegger (TU Munich) |
7935 |
Eliminating Gradient Conflict in Reference-based Line-Art Colorization |
zekun li (University of Electronic Science and Technology of China)*; Zhengyang Geng (Peking University); Zhao Kang (University of Electronic Science and Technology of China); Wenyu Chen (University of Electronic Science and Technology of China); Yibo Yang (Peking University) |
7950 |
Transfer without Forgetting |
Matteo Boschini (University of Modena and Reggio Emilia)*; Lorenzo Bonicelli (University of Modena and Reggio Emilia); Angelo Porrello (University of Modena and Reggio Emilia); Giovanni Bellitto (University of Catania); Matteo Pennisi (University of Catania); Simone Palazzo (University of Catania); Concetto Spampinato (University of Catania); SIMONE CALDERARA (University of Modena and Reggio Emilia, Italy) |
7955 |
DSR — A dual subspace re-projection network for surface anomaly detection |
Vitjan Zavrtanik (University of Ljubljana)*; Matej Kristan (University of Ljubljana); Danijel Skocaj (University of Ljubljana) |
7964 |
Multi-Exit Semantic Segmentation Networks |
Alexandros Kouris (Imperial College London and Samsung AI)*; Stylianos Venieris (Samsung AI); Stefanos Laskaridis (Samsung AI); Nicholas Lane (University of Cambridge and Samsung AI) |
7968 |
Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks |
Bernd Prach (IST Austria)*; Christoph H Lampert (IST Austria) |
8001 |
Bridging the visual semantic gap in VLN via semantically richer instructions |
Joaquín Ignacio Ossandón (Universidad Catolica de Chile)*; Benjamín Earle (Universidad Católica de Chile); Alvaro Soto (Universidad Catolica de Chile) |
8003 |
Kernel Relative-prototype Spectral Filtering for Few-shot Learning |
Tao Zhang (Chengdu Techman Software Co., Ltd.)*; Wu Huang (Sichuan University) |
8009 |
StoryDALL-E: Adapting Pretrained Text-to-image Transformers for Story Continuation |
Adyasha Maharana (UNC Chapel Hill)*; Darryl Hannan (University of North Carolina at Chapel Hill); Mohit Bansal (University of North Carolina at Chapel Hill) |
8026 |
Unsupervised Learning of Efficient Geometry-Aware Neural Articulated Representations |
Atsuhiro Noguchi (The University of Tokyo)*; Xiao Sun (Microsoft Research Asia); Stephen Lin (Microsoft Research); Tatsuya Harada (The University of Tokyo / RIKEN) |
8029 |
PANDORA: Polarization-Aided Neural Decomposition Of Radiance |
Akshat Dave (Rice University)*; Yongyi Zhao (Rice University); Ashok Veeraraghavan (Rice University) |
8042 |
OCR-free Document Understanding Transformer |
Geewook Kim (NAVER Corporation)*; Teakgyu Hong (Upstage AI); Moonbin Yim (Clova AI Research, NAVER Corp.); Jeongyeon Nam (Naver); Jinyoung Park (TmaxAI); Jinyeong Yim (Google); Wonseok Hwang (LBox); Sangdoo Yun (NAVER AI LAB); Dongyoon Han (NAVER AI Lab); Seunghyun Park (Clova AI Research, NAVER Corp.) |
8048 |
VQGAN-CLIP: Open Domain Image Generation and Manipulation Using Natural Language |
Katherine B Crowson (EleutherAI); Stella R Biderman (Booz Allen Hamilton)*; daniel kornis (Eleuther.ai); Dashiell Stander (Eleuther AI); Eric Hallahan (EleutherAI); Louis J Castricato (Georgia Tech); Edward Raff (Booz Allen Hamilton) |
8063 |
Learning to use unlabeled data in data augmentation for 3D detection |
Zhaoqi Leng (Waymo)*; Shuyang Cheng (Waymo LLC); Ben Caine (Google); Weiyue Wang (Waymo); Xiao Zhang (Cruise); Jonathon Shlens (Google); Mingxing Tan (Waymo); Dragomir Anguelov (Waymo) |
8070 |
Differentiable Zooming for Multiple Instance Learning on Whole-Slide Images |
Kevin Thandiackal (ETH Zurich / IBM Research)*; Boqi Chen (ETH Zurich); Pushpak Pati (IBM Research Zurich); Guillaume Jaume (Harvard); Drew Williamson (Pathology, Brigham and Women’s Hospital, Harvard Medical School); Maria Gabrani (IBM Research); Orcun Goksel (ETH Zurich) |
8081 |
Towards Learning Neural Representations from Shadows |
Kushagra Tiwary (MIT)*; Tzofi M Klinghoffer (Massachusetts Institute of Technology); Ramesh Raskar (Massachusetts Institute of Technology) |
8086 |
Augmenting Deep Classifiers with Polynomial Neural Networks |
Grigorios Chrysos (EPFL)*; Markos Georgopoulos (Imperial College London); Jiankang Deng (Imperial College London); Jean Kossaifi (NVIDIA); Yannis Panagakis (University of Athens); Animashree Anandkumar (Caltech) |
8092 |
AdaBest: Minimizing Client Drift in Federated Learning via Adaptive Bias Estimation |
Farshid Varno (Dalhousie/Imagia)*; Marzie Saghayi (Dalhousie University); Laya Rafiee Sevyeri (Concordia); Sharut Gupta (MILA, Imagia, Indian Institute of Technology Delhi (IIT Delhi)); Stan Matwin (Dalhousie University); Mohammad Havaei (Imagia) |
8094 |
A Simple Approach and Benchmark for 21,000-Category Object Detection |
Yutong Lin (Xi’an Jiaotong University); Chen Li (Xi’an Jiaotong University); Yue Cao (Microsoft Research); Zheng Zhang (MSRA); Jianfeng Wang (Microsoft); Lijuan Wang (Microsoft); Zicheng Liu (Microsoft); Han Hu (Microsoft Research Asia)* |
8106 |
Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach |
Jiseok Youn (Seoul National University)*; Jaehun Song (Seoul National University); Hyung-Sin Kim (Seoul National University); Saewoong Bahk (Seoul National University) |
8140 |
Learning with Noisy Labels by Efficient Transition Matrix Estimation to Combat Label Miscorrection |
Seong Min Kye (KAIST); Kwanghee Choi (Sogang University); Joonyoung Yi (Hyperconnect); Buru Chang (Hyperconnect)* |
8170 |
Online Task-free Continual Learning with Dynamic Sparse Distributed Memory |
Julien Pourcel (ENSEA)*; Ngoc-Son Vu (ETIS/Université Paris Seine, Université Cergy-Pontoise, ENSEA, CNRS/ 95000-Cergy); Robert M FRENCH (CNRS) |