Sessions
Oral Sessions
1A | Learning for Vision 1, Monday, September 10 | Oral session 8:30 AM - 9:45 AM; Andrea Vedaldi, Oxford; Timothy Hospedales, University of Edinburgh |
---|---|---|
O-1A-01 | Convolutional Networks with Adaptive Computation Graphs | Andreas Veit*, Cornell University; Serge Belongie, Cornell University |
O-1A-02 | Progressive Neural Architecture Search | Chenxi Liu*, Johns Hopkins University; Maxim Neumann, Google; Barret Zoph, Google; Jon Shlens, Google; Wei Hua, Google; Li-Jia Li, Google; Li Fei-Fei, Stanford University; Alan Yuille, Johns Hopkins University; Jonathan Huang, Google; Kevin Murphy, Google |
O-1A-03 | Diverse Image-to-Image Translation via Disentangled Representations | Hsin-Ying Lee*, University of California, Merced; Hung-Yu Tseng, University of California, Merced; Maneesh Singh, Verisk Analytics; Jia-Bin Huang, Virginia Tech; Ming-Hsuan Yang, University of California at Merced |
O-1A-04 | Lifting Layers: Analysis and Applications | Michael Moeller*, University of Siegen; Peter Ochs, Saarland University; Tim Meinhardt, Technical University of Munich; Laura Leal-Taixé, TUM |
O-1A-05 | Learning with Biased Complementary Labels | Xiyu Yu*, The University of Sydney; Tongliang Liu, The University of Sydney; Mingming Gong, University of Pittsburgh; Dacheng Tao, University of Sydney |
Oral session 1B
1B | Computational Photography 1, Monday, September 10 | Oral session 1:00 PM - 2:15 PM; Jan-Michael Frahm, University of North Carolina at Chapel Hill; Gabriel Brostow, University College London |
---|---|---|
O-1B-01 | Light Structure from Pin Motion: Simple and Accurate Point Light Calibration for Physics-based Modeling | Hiroaki Santo*, Osaka University; Michael Waechter, Osaka University; Masaki Samejima, Osaka University; Yusuke Sugano, Osaka University; Yasuyuki Matsushita, Osaka University |
O-1B-02 | Programmable Light Curtains | Jian Wang*, Carnegie Mellon University; Joe Bartels, Carnegie Mellon University; William Whittaker, Carnegie Mellon University; Aswin Sankaranarayanan, Carnegie Mellon University; Srinivasa Narasimhan, Carnegie Mellon University |
O-1B-03 | Learning to Separate Object Sounds by Watching Unlabeled Video | Ruohan Gao*, University of Texas at Austin; Rogerio Feris, IBM Research; Kristen Grauman, University of Texas |
O-1B-04 | Coded Two-Bucket Cameras for Computer Vision | Mian Wei, University of Toronto; Navid Sarhangnejad, University of Toronto; Zhengfan Xia, University of Toronto; Nikola Katic, University of Toronto; Roman Genov, University of Toronto; Kyros Kutulakos*, University of Toronto |
O-1B-05 | Materials for Masses: SVBRDF Acquisition with a Single Mobile Phone Image | Zhengqin Li*, UC San Diego; Manmohan Chandraker, UC San Diego; Kalyan Sunkavalli, Adobe Research |
Oral session 1C
1C | Video, Monday, September 10 | Oral session 2:45 PM - 4:00 PM; Ivan Laptev, INRIA; Thomas Brox, University of Freiburg |
---|---|---|
O-1C-01 | End-to-End Joint Semantic Segmentation of Actors and Actions in Video | Jingwei Ji*, Stanford University; Shyamal Buch, Stanford University; Alvaro Soto, Universidad Catolica de Chile; Juan Carlos Niebles, Stanford University |
O-1C-02 | Learning-based Video Motion Magnification | Tae-Hyun Oh, MIT CSAIL; Ronnachai Jaroensri*, MIT CSAIL; Changil Kim, MIT CSAIL; Mohamed A. Elghareb, Qatar Computing Research Institute; Fredo Durand, MIT; Bill Freeman, MIT; Wojciech Matusik, MIT CSAIL |
O-1C-03 | Massively Parallel Video Networks | Viorica Patraucean*, DeepMind; Joao Carreira, DeepMind; Laurent Mazare, DeepMind; Simon Osindero, DeepMind; Andrew Zisserman, University of Oxford |
O-1C-04 | DeepWrinkles: Accurate and Realistic Clothing Modeling | Zorah Laehner, TU Munich; Tony Tung*, Facebook / Oculus Research; Daniel Cremers, TUM |
O-1C-05 | Learning Discriminative Video Representations Using Adversarial Perturbations | Jue Wang*, ANU; Anoop Cherian, MERL |
Oral session 2A
2A | Humans analysis 1, Tuesday, September 11 | Oral session 8:30 AM - 9:45 AM; Kris Kitani, Carnegie Mellon University; Tinne Tuytelaars, KU Leuven |
---|---|---|
O-2A-01 | Scaling Egocentric Vision: The E-Kitchens Dataset | Dima Damen*, University of Bristol; Hazel Doughty, University of Bristol; Sanja Fidler, University of Toronto; Antonino Furnari, University of Catania; Evangelos Kazakos, University of Bristol; Giovanni Farinella, University of Catania, Italy; Davide Moltisanti, University of Bristol; Jonathan Munro, University of Bristol; Toby Perrett, University of Bristol; Will Price, University of Bristol; Michael Wray, University of Bristol |
O-2A-02 | Unsupervised Person Re-identification by Deep Learning Tracklet Association | Minxian Li*, Nanjing University of Science and Technology; Xiatian Zhu, Queen Mary University, London, UK; Shaogang Gong, Queen Mary University of London |
O-2A-03 | Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition | Yifei Huang*, The University of Tokyo; Minjie Cai, Hunan University, The University of Tokyo; Zhenqiang Li, The University of Tokyo; Yoichi Sato, The University of Tokyo |
O-2A-04 | Instance-level Human Parsing via Part Grouping Network | Ke Gong*, SYSU; Xiaodan Liang, Carnegie Mellon University; Yicheng Li, Sun Yat-sen University; Yimin Chen, SenseTime; Liang Lin, Sun Yat-sen University |
O-2A-05 | Adversarial Geometry-Aware Human Motion Prediction | Liangyan Gui*, Carnegie Mellon University; Yu-Xiong Wang, Carnegie Mellon University; Xiaodan Liang, Carnegie Mellon University; José M. F. Moura, Carnegie Mellon University |
Oral session 2B
2B | Human Sensing I, Tuesday, September 11 | Oral session 1:00 PM - 2:15 PM; Mykhaylo Andriluka, Max Planck Institute; Pascal Fua, EPFL |
---|---|---|
O-2B-01 | Weakly-supervised 3D Hand Pose Estimation from Monocular RGB Images | Yujun Cai*, Nanyang Technological University; Liuhao Ge, NTU; Jianfei Cai, Nanyang Technological University; Junsong Yuan, State University of New York at Buffalo, USA |
O-2B-02 | Audio-Visual Scene Analysis with Self-Supervised Multisensory Features | Andrew Owens*, UC Berkeley; Alexei Efros, UC Berkeley |
O-2B-03 | Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input | David Harwath*, MIT CSAIL; Adria Recasens, Massachusetts Institute of Technology; Dídac Surís, Universitat Politecnica de Catalunya; Galen Chuang, MIT; Antonio Torralba, MIT; James Glass, MIT |
O-2B-04 | DeepIM: Deep Iterative Matching for 6D Pose Estimation | Yi Li*, Tsinghua University; Gu Wang, Tsinghua University; Xiangyang Ji, Tsinghua University; Yu Xiang, University of Michigan; Dieter Fox, University of Washington |
O-2B-05 | Implicit 3D Orientation Learning for 6D Object Detection from RGB Images | Martin Sundermeyer*, German Aerospace Center (DLR); Zoltan Marton, DLR; Maximilian Durner, DLR; Rudolph Triebel, German Aerospace Center (DLR) |
Oral session 2C
2C | Computational Photography 2, Tuesday, September 11 | Oral session 2:45 PM - 4:00 PM; Kyros Kutulakos, University of Toronto; Kalyan Sunkavalli, Adobe Research |
---|---|---|
O-2C-01 | Direct Sparse Odometry With Rolling Shutter | David Schubert*, Technical University of Munich; Vladyslav Usenko, TU Munich; Nikolaus Demmel, TUM; Joerg Stueckler, Technical University of Munich; Daniel Cremers, TUM |
O-2C-02 | 3D Motion Sensing from 4D Light Field Gradients | Sizhuo Ma*, University of Wisconsin-Madison; Brandon Smith, University of Wisconsin-Madison; Mohit Gupta, University of Wisconsin-Madison, USA |
O-2C-03 | A Style-aware Content Loss for Real-time HD Style Transfer | Artsiom Sanakoyeu*, Heidelberg University; Dmytro Kotovenko, Heidelberg University; Bjorn Ommer, Heidelberg University |
O-2C-04 | Scale-Awareness of Light Field Camera based Visual Odometry | Niclas Zeller*, Karlsruhe University of Applied Sciences; Franz Quint, Karlsruhe University of Applied Sciences; Uwe Stilla, Technische Universitaet Muenchen |
O-2C-05 | Burst Image Deblurring Using Permutation Invariant Convolutional Neural Networks | Miika Aittala*, MIT; Fredo Durand, MIT |
Oral session 3A
3A | Stereo and reconstruction, Wednesday, September 12 | Oral session 8:30 AM - 9:45 AM; Noah Snavely, Cornell University; Andreas Geiger, University of Tübingen |
---|---|---|
O-3A-01 | MVSNet: Depth Inference for Unstructured Multi-view Stereo | Yao Yao*, The Hong Kong University of Science and Technology; Zixin Luo, HKUST; Shiwei Li, HKUST; Tian Fang, HKUST; Long Quan, Hong Kong University of Science and Technology |
O-3A-02 | PlaneMatch: Patch Coplanarity Prediction for Robust RGB-D Registration | Yifei Shi, Princeton University; Kai Xu, Princeton University and National University of Defense Technology; Matthias Niessner, Technical University of Munich; Szymon Rusinkiewicz, Princeton University; Thomas Funkhouser*, Princeton, USA |
O-3A-03 | Active Stereo Net: End-to-End Self-Supervised Learning for Active Stereo Systems | Yinda Zhang*, Princeton University; Sean Fanello, Google; Sameh Khamis, Google; Christoph Rhemann, Google; Julien Valentin, Google; Adarsh Kowdle, Google; Vladimir Tankovich, Google; Shahram Izadi, Google; Thomas Funkhouser, Princeton, USA |
O-3A-04 | GAL: Geometric Adversarial Loss for Single-View 3D-Object Reconstruction | Li Jiang*, The Chinese University of Hong Kong; Xiaojuan Qi, CUHK; Shaoshuai SHI, The Chinese University of Hong Kong; Jia Jiaya, Chinese University of Hong Kong |
O-3A-05 | Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry | Nan Yang*, Technical University of Munich; Rui Wang, Technical University of Munich; Joerg Stueckler, Technical University of Munich; Daniel Cremers, TUM |
Oral session 3B
3B | Human Sensing II, Wednesday, September 12 | Oral session 1:00 PM - 2:15 PM; Gerard Pons-Moll, Max Planck Institute; Juergen Gall, University of Bonn |
---|---|---|
O-3B-01 | Unsupervised Geometry-Aware Representation for 3D Human Pose Estimation | Helge Rhodin*, EPFL; Mathieu Salzmann, EPFL; Pascal Fua, EPFL, Switzerland |
O-3B-02 | Dual-Agent Deep Reinforcement Learning for Deformable Face Tracking | Minghao Guo, Tsinghua University; Jiwen Lu*, Tsinghua University; Jie Zhou, Tsinghua University, China |
O-3B-03 | Deep Autoencoder for Combined Human Pose Estimation and Body Model Upscaling | Matthew Trumble*, University of Surrey; Andrew Gilbert, University of Surrey; John Collomosse, Adobe Research; Adrian Hilton, University of Surrey |
O-3B-04 | Occlusion-aware Hand Pose Estimation Using Hierarchical Mixture Density Network | Qi Ye*, Imperial College London; Tae-Kyun Kim, Imperial College London |
O-3B-05 | GANimation: Anatomically-aware Facial Animation from a Single Image | Albert Pumarola*, Institut de Robotica i Informatica Industrial; Antonio Agudo, Institut de Robotica i Informatica Industrial, CSIC-UPC; Aleix Martinez, The Ohio State University; Alberto Sanfeliu, Industrial Robotics Institute; Francesc Moreno, IRI |
Oral session 3C
3C | Optimization, Wednesday, September 12 | Oral session 4:00 PM - 5:15 PM; Vincent Lepetit, University of Bordeaux; Vladlen Koltun, Intel |
---|---|---|
O-3C-01 | Deterministic Consensus Maximization with Biconvex Programming | Zhipeng Cai*, The University of Adelaide; Tat-Jun Chin, University of Adelaide; Huu Le, University of Adelaide; David Suter, University of Adelaide |
O-3C-02 | Robust fitting in computer vision: easy or hard? | Tat-Jun Chin*, University of Adelaide; Zhipeng Cai, The University of Adelaide; Frank Neumann, The University of Adelaide, School of Computer Science, Faculty of Engineering, Computer and Mathematical Science |
O-3C-03 | Highly-Economized Multi-View Binary Compression for Scalable Image Clustering | Zheng Zhang*, Harbin Institute of Technology Shenzhen Graduate School; Li Liu, Inception Institute of Artificial Intelligence; Jie Qin, ETH Zurich; Fan Zhu, Inception Institute of Artificial Intelligence; Fumin Shen, UESTC; Yong Xu, Harbin Institute of Technology Shenzhen Graduate School; Ling Shao, Inception Institute of Artificial Intelligence; Heng Tao Shen, University of Electronic Science and Technology of China (UESTC) |
O-3C-04 | Efficient Semantic Scene Completion Network with Spatial Group Convolution | Jiahui Zhang*, Tsinghua University; Hao Zhao, Intel Labs China; Anbang Yao, Intel Labs China; Yurong Chen, Intel Labs China; Hongen Liao, Tsinghua University |
O-3C-05 | Asynchronous, Photometric Feature Tracking using Events and Frames | Daniel Gehrig, University of Zurich; Henri Rebecq*, University of Zurich; Guillermo Gallego, University of Zurich; Davide Scaramuzza, University of Zurich & ETH Zurich, Switzerland |
Oral session 4A
4A | Learning for Vision 2, Thursday, September 13 | Oral session 8:30 AM - 9:30 AM; Kyoung Mu Lee, Seoul National University; Michael Felsberg, Linköping University |
---|---|---|
O-4A-01 | Group Normalization | Yuxin Wu, Facebook; Kaiming He*, Facebook Inc., USA |
O-4A-02 | Deep Expander Networks: Efficient Deep Networks from Graph Theory | Ameya Prabhu*, IIIT Hyderabad; Girish Varma, IIIT Hyderabad; Anoop Namboodiri, IIIT Hyderabad |
O-4A-03 | Towards Realistic Predictors | Pei Wang*, UC San Diego; Nuno Vasconcelos, UC San Diego |
O-4A-04 | Learning SO(3) Equivariant Representations with Spherical CNNs | Carlos Esteves*, University of Pennsylvania; Kostas Daniilidis, University of Pennsylvania; Ameesh Makadia, Google Research; Christine Allen-Blanchette, University of Pennsylvania |
Oral session 4B
4B | Matching and Recognition, Thursday, September 13 | Oral session 1:00 PM - 2:15 PM; Ross Girshick, Facebook; Philipp Kraehenbuehl, University of Texas at Austin |
---|---|---|
O-4B-01 | CornerNet: Detecting Objects as Paired Keypoints | Hei Law*, University of Michigan; Jia Deng, University of Michigan |
O-4B-02 | RelocNet: Continuous Metric Learning Relocalisation using Neural Nets | Vassileios Balntas*, University of Oxford; Victor Prisacariu, University of Oxford; Shuda Li, University of Oxford |
O-4B-03 | The Contextual Loss for Image Transformation with Non-Aligned Data | Roey Mechrez*, Technion; Itamar Talmi, Technion; Lihi Zelnik-Manor, Technion |
O-4B-04 | Acquisition of Localization Confidence for Accurate Object Detection | Borui Jiang*, Peking University; Ruixuan Luo, Peking University; Jiayuan Mao, Tsinghua University; Tete Xiao, Peking University; Yuning Jiang, Megvii(Face++) Inc |
O-4B-05 | Deep Model-Based 6D Pose Refinement in RGB | Fabian Manhardt*, TU Munich; Wadim Kehl, Toyota Research Institute; Nassir Navab, Technische Universität München, Germany; Federico Tombari, Technical University of Munich, Germany |
Oral session 4C
4C | Video and attention, Thursday, September 13 | Oral session 2:45 PM - 4:00 PM; Hedvig Kjellström, KTH; Lihi Zelnik Manor, Technion |
---|---|---|
O-4C-01 | DeepTAM: Deep Tracking and Mapping | Huizhong Zhou*, University of Freiburg; Benjamin Ummenhofer, University of Freiburg; Thomas Brox, University of Freiburg |
O-4C-02 | ContextVP: Fully Context-Aware Video Prediction | Wonmin Byeon*, NVIDIA; Qin Wang, ETH Zurich; Rupesh Kumar Srivastava, NNAISENSE; Petros Koumoutsakos, ETH Zurich |
O-4C-03 | Saliency Benchmarking Made Easy: Separating Models, Maps and Metrics | Matthias Kümmerer*, University of Tübingen; Thomas Wallis, University of Tübingen; Matthias Bethge, University of Tübingen |
O-4C-04 | Museum Exhibit Identification Challenge for the Supervised Domain Adaptation. | Piotr Koniusz*, Data61/CSIRO, ANU; Yusuf Tas, Data61; Hongguang Zhang, Australian National University; Mehrtash Harandi, Monash University; Fatih Porikli, ANU; Rui Zhang, University of Canberra |
O-4C-05 | Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition | Ming Sun, Baidu; Yuchen Yuan, Baidu Inc.; Feng Zhou*, Baidu Research; Errui Ding, Baidu Inc. |
Poster Sessions
1A | Monday, September 10 | Poster Session 10:00 AM - 12:00 PM |
---|---|---|
P-1A-01 | ECO: Efficient Convolutional Network for Online Video Understanding | Mohammadreza Zolfaghari*, University of Freiburg; Kamaljeet Singh, University of Freiburg; Thomas Brox, University of Freiburg |
P-1A-02 | Learning to Anonymize Faces for Privacy Preserving Action Detection | Zhongzheng Ren*, University of California, Davis; Yong Jae Lee, University of California, Davis; Michael Ryoo, Indiana University |
P-1A-03 | Adversarial Open-World Person Re-Identification | Xiang Li, Sun Yat-sen University; Ancong Wu, Sun Yat-sen University; Jason Wei Shi Zheng*, Sun Yat-sen University |
P-1A-04 | Graph R-CNN for Scene Graph Generation | Jianwei Yang*, Georgia Institute of Technology; Jiasen Lu, Georgia Institute of Technology; Stefan Lee, Georgia Institute of Technology; Dhruv Batra, Georgia Tech & Facebook AI Research; Devi Parikh, Georgia Tech & Facebook AI Research |
P-1A-05 | Contemplating Visual Emotions: Understanding and Overcoming Dataset Bias | Rameswar Panda*, UC Riverside; Jianming Zhang, Adobe Research; Haoxiang Li, Adobe; Joon-Young Lee, Adobe Research; Xin Lu, Adobe; Amit Roy-Chowdhury, University of California, Riverside, USA |
P-1A-06 | Graph Adaptive Knowledge Transfer for Unsupervised Domain Adaptation | Zhengming Ding*, Northeastern University; Sheng Li, Adobe Research; Ming Shao, University of Massachusetts Dartmouth; YUN FU, Northeastern University |
P-1A-07 | Deep Recursive HDRI: Inverse Tone Mapping using Generative Adversarial Networks | Siyeong Lee, Sogang University; Gwon Hwan An, Sogang University; Suk-Ju Kang*, Nil |
P-1A-08 | Deep Cross-Modal Projection Learning for Image-Text Matching | Ying Zhang*, Dalian University of Technology; Huchuan Lu, Dalian University of Technology |
P-1A-09 | Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds | Haroon Idrees*, Carnegie Mellon University; Muhammad Tayyab, UCF; Kishan Athrey, UCF; Mubarak Shah, University of Central Florida; Dong Zhang, University of Central Florida, USA |
P-1A-10 | Person Search by Multi-Scale Matching | Xu Lan*, Queen Mary University of London; Xiatian Zhu, Queen Mary University, London, UK; Shaogang Gong, Queen Mary University of London |
P-1A-11 | Efficient 6-DoF Tracking of Handheld Objects from an Egocentric Viewpoint | Rohit Pandey, Google; Pavel Pidlypenskyi, Google; Shuoran Yang, Google; Christine Kaeser-Chen*, Google |
P-1A-12 | Deep Video Generation, Prediction and Completion of Human Action Sequences | Chunyan Bai, Hong Kong University of Science and Technology; Haoye Cai*, Hong Kong University of Science and Technology; Yu-Wing Tai, Tencent YouTu; Chi-Keung Tang, Hong Kong University of Science and Technology |
P-1A-13 | Efficient Uncertainty Estimation for Semantic Segmentation in Videos | Po-Yu Huang*, National Tsing Hua University; Wan-Ting Hsu, National Tsing Hua University; Chun-Yueh Chiu, National Tsing Hua University; Tingfan Wu, Umbo Computer Vision; Min Sun, NTHU |
P-1A-14 | DeepKSPD: Learning Kernel-matrix-based SPD Representation for Fine-grained Image Recognition | Melih Engin, University of Wollongong; Lei Wang*, University of Wollongong, Australia; Luping Zhou, University of Wollongong, Australia; Xinwang Liu, National University of Defense Technology |
P-1A-15 | From Face Recognition to Models of Identity: A Bayesian Approach to Learning about Unknown Identities from Unsupervised Data | Daniel Castro*, Imperial College London; Sebastian Nowozin, Microsoft Research Cambridge |
P-1A-16 | ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking | Oliver Groth*, Oxford Robotics Institute; Fabian Fuchs, Oxford Robotics Institute; Andrea Vedaldi, Oxford University; Ingmar Posner, Oxford |
P-1A-17 | Fast and Precise Camera Covariance Computation for Large 3D Reconstruction | Michal Polic*, Czech Technical University in Prague; Wolfgang Foerstner, University Bonn; Tomas Pajdla, Czech Technical University in Prague |
P-1A-18 | Inner Space Preserving Generative Pose Machine | Shuangjun Liu, Northeastern University; Sarah Ostadabbas*, Northeastern University |
P-1A-19 | CTAP: Complementary Temporal Action Proposal Generation | Jiyang Gao*, USC; Kan Chen, University of Southern California, USA; Ram Nevatia, U of Southern California |
P-1A-20 | Learning to Reenact Faces via Boundary Transfer | Wayne Wu, SenseTime Research; Yunxuan Zhang, SenseTime Research; Cheng Li*, SenseTime Research; Chen Qian, SenseTime; Chen Change Loy, Chinese University of Hong Kong |
P-1A-21 | Fast and Accurate Intrinsic Symmetry Detection | Rajendra Nagar*, Indian Institute of Technology Gandhinagar; Shanmuganathan Raman, IIT Gandhinagar |
P-1A-22 | Fictitious GAN: Training GANs with Historical Models | Yin Xia*, Northwestern University; Xu Chen, Northwestern University; Hao Ge, Northwestern University; Ying Wu, Northwestern University; Randall Berry, Northwestern University |
P-1A-23 | Audio-Visual Event Localization in Unconstrained Videos | Yapeng Tian*, University of Rochester; Jing Shi, University of Rochester; Bochen Li, University of Rochester; Zhiyao Duan, University of Rochester; Chenliang Xu, University of Rochester |
P-1A-24 | Tackling 3D ToF Artifacts Through Learning and the FLAT Dataset | Qi Guo, Harvard University; Iuri Frosio*, NVIDIA; Orazio Gallo, NVIDIA Research; Todd Zickler, Harvard University; Kautz Jan, NVIDIA |
P-1A-25 | Self-Calibrating Isometric Non-Rigid Structure-from-Motion | Shaifali Parashar*, CNRS; Adrien Bartoli, Université Clermont Auvergne; Daniel Pizarro, Universidad de Alcala |
P-1A-26 | Semi-Supervised Deep Learning with Memory | Yanbei Chen*, Queen Mary University of London; Xiatian Zhu, Queen Mary University, London, UK; Shaogang Gong, Queen Mary University of London |
P-1A-27 | Question-Guided Hybrid Convolution for Visual Question Answering | Gao Peng*, Chinese University of Hong Kong; Hongsheng Li, Chinese University of Hong Kong; Shuang Li, The Chinese University of Hong Kong; Pan Lu, Tsinghua University; Yikang LI, The Chinese University of Hong Kong; Steven Hoi, SMU; Xiaogang Wang, Chinese University of Hong Kong, Hong Kong |
P-1A-28 | Rolling Shutter Pose and Ego-motion Estimation using Shape-from-Template | Yizhen Lao*, Université Clermont Auvergne; Omar Ait-Aider, Université Clermont Auvergne; Adrien Bartoli, Université Clermont Auvergne |
P-1A-29 | Semi-Dense 3D Reconstruction with a Stereo Event Camera | Yi Zhou*, The Australian National University; Guillermo Gallego, University of Zurich; Henri Rebecq, University of Zurich; Laurent Kneip, ShanghaiTech University; HONGDONG LI, Australian National University, Australia; Davide Scaramuzza, University of Zurich & ETH Zurich, Switzerland |
P-1A-30 | Local Orthogonal-Group Testing | Ahmet Iscen*, Czech Technical University; Ondrej Chum, Vision Recognition Group, Czech Technical University in Prague |
P-1A-31 | Temporal Relational Reasoning in Videos | Bolei Zhou*, MIT; Alex Andonian, Massachusetts Institute of Technology; Aude Oliva, MIT; Antonio Torralba, MIT |
P-1A-32 | Deep High Dynamic Range Imaging with Large Foreground Motions | Shangzhe Wu*, HKUST; Jiarui Xu, Hong Kong University of Science and Technology (HKUST); Yu-Wing Tai, Tencent YouTu; Chi-Keung Tang, Hong Kong University of Science and Technology |
P-1A-33 | Geometric Constrained Joint Lane Segmentation and Lane Boundary Detection | Jie Zhang*, Shanghai Jiao Tong University; Yi Xu, Shanghai Jiao Tong University; Bingbing Ni, Shanghai Jiao Tong University; Zhenyu Duan, Shanghai Jiao Tong University |
P-1A-34 | Attributes as Operators | Tushar Nagarajan*, UT Austin; Kristen Grauman, University of Texas |
P-1A-35 | Textual Explanations for Self-Driving Vehicles | Jinkyu Kim*, UC Berkeley; Anna Rohrbach, UC Berkeley; Trevor Darrell, UC Berkeley; John Canny, UC Berkeley; Zeynep Akata, University of Amsterdam |
P-1A-36 | Generative Domain-Migration Hashing for Sketch-to-Image Retrieval | Jingyi Zhang*, University of Electronic Science and Technology of China; Fumin Shen, UESTC; Li Liu, Inception Institute of Artificial Intelligence; Fan Zhu, Inception Institute of Artificial Intelligence; Mengyang Yu, ETH Zurich; Ling Shao, Inception Institute of Artificial Intelligence; Heng Tao Shen, University of Electronic Science and Technology of China (UESTC); Luc Van Gool, ETH Zurich |
P-1A-37 | Recurrent Fusion Network for Image captioning | Wenhao Jiang*, Tencent AI Lab; Lin Ma, Tencent AI Lab; Yu-Gang Jiang, Fudan University; Wei Liu, Tencent AI Lab; Tong Zhang, Tencent AI Lab |
P-1A-38 | Attention-based Ensemble for Deep Metric Learning | Wonsik Kim*, Samsung Electronics; Bhavya Goyal, Samsung Electronics; Kunal Chawla, Samsung Electronics; Jungmin Lee, Samsung Electronics; Keunjoo Kwon, Samsung Electronics |
P-1A-39 | Egocentric Activity Prediction via Event Modulated Attention | Yang Shen*, Shanghai Jiao Tong University; Bingbing Ni, Shanghai Jiao Tong University; Zefan Li, Shanghai Jiao Tong University; Ning Zhuang, Shanghai Jiao Tong University |
P-1A-40 | A+D Net: Training a Shadow Detector with Adversarial Shadow Attenuation | Hieu Le*, Stony Brook University; Tomas F Yago Vicente, Stony Brook University; Vu Nguyen, Stony Brook University; Minh Hoai Nguyen, Stony Brook University; Dimitris Samaras, Stony Brook University |
P-1A-41 | Stereo Vision-based Semantic 3D Object and Ego-motion Tracking for Autonomous Driving | Peiliang LI*, HKUST Robotics Institute; Tong QIN, HKUST Robotics Institute; Shaojie Shen, HKUST |
P-1A-42 | End-to-end View Synthesis for Light Field Imaging with Pseudo 4DCNN | Yunlong Wang*, Center for Research on Intelligent Perception and Computing (CRIPAC) National Laboratory of Pattern Recognition (NLPR) Institute of Automation, Chinese Academy of Sciences (CASIA); Fei Liu, Center for Research on Intelligent Perception and Computing (CRIPAC) National Laboratory of Pattern Recognition (NLPR) Institute of Automation, Chinese Academy of Sciences (CASIA); Zilei Wang, University of Science and Technology of China; Guangqi Hou, Center for Research on Intelligent Perception and Computing (CRIPAC) National Laboratory of Pattern Recognition (NLPR) Institute of Automation, Chinese Academy of Sciences (CASIA); Zhenan Sun, Chinese Academy of Sciences; Tieniu Tan, NLPR, China |
P-1A-43 | Robust image stitching using multiple registrations | Charles Herrmann, Cornell; Chen Wang, Google Research; Richard Bowen, Cornell; Mike Krainin, Google; Ce Liu, Google; Bill Freeman, MIT; Ramin Zabih*, Cornell Tech/Google Research |
P-1A-44 | Fast Multi-fiber Network for Video Recognition | Yunpeng Chen*, National University of Singapore; Yannis Kalantidis, Facebook Research, USA; Jianshu Li, NUS; Yan Shuicheng, National University of Singapore; Jiashi Feng, NUS |
P-1A-45 | TBN: Convolutional Neural Network with Ternary Inputs and Binary Weights | Diwen Wan*, University of Electronic Science and Technology of China; Fumin Shen, UESTC; Li Liu, Inception Institute of Artificial Intelligence; Fan Zhu, Inception Institute of Artificial Intelligence; Jie Qin, ETH Zurich; Ling Shao, Inception Institute of Artificial Intelligence; Heng Tao Shen, University of Electronic Science and Technology of China (UESTC) |
P-1A-46 | Contextual Based Image Inpainting: Infer, Match and Translate | Yuhang Song*, USC; Chao Yang, University of Southern California; Zhe Lin, Adobe Research; Xiaofeng Liu, Carnegie Mellon University; Hao Li, Pinscreen/University of Southern California/USC ICT; Qin Huang, University of Southern California; C.-C. Jay Kuo, USC |
P-1A-47 | Deep Fundamental Matrix Estimation | Rene Ranftl*, Intel Labs; Vladlen Koltun, Intel Labs |
P-1A-48 | Joint Person Segmentation and Identification in Synchronized First- and Third-person Videos | Mingze Xu*, Indiana University; Chenyou Fan, JD.com; Yuchen Wang, Indiana University; Michael Ryoo, Indiana University; David Crandall, Indiana University |
P-1A-49 | Linear Span Network for Object Skeleton Detection | Chang Liu*, University of Chinese Academy of Sciences; Wei Ke, University of Chinese Academy of Sciences; Fei Qin, University of Chinese Academy of Sciences; Qixiang Ye, University of Chinese Academy of Sciences, China |
P-1A-50 | Category-Agnostic Semantic Keypoint Representations in Canonical Object Views | Xingyi Zhou*, The University of Texas at Austin; Arjun Karpur, The University of Texas at Austin; Linjie Luo, Snap Inc; Qixing Huang, The University of Texas at Austin |
P-1A-51 | Where are the blobs: Counting by Localization with Point Supervision | Issam Hadj Laradji*, University of British Columbia (UBC); Negar Rostamzadeh, Element AI; Pedro Pinheiro, EPFL; David Vazquez, Element AI; Mark Schmidt, University of British Columbia |
P-1A-52 | A Hybrid Model for Identity Obfuscation by Face Replacement | Qianru Sun*, National University of Singapore; Ayush Tewari, Max Planck Institute for Informatics; Weipeng Xu, MPII; Mario Fritz, Max-Planck-Institut für Informatik; Christian Theobalt, MPI Informatik; Bernt Schiele, MPI |
P-1A-53 | Exploring the Limits of Supervised Pretraining | Dhruv Mahajan, Facebook; Ross Girshick*, Facebook AI Research (FAIR); Vignesh Ramanathan, Facebook; Kaiming He, Facebook Inc., USA; Manohar Paluri, Facebook; Yixuan Li, Facebook Research; Ashwin Bharambe, Facebook; Laurens van der Maaten, Facebook AI Research |
P-1A-54 | TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild | Matthias Müller*, King Abdullah University of Science and Technology (KAUST); Adel Bibi, KAUST; Silvio Giancola, KAUST; Salman Al-Subaihi, KAUST; Bernard Ghanem, KAUST |
P-1A-55 | Unpaired Image Captioning by Language Pivoting | Jiuxiang Gu*, Nanyang Technological University; Shafiq Joty, Nanyang Technological University; Jianfei Cai, Nanyang Technological University; Gang Wang, Alibaba Group |
P-1A-56 | Pairwise Relational Networks for Face Recognition | Bong-Nam Kang*, POSTECH |
P-1A-57 | DeepPhys: Video-Based Physiological Measurement Using Convolutional Attention Networks | Weixuan Chen*, MIT Media Lab; Daniel McDuff, Microsoft Research |
P-1A-58 | Semantic Match Consistency for Long-Term Visual Localization | Carl Toft*, Chalmers; Erik Stenborg, Chalmers University; Lars Hammarstrand, Chalmers University of Technology; Lucas Brynte, Chalmers University of Technology; Marc Pollefeys, ETH Zurich; Torsten Sattler, ETH Zurich; Fredrik Kahl, Chalmers |
P-1A-59 | Grounding Visual Explanations | Lisa Anne Hendricks*, UC Berkeley; Ronghang Hu, University of California, Berkeley; Trevor Darrell, UC Berkeley; Zeynep Akata, University of Amsterdam |
P-1A-60 | Cross-Modal Hamming Hashing | Yue Cao, Tsinghua University; Mingsheng Long*, Tsinghua University; Bin Liu, Tsinghua University; Jianmin Wang, Tsinghua University, China |
P-1A-61 | A Modulation Module for Multi-task Learning with Applications in Image Retrieval | Xiangyun Zhao*, Northwestern University; Haoxiang Li, Adobe; Xiaohui Shen, Adobe Research; Xiaodan Liang, Carnegie Mellon University; Ying Wu, Northwestern University |
P-1A-62 | Open-World Stereo Video Matching with Deep RNN | Yiran Zhong*, Australian National University; HONGDONG LI, Australian National University, Australia; Yuchao Dai, Northwestern Polytechnical University |
P-1A-63 | Deblurring Natural Image Using Super-Gaussian Fields | Yuhang Liu, Wuhan University; Wenyong Dong*, Wuhan University; Dong Gong, Northwestern Polytechnical University & The University of Adelaide; Lei Zhang, The University of Adelaide; Qinfeng Shi, University of Adelaide |
P-1A-64 | Diverse and Coherent Paragraph Generation from Images | Moitreya Chatterjee*, University of Illinois at Urbana Champaign; Alexander Schwing, UIUC |
P-1A-65 | Learning Compression from limited unlabeled Data | Xiangyu He*, Chinese Academy of Sciences; Jian Cheng, Chinese Academy of Sciences, China |
P-1A-66 | Deep Video Quality Assessor: From Spatio-temporal Visual Sensitivity to A Convolutional Neural Aggregation Network | Woojae Kim*, Yonsei University; Jongyoo Kim, Yonsei University; Sewoong Ahn, Yonsei University; Jinwoo Kim, Yonsei University; Sanghoon Lee, Yonsei University, Korea |
P-1A-67 | Product Quantization Network for Fast Image Retrieval | Tan Yu*, Nanyang Technological University; Junsong Yuan, State University of New York at Buffalo, USA; Chen Fang, Adobe Research, San Jose, CA; Hailin Jin, Adobe Research |
P-1A-68 | Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation | Yikang Li*, The Chinese University of Hong Kong; Bolei Zhou, MIT; Yawen Cui, National University of Defense Technology; Jianping Shi, Sensetime Group Limited; Xiaogang Wang, Chinese University of Hong Kong, Hong Kong; Wanli Ouyang, CUHK |
P-1A-69 | C-WSL: Count-guided Weakly Supervised Localization | Mingfei Gao*, University of Maryland; Ang Li, Google DeepMind; Ruichi Yu, University of Maryland, College Park; Vlad Morariu, Adobe Research; Larry Davis, University of Maryland |
P-1A-70 | The Sound of Pixels | Hang Zhao*, Massachusetts Institute of Technology; Chuang Gan, MIT; Andrew Rouditchenko, MIT; Carl Vondrick, MIT; Josh McDermott, Massachusetts Institute of Technology; Antonio Torralba, MIT |
P-1A-71 | Unsupervised Video Object Segmentation using Motion Saliency-Guided Spatio-Temporal Propagation | Yuan-Ting Hu*, University of Illinois at Urbana-Champaign; Jia-Bin Huang, Virginia Tech; Alexander Schwing, UIUC |
P-1A-72 | Good Line Cutting: towards Accurate Pose Tracking of Line-assisted VO/VSLAM | Yipu Zhao*, Georgia Institute of Technology; Patricio Vela, Georgia Institute of Technology |
P-1A-73 | Bi-box Regression for Pedestrian Detection and Occlusion Estimation | Chunluan Zhou*, Nanyang Technological University; Junsong Yuan, State University of New York at Buffalo, USA |
P-1A-74 | Unveiling the Power of Deep Tracking | Goutam Bhat*, Linköping University; Joakim Johnander, Linköping University; Martin Danelljan, Linköping University; Fahad Shahbaz Khan, Linköping University; Michael Felsberg, Linköping University |
P-1A-75 | Multi-Scale Structure-Aware Network for Human Pose Estimation | Lipeng Ke*, University of Chinese Academy of Sciences; Ming-Ching Chang, Albany University; Honggang Qi, University of Chinese Academy of Sciences; Siwei Lyu, University at Albany |
P-1A-76 | Neural Graph Matching Networks for Fewshot 3D Action Recognition | Michelle Guo*, Stanford University; Edward Chou, Stanford University; De-An Huang, Stanford University; Shuran Song, Princeton; Serena Yeung, Stanford University; Li Fei-Fei, Stanford University |
P-1A-77 | Objects that Sound | Relja Arandjelović*, DeepMind; Andrew Zisserman, University of Oxford |
P-1A-78 | Discriminative Region Proposal Adversarial Networks for High-Quality Image-to-Image Translation | Chao Wang, Ocean University of China; Haiyong Zheng*, Ocean University of China; Zhibin Yu, Ocean University of China; Ziqiang Zheng, Ocean University of China; Zhaorui Gu, Ocean University of China; Bing Zheng, Ocean University of China |
P-1A-79 | SaaS: Speed as a Supervisor for Semi-supervised Learning | Safa Cicek*, UCLA; Alhussein Fawzi, UCLA; Stefano Soatto, UCLA |
P-1A-80 | Adaptive Affinity Field for Semantic Segmentation | Tsung-Wei Ke, UC Berkeley / ICSI; Jyh-Jing Hwang*, UC Berkeley / ICSI; Ziwei Liu, UC Berkeley / ICSI; Stella Yu, UC Berkeley / ICSI |
P-1A-81 | Semi-convolutional Operators for Instance Segmentation | Samuel Albanie*, University of Oxford; Andrea Vedaldi, Oxford University; David Novotny, Oxford University; Diane Larlus, Naver Labs Europe |
P-1A-82 | Effective Use of Synthetic Data for Urban Scene Semantic Segmentation | Fatemeh Sadat Saleh*, Australian National University (ANU); Mohammad Sadegh Aliakbarian, Data61; Mathieu Salzmann, EPFL; Lars Petersson, Data61/CSIRO; Jose Manuel Alvarez, Toyota Research Institute |
P-1A-83 | Shape correspondences from learnt template-based parametrization | Thibault Groueix*, École des ponts ParisTech; Bryan Russell, Adobe Research; Mathew Fisher, Adobe Research; Vladimir Kim, Adobe Research; Mathieu Aubry, École des ponts ParisTech |
P-1A-84 | TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes | Shangbang Long, Peking University; Jiaqiang Ruan, Peking University; Wenjie Zhang, Peking University; Xin He*, Megvii; Wenhao Wu, Megvii; Cong Yao, Megvii |
P-1A-85 | How good is my GAN? | Konstantin Shmelkov*, Inria; Cordelia Schmid, INRIA; Karteek Alahari, Inria |
P-1A-86 | Deep Generative Models for Weakly-Supervised Multi-Label Classification | Hong-Min Chu*, National Taiwan University; Chih-Kuan Yeh, Carnegie Mellon University; Yu-Chiang Frank Wang, National Taiwan University |
P-1A-87 | Attention-GAN for Object Transfiguration in Wild Images | Xinyuan Chen*, Shanghai Jiao Tong University; Chang Xu, University of Sydney; Xiaokang Yang, Shanghai Jiao Tong University of China; Dacheng Tao, University of Sydney |
P-1A-88 | Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning | Chenyang Si*, Institute of Automation, Chinese Academy of Sciences; Ya Jing, Institute of Automation, Chinese Academy of Sciences; Wei Wang, Institute of Automation, Chinese Academy of Sciences; Liang Wang, NLPR, China; Tieniu Tan, NLPR, China |
P-1A-89 | Diverse Image-to-Image Translation via Disentangled Representations | Hsin-Ying Lee*, University of California, Merced; Hung-Yu Tseng, University of California, Merced; Maneesh Singh, Verisk Analytics; Jia-Bin Huang, Virginia Tech; Ming-Hsuan Yang, University of California at Merced |
P-1A-90 | Convolutional Networks with Adaptive Computation Graphs | Andreas Veit*, Cornell University; Serge Belongie, Cornell University |
Poster session 1B
1B | Monday, September 10 | Poster Session 04:00 PM - 06:00 PM |
---|---|---|
P-1B-01 | Learning to Separate Object Sounds by Watching Unlabeled Video | Ruohan Gao*, University of Texas at Austin; Rogerio Feris, IBM Research; Kristen Grauman, University of Texas |
P-1B-02 | Learning-based Video Motion Magnification | Tae-Hyun Oh, MIT CSAIL; Ronnachai Jaroensri*, MIT CSAIL; Changil Kim, MIT CSAIL; Mohamed A. Elghareb, Qatar Computing Research Institute; Fredo Durand, MIT; Bill Freeman, MIT; Wojciech Matusik, MIT CSAIL |
P-1B-03 | Light Structure from Pin Motion: Simple and Accurate Point Light Calibration for Physics-based Modeling | Hiroaki Santo*, Osaka University; Michael Waechter, Osaka University; Masaki Samejima, Osaka University; Yusuke Sugano, Osaka University; Yasuyuki Matsushita, Osaka University |
P-1B-04 | Video Object Segmentation with Joint Re-identification and Attention-Aware Mask Propagation | Xiaoxiao Li*, The Chinese University of Hong Kong; Chen Change Loy, Chinese University of Hong Kong |
P-1B-05 | Coded Two-Bucket Cameras for Computer Vision | Mian Wei, University of Toronto; Navid Sarhangnejad, University of Toronto; Zhengfan Xia, University of Toronto; Nikola Katic, University of Toronto; Roman Genov, University of Toronto; Kyros Kutulakos*, University of Toronto |
P-1B-06 | Multimodal Unsupervised Image-to-image Translation | Xun Huang*, Cornell University; Ming-Yu Liu, NVIDIA; Serge Belongie, Cornell University; Kautz Jan, NVIDIA |
P-1B-07 | Learning to Detect and Track Visible and Occluded Body Joints in a Virtual World | Matteo Fabbri, University of Modena and Reggio Emilia; Fabio Lanzi*, University of Modena and Reggio Emilia; Simone Calderara, University of Modena and Reggio Emilia, Italy; Andrea Palazzi, University of Modena and Reggio Emilia; Roberto Vezzani, University of Modena and Reggio Emilia, Italy; Rita Cucchiara, Universita Di Modena E Reggio Emilia |
P-1B-08 | Local Spectral Graph Convolution for Point Set Feature Learning | Chu Wang*, McGill University; Babak Samari, McGill University; Kaleem Siddiqi, McGill University |
P-1B-09 | Meta-Tracker: Fast and Robust Online Adaptation for Visual Object Trackers | Eunbyung Park*, UNC-CHAPEL HILL; Alex Berg, University of North Carolina, USA |
P-1B-10 | VSO: Visual Semantic Odometry | Konstantinos-Nektarios Lianos, Geomagical Labs, Inc; Johannes Schoenberger, ETH Zurich; Marc Pollefeys, ETH Zurich; Torsten Sattler*, ETH Zurich |
P-1B-11 | Progressive Lifelong Learning by Distillation and Retrospection | Saihui Hou*, University of Science and Technology of China; Xinyu Pan, MMLAB, CUHK; Chen Change Loy, Chinese University of Hong Kong; Dahua Lin, The Chinese University of Hong Kong |
P-1B-12 | Spatio-Temporal Channel Correlation Networks for Action Classification | Ali Diba*, KU Leuven; Mohsen Fayyaz, University of Bonn; Vivek Sharma, Karlsruhe Institute of Technology; Mohammad Arzani, Sensifai; Rahman Yousefzadeh, Sensifai; Jürgen Gall, University of Bonn; Luc Van Gool, ETH Zurich |
P-1B-13 | Long-term Tracking in the Wild: a Benchmark | Efstratios Gavves, University of Amsterdam ; Luca Bertinetto*, University of Oxford; Joao Henriques, University of Oxford; Andrea Vedaldi, Oxford University; Philip Torr, University of Oxford; Ran Tao, University of Amsterdam; Jack Valmadre, Oxford |
P-1B-14 | Online Detection of Action Start in Untrimmed, Streaming Videos | Zheng Shou*, Columbia University; Junting Pan, Columbia University ; Jonathan Chan, Columbia University; Kazuyuki Miyazawa, Mitsubishi Electric; Hassan Mansour, Mitsubishi Electric Research Laboratories (MERL); Anthony Vetro, Mitsubishi Electric Research Lab; Xavier Giro-i-Nieto, Universitat Politecnica de Catalunya; Shih-Fu Chang, Columbia University |
P-1B-15 | Dense Pose Transfer | Natalia Neverova*, Facebook AI Research; Alp Guler, INRIA; Iasonas Kokkinos, Facebook, France |
P-1B-16 | Simultaneous 3D Reconstruction for Water Surface and Underwater Scene | Yiming Qian*, University of Alberta; Yinqiang Zheng, National Institute of Informatics; Minglun Gong, Memorial University; Herb Yang, University of Alberta |
P-1B-17 | Multiple-gaze geometry: Inferring novel 3D locations from gazes observed in monocular video | Ernesto Brau, CiBO Technologies; Jinyan Guan, UC San Diego; Tanya Jeffries, U. Arizona; Kobus Barnard*, University of Arizona |
P-1B-18 | Multi-Scale Context Intertwining for Semantic Segmentation | Di Lin*, Shenzhen University; Yuanfeng Ji, Shenzhen University; Dani Lischinski, The Hebrew University of Jerusalem; Danny Cohen-Or, Tel Aviv University; Hui Huang, Shenzhen University |
P-1B-19 | Object-centered image stitching | Charles Herrmann, Cornell; Chen Wang, Google Research; Richard Bowen, Cornell; Ramin Zabih*, Cornell Tech/Google Research |
P-1B-20 | Grassmann Pooling for Fine-Grained Visual Classification | Xing Wei*, Xi'an Jiaotong University; Yihong Gong, Xi'an Jiaotong University; Yue Zhang, Xi'an Jiaotong University; Nanning Zheng, Xi'an Jiaotong University; Jiawei Zhang, City University of Hong Kong |
P-1B-21 | Diagnosing Error in Temporal Action Detectors | Humam Alwassel*, KAUST; Fabian Caba, KAUST; Victor Escorcia, KAUST; Bernard Ghanem, KAUST |
P-1B-22 | CGIntrinsics: Better Intrinsic Image Decomposition through Physically-Based Rendering | Zhengqi Li*, Cornell University; Noah Snavely, - |
P-1B-23 | A Closed-form Solution to Photorealistic Image Stylization | Yijun Li*, University of California, Merced; Ming-Yu Liu, NVIDIA; Xueting Li, University of California, Merced; Ming-Hsuan Yang, University of California at Merced; Kautz Jan, NVIDIA |
P-1B-24 | Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net | Xingang Pan*, The Chinese University of Hong Kong; Ping Luo, The Chinese University of Hong Kong; Jianping Shi, Sensetime Group Limited; Xiaoou Tang, The Chinese University of Hong Kong |
P-1B-25 | Collaborative Deep Reinforcement Learning for Multi-Object Tracking | Liangliang Ren, Tsinghua University; Zifeng Wang, Tsinghua University; Jiwen Lu*, Tsinghua University; Qi Tian , The University of Texas at San Antonio; Jie Zhou, Tsinghua University, China |
P-1B-26 | Single Image Highlight Removal with a Sparse and Low-Rank Reflection Model | Jie Guo*, Nanjing University; Zuojian Zhou, Nanjing University Of Chinese Medicine; Limin Wang, Nanjing University |
P-1B-27 | Hierarchical Relational Networks for Group Activity Recognition and Retrieval | Mostafa Ibrahim*, Simon Fraser University; Greg Mori, Simon Fraser University |
P-1B-28 | Towards Human-Level License Plate Recognition | Jiafan Zhuang, University of Science and Technology of China; Zilei Wang*, University of Science and Technology of China |
P-1B-29 | Stacked Cross Attention for Image-Text Matching | Kuang-Huei Lee*, Microsoft AI and Research; Xi Chen, Microsoft AI and Research; Gang Hua, Microsoft Cloud and AI; Houdong Hu, Microsoft AI and Research; Xiaodong He, JD AI Research |
P-1B-30 | Deep Discriminative Model for Video Classification | Mohammad Tavakolian*, University of Oulu; Abdenour Hadid, Finland |
P-1B-31 | The Mutex Watershed: Efficient, Parameter-Free Image Partitioning | Steffen Wolf*, University of Heidelberg; Constantin Pape, University of Heidelberg; Nasim Rahaman, University of Heidelberg; Anna Kreshuk, University of Heidelberg; Ullrich Köthe, University of Heidelberg; Fred Hamprecht, Heidelberg Collaboratory for Image Processing |
P-1B-32 | Monocular Depth Estimation with Affinity, Vertical Pooling, and Label Enhancement | YuKang Gan*, Sun Yat-sen University; Xiangyu Xu, Tsinghua University; Wenxiu Sun, SenseTime Research; Liang Lin, SenseTime |
P-1B-33 | Improved Structure from Motion Using Fiducial Marker Matching | Joseph DeGol*, UIUC; Timothy Bretl, University of Illinois at Urbana-Champaign; Derek Hoiem, University of Illinois at Urbana-Champaign |
P-1B-34 | Temporal Modular Networks for Retrieving Complex Compositional Activities in Video | Bingbin Liu*, Stanford University; Serena Yeung, Stanford University; Edward Chou, Stanford University; De-An Huang, Stanford University; Li Fei-Fei, Stanford University; Juan Carlos Niebles, Stanford University |
P-1B-35 | Quantized Densely Connected U-Nets for Efficient Landmark Localization | Zhiqiang Tang*, Rutgers; Xi Peng, Rutgers University; Shijie Geng, Rutgers; Shaoting Zhang, University of North Carolina at Charlotte; Lingfei Wu, IBM T. J. Watson Research Center; Dimitris Metaxas, Rutgers |
P-1B-36 | Real-to-Virtual Domain Unification for End-to-End Autonomous Driving | Luona Yang*, Carnegie Mellon University; Xiaodan Liang, Carnegie Mellon University; Eric Xing, Petuum Inc. |
P-1B-37 | Beyond Part Models: Person Retrieval with Refined Part Pooling (and A Strong Convolutional Baseline) | Yifan Sun*, Tsinghua University; Liang Zheng, Singapore University of Technology and Design; Yi Yang, University of Technology, Sydney; Qi Tian , The University of Texas at San Antonio; Shengjin Wang, Tsinghua University |
P-1B-38 | Fully-Convolutional Point Networks for Large-Scale Point Clouds | Dario Rethage*, Technical University of Munich, Germany; Johanna Wald, Technical University of Munich; Nassir Navab, TU Munich, Germany; Federico Tombari, Technical University of Munich, Germany |
P-1B-39 | Real-Time Hair Rendering using Sequential Adversarial Networks | Lingyu Wei*, University of Southern California; Liwen Hu, University of Southern California; Vladimir Kim, Adobe Research; Ersin Yumer, Argo AI; Hao Li, Pinscreen/University of Southern California/USC ICT |
P-1B-40 | Visual Tracking via Spatially Aligned Correlation Filters Network | Mengdan Zhang*, Institute of Automation, Chinese Academy of Sciences; Qiang Wang, Institute of Automation, Chinese Academy of Sciences; Junliang Xing, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; Jin Gao, Institute of Automation, Chinese Academy of Sciences; Peixi Peng, Institute of Automation, Chinese Academy of Sciences; Weiming Hu, Institute of Automation, Chinese Academy of Sciences; Steve Maybank, University of London |
P-1B-41 | Spatio-temporal Transformer Network for Video Restoration | Tae Hyun Kim*, Max Planck Institute for Intelligent Systems; Mehdi S. M. Sajjadi, Max Planck Institute for Intelligent Systems; Michael Hirsch, Max Planck Institute for Intelligent Systems; Bernhard Schölkopf, Max Planck Institute for Intelligent Systems |
P-1B-42 | Value-aware Quantization for Training and Inference of Neural Networks | Eunhyeok Park, Seoul National University; Sungjoo Yoo*, Seoul National University; Peter Vajda, Facebook |
P-1B-43 | Lambda Twist: An Accurate Fast Robust Perspective Three Point (P3P) Solver | Mikael Persson*, Linköping University; Klas Nordberg, Linköping University |
P-1B-44 | Programmable Light Curtains | Jian Wang*, Carnegie Mellon University; Joe Bartels, Carnegie Mellon University; William Whittaker, Carnegie Mellon University; Aswin Sankaranarayanan, Carnegie Mellon University; Srinivasa Narasimhan, Carnegie Mellon University |
P-1B-45 | Monocular Depth Estimation Using Whole Strip Masking and Reliability-Based Refinement | Minhyeok Heo*, Korea University; Jaehan Lee, Korea University; Kyung-Rae Kim, Korea University; Han-Ul Kim, Korea University; Chang-Su Kim, Korea university |
P-1B-46 | Task-Aware Image Downscaling | Heewon Kim, Seoul National University; Myungsub Choi, Seoul National University; Bee Lim, Seoul National University; Kyoung Mu Lee*, Seoul National University |
P-1B-47 | Single Image Scene Refocusing using Conditional Adversarial Networks | Parikshit Sakurikar*, IIIT-Hyderabad; Ishit Mehta, IIIT Hyderabad; Vineeth N Balasubramanian, IIT Hyderabad; P. J. Narayanan, IIIT-Hyderabad |
P-1B-48 | Model-free Consensus Maximization for Non-Rigid Shapes | Thomas Probst*, ETH Zurich; Ajad Chhatkuli , ETHZ; Danda Pani Paudel, ETH Zürich; Luc Van Gool, ETH Zurich |
P-1B-49 | BSN: Boundary Sensitive Network for Temporal Action Proposal Generation | Tianwei Lin, Shanghai Jiao Tong University; Xu Zhao*, Shanghai Jiao Tong University; Haisheng Su, Shanghai Jiao Tong University; Chongjing Wang, China Academy of Information and Communications Technology; Ming Yang, Shanghai Jiao Tong University |
P-1B-50 | Materials for Masses: SVBRDF Acquisition with a Single Mobile Phone Image | Zhengqin Li*, UC San Diego; Manmohan Chandraker, UC San Diego; Sunkavalli Kalyan, Adobe Research |
P-1B-51 | Attentive Semantic Alignment with Offset-Aware Correlation Kernels | Paul Hongsuck Seo*, POSTECH; Jongmin Lee, POSTECH; Deunsol Jung, POSTECH; Bohyung Han, Seoul National University; Minsu Cho, POSTECH |
P-1B-52 | Deeply Learned Compositional Models for Human Pose Estimation | Wei Tang*, Northwestern University; Pei Yu, Northwestern University; Ying Wu, Northwestern University |
P-1B-53 | Real-Time MDNet | Ilchae Jung*, POSTECH; Jeany Son, POSTECH; Mooyeol Baek, POSTECH; Bohyung Han, Seoul National University |
P-1B-54 | Women also Snowboard: Overcoming Bias in Captioning Models | Lisa Anne Hendricks*, UC Berkeley; Kaylee Burns, UC Berkeley; Kate Saenko, Boston University; Trevor Darrell, UC Berkeley; Anna Rohrbach, UC Berkeley |
P-1B-55 | Progressive Structure from Motion | Alex Locher*, ETH Zürich; Michal Havlena, Vuforia, PTC, Vienna; Luc Van Gool, ETH Zurich |
P-1B-56 | Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd | Shifeng Zhang*, CBSR, NLPR, CASIA; Longyin Wen, GE Global Research; Xiao Bian, GE Global Research; Zhen Lei, NLPR, CASIA, China; Stan Li, National Lab. of Pattern Recognition, China |
P-1B-57 | Affinity Derivation and Graph Merge for Instance Segmentation | Yiding Liu*, University of Science and Technology of China; Siyu Yang, Beihang University; Bin Li, Microsoft Research Asia; Wengang Zhou, University of Science and Technology of China; Ji-Zeng Xu, Microsoft Research Asia; Houqiang Li, University of Science and Technology of China; Yan Lu, Microsoft Research Asia |
P-1B-58 | Second-order Democratic Aggregation | Tsung-Yu Lin*, University of Massachusetts Amherst; Subhransu Maji, University of Massachusetts, Amherst; Piotr Koniusz, Data61/CSIRO, ANU |
P-1B-59 | Improving Sequential Determinantal Point Processes for Supervised Video Summarization | Aidean Sharghi*, University of Central Florida; Boqing Gong, Tencent AI Lab; Ali Borji, University of Central Florida; Chengtao Li, MIT; Tianbao Yang, University of Iowa |
P-1B-60 | Seeing Deeply and Bidirectionally: A Deep Learning Approach for Single Image Reflection Removal | Jie Yang*, University of Adelaide; Dong Gong, Northwestern Polytechnical University & The University of Adelaide; Lingqiao Liu, University of Adelaide; Qinfeng Shi, University of Adelaide |
P-1B-61 | Specular-to-Diffuse Translation for Multi-View Reconstruction | Shihao Wu*, University of Bern; Hui Huang, Shenzhen University; Tiziano Portenier, University of Bern; Matan Sela, Technion - Israel Institute of Technology; Danny Cohen-Or, Tel Aviv University; Ron Kimmel, Technion; Matthias Zwicker, University of Maryland |
P-1B-62 | SEAL: A Framework Towards Simultaneous Edge Alignment and Learning | Zhiding Yu*, NVIDIA; Weiyang Liu, Georgia Tech; Yang Zou, Carnegie Mellon University; Chen Feng, Mitsubishi Electric Research Laboratories (MERL); Srikumar Ramalingam, University of Utah; B. V. K. Vijaya Kumar, CMU, USA; Kautz Jan, NVIDIA |
P-1B-63 | Question Type Guided Attention in Visual Question Answering | Yang Shi*, University of California, Irvine; Tommaso Furlanello, University of Southern California; Sheng Zha, Amazon Web Services; Anima Anandkumar, Amazon |
P-1B-64 | Neural Procedural Reconstruction for Residential Buildings | Huayi Zeng*, Washington University in St.Louis; Jiaye Wu, Washington University in St.Louis; Yasutaka Furukawa, Simon Fraser University |
P-1B-65 | Self-Calibration of Cameras with Euclidean Image Plane in Case of Two Views and Known Relative Rotation Angle | Evgeniy Martyushev*, South Ural State University |
P-1B-66 | Towards Optimal Deep Hashing via Policy Gradient | Xin Yuan, Tsinghua University; Liangliang Ren, Tsinghua University; Jiwen Lu*, Tsinghua University; Jie Zhou, Tsinghua University, China |
P-1B-67 | Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights | Arun Mallya*, UIUC; Svetlana Lazebnik, UIUC; Dillon Davis, UIUC |
P-1B-68 | Generating 3D Faces using Convolutional Mesh Autoencoders | Anurag Ranjan*, MPI for Intelligent Systems; Timo Bolkart, Max Planck Institute for Intelligent Systems; Soubhik Sanyal, Max Planck Institute for Intelligent Systems; Michael Black, Max Planck Institute for Intelligent Systems |
P-1B-69 | ICNet for Real-Time Semantic Segmentation on High-Resolution Images | Hengshuang Zhao, The Chinese University of Hong Kong; Xiaojuan Qi, CUHK; Xiaoyong Shen*, CUHK; Jianping Shi, Sensetime Group Limited; Jia Jiaya, Chinese University of Hong Kong |
P-1B-70 | Memory Aware Synapses: Learning what (not) to forget | Rahaf Aljundi*, KU Leuven; Francesca Babiloni, KU Leuven; Mohamed Elhoseiny, Facebook; Marcus Rohrbach, Facebook AI Research; Tinne Tuytelaars, K.U. Leuven |
P-1B-71 | Deep Texture and Structure Aware Filtering Network for Image Smoothing | Kaiyue Lu*, Australian National University & Data61-CSIRO; Shaodi You, Data61-CSIRO, Australia; Nick Barnes, CSIRO(Data61) |
P-1B-72 | Linear RGB-D SLAM for Planar Environments | Pyojin Kim*, Seoul National University; Brian Coltin, NASA Ames Research Center; Hyoun Jin Kim, Seoul National University |
P-1B-73 | DeepJDOT: Deep Joint distribution optimal transport for unsupervised domain adaptation | Bharath Bhushan Damodaran*, IRISA, Universite de Bretagne-Sud; Benjamin Kellenberger, Wageningen University and Research; Rémi Flamary, Université Côte d'Azur; Devis Tuia, Wageningen University and Research; Nicolas Courty, IRISA, Universite Bretagne-Sud |
P-1B-74 | W-TALC: Weakly-supervised Temporal Activity Localization and Classification | Sujoy Paul*, University of California-Riverside; Sourya Roy, University of California, Riverside; Amit Roy-Chowdhury , University of California, Riverside, USA |
P-1B-75 | Unsupervised Video Object Segmentation with Motion-based Bilateral Networks | Siyang Li*, University of Southern California; Bryan Seybold, Google Inc.; Alexey Vorobyov, Google Inc.; Xuejing Lei, University of Southern California ; C.-C. Jay Kuo, USC |
P-1B-76 | Disentangling Factors of Variation with Cycle-Consistent Variational Auto-Encoders | Ananya Harsh Jha*, Indraprastha Institute of Information Technology Delhi; Saket Anand, Indraprastha Institute of Information Technology Delhi; Maneesh Singh, Verisk Analytics; VSR Veeravasarapu, Verisk Analytics |
P-1B-77 | Mancs: A Multi-task Attentional Network with Curriculum Sampling for Person Re-identification | Cheng Wang, Huazhong Univ. of Science and Technology; Qian Zhang, Horizon Robotics; Chang Huang, Horizon Robotics, Inc.; Wenyu Liu, Huazhong University of Science and Technology; Xinggang Wang*, Huazhong Univ. of Science and Technology |
P-1B-78 | Multi-view to Novel view: Synthesizing Views via Self-Learned Confidence | Shao-Hua Sun*, University of Southern California; Jacob Huh, Carnegie Mellon University; Yuan-Hong Liao, National Tsing Hua University; Ning Zhang, SnapChat; Joseph Lim, USC |
P-1B-79 | Part-Activated Deep Reinforcement Learning for Action Prediction | Lei Chen, Tianjin University; Jiwen Lu*, Tsinghua University; Zhanjie Song, Tianjin University; Jie Zhou, Tsinghua University, China |
P-1B-80 | Online Dictionary Learning for Approximate Archetypal Analysis | Jieru Mei, Microsoft Research Asia; Chunyu Wang*, Microsoft Research Asia; Wenjun Zeng, Microsoft Research |
P-1B-81 | Estimating Depth from RGB and Sparse Sensing | Zhao Chen*, Magic Leap, Inc.; Vijay Badrinarayanan, Magic Leap, Inc.; Gilad Drozdov, Magic Leap, Inc.; Andrew Rabinovich, Magic Leap, Inc. |
P-1B-82 | Unsupervised Domain Adaptation for Semantic Segmentation via Class-Balanced Self-Training | Yang Zou*, Carnegie Mellon University; Zhiding Yu, NVIDIA; B. V. K. Vijaya Kumar, CMU, USA; Jinsong Wang, General Motors |
P-1B-83 | Zoom-Net: Mining Deep Feature Interactions for Visual Relationship Recognition | Guojun Yin, University of Science and Technology of China; Lu Sheng, The Chinese University of Hong Kong; Bin Liu, University of Science and Technology of China; Nenghai Yu, University of Science and Technology of China; Xiaogang Wang, Chinese University of Hong Kong, Hong Kong; Chen Change Loy, Chinese University of Hong Kong; Jing Shao*, The Chinese University of Hong Kong |
P-1B-84 | Joint Camera Spectral Sensitivity Selection and Hyperspectral Image Recovery | Ying Fu*, Beijing Institute of Technology; Tao Zhang, Beijing Institute of Technology; Yinqiang Zheng, National Institute of Informatics; Debing Zhang, DeepGlint; Hua Huang, Beijing Institute of Technology |
P-1B-85 | Compositing-aware Image Search | Hengshuang Zhao*, The Chinese University of Hong Kong; Xiaohui Shen, Adobe Research; Zhe Lin, Adobe Research; Sunkavalli Kalyan, Adobe Research; Brian Price, Adobe; Jia Jiaya, Chinese University of Hong Kong |
P-1B-86 | Zero-shot keyword search for visual speech recognition in-the-wild | Themos Stafylakis*, University of Nottingham; Georgios Tzimiropoulos, University of Nottingham |
P-1B-87 | End-to-End Joint Semantic Segmentation of Actors and Actions in Video | Jingwei Ji*, Stanford University; Shyamal Buch, Stanford University; Alvaro Soto, Universidad Catolica de Chile; Juan Carlos Niebles, Stanford University |
P-1B-88 | Learning Discriminative Video Representations Using Adversarial Perturbations | Jue Wang*, ANU; Anoop Cherian, MERL |
P-1B-89 | DeepWrinkles: Accurate and Realistic Clothing Modeling | Zorah Laehner, TU Munich; Tony Tung*, Facebook / Oculus Research; Daniel Cremers, TUM |
P-1B-90 | Massively Parallel Video Networks | Viorica Patraucean*, DeepMind; Joao Carreira, DeepMind; Laurent Mazare, DeepMind; Simon Osindero, DeepMind; Andrew Zisserman, University of Oxford |
Poster session 2A
2A | Tuesday, September 11 | Poster session 10:00 AM - 12:00 PM |
---|---|---|
P-2A-01 | Unsupervised Person Re-identification by Deep Learning Tracklet Association | Minxian Li*, Nanjing University of Science and Technology; Xiatian Zhu, Queen Mary University, London, UK; Shaogang Gong, Queen Mary University of London |
P-2A-02 | Instance-level Human Parsing via Part Grouping Network | Ke Gong*, SYSU; Xiaodan Liang, Carnegie Mellon University; Yicheng Li, Sun Yat-sen University; Yimin Chen, SenseTime; Liang Lin, Sun Yat-sen University |
P-2A-03 | Scaling Egocentric Vision: The E-Kitchens Dataset | Dima Damen*, University of Bristol; Hazel Doughty, University of Bristol; Sanja Fidler, University of Toronto; Antonino Furnari, University of Catania; Evangelos Kazakos, University of Bristol; Giovanni Farinella, University of Catania, Italy; Davide Moltisanti, University of Bristol; Jonathan Munro, University of Bristol; Toby Perrett, University of Bristol; Will Price, University of Bristol; Michael Wray, University of Bristol |
P-2A-04 | Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition | Yifei Huang*, The University of Tokyo; Minjie Cai, Hunan University, The University of Tokyo; Zhenqiang Li, The University of Tokyo; Yoichi Sato, The University of Tokyo |
P-2A-05 | Beyond local reasoning for stereo confidence estimation with deep learning | Fabio Tosi, University of Bologna; Matteo Poggi*, University of Bologna; Antonio Benincasa, University of Bologna; Stefano Mattoccia, University of Bologna |
P-2A-06 | DeepGUM: Learning Deep Robust Regression with a Gaussian-Uniform Mixture Model | Stéphane Lathuiliere, INRIA; Pablo Mesejo-Santiago, University of Granada; Xavier Alameda-Pineda*, INRIA; Radu Horaud, INRIA |
P-2A-07 | Into the Twilight Zone: Depth Estimation using Joint Structure-Stereo Optimization | Aashish Sharma*, National University of Singapore; Loong Fah Cheong, NUS |
P-2A-08 | Generalized Loss-Sensitive Adversarial Learning with Manifold Margins | Marzieh Edraki*, University of Central Florida; Guo-Jun Qi, University of Central Florida |
P-2A-09 | Adversarial Open Set Domain Adaptation | Kuniaki Saito*, The University of Tokyo; Shohei Yamamoto, The University of Tokyo; Yoshitaka Ushiku, The University of Tokyo; Tatsuya Harada, The University of Tokyo |
P-2A-10 | Connecting Gaze, Scene and Attention | Eunji Chong*, Georgia Institute of Technology; Nataniel Ruiz, Georgia Institute of Technology; Richard Wang, Georgia Institute of Technology; Yun Zhang, Georgia Institute of Technology |
P-2A-11 | Multi-modal Cycle-consistent Generalized Zero-Shot Learning | Rafael Felix*, The University of Adelaide; Vijay Kumar B G, University of Adelaide; Ian Reid, University of Adelaide, Australia; Gustavo Carneiro, University of Adelaide |
P-2A-12 | Understanding Degeneracies and Ambiguities in Attribute Transfer | Attila Szabo*, University of Bern; Qiyang Hu, University of Bern; Tiziano Portenier, University of Bern; Matthias Zwicker, University of Maryland; Paolo Favaro, Bern University, Switzerland |
P-2A-13 | Start, Follow, Read: End-to-End Full Page Handwriting Recognition | Curtis Wigington*, Brigham Young University; Chris Tensmeyer, Brigham Young University; Brian Davis, Brigham Young University; Bill Barrett, Brigham Young University; Brian Price, Adobe; Scott Cohen, Adobe Research |
P-2A-14 | Rethinking the Form of Latent States in Image Captioning | Bo Dai*, The Chinese University of Hong Kong; Deming Ye, Tsinghua University; Dahua Lin, The Chinese University of Hong Kong |
P-2A-15 | ConvNets and ImageNet Beyond Accuracy: Understanding Mistakes and Uncovering Biases | Pierre Stock*, Facebook AI Research; Moustapha Cisse, Facebook AI Research |
P-2A-16 | Deep Shape Matching | Filip Radenovic*, Visual Recognition Group, CTU Prague; Giorgos Tolias, Vision Recognition Group, Czech Technical University in Prague; Ondrej Chum, Vision Recognition Group, Czech Technical University in Prague |
P-2A-17 | Neural Stereoscopic Image Style Transfer | Xinyu Gong*, University of Electronic Science and Technology of China; Haozhi Huang, Tencent AI Lab; Lin Ma, Tencent AI Lab; Fumin Shen, UESTC; Wei Liu, Tencent AI Lab; Tong Zhang, Tencent AI Lab |
P-2A-18 | Semi-supervised FusedGAN for Conditional Image Generation | Navaneeth Bodla*, University of Maryland; Gang Hua, Microsoft Cloud and AI; Rama Chellappa, University of Maryland |
P-2A-19 | Affine Correspondences between Central Cameras for Rapid Relative Pose Estimation | Iván Eichhardt*, MTA SZTAKI; Mitya Csetverikov, MTA SZTAKI & ELTE |
P-2A-20 | Bi-directional Feature Pyramid Network with Recursive Attention Residual Modules For Shadow Detection | Lei Zhu*, The Chinese University of Hong Kong; Zijun Deng, South China University of Technology; Xiaowei Hu, The Chinese University of Hong Kong; Chi-Wing Fu, The Chinese University of Hong Kong; Xuemiao Xu, South China University of Technology; Jing Qin, The Hong Kong Polytechnic University; Pheng-Ann Heng, The Chinese University of Hong Kong |
P-2A-21 | Joint Learning of Intrinsic Images and Semantic Segmentation | Anil Baslamisli*, University of Amsterdam; Thomas Tiel Groenestege, University of Amsterdam; Partha Das, University of Amsterdam; Hoang-An Le, University of Amsterdam; Sezer Karaoglu, University of Amsterdam; Theo Gevers, University of Amsterdam |
P-2A-22 | Visual Reasoning with a Multi-hop FiLM Generator | Florian Strub*, University of Lille; Mathieu Seurin, University of Lille; Ethan Perez, Rice University; Harm De Vries, Montreal Institute for Learning Algorithms; Jeremie Mary, Criteo; Philippe Preux, INRIA; Aaron Courville, MILA, Université de Montréal; Olivier Pietquin, GoogleBrain |
P-2A-23 | View-graph Selection Framework for SfM | Rajvi Shah*, IIIT Hyderabad; Visesh Chari, INRIA; P. J. Narayanan, IIIT-Hyderabad |
P-2A-24 | Fine-grained Video Categorization with Redundancy Reduction Attention | Chen Zhu, University of Maryland; Xiao Tan, Baidu Inc.; Feng Zhou, Baidu Inc.; Xiao Liu, Baidu Research; Kaiyu Yue*, Baidu Inc.; Errui Ding, Baidu Inc.; Yi Ma, UC Berkeley |
P-2A-25 | Space-time Knowledge for Unpaired Image-to-Image Translation | Aayush Bansal*, Carnegie Mellon University; Shugao Ma, Facebook / Oculus; Deva Ramanan, Carnegie Mellon University; Yaser Sheikh, CMU |
P-2A-26 | Integral Human Pose Regression | Xiao Sun*, Microsoft Research Asia; Bin Xiao, MSR Asia; Fangyin Wei, Peking University; Shuang Liang, Tongji University; Yichen Wei, MSR Asia |
P-2A-27 | Recurrent Tubelet Proposal and Recognition Networks for Action Detection | Dong Li, University of Science and Technology of China; Zhaofan Qiu, University of Science and Technology of China; Qi Dai, Microsoft Research; Ting Yao*, Microsoft Research; Tao Mei, JD.com |
P-2A-28 | Learning to Predict Crisp Edge | Ruoxi Deng*, Central South University; Chunhua Shen, University of Adelaide; Shengjun Liu, Central South University; Huibing Wang, Dalian University of Technology; Xinru Liu, Central South University |
P-2A-29 | Open Set Learning with Counterfactual Images | Lawrence Neal*, Oregon State University; Matthew Olson, Oregon State University; Xiaoli Fern, Oregon State University; Weng-Keen Wong, Oregon State University; Fuxin Li, Oregon State University |
P-2A-30 | Estimating the Success of Unsupervised Image to Image Translation | Lior Wolf, Tel Aviv University, Israel; Sagie Benaim*, Tel Aviv University; Tomer Galanti, Tel Aviv University |
P-2A-31 | Joint Map and Symmetry Synchronization | Qixing Huang*, The University of Texas at Austin; Xiangru Huang, University of Texas at Austin; Zhenxiao Liang, Tsinghua University; Yifan Sun, The University of Texas at Austin |
P-2A-32 | Single Image Water Hazard Detection using FCN with Reflection Attention Units | Xiaofeng Han, Nanjing University of Science and Technology; Chuong Nguyen*, CSIRO Data61; Shaodi You, Data61-CSIRO, Australia; Jianfeng Lu, Nanjing University of Science and Technology |
P-2A-33 | Realtime Time Synchronized Event-based Stereo | Alex Zhu*, University of Pennsylvania; Yibo Chen, University of Pennsylvania; Kostas Daniilidis, University of Pennsylvania |
P-2A-34 | Transferring GANs: generating images from limited data | Yaxing Wang*, Computer Vision Center; Chenshen Wu, Computer Vision Center; Luis Herranz, Computer Vision Center (Ph.D.); Joost van de Weijer, Computer Vision Center; Abel Gonzalez-Garcia, Computer Vision Center; BOGDAN RADUCANU, Computer Vision Center, Edifici |
P-2A-35 | To learn image super-resolution, use a GAN to learn how to do image degradation first | Adrian Bulat*, University of Nottingham; Jing Yang, University of Nottingham; Georgios Tzimiropoulos, University of Nottingham |
P-2A-36 | Unsupervised CNN-based co-saliency detection with graphical optimization | Kuang-Jui Hsu*, Academia Sinica; Chung-Chi Tsai, Texas A&M University; Yen-Yu Lin, Academia Sinica; Xiaoning Qian, Texas A&M University; Yung-Yu Chuang, National Taiwan University |
P-2A-37 | Fast Light Field Reconstruction With Deep Coarse-To-Fine Modeling of Spatial-Angular Clues | Henry W. F. Yeung, the University of Sydney; Junhui Hou*, City University of Hong Kong, Hong Kong; Jie Chen, Nanyang Technological University; Yuk Ying Chung, the University of Sydney; Xiaoming Chen, University of Science and Technology of China |
P-2A-38 | Unified Perceptual Parsing for Scene Understanding | Tete Xiao*, Peking University; Yingcheng Liu, Peking University; Yuning Jiang, Megvii(Face++) Inc; Bolei Zhou, MIT; Jian Sun, Megvii, Face++ |
P-2A-39 | PARN: Pyramidal Affine Regression Networks for Dense Semantic Correspondence Estimation | Sangryul Jeon*, Yonsei University; Seungryung Kim, Yonsei University; Dongbo Min, Ewha Womans University; Kwanghoon Sohn, Yonsei Univ. |
P-2A-40 | Structural Consistency and Controllability for Diverse Colorization | Safa Messaoud*, University of Illinois at Urbana Champaign; Alexander Schwing, UIUC; David Forsyth, University of Illinois at Urbana-Champaign |
P-2A-41 | Online Multi-Object Tracking with Dual Matching Attention Networks | Ji Zhu, Shanghai Jiao Tong University; Hua Yang*, Shanghai Jiao Tong University; Nian Liu, Northwestern Polytechnical University; Minyoung Kim, Perceptive Automata; Wenjun Zhang, Shanghai Jiao Tong University; Ming-Hsuan Yang, University of California at Merced |
P-2A-42 | MaskConnect: Connectivity Learning by Gradient Descent | Karim Ahmed*, Dartmouth College; Lorenzo Torresani, Dartmouth College |
P-2A-43 | FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans | Chen Liu*, Washington University in St. Louis; Jiaye Wu, Washington University in St. Louis; Yasutaka Furukawa, Simon Fraser University |
P-2A-44 | Image Manipulation with Perceptual Discriminators | Diana Sungatullina*, Skolkovo Institute of Science and Technology; Egor Zakharov, Skolkovo Institute of Science and Technology; Dmitry Ulyanov, Skolkovo Institute of Science and Technology; Victor Lempitsky, Skoltech |
P-2A-45 | Transductive Centroid Projection for Semi-supervised Large-scale Recognition | Yu Liu*, The Chinese University of Hong Kong; Xiaogang Wang, Chinese University of Hong Kong, Hong Kong; Guanglu Song, Sensetime; Jing Shao, Sensetime |
P-2A-46 | Eigendecomposition-free Training of Deep Networks with Zero Eigenvalue-based Losses | Zheng Dang*, Xi'an Jiaotong University; Kwang Moo Yi, University of Victoria; Yinlin Hu, EPFL; Fei Wang, Xi'an Jiaotong University; Pascal Fua, EPFL, Switzerland; Mathieu Salzmann, EPFL |
P-2A-47 | Self-supervised Knowledge Distillation Using Singular Value Decomposition | SEUNG HYUN LEE, Inha University; Daeha Kim, Inha University; Byung Cheol Song*, Inha University |
P-2A-48 | Snap Angle Prediction for 360° Panoramas | Bo Xiong*, University of Texas at Austin; Kristen Grauman, University of Texas |
P-2A-49 | Saliency Preservation in Low-Resolution Grayscale Images | Shivanthan Yohanandan*, RMIT University; Adrian Dyer, RMIT University; Dacheng Tao, University of Sydney; Andy Song, RMIT University |
P-2A-50 | PPF-FoldNet: Unsupervised Learning of Rotation Invariant 3D Local Descriptors | Tolga Birdal*, TU Munich; Haowen Deng, Technical University of Munich; Slobodan Ilic, Siemens AG |
P-2A-51 | BusterNet: Detecting Copy-Move Image Forgery with Source/Target Localization | Rex Yue Wu*, USC ISI; Wael Abd-Almageed, Information Sciences Institute; Prem Natarajan, USC ISI |
P-2A-52 | Double JPEG Detection in Mixed JPEG Quality Factors using Deep Convolutional Neural Network | Jin-Seok Park*, Korea Advanced Institute of Science and Technology (KAIST); Donghyeon Cho, KAIST; Wonhyuk Ahn, KAIST; Heung-Kyu Lee, Korea Advanced Institute of Science and Technology (KAIST) |
P-2A-53 | Unsupervised holistic image generation from key local patches | Donghoon Lee*, Seoul National University; Sangdoo Yun, Clova AI Research, NAVER Corp.; Sungjoon Choi, Seoul National University; Hwiyeon Yoo, Seoul National University; Ming-Hsuan Yang, University of California at Merced; Songhwai Oh, Seoul National University |
P-2A-54 | CrossNet: An End-to-end Reference-based Super Resolution Network using Cross-scale Warping | Haitian Zheng, HKUST; Mengqi Ji, HKUST; Haoqian Wang, Tsinghua University; Yebin Liu*, Tsinghua University; Lu Fang, Tsinghua University |
P-2A-55 | DCAN: Dual Channel-wise Alignment Networks for Unsupervised Scene Adaptation | Zuxuan Wu*, UMD; Xintong Han, University of Maryland, USA; Yen-Liang Lin, GE Global Research; Gokhan Uzunbas, Avitas Systems-GE Venture; Tom Goldstein, University of Maryland, College Park; Ser-Nam Lim, GE Global Research; Larry Davis, University of Maryland |
P-2A-56 | YouTube-VOS: Sequence-to-Sequence Video Object Segmentation | Ning Xu*, Adobe Research; Linjie Yang, Snap Research; Dingcheng Yue, UIUC; Jianchao Yang, Snap; Brian Price, Adobe; Jimei Yang, Adobe; Scott Cohen, Adobe Research; Yuchen Fan, Image Formation and Processing (IFP) Group, University of Illinois at Urbana-Champaign; Yuchen Liang, UIUC; Thomas Huang, University of Illinois at Urbana Champaign |
P-2A-57 | Selfie Video Stabilization | Jiyang Yu*, University of California San Diego; Ravi Ramamoorthi, University of California San Diego |
P-2A-58 | Videos as Space-Time Region Graphs | Xiaolong Wang*, CMU; Abhinav Gupta, CMU |
P-2A-59 | Parallel Feature Pyramid Network for Object Detection | Seung-Wook Kim*, Korea University; Hyong-Keun Kook, Korea University; Jee-Young Sun, Korea University; Mun-Cheon Kang, Korea University; Sung-Jea Ko, Korea University |
P-2A-60 | Goal-Oriented Visual Question Generation via Intermediate Rewards | Junjie Zhang, University of Technology, Sydney; Qi Wu*, University of Adelaide; Chunhua Shen, University of Adelaide; Jian Zhang, UTS; Jianfeng Lu, Nanjing University of Science and Technology; Anton Van Den Hengel, University of Adelaide |
P-2A-61 | WildDash - Creating Hazard-Aware Benchmarks | Oliver Zendel*, AIT Austrian Institute of Technology; Katrin Honauer, Heidelberg University; Markus Murschitz, AIT Austrian Institute of Technology; Daniel Steininger, AIT Austrian Institute of Technology; Gustavo Fernandez, n/a |
P-2A-62 | Reinforced Temporal Attention and Split-Rate Transfer for Depth-Based Person Re-identification | Nikolaos Karianakis*, Microsoft; Zicheng Liu, Microsoft; Yinpeng Chen, Microsoft; Stefano Soatto, UCLA |
P-2A-63 | DF-Net: Unsupervised Joint Learning of Depth and Flow using Cross-Network Consistency | Yuliang Zou*, Virginia Tech; Zelun Luo, Stanford University; Jia-Bin Huang, Virginia Tech |
P-2A-64 | Generating Multimodal Human Dynamics with a Transformation based Representation | Xinchen Yan*, University of Michigan; Akash Rastogi, UM; Ruben Villegas, University of Michigan; Eli Shechtman, Adobe Research, US; Sunkavalli Kalyan, Adobe Research; Sunil Hadap, Adobe; Ersin Yumer, Argo AI; Honglak Lee, UM |
P-2A-65 | Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation | Zhaoyang Lv*, GEORGIA TECH; Kihwan Kim, NVIDIA; Alejandro Troccoli, NVIDIA; Deqing Sun, NVIDIA; Kautz Jan, NVIDIA; James Rehg, Georgia Institute of Technology |
P-2A-66 | Learning Visual Question Answering by Bootstrapping Hard Attention | Mateusz Malinowski*, DeepMind; Carl Doersch, DeepMind; Adam Santoro, DeepMind; Peter Battaglia, DeepMind |
P-2A-67 | Image Reassembly Combining Deep Learning and Shortest Path Problem | Marie-Morgane Paumard*, ETIS; David Picard, ETIS/LIP6; Hedi Tabia, France |
P-2A-68 | RESOUND: Towards Action Recognition without Representation Bias | Yingwei Li*, UCSD; Nuno Vasconcelos, UC San Diego; Yi Li, University of California San Diego |
P-2A-69 | Key-Word-Aware Network for Referring Expression Image Segmentation | Hengcan Shi*, University of Electronic Science and Technology of China; Hongliang Li, University of Electronic Science and Technology of China; Fanman Meng, University of Electronic Science and Technology of China; Qingbo Wu, University of Electronic Science and Technology of China |
P-2A-70 | Mutual Learning to Adapt for Joint Human Parsing and Pose Estimation | Xuecheng Nie*, NUS; Jiashi Feng, NUS; Shuicheng Yan, Qihoo/360 |
P-2A-71 | Simple Baselines for Human Pose Estimation and Tracking | Bin Xiao*, MSR Asia; Haiping Wu, MSR Asia; Yichen Wei, MSR Asia |
P-2A-72 | Pose Partition Networks for Multi-Person Pose Estimation | Xuecheng Nie*, NUS; Jiashi Feng, NUS; Junliang Xing, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; Shuicheng Yan, Qihoo/360 |
P-2A-73 | Wasserstein Divergence For GANs | Jiqing Wu*, ETH Zurich; Zhiwu Huang, ETH Zurich; Janine Thoma, ETH Zurich; Dinesh Acharya, ETH Zurich; Luc Van Gool, ETH Zurich |
P-2A-74 | A Segmentation-aware Deep Fusion Network for Compressed Sensing MRI | Zhiwen Fan, Xiamen University; Liyan Sun, Xiamen University; Xinghao Ding*, Xiamen University; Yue Huang, Xiamen University; Congbo Cai, Xiamen University; John Paisley, Columbia University |
P-2A-75 | Deep Metric Learning with Hierarchical Triplet Loss | Weifeng Ge*, The University of Hong Kong |
P-2A-76 | Generative Adversarial Network with Spatial Attention for Face Attribute Editing | Gang Zhang*, Institute of Computing Technology, CAS; Meina Kan, Institute of Computing Technology, Chinese Academy of Sciences; Shiguang Shan, Chinese Academy of Sciences; Xilin Chen, China |
P-2A-77 | Proxy Clouds for Live RGB-D Stream Processing and Consolidation | Adrien Kaiser*, Telecom ParisTech; Jose Alonso Ybanez Zepeda, Ayotle SAS; Tamy Boubekeur, Paris Telecom |
P-2A-78 | Synthetically Supervised Feature Learning for Scene Text Recognition | Yang Liu*, University of Cambridge; Zhaowen Wang, Adobe Research; Hailin Jin, Adobe Research; Ian Wassell, University of Cambridge |
P-2A-79 | Scale Aggregation Network for Accurate and Efficient Crowd Counting | Xinkun Cao*, Beijing University of Posts and Telecommunications; Zhipeng Wang, School of Communication and Information Engineering, Beijing University of Posts and Telecommunications; Yanyun Zhao, Beijing University of Posts and Telecommunications; Fei Su, Beijing University of Posts and Telecommunications |
P-2A-80 | PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities | Lan Wang, Chongqing Key Laboratory of Signal and Information Processing, Chongqing University of Posts and Telecommunications; Chenqiang Gao*, Chongqing Key Laboratory of Signal and Information Processing, Chongqing University of Posts and Telecommunications; Luyu Yang, Chongqing Key Laboratory of Signal and Information Processing, Chongqing University of Posts and Telecommunications; Yue Zhao, Chongqing Key Laboratory of Signal and Information Processing, Chongqing University of Posts and Telecommunications; Wangmeng Zuo, Harbin Institute of Technology, China; Deyu Meng, Xi'an Jiaotong University |
P-2A-81 | OmniDepth: Dense Depth Estimation for Indoors Spherical Panoramas | NIKOLAOS ZIOULIS*, CERTH / CENTRE FOR RESEARCH AND TECHNOLOGY HELLAS; Antonis Karakottas, CERTH / CENTRE FOR RESEARCH AND TECHNOLOGY HELLAS; Dimitrios Zarpalas, CERTH / CENTRE FOR RESEARCH AND TECHNOLOGY HELLAS; Petros Daras, ITI-CERTH, Greece |
P-2A-82 | Hashing with Binary Matrix Pursuit | Fatih Cakir*, Boston University; Kun He, Boston University; Stan Sclaroff, Boston University |
P-2A-83 | Probabilistic Video Generation using Holistic Attribute Control | Jiawei He*, Simon Fraser University; Andreas Lehrmann, Facebook; Joe Marino, California Institute of Technology; Greg Mori, Simon Fraser University; Leonid Sigal, University of British Columbia |
P-2A-84 | Transductive Semi-Supervised Deep Learning using Min-Max Features | Weiwei Shi*, Xi'an Jiaotong University; Yihong Gong, Xi'an Jiaotong University; Chris Ding, UNIVERSITY OF TEXAS AT ARLINGTON; Zhiheng Ma, Xi'an Jiaotong University; Xiaoyu Tao, Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University; Nanning Zheng, Xi'an Jiaotong University |
P-2A-85 | Deep Feature Pyramid Reconfiguration for Object Detection | Tao Kong*, Tsinghua; Fuchun Sun, Tsinghua; Wenbing Huang, Tencent AI Lab; ? ??, ???? |
P-2A-86 | Quadtree Convolutional Neural Networks | Pradeep Kumar Jayaraman*, Nanyang Technological University; Jianhan Mei, Nanyang Technological University; Jianfei Cai, Nanyang Technological University; Jianmin Zheng, Nanyang Technological University |
P-2A-87 | Correcting the Triplet Selection Bias for Triplet Loss | Baosheng Yu*, The University of Sydney; Tongliang Liu, The University of Sydney; Mingming Gong, CMU & U Pitt; Changxing Ding, South China University of Technology; Dacheng Tao, University of Sydney |
P-2A-88 | Adversarial Geometry-Aware Human Motion Prediction | Liangyan Gui*, Carnegie Mellon University; Yu-Xiong Wang, Carnegie Mellon University; Xiaodan Liang, Carnegie Mellon University; José M. F. Moura, Carnegie Mellon University |
Poster session 2B
2B | Tuesday, September 11 | Poster session 04:00 PM - 06:00 PM |
---|---|---|
P-2B-01 | 3D Motion Sensing from 4D Light Field Gradients | Sizhuo Ma*, University of Wisconsin-Madison; Brandon Smith, University of Wisconsin-Madison; Mohit Gupta, University of Wisconsin-Madison, USA |
P-2B-02 | A Trilateral Weighted Sparse Coding Scheme for Real-World Image Denoising | XU JUN, The Hong Kong Polytechnic University; Lei Zhang*, Hong Kong Polytechnic University, Hong Kong, China; D. Zhang, The Hong Kong Polytechnic University |
P-2B-03 | Saliency Detection in 360° Videos | Ziheng Zhang, Shanghaitech University; Yanyu Xu*, Shanghaitech University; Shenghua Gao, Shanghaitech University; Jingyi Yu, Shanghai Tech University |
P-2B-04 | Learning to Blend Photos | Wei-Chih Hung*, University of California, Merced; Jianming Zhang, Adobe Research; Xiaohui Shen, Adobe Research; Zhe Lin, Adobe Research; Joon-Young Lee, Adobe Research; Ming-Hsuan Yang, University of California at Merced |
P-2B-05 | Escaping from Collapsing Modes in a Constrained Space | Chieh Lin, National Tsing Hua University; Chia-Che Chang, National Tsing Hua University; Che-Rung Lee, National Tsing Hua University; Hwann-Tzong Chen*, National Tsing Hua University |
P-2B-06 | Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes | Fangneng Zhan, Nanyang Technological University; Shijian Lu*, Nanyang Technological University; Chuhui Xue, Nanyang Technological University |
P-2B-07 | Layer-structured 3D Scene Inference via View Synthesis | Shubham Tulsiani*, UC Berkeley; Richard Tucker, Google; Noah Snavely, - |
P-2B-08 | Perturbation Robust Representations of Topological Persistence Diagrams | Anirudh Som*, Arizona State University; Kowshik Thopalli, Arizona State University; Karthikeyan Natesan Ramamurthy, IBM Research; Vinay Venkataraman, Arizona State University; Ankita Shukla, Indraprastha Institute of Information Technology - Delhi; Pavan Turaga, Arizona State University |
P-2B-09 | Analyzing Clothing Layer Deformation Statistics of 3D Human Motions | Jinlong YANG*, Inria; Jean-Sebastien Franco, INRIA; Franck Hétroy-Wheeler, University of Strasbourg; Stefanie Wuhrer, Inria |
P-2B-10 | Neural Nonlinear Least Squares with Application to Dense Tracking and Mapping | Ronald Clark*, Imperial College London; Michael Bloesch, Imperial; Jan Czarnowski, Imperial College London; Andrew Davison, Imperial College London; Stefan Leutenegger, Imperial College London |
P-2B-11 | Propagating LSTM: 3D Pose Estimation based on Joint Interdependency | Kyoungoh Lee*, Yonsei University; Inwoong Lee, Yonsei University; Sanghoon Lee, Yonsei University, Korea |
P-2B-12 | Proximal Dehaze-Net: A Prior Learning-Based Deep Network for Single Image Dehazing | Dong Yang, Xi'an Jiaotong University; JIAN SUN*, Xi'an Jiaotong University |
P-2B-13 | Attend and Rectify: a gated attention mechanism for fine-grained recovery | Pau Rodriguez Lopez*, Computer Vision Center, Universitat Autonoma de Barcelona; Guillem Cucurull, Computer Vision Center, Universitat Autonoma de Barcelona; Josep Gonfaus, Computer Vision Center; Jordi Gonzalez, UA Barcelona; Xavier Roca, Computer Vision Center, Universitat Autonoma de Barcelona |
P-2B-14 | Learning to Capture Light Fields through A Coded Aperture Camera | Yasutaka Inagaki*, Nagoya University; Yuto Kobayashi, Nagoya University; Keita Takahashi, Nagoya University; Toshiaki Fujii, Nagoya University; Hajime Nagahara, Osaka University |
P-2B-15 | AMC: AutoML for Model Compression and Acceleration on Mobile Devices | Yihui He, Xi'an Jiaotong University; Ji Lin, MIT; Zhijian Liu, MIT; Hanrui Wang, MIT; Li-Jia Li, Google; Song Han, MIT |
P-2B-16 | Extreme Network Compression via Filter Group Approximation | Bo Peng*, Hikvision Research Institute; Wenming Tan, Hikvision Research Institute; Zheyang Li, Hikvision Research Institute; Shun Zhang, Hikvision Research Institute; Di Xie, Hikvision Research Institute; Shiliang Pu, Hikvision Research Institute |
P-2B-17 | Retrospective Encoders for Video Summarization | Ke Zhang*, USC; Kristen Grauman, University of Texas; Fei Sha, USC |
P-2B-18 | Optimized Quantization for Highly Accurate and Compact DNNs | Dongqing Zhang, Microsoft Research; Jiaolong Yang*, Microsoft Research Asia (MSRA); Dongqiangzi Ye, Microsoft Research; Gang Hua, Microsoft Cloud and AI |
P-2B-19 | Universal Sketch Perceptual Grouping | Ke LI*, Queen Mary University of London; Kaiyue Pang, Queen Mary University of London; Jifei Song, Queen Mary, University of London; Yi-Zhe Song, Queen Mary University of London; Tao Xiang, Queen Mary, University of London, UK; Timothy Hospedales, Edinburgh University; Honggang Zhang, Beijing University of Posts and Telecommunications |
P-2B-20 | Uncertainty Estimates and Multi-Hypotheses Networks for Optical Flow | Eddy Ilg*, University of Freiburg; Özgün Çiçek, University of Freiburg; Silvio Galesso, University of Freiburg; Aaron Klein, Universität Freiburg; Osama Makansi, University of Freiburg; Frank Hutter, University of Freiburg; Thomas Brox, University of Freiburg |
P-2B-21 | Learning 3D Keypoint Descriptors for Non-Rigid Shape Matching | Hanyu Wang, NLPR, Institute of Automation, Chinese Academy of Sciences; Jianwei Guo*, NLPR, Institute of Automation, Chinese Academy of Sciences; Yan Dong-Ming, NLPR, CASIA; Weize Quan, NLPR, Institute of Automation, Chinese Academy of Sciences; Xiaopeng Zhang, Institute of Automation, Chinese Academy of Sciences |
P-2B-22 | A Joint Sequence Fusion Model for Video Question Answering and Retrieval | Youngjae Yu, Seoul National University Vision and Learning Lab; Jongseok Kim, Seoul National University Vision and Learning Lab; Gunhee Kim*, Seoul National University |
P-2B-23 | Deformable Pose Traversal Convolution for 3D Action and Gesture Recognition | Junwu Weng*, Nanyang Technological University; Mengyuan Liu, Nanyang Technological University; Xudong Jiang, Nanyang Technological University; Junsong Yuan, State University of New York at Buffalo, USA |
P-2B-24 | Fine-Grained Visual Categorization using Meta-Learning Optimization with Sample Selection of Auxiliary Data | Yabin Zhang, South China University of Technology; Tang Hui, South China University of Technology; Kui Jia*, South China University of Technology |
P-2B-25 | Stereo relative pose from line and point feature triplets | Alexander Vakhitov*, Skoltech; Victor Lempitsky, Skoltech; Yinqiang Zheng, National Institute of Informatics |
P-2B-26 | Convolutional Block Attention Module | Sanghyun Woo*, KAIST; Jongchan Park, KAIST; Joon-Young Lee, Adobe Research; In So Kweon, KAIST |
P-2B-27 | EC-Net: an Edge-aware Point set Consolidation Network | Lequan Yu*, The Chinese University of Hong Kong; Xianzhi Li, The Chinese University of Hong Kong; Chi-Wing Fu, The Chinese University of Hong Kong; Danny Cohen-Or, Tel Aviv University; Pheng-Ann Heng, The Chinese University of Hong Kong |
P-2B-28 | Video Compression through Image Interpolation | Chao-Yuan Wu*, UT Austin; Nayan Singhal, UT Austin; Philipp Kraehenbuehl, UT Austin |
P-2B-29 | Burst Image Deblurring Using Permutation Invariant Convolutional Neural Networks | Miika Aittala*, MIT; Fredo Durand, MIT |
P-2B-30 | HybridNet: Classification and Reconstruction Cooperation for Semi-Supervised Learning | Thomas Robert*, LIP6 / Sorbonne Universite; Nicolas Thome, CNAM, Paris; Matthieu Cord, Sorbonne University |
P-2B-31 | Structure-from-Motion-Aware PatchMatch for Adaptive Optical Flow Estimation | Daniel Maurer*, University of Stuttgart; Nico Marniok, Universität Konstanz; Bastian Goldluecke, University of Konstanz; Andrés Bruhn, University of Stuttgart |
P-2B-32 | Joint & Progressive Learning from High-Dimensional Data for Multi-Label Classification | Danfeng Hong*, Technical University of Munich (TUM) / German Aerospace Center (DLR); Naoto Yokoya, RIKEN Center for Advanced Intelligence Project (AIP); Jian Xu, German Aerospace Center (DLR); Xiaoxiang Zhu, DLR&TUM |
P-2B-33 | SDC-Net: Video prediction using spatially-displaced convolution | Fitsum Reda*, NVIDIA; Guilin Liu, NVIDIA; Kevin Shih, NVIDIA; Robert Kirby, Nvidia; Jon Barker, Nvidia; David Tarjan, Nvidia; Andrew Tao, NVIDIA; Bryan Catanzaro, NVIDIA |
P-2B-34 | Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation | Liang-Chieh Chen*, Google Inc.; Yukun Zhu, Google Inc.; George Papandreou, Google; Florian Schroff, Google Inc.; Hartwig Adam, Google |
P-2B-35 | VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions | Qing Li*, University of Science and Technology of China; Qingyi Tao, Nanyang Technological University; Shafiq Joty, Nanyang Technological University; Jianfei Cai, Nanyang Technological University; Jiebo Luo, U. Rochester |
P-2B-36 | Image Super-Resolution Using Very Deep Residual Channel Attention Networks | Yulun Zhang*, Northeastern University; Kunpeng Li, Northeastern University; Kai Li, Northeastern University; Lichen Wang, Northeastern University; Bineng Zhong, Huaqiao University; YUN FU, Northeastern University |
P-2B-37 | Urban Zoning Using Higher-Order Markov Random Fields on Multi-View Imagery Data | Tian Feng, University of New South Wales; Quang-Trung Truong, SUTD; Thanh Nguyen*, Deakin University, Australia; Jing Yu Koh, SUTD; Lap-Fai Yu, UMass Boston; Sai-Kit Yeung, Singapore University of Technology and Design; Alexander Binder, Singapore University of Technology and Design |
P-2B-38 | Clustering Convolutional Kernels to Compress Deep Neural Networks | Sanghyun Son, Seoul National University; Seungjun Nah, Seoul National University; Kyoung Mu Lee*, Seoul National University |
P-2B-39 | Explainable Neural Computation via Stack Neural Module Networks | Ronghang Hu*, University of California, Berkeley; Jacob Andreas, UC Berkeley; Trevor Darrell, UC Berkeley; Kate Saenko, Boston University |
P-2B-40 | Quaternion Convolutional Neural Networks | Xuanyu Zhu*, Shanghai Jiao Tong University; Yi Xu, Shanghai Jiao Tong University; Hongteng Xu, Duke University; Changjian Chen, Shanghai Jiao Tong University |
P-2B-41 | Lip Movements Generation at a Glance | Lele Chen*, University of Rochester; Zhiheng Li, Wuhan University; Ross Maddox, University of Rochester; Zhiyao Duan, University of Rochester; Chenliang Xu, University of Rochester |
P-2B-42 | Toward Scale-Invariance and Position-Sensitive Object Proposal Networks | Hsueh-Fu Lu, Umbo Computer Vision; Ping-Lin Chang*, Umbo Computer Vision; Xiaofei Du, Umbo Computer Vision |
P-2B-43 | Constraints Matter in Deep Neural Network Compression | Changan Chen, Simon Fraser University; Fred Tung*, Simon Fraser University; Naveen Vedula, Simon Fraser University; Greg Mori, Simon Fraser University |
P-2B-44 | MRF Optimization with Separable Convex Prior on Partially Ordered Labels | Csaba Domokos*, Technical University of Munich; Frank Schmidt, BCAI; Daniel Cremers, TUM |
P-2B-45 | Switchable Temporal Propagation Network | Sifei Liu*, NVIDIA; Ming-Hsuan Yang, University of California at Merced; Guangyu Zhong, Dalian University of Technology; Jinwei Gu, Nvidia; Shalini De Mello, NVIDIA Research; Kautz Jan, NVIDIA; Varun Jampani, Nvidia Research |
P-2B-46 | T2Net: Synthetic-to-Realistic Translation for Solving Single-Image Depth Estimation Tasks | Chuanxia Zheng*, Nanyang Technological University; Tat-Jen Cham, Nanyang Technological University; Jianfei Cai, Nanyang Technological University |
P-2B-47 | ArticulatedFusion: Real-time Reconstruction of Motion, Geometry and Segmentation Using a Single Depth Camera | Chao Li*, The University of Texas at Dallas; Zheheng Zhao, The University of Texas at Dallas; Xiaohu Guo, The University of Texas at Dallas |
P-2B-48 | NNEval: Neural Network based Evaluation Metric for Image Captioning | Naeha Sharif*, University of Western Australia; Lyndon White, University of Western Australia; Mohammed Bennamoun, University of Western Australia; Syed Afaq Ali Shah, Department of Computer Science and Software Engineering, The University of Western Australia |
P-2B-49 | Coreset-Based Convolutional Neural Network Compression | Abhimanyu Dubey*, Massachusetts Institute of Technology; Moitreya Chatterjee, University of Illinois at Urbana Champaign; Ramesh Raskar, Massachusetts Institute of Technology; Narendra Ahuja, University of Illinois at Urbana-Champaign, USA |
P-2B-50 | Context Refinement for Object Detection | Zhe Chen*, University of Sydney; Shaoli Huang, University of Sydney; Dacheng Tao, University of Sydney |
P-2B-51 | Real-time Actor-Critic Tracking | Boyu Chen*, Dalian University of Technology; Dong Wang, Dalian University of Technology; Peixia Li, Dalian University of Technology; Huchuan Lu, Dalian University of Technology |
P-2B-52 | Partial Adversarial Domain Adaptation | Zhangjie Cao, Tsinghua University; Lijia Ma, Tsinghua University; Mingsheng Long*, Tsinghua University; Jianmin Wang, Tsinghua University, China |
P-2B-53 | Localization Recall Precision (LRP): A New Performance Metric for Object Detection | Kemal Oksuz*, Middle East Technical University; Barış Can Çam, Roketsan; Emre Akbas, Middle East Technical University; Sinan Kalkan, Middle East Technical University |
P-2B-54 | Improving Embedding Generalization via Scalable Neighborhood Component Analysis | Zhirong Wu*, UC Berkeley; Alexei Efros, UC Berkeley; Stella Yu, UC Berkeley / ICSI |
P-2B-55 | Leveraging Motion Priors in Videos for Improving Human Segmentation | Yu-Ting Chen*, NTHU; Wen-Yen Chang, NTHU; Hai-Lun Lu, NTHU; Tingfan Wu, Umbo Computer Vision; Min Sun, NTHU |
P-2B-56 | Recurrent Squeeze-and-Excitation Context Aggregation Net for Single Image Deraining | Xia Li*, Peking University Shenzhen Graduate School; Jianlong Wu, Peking University; Zhouchen Lin, Peking University; Hong Liu, Peking University Shenzhen Graduate School; Hongbin Zha, Peking University, China |
P-2B-57 | Statistically-motivated Second-order Pooling | Kaicheng Yu*, EPFL; Mathieu Salzmann, EPFL |
P-2B-58 | SegStereo: Exploiting Semantic Information for Disparity Estimation | Guorun Yang*, Tsinghua University; Hengshuang Zhao, The Chinese University of Hong Kong; Jianping Shi, Sensetime Group Limited; Jia Jiaya, Chinese University of Hong Kong |
P-2B-59 | Small-scale Pedestrian Detection Based on Somatic Topology Localization and Temporal Feature Aggregation | Tao Song, Hikvision Research Institute; Leiyu Sun, Hikvision Research Institute; Di Xie*, Hikvision Research Institute; Haiming Sun, Hikvision Research Institute; Shiliang Pu, Hikvision Research Institute |
P-2B-60 | Object Detection with an Aligned Spatial-Temporal Memory | Fanyi Xiao*, University of California Davis; Yong Jae Lee, University of California, Davis |
P-2B-61 | Learning to Drive with 360° Surround-View Cameras and a Map | Simon Hecker*, ETH Zurich; Dengxin Dai, ETH Zurich; Luc Van Gool, ETH Zurich |
P-2B-62 | Monocular Scene Parsing and Reconstruction using 3D Holistic Scene Grammar | Siyuan Huang*, UCLA; Siyuan Qi, UCLA; Yixin Zhu, UCLA; Yinxue Xiao, University of California, Los Angeles; Yuanlu Xu, University of California, Los Angeles; Song-Chun Zhu, UCLA |
P-2B-63 | Coded Illumination and Imaging for Fluorescence Based Classification | Yuta Asano, Tokyo Institute of Technology; Misaki Meguro, Tokyo Institute of Technology; Chao Wang, Kyushu Institute of Technology; Antony Lam*, Saitama University; Yinqiang Zheng, National Institute of Informatics; Takahiro Okabe, Kyushu Institute of Technology; Imari Sato, National Institute of Informatics |
P-2B-64 | Modality Distillation with Multiple Stream Networks for Action Recognition | Nuno Garcia, IIT; Pietro Morerio*, IIT; Vittorio Murino, Istituto Italiano di Tecnologia |
P-2B-65 | VideoMatch: Matching based Video Object Segmentation | Yuan-Ting Hu*, University of Illinois at Urbana-Champaign; Jia-Bin Huang, Virginia Tech; Alexander Schwing, UIUC |
P-2B-66 | Superpixel Sampling Networks | Varun Jampani*, Nvidia Research; Deqing Sun, NVIDIA; Ming-Yu Liu, NVIDIA; Ming-Hsuan Yang, University of California at Merced; Kautz Jan, NVIDIA |
P-2B-67 | Deep Bilinear Learning for RGB-D Action Recognition | HU Jian-Fang, Sun Yat-sen University; Jason Wei Shi Zheng*, Sun Yat Sen University; Pan Jiahui, Sun Yat-sen University; Jian-Huang Lai, Sun Yat-sen University; Jianguo Zhang, University of Dundee |
P-2B-68 | Multi-object Tracking with Neural Gating using bilinear LSTMs | Chanho Kim*, Georgia Tech; Fuxin Li, Oregon State University; James Rehg, Georgia Institute of Technology |
P-2B-69 | Direct Sparse Odometry With Rolling Shutter | David Schubert*, Technical University of Munich; Vladyslav Usenko, TU Munich; Nikolaus Demmel, TUM; Joerg Stueckler, Technical University of Munich; Daniel Cremers, TUM |
P-2B-70 | Person Search via A Mask-guided Two-stream CNN Model | Di Chen*, Nanjing University of Science and Technology; Shanshan Zhang, Max Planck Institute for Informatics; Wanli Ouyang, CUHK; Jian Yang, Nanjing University of Science and Technology; Ying Tai, Tencent |
P-2B-71 | Imagine This! Scripts to Compositions to Videos | Tanmay Gupta*, UIUC; Dustin Schwenk, Allen Institute for Artificial Intelligence; Ali Farhadi, University of Washington; Derek Hoiem, University of Illinois at Urbana-Champaign; Aniruddha Kembhavi, Allen Institute for Artificial Intelligence |
P-2B-72 | Multiresolution Tree Networks for Point Cloud Processing | Matheus Gadelha*, University of Massachusetts Amherst; Subhransu Maji, University of Massachusetts, Amherst; Rui Wang, U Massachusetts |
P-2B-73 | Quantization Mimic: Towards Very Tiny CNN for Object Detection | Yi Wei*, Tsinghua University; Xinyu Pan, MMLAB, CUHK; Hongwei Qin, SenseTime; Junjie Yan, Sensetime; Wanli Ouyang, CUHK |
P-2B-74 | Multi-scale Residual Network for Image Super-Resolution | Juncheng Li, East China Normal University; Faming Fang*, East China Normal University; Kangfu Mei, Jiangxi Normal University; Guixu Zhang, East China Normal University |
P-2B-75 | BodyNet: Volumetric Inference of 3D Human Body Shapes | Gul Varol*, INRIA; Duygu Ceylan, Adobe Research; Bryan Russell, Adobe Research; Jimei Yang, Adobe; Ersin Yumer, Argo AI; Ivan Laptev, INRIA Paris; Cordelia Schmid, INRIA |
P-2B-76 | 3D Recurrent Neural Networks with Context Fusion for Point Cloud Semantic Segmentation | Xiaoqing Ye*, SIMIT; Jiamao Li, SIMIT; Hexiao Huang, Shanghai Opening University; Xiaolin Zhang, SIMIT |
P-2B-77 | Robust Anchor Embedding for Unsupervised Video Re-Identification in the Wild | Mang YE*, Hong Kong Baptist University; Xiangyuan Lan, Department of Computer Science, Hong Kong Baptist University; PongChi Yuen, Department of Computer Science, Hong Kong Baptist University |
P-2B-78 | Towards Robust Neural Networks via Random Self-ensemble | Xuanqing Liu, UC Davis Department of Computer Science; Minhao Cheng, University of California, Davis; Huan Zhang, UC Davis; Cho-Jui Hsieh*, UC Davis Department of Computer Science and Statistics |
P-2B-79 | SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters | Yifan Xu, Tsinghua University; Tianqi Fan, Multimedia Laboratory, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; Mingye Xu, Multimedia Laboratory, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; Long Zeng, Tsinghua University; Yu Qiao*, Multimedia Laboratory, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences |
P-2B-80 | CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving | Xiaodan Liang*, Carnegie Mellon University; Tairui Wang, Petuum Inc; Luona Yang, Carnegie Mellon University; Eric Xing, Petuum Inc. |
P-2B-81 | Normalized Blind Deconvolution | Meiguang Jin*, University of Bern; Stefan Roth, TU Darmstadt; Paolo Favaro, Bern University, Switzerland |
P-2B-82 | Few-Shot Human Motion Prediction via Meta-Learning | Liangyan Gui*, Carnegie Mellon University; Yu-Xiong Wang, Carnegie Mellon University; Deva Ramanan, Carnegie Mellon University; José M. F. Moura, Carnegie Mellon University |
P-2B-83 | Learning to Segment via Cut-and-Paste | Tal Remez*, Tel-Aviv University; Matthew Brown, Google; Jonathan Huang, Google |
P-2B-84 | Weakly-supervised 3D Hand Pose Estimation from Monocular RGB Images | Yujun Cai*, Nanyang Technological University; Liuhao Ge, NTU; Jianfei Cai, Nanyang Technological University; Junsong Yuan, State University of New York at Buffalo, USA |
P-2B-85 | DeepIM: Deep Iterative Matching for 6D Pose Estimation | Yi Li*, Tsinghua University; Gu Wang, Tsinghua University; Xiangyang Ji, Tsinghua University; Yu Xiang, University of Michigan; Dieter Fox, University of Washington |
P-2B-86 | Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input | David Harwath*, MIT CSAIL; Adria Recasens, Massachusetts Institute of Technology; Dídac Surís, Universitat Politecnica de Catalunya; Galen Chuang, MIT; Antonio Torralba, MIT; James Glass, MIT |
P-2B-87 | A Style-aware Content Loss for Real-time HD Style Transfer | Artsiom Sanakoyeu*, Heidelberg University; Dmytro Kotovenko, Heidelberg University; Bjorn Ommer, Heidelberg University |
P-2B-88 | Implicit 3D Orientation Learning for 6D Object Detection from RGB Images | Martin Sundermeyer*, German Aerospace Center (DLR); Zoltan Marton, DLR; Maximilian Durner, DLR; Rudolph Triebel, German Aerospace Center (DLR) |
P-2B-89 | Scale-Awareness of Light Field Camera based Visual Odometry | Niclas Zeller*, Karlsruhe University of Applied Sciences; Franz Quint, Karlsruhe University of Applied Sciences; Uwe Stilla, Technische Universitaet Muenchen |
P-2B-90 | Audio-Visual Scene Analysis with Self-Supervised Multisensory Features | Andrew Owens*, UC Berkeley; Alexei Efros, UC Berkeley |
Poster session 3A
3A | Wednesday, September 12 | Poster session 10:00 AM - 12:00 PM |
---|---|---|
P-3A-01 | Efficient Sliding Window Computation for NN-Based Template Matching | Lior Talker*, Haifa University; Yael Moses, IDC, Israel; Ilan Shimshoni, University of Haifa |
P-3A-02 | Active Stereo Net: End-to-End Self-Supervised Learning for Active Stereo Systems | Yinda Zhang*, Princeton University; Sean Fanello, Google; Sameh Khamis, Google; Christoph Rhemann, Google; Julien Valentin, Google; Adarsh Kowdle, Google; Vladimir Tankovich, Google; Shahram Izadi, Google; Thomas Funkhouser, Princeton, USA |
P-3A-03 | GAL: Geometric Adversarial Loss for Single-View 3D-Object Reconstruction | Li Jiang*, The Chinese University of Hong Kong; Xiaojuan Qi, CUHK; Shaoshuai SHI, The Chinese University of Hong Kong; Jia Jiaya, Chinese University of Hong Kong |
P-3A-04 | Learning to Reconstruct High-quality 3D Shapes with Cascaded Fully Convolutional Networks | Yan-Pei Cao*, Tsinghua University; Zheng-Ning Liu, Tsinghua University; Zheng-Fei Kuang, Tsinghua University; Shi-Min Hu, Tsinghua University |
P-3A-05 | Deep Reinforcement Learning with Iterative Shift for Visual Tracking | Liangliang Ren, Tsinghua University; Xin Yuan, Tsinghua University; Jiwen Lu*, Tsinghua University; Ming Yang, Horizon Robotics; Jie Zhou, Tsinghua University, China |
P-3A-06 | CPlaNet: Enhancing Image Geolocalization by Combinatorial Partitioning of Maps | Paul Hongsuck Seo*, POSTECH; Tobias Weyand, Google Inc.; Jack Sim, Google LLC; Bohyung Han, Seoul National University |
P-3A-07 | Bayesian Instance Segmentation in Open Set World | Trung Pham*, NVIDIA; Vijay Kumar B G, University of Adelaide; Thanh-Toan Do, The University of Adelaide; Gustavo Carneiro, University of Adelaide; Ian Reid, University of Adelaide, Australia |
P-3A-08 | Characterizing Adversarial Examples Based on Spatial Consistency Information for Semantic Segmentation | Chaowei Xiao, University of Michigan, Ann Arbor; Ruizhi Deng, Simon Fraser University; Bo Li*, University of Illinois at Urbana–Champaign and UC Berkeley; Fisher Yu, UC Berkeley; Mingyan Liu, University of Michigan, Ann Arbor; Dawn Song, UC Berkeley |
P-3A-09 | CubeNet: Equivariance to 3D Rotation and Translation | Daniel Worrall*, UCL; Gabriel Brostow, University College London |
P-3A-10 | 3D Face Reconstruction from Light Field Images: A Model-free Approach | Mingtao Feng, Hunan University; Syed Zulqarnain Gilani*, The University of Western Australia; Yaonan Wang, Hunan University; Ajmal Mian, University of Western Australia |
P-3A-11 | stagNet: An Attentive Semantic RNN for Group Activity Recognition | Mengshi Qi*, Beihang University; Jie Qin, ETH Zurich; Annan Li, Beijing University of Aeronautics and Astronautics; Yunhong Wang, State Key Laboratory of Virtual Reality Technology and System, Beihang University, Beijing 100191, China; Jiebo Luo, U. Rochester; Luc Van Gool, ETH Zurich |
P-3A-12 | Supervising the new with the old: learning SFM from SFM | Maria Klodt*, University of Oxford; Andrea Vedaldi, Oxford University |
P-3A-13 | PSANet: Point-wise Spatial Attention Network for Scene Parsing | Hengshuang Zhao*, The Chinese University of Hong Kong; Yi ZHANG, The Chinese University of Hong Kong; Shu Liu, CUHK; Jianping Shi, Sensetime Group Limited; Chen Change Loy, Chinese University of Hong Kong; Dahua Lin, The Chinese University of Hong Kong; Jia Jiaya, Chinese University of Hong Kong |
P-3A-14 | FishEyeRecNet: A Multi-Context Collaborative Deep Network for Fisheye Image Rectification | Xiaoqing Yin*, University of Sydney; Xinchao Wang, Stevens Institute of Technology; Jun Yu, HDU; Maojun Zhang, National University of Defense Technology, China; Pascal Fua, EPFL, Switzerland; Dacheng Tao, University of Sydney |
P-3A-15 | ELEGANT: Exchanging Latent Encodings with GAN for Transferring Multiple Face Attributes | Taihong Xiao*, Peking University; Jiapeng Hong, Peking University; Jinwen Ma, Peking University |
P-3A-16 | Deep Bilevel Learning | Simon Jenni*, Universität Bern; Paolo Favaro, Bern University, Switzerland |
P-3A-17 | ADVIO: An Authentic Dataset for Visual-Inertial Odometry | Santiago Cortes, Aalto University; Arno Solin*, Aalto University; Esa Rahtu, Tampere University of Technology; Juho Kannala, Aalto University, Finland |
P-3A-18 | D2S: Densely Segmented Supermarket Dataset | Patrick Follmann*, MVTec Software GmbH; Tobias Böttger, MVTec Software GmbH; Philipp Härtinger, MVTec Software GmbH; Rebecca König, MVTec Software GmbH; Markus Ulrich, MVTec Software GmbH |
P-3A-19 | PyramidBox: A Context-assisted Single Shot Face Detector | Xu Tang, Baidu; Daniel Du*, Baidu; Zeqiang He, Baidu; Jingtuo Liu, Baidu |
P-3A-20 | Structured Siamese Network for Real-Time Visual Tracking | Yunhua Zhang, Dalian University of Technology; Lijun Wang, Dalian University of Technology; Dong Wang, Dalian University of Technology; Mengyang Feng, Dalian University of Technology; Huchuan Lu*, Dalian University of Technology; Jinqing Qi, Dalian University of Technology |
P-3A-21 | Probabilistic Signed Distance Function for On-the-fly Scene Reconstruction | Wei Dong*, Peking University; Qiuyuan Wang, Peking University; Xin Wang, Peking University; Hongbin Zha, Peking University, China |
P-3A-22 | 3D Vehicle Trajectory Reconstruction in Monocular Video Data Using Environment Structure Constraints | Sebastian Bullinger*, Fraunhofer IOSB; Christoph Bodensteiner, Fraunhofer IOSB; Michael Arens, Fraunhofer IOSB; Rainer Stiefelhagen, Karlsruhe Institute of Technology |
P-3A-23 | Unsupervised Image-to-Image Translation with Stacked Cycle-Consistent Adversarial Networks | Minjun Li*, Fudan University; Haozhi Huang, Tencent AI Lab; Lin Ma, Tencent AI Lab; Wei Liu, Tencent AI Lab; Tong Zhang, Tencent AI Lab; Yu-Gang Jiang, Fudan University |
P-3A-24 | Pose-Normalized Image Generation for Person Re-identification | Xuelin Qian, Fudan University; Yanwei Fu*, Fudan Univ.; Tao Xiang, Queen Mary, University of London, UK; Wenxuan Wang, Fudan University; Jie Qiu, Nara Institute of Science and Technology; Yang Wu, Nara Institute of Science and Technology; Yu-Gang Jiang, Fudan University; Xiangyang Xue, Fudan University |
P-3A-25 | Action Anticipation with RBF Kernelized Feature Mapping RNN | Yuge Shi*, Australian National University; Basura Fernando, Australian National University; Richard Hartley, Australian National University, Australia |
P-3A-26 | Rendering Portraitures from Monocular Camera and Beyond | Xiangyu Xu*, Tsinghua University; Deqing Sun, NVIDIA; Sifei Liu, NVIDIA; Wenqi Ren, Institute of Information Engineering, Chinese Academy of Sciences; Yu-Jin Zhang, Tsinghua University; Ming-Hsuan Yang, University of California at Merced; Jian Sun, Megvii, Face++ |
P-3A-27 | Recovering 3D Planes from a Single Image via Convolutional Neural Networks | Fengting Yang*, Pennsylvania State University; Zihan Zhou, Penn State University |
P-3A-28 | The Devil of Face Recognition is in the Noise | Liren Chen*, Sensetime Group Limited; Fei Wang, SenseTime; Cheng Li, SenseTime Research; Shiyao Huang, SenseTime Co Ltd; Yanjie Chen, SenseTime; Chen Qian, SenseTime; Chen Change Loy, Chinese University of Hong Kong |
P-3A-29 | 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation | Angela Dai*, Stanford University; Matthias Niessner, Technical University of Munich |
P-3A-30 | Joint optimization for compressive video sensing and reconstruction under hardware constraints | Michitaka Yoshida*, Kyushu University; Akihiko Torii, Tokyo Institute of Technology, Japan; Masatoshi Okutomi, Tokyo Institute of Technology; Kenta Endo, Hamamatsu Photonics K. K.; Yukinobu Sugiyama, Hamamatsu Photonics K. K.; Hajime Nagahara, Osaka University |
P-3A-31 | Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition | Xiaohang Zhan*, The Chinese University of Hong Kong; Ziwei Liu, The Chinese University of Hong Kong; Junjie Yan, Sensetime Group Limited; Dahua Lin, The Chinese University of Hong Kong; Chen Change Loy, Chinese University of Hong Kong |
P-3A-32 | Recovering Accurate 3D Human Pose in The Wild Using IMUs and a Moving Camera | Timo von Marcard*, University of Hannover; Roberto Henschel, Leibniz University of Hannover; Michael Black, Max Planck Institute for Intelligent Systems; Bodo Rosenhahn, Leibniz University Hannover; Gerard Pons-Moll, MPII, Germany |
P-3A-33 | Predicting Future Instance Segmentation by Forecasting Convolutional Features | Pauline Luc*, Facebook AI Research; Camille Couprie, Facebook; Yann LeCun, Facebook; Jakob Verbeek, INRIA |
P-3A-34 | PS-FCN: A Flexible Learning Framework for Photometric Stereo | Guanying Chen*, The University of Hong Kong; Kai Han, University of Oxford; Kwan-Yee Wong, The University of Hong Kong |
P-3A-35 | Unsupervised Class-Specific Deblurring | Nimisha T M*, Indian Institute of Technology Madras; Sunil Kumar, Indian Institute of Technology Madras; Rajagopalan Ambasamudram, Indian Institute of Technology Madras |
P-3A-36 | Face Super-resolution Guided by Facial Component Heatmaps | Xin Yu*, Australian National University; Basura Fernando, Australian National University; Bernard Ghanem, KAUST; Fatih Porikli, ANU; Richard Hartley, Australian National University, Australia |
P-3A-37 | A Contrario Horizon-First Vanishing Point Detection Using Second-Order Grouping Laws | Gilles Simon*, Université de Lorraine; Antoine Fond, Université de Lorraine; Marie-Odile Berger, INRIA |
P-3A-38 | Fast, Accurate, and Lightweight Super-Resolution with Cascading Residual Network | Namhyuk Ahn, Ajou University; Byungkon Kang, Ajou University; Kyung-Ah Sohn*, Ajou University |
P-3A-39 | Face Recognition with Contrastive Convolution | Chunrui Han*, ICT, Chinese Academy of Sciences, China; Shiguang Shan, Chinese Academy of Sciences; Meina Kan, ICT, CAS; Shuzhe Wu, Chinese Academy of Sciences; Xilin Chen, ICT, Chinese Academy of Sciences, China |
P-3A-40 | Deforming Autoencoders: Unsupervised Disentangling of Shape and Appearance | Zhixin Shu*, Stony Brook University; Mihir Sahasrabudhe, CentraleSupelec; Alp Guler, INRIA; Dimitris Samaras, Stony Brook University; Nikos Paragios, Therapanacea; Iasonas Kokkinos, UCL |
P-3A-41 | NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications | Tien-Ju Yang*, Massachusetts Institute of Technology; Andrew Howard, Google; Bo Chen, Google; Xiao Zhang, Google; Alec Go, Google; Vivienne Sze, Massachusetts Institute of Technology; Hartwig Adam, Google |
P-3A-42 | ExFuse: Enhancing Feature Fusion for Semantic Segmentation | Zhenli Zhang*, Fudan University; Xiangyu Zhang, Megvii Inc; Chao Peng, Megvii(Face++) Inc; Jian Sun, Megvii, Face++ |
P-3A-43 | AugGAN: Cross Domain Adaptation with GAN-based Data Augmentation | Sheng-Wei Huang, National Tsing Hua University; Che-Tsung Lin*, National Tsing Hua University; Shu-Ping Chen, National Tsing Hua University; Yen-Yi Wu, NTHU CS; Po-Hao Hsu, National Tsing Hua University; Shang-Hong Lai, National Tsing Hua University |
P-3A-44 | LAPCSR: A Deep Laplacian Pyramid Generative Adversarial Network for Scalable Compressive Sensing Reconstruction | Kai Xu*, Arizona State University; Zhikang Zhang, Arizona State University; Fengbo Ren, Arizona State University |
P-3A-45 | U-PC: Unsupervised Planogram Compliance | Archan Ray, University of Massachusetts Amherst; Nishant Kumar, SMART-FM; Avishek Shaw*, Tata Consultancy Services Limited; Dipti Prasad Mukherjee, ISI, Kolkata |
P-3A-46 | Seeing Tree Structure from Vibration | Tianfan Xue, MIT; Jiajun Wu*, MIT; Zhoutong Zhang, MIT; Chengkai Zhang, MIT; Joshua Tenenbaum, MIT; Bill Freeman, MIT |
P-3A-47 | A Dataset of Flash and Ambient Illumination Pairs from the Crowd | Yagiz Aksoy*, ETH Zurich; Changil Kim, MIT CSAIL; Petr Kellnhofer, MIT; Sylvain Paris, Adobe Research; Mohamed A. Elghareb, Qatar Computing Research Institute; Marc Pollefeys, ETH Zurich; Wojciech Matusik, MIT |
P-3A-48 | Compressing the Input for CNNs with the First-Order Scattering Transform | Edouard Oyallon*, CentraleSupélec; Eugene Belilovsky, Inria Galen / KU Leuven; Sergey Zagoruyko, Inria; Michal Valko, Inria |
P-3A-49 | Distractor-aware Siamese Networks for Visual Object Tracking | Zheng Zhu*, CASIA; Qiang Wang, University of Chinese Academy of Sciences; Bo Li, SenseTime; Wu Wei, Sensetime; Junjie Yan, Sensetime Group Limited |
P-3A-50 | "Factual" or "Emotional": Stylized Image Captioning with Adaptive Learning and Attention | Tianlang Chen*, University of Rochester; Zhongping Zhang, University of Rochester; Quanzeng You, Microsoft; Chen Fang, Adobe Research, San Jose, CA; Zhaowen Wang, Adobe Research; Hailin Jin, Adobe Research; Jiebo Luo, U. Rochester |
P-3A-51 | Constrained Optimization Based Low-Rank Approximation of Deep Neural Networks | Chong Li*, University of Washington; C.J. Richard Shi, University of Washington |
P-3A-52 | Extending Layered Models to 3D Motion | Dong Lao, KAUST; Ganesh Sundaramoorthi*, Kaust |
P-3A-53 | ExplainGAN: Model Explanation via Decision Boundary Crossing Transformations | Nathan Silberman*, Butterfly Network; Pouya Samangouei, Butterfly Network; Liam Nakagawa, Butterfly Network; Ardavan Saeedi, Butterfly Network Inc |
P-3A-54 | Adding Attentiveness to the Neurons in Recurrent Neural Networks | Pengfei Zhang, Xi'an Jiaotong University; Jianru Xue, Xi'an Jiaotong University; Cuiling Lan*, Microsoft Research; Wenjun Zeng, Microsoft Research; Zhanning Gao, Xi'an Jiaotong University; Nanning Zheng, Xi'an Jiaotong University |
P-3A-55 | ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation | Sachin Mehta*, University of Washington; Mohammad Rastegari, Allen Institute for Artificial Intelligence; Anat Caspi, University of Washington; Linda Shapiro, University of Washington; Hannaneh Hajishirzi, University of Washington |
P-3A-56 | Learning Human-Object Interactions by Graph Parsing Neural Networks | Siyuan Qi*, UCLA; Wenguan Wang, Beijing Institute of Technology; Baoxiong Jia, UCLA; Jianbing Shen, Beijing Institute of Technology; Song-Chun Zhu, UCLA |
P-3A-57 | BOP: Benchmark for 6D Object Pose Estimation | Tomas Hodan*, Czech Technical University in Prague; Frank Michel, Technical University Dresden; Eric Brachmann, TU Dresden; Wadim Kehl, Toyota Research Institute; Anders Buch, University of Southern Denmark; Dirk Kraft, Syddansk Universitet; Bertram Drost, MVTec Software GmbH; Joel Vidal, National Taiwan University of Science and Technology; Stephan Ihrke, Fraunhofer IVI; Xenophon Zabulis, FORTH; Caner Sahin, Imperial College London; Fabian Manhardt, TU Munich; Federico Tombari, Technical University of Munich, Germany; Tae-Kyun Kim, Imperial College London; Jiri Matas, CMP CTU FEE; Carsten Rother, University of Heidelberg |
P-3A-58 | RCAA: Relational Context-Aware Agents for Person Search | Xiaojun Chang*, Carnegie Mellon University; Po-Yao Huang, Carnegie Mellon University; Xiaodan Liang, Carnegie Mellon University; Yi Yang, UTS; Alexander Hauptmann, Carnegie Mellon University |
P-3A-59 | DetNet: Design Backbone for Object Detection | Zeming Li*, Tsinghua University / Megvii Inc; Chao Peng, Megvii(Face++) Inc; Gang Yu, Face++; Yangdong Deng, Tsinghua University; Xiangyu Zhang, Megvii Inc; Jian Sun, Megvii, Face++ |
P-3A-60 | Modeling Varying Camera-IMU Time Offset in Optimization-Based Visual-Inertial Odometry | Yonggen Ling*, Tencent AI Lab; Linchao Bao, Tencent AI Lab; Zequn Jie, Tencent AI Lab; Fengming Zhu, Tencent AI Lab; Ziyang Li, Tencent AI Lab; Shanmin Tang, Tencent AI Lab; YongSheng Liu, Tencent AI Lab; Wei Liu, Tencent AI Lab; Tong Zhang, Tencent AI Lab |
P-3A-61 | Exploiting temporal information for 3D human pose estimation | Mir Rayat Imtiaz Hossain*, University of British Columbia; Jim Little, University of British Columbia, Canada |
P-3A-62 | Joint Representation and Truncated Inference Learning for Correlation Filter based Tracking | Yingjie Yao, Harbin Institute of Technology; Xiaohe Wu, Harbin Institute of Technology; Lei Zhang, University of Pittsburgh; Shiguang Shan, Chinese Academy of Sciences; Wangmeng Zuo*, Harbin Institute of Technology, China |
P-3A-63 | Learning to Zoom: a Saliency-Based Sampling Layer for Neural Networks | Adria Recasens*, Massachusetts Institute of Technology; Petr Kellnhofer, MIT; Simon Stent, Toyota Research Institute; Wojciech Matusik, MIT; Antonio Torralba, MIT |
P-3A-64 | Does Haze Removal Help Image Classification? | Yanting Pei*, Beijing Jiaotong University; Yaping Huang, Beijing Jiaotong University; Qi Zou, Beijing Jiaotong University; Yuhang Lu, University of South Carolina; Song Wang, University of South Carolina |
P-3A-65 | Learning Local Descriptors by Integrating Geometry Constraints | Zixin Luo*, HKUST; Tianwei Shen, HKUST; Lei Zhou, HKUST; Siyu Zhu, HKUST; Runze Zhang, HKUST; Tian Fang, HKUST; Long Quan, Hong Kong University of Science and Technology |
P-3A-66 | Repeatability Is Not Enough: Learning Affine Regions via Discriminability | Dmytro Mishkin*, Czech Technical University in Prague; Filip Radenovic, Visual Recognition Group, CTU Prague; Jiri Matas, CMP CTU FEE |
P-3A-67 | Macro-Micro Adversarial Network for Human Parsing | Yawei Luo*, University of Technology Sydney; Zhedong Zheng, University of Technology Sydney; Liang Zheng, University of Technology Sydney; Yi Yang, UTS |
P-3A-68 | Learning Class Prototypes via Structure Alignment for Zero-Shot Recognition | Huajie Jiang, ICT, CAS; Ruiping Wang*, ICT, CAS; Shiguang Shan, Chinese Academy of Sciences; Xilin Chen, China |
P-3A-69 | SphereNet: Learning Spherical Representations for Detection and Classification in Omnidirectional Images | Benjamin Coors*, MPI Intelligent Systems, Bosch; Alexandru Condurache, Bosch; Andreas Geiger, MPI-IS and University of Tuebingen |
P-3A-70 | A dataset and architecture for visual reasoning with a working memory | Guangyu Robert Yang*, Columbia University; Igor Ganichev, Google Brain; Xiao-Jing Wang, New York University; Jon Shlens, Google; David Sussillo, Google Brain |
P-3A-71 | Flow-Grounded Spatial-Temporal Video Prediction from Still Images | Yijun Li*, University of California, Merced; Chen Fang, Adobe Research, San Jose, CA; Jimei Yang, Adobe; Zhaowen Wang, Adobe Research; Xin Lu, Adobe; Ming-Hsuan Yang, University of California at Merced |
P-3A-72 | The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking | Dawei Du*, University of Chinese Academy of Sciences; Yuankai Qi, Harbin Institute of Technology; Hongyang Yu, Harbin Institute of Technology; Yifang Yang, University of Chinese Academy of Sciences; Kaiwen Duan, University of Chinese Academy of Sciences; Guorong Li, CAS; Weigang Zhang, Harbin Institute of Technology, Weihai; Qingming Huang, University of Chinese Academy of Sciences; Qi Tian, The University of Texas at San Antonio |
P-3A-73 | Selective Zero-Shot Classification with Augmented Attributes | Jie Song, College of Computer Science and Technology, Zhejiang University; Chengchao Shen, Zhejiang University; Jie Lei, Zhejiang University; An-Xiang Zeng, Alibaba; Kairi Ou, Alibaba; Dacheng Tao, University of Sydney; Mingli Song*, Zhejiang University |
P-3A-74 | Action Search: Spotting Actions in Videos and Its Application to Temporal Action Localization | Humam Alwassel*, KAUST; Fabian Caba, KAUST; Bernard Ghanem, KAUST |
P-3A-75 | A Principled Approach to Hard Triplet Generation via Adversarial Nets | Yiru Zhao*, Shanghai Jiao Tong University; Zhongming Jin, Alibaba Group; Guo-Jun Qi, University of Central Florida; Hongtao Lu, Shanghai Jiao Tong University; Xian-Sheng Hua, Alibaba Group |
P-3A-76 | Pose Guided Human Video Generation | Ceyuan Yang*, SenseTime Group Limited; Zhe Wang, Sensetime Group Limited; Xinge Zhu, Sensetime Group Limited; Chen Huang, Carnegie Mellon University; Jianping Shi, Sensetime Group Limited; Dahua Lin, The Chinese University of Hong Kong |
P-3A-77 | Deep Directional Statistics: Pose Estimation with Uncertainty Quantification | Sergey Prokudin*, Max Planck Institute for Intelligent Systems; Peter Gehler, Amazon; Sebastian Nowozin, Microsoft Research Cambridge |
P-3A-78 | Learning 3D Human Pose from Structure and Motion | Rishabh Dabral*, IIT Bombay; Anurag Mundhada, IIT Bombay; Abhishek Sharma, Gobasco AI Labs |
P-3A-79 | Learning Dynamic Memory Networks for Object Tracking | Tianyu Yang*, City University of Hong Kong; Antoni Chan, City University of Hong Kong, Hong Kong |
P-3A-80 | Faces as Lighting Probes via Unsupervised Deep Highlight Extraction | Renjiao Yi*, Simon Fraser University; Chenyang Zhu, Simon Fraser University; Ping Tan, Simon Fraser University; Stephen Lin, Microsoft Research |
P-3A-81 | CurriculumNet: Learning from Large-Scale Web Images without Human Annotation | Sheng Guo*, Malong Technologies; Weilin Huang, Malong Technologies; Haozhi Zhang, Malong Technologies |
P-3A-82 | Joint Task-Recursive Learning for Semantic Segmentation and Depth Estimation | Zhenyu Zhang*, Nanjing University of Sci & Tech; Zhen Cui, Nanjing University of Science and Technology; Zequn Jie, Tencent AI Lab; Xiang Li, NJUST; Chunyan Xu, Nanjing University of Science and Technology; Jian Yang, Nanjing University of Science and Technology |
P-3A-83 | HybridFusion: Real-Time Performance Capture Using a Single Depth Sensor and Sparse IMUs | Zerong Zheng*, Tsinghua University; Tao Yu, Beihang University; Hao Li, Pinscreen/University of Southern California/USC ICT; Kaiwen Guo, Google Inc.; Qionghai Dai, Tsinghua University; Lu Fang, Tsinghua University; Yebin Liu, Tsinghua University |
P-3A-84 | Associating Inter-Image Salient Instances for Weakly Supervised Semantic Segmentation | Ruochen Fan*, Tsinghua University; Qibin Hou, Nankai University; Ming-Ming Cheng, Nankai University; Gang Yu, Face++; Ralph Martin, Cardiff University; Shimin Hu, Tsinghua University |
P-3A-85 | Ask, Acquire and Attack: Data-free UAP generation using Class impressions | Konda Reddy Mopuri*, Indian Institute of Science, Bangalore; Phani Krishna Uppala, Indian Institute of Science; Venkatesh Babu RADHAKRISHNAN, Indian Institute of Science |
P-3A-86 | A Scalable Exemplar-based Subspace Clustering Algorithm for Class-Imbalanced Data | Chong You*, Johns Hopkins University; Chi Li, Johns Hopkins University; Daniel Robinson, Johns Hopkins University; Rene Vidal, Johns Hopkins University |
P-3A-87 | Find and Focus: Retrieve and Localize Video Events with Natural Language Queries | Dian SHAO*, The Chinese University of Hong Kong; Yu Xiong, The Chinese University of HK; Yue Zhao, The Chinese University of Hong Kong; Qingqiu Huang, CUHK; Yu Qiao, Multimedia Laboratory, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; Dahua Lin, The Chinese University of Hong Kong |
P-3A-88 | Graininess-Aware Deep Feature Learning for Pedestrian Detection | Chunze Lin, Tsinghua University; Jiwen Lu*, Tsinghua University; Jie Zhou, Tsinghua University, China |
P-3A-89 | MVSNet: Depth Inference for Unstructured Multi-view Stereo | Yao Yao*, The Hong Kong University of Science and Technology; Zixin Luo, HKUST; Shiwei Li, HKUST; Tian Fang, HKUST; Long Quan, Hong Kong University of Science and Technology |
P-3A-90 | PlaneMatch: Patch Coplanarity Prediction for Robust RGB-D Registration | Yifei Shi, Princeton University; Kai Xu, Princeton University and National University of Defense Technology; Matthias Niessner, Technical University of Munich; Szymon Rusinkiewicz, Princeton University; Thomas Funkhouser*, Princeton, USA |
P-3A-91 | Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry | Nan Yang*, Technical University of Munich; Rui Wang, Technical University of Munich; Joerg Stueckler, Technical University of Munich; Daniel Cremers, TUM |
Poster session 3B
3B | Wednesday, September 12 | Poster session 02:30 PM - 04:00 PM |
---|---|---|
P-3B-01 | GANimation: Anatomically-aware Facial Animation from a Single Image | Albert Pumarola*, Institut de Robotica i Informatica Industrial; Antonio Agudo, Institut de Robotica i Informatica Industrial, CSIC-UPC; Aleix Martinez, The Ohio State University; Alberto Sanfeliu, Industrial Robotics Institute; Francesc Moreno, IRI |
P-3B-02 | Unsupervised Geometry-Aware Representation for 3D Human Pose Estimation | Helge Rhodin*, EPFL; Mathieu Salzmann, EPFL; Pascal Fua, EPFL, Switzerland |
P-3B-03 | Efficient Semantic Scene Completion Network with Spatial Group Convolution | Jiahui Zhang*, Tsinghua University; Hao Zhao, Intel Labs China; Anbang Yao, Intel Labs China; Yurong Chen, Intel Labs China; Hongen Liao, Tsinghua University |
P-3B-04 | Deep Autoencoder for Combined Human Pose Estimation and Body Model Upscaling | Matthew Trumble*, University of Surrey; Andrew Gilbert, University of Surrey; John Collomosse, Adobe Research; Adrian Hilton, University of Surrey |
P-3B-05 | Highly-Economized Multi-View Binary Compression for Scalable Image Clustering | Zheng Zhang*, Harbin Institute of Technology Shenzhen Graduate School; Li Liu, Inception Institute of Artificial Intelligence; Jie Qin, ETH Zurich; Fan Zhu, Inception Institute of Artificial Intelligence; Fumin Shen, UESTC; Yong Xu, Harbin Institute of Technology Shenzhen Graduate School; Ling Shao, Inception Institute of Artificial Intelligence; Heng Tao Shen, University of Electronic Science and Technology of China (UESTC) |
P-3B-06 | Asynchronous, Photometric Feature Tracking using Events and Frames | Daniel Gehrig, University of Zurich; Henri Rebecq*, University of Zurich; Guillermo Gallego, University of Zurich; Davide Scaramuzza, University of Zurich & ETH Zurich, Switzerland |
P-3B-07 | Deterministic Consensus Maximization with Biconvex Programming | Zhipeng Cai*, The University of Adelaide; Tat-Jun Chin, University of Adelaide; Huu Le, University of Adelaide; David Suter, University of Adelaide |
P-3B-08 | Depth-aware CNN for RGB-D Segmentation | Weiyue Wang*, USC; Ulrich Neumann, USC |
P-3B-09 | Object Detection in Video with Spatiotemporal Sampling Networks | Gedas Bertasius*, University of Pennsylvania; Lorenzo Torresani, Dartmouth College; Jianbo Shi, University of Pennsylvania |
P-3B-10 | Dependency-aware Attention Control for Unconstrained Face Recognition with Image Sets | Xiaofeng Liu*, Carnegie Mellon University; B. V. K. Vijaya Kumar, CMU, USA; Chao Yang, University of Southern California; Qingming Tang, TTIC; Jane You, The Hong Kong Polytechnic University |
P-3B-11 | License Plate Detection and Recognition in Unconstrained Scenarios | Sérgio Silva*, UFRGS; Claudio Jung, UFRGS |
P-3B-12 | Revisiting the Inverted Indices for Billion-Scale Approximate Nearest Neighbors | Dmitry Baranchuk*, MSU / Yandex; Artem Babenko, MIPT/Yandex; Yury Malkov, NTechLab |
P-3B-13 | Zero-Annotation Object Detection with Web Knowledge Transfer | Qingyi Tao*, Nanyang Technological University; Hao Yang, NTU; Jianfei Cai, Nanyang Technological University |
P-3B-14 | Semi-supervised Adversarial Learning to Generate Photorealistic Face Images of New Identities from 3D Morphable Model | Baris Gecer*, Imperial College London; Binod Bhattarai, Imperial College London; Josef Kittler, University of Surrey, UK; Tae-Kyun Kim, Imperial College London |
P-3B-15 | Improving Shape Deformation in Unsupervised Image-to-Image Translation | Aaron Gokaslan*, Brown University; Vivek Ramanujan, Brown University; Daniel Ritchie, Brown University; Kwang In Kim, University of Bath; James Tompkin, Brown University |
P-3B-16 | K-convexity shape priors for segmentation | Hossam Isack*, UWO; Lena Gorelick, University of Western Ontario; Karin Ng, University of Western Ontario; Olga Veksler, University of Western Ontario; Yuri Boykov, University of Waterloo |
P-3B-17 | Visual Question Generation for Class Acquisition of Unknown Objects | Kohei Uehara*, The University of Tokyo; Antonio Tejero-de-Pablos, The University of Tokyo; Yoshitaka Ushiku, The University of Tokyo; Tatsuya Harada, The University of Tokyo |
P-3B-18 | Sampling Algebraic Varieties for Robust Camera Autocalibration | Danda Pani Paudel*, ETH Zürich; Luc Van Gool, ETH Zurich |
P-3B-19 | Hand Pose Estimation via Latent 2.5D Heatmap Regression | Umar Iqbal*, University of Bonn; Pavlo Molchanov, NVIDIA; Thomas Breuel, NVIDIA; Jürgen Gall, University of Bonn; Jan Kautz, NVIDIA |
P-3B-20 | HairNet: Single-View Hair Reconstruction using Convolutional Neural Networks | Yi Zhou*, University of Southern California; Liwen Hu, University of Southern California; Jun Xing, Institute for Creative Technologies, USC; Weikai Chen, USC Institute for Creative Technology; Han-Wei Kung, University of California, Santa Barbara; Xin Tong, Microsoft Research Asia; Hao Li, Pinscreen/University of Southern California/USC ICT |
P-3B-21 | Super-Identity Convolutional Neural Network for Face Hallucination | Kaipeng Zhang*, National Taiwan University; Zhanpeng Zhang, SenseTime Group Limited; Chia-Wen Cheng, UT Austin; Winston Hsu, National Taiwan University; Yu Qiao, Multimedia Laboratory, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; Wei Liu, Tencent AI Lab; Tong Zhang, Tencent AI Lab |
P-3B-22 | Receptive Field Block Net for Accurate and Fast Object Detection | Songtao Liu, BUAA; Di Huang*, Beihang University, China; Yunhong Wang, State Key Laboratory of Virtual Reality Technology and System, Beihang University, Beijing 100191, China |
P-3B-23 | Interpretable Intuitive Physics Model | Tian Ye*, Carnegie Mellon University; Xiaolong Wang, CMU; James Davidson, Google; Abhinav Gupta, CMU |
P-3B-24 | Variable Ring Light Imaging: Capturing Transient Subsurface Scattering with An Ordinary Camera | Ko Nishino*, Kyoto University; Art Subpa-asa, Tokyo Institute of Technology; Yuta Asano, Tokyo Institute of Technology; Mihoko Shimano, National Institute of Informatics; Imari Sato, National Institute of Informatics |
P-3B-25 | Facial Dynamics Interpreter Network: What are the Important Relations between Local Dynamics for Facial Trait Estimation? | Seong Tae Kim*, KAIST; Yong Man Ro, KAIST |
P-3B-26 | Coloring with Words: Guiding Image Colorization Through Text-based Palette Generation | Hyojin Bahng, Korea University; Seungjoo Yoo, Korea University; Wonwoong Cho, Korea University; David Park, Korea University; Ziming Wu, Hong Kong University of Science and Technology; Xiaojuan Ma, Hong Kong University of Science and Technology; Jaegul Choo*, Korea University |
P-3B-27 | Sparsely Aggregated Convolutional Networks | Ligeng Zhu*, Simon Fraser University; Ruizhi Deng, Simon Fraser University; Michael Maire, Toyota Technological Institute at Chicago; Zhiwei Deng, Simon Fraser University; Greg Mori, Simon Fraser University; Ping Tan, Simon Fraser University |
P-3B-28 | Deep Attention Neural Tensor Network for Visual Question Answering | Yalong Bai*, Harbin Institute of Technology; Jianlong Fu, Microsoft Research; Tao Mei, JD.com |
P-3B-29 | Diverse feature visualizations reveal invariances in early layers of deep neural networks | Santiago Cadena*, University of Tübingen; Marissa Weis, University of Tübingen; Leon A. Gatys, University of Tuebingen; Matthias Bethge, University of Tübingen; Alexander Ecker, University of Tübingen |
P-3B-30 | Sidekick Policy Learning for Active Visual Exploration | Santhosh Kumar Ramakrishnan*, University of Texas at Austin; Kristen Grauman, University of Texas |
P-3B-31 | DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures | Jin-Dong Dong*, National Tsing-Hua University; An-Chieh Cheng, National Tsing-Hua University; Da-Cheng Juan, Google; Wei Wei, Google; Min Sun, NTHU |
P-3B-32 | Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images | Nanyang Wang, Fudan University; Yinda Zhang*, Princeton University; Zhuwen Li, Intel Labs; Yanwei Fu, Fudan Univ.; Wei Liu, Tencent AI Lab; Yu-Gang Jiang, Fudan University |
P-3B-33 | End-to-End Incremental Learning | Francisco M. Castro*, University of Málaga; Manuel J. Marín-Jiménez, University of Córdoba; Nicolás Guil, University of Málaga; Cordelia Schmid, INRIA; Karteek Alahari, Inria |
P-3B-34 | CAR-Net: Clairvoyant Attentive Recurrent Network | Amir Sadeghian*, Stanford; Maxime Voisin, Stanford University; Ferdinand Legros, Stanford University; Ricky Vesel, Race Optimal; Alexandre Alahi, EPFL; Silvio Savarese, Stanford University |
P-3B-35 | Learning Data Terms for Image Deblurring | Jiangxin Dong*, Dalian University of Technology; Jinshan Pan, Dalian University of Technology; Deqing Sun, NVIDIA; Zhixun Su, Dalian University of Technology; Ming-Hsuan Yang, University of California at Merced |
P-3B-36 | Image Inpainting for Irregular Holes Using Partial Convolutions | Guilin Liu*, NVIDIA; Fitsum Reda, NVIDIA; Kevin Shih, NVIDIA; Ting-Chun Wang, NVIDIA; Andrew Tao, NVIDIA; Bryan Catanzaro, NVIDIA |
P-3B-37 | SRDA: Generating Instance Segmentation Annotation Via Scanning, Reasoning And Domain Adaption | Wenqiang Xu, Shanghai Jiao Tong University; Yonglu Li, Shanghai Jiao Tong University; Jun Lv, SJTU; Cewu Lu*, Shanghai Jiao Tong University |
P-3B-38 | Learning Priors for Semantic 3D Reconstruction | Ian Cherabier*, ETH Zurich; Johannes Schoenberger, ETH Zurich; Martin R. Oswald, ETH Zurich; Marc Pollefeys, ETH Zurich; Andreas Geiger, MPI-IS and University of Tuebingen |
P-3B-39 | Integrating Egocentric Videos in Top-view Surveillance Videos: Joint Identification and Temporal Alignment | Shervin Ardeshir*, University of Central Florida; Ali Borji, University of Central Florida |
P-3B-40 | Deep Boosting for Image Denoising | Chang Chen, University of Science and Technology of China; Zhiwei Xiong*, University of Science and Technology of China; Xinmei Tian, USTC; Feng Wu, University of Science and Technology of China |
P-3B-41 | Descending, lifting or smoothing: Secrets of robust cost optimization | Christopher Zach*, Toshiba Research; Guillaume Bourmaud, University of Bordeaux |
P-3B-42 | MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network | Muhammed Kocabas*, Middle East Technical University; Salih Karagoz, Middle East Technical University; Emre Akbas, Middle East Technical University |
P-3B-43 | TS2C: Tight Box Mining with Surrounding Segmentation Context for Weakly Supervised Object Detection | Yunchao Wei*, UIUC; Zhiqiang Shen, UIUC; Honghui Shi, UIUC; Bowen Cheng, UIUC; Jinjun Xiong, IBM Thomas J. Watson Research Center; Jiashi Feng, NUS; Thomas Huang, UIUC |
P-3B-44 | End-to-End Deep Structured Models for Drawing Crosswalks | Justin Liang*, Uber ATG; Raquel Urtasun, Uber ATG |
P-3B-45 | Efficient Global Point Cloud Registration by Matching Rotation Invariant Features Through Translation Search | Yinlong Liu, Fudan University; Wang Chen*, Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Digital Medical Research Center, Fudan University; Zhijian Song, Fudan University; Manning Wang, Fudan University |
P-3B-46 | Large Scale Urban Scene Modeling from MVS Meshes | Lingjie Zhu, University of Chinese Academy of Sciences; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; Shuhan Shen*, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; Zhanyi Hu, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences |
P-3B-47 | Sub-GAN: An Unsupervised Generative Model via Subspaces | Jie Liang, Nankai University; Jufeng Yang*, Nankai University; Hsin-Ying Lee, University of California, Merced; Kai Wang, Nankai University; Ming-Hsuan Yang, University of California at Merced |
P-3B-48 | Pseudo Pyramid Deeper Bidirectional ConvLSTM for Video Saliency Detection | Hongmei Song, Beijing Institute of Technology; Sanyuan Zhao*, Beijing Institute of Technology; Jianbing Shen, Beijing Institute of Technology; Kin-Man Lam, The Hong Kong Polytechnic University |
P-3B-49 | Practical Black-box Attacks on Deep Neural Networks using Efficient Query Mechanisms | Arjun Nitin Bhagoji*, Princeton University; Warren He, University of California, Berkeley; Bo Li, University of Illinois at Urbana-Champaign; Dawn Song, UC Berkeley |
P-3B-50 | Learning 3D Shape Priors for Shape Completion and Reconstruction | Jiajun Wu*, MIT; Chengkai Zhang, MIT; Xiuming Zhang, MIT; Zhoutong Zhang, MIT; Joshua Tenenbaum, MIT; Bill Freeman, MIT |
P-3B-51 | Comparator Networks | Weidi Xie*, University of Oxford; Li Shen, University of Oxford; Andrew Zisserman, University of Oxford |
P-3B-52 | Improving Fine-Grained Visual Classification using Pairwise Confusion | Abhimanyu Dubey*, Massachusetts Institute of Technology; Otkrist Gupta, MIT; Pei Guo, Brigham Young University; Ryan Farrell, Brigham Young University; Ramesh Raskar, Massachusetts Institute of Technology; Nikhil Naik, MIT |
P-3B-53 | Visual-Inertial Object Detection and Mapping | Xiaohan Fei*, UCLA; Stefano Soatto, UCLA |
P-3B-54 | Learning Region Features for Object Detection | Jiayuan Gu, Peking University; Han Hu, Microsoft Research Asia; Liwei Wang, Peking University; Yichen Wei, MSR Asia; Jifeng Dai*, Microsoft Research Asia |
P-3B-55 | Efficient Dense Point Cloud Object Reconstruction using Deformation Vector Fields | Kejie Li*, University of Adelaide; Trung Pham, NVIDIA; Huangying Zhan, The University of Adelaide; Ian Reid, University of Adelaide, Australia |
P-3B-56 | Evaluating Capability of Deep Neural Networks for Image Classification via Information Plane | Hao Cheng*, ShanghaiTech University; Dongze Lian, ShanghaiTech University; Shenghua Gao, ShanghaiTech University; Yanlin Geng, ShanghaiTech University |
P-3B-57 | Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features | Xu Yang*, NTU; Hanwang Zhang, Nanyang Technological University; Jianfei Cai, Nanyang Technological University |
P-3B-58 | Zero-Shot Deep Domain Adaptation | Kuan-Chuan Peng*, Siemens Corporation; Ziyan Wu, Siemens Corporation; Jan Ernst, Siemens Corporation |
P-3B-59 | Deep Imbalanced Attribute Classification using Visual Attention Aggregation | Nikolaos Sarafianos*, University of Houston; Xiang Xu, University of Houston; Ioannis Kakadiaris, University of Houston |
P-3B-60 | Video Object Segmentation by Learning Location-Sensitive Embeddings | Hai Ci, Peking University; Chunyu Wang*, Microsoft Research Asia; Yizhou Wang, PKU |
P-3B-61 | Deep Multi-Task Learning to Recognise Subtle Facial Expressions of Mental States | Guosheng Hu*, AnyVision; Li Liu, Inception Institute of Artificial Intelligence; Yang Yuan, AnyVision; Zehao Yu, Xiamen University; Yang Hua, Queen's University Belfast; Zhihong Zhang, Xiamen University; Fumin Shen, UESTC; Ling Shao, Inception Institute of Artificial Intelligence; Timothy Hospedales, Edinburgh University; Neil Robertson, Queen's University Belfast; Yongxin Yang, University of Edinburgh |
P-3B-62 | Where Will They Go? Predicting Fine-Grained Adversarial Multi-Agent Motion using Conditional Variational Autoencoders | Panna Felsen*, University of California Berkeley; Patrick Lucey, STATS; Sujoy Ganguly, STATS |
P-3B-63 | Video Summarization Using Fully Convolutional Sequence Networks | Mrigank Rochan*, University of Manitoba; Linwei Ye, University of Manitoba; Yang Wang, University of Manitoba |
P-3B-64 | Occlusion-aware Hand Pose Estimation Using Hierarchical Mixture Density Network | Qi Ye*, Imperial College London; Tae-Kyun Kim, Imperial College London |
P-3B-65 | Learning with Biased Complementary Labels | Xiyu Yu*, The University of Sydney; Tongliang Liu, The University of Sydney; Mingming Gong, University of Pittsburgh; Dacheng Tao, University of Sydney |
P-3B-66 | ConceptMask: Large-Scale Segmentation from Semantic Concepts | Yufei Wang*, Facebook; Zhe Lin, Adobe Research; Xiaohui Shen, Adobe Research; Scott Cohen, Adobe Research; Jianming Zhang, Adobe Research |
P-3B-67 | Conditional Image-Text Embedding Networks | Bryan Plummer*, Boston University; Paige Kordas, University of Illinois at Urbana Champaign; Hadi Kiapour, eBay; Shuai Zheng, eBay; Robinson Piramuthu, eBay Inc.; Svetlana Lazebnik, UIUC |
P-3B-68 | Geolocation Estimation of Photos using a Hierarchical Model and Scene Classification | Eric Müller-Budack*, Leibniz Information Centre of Science and Technology (TIB); Kader Pustu-Iren, Leibniz Information Center of Science and Technology (TIB); Ralph Ewerth, Leibniz Information Center of Science and Technology (TIB) |
P-3B-69 | Lifting Layers: Analysis and Applications | Michael Moeller*, University of Siegen; Peter Ochs, Saarland University; Tim Meinhardt, Technical University of Munich; Laura Leal-Taixé, TUM |
P-3B-70 | Progressive Neural Architecture Search | Chenxi Liu*, Johns Hopkins University; Maxim Neumann, Google; Barret Zoph, Google; Jon Shlens, Google; Wei Hua, Google; Li-Jia Li, Google; Li Fei-Fei, Stanford University; Alan Yuille, Johns Hopkins University; Jonathan Huang, Google; Kevin Murphy, Google |
P-3B-71 | Learning Deep Representations with Probabilistic Knowledge Transfer | Nikolaos Passalis*, Aristotle University of Thessaloniki; Anastasios Tefas, Aristotle University of Thessaloniki |
P-3B-72 | Robust fitting in computer vision: easy or hard? | Tat-Jun Chin*, University of Adelaide; Zhipeng Cai, The University of Adelaide; Frank Neumann, The University of Adelaide, School of Computer Science, Faculty of Engineering, Computer and Mathematical Science |
P-3B-73 | Dual-Agent Deep Reinforcement Learning for Deformable Face Tracking | Minghao Guo, Tsinghua University; Jiwen Lu*, Tsinghua University; Jie Zhou, Tsinghua University, China |
Poster session 3C
3C | Wednesday, September 12 | Poster session 05:15 PM - 06:45 PM |
---|---|---|
P-3C-01 | Zero-Shot Object Detection | Ankan Bansal*, University of Maryland; Karan Sikka, SRI International; Gaurav Sharma, NEC Labs America; Rama Chellappa, University of Maryland; Ajay Divakaran, SRI, USA |
P-3C-02 | ForestHash: Semantic Hashing With Shallow Random Forests and Tiny Convolutional Networks | Qiang Qiu*, Duke University; Jose Lezama, Universidad de la Republica, Uruguay; Alex Bronstein, Tel Aviv University, Israel; Guillermo Sapiro, Duke University |
P-3C-03 | ML-LocNet: Improving Object Localization with Multi-view Learning Network | Xiaopeng Zhang*, National University of Singapore; Jiashi Feng, NUS |
P-3C-04 | MPLP++: Fast, Parallel Dual Block-Coordinate Ascent for Dense Graphical Models | Siddharth Tourani*, Visual Learning Lab, HCI, Uni-Heidelberg; Alexander Shekhovtsov, Czech Technical University in Prague, Czech Republic; Carsten Rother, University of Heidelberg; Bogdan Savchynskyy, Heidelberg University |
P-3C-05 | A Zero-Shot Framework for Sketch based Image Retrieval | Sasikiran Yelamarthi, IIT Madras; Shiva Krishna Reddy M, Indian Institute of Technology Madras; Ashish Mishra*, IIT Madras; Anurag Mittal, Indian Institute of Technology Madras |
P-3C-06 | In the Eye of Beholder: Joint Learning of Gaze and Actions in First Person Vision | Yin Li*, CMU; Miao Liu, Georgia Tech; James Rehg, Georgia Institute of Technology |
P-3C-07 | SAN: Learning Relationship between Convolutional Features for Multi-Scale Object Detection | YongHyun Kim*, POSTECH |
P-3C-08 | A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers | Tianyun Zhang*, Syracuse University; Shaokai Ye, Syracuse University; Kaiqi Zhang, Syracuse University; Yanzhi Wang, Syracuse University; Makan Fardad, Syracuse University; Wujie Wen, Florida International University |
P-3C-09 | Iterative Crowd Counting | Viresh Ranjan*, Stony Brook University; Hieu Le, Stony Brook University; Minh Hoai Nguyen, Stony Brook University |
P-3C-10 | A Dataset for Lane Instance Segmentation in Urban Environments | Brook Roberts, Five AI Ltd.; Sebastian Kaltwang*, Five AI Ltd.; Sina Samangooei, Five AI Ltd.; Mark Pender-Bare, Five AI Ltd.; Konstantinos Tertikas, Five AI Ltd.; John Redford, Five AI Ltd. |
P-3C-11 | Out-of-Distribution Detection Using an Ensemble of Self Supervised Leave-out Classifiers | Nataraj Jammalamadaka*, Intel Labs; Xia Zhu, Intel Labs; Dipankar Das, Intel Labs; Bharat Kaul, Intel Labs; Theodore Willke, Intel Labs |
P-3C-12 | Penalizing Top Performers: Conservative Loss for Semantic Segmentation Adaptation | Xinge Zhu*, SenseTime Group Limited; Hui Zhou, SenseTime Group Limited; Ceyuan Yang, SenseTime Group Limited; Jianping Shi, SenseTime Group Limited; Dahua Lin, The Chinese University of Hong Kong |
P-3C-13 | Compound Memory Networks for Few-shot Video Classification | Linchao Zhu*, University of Technology, Sydney; Yi Yang, UTS |
P-3C-14 | Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering | Medhini Narasimhan*, University of Illinois at Urbana-Champaign; Alexander Schwing, UIUC |
P-3C-15 | Interpretable Basis Decomposition for Visual Explanation | Antonio Torralba, MIT; Bolei Zhou*, MIT; David Bau, MIT; Yiyou Sun, Harvard |
P-3C-16 | How Local is the Local Diversity? Reinforcing Sequential Determinantal Point Processes with Dynamic Ground Sets for Supervised Video Summarization | Yandong Li*, University of Central Florida; Boqing Gong, Tencent AI Lab; Tianbao Yang, University of Iowa; Liqiang Wang, University of Central Florida |
P-3C-17 | Dividing and Aggregating Network for Multi-view Action Recognition | Dongang Wang*, The University of Sydney; Wanli Ouyang, CUHK; Wen Li, ETHZ; Dong Xu, University of Sydney |
P-3C-18 | Shape Reconstruction Using Volume Sweeping and Learned Photoconsistency | Vincent Leroy*, INRIA Grenoble Rhône-Alpes; Edmond Boyer, Inria; Jean-Sebastien Franco, INRIA |
P-3C-19 | RT-GENE: Real-Time Eye Gaze Estimation in Natural Environments | Tobias Fischer*, Imperial College London; Hyung Jin Chang, University of Birmingham; Yiannis Demiris, Imperial College London |
P-3C-20 | Pairwise Body-Part Attention for Recognizing Human-Object Interactions | Haoshu Fang, SJTU; Jinkun Cao, Shanghai Jiao Tong University; Yu-Wing Tai, Tencent YouTu; Cewu Lu*, Shanghai Jiao Tong University |
P-3C-21 | Motion Feature Network: Fixed Motion Filter for Action Recognition | Myunggi Lee, Seoul National University; Seung Eui Lee, Seoul National University; Sung Joon Son, Seoul National University; Gyutae Park, Seoul National University; Nojun Kwak*, Seoul National University |
P-3C-22 | Reverse Attention for Salient Object Detection | Shuhan Chen*, Yangzhou University; Xiuli Tan, Yangzhou University; Ben Wang, Yangzhou University; Xuelong Hu, Yangzhou University |
P-3C-23 | Dynamic Sampling Convolutional Neural Networks | Jialin Wu*, UT Austin; Dai Li, Tsinghua University; Yu Yang, Tsinghua University; Chandrajit Bajaj, University of Texas, Austin; Xiangyang Ji, Tsinghua University |
P-3C-24 | DDRNet: Depth Map Denoising and Refinement for Consumer Depth Cameras Using Cascaded CNNs | Shi Yan, Tsinghua University; Chenglei Wu, Oculus Research; Lizheng Wang, Tsinghua University; Liang An, Tsinghua University; Feng Xu, Tsinghua University; Kaiwen Guo, Google Inc.; Yebin Liu*, Tsinghua University |
P-3C-25 | Stereo Computation for a Single Mixture Image | Yiran Zhong, Australian National University; Yuchao Dai*, Northwestern Polytechnical University; Hongdong Li, Australian National University, Australia |
P-3C-26 | Volumetric performance capture from minimal camera viewpoints | Andrew Gilbert*, University of Surrey; Marco Volino, University of Surrey; John Collomosse, Adobe Research; Adrian Hilton, University of Surrey |
P-3C-27 | Liquid Pouring Monitoring via Rich Sensory Inputs | Tz-Ying Wu*, National Tsing Hua University; Juan-Ting Lin, National Tsing Hua University; Tsun-Hsuang Wang, National Tsing Hua University; Chan-Wei Hu, National Tsing Hua University; Juan Carlos Niebles, Stanford University; Min Sun, NTHU |
P-3C-28 | Move Forward and Tell: A Progressive Generator of Video Descriptions | Yilei Xiong*, The Chinese University of Hong Kong; Bo Dai, The Chinese University of Hong Kong; Dahua Lin, The Chinese University of Hong Kong |
P-3C-29 | DYAN: A Dynamical Atoms-Based Network for Video Prediction | Wenqian Liu*, Northeastern University; Abhishek Sharma, Northeastern University; Octavia Camps, Northeastern University; Mario Sznaier, Northeastern University |
P-3C-30 | Deep Structure Inference Network for Facial Action Unit Recognition | Ciprian Corneanu*, Universitat de Barcelona; Meysam Madadi, CVC; Sergio Escalera, Computer Vision Center (UAB) & University of Barcelona |
P-3C-31 | Physical Primitive Decomposition | Zhijian Liu, Shanghai Jiao Tong University; Jiajun Wu*, MIT; Bill Freeman, MIT; Joshua Tenenbaum, MIT |
P-3C-32 | Boosted Attention: Leveraging Human Attention for Image Captioning | Shi Chen*, University of Minnesota; Qi Zhao, University of Minnesota |
P-3C-33 | Is Robustness the Cost of Accuracy? -- Lessons Learned from 18 Deep Image Classifiers | Dong Su*, IBM Research T.J. Watson Center; Huan Zhang, UC Davis; Hongge Chen, MIT; Jinfeng Yi, JD AI Research; Pin-Yu Chen, IBM Research; Yupeng Gao, IBM Research AI |
P-3C-34 | Dynamic Multimodal Instance Segmentation guided by natural language queries | Edgar Margffoy-Tuay*, Universidad de los Andes; Emilio Botero, Universidad de los Andes; Juan Pérez, Universidad de los Andes; Pablo Arbeláez, Universidad de los Andes |
P-3C-35 | Hierarchy of Alternating Specialists for Scene Recognition | Hyo Jin Kim*, University of North Carolina at Chapel Hill; Jan-Michael Frahm, UNC-Chapel Hill |
P-3C-36 | SwapNet: Garment Transfer in Single View Images | Amit Raj*, Georgia Institute of Technology; Patsorn Sangkloy, Georgia Institute of Technology; Huiwen Chang, Princeton University; Jingwan Lu, Adobe Research; Duygu Ceylan, Adobe Research; James Hays, Georgia Institute of Technology, USA |
P-3C-37 | What do I Annotate Next? An Empirical Study of Active Learning for Action Localization | Fabian Caba*, KAUST; Joon-Young Lee, Adobe Research; Hailin Jin, Adobe Research; Bernard Ghanem, KAUST |
P-3C-38 | Combining 3D Model Contour Energy and Keypoints for Object Tracking | Bogdan Bugaev*, Saint Petersburg Academic University; Anton Kryshchenko, Saint Petersburg Academic University; Roman Belov, KeenTools |
P-3C-39 | AGIL: Learning Attention from Human for Visuomotor Tasks | Ruohan Zhang*, University of Texas at Austin; Zhuode Liu, Google Inc.; Luxin Zhang, Peking University; Jake Whritner, University of Texas at Austin; Karl Muller, University of Texas at Austin; Mary Hayhoe, University of Texas at Austin; Dana Ballard, University of Texas at Austin |
P-3C-40 | PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model | George Papandreou*, Google; Tyler Zhu, Google; Liang-Chieh Chen, Google Inc.; Spyros Gidaris, Ecole des Ponts ParisTech; Jonathan Tompson, Google; Kevin Murphy, Google |
P-3C-41 | Accelerating Dynamic Programs via Nested Benders Decomposition with Application to Multi-Person Pose Estimation | Shaofei Wang*, Baidu Inc.; Alexander Ihler, UC Irvine; Konrad Kording, Northwestern; Julian Yarkony, Experian Data Lab |
P-3C-42 | Separating Reflection and Transmission Images in the Wild | Patrick Wieschollek*, University of Tuebingen; Orazio Gallo, NVIDIA Research; Jinwei Gu, NVIDIA; Jan Kautz, NVIDIA |
P-3C-43 | Point-to-Point Regression PointNet for 3D Hand Pose Estimation | Liuhao Ge*, NTU; Zhou Ren, Snap Research, USA; Junsong Yuan, State University of New York at Buffalo, USA |
P-3C-44 | Summarizing First-Person Videos from Third Persons' Points of View | Hsuan-I Ho*, National Taiwan University; Wei-Chen Chiu, National Chiao Tung University; Yu-Chiang Frank Wang, National Taiwan University |
P-3C-45 | Learning Category-Specific Mesh Reconstruction from Image Collections | Angjoo Kanazawa*, UC Berkeley; Shubham Tulsiani, UC Berkeley; Alexei Efros, UC Berkeley; Jitendra Malik, University of California at Berkeley |
P-3C-46 | StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction | Sameh Khamis*, Google; Sean Fanello, Google; Christoph Rhemann, Google; Julien Valentin, Google; Adarsh Kowdle, Google; Shahram Izadi, Google |
P-3C-47 | Visual Question Answering as a Meta Learning Task | Damien Teney*, The University of Adelaide; Anton van den Hengel, The University of Adelaide |
P-3C-48 | SRFeat: Single Image Super Resolution with Feature Discrimination | Seong-Jin Park*, POSTECH; Hyeongseok Son, POSTECH; Sunghyun Cho, DGIST; Ki-Sang Hong, POSTECH; Seungyong Lee, POSTECH |
P-3C-49 | Deep Factorised Inverse-Sketching | Kaiyue Pang*, Queen Mary University of London; Da Li, QMUL; Jifei Song, Queen Mary, University of London; Yi-Zhe Song, Queen Mary University of London; Tao Xiang, Queen Mary, University of London, UK; Timothy Hospedales, Edinburgh University |
P-3C-50 | Multimodal image alignment through a multiscale chain of neural networks with application to remote sensing | Armand Zampieri, Inria Sophia-Antipolis; Guillaume Charpiat, INRIA; Nicolas Girard, Inria Sophia-Antipolis; Yuliya Tarabalka*, Inria Sophia-Antipolis |
P-3C-51 | Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association | Dapeng Chen*, The Chinese University of Hong Kong; Hongsheng Li, Chinese University of Hong Kong; Xihui Liu, The Chinese University of Hong Kong; Jing Shao, The Chinese University of Hong Kong; Xiaogang Wang, Chinese University of Hong Kong, Hong Kong |
P-3C-52 | Robust Optical Flow Estimation in Rainy Scenes | Ruoteng Li*, National University of Singapore; Robby Tan, Yale-NUS College, Singapore; Loong Fah Cheong, NUS |
P-3C-53 | Image Generation from Sketch Constraint Using Contextual GAN | Yongyi Lu*, HKUST; Shangzhe Wu, HKUST; Yu-Wing Tai, Tencent YouTu; Chi-Keung Tang, Hong Kong University of Science and Technology |
P-3C-54 | Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping | Chuhui Xue, Nanyang Technological University; Shijian Lu*, Nanyang Technological University; Fangneng Zhan, Nanyang Technological University |
P-3C-55 | CNN-PS: CNN-based Photometric Stereo for General Non-Convex Surfaces | Satoshi Ikehata*, National Institute of Informatics |
P-3C-56 | Making Deep Heatmaps Robust to Partial Occlusions for 3D Object Pose Estimation | Markus Oberweger*, TU Graz; Mahdi Rad, TU Graz; Vincent Lepetit, TU Graz |
P-3C-57 | Recognition in Terra Incognita | Sara Beery*, Caltech; Grant van Horn, Caltech; Pietro Perona, Caltech |
P-3C-58 | Super-Resolution and Sparse View CT Reconstruction | Guangming Zang, KAUST; Ramzi Idoughi, KAUST; Mohamed Aly, KAUST; Peter Wonka, KAUST; Wolfgang Heidrich*, KAUST |
P-3C-59 | Modeling Visual Context is Key to Augmenting Object Detection Datasets | Nikita Dvornik*, INRIA; Julien Mairal, INRIA; Cordelia Schmid, INRIA |
P-3C-60 | Occlusions, Motion and Depth Boundaries with a Generic Network for Optical Flow, Disparity, or Scene Flow Estimation | Eddy Ilg*, University of Freiburg; Tonmoy Saikia, University of Freiburg; Margret Keuper, University of Mannheim; Thomas Brox, University of Freiburg |
P-3C-61 | Unsupervised Domain Adaptation for 3D Keypoint Estimation via View Consistency | Xingyi Zhou, The University of Texas at Austin; Arjun Karpur, The University of Texas at Austin; Chuang Gan, MIT; Linjie Luo, Snap Inc; Qixing Huang*, The University of Texas at Austin |
P-3C-62 | Improving DNN Robustness to Adversarial Attacks using Jacobian Regularization | Daniel Jakubovitz*, Tel Aviv University; Raja Giryes, Tel Aviv University |
P-3C-63 | A Framework for Evaluating 6-DOF Object Trackers | Mathieu Garon, Université Laval; Denis Laurendeau, Laval University; Jean-Francois Lalonde*, Université Laval |
P-3C-64 | Self-Supervised Relative Depth Learning for Urban Scene Understanding | Huaizu Jiang*, UMass Amherst; Erik Learned-Miller, University of Massachusetts, Amherst; Gustav Larsson, University of Chicago; Michael Maire, Toyota Technological Institute at Chicago; Greg Shakhnarovich, Toyota Technological Institute at Chicago |
P-3C-65 | Actor-centric Relation Network | Chen Sun*, Google; Abhinav Shrivastava, UMD / Google; Carl Vondrick, MIT; Kevin Murphy, Google; Rahul Sukthankar, Google; Cordelia Schmid, Google |
P-3C-66 | Self-produced Guidance for Weakly-supervised Object Localization | Xiaolin Zhang*, University of Technology Sydney; Yunchao Wei, UIUC; Guoliang Kang, UTS; Yi Yang, UTS; Thomas Huang, UIUC |
P-3C-67 | Attribute-Guided Face Generation Using Conditional CycleGAN | Yongyi Lu*, HKUST; Yu-Wing Tai, Tencent YouTu; Chi-Keung Tang, Hong Kong University of Science and Technology |
P-3C-68 | Neural Network Encapsulation | Hongyang Li*, Chinese University of Hong Kong; Bo Dai, The Chinese University of Hong Kong; Wanli Ouyang, CUHK; Xiaoyang Guo, The Chinese University of Hong Kong; Xiaogang Wang, Chinese University of Hong Kong, Hong Kong |
P-3C-69 | Deep Regionlets for Object Detection | Hongyu Xu*, University of Maryland; Xutao Lv, Intellifusion; Xiaoyu Wang, -; Zhou Ren, Snap Inc.; Navaneeth Bodla, University of Maryland; Rama Chellappa, University of Maryland |
P-3C-70 | Deep Adversarial Attention Alignment for Unsupervised Domain Adaptation: the Benefit of Target Expectation Maximization | Guoliang Kang*, UTS; Liang Zheng, Singapore University of Technology and Design; Yan Yan, UTS; Yi Yang, UTS |
P-3C-71 | Fighting Fake News: Image Splice Detection via Learned Self-Consistency | Jacob Huh*, Carnegie Mellon University; Andrew Liu, University of California, Berkeley; Andrew Owens, UC Berkeley; Alexei Efros, UC Berkeley |
P-3C-72 | Learning Monocular Depth by Distilling Cross-domain Stereo Networks | Xiaoyang Guo*, The Chinese University of Hong Kong; Hongsheng Li, Chinese University of Hong Kong; Shuai Yi, The Chinese University of Hong Kong; Jimmy Ren, Sensetime Research; Xiaogang Wang, Chinese University of Hong Kong, Hong Kong |
P-3C-73 | Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence | Arslan Chaudhry*, University of Oxford; Puneet Dokania, University of Oxford; Thalaiyasingam Ajanthan, University of Oxford; Philip Torr, University of Oxford |
P-3C-74 | Weakly Supervised Region Proposal Network and Object Detection | Peng Tang*, Huazhong University of Science and Technology; Xinggang Wang, Huazhong Univ. of Science and Technology; Angtian Wang, Huazhong University of Science and Technology; Yongluan Yan, Huazhong University of Science and Technology; Wenyu Liu, Huazhong University of Science and Technology; Junzhou Huang, Tencent AI Lab; Alan Yuille, Johns Hopkins University |
Poster session 4A
4A | Thursday, September 13 | Poster session 10:00 AM - 12:00 PM |
---|---|---|
P-4A-01 | Viewpoint Estimation - Insights & Model | Gilad Divon, Technion; Ayellet Tal*, Technion |
P-4A-02 | Towards Realistic Predictors | Pei Wang*, UC San Diego; Nuno Vasconcelos, UC San Diego |
P-4A-03 | Group Normalization | Yuxin Wu, Facebook; Kaiming He*, Facebook Inc., USA |
P-4A-04 | Deep Expander Networks: Efficient Deep Networks from Graph Theory | Ameya Prabhu*, IIIT Hyderabad; Girish Varma, IIIT Hyderabad; Anoop Namboodiri, IIIT Hyderabad |
P-4A-05 | Learning SO(3) Equivariant Representations with Spherical CNNs | Carlos Esteves*, University of Pennsylvania; Kostas Daniilidis, University of Pennsylvania; Ameesh Makadia, Google Research; Christine Allen-Blanchette, University of Pennsylvania |
P-4A-06 | Video Re-localization via Cross Gated Bilinear Matching | Yang Feng*, University of Rochester; Lin Ma, Tencent AI Lab; Wei Liu, Tencent AI Lab; Tong Zhang, Tencent AI Lab; Jiebo Luo, U. Rochester |
P-4A-07 | A Deeply-initialized Coarse-to-fine Ensemble of Regression Trees for Face Alignment | Roberto Valle*, Universidad Politécnica de Madrid; José Buenaposada, Universidad Rey Juan Carlos; Antonio Valdés, Universidad Complutense de Madrid; Luis Baumela, Universidad Politecnica de Madrid |
P-4A-08 | Deep Kalman Filtering Network for Video Compression Artifact Reduction | Guo Lu*, Shanghai Jiao Tong University; Wanli Ouyang, CUHK; Dong Xu, University of Sydney; Xiaoyun Zhang, Shanghai Jiao Tong University; Zhiyong Gao, Shanghai Jiao Tong University; Ming Ting Sun, - |
P-4A-09 | Exploring Visual Relationship for Image Captioning | Ting Yao*, Microsoft Research; Yingwei Pan, University of Science and Technology of China; Yehao Li, Sun Yat-Sen University; Tao Mei, JD.com |
P-4A-10 | Sequential Clique Optimization for Video Object Segmentation | Yeong Jun Koh*, Korea University; Young-Yoon Lee, Samsung; Chang-Su Kim, Korea University |
P-4A-11 | Spatial Pyramid Calibration for Image Classification | Yan Wang, Shanghai Jiao Tong University; Lingxi Xie*, JHU; Siyuan Qiao, Johns Hopkins University; Ya Zhang, Cooperative Medianet Innovation Center, Shanghai Jiao Tong University; Wenjun Zhang, Shanghai Jiao Tong University; Alan Yuille, Johns Hopkins University |
P-4A-12 | Visual Text Correction | Amir Mazaheri*, University of Central Florida; Mubarak Shah, University of Central Florida |
P-4A-13 | X-ray Computed Tomography Through Scatter | Adam Geva*, Technion; Yoav Y. Schechner, Technion; Jonathan Chernyak, Technion; Rajiv Gupta, MGH Harvard |
P-4A-14 | Graph Distillation for Action Detection with Privileged Information in RGB-D Videos | Zelun Luo*, Stanford University; Lu Jiang, Google; Jun-Ting Hsieh, Stanford University; Juan Carlos Niebles, Stanford University; Li Fei-Fei, Stanford University |
P-4A-15 | Modular Generative Adversarial Networks | Bo Zhao*, University of British Columbia; Bo Chang, University of British Columbia; Zequn Jie, Tencent AI Lab; Leonid Sigal, University of British Columbia |
P-4A-16 | R2P2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting | Nicholas Rhinehart*, CMU; Kris Kitani, CMU; Paul Vernaza, NEC Labs America |
P-4A-17 | DFT-based Transformation Invariant Pooling Layer for Visual Classification | Jongbin Ryu*, Hanyang University; Ming-Hsuan Yang, University of California at Merced; Jongwoo Lim, Hanyang University |
P-4A-18 | X2Face: A network for controlling face generation by using images, audio, and pose codes | Olivia Wiles*, University of Oxford; A Koepke, University of Oxford; Andrew Zisserman, University of Oxford |
P-4A-19 | Compositional Learning of Human Object Interactions | Keizo Kato, CMU; Yin Li*, CMU; Abhinav Gupta, CMU |
P-4A-20 | Learning to Navigate for Fine-grained Classification | Ze Yang*, Peking University; Tiange Luo, Peking University; Dong Wang, Peking University; Zhiqiang Hu, Peking University; Jun Gao, Peking University; Liwei Wang, Peking University |
P-4A-21 | Cross-Modal Ranking with Soft Consistency and Noisy Labels for Robust RGB-T Tracking | Chenglong Li, Anhui University; Chengli Zhu, Anhui University; Yan Huang, Institute of Automation, Chinese Academy of Sciences; Jin Tang, Anhui University; Liang Wang*, NLPR, China |
P-4A-22 | Light-weight CNN Architecture Design for Fast Inference | Ningning Ma*, Tsinghua; Xiangyu Zhang, Megvii Inc; Hai-Tao Zheng, Tsinghua University; Jian Sun, Megvii, Face++ |
P-4A-23 | Fully Motion-Aware Network for Video Object Detection | Shiyao Wang*, Tsinghua University; Yucong Zhou, Beihang University; Junjie Yan, Sensetime Group Limited |
P-4A-24 | Shift-Net: Image Inpainting via Deep Feature Rearrangement | Zhaoyi Yan, Harbin Institute of Technology; Xiaoming Li, Harbin Institute of Technology; Mu Li, The Hong Kong Polytechnic University; Wangmeng Zuo*, Harbin Institute of Technology, China; Shiguang Shan, Chinese Academy of Sciences |
P-4A-25 | Choose Your Neuron: Incorporating Domain Knowledge through Neuron Importance | Ramprasaath Ramasamy Selvaraju*, Virginia Tech; Prithvijit Chattopadhyay, Georgia Institute of Technology; Mohamed Elhoseiny, Facebook; Tilak Sharma, Facebook; Dhruv Batra, Georgia Tech & Facebook AI Research; Devi Parikh, Georgia Tech & Facebook AI Research; Stefan Lee, Georgia Institute of Technology |
P-4A-26 | Joint 3D tracking of a deformable object in interaction with a hand | Aggeliki Tsoli*, FORTH; Antonis Argyros, CSD-UOC and ICS-FORTH |
P-4A-27 | Interpolating Convolutional Neural Networks Using Batch Normalization | Gratianus Wesley Putra Data*, University of Oxford; Kirjon Ngu, University of Oxford; David Murray, University of Oxford; Victor Prisacariu, University of Oxford |
P-4A-28 | Learning Warped Guidance for Blind Face Restoration | Xiaoming Li, Harbin Institute of Technology; Ming Liu, Harbin Institute of Technology; Yuting Ye, Harbin Institute of Technology; Wangmeng Zuo*, Harbin Institute of Technology, China; Liang Lin, Sun Yat-sen University; Ruigang Yang, University of Kentucky, USA |
P-4A-29 | Separable Cross-Domain Translation | Yedid Hoshen*, Facebook AI Research (FAIR); Lior Wolf, Tel Aviv University, Israel |
P-4A-30 | Task-driven Webpage Saliency | Quanlong Zheng*, City University of Hong Kong; Jianbo Jiao, City University of Hong Kong; Ying Cao, City University of Hong Kong; Rynson Lau, City University of Hong Kong |
P-4A-31 | Appearance-Based Gaze Estimation via Evaluation-Guided Asymmetric Regression | Yihua Cheng, Beihang University; Feng Lu*, U. Tokyo; Xucong Zhang, Max Planck Institute for Informatics and Saarland University |
P-4A-32 | Pivot Correlational Neural Network for Multimodal Video Categorization | Sunghun Kang*, KAIST; Junyeong Kim, KAIST; Hyunsoo Choi, SAMSUNG ELECTRONICS CO.,LTD; Sungjin Kim, SAMSUNG ELECTRONICS CO.,LTD; Chang D. Yoo, KAIST |
P-4A-33 | Interactive Boundary Prediction for Object Selection | Hoang Le, Portland State University; Long Mai*, Adobe Research; Brian Price, Adobe; Scott Cohen, Adobe Research; Hailin Jin, Adobe Research; Feng Liu, Portland State University |
P-4A-34 | Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset | Heng Wang*, Facebook Inc; Lorenzo Torresani, Dartmouth College; Matt Feiszli, Facebook Research; Manohar Paluri, Facebook; Du Tran, Facebook; Jamie Ray, Facebook Research; Yufei Wang, Facebook |
P-4A-35 | Transferable Adversarial Perturbations | Bruce Hou*, Tencent; Wen Zhou, Tencent |
P-4A-36 | Incremental Non-Rigid Structure-from-Motion with Unknown Focal Length | Thomas Probst, ETH Zurich; Danda Pani Paudel*, ETH Zürich; Ajad Chhatkuli, ETHZ; Luc Van Gool, ETH Zurich |
P-4A-37 | Semantically Aware Urban 3D Reconstruction with Plane-Based Regularization | Thomas Holzmann*, Graz University of Technology; Michael Maurer, Graz University of Technology; Friedrich Fraundorfer, Graz University of Technology; Horst Bischof, Graz University of Technology |
P-4A-38 | Learning to Dodge A Bullet | Shi Jin*, ShanghaiTech University; Jinwei Ye, Louisiana State University; Yu Ji, Plex-VR; Ruiyang Liu, ShanghaiTech University; Jingyi Yu, ShanghaiTech University |
P-4A-39 | Training Binary Weight Networks via Semi-Binary Decomposition | Qinghao Hu*, Institute of Automation, Chinese Academy of Sciences; Gang Li, Institute of Automation, Chinese Academy of Sciences; Peisong Wang, Institute of Automation, Chinese Academy of Sciences; Yifan Zhang, Institute of Automation, Chinese Academy of Sciences; Jian Cheng, Chinese Academy of Sciences, China |
P-4A-40 | Learnable PINs: Cross-Modal Embeddings for Person Identity | Samuel Albanie*, University of Oxford; Arsha Nagrani, Oxford University; Andrew Zisserman, University of Oxford |
P-4A-41 | Toward Characteristic-Preserving Image-based Virtual Try-On Network | Bochao Wang, Sun Yat-sen University; Huabin Zheng, Sun Yat-Sen University; Xiaodan Liang*, Carnegie Mellon University; Yimin Chen, SenseTime; Liang Lin, Sun Yat-sen University |
P-4A-42 | Deep Feature Factorization For Unsupervised Concept Discovery | Edo Collins*, EPFL; Radhakrishna Achanta, EPFL; Sabine Süsstrunk, EPFL |
P-4A-43 | SOD-MTGAN: Small Object Detection via Multi-Task Generative Adversarial Network | Yongqiang Zhang*, Harbin Institute of Technology/KAUST; Yancheng Bai, KAUST/ISCAS; Mingli Ding, Harbin Institute of Technology; Bernard Ghanem, KAUST |
P-4A-44 | Human Motion Analysis with Deep Metric Learning | Huseyin Coskun*, Technical University of Munich; David Joseph Tan, CAMP, TU Munich; Sailesh Conjeti, Technical University of Munich; Nassir Navab, TU Munich, Germany; Federico Tombari, Technical University of Munich, Germany |
P-4A-45 | Dist-GAN: An Improved GAN using Distance Constraints | Ngoc-Trung Tran*, Singapore University of Technology and Design; Tuan Anh Bui, Singapore University of Technology and Design; Ngai-Man Cheung, Singapore University of Technology and Design |
P-4A-46 | Cross-Modal and Hierarchical Modeling of Video and Text | Bowen Zhang*, University of Southern California; Hexiang Hu, University of Southern California; Fei Sha, USC |
P-4A-47 | Deep Image Demosaicking using a Cascade of Convolutional Residual Denoising Networks | Filippos Kokkinos*, Skolkovo Institute of Science and Technology; Stamatis Lefkimmiatis, Skolkovo Institute of Science and Technology |
P-4A-48 | Deep Clustering for Unsupervised Learning of Visual Features | Mathilde Caron*, Facebook Artificial Intelligence Research; Piotr Bojanowski, Facebook; Armand Joulin, Facebook AI Research; Matthijs Douze, Facebook AI Research |
P-4A-49 | Domain Adaptation through Synthesis for Unsupervised Person Re-identification | Slawomir Bak*, Argo AI; Jean-Francois Lalonde, Université Laval; Pete Carr, Argo AI |
P-4A-50 | Facial Expression Recognition with Inconsistently Annotated Datasets | Jiabei Zeng*, Institute of Computing Technology, Chinese Academy of Sciences; Shiguang Shan, Chinese Academy of Sciences; Chen Xilin, Institute of Computing Technology, Chinese Academy of Sciences |
P-4A-51 | Single Shot Scene Text Retrieval | Lluis Gomez*, Universitat Autónoma de Barcelona; Andres Mafla, Computer Vision Center; Marçal Rossinyol, Universitat Autónoma de Barcelona; Dimosthenis Karatzas, Computer Vision Centre |
P-4A-52 | DeepVS: A Deep Learning Based Video Saliency Prediction Approach | Lai Jiang, BUAA; Mai Xu*, BUAA; Minglang Qiao, BUAA; Zulin Wang, BUAA |
P-4A-53 | Generalizing A Person Retrieval Model Hetero- and Homogeneously | Zhun Zhong*, Xiamen University; Liang Zheng, Singapore University of Technology and Design; Shaozi Li, Xiamen University, China; Yi Yang, University of Technology, Sydney |
P-4A-54 | A New Large Scale Dynamic Texture Dataset with Application to ConvNet Understanding | Isma Hadji*, York University; Rick Wildes, York University |
P-4A-55 | Deep Cross-modality Adaptation via Semantics Preserving Adversarial Learning for Sketch-based 3D Shape Retrieval | Jiaxin Chen, New York University Abu Dhabi; Yi Fang*, New York University |
P-4A-56 | BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation | Changqian Yu*, Huazhong University of Science and Technology; Jingbo Wang, Peking University; Chao Peng, Megvii(Face++) Inc; Changxin Gao, Huazhong University of Science and Technology; Gang Yu, Face++; Nong Sang, School of Automation, Huazhong University of Science and Technology |
P-4A-57 | Face De-spoofing | Yaojie Liu*, Michigan State University; Amin Jourabloo, Michigan State University; Xiaoming Liu, Michigan State University |
P-4A-58 | Towards End-to-End License Plate Detection and Recognition: A Large Dataset and Baseline | Zhenbo Xu*, University of Science and Technology in China; Wei Yang, University of Science and Technology in China; Ajin Meng, University of Science and Technology in China; Nanxue Lu, University of Science and Technology in China; Huan Huang, Xingtai Financial Holdings Group Co., Ltd. |
P-4A-59 | Self-supervised Tracking by Colorization | Carl Vondrick*, MIT; Abhinav Shrivastava, UMD / Google; Alireza Fathi, Google; Sergio Guadarrama, Google; Kevin Murphy, Google |
P-4A-60 | Pose Proposal Networks | Taiki Sekii*, Konica Minolta, inc. |
P-4A-61 | Incremental Multi-graph Matching via Diversity and Randomness based Graph Clustering | Tianshu Yu*, Arizona State University; Junchi Yan, Shanghai Jiao Tong University; Baoxin Li, Arizona State University; Wei Liu, Tencent AI Lab |
P-4A-62 | Single Image Intrinsic Decomposition Without a Single Intrinsic Image | Wei-Chiu Ma*, MIT; Hang Chu, University of Toronto; Bolei Zhou, MIT; Raquel Urtasun, University of Toronto; Antonio Torralba, MIT |
P-4A-63 | Triplet Loss with Theoretical Analysis in Siamese Network for Real-Time Object Tracking | Xingping Dong, Beijing Institute of Technology; Jianbing Shen*, Beijing Institute of Technology |
P-4A-64 | Learning to Learn Parameterized Image Operators | Qingnan Fan, Shandong University; Dongdong Chen*, University of Science and Technology of China; Lu Yuan, Microsoft Research Asia; Gang Hua, Microsoft Cloud and AI; Nenghai Yu, University of Science and Technology of China; Baoquan Chen, Shandong University |
P-4A-65 | HBE: Hand Branch Ensemble network for real time 3D hand pose estimation | Yidan Zhou, Dalian University of Technology; Jian Lu, Laboratory of Advanced Design and Intelligent Computing, Dalian University; Kuo Du, Dalian University of Technology; Xiangbo Lin*, Dalian University of Technology; Yi Sun, Dalian University of Technology; Xiaohong Ma, Dalian University of Technology |
P-4A-66 | Generative Semantic Manipulation with Mask-Contrasting GAN | Xiaodan Liang*, Carnegie Mellon University |
P-4A-67 | Learning to Fuse Proposals from Multiple Scanline Optimizations in Semi-Global Matching | Johannes Schoenberger*, ETH Zurich; Sudipta Sinha, Microsoft Research; Marc Pollefeys, ETH Zurich |
P-4A-68 | Less is More: Picking Informative Frames for Video Captioning | Yangyu Chen*, University of Chinese Academy of Sciences; Shuhui Wang, VIPL, ICT, Chinese Academy of Sciences; Weigang Zhang, Harbin Institute of Technology, Weihai; Qingming Huang, University of Chinese Academy of Sciences, China |
P-4A-69 | Deep Pictorial Gaze Estimation | Seonwook Park*, ETH Zurich; Adrian Spurr, ETH Zurich; Otmar Hilliges, ETH Zurich |
P-4A-70 | SkipNet: Learning Dynamic Execution in Residual Networks | Xin Wang*, UC Berkeley; Fisher Yu, UC Berkeley; Zi-Yi Dou, Nanjing University; Trevor Darrell, UC Berkeley; Joseph Gonzalez, UC Berkeley |
P-4A-71 | Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes | Pengyuan Lyu*, Huazhong University of Science and Technology; Minghui Liao, Huazhong University of Science and Technology; Cong Yao, Megvii; Wenhao Wu, Megvii; Xiang Bai, Huazhong University of Science and Technology |
P-4A-72 | Deep Adaptive Attention for Joint Facial Action Unit Detection and Face Alignment | Zhiwen Shao*, Shanghai Jiao Tong University; Zhilei Liu, Tianjin University; Jianfei Cai, Nanyang Technological University; Lizhuang Ma, Shanghai Jiao Tong University |
P-4A-73 | Semantic Scene Understanding under Dense Fog with Synthetic and Real Data | Christos Sakaridis*, ETH Zurich; Dengxin Dai, ETH Zurich; Simon Hecker, ETH Zurich; Luc Van Gool, ETH Zurich |
P-4A-74 | RIDI: Robust IMU Double Integration | Hang Yan*, Washington University in St. Louis; Qi Shan, Zillow Group; Yasutaka Furukawa, Simon Fraser University |
P-4A-75 | Weakly-supervised Video Summarization using Variational Encoder-Decoder and Web Prior | Sijia Cai*, The Hong Kong Polytechnic University; Wangmeng Zuo, Harbin Institute of Technology; Larry Davis, University of Maryland; Lei Zhang, Hong Kong Polytechnic University, Hong Kong, China |
P-4A-76 | Transferring Common-Sense Knowledge for Object Detection | Krishna Kumar Singh*, University of California Davis; Santosh Divvala, Allen AI; Ali Farhadi, University of Washington; Yong Jae Lee, University of California, Davis |
P-4A-77 | Person Search in Videos with One Portrait Through Visual and Temporal Links | Qingqiu Huang*, CUHK; Wentao Liu, Sensetime; Dahua Lin, The Chinese University of Hong Kong |
P-4A-78 | Eliminating the Dreaded Blind Spot: Adapting 3D Object Detection and Monocular Depth Estimation to 360° Panoramic Imagery | Gregoire Payen de La Garanderie*, Durham University; Toby Breckon, Durham University; Amir Atapour-Abarghouei, Durham University |
P-4A-79 | Folded Recurrent Neural Networks for Future Video Prediction | Marc Oliu*, Universitat Oberta de Catalunya; Javier Selva, Universitat de Barcelona; Sergio Escalera, Computer Vision Center (UAB) & University of Barcelona |
P-4A-80 | Deep Regression Tracking with Shrinkage Loss | Xiankai Lu, Shanghai Jiao Tong University; Chao Ma*, University of Adelaide; Bingbing Ni, Shanghai Jiao Tong University; Xiaokang Yang, Shanghai Jiao Tong University of China; Ian Reid, University of Adelaide, Australia; Ming-Hsuan Yang, University of California at Merced |
P-4A-81 | Stroke Controllable Fast Style Transfer with Adaptive Receptive Fields | Yongcheng Jing, Zhejiang University; Yang Liu, Zhejiang University; Yezhou Yang, Arizona State University; Zunlei Feng, Zhejiang University; Yizhou Yu, The University of Hong Kong; Dacheng Tao, University of Sydney; Mingli Song*, Zhejiang University |
P-4A-82 | Part-Aligned Bilinear Representations for Person Re-Identification | Yumin Suh, Seoul National University; Jingdong Wang, Microsoft Research; Kyoung Mu Lee*, Seoul National University |
P-4A-83 | Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network | Yao Feng*, Shanghai Jiao Tong University; Fan Wu, CloudWalk Technology; Xiao-Hu Shao, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences; Yan-Feng Wang, Shanghai Jiao Tong University; Xi Zhou, CloudWalk Technology |
P-4A-84 | Learning Efficient Single-stage Pedestrian Detection by Asymptotic Localization Fitting | Wei Liu*, National University of Defense Technology; Shengcai Liao, NLPR, Chinese Academy of Sciences, China; Weidong Hu, National University of Defence Technology; Xuezhi Liang, Center for Biometrics and Security Research & National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of Sciences; Xiao Chen, National University of Defense Technology |
P-4A-85 | Unsupervised Hard-Negative Mining from Videos for Object Detection | SouYoung Jin*, UMASS Amherst; Huaizu Jiang, UMass Amherst; Aruni RoyChowdhury, University of Massachusetts, Amherst; Ashish Singh, UMASS Amherst; Aditya Prasad, UMASS Amherst; Deep Chakraborty, UMASS Amherst; Erik Learned-Miller, University of Massachusetts, Amherst |
P-4A-86 | Focus, Segment and Erase: An Efficient Network for Multi-Label Brain Tumor Segmentation | Xuan Chen*, NUS; Jun Hao Liew, NUS; Wei Xiong, A*STAR Institute for Infocomm Research, Singapore; Chee-Kong Chui, NUS; Sim-Heng Ong, NUS |
P-4A-87 | Maximum Margin Metric Learning Over Discriminative Nullspace for Person Re-identification | T M Feroz Ali*, Indian Institute of Technology Bombay, Mumbai; Subhasis Chaudhuri, Indian Institute of Technology Bombay |
P-4A-88 | Efficient Relative Attribute Learning using Graph Neural Networks | Zihang Meng*, University of Wisconsin Madison; Nagesh Adluru, WISC; Vikas Singh, University of Wisconsin-Madison USA |
P-4A-89 | Object Level Visual Reasoning in Videos | Fabien Baradel, LIRIS; Natalia Neverova*, Facebook AI Research; Christian Wolf, INSA Lyon, France; Julien Mille, INSA Centre Val de Loire; Greg Mori, Simon Fraser University |
Poster session 4B
4B | Thursday, September 13 | Poster session 04:00 PM - 06:00 PM |
---|---|---|
P-4B-01 | Deep Model-Based 6D Pose Refinement in RGB | Fabian Manhardt*, TU Munich; Wadim Kehl, Toyota Research Institute; Nassir Navab, Technische Universität München, Germany; Federico Tombari, Technical University of Munich, Germany |
P-4B-02 | ContextVP: Fully Context-Aware Video Prediction | Wonmin Byeon*, NVIDIA; Qin Wang, ETH Zurich; Rupesh Kumar Srivastava, NNAISENSE; Petros Koumoutsakos, ETH Zurich |
P-4B-03 | CornerNet: Detecting Objects as Paired Keypoints | Hei Law*, University of Michigan; Jia Deng, University of Michigan |
P-4B-04 | RelocNet: Continuous Metric Learning Relocalisation using Neural Nets | Vassileios Balntas*, University of Oxford; Victor Prisacariu, University of Oxford; Shuda Li, University of Oxford |
P-4B-05 | Museum Exhibit Identification Challenge for the Supervised Domain Adaptation | Piotr Koniusz*, Data61/CSIRO, ANU; Yusuf Tas, Data61; Hongguang Zhang, Australian National University; Mehrtash Harandi, Monash University; Fatih Porikli, ANU; Rui Zhang, University of Canberra |
P-4B-06 | Acquisition of Localization Confidence for Accurate Object Detection | Borui Jiang*, Peking University; Ruixuan Luo, Peking University; Jiayuan Mao, Tsinghua University; Tete Xiao, Peking University; Yuning Jiang, Megvii(Face++) Inc |
P-4B-07 | The Contextual Loss for Image Transformation with Non-Aligned Data | Roey Mechrez*, Technion; Itamar Talmi, Technion; Lihi Zelnik-Manor, Technion |
P-4B-08 | Saliency Benchmarking Made Easy: Separating Models, Maps and Metrics | Matthias Kümmerer*, University of Tübingen; Thomas Wallis, University of Tübingen; Matthias Bethge, University of Tübingen |
P-4B-09 | Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition | Ming Sun, Baidu; Yuchen Yuan, Baidu Inc.; Feng Zhou*, Baidu Research; Errui Ding, Baidu Inc. |
P-4B-10 | Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation | Xin Wang*, University of California, Santa Barbara; Wenhan Xiong, University of California, Santa Barbara; Hongmin Wang, University of California, Santa Barbara; William Wang, UC Santa Barbara |
P-4B-11 | HandMap: Robust Hand Pose Estimation via Intermediate Dense Guidance Map Supervision | Xiaokun Wu*, University of Bath; Daniel Finnegan, University of Bath; Eamonn O'Neill, University of Bath; Yongliang Yang, University of Bath |
P-4B-12 | LSQ++: lower runtime and higher recall in multi-codebook quantization | Julieta Martinez*, University of British Columbia; Shobhit Zakhmi, University of British Columbia; Holger Hoos, University of British Columbia; Jim Little, University of British Columbia, Canada |
P-4B-13 | Multimodal Dual Attention Memory for Video Story Question Answering | Kyungmin Kim*, Seoul National University; Seong-Ho Choi, Seoul National University; Jin-Hwa Kim, Seoul National University; Byoung-Tak Zhang, Seoul National University |
P-4B-14 | Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition | Chaojian Yu*, Huazhong University of Science and Technology; Qi Zheng, Huazhong University of Science and Technology; Xinyi Zhao, Huazhong University of Science and Technology; Peng Zhang, Huazhong University of Science and Technology; Xinge You, School of Electronic Information and Communications, Huazhong University of Science and Technology |
P-4B-15 | Dense Semantic and Topological Correspondence of 3D Faces without Landmarks | Zhenfeng Fan*, Chinese Academy of Sciences; Hu Xiyuan, The Chinese Academy of Sciences; Chen Chen, The Chinese Academy of Sciences; Peng Silong, The Chinese Academy of Sciences |
P-4B-16 | Real-Time Blind Video Temporal Consistency | Wei-Sheng Lai*, University of California, Merced; Jia-Bin Huang, Virginia Tech; Oliver Wang, Adobe Systems Inc; Eli Shechtman, Adobe Research, US; Ersin Yumer, Argo AI; Ming-Hsuan Yang, University of California at Merced |
P-4B-17 | Depth Estimation via Affinity Learned with Convolutional Spatial Propagation Network | Xinjing Cheng, Baidu; Peng Wang*, Baidu USA LLC; Ruigang Yang, University of Kentucky, USA |
P-4B-18 | Hierarchical Metric Learning and Matching for 2D and 3D Geometric Correspondences | Mohammed Fathy, University of Maryland College Park; Quoc-Huy Tran*, NEC Labs; Zeeshan Zia, Microsoft; Paul Vernaza, NEC Labs America; Manmohan Chandraker, NEC Labs America |
P-4B-19 | GridFace: Face Rectification via Learning Local Homography Transformations | Erjin Zhou*, Megvii Research |
P-4B-20 | Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification | Saining Xie*, UCSD; Chen Sun, Google; Jonathan Huang, Google; Zhuowen Tu, UC San Diego; Kevin Murphy, Google |
P-4B-21 | Deep Variational Metric Learning | Xudong Lin, Tsinghua University; Yueqi Duan, Tsinghua University; Qiyuan Dong, Tsinghua University; Jiwen Lu*, Tsinghua University; Jie Zhou, Tsinghua University, China |
P-4B-22 | Multi-Class Model Fitting by Energy Minimization and Mode-Seeking | Dániel Baráth*, MTA SZTAKI, CMP Prague; Jiri Matas, CMP CTU FEE |
P-4B-23 | A Unified Framework for Single-View 3D Reconstruction with Limited Pose Supervision | Guandao Yang*, Cornell University; Yin Cui, Cornell University; Bharath Hariharan, Cornell University |
P-4B-24 | Diverse Conditional Image Generation by Stochastic Regression with Latent Drop-Out Codes | Yang He*, MPI Informatics; Bernt Schiele, MPI; Mario Fritz, Max-Planck-Institut für Informatik |
P-4B-25 | Orthogonal Deep Features Decomposition for Age-Invariant Face Recognition | Yitong Wang, Tencent AI Lab; Dihong Gong, Tencent AI Lab; Zheng Zhou, Tencent AI Lab; Xing Ji, Tencent AI Lab; Hao Wang, Tencent AI Lab; Zhifeng Li*, Tencent AI Lab; Wei Liu, Tencent AI Lab; Tong Zhang, Tencent AI Lab |
P-4B-26 | HiDDeN: Hiding Data with Deep Networks | Jiren Zhu*, Stanford University; Russell Kaplan, Stanford University; Justin Johnson, Stanford University; Li Fei-Fei, Stanford University |
P-4B-27 | Learning and Matching Multi-View Descriptors for Registration of Point Clouds | Lei Zhou*, HKUST; Siyu Zhu, HKUST; Zixin Luo, HKUST; Tianwei Shen, HKUST; Runze Zhang, HKUST; Tian Fang, HKUST; Long Quan, Hong Kong University of Science and Technology |
P-4B-28 | Deep Burst Denoising | Clement Godard*, University College London; Kevin Matzen, Facebook; Matt Uyttendaele, Facebook |
P-4B-29 | On Offline Evaluation of Vision-based Driving Models | Felipe Codevilla, UAB; Antonio Lopez, CVC & UAB; Vladlen Koltun, Intel Labs; Alexey Dosovitskiy*, Intel Labs |
P-4B-30 | Distortion-Aware Convolutional Filters for Dense Prediction in Panoramic Images | Keisuke Tateno*, Technical University Munich; Nassir Navab, TU Munich, Germany; Federico Tombari, Technical University of Munich, Germany |
P-4B-31 | Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground | Deng-Ping Fan, Nankai University; Jiang-Jiang Liu, Nankai University; Shanghua Gao, Nankai University; Qibin Hou, Nankai University; Ming-Ming Cheng*, Nankai University; Ali Borji, University of Central Florida |
P-4B-32 | Randomized Ensemble Embeddings | Hong Xuan*, The George Washington University; Robert Pless, George Washington University |
P-4B-33 | Conditional Prior Networks for Optical Flow | Yanchao Yang*, UCLA; Stefano Soatto, UCLA |
P-4B-34 | Adaptively Transforming Graph Matching | Fudong Wang, Wuhan University; Nan Xue, Wuhan University; Yi-Peng Zhang, Syracuse University; Xiang Bai, Huazhong University of Science and Technology; Gui-Song Xia*, Wuhan University |
P-4B-35 | Learning 3D shapes as multi-layered height maps using 2D convolutional neural networks | Kripasindhu Sarkar*, University of Kaiserslautern; Basavaraj Hampiholi, University of Kaiserslautern; Kiran Varanasi, German Research Center for Artificial Intelligence; Didier Stricker, DFKI |
P-4B-36 | ISNN - Impact Sound Neural Network for Material and Geometry Classification | Auston Sterling*, UNC Chapel Hill; Justin Wilson, UNC Chapel Hill; Sam Lowe, UNC Chapel Hill; Ming Lin, UNC Chapel Hill |
P-4B-37 | Visual Psychophysics for Making Face Recognition Algorithms More Explainable | Brandon RichardWebster*, University of Notre Dame; So Yon Kwon, Perceptive Automata; Samuel Anthony, Perceptive Automata; Christopher Clarizio, University of Notre Dame; Walter Scheirer, University of Notre Dame |
P-4B-38 | Show, Tell and Discriminate: Image Captioning by Self-retrieval with Partially Labeled Data | Xihui Liu*, The Chinese University of Hong Kong; Hongsheng Li, Chinese University of Hong Kong; Jing Shao, The Chinese University of Hong Kong; Dapeng Chen, The Chinese University of Hong Kong; Xiaogang Wang, Chinese University of Hong Kong, Hong Kong |
P-4B-39 | Using LIP to Gloss Over Faces in Single-Stage Face Detection Networks | Siqi Yang*, UQ ITEE; Arnold Wiliem, University of Queensland; Shaokang Chen, University of Queensland; Brian Lovell, University of Queensland |
P-4B-40 | Variational Wasserstein Clustering | Liang Mi*, Arizona State University; Wen Zhang, ASU; Xianfeng Gu, Stony Brook University; Yalin Wang, Arizona State University |
P-4B-41 | ADVISE: Symbolism and External Knowledge for Decoding Advertisements | Keren Ye*, University of Pittsburgh; Adriana Kovashka, University of Pittsburgh |
P-4B-42 | Weakly- and Semi-Supervised Panoptic Segmentation | Anurag Arnab*, University of Oxford; Philip Torr, University of Oxford; Qizhu Li, University of Oxford |
P-4B-43 | Broadcasting Convolutional Network for Visual Relational Reasoning | Simyung Chang, Seoul National University; John Yang, Seoul National University; Seonguk Park, Seoul National University; Nojun Kwak*, Seoul National University |
P-4B-44 | A Unified Framework for Multi-View Multi-Class Object Pose Estimation | Chi Li*, Johns Hopkins University; Jin Bai, Johns Hopkins University; Gregory D. Hager, The Johns Hopkins University |
P-4B-45 | Fast and Accurate Point Cloud Registration using Trees of Gaussian Mixtures | Benjamin Eckart*, NVIDIA; Kihwan Kim, NVIDIA; Jan Kautz, NVIDIA |
P-4B-46 | Teaching Machines to Understand Baseball Games: Large Scale Baseball Video Database for Multiple Video Understanding Tasks | Minho Shim, Yonsei University; Kyungmin Kim, Yonsei University; Young Hwi Kim, Yonsei University; Seon Joo Kim*, Yonsei Univ. |
P-4B-47 | Using Object Information for Spotting Text | Shitala Prasad*, NTU Singapore; Wai-Kin Adams Kong, Nanyang Technological University |
P-4B-48 | Deep Domain Generalization via Conditional Invariant Adversarial Networks | Ya Li, USTC; Xinmei Tian, USTC; Mingming Gong, CMU & U Pitt; Yajing Liu*, USTC; Tongliang Liu, The University of Sydney; Kun Zhang, Carnegie Mellon University; Dacheng Tao, University of Sydney |
P-4B-49 | On the Solvability of Viewing Graphs | Matthew Trager*, INRIA; Brian Osserman, UC Davis; Jean Ponce, Inria |
P-4B-50 | Learning Type-Aware Embeddings for Fashion Compatibility | Mariya Vasileva*, University of Illinois at Urbana-Champaign; Bryan Plummer, Boston University; Krishna Dusad, University of Illinois at Urbana-Champaign; Shreya Rajpal, University of Illinois at Urbana-Champaign; David Forsyth, University of Illinois at Urbana-Champaign; Ranjitha Kumar, UIUC: CS |
P-4B-51 | Visual Coreference Resolution in Visual Dialog using Neural Module Networks | Satwik Kottur*, Carnegie Mellon University; José M. F. Moura, Carnegie Mellon University; Devi Parikh, Georgia Tech & Facebook AI Research; Dhruv Batra, Georgia Tech & Facebook AI Research; Marcus Rohrbach, Facebook AI Research |
P-4B-52 | Hard-Aware Point-to-Set Deep Metric for Person Re-identification | Rui Yu*, Huazhong University of Science and Technology; Zhiyong Dou, Huazhong University of Science and Technology; Song Bai, HUST; Zhao-Xiang Zhang, Chinese Academy of Sciences, China; Yongchao Xu, HUST; Xiang Bai, Huazhong University of Science and Technology |
P-4B-53 | Gray box adversarial training | Vivek B S*, Indian Institute of Science; Konda Reddy Mopuri, Indian Institute of Science, Bangalore; Venkatesh Babu Radhakrishnan, Indian Institute of Science |
P-4B-54 | Exploiting Vector Fields for Geometric Rectification of Distorted Document Images | Gaofeng Meng*, Chinese Academy of Sciences; Yuanqi Su, Xi'an Jiaotong University; Ying Wu, Northwestern University; Shiming Xiang, Chinese Academy of Sciences, China; Chunhong Pan, Institute of Automation, Chinese Academy of Sciences |
P-4B-55 | Revisiting RCNN: On Awakening the Classification Power of Faster RCNN | Yunchao Wei*, UIUC; Bowen Cheng, UIUC; Honghui Shi, UIUC; Rogerio Feris, IBM Research; Jinjun Xiong, IBM Thomas J. Watson Research Center; Thomas Huang, UIUC |
P-4B-56 | DeepTAM: Deep Tracking and Mapping | Huizhong Zhou*, University of Freiburg; Benjamin Ummenhofer, University of Freiburg; Thomas Brox, University of Freiburg |
P-4B-57 | On Regularized Losses for Weakly-supervised CNN Segmentation | Meng Tang*, University of Waterloo; Ismail Ben Ayed, ETS; Federico Perazzi, Disney Research; Abdelaziz Djelouah, Disney Research; Christopher Schroers, Disney Research; Yuri Boykov, University of Waterloo |
P-4B-58 | ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids | Dinesh Jayaraman*, UC Berkeley; Ruohan Gao, University of Texas at Austin; Kristen Grauman, University of Texas |
P-4B-59 | A Minimal Closed-Form Solution for Multi-Perspective Pose Estimation using Points and Lines | Pedro Miraldo*, Instituto Superior Técnico, Lisboa; Tiago Dias, Institute for systems and robotics; Srikumar Ramalingam, University of Utah |
P-4B-60 | Interaction-aware Spatio-temporal Pyramid Attention Networks for Action Classification | Yang Du, NLPR; Chunfeng Yuan*, NLPR; Weiming Hu, Institute of Automation, Chinese Academy of Sciences |
P-4B-61 | Towards Privacy-Preserving Visual Recognition via Adversarial Training: A Pilot Study | Zhenyu Wu, Texas A&M University; Zhangyang Wang*, Texas A&M University; Zhaowen Wang, Adobe Research; Hailin Jin, Adobe Research |
P-4B-62 | Polarimetric Three-View Geometry | Lixiong Chen, National Institute of Informatics; Yinqiang Zheng*, National Institute of Informatics; Art Subpa-asa, Tokyo Institute of Technology; Imari Sato, National Institute of Informatics |
P-4B-63 | SketchyScene: Richly-Annotated Scene Sketches | Changqing Zou*, University of Maryland (UMD); Qian Yu, Queen Mary University of London; Ruofei Du, UMD; Haoran Mo, Sun Yat-sen University; Yi-Zhe Song, Queen Mary University of London; Tao Xiang, Queen Mary, University of London, UK; Chengying Gao, Sun Yat-sen University; Baoquan Chen, Shandong University; Hao Zhang, SFU |
P-4B-64 | Bi-Real Net: Enhancing the Performance of 1-bit CNNs with Improved Representational Capability and Advanced Training Algorithm | Zechun Liu*, HKUST; Baoyuan Wu, Tencent AI Lab; Wenhan Luo, Tencent AI Lab; Xin Yang, Huazhong University of Science and Technology; Wei Liu, Tencent AI Lab; Kwang-Ting Cheng, Hong Kong University of Science and Technology |
P-4B-65 | Deep Continuous Fusion for Multi-Sensor 3D Object Detection | Ming Liang*, Uber; Shenlong Wang, Uber ATG, University of Toronto; Bin Yang, Uber ATG, University of Toronto; Raquel Urtasun, Uber ATG |
P-4B-66 | Focus on the Hard Things: Dynamic Task Prioritization for Multitask Learning | Michelle Guo*, Stanford University; Albert Haque, Stanford University; De-An Huang, Stanford University; Serena Yeung, Stanford University; Li Fei-Fei, Stanford University |
P-4B-67 | Domain transfer through deep activation matching | Haoshuo Huang*, Tsinghua University; Qixing Huang, The University of Texas at Austin; Philipp Kraehenbuehl, UT Austin |
P-4B-68 | Joint Blind Motion Deblurring and Depth Estimation of Light Field | Dongwoo Lee, Seoul National University; Haesol Park, Seoul National University; In Kyu Park, Inha University; Kyoung Mu Lee*, Seoul National University |
P-4B-69 | Learning to Look around Objects for Top-View Representations of Outdoor Scenes | Samuel Schulter*, NEC Labs; Menghua Zhai, University of Kentucky; Nathan Jacobs, University of Kentucky; Manmohan Chandraker, NEC Labs America |
P-4B-70 | Data-Driven Sparse Structure Selection for Deep Neural Networks | Zehao Huang*, TuSimple; Naiyan Wang, TuSimple |
P-4B-71 | Reconstruction-based Pairwise Depth Dataset for Depth Image Enhancement Using CNN | Junho Jeon, POSTECH; Seungyong Lee*, POSTECH |
P-4B-72 | A Geometric Perspective on Structured Light Coding | Mohit Gupta*, University of Wisconsin-Madison, USA; Nikhil Nakhate, University of Wisconsin-Madison |
P-4B-73 | 3D Ego-Pose Estimation via Imitation Learning | Ye Yuan*, Carnegie Mellon University; Kris Kitani, CMU |
P-4B-74 | Unsupervised Learning of Multi-Frame Optical Flow with Occlusions | Joel Janai*, Max Planck Institute for Intelligent Systems; Fatma Güney, University of Oxford; Anurag Ranjan, MPI for Intelligent Systems; Michael Black, Max Planck Institute for Intelligent Systems; Andreas Geiger, MPI-IS and University of Tuebingen |
P-4B-75 | Dynamic Conditional Networks for Few-Shot Learning | Fang Zhao, National University of Singapore; Jian Zhao*, National University of Singapore; Yan Shuicheng, National University of Singapore; Jiashi Feng, NUS |
P-4B-76 | 3DFeat-Net: Weakly Supervised Local 3D Features for Rigid Point Cloud Registration | Zi Jian Yew*, National University of Singapore; Gim Hee Lee, National University of Singapore |
P-4B-77 | Learning to Forecast and Refine Residual Motion for Image-to-Video Generation | Long Zhao*, Rutgers University; Xi Peng, Rutgers University; Yu Tian, Rutgers; Mubbasir Kapadia, Rutgers; Dimitris Metaxas, Rutgers |
P-4B-78 | Learn-to-Score: Efficient 3D Scene Exploration by Predicting View Utility | Benjamin Hepp*, ETH Zurich; Debadeepta Dey, Microsoft; Sudipta Sinha, Microsoft Research; Ashish Kapoor, Microsoft; Neel Joshi, -; Otmar Hilliges, ETH Zurich |
P-4B-79 | Deep Co-Training for Semi-Supervised Image Recognition | Siyuan Qiao*, Johns Hopkins University; Wei Shen, Shanghai University; Zhishuai Zhang, Johns Hopkins University; Bo Wang, Hikvision Research Institute; Alan Yuille, Johns Hopkins University |
P-4B-80 | Attention-aware Deep Adversarial Hashing for Cross Modal Retrieval | Xi Zhang, Sun Yat-Sen University; Hanjiang Lai*, Sun Yat-Sen University; Jiashi Feng, NUS |
P-4B-81 | Remote Photoplethysmography Correspondence Feature for 3D Mask Face Presentation Attack Detection | Siqi Liu*, Department of Computer Science, Hong Kong Baptist University; Xiangyuan Lan, Department of Computer Science, Hong Kong Baptist University; PongChi Yuen, Department of Computer Science, Hong Kong Baptist University |
P-4B-82 | Semi-Supervised Generative Adversarial Hashing for Image Retrieval | Guan'an Wang*, Chinese Academy of Sciences; Qinghao Hu, Chinese Academy of Sciences; Jian Cheng, Chinese Academy of Sciences, China; Zengguang Hou, Chinese Academy of Sciences |
P-4B-83 | Improving Spatiotemporal Self-Supervision by Deep Reinforcement Learning | Uta Büchler*, Heidelberg University; Biagio Brattoli, Heidelberg University; Bjorn Ommer, Heidelberg University |
P-4B-84 | AutoLoc: Weakly-supervised Temporal Action Localization in Untrimmed Videos | Zheng Shou*, Columbia University; Hang Gao, Columbia University; Lei Zhang, Microsoft Research; Kazuyuki Miyazawa, Mitsubishi Electric; Shih-Fu Chang, Columbia University |
P-4B-85 | Revisiting Autofocus for Smartphone Cameras | Abdullah Abuolaim*, York University; Abhijith Punnappurath, York University; Michael Brown, York University |
P-4B-86 | Contour Knowledge Transfer for Salient Object Detection | Xin Li, UESTC; Fan Yang*, UESTC; Hong Cheng, UESTC; Wei Liu, Digital Media Technology Key Laboratory of Sichuan Province, UESTC; Dinggang Shen, UNC |
P-4B-87 | Deep Volumetric Video From Very Sparse Multi-View Performance Capture | Zeng Huang*, University of Southern California; Tianye Li, University of Southern California; Weikai Chen, USC Institute for Creative Technology; Yajie Zhao, USC Institute for Creative Technology; Jun Xing, Institute for Creative Technologies, USC; Chloe LeGendre, USC Institute for Creative Technology; Linjie Luo, Snap Inc; Chongyang Ma, Snap Inc.; Hao Li, Pinscreen/University of Southern California/USC ICT |
P-4B-88 | Person Re-identification with Deep Similarity-Guided Graph Neural Network | Yantao Shen*, The Chinese University of Hong Kong; Hongsheng Li, Chinese University of Hong Kong; Shuai Yi, The Chinese University of Hong Kong; Xiaogang Wang, Chinese University of Hong Kong, Hong Kong |
P-4B-89 | Deep Component Analysis via Alternating Direction Neural Networks | Calvin Murdock*, Carnegie Mellon University; MingFang Chang, Carnegie Mellon University; Simon Lucey, CMU |
P-4B-90 | Understanding Perceptual and Conceptual Fluency at a Large Scale | Meredith Hu*, Cornell University; Ali Borji, University of Central Florida |
P-4B-91 | Look Deeper into Depth: Monocular Depth Estimation with Semantic Booster and Attention-Driven Loss | Jianbo Jiao*, City University of Hong Kong; Ying Cao, City University of Hong Kong; Yibing Song, Tencent AI Lab; Rynson Lau, City University of Hong Kong |