Computer Vision – ECCV 2022

Overview of attention for book

Table of Contents

Book Overview
Chapter 1 Efficient One-Stage Video Object Detection by Exploiting Temporal Consistency
Chapter 2 Leveraging Action Affinity and Continuity for Semi-supervised Temporal Action Segmentation
Chapter 3 Spotting Temporally Precise, Fine-Grained Events in Video
Chapter 4 Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation
Chapter 5 Efficient Video Transformers with Spatial-Temporal Token Selection
Chapter 6 Long Movie Clip Classification with State-Space Video Models
Chapter 7 Prompting Visual-Language Models for Efficient Video Understanding
Chapter 8 Asymmetric Relation Consistency Reasoning for Video Relation Grounding
Chapter 9 Self-supervised Social Relation Representation for Human Group Detection
Chapter 10 K-centered Patch Sampling for Efficient Video Recognition
Chapter 11 A Deep Moving-Camera Background Model
Chapter 12 GraphVid: It Only Takes a Few Nodes to Understand a Video
Chapter 13 Delta Distillation for Efficient Video Processing
Chapter 14 MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning
Chapter 15 COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality
Chapter 16 E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context
Chapter 17 TDViT: Temporal Dilated Video Transformer for Dense Video Tasks
Chapter 18 Semi-supervised Learning of Optical Flow by Flow Supervisor
Chapter 19 Flow Graph to Video Grounding for Weakly-Supervised Multi-step Localization
Chapter 20 Deep 360° Optical Flow Estimation Based on Multi-projection Fusion
Chapter 21 MaCLR: Motion-Aware Contrastive Learning of Representations for Videos
Chapter 23 Frozen CLIP Models are Efficient Video Learners
Chapter 24 PIP: Physical Interaction Prediction via Mental Simulation with Span Selection
Chapter 25 Panoramic Vision Transformer for Saliency Detection in 360° Videos
Chapter 26 Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration
Chapter 27 Motion Sensitive Contrastive Learning for Self-supervised Video Representation
Chapter 28 Dynamic Temporal Filtering in Video Models
Chapter 29 Tip-Adapter: Training-Free Adaption of CLIP for Few-Shot Classification
Chapter 30 Temporal Lift Pooling for Continuous Sign Language Recognition
Chapter 31 MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes
Chapter 32 SiRi: A Simple Selective Retraining Mechanism for Transformer-Based Visual Grounding
Chapter 34 TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts
Chapter 35 SeqTR: A Simple Yet Universal Network for Visual Grounding
Chapter 37 FashionViL: Fashion-Focused Vision-and-Language Representation Learning
Chapter 38 Weakly Supervised Grounding for VQA in Vision-Language Transformers
Chapter 39 Automatic Dense Annotation of Large-Vocabulary Sign Language Videos
Chapter 40 MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval
Chapter 41 GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval
Chapter 42 A Simple and Robust Correlation Filtering Method for Text-Based Person Search

Attention for Chapter 5: Efficient Video Transformers with Spatial-Temporal Token Selection

Citations: 2 (Dimensions)
Readers: 58 (Mendeley)

Chapter title	Efficient Video Transformers with Spatial-Temporal Token Selection
Chapter number	5
Book title	Computer Vision – ECCV 2022
Published by	Springer, Cham, January 2022
DOI	10.1007/978-3-031-19833-5_5
Book ISBNs	978-3-03-119832-8, 978-3-03-119833-5
Authors	Wang, Junke; Yang, Xitong; Li, Hengduo; Liu, Li; Wu, Zuxuan; Jiang, Yu-Gang

Mendeley readers

The data shown below were compiled from readership statistics for 58 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
Unknown	58	100%

Demographic breakdown

Readers by professional status	Count	As %
Student > Ph. D. Student	10	17%
Student > Master	6	10%
Student > Bachelor	4	7%
Student > Doctoral Student	3	5%
Researcher	3	5%
Other	2	3%
Unknown	30	52%

Readers by discipline	Count	As %
Computer Science	20	34%
Engineering	3	5%
Mathematics	1	2%
Agricultural and Biological Sciences	1	2%
Biochemistry, Genetics and Molecular Biology	1	2%
Other	0	0%
Unknown	32	55%