2025-04-18
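Renmin University graduate thesis materials, extended reading: CVPR 2023 multimodal learning, audio-visual-language learning, and vision-language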

+Multimodal Learning            181.0 MB
| Align and Attend:Multimodal Summarization with Dual Contrastive Losses.pdf             7.6 MB
| BiCro:Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency.pdf             10.6 MB
| CLIP2Scene:Towards Label-Efficient 3D Scene Understanding by CLIP.pdf             9.1 MB
| Decoupled Multimodal Distilling for Emotion Recognition.pdf             7.4 MB
| Detecting and Grounding Multi-Modal Media Manipulation.pdf             11.9 MB
| Emotional Reaction Intensity Estimation Based on Multimodal Data.pdf             6.7 MB
| Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-commerce.pdf             22.9 MB
| MaPLe:Multi-modal Prompt Learning.pdf             14.5 MB
| MM-Diffusion:Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation.pdf             12.5 MB
| Multimodal Feature Extraction and Fusion for Emotional Reaction Intensity Estimation and Expression Classification in Videos with Transformers.pdf             6.8 MB
| Multimodal Prompting with Missing Modalities for Visual Recognition.pdf             14.8 MB
| Mutilmodal Feature Extraction and Attention-based Fusion for Emotion Estimation in Videos.pdf             6.7 MB
| Quantum Multi-Model Fitting.pdf             10.8 MB
| Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information.pdf             7.5 MB
| Towards Flexible Multi-modal Document Models.pdf             8.9 MB
| Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning.pdf             7.5 MB
| Uni-Perceiver v2:A Generalist Model for Large-Scale Vision and Vision-Language Tasks.pdf             7.6 MB
| Vita-CLIP:Video and text adaptive CLIP via Multimodal Prompting.pdf             7.3 MB
+Vision-Language            367.0 MB
| Accelerating Vision-Language Pretraining with Free Language Modeling.pdf             7.3 MB
| Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models.pdf             10.7 MB
| Blind Image Quality Assessment via Vision-Language Correspondence:A Multitask Learning Perspective.pdf             7.7 MB
| Connecting Vision and Language with Video Localized Narratives.pdf             22.0 MB
| CrowdCLIP:Unsupervised Crowd Counting via Vision-Language Model.pdf             12.1 MB
| FAME-ViL:Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks.pdf             16.8 MB
| GIVL:Improving Geographical Inclusivity of Vision-Language Models with Pre-Training Methods.pdf             18.5 MB
| HOICLIP:Efficient Knowledge Transfer for HOI Detection with Vision-Language Models.pdf             8.3 MB
| IFSeg:Image-free Semantic Segmentation via Vision-Language Model.pdf             11.0 MB
| Improving Vision-and-Language Navigation by Generating Future-View Image Semantics.pdf             8.0 MB
| Is BERT Blind?Exploring the Effect of Vision-and-Language Pretraining on Visual Language Understanding.pdf             14.9 MB
| KERM:Knowledge Enhanced Reasoning for Vision-and-Language Navigation.pdf             8.0 MB
| Lana:A Language-Capable Navigator for Instruction Following and Generation.pdf             11.2 MB
| Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing.pdf             11.4 MB
| Learning to Name Classes for Vision and Language Models.pdf             15.1 MB
| MAGVLT:Masked Generative Vision-and-Language Transformer.pdf             24.1 MB
| MAP:Multimodal Uncertainty-Aware Vision-Language Pre-training Model.pdf             9.5 MB
| Meta-Explore:Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding.pdf             9.0 MB
| Open-vocabulary Attribute Detection.pdf             35.1 MB
| Policy Adaptation from Foundation Model Feedback.pdf             10.3 MB
| PosterLayout:A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout.pdf             12.1 MB
| Seeing What You Miss:Vision-Language Pre-training with Semantic Completion Learning.pdf             8.7 MB
| SynthVSR:Scaling Up Visual Speech Recognition With Synthetic Supervision.pdf             7.5 MB
| Task Residual for Tuning Vision-Language Models.pdf             7.3 MB
| Test of Time:Instilling Video-Language Models with a Sense of Time.pdf             10.9 MB
| Towards Generalisable Video Moment Retrieval:Visual-Dynamic Injection to Image-Text Pre-Training.pdf             7.3 MB
| Turning a CLIP Model into a Scene Text Detector.pdf             8.8 MB
| Video-Text as Game Players:Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning.pdf             9.7 MB
| VILA:Learning Image Aesthetics from User Comments with Vision-Language Pretraining.pdf             15.4 MB
| VLPD:Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision.pdf             8.3 MB
+Audio-Visual-Language Learning            159.0 MB
| A Light Weight Model for Active Speaker Detection.pdf             10.0 MB
| Audio-Visual Grouping Network for Sound Localization from Mixtures.pdf             8.1 MB
| CASP-Net:Rethinking Video Saliency Prediction from an Audio-Visual Consistency Perceptual Perspective.pdf             7.9 MB
| Dense-Localizing Audio-Visual Events in Untrimmed Videos:A Large-Scale Benchmark and Baseline.pdf             15.5 MB
| Egocentric Audio-Visual Object Localization.pdf             35.8 MB
| Fine-grained Audible Video Description.pdf             13.4 MB
| Language-Guided Audio-Visual Source Separation via Trimodal Consistency.pdf             8.8 MB
| Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning.pdf             10.7 MB
| Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos.pdf             23.3 MB
| Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment.pdf             18.1 MB
| Watch or Listen:Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring.pdf             7.3 MB
CVPR'23 Multimodal Learning Papers and Code Index.pdf            290.0 KB



CVPR2023多模态.part1.rar
Size: (100 MB)

Price: RMB 29  Download now


CVPR2023多模态.part5.rar
Size: (100 MB)

Price: RMB 1  Download now



All replies
2025-4-20 15:27:08
Thanks to the original poster. I needed exactly this kind of material to fill the gaps in my research.