ICCV: Towers of Babel Combining Images, Language, and 3D Geometry for Learni ...

2025-08-11

Towers of Babel: Combining Images, Language, and 3D Geometry for Learning Multimodal Vision
Xiaoshi Wu¹    Hadar Averbuch-Elor²    Jin Sun²    Noah Snavely²
¹Tsinghua University    ²Cornell Tech, Cornell University

Figure 1: Our WikiScenes dataset combines 3D reconstructions, images, and language descriptions for dozens of landmarks, like the
Barcelona and Reims Cathedrals pictured above. WikiScenes enable ...

Attachments

ICCV：Towers of Babel Combining Images, Language, and 3D Geometry for Learning .pdf

Size: 8.25 MB
