Cross-Modal Information Modeling in the Deep Learning Era

Course Fee

5800.00 per person

Course Duration

3 hours

Course Overview

Feature representation across modalities is the main focus of current cross-modal information retrieval research. Existing models typically project text and images into the same embedding space. In this talk, we introduce the basic ideas of text and image modeling and show how to build cross-modal relations with deep learning models. In detail, we discuss a joint model that uses metric learning to minimize the distance between representations of the same content from different modalities. We also introduce recent research developments in image captioning and visual question answering (VQA).
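As a rough illustration of the metric learning idea in the overview, the sketch below computes a bidirectional hinge ranking loss over a small batch of matched image and text embeddings. It is not the speaker's model: the function names, the margin value, and the choice of cosine similarity are all assumptions made for this example, and the embeddings are assumed to be non-zero vectors already produced by some image and text encoders.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    u, v = l2_normalize(u), l2_normalize(v)
    return sum(a * b for a, b in zip(u, v))


def ranking_loss(image_embs, text_embs, margin=0.2):
    """Bidirectional hinge ranking loss over a batch of matched pairs.

    image_embs[i] and text_embs[i] are assumed to describe the same content;
    every other pairing in the batch is treated as a negative example.
    The margin of 0.2 is an illustrative choice, not a recommended value.
    """
    n = len(image_embs)
    total = 0.0
    for i in range(n):
        pos = cosine(image_embs[i], text_embs[i])
        for j in range(n):
            if j == i:
                continue
            # Image i should be more similar to its own text than to text j.
            total += max(0.0, margin - pos + cosine(image_embs[i], text_embs[j]))
            # Text i should be more similar to its own image than to image j.
            total += max(0.0, margin - pos + cosine(image_embs[j], text_embs[i]))
    return total / n
```

The loss is zero when every matched pair is more similar than every mismatched pair by at least the margin, and it grows as matched content drifts apart in the shared space. In practice the loss would be minimized with respect to the encoder parameters by a deep learning framework; this pure-Python version only shows what is being computed.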

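【Workshop Outline】
1. The semantic gap
2. Image modeling and CNNs
3. Text models and word embeddings
4. Joint models
5. Automatic annotation
6. Text generation
7. Visual question answering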

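Learning Outcomes

Gain exposure to frontier research in deep learning, and learn how to use deep learning to jointly model image and text information and how to implement cross-modal semantic search and image question-answering systems.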
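Once text and images share an embedding space, cross-modal semantic search reduces to a nearest-neighbor lookup. The sketch below ranks images against a text query by cosine similarity. The file names, the toy three-dimensional vectors, and the `search` helper are hypothetical; real embeddings would come from trained encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def search(query_emb, image_index, top_k=3):
    """Return the top_k (image_id, score) pairs, most similar first."""
    scored = [
        (image_id, cosine(query_emb, emb))
        for image_id, emb in image_index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings standing in for the output of trained encoders.
image_index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.1]  # hypothetical embedding of a text query

results = search(query, image_index, top_k=2)
for image_id, score in results:
    print(image_id, round(score, 3))
```

A linear scan like this is adequate for a small collection; larger collections would typically rely on an approximate nearest-neighbor index instead.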

Target Audience
