
Large scale video tagging with knowledge bases

Total activity: 4
Type of activity
Competitive project
Funding entity
AGAUR. Agència de Gestió d'Ajuts Universitaris i de Recerca
Funding entity code
2017 DI 011
33.960,00 €
Start date
End date
Digital Society
This research project aims to introduce knowledge bases into the automatic understanding of video content. The goal is to generate keywords and tags for videos faster by applying deep learning techniques in a web-scale, cloud-based video indexing platform.

In recent years, video sharing on social media from ubiquitous video recording devices has resulted in an exponential growth of videos available online. This video data keeps increasing, with daily recordings covering a wide range of topics. Finding relevant videos in such a large and distributed collection has become a challenge in itself. Current video search engines mostly rely on text-based indexing cores built around keyword tags related to the video content. Generating these tags manually is costly and does not scale, so computer vision for video understanding is required to generate them automatically.

Computer vision has recently benefited from advances in machine learning, which have exploited modern hardware architectures (mainly GPUs) to process large amounts of visual data crawled online. Deep learning techniques have been successfully explored and popularized for still images, but the video domain is still relatively unexplored for this type of method.

The proposed research aims to analyze a large and growing collection of online videos on current cloud computing architectures, combining data-driven analysis based on deep learning with existing knowledge bases. Instead of adopting a brute-force approach that extracts knowledge from the data alone, we propose to exploit existing knowledge graphs such as Freebase or WordNet, which store large amounts of information about the world and about the semantic entities represented in the video content. This prior knowledge about the video data can help solve more complex tasks, such as concept disambiguation and linkage, faster and more accurately.
Adm. Generalitat
V Pla de Recerca i Innovació de Catalunya (PRI). 2010-2013
Call year
Funding call
Doctorats Industrials
Grant institution
Agència de Gestió d'Ajuts Universitaris i de Recerca (AGAUR)

