Semantic representations of words can be acquired from both textual and perceptual information. Multimodal models integrate the two sources and outperform text-only semantic representations of words; in many contexts, they also better reflect human concept acquisition. A common model for semantic representation is the semantic vector, a list of decimal numbers representing the clusters in which a word appears in text. Studies have shown that if two words have similar vectors, they are likely to have similar meanings, or at least to be related to each other. Other approaches insert sentences, made up of caption words from an image set, into the text, which modifies the vector of each word in the textual corpus's vocabulary and thus yields different semantic representations. These techniques have also suggested that, whereas the representations of concrete terms tend to improve with propagation, those of abstract terms tend to become less accurate when too much information from their more concrete counterparts is propagated to them. In this study, I therefore use different techniques for comparing words' meanings to implement an image retrieval system. Even if a word w does not directly tag an image, the system retrieves images whose captions contain the words whose vector representations are most similar to that of w. I then examine the extent to which a word's semantic representation has improved, based on improvements in the corresponding retrieval results from this system.
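The retrieval idea described above can be illustrated with a minimal sketch. It assumes cosine similarity as the vector comparison and scores each image by its best-matching caption word; the abstract does not specify either choice, and the function names, toy vectors, and image identifiers below are hypothetical, not taken from the study.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(query, vectors, captions, k=2):
    """Rank images by the highest similarity between the query word's
    vector and the vector of any word in the image's caption."""
    if query not in vectors:
        return []
    q = vectors[query]
    scored = []
    for image_id, words in captions.items():
        sims = [cosine(q, vectors[w]) for w in words if w in vectors]
        if sims:
            scored.append((max(sims), image_id))
    # Highest similarity first; ties broken by image id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [image_id for _, image_id in scored[:k]]


# Hypothetical toy data: three-dimensional semantic vectors and captions.
vectors = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
    "truck": [0.1, 0.0, 0.8],
}
captions = {
    "img_puppy": ["puppy"],
    "img_truck": ["truck"],
    "img_car": ["car"],
}

# "dog" tags no image directly, yet the puppy image is ranked first.
print(retrieve("dog", vectors, captions, k=1))
```

In this toy setting, a better vector for the query word would be expected to push relevant images higher in the ranking, which is the sense in which retrieval quality can serve as a proxy for the quality of a word's representation.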

Different Modes of Semantic Representation in Image Retrieval