Caption Aided Action Recognition Using Single Images
Kafka, Adam
Graduate Student
Computer Science
Chuah, Mooi Choo
text
theses
2017-05-01
2017
Lehigh University
eng
electronic documents
application/pdf
In this paper, we attack the problem of classifying human actions from a single, static image. We propose that leveraging an automatic caption generator for this task will provide extra information when compared to a traditional convolutional neural network based classifier. The architecture consists of two stages, caption generation and caption classification, used sequentially to produce a proposed human action class label from a single image. We evaluate our system and find that caption generation is the limiting factor in accuracy. We propose fixes to both the dataset and the caption generator in order to improve the model. Finally, we find that caption classification is significantly improved by concatenating all captions from a single image together to produce one input vector.
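The concatenation step mentioned at the end of the abstract can be illustrated with a minimal sketch. The abstract does not say how captions are encoded, so everything here is an assumption made for illustration: the function names `concatenate_captions` and `bag_of_words`, the example captions, the tiny vocabulary, and the bag-of-words encoding itself, which stands in for whatever representation the thesis actually feeds to its caption classifier.

```python
from collections import Counter


def concatenate_captions(captions):
    """Join every caption generated for one image into a single string.

    Hypothetical helper: the thesis only states that captions are
    concatenated, not how.
    """
    return " ".join(c.strip() for c in captions if c.strip())


def bag_of_words(text, vocabulary):
    """Encode text as a fixed-length vector of word counts.

    Assumed stand-in encoding, used only to show that several captions
    end up as one input vector.
    """
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


# Illustrative captions for one image (not taken from the thesis).
captions = ["a man riding a horse", "a person on a horse in a field"]
vocabulary = ["horse", "riding", "bike", "field"]

merged = concatenate_captions(captions)
vector = bag_of_words(merged, vocabulary)
```

Under these assumptions the classifier sees one vector per image rather than one per caption, so evidence that appears in any of the captions (here, the repeated mention of a horse) contributes to a single prediction.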
Computers Mathematics
action caption classification CNN LSTM
https://asa.lib.lehigh.edu/Record/10875230