DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Jo, Sungho | - |
dc.contributor.advisor | 조성호 | - |
dc.contributor.author | Bae, Byeong-Uk | - |
dc.date.accessioned | 2018-06-20T06:23:47Z | - |
dc.date.available | 2018-06-20T06:23:47Z | - |
dc.date.issued | 2017 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=675478&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/243415 | - |
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : School of Computing, 2017.2, [iii, 25 p.] | - |
dc.description.abstract | In recent years, research on first-person video has become increasingly important in computer vision, driven by the development of wearable cameras and growing interest in life logging. However, first-person video is difficult to analyze because the user's hands appear in many different ways and camera motion is mixed in. Convolutional neural network (CNN) based learning methods are widely used for vision tasks such as classification and recognition because they capture the latent features of an image well. For vision tasks involving video data, however, CNN-based models have difficulty learning long-term dependencies across sequence data. To overcome this limitation, we propose a deep network structure that combines a CNN with a long short-term memory (LSTM) network for action recognition in first-person video. Our model rests on two main ideas. First, object information and motion information are learned separately by a convolutional network split into two streams. Second, the LSTM model learns temporal dependencies under multi-task learning from the latent features obtained from each stream. We evaluated the model on the GTEA dataset and compared its performance with that of other studies. | - |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | First-person video | - |
dc.subject | Action recognition | - |
dc.subject | CNN | - |
dc.subject | LSTM | - |
dc.subject | Multi-task learning | - |
dc.subject | First-person video (일인칭 영상) | - |
dc.subject | Action recognition (행동 인지) | - |
dc.subject | Multi-task learning (멀티 태스크 학습) | - |
dc.title | Convolutional recurrent neural networks for first-person action recognition | - |
dc.title.alternative | A study on convolutional recurrent neural networks for first-person action recognition | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : School of Computing | - |
dc.contributor.alternativeauthor | 배병욱 | - |