Improving perceptual and planning capabilities of robots : Algorithms for human action classification and future prediction로봇의 계획 수립 및 지각 능력 향상 : 인간 행동 구별과 미래 예측 알고리즘

Cited 0 time in webofscience Cited 0 time in scopus
  • Hit : 299
  • Download : 0
namely action classification, and future prediction. In the first part, a deep Convolutional Neural Network which uses the ResNext Architecture as the backbone and employs 3D spatiotemporal convolutional kernels is proposed for human activity classification. The network also employs a spatio-temporal attention mechanism which softly weights the learned features at each stage of the network to the salient features. To further enhance the representational power and accuracy of the network, the ResNeXt modules are replaced by Squeeze and Excitation ResNeXt modules which implement channel wise feature recalibration. To prevent overfitting due to the large number of parameters in 3D kernels, ImageNet pretraining is used to initialize the weights of the network. A computationally efficient variant, which uses separable spatial and temporal convolutions instead of using 3D convolutions, is also proposed. Transferability of the features learnt by the network to similar action datasets is also demonstrated by fine tuning the network on the UCF101 dataset. The second part of this proposes an end-to-end unsupervised predictive network for future generation, that, instead of generating the raw pixel values for the future frame directly, outputs a set of transformation which are then applied to the input frames to generate frames. The architecture is adversarially trained using standard generative adversarial networks(GAN) as well as Wasserstein GAN and is evaluated on the moving mnist, UCF101 and robot pushing dataset.; We as humans can rely on our extensive knowledge and experience accumulated over our lifetime to perceive what is happening around us; recognize actions and gestures, interpret intentions, anticipate the future and act accordingly. How do we give prediction models in robots access to such common sense knowledge. One solution is to employ deep learning algorithms, which have made it possible for robots to learn computational models for various tasks directly from raw data without. In this study, this non-trivial task is focused on two main areas
Advisors
Kim Jong-Hwanresearcher김종환researcher
Description
한국과학기술원 :전기및전자공학부,
Publisher
한국과학기술원
Issue Date
2018
Identifier
325007
Language
eng
Description

학위논문(석사) - 한국과학기술원 : 전기및전자공학부, 2018.8,[viii, 81 p. :]

Keywords

Action classification▼aattention▼a3D convolutional neural networks▼afuture generation▼agenerative adversarial networks(GAN)▼aresNeXt; 행동 분류▼a주의▼a3D 컨볼 루션 뉴럴 네트워크▼a미래 세대▼agenerative Adversarial Networks(GAN)▼aresNeXt

URI
http://hdl.handle.net/10203/266741
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=828588&flag=dissertation
Appears in Collection
EE-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.

qr_code

  • mendeley

    citeulike


rss_1.0 rss_2.0 atom_1.0