Decomposing motion and content for natural video sequence prediction

DC Field: Value (Language)
dc.contributor.author: Villegas, Ruben (ko)
dc.contributor.author: Yang, Jimei (ko)
dc.contributor.author: Hong, Seunghoon (ko)
dc.contributor.author: Lin, Xunyu (ko)
dc.contributor.author: Lee, Honglak (ko)
dc.date.accessioned: 2020-10-23T01:57:05Z
dc.date.available: 2020-10-23T01:57:05Z
dc.date.created: 2020-10-06
dc.date.issued: 2017-04-24
dc.identifier.citation: 5th International Conference on Learning Representations, ICLR 2017
dc.identifier.uri: http://hdl.handle.net/10203/276943
dc.description.abstract: We propose a deep neural network for the prediction of future frames in natural video sequences. To effectively handle the complex evolution of pixels in videos, we propose to decompose motion and content, the two key components generating dynamics in videos. Our model is built upon the Encoder-Decoder Convolutional Neural Network and Convolutional LSTM for pixel-level prediction, which independently capture the spatial layout of an image and the corresponding temporal dynamics. By independently modeling motion and content, predicting the next frame reduces to transforming the extracted content features into the next frame's content using the identified motion features, which simplifies the prediction task. Our model is end-to-end trainable over multiple time steps, and naturally learns to decompose motion and content without separate training. We evaluate the proposed network architecture on human activity videos using the KTH, Weizmann action, and UCF-101 datasets. We show state-of-the-art performance in comparison to recent approaches. To the best of our knowledge, this is the first end-to-end trainable network architecture with motion and content separation to model the spatio-temporal dynamics for pixel-level future prediction in natural videos.
dc.language: English
dc.publisher: International Conference on Learning Representations, ICLR
dc.title: Decomposing motion and content for natural video sequence prediction
dc.type: Conference
dc.identifier.scopusid: 2-s2.0-85064824515
dc.type.rims: CONF
dc.citation.publicationname: 5th International Conference on Learning Representations, ICLR 2017
dc.identifier.conferencecountry: FR
dc.identifier.conferencelocation: Toulon
dc.contributor.localauthor: Hong, Seunghoon
dc.contributor.nonIdAuthor: Villegas, Ruben
dc.contributor.nonIdAuthor: Yang, Jimei
dc.contributor.nonIdAuthor: Lin, Xunyu
dc.contributor.nonIdAuthor: Lee, Honglak
Appears in Collection
CS-Conference Papers (Conference Papers)
Files in This Item
There are no files associated with this item.
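The abstract describes predicting the next frame by encoding content (the spatial layout of the most recent frame) and motion (temporal dynamics from frame differences) separately, then combining them in a decoder. The data flow can be sketched as below; note this is a minimal toy illustration, not the authors' implementation: the shapes, the averaging of difference features, and the random linear maps standing in for the paper's Encoder-Decoder CNN, ConvLSTM, and learned decoder are all assumptions for demonstration.

```python
# Toy sketch of motion-content decomposition for next-frame prediction.
# Random linear maps stand in for the paper's learned convolutional
# encoders, ConvLSTM, and decoder (illustrative assumption).
import numpy as np

rng = np.random.default_rng(0)
H = W = 8          # toy frame size (assumption)
T = 4              # number of input frames (assumption)
D = 16             # feature dimension (assumption)

W_content = rng.standard_normal((D, H * W)) * 0.1      # "content encoder"
W_motion  = rng.standard_normal((D, H * W)) * 0.1      # "motion encoder"
W_decode  = rng.standard_normal((H * W, 2 * D)) * 0.1  # "decoder"

def predict_next(frames):
    """frames: (T, H, W) grayscale clip -> predicted (H, W) next frame."""
    # Content stream: spatial layout from the most recent frame only.
    content = W_content @ frames[-1].ravel()
    # Motion stream: temporal dynamics from frame differences
    # (the paper uses a ConvLSTM over differences; here we just average).
    diffs = np.diff(frames, axis=0)                    # (T-1, H, W)
    motion = (W_motion @ diffs.reshape(T - 1, -1).T).mean(axis=1)
    # Combine the two feature streams and decode the next frame,
    # mirroring "content features transformed by motion features".
    combined = np.concatenate([content, motion])       # (2*D,)
    return (W_decode @ combined).reshape(H, W)

clip = rng.standard_normal((T, H, W))
pred = predict_next(clip)
print(pred.shape)  # (8, 8)
```

The point of the separation is visible in the function body: the content path never sees temporal information and the motion path never sees absolute pixel values, so each stream has a simpler job than modeling raw spatio-temporal dynamics jointly.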
