Neural ordinary differential equations (NODEs) reinterpret a deep residual network as a structure that is continuous in layer depth. However, NODEs have a known representational limit: the learned flow cannot realize all homeomorphisms of the input data space, so performance quickly saturates even as the effective number of layers increases. Here, we show that simply stacking neural ODE blocks alleviates this issue and readily improves performance. Furthermore, we propose a more effective way of training neural ODEs that uses a time-evolving mixture weight over multiple ODE functions, where the mixture weight itself evolves under a separate neural ODE. We provide empirical results suggesting improved performance over both stacked and vanilla neural ODEs, and we confirm that our approach can be combined orthogonally with recent advances in neural ODEs.
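As a rough illustration of the two ideas above, the following is a minimal pure-Python sketch (not the paper's implementation): stacking two ODE blocks by composing their flows, and mixing multiple ODE functions with a weight that itself evolves in time. The dynamics `f1`, `f2`, the Euler integrator `euler_odeint`, and the toy weight ODE are all hypothetical stand-ins for learned networks and an adaptive solver.

```python
import math

def euler_odeint(f, y0, t0=0.0, t1=1.0, steps=100):
    """Integrate dy/dt = f(y, t) from t0 to t1 with the explicit Euler method."""
    y, t = y0, t0
    h = (t1 - t0) / steps
    for _ in range(steps):
        y = y + h * f(y, t)
        t = t + h
    return y

# Toy dynamics standing in for two learned ODE functions.
f1 = lambda y, t: -y
f2 = lambda y, t: 0.5 * y

# (1) Stacked neural ODE blocks: the output flow of block 1 is the input to
# block 2; the composition can realize maps a single flow cannot.
y_stacked = euler_odeint(f2, euler_odeint(f1, 1.0))

def softmax2(a0, a1):
    """Softmax over two logits, used to form the mixture weights."""
    m = max(a0, a1)
    e0, e1 = math.exp(a0 - m), math.exp(a1 - m)
    return e0 / (e0 + e1), e1 / (e0 + e1)

def mixture_flow(y0, a0=(0.0, 0.0), t0=0.0, t1=1.0, steps=100):
    """(2) dy/dt is a weighted sum of f1 and f2; the weight logits a(t)
    evolve under their own (here: toy, fixed) ODE alongside the data state."""
    y, a = y0, list(a0)
    t, h = t0, (t1 - t0) / steps
    for _ in range(steps):
        w0, w1 = softmax2(a[0], a[1])
        dy = w0 * f1(y, t) + w1 * f2(y, t)
        da0, da1 = -1.0, 1.0  # toy weight dynamics; learned in the real model
        y += h * dy
        a[0] += h * da0
        a[1] += h * da1
        t += h
    return y
```

For instance, `euler_odeint(f1, 1.0)` approximates `exp(-1)`, and `mixture_flow(1.0)` starts with equal weights and drifts toward the `f2` dynamics, landing between the two pure flows.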