Taking group selfies over video conferencing has become a popular way to build close interpersonal relationships with people who are physically remote. However, existing systems are poorly suited to building social bonds because of limitations such as rigid grid-based layouts, information overload, and the difficulty of making eye contact. Drawing on design opportunities and user needs identified in a participatory ideation session, we developed Virfie, a novel web-based video conferencing platform that (1) allows dynamic compositions of users, background, and overlaid graphics; (2) enables embodied social interaction via collective challenges; and (3) fosters social narratives with multi-scene scenarios. Based on feedback from a user study, we confirm that Virfie strengthens social bonds and the feeling of remote togetherness.