Generating a texture map for a 3D human mesh from a single image is challenging: the invisible part of the texture must be synthesized consistently with the visible part, and the texture must be semantically aligned with the UV space of the template mesh. To overcome these challenges, we propose a novel method that combines SamplerNet and RefinerNet. SamplerNet predicts a sampling grid that transfers the visible texture information from the input image into UV space, and RefinerNet refines the sampled texture while preserving its spatial alignment.
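The core operation SamplerNet builds on is differentiable grid sampling: a predicted grid of (x, y) source coordinates, one per UV-space texel, is used to pull visible pixels from the input image. The sketch below is a minimal NumPy version of bilinear grid sampling for illustration only; the networks' architectures and the exact grid parameterization are not specified here.

```python
import numpy as np

def grid_sample(image, grid):
    """Bilinearly sample `image` (H, W, C) at locations given by `grid`
    (Hu, Wu, 2), whose entries are (x, y) pixel coordinates in the source
    image. Returns the resampled texture of shape (Hu, Wu, C)."""
    h, w = image.shape[:2]
    # Clamp coordinates to the valid image area.
    x = np.clip(grid[..., 0], 0, w - 1)
    y = np.clip(grid[..., 1], 0, h - 1)
    # Integer corners surrounding each sampling location.
    x0 = np.floor(x).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y0 = np.floor(y).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    # Fractional offsets act as interpolation weights.
    wx = (x - x0)[..., None]
    wy = (y - y0)[..., None]
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bot = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bot * wy
```

With an identity grid (each UV texel pointing at its own pixel coordinate), the function reproduces the input image; a learned grid instead warps visible image regions onto their corresponding UV locations.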