Text messaging on smartphones has become one of the most popular communication methods. With many smartphone chat applications, text messaging is no longer limited to "text": users send emoticons to express emotions or share pictures stored on their phones. We believe that enriching chat applications with visuals by autonomously suggesting relevant images from the Internet (i.e., "auto-complete" with images) based on the chat content is the next evolution of mobile messaging. Realizing this simple vision, however, is difficult due to the intrinsic nature of mobile chat and the resource limitations of smartphones. We identify these challenges and, to overcome them, integrate solutions from the fields of mobile computing, natural language processing, sentiment analysis, machine learning, storage, human-computer interaction, networking, and systems. We present MilliCat, a lightweight mobile messaging service that autonomously suggests images based on chat context to improve the emotion expression, nuance delivery, and information delivery of a conversation. Experimental results from our preliminary prototype implementation show promise that real-time autonomous image suggestion can provide timely, appropriate images while incurring only manageable networking and energy overhead.