In this thesis we propose a new vision-based smartphone interface system that translates the user's eye and mouth gestures into commands for other software applications.
Our interface system applies three main stages of vision-based processing to frames obtained from the smartphone's front-facing camera in real time.
First, we use the LBP (Local Binary Pattern) + cascade-of-boosted-classifiers algorithm for face detection.
From the detected face region, we estimate approximate eye and mouth regions.
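A minimal sketch of these first two steps, assuming OpenCV's stock LBP frontal-face cascade and illustrative (not the thesis's) proportions for the eye and mouth regions:

```python
# Sketch only: LBP-cascade face detection with OpenCV, followed by rough
# eye/mouth regions estimated from the face box. The cascade path and the
# ROI proportions are assumptions for illustration.
import cv2

# lbpcascade_frontalface.xml is distributed with OpenCV (data/lbpcascades/).
face_cascade = cv2.CascadeClassifier("lbpcascade_frontalface.xml")

def detect_face_and_rois(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # keep the largest face
    # Approximate regions relative to the face box (illustrative proportions).
    left_eye  = (x + int(0.15 * w), y + int(0.25 * h), int(0.30 * w), int(0.20 * h))
    right_eye = (x + int(0.55 * w), y + int(0.25 * h), int(0.30 * w), int(0.20 * h))
    mouth     = (x + int(0.25 * w), y + int(0.65 * h), int(0.50 * w), int(0.25 * h))
    return (x, y, w, h), left_eye, right_eye, mouth
```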
Next, our system determines whether the user's pupils are visible using image filtering techniques.
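One possible realization of this test, shown as a sketch rather than the thesis's exact filtering chain, is to blur the eye region, threshold for dark pixels, and check for a compact pupil-sized blob; the threshold and area bounds below are assumptions:

```python
# Illustrative pupil-presence check: Gaussian blur, inverse threshold for dark
# pixels, morphological opening, then search for a pupil-sized dark blob.
import cv2
import numpy as np

def pupil_present(eye_roi_gray, dark_thresh=40,
                  min_area_ratio=0.01, max_area_ratio=0.20):
    blurred = cv2.GaussianBlur(eye_roi_gray, (5, 5), 0)
    _, dark = cv2.threshold(blurred, dark_thresh, 255, cv2.THRESH_BINARY_INV)
    dark = cv2.morphologyEx(dark, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
    contours, _ = cv2.findContours(dark, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    roi_area = eye_roi_gray.shape[0] * eye_roi_gray.shape[1]
    for c in contours:
        area = cv2.contourArea(c)
        if min_area_ratio * roi_area < area < max_area_ratio * roi_area:
            return True   # a pupil-sized dark blob was found (eye likely open)
    return False
```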
Finally, the mouth gesture is classified using GIST descriptors, which summarize an input image as a low-dimensional fingerprint.
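The sketch below computes a simplified GIST-style descriptor (not a full GIST implementation): the mouth image is filtered with a small Gabor bank at several scales and orientations, and the response energy is averaged over a grid of blocks; the bank parameters and grid size are assumptions.

```python
# Simplified GIST-style descriptor: Gabor filter bank + block-averaged energy.
import cv2
import numpy as np

def gist_like_descriptor(img_gray, grid=4, scales=(4, 8), orientations=8):
    img = cv2.resize(img_gray, (64, 64)).astype(np.float32)
    feats = []
    for lam in scales:
        for k in range(orientations):
            theta = np.pi * k / orientations
            kern = cv2.getGaborKernel((15, 15), sigma=lam / 2.0, theta=theta,
                                      lambd=lam, gamma=0.5, psi=0)
            resp = np.abs(cv2.filter2D(img, cv2.CV_32F, kern))
            # Average filter energy inside each grid cell.
            for rows in np.array_split(resp, grid, axis=0):
                for cell in np.array_split(rows, grid, axis=1):
                    feats.append(cell.mean())
    # Length = grid^2 * len(scales) * orientations (256 with the defaults).
    return np.asarray(feats, dtype=np.float32)
```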
Additionally, a short calibration step must be executed before the user starts working with the system, in order to collect examples of the user's mouth gestures under the current lighting conditions.
After face detection and eye and mouth pattern recognition, the recognized gesture pattern is translated into a command.
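As a rough illustration of this last step, the current descriptor can be matched against the calibration templates and the winning gesture label mapped to a command; the labels, commands, and distance threshold below are assumptions, not the thesis's actual mapping.

```python
# Illustrative gesture-to-command translation via nearest-neighbor matching
# against calibration templates collected in the calibration step.
import numpy as np

COMMANDS = {"mouth_open": "select", "mouth_closed": "idle", "smile": "next"}

def classify_gesture(descriptor, templates, max_dist=0.5):
    """templates: dict mapping gesture label -> list of calibration descriptors."""
    best_label, best_dist = None, np.inf
    for label, vecs in templates.items():
        for v in vecs:
            d = np.linalg.norm(descriptor - v)
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label if best_dist <= max_dist else None

def gesture_to_command(label):
    return COMMANDS.get(label)  # None when no command is bound to the gesture
```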
Experimental results show that the proposed system works reasonably well and has the potential to serve as an alternative for users who cannot operate conventional smartphone interfaces.