Haptic and audio cues now appear commonly in computer interfaces, partly because of inherent advantages such as their support for eyes-free interaction. Their invisible, unobservable nature also makes them ideal candidates for security interfaces in which users have to enter secret information such as passwords. In particular, researchers have explored this idea through the design of PIN entry authentication systems based on multi-modal combinations of visual and non-visual content or on the recognition of small sets of unimodal haptic or audio stimuli. This paper highlights the benefits and performance limitations of these approaches and introduces an alternative based on unimodal audio or haptic temporal numerosity: the ability to accurately and rapidly determine the number of cues presented in rapid temporal succession. In essence, in a numerosity interface, rather than recognizing distinct cues, users must count the number of times that a single cue occurs. In an iterative process of design and evaluation, three prototypes implementing this concept are presented and studies of their use are reported. The fastest PIN entry times and lowest error rates observed were 8 s and 2%, respectively, figures that improve substantially on previous research. These results are attained while maintaining low levels of workload and substantial resistance to observation attack (as determined via camera attack security studies). In sum, this paper argues that unimodal audio and haptic numerosity is a valuable and relatively unexplored metaphor for non-visual input and demonstrates the validity of this claim in the demanding task of unobservable authentication. (C) 2012 British Informatics Society Limited. All rights reserved.