Visual proportion sense in untrained deep neural networks

Visual proportion sense, the ability to estimate the proportion between two visual magnitudes, is an essential cognitive function for animal survival (e.g., hunting, foraging, and mating). It is widely observed in animals, in humans, and even in infants with no prior knowledge (Jacob, 2012). Neurophysiological and neuroimaging experiments have found that some neurons in rhesus monkeys and voxels in adult humans respond selectively to particular visual proportions (Vallentin, 2008; Jacob, 2009). However, how proportion sense can emerge without any learning remains elusive, even in theoretical studies. A recent study showed that visual number sense can arise from hierarchical random feedforward connections in an untrained deep neural network (DNN) (Kim, 2021). Building on this idea, we hypothesized that visual proportion sense can likewise emerge in an untrained network. We designed sets of images consisting of white (W) and black (B) dots that represent specific proportions (e.g., W/(W+B) = 1/6), and fed these images into a randomly initialized DNN, AlexNet. In the designed images, the total number of dots is 6, 12, or 18, while the proportion stays the same (e.g., W:(W+B) = 1:6, 2:12, and 3:18 all represent 1/6). In the deep layers of untrained AlexNet (Conv 3-5), we found proportion units whose responses are tuned to specific proportions regardless of the total number of dots in the image. These proportion units respond selectively and robustly to their preferred proportion when the total number of dots varies from 2 to 30 (invariance to absolute number) and when dot size, shape, or total area varies (invariance to low-level image features). Next, to examine how the proportion units emerge, we tested two hypotheses: proportion units selectively receive input from the previous layers, from either 1) units that encode a proportion at a specific total number of dots (e.g., 1/6 as W1 and B5, W2 and B10, ⋯) or 2) units that encode the increasing tendency of white or black dots (increment units). From a weight analysis of the untrained network, we found that proportion units in layer 5 receive stronger inputs from white/black increment units in layer 4 than from number-specific proportion units in layer 4. In addition, ablating the increment units significantly diminished the emergence of proportion units, whereas ablating the number-specific units did not, supporting the latter hypothesis. In summary, our results suggest that visual proportion sense may arise without training, and they provide insight into how innate cognitive functions can emerge from a hierarchical network structure.
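As a small illustration of the stimulus arithmetic described in the abstract, the sketch below lists the white and black dot counts that represent one proportion at several totals. The function name dot_counts and its parameters are hypothetical, introduced only for this example; the abstract does not describe the authors' stimulus-generation code.

```python
from fractions import Fraction


def dot_counts(proportion, totals):
    """Return (white, black) dot counts for each total that can show
    `proportion` = W / (W + B) exactly.

    Hypothetical helper for illustration; not the authors' code.
    """
    counts = []
    for total in totals:
        white = proportion * total  # Fraction arithmetic stays exact
        if white.denominator == 1:  # skip totals that cannot hit the proportion
            counts.append((int(white), total - int(white)))
    return counts


# Proportion 1/6 at the three totals mentioned in the abstract.
print(dot_counts(Fraction(1, 6), [6, 12, 18]))  # [(1, 5), (2, 10), (3, 15)]
```

The pairs (1, 5) and (2, 10) match the "W1 and B5, W2 and B10" example in the text; a total that cannot represent the proportion with whole dots is simply skipped.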
Publisher
Society for Neuroscience
Issue Date
2021-11-09
Language
English
Citation

Society for Neuroscience 2021

URI
http://hdl.handle.net/10203/290606
Appears in Collection
BiS-Conference Papers(학술회의논문)
