The main problem in transmitting visual information over packet networks is that they are lossy and do not support guaranteed service. There are four kinds of loss: packet corruption (i.e., part of a packet's information is lost or changed), packet loss (i.e., all of a packet's information is lost), packet delay (i.e., timing information is lost), and packet jitter or delay variation (i.e., synchronization information is lost). These losses are the main cause of image quality impairments, and they are inevitable in packet networks. Fortunately, in contrast to other data such as text, visual data can tolerate a small distortion, as long as appropriate masking built into the signal decoders keeps it imperceptible to the human eye. For this reason, we begin our study by investigating the basic mechanisms and characteristics of information processing in the human visual system (HVS). To facilitate our discussion, we formulate a {\it percepton} model for visual perception. A percepton is defined as a basic unit of human perception arising from the scene. This thesis addresses two problems: what is the maximum tolerance for a specific representation of a percepton, and how can perceptons be processed efficiently to maximize perceptual image quality over lossy packet networks?
In the first part, we identify the maximum tolerance level for coefficients in a transform-domain representation, for which we take the discrete cosine transform (DCT) and the wavelet transform (WT) because they are widely used and efficient. Imperceptible distortion has been studied extensively. We extend the Watson model, which has been applied to the baseline JPEG coder. The model exploits three different properties of the human visual system: frequency sensitivity, luminance masking, and contrast masking. In the human visual system, horizontal and amacrine cells transmit signals to the neighboring bipolar and ganglion cells, which inhib...