I'm starting to study neural networks applied to image processing. My first experiments needed a very large input layer, which caused a memory overflow. So I considered downsampling my input, but I can't figure out a scale ratio that is both "fast" and "robust".
For example, assume a simple computer webcam with a capture resolution of 320x240: how much should I scale it down?
AlxEyesoul 1 year ago
@AlxEyesoul Scaling down the image alone is not a good solution for reducing the input size, especially for face recognition. The minimum image size I would recommend for face recognition is 160x120. Anything below that would greatly reduce the recognition accuracy. But even at 160x120 you are still stuck with a very large input size (160 x 120 = 19200 inputs).
kyoussef321 1 year ago
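To make the arithmetic in the comment above concrete, here is a minimal sketch of halving a 320x240 grayscale frame to 160x120 by averaging 2x2 blocks, then flattening it into network inputs. The `downsample` helper and the synthetic `frame` are assumptions for illustration, not something from the thread; a real pipeline would more likely use an image library's resize function.

```python
def downsample(gray, factor):
    """Shrink a grayscale image (a list of rows) by averaging factor x factor blocks."""
    height, width = len(gray), len(gray[0])
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by factor")
    out = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            # Collect every pixel in the current block and average them.
            block = [gray[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Synthetic 320x240 frame (240 rows of 320 pixels) standing in for a webcam capture.
frame = [[(x + y) % 256 for x in range(320)] for y in range(240)]

small = downsample(frame, 2)

# Flatten to one value per input neuron, scaled to the range [0, 1].
inputs = [p / 255.0 for row in small for p in row]

print(len(small[0]), len(small), len(inputs))  # 160 120 19200
```

Averaging blocks rather than dropping every other pixel reduces aliasing, but as the comment says, the input layer still needs 19200 units at this size.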
@AlxEyesoul So what I recommend to solve your problem is that you consider feature extraction and compression techniques. For example, instead of using the entire face, use only certain regions that differ between people, such as the eyes and lips. Use features such as skin texture, the distance between the eyes and nose…, or a combination of those. A feature vector used as input is typically between a few hundred and a couple of thousand values long.
kyoussef321 1 year ago
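A rough sketch of the region-based idea in the comment above: crop a few face regions, flatten them, and append one distance feature. The box coordinates, eye centers, and helper names (`crop`, `flatten`, `feature_vector`) are all hypothetical; in practice the regions would come from a face or landmark detector, which this sketch does not include.

```python
import math


def crop(gray, top, left, height, width):
    """Return the height x width sub-image whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in gray[top:top + height]]


def flatten(region):
    """Flatten a region into a list of pixel values scaled to [0, 1]."""
    return [p / 255.0 for row in region for p in row]


def feature_vector(face, boxes, eye_centers):
    """Concatenate the cropped regions and append the normalized eye distance."""
    features = []
    for top, left, height, width in boxes:
        features.extend(flatten(crop(face, top, left, height, width)))
    (x1, y1), (x2, y2) = eye_centers
    # Divide by the image width so the distance does not depend on resolution.
    features.append(math.hypot(x2 - x1, y2 - y1) / len(face[0]))
    return features


# Synthetic 160x120 grayscale face image (120 rows of 160 pixels).
face = [[(x * y) % 256 for x in range(160)] for y in range(120)]

# Hypothetical (top, left, height, width) boxes: left eye, right eye, mouth.
boxes = [(40, 36, 12, 32), (40, 92, 12, 32), (84, 60, 16, 40)]
# Hypothetical (x, y) eye centers.
eye_centers = [(52, 46), (108, 46)]

vec = feature_vector(face, boxes, eye_centers)
print(len(vec))  # 1409
```

With these assumed boxes the input shrinks from 19200 values to 1409, which falls in the range the comment describes.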
@kyoussef321 Thanks, but let's keep it simple: what is the best resolution for a B/W image to be processed?
AlxEyesoul 1 year ago