To start with, I'll explain the basic 'funda' behind the whole project. There are two webcams in the classroom with different, non-overlapping fields of view. Each webcam captures an image of the part of the classroom it is assigned to. After a short interval, say 2 seconds, it takes another image of the same part. Any movable object, which is a student in our case, will have a slightly different position in the two images (even the most disciplined student will have some facial movement!). Call these images 'A' and 'B'. If we can detect this difference, our job is done.
   Our approach has been to convert these two images to grayscale and then subtract one from the other using OpenCV library functions. What you'll see in the resulting image (faintly, though) is an outline of the students captured; it looks almost like a faint negative. Now, since the classroom is fixed, you generate a file containing all the regions of interest within which you want to detect people. Using these regions of interest, a series of 'masks' is created. A mask is essentially an image in which only one particular area is painted white and the rest is kept black. Each mask is multiplied (a pixel-by-pixel logical AND) with the two images 'A' and 'B'. Each of the resulting images is then put through the following operations:

    The average value of the pixel intensities is calculated.

    The variance about this mean value is also measured.
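To make the two operations concrete, here is a minimal sketch in plain Python (no OpenCV, so it runs anywhere). It is not the project's actual code. The helper names `abs_diff`, `make_mask` and `region_stats`, the rectangular region format and the toy 4x4 frames are all made up for illustration. It also assumes that the statistics are taken over the absolute difference of 'A' and 'B', and only over the pixels inside the white area of the mask.

```python
from statistics import mean, pvariance

def abs_diff(img_a, img_b):
    """Pixel-wise absolute difference of two equally sized grayscale images."""
    return [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]

def make_mask(height, width, region):
    """Build a mask: 255 (white) inside the region, 0 (black) elsewhere.

    region is a hypothetical (top, left, bottom, right) rectangle,
    with bottom and right exclusive.
    """
    top, left, bottom, right = region
    return [[255 if top <= y < bottom and left <= x < right else 0
             for x in range(width)]
            for y in range(height)]

def region_stats(diff, mask):
    """Mean and variance of the difference pixels where the mask is white."""
    values = [d
              for diff_row, mask_row in zip(diff, mask)
              for d, m in zip(diff_row, mask_row)
              if m]
    return mean(values), pvariance(values)

# Toy grayscale frames: only the top-left 2x2 block changes between A and B.
frame_a = [[10, 10, 50, 50],
           [10, 10, 50, 50],
           [90, 90, 30, 30],
           [90, 90, 30, 30]]
frame_b = [[40, 20, 50, 50],
           [20, 40, 50, 50],
           [90, 90, 30, 30],
           [90, 90, 30, 30]]

diff = abs_diff(frame_a, frame_b)
occupied = region_stats(diff, make_mask(4, 4, (0, 0, 2, 2)))  # region with movement
empty = region_stats(diff, make_mask(4, 4, (2, 2, 4, 4)))     # region without movement
print(occupied, empty)
```

In this toy case the region with movement has a clearly non-zero mean and variance, while the untouched region gives zero for both. Real webcam frames would never give exact zeros for an empty region, because of noise and lighting changes.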

A threshold is chosen for each of these two parameters, above which it is asserted that a student is present in that particular region. The threshold values were determined empirically after taking many samples. Even if no one is present in a particular mask, a finite mean and variance are obtained. This is essentially optical noise, due to the ever-changing illumination of the room. Finally, the number of masks whose values exceed the thresholds gives the number of people in the class.
See the example in the image archive for reference.
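The counting step can be sketched as below, again in plain Python and only as an illustration. The function name `count_occupied`, the sample statistics and the threshold values are hypothetical (the real thresholds were found empirically and are not given here). The sketch also assumes that a region counts as occupied only when both the mean and the variance exceed their thresholds.

```python
def count_occupied(stats_per_region, mean_threshold, variance_threshold):
    """Count the regions whose mean AND variance both exceed their thresholds."""
    return sum(1
               for region_mean, region_variance in stats_per_region
               if region_mean > mean_threshold
               and region_variance > variance_threshold)

# Hypothetical (mean, variance) pairs, one per mask / region of interest.
stats_per_region = [
    (20.0, 100.0),  # clear movement: counted
    (1.5, 2.0),     # only illumination noise: not counted
    (12.0, 3.0),    # mean is high but variance is low: not counted
    (18.0, 64.0),   # clear movement: counted
]

# Made-up thresholds, chosen only so that the example is easy to follow.
attendance = count_occupied(stats_per_region,
                            mean_threshold=8.0,
                            variance_threshold=25.0)
print(attendance)  # prints 2
```

Requiring both values to pass is a stricter choice than accepting either one. It would reject a uniform brightness change across a region, which raises the mean but hardly changes the variance.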

The classroom chosen was of medium size (one of our tutorial classrooms), and the cams were mounted on the wall on either side of the blackboard. This was done using two wooden supports that allowed each cam to be rotated as well as adjusted vertically. These were relatively easy to get made, and cheap too. A couple of USB extension cables were used to connect the cams to the computer.

A particular problem I faced while programming was (strangely enough) getting OpenCV to work with two cameras at the same time. I don't know if it is a bug in OpenCV or some technical detail I overlooked, but the program kept running into one execution error or another. Finally, I decided to use the program for one cam at a time, i.e. the application has to be run twice to get the overall attendance.

