

Moron $Ver 0.1.0 - Method for Object Recognition of Obscure Nature

(11-Feb-2002)

For license, see file COPYING. Note that this tool reuses 
parts of the WEKA software, "http://www.cs.waikato.ac.nz/~ml/".


Purpose of Moron
----------------
Moron tries to classify images into one of the following 
four types:

a (0.0) pron, subclass col
b (1.0) drawn, subclass cg
c (2.0) drawn, subclass manga
d (3.0) pron, subclass bw

You give it an image, and it tells you what it thinks it is. 

In theory Moron could be used to filter spam from e.g. 
cartoon/manga newsgroups, but the success rate might 
not be good enough for real-life use. See below.

Moron uses a machine learning approach. The classification 
engine was trained with features from about six thousand images.


Requirements
------------
gcc, a Java 1.2 compiler, netpbm packages (anytopnm, libpnm.so)


Usage
-----
Compile: make

Run:

./evalfile.sh turd.jpg                 Classify an image
./evaldir.sh /turds/pics/              Classify a directory of images
./evalpat.sh "/turds/pics/dog*"        Vote on a bunch of images

There are four different classification engines included, with 
engine4 (the latest) being the default. You can change the engine 
with "make engine1" etc. for slightly different classification 
results. Engine3 might be worth trying. (Note that engines 1-3 use 
the old feature extractor extract_n6.c; the Makefile takes this 
into account.)


Current accuracy
----------------
Here is the confusion matrix from an experiment with a 91% 
success rate using 2-fold cross-validation. The numbers are 
image counts; each row is an actual class and each column is 
the class Moron assigned.

    a    b    c    d   <-- classified as
 1051   38    6   62 |    a = 0
   49 1025   28   31 |    b = 1
   14   41 1025   34 |    c = 2
   19   21   38 1012 |    d = 3

Note that mistakes inside the two subsets of "drawn" and "pron"
(that is, "b vs c" or "a vs d") are not tragic. For the binary 
classification task ["drawn" vs. "pron"] this implies a success 
rate of about 95%.
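
Both percentages can be re-derived from the matrix. The sketch 
below is only an illustration and is not part of Moron; it assumes 
that rows are the actual classes and that a and d count as "pron" 
while b and c count as "drawn". The function and array names are 
made up for this example.

```c
#include <stddef.h>

/* Confusion matrix copied from above: rows = actual, columns = assigned. */
static const int CONFUSION[4][4] = {
    {1051,   38,    6,   62},  /* a: pron,  subclass col   */
    {  49, 1025,   28,   31},  /* b: drawn, subclass cg    */
    {  14,   41, 1025,   34},  /* c: drawn, subclass manga */
    {  19,   21,   38, 1012},  /* d: pron,  subclass bw    */
};

/* Superclass of each class index: 0 = pron, 1 = drawn. */
static const int SUPERCLASS[4] = {0, 1, 1, 0};

/* Fraction of images whose assigned class equals the actual class. */
double four_class_accuracy(void)
{
    long correct = 0, total = 0;
    for (size_t i = 0; i < 4; i++) {
        for (size_t j = 0; j < 4; j++) {
            total += CONFUSION[i][j];
            if (i == j)
                correct += CONFUSION[i][j];
        }
    }
    return (double)correct / (double)total;
}

/* Fraction of images assigned to the right superclass (pron vs. drawn). */
double binary_accuracy(void)
{
    long correct = 0, total = 0;
    for (size_t i = 0; i < 4; i++) {
        for (size_t j = 0; j < 4; j++) {
            total += CONFUSION[i][j];
            if (SUPERCLASS[i] == SUPERCLASS[j])
                correct += CONFUSION[i][j];
        }
    }
    return (double)correct / (double)total;
}
```

With these numbers, 4113 of 4494 images land on the diagonal, 
and 4263 of 4494 land in the right superclass.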

The most efficient way to get a better classification rate is 
to extract some additional features from the images and then 
retrain. What those features could be is not yet clear.


Philosophy & internals
----------------------
Machine learning algorithms are easy enough to find on the 
internet, but the same can't be said of feature extraction 
algorithms or image classification kits. However, any successful 
image classification task seems to depend heavily on the 
extracted features: the training images just can't be fed 
straight to some learning algorithm. Instead, well-chosen 
properties (called features) of the images are collected and 
the classifier is built on these.

Moron's feature extractor "extract" reads a pnm image
file from stdin and outputs the calculated features. 
Currently these features are quite simple, consisting of 

- sum of differences between the current and previous pixel,
  summed in x- and y-directions as values xdiff and ydiff
  (calculated in grayscale). Scaled by image size.
- features xc* and yc* show how much of the image consists of
  runs of identical pixel values, e.g. a value of 0.4 for xc2
  means that 40% of the image consisted of 2-pixel runs having
  the same color (calculated in grayscale).
- 64-bin histograms for the R, G, B and GRAY planes.
- rcolors, bcolors, gcolors, tcolors: the number of
  different colors in each plane, max 256.
- xmax and ymax: the maximum number of consecutive pixels 
  with the same color in a row/column. Scaled by image size.
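
To make the list more concrete, here is a rough sketch of three of 
these features in C. It is not the code from extract_n7.c: the 
function names, the 8-bit row-major buffer layout and the exact 
scaling (dividing by pixel count or by row width, normalising the 
histogram to fractions) are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <stddef.h>

/* xdiff-like feature: sum of absolute differences between horizontally
   adjacent gray pixels, divided by the pixel count (assumed scaling). */
double xdiff_feature(const unsigned char *gray, size_t width, size_t height)
{
    if (width == 0 || height == 0)
        return 0.0;
    unsigned long sum = 0;
    for (size_t y = 0; y < height; y++) {
        const unsigned char *row = gray + y * width;
        for (size_t x = 1; x < width; x++)
            sum += (unsigned long)abs((int)row[x] - (int)row[x - 1]);
    }
    return (double)sum / (double)(width * height);
}

/* 64-bin histogram of one 8-bit plane: each bin covers 4 adjacent levels.
   Bins are normalised to fractions of the pixel count (assumed scaling). */
void histogram64(const unsigned char *plane, size_t count, double bins[64])
{
    for (int i = 0; i < 64; i++)
        bins[i] = 0.0;
    if (count == 0)
        return;
    for (size_t i = 0; i < count; i++)
        bins[plane[i] >> 2] += 1.0;
    for (int i = 0; i < 64; i++)
        bins[i] /= (double)count;
}

/* xmax-like feature: longest horizontal run of identical gray values,
   as a fraction of the row width (assumed scaling). */
double xmax_feature(const unsigned char *gray, size_t width, size_t height)
{
    if (width == 0 || height == 0)
        return 0.0;
    size_t longest = 1;
    for (size_t y = 0; y < height; y++) {
        const unsigned char *row = gray + y * width;
        size_t run = 1;
        for (size_t x = 1; x < width; x++) {
            run = (row[x] == row[x - 1]) ? run + 1 : 1;
            if (run > longest)
                longest = run;
        }
    }
    return (double)longest / (double)width;
}
```

The ydiff and ymax variants would walk the columns instead of the 
rows in the same way.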

The calculated features are then used for prediction with 
"java predict <feature_file.arff>". See the script "evalfile.sh"
for how this works as a whole.

The features mentioned above are ad-hoc. The point is that 
we don't actually know what makes a drawn image a drawn image, 
but instead we provide these features to some learning algorithm 
and hope that it is able to find some useful rules. It seems 
that one characteristic feature of a drawn image is that it
has large surfaces with about the same color. It might
also have a color distribution different from that of 
photographs. The mentioned features try to capture these 
properties, but they begin to fail as the drawn images start 
to look more photorealistic.

I am looking for ideas for further feature extraction,
or even better, source code that can be used directly. If
you have ideas, let me know. Research articles certainly
burst with different techniques, but there is no code
available, the mathematics can be depressing, and their 
usability is questionable compared to the amount of 
reimplementation work involved. Also, methods in Computer 
Vision journals do not usually output the features in a form 
usable by a learning algorithm. The ideal features can be 
expressed as a few simple numeric values, because it's easiest 
for the learning algorithm to benefit from those. Thus 
e.g. a whole picture with just the edges (from "edge detection" 
extraction) would be hard stuff for any learning algorithm, 
but a simple attribute like "number of edges" scaled with 
picture size might give better results in some cases.
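
As an illustration of such a single-number attribute, the sketch 
below counts "edge" pixels with a crude gradient threshold and 
divides by the pixel count. This is not something Moron currently 
does, and it is not a real edge detector either: the function 
name, the threshold parameter and the scaling are assumptions 
chosen only to show the idea.

```c
#include <stdlib.h>
#include <stddef.h>

/* Fraction of pixels whose gray value differs from the right and lower
   neighbours by more than `threshold` in total (|dx| + |dy|).
   The last row and column are skipped because they lack a neighbour. */
double edge_density(const unsigned char *gray, size_t width, size_t height,
                    int threshold)
{
    if (width < 2 || height < 2)
        return 0.0;
    size_t edges = 0;
    for (size_t y = 0; y + 1 < height; y++) {
        for (size_t x = 0; x + 1 < width; x++) {
            int p  = gray[y * width + x];
            int dx = abs(gray[y * width + x + 1] - p);
            int dy = abs(gray[(y + 1) * width + x] - p);
            if (dx + dy > threshold)
                edges++;
        }
    }
    return (double)edges / (double)(width * height);
}
```

The result is one numeric value per image, which is the kind of 
attribute a learning algorithm can use directly.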


How to re-teach
---------------
The current classifiers (engines) are AdaBoosted or LogitBoosted 
decision trees and decision stumps. If you wish to experiment 
with a wide variety of learning algorithms on this same task
(to see how they would perform), get the latest WEKA from

http://www.cs.waikato.ac.nz/~ml/

and, from my homepage, my training file containing features 
from ~6000 images

http://www.geocities.com/iwronsky/

Then you can build a new classification engine using
some other algorithms simply by running something like

java -cp weka.jar weka.classifiers.AdaBoostM1 -I 100 -t train.arff -W weka.classifiers.DecisionStump -z maha >WekaWrapper.java

which outputs the classifier source code as WekaWrapper.java. Then 
replace the current WekaWrapper file in "moron/weka/classifiers/" 
with the new one and recompile. FYI, on the command line above, 
AdaBoost is the boosting algorithm, run for 100 iterations. It 
combines the predictions of 100 "weaker" classifiers, in this 
case DecisionStumps.
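
To give a feel for what such a boosted classifier does at 
prediction time, here is a simplified sketch of weighted voting 
over decision stumps. It is not the code WEKA generates: the 
struct layout and names are made up, each stump here votes for a 
single class instead of producing a class distribution, and the 
training step that picks thresholds and weights is left out.

```c
#include <stddef.h>

#define MAX_CLASSES 4

/* One decision stump: a single threshold test on a single feature. */
typedef struct {
    size_t feature;    /* index into the feature vector           */
    double threshold;  /* split point                             */
    int    below;      /* class voted when value <= threshold     */
    int    above;      /* class voted when value >  threshold     */
    double weight;     /* vote weight assigned by boosting        */
} Stump;

/* Every stump casts a weighted vote; the class with the largest
   total weight wins. Class indices must be in 0..MAX_CLASSES-1. */
int boosted_predict(const Stump *stumps, size_t n_stumps,
                    const double *features)
{
    double votes[MAX_CLASSES] = {0.0};
    for (size_t i = 0; i < n_stumps; i++) {
        const Stump *s = &stumps[i];
        int cls = (features[s->feature] <= s->threshold) ? s->below
                                                         : s->above;
        votes[cls] += s->weight;
    }
    int best = 0;
    for (int c = 1; c < MAX_CLASSES; c++) {
        if (votes[c] > votes[best])
            best = c;
    }
    return best;
}
```

Each stump alone is a weak classifier; the weighted vote is what 
makes the combination useful.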

If you are not familiar with machine learning or the algorithms,
it doesn't matter. WEKA has a nice GUI for experimenting and
quite good documentation, and tweaking can get you quite far.

You can also adapt this software to some entirely new 
image classification task by preparing your own training
material. This can be done by collecting a suitable
number of training images and using "extract" to get
the features from them. The script "create_training_examples.sh"
illustrates this. Of course the features for the current
cartoon task (mostly "global properties") might be unusable 
for your case, so you'll have to modify the extraction
routines to output more meaningful attributes.


Important files of Moron
------------------------
    extract_n7.c       - The feature extractor, model "Camp Mk1"
    predict.java       - Predicts/votes on given features, using
                         a WekaWrapper engine
    WekaWrapper?.java  - One of the pretrained classification engines
    create_training_examples.sh - Demonstrates how to create new
                                  training files


Contact
-------
Ideas? Suggestions? Flames?! Code? What do you think of 
this program? Just send mail to <iwronsky(at)users.sourceforge.net> 


Addendum
--------
ps. you dislike the expressions in this document? 
Then this one is for you:

"hey l33t hex0r i've hackd sum stuff to do er stuff 
 with pr0n if yu like to hack ftooo tghen snend me mail
 lets hachjk da plac3 tho the grounddasshzzaah.!!! geaH"



