Using Variance Estimates to Combine Bayesian Classifiers
Keywords:
Belief nets, Graphical Model, Machine Learning
Abstract:
Many of today's best classification results are obtained by
combining the responses of a set of base classifiers to produce an answer for
the query.
This paper explores a novel "query specific" combination rule:
After learning a set of simple belief network classifiers,
we produce an answer to each query by combining their individual responses,
using weights based inversely on their respective variances
around their responses.
These variances are based on the uncertainty of the network parameters,
which in turn depend on the training data sample.
In essence, this variance quantifies the
base classifier's confidence in its response to this query.
Our experimental results show that
these "mixture-using-variance belief net classifiers" (MUVs)
work effectively,
especially when the base classifiers are learned
using balanced bootstrap samples
and when their results are combined using James-Stein shrinkage.
We also found that our variance-based combination rule
performed better than both
bagging and AdaBoost,
even on the set of base classifiers produced by AdaBoost itself.
Finally, this framework is extremely efficient,
as both the learning and the classification components require only
straight-line code.
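
The query-specific combination rule described above can be illustrated with a minimal sketch. It assumes a binary class, that each base classifier reports a probability for the query together with a variance around that probability, and that the weights are plain normalized inverse variances; the abstract does not give the exact weighting, and the function name `combine_inverse_variance` and the `eps` guard are hypothetical. James-Stein shrinkage is not shown.

```python
def combine_inverse_variance(probs, variances, eps=1e-12):
    """Combine base-classifier responses to ONE query.

    probs:     each classifier's estimate of P(class = + | query)
    variances: each classifier's variance around that estimate
    eps:       hypothetical guard against division by zero

    Assumption: weights are normalized inverse variances, so a
    classifier that is more certain about this query counts for more.
    """
    if len(probs) != len(variances):
        raise ValueError("need one variance per response")
    weights = [1.0 / (v + eps) for v in variances]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, probs)) / total


# Three base classifiers answer the same query; the first is the most
# confident (smallest variance), so it dominates the combined answer.
combined = combine_inverse_variance([0.9, 0.4, 0.6], [0.01, 0.04, 0.04])
label = "+" if combined >= 0.5 else "-"
```

Because the weights are recomputed from the variances for each query, a classifier that dominates one answer may contribute little to the next, and the whole computation is a fixed sequence of arithmetic operations per classifier.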
- Submission version (pdf)
- Extra information
- Derivation for James-Stein shrinkage (see Section 3.2, pages 2-3, of the extended version)
- Tables omitted from the submission
- Table 1 (Section 4.1):
- Comparing MUV(X, <boot,bal>, mle) vs MUV(X, <boot,bal>, js)
- Table 2 (Section 4.1):
- Comparing MUV(X, <disj,bal>, mle) vs MUV(X, <disj,bal>, js)
- Comparing kNBs and kTANs when data samples are either Balanced or Skewed.