Background
A Bayesian network is a particular type of learning network well-suited to
classification problems, grounded in Bayesian statistical methods. Each node
represents a variable, and with each node we keep information about
the probability that the variable has a particular value given any known
information about the system. Edges between nodes indicate conditional
dependence between the associated variables. Because each node encodes the
probabilities for its variable conditioned on the node's parents in the graph,
the network as a whole represents a joint distribution over the variables in
question.
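The factorization described above can be sketched in a few lines of Python. This is a minimal illustration with a hypothetical two-node network (Rain influencing WetGrass) and made-up probabilities, not a general implementation: each node stores its conditional table, and the joint probability is the product of the per-node factors.

```python
# Hypothetical two-node network: Rain -> WetGrass.
# P(Rain)
p_rain = {True: 0.2, False: 0.8}
# P(WetGrass | Rain), indexed as [rain][wet]
p_wet_given_rain = {
    True:  {True: 0.9, False: 0.1},
    False: {True: 0.1, False: 0.9},
}

def joint(rain, wet):
    """Chain-rule factorization: P(rain, wet) = P(rain) * P(wet | rain)."""
    return p_rain[rain] * p_wet_given_rain[rain][wet]

# The per-node factors define a proper joint distribution:
# summing over all assignments gives 1 (up to float rounding).
total = sum(joint(r, w) for r in (True, False) for w in (True, False))
print(total)
```

The same product structure extends to larger networks: one factor per node, conditioned on that node's parents.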
We wish to use these networks to solve classification problems. To do this, we
must somehow alter the node weights to reflect the true probabilities, rather
than the prior probabilities. Then, given the state of the input variables, we
can accurately predict the value of the output variables, as well as give a
measure of confidence in our choice.
The techniques for doing this come from Bayesian statistics. Given a prior
probability distribution and new information, Bayes's rule allows us to
calculate updated probabilities, known as posterior probabilities, that reflect
our new data. The process of updating the node weights is known as Bayesian
inference.
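A single update of this kind can be sketched concretely. The numbers and variable names below are hypothetical, reusing the Rain/WetGrass example: given a prior P(Rain) and an observation WetGrass = true, Bayes's rule gives the posterior P(Rain | wet) = P(wet | Rain) P(Rain) / P(wet).

```python
prior = {True: 0.2, False: 0.8}          # P(Rain), hypothetical numbers
likelihood = {True: 0.9, False: 0.1}     # P(WetGrass=True | Rain)

# P(WetGrass=True), the normalizing constant in Bayes's rule
evidence = sum(likelihood[r] * prior[r] for r in prior)

# Posterior: P(Rain | WetGrass=True) for each value of Rain
posterior = {r: likelihood[r] * prior[r] / evidence for r in prior}

print(posterior[True])  # 0.18 / 0.26, roughly 0.69
```

The posterior itself doubles as the confidence measure: here, observing wet grass raises the probability of rain from 0.2 to about 0.69.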
Unfortunately, the general problem of Bayesian inference is NP-hard. Therefore,
for large networks we must either approximate the inference or constrain the
structure of our graph to simplify the inference process. Many inference
techniques are in use today, each with its own advantages.
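One of the simplest approximation techniques, shown here only as an illustrative sketch, is rejection sampling: draw samples from the network in topological order, discard those inconsistent with the evidence, and estimate the query from the survivors. The network and numbers are the hypothetical Rain/WetGrass example again.

```python
import random

def sample_network(rng):
    """Draw one joint sample (rain, wet) from the hypothetical network."""
    rain = rng.random() < 0.2               # P(Rain) = 0.2
    p_wet = 0.9 if rain else 0.1            # P(WetGrass=True | Rain)
    wet = rng.random() < p_wet
    return rain, wet

rng = random.Random(0)
# Keep only samples consistent with the evidence WetGrass = True.
kept = [rain for rain, wet in (sample_network(rng) for _ in range(100_000)) if wet]

# Estimate P(Rain | WetGrass=True) as the fraction of kept samples with Rain.
estimate = sum(kept) / len(kept)
print(estimate)  # should be close to the exact value 0.18 / 0.26, about 0.692
```

Rejection sampling is easy to state but wasteful when the evidence is unlikely, which is one reason more sophisticated schemes (likelihood weighting, MCMC, variational methods) are preferred in practice.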