matlab - Simple example/use-case for a BNT gaussian_CPD?


I am attempting to implement a naive Bayes classifier using BNT and MATLAB. So far I have been sticking with simple tabular_CPD variables and "guesstimating" the probabilities for the variables. My prototype net so far consists of the following:

    dag = false(5);
    dag(1, 2:5) = true;
    bnet = mk_bnet(dag, [2 3 4 3 3]);
    bnet.CPD{1} = tabular_CPD(bnet, 1, [.5  .5]);
    bnet.CPD{2} = tabular_CPD(bnet, 2, [.1  .345   .45 .355   .45 .3]);
    bnet.CPD{3} = tabular_CPD(bnet, 3, [.2  .02    .59 .2     .2  .39    .01 .39]);
    bnet.CPD{4} = tabular_CPD(bnet, 4, [.4  .33333 .5  .33333 .1  .33333]);
    bnet.CPD{5} = tabular_CPD(bnet, 5, [.5  .33333 .4  .33333 .1  .33333]);
    engine = jtree_inf_engine(bnet);

Here variable 1 is the desired output variable, initially set to assign a .5 probability to either output class.

Variables 2-5 define CPDs for the features I measure:

  • 2 is a cluster size, ranging from 1 to a dozen or more
  • 3 is a ratio that will be a real value >= 1
  • 4 and 5 are standard deviation (real) values (X and Y scatter)

In order to classify a candidate cluster, I break all of the feature measurements into 3-4 range brackets, like so:

    ...
    evidence = cell(1, 5);
    evidence{2} = sum(m > [0 2 6]);
    evidence{3} = sum(o > [0 1.57 2 3]);
    evidence{4} = sum(s(1) > [-inf 1 2]);
    evidence{5} = sum(s(2) > [-inf 0.4 0.8]);
    eng = enter_evidence(engine, evidence);
    marginals = marginal_nodes(eng, 1);
    e = marginals.T(1);
    ...
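As an aside on the idiom above: `sum(x > edges)` counts how many bracket edges the value exceeds, which yields a 1-based bracket index. For example, with the cluster-size edges:

```matlab
% sum(x > edges) maps a raw measurement to its bracket number
edges = [0 2 6];       % brackets: (0,2], (2,6], (6,inf)
sum(1  > edges)        % ans = 1  (first bracket)
sum(4  > edges)        % ans = 2  (second bracket)
sum(10 > edges)        % ans = 3  (third bracket)
```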

This works pretty well, considering that I'm only guessing at the range brackets and the probability values. But I believe what I should be using here is a gaussian_CPD. I think that a gaussian_CPD can learn both the optimal brackets and probabilities (as mean and covariance matrices and weights).

My problem is, I am not finding any simple examples of how the BNT gaussian_CPD class is used. How, for example, would I go about initializing a gaussian_CPD to have approximately the same behavior as one of the tabular_CPD variables above?

I eventually figured this out by experimenting with BNT at the MATLAB command prompt. Here is how I defined my classifier net using gaussian_CPD nodes:

    dag = false(5);
    dag(1, 2:5) = true
    bnet = mk_bnet(dag, [2 1 1 2 1], 'discrete', 1);
    bnet.CPD{1} = tabular_CPD(bnet, 1, 'prior_type', 'dirichlet');
    for node = 2:5
        bnet.CPD{node} = gaussian_CPD(bnet, node);
    end
    bnet

    dag =

         0     1     1     1     1
         0     0     0     0     0
         0     0     0     0     0
         0     0     0     0     0
         0     0     0     0     0

    bnet =

                   equiv_class: [1 2 3 4 5]
                        dnodes: 1
                      observed: []
                         names: {}
                        hidden: [1 2 3 4 5]
                   hidden_bitv: [1 1 1 1 1]
                           dag: [5x5 logical]
                    node_sizes: [2 1 1 2 1]
                        cnodes: [2 3 4 5]
                       parents: {[1x0 double]  [1]  [1]  [1]  [1]}
        members_of_equiv_class: {[1]  [2]  [3]  [4]  [5]}
                           CPD: {[1x1 tabular_CPD]  [1x1 gaussian_CPD]  [1x1 gaussian_CPD]  [1x1 gaussian_CPD]  [1x1 gaussian_CPD]}
                 rep_of_eclass: [1 2 3 4 5]
                         order: [1 5 4 3 2]
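For completeness, a gaussian_CPD can also be given explicit starting parameters instead of the defaults used above. A minimal sketch (the mean/variance numbers here are illustrative guesses, not values from my data): for a scalar continuous node with one binary discrete parent, 'mean' holds one mean per parent state and 'cov' one variance per parent state.

```matlab
% Hypothetical explicit initialization for node 2 (cluster size);
% the numeric values are made up for illustration.
bnet.CPD{2} = gaussian_CPD(bnet, 2, ...
    'mean', [3 8], ...                   % E[size | class 1], E[size | class 2]
    'cov', reshape([4 9], [1 1 2]));     % Var(size | class 1), Var(size | class 2)
```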

To train it, I used my original classifier to label a set of 300 samples, then ran two-thirds of them through a training algorithm:

    bnet = learn_params(bnet, lsamples);
    CPD = struct(bnet.CPD{1});  % peek inside CPD{1}
    dispCPT(CPD.CPT);

    1 : 0.6045
    2 : 0.3955
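For anyone reproducing this: learn_params expects the training data as a cell array with one row per node and one column per case. A hypothetical sketch of how lsamples might be assembled (labels and the observe() helper stand in for my labeling and measurement code, which is not shown here):

```matlab
% Hypothetical assembly of the training set for learn_params:
% one row per node, one column per training case.
ntrain = 200;                           % 2/3 of the 300 labeled samples
lsamples = cell(5, ntrain);
for i = 1:ntrain
    lsamples{1, i} = labels(i);         % class from the original classifier (1 or 2)
    for node = 2:5
        lsamples{node, i} = observe(node, i);  % continuous feature value(s)
    end
end
```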

The output from dispCPT gives a rough idea of the breakdown between class assignments in the labeled samples in the training set.

To test the new classifier, I ran the last third of the results through both the original and the new Bayes nets. Here is the code I used for the new net:

    engine = jtree_inf_engine(bnet);
    evidence = cell(1, 5);
    tresults = cell(3, length(tsamples));
    tresults(3, :) = tsamples(1, :);
    for i = 1:length(tsamples)
        evidence(2:5) = tsamples(2:5, i);
        marginal = marginal_nodes(enter_evidence(engine, evidence), 1);
        tresults{1, i} = find(marginal.T == max(marginal.T)); % generic decision point
        tresults{2, i} = marginal.T(1);
    end
    tresults(:, 1:8)

    ans =

        [         2]    [     1]    [         2]    [         2]    [         2]    [     1]    [     1]    [     1]
        [1.8437e-10]    [0.9982]    [3.3710e-05]    [3.8349e-04]    [2.2995e-11]    [0.9997]    [0.9987]    [0.5116]
        [         2]    [     1]    [         2]    [         2]    [         2]    [     1]    [     1]    [     2]
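A side note on the "generic decision point" line: `find(marginal.T == max(marginal.T))` can return more than one index on an exact tie. Using max's second output is a tie-safe alternative (my variation, not what I originally ran):

```matlab
% max returns the first index of the maximum, so cls is always scalar
[p, cls] = max(marginal.T);   % cls = index of the most probable class
tresults{1, i} = cls;
```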

Then, to figure out if there was any improvement, I plotted overlaid ROC diagrams. As it turned out, the original net did well enough that it was hard to tell whether the trained net using Gaussian CPDs did any better. Printing the areas under the ROC curves clarified that the new net did indeed perform better. (basearea is the original net, area is the new one.)

    conf = cell2mat(tresults(2, :));
    hit = cell2mat(tresults(3, :)) == 1;
    [~, ~, basearea] = plotROC(baseconf, basehit, 'r')
    hold all;
    [~, ~, area] = plotROC(conf, hit, 'b')
    hold off;

    basearea =

        0.9371

    area =

        0.9555

[ROC diagram: overlaid ROC curves for the original and Gaussian-CPD nets]
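Incidentally, if plotROC is unavailable, the area under the curve can be computed without plotting. A minimal sketch (my addition, assuming the Statistics Toolbox for tiedrank), using the rank-sum (Mann-Whitney) formulation, which equals the area under the ROC curve:

```matlab
% Rank-sum AUC: conf = confidence scores, hit = logical vector of
% true class-1 samples (same variables as in the test code above).
r = tiedrank(conf);                  % ranks of all confidences, ties averaged
npos = sum(hit);                     % number of positive samples
nneg = sum(~hit);                    % number of negative samples
auc = (sum(r(hit)) - npos * (npos + 1) / 2) / (npos * nneg);
```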

I'm posting this here so that the next time I need it, I'll be able to find the answer... and someone else might find it useful as well.

