In your code, the multiplication part of the computation is being repeated over and over again as you select different index combinations. Perhaps you could speed things up, even in your mex method, by doing things in the following way. (I am assuming phi is a column vector: one column, with the same number of rows as there are rows and columns in p.)
First, compute, one time only, the matrix
A = (phi*phi').*p;
Then, for each new indn vector as it comes along, do only this:
rtn = sum(sum(A(indn,indn)));
or
rtn = sum(reshape(A(indn,indn),[],1));
whichever is faster. This step requires only addition operations rather than the multiplications plus additions required for matrix multiplication, and gives the same result.
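As a quick check of the idea above, here is a minimal sketch. It assumes the repeated computation is the quadratic form phi(indn)'*p(indn,indn)*phi(indn), that phi is real, and that A holds the entries phi(i)*p(i,j)*phi(j). The sizes, the random data and the names rtn_direct, rtn_sum and rtn_reshape are made up for illustration only.

```matlab
% Hypothetical test data: n, phi, p and indn are illustrative only.
n = 6;
rng(0);                       % fixed seed so the check is repeatable
phi = rand(n,1);              % column vector, as assumed in the answer
p = rand(n);                  % n-by-n matrix

% One-time precomputation (assumed form): A(i,j) = phi(i)*p(i,j)*phi(j)
A = (phi*phi').*p;

% One index combination
indn = [1 3 4];

% Direct quadratic form (the assumed original computation)
rtn_direct = phi(indn)'*p(indn,indn)*phi(indn);

% Addition-only alternatives using the precomputed matrix
rtn_sum = sum(sum(A(indn,indn)));
rtn_reshape = sum(reshape(A(indn,indn),[],1));

% The three values should agree up to floating-point rounding
fprintf('%g %g %g\n', rtn_direct, rtn_sum, rtn_reshape);
```

If phi were complex, the conjugation in phi' would have to be handled consistently on both sides, so treat this as a sketch for the real case only.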
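Also, in gammainc, use the 'upper' option instead of doing Sn1=1-Sn1. It computes the upper tail directly, which avoids the loss of precision from the subtraction when Sn1 is close to 1.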
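To illustrate the gammainc point, here is a small sketch. The argument values x and a and the variable names are hypothetical, since I don't know what your actual call looks like. For a = 2 the upper tail has the closed form exp(-x)*(1+x), which gives something to compare against.

```matlab
% Hypothetical arguments, chosen so the lower tail is very close to 1
x = 50;
a = 2;

% Subtraction after the fact: digits are lost when gammainc(x,a) is near 1
Sn1_subtract = 1 - gammainc(x,a);

% Upper tail computed directly
Sn1_upper = gammainc(x,a,'upper');

% Closed form of the upper tail for a = 2, used only as a reference here
Sn1_exact = exp(-x)*(1+x);

fprintf('subtract: %g\nupper:    %g\nexact:    %g\n', ...
        Sn1_subtract, Sn1_upper, Sn1_exact);
```

With values like these the subtracted version can no longer resolve the tiny tail, while the 'upper' call should stay close to the closed form.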