MATLAB Answers

Return (large) unchanged mxArray from MEX

Mark on 27 May 2013
Commented: James Tursa on 25 Jan 2018
Hi all,
When a large array is passed to a C++ MEX function, under certain conditions the original data is not manipulated at all. Can someone please provide an example of how to properly return the original array in the most efficient manner?
Essentially, this is logically plhs[0] = prhs[0].
Many thanks, Mark


Accepted Answer

James Tursa on 27 May 2013
Edited: James Tursa on 7 Nov 2013
plhs[0] = prhs[0] is perfectly OK in mex programming. MATLAB always (EDIT: NOT TRUE ... SEE BELOW) makes shared data copies of the plhs[ ] mxArrays and actually returns the shared data copies. To see this is true you can do this:
1) Compile this code (call it plhs_eq_prhs.c):
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    if( nrhs ) {
        plhs[0] = prhs[0];
        mexPrintf("The mxArray structure address of the input is %p\n",prhs[0]);
    }
}
2) Run the following commands at the MATLAB prompt:
>> format debug
>> a = 1:3
a =
Structure address = 46cdda0
m = 1
n = 3
pr = 16c3a440
pi = 0
1 2 3
>> b = plhs_eq_prhs(a)
The mxArray structure address of the input is 046CDDA0
b =
Structure address = 495aea0
m = 1
n = 3
pr = 16c3a440
pi = 0
1 2 3
You can see that the mxArray structure address of the variable b is different from the mxArray structure address of the input variable a, but the data pointers pr are the same. So MATLAB has returned a shared data copy back to the workspace.
You can get almost the same result by using the following:
mxArray *mxCreateSharedDataCopy(mxArray *); // supply the prototype
:
plhs[0] = mxCreateSharedDataCopy(prhs[0]);
However, in this case an extra temporary shared data copy of prhs[0] is created and then destroyed when the mex routine returns to the caller, so it is slightly less efficient than doing plhs[0] = prhs[0] directly.
The general behavior of the MATLAB API functions (the documented ones) is:
- All of the mxCreateEtc type of functions create temporary mxArrays that are put on the Variable Array List (my terminology).
- When the mex routine returns control to the caller, MATLAB makes shared data copies of the plhs[ ] variables and returns those, then everything on the Variable Array List is destroyed.
- Using mexMakeArrayPersistent removes a mxArray variable from the Variable Array List. Once a variable is removed from this list, I am unaware of any function (documented or undocumented) that will get it back on this list. So it is up to you, the programmer, to destroy it at some point in your code or you will have a permanent memory leak that can only be recovered by quitting and restarting MATLAB.
- There is a companion function mexMakeMemoryPersistent for memory allocations made with mxMalloc and friends, but in this case there is an undocumented function you can use to get it back on the garbage collection list if you want.
- There are undocumented API functions that create mxArray variables. Some of them create temporary mxArray variables that are on the Variable Array List (i.e., scheduled for garbage collection), but others create normal variables that are not on the Variable Array List. So you need to be careful when using them to avoid permanent memory leaks. mxCreateSharedDataCopy is an example of one of these undocumented API functions ... it happens to create a temporary mxArray that is on the Variable Array List (scheduled for garbage collection).
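The ownership rules in the list above can be sketched as a small standalone C model. This is only an illustration of the bookkeeping as described in this answer, not MATLAB's actual implementation: ToyArray, ToyTempList and all toy_* functions are hypothetical names invented for the sketch, and no MEX API is called.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAX_TEMPS 16

/* Hypothetical stand-in for an mxArray: a data buffer and a length. */
typedef struct {
    double *data;
    size_t  n;
} ToyArray;

/* Hypothetical stand-in for the per-call list of temporary arrays. */
typedef struct {
    ToyArray *items[TOY_MAX_TEMPS];
    size_t    count;
} ToyTempList;

void toy_list_init(ToyTempList *list)
{
    assert(list != NULL);
    list->count = 0;
}

/* Plays the role of an mxCreate* call in this model: allocate an array and
   register it as a temporary. Returns NULL if the list is full or
   allocation fails. */
ToyArray *toy_create(ToyTempList *list, size_t n)
{
    ToyArray *a;
    assert(list != NULL);
    if (list->count == TOY_MAX_TEMPS) return NULL;
    a = malloc(sizeof *a);
    if (a == NULL) return NULL;
    a->data = calloc(n ? n : 1, sizeof *a->data);
    if (a->data == NULL) { free(a); return NULL; }
    a->n = n;
    list->items[list->count++] = a;
    return a;
}

/* Free one array; the caller must do this for persistent arrays. */
void toy_destroy(ToyArray *a)
{
    if (a != NULL) { free(a->data); free(a); }
}

/* Plays the role of mexMakeArrayPersistent in this model: take the array
   off the list. Returns 1 if it was found and removed, 0 otherwise. */
int toy_make_persistent(ToyTempList *list, const ToyArray *a)
{
    size_t i;
    assert(list != NULL);
    for (i = 0; i < list->count; i++) {
        if (list->items[i] == a) {
            list->items[i] = list->items[--list->count];
            return 1;
        }
    }
    return 0;
}

/* Models the return to the caller: everything still on the list is
   destroyed. Returns the number of arrays destroyed. */
size_t toy_cleanup(ToyTempList *list)
{
    size_t destroyed = 0;
    assert(list != NULL);
    while (list->count > 0) {
        toy_destroy(list->items[--list->count]);
        destroyed++;
    }
    return destroyed;
}
```

In this model, an array that was made persistent survives toy_cleanup and leaks unless the caller eventually passes it to toy_destroy, which mirrors the warning about permanent memory leaks above.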

  3 Comments

Jan on 28 May 2013
See my response in the [EDITED] section of my answer.
angainor on 6 Nov 2013
Unfortunately, it seems that you cannot assign the input to the output in all cases. I have found a simple test case that reliably crashes MATLAB (2011b and 2013a) when you do plhs[0] = prhs[0] in the mex file. You can trigger it by passing a structure field (or an individual cell from a cell array). Consider this example:
>> format debug
>> a.field=[1 2 3];
>> a.field
ans =
Structure address = 7f94e8259570
m = 1
n = 3
pr = 7f94ea412640
pi = 0
1 2 3
>> out=plhs_eq_prhs(a.field)
The mxArray structure address of the input is 0x7f94e825f9c8
out =
Structure address = 7f94e8264968
m = 1
n = 3
pr = 7f94ea412640
pi = 0
1 2 3
>> plhs_eq_prhs(a.field)
The mxArray structure address of the input is 0x7f94e82660f8
ans =
Structure address = 7f94e82660f8
m = 1
n = 3
pr = 0
pi = 0
(SEGFAULT)
The last execution fails. Note that in this case the mxArray address on input is THE SAME as that on the output, which is contrary to your observations. Apparently, (ONLY?) in the absence of an output variable and when a structure field is being passed, MATLAB copies the field into another temporary mxArray, which is then returned directly to the environment, and freed.
This does not happen when you use mxCreateSharedDataCopy instead:
pargout[0] = mxCreateSharedDataCopy(pargin[0]);
James Tursa on 7 Nov 2013
Ah yes ... I forgot about that case. Thanks for the correction. It is possible to pass in a temporary variable to a mex routine. You can get it as a field reference or cell element reference as you mention, but you can also get it with any other expression that you put in the argument list (e.g., adding two variables together). Apparently it can get treated differently in the plhs return logic, possibly related to the fact that it is not on the mex routine's garbage collection list. And, of course, there is no official way to detect this in a mex routine. So the bottom line is to use mxCreateSharedDataCopy as you suggest.


More Answers (4)

Jan on 27 May 2013
Edited: Jan on 28 May 2013
"plhs[0] = prhs[0]" is forbidden in MEX programming, because re-using the inputs as outputs collides with the copy-on-write mechanism. Unfortunately, the best (meaning documented and stable) method is to duplicate the array:
plhs[0] = mxDuplicateArray(prhs[0]);
And if you think that this is not efficient for large arrays: welcome to the team of MEX users who ask TMW to document the methods for creating shared data copies. Please send an enhancement request to TMW.
You can try it with the undocumented mxCreateSharedDataCopy(). But clean in-place processing would be much better.
[EDITED] James' explanation is correct and valuable. If you want the MEX function to create an unchanged shared data copy of the inputs, plhs[0]=prhs[0] is valid. I denied that because this is a very rare need for a MEX function, and I assumed that the contents of this variable must be changed inside the MEX function. Then this can be observed:
// file: TestMex.c
#include "mex.h" // Must be called with a DOUBLE as input
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
    plhs[0] = prhs[0]; // Ok until here
    *mxGetPr(plhs[0]) = -1.0; // Don't do this
}
Compile it and run:
a = 1:4;
b = a; % Shared data copy inside Matlab
c = TestMex(b); % Modified shared data copy inside MEX
c % -1 2 3 4 as wanted
b % -1 2 3 4 !!!
a % -1 2 3 4 !!!
This means that plhs[0]=prhs[0] is valid, but modifying the contents of plhs[0] is not, because it destroys the integrity of the input values. Therefore performing this copy inside a MEX function will not be useful for any real purpose.
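The aliasing shown above, and the copy-on-write step that avoids it, can be sketched in standalone C. This is a toy model under simplifying assumptions: ToyVar and the toy_* functions are hypothetical names, and a shared counter stands in for whatever bookkeeping MATLAB really uses. It only shows why an in-place write through one header changes every variable that shares the data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a variable header: a data pointer plus a
   counter that every header sharing the same data points at. */
typedef struct {
    double *data;
    size_t  n;
    int    *shares;
} ToyVar;

/* Allocate or abort; keeps the sketch free of error-handling noise. */
static void *toy_xmalloc(size_t bytes)
{
    void *p = malloc(bytes ? bytes : 1);
    if (p == NULL) abort();
    return p;
}

/* Create a zero-filled variable that owns its data. */
ToyVar toy_new(size_t n)
{
    ToyVar v;
    v.data = toy_xmalloc(n * sizeof *v.data);
    memset(v.data, 0, n * sizeof *v.data);
    v.n = n;
    v.shares = toy_xmalloc(sizeof *v.shares);
    *v.shares = 1;
    return v;
}

/* Like b = a at the MATLAB prompt: a new header that points at the same data. */
ToyVar toy_share(const ToyVar *src)
{
    ToyVar v = *src;
    ++*v.shares;
    return v;
}

/* The "don't do this" path: writes through the shared pointer. */
void toy_write_in_place(ToyVar *v, size_t i, double x)
{
    assert(i < v->n);
    v->data[i] = x;
}

/* Copy-on-write: detach from the other headers before writing. */
void toy_write_cow(ToyVar *v, size_t i, double x)
{
    assert(i < v->n);
    if (*v->shares > 1) {
        double *copy = toy_xmalloc(v->n * sizeof *copy);
        memcpy(copy, v->data, v->n * sizeof *copy);
        --*v->shares;                 /* leave the old sharing group */
        v->data = copy;
        v->shares = toy_xmalloc(sizeof *v->shares);
        *v->shares = 1;
    }
    v->data[i] = x;
}

/* Drop one header; the data is freed when the last sharer is released. */
void toy_release(ToyVar *v)
{
    if (--*v->shares == 0) {
        free(v->data);
        free(v->shares);
    }
    v->data = NULL;
    v->shares = NULL;
}
```

With a, b and c sharing one buffer, toy_write_in_place on c changes the value seen through a and b as well, while toy_write_cow on c leaves a and b untouched.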
What happens for:
format debug
a = 1:4;
a
a = TestMex(a)
format long g % clean up only
Now a is a shared data copy of itself. The structure address and pr are the same, but the increased reference counter might cause a memory leak. Unfortunately, my attempts to prove this have not been successful.
@Mark: I can unaccept my answer on demand.

  7 Comments

James Tursa on 30 May 2013
@Jan: The CrossLink field in the mxArray structure contains the next mxArray in the linked list of shared data copy variables (for later versions of MATLAB the list is a doubly linked list). When this field is NULL, there are no shared data copies around. The reference count also has to be checked (for reference copies existing in cell and struct arrays). When the CrossLink is NULL and the reference count is 0, the data memory can be freed when the variable is destroyed. This applies to most variables. For variables derived from the handle class, the rules are different (they are shared data copies without a CrossLink address).
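The rule described in this comment (free the data only when there is no CrossLink and the reference count is 0) can be sketched as a toy C model. ToyHeader and the toy_header_* functions are hypothetical names, the ring layout is an assumption made for the sketch, and the real mxArray header is undocumented and certainly differs in detail.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical header: data pointer, cross-links to other headers that
   share the data (NULL when unshared), and a count of extra references. */
typedef struct ToyHeader {
    double           *data;
    struct ToyHeader *next;
    struct ToyHeader *prev;
    int               refcount;
} ToyHeader;

/* Create an unshared header that owns n zero-filled elements. */
ToyHeader *toy_header_new(size_t n)
{
    ToyHeader *h = malloc(sizeof *h);
    if (h == NULL) return NULL;
    h->data = calloc(n ? n : 1, sizeof *h->data);
    if (h->data == NULL) { free(h); return NULL; }
    h->next = h->prev = NULL;
    h->refcount = 0;
    return h;
}

/* A header is shared exactly when its cross-link is set. */
int toy_is_shared(const ToyHeader *h)
{
    assert(h != NULL);
    return h->next != NULL;
}

/* Make a shared data copy: new header, same data, linked into the ring. */
ToyHeader *toy_header_share(ToyHeader *src)
{
    ToyHeader *copy;
    assert(src != NULL);
    copy = malloc(sizeof *copy);
    if (copy == NULL) return NULL;
    copy->data = src->data;
    copy->refcount = 0;
    if (src->next == NULL) {          /* first sharer: ring of two */
        src->next = src->prev = copy;
        copy->next = copy->prev = src;
    } else {                          /* insert right after src */
        copy->next = src->next;
        copy->prev = src;
        src->next->prev = copy;
        src->next = copy;
    }
    return copy;
}

/* Destroy a header. Returns 1 if the data buffer was freed as well. */
int toy_header_destroy(ToyHeader *h)
{
    int freed_data = 0;
    assert(h != NULL);
    if (h->refcount > 0) {            /* still referenced elsewhere */
        h->refcount--;
        return 0;
    }
    if (h->next == NULL) {            /* no sharers left: free the data */
        free(h->data);
        freed_data = 1;
    } else if (h->next == h->prev) {  /* ring of two: the other is now alone */
        h->next->next = NULL;
        h->next->prev = NULL;
    } else {                          /* unlink from a longer ring */
        h->prev->next = h->next;
        h->next->prev = h->prev;
    }
    free(h);
    return freed_data;
}
```

Destroying headers one by one only releases the data buffer on the last call, when the remaining header has no cross-link and no outstanding references.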
Jan on 30 May 2013
@James: And now I assume that the CrossLink and the reference counts are set correctly when a is replaced by a shared data copy of itself coming from a MEX function that ignores the const qualifier of the input. I trust TMW that the mechanisms are stable. But I also know that using undocumented methods carries a certain risk of evil bugs and incompatibilities.
James Tursa on 3 Jun 2013
@Jan: I pretty much am in agreement with you. The fact that the following statement
plhs[0] = (mxArray *) prhs[0];
overrides the const attribute of prhs[0], and the fact that the shared-data-copy method of returning plhs[ ] variables is not documented, is a bit unsettling as a programmer. However, I put it in the same category of using any of the other undocumented functions/features. E.g., I routinely use mxCreateSharedDataCopy because it has proven to be reliable and it works across multiple platforms and versions. No guarantee it will work in the future, but that doesn't stop me from using it ... the advantage is just too great not to in many cases.

Mark on 27 May 2013
Edited: Mark on 27 May 2013
Hi Jan,
Thanks for your reply. This is in fact exactly how I am handling it, but it really is a terrible situation to have to senselessly duplicate an array (with the associated overhead), especially when the arrays run to millions of observations.
I was hoping to be otherwise educated.
Mark
[Edit] In the event others search out this topic, here is a useful link: http://www.mathworks.com/support/solutions/en/data/1-6NU359/index.html

  4 Comments

Ian on 30 Oct 2017
This thread is a little old, but I hope you gurus are still monitoring it. Q: what about creating a variable that points somewhere in the middle of prhs[0]'s data? will this break matlab's memory management system? Specifically, consider a large matrix, and wanting to split it into 2 smaller parts for processing separately:
>> % in matlab:
>> clear
>> % mem usage is 884 MB
>> x = rand(1000000,100);
>> % mem usage is now 1.61 GB
>> y = x(:,51:100);
>> % mem usage is now 2.01 GB
Unfortunately it appears that matlab duplicated the second portion of x, although there's no real need to do so, since I'm not modifying the original data. It seems like I should be able to create a zero-sized matrix in mexFunction, resize it, and point to the middle of the array, and return it without duplicating the data
% in matlab:
>> y = my_subarray(x,51); % get sub-array of x starting at column 51
and something like this inside mexFunction my_subarray(...):
// in C++:
double *p = mxGetPr(prhs[0]);
mxArray *B;
B = mxCreateDoubleMatrix(0, 0, mxREAL);
mxSetM(B, 1000000);
mxSetN(B, 50); // columns 51..100
mxSetData(B, p+50*1000000);
plhs[0] = B;
But your examples above only deal with pointing to the entire original matrix. In my case, my need for this would not modify the sub-array, but it's not clear to me if this would break matlab's modify-on-write memory management...
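The pointer offset in the question comes from MATLAB's column-major storage: a block of whole columns is contiguous, and column j (1-based) of an m-row matrix starts (j-1)*m elements into the data. A minimal sketch of that arithmetic in plain C follows; the function names are hypothetical, and computing the offset says nothing about whether such a pointer may safely be placed in an mxArray.

```c
#include <assert.h>
#include <stddef.h>

/* Element offset of the first entry of column first_col (1-based)
   in a column-major matrix with m rows. */
size_t column_block_offset(size_t m, size_t first_col)
{
    assert(first_col >= 1);
    return (first_col - 1) * m;
}

/* Number of contiguous elements in columns first_col..last_col (inclusive). */
size_t column_block_length(size_t m, size_t first_col, size_t last_col)
{
    assert(first_col >= 1 && last_col >= first_col);
    return (last_col - first_col + 1) * m;
}
```

For the 1000000-by-100 matrix in the question, column 51 starts at offset 50*1000000, which matches the p+50*1000000 expression, and columns 51 to 100 span 50*1000000 elements.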
James Tursa on 30 Oct 2017
No, you cannot do this in a mex routine. In the first place, the following line will bomb MATLAB with an assertion fault since the p+50*1000000 address will not match anything that the MATLAB Memory Manager generated:
mxSetData(B, p+50*1000000);
Even if you could set this address into the mxArray data pointer (using older versions of MATLAB or via a mxArray hack), when MATLAB subsequently destroys this mxArray downstream it will cause a seg fault because that address will mess up the memory manager.
The only thing you could do is hack into the mxArray header itself to set the data pointer directly (avoiding the assertion fault that the official API functions would generate), use this mxArray ONLY inside the mex routine, then manually detach that data pointer inside the mex routine prior to destroying the mxArray. Under no circumstances could this mxArray, which is in an invalid state when the mid memory block data pointer is present, be returned back to MATLAB as one of the plhs[ ] variables. All that being said, creating and using this partial mxArray is NOT ADVISED, is VERY TRICKY to implement, and risks bombing MATLAB.
E.g., see this FEX submission by Bruno Luong:
James Tursa on 25 Jan 2018
Another option that was just posted to the FEX for getting shared data copies of contiguous sub-sections of an existing variable:

Mark on 30 May 2013
Edited: Mark on 30 May 2013
I must sheepishly admit that because Visual Studio intellisense kicked out the logical plhs[0]=prhs[0] I didn't think to try and compile. Shame on me really.
@Jan if you haven't an objection, I think indeed James has the higher value answer.

  3 Comments

James Tursa on 30 May 2013
For the record, I am not concerned about the "points". No need to change anything on my account.
Jan on 1 Jun 2013
@Mark: As long as my answer had no (non-trivial) comments, I could delete and resubmit it for un-accepting. But now I do not want to lose the discussion with James in the comments. I ask the admins to clear the flag.
@James: I think it is important that, for this non-trivial topic, readers can recognize the answer with the highest value. And your reputation can in fact not be increased by any kind of "points".
Mark on 2 Jun 2013
@Jan - I'm happy to elect whichever answer you think best resolves the question posed. To be fair, I did ask about C++ and relative to the specificity one may assert that your answer is the more rigorous as the direct assignment (plhs[n]=prhs[n]) fails to compile in C++ and the functions referenced with an extern do.
That said, I think the content of this topic outweighs any simple point credit assignment and believe others looking for a bit of clarity will find this all invaluable.
Again thank you and @James for your time contributing.
I would also encourage MW to adopt the functions formally with associated documentation.
I believe it to be a fundamental and necessary addition.

Mark on 30 May 2013
Edited: Mark on 30 May 2013
In the spirit of clarification, the same 'plhs[0] = prhs[0]' does not compile in a C++ implementation (unless I'm missing something obvious which is certainly possible).
[IGNORE] May I also beg the proper mex compile command to reference the undocumented function? I'm not sure what file holds this. I think others (including myself) will find this useful. The couple of guesses I've made haven't been fruitful. [/IGNORE]
Because I'm working in C++, I needed to add the "C" to the extern declaration:
extern "C" mxArray *mxCreateSharedDataCopy(const mxArray *pr);
Many thanks for this it really has been interesting. Mark

  2 Comments

James Tursa on 30 May 2013
I would suggest the following to make the code robust against C/C++ implementations:
#ifdef __cplusplus
extern "C"
{
#endif
    mxArray *mxCreateSharedDataCopy(const mxArray *pr);
    /* and any other prototypes for undocumented API functions you are using */
#ifdef __cplusplus
}
#endif
And for the assignment itself do an explicit cast to override the const:
plhs[0] = (mxArray *) prhs[0];
Mark on 31 May 2013
Thanks James. That will surely resolve some sloppy or lazy code at a later date no doubt.
Much appreciated on your responses and contributions.
