# What is missing from MATLAB #2 - the next decade edition

Rik on 31 Jul 2020
Edited: Bruno Luong on 18 Nov 2020
Meta threads have a tendency to grow large. This has happened several times before (the wishlist threads #1 #2 #3 #4 #5, and 'What frustrates you about MATLAB?' #1 and #2).
No wonder that a thread from early 2011 has also kept growing. After just under a decade there are (at time of writing) 119 answers, making the page slow to load and navigate (especially on mobile). So, after a friendly nudge, here is a new thread for the things that are missing from Matlab.
Same question: are there things you think should be possible in Matlab, but aren't? What things are possible with software packages similar to Matlab that Matlab would benefit from? (note that you can also submit an enhancement request through support, although I suspect they will be monitoring activity on this thread as well)
What should you post where?
Wishlist threads (#1 #2 #3 #4 #5): bugs and feature requests for Matlab Answers
Frustration threads (#1 #2): frustrations about usage and capabilities of Matlab itself
Missing feature threads (#1 #2): features that you wish Matlab would have had

madhan ravi on 31 Jul 2020
Thanks Rik!
Walter Roberson on 31 Jul 2020
Thanks, Rik!

Shae Morgan on 31 Jul 2020
facet_wrap or facet_grid (or general ggplot2 functionality) version of subplots, or some altered, simpler customizability for subplotting subsets of data.
gramm is an available toolbox, but it'd be nice to have it built in
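For comparison, a rough approximation of facet_wrap with built-in functions is sketched below. It assumes R2019b or later (for tiledlayout and nexttile), and the data and variable names are made up for illustration. It shows the manual loop that a built-in facet function would replace, not an equivalent of ggplot2.

```matlab
% Made-up data: three groups with two points each
group = [1 1 2 2 3 3]';
x = [1 2 1 2 1 2]';
y = [3 4 5 7 2 1]';

[ids, labels] = findgroups(group);   % ids run from 1 to numel(labels)
fig = figure('Visible', 'off');      % hidden so the sketch also runs headless
t = tiledlayout(fig, 'flow');        % grid size is chosen automatically
for k = 1:numel(labels)
    ax = nexttile(t);                % one tile (facet) per group
    plot(ax, x(ids == k), y(ids == k), 'o-');
    title(ax, sprintf('group = %d', labels(k)));
end
```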

Rafael S.T. Vieira on 31 Jul 2020
Edited: Rafael S.T. Vieira on 31 Jul 2020
I would love to have a command TeX2sym and sym2TeX. With it, we could bring formulas from LaTeX and run them in MATLAB and the other way around. It is tedious work doing it by hand, and I could bet most MATLAB users do it or will do it eventually.
Another useful feature to add would be arbitrary-precision arithmetic. Languages such as Python and Java (with Bignum) allow unlimited precision when dealing with integers at least. Granted, it is slow, but I believe that the MATLAB team could do something better. And I honestly feel like MATLAB is missing a feature by not having it; even some competing software has it.
And finally, I was considering buying the Text Analytics Toolbox, and it would be nice if it could have a grammar/spell checker, even if it is not as advanced or doesn't contain all words. With live scripts, we can write an interactive document, so it would be nice if MATLAB could correct our spelling (even if it required a toolbox to do it).

Walter Roberson on 31 Jul 2020
For example
• sin x+y -> %observed in the wild. People are likely to interpret as sin(x)+y . Not recommended latex style for trig functions
• \sin x+y -> %is it sin(x+y) or sin(x)+y? People are likely to interpret as sin(x)+y
• \sin{x+y} -> %sin(x+y) but people are likely to guess sin(x)+y without being sure
• \sin{x}+y -> %sin(x)+y but people are likely to guess sin(x)+y without being sure
• sin(x+y) -> %sin(x+y). Not recommended latex style for trig functions
• \sin(x+y) -> %sin(x+y). Better latex syntax for trig functions
• \sin\left( x+y \right) -> %sin(x+y) . Considered better latex style because it allows latex to match heights of () when internal content is varying sizes, but has too much space
• mx+b -> %m*x+b which people understand from long convention. But it could be a variable named mx that is being added to b . This is, however, the primary recommended latex style
• m.x+b -> %m*x+b but not recommended latex style. However, lower dot for multiplication tends to show up in papers that have matrix multiplication
• m*x+b -> %m*x+b but actively recommended against latex style
• {m}{x}+b -> %m*x+b which people understand from long convention, but visually it could be a variable named mx that is being added to b. Easier for mechanical interpretation but not typical latex style
• m{\cdot}x+b -> %m*x+b understandable both mechanically and human. One of the recommended latex styles
• m{\times}x+b -> %m*x+b understandable both mechanically and human, but less common in practice. Considered to be one of the valid latex styles if needed
• f{\textrm{\"{o}}}rl{\textrm{\aa}}t -> %a single word. Using {} inside an expression does not reliably indicate multiplication. And it was harder to get the special characters to work here than it should have been
Implied multiplication without any syntactic separation is very common in mathematics and latex.
The above shows just some of the ways that simple operations can be written in Latex, and tex2sym would have to try to understand them all.
There is also the issue that a lot of latex uses constructs that MATLAB does not support, especially \usepackage and amsmath mode.
madhan ravi on 31 Jul 2020
Wow , deep sir Walter!
Rafael S.T. Vieira on 1 Aug 2020
Thank you for your interest, Walter and Rik. I believe that it is better to teach ourselves to code in a particular way, which will allow MATLAB to convert it to sym, than to do the task by hand every time.
Operator precedence could take care of some of these issues. If \sin x + y was the input, then left-to-right precedence could dictate the output to be sin(x) + y. And to obtain sin(x+y), we would have to code in LaTeX \sin{x+y} or \sin(x+y).
Implied multiplication is indeed very common in mathematics and latex. On the other hand, it is also almost ubiquitous that variables are written just as single letters (especially if we are writing implied multiplication).
Finally, a command like tex2sym does not need to contemplate every math package imho, just some set of commands and macros. MATLAB could even return its best guess, and let us do the remainder of the task. Of course, ideally, we would just copy the contents from environments like  and $...$ and paste them into MATLAB for a tex2sym conversion.
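To make the "restricted set of commands" idea concrete, here is a toy sketch of the precedence rule discussed above, written as a few regular-expression rewrites. The name tex2str and the rule set are assumptions for illustration only. It handles just \sin, \cdot and \times, and it is nowhere near a real LaTeX parser.

```matlab
% Toy LaTeX-to-MATLAB text rewrite; the rules and the name tex2str are
% illustrative assumptions, not an existing MATLAB function.
patterns = { ...
    '\\sin\s*\{([^{}]*)\}', ...   % \sin{x+y}  -> sin(x+y)
    '\\sin\s+(\w)', ...           % \sin x     -> sin(x), binds to one letter only
    '\\sin\s*\(', ...             % \sin(x+y)  -> sin(x+y)
    '\{?\\(cdot|times)\}?'};      % {\cdot} or \times -> *
replacements = {'sin($1)', 'sin($1)', 'sin(', '*'};

% regexprep applies each pattern in turn to the result of the previous one
tex2str = @(tex) regexprep(tex, patterns, replacements);

s1 = tex2str('\sin x + y');    % 'sin(x) + y'
s2 = tex2str('\sin{x+y}');     % 'sin(x+y)'
s3 = tex2str('m{\cdot}x+b');   % 'm*x+b'
```

With Symbolic Math Toolbox installed, the resulting text could then be handed to str2sym. Implied multiplication such as mx+b is deliberately left untouched here, since that is exactly the ambiguous case raised earlier in this thread.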

dpb on 10 Sep 2020
Editor won't restrict a substitution to selected area...best it knows of is the function. Uncool in the max! I'm now having to fix up a bunch of stuff shouldn't have to have done... :(

Walter Roberson on 13 Sep 2020
A "soft interrupt" -- that is, an interrupt that can be handled with try/catch .
For example I am running a long calculation at the moment that iterates several times, and each iteration can take more than an hour. I don't always have the patience to wait through as many iterations as I asked for. For my purposes, the output of the last successful iteration would be "good enough" if I were to ask to interrupt the function.
I can control-C, but that interrupts the function completely, losing all of the outputs, and losing the function variables.
If I could somehow "soft interrupt" and have it return the current values of all variables, then that would be good enough for my current task. But the generalization of that would be the ability to catch a soft interrupt in a try/catch so that the code could make whatever final summarization it needed to in order to create usable output variables. Furthermore, even though my own routine might be happy to return the "last good" values of the output variables, any soft-interrupt I requested might well get received while some lower-level routine had control that did not know about soft-interrupts, so I would want interrupt (with no return values) to apply to those layers until control reached my handler for a clean run-down.

Bruno Luong on 14 Sep 2020
Just program an onCleanup to save whatever needs to be saved. Ctrl-C does the rest.
What you propose is a dream, but how to deal with variables/outputs not assigned and a function interrupted at an unknown state?
Only the programmer can know his workflow. So it is not possible to implement this lazy feature IMO.
Walter Roberson on 14 Sep 2020
onCleanup would have to either save() or store into global variables. The point is to be able to have a smooth return of values.
Outputs not assigned would still generate the same error messages as at present.
Function interrupted at unknown state is no worse of a problem than functions that error and are caught by try/catch at present.
It is common for programming systems to have a control-C handler .
Sindar on 3 Oct 2020
My solution in a similar case was to check whether a certain file ('FAIL_MODE.txt') exists in the directory at the end of each iteration. If it does, break the loop. Then, to stop the program, all I had to do was create the file (which is pretty trivial unless Matlab has frozen the whole computer)
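A minimal sketch of that flag-file approach follows. The file name comes from the post above. Its location in tempdir, the stand-in computation, and the simulated "user creates the file" step at iteration 3 are assumptions that keep the example self-contained.

```matlab
% Flag-file "soft interrupt": the loop stops cleanly after the current
% iteration once the file exists, and all results so far are kept.
stopFile = fullfile(tempdir, 'FAIL_MODE.txt');   % location is an assumption
if exist(stopFile, 'file') == 2
    delete(stopFile);                 % start from a clean state
end

results = [];
for k = 1:10
    results(end+1) = k^2;             % stand-in for an hour-long iteration
    if k == 3
        fclose(fopen(stopFile, 'w')); % simulates the user creating the file
    end
    if exist(stopFile, 'file') == 2   % checked once per iteration
        break
    end
end
delete(stopFile);                     % tidy up the flag file
```

The check only happens between iterations, so this does not help when a single iteration hangs. That is the gap the proposed soft interrupt would fill.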

Tim on 5 Nov 2020
The following changes to the internal volume renderer would make Matlab much more useful for volume visualization:
• True RGB volume visualization (vs. just scalar data + colormap)
• Independent, decoupled specification of alpha values and intensity values
• The ability to combine volume images with standard axes and axes objects like points, lines, patches, etc.

Mario Malic on 13 Nov 2020
When typing code, if the user wants to reference a variable or a function, one can press Tab and get a matching list of functions and variables. Would it be useful to split this functionality with a modifier if the user wants only one of the two? As an example, Shift+Tab for variables and Tab for functions and variables (so as not to break the current functionality).

Mikhail on 2 Aug 2020
Some (well, most) dynamic programming languages allow an experienced developer to have an inside look into how their code is actually executed by the runtime environment.
They are able to see the parse tree and the compiled byte code.
They are able to see what parts of code are JIT-compiled, and how exactly they are compiled. When something is not compiled, they are able to see why.
The developer doesn't have to guess whether a particular optimization has kicked in or not. They know for sure how each and every object in their code is handled (whether it is CoW, passed by reference, passed by value).
I'd love to see these capabilities in MATLAB, too.

Walter Roberson on 2 Aug 2020
The parse tree can be accessed through the undocumented mtree() function https://www.mathworks.com/matlabcentral/answers/180048-list-built-in-commands-used-by-m-function#answer_169083
Mikhail on 2 Aug 2020
> I tell people to use readfile(filename_or_url) and use that result, instead of telling them how to figure out if a file is UTF8 or ANSI and which exact conversion that requires for their specific OS and version of Matlab/Octave.
> ...
> The function is not suddenly horrible because you can't have a peek at the implementation and optimize your workflow accordingly. Why would the language as a whole be any different?
Given that I can't peek at the function's implementation, this function suddenly becomes horrible when I have a file double encoded to UTF-8 (and so your function returns garbage) and you considered your users too dumb to allow them to mend this function for their needs.
Same goes for the language.
Rik on 2 Aug 2020
That doesn't make the function horrible, it just means the function is not suited to your needs. I don't complain that my hypothetical electric stove can't cool things down, even though with a TEC it is possible to both heat and cool down things with a single electrically powered device.
Although you could re-use a lot of the internals of readfile to fix double UTF-8 encoded files, the problem is not the function, it's the double encoding. Fixing double encoding is a different task from reading a file correctly. It wouldn't be a matter of 'mending my function', it would be re-using internal functions for a different goal. The word 'mending' implies my function is broken, but returning a double encoded file in a corrected form is not living up to the contract that the name of the function is offering.
Also, there are more reasons to close down the source than just thinking your users are dumb (suggesting otherwise sounds a lot like assuming bad faith). As an example: if you want to provide a function for a license fee, it would be plain stupid to have your license check in an m-file, since it would be almost trivial to circumvent.

Seth Wagenman on 31 Aug 2020
Ability to convert Python items to MATLAB data types (other than numpy arrays) inside of MATLAB, rather than inside of Python using the admin-rights-required API: https://www.mathworks.com/help/matlab/matlab_external/install-the-matlab-engine-for-python.html?s_tid=srchtitle

dpb on 6 Sep 2020
Edited: dpb on 6 Sep 2020
A ready-built insertion method for tables, arrays, etc., so don't have to do all the under-the-hood indexing manually...given an array of the right size and matching index, if had table (say), then
tMyTable=insert(tMyTable,indices,tNewData);
ideally, an option with keyword 'Empty', T|F would be available as well.
Maybe I'm missing some magic pixie dust, but I've been unable to figure out a way to do this without physically moving a section at a time to insert the new row, which means working from rear to front to avoid changing the indices to the insertion points, or catenating the pieces starting from the front.

Walter Roberson on 6 Sep 2020
You can do it without moving a section at a time by building a cumulative destination index that increases by the size of each gap at the appropriate places (though I need to think more about how to vectorize this step.) Then you assign the existing values into a new variable, destination indexed by the cumulative index. Then you can assign the new values into the gaps with a single assignment statement using a destination that is just the "holes".
Not sure exactly how to vectorize the index construction... but the data moving itself does not have to be done a section at a time.
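One possible way to vectorize the cumulative destination index is sketched below for a plain numeric column. The variable names and the "insert before original row" convention are assumptions. The new rows must be listed in the same order as the sorted insertion points.

```matlab
A       = (1:5)' * 10;     % existing rows: 10 20 30 40 50
newRows = [-1; -2];        % rows to insert
before  = [2; 4];          % newRows(i) goes in front of original row before(i)

n = size(A, 1);
m = numel(before);

% Number of insertions at or before each original row, accumulated
shift   = cumsum(accumarray(before, 1, [n 1]));
destOld = (1:n)' + shift;                  % where each existing row lands
destNew = setdiff((1:n+m)', destOld);      % the remaining "holes"

out = zeros(n + m, 1);
out(destOld) = A;          % one assignment moves all existing data
out(destNew) = newRows;    % one assignment fills the gaps
```

Because accumarray counts repeated entries, several rows inserted at the same position should also work, as long as newRows is ordered to match.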
dpb on 6 Sep 2020
"what would the Empty option do?"
Insert blank record(s). With table you have to construct a record of the right size and type for each column and with the matching variable name; would just be "syntactic sugar" to have the routine automagically handle that as well. (Probably, the keyword is enough, likely the only use would be with the value 'T')
Useful if you know where the new record(s) goes(go) but don't yet have the data for it(them) at hand but will be delivered on the next train to arrive at the station. Or, if only have part of the full record with the rest not arriving until the container ship from China gets to port. Insert the blanks then fill in the fields later.
Possibly there's an 'Incomplete' option that would be the composite of the two steps for the partial data in hand case?
Not that big a deal for an array of native type, agreed, but have run into the case enough times with tables for the thought should be a builtin -- at least for the table/timetable/timeseries(?) classes.
dpb on 6 Sep 2020
Yeah, that's undoubtedly a better way on the construction, Walter -- I hadn't really thought about the implementation too much other than going "dead ahead" to "just get 'er done!" for the immediate task at hand.
Taking a break, I thought I'd throw it out as an idea for an enhancement for discussion...with the sidelight maybe somebody did have a "trick" hadn't thought of.

Walter Roberson on 9 Sep 2020
I know I've said it before, but it's still missing and still important:
We need a way to gather all of the outputs of a function call into a cell in the middle of an expression.
I know this may be tricky to implement internally. There are internal rules that are hard to work out, that have to do with how many outputs to request from functions. For example,
[A, B] = cellfun(@C, D, 'uniform', 0)
somehow passes knowledge of "two outputs" to C -- for example if you were to use @max then A would be a cell array of the maxima and B would be a cell array of the indices. The situation can get more complicated than this, and figuring out all the cases can make your head hurt. But we do know that in any expression C(D(E)), D(E) will be evaluated asking for exactly one output and that would be passed into C... but the knowledge of multiple outputs would be passed to C and yet not D.
The number of outputs to use is not inherently clear. If for example you call ode45 and ask to gather the outputs, are you asking for the common TOUT, YOUT case, or the full TOUT,YOUT,TE,YE,IE ? There are some cases where extra outputs can be expensive to calculate, so even though an operation that gathered "maximum" outputs might be useful, it is not always desirable, so the ability to select the number would be useful.
Then there are issues with, for example, deal(), where you can have any number of outputs with just a single input:
[A,B,C,D] = deal(123)
would initialize A, B, C, and D all to 123. So if you ask to gather "all" of the outputs from deal(123), that number is not well defined.
Working these things out is not trivial -- but it is a really missing bit of the language.
There might be an opportunity for a syntax such as {}name -- e.g.,
arrayfun(@(X0) {}ode45(@f, tspan, X0), x0s)
meaning to gather all of the outputs of the call. At present, {} is not valid before a name.

Bruno Luong on 9 Sep 2020
Why nargout doesn't meet what you ask?
function varargout = foo(varargin)
    varargout = cell(1,nargout);
    [varargout{:}] = deal('dummy');
    fprintf('request with %d outputs\n', nargout);
end
Test
>> [A]=cellfun(@foo, {1 2 3}, 'UniformOutput', false);
request with 1 outputs
request with 1 outputs
request with 1 outputs
>> [A,B]=cellfun(@foo, {1 2 3}, 'UniformOutput', false);
request with 2 outputs
request with 2 outputs
request with 2 outputs
>>
Walter Roberson on 10 Sep 2020
Bruno,
Imagine that you want to implement
[temp{1:3}] = ndgrid(-1:.1:1);
C = cell2mat(cellfun(@(M) M(:), temp, 'uniform', 0));
That is, you want a single array in which each of the columns is one of the outputs of ndgrid(). And you want to do it as a single expression.
ndgrid() does not have a fixed number of outputs -- it is not like sin() with one fixed output, or max with two fixed outputs. nargout(@ndgrid) is -1 -- in other words the declaration is like
function varargout = ndgrid(varargin)
If you wanted to capture all of the outputs of max() then you could query nargout(@max) to get 2:
function outputs = gather_outputs(f, varargin)
n = nargout(f);
[outputs{1:n}] = f(varargin{:});
end
and you could
gather_outputs(@max, rand(3,5))
and this would be fine for gathering the two outputs of max into a single cell array.
But if we try to
gather_outputs(@deal, [])
then nargout(@deal) is -1 and if we said "Okay, take the absolute value of that -1" then we would be doing
[outputs{1:1}] = deal([])
which would give you just {[]} as the output.
This shows that you cannot just look at nargout() of the function you are invoking, such as @max or @deal .
Can we just look at nargout of the overall expression to determine the number of outputs to use for deal? No,
C = cell2mat(cellfun(@(M) M(:), gather_outputs(@ndgrid, -1:.1:1), 'uniform', 0));
would at best tell you that nargout is 1 (the C variable). You need something else to tell you the number of outputs you want to get -- something like
function outputs = gather_n_outputs(f, n, varargin)
[outputs{1:n}] = f(varargin{:});
end
and then you could
C = cell2mat(cellfun(@(M) M(:), gather_n_outputs(@ndgrid, 3, -1:.1:1), 'uniform', 0));
The request is to build this kind of facility in to MATLAB instead of having to write true functions like gather_n_outputs and have to pass function handles into them. Some syntax like
C = cell2mat(cellfun(@(M) M(:), {3}ndgrid(-1:.1:1), 'uniform',0));
where the hypothetical new syntax {3} indicates that you are requesting 3 outputs and that you want them gathered in a cell array. A common alternate syntax would be {} to request all of the outputs, like
C = cellfun(@(V) V.^2, {}max(rand(3,5)), 'uniform', 0)
which would hypothetically gather both (all) outputs of the max() call into a cell array that would then be available for further processing.
If you just used
C = cellfun(@(V) V.^2, max(rand(3,5)), 'uniform', 0)
then the knowledge of the single output would be carried through into the cellfun, which would tell max() to only emit a single output, so you would not get the second output processed. (And the first output would be numeric, not cell, so you would need arrayfun instead of cellfun.)

Walter Roberson on 12 Oct 2020
A "select" clause for readtable() and kin.
For example one user only wanted to read the rows in which one particular variable had a particular value
It could at some point be implemented in terms of a rowfun() type function that got passed the variables for a row and could make arbitrary decisions based upon the row contents.
However an earlier stage could potentially have pairs of variable names (or numbers) and a vector or cell array of values, in which the selection code did an ismember(). Such a facility could be further improved if there were a "sorted" option (so the code could figure out when to give up looking -- any one value should no longer be looked for if a larger value were encountered). Or even "grouped", which would not imply sorted as such but would imply that when you find a value that all the instances of that value that exist will be in adjacent rows and so as soon as you detect a change you can know to give up looking for that value.
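For reference, the read-everything-then-filter workaround that such a "select" option would replace looks roughly like this. The file contents and variable names are made up for the sketch. The ismember() call is the part the proposal would push into the reader itself.

```matlab
% Write a small made-up CSV so the example is self-contained
T = table([1; 2; 3; 4], {'a'; 'b'; 'a'; 'c'}, ...
          'VariableNames', {'id', 'group'});
file = [tempname '.csv'];
writetable(T, file);

% Current approach: the whole file is read into memory first ...
R = readtable(file);
% ... and only afterwards reduced to the wanted rows
wanted = {'a'};
R = R(ismember(R.group, wanted), :);

delete(file);   % clean up the temporary file
```

For a large file, most of the rows parsed here are thrown away immediately, which is the cost a built-in selection (especially with a "sorted" or "grouped" hint) could avoid.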

Sindar on 5 Nov 2020
Ok, this is pretty minor, but:
For common functions that return values as the first output and indices as the second, it would be nice if there was a direct way of getting the indices, so I could do things like this:
x = [1 2 5 4 5];
y = 1:5;
y_xsorted = y(sortInd(x));
% or
y_xunique = y(unique(x,'ia'));
instead of needing to create temporary variables:
x = [1 2 5 4 5];
y = 1:5;
[~,idx] = sort(x);
y_xsorted = y(idx);
[~,idx] = unique(x);
y_xunique = y(idx);
I know I could make wrappers myself, but this seems like a case where a builtin function could potentially be noticeably optimized. (And, is more straightforward than rebuilding the whole output system)

Bruno Luong on 6 Nov 2020
Implement a family of (numerical) data structures with O(1) insertion and removal, including linked lists, binary trees, Fibonacci trees, red-black trees, etc. Performance must be the focus point. I don't care if they are encapsulated or not in the OP, just don't let that reduce the performance.

Bruno Luong on 6 Nov 2020
Implement an equivalent to C inline functions (or macros) so that calling such a function on small data won't be penalized in speed by the call overhead.

Bruno Luong on 17 Nov 2020
Edited: Bruno Luong on 17 Nov 2020
No big deal, but I wish the SIGN(X) function could return 1 for X=0 instead of 0. Maybe implement it as an option so as not to break compatibility.
I just rarely use SIGN because of this exception choice.

Adam Danz on 17 Nov 2020
sign() returns the sign of -inf/+inf (-1/1) and returns NaN for NaN inputs.
Walter Roberson on 17 Nov 2020
It is common for people to write terms in the form
something + value * u[x-offset] %unit step function
and common for people to implement that either as
something + value * heaviside(x-offset) %OR
something + value .* (x>= offset)
and occasionally as
something + value .* (sign(x-offset)+1)/2
except wanting sign() to return 1 at 0 to achieve the same boundary condition.
And that is fine provided that value is finite when it is unselected, with non-finite being a problem because 0*inf -> nan, 0*-inf -> nan, 0*nan -> nan.
This is not a problem that can be cured by using the heaviside() or sign() functions or the >= operators... it needs something more like piecewise:
something + piecewise(x >= offset, value, 0)
It is not a problem in theoretical mathematics because Heaviside is more a distribution than a function (just like Dirac Delta is a distribution) and Limit of the distribution acts to remove the problem.
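A small numeric illustration of the 0*Inf issue, using made-up values: multiplying by a logical mask propagates NaN where the unselected value is non-finite, while selecting by index (the numeric counterpart of the piecewise idea) does not.

```matlab
x      = [-1 0 1];
offset = 0;
value  = [Inf 2 3];     % non-finite exactly where the step is "off"

% Mask by multiplication: Inf * 0 gives NaN in the first element
viaMask = value .* (x >= offset);

% Mask by selection: unselected entries are never multiplied
sel = x >= offset;
viaSelect = zeros(size(x));
viaSelect(sel) = value(sel);
```

Here viaMask is [NaN 2 3] while viaSelect is [0 2 3].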
Bruno Luong on 18 Nov 2020
Why I prefer sign(0) = 1?
To me y = sign(x) should satisfy these two important properties
norm(y) = 1
x = y * abs(x)
This uniquely determines y for any x ~= 0 (real, complex or even quaternion). For x=0, any y=exp(1i*alpha) with alpha real meets both properties, so just select y=1 as an arbitrary convention.
However the MATLAB SIGN() function does NOT satisfy the first property for x=0.
Adam's trick sign(x+realmin) is interesting, but it just shifts the issue to x=-realmin.
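A one-line workaround with the requested convention can be sketched as follows. The name sign1 is hypothetical, and this is only a user-level wrapper, not the optional behaviour of SIGN asked for above.

```matlab
% sign1 is a hypothetical helper: same as sign(), except sign1(0) is 1
sign1 = @(x) sign(x) + (x == 0);

y = sign1([-2 0 3]);      % real input, including zero
z = sign1(3 + 4i);        % complex input is passed through sign() unchanged

% Both properties from the post hold for these values, including at x = 0
x = [-2 0 3];
unitModulus = all(abs(y) == 1);
reconstructs = isequal(x, y .* abs(x));

% The realmin trick only moves the exception: this evaluates sign(0)
shifted = sign(-realmin + realmin);
```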