setdiff

(Not Recommended) Set difference for dataset array observations

The dataset data type is not recommended. To work with heterogeneous data, use the MATLAB® table data type instead. See MATLAB table documentation for more information.
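For readers migrating to tables, the following is a minimal sketch of the equivalent row-wise set difference using the table version of setdiff. The variable names and values are illustrative only, borrowed from the example later on this page.

```matlab
% Two small tables with the same variables (illustrative data).
A = table({'CLARK';'BROWN'}, [124;122], 'VariableNames', {'Name','SystolicBP'});
B = table({'BROWN';'MARTIN'}, [122;130], 'VariableNames', {'Name','SystolicBP'});

% Rows of A that do not appear in B; iA indexes those rows in A.
[C,iA] = setdiff(A,B);
```

Here C should contain the single CLARK row, because BROWN appears in both tables.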

Syntax

C = setdiff(A,B)

C = setdiff(A,B,vars)

C = setdiff(A,B,vars,setOrder)

[C,iA] = setdiff(___)

Description

C = setdiff(A,B) for dataset arrays A and B returns the set of observations that are in A but not B, with repetitions removed. The observations in the dataset array C are sorted.

C = setdiff(A,B,vars) returns the set of observations that are in A but not B, considering only the variables specified in vars, with repetitions removed. The observations in the dataset array C are sorted by these variables. The values for variables not specified in vars for each observation in C are taken from the corresponding observation in A. If there are multiple observations in A that correspond to an observation in C, those values are taken from the first occurrence.
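The effect of vars can be sketched with two small dataset arrays. The variable names ID and Score and all values below are made up for illustration; the comments state what the description above implies, not verified output.

```matlab
% A contains two observations with ID 2 but different Score values.
A = dataset({[1;2;2;3],'ID'}, {[10;20;21;30],'Score'});
B = dataset({[3;4],'ID'},     {[99;40],'Score'});

% Compare on all variables: no observation of A matches B on both
% ID and Score, so all four observations of A are returned.
Call = setdiff(A,B);

% Compare on ID only: ID 3 also occurs in B, so it is dropped.
% The repeated ID 2 collapses to one observation, and its Score is
% taken from the first occurrence in A (20, not 21).
Cid = setdiff(A,B,{'ID'});
```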

C = setdiff(A,B,vars,setOrder) returns the observations in C in the order specified by setOrder.
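As a brief sketch of the two ordering options, the following compares the default 'sorted' order with 'stable', passing [] for vars so that all variables are considered. The data and variable names are illustrative assumptions.

```matlab
% Observations of A are deliberately not in sorted order.
A = dataset({[3;1;2],'ID'}, {[30;10;20],'Score'});
B = dataset({2,'ID'},       {20,'Score'});

% Default: the result is sorted, so ID 1 comes before ID 3.
Csorted = setdiff(A,B);

% 'stable': the result keeps the order of appearance in A (3, then 1).
Cstable = setdiff(A,B,[],'stable');
```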

[C,iA] = setdiff(___) also returns the index vector iA such that C = A(iA,:). If there are repeated observations in A, then setdiff returns the index of the first occurrence. You can use any of the previous input arguments.
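The relationship C = A(iA,:) can be checked with a short sketch in which A contains a repeated observation. The data is illustrative, and the values in the comments follow from the first-occurrence rule described above.

```matlab
% The observation (ID 2, Score 20) appears twice in A, at rows 1 and 3.
A = dataset({[2;1;2;3],'ID'}, {[20;10;20;30],'Score'});
B = dataset({3,'ID'},         {30,'Score'});

[C,iA] = setdiff(A,B);

% C is sorted (ID 1, then ID 2), so iA is expected to be [2;1]:
% row 2 of A holds ID 1, and row 1 is the first occurrence of ID 2.
D = A(iA,:);                          % same observations as C
sameID    = isequal(C.ID, D.ID);
sameScore = isequal(C.Score, D.Score);
```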

Examples

Set Difference of Two Dataset Arrays

Create a 3-by-1 structure array, and then convert overlapping subsets of it into two dataset arrays.

S(1,1).Name = 'CLARK';
S(1,1).Gender = 'M';
S(1,1).SystolicBP = 124;
S(1,1).DiastolicBP = 93;

S(2,1).Name = 'BROWN';
S(2,1).Gender = 'F';
S(2,1).SystolicBP = 122;
S(2,1).DiastolicBP = 80;

S(3,1).Name = 'MARTIN';
S(3,1).Gender = 'M';
S(3,1).SystolicBP = 130;
S(3,1).DiastolicBP = 92;

A = struct2dataset(S(1:2));
B = struct2dataset(S(2:3));

A and B share one observation, with last name BROWN. It is the second observation in A and the first observation in B.

Return the set difference of A and B.

[C,iA] = setdiff(A,B)

C = 
    Name             Gender       SystolicBP    DiastolicBP
    {'CLARK'}        {'M'}        124           93

iA = 
1

The first observation in A is not present in B.

Input Arguments

`A,B` — Input arrays
`dataset` objects

Input arrays, specified as dataset objects.

`vars` — Variable names
string array | cell array of character vectors | vector of integers containing variable column numbers

Variable names, specified as a string array, cell array of character vectors, or vector of integers containing variable column numbers. vars indicates the variables in A and B that setdiff considers.

Specify vars as [] to use its default value of all variables.
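The accepted forms of vars can be sketched side by side. This assumes that A and B store their variables in the same column order, so that a column number refers to the same variable in both; the names and data are illustrative.

```matlab
A = dataset({[1;2;3],'ID'}, {[10;20;30],'Score'});
B = dataset({[3;4],'ID'},   {[99;40],'Score'});

CbyName  = setdiff(A,B,{'ID'});   % variable given by name
CbyIndex = setdiff(A,B,1);        % same variable given by column number
Call     = setdiff(A,B,[]);       % [] selects the default, all variables
```

CbyName and CbyIndex should contain the same observations (IDs 1 and 2). Call should retain ID 3 as well, because its Score differs between A and B.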

`setOrder` — Flag indicating sorting order for observations in the resulting array
`'sorted'` (default) | `'stable'`

Flag indicating sorting order for observations in the resulting array C, specified as 'sorted' or 'stable'.

`'sorted'`	Observations in `C` are in sorted order (default).
`'stable'`	Observations in `C` are in the same order that they appear in `A`.

Output Arguments

`C` — Dataset array containing observations that belong to `A` but not `B`
`dataset` object

Dataset containing observations that belong to A but not B, with repetitions removed, returned as a dataset object. C is in sorted order (by default), or the order specified by setOrder.

`iA` — Index vector indicating observations from `A` that are in `C`
vector of integers

Index vector indicating observations from A that are in C, returned as a vector of integers. The vector iA contains the index to the first occurrence of any repeated observations in A.

Version History

Introduced in R2012b
