# setdiff

Class: dataset

(Not Recommended) Set difference for dataset array observations

The `dataset` data type is not recommended. To work with heterogeneous data, use the MATLAB® `table` data type instead. See MATLAB `table` documentation for more information.

## Syntax

```
C = setdiff(A,B)
C = setdiff(A,B,vars)
C = setdiff(A,B,vars,setOrder)
[C,iA] = setdiff(___)
```

## Description

`C = setdiff(A,B)` for `dataset` arrays `A` and `B` returns the set of observations that are in `A` but not `B`, with repetitions removed. The observations in the dataset array `C` are sorted.

`C = setdiff(A,B,vars)` returns the set of observations that are in `A` but not `B`, considering only the variables specified in `vars`, with repetitions removed. The observations in the dataset array `C` are sorted by these variables. The values for variables not specified in `vars` for each observation in `C` are taken from the corresponding observation in `A`. If there are multiple observations in `A` that correspond to an observation in `C`, those values are taken from the first occurrence.
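As a minimal sketch of how `vars` changes the comparison, the following uses made-up data (the names `SA`, `SB`, `Call`, and `Cname` are hypothetical and not part of the original example). It assumes Statistics and Machine Learning Toolbox is available, since `dataset` and `struct2dataset` come from that toolbox.

```matlab
% Hypothetical data: BROWN appears in both arrays, but with a
% different SystolicBP value in B.
SA.Name = {'CLARK'; 'BROWN'; 'MARTIN'};
SA.SystolicBP = [124; 122; 130];
A = struct2dataset(SA);

SB.Name = {'BROWN'};
SB.SystolicBP = 118;
B = struct2dataset(SB);

% Compare on all variables: no observation in A matches B exactly,
% so all three observations of A are kept.
Call = setdiff(A,B);

% Compare on Name only: BROWN is treated as present in B and is removed.
% SystolicBP in the result is taken from A.
Cname = setdiff(A,B,{'Name'});
```

Here `Cname` should contain only the `CLARK` and `MARTIN` observations, with their `SystolicBP` values carried over from `A`.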

`C = setdiff(A,B,vars,setOrder)` returns the observations in `C` in the order specified by `setOrder`.

`[C,iA] = setdiff(___)` also returns the index vector `iA` such that `C = A(iA,:)`, using any of the input arguments in the previous syntaxes. If there are repeated observations in `A`, then `setdiff` returns the index of the first occurrence.
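The effect of `setOrder` and the meaning of `iA` can be illustrated with a short sketch. The data and the variable names (`S`, `T`, `Csorted`, `Cstable`, and so on) are hypothetical, and the code assumes Statistics and Machine Learning Toolbox is installed.

```matlab
% Hypothetical data: A is unsorted and contains CLARK twice.
S.Name = {'MARTIN'; 'CLARK'; 'BROWN'; 'CLARK'};
S.SystolicBP = [130; 124; 122; 124];
A = struct2dataset(S);

T.Name = {'BROWN'};
T.SystolicBP = 122;
B = struct2dataset(T);

% Default ('sorted') order: observations are sorted, duplicates removed.
[Csorted,iSorted] = setdiff(A,B);

% 'stable' order: observations keep the order in which they appear in A.
% Passing [] for vars uses all variables.
[Cstable,iStable] = setdiff(A,B,[],'stable');

% In both cases the index vector reproduces the output from A.
isequal(Csorted, A(iSorted,:))
isequal(Cstable, A(iStable,:))
```

In this sketch the duplicate `CLARK` observation appears only once in each result, and the corresponding entry of `iSorted` and `iStable` points to its first occurrence in `A` (row 2).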

## Input Arguments

`A,B`

Input dataset arrays.

`vars`

String array or cell array of character vectors containing variable names, or a vector of integers containing variable column numbers. `vars` indicates the variables that `setdiff` considers.

Specify `vars` as `[]` to use its default value of all variables.

`setOrder`

Flag indicating the sorting order for the observations in `C`. The possible values of `setOrder` are:

| Value | Description |
| --- | --- |
| `'sorted'` | Observations in `C` are in sorted order (default). |
| `'stable'` | Observations in `C` are in the same order that they appear in `A`. |

## Output Arguments

`C`

Dataset array with the observations that are in `A` but not `B`, with repetitions removed. `C` is in sorted order (by default), or the order specified by `setOrder`.

`iA`

Index vector, indicating the observations from `A` that are in `C`. The vector `iA` contains the index to the first occurrence of any repeated observations in `A`.

## Examples

Create a 3-by-1 structure array, and then convert overlapping parts of it into two dataset arrays.

```
S(1,1).Name = 'CLARK';
S(1,1).Gender = 'M';
S(1,1).SystolicBP = 124;
S(1,1).DiastolicBP = 93;

S(2,1).Name = 'BROWN';
S(2,1).Gender = 'F';
S(2,1).SystolicBP = 122;
S(2,1).DiastolicBP = 80;

S(3,1).Name = 'MARTIN';
S(3,1).Gender = 'M';
S(3,1).SystolicBP = 130;
S(3,1).DiastolicBP = 92;

A = struct2dataset(S(1:2));
B = struct2dataset(S(2:3));
```

The only observation common to `A` and `B` is the one with last name `BROWN`, which is the second observation in `A` and the first in `B`.

Return the set difference of `A` and `B`.

```
[C,iA] = setdiff(A,B)
```

```
C = 

    Name         Gender    SystolicBP    DiastolicBP
    {'CLARK'}    {'M'}     124           93         

iA =

     1
```

The first observation in `A` is not present in `B`.