Jenks Natural Breaks is a data clustering method. It is an optimization process that arranges values into classes so that the variance within each class is minimized and the variance between classes is maximized. It can be used for step-change detection in noisy data. In this example, the method is applied to a one-dimensional array of noisy values to find the index of the interface separating the high and low values.
MS (2021). Clustering via Jenks Natural Breaks (https://github.com/MSH19/Clustering-via-Jenks-Natural-Breaks-Matlab), GitHub. Retrieved .
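The two-class split described above can be sketched as follows. This is a minimal illustration of the standard Goodness of Variance Fit (GVF) criterion, not the code of get_jenks_interface.m from the repository; the variable names (SDAM, SDCM, break_idx, best_gvf) are assumptions chosen for this sketch. The array is used in its given order, as one would do for step-change detection.

```matlab
% Hypothetical sketch: two-class split by Goodness of Variance Fit (GVF).
% Not the repository's get_jenks_interface; names here are illustrative only.
data = [1,1,2,3,10,11,13,67,71];
n = numel(data);

% Sum of squared deviations from the array mean (SDAM).
% Note: SDAM is zero for a constant array, which would make GVF undefined.
SDAM = sum((data - mean(data)).^2);

GVF = zeros(1, n-1);
for k = 1:n-1
    class_1 = data(1:k);        % candidate low-side class
    class_2 = data(k+1:n);      % candidate high-side class
    % Sum of squared deviations from the class means (SDCM)
    SDCM = sum((class_1 - mean(class_1)).^2) + sum((class_2 - mean(class_2)).^2);
    GVF(k) = (SDAM - SDCM) / SDAM;
end

% The interface is the split index with the largest GVF
[best_gvf, break_idx] = max(GVF);
fprintf('Interface after index %d (GVF = %.4f)\n', break_idx, best_gvf);
```

For this array the largest GVF occurs at index 7, so the interface lies between 13 and 67.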
Oops, I forgot one line of code. It is now fixed:
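clc; clear output sub_array;
input = [1,1,2,3,10,11,13,67,71];
classes = 4;
% Each pass splits off the highest remaining class
for i = 1 : classes-1
    if i == 1
        data = input;
    elseif i > 1
        data = remaining_elements;
    end
    total = length(data);
    [SDCM_All, GF] = get_jenks_interface(data);
    % interface: index of maximum Goodness of Variance Fit
    [M, I1] = max(GF);
    sub_array{i} = data(I1+1:total);
    % the line that was missing: keep the lower part for the next pass
    remaining_elements = data(1:I1);
end
% lowest class first, then the split-off classes in ascending order
output = vertcat({data(1:I1)}, flipud(sub_array'));
output{:}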
The result with classes = 4 is the following:
ans =
1 1
ans =
2 3
ans =
10 11 13
ans =
67 71
Hello, thanks for your comments. You can easily make this work for several classes by updating the input array after each iteration. For example, for four classes:
data = [1,1,2,3,10,11,13,67,71];
total = length (data);
%% Extract elements of class 4
% 1- Split the input array into two classes based on Jenks Natural Breaks
[SDCM_All, GF] = get_jenks_interface(data);
% 2- get the interface: index of maximum Goodness of Variance Fit
[M, I1] = max(GF);
% 3- extract sub_array 4 (class 4)
sub_array_4 = data(I1+1:total);
% 4- get the remaining elements
remaining_elements = data (1:I1);
total = length(remaining_elements);
%% Extract elements of class 3
% 1- Split the remaining elements into two classes based on Jenks natural breaks
[SDCM_All, GF] = get_jenks_interface(remaining_elements);
% get the interface: index that has the maximum Goodness of Variance Fit
[M, I2] = max(GF);
% extract sub_array_3 (class 3)
sub_array_3 = data(I2+1:total);
% get the remaining elements
remaining_elements = data (1:I2);
total = length(remaining_elements);
%% Extract elements of class 2
% Split the remaining elements into two classes based on Jenks natural breaks
[SDCM_All, GF] = get_jenks_interface(remaining_elements);
% get the interface: index that has the maximum Goodness of Variance Fit
[M, I1] = max(GF);
% extract sub_array_2 (class 2)
sub_array_2 = data(I1+1:total);
%% Extract elements of class 1
sub_array_1 = data(1:I1);
%% Display the result of classes 1 to 4
disp(sub_array_4);
disp(sub_array_3);
disp(sub_array_2);
disp(sub_array_1);
Output:
67 71
10 11 13
2 3
1 1
Hi, your code is working for 2 or 3 classes, but for 4 classes, I am not sure about the result:
COMPACT CODE:
clc; clear output sub_array;
input = [1,1,2,3,10,11,13,67,71];
classes = 4;
for i = 1 : classes-1
if i == 1
data = input;
elseif i > 1
data = remaining_elements;
end
total = length (data);
[SDCM_All, GF] = get_jenks_interface(data);
[M, I1] = max(GF);
sub_array{i} = data(I1+1:total);
end
output = vertcat({data(1:I1)}, sub_array');
output{:}
RESULT (with "classes = 4"):
ans =
1 1 2 3
ans =
67 71
ans =
10 11 13
ans =
10 11 13
Is there a way to make this work for three classes instead of only 2?
Thanks for your comment, Roberto. Yes, you are right, it should be: class_2 = Array(i+1:total);
Thank you very much for this code, it is great. Just one note: I would expect line 14 in get_jenks_interface.m to read like this:
class_2 = Array(i+1:total);
rather than
class_2 = Array(i:total);
Am I right?
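A small illustration of why the upper class should start at i+1. This is a sketch only: the array, the value of i, and the assumption that the lower class is taken as Array(1:i) are hypothetical and are not copied from get_jenks_interface.m.

```matlab
% Hypothetical sketch of the off-by-one in the upper-class slice.
Array = [1, 1, 2, 3, 10];
total = length(Array);
i = 3;                               % candidate interface index

class_1     = Array(1:i);            % assumed lower class: [1 1 2]
overlapping = Array(i:total);        % [2 3 10], shares Array(i) with class_1
disjoint    = Array(i+1:total);      % [3 10], no shared element

% With i:total the two classes hold one element more than the array
fprintf('%d vs %d elements\n', numel(class_1) + numel(overlapping), total);
```

With Array(i:total), element i is counted in both classes, so the class sizes no longer add up to the array length; Array(i+1:total) gives a clean partition.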