preprocessing data for PCA

Mahmoud on 17 Nov 2017
Edited: Shantanu Dixit on 29 May 2025
Hi, I have a table that contains categorical data in the first column and numerical data in the remaining columns. I need to fill the missing values and standardize the numerical data. Here is what I did:
PrepData = fillmissing(data,'spline','DataVariables',@isnumeric);
func = @zscore;
PrepData = varfun(func,PrepData{:,2:end});
but I receive this error:
Undefined function 'varfun' for input arguments of type 'function_handle'.
How can I fix this?
Thank you

Answers (1)

Shantanu Dixit on 29 May 2025
Edited: Shantanu Dixit on 29 May 2025
Hi Mahmoud,
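If I understood the query correctly, you're trying to preprocess a table where the first column is categorical and the remaining columns are numerical. The error occurs because 'varfun' expects a table (or timetable) as its second input: https://www.mathworks.com/help/matlab/ref/table.varfun.html#btyj4vl-1-A, but the curly-brace indexing in 'PrepData{:,2:end}' extracts the contents as a numeric matrix. Since 'varfun' is defined for tables and timetables, MATLAB finds no matching function for a function handle followed by a numeric array, which is why the message mentions 'function_handle'.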
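To make the difference between the two indexing forms concrete, here is a minimal sketch. The table T and its values are made up purely for illustration and are not taken from the question.

```matlab
% Hypothetical table used only for illustration
T = table({'a'; 'b'; 'c'}, [1; 2; 3], [4; 5; 6], ...
    'VariableNames', {'Category', 'Feature1', 'Feature2'});

asMatrix = T{:, 2:end};   % curly braces extract the contents as a numeric matrix
asTable  = T(:, 2:end);   % parentheses return a sub-table

disp(class(asMatrix))     % double
disp(class(asTable))      % table

% varfun accepts the sub-table (zscore needs the Statistics and Machine
% Learning Toolbox). The output variables get the function name as a prefix.
Z = varfun(@zscore, asTable);
disp(Z.Properties.VariableNames)
```

Passing asMatrix to 'varfun' instead of asTable should reproduce the error from the question.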
To fill the missing values in the numerical columns (using 'spline' interpolation) and then standardize the numerical data (using 'zscore'), you can refer to the example script below:
% Sample data
categories = {'Type1'; 'Type2'; 'Type1'; 'Type2'; 'Type1'};
values1 = [1.2; NaN; 3.4; 4.1; 5.6];
values2 = [10.5; 20.1; NaN; 40.3; 50.7];
data = table(categories, values1, values2, ...
'VariableNames', {'Category', 'Feature1', 'Feature2'});
disp(data)
    Category     Feature1    Feature2
    _________    ________    ________
    {'Type1'}      1.2         10.5
    {'Type2'}      NaN         20.1
    {'Type1'}      3.4          NaN
    {'Type2'}      4.1         40.3
    {'Type1'}      5.6         50.7
% fill missing values
PrepData = fillmissing(data, 'spline', 'DataVariables', @isnumeric);
% Standardize numerical columns (keeping categorical)
numVars = PrepData.Properties.VariableNames(2:end); % Get numerical column names
% varfun expects a table or timetable as its second input argument
PrepData(:,numVars) = varfun(@zscore, PrepData(:,numVars));
disp('Preprocessed data:');
Preprocessed data:
disp(PrepData);
    Category     Feature1    Feature2
    _________    ________    _________
    {'Type1'}     -1.3476      -1.2467
    {'Type2'}    -0.42879     -0.64327
    {'Type1'}           0    -0.016763
    {'Type2'}     0.42879      0.62651
    {'Type1'}      1.3476       1.2803
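As a possible alternative to 'varfun', the sketch below shows two other ways to standardize the numeric columns. It reuses the sample data from the script above; the variable names filled, byBraces and byNormalize are introduced here only for illustration. As far as I know, 'zscore' requires the Statistics and Machine Learning Toolbox, while 'normalize' ships with base MATLAB from R2018a onward, so please check which one applies to your setup.

```matlab
% Same sample data as in the script above
categories = {'Type1'; 'Type2'; 'Type1'; 'Type2'; 'Type1'};
values1 = [1.2; NaN; 3.4; 4.1; 5.6];
values2 = [10.5; 20.1; NaN; 40.3; 50.7];
data = table(categories, values1, values2, ...
    'VariableNames', {'Category', 'Feature1', 'Feature2'});

% Fill missing values in the numeric variables only
filled = fillmissing(data, 'spline', 'DataVariables', @isnumeric);

% Option 1: zscore works column-wise on a matrix, so curly-brace
% indexing on both sides of the assignment is enough
byBraces = filled;
byBraces{:, 2:end} = zscore(byBraces{:, 2:end});

% Option 2: normalize defaults to the z-score method and can be
% restricted to the numeric variables of the table
byNormalize = normalize(filled, 'DataVariables', @isnumeric);

% Sanity check: each standardized column should have mean ~0 and std ~1
disp(mean(byNormalize{:, 2:end}))
disp(std(byNormalize{:, 2:end}))
```

Both options keep the 'Category' column and the original variable names untouched, so the result can be used in the same way as PrepData above.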
Hope this helps!
