Custom Feature Embedding in SVM

Question

Kamal Premaratne on 11 Jun 2023

Direct link to this question

https://in.mathworks.com/matlabcentral/answers/1981264-custom-feature-embedding-in-svm

Commented: Kamal Premaratne on 13 Jun 2023

I have been trying to define a custom feature embedding so that SVM can be applied in the feature space. As an example, suppose I have the following 2 data samples, each having an x- and y-coordinate:

D = [p1; p2],

where p1 = [x1 y1] = [1 2] and p2 = [x2 y2] = [3 4]. Let us take the vector of classes of the data samples to be

C = [-1; 1].

The feature embedding is

Phi(p1) = [x1^2; sqrt(2) * x1 * y1; y1^2];

Phi(p2) = [x2^2; sqrt(2) * x2 * y2; y2^2];
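
As a quick check of this embedding, the sketch below (the helper name phi is introduced here only for illustration) confirms that the inner product of the two embedded points equals the squared inner product of the original points, i.e. the embedding corresponds to the homogeneous quadratic kernel (p1 * p2')^2:

```matlab
% Illustrative check; phi is a hypothetical helper, not part of the original post.
p1 = [1 2];
p2 = [3 4];

% The embedding defined above, written for a single 1-by-2 point
phi = @(p) [p(1)^2; sqrt(2) * p(1) * p(2); p(2)^2];

lhs = phi(p1)' * phi(p2);   % inner product in the feature space
rhs = (p1 * p2')^2;         % squared inner product in the input space

fprintf('phi(p1)''*phi(p2) = %g, (p1*p2'')^2 = %g\n', lhs, rhs);
```

Both quantities agree up to floating-point round-off, which is why a kernel built from this embedding can also be written without forming Phi explicitly.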

I tried

>> SVM1 = fitcsvm(D, C, KernelFunction = 'CustomKernel', Standardize = true);

for which I defined the function

function G = CustomKernel(U, V)

%

Phi = [U(:, 1).^2; sqrt(2) * U(:, 1) .* U(:, 2); U(:, 2).^2];

Phi = reshape(Phi, [], 3);

G = Phi * Phi';

But when I run SVM1 with

>> D = [1 2; 3 4]; C = [-1; 1];

it returns the error

>> Kernel function returns kernel product of incorrect size.

I would appreciate any assistance. Thanks.

Answer 1

Gourab on 12 Jun 2023

Direct link to this answer

https://in.mathworks.com/matlabcentral/answers/1981264-custom-feature-embedding-in-svm#answer_1254104

Hi Kamal,

I understand you want to know why your custom SVM embedding returns an error.

The error occurs because the kernel function you've defined returns a kernel product matrix of the wrong size.

In your ‘CustomKernel’ function, you compute the feature embedding ‘Phi’ from U only and then form ‘G = Phi * Phi'’. The second input, V, is never used, so G is always size(U,1)-by-size(U,1).

However, ‘fitcsvm’ expects a custom kernel to return the Gram matrix of the rows of U and V: if U is m-by-p and V is n-by-p, then G must be m-by-n, where each entry G(i, j) is the inner product between the feature embeddings of row i of U and row j of V. U and V need not have the same number of rows, and in that case your function returns a matrix of the wrong size.

To fix this, compute the embeddings of both U and V, and build G from their pairwise inner products so that it is size(U,1)-by-size(V,1).
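
The size problem can be reproduced outside fitcsvm. The following is a minimal sketch, assuming made-up inputs U and V with different numbers of rows (these values are not from the original post); it repeats the computation from the question and compares the resulting size with the size a kernel is expected to return:

```matlab
% Minimal sketch: a kernel that ignores V returns an m-by-m matrix.
% U and V below are illustration data, not from the original post.
U = [1 2; 3 4];          % m = 2 points
V = [5 6; 7 8; 9 10];    % n = 3 points

% Same computation as in the question: V is never used
Phi = [U(:, 1).^2; sqrt(2) * U(:, 1) .* U(:, 2); U(:, 2).^2];
Phi = reshape(Phi, [], 3);
G = Phi * Phi';

disp(size(G))                     % size is [2 2], i.e. m-by-m
disp([size(U, 1), size(V, 1)])    % expected size is [2 3], i.e. m-by-n
```

When U and V happen to be the same matrix, the two sizes coincide, which is why a direct call such as CustomKernel(D, D) does not reveal the problem.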

Please refer to the below code snippet for better understanding.

function G = CustomKernel(U, V)
% CustomKernel  Gram matrix of the rows of U and V under the quadratic embedding.
%   U is m-by-2 and V is n-by-2; G is m-by-n.

% Define the feature embedding function (one row per data point)
embedding = @(p) [p(:,1).^2, sqrt(2) * p(:,1) .* p(:,2), p(:,2).^2];

% Compute the feature embeddings for each data point
embeddings_U = embedding(U);
embeddings_V = embedding(V);

% Compute the inner product between each pair of feature embeddings:
% G(i,j) = dot(embeddings_U(i,:), embeddings_V(j,:))
G = embeddings_U * embeddings_V';
end
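
One way to sanity-check such a kernel before handing it to fitcsvm is to call it with inputs that have different numbers of rows and compare the result against the closed form (U * V').^2. The sketch below re-implements the embedding inline so that it runs on its own; the names kernelFn and Gquad, and the test points in V, are assumptions made for illustration:

```matlab
% Stand-alone check; kernelFn and Gquad are hypothetical names.
embedding = @(p) [p(:,1).^2, sqrt(2) * p(:,1) .* p(:,2), p(:,2).^2];
kernelFn  = @(U, V) embedding(U) * embedding(V)';

U = [1 2; 3 4];
V = [5 6; 7 8; 9 10];    % made-up points with a different row count

G = kernelFn(U, V);      % size(U,1)-by-size(V,1), here 2-by-3
Gquad = (U * V').^2;     % homogeneous quadratic kernel, for comparison

disp(size(G))
disp(max(abs(G(:) - Gquad(:))))   % difference should be at round-off level
```

If the two matrices agree and G has one row per row of U and one column per row of V, the kernel satisfies the size requirement described above.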

I hope this helps you to resolve the query. 

Answer 2

Kamal Premaratne on 12 Jun 2023

Direct link to this answer

https://in.mathworks.com/matlabcentral/answers/1981264-custom-feature-embedding-in-svm#answer_1254274

Dear Gourab:

Thank you so much for your response. Your code does indeed work, and I am very appreciative.

However, what I do not understand is the following: take

>> D = [1 2; 3 4];

If I use your code,

>> G = CustomKernel(D, D)

G =

    25   121
   121   625

If I use my code, viz.,

function G = CustomKernel2(U, V)

%

Phi = [U(:, 1).^2; sqrt(2) * U(:, 1) .* U(:, 2); U(:, 2).^2];

Phi = reshape(Phi, [], 3);

G = Phi * Phi';

I get the same result.

So, I do not understand why it does not work when called upon within

>> SVM1 = fitcsvm(D, C, KernelFunction = 'CustomKernel', Standardize = true);

Gourab on 13 Jun 2023

Hi Kamal,

The issue is with the ‘Standardize’ parameter of the ‘fitcsvm’ function.

The ‘Standardize’ parameter is used to normalize the predictor variables (features) by subtracting the mean and dividing by the standard deviation. However, in your custom kernel function, you are already computing feature embeddings for each data point, which are nonlinear transformations of the original predictor variables.

Therefore, it may not be necessary to apply standardization to the feature embeddings. In fact, standardizing the feature embeddings can result in loss of information, since the feature embeddings are specifically designed to capture the nonlinear relationships between the predictor variables.

I hope this helps you to resolve the query.

Kamal Premaratne on 13 Jun 2023

Hi Gourab:

Thanks. But the problem is not in 'Standardize'. It persists even after 'Standardize' is removed. I have the feeling that it has something to do with the way fitcsvm calls 'CustomKernel'. In fact, fitcsvm appears to access 'CustomKernel' multiple times (which can be verified, for example, by printing out the value of G at the end of the function).
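
The printing approach mentioned here can be extended to show the sizes of the inputs on each call, which is what determines the required size of G. The following is a minimal sketch; the function name CustomKernelLogged is an assumption introduced for illustration, and the embedding is the one from the question:

```matlab
function G = CustomKernelLogged(U, V)
% Hypothetical instrumented kernel (the name is an assumption for
% illustration). It prints the sizes of U and V on every call and returns
% a size(U,1)-by-size(V,1) Gram matrix under the quadratic embedding.
fprintf('CustomKernelLogged: U is %d-by-%d, V is %d-by-%d\n', ...
    size(U, 1), size(U, 2), size(V, 1), size(V, 2));

embedding = @(p) [p(:,1).^2, sqrt(2) * p(:,1) .* p(:,2), p(:,2).^2];
G = embedding(U) * embedding(V)';
end
```

Saved as CustomKernelLogged.m on the MATLAB path, it can be passed to fitcsvm with KernelFunction = 'CustomKernelLogged'; the printed lines then show whether U and V arrive with different numbers of rows.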
