Function Error / If and elseif statement help

Hello,
I have been trying to write a function that takes a string of nucleotides, splits it into groups of three, and translates each group into the corresponding amino acid. When I try to run the function, I get the error:
'Not enough input arguments'
If anyone could help me fix this code, it would be amazing!
function [amino_acid_chain] = synthesize2(neucleotide_string)
%Function to synthesize an amino acid chain from an mRNA molecule.
neucleotide_string = upper(neucleotide_string);
%Loop to check for invalid characters in neucleotide string.
while any(neucleotide_string ~= 'A' && neucleotide_string ~= 'G' && neucleotide_string ~= 'U' && neucleotide_string ~= 'C');
error('Error! Neucleotide string contains invalid characters.');
end
amino_acid_chain = cellstr(reshape(neucleotide_string,3,[])');
if length(amino_acid_chain)<3
amino_acid_chain = char([]);
return;
end
if amino_acid_chain == 'UUU' or 'UUC'
amino_acid_chain = replace(word,{'UUU','UUC'},{'F','F'});
elseif amino_acid_chain == 'UUA' or 'UUG' or 'CUU' or 'CUA' or 'CUG'
amino_acid_chain = replace(word,{'UUA','UUG','CUU','CUC','CUA','CUG'},{'L','L','L','L','L','L'});
elseif amino_acid_chain == 'AUU' or 'AUC' or 'AUA'
amino_acid_chain = replace(word,{'AUU', 'AUC', 'AUA'},{'I','I','I'});
elseif amino_acid_chain == 'AUG'
amino_acid_chain = replace(word,{'AUG'},{'M'});
elseif amino_acid_chain == 'GUU' or 'GUC' or 'GUA' or 'GUG'
amino_acid_chain = replace(word,{'GUU','GUC','GUA','GUG'},{'V','V','V','V'});
elseif amino_acid_chain == 'UCU' or 'UCC' or 'UCA' or 'UCG'
amino_acid_chain = replace(word,{'UCU','UCC','UCA','UCG'},{'S','S','S','S'});
elseif amino_acid_chain == 'CCU' or 'CCC' or 'CCA' or 'CCG'
amino_acid_chain = replace(word,{'CCU','CCC','CCA','CCG'},{'P','P','P','P'});
elseif amino_acid_chain == 'ACU' or 'ACC' or 'ACA' or 'ACG'
amino_acid_chain = replace(word,{'ACU','ACC','ACA','ACG'},{'T','T','T','T'});
elseif amino_acid_chain == 'GCU' or 'GCC' or 'GCA' or 'GCG'
amino_acid_chain = replace(word,{'GCU','GCC','GCA','GCG'},{'A','A','A','A'});
elseif amino_acid_chain == 'UAU' or 'UAC'
amino_acid_chain = replace(word,{'UAU','UAC'},{'Y','Y'});
elseif amino_acid_chain == 'CAA' or 'CAG'
amino_acid_chain = replace(word,{'CAA','CAG'},{'Q','Q'});
elseif amino_acid_chain == 'AAU' or 'AAC'
amino_acid_chain = replace(word,{'AAU','AAC'},{'N','N'});
elseif amino_acid_chain == 'AAA' or 'AAG'
amino_acid_chain = replace(word,{'AAA','AAG'},{'K','K'});
elseif amino_acid_chain == 'GAU' or 'GAC'
amino_acid_chain = replace(word,{'GAU','GAC'},{'D','D'});
elseif amino_acid_chain == 'GAA' or 'GAG'
amino_acid_chain = replace(word,{'GAA','GAG'},{'E','E'});
elseif amino_acid_chain == 'UGU' or 'UGC'
amino_acid_chain = replace(word,{'UGU','UGC'},{'C','C'});
elseif amino_acid_chain == 'UGG'
amino_acid_chain = replace(word,{'UGG'},{'W'});
elseif amino_acid_chain == 'CGU' or 'CGC' or 'CGA' or 'CGG'
amino_acid_chain = replace(word,{'CGU','CGC','CGA','CGG'},{'R','R','R','R'});
elseif amino_acid_chain == 'AGU' or 'AGC'
amino_acid_chain = replace(word,{'AGU','AGC'},{'S','S'});
elseif amino_acid_chain == 'AGA' or 'AGG'
amino_acid_chain = replace(word,{'AGA','AGG'},{'R','R'});
elseif amino_acid_chain == 'CGU' or 'GGC' or 'GGA' or 'GGG'
amino_acid_chain = replace(word,{'GGU','GGC','GGA','GGG'},{'G','G','G','G'});
elseif amino_acid_chain == 'UAA' or 'UAG' or 'UGA'
amino_acid_chain = replace(word,{'UAA','UAG','UGA'},{'Stop','Stop','Stop'});
end
end
I was also using 'or' in my if/elseif statements, but I don't think that is correct either...

3 Comments

A few questions and some recommendations:
How are you calling this function?
If you are not sure you are using the or function correctly, why didn't you check the documentation?
The mlint is giving you several warnings. I would suggest you fix those.
If you find yourself using a lot of elseif statements, consider using switch, case, otherwise instead.
If you compare char arrays with ==, the comparison is element-wise. That means that 'test'=='foo' is not going to return false; it is going to give an error, because the arrays have different lengths: there is no letter in 'foo' that you can compare with the second t in 'test'. If you want to compare strings, use strcmp.
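The difference between == and the string functions can be seen in a short sketch. The variable names below are illustrative only and do not come from the posted function; the last two lines show one possible way to express the "invalid characters" check with ismember instead of chained comparisons.

```matlab
% Sketch: comparing char vectors safely (variable names are illustrative).
codon = 'UUC';

% == compares element by element, so it yields one logical per character.
elementwise = (codon == 'UUU');          % [true true false]

% strcmp compares whole char vectors and returns a single logical.
isUUU = strcmp(codon, 'UUU');            % false

% ismember tests against several candidates at once, replacing "or".
isPhe = ismember(codon, {'UUU', 'UUC'}); % true

% all() turns an element-wise result into one logical for use in "if".
sequence = 'AUGXCU';
hasInvalid = ~all(ismember(sequence, 'ACGU')); % true, because of 'X'
```

An if statement needs a single logical value, which is why strcmp or ismember is usually the safer choice when the operands are whole codons.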
To call the function, I created a nucleotide_string variable in the Command Window and then called the function from the Command Window.
I adjusted the code to the following:
function [amino_acid_chain] = synth_test(nucleotide_string)
%Function to synthesize an amino acid chain from an mRNA molecule.
%To handle the upper and lowercase letters, all letter are transferred to
%uppercase to be read.
neucleotides = upper(neucleotide_string);
%Loop to check for invalid characters in neucleotide string.
while any(neucleotides ~= 'A', neucleotides ~= 'G', neucleotides ~= 'U', neucleotides ~= 'C')
error('Error! Neucleotide string contains invalid characters.');
end
amino_acid_chain = cellstr(reshape(neucleotides,3,[])');
if length(amino_acid_chain)<3
amino_acid_chain = char([]);
return;
end
strrep(amino_acid_chain, {'UUU','UUC'}, {'F','F'});
strrep(amino_acid_chain, {'UUA','UUG','CUU','CUC','CUA','CUG'}, {'L','L','L','L','L','L'});
strrep(amino_acid_chain, {'AUU', 'AUC', 'AUA'},{'I','I','I'});
strrep(amino_acid_chain, {'AUG'},{'M'});
strrep(amino_acid_chain, {'GUU','GUC','GUA','GUG'},{'V','V','V','V'});
streep(amino_acid_chain, {'UCU','UCC','UCA','UCG'},{'S','S','S','S'});
strrep(amino_acid_chain, {'CCU','CCC','CCA','CCG'},{'P','P','P','P'});
strrep(amino_acid_chain, {'ACU','ACC','ACA','ACG'},{'T','T','T','T'});
strrep(amino_acid_chain, {'GCU','GCC','GCA','GCG'},{'A','A','A','A'});
strrep(amino_acid_chain, {'UAU','UAC'},{'Y','Y'});
strrep(amino_acid_chain, {'CAA','CAG'},{'Q','Q'});
strrep(amino_acid_chain, {'AAU','AAC'},{'N','N'});
strrep(amino_acid_chain, {'AAA','AAG'},{'K','K'});
strrep(amino_acid_chain, {'GAU','GAC'},{'D','D'});
strrep(amino_acid_chain, {'GAA','GAG'},{'E','E'});
strrep(amino_acid_chain, {'UGU','UGC'},{'C','C'});
strrep(amino_acid_chain, {'UGG'},{'W'});
strrep(amino_acid_chain, {'CGU','CGC','CGA','CGG'},{'R','R','R','R'});
strrep(amino_acid_chain, {'AGU','AGC'},{'S','S'});
strrep(amino_acid_chain, {'AGA','AGG'},{'R','R'});
streep(amino_acid_chain, {'GGU','GGC','GGA','GGG'},{'G','G','G','G'});
strrep(amino_acid_chain, {'UAA','UAG','UGA'},{'Stop','Stop','Stop'});
if amino_acid_chain == 'Stop'
return;
end
end
I thought it would be better than trying to use an if/elseif statement...
I'm also not 100% sure how to return an output that shows the nucleotides converted to amino acids as a single string...
Why do you still have this?
if amino_acid_chain == 'Stop'
The goal of that if is unclear anyway, as there is no code between the return and the end of the function.
Also, have you seen my answer?


 Accepted Answer

Rik on 4 May 2020
Edited: Rik on 5 May 2020
As a replacement for what you have done here, I would suggest using ismember to find the triplets that form valid codons. You can also use it to keep track of every replaced position, so you can see whether you forgot to implement any triplets.
amino_acid_chain = cellstr(reshape(neucleotide_string,3,[])');
L=false(size(amino_acid_chain));%keep track of replaced codes
library={{'E','GAA','GAG'};...
    {'W','UGG'}};%etc
for n=1:numel(library)
    triplet=library{n}(2:end);%select the triplet(s) from the library (e.g. {'GAA','GAG'})
    letter=library{n}(1);%select the corresponding amino acid letter (e.g. 'E')
    L_current_code=ismember(amino_acid_chain,triplet);%find all positions where the amino acid occurs
    amino_acid_chain(L_current_code)=letter;%replace by the letter code
    L=L | L_current_code;%mark as replaced
end
if any(~L)%shouldn't happen
    error('some code was not implemented correctly')
end

15 Comments

Sorry, I am quite new to MATLAB... could you please explain the code that you have used?
Will that also spit out the amino acid code?
I have added some comments. Is it more clear now?
It is a replacement for your function after your input checking, so it does the same as your function. So it doesn't stop parsing at the first stop codon. It will return a cell array with an amino acid code in each cell. You could write a function that returns a char array containing only the letters before the first stop. Is that what you want? Because it is a relatively minor addition.
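A minimal sketch of that "minor addition" might look like the following. It assumes the cell array already holds one-letter codes and that stop codons were translated to the literal marker 'Stop'; both the marker and the variable names are assumptions for illustration, not part of the answer's code.

```matlab
% Sketch: collapse a cell array of one-letter codes into a char vector,
% keeping only what comes before the first stop. The 'Stop' marker is an
% assumption made for this example.
chain = {'M'; 'A'; 'R'; 'Stop'; 'S'};

stopIdx = find(strcmp(chain, 'Stop'), 1, 'first');
if ~isempty(stopIdx)
    chain(stopIdx:end) = [];   % drop the stop marker and everything after it
end

protein = [chain{:}];          % concatenate the remaining letters: 'MAR'
```

The expression [chain{:}] expands the cell array into a comma-separated list and concatenates the pieces, which is a common way to turn a cellstr of single letters into one char vector.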
Yes, that helps a lot!!
I need to create a function that will give the amino acids that correspond to each set of 3 nucleotides.
I have created the function that you have suggested, however when I try to call the function it says that I do not have enough input arguments.
This is the whole function together at the moment:
function [amino_acid_chain] = test2(nucleotide_string)
new_nucleotide_string = upper(nucleotide_string);
amino_acid_chain = cellstr(reshape(new_neucleotide_string,3,[])');
L=false(size(amino_acid_chain));%keep track of replaced codes
library={{'E','GAA','GAG'}
{'F','UUU','UUC'}
{'L','UUA','UUG','CUU','CUC','CUA','CUG'}
{'I','AUU', 'AUC', 'AUA'}
{'M','AUG'}
{'V','GUU','GUC','GUA','GUG'}
{'S','UCU','UCC','UCA','UCG'}
{'P','CCU','CCC','CCA','CCG'}
{'T','ACU','ACC','ACA','ACG'}
{'A','GCU','GCC','GCA','GCG'}
{'Y','UAU','UAC'}
{'Q','CAA','CAG'}
{'N','AAU','AAC'}
{'D','GAU','GAC'}
{'K','AAA','AAG'}
{'E','GAA','GAG'}
{'C','UGU','UGC'}
{'W','UGG'}};
for n=1:numel(library)
triplet=library{n}(2:end);
letter=library{n}(1);
L_current_code=ismember(amino_acid_chain,triplet);
amino_acid_chain(L_current_code)=letter;
L=L | L_current_code;
end
if any(~L)%shouldn't happen
error('some code was not implemented correctly')
end
end
Also, yes I would like the code to stop if it comes across a 3 letter code corresponding to a stop codon.
I finally managed to get the error to go away... it had to do with how I was calling the function.
This is the output I get however...
>> nucleotide_string = 'AUGGCUCGCAGCUAA';
>> test2
ans =
1×17 logical array
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0
ans =
1×17 logical array
1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
ans =
1×17 logical array
1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Error using test2 (line 10)
Error! Neucleotide string contains invalid characters.
10 error('Error! Neucleotide string contains invalid characters.')
Please use the edit tools to make your code more readable.
After fixing a small typo, the source of the error in your code is clear: you forgot to include R in the library of codons and missed some S codons.
I have included them below and implemented an example of how you could exclude everything after the first stop.
function [amino_acid_chain] = test2(nucleotide_string)
%write your function documentation here
new_nucleotide_string = upper(nucleotide_string);
amino_acid_chain = cellstr(reshape(new_nucleotide_string,3,[])');
library={{'E','GAA','GAG'}
    {'F','UUU','UUC'}
    {'L','UUA','UUG','CUU','CUC','CUA','CUG'}
    {'I','AUU', 'AUC', 'AUA'}
    {'M','AUG'}
    {'V','GUU','GUC','GUA','GUG'}
    {'S','UCU','UCC','UCA','UCG','AGU','AGC'}%more S codons here
    {'P','CCU','CCC','CCA','CCG'}
    {'T','ACU','ACC','ACA','ACG'}
    {'A','GCU','GCC','GCA','GCG'}
    {'Y','UAU','UAC'}
    {'Q','CAA','CAG'}
    {'N','AAU','AAC'}
    {'D','GAU','GAC'}
    {'K','AAA','AAG'}
    {'E','GAA','GAG'}
    {'C','UGU','UGC'}
    {'W','UGG'}
    {'R','CGU','CGC','CGA','CGG'}};
%do the stop codon separately
triplet={'UAA','UAG','UGA'};
L_current_code=ismember(amino_acid_chain,triplet);
stop=find(L_current_code,1,'first');
if ~isempty(stop)
    amino_acid_chain(stop:end)=[];
end
L=false(size(amino_acid_chain));%keep track of replaced codes
for n=1:numel(library)
    triplet=library{n}(2:end);
    letter=library{n}(1);
    L_current_code=ismember(amino_acid_chain,triplet);
    amino_acid_chain(L_current_code)=letter;
    L=L | L_current_code;
end
if any(~L)%shouldn't happen
    error('some code was not implemented correctly')
end
end
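Since missing codons were the source of the error here, it may help to check a library for gaps before using it. The sketch below is one possible way to do that: it enumerates all 64 codons and reports the ones a library does not cover. The three-entry library (including its 'Stop' entry) is a deliberately incomplete stand-in for illustration, and the variable names are not from the thread.

```matlab
% Sketch: list every codon that a library does not cover.
% This short library is deliberately incomplete (illustration only).
library = {{'E','GAA','GAG'}
           {'W','UGG'}
           {'Stop','UAA','UAG','UGA'}};

% Build all 64 possible codons from the four bases.
bases = 'ACGU';
[i1, i2, i3] = ndgrid(1:4, 1:4, 1:4);
allCodons = cellstr(bases([i1(:), i2(:), i3(:)]));

% Gather every codon that appears in the library (slot 1 is the letter).
covered = {};
for n = 1:numel(library)
    covered = [covered, library{n}(2:end)]; %#ok<AGROW>
end

% Anything in allCodons that is not covered still needs an entry.
missing = setdiff(allCodons, covered);
fprintf('%d of 64 codons are not covered.\n', numel(missing));
```

Running the same check against a full library should report zero missing codons once the stop codons are counted as covered.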
Thank you so much for your help! I am almost where I need to be.
Is there any way to ensure that the single quotation marks on either end of the string aren't counted as part of the character string? When I enter 15 nucleotides, they are counted as 17, because I have to enter the string in ' ' in the Command Window first.
Also, I need to trim any characters at the end of the string that don't make up a complete group of 3, if that is possible?
Thanks.
There is a difference between how data is stored and how it is displayed in MATLAB. MATLAB will never count the quotes as characters unless you explicitly tell it to do so.
And about how to remove the leftover characters at the end of a string:
str((end-mod(numel(str),3)+1):end)=[];
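Both points can be checked with a small example. This is only a sketch with made-up input; the variable names extra and codons are illustrative.

```matlab
% The quotes delimit the char vector; they are not stored in it.
n = numel('AUGGCUCGCAGCUAA');            % 15, not 17

% Sketch: trim trailing characters so the length is a multiple of 3.
str = 'AUGGCUCG';                        % 8 characters, 2 left over
extra = mod(numel(str), 3);              % 2
str((end - extra + 1):end) = [];         % removes the last 2 characters
codons = cellstr(reshape(str, 3, [])');  % {'AUG'; 'GCU'}

% When the length is already a multiple of 3, extra is 0, the index range
% (end+1):end is empty, and nothing is removed.
full = 'AUGGCU';
full((end - mod(numel(full), 3) + 1):end) = [];
```

Trimming before the reshape matters because reshape(str,3,[]) throws an error when the number of characters is not divisible by 3.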
I have added the code above the reshape line, however I am getting an error when I try to run the function stating that it is an incorrect use of ' = ' operator.
I entered the code into the function as:
new_nucleotide_string((end-mod(numel(new_nucleotide_string),3)+1):end)=[];
As you can confirm yourself: that error is not due to this line. Do you have some unmatched parentheses somewhere?
When I call the function I get an error message stating:
>> test2
Unrecognized function or variable 'new_nucleotide_string'.
Error in test2 (line 8)
new_nucleotide_string((end-mod(numel(new_nucleotide_string),3)+1):end)=[];
This is the whole function I have at this point:
function [amino_acid_chain] = test2(nucleotide_string)
% A function to return an amino acid chain based on codons entered from a
% nucleotide string.
% Make all characters in the string UPPERcase so they can be read.
new_neucleotide_string = upper('nucleotide_string');
new_nucleotide_string((end-mod(numel(new_nucleotide_string),3)+1):end)=[];
amino_acid_chain = cellstr(reshape(new_neucleotide_string,3,[])');
AA_Library = {{'E', 'GAA', 'GAG'}
{'F', 'UUU', 'UUC'}
{'L', 'UUA', 'UUG', 'CUU', 'CUC', 'CUA', 'CUG'}
{'I', 'AUU', 'AUC', 'AUA'}
{'M', 'AUG'}
{'V', 'GUU', 'GUC', 'GUA', 'GUG'}
{'S', 'UCU', 'UCC', 'UCA', 'UCG', 'AGU', 'AGC'}
{'P', 'CCU', 'CCC', 'CCA', 'CCG'}
{'T', 'ACU', 'ACC', 'ACA', 'ACG'}
{'A', 'GCU', 'GCC', 'GCA', 'GCG'}
{'Y', 'UAU', 'UAC'}
{'H', 'CAU', 'CAC'}
{'Q', 'CAA', 'CAG'}
{'N', 'AAU', 'AAC'}
{'D', 'GAU', 'GAC'}
{'K', 'AAA', 'AAG'}
{'E', 'GAA', 'GAG'}
{'C', 'UGU', 'UGC'}
{'W', 'UGG'}
{'R', 'CGU', 'CGC', 'CGA', 'CGG', 'AGA', 'AGG'}
{'G', 'GGU', 'GGC', 'GGA', 'GGG'}};
triplet = {'UAA', 'UAG', 'UGA'};
L_current_code=ismember(amino_acid_chain,triplet);
stop = find(L_current_code,1,'first');
if ~isempty(stop)
amino_acid_chain(stop:end)=[];
end
L=false(size(amino_acid_chain));
for n=1:numel(AA_Library)
triplet=AA_Library{n}(2:end);
letter=AA_Library{n}(1);
L_current_code=ismember(amino_acid_chain,triplet);
amino_acid_chain(L_current_code)=letter;
L=L | L_current_code;
end
if any(~L)
error('Some code was not implemented correctly.')
end
end
Check the spelling of your variables. You are warned by mlint that a variable is not used, which is strange, because you intend to use it on the next line. You also get a warning that your input is unused. Do you see the value of mlint?
Below are the required edits.
% Make all characters in the string UPPERcase so they can be read.
new_nucleotide_string = upper(nucleotide_string);
new_nucleotide_string((end-mod(numel(new_nucleotide_string),3)+1):end)=[];
amino_acid_chain = cellstr(reshape(new_nucleotide_string,3,[])');
Thank you. I am not sure what I am still doing wrong.
I made the changes as you have written, but it is now giving me this error:
>> test2
Not enough input arguments.
Error in test2 (line 6)
new_nucleotide_string = upper(nucleotide_string);
I'm not sure why or where I am still going wrong...
When I enter the original variable nucleotide_string in the Command Window before I call the function, is there a special way to enter it? I have to put ' ' around the letters when I enter them, otherwise it won't create the variable, so I am wondering whether my issue is in that part?
Do you have the letters already stored in a variable? You need to do either of the following:
nucleotide_string = 'AUGGCUCGCAGCUAA';
chain=test2(nucleotide_string);
%or
chain=test2('AUGGCUCGCAGCUAA');
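The reason the bare call fails is that a function only sees what is passed to it, not the variables in the Command Window workspace. The sketch below illustrates this with a small anonymous function, firstCodon, which is a hypothetical stand-in for test2 and not part of the thread's code.

```matlab
% Hypothetical stand-in for test2: returns the first codon in upper case.
firstCodon = @(s) upper(s(1:3));

nucleotide_string = 'auggcu';

% Passing the variable (or a literal) supplies the input argument.
a = firstCodon(nucleotide_string); % 'AUG'
b = firstCodon('auggcu');          % 'AUG'

% Calling with no input fails even though nucleotide_string exists in the
% workspace, because the function never receives it.
try
    firstCodon();
    failed = false;
catch
    failed = true;
end
```

The name of the workspace variable does not need to match the name of the function's input; only the position of the argument in the call matters.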
THANK YOU! That has fixed it all!
Thank you so much for your help!!
You're welcome


More Answers (1)

Consider using a switch / case statement.
food = ["apple", "beans", "cauliflower", "dragonfruit", "egg"];
groceries = food(randi(numel(food), 20, 1)); % 20 random items from food
for whichItem = 1:numel(groceries)
item = groceries(whichItem);
switch item
case {"apple"; "dragonfruit"}
itIsA = "fruit";
case {"beans"; "cauliflower"}
itIsA = "vegetable";
otherwise
itIsA = "something else";
end
fprintf("Item %d, %s, is a %s.\n", whichItem, item, itIsA);
end
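Applied to the codon problem, the same pattern could look roughly like this. It is a sketch covering only a handful of codons, with illustrative variable names; a cell array after case matches any of the listed codons, which takes the place of the 'or' chains in the original question.

```matlab
% Sketch: translate a few codons with switch/case (partial table only).
codons = {'AUG'; 'GCU'; 'UGG'; 'UAA'};
letters = cell(size(codons));
for k = 1:numel(codons)
    switch codons{k}
        case 'AUG'
            letters{k} = 'M';
        case {'GCU', 'GCC', 'GCA', 'GCG'}   % any of these matches
            letters{k} = 'A';
        case 'UGG'
            letters{k} = 'W';
        case {'UAA', 'UAG', 'UGA'}
            letters{k} = 'Stop';
        otherwise
            % Fail loudly on anything this partial table does not cover.
            error('Unknown codon: %s', codons{k});
    end
end
```

For char inputs, switch compares with strcmp semantics, so the length-mismatch problem that == has with strings does not arise.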

Asked:

on 4 May 2020

Commented:

Rik
on 6 May 2020
