Generating a Dictionary Function

Hi All,
How would I make a function called 'Word' that determines whether the user has entered a real English word, based on the contents of a file called 'dictionary.txt', which contains the whole English dictionary?
For example, a session from the user's perspective would look as follows:
What is the word? 'kinbecef'
Not a word. Try Again.
What is the word? 'Community'
... (continue on with code)
Kind Regards,
Smoxk x

7 Comments

Should I assume that I'm also supposed to write the dictionary as well? If it exists, is the format defined or known?
The dictionary file is 466416 lines long. Is this what you mean @DGM? I may have misinterpreted your question.
@DGM I have already generated a dictionary txt file. :)
This sounds like a homework question, so please show what you have tried so far.
How is the dictionary stored? One word per line? Separated by spaces? A MAT file? If it is a text file, fileread and strsplit can create a cell string containing the words. If you sort it, ismember might be a fast way to find a word in the list. Do you care about the upper/lower case? strcmp is an option also.
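
The fileread / strsplit / ismember route suggested above could look roughly like this. It is only a sketch and assumes one word per line; a small temporary file stands in for the real dictionary, and the names `words` and `isWord` are made up for illustration.

```matlab
% Stand-in for the real dictionary: a small temporary file, one word per line.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', 'Aachen', 'community', 'zebra');
fclose(fid);

% Read the whole file at once and split it into a cell array of words.
txt   = fileread(fname);
words = strsplit(lower(txt), char(10));     % split at line feeds
words = strtrim(words);                     % removes a stray carriage return, if any
words = words(~cellfun('isempty', words));  % drop the empty entry after the last newline

% Case-insensitive membership test (hypothetical helper).
isWord = @(w) ismember(lower(w), words);

r1 = isWord('Community');   % true
r2 = isWord('kinbecef');    % false

delete(fname);              % clean up the temporary file
```

Lower-casing both the list and the query is one way to make capitals not matter; strcmpi against the whole list would be another.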
@Jan, @Walter Roberson, this is what I've attempted so far. strcmp sounds good, but how would I use it in the context below? Capitals don't matter.
function InputWord = Word(InputWord)
file = 'Dictionary.txt';
import = importdata(file);
size= numel(import);
for i=1:Num
while InputWord~=file(i)
fprintf("Not a word. Try Again.\n");
InputWord=input("What is the word? ", 's');
end
end
end
while ~strcmpi(InputWord, import)
%get a new word
end
@Smoxk x: Your code has several problems.
  • Do not use "size" as the name of a variable, because it is an important MATLAB function. This can cause unexpected behavior if you try to use the function later.
  • "Num" is not defined.
  • "InputWord" is used before it is defined.
  • "file(i)" is the i-th character of the file name.
  • ~= compares char vectors elementwise, so both must have the same length or one can be a scalar.
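
To illustrate the last point, here is a small sketch (the example words are arbitrary) of how ~= behaves on char vectors compared with strcmp and strcmpi:

```matlab
a = 'moose';
b = 'goose';
c = 'geese!';

% ~= works character by character and returns one result per position.
elementwise = (a ~= b);          % [1 0 0 0 0], only the first letter differs

% strcmp compares the char vectors as a whole and returns a single logical.
sameWord = strcmp(a, b);         % false

% a ~= c would raise an error because the lengths differ (5 vs. 6 characters),
% whereas strcmp simply reports that the two are not equal.
differentLengths = strcmp(a, c); % false

% strcmpi does the same comparison but ignores case.
ignoreCase = strcmpi('Community', 'community');   % true
```

Note also that "while a ~= b" only runs when every element of the comparison is true, so for 'moose' and 'goose' the loop body would be skipped even though the words differ.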


 Accepted Answer

Walter Roberson on 26 Apr 2022
strcmpi() perhaps. However that depends how much intelligence you need to put into it. You give an input example with a leading capital: is the search to be case-insensitive or do you have capitalization rules (the rules in English are more complicated than just "capitalize the first letter")? Do you need to deduce the plurals from the base words, including for example goose vs geese but moose does not imply meese?

6 Comments

@Walter Roberson @Jan, this is my current code, but it takes a long time when I test it because it runs through the entire dictionary. I believe this is the cause because when I enter the word 'aachen' (the first in the dictionary file), the code keeps executing for ages.
Is there a way I can add an element to my code so that it stops as soon as the word matches?
function InputWord = Word(InputWord)
file = 'Dictionary.txt';
import = importdata(file);
Number= numel(import);
for i=1:Number
while ~strcmpi(InputWord, import)
fprintf("Not a word. Try Again.\n");
InputWord=input("What is the word? ", 's');
end
end
function InputWord = Word(InputWord)
file = 'Dictionary.txt';
import = sort( cellfun(@tolower, importdata(file)) );
LInputWord = tolower(InputWord)
while ~ismember(LInputWord, import)
fprintf("Not a word. Try Again.\n");
InputWord = input("What is the word? ", 's');
LInputWord = tolower(InputWord);
end
ismember() uses a Binary Search; https://en.wikipedia.org/wiki/Binary_search_algorithm so by pre-sorting, the repeated ismember() calls will not have much cost at all.
@Walter Roberson, thanks so much for the help! When I run the code, however, it displays the following error:
Error using cellfun
Undefined function 'tolower' for input arguments of type 'char'.
What can I do to change this?
Use lower() instead of tolower().
@Jan, ok thanks so much!
Sorry about that... the function is often named tolower() in other programming languages.
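
For reference, a sketch of the loop with lower() in place of tolower(). lower() accepts a cell array directly, so cellfun is not needed. To keep the sketch runnable without a keyboard or the real file, a three-word stand-in list and a list of simulated answers are assumed here; in the actual function the list would come from importdata and each answer from input("What is the word? ", 's').

```matlab
% Stand-in word list; in the thread this comes from importdata('Dictionary.txt').
rawList  = {'Aachen'; 'Community'; 'Zebra'};
wordList = sort(lower(rawList));      % lower() works on the whole cell array

% Simulated user answers (assumption, replaces the interactive input() calls).
answers   = {'kinbecef', 'Community'};
k         = 1;
InputWord = answers{k};
rejected  = 0;                        % counts how many answers were refused

while ~ismember(lower(InputWord), wordList)
    fprintf("Not a word. Try Again.\n");
    rejected  = rejected + 1;
    k         = k + 1;
    InputWord = answers{k};           % real code: input("What is the word? ", 's')
end
```

After the loop, InputWord holds the first accepted answer with its original capitalisation.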


More Answers (2)

function Word
Data = fileread('YourDictionary.txt');
List = sort(lower(strsplit(Data, char(10))));
while 1
s = input('Input a word: ', 's');
% insert the checks and the output here...
% Stop the loop with "break;" if s is empty
end
end
You can use containers.Map. The example here uses numbers, but it can handle char arrays or strings as well.
Access time with a hash is O(1), which is much quicker than ismember. This test on the TMW server also seems to invalidate Walter's claim that ismember on a sorted array is faster, perhaps because MATLAB's ismember doesn't know or check that the array is sorted and sorts it anyway, and some sorting algorithms are known to be slow on already sorted input:
a=randi(1e6,1,1e6);
as=sort(a);
m=containers.Map(a,a);
b=randi(1e6,1,1e3);
tic; for k=1:numel(b); ismember(b(k),as); end; toc
Elapsed time is 0.712036 seconds.
tic; for k=1:numel(b); ismember(b(k),a); end; toc
Elapsed time is 0.596029 seconds.
tic; for k=1:numel(b); isKey(m,b(k)); end; toc
Elapsed time is 0.005366 seconds.
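
Applied to the word-list problem, the same idea might be sketched as follows. The three-word list is a stand-in for the real dictionary, and the name `dict` is made up for illustration. Keys of a containers.Map are case-sensitive, so both the keys and the query are lower-cased.

```matlab
% Stand-in for the full word list read from the dictionary file.
words = lower({'Aachen', 'community', 'zebra'});

% Build the map once; only the keys matter, the values are placeholders.
dict = containers.Map(words, ones(1, numel(words)));

% Each lookup is a hash access rather than a search through the list.
found1 = isKey(dict, lower('Community'));   % true
found2 = isKey(dict, lower('kinbecef'));    % false
```

Building the map has a one-time cost, so this pays off when many words are checked against the same list.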
