N = doclength(documents) returns the number of tokens in each document in documents.


Find the number of words in an array of tokenized documents. Erase the punctuation first so that punctuation tokens are not counted as words.

str = [ ...
    "An example of a short sentence." 
    "A second short sentence."];
documents = tokenizedDocument(str)
documents = 
  2x1 tokenizedDocument:

    7 tokens: An example of a short sentence .
    5 tokens: A second short sentence .

documents = erasePunctuation(documents)
documents = 
  2x1 tokenizedDocument:

    6 tokens: An example of a short sentence
    4 tokens: A second short sentence

N = doclength(documents)
N = 2×1

     6
     4

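As a possible follow-up, the lengths in N can serve as a logical mask over the document array. The sketch below assumes a minimum length of 5 words; the threshold and the variable names minLength, keep, and longDocuments are illustrative and not part of doclength.

```matlab
str = [ ...
    "An example of a short sentence."
    "A second short sentence."];
documents = erasePunctuation(tokenizedDocument(str));

N = doclength(documents);         % number of tokens in each document

minLength = 5;                    % hypothetical threshold, for illustration only
keep = N >= minLength;            % logical mask with the same size as documents
longDocuments = documents(keep);  % documents with at least minLength tokens
```
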
Input Arguments

documents: Input documents, specified as a tokenizedDocument array.

Output Arguments

N: Document lengths, returned as a vector of nonnegative integers. The size of N is the same as the size of documents.
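
The size relationship can be checked with a short sketch. The input strings and the variable sameSize are assumptions chosen for illustration; the example uses a row-oriented array to show that N follows the orientation of documents.

```matlab
% 1-by-3 string array, so documents is a 1-by-3 tokenizedDocument array
str = ["one two three" "four five" "six"];
documents = tokenizedDocument(str);

N = doclength(documents);   % one token count per document

% Compare the size of the output with the size of the input
sameSize = isequal(size(N), size(documents));
```

Here sameSize is expected to be true, consistent with the statement that N has the same size as documents.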

Version History

Introduced in R2017b