

Length of documents in document array



N = doclength(documents) returns the number of tokens in each document in documents.



Find the number of words in an array of tokenized documents. Erase the punctuation characters so they do not get counted as words.

str = [ ...
    "An example of a short sentence." 
    "A second short sentence."];
documents = tokenizedDocument(str)
documents = 
  2x1 tokenizedDocument:

    7 tokens: An example of a short sentence .
    5 tokens: A second short sentence .

documents = erasePunctuation(documents)
documents = 
  2x1 tokenizedDocument:

    6 tokens: An example of a short sentence
    4 tokens: A second short sentence

N = doclength(documents)
N = 2×1

     6
     4

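The lengths returned by doclength can also serve as a logical mask over the document array. The following is a minimal sketch, not part of the original example: it reuses the two sentences above, and the threshold name minTokens and its value are assumptions chosen for illustration.

```matlab
% Sketch: keep only documents that contain at least minTokens tokens.
% minTokens is a hypothetical cutoff, not a toolbox setting.
str = [ ...
    "An example of a short sentence."
    "A second short sentence."];
documents = erasePunctuation(tokenizedDocument(str));

N = doclength(documents);             % token count for each document
minTokens = 5;                        % assumed threshold for this sketch
longDocuments = documents(N >= minTokens);   % logical indexing by length
```

With the token counts shown above (6 and 4), only the first document passes the cutoff.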
Input Arguments


Input documents, specified as a tokenizedDocument array.

Output Arguments


Document lengths, returned as a vector of nonnegative integers. The size of N is the same as the size of documents.
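As a quick check of the size relationship described above, the sketch below compares the size of N with the size of its input. The row-vector input strings are made up for illustration.

```matlab
% Sketch: confirm that N has the same size as the input document array.
str = ["one two three", "four five"];   % illustrative row string array
documents = tokenizedDocument(str);

N = doclength(documents);               % one length per document
sameSize = isequal(size(N), size(documents));   % expected to be true
```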

Version History

Introduced in R2017b