How does Requirements Toolbox parse requirements for Microsoft Word files?

Does Requirements Toolbox parse each paragraph and look for the given regular expression? I am trying to parse a requirements document, but paragraphs that contain two requirements are not split into separate requirements. Do I need to manually separate them in the document?

3 Comments

Sharing the document and the code is probably helpful in finding a solution. I personally don't recognize what you are referring to, but if you post the file and the code I can help you experiment. (note that you need to zip some file types in order to be allowed to upload them)

Hi @Jim,
Just so I understand, do you have multiple requirements under the same heading which are separable based on a regular expression?
Reference here for @Rik: https://www.mathworks.com/help/slrequirements/ug/import-requirements-from-microsoft-office.html <-- I assume @Jim is using the "Identify items by occurrences of search pattern (REGEXP)" option.
Thanks.
Yes, they are identified using callbacks. I'm not able to get into specifics as it is proprietary, but here is an example of what I am parsing:
This requirement should [REQ_CALLBACK_01] do something.
Most of them are in their own paragraphs, but some are combined in a single paragraph, i.e.
This requirement should [REQ_CALLBACK_01] do something. This requirement should [REQ_CALLBACK_02] do another thing.
In this case only the first requirement is parsed as its own item, and the second one is included in it. However, if they are in separate paragraphs, such as:
This requirement should [REQ_CALLBACK_01] do something.
This requirement should [REQ_CALLBACK_02] do another thing.
then both are added as requirements. I am not able to edit the document, so I am trying to figure out a way to still parse each requirement when several are in the same paragraph.

Answers (1)

Hi @Jim,
I was able to reproduce this observation in R2025a, where multiple requirements in one paragraph are not parsed separately when the "Identify items by occurrences of search pattern (REGEXP)" option is used. One paragraph is treated as one requirement, even if there are multiple matches of the regular expression.
As a workaround, the "PostImportFcn" callback can be used to execute code that modifies the requirements after the import completes. You can specify code in the callback to split such requirements into two or more separate ones as needed.
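The splitting step that such a callback would need can be sketched as follows. This is only an illustration of the text-splitting logic, written in Python for brevity; it is not Requirements Toolbox code and makes no toolbox API calls. A real "PostImportFcn" callback would be MATLAB code applying the same idea to the text of the imported requirements. The ID pattern is an assumption modelled on the example IDs in this thread, `split_requirements` is a hypothetical helper name, and the sentence-boundary rule is deliberately naive (it would mis-split on abbreviations such as "e.g.").

```python
import re

# Assumed ID pattern, modelled on the example IDs posted in this thread.
REQ_ID = re.compile(r"\[(REQ_CALLBACK_\d+)\]")
# Naive sentence boundary: whitespace that follows '.', '!' or '?'.
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_requirements(paragraph):
    """Split one paragraph into (id, text) pairs, one per requirement ID.

    Hypothetical helper: each sentence that contains an ID starts a new
    requirement; a sentence without an ID is appended to the previous one.
    """
    items = []
    for sentence in SENTENCE_BOUNDARY.split(paragraph.strip()):
        if not sentence:
            continue
        match = REQ_ID.search(sentence)
        if match:
            # A sentence carrying an ID opens a new requirement.
            items.append([match.group(1), sentence])
        elif items:
            # No ID: keep the sentence with the preceding requirement.
            items[-1][1] += " " + sentence
        else:
            # Text before the first ID has no requirement ID of its own.
            items.append([None, sentence])
    return [tuple(item) for item in items]
```

For the combined paragraph from the question, this returns two pairs, one for REQ_CALLBACK_01 and one for REQ_CALLBACK_02, each with its own sentence as the text. How the resulting pieces are written back as separate requirements depends on the toolbox API and is not shown here.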
Additionally, the option to define a custom document interface for importing requirements was introduced in R2023a. You can either implement your own importer interface for document types that are not supported, or extend one of the built-in interfaces. The latter would be appropriate in this case (i.e. implementing a custom variant of the Microsoft Word importer).
Kindly refer to the Requirements Toolbox documentation on import callbacks and custom document interfaces for more information.
Hope this helps!

Release

R2022b

Asked: Jim on 9 Jun 2025

Answered: 22 Jul 2025
