MATLAB Answers

How to grab first number from each line of a .txt file and store it in a vector?

I want to grab the first number from each line of a text file and store it in a vector. My attempt below picks up every number in the file instead. I have attached the input.txt file. Can anybody please help me with this? Thank you.
My code:
fid = fopen('input.txt');
matrix = fscanf(fid, '%d', [1,inf]);
MATLAB output:
1 3 4 8 2 -1 0 8 3 4 5 6 7 8 9 -2
I wanted only the first number from each line:
1 3 2 3 4 5 6 7 8



Accepted Answer

Walter Roberson on 9 Feb 2020
fid = fopen('input.txt');
matrix = cell2mat( textscan(fid, '%f%*[^\n]') ); %second field skips to end of line
fclose(fid); %release the file handle once reading is done


Walter Roberson on 10 Feb 2020
%*[^\n] is a single format specification.
The % signals that what follows is a format specification.
The * signals that the data matched by the rest of the specification is to be read in and then thrown away. You use * for parts of the input that you know are there but whose values you have no interest in at the time.
The [] matches any number of characters in a row, each of which individually is consistent with the pattern inside the [] .
Typically what is inside [] indicates which character is to be matched. For example, [02468] would match any one of the characters 0, 2, 4, 6, or 8 . There are shortcuts to indicate ranges, such as [0-9] to indicate any character from the list 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9 . There are also shortcuts to indicate kinds of characters, such as [\d] indicates that the match is to be against a "digit". These special patterns are indicated by a \ before the character that specifies the "kind" to match against. \n is the special pattern for matching the newline character. So [\n] would be for matching any number of newline characters in a row.
However... if the very first character inside the [] is ^ then the test is reversed, and the input only matches for characters that do not meet the test inside the rest of the [] . Therefore the pattern [^\n] is to match any number of characters in a row, each of which is not the newline character.
Together with the * before that to discard input, the %*[^\n] means that you should start from the current character and read any number of characters until you hit a newline character, and you should then discard all of those characters. In other words, %*[^\n] is a pattern to ignore everything until the end of the current line.
Putting this together with what is before that, '%f%*[^\n]' means that on each line, you should skip any leading spaces, then read a number, then ignore everything to the end of the line. Then on the next line you again look for a number and ignore whatever comes after it on that line, and so on.
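To make the per-line behaviour concrete, here is a rough sketch of the same logic written as an explicit loop with fgetl and sscanf. This is an alternative technique for illustration, not what textscan does internally. The scratch file name and the variable names (fname, txtLine, firstNums) are made up for the example, and the file contents are only the three lines discussed in this thread.

```matlab
% Hypothetical scratch file standing in for input.txt (first three lines only)
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1\n3 4 8\n2 -1 0 8\n');
fclose(fid);

fid = fopen(fname, 'r');
firstNums = [];                         % collected first numbers, as a row vector
txtLine = fgetl(fid);                   % fgetl returns -1 (not char) at end of file
while ischar(txtLine)
    v = sscanf(txtLine, '%f', 1);       % read at most one number from this line
    if ~isempty(v)                      % skip blank or non-numeric lines
        firstNums(end+1) = v;           %#ok<AGROW>
    end
    txtLine = fgetl(fid);               % the rest of the old line is simply dropped
end
fclose(fid);
delete(fname);                          % clean up the scratch file

disp(firstNums)
```

The loop is longer than the one-line textscan call, but it shows each step separately: skip leading whitespace, read one number, discard the remainder of the line.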
Your input file starts with
1
3 4 8
The first time through, the %f would match the 1 and store it to be returned. That would leave you positioned at the newline that is after the 1. Then %*[^\n] would match any number of characters that are not newline, but since you are at a newline already that would be no characters. The "no characters" would be discarded.
Then on the next line, the %f would match the 3 and store it to be returned. That would leave you positioned at the space that is after the 3. Then %*[^\n] would match characters that are not newlines, so that would match ' 4 8' . And then because of the * it would throw that away.
So the 1 got matched on the first line and stored; the 3 got matched on the second line and stored; the 4 and 8 got deliberately ignored and thrown away.
And so on.
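The walkthrough above can be tried without a file, since textscan also accepts text directly. A minimal sketch, assuming the file begins with the three lines 1, 3 4 8 and 2 -1 0 8 as suggested by the output in the question; the variable names txt, C and firstNums are made up for the example:

```matlab
% First three lines of the input, built as one char vector with real newlines
txt = sprintf('1\n3 4 8\n2 -1 0 8\n');

% Same format as in the answer: read one number, then skip to end of line
C = textscan(txt, '%f%*[^\n]');

firstNums = C{1};        % textscan returns a cell array; the numbers are in cell 1
disp(firstNums.')        % shown as a row for readability
```

Following the walkthrough, firstNums should hold 1, 3 and 2: the 4 and 8 on the second line and the -1, 0 and 8 on the third are matched by the skipped field and discarded.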
