Extracting Time Data from Text File

In short, I've been trying to put together a script that scans a .txt file and pulls number values from it. The context is Rubik's Cube times: I have a program that produces .txt files from sessions, and the contents include some basic statistics at the top, a list of times, and the moves used to scramble the cube for each time. I've included an example of the beginning of one of these files below:
Average: 10.05
Best: 9.34
Worst: 14.26
Mean: 10.75
Standard Deviation: 1.79
1: 9.91 D2 L2 B2 L2 F2 D L2 B2 D' U L' U' L D' L F' R' B' U B2
2: 9.86 R2 U2 B2 F2 D L2 D' U' R2 U' B2 R U F' U' B' D B2 U R' D2 F2
3: (14.26) D2 L2 B' L2 B2 D2 L2 F2 D2 F R' D B R2 B' D' R2 B2 F L
4: (9.34) B F U2 B R2 B' U2 F' D2 R' D' B2 D' B' U2 R' B' D' B'
5: 10.39 B F' U2 B2 L2 U2 R2 B' R2 B U B2 R2 D2 R2 B' L B R D2
My goal is to pull the times from these files to run my own stats on them, but there are a lot of nuances that have made this difficult!
The biggest issues I've had:
1) As you can see, some of the times contain 3 sig figs, and some contain 4. I haven't found a clean way to pull both from the file at the same time.
2) I haven't found a way to ignore the statistics at the top (if this is too much for a script, it would be easy to just delete that section of the text file before I run the script).
What I have so far isn't pretty, but it has successfully created an array of the 4-digit times for me:
fileID = fopen("txt Files/text-4AB30AA43EF2-1.txt")
raw = fscanf(fileID,"%c")
fclose(fileID)
num = '[0123456789][0123456789].[0123456789][0123456789]';
out = regexp(raw,num,'match');
gen1 = str2double(out);
list = gen1'
Any help would be greatly appreciated, thank you!

Accepted Answer

Stephen23 on 7 Feb 2022
Edited: Stephen23 on 7 Feb 2022
Simpler and more efficient:
str = fileread('test_1.txt');
tkn = regexp(str,'^\s*(\d+):\D+(\d+\.?\d*)\W+([^\n\r]+)', 'tokens', 'lineanchors');
tkn = vertcat(tkn{:})
tkn = 5×3 cell array
    {'1'}    {'9.91' }    {'D2 L2 B2 L2 F2 D L2 B2 D' U L' U' L D' L F' R' B' U B2'      }
    {'2'}    {'9.86' }    {'R2 U2 B2 F2 D L2 D' U' R2 U' B2 R U F' U' B' D B2 U R' D2 F2'}
    {'3'}    {'14.26'}    {'D2 L2 B' L2 B2 D2 L2 F2 D2 F R' D B R2 B' D' R2 B2 F L'      }
    {'4'}    {'9.34' }    {'B F U2 B R2 B' U2 F' D2 R' D' B2 D' B' U2 R' B' D' B''       }
    {'5'}    {'10.39'}    {'B F' U2 B2 L2 U2 R2 B' R2 B U B2 R2 D2 R2 B' L B R D2'       }
mat = str2double(tkn(:,1:2))
mat = 5×2
    1.0000    9.9100
    2.0000    9.8600
    3.0000   14.2600
    4.0000    9.3400
    5.0000   10.3900
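Since the goal is to run statistics on the extracted times, here is a minimal sketch of how the second column of mat could be summarised. The vector t is typed in by hand from the sample file so the snippet runs on its own, and the variable names are illustrative. For these five times, the trimmed average (dropping the single best and worst solve) and the N-normalised standard deviation round to the 10.05 and 1.79 shown in the file header; whether the timer program defines its statistics that way in general is an assumption.

```matlab
% Times from the sample file; stands in for mat(:,2) from the answer.
t = [9.91; 9.86; 14.26; 9.34; 10.39];

best   = min(t);       % fastest solve
worst  = max(t);       % slowest solve
avgAll = mean(t);      % plain mean of all solves
sdPop  = std(t, 1);    % standard deviation normalised by N
sdSamp = std(t);       % standard deviation normalised by N-1 (MATLAB default)

% Trimmed average: drop one best and one worst solve, average the rest.
% Assumption: this is how the file's "Average" line is defined.
s = sort(t);
avgTrim = mean(s(2:end-1));

fprintf('Best %.2f, Worst %.2f, Mean %.2f\n', best, worst, avgAll);
fprintf('Trimmed average %.2f, SD (N) %.2f, SD (N-1) %.2f\n', ...
        avgTrim, sdPop, sdSamp);
```

The default std(t) divides by N-1 and gives a noticeably larger value for such a short session, so it is worth choosing the normalisation deliberately.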
Optional extras:
% tkn(:,1:2) = num2cell(mat); % optional, but it is much better to store numeric
% data in a numeric array, so probably best avoided.
spl = regexp(tkn(:,3),'\w+','match')
spl = 5×1 cell array
    {1×20 cell}
    {1×22 cell}
    {1×20 cell}
    {1×19 cell}
    {1×20 cell}
spl{:}
ans = 1×20 cell array
{'D2'} {'L2'} {'B2'} {'L2'} {'F2'} {'D'} {'L2'} {'B2'} {'D'} {'U'} {'L'} {'U'} {'L'} {'D'} {'L'} {'F'} {'R'} {'B'} {'U'} {'B2'}
ans = 1×22 cell array
{'R2'} {'U2'} {'B2'} {'F2'} {'D'} {'L2'} {'D'} {'U'} {'R2'} {'U'} {'B2'} {'R'} {'U'} {'F'} {'U'} {'B'} {'D'} {'B2'} {'U'} {'R'} {'D2'} {'F2'}
ans = 1×20 cell array
{'D2'} {'L2'} {'B'} {'L2'} {'B2'} {'D2'} {'L2'} {'F2'} {'D2'} {'F'} {'R'} {'D'} {'B'} {'R2'} {'B'} {'D'} {'R2'} {'B2'} {'F'} {'L'}
ans = 1×19 cell array
{'B'} {'F'} {'U2'} {'B'} {'R2'} {'B'} {'U2'} {'F'} {'D2'} {'R'} {'D'} {'B2'} {'D'} {'B'} {'U2'} {'R'} {'B'} {'D'} {'B'}
ans = 1×20 cell array
{'B'} {'F'} {'U2'} {'B2'} {'L2'} {'U2'} {'R2'} {'B'} {'R2'} {'B'} {'U'} {'B2'} {'R2'} {'D2'} {'R2'} {'B'} {'L'} {'B'} {'R'} {'D2'}
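Note that '\w+' does not match the apostrophe, so counterclockwise moves such as D' come out as D in the lists above. If the direction of each turn matters, one possible variation (not part of the original answer) is to split on whitespace instead. The sketch below uses two scrambles copied from the sample file in place of tkn(:,3); the variable names are illustrative.

```matlab
% Two scrambles from the sample file; stands in for tkn(:,3).
% String literals are used so the apostrophes need no escaping.
scr = cellstr([ ...
    "D2 L2 B2 L2 F2 D L2 B2 D' U L' U' L D' L F' R' B' U B2"; ...
    "B F U2 B R2 B' U2 F' D2 R' D' B2 D' B' U2 R' B' D' B'"]);

% '\S+' matches each run of non-whitespace characters,
% so the prime mark stays attached to its move.
spl = regexp(scr, '\S+', 'match');

disp(spl{1}(9:12))   % the 9th to 12th moves of the first scramble
```

The move counts are the same as with '\w+', because every move is separated by whitespace; only the content of the primed moves changes.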
  5 Comments
Star Strider on 7 Feb 2022
Definitely worth voting for!
+1
Simon Montrose on 7 Feb 2022
@Stephen Of course, I didn't know I could switch them, my apologies! Still getting acquainted with the forums, but I've implemented your suggestions and have the script completely up and running :) Cheers!

Release

R2021b