split a row into 2 rows


chocho on 16 Feb 2017


Direct link to this question

https://in.mathworks.com/matlabcentral/answers/325333-split-a-row-into-2-rows

Commented: chocho on 22 Feb 2017

cg00008493  0.987979722052904  "COX8C;KIAA1409"  14  93813777  0.986128428295584  "COX8C;KIAA1409"  14  93813777
cg00031162  0.378288688845672  "TNFSF12;TNFSF12-TNFSF13"  17  7453377  0.362510745266914  "TNFSF12;TNFSF12-TNFSF13"  17  7453377

Here are 2 lines, each with 9 columns. I want to split every line whose gene field holds two names, such as "COX8C;KIAA1409", into 2 rows (one per gene name) and delete the duplicated columns. The output should look like this:

cg00008493  0.987979722052904  COX8C   0.986128428295584
cg00008493  0.987979722052904  KIAA1409   0.986128428295584
cg00031162  0.378288688845672  TNFSF12    0.362510745266914
cg00031162  0.378288688845672  TNFSF12-TNFSF13 0.362510745266914
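
My attempt so far:

fid = fopen('COADREAD_methylation.txt','r');
data = {};
while ~feof(fid)
    l = fgetl(fid);
    % keep only the lines that do not contain 'NA'
    if isempty(strfind(l,'NA')), data = [data; {l}]; end
    % attempted split of the line (this call does not work)
    a = reshape(l, ',','""', [])';
end
fclose(fid);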

Note: I test for 'NA' to skip the lines that contain NA.
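
One thing worth checking with the NA test in the loop above: strfind matches substrings, so a line whose gene name merely contains the letters "NA" would also be skipped. The sketch below compares the substring test with a whole-field test. The sample lines, the gene names in them, and the variable names (txtLines, keepSub, keepTok) are made up for illustration, and it assumes fields are separated by whitespace.

```matlab
% Hypothetical sample lines; only the third has a value marked NA.
txtLines = {'cg01  0.5  "DNAJB1;GENE2"  1  100  0.4  "DNAJB1;GENE2"  1  100'
            'cg02  0.7  "ABC"  2  200  0.6  "ABC"  2  200'
            'cg03  NA  "XYZ"  3  300  0.2  "XYZ"  3  300'};

% Substring test, as in the loop above: also rejects cg01 because
% "DNAJB1" contains the letters NA.
keepSub = cellfun(@isempty, strfind(txtLines, 'NA'));

% Whole-field test: NA must be bounded by whitespace or the line ends.
keepTok = cellfun(@isempty, regexp(txtLines, '(^|\s)NA(\s|$)', 'once'));

data = txtLines(keepTok);   % lines kept for further processing
```

Whether the stricter test is needed depends on the gene names that actually occur in the file.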


Accepted Answer

Stephen23 on 16 Feb 2017


Direct link to this answer

https://in.mathworks.com/matlabcentral/answers/325333-split-a-row-into-2-rows#answer_255037

Edited: Stephen23 on 17 Feb 2017


temp1.txt

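opt = {'CollectOutput',true};
% keep id, value 1, quoted gene list, value 2; skip (%*) the repeated columns
inp = '%s%s%q%*d%*d%s%*q%*d%*d';
out = '%s\t%s\t%s\t%s\n';
f1d = fopen('temp1.txt','rt'); % the original file
f2d = fopen('temp2.txt','wt'); % the new file
while ~feof(f1d)
    C = textscan(f1d,inp,1,opt{:}); % read one line
    C = [C{:}];                     % unwrap to a 1x4 cell of strings
    D = regexp(C{3},';','split');   % split the gene list on ';'
    for k = 1:numel(D)
        % write one output row per gene name
        fprintf(f2d,out,C{1:2},D{k},C{4});
    end
end
fclose(f1d);
fclose(f2d);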

Produces this output file:

cg00008493  0.987979722052904  COX8C  0.986128428295584
cg00008493  0.987979722052904  KIAA1409  0.986128428295584
cg00031162  0.378288688845672  TNFSF12  0.362510745266914
cg00031162  0.378288688845672  TNFSF12-TNFSF13  0.362510745266914

Tested on this input file: temp1.txt (attached).
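
To try the core step of the loop above without any files, the following sketch parses one sample line from the question held in a char vector and collects the split rows in a cell array. The variable names (txt, genes, rows) are chosen here for illustration and are not part of the original answer.

```matlab
% One sample line from the question, held in memory instead of a file.
txt = ['cg00008493  0.987979722052904  "COX8C;KIAA1409"  14  93813777  ' ...
       '0.986128428295584  "COX8C;KIAA1409"  14  93813777'];

% Same format as above: keep 4 fields as text, skip the repeated columns.
C = textscan(txt, '%s%s%q%*d%*d%s%*q%*d%*d', 'CollectOutput', true);
C = [C{:}];                          % 1x4 cell: id, value 1, genes, value 2

genes = regexp(C{3}, ';', 'split');  % {'COX8C','KIAA1409'}
rows  = cell(numel(genes), 4);
for k = 1:numel(genes)
    rows(k,:) = {C{1}, C{2}, genes{k}, C{4}};  % one row per gene name
end
disp(rows)
```

Reading the two values with %s keeps their text exactly as it appears in the file, so no digits change when the rows are written back out.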


Stephen23 on 22 Feb 2017

If textscan has an empty output then you probably need to check the format string.

chocho on 22 Feb 2017

Could you tell me how to write the format string for this line?

cg00000292 0.511852232819811 ATP2A1 0.787687855895422 0.51208122605745 0.599610258157912 0.568034757766559
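
A possible format for a line shaped like that one, as a minimal sketch: it assumes the fields are separated by whitespace, the gene name is unquoted, and there is exactly one id, one gene name, and five numeric values per line. The variable names (txt, fmt, id, gene, vals) are illustrative.

```matlab
txt = ['cg00000292 0.511852232819811 ATP2A1 0.787687855895422 ' ...
       '0.51208122605745 0.599610258157912 0.568034757766559'];

% %s for the two text fields, %f for each numeric field.
fmt = '%s%f%s%f%f%f%f';
C = textscan(txt, fmt);

id   = C{1}{1};          % 'cg00000292'
gene = C{3}{1};          % 'ATP2A1'
vals = [C{[2 4:7]}];     % 1x5 row of the numeric values
```

If the numbers have to be written back out with exactly the same digits, reading them with %s instead of %f avoids any reformatting.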
