New column

Hello,
I am working with the excerpt below from a txt file (the original is huge):
Stock Date Time Price Volume Stock Category
>ETE 04/01/2010 10145959 18.31 500 Big Cap
>ETE 04/01/2010 10150000 18.01 70 Big Cap
>ETE 04/01/2010 10170000 18.54 430 Big Cap
>ABC 04/01/2010 10190000 18.34 200 Big Cap
>YYY 04/01/2010 10200000 18.34 100 Big Cap
>ETE 04/01/2010 10250000 18.31 40 Big Cap
>ETE 04/01/2010 10295959 18.74 215 Big Cap
>ETE 04/01/2010 10300000 18.74 500 Big Cap
>ETE 04/01/2010 10320000 18.34 500 Big Cap
I need to create a new variable (column seven, let's say 'Trade'). Its first value will be arbitrarily assigned to 'BUY' (is there any code to do that?). After that, its value should be 'BUY' if the row's Price (column 4) is higher than the previous row's Price, and 'SELL' if it is lower. If the prices are equal, it takes the previous row's Trade value: 'BUY' if the previous row is 'BUY', otherwise 'SELL' (that is why I define the first value arbitrarily),
so the sample will look like this:
Stock Date Time Price Volume Stock Category Trade
>ETE 04/01/2010 10145959 18.31 500 Big Cap BUY
>ETE 04/01/2010 10150000 18.01 70 Big Cap SELL
>ETE 04/01/2010 10170000 18.54 430 Big Cap BUY
>ABC 04/01/2010 10190000 18.34 200 Big Cap SELL
>YYY 04/01/2010 10200000 18.34 100 Big Cap SELL
>ETE 04/01/2010 10250000 18.31 40 Big Cap SELL
>ETE 04/01/2010 10295959 18.74 215 Big Cap BUY
>ETE 04/01/2010 10300000 18.74 500 Big Cap BUY
>ETE 04/01/2010 10320000 18.34 500 Big Cap SELL
Any help?
Thanks in advance,
Panos

 Accepted Answer

Matt Tearle on 22 Mar 2011
I think this does what you want:
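Price = randi(10,20,1)          % example price data; replace with your own
buy = [true;diff(Price)>=0];    % first row is arbitrarily "buy"; rising or equal price -> true
idx = find(~diff(Price));       % rows that are followed by an unchanged price
for k = idx'                    % go in order so runs of equal prices carry the decision forward
    buy(k+1) = buy(k);
end
[Price,buy]                     % show prices next to the decisions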
I'm using a logical array buy that is true for "buy" and false for "sell". You could use a nominal array (if you have Statistics Toolbox) to assign arbitrary labels (i.e. "buy" and "sell"), but a logical array is probably the easiest way to get what you want.
EDIT: if you really want a column of strings (for output purposes), this will do it:
buysell = cellstr(repmat('Sell',size(Price)));
buysell(buy) = {'Buy'}
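For what it's worth, here is a loop-free sketch of the same rule, checked against the sample prices from the question. It is only one possible way to write it: the idea is to mark each row as up, down or unchanged, and then let every unchanged row take the direction of the most recent row where the price actually moved, so several equal prices in a row are carried forward as well. The variable names d, moved and pos are mine, and the labels are upper case as in the question.

```matlab
% Prices from the sample in the question
Price = [18.31; 18.01; 18.54; 18.34; 18.34; 18.31; 18.74; 18.74; 18.34];

d = [1; sign(diff(Price))];     % +1 up, -1 down, 0 unchanged; first row set to +1 ("buy")
moved = d ~= 0;                 % rows where the price actually changed
pos = find(moved);              % positions of those rows
d = d(pos(cumsum(moved)));      % each row takes the direction of the latest change
buy = d > 0;                    % true = "buy", false = "sell"

% Labels as in the question
buysell = cellstr(repmat('SELL',size(Price)));
buysell(buy) = {'BUY'};
```

For the sample this gives BUY, SELL, BUY, SELL, SELL, SELL, BUY, BUY, SELL, which matches the Trade column requested in the question.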

3 Comments

Pap on 22 Mar 2011
Thanks again Matt,
With the first piece of code, how can I write the result back out to the original txt file?
Many thanks indeed
Panos
Pap on 1 Apr 2011
Hi Matt,
May I ask what the first row of the above code ('randi(10,20,1)') is for?
Does it specify the number of rows for which random values are generated?
I am trying to apply this to a larger dataset, but I get the same output.
- Can I apply the above if I do not know the exact number of rows (because I work with a huge ASCII dataset)?
- How can I put the output (cell array) into the original ASCII file?
Many thanks
Panos
Matt Tearle on 2 Apr 2011
The first line was just to make some example price data. Remove it and use your data.
See my new answer for the whole process of reading and writing.


More Answers (1)

Matt Tearle on 2 Apr 2011
% Read stock data from file
fid = fopen('stocks.txt');
data = textscan(fid,'%s%s%f%f%f%[^\n]','delimiter',' ','headerlines',1);
% Read as text (for later writing)
frewind(fid);
txt = textscan(fid,'%s','delimiter','\n');
fclose(fid);
% Get prices from imported data
Price = data{4};
% Determine which stocks to buy
buy = [true;diff(Price)>=0];
idx = find(~diff(Price));
% Carry the previous decision forward through runs of equal prices
for k = idx'
    buy(k+1) = buy(k);
end
% Make string of trade decision
buysell = cellstr(repmat(' Sell',size(Price)));
buysell(buy) = {' Buy'};
% Open file for writing
fid = fopen('stocks2.txt','wt');
% Make output string by appending trade decision
outstr = strcat(txt{1},[' Trade';buysell]);
% Write out
fprintf(fid,'%s\n',outstr{:});
fclose(fid);
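If the file is too big to read in one go, or the number of rows is not known in advance, the same rule can also be applied one line at a time. The following is only a rough sketch under some assumptions: the fields are separated by single spaces, Price is the fourth field on every data line, and there are no blank lines. The file names and the small demo input are made up so that the example runs on its own.

```matlab
% Hypothetical file names; a tiny demo input is written first
inName  = fullfile(tempdir,'stocks_demo.txt');
outName = fullfile(tempdir,'stocks_demo_trade.txt');
fid = fopen(inName,'wt');
fprintf(fid,'%s\n','Stock Date Time Price Volume Stock Category', ...
    '>ETE 04/01/2010 10145959 18.31 500 Big Cap', ...
    '>ETE 04/01/2010 10150000 18.01 70 Big Cap', ...
    '>ETE 04/01/2010 10170000 18.54 430 Big Cap', ...
    '>ABC 04/01/2010 10190000 18.34 200 Big Cap', ...
    '>YYY 04/01/2010 10200000 18.34 100 Big Cap');
fclose(fid);

fin  = fopen(inName,'rt');
fout = fopen(outName,'wt');
fprintf(fout,'%s Trade\n',fgetl(fin));     % copy the header and add the new column name
prevPrice = NaN;                           % no previous row yet
trade = 'BUY';                             % arbitrary first value
tline = fgetl(fin);
while ischar(tline)                        % fgetl returns -1 at end of file
    parts = regexp(tline,' ','split');     % assumes single-space separated fields
    price = str2double(parts{4});          % assumes Price is the fourth field
    if price > prevPrice
        trade = 'BUY';
    elseif price < prevPrice
        trade = 'SELL';
    end                                    % equal price (or first row): keep previous value
    fprintf(fout,'%s %s\n',tline,trade);
    prevPrice = price;
    tline = fgetl(fin);
end
fclose(fin);
fclose(fout);
```

Only one line is held in memory at a time, so the size of the file should not matter much, at the cost of being slower than the textscan approach on small files.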

4 Comments

Pap on 2 Apr 2011
Thanks Matt,
I used the above, but when I try to write it out I get the error message below:
>fprintf(fid,'%s\n',outstr{:});
??? Error using ==> fprintf
Invalid file identifier. Use fopen to generate a valid file identifier.
Any idea what I may have done wrong?
Thanks again
Panos
Matt Tearle on 3 Apr 2011
It could be something like a file/directory permissions problem. Have a look at the value of fid -- I'm guessing it's -1, which indicates a problem in opening the file. In that case, do
[fid, message] = fopen(...)
and see what "message" is.
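As a small illustration of that check, the sketch below deliberately tries to open a file in a folder that should not exist, so that fopen fails and returns a message. The folder name is made up for the example; with your own file you would pass your real file name instead.

```matlab
% Hypothetical path inside a folder that is assumed not to exist, to force a failure
badName = fullfile(tempdir,'no_such_folder_xyz','stocks2.txt');
[fid, message] = fopen(badName,'wt');
if fid == -1
    % fopen failed: message holds the system's explanation
    fprintf('Could not open %s: %s\n',badName,message);
else
    fclose(fid);
end
```

Checking fid right after fopen means the problem shows up where the file is opened, instead of later in fprintf as an "Invalid file identifier" error.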
Pap on 5 Apr 2011
Hi Matt,
I also applied the above code to a big txt file (540 MB, with almost 12,000,000 rows) and it doesn't seem to work. For data=textscan.... I get a 6x1 cell array (headers only) and not the data, unlike with the sample file (830120x1 cell array). Also, when I define column 4 (Price) I do not get values as with the sample file; I get '[]' instead. As a result I also get '[]' for idx=.., etc.
Any hint on what I may have done wrong?
Is there any limit on the amount of data in MATLAB?
Panos
Matt Tearle on 6 Apr 2011
I've added an answer to your new question about this, but as an aside here: data should be a 1-by-6 cell array. Each cell should contain an n-by-1 array of appropriate type (cell or double).
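A quick way to inspect what textscan returned is to look at the size of data and at how many rows each cell holds. The sketch below is only illustrative: it parses two sample rows from a string instead of a file (so there is no header line to skip), using the same format as in the answer above.

```matlab
% Two sample rows held in a string instead of a file (for illustration only)
str = sprintf('%s\n%s\n', ...
    '>ETE 04/01/2010 10145959 18.31 500 Big Cap', ...
    '>ETE 04/01/2010 10150000 18.01 70 Big Cap');
data = textscan(str,'%s%s%f%f%f%[^\n]','delimiter',' ');

size(data)                % should be 1-by-6: one cell per column
cellfun(@numel,data)      % number of rows read into each column
class(data{4})            % Price column is numeric (double)
```

If the row counts differ between columns, or are much smaller than the number of lines in the file, that usually points to the line where the format stopped matching the data.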

Asked: Pap on 22 Mar 2011