How to implement something similar to Vlookup with different lenght's vectors??

Good afternoon everyone,
I have a very big database of financial data presented in this way:
Of course my database is much more complicated, i have around 200 equities and some thousand data each.
Like this example, also my database has different number of records for each equity.
I would like to build something like this:
I excel i could easily use the VLookup function but I'm not very confident in Matlab language and i have no idea of where i should start.
Thanks for your help.

1 Comment

I attach a sample file that should help you guys to understand what i have in input and what i need to have in output.
Just imagine that I have 30.000 rows and 100 columns, so is not possible to work "by-hand" or in Excel (too slow). Thanks

Sign in to comment.

 Accepted Answer

Okay, I wrote a function that will do this for you. While I was attempting to pull the data from the files you supplied "Company1.mat, Company2.mat and TimeStamp.mat" I didn't really understand what I received. Each file had 2 columns, but these columns were identical to one another. I expected to see two columns, one with a timestamp and the other with a value. So, instead, I ran this function using the data you supplied originally.
The catch here is that you have to supply the data to this function in a particular way. The input requires each company's information to be contained within one giant matrix with the timestamp first and the value second. It also requires you pass to it the timestamps you wish to get everything at. If you can't figure out how to do these two steps then we can go into that, but I think you can get it fairly easily.
The function has two outputs: first, the new timestamps, and second, the values from each of the companies at those timestamps. If I were running this code on my data I would call it like this...
[newTimes,newData] = sortData(companies,DesiredTimeStamps);
where the newTimes and newData variables will be the function outputs and the companies and DesiredTimeStamps variables will be the function inputs.
Save the function to its own m-file so that you can call it that way. It is...
function [new_time,newData] = sortData(oldData,new_time)
% Reshape incoming variables
newTime = zeros(length(oldData),1);
newTime(1:length(new_time))=new_time;
% Define the number of companies and the size of the output matrix
numCompany = length(oldData(1,:))/2;
newData = zeros(length(oldData),numCompany);
% For each "i" we check a new company
% Within each company we loop through each instance of the desired time
% stamps "j"
% For each desired time step we loop through each value of the the "ith"
% companies timestamps to see if they are the same. If they are we then
% assign that value to it's particular location within the new data set.
i = 1;
comp = 1;
while i < numCompany*2;
for k=1:length(newTime)
for j=1:length(oldData)
if newTime(k) == 0;
break
elseif newTime(k) == oldData(j,i);
newData(k,comp) = oldData(j,i);
break
end
end
end
i = i+2;
comp = comp+1;
end
% Reshape the output Data
% newData = [newTime newData];
newData = newData(any(newData,2),:);
% Change the time column to a string
new_time = new_time(any(new_time,2),:);
new_time=datestr(new_time,'HH:MM:SS PM');
Good luck. I hope this helps.

1 Comment

Well, thank you very much for the big big help!
There were a little mistake in the code but for the rest was ok..
Wrong newData(k,comp) = oldData(j,i);
Correct newData(k,comp) = oldData(j,i+1);
I really appreciated you help, hope to be able to help you sometime. Bye

Sign in to comment.

More Answers (2)

In order to provide you with a set of code to solve your problem I would have to know the database as a whole. However, to get you started you could write a for loop that compares each value within your array of data with a set value. For example,
desiredValue = 1000;
for i=1:size(dataSet)
if dataSet(i) == desiredValue
newValue = dataSet(i);
break;
end
i = i+1;
end
This loop would compare each item within the matrix dataSet to a desired value and would terminate when that value was located.
That should get you started. Hard to go any further without more information on the specific dataSet you're talking about.

5 Comments

Hi Chris, and thanks.
What do you mean about "more information on the specific dataSet"?
My database start from 8am of three days ago and terminate at 5pm of yesterday. The problem is that for each company i have different number of records for this period: for example google may have 3000 records for the prices while i have just 1000 for apple. So i have to create a unique timestamp, step that i already did.
Now i need to create a new database and this time it will be a rectangular matrix, with the same number of records for each company.
On the left column of the database i need to have the timestamp and then each column should be a different company. Now is it more clear?
Thanks
Yes, that's more clear. Thanks for the clarification. Here's the algorithm I would use:
Step 1. Determine the timestamps you want. That is likely going to be decided by whichever company has the fewest number of timestamps.
Step 2. Loop through each company's data comparing the timestamp you want to find with the value at each location within the data set. Kind of confusin, so, for example:
for j = 1:length(desiredTimeStamps)
for i = 1:length(googleData(:,1))
if googleData(i,1) == desiredTimeStamp(j)
newGoogleData(i) = googleData(i,2)
end
end
end
That will run a loop for each data point within your google data set. Repeat for each company.
Then compile the results by:
newData = [desiredTimeStamps;newGoogleData;newAppleData;...];
Good luck!!!
I tried to run this code but after more than 1 hour the process is still going on, and i tried with just one company. I guess that there are some problems with the loop.
I attach you the timestamp file and two columns of two random companies, so maybe you can have a look and understand better my problem.
Many thanks to you and to whoever would be so kind to help me!

Sign in to comment.

Are there any other suggestions?
Please try to check your ideas with the samples i attached.
Unfortunately Chris code doesn't seem to run properly.
Thanks

Categories

Find more on Loops and Conditional Statements in Help Center and File Exchange

Asked:

Ale
on 13 Mar 2014

Commented:

Ale
on 17 Mar 2014

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!