Check Unix timestamps and REMOVE the data points outside 9:00 am to 4:15 pm in a second-by-second dataset

I have a list of about 70 million rows. I want to clean the dataset by deleting or consolidating the following:
  1. Any values that are 0, or that are 0.001 or less.
  2. Any values that lie outside the range 9:00 am to 4:15 pm.
  3. If multiple quotes share the same timestamp, replace them with a single entry holding the median price.
I am able to achieve the third point, but not the first and the second. Can someone guide me with this? Thanks.
  4 Comments
Harsh Rob on 20 Aug 2019
Apologies for the confusion caused.
This is the description of the RAW dataset I have:
Column 1 contains the timestamp in Unix format. It NEEDS to be part of the cleaned data.
The raw timestamps are stored as Unix numbers. I want to delete every data point that falls on a weekend or outside the range 9:00 hrs to 16:15 hrs. This could be done either by converting the timestamps to dd/mm/yyyy hh:mm:ss format or, if possible, directly on the Unix numbers.
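Filtering directly on the Unix numbers can be sketched with plain modular arithmetic. This is only a minimal sketch under several assumptions that the thread does not confirm: the timestamps are numeric, in seconds, counted from 1 Jan 1970 00:00:00 UTC, and the 9:00 to 16:15 window (taken here as inclusive at both ends) refers to the same clock after adding a hypothetical fixed offset `tzOffsetSec`. The sample values are made up for illustration.

```matlab
% Hypothetical sample timestamps (Unix seconds, assumed UTC):
% Mon 19 Aug 2019 08:59:59, 09:00:00, 16:15:00, 16:15:01, and Sat 24 Aug 2019 10:00:00
t = [1566205199; 1566205200; 1566231300; 1566231301; 1566640800];

tzOffsetSec = 0;                     % hypothetical offset from UTC to the exchange clock, in seconds
tLocal = double(t) + tzOffsetSec;    % double avoids the rounding of integer-class division

secOfDay = mod(tLocal, 86400);       % seconds elapsed since midnight
dayIndex = floor(tLocal / 86400);    % whole days since 1 Jan 1970
dow = mod(dayIndex + 4, 7);          % 0 = Sunday ... 6 = Saturday (1 Jan 1970 was a Thursday)

isWeekday = dow >= 1 & dow <= 5;
inSession = secOfDay >= 9*3600 & secOfDay <= 16*3600 + 15*60;

keepTime = isWeekday & inSession;    % logical mask of rows to keep
tKept = t(keepTime);
```

The mask `keepTime` can be applied to every column of the data at once, so no conversion to a date string is needed. If the timestamps turn out to be in milliseconds, or the time zone has daylight saving changes, a fixed offset like this would not be sufficient.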
Column 2 contains the price data. It NEEDS to be part of the cleaned data.
If the price is 0, delete the entire row.
If the price is less than 0.001, delete the entire row.
If several rows share the same timestamp, keep a single row per unique timestamp holding the median price. (I have figured this one out using the unique and accumarray functions.)
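The price rules can be sketched with logical indexing, followed by the unique and accumarray step mentioned above. This is a rough sketch, not the asker's actual code: the matrix `raw` and its values are hypothetical, and the threshold is taken as "less than 0.001" (the thread also words it as "0.001 or less", in which case `>` would replace `>=`).

```matlab
% Hypothetical sample: column 1 = Unix timestamp, column 2 = price, column 3 = unused
raw = [1566205200 101.5  7;
       1566205200 102.5  7;
       1566205200 110.0  7;
       1566205201 0      7;
       1566205202 0.0005 7;
       1566205203 99.0   7];

t = raw(:, 1);
p = raw(:, 2);

% Drop rows whose price is 0 or below 0.001 (assumed threshold, see above)
keep = p >= 0.001;
t = t(keep);
p = p(keep);

% One row per unique timestamp, holding the median of its prices
[tUnique, ~, idx] = unique(t);
pMedian = accumarray(idx, p, [], @median);

cleaned = [tUnique, pMedian];        % column 3 is simply not carried over
```

Filtering before the median step matters: otherwise a zero quote that shares a timestamp with valid quotes would pull the median down.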
Column 3 is NOT NEEDED in the cleaned data.
It is not required for my calculations, but it is part of the RAW data. It can be deleted as well.
Does this explanation make sense?
Jan on 21 Aug 2019
@Harsh Rob: I cannot know what "RAW dataset" means. Is it a binary or a text file? Have you been able to import it already? Converting the time to a datevec or datetime object allows you to create a matching filter easily.
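A filter based on datetime might look like the sketch below. It assumes the timestamps are already imported as numeric Unix seconds and that the 9:00 to 16:15 window (taken as inclusive) is meant on the UTC clock; neither is confirmed in the thread, and the sample values are hypothetical. A 'TimeZone' name-value argument could be passed to datetime if the exchange clock differs from UTC.

```matlab
% Hypothetical sample timestamps (Unix seconds):
% Mon 19 Aug 2019 08:59:59, 09:00:00, 16:15:00, 16:15:01, and Sat 24 Aug 2019 10:00:00
t = [1566205199; 1566205200; 1566231300; 1566231301; 1566640800];

% Convert Unix seconds to datetime (unzoned result, read here as UTC clock time)
dt = datetime(t, 'ConvertFrom', 'posixtime');

tod = timeofday(dt);                 % duration since midnight for each timestamp
inSession = tod >= hours(9) & tod <= hours(16) + minutes(15);

keep = inSession & ~isweekend(dt);   % inside the window and not Saturday/Sunday
dtKept = dt(keep);
```

The same logical mask `keep` can index the original numeric timestamps and prices, so the cleaned data can stay in Unix format.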
It is still not clear how your data are represented. A "timestamp in Unix format" could be a UINT64, a string containing the digits of the UINT64, or something else.
Please post a small example of the inputs.
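Until the representation is known, one way to bring the timestamps into a common numeric form is sketched below. The variable `tRaw` and its contents are hypothetical; the sketch only covers the two cases named above (an integer class, or text containing the digits).

```matlab
% Hypothetical input: timestamps imported as text
tRaw = {'1566205200'; '1566205201'};

if ischar(tRaw) || iscellstr(tRaw) || isstring(tRaw)
    % Text containing the digits: parse each entry into a double
    t = str2double(cellstr(tRaw));
else
    % Numeric classes such as uint64: doubles hold integers exactly up to 2^53
    t = double(tRaw);
end
```

Entries that cannot be parsed come back as NaN from str2double, so checking any(isnan(t)) afterwards would flag malformed rows.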

Answers (0)
