Ben on 23 Sep 2022
Direct link to this question
https://matlabcentral.mathworks.com/matlabcentral/answers/1810965-how-can-i-read-rows-from-a-csv-outside-of-datalines-using-readtable
Commented: Ben on 26 Sep 2022
Accepted Answer: dpb
I am trying to read a CSV with a specific format that I can't change. I can read most of the data, but I lose some data from above the start of the data lines (in the header, which also has a units line). Below is a section of the data I want to read:
; Bat_MaximumVoltage_x; Bat_MaximumVoltage_y; Bat_MinimumVoltage_x; Bat_MinimumVoltage_y
0; XY ; ; XY ;
1; ; ; ;
2; s ; V ; s ; V
3; 02.06.2022 05:03:33 ; ; 02.06.2022 05:03:33 ;
4; 02.06.2022 15:00:22 ; ; 02.06.2022 15:00:22 ;
5; X-Values ; Y-Values ; X-Values ; Y-Values
6; -2,51E-7 ; 3954 ; 1,11E-7 ; 3933
7; 0,240157917 ; 3953 ; 0,240157889 ; 3933
8; 0,478257917 ; 3953 ; 0,478257555 ; 3933
9; 0,71209725 ; 3953 ; 0,712096888 ; 3933
10; 1,020145332 ; 3953 ; 1,020144999 ; 3933
11; 1,258192585 ; 3953 ; 1,258192221 ; 3933
12; 1,491101584 ; 3953 ; 1,491101222 ; 3933
The csv file I am reading has 40 signal (_y) columns and 39 time (_x) columns. I have manually added the spacing you see above to improve readability.
I can import the data into a table using the following code:
opts = detectImportOptions(filename);
opts.DataLines = [8, Inf];
opts.Delimiter = ";";
opts.VariableUnitsLine = 4;
opts.VariableNames{1} = 'RowNum';
idxTimeVars = find(endsWith(opts.VariableNames,"_x"));
idxSignalVars = find(endsWith(opts.VariableNames,"_y"));
opts = setvaropts(opts, idxTimeVars, "Type","double");
opts = setvaropts(opts, idxTimeVars, "DecimalSeparator", ",");
opts = setvaropts(opts, idxSignalVars, "Type","double");
opts = setvaropts(opts, idxSignalVars, "DecimalSeparator", ",");
T = readtable(filename, opts);
But I want to create a timetable with this data, where the time vector is made of the start time (row number 3) added to the 'X-values', for example:
startTime = datetime("02.06.2022 05:03:33", "InputFormat", "dd.MM.yyyy HH:mm:ss"); % datetime value once read from the header
timeVec = startTime + seconds(T.Bat_MaximumVoltage_x);
Then replace the double (type) '_x' columns with datetime (type) columns.
How can I store or access this data from the header of the file (above DataLines) while using readtable to import data below DataLines?
Do I just have to run textscan or readtable again to read those two lines separately?
Accepted Answer
dpb on 23 Sep 2022
"Do I just have to run textscan or readtable again to read those two lines separately?"
Yes, unfortunately. Unlike textscan, readtable cannot make multiple reads within the same file without closing and reopening it. You can, however, fix up a second import options object to cull out the header with something like
opt=detectImportOptions('ben.csv',"Range",[4 6],'ReadVariableNames',1,'Delimiter',';')
opt.DataLines=5:6;
opt=setvaropts(opt,{'s','s_1'},'Type','datetime');
tHdr=readtable('ben.csv',opt);
By default, datetime will interpret the ambiguous day/month text above as 2 June. If these dates are actually February 6, you will need to set the 'InputFormat' option of the datetime variables (via setvaropts) to match.
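As an alternative to a second readtable call, the header row can also be pulled out as plain text and parsed by hand. The following is only a sketch under stated assumptions: it swaps in readlines (R2020b or later) instead of an import options object, the temporary file and its contents are hypothetical stand-ins modeled on the sample in the question, and it assumes the start times sit on file line 5. Passing an explicit InputFormat removes the day/month ambiguity.

```matlab
% Write a tiny stand-in file (hypothetical contents modeled on the question)
fname = [tempname '.csv'];
sample = [ ...
    ";Bat_MaximumVoltage_x;Bat_MaximumVoltage_y"
    "0;XY;"
    "1;;"
    "2;s;V"
    "3;02.06.2022 05:03:33;"
    "4;02.06.2022 15:00:22;"
    "5;X-Values;Y-Values"
    "6;-2,51E-7;3954"];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', join(sample, newline));
fclose(fid);

% Read the whole file as text and split the start-time row on the delimiter
raw = readlines(fname);                    % string array, one element per line
fields = strtrim(split(raw(5), ";"));      % assumed: file line 5 holds the start times
startTime = datetime(fields(2), "InputFormat", "dd.MM.yyyy HH:mm:ss");

delete(fname);                             % clean up the stand-in file
```

With a day-first InputFormat the text is parsed as 2 June 2022; a month-first format would have to be chosen instead if the file really stores February 6.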
Ben on 25 Sep 2022
Edited: Ben on 26 Sep 2022
- ben.csv
Okay, that's unfortunate. Follow-up question:
So now I have my tHdr where the _x variables contain the start and end timestamps, and T which contains the actual data and _x variables contain seconds since start stored as double.
How can I best combine them? I think I have to loop through the _x variables and create a vector (per variable) using data from both tables. But when I try to replace the seconds with actual timestamps I can't because the existing variable is double type.
Is my best option just creating a new table instead of modifying the existing ones?
Thanks
Here's what I have so far:
filename = "ben.csv";
opts = detectImportOptions(filename);
% Specify range and delimiter
opts.DataLines = [8, Inf];
opts.Delimiter = ";";
opts.VariableUnitsLine = 4;
% Specify column names and types
opts.VariableNames{1} = 'RowNum';
idxSignalVars = find(endsWith(opts.VariableNames,"_y"));
idxTimeVars = find(endsWith(opts.VariableNames,"_x"));
opts = setvaropts(opts, idxSignalVars, "Type","double");
opts = setvaropts(opts, idxSignalVars, "DecimalSeparator", ",");
% Check if times are stored as datetime or seconds
if ~any(opts.VariableTypes == "datetime") %True if no datetime types
    opts.DataLines = [5, 6];
    opts = setvaropts(opts, idxTimeVars, "Type","datetime");
    opts = setvaropts(opts, idxTimeVars, "InputFormat", "dd.MM.yyyy HH:mm:ss");
    tableStartEndTimes = readtable(filename, opts);
    %%%% Reset for loading data
    opts.DataLines = [8, Inf];
    opts = setvaropts(opts, idxTimeVars, "Type","duration");
    opts = setvaropts(opts, idxTimeVars, "DecimalSeparator", ",");
    opts = setvaropts(opts, idxTimeVars, "DurationFormat", "s");
else
    opts = setvaropts(opts, idxTimeVars, "InputFormat", "dd.MM.yyyy HH:mm:ss");
end
% Read table
T = readtable(filename, opts);
head(T)
    RowNum    Bat_MaximumVoltage_x    Bat_MaximumVoltage_y    Bat_MinimumVoltage_x    Bat_MinimumVoltage_y
    ______    ____________________    ____________________    ____________________    ____________________
       6      NaN sec                 3954                    NaN sec                 3933
       7      NaN sec                 3953                    NaN sec                 3933
       8      NaN sec                 3953                    NaN sec                 3933
       9      NaN sec                 3953                    NaN sec                 3933
      10      NaN sec                 3953                    NaN sec                 3933
      11      NaN sec                 3953                    NaN sec                 3933
      12      NaN sec                 3953                    NaN sec                 3933
% Make table timestamps datetime (if not already)
if ~any(opts.VariableTypes == "datetime")
    ii=1; %for loop here
    startTime = tableStartEndTimes(1, idxTimeVars(ii));
    timeNumSecs = T(:, idxTimeVars(ii)).Variables;
    timeVec = startTime.Variables + seconds(timeNumSecs);
    % Insert into table somehow???
    % T(:, idxTimeVars(ii)) = table(timeVec);
    % T(:, idxTimeVars(ii)) = timeVec;
end
dpb on 25 Sep 2022
Edited: dpb on 25 Sep 2022
It would be easiest to illustrate with the actual data file, but the answer is to read the seconds columns as the <duration> type, which you can then add to the starting datetime.
I didn't try to parse the above(*) to figure out just what you're trying to do there with all the conditionals, but set the 'X-Values' input data type to 'duration' and the corresponding 'DurationFormat' to 's'.
You can then use addressing in the table by the <vartype> function which returns a subscripting variable as
isDur=vartype('duration');
tT(:,isDur)=tStart+tT(:,isDur);
where tT is your data table and tStart is the starting time datetime value.
You can also leave the input sample times as the double and use seconds to convert them to the duration afterwards, but since you're going to the trouble to build the import object to match the data file, why not finish the job there?
(*) Attach code snippets as text formatted with the "Code" button (or, more easily, select the text and press Ctrl-E); nobody can do anything with images directly.
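To make the vartype addressing mentioned above concrete, here is a minimal sketch on a toy table. The table contents and variable names (t_x, v_y) are hypothetical; the only point is that vartype takes the type name alone and returns a subscript that is then used inside the table's parentheses.

```matlab
% Toy table (hypothetical names): one duration variable, one double variable
tT = table(seconds([0; 1.5; 3]), [3954; 3953; 3953], ...
    'VariableNames', {'t_x', 'v_y'});

isDur = vartype('duration');                   % subscript object, built without the table
durOnly = tT(:, isDur);                        % sub-table holding only the duration variables
durNames = durOnly.Properties.VariableNames;   % names of the selected variables
```

The selected names can then be passed on to whatever conversion step follows, so the time columns never have to be listed by hand.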
Ben on 26 Sep 2022
Edited: Ben on 26 Sep 2022
That's helpful, thanks. How to import duration values is what I need. But I just get NaN sec when I set it to duration.
According to the docs, duration does not support seconds by itself. My data is <secs>,<milliseconds> which I guess I can't import directly? Or the DurationFormat=s does not support fractional seconds?
opts = setvaropts(opts, idxTimeVars, "Type","duration");
opts = setvaropts(opts, idxTimeVars, "DecimalSeparator", ",");
opts = setvaropts(opts, idxTimeVars, "DurationFormat", "s.SSSSSSSSS");%doesn't work
The data 'ben.csv' that I guess you constructed from the data in my original post is entirely representative of the actual data (I'm actually using it for quicker testing). The seconds continue increasing and there are many more columns of xy pairs of data. I updated my original post with a few extra rows to get to full seconds. I also attached it to my above comment.
Also, I have inserted all the code with the button that says "Insert MATLAB code example" on the editor window and copied the code as text directly from a working MATLAB online file. But I can format my previous comment using the other code formatting.
It was indeed my hope to import the data with the formatting instead of fixing it afterwards. But I don't think I can import duration data like 35802,890410352 seconds (with , as the decimal separator, that is 35802 seconds and 890 milliseconds), which is the duration value from near the end of the file.
Stephen23 on 26 Sep 2022
Edited: Stephen23 on 26 Sep 2022
@Ben: unfortunately the DURATION class has a very restricted small set of valid formats which it can either display or accept as input text (given here). My guess is that you will need to import that data as numeric and then convert to DURATION using SECONDS:
D = seconds(35802.890410352);
D.Format = 'hh:mm:ss.SSSSSSSSS'
D = duration
09:56:42.890410352
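Carrying that idea one step further, a rough sketch of the numeric-then-convert route might look like the following. The values are copied from the sample rows in the question; the variable names secs and volts are placeholders, and the start time is assumed to have been read from the header already.

```matlab
% Seconds-since-start as they would arrive after importing the _x column as double
secs  = [-2.51e-7; 0.240157917; 0.478257917];
volts = [3954; 3953; 3953];

% Start time taken from the header row (day-first format assumed)
startTime = datetime("02.06.2022 05:03:33", ...
    "InputFormat", "dd.MM.yyyy HH:mm:ss", ...
    "Format", "dd.MM.yyyy HH:mm:ss.SSS");

% seconds() turns the doubles into durations, which add directly to a datetime
timeVec = startTime + seconds(secs);

% Build the timetable with the absolute timestamps as row times
TT = timetable(timeVec, volts);
```

The fractional seconds survive because the conversion happens on the numeric values rather than through a duration text format.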
dpb on 26 Sep 2022
Yeah, that's a gotcha'! on the duration class, indeed.
The datetime/duration classes are pretty new yet and not quite fully developed -- particularly the duration is yet to come fully of age on its limitations on formats.
This would be a great example to use as an enhancement request -- will see about crafting one...
Using the double and converting as @Stephen23 says isn't that bad to have to do, but would be cleaner if could go at it directly.
Ben on 26 Sep 2022
I might send a feature request too! It's unfortunate, but I will survive xD
Thank you both for your answers and support.
Ben on 26 Sep 2022
I was having some trouble getting the data into the table, as the data table has the time column as a duration and I can't replace duration data with datetime due to the type mismatch. I was trying
T(:, idxTimeVars(ii)) = {timeVec};
where timeVec is a datetime vector. But MATLAB says timeVec needs to be a duration array.
I found that I could pass an anonymous function to convertvars which allows me to do what I want. See code below.
One thing not shown by the sample data is that not all columns have the same start time, which is why this is a loop that goes through each time column.
T = convertvars(T, idxTimeVars, "seconds");
for ii = 1:length(idxTimeVars)
    startTime = tableStartEndTimes(1, idxTimeVars(ii)).Variables;
    startTime.Format = 'dd.MM.yyyy HH:mm:ss.SSS';
    addDatetime = @(x)(x + startTime);
    T = convertvars(T, idxTimeVars(ii), addDatetime);
end
Mainly wanted to close this out with a working solution to the end. Also curious if I can do this without a loop, but it's not slow so unimportant.
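For anyone who wants to try the convertvars approach without the original file, here is a self-contained sketch on a toy table. The table, its variable names (A_x, A_y, B_x, B_y) and the second start time are made up for illustration; the per-column start times stand in for the first row of tableStartEndTimes.

```matlab
% Toy data (hypothetical): two time columns in seconds, two signal columns
T = table([6; 7; 8], [0; 0.24; 0.48], [3954; 3953; 3953], ...
          [0; 0.25; 0.50], [3933; 3933; 3933], ...
    'VariableNames', {'RowNum', 'A_x', 'A_y', 'B_x', 'B_y'});

% One start time per time column (the second value is invented)
starts = datetime(["02.06.2022 05:03:33", "02.06.2022 05:03:40"], ...
    "InputFormat", "dd.MM.yyyy HH:mm:ss", ...
    "Format", "dd.MM.yyyy HH:mm:ss.SSS");

idxTimeVars = find(endsWith(T.Properties.VariableNames, "_x"));

for ii = 1:numel(idxTimeVars)
    t0 = starts(ii);
    % convertvars replaces the whole variable, so the type may change
    % from double to datetime in a single step
    T = convertvars(T, idxTimeVars(ii), @(x) t0 + seconds(x));
end
```

Because convertvars swaps out the entire variable rather than assigning into it element by element, the type mismatch error from the parenthesis assignment does not arise.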