How can I read rows from a CSV outside of datalines using readtable


Ben on 23 Sep 2022


    https://matlabcentral.mathworks.com/matlabcentral/answers/1810965-how-can-i-read-rows-from-a-csv-outside-of-datalines-using-readtable


Commented: Ben on 26 Sep 2022

Accepted Answer: dpb


I am trying to read a CSV with a specific format that I can't change. I can read most of the data, but I lose some data from above the start of the data lines (in the header, which also has a units line). Below is a section of the data I want to read:

; Bat_MaximumVoltage_x; Bat_MaximumVoltage_y; Bat_MinimumVoltage_x; Bat_MinimumVoltage_y

0; XY ; ; XY ;

1; ; ; ;

2; s ; V ; s ; V

3; 02.06.2022 05:03:33 ; ; 02.06.2022 05:03:33 ;

4; 02.06.2022 15:00:22 ; ; 02.06.2022 15:00:22 ;

5; X-Values ; Y-Values ; X-Values ; Y-Values

6; -2,51E-7 ; 3954 ; 1,11E-7 ; 3933

7; 0,240157917 ; 3953 ; 0,240157889 ; 3933

8; 0,478257917 ; 3953 ; 0,478257555 ; 3933

9; 0,71209725 ; 3953 ; 0,712096888 ; 3933

10; 1,020145332 ; 3953 ; 1,020144999 ; 3933

11; 1,258192585 ; 3953 ; 1,258192221 ; 3933

12; 1,491101584 ; 3953 ; 1,491101222 ; 3933

The csv file I am reading has 40 signal (_y) columns and 39 time (_x) columns. I have manually added the spacing you see above to improve readability.

I can import the data into a table using the following code:

opts = detectImportOptions(filename);

opts.DataLines = [8, Inf];

opts.Delimiter = ";";

opts.VariableUnitsLine = 4;

opts.VariableNames{1} = 'RowNum';

idxTimeVars = find(endsWith(opts.VariableNames,"_x"));

idxSignalVars = find(endsWith(opts.VariableNames,"_y"));

opts = setvaropts(opts, idxTimeVars, "Type","double");

opts = setvaropts(opts, idxTimeVars, "DecimalSeparator", ",");

opts = setvaropts(opts, idxSignalVars, "Type","double");

opts = setvaropts(opts, idxSignalVars, "DecimalSeparator", ",");

T = readtable(filename, opts);

But I want to create a timetable with this data, where the time vector is made of the start time (row number 3) added to the 'X-values', for example:

startTime = datetime("02.06.2022 05:03:33", "InputFormat", "dd.MM.yyyy HH:mm:ss"); % start time from row number 3 of the header

timeVec = startTime + seconds(T.Bat_MaximumVoltage_x); % T is the table returned by readtable above

Then replace the double (type) '_x' columns with datetime (type) columns.

How can I store or access this data from the header of the file (above DataLines) while using readtable to import data below DataLines?

Do I just have to run textscan or readtable again to read those two lines separately?


Accepted Answer

dpb on 23 Sep 2022


"Do I just have to run textscan or readtable again to read those two lines separately?"

Yes, unfortunately. Unlike textscan, readtable cannot make multiple reads within the same file without closing and reopening it. With readtable, one can set up a second import options object to pull out the header rows with something like

opt=detectImportOptions('ben.csv',"Range",[4 6],'ReadVariableNames',1,'Delimiter',';')

opt.DataLines=5:6;

opt=setvaropts(opt,{'s','s_1'},'Type','datetime');

tHdr=readtable('ben.csv',opt);

By default, datetime resolves the ambiguous day/month order in these timestamps as 2 June. If the dates are actually February 6, you will need to set the variables' 'InputFormat' option (via setvaropts) to match.
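As an alternative to a second import options object, the two header timestamps can also be pulled out with a plain text read. The following is only a minimal sketch, not the approach from the answer above: it assumes readlines is available (R2020b or later), that the start and end times sit on file lines 5 and 6 as in the sample, and it writes a small excerpt to a hypothetical temporary file (ben_excerpt.csv) so that it runs on its own.

```matlab
% Write a small excerpt of the sample data so the sketch is self-contained.
fname = fullfile(tempdir, 'ben_excerpt.csv');   % hypothetical file name
rows = { ...
    ';Bat_MaximumVoltage_x;Bat_MaximumVoltage_y'
    '0;XY;'
    '1;;'
    '2;s;V'
    '3;02.06.2022 05:03:33;'
    '4;02.06.2022 15:00:22;'
    '5;X-Values;Y-Values'
    '6;-2,51E-7;3954'
    '7;0,240157917;3953'};
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', rows{:});
fclose(fid);

% Read the file as text, one string per line, and split the two header lines.
txt = readlines(fname);
startFields = strtrim(split(txt(5), ";"));   % file line 5: start timestamps
endFields   = strtrim(split(txt(6), ";"));   % file line 6: end timestamps

% Field 1 is the row number; field 2 belongs to the first _x column.
fmt = 'dd.MM.yyyy HH:mm:ss';                 % day first, as in the sample
startTime = datetime(startFields(2), 'InputFormat', fmt);
endTime   = datetime(endFields(2),   'InputFormat', fmt);
```

The numeric rows can then still be read with readtable and DataLines = [8, Inf] as in the question; the text read only replaces the second readtable call.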


Ben on 25 Sep 2022


Edited: Ben on 26 Sep 2022


  • ben.csv

Okay, that's unfortunate. Follow-up question:

So now I have tHdr, where the _x variables contain the start and end timestamps, and T, which contains the actual data and whose _x variables contain seconds since start stored as double.

How can I best combine them? I think I have to loop through the _x variables and create a vector (per variable) using data from both tables. But when I try to replace the seconds with actual timestamps I can't because the existing variable is double type.

Is my best option just creating a new table instead of modifying the existing ones?

Thanks

Here's what I have so far:

filename = "ben.csv";

opts = detectImportOptions(filename);

% Specify range and delimiter

opts.DataLines = [8, Inf];

opts.Delimiter = ";";

opts.VariableUnitsLine = 4;

% Specify column names and types

opts.VariableNames{1} = 'RowNum';

idxSignalVars = find(endsWith(opts.VariableNames,"_y"));

idxTimeVars = find(endsWith(opts.VariableNames,"_x"));

opts = setvaropts(opts, idxSignalVars, "Type","double");

opts = setvaropts(opts, idxSignalVars, "DecimalSeparator", ",");

% Check if times are stored as datetime or seconds

if ~any(opts.VariableTypes == "datetime") % true if no datetime types
    opts.DataLines = [5, 6];
    opts = setvaropts(opts, idxTimeVars, "Type","datetime");
    opts = setvaropts(opts, idxTimeVars, "InputFormat", "dd.MM.yyyy HH:mm:ss");
    tableStartEndTimes = readtable(filename, opts);

    %%%% Reset for loading data
    opts.DataLines = [8, Inf];
    opts = setvaropts(opts, idxTimeVars, "Type","duration");
    opts = setvaropts(opts, idxTimeVars, "DecimalSeparator", ",");
    opts = setvaropts(opts, idxTimeVars, "DurationFormat", "s");
else
    opts = setvaropts(opts, idxTimeVars, "InputFormat", "dd.MM.yyyy HH:mm:ss");
end

% Read table

T = readtable(filename, opts);

head(T)

    RowNum    Bat_MaximumVoltage_x    Bat_MaximumVoltage_y    Bat_MinimumVoltage_x    Bat_MinimumVoltage_y
    ______    ____________________    ____________________    ____________________    ____________________
       6            NaN sec                   3954                  NaN sec                   3933
       7            NaN sec                   3953                  NaN sec                   3933
       8            NaN sec                   3953                  NaN sec                   3933
       9            NaN sec                   3953                  NaN sec                   3933
      10            NaN sec                   3953                  NaN sec                   3933
      11            NaN sec                   3953                  NaN sec                   3933
      12            NaN sec                   3953                  NaN sec                   3933

% Make table timestamps datetime (if not already)

if ~any(opts.VariableTypes == "datetime")
    ii = 1; % for loop here
    startTime = tableStartEndTimes(1, idxTimeVars(ii));
    timeNumSecs = T(:, idxTimeVars(ii)).Variables;
    timeVec = startTime.Variables + seconds(timeNumSecs);

    % Insert into table somehow???
    % T(:, idxTimeVars(ii)) = table(timeVec);
    % T(:, idxTimeVars(ii)) = timeVec;
end

dpb on 25 Sep 2022


Edited: dpb on 25 Sep 2022


It would be easiest to illustrate with the actual data file, but the answer is to read the seconds columns as the duration type, which you can then add to the starting datetime.

I didn't try to parse the above(*) to figure out just what you're trying to do there with all the conditionals, but set the 'Type' of the 'X-Values' variables to 'duration' and the corresponding 'DurationFormat' to 's'.

You can then address those variables in the table with the vartype function, which returns a subscript object, as

isDur=vartype('duration'); % vartype takes only the type name; the subscript is applied to tT below

tT(:,isDur)=tStart+tT(:,isDur);

where tT is your data table and tStart is the starting time datetime value.

You can also leave the input sample times as double and use seconds to convert them to duration afterwards, but since you're going to the trouble of building the import object to match the data file, why not finish the job there?

(*) Attach code snippets as text formatted with the "Code" button (or, easier, select the text and use Ctrl-E); nobody can do anything with images directly.

Ben on 26 Sep 2022


Edited: Ben on 26 Sep 2022


That's helpful, thanks. How to import duration values is what I need. But I just get NaN sec when I set it to duration.

According to the docs, duration does not support seconds by itself. My data is <secs>,<milliseconds> which I guess I can't import directly? Or the DurationFormat=s does not support fractional seconds?

opts = setvaropts(opts, idxTimeVars, "Type","duration");

opts = setvaropts(opts, idxTimeVars, "DecimalSeparator", ",");

opts = setvaropts(opts, idxTimeVars, "DurationFormat", "s.SSSSSSSSS");%doesn't work

The data 'ben.csv' that I guess you constructed from the data in my original post is entirely representative of the actual data (I'm actually using it for quicker testing). The seconds continue increasing and there are many more columns of xy pairs of data. I updated my original post with a few extra rows to get to full seconds. I also attached it to my above comment.

Also, I have inserted all the code with the button that says "Insert MATLAB code example" on the editor window and copied the code as text directly from a working MATLAB online file. But I can format my previous comment using the other code formatting.

It was indeed my hope to import the data with the right formatting instead of fixing it afterwards. But I don't think I can import duration data like 35802,890410352 seconds (with , as the decimal separator, that is 35802 seconds 890 milliseconds), which is the duration value from near the end of the file.

Stephen23 on 26 Sep 2022


Edited: Stephen23 on 26 Sep 2022


@Ben: unfortunately the DURATION class supports only a small, restricted set of formats that it can display or accept as input text (they are listed in the duration documentation). My guess is that you will need to import that data as numeric and then convert to DURATION using SECONDS:

D = seconds(35802.890410352);

D.Format = 'hh:mm:ss.SSSSSSSSS'

D = duration

09:56:42.890410352
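To tie this back to the comma-decimal text in the file, here is a small sketch of the same conversion starting from strings. The sample values are taken from this thread; the variable names (raw, secs, elapsed) are only illustrative, and the start time is hard-coded rather than read from the header.

```matlab
% Seconds written with a comma as the decimal separator, as in the CSV.
raw = ["-2,51E-7"; "0,240157917"; "35802,890410352"];

% Swap the decimal comma for a point, then convert the text to double.
secs = str2double(strrep(raw, ",", "."));

% Convert the numeric seconds to duration and choose a display format.
elapsed = seconds(secs);
elapsed.Format = 'hh:mm:ss.SSS';

% Add the elapsed time to the start time from the header (hard-coded here).
startTime = datetime(2022, 6, 2, 5, 3, 33, 'Format', 'dd.MM.yyyy HH:mm:ss.SSS');
timeVec = startTime + elapsed;
```

When the columns are imported through readtable with "Type","double" and "DecimalSeparator", ",", the strrep/str2double step is not needed; only the seconds call and the addition remain.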

dpb on 26 Sep 2022


Yeah, that's a gotcha with the duration class, indeed.

The datetime/duration classes are still fairly new and not quite fully developed; duration in particular has yet to come of age where its format limitations are concerned.

This would be a great example to use as an enhancement request; I will see about crafting one...

Using the double and converting as @Stephen23 says isn't that bad, but it would be cleaner to be able to go at it directly.

Ben on 26 Sep 2022


I might send a feature request too! It's unfortunate, but I will survive xD

Thank you both for your answers and support.

Ben on 26 Sep 2022


I was having some trouble getting the data into the table, as the data table has the time column as a duration and I can't replace duration data with datetime due to the type mismatch. I was trying

T(:, idxTimeVars(ii)) = {timeVec};

where timeVec is a datetime vector. But MATLAB says timeVec needs to be a duration array.

I found that I could pass an anonymous function to convertvars which allows me to do what I want. See code below.

One thing not shown by the sample data is that not all columns have the same start time, which is why this is a loop that goes through each time column.

T = convertvars(T, idxTimeVars, "seconds");

for ii = 1:length(idxTimeVars)
    startTime = tableStartEndTimes(1, idxTimeVars(ii)).Variables;
    startTime.Format = 'dd.MM.yyyy HH:mm:ss.SSS';
    addDatetime = @(x)(x + startTime);
    T = convertvars(T, idxTimeVars(ii), addDatetime);
end

Mainly wanted to close this out with a working solution to the end. Also curious if I can do this without a loop, but it's not slow so unimportant.
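For readers who want to try the pattern without the data file, the following is a minimal, self-contained sketch of the same idea on a made-up table. The variable names (A_x, A_y, B_x, B_y), the values, and the two start times are all hypothetical; the last step, turning one x/y pair into a timetable with table2timetable, is an addition that reflects the goal stated in the original question rather than code from this thread.

```matlab
% Toy table with two x/y pairs; the _x columns hold seconds since start.
T = table([0; 0.24; 0.48], [3954; 3953; 3953], ...
          [0; 0.25; 0.50], [3933; 3933; 3933], ...
          'VariableNames', {'A_x', 'A_y', 'B_x', 'B_y'});

% One start time per _x column (hypothetical values).
startTimes = [datetime(2022, 6, 2, 5, 3, 33), datetime(2022, 6, 2, 5, 3, 34)];

% Replace each seconds column with start time + elapsed seconds.
idxTimeVars = find(endsWith(T.Properties.VariableNames, "_x"));
for ii = 1:numel(idxTimeVars)
    t0 = startTimes(ii);
    T = convertvars(T, idxTimeVars(ii), @(x) t0 + seconds(x));
end

% Build a timetable from the first pair, using A_x as the row times.
TT = table2timetable(T(:, {'A_x', 'A_y'}), 'RowTimes', 'A_x');
```

Because each _x column may have its own start time and its own sample instants, each x/y pair would get its own timetable in this sketch; combining them onto a common time base is a separate step.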


Tags

  • data import
  • datetime
  • readtable
  • csv
  • textscan
  • speed

Products

  • MATLAB

Release

R2021b


