This function imports and exports Comma Separated Value (CSV) data.
Monadic ⎕CSV imports data from a CSV file or converts data from CSV format to an internal format. Dyadic ⎕CSV exports data to a CSV file or converts data from internal format to a CSV format.
Arrays that result from importing CSV data or arrays that are suitable for exporting as CSV data are represented by 3 possible structures:
Note that when importing CSV data, all fields are assumed to be character fields unless otherwise specified (see Column Types below). A field that contains only "numbers" will not be converted to numeric data unless specified as being numeric.
R←⎕CSV Y
Y is an array that specifies just the source of the CSV data (see below) or a 1,2,3 or 4-element vector containing:
[1] | Source of CSV Data |
[2] | Description of the CSV data |
[3] | Column Types |
[4] | Header Row Indicator |
Source - may be one of:
Description may be one of:
Column Types
This is a scalar numeric code or vector of numeric codes that specifies the field types from the list below. If Column Types is zilde or omitted, the default is 1 (all fields are character).
0 | The field is ignored. |
1 | The field contains character data. |
2 | The field is to be interpreted as being numeric. Empty cells and cells which cannot be converted to numeric values are not tolerated and cause an error to be signalled. |
3 | The field is to be interpreted as being numeric but invalid numeric vales are tolerated. Empty fields and fields which cannot be converted to numeric values are replaced with the Fill variant option (default 0). |
4 | The field is to be interpreted numeric data but invalid numeric data is tolerated. Empty fields and fields which cannot be converted to numeric values are returned instead as character data; this type is disallowed when variant option Invert is set to 1. |
Note that if Column Types is specified by a scalar 4, all numeric data in all fields will be converted to numbers.
Header Row Indicator
This is a Boolean value (default 0) to specify whether or not the first record in a CSV file is a list of column labels. If Header Row Indicator is 1, the first record (the header row) is treated differently from other records. It is assumed to contain character data (labels) regardless of Y[3] and is returned separately in the result.
The following variant options are accepted:
Name | Meaning | Default |
---|---|---|
Invert | 0, 1 or 2 (see below) | 0 |
Separator | The field separator, any single character. If Widths is other than ⍬, Separator is ignored. | ',' |
Widths | A vector of numeric values describing the width (in characters) of the corresponding columns in the CSV source, or ⍬ for variable width delimited fields | ⍬ |
Decimal | The decimal mark in numeric fields - one of '.' or ',' | '.' |
Thousands | The thousands separator in numeric fields, which may be specified as an empty character vector (meaning no separator is defined) or a character scalar | '' |
Trim | A Boolean specifying whether undelimited/unescaped whitespace is trimmed at the beginning and end of fields | 1 |
Ragged | A Boolean specifying whether records with varying numbers of fields are allowed; see notes below | 0 |
Fill | The numeric value substituted for invalid numeric data in columns of type 3 | 0 |
Records | The maximum number of records to process or 0 for no limit. This applies only to a file specified by a tie number. | 0 |
Other options defined for export are also accepted but ignored.
This option specifies how the CSV data should be returned as follows:
0 | A table (a matrix whose elements are character vectors or scalars or numbers). |
1 | A vector, each of whose items contain field (column) values. Character field values are character matrices; numeric field values are numeric vectors. |
2 | A vector, each of whose items contain field (column) values. Character field values are vectors of character vectors; numeric field values are numeric vectors. |
The result R contains the imported data.
If Y[4] does not specify that the data contains a header then R contains the entire data in the form specified by the Invert variant option.
If Y[4] does specify that the data contains a header then R is a 2-element vector where:
⊃⎕NGET CSVFile←'c:\Dyalog16.0\sales.csv' ┌→───────────────────────────────────────────────┐ │Product,Sales │ │ Widgets,1912 │ │ Gimlets,205 │ │ Dingbats,189│ │ │ └────────────────────────────────────────────────┘
⎕CSV CSVFile ┌→───────────────────┐ ↓ ┌→──────┐ ┌→────┐ │ │ │Product│ │Sales│ │ │ └───────┘ └─────┘ │ │ ┌→──────┐ ┌→───┐ │ │ │Widgets│ │1912│ │ │ └───────┘ └────┘ │ │ ┌→──────┐ ┌→──┐ │ │ │Gimlets│ │205│ │ │ └───────┘ └───┘ │ │ ┌→───────┐ ┌→──┐ │ │ │Dingbats│ │189│ │ │ └────────┘ └───┘ │ └∊───────────────────┘ ⎕CSV CSVFile'' ⍬ 1 ⍝ Header row ┌→────────────────────────────────────────────┐ │ ┌→──────────────────┐ ┌→──────────────────┐ │ │ ↓ ┌→──────┐ ┌→───┐ │ │ ┌→──────┐ ┌→────┐ │ │ │ │ │Widgets│ │1912│ │ │ │Product│ │Sales│ │ │ │ │ └───────┘ └────┘ │ │ └───────┘ └─────┘ │ │ │ │ ┌→──────┐ ┌→──┐ │ └∊──────────────────┘ │ │ │ │Gimlets│ │205│ │ │ │ │ └───────┘ └───┘ │ │ │ │ ┌→───────┐ ┌→──┐ │ │ │ │ │Dingbats│ │189│ │ │ │ │ └────────┘ └───┘ │ │ │ └∊──────────────────┘ │ └∊────────────────────────────────────────────┘
⎕CSV CSVFile''(1 2)1 ⍝ Fields are Char,Num ┌→──────────────────────────────────────────┐ │ ┌→────────────────┐ ┌→──────────────────┐ │ │ ↓ ┌→──────┐ │ │ ┌→──────┐ ┌→────┐ │ │ │ │ │Widgets│ 1912 │ │ │Product│ │Sales│ │ │ │ │ └───────┘ │ │ └───────┘ └─────┘ │ │ │ │ ┌→──────┐ │ └∊──────────────────┘ │ │ │ │Gimlets│ 205 │ │ │ │ └───────┘ │ │ │ │ ┌→───────┐ │ │ │ │ │Dingbats│ 189 │ │ │ │ └────────┘ │ │ │ └∊────────────────┘ │ └∊──────────────────────────────────────────┘
(⎕CSV⍠'Invert' 1)CSVFile'' (1 2) 1 ⍝ Invert 1 ┌→────────────────────────────────────────────────────┐ │ ┌→──────────────────────────┐ ┌→──────────────────┐ │ │ │ ┌→───────┐ ┌→───────────┐ │ │ ┌→──────┐ ┌→────┐ │ │ │ │ ↓Widgets │ │1912 205 189│ │ │ │Product│ │Sales│ │ │ │ │ │Gimlets │ └~───────────┘ │ │ └───────┘ └─────┘ │ │ │ │ │Dingbats│ │ └∊──────────────────┘ │ │ │ └────────┘ │ │ │ └∊──────────────────────────┘ │ └∊────────────────────────────────────────────────────┘
⊃(⎕CSV⍠'Invert' 2)CSVFile'' (1 2) 1 ⍝ Invert 2 ┌→──────────────────────────────────────────────────┐ │ ┌→───────────────────────────────┐ ┌→───────────┐ │ │ │ ┌→──────┐ ┌→──────┐ ┌→───────┐ │ │1912 205 189│ │ │ │ │Widgets│ │Gimlets│ │Dingbats│ │ └~───────────┘ │ │ │ └───────┘ └───────┘ └────────┘ │ │ │ └∊───────────────────────────────┘ │ └∊──────────────────────────────────────────────────┘
Data may be read from a named file or a tied native file. A tied native files may be read in sections by repeatedly invoking ⎕CSV for a specified maximum number of records (specified by the Records variant) until no more data is read. A named file is read by ⎕CSV in its entirety regardless of the Records option.
In all cases the files must contain text using one of the supported encodings.
{R}←X ⎕CSV Y
The left argument X is either:
Y is a 1 or 2-element vector containing:
[1] | Destination of CSV Data (see below) |
[2] | Description of the CSV data (see below) |
Destination - may be one of:
Description may be:
The following variant options are accepted:
Name | Meaning | Default |
---|---|---|
Overwrite | a Boolean which specifies, when creating a named file which already exists, whether to overwrite it (1) or signal an error (0) | 0 |
Separator | the field separator, any single character. If Widths is other than ⍬, Separator is ignored. | ',' |
Widths | a vector of numeric values describing the width (in characters) of the corresponding columns in the CSV source, or ⍬ for variable width delimited fields | ⍬ |
Decimal | the decimal mark in numeric fields - one of '.' or ',' | '.' |
Thousands | the thousands separator in numeric fields, which may be specified as an empty character vector (meaning no separator is defined) or a character scalar | '' |
Trim | a Boolean specifying whether whitespace is trimmed at the beginning and end of character fields | 1 |
LineEnding | the line ending sequence |
(13 10) on Windows; 10 on other platforms |
Other options defined for import are also accepted but ignored.
If Y specifies that the CSV data is written to a file then R is the number of bytes (not characters) written, and is shy.
Otherwise, R is the CSV data in the format specified in Y, and is not shy.
CSVFile←'c:\Dyalog16.0\sales.csv' ⎕←DATA HDR←⎕CSV CSVFile''(1 2)1 ┌→──────────────────────────────────────────┐ │ ┌→────────────────┐ ┌→──────────────────┐ │ │ ↓ ┌→──────┐ │ │ ┌→──────┐ ┌→────┐ │ │ │ │ │Widgets│ 1912 │ │ │Product│ │Sales│ │ │ │ │ └───────┘ │ │ └───────┘ └─────┘ │ │ │ │ ┌→──────┐ │ └∊──────────────────┘ │ │ │ │Gimlets│ 205 │ │ │ │ └───────┘ │ │ │ │ ┌→───────┐ │ │ │ │ │Dingbats│ 189 │ │ │ │ └────────┘ │ │ │ └∊────────────────┘ │ └∊──────────────────────────────────────────┘ DATA⍪←'Gizmos' 23 DATA HDR ⎕CSV'' ┌→────────────┐ │Product,Sales│ │ │ │Widgets,1912 │ │ │ │Gimlets,205 │ │ │ │Dingbats,189 │ │ │ │Gizmos,23 │ │ │ │ │ └─────────────┘ CSVFile1←'c:\Dyalog16.0\sales1.csv' ⎕←DATA HDR ⎕CSV CSVFile1 67 DATA⍪←'Gimbals' 123 ⎕←DATA HDR ⎕CSV CSVFile1 FILE NAME ERROR: Unable to create file ("The file exists.") ⎕←DATA HDR ⎕CSV CSVFile1 ∧ ⎕←DATA HDR(⎕CSV⍠'Overwrite' 1)CSVFile1 80