readData
Load built-in datasets
Syntax
data = readData(dataName,Name,Value)
Description
data = readData(dataName,Name,Value) returns one of the built-in datasets in the selected format. Name and Value specify additional options using one or more name-value pair arguments. For example, you can specify whether the data is stored in a 2D array or in a table.
See: Input Arguments, Output Arguments
Input Arguments
dataName - Name of the built-in dataset
Data type: string
Name of one of the built-in datasets of the VBLab package, as listed in the table below.
Data Name | Description | Task | Supported Name-Value Arguments |
---|---|---|---|
'Abalon' | Cross-sectional data | Regression | 'Intercept', 'Normalized', 'Type' |
'Cencus' | Cross-sectional data | Classification | 'Intercept', 'Normalized', 'Type' |
'DirectMarketing' | Cross-sectional data | Regression | 'Intercept', 'Normalized', 'Type' |
'GermanCredit' | Cross-sectional data | Classification | 'Intercept', 'Normalized', 'Type' |
'LabourForce' | Cross-sectional data | Classification | 'Intercept', 'Normalized', 'Type' |
'RealizedLibrary' | Time-series data | Regression | 'Index', 'Length', 'RealizedMeasure', 'Type' |
Name-Value Pair Arguments
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: 'Type','Matrix','Intercept',true specifies that the output dataset is stored in a 2D array and that a column of ones is added to the data matrix as the intercept.
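The call pattern described above can be sketched as follows. This is only an illustration and assumes that the VBLab package is installed and on the MATLAB path; the variable names are arbitrary.

```matlab
% Assumes VBLab is installed and on the MATLAB path.
% Load the LabourForce data as a 2D array with an intercept column.
labour = readData('LabourForce', ...
                  'Type', 'Matrix', ...
                  'Intercept', true);

% Name-value pairs can be given in any order, so this call
% requests the same dataset with the same options.
labourSame = readData('LabourForce', ...
                      'Intercept', true, ...
                      'Type', 'Matrix');

% Show the dimensions: observations x columns.
disp(size(labour));
```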
'Intercept' - Adding intercept column
Data Type: true | false
Flag to add a column of ones to the data as the intercept term.
Note: Only available for cross-sectional (tabular) data.
Default: false
Example: 'Intercept',true
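To make the effect of an intercept column concrete, here is a small self-contained sketch on a toy matrix. It does not call readData; the toy values are made up, and placing the column of ones first is an assumption about the layout, not something this page states.

```matlab
% Toy data: 3 observations, 2 features, response in the last column.
data = [0.5 1.2 1;
        0.3 0.8 0;
        0.9 1.5 1];

% Prepend a column of ones as the intercept term.
% (Putting it in the first column is an assumption for illustration.)
dataWithIntercept = [ones(size(data, 1), 1), data];

disp(dataWithIntercept);
```

The response values stay in the last column, so the convention described for cross-sectional data is preserved.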
'Type' - Data format
Data Type: string
Format of the output data. Can be specified as:
'Matrix'
Data is stored in a 2D array for cross-sectional data or multivariate time series, or in a 1D array for univariate time series. For cross-sectional data, the last column contains the response values.
'Table'
Data is stored in a table. For cross-sectional data, the last column contains the response values.
Default: 'Matrix'
Example: 'Type','Table'
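A short sketch of requesting the table format, assuming VBLab is installed and on the MATLAB path. The variable name tbl is arbitrary, and head requires a MATLAB release that supports tables.

```matlab
% Assumes VBLab is installed and on the MATLAB path.
% Load the GermanCredit data as a table instead of a 2D array.
tbl = readData('GermanCredit', 'Type', 'Table');

% Confirm the container type and preview the first rows.
disp(class(tbl));
disp(head(tbl));
```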
'Normalized' - Normalization flag
Data Type: true | false
Flag to normalize numerical variables. Neural-network-based models such as DeepGLM work more efficiently with normalized data.
Note: Only available for cross-sectional (tabular) data.
Default: false
Example: 'Normalized',true
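This page does not say which normalization scheme readData applies. Purely as an illustration of what normalizing numerical variables can mean, the self-contained sketch below standardizes each column of a toy matrix to zero mean and unit standard deviation (z-scores). The scheme and the toy values are assumptions, not the documented behavior of readData.

```matlab
% Toy feature matrix: 4 observations, 2 numerical variables.
X = [1 10;
     2 20;
     3 30;
     4 40];

% Column-wise mean and sample standard deviation.
mu    = mean(X, 1);
sigma = std(X, 0, 1);

% Z-score standardization (assumed scheme, for illustration only).
% Uses implicit expansion, available from MATLAB R2016b.
Xnorm = (X - mu) ./ sigma;

disp(Xnorm);
```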
'Index' - Stock index
Data Type: string
Stock index whose return series is loaded from the 'RealizedLibrary' data.
Default: None
Example: 'Index','SP500'
'Length' - Number of observations of time series
Data Type: positive integer
Number of observations of the time series. Only available for the 'RealizedLibrary' data.
Default: None
Example: 'Length',1000
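The two time-series options can be combined in one call. The sketch below reuses the example values given above ('SP500' and 1000) and assumes VBLab is installed and on the MATLAB path; the variable name y is arbitrary.

```matlab
% Assumes VBLab is installed and on the MATLAB path.
% Load 1000 observations of the SP500 series from RealizedLibrary.
% 'Type' is left at its default, so an array is returned.
y = readData('RealizedLibrary', ...
             'Index', 'SP500', ...
             'Length', 1000);

% Inspect the dimensions of the returned series.
disp(size(y));
```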
Output Arguments
Data type: array | table
The dataset, stored as a 1D or 2D array or as a table.
- For cross-sectional data, the last column of the 2D array or table contains the response values.
- For time series data, the output is a 1D array (univariate) or a 2D array (multivariate).
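Because the response sits in the last column for cross-sectional data, predictors and response can be separated by indexing. A minimal sketch, assuming VBLab is installed and on the MATLAB path; X and y are illustrative names only.

```matlab
% Assumes VBLab is installed and on the MATLAB path.
% Load a cross-sectional dataset as a 2D array.
data = readData('GermanCredit', 'Type', 'Matrix');

% All columns except the last are predictors.
X = data(:, 1:end-1);

% The last column holds the response values.
y = data(:, end);

fprintf('%d observations, %d predictors\n', size(X, 1), size(X, 2));
```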