`readData` Github code

Load built-in datasets

Syntax

data = readData(dataName,Name,Value)

Description

data = readData(dataName,Name,Value) returns one of the built-in datasets in the selected format. Name and Value specify additional options using one or more name-value pair arguments. For example, you can specify whether the data is returned as a 2D array or as a table.

See: Input Arguments, Output Arguments

Input Arguments

dataName - Name of the dataset

Data type: string

Name of one of the built-in datasets of the VBLab package. See the list of VBLab built-in datasets.

| Data Name | Description | Task | Supported Arguments |
| --- | --- | --- | --- |
| `'Abalon'` | Cross-sectional data | Regression | `'Intercept'`, `'Normalized'`, `'Type'` |
| `'Cencus'` | Cross-sectional data | Classification | `'Intercept'`, `'Normalized'`, `'Type'` |
| `'DirectMarketing'` | Cross-sectional data | Regression | `'Intercept'`, `'Normalized'`, `'Type'` |
| `'GermanCredit'` | Cross-sectional data | Classification | `'Intercept'`, `'Normalized'`, `'Type'` |
| `'LabourForce'` | Cross-sectional data | Classification | `'Intercept'`, `'Normalized'`, `'Type'` |
| `'RealizedLibrary'` | Time-series data | Regression | `'Index'`, `'Length'`, `'RealizedMeasure'`, `'Type'` |

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'Type','Matrix','Intercept',true specifies that the output dataset is stored in a 2D array and that a column of ones is added to the data matrix as the intercept.
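As a minimal sketch of the name-value syntax, the call below requests a cross-sectional dataset as a 2D array with an intercept column. It assumes the VBLab package is already on the MATLAB path; the choice of `'LabourForce'` is only illustrative.

```matlab
% Assumption: VBLab is already on the MATLAB path (e.g. added with addpath).
% Load the 'LabourForce' dataset as a 2D array and add a column of ones.
data = readData('LabourForce', ...
                'Type','Matrix', ...
                'Intercept',true);

% Inspect the dimensions of the returned array.
[nRows, nCols] = size(data);
fprintf('Returned array has %d rows and %d columns.\n', nRows, nCols);
```

Because name-value pairs can be given in any order, swapping the `'Type'` and `'Intercept'` pairs should yield the same result.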

'Intercept' - Adding intercept column

Data Type: true | false

Flag to add a column of ones to the data as the intercept.

Note: Only available for cross-sectional (tabular) data

Default: false

Example: 'Intercept',true

'Type' - Data format

Data Type: string

Format of the output data. Can be specified as

'Matrix' Data is stored in a 2D array (for cross-sectional data or multivariate time series data) or a 1D array (for univariate time series). For cross-sectional data, the last column contains the response values.
'Table' Data is stored in a table. For cross-sectional data, the last column contains the response values.

Default: 'Matrix'

Example: 'Type','Table'
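The following sketch shows how the table format might be requested and inspected. It assumes VBLab is on the MATLAB path; `'GermanCredit'` is used purely as an example, and the variable names stored in the table are not documented here, so the code simply prints whatever is returned.

```matlab
% Assumption: VBLab is already on the MATLAB path.
% Request the dataset as a MATLAB table instead of a numeric array.
tbl = readData('GermanCredit','Type','Table');

% List the column names; the last one corresponds to the response.
disp(tbl.Properties.VariableNames);

% Preview the first few rows.
head(tbl)
```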

'Normalized' - Normalization flag

Data Type: true | false

Flag to normalize numerical variables. Neural-network-based models such as DeepGLM work more efficiently with normalized data.

Note: Only available for cross-sectional (tabular) data

Default: false

Example: 'Normalized',true

'Index' - Stock index

Data Type: string

Name of the stock return index to load from the 'RealizedLibrary' data.

Default: None

Example: 'Index','SP500'

'Length' - Number of observations of time series

Data Type: positive integer

Number of observations of the time series. Only available for the 'RealizedLibrary' data.

Default: None

Example: 'Length',1000

'RealizedMeasure' - Realized measures of stock return volatility

Data Type: string | cell array of strings

Realized measures of stock return volatility to load, specified as a single name or a cell array of names.

Default: None

Example: 'RealizedMeasure','rk_parzen'

Example: 'RealizedMeasure',{'rk_parzen','medrv'}
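The time-series options can be combined in one call, as in the hedged sketch below. It reuses the example values shown above (`'SP500'`, `1000`, `'rk_parzen'`, `'medrv'`) and assumes VBLab is on the MATLAB path. Whether all three options must be supplied together, and how the columns of the result are arranged, is not stated here, so the code only reports the size of what comes back.

```matlab
% Assumption: VBLab is already on the MATLAB path.
% Load 1000 observations for the SP500 index together with two
% realized measures from the 'RealizedLibrary' dataset.
data = readData('RealizedLibrary', ...
                'Index','SP500', ...
                'Length',1000, ...
                'RealizedMeasure',{'rk_parzen','medrv'});

% Report the size of the returned array without assuming its layout.
disp(size(data));
```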

Output Arguments

data - Output dataset

Data type: array | table

The dataset, stored as a 1D or 2D array or as a table.

For cross-sectional data, the last column of the 2D array or table contains the response values.
For time series data, the output can be a 1D (univariate) or 2D (multivariate) array.
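Since the response occupies the last column for cross-sectional data, a common next step is to split the array into predictors and response. The sketch below illustrates this under the assumption that VBLab is on the MATLAB path; the variable names `X` and `y` are chosen here for illustration only.

```matlab
% Assumption: VBLab is already on the MATLAB path.
% Load a cross-sectional regression dataset with normalized variables.
data = readData('DirectMarketing', ...
                'Type','Matrix', ...
                'Normalized',true);

% Split into predictors and response: the last column is the response.
X = data(:,1:end-1);   % all columns except the last
y = data(:,end);       % response values
```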
