Optional digest: Dataset digest, e.g. an md5 hash of the dataset that uniquely identifies it within datasets of the same name.
Optional name: The name of the dataset, e.g. “my.uc.table@2”, “nyc-taxi-dataset”, “fantastic-elk-3”.
Optional profile: The profile of the dataset. Summary statistics for the dataset, such as the number of rows in a table, the mean / std / mode of each column in a table, or the number of elements in an array.
Optional schema: The schema of the dataset, e.g. MLflow ColSpec JSON for a dataframe, MLflow TensorSpec JSON for an ndarray, or another schema format.
Optional source: Source information for the dataset. Note that the source may not exactly reproduce the dataset if it was transformed / modified before use with MLflow.
Optional source_type: The type of the dataset source, e.g. ‘databricks-uc-table’, ‘DBFS’, ‘S3’, ...
Dataset. Represents a reference to data used for training, testing, or evaluation during the model development process.
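As a rough illustration of how these fields fit together, the following Python sketch builds a dictionary shaped like the Dataset described above. It is not taken from the MLflow source: the helper `make_dataset_entry`, the sample rows, the `s3://example-bucket/...` URI, the `source_type` key name, and the choice to serialise `source` and `profile` as JSON strings are all assumptions made for the example. The `schema` field is left out because its format depends on the data type.

```python
import hashlib
import json


def make_dataset_entry(name, rows, source_type, source):
    """Hypothetical helper: assemble a Dataset-like dict from in-memory rows."""
    # Digest: an md5 hash of a canonical serialisation of the data, so two
    # datasets with the same name but different contents get different digests.
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    digest = hashlib.md5(payload).hexdigest()

    # Profile: simple summary statistics, here only the row count.
    profile = {"num_rows": len(rows)}

    return {
        "name": name,
        "digest": digest,
        "source_type": source_type,        # assumed key name
        "source": json.dumps(source),      # assumed to be a JSON string
        "profile": json.dumps(profile),    # assumed to be a JSON string
    }


# Illustrative data only; the bucket and file name are made up.
rows = [{"trip_id": 1, "fare": 12.5}, {"trip_id": 2, "fare": 7.0}]
entry = make_dataset_entry(
    "nyc-taxi-dataset",
    rows,
    "S3",
    {"uri": "s3://example-bucket/nyc-taxi.csv"},
)
print(json.dumps(entry, indent=2))
```

Hashing a canonical serialisation (sorted keys) keeps the digest stable across runs for identical data, which is what lets it distinguish versions of a dataset that share a name.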