Grid Search
The built-in grid search feature can automatically run training and testing/validation on multiple datasets using a variety of settings.
Grid Search Settings
The grid search settings specify which learner settings to explore and which values to try for each of them. They are a subset of the learner-settings in the main settings file. An example of grid search settings for classification is shown below:
{
    "mode": "classification",
    "reduction-strategy": ["none", "one-vs-rest", "one-vs-one"],
    "forest-type": ["SimpleForest", "ClassicForest", "PrototypeSampleForest"],
    "tree-type": ["RdGreedy1D", "GreedyNarrow1D", "SimpleTreeGrower"],
    "number-of-trees": [100, 500],
    "sampling-proportion": [0.8, 1.0],
    "oob-proportion": [0.05, 0.1],
    "max-depth": [5, 64],
    "desired-leaf-size": [1, 64],
    "feature-proportion": ["sqrt", "golden", "all", "1/3"]
}
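To get a feel for the size of the search that the example above describes, the following sketch expands the settings into individual combinations. It assumes that the grid search evaluates the full Cartesian product of the listed values, which this page does not state explicitly. The helper expand_grid is hypothetical and not part of Silas.

```python
import itertools
import json

# The classification example from this page, wrapped in a JSON object.
GRID_JSON = """
{
    "mode": "classification",
    "reduction-strategy": ["none", "one-vs-rest", "one-vs-one"],
    "forest-type": ["SimpleForest", "ClassicForest", "PrototypeSampleForest"],
    "tree-type": ["RdGreedy1D", "GreedyNarrow1D", "SimpleTreeGrower"],
    "number-of-trees": [100, 500],
    "sampling-proportion": [0.8, 1.0],
    "oob-proportion": [0.05, 0.1],
    "max-depth": [5, 64],
    "desired-leaf-size": [1, 64],
    "feature-proportion": ["sqrt", "golden", "all", "1/3"]
}
"""


def expand_grid(grid):
    """Yield one settings dict per combination of the listed values.

    Assumption: every value of every setting is combined with every
    value of all other settings (a full Cartesian product).
    """
    keys = list(grid)
    # Wrap scalar entries such as "mode" so each key has a list of candidates.
    value_lists = [v if isinstance(v, list) else [v] for v in grid.values()]
    for combo in itertools.product(*value_lists):
        yield dict(zip(keys, combo))


grid = json.loads(GRID_JSON)
combinations = list(expand_grid(grid))
print(len(combinations))   # number of settings combinations per dataset
print(combinations[0])     # the first combination in iteration order
```

Under that assumption, the example grid contains 3 × 3 × 3 × 2 × 2 × 2 × 2 × 2 × 4 = 3456 combinations per dataset, so shortening the value lists is the most direct way to reduce the running time.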
Generating Grid Search Settings Automatically
Use the following command to generate grid search settings automatically:
silas gen-gridsearch-settings [OPTIONS]
OPTIONS include:
-h: Print help message and exit.
-m mode: Specify a task mode. The mode can be either c for classification or r for regression.
-o file_path: Write the settings to the given file. If this option is not supplied, the grid search settings are stored in grid-search/gridsearch-settings.json.
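When the generation step is scripted, the command line can be assembled from the options listed above. The sketch below only builds the argument list and prints it. The function name and the output path grid-search/classification-settings.json are hypothetical, and actually running the command requires silas to be installed and on the PATH.

```python
import shlex


def gen_gridsearch_settings_cmd(mode=None, output=None):
    """Build the argument list for `silas gen-gridsearch-settings`.

    mode:   "c" (classification) or "r" (regression), passed via -m.
    output: path of the settings file to write, passed via -o.
    """
    if mode is not None and mode not in ("c", "r"):
        raise ValueError("mode must be 'c' (classification) or 'r' (regression)")
    cmd = ["silas", "gen-gridsearch-settings"]
    if mode is not None:
        cmd += ["-m", mode]
    if output is not None:
        cmd += ["-o", str(output)]
    return cmd


# Hypothetical output path, chosen only for illustration.
cmd = gen_gridsearch_settings_cmd(
    mode="c", output="grid-search/classification-settings.json"
)
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```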
Perform Grid Search
Use the following command to run grid search on multiple datasets:
silas gridsearch [OPTIONS] [gridsearch-settings-file] [dataset-settings-files...]
where gridsearch-settings-file is the path to the Grid Search Settings file and dataset-settings-files is a list of paths to the Machine Learning Settings files of the datasets. By default, the grid search writes its (partial) results to the directory where gridsearch-settings-file is located. OPTIONS include:
-h: Print help message and exit.
--na/--noauc: Do not compute ROC-AUC. Only works for classification tasks. Use this flag when the dataset has too many classes and multi-class AUC is not used as a performance measure.
-c: Continue a previously unfinished grid search. This assumes that the current grid search settings are the same as the previous ones and that the previous partial search results are stored in the directory where gridsearch-settings-file is located.
-o output_dir: Write the grid search results to the directory output_dir. If this option is not supplied, the grid search results are stored in the directory where gridsearch-settings-file is located.
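The invocation for several datasets can be assembled in the same way. The sketch below follows the usage line above (options first, then the grid search settings file, then the dataset settings files) and only builds and prints the command. The function name and all file paths are hypothetical, and the long flag is assumed to be spelled --noauc.

```python
import shlex


def gridsearch_cmd(settings_file, dataset_settings, no_auc=False,
                   resume=False, output_dir=None):
    """Build the argument list for `silas gridsearch`.

    settings_file:    path to the grid search settings file.
    dataset_settings: iterable of dataset settings file paths.
    no_auc:           skip ROC-AUC (classification only); assumed --noauc.
    resume:           continue an unfinished grid search (-c).
    output_dir:       directory for the results (-o).
    """
    cmd = ["silas", "gridsearch"]
    if no_auc:
        cmd.append("--noauc")
    if resume:
        cmd.append("-c")
    if output_dir is not None:
        cmd += ["-o", str(output_dir)]
    # Positional arguments come after the options, as in the usage line.
    cmd.append(str(settings_file))
    cmd += [str(path) for path in dataset_settings]
    return cmd


# Hypothetical paths, chosen only for illustration.
cmd = gridsearch_cmd(
    "grid-search/gridsearch-settings.json",
    ["data/iris/settings.json", "data/wine/settings.json"],
    resume=True,
    output_dir="results",
)
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Passing resume=True together with the same settings file matches the assumption behind -c that the settings have not changed since the interrupted run.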