Verifying Models¶
Jump to the Verifying The Model Against User Specifications section for the usage.
User Specifications¶
Formally, in the current version, the language for user specifications is an extension of propositional logic where a proposition may consist of arithmetic operations, comparison operations and set membership. We refer to the formulae in the logic as “Boolean formulae”.
Arithmetic Terms¶
An atomic arithmetic term can be of two forms:
ArithmeticVariable: When the operand is an attribute.
ArithmeticConstant: When the operand is a number.
The user can build up compound arithmetic terms using the following operators:
Unary operators: Negative, Square.
Nnary operators: \(+\), \(-\), \(*\), \(/\).
Boolean Formulae¶
An atomic Boolean formula takes one of the following forms:
- BooleanConstant:
true;
false.
Membership.
Comparison.
The user can write the definitions in JSON format as follows:
The definition of a Boolean constant includes the type of the formula and the value of the formula. For instance, true in JSON format is the following code:
"type": "BooleanConstant",
"value": true
A membership formula \(f ∈ V\) where \(f\) is an attribute and \(V\) is a set of (string or numerical) values is defined using the type, the name of the attribute, and the list of values. The latter is defined in a field called “subset”. For instance, \(f ∈\) { \("a", "b"\) } is defined as follows:
"type": "Membership",
"name": "f",
"subset":
[
"a",
"b"
]
A comparison formula connects two arithmetic terms (defined in Arithmetic Terms) using a comparison operator, which may be one of the following:
\(=\)
\(>\)
\(>=\)
\(<\)
\(<=\)
The definition of a comparison formula includes the type of the operator and the left, right operands. Suppose \(a\) and \(b\) are two attributes, we can define \((a * 1.8) < b\) as follows:
"type": "<",
"left":
{
"type": "*",
"operands": [
{
"type": "ArithmeticVariable",
"name": "a"
},
{
"type": "ArithmeticConstant",
"value": 1.8
}
]
},
"right": {
"type": "ArithmeticVariable",
"value": "b"
}
The user can build up compound Boolean formula by connecting Boolean formulae using the following Boolean operators:
Unary operators: \(!\).
Binary Operators: \(\&\&\), \(||\), \(=>\), \(Xor\).
Similar to the definitions of unary and nary arithmetic operators (cf. Arithmetic Terms), the definition of an unary Boolean operator includes the type and the internal sub-formula, and the definition of a nary Boolean operator includes the type and the operands. For example, the Boolean formula \(!\) (\(f ∈\) { \("a", "b"\) }) can be defined as:
"type": "!",
"Internal":
{
"type": "Membership",
"name": "f",
"subset":
[
"a",
"b"
]
}
For instance, given three attributes \(a\), \(b\) and \(c\), we can define \((a * 1.8) < b ~\&\&~ c > 82\) as follows:
"type": "&&",
"operands": [
{
"type": "<",
"left":
{
"type": "*",
"operands": [
{
"type": "ArithmeticVariable",
"name": "a"
},
{
"type": "ArithmeticConstant",
"value": 1.8
}
]
},
"right": {
"type": "ArithmeticVariable",
"value": "b"
}
},
{
"type": ">",
"left": {
"type": "ArithmeticVariable",
"name": "c"
},
"right": {
"type": "ArithmeticConstant",
"value": 82
}
}
]
Finally, the definition of a user specification consists of the definition of the outcome and the constraint_formula, which is a Boolean formula. Suppose the outcome feature only has two values “Y” and “N” which mean “the flight will be delayed for more than 15 min” and “the flight will not be delayed for more than 15 min” respectively, and there is a feature called “Origin” that indicates the departure airport. We can define a specification that says “if the flight is going to be delayed for more tha 15 min, then it is not a flight that departures from LAX or JFK” using the following definition:
"name": "spec_example_1",
"outcome": "Y",
"constraint_formula": {
"type": "!",
"Internal": {
"type": "Membership",
"name": "Origin",
"subset": [
"LAX",
"JFK"
]
}
}
The user can define a list of specifications and store them in a JSON file.
Verifying The Model Against User Specifications¶
To verify a machine learning model against a list of specifications, use the following command:
silas verify [OPTIONS] [model] [spec_file]
where model is either a directory that contains multiple decision trees or a file path of a single decision tree, and spec_file is the file that contains user specifications. OPTIONS include:
-h: Print help message and exit.
-f: Verify a forest. The model parameter must be a directory. This is the default if options are not supplied.
-t: Verify a decision tree. The model parameter must be a file.
For instance, to verify a forest stored in model/model1/ against specifications stored in spec/spec1.json, use the following command:
silas verify model/model1/ spec/spec1.json
To verify a tree stored in model/model1/tree9527.json against specifications stored in spec/spec1.json, use the following command:
silas verify -t tree model/model1/tree9527.json spec/spec1.json