iguanas.rules.Rules

class iguanas.rules.Rules(rule_dicts=None, rule_strings=None, rule_lambdas=None, lambda_kwargs=None, lambda_args=None, opt_func=None)

Defines a set of rules using the following representations: string, dictionary, lambda expression.

One of the above formats must be provided to define the rule set. The rules can then be reformatted into one of the other representations.

Parameters
rule_dicts : Dict[str, dict]

Set of rules defined using the standard Iguanas dictionary format (values) and their names (keys). Defaults to None.

rule_strings : Dict[str, str]

Set of rules defined using the standard Iguanas string format (values) and their names (keys). Defaults to None.

rule_lambdas : Dict[str, Callable[[dict], str]]

Set of rules defined using the standard Iguanas lambda expression format (values) and their names (keys). Must be given in conjunction with either lambda_kwargs or lambda_args. Defaults to None.

lambda_kwargs : Dict[str, dict]

For each rule (keys), a dictionary containing the features used in the rule (keys) and their current values (values). Only populated when .as_rule_lambdas() is called with with_kwargs=True. Defaults to None.

lambda_args : Dict[str, list]

For each rule (keys), a list containing the current values used in the rule. Only populated when .as_rule_lambdas() is called with with_kwargs=False. Defaults to None.

opt_func : Callable, optional

A function/method which calculates a custom metric (e.g. Fbeta score) for each rule when applying the rules to a dataset. Defaults to None.
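As an illustration of the kind of metric opt_func could compute, here is a minimal sketch of an F-beta score for one rule's binary output. The name fbeta_score_sketch and its (y_true, y_pred, beta) signature are assumptions made for this example; the signature that Rules expects from opt_func is not specified on this page.

```python
from typing import Sequence


def fbeta_score_sketch(y_true: Sequence[int], y_pred: Sequence[int],
                       beta: float = 1.0) -> float:
    """Hypothetical metric: F-beta of a binary rule flag against a binary target."""
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No correct flags: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy data: the target column and the binary output of a single rule.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0]
score = fbeta_score_sketch(y_true, y_pred, beta=1.0)
```

With beta above 1 the score weights recall more heavily; with beta below 1 it favours precision.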

Attributes
rule_dicts : Dict[str, dict]

Set of rules defined using the standard Iguanas dictionary format (values) and their names (keys).

rule_strings : Dict[str, str]

Set of rules defined using the standard Iguanas string format (values) and their names (keys).

rule_lambdas : Dict[str, Callable[[dict], str]]

Set of rules defined using the standard Iguanas lambda expression format (values) and their names (keys).

lambda_kwargs : Dict[str, dict]

For each rule (keys), a dictionary containing the features used in the rule (keys) and the current values (values).

lambda_args : Dict[str, list]

For each rule (keys), a list containing the current values used in the rule.

rule_features : Dict[str, set]

For each rule (keys), a set containing the features used in the rule (only populated when the .get_rule_features() method is used).

rule_descriptions : PandasDataFrameType

A dataframe showing the logic of the rules and their performance metrics on the provided dataset (only populated when the .transform() method is used).

as_rule_dicts() → Dict[str, dict]

Converts rules into the standard Iguanas dictionary format.

Returns
Dict[str, dict]

Rules in the standard Iguanas dictionary format.

as_rule_strings(as_numpy: bool) → Dict[str, str]

Converts rules into the standard Iguanas string format.

Parameters
as_numpy : bool

If True, the conditions in the string format will use NumPy rather than Pandas. NumPy-based rules are generally evaluated more quickly on larger datasets stored as Pandas DataFrames.

Returns
Dict[str, str]

Rules in the standard Iguanas string format.
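To make the relationship between the dictionary and string representations concrete, the following is a minimal sketch of a dictionary-to-string conversion in plain Python. It does not call Iguanas. The nested layout ('condition', 'rules', 'field', 'operator', 'value'), the operator names and the X['feature'] string syntax are all assumptions about the formats, and rule_dict_to_string is a hypothetical helper, not part of the library.

```python
# Assumed operator names; the real Iguanas format may use a different set.
OPERATORS = {
    "greater": ">",
    "greater_or_equal": ">=",
    "less": "<",
    "less_or_equal": "<=",
    "equal": "==",
    "not_equal": "!=",
}
CONNECTORS = {"AND": "&", "OR": "|"}


def rule_dict_to_string(rule: dict, nested: bool = False) -> str:
    """Recursively render a nested rule dictionary as a single string."""
    if "rules" in rule:
        # A group of conditions joined by AND / OR.
        joined = CONNECTORS[rule["condition"]].join(
            rule_dict_to_string(child, nested=True) for child in rule["rules"]
        )
        # Only inner groups need wrapping parentheses.
        return f"({joined})" if nested else joined
    # A single comparison on one feature.
    op = OPERATORS[rule["operator"]]
    return f"(X['{rule['field']}']{op}{rule['value']!r})"


# Hypothetical rule: A > 1 AND (B == 0 OR C < 2.5)
rule_dict = {
    "condition": "AND",
    "rules": [
        {"field": "A", "operator": "greater", "value": 1},
        {
            "condition": "OR",
            "rules": [
                {"field": "B", "operator": "equal", "value": 0},
                {"field": "C", "operator": "less", "value": 2.5},
            ],
        },
    ],
}
rule_string = rule_dict_to_string(rule_dict)
# rule_string == "(X['A']>1)&((X['B']==0)|(X['C']<2.5))"
```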

as_rule_lambdas(as_numpy: bool, with_kwargs: bool) → Dict[str, Callable[[dict], str]]

Converts rules into the standard Iguanas lambda expression format.

Parameters
as_numpy : bool

If True, the conditions in the string returned by each lambda expression will use NumPy rather than Pandas. NumPy-based rules are generally evaluated more quickly on larger datasets stored as Pandas DataFrames.

with_kwargs : bool

If True, the string in the lambda expression is created such that the inputs are keyword arguments. If False, the inputs are positional arguments.

Returns
Dict[str, Callable[[dict], str]]

Rules in the standard Iguanas lambda expression format.
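The difference between the keyword and positional forms can be sketched in plain Python. The lambdas below are written by hand rather than produced by as_rule_lambdas, and the X['feature'] string syntax is an assumption about the Iguanas format; only the parameter names lambda_kwargs and lambda_args come from this page.

```python
# Keyword form (with_kwargs=True): values are supplied by feature name.
rule_kwargs = lambda **kwargs: "(X['A']>{A})&(X['B']=={B})".format(**kwargs)

# Positional form (with_kwargs=False): values are supplied in order.
rule_args = lambda *args: "(X['A']>{})&(X['B']=={})".format(*args)

# Current values for one rule, in the two shapes described above.
lambda_kwargs = {"A": 1, "B": 0}
lambda_args = [1, 0]

from_kwargs = rule_kwargs(**lambda_kwargs)
from_args = rule_args(*lambda_args)
# Both calls produce the string "(X['A']>1)&(X['B']==0)"
```

Keeping the values separate from the rule logic in this way lets the same rule be re-rendered with different thresholds by passing in new values.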

transform(X: Union[iguanas.utils.typing.pandas.core.frame.DataFrame, iguanas.utils.typing.databricks.koalas.frame.DataFrame], y=None, sample_weight=None) → Union[iguanas.utils.typing.pandas.core.frame.DataFrame, iguanas.utils.typing.databricks.koalas.frame.DataFrame]

Applies the set of rules to a dataset, X. If y is provided, the performance metrics for each rule will also be calculated.

Parameters
X : Union[PandasDataFrameType, KoalasDataFrameType]

The feature set on which the rules should be applied.

y : Union[PandasSeriesType, KoalasSeriesType], optional

The target column. Defaults to None.

sample_weight : Union[PandasSeriesType, KoalasSeriesType], optional

Record-wise weights to apply. Defaults to None.

Returns
Union[PandasDataFrameType, KoalasDataFrameType]

The binary columns of the rules.

filter_rules(include=None, exclude=None) → None

Filters the rules by their names.

Parameters
include : List[str], optional

The list of rule names to keep. Defaults to None.

exclude : List[str], optional

The list of rule names to drop. Defaults to None.

Raises
Exception

Raised if include and exclude contain any of the same rule names.
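The include/exclude behaviour can be sketched on a plain dictionary of rules. filter_by_name is a hypothetical stand-in written for this example, not the library's implementation: it returns a new dictionary, whereas filter_rules returns None. Treating any shared name between include and exclude as an error is an assumption about what the Raises entry means.

```python
from typing import Dict, List, Optional


def filter_by_name(rules: Dict[str, str],
                   include: Optional[List[str]] = None,
                   exclude: Optional[List[str]] = None) -> Dict[str, str]:
    """Hypothetical stand-in: keep `include` names, drop `exclude` names."""
    if include is not None and exclude is not None:
        overlap = set(include) & set(exclude)
        if overlap:
            # Assumed error condition: a name cannot be both kept and dropped.
            raise Exception(f"include and exclude share names: {sorted(overlap)}")
    kept = dict(rules)
    if include is not None:
        kept = {name: rule for name, rule in kept.items() if name in include}
    if exclude is not None:
        kept = {name: rule for name, rule in kept.items() if name not in exclude}
    return kept


rules = {
    "Rule1": "(X['A']>1)",
    "Rule2": "(X['B']==0)",
    "Rule3": "(X['C']<2.5)",
}
kept = filter_by_name(rules, include=["Rule1", "Rule2"])
dropped = filter_by_name(rules, exclude=["Rule3"])
# Both results contain only Rule1 and Rule2.
```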

get_rule_features() → Dict[str, set]

Returns the set of unique features present in each rule.

Returns
Dict[str, set]

Set of unique features (values) in each rule (keys).
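The shape of the returned mapping can be illustrated with a short sketch that extracts feature names from rule strings using a regular expression. features_per_rule is a hypothetical helper, and the X['feature'] syntax it parses is an assumption about the string format; the library may derive the features differently.

```python
import re
from typing import Dict, Set

# Matches the feature name inside X['...'] (assumed string syntax).
FEATURE_PATTERN = re.compile(r"X\['([^']+)'\]")


def features_per_rule(rule_strings: Dict[str, str]) -> Dict[str, Set[str]]:
    """Return the set of unique features referenced by each rule string."""
    return {
        name: set(FEATURE_PATTERN.findall(rule))
        for name, rule in rule_strings.items()
    }


rule_strings = {
    "Rule1": "(X['A']>1)&(X['B']==0)",
    "Rule2": "(X['A']<5)|(X['A']>10)",
}
rule_features = features_per_rule(rule_strings)
# Rule1 uses features A and B; Rule2 uses only A, counted once.
```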