terazi - AI Fairness for Doubly Imbalanced Data

A multi-criteria solution for training classifiers that balances fairness and performance

What sets terazi apart?

terazi is an AI fairness tool that finds an optimal model for a given dataset. What sets terazi apart from existing methods is that it specialises in doubly imbalanced datasets, where both the unfavourable label distribution and the privileged group distribution are skewed, rather than every subgroup having similar proportions. With its sampling algorithm and multi-criteria optimization, terazi ensures that classification performance is maximized under fair conditions.

Demo Datasets

BAF

Bank Account Fraud dataset (base version), generated from real-world data to preserve privacy and increase instance count (Jesus et al., 2022).

CCF

Credit Card Fraud dataset from Kaggle.

VIF

Vehicle Insurance Fraud dataset from Kaggle.

How does terazi work?

terazi works by sampling a dataset while searching for an optimal balance structure. Given the original data collection \(D\), the goal is to construct a sampled dataset \(D'\) with a certain target balance structure. The sampled collection \(D'\) is composed of the following four partitions:

  • \(p_f':\) privileged favourable samples
  • \(p_{uf}':\) privileged unfavourable samples
  • \(up_f':\) unprivileged favourable samples
  • \(up_{uf}':\) unprivileged unfavourable samples
The balance structure of the sampled set \(D'\) is controlled with three parameters, \(\alpha\), \(\beta\), and \(\gamma\). Each parameter takes a value in \([0, 1]\) and represents the distribution of a specific subgroup in \(D'\) (a sketch of the sampling scheme follows the list below). The subgroups they control are as follows:

  • Parameter \(\alpha:\) controls the unprivileged group rate within \(D'\).
  • Parameter \(\beta:\) controls the rate of unfavourably labelled instances within the unprivileged group.
  • Parameter \(\gamma:\) controls the rate of unfavourably labelled instances within the privileged group.
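
The following is a minimal sketch of this sampling scheme, assuming a pandas DataFrame with hypothetical label_col and group_col columns that hold favourable/unfavourable and privileged/unprivileged values; terazi's actual implementation and API may differ:

```python
import pandas as pd

def sample_balance(D, label_col, group_col, alpha, beta, gamma, n_total, seed=0):
    """Draw a sampled set D' of size n_total whose balance structure
    follows the alpha, beta and gamma parameters described above."""
    n_up = int(round(alpha * n_total))      # unprivileged group size in D'
    n_p = n_total - n_up                    # privileged group size in D'
    n_up_uf = int(round(beta * n_up))       # unfavourable count within the unprivileged group
    n_p_uf = int(round(gamma * n_p))        # unfavourable count within the privileged group

    # Target sizes for the four partitions of D'
    targets = {
        ("privileged", "favourable"): n_p - n_p_uf,        # p_f'
        ("privileged", "unfavourable"): n_p_uf,            # p_uf'
        ("unprivileged", "favourable"): n_up - n_up_uf,    # up_f'
        ("unprivileged", "unfavourable"): n_up_uf,         # up_uf'
    }

    parts = []
    for (group, label), n in targets.items():
        pool = D[(D[group_col] == group) & (D[label_col] == label)]
        # Oversample with replacement when a partition is smaller than its target
        parts.append(pool.sample(n=n, replace=len(pool) < n, random_state=seed))
    return pd.concat(parts).sample(frac=1, random_state=seed)  # shuffled D'
```
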
The sampling parameters are also controlled with a level option that sets the search depth and determines the interval at which their values are updated (a sketch of the resulting parameter grid follows the list below):
  • Level 0: Parameter values are selected at intervals of \(0.1\) within \([0, 1]\).
  • Level 1: Parameter values are selected at intervals of \(0.01\) within \([0, 1]\).
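
A minimal sketch of how the candidate parameter values could be enumerated at each level; the helper name and grid construction are illustrative rather than terazi's actual code:

```python
import numpy as np
from itertools import product

def parameter_grid(level):
    """Enumerate candidate (alpha, beta, gamma) triples in [0, 1]
    at the step size implied by the search-depth level."""
    step = 0.1 if level == 0 else 0.01
    n_steps = int(round(1.0 / step))
    values = np.round(np.linspace(0.0, 1.0, n_steps + 1), 2)
    return product(values, repeat=3)  # every (alpha, beta, gamma) combination
```

At level 0 this grid contains \(11^3 = 1331\) candidate combinations, while at level 1 it contains \(101^3 = 1030301\), so the deeper level is considerably more expensive to search.
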
Each sampled set \(D'\) is used to train the selected classifier, while optimizing performance according to the following loss function:
$$ \text{loss} = (1-\text{DI\_Ratio}) + (1-\text{MCC}) $$
where \(\text{DI\_Ratio}\) is the Disparate Impact Ratio metric, defined as:
$$ \text{DI\_Ratio} = \frac{P(L=\text{unfavourable} \mid G=\text{unprivileged})}{P(L=\text{unfavourable} \mid G=\text{privileged})} $$
and MCC is the Matthews Correlation Coefficient, a performance metric commonly used when the class labels are imbalanced, defined as:
$$ \text{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} $$
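
The following is a worked sketch of this objective, assuming binary encodings (1 = unfavourable label, 1 = unprivileged group) and using scikit-learn's matthews_corrcoef for MCC; the function names and encodings are illustrative rather than terazi's actual API:

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef

def di_ratio(y_pred, group, unfavourable=1, unprivileged=1):
    """Disparate Impact Ratio: the unfavourable prediction rate in the
    unprivileged group divided by the same rate in the privileged group."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    up = group == unprivileged
    rate_unprivileged = np.mean(y_pred[up] == unfavourable)
    rate_privileged = np.mean(y_pred[~up] == unfavourable)
    return rate_unprivileged / rate_privileged

def terazi_loss(y_true, y_pred, group):
    """loss = (1 - DI_Ratio) + (1 - MCC), as defined above."""
    return (1 - di_ratio(y_pred, group)) + (1 - matthews_corrcoef(y_true, y_pred))
```

A \(\text{DI\_Ratio}\) close to \(1\) and an MCC close to \(1\) both drive the loss towards \(0\), so minimising this single objective jointly rewards fairness and classification performance.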