Conditional Gain

Denote \(\mathcal{V}\) as the ground-set of items to be summarized. We denote by \(\mathcal{V}^{\prime}\) an auxiliary set that contains user-provided information like a private set (for privacy-preserving summarization or query-irrelevant summarization). The auxiliary information provided by the user may not be in the same space as the items in \(\mathcal{V}\) – for example, if the items in \(\mathcal{V}\) are images, the private instances could be textual topics. In such a case, we assume we have a joint embedding that can represent both the private instances and the image items, and correspondingly, we can define similarity between the items in \(\mathcal{V}\) and \(\mathcal{V}^{\prime}\). Next, let \(\Omega = \mathcal{V} \cup \mathcal{V}^{\prime}\) and define a set function \(f: 2^{\Omega} \rightarrow \Re\). Although \(f\) is defined on \(\Omega\), summarization is on the items in \(\mathcal{V}\), i.e., the discrete optimization problem will be only on subsets of \(\mathcal{V}\).

Given a set of items \(A, B \subseteq \Omega\), the conditional gain is the gain in function value by adding \(A\) to \(B\). That is,

\[f(A | B) = f(A \cup B) - f(B)\]

When \(f\) is entropy, this corresponds to the conditional entropy. Intuitively, \(f(A|B)\) measures how different \(A\) is from \(B\), where \(B\) is the conditioning set or the private set or the irrelevant set.

Examples of CG include \(f(A | P) = f(A \cup P) - f(P), A \subseteq \mathcal{V}\) where \(P \subseteq \mathcal{V}^{\prime}\) is either the private set or the irrelevant set.

Another example of CG is \(f(A | A_0), A, A_0 \in \mathcal{V}\) where \(A_0\) is a summary chosen by the user before. This is important for update-summarization [DO08, DA12, LLZ15] where the desired summary should be different from a pre-existing one.

The conditional gain has been studied in a number of optimization problems involving submodular functions [IB12, IB13, KG14].

Properties of conditional gain are studied at length in [IKBA21].

Note

Conditional Gain functions are non-negative and monotone in one argument with the other fixed [GL20, IKBA21].

Examples of Conditional Gain functions: