Submodular Mutual InformationΒΆ

Denote \(\mathcal{V}\) as the ground-set of items to be summarized. We denote by \(\mathcal{V}^{\prime}\) an auxiliary set that contains user-provided information such as a query (for query-focused summarization or targeted subset selection). The auxiliary information provided by the user may not be in the same space as the items in \(\mathcal{V}\) – for example, if the items in \(\mathcal{V}\) are images, the query could be text queries. In such a case, we assume we have a joint embedding that can represent both the query and the image items, and correspondingly, we can define similarity between the items in \(\mathcal{V}\) and \(\mathcal{V}^{\prime}\). Next, let \(\Omega = \mathcal{V} \cup \mathcal{V}^{\prime}\) and define a set function \(f: 2^{\Omega} \rightarrow \Re\). Although \(f\) is defined on \(\Omega\), summarization is on the items in \(\mathcal{V}\), i.e., the discrete optimization problem will be only on subsets of \(\mathcal{V}\).

We define the submodular mutual information [GB11, GL20] between two sets \(A,B\) as

\[I_f(A; B) = f(A) + f(B) - f(A \cup B)\]

It is easy to see that \(I_f(A; B)\) is equal to the mutual information between two random variables when \(f\) is the entropy function. Intuitively, this measures the similarity between \(B\) and \(A\) where \(B\) is the query set.

Note

\(I_f(A; B) = f(A) - f(A|B)\)

Properties of submodular mutual information are studied at length in [IKBA21].

For application in query-focused summarization, \(B = Q\) where \(Q \subseteq \mathcal{V}^{\prime}\) is a query set.

Some simple properties of SMI which follow almost immediately from definition is that \(I_f(A; B) \geq 0\) and \(I_f(A; B)\) is also monotone in \(A\) for a fixed \(B\). \(I_f(A; Q)\) models the mutual coverage, or shared information, between \(A\) and \(Q\), and is thus useful for modeling query relevance in query-focused summarization.

Note

\(I_f(A; Q)\) is unfortunately not submodular in \(A\) for a fixed \(Q\) in general [KSG08]. However some instantiations of SMI (using a some submodular function) may turn out to be submodular.

Examples of Submodular Mutual Information functions: