Surprises in Proper Positive-Only Learning
By Shai Ben-David, Farnam Mansouri, Anay Mehrotra, Manolis Zampetakis
"Characterizes proper positive-only learning via finite VC dimension and new combinatorial condition, revealing separations from standard PAC learning."
Abstract
Binary classification from positive-only samples is a variant of PAC learning in which the learner receives i.i.d. samples from the positive region of an unknown target concept, but is evaluated under the original distribution (which places mass on both positive and negative regions). This model dates back to Natarajan [1987, STOC], and the characterization of improper learning is well-known -- it even appears in textbooks. The characterization of proper positive-only learning, however, has long remained open. In this work, we revisit and settle this question: a concept class is properly learnable from positive-only samples if and only if it has finite VC dimension and satisfies a new combinatorial condition, which we call uniform exterior separability. Together with several separation results, this characterization reveals a surprisingly rich landscape that differs sharply from standard PAC learning: proper and improper learning are separated, randomized and deterministic proper learning are separated, there are classes for which no ERM is a learner, and finite VC dimension does not suffice even for non-uniform learning. Along the way, we introduce new combinatorial dimensions that we believe can be of broader interest in learning theory.
Technical Analysis & Implementation
Summary§
This paper resolves the long-open question of proper learning from positive-only samples in the PAC model. The authors show that a concept class is properly learnable from positive-only examples if and only if it has finite VC dimension and satisfies a new condition called uniform exterior separability (UES). This leads to surprising separations: proper vs. improper learning, randomized vs. deterministic proper learning, classes where no ERM succeeds, and finite VC dimension insufficient even for non-uniform learning.
Core Technical Contribution§
Model (Proper Positive-Only Learning)§
The learner receives i.i.d. samples from the positive region $\mathcal{X}^+$ of an unknown concept $c^ \in \mathcal{C}$ (where domain $\mathcal{X} = \mathcal{X}^+ \cup \mathcal{X}^-$). The learner must output a hypothesis $h \in \mathcal{C}$ (proper). Evaluation is under the original distribution $\mathcal{D}$ over $\mathcal{X}$, with generalization error $\Pr_{x \sim \mathcal{D}}[h(x) \neq c^(x)]$.
Key Result§
Theorem. A concept class $\mathcal{C}$ is properly learnable from positive-only samples iff: 1. $\text{VC}(\mathcal{C}) < \infty$, and 2. $\mathcal{C}$ is uniformly exterior separable.
Uniform exterior separability (UES) means: there exists a function $f: \mathbb{N} \to \mathbb{N}$ such that for every $n$, every finite subset $S \subseteq \mathcal{X}^+$ of size $n$ can be separated from its complement by some $h \in \mathcal{C}$ with margin $1/f(n)$ (in the sense of a distance-based separator). Formal definition involves a notion of "exterior" points.
Implications§
- Proper learning is strictly harder than improper learning (which only requires finite VC dimension).
- Randomized proper learners can outperform deterministic ones.
- There exist classes with finite VC dimension that are not properly learnable even non-uniformly.
- ERM can fail: some classes are properly learnable but no ERM (empirical risk minimization) succeeds.
New Combinatorial Dimensions§
The paper introduces the positive VC dimension and positive exterior dimension to quantify the hardness of proper positive-only learning.
Code Illustration§
Below is a conceptual PyTorch snippet illustrating the evaluation setup for a proper positive-only learner (not the UES condition itself):
import torch
import torch.nn as nn
# Hypothetical concept class: linear classifiers in R^2
class LinearClassifier(nn.Module):
def __init__(self):
super().__init__()
self.w = nn.Linear(2, 1, bias=True)
def forward(self, x):
return torch.sign(self.w(x)).squeeze()
# Positive-only training data (all labels +1)
X_pos = torch.randn(100, 2) # all positive examples
# Learner: must output hypothesis from same class (proper)
model = LinearClassifier()
# Train with some proper learning algorithm (e.g., using only positive data)
# ... (not trivial; paper shows ERM may fail)
# Evaluation distribution D (both positive and negative)
X_test = torch.randn(200, 2)
y_test = torch.randint(0, 2, (200,)) # true labels
with torch.no_grad():
preds = model(X_test)
error = (preds != y_test).float().mean()Key Equations§
Let $\mathcal{C} \subseteq \{0,1\}^{\mathcal{X}}$. For a distribution $\mathcal{D}$ over $\mathcal{X}$, the positive-only learning error of hypothesis $h$ is: $$ \text{err}(h) = \Pr_{x \sim \mathcal{D}}[h(x) \neq c^(x)]. $$ The learner only sees positive examples $x \sim \mathcal{D}^+$, where $\mathcal{D}^+$ is $\mathcal{D}$ conditioned on $c^(x)=1$.
The uniform exterior separability condition: there exists $m: \mathbb{N} \to \mathbb{N}$ such that for any $n$ and any set $S \subseteq \mathcal{X}^+$ of size $n$, there is $h \in \mathcal{C}$ with $h(S) = 1$ and $\forall x \notin S$, $h(x) = 0$ if $x$ is "exterior" to $S$ (i.e., outside a certain neighborhood).
Conclusion§
This paper provides a complete characterization of proper positive-only learning, revealing a rich structure beyond standard PAC learning. The new combinatorial condition UES may find applications in other learning models with restricted data access.