Matching Problem
Matching means find a match partner that satisfies certain requirements. It's a well-reseached area with applications to marriage, hospitality etc. Here we focus on finding a potential partner in romantic relationship.
Problem Formulation
Given a hypothesis space, we should first remove most infeasible points. Compared with finding all perfect points, what we should do is only finding one (or only a few) perfect point. To this end, we can first select some points equipped with some certain properties that form a much smaller but not much worse hypertheses subspace.
Suppose that you own a feature vector $x\in\mathbb{R}d$. What you should find is another one equipped with the feature vector $y\in\mathcal{F}\subset\mathbb{R}d$ such that $\langle x,y\rangle=\arg\max_{z\in\mathcal{F}}\langle x,z\rangle$.
The social network can be defined as a graph $(V,E,f)$, where $V$ is the set of all agents, $E\subset V2$ is the space of edges that connect pairs of agents, and $f$ the stability function that maps each edge to the interval $(0,1]$ to characterize how familar the end points are.
We can assume that there exists another function $I$ to bound the inner product error:
$$
|\langle x,y\rangle- I(f(x,y))|\le \epsilon.
$$
Chain-Basd Search Strategy
Given a chain $x,n(x),n2(x),n3(x),\dots, nk(x),\dots$, one of my friends X always prefer $nk(x)$ over $n^{k-1}(x)$ if and only if $\langle x,nk(x)\rangle \ge \langle x,n^{k-1}(x)\rangle$. If not, you can assume that X would leave $n^{k}(x)$ since X believes that the following search starting at $n^{k}(x)$ would not be better than starting at $n^{k-1}(x)$. It is a feedforward-backforward process.
Moreover, we can assume that $E_{n(x)}[f(x,n(x))\cdot n(x)]=x$ if we believe that one person is the union of all people he/she meets.That means, we believe that we can always find a better point $n^{k}(x)$ than the starting searching point $n^{k-1}(x)$.
However, we have no idea how far we should dive into, even if the inner product $\langle x,nk(x)\rangle$ always increases with respected to $k$. So my friend X searches lots of routes/chains, and estimates the upperbounds of the possible prefer points' inner products in each chains. He would prefer search the chain of higher upperbound, like the UCB algorithm does. Luckily, he has found a good matching point Y. Congratulations!
Cost Reduction
The only problem left here is: the cost of this chain search is high. We further made the following assumptions:
- The cost of computing the inner product is negatively proportional to the familiarity of the two agents: $c(x,y)= \frac{1}{f(x,y)}\in(1,+\infty)$.
- The total cost of searching a chain is the sum of the cost of computing the inner products of all edges in the chain: $C(x,n(x))=\sum_{k=1}^{m}c(n^{k-1}(x),nk(x))$.
Then we have the total cost as:
$$
C(x,n(x))=\sum_{k=1}^{m}\frac{1}{f(n^{k-1}(x),nk(x))}.
$$
Under such assumption, the best way to reduce cost is to search along the chain of the highest closeness and the lowest length. It's easy to prove that
$$
\argmin{y\in V} C(x,y) = \argmax{y \in n(x)} {f(x,y)}.
$$
Theorem 1 (Most Cost-Effective Matching): The most cost-effective matching is the most close one.
But if one is more tolerant to the cost, he/she can search more chains and (often) find a better matching point.
We may define the tolerance of cost $T(x)$ as the maximum cost affordable for $x$. Then the total affordable cost for $x,y$ to match is:
$$
T(x,y) = \min{T(x),T(y)}.
$$
Given no prior knowledge of $y$, we have $\mathbb{E}{y\in V}[T(x,y)]\le \mathbb{E}{y \in V}[T(y)]$. This indicate that we should always assume a low tolerance (below average).
Now we introduce another assumption (that lead to dillema):
- The closest points are often of the same gender or is legally unmarriable (or of the family financial status that is not matchable).
This happens when $f(x,y)\to 1$ but $I(f(x,y))=0$.
Due to the social labor distribution system, such dillema can be seen in many sectors including university, company, government institute etc.
A solution to matching dillema
Definition 1 (Matching Dillema): Given a social network $(V,E,f)$, a point $x$ is in a matching dillema when $f(x,y)\to 1$ but $I(f(x,y))=0$, for most $y \in n(x)$.
But there are exceptions: High-school classmates are a group of selected people that are familiar with each other (financially and academicly) and does not suffer from the matching dillema.
Because of this, theorem 1 is often referred to as "The end of matching is high-school classmates", which point out the fact that the people we are familair with & close to are often high-school classmates.
Corollary 1 (The end of matching is high-school classmates): The most cost-effective matching is the most close one that are matchable, which is often high-school classmates.
And many people also do not suffer from the matching dillema, since they have a lot of close points that are matchable (due to gender ratio or personal characteristics).
Corollary 2 (Characteristics over enviroment): If one has a lot of close points that are matchable, he/she does not suffer from the matching dillema.
So the most natural solution to such dillema is actively getting familiar with potential matchable people around people they know.
Theorem 2 (Active and Multiple): For individuals suffering from the matching dillema, the most natural solution is actively getting familiar with multiple potential matchable people around people they know.
It's important to point out that such Active and Multiple strategy will definitely work if one pays enough energy and time to it. (And I've seen real world examples that succeeded)
But such strategy is risky: compared to naturally close people, building a closer relationship requires more efforts, and having close relationship with many potential matchable people is morally under-rated.
Here, we provide another solution that is believed to be less costy both morally and emotionally.
Recall that matching dillema occurs when we are looking for the least cost path, if we are willing to extend further (i.e. $n2(x)$ instead of $n(x)$), the restriction is then easily relaxed, and the cost is slightly increased to
$$
C(x,y) = C(x,n(x))+C(n(x),y) = \frac{1}{f(x,n(x))}+\frac{1}{f(n(x),y)}.
$$
It's claimed that, $x,y$ does not suffer from matching dillema, and the cost is just slightly increased by $\frac{1}{f(x,n(x))}$, if you are familiar enough with $n(x)$, then the cost is almost negligible.
Theorem 3 (Relaxation via Indirect Matching): For individuals suffering from the matching dillema, they can relax the restriction by searching the chain of length 2, which is often affordable.
Combined this with the matching dillema definition, we have the following corollary:
Corollary 3 (Relatives of close friends): The Most cost-effective matching is often the relatives of close friends, or friends of close relatives.
Here relative means the family members, close friends include lovers.
Assume that I'm a male, examples of such matching include:
- My close friend's sister.
- My sister's close friend.
- My close friend's close friend.
Note that it should be of similar age and have no legal issues.
To make things less awkward and more natural, it's better to organize a party/event and invite both sides.
In fact, such claims justified the matching arranged by parents(i.e. friends of close relatives). But the way of arranging is worth improving: The line should know each other's characteristics and interests and meet up should be arranged and initiated by themselves, given that the line provide proper opportunities: instead of awkward blind meet determined by age/money/status, people can interact in a more friendly and engaging activity.
This lead to the so-called peer meet-up activity scheduled by some NGOs with label like "break your comfort zone, meet 100 interesting people offline.". But this is more fragile and have larger variance compared to searching along familiar graph due to it's randomness.
So our solution can be summarized as follows:
Organize friendly and engaging activities where both sides are invited and are aware of.