Exercise 3.1.3 : Suppose we have a universal set U of n elements, and

we choose two subsets S and T at random, each with m of the n

elements.What is the expected value of the Jaccard similarity of S

and T ?I am reading the book

http://infolab.stanford.edu/~ullman/mmds/ch3.pdf

**Answer**

Each item in T has an $\frac{m}{n}$ chance of also being in S. The expected number of items common to S & T is therefore $\frac{m^2}{n}$.

Exp. $\text{Jaccard Similarity} = \dfrac{\text{No. of common items}}{\text{Size of T} + \text{Size of S} – \text{Number of common items}} = \dfrac{m}{2n – m}$ (after simplification.)

**Attribution***Source : Link , Question Author : delete me , Answer Author : Soroush*