First of all, I apologize since this question has probably been asked many times and is easily answered. However, as a statistics amateur I simply couldn’t figure out what keywords are relevant to my question.

Suppose you have 100 merchants and 100 products. Each merchant sells a certain range of products, ranging from only one product to all 100 products. Also, products are sold in widely different proportions, which differ among merchants, and are subject to the merchant’s individual (irrational) preferences.

Whenever a merchant makes a “pitch” on the market, we observe whether or not he manages to sell the product he’s pitching. We assume the probability of success depends (a) on the skill of the merchant and (b) the attractiveness of the product. The products’ prices are fixed, so that’s not a factor.

The data we have consists of millions of pitches. For each pitch, we know whether or not it was successful, the merchant, and the product.

Obviously, if we compare merchants by their average success rate, this information is useless because every merchant sells different products. Likewise, if we compare products, we gain no information since every product is sold by different merchants.

What we want is a skill score for each merchant, which is independent of the products the merchant is selling, and an attractiveness score for each product, which is independent of the merchants who are selling it.

I don’t need a comprehensive explanation, just some keywords to point me in the right direction. I literally have no idea where to start.

Edit: Note that our assumption is that the product attractiveness is merchant-independent and the merchant skill is product-independent, i.e. there are no merchants which are better at selling certain products but worse at selling others.

**Answer**

Let me expand on the alternative solution proposed by @curious_cat.

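$P_{ij}$ is the matrix of pitches (the number of pitches made by merchant $i$ for product $j$).

$L_{ij}$ is the matrix of sales.

$S_{ij} = L_{ij} / P_{ij}$ is the matrix of success rates (elementwise division where it is defined, i.e. where $P_{ij} > 0$, and 0 elsewhere).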
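As a minimal sketch of how these three matrices could be built from raw pitch records: the record layout `(merchant, product, sold)` with integer indices and the function name `success_rates` are assumptions made for this example, not something given in the question.

```python
def success_rates(records, n_merchants, n_products):
    """Build pitch counts P, sale counts L and success rates S.

    `records` is an iterable of (merchant, product, sold) tuples;
    this layout is an assumption made for the example.
    """
    P = [[0] * n_products for _ in range(n_merchants)]
    L = [[0] * n_products for _ in range(n_merchants)]
    for merchant, product, sold in records:
        P[merchant][product] += 1
        if sold:
            L[merchant][product] += 1
    # Elementwise division where P > 0, and 0 where no pitch was observed.
    S = [[L[i][j] / P[i][j] if P[i][j] else 0.0
          for j in range(n_products)]
         for i in range(n_merchants)]
    return P, L, S


# Tiny made-up data set: merchant 1 never pitched product 0.
records = [(0, 0, True), (0, 0, False), (0, 1, True), (1, 1, False)]
P, L, S = success_rates(records, n_merchants=2, n_products=2)
```

Note that a 0 in `S` is ambiguous on its own: it can mean "no sales" or "no pitches", which is why `P` has to be kept alongside it.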

As @curious_cat suggested, you want to approximate $S_{ij}$ by the outer product of two **positive** vectors, the merchant skills $M$ and the product attractiveness scores $A$:

$S_{ij} \approx M_i \times A_j^T$

Least-squares minimization leads to

$\min \| S_{ij} - M_i \times A_j^T \|_2$

where $\| \cdot \|_2$ is the Frobenius norm.

**BUT** you do not want to minimize over the entries in which $S_{ij}$ is not defined. So what you really want is something like:

$\min \| W_{ij} \odot (S_{ij} - M_i \times A_j^T) \|_2$

where $\odot$ is elementwise multiplication.
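Written out entry by entry, and squaring the norm (which does not change the minimizer), the objective can be read as a weighted sum of squared errors. This is only a restatement of the formula above, taking $i$ to index merchants and $j$ to index products:

$$\| W \odot (S - M A^T) \|_2^2 = \sum_{i,j} W_{ij}^2 \, (S_{ij} - M_i A_j)^2$$

Entries with $W_{ij} = 0$ contribute nothing to the sum, which is how the undefined success rates are excluded.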

1) As a first approximation, $W_{ij}$ is 0 where $P_{ij}$ is 0 and 1 elsewhere.

This is a **weighted non-negative matrix factorisation** (or approximation) problem. Google should give some references to it.
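For the rank-one case here, one simple way to attack the problem is alternating least squares: hold $A$ fixed and solve for each $M_i$ in closed form, then hold $M$ fixed and solve for each $A_j$, and repeat. This is a sketch of that swapped-in technique, not necessarily the algorithm a weighted NMF reference would use; the function name `weighted_rank1`, the iteration count, and the toy data are assumptions made for illustration. Because $S$ and the weights are non-negative and the starting vectors are positive, the updates stay non-negative without an explicit constraint.

```python
def weighted_rank1(S, W, n_iter=200):
    """Approximate S[i][j] by M[i] * A[j], minimising
    sum_ij W[i][j]**2 * (S[i][j] - M[i] * A[j])**2
    with alternating closed-form least-squares updates.
    """
    n, m = len(S), len(S[0])
    M = [1.0] * n
    A = [1.0] * m
    for _ in range(n_iter):
        # Update each merchant score with the product scores held fixed.
        for i in range(n):
            num = sum(W[i][j] ** 2 * S[i][j] * A[j] for j in range(m))
            den = sum(W[i][j] ** 2 * A[j] ** 2 for j in range(m))
            if den > 0:
                M[i] = num / den
        # Update each product score with the merchant scores held fixed.
        for j in range(m):
            num = sum(W[i][j] ** 2 * S[i][j] * M[i] for i in range(n))
            den = sum(W[i][j] ** 2 * M[i] ** 2 for i in range(n))
            if den > 0:
                A[j] = num / den
    return M, A


# Toy data: 3 merchants, 2 products, two cells never pitched (weight 0).
S = [[0.9, 0.45],
     [0.6, 0.0],
     [0.0, 0.15]]
W = [[1, 1],
     [1, 0],
     [0, 1]]
M, A = weighted_rank1(S, W)
```

The scores are only determined up to a common factor (multiplying $M$ by a constant and dividing $A$ by it gives the same fit), so it is the ratios such as `M[1] / M[0]`, or the rankings, that carry meaning. The products `M[i] * A[j]` also give a predicted success rate for merchant/product pairs that were never observed.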

2) Now, shooting from the hip, let us try to address the point also made by @curious_cat: you should trust a success rate of 1000 sales out of 2000 pitches more than one of 2 sales out of 4 pitches.

The weight $W_{ij}$ need not be uniformly 1 for the entries where $S_{ij}$ is defined. One can give more weight to success rates that are based on more pitches.

My guess is to use $\sqrt{P_{ij}}$ as the weight. The intuition is that the width of the confidence interval on the success rate is roughly inversely proportional to $\sqrt{P_{ij}}$.
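A small sketch of this weighting, using made-up pitch counts; the helper name `sqrt_pitch_weights` is hypothetical. One convenient side effect is that cells with zero pitches automatically get weight 0, matching the first approximation in 1).

```python
import math


def sqrt_pitch_weights(P):
    """Return weights equal to the square root of each pitch count."""
    return [[math.sqrt(p) for p in row] for row in P]


# Made-up counts: 2000 pitches, 4 pitches, no pitches, 100 pitches.
P = [[2000, 4],
     [0, 100]]
W = sqrt_pitch_weights(P)
```

Since the least-squares objective squares the weighted residual, each squared error ends up multiplied by the pitch count itself: the cell with 2000 pitches counts 500 times as much as the cell with 4.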

**Attribution** *Source: Link, Question Author: M. Cypher, Answer Author: Jacques Wainer*