Measuring individual player effectiveness in 2-player per team sports

I’ve got a spreadsheet of some team scores. First team to 10 points wins. There are 2 players on each team. The players play with different teammates all the time, although they are not chosen perfectly randomly. No individual scores are kept.

So basically we have
Bill and Bob beat Andy and Alice 10-4
Jake and Bill beat Joe and John 10-8

Is it possible to come up with some ranking for the individual players, based on all the available match data? Basically, I want to see how much each player contributes to each game, either in terms of points or relative to the other players.

Below are a couple of very simple models. They are both deficient in at least one way, but maybe they’ll provide something to build on. The second model actually does not (quite) address the OP’s scenario (see remarks below), but I am leaving it in case it helps in some way.

Model 1: A variant of the Bradley–Terry model

Suppose we are primarily interested in predicting whether one team will beat another based on the players on each team. We can simply record whether Team 1 with players $(i,j)$ beats Team 2 with players $(k,\ell)$ for each game, ignoring the final score. Certainly, this is throwing away some information, but in many cases this still provides lots of information.

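The model is then

$$
\mathrm{logit}\big(\mathbb{P}(\text{Team 1 beats Team 2})\big) = \alpha_i + \alpha_j - \alpha_k - \alpha_\ell \, .
$$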

That is, we have an “affinity” parameter for each player that affects how much that player improves the chance of his team winning. Define the player’s “strength” by $s_i = e^{\alpha_i}$. Then, this model asserts that

$$
\mathbb{P}(\text{Team 1 beats Team 2}) = \frac{s_i s_j}{s_i s_j + s_k s_\ell} \, .
$$

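There is a very nice symmetry here in that it doesn’t matter how the response is coded as long as it is consistent with the predictors. That is, we also have

$$
\mathrm{logit}\big(\mathbb{P}(\text{Team 2 beats Team 1})\big) = \alpha_k + \alpha_\ell - \alpha_i - \alpha_j \, .
$$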

This can be fit easily as a logistic regression with predictors that are indicators (one for each player) taking value $+1$ if player $i$ is on Team 1 for the game in question, $-1$ if she’s on Team 2 and $0$ if she does not participate in that game.

From this we also have a natural ranking for the players. The larger the $\alpha$ (or $s$), the more the player improves her team’s chance of winning. So, we can simply rank players according to their estimated coefficients. (Note that the affinity parameters are only identifiable up to a common offset. Therefore, it is typical to fix $\alpha_1 = 0$ to make the model identifiable.)
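As a rough illustration of the fitting and ranking steps just described, here is a minimal sketch in plain Python. It builds the $+1/-1/0$ indicator rows and fits the coefficients by gradient ascent on the logistic log-likelihood. The function names are hypothetical, and the small ridge penalty is an added assumption rather than part of the model: it keeps the estimates finite when a player has never lost, and it resolves the common-offset ambiguity by centering the $\alpha$'s at zero instead of fixing $\alpha_1 = 0$. The two games are the ones quoted in the question; with so little data the resulting ranking means very little.

```python
import math

def fit_team_bradley_terry(games, n_iter=5000, lr=0.5, ridge=0.01):
    """Fit player affinities alpha from (team1, team2, team1_won) records.

    ridge is an assumption added for this sketch, not part of the model.
    """
    players = sorted({p for t1, t2, _ in games for p in (*t1, *t2)})
    index = {p: j for j, p in enumerate(players)}

    # Indicator coding: +1 if on Team 1, -1 if on Team 2, 0 if not playing.
    X, y = [], []
    for team1, team2, team1_won in games:
        row = [0.0] * len(players)
        for p in team1:
            row[index[p]] = 1.0
        for p in team2:
            row[index[p]] = -1.0
        X.append(row)
        y.append(1.0 if team1_won else 0.0)

    # Gradient ascent on the ridge-penalized logistic log-likelihood.
    alpha = [0.0] * len(players)
    for _ in range(n_iter):
        grad = [-ridge * a for a in alpha]
        for row, yi in zip(X, y):
            eta = sum(a * x for a, x in zip(alpha, row))
            p_win = 1.0 / (1.0 + math.exp(-eta))
            for j, x in enumerate(row):
                grad[j] += (yi - p_win) * x
        alpha = [a + lr * g / len(games) for a, g in zip(alpha, grad)]
    return dict(zip(players, alpha))

def win_probability(alpha, team1, team2):
    """P(team1 beats team2) from strengths s = exp(alpha)."""
    s1 = math.exp(sum(alpha[p] for p in team1))
    s2 = math.exp(sum(alpha[p] for p in team2))
    return s1 / (s1 + s2)

# The two results quoted in the question (final scores are ignored here).
games = [
    (("Bill", "Bob"), ("Andy", "Alice"), True),
    (("Jake", "Bill"), ("Joe", "John"), True),
]
alpha = fit_team_bradley_terry(games)
ranking = sorted(alpha, key=alpha.get, reverse=True)
print(ranking)
```

In practice any standard logistic-regression routine (without an intercept) applied to the same indicator matrix should do the same job; the hand-rolled loop is only there to keep the sketch dependency-free.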

Model 2: Independent scoring

NB: Upon rereading the OP’s question, it’s apparent that the models below are inadequate for his setup. Specifically, the OP is interested in a game that ends after a fixed number of points are scored by one team or the other. The models below are more appropriate for games that have a fixed duration in time. Modifications can be made to fit better within the OP’s framework, but it would require a separate answer to develop.

Now we want to keep track of scores. Suppose it’s a reasonable approximation that each team scores points independently of the other, with the number of points scored in any time interval independent of the number scored in any disjoint interval. Then the number of points each team scores can be modeled as a Poisson random variable.

Thus, we can set up a Poisson GLM such that the score $Y$ of some team consisting of players $i$ and $j$ in a particular game is

$$
Y \sim \mathrm{Poisson}(\mu), \qquad \log(\mu) = \gamma_i + \gamma_j \, .
$$

Note that this model ignores the actual matchups between teams, focusing purely on scoring.
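To make the fitting step concrete, here is a minimal sketch that assumes the log-linear form $\log \mu = \gamma_i + \gamma_j$ with no intercept. Writing $\sigma_i = e^{\gamma_i}$, the likelihood equation for player $i$ says that her teams' total points should equal $\sigma_i$ times the summed rates of her partners, which suggests a simple coordinate-wise update. The function name and the scores below are made up for illustration. One caveat: individual rates are only pinned down if the "who has partnered whom" graph contains an odd cycle (for example, three players who have each teamed with the other two); otherwise only the team products are determined.

```python
def fit_scoring_rates(results, n_sweeps=500):
    """Fit sigma_p = exp(gamma_p) from ((player_a, player_b), points) records.

    Each sweep updates one player at a time, holding the others fixed:
        sigma_p = (points scored by p's teams) / (sum of p's partners' rates)
    which solves the Poisson likelihood equation in that one coordinate.
    """
    players = sorted({p for team, _ in results for p in team})
    sigma = {p: 1.0 for p in players}
    for _ in range(n_sweeps):
        for p in players:
            points = 0.0
            partner_rates = 0.0
            for (a, b), y in results:
                if p == a:
                    points += y
                    partner_rates += sigma[b]
                elif p == b:
                    points += y
                    partner_rates += sigma[a]
            if partner_rates > 0:  # guard against all partners having rate 0
                sigma[p] = points / partner_rates
    return sigma

# Hypothetical, noise-free scores for three players rotating as partners.
results = [
    (("A", "B"), 6),
    (("A", "C"), 2),
    (("B", "C"), 3),
]
sigma = fit_scoring_rates(results)
print(sigma)
```

A general-purpose Poisson GLM routine with 0/1 player indicators as predictors should give the same fit; the update above is just a compact way to see what the estimates are balancing.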

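It does have an interesting connection to the modified Bradley–Terry model. Define $\sigma_i = e^{\gamma_i}$ and suppose that a “sudden-death” game is played in which the first team to score wins. If Team 1 has players $(i,j)$ and Team 2 has players $(k,\ell)$, then

$$
\mathbb{P}(\text{Team 1 beats Team 2 in sudden death}) = \frac{\sigma_i \sigma_j}{\sigma_i \sigma_j + \sigma_k \sigma_\ell} \, .
$$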

Thus, the mean rate of scoring of the players is equivalent to the “strength” parameter formulation of Model 1.
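A quick simulation can serve as a sanity check on this connection. The sketch below assumes each team scores as an independent Poisson process whose rate is the product of its players' $\sigma$ values, so the waiting time to a team's first point is exponential with that rate. The fraction of simulated games in which Team 1 scores first should then be close to $\sigma_i \sigma_j / (\sigma_i \sigma_j + \sigma_k \sigma_\ell)$. The function names and the rates used are illustrative assumptions only.

```python
import random

def sudden_death_probability(rate1, rate2):
    """Closed form: probability that the team with rate1 scores first."""
    return rate1 / (rate1 + rate2)

def simulate_sudden_death(rate1, rate2, n_games=20000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_games):
        # Waiting time to the first point of a Poisson process is exponential.
        t1 = rng.expovariate(rate1)
        t2 = rng.expovariate(rate2)
        if t1 < t2:
            wins += 1
    return wins / n_games

# Hypothetical rates: Team 1 has sigma = 2 and 3, Team 2 has sigma = 1 and 2.
rate1, rate2 = 2.0 * 3.0, 1.0 * 2.0
print(sudden_death_probability(rate1, rate2))
print(simulate_sudden_death(rate1, rate2))
```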

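We might consider making this model more complex by having an “offense” affinity $\rho_i$ and “defense” affinity $\delta_i$ for each player, such that if Team 1 with $(i,j)$ plays Team 2 with $(k,\ell)$, then

$$
\log(\mu_1) = \rho_i + \rho_j - \delta_k - \delta_\ell
$$

and

$$
\log(\mu_2) = \rho_k + \rho_\ell - \delta_i - \delta_j \, .
$$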

Scoring is still independent in this model, but now there is an interaction between the players on each team that affects the score. Players can also be ranked according to their affinity-coefficient estimates.
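To show how the two sets of affinities interact, here is a small sketch that computes the expected scores of both teams. It assumes the log-linear form in which a team's log mean score is the sum of its own offense affinities minus the sum of the opposing team's defense affinities. The function name and the parameter values are hypothetical, not fitted estimates.

```python
import math

def expected_scores(team1, team2, rho, delta):
    """Expected points (mu1, mu2) under the offense/defense model.

    Assumed form: log(mu1) = sum of rho over team1 - sum of delta over team2,
    and symmetrically for mu2.
    """
    mu1 = math.exp(sum(rho[p] for p in team1) - sum(delta[p] for p in team2))
    mu2 = math.exp(sum(rho[p] for p in team2) - sum(delta[p] for p in team1))
    return mu1, mu2

# Hypothetical affinities: A is a strong scorer, D is a strong defender.
rho = {"A": 0.5, "B": 0.0, "C": 0.0, "D": 0.0}
delta = {"A": 0.0, "B": 0.0, "C": 0.0, "D": 0.3}
print(expected_scores(("A", "B"), ("C", "D"), rho, delta))
```

Raising a player's $\rho$ only increases her own team's expected score, while raising her $\delta$ only decreases the opponents' expected score, which is what lets the two kinds of contribution be ranked separately.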

Model 2 (and its variants) allows for prediction of a final score as well.

Extensions: One useful way to extend both models is to incorporate an ordering where the positive indicators correspond to the “home” team and the negative indicators to the “away” team. Adding in an intercept term to the models can then be interpreted as a “home-field advantage”. Other extensions might include incorporating the chance of ties in Model 1 (it’s actually already a possibility in Model 2).

Side note: At least one of the computerized polls (Peter Wolfe’s) used for the Bowl Championship Series in American college football uses the (standard) Bradley–Terry model to produce its rankings.