Sunday, February 1, 2026
6:30 PM ET
Raymond James Stadium (ESPN)
Point of View
Bayesian xGF% Analysis
xGF% - Expected Goal Share
Actual GF% - Laplace-Smoothed Actual Goal Share
Adjusted xGF% - Updated Expected Goal Share
xWin% - Expected Winning Probability
Chart legend: xGF% (Prior), Actual GF% (Likelihood), Adjusted xGF% (Posterior), xWin% region

Why Mix Expected & Actual Results?

Expected Goals (xG) is not a prediction of game outcomes or goal totals. Instead, xG estimates the probability that a shot becomes a goal based on the information available at the moment the shot is taken. It is intentionally blind to whether the shot was converted, to allow us to grade the process. By evaluating every shot attempt, xG draws information from all ~60 shots taken during a typical game, providing a more stable and complete measure of performance. The final score, by comparison, reflects only the handful of shots that actually become goals, making goal-based metrics far more susceptible to randomness and game-to-game volatility.

However, the final score is not without value. While models such as xG are designed to measure the quality of a team's performance largely independent of luck, they cannot capture every aspect of the game. Finishing talent, goaltending performance, and other factors may influence outcomes in ways that are only partially reflected in the model.

A Bayesian approach allows us to incorporate information from both the process and the results. Rather than treating xG or the final score as the sole source of truth, we combine them to produce an updated estimate of a team’s underlying performance. The result is a more balanced assessment than either measure can provide on its own. From this updated understanding, we can derive an Expected Win Percentage (xWin%). Specifically, we calculate the probability that a team’s posterior expected goal share (Adjusted xGF%) exceeds 50%, indicating that their underlying performance favored outscoring their opponent. In other words, we can answer the question: "With all of the information available to us, what is the probability that this team deserved to win this game?"


Step 1 - Scale Parameter: W

W is the scaling factor that determines how much statistical weight to give our data. W is set equal to the total number of xG-modeled shots taken by both teams in that game. Shots are used because they represent the sample size of the xG model itself: the more shots taken, the more observations inform our xGF% estimate, and the more confident we can be in it. W is applied in full to the prior (xGF%) and cut in half for the likelihood (actual GF%). This reflects my belief that xGF% is the more descriptive metric, so I am more confident that it captures the true merit of play than the final score does.

Visually, a larger W results in narrower curves.

\[ W = \text{total xG-modeled shots (both teams)} \qquad W_{\text{like}} = \frac{W}{2} \]
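
As a rough illustration of how W controls curve width, the sketch below computes W and W_like for a few shot totals and the standard deviation of a Beta curve centered at 50%. The function names and the shot totals are hypothetical, chosen only for this example; the standard deviation uses the standard Beta variance formula with alpha + beta equal to the concentration.

```python
import math


def scale_parameters(total_shots: int) -> tuple[float, float]:
    """Return (W, W_like): full weight for the prior, half weight for the likelihood."""
    w = float(total_shots)
    return w, w / 2


def beta_std(mean: float, concentration: float) -> float:
    """Standard deviation of a Beta distribution with alpha + beta = concentration."""
    return math.sqrt(mean * (1 - mean) / (concentration + 1))


# Hypothetical shot totals: more shots -> larger W -> narrower curve.
for shots in (40, 60, 80):
    w, w_like = scale_parameters(shots)
    print(f"shots={shots}  W={w:.0f}  W_like={w_like:.0f}  "
          f"std at 50%={beta_std(0.5, w):.4f}")
```

The printed standard deviation shrinks as the shot total grows, which is the narrowing described above.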
Step 2 - Prior: xGF% (Light Blue Curve)

The prior is modeled as a Beta distribution centered at the team's Expected Goals share (xGF%), concentrated by W. It represents our belief about the team's true performance level before observing the final score, based on the xG model output, with confidence adjusted by the number of shots fed into that model.

Visually, a higher xGF% shifts the curve right; more shots make it narrower.

\[ \alpha_{\text{prior}} = W \times xGF\% \qquad \beta_{\text{prior}} = W \times (1 - xGF\%) \]
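
A minimal sketch of the prior parameters, assuming xGF% is passed as a fraction between 0 and 1. The function name `prior_params` and the game values (60 modeled shots, 55% expected goal share) are hypothetical and used only for illustration.

```python
def prior_params(xgf_share: float, w: float) -> tuple[float, float]:
    """Beta prior centered at xGF% (a fraction in (0, 1)) with concentration W."""
    if not 0.0 < xgf_share < 1.0:
        raise ValueError("xgf_share must be strictly between 0 and 1")
    if w <= 0:
        raise ValueError("w must be positive")
    return w * xgf_share, w * (1.0 - xgf_share)


# Hypothetical game: 60 modeled shots, 55% expected goal share.
alpha_prior, beta_prior = prior_params(0.55, 60)

# The mean of a Beta(alpha, beta) distribution is alpha / (alpha + beta),
# so the prior is centered at the xGF% that was passed in.
prior_mean = alpha_prior / (alpha_prior + beta_prior)
print(alpha_prior, beta_prior, prior_mean)
```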
Step 3 - Likelihood: Actual GF% (Dark Blue Curve)

The likelihood is modeled as a Beta distribution centered at the team's actual goal share (GF%), concentrated by \(W_{\text{like}}\). In Bayesian terms, the likelihood is the new evidence used to update our prior belief. In this case, we're using the final result. Like our xG model, we exclude empty net and shootout goals from this goal share because neither is statistically representative of the underlying performance we are trying to measure.

A small Laplace correction (adding 0.5 goals to each team's total) is applied to the goal share to prevent degenerate distributions in shutout games. Without it, a team that scores zero goals would produce a Beta distribution with α = 0, which is undefined because both Beta parameters must be strictly positive. The correction nudges the estimated goal share slightly toward 50%, ensuring the distribution remains well-behaved while preserving the direction and spirit of the actual result.

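Visually, a higher actual goal share shifts the curve to the right. Because \(W_{\text{like}} = W/2\), the likelihood curve is wider than a prior centered at the same goal share, reflecting our lower confidence in the final score as a measure of true performance. Width also depends on where the curve is centered, so a lopsided result can still produce a fairly narrow likelihood.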

\[ \text{actualGF}\% = \frac{GF + 0.5}{GF + GA + 1} \]
\[ \alpha_{\text{like}} = W_{\text{like}} \times \text{actualGF}\% \qquad \beta_{\text{like}} = W_{\text{like}} \times (1 - \text{actualGF}\%) \]
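
The smoothing and the likelihood parameters can be sketched as follows. The function names and the three score lines are hypothetical; the goal counts are assumed to already exclude empty net and shootout goals, as described above.

```python
def actual_goal_share(gf: int, ga: int) -> float:
    """Smoothed goal share: add 0.5 goals to each side before taking the ratio."""
    return (gf + 0.5) / (gf + ga + 1)


def likelihood_params(gf: int, ga: int, w: float) -> tuple[float, float]:
    """Beta parameters centered at the smoothed goal share, concentrated by W / 2."""
    w_like = w / 2
    share = actual_goal_share(gf, ga)
    return w_like * share, w_like * (1 - share)


# Hypothetical score lines with W = 60: a 3-2 win, a 0-3 shutout loss,
# and a 0-0 game once excluded goals are removed.
for gf, ga in ((3, 2), (0, 3), (0, 0)):
    alpha_like, beta_like = likelihood_params(gf, ga, 60)
    print(gf, ga, actual_goal_share(gf, ga), alpha_like, beta_like)
```

In the shutout case the smoothed share is 0.5 / 4 = 0.125 rather than 0, so both parameters stay strictly positive.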
Step 4 - Posterior: Adjusted xGF% (Gold Curve)

Starting from our prior (xGF%) and updating it with new information (actual GF%) yields the posterior: our updated understanding of the game after seeing the result. This updated estimate is labeled the Adjusted Expected Goals For Percentage (Adjusted xGF%), a more informed evaluation of merit than either metric on its own.

Visually, the posterior is typically narrower and taller than either the prior or the likelihood alone, reflecting the increased certainty that comes from combining two sources of information into one updated belief. Because xGF% receives twice the statistical weight of actualGF%, the posterior mean always sits twice as close to the process (xGF%) as to the result (actualGF%).

\[ \alpha_{\text{post}} = \alpha_{\text{prior}} + \alpha_{\text{like}} = (W \times xGF\%) + (W_{\text{like}} \times \text{actualGF}\%) \]
\[ \beta_{\text{post}} = \beta_{\text{prior}} + \beta_{\text{like}} \]
\[ \text{Adjusted xGF}\% = \frac{\alpha_{\text{post}}}{\alpha_{\text{post}} + \beta_{\text{post}}} \]
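
A short sketch of the posterior update, assuming the same inputs as the earlier steps (W, xGF% as a fraction, and goals for and against with empty net and shootout goals excluded). The function names and the game values are hypothetical.

```python
def posterior_params(w: float, xgf_share: float, gf: int, ga: int) -> tuple[float, float]:
    """Add the prior and likelihood Beta parameters to get the posterior parameters."""
    w_like = w / 2
    actual_share = (gf + 0.5) / (gf + ga + 1)
    alpha_post = w * xgf_share + w_like * actual_share
    beta_post = w * (1 - xgf_share) + w_like * (1 - actual_share)
    return alpha_post, beta_post


def adjusted_xgf(w: float, xgf_share: float, gf: int, ga: int) -> float:
    """Posterior mean: alpha_post / (alpha_post + beta_post)."""
    alpha_post, beta_post = posterior_params(w, xgf_share, gf, ga)
    return alpha_post / (alpha_post + beta_post)


# Hypothetical game: 60 modeled shots, 55% xGF%, 3-2 final score.
alpha_post, beta_post = posterior_params(60, 0.55, 3, 2)
adjusted = adjusted_xgf(60, 0.55, 3, 2)

# With W_like = W / 2, W cancels out of the posterior mean, leaving a
# 2:1 weighted average of xGF% and the smoothed actual goal share.
weighted_average = (2 * 0.55 + (3 + 0.5) / (3 + 2 + 1)) / 3
print(alpha_post, beta_post, adjusted, weighted_average)
```

Because the weights are fixed at 2:1, W affects how narrow the posterior is but not where it is centered.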
Step 5 - xWin% (Shaded Region)

Expected Win Percentage (xWin%) is the area under the posterior curve to the right of 50%. Statistically, it represents the probability, under the posterior distribution, that the team’s true underlying goal share exceeded 50%, meaning their underlying performance favored outscoring their opponent.

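\[ \text{xWin}\% = \bigl(1 - I_{0.5}(\alpha_{\text{post}},\; \beta_{\text{post}})\bigr) \times 100 \]

Here \(I_{x}(a, b)\) is the regularized incomplete beta function, i.e. the cumulative distribution function of a Beta(a, b) distribution evaluated at x.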

In plain terms, it represents our updated probability of the team outscoring their opponent. It answers the question: "With all of the information available to us, what is the probability that this team deserved to win this game?"
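
The area to the right of 50% can be approximated with nothing but the standard library. The sketch below integrates the Beta density over [0.5, 1] with Simpson's rule; this is a numerical stand-in for the regularized incomplete beta function, not necessarily how the site computes it, and a library routine such as SciPy's `scipy.special.betainc` would normally be preferable. The function names and example parameters are hypothetical, and the code assumes both posterior parameters are greater than 1, so the density is zero at the endpoints.

```python
import math


def beta_pdf(x: float, a: float, b: float) -> float:
    """Beta(a, b) density at x. Assumes a > 1 and b > 1 (density is 0 at 0 and 1)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def x_win_pct(alpha_post: float, beta_post: float, steps: int = 2000) -> float:
    """Percent of posterior mass above 0.5, via Simpson's rule on [0.5, 1]."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = 0.5 / steps
    total = beta_pdf(0.5, alpha_post, beta_post) + beta_pdf(1.0, alpha_post, beta_post)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * beta_pdf(0.5 + i * h, alpha_post, beta_post)
    return 100 * total * h / 3


# Hypothetical posterior, e.g. a game with 60 shots, 55% xGF%, and a 3-2 score.
print(f"xWin% = {x_win_pct(50.5, 39.5):.1f}")
```

A symmetric posterior (equal parameters) returns 50%, and swapping the two parameters gives the opponent's xWin%, so the two values sum to 100%.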