This project analyzes a real-world, ongoing dataset of padel match results to move beyond simple win/loss records and develop a robust player performance evaluation. The analysis is structured as a formal case study, demonstrating a complete data science workflow from data wrangling to statistical inference and interpretation.
The fundamental assumption of this analysis is that match, set, game, and tiebreak outcomes can be modeled as a Binomial process. Each event is treated as an independent Bernoulli trial where a player either wins or does not. This allows us to estimate a player’s underlying, unobserved skill parameter (\(p\)), representing their true probability of winning any given event.
To provide a comprehensive picture, this report employs two major statistical paradigms:

1. Frequentist Analysis: to provide objective, long-run probabilities and answer “yes/no” questions about statistical significance.
2. Bayesian Analysis: to quantify our certainty and provide a more intuitive understanding of player skill, especially given the limited data.
The first step is to estimate each player’s skill from the observed data. We calculate the Maximum Likelihood Estimate (MLE) of the true win probability (\(p\)) for each player at all four levels. This is the observed proportion of wins, and the results are summarized in the leaderboard below, ranked by Game Win Percentage.
Metric Definitions:

* All “Played” counts include every valid match, including those that ended in a draw.
* `Match_Win_Pct` is calculated using a points system in which a win is worth 1 point and a draw 0.5 points.
* All other win percentages are the simple proportion of wins to total events played.
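The points system can be expressed as a minimal sketch (Python for illustration; the helper name is hypothetical):

```python
# Sketch of the points-based match metric: win = 1 point, draw = 0.5, loss = 0.
def match_win_pct(wins, draws, played):
    """Points earned as a percentage of matches played (None if none played)."""
    if played == 0:
        return None
    return round(100 * (wins + 0.5 * draws) / played, 1)

# Kim: 12 decisive wins and 1 draw in 19 matches (counts inferred from the
# match-level test table) reproduce the leaderboard's 65.8.
print(match_win_pct(12, 1, 19))  # 65.8
```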
Player | Matches_Played | Match_Win_Pct | Sets_Played | Set_Win_Pct | Games_Played | Game_Win_Pct | Tiebreaks_Played | Tiebreak_Win_Pct |
---|---|---|---|---|---|---|---|---|
Antun veli | 1 | 100.0 | 2 | 100.0 | 20 | 60.0 | 0 | NA |
Kim | 19 | 65.8 | 43 | 62.8 | 458 | 55.9 | 5 | 100.0 |
Anttu | 7 | 78.6 | 17 | 70.6 | 176 | 54.5 | 3 | 33.3 |
Tomi | 5 | 50.0 | 13 | 53.8 | 127 | 53.5 | 2 | 0.0 |
Jone | 19 | 42.1 | 41 | 41.5 | 444 | 52.5 | 5 | 80.0 |
Jussi | 2 | 100.0 | 6 | 66.7 | 58 | 51.7 | 1 | 0.0 |
Hastis | 3 | 50.0 | 2 | 50.0 | 28 | 50.0 | 0 | NA |
Jon | 15 | 50.0 | 38 | 50.0 | 394 | 45.4 | 4 | 50.0 |
Antti | 17 | 20.6 | 38 | 28.9 | 407 | 41.3 | 4 | 0.0 |
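For every other column in the leaderboard, the MLE is simply wins divided by events played. A short Python sketch (the helper name is hypothetical):

```python
# The MLE of a Binomial success probability is the observed proportion of wins.
def mle_win_pct(wins, played):
    """Observed win percentage (the MLE of p), or None when nothing was played."""
    if played == 0:
        return None  # rendered as NA in the leaderboard
    return round(100 * wins / played, 1)

# Kim's game-level entry: 256 wins in 458 games (counts from the test tables).
print(mle_win_pct(256, 458))  # 55.9
```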
Next, we use hypothesis testing to determine whether each player’s win proportion is significantly greater than 50%, applying a one-sided test at each of the four levels (game, set, match, and tiebreak, in the tables below). For these tests we consider only decisive outcomes (wins and losses), excluding draws.
Game level:

Player | Wins | Decisive Played (n) | P-Value | 95% CI Lower | 95% CI Upper | Significant |
---|---|---|---|---|---|---|
Kim | 256 | 458 | 0.006 | 0.521 | 1 | Yes |
Anttu | 96 | 176 | 0.114 | 0.483 | 1 | No |
Jone | 233 | 444 | 0.148 | 0.486 | 1 | No |
Antun veli | 12 | 20 | 0.186 | 0.419 | 1 | No |
Tomi | 68 | 127 | 0.212 | 0.463 | 1 | No |
Jussi | 30 | 58 | 0.396 | 0.411 | 1 | No |
Hastis | 14 | 28 | 0.500 | 0.352 | 1 | No |
Jon | 179 | 394 | 0.965 | 0.414 | 1 | No |
Antti | 168 | 407 | 1.000 | 0.373 | 1 | No |
Set level:

Player | Wins | Decisive Played (n) | P-Value | 95% CI Lower | 95% CI Upper | Significant |
---|---|---|---|---|---|---|
Anttu | 11 | 15 | 0.035 | 0.521 | 1 | Yes |
Kim | 27 | 43 | 0.047 | 0.502 | 1 | Yes |
Antun veli | 2 | 2 | 0.079 | 0.425 | 1 | No |
Jussi | 4 | 6 | 0.207 | 0.347 | 1 | No |
Tomi | 6 | 11 | 0.382 | 0.315 | 1 | No |
Jon | 18 | 36 | 0.500 | 0.368 | 1 | No |
Hastis | 1 | 2 | 0.500 | 0.121 | 1 | No |
Jone | 16 | 39 | 0.869 | 0.291 | 1 | No |
Antti | 11 | 38 | 0.995 | 0.186 | 1 | No |
Match level:

Player | Wins | Decisive Played (n) | P-Value | 95% CI Lower | 95% CI Upper | Significant |
---|---|---|---|---|---|---|
Anttu | 5 | 6 | 0.051 | 0.498 | 1 | No |
Kim | 12 | 18 | 0.079 | 0.473 | 1 | No |
Jussi | 2 | 2 | 0.079 | 0.425 | 1 | No |
Antun veli | 1 | 1 | 0.159 | 0.270 | 1 | No |
Jon | 7 | 14 | 0.500 | 0.299 | 1 | No |
Hastis | 1 | 2 | 0.500 | 0.121 | 1 | No |
Tomi | 2 | 4 | 0.500 | 0.182 | 1 | No |
Jone | 7 | 17 | 0.767 | 0.241 | 1 | No |
Antti | 3 | 16 | 0.994 | 0.078 | 1 | No |
Tiebreak level:

Player | Wins | Decisive Played (n) | P-Value | 95% CI Lower | 95% CI Upper | Significant |
---|---|---|---|---|---|---|
Kim | 5 | 5 | 0.013 | 0.649 | 1 | Yes |
Jone | 4 | 5 | 0.090 | 0.435 | 1 | No |
Jon | 2 | 4 | 0.500 | 0.182 | 1 | No |
Anttu | 1 | 3 | 0.718 | 0.078 | 1 | No |
Jussi | 0 | 1 | 0.841 | 0.000 | 1 | No |
Tomi | 0 | 2 | 0.921 | 0.000 | 1 | No |
Antti | 0 | 4 | 0.977 | 0.000 | 1 | No |
Hastis | 0 | 0 | NA | NA | NA | NA |
Antun veli | 0 | 0 | NA | NA | NA | NA |
A key observation from these tables is that very few results are statistically significant—only a handful of tests showed a p-value below our 0.05 threshold. This is not surprising and highlights a central theme of this analysis: detecting a small winning edge requires a substantial amount of evidence. As our subsequent power analysis will confirm, many of our tests were not sensitive enough to confidently distinguish a real, small skill difference from random chance.
A hypothesis test can fail to find a significant result simply because it lacks statistical power. This analysis evaluates the sensitivity of our tests. We define a “meaningfully skilled player” as someone with a true win rate of 55%. The table below shows the probability (power) of our test correctly identifying such a player, given our current sample sizes.
Player | N (Match) | Power | N (Set) | Power | N (Game) | Power | N (Tiebreak) | Power |
---|---|---|---|---|---|---|---|---|
Kim | 18 | 0.11 | 43 | 0.16 | 458 | 0.69 | 5 | 0.08 |
Jone | 17 | 0.11 | 39 | 0.15 | 444 | 0.68 | 5 | 0.08 |
Antti | 16 | 0.11 | 38 | 0.15 | 407 | 0.65 | 4 | 0.07 |
Jon | 14 | 0.10 | 36 | 0.15 | 394 | 0.63 | 4 | 0.07 |
Anttu | 6 | 0.08 | 15 | 0.10 | 176 | 0.38 | 3 | 0.07 |
Tomi | 4 | 0.07 | 11 | 0.09 | 127 | 0.30 | 2 | 0.07 |
Jussi | 2 | 0.07 | 6 | 0.08 | 58 | 0.19 | 1 | 0.06 |
Hastis | 2 | 0.07 | 2 | 0.07 | 28 | 0.13 | 0 | NA |
Antun veli | 1 | 0.06 | 2 | 0.07 | 20 | 0.12 | 0 | NA |
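The power figures above were presumably produced with standard software (e.g. R’s pwr package); a plain normal-approximation sketch in Python reproduces the game-level values to within rounding:

```python
from math import sqrt, erf

def normal_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

def power_one_sided(n, p1=0.55, p0=0.5):
    """Approximate power of a one-sided one-proportion z-test at alpha = 0.05."""
    z_alpha = 1.6449                                   # 95th percentile of N(0, 1)
    crit = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)      # smallest p_hat that rejects H0
    se1 = sqrt(p1 * (1 - p1) / n)                      # standard error under H1
    return normal_cdf((p1 - crit) / se1)

print(round(power_one_sided(458), 2))  # 0.69, matching Kim's game-level power
```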
The results confirm that our analysis is most reliable at the game level, where the larger sample sizes yield substantially higher power.
The following plots visualize the relationship between statistical power, effect size, and sample size.
The curve above shows that at a sample size close to our current largest (n = 450), the test becomes highly sensitive (power > 80%) as a player’s true win rate approaches 58%.
This second curve simulates a future scenario with more data (n=600), showing that the test would become powerful enough to reliably detect even smaller winning edges (~56%).
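Under the same normal approximation, we can also invert the question and ask how many games are needed for 80% power (a sketch, not necessarily the report’s exact calculation): roughly 617 games at a 55% true win rate, and far fewer for larger edges, consistent with the n = 600 scenario above.

```python
from math import sqrt, ceil

def n_required(p1, p0=0.5):
    """Normal-approximation sample size for 80% power in a one-sided
    one-proportion z-test at alpha = 0.05."""
    z_alpha, z_beta = 1.6449, 0.8416  # one-sided 5% and 80%-power normal quantiles
    num = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))
    return ceil((num / (p1 - p0)) ** 2)

print(n_required(0.55))  # 617
print(n_required(0.56))  # 428 -- a 56% edge is detectable well before n = 600
```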
As a complementary approach, we use Bayesian inference, which is well suited to quantifying our certainty given the limited data. We use a Beta-Binomial model with a weakly informative prior (`Beta(2, 2)`), which assumes a player is likely about average before we see any of their results.

The table below shows the direct probability that each player’s true skill is greater than 50% (`P(p > 0.5)`). This provides a more intuitive measure of evidence than a p-value.
Player | Match | Set | Game | Tiebreak |
---|---|---|---|---|
Kim | 0.857 | 0.948 | 0.994 | 0.965 |
Anttu | 0.828 | 0.942 | 0.884 | 0.344 |
Jone | 0.143 | 0.146 | 0.851 | 0.855 |
Antun veli | 0.688 | 0.812 | 0.798 | 0.500 |
Tomi | 0.363 | 0.598 | 0.785 | 0.187 |
Jussi | 0.812 | 0.746 | 0.601 | 0.313 |
Hastis | 0.344 | 0.500 | 0.500 | 0.500 |
Jon | 0.407 | 0.500 | 0.035 | 0.500 |
Antti | 0.006 | 0.006 | 0.000 | 0.063 |
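Exactly, `P(p > 0.5)` is one minus the posterior Beta CDF at 0.5 (e.g. `pbeta` in R); the stdlib-only Monte Carlo sketch below (hypothetical helper) approximates the same quantity:

```python
import random

def prob_better_than_even(wins, losses, a=2, b=2, draws=100_000, seed=1):
    """Monte Carlo estimate of P(p > 0.5) under the Beta(a + wins, b + losses)
    posterior, i.e. a Beta(2, 2) prior updated with the decisive results."""
    rng = random.Random(seed)
    hits = sum(rng.betavariate(a + wins, b + losses) > 0.5 for _ in range(draws))
    return hits / draws

# Kim's game-level record (256 wins, 202 losses) gives roughly 0.994,
# matching the table above.
print(round(prob_better_than_even(256, 202), 3))
```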
Finally, we visualize the full posterior distributions. Wider curves indicate more uncertainty, while narrower curves indicate more certainty. These plots provide the most complete picture of our findings.
This report serves as a mid-term summary of the project, with the central theme being the critical role of sample size in statistical certainty. This explains our key frequentist finding: despite several players having winning records, very few of these results were found to be statistically significant.
The dual-analysis approach provided a comprehensive picture. While the frequentist tests gave us objective “yes/no” answers on significance, the Bayesian analysis offered a more nuanced view of uncertainty, and the posterior plots provided the clearest visualization of our conclusions.
This pattern holds across all levels of analysis, confirming that the most reliable insights are derived from the game-level data where our sample size is largest.
A Note on the Test Statistic
The frequentist analysis used a one-proportion z-test. The z-statistic is an intuitive measure of evidence: it counts how many standard errors our observed result (e.g., a 56% win rate) is away from the null hypothesis (a 50% win rate). A large z-score indicates that the result is far enough from 50% that it is unlikely to be due to random chance, leading to a significant p-value.
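A Python sketch of that test, including the one-sided 95% lower confidence bound (which is why the tables’ upper bound is fixed at 1); it reproduces, for example, Kim’s game-level row:

```python
from math import sqrt, erf

def normal_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

def one_prop_ztest(wins, n, p0=0.5):
    """One-sided one-proportion z-test of H1: p > p0, plus the one-sided
    95% lower confidence bound on p."""
    p_hat = wins / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standard error under the null
    p_value = 1 - normal_cdf(z)
    ci_lower = p_hat - 1.6449 * sqrt(p_hat * (1 - p_hat) / n)
    return z, p_value, ci_lower

z, p, lo = one_prop_ztest(256, 458)  # Kim's game-level row
print(round(z, 2), round(p, 3), round(lo, 3))  # 2.52 0.006 0.521
```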
This report constitutes the complete exploratory and inferential analysis phase of the project. The final phase will focus on predictive modeling and productization by developing an R Shiny web application.
The planned features for the application include: