Determinants of Scoring in the BMW PGA Championship
This analysis is based on scores and stats from individual rounds in the last ten BMW PGA Championships: 4,182 rounds in total.
Section 1: Absolute Correlation Coefficients with Score
Key Points:
- SGP consistently shows the highest correlation with score, emphasising the importance of putting.
- SGTee's correlation with score varies from year to year but is high in certain years, such as 2023.
- SGApp shows a moderate correlation, indicating a smaller but still significant role than putting.
Key Points:
- Greens In Regulation consistently shows a high correlation with score and is particularly significant in the BMW PGA Championship.
- Driving Accuracy fluctuates but plays a pivotal role in certain years.
- Scrambling has a moderate correlation, reflecting the importance of recovery shots.
Key Points:
- Par 4 scoring shows the highest correlation with score, indicating its importance in overall performance.
- Par 3 scoring has a moderate correlation that varies over the years.
- Par 5 scoring shows a lower correlation, signifying fewer scoring opportunities compared to Par 4 holes.
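As a rough illustration of how the absolute correlations in this section could be computed, the sketch below groups rounds by year and takes the absolute value of the Pearson correlation between each metric and score. The field names (`year`, `score`) and the sample rounds are made up for illustration and are not taken from the analysis. The choice of Pearson correlation is also an assumption, since the text does not name the coefficient used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def abs_correlations_by_year(rounds, metrics, target="score"):
    """Return {year: {metric: |r|}} for each metric against the target."""
    by_year = {}
    for row in rounds:
        by_year.setdefault(row["year"], []).append(row)

    result = {}
    for year, rows in by_year.items():
        scores = [r[target] for r in rows]
        result[year] = {
            m: abs(pearson([r[m] for r in rows], scores)) for m in metrics
        }
    return result


# Made-up illustrative rounds, not real tournament data.
rounds = [
    {"year": 2023, "score": 68, "SGP": 2.1, "SGTee": 0.5},
    {"year": 2023, "score": 72, "SGP": -0.4, "SGTee": 0.1},
    {"year": 2023, "score": 75, "SGP": -1.8, "SGTee": -0.9},
    {"year": 2023, "score": 70, "SGP": 0.9, "SGTee": -0.2},
]

table = abs_correlations_by_year(rounds, ["SGP", "SGTee"])
for year, values in table.items():
    print(year, {metric: round(value, 3) for metric, value in values.items()})
```

Taking the absolute value puts metrics where a higher value is better (which tend to correlate negatively with score) on the same scale as metrics where a lower value is better.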
Section 2: Importance of Each Metric in Determining Score
Random Forest Regressor and Feature Importance
A Random Forest Regressor is an ensemble learning method that constructs multiple decision trees during training and outputs the average of their predictions. Combining the predictions of several trees in this way improves accuracy and robustness.
Feature importance is a technique used to interpret a machine learning model. It refers to the score that quantifies the contribution of each feature to the prediction made by the model.
In a Random Forest, the importance of a feature is computed by looking at how much the feature decreases the impurity (e.g., variance for regression tasks) across all the trees in the forest. The more a feature decreases the impurity, the more important it is considered.
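To make the idea of an impurity decrease concrete, here is a minimal sketch of the variance reduction achieved by a single split on one feature. The function names, the threshold and the sample values are hypothetical. In an actual forest, decreases like this one are accumulated over every split that uses the feature and over all trees, typically weighted by the share of samples reaching each node.

```python
def variance(values):
    """Population variance of a list of numbers (0.0 for an empty list)."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def impurity_decrease(feature, target, threshold):
    """Variance of the target before a split minus the weighted variance after it."""
    left = [t for f, t in zip(feature, target) if f <= threshold]
    right = [t for f, t in zip(feature, target) if f > threshold]
    n = len(target)
    weighted_children = (
        (len(left) / n) * variance(left) + (len(right) / n) * variance(right)
    )
    return variance(target) - weighted_children


# Made-up values: strokes gained putting and the round score.
sgp = [-1.5, -0.5, 0.5, 1.5]
scores = [74, 72, 70, 68]

decrease = impurity_decrease(sgp, scores, threshold=0.0)
print(decrease)
```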
The calculated importance scores for all features are then normalised so that they sum to 100%, giving each feature's relative importance as a percentage. This shows the relative contribution of each feature to the prediction task.
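The normalisation step can be sketched as follows. The raw importance values are invented for illustration and `to_percentages` is a hypothetical helper, not part of any library.

```python
def to_percentages(raw_importances):
    """Scale raw importance scores so that they sum to 100."""
    total = sum(raw_importances.values())
    if total == 0:
        raise ValueError("importance scores sum to zero")
    return {name: 100.0 * value / total for name, value in raw_importances.items()}


# Made-up raw impurity decreases per feature.
raw = {"SGP": 5.0, "SGApp": 3.0, "SGTee": 2.0}

relative = to_percentages(raw)
for name, pct in sorted(relative.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {pct:.1f}%")
```

Because the percentages always sum to 100, a feature's share depends on which other features are in the model, so values are only comparable within the same model.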
Interpreting Feature Importance
Features with high relative importance percentages have a strong impact on the model's predictions. They are crucial for accurate predictions and indicate key areas where performance matters most.
Features with low relative importance have a minimal impact on the model's predictions. While they can still contribute, they are less critical.
Key Points:
- SGTee shows higher-than-average importance, emphasising the critical role of driving off the tee in the BMW PGA Championship.
- SGApp has a lower importance compared to the DP World Tour average, suggesting approach shots were less crucial in this event.
- SGP is more influential than the tour average, highlighting the importance of putting on challenging greens.
Key Points:
- Greens In Regulation (GIR) continues to be a dominant factor, aligning with its importance on the DP World Tour.
- PPGIR remains highly relevant but slightly lower than the tour average, indicating that putting efficiency is essential but not overly dominant.
- Driving Accuracy has a more significant impact than average, reflecting the importance of hitting tight fairways in this championship.
Key Points:
- Par 4 holes are the most critical, consistent with the DP World Tour average, emphasising their role in scoring success.
- Par 3 metrics are slightly less important, though they still influence overall performance.
- Par 5 holes offer moderate importance, reflecting the scoring opportunities they provide in the BMW PGA Championship.
Top 5 Ranked Players - 2024 BMW PGA Championship
The table below shows the top-5 ranked players across the three different Random Forest models above.
| Rank | Surname | First name | Average Predicted Score |
| --- | --- | --- | --- |
| 1 | Wallace | Matt | 69.48 |
| 2 | McIlroy | Rory | 69.51 |
| 3 | Lowry | Shane | 69.54 |
| 4 | Scott | Adam | 69.64 |
| 5 | Hojgaard | Rasmus | 69.91 |
Rankings and estimated scores for all players can be found here.
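A ranking of this kind could be produced by averaging each player's predicted score across the three models and sorting in ascending order, since a lower score is better. The sketch below assumes a simple unweighted mean. The player names and predicted values are placeholders, not output from the models described above.

```python
def rank_by_average(predictions, top_n=5):
    """Rank players by the mean of their predicted scores (lowest first)."""
    averages = {
        player: sum(scores) / len(scores) for player, scores in predictions.items()
    }
    ranked = sorted(averages.items(), key=lambda item: item[1])
    return [
        (rank, player, round(avg, 2))
        for rank, (player, avg) in enumerate(ranked[:top_n], start=1)
    ]


# Placeholder predictions from three hypothetical models per player.
predictions = {
    "Player A": [69.4, 69.6, 69.5],
    "Player B": [70.1, 69.9, 70.3],
    "Player C": [69.8, 69.7, 69.9],
}

top = rank_by_average(predictions)
for rank, player, avg in top:
    print(rank, player, avg)
```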