Sunday, June 17, 2012

It's like a Salman Rushdie novel

Someone over on Big Soccer finally started a thread asking whether the East is actually becoming a better conference than the West, defying the conventional wisdom that the West is clearly the better conference. Personally, I've never bought into any of that. Unless one conference is routinely dominating the other, one of them will always look a little better, just because that's how stats work.

Because I'm lazy, I'll just repost here exactly what I posted there.

I don't know that looking at the points from each match played is the best way to analyze the two conferences, but it's easy enough that I'm willing to do it before noon on a Sunday. The West has earned slightly more points per team and per game than the East. A KS test between the two sets of individual match results (i.e. a set of 0's, 1's and 3's) comes back with a P value of 1.000. Strictly speaking, that isn't a 0.0% chance that the conferences differ; it means a gap this small between the two distributions is exactly what you'd expect to see if there were no real difference at all.

Running a KS test on each team's cumulative point total (i.e. 30, 28, 26, 19, etc.) comes back basically the same: a P value of 0.975. Roughly, a gap this size between the two sets of totals would show up about 97.5% of the time even if the conferences were equally good, so there's no evidence here that the West is statistically better than the East.
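For anyone who wants to check these two comparisons without the web tool, here's a rough sketch in plain Python. A few assumptions: the match counts (55 losses, 24 draws, 53 wins for the East; 49, 26, 51 for the West) are my own tally of the item lists in the output further down, the P value uses a standard large-sample approximation that may not be exactly what the tool does, and the function and variable names are just mine.

```python
import math
from bisect import bisect_right


def ks_two_sample(a, b, terms=100):
    """Return (D, P) for a two-sample Kolmogorov-Smirnov comparison."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # D: largest gap between the two empirical CDFs, checked at every
    # value that occurs in either sample.
    d = max(abs(bisect_right(a, x) / n - bisect_right(b, x) / m)
            for x in set(a) | set(b))
    # Assumed P value formula: asymptotic Kolmogorov series with a
    # small-sample correction. The original tool may do something else.
    root = math.sqrt(n * m / (n + m))
    lam = (root + 0.12 + 0.11 / root) * d
    if lam < 0.2:
        return d, 1.0  # the series is indistinguishable from 1 down here
    p = 2 * sum((-1) ** (j - 1) * math.exp(-2 * (j * lam) ** 2)
                for j in range(1, terms + 1))
    return d, min(max(p, 0.0), 1.0)


# Per-match results, rebuilt from my tally of the item lists (assumption).
east_matches = [0] * 55 + [1] * 24 + [3] * 53
west_matches = [0] * 49 + [1] * 26 + [3] * 51

# Team point totals exactly as they appear in the tool output,
# including the 19.9 entry in the West list.
east_totals = [3, 8, 15, 17, 18, 19, 19, 26, 28, 30]
west_totals = [11, 13, 13, 15, 19, 19.9, 24, 25, 27, 32]

d_matches, p_matches = ks_two_sample(east_matches, west_matches)
d_totals, p_totals = ks_two_sample(east_totals, west_totals)
print(f"match results: D = {d_matches:.4f}, P = {p_matches:.3f}")
print(f"team totals:   D = {d_totals:.4f}, P = {p_totals:.3f}")
```

Run as is, this should land on the same D values (0.0278 and 0.2000) and P values that round to 1.000 and 0.975. With only three possible values per match, the KS test is a pretty blunt instrument for the first data set, so I wouldn't read much more into it than "no visible difference".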

Here are some numbers, if that's your thing.

East Stats
Total Points 183
Average Points Per Team 18.30
Average Points Per Game 1.39
Standard Deviation (Points Per Game) 1.37
Games Played 132

West Stats
Total Points 179
Average Points Per Team 19.89
Average Points Per Game 1.42
Standard Deviation (Points Per Game) 1.36
Games Played 126
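The arithmetic behind those summary lines is simple enough to sketch. The East team totals are the ones listed in the second KS output further down; the variable names are mine. One thing the division makes visible, as far as I can tell: the West average of 19.89 is 179 points spread over nine teams, while the East average is over ten.

```python
# East team point totals, copied from the second KS output in this post.
east_totals = [3, 8, 15, 17, 18, 19, 19, 26, 28, 30]
east_games = 132

east_total = sum(east_totals)                    # total points
east_avg_team = east_total / len(east_totals)    # average points per team
east_ppg = east_total / east_games               # average points per game

# West summary figures as posted.
west_total, west_games, west_avg_team = 179, 126, 19.89
west_ppg = west_total / west_games

# How many teams does the posted West average imply?
west_teams_implied = west_total / west_avg_team

print(f"East: {east_total} pts, {east_avg_team:.2f} per team, {east_ppg:.2f} per game")
print(f"West: {west_total} pts, {west_ppg:.2f} per game, "
      f"about {west_teams_implied:.1f} teams implied by the average")
```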

KS Test: Results
Kolmogorov-Smirnov Comparison of Two Data Sets

The results of a Kolmogorov-Smirnov test performed at 10:17 on 17-JUN-2012
The maximum difference between the cumulative distributions, D, is: 0.0278 with a corresponding P of: 1.000

--------------------------------------------------------------------------------

Data Set 1:
132 data points were entered
Mean = 1.386
95% confidence interval for actual Mean: 1.150 thru 1.623
Standard Deviation = 1.37
High = 3.00 Low = 0.00
Third Quartile = 3.00 First Quartile = 0.00
Median = 1.000
Average Absolute Deviation from Median = 1.22
KS says it's unlikely this data is normally distributed: P= 0.00 where the normal distribution has mean= 1.524 and sdev= 0.9963

Items in Data Set 1:
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00

Data Set 2:
126 data points were entered
Mean = 1.421
95% confidence interval for actual Mean: 1.181 thru 1.660
Standard Deviation = 1.36
High = 3.00 Low = 0.00
Third Quartile = 3.00 First Quartile = 0.00
Median = 1.000
Average Absolute Deviation from Median = 1.20
KS says it's unlikely this data is normally distributed: P= 0.00 where the normal distribution has mean= 1.535 and sdev= 0.9958

Items in Data Set 2:
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00 3.00
--------------------------------------------------------------------------------

KS Test: Results
Kolmogorov-Smirnov Comparison of Two Data Sets

The results of a Kolmogorov-Smirnov test performed at 10:24 on 17-JUN-2012
The maximum difference between the cumulative distributions, D, is: 0.2000 with a corresponding P of: 0.975

--------------------------------------------------------------------------------

Data Set 1:
10 data points were entered
Mean = 18.30
95% confidence interval for actual Mean: 12.25 thru 24.35
Standard Deviation = 8.46
High = 30.0 Low = 3.00
Third Quartile = 26.5 First Quartile = 13.2
Median = 18.50
Average Absolute Deviation from Median = 6.10
KS finds the data is consistent with a normal distribution: P= 0.84 where the normal distribution has mean= 18.10 and sdev= 10.69
KS finds the data is consistent with a log normal distribution: P= 0.28 where the log normal distribution has geometric mean= 15.05 and multiplicative sdev= 2.834

Items in Data Set 1:
3.00 8.00 15.0 17.0 18.0 19.0 19.0 26.0 28.0 30.0

Data Set 2:
10 data points were entered
Mean = 19.89
95% confidence interval for actual Mean: 14.89 thru 24.89
Standard Deviation = 6.98
High = 32.0 Low = 11.0
Third Quartile = 25.5 First Quartile = 13.0
Median = 19.45
Average Absolute Deviation from Median = 5.69
KS finds the data is consistent with a normal distribution: P= 0.98 where the normal distribution has mean= 20.24 and sdev= 8.109
KS finds the data is consistent with a log normal distribution: P= 1.00 where the log normal distribution has geometric mean= 18.78 and multiplicative sdev= 1.528

Items in Data Set 2:
11.0 13.0 13.0 15.0 19.0 19.9 24.0 25.0 27.0 32.0
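If you're wondering where the 95% confidence intervals in that output come from, here's a small sketch for the East team totals. I'm assuming the tool uses an ordinary Student's t interval (the numbers agree, but that's my guess, not something the tool states). The 2.262 is the standard two-sided 95% critical value for 9 degrees of freedom, hard-coded to keep this to the standard library.

```python
import math
import statistics

# East team point totals, copied from Data Set 1 above.
east_totals = [3, 8, 15, 17, 18, 19, 19, 26, 28, 30]

# Two-sided 95% critical value of Student's t with 9 degrees of freedom.
T_CRIT_9DF = 2.262

mean = statistics.mean(east_totals)
sd = statistics.stdev(east_totals)  # sample standard deviation (n - 1)
half_width = T_CRIT_9DF * sd / math.sqrt(len(east_totals))
low, high = mean - half_width, mean + half_width

print(f"mean = {mean:.2f}, sd = {sd:.2f}, 95% CI: {low:.2f} thru {high:.2f}")
```

The interval is wide because ten teams is a tiny sample, which is the same reason the KS test on team totals can't separate the conferences.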
