Yappi Sports - THE Ohio Prep Sports Authority  

  #1  
07-09-17, 10:18 AM
madman
Improvement Expectations

Milesplit currently has an article posted that shows the improvement for every Ohio boy who ran under 20:31.30 for 5k last year. I wish they had done it for every boy, but it still provides some interesting insights regarding improvement.



The typical improvement among all was 23.2 seconds. For years my expectation has been that boys who train consistently should improve ~3%/year (~30 sec/5k or ~10 sec/mile) not accounting for changes in physical maturity. It looks like a more typical value might be 15-20 sec/year.
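As a rough check on the arithmetic behind the ~3%/year rule of thumb, here is a minimal sketch. The 17:00 baseline is a hypothetical example chosen for illustration; it is not a figure from the Milesplit article.

```python
# Worked arithmetic for the ~3%/year rule of thumb.
# The 17:00 baseline is a hypothetical example, not a value from the article.

MILES_PER_5K = 5000 / 1609.344  # about 3.107 miles


def fmt(seconds: float) -> str:
    """Format a duration in seconds as M:SS.s."""
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes)}:{secs:04.1f}"


def yearly_drop(time_5k_s: float, rate: float = 0.03):
    """Return (seconds per 5k, seconds per mile) for a fractional improvement."""
    drop_5k = time_5k_s * rate
    return drop_5k, drop_5k / MILES_PER_5K


baseline = 17 * 60  # hypothetical 17:00 5k, in seconds
drop_5k, drop_mile = yearly_drop(baseline)
print(f"{fmt(baseline)} -> {fmt(baseline - drop_5k)}: "
      f"{drop_5k:.1f} s per 5k, {drop_mile:.1f} s per mile")

# The 23.2 s typical improvement expressed as a share of the same baseline.
print(f"23.2 s is {23.2 / baseline:.1%} of {fmt(baseline)}")
```

For a slower baseline the same 3% corresponds to a larger drop in seconds, so the "~30 sec/5k" figure only holds near this pace.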

The advantage of such a large sample is that it lends some validity to the common assumption that the faster you are, the less you are likely to improve, both in absolute terms and as a percentage.

Each athlete is a multi-factor experiment of one. I'm not sure how useful these numbers are in practical application, but I do find them interesting.

==============================================

I wish they had also included the grade for each boy, as I expect that the older you are, the smaller the improvement.

Note that the slowest four bands are incomplete, as the list only included boys from those bands who ended up running under 20:31.30 the next year. The first band is probably too small to be of much use, too.

Last edited by madman; 07-10-17 at 07:02 PM..
  #2  
07-09-17, 01:50 PM
mathking
It's interesting, but I have to wonder how meaningful it is. When I look at improvement I try to measure how well athletes do on common courses from year to year (dropping any courses where conditions were vastly different between the two years). Otherwise you get a distorted picture of how much faster or slower a kid was. Our first kid on the list had a 2016 PR from the state meet and a 2015 PR from our conference meet at Darby. But he was 18 seconds faster in the district meet and 30 seconds faster in the state meet when comparing 2016 to 2015. Our second kid on the list "got slower" because we didn't run our conference meet at Darby last year, even though he was faster in most meet-to-meet comparisons. For example, he was about 30 seconds faster in the district meet in 2016 than in 2015.
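The difference between comparing PRs and comparing the same meet year over year can be sketched as follows. The absolute times below are made up; only the 18-second and 30-second gaps mirror the example in the post, and the function names are illustrative.

```python
from collections import defaultdict

# Hypothetical results: (athlete, year, meet, time in seconds). Not real data;
# only the 18 s and 30 s same-meet gaps follow the example in the post.
results = [
    ("A", 2015, "conference", 1005.0),  # 2015 PR on a fast course
    ("A", 2015, "district", 1020.0),
    ("A", 2015, "state", 1030.0),
    ("A", 2016, "district", 1002.0),
    ("A", 2016, "state", 1000.0),       # 2016 PR
]


def pr_change(results, athlete, year_a, year_b):
    """Season-best change, ignoring course (positive = faster in year_b)."""
    def best(year):
        return min(t for a, y, _, t in results if a == athlete and y == year)
    return best(year_a) - best(year_b)


def same_meet_changes(results, athlete, year_a, year_b):
    """Change at each meet run in both years (positive = faster in year_b).

    Assumes at most one result per athlete, meet and year.
    """
    times = defaultdict(dict)
    for a, year, meet, t in results:
        if a == athlete:
            times[meet][year] = t
    return {
        meet: by_year[year_a] - by_year[year_b]
        for meet, by_year in times.items()
        if year_a in by_year and year_b in by_year
    }


print(pr_change(results, "A", 2015, 2016))
print(same_meet_changes(results, "A", 2015, 2016))
```

With these numbers the PR comparison shows only a small gain, while the same-meet comparisons show the larger gains, which is the distortion the post describes.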
  #3  
07-09-17, 03:41 PM
madman
As in any real dataset there is noise among the observations. Systematic differences from pre-test to post-test are what you're hoping to discern. If PR courses were systematically faster one year than another that would bias the results.

Among the top 2-3 bands that would probably be the State Meet course - I would think more of those kids run their PR at NTR than any other course. Last year the state meet was run under near ideal conditions which would lead to larger drops in typical PRs for those bands.

Statistics rarely yield much of value in making precise predictions for an individual, especially when there are so many variables affecting the outcome.

At best this provides an estimate of reasonable expectations for typical changes for each band of individuals.
  #4  
07-09-17, 10:45 PM
mathking
Last year NTR was the most common PR course for the top kids. But that data set is smallish. I am not going to go through all their data but for mine there is a lot of variability in PR course. Which is why I use same meet times year over year.

I have actually found that 1600 time improvement is my best predictor of XC improvement for boys under 18:00 and girls under 21:00. I think that is because track season is much closer in time than the previous XC season, and most good XC runners run at least one fresh 1600 late in track season. (The proliferation of auto-timed dual meets over the last couple of years really helps.)
  #5  
07-09-17, 10:47 PM
mathking
Quote:
Originally Posted by madman
As in any real dataset there is noise among the observations. Systematic differences from pre-test to post-test are what you're hoping to discern. If PR courses were systematically faster one year than another that would bias the results.

Among the top 2-3 bands that would probably be the State Meet course - I would think more of those kids run their PR at NTR than any other course. Last year the state meet was run under near ideal conditions which would lead to larger drops in typical PRs for those bands.

Statistics rarely yield much of value in making precise predictions for an individual, especially when there are so many variables affecting the outcome.

At best this provides an estimate of reasonable expectations for typical changes for each band of individuals.

You are correct that this does help provide an idea of how to gauge whether your team is making good progress or not.
  #6  
07-10-17, 12:07 PM
gatornation
Madman, will you be doing an average improvement for the girls? Their numbers are posted.
  #7  
07-10-17, 01:27 PM
mathking
I looked at the girls who ran under 24:00 in 2016. This gave me a mean improvement of 10.6 seconds with a standard deviation of almost 1:08. That goes along with the fact that of the 1880 girls with 2016 times under 24:00 about 45% got slower from 2015 to 2016 in terms of their PRs.
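A quick sanity check of how a 10.6-second mean and a roughly 68-second standard deviation relate to the share of girls who got slower. The normal-distribution assumption is an addition for illustration and is not stated in the post; "almost 1:08" is taken as 68 seconds.

```python
from statistics import NormalDist

mean_gain_s = 10.6  # mean improvement reported in the post, seconds
sd_gain_s = 68.0    # "almost 1:08", taken here as 68 s

# Assumption (not from the post): improvements are roughly normal.
improvement = NormalDist(mu=mean_gain_s, sigma=sd_gain_s)

# Share of athletes whose improvement is below zero, i.e. who got slower.
share_slower = improvement.cdf(0.0)
print(f"Expected share getting slower: {share_slower:.1%}")
```

Under that assumption the expected share is close to the reported figure of about 45%, so the mean, the spread and the share getting slower are consistent with one another.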
  #8  
07-10-17, 03:14 PM
CC Track Fan
Quote:
Originally Posted by mathking
That goes along with the fact that of the 1880 girls with 2016 times under 24:00 about 45% got slower from 2015 to 2016 in terms of their PRs.
I knew a big number of girls do not improve YOY, but that 45% is even higher than I thought. Thanks, this is good information.
  #9  
07-10-17, 04:08 PM
CoventryTrackXCguy
Sheer quantity of data like this is almost a quality in itself. Any anomalies that make it difficult to compare times in cross country seem to be ironed out by the sheer quantity of data.

  #10  
07-10-17, 04:53 PM
mathking
Actually more data often does not smooth out differences or give a better picture. In fact, if you aren't looking at the right data, more data tends to give an unrealistic view of how certain your data is. In this case, if we look at all the girls listed the mean is 1.2 seconds improvement. If we limit to 24:00 and faster it's 10.6 seconds. If we further limit to 22:30 or faster the mean improvement is 15.8 seconds. Note that this means the percentage improvement is growing even faster, since the average time is going down. (From basically 0% to 0.8% to 1.3%.)
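The effect of restricting the data to faster subsets can be sketched like this. The PR pairs are invented for illustration and are not the Milesplit data. Dividing the mean gain by the mean later-year time is one assumed way to get the percentage; the post does not say how it was computed.

```python
from statistics import mean, median, stdev

# Hypothetical (2015 PR, 2016 PR) pairs in seconds; not the Milesplit data.
pairs = [
    (1290.0, 1275.0),
    (1335.0, 1310.0),
    (1380.0, 1395.0),
    (1420.0, 1405.0),
    (1500.0, 1530.0),
    (1560.0, 1575.0),
]


def summarize(pairs, cutoff_s=None):
    """Summarize improvement for athletes whose later PR is under cutoff_s."""
    kept = [(a, b) for a, b in pairs if cutoff_s is None or b < cutoff_s]
    gains = [a - b for a, b in kept]  # positive = got faster
    return {
        "n": len(kept),
        "mean_gain_s": mean(gains),
        "median_gain_s": median(gains),
        "sd_gain_s": stdev(gains),
        # Assumed definition: mean gain relative to the mean later-year time.
        "mean_gain_pct": 100 * mean(gains) / mean(b for _, b in kept),
        "share_slower": sum(g < 0 for g in gains) / len(gains),
    }


for label, cutoff in [("all", None), ("under 24:00", 1440.0), ("under 22:30", 1350.0)]:
    print(label, summarize(pairs, cutoff))
```

Reporting the median and the share who got slower next to the mean makes it easier to see whether a few large changes are driving the average.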

I did a random selection of 20 girls, and 16 were faster in the last meet they ran in 2016 that was on the same course they ran that same weekend in 2015. (For most it was the conference, district or regional. One was the state meet and one was the meet before the conference meet.)
  #11  
07-10-17, 06:18 PM
madman
Quote:
Originally Posted by mathking
Actually more data often does not smooth out differences or give a better picture. In fact, if you aren't looking at the right data, more data tends to give an unrealistic view of how certain your data is. In this case, if we look at all the girls listed the mean is 1.2 seconds improvement. If we limit to 24:00 and faster it's 10.6 seconds. If we further limit to 22:30 or faster the mean improvement is 15.8 seconds. Note that this means the percentage improvement is growing even faster, since the average time is going down. (From basically 0% to 0.8% to 1.3%.)

I did a random selection of 20 girls, and 16 were faster in the last meet they ran in 2016 that was on the same course they ran that same weekend in 2015. (For most it was the conference, district or regional. One was the state meet and one was the meet before the conference meet.)

Unless you know your data is roughly symmetric you shouldn't be using mean as a measure of center, but I think you know that...

This particular dataset has a problem with the slower end of the distribution since it is truncated. Athletes who ran slower than the last value in the list are omitted, which creates a bias towards improvement since those who got slower may be omitted.
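The truncation bias can be illustrated with a small simulation in which nobody truly improves, yet the slower athletes who make the list appear to. The 1231.3 s cutoff is the 20:31.30 mark from the boys' list; the ability and noise distributions are assumptions made up for this sketch.

```python
import random
from statistics import mean


def simulate(n=20000, cutoff=1231.3, true_gain=0.0, seed=1):
    """Simulate PR pairs and keep only athletes whose later PR beats the cutoff."""
    rng = random.Random(seed)
    everyone, listed = [], []
    for _ in range(n):
        ability = rng.gauss(1200.0, 90.0)       # assumed underlying 5k level, s
        year1 = ability + rng.gauss(0.0, 30.0)  # earlier PR = level + noise
        year2 = ability - true_gain + rng.gauss(0.0, 30.0)
        gain = year1 - year2                    # positive = got faster
        everyone.append(gain)
        if year2 < cutoff:                      # the list omits slower later PRs
            listed.append((year1, gain))
    return everyone, listed


everyone, listed = simulate()

# Athletes near or beyond the cutoff in year 1 who still made the list.
slow_band = [gain for year1, gain in listed if year1 >= 1200.0]

print(f"Mean gain, full population: {mean(everyone):.1f} s")
print(f"Mean gain, listed slow band: {mean(slow_band):.1f} s")
```

The full population averages roughly zero gain by construction, while the listed slow band shows a clearly positive average because the athletes who got slower past the cutoff were dropped.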
  #12  
07-10-17, 06:53 PM
madman
Quote:
Originally Posted by gatornation
Madman, will you be doing an average improvement for the girls? Their numbers are posted.
I have no agenda here, but before presenting the girls' results I've got the feeling they may generate some strong responses. Just remember the data is what it is. I am making no attempt to argue why it's that way.

I will leave it to those who have coached girls far more than I have to make such arguments.

============================



As I mentioned earlier, since no times from last fall slower than 26:31 were included in Milesplit's list, the data omits a large chunk of the population and in so doing biases the results for girls at the slower end of the spectrum. That's why it appears that all of those girls improve.

============================

For the girls in the faster bands, which are less likely to be affected by the truncation of the data, the typical girl didn't end up with a faster PR. However, 40-45%, a large chunk of the population, did have faster PRs than the year before.

============================

Note: To facilitate comparisons, I went back and updated the table for boys to include the % Who Improve column.

Last edited by madman; 07-10-17 at 07:05 PM..
  #13  
07-10-17, 07:37 PM
mathking
Quote:
Originally Posted by madman
Unless you know your data is roughly symmetric you shouldn't be using mean as a measure of center, but I think you know that...

This particular dataset has a problem with the slower end of the distribution since it is truncated. Athletes who ran slower than the last value in the list are omitted, which creates a bias towards improvement since those who got slower may be omitted.
There are tons of problems with this data. You correctly point out one of the big problems, since the lower end of the data has a lot of issues. That is one of the reasons I chose to look at a restricted set of the data. In my view the single largest problem is that comparing PRs doesn't make tons of sense. While getting a PR is a great motivational tool for kids, comparing races on different courses under different conditions doesn't make sense if you are trying to use the data in a predictive manner.

There really isn't a great measure of central tendency to use in this case, but that is pretty normal. On the other hand, if you are trying to use the data predictively, say by eventually developing a regression model, then you pretty much have to use some form of mean. And by the way, there is a lot of symmetry. (See the attached graph of the improvements.)

In any event, as I said, I looked at restricting the data to a faster subset, which alleviates some of the issues you pointed out. Another reason I did this is that on average those kids are more likely than their slower teammates to have engaged in consistent training. Thus they are more likely to fit a consistent pattern. In the restricted data sets the girls were more likely to improve, but still nowhere near as likely to improve as the boys were. That pattern fits with my 25 seasons of XC data.
Attached: GirlsImprovementScreenShot.jpg (graph of the girls' improvements)

Last edited by mathking; 07-11-17 at 04:16 PM..