The Data-Based Drafting Thread (what players would a Potato pick?)

Discussion in 'Vancouver Canucks' started by But Gillis, May 27, 2018.

  1. Melvin

    Melvin is on a pasta-free diet. Sponsor

    Joined:
    Sep 29, 2017
    Messages:
    4,062
    Likes Received:
    3,532
    Trophy Points:
    102
    Gender:
    Male
    Occupation:
    Data Nerd
    Location:
    Vancouver, BC
    One side benefit of this project is that in compiling this data I now have a database of draft information I have never had before, and can do all sorts of stuff with it very easily.

    For example, here is the strength of each draft, rated as the average potato score of its drafted players:

    1. 2016 draft
    2. 2015 draft (the 2018 draft is looking comparable.)
    3. 2010 draft
    4. 2009 draft
    5. 2011 draft
    6. 2013 draft
    7. 2017 draft (surprised this is so low)
    8. 2014 draft (interestingly enough, based on first round only this draft is #1. But it fell off after that.)
    9. 2012 draft

    This lines up with common perception that the 2012 draft was a poor one.
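
    For anyone wondering how a ranking like this falls out of a draft database, here is a minimal sketch. The records, player names and scores below are invented for illustration and are not real potato data; the only thing taken from the post is the idea of averaging the score per draft year.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (draft_year, player, potato_score). All values invented.
picks = [
    (2012, "Player A", 40.0),
    (2012, "Player B", 50.0),
    (2016, "Player C", 70.0),
    (2016, "Player D", 80.0),
    (2014, "Player E", 60.0),
]

def rank_drafts(records):
    """Return (year, average score) pairs sorted from strongest to weakest draft."""
    scores_by_year = defaultdict(list)
    for year, _player, score in records:
        scores_by_year[year].append(score)
    averages = {year: mean(scores) for year, scores in scores_by_year.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

ranking = rank_drafts(picks)
for place, (year, avg) in enumerate(ranking, start=1):
    print(f"{place}. {year} draft (avg {avg:.1f})")
```

    Restricting `records` to first-round picks before calling `rank_drafts` would give the "first round only" view mentioned for 2014.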

    How annoying is it that the 2016 draft is the one where we didn't have any picks. Argh.
     
    Last edited: May 29, 2018
    Catbug and vancityluongo like this.
  2. ChefBoiRD

    ChefBoiRD Registered User

    Joined:
    Feb 26, 2018
    Messages:
    586
    Likes Received:
    226
    Trophy Points:
    33
    Gender:
    Male
    The Patata (potato) reigns supreme!!

    Lllllove it! Looooooooove iiiiiiiiit!!

    But Gillis is like our own Bill James (Sabermetrics)

    Patatametrics!!

    Nice work But Gillis! Very interesting stuff.

    Thanks for doing the 2018 class as well. It will be my go-to cheat sheet come draft day.

    Ciao!!
     
    Hindustan Smyl likes this.
  3. Jyrki21

    Jyrki21 Todellakaan ole suomalaista Sponsor

    Joined:
    Oct 3, 2004
    Messages:
    8,145
    Likes Received:
    1,123
    Trophy Points:
    139
    Location:
    Cascadian Embassy in Ottawa
    I was away for a few days getting old when you posted this, but I want to chime in as well and say that even though I don't get a lot of the finer statistical points (and none of the compsci background) I really appreciate this too and look forward to its application to 2018.
     
    Melvin likes this.
  4. Mafic

    Mafic Registered User

    Joined:
    Oct 2, 2007
    Messages:
    231
    Likes Received:
    0
    Trophy Points:
    74
    Location:
    BC
    Thanks for putting the time into this, it's a very interesting read! Reminds me a bit of that Sham-the-intern draft Canucks Army put out a while back (We think the Vancouver Canucks may have a scouting problem(!!!!)), but obviously with a lot more effort.

    I'm curious how you ended up training the model, but didn't see much mentioned about it in the methods. Are you basically taking some NHL metric (I'm guessing TOI), and adjusting your position/league/age/height modifiers until you get the best correlation with PPG?
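
    As a rough illustration of the kind of tuning loop being asked about (this is not the actual potato method, which isn't described in the thread), the sketch below tries a grid of multipliers for one league and keeps the one whose adjusted scoring correlates best with an NHL outcome. The league name "LeagueX", the outcome metric and every number are assumptions made up for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical rows: (league, pre-draft points per game, NHL outcome). Invented.
players = [
    ("OHL", 1.0, 0.50),
    ("OHL", 1.4, 0.70),
    ("OHL", 0.8, 0.40),
    ("LeagueX", 0.5, 0.50),
    ("LeagueX", 0.7, 0.70),
    ("LeagueX", 0.4, 0.40),
]

def best_multiplier(rows, league, candidates):
    """Grid-search the multiplier for `league` that maximises the correlation."""
    outcomes = [outcome for _, _, outcome in rows]
    best = None
    for m in candidates:
        adjusted = [ppg * m if lg == league else ppg for lg, ppg, _ in rows]
        r = pearson(adjusted, outcomes)
        if best is None or r > best[1]:
            best = (m, r)
    return best

candidates = [round(0.5 + 0.1 * i, 1) for i in range(26)]  # 0.5, 0.6, ..., 3.0
multiplier, r = best_multiplier(players, "LeagueX", candidates)
print(f"best LeagueX multiplier: {multiplier} (r = {r:.3f})")
```

    A real version would tune position, age and height modifiers at the same time, which is where overfitting to the training years becomes a concern.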

    I agree. Obviously the Sham example is flawed as he knows which prospects are drafted in each round. Since you mentioned you're using the same version of the model for every draft, I think its value as a scouting-evaluation tool would diminish the farther back in time you go, simply because you have access to a greater amount of data that was not known at that time.
    It would be interesting to see your potato system with some kind of moving-window to train the model. But I'm not that great of a stats or programming guy, so have no idea how much effort that requires...

    Anyways, these 'potato' drafts are always interesting because I think the data would show they generally outperform our drafting. But it would certainly be more convincing if it relied solely on data prior to the draft. The 2018 predictions will be interesting to follow just for that reason.

    Cheers
     
  5. Melvin

    Melvin is on a pasta-free diet. Sponsor

    I set it up deliberately so that it is very easy for me to re-model based on different years if need be.

    I don't want to get too detailed into the methodology until I know where I am going with this but I have done hundreds of different runs using different draft years and different approaches.

    When I did the team evaluations on the blog, my method for avoiding this was to use older data to "train" but then do the team evaluations based on the newer drafts only. Thus I was posting evaluations of teams from 2014-2017 but only after "training" based on the drafts prior. Having said that though, I have made changes to things since then so I can't guarantee that if I do the same exercise I will get the exact same results.
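
    For readers who want to see the shape of that split, here is a small sketch of the moving-window variant suggested earlier in the thread, where each draft is evaluated using only the drafts before it. It is an illustration under assumptions, not the code behind the blog evaluations: the record layout and the five-year window are hypothetical choices.

```python
def rolling_windows(first_year, last_year, window):
    """Yield (train_years, eval_year) pairs: each draft is judged using only
    the `window` drafts that came before it."""
    for eval_year in range(first_year + window, last_year + 1):
        yield list(range(eval_year - window, eval_year)), eval_year

def split_by_year(records, train_years, eval_year):
    """Split records into a training set and a held-out evaluation set."""
    train = [r for r in records if r["draft_year"] in train_years]
    held_out = [r for r in records if r["draft_year"] == eval_year]
    return train, held_out

# Hypothetical records: one placeholder prospect per draft year.
records = [{"draft_year": y, "player": f"Prospect {y}"} for y in range(2009, 2018)]

windows = list(rolling_windows(2009, 2017, window=5))
for train_years, eval_year in windows:
    train, held_out = split_by_year(records, train_years, eval_year)
    print(f"train {train_years[0]}-{train_years[-1]} ({len(train)} rows) -> evaluate {eval_year}")
```

    Keeping a single fixed training window and evaluating every later draft against it is the simpler version of the same idea.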

    It is pretty marginal though in terms of how much difference it makes in the big picture, and that makes me happy. For example just last weekend I added the 2009 draft and re-ran calculations using 2009-2013 as my "window" and I did not get any different results. So that is a good sign. That is why I am comfortable posting this thread now; because I believe that I am not going to be adjusting anything significantly unless I come across some sort of breakthrough and I am mostly into diminishing returns.

    When I do the league calculations, for example, I always give myself a "range" of values, sort of. I don't have it output an exact coefficient because I believe that introduces a false sense of precision and you are lying to people when you say "oh this league is 0.834 of the OHL." Like nobody has enough data to be able to pinpoint things to a 3rd decimal place; it's absurd. So I always end up with an output that is like "between 0.5 and 0.7." The more data I have, the tighter the window I will be able to make. For something like the Czech league where the data is limited, I get something like "between 2.5 and 6.0" or something crazy. At that point I am picking a value as pretty much a guess based on my own instincts, so it is a bit subjective. That's why I said that Martin Kaut could realistically be anywhere from like 3rd on my ranking to 16th. Just, nobody really knows much about that league unfortunately.

    I believe that my approach of outputting a window and not a hyper-precise number means that I am not as prone to small sample fluctuations or biases introduced by using specific model years. Whereas someone calculating specific values might get 0.834 when running these years, but get 0.893 when using some other data set. For me, I get "between 0.8 and 0.9" both times and I pick my coefficient based on that.
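
    To illustrate the idea of reporting a coarse window instead of a point estimate, here is one possible sketch. The thread does not say how the ranges are actually computed, so the rule used here (mean plus or minus two standard errors, rounded outward to the nearest 0.1) is purely an assumption, and the ratio values are invented.

```python
from math import ceil, floor, sqrt
from statistics import mean, stdev

def coarse_range(ratios, step=0.1, z=2.0):
    """Return a (low, high) window for a league coefficient.

    Assumed rule: mean +/- z standard errors, rounded outward to a multiple
    of `step` so the output never suggests more precision than the data has.
    """
    centre = mean(ratios)
    half_width = z * stdev(ratios) / sqrt(len(ratios))
    low = floor((centre - half_width) / step) * step
    high = ceil((centre + half_width) / step) * step
    return round(low, 2), round(high, 2)

# Hypothetical per-player scoring ratios relative to the OHL. Invented values.
small_sample = [0.6, 1.0, 0.8, 0.7, 0.9]
large_sample = small_sample * 4  # same spread, four times as many players

print("few players: ", coarse_range(small_sample))
print("more players:", coarse_range(large_sample))
```

    With the same spread but more players, the window tightens, which matches the point that more data allows a narrower range.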

    Anyway, I am working on 2008 now; I think I may go back as far as I can because even if I don't use this old data for this exercise it may still have a purpose for a future project. Just having a database with all this draft data is interesting even just for answering questions like "when was the last time a player like X was drafted."

    BTW, if anyone out there is willing to help with some data entry, please PM me. That would really expedite the process in terms of getting these older drafts loaded.
     
    Last edited: May 29, 2018
  6. Melvin

    Melvin is on a pasta-free diet. Sponsor

    Here are my team rankings from 2009-2016:

    ott 77
    car 76
    ana 74
    lak 69
    was 64
    min 63
    col 62
    nas 62
    tam 61
    fla 60
    cbs 59
    buf 57
    edm 54
    chi 53
    bos 52
    nyi 50
    det 49
    win 48
    cal 47
    ari 45
    dal 45
    pit 42
    njd 40
    sjs 40
    mon 37
    phi 33
    tor 33
    stl 32
    nyr 30
    van 29
     
  7. Hindustan Smyl

    Hindustan Smyl 2020/2021 = Canucks being a contender

    Joined:
    Jun 21, 2014
    Messages:
    4,732
    Likes Received:
    443
    Trophy Points:
    89
    Gender:
    Male
    Occupation:
    Lecturer
    Location:
    Shanghai
    One question for you: Why does Hughes fall completely off the map under your analysis? Your opinion here seems to go against the grain of a lot of other esteemed hockey pundits.

    One thing I find interesting however is that ALL of you guys (me included) are high on Kotkaniemi.
     
  8. Hindustan Smyl

    Hindustan Smyl 2020/2021 = Canucks being a contender

    Thank you Mike Gillis!
     
  9. Peter10

    Peter10 Registered User

    Joined:
    Dec 7, 2003
    Messages:
    1,932
    Likes Received:
    837
    Trophy Points:
    139
    Gender:
    Male
    Location:
    Germany
    Only 5/8 of your thanks belong to Mike Gillis; the other 3/8 are to Jim Benning's credit.
     
  10. Peter10

    Peter10 Registered User

    I would say you are a bit late to the Kotkaniemi train, jumping on only last month. I think it was Knight53 who has been banging the drum for him for about a year, and I guess many folks had a look then and liked what they saw.
     
    Hindustan Smyl likes this.
  11. mossey3535

    mossey3535 Registered User

    Joined:
    Feb 7, 2011
    Messages:
    6,527
    Likes Received:
    391
    Trophy Points:
    94
    Dude just sell this and get a job with an NHL team.
     
  12. mossey3535

    mossey3535 Registered User

    As for d-men, IMO there's no point in changing your model.

    If a team at that position wants a d-man, just filter the potato list for d-men, pick the top rated potato d-man. Done.
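
    In code terms that filter is about as simple as it sounds. A minimal sketch, with a made-up potato list (names, positions and scores are all hypothetical):

```python
# Hypothetical potato list entries. Names, positions and scores are invented.
potato_list = [
    {"player": "Forward A", "position": "F", "score": 91.0},
    {"player": "Defender B", "position": "D", "score": 84.0},
    {"player": "Defender C", "position": "D", "score": 88.0},
    {"player": "Forward D", "position": "F", "score": 79.0},
]

def top_available(prospects, position, already_taken=()):
    """Return the highest-scored prospect at `position` who is still on the
    board, or None if nobody at that position is left."""
    candidates = [
        p for p in prospects
        if p["position"] == position and p["player"] not in already_taken
    ]
    return max(candidates, key=lambda p: p["score"], default=None)

pick = top_available(potato_list, "D")
print(pick["player"])
```

    Passing the names already drafted as `already_taken` gives the best d-man still on the board when the team's pick comes up.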
     
  13. Doyle Hargraves

    Doyle Hargraves Registered User

    Joined:
    May 11, 2018
    Messages:
    401
    Likes Received:
    179
    Trophy Points:
    33
    Gender:
    Male
    I think you have that backwards. Gillis had six drafts in Vancouver. He drafted from 08-13. The drafting here was a garbage fire from 06-12.
     
  14. RobertKron

    RobertKron Registered User

    Joined:
    Sep 1, 2007
    Messages:
    11,665
    Likes Received:
    1,225
    Trophy Points:
    169
    The period in question was 09-16
     
  15. Doyle Hargraves

    Doyle Hargraves Registered User

    Ok so he ran five drafts in there from 09-13. Elmer ran 3
     
  16. Hindustan Smyl

    Hindustan Smyl 2020/2021 = Canucks being a contender

    Elmer killed it in '15 and '17
     
  17. Doyle Hargraves

    Doyle Hargraves Registered User

    Those were Brackett and Gradin drafts
     
  18. drax0s

    drax0s Registered User

    Joined:
    Mar 18, 2014
    Messages:
    2,162
    Likes Received:
    310
    Trophy Points:
    79
    Location:
    Vancouver, BC.
    So effectively, when people claim a draft is weak or strong, you can quantify it. I could see draft rankings like "top 5 picks, 5% above average; top 20 picks, 20% below average", etc. being useful to quantify sort of an "actual value" of a draft pick. 2016 draft picks, for example, are more valuable than 2012 picks.
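
    A quick sketch of how such a "percent above or below average" figure could be computed for a range of picks. The scores are invented and listed in pick order; nothing here comes from the real potato data, and comparing against the average across the listed drafts is an assumed baseline.

```python
from statistics import mean

# Hypothetical potato scores in pick order for two drafts. Invented values.
scores_by_draft = {
    2012: [90.0, 70.0, 60.0, 50.0],
    2016: [130.0, 110.0, 80.0, 70.0],
}

def relative_strength(drafts, year, top_n):
    """Percent by which a draft's top_n average score differs from the
    average top_n score across all drafts in `drafts`."""
    year_avg = mean(drafts[year][:top_n])
    overall_avg = mean(mean(scores[:top_n]) for scores in drafts.values())
    return 100.0 * (year_avg - overall_avg) / overall_avg

for year in sorted(scores_by_draft):
    pct = relative_strength(scores_by_draft, year, top_n=2)
    print(f"{year}: top 2 picks {pct:+.0f}% vs average")
```

    Running it with different `top_n` values gives the "top 5" versus "top 20" style breakdown.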
     
  19. PuckMunchkin

    PuckMunchkin Registered User

    Joined:
    Dec 13, 2006
    Messages:
    3,363
    Likes Received:
    373
    Trophy Points:
    124
    Location:
    Lapland
    It's a crapshoot anyways.
     
  20. canuckslover10

    canuckslover10 Registered User

    Joined:
    Apr 10, 2014
    Messages:
    754
    Likes Received:
    164
    Trophy Points:
    46
    Occupation:
    ear tickeler
    Lol, in this year's draft the model doesn't even have Hughes on the list
     
  21. Hindustan Smyl

    Hindustan Smyl 2020/2021 = Canucks being a contender

    Of course they were. Anything positive = not Benning. Anything negative = Benning. #Agenda #Narrative #CorkyThatcher
     
    Doyle Hargraves likes this.
  22. skyo

    skyo Dahlen-Pettersson FTW.

    Joined:
    Sep 22, 2013
    Messages:
    3,190
    Likes Received:
    48
    Trophy Points:
    56
    Location:
    CanucksCorner
    LOL Gradin was here how long again?
     
  23. Melvin

    Melvin is on a pasta-free diet. Sponsor

    True, and an interesting thought but I think the difference is usually going to be quite marginal. 2012 was baaad though.
     
  24. Melvin

    Melvin is on a pasta-free diet. Sponsor

    5'10" defender with meh production in the weakest college conference. He rates similarly to Julius Honka and Adam Fox.

    This is, again, where your scouts need to be able to add value. If you're going to make him the highest-selected 5'10" defender in history, you'd better be sure.
     
  25. RobertKron

    RobertKron Registered User

    This is exactly what the post you disagreed with said.
     
    Peter10 likes this.

