Wikipedia Vandal Early Detection: From User Behavior to User Embedding [pdf] (semanticscholar.org)
75 points by lainon on Aug 19, 2017 | 1 comment


I'm not convinced their user embedding is useful. I did not read the paper in detail, but it seems to build the embedding from a user's list of edits, much as one might build a paragraph vector by averaging word vectors. If I had to guess, that captures no more information than they already had in a one-hot vector marking whether or not a user has edited each specific article. It would have been better if they had benchmarked against that baseline. I would wager that a simple random forest on the one-hot vector would do just as well as their NN solution, if not better.
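
To make the guess above concrete, here is a minimal sketch in plain Python. It assumes the averaging reading described in the comment, which may not be what the paper actually does. The article names, the 2-d vectors, and the helper names (`multi_hot`, `mean_embedding`, `mean_from_multi_hot`) are all made up for illustration. It shows that an average of per-article vectors can be computed from the edited-or-not vector alone, so under that reading it cannot carry more information than that vector, and it can carry less.

```python
from typing import Dict, List, Sequence

# Hypothetical toy data: four articles with made-up 2-d embeddings.
VOCAB: List[str] = ["A", "B", "C", "D"]
EMB: Dict[str, List[float]] = {
    "A": [1.0, 0.0],
    "B": [0.0, 1.0],
    "C": [1.0, 1.0],
    "D": [0.5, 0.5],
}


def multi_hot(edited: Sequence[str], vocab: List[str]) -> List[int]:
    """Return 1 for each article the user edited, else 0."""
    edited_set = set(edited)
    return [1 if article in edited_set else 0 for article in vocab]


def mean_embedding(edited: Sequence[str], emb: Dict[str, List[float]]) -> List[float]:
    """Average the vectors of the distinct articles a user edited."""
    unique = sorted(set(edited))
    if not unique:
        raise ValueError("user has no edits")
    dim = len(next(iter(emb.values())))
    return [sum(emb[a][d] for a in unique) / len(unique) for d in range(dim)]


def mean_from_multi_hot(x: List[int], vocab: List[str],
                        emb: Dict[str, List[float]]) -> List[float]:
    """Compute the same average using only the multi-hot vector."""
    count = sum(x)
    if count == 0:
        raise ValueError("user has no edits")
    dim = len(next(iter(emb.values())))
    return [sum(x[i] * emb[vocab[i]][d] for i in range(len(vocab))) / count
            for d in range(dim)]


user_1 = ["A", "B"]   # edited two articles
user_2 = ["D"]        # edited one different article

x1 = multi_hot(user_1, VOCAB)
x2 = multi_hot(user_2, VOCAB)

# The averaged embedding is recoverable from the multi-hot vector...
same_as_projection = mean_from_multi_hot(x1, VOCAB, EMB) == mean_embedding(user_1, EMB)
# ...but two users with different edit sets can collapse to one embedding.
users_collide = x1 != x2 and mean_embedding(user_1, EMB) == mean_embedding(user_2, EMB)
```

Strictly speaking the vector is multi-hot rather than one-hot, since a user can edit many articles. The baseline suggested above would train a random forest directly on vectors like `x1` and `x2`; whether it matches the paper's model is an empirical question this toy example does not answer.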



