Catching a Unicorn with GLTR: A tool to detect automatically generated text (gltr.io)
48 points by dsr12 on March 9, 2019 | 10 comments


I wonder if these detections would become simply impossible once NLP researchers figure out how to integrate adversarial training (especially GANs) into their models.


What OP is detecting is not a flaw in GPT-2 but in how GPT-2 is used to generate samples: it chooses only from the top k most likely words, so you can easily detect GPT-2 simply by running it and noting that no sampled word appears from below, say, the top-40 candidates, while human text will occasionally have a top-41 or a top-99 or a top-1000 word.

If you fix that with a better sampling strategy like beam search or RL finetuning, unclear how well OP-like methods would work.
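A minimal sketch of the top-k check described above, assuming you already have each token's rank under the model (how you get those ranks from GPT-2 is left out). The names `token_rank` and `looks_top_k_sampled`, the `min_tokens` threshold, and the toy distribution are all made up for illustration; this is not GLTR's actual code.

```python
from typing import Mapping, Sequence


def token_rank(probs: Mapping[str, float], token: str) -> int:
    """1-based rank of `token` among the candidates (1 = most likely)."""
    p = probs[token]
    return 1 + sum(1 for q in probs.values() if q > p)


def looks_top_k_sampled(ranks: Sequence[int], k: int = 40, min_tokens: int = 50) -> bool:
    """True if the text is long enough and no token ranks below the top k.

    `min_tokens` is an assumed guard: a short human text can easily
    stay inside the top k by chance.
    """
    if len(ranks) < min_tokens:
        return False
    return max(ranks) <= k


# Toy next-word distribution, only to show how a rank is computed.
toy = {"the": 0.5, "a": 0.3, "unicorn": 0.2}
print(token_rank(toy, "unicorn"))

# Hypothetical per-token ranks for two texts of about 100 tokens each.
machine_like = [1, 3, 40, 7] * 25      # never leaves the top 40
human_like = machine_like + [412]      # one rare word is enough
print(looks_top_k_sampled(machine_like), looks_top_k_sampled(human_like))
```

A single low-ranked word clears the text, which is why this only catches samples drawn with a hard top-k cutoff and says nothing about other sampling strategies.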


I'd love to see a Chrome plugin for this, to look at text on various sites and flag whether it's likely machine-written.


That's neat, but many of the tests are using text generated with the same GPT-2 data set used for testing the text.

(Also, whatever generates their images of text blocks has what looks like interpreting Windows-1252 as Unicode.)


> has what looks like interpreting Windows-1252 as Unicode

The other way around. The underlying data is UTF-8; it's being misread as either latin1 or Windows-1252. E.g., "Pérez",

  Original string: "Pérez"
  Encoded as UTF-8: b'P\xc3\xa9rez'
  Those bytes, erroneously decoded as latin1: "PÃ©rez"
https://en.wikipedia.org/wiki/Mojibake
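
The round trip can be reproduced in a few lines of Python. This is only a sketch using the standard `str.encode` / `bytes.decode` methods; the variable names are arbitrary, and the repair step assumes the text was mis-decoded exactly once.

```python
original = "Pérez"

# Correct encoding step: the text is stored as UTF-8 bytes.
utf8_bytes = original.encode("utf-8")
print(utf8_bytes)

# The mistake: reading those UTF-8 bytes as if they were latin1.
garbled = utf8_bytes.decode("latin-1")
print(garbled)

# Windows-1252 maps the bytes 0xC3 and 0xA9 to the same two characters.
assert utf8_bytes.decode("cp1252") == garbled

# Undo it by reversing the wrong decode, then decoding as UTF-8.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)
```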

These two (latin1 and Win-1252) are the most common mis-decoding. UTF-8 data is common. HTML/HTTP/the "text/*" mimetype default to latin1. Windows, for North American users, defaults to Win-1252 in places. (Both encodings are extremely similar.)

It used to be that if you typed "QuÃ©bec" into Google, it'd give you Québec, and wouldn't even issue a "did you mean?", just silently correct it. Seems like that's not the case anymore. Back when it was, I wondered if the "machine" thought that QuÃ©bec was an alternative spelling. (The results for "QuÃ©bec" are still decently relevant.)


It sounds like a good idea for deterring spam, etc. until you realise that such techniques can also be used for very insidious forms of censorship. Imagine writing about some controversial topic/opinion and getting "you've been banned because your comment looks like it was written by a robot."


If your comment is so banal as to trip this filter then maybe you haven't actually contributed anything to the discussion anyway, so what is the loss? I don't mean this to be as glib as it might sound, but as the old saying goes, 'opinions are like assholes, everyone has one' and if your proposed addition to the discussion is indistinguishable from auto-generated text there is little lost by removing it.


I think the implication is either that

1) The people running the filter are saying that in bad faith; they have actually trained it to catch "Bad Ideas" instead of autogenerated text

2) Someone trains a GAN to mimic their political opponents and spams sites with its output, so the bad political opinions themselves become heuristics for spam


In the referenced paper the trigger for the filter is lack of novelty, not any specific content. In neither of the cases you present would you be able to demonstrate that a filter like this could be operated in a way that discriminates on content rather than on novelty, other than by shifting the corpus in a manner that would be obvious to everyone.


Just like with spam filters, the problem isn't when they work; it's the false positives that are really the concern, and what I've seen with all the machine-learning stuff these days is that they tend to fail in mysterious ways.



