So far, if I understand correctly, nobody knows of a sequence of characters you could write down in any language that would trigger a crash when encoded in Unicode in a straightforward way. That suggests that the invariant being assumed may come, ironically, from a deep understanding of the scripts rather than ignorance.
In Bengali and Oriya specifically, a ZWNJ can be used to force a different vowel form when used before a vowel (e.g. রু vs রু), however this bug seems to apply to vowels for which there is only one form
This seems to say that the ZWNJ has a meaning before vowels that have different forms, but the crash happens with vowels that only have one form, where the ZWNJ has no effect. Maybe I am misreading?
I'm saying that this crash _also_ applies to vowels with one form.
রু was the original Bengali crash, and that has two forms. I'm saying it's less likely to be related to the zwnj-vowel interaction because it also occurs for vowels where such interaction doesn't exist.