Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Imparting any meaning at all to NULL including non uniqueness or distinction seems fraught with problems. NULL is NULL that’s it. It means nothing else.


If it can't be either behavior, then should nullable columns not be allowed in unique indexes at all?


No ideally not. If they are they must be ignored.

If other columns in the row index are unique regardless of the value of the nullable column, then the row is logically unique. So if in that case the NULL value is ignored, then OK. But two NULLs neither match each other nor are they distinct from each other. You just cannot say either way.


A few DBs actually do that.

It can be rather limiting, though also less surprising (at runtime) than either other behaviour.


The index only should apply over non-null values. I'm not sure how anything else makes any practical sense.


That is equivalent to the DISTINCT semantics.

First, it should be noted that the reason this toggle was added is because the spec was ambiguous and different databases have different defaults, specifically:

- BigQuery and Db2 forbid nullable columns in indexes, solving the issue

- most databases default to NULLS DISTINCT

- SQL Server defaults to NULLS NOT DISTINCT

- as usual Oracle huffes glue in the corner (it generally uses NULLS NOT DISTINCT, except in the case where every column in the constraint is null in which cases it uses NULLS DISTINCT —- I assume because it just doesn’t apply the constraint at all)

And of course something else makes sense, because SQL’s NULL covers multiple use cases (UNKNOWN, missing, intentionally left out) they leave a lot to the details of the modelling.

For instance let’s say you’re modelling an FS-like structure, you have records which have a parent and a name, and the name needs to be unique within a parent.

That constraint needs to apply the the top-level items still, and while you could model top-level-ness through a special-case hard-coded record, you could just as well model it by leaving the parent field NULL, in which case you’d need `UNIQUE (parent, name)` where “parent” is NULLS NOT DISTINCT. This is especially common under under organic growth.

Adding a separate boolean flag is redundant and distinctly ugly, I’m not sure expressions are supported by every db (and they’re not free) and while sometimes a sentinel value can be used, that’s not the case in general.




Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: