
So if I understand this right, what GitHub did was something like:

    user = get_user_from_valid_email(params[:email])
    # mails the address taken from the request
    send_reset_email(params[:email])
    # instead of the address stored on the record:
    # send_reset_email(user.email)
?

I've seen this pattern before and the reason is usually something about using the variable in memory as opposed to the function call. Total non-optimisation.



Yes! Add in a toLowerCase() in there as well.

An apparently small deviation with surprisingly large repercussions.


I always preferred using the record instead of the user input for follow-up operations like that.

So far it was only a gut feeling. Now I know why I do it ;)
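
A minimal sketch of the difference, for illustration only. `USERS`, `find_user_by_email` and the two `reset_recipient_*` helpers are hypothetical names, not GitHub's code, and the lookalike address is just one way a case-folded lookup can match a string that is not the stored address.

```ruby
# Hypothetical user store keyed by lowercased email (illustrative only).
User = Struct.new(:id, :email)
USERS = { "mike@example.com" => User.new(1, "mike@example.com") }

# Case-insensitive lookup. String#downcase applies Unicode case mapping,
# so some non-ASCII characters fold onto plain ASCII letters.
def find_user_by_email(supplied)
  USERS[supplied.downcase]
end

# Unsafe: the recipient is the raw request parameter.
def reset_recipient_unsafe(supplied)
  user = find_user_by_email(supplied)
  user && supplied
end

# Safer: the recipient is the address stored on the matched record.
def reset_recipient_safe(supplied)
  user = find_user_by_email(supplied)
  user && user.email
end

# U+212A KELVIN SIGN lowercases to "k", so this matches mike@example.com
# even though it is a different string (and possibly a different mailbox).
lookalike = "mi\u212Ae@example.com"
puts reset_recipient_unsafe(lookalike) == lookalike
puts reset_recipient_safe(lookalike)
```

Using the record means the lookup can be as forgiving as you like without ever changing where the mail goes.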


> Total non-optimisation.

Lots of non-thought too. Sending e-mail directly from the place where web requests are processed isn't very smart. What if the SMTP subsystem is currently down or very slow? How many e-mails will you send if an attacker starts 1000 parallel web requests for a password reset?

A saner way to do these things is to just set a flag ("password reset requested") in the user database there and do the actual work asynchronously in a regularly performed maintenance task. It'll also prevent all these attacks based on misuse of user input by default (unless, in this case, you pointlessly decide to update the user's e-mail in the user database to what the hacker specified).


Are designers like you the reason password reset mails frequently take 30 seconds plus to arrive?

I expect things on the web to be damn near instant - I want to click that password reset button and hear a synchronous "ding" of an arriving mail. I don't want to wait for some cron job to run once per minute.

If you want to send mail off the serving path, at least use a push based queue so there is no scheduling or polling involved.


Yeah... and don't forget that the "R" in "SMTP" stands for "Real-time".


I can't speak for any other engineer anywhere else (where I'm sure they synchronously send out email because they haven't learned how to do it better), but there's not a chance GitHub does this at their scale, and they're built on Rails. Rails gives you async mail by default and you just have to plug in a queue adapter for your worker processes to consume.

e.g. GitHub gets this request, queues the job in Redis/ZeroMQ/SQS or whatever they're using, and another process dedicated to sending those emails (or jobs with that priority) does the rest of the work.

This is a massively common pattern in the Rails world and is as trivial to configure as your database connection.
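
A rough sketch of that shape, assuming an in-process `Queue` as a stand-in for a real broker and a hypothetical `EMAIL_BY_USER_ID` table; it is not Rails' mailer API and not what GitHub runs. The job carries only the user id, and the worker blocks on the queue rather than polling.

```ruby
# In-process stand-in for Redis/SQS/etc. (illustrative only).
JOBS = Queue.new
SENT = []                                         # records deliveries instead of using SMTP
EMAIL_BY_USER_ID = { 42 => "owner@example.com" }  # hypothetical user table

# Request path: enqueue and return immediately.
# Only the user id is queued, never an address from the request.
def request_password_reset(user_id)
  JOBS << { type: :password_reset, user_id: user_id }
end

# Worker: Queue#pop blocks until a job arrives, so nothing is polled.
def run_worker
  Thread.new do
    while (job = JOBS.pop) != :stop
      address = EMAIL_BY_USER_ID[job[:user_id]]  # resolved from stored data
      SENT << address if address
    end
  end
end

worker = run_worker
request_password_reset(42)
request_password_reset(7)   # unknown user: nothing is sent
JOBS << :stop
worker.join
puts SENT.inspect
```

Because the worker resolves the recipient itself, a bad address in the request cannot reach the mail step.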


Or maybe someone wrote a validate_user_and_email() API and someone else wrote a send_password_reset_email() in a context where they lack access to the user DB, so they just validate the {username, email} and send to the attacker-provided email address.

The first case (yours) is a plain bug, while the latter (mine) is an architecture bug. Architecture bugs often arise out of organizational bugs.


We use three versions of the email address internally: the exact verified address used at signup or the last valid email change, a normalized version of that (for identity) without + mailboxes, lowercased, de-accented, stripped of dots and other inert punctuation, and normalized in a number of other ways... and then of course the email parameter (only used during registration).

We accomplish this with a slightly more restrictive version of the standard ABNF provided in the RFC.

I guess I should probably document why we go to this trouble, in case somebody gets the brilliant idea to "simplify" it.
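
For readers following along, a partial sketch of the two stored forms, under stated assumptions: `identity_key`, `Account`, `ACCOUNTS` and `register` are hypothetical names, and only lowercasing, "+" removal and dot stripping are shown (no de-accenting and no ABNF validation).

```ruby
Account = Struct.new(:verified_email, :identity_key)
ACCOUNTS = {}  # hypothetical store, keyed by identity key

# Normalized form used only for identity checks, never for sending mail.
def identity_key(address)
  local, _, domain = address.rpartition("@")
  local = local.split("+", 2).first.to_s  # drop any "+tag" suffix
  local = local.delete(".").downcase      # strip dots, fold case
  "#{local}@#{domain.downcase}"
end

# Keeps the address exactly as given for delivery; rejects the registration
# if the normalized form is already taken.
def register(address)
  key = identity_key(address)
  return nil if ACCOUNTS.key?(key)
  ACCOUNTS[key] = Account.new(address, key)
end

first  = register("John.Smith+news@Example.com")
second = register("johnsmith@example.com")
puts first.verified_email
puts second.inspect
```

The second registration is refused, which is exactly the behaviour the replies below argue about.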


> stripped of dots and other inert punctuation

The period thing is a Gmail feature, not a standard. some.email@mydomain and someemail@mydomain most certainly do not deliver to the same mailbox.


It is actually an antifeature, since it is non-standard and leads to various attacks.

For example, a malicious user can register

  some.email@gmail.com
  so.meemail@gmail.com
  someem.ail@gmail.com
  so.mee.mail@gmail.com
  somee.mail@gmail.com
  so..mee.mail@gmail.com
  som..eem.ail@gmail.com
  so.meemail@gmail.com
  somee.mail@gmail.com
with a service, which will (quite reasonably) send a "Welcome" email. That results in a flood of emails to the Gmail user.


I wouldn't go so far as to call it an antifeature--in fact, I wouldn't be surprised if a common use is to allow people to maintain multiple accounts on the same service with one email. It isn't standard, but it's not in violation of any standard--nothing says that the server must store each distinct valid email in a separate mailbox with its own login. Many servers implement "catchall" emails or aliases which result in the same thing, distinct addresses going to the same mailbox.

What is a problem, and what is non-standards-compliant, is GitHub incorrectly assuming that all mail providers will do this when many do not. It would be no different from assuming that "admin" and "postmaster" go to the same mailbox because that's the way a lot of software is configured.


This seems like it would take a lot of effort to mildly annoy someone.

And the only way it can be automated is if the service doesn't protect itself against automated user sign-ups, which they will either start doing once someone really takes advantage of it, or will result in their domain being categorised as spam once they start sending lots of sign-up emails (either by the user or gmail in general).


> some.email@mydomain and someemail@mydomain most certainly do not deliver to the same mailbox.

That's why we keep your verified mailbox address for sending mail; but there's no good reason to consider them different for the purpose of identity.


> That's why we keep your verified mailbox address for sending mail; but there's no good reason to consider them different for the purpose of identity.

Both of Microsoft's own identity services (AAD and Live ID, and by extension O365 and Outlook.com) recognize (and allow creation of) microcolonel@example.com, micro.colonel@example.com, and mic.rocolonel@example.com as distinct identities/email addresses.

What you're saying is that once the first of (microcolonel|micro.colonel|mic.rocolonel)@example.com registers at github, the other two will no longer be able to do so, but will instead receive a confusing 'you already have an account' error, (hopefully) without being able to receive password reset emails.


...and only one in 50,000 e-mail addresses contain the string "rq5", therefore we strip that string from addresses...

A false positive in these identity checks is likely to be less destructive than a false negative. But I still don't get the point of making up all sorts of rules not in the standard. I have seen both + as well as meaningful dots in e-mail addresses in the wild.


+ and . have been legal in email addresses since the format was standardized.

Google was the first installation I know of to silently swallow periods. Plus-addressing was well-known back in the day, but as far as I know GUI mailers more or less killed the practice by not offering support for it, and web sites written by people too smart to know how to validate email addresses ensured you can't even use them properly anymore.

Ref: http://www.faqs.org/rfcs/rfc822.html, pages 8/9.


You can't assume that '+' has special meaning that can be stripped away.


Why not? On most major hosts it has a special meaning, and otherwise it is a relatively ridiculous thing to just add willy-nilly to your email address. We keep your verified mailbox address, the one you gave us, for sending mail.

I doubt we'll ever turn away a customer by preventing registration of a new account sharing the prefix to a plus sign in their email address with an existing customer.


Senders don't get to dictate how a recipient encodes their addresses.

RFC 822:

The local-part of an addr-spec in a mailbox specification (i.e., the host's name for the mailbox) is understood to be whatever the receiving mail protocol server allows.


I am well aware of that, but I'm comfortable requiring that new customers don't register an account with an email address foolishly designed to resemble another customer's email address in this particular way. We don't throw away their specified mailbox address, we just don't accept registrations which look suspiciously similar, or intended to cause confusion.

I repeat, this has absolutely nothing to do with the mailbox address, where we send mail.


So if on my system we name users by their last name unless that is taken and then we add initials, and so Mike Nesmith (our only Nesmith) gets "nesmith", but we have several Smiths so Norman Edward Smith is given "n.e.smith", then only one of them can use your service?


So, when I registered for my primary email account (many, many years ago), Firstname.Lastname@provider was already taken, so I took FirstnameLastname@provider.

Are you suggesting I shouldn't be allowed an account with you if the person who beat me to my preferred email address also beat me to registering with you?


I can agree with this, except maybe for dot stripping. While it will block some legal mail addresses, it's generally worth it, and accidental collisions are close to impossible in practice.

It's like deciding not to allow quoting, and with it whitespace. Sure ":"@example.com is a legal mail address (surprised?) but nothing good will come from allowing it.


You can't decide for yourself what the semantics of someone else's address do or do not mean.


Yes they can! It's their prerogative to allow people to sign up or not with any email address, full stop.

It might make them bad netizens and you may not like it. But the spec doesn't compel any behavior. It's just a way to communicate technical ideas and ideals.


Precisely. This sketch always comes to mind https://youtu.be/hNoS2BU6bbQ

You're under no obligation to accept names that are hard (as in painful) for you (except if you're the police, I suppose. Ironically)


This is paltry nonsense. Gmail and similar host users are used to treating the dots as decorative, and will register with john.smith@ then try to use "Forgot username?" with johnsmith@ instead. They'll end up with three GitHub accounts registered to the same mailbox and be confused as heck about how they're still getting email notifications for an account that they can't recover a password for.

You can't break user expectations and mental models by pointing to the spec as justification. The spec exists to serve users, not the other way around.


Sure, and meanwhile if there are collisions without periods on that domain (randy@somewhere, r.andy@somewhere and rand.y@somewhere, for instance), only one of them gets an account.

In the absence of being able to count on specs, I guess the user should expect a race?


The fact that we just now found out that this is the case, yet would be completely unable to find anyone complaining about it except in hypothetical terms, tells you exactly how important it is.


This stuff just causes your business to mysteriously have low customer satisfaction. You're doing 95% as well as the successful business but for 5% of your customers these "unimportant" problems make it awful to deal with you and they stay away and tell others.

Most of them can't specifically point to the problem, their impression is just that your services don't work properly. They're right.


How would they complain to you? How would they know what the problem is?

Assuming they don't just fail to register and move on ...


> Why not? On most major hosts it has a special meaning

So at best you can special case for those "major hosts" and not apply such treatment to any other domain.

The RFCs for email don't say "uhh dude whatever, just check what gmail does and maybe hotmail too lol". You are playing fast and loose with these things.
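
A sketch of what that special-casing could look like. The `DOMAIN_RULES` table and `identity_key_per_host` are hypothetical names, and the single gmail.com entry encodes only the dot and "+" behaviour described in this thread; any other domain keeps its local part untouched.

```ruby
# Per-domain normalization rules (illustrative; one entry only).
DOMAIN_RULES = {
  "gmail.com" => ->(local) { local.split("+", 2).first.to_s.delete(".").downcase }
}.freeze

# Default: leave the local part alone, since only the receiving host
# knows what it means.
KEEP_LOCAL_PART = ->(local) { local }

def identity_key_per_host(address)
  local, _, domain = address.rpartition("@")
  domain = domain.downcase  # domain names are case-insensitive
  rule = DOMAIN_RULES.fetch(domain, KEEP_LOCAL_PART)
  "#{rule.call(local)}@#{domain}"
end

puts identity_key_per_host("Rob.By+x@GMAIL.com")
puts identity_key_per_host("rob.by@example.com")
puts identity_key_per_host("robby@example.com")
```

With this, rob.by@ and robby@ collapse only on a host known to treat them as one mailbox and stay distinct everywhere else.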


A lot of the things you describe are Gmail proprietary extensions, especially the dot stripping. While rare, having two mail addresses that differ only in dots is possible, especially with some older email addresses.


It's very unlikely but still conceivable that robby@ and rob.by@ could be two different people at the same domain.


Our team at Wisdom has done the exact same. It's been confusing at times, but much better once we got going.



