Honeywords: Making password-cracking detectable (csail.mit.edu)
102 points by newscasta on May 6, 2013 | hide | past | favorite | 64 comments


If the honeychecker system has to be secured in order to guarantee that the whole system is secure, why not store the encrypted passwords there in the first place? It seems like manufactured complexity.


If you keep the encrypted passwords in one "secure" place and the honeychecker in another "secure" place, then the system is secure unless both systems are compromised.

It's classic engineering redundancy.


Except that:

A. It doesn't do a lot to 'secure' the password credentials (in the way most people think of the term). It just tells you that someone tried to log in with a honeyword. What happens then is a difficult process.

B. It's only belt-and-suspenders redundant to the extent the difficulty of cracking the honeychecker server is independent of the regular login server. It's certainly beneficial that it has a much simpler API, but if your honeychecker is just a different Ruby gem hosted on a different Linode (for example) the benefits are lessened.


If the honeychecker is compromised you are no worse off than without a honeychecker (unless it lulls you into a false sense of security) but as long as it's not hacked it adds detectability. Not sure it's worth the effort but it's an interesting idea.


Indeed. Their FAQ says as much in question 6, and question 7 tries to make maintaining a separate (actually secure?) honeychecker sound like a worthwhile pursuit, because attackers who don't bother to look at your login code after compromising your box can sometimes be detected. It seems like a bit of a longshot, though. I would expect the attacker to be at least a little curious as to why each user has 20 passwords.


The vulnerability surface of the honeychecker may also be significantly smaller than that of the principal server.

It's possible that the password compromise is effected through other means as well (backups, database compromise, etc.), in which case the honeychecker can help validate this.

A similar approach would be to have sentinel accounts which aren't ever accessed, and which would trigger alerts if they were.


Since the whole system turns on having a hardened system which can distinguish honey from true passwords, it seems like you could get a significantly improved level of security by replacing it with a hardened system that hashes passwords with a secret, long pepper, which vastly increases the amount of time necessary to crack even trivial passwords.

The alarm trip is nice, but if you assume that you have a hardened black box that you can put data into and get data out of, you could use that to increase the complexity of the given hashes to a degree that they would be uncrackable (offline) in the first place, and standard throttling (and alarming) makes online brute-force attempts trivially detectable and blockable.

Assume that you have a user whose password is "1234", and the salt is "5678". Let's assume the system is naive and uses SHA1 for hashing.

Given these two pieces of information, that password is trivially crackable. Not a problem. However, let's now assume that we take our password and feed it to our black box, which hashes it with a 1024-bit pepper, and feeds the resulting hash back to the web app. Assuming the pepper's secrecy is protected, the password's complexity is now so ridiculous that even with all the computing power in the world, you wouldn't be able to brute-force the hash for the password "1234" before our sun explodes.
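The black-box idea above can be sketched in a few lines of Python. Everything here is hypothetical (the names, the 1024-bit pepper size, the use of HMAC-SHA256 as the keyed hash); it's only meant to show why the pepper makes offline cracking infeasible:

```python
import hashlib
import hmac
import os

# Hypothetical "black box": it alone holds a long secret pepper and
# returns a keyed hash of the salted password. The pepper never
# leaves this process.
PEPPER = os.urandom(128)  # 1024 secret bits, kept off the web tier

def peppered_hash(password: str, salt: str) -> str:
    """Keyed hash of the salted password; useless to crack without PEPPER."""
    return hmac.new(PEPPER, (salt + password).encode(), hashlib.sha256).hexdigest()

stored = peppered_hash("1234", "5678")
```

Even with the hash and the salt in hand, an offline attacker guessing "1234" gets nowhere: every guess would also require guessing the 1024 secret bits.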

Now your passwords are mathematically protected against breach, which is arguably more valuable. With honeywords alone, an attacker can take my email address and my six honeywords plus real password and plug them into various sites trying for a hit: those other sites won't know which are the honeywords and won't trip any alarms when the wrong ones are used, and given the tiny number of passwords to try, attackers are sure to get hits. Why not just prevent them from ever discovering a usable password in the first place?

Or, if you wanted the honeypot aspect to be preserved, make your black box take a password, perform some transformation on it, then hash it. For each user, we'll generate and store (in the black box) a one-time pad that we use to transform (/encrypt) their password, then hash it. Password complexity isn't affected, but without the pad, the original password is unobtainable. My bad user uses a 4-character password "abcd" which maps to "hvz5", which I then hash. An attacker who steals the hash list can brute-force "hvz5" easily, then attempt to log in with it, and my system can detect that the attacker is attempting to log in with a transformed (and therefore, obviously derived) password and raise the alarms. The system allows for detection of compromised logins without ever risking exposure of the user's actual password to an attacker.


Argh. The whole concept of "peppers" begs the question. If you can keep attackers from learning a "pepper", keep the attackers from learning the password hashes too. Problem solved.

The reality is that attackers learn password hashes when they pop a database. They don't get them from XSS attacks and they don't get them from CSRF bugs. When attackers pull password hashes, they've got the database. When attackers get your database, they have your server. It's over. Stop pretending. Your database server has vulnerabilities, even in the (extraordinarily unlikely) event that you've configured it perfectly so that the ability to issue a SQL query doesn't hand over the filesystem to an attacker. But more importantly, your app server has a whole new class of vulnerabilities once it can't trust the database anymore.

People who talk about relying on secret keys in their password hashing schemes are doing the world a disservice. Just use a strong adaptive hash and move on.


I agree - I prefaced the whole thing with the assumption that we have a hardened black box that we want to split authentication responsibilities with and can trust to not be compromised, since that's a fundamental assumption of the honeywords concept. If the whole thing is compromised, it's game over, no matter how your auth scheme is split up.


> When attackers get your database, they have your server.

Could you please link me to more info on this escalation vector?


I don't have it in the context of an escalation, but if they can get the database to make a new stored procedure,

http://www.mssqltips.com/sqlservertip/1263/accessing-the-win...

then they can usually do a very large number of things.


>If you can keep attackers from learning a "pepper", keep the attackers from learning the password hashes too. Problem solved.

Isn't it easier to protect one secret than to protect millions that have to be connected somehow to an internet-connected database?

Your pepper could be completely off-internet requiring physical access. That's not likely to be true of your password hashes. You'd likely have to wait a couple of days to get an account verified this way so that passwords submitted could be physically moved to the "pepper pot" [I made that up] and peppered and hashed codes could be returned.

"If you can take one step, why can't you take 10 million? Problem solved."

Tenability of such a scheme could vary!


No, your "pepper" cannot be completely off-internet requiring physical access, since it's required to actually validate an incoming password submission against the password hash.


If you are running it through an HSM, wouldn't that accomplish this goal? Take a code, encrypt with HSM, then do hashing, etc. You would have to hack the server then compromise the HSM in order to do an offline attack.


"Tenability of such a scheme could vary!"


I believe the tenability of your scheme varies from "unpossible" to "suitable for sites with such low traffic that the sysadmin can manually verify every login".


This is such a simple, yet powerful idea. What an elegant way to add another layer of security to a system. It's very easy to generate long random passwords that no user would create on their own. Then to populate the database with a large number of these phony passwords. There would be a small hit based on the amount of space the honeyword passwords occupy. However, the amount is manageable, and the benefits are well worth it.


In section 5 they talk about generating the passwords, but I think taking them from existing known password databases (excluding those that are within a typo of the real password) would make them more likely to get hits. It would also make them almost impossible to distinguish from the real user passwords, since they would in fact be real user passwords.


long random passwords that no user would create on their own

Really? You see the ones my IronKey comes up with... I have about 140 of them. IMO anyone using a password manager is pretty likely to be generating long, random passwords, or they're Not Doing It Right.


I am working on a replacement for password managers. Type a password, then [Ctrl] + double click the field to hash it. Even if the database is compromised, an attacker is unlikely to assume your plaintext password is a base64 hash. http://deckar01.github.io/SHA512JS/


Personally, I suspect the Right Way to do this is for W3C to standardize a special input field something like:

    <input type="passhash"/>
which looks like a normal password entry field but automatically does some clever hashing on the client to create a per-site password.

Of course, you can still get keylogged if you use a public computer or whatever.
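The client-side derivation that `type="passhash"` would do could be sketched like this (a hypothetical scheme, not any proposed standard; a deployed version would use a slow KDF rather than a single HMAC):

```python
import hashlib
import hmac

def per_site_password(master: str, domain: str) -> str:
    """Derive a deterministic, site-specific password from one master secret.

    Because the domain is mixed in, a leak at one site reveals nothing
    about the passwords used at any other site.
    """
    digest = hmac.new(master.encode(), domain.encode(), hashlib.sha256).digest()
    return digest.hex()[:16]
```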


Well, the issue I'm alluding to here is the issue of avoiding collisions. Even the password generator will not create the same long random passwords precisely because they're random and long... So even with a password generator these are still long random passwords that no user would create on their own and therefore do not collide with the honeyword passwords.

Also, the combination of password generator and honeyword is actually even more secure than either one; in a "Greater than the sum of its parts" kind of way.


It's a good idea, but avoiding collisions with genuine passwords could be tricky, particularly for sites with millions of accounts. If the honeywords are very long and random, then they probably won't collide, but they will be obviously different from a typical user's password. If you add a standard prefix or an extended character that is disallowed for real user passwords, that avoids collisions but also makes the honeywords easy to filter out. If you generate the honeywords by swapping around the characters in a real password, then there's a danger that a user could set off a false alarm with a simple typo in their password.


The generator shown will produce random variants on passwords it's fed, so it shouldn't be easy to filter them out. False alarms should be easy to avoid - stick 100 honeypot passwords in the database, and set off the alarm if say, 10 of them are tried within a week. Obviously those numbers can be tweaked according to password strength, number of users, etc.

Edit: It's actually talking about a specific set of honeywords for each user. So when the password is chosen, you generate a set of honeywords that are quite different from that password, and can't be reached by mistake.


The way the system is proposed, each user would have an exclusive list of honeywords for just them. Each user would have, say, 10 possible passwords. The password system would recognize all of them as correct for that user, but only 1 of the 10 won't also trip an alarm in a secondary system. The honeyword-generating system avoids collisions with that specific user's password when generating that user's honeywords, which is all that is needed.
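The split described above can be sketched as follows (all data is hypothetical, and passwords are shown in plaintext for brevity; the real scheme stores hashes). The point is that neither host alone identifies the genuine password:

```python
# Login server: each user has several "sweetwords"; only one is genuine.
sweetwords = {"alice": ["blue42", "red17", "tango9"]}   # hypothetical data

# Honeychecker: a separate hardened host that knows ONLY the index of
# the genuine password, never the passwords themselves.
honeychecker_index = {"alice": 1}                       # "red17" is real

def login(user: str, password: str) -> str:
    words = sweetwords[user]
    if password not in words:
        return "reject"          # ordinary failed login
    # Ask the honeychecker whether this sweetword is the real one.
    if words.index(password) == honeychecker_index[user]:
        return "accept"
    return "ALARM"               # honeyword used: the password file leaked
```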


I didn't go to MIT, so I may just be talking out my ass here, but here goes.

While I think this is a good step in the right direction, there is no technological solution for the real problem:

AVERAGE USERS ARE IDIOTS

They use dictionary passwords. They write them down on post-it notes. They re-use passwords.

Try as we may to enlighten people. They are still stuck in their ways. The only thing we can really attempt to do is force people to add complexity (#1).

Security is only as strong as its weakest link. I'm glad that companies take my security seriously, and I hope this idea catches on to strengthen that security. But it's only a small step overall.


Users aren't idiots. They're trying to get shit done, and these stupid and arbitrary rules about passwords are hindering their ability. They're stuck in their ways because holding the computer's hand is a waste of time.

Forcing them to add complexity has already failed, because fundamentally it's not a humane solution to the problem.


This times a lot. I'm far from a naive user, but I recycle a handful of easily memorable passwords for most of the services I subscribe to. And I'm willing to bet that most of you do too.

Security is a tradeoff, and we tend to forget that complex, unique passwords have a very real cost. They drastically increase the risk that I will lock myself out of some service, and dramatically increase the workload of authorizing myself when I do want to use it. With a few obvious exceptions, this tradeoff is a huge net loss for the average user.

In other words, the risk of some stranger wanting to post as me in HN is acceptably small. The cost of having to install KeePassX, then download my passwords file from Dropbox then decrypt and copy/paste, then make sure the paste register is clear, then securely delete my passwords file every time I borrow a friend's computer is prohibitively high.


Just write your passwords down in a safe place.


Real question: assuming you live alone, is 'next to your computer', e.g. in your own home, considered a safe place? What's the risk of a physical robber stealing bits of paper next to your computer desk and then hacking your accounts?


"A safe place" really depends on your threat model.

If you live in a place where home entries by persons with an interest in your online accounts are common, then no, your home would not be a safe place. This could include: living under an oppressive nondemocratic regime, living in a democratic regime with broad search and investigation rules, living with your snooping parents, having an ex with (authorized or otherwise) access to your home, roommates, roommates' friends, being a highly social person hosting parties and not being able to secure your computer area.

Among others.

A friend tells of doing consulting work with a national diplomatic corps in a foreign country, using his personal Linux laptop. On exiting the country, he had the device scanned by a known, trusted, and competent security expert. Several surveillance mechanisms were detected.

The offices of faculty and staff at major universities associated with that foreign country are also subject to surveillance software, according to the same source. Those offices and the buildings they are in, as well as the associated computer networks, are generally readily accessible.

Computers are complex enough, even for sophisticated users, to be difficult to secure completely.

An advantage of physical, nondigital records of passwords is that they provide a much smaller attack surface. Computers (especially always-on systems) can be attacked from anywhere on the Internet (at least in theory). A slip of paper concealed in some out-of-the-way place in your home is much less likely to be found, though unless it's encrypted, it's much more likely to be useful if found.


Yes, if your adversary has physical access to your home, your computer, or other methods of installing backdoor software on it then the question of password security is rendered moot.

You can't have a secret and type it on a compromised computer too.


My point wasn't to moot the question.

My point was to put the question in its appropriate context: it really depends on your threat model. And if that model includes those whom you'd prefer not acquire information having ready access to your house, then no, it's not safe.

Similarly: if that's not a problem for you, it's a perfectly reasonable practice.

That said: I'd probably try to find a slightly more obscure and/or secure location than in plain sight.

Your threat model matters. It includes possible attackers, their modes of access, likeliness of access, the assets you're trying to protect, and how they might be used in ways damaging to you. Any significant discussion or assessment of security should be framed in this context, and it's very much generalizable beyond online, electronic, or data systems.

http://en.wikipedia.org/wiki/Threat_model


I completely agree it depends upon your threat model. But I find the term 'threat model' isn't that useful when a simpler answer is possible. The term is great for leading you to ask more questions.

One huge advantage of the paper system is people have thousands of years of collective experience dealing with the security of paper documents. For example, the 4th Amendment to the US Constitution reads "The right of the people to be secure in their persons, houses, papers, and effects, ... shall not be violated...".

The question "is it safe on my desk next to my computer?" illustrates how users tend to discount their own experience and common sense once computer security gets involved. This is certainly reasonable from the users' perspective. We've all had absurdly counterintuitive experiences with computers, heard astonishing stories about hackers, and gotten plenty of nonsensical advice from the 'experts'.


I find the term 'threat model' isn't that useful

It's domain-specific language. It is a model. Of your threat profile. Of risks, exposures, etc. Understand the concept, it's useful.

One huge advantage of the paper system ... Paper has many advantages. I own a great deal of paper. I love paper. It's tremendously stable.

It's also hard to search, expensive to duplicate, and carries a risk of single-copy loss. Even misplacing (without destroying) a document can be a crisis.

Those are all parts of the paper threat model.

Your mention of the fourth amendment brings up a great many other issues, and I won't discuss them, but generally pointing in the direction of:

Are electronic records "papers", and in what contexts and locations are they treated as such?

Do protections against unreasonable search protect against warranted searches? Or warrantless searches?

As for practical experience: I've had some in the areas of which I write here.


The risk, I suspect, is considerably LOWER than having a password guessed by a brute-forcing script on the Internet.

Thievery isn't typically a frequently recurring operation, and when it occurs, the user is apt to notice.


The category of attackers who are willing and able to commit a (possibly violent) physical crime to obtain one or more of my login passwords is much smaller than those who will successfully exploit weak and re-used passwords over the internet.


Well, consider that if an attacker has physical access to your computer there's a pretty good chance you are screwed even if your password isn't written on a sticky note under the keyboard.


And you're much more likely to be aware of the intrusion.


Maybe store it with the other bits of paper in your wallet?


Secure password safe anyone? Keepass...


Forcing them to add complexity has already failed

there is nothing extra the user has to do, which is explicitly stated in the paper.


Sorry, infelicitous language on my part. Read that as:

  Forcing the users to add complexity ...
I think the honeyword concept is very clever. The idea that the same broken security paradigm ONLY THIS TIME EMBIGGENED will work, where it's been doing nothing but failing until now, seems like magical thinking.


This doesn't try to solve any of those problems - it's basically a honeypot system to detect when a password file has been leaked and logins are being extracted (and used) from it.


There's forcing users to add complexity.

And there's recognizing that a known password, or knowable password, is a poor password.

There are existing lists of millions of known passwords recovered from various hack attacks and break-ins. Password weakness is highly skewed, with the weakest 10, 50, 100, 500, 1000, ... etc., catching progressively more user accounts.

Instituting systems to check against known passwords, or even subsets of the list, when generating an account, on login, or even with regular attempts against known accounts to reject poor passwords, force password changes, or identify poorly-secured accounts is within the realm of technical feasibility. It's functionality which should (in the "would be a good thing to", not "it does" sense) exist in common development toolkits.
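The registration-time check is trivial to sketch (the list here is a tiny stand-in; a real deployment would load one of the multi-million-entry leaked lists):

```python
# Reject any candidate password found on a leaked-password list.
COMMON_PASSWORDS = {"123456", "password", "qwerty", "letmein", "iloveyou"}

def acceptable(candidate: str) -> bool:
    """True if the candidate is not a known, already-cracked password."""
    return candidate.lower() not in COMMON_PASSWORDS
```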

Better would be obsoleting password-based authentication in favor of key-based challenge-response, but support of this within browsers (yay! centralized implementation point), apps (oh crap....), and other tools (more oh crap) is still years off (despite being called for for years).

There's still the matter of master security over a device or authentication system, but we can get away from the present problem of users sharing weak passwords across large numbers of sites.


It's a delicate balance between accessibility (to your site and product) and good security practices (forced periodic password resets, force the use of special characters in passwords, etc).

The approach I like best so far is screening for the 1,000 or so most common passwords during account registration. Lists of these passwords are available from the Twitter Most Common Password list[0] or other leaks. If the user picks a password from this list, then I politely ask them to pick another password.

Optional: Suggest passwords to users! Write a simple generator that mixes two inoffensive dictionary words, a few numbers and a symbol into a string that should be fairly easy to memorize (ex: 9Nitrogenchair$) to make the password-picking job easier for them.

[0] - http://elementdesignllc.com/2009/12/twitters-most-common-pas...
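A generator in the spirit described (digit + two dictionary words + symbol) might look like this; the word list and symbol set are made up for illustration:

```python
import secrets
import string

# Tiny stand-in word list; a real one would have thousands of entries.
WORDS = ["nitrogen", "chair", "maple", "rocket", "harbor", "velvet"]

def suggest_password() -> str:
    """Suggest a memorable password shaped like '9Nitrogenchair$'."""
    first, second = secrets.choice(WORDS), secrets.choice(WORDS)
    return (secrets.choice(string.digits)      # leading digit
            + first.capitalize() + second      # two dictionary words
            + secrets.choice("!$%&*"))         # trailing symbol
```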


> AVERAGE USERS ARE IDIOTS

Please. Technical users are often idiots as well.


I know I am.


It certainly helps when we use password-based key derivation functions instead of cryptographic hashes in our applications. The difference between md5 and bcrypt with a high work factor is like the difference between Pa$$word01 and t8LIO.>5row8VEgf
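To make the work-factor point concrete, here's a sketch using Python's stdlib (PBKDF2 stands in for bcrypt here purely because it ships with `hashlib`; the iteration count shown is illustrative):

```python
import hashlib

password, salt = b"Pa$$word01", b"0123456789abcdef"

# Fast cryptographic hash: one cheap operation per guess, so an
# offline attacker can test billions of candidates per second.
fast = hashlib.md5(salt + password).hexdigest()

# Password-based KDF: the iteration count is a tunable work factor
# that makes every offline guess hundreds of thousands of times costlier.
slow = hashlib.pbkdf2_hmac("sha256", password, salt, 200_000)
```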


A post-it note is nearly impossible to hack remotely -- only physical access to an unsecured space can compromise a post-it note.

Sniffing a password over the Internet/LAN is far easier.


Maybe I'm crazy, but I really hate PDFs. Why don't people at least provide an HTML or text copy next to their PDF?


That's how academic papers are done. One reason is that it's more complicated to do mathematical equations and such in HTML (let alone plain text) than it is to simply use LaTeX.


I understand the need for proper formatting in academic papers, but it wouldn't hurt anyone to provide a text copy after the fact.

If someone needs a formula they can render the pdf. But a secondary text copy can be more easily indexed, searched and viewed on any kind of text-viewing device... it's just a million times more convenient.


I understand what you are saying but it's a pretty safe bet on today's internet that a browser is capable of handling a PDF.

An online service such as http://www.extractpdf.com/ does a pretty good job of taking the PDF link and extracting the text.


It "hurts" in that it takes time. It's also not just math, but figures as well - diagrams and experimental results. There are automatic Latex-to-HTML programs, but their results are not pleasant. And I have difficulty putting up things that are ugly.


a standard PDF is ugly to the 99% of readers who aren't going to print it out.


That doesn't hold for the people who would read my papers - other researchers. We read pdfs on our screens all the time.


In recent Firefoxes, they get transparently converted to HTML by PDF.js, so I don't tend to care too much either way. There's a little bit of conversion overhead, but then there would also be some conversion overhead if they were using MathJax.


Conversely, I prefer PDF, because, particularly with LaTeX generated content, it just looks way, way better than HTML.


A similarly off-topic question: Why isn't "automatically crop margins" a standard feature on all PDF viewers?

I don't care if it perfectly matches the way the PDF is supposed to look printed out, I just want to read the paper. I can scroll through web pages and flip through PDF pages just fine, but I loathe mixing the two metaphors, so I have to be in "fit page" mode. I'd find entire pages of PDFs perfectly legible on my 11" netbook or 6" e-reader, if only the damned things would crop out all of that wasted blank space and use their entire screens for the content.

No, the zoom feature on most PDF readers is not adequate, because you then lose the ability to progress exactly one page forward and backward at a time (at least, this is the case on every reader I've used). Not to mention, who wants to do that manually every time they want to read a paper?


And I enjoy PDFs on my Retina iPad. Much better than other text representations, almost sharp as on the paper, the screen format matches the paper format, it's easy to move the whole PDF to iBooks for later reading. Perfect.


Mainly because most math/CS papers have a lot of LaTeX in them for equations and symbols, so it makes sense to write the entire paper in LaTeX and then convert to PDF or PS.


That's what "Cheif [sic] Scientists" do


Listen carefully, that is the sound of foreheads being slapped.



