Every time you map a large set to a smaller set, you are bound to have collisions[1]. In other words, any hashing algorithm that maps arbitrary input data to a fixed-length output will have collisions. Depending on what you're using the hash function for, it just becomes a matter of how feasible it is to either find a collision or find an input that yields a given hash (a preimage).
While that's true, I wouldn't call the pigeonhole principle a remotely similar weakness to MD5's vulnerability to chosen-prefix collision attacks. The feasibility of finding a collision in MD5 has little to do with the number of pigeonholes in MD5.
Not at all. At the risk of repeating myself, I'm merely saying that all such hash functions will have collisions. So the mere existence of collisions doesn't imply weakness; it's a fact of life. All you're left with is to look at the feasibility of finding a collision. We're both saying the same thing IMO.
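To make the feasibility point concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: it truncates SHA-256 to a small number of bits to stand in for a hash with few pigeonholes, and the names `truncated_hash` and `find_collision` are made up for this example. It shows a generic brute-force search, not the chosen-prefix attack on MD5, which exploits MD5's internal structure rather than its output size.

```python
import hashlib
from itertools import count


def truncated_hash(data: bytes, bits: int) -> int:
    """Return the top `bits` bits of SHA-256(data) as an integer.

    Truncation is an assumption made for illustration: it gives us a
    hash with only 2**bits possible outputs ("pigeonholes").
    """
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int):
    """Brute-force two distinct inputs with the same truncated hash.

    Returns (first_input, second_input, number_of_inputs_tried).
    By the pigeonhole principle this stops after at most 2**bits + 1
    inputs; in practice it usually stops far sooner (birthday effect).
    """
    seen = {}  # truncated hash value -> input that produced it
    for i in count():
        msg = str(i).encode()
        h = truncated_hash(msg, bits)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


if __name__ == "__main__":
    # Collisions always exist; what changes is how long the search takes.
    for bits in (8, 16, 24):
        a, b, tries = find_collision(bits)
        print(f"{bits:2d} bits: {a!r} and {b!r} collide after {tries} inputs")
```

The work grows quickly with the output size, so for a full-length hash this kind of generic search is out of reach. A hash is considered broken when someone finds a shortcut that beats the generic search by a wide margin, which is the distinction being drawn above.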
[1] http://en.wikipedia.org/wiki/Pigeonhole_principle
Edit: source