One big difference, to me, is that many of the ML implementations were developed (and are still being developed) by competing academic groups to research a particular technique or problem. From what I can tell, Python and Ruby are not really used for research in this way; most of their forks exist either to port the language to a different platform or interpreter, or to make a faster implementation (e.g. PyPy).