
I'm very curious to hear the use case for which date time parsing was the bottleneck! Also, I'm surprised that the overhead of calling across the language boundary didn't dwarf the gains from parsing...


One of the components in our project was churning through thousands of JSONs per second - deserializing, transforming and serializing them.

These JSONs represented flight information. They included multiple datetimes, such as the scheduled departure/arrival time and the actual departure/arrival time of a flight.

The first bottleneck was JSON deserialization/serialization. At the time we solved it with ujson; now there's the even more performant orjson.
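For reference, a minimal sketch of that kind of hot loop using the stdlib json module (the record layout here is made up; ujson exposes the same loads/dumps names as a drop-in, and orjson is similar except that orjson.dumps returns bytes):

```python
import json

# Hypothetical flight record; the real payloads were richer.
raw = '{"flight": "BA123", "scheduled_departure": "2020-01-15T10:30:00"}'

def transform(line: str) -> str:
    doc = json.loads(line)                 # deserialize
    doc["flight"] = doc["flight"].lower()  # transform
    return json.dumps(doc)                 # serialize

out = transform(raw)
# Drop-in speedup: swap json for ujson, or use orjson
# (decode orjson.dumps output if you need str rather than bytes).
```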

The second bottleneck happened to be datetime deserialization, and we solved it with ciso8601 - luckily, these datetimes were in ISO 8601. But this bottleneck later occurred repeatedly in other components and became the inspiration to write dtparse :)
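For anyone curious what the swap looks like, here's a stdlib-only sketch: strptime is the slow format-string path, and fromisoformat (Python 3.7+) is the stdlib's own C fast path for ISO 8601 strings; the ciso8601 call is shown in a comment and not run here:

```python
from datetime import datetime

ts = "2021-03-04T05:06:07+00:00"

# Format-string parsing (the slow path being replaced):
slow = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S%z")

# Stdlib C fast path for ISO 8601 strings (Python 3.7+):
fast = datetime.fromisoformat(ts)
assert slow == fast

# ciso8601 is called the same way (not run here):
#   import ciso8601
#   ciso8601.parse_datetime(ts)
```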


Wow, orjson is amazing. It even serializes numpy arrays. Thanks!


`pysimdjson` is even better!


I've had this situation a few times. Most recently, transforming large (1-50 GB) CSV files into a format that can be digested by a proprietary bulk DB loader.

Because our problem was just about reformatting, we ended up reading the CSVs in binary mode and using struct to extract the relevant values from the datetime fields. But if we needed to do actual date logic, something like this would perhaps be useful (though there are other fast datetime libraries out there; I've been a fan of pendulum for some tasks).
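A sketch of that struct approach, assuming a made-up fixed-width layout (8-byte date, comma, 6-byte time); real layouts will differ:

```python
import struct

# Hypothetical fixed-width record line.
line = b"20240115,103000,BA123\n"

# "8s" grabs 8 raw bytes, "x" skips the comma, "6s" grabs 6 bytes.
date_raw, time_raw = struct.unpack_from("8sx6s", line)

# No datetime object is ever built; just slice out the integers needed.
year, month, day = int(date_raw[:4]), int(date_raw[4:6]), int(date_raw[6:8])
hour, minute = int(time_raw[:2]), int(time_raw[2:4])
```

Skipping datetime construction entirely is the whole trick: when you only reformat, raw byte slicing beats any parser.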


That makes sense, but I have a hard time believing the approach of calling into a date time parser O(n) times is going to yield a significant performance gain no matter how much faster the parser is. However, I'm being downvoted, so perhaps I'm mistaken?


Sometimes it's about optimizing wall time not algorithmic complexity.

If you have a batch SLA of 1 hour, you're currently spending 50-70 minutes to complete the batch, and 20 minutes of that is spent parsing dates, then cutting that to 5 minutes is a big win.


No doubt, but if your date parsing saves you 1 second per date parsed but each call into the faster library costs 2 seconds, then your performance actually suffers. The only way around this is to make a batch call such that the overhead is O(1).
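To illustrate the amortization point: per-string calls pay the boundary cost n times, while a batch entry point pays it once. Here parse_many is a hypothetical pure-Python stand-in for such a native batch API:

```python
from datetime import datetime

timestamps = ["2021-01-01T00:00:00", "2021-01-01T00:00:01"]

# Per-call: one Python-to-native crossing per string.
per_call = [datetime.fromisoformat(ts) for ts in timestamps]

def parse_many(strings):
    # Stand-in for a native batch API: cross the boundary once,
    # loop over the strings on the native side.
    return [datetime.fromisoformat(s) for s in strings]

batched = parse_many(timestamps)
assert batched == per_call
```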


I’m not going to install it to check, but when someone writes “Fast datetime parser for Python written in Rust. Parses 10x-15x faster than datetime.strptime.” it seems reasonable to assume that this is not the case.


Depends on whether or not the parent is including the overhead in their statistic. Misinformation about microbenchmarks is hardly a rarity.


In a language like Java where you mostly spend time in the VM and only occasionally jump into native code, that might be true. But in python a huge part of the runtime is this kind of native call. So I would not expect that this approach adds any new overhead.


Your conclusion might be right, but your reasoning is certainly wrong. Calling native functions in Python is often quite expensive because you need to marshal between PyObjects and the native types (probably allocating memory as well). This doesn’t “feel” so slow in Python because, well, everything in Python is slow. But you really start to notice it when you’re optimizing.


Of course "It depends", but in my experience that kind of thing is rare. Either you're passing in str and can just grab the char* out of the existing PyObject, or you have some more complicated thing that was wrapped once in a PyObject and doesn't need to be converted, etc. But sure, if you have some dict with a lot of data and need to convert it into an std::map you'll have a bad time.


My instinct is that the overhead is small. You need to add a few C stack frames and do some string conversion on each call, maybe an allocation to store the result. It’s not going to be as quick as doing in pure Rust, but the python-to-native code layer can be pretty lightweight I think!


Maybe they did it in bulk? i.e. send all the strings over at once, parse them in a loop, send them back. Seems like that would reduce overhead


Right, and that makes sense, but the context here is a date parsing library for Python--unless said library has a batch interface, I'm not sure how that would improve performance, but maybe I'm misestimating something.


Ah, I skimmed over the part where this is a library and not application-code


I've certainly never been bottlenecked on date parsing :) However, many/most of the high-performance Python libraries are written in C and compiled down into something the Python interpreter can use directly. There are lots of Python bindings written in C++ to native C libraries as well; I've used ZeroMQ pretty recently. Rust works the same way: the code is compiled into objects that Python can use directly. It's not like running a JavaScript interpreter in your code.


I have seen it in many cases, especially working on financial data. My most recent example was working with real time feeds of trades, which we used ML models on top of. Inference was based on accumulated volume per fixed amount of time (say 30 sec, 1 min), and the code doing this in real time was python.

I don't remember the numbers, but caching + using ciso8601 was essential to manage the peak load (maybe 50k trades per sec ?).
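A sketch of the caching half, using functools.lru_cache over a stdlib parser as a stand-in for ciso8601 (safe because datetime objects are immutable): bursty feeds repeat timestamps, so most lookups become cheap cache hits instead of parses.

```python
from datetime import datetime
from functools import lru_cache

@lru_cache(maxsize=4096)
def parse_ts(ts: str) -> datetime:
    # Repeated timestamps (common in bursty trade feeds) hit the cache.
    return datetime.fromisoformat(ts)

a = parse_ts("2022-06-01T09:30:00")
b = parse_ts("2022-06-01T09:30:00")
assert a is b  # second call served from the cache
```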



