
Well, you either serialise/deserialise your data structure to make it memory relative, or you could use mmap().

Allocate such pages with a malloc that works out of an anonymous mmap()-backed pool, with madvise(MADV_DONTNEED).

The OS will then discard the pages under memory pressure and will give you back zero filled pages if they have been discarded.

If each object has a nonzero sentinel value and is smaller than a page, you can detect discarded objects.

Objects bigger than a page can be composed of a 'dictionary obj' which points to multiple objects - check the sentinel on each and discard/rebuild the whole lot if any are missing.

That last bit may be hard to retrofit, I agree.



> Well, you either serialise/deserialise your data structure to make it memory relative

Yes, aka "fix up all the pointers".

> or you could use mmap()

That doesn't help: the objects in the DOM are already allocated on the heap. You just want to hold on to them and then drop those references at some point. Writing a pointer to a heap object to disk, even via mmaped memory areas, isn't going to really work well.

And you can't just drop the objects from the heap because they may have references from elsewhere to them.


> That doesn't help: the objects in the DOM are already allocated on the heap. You just want to hold on to them and then drop those references at some point.

The idea is to have more than one heap, with different behaviour. Objects in one heap have a "can disappear at any time, but we can detect that" behaviour.

An 'object deep copy' is all it takes to move an object from one heap to another. Or if you prefer, you can allocate all such objects in the discardable heap but temporarily 'pin' an object by changing the madvise on it's pages (this involves more page-level complexity of how stuff is laid out and how pages are shared between objects).

I'm not saying this is a 2 hour project, but it's the sort of thing which could be captured in a library without too much complexity.

> Writing a pointer to a heap object to disk, even via mmaped memory areas, isn't going to really work well.

You'd be surprised. I've done exactly this [1]. As long as your mmap'd addresses are stable, it's fine. Also note that in the case we're talking about, the pages never go to disk. They're in anonymous-backed mmap'd memory, which the kernel has been told to throw away instead of writing to disk (swap, in this case, since this is an anon map).

[1] well, pretty close. I made a single-proc app multi-proc by making its key data structures allocate out of a pool controlled by a custom malloc, backed by a mmap()'d disk file. Other processes then attached to that file via mmap() at the same address and lo and behold, all the pointers to the nested data structures were good. (One proc had read/write access to this pool, the others were all readonly. Some additional locking was required. Your mileage may vary.)

You're right in that things won't work if something else has a ptr to these pages. But as long as all references to such pages go via a single "get_page_from_cache()" it's fine.


> You're right in that things won't work if something else has a ptr to these pages.

That's the problem, exactly. In a browser context it's pretty easy for something else (another frame, a web page in another tab or window) to have pointers into the DOM of pages that have been navigated away from. So when a page is evicted from the in-memory cache, some of the information can just be destroyed. The layout objects, say. These are already allocated out of an arena, so the mmap approach may work there; it's worth looking into. But the DOM can't just be unmapped (and in fact is not arena-allocated right now, for the same reason) unless you're willing to pin the whole mmapped area whenever something holds on to those objects, which brings us back to fragmentation issues.


The latter doesn't really solve the problem. Folks are still going to see that Firefox is using a lot of memory, even if the OS is now more inclined to swap out pages that are less important.


I think it solves the technical problem; I agree that there is potentially an issue of perception. (Note that the OS won't swap these pages; it will discard them under memory pressure.)

(Although an argument could be made that the OS need not account pages which a process has marked as DONTNEED as "belonging" to that process.) Insofar as a page "belongs" to a process at all, anyway.



