On Thu, Jul 17, 2008 at 10:42 PM, Matthew Fluet <firstname.lastname@example.org> wrote:
> The ideal solution, especially for a situation like yours, where you are
> happy to use lots of memory on a dedicated machine, is to use
> @MLton fixed-heap 3.5G -- to grab as large a heap as you can (that
> comfortably fits in physical memory) at the beginning of the program and
> never bother resizing it. As I understand it, resizing is only to 'play
> nice' with other processes in the system.

Most 32-bit operating systems reserve the upper 1G of the address
space for the kernel. In other words, you will be lucky if you can get
hold of 3G - something. Linux can do some tricks so it can get hold of
more in some situations.

Your understanding of 'playing nice' is correct. When you mmap() you
don't get the memory, but every time you hit a new page, you'll get a
trap to the kernel which assigns a fresh physical page to the virtual
location and resumes the MLton program. In other words, the computer
is not paying for pages it has not yet accessed. Of course, this is
not possible in the "other direction": the kernel doesn't know which
pages we no longer need, so we have no option but to call
mremap()/munmap() to forcibly remove the mapping structure in the
kernel.
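
To make the lazy allocation concrete, here is a minimal C sketch. It is
not MLton's runtime code; the helper names reserve_heap, touch_pages
and release_heap are made up for illustration, and it assumes a
Unix-like system that provides MAP_ANONYMOUS (some BSDs spell it
MAP_ANON).

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve `bytes` of address space. On a demand-paged kernel this only
 * sets up the mapping; physical pages are assigned on first touch.
 * Returns NULL on failure. (Hypothetical helper.) */
static void *reserve_heap(size_t bytes)
{
    assert(bytes > 0);
    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Write one byte into every page of [base, base + bytes). The first
 * write to each page is what traps to the kernel. Returns the number
 * of pages written to. (Hypothetical helper.) */
static size_t touch_pages(void *base, size_t bytes)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t touched = 0;
    for (size_t off = 0; off < bytes; off += page) {
        ((volatile char *)base)[off] = 1;
        touched++;
    }
    return touched;
}

/* Give everything back by removing the mapping itself. */
static int release_heap(void *base, size_t bytes)
{
    return munmap(base, bytes);
}
```

A caller would reserve one large region up front, let the program touch
pages as the heap grows, and only call release_heap when the whole
region can go away.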

On FreeBSD, there is an option to madvise(2), MADV_FREE, which tells
the OS that neither the contents nor the pages are interesting to
the process anymore. This allows you to free up pages in the middle of
an address range while keeping the address range valid. It means that
such physical pages can be used in other processes and that they will
never be paged to disk. The semantics of the Linux MADV_DONTNEED call
are similar, but I wouldn't count on the man page before having read
what the kernel source does. In effect, on FreeBSD, you can just grab
the whole memory and never remap it. When a space is not interesting
to the process you just call madvise(2) on the range. Granted, it only
works for whole pages, but that is no problem in the GC.
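
A rough C sketch of that last step, under some assumptions: the helper
name discard_range is hypothetical, the range is expected to be
page-aligned, and the fallback to MADV_DONTNEED when MADV_FREE is
missing or rejected is my own choice, not something the MLton runtime
is known to do. As said above, check what MADV_DONTNEED really does on
your kernel before relying on it.

```c
#define _DEFAULT_SOURCE /* exposes madvise() and MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Tell the kernel that the pages in [start, start + len) are no longer
 * needed, while keeping the address range itself mapped and usable.
 * Returns 0 on success, -1 on failure. (Hypothetical helper.) */
static int discard_range(void *start, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    /* madvise(2) works on whole pages only. */
    assert((uintptr_t)start % page == 0);
    assert(len % page == 0);

#ifdef MADV_FREE
    /* Preferred where available. */
    if (madvise(start, len, MADV_FREE) == 0)
        return 0;
#endif
    /* Assumed fallback for systems without a working MADV_FREE. */
    return madvise(start, len, MADV_DONTNEED);
}
```

After the call the range can be written to again without a new mmap();
what the old contents look like on a later read depends on the OS and
the advice used, so the GC should treat the discarded pages as garbage.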