Kernel Log: Coming in 3.2 (Part 4) – Infrastructure
by Thorsten Leemhuis
Changes to the memory subsystem promise improved response times and performance. From Linux 3.2, device-mapper supports thin provisioning and uses it to offer improved snapshot functionality.
Just before last weekend, Linus Torvalds released the fifth pre-release version of Linux 3.2. In his release email, he expressed some disappointment about the increase in commits since RC4 compared to the second and fourth release candidates. Torvalds says that there's "nothing really scary" in RC5, noting that the changes tend "to be pretty small, and many of them are solid regression fixes".
Torvalds has not yet given any indication of an expected release date for kernel 3.2. However, many kernel developers will be away from their keyboards over the Christmas and New Year period; to keep the Linux 3.3 merge window from falling within this period, the next major Linux version is unlikely to be released before early January. The Kernel Log will nonetheless complete the "Coming in 3.2" series before Christmas. Following articles on new features in the areas of network drivers and infrastructure, filesystems, and architecture and processor support, this article is concerned with other kernel infrastructure. The series will conclude with an article on drivers.
Memory management
The writeback code now more efficiently throttles programs which generate large volumes of data for writing to storage media (see e.g. 1, 2, 3, 4, 5). This should ensure that the system reacts to user interaction quickly and does not become overloaded by trying to cache large volumes of data, such as can occur with dd when writing to a very slow disk. In the release email for 3.2-rc1, Torvalds noted that, although these changes were very small, they had the potential to be noticed by all users. There are, however, still some situations in which, because the memory management is busy, the system appears to the user to be reacting slowly. Mel Gorman has developed changes (see e.g. 1, 2) to fix a frequent problem relating to transparent huge pages (THP), background information on which can be found in this LWN.net article.
The new cross memory attach reduces overhead when communicating via the Message Passing Interface (MPI). Further information can be found in the LWN.net article Fast Interprocess Messaging and the description of the new function call. Changes to the SLUB allocator and vmscan code should improve kernel performance in some situations, as evidenced by benchmark figures in the commit comments. The same is true of mremap support and TLB optimisations in the transparent huge pages (THP) code.
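Cross memory attach is exposed to userspace as the system calls process_vm_readv and process_vm_writev, which copy data directly between the address spaces of two processes. The following is a minimal sketch of calling the read variant from Python through ctypes; the helper name read_remote and the IoVec class are illustrative assumptions, and the code presumes a Linux system whose C library exports the wrapper and a caller with permission to access the target process.

```python
import ctypes
import os


class IoVec(ctypes.Structure):
    """Mirror of the C `struct iovec` from <sys/uio.h>."""
    _fields_ = [("iov_base", ctypes.c_void_p),
                ("iov_len", ctypes.c_size_t)]


def read_remote(pid, address, length):
    """Copy `length` bytes starting at `address` in process `pid`.

    Hypothetical helper: a thin wrapper around process_vm_readv(2).
    """
    libc = ctypes.CDLL(None, use_errno=True)
    fn = libc.process_vm_readv
    fn.restype = ctypes.c_ssize_t
    fn.argtypes = [ctypes.c_int,
                   ctypes.POINTER(IoVec), ctypes.c_ulong,
                   ctypes.POINTER(IoVec), ctypes.c_ulong,
                   ctypes.c_ulong]

    buf = ctypes.create_string_buffer(length)
    local = IoVec(ctypes.addressof(buf), length)   # destination in this process
    remote = IoVec(address, length)                # source in the target process

    copied = fn(pid, ctypes.byref(local), 1, ctypes.byref(remote), 1, 0)
    if copied < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return buf.raw[:copied]
```

An MPI library would use calls of this kind to move a message from the sender's buffer to the receiver's in a single copy rather than staging it in shared memory. For experimentation, the helper can be pointed at the calling process itself by passing os.getpid() and the address of a local ctypes buffer.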
Also proposed for merger was frontswap, which can be used by programs such as Xen and Zcache to save cached data in Xen's transcendent memory or compressed in memory; the feature is described in more detail in this LWN.net article. Several kernel hackers spoke out against its merger, with a summary of the state of play from the main coder failing to shift their viewpoint. Following a somewhat capitulatory sounding email, Konrad Rzeszutek Wilk has volunteered to help get frontswap in shape for merger.
Virtualisation
The front and backend drivers for accessing storage devices under Xen virtualisation now support discard, enabling hosts to determine when storage is freed up by guest systems (1, 2).
There have been a number of minor enhancements to the KVM hypervisor. The native KVM tool has once again been proposed for merger. A script which allows a virtual machine to be launched with Qemu with minimum effort (to try out a newly built kernel for example), proposed as an alternative to the native KVM tool, also made a first serious merge attempt. Neither approach was included in this development cycle, but the developers behind both solutions will likely try again for Linux 3.3.
Device-mapper
From 3.2, device-mapper contains an experimental persistent data library, a framework for storing metadata for device-mapper targets. Various targets will rely on it in future, as expressed in the documentation. One of the first device-mapper targets to use the new infrastructure is dm-thin, which enables functions for thin provisioning of storage, allowing more storage to be exported than is actually present.
Dm-thin also adds improved snapshot functionality, which makes more efficient use of storage space, and is expected to deliver decent performance even when multiple snapshots are generated on a disk (for example as backups). This code is marked as experimental. Background information can be found in the documentation and in a page on dm-thin for Fedora 17, which will incorporate this feature. Using it requires new userspace tools.
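The idea behind thin provisioning and space-efficient snapshots can be illustrated with a small toy model. This is a conceptual sketch only and not how dm-thin is implemented: the ThinPool class, its method names and the block granularity are all assumptions made for illustration. Physical blocks are allocated on first write, and a snapshot shares blocks with its origin until one of them is overwritten.

```python
class ThinPool:
    """Toy model of a thin pool (illustrative, not dm-thin's design)."""

    def __init__(self, physical_blocks):
        self.free = list(range(physical_blocks))  # unallocated physical blocks
        self.data = {}    # physical block -> payload
        self.refs = {}    # physical block -> number of volumes mapping it
        self.maps = {}    # volume name -> {virtual block: physical block}
        self.sizes = {}   # volume name -> advertised size in blocks

    def create_thin(self, name, virtual_blocks):
        # Creating a volume consumes no physical space at all.
        self.maps[name] = {}
        self.sizes[name] = virtual_blocks

    def snapshot(self, origin, name):
        # A snapshot copies only the mapping; the data blocks are shared.
        self.maps[name] = dict(self.maps[origin])
        self.sizes[name] = self.sizes[origin]
        for phys in self.maps[name].values():
            self.refs[phys] += 1

    def _allocate(self):
        if not self.free:
            raise RuntimeError("pool is out of physical space")
        phys = self.free.pop()
        self.refs[phys] = 1
        return phys

    def write(self, name, vblock, payload):
        if not 0 <= vblock < self.sizes[name]:
            raise IndexError("write beyond end of volume")
        mapping = self.maps[name]
        phys = mapping.get(vblock)
        if phys is None:
            phys = self._allocate()          # first write: provision a block
        elif self.refs[phys] > 1:
            new = self._allocate()           # shared block: break the sharing
            self.refs[phys] -= 1
            phys = new
        mapping[vblock] = phys
        self.data[phys] = payload

    def read(self, name, vblock):
        phys = self.maps[name].get(vblock)
        return self.data[phys] if phys is not None else b""

    def used(self):
        return len(self.refs)                # physical blocks in use


if __name__ == "__main__":
    pool = ThinPool(physical_blocks=4)
    pool.create_thin("vol", virtual_blocks=1000)  # far larger than the pool
    pool.write("vol", 0, b"v1")
    pool.snapshot("vol", "snap")
    pool.write("vol", 0, b"v2")
    print(pool.read("snap", 0), pool.read("vol", 0), pool.used())
```

In this model a volume may advertise 1000 blocks while the pool holds only four, and taking a snapshot costs no extra data blocks until the origin or the snapshot is modified.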
Dm-bufio is another new and experimental device-mapper function which can be used as a disk cache. Device-mapper now also forwards information on whether the underlying disk has a non-rotating storage mechanism (e.g. an SSD). This information can allow faster booting.
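Userspace can read the rotational flag of a block device from sysfs, where it appears as the file queue/rotational beneath the device's directory in /sys/block. The sketch below shows one way a tool might check it; the function name is_rotational and the sysfs_root parameter are illustrative assumptions, the latter existing only so the function can be tried against a test directory.

```python
import os


def is_rotational(device, sysfs_root="/sys"):
    """Return True if the kernel reports `device` as rotating media.

    `device` is a kernel block device name such as "sda" or "dm-0".
    """
    path = os.path.join(sysfs_root, "block", device, "queue", "rotational")
    with open(path) as flag:
        # The file contains "1" for rotating disks and "0" otherwise.
        return flag.read().strip() == "1"
```

A boot-time readahead service, for example, could call is_rotational("dm-0") to choose a different preloading strategy when the device is backed by an SSD.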