Kernel Log: Coming in 2.6.37 (Part 4) – Architecture and infrastructure code
Thorsten Leemhuis
The kernel now includes some components for supporting operation as a Xen host (Dom0). Switching into and out of sleep mode should be accelerated by the use of LZO compression. Following years of work, almost all parts of the kernel are now able to run without using the big kernel lock (BKL).
On releasing 2.6.37-rc6 last week, Linus Torvalds reiterated that he hoped to release Linux kernel 2.6.37 after the holiday period, around the turn of the year. However, things have not been running quite as smoothly as he might like. The same probably applies to the seventh pre-release version, due for release shortly, since a change aimed at improving hardware resource allocation has now been reverted.
Part 4 of our "What's coming in 2.6.37" Kernel Log series provides an overview of these and other changes in platform-specific and infrastructure-related kernel code. Part 1 of the series dealt with graphics hardware-related changes, Part 2 with file system changes and Part 3 with changes to network and storage code. The fifth and final part will be published around the turn of the year and will deal with other drivers and the kernel subsystems in which they are found.
Bye bye BKL
By mentioning it explicitly on releasing the first pre-release version of 2.6.37, Linus Torvalds has focused attention on one particular change: the core of the Linux kernel no longer needs the big kernel lock (BKL). The BKL is a locking method dating back to the early days of multi-processor support in Linux and is aimed at preventing conflicts where shared data structures are accessed simultaneously. At the time it was considered relatively easy to implement, but it operates as a system-wide lock, so while one processor holds it, all other code that needs the BKL has to wait. This adversely affects performance on systems with larger numbers of processor cores and can result in undesirably long system call latencies in real-time environments.
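The difference between one system-wide lock and finer-grained locking can be sketched in a few lines of user-space Python. This is only an illustration of the locking structure, not kernel code: the class names, the `hammer` helper and the counter names are all hypothetical. Python's own interpreter lock also means the sketch shows the design rather than any measurable speed-up.

```python
import threading


class BigLockCounters:
    """All counters share one global lock, loosely like the BKL."""

    def __init__(self, names):
        self._lock = threading.Lock()  # single lock for everything
        self._values = {name: 0 for name in names}

    def increment(self, name):
        with self._lock:  # unrelated counters still wait on each other
            self._values[name] += 1

    def value(self, name):
        with self._lock:
            return self._values[name]


class FineGrainedCounters:
    """Each counter has its own lock, so unrelated updates do not contend."""

    def __init__(self, names):
        self._locks = {name: threading.Lock() for name in names}
        self._values = {name: 0 for name in names}

    def increment(self, name):
        with self._locks[name]:  # only users of this counter wait here
            self._values[name] += 1

    def value(self, name):
        with self._locks[name]:
            return self._values[name]


def hammer(counters, names, rounds=1000):
    """Start one thread per name; each thread increments its own counter."""

    def worker(name):
        for _ in range(rounds):
            counters.increment(name)

    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return {n: counters.value(n) for n in names}


if __name__ == "__main__":
    names = ["inodes", "sockets"]  # hypothetical, unrelated data structures
    print(hammer(BigLockCounters(names), names))
    print(hammer(FineGrainedCounters(names), names))
```

Both variants produce correct totals. The point is that in the first one a thread touching "inodes" can be held up by a thread touching "sockets", which is the kind of unnecessary waiting that removing the BKL is meant to avoid.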
A new option now allows kernels to be configured to be completely BKL-free. This does mean avoiding the handful of drivers and file systems which still use the BKL, but most of these are of no relevance for modern systems, with the possible exception of the UDF file system. Patches to allow UDF, required for some DVDs, to run without the BKL arrived too late to be merged into 2.6.37 and are instead likely to feature in 2.6.38.
Process management and tracing
The scheduler now tries to avoid migrating real-time tasks to other processor cores. When allocating processor time, the scheduler no longer charges the time the processor spends handling IRQs to the currently active process, with the result that the process no longer receives less time than intended.
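The effect of the IRQ accounting change can be made concrete with a toy calculation. The function below is a hypothetical model, not the scheduler's actual code, and the 10 ms and 3 ms figures are invented for illustration.

```python
def charged_runtime_ms(wall_ms, irq_ms, exclude_irq_time):
    """Return the runtime charged to a task for one stretch on the CPU.

    wall_ms: time that elapsed while the task was the current task
    irq_ms:  part of that time the CPU actually spent handling interrupts
    """
    if irq_ms > wall_ms:
        raise ValueError("IRQ time cannot exceed elapsed time")
    # Old behaviour: bill the full elapsed time to the task.
    # New behaviour: subtract the time spent in interrupt handlers.
    return wall_ms - irq_ms if exclude_irq_time else wall_ms


old = charged_runtime_ms(10.0, 3.0, exclude_irq_time=False)
new = charged_runtime_ms(10.0, 3.0, exclude_irq_time=True)
print(old, new)  # 10.0 7.0
```

Under the old accounting, the task in this example is billed for 10 ms although it only ran for 7 ms, so a scheduler that aims for equal shares would hand it less real processor time than its neighbours.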
The '-V' switch causes perf's probe command to list all local variables at probe points; '--externs' allows all accessible global variables to be listed. Also new is basic support for modules in 'perf probe'.
The new jump labelling should further reduce the overhead of the checks used to switch tracepoints on and off; details can be found in a short article on LWN.net. Field testing has, however, shown that this does not yet do quite what it should, so it may yet be deactivated before 2.6.37 hits the streets.
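The idea behind jump labelling can be approximated by analogy in Python. The kernel modifies machine instructions at the tracepoint site; the sketch below merely rebinds a method, and both class names are hypothetical. It contrasts a tracepoint that tests a flag on every pass with one whose call target is swapped once when tracing is toggled.

```python
class FlagTracepoint:
    """Tests an 'enabled' flag on every call, even when tracing is off."""

    def __init__(self):
        self.enabled = False
        self.events = []

    def hit(self, msg):
        if self.enabled:  # evaluated on every pass through the hot path
            self.events.append(msg)


class PatchedTracepoint:
    """Swaps the call target when toggled, so the hot path has no test."""

    def __init__(self):
        self.events = []
        self.hit = self._nop  # disabled: the call site points at a no-op

    def _nop(self, msg):
        pass

    def _record(self, msg):
        self.events.append(msg)

    def enable(self):
        self.hit = self._record  # "patch" the call site once

    def disable(self):
        self.hit = self._nop


tp = PatchedTracepoint()
tp.hit("ignored")
tp.enable()
tp.hit("recorded")
tp.disable()
tp.hit("ignored again")
print(tp.events)  # ['recorded']
```

In Python the no-op call still costs something, so this only shows the shape of the technique: the cost of deciding whether tracing is active moves from every execution of the hot path to the rare moment when tracing is switched on or off.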
Security
The kernel's crypto daemon now includes an AEAD (Authenticated Encryption with Associated Data) interface. It is now possible, via a sysctl setting, to specify whether normal users should have access to the syslog information that dmesg reads, or whether this should be restricted to users with the CAP_SYS_ADMIN capability. This is aimed at hindering attacks which make use of information from syslog.
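Whether a running system has this restriction switched on can be checked from user space. The sketch below assumes the setting is exposed as kernel.dmesg_restrict under /proc/sys; the article itself does not name the sysctl, and the helper function is hypothetical.

```python
from pathlib import Path

# Assumption: the sysctl appears as kernel.dmesg_restrict under /proc/sys.
DMESG_RESTRICT = Path("/proc/sys/kernel/dmesg_restrict")


def dmesg_restricted(path=DMESG_RESTRICT):
    """Return True or False for the setting, or None if it cannot be read."""
    try:
        text = path.read_text().strip()
    except OSError:
        return None  # older kernel, non-Linux system, or no permission
    try:
        return int(text) != 0  # non-zero means unprivileged reads are blocked
    except ValueError:
        return None  # unexpected file content


if __name__ == "__main__":
    print(dmesg_restricted())
```

A result of None simply means the file was not available, which is what one would expect on kernels that predate the option.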
Read restrictions for /proc/kallsyms, intended to make life more difficult for attackers, have now been reverted due to problems with userland applications. Background information on this can be found in the LWN.net article 'Making attacks a little harder'. An implementation that encoded a policy into the kernel was proposed some days after the withdrawal of these changes, but did not meet with wholehearted approval.