Massive infrastructure improvements
Performance problem – Small, inconspicuous patches can cause a 30 percent drop in performance in certain benchmark scenarios. This became evident during development of 2.6.25, when a small change to include files in the network subsystem slowed down Tbench on a system with four quad-core CPUs. The problem, caused by unaligned memory access, was caught and corrected in time. To prevent just such problems, developers had shortly beforehand added a document to the Linux source that explains some background on memory access in modern CPUs and describes how unaligned memory access can be avoided.
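To illustrate what "unaligned" means here, the following minimal sketch compares two layouts of the same record using Python's standard `struct` module. It is not taken from the kernel or its documentation; the record (a one-byte flag followed by a four-byte integer) and the helper names are assumptions chosen for illustration, and the native layout is assumed to use a four-byte integer, as on mainstream platforms.

```python
import struct

# Two layouts of the same hypothetical record: a 1-byte flag followed by
# a 4-byte unsigned integer.
# "@" requests the platform's native alignment, "=" a packed layout.
NATIVE_FMT = "@BI"
PACKED_FMT = "=BI"

INT_SIZE = struct.calcsize("=I")  # standard size of the integer field: 4 bytes


def int_offset(fmt):
    """Offset of the integer field: total record size minus the integer itself.

    Assumes the native integer is also 4 bytes wide (true on mainstream platforms).
    """
    return struct.calcsize(fmt) - INT_SIZE


def is_naturally_aligned(offset, size):
    """An N-byte access is naturally aligned if its address is a multiple of N."""
    return offset % size == 0


for name, fmt in (("native", NATIVE_FMT), ("packed", PACKED_FMT)):
    off = int_offset(fmt)
    print(name, "offset:", off, "aligned:", is_naturally_aligned(off, INT_SIZE))
```

In the packed layout the integer starts directly after the flag byte, so reading it is an unaligned access; the native layout inserts padding so that the integer starts on a multiple of its own size.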
Just how much memory an application and the libraries it uses actually occupy should be easier to determine in Linux 2.6.25, thanks to patches integrated into the kernel – among others, 1, 2, 3, 4 – that were developed under the key words maps4 and proportional set size (PSS), along with complementary programs in user space. With older kernels, the data that the kernel provided was often not adequate to obtain accurate information. For example, in a memory usage query, a shared library might be counted over and over again – once for each application that accessed it – even though only one copy of the library was actually present in memory.
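The double counting described above, and how a proportional figure avoids it, can be sketched with a little arithmetic. The example below is a simplified illustration, not the kernel's implementation: the function names and the sample numbers are made up, and the text parser assumes `Rss:` and `Pss:` lines in the style of `/proc/<pid>/smaps` on kernels that provide them.

```python
def pss_kb(private_kb, shared_mappings):
    """Proportional share of memory for one process.

    shared_mappings is a list of (size_kb, number_of_sharers) pairs; each
    shared mapping is charged to the process in proportion to its sharers.
    """
    return private_kb + sum(size / sharers for size, sharers in shared_mappings)


def sum_smaps_field(smaps_text, field):
    """Add up all 'Field:  N kB' lines of the given field in smaps-style text."""
    total = 0
    for line in smaps_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == field + ":":
            total += int(parts[1])
    return total


# Hypothetical process: 1000 kB private data plus a 3000 kB library that is
# mapped by three processes in total.
SAMPLE_SMAPS = """\
Size:               3000 kB
Rss:                3000 kB
Pss:                1000 kB
Size:               1000 kB
Rss:                1000 kB
Pss:                1000 kB
"""

print("Rss total:", sum_smaps_field(SAMPLE_SMAPS, "Rss"), "kB")
print("Pss total:", sum_smaps_field(SAMPLE_SMAPS, "Pss"), "kB")
print("computed :", pss_kb(1000, [(3000, 3)]), "kB")
```

With three such processes, adding up the resident sizes would report 12000 kB, while the proportional figures add up to 6000 kB, which matches the memory actually in use (three times 1000 kB private data plus one 3000 kB copy of the library).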
Most users are not likely even to notice the integration of Preempt-RCU (pre-emptive read-copy update; see also 1), which was developed in the RT tree. This optional new feature makes it possible for the kernel to pre-empt the RCU code when there are more important things to do, which is particularly advantageous in real-time environments where quick and dependable response times are critical.
For larger multiprocessor systems, on the other hand, the new FIFO ticket spinlocks – see also 1 – could be interesting. With spinlock-protected access to resources, competing threads now receive access in the order in which they attempt it (FIFO – first in, first out). Previously, some individual threads got access rather quickly, while threads that had attempted access much earlier sometimes had to wait several times longer.
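The ticket idea can be sketched in a few lines. The following is a user-space illustration only, not the kernel's spinlock code: it blocks on a condition variable instead of spinning, and the class and function names are invented for this example.

```python
import threading


class TicketLock:
    """Toy FIFO lock: each caller draws a ticket and waits for its turn."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next_ticket = 0   # next ticket number to hand out
        self._now_serving = 0   # ticket number currently allowed to proceed

    def acquire(self):
        with self._cond:
            my_ticket = self._next_ticket
            self._next_ticket += 1
            # Wait (rather than spin) until this ticket is being served.
            while self._now_serving != my_ticket:
                self._cond.wait()
        return my_ticket

    def release(self):
        with self._cond:
            self._now_serving += 1
            self._cond.notify_all()


def run_demo(n_threads=4):
    """Return the tickets in the order their holders entered the critical section."""
    lock = TicketLock()
    order = []

    def worker():
        ticket = lock.acquire()
        order.append(ticket)  # critical section
        lock.release()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order


print(run_demo())
```

Whatever order the scheduler runs the threads in, they enter the critical section strictly in the order in which they drew their tickets, so no late arrival can overtake a thread that has been waiting longer.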
Consolidation of the x86 architecture strides ahead
Developers are continuing to consolidate the directories containing the code for the x86-32 and x86-64 architectures. The consolidation began with 2.6.24, where it was accomplished for the most part with scripts; for 2.6.25, many additional files were consolidated manually. But the union is far from complete: in the x86 developer branch for 2.6.26, the kernel hackers have already merged a large number of additional files, which should, in the long run, simplify maintenance and result in a more dependable kernel with fewer errors.
This goal is also shared by numerous comprehensive changes – such as 1, 2 – to the driver core, the basic infrastructure for drivers. The documentation delivered with the kernel has also been cleaned up and better structured in places. In the context of a debate over the inclusion of BadRAM, participants in the discussion concluded that a defective memory range can be excluded from use with functions already contained in the kernel; the kernel documentation now briefly describes the necessary steps. Developers removed the time kernel parameter, which showed time stamps for all kernel output, since printk.time=y has been doing the same thing for quite some time.
Security framework and the SELinux competitor Smack
As anticipated, the Simplified Mandatory Access Control Kernel (Smack) made the leap into the "official" Linux kernel with version 2.6.25. Like SELinux, Smack offers Mandatory Access Control (MAC) but consciously abstains from many of the extended security functions of SELinux. This makes Smack far easier to use than the much-decried and difficult-to-master SELinux. Whether AppArmor, much criticised by some prominent kernel developers, will be integrated into the kernel is still up in the air; the patches containing this security framework, distributed for discussion on the kernel mailing list just before Christmas, were not included in 2.6.25 development and were not much discussed.
Kernel 2.6.25 now also supports EFI (Extensible Firmware Interface), the designated successor to the BIOS, on the x86 architecture – 1, 2, documentation. When it needs a short delay for I/O operations, the kernel no longer always writes to port 0x80; it can be switched to port 0xed instead, since the previous method caused some new HP notebooks to crash. A long discussion on the Linux Kernel Mailing List (LKML) on how best to solve this problem preceded the change.
For the first time, the new Linux version supports Marvell's system-on-a-chip (SoC) architecture from the Orion family – an ARM variant. These highly integrated chips are found in many NAS devices and some other consumer hardware that runs Linux.
The long-neglected support for Smbfs – the Windows network/Samba client – is finally coming to an end: it is officially considered deprecated and will soon be dropped. For some time, the kernel has had more flexible and better-maintained code for the CIFS and SMB protocols in the form of CIFS, for which most Linux distributions include a corresponding userspace program – mount.cifs.
The network stack now provides a framework – 1, 2, 3, 4, documentation, background article – for the controller area network (CAN), used for example for communication between components in vehicles. With the PF_CAN protocol family, contributed by Volkswagen developers, and the unified network driver interface for CAN hardware, several Linux applications can now take part in CAN communication at the same time – similar to a normal network device. Previous CAN drivers for Linux usually offered a character device with a manufacturer-specific programming interface, which could only be accessed by a single application.
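From an application's point of view, this socket-based access might look roughly like the sketch below. It is a minimal example under stated assumptions, not code from the kernel patches: it uses the raw CAN socket support in Python's standard `socket` module (available on Linux), the frame layout follows the example in the Python documentation, and the helper names as well as the interface name `vcan0` are hypothetical.

```python
import socket
import struct

# Layout of a classic CAN frame as exchanged over a raw CAN socket:
# 32-bit identifier, 8-bit payload length, 3 padding bytes, 8 data bytes.
CAN_FRAME_FMT = "=IB3x8s"
CAN_FRAME_SIZE = struct.calcsize(CAN_FRAME_FMT)


def build_can_frame(can_id, data):
    """Pack an identifier and up to 8 payload bytes into one frame."""
    if len(data) > 8:
        raise ValueError("a classic CAN frame carries at most 8 data bytes")
    return struct.pack(CAN_FRAME_FMT, can_id, len(data), data.ljust(8, b"\x00"))


def parse_can_frame(frame):
    """Unpack a frame into (identifier, payload), dropping the padding."""
    can_id, length, data = struct.unpack(CAN_FRAME_FMT, frame)
    return can_id, data[:length]


def send_frame(interface, can_id, data):
    """Send one frame on a CAN interface, e.g. the hypothetical 'vcan0'.

    Requires Linux with CAN support and an existing CAN network interface.
    """
    with socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW) as sock:
        sock.bind((interface,))
        sock.send(build_can_frame(can_id, data))


frame = build_can_frame(0x123, b"\x11\x22")
print(len(frame), parse_can_frame(frame))
```

Because each application opens its own socket and binds it to the interface, several programs can send and receive on the same CAN bus side by side, which a single-opener character device does not allow.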
Fine tuning virtualisation technologies
There were also numerous changes to the many virtualisation technologies that the kernel supports. The KVM x86 emulator, for instance, implements some additional instructions and components – 1, 2, 3, 4 – which should improve performance and compatibility. Patches that pave the way for porting KVM to other architectures, such as IA64, were also added. Linux as a KVM host can now also swap out memory ranges used by the guest, freeing up memory on the host system as needed in certain circumstances. Other improvements were made to the Virtio framework – the Virtio PCI device and the balloon driver – whose technology and driver interfaces can be used by various virtualisation solutions when emulating guest systems; ideally, a Xen-generated guest system should run smoothly this way with Virtio drivers under KVM or Qemu. Paravirt_ops now also works on the x86-64 architecture.
There were numerous improvements in 2.6.24 for container solutions, which are getting a final polish in 2.6.25. Guest systems can now mount the Procfs and Sysfs virtual file systems without these inadvertently revealing host system information or otherwise influencing the host. With the memory resource controller – 1, 2, documentation, background article – and the control groups added in version 2.6.24, the memory used by individual applications or complete containers can be limited so that excessive use of resources does not interfere with other containers, the host system or other applications.
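Control groups are configured through files in a mounted control group file system, so limiting memory comes down to creating a directory and writing to a few files. The sketch below shows the general pattern only. The file names `memory.limit_in_bytes` and `tasks` are those of the version 1 control group interface as documented for later kernels and are an assumption for 2.6.25; the mount point, group name and function name are hypothetical. Current systems that use the version 2 interface name these files differently.

```python
import os


def limit_group_memory(cgroup_root, group, limit_bytes, pid=None):
    """Create a control group, set its memory limit and optionally add a task.

    cgroup_root is the directory where the memory controller is mounted
    (a hypothetical example would be '/cgroups'). Writing there normally
    requires root privileges.
    """
    group_dir = os.path.join(cgroup_root, group)
    os.makedirs(group_dir, exist_ok=True)

    # Assumed v1 interface file: upper bound for the group's memory usage.
    with open(os.path.join(group_dir, "memory.limit_in_bytes"), "w") as f:
        f.write(str(limit_bytes))

    # Assumed v1 interface file: writing a PID moves that task into the group.
    if pid is not None:
        with open(os.path.join(group_dir, "tasks"), "w") as f:
            f.write(str(pid))

    return group_dir
```

A call such as `limit_group_memory("/cgroups", "container1", 64 * 1024 * 1024, pid=os.getpid())` would, under these assumptions, cap the calling process and its future children at 64 MB.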
In addition to all of these changes, there are others that may be important for running Linux, depending on architecture and intended use. The most significant of these lesser changes are listed in an overview at the end of this article; each is accompanied by a short description and linked to references that provide further information.