
Archive for July 30th, 2009

2009/07/27 Linux Kernel Podcast

July 30th, 2009 jcm No comments

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20090727.mp3

For Monday, July 27th, 2009 I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: Abuse, dynamic ticks, HVM info, and the tty layer.

Abuse. We’ve seen filesystems in userspace, character devices in userspace, and even generic interrupt abstractions (from Thomas Gleixner). Now it seems only logical that we would also receive the blessing that is a block device layer implemented in…userspace. Zachary Amsden posted an RFC patch implementing such a scary notion. The interface uses ioctl()s to create the device, then passes bios to and from the kernel. I hate to think what’ll happen if someone tries to swap onto a userspace block device and runs into a low memory situation. If this ever merges, there’ll be a lot of caveats. But kudos to Zachary for getting this discussion going – I’m sure there’ll be a nice Linux Weekly News summary of the design if it persists.

Dynamic ticks. As you may be aware, modern Linux kernels support dynamic-tick-based timer interrupts. Under this new world order, Linux replaces the traditional periodic timer tick with a (High Resolution) timer event scheduled to expire whenever the kernel next actually needs to wake up. There is also some special case handling and a concept of going tickless (it’s not possible for the kernel to operate in this mode all of the time), in which the various CPUs might enter a lower power state, and so forth. But one problem that has existed up until now (on 32-bit systems) is that no sleep period could exceed about 2.15 seconds, due to the wrapping of a 32-bit quantity in the clocksource representing the hardware delivered time. Jon Hunter has a patch series that aims to detect and avoid this – potentially allowing idle systems to sleep for much longer than a few seconds.
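To see where a figure of about 2.15 seconds can come from, here is a minimal sketch in C. It assumes the limiting quantity is a signed 32-bit count of nanoseconds, which is my reading rather than something stated in the summary above. The helper names max_sleep_ns_32() and ns_to_seconds() are hypothetical and are not kernel functions.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/*
 * Hypothetical helper (not kernel code): clamp a requested sleep length,
 * given in nanoseconds, to what fits in a signed 32-bit field.
 * ASSUMPTION: the limit discussed above is a signed 32-bit nanosecond count.
 */
static int64_t max_sleep_ns_32(int64_t requested_ns)
{
    if (requested_ns > INT32_MAX)
        return INT32_MAX;   /* 2147483647 ns, roughly 2.147 seconds */
    return requested_ns;
}

/* Convert nanoseconds to seconds for display purposes. */
static double ns_to_seconds(int64_t ns)
{
    return (double)ns / (double)NSEC_PER_SEC;
}
```

Under that assumption, a request to sleep for 10 seconds, max_sleep_ns_32(10 * NSEC_PER_SEC), comes back as 2147483647 ns, which ns_to_seconds() reports as roughly 2.147 seconds. Holding the same value in a 64-bit field removes that ceiling.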

hvminfo? A thread started elsewhere was brought over to LKML by Jan Kiszka, in which the various usual suspects have been debating how to properly express hypervisor capability bits, of which it turns out there are quite a few. There has been some concern that simply adding them to /proc/cpuinfo would eventually create a list similar in length to that currently representing CPU feature capability bits. For that reason, /proc/hvminfo has been proposed. This would be of special interest to those with Real Time systems, allowing one to determine whether Real Time determinism can actually be expected.

The tty layer. Those who follow LKML closely (or read LWN) know what’s coming, but I’ll save the plot twist until Tuesday’s summary. In any case, as you may be aware, the TTY (and PTY) layer provides support for a wide range of interactions with Linux systems, even (especially) today. This includes the terminal upon which this podcast is being prepared, the one used by your gnome-terminal, your 3G (also perhaps POTS) modem pppd session, etc. For something we all rely upon, this code has had little love in a long time. As Linux Weekly News reports this week, that is something Alan Cox has been trying to correct. He has reworked the locking, fixed some DoS attacks (and proposed a kernel hack for catching NULL pointer “execution” attempts), and otherwise attempted somewhat of an overhaul. But it seems every time you touch one thing in that code, another breakage rears its ugly head (largely because the various standards are wiggly). Today’s conversations started out being around a breakage in “kdesu”, but ultimately that fix turned out to be the least of Alan’s worries. Emacs makes some (admittedly rather awkward) assumptions about behavior on close() – that the TTY layer has completed processing and delivery to the other end – and then there’s a locking problem experienced by Stephen Rothwell (and probably others) on boot. All in all, the recent experiences had Linus calling to revert patches – something that is harder than it sounds due to layer dependencies, and the fact that actual serious bugs still need to be fixed, especially DoS bugs, whatever happens.

In today’s miscellaneous items: a fix for a “section mismatch” in i386 init caused when CONFIG_HOTPLUG_CPU is enabled, since the code might be needed even after init completes (Robert Richter), some x86 fixes (Peter Anvin), a discussion concerning newer low-voltage MMC parts and how these may (or may not) be handled by the existing codebase (for which there is no maintainer), some updates for microblaze (Michal Simek), some perfcounter fixes for powerpc (either coming in through the powerpc tree or directly via Peter Zijlstra – that is currently to be decided), mention of an infinite loop in get_futex_key in 2.6.31-rc4 (Jens Rosenboom), a suggestion that checkpatch somehow enforce that all new config options have a help summary (Johannes Berg), an academic question surrounding the navigation of task page tables in a physical memory image dump file (M. Shuaib Khan), some feedback discussion from Vivek Goyal concerning benchmarks performed on his latest IO scheduler controller patches provided by Gui Jianfeng in which a 7% performance loss was being observed (for “normal” writes), a large number (22) of patches implementing a WM831x hardware monitoring driver, which is fairly intrusive (Mark Brown), version 5 of a patch adding 1GB page support to KVM (Joerg Roedel), a weird lockdep problem in which David Howells believes lockdep is mischaracterizing two different locks as actually being the same one (a false positive), an RFC patch attempting to make AGP work with IOMMU (David Woodhouse), an issue raised by Stephane Eranian in which signals are not guaranteed to reach a specific thread and yet are being used to deliver performance counter events (which are thread specific), a number of fanotify followups (Eric Paris), version 2 of a cleaned up, simplified RCU patch from Paul E. McKenney, some networking updates from David Miller, an attempt at full NAT support for IPVS, and Amerigo Wang noticed that setting CONFIG_SYSFS_DEPRECATED_V2 is required in order to boot recent kernels on older distribution userlands, such as RHEL5.

In today’s announcements: 2.6.31-rc4-rt1. Thomas Gleixner is back with another thrilling installment of the RT patch. As he summarizes in his announcement, a decision was made to skip over .30 and go straight from .29 to .31 in order to avoid having to play catchup (and for other reasons). Thomas sent a very detailed and very useful announcement, which I can only summarize here; for the full detail, refer to the announcement itself. In the latest release, interrupt threads become a simple extension of the mainline ones, so it’s now possible to schedule thread priority at the device (not interrupt line) level. Also, this patch introduces a major change to RT locking. Gone is the use of cunning compile-time hacks, and in is the atomic_spinlock_t (which is similar to the old raw_spinlock – alternative names are welcome, but off-list, since Thomas wishes to avoid bikeshed painting exercises), which is used for a few specific locks that cannot be replaced with sleeping versions. Thomas also takes the opportunity to clean up semaphores (as previously reported), and has begun to use git on an initially limited basis. The next main target, once he returns from 10 days of vacation (helping run a summer camp for kids), will be shooting down the Big Kernel Lock for good.

The latest kernel release is 2.6.31-rc4, which was released last week.

Rafael J Wysocki followed up to his previous postings concerning individual regressions for which bugs on the kernel.org bugzilla instance had been filed with some updates – for example, realizing some issues had been introduced in even older kernel versions than those cited in the original reports. The number of regressions in the current RC remains somewhat concerning.

Stephen Rothwell posted a linux-next tree for July 27th. Since Friday, there are new “drbd” and “benh-mm” trees. The former is obvious (it’s the distributed replicated block device that we’ve covered on numerous occasions prior to now), while the latter is temporary, existing only while an API change is made on Cell architecture. The powerpc tree still fails to build in an allyesconfig build configuration, and overall the tree has fewer build failures (when one also considers a reverted commit from the “rr” tree, and a patch Stephen did for drbd to resolve a build failure in the latter). The total sub-tree count has increased yet again, up to 134 trees in the latest linux-next tree compose.

That’s a summary of today’s LKML traffic. For further information visit kernel.org. I’m Jon Masters.
