2009/06/30 Linux Kernel Podcast
Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20090630.mp3
For Tuesday, June 30th 2009, I’m Jon Masters with a summary of today’s LKML traffic.
In today’s issue: fanotify, GPL, KVM, Modules, OOM, Real Time, Tasks, VFAT, VFS, and Virtual Terminals.
fanotify. Something we missed on Monday. Eric Paris posted a new version of fanotify, a notification mechanism originally designed to aid “anti-malware” vendors. The patch adds two key things, the ability to receive a read only fd pointing to modified filesystem objects (so they can be scanned for malware), and an access system in which processes may be blocked until an fanotify userspace listener has decided if whatever they were trying to do should be allowed (once they have been deemed “clean”). Eric reminds us that this is not an LSM, is not intended to provide system security, is not intended to prevent malware from running on Linux, but is merely intended to support on-access file scanning operations. Valdis Kletnieks followed up to say that he doesn’t care about virus scanners but that this could be useful for HSM applications.
GPL. Andrey Volkov posted an email about the ASUSTek Computer WMVN25E2+ WiMAX Subscriber Station, in which he alleges that this product is infringing upon the GPL because it allegedly includes GPL software (he lists the versions) for which no source code is made available, and for which no offer of source is made, under some kind of “intellectual property” defense. One hopes that Andrey has tried other avenues of communication before emailing the LKML about it, since many companies can come around to doing the right thing with private prodding.
KVM. Gleb Natapov posted to let us know that KVM would like to provide the x2APIC interface to a guest without emulating interrupt remapping. KVM prefers this because x2APIC is better virtualizable and provides better performance than the MMIO xAPIC interface (Gleb cites examples of why this is the case). The patch changes x2APIC enabling so that it is enabled on KVM guests, even if interrupt remapping initialization failed.
Modules. Jan Beulich posted a patch reducing exported symbol CRC table size on 64-bit architectures. He does this by ensuring that these quantities are actually only stored as the 32-bit quantities they are (using assembly wrappers) rather than the 64 bits that are used when gcc is left to its own devices. By applying this patch, one saves 16k of kernel resident size, 2k module resident size, and a whopping 1M of vmlinux image size. On an unrelated note Jan also posted a patch replacing uses of num_physpages by totalram_pages since many memory sizing calculations should be influenced only by usable memory, not just the total number of physical pages (perhaps including lots of non-RAM).
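The size arithmetic behind the CRC change can be sketched roughly as follows. This is only an illustration, not the patch itself: the symbol count is a hypothetical number picked for the example, and the function name `crc_table_bytes` is made up here. It simply compares a table that spends a 64-bit slot on each CRC with one that spends a 32-bit slot.

```python
import struct


def crc_table_bytes(num_symbols: int, entry_format: str) -> int:
    """Size in bytes of a table holding one CRC entry per exported symbol."""
    return num_symbols * struct.calcsize(entry_format)


# Hypothetical symbol count, chosen only for illustration.
NUM_SYMBOLS = 1000

# "=Q" is a fixed 8-byte unsigned value, "=I" a fixed 4-byte unsigned value.
as_64bit = crc_table_bytes(NUM_SYMBOLS, "=Q")
as_32bit = crc_table_bytes(NUM_SYMBOLS, "=I")
saved = as_64bit - as_32bit

print(f"{NUM_SYMBOLS} symbols: {as_64bit} -> {as_32bit} bytes, saving {saved}")
```

In this model the saving is four bytes per exported symbol, so the total scales with however many symbols the kernel or a module exports.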
OOM. Ongoing discussion of a patch intended for swapless systems that incorrectly also affected those with swap, and caused a lot of OOM situations in 2.6.30 kernels onward (especially for David Howells, who found it), led to the suggestion from Mel Gorman that OOM situations should also cause the kernel to print out the full active_anon LRU list. That way developers can figure out what pages are still on the active_anon list in that case, and, more importantly, which of those should not be there.
Real Time. Zoltan Bus posted a message saying that wake_up() sometimes isn’t waking up real-time priority tasks when called from an interrupt handler. This is interesting timing, because this author also heard just yesterday that (on the RT kernel) he should also never be calling wake_up from interrupt thread context. The correct places to use wake_up are probably worth documenting.
Tasks. Oleg Nesterov posted an RFC patch entitled “do not place sub-threads on task_struct->children list”. Currently, Linux systems add sub-threads to the ->real_parent->children list, but this only really serves to slow down do_wait. With this patch, ->children contains only the main threads (group leaders). Roland McGrath thought this seemed mostly like the right idea.
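A toy model may help picture why a shorter ->children list matters. The sketch below is not kernel code and does not reproduce what do_wait actually does; the `Task` class, the `link_children` helper, and the pid numbers are all hypothetical. It only counts how many entries a parent's list holds when sub-threads are linked in, versus when only group leaders are.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    pid: int
    tgid: int  # thread group id; equals pid for the group leader
    children: List["Task"] = field(default_factory=list)

    def is_group_leader(self) -> bool:
        return self.pid == self.tgid


def link_children(parent: Task, tasks: List[Task], leaders_only: bool) -> None:
    """Fill parent.children, optionally skipping sub-threads."""
    parent.children = [
        t for t in tasks if not leaders_only or t.is_group_leader()
    ]


# Two hypothetical child processes, each with a leader and three sub-threads.
tasks = [Task(pid=100 + i, tgid=100) for i in range(4)]
tasks += [Task(pid=200 + i, tgid=200) for i in range(4)]

parent = Task(pid=1, tgid=1)

link_children(parent, tasks, leaders_only=False)
entries_before = len(parent.children)

link_children(parent, tasks, leaders_only=True)
entries_after = len(parent.children)

print(f"entries a wait would walk: {entries_before} before, {entries_after} after")
```

In this model the list shrinks from one entry per thread to one entry per process, which is the effect the patch description is after.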
VFAT. Ongoing discussion of Andrew Tridgell’s latest patch. At the same time, Hirofumi Ogawa followed up to ask whether it isn’t about time to change the default shortname=lower mount option for vfat filesystems. This default is known to cause problems when copying files from one filesystem to another on Linux, because it alters the originally intended case of the file names. It is also inconsistent with Windows behavior, whereas (as Jamie Lokier also agreed) shortname=mixed is a more sensible default.
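The difference between the two settings can be sketched with a deliberately simplified model. It assumes only the documented display behavior (shortname=lower shows short names in lower case, shortname=mixed shows them as stored), ignores long file name entries and the other modes, and uses a made-up helper name, `display_short_name`.

```python
def display_short_name(stored: str, shortname: str) -> str:
    """Return how a stored 8.3 short name would be shown to userspace.

    Simplified model: handles only the "lower" and "mixed" settings.
    """
    if shortname == "lower":
        return stored.lower()  # forced to lower case on display
    if shortname == "mixed":
        return stored  # shown as stored
    raise ValueError(f"unsupported shortname mode in this sketch: {shortname}")


stored_name = "README.TXT"  # hypothetical file created in upper case

for mode in ("lower", "mixed"):
    print(f"shortname={mode}: {display_short_name(stored_name, mode)}")
```

Under this model a file created as README.TXT is listed as readme.txt with shortname=lower, so a copy made from that listing no longer carries the original case.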
VFS. In an evolution of previous discussion, Miklos Szeredi posted a new patch implementing a new O_NODE flag for open calls. Opening a file in such a way will not call the driver’s ->open() method and will not have any side effect other than referencing the dentry/vfsmount from the struct file pointer. This can be layered with other options to implement some useful features.
VTs. Lennart Poettering posted a patch implementing an extension to VT_WAITACTIVE such that it is possible to wait until a specific VT becomes inactive. This will allow ConsoleKit to more easily keep track of which VT is the active one. Currently, ConsoleKit (which is used by distributions such as Fedora to assist with Fast User Switching of Graphical Desktops) creates 64 separate userspace threads, one for each theoretical VT, and calls VT_WAITACTIVE in them to look for changes. With this patch, ConsoleKit will instead be able to only monitor whatever it considers to be the current VT.
In today’s miscellaneous items: IDE and Network fixes from David Miller (including a migration to generic block layer request completion on IDE for some legacy code, and some small fixes for the new ZigBee networking stack), Kmemleak fixes (Catalin Marinas – these are intended to reduce the “false positive” noise that Dave Jones and Ingo Molnar, and others, were seeing before), two “important” device-mapper fixes for 2.6.31-rc2 (Alasdair Kergon), a bunch of useful performance counter tools fixes from Arnaldo Carvalho de Melo (acme) adding filtering by comm, dso and symbol lists to perf (Paul Mackerras also posted a fix to enable counters only on next exec, allowing one to skip the launching process in profiling, and another allowing one to exclude any overhead caused by the presence of a hypervisor such as KVM). Ingo Molnar thinks Robin Getz’s CON_BOOT idea is “genuinely useful”. Finally, Amerigo Wang posted an update to kcore sizing calculation that should actually do the right thing this time around.
[new segment] In today’s blog postings: Dave Jones ponders aloud creating “rawhide”-like kernel builds for stable Fedora releases, to increase test coverage and (hopefully) make released kernels even better overall. If you’d like your blog included here, simply have it added to planet.kernel.org. If you do that, it’s assumed you don’t mind being featured in such things.
The latest kernel release is 2.6.31-rc1, which was released by Linus last week.
Andrew Morton posted an mm-of-the-moment for 2009-06-30-12-50. It contains the usual impossibly large number of patches to itemize here.
Stephen Rothwell posted a linux-next tree for June 30th. Since Monday, he adds a new tree entitled “sfi” (which was immediately dropped due to build problems), his fixes tree still contains that fbdev fix, and our old friend of the powerpc build configuration problem in an allyesconfig is back. There are a total of 132 sub-trees in the linux-next compose now.
That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

