2009/08/19 Linux Kernel Podcast
Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20090819.mp3
For Wednesday, August 19th 2009, I’m Jon Masters with a summary of today’s LKML traffic.
In today’s issue: Config/SysFS, Cpuidle, O_SYNC, Performance Counters, Spinlocks, and x86.
Config/SysFS. Avi Kivity posted concerning some issues he has with “all the text based pseudo filesystems that the kernel exposes”. His main concern is that the kernel development community is “optimizing for the active sysadmin, not for libraries and management programs”. On a lower level, he is concerned about a number of specifics, including the efficiency of open/read/close actions, the lack of atomicity when reading multiple files that may be changing in order to capture specific system state information, the ambiguous format of attributes, lifetime and access control concerns, notification of changes in attributes, and readdir support being “painful”. Avi says that “I don’t think a lot of effort is needed to make an extensible syscall interface just as usable and a lot more efficient than config/sysfs”, to which Ingo Molnar replied that such an implementation is already available in the form of the mechanism used by the performance counters code’s perf_counter_open system call, which does such things as passing an embedded .size field so that the data structure exchanged with userspace can change in size later on (embedded ABI protection). Avi replied that he had seen this and that it was “nice”. A number of others expressed frustration with the current interfaces, so it will be interesting to see whether this turns into anything more concrete.
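To make the embedded .size idea a little more concrete, here is a minimal userspace sketch in C. It is not the perf_counter code: the structures demo_attr_v1 and demo_attr_v2 and the function demo_copy_attr are hypothetical names invented for illustration. The assumption is simply that the caller records the size of the structure it was built against, and the receiving side zero-fills fields an older caller does not know about and refuses non-zero data in fields it does not understand itself.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical first version of a structure passed across an ABI. */
struct demo_attr_v1 {
    uint32_t size;          /* sizeof the struct as the caller knows it */
    uint32_t type;
};

/* Hypothetical later version: one field appended, nothing reordered. */
struct demo_attr_v2 {
    uint32_t size;
    uint32_t type;
    uint64_t sample_period;
};

/*
 * Receiving side, built against v2. Accepts older (smaller) and newer
 * (larger) caller structures. Returns 0 on success or a negative errno.
 */
int demo_copy_attr(const void *user, struct demo_attr_v2 *out)
{
    const unsigned char *bytes = user;
    uint32_t size;

    /* The size field always comes first, so it is safe to read. */
    memcpy(&size, user, sizeof(size));
    if (size < sizeof(struct demo_attr_v1))
        return -EINVAL;

    /* Fields the caller does not know about default to zero. */
    memset(out, 0, sizeof(*out));

    if (size > sizeof(*out)) {
        /* Newer caller: the part we do not understand must be zero. */
        for (size_t i = sizeof(*out); i < size; i++)
            if (bytes[i] != 0)
                return -E2BIG;
        size = sizeof(*out);
    }

    memcpy(out, user, size);
    out->size = sizeof(*out);
    return 0;
}
```

A caller compiled against demo_attr_v1 sets .size = sizeof(struct demo_attr_v1) and keeps working after the receiver moves to v2, because sample_period simply reads back as zero.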
Cpuidle. Arun R Bharadwaj posted a two part patch series implementing cpuidle infrastructure support for powerpc systems. This not only allows powerpc systems to save power by selectively entering “snooze” and “nap” states when the kernel cpuidle code deems it appropriate, but also provides tpmd_idle, which adds support for Thermal and Power Management idling.
O_SYNC. Jan Kara posted a seventeen part patch series entitled “Make O_SYNC handling use standard syncing path” that aims to unify O_SYNC handling with the existing code that implements fsync(). After this patch series is applied, there is just one place where handling for forcing file commits to disk is implemented, making life easier for filesystem code. The patch touches a lot of filesystems and is probably going to need some fairly hefty testing.
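For listeners less familiar with the two paths being unified, here is a rough userspace sketch in C of how an application asks for each behaviour. It only shows the standard open(), write(), and fsync() calls as seen from userspace and says nothing about how the patch series is implemented inside the kernel. The helper names write_all, save_with_osync, and save_with_fsync are made up for this example.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Keep calling write() until the whole buffer has been handed over. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Path 1: O_SYNC, so each write() returns only once the data is committed. */
int save_with_osync(const char *path, const char *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
    if (fd < 0)
        return -1;
    int rc = write_all(fd, buf, len);
    if (close(fd) < 0)
        rc = -1;
    return rc;
}

/* Path 2: ordinary buffered writes, then one explicit fsync() at the end. */
int save_with_fsync(const char *path, const char *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    int rc = write_all(fd, buf, len);
    if (rc == 0 && fsync(fd) < 0)
        rc = -1;
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

Both helpers return 0 only when the data has been pushed out to storage as far as the kernel can tell, which is why it is appealing for the two to share a single implementation underneath.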
Performance Counters. Everyone’s worried about information leakage and security at the moment, and Peter Zijlstra had previously noted the risk of information leakage through performance counter metrics. He posted another version of a patch series changing the default permissions on performance counters (disallowing regular users from creating cpu-wide counters), and causing any samples to have anonymized kernel IPs (Instruction Pointers) in the case that they are being collected by an unprivileged user.
Spinlocks. Discussion continued surrounding the meaning and purpose of spin_is_locked() as applied to uniprocessor systems. Thomas Gleixner had suggested that it should always return true, whereas Peter Zijlstra, Linus, and others had pointed out problems with this logic. In the end Peter suggested that the best idea might be for spin_is_locked to be a synonym for panic(). As I mentioned previously, Linux Weekly News has an excellent writeup in the latest edition, so it’s worth referring to that for more detail.
x86. Jan Beulich noted that according to gcc’s instruction selection, inc/dec instructions can be used without a performance penalty on most x86 CPU models, but should be avoided on others. Hence he suggests (and posts a patch for) selectable configuration of inc/dec instruction use depending upon the CPU models that are being targeted by a given x86 build.
In today’s miscellaneous items: another version of the CLOCK_REALTIME_COARSE patch that adds a fast but not very fine-grained timestamp from John Stultz, version 0.5 of the new kfifo API implementation from Stefani Seibold, a patch from Bartlomiej Zolnierkiewicz removing the mailing list for ncpfs from MAINTAINERS, a patch from Miguel Boton moving the many different alignment macros within the kernel into a standard “align.h” header file, yet another round of patches for Compal-made Dell laptops from Mario Limonciello (with special thanks to Alan Jenkins for once again putting a lot of effort into testing and finding some bugs), some minor bug fixes for nilfs2 from Ryusuke Konishi, a documentation update for AFS from David Howells, a patch from Miroslav Rezanina causing Xen guest kernels booted with a mem= parameter (but nonetheless allocated additional memory in the hypervisor) to return the additional memory back to Xen early in boot, a second batch of 47 KVM updates targeting 2.6.32, a trivial fix for linux-next from Ingo Molnar that adds new tracepoints for syscall_enter and exit on s390 systems (avoiding a build failure otherwise), some microblaze fixes from Michal Simek, version 3 of a patch series from Zhang Rui implementing a standard interface for Ambient Light Sensors (ALS), a patch adding syscall filtering support for ftrace events from Li Zefan, version 5 of a patch from Amerigo Wang correcting the semantics for file truncations when both suid and write permissions are set for the user on a given file entry, some DRM fixes from Dave Airlie, and a new version 4 of the vhost kernel-level virtio server from Michael S. Tsirkin that is sure to kick off another round of enjoyable virtualization dialogue.
In today’s announcements: 2.6.31-rc6-rt5. Thomas Gleixner posted the latest version of the preempt-rt kernel, which updates to the latest Linus git tree, makes IPI handlers unthreaded on PowerPC (pseries), and fixes a problem with cgroup memcontrol preemption.
The latest kernel release is 2.6.31-rc6, which was released on August 14th.
Rafael J. Wysocki posted a list of regressions from 2.6.29 to 2.6.30 and from 2.6.30 to 2.6.31-rc6-git5 for which there are no fixes in mainline that he is currently aware of. The regression list has not increased dramatically, and most of the bugs seem to have driver specific or suspend/resume roots.
Walt Holman posted saying that he is experiencing some “periodic timeouts” with kernel 2.6.30.5, and Simon Kirby noticed that a “storage head box” also running 2.6.30 would occasionally get stuck for up to several seconds allocating memory to send a packet (visible by watching sshd getting stuck), blocking on a mutex named iprune_mutex called from prune_icache in fs/inode.c. He made some suggestions about converting to a try_lock in that code and so forth. Finally, Steven Rostedt posted a series of lockups in the IPI code on recent kernels.
Stephen Rothwell posted a linux-next tree for August 19th. Since Tuesday, the mips, omap, and suspend trees lost their issues, while the tip and usb trees gained some conflicts. The total sub-tree count remains steady at 140 trees.
That’s a summary of today’s Linux Kernel Mailing List traffic. For further information, visit www.kernel.org. I’m Jon Masters.










