Archive for August 13th, 2009

2009/08/12 Linux Kernel Podcast

August 13th, 2009 jcm No comments

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20090812.mp3

For Wednesday, August 12th, 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: AlacrityVM, Asynchronous device suspend and resume, and Kbuild.

AlacrityVM. Gregory Haskins posted some updated benchmarks for the AlacrityVM hypervisor (based upon KVM) that he and others have been working on. Chiefly, Alacrity includes a new venet implementation for network virtualization, and aims to be optimized for Real Time workloads with low latency requirements. The figures are based upon 2.6.31-rc4, and show example response times of 56.8us vs. 29.8us native as opposed to 4016.0us for existing KVM instances running with virtio. Greg renames virtio to “virtio-u” since he is aware of the new in-kernel virtio server and plans to update the figures once he is able to compare against the “virtio-k” code posted recently. He posted some “3Dish” graphics that seemed to disturb one reader to the point of ranting about the evils of 3D graphics (somewhat harsh for a simple bar chart).

Asynchronous device suspend and resume. Rafael J. Wysocki posted a three part patch series intended to run the suspend and resume callbacks provided by device drivers asynchronously during events such as suspend to RAM.

Kbuild. Catalin Marinas posted a nice patch to kbuild that implements reverse dependency tracking for selected options. With this patch, an option cannot be selected if any of its direct dependencies are not met.
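The rule described here can be modelled in a few lines. This is only a toy sketch of the idea and not the kbuild implementation: the function names and the option names (FOO_DRIVER, PCI, NET) are hypothetical, and real Kconfig dependencies are expressions rather than simple lists.

```python
def unmet_dependencies(option, depends_on, enabled):
    """Return the direct dependencies of `option` that are not enabled."""
    return sorted(dep for dep in depends_on.get(option, ()) if dep not in enabled)


def try_select(option, depends_on, enabled):
    """Enable `option` only when every direct dependency is already enabled."""
    missing = unmet_dependencies(option, depends_on, enabled)
    if missing:
        return False, missing
    enabled.add(option)
    return True, []


if __name__ == "__main__":
    # Hypothetical option names, for illustration only.
    depends_on = {"FOO_DRIVER": ["PCI", "NET"]}
    enabled = {"PCI"}
    print(try_select("FOO_DRIVER", depends_on, enabled))  # refused: NET is missing
    enabled.add("NET")
    print(try_select("FOO_DRIVER", depends_on, enabled))  # accepted
```

Returning the list of missing dependencies, rather than a bare refusal, mirrors the sort of diagnostic a build system would want to show the user.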

In today’s miscellaneous items: a number of V4L/DVB fixes from Mauro Carvalho Chehab, a request for development of Kprobes and Kretprobes support in performance counters from Frederic Weisbecker, a fix for blktrace from Jens Axboe (fixing a double removal of a debugfs directory causing a crash), Rick L. Vinyard asked about tracking changes to exported attributes in sysfs (to which Kay Sievers replied that what he wants doesn’t exist “out of the box”), some cleanups to the tracepoint-analysis documentation (based upon feedback from LWN’s Jonathan Corbet) from Mel Gorman – who recently implemented the VM tracepoints, a fix for an O_DIRECT oops in NFS from Trond Myklebust, a new version of the “send callback when swap slot is freed” patch from Nitin Gupta, a git pull request implementing various performance counters code refactoring from Frederic Weisbecker, some libata fixes from Jeff Garzik, version 3 of the automatic crashkernel size calculation boot parameter patch from Amerigo Wang, some XFS updates for 2.6.31-rc6 from Felix Blyakher, another version of the kfifo patches from Stefani Seibold, and some sound fixes from Takashi Iwai.

Finally today, David Wuertele notes some difficulty in creating readonly root filesystems using initramfs. He would like to know how to do so but his tests are failing and the documentation doesn’t provide any detail – perhaps someone can help him out with an explanation.

In today’s announcements: linux-2.6.31-rc5-rt1.2. John Kacur announced version 2.6.31-rc5-rt1.2 of the -rt kernel patchset, which is an “unofficial” tree (although with implicit blessing nonetheless) intended to avoid the RT patch generating bitrot while Thomas Gleixner and others work on new RT features. The development (though maybe not this tree yet) removes the boot warning for options that might hurt performance in the case that ftrace is built with dynamic ftrace support rather than static. This paves the way for having ftrace built into the kernel by default, rather than optionally doing so.

The latest kernel release is 2.6.31-rc5, which was released over a week ago.

Andrew Morton posted an mm-of-the-moment for 2009-08-12-13-55.

Stephen Rothwell posted a linux-next tree for August 12th. Since Tuesday, the linux-next tree has moved to a new, more official location on git.kernel.org (symlinks will redirect from the old location), and the v4l-dvb tree lost its conflicts. The sub-tree count remains steady at 140 trees.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2009/08/11 Linux Kernel Podcast

August 13th, 2009 jcm No comments

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20090811.mp3

For Tuesday, August 11th, 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: Kexec, KVM, RTC, and VGA.

Kexec. Amerigo Wang posted two interesting patches for kexec. The first implements the display of a loaded crash kernel’s memory section information in /proc/iomem, while the second allows one to shrink the reserved memory for a crash kernel on an already running system if it is more than enough. For example, if you already had reserved 128MB, but only needed 100MB, you can simply write into sysfs (/sys/kernel/kexec_crash_size) to reclaim 28MB.
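The arithmetic in that example can be sketched as follows. The sysfs path is the one named above; the assumption that the file takes a size in bytes is not stated in the summary, and the helper names are made up for illustration.

```python
from pathlib import Path

MiB = 1024 * 1024
# Path taken from the patch description above.
KEXEC_CRASH_SIZE = Path("/sys/kernel/kexec_crash_size")


def reclaimable_bytes(reserved, needed):
    """Return how many bytes would be handed back by shrinking the reservation."""
    if needed > reserved:
        raise ValueError("the reservation can only shrink, not grow")
    return reserved - needed


def shrink_crash_reservation(needed, path=KEXEC_CRASH_SIZE):
    """Write the new, smaller size to sysfs (needs root; byte units are assumed)."""
    path.write_text(str(needed))


if __name__ == "__main__":
    freed = reclaimable_bytes(128 * MiB, 100 * MiB)
    print(f"shrinking would free {freed // MiB} MiB")
```

The write is kept in its own function so the calculation can be checked on any machine without touching a running system.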

KVM. Michael S. Tsirkin posted version 2 of a 2 part patch series implementing a kernel-level virtio server. The main motivation for this effort is to reduce the virtualization overhead for virtio by removing system calls in the data path, without changing the guest system. As he says, for virtio-net, this removes up to 4 system calls *per packet*, which is a very significant performance improvement and should lead to some nice benchmarks. This version has only a few minor improvements from the previous one, such as moving rather than copying fs/aio.c, and removing some debug logging.

RTC. Feng Tang posted an RFC patch series implementing a new generic rtc_ops struct for x86 systems. As Feng points out, most x86 systems get their time keeping information from a Motorola 146818-like RTC device, EFI, or even virtualization (these come in via get_wallclock/set_wallclock), but in the future there will be other mechanisms also, and so Feng implements the ability to register different RTC sources in a generic fashion.

VGA. Dave Airlie posted a patch series that had originally come in from Tiago Vignatti, aimed at implementing VGA arbitration on systems using “legacy” VGA devices. As Dave says, the Resource Access Control (RAC) module inside the X server currently does the task of arbitration when more than one legacy device co-exists on the same machine, but a problem happens when different userspace clients attempt to do the same and so an arbitration mechanism that is independent of the X server is really needed.

In today’s miscellaneous items: an ACPI event notifier for AC/DC connect/disconnect events from Mark Langsdorf, a number of tracing fixes from Frederic Weisbecker that include Jason Baron’s syscall name to number mapping function, some wireless fixes from John Linville, some OCFS2 fixes from Joel Becker, version 2 of the new Winbond IR driver from David Hardeman, a patch allowing architectures (for example, SPARC) to override the default check_for_illegal_area function if it doesn’t work reliably from Joerg Roedel, a fix for userland ABI breakage in gnet_stats_basic that is passed via netlink from Michael Spang, version 4 of the “Help Resource Counters Scale better” patch series from Balbir Singh (which Prarit Bhargava confirmed improved a kernel compile time by around 30 seconds), a patch fixing CPUCLOCK_PROF and CPUCLOCK_VIRT timer precision from Stanislaw Gruszka (who notes that few people use these, but they should probably still be fixed for anyone who does – his posting includes a reproducer), a patch “constifying” various seq_operations structs from James Morris, a patch to print AMD virtualization features such as NPT, LBRV, SVML, and NRIPS in /proc/cpuinfo from Joerg Roedel, some IPC semaphore improvements (aimed at improving the O(n^2) behavior with n waiting processes) from Nick Piggin, a patch disabling cpufreq on 32-bit PowerPC systems from Bastian Blank, a patch adding cache miss and cache references events to performance counters on Pentium-M systems from Ingo Molnar, and a question from Frans Pop (of Ted Ts’o) as to what happened to the “data=guarded” patches Chris Mason had proposed in April for ext3.

Finally today, Luis R. Rodriguez inquired as to whether there exists a “typedef” removal tool. Presumably this would be a script or program that would look for typedefs and remove or replace them intelligently. If anyone knows of such a tool, do let Luis know about it also.

The latest kernel release is 2.6.31-rc5, which was released over a week ago.

Zdenek Kabelac posted to let everyone know that he is getting a “complete system freeze during reboot – usually just after iptables frees modules”. He posted a kconfig indicating that he is running 2.6.31-rc5. Also, Catalin Marinas posted to let everyone know that LTP on 2.6.31-rc5 on ARM with root NFS generates an oops in __put_nfs_open_context when running diotest4.

Stephen Rothwell posted a linux-next tree for August 11th. Since Monday, the nfsd, kvm, rr, and staging trees all lost conflicts and/or build failures. The total sub-tree count is steady today at 140 trees.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2009/08/10 Linux Kernel Podcast

August 13th, 2009 jcm No comments

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20090810.mp3

For Monday, August 10th, 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: CGroups, Ftrace, Modules, RCU, Spinlocks, Swap, System Calls, TTY, and VM.

CGroups. Ben Blum posted version 3 of a 7 part patch series implementing support for a “cgroup.procs” file that allows the user to quickly display all the unique thread group IDs in a particular cgroup, as well as move a collection of existing threads sharing the same thread group ID into a particular cgroup.

Ftrace. John Reiser complained that recordmcount, which is run during kernel build against every .o object as a means to extract mcount data for use with the dynamic function patching code in ftrace, can add many minutes to a full kernel compile. He suggests that the problem is in repeated calls to “ld -r”, which can be batched into one call based on the output from recordmcount or the other way around. Either way, he says, the data output is the same. He was concerned that his 900 line “recordmcount.c” replacement might be too long for the mailing list (perhaps he has not seen the size of some patches) but will likely be persuaded to send it if the developers are interested.

Modules. Eric Paris posted requesting thoughts on how permissions checks are currently implemented on request_module(), and whether it makes sense. As he says, request_module() is used to request that the kernel helper thread spawn a modprobe userspace process to do a module load. It is called in a number of places within the kernel (apparently, approximately 128 unique callsites) and only three check to see if the requesting process has some sort of module loading permissions (CAP_SYS_RAWIO). Amongst the suggestions, Eric would like to see the request_module() code perform this security check for itself. Also on the subject of modules, Ozan Caglayan posted version 2 of a recent patch implementing a fix in the markup_oops script that will use modinfo to look up module information when the EIP within an oops is within a module that has a “-” instead of a “_”. This is a semi-frequent occurrence with module naming, so the fix should avoid confusion.

RCU. Martin Schwidefsky had posted on Friday evening concerning a 2.6.30 system that was hanging due to a bad interaction between RCU and NOHZ. Paul McKenney followed up today with a congratulatory reply saying, “Congratulations, Martin! You have exercised what to date has been a theoretical bug identified last year by Manfred Spraul. This fix is to switch from CONFIG_RCU_CLASSIC to CONFIG_RCU_TREE, which was added in 2.6.29”. Martin replies that SLES11 uses 2.6.27 and classic RCU, and he believes the bug is present there also, so it does need to be fixed. On an only marginally related note, Martin also mentioned that he is working on NOHZ some more to improve delay performance by not having a CPU go fully tickless if it did some work in the last timer tick (which causes an unnecessary timer tick if the CPU goes truly idle, but generally he thinks will improve performance – and Martin requests comments on this approach from the wider LKML Congress).

Spinlocks. Heiko Carstens posted an RFC patch series allowing inlined spinlocks once again, since this apparently can lead to a 1%-5% speedup on some (s390 in this particular case) systems under certain workloads. The patch introduces CONFIG_SPINLOCK_INLINE as a conditional selector for this feature.

Swap. Nitin Gupta posted an RFC patch implementing a callback function whenever a swap slot is freed, for use on (in this example) systems with compressed RAM devices backing the swap device, allowing the memory to be instantly freed rather than when the “swap discard” bio is eventually processed by the block layer. Apparently, this is “essential” for the “compcache” project to which he posted a link.

System Calls. Jason Baron posted an interesting 12 part patch series implementing a runtime system call to name mapping function that allows one to pass a string representation of a system call and returns the ID of the call. Initially, it is for the syscall event tracer within ftrace, although one can imagine other projects would be interested in picking this up in-kernel.

TTY. The ongoing saga with the TTY layer came up again today (but only marginally). Artur Skawina noticed a ^S/^Q sequence resulted in data loss within his xterm. That seemed to be caused by a recent commit that had removed a check for tty->stopped in pty_write_buffer() for “no clear reason”, according to Linus Torvalds, who posted a patch that fixed the problem for Artur.

VM. Bill Speirs noticed a problem with VMA merging. The Linux VM uses VMAs (Virtual Memory Areas) to represent ranges of pages allocated to a task, complete with their protections and flags. A typical task has a number of different VMAs representing load code, library functions, program text, data, and so forth. Typically, the kernel will coalesce adjacent VMA regions if they share contiguous (virtual) memory and protection. However, in the case Bill cited, where he maps three pages with PROT_NONE and then sets the middle one to PROT_WRITE protection before setting it back, the kernel fails to reconcile these three pages back into a single VMA. This is not true if the same experiment is done using PROT_READ. Bill sees this issue because he is in reality mapping 200,000+ pages and rapidly changing permissions is causing him to exceed the max_map_count ulimit. This is worthy of investigation.
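Anyone wanting to see this for themselves could try something along the following lines. It is a rough sketch and not Bill's reproducer: it assumes a 64-bit Linux system with /proc mounted, calls the C library's mmap() and mprotect() through ctypes, and counts how many entries in /proc/self/maps cover the three pages before, during, and after the protection change. The function names are invented for this example, and the final count will depend on the kernel being run.

```python
import ctypes
import mmap

PAGE = mmap.PAGESIZE
PROT_NONE, PROT_READ, PROT_WRITE = 0, mmap.PROT_READ, mmap.PROT_WRITE

# dlopen(NULL): resolve mmap/mprotect/munmap from the already loaded C library.
libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

MAP_FAILED = ctypes.c_void_p(-1).value


def vmas_overlapping(start, end):
    """Count the /proc/self/maps entries that overlap [start, end)."""
    count = 0
    with open("/proc/self/maps") as maps:
        for line in maps:
            lo, hi = (int(x, 16) for x in line.split()[0].split("-"))
            if lo < end and hi > start:
                count += 1
    return count


def set_protection(addr, length, prot):
    """Thin wrapper around mprotect() that raises on failure."""
    if libc.mprotect(addr, length, prot) != 0:
        raise OSError(ctypes.get_errno(), "mprotect failed")


def run_experiment(middle_prot):
    """Return VMA counts (before, during, after) flipping the middle page."""
    length = 3 * PAGE
    addr = libc.mmap(None, length, PROT_NONE,
                     mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, -1, 0)
    if addr is None or addr == MAP_FAILED:
        raise OSError(ctypes.get_errno(), "mmap failed")
    try:
        before = vmas_overlapping(addr, addr + length)
        set_protection(addr + PAGE, PAGE, middle_prot)
        during = vmas_overlapping(addr, addr + length)
        set_protection(addr + PAGE, PAGE, PROT_NONE)
        after = vmas_overlapping(addr, addr + length)
    finally:
        libc.munmap(addr, length)
    return before, during, after


if __name__ == "__main__":
    for name, prot in (("PROT_WRITE", PROT_WRITE), ("PROT_READ", PROT_READ)):
        print(name, run_experiment(prot))
```

If the last number in a tuple is larger than the first, the kernel did not merge the pages back into one VMA for that protection.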

In today’s miscellaneous items: a power management fix (removing a run-time warning) from Rafael J. Wysocki, some performance counters fixes from Ingo Molnar (who states that he hopes it is still fine to make a few changes, but is willing to trim the patchset down to minimal changes if Linus prefers), the usual round of other updates from Ingo (x86, irq), some PCI fixes from Jesse Barnes, version 6 of a patch series adding trace events to the page allocator from Mel Gorman (who requests a “yey or nay” on whether these should be merged), a memory leak in security/selinux/hooks.c, identified by “iceberg” (which is about as useful as calling yourself only “debiandeveloper” or one of the many other nickname-only posters on LKML) and later patched in a posting from James Morris, version 2 of a VFS patch converting superblock s_maxbytes to an loff_t, a patch giving waitqueue spinlocks their own lockdep classes when they are initialized from init_waitqueue_head() from Peter Zijlstra by way of David Howells, who needed it to address a lockdep false positive situation in CacheFiles, a powerpc fix that allows “direct” DMA (non-iommu) to work for devices that have a < 32-bit DMA mask when the machine simply does not have enough memory to go over the chip addressing limit from Ben Herrenschmidt, a patch implementing vhost, a kernel-level virtio server, from Michael S. Tsirkin, and a rethink of command line precedence on MicroBlaze.

Finally today, Ted Ts’o posted an update to the Kconfig description for EXT3_DEFAULTS_TO_ORDERED better explaining the tradeoffs in terms of journal options on ext3, which he says has been vetted by the developers as being more informative for users. Hopefully, some users will agree with that assertion.

The latest kernel release is 2.6.31-rc5, which was released over a week ago.

Matthias Dahl reported an oops in 2.6.31-rc5-git5 in kmem_cache_alloc and Eric Paris noticed a NULL pointer dereference in kmemcheck in linux-next. There was also some whining that ARM doesn’t test with “randconfig” builds that often.

Stephen Rothwell posted a linux-next tree for August 10th. Since Friday, there are two new trees added – ide and hwpoison (the old ide became ide-current). The nfsd and drm trees gained conflicts, while the trivial tree lost its conflict. Given the two new tree additions, there are now 140 sub-trees. Stephen reminded Andi Kleen (author of HWPOISON) that linux-next is intended only for patches “destined for the next merge window”, which Andi affirmed.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2009/08/09 Linux Kernel Podcast

August 13th, 2009 jcm No comments

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20090809.mp3

For the weekend of August 9th 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: clone_with_pids(), HTC Dream, Nested SVM, and Performance Counters.

clone_with_pids(). Sukadev Bhattiprolu posted version 4 of a 7 part patch series implementing a new clone_with_pids() system call for use with checkpoint application restarting support. The idea is to request that the kernel give a newly created task or tasks a specific set of process IDs so that code being resumed from a checkpoint will have a consistent process ID. Sukadev is interested in feedback on the proposed system call interface and offers two alternatives for consideration.

HTC Dream. Pavel Machek posted yet more patches for the HTC Dream. At this point he has done a large amount of work at pushing this support into the staging tree. The latest effort included support for input devices connected to GPIO pins, and a number of other fixes. Separately, Jiri Slaby posted a buffer overflow fix for the Dream, which he is unable to even build test as he doesn’t have an ARM toolchain (this suggests he doesn’t have hardware either).

Nested SVM. Joerg Roedel posted version 2 of a series of nested SVM cleanups. The patchset has been tested with the use case of KVM within KVM and has shown apparently no regressions (with the first-level guest using nested and shadow paging). The latest version of the patch enables nested SVM support by default, although the user must still invoke qemu with -enable-nesting.

Performance counters. There were a number of small patches to performance counters over the weekend, from a number of people, suggesting that many are starting to play with these now. Of the patches, there was support for displaying per-thread event counters from Brice Goglin, and a fix to avoid oopsing on PowerPC CPUs without performance counter hardware support from Paul Mackerras.

In today’s miscellaneous items: some reposted patches implementing asm-generic and dma-mapping-common for SPARC from Fujita Tomonori, a futex bugfix from Darren Hart, a series of fsnotify patches from Eric Paris, a series of patches converting parts of the kernel over to using printk_once from Marcin Slusarz, version 2 of a patch fixing an oops in identify_cpu() on CPUs without the CPUID instruction on x86 from Ondrej Zary, some timer, tracing, core, and x86 fixes from Ingo Molnar, some critical KVM updates for 2.6.31-rc6 from Avi Kivity (including a guest-initiated DoS fix), a winbond IR driver from David Hardeman, a possible regression in XFS in 2.6.30.4 raised by Justin Piszcz, a number of updates to the staging tree and some USB fixes (including addressing some EHCI warnings folks are seeing – Greg included – and a number of other fairly minor fixes) from Greg Kroah-Hartman, some RT fixes for ARM from Uwe Kleine-Konig, some fixes for SDHCI (high speed and 4-bit SD cards) from Anton Vorontsov, a few “relatively small” bug fixes for btrfs from Chris Mason, a pull request for some wireless updates from John Linville, version 5 of a patch adding trace events to the page allocator from Mel Gorman, version 4 (apparently “for the upstream community, this is revision 3” – worth fixing that to adopt one numbering scheme soon) of support for the Intel RAR Register from Mark Allyn, a lockdep warning in 2.6.31-rc5-rt1.1 from Clark Williams, an update to CPU topology detection for AMD Magny-Cours from Andreas Herrmann, a fix to a memory leak in the ring_buffer free code from Eric Dumazet (which was immediately released as a pull request from Steven Rostedt), version 3 of a patch allowing file truncations on files with suid and write permissions set, which previously incorrectly failed with EPERM, from Amerigo Wang, a patch changing superblock s_maxbytes to an loff_t type from Jeff Layton, and yet another round of DRM fixes for 2.6.31-rc6 from Dave Airlie.

Finally today, Robert P. J. Day inquired as to whether any official standard existed for determining when/if tools should be moved into the top level directory of the same name. As an example, he cited Documentation/fs/slabinfo.c as a candidate.

In today’s security items: A read buffer overflow fix for FAT from Roel Kluin, and the aforementioned KVM guest DoS fixes from Avi Kivity.

The latest kernel release is 2.6.31-rc5, which was released over a week ago.

Hugh Dickins wonders if CONFIG_PREEMPT_RCU is supposed to be working in next/mmotm at the moment, because he suspects it is failing on his PowerPC G5 system, as evidenced by a parallel kernel compilation test that fails in what appears to be a manner consistent with RCU failing to reap the “filp” SLAB. Separately, Martin Schwidefsky wondered whether there was a race in the case of RCU and NOHZ being defined at kernel build time. Martin posted an example interaction showing how this might happen and requested input from Paul E. McKenney, who is the inventor and implementor of RCU support.

Rafael J. Wysocki posted a list of regressions between 2.6.29 and 2.6.30 and also between 2.6.30 and 2.6.31-rc5-git5. The former list of regressions appears to be leveling off for the older kernel (a total of 37 unresolved bugs are cited from the upstream kernel.org bugzilla), however the more recent regressions have increased, with a total of 24 unresolved regressions. Of course, these are just regressions for which there is a tracking bug.

Stephen Rothwell posted a linux-next tree for August 7th. Since Thursday, the following trees gained conflicts and/or build failures: net, security-testing, tip. The following trees lost conflicts and/or build failures: rr, agp. The total sub-tree count remains steady at 138 trees.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2009/08/06 Linux Kernel Podcast

August 13th, 2009 jcm No comments

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20090806.mp3

For Thursday, August 6, 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: AlacrityVM, blk-iopoll, CPU features, PCI Identifiers, Performance Counters, and Tux3.

AlacrityVM. Michael S. Tsirkin replied to Gregory Haskins’ announcement of “AlacrityVM” (which is a fork of KVM) with a suggestion that the Alacrity folks work to start merging with the host side of the project, copy the kvm lists on development, and perhaps update the comparison graphs between KVM and Alacrity to reflect a more “apples to apples” comparison.

blk-iopoll. Jens Axboe posted a patch series implementing a polled completion API for the block layer, with the hope of targeting a merge in 2.6.32. As he puts it, “basically this implements NAPI for block devices, and much of the core is essentially lift from there [the network code]”. Jens has seen good performance results on SSD devices, reducing the interrupt rate a lot (for example, a 28% reduction on a fast box doing 50k IOPS, even with interrupt coalescing support in the hardware being enabled, but up to 95% fewer interrupts on a slow box doing 30k IOPS). Sounds like fun, and was hinted at in the recent “state of the kernel” address at this year’s Linux Symposium.

CPU features. Kevin Winchester noted that his AMD64 system incorrectly reports having X86_FEATURE_LAHF_LM, which the CPU does not actually support (as evidenced by test code which fails with an “illegal instruction”). He tracked this down to an AMD erratum that states that the BIOS should program an MSR to indicate that this feature is present, which it might be erroneously doing in his particular case. He suggests that the kernel could automatically remove this feature flag from early Athlon 64 processors known not to support it.
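For those curious what their own system claims, the feature shows up as the “lahf_lm” flag in /proc/cpuinfo. A small sketch of checking for it follows; the function name is made up for this example, and, as Kevin's report shows, the flag only tells you what the kernel believes and not what the CPU will actually execute.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags listed for the first CPU in the text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


if __name__ == "__main__":
    # Linux only: read the kernel's view of the first processor.
    with open("/proc/cpuinfo") as f:
        flags = cpu_flags(f.read())
    print("lahf_lm reported:", "lahf_lm" in flags)
```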

PCI Identifiers. Dave Jones noted the inconsistent approach to handling pci_ids.h, a file containing global PCI identifiers. Officially, this file is supposed to only have entries for drivers that need to share a PCI ID with other drivers (for example, for multi-port cards or alternative drivers), but it has turned into a kind of free-for-all, that Dave aims to fix with a comment explaining when to add new entries to this file.

Performance counters. There were a number of updates and patches to the perfcounters code. These included symbol parsing fixes, reporting fixes, and other updates from third parties. Included amongst these was a patch from Frederic Weisbecker implementing support for ftrace event record sampling.

Tux3. In continuing discussion of the tux3 filesystem, and its future, Daniel Phillips had mentioned how he is more likely to put greater effort into tux3 merging if invited to do so. Otherwise, he says, “if we are not invited to merge, nobody has any cause to complain about progress slowing down”. This caused Ingo Molnar to send a lengthy reply politely explaining that in his 14 years of Linux hacking, he had never seen nor had such an invitation. Linux doesn’t work this way, but instead relies upon people requesting to merge.

In today’s miscellaneous items: version 4 of the trace events for the page allocator patches from Mel Gorman, a patch from Li Zefan allowing one to specify which filter type should be used for TRACE_EVENTs (existing support allowed only customized filters for static and dynamic strings), some test-for-null kmalloc/kzalloc checks added in PowerPC from Julia Lawall, a minor update to the CPU topology documentation from Andreas Herrmann (adding mention of new attributes for the recent multi-node processor support), a suggestion from Joe Perches that the MAINTAINERS file more prominently mention the linux-arm mailing list (which Russell King had previously suggested he saw no signs of people moving to), a patch killing the BKL in compat ioctl handling from Arnd Bergmann, a number of /proc/kcore cleanup patches (6 patches actually) from Kamezawa Hiroyuki aimed at removing many per-arch hooks and supporting e.g. VM hotplug, a lengthy question email concerning the correct way to handle DMA and cache on ARMv7 systems from Laurent Pinchart, a patch implementing __[un]register_chrdev() from Tejun Heo allowing one to specify a subset of minor numbers to register and unregister (used by the ALSA OSS cleanups), a new ALS (Ambient Light Sensor) device class in sysfs from Zhang Rui, some tracing fixes for 2.6.32 from Frederic Weisbecker, version 2 of the “crashkernel=auto” patches from Amerigo Wang, some input updates from Dmitry Torokhov, some more DRM fixes from Dave Airlie, callchain support in performance counters and allowing performance counters to access user memory at interrupt time for PowerPC from Paul Mackerras, a request to track down a problem with shmem and TTM from Thomas Hellstrom, and a patch implementing devtmpfs_wait_for_dev() from Ming Lei that builds upon yesterday’s re-posting of devtmpfs and allows the kernel to generically wait for a root device to appear without polling and using other hacks.

A number of people have now noticed that one needs to set CONFIG_SYSFS_DEPRECATED_V2 on recent RT kernels if testing on e.g. older Enterprise Linux distribution releases, such as RHEL5.

The latest kernel release is 2.6.31-rc5, which was released over a week ago.

Andrew Morton posted an mm-of-the-moment for 2009-08-06-00-30.

Stephen Rothwell posted a linux-next tree for August 6th. Since Wednesday, Stephen has added support for signed next-yyyymmdd tags, and three minor conflicts were addressed. The tree continues to have 138 sub-trees.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2009/08/05 Linux Kernel Podcast

August 13th, 2009 jcm No comments

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20090805.mp3

For Wednesday, August 5th 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: CPUs, devtmpfs, and KVM.

CPUs. Gautham R Shenoy posted an RFC patch series implementing an idle state framework for offline hotplug CPUs. As Gautham points out, when we go into an offline transition state on current systems, we put the affected CPU(s) into a HLT loop (or the equivalent) rather than using the lower C-states that are available. Previous patches have proposed various alternatives – including putting the CPUs into the lowest power C-states available – but the guys at IBM favor giving the user a choice over which state will be chosen. The patch implements a new “available_offline_states” entry in sysfs, from which one can determine a valid low-power state and configure via “preferred_offline_states”.

devtmpfs. Greg Kroah-Hartman reposted “devtmpfs”, which is a patch series originally created by Kay Sievers. Unlike the earlier devfs, this patch series doesn’t attempt to implement device filesystem functionality entirely in the kernel. Instead, the patches provide an implementation that makes life easier for bootstrapping a system by supplying a pre-populated tmpfs filesystem on boot, containing entries for all the initial hardware devices detected. This can be used to boot a system without “complex userspace bootstrap logic to provide a working /dev”. Once devtmpfs is populated, udev takes over and can freely create, manage, and delete any entries it likes as usual. For those who don’t want to run udev, devtmpfs also offers a cleaner way out.

KVM. Fengguang Wu mailed to let everyone know that Jeff Dike had discovered that KVM pages are being refaulted in 2.6.29. Quoting Fengguang, who cited Jeff, “Lots of pages being discarded due to memory pressure only to be faulted back in soon after. These pages are nearly all stack pages. This is not consistent – sometimes there are relatively few such pages and they are spread out between processes”. Fengguang posted a patch that “drastically reduces” the problem by respecting the referenced bit of all anonymous pages, but suspects that it may re-introduce a previous scalability issue. Discussion continued at some length between the various KVM folks on this one.

In today’s miscellaneous items: a new version 0.12 of the Ceph distributed filesystem from Sage Weil (including several fixes, and some documentation), some networking updates from David Miller (including a fix for a lockdep regression that was triggering for a number of people, and was discovered by Ingo Molnar in the previous day’s networking fixes), automatic crash kernel memory allocation from Amerigo Wang (via the new crashkernel=auto boot parameter), some minor s390 updates from Martin Schwidefsky, some OProfile updates from Robert Richter, a suggestion to set up a patchwork (quilt) instance for linux-alpha (although Jeff Garzik cannot have been the only person to wonder if a demonstrated need exists for this), an update to checkincludes.pl from Luis R. Rodriguez that can remove duplicate header inclusions in place (useful, he says, for porting “crap” drivers – he also now closes files as soon as he’s done with them rather than keeping file descriptors lying around), conditional support for MSI in sata_nv from Tony Vroon (so far only for MCP55), some build system fixes from Andi Kleen (mcount handling, gold linker support, gcc 4.5 support), an x86 IOAPIC RFC from Cyrill Gorcunov that will only panic on irq-pin binding if needed (i.e. allow failure in the case of PCI), yet another version of the HWPOISON patches from Andi Kleen, version five of the ZERO_PAGE patches from Kamezawa Hiroyuki (with minor fixes), a version 3 of the “security processor” kernel driver from Intel (now with additional support for re-distributing the no-longer-built-in firmware files), and some DRM fixes from Dave Airlie.

Finally today, Dave Airlie expressed some obvious frustration (citing “shitty scripts”) at the lack of verbosity for make V=1 builds. The builds currently fail to display all scripts that are being executed during a build – in Dave Airlie’s particular case, the ftrace function pre-patching script.

In today’s announcements: SystemTAP version 0.9.9. Josh Stone announced that version 0.9.9 of SystemTAP is now available. It features faster script compilation, improved userspace probing, support for new DWARF_OPs, self-monitoring markers, enhanced variable access, new SNMP tapset, new dentry tapset, bug fixes…and much more.

linux-2.6.31-rc5-rt1.1. John Kacur announced that he and Clark Williams had put together an unofficial preempt-rt kernel release while Thomas Gleixner was out at summer camp (Thomas volunteers every summer with a local camp).

The latest kernel release was 2.6.31-rc5, which was released over a week ago.

Kosaki Motohiro experienced a “poison overwritten” issue with -rc5, which was triggered by the netdev SKB allocation code, but was not able to reproduce it.

Stephen Rothwell posted a linux-next tree for August 5th. Since Tuesday, the tree gained a few minor conflicts, and remains steady at 138 sub-trees.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags: