
2009/07/28 Linux Kernel Podcast

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20090728.mp3

Apologies for lagging behind. Last week was pretty busy, and the box hosting the podcasts was attacked by script kiddies over the weekend. On the plus side, I used the extra downtime to finish automating my home: I can now (all using Linux) remotely control my fridge (X10), however the heck that’s useful. Anyway, here we go with a mega update round of podcasts for your edutainment.

For Tuesday, July 28th, 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: Btrfs, dev_pm_ops, MMC, NMIs, NULL pointers, OProfile, Performance Counters, and the TTY layer deathmarch.

Btrfs. Chris Mason requested that Linus pull some late-breaking btrfs fixes for 2.6.31 that greatly improve the way free extents are tracked in RAM – reducing SLAB use in one weekend test from 1GB to 10MB. These patches come originally from work that Josef Bacik has been refining since 2.6.28 or so. Since btrfs is still under heavy development, this should be ok to pull.

Device Power Management. Ben Dooks posted to ask about the ongoing conversion to dev_pm_ops (the new device power management code, which is capable of dynamic power management). Specifically, he asked whether suspend/resume code should be wrapped in CONFIG_PM conditionals (and whether a NULL dev_pm_ops should be supplied if it is configured out of the kernel), and whether such changes could be made during the RC phase or should wait for the following merge window to open.

MMC. As was raised recently, a new MMC maintainer is being sought, since Pierre Ossman is very busy these days and doesn’t want to be a blocker to progress. He does note that he is around and willing to be involved, but the discussion is ongoing. Pierre adds that he generally uses the specs on sdcard.org but that he does also have MMC 4 specs that he cannot pass on (kindly provided to him previously by Nokia). The lack of a formal maintainer didn’t stop Adrian Hunter from posting a series of updates, especially for OMAP.

NMIs. Paul Mackerras brought up a previous PowerPC discussion topic and asked how it might apply to x86 kernels. On PowerPC, as Ben Herrenschmidt noted, there might be a problem if we get a PMU (Performance Monitoring Unit) interrupt and try to do a stack trace of userspace in the interval between when we call switch_mm() and when we call switch_to() (all within sched.c). If an NMI occurs in that interval, we’ll see registers from the old task but userspace for the new task, so the stack trace will be “completely bogus”. Paul wonders whether this is also a problem on x86, or if there’s some reason it won’t hit.

NULL pointers. On Monday, Alan Cox raised the idea of catching code exploits that attempt to “jump through NULL”. The idea is clarified somewhat to mean adding a default hardware breakpoint, catching it, and having a handler make an appropriate decision. Andi Kleen argued that hardware breakpoints were a rare resource and that this would upset those who rely upon them, while Alan countered that those who really need all available hardware breakpoints can suitably configure their systems to do so – perhaps losing this feature. Andi also explained (to Kees Cook) how this could not easily be done using page tables alone due to races between different threads.

OProfile. Robert Richter posted a 26-part patch series implementing performance counter multiplexing for OProfile. Quoting Robert, “The number of hardware counters is limited. The multiplexing feature enables OProfile to gather more events than counters are provided by the hardware. This is realized by switching between events at an user specified time interval”. Obviously this is not the same as truly having additional hardware counters, but OProfile is already a sampling-based performance profiling tool, so this approach would seem to be valid. The patch adds a new file (/dev/oprofile/time_slice) that can be used to specify the interval. Separately, Robert posted a series of updates for -tip, on several branches.
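The multiplexing idea can be illustrated with a small simulation. This is only a sketch, not OProfile's implementation: the function name, the event names, and the per-slice rates are all hypothetical, and the final scaling step (extrapolating each count by the fraction of time the event was actually on a counter) is an assumption about how such data might be interpreted, not something stated in the patch description.

```python
from collections import Counter


def multiplex(events, num_counters, rates, slices):
    """Simulate round-robin multiplexing of events over limited counters.

    events       -- list of event names to measure (hypothetical names)
    num_counters -- number of hardware counters available at once
    rates        -- assumed true event count per time slice, per event
    slices       -- number of time slices to simulate
    """
    if num_counters < 1:
        raise ValueError("need at least one hardware counter")

    # Split the events into groups that fit on the hardware simultaneously.
    groups = [events[i:i + num_counters]
              for i in range(0, len(events), num_counters)]

    raw = Counter()     # counts actually observed while an event was active
    active = Counter()  # number of slices each event spent on a counter

    for t in range(slices):
        # Switch to the next group at every time slice boundary.
        for ev in groups[t % len(groups)]:
            raw[ev] += rates[ev]
            active[ev] += 1

    # Extrapolate to the full run (assumed post-processing step).
    return {ev: raw[ev] * slices / active[ev] for ev in events if active[ev]}


rates = {"cycles": 1000, "instructions": 800,
         "cache_misses": 20, "branch_misses": 5}
estimates = multiplex(list(rates), num_counters=2, rates=rates, slices=10)
```

With steady rates the extrapolation recovers the full-run totals exactly; with bursty workloads it would only be an estimate, which is the trade-off noted above.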

Performance Counters. Anton Blanchard raised the obvious point that current perfcounters code only supports tracking executable code, and not data. But he suggests that it won’t be long before we will want to also track data maps – for example to monitor TLB miss rates (hugepage conversion suggestions) or other TLB miss issues – and so he posts a kind of RFC patch that begins to implement such support. He requests review comments on the general idea. Separately, the issue of POSIX signalling and delivery to specific threads was raised again today. This is the issue that a performance counter signal event might not be delivered to the same thread it pertains to, but merely to a thread that forms part of a running userspace process. Andi Kleen and others debated how this might fit in with POSIX and whether a new sigaction flag should be introduced to guarantee delivery to the correct thread.

TTY. Of course the big news today was the change in maintainership of the TTY layer, or rather lack of it. Alan Cox has been heroically fighting battles with the tty layer for some time, trying to beat it into shape (as covered previously), but was finally pushed to breaking point by ongoing heated dialog concerning recent regressions caused by the code having to support various assumptions not necessarily part of any official standard (e.g. emacs file close flushing semantics, and other recent issues). He responded to one particular email from Linus (in which Linus chided Alan for “making idiotic excuses”) with a patch removing himself as maintainer, and suggesting that Linus “have fun”. Later, Greg Kroah-Hartman made some musings suggesting he might be interested in poking in this particularly unpleasant subsystem. As Linux Weekly News noted, this is one subsystem that even scares Ingo Molnar, so it’ll be interesting to see who dares to try fixing it next. Not I!

In today’s miscellaneous items: A resend of a patch from Jon Hunter that enables long sleep times for tickless kernels on 32-bit platforms (as covered previously – increasing from the previous maximum sleep time of 2.15 seconds), an fbcon bugfix correcting a problem with rotating upside down (Stefani Seibold), a new version of the uid mount option for ext2/ext3 patches that uses the specified uid for the files on disk also (and not just for mounts) – rather than root – and allows also for this to be configured at runtime (Ludwig Nussel – as suggested by Andreas Dilger), a confirmation from Gui Jianfeng that he will re-run his tests against Vivek Goyal’s IO scheduler IO controller using the latest V7 version (there are efforts to find out where the 7% performance hit has been introduced), some tracing fixes (Lai Jiangshan), a legal question surrounding linking initramfs images containing proprietary drivers directly into the kernel (Subodh Nijsure – who was told that the LKML is a technical list and not a place for legal advice), a second round of mcheck/EDAC “marriage” patches (Borislav Petkov), a patch to deny use of CLONE_PARENT|CLONE_NEWPID in combination as part of a clone operation (Sukadev Bhattiprolu – who wants to wait at least “until the required semantics of the pid namespaces are clear” before touching this again), some i2c fixes (Jean Delvare), some hwmon fixes (also Jean Delvare), a note that Jesse Barnes is on vacation so Matthew Wilcox is handling PCI updates until August 6th, a lengthy debate about whether a new MAINTAINERS file section was needed to somehow indicate individual wireless driver writers in addition to John Linville as the subsystem maintainer (which David Miller seemed to think would instead only create confusion for those sending patches – which all need to go via John anyway), a series of patches converting IPVS to use pr_fmt, some USB, driver core, and staging fixes (Greg Kroah-Hartman), some interesting patches from Mel Gorman that add trace events for the page allocator, some libata fixes from Jeff Garzik (mostly one-liners aside from pata_at91), a series of stable review patches from Greg Kroah-Hartman for 2.6.27.29 and .30.4, and Peter Zijlstra was happy to discover that his CFS group scheduler fairness fix (aiming to restore fairness that has apparently been broken since .29-rc1) worked fine even though it had only been compile tested before he posted.
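As a rough illustration of where a figure like 2.15 seconds can come from: a nanosecond interval held in a signed 32-bit quantity tops out at just under 2.15 seconds. The arithmetic below is a sketch under that assumption; the episode does not say which variable in the tickless code imposes the limit, and the constant names here are illustrative.

```python
NSEC_PER_SEC = 1_000_000_000

# Largest value of a signed 32-bit integer, interpreted as nanoseconds
# (assumption: the limit comes from a 32-bit nanosecond quantity).
max_ns_32 = 2**31 - 1
max_sleep_32_s = max_ns_32 / NSEC_PER_SEC

# The same calculation with a signed 64-bit nanosecond value, for comparison.
max_ns_64 = 2**63 - 1
max_sleep_64_years = max_ns_64 / NSEC_PER_SEC / (365.25 * 86400)

print(f"32-bit: {max_sleep_32_s:.3f} s")
print(f"64-bit: {max_sleep_64_years:.0f} years")
```

The 64-bit figure is only there to show the scale of the difference; real sleep times are bounded by the hardware timer long before that.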

Finally today, Pavel Machek responded to Ogawa Hirofumi’s concerns about the use of an int type in the calendar time to broken-down time patches noting that support for years up until 2,000,000,000 (2 Billion) is probably more than sufficient, given that our own Sun won’t be lasting a whole lot more than 5 Billion years either. One assumes we’ll all be using Star Dates long before then anyway, and have holographic representations of famous kernel hackers to create interesting everyday plot situations with along the way.
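For scale, the numbers behind that exchange can be checked quickly. This sketch assumes a 32-bit signed int for the year field and, purely for comparison, a 64-bit signed count of seconds as input; the exact types used in the patches are not stated here, and the constant names are illustrative.

```python
INT32_MAX = 2**31 - 1            # largest year a 32-bit signed int can hold
SECONDS_PER_YEAR = 31_556_952    # average Gregorian year (365.2425 days)

# Years spanned by a signed 64-bit count of seconds (assumed input type).
time64_span_years = (2**63 - 1) // SECONDS_PER_YEAR

print(f"int year limit:     {INT32_MAX:,}")
print(f"64-bit second span: {time64_span_years:,} years")
```

A 64-bit second count can in principle name a year far beyond what an int holds, which is presumably the kind of concern raised, while the int limit still sits a little above the 2 billion years Pavel mentions.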

The latest kernel release is 2.6.31-rc4 (except it isn’t any more since -rc5 was released on Friday evening).

Stephen Rothwell posted a linux-next tree for July 28th. Since Monday, the benh-mm tree was dropped (merged), the tree fails to build in an allyesconfig build configuration on powerpc systems, and several net build failures were introduced. Stephen did a final re-merge of Linus’ tree to get some updates. The total subtree count decreases to 135 trees after dropping benh-mm.

That’s a summary of today’s Linux Kernel Mailing List traffic. For further information, visit www.kernel.org. I’m Jon Masters.

  1. Miciah Dashiel Butler Masters
    August 4th, 2009 at 02:33 | #1

    Thanks for the summary!

    I’m curious what kinds of devices will benefit significantly from Jon Hunter’s elimination of the 2.15 second maximum tick interval and how much they will benefit.
