Archive for July 23rd, 2009

LKML Podcast Update

July 23rd, 2009 jcm 6 comments

This week saw the first day with over 1,300 downloads of a single episode. That’s pretty exciting – it means that a number of people are interested in what’s happening with the kernel on a day-to-day basis. Typical listener figures are still more like 250-300 every day and then 500ish for a given episode over the course of a week, which I’m still fairly happy with even though it’s not quite as exciting. Around 1/5th of listeners are using iTunes, whereas most are now using Linux Podcasting software. Which is nice.

Anyway. Since you find this useful, I will keep doing it. But I might replace catching up on individual episodes with larger single “catchup” episodes when I’m travelling or too busy to do the semi-daily Podcast. Catching up on the last two weeks took 5 hours on Monday evening at a popular coffee shop, which might not always be possible. It probably would have been better to do a bumper show and move on from there.

I’m going to try experimenting with a “Week in Review” section on weekends. I don’t know whether I’ll get to start this weekend, or whether it will be sustainable to do that, but I think some highlights from the week in a longer format would be useful to those who just like to listen on Monday mornings (of which there are already a large number – seems many use the Podcast instead of reading the list over weekends).

I’ve played with compression plugins, audio volume levels, and many other tweaks. But let me know if there’s something I can do to make the Podcast better. One thing I cannot do, however, is guarantee typo-free Podcasts or insert links to each item covered. I’m a guy doing this in my spare time, I’m not The New York Times Company (and let’s face it, their daily corrections are getting disturbing enough anyway) and I don’t have the time and resources to offer a full news service. Sorry about that :)

Jon.


2009/07/21 Linux Kernel Podcast

July 23rd, 2009 jcm No comments

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20090721.mp3

For Tuesday, July 21st, 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: Block devices, flexible array implementation, I/O bandwidth controller, Linux 2.6.27.27 problems, Microsoft, modules, per-cpu, performance counters, taskstats, and VFAT.

Block devices. Alan Jenkins posted a patch implementing bdcopy(), which is a function that allows one to create a copy of an existing reference to a block_device, rather than relying upon spinlock-protected access to bdget(). This is particularly useful in the case of the hibernation code, which needs to create a copy of a reference to the active swap device but doesn’t want to call bdget() directly because it might sleep. bdcopy() is safe to call from any context and corrects the PM/hibernate regressions that some have been seeing of late. Thanks to Alan for tracking that particular fix down.

Flexible array implementation. Dave Hansen posted an RFC proposal for a flexible array implementation. This allows the kernel to create and manage “flexible arrays”, which are formed by single pages containing pointers to second level array objects. The idea is to make it easier to create and manage dynamic arrays in-kernel, reducing the need for large contiguous memory allocations (or calls to vmalloc for situations in which kernel virtual memory can be used, and where there is vmalloc room to make the mapping(s)). The patch introduces 4 functions: alloc_flex_array(), free_flex_array(), and the accessors flex_array_put() and flex_array_get(). As Andrew Morton points out in his reply, that yields 2MB worth of objects on 64-bit platforms using a 4K page size, which he hopes is enough for likely callers.
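The 2MB figure can be checked with some quick arithmetic. What follows is only a rough sketch in Python, not the kernel code: it assumes one first-level page filled entirely with 8-byte pointers, each pointing at one 4K second-level page, and it ignores any bookkeeping the real structure may keep in the first page. The function names are made up for illustration.

```python
PAGE_SIZE = 4096     # assumed 4K page size
POINTER_SIZE = 8     # assumed 64-bit pointers


def flex_array_capacity_bytes(page_size=PAGE_SIZE, pointer_size=POINTER_SIZE):
    """Bytes reachable from one page of pointers to second-level pages."""
    pointers_per_page = page_size // pointer_size
    return pointers_per_page * page_size


def locate(index, element_size, page_size=PAGE_SIZE):
    """Map an element index to (second-level page number, slot within that page)."""
    elements_per_page = page_size // element_size
    return divmod(index, elements_per_page)


capacity = flex_array_capacity_bytes()
print(capacity // (1024 * 1024), "MiB")   # 512 pointers * 4096 bytes each
print(locate(1000, 16))                   # page and slot for element 1000 of 16-byte objects
```

With these assumptions, a lookup is two steps: pick the second-level page from the pointer page, then index into it. No allocation larger than a single page is ever needed.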

I/O bandwidth controller and BIO tracking. Ryo Tsuruta posted to let everyone know about the latest version of his patches implementing dm-ioband and blkio-cgroup, which implement I/O bandwidth limiting at the device-mapper level (per partition, per user, per process, per virtual machine, etc.) and I/O tracking using cgroups to identify the owners of any type of I/O. There is even tracing support and documentation. This patch series is just one of several (three, I believe) alternative I/O bandwidth limiting patchsets around, with Vivek Goyal’s work being one of the obvious alternatives. It remains unclear whether the different groups are actually going to meet at Kernel Summit or some other event to agree on a common solution.

Linux 2.6.27.27. There are some concerns about 2.6.27.27 and the introduction of a patch aimed at avoiding the use of a GCC compiler option named ‘-fwrapv’ that is reportedly buggy in gcc-4.1.x. Apparently, with this fix applied, some systems are failing to boot (gcc-4.2.4). The problem is that both -fwrapv and -fno-strict-overflow exhibit bugs depending upon which version of the compiler one has chosen. Linus Torvalds noted that it might be best to simply restrict -fwrapv to GCC 4.2.x and newer. But the problems get more complex as others have chimed in with differing issues (including an inability for Marc Dionne to build upstream kernels using ccache on rawhide), leading Linus to believe there are currently three different tools issues hurting people (the last one being a Debian/sid binutils package failure). Later, Linus began analyzing the assembly (or rather disassembly) of different kernel builds in an effort to determine why the compiler options were generating broken code.

Microsoft. It’s skiing season in hell this week, and season passes are available. Microsoft followed up to their initial patch postings with a confirmation that they plan to continue posting regular Hyper-V driver updates to Greg’s staging tree, and then on into the mainline kernel proper. The current work can be found in the linux-next tree (/drivers/staging/hv). All joking aside, this is great news. It publicly endorses the GPL and perhaps warms relations a little, although a patent promise would be far more useful. And for those thinking it’s simply April come early, Slashdot reports Microsoft also posted userspace GPL code this week.

Modules. Li Zefan posted a patch implementing tracepoints for module_load, module_free, module_get, module_put and module_request. He included sample output and received favorable feedback from Steven Rostedt (who is keen for Rusty Russell to comment as owner of the in-kernel module loader). Also today, Reinhard Tartler wondered if anyone really uses scripts/checkkconfigsymbols.sh to reconcile symbol requirements with config options (sometimes it is possible that a particular configuration requirement will be missed, and necessary module dependencies will not exist in the build, which obviously causes runtime problems). He pointed out that there is a growing tendency in the kernel for these dependencies to be missing, and that there are even several typos of configuration requirements (e.g. CONFIG_CPUMASK_OFFSTACK).

Percpu. Tejun Heo posted a (not signed-off-by yet) RFC patch removing the legacy per-cpu allocator on IA64 systems (making it dynamic), and then another obvious subsequent RFC that removes the legacy per-cpu allocator functions completely. These are not signed off because Tejun hasn’t been able to test this on actual hardware, but only on the simulator (which had a few problems), and even then only in one particular build configuration. He is awaiting further testing from those with IA64 systems before proceeding. Separately, Tejun (who has obviously been busy) posted a patchset entitled “implement and use sparse embedding first chunk allocator” that enables the per-cpu allocator to use bootmem allocated memory directly, even on NUMA.

Performance counters. Jason Baron posted a perf utility patch building upon Peter Zijlstra’s initial support for tracepoints in the performance counters tools. Jason’s patch adds a ‘perf list’ and ‘perf stat’ command, and makes use of debugfs to obtain this data. The use of debugfs is complicated by the fact that there are a variety of possible mount points for it on target systems (most kernel documentation was recently updated to refer to /sys/kernel/debug as the standard location but many – including this author – still steadfastly use /debugfs for the mountpoint out of sheer stubborn debugfs originalism), which necessitates Jason poking around in /proc/mounts. This is something Ray Lee thinks should be optimized so that perf will only do this in cases where the /sys/kernel/debug location is not the correct mountpoint, since some systems have a very large number of mountpoints, making this expensive. On a related note, Arjan van de Ven posted a patch (that he noted elsewhere was a really tricky issue to track down) entitled “avoid structure size confusion by using a fixed size”, correcting a compiler issue in which struct perf_header would vary in size from one compiled file to the next.
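Ray Lee’s suggestion boils down to “try the standard location first, and only scan /proc/mounts if that fails”. Here is a minimal sketch of that ordering in Python. It is not the perf code (which is C), and the names find_debugfs, debugfs_from_mounts, default_is_debugfs and read_mounts are hypothetical helpers invented for this illustration. The caller supplies the check for the default location and a function that returns the text of /proc/mounts.

```python
DEFAULT_DEBUGFS = "/sys/kernel/debug"


def debugfs_from_mounts(mounts_text):
    """Return the first debugfs mountpoint found in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line reads: device mountpoint fstype options dump pass
        if len(fields) >= 3 and fields[2] == "debugfs":
            return fields[1]
    return None


def find_debugfs(default_is_debugfs, read_mounts):
    """Prefer the standard location; fall back to the slower mounts scan."""
    if default_is_debugfs():
        return DEFAULT_DEBUGFS
    return debugfs_from_mounts(read_mounts())


sample = (
    "proc /proc proc rw 0 0\n"
    "sysfs /sys sysfs rw 0 0\n"
    "none /debugfs debugfs rw 0 0\n"
)
print(find_debugfs(lambda: False, lambda: sample))
```

Because read_mounts is only called when the default check fails, a system with thousands of mountpoints pays for the scan only when debugfs lives somewhere unusual.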

Taskstats. Nikanth Karthikesan posted an RFC patch series implementing a netlink based notification mechanism on fork (referred to within the kernel as clone), allowing one to track the creation of new tasks without having to constantly poll and walk /proc. Nikanth points out that this can also be used by utilities such as iotop, which gains a performance improvement. As he points out, the existing polling process won’t scale.

VFAT. Andrew Tridgell followed up to Pavel Machek’s rather terse comments about his previous math with a broken-down summary of the combinatorial likelihood of crashing a Windows system when presenting it with his modified VFAT patch. According to Andrew’s figures, the likelihood of a single collision in a maximally full directory containing 32767 files is about 0.0052 or 0.5% when using an exponential birthday approximation (for those who are not math inclined, refer to Wikipedia for a summary of the “Birthday Problem”, “Pigeon Hole Principle”, approximations for collisions, and related topics of interest to Computer Scientists). Even this doesn’t necessarily result in a bluescreen on Windows, since that only occurs when the Windows “fastfat” driver attempts to access two colliding files in quick succession. Andrew encourages others to check his math and makes a number of other comments surrounding VFAT that I won’t go into because I personally believe it best not to comment at all.
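For anyone who does want to check the math, the exponential birthday approximation is easy to play with. The Python below is a generic sketch of that approximation and not Andrew’s own calculation. The size of the space his patch draws values from is not given in this summary, so it is left as a parameter, and the function names are made up for illustration.

```python
import math


def collision_probability(items, space):
    """Approximate chance that at least two of `items` random picks
    from `space` equally likely values collide: 1 - exp(-n(n-1)/(2N))."""
    return 1.0 - math.exp(-items * (items - 1) / (2.0 * space))


def implied_space(items, probability):
    """Invert the approximation: the space size that would give
    `probability` of a collision among `items` picks."""
    return -items * (items - 1) / (2.0 * math.log(1.0 - probability))


# Sanity check against the classic birthday problem (23 people, 365 days).
print(collision_probability(23, 365))

# Space size implied by a 0.0052 collision chance across 32767 files.
print(implied_space(32767, 0.0052))
```

The second call only backs out what space size would be consistent with the quoted 0.0052 figure under this approximation; it says nothing about how the patch actually generates its values.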

In today’s miscellaneous items: a request to pull from the notification tree (Eric Paris) to handle some fallout from the fsnotify conversion (a generic framework which replaces the existing backends to both dnotify and inotify with a single universal notification mechanism), an RFC patch series containing clocksource cleanups (Martin Schwidefsky – including use of the expensive stop_machine context for clocksource switches), an optimisation hack (also from Martin Schwidefsky) that caches the next timer interrupt on CPU sleep when running on NOHZ systems, an ALSA update (Takashi Iwai), some HID fixes (Jiri Kosina), a new regulator_get_exclusive() API (Mark Brown), some /proc/kcore cleanups (Kamezawa Hiroyuki), a patch switching i8042 to dev_pm_ops from Dmitry Torokhov (aside: dev_pm_ops was covered at last week’s Linux Symposium in the PCI suspend and resume presentation), some input and driver core updates (Dmitry Torokhov – the latter making pm operations a const pointer since it shouldn’t be changed by module users), a patch from Joe Perches implementing separate sections for printk format strings, some informative comments from Thomas Gleixner concerning the correct way to handle threaded interrupts for hardware level triggered devices in which the interrupt generation cannot be easily disabled, and a suggestion from Nick Piggin that it might be nice to remove PG_reserved (the Page Table Entry bit) and replace it with a more useful PG_arch_2 bit that could be used for e.g. pfn_is_ram.

The latest kernel release is 2.6.31-rc3, which was released by Linus last week (a more recent -rc4 release exists as of this recording however).

Stephen Rothwell posted a linux-next tree for July 21st. Since Monday, a new ecryptfs tree has been added and the tree still fails to build in an allyesconfig build configuration on powerpc. The total sub-tree count rises to a new total of 133 trees in the latest compose, with the addition of ecryptfs.

That’s a summary of today’s Linux Kernel Mailing List traffic. For further information, visit www.kernel.org. I’m Jon Masters.
