Archive for March 18th, 2010

2010/03/07 Linux Kernel Podcast

March 18th, 2010 jcm

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20100307.mp3

For the weekend of March 7th, 2010, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: Console, DRM, ext4, integrating tools, sensors, split function and data sections, union mounts, and versioning.

Console. Eric W. Biederman posted an intuitive patch for /dev/console opening, effectively ensuring that it is always available even if the root filesystem has no /dev. “This effectively guarantees that there will be a device node, and it won’t be on a filesystem that we will ever unmount”. Al Viro replied “hell yeah”, and took the patch “with thanks”.

DRM. This week’s thread length of the week prize goes to a thread entitled, “drm request 3”, in which Dave Airlie tried to pull some patches into the 2.6.34 merge window. These contained, “[f]ixes for default y + CONFIG_STAGING + CONFIG_DRM_NOUVEAU enabled”. Linus wasn’t very happy when he booted with these patches (nouveau interface version 0.0.16) and saw an error message saying “[drm] wrong version, expecting 0.0.15”. This led to a rant about backwards compatibility, and about the fact that he hadn’t even been warned it would break existing user space (in his case, Fedora 12). Linus even found that the commit that introduced the breakage did so explicitly, but again noted, ‘why the hell wasn’t I made aware of it before-hand? Quite frankly, I probably wouldn’t have pulled it. We can’t just go around break people[s] setups. This driver is, like it or not, used by Fedora-12 (and probably other distros). It may say “staging”, but that doesn’t change the fact that it’s in production use by huge distributions. Flag days aren’t acceptable’. This led on to a thread in which Linus and others (including Jeff Garzik) noted that Fedora 12 was shipping this driver in “production” and so more should be done to ensure that the kernel could be tested on older systems, while others (Jesse Barnes among them) said the driver was all along a “use at your own risk” driver. Personally, this author solved the problem by using another graphics chipset a long time ago. Daniel Stone probably had the best solution, “fuck it, it’s Friday. To the pub”.

The DRM thread also deviated into a discussion of “Upstream first” as a distro policy, and then onto specific patches in other distributions that aren’t in upstream. For example, Ubuntu carrying AppArmor. That led on to yet another tangent in which James Morris felt he was being personally attacked for the lack of the patches being upstream. Ingo Molnar (and later, Linus, who seemed to share a similar viewpoint – that there needn’t be only one security answer) decided to weigh in, noting that it had been “a few reasonable months after the last big security flamewar”, and wanting to see a “rehash or fair summary of the pathname versus labels arguments” (referring to the fact that SELinux uses file labeling and complex rules, while AppArmor uses simple file paths). Ingo feels that pathnames are a “far more fitting abstraction to any ‘human based security process’ on Linux than ‘labels’”. Ingo called out that there was a lot of security research based on labels but essentially said none of that mattered due to the difficulty of practically using label based security. Quoting Ingo again, “[i]n other words: [I] see [SEL]inux’s main failure in that it somewhat blindly aims for a security model that is sees as the technical most secure, while not being intellectually open to the fact that we very likely _cannot know in advance_ which of the models will make Linux more secure in the long run”. It would seem Ingo would like AppArmor to be less of a “hostile competitor” and more of a “natural ally” to SELinux. The idea is that there can be two different security mechanisms for different use cases.

Ext4 performance concerns. Justin Piszcz had recently raised the issue of the relative performance of ext4 for “large” writes vs. XFS. Justin was seeing almost half the write throughput when using ext4 as opposed to XFS and was concerned. He was asked various questions, and the replies included a suggestion that he use a “nice” number of disks (e.g. 9 for the specific RAID case he was looking at), which made no difference. The thread seemed to dry up without any concrete conclusions other than that a performance issue exists and requires some further investigation using blktrace, etc.

Integrating tools. Ingo Molnar, in a thread entitled “Re: KVM usability”, made some remarks about the relative virtues of having “unified repositor[ies]” in which both the kernel and userspace tools are combined in one place, such as with the Performance Counters tools. Ingo believes that one reason why Apple can “consistently out-develop Linux” is “in part due to there not being a strict [C]hinese [W]all between the Apple kernel, libraries and applications – it’s one coherent project where everyone is well-connected to each piece”. This may be true, but it’s just as likely in this author’s opinion that Apple is benefiting from that, coupled with the fact that it owns every piece and can hand down edicts from on high about what every piece will do, and when. In any case, the thread is worth reading: it was surprisingly short given the potentially contentious comments that could have made great flamebait.

Sensors. Dima Zavin (Google) replied to Jean Delvare’s attempt to have the ALS (Ambient Light Sensors) subsystem pulled, saying that the kernel was on the road toward having one subsystem under drivers/ for ALS, one for Proximity sensors, one for Accelerometers, etc., all with similar interfaces, and that a better approach would be a single “sensors” subsystem. He offered to help work on just that. Jean was interested, but didn’t want to hold up having the ALS patches pulled, favoring reworking them later on. He was subsequently dismayed when Linus and others started asking why ALS wasn’t just using the input subsystem for events, saying that he didn’t care where the code went but that discussions had been ongoing for 5 months already and he didn’t want to hold things up for another 5 months when people decided to bring this up during the merge window rather than before. The conversation then took a tangent into different rate devices (some of these “sensors” can operate at many kHz, above what the “input” subsystem is intended for). Linus contended that these devices, just like joysticks, were input devices. The conversation appears to have stalled at this point without a resolution.

Split function and data sections. As some of you will know, various attempts have been made over the past year to add support for compiling the kernel with the GCC options “-ffunction-sections” and “-fdata-sections”. These cause the compiler to emit one ELF section for each function or data object, and make life very easy for optimization tools (that can remove whole sections) as well as kernel patching utilities such as Ksplice. Tim (Ksplice) Abbott was happy with the latest round of patches, though he did have some questions about the “rename kernel’s magic sections with compatibility with -ffunction-sections -fdata-sections” patch series, especially about where certain renames were being used. For example, he wondered aloud how renaming “.text.reset” to “.text..reset” would affect AVR32 systems, because he couldn’t see how the original “.text.reset” was being populated anyway (answer: it wasn’t). As Tim mentioned, he wanted input from Haavard Skinnemoen, who provided the comment on “.text.reset” amongst other feedback.
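To make the reason for the double-dot rename concrete, here is a minimal sketch in Python. It assumes only the GCC behaviour that a function foo compiled with -ffunction-sections lands in a section named .text.foo; the helper names function_section and could_clash_with_function are hypothetical and exist only for illustration. Because a C identifier cannot contain a dot, a hand-named section such as .text..reset can never be produced for any function, while .text.reset would clash with a function called reset.

```python
import re

# A C identifier: a letter or underscore, then letters, digits, or underscores.
C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def function_section(name: str) -> str:
    """Section name GCC uses for a function under -ffunction-sections."""
    return ".text." + name


def could_clash_with_function(section: str) -> bool:
    """True if some C function name would be placed in this exact section."""
    prefix = ".text."
    if not section.startswith(prefix):
        return False
    # Whatever follows ".text." must be a valid C identifier to clash.
    return C_IDENTIFIER.fullmatch(section[len(prefix):]) is not None


print(function_section("reset"))                  # section for a function named reset
print(could_clash_with_function(".text.reset"))   # old magic section name
print(could_clash_with_function(".text..reset"))  # renamed magic section name
```

The same reasoning would apply to data sections under -fdata-sections, where the prefix is a data section name rather than .text.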

Union mounts. Valerie Aurora posted version 1 of an RFC patch series (against Al Viro’s for-next tree) entitled, “Union mount core rewrite”. This, as it implies, is a complete rewrite of parts of the code implementing union mounts. Val has previously written about the goals and implementation of her work in various LWN articles. Separately, Val wondered aloud whether it was now possible to have multiple read-only layers in union mounts.

Versioning. Paul McKenney posted a patch placing the SHA1 git hash of the latest commit in the kernel version line on boot if available, or “[Not git tree]” in the case that a non-git tree was used for the build.
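The fallback logic described here can be sketched in a few lines of Python. This is only an illustration of the idea, not Paul's patch: the function names, the example base string, and the exact layout of the resulting line are assumptions. Only the “[Not git tree]” placeholder comes from the posting.

```python
from typing import Optional

NOT_GIT = "[Not git tree]"  # placeholder text mentioned in the posting


def commit_tag(git_sha1: Optional[str]) -> str:
    """Return the commit hash if the build had one, else the placeholder."""
    return git_sha1 if git_sha1 else NOT_GIT


def version_line(base: str, git_sha1: Optional[str]) -> str:
    """Append the commit tag to a version string (the layout is an assumption)."""
    return f"{base} {commit_tag(git_sha1)}"


# Hypothetical values, for illustration only.
print(version_line("Linux version 2.6.33", "0123456789abcdef0123456789abcdef01234567"))
print(version_line("Linux version 2.6.33", None))
```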

In today’s miscellaneous items:
Large numbers of git pull requests started to come in for 2.6.34 (including everything from core kernel to networking and sound), there were some further nested SVM patches from Joerg Roedel, a large number of KVM updates (including a lot of PowerPC bits, Microsoft Hyper-V patches, and some x86 emulator cleanup), a new “platform-drivers-x86” git tree reference was added to the MAINTAINERS file (as maintained by Matthew Garrett, who posted a pull request for the latest bits also), a new generic x86 “NMI Watchdog” built upon performance events from Don Zickus (by way of Ingo Molnar actually making the pull request for Don’s previously posted patches), version 3 of the memory controller groups dirty page limits patches from Andrea Righi, an affirmation from Andrew Morton that the “Linux Checkpoint-Restart” patches could be posted to LKML following 2.6.34-rc1 (Oren Laadan also mentioned how the patches will refuse to do a checkpoint if they believe they cannot do so safely, reporting this back to userspace), the latest “compat-wireless” tree for stable kernel (2.6.32) users that contains the latest 2.6.33 bits from Luis R. Rodriguez, version 3 of a patch series providing for 512KB readahead rather than 128KB from Fengguang Wu, various trivial and staging patches from Greg Kroah-Hartman (as an aside, Alan Stern raised some concerns about the way Greg’s scripts generate those patches), a request to pull the Ceph distributed file system client into 2.6.34 (along with various input about changes made since the 2.6.33 merge request) from Sage Weil, some Performance (perf) Counters “live mode” patches from Tom Zanussi that allow perf data to be directly processed as it is captured “without ever touching the disk”, some paravirt (PV) extension patches for HVM (Hybrid virtualization support) in Xen from Sheng Yang, and Ted Ts’o complained about dynamic device filesystems with initramfses in a mini-rant about how 2.6.33 could not boot with an LVM root on his Ubuntu 9.10 userspace. He added that, “of course, the initrfamfs environment is so crappy that there are no debugging aids — not even a working pager”.

In today’s announcements:

Git 1.7.0.2. Junio C Hamano announced the latest maintenance releases of Git, versions 1.7.0.{1,2}. The .2 release had a few minor patches on top of .1, including a fix for GIT_PAGER support. Whether or not it is technically an SCM, I will cease using that term in this podcast, following some feedback from listeners.

LTP. The Linux Test Project was released for February 2010. The latest release comes with a reminder that there “has been multiple chnges for building/installing the test suite after the recent changes in Makefile infrastructure”. This month’s release didn’t come with any corrupt script warnings.

Userspace RCU 0.4.2. Mathieu Desnoyers announced version 0.4.2 of his Userspace RCU “urcu” library. It includes some patches from Paolo Bonzini adding generic uatomic ops support for architectures not explicitly supported by liburcu, including (effectively free) support for IA64 and Alpha when using GCC versions 4.0-4.5, and a bugfix in urcu-bp, which is the “User-Space Tracing” version of the urcu library. Mathieu has asked me to point out that a patent exemption was made to cover use of RCU in LGPL code such as urcu, so my previous comments about GPL patent concerns were a little too severe.

The latest kernel release was 2.6.33.

Andrew Morton posted an mm-of-the-moment (mmotm) for 2010-03-04-18-05.

That’s a summary of today’s Linux Kernel Mailing List traffic. For further information, visit www.kernel.org. I’m Jon Masters.

Categories: episodes

2010/02/28 Linux Kernel Podcast

March 18th, 2010 jcm

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20100228.mp3

For the weekend of February 28th 2010, I’m Jon Masters with a summary of the week’s LKML traffic.

In today’s issue: Linux 2.6.33, ACPI, Cgroups, Checkpoint and Restart, OF Device Tree, Firmware, and x86 embedded.

Linux 2.6.33. Linus Torvalds announced the final release of 2.6.33 on Wednesday February 24th at 12:06pm Best Coast Time (PST). The final release includes a relatively small number of final fixes on top of rc8. As Linus says, the most notable thing may be the Nouveau integration and modesetting support. Others may notice the mainlining of DRBD and the fact that the AS IO scheduler is now gone (“since keeping it around and just causing confusion seemed to not be worth it any more. You’re supposed to use CFQ instead”). Daniel Walker asked Linus whether he still planned to try a one week merge window this time, to which Linus said, “No. But I might do a ten-to-twelve day thing or something like that – just to make sure that anybody who tries to game the system and send their merge request late will get summarily ignored. So I’m going to stop being so predictable that people can tell that exactly two weeks after the last release is where the merge window closes, and if people want to make sure their stuff merged, I had better have a merge request in my inbox earlier than thirteen days after the release.” The pull requests started pretty much immediately, and with the usual vigor. Separately, Con Kolivas announced 2.6.33-ck1, which includes his BFS scheduler and various other “desktop” focused bits.

ACPI. Rafael J. Wysocki posted an RFC patch concerned with removing race conditions from ACPI event handlers. The first race concerns the execution of handlers while they are being removed, the second is a locking issue.

Cgroups. Andrea Righi posted an intriguing RFC patch series intended to provide per-cgroup dirty page limits. The idea is that the maximum number of dirty pages a cgroup is allowed to have can be limited, and if a cgroup exceeds this count, it will be forced to perform write-out immediately.
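As a rough illustration of that policy, here is a toy model in Python. It is a sketch under stated assumptions and says nothing about how Andrea's patches are implemented: the class name, its fields, and the choice to write back exactly the excess over the limit are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CgroupDirtyAccount:
    """Toy per-cgroup accounting of dirty pages against a fixed limit."""

    dirty_limit: int       # maximum dirty pages this group may hold
    dirty_pages: int = 0   # dirty pages currently charged to the group

    def dirty(self, pages: int) -> int:
        """Charge newly dirtied pages and return how many must be written
        out immediately to get back under the limit (assumed: the excess)."""
        self.dirty_pages += pages
        excess = max(0, self.dirty_pages - self.dirty_limit)
        # Model the forced write-out as completing straight away.
        self.dirty_pages -= excess
        return excess


group = CgroupDirtyAccount(dirty_limit=100)
print(group.dirty(60))  # under the limit, nothing is forced out
print(group.dirty(60))  # over the limit, the excess is written out
```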

Checkpoint and restart. Oren Laadan posted version 19 of his “Linux Checkpoint-Restart” patchset. As a reminder, these patches are intended to allow systems to handle failures by taking whole system checkpoints and restarting all activity from that point in the event of failure. The latest patchset is intended to address previous concerns from Andrew Morton and others, and is apparently able to checkpoint and restart both screen and vnc sessions, and support live migration of network servers between hosts. The project has a checklist of TODOs on its wiki: http://ckpt.wiki.kernel.org/.

OF Device Tree. Grant Likely asked Linus to pull in his OF device tree rework for 2.6.34. Grant has recently been working on ARM support, in addition to the PowerPC, Microblaze, and SPARC changes covered in this pull. Hopefully, OF device tree emulation will finally provide one mechanism for supplying data to the kernel that can be common across many different architectures, in addition to those that do “real” OpenFirmware in the vendor firmware.

Firmware. There was some discussion about kernel firmware versioning, and whether kernel firmware should be wrapped in a container format making it more suited to SO library style versioning. This happened in response to the folks behind the open sourcing of the Atheros WiFi firmware seeking advice on the best way to handle compatible and incompatible versions. David Woodhouse has advocated for the use of more library-like versioning, but was not a big fan of introducing the complexity of such wrappers. In the end it was decided that the kernel developer maintained linux-firmware package should provide firmware files of the form foo-$(API). Those wanting a sub-versioned file like foo-$(API)-$(VAR) could provide one if they so wish.
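The naming convention that came out of this discussion can be sketched as follows, in Python. The function name and the lookup order (try the sub-versioned file first, then fall back to the API-only file) are assumptions made for illustration; the discussion as summarized here only fixes the foo-$(API) and foo-$(API)-$(VAR) file name forms.

```python
from typing import List, Optional


def firmware_candidates(name: str, api: int, variant: Optional[str] = None) -> List[str]:
    """List firmware file names to try, most specific first.

    name    -- firmware base name, e.g. "foo"
    api     -- driver/firmware API version, the $(API) part
    variant -- optional sub-version, the $(VAR) part
    """
    candidates = []
    if variant is not None:
        # Optional sub-versioned file: foo-$(API)-$(VAR)
        candidates.append(f"{name}-{api}-{variant}")
    # The form linux-firmware is expected to provide: foo-$(API)
    candidates.append(f"{name}-{api}")
    return candidates


print(firmware_candidates("foo", 2))
print(firmware_candidates("foo", 2, "1"))
```

An incompatible firmware change would then bump the API number and yield a new file name, so an older driver keeps asking for, and loading, the file it understands.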

x86 embedded. Graeme Russ posted a very detailed and well reasoned description of his embedded x86 port, which is not in any way based upon PC hardware, in which he uses U-Boot to transition to 32-bit Protected Mode and directly calls the kernel’s “32-bit BOOT PROTOCOL” described in Documentation/x86/boot.txt. He was having some issues handling kernel relocation, though, which turned out to be due to differences between the documentation of the bzImage format and the current reality. Peter Anvin was his usual very helpful self.

In today’s miscellaneous items: A fix for SPARC32 from Rob Landley (apparently, SPARC32 has been broken since 2.6.28, which isn’t surprising since this author and most other Linux SPARC users seem to be running SPARC64 kernels), various debugging from Thomas Gleixner and John Kacur on the recent 2.6.33 RT patch, version 6 of a patch series intended to add lockdep-based diagnostics to rcu_dereference() from Paul McKenney, a series of PPS implementation patches from Rodolfo Giometti (useful for those needing accurate time sources on a serial line), a patch to increase readahead size to a default of 512K from Fengguang Wu (the previous default was 128K), a bunch of s390 updates for 2.6.33 final from Martin Schwidefsky (including kernel image compression “finally…after only 10 years”), some patches intended to document the rfkill sysfs ABI from Florian Mickler, some more nested SVM (virtualization within virtualization on AMD compatible systems) from Joerg Roedel intended to aid running Microsoft Hyper-V with nested SVM (which doesn’t quite work yet even with these according to Joerg), a number of rather cool gdb and early debug updates from Jason Wessel (who has now split kdb and early debug out into two separate trees), version 4 of the “concurrency managed workqueue” from Tejun Heo, a discussion about order 1 allocation failures started by Frans Pop (the failures were under GFP_ATOMIC, but Frans felt that they were particularly ugly given plenty of cache was available for reclaim), a proposal from David Howells to remove EXPERIMENTAL from NFS_FSCACHE in order that it could be compiled into the standard Ubuntu kernel (since, as he says, “As Arjan van de Ven pointed out…the EXPERIMENTAL flag doesn’t mean that much any more”), and a lengthy discussion of linux-next “requirements” that is worth reading, if you have the time.

In today’s announcements:

iproute2. Stephen Hemminger announced release 2.6.33 of the iproute2 utilities that “includes bug fixes and support for all the new features in kernel 2.6.33. This integrates a number of minor bug fixes from Debian aswell”. The update is available at http://devresources.linux-foundation.org/.

RT 2.6.33-rt4. Thomas Gleixner announced version 2.6.33-rt{2,3,4} of the RT kernel patchset. This updates to Linus’ latest tree and includes a number of fixes to bugs reported by John Kacur and others. It is available from the usual location: http://www.kernel.org/pub/linux/kernel/projects/rt/ Thomas noted that “rt/2.6.33 branch is now stabilization only. The rt/head branch will follow linus tree from now on, so it will inherit all (mis)features which come in the merge window”. Separately, John Stultz announced that he had forward ported Nick Piggin’s VFS scalability patches to 2.6.33-rc8-rt2, and that it applies to 2.6.33 without any collisions. He requested feedback as he had yet to do any serious stress testing with the patchset.

The latest kernel release was 2.6.33.

Greg Kroah-Hartman released an updated stable Linux 2.6.32.9.

Finally today, Mikael Abrahamsson suggested that some TLC be given to the Wikipedia article on the Linux kernel as it “doesn’t even mention the new -rc system” (in the “development model” section of the article). He wondered if anyone who knew exactly what was going on could write up the new world order on that wiki page for the rest of the world to see. That does not seem to have happened as of this writing.

That’s a summary of today’s Linux Kernel Mailing List traffic. For further information, visit www.kernel.org. I’m Jon Masters.

Categories: episodes