Archive for November, 2009

2009/11/12 Linux Kernel Podcast

November 26th, 2009 jcm No comments

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20091112.mp3

[NOTE] Last week was another 100+ hour week of vendor kernel excitement, and so the podcast suffered as a result. I’m catching up over the US Thanksgiving Day holiday, so expect us to be caught up by the end of November, or thereabouts.

For Thursday, November 12th, 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: Breaking strings, cputime, ftrace, KVM, and XFS.

Breaking strings. Ingo Molnar (in replying to a “percpu fixes for 2.6.32-rc6” thread) brought up the issue of breaking strings mid-sentence. He favors re-engineering code such that strings are – in general – on a single source line and more easily greppable (using ‘git grep’, for example). Tejun saw Ingo’s point, but suggested that a more scalable and longer term fix would be to teach grepping tools to understand strings split in such a fashion. This led on to the comment of the day for today, from Oliver Neukum: “There’s a point where following style guidelines turns into a fetish”. Dear goodness.
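
As a rough illustration of the second idea (a grep that understands split strings), here is a minimal sketch in Python. It is not any existing tool: the function names and the sample printk line are made up for this example, and the regular expression is deliberately naive (it ignores comments and character literals).

```python
import re

# Two adjacent C string literals separated only by whitespace or newlines,
# e.g.  "foo "\n    "bar". Escaped characters inside the literals are allowed.
_ADJACENT = re.compile(r'"((?:[^"\\\n]|\\.)*)"\s+"((?:[^"\\\n]|\\.)*)"')

def join_split_strings(source: str) -> str:
    """Merge adjacent string literals so a phrase lands on a single line."""
    previous = None
    while previous != source:          # repeat until no more pairs can be merged
        previous = source
        source = _ADJACENT.sub(r'"\1\2"', source)
    return source

def grep_joined(source: str, phrase: str) -> bool:
    """Search for a phrase after joining split literals (hypothetical helper)."""
    return phrase in join_split_strings(source)

# Hypothetical source line with a message broken across two lines.
sample = 'printk(KERN_ERR "percpu: failed to "\n       "allocate chunk\\n");'
print(grep_joined(sample, "failed to allocate chunk"))
```

A plain substring search on the sample fails because the phrase straddles the line break; the joined form finds it.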

cputime. Hidetoshi Seto noted that a recent commit changed the return types of task_stime() and task_utime() to (the more fine-grained) cputime_t, but without making the corresponding changes to casts of the return value elsewhere, which affected the granularity of the results at those call sites. He posted a patch.

Ftrace. Steven Rostedt has been busy profiling his ftrace infrastructure, looking for issues in the recording of individual entries into the per-cpu managed ring_buffer implementation. He has found that the timestamping feature causes the highest single overhead component in profile runs and is working on moving some of the timestamp processing to the read side of traces.
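
To make the write-side versus read-side trade-off concrete, here is a small Python sketch. It is only an analogy and not the ring_buffer code: the class, its methods, and the fixed ticks-per-microsecond rate are assumptions made for illustration. The writer stores the raw counter value untouched, and the conversion to nanoseconds happens only when the trace is read.

```python
class TraceBuffer:
    """Toy trace buffer: cheap writes, timestamp conversion deferred to readers."""

    def __init__(self, ticks_per_us: int):
        self.ticks_per_us = ticks_per_us   # assumed constant counter rate
        self._entries = []

    def record(self, raw_ticks: int, event: str) -> None:
        # Hot path: store the raw counter value, do no arithmetic here.
        self._entries.append((raw_ticks, event))

    def read(self):
        # Cold path: convert raw ticks to nanoseconds while reading.
        return [(raw * 1000 // self.ticks_per_us, event)
                for raw, event in self._entries]

buf = TraceBuffer(ticks_per_us=2)
buf.record(10, "sched_switch")
print(buf.read())
```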

KVM. The previous discussion on quirks for the AMD Geode CPU (causing it to become viewed as an i686-like processor) had turned into a discussion of hypervisor technology [this happens often], and in particular a bug that Willy Tarreau and others have recently discovered in KVM’s instruction interpreter. It turns out that feeding it arbitrarily long instructions with many “66” (data size prefix) codes prepended will cause KVM to prevent other tasks from running and can be a form of denial of service. Of course, more modern ISAs use fixed width instructions and don’t suffer from these problems, but that isn’t a reason not to address the issue when handling the venerable x86.
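
One way to see why the problem is avoidable: x86 limits a whole instruction to 15 bytes, so a decoder never needs to scan more than that many prefix bytes. The Python sketch below shows a bounded prefix scan. It is not KVM’s emulator and not the fix that was discussed; the function name and the choice to raise ValueError are assumptions for illustration.

```python
MAX_INSN_LEN = 15  # architectural x86 limit on total instruction length, in bytes

# Legacy prefix bytes: operand size (0x66), address size (0x67), LOCK,
# REPNE/REP, and the segment overrides.
LEGACY_PREFIXES = {0x66, 0x67, 0xF0, 0xF2, 0xF3,
                   0x2E, 0x36, 0x3E, 0x26, 0x64, 0x65}

def count_prefixes(code: bytes) -> int:
    """Count leading prefix bytes, giving up once no room is left for an opcode."""
    n = 0
    while n < len(code) and code[n] in LEGACY_PREFIXES:
        n += 1
        if n >= MAX_INSN_LEN:
            # 15 prefix bytes leave no space for an opcode within the limit.
            raise ValueError("instruction exceeds 15 bytes")
    return n

print(count_prefixes(bytes([0x66, 0x66, 0x90])))
```

However long the input is, the loop runs at most 15 times, so a hostile guest cannot make the scan arbitrarily expensive.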

XFS. Christoph Hellwig posted an “XFS status update for October 2009”, in which he mentioned that the 2.6.32 merge window had opened up with a major XFS update that included refactoring the inode allocator (performance, etc.), and also noted that a healthy amount of work has recently gone into xfsprogs.

In today’s announcements: Linux 2.6.32-rc7. Linus Torvalds announced the latest release of the 2.6 series kernel at 4:57pm Best Coast Time (PST). In his announcement, Linus notes that he had held off releasing the -rc while Rafael J. Wysocki tracked down an “ugly-looking” resume regression. The changes were otherwise fairly minimal and in line with this stage in rc.

The latest kernel release was 2.6.32-rc7.

Stephen Rothwell posted a linux-next tree for November 12th. Since Wednesday, there were issues with Linus’ tree, the cpufreq, i7core_edac, and sysctl trees, while the net, tip, and usb trees lost conflicts. Rusty’s “rr” tree lost a build failure but exposed problems with the powerpc and sparc trees. The total number of sub-trees now stands at 148 in the latest compose.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

Catching up

November 23rd, 2009 jcm No comments

A set of new podcasts is on its way. Unfortunately, I had a 100+ hour work week last week and the podcasting suffered as a result.

Categories: Uncategorized Tags:

2009/11/11 Linux Kernel Podcast

November 12th, 2009 jcm No comments

AUDIO: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20091111.mp3

For Veterans Day (November 11th) 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: Floppy, LZO, Resume, and Wakeup.

Floppy. Just when you thought nobody used floppy disks any more. Stephen Hemminger posted to let everyone know that “mount -o sync” support has a regression for floppy disk use cases in kernel 2.6.31. Some time between 2.6.30 and 2.6.31-rc1, the anticipated behavior of writes immediately completing and blocking until they hit the ext2-formatted disk broke and a copy followed by disk removal followed by unmount results in errors. This potentially may affect USB thumbdrive users, so has some wider relevance.

LZO. Albin Tonnerre posted version 3 of a patch series implementing generic LZO compression for kernel binary images on x86 and ARM. The patches include support both for building and using these images, and their initramfses.

Resume. Rafael J. Wysocki and Linus Torvalds chimed in on Rafael’s previous posting concerning broken resume-from-suspend. After applying a patch intended to help diagnose the problem, Rafael reported that errors were being generated by btusb_waker, which Linus said matched his “observation that only a few [Bluetooth] drivers seem to use workqueues, and btusb_disconnect() isn’t doing any work cancel”. Marcel Holtmann and others began discussing solutions.

Wakeup. Yinghai Lu posted version 2 of a patch intended to make doubly sure that ACPI wakeup code is located below 1M in physical memory on x86. The patch attempts to find a suitable region in the BIOS/EFI/firmware specified “e820” area (a table of memory mappings on such systems) and reserve it early on.
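
The general idea of searching a memory map for usable space below 1M can be sketched as follows. This is a first-fit illustration in Python, not the logic of the posted patch: the function name, the page alignment, and the sample map are all assumptions. The only detail taken from e820 itself is that type 1 denotes usable RAM.

```python
from typing import List, Optional, Tuple

E820_RAM = 1        # e820 type code for usable RAM
ONE_MIB = 0x100000

def find_low_region(e820: List[Tuple[int, int, int]],
                    size: int, align: int = 0x1000) -> Optional[int]:
    """Return the first aligned address below 1 MiB with `size` bytes of RAM."""
    for start, length, kind in sorted(e820):
        if kind != E820_RAM:
            continue
        end = min(start + length, ONE_MIB)          # clip the entry at 1 MiB
        base = (start + align - 1) & ~(align - 1)   # round start up to alignment
        if base + size <= end:
            return base
    return None

# Hypothetical map of (start, length, type) entries.
sample_map = [
    (0x1000, 0x9EC00, 1),         # low RAM
    (0x9FC00, 0x400, 2),          # reserved
    (0x100000, 0x7FF00000, 1),    # RAM above 1 MiB
]
print(hex(find_low_region(sample_map, 0x4000)))
```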

The latest kernel release is 2.6.32-rc6.

Stephen Rothwell posted a linux-next tree for November 11th. Since Tuesday, the i2c tree lost a conflict, the new tree gained a conflict, the wireless tree lost a build failure, the rr tree gained a build failure, the pcmcia tree gained a conflict, the tip tree gained a build failure, the percpu tree gained a conflict, and the usb tree also gained a conflict. The total sub-tree count is now at 148 trees, since the previous issues with pulling trees resolved.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2009/11/10 Linux Kernel Podcast

November 12th, 2009 jcm No comments

AUDIO: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20091110.mp3

For Tuesday, November 10th, 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: AppArmor, Changing task UIDs, SECURITY_FILE_CAPABILITIES, and Stable tags and git workflow.

AppArmor. John Johansen posted version 3 of a 12 part patch series intended to re-implement the AppArmor security module (which was previously maintained out of tree by Novell, until it wasn’t, then seemed to die shortly after Canonical began to support it, and now has returned in a new form in a posting from John, who is a Canonical engineer) upon the security_path hooks instead of the previous VFS hack. AppArmor is a path-based alternative to SELinux that is sometimes seen as being less complicated to set up, although this is debated. In any case, these patches seem more supportable for upstream inclusion.

Changing task UIDs. Enrico Weigelt, who is working on plan9 patches, inquired as to the best way to implement plan9-style support for changing the UID of running tasks, perhaps through a new /proc entry. He then proceeded to post various replies to other threads he had not previously been involved with – amongst other things criticizing hald and dbus design, and espousing the virtues of plan9 (if only it had more users to sell us on its features?).

SECURITY_FILE_CAPABILITIES. Serge E. Hallyn posted suggesting that the Kconfig option SECURITY_FILE_CAPABILITIES be removed, specifically invalidating the case of SECURITY_FILE_CAPABILITIES=n, and meaning that such capabilities would always be enabled unless the user specified no_file_caps on system boot. The reason behind this suggestion stems from an apparent misunderstanding amongst a growing number of application developers that such support is always present, leading Serge to wonder if it might as well just be by now.

Stable tags and git workflow. Ingo Molnar posted an RFC concerning stable tree git commit workflow. He noted that previously, he would have to email (cherry pick) the specific pre-requisite dependencies for any stable patch forwarded to the stable team (or wait for an email when things didn’t apply to the stable tree), but felt that this could be optimized. So, Ingo has begun adding comments on “CC” lines in the patch indicating additional commits that should be included, e.g. “# .32.x: : ”. These commits are added to a new “-stable” tag on Ingo’s -tip tree. He seeks comments.

Finally today, Chris Friesen had asked about correct use of IANA-registered ports on systems running sunrpc. Specifically, the RPC implementation as used by NFS can make use of ports that are reserved for other services, if a range has not been set aside ahead of time (and even then, it’s not optimal if you really want to run every service). But Trond Myklebust asked the obvious: “The people who are trying to run absolutely all IANA registered services on a single Linux machine that is also trying to run as an NFS client may have a problem, but then again, how many setups do you know who are trying to do that?”. The answer, one assumes, is less than one.

In today’s announcements: 2.6.31.6-rt19. Thomas Gleixner announced preempt-rt patch series release number rt19 for the 2.6.31.6 stable kernel. This was mostly a forward port to the latest stable tree, but also contains a missing pre-emption point in ksoftirqd. The patches are available in the usual locations, amongst them: http://www.kernel.org/pub/linux/kernel/projects/rt.

The latest kernel release is 2.6.32-rc6.

Stephen Rothwell posted a linux-next tree for November 10th. Since Monday, there was a new sysctl tree from Eric W. Biederman that contains only the generic compat_sys_sysctl patches now that binary sysctls are going away. The net tree lost both a conflict and build failure, the wireless tree still has a build failure, and the trivial tree lost a conflict. Stephen reports the sub-tree count at 146, but that is incongruent with the new tree.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2009/11/09 Linux Kernel Podcast

November 12th, 2009 jcm No comments

AUDIO: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20091109_corrected.mp3

For Monday, November 9th, 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: CFS, Cisco VPN, Fsck, Resume, and Too Many Signals.

CORRECTED: Indeed, I screwed this up by mixing up two patches. The following is about CFS task limit scheduling, not CFQ, patches for which I was reading about at the time also.

CFS Hard Limits. Bharata B. Rao posted version 3 of his CFS Hard Limits patch series. This is intended to allow for configurable hard limits on CPU used by task groups.

Cisco VPN. Mariusz Smykula, noting that this was “not yours problem” posted to let everyone know that kernels after 2.6.29 seem to break support for the proprietary Cisco VPN client, apparently needed on some “certified” systems that by implication cannot run vpnc or similar. The posting included a variety of links to users discussing the issues, though it does seem unlikely that the kernel community will rush to help Cisco with proprietary software.

Fsck. Ted Ts’o pointed out (in a thread entitled “document conditions when reliable operation is possible”) that “as the ext3 authors have stated many times over the years, you still need to run fsck periodically anyway.” This led David Lang to question where that documentation was, to which Ted replied that it was in the LKML archives. Apparently, the lack of documentation that explicitly mentions this was a contributing factor in “SUSE11-or-so” ceasing to perform periodic fscks on its own because Pavel Machek could not find sufficient documentation justifying this when the decision was made.

Resume. Rafael J. Wysocki posted a request for help diagnosing a problem with the suspend and resume code in 2.6.32-rc. For several days, he has been trying to debug resume problems (that obviously might be suspend problems) on his Toshiba Portege R500. Apparently, it seems to be caused by leakage of preempt_count in the events kernel thread, but Rafael has never been able to capture a full oops message, so that is based only upon some detective work performed using gdb and a partial trace output. He did find a commit (from himself) that upon removal would make the issue unreproducible, but he believes that commit (preparing for early/late parts to resume) simply exercises code paths that make the problem more easily triggerable. He also found an earlier commit in which the leak led to a warning (run_workqueue) that didn’t kill the box, but might be responsible for the hard lockup seen later on. Later, he found and posted a full trace, stuck in run_workqueue.

Too many signals. Naohiro Ooiwa posted a patch to the handling of the print-fatal-signals kernel boot parameter such that sysadmins will receive a warning when RLIMIT_SIGPENDING is exceeded and can choose to enable the additional logging facility to diagnose what is really going on.

Finally today, are you feeling motivated? Mark Pith announced that his research team (at the University of Amsterdam) were researching the “motivation factors of Open Source software programmers”. He would like you to complete a short survey that won’t exceed 15 minutes in length. The link to the survey is: http://bit.ly/Survey_Developers_Motivation.

In today’s announcements: Linux 2.6.27.39 and Linux 2.6.31.6. Greg Kroah-Hartman announced the release of kernels 2.6.27.39 and 2.6.31.6. These were in review over the weekend.

LTTng 0.167. Mathieu Desnoyers announced the latest LTTng patch for 2.6.31.6, encouraging all users to upgrade to the latest .31 series kernel since it contains security fixes.

The latest kernel release is 2.6.32-rc6.

Stephen Rothwell posted a linux-next tree for November 9th. Since Friday, several trees are feeling less “angry” (they’re always “in conflict” you see, according to Katherine). The sparc tree lost its build failure, the net tree lost a conflict but gained another for which Stephen applied a patch. The wireless, pcmcia, trivial and staging trees also gained similar conflicts. The total number of subtrees in linux-next remained steady at 146 trees.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2009/11/08 Linux Kernel Podcast

November 9th, 2009 jcm No comments

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20091108.mp3

For the weekend of November 8th, 2009, I’m Jon Masters with a summary of the weekend’s LKML traffic.

In today’s issue: AMD Geode, Ftrace, IO bandwidth control, modules, nconfig, per-cpu mm counters, regressions, and sysctls.

AMD Geode. Geode is a (relatively) low power SoC-type x86-compatible CPU from AMD that many have heard of due to its use by the OLPC XO laptop project. Although the chip is now discontinued, there are a number of users, and Matteo Croce noted that although the kernel has always treated it as a 586-class CPU, the Geode is “technically an i686” processor. Given a few quirks (it lacks the “long NOP” or “NOPL” instruction, which can be emulated instead), it can be made to run as an i686 processor. Debate centered around whether it was “really worth it”, as Peter Anvin put it.

Ftrace. Michal Simek (who has been working on Ftrace support for Microblaze for some time) posted some example output and asked a number of questions of Steven Rostedt in relation to the implementation of the mcount function that is necessary for Ftrace to function correctly. Mcount is a function usually provided by GCC for applications that Steven intentionally replaces when compiling the kernel with profiling support such that he can hook into mcount and capture various information about functions as they are called.

IO Bandwidth Control. Vivek Goyal (who has an incredible amount of patience) posted an RFC in regard to the upcoming 2.6.33 merge window. Vivek is working on support for bandwidth limiting of IO and notes that recent CFQ changes actually add another layer of grouping of IO – this time within the CFQ scheduler – assigning IO based on the workload type (sync-idle, sync-noidle, and async IO). The question is whether to do bandwidth control at the outer level, or within each of these three workload type groups.

Modules. Rusty Russell noted that he has now applied Alan Jenkins’ whole series of patches to improve module loading speed through pre-sorting the symbol table and using a binary search on module load.

nconfig. Nir Tzachar posted version 5 of a “menuconfig” replacement, written using the most modern versions of the ncurses interface toolkit. The patch isn’t huge, and comes with documentation, so it might well be a candidate, if the kernel developers consider a replacement is necessary.

per-cpu mm counters. Kamezawa Hiroyuki followed up to Christoph Lameter’s previous posting of a new per-cpu array implementation for various counters currently living within the mm_struct. His concern was the overhead incurred in compiling summary statistics when userspace attempted to read the data, as is done by a variety of utilities, including both top and ps.

Regressions. Yanmin Zhang raised the issue of a 5% performance regression between 2.6.31 and the current 2.6.32-rc release. Much of this is attributed to recent scheduler changes, but not all. Mike Galbraith noted that there were some locking issues that were being fixed up that might have skewed benchmarks overly negatively, and that a fix was in the pipeline. Ingo Molnar wanted some more information about the precise setup Yanmin was using. Certainly, recent developments on Performance Events and “perf bench” will help.

Sysctls. Eric W. Biederman is currently working on various cleanups. Not content merely to have cleaned up VFS cache handling for sysfs, he decided at the same time to also take on binary sysctl support in the kernel. His 23 part patch series on that front will remove existing sysctl handling from all over the kernel tree and instead implement sys_sysctl as a wrapper over /proc/sys. Users shouldn’t notice, but kernel developers should take note.
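
The wrapper idea amounts to translating a binary sysctl name (a vector of integers) into a path under /proc/sys and then doing ordinary file I/O on it. The Python sketch below illustrates only that translation step. It is not the code from the patch series: the table layout and function name are invented here, and the numeric values are a tiny subset quoted from memory of linux/sysctl.h, so treat them as illustrative.

```python
# Illustrative subset of a binary-name table. Each numeric component maps to
# (procfs name, nested table), with None marking a leaf file.
BIN_TABLE = {
    1: ("kernel", {                 # CTL_KERN (value assumed)
        1: ("ostype", None),        # KERN_OSTYPE (value assumed)
        2: ("osrelease", None),     # KERN_OSRELEASE (value assumed)
    }),
}

def binary_name_to_path(name) -> str:
    """Translate a numeric sysctl name vector into a /proc/sys path."""
    parts, table = [], BIN_TABLE
    for component in name:
        if table is None or component not in table:
            raise KeyError(f"unknown sysctl component {component}")
        label, table = table[component]
        parts.append(label)
    if table is not None:
        raise KeyError("name does not end at a leaf entry")
    return "/proc/sys/" + "/".join(parts)

print(binary_name_to_path([1, 2]))
```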

Finally today, in replying to the AMD Geode debate over whether it was worth promoting Geode to be i686 (albeit with quirks), Alan Cox noted that checkpatch minor formatting warning output really is not intended to be useful until a patch has a serious likelihood of being accepted. i.e. while things are under development, it’s ok to take a chill pill and relax a little.

In today’s announcements: Linux 2.4.37.7. Willy Tarreau announced the latest release of the venerable 2.4 series kernel. Specifically, this latest release includes a number of potential NULL pointer dereference bug fixes that all users should consider as potential issues, even if they have set mmap_min_addr to disallow the kernel from mapping the NULL page to userspace.

2.6.31.5-rt17. Thomas Gleixner announced version 2.6.31.5-rt17 of the preempt-rt kernel patch. The latest release is forward ported to 2.6.31.5, has some scheduler improvements, security fixes, and some tracer enhancements also. It is available from the usual places, including http://www.kernel.org/pub/linux/kernel/projects/rt.

Luis R. Rodriguez announced the stable compat-wireless tree for 2.6.32-rc6. This is a wireless tree backported to older kernels and allows users to make use of newer wireless drivers on older systems.

The latest kernel release is 2.6.32-rc6.

Greg Kroah-Hartman posted stable series review patches for 2.6.27.39 and 2.6.31.6. The former had 99 patches, while the latter had 16. And as usual, responses were requested by Sunday afternoon.

Stephen Rothwell posted a linux-next tree for November 6th. Since Thursday, the PowerPC KVM fix went away at last, the sparc tree had a build failure for which he applied a patch, and the kvm and net trees had conflicts. The total sub-tree count remains steady at 146 trees in linux-next.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2009/11/05 Linux Kernel Podcast

November 6th, 2009 jcm No comments

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20091105.mp3

For Thursday, November 5th, 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: CVE-2009-2584, Generic per-cpu counter arrays, MM locking, page types, performance events, and the scheduler.

CVE-2009-2584. A security issue was recently found in a procfs function contained within the sgi-gru driver. It involved unsafe use of strncpy_from_user. Various people posted fix suggestions for it, while Linus noted that most of the logic in the offending function (options_write) was “utter sh*t as far as I can tell”. He posted a couple of entirely untested patches (Linus style) for others to take a look at. Meanwhile, it was also noted that few people had the hardware, which helped to mitigate the issue.

Generic per-cpu counter arrays. Kamezawa Hiroyuki, noting that the patch had been “on my queue for a month”, posted an RFC patch intended to add support for generic percpu counter arrays. His patch uses the recent dynamic percpu support to create arrays of per-cpu data on the fly, using some macros such as DEFINE_COUNTER_ARRAY, and functions such as counter_array_init, and counter_array_add to manage entries being added to an existing array.

MM locking. Christoph Lameter posted an RFC MM patch implementing a variety of “accessors for mm locking”. Essentially, the idea is to abstract and wrap up use of mmap_sem such that it could eventually be ripped out and replaced without having to touch a lot of MM code once again. Christoph notes that the patch is “currently incomplete” but it does at least build.

Page Types. Fengguang Wu posted a followup to his previous patch enabling one to specify new page type information on the command line of the “page-types” utility (used to decode various VM data) with an example of how one could educate page-types about new types of page flags on the command line.

Performance Events. Hitoshi Mitake posted version 5 of a 7 part patch series implementing the “perf bench” command, and incorporating Rusty Russell’s original “hackbench” scheduler benchmark code.

Scheduler. Lai Jiangshan noted that a previous patch from Mike Galbraith didn’t seem to be mitigating the problems with the scheduler running tasks on the wrong CPU. In his case, the built-in kernel thread named “events” for CPU 1 was in fact shown (by using Ftrace) to be running on CPU0. Mike noted that the problem was likely to be in the migration code not holding the runqueue lock and thus not being safe against pre-emption and subsequent chaos.

In today’s announcements: AlacrityVM version 0.2. Gregory Haskins announced the 0.2 release of his AlacrityVM project. This is a modified KVM that uses a replacement virtualized IO bus for improved performance of, for example, network packet transfer between host and guest. The latest version includes some nice features, such as zero-copy transmits in the VENET driver. For further information, visit:
http://developer.novell.com/wiki/index.php/AlacrityVM.

The latest kernel release is 2.6.32-rc6.

Stephen Rothwell posted a linux-next tree for November 5th. Since Wednesday, the PowerPC KVM fix was still around, while the pcmcia, drbd, and catalin trees lost their issues, and the sparc tree gained a build failure for which Stephen applied a patch. The total sub-tree count remained at 146 trees.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2009/11/04 Linux Kernel Podcast

November 5th, 2009 jcm No comments

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20091104.mp3

For Wednesday, November 4th, 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: Cgroups, FatELF, PerCPU MM counters, and Swap.

Cgroups. Balbir Singh posted to let everyone know that discussion is happening concerning the most appropriate place to mount the cgroup filesystem. Since the Linux Filesystem Hierarchy Standard (FHS) was written prior to the existence of cgroups, it has no specific advice, which leads to three alternatives. These are /dev/cgroup, /cgroup, or some place under /sys. Balbir prefers the first option, but that will require some co-operation with udev. He asks for advice from others as to the best place for this to live. Several people seem to be quite happy with /sys/kernel/cgroup (which is not the only filesystem that gets mounted there).

FatELF. Continuing the discussion on the relative merits of “FAT” image files containing multiple ELF objects, Mikulas Patocka made some interesting comments on Linux package managers, describing them as “evil”. In his opinion, FatELF might provide a means to ship single image files containing all of the files an application needs to execute in one object, similar to how Apple and other operating systems already do today. Mikulas is concerned about the relative difficulty Linux users face in installing software not provided by their distribution using package management software. He makes a good point, although FatELF may not be the solution to that particular problem.

PerCPU MM counters. Christoph Lameter, noting that support for generic per-cpu operations is now in the “percpu” and linux-next trees, posted a patch implementing per-cpu mm counters for tasks rather than single entries in mm_struct. This obviates the need for larger SMP systems to perform atomic updates to mm counters and (intuitively) implies a performance improvement. The only downside is occasionally having to iterate over each of these per-cpu values when the actual count values are being requested.
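
The trade-off can be shown with a toy data structure. The Python sketch below is only a model of the layout, not kernel code: the class and method names are invented, and Python cannot demonstrate the avoided atomic operations. Each CPU updates its own slot, and a reader pays for a loop over all slots to get the total.

```python
class PerCpuCounter:
    """Toy model: one slot per CPU, summed only when the value is requested."""

    def __init__(self, nr_cpus: int):
        self._slots = [0] * nr_cpus

    def add(self, cpu: int, delta: int) -> None:
        # Update path: touches only this CPU's slot, so no shared counter
        # needs to be modified atomically.
        self._slots[cpu] += delta

    def read(self) -> int:
        # Read path: the cost is a pass over every CPU's slot.
        return sum(self._slots)

rss = PerCpuCounter(nr_cpus=4)
rss.add(0, 3)
rss.add(2, 5)
rss.add(0, -1)
print(rss.read())
```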

Swap. Following on from the recent discussion about OOM killer behavior and the various metrics that might be used in the future, Kamezawa Hiroyuki posted a patch that exports per-process (task) swap usage statistics via procfs. This happens through the addition of a new “VmSwap” entry in /proc/pid/status.
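
For anyone wanting to consume such a field from userspace, a minimal Python sketch follows. It assumes the new entry uses the same “Name: value kB” layout as the other Vm* lines in the status file. The function name and the sample text are made up for illustration.

```python
from typing import Optional

def parse_vmswap_kb(status_text: str) -> Optional[int]:
    """Return the VmSwap value in kB from status-file text, or None if absent."""
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            # Assumed layout: "VmSwap:", a number, then the unit "kB".
            return int(line.split()[1])
    return None

# Hypothetical excerpt; on a kernel with the patch applied, the text would be
# read from /proc/<pid>/status instead.
sample = "Name:\tbash\nVmRSS:\t    1820 kB\nVmSwap:\t     256 kB\n"
print(parse_vmswap_kb(sample))
```

Returning None when the field is missing lets the same code run on kernels without the patch.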

The latest kernel release is 2.6.32-rc6.

Stephen Rothwell posted a linux-next tree for November 4th. There had been no tree the previous day due to a national holiday in Australia, where he is based (and one trusts the horse race went well, too). Since Monday, there was a new “msm” tree (which is an ARM platform), the PowerPC KVM fix was still required, and a couple of other conflicts went away. The total sub-tree count increased today to 146 trees with the addition of the “msm” tree.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2009/11/03 Linux Kernel Podcast

November 5th, 2009 jcm No comments

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20091103.mp3

For Tuesday, November 3rd, 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: Block IO controller, FatELF, Ftrace, Performance, and Sysctls.

Block IO controller. The ever patient Vivek Goyal, fresh from the IO minisummit in Tokyo, posted the first version of a new IO bandwidth control patchset entitled “Block IO Controller”. This RFC patch series aims to address the problem of there being no “one size fits all” IO control policy, and the need for different policies to be implemented for different uses. The patch introduces what Vivek calls the blkio cgroup controller, through which a management interface is provided that can be used to switch policies.

FatELF. Eric Windisch posted some example use cases for FatELF that he felt others should know about, in an attempt to counter some of the points made by Alan Cox previously. In particular, it would seem that Eric is into Cloud Computing in a big way and looks forward to having virtual machine images that can simultaneously run on a variety of different hardware. Although there is certainly some benefit provided by FatELF, it wasn’t clear how these problems couldn’t be solved as Alan had suggested – with different directories containing versions of the same binaries for the different arches.

Ftrace. Michal Simek posted to let everyone know that he is currently working on Ftrace support for the Microblaze CPU architecture (an FPGA-based soft core from the folks at Xilinx). In particular, he is looking at function trace support at the moment and how the mcount function is used to record entry into each individual function. He has a number of questions, and Steven Rostedt (the Ftrace author) was happy to help answer a number of them.

Performance. Alex Shi posted with an observation that performance testing had yielded results with a 20-30% drop off in the 2.6.32-rc5 timeframe. This seemed to be due to a cfq-iosched patch from Jens Axboe. Alex attached an example run of perf stat both with and without the patch, showing a clear difference between the two sets of data.

Sysctl. Eric Dumazet recently observed that sysctl table entries were quite expensive, due to a sentinel value added after each one in order to detect and avoid corruption of table entries. Eric noted that the sentinel need actually only contain a couple of pieces of data, and so he created a special sentinel entry struct called ctl_table_sentinel that was smaller in size. This would apparently reduce RAM utilization of such entries by 40%.

In today’s announcements: Userspace RCU. Mathieu Desnoyers posted to let everyone know that version 0.3.0 of his Userspace RCU patches is now available. This is an RCU implementation using the POSIX pthread functions that applications can use to take advantage of the same features as the kernel has done for some time. The latest version removes a function (call_rcu) for which he had provided differing arguments and semantics than the kernel.

The latest kernel release is 2.6.32-rc6. Linus Torvalds announced version 2.6.32-rc6 of the Linux kernel at 12:05pm US Best Coast Time (PDT). In his announcement, Linus noted that there had been a longer gap since rc5, due in large part to the number of kernel developers who have been away at the kernel summit in Japan or traveling to and fro. There was also an ext4 filesystem corruption problem that required additional time, and that had turned out to be due to enabling checksum testing of journal transactions during recovery. Linus thanked Eric Sandeen for tracking down that particular problem. He also seemed pleased at the number of regressions addressed since 2.6.31.

Stephen Rothwell announced that there would be no linux-next tree for November 3rd due to a public holiday in Australia where he is based, which apparently also has “nothing to do with a horse race in Melbourne”.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2009/11/02 Linux Kernel Podcast

November 5th, 2009 jcm No comments

Audio: http://media.libsyn.com/media/jcm/linux_kernel_podcast_20091102.mp3

For Monday, November 2nd, 2009, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: BKL, FatELF, Fast symbol resolution, OOM, and Performance benchmarks.

BKL. There is an ongoing effort to remove the BKL (Big Kernel Lock), which is the last holdover from early Linux support for SMP. Discussion of BKL removal was revived during the recent Real Time pre-emption mini-summit, and Jan Blunck is amongst those who have been looking at this from the filesystem level. He posted a series of patches intended to push BKL use down into individual filesystems from the generic kernel code (for example do_new_mount()) where it lives today. He requests comments.

FatELF. There was some ongoing (and quite considerable) push back against the notion of supporting FatELF binaries. Chris Adams wondered aloud just what the target audience really was. As he sees it, embedded users don’t want the bloat, Enterprise distributions already have specific support processes in place for different architectures, and community distributions aren’t likely to want to deal with the increased build complexity and space requirements. Meanwhile, Alan Cox congratulated Ryan C. Gordon on re-inventing the concept of a directory, since directories already allow one to have multiple versions of a binary installed on a given system and to pick and choose between them. Sure, that’s not as shiny as an Apple-esque approach, but it has worked for many decades at this point, and most of the distributions implement multi-arch (sometimes called multi-lib) using a similar approach.

Fast symbol resolution. Alan Jenkins posted the latest version of his fast LKM symbol resolution patches. These take advantage of a binary search for symbol resolution at module load time, using a pre-generated (at build time) sorted table of exported kernel symbols. Using this approach, Alan has once again succeeded in reducing overall system boot time slightly on his netbook. The latest version of the patches has seen some limited testing on ARM and has also been built for Blackfin, so it’s not just x86 at this point.
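For readers following along in text, the general idea of a lookup against a pre-sorted symbol table can be sketched as below. This is only an illustration in Python, not Alan’s patches: the names build_symbol_table and resolve_symbol, the (name, address) tuple layout, and the example addresses are all assumptions made up for this sketch.

```python
def build_symbol_table(exports):
    """Sort (name, address) pairs by name once, ahead of time.

    Stands in for the build-time step; the tuple layout is hypothetical.
    """
    return sorted(exports)


def resolve_symbol(table, name):
    """Binary search the sorted table; return the address or None."""
    lo, hi = 0, len(table)
    while lo < hi:
        mid = (lo + hi) // 2
        if table[mid][0] < name:
            lo = mid + 1   # target, if present, lies to the right of mid
        else:
            hi = mid       # target, if present, is at mid or to its left
    if lo < len(table) and table[lo][0] == name:
        return table[lo][1]
    return None


# Made-up symbol names and addresses, purely for illustration.
table = build_symbol_table([
    ("printk", 0x3000),
    ("kmalloc", 0x1000),
    ("kfree", 0x2000),
])
```

Each lookup takes a number of comparisons proportional to the logarithm of the table size, rather than walking every exported symbol, which is where the load-time saving would come from.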

OOM. Kamezawa Hiroyuki posted to let everyone know that he was putting code where his mouth was with a “total renewal” of the OOM killer code. This isn’t complete at this stage, but it is intended to keep the conversation moving. The first patch lays groundwork (including new OOM type classifications), while the second and subsequent patches add the ability to count swap use per process and implement a newly updated badness calculation that uses rss+swap as the base value but also factors in cpusets and gives tasks a bonus for their runtime and for how far in the past their last allocation occurred.
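To give a rough feel for the shape of such a calculation, here is a toy sketch in Python. It is not the posted code: the function name badness, the logarithmic discount, and the 60-second and one-hour scales are invented for illustration, and the cpuset handling is left out entirely.

```python
import math


def badness(rss_pages, swap_pages, secs_since_last_alloc, runtime_secs):
    """Toy badness score: higher means a more likely OOM kill candidate."""
    # Base value as described in the posting: resident pages plus swap.
    base = rss_pages + swap_pages
    # Hypothetical bonus: tasks that have not allocated recently, or that
    # have been running for a long time, have their score reduced.
    bonus = (1.0
             + math.log2(1 + secs_since_last_alloc / 60)
             + math.log2(1 + runtime_secs / 3600))
    return base / bonus
```

With these made-up weights, a task using 1500 pages that allocated just now and has only just started scores 1500, while the same task scores 750 if its last allocation was a minute ago.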

Performance benchmarks. Hitoshi Mitake posted to let everyone know that he has been working on integrating a benchmark subsystem into the existing – and already fairly extensive – “perf” (or performance events) utility. As part of this effort, he asked Rusty Russell for permission to pull Rusty’s hackbench code directly into the kernel tree, so that it can be run by calling “perf bench sched” with whatever parameters one might wish to specify.

Finally today, Tilman Schmidt requests that we draw attention to the Kernel Cleanup wiki that Robert P J Day has been working on. The page at www.crashcourse.ca/wiki/index.php/Kernel_cleanup includes information about unused Kconfig variables, badly referenced ones, and general problems with kernel code that need further investigation.

In today’s announcements: LTP. Subrata Modak posted announcing that the Linux Test Project for October 2009 has been released. The latest version includes fixes, 119 test scenarios for EXT4 testing, new GETUID16/GETUID64/GETEUID16 and PTRACE system call tests, and much more. As usual, it is available at http://ltp.sourceforge.net/.

Sysprof. Soeren Sandmann announced version 1.1.4 of the sysprof CPU profiler. This is the latest version to be based upon the rewrite to make use of the new performance counters interface for exposing the low-level hardware counters. Since the previous 1.1.2 release, there have been a number of fixes. A download is available at http://www.daimi.au.dk/~sandmann/sysprof/.

The latest kernel release was 2.6.32-rc5.

Stephen Rothwell posted a linux-next tree for November 2nd. Since Friday, his fixes tree still has that PowerPC KVM fix, while there were a number of arch issues affecting ARM and OMAP in particular. The sub-tree count remains steady today at 145 trees in linux-next.

That’s a summary of today’s Linux Kernel Mailing List traffic. For further information, visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags: