Archive for the ‘episodes’ Category

2010/09/20 Linux Kernel Podcast

September 20th, 2010 jcm No comments

NOTE: Backlog episodes are in various states of completion. They will be slotted in as time permits. Let’s see if we can get back to doing this thing regularly :)

Audio: http://traffic.libsyn.com/jcm/linux_kernel_podcast_20100919.mp3

For the weekend of Sunday, September 19th 2010, I’m Jon Masters with a summary of the past week’s LKML traffic.

In today’s issue: Linux 2.6.36-rc4, compiler versions, media polling and detection, writeback, and much more!

* Linux 2.6.36-rc4. Linus Torvalds announced the latest 2.6.36-rc4 release of the kernel on Sunday, September 12th 2010 at 4:49pm Best Coast Summer Time (BCST). In noting it had been two weeks (rather than one) due to travel, Linus said little stood out at this point in the RC, though he was a little bothered by the amount of GPU driver churn. Linus devoted a lengthy paragraph to calling out the need for greater use of the “Reported-by:” tag line in patch message bodies in order to give due credit to those who help to track down and fix bugs, adding “Sometimes the fix is trivial, and the real work was in noticing and figuring out that a problem exists in the first place, and reporting it”.

* Compiler versions. Florian Mickler, Peter Zijlstra, and Peter Anvin debated the fact that older gcc 3.3 compilers were known not to work correctly when building x86 Linux kernels (Peter Anvin noted that some “Enterprise” distros were shipping custom patches and so their “3.3” compilers did still work). Russell King noted that it would be ok to bump the generic requirement up to GCC 3.4, and then suggested architectures could require a higher version individually as needed. Neither Russell nor others saw any reason for the generic requirement to be higher than 3.4 at this stage, and Russell noted that ARM developers like himself were still using 3.4 quite heavily.

* Media polling and detection. Maxim Levitsky posted, drawing attention to a potential regression, in a thread entitled “cdrom driver doesn’t detect removal”. Recent work on block device claiming had seemingly changed the logic for emitting uevents from the kernel that udev would pick up and use to trigger a mount or unmount of a CD or DVD device. Except this wasn’t the problem. Kay Sievers noted that recent systems rely on a polling process in userspace that won’t be running unless there is a desktop session, and so the real problem was that this process was not running on the Ubuntu system Maxim has – re-enabling the older HAL-provided process (now replaced with udisks upstream) fixed the problem. Of course, there is a wider problem here, which is that no UNIX-like system should need a user running a graphical session for this to work (it should be init-initiated).

* Writeback. Michael Rubin posted a five part patch series implementing entries in /proc/vmstat that provide visibility into writeback behavior. The two entries, nr_dirtied and nr_written, “allow user apps to understand writeback speed over time”. Michael went on to describe why it is important to provide visibility into writeback behavior: “to know how active it is over the whole system, if it’s falling behind or to quantify its efforts”. Apparently, these patches are used at Google in order to allow their non-kernel engineers to solve performance issues.
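As a rough illustration of how a user application might use the two writeback counters described above, here is a minimal Python sketch. It assumes that nr_dirtied and nr_written are cumulative page counts exposed as “name value” lines in /proc/vmstat; the helper names (parse_vmstat, writeback_rates, backlog_growth) are hypothetical and are not part of the posted patches.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def writeback_rates(before, after, seconds):
    """Pages dirtied and written per second between two snapshots.

    Assumes both counters only ever increase (cumulative since boot).
    """
    return {
        name: (after[name] - before[name]) / seconds
        for name in ("nr_dirtied", "nr_written")
    }


def backlog_growth(before, after):
    """Pages dirtied minus pages written over the interval.

    A persistently positive value would suggest writeback is falling behind.
    """
    dirtied = after["nr_dirtied"] - before["nr_dirtied"]
    written = after["nr_written"] - before["nr_written"]
    return dirtied - written
```

On a kernel carrying the counters, the two snapshots could be taken by reading /proc/vmstat twice, a few seconds apart, and passing the contents to parse_vmstat.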

In today’s miscellaneous items:

* Joe Perches posted a series of cleanup patches intended to remove the extraneous provision of a loglevel in the parameters of the various pr_ functions, which already encode the loglevel into their naming.

* Avi Kivity, Ingo Molnar, and Pavel Machek continued to discuss possible “bytecode” interpreters built into the kernel for defining perf events.

* Robert Richter noted what seemed to be spurious interrupts after disabling performance counters. Actually, these were deemed likely to be interrupts already in flight at the time the counter was disabled, which were then not handled by the time the handler was called, resulting in an error. Robert posted a patch to catch and handle these “spurious” interrupts.

* Heiko Bauke noted some recent issues with Realtek network cards detecting a link, which seemed to go away when using the (GPL) drivers directly from Realtek. The problem was that two “stable” tree update patches had not been added to 2.6.32 stable kernels. David Miller said he would take care of it.

* Robert Mueller noted that the default zone_reclaim_mode on NUMA kernels was pretty disastrous for performance on his very meaty servers. Cross-node memory use is bad, but not as bad as heavy disk IO with 5GB of RAM free. He and Christoph Lameter have started a thread to discuss default options.

* Mathieu Desnoyers posted an RFC patch entitled “sched: START_NICE feature (temporarily niced forks)”, which bumps the nice level on both parent and child temporarily (for their first slice only). The goal is to be able to reduce the impact to latency-sensitive workloads that do many forks. He included some impressive stats that were “tempting” to Ingo Molnar, and is currently working on a new patch. Ingo would like Mike Galbraith to work his magic to look for bad corner cases with taking this patch. A second version was posted, without any followup at this point.

* Vladislav Bolkhovitin posted a 17 part patch series implementing “SCST”, a new SCSI target framework with device handlers and 2 target drivers.

* Arnd Bergmann posted a 7 part patch series implementing “BKL mass-conversion to mutex”, which is part of a much larger effort that he has been working on for some time. He’d like to see this (and other bits) in linux-next in time to land in the forthcoming 2.6.37 kernel release. Arnd later posted a longer thread entitled “Remaining BKL users, what to do” in which he proposed various ways of removing remaining BKL use from different drivers, filesystems, and so forth. Christoph Hellwig noted, for example, that isofs just needed its own private mutex, as had been done in other drivers.

* Dave Hansen posted a patch adding a WARN_ONCE when using drop_caches, and an update to the documentation, since he says “[t]here seems to be an epidemic spreading around. People get the idea in their heads that the kernel caches are evil. They eat too much memory, and there’s no way to set a size limit on them! Stupid kernel!”. This seems to be a heavyweight solution to the problem of bad advice on the interwebs, in Google searches. The WARN_ONCE was deemed to be “meddling”, the documentation was well received, and there was some discussion about possible remaining issues that could mean a drop_caches is actually useful for some workloads.

* Christopher Yeoh posted an RFC patch entitled “Cross Memory Attach” that “allow[s] MPI programs doing intra-node communication to do a single copy of the message rather than a double copy of the message via shared memory”. The mechanism is to allow a destination process to do a copy from a source process memory directly, using a system call. Apparently, splicing (zero-copy) isn’t an option at this stage due to the need for both processes to work cooperatively over a pipe. Ingo Molnar was impressed with the stats, which used a modified OpenMPI to run some MPI benchmarks, showing a dramatic speedup in overall MB/s of throughput in all of the benchmarks.

* VMware decided to rename vmware_balloon to vmw_balloon, apparently following the new convention of “vmw_”, according to Dmitry Torokhov.

* Valerie Aurora posted a 34 part patch series implementing the latest version of her “Union mount core”, for general review. She included a TODO, and a summary of the changes (including to documentation) since the last revision.

In today’s announcements:

* Greg Kroah-Hartman announced the release of stable kernel 2.6.34.7, which contains a fix for a single USB issue apparently bothering “hundreds of OpenSuSE users” at the moment.

* Michael Kerrisk announced that man-pages version 3.26 is now available.

* Junio C Hamano announced Git version 1.7.3. It includes a number of test updates, and some GUI changes, amongst other things.

* Nicholas A. Bellinger announced that TCM/LIO version 4.0.0-rc4 for 2.6.36-rc4 is now available. It includes a large number of changes.

* Phillip Lougher announced the release of squashfs version 4.1.

The latest kernel release was 2.6.36-rc4.

* Greg Kroah-Hartman posted a series of review patches for future stable kernels 2.6.27.54, 2.6.32.22, and 2.6.35.5.

* Rafael J. Wysocki posted a summary of reported regressions from 2.6.34 to 2.6.35, and from 2.6.35 to 2.6.36-rc3-git5. From 2.6.34 to 2.6.35, there are at present 25 unresolved regressions remaining, up from 10 at the start of June. From 2.6.35 to 2.6.36-rc3-git5, there are at present 15 unresolved regressions remaining, up from 13 at the end of August, but the overall number has fallen as others have been fixed. None of the regressions appear to be fantastically earth shatteringly bad ones, although kernel bug 16549 does appear to be quite familiar to this author as a longstanding issue.

Rafael also notes that Florian Mickler has joined the “regression tracking team”, in the capacity of recording and noting regressions that are fixed. It is requested that he be copied whenever such fixes are made available.

* Mathieu Desnoyers posted LTTng 0.230 for Linux kernel 2.6.35.4.

That’s a summary of the week’s Linux Kernel Mailing List traffic. For further information, visit www.kernel.org. I’m Jon Masters.



Categories: episodes Tags:

2010/07/04 Linux Kernel Podcast

July 12th, 2010 jcm No comments

Audio: COMING SOON

For the weekend of the 4th of July 2010, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: Linux 2.6.35-rc4, Btrfs, Defconfig kernel configs, GDB, Timekeeping, and the VM.

*). Linux 2.6.35-rc4. Linus Torvalds announced the release of Linux 2.6.35-rc4 on July 4th 2010 at 8:44pm Best Coast Time (PDT). Linus says he’s been back online for a week and is happy at the relatively small number of changes building up, “having been strict for -rc3”, in his absence. He obviously sees the increased rigidity in enforcing the merge window has been a success, and considers that there will likely be an on time 2.6.35 release, “despite my vacation”. Linus says his vacation was very enjoyable and was the longest time away from the kernel in many years – apparently he did take a cellphone for email, but didn’t do any compiles while he was having “a great time under water.”

*). Btrfs. Edward Shishkin posted a rather scathing technical review of btrfs internal design, criticising variable record size allocations, file system utilization, the balancing algorithms used, and even suggesting that engineers leave the algorithm design up to academics, rather than re-inventing things for their programs. Edward performed various benchmarks and published his results in a thread entitled (variously), “Unbound(?) Internal fragmentation in Btrfs”, “Btrfs: broken file system design”, and “Balancing leaves when walking from top to down”. For his part, Chris Mason was very civil in his reply on a number of occasions, saying that he didn’t see a fundamental design problem existing in Btrfs. Edward “NACKed” Btrfs anyway for enterprise use (even though it’s been in tree for a while).

*). Defconfig kernel configs. Linus Torvalds (in a thread renamed to “ARM defconfig files”) essentially conveyed his discomfort with the continued existence of many dozens (or perhaps hundreds) of “defconfig” files in the architecture directories. These are reference files which are based upon copies of “known good” configuration files. They worked well back in the day, but as Linus says, times have changed and nobody is really making these files by hand any more without using Kconfig. So he proposes replacing them – eating the pain – with single config files per machine type that use Kconfig and source in the particulars for the various chips and architecture families. Russell King pointed out that this is basically what already happens, but the point of the defconfig files is to also handle stuff outside of the architecture – for example, choosing not to use certain “IDE” options on particular boards or systems – as Daniel Walker also pointed out. Daniel noted that those setting up e.g. a BeagleBoard or a Nexus One don’t really want to trawl through thousands of possible kernel options if a good reference set is available to begin with. Daniel also pointed out a previous posting for a boolean SATisfiability solver in the kernel config. Linus thought that was interesting but ‘At the same time, “SAT solver” does scream “over-engineering failure” to me’. Linus later explained that he was looking to either kill the defconfigs or replace them with some templates and a means to generate them, but otherwise preferred them to live some place outside of the kernel.

*). GDB. David Howells posted a patch implementing GDB remote protocol support for the “p” command on FRV. The “p” command is used to transfer information about a single register, as opposed to the “g” command, that transfers data on several. But when a gdb client connects, it will attempt to use “p” or “g” and will then stick with that choice without varying. For this reason, Linus wondered aloud if using single reads would actually slow down clients connecting (since they usually will request a number of registers at a time). Jason Wessel said he had actually done some fairly detailed benchmarking and would share his findings at a later point.

*). Timekeeping. Oleg Nesterov posted a thread entitled “Q: sys_futex() && timespec_valid()”, in which he attempted to summarize some concerns that the glibc folks were having with the Linux implementation of timespec timeouts. Ulrich Drepper replied, explaining that his point was that a negative value for tv_sec in the case of an absolute timeout should not return -EINVAL, but instead -ETIMEDOUT. He contends that a negative time relative to the epoch (that is, a time in the 1960s) is not an invalid time. Linus strongly disagreed, saying, “Ulrich – you’re wrong. Go away.” and then clarified, ‘In the end, it’s quite simple: the kernel doesn’t accept invalid timevals. And negative tv_secs are invalid. It’s that simple. If somebody gives the kernel a timeout from before the epoch [January 1st 1970], that somebody is being a total idiot. We know it’s not a valid absolute timeout, since there’s no way somebody is “waiting” for something that happened in the sixties. Yeah, yeah, maybe you’re waiting for flower power and free sex. Good for you. But if you are, don’t ask the Linux kernel to wait with you. Ok?’ This author wonders what those still waiting for Elvis will do now that this is clarified.

*). VM. Larry Woodman posted a patch entitled “Call cond_resched() at bottom of main look [sic: loop] in balance_pgdat()”, which handles a situation on small single CPU systems wherein a task should OOM (Out Of Memory) and call the OOM-killer, but it does not because kswapd is constantly running due to at least one system RAM zone being below the high page watermark. Larry adds a single cond_resched() call that will allow the watchdog, tasks, and OOM killer to run, freeing up the affected resources. Andrew Morton didn’t like this approach – implying he preferred something more specific than a cond_resched and waiting for the OOM killer to get a chance to run – but he could live with it if there were a giant FIXME and/or some documentation at least explaining the essential nature of the specific cond_resched() call as opposed to a regular point of voluntary kernel preemption.
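To make the Timekeeping disagreement above concrete, here is a small Python model of the behavior Linus describes. It is only a sketch and not the kernel’s futex code: the function check_absolute_timeout and its “now” parameters are hypothetical, and timespec_valid here simply mirrors the rule stated in the thread (negative tv_sec is invalid) plus the assumption that tv_nsec must lie in the range from 0 up to, but not including, one billion.

```python
import errno

NSEC_PER_SEC = 1_000_000_000


def timespec_valid(tv_sec, tv_nsec):
    """Model of the validity rule: no negative seconds, sane nanoseconds."""
    if tv_sec < 0:
        return False
    return 0 <= tv_nsec < NSEC_PER_SEC


def check_absolute_timeout(tv_sec, tv_nsec, now_sec, now_nsec):
    """Classify an absolute timeout against a supplied current time.

    Returns -EINVAL for an invalid timespec (including any time before
    the epoch), -ETIMEDOUT if the deadline has already passed, else 0.
    """
    if not timespec_valid(tv_sec, tv_nsec):
        return -errno.EINVAL
    if (tv_sec, tv_nsec) <= (now_sec, now_nsec):
        return -errno.ETIMEDOUT
    return 0
```

Under the alternative Ulrich argued for, the first branch would return -ETIMEDOUT for a negative tv_sec on an absolute timeout, on the grounds that a pre-1970 deadline has simply already passed.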

In today’s miscellaneous items:

*). Patrick Pannuto proposed a usleep API for the kernel to augment the existing msleep one, and be used as an alternative to udelay so as to allow the CPU to go into lower power C-states. After some dialogue between Patrick and Daniel Walker, in which Walker pointed out that some stats were needed to prove that this was power beneficial for small delays, it seemed that there was a small improvement for 50us delay values.

*). Ronny Tschuter had some issues with tracing power_start events when using the cpuidle framework with a menu governor and an ACPI-based driver to handle idle states. There were no instrumentation points in the processor_idle code, so he posted a patch, but Arjan van de Ven pointed out that the ACPI STATE type is pretty much “useless random garbage” so the poster should set their system to use mwait idle.

*). Dave Jones raised a concern with crypto and device-mapper. A potential regression was introduced somewhere between 2.6.32 and now, and the details are available in Red Hat Bugzilla 610278. Nobody replied to the posting on the list, but the Bugzilla says that one should be using LUKS, and in the case of not using it the default encryption options were changed due to a vulnerability. It is possible to mount the existing device using the instructions provided.

In today’s announcements:

*). Jeff Merkey announced the latest version of his MDB “Merkey’s Kernel Debugger” x86_64 2.6.34 07-01-2010 Release 4. It’s available on googlecode.com. There has been no community discussion thereof. Jeff also posted his Open Cworthy Libraries 07-01-2010.

*). Junio C Hamano announced Git version 1.7.1.1 is now available at: http://www.kernel.org/pub/software/scm/git/ He also announced Git 1.7.2.rc1 is available for review.

*). Karel Zak announced the latest stable release of util-linux-ng 2.18 is now available: http://www.kernel.org/pub/linux/utils/util-linux-ng/

*). Subrata Modak announced that the Linux Test Project for June 2010 has been released. http://ltp.sourceforge.net/

The latest kernel release is 2.6.35-rc4.

That’s a summary of today’s Linux Kernel Mailing List traffic. For further information, visit www.kernel.org. I’m Jon Masters.


Categories: episodes Tags:

2010/06/27 Linux Kernel Podcast

July 11th, 2010 jcm No comments

Audio: COMING SOON

For the weekend of June 27th 2010, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: Concurrent coredumps, OpenFirmware, and Power management policy.

*). Concurrent coredumps. Edward Allcutt posted, inquiring about placing a limit on the number of concurrent process coredumps that should be allowed to take place on a system. He cited an example Apache-based webserver in which large numbers of CGI processes were crashing, each with a 150-200MB core file that needed writing to disk. He was using a custom patch that would cease dumping cores after a certain number were already concurrently taking place. Roland McGrath and Andrew Morton did not favor this approach, instead preferring either that core dumps would begin to block (without consuming resources) after a point, or that the blkio_cgroup IO controller be used to limit the IO being consumed. Hiroyuki Kamezawa suggested that distributions like Fedora – which in that case has its own dumping tool called abrt that manages coredumps – could wire up the blkio cgroup prior to beginning the dump process.

*). OpenFirmware. Andres Salomon posted a patch implementing support for making calls into OpenFirmware on x86 OLPC XO systems. The patch works by preserving the necessary page mappings for the OpenFirmware (OFW), which remains in memory at a virtual address. Just the minimum number of mappings are retained, but this does allow calls into the firmware even after Linux has booted. It’s always been interesting to see the XO using OpenFirmware as one of the only x86-based devices doing so.

*). Power management policy. Len Brown posted an RFC patch implementing a new centralized location for userspace to express its power management vs. performance policy preferences to the kernel. In the patch, such expression occurs through the new /sys/power/policy_preference file, which contains 5 different possible levels – ranging from “max_performance”, through “balanced” (the new default), to the “max_powersave” option on the other extreme. The idea is to centralize setting scheduler, cpuidle, governor, and other options.
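The “block after a point” alternative raised in the Concurrent coredumps item can be pictured with a userspace analogy. The Python sketch below is not kernel code and is not taken from any posted patch: the DumpLimiter class and its write_core callback are hypothetical names, and a bounded semaphore stands in for whatever mechanism the kernel would use. Extra dumpers wait for a free slot rather than being refused.

```python
import threading


class DumpLimiter:
    """Allow at most max_concurrent dumps at once; the rest block."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self.active = 0  # dumps currently in progress
        self.peak = 0    # highest concurrency observed so far

    def dump(self, write_core):
        # Blocks here (without doing any IO) while all slots are busy.
        with self._slots:
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            try:
                write_core()
            finally:
                with self._lock:
                    self.active -= 1
```

The contrast with the custom patch described above is that no core file is skipped; crashing processes simply queue up until an earlier dump completes.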

In today’s miscellaneous items:

*). Dave Chinner posted a 5 part patch series implementing some fixes for emergency filesystem thawing (via sysrq control).

*). Michael Kerrisk posted some man-pages text for the MADV_MERGEABLE and MADV_UNMERGEABLE flags added in 2.6.32 for use with KSM (Kernel Samepage Merging – the kernel support for detecting duplicate pages in guest virtual machines and mapping them to a single shared page instance).

*). Paul E. McKenney concluded that it was sufficient to turn off the CONFIG_PROVE_RCU option in Fedora rawhide kernels since it’s mostly a developer tool, rather than change licensing or otherwise make it available to non-GPL modules with which it is not compatible.

*). Luis R. Rodriguez posted a script and some documentation to implement some rudimentary ASPM (a PCI extension that allows devices to go to an entirely electrically idle bus state) support. For further information: http://wireless.kernel.org/en/users/Documentation/ASPM

*). Konrad Rzeszutek Wilk posted a 19 part patch series implementing PCI pass-through for Paravirtualized Xen guests, using SWIOTLB support.

*). Mike McCormack wasn’t happy with the 32 (NGROUPS_SMALL) group limit on the number shown in /proc/<pid>/status for a given process ID. He and others discussed various ways those who really want more than 32 groups assigned to a process could get the full data through various API changes.

*). Rusty Russell posted the last (hopefully) of his cpumask patches which he says now also means that everyone should be using the cpumask_ functions. At least, everyone in kernel is, according to his tests on 32-bit.
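For the group limit item above, this is roughly how a userspace tool might read the group list and notice that it could have been cut off. It is a minimal Python sketch that assumes the status file carries a line beginning with “Groups:” followed by whitespace-separated numeric IDs; parse_status_groups and possibly_truncated are hypothetical helper names, and the 32-entry threshold is the NGROUPS_SMALL figure from the discussion.

```python
NGROUPS_SMALL = 32  # limit discussed above for the status file's group list


def parse_status_groups(status_text):
    """Return the group IDs from the 'Groups:' line of a status file."""
    for line in status_text.splitlines():
        if line.startswith("Groups:"):
            return [int(gid) for gid in line.split(":", 1)[1].split()]
    return []


def possibly_truncated(groups):
    """True when the list is long enough that entries may be missing."""
    return len(groups) >= NGROUPS_SMALL
```

A process inspecting its own groups can sidestep the status file entirely with os.getgroups(); the limit matters when looking at some other process ID.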

In today’s announcements:

*). Mathieu Desnoyers announced that LTTng 0.218 for kernel 2.6.34 is now available. For further information: http://www.lttng.org/

*). Henrik Rydberg announced that version 1.0.1 of the mtdev Multitouch Translation Library is now available (released under the MIT license). mtdev does all of the necessary finger tracking pieces in userspace, separate from the Xorg driver from which it came, as a means to further adoption. This author is still waiting for his Apple Multitouch keypad to work on a Fedora system without having to patch the kernel with a kludge. mtdev is available at: http://bitmath.org/code/mtdev/

*). Len Brown announced the Boston Linux Power Management Mini-Summit will take place concurrently with the Linux Foundation LinuxCon 2010, on the day immediately prior to the beginning of the main events, August 9th. For further information: http://events.linuxfoundation.org/

The latest kernel release was 2.6.35-rc3.

Finally today, Piotr Hosowicz wondered aloud why Linus’ git repository was not being updated, asking if it’s because he’s on vacation. As mentioned before, Linus was indeed on a (well deserved) vacation.

That’s a summary of today’s Linux Kernel Mailing List traffic. For further information, visit www.kernel.org. I’m Jon Masters.


Categories: episodes Tags:

2010/06/20 Linux Kernel Podcast

July 11th, 2010 jcm No comments

Audio: COMING SOON

For the weekend of June 20th, I’m Jon Masters with a summary of today’s LKML traffic.

In today’s issue: Panic, Performance Events, Slow-work, and Timekeeping.

*). Panic. Shoichi Tamuki posted version 2 of a patch intended to fix keyboard LED blinking on panic. Existing systems will call mdelay to handle the reboot timeout post-panic, during which time the keyboard LEDs will blink. When a hypervisor is being used, those mdelay calls of 1 second or more will be implemented as spins, in order to avoid timeout accuracy slips, but the side effect is that the keyboard LEDs won’t blink properly. The patch will call panic_blink_enter() between every mdelay call, and it also fixes up the longer mdelays so that the blinking still occurs.

*). Performance Events. Nils Carlson, Andi Kleen, Eric W. Biederman, Tony Luck, and others, discussed the “Hardware Error Kernel Mini-Summit” followup in which it had been proposed to introduce a new hardware error subsystem. They pondered what (mostly) Andi saw as failings of EDAC and the need for a better way to find such things as which DIMM has failed without doing a binary search removal of individual modules (“the way of the 21st century”). Tony Luck proposed some further ideas for a generic subsystem.

*). Slow-work. Ted Ts’o reported that recent 2.6.35 kernels with an Ubuntu userspace would periodically get into a state in which large amounts of CPU time were spent in the kslowd worker threads. It turned out that this was caused by a change to the DRM/KMS code to pull polling of the display connectors into the DRM core. Reverting a specific commit fixed the issue for Nick Bowler, who had also been experiencing this problem.

*). Timekeeping. Suresh Rajashekara inquired as to what appeared to be a problem with timekeeping on his OMAP1 platform with a 2.6.29 kernel. It seemed odd that certain timers were not expiring immediately upon resume on a system that tries to spend most of its time in a suspend state (waking for 35 milliseconds every 4 seconds, apparently). Thomas Gleixner replied, saying that during such suspend operations, only the CLOCK_REALTIME based timers are kept correct (aligned to real time), whereas others won’t expire the moment the system resumes because there may otherwise be a thundering herd problem as many timers expire at the point that the system wakes up from the suspend state.
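
The idea in the Panic item above (break one long delay into short ones and poke the LEDs in between) can be sketched roughly as follows. This is only a toy model in Python, not the actual patch: panic_wait and its delay and blink callbacks are hypothetical names standing in for mdelay and the panic blink hook.

```python
def panic_wait(timeout_ms, step_ms, delay, blink):
    """Wait roughly timeout_ms in short chunks, blinking between chunks.

    delay(ms) stands in for mdelay(); blink(elapsed_ms) stands in for the
    panic blink hook. Both are assumptions made for this sketch.
    """
    elapsed = 0
    while elapsed < timeout_ms:
        # Never delay for longer than one step, so the blink hook is
        # reached regularly even when the total timeout is long.
        chunk = min(step_ms, timeout_ms - elapsed)
        delay(chunk)
        elapsed += chunk
        blink(elapsed)
    return elapsed


# Record the calls instead of really sleeping or touching any LEDs.
delays, blinks = [], []
total = panic_wait(250, 100, delays.append, blinks.append)
```

With a 250 ms timeout and a 100 ms step, the delay callback sees 100, 100, and 50, and the blink callback runs after each chunk.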

In today’s miscellaneous items:

*). R. F. Burns inquired as to whether it was possible to “write a kernel module which, when loaded, will blow the PC speaker?”. Alan Cox replied that this wasn’t really likely, and in the absence of the root password and proper expertise, “throwing it out of the window or feeding it iron filings will work just as well.”

*). Lai Jiangshan posted a patch removing the use of a default write bit with EPT page allocations under KVM virtualization. It wasn’t causing a problem now since get_user_pages is always called with write=1 at the moment.

*). Adrian Hunter posted MMC patches adding support for secure erase, trim, and secure trim – all now variants of erase in eMMC v4.4 cards.

*). Peter Zijlstra noted that the historical uses of perf_disable to prevent NMI races in the PMU code were basically now done per-arch, so he suggested that he would remove perf_disable as it did not seem to be really needed.

*). Christoph Hellwig posted the XFS status update for May 2010, in which he noted several of the important features that landed in 2.6.34 (including new inode and quota flushing code). Christoph also posted a patch (not entirely related to XFS) that removed the 4K stacks option on 32-bit x86 systems as it is deemed “too small” these days, even with now mandatory split IRQ/kernel stacks, given the depth of many kernel call chains.

*). A number of objections were raised to the new automated addition of a “+” to the localversion for modified kernel trees, if no other is set. Mark Hills pointed out that this triggers a lengthy modpost step even when doing “casual kernel development” to test out some simple patch.

*). Dan Carpenter posted a patch that changes the output of kernel oops messages such that the previous “cut here” is replaced with a message asking for the entirety of the oops to be sent in to kernel folks.

*). Zachary Amsden (who has been working on this for some time) posted some TSC cleanup patches and documentation for KVM. This should help resolve many of the issues that have been affecting some TSC users under KVM. On that note, Hagen Paul Pfeifer sent a patch that effectively allows for deliberate speeding-up of time for certain guests for testing use.

*). Huang Ying posted a three-part “Unified NMI delayed call mechanism”, which essentially allows the deferment of certain NMI-time processing until the NMI context has been left. Ingo Molnar preferred that the solution be to re-use the existing unified NMI watchdog code. Sadly, the rest of the thread turned into a bit of a flamewar between Andi Kleen and Ingo.

In today’s announcements:

*). Jeff Merkey announced Open CWorth Libraries 06-19-2010, and ranted about wanting larger stack sizes. He also posted version 2.6.34-06-17-2010 of his “MDB” or “Merkey Debugger”. Nobody replied to any of these threads.

*). Etienne Lorrain announced version 2.8.2 of the gujin GPL bootloader. It contains several bugfixes and improvements – http://gujin.org/

*). James Morris announced the Program Schedule for the Linux Security Summit that will run in conjunction with the 2010 LinuxCon in Boston, on August 9. Further information is available at http://www.linuxfoundation.org/

*). Karel Zak announced that the second util-linux-ng 2.18 release candidate is now available. It contains lots of fixes (e.g. disable DOS mode and cylinders by default now in fdisk). Further information is available at: http://www.kernel.org/pub/linux/utils/util-linux-ng/v2.18/

*). Mathieu Desnoyers announced the release of Userspace RCU 0.4.6. The latest release includes added ARMv7l support. Further information is available at: http://www.lttng.org/urcu/

The latest kernel release was 2.6.35-rc3.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2010/06/13 Linux Kernel Podcast

June 17th, 2010 jcm No comments

Audio: COMING SOON

For the weekend of June 13th 2010, I’m Jon Masters with a summary of the past week’s LKML traffic.

In today’s issue: Linux 2.6.35-rc3, ext3, floppy, Kconfig, Memory Corruption, and Suspend Blockers.

Linux 2.6.35-rc3. Linus Torvalds announced the latest RC of 2.6.35 on Friday, June 11th 2010 at 8:01pm Best Coast Time (PDT). Quoting Linus, “So I’ve been hardnosed now for a week – perhaps overly so – and hopefully that means that 2.6.35-rc3 will be better than -rc2 was. Not only do we have a number of regressions handled, we don’t have that silly memory corruptor that bit so many people with -rc2 and confused people with its many varied forms of bugs it seemed to take, depending on just what random memory it happened to corrupt. One effect of being strict is that this is likely the smallest -rc3 we’ve had in a long long time.” Linus has been on a bit of a crusade in recent times to reduce the churn post merge-window, which is what he means in describing himself as being “hardnosed” about taking patches.

Ext3. Jeff Merkey posted a thread entitled “EXT3 File System Corruption in 2.6.34”, in which – in between his usual extraneous use of 4-letter words and outright offensiveness – he raised an issue that he was seeing on some systems running ext3 and recovering from a power failure. His workload raised an older concern that has been addressed before: file payload corruption following a power failure. The filesystem recovered its journal, but the content of the file was replaced with random other data (not either the original data, or the new data, as might have been expected). Jeff and Eric Sandeen had a discussion about the need to mount with data=ordered vs. the fact that Linus had previously changed the default such that the kernel config option CONFIG_EXT3_DEFAULTS_TO_ORDERED is no longer the default. Ted Ts’o noted that in lieu of explicit use of syncing within apps, everything is a tradeoff. Users could always mount with “-o sync” to guarantee things hit the disk, “but then the performance will be horrible”.

Floppy. Stephen Hemminger noted a number of issues with the very legacy floppy driver. Linus wondered why anyone really cared about floppies any more, but he agreed that oopsing was always bad. Stephen says he has a few fixes for the immediate issues that he will target for 2.6.36 (most of these were generated under virtualization without using real hardware), and both he and Linus seem quite interested in converting floppy over to threaded irqs, amongst other things. As a super special offer, after reading an online report on a website selling floppy drives (in which one purchaser noted that 3 out of 5 floppy drives had failed within 45 days) Linus even agreed to invest a cool $7.99 to pay for a real floppy disk drive for anyone twisted enough to really want to fix that code, while he himself will not “touch it…with a 10-foot pole”.

Kconfig. Vegard Nossum mailed to let everyone know that he has been accepted into this year’s Google Summer of Code (GSoC) program and will spend the summer working on integrating a “proper boolean constraint satisfiability solver into the configuration editors (menuconfig, etc.) in order to allow partial/incomplete configuration specifications. In short, this means that the user can choose to not specify a particular value for some config options, but let the system deduce their values”. There was some talk of the libzypp library used by Novell’s zypper boolean SAT enabled package manager, though James Bottomley noted that the difference here is that while sometimes package combinations result in an unsatisfiable install, kernel configuration should always be resolvable into a set of valid config options – a checker can be used to flag up broken Kconfig combinations and fix them. This conversation actually began a few weeks back, but had not previously been covered here.
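
To give a feel for what “let the system deduce their values” could mean, here is a deliberately naive sketch in Python. It brute-forces every assignment rather than using a real SAT solver, and the symbol names (DRIVER_X, BUS_Y, CORE_Z), the constraint, and the deduce helper are all invented for illustration; none of this comes from Vegard’s project.

```python
from itertools import product


def deduce(symbols, constraints, fixed):
    """Return a full config consistent with `fixed`, or None if impossible.

    symbols:     list of option names
    constraints: list of functions taking a config dict, returning bool
    fixed:       the options the user chose to specify
    """
    free = [s for s in symbols if s not in fixed]
    # Try False before True so that unspecified options stay off
    # unless a constraint forces them on.
    for values in product([False, True], repeat=len(free)):
        config = dict(fixed)
        config.update(zip(free, values))
        if all(check(config) for check in constraints):
            return config
    return None


symbols = ["DRIVER_X", "BUS_Y", "CORE_Z"]
# Hypothetical dependency: DRIVER_X needs both BUS_Y and CORE_Z.
constraints = [lambda c: not c["DRIVER_X"] or (c["BUS_Y"] and c["CORE_Z"])]

partial = {"DRIVER_X": True}          # the user only specifies one option
full = deduce(symbols, constraints, partial)

# Adding a conflicting rule makes the configuration unsatisfiable.
broken = deduce(symbols, constraints + [lambda c: not c["BUS_Y"]], partial)
```

Here the user only asks for DRIVER_X and the search turns on the two options it depends on. The second call returns None, which is the kind of broken combination James suggested a checker could flag.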

Memory Corruption. It has been well known for some time that certain firmware (PC BIOS software) can be particularly prone to cause low memory (less than 1MB) corruption during suspend or other state transitions, including a mere call into a BIOS-provided System Management Mode (SMM) function. Yuhong Bao noted that the problem was also known to Microsoft, and he pointed to a Microsoft WHDC article covering the problem (and the fact that Windows 7 does not typically make use of low memory for this reason) in a posting entitled “Windows side agrees that lowmem corruption is a problem too”. Ingo Molnar noted that in one corruption case he saw, there was an on screen graphical bitmap that had been spewed all over the low memory. Peter Anvin suggested to just not make use of the first few pages (64K) of “low memory” region at all in order to avoid the increasing number of systems that corrupt it taking down the kernel and causing a system crash. He said that the rest of the lowest 1MB could be kept for ZONE_DMA use only “or something”.

Suspend Blockers. Ingo Molnar, replying to a separate message from Ted Ts’o, began a lengthy thread entitled “suspend blockers & Android integration”. Google’s Android platform implements suspend blockers as a means to do what is implied in the naming: provide a means for applications to prevent the system from entering a suspend state. In the case of Android, this is apparently because it will attempt to suspend even with running applications in many cases where it has not been explicitly told not to, to save power. Google’s feature is interesting, but there have been some obstacles to it going into the upstream kernel, due to its semi-invasive nature (and the objection of some to adding such features to the kernel). Ingo objected to Ted’s comment that “hundreds of engineering hours have been made trying to accomodate the demands of LKML — and LKML has said no to suspend blockers/wakelocks”, noting the many possible features that might have been added to the kernel if engineering time was the sole driving factor, rather than a pursuit of a more perfect option. Ingo believes “Linux is an engineering effort that has literally cost about ten thousand man years. That’s about a 85 million man hours. It takes effort to keep that kind of work valuable!”. Linux Weekly News had a lengthy article on this topic, so I’m going to suggest you subscribe and read that for more detail.

In today’s miscellaneous items:

*). Linus Walleij posted a patch implementing MTDparts style partition table specification via kernel command line parameter. The rationale here is that not all embedded systems use standard partition tables, as was already the case with MTD partitions. This extends the MTD concept to cover regular block devices on some newer embedded devices.

*). Yanmin Zhang reported a hang in btrfs related to calling sync() at the end of a test cycle. The kernel would ultimately output a hangcheck warning.

*). Some online controller reset patches for megaraid_sas from Bo Yang.

*). Jari Ruusu reported a module reference counting issue affecting block device mount and unmount that was caused by a bug introduced in a patch from Tejun Heo. Al Viro rediscovered the problem (bd_start_claiming grabs an extra reference that is never released), which Tejun had already fixed and was queued up for inclusion in a later kernel tree.

*). Patrick J. LoPresti posted a patch for a NULL pointer dereference bug he thought he had found in the device mapper multi-path code, and wondered aloud whether the Coverity folks were still running their nightly checks (he thought this was something that should have been flagged up by Coverity automatically in the course of such checks). Last time I heard, Coverity had had some issues with automated nightly runs but scan.coverity.com seems to still contain various data pertaining to linux-2.6.

*). Alan Olsen (whose name I had to extract from whois records because his email client is not configured to include it) noted a modpost segfault that Krzysztof Halasa tracked down to a faulty offset calculation in reading module license data. It was wondered why this had been ok with previous GCC releases, but in any case the fix was urgently suggested for 2.6.35.

In today’s announcements:

*). Karel Zak announced the latest RC version of the util-linux-ng package is available at ftp://ftp.kernel.org/pub/linux/utils/util-linux-ng/v2.18/ The release candidate removes the rdev, ramsize, vidmode, and rootflags legacy utilities, adds a new libmount library for utilities such as mount (though the library API is not yet officially stable), and adds other commands such as findmnt, fsfreeze, swaplabel, and so forth.

*). Robert P J Day announced that he has created an “online beginner’s kernel programming course” for $39, the details of which are available at http://www.crashcourse.ca Robert notes on that site that he will require a course text in the form of Robert Love’s upcoming third edition of “Linux Kernel Development”. That seems reasonable – this author already has a copy on pre-order, and it will be officially released on July 5th.

The latest kernel release is 2.6.35-rc3.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2010/06/06 Linux Kernel Podcast

June 9th, 2010 jcm No comments

Audio: http://traffic.libsyn.com/jcm/linux_kernel_podcast_20100606.mp3

For the weekend of June 6th 2010, I’m Jon Masters with a summary of the past week’s LKML traffic.

In today’s issue: Linux 2.6.35-rc2, Hibernation, KVM clock, Phoronix benchmarks, Threaded Interrupts, and UML.

Linux 2.6.35-rc2. Linus Torvalds announced the release of 2.6.35-rc2 on Saturday, June 6th 2010 at 9:15pm Best Coast Time (PDT). Quoting Linus, “So -rc2 is out there, and hopefully fixes way more problems than it introduces”. Linus expresses some concern at the size of the RC2, which although not as big as the same time last cycle (which was unusually big in general, however), was nonetheless larger than he would have liked and less in line with his intention to have a “calmer release cycle this time”. He says he’s going to enforce making RC3 more sane because “the upcoming week is the last week of school for my kids. And when the kids get out of school, I’m going to be offline for a while. And as a result, I _really_ don’t want to pull anything even half-way scary in the next week for -rc3”. He doesn’t specifically mention the Phoronix regression covered later in this episode.

Hibernation. Nigel Cunningham (TuxOnIce) posted some ponderings on idealized algorithms for writing memory images to disk during hibernation in a thread entitled “Proposal for a new algorithm for reading & writing a hibernation image”, in which he postulated (amongst other things) essentially setting up a new set of page tables while existing pages are being written out to disk such that future writes to memory would result in a page fault that could be used to set up a number of new tables of pages that should later be written out as having changed during the disk write operation. Current algorithms do play games with page faults, but don’t use secondary tables.
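
The general shape of that idea (note which pages get touched while the image is being written, then write those again) can be modelled in a few lines. This is a toy sketch in Python, not Nigel’s proposal: there are no real page tables or faults here, and write_image and its concurrent_writes argument are hypothetical names used to simulate writes that land while the image is going out.

```python
def write_image(memory, concurrent_writes):
    """Write every page once, then rewrite pages dirtied in the meantime.

    memory:            dict of page number -> contents
    concurrent_writes: dict of page number -> list of (target, value)
                       writes that happen just after that page is saved
                       (a stand-in for activity during the disk write)
    """
    image = {}
    dirty = set()
    for page in sorted(memory):
        image[page] = memory[page]
        for target, value in concurrent_writes.get(page, []):
            memory[target] = value
            dirty.add(target)       # the simulated "page fault" records it
    # Second pass: only the pages that changed need to go out again.
    for page in sorted(dirty):
        image[page] = memory[page]
    return image, dirty


memory = {0: "a", 1: "b", 2: "c"}
# While page 1 is being saved, something rewrites the already-saved page 0.
image, dirty = write_image(memory, {1: [(0, "A")]})
```

Without the second pass the image would still hold the stale contents of page 0.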

KVM clock. Orion Poplawski followed up to a posting from last week in which he noted that guest virtual machines would occasionally experience wild jumps in time of 6-12 hours. Gleb Natapov, Alexander Graf, John Stultz, Zachary Amsden, and others asked some questions – for example whether ntp was in use or not – and Zachary noted that a theoretical problem existed with the way that some CPUs have seemingly reliable TSC increment until they enter certain power states, at which point the clocking can break. He has some patches in the works to handle this server-side for kvm-clock and he noted that newer kernels should also detect an unreliable host clock quickly on boot. I should add that I have previously experienced similar issues with kvm-clock and raised them directly with the KVM folks quite a few months ago now.

Phoronix benchmarks. Phoronix are known in certain kernel circles for posting “benchmarks” of various new features, especially those targeting the desktop, and so it was not surprising that Alex Buell would wonder aloud why Phoronix were seeing “20 times” performance loss in 2.6.35 release candidate kernels. What was sad, as Robert Hancock noted, was that they wrote an article without choosing to draw the developers’ attention to the problem (although they did note rather specifically when the problem – believed to be causing udev to get into a 100% CPU utilization loop – was introduced at the end of May). Mike Galbraith and Ingo Molnar replied and inferred that such stories were increasingly par for the course as Linux becomes more mainstream. Mike in particular said that, “If eggshells land in our omelet, they can make a buck telling people about it. Who cares? If tasty bacon bits land, they’ll make a buck on that event. Either way, we get some test coverage.” Others (including Ted Ts’o) were less concerned about “news” stories, and were instead more concerned that there had been a performance problem that had made it to the release candidate stage without being dealt with much sooner.

Threaded interrupts. The way of the future for Linux interrupts has been the work that Thomas Gleixner has been organizing to switch over to threaded handlers (specifically, a small quiescent handler intended to quickly placate a device, and a later thread to do the “interrupt” work in thread context). Dmitry Torokhov was very interested in migrating the various input drivers over, but some of those need to do some very long lived polling of devices post interrupt. Dmitry wondered if it was ok to have a threaded handler do that work and just run for a very long time, to which Thomas said, “Sure, why not?” and noted that the only real thing to watch out for was appropriately lowering the priority of the interrupt thread such that other work will take scheduler priority in the interim. Of course the thread should probably not blindly adjust its priority without restoring it to whatever value it started at, just in case the user has specifically configured the intended priority (in the case of RT users).

UML. User Mode Linux (not the silly “modelling” language) has had a few on-and-off issues over the past year. Typically it will break for a few days and be fixed up, but there is some sense that its one-time popularity is truly on the wane. That’s sad, because UML can be useful in an educational context. Most recently, on x86, the UML breakage was caused by a callee register saving optimization trick in the arch-specific “hweight” (hamming weight) function. Borislav Petkov had posted a “bandage” (as Peter Anvin would later describe it), before Jeff Dike and others weighed in with various opinions. Peter’s main concern was that it was not at all obvious why UML broke.

In today’s miscellaneous items:

*). Andi Kleen posted a dm-crypt patch intended to facilitate it scaling to multiple CPUs (by means of using per-cpu workqueues), rather than the existing case of all work being bound on a single workqueue. In a separate thread, Neil Brown provided some useful insight into device mapper behavior for layered mappings, explaining why touching an underlying mapping is not a good idea when another is layered on top of it, including side effects.

*). Xudong Hao reported a possibly bridging-related regression in 2.6.34 that caused the host kernel to panic when starting KVM guests. Avi Kivity kindly forwarded the mail to the netdev and bridging mailing lists for followup.

*). Peter Ovtchenkov inquired as to common bit similarities in seemingly randomly generated UUID pseudofile entries within the kernel. It turns out that version 4 UUIDs specify that the “7th high half-byte” contain 4. Both Lukasz Gromanowski and Ian Campbell followed up to explain this.

*). Michal Marek posted a pull request for some kbuild changes in 2.6.35 (to include the new nconfig interface), to which Linus responded, giving some advice about the best way to do a pull request, and calling out Dave Miller’s networking pull requests as a good example of how to do it. Linus had similarly posted some good recommendations for diffstat generation on May 27th in response to some btrfs update requests.

*). Daisuke Nishimura posted some patches correcting a bad interaction between Andrea Arcangeli’s transparent hugepage patches and memory cgroups (the tail pages during a split did not have correct cgroup LRU entries).

*). Jeff Garzik posted some libata updates. Amongst them was a default disabling of Asynchronous Notification (AN), since, “Proper use is vague, and behavior of firmwares in the field do not match each other.”

*). Rusty Russell suggested to me that support for waiting module removal in modprobe be officially removed, to which I don’t have huge objections. At this point, very few people actually remove modules from production boxes. Rusty and Linus had some other conversation surrounding the in-kernel module loader and locking use therein in a thread otherwise devoted to a problem with bnx2 network driver initialization that went long (the thread).

*). Luis R. Rodriguez and Stephen Hemminger continued discussing an Ethernet drivers wiki page that could rival the content on compat-wireless today. Stephen wondered why Luis didn’t want this documentation in-tree, and Luis responded that he thought this stuff was better done on a wiki.

*). Dmitry Torokhov noted that “make install” over sshfs is now painfully slow since various git poking is done at the install stage. He wondered if the work implemented by Greg Thelen could be done at build time instead of at install time because it is slowing his build cycles “down to a crawl”.
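
The UUID item above is easy to check from userspace. The sketch below uses Python’s standard uuid module rather than the kernel’s UUID pseudofile, so it only illustrates the version 4 layout; the helper names version_nibble and variant_bits are made up for this example.

```python
import uuid


def version_nibble(u):
    """High half-byte of the 7th byte (index 6): the UUID version."""
    return u.bytes[6] >> 4


def variant_bits(u):
    """Top two bits of the 9th byte (index 8): the UUID variant."""
    return u.bytes[8] >> 6


sample = uuid.uuid4()
text = str(sample)          # xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
```

In the usual text form the version shows up as the first character of the third group, which is why otherwise random UUIDs all appear to share a “4” in the same place.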

In today’s announcements:

*). Jeff Merkey posted an announcement (free from any 4-letter-words this time) saying that he had written something called “Cworthy” that was intended to look like an old Netware interface. It’s on Googlecode.com. Jeff also had a few remarks on 2.6.34 that he decided to share under “general comments”.

*). Vladislav Bolkhovitin posted to let everyone know that the ISCSI-SCST project has now implemented a “full set of SCSI Persistent Reservations”. Further information is available at http://scst.sourceforge.net/.

The latest kernel release is 2.6.35-rc2.

Andrew Morton posted an mm-of-the-moment (mmotm) for 2010-06-03-16-36.

Greg Kroah-Hartman announced the release of “long term” stable series kernel 2.6.32.15, which reverted two patches that had unintentionally been included within the previous 2.6.32.14 release. He noted that those not experiencing problems had no real need to upgrade to the latest version at this time.

That’s a summary of today’s Linux Kernel Mailing List traffic, for further information visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2010/05/30 Linux Kernel Podcast

June 3rd, 2010 jcm No comments

Audio: http://traffic.libsyn.com/jcm/linux_kernel_podcast_20100530.mp3

The podcast has returned from a brief break of a few weeks while I was busily working on a certain Enterprise Linux and using my spare time to not be in front of a computer (sailing). There is a backlog of shows in various stages though I’m not yet sure when I’ll get around to posting them online. Thanks for bearing with me and let’s hope we can get back into a routine once more. As always, if you are interested in helping out, drop me a line by email.

For the US Memorial Day Holiday weekend of May 31st 2010, I’m Jon Masters with a summary of the past week’s LKML traffic.

In today’s issue: Linux 2.6.35-rc1, errors, TSC, Unified Ringbuffer, virtio, and YAFFS.

Linux 2.6.35-rc1. Linus Torvalds announced the release of kernel 2.6.35-rc1 on Sunday, May 30th 2010 at 1:21pm Best Coast Time (PDT). Quoting Linus, “…and thus endeth the merge window”. After a two week merge window, Linus says that the “bulk should be there. And please, let’s try to make the merge window mean something this time – don’t send me any new pull request unless they are for real regressions or for major bugs, ok?”. The 2.6.35 release will not feature any new filesystems for a change, but does have all of the usual driver updates. There were about 8500 commits from roughly 1000 individual developers in the 2.6.35 tree this time around. Linus described the statistics – specifically calling them out in his mail – as demonstrating what is “a healthy development environment”.

Errors. Modern hardware is generally highly reliable, but scalability and the growth of datacenters play havoc with statistics. Given a large enough amount of memory, disks, or other devices, something will eventually go wrong. When it does, it is useful to handle as much as possible with an air of grace. Memory errors are of particular concern, especially with the growth in the amount of RAM in (increasingly) large servers. ECC (Error Correcting Code) memory can help, and includes the useful side effect of reporting on correctable errors. Existing userspace utilities, such as Andi Kleen’s mcelog (and other related work in the kernel itself into recoverable memory errors) offer an ability to collect reports of such errors, as well as Machine Check Exceptions (essentially hardware errors, usually related to failing memory, caches, etc.) of various other kinds. At this year’s Linux Foundation Collab Summit (April 15th 2010), there was a mini-summit aimed at figuring out a path for the future of various separate error reporting subsystems, such as MCE (Intel), and EDAC (AMD). Mauro Carvalho Chehab posted a summary of the minutes in the form of an email thread entitled “Hardware Error Kernel Mini-Summit”, in which it is proposed that a new kernel error subsystem be created, abstracting all of the existing mechanisms, and wired up using performance events (perf). The latter piece comes largely at the insistence of Ingo Molnar and Thomas Gleixner, and is not without its controversy amongst those who feel perf is growing to become some catch-all solution to every problem. Still, it seems likely that there will be some generic replacement to mcelog in the future.

TSC. Venkatesh Pallipadi (Google) posted a patch, originally from Dan Magenheimer (Oracle) in which various information about the perceived (or, generally, otherwise) reliability of the TSC known by the kernel was exported via the sysfs. This would allow userspace applications using rdtsc to know whether the counter is generally regarded as a reliable source of time or not. Thomas Gleixner and Ingo Molnar both absolutely hated this, on the grounds that the TSC is known to be generally not a great clocksource (although it is becoming more reliable in many systems) and that just because the reading of it is generally unprivileged and thus widespread does not mean that the kernel should be complicit in encouraging others not to use the standard timestamp reading abstractions. Especially with modern kernels, where there are vsyscalls and other facing mapped page hacks, the overhead of obtaining timestamp information from the kernel is generally fairly reasonable. There was even some suggestion of limiting ring3 access to the TSC by means of a SPR (Special Purpose Register) setting. Dan Magenheimer noted that the uses of userspace reading of the TSC were more widespread than Thomas and Ingo may have considered, and he called out the dynamic linker used in RHEL5 as one example of a semi-frequent reader of TSC information. Brian Bloniarz, John Stultz, and Peter Anvin took the conversation in a slightly different direction after Brian noted that sometimes userspace needs to know how reliable the current clocksource is considered to be for use in calibration (for example, when using NTP and desiring to know oscillator accuracy). It seemed to be decided that it would therefore be worthwhile to have a general means to determine the accuracy of the current clocksource, not just the Intel-world-view centric TSC. That latter part may well happen.
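
For a sense of what using the “standard timestamp reading abstractions” looks like from userspace, here is a small sketch in Python. It relies on the standard time module (time.monotonic and time.get_clock_info), not on anything proposed in the thread, and the helper names elapsed_seconds and clock_summary are invented for this example.

```python
import time


def elapsed_seconds(work):
    """Time a callable using the OS monotonic clock instead of raw rdtsc."""
    start = time.monotonic()
    work()
    return time.monotonic() - start


def clock_summary(name="monotonic"):
    """Report what the runtime knows about a clock's backing and resolution."""
    info = time.get_clock_info(name)
    return {
        "implementation": info.implementation,
        "resolution": info.resolution,
        "monotonic": info.monotonic,
    }


took = elapsed_seconds(lambda: sum(range(1000)))
summary = clock_summary()
```

The reported resolution is only what the runtime advertises for the clock. It says nothing about oscillator accuracy, which is the gap Brian was pointing at.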

Unified Ringbuffer. Hardware error detection wasn’t the only topic of general unification efforts this week. Steven Rostedt posted an RFC thread entitled “Unified Ring Buffer” in which he discussed implementing a globally generic kernel ringbuffer that could be used in any subsystem (recall that Steven also implemented a fancy ringbuffer design in ftrace). He posted links to LKML discussion on the effort so far, and an LWN summary article, noting that both the ftrace ringbuffer and the oprofile ringbuffer have so far been unified, but also noting that the introduction of perf events (which require a lockless, NMI safe, and mmap()able implementation) came with yet another new ringbuffer from Peter Zijlstra. Steven’s original ringbuffer became lockless last year, but currently does not support mmap. So there are two implementations, “neither of which can perform all of the features needed. This is putting a bit of stress on the users of these tools, not to mention the stress on the developers as well”. Steven would like to find a solution to this problem, and so started the thread. Mathieu Desnoyers added that he was happy to help, and had already started working on his own tree (originally intended to help his LTTng tracing tools), while Andi Kleen wondered aloud why Steven would “want a single ring buffer for everyone?”. Steven said the solution might not be to have one implementation, but merely one single interface (with varying backends used, including, as Andi had noted, kfifo based implementations). This led Ingo Molnar to suggest that grand design planning discussion of ringbuffers was less important than discussing the future direction for tracing and instrumentation (the main users of these ringbuffers, and the real motivation behind them), and to note that performance was currently quite sucky both in ftrace and perf. The conversation seemed to dry up without any specific conclusions.
Separately, Peter Zijlstra posted perf ringbuffer optimization patches in a thread entitled “Optimize perf ring-buffer”. Still separately, Chase Douglas posted some “Tracing configuration review” questions for the forthcoming Ubuntu kernel configuration, seeking review comments.
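
To make the "one interface, varying backends" idea a little more concrete, here is a toy sketch in Python. It is purely illustrative: the RingBuffer interface and the OverwritingRing backend are hypothetical names, not kernel APIs, and the sketch ignores everything that makes the real problem hard (lockless operation, NMI safety, mmap support).

```python
from abc import ABC, abstractmethod


class RingBuffer(ABC):
    """Hypothetical common interface; real kernel ring buffers are far richer."""

    @abstractmethod
    def write(self, item):
        ...

    @abstractmethod
    def read(self):
        ...


class OverwritingRing(RingBuffer):
    """Fixed-size backend that drops the oldest entry when full."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots = [None] * capacity
        self._head = 0   # index of the oldest unread entry
        self._count = 0  # number of unread entries

    def write(self, item):
        capacity = len(self._slots)
        tail = (self._head + self._count) % capacity
        self._slots[tail] = item
        if self._count == capacity:
            # Buffer was full: the oldest entry has just been overwritten.
            self._head = (self._head + 1) % capacity
        else:
            self._count += 1

    def read(self):
        if self._count == 0:
            return None
        item = self._slots[self._head]
        self._head = (self._head + 1) % len(self._slots)
        self._count -= 1
        return item
```

A different backend (say, one that refuses writes when full rather than overwriting) could implement the same two methods, which is roughly the shape of the compromise Steven described.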

Virtio. Michael S. Tsirkin posted an RFC patch entitled “virtio: put last seen used index into ring itself”, which, as the title implies, modifies the ring buffer used for host/guest communication of virtio (via a feature flag, using available room in the existing structure) such that a guest will update the ring buffer with a host-visible state of where it is in consuming the buffer. The host doesn’t technically require this information, but it can save on unwanted interrupts if the host knows that the guest is not done processing previous ringbuffer entries, and provides useful statistical information. There then followed a lengthy (and somewhat interesting) debate between Michael, Dor Laor, and Rusty Russell concerning the latter’s assertion that the state of the ring buffer could be stored in the same cacheline as the last item in the buffer, rather than in its own cacheline. Rusty contended that this would be more efficient (since occasionally the index and data would be read at the same time), but when he wrote a useful test program, he was only able to prove that Avi Kivity was right in suggesting separation. There was also some other discussion of the complexity of virtio.
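
The interrupt-saving idea can be sketched with a toy model. This is a simplified reading of the description above, not the patch itself: the class and field names are invented for illustration, and a real implementation has to deal with races between host and guest and with index wraparound.

```python
class UsedRing:
    """Toy model of a used ring; names are illustrative, not the virtio layout."""

    def __init__(self):
        self.used_idx = 0        # written by the host: entries produced so far
        self.last_seen_used = 0  # written by the guest: entries consumed so far
        self.interrupts = 0      # how many times the host interrupted the guest

    def host_add_used(self):
        # If the guest has not caught up with earlier entries, it is still
        # processing the ring and will find this one without an interrupt.
        guest_caught_up = self.last_seen_used == self.used_idx
        self.used_idx += 1
        if guest_caught_up:
            self.interrupts += 1

    def guest_consume_all(self):
        # The guest publishes how far it has read, making it host-visible.
        self.last_seen_used = self.used_idx
```

In this model, three back-to-back completions cost one interrupt rather than three, because the host can see that the guest is still behind.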

YAFFS. Charles Manning, ever diligent YAFFS (Yet Another Flash Filesystem – an excellent alternative that this author has had the privilege of poking at with his embedded hat on in the past) developer, posted some questions on SLUB behavior. Charles uses a SLUB-like allocator in YAFFS to manage objects, but his objects are separated according to the mount to which they refer. This makes it very easy for him to just throw away a large number of objects on unmount without de-allocating them (“trust me, I know what I’m doing”). He is looking at replacing his custom allocator with SLUB in order to facilitate eventual mainlining of YAFFS, but wants to know whether SLUB could grow some additional “don’t combine this SLUB with others” and “trust me, I know what I’m doing: allow the cache to be dumped with objects still allocated” flags. So far, nobody has answered his questions.

In today’s miscellaneous items:

*). Mike Snitzer, Jens Axboe, Vivek Goyal, and Kiyoshi Ueda discussed (in a thread entitled “only initialize full request_queue for request-based device”) various approaches to minimalist initialization of Device Mapper devices, specifically given the new split handling of bio vs. request based devices. Only the latter type requires “full” queue setup.

*). Ingo Molnar requested that Linus pull the “lockup-detector-for-linus” tree, which contains a unified kernel lockup detector in kernel/watchdog.c that replaces the existing NMI watchdog, hung task, and softlockup detectors (and so forth), putting them all in one place. Big thanks go to Don Zickus for his work on this.

*). Discussion continued surrounding some documentation that Henrik Rydberg posted on the Multitouch event slots protocol for multitouch devices. It seems that these input devices become more complex by the day.

*). Don Zickus posted a patch entitled “Makefile.build: make KBUILD_SYMTYPES work again”, in which he provided some fixes to the code that provides a means to determine why kernel symbol versions have changed (i.e. which specific change to which kernel structure or function was the cause). This is of particular use to “Enterprise” distributions doing module versioning.

*). Michel Lespinasse (Google) posted a patch entitled “Stronger CONFIG_DEBUG_SPINLOCK_SLEEP without CONFIG_PREEMPT” in which he proposed tracking the preempt count even when not using CONFIG_PREEMPT, but when nonetheless building with CONFIG_DEBUG_SPINLOCK_SLEEP. Rather than the use of preempt_{dis,en}able actually resulting in preemption, it would merely serve as a means to warn when attempting to sleep incorrectly from within a critical section, but without explicitly enforcing it.

*). Discussion continued surrounding a previous patch from Kay Sievers adding new “devname” module aliases to facilitate module on-demand autoloading. The idea here is that modules can now provide the name of the device entry or entries they will create and so tools like udev can demand load modules as the nodes they support are accessed.

*). Thomas Gleixner finally posted the patch series he had threatened to post previously, entitled “Run interrupt handlers always with interrupts disabled”, that does largely what it says on the tin. It removes the IRQF_DISABLED functionality at interrupt registration and runs all interrupt handlers with IRQs off. This should facilitate greater migration over to modern threaded interrupt handlers as needed.

*). Neil Brown posted a patch entitled “VFS: fix recent breakage of FS_REVAL_DOT” in which he provided a fix for a change to NFS client mount behaviors, under which the client would no longer check at the time of the command whether a directory in which “ls -l” was being run had changed, and would instead rely on the cached timestamp attributes until they timed out. Al Viro took the patch, but did not like the implementation, so some further discussion ensued.

*). Arve Hjønnevåg posted the latest version of the “suspend block API”, which provides the “same functionality as the android wakelock api”. This is intended to control when a system will be blocked from suspending due to activity, and comes with the benefit of lengthy LKML discussion.

*). Glauber Costa posted version 3 of a patch implementing various MSR (Model Specific Register) KVM specific documentation.

In today’s announcements:

* Smatch 1.55. Dan Carpenter announced that release 1.55 of the “smatch” static C source checker tool is now available. The latest version includes an enhanced array overflow check, new checks for precedence bugs caused by macro expansion, rewritten checks for null pointer dereferences, and some kernel specific checks for kunmap, release_resource, etc. http://smatch.sf.net/ or git://repo.or.cz/smatch.git

* Jeff Merkey announced version 2.6.34 of ndiswrapper. Quoting Jeff, “Always here to support the hated projects of Evil Emperor Linus. Needed this f**king think to work on my laptop so fixed the busted sh*t.” His 4-letter-word strewn announcement was greeted by a reply from Simon Horman noting that he would be happy to send Jeff a dictionary if he was looking to “learn some words that are more than four letters long”.

The latest kernel release is 2.6.35-rc1.

Greg Kroah-Hartman posted a series of 2.6.32.14 stable kernel review patches. He notes that he only included patches that were released in kernels up to the 2.6.34 release, since the line had to be drawn somewhere. This is a “long term” stable kernel tree. Many vendors are basing on 2.6.32 now. Greg also posted “take 2″ of some 2.6.27.47 stable series patches, as well as stable review patches for 2.6.33.5.

That’s a summary of today’s Linux Kernel Mailing List traffic. For further information, visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2010/05/02 Linux Kernel Podcast

May 17th, 2010 jcm No comments

Audio: COMING SOON

For the weekend of May 2nd 2010, I’m Jon Masters with a summary of the past week’s LKML traffic.

In today’s issue: Linux 2.6.34-rc6, vger.kernel.org, Checkpoint and Restart, Frontswap, FUSE, and the Scheduler.

Linux 2.6.34-rc6. Linus Torvalds announced the latest 2.6.34 RC kernel on Thursday April 29th at 8:18pm PDT (Best Coast Time). The latest release is bloated by an updated PowerPC defconfig but does contain other fixes.

vger.kernel.org. There was a vger.kernel.org outage this week, from the 28th through the weekend, due to a power failure in the datacenter that hosts the equipment. This disrupted traffic to LKML, although some folks on IRC noted that their productivity had improved due to the lack of distraction.

Checkpoint and restart. Oren Laadan posted the latest version (21) of the “Kernel based checkpoint/restart” patch series, all 100 of the patches. He included various hints about which bits should be reviewed by whom, but the sheer size of the series boggled a few people. Although there wasn’t much discussion on the list, it does seem unlikely that a 100-part patch series of this kind would be pulled whole any time soon. http://www.linux-cr.org.

Frontswap. Discussion continued on some patches we missed in last week’s episode, on a rewritten piece of the previous “Transcendent Memory” patch series, named “Frontswap”. This piece of the large patch series – which is apparently shipping now in both OpenSuSE and Oracle Enterprise Linux – adds a new generic means to register what is the “opposite” of a swap-like backing store. Frontswap is essentially non-addressable RAM that is provided by a hypervisor (or perhaps a compressed in-kernel RAM device) and which may grow and shrink over time according to the availability of system resources. For example, a hypervisor may grant guests large amounts of otherwise unused RAM in the form of such “frontswap”able devices that may need to be reclaimed later on if other guests require the resources. Using frontswap, one can potentially avoid additional disk overhead usually associated with “swap”. One of the biggest criticisms, from Avi Kivity, was that these patches assume access to the frontswap device is synchronous and not being performed using DMA or some other asynchronous process. Dan Magenheimer confirmed that this is an intentional design limitation in order to make the implementation much simpler for its use case(s) dealing with real physical RAM. Dan noted that the conversation had gone off on a tangent, discussing such other (interesting, but not directly relevant) issues as swap-to-flash.

Fuse. Miklos Szeredi posted an RFC patchset implementing splice(2) support for FUSE (Filesystems in USErspace). This means that it is possible to move an existing page directly into the page cache of the FUSE filesystem without ever having to perform a copy. Given that there is obvious overhead in having filesystems implemented in userspace, adding splice support is a nice touch. Apparently the early tests show improved bandwidth and reduced system time but it will be interesting to see what further testing reveals over time.

Scheduler. Ted Baker, Joerg Roedel, Doug Niehaus, and Peter Zijlstra discussed scheduler policy and classes available in the kernel in a followup to a much earlier thread entitled “RFC for a new Scheduling policy/class in the Linux-kernel”, specifically about any plans to support SCHED_SPORADIC. Both Ted Baker and Doug Niehaus had plans for the ability to assign a task a priority that is specifically non-runnable without having to send it a signal – such as SIGSTOP – that requires the task to run in order to process the STOP. Peter Zijlstra stated that the current plan involved supporting the sporadic task model through the use of SCHED_DEADLINE rather than POSIX’s SCHED_SPORADIC (the name of which, according to Peter, was jokingly “stole[n] [...] from us”). Ted Baker replied to Peter, noting that deadline scheduling and sporadic server scheduling are “two quite different things” – the latter belonging to the existing fixed priority scheduling domain (that is a separate problem domain from that of the deadline scheduling folks). Ted thought that any problems with the POSIX SCHED_SPORADIC API could be corrected through “interpretation” of the standard, such that a solution would be available in short order rather than in the longer term, especially if Linux were to do something with an implementation that he could feed to the Austin Group (the POSIX folks).

In today’s miscellaneous items:

* Mike Travis (SGI) posted a patch providing a kernel parameter to increase pid_max from 32k for early-in-boot use, before it can be otherwise set to a higher value. Otherwise, on a system with 1664 CPUs, Mike finds that there are 25163 processes started before the login prompt!

* Jack Steiner (SGI) noted that the existing SLAB allocator implementation of cpuset_mem_spread_node used a single rotor for allocating both file pages and SLAB pages, so that (on a multi-node memory system), writing a particular test file results in advancing the rotor 2 nodes per allocation and skipping e.g. odd-numbered nodes in the SLAB pages allocation. The patch introduces a second rotor just for the SLAB page allocation.

* Philip Langdale (VM) noted that he has been following the Transparent Hugepage work over the past few weeks and is very encouraged. He claims a 22% improvement in ops/sec reported by SPECjbb under virtualization.

* A kernel developer posted a somewhat distressing thread suggesting some emotional disturbance caused by a particular relationship. In the interest of not being the US Weekly of LKML I shall refrain from further comment, and agree with the suggestion of using the “It’s Complicated” button on Facebook next time something like this comes up instead.

* Ying Huang posted initial support for APEI (ACPI Platform Error Interface).

* Joerg Roedel posted the second version of the “Nested Paging support for Nested SVM” patchset.

* Steven J. Magnani posted version 2 of a stack unwinder for Microblaze.

* Jonathan Corbet posted a second series of viafb patches for OLPC, and later pushed a version 2.1 of the series, containing three additional patches fixing issues pointed out by Bruno Prémont. The patches are available from git://git.lwn.net/linux-2.6.git in the branch viafb-posted. Jon wondered if the patches were ready to go into viafb-next.
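
The rotor problem Jack Steiner described in the items above is easy to see in a small simulation. This is a toy model under stated assumptions, not the kernel code: it assumes each step allocates one slab page followed by one file page, and the function name simulate and its parameters are invented for illustration.

```python
from itertools import cycle


def simulate(num_nodes, allocations, separate_rotors):
    """Return the nodes used for slab pages when each step allocates one slab
    page followed by one file page, spreading both round-robin across nodes."""
    slab_rotor = cycle(range(num_nodes))
    # With a shared rotor, file page allocations advance the same iterator.
    file_rotor = cycle(range(num_nodes)) if separate_rotors else slab_rotor
    slab_nodes = []
    for _ in range(allocations):
        slab_nodes.append(next(slab_rotor))
        next(file_rotor)  # the file page consumes a rotor position too
    return slab_nodes


if __name__ == "__main__":
    print("shared rotor:   ", simulate(4, 4, separate_rotors=False))
    print("separate rotors:", simulate(4, 4, separate_rotors=True))
```

With the shared rotor on four nodes, slab pages in this model only ever land on nodes 0 and 2, while a second rotor spreads them over all four.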

In today’s announcements:

* DRM. Stefan Bader posted to let everyone know that he is now maintaining a 2.6.32-based tree on kernel.org containing backported DRM improvements for 2.6.32 based kernels, since a number of vendors are using that tree. Luis R. Rodriguez replied saying that this was “Great stuff! Thanks for putting this up!”. One wonders if this is another sign of a growing trend.

* Linux 2.6.33-rt19. Thomas Gleixner announced version 2.6.33.3-rt19 of the Real Time patchset, containing mostly VFS scalability bits. This followed a previous 2.6.33-rt16 release also this week containing largely a merge with upstream 2.6.33, and -rt17 and -rt18 releases that contained a few fixes. Thomas notes in his posting that he had previously pushed out rt14 and rt15 without sending an announcement out to the list, so he included changelogs from -rt13 to 16, and rt17-rt18 (in the separate emails he made announcing -rt16, and -rt17). Patches are available at http://www.kernel.org/pub/linux/kernel/projects/rt/ and the tip git tree on git.kernel.org contains existing rt/head and rt/2.6.33 release branches.

* Upstart 0.6.6. Scott James Remnant announced the 0.6.6 release of the “upstart” SYSV init daemon replacement that supports modern asynchronous event driven operation rather than traditional runlevels (though it does also support emulating those for backward compatibility). Upstart is used by a number of distributions, and is available at upstart.ubuntu.com/

The latest kernel release was 2.6.34-rc6.

Greg Kroah-Hartman released stable series kernels 2.6.32.12 and 2.6.33.3. The former came with some thanks (and possibly an indirect dig at vendors) to Maximilian Attems for his “hard work digging out patches from the various vendor kernel trees for this release”. Maximilian was also thanked specifically in the latter case for contributing patches. Separately, Greg requested of Stephen Rothwell that he begin pulling a new staging-next tree into his daily linux-next tree (a nice present for Stephen as he returned from vacation).

Frederic Weisbecker replied (in an otherwise innocuous patch thread entitled “ptrace: Cleanup useless header”) noting that things touching the BKL should CC both him and Arnd Bergmann. They are still working on Big Kernel Lock (BKL) removal, which you can keep track of via http://kernelnewbies.org/BigKernelLock. There was some other BKL removal traffic over the past week, also, including some patches from Arnd entitled “Push down BKL into device drivers” (similar to the FS patches he had posted previously that did the same in that layer – nice).

That’s a summary of today’s Linux Kernel Mailing List traffic. For further information, visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2010/04/25 Linux Kernel Podcast

May 13th, 2010 jcm No comments

Audio: COMING SOON

For the weekend of April 25th 2010, I’m Jon Masters with a summary of the past week’s LKML traffic.

In today’s issue: Linux 2.6.34-rc5, CFS, Firmware, and IPC.

Linux 2.6.34-rc5. Linus Torvalds announced the release of Linux kernel 2.6.34-rc5 on Mon, April 19th 2010 at 4:42pm PDT (Best Coast Time). As he said, “Another week, another -rc. This time there wasn’t some big nasty regression I was working on to hold things up” (referring to the issues with anon_vmas and anon_vma_chains from last week). The latest release includes a number of general fixes, including boot fixes for ACPI parsing, and the usual kinds of driver updates (radeon, amd-iommu, filesystems). SPARC now has ftrace support if you are interested in playing with that. Upon mentioning regressions, Rafael J. Wysocki seemed to fly into action with his usual vigor and post his regular regression summary of issues outstanding since 2.6.33. The current statistics show that the number of unresolved issues has tended to increase over the several weeks leading up to -rc5, with 34 outstanding.

CFS. Mathieu Desnoyers posted version 2 of a patch entitled “CFS fix place entity spread issue”, which aims to address an apparent situation in which Mathieu felt that min_vruntime could go backwards and cause large unwanted latencies for certain workloads. Peter Zijlstra disputed that this was happening and Linus, upon testing the patch using his “favorite non-scientific desktop load”, found that it made things worse in terms of X performance, which was apparently to be expected (according to Mathieu) because Xorg had been getting unfair runtime treatment that was now corrected. This didn’t make Linus particularly happy (from a user experience viewpoint) and meanwhile Mathieu and Peter continued to debate what was happening. Mathieu posted some links to an ELC (Embedded Linux Conference) presentation that he did on this topic at http://www.efficios.com/elc2010 and then later followed up (in an entirely separate thread) with version 11 of his “introduce sys_membarrier(): process-wide memory barrier” that he uses to assist with his userspace RCU implementation, all the while still stranded at San Francisco airport waiting for a means to get back home.

Firmware. Tomas Winkler posted a thread entitled “request_firmware API exhaust memory” in which it was discovered that some performance enhancement work done by David Woodhouse a while back actually caused the kernel to leak memory used for firmware handling, especially in the case that a large number of calls were made to request_firmware, as in the case of Tomas’ code. The issue was that the firmware code was attempting to free pages not allocated with vmalloc using vfree, whereas the underlying pages were actually being allocated and then mapped into linear kernel virtual memory with vmap calls. The fix involves unmapping and then freeing.

IPC. Manfred Spraul posted a three part patch series entitled “ipc/sem.c: Optimization for reducing spinlock contention” in which he attempts to “fix the spinlock contention reported by Chris Mason: His benchmark exposes problems of the current code”. Manfred then summarizes three main issues, including the prominent first issue that the algorithm used by update_queue() has a worst case performance on the order of O(N^2) and bulk wakeups can enter this worst case if they are unlucky. After applying the patch and performing some runs with sembench using 250 threads, waking 64 threads at a time, Manfred reports 1.1% CPU lost spinning vs. 47% before, and 6% of spinlocks spinning vs. 91% before, amongst other statistics.

In today’s miscellaneous items:

* Jon Corbet posted version 2 of an RFC patch series entitled “Initial OLPC Viafb merge”, and noted that he would begin a linux-next tree.

* Yanmin Zhang posted version 5 of a patch intended to implement perf statistics collection in the host of various guest KVM instances.

* Hiroyuki Kamezawa reported an issue with memory compaction support in the mm-of-the-moment (mmotm) for 2010-04-15-14-42. He and Mel Gorman discussed it a little. Separately, Mel posted version 8 of the memory compaction patch series, without an obvious fix for the crash issue.

* Justin P. Mattock reported that the issues booting MacBook Pro systems from the previous week seemed to now be resolved in the latest kernels.

* Rusty Russell posted a module patch that causes the module_lock mutex to be dropped when waiting for parallel module loads to complete.

* Don Zickus posted a 6 part patch series entitled “lockup detector changes” that “covers mostly the changes necessary for combining the nmi_watchdog and softlockup code”.

* Stefani Seibold posted yet another (unversioned in the subject line) 4 part patch series that was entitled “enhanced reimplementation of the kfifo API”, and which contained basically a rebase to recent kernels.

* Kyle McMartin posted a patch changing the default file permissions on the kernel provided pseudo file /proc/sys/vm/mmap_min_addr to 0600 from 0644. There wasn’t a huge security issue as writes were already denied by virtue of the fact that CAP_SYS_RAWIO was also required underneath.

* Kent Overstreet posted version 3 of the “bcache” patch series.

In today’s announcements:

* Linux Plumbers Conference (LPC). Ted Ts’o posted a “Call for Tracks”, noting that this year’s conference will take place in Cambridge, MA from November 3-5. The organizers are looking for “problem statements” summarizing “things that could be improved in Linux that cross multiple interfaces or other project boundaries”. For further information about the conference, and to submit ideas, see: http://www.linuxplumbersconf.org/

* git 1.7.0.6. Junio C Hamano announced version 1.7.0.6 of the GIT utility used for version control by the Linux kernel community. The latest version includes fixes for “git diff --stat” overflow, and “git rev-list --abbrev-commit” using the older 40-byte abbreviation format. Junio also announced version 1.7.1 of the GIT utility, which included updates to gitk, the ability to invoke an external command for passwords (GIT_ASKPASS), a new bash completion script (for those who use that), and dozens of other fixes besides. Git is available on the kernel.org website: http://www.kernel.org/pub/software/scm/git/

* hwloc. Samuel Thibault announced the release of hwloc version 1.0rc1, a “hardware locality” utility intended to provide command line support for obtaining information about NUMA memory, shared caches, processor sockets, processor cores, and processor “threads”. For further detail see the project website: http://www.open-mpi.org/projects/hwloc/

The latest kernel release was 2.6.34-rc5.

Andrew Morton posted an mm-of-the-moment (mmotm) for 2010-04-22-16-38.

There was some ongoing discussion of kernel vmalloc performance and a few patches were posted, most recently from Minchan Kim.

Joe Perches asked about the -staging tree review and acceptance process, noting that there are a “number of patches appear[ing] to go unnoticed or untracked”. Greg Kroah-Hartman followed up explaining that he’s had conferences, travel, and has moved house, and basically asked for a break. Greg has generally been responsive on the staging tree discussion list in my experience, and there is a lot of work that goes in there.

Greg Kroah-Hartman posted a 2.6.32 stable kernel review patch series comprising 197 individual patches to the “long term” stable kernel 2.6.32. He also posted a 139 part patch series for the 2.6.33 stable series kernel.

That’s a summary of today’s Linux Kernel Mailing List traffic. For further information, visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags:

2010/04/18 Linux Kernel Podcast

May 10th, 2010 jcm No comments

Audio: COMING SOON

For the weekend of April 18th 2010, I’m Jon Masters with a summary of the past week’s LKML traffic.

In today’s issue: Linux 2.6.34-rc4, adaptive spinning mutexes, Microblaze, Remote Controller Subsystem, Stack Size, and VM.

Linux 2.6.34-rc4. Linus Torvalds announced the release of kernel 2.6.34-rc4 on Monday April 12th 2010 at 7:16pm PDT (Best Coast Time), which had been delayed while he, Borislav Petkov, Rik van Riel, and others were tracking down an annoying rmap VM regression caused by the introduction of anon_vma_chain support. Most of Linus’ announcement covers that bug – stay tuned for some coverage on that – but also mentions the new cxgb4 network driver.

Adaptive spinning mutexes. Benjamin Herrenschmidt posted a new thread entitled “Possible bug with mutex adaptive spinning” in which he noted that the current adaptive spinning (in which a mutex will spin briefly rather than immediately going to sleep if the owner of a lock is already running and might release it soon) code in mutex_spin_on_owner() does not correctly handle the case of the owner CPU being offlined. In this case, the function will return 1, meaning that the caller should spin, which it may do forever. Ben changes the return to 0 in the case that the CPU is offline so that a sleep occurs immediately.
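
The spin-or-sleep decision can be sketched as a small function. This is a toy model of the logic as described above, not the kernel’s mutex_spin_on_owner(): the Owner class, the should_spin name, and the set of online CPUs are all stand-ins invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Owner:
    """Toy stand-in for the lock owner's state; not a kernel structure."""
    cpu: int
    running: bool


def should_spin(owner, online_cpus):
    """Return 1 to keep spinning for the mutex, 0 to go to sleep."""
    if owner.cpu not in online_cpus:
        # The fix described above: the owner's CPU has been offlined, so
        # spinning could last forever. Sleep instead.
        return 0
    # Spin only while the owner is actually running and may release soon.
    return 1 if owner.running else 0
```

For example, with CPUs 0 and 1 online, an owner recorded on CPU 2 makes the waiter sleep immediately, where the behavior Ben reported would have kept it spinning.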

Microblaze. Michal Simek posted a thread entitled “Microblaze – The fi[r]st year”, in which he summarized what has happened in the year since support for the soft-core Xilinx Microblaze CPU was first added to the mainline kernel. He calls out a number of folks for specific thanks – both from Xilinx, and from PetaLogix, as well as the wider community (the usual suspects: Andrew Morton, Arnd Bergmann, Grant Likely, Ingo Molnar, John Linn, John Williams, Stephen Neuendorffer, etc.). He includes a timeline of events over the past year as well as links to git trees, the wiki, and even a Facebook fan page (such is the world in which we live today – and yes, I am a “fan” myself).

Remote Controller Subsystem. Mauro Carvalho Chehab posted an informative mail entitled “Remote Controller subsystem status” in which he updated everyone on the current progress toward implementing a new “remote controller” subsystem that replaces the legacy V4L/DVB code and will become a new “core” subsystem available in /sys/rc. There is a userspace tool called ir-keytable and some discussion of plans for merging in 2.6.35. A mail worth reading.

Stack size. Dave Chinner posted a thread entitled “mm: disallow direct reclaim page writeback” in which he advocates for using the background IO flusher threads even in the case that VM pressure is so high that direct page reclaim becomes a necessity. Dave feels that in such cases, “we may have used an arbitrary amount of stack space, and hence enterring the filesystem to do writeback can then lead to stack overruns. This problem was recently encountered [on] x86_64 systems with 8k stacks running XFS with simple storage configurations”. This led to a longer thread in which the issue of kernel stack footprint was addressed, as well as the specific issue of what to do in the direct reclaim situation. Andi Kleen followed up to Chris Mason’s comments concerning the relatively large footprint of single fs functions with an assertion that the ‘4K stack simply has to go. I tend to call it “russian roulette” mode’. Andi considers such small stacks to be dangerous given the “obscure paths through the more an more subsystems”. He is fond of the separate interrupt stack in the case of 4K process stacks, but feels that there should always be a separate interrupt stack in any case, as might have helped in the case that Dave Chinner was mentioning in the original posting. Mel Gorman later followed up with an RFC patch series entitled “Reduce stack usage used by page reclaim” in which he attempted to “reduce some of the more obvious stack usage in page reclaim”, including in putback_lru_pages, kswapd, shrink_page_list, shrink_zone, and so forth (up to 1096 bytes saved).

VM. The Linux kernel includes support for reverse page mapping (rmap), a means by which it is possible for the Virtual Memory subsystem to answer important scalability questions such as “which virtual memory pages reference this physical page?” without having to walk through a large number of process page tables each time. Over the years, this code has become more complex through the addition of anon_vma and anon_vma_chain structures intended to allow object based reverse mapping of anonymous memory pages with reduced overhead as compared with Rik van Riel’s original (and simpler) mechanism of having additional pointers in every struct page. anon_vma is used to track per-task anonymous VMA use, while anon_vma_chains link these together to allow the VM to determine which tasks have a shared reference to a given anonymous VMA.

The implementation of this complex VMA tracking was suffering from a bug that Borislav Petkov kept hitting in performing a suspend/resume cycle on his system, in which the resume code would wind up referencing a previously unmapped shared page first within a child process (setting up a new anon_vma) and later within a parent (causing an anon_vma_chain link to be set up pointing in the wrong direction from child to parent) that subsequently could no longer reach the child anon_vma after the child task exited. As Linus said, “End result: process A has a page that points to anon_vma B, but anon_vma B does not exist any more. This can go on forever. Forget about RCU grace periods, forget about locking, forget anything like that. The bug is simply that page->mapping points to an anon_vma that was correct at one point, but was _not_ the one that was shared by all users of that possible mapping.” Thus the fix is to ensure that new anon_vma_chain entries are always referencing the “_oldest_ possible anon_vma for the page mapping”, as is the case for Linus’ eventual (simple) patch, entitled “[PATCH 4/4] anonvma: when setting up page->mapping, we need to pick the _oldest_ anonvma”. Borislav said it survived more than 20 test cycles where the system would previously have managed at most 6 resume attempts.

Linus seemed genuinely excited about tracking down this bug – it can’t always be easy doing his job, and I’m sure he relishes an occasional really dirty bug to poke at. One thing that did come of this exercise was an improvement in comments and documentation both on list and in the affected code. Linus seemed very happy with the effort Borislav was putting in to help test and track down this issue (ending the thread with a little joke about Borislav’s email gateway, which claims to be “SuperMail on a ZX Spectrum 128k”). The thread fixed a few other issues as well, and gave Peter Zijlstra a chance to post a documentation patch for page_lock_anon_vma noting that it is very difficult to serialize fully against page_remove_rmap so that the lock function doesn’t try, but instead all users of it should verify that the anon_vma returned to them is actually still relevant to them. Finally, Ulrich Drepper followed up some time later – on a tangent – wondering aloud why mprotect need create so many VMAs when changing permissions on thread stacks and the like instead of modifying page table entries.

As usual, Linux Weekly News (LWN) did a much better job of explaining the overall multi-day issue in depth so you are encouraged to take a look at their story for more of the history, analysis, and nice graphics.

In today’s miscellaneous items:

* Robert Richter posted some model specific performance events patches in order to support AMD IBS (an unfortunate acronym in this case standing for Instruction Based Sampling).

* Nigel Cunningham was looking for a job.

* Several people have reported issues booting MacBook Pros with recent kernels. Len Brown noted that this was likely already fixed (referencing BZ 15749). In response, Harald Arnesen was especially happy about git bisect as a debugging tool that lets non-kernel hackers help track down bugs such as this one.

* Jason Baron posted version 7 of his “jump label” patch series.

In today’s announcements:

Git 1.7.1.rc1. Junio C Hamano announced Git version 1.7.1.rc1, which includes a number of fixes. http://www.kernel.org/pub/software/scm/git/ This comes at around the time of the 5th anniversary of the kernel switching to Git for development, which Christian Ludwig noted occurred on April 15th. Christian notes that he has made a YouTube video visualizing git development history, available at http://www.youtube.com/watch?v=ntTpM8hfl_E

Guilt 0.33. Josef “Jeff” Sipek announced that version 0.33 of the Guilt (Git Quilt) series of bash scripts was now available from the usual location.
http://www.kernel.org/pub/linux/kernel/people/jsipek/guilt/

LTTng 0.210. Mathieu Desnoyers announced LTTng 0.210 for kernel 2.6.33.2, which was largely a revert of a PowerPC specific TRACE_EVENT definition that occurred outside of include/trace, and which particularly bothered Mathieu.

sdparm 1.05. Douglas Gilbert announced that the 1.05 release of sdparm was now available. This is a direct analogue of “hdparm” but for SCSI devices, and so supports a lot of fancy SCSI specific options.

trace-cmd version 1.0. Steven Rostedt announced version 1.0 of his trace-cmd utility, which is a cross-platform, endian safe binary reader for ftrace that can be used to capture data on one machine (e.g. as a flight recorder) and then decode and process it on another, at runtime, or after the fact.

The latest kernel release was 2.6.34-rc4.

Andrew Morton posted an mm-of-the-day (mmotm) for 2010-04-15-14-42.

An issue was discovered with a net-2.6 patch entitled “tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb” that caused ssh to fail. David Miller subsequently stated that he would revert this patch and specifically test zero length data area CHECKSUM_PARTIAL packets with the IGB driver.

Pavel Machek noted that the LOCALVERSION_AUTO configuration option, which appends a new version to the kernel on each compilation, has an unfortunate interaction with loadable kernel modules when CONFIG_MODVERSIONS is unset, in that it causes the simple kernel version check to fail. Linus was very clear that the problem here is people building kernels without enabling modversions and expecting that to be even remotely safe.
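As a rough illustration of why the two options interact, here is a small Python sketch of a version check of this kind. It assumes that, without modversions, the check is a plain string comparison of the module's and kernel's version strings, and that with modversions the release portion is skipped because per-symbol checksums are relied upon instead. The function names and the version strings below are made up for the example and are not taken from the thread.

```python
def skip_version(magic):
    """Drop everything before the first space, i.e. the release string."""
    idx = magic.find(" ")
    return magic[idx:] if idx != -1 else ""


def same_magic(module_magic, kernel_magic, has_crcs):
    """Return True if the module would pass this toy version check."""
    if has_crcs:
        # Assumption: with symbol checksums available, the release
        # string is ignored and only the remaining flags must match.
        module_magic = skip_version(module_magic)
        kernel_magic = skip_version(kernel_magic)
    return module_magic == kernel_magic


# Hypothetical strings: the same tree built twice, with an automatically
# appended suffix that differs between the two builds.
KERNEL = "2.6.34-rc4-00002-g1111111 SMP mod_unload"
MODULE = "2.6.34-rc4-00001-g0000000 SMP mod_unload"
```

In this sketch, same_magic(MODULE, KERNEL, has_crcs=False) fails purely because the automatically appended suffix differs, while the has_crcs=True case passes and leaves the safety question to the checksums.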

That’s a summary of today’s Linux Kernel Mailing List traffic. For further information, visit www.kernel.org. I’m Jon Masters.

Categories: episodes Tags: