Select(2) on character device hangs and makes system unstable on El Capitan

Originator:billziss
Number:rdar://23225698 Date Originated:2015-10-22
Status:Open Resolved:
Product:OSX Product Version:10.11.1
Classification:Crash/Hang/Data Loss Reproducible:Always
 
Summary:
My project is a user-mode file system that uses the third-party open-source software OSXFUSE. The OSXFUSE project includes a kext that makes writing user-mode file systems possible, by working as a conduit between the kernel and the user-mode file system implementation.

The OSXFUSE kext implements a character device that the user-mode file system uses to communicate with the kext. Prior to OS X 10.11 the user-mode file system could issue select(2) on the character device to determine whether input is available. Since OS X 10.11, a select(2) on the character device results in a hung system call, an unkillable process and often a kernel panic.

The full list of symptoms is as follows:
- select(2) hangs indefinitely even when called with a 0 timeout.
- The FUSE process (i.e. the user-level file system) becomes unkillable (kill -9 does *not* work).
- The mount_osxfusefs (an OSXFUSE helper process) hangs and becomes unkillable (kill -9 does *not* work).
- CPU usage of the FUSE process, the mount_osxfusefs process, or both shoots to 100%.
- Every few attempts this also results in a kernel panic (the rarely seen OS X equivalent of a BSOD).
- - https://support.apple.com/en-us/TS3742
- If the system does not crash, it cannot be shut down or restarted except by holding down the power button.
- When the system crashes it leaves a crash report behind. In most of these reports the OSXFUSE kext is named as the likely culprit.

I understand that the obvious culprit is the OSXFUSE kext. I also understand that OSXFUSE is not an Apple-provided framework. However, I point out that this kext has worked on prior versions of the OS without any problems. Furthermore, there have been no relevant changes in the kext in the last few months.

The complete code to OSXFUSE is available at this location:
    https://github.com/osxfuse
The kext code in particular is here:
    https://github.com/osxfuse/kext

I believe that the code that handles select(2) in the kext is in function fuse_device_select() in the file fuse_device.c. This code has not been changed since 2012.

To help track this issue, I have attached a zip file containing:
- A minimal implementation of a FUSE file system that exhibits the issue (hello_ll.c).
- Two kernel crash reports.

Steps to Reproduce:
- Install OSXFUSE 2.8.1 from https://osxfuse.github.io
- Compile the attached hello_ll.c using the command line:
-- clang hello_ll.c -I /usr/local/include/osxfuse/fuse -D_FILE_OFFSET_BITS=64 -L/usr/local/lib -losxfuse
- Create a temporary directory for mounting:
-- mkdir /tmp/root
- Mount the broken "hello" file system:
-- ./a.out /tmp/root

Expected Results:
I expected the select(2) calls to complete without issues (esp. when issued with a 0 timeout).


Actual Results:
The select(2) call hangs, the process becomes unkillable (kill -9 does *not* work), and the system becomes unstable, often ending in a kernel panic.

Version:
OS X: 10.11.1 (15B42)

[Also on OS X 10.11.0]

Notes:
No workaround has been found.

The problem also happens with the latest experimental OSXFUSE releases (currently 3.0.6).

Configuration:
Hardware: MacBook Pro (Retina, 13-inch, Early 2015)


Attachments:
'SelectBug.zip' was successfully uploaded.

Comments

A release note for FUSE for OS X 3.0.7

https://github.com/osxfuse/osxfuse/releases/tag/osxfuse-3.0.7

> … Fix select(2) for FUSE devices on OS X 10.11. The issue was caused by a kernel private struct that has changed between OS X 10.10 and 10.11. …

By grahamperrin at Feb. 18, 2016, 11:30 a.m.

The OSXFUSE maintainer has instructed me to attach an archive that contains the relevant kext symbols. The archive contains three kernel extensions built for different versions of OS X: 10.5, 10.6 and 10.9. The 10.9 build is the one used on OS X 10.11.

'kext-2.8.1.tbz' was successfully uploaded.

