CoreML reports NaN confidence values when performing inference on GPU
Originator: srichey
Number: rdar://47670895
Date Originated: 01/30/2019
Status: Open
Resolved:
Product: CoreML
Product Version:
Classification: Other bug
Reproducible: Yes
Summary:
We are developing an iOS application that uses image classification to identify products in a large catalog. We originally had 30-60 products -- and thus, 30-60 output labels -- but as we scale up to the full catalog of 400+ products, we've encountered an issue where the confidence values output by the model are all NaN when using the GPU for on-device inference on iOS. Initially, we believed this was an issue with one specific CoreML file produced by a training session, as the next model we trained did not exhibit the issue. However, we have now seen the issue recur with numerous models, leading us to believe it is a bug in iOS. It does not occur when performing inference on the CPU, and does not seem to be affected by whether the mlmodel file is compiled on device or as part of Xcode's build process.

Steps to Reproduce:
These steps are dependent on the exact model used, as only some exhibit this issue.

1. Use Turi Create to create a CoreML file based on resnet-50 (the issue is most likely to occur with a high number of output labels)
2. Import this CoreML file into a new Xcode project using standard settings
3. Add an image to the Xcode project
4. In the default ViewController, add code to viewDidLoad that performs the following:
   a. Read the image into a CVPixelBuffer
   b. Create the MLModel from the included mlmodel file
   c. Perform prediction on the given image
   d. Either print the label/confidence outputs of the model to the log, or display them in a UILabel

Example project attached.

Expected Results:
We would expect all confidence values to be non-NaN, as we have seen in the past, and as we see when performing inference on the CPU.

Actual Results:
On certain devices, the output confidence values are NaN. As noted in the "Configuration" section below, this does not occur on all devices that run iOS.

Version/Build:
The term "valid" below means that all output confidence values were normal floating-point values, and were not NaN or infinite. In one case the results array was empty, and no error was given.

iPhone 8 Plus, iOS 12.1.2: Valid on CPU, NaN on GPU
iPhone XR, iOS 12.1.3: Valid on CPU, Valid on GPU
iPad Pro (9.7 inch), iOS 12.0: Valid on CPU, NaN on GPU
iPhone 6s Plus, iOS 12.1.2: Valid on CPU, Valid on GPU
iPhone XS Max, iOS 12.1.3: Valid on CPU, Valid on GPU
iPhone 7 Plus, iOS 12.1.3: Valid on CPU, NaN on GPU
iPhone 7, iOS 12.1: Valid on CPU, NaN on GPU
iPad Air 2, iOS 12.1.1: Valid on CPU, NaN on GPU
iPad mini 3, iOS 11.2.5: Valid on CPU, Valid on GPU
iPad Pro (12.9 inch), iOS 11.0.1: Valid on CPU, Empty on GPU
iPad Pro (11 inch), iOS 12.1.1: Valid on CPU, Valid on GPU

Configuration:
iMac Pro 2017
macOS Mojave 10.14
Turi Create 5.2.1
Python 2.7

Image classifier uses resnet-50 as the base model. We have over 100,000 images with over 200 output labels.
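Steps 4b-4d above can be sketched in Swift as follows. This is a minimal sketch, not the attached example project: the generated class name `ProductClassifier` and the feature names `"image"` and `"labelProbability"` are assumptions (they depend on the Turi Create export), and the pixel-buffer creation from step 4a is omitted. Setting `usesCPUOnly` on `MLPredictionOptions` is how we force CPU inference to confirm the GPU-only nature of the bug.

```swift
import CoreML

// Sketch of steps 4b-4d: load the model, run a prediction, and check the
// confidence values for NaN. "ProductClassifier" is the assumed name of the
// class Xcode generates from the .mlmodel file; "image" and "labelProbability"
// are assumed input/output feature names for a Turi Create image classifier.
func classify(_ pixelBuffer: CVPixelBuffer) throws {
    // Use the underlying MLModel so we can pass prediction options.
    let model = try ProductClassifier(configuration: MLModelConfiguration()).model

    let options = MLPredictionOptions()
    // Forcing CPU inference sidesteps the NaN results on affected devices.
    options.usesCPUOnly = true

    let input = try MLDictionaryFeatureProvider(
        dictionary: ["image": MLFeatureValue(pixelBuffer: pixelBuffer)])
    let output = try model.prediction(from: input, options: options)

    // Print each label/confidence pair, flagging any NaN values (step 4d).
    if let probs = output.featureValue(for: "labelProbability")?.dictionaryValue {
        for (label, confidence) in probs {
            let value = confidence.doubleValue
            print("\(label): \(value)\(value.isNaN ? "  <-- NaN" : "")")
        }
    }
}
```

With `usesCPUOnly = true` the confidence values come back valid on every device we tested; leaving it `false` (the default, which permits the GPU) reproduces the NaN output on the affected devices listed above.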