The PDF Spotlight plugin hangs when it can't parse a file's CMap

Originator:SaagarJha28
Number:rdar://31631951 Date Originated:14-Apr-2017 11:44 AM
Status:Open Resolved:
Product:macOS + SDK (Spotlight) Product Version:
Classification:Performance Reproducible:
 
Area:
Spotlight

Summary:
The PDF importer appears to hang on certain PDF files, although it does seem to finish eventually if given enough time. I've attached one such file to this report. It was generated by converting an EPUB with Calibre's (https://calibre-ebook.com) PDF converter.

Steps to Reproduce:
1. Run mdimport on the attached file

Expected Results:
Spotlight imports the file quickly.

Actual Results:
PDF.mdimporter logs "failed to parse embedded CMap." hundreds of thousands of times and is killed because the import takes too long.

Version:
macOS Sierra 10.12.5 Beta (16F54b)

Notes:


Configuration:
Early 2015 MacBook Pro with Retina Display running
