NSXMLParser ignoring internal entity definitions

Originator:roppongi765
Number:rdar://9912558 Date Originated:08-Aug-2011 03:49 PM
Status:Open Resolved:No
Product:Mac OS X Product Version:10.7.0
Classification:Serious Bug Reproducible:Always
 
Summary:
NSXMLParser no longer calls parser:foundCharacters: for references to internal entities such as &something;. Earlier releases did.

Steps to Reproduce:
Create a minimal XML document like:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
<!ELEMENT doc (node)*>
<!ELEMENT node ANY>
<!ENTITY entity "Lorem ipsum dolor">
]>
<doc>
	<node>Foo &entity; bar</node>
</doc>

Then parse it using NSXMLParser.

Expected Results:
After the <node> element starts, NSXMLParser should call parser:foundCharacters: three times:
1) foundCharacters: @"Foo "
2) foundCharacters: @"Lorem ipsum dolor"
3) foundCharacters: @" bar"

Actual Results:
NSXMLParser calls parser:foundCharacters: only twice:
1) foundCharacters: @"Foo "
2) foundCharacters: @" bar"

Regression:
As a workaround, use libxml2 directly (ironically, NSXMLParser itself uses it) or any other XML parser.
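
For comparison, here is a minimal sketch of the expected behaviour using a different parser. It feeds the document from Steps to Reproduce to the expat binding in Python's standard library and collects the character data reported inside <node>. This is neither NSXMLParser nor libxml2; expat is only assumed here as a convenient example of "any other XML parser", and the helper name collect_node_text is made up for illustration. The chunks are joined because a SAX-style parser may split character data across any number of callbacks.

```python
import xml.parsers.expat

# The document from "Steps to Reproduce", unchanged.
DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
<!ELEMENT doc (node)*>
<!ELEMENT node ANY>
<!ENTITY entity "Lorem ipsum dolor">
]>
<doc>
	<node>Foo &entity; bar</node>
</doc>
"""


def collect_node_text(document):
    """Return all character data reported inside <node>, joined together.

    Hypothetical helper for this report; not part of any library.
    """
    chunks = []
    inside_node = False

    def start_element(name, attrs):
        nonlocal inside_node
        if name == "node":
            inside_node = True

    def end_element(name):
        nonlocal inside_node
        if name == "node":
            inside_node = False

    def character_data(data):
        # Counterpart of parser:foundCharacters: in NSXMLParser.
        if inside_node:
            chunks.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = character_data
    # No default handler is installed, so expat expands internal entities.
    parser.Parse(document, True)
    return "".join(chunks)


node_text = collect_node_text(DOCUMENT)
print(repr(node_text))
```

With the entity expanded, the collected text contains "Lorem ipsum dolor" between "Foo " and " bar", which is what the Expected Results above describe for NSXMLParser.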

Notes:
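NSXMLParser does not call any other delegate method when it encounters the entity reference &entity; either, so the entity's replacement text is silently dropped. In its current implementation, NSXMLParser is unusable for documents that rely on internal entities.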
