iPhone implements Emoji incompatibly

Originator: mo
Number: rdar://6402446
Date Originated: 2008-11-26
Status: Duplicate/6209815
Resolved: No
Product: iPhone
Product Version: 2.2
Classification: Other bug
Reproducible: Always
 
Further related to rdar://problem/6398355 and rdar://problem/6395684

Summary:

iPhone OS 2.2 includes Emoji support, which is by default enabled solely for Softbank SIMs, and is only “supported” when sending text messages to other Softbank users or when sending e-mail via Softbank’s i.softbank.jp e-mail server.

The reason for this appears to be that the iPhone implements Emoji using the “private use” section of Unicode, making use of code points in the range U+E001–U+E05A. This is presumably because Softbank has not previously made use of Unicode-based encodings for i-mode services and Emoji, preferring to use S-JIS instead.
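As a rough illustration of what “private use” means in practice, the following Python sketch inspects U+E001, one of the code points mentioned above. The helper name `describe` is hypothetical and not part of any Apple or Softbank API; the sketch relies only on Python's standard `unicodedata` module. It shows that the code point encodes to UTF-8 without trouble, but that the Unicode Character Database assigns it no name and only the general category `Co` (private use), so any meaning depends entirely on a private agreement between sender and receiver.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Summarise what the Unicode Character Database knows about one character."""
    cp = ord(ch)
    return {
        "code_point": f"U+{cp:04X}",
        # 'Co' is the general category for private-use code points.
        "category": unicodedata.category(ch),
        # Private-use code points have no standard name, so this is None.
        "name": unicodedata.name(ch, None),
        # The bytes travel fine; only the interpretation is unspecified.
        "utf8": ch.encode("utf-8").hex(),
    }


if __name__ == "__main__":
    # U+E001 is used here purely as a sample from the range in the report.
    print(describe("\ue001"))
```

A receiving device that has no glyph mapped at that private-use code point has nothing standard to fall back on, which is the interoperability concern raised in this report.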

In contrast, NTT DoCoMo’s services are Unicode-based, and DoCoMo has recommended for some time that developers building i-mode services use Unicode throughout.

While Apple's implementation of Emoji was designed to not conflict with DoCoMo’s similar use of the Unicode private use area (DoCoMo’s range begins at U+E63E, as detailed on http://www.nttdocomo.co.jp/english/service/imode/make/content/pictograph/basic/index.html), it is inherently incompatible with any existing implementation.

My best guess is that i.softbank.jp and Softbank’s SMSC translate between the private-use area code points used by the iPhone and the S-JIS characters used by other subscribers to Softbank’s services. Because Apple and Softbank elected to implement the service this way, picking a private-use area range separate from DoCoMo’s (DoCoMo has operated a Unicode-based service for some time), iPhone applications are guaranteed to interoperate properly only with other Softbank subscribers in the best case (when sending text messages or e-mail from an @i.softbank.jp address), or only with other iPhone OS 2.2 users in the worst case (for example, when using third-party web services). Had Apple and Softbank elected to use the same code points as DoCoMo, all devices which encoded emoji as Unicode code points would interoperate.

Steps to reproduce:

1. Include emoji in a text sent by an application

Expected result:

Either:

a) [Best case] Emoji would be encoded as Unicode code points from a range reserved for the purpose by the appropriate standards body, guaranteeing interoperability.

or

b) [Worst case] Emoji would be encoded as Unicode code points in a manner consistent with existing implementations, ensuring compatibility with the “de facto” standard.

Actual result:

Emoji are encoded using private-use area code points which are compatible only with other iPhone OS 2.2 users in most cases. In many cases, even subscribers of the same network (Softbank) will not be able to view the pictograms.


26-Nov-2008 01:10 PM Mo McRoberts:
Apologies, where I stated U+E001-U+E05A in the above report, I obviously intended to say U+E001-U+E537.

Comments

Untrue

"it is inherently incompatible with any existing implementation" Not true. I can send and receive emoji between iPhones and my Japanese Softbank/Toshiba 911T, and they show up correctly. This is through SMS, on a Swedish network (so there's no operator-side translation going on). I can type them on the Softbank phone and send them to the iPhone, and it works fine. I can type them on the iPhone and send them to the Softbank phone, and they work fine. This is consistent with the Unicode code points listed at http://creation.mb.softbank.jp/web/web_pic_01.html

Previously used

Hi,

I worked on projects for Vodafone KK (later Softbank) back in 2004. We definitely used the Unicode private use range for sending emoji. The specs at that time also called for UTF-8 rather than Shift-JIS (trying to follow the OMA standards). I'm not sure whether this has changed since, though.

By anders.hasselqvist at Dec. 3, 2008, 1:41 a.m.

Re: Previously used

That's interesting to know—the only information on the various ranges is really stuff dotted around the web (including some of the developer pages which are helpfully translated into English, but they're relatively few and far between).

Do you know if Softbank uses this particular range for general-purpose use, or whether it's a different one? The OMA spec just says that existing Unicode code points should be used where possible, but doesn't detail what should happen in other situations beyond specifying that the private use area should be used (i.e., it doesn't say what code points within the PUA should be utilised).

Actual ranges

This is obviously beside the point, but U+E001–U+E05A is not the only range used. I’ve found the following:

U+E001–U+E05A, U+E101–U+E15A, U+E201–U+E253, U+E301–U+E34D, U+E401–U+E44C, U+E501–U+E537

There’s a dump at http://ahruman.is-a-geek.net/temp/emoji.html, only useful on an iPhone OS 2.2 device for obvious reasons.

By jens.ayton at Nov. 26, 2008, 10:50 a.m.
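
For anyone wanting to detect these code points in application text, here is a minimal Python sketch built only from the six ranges listed in the comment above and the DoCoMo starting point (U+E63E) quoted in the report. The names `IPHONE_EMOJI_RANGES`, `is_iphone_emoji` and `find_iphone_emoji` are hypothetical, chosen for illustration. The ranges are taken from this thread and have not been checked against any official Apple or Softbank table.

```python
# Inclusive (start, end) pairs, as listed in the comment in this thread.
IPHONE_EMOJI_RANGES = [
    (0xE001, 0xE05A),
    (0xE101, 0xE15A),
    (0xE201, 0xE253),
    (0xE301, 0xE34D),
    (0xE401, 0xE44C),
    (0xE501, 0xE537),
]

# Start of DoCoMo's private-use range, as quoted in the report.
DOCOMO_RANGE_START = 0xE63E


def is_iphone_emoji(ch: str) -> bool:
    """Return True if ch falls inside one of the ranges listed above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in IPHONE_EMOJI_RANGES)


def find_iphone_emoji(text: str) -> list:
    """Return the matching code points in text, formatted as U+XXXX."""
    return [f"U+{ord(ch):04X}" for ch in text if is_iphone_emoji(ch)]


def below_docomo_start() -> bool:
    """Check that every listed range ends before DoCoMo's range begins."""
    return all(hi < DOCOMO_RANGE_START for _, hi in IPHONE_EMOJI_RANGES)


if __name__ == "__main__":
    print(find_iphone_emoji("hello \ue415 world"))
    print(below_docomo_start())
```

A web service could use a check like this to flag messages containing iPhone-specific code points before relaying them to devices that may not render them.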

Yes, you're right!

I'm an idiot, and grabbed the wrong end-value from that page when I submitted the rdar; I copied the first end-value instead of the last end-value.

