ld can crash when encountering a symbol in a nested anonymous namespace

Originator: mark
Number: rdar://8010332
Date Originated: 05/20/2010
Status: Duplicate/7989734
Resolved:
Product: Developer Tools
Product Version: Xcode 3.2.2 10M2148, ld64 97.2
Classification: Crash/Hang/Data Loss
Reproducible: Always
 
SUMMARY

Sloppy programming in ld64-97.2/src/ld/ld.cpp’s canonicalizeAnonymousName function causes ld to behave unexpectedly and may cause ld to crash. The problem occurs when all of the following hold: ld encounters a symbol in a nested anonymous namespace, the name of the namespace enclosing the anonymous namespace ends with one or more digits, and an order file is used.

STEPS TO REPRODUCE

Attempt to build the attached test program using “make”:

mark@selecty bash$ tar -jxf ldcrash.tar.bz2 
mark@selecty bash$ cd ldcrash
mark@selecty bash$ make

EXPECTED RESULTS

This is a simple well-formed no-op program. It should compile and link successfully.

OBSERVED RESULTS

The program fails to link. ld crashes with SIGSEGV.

g++  -c test.cc -o test.o
g++  test.o -o test -Wl,-order_file,test.order
collect2: ld terminated with signal 11 [Segmentation fault]
make: *** [test] Error 1

REGRESSION

This appears to have regressed between Xcode 2.4.1/2.5’s ld64-62.1 and Xcode 3.0’s ld64-77, although this analysis is based solely on source code examination and I haven’t tested with a live 2.5 installation.

NOTES

canonicalizeAnonymousName simplistically scans a symbol for _GLOBAL__N_ and then backs up as many digits as it is able to in an attempt to find the length of the anonymous namespace pseudo-name generated by the compiler. If the containing namespace’s name ends in a digit, canonicalizeAnonymousName will interpret it as part of the anonymous namespace pseudo-name’s length. For example, in this program:

namespace n1 {
namespace {
int i;
}  // namespace
}  // namespace n1

the variable i is decorated as __ZN2n112_GLOBAL__N_11iE. canonicalizeAnonymousName will read the length of the anonymous namespace’s pseudo-name as 112 characters, when in reality it is only 12 characters, because it includes the trailing 1 from the parent namespace’s name, n1. As a result, canonicalizeAnonymousName misbehaves: the result it returns will be incorrect, it will potentially overflow its “out” buffer, and it may crash. Certain names will make a crash more likely. Naming the outer namespace n9999999 and using a small test program will cause a crash in both the i386 and x86_64 versions of ld64-97.6.2.
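
The misreading can be reproduced outside of ld with a minimal sketch. The function below is hypothetical and is not the ld64 source; its name, naiveAnonymousLength, is invented for illustration. It only mimics the backward digit scan described above: find _GLOBAL__N_, back up over every preceding digit, and read those digits as a length.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical illustration only, not the ld64 implementation.
// Returns the length that a backward digit scan would read in front of the
// first "_GLOBAL__N_" marker, or 0 if the marker or the digits are missing.
unsigned long naiveAnonymousLength(const std::string& symbol) {
  const std::string marker = "_GLOBAL__N_";
  const std::size_t pos = symbol.find(marker);
  if (pos == std::string::npos) return 0;

  // Back up over every digit immediately preceding the marker. Nothing here
  // knows where the previous name component ended, so digits that belong to
  // the parent namespace's name are swallowed too.
  std::size_t start = pos;
  while (start > 0 &&
         std::isdigit(static_cast<unsigned char>(symbol[start - 1]))) {
    --start;
  }
  if (start == pos) return 0;

  return std::strtoul(symbol.substr(start, pos - start).c_str(), nullptr, 10);
}
```

For __ZN2n112_GLOBAL__N_11iE this returns 112 rather than 12. For the n9999999 variant, whose decorated name would be __ZN8n999999912_GLOBAL__N_11iE, it returns 999999912, which suggests why that name makes a crash so much more likely.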

canonicalizeAnonymousName is only called in the ordered symbol sort portion of the sortAtoms routine, and only when an ordered symbol is given that is not found in the global symbol table. One way to ensure that this occurs is to request ordering of a symbol that’s not present in the output. This can occur in production scenarios where a single order file is used with multiple build configurations, not all of which will produce the same set of symbols in the output. This case will also occur when a symbol to order is not global (for example, a static function). When this happens, a map of symbols is built, and names in an anonymous namespace are supposed to be canonicalized using canonicalizeAnonymousName.

The attached program exhibits the bug in the form of a crash. It should link cleanly (and not do anything), but instead, ld crashes:

mark@selecty bash$ make
g++  -c test.cc -o test.o
g++  test.o -o test -Wl,-order_file,test.order
collect2: ld terminated with signal 11 [Segmentation fault]
make: *** [test] Error 1

This bug has caused ld crashes in Chromium/Google Chrome, a large project written predominantly in C++. We have seen it several times when an anonymous namespace is placed within namespace gles2.

The fix is to rewrite canonicalizeAnonymousName (and its partner function, usesAnonymousNamespace) to properly interpret decorated C++ names in such a way as to identify those that are in an anonymous namespace and remove these anonymous namespace pseudo-names by replacing them with hyphens. An added improvement that can be made is that names nested within multiple anonymous namespaces can be canonicalized properly.
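
The general idea can be sketched as follows. This is not the attached patch. It is a simplified, hypothetical illustration (the name canonicalizeSketch is invented here) that assumes a decorated name of the form _ZN followed by length-prefixed identifiers, and that assumes each anonymous namespace component, including its length prefix, is replaced by a single hyphen. It walks the name forward, one component at a time, so a digit at the end of a parent namespace’s name can never be mistaken for part of a length. Templates, substitutions, and qualifiers are not handled; such names pass through with only their leading plain components examined.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical sketch, not the patch attached to this report.
// Replaces every "<length>_GLOBAL__N_..." component of a simple nested name
// with '-'. Names it cannot parse are returned unchanged.
std::string canonicalizeSketch(const std::string& symbol) {
  const std::string marker = "_GLOBAL__N_";
  auto isDigit = [](char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
  };

  // Accept "_ZN..." or, with the extra leading underscore, "__ZN...".
  std::size_t i = symbol.find("_ZN");
  if (i != 0 && i != 1) return symbol;
  i += 3;
  std::string out = symbol.substr(0, i);

  // Parse forward: each component is a decimal length, then that many chars.
  while (i < symbol.size() && isDigit(symbol[i])) {
    const std::size_t digitsStart = i;
    std::size_t length = 0;
    while (i < symbol.size() && isDigit(symbol[i])) {
      length = length * 10 + static_cast<std::size_t>(symbol[i] - '0');
      if (length > symbol.size()) return symbol;  // cannot possibly fit
      ++i;
    }
    if (length > symbol.size() - i) return symbol;  // malformed name

    if (length >= marker.size() &&
        symbol.compare(i, marker.size(), marker) == 0) {
      out += '-';  // anonymous namespace pseudo-name
    } else {
      out.append(symbol, digitsStart, (i - digitsStart) + length);
    }
    i += length;
  }
  out.append(symbol, i, std::string::npos);  // trailing "E" and anything after
  return out;
}
```

Under these assumptions, __ZN2n112_GLOBAL__N_11iE becomes __ZN2n1-1iE, and a name nested within two anonymous namespaces, such as __ZN12_GLOBAL__N_112_GLOBAL__N_11iE, becomes __ZN--1iE.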

I am including a patch to ld64-97.2 to implement the proposed improvement. The bug also affects ld64-85.2.1 as included in Xcode 3.1.4 9M2809. If there will be further releases of Xcode 3.1.x, this patch should be included.

mark@selecty bash$ ld -v
@(#)PROGRAM:ld  PROJECT:ld64-97.2
llvm version 2.6svn,  Apple Build #2207-05
mark@selecty bash$ xcodebuild -version
Xcode 3.2.2
Component versions: DevToolsCore-1648.0; DevToolsSupport-1631.0
BuildVersion: 10M2148
mark@selecty bash$ uname -a
Darwin selecty 10.3.0 Darwin Kernel Version 10.3.0: Fri Feb 26 11:58:09 PST 2010; root:xnu-1504.3.12~1/RELEASE_I386 i386
mark@selecty bash$ sw_vers
ProductName:	Mac OS X
ProductVersion:	10.6.3
BuildVersion:	10D573

Comments

Open Radar doesn’t provide for attachments, but I’ve got an idea…

Test case and ld patch available at http://groups.google.com/a/chromium.org/group/chromium-dev/msg/829eece0a6581c03

