rdar://31155555: Metal should expose shuffle instructions

Originator:	katokop1
Number:	rdar://31155555	Date Originated:	2017-03-20
Status:	open	Resolved:
Product:	iOS SDK	Product Version:
Classification:	Enhancement	Reproducible:

 
Warp/lane shuffle operations are essential for register-level blocking of computation shared across threads (work items). They can be challenging to use in the general case, but they are critical to extracting peak performance on very important workloads such as dense matrix-matrix multiply (and many neural net layer types), as well as for local stream compaction, which enables fine-grained dynamic collaboration between threads.

As such, all other compute APIs expose shuffle operations. Metal should as well. Without them, it is impossible to write competitive GEMM, conv layer, and other important kernels on most current GPU architectures.
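For readers unfamiliar with the primitive being requested, here is a minimal host-side sketch of what a shuffle does. It is plain Python, not GPU code and not a Metal API: `shfl_down` and `warp_reduce_sum` are hypothetical helpers, loosely modeled on the documented semantics of CUDA's `__shfl_down_sync`, and the 32-lane width is an assumption. Each list element stands in for one lane's register; the point is that lanes exchange values directly, with no round trip through shared (threadgroup) memory.

```python
WARP_SIZE = 32  # assumed lane count; real SIMD widths vary by vendor


def shfl_down(regs, delta):
    """Emulate a shuffle-down: lane i reads the register of lane i + delta.

    Lanes whose source lane would fall outside the warp keep their own
    value, which is how CUDA documents __shfl_down_sync to behave.
    """
    n = len(regs)
    return [regs[i + delta] if i + delta < n else regs[i] for i in range(n)]


def warp_reduce_sum(regs):
    """Tree reduction across one warp; lane 0 ends up holding the total.

    Assumes the number of lanes is a power of two.
    """
    regs = list(regs)
    delta = len(regs) // 2
    while delta > 0:
        # Every lane adds the value held by the lane `delta` positions above it.
        other = shfl_down(regs, delta)
        regs = [a + b for a, b in zip(regs, other)]
        delta //= 2
    return regs[0]


lanes = list(range(WARP_SIZE))   # lane i starts with the value i
total = warp_reduce_sum(lanes)   # sum of 0..31
```

On real hardware each of the five steps above would be a single register-to-register instruction per lane, which is why the same pattern is used for the partial sums inside GEMM and convolution kernels.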

