Hello. I suggest adding more SIMD intrinsics, such as "add four uint8 values in a and b, lane by lane", where each operand is a 32-bit number (binary32-sized) treated as four packed uint8 values. NVIDIA GPUs, for example, have such operations.
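To illustrate the kind of operation meant here, the following is a minimal sketch in plain JavaScript of a lane-wise uint8 add on two 32-bit values. The name addUint8x4 is hypothetical and is not an existing or proposed ECMAScript API. The sketch assumes wrapping (modulo 256) behaviour per lane; a saturating variant would be a different operation.

```javascript
// Hypothetical helper for illustration only; not part of any ECMAScript API.
// Treats a and b as four packed uint8 lanes and adds them lane by lane,
// with each lane wrapping modulo 256 (assumed semantics).
function addUint8x4(a, b) {
  // Add the low 7 bits of every lane; the cleared top bit absorbs the
  // carry, so nothing leaks into the neighbouring lane.
  const low = (a & 0x7f7f7f7f) + (b & 0x7f7f7f7f);
  // The top bit of each lane is a carry-less add (XOR) of the top bits.
  const high = (a ^ b) & 0x80808080;
  // Combine and convert the result to an unsigned 32-bit number.
  return (low ^ high) >>> 0;
}

// Lanes 0x01+0x10, 0x02+0x20, 0x03+0x30, 0x04+0x40
console.log(addUint8x4(0x01020304, 0x10203040).toString(16));
```

A native intrinsic would do this in a single instruction instead of the masking steps above, which is the motivation for the suggestion.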