[rust-dev] Re: RFC: Tuple Swizzling/Shuffling
wichard at vitalitystudios.com
Tue Jan 14 12:06:57 PST 2014
On Tue, Jan 14, 2014 at 1:07 PM, Carter Schonwald <
carter.schonwald at gmail.com> wrote:
> Neat proposal:
> 0) seems like you need to add a notion of const expr to the type system
> for this proposal, right? I started staring at that and it's pretty subtle
> (though I may have been looking at it wrong)
I did add the const expr notion a bit, but it's only in anticipation of
somebody wanting to specify the mask in such terms. Of course, broader
support for const exprs would be required.
> 1) do rust tuples actually map to the llvm simd vector types?
They do with #[simd]. Though I think it would be worth considering whether
all tuples of a single primitive type should be automatically considered
#[simd] and be lowered to an LLVM vector type. In fact, one way I'm
considering implementing this is to treat the mask as a tuple and lower it as a
vector into a shufflevector mask operand.
> 2) so this would require some special syntax support right? Could it be
> prototyped with a procedural macro plus having the shuffle ast code
> generated internally?
Yes, it would need special syntax. Sadly, it can't be prototyped with a
procedural macro; I've tried. Moreover, procedural macros don't exist at a
stage where tuple type info is known, so such a macro would only be able to
operate on literals and still wouldn't be able to dictate trans.
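
To make the "literals only" limitation concrete, here is a minimal sketch in
present-day Rust (an assumption; the 2014 macro system differed). The macro
name `swizzle!` is hypothetical. It can only rewrite literal field indices
into a new tuple expression; it knows nothing about the tuple's type and has
no say in how the result is lowered.

```rust
// Hypothetical `swizzle!` macro, for illustration only.
// `swizzle!(v; 0, 0, 1)` expands to the tuple expression `(v.0, v.0, v.1,)`.
// The indices must be literal tokens; a variable index cannot be expanded
// into a tuple field access.
macro_rules! swizzle {
    ($v:ident; $($i:tt),+ $(,)?) => {
        ( $( $v.$i, )+ )
    };
}
```

Something like `swizzle!(vec; 0, 0, 0, 0)` then builds `(vec.0, vec.0, vec.0,
vec.0)`. The fields have to be `Copy` for an index to be repeated, and the
result is an ordinary tuple, so nothing here guarantees a vector shuffle.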
> 3) would the internal rep properly treat the shuffle mask ints as part of
> the op itself so that it won't get cse'd or the like?
> 4) would the syntax do a type / syntax error when you use a tuple position
> index that's too large?
> 5) the llvm shuffle vector intrinsic takes two vectors of values and lets
> you express interleaving them, not just rearranging a single one (which
> seems to be a restriction on your example). Both styles matter, and
> respectively correspond to different platform specific shuffle instructions
Yeah, not even ten minutes after sending the proposal, I realized I had
forgotten about that one detail. I'm not terribly fond of any of my
envisioned solutions, but the best one uses the recently removed '@':
let vec1 = (10, 11, 12, 13);
let vec2 = (9, 8, 7, 6);
let shuffled = vec1 @ vec2 -> (7, 6, 5, 4, 0, 1, 2, 3);
// think concATenate; it'll likely be thought of as such in rustc.
As an aside, note the arrows point towards what type of value the
expression is supposed to be (lvalue is <-, rvalue is ->).
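
As a rough model of what the two-operand form would compute, the sketch below
uses arrays in place of tuples and present-day const generics (both are
assumptions, not part of the proposal). The name `shuffle2` is hypothetical.
Mask indices select from `a` followed by `b`, as if the two were concatenated.

```rust
/// Model of a two-input shuffle: index `i < N` picks `a[i]`,
/// index `i >= N` picks `b[i - N]`.
/// Hypothetical helper; panics if an index is `>= 2 * N`.
fn shuffle2<T: Copy, const N: usize, const M: usize>(
    a: [T; N],
    b: [T; N],
    mask: [usize; M],
) -> [T; M] {
    mask.map(|i| if i < N { a[i] } else { b[i - N] })
}
```

With `vec1 = [10, 11, 12, 13]` and `vec2 = [9, 8, 7, 6]`, the mask
`[7, 6, 5, 4, 0, 1, 2, 3]` yields `[6, 7, 8, 9, 10, 11, 12, 13]`.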
> I like the idea of something like this, and it's definitely simpler than
> the shuffle proposals I've been trying to draft, though using a word like
> "shuffle" may be. Though it doesn't give a way to use the same idea for
> someone using the platform specific shuffle intrinsics that hopefully would
> be added eventually. (Any such platform specific intrinsics would be for
> fixed tuple size and type).
The intent is to lean on LLVM's platform independence; LLVM supports
vectors (the SIMD type) of any length, so in an effort to be forward
compatible with future SIMD sizes, offer a generic way to 'do the shuffle'.
Access to more specialized shuffles could be done via functions with a
#[link_name] attribute (because such instructions operate on operands of
specific types anyway, they wouldn't be very amenable to a generic
solution) in unstable::simd or an external library. But that's another
> *An Alternative approach*? *Int *
> One way around the const expr requirement for the type system that someone
> suggested was pretty neat: expose the various platform specific simd
> shuffle ops, and have their shuffle mask int args actually be "type args".
> Apparently there's some protean support for type level numbers because of
> sized vectors, and because rust requires all generics to be monomorphized,
> this actually would capture the right "constness at compile time"
> an example of this idea would be to take the VSHUFPD intel instruction (in
> the intel architecture), and modify the intrinsic from the c code version
> (nb: __m256d == v4f64 in rust parlance)
> __m256d _mm256_shuffle_pd (__m256d a, __m256d b, const int select);
> fn _mm256_shuffle_pd<const int select>(__m256d a, __m256d b)-> __m256d
> I'm not sure how such a type level int application would work out, but it
> may be the nicest way to conservatively add type safe SIMD shuffle primops
> to rust, though I could be completely wrong. (I was initially meh on this
> type application idea, but it's grown on me; it exploits the way rust
> generics work very very nicely!)
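
The "mask as a type-level argument" idea can be sketched with present-day
const generics (an assumption; nothing like this existed in 2014). The
function below is a scalar model on plain arrays, not a real intrinsic, and
the name `shuffle_pd_model` is hypothetical. It follows the documented
VSHUFPD selection rule: even result lanes come from `a`, odd lanes from `b`,
and bit k of the selector picks the low or high element of the relevant pair.

```rust
/// Scalar model of VSHUFPD on four f64 lanes (hypothetical helper).
/// `SELECT` is a compile-time constant, so every distinct mask
/// monomorphizes into its own function.
fn shuffle_pd_model<const SELECT: u8>(a: [f64; 4], b: [f64; 4]) -> [f64; 4] {
    // Bit k of SELECT: 0 picks the low element of a pair, 1 the high one.
    let bit = |k: u32| ((SELECT >> k) & 1) as usize;
    [a[bit(0)], b[bit(1)], a[2 + bit(2)], b[2 + bit(3)]]
}
```

A call such as `shuffle_pd_model::<0b0101>(a, b)` fixes the mask at compile
time; passing a runtime value in that position is rejected by the compiler,
which is the "constness" property the suggestion is after.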
> *note* while exposing the architecture specific intrinsics would be a bit
> more work, it would also mean that the SIMD support in rust would have a more
> transparent mapping to various architectures, allow better architecture/cpu
> microarchitecture based tuning (writing a version of BLIS
> http://code.google.com/p/blis/ in rust might be a good stress test), and
> it'd be less coupled to the vagaries of how LLVM lowers the shuffle
> instruction to the target architecture. This actually matters in the
> context of writing code that uses the "optimal" instruction sequence by
> detecting the cpu micro architecture at runtime and branching to the tuned
> variant internally, something OpenBLAS does very nicely, see here for
> examples https://github.com/xianyi/OpenBLAS/tree/develop/kernel/x86_64
> That said, having a systematic way to support the llvm shuffle intrinsic
> in its full generality would be lovely, it's a much more friendly
> operation that people can use to get started with doing simd in a somewhat
> user friendly way.
> point being: I support there being better shuffle simd support / simd
> support period :), though how to do it best seems unclear to me (and
> there are also a few ways that aren't good)
> On Tuesday, January 14, 2014, Richard Diamond wrote:
>> Basically the idea here is to support shuffling for SIMD types in a way
>> that can be easily lowered to IR (LLVM's shufflevector requires the mask be
>> a vector of constants, so an intrinsic function is out of the question),
>> however I imagine this sugar could extend to tuples with multiple types.
>> Some examples:
>> let vec = (1.0f32, 2.0f32, 3.0f32, 4.0f32);
>> // perhaps this should be "vec <- (0, 0, 0, 0)"?
>> let all_x = vec -> (0, 0, 0, 0);
>> assert_eq!(all_x, (1.0f32, 1.0f32, 1.0f32, 1.0f32));
>> let single_x = vec -> (0);
>> assert_eq!(single_x, (1.0f32));
>> let mut vec = vec;
>> vec <- (0) = 5.0f32; // set x only
>> vec <- (1, 2) = (6.0f32, 7.0f32); // set y & z
>> assert_eq!(vec, (5.0f32, 6.0f32, 7.0f32, 4.0f32));
>> let vec = vec;
>> // the mask may be arbitrarily long:
>> assert_eq!(vec -> (0, 1, 2, 3, 0), (5.0f32, 6.0f32, 7.0f32, 4.0f32,
>> 5.0f32));
>> // leaves vec unchanged
>> let functional_update = vec -> (0, 1, 3) .. (0.5f32, 1.0f32, 10.0f32);
>> // functional_update would take its type from vec
>> assert_eq!(vec, (5.0f32, 6.0f32, 7.0f32, 4.0f32));
>> assert_eq!(functional_update, (0.5f32, 1.0f32, 7.0f32, 10.0f32));
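
The semantics of the examples above can be modeled in ordinary present-day
Rust, with arrays standing in for single-type tuples. This is only a sketch
under that assumption: the helper names `shuffle` and `update` are
hypothetical, and the masks here are runtime values rather than the const
exprs the sugar would require.

```rust
/// Models `vec -> (mask)`: gather lanes of `v` by index.
/// The result has one lane per mask entry. Panics on an out-of-range index.
fn shuffle<T: Copy, const N: usize, const M: usize>(v: [T; N], mask: [usize; M]) -> [T; M] {
    mask.map(|i| v[i])
}

/// Models `vec -> (mask) .. (values)`: a copy of `v` in which lane
/// `mask[k]` is replaced by `values[k]`. The input is left unchanged.
fn update<T: Copy, const N: usize, const M: usize>(
    v: [T; N],
    mask: [usize; M],
    values: [T; M],
) -> [T; N] {
    let mut out = v;
    for (&i, &x) in mask.iter().zip(values.iter()) {
        out[i] = x;
    }
    out
}
```

For instance, `update([5.0, 6.0, 7.0, 4.0], [0, 1, 3], [0.5, 1.0, 10.0])`
gives `[0.5, 1.0, 7.0, 10.0]`, matching the `functional_update` assertion.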
>> A couple of things would need to be disallowed, however:
>> let mut vec = vec;
>> // no duplicate assignments/functional updates:
>> vec <- (0, 0) = (..);
>> let _ = vec -> (0, 1, 2, 3, 0) .. (..);
>> // no out-of-bounds:
>> vec <- (5, 9000) = (..);
>> let _ = vec -> (5, 9001);
>> let _ = vec -> (5, 9002) .. (..);
>> let _ = vec -> (0, 1, 2, 3, 4) .. (..);
>> // all mask values must be a const expr:
>> let mut non_const_expr = 15;
>> vec <- (non_const_expr) = (..);
>> let _ = vec -> (non_const_expr) .. (..);
>> let _ = vec -> (non_const_expr);
>> // mismatched tuple sizes:
>> vec <- (0, 1) = (0.0f32, 0.0f32, 0.0f32);
>> let _ = vec -> (0) .. (0.0f32, 0.0f32);
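
The restrictions listed above (apart from the const-expr requirement, which
only a compiler can enforce) amount to a small validity check on the mask of
an assignment or functional update. A minimal sketch follows; `MaskError` and
`check_update_mask` are hypothetical names, and a real implementation would
report these as compile-time errors rather than runtime values.

```rust
/// Reasons an assignment/functional-update mask would be rejected.
#[derive(Debug, PartialEq)]
enum MaskError {
    OutOfBounds(usize),
    Duplicate(usize),
    LengthMismatch,
}

/// Checks a mask against a tuple with `lanes` elements and a right-hand
/// side with `values_len` elements.
fn check_update_mask(lanes: usize, mask: &[usize], values_len: usize) -> Result<(), MaskError> {
    if mask.len() != values_len {
        return Err(MaskError::LengthMismatch);
    }
    let mut seen = vec![false; lanes];
    for &i in mask {
        if i >= lanes {
            return Err(MaskError::OutOfBounds(i));
        }
        if seen[i] {
            return Err(MaskError::Duplicate(i));
        }
        seen[i] = true;
    }
    Ok(())
}
```

A plain read such as `vec -> (0, 0, 0, 0)` would skip the duplicate and
length checks, since repeating a lane is allowed there.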
>> AIUI, the notation would be:
>> tuple_mask : '(' integer [ ',' integer ] * ')' ;
>> tuple_expr : '(' expr [ ',' expr ] * ')' |
>> tuple_expr "->" tuple_mask [ ".." tuple_expr ] ? ;
>> I'm willing to write this myself, but I'd like some consensus/feedback
>> regarding ze proposed sugar.