Architectural Enhancements for Fast Subword Permutations with Repetitions in Cryptographic Applications