## Getting sighax

Presentation by Myria

• We know the format of the decrypted signature, but now we need to find a signature matching that pattern.
• This is brute-forcing RSA, but astronomically easier than it normally would be, thanks to Nintendo fails.

## Brute Force

• Back-of-napkin math suggests approximately \$2^{43}\$ signatures searched per match.
• sighax was said by derrek to take six months per signature, and there are six signature types.
• Screw that.

## RSA - A Review

• Signatures are made by powering by secret \$d\$:
• \$s = m^d \mod n\$
• Verification is done by powering by \$e\$ (65537):
• \$v = s^e \mod n\$
• If \$v = m\$, \$s\$ is a valid signature for \$m\$.

## The Naive Method

• Calculate \$m = s^{65537} \mod n\$ for random \$s\$ until you get a useful \$m\$.
• Very slow.
• 17 multiplies and 17 mods per attempt.

## The Negation Trick

• When checking \$m\$, also check \$-m\$.
• If \$-m\$ is a match, then \$-s\$ is its signature.
• Subtraction is much faster than multiplication.
• This works because:
• \$-s=(-1)s\$
• \${(ab)}^e=(a^e)(b^e)\$ for any \$a\$, \$b\$, \$e\$
• \${(-s)}^{65537}={(-1)}^{65537}s^{65537}\$
\$=-(s^{65537})=-m\$.
• ~1.9x performance boost.

## RSA Multiplicative Attack

• If \$s_a\$ and \$s_b\$ are signatures for \$a\$ and \$b\$, then \$s_as_b \pmod n\$ is a signature for \$ab \pmod n\$.
• \${(s_as_b)}^e = {(s_a)}^e{(s_b)}^e = ab\$
• Such RSA attacks normally stopped by "padding".
• Getting a validly-padded message this way is extremely unlikely.
• But padding is what Nintendo messed up!

## Using Multiplicative Attack

1. Select random root \$r\$ such that \$1 < r < n\$.
2. Calculate multiplier \$k = r^{65537} \mod n\$.
3. Set \$x = r\$ and \$y = k\$.
4. Do forever:
1. \$x\$ = (\$x\$ * \$r\$) \$\mod n\$;
2. \$y\$ = (\$y\$ * \$k\$) \$\mod n\$;
3. if (IsMatch(\$y\$)) return \$x\$;
4. if (IsMatch(\$n - y\$)) return \$n - x\$;

## Doubling Performance Again

We only need \$x\$ if we found a match, so defer it:

1. Select random root \$r\$ such that \$1 < r < n\$.
2. Calculate multiplier \$k = r^{65537} \mod n\$.
3. Set \$y = k\$ and \$z = 0\$.
4. Do forever:
1. \$z\$++;
2. \$y\$ = (\$y\$ * \$k\$) \$\mod n\$;
3. if (IsMatch(\$y\$)) return \$r^z \mod n\$;
4. if (IsMatch(\$n - y\$)) return \$-r^z \mod n\$;

## Are We There Yet?

• Final powmod much faster than calculating \$x\$.
• So now we're down to 1 mul-mod per 2 checks.
• Can we do better?
• Of course!

## Shenanigans

• We require any \$k\$ such that \$k = r^{65537} \mod n\$ and we know \$r\$.
• Multiplying by any \$k\$ whose \$r\$ is known works.
• (Negation trick: for \$k=-1\$, \$r=-1\$.)
• Do we have a \$k\$ for which multiplication is easier?
• It turns out we do: legitimate signatures.

## Fun with PKCS #1

• PKCS #1 signature format:
• 0001FFFFFFFF...FFFF00<garbage>
• Fast multiplication:
• Fact: \$(a - b)x = ax - bx\$
• 0x0001FFFFFFFF...FFFF00...00 + garbage = 0x000200000000...000 - inv_garbage
• Multiplying by 0x00020...000 is a left shift: fast!
• inv_constant is 408 bits.
• Replace 2048x2048 multiply with a 2048x408 one.

## GPUs Are Space Heaters

This brute force is perfect for GPUs.

• Algorithm is same thing repeated massively.
• Few conditionals during main loop.
• No communication needed among operations.
• No need to "cover" the keyspace.
• Entire algorithm fits in GPU registers.
• My cat likes the warmth.

## GPU Implementation

• CUDA or OpenCL?
• CUDA tools are better.
• The current best compute cards are Nvidia.
• Amazon AWS and Windows Azure use Nvidia.

## GPU Implementation

• Keeping register count down
• Algorithm is register-bound.
• Use Montgomery reduction GPU algorithm by Kaiyong Zhao:
• Can't use PKCS #1 trick, but GPUs win bigly.

## Hardware

• Home Desktops
• 3x GeForce GTX 1080 Ti
• 1x GeForce GTX 1080
• 1x GeForce GTX 980 Ti

## Hardware

• The Cloud™
• Amazon AWS "p2.8xlarge" instance
• 8 Tesla K80 GPUs
• 32 Xeon CPU cores (might as well use!)
• K80 is half as fast as 1080, but there are 8!
• ~230M checks per second combined GPU + CPU
• US\$7/hour (yikes!)

## Results

• All 6 signatures in ~9 days.
• ~8,000,000,000,000 checks per signature.
• So much for 6 months.
• Brute-forcer program is open-source: 