New Hashcat Optimization - Faster Maxwell Cards!
Hashcat, the de facto password cracking tool that recently went open-source, works very well on both AMD and Nvidia GPUs.
One problem, however, was that when everyone went out to buy their GTX 980s and other Maxwell-based cards, they discovered that rule-based attacks on wordlists were slower than brute-force attacks on some algorithms. They were also slow compared to similarly specced cards from AMD. Obviously, wordlist and rule-based attack speeds depend on how many hashes you have and how many wordlist candidates are keeping the GPUs busy. However, even when properly optimized, the speed of Maxwell GPUs in wordlist+rule modes lagged behind their AMD cousins because of OpenCL constraints.
Until now, that is.
Thanks to a recent tweak by atom (Hashcat's developer), we are enjoying a major speed boost for Maxwell-based cards. The tweak is a workaround for how Hashcat uses OpenCL with Maxwell-based Nvidia cards.
I decided to run some benchmarks to show the difference.
The benchmarks were done using:
- 1 hash each of SHA256(p.s), MD5, NTLM & PHPass
- A 1GB wordlist to ensure that all GPUs are 100% utilized during our measurement run.
- The d3ad0ne.rule ruleset that ships with Hashcat
- A timer of 60 seconds to let everything settle and run.
- No reboots, no driver changes, no extra Hashcat settings (all on automatic).
- 4 measurements of speed during the 60 seconds, averaged to a final speed.
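The averaging step in the last bullet can be sketched in a few lines of Python. This is only an illustration: the sample values are placeholders and not readings from these runs, and `average_speed` is a hypothetical helper name.

```python
from statistics import mean


def average_speed(samples_mh_s):
    """Return the arithmetic mean of speed samples (in MH/s) from one run."""
    if not samples_mh_s:
        raise ValueError("need at least one speed sample")
    return mean(samples_mh_s)


# Four placeholder readings taken during a 60-second run (illustrative only,
# not measured values from this post).
samples = [1000.0, 1020.0, 980.0, 1000.0]
final_speed = average_speed(samples)
print(f"Final speed: {final_speed:.1f} MH/s")
```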
The benchmarks were run in the following configurations:
- Old Hashcat + 1 980 GPU
- New Hashcat + 1 980 GPU
- Old Hashcat + 6 x 980 GPUs
- New Hashcat + 6 x 980 GPUs
Note: "Old Hashcat" refers to a version built from GitHub before 3.10. "New Hashcat" is 3.10, the newly optimized version, also built from GitHub.
All the results are published in the graphs below, which indicate the changes:

As can be clearly seen, on 1 GPU NTLM & MD5 have received an awesome speed boost with the new optimization. PHPass and SHA256(p.s) remained much the same. Let's look at 6 GPUs...

With 6 GPUs the increase remains the same (which is what we want!) - we see a speed increase for all 6 Maxwell cards doing NTLM and MD5. How much of an increase did we measure?

Clearly, atom's changes have given Hashcat a major boost :) Check that 45% speed increase in NTLM on Maxwell cards!
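For anyone who wants to work out the same kind of percentage from their own runs, here is a minimal sketch. The two speeds are placeholders and not figures from the benchmarks above, and `percent_increase` is a hypothetical helper name.

```python
def percent_increase(old_speed, new_speed):
    """Return the relative speed change from old to new, as a percentage."""
    if old_speed <= 0:
        raise ValueError("old_speed must be positive")
    return (new_speed - old_speed) / old_speed * 100.0


# Placeholder averaged speeds in MH/s (illustrative only, not measured values).
old_build = 100.0
new_build = 125.0
print(f"Speed increase: {percent_increase(old_build, new_build):.1f}%")
```

Feed it the averaged speed of the old build and the new build for the same algorithm and GPU count, so that only the Hashcat version differs between the two numbers.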
atom has since pushed some more optimizations that increase speed further, although not as much as the initial changes did. For brevity's sake, and to avoid redoing all the graphs, the table below shows the differences between the current optimized version (3.10-611) and the one after it (3.10-620).

Your numbers may be higher or lower than mine, given that you can still tweak the utilization, overclock the GPUs or change other settings. However, it's clear that everyone with Maxwell-based GPUs is in for a nice treat with these changes.
You can grab the updated Hashcat source here: https://github.com/hashcat/hashcat or download the pre-built binaries here: https://hashcat.net/beta/
Happy Cracking! And of course, thank you to atom for continually making Hashcat better!