Password cracking is a staple of pentesting and, with a few exceptions, dictionary/rule-based attacks are the predominant method of recovering those ever-elusive plain text values. Cracking rigs have given pentesters and blackhats alike the ability to throw a few graphics cards at some hashes and achieve phenomenal speeds. For example, earlier this year an 8-GPU system broke 500GH/s against NTLM hashes (that’s over 500 billion hashes/second). Be afraid Windows passwords… be very afraid.
Time is limited during a pentest, and after acquiring a load of hashes you’ll want to crack as many as you can, quickly. NotSoSecure decided to look at the success rates of different commonly used rules, and @Stealthsploit has been looking at deriving a custom rule based on these tests that can better help satisfy his clear text cravings.
The Target Data:
The hash set used was the Lifeboat data dump. Lifeboat is a Minecraft community, and in January 2016 over 7 million account details were leaked, including unsalted MD5 password hashes. The number of hashes and the weak algorithm made this dump a prime candidate for research. The raw MD5 hashes were extracted from the file and, after de-duplication, hashcat reported a little over 4.3 million unique hashes.
The Dictionary and Rules:
The popular rockyou dictionary was used during testing with each of the following rules:
- best64
- unix-ninja-leetspeak
- InsidePro-HashManager
- InsidePro-PasswordsPro
- toggles5
- T0XICv1
- rockyou-30000
- d3ad0ne
- dive
- generated2
- d3adhob0
- hob064
- KoreLogicRulesPrependRockYou50000
- _NSAKEY.v2.dive
All except d3adhob0, hob064, KoreLogicRulesPrependRockYou50000 and _NSAKEY.v2.dive are included with hashcat. A good mixture of rule files should allow us to identify which individual rules are the big hitters from which a custom rule can be created.
Testing:
Each session started with the following command, substituting the rule used and the filenames, respectively:
The potfile was disabled so that hashcat didn’t check it prior to each crack and skew our numbers. Debug mode can only be enabled when using rules and the debug file contains the stats. Every time a rule cracks a hash it’s logged in the file. After hashcat completes, the file can then be sorted to show the number of times a rule was successful, therefore revealing the most successful rules in each set.
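The setup described above can be sketched in Python. This is a minimal illustration rather than the exact command used in the tests: the binary name `hashcat`, the file names and the choice of `--debug-mode=1` (which logs only the rule that produced each crack) are assumptions, while hash mode 0 (raw MD5), the disabled potfile and the `-w 3` workload profile follow the description in this post. The helper names `build_hashcat_cmd` and `tally_rules` are hypothetical.

```python
from collections import Counter


def build_hashcat_cmd(hash_file, wordlist, rule_file, debug_file):
    """Return an argv list for a dictionary + rules attack on raw MD5."""
    return [
        "hashcat",                     # assumed binary name; varies by install
        "-a", "0",                     # straight (dictionary) attack
        "-m", "0",                     # hash mode 0 = raw MD5
        "-w", "3",                     # high workload profile
        "--potfile-disable",           # do not read or write the potfile
        "--debug-mode=1",              # assumption: log the matching rule only
        f"--debug-file={debug_file}",
        "-r", rule_file,
        hash_file,
        wordlist,
    ]


def tally_rules(debug_lines):
    """Count how often each rule appears, assuming one rule per debug line."""
    counts = Counter(
        line.rstrip("\r\n") for line in debug_lines if line.strip()
    )
    return counts.most_common()  # most successful rule first


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_hashcat_cmd(
        "lifeboat.hash", "rockyou.txt", "rules/best64.rule", "best64.debug"
    )
    print(" ".join(cmd))
    # Made-up debug lines standing in for a real debug file.
    for rule, hits in tally_rules([":\n", "c\n", ":\n", "sa4\n"]):
        print(hits, rule)
```

The argument list can be passed to `subprocess.run` once the paths point at real files, and `tally_rules` accepts an open debug file directly, since iterating over a file yields its lines.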
The Results:
The results from each test can be found below, showing the generated password candidates from each rule set and the total/percentage cracked.
Rule | Total Candidates | Cracked | % Cracked |
---|---|---|---|
dive | 1,421,219,827,456 | 2,843,085 | 65.64 |
_NSAKEY.v2.dive | 1,768,370,620,544 | 2,784,741 | 64.30 |
generated2 | 933,992,405,632 | 2,606,565 | 60.18 |
d3ad0ne | 489,063,363,712 | 2,580,399 | 59.58 |
rockyou-30000 | 430,298,880,000 | 2,557,422 | 59.05 |
T0XICv1 | 171,129,864,576 | 2,357,989 | 54.44 |
InsidePro-HashManager | 92,801,125,120 | 2,247,349 | 51.89 |
InsidePro-PasswordsPro | 44,751,083,520 | 2,056,467 | 47.48 |
d3adhob0 | 825,313,251,840 | 1,712,581 | 39.54 |
best64 | 1,104,433,792 | 1,404,449 | 32.43 |
hob064 | 917,970,944 | 1,195,032 | 27.59 |
KoreLogicRulesPrependRockYou50000 | 717,078,740,224 | 1,137,852 | 26.27 |
toggles5 | 70,898,912,128 | 759,344 | 17.53 |
unix-ninja-leetspeak | 44,048,262,016 | 621,280 | 14.34 |
Success Rate on Lifeboat
We can also look at the effectiveness of each rule set by comparing success relative to the total candidates tested. For example, we can see that the d3adhob0 rules had the fourth largest candidate size (825 billion), however it cracked only 39.54% of passwords. By comparison the InsidePro-PasswordsPro rule had only 45 billion candidates yet it cracked 47.48% of passwords. The latter rule is clearly more efficient!
There are lots of metrics we’re not taking into account here, so we’re not saying, “never use d3adhob0, always use InsidePro-PasswordsPro”. This is just an observation from this specific test. Other factors, such as time, algorithm, available resources and potentially known characters (where mask attacks come in), need to be considered depending on what you’re trying to achieve. We chose this setup because a large set of unsalted MD5 hashes provided the best balance between speed and time.
A rule efficiency breakdown can be seen below.
Test | Total Candidates | Cracked | Guesses per crack |
---|---|---|---|
hob064 | 917,970,944 | 1,195,032 | 768 |
best64 | 1,104,433,792 | 1,404,449 | 786 |
InsidePro-PasswordsPro | 44,751,083,520 | 2,056,467 | 21,761 |
InsidePro-HashManager | 92,801,125,120 | 2,247,349 | 41,294 |
unix-ninja-leetspeak | 44,048,262,016 | 621,280 | 70,899 |
T0XICv1 | 171,129,864,576 | 2,357,989 | 72,574 |
toggles5 | 70,898,912,128 | 759,344 | 93,368 |
rockyou-30000 | 430,298,880,000 | 2,557,422 | 168,255 |
d3ad0ne | 489,063,363,712 | 2,580,399 | 189,530 |
generated2 | 933,992,405,632 | 2,606,565 | 358,323 |
d3adhob0 | 825,313,251,840 | 1,712,581 | 481,912 |
dive | 1,421,219,827,456 | 2,843,085 | 499,887 |
KoreLogicRulesPrependRockYou50000 | 717,078,740,224 | 1,137,852 | 630,204 |
_NSAKEY.v2.dive | 1,768,370,620,544 | 2,784,741 | 635,022 |
Efficiency on Lifeboat
So even though the dive rule cracked the most, it was among the least efficient in terms of average guesses per cracked hash (roughly 500,000), with _NSAKEY.v2.dive the least efficient of all. This isn’t necessarily an issue if you have the luxury of time, but the run would take substantially longer if these hashes were SHA-256, for example. Also, the most efficient rule, hob064, cracked a similar number of hashes to the KoreLogicRulesPrependRockYou50000 rule, yet the latter needed roughly 630,000 guesses per crack against 768 for hob064.
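The "guesses per crack" figure is simply the total number of candidates divided by the number of hashes cracked. As a quick worked check, the sketch below recomputes it for four rows of the table; the dictionary name `results` and the function name are illustrative only.

```python
# (total candidates, cracked) copied from the efficiency table above
results = {
    "hob064": (917_970_944, 1_195_032),
    "best64": (1_104_433_792, 1_404_449),
    "InsidePro-PasswordsPro": (44_751_083_520, 2_056_467),
    "d3adhob0": (825_313_251_840, 1_712_581),
}


def guesses_per_crack(candidates, cracked):
    """Average number of candidates tried for each cracked hash."""
    return round(candidates / cracked)


# Print the rule sets from most to least efficient.
for name, (candidates, cracked) in sorted(
    results.items(), key=lambda item: guesses_per_crack(*item[1])
):
    print(f"{name:<24} {guesses_per_crack(candidates, cracked):>10,}")
```

A lower figure means fewer wasted guesses, which matters more as the hash algorithm gets slower.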
The resulting debug files were sorted, two examples of which can be seen below.
This is only a snippet, as one of the rule sets contains over 100,000 rules and others contain several tens of thousands. A couple of quickly identified password trends in the above example show that the Minecraft community love to replace ‘a’ with ‘4’ (sa4 rule), as well as capitalise the first letter and lowercase the rest (c rule)! A complete list of hashcat rule functions can be found on their website.
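To make the rule syntax concrete, here is a toy Python interpreter for the handful of rule functions mentioned in this post (`:`, `c`, `sXY` and `TN`). It is only a sketch of what those functions do to a word, not hashcat's rule engine, and `apply_rule` is a name made up for this example.

```python
def apply_rule(rule: str, word: str) -> str:
    """Apply one rule line built from a small subset of rule functions."""
    i = 0
    while i < len(rule):
        op = rule[i]
        if op == ":":            # ':' passes the word through unchanged
            i += 1
        elif op == "c":          # 'c' capitalises the first letter, lowercases the rest
            word = word[:1].upper() + word[1:].lower()
            i += 1
        elif op == "s":          # 'sXY' replaces every X with Y
            word = word.replace(rule[i + 1], rule[i + 2])
            i += 3
        elif op == "T":          # 'TN' toggles the case of the character at position N
            n = int(rule[i + 1], 36)  # positions use 0-9, then A-Z
            if n < len(word):
                word = word[:n] + word[n].swapcase() + word[n + 1:]
            i += 2
        elif op == " ":          # a space separates functions on one rule line
            i += 1
        else:
            raise ValueError(f"unsupported rule function: {op!r}")
    return word


if __name__ == "__main__":
    for rule, word in [("sa4", "banana"), ("c", "mINECRAFT"), ("c sa4", "password")]:
        print(rule, word, "->", apply_rule(rule, word))
```

Chaining functions on one line, as in `c sa4`, applies them left to right to the same word.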
Concurrency Anomalies:
It became apparent after running one of the tests twice (in this case the best64 rule set), that the resulting stats were slightly different. A section of the stats from both runs is shown below.
There were 7 more plain rockyou hits in the first test than in the second. The other rules also reported slightly different numbers both here and in other rule sets.
This is likely due to multi-threading / high concurrency, which meant that different rules produced the same plain text value before the “:” rule hit (especially seeing as we’re running a -w3 profile!). For example, let’s say the password was “L3tme1n” and the dictionary contains both “L3tme1n” and “l3tme1n”. If the “T0” rule (toggles the case of the first character) turns “l3tme1n” into “L3tme1n” before the “:” rule (which passes the word through unchanged) reaches “L3tme1n”, then “T0” gets the point, effectively stealing it from “:”. In each test the differences were noted to be relatively consistent.
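The effect can be reproduced on paper with a few lines of Python. The sketch assumes the wordlist holds both case variants of the word and models only the `:` and `T0` rules; the "work order" lists stand in for whatever order the parallel workers happen to finish in, which is a simplification of how hashcat schedules work.

```python
from collections import defaultdict

# Two rule functions modelled as plain Python callables.
RULES = {
    ":": lambda w: w,                          # no-op
    "T0": lambda w: w[:1].swapcase() + w[1:],  # toggle case of first character
}


def producers(words):
    """Map each candidate to every (word, rule) pair that generates it."""
    found = defaultdict(list)
    for word in words:
        for name, fn in RULES.items():
            found[fn(word)].append((word, name))
    return found


def credited(target, work_order):
    """Return the first (word, rule) pair in processing order that hits target."""
    for word, name in work_order:
        if RULES[name](word) == target:
            return (word, name)
    return None


if __name__ == "__main__":
    pairs = producers(["L3tme1n", "l3tme1n"])["L3tme1n"]
    print(pairs)                                # two ways to reach the same plain text
    print(credited("L3tme1n", pairs))           # ':' gets the point
    print(credited("L3tme1n", pairs[::-1]))     # 'T0' gets the point instead
```

Whichever pair is processed first takes the credit, so the per-rule tallies can shift between runs even though the set of cracked hashes stays the same.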
One Rule to Rule Them All…:
From here we selected the top 25% performing rules from each set, then de-duped, concatenated and tidied them, leaving us with a custom super-rule set containing 51,998 rules. Time to put it through its paces.
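The selection step might look something like the following Python sketch. It assumes one rule per line in each debug file and takes "top 25%" to mean the top quarter of the distinct rules that scored at least one hit; the exact cut-off and tidying used for the published rule are not spelled out here, so treat the function names and details as illustrative.

```python
import math
from collections import Counter


def top_fraction(debug_lines, fraction=0.25):
    """Return the best-performing share of rules from one debug file."""
    counts = Counter(
        line.rstrip("\r\n") for line in debug_lines if line.strip()
    )
    keep = math.ceil(len(counts) * fraction)
    return [rule for rule, _ in counts.most_common(keep)]


def merge_rule_sets(rule_lists):
    """Concatenate several rule lists, dropping duplicates but keeping order."""
    seen = set()
    merged = []
    for rules in rule_lists:
        for rule in rules:
            if rule not in seen:
                seen.add(rule)
                merged.append(rule)
    return merged


if __name__ == "__main__":
    # Made-up debug output for two rule sets.
    set_a = [":"] * 5 + ["c"] * 3 + ["sa4"] * 2 + ["T0"]
    set_b = ["c"] * 4 + [":"] * 3 + ["$1"] * 2 + ["u"]
    tops = [top_fraction(s, fraction=0.5) for s in (set_a, set_b)]
    print("\n".join(merge_rule_sets(tops)))
```

Writing the merged list out one rule per line gives a file that can be passed to hashcat with `-r`.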
Rule | Total Candidates | Cracked | % Cracked |
---|---|---|---|
OneRuleToRuleThemAll | 745,808,362,112 | 2,960,711 | 68.36 |
dive | 1,421,219,827,456 | 2,843,085 | 65.64 |
_NSAKEY.v2.dive | 1,768,370,620,544 | 2,784,741 | 64.30 |
generated2 | 933,992,405,632 | 2,606,565 | 60.18 |
Success Rate on Lifeboat
Test | Total Candidates | Cracked | Guesses per crack |
---|---|---|---|
[…] | […] | [...] | […] |
OneRuleToRuleThemAll | 745,808,362,112 | 2,960,711 | 251,902 |
generated2 | 933,992,405,632 | 2,606,565 | 358,323 |
d3adhob0 | 825,313,251,840 | 1,712,581 | 481,912 |
dive | 1,421,219,827,456 | 2,843,085 | 499,887 |
KoreLogicRulesPrependRockYou50000 | 717,078,740,224 | 1,137,852 | 630,204 |
_NSAKEY.v2.dive | 1,768,370,620,544 | 2,784,741 | 635,022 |
Efficiency on Lifeboat
Although not the most efficient rule set in our tests (due to its large number of candidates), the custom rule cracked 117,626 more passwords than dive did, a gain of 2.72 percentage points. Our custom rule was, however, substantially more efficient than dive, which took second place in success rates.
Our rule was also tested against a couple of other data breaches that were published online to see how it performed against different data sets, again comparing against the dive rule.
#Test 1
XSplit breach, November 2013, 2,983,472 accounts, 2,227,270 unique hashes. Unsalted SHA-1.
Rule | Total Candidates | Cracked | % Cracked |
---|---|---|---|
OneRuleToRuleThemAll | 745,808,362,112 | 1,455,682 | 65.36 |
dive | 1,421,219,827,456 | 1,402,636 | 62.98 |
Success Rate on XSplit
Our custom rule cracked 53,046 more passwords, a gain of 2.38 percentage points.
#Test 2
Battlefield Heroes breach, June 2011. 548,773 accounts, 423,623 unique hashes. Unsalted MD5.
Rule | Total Candidates | Cracked | % Cracked |
---|---|---|---|
OneRuleToRuleThemAll | 745,808,362,112 | 318,958 | 75.29 |
dive | 1,421,219,827,456 | 314,150 | 74.16 |
Success Rate on Battlefield Heroes
Our custom rule cracked 4,808 more passwords, a gain of 1.13 percentage points.
Wrap Up:
Our super rule came out on top in all our tests above, as well as in others we looked at afterwards. We’re sorry to disappoint any Lord of the Rings fans (“One ring to rule them all!”), but despite our rule name, there likely won’t ever be one rule to rule them all, as other rule-based attacks wouldn’t exist if there were. Password attacks should always be executed factoring in all variables, in particular the available time, hardware resources, dictionary size and algorithm.
What these tests have shown, however, is that by creating your own custom rules (which we highly encourage), you can grab many more of those plain text secrets that you may not have seen if you just ran a standard dictionary attack or one with a single built-in rule. We’re certainly looking forward to using our super rule against many pentesting hash dumps in the future!
The custom rule we have used is available over at GitHub.