Credential Dumpster Diving


Credential dumps. Leaked identities. Stuffing lists. Data leaks. The names of the contents change, but the data stays pretty consistent. It is the use, discovery method, and origin of the data that ends up applying the pretty label.

  • Ever been assigned the task to find out what credentials from your company are “out there”?
  • Had a third party alert you to compromised credentials from one named breach or another?
  • Discovered a link between the phishing attempts against your organization and data leaks?
  • Found or were alerted to malicious activity that employed your organization’s emails?


The four scenarios are pretty common situations to handle in the threat intelligence world. If you are new, any one of them can be daunting to tackle. The first two scenarios are relatively straightforward, single-step activities that get you to a final answer. The last two consist of multiple stages to reach a final hypothesis on “what happened”. In Part I, we are going to tackle a way to handle the first two scenarios.


The first time this task drops in your lap, you don’t realize the bus has already run over you. It is only when you delve a little deeper into the topic that you realize the bus just backed up and did it again. Tracking down lost or loose credentials is a frustrating endeavor to undertake. It has no end. Credentials continue to bedevil and confound long after the people who held them have left your company.


To have a chance to succeed at this task, you have to set strong objectives upfront. In other words, define the scope of the project. That means engaging whoever assigned this task and laying down the task goals, plans, etc. If that happens to be you, then it is easier.  If not, hammer it out with them.


If you happen to be asking “why” or “what makes this complicated enough to bother”, take a moment and define “credential”. Does that mean only emails? Usernames? Network IDs? Certificates? Personas? All of them? Putting that down in your objectives is step one in making this task manageable. Not doing so guarantees requirements creep later on. It could also turn into something more negative if you and the person directing the task don’t agree on what is being collected, analyzed and reported.


Once the scope is defined, it is time to research. Before collecting one iota of credentials, leverage your home base advantage and determine what a correct credential looks like, so you can equally define what an incorrect one resembles.


  • Know the pattern of the credential, such as an email naming convention: first dot last name, first initial last name, and so on.
  • Determine the creation process for the credential. Who makes them, how they make them, assignment, etc.
  • Outline credential usage. Email is easy, but a code signing certificate might be trickier. Personas or people’s names are another case. When does your company employ them as a credential and in what context? Usernames are a third. Do usernames replicate across applications? Are they one-to-one?
  • Define the end-of-life process for credentials. How are they retired or age out?
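
The naming-convention check in the list above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive validator: the domain `example.com`, the two conventions shown (first dot last, and first initial plus last name), and the function name `classify_email` are all assumptions made for the example. Substitute whatever your own provisioning process actually produces.

```python
import re

# Hypothetical values: replace with your organization's real domain and conventions.
COMPANY_DOMAIN = "example.com"
CONVENTIONS = {
    # first.last, optionally with a hyphenated last name (jane.doe, jane.doe-smith)
    "first.last": re.compile(r"[a-z]+\.[a-z]+(?:-[a-z]+)?"),
    # first initial + last name (jdoe); letters only, so it cannot be told
    # apart from other all-letter names such as "admin"
    "flast": re.compile(r"[a-z][a-z]+"),
}


def classify_email(address: str) -> str:
    """Return the matching convention name, 'off-pattern', or 'not-ours'."""
    local, sep, domain = address.strip().lower().rpartition("@")
    if not sep or domain != COMPANY_DOMAIN:
        return "not-ours"  # malformed, or belongs to another domain
    for name, pattern in CONVENTIONS.items():
        if pattern.fullmatch(local):
            return name
    return "off-pattern"  # our domain, but not a shape we issue
```

An address that lands in "off-pattern" is not automatically fake. It may be a legacy account, a shared mailbox, or a retired format, which is why the creation and end-of-life processes belong in the same research step.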


As part of the scope, define where and how you will store the data. Tossing zip files onto an unsecured share is probably not a good move. If it wasn’t defined in scope, break out where data will go and how you will access it and perform analysis. It is highly suggested you leverage your threat intelligence platform (TIP), so you can connect compromised credentials to events and other elements of intelligence data.


Armed with scope, inside knowledge, and an outline of storage, it is time to start collecting. Your first move should be an internal one. Begin with your TIP. What information do you already have, and from what sources? Next, ask whether other departments already collect this information, to prevent a duplication of effort. Depending on the size of your organization, one or more other departments may already access or collect it. For example, many security operations centers (SOCs) leverage the website Have I Been Pwned (HIBP) as a free, quality resource for email. Registering on the website and verifying your identity with them allows you to get a report on any email they see. Whoever handles your data leak prevention (DLP) program may also have reams of data on credentials that have been leaked, lost or compromised. The list continues. Ask inside first.


The second move is to see what you expose. Look at your public-facing outlets (website, social media, job sites, newsletters, circulars, etc.) and find out what credentials you release into the wild as a byproduct of doing business. Do the same for usernames, IDs, certificates and other information.


After internal queries, the third move is to check with your sharing partners. If you have them, find out what credential information they share or what services they provide to members. As an example, the various Information Sharing and Analysis Centers (ISACs) share data by email, chat, service and many other means. Find the one for your sector of the marketplace and take advantage of the information they provide. Perform this step for all your sharing partners.


The fourth move is to inquire externally. This can be done in several ways. You can leverage a service. For example, CyberDefenses provides credential monitoring, and we are far from alone in that capacity. A service can speed external collection, but it comes at a cost. If that’s in scope, use one. If it is not in scope, or you question the breadth of the service’s collection effort, doing it yourself is always an option. Assembled below is a list of locations to visit to kick-start external searches.


  • Online threat exchanges. Both AlienVault Open Threat Exchange (OTX) and IBM X-Force Exchange are good starting locations. Both regularly share threat data, which contains rich amounts of credential information for the gathering.
  • Twitter. This social medium has become a de facto source of good leads on credentials.  Leveraging the right mix of hashtags and automation can provide a continuous stream of data.  Defining the right hashtags is a bit artistic, but sticking to variations of “leak”, “data”, “incident”, “credential”, and so on should do the trick.
  • Pastebin. The site is a hodgepodge of data, and some of that data is credential dumps. It can be scanned without cost, but it is more efficient to purchase a lifetime account (~$20-50) and automate the scans. While on the topic, Pastebin is far, far from the only “paste” site out there, even if it is the most common. Explore the hundreds of alternatives and focus on the ones with promise. Other sites like psbdmp store deleted pastes and can be explored as well.
  • GitHub Gist. In many ways it provides the same ability as Pastebin. It is equally a target for automation and can be searched with topical keywords such as “breach”, “credential”, and so on.
  • Reddit. Absolutely a smorgasbord of data that can be quixotic and frustrating at times, but a rich source of hints, pointers and outright credential-related data.
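
Whichever of the sources above you automate, the core step is the same: take a blob of collected text, check it for topical keywords, and pull out any addresses on your domain. The sketch below shows one way to do that in Python. It deliberately leaves out fetching, since each site has its own access rules. The domain `example.com`, the keyword list, and the function name `scan_text` are assumptions for illustration only.

```python
import re

# Hypothetical values for illustration only.
COMPANY_DOMAIN = "example.com"
KEYWORDS = ("leak", "breach", "credential", "dump")

# Loose email matcher; group 1 captures the domain part.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


def scan_text(text: str) -> dict:
    """Return keyword hits and company addresses found in one collected document."""
    lowered = text.lower()
    hits = [kw for kw in KEYWORDS if kw in lowered]
    emails = set()
    for match in EMAIL_RE.finditer(text):
        domain = match.group(1).lower()
        # Accept the domain itself and any subdomain of it, nothing else.
        if domain == COMPANY_DOMAIN or domain.endswith("." + COMPANY_DOMAIN):
            emails.add(match.group(0).lower())
    return {"keywords": hits, "emails": sorted(emails)}
```

The keyword match is a plain substring test, so expect noise. Treat a keyword hit as a lead worth a look, and the extracted addresses as the data you actually carry forward.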


Another suggestion to consider is building a topical list of “known” leaks. Numerous ways exist to do this, but a simple starter method is to build on the list of breaches from HIBP. Another, more scattered source of data is Hackmageddon. Exploring its timeline of attacks can help pinpoint events of interest. Shodan presented a blog post on how to leverage their service to find defaced websites. The same process can be used to identify potential sites of compromise. Lastly, a trusty open-ended search can find a lot of data on breached sources. Knowing these “hotspots” of breached locations can focus analysis.
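
As a rough sketch of the “known leaks” list, the Python below filters breach records down to the ones likely to matter for credential work. It assumes you have already saved breach metadata locally in the shape HIBP publishes (fields such as `Name`, `Domain`, `BreachDate` and `DataClasses`). The sample records are invented for illustration, and `known_leaks` is a hypothetical helper name.

```python
from datetime import date

# Invented sample records that mimic the breach fields HIBP publishes.
SAMPLE_BREACHES = [
    {"Name": "ExampleForum", "Domain": "forum.example", "BreachDate": "2016-03-01",
     "DataClasses": ["Email addresses", "Passwords"]},
    {"Name": "ExampleShop", "Domain": "shop.example", "BreachDate": "2019-11-20",
     "DataClasses": ["Email addresses", "Physical addresses"]},
    {"Name": "ExampleMail", "Domain": "mail.example", "BreachDate": "2021-06-15",
     "DataClasses": ["Email addresses", "Passwords", "Usernames"]},
]


def known_leaks(breaches, since, wanted=("Passwords",)):
    """Return sorted (date, name, domain) rows for breaches on or after
    `since` that exposed at least one of the wanted data classes."""
    rows = []
    for b in breaches:
        when = date.fromisoformat(b["BreachDate"])
        if when >= since and any(c in b.get("DataClasses", []) for c in wanted):
            rows.append((when, b["Name"], b.get("Domain", "")))
    return sorted(rows)


# Usage: breaches since 2018 that exposed passwords.
recent = known_leaks(SAMPLE_BREACHES, date(2018, 1, 1))
```

Sorting by date gives you a simple timeline to lay beside your own incident history, which is where the “hotspots” start to show.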


That kick-starts your beginning collection efforts. Polish and bolster with new sources until you build a working mechanism, preferably one as automated as possible. In the next article, we’ll discuss the analysis and reporting effort.


For more information on CyberDefenses, Inc., visit our Academy Page for a full list of course offerings, both in the classroom and online. And stay tuned for our next blog post, which delves into this topic in a little more detail.

About the author

Monty St John

Monty is a security professional with more than two decades of experience in threat intelligence, digital forensics, malware analytics, quality services, software engineering, development, IT/informatics, project management and training. He is an ISO 17025 laboratory auditor and assessor, reviewing and auditing 40+ laboratories. Monty is also a game designer and publisher who has authored more than 24 products and 35 editorial works.