Pattern Recognition, Analysis and Profiling. Is it for you?

by Monty St John

Our Pattern Recognition, Analysis and Profiling class tends to throw people when they see the name. In their heads they ponder the title and think about pattern matching. After a dance of Sudoku, logic puzzles, and crosswords passes by their internal sensor, they give it a pass and move on to more seductively named classes. Don’t let the class’s less-than-glamorous name fool you. It’s got a lot going on under the hood.

On the first day, we come out swinging after a bit of introduction to the idea of patterns. It’s always interesting for us to see how hard people find it to throw down a definition for a pattern. Have you tried that out in your head or hit Google yet?

Patterns by definition are some kind of regularity: a series of elements that repeat in a predictable manner. They can be tangible, man-made things or abstract ideas. We can sense tangible patterns directly, but it takes analysis to observe and understand the abstract patterns that exist in science, math, language, and behavior. That last word is the one to zoom in on. As investigators, we are constantly looking at data and the patterns within the data that allow us to connect to behavior.


On the first day, we discover that thought process by exploring a scenario that introduces us to the first of several patterns that require sleuthing. Grep, AWK and regular expressions are introduced here, not to instill mastery but to provide simple tools, though the practice you’ll get will be valuable in more ways than one. More complicated tools could easily be substituted, but they start to detract from the point of the exercise. Training your brain to recognize and detect patterns is the main point here, not having software do it on your behalf.

Individuals not fluent in regular expressions (regex) get a crash course and immediate practice. The target of this effort varies with the scenario, but it always focuses first on direct detection and finally on patterns that can only be derived by analysis. Students work both independently and together, and are highly encouraged to share data to tackle the scenario as a group.
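
To give a flavor of that direct detection step, here is a minimal sketch in Python rather than Grep or AWK. The log lines, field names and pattern are all invented for illustration and are not taken from the class scenario; the regex plays roughly the role grep would, and the counting step roughly the role of an AWK one-liner.

```python
import re
from collections import Counter

# Hypothetical chat-log lines, invented purely for illustration.
log_lines = [
    "2019-03-01 09:15 alice -> bob: meet at the usual place",
    "2019-03-01 09:47 bob -> carol: package is ready",
    "2019-03-02 09:16 alice -> bob: same time tomorrow",
    "not a chat line at all",
]

# Assumed line layout: date, time, sender, receiver, message.
LINE_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}) "
    r"(?P<sender>\w+) -> (?P<receiver>\w+): (?P<message>.*)$"
)


def parse_lines(lines):
    """Return a dict of named fields for every line that matches the pattern."""
    return [m.groupdict() for m in map(LINE_PATTERN.match, lines) if m]


records = parse_lines(log_lines)

# Count messages per (sender, receiver) pair to surface who talks to whom.
pair_counts = Counter((r["sender"], r["receiver"]) for r in records)

for (sender, receiver), count in pair_counts.most_common():
    print(f"{sender} -> {receiver}: {count}")
```

The matching is the easy half. Deciding which fields matter, and what the counts say about behavior, is still the analyst's job.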


After this initial discovery of patterns of behavior in the scenario, we step left to talk about analysis techniques. After a short orientation to their use, we dive in together and use the hypotheses generator technique to create a number of possible explanations for the behavior in the data we derived. It’s a technique we are going to use a lot, just as we’ll take the patterns we noticed (the evidence) and put them to each hypothesis to complete it.

Now, we’ve discovered patterns and data points we think are significant.  What do they mean?  What do they point to?  Why are they present?  The whole “what, so what and now what” ladder of inference plays a part here.  The thoughts and hypotheses are collected, revisited and updated as the rest of the scenario plays out.


Armed with information and thoughts as to what’s going on, we enter day two of the training. Expansions to the scenario introduce a completely new gamut of information. The initial communication diagram is expanded with the new information.

YARA, the Swiss Army knife of pattern matching, is introduced on this day. Where Grep and AWK were leveraged against data in a file, YARA is introduced as a handy tool to look across a number of files, whether they sit in an operating system’s bundle of files or are spread across a network.
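
The shift from one file to many can be sketched in a few lines of Python. To be clear, this is not YARA and does not use YARA's rule language; it is a plain-Python stand-in for the general idea of applying a set of named patterns across a tree of files. The rule names, patterns and sample files are all hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical "rules": a name mapped to a byte pattern. Real YARA rules are
# written in YARA's own rule language; this only imitates the idea.
RULES = {
    "mentions_package": re.compile(rb"package", re.IGNORECASE),
    "ipv4_like": re.compile(rb"\b\d{1,3}(?:\.\d{1,3}){3}\b"),
}


def scan_tree(root):
    """Return {relative file path: sorted rule names that matched} under root."""
    hits = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        matched = sorted(name for name, rx in RULES.items() if rx.search(data))
        if matched:
            hits[path.relative_to(root).as_posix()] = matched
    return hits


# Build a small throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.txt").write_text("The Package ships Friday.")
    (root / "logs").mkdir()
    (root / "logs" / "conn.log").write_text("connect 10.0.0.5 ok")
    (root / "empty.txt").write_text("nothing of interest")
    results = scan_tree(root)

for name, rules in results.items():
    print(name, rules)
```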

Here is where the big picture starts to come into focus and additional data collection is combined with the communication identified in the earlier scenarios of day one.





On this day, we also introduce methods to combat analysis saturation, as the amount of data rises beyond what can easily be mapped on a white board or jotted down in a document. To home in on the issues and determine a proper roadmap for investigation, two structured analytical techniques are introduced. Morphological Analysis is used to break out the components of the activity observed. By forcing associations between the components of the activity, we can quickly define the possibilities in descending order, from the most likely to the least likely. That analysis is fed into Scenario Analysis to determine the likelihood of a particular scenario and the evidence required for it to occur. Our original hypotheses are not forgotten; they are revisited and updated with the new information that has been discovered.
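
As a rough illustration of forcing associations between components, the sketch below pairs every value of each component with every value of the others and ranks the combinations. The components, the values and the plausibility numbers are invented, and multiplying the numbers together is only one assumed way to rank; it is not presented as the method used in class.

```python
from itertools import product
from math import prod

# Hypothetical components of an observed activity, each with candidate values
# and an analyst-assigned plausibility between 0 and 1 (invented numbers).
components = {
    "actor": {"insider": 0.6, "outsider": 0.4},
    "channel": {"chat": 0.7, "email": 0.3},
    "motive": {"financial": 0.5, "grievance": 0.5},
}


def ranked_combinations(components):
    """Combine every value with every other component's values and rank them."""
    names = list(components)
    combos = []
    for values in product(*(components[n] for n in names)):
        # Assumed scoring rule: multiply the plausibility of each chosen value.
        score = prod(components[n][v] for n, v in zip(names, values))
        combos.append((dict(zip(names, values)), score))
    # Most plausible first; the sort is stable, so ties keep generation order.
    return sorted(combos, key=lambda item: item[1], reverse=True)


ranking = ranked_combinations(components)
for combo, score in ranking[:3]:
    print(f"{score:.3f}", combo)
```

Even this toy grid produces eight combinations from three two-valued components, which is why a ranked list beats a white board once the data grows.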


The last day is when everything is pieced together. All the data and analysis to date is assembled and revisited. Additional analysis is performed, both as a check and as an extension. New data is provided as well, the last pieces of the puzzle, but this comes after an examination of the work done to this point.

The first analytical run is a key assumptions check. This makes sure we haven’t overlooked something or assumed something that we shouldn’t have in our analysis. A second analytical run is made using Matrix Analysis to assemble and weight the evidence. Here is where we balance what we know against what we suspect and use it to narrow down the possibilities when a clear and obvious option isn’t present.
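
One simple way to picture a weighted evidence matrix is sketched below. The hypotheses, evidence items, weights and the +1/0/-1 consistency scale are all assumptions made for the example, not material from the class.

```python
# Hypothetical hypotheses, keyed by a short label.
hypotheses = {
    "H1": "insider leak",
    "H2": "outside intrusion",
    "H3": "accidental disclosure",
}

# Each evidence item: (description, weight, consistency rating per hypothesis).
# Assumed rating scale: +1 supports, 0 is neutral, -1 contradicts.
evidence = [
    ("messages cluster in office hours", 1, {"H1": 1, "H2": 0, "H3": 1}),
    ("files staged before transfer", 3, {"H1": 1, "H2": 1, "H3": -1}),
    ("no external logins observed", 2, {"H1": 1, "H2": -1, "H3": 0}),
]


def weighted_scores(hypotheses, evidence):
    """Sum weight * rating down each hypothesis column of the matrix."""
    scores = {h: 0 for h in hypotheses}
    for _description, weight, ratings in evidence:
        for h in hypotheses:
            scores[h] += weight * ratings.get(h, 0)
    return scores


scores = weighted_scores(hypotheses, evidence)
for label, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label} ({hypotheses[label]}): {score}")
```

The totals are a conversation starter, not a verdict. A hypothesis with a strong score but one heavily weighted contradiction still deserves a hard look.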


The new information given on the last day is a collection of social profiles and suspect interactions.  We dive into some basic social network analysis to understand how these social interactions connect and supplement our earlier understanding of the social links between individuals that we derived from the earlier chat communication.
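
A very small taste of that kind of analysis: the sketch below merges two hypothetical sets of links (one from chat, one from social profiles) into a single undirected graph and computes degree centrality, the share of other people each person is directly connected to. The names and links are invented.

```python
from collections import defaultdict

# Hypothetical links between individuals, invented for illustration.
chat_links = [("alice", "bob"), ("bob", "carol")]
social_links = [("alice", "bob"), ("bob", "dave"), ("carol", "dave")]


def build_graph(*edge_lists):
    """Merge edge lists into an undirected adjacency map of node -> neighbors."""
    graph = defaultdict(set)
    for edges in edge_lists:
        for a, b in edges:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def degree_centrality(graph):
    """Fraction of the other nodes each node is directly linked to."""
    others = len(graph) - 1
    if others < 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / others for node, neighbors in graph.items()}


graph = build_graph(chat_links, social_links)
centrality = degree_centrality(graph)

# Links that only show up once the social profiles are added.
chat_only = {frozenset(edge) for edge in chat_links}
new_links = [edge for edge in social_links if frozenset(edge) not in chat_only]

for node, value in sorted(centrality.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{node}: {value:.2f}")
```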


Armed with everything we can find, we revisit everything. Old hypotheses that have proven to be invalid or unlikely are removed, while strengthened ones are elevated. The culmination is putting together what happened, how it happened and, if possible, the why behind it all.


The purpose of the class is to be thought-provoking and useful, providing use cases and techniques to solve them. The class is based on an amalgam of real-world events and has…well, a few Easter eggs hidden in it for the discerning.


To see the full syllabus or to register for the upcoming course visit us here:

To learn more about our full list of training options, visit our academy page:

About the author

Monty St John

Monty is a security professional with more than two decades of experience in threat intelligence, digital forensics, malware analytics, quality services, software engineering, development, IT/informatics, project management and training. He is an ISO 17025 laboratory auditor and assessor, reviewing and auditing 40+ laboratories. Monty is also a game designer and publisher who has authored more than 24 products and 35 editorial works.