Detecting and Characterizing Lateral Phishing at Scale

USENIX Security Symposium, pp. 1273-1290, 2019.


Abstract:

We present the first large-scale characterization of lateral phishing attacks, based on a dataset of 113 million employee-sent emails from 92 enterprise organizations. In a lateral phishing attack, adversaries leverage a compromised enterprise account to send phishing emails to other users, benefitting from both the implicit trust and t...

Introduction
  • The security community has explored a myriad of defenses against phishing attacks.
  • In a lateral phishing attack, an adversary uses a compromised enterprise account to send phishing emails to a new set of recipients.
  • Listing 1 shows an anonymized example of a lateral phishing attack from the study
  • In this attack, the phisher tried to lure the recipient into clicking on a link under the false pretense of a new contract.
  • The attacker tried to make the deception more credible by responding to recipients who inquired about the email’s authenticity; and they actively hid their presence in the compromised user’s mailbox by deleting all traces of their phishing email
Highlights
  • For over a decade, the security community has explored a myriad of defenses against phishing attacks
  • Our work focuses on lateral phishing attacks that employ a malicious URL embedded in the email, which is the most common exploit method identified in our dataset
  • Using this value as a final threshold for this second candidate set of Organization-wide attackers, we identify 29 Organization-wide attackers for whom over 95% of recipients belong to the organization of the compromised account (account takeover, ATO) but less than 11% of recipients come from the ATO’s recent contacts; this combination suggests that the attacker primarily seeks to compromise other employees, who do not necessarily have a personal connection with the hijacked account
  • In this work we presented the first large-scale characterization of lateral phishing attacks across more than 100 million employee-sent emails from 92 enterprise organizations
  • We developed and evaluated a new detector that found many known lateral phishing attacks, as well as dozens of unreported attacks, while generating a low volume of false positives
  • Our work showed that 14% of our randomly sampled organizations, ranging from small to large, experienced lateral phishing attacks within a seven-month time period, and that attackers succeeded in compromising new accounts at least 11% of the time
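
The Organization-wide targeting criterion mentioned in the highlights above can be sketched as a simple check on one hijacked account's recipient set. This is a minimal illustration, not the authors' implementation: the function name, the use of the email domain to decide organization membership, and the sample addresses are all assumptions. Only the 95% and 11% thresholds come from the text.

```python
def is_org_wide_attacker(recipients, org_domain, recent_contacts,
                         min_org_frac=0.95, max_contact_frac=0.11):
    """Return True if a recipient set matches the Organization-wide pattern.

    recipients: addresses the hijacked account (ATO) sent phishing emails to.
    org_domain: domain of the ATO's organization (assumed membership test).
    recent_contacts: set of addresses the ATO recently corresponded with.
    """
    unique = set(recipients)
    if not unique:
        return False
    # Fraction of recipients inside the ATO's own organization.
    in_org = sum(1 for r in unique if r.lower().endswith("@" + org_domain))
    # Fraction of recipients the account owner recently corresponded with.
    in_contacts = sum(1 for r in unique if r in recent_contacts)
    org_frac = in_org / len(unique)
    contact_frac = in_contacts / len(unique)
    return org_frac > min_org_frac and contact_frac < max_contact_frac


# Hypothetical example: 100 coworkers, 5 of whom are recent contacts.
recipients = [f"user{i}@example.com" for i in range(100)]
recent = {f"user{i}@example.com" for i in range(5)}
print(is_org_wide_attacker(recipients, "example.com", recent))
```

Both conditions must hold together: a high in-organization fraction alone would also match an attacker who only emails the account owner's close colleagues.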
Methods
  • Establishing Generalizability: As described earlier in Section 3.2, the authors split the dataset into two disjoint segments: a training dataset consisting of emails from the 52 exploratory organizations during April–June 2018 and a test dataset from 92 enterprises during July–October 2018; in § 5.2, the authors show that the detector’s performance remains the same if the test dataset contains only the emails from the 40 withheld test organizations
  • Given these two datasets, the authors first trained the classifier and tuned its hyperparameters via cross validation on the training dataset (Appendix A.2).
  • To ensure that any tuning or knowledge the authors derived from the training dataset did not bias or overfit the classifier, the authors did not alter any of the model’s hyperparameters or features during the evaluation on the test dataset
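
The split described in the Methods above can be sketched as follows. This is an illustrative sketch under stated assumptions: the record layout (dicts with `org` and `sent` keys), the function names, and the exact first and last day of each window are hypothetical. The text only gives the months and the organization counts.

```python
from datetime import date

# Month ranges come from the text; the exact boundary days are assumptions.
TRAIN_START, TRAIN_END = date(2018, 4, 1), date(2018, 6, 30)
TEST_START, TEST_END = date(2018, 7, 1), date(2018, 10, 31)


def split_emails(emails, exploratory_orgs):
    """Split emails into disjoint training and test sets.

    emails: iterable of dicts with hypothetical keys 'org' and 'sent' (a date).
    exploratory_orgs: set of organization ids used for training.
    """
    train, test = [], []
    for email in emails:
        if TRAIN_START <= email["sent"] <= TRAIN_END:
            # Only the exploratory organizations contribute training data.
            if email["org"] in exploratory_orgs:
                train.append(email)
        elif TEST_START <= email["sent"] <= TEST_END:
            # The later test window covers every organization.
            test.append(email)
    return train, test


def withheld_only(test, exploratory_orgs):
    """Stricter check: keep test emails from organizations unseen in training."""
    return [e for e in test if e["org"] not in exploratory_orgs]


# Hypothetical records: org "a" is exploratory, org "b" is withheld.
emails = [
    {"org": "a", "sent": date(2018, 5, 1)},
    {"org": "b", "sent": date(2018, 5, 1)},
    {"org": "a", "sent": date(2018, 8, 1)},
    {"org": "b", "sent": date(2018, 8, 1)},
]
train, test = split_emails(emails, {"a"})
print(len(train), len(test), len(withheld_only(test, {"a"})))
```

Because the two windows do not overlap, no email can land in both sets, and the `withheld_only` filter corresponds to the stricter evaluation on organizations the classifier never saw during training.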
Results
  • For the same reasons the authors saw in the training dataset, this detector exhibited a high false negative rate, missing 57 user-reported incidents.
  • Despite this strategy’s high false negative rate, the authors find that it generates virtually no false positives across a test dataset of tens-of-millions of emails.
  • As the authors explored earlier in Section B.3, this result reflects the fact that the text of phishing emails exhibits frequent churn over time, causing the two text-similarity driven strategies to miss new attacks that the main approach detects
Conclusion
  • In this work the authors presented the first large-scale characterization of lateral phishing attacks across more than 100 million employee-sent emails from 92 enterprise organizations.
  • The authors uncovered and quantified several thematic recipient targeting strategies and deceptive content narratives; while some attackers engage in targeted attacks, most follow strategies that employ non-personalized phishing attacks that can be readily used across different organizations
  • Despite this apparent lack of sophistication in tailoring and targeting their attacks, 31% of the dataset’s lateral phishers engaged in some form of sophisticated behavior designed to increase their success rate or mask their presence from the hijacked account’s true owner.
  • The authors' work provides the first large-scale insights into an emerging, widespread form of enterprise phishing attacks, and illuminates techniques and future ideas for defending against this potent threat
Summary
  • Objectives:

    Since the goal of this paper is to begin exploring practical detection techniques and to develop a large set of lateral phishing incidents for analysis, this feature suffices for the authors' needs.
Tables
  • Table 1: Evaluation results of our detector. ‘Detected Known Attacks’ shows the number of incidents that our detector identified and that were also reported by an employee at an organization. ‘Detected New Attacks’ shows the number of incidents that our detector identified but that nobody reported. ‘Missed Attacks (FN)’ shows all incidents that were either reported by a user or found by any of our detection strategies, but that our detector marked as benign (false negatives). Of the 22 incidents our detector misses, 12 are attachment-based attacks, a threat model which our detector explicitly does not target but which we include in our FN and Detection Rate results for completeness
  • Table 2: Summary of the scale and success of the lateral phishing attacks in our dataset (§ 6.1)
  • Table 3: Summary of recipient targeting strategies per ATO (§ 6.2)
  • Table 4: Distribution of the number of incidents per message tailoring category (§ 6.3). The columns correspond to how unique and specific the message’s topic is to the victim or organization. The rows correspond to whether the phishing email explicitly names the recipient or organization
  • Table 5: Top 10 most common words across all 180 lateral phishing incidents
Related work
  • Detection: An extensive body of prior literature proposes numerous techniques for detecting traditional phishing attacks [1, 3, 13, 14, 43], as well as more sophisticated spearphishing attacks [8, 10, 22, 40, 46]. Hu et al. studied how to use social graph metrics to detect malicious emails sent from compromised accounts [18]. Their approach detects hijacked accounts with false positive rates between 20–40%. Unfortunately, in practice, many organizations handle tens of thousands of employee-sent emails per day, so a false positive rate of 20% would lead to thousands of false alerts each day. IdentityMailer, proposed by Stringhini et al. [40], detects lateral phishing attacks by training behavior models based on timing patterns, metadata, and stylometry for each user. If a new email deviates from an employee’s behavioral model, their system flags it as an attack. While promising, their approach produces false positive rates in the range of 1–10%, which is untenable in practice given the high volume of benign emails and low base rate of phishing. Additionally, their system requires training a behavioral model for each employee, incurring expensive technical debt to operate at scale.
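
The base-rate argument above is simple arithmetic, which the following sketch makes explicit. The daily volume of 10,000 emails is an assumed figure at the low end of "tens of thousands", and since almost all employee-sent email is benign, it is treated here as the number of benign emails. The function name is hypothetical.

```python
def expected_false_alerts(benign_emails_per_day, false_positive_rate):
    """Expected number of benign emails wrongly flagged per day."""
    return benign_emails_per_day * false_positive_rate


# Assumed volume: 10,000 benign employee-sent emails per day.
daily_volume = 10_000

# False positive rates quoted in the text for prior approaches.
for fpr in (0.20, 0.10, 0.01):
    alerts = expected_false_alerts(daily_volume, fpr)
    print(f"FPR {fpr:.0%}: about {alerts:,.0f} false alerts per day")
```

Even the most favorable rate quoted for prior work yields on the order of a hundred false alerts per day at this assumed volume, while real lateral phishing emails are comparatively rare.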
Funding
  • This work was supported in part by the Hewlett Foundation through the Center for Long-Term Cybersecurity, NSF grants CNS-1237265 and CNS-1705050, an NSF GRFP Fellowship, the Irwin Mark and Joan Klein Jacobs Chair in Information and Computer Science (UCSD), by generous gifts from Google and Facebook, a Facebook Fellowship, and operational support from the UCSD Center for Networked Systems
Reference
  • [1] Saeed Abu-Nimeh, Dario Nappa, Xinlei Wang, and Suku Nair. A Comparison of Machine Learning Techniques for Phishing Detection. In Proc. of 2nd ACM eCrime, 2007.
  • [2] Kevin Allix, Tegawendé F Bissyandé, Jacques Klein, and Yves Le Traon. Are Your Training Datasets Yet Relevant? In Proc. of 7th Springer ESSoS, 2015.
  • [3] Andre Bergholz, Jeong Ho Chang, Gerhard Paaß, Frank Reichartz, and Siehyun Strobel. Improved Phishing Detection using Model-Based Features. In Proc. of 5th CEAS, 2008.
  • [4] James Bergstra and Yoshua Bengio. Random search for hyper-parameter optimization. JMLR, 13(Feb), 2012.
  • [5] Steven Bird, Edward Loper, and Ewan Klein. Natural Language Toolkit. https://www.nltk.org/, 2019.
  • [6] Elie Bursztein, Borbala Benko, Daniel Margolis, Tadek Pietraszek, Andy Archer, Allan Aquino, Andreas Pitsillidis, and Stefan Savage. Handcrafted Fraud and Extortion: Manual Account Hijacking in the Wild. In Proc. of 14th ACM IMC, 2014.
  • [7] Asaf Cidon. Threat Spotlight: Office 365 Account Takeover — the New “Insider Threat”. https://blog.barracuda.com/2017/08/30/threatspotlight-office-365-account-compromisethe-new-insider-threat/, Aug 2017.
  • [8] Asaf Cidon, Lior Gavish, Itay Bleier, Nadia Korshun, Marco Schweighauser, and Alexey Tsitkin. High Precision Detection of Business Email Compromise. In Proc. of 28th Usenix Security, 2019.
  • [9] DomainKeys Identified Mail. Mail. Accessed: 2018-11-01.
  • [10] Sevtap Duman, Kubra Kalkan-Cakmakci, Manuel Egele, William Robertson, and Engin Kirda. EmailProfiler: Spearphishing Filtering with Header and Stylometric Features of Emails. In Proc. of 40th IEEE COMPSAC, 2016.
  • [11] Manuel Egele, Gianluca Stringhini, Christopher Kruegel, and Giovanni Vigna. COMPA: Detecting Compromised Accounts on Social Networks. In Proc. of 20th ISOC NDSS, 2013.
  • [12] FBI. BUSINESS E-MAIL COMPROMISE THE 12 BILLION DOLLAR SCAM, Jul 2018. https://www.ic3.gov/media/2018/180712.aspx.
  • [13] Ian Fette, Norman Sadeh, and Anthony Tomasic. Learning to Detect Phishing Emails. In Proc. of 16th ACM WWW, 2007.
  • [14] Sujata Garera, Niels Provos, Monica Chew, and Aviel D Rubin. A Framework for Detection and Measurement of Phishing Attacks. In Proc. of 5th ACM WORM, 2007.
  • [15] Hugo Gascon, Steffen Ullrich, Benjamin Stritter, and Konrad Rieck. Reading Between the Lines: Content-Agnostic Detection of Spear-Phishing Emails. In Proc. of 21st Springer RAID, 2018.
  • [16] https://developers.google.com/machinelearning/crash-course/classification/rocand-auc, 2019.
  • [17] Grant Ho, Aashish Sharma, Mobin Javed, Vern Paxson, and David Wagner. Detecting Credential Spearphishing Attacks in Enterprise Settings. In Proc. of 26th USENIX Security, 2017.
  • [18] Xuan Hu, Banghuai Li, Yang Zhang, Changling Zhou, and Hao Ma. Detecting Compromised Email Accounts from the Perspective of Graph Topology. In Proc. of 11th ACM CFI, 2016.
  • [19] Dan Hubbard. Cisco Umbrella 1 Million. https://umbrella.cisco.com/blog/2016/12/14/ciscoumbrella-1-million/, Dec 2016.
  • [20] Chris Kanich, Christian Kreibich, Kirill Levchenko, Brandon Enright, Geoffrey M Voelker, Vern Paxson, and Stefan Savage. Spamalytics: An Empirical Analysis of Spam Marketing Conversion. In Proc. of 15th ACM CCS, 2008.
  • [21] Thomas Karagiannis and Milan Vojnovic. Email information flow in large-scale enterprises. Technical report, Microsoft Research, 2008.
  • [22] Mahmoud Khonji, Youssef Iraqi, and Andrew Jones. Mitigation of spear phishing attacks: A content-based authorship identification framework. In Proc. of 6th IEEE ICITST, 2011.
  • [23] FT Labs. A sobering day. https://labs.ft.com/2013/05/a-sobering-day/?mhq5j=e6, May 2013.
  • [24] Stevens Le Blond, Cédric Gilbert, Utkarsh Upadhyay, Manuel Gomez Rodriguez, and David Choffnes. A Broad View of the Ecosystem of Socially Engineered Exploit Documents. In Proc. of 24th ISOC NDSS, 2017.
  • [25] Stevens Le Blond, Adina Uritesc, Cédric Gilbert, Zheng Leong Chua, Prateek Saxena, and Engin Kirda. A Look at Targeted Attacks Through the Lense of an NGO. In Proc. of 23rd USENIX Security, 2014.
  • [26] Mailgun Team. Talon. https://github.com/mailgun/talon, 2018.
  • [27] William R Marczak, John Scott-Railton, Morgan Marquis-Boire, and Vern Paxson. When Governments Hack Opponents: A Look at Actors and Technology. In Proc. of 23rd USENIX Security, 2014.
  • [28] Microsoft Graph: message resource type. https://developer.microsoft.com/en-us/graph/docs/api-reference/v1.0/resources/message. Accessed: 2018-11-01.
  • [29] Microsoft. People overview - Outlook Web https://support.office.com/enus/article/people-overview-outlook-webapp-5fe173cf-e620-4f62-9bf6-da5041f651bf. Accessed: 2018-11-01.
  • Brad Miller, Alex Kantchelian, Michael Carl Tschantz, Sadia Afroz, Rekha Bachwani, Riyaz Faizullabhoy, Ling Huang, Vaishaal Shankar, Tony Wu, George Yiu, et al. Reviewer Integration and Performance Measurement for Malware Detection. In Proc. of 13th Springer DIMVA, 2016.
  • Jeremiah Onaolapo, Enrico Mariconti, and Gianluca Stringhini. What Happens After You Are Pwnd: Understanding the Use of Leaked Webmail Credentials in the Wild. In Proc. of 16th ACM IMC, 2016.
  • [33] Feargus Pendlebury, Fabio Pierazzi, Roberto Jordaney, Johannes Kinder, and Lorenzo Cavallaro. Tesseract: Eliminating experimental bias in malware classification across space and time. In Proc. of 28th Usenix Security, 2019.
  • [34] Kevin Poulsen. Google disrupts chinese spear-phishing attack on senior u.s. officials. https://www.wired.com/2011/06/gmail-hack/, Jul 2011.
  • [35] Steve Ragan. Office 365 phishing attacks create a sustained insider nightmare for it. https://www.csoonline.com/article/3225469/office365-phishing-attacks-create-a-sustainedinsider-nightmare-for-it.html, Sep 2017.
  • [36] Fahmida Y. Rashid. Don’t like Mondays? Neither do attackers. https://www.csoonline.com/article/3199997/don-t-like-mondays-neither-doattackers.html, Aug 2017.
  • [37] Retraining models on new data. https://docs.aws.amazon.com/machine-learning/latest/dg/retraining-models-on-new-data.html, 2019.
  • [38] Jeff John Roberts. Homeland Security Chief Cites Phishing as Top Hacking Threat. http://fortune.com/2016/11/20/jeh-johnson-phishing/, Nov 2016.
  • [39] Apache Spark. PySpark DecisionTreeClassificationModel v2.1.0. http://spark.apache.
  • [40] Gianluca Stringhini and Olivier Thonnard. That Ain’t You: Blocking Spearphishing Through Behavioral Modelling. In Proc. of 12th Springer DIMVA, 2015.
  • [41] Kurt Thomas, Frank Li, Chris Grier, and Vern Paxson. Consequences of Connectivity: Characterizing Account Hijacking on Twitter. In Proc. of 21st ACM CCS, 2014.
  • [42] Lisa Vaas. How hackers broke into John Podesta, DNC Gmail accounts. https://nakedsecurity.sophos.com/2016/10/25/how-hackers-broke-into-johnpodesta-dnc-gmail-accounts/, Oct 2016.
  • [43] Colin Whittaker, Brian Ryner, and Marria Nazif. Large-Scale Automatic Classification of Phishing Pages. In Proc. of 17th ISOC NDSS, 2010.
  • [44] Wikipedia. Random forest. https://en.wikipedia.org/wiki/Random_forest, 2019.
  • [45] Kim Zetter. Researchers uncover rsa phishing attack, hiding in plain sight. https://www.wired.com/2011/08/how-rsa-got-hacked/, Aug 2011.
  • [46] Mengchen Zhao, Bo An, and Christopher Kiekintveld. Optimizing Personalized Email Filtering Thresholds to Mitigate Sequential Spear Phishing Attacks. In Proc. of 13th AAAI, 2016.
  • 1. Number of trees: 50–500, in steps of 50 (i.e., 50, 100, 150,..., 450, 500)
  • 2. Maximum tree depth: 10–100, in steps of 10
  • 3. Minimum leaf size: 1, 2, 4, 8
  • 4. Downsampling ratio of (benign / attack) emails: 10, 50, 100, 200
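
The four hyperparameter ranges listed above define a finite search space, which can be written down and enumerated or sampled as in the sketch below. The dictionary keys and function names are hypothetical, and this text does not say whether the authors searched the space exhaustively or by random sampling, so both options are shown as assumptions.

```python
import itertools
import random

# Value ranges copied from the list above; the key names are hypothetical.
SEARCH_SPACE = {
    "num_trees": list(range(50, 501, 50)),      # 50, 100, ..., 500
    "max_depth": list(range(10, 101, 10)),      # 10, 20, ..., 100
    "min_leaf_size": [1, 2, 4, 8],
    "downsample_ratio": [10, 50, 100, 200],     # benign / attack emails
}


def all_configurations(space):
    """Yield every combination of hyperparameter values (exhaustive grid)."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def sample_configurations(space, n, seed=0):
    """Draw n configurations uniformly at random, one value per parameter."""
    rng = random.Random(seed)
    return [{k: rng.choice(v) for k, v in space.items()} for _ in range(n)]


total = sum(1 for _ in all_configurations(SEARCH_SPACE))
print(f"{total} configurations in the full grid")
for config in sample_configurations(SEARCH_SPACE, 3):
    print(config)
```

Each sampled configuration would then be scored by cross validation on the training dataset only, leaving the test dataset untouched.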
Best Paper
Best Paper of USENIX Security, 2019