Identifying the Phishing Websites Using the Patterns of TLS Certificates

Yuji Sakurai; Takuya Watanabe; Tetsuya Okuda; Mitsuaki Akiyama; Tatsuya Mori

doi:10.13052/jcsm2245-1439.1026

2021, WTMC 2020 Workshop

Published 2021-04-15

Yuji Sakurai

Waseda University, Shinjuku City, Tokyo, Japan

https://orcid.org/0000-0002-0634-8126

Takuya Watanabe

NTT Secure Platform Laboratories, Japan

https://orcid.org/0000-0001-7205-5367

Tetsuya Okuda

NTT Secure Platform Laboratories, Japan

Mitsuaki Akiyama

NTT Secure Platform Laboratories, Japan

https://orcid.org/0000-0001-7052-8562

Tatsuya Mori

Waseda University, Shinjuku City, Tokyo, Japan; NICT, Japan

https://orcid.org/0000-0003-1583-4174

Keywords

TLS Certificate
Phishing
Web security

How to Cite

[1]

Y. Sakurai, T. Watanabe, T. Okuda, M. Akiyama, and T. Mori, “Identifying the Phishing Websites Using the Patterns of TLS Certificates”, JCSANDM, vol. 10, no. 2, pp. 451–486, Apr. 2021.

Abstract

With the recent rise of HTTPS adoption on the Web, attackers have begun “HTTPSifying” phishing websites. HTTPSifying a phishing website makes it appear legitimate and lets it evade conventional detection methods that inspect URLs or web content in the network. However, adopting HTTPS also leaves intrinsic footprints: to prepare an HTTPS website, an attacker must obtain a public-key certificate, which gives defenders an opportunity to monitor and detect websites, including phishing sites. The potential benefits of certificate-based detection are (1) comprehensive monitoring of all HTTPSified websites by using certificates immediately after their issuance, even if the attacker uses dynamic DNS (DDNS) or hosting services, which conventional domain-registration-based approaches can overlook; and (2) detection of phishing websites before they are published on the Internet. Accordingly, we address the following research question: How can we use the footprints of TLS certificates to defend against phishing attacks? To answer it, we collected a large set of TLS certificates corresponding to phishing websites from Certificate Transparency (CT) logs and analyzed them extensively. We demonstrate that templates of common names (which are equivalent to fully qualified domain names), obtained through clustering analysis of the certificates, can be used for two promising applications: (1) discovering previously unknown phishing websites and (2) understanding the infrastructure used to generate phishing websites. Furthermore, we developed a real-time monitoring system based on these analysis techniques and demonstrate its usefulness for practical security operations.
Drawing on our findings on the abuse of free certificate authorities (CAs) for operating HTTPSified phishing websites, we discuss possible solutions against such abuse and provide a recommendation to the CAs.
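
The idea of clustering certificate common names and deriving a template from each cluster can be illustrated with a short sketch. This is not the authors' implementation: the similarity measure (Python's `difflib.SequenceMatcher`), the 0.8 threshold, the greedy one-pass clustering, the prefix/suffix form of the template, and the `example.com` names are all assumptions chosen only to make the idea concrete.

```python
import os
from difflib import SequenceMatcher


def similarity(a, b):
    """String similarity in [0, 1] (assumed measure, not taken from the paper)."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_common_names(names, threshold=0.8):
    """Greedy one-pass clustering: a name joins the first cluster whose
    first member is at least `threshold` similar, else starts a new cluster."""
    clusters = []
    for name in names:
        for cluster in clusters:
            if similarity(name, cluster[0]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def derive_template(cluster):
    """Hypothetical template: shared prefix + '*' + shared suffix.
    Assumes names contain no literal '*' (wildcard certificates would
    need separate handling)."""
    if len(set(cluster)) == 1:
        return cluster[0]
    prefix = os.path.commonprefix(cluster)
    suffix = os.path.commonprefix([n[::-1] for n in cluster])[::-1]
    # Trim the suffix if prefix and suffix overlap on the shortest name.
    overlap = len(prefix) + len(suffix) - min(len(n) for n in cluster)
    if overlap > 0:
        suffix = suffix[overlap:]
    return prefix + "*" + suffix


def matches_template(name, template):
    """Check whether a newly observed common name fits a template."""
    if "*" not in template:
        return name == template
    prefix, _, suffix = template.partition("*")
    return (len(name) >= len(prefix) + len(suffix)
            and name.startswith(prefix)
            and name.endswith(suffix))


# Made-up common names for illustration only.
NAMES = [
    "secure-login-001.example.com",
    "secure-login-002.example.com",
    "secure-login-017.example.com",
    "www.example.org",
]

if __name__ == "__main__":
    for cluster in cluster_common_names(NAMES):
        print(derive_template(cluster), cluster)
```

In a monitoring setting, each common name in a newly logged certificate could be checked with `matches_template` against templates derived from known phishing certificates; a match would flag the certificate for closer inspection rather than prove that the site is malicious.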


