Run-time Detection of Cross-site Scripting: A Machine-Learning Approach Using Syntactic-Tagging N-Gram Features
Nurul Atiqah Abu Talib, Kyung-Goo Doh
Pages - 9 - 27     |    Revised - 31-05-2022     |    Published - 30-06-2022
Volume - 16   Issue - 2    |    Publication Date - June 2022
Cross-site Scripting, N-gram, Web Application Security, Supervised Machine Learning, Tagging, Syntactic Structure.
Securing web applications against cross-site scripting is practically a never-ending task. As new applications emerge whose payloads are highly expressive and whose versatile functionality provides users with interactive services, the fight becomes even more challenging. An approach now gaining prominence is machine-learning classification. In this paper, we demonstrate an approach to payload abstraction that translates payloads into sentences of syntactic tags. The aim is to extract a normalized set of features from appropriate data and to reduce the problems of manually writing rules based on the dangerous characteristics of payloads. We show that, through abstraction and normalized features, input payloads can be accurately classified into their proper categories. We argue that this representation is sufficiently informative to characterize payloads, and that automating the security work with machine-learning techniques makes it more sustainable.
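The abstraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tag names, the regular expressions in `TAG_PATTERNS`, and the helper functions `to_tag_sentence` and `ngram_counts` are all hypothetical, chosen only to show how a payload might be mapped to a sentence of syntactic tags and then to n-gram features that a supervised classifier could consume.

```python
import re
from collections import Counter

# Hypothetical tag set for illustration only; the paper's actual
# syntactic tags are not listed on this page.
TAG_PATTERNS = [
    ("TAG_OPEN", r"</?[A-Za-z][A-Za-z0-9]*"),   # "<img", "</script"
    ("TAG_CLOSE", r"/?>"),                      # ">" or "/>"
    ("EVENT_ATTR", r"\bon[a-z]+(?=\s*=)"),      # "onerror=", "onload="
    ("ASSIGN", r"="),
    ("STRING", r"\"[^\"]*\"|'[^']*'"),
    ("PAREN", r"[()]"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER", r"\d+"),
    ("SPACE", r"\s+"),
    ("OTHER", r"."),                            # any remaining character
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TAG_PATTERNS),
    re.DOTALL,
)


def to_tag_sentence(payload):
    """Translate a payload into a list of syntactic tags, dropping whitespace."""
    return [
        match.lastgroup
        for match in TOKEN_RE.finditer(payload)
        if match.lastgroup != "SPACE"
    ]


def ngram_counts(tags, n=2):
    """Count the n-grams of tags; these counts serve as normalized features."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


if __name__ == "__main__":
    tags = to_tag_sentence("<img src=x onerror=alert(1)>")
    print(" ".join(tags))
    for gram, count in sorted(ngram_counts(tags, 2).items()):
        print(gram, count)
```

Under these assumptions, payloads that differ only in concrete names or values, such as `<img src=x onerror=alert(1)>` and `<img src=y onerror=prompt(2)>`, map to the same tag sentence, which is the kind of normalization the abstract refers to. The resulting n-gram counts could then be vectorized and passed to any supervised classifier.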
Miss Nurul Atiqah Abu Talib
Computer Science and Engineering, Hanyang University ERICA, Ansan, 15588 - South Korea
Professor Kyung-Goo Doh
Computer Science and Engineering, Hanyang University ERICA, Ansan, 15588 - South Korea
