Towards a Flow-based Internet Traffic Classification For Bandwidth Optimization
Sulaiman Mohd Nor, Abuagla Babiker Mohd
Pages - 146 - 153     |    Revised - 05-05-2009     |    Published - 18-05-2009
Volume - 3   Issue - 2    |    Publication Date - April 2009
Keywords: NetFlow, machine learning, classification, accuracy, video streaming, peer-to-peer.
Abstract The evolution of the Internet into a large, complex, service-based network poses tremendous challenges for network monitoring and control: large amounts of data must be collected, and newly emerging applications such as peer-to-peer, video streaming, and online gaming must be classified accurately. These applications consume bandwidth and degrade network performance, especially in limited-bandwidth networks such as university campuses, where they cause performance deterioration of mission-critical applications. Some of these applications are designed to avoid detection by using dynamic port numbers (port hopping), port masquerading (for example, using HTTP port 80), and sometimes encrypted payloads. Traditional identification methods, such as port-based and signature-based techniques, are therefore not efficient for today's traffic. In this work, machine learning algorithms are used to classify traffic flows by their corresponding applications. The paper uses our own customized training data set collected from the campus network. The effect of the training data set size is considered before the accuracy of various classification algorithms is examined and the best are selected. Our findings show that, among thirty algorithms, random tree, IBI, IBK, and random forest, in that order, provide the four highest accuracies in classifying flow-based network traffic by application, with accuracy of not less than 99.33%.
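The idea of classifying flows by their statistics, without relying on port numbers or payload, can be illustrated with a minimal sketch. It is not the authors' implementation: the feature set (packets, bytes, duration), the class labels, the toy records, and the function names `scale` and `classify_flow` are all assumptions made for illustration, and a plain one-nearest-neighbour rule stands in for the instance-based learners mentioned in the abstract.

```python
import math

# Hypothetical toy flow records: (packets, bytes, duration in seconds) -> label.
# These values are invented for illustration; they are NOT from the paper's data set.
TRAINING_FLOWS = [
    ((12, 9_000, 0.8), "http"),
    ((15, 11_000, 1.1), "http"),
    ((900, 1_200_000, 60.0), "video_streaming"),
    ((1100, 1_500_000, 75.0), "video_streaming"),
    ((300, 250_000, 400.0), "p2p"),
    ((350, 300_000, 520.0), "p2p"),
]


def scale(features):
    """Log-scale each feature so that large byte counts do not dominate the distance."""
    return tuple(math.log1p(value) for value in features)


def classify_flow(features, training=TRAINING_FLOWS):
    """Return the label of the training flow closest to `features` (1-nearest neighbour)."""
    query = scale(features)
    nearest = min(training, key=lambda item: math.dist(query, scale(item[0])))
    return nearest[1]
```

With these made-up records, a short flow carrying few bytes lands next to the `http` examples, while a long-lived flow with a moderate byte count lands next to the `p2p` examples. A real study would use exported NetFlow records and a far larger labelled training set.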
Assistant Professor Sulaiman Mohd Nor
Universiti Teknologi Malaysia - Malaysia
Mr. Abuagla Babiker Mohd
- Malaysia