Data Visualization and Preprocessing for Network Traffic Analysis in Cybersecurity

<p>Mayank Kapadia<sup>1,*</sup>, Andrew Dunton<sup>1</sup>, Chelsea Jaculina<sup>1</sup>, David Thach<sup>1</sup>, Albert Ong<sup>1</sup> and Shih Yu Chang<sup>1</sup></p>

doi:10.33969/J-NaNA.2025.050102


Corresponding Author:

Mayank Kapadia

Affiliation(s):

¹San Jose State University (SJSU), San Jose, California, USA

*Corresponding author

Abstract:

In cybersecurity, visualizing and interpreting network traffic data is crucial for real-time intrusion detection. This paper presents a data-visualization-driven approach to analyzing network intrusions using the CICIDS 2017 dataset. The study applies preprocessing techniques, including data cleaning, transformation, and feature selection, to prepare the dataset for analysis. Principal Component Analysis (PCA) is used to reduce dimensionality, which helps optimize memory use and clarify visualizations. We focus on visualizing network attack patterns, model performance, and PCA results to provide actionable insights. Python libraries are used together with Power BI to build a data visualization platform with interactive, real-time visualizations that let users explore attack types, feature importance, and model evaluation metrics. The goal is to show how effective data visualization can improve the understanding of complex network traffic data and support better decision-making in cybersecurity.

Keywords:

Cybersecurity, Intrusion Detection Systems (IDS), Network Traffic Analysis, Data Visualization, Machine Learning, Principal Component Analysis (PCA), CICIDS 2017 Dataset, Dimensionality Reduction, Power BI, Real-time Visualization, Anomaly Detection


Cite This Paper:

Mayank Kapadia, Andrew Dunton, Chelsea Jaculina, David Thach, Albert Ong and Shih Yu Chang (2025). Data Visualization and Preprocessing for Network Traffic Analysis in Cybersecurity. Journal of Networking and Network Applications, Volume 5, Issue 1, pp. 13–26. https://doi.org/10.33969/J-NaNA.2025.050102.

