Network intrusion detection with a novel hierarchy of distances between embeddings of hash IP addresses

Lopez-Martin, Manuel; Carro, Belen; Arribas, Juan Ignacio; Sánchez-Esguevillas, Antonio

doi:10.1016/j.knosys.2021.106887

dc.contributor.author	Lopez-Martin, Manuel
dc.contributor.author	Carro, Belen
dc.contributor.author	Arribas, Juan Ignacio
dc.contributor.author	Sánchez-Esguevillas, Antonio
dc.date.accessioned	2024-01-22T11:27:06Z
dc.date.available	2024-01-22T11:27:06Z
dc.date.issued	2021-02
dc.identifier.issn	0950-7051
dc.identifier.uri	http://hdl.handle.net/10366/154482
dc.description.abstract	Including high-dimensional categorical predictors in a machine learning model is a major challenge. This is particularly relevant for the IP and Port addresses of network connections when they are used as predictors (features) in machine learning models. These features are particularly important for network intrusion detection, as many attacks exploit information about IP/Port addresses. The sparsity and high dimensionality of these features make them difficult to include in the models, so they are often discarded despite the useful information they carry. This work proposes replacing the original network addresses with new features based on a set of distances defined between different components of the source and destination IP and Port addresses. These distances incorporate information on the probability of co-occurrence of source and destination addresses. The distances are calculated using a dense, low-dimensional vector representation (embedding) of the different network address components. The embeddings are obtained with a neural network that requires few computational resources, plus an additional hash function that collapses the extremely large range of IP and Port values, making the model implementation feasible. A self-supervised learning framework under a hierarchical model is used to train the encoding network. The novel features can be used to predict future co-occurrence of source and destination network addresses, and, when applied as features in a supervised model, they significantly increase the prediction performance of most classifiers for the detection of network intrusions. We demonstrate this prediction improvement on two modern network intrusion datasets: CICIDS2017 and CICDDoS2019.	es_ES
dc.language.iso	eng	es_ES
dc.subject	Hash function	es_ES
dc.subject	Self-supervised learning	es_ES
dc.subject	Neural network	es_ES
dc.subject	Network address embedding	es_ES
dc.subject	Network intrusion detection	es_ES
dc.title	Network intrusion detection with a novel hierarchy of distances between embeddings of hash IP addresses	es_ES
dc.type	info:eu-repo/semantics/article	es_ES
dc.relation.publishversion	https://doi.org/10.1016/j.knosys.2021.106887
dc.subject.unesco	3325 Telecommunications Technology
dc.subject.unesco	2490 Neurosciences
dc.identifier.doi	10.1016/j.knosys.2021.106887
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es_ES
dc.journal.title	Knowledge-Based Systems	es_ES
dc.volume.number	219	es_ES
dc.page.initial	106887	es_ES
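
The pipeline summarized in the abstract (hash each address component into a small range, look up an embedding, and use distances between source and destination embeddings as features) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the SHA-256 based `hash_bucket`, the bucket count and embedding size, the four component levels in `address_components`, the Euclidean distance, and the randomly initialized table that stands in for embeddings learned by the self-supervised network.

```python
import hashlib
import math
import random

NUM_BUCKETS = 1024  # assumed size of the hashed range (hypothetical)
EMBED_DIM = 8       # assumed embedding dimension (hypothetical)


def hash_bucket(value: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Collapse an arbitrary address component into a stable bucket index."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def make_embedding_table(num_buckets: int = NUM_BUCKETS,
                         dim: int = EMBED_DIM,
                         seed: int = 0) -> list:
    """Random stand-in for an embedding matrix; a real one would be trained."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)]
            for _ in range(num_buckets)]


def address_components(ip: str, port: int) -> dict:
    """Split an address into illustrative levels (hypothetical hierarchy)."""
    octets = ip.split(".")
    return {
        "net16": ".".join(octets[:2]),
        "net24": ".".join(octets[:3]),
        "ip": ip,
        "ip_port": f"{ip}:{port}",
    }


def euclidean(u: list, v: list) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_features(src: tuple, dst: tuple, table: list) -> dict:
    """One source-to-destination distance per component level."""
    src_parts = address_components(*src)
    dst_parts = address_components(*dst)
    features = {}
    for level in src_parts:
        # Prefix the level name so different levels hash independently.
        u = table[hash_bucket(f"{level}|{src_parts[level]}")]
        v = table[hash_bucket(f"{level}|{dst_parts[level]}")]
        features[level] = euclidean(u, v)
    return features


table = make_embedding_table()
feats = distance_features(("192.168.1.10", 443), ("10.0.0.5", 51234), table)
for level, dist in feats.items():
    print(level, round(dist, 3))
```

In a supervised setting, the resulting per-level distances would replace the raw IP and Port columns as classifier inputs. Hashing keeps the embedding table at a fixed size regardless of how many distinct addresses appear, at the cost of occasional collisions between unrelated addresses.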

Files in this item

Name: Network intrusion detection with ...
Size: 3.820 MB
Format: PDF
Description: Main article

This item appears in the following collection(s)

INCyL. Unidad de Excelencia iBRAINS-IN-CyL [145]