Reverse Engineering Packet Structures from Network Traces by Segment-based Alignment
Thesis posted on 24.09.2018, 08:50 by Othman Mohamed A. Esoul
Many applications in security, from understanding unfamiliar protocols to fuzz-testing and guarding against potential attacks, rely on analysing network protocols. In many situations we cannot rely on access to a specification or even an implementation of the protocol, and must instead work from raw data “sniffed” from the network. When this is the case, one of the key challenges is to discern the underlying packet structures from the raw data, a task that is commonly carried out in two steps: message clustering and message alignment. Clustering quality is critically contingent upon the selection of the right parameters. In this thesis, we experimentally investigated two aspects: 1) the effect of different parameters on clustering, and 2) whether a suitable parameter configuration for clustering can be inferred for undocumented protocols (when message classes are unavailable). We quantified the impact of specific parameters on clustering, and used clustering validation measures to predict parameter configurations with high clustering accuracy. Our results indicate that: 1) the choice of the distance measure and the message length have the most substantial impact on clustering accuracy, and 2) the Ball-Hall intrinsic validation measure yielded the best results in predicting a suitable parameter configuration for clustering. While clustering is used to detect message types (groups of similar messages) within a dataset, sequence alignment algorithms are often used to detect the protocol message structure (field partitioning). For this, most approaches have used variants of the Needleman-Wunsch algorithm to perform byte-wise alignment. However, these approaches can suffer when messages are heterogeneous, or in cases where protocol fields are separated by long variable fields. In this thesis, we present an alternative alignment algorithm known as segment-based alignment.
The results indicate that segment-based alignment can produce more accurate results than traditional alignment techniques, especially with long and diverse network packets.
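For readers unfamiliar with the byte-wise baseline mentioned above, the following is a minimal sketch of textbook Needleman-Wunsch global alignment applied to two messages as byte strings. It is not the thesis's implementation and it does not show segment-based alignment. The function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple

AlignedBytes = List[Optional[int]]  # a byte value, or None for a gap


def needleman_wunsch(
    a: bytes, b: bytes, match: int = 1, mismatch: int = -1, gap: int = -1
) -> Tuple[int, AlignedBytes, AlignedBytes]:
    """Globally align two messages byte by byte (illustrative scoring)."""
    n, m = len(a), len(b)

    def sub(i: int, j: int) -> int:
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two bytes
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a: AlignedBytes = []
    out_b: AlignedBytes = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(None)
            i -= 1
        else:
            out_a.append(None)
            out_b.append(b[j - 1])
            j -= 1
    out_a.reverse()
    out_b.reverse()
    return score[n][m], out_a, out_b


total, row_a, row_b = needleman_wunsch(b"GET /a", b"GET /ab")
```

Columns where both rows hold the same byte across many messages are candidates for fixed protocol fields, while columns containing gaps or differing bytes suggest variable content. Because the table has one cell per pair of bytes, the cost grows with the product of the two message lengths, which is one practical reason byte-wise alignment becomes harder to use on long packets.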