/PRE-list

List of (automatic) protocol reverse engineering tools for network protocols

Primary LanguagePythonCreative Commons Zero v1.0 UniversalCC0-1.0

PRE-list

List of (automatic) protocol reverse engineering tools/methods/approaches for network protocols

This is a collection of 71 scientific papers about (automatic) protocol reverse engineering (PRE) methods and tools. The papers are categorized into different groups so that it is more easy to get an overview of existing solutions based on the problem you want to tackle.

The collection is based on the following three surveys and got extended afterwards:

  • J. Narayan, S. K. Shukla, and T. C. Clancy, “A Survey of Automatic Protocol Reverse Engineering Tools,” ACM Computing Surveys, vol. 48, no. 3, pp. 1–26, Feb. 2016, doi: 10.1145/2840724. PDF
  • J. Duchêne, C. Le Guernic, E. Alata, V. Nicomette, and M. Kaâniche, “State of the art of network protocol reverse engineering tools,” Journal of Computer Virology and Hacking Techniques, vol. 14, no. 1, pp. 53–68, Feb. 2018, doi: 10.1007/s11416-016-0289-8. PDF
  • B. D. Sija, Y.-H. Goo, K.-S. Shim, H. Hasanova, and M.-S. Kim, “A Survey of Automatic Protocol Reverse Engineering Approaches, Methods, and Tools on the Inputs and Outputs View,” Security and Communication Networks, vol. 2018, pp. 1–17, 2018, doi: 10.1155/2018/8370341. PDF

Furthermore, there is a very extensive surveys which focuses on the methods and approaches of PRE tools that are based on network traces. The work of Kleber et al. is an excellent starting point to see what was already tried and for which use cases a method is working best.

  • S. Kleber, L. Maile, and F. Kargl, “Survey of Protocol Reverse Engineering Algorithms: Decomposition of Tools for Static Traffic Analysis,” IEEE Communications Surveys & Tutorials, vol. 21, no. 1, pp. 526–561, 2019, doi: 10.1109/COMST.2018.2867544. PDF

Please help extending this collection by adding papers to the tools.ods.

Table of Contents

Overview

Name Year Approach used
PIP [1] 2004 Keyword detection and Sequence alignment based on Needleman and Wunsch 1970 and Smith and Waterman 1981; this approach was applied and extended by many following papers
GAPA [2] 2005 Protocol analyzer and open language that uses the protocol analyzer specification Spec → it is meant to be integrated in monitoring and analyzing tools
ScriptGen [3] 2005 Grouping and clustering messages, find edges from clusters to clusters for being able to replay messages once a similar message arrives
RolePlayer [4] 2006 Byte-wise sequence alignment (find variable fields in messages) and clustering with FSM simplification
Ma et al. [5] 2006 Please review
FFE/x86 [6] 2006 Please review
Replayer [7] 2006 Please review
Discoverer [8] 2007 Tokenization of messages, recursive clustering to find formats, merge similar formats
Polyglot [9] 2007 Dynamic taint-analysis
PEXT [10] 2007 Message clustering for creating FSM graph and simplify FSM graph
Rosetta [11] 2007 Please review
AutoFormat [12] 2008 Dynamic taint-analysis
Tupni [13] 2008 Dynamic taint-analysis; look for loops to identify boundaries within messages
Boosting [14] 2008 Please review
ConfigRE [15] 2008 Please review
ReFormat [16] 2009 Dynamic taint-analysis, especially targeting encrypted protocols by looking for bitwise and arithmetic operations
Prospex [17] 2009 Dynamic taint-analysis with following message clustering, optionally provides fuzzing candidates for Peach fuzzer
Xiao et al. [18] 2009 Please review
Trifilo et al. [19] 2009 Measure byte-wise variances in aligned messages
Antunes and Neves [20] 2009 Please review
Dispatcher [21] 2009 Dynamic taint-analysis (successor of Polyglot using send instead of received messages)
Fuzzgrind [22] 2009 Please review
REWARDS [23] 2010 Please review
MACE [24] 2010 Please review
Whalen et al. [25] 2010 Please review
AutoFuzz [26] 2010 Please review
ReverX [27] 2011 Speech recognition (thus only for text-based protocols) to find carriage returns and spaces, afterwards looking for frequencies of keywords; multiple partial FSMs are merged and simplified to get PFSM
Veritas [28] 2011 Identifiying keywords, clustering and transition probability → probabilistic protocol state machine
Biprominer [29] 2011 Statistical analysis including three phases, learning phase, labeling phase and transition probability model building phase. See this figure.
ASAP [30] 2011 Please review
Howard [31] 2011 Please review
ProDecoder [32] 2012 Successor of Biprominer which also addresses text-based protocols; two-phases are used: first apply Biprominer, second use Needleman-Wunsch for alignment
Zhang et al. [33] 2012 Please review
Netzob [34] 2012 See this figure
PRISMA [35] 2012 Please review, follow-up paper/project to ASAP
ARTISTE [36] 2012 Please review
Wang et al. [37] 2013 Capturing of data, identifying frames and inferring the format by looking and frequency of frames and doing association analysis (using Apriori and FP-Growth).
Laroche et al. [38] 2013 Please review
AutoReEngine [39] 2013 Apriori Algorithm (based on Agrawal/Srikant 1994). Identify fields and keywords by considering the amount of occurrences. Message formats are considered as series of keywords. State machines are derived from labeled messages or frequent subsequences. See this figure for clarification.
Dispatcher2 [40] 2013 Please review
ProVeX [41] 2013 Identify Botnet traffic and try to infer the botnet type by using signatures
Meng et al. [42] 2014 Please review
AFL [43] 2014 Please review
Proword [44] 2014 Please review
ProGraph [45] 2015 Please review
FieldHunter [46] 2015 Please review
RS Cluster [47] 2015 Please review
UPCSS [48] 2015 Please review
ARGOS [49] 2015 Please review
PULSAR [50] 2015 Reverse engineer network protocols with the aim to fuzz them with thus knowledge
Li et al. [51] 2015 Please review
Cai et al. [52] 2016 Please review
WASp [53] 2016 Pcap files are provided with context information (i.e. known MAC address), then grouping and analysing (looking for CRC, N-gram, Entropy, Features, Ranges), afterwards report creation based on scoring.
PRE-Bin [54] 2016 Please review
Xiao et al. [55] 2016 Please review
PowerShell [56] 2017 Please review
ProPrint [57] 2017 Please review
ProHacker [58] 2017 Please review
Esoul and Walkinshaw [59] 2017 Please review
PREUGI [60] 2017 Please review
NEMESYS [61] 2018 Please review
Goo et al. [62] 2019 Apriori based: Finding „frequent contiguous common subsequences“ via new Contiguous Sequential Pattern (CSP) algorithm which is based on Generalized Sequential Pattern (GSP) and other Apriori algorithms. CSP is used three times hierarchically to extract different information/fields based on previous results.
Universal Radio Hacker [63] 2019 Physical layer based analysis of proprietary wireless protocols considering wireless specific properties like Received Signal Strength Indicator (RSSI) and using statistical methods
Luo et al. [64] 2019 From abstract: “[…] this study proposes a type-aware approach to message clustering guided by type information. The approach regards a message as a combination of n-grams, and it employs the Latent Dirichlet Allocation (LDA) model to characterize messages with types and n-grams via inferring the type distribution of each message.”
Sun et al. [65] 2019 Please review
Yang et al. [66] 2020 Using deep-learning (LSTM-FCN) for reversing binary protocols
Sun et al. [67] 2020 "To measure format similarity of unknown protocol messages in a proper granularity, we propose relative measurements, Token Format Distance (TFD) and Message Format Distance (MFD), based on core rules of Augmented Backus-Naur Form (ABND)." for clustering process Silhouette Coefficient and Dunn Index are used. density based cluster algorithm DBSCAN is used for clustering of messages
Shim et al. [68] 2020 Follow up on Goo et al. 2019
IPART [69] 2020 Using extended voting expert algorithm to infer boundaries of fields, otherwise using three phase which are tokenizing, classifying and clustering.
NEMETYL [70] 2020 Please review
NetPlier [71] 2021 Probabilistic method for network trace based protocol reverse engineering.

Input and Output

NetT: input is a network trace (e.g. pcap)
ExeT: input is an execution trace (code/binary at hand)
PF: output is protocol format (describing the syntax)
PFSM: output is protocol finite state machine (describing semantic/sequential logic)

Name Year NetT ExeT PF PFSM Other Output
PIP [1] 2004 Keywords/ fields
GAPA [2] 2005
ScriptGen [3] 2005 Dialogs/scripts (for replaying)
RolePlayer [4] 2006 Dialogs/scripts
Ma et al. [5] 2006 App-identification
FFE/x86 [6] 2006
Replayer [7] 2006
Discoverer [8] 2007
Polyglot [9] 2007
PEXT [10] 2007
Rosetta [11] 2007
AutoFormat [12] 2008
Tupni [13] 2008
Boosting [14] 2008 Field(s)
ConfigRE [15] 2008
ReFormat [16] 2009
Prospex [17] 2009
Xiao et al. [18] 2009
Trifilo et al. [19] 2009
Antunes and Neves [20] 2009
Dispatcher [21] 2009 C&C malware
Fuzzgrind [22] 2009
REWARDS [23] 2010
MACE [24] 2010
Whalen et al. [25] 2010
AutoFuzz [26] 2010
ReverX [27] 2011
Veritas [28] 2011
Biprominer [29] 2011
ASAP [30] 2011 Semantics
Howard [31] 2011
ProDecoder [32] 2012
Zhang et al. [33] 2012
Netzob [34] 2012
PRISMA [35] 2012
ARTISTE [36] 2012
Wang et al. [37] 2013
Laroche et al. [38] 2013
AutoReEngine [39] 2013
Dispatcher2 [40] 2013 C&C malware
ProVeX [41] 2013 Signatures
Meng et al. [42] 2014
AFL [43] 2014
Proword [44] 2014
ProGraph [45] 2015
FieldHunter [46] 2015 Fields
RS Cluster [47] 2015 Grouped-messages
UPCSS [48] 2015 Proto-classification
ARGOS [49] 2015
PULSAR [50] 2015
Li et al. [51] 2015
Cai et al. [52] 2016
WASp [53] 2016 scored analysis reports, spoofing candidates
PRE-Bin [54] 2016
Xiao et al. [55] 2016
PowerShell [56] 2017 Dialogs/scripts
ProPrint [57] 2017 Fingerprints
ProHacker [58] 2017 Keywords
Esoul and Walkinshaw [59] 2017
PREUGI [60] 2017
NEMESYS [61] 2018
Goo et al. [62] 2019
Universal Radio Hacker [63] 2019
Luo et al. [64] 2019
Sun et al. [65] 2019
Yang et al. [66] 2020
Sun et al. [67] 2020
Shim et al. [68] 2020
IPART [69] 2020
NEMETYL [70] 2020
NetPlier [71] 2021

Tested protocols

Name Year Text-based Binary-based Hybrid Other Protocols
PIP [1] 2004 HTTP
GAPA [2] 2005 HTTP
ScriptGen [3] 2005 HTTP NetBIOS DCE
RolePlayer [4] 2006 HTTP, FTP, SMTP, NFS, TFTP DNS, BitTorrent, QQ, NetBios SMB, CIFS
Ma et al. [5] 2006 HTTP, FTP, SMTP, HTTPS (TCP-Protos) DNS, NetBIOS, SrvLoc (UDP-Protos)
FFE/x86 [6] 2006
Replayer [7] 2006
Discoverer [8] 2007 HTTP RPC SMB, CIFS
Polyglot [9] 2007 HTTP, Samba, ICQ DNS, IRC
PEXT [10] 2007 FTP
Rosetta [11] 2007
AutoFormat [12] 2008 HTTP, SIP DHCP, RIP, OSPF SMB, CIFS
Tupni [13] 2008 HTTP, FTP RPC, DNS, TFTP WMF, BMP, JPG, PNG, TIF
Boosting [14] 2008 DNS
ConfigRE [15] 2008
ReFormat [16] 2009 HTTP, MIME IRC One unknown protocol
Prospex [17] 2009 SMTP, SIP SMB Agobot (C&C)
Xiao et al. [18] 2009 HTTP, FTP, SMTP
Trifilo et al. [19] 2009 TCP, DHCP, ARP, KAD
Antunes and Neves [20] 2009 FTP
Dispatcher [21] 2009 HTTP, FTP, ICQ DNS
Fuzzgrind [22] 2009
REWARDS [23] 2010
MACE [24] 2010
Whalen et al. [25] 2010
AutoFuzz [26] 2010
ReverX [27] 2011 FTP
Veritas [28] 2011 SMTP PPLIVE, XUNLEI
Biprominer [29] 2011 XUNLEI, QQLive, SopCast
ASAP [30] 2011 HTTP, FTP, IRC, TFTP
Howard [31] 2011
ProDecoder [32] 2012 SMTP, SIP SMB
Zhang et al. [33] 2012 HTTP, SNMP, ISAKMP
Netzob [34] 2012 FTP, Samba SMB Unknown P2P & VoIP protocol
PRISMA [35] 2012
ARTISTE [36] 2012
Wang et al. [37] 2013 ICMP ARP
Laroche et al. [38] 2013 FTP DHCP
AutoReEngine [39] 2013 HTTP, FTP, SMTP, POP3 DNS, NetBIOS
Dispatcher2 [40] 2013 HTTP, FTP, ICQ DNS SMB
ProVeX [41] 2013 HTTP, SMTP, IMAP DNS, VoIP, XMPP Malware Family Protocols
Meng et al. [42] 2014 TCP, ARP
AFL [43] 2014
Proword [44] 2014
ProGraph [45] 2015 HTTP DNS, BitTorrent, WeChat
FieldHunter [46] 2015 MSNP DNS SopCast, Ramnit
RS Cluster [47] 2015 FTP, SMTP, POP3, HTTPS DNS, XunLei, BitTorrent, BitSpirit, QQ, eMule MSSQL, Kugoo, PPTV
UPCSS [48] 2015 HTTP, FTP, SMTP, POP3, IMAP DNS, SSL, SSH SMB
ARGOS [49] 2015
PULSAR [50] 2015
Li et al. [51] 2015
Cai et al. [52] 2016 HTTP, SSDP DNS, BitTorrent, QQ, NetBios
WASp [53] 2016 IEEE 802.15.4 proprietary protocols, Smart plug & PSD systems
PRE-Bin [54] 2016
Xiao et al. [55] 2016
PowerShell [56] 2017 ARP, OSPF, DHCP, STP CDP/DTP/VTP, HSRP, LLDP, LLMNR, mDNS, NBNS, VRRP
ProPrint [57] 2017
ProHacker [58] 2017
Esoul and Walkinshaw [59] 2017
PREUGI [60] 2017
NEMESYS [61] 2018
Goo et al. [62] 2019 HTTP DNS
Universal Radio Hacker [63] 2019 proprietary wireless protocols of IoT devices
Luo et al. [64] 2019
Sun et al. [65] 2019
Yang et al. [66] 2020 IPv4, TCP
Sun et al. [67] 2020
Shim et al. [68] 2020 FTP Modbus/TCP, Ethernet/IP
IPART [69] 2020 Modbus, IEC104, Ethernet/IP
NEMETYL [70] 2020
NetPlier [71] 2021

Source Code

Most papers do not provide the code used in the research. For the following papers exists (example) code.

Name Year Source Code
PIP [1] 2004 https://web.archive.org/web/20090416234849/http://4tphi.net/~awalters/PI/PI.html
ReverX [27] 2011 https://github.com/jasantunes/reverx
Netzob [34] 2012 https://github.com/netzob/netzob
PRISMA [35] 2012 https://github.com/tammok/PRISMA/
PULSAR [50] 2015 https://github.com/hgascon/pulsar
NEMESYS [61] 2018 https://github.com/vs-uulm/nemesys
Universal Radio Hacker [63] 2019 https://github.com/jopohl/urh
NetPlier [71] 2021 https://github.com/netplier-tool/NetPlier/

References

[1]

M. Beddoe, “The protocol informatics project,” 2004, http://www.4tphi.net/∼awalters/PI/PI.html. PDF

[2]

N. Borisov, D. J. Brumley, H. J. Wang, J. Dunagan, P. Joshi, and C. Guo, “Generic application-level protocol analyzer and its language,” MSR Technical Report MSR-TR-2005-133, 2005. PDF

[3]

C. Leita, K. Mermoud, and M. Dacier, “ScriptGen: an automated script generation tool for Honeyd,” in Proceedings of the 21st Annual Computer Security Applications Conference (ACSAC ’05), pp. 203–214, Tucson, Ariz, USA, December 2005. PDF

[4]

W. Cui, V. Paxson, N. C. Weaver, and R. H. Katz, “Protocolindependent adaptive replay of application dialog,” in Proceedings of the 13th Symposium on Network and Distributed System Security (NDSS ’06), 2006. PDF

[5]

J. Ma, K. Levchenko, C. Kreibich, S. Savage, and G. Voelker, “Automatic protocol inference: unexpected means of identifying protocols,” UCSD Computer Science Technical Report CS2006-0850, 2006. PDF

[6]

Lim, J., Reps, T., Liblit, B.: Extracting output formats from executables. In: 13th Working Conference on Reverse Engineering, 2006. WCRE ’06, pp. 167–178. IEEE, Benevento (2006). doi:10.1109/WCRE.2006.29 PDF

[7]

Cui, W., Paxson, V., Weaver, N., Katz, R.H.: Protocol-independent adaptive replay of application dialog. In: Proceedings of the 13th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, San Diego (2006). http://research.microsoft.com/apps/pubs/default.aspx?id=153197

[8]

W. Cui, J. Kannan, and H. J. Wang, “Discoverer: Automatic protocol reverse engineering from network traces.,” in USENIX security symposium, 2007, pp. 1–14. PDF

[9]

J. Caballero, H. Yin, Z. Liang, and D. Song, “Polyglot: automatic extraction of protocol message format using dynamic binary analysis,” in Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS ’07), pp. 317–329, ACM, November 2007. PDF

[10]

M. Shevertalov and S. Mancoridis, “A reverse engineering tool for extracting protocols of networked applications,” in Proceedings of the 14th Working Conference on Reverse Engineering (WCRE ’07), pp. 229–238, October 2007. PDF

[11]

Caballero, J., Song, D.: Rosetta: Extracting Protocol Semantics Using Binary Analysis with Applications to Protocol Replay and NAT Rewriting. Technical Report CMU-CyLab-07-014, Carnegie Mellon University, Pittsburgh (2007)

[12]

Z. Lin, X. Jiang, D. Xu, and X. Zhang, “Automatic protocol format reverse engineering through context-aware monitored execution,” in Proceedings of the 15th Symposium on Network and Distributed System Security (NDSS ’08), February 2008. PDF

[13]

W. Cui, M. Peinado, K. Chen, H. J. Wang, and L. Irun-Briz, “Tupni: automatic reverse engineering of input formats,” in Proceedings of the 15th ACM Conference on Computer and Communications Security (CCS ’08), pp. 391–402, ACM, Alexandria, Va, USA, October 2008. PDF

[14]

K. Gopalratnam, S. Basu, J. Dunagan, and H. J. Wang, “Automatically extracting fields from unknown network protocols,” in Proceedings of the 15th Symposium on Network and Distributed System Security (NDSS ’08), 2008. PDF

[15]

Wang, R., Wang, X., Zhang, K., Li, Z.: Towards automatic reverse engineering of software security configurations. In: Proceedings of the 15th ACM Conference on Computer and Communications Security, CCS ’08, pp. 245–256. ACM, Limerick (2008). doi:10.1145/1455770.1455802

[16]

Z. Wang, X. Jiang, W. Cui, X. Wang, and M. Grace, “ReFormat: automatic reverse engineering of encrypted messages,” in Computer Security—ESORICS 2009. ESORICS 2009, M. Backes and P. Ning, Eds., vol. 5789 of Lecture Notes in Computer Science, pp. 200–215, Springer, Berlin, Germany, 2009. PDF

[17]

P. M. Comparetti, G. Wondracek, C. Kruegel, and E. Kirda, “Prospex: protocol specification extraction,” in Proceedings of the 30th IEEE Symposium on Security and Privacy, pp. 110–125, Berkeley, Calif, USA, May 2009. PDF

[18]

M.-M. Xiao, S.-Z. Yu, and Y. Wang, “Automatic network protocol automaton extraction,” in Proceedings of the 3rd International Conference on Network and System Security (NSS ’09), pp. 336–343, October 2009.

[19]

A. Trifilo, S. Burschka, and E. Biersack, “Traffic to protocol reverse engineering,” in Proceedings of the IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–8, July 2009. PDF

[20]

J. Antunes and N. Neves, “Building an automaton towards reverse protocol engineering,” 2009, http://www.di.fc.ul.pt/∼nuno/PAPERS/INFORUM09.pdf.

[21]

J. Caballero, P. Poosankam, C. Kreibich, and D. Song, “Dispatcher: enabling active botnet infiltration using automatic protocol reverse-engineering,” in Proceedings of the 16th ACM Conference on Computer and Communications Security (CCS ’09), pp. 621–634, ACM, Chicago, Ill, USA, November 2009. PDF

[22]

Campana, G.: Fuzzgrind: an automatic fuzzing tool. In: Hack. lu. Hack. lu, Luxembourg (2009)

[23]

Lin, Z., Zhang, X., Xu, D.: Automatic reverse engineering of data structures from binary execution. In: Proceedings of the 17th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, San Diego (2010)

[24]

Cho, C.Y., Babi D., Shin, E.C.R., Song, D.: Inference and analysis of formal models of botnet command and control protocols. In: Proceedings of the 17th ACM Conference on Computer and Communications Security, CCS ’10, pp. 426–439. ACM, New York, NY (2010). doi:10.1145/1866307.1866355 Cho, C.Y., Babi, D., Poosankam, P., Chen, K.Z., Wu, E.X., Song, D.: MACE: model-inference-assisted concolic exploration for protocol and vulnerability discovery. In: Proceedings of the 20th USENIX Conference on Security, SEC’11, p. 19. USENIX Association, Berkeley, CA (2011)

[25]

S. Whalen, M. Bishop, and J. P. Crutchfield, “Hidden Markov Models for Automated Protocol Learning,” in Security and Privacy in Communication Networks, vol. 50, S. Jajodia and J. Zhou, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 415–428. PDF

[26]

S. Gorbunov and A. Rosenbloom, “Autofuzz: Automated network protocol fuzzing framework,” IJCSNS, vol. 10, no. 8, p. 239, 2010. PDF

[27]

J. Antunes, N. Neves, and P. Verissimo, “Reverse engineering of protocols from network traces,” in Proceedings of the 18th Working Conference on Reverse Engineering (WCRE ’11), pp. 169–178, October 2011. PDF

[28]

Y. Wang, Z. Zhang, D. D. Yao, B. Qu, and L. Guo, “Inferring protocol state machine from network traces: a probabilistic approach,” in Proceedings of the 9th Applied Cryptography and Network Security International Conference (ACNS ’11), pp. 1–18, 2011. PDF

[29]

Y. Wang, X. Li, J. Meng, Y. Zhao, Z. Zhang, and L. Guo, “Biprominer: automatic mining of binary protocol features,” in Proceedings of the 12th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT ’11), pp. 179–184, October 2011.

[30]

T. Krueger, N. Krmer, and K. Rieck, “Asap: automatic semantics-aware analysis of network payloads,” in Proceedings of the ECML/PKDD, 2011. PDF

[31]

Slowinska, A., Stancescu, T., Bos, H.: Howard: a dynamic excavator for reverse engineering data structures. In: Proceedings of the 18th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, San Diego (2011)

[32]

Y. Wang, X. Yun, M. Z. Shafiq et al., “A semantics aware approach to automated reverse engineering unknown protocols,” in Proceedings of the 20th IEEE International Conference on Network Protocols (ICNP ’12), pp. 1–10, IEEE, Austin, Tex, USA, November 2012. PDF

[33]

Z. Zhang, Q.-Y. Wen, and W. Tang, “Mining protocol state machines by interactive grammar inference,” in Proceedings of the 2012 3rd International Conference on Digital Manufacturing and Automation (ICDMA ’12), pp. 524–527, August 2012.

[34]

G. Bossert, F. Guihéry, and G. Hiet, “Towards automated protocol reverse engineering using semantic information,” in Proceedings of the 9th ACM Symposium on Information, Computer and Communications Security, Kyoto, Japan, June 2014. G. Bossert and F. Guihéry, “Reverse and simulate your enemy botnet C&C,” in Proceedings of the Mapping a P2P Botnet with Netzob, Black Hat 2012, Abu Dhabi, UAE, December 2012. PDF

[35]

Krueger, T., Gascon, H., Krmer, N., Rieck, K.: Learning stateful models for network honeypots. In: Proceedings of the 5th ACM Workshop on Security and Artificial Intelligence, AISec ’12, pp. 37–48. ACM, New York, NY (2012). PDF

[36]

Caballero, J., Grieco, G., Marron, M., Lin, Z., Urbina, D.: ARTISTE: Automatic Generation of Hybrid Data Structure Signatures from Binary Code Executions. Technical Report TR-IMDEA-SW-2012-001, IMDEA Software Institute, Madrid (2012)

[37]

Y. Wang, N. Zhang, Y.-M. Wu, B.-B. Su, and Y.-J. Liao, “Protocol formats reverse engineering based on association rules in wireless environment,” in Proceedings of the 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom ’13), pp. 134–141, Melbourne, Australia, July 2013.

[38]

P. Laroche, A. Burrows, and A. N. Zincir-Heywood, “How far an evolutionary approach can go for protocol state analysis and discovery,” in Proceedings of the IEEE Congress on Evolutionary Computation (CEC ’13), pp. 3228–3235, June 2013.

[39]

J.-Z. Luo and S.-Z. Yu, “Position-based automatic reverse engineering of network protocols,” Journal of Network and Computer Applications, vol. 36, no. 3, pp. 1070–1077, 2013.

[40]

J. Caballero and D. Song, “Automatic protocol reverse-engineering: message format extraction and field semantics inference,” Computer Networks, vol. 57, no. 2, pp. 451–474, 2013. PDF

[41]

C. Rossow and C. J. Dietrich, “PROVEX: detecting botnets with encrypted command and control channels,” in Detection of Intrusions and Malware, and Vulnerability Assessment, Springer, 2013. PDF

[42]

F. Meng, Y. Liu, C. Zhang, T. Li, and Y. Yue, “Inferring protocol state machine for binary communication protocol,” in Proceedings of the IEEE Workshop on Advanced Research and Technology in Industry Applications (WARTIA ’14), pp. 870–874, September 2014.

[43]

Zalewski, M.: American Fuzzy Loop. http://lcamtuf.coredump.cx/afl/technical_details.txt

[44]

Z. Zhang, Z. Zhang, P. P. C. Lee, Y. Liu, and G. Xie, “ProWord: An unsupervised approach to protocol feature word extraction,” in IEEE INFOCOM 2014 - IEEE Conference on Computer Communications, Toronto, ON, Canada, Apr. 2014, pp. 1393–1401, doi: 10.1109/INFOCOM.2014.6848073. PDF

[45]

Q. Huang, P. P. C. Lee, and Z. Zhang, “Exploiting intrapacket dependency for fine-grained protocol format inference,” in Proceedings of the 14th IFIP Networking Conference (NETWORKING ’15), Toulouse, France, May 2015.

[46]

I. Bermudez, A. Tongaonkar, M. Iliofotou, M. Mellia, and M. M. Munafo, “Automatic protocol field inference for deeper protocol understanding,” in Proceedings of the 14th IFIP Networking Conference (Networking ’15), pp. 1–9, May 2015. PDF

[47]

J.-Z. Luo, S.-Z. Yu, and J. Cai, “Capturing uncertainty information and categorical characteristics for network payload grouping in protocol reverse engineering,” Mathematical Problems in Engineering, vol. 2015, Article ID 962974, 9 pages, 2015.

[48]

R. Lin, O. Li, Q. Li, and Y. Liu, “Unknown network protocol classification method based on semi supervised learning,” in Proceedings of the IEEE International Conference on Computer and Communications (ICCC ’15), pp. 300–308, Chengdu, China, October 2015.

[49]

Zeng, J., Lin, Z.: Towards automatic inference of kernel object semantics from binary code. In: 18th International Symposium, RAID 2015, vol. 9404, pp. 538–561. Springer, Kyoto (2015). doi:10.1007/978-3-319-26362-5

[50]

H. Gascon, C. Wressnegger, F. Yamaguchi, D. Arp, and K. Rieck, “Pulsar: Stateful Black-Box Fuzzing of Proprietary Network Protocols,” in Security and Privacy in Communication Networks, vol. 164, B. Thuraisingham, X. Wang, and V. Yegneswaran, Eds. Cham: Springer International Publishing, 2015, pp. 330–347. PDF

[51]

H. Li, B. Shuai, J. Wang, and C. Tang, “Protocol Reverse Engineering Using LDA and Association Analysis,” in 2015 11th International Conference on Computational Intelligence and Security (CIS), Shenzhen, China, Dec. 2015, pp. 312–316, doi: 10.1109/CIS.2015.83.

[52]

J. Cai, J. Luo, and F. Lei, “Analyzing network protocols of application layer using hidden Semi-Markov model,” Mathematical Problems in Engineering, vol. 2016, Article ID 9161723, 14 pages, 2016.

[53]

K. Choi, Y. Son, J. Noh, H. Shin, J. Choi, and Y. Kim, “Dissecting customized protocols: automatic analysis for customized protocols based on IEEE 802.15.4,” in Proceedings of the 9th ACM Conference on Security and Privacy in Wireless and Mobile Networks, pp. 183–193, Darmstadt, Germany, July 2016. PDF

[54]

S. Tao, H. Yu, and Q. Li, “Bit‐oriented format extraction approach for automatic binary protocol reverse engineering,” IET Communications, vol. 10, no. 6, pp. 709–716, Apr. 2016, doi: 10.1049/iet-com.2015.0797. PDF

[55]

M.-M. Xiao, S.-L. Zhang, and Y.-P. Luo, “Automatic network protocol message format analysis,” IFS, vol. 31, no. 4, pp. 2271–2279, Sep. 2016, doi: 10.3233/JIFS-169067.

[56]

D. R. Fletcher Jr., Identifying Vulnerable Network Protocols with PowerShell, SANS Institute Reading Room site, 2017.

[57]

Y. Wang, X. Yun, Y. Zhang, L. Chen, and G. Wu, “A nonparametric approach to the automated protocol fingerprint inference,” Journal of Network and Computer Applications, vol. 99, pp. 1–9, 2017.

[58]

Y. Wang, X. Yun, Y. Zhang, L. Chen, and T. Zang, “Rethinking robust and accurate application protocol identification,” Computer Networks, vol. 129, pp. 64–78, 2017.

[59]

O. Esoul and N. Walkinshaw, “Using Segment-Based Alignment to Extract Packet Structures from Network Traces,” in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS), Prague, Czech Republic, Jul. 2017, pp. 398–409, doi: 10.1109/QRS.2017.49. PDF

[60]

M.-M. Xiao and Y.-P. Luo, “Automatic protocol reverse engineering using grammatical inference,” IFS, vol. 32, no. 5, pp. 3585–3594, Apr. 2017, doi: 10.3233/JIFS-169294.

[61]

S. Kleber, H. Kopp, and F. Kargl, “{NEMESYS}: Network message syntax reverse engineering by analysis of the intrinsic structure of individual messages,” 2018. PDF

[62]

Y.-H. Goo, K.-S. Shim, M.-S. Lee, and M.-S. Kim, “Protocol Specification Extraction Based on Contiguous Sequential Pattern Algorithm,” IEEE Access, vol. 7, pp. 36057–36074, 2019, doi: 10.1109/ACCESS.2019.2905353. PDF

[63]

J. Pohl and A. Noack, “Universal radio hacker: A suite for analyzing and attacking stateful wireless protocols,” Baltimore, MD, Aug. 2018, [Online]. Available: https://www.usenix.org/conference/woot18/presentation/pohl. J. Pohl and A. Noack, “Automatic wireless protocol reverse engineering,” Santa Clara, CA, Aug. 2019, [Online]. Available: https://www.usenix.org/conference/woot19/presentation/pohl. PDF

[64]

X. Luo, D. Chen, Y. Wang, and P. Xie, “A Type-Aware Approach to Message Clustering for Protocol Reverse Engineering,” Sensors, vol. 19, no. 3, p. 716, Feb. 2019, doi: 10.3390/s19030716. PDF

[65]

F. Sun, S. Wang, C. Zhang, and H. Zhang, “Unsupervised field segmentation of unknown protocol messages,” Computer Communications, vol. 146, pp. 121–130, Oct. 2019, doi: 10.1016/j.comcom.2019.06.013.

[66]

C. Yang, C. Fu, Y. Qian, Y. Hong, G. Feng, and L. Han, “Deep Learning-Based Reverse Method of Binary Protocol,” in Security and Privacy in Digital Economy, vol. 1268, S. Yu, P. Mueller, and J. Qian, Eds. Singapore: Springer Singapore, 2020, pp. 606–624.

[67]

F. Sun, S. Wang, C. Zhang, and H. Zhang, “Clustering of unknown protocol messages based on format comparison,” Computer Networks, vol. 179, p. 107296, Oct. 2020, doi: 10.1016/j.comnet.2020.107296.

[68]

K. Shim, Y. Goo, M. Lee, and M. Kim, “Clustering method in protocol reverse engineering for industrial protocols,” International Journal of Network Management, Jun. 2020, doi: 10.1002/nem.2126. PDF

[69]

X. Wang, K. Lv, and B. Li, “IPART: an automatic protocol reverse engineering tool based on global voting expert for industrial protocols,” International Journal of Parallel, Emergent and Distributed Systems, vol. 35, no. 3, pp. 376–395, May 2020, doi: 10.1080/17445760.2019.1655740.

[70]

S. Kleber, R. W. van der Heijden, and F. Kargl, “Message Type Identification of Binary Network Protocols using Continuous Segment Similarity,” in IEEE INFOCOM 2020 - IEEE Conference on Computer Communications, Toronto, ON, Canada, Jul. 2020, pp. 2243–2252. doi: 10.1109/INFOCOM41043.2020.9155275. PDF

[71]

Ye, Yapeng, Zhuo Zhang, Fei Wang, Xiangyu Zhang, and Dongyan Xu. “NetPlier: Probabilistic Network Protocol Reverse Engineering from Message Traces.” In NDSS. 2021. PDF