FASER: Binary Code Similarity Search through the use of Intermediate Representations |
CAMLIS |
2023 |
link |
|
link |
link |
VEXIR2Vec: An Architecture-Neutral Embedding Framework for Binary Similarity |
|
2023 |
link |
|
|
|
kTrans: Knowledge-Aware Transformer for Binary Code Embedding |
|
2023 |
link |
|
|
link |
Improving Binary Code Similarity Transformer Models by Semantics-Driven Instruction Deemphasis |
ISSTA |
2023 |
link |
|
|
link |
Asteria-Pro: Enhancing Deep-Learning Based Binary Code Similarity Detection by Incorporating Domain Knowledge |
TOSEM |
2023 |
link |
|
|
link |
sem2vec: Semantics-aware Assembly Tracelet Embedding |
TOSEM |
2023 |
link |
|
|
link |
1-to-1 or 1-to-n? Investigating the effect of function inlining on binary similarity analysis |
TOSEM |
2023 |
link |
|
|
|
Binary Function Clone Search in the Presence of Code Obfuscation and Optimization over Multi-CPU Architectures |
AsiaCCS |
2023 |
link |
|
|
|
VulHawk: Cross-architecture Vulnerability Detection with Entropy-based Binary Code Search |
NDSS |
2023 |
link |
|
|
link |
A Game-Based Framework to Compare Program Classifiers and Evaders |
CGO |
2023 |
link |
link |
link |
link |
BBDetector: A Precise and Scalable Third-Party Library Detection in Binary Executables with Fine-Grained Function-Level Features |
MDPI |
2023 |
link |
|
|
|
A Survey of Binary Code Fingerprinting Approaches: Taxonomy, Methodologies, and Features |
CSUR |
2022 |
link |
|
|
|
Practical Binary Code Similarity Detection with BERT-based Transferable Similarity Learning |
ACSAC |
2022 |
link |
link |
|
link |
Improving cross-platform binary analysis using representation learning via graph alignment |
ISSTA |
2022 |
link |
|
link |
link |
jTrans: Jump-Aware Transformer for Binary Code Similarity |
ISSTA |
2022 |
link |
|
link |
link |
COBRA-GCN: Contrastive Learning to Optimize Binary Representation Analysis with Graph Convolutional Networks |
DIMVA |
2022 |
link |
|
|
|
A Large-Scale Empirical Analysis of the Vulnerabilities Introduced by Third-Party Components in IoT Firmware |
ISSTA |
2022 |
link |
|
link |
link |
How Machine Learning Is Solving the Binary Function Similarity Problem |
USENIX |
2022 |
link |
|
link |
link |
Enhancing DNN-Based Binary Code Function Search With Low-Cost Equivalence Checking |
TSE |
2022 |
link |
|
|
link |
Program Representations for Predictive Compilation: State of Affairs in the Early 20's |
COLA |
2022 |
link |
link |
|
link |
Improving binary diffing speed and accuracy using community detection and locality-sensitive hashing: an empirical study |
JCVHT |
2022 |
link |
|
|
|
PalmTree: Learning an Assembly Language Model for Instruction Embedding |
CCS |
2021 |
link |
link |
|
link |
Binary code similarity detection |
ASE |
2021 |
link |
|
|
|
Binary diffing as a network alignment problem via belief propagation |
ASE |
2021 |
link |
|
|
|
Asteria: Deep Learning-based AST-Encoding for Cross-platform Binary Code Similarity Detection |
IEEE DSN |
2021 |
link |
|
|
link |
BinDeep: A deep learning approach to binary code similarity detection |
ESWA |
2021 |
link |
|
|
|
EnBinDiff: Identifying Data-Only Patches for Binaries |
TDSC |
2021 |
link |
|
|
|
BinDiffNN: Learning Distributed Representation of Assembly for Robust Binary Diffing Against Semantic Differences |
TSE |
2021 |
link |
|
|
link |
Codee: A Tensor Embedding Scheme for Binary Code Search |
TSE |
2021 |
link |
|
|
link |
Revisiting Binary Code Similarity Analysis using Interpretable Feature Engineering and Lessons Learned |
TSE (revision) |
2021 |
link |
|
|
link |
How could Neural Networks understand Programs? |
ICML |
2021 |
link |
|
link |
|
Multi-threshold token-based code clone detection |
SANER |
2021 |
link |
|
|
|
FastSpec: Scalable Generation and Detection of Spectre Gadgets Using Neural Embeddings |
IEEE EuroS&P |
2021 |
link |
|
link |
link |
TREX: Learning Execution Semantics from Micro-Traces for Binary Similarity |
|
2020 |
link |
|
|
link |
Similarity of Binaries Across Optimization Levels and Obfuscation |
ESORICS |
2020 |
link |
|
link |
|
Open-source tools and benchmarks for code-clone detection: past, present, and future trends |
|
2020 |
link |
|
|
|
Semantically Find Similar Binary Codes with Mixed Key Instruction Sequence |
|
2020 |
|
|
|
|
LibDX: A Cross-Platform and Accurate System to Detect Third-Party Libraries in Binary Code |
|
2020 |
link |
|
|
|
Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Tree |
SANER |
2020 |
link |
|
|
|
What You See is What it Means! Semantic Representation Learning of Code based on Visualization and Transfer Learning |
|
2020 |
link |
|
|
|
Clone Detection on Large Scala Codebases |
|
2020 |
link |
|
|
|
CloneCompass: Visualizations for Code Clone Analysis |
|
2020 |
link |
|
|
|
DEEPBINDIFF: Learning Program-Wide Code Representations for Binary Diffing |
NDSS |
2020 |
link |
|
link |
link |
VGraph: A Robust Vulnerable Code Clone Detection System Using Code Property Triplets |
EuroS&P |
2020 |
link |
|
|
|
Order Matters: Semantic-Aware Neural Networks for Binary Code Similarity Detection |
AAAI |
2020 |
link |
|
|
|
Similarity Metric Method for Binary Basic Blocks of Cross-Instruction Set Architecture |
NDSS |
2020 |
link |
|
|
link |
Investigating Graph Embedding Neural Networks with Unsupervised Features Extraction for Binary Analysis |
NDSS Workshop on Binary Analysis Research (BAR) |
2019 |
link |
|
|
link |
Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimization |
IEEE S&P |
2019 |
link |
link |
link |
|
Semantic-Based Representation Binary Clone Detection for Cross-Architectures in the Internet of Things |
MDPI |
2019 |
link |
|
|
|
A Survey of Binary Code Similarity |
CSUR |
2019 |
link |
|
|
|
Research Progress of Code Clone Detection (代码克隆检测研究进展) |
Journal of Software (软件学报) |
2019 |
link |
|
|
|
A Systematic Review on Code Clone Detection |
|
2019 |
link |
|
|
|
A Cross-Architecture Instruction Embedding Model for Natural Language Processing-Inspired Binary Code Analysis |
NDSS |
2019 |
link |
|
|
link |
Neural Machine Translation Inspired Binary Code Similarity Comparison beyond Function Pairs |
NDSS |
2019 |
link |
link |
link |
model |
SAFE: Self-Attentive Function Embeddings for Binary Similarity |
|
2019 |
link |
link |
|
link |
Learning-Based Recursive Aggregation of Abstract Syntax Trees for Code Clone Detection |
SANER |
2019 |
link |
|
|
|
Deep Learning-Based Cross-Platform Binary Code Association Analysis (基于深度学习的跨平台二进制代码关联分析) |
|
2019 |
link |
|
|
|
CVSkSA: cross-architecture vulnerability search in firmware based on kNN-SVM and attributed control flow graph |
|
2019 |
link |
|
|
|
Function matching between binary executables: efficient algorithms and features |
JCVHT |
2019 |
link |
|
|
|
BinMatch: A Semantics-based Hybrid Approach on Binary Code Clone Analysis |
ICSME |
2018 |
link |
|
|
|
αDiff: Cross-Version Binary Code Similarity Detection with DNN |
ASE |
2018 |
link |
|
|
dataset |
Binary Similarity Detection Using Machine Learning |
PLDI |
2018 |
link |
|
|
|
CCAligner: A Token Based Large-Gap Clone Detector |
ICSE |
2018 |
link |
|
|
|
Oreo: Detection of Clones in the Twilight Zone |
FSE |
2018 |
link |
|
|
|
VulSeeker: A Semantic Learning Based Vulnerability Seeker for Cross-platform Binary |
ASE |
2018 |
link |
|
|
link |
VulSeeker-pro: enhanced semantic learning based binary vulnerability seeker with emulation |
|
2018 |
link |
|
|
|
FirmUp: Precise Static Detection of Common Vulnerabilities in Firmware |
|
2018 |
link |
|
|
|
BINARM: Scalable and Efficient Detection of Vulnerabilities in Firmware Images of Intelligent Electronic Devices |
|
2018 |
link |
|
|
|
A Resilient and Efficient System for Identifying FOSS Functions in Malware Binaries |
|
2018 |
link |
|
|
|
Beyond Precision and Recall: Understanding Uses (and Misuses) of Similarity Hashes in Binary Analysis |
|
2018 |
link |
link |
|
|
BCD: Decomposing Binary Code Into Components Using Graph-Based Clustering |
AsiaCCS |
2018 |
link |
|
|
|
A Deep Learning Approach to Program Similarity |
MASES |
2018 |
link |
|
|
|
Recurrent Neural Network for Code Clone Detection |
SEIM |
2018 |
link |
|
|
|
The Adverse Effects of Code Duplication in Machine Learning Models of Code |
|
2018 |
link |
|
link |
|
Benchmarks for software clone detection: A ten-year retrospective |
SANER |
2018 |
link |
|
|
|
Binary Code Clone Detection across Architectures and Compiling Configurations |
ICPC |
2017 |
link |
|
|
|
Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection |
ACM CCS |
2017 |
link |
|
|
link |
BinSequence: Fast, Accurate and Scalable Binary Code Reuse Detection |
AsiaCCS |
2017 |
link |
|
|
|
BinShape: Scalable and Robust Binary Library Function Identification Using Function Shape |
DIMVA |
2017 |
link |
|
|
|
Compiler-agnostic function detection in binaries |
IEEE EuroS&P |
2017 |
link |
|
|
link |
BinSign: Fingerprinting binary functions to support automated analysis of code executables |
|
2017 |
link |
|
|
|
Similarity of binaries through re-optimization |
PLDI |
2017 |
link |
link |
|
|
Transferring code-clone detection and analysis to practice |
ICSE-SEIP |
2017 |
link |
|
|
|
Cryptographic Function Detection in Obfuscated Binaries via Bit-Precise Symbolic Loop Mapping |
IEEE S&P |
2017 |
link |
|
|
|
Supervised Deep Features for Software Functional Clone Detection by Exploiting Lexical and Syntactical Information in Source Code |
IJCAI |
2017 |
link |
|
|
|
Extracting Conditional Formulas for Cross-Platform Bug Search |
AsiaCCS |
2017 |
link |
|
|
|
SPAIN: Security Patch Analysis for Binaries Towards Understanding the Pain and Pills |
ICSE |
2017 |
link |
|
|
|
CCLearner: A Deep Learning-Based Clone Detection Approach |
|
2017 |
link |
|
|
link |
BinSim: Trace-based Semantic Binary Diffing via System Call Sliced Segment Equivalence Checking |
USENIX |
2017 |
link |
link |
link |
|
In-memory Fuzzing for Binary Code Similarity Analysis |
ASE |
2017 |
link |
|
|
|
DéjàVu: a map of code duplicates on GitHub |
OOPSLA |
2017 |
link |
|
|
|
Some from Here, Some from There: Cross-project Code Reuse in GitHub |
MSR |
2017 |
link |
|
|
|
CVSSA: Cross-Architecture Vulnerability Search in Firmware Based on Support Vector Machine and Attributed Control Flow Graph |
|
2017 |
link |
|
|
|
Identifying Functionally Similar Code in Complex Codebases |
ICPC |
2016 |
link |
|
|
link |
Scalable graph-based bug search for firmware images (Genius) |
ACM CCS |
2016 |
link |
|
link |
link |
Cross-Architecture Binary Semantics Understanding via Similar Code Comparison |
IEEE SANER |
2016 |
link |
|
|
|
discovRE: Efficient cross-architecture identification of bugs in binary code |
NDSS |
2016 |
link |
|
|
|
BinGo: Cross-architecture cross-OS Binary Search |
FSE |
2016 |
link |
|
|
|
Kam1n0: Mapreduce-based assembly clone search for reverse engineering |
KDD |
2016 |
link |
|
|
link |
Statistical similarity of binaries |
PLDI |
2016 |
link |
link |
|
link |
Deep learning code fragments for code clone detection |
ASE |
2016 |
link |
|
|
|
A Survey of Software Clone Detection Techniques |
|
2016 |
link |
|
|
|
SourcererCC: Scaling Code Clone Detection to Big Code |
ICSE |
2016 |
link |
|
|
|
Binary executable file similarity calculation using function matching |
|
2016 |
link |
|
|
|
Matching Similar Functions in Different Versions of a Malware |
|
2016 |
link |
|
|
|
BinDNN: Resilient Function Matching Using Deep Learning |
|
2016 |
link |
|
|
|
VulPecker: An Automated Vulnerability Detection System Based on Code Similarity Analysis |
ACSAC |
2016 |
link |
|
|
link |
BigCloneEval: A Clone Detection Tool Evaluation Framework with BigCloneBench |
|
2016 |
link |
|
|
link |
Cross-architecture bug search in binary executables |
IEEE S&P |
2015 |
link |
|
|
|
Library functions identification in binary code by using graph isomorphism testings |
|
2015 |
link |
|
|
|
Evaluating clone detection tools with BigCloneBench |
|
2015 |
link |
|
|
link |
Memoized semantics-based binary diffing with application to malware lineage inference |
|
2015 |
link |
|
|
|
Sigma: A semantic integrated graph matching approach for identifying reused functions in binary code |
|
2015 |
link |
link |
|
|
BYTEWEIGHT: Learning to Recognize Functions in Binary Code |
USENIX |
2014 |
link |
link |
link |
|
Semantics-based obfuscation-resilient binary code similarity comparison with applications to software plagiarism detection |
FSE |
2014 |
link |
|
|
|
BinClone: Detecting code clones in malware |
SERE |
2014 |
link |
|
|
link |
Detecting fine-grained similarity in binaries |
|
2014 |
link |
|
|
|
Leveraging semantic signatures for bug search in binary programs |
ACSAC |
2014 |
link |
|
|
|
How Accurate Is Coarse-grained Clone Detection?: Comparision with Fine-grained Detectors |
|
2014 |
link |
|
|
|
Tracelet-based code search in executables |
PLDI |
2014 |
link |
|
|
|
Control Flow-Based Malware Variant Detection |
|
2014 |
link |
|
|
|
Hashing for Similarity Search: A Survey |
|
2014 |
link |
|
|
|
Achieving accuracy and scalability simultaneously in detecting application clones on android markets |
ICSE |
2014 |
link |
|
|
|
Identifying Shared Software Components to Support Malware Forensics |
|
2014 |
link |
|
|
|
Evaluating Modern Clone Detection Tools |
|
2014 |
link |
|
|
|
Rendezvous: a search engine for binary code |
MSR |
2013 |
link |
|
|
|
BinSlayer: accurate comparison of binary executables |
PPREW |
2013 |
link |
|
|
link |
Software clone detection: A systematic review |
|
2013 |
link |
|
|
|
How to extract differences from similar programs? A cohesion metric approach |
|
2013 |
link |
|
|
|
Software clone detection and refactoring |
|
2013 |
link |
|
|
|
An Emerging Approach towards Code Clone Detection: Metric Based Approach on Byte Code |
|
2013 |
link |
|
|
|
A hybrid-token and textual based approach to find similar code segments |
|
2013 |
link |
|
|
|
Gapped code clone detection with lightweight source code analysis |
|
2013 |
link |
|
|
|
MutantX-S: Scalable Malware Clustering Based on Static Features |
USENIX |
2013 |
link |
|
link |
|
BinJuice: Fast Location of Similar Code Fragments Using Semantic Juice |
PPREW |
2013 |
link |
|
|
|
Towards Automatic Software Lineage Inference |
USENIX |
2013 |
link |
|
link |
|
AnDarwin: Scalable Detection of Semantically Similar Android Applications |
|
2013 |
link |
|
|
|
Expose: Discovering potential binary code re-use |
|
2013 |
link |
|
|
|
Function Matching-based Binary level Software Similarity Calculation |
RACS |
2013 |
link |
|
|
|
FIRMA: Malware Clustering and Network Signature Generation with Mixed Network Behaviors |
RAID |
2013 |
link |
|
|
|
A study of repetitiveness of code changes in software evolution |
ASE |
2013 |
link |
|
|
|
iBinHunt: Binary hunting with interprocedural control flow |
|
2012 |
link |
link |
|
|
ReDeBug: Finding Unpatched Code Clones in Entire OS Distributions |
USENIX |
2012 |
link |
|
|
|
Boreas: an accurate and scalable token-based approach to code clone detection |
ASE |
2012 |
link |
|
|
|
Folding Repeated Instructions for Improving Token-Based Code Clone Detection |
|
2012 |
link |
|
|
|
A metrics-based data mining approach for software clone detection |
|
2012 |
link |
|
|
|
Comparison of Clone Detection Techniques |
|
2012 |
|
|
|
|
Malware Classification Method via Binary Content Comparison |
RACS |
2012 |
link |
|
|
|
Binary function clustering using semantic hashes |
ICMLA |
2012 |
link |
|
|
|
Value-based program characterization and its application to software plagiarism detection |
|
2011 |
link |
|
|
|
CMCD: Count Matrix Based Code Clone Detection |
|
2011 |
link |
|
|
|
Incremental code clone detection: A pdg-based approach |
|
2011 |
link |
|
|
|
Anywhere, Any-Time Binary Instrumentation |
|
2011 |
link |
|
|
|
Code reuse in open source software development: Quantitative evidence, drivers, and impediments |
|
2010 |
|
|
|
|
Index-based code clone detection: incremental, distributed, scalable |
|
2010 |
|
|
|
|
Detection of Type-1 and Type-2 Code Clones Using Textual Analysis and Metrics |
|
2010 |
|
|
|
|
A hybrid approach (syntactic and textual) to clone detection |
|
2010 |
|
|
|
|
Evaluating code clone genealogies at release level: An empirical study |
|
2010 |
|
|
|
|
A survey of Binary similarity and distance measures |
|
2010 |
|
|
|
|
Idea: Opcode-Sequence-Based Malware Detection |
|
2010 |
|
|
|
|
Behavioral Clustering of HTTP-Based Malware and Signature Generation Using Malicious Network Traces |
USENIX |
2010 |
|
|
|
|
Data fingerprinting with similarity digests |
|
2010 |
|
|
|
|
Automatic mining of functionally equivalent code fragments via random testing |
|
2009 |
|
|
|
|
A mutation/injection-based automatic framework for evaluating code clone detection tools |
|
2009 |
|
|
|
|
Problematic code clones identification using multiple detection results |
|
2009 |
|
|
|
|
Incremental clone detection |
|
2009 |
|
|
|
|
Scalable and incremental clone detection for evolving software |
|
2009 |
|
|
|
|
Large-scale Malware Indexing Using Function-call Graphs |
|
2009 |
|
|
|
|
Scalable, Behavior-Based Malware Clustering |
|
2009 |
|
|
|
|
peHash: A Novel Approach to Fast Malware Clustering |
USENIX |
2009 |
|
|
|
|
Detecting Code Clones in Binary Executables |
|
2009 |
|
|
|
|
BinHunt: Automatically finding semantic differences in binary programs |
|
2008 |
|
|
|
|
Scalable detection of semantic clones |
|
2008 |
|
|
|
|
Deckard: Scalable and accurate tree-based detection of code clones |
|
2007 |
|
|
|
|
Large-scale code reuse in open source software |
|
2007 |
|
|
|
|
A survey on software clone detection research |
|
2007 |
link |
|
|
|
A study of consistent and inconsistent changes to code clones |
|
2007 |
|
|
|
|
Comparison and evaluation of clone detection tools |
|
2007 |
|
|
|
|
Comprehensive Survey on Distance/Similarity Measures between Probability Density Functions |
|
2007 |
|
|
|
|
A Static Birthmark of Binary Executables Based on API Call Structure |
|
2007 |
|
|
|
|
CP-Miner: Finding copy-paste and related bugs in large-scale software code |
|
2006 |
|
|
|
|
Survey of research on software clones |
|
2006 |
link |
|
|
|
"Cloning considered harmful" considered harmful: patterns of cloning in software |
|
2006 |
link |
|
|
|
GPLAG: detection of software plagiarism by program dependence graph analysis |
|
2006 |
|
|
|
|
Detecting Self-mutating Malware Using Control-flow Graph Matching |
|
2006 |
|
|
|
|
Identifying Almost Identical Files Using Context Triggered Piecewise Hashing |
|
2006 |
|
|
|
|
Hamsa: Fast signature generation for zero-day polymorphic worms with provable attack resilience |
IEEE S&P |
2006 |
|
|
|
|
Graph-based comparison of executable objects |
|
2005 |
|
|
|
|
SDD: high performance code clone detection system for large scale source code |
|
2005 |
link |
|
|
|
Polygraph: Automatically generating signatures for polymorphic worms |
|
2005 |
|
|
|
|
K-gram Based Software Birthmarks |
|
2005 |
|
|
|
|
Insights into System-Wide Code Duplication |
IEEE |
2004 |
link |
|
|
|
Clone detection in source code by frequent itemset techniques |
|
2004 |
|
|
|
|
Evaluating clone detection techniques from a refactoring perspective |
|
2004 |
|
|
|
|
Structural comparison of executable objects |
|
2004 |
|
|
|
|
Code compaction of matching single-entry multiple-exit regions |
|
2003 |
link |
|
|
|
CloSpan: Mining closed sequential patterns in large datasets |
|
2003 |
|
|
|
|
CCFinder: a multilinguistic token-based code clone detection system for large scale source code |
|
2002 |
|
|
|
|
Identifying similar code with program dependence graphs |
|
2001 |
|
|
|
|
Using slicing to identify duplication in source code |
|
2001 |
|
|
|
|
BMAT – A Binary Matching Tool for Stale Profile Propagation |
|
2000 |
|
|
|
|
A language independent approach for detecting duplicated code |
|
1999 |
|
|
|
|
Compressing Differences of Executable Code |
|
1999 |
|
|
|
|
Similarity search in high dimensions via hashing |
|
1999 |
|
|
|
|
Clone detection using abstract syntax trees |
|
1998 |
|
|
|
|
Experiment on the Automatic Detection of Function Clones in a Software System Using Metrics |
|
1996 |
|
|
|
|
Pattern matching for clone and concept detection |
|
1996 |
|
|
|
|
On finding duplication and near-duplication in large software systems |
|
1995 |
link |
|
|
|
Detecting code similarity using patterns |
|
1995 |
|
|
|
|
A Cross-platform Binary Diff |
|
1995 |
|
|
|
|