Kam1n0: Assembly Code Data Mining for Reverse Engineering

Speaker:        Steven Ding
                McGill University

Title:  "Kam1n0: Assembly Code Data Mining for Reverse Engineering"

Date:   Monday, 22 October 2018

Time:   4:00pm - 5:00pm

Venue:  Lecture Theater F (near lift 25/26), HKUST


Assembly code analysis is one of the critical processes for mitigating the
exponentially increasing threats from malicious software. It is also a
common practice for detecting and justifying software plagiarism and
software patent infringements when the source code is unavailable.
However, it is a labor-intensive and time-consuming process even for
experienced reverse engineers.

An effective and efficient assembly code clone search engine can greatly
reduce the effort of this process since it can identify the cloned parts
that have been previously analyzed. By closely collaborating with reverse
engineers, we studied the challenges and then designed and implemented an
award-winning assembly clone search engine called Kam1n0. It is the first
clone search engine that can efficiently and accurately identify the given
query assembly function's subgraph clones from a repository of millions of
candidates. It also introduces specialized techniques that can mitigate
the variance introduced by different compilers, optimization techniques,
and binary protection techniques. Extensive experimental results suggest
that Kam1n0 is accurate, efficient, and scalable for handling a large
volume of assembly code. This talk will include a live demonstration of


Steven is a Ph.D. Candidate at McGill University. He is affiliated with
the Data Mining and Security Lab. His research develops novel data mining
and machine learning techniques driven by the needs and challenges of
real-life applications in cybersecurity. Steven was awarded the Dean's
Graduate Award at McGill University. His research is also supported by the
FRQNT Doctoral Research Scholarship of Canada. His study on assembly clone
search has been published at the data mining conference SIGKDD and will be
presented at the security conference IEEE Security and Privacy. The
resulting search engine, namely Kam1n0, won the Hex-Rays plug-in contest
award. Kam1n0 has been presented at the Smart Cybersecurity Network
Canada, SOPHOS, ESET, Above Security, and Google. It is now used at CISCO.
See Steven's research website at http://stevending.net for more information.