SandMark
SandMark is a tool developed at the University of Arizona for software
watermarking, tamperproofing, birthmarking, and code obfuscation of Java
bytecode. The tool incorporates several dynamic and static watermarking
algorithms, a large collection of obfuscation algorithms, a code
optimizer, and tools for viewing and analyzing Java bytecode.
The SandMark website is here.
The latest version of the SandMark
tool is here: sandmark.jar.
Publications
-
Christian Collberg, Clark Thomborson, Gregg M. Townsend,
Dynamic Graph-Based Software Fingerprinting,
ACM Transactions on Programming Languages and Systems,
Volume 29, Number 6, October 2007.
pdf
-
Ginger Myles, Christian Collberg,
Software Watermarking via Opaque Predicates: Implementation, Analysis, and Attacks,
Electronic Commerce Research Journal,
Volume 6, Number 2, pp. 155-171, 2006.
pdf
-
Ginger Myles, Christian Collberg,
k-gram Based Software Birthmarks,
Proceedings of the 2005 ACM Symposium on Applied Computing, Computer Security Track,
pp. 314-318, 2005.
pdf
-
Christian Collberg, Tapas Sahoo,
Software Watermarking in the Frequency Domain: Implementation, Analysis, and Attacks,
Journal of Computer Security,
Volume 13, Number 5, pp. 721--755, 2005.
pdf
-
Ginger Myles, Christian Collberg, Zachary Heidepriem, Armand Navabi,
The evaluation of two software watermarking algorithms,
Software - Practice and Experience,
Volume 35, Number 10, pp. 923-938, 2005.
pdf
-
Christian Collberg, Edward Carter, Saumya Debray, Andrew Huntwork, John Kececioglu, Cullen Linn, Michael Stepp,
Dynamic Path-Based Software Watermarking,
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI),
2004.
pdf
-
Christian Collberg, Andrew Huntwork, Edward Carter, Gregg Townsend,
Graph Theoretic Software Watermarks: Implementation, Analysis, and Attacks,
6th Information Hiding Workshop,
2004.
pdf
-
Ginger Myles, Christian Collberg,
Software Watermarking via Opaque Predicates: Implementation, Analysis, and Attack,
The Seventh International Conference on Electronic Commerce Research (ICECR-7),
June 2004.
pdf
-
Kelly Heffner, Christian Collberg,
The Obfuscation Executive,
7th Information Security Conference (ISC'04),
September 2004.
pdf
-
Ginger Myles, Christian Collberg,
Detecting Software Theft via Whole Program Path Birthmarks,
7th Information Security Conference (ISC'04),
September 2004.
pdf
-
Christian Collberg, Ginger Myles, Andrew Huntwork,
Sandmark--A Tool for Software Protection Research,
IEEE Security & Privacy,
Volume 1, Number 4, pp. 40--49, 2003.
pdf
-
Christian Collberg, Edward Carter, Stephen Kobourov, Clark Thomborson,
Error-Correcting Graphs for Software Watermarking,
29th Workshop on Graph Theoretic Concepts in Computer Science (WG'2003),
June 2003.
pdf
-
Ginger Myles, Christian Collberg,
Software Watermarking Through Register Allocation: Implementation, Analysis, and Attacks,
6th Annual International Conference on Information Security and Cryptology (ICISC),
November 2003.
springer
-
Christian Collberg, Clark Thomborson, Douglas Low,
Obfuscation techniques for enhancing software security,
United States Patent 6,668,325, Assignee: InterTrust Technologies (Santa Clara, CA),
Filed June 9, 1998, Issued December 23, 2003.
pdf
-
Christian Collberg, Clark Thomborson,
Watermarking, Tamper-Proofing, and Obfuscation -- Tools for Software Protection,
IEEE Transactions on Software Engineering,
Volume 28, Number 8, pp. 735--746, August 2002.
This paper was among the most cited journal
articles in software engineering from 2002, according to a citation study conducted by Prof.
Claes Wohlin.
pdf
-
Christian Collberg, Clark Thomborson,
Software Watermarking --- Models and Dynamic Embeddings,
ACM Principles of Programming Languages (POPL'99),
January 1999.
pdf
-
Christian Collberg, Clark Thomborson, and Douglas Low,
Manufacturing Cheap, Resilient, and Stealthy Opaque Constructs,
ACM Principles of Programming Languages (POPL'98),
January 1998.
pdf (scanned),
pdf (clean)
-
Christian Collberg, Clark Thomborson, Douglas Low,
Breaking Abstractions and Unstructuring Data Structures,
IEEE International Conference on Computer Languages (ICCL'98),
May 1998.
pdf
Supporting Grants and Contracts
- September 1, 2000--August 1, 2004,
$265,000 from the NSF: Software Watermarking, Obfuscation,
and Tamper-Proofing for Software Protection, grant CCR-0073483.
- June 2002, $417,000 (Option/Year 1) + $417,000 (Option/Year 2),
Air Force Research Lab (AFRL): Protecting Software
Against Tampering and Reverse Engineering, contract F33615-02-1146.
Splat
Self-plagiarism occurs when an author reuses portions of their
previous writings in subsequent research papers. Occasionally, the
derived paper is simply a re-titled and reformatted version of the
original one, but more frequently it is assembled from bits and pieces
of previous work.
It is our belief that self-plagiarism is
detrimental to scientific progress and bad for our academic
community. Flooding conferences and journals with near-identical
papers makes searching for information relevant to a particular topic
harder than it has to be. It also rewards those authors who are able
to break down their results into overlapping least-publishable-units
over those who publish each result only once. Finally, whenever a
self-plagiarized paper is published, another, more
deserving paper is not.
You can read more about Splat here.
Publications
-
Christian Collberg, Stephen Kobourov,
Self-Plagiarism in Computer Science,
Communications of the ACM,
April 2005.
pdf
-
Christian Collberg, Stephen Kobourov, Joshua Louie, Thomas Slattery,
SPLAT: A System for Self-Plagiarism Detection,
IADIS International Conference WWW/Internet (ICWI 2003),
pp. 508-514, November 2003.
pdf
Automatic Retargeting
There are three popular methods for constructing highly retargetable
compilers: (1) the compiler emits abstract machine code which is
interpreted at run-time, (2) the compiler emits C code which is
subsequently compiled to machine code by the native C compiler, or (3)
the compiler's code-generator is generated by a back-end generator
from a formal machine description produced by the compiler writer.
These methods incur high costs at run-time, compile-time, or
compiler-construction time, respectively.
We are interested in a fourth method that combines the fast
retargeting of C-code-generating compilers with the efficiency of
specification-driven code generators.
The basic idea is to use the native C compiler at compiler
construction time to discover architectural features of the new
architecture. From this information a formal machine description is
produced. Given this machine description, a native code-generator can
be generated by a back-end generator such as BEG or burg.
You can download the tool here.
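As a rough illustration of the idea described above (asking the native C compiler what it does with small probe programs), the following sketch generates tiny C samples that differ in a single operator, compiles them to assembly, and reports the lines that changed. It is not the tool's actual algorithm: the helper names `make_sample`, `compile_to_asm`, and `differing_lines` are hypothetical, and the sketch assumes a gcc- or clang-compatible compiler is installed as `cc` and accepts `-S -x c - -o -`.

```python
import difflib
import shutil
import subprocess

# Hypothetical probe template: one function whose only interesting
# operation is a single binary operator.
C_TEMPLATE = "int f(int a, int b) {{ return a {op} b; }}\n"


def make_sample(op):
    """Return C source for a probe function that applies `op` to two ints."""
    return C_TEMPLATE.format(op=op)


def compile_to_asm(source, cc="cc"):
    """Compile C source to assembly text with the native compiler.

    Returns None if the compiler `cc` is not on the PATH.
    Assumes gcc/clang-style flags (an assumption, not part of the source).
    """
    if shutil.which(cc) is None:
        return None
    result = subprocess.run(
        [cc, "-S", "-O1", "-x", "c", "-", "-o", "-"],
        input=source,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def differing_lines(asm_a, asm_b):
    """Return the lines that appear in asm_b but not in asm_a."""
    diff = difflib.ndiff(asm_a.splitlines(), asm_b.splitlines())
    return [line[2:].strip() for line in diff if line.startswith("+ ")]
```

Calling `differing_lines(compile_to_asm(make_sample("+")), compile_to_asm(make_sample("*")))` on a machine with a C compiler would list the assembly lines that distinguish multiplication from addition on that architecture, which is the kind of raw observation a machine description could be built from.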
Publications
-
Christian Collberg,
Automatic Derivation of Compiler Machine Descriptions,
ACM Transactions on Programming Languages and Systems,
Volume 24, Number 4, July 2002, pp. 369--408.
pdf
-
Christian Collberg,
Reverse Interpretation + Mutation Analysis = Automatic Retargeting,
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'97),
June 1997.
pdf
-
Christian Collberg,
Automatic Derivation of Machine Descriptions,
Proceedings of the Twentieth Australasian Computer Science Conference,
February 1997.
pdf
Code Rendering
ART is a language-independent and specification-driven program
rendering tool that is able to produce high-quality code renderings of
arbitrary complexity. The tool can incorporate arbitrary types of
information together with the program code, allowing it to be used for
debugging and profiling as well as for producing beautiful renderings
of programs for publication.
You can download the tool here
and the README file here.
Publications
-
Christian Collberg, Sean Davey, Todd Proebsting,
Language-Agnostic Program Rendering for Presentation, Debugging and Visualization,
IEEE Symposium on Visual Languages (VL'2000),
September 2000.
pdf
AlgoVista
AlgoVista is a web-based search engine designed to allow applied
computer scientists to classify problems and find algorithms and
implementations that solve these problems. Unlike other search
engines, AlgoVista is not keyword based. Rather, users provide a set
of input=>output samples that describe the behavior of the problem
they wish to classify. This type of query-by-example requires no
knowledge of specialized terminology, only an ability to formalize the
problem. The search mechanism of AlgoVista is based on a novel
application of program checking, a technique developed as an
alternative to program verification and testing.
You can download the tool here.
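The query-by-example idea described above can be sketched in a few lines. In this minimal illustration, each known problem is represented by a checker that decides whether an output is correct for a given input, and a query is classified by finding every checker that accepts all of the user's input=>output samples. The checkers, the `CHECKERS` registry, and the `classify` function are hypothetical and are not taken from AlgoVista itself.

```python
from collections import Counter


def check_sorting(inp, out):
    """Accept if `out` is a sorted permutation of the list `inp`."""
    return (
        isinstance(inp, list)
        and isinstance(out, list)
        and Counter(inp) == Counter(out)
        and all(a <= b for a, b in zip(out, out[1:]))
    )


def check_maximum(inp, out):
    """Accept if `out` is an element of `inp` that no element exceeds."""
    return (
        isinstance(inp, list)
        and len(inp) > 0
        and out in inp
        and all(x <= out for x in inp)
    )


def check_reversal(inp, out):
    """Accept if `out` is the list `inp` in reverse order."""
    return isinstance(inp, list) and isinstance(out, list) and out == inp[::-1]


# Hypothetical registry mapping problem names to their checkers.
CHECKERS = {
    "sorting": check_sorting,
    "maximum": check_maximum,
    "reversal": check_reversal,
}


def classify(samples):
    """Return the names of all problems whose checker accepts every sample."""
    matches = []
    for name, checker in CHECKERS.items():
        try:
            if all(checker(inp, out) for inp, out in samples):
                matches.append(name)
        except TypeError:
            # Samples of an unexpected shape simply do not match this problem.
            continue
    return matches
```

For example, `classify([([3, 1, 2], [1, 2, 3]), ([5, 4], [4, 5])])` matches only "sorting", whereas the single sample `([2, 1], [1, 2])` matches both "sorting" and "reversal". This shows why a query usually needs several samples to pin down one problem.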
Publications
-
Christian Collberg, Stephen Kobourov, Suzanne Westbrook,
AlgoVista: an algorithmic search tool in an educational setting,
Technical Symposium on Computer Science Education (SIGCSE),
pp. 462-466, March 2004.
acm
-
Christian Collberg, Todd A. Proebsting,
Problem Classification using Program Checking,
Fun with Algorithms (FUN '01),
May 29--31, 2001.
pdf
Flexible Encapsulation
Most modular programming languages provide an encapsulation
concept. Such concepts are used to protect the representational
details of the implementation of an abstraction from abuse by its
clients. Unfortunately, strict encapsulation is hindered by the
separate compilation facilities provided by modern languages. The goal
of the work presented here is to introduce techniques which allow
modular languages to support both separate compilation and strict
encapsulation without undue translation-time or execution-time cost.
You can download the
tool here.
Publications
-
Christian Collberg,
Distributed High-Level Module Binding for Flexible Encapsulation and Fast Inter-Modular Optimization,
International Conference on Programming Languages and Systems Architectures,
LNCS 782, March 1994.
pdf
-
Christian Collberg,
Flexible Encapsulation,
Ph.D. Thesis, Lund University,
December 1992.
-
Christian Collberg, Magnus Krampell,
Design and Implementation of Modular Languages Supporting Information Hiding,
6th International Phoenix Conference on Computers and Communications,
February 1987.