Abstract
Over the last few decades, our day-to-day life has become dependent on software programs. The digital revolution in the software industry has progressed at a staggering rate. Thousands of software applications, such as web browsers, search engines, and (aviation) simulators, are being developed by enterprises in various domains and are changing every facet of our lives. Most of these programs are written in fast, low-level, memory-unsafe languages such as C and C++. These languages rely on the programmer’s expertise to manage the memory of objects created at runtime, which leads to memory-corruption bugs such as buffer overflows [1] and dangling pointers [2]. Malicious users can later exploit these bugs to compromise software security, causing serious problems such as leakage of confidential information, system crashes, and changes to user permissions. When exploited, these bugs harm both consumers and enterprises in terms of money, time, and even human lives. It is therefore extremely important to catch these bugs a priori and mitigate them before any
exploitation. This thesis helps identify and report one class of these bugs: those that can leak confidential information out of the software (information leaks).
Fuzzing is arguably the most effective security testing technique for finding memory-corruption bugs in applications. However, fuzzing is effective mainly for bugs that produce an observable effect, such as a crash. Consequently, information leak bugs often remain undetected by fuzzing, despite their serious effect on the security of the application. To help fuzzing detect such bugs,
approaches based on AddressSanitizer (the ASan option in LLVM and GCC) are employed. However, this option
can be used only when the source code of the application is available. Binary-only solutions, such as Valgrind, have limited ability to detect information leaks (e.g., stack variables are not covered). In this thesis, we investigate the application of dynamic taint flow analysis (DTA) to detect information leaks in binary executables. In this approach, we leverage DTA to track data and pointer taintedness and devise a set of rules to infer when an information leak is possible. Our method is suited to typical desktop applications (for example, image and document viewers) that parse and display the information contained in the provided input.
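The idea of tracking taintedness and applying a rule to flag a possible leak can be illustrated with a minimal C++ sketch. It is a toy model only: the class `TaintTracker`, its method names, and the specific rule in `loadMayLeak` are assumptions made for illustration and are not the rule set or implementation used in this thesis.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Toy byte-granular taint tracker. All names and the leak rule below are
// illustrative assumptions, not the actual implementation of the tool.
class TaintTracker {
public:
    // Mark [addr, addr + len) as derived from the untrusted input.
    void markInput(std::uintptr_t addr, std::size_t len) {
        for (std::size_t i = 0; i < len; ++i) tainted_.insert(addr + i);
    }

    // Model a memcpy-like copy between non-overlapping ranges:
    // each destination byte inherits the taint of its source byte.
    void propagateCopy(std::uintptr_t dst, std::uintptr_t src, std::size_t len) {
        for (std::size_t i = 0; i < len; ++i) {
            if (tainted_.count(src + i)) tainted_.insert(dst + i);
            else tainted_.erase(dst + i);
        }
    }

    bool isTainted(std::uintptr_t addr) const {
        return tainted_.count(addr) != 0;
    }

    // Assumed rule: a read through an input-controlled (tainted) pointer that
    // touches at least one byte not derived from the input is a possible
    // leak, because the program may then display data the input never held.
    bool loadMayLeak(bool pointerTainted, std::uintptr_t addr,
                     std::size_t len) const {
        if (!pointerTainted) return false;
        for (std::size_t i = 0; i < len; ++i) {
            if (!isTainted(addr + i)) return true;
        }
        return false;
    }

private:
    std::unordered_set<std::uintptr_t> tainted_;  // addresses of tainted bytes
};
```

In this sketch, a 16-byte read that stays inside a 16-byte input buffer is not flagged, while the same read starting 8 bytes into the buffer is flagged because it runs past the input-derived bytes. A real tool would obtain the addresses and the pointer taint from instrumented instructions rather than from explicit calls.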
In order to show the effectiveness of the proposed technique, we build a dynamic binary analysis
(DBA) tool, TailFinder (Taint Assisted Info Leak Finder), in C++ using the Pin dynamic binary instrumentation framework and evaluate it on real-world applications. On these applications, we find information leaks that have known CVEs, thereby showing the applicability of the proposed technique. We also
present how these tools can be integrated with fuzzing to find possible information leaks a priori.