Buffer Overflows

Using angr to find overflows

Automatically detecting memory corruption

angr is an awesome concolic analysis framework to recover control flow, solve CTF problems, and find bugs! The current write-ups are a little old and pretty verbose. The interfaces provided by angr allow for much smaller scripts to do almost the exact same thing when detecting bugs.

The current method on the angr docs uses a pretty verbose stepping routine, where they step until they discover an unconstrained path resulting in a controllable program counter.

while exploitable_state is None:
print(sm)
sm.step()
if len(sm.unconstrained) > 0:
l.info("found some unconstrained states, checking exploitability")
for u in sm.unconstrained:
if fully_symbolic(u, u.regs.pc):
exploitable_state = u
break

I've found in my work, that the cost of asking if every bit inside of the program counter is symbolic slows down super large jobs. I've instead found that creating a method to query if a specific value is satisfiable runs much faster.

The code below will accept a simulation manager, check each unconstrained path to see if the program counter can point to "0x43434343"

def check_mem_corruption(simgr):
if len(simgr.unconstrained):
for path in simgr.unconstrained:
if path.satisfiable(extra_constraints=[path.regs.pc == b"CCCC"]):
path.add_constraints(path.regs.pc == b"CCCC")
if path.satisfiable():
simgr.stashes['mem_corrupt'].append(path)
simgr.stashes['unconstrained'].remove(path)
simgr.drop(stash='active')
return simgr

Understanding the memory corruption check

if len(simgr.unconstrained):
for path in simgr.unconstrained:

Unconstrained paths are program states where some form of symbolic data influences where the program instruction pointer points. When running bare bones projects from states like "blank states", data read from files, arguments passed to the programs, and any STDIN is often treated as symbolic data. The hope is that once some symbolic data corrupts the program counter, we check to see if it could result from any malicious user input.

if path.satisfiable(extra_constraints=[path.regs.pc == b"CCCC"]):
path.add_constraints(path.regs.pc == b"CCCC")
if path.satisfiable():

Program state (or path) satisfiablility is asking the theorem prover (Microsoft's Z3) whether the current program state could exist given the current circumstances. The code above shows a constraint being imposed on the theorem prover to see if a program state could exist where the program counter can point to "0x43434343" (CCCC).

By using "CCCC" instead of the traditional "AAAA" on arm devices, when you analyze thumb mode code, PoCs generated produce "CCCB" for their program counter instead of "AAA@" when thumb mode rounds the last digit down.

Piecing it together

Building methods to determine vulnerability enables much more granulated specification of vulnerability types and the ability to stack multiple vulnerability checkers.

The code below uses a simulation manager checking function to determine vulnerability.

Overflow_Detector.py
import angr, argparse, IPython
def check_mem_corruption(simgr):
if len(simgr.unconstrained):
for path in simgr.unconstrained:
if path.satisfiable(extra_constraints=[path.regs.pc == b"CCCC"]):
path.add_constraints(path.regs.pc == b"CCCC")
if path.satisfiable():
simgr.stashes['mem_corrupt'].append(path)
simgr.stashes['unconstrained'].remove(path)
simgr.drop(stash='active')
return simgr
def main():
parser = argparse.ArgumentParser()
parser.add_argument("Binary")
parser.add_argument("Start_Addr", type=int)
args = parser.parse_args()
p = angr.Project(args.Binary)
state = p.factory.blank_state(addr=args.Start_Addr)
simgr = p.factory.simgr(state, save_unconstrained=True)
simgr.stashes['mem_corrupt'] = []
simgr.explore(step_func=check_mem_corruption)
IPython.embed()
if __name__ == "__main__":
main()