Hash-based carving: Searching media for complete files and file fragments with sector hashing and hashdb

Digital Investigation(2015)

引用 31|浏览12
暂无评分
摘要
Hash-based carving is a technique for detecting the presence of specific “target files” on digital media by evaluating the hashes of individual data blocks, rather than the hashes of entire files. Unlike whole-file hashing, hash-based carving can identify files that are fragmented, files that are incomplete, or files that have been partially modified. Previous efforts at hash-based carving have looked for evidence of a single file or a few files. We attempt hash-based carving with a target file database of roughly a million files and discover an unexpectedly high false identification rate resulting from common data structures in Microsoft Office documents and multimedia files. We call such blocks “non-probative blocks.” We present the HASH-SETS algorithm that can determine the presence of files, and the HASH-RUNS algorithm that can reassemble files using a database of file block hashes. Both algorithms address the problem of non-probative blocks and provide results that can be used by analysts looking for target data on searched media. We demonstrate our technique using the bulk_extractor forensic tool, the hashdb hash database, and an algorithm implementation written in Python.
更多
查看译文
关键词
Hash-based carving,Hashdb,bulk_extractor,Sector hashing,Similarity
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要