Analyzing binary files and memory dumps is a common task in software development, especially in cybersecurity, reverse engineering, and low-level programming.
In this article, we will build a memory and hex dump analyzer in Rust that provides an interactive UI to view, navigate, and search through binary data.
By the end, you’ll have a tool that detects specific byte patterns and ASCII strings and displays them in an organized way. 🦀
1. Project Overview
Our Rust Dump Analyzer will allow us to:
- Display a hex dump of binary files with addresses and ASCII string detections.
- Detect common file patterns (e.g., PDF, JPEG) based on known byte headers.
- Navigate through entries, view contextual byte data, and use search and jump-to-address functions.
- Get an overview of key statistics (total entries, patterns found, and ASCII strings detected).
We’ll implement the tool with Rust’s crossterm and ratatui libraries to build an interactive command-line interface.
2. Setting Up the Project
Begin by creating a new Rust project:
cargo new rust-dump-analyzer
cd rust-dump-analyzer
Add the following dependencies to Cargo.toml:
[dependencies]
crossterm = "0.28" # the crossterm release that ratatui 0.29 is built against
memchr = "2.5"
ratatui = "0.29" # for building the UI
3. Implementing the Core Functionality
Our analyzer’s core functions will focus on:
- Reading Binary Data: Loading the binary file’s contents into memory.
- Detecting ASCII Strings and Patterns: Identifying readable text and known file signatures in the data.
- Generating a Hex Dump: Displaying a formatted hex dump for easier analysis.
Let’s break down each of these components in detail.
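To make the third item concrete before the walkthrough, here is a minimal sketch of how one row of a hex dump could be formatted. The function name format_hex_line, the 16-byte row width, and the column layout are assumptions for illustration, not necessarily the layout the finished tool uses.

```rust
/// Formats one row of a hex dump: an 8-digit hex offset, the bytes in hex,
/// then the same bytes as text with '.' standing in for non-printable bytes.
/// (Hypothetical helper; the name and layout are assumptions of this sketch.)
fn format_hex_line(offset: usize, bytes: &[u8]) -> String {
    const WIDTH: usize = 16; // bytes per row, assumed for this sketch

    let hex: Vec<String> = bytes.iter().map(|b| format!("{:02x}", b)).collect();
    let ascii: String = bytes
        .iter()
        .map(|&b| if b.is_ascii_graphic() || b == b' ' { b as char } else { '.' })
        .collect();

    // Left-pad the hex column to a fixed width so the text column
    // still lines up on a short final row.
    format!(
        "{:08x}  {:<width$}  |{}|",
        offset,
        hex.join(" "),
        ascii,
        width = WIDTH * 3 - 1
    )
}

fn main() {
    let data = b"Rust\x00\x01dump analyzer!\n";
    for (i, row) in data.chunks(16).enumerate() {
        println!("{}", format_hex_line(i * 16, row));
    }
}
```

Each row carries its own starting offset, which is what later makes jump-to-address navigation straightforward.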
Reading Binary Data
The first task is reading the binary data from a file. We’ll implement a function called read_dump_file that opens a file, reads its contents into a byte vector, and returns this data.
This function needs to:
- Open the File: Use Rust’s File::open method to open the file.
- Read to End: Use read_to_end to read the file's entire content into a byte vector.
Here’s the full implementation of the read_dump_file function:
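use std::fs::File;
use std::io::{self, Read};

fn read_dump_file(filename: &str) -> io::Result<Vec<u8>> {
    // Open the file; `?` returns early with the I/O error on failure.
    let mut file = File::open(filename)?;
    // Read the whole file into a growable byte buffer.
    let mut buffer = Vec::new();
    file.read_to_end(&mut buffer)?;
    Ok(buffer)
}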
- Error Handling: The ? operator is used to propagate errors, allowing the function to return an io::Result.
- Buffer: The buffer is dynamically sized to accommodate the file’s contents, making it suitable for files of various sizes.
This function will be used to load binary files, providing raw data for further analysis in subsequent functions.
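As a side note, the standard library offers std::fs::read, which opens a file and reads it to the end in one call. The sketch below wraps it in a function with the same signature and exercises it on a throwaway file. The name read_dump_file_short and the sample file name are hypothetical and used only for this demonstration.

```rust
use std::fs;
use std::io::{self, Write};

/// Same contract as `read_dump_file`, expressed with the one-call helper
/// `std::fs::read`. (Hypothetical name, for illustration only.)
fn read_dump_file_short(filename: &str) -> io::Result<Vec<u8>> {
    fs::read(filename)
}

fn main() -> io::Result<()> {
    // Hypothetical sample file, written to the temp directory for this demo.
    let path = std::env::temp_dir().join("rust_dump_analyzer_sample.bin");
    fs::File::create(&path)?.write_all(&[0x25, 0x50, 0x44, 0x46, 0x00, 0xff])?;

    let data = read_dump_file_short(path.to_str().expect("temp path is valid UTF-8"))?;
    println!("read {} bytes, first byte = {:#04x}", data.len(), data[0]);

    // A missing file surfaces as an `Err` value rather than a panic.
    assert!(read_dump_file_short("definitely/not/here.bin").is_err());

    fs::remove_file(&path)?;
    Ok(())
}
```

Either form loads the entire file into memory, which is fine for small dumps; very large dumps would need a chunked or memory-mapped approach instead.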
Detecting ASCII Strings
In binary files, ASCII strings often represent readable text or meaningful data. We want to identify these strings and their positions.
Our find_ascii_strings function will:
- Detect ASCII Characters: Iterate over the bytes and check whether each one is a printable ASCII character or a space.
- Build Strings: Collect consecutive printable bytes into strings.
- Minimum Length Filter: Only return strings that are at least a specified minimum length (e.g., 4 characters).
Here’s the complete implementation of find_ascii_strings:
fn find_ascii_strings(chunk: &[u8], chunk_offset: usize, min_length: usize) -> Vec<(String, usize)> {
    let mut result = Vec::new();
    let mut current_string = Vec::new();
    let mut start_index = 0;

    for (i, &byte) in chunk.iter().enumerate() {
        if byte.is_ascii_graphic() || byte == b' ' {
            // Remember where a new run of printable bytes begins.
            if current_string.is_empty() {
                start_index = i;
            }
            current_string.push(byte);
        } else if current_string.len() >= min_length {
            // The run just ended and is long enough to keep.
            result.push((
                String::from_utf8_lossy(&current_string).to_string(),
                chunk_offset + start_index,
            ));
            current_string.clear();
        } else {
            // The run was too short; discard it.
            current_string.clear();
        }
    }

    // Flush a run that extends to the end of the chunk.
    if current_string.len() >= min_length {
        result.push((
            String::from_utf8_lossy(&current_string).to_string(),
            chunk_offset + start_index,
        ));
    }

    result
}
- Iterating Over Bytes: We loop through each byte in chunk. The is_ascii_graphic method, together with an explicit check for the space character, filters for printable characters.
- String Building: We use current_string to collect contiguous printable bytes. When a non-printable byte is encountered, the accumulated bytes are kept if they meet the minimum length requirement and discarded otherwise.
- Result: We return a vector of tuples, where each tuple contains an ASCII string and its starting position in the file.
This function will be used to detect readable text within binary data, which can often reveal metadata, file names, and other useful information.
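A small worked example may help show what the function returns. So that the block compiles on its own, find_ascii_strings is restated here in a condensed, slice-based form that is intended to behave the same way; the sample bytes and the 0x100 offset are made up for illustration.

```rust
/// Condensed variant of `find_ascii_strings`, included only so this example
/// is self-contained. It slices each printable run out of `chunk` directly.
fn find_ascii_strings(chunk: &[u8], chunk_offset: usize, min_length: usize) -> Vec<(String, usize)> {
    let mut result = Vec::new();
    let mut run_start = 0; // index where the current printable run began

    // Append one non-printable sentinel byte so a run that reaches the end
    // of the chunk is flushed by the same code path as any other run.
    for (i, byte) in chunk.iter().copied().chain(std::iter::once(0u8)).enumerate() {
        if byte.is_ascii_graphic() || byte == b' ' {
            continue;
        }
        if i - run_start >= min_length {
            let text = String::from_utf8_lossy(&chunk[run_start..i]).into_owned();
            result.push((text, chunk_offset + run_start));
        }
        run_start = i + 1;
    }
    result
}

fn main() {
    // Made-up sample: two long runs ("Hello", "World!") and one short run ("ab").
    let data = b"\x00\x01Hello\x00ab\x00World!\xff";

    // Pretend this chunk starts at file offset 0x100.
    for (text, addr) in find_ascii_strings(data, 0x100, 4) {
        println!("{:#06x}: {}", addr, text);
    }
}
```

With a minimum length of 4, the two-byte run "ab" is dropped, and the reported addresses include the chunk offset, so they refer to positions in the whole file rather than in the chunk.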
Detecting Known File Patterns
Many file formats have specific “magic numbers” — unique byte sequences at the beginning of the file. Detecting these patterns can help identify embedded files or known data structures within the binary dump.
Our detect_patterns function will:
- Define Common Patterns: Accept a list of known byte patterns to search for, such as PDF, JPEG, ZIP, and PNG headers.
- Search for Patterns: Use a slice-searching function to locate patterns within the binary data.
- Store Results: Return a list of found patterns with their names and starting addresses.
Here’s the complete detect_patterns implementation:
use memchr::memmem;

#[derive(Debug, Clone)]
struct Pattern {
    name: &'static str,
    bytes: &'static [u8],
}

fn detect_patterns(chunk: &[u8], chunk_offset: usize, patterns: &[Pattern]) -> Vec<(String, usize)> {
    let mut results = Vec::new();
    for pattern in patterns {
        // An empty pattern matches at every position and would eventually
        // push `start` past the end of the chunk, so skip it.
        if pattern.bytes.is_empty() {
            continue;
        }
        let mut start = 0;
        while let Some(pos) = memmem::find(&chunk[start..], pattern.bytes) {
            let actual_pos = chunk_offset + start + pos;
            results.push((pattern.name.to_string(), actual_pos));
            // Resume one byte past this match so overlapping matches are found too.
            start += pos + 1;
        }
    }
    results
}
- Pattern Struct: We define a Pattern struct to store the name and byte sequence for each known pattern.
- Search Logic: For each pattern, we use memmem::find, a fast substring search, to locate occurrences of the pattern within the data.
- Result Collection: Each time a pattern is found, its name and address are stored in the results vector.
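To show how a pattern table might be defined and used, here is a minimal sketch with four well-known signatures (PDF, JPEG, ZIP, PNG). To keep the block free of external crates, it swaps memmem::find for a plain standard-library windows scan, which is slower but reports the same positions. The PATTERNS table, the detect_patterns_std name, and the sample data are assumptions made for this example.

```rust
#[derive(Debug, Clone)]
struct Pattern {
    name: &'static str,
    bytes: &'static [u8],
}

// Well-known file signatures; this particular table is illustrative.
const PATTERNS: &[Pattern] = &[
    Pattern { name: "PDF", bytes: b"%PDF" },
    Pattern { name: "JPEG", bytes: &[0xFF, 0xD8, 0xFF] },
    Pattern { name: "ZIP", bytes: b"PK\x03\x04" },
    Pattern { name: "PNG", bytes: b"\x89PNG\r\n\x1a\n" },
];

/// Std-only stand-in for `detect_patterns`: compares every window of the
/// pattern's length against the pattern (hypothetical name).
fn detect_patterns_std(chunk: &[u8], chunk_offset: usize, patterns: &[Pattern]) -> Vec<(String, usize)> {
    let mut results = Vec::new();
    for pattern in patterns {
        if pattern.bytes.is_empty() {
            continue; // `windows(0)` would panic
        }
        for (pos, window) in chunk.windows(pattern.bytes.len()).enumerate() {
            if window == pattern.bytes {
                results.push((pattern.name.to_string(), chunk_offset + pos));
            }
        }
    }
    results
}

fn main() {
    // Made-up dump: padding, a PDF header, two filler bytes, a PNG signature.
    let mut data = vec![0u8; 4];
    data.extend_from_slice(b"%PDF-1.7");
    data.extend_from_slice(&[0xAA, 0xBB]);
    data.extend_from_slice(b"\x89PNG\r\n\x1a\n");

    // Pretend this chunk starts at file offset 0x1000.
    for (name, addr) in detect_patterns_std(&data, 0x1000, PATTERNS) {
        println!("{:#06x}: {}", addr, name);
    }
}
```

Results are grouped by pattern in table order rather than sorted by address, so a caller that wants a single address-ordered listing would sort the vector by its second field.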



