Repair of Corrupted Document Files

Integrated Framework for Damaged Document Reconstruction

PDF and Microsoft Office document formats are widely used as digital records across administrative, legal, academic, and business domains owing to their ubiquity and reliability. They are increasingly collected as evidence in digital forensic investigations. Therefore, the ability to repair corrupted documents is crucial in forensic investigations, as losing data contained in them could result in the loss of critical evidence.

In this study, we propose a novel PDF (Portable Document Format) repair framework that automatically reconstructs object relationships and draws on a pre-constructed font database, enabling effective repair even when embedded fonts or Unicode mappings are missing. We evaluated the framework on 1,000 multilingual PDF files covering ten real-world corruption scenarios, and it consistently outperformed existing tools. Our dataset and proof-of-concept tool are available at: repdf.site.

Microsoft Office documents are stored in two primary formats: the legacy CFBF (Compound File Binary Format), which organizes data hierarchically as storages and streams, and the modern OOXML (Office Open XML), a ZIP-based container that holds multiple XML and media components. Our technique repairs damaged CFBF files by reconstructing the FAT (File Allocation Table) chain to recover data streams; for OOXML, it extracts valid XML and media components, rebuilds relationships, and repackages them into valid Office documents.
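To make the FAT-chain idea concrete, the following is a minimal sketch of walking one stream's sector chain in a CFBF file and stopping where the chain is damaged. It is an illustration only, not the repair algorithm of the framework described above: the function names `follow_fat_chain` and `read_stream` are hypothetical, and the FAT is assumed to be already loaded as a list of integers.

```python
ENDOFCHAIN = 0xFFFFFFFE  # CFBF marker for the last sector of a chain


def follow_fat_chain(fat, start_sector):
    """Return the ordered sector numbers of one stream, stopping at damage."""
    chain = []
    seen = set()
    sector = start_sector
    while sector != ENDOFCHAIN:
        # An out-of-range pointer (including the special free/FAT markers)
        # or a loop means the chain is corrupted past this point.
        if sector >= len(fat) or sector in seen:
            break
        chain.append(sector)
        seen.add(sector)
        sector = fat[sector]
    return chain


def read_stream(sectors, chain, size):
    """Concatenate the sectors of a chain and cut to the declared stream size."""
    data = b"".join(sectors[i] for i in chain)
    return data[:size]
```

A repair tool would typically go further, for example by guessing the missing part of a broken chain from sector contents; this sketch only recovers what the surviving FAT entries still describe.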

#Corrupted File Repair #Electronic Document #PDF #MS Office #Reference Data

Detection of Malicious Document Files

PDF, MS Office


Analysis and Reconstruction of Data Fragments

Multimedia Data

An in-depth forensic examination of video files edited by Apple Photos

Uncover the hidden frames!
With the widespread availability of mobile and desktop video-editing tools, it has become increasingly feasible for individuals to alter digital evidence in ways that serve their interests. On Apple iOS and macOS platforms, the native Photos application stands out for its ability to edit videos without re-encoding them, leaving behind traces of manipulation such as metadata changes and unreferenced frames. Although many video players and commercial forensic tools overlook these meaningful artifacts, they can be crucial for revealing malicious editing behavior by a suspect. In this paper, we explore how the Photos application can be used to manipulate video files for potentially adversarial purposes and examine its impact on the underlying file structure. We then propose and implement detection methods that cover operations such as trimming, cropping, and rotation to identify these manipulations and recover any residual unreferenced frames. By testing various devices and operating system versions, we demonstrate the broad applicability of our approach, showing that between 1 and 245 unreferenced frames can be recovered. As a result, our research provides the forensic community with robust methods for classifying suspicious video files, identifying their editing techniques, and extracting residual data that can be valuable as evidence.
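One way to picture "unreferenced frames" is as byte ranges inside the media data box (`mdat`) that no entry in the sample tables points to. The sketch below locates such ranges from a list of referenced samples. It is a simplified illustration, not the detection method of the paper: the function name is hypothetical, and the `(offset, size)` pairs are assumed to have been extracted from the sample tables beforehand.

```python
def unreferenced_ranges(mdat_start, mdat_end, samples):
    """Return [start, end) byte ranges in mdat not covered by any sample.

    samples is a list of (offset, size) pairs taken from the sample tables.
    """
    gaps = []
    cursor = mdat_start
    for offset, size in sorted(samples):
        if offset > cursor:
            # Bytes between the previous sample and this one are unreferenced.
            gaps.append((cursor, min(offset, mdat_end)))
        cursor = max(cursor, offset + size)
    if cursor < mdat_end:
        gaps.append((cursor, mdat_end))
    return gaps
```

Each returned range is only a candidate; whether it actually holds decodable AVC or HEVC frames has to be checked separately.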

#Multimedia forensics #Video tampering #Apple devices #AVC #HEVC

SQLCipher-Encrypted Data


Cloud Data Collection and Analysis

FACT

FACT (Forensic Acquisition and Criminal Investigation Tool), an integrated digital forensic solution developed by the Digital Forensic Research Center at Korea University and Plainbit Co., Ltd., provides capabilities for collecting and analysing data from services related to secure messaging, cloud storage, anonymous networks, and cryptocurrencies, in order to counter anti-forensic activities encountered in real-world investigative environments.
Secure messenger data collection and analysis: FACT collects cloud resources using user credentials stored on local devices, and reconstructs contacts, chat histories, posts, and other relevant artefacts. It supports various services, including Telegram, Instagram, and Facebook Messenger, for chat data collection and visualization.
Metadata-based cloud forensics: FACT collects various types of metadata such as thumbnails, OCR results, and file history, and selectively retrieves data stored on the server based on metadata search results. It supports seven cloud services, including MS OneDrive, Google Drive, MEGA, Box, Dropbox, Naver Mybox, and iCloud Drive.

#Cloud Forensics #Cloud Data Collection #Secure Messenger #Cloud Storage #Forensic Tool

Tracking of API Evolutions

FOREST: Inspecting and Tracking RESTful APIs for Cloud Forensic Readiness

As digital evidence increasingly resides in the cloud, forensic investigations must navigate constantly changing service interfaces. Many RESTful APIs—responsible for handling authentication, file access, and communication—are undocumented or evolve silently, breaking reproducibility and obscuring user traces. FOREST (Forensic Readiness via RESTful API Schema Tracking) introduces an automated framework that discovers, analyzes, and tracks undocumented APIs directly from live network traffic. By parsing HTTP sessions captured during natural user interactions, FOREST identifies user-relevant endpoints, extracts forensic artifacts such as identities and messages, and reconstructs their structures into OpenAPI Specifications.
The framework integrates AI-based filtering to detect user-related responses, dependency testing to infer required request parameters, and schema-based comparison to monitor API evolution over time. Applied to Microsoft OneDrive, Teams, and Mattermost, FOREST achieved over 90% precision in identifying forensic endpoints and successfully traced schema-level changes across service versions. FOREST establishes a foundation for API-centric cloud forensics, ensuring that investigators can reproduce and verify evidence acquisition in rapidly evolving environments while contributing standardized knowledge to the SOLVE-IT forensic database.
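Schema-based comparison can be sketched as a diff over flattened property paths. The example below assumes response schemas are available as JSON-Schema-like dictionaries with `properties` and `type` keys; the function names and the three change categories are illustrative choices, not FOREST's actual implementation.

```python
def flatten(schema, prefix=""):
    """Map each property path in a JSON-Schema-like dict to its declared type."""
    fields = {}
    for name, spec in schema.get("properties", {}).items():
        path = f"{prefix}{name}"
        fields[path] = spec.get("type", "unknown")
        if spec.get("type") == "object":
            # Recurse into nested objects, using dotted paths.
            fields.update(flatten(spec, prefix=path + "."))
    return fields


def diff_schemas(old, new):
    """Report fields added, removed, or retyped between two schema versions."""
    a, b = flatten(old), flatten(new)
    return {
        "added": sorted(b.keys() - a.keys()),
        "removed": sorted(a.keys() - b.keys()),
        "retyped": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }
```

Running such a diff on schemas captured at different times gives a simple record of how an endpoint's response structure changed.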

#Cloud Forensics #RESTful API #Undocumented API #API Evolution Tracking #Forensic Infrastructure

Virtual Asset Forensics

Wallet Applications

Monero is a privacy-preserving cryptocurrency that leverages advanced cryptographic mechanisms to conceal transaction participants and amounts, ensuring strong untraceability. Nevertheless, forensic techniques can still uncover sensitive information through the analysis of off-chain artifacts such as memory and wallet files.
In our study, we perform a comprehensive forensic investigation of Monero’s wallet application, emphasizing the internal management of public and private keys as well as its data storage structures. We demonstrate how these cryptographic keys are maintained in memory and propose a memory scanning algorithm capable of detecting key-related data structures. Additionally, we examine the wallet’s key and cache files, introducing a method to decrypt and interpret serialized keys and transaction data encrypted with a user-defined passphrase. Our implementation, developed as an open-source Volatility3 plugin accompanied by dedicated decryption scripts, was evaluated across various cryptocurrency wallets that include Monero components.

#Live Forensics #Memory Forensics #Cryptocurrency #Transaction Tracing #Monero #Forensic Tool Development

Exchange Services


Multi-source Off-chain Data Forensic Framework Against Transaction Obfuscation

Cryptocurrencies operate on their respective blockchains. Chain-hopping is the technique of exchanging one cryptocurrency for another. Criminals use cross-chain exchanges, a form of transaction obfuscation, to launder illicit money. Although all transactions are recorded on blockchains, enabling investigators to trace cryptocurrency flows, assets moved via cross-chain exchanges are considerably more difficult to trace.
Our study performs digital forensic analysis not only on suspects’ devices but also on data stored in the cloud to trace obfuscated transaction flows. A multi-wallet can hold multiple cryptocurrencies and may offer an in-app exchange function to convert one currency into another. A web-interface exchange enables rapid conversion between cryptocurrencies using only a web browser. Because such exchange services retain records of exchange operations, digital forensics can be used to reconstruct a suspect’s cryptocurrency exchange history.

#Cryptocurrency Forensics #Off-chain Data Forensics #Transaction Tracing #Transaction Deobfuscation #Cross-chain Exchange #Forensics Framework

AI-assisted Forensics

Multimodal Data Analysis

SERENA


Suspicious Conversation Detection

This study develops an AI-assisted forensic framework that detects illicit drug and cryptocurrency transactions hidden within messenger conversations. By combining NER, semantic retrieval (ChromaDB), and RAG-based LLM reasoning, the system interprets slang and contextual intent beyond simple keywords. The framework enhances the accuracy, transparency, and efficiency of digital investigations through context-aware evidence analysis.

#NER(Named Entity Recognition) #RAG(Retrieval-Augmented Generation) #LLM(Large Language Model) #Illegal Drug Trade Chat Detection

Source Identification

Video

Video source identification using machine learning: A case study of 16 instant messaging applications

Who gave you this video?!
In recent years, cybercrimes involving video content, including the distribution of illegal videos and the sharing of copyrighted material, have increased notably. As a result, identifying the source of a video file has become increasingly important for tracing the owner of the files involved in an incident or identifying the distributor. Previous research has concentrated on revealing the device (brand and/or model) that “originally” created a video file, by analysing the pattern noise generated by the camera's image sensor, the structural features of the file's storage format, and metadata patterns. However, with the widespread use of mobile environments, instant messaging applications (IMAs) such as Telegram and Wire are used to share illegal videos, and information from the original file can be lost through re-encoding at the application level, depending on the transmission settings. Consequently, existing research needs to be extended to identify the various applications capable of re-encoding video files in transit, and to determine whether there are features that can distinguish them from a source identification perspective. In this paper, we propose a machine learning-based methodology for classifying the source application by extracting various features stored in the storage format and internal metadata of video files. To conduct this study, we analyzed 16 IMAs that are widely used in mobile environments and generated a total of 1,974 sample videos, taking into account both the transmission options and encoding settings offered by each IMA. Training and testing on this dataset indicate that the ExtraTrees model achieved an identification accuracy of approximately 99.96%.
Furthermore, we developed a proof-of-concept tool based on the proposed method, which extracts the suggested features from videos and queries a pre-trained model. This tool is released as open-source software for the community.
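As an illustration of what a storage-format feature might look like, the sketch below lists the top-level ISOBMFF box types of a video file in order. The order and presence of boxes is one plausible structural feature; the actual feature set of the released tool is not reproduced here, and the function name is hypothetical.

```python
import struct


def top_level_boxes(data):
    """Return the four-character types of the top-level ISOBMFF boxes in order."""
    types = []
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            if pos + 16 > len(data):
                break
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:  # the box extends to the end of the file
            size = len(data) - pos
        if size < header:
            break  # malformed size; stop rather than loop forever
        types.append(box_type.decode("latin-1"))
        pos += size
    return types
```

The resulting sequence (for example, whether `moov` precedes or follows `mdat`) could be encoded as categorical input to a classifier such as ExtraTrees.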

#Multimedia forensics #AVC #HEVC #Source identification #Machine learning

Audio


Forensics on AI-related Systems

Local LLM Applications


Deepfake Apps and Services


Drone Forensics

(Drone Forensics Team) .. Week Demo


Embedded Device Forensics

Consent-Based Mobile Data Collection


OS Log Analysis

iOS Sysdiagnose

Intelligent Framework for Automated Digital Trace Analysis

iOS Sysdiagnose serves as a comprehensive diagnostic bundle containing unified logs, power events, and process histories that capture every aspect of device operation. Traditionally, mobile forensics relied on privilege escalation or zero-day vulnerabilities to extract evidence, but such approaches are increasingly infeasible on hardened systems. SIREN (Sysdiagnose Intelligence & Report ENgine) introduces a novel forensic framework that reconstructs user behavior directly from system-level logs—without exploiting vulnerabilities. By correlating logarchive and Powerlog datasets, SIREN systematically normalizes temporal inconsistencies and rebuilds event sequences such as app usage, camera activity, battery patterns, and network transmissions.
The framework integrates 15 analysis modules and an LLM-based report generator that summarizes findings across thousands of raw entries into structured, human-readable evidence. In evaluations, SIREN successfully reconstructed behavioral timelines in cases of data exfiltration, cyberstalking, and identity theft, reducing manual analysis time from days to minutes. This work demonstrates a significant advancement in iOS digital forensics by establishing reliable, reproducible evidence acquisition through system logs—enhancing both efficiency and evidential integrity.
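The correlation step can be pictured as merging events from several logs onto one corrected clock. The sketch below assumes each source comes with a known clock correction in seconds; how SIREN actually derives such corrections is not described here, and the function name and data layout are hypothetical.

```python
from datetime import datetime, timezone


def merge_timelines(sources):
    """Merge events from several logs into one UTC-ordered timeline.

    sources maps a source name to (offset_seconds, events), where events is
    a list of (unix_timestamp, description) and offset_seconds is the
    correction to add to that source's clock.
    """
    merged = []
    for name, (offset, events) in sources.items():
        for ts, description in events:
            when = datetime.fromtimestamp(ts + offset, tz=timezone.utc)
            merged.append((when, name, description))
    # Sort on the corrected time only, so ties keep their insertion order.
    merged.sort(key=lambda event: event[0])
    return merged
```

With the corrections applied, events from different logs that looked out of order on their raw timestamps fall into a single consistent sequence.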

#iOS Forensics #iOS Sysdiagnose #User Behavior #AI Report Generation

Maritime Equipment


Multimedia Forensics

Structure-based Forgery Detection

Metadata-based audio file authenticity analysis framework: Galaxy ecosystem as a study

We talked about this! Don’t you remember?!
Verifying the authenticity of audio recordings is difficult when files are lightly edited, re-encoded, or moved across devices in ways that keep waveforms plausible while altering provenance. This paper presents a metadata-centered framework that complements signal-level detectors. We catalog on-device media artifacts, focusing on Android MediaStore records such as creation and modification times, acquisition times, application package provenance, and bitrates, and relate them to ISOBMFF fields. We combine these sources with application and filesystem traces to reconstruct timelines through preparation, acquisition, and examination phases. Controlled case studies cover genuine Android recordings and two tampered scenarios involving smartphone to smartwatch and smartphone to Windows PC transfers with trimming and copy-back insertion. Patterns such as synchronized timestamp resets, unexpected package provenance, and bitrate shifts reveal edits even when audio sounds natural. The approach is limited by device-specific schemas and access constraints but offers a reproducible, low-overhead basis for authenticating everyday audio evidence.
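Relating MediaStore records to ISOBMFF fields requires putting both on the same time base: ISOBMFF header times count seconds from 1904-01-01 UTC, while MediaStore timestamps such as the date-added value count seconds from the Unix epoch. The following sketch converts between the two and flags a disagreement. The tolerance value and the function names are assumptions for illustration, not rules taken from the paper.

```python
# Seconds between the ISOBMFF epoch (1904-01-01 UTC) and the Unix epoch.
MP4_EPOCH_OFFSET = 2082844800


def mp4_time_to_unix(mp4_seconds):
    """Convert an ISOBMFF creation or modification time to Unix seconds."""
    return mp4_seconds - MP4_EPOCH_OFFSET


def timestamp_mismatch(container_creation, mediastore_date_added, tolerance=5):
    """True when the container and MediaStore times differ beyond tolerance.

    tolerance (seconds) is an illustrative threshold, not an established rule.
    """
    delta = abs(mp4_time_to_unix(container_creation) - mediastore_date_added)
    return delta > tolerance
```

A mismatch on its own is not proof of tampering; it is one signal to weigh alongside package provenance and bitrate changes.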

#Audio authenticity #Android MediaStore #Multimedia forensics #Galaxy devices

Operating System Forensics

Built-in Anti-Forensic Features

Modern operating systems incorporate various automated management features to enhance system performance and optimize storage efficiency. Among these, some operate as anti-forensic mechanisms by automatically deleting artifacts and files. The Disk Cleanup feature serves as a representative example of such built-in anti-forensic functionality within the operating system. While it helps manage storage space by removing unnecessary files, it can also lead to the unintended deletion of artifacts and files that may be crucial for forensic investigations. This may result in the loss of forensic evidence or even create circumstances that could be misinterpreted as deliberate evidence tampering.
Our study focuses on the disk cleanup functions built into operating systems such as Windows to analyze whether these built-in mechanisms can act as anti-forensic features by deleting or altering digital artifacts. Through reverse engineering and file analysis, we identify the internal structure and operational principles of these functions. We then conduct experiments to observe whether artifacts can be automatically deleted without user intervention. Based on the analysis and experimental results, this research proposes a forensic investigation procedure that accounts for anti-forensic behaviors embedded within the operating system.

#Anti-Forensic #Disk Cleanup #Artifacts Deletion #Forensic Investigation Procedure

Development of DFIR Infrastructures

Software Hash Database


Forensic Tool Testing


Multi-purpose Synthetic Datasets


Logical Image Formats


Similarity Comparison

Design Drawing Files

Guarding Engineering Design Drawings: Detecting Data Leaks via Structural Similarity in AutoCAD and OrCAD Files

As industrial environments become increasingly digitalized, engineering drawings and related technical files have emerged as key assets representing an organization’s core technologies. However, leaks of such high-value design data are growing more frequent, threatening both corporate competitiveness and national technological security.
In this study, we aim to detect technology leaks by analyzing the structural characteristics of engineering design files. Targeting AutoCAD and OrCAD drawings, we examine their internal file structures to identify distinctive features and develop a similarity-based comparison algorithm. Using a custom dataset built from real and generated drawings, the proposed method demonstrates strong effectiveness and practicality. The results highlight its potential for protecting industrial technology and supporting digital forensic investigations.
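A very simple form of structure-based similarity is the Jaccard index over sets of structural features extracted from two files. The sketch below uses that measure as a stand-in; the study's own comparison algorithm and feature definitions are not reproduced here, and the feature labels in the usage example are purely illustrative.

```python
def jaccard(features_a, features_b):
    """Jaccard similarity between two collections of structural features."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 1.0  # two empty feature sets are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical feature labels for two drawings.
original = {"layer:walls", "layer:wiring", "block:door", "block:window"}
suspect = {"layer:walls", "layer:wiring", "block:door", "block:valve"}
score = jaccard(original, suspect)
```

A high score between an internal drawing and a file found on a suspect's device would flag the pair for closer manual inspection.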

#Design Drawing File Forensics #Structural Similarity Analysis #Data Leak Detection

MS Office

Layout-based similar document search through representative image creation: MS PowerPoint as a case study

Find my twin.
As workplaces have become fully digitized, most work now relies on digital documents. For digital forensic investigators who must quickly select the documents relevant to a case, the sheer volume of digital documents makes investigations difficult. In eDiscovery in particular, it is important to find meaningful digital evidence by analyzing associations among many documents and files within a limited time. In a digital forensic investigation, identifying document types and selecting documents of a similar type from a large collection makes it possible to group only the documents created by a specific organization.
In this paper, we present a method for generating a single image that represents a document from its per-page images (one image is stored for each page), and a method for searching for similar documents by applying an image hash to compare the representative images. Experiments on about 50,000 Microsoft PowerPoint files from the Govdocs1 dataset and about 6,000 Microsoft PowerPoint files from the NapierOne dataset demonstrate the practicality of the proposed method.
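To show how an image hash supports similarity search, the sketch below implements an average hash over a small grayscale image and compares two hashes by Hamming distance. The average hash is chosen here only because it is easy to follow; the text above does not state which image hash the method uses, and the function names are hypothetical.

```python
def average_hash(pixels):
    """Hash a small grayscale image (rows of 0-255 ints) into a list of bits."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    # Each bit records whether a pixel is brighter than the image mean.
    return [1 if p > mean else 0 for p in flat]


def hamming(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))
```

In practice the representative image would first be downscaled to a fixed small size (for example 8x8) so that every document yields a hash of the same length; a small Hamming distance then suggests a similar layout.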

#Document File Forensics #Document Layout Analysis #Image Hash #Image Similarity

Analysis of Embedded Device Firmware

FirmAware


VEAM+FirmBase


FIRE


FBOM Generation


Monitoring of Online Data for Cybersecurity and Forensics

Internet-Exposed Sensitive Systems


National Security Threat Activities

This study investigates a multimodal AI framework for monitoring and detecting terrorist propaganda, recruitment, and fundraising activities across social media and online communities. Through social media mining, the research collects and analyzes open-source data, integrating OCR, ASR, and NER across text, image, audio, and video modalities to identify key indicators such as Telegram IDs, cryptocurrency wallets, and donation messages. The framework aims to provide early evidence and investigative leads that can assist law enforcement in tracing financial networks and initiating formal investigations.
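Once OCR and ASR have turned images, audio, and video into text, indicator extraction can start from simple pattern matching before any NER model is applied. The sketch below pulls Telegram-style handles and legacy-format Bitcoin addresses out of a text string. Both patterns are deliberately simplified assumptions (no checksum validation, one address format only) and are not the framework's actual extraction rules.

```python
import re

# Simplified patterns; real pipelines would validate matches further.
TELEGRAM_HANDLE = re.compile(r"(?<![\w@])@([A-Za-z][A-Za-z0-9_]{4,31})\b")
BTC_LEGACY = re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b")


def extract_indicators(text):
    """Return candidate Telegram handles and Bitcoin addresses found in text."""
    return {
        "telegram": TELEGRAM_HANDLE.findall(text),
        "btc": BTC_LEGACY.findall(text),
    }
```

Matches from such patterns are only leads; they still need validation and context before they can support an investigation.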

#Terrorism Detection #National Security #Policy and Technology Integration #Social Media Mining #Multimodal AI #OCR #ASR #NER