Repair of Corrupted Document Files
Integrated Framework for Damaged Document Reconstruction
PDF and Microsoft Office document formats are widely used as digital records across administrative, legal, academic, and business domains owing to their ubiquity and reliability. They are increasingly collected as evidence in digital forensic investigations. Therefore, the ability to repair corrupted documents is crucial in forensic investigations, as losing data contained in them could result in the loss of critical evidence.
In this study, we propose a novel PDF (Portable Document Format) repair framework that automatically reconstructs object relationships and draws on a pre-constructed font database, enabling effective repair even when embedded fonts or Unicode mappings are missing. We evaluated the framework on 1,000 multilingual PDF files covering ten real-world corruption scenarios, and it consistently outperformed existing tools. Our dataset and proof-of-concept tool are available at: repdf.site.
Microsoft Office documents are stored in two primary formats: the legacy CFBF (Compound File Binary Format), which organizes data hierarchically as storages and streams, and the modern OOXML (Office Open XML), a ZIP-based container that holds multiple XML and media components. Our technique repairs damaged CFBF files by reconstructing the FAT (File Allocation Table) chain to recover data streams; for OOXML, it extracts valid XML and media components, rebuilds relationships, and repackages them into valid Office documents.
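As a rough illustration of what following a FAT chain involves, the sketch below walks a chain of sector numbers until it reaches the end-of-chain marker. It assumes the FAT has already been read into a list of unsigned 32-bit integers; the function name follow_fat_chain and the returned "intact" flag are hypothetical and are not taken from our tool.

```python
ENDOFCHAIN = 0xFFFFFFFE  # end-of-chain marker defined by the CFBF specification
FREESECT = 0xFFFFFFFF    # unallocated sector marker


def follow_fat_chain(fat, start_sector):
    """Return (sectors, intact) for the chain starting at start_sector.

    intact is False when the chain hits a cycle, a free sector, or an
    out-of-range entry, which is what a damaged FAT may look like.
    """
    chain = []
    seen = set()
    current = start_sector
    while current != ENDOFCHAIN:
        # Special markers such as FREESECT are always out of range here.
        if current >= len(fat) or current in seen:
            return chain, False
        seen.add(current)
        chain.append(current)
        current = fat[current]
    return chain, True
```

A chain that comes back with intact set to False marks the point where repair logic would have to decide how to continue, for example by carving the remaining sectors.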


Detection of Malicious Document Files
PDF, MS Office
Content 1
Analysis and Reconstruction of Data Fragments
Multimedia Data
An in-depth forensic examination of video files edited by Apple Photos
Uncover the hidden frames!
With the widespread availability of mobile and desktop video-editing tools, it has become increasingly feasible for individuals to alter digital evidence in ways that serve their interests. On Apple iOS and macOS platforms, the native Photos application stands out for its ability to edit videos without re-encoding them, leaving behind traces of manipulation such as metadata changes and unreferenced frames. Although many video players and commercial forensic tools overlook these meaningful artifacts, they can be crucial for revealing malicious editing behavior by a suspect. In this paper, we explore how the Photos application can be used to manipulate video files for potentially adversarial purposes and examine its impact on the underlying file structure. We then propose and implement detection methods that cover operations such as trimming, cropping, and rotation to identify these manipulations and recover any residual unreferenced frames. By testing various devices and operating system versions, we demonstrate the broad applicability of our approach, showing that between 1 and 245 unreferenced frames can be recovered. As a result, our research provides the forensic community with robust methods for classifying suspicious video files, identifying their editing techniques, and extracting residual data that can be valuable as evidence.
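One plausible way to locate candidate unreferenced data is to look for byte ranges inside the media data payload that no sample table entry points to. The sketch below assumes the sample offsets and sizes have already been parsed (for instance from the stco/co64 and stsz boxes); the function name unreferenced_ranges is hypothetical, and this is not necessarily the detection method used in the paper.

```python
def unreferenced_ranges(mdat_start, mdat_end, samples):
    """Return (offset, length) gaps in [mdat_start, mdat_end) not covered by samples.

    samples is an iterable of (offset, size) pairs taken from the sample tables.
    """
    gaps = []
    cursor = mdat_start
    for offset, size in sorted(samples):
        start = max(offset, mdat_start)
        end = min(offset + size, mdat_end)
        if end <= start:
            continue  # sample lies entirely outside the media data payload
        if start > cursor:
            gaps.append((cursor, start - cursor))
        cursor = max(cursor, end)
    if cursor < mdat_end:
        gaps.append((cursor, mdat_end - cursor))
    return gaps
```

Each returned gap is only a candidate region; whether it actually holds decodable frames still has to be checked separately.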
SQLCipher-Encrypted Data
Content 2


Cloud Data Collection and Analysis
FACT
FACT (Forensic Acquisition and Criminal Investigation Tool), an integrated digital forensic solution developed by the Digital Forensic Research Center at Korea University and Plainbit Co., Ltd., provides capabilities for collecting and analysing data from services related to secure messaging, cloud storage, anonymous networks, and cryptocurrencies, in order to counter anti-forensic activities encountered in real-world investigative environments.
Secure messenger data collection and analysis: FACT collects cloud resources using user credentials stored on local devices, and reconstructs contacts, chat histories, posts, and other relevant artefacts. It supports various services, including Telegram, Instagram, and Facebook Messenger, for chat data collection and visualization.
Metadata-based cloud forensics: FACT collects various types of metadata such as thumbnails, OCR results, and file history, and selectively retrieves data stored on the server based on metadata search results. It supports seven cloud services: MS OneDrive, Google Drive, MEGA, Box, Dropbox, Naver Mybox, and iCloud Drive.
Tracking of API Evolutions
FOREST: Inspecting and Tracking RESTful APIs for Cloud Forensic Readiness
As digital evidence increasingly resides in the cloud, forensic investigations must navigate constantly changing service interfaces. Many RESTful APIs—responsible for handling authentication, file access, and communication—are undocumented or evolve silently, breaking reproducibility and obscuring user traces. FOREST (Forensic Readiness via RESTful API Schema Tracking) introduces an automated framework that discovers, analyzes, and tracks undocumented APIs directly from live network traffic. By parsing HTTP sessions captured during natural user interactions, FOREST identifies user-relevant endpoints, extracts forensic artifacts such as identities and messages, and reconstructs their structures into OpenAPI Specifications.
The framework integrates AI-based filtering to detect user-related responses, dependency testing to infer required request parameters, and schema-based comparison to monitor API evolution over time. Applied to Microsoft OneDrive, Teams, and Mattermost, FOREST achieved over 90% precision in identifying forensic endpoints and successfully traced schema-level changes across service versions. FOREST establishes a foundation for API-centric cloud forensics, ensuring that investigators can reproduce and verify evidence acquisition in rapidly evolving environments while contributing standardized knowledge to the SOLVE-IT forensic database.
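To give a sense of what schema-based comparison can look like, the sketch below diffs two JSON-Schema-like dictionaries and reports added, removed, and retyped properties. It is a simplified illustration under the assumption that each schema uses nested "properties" and "type" keys; the function name diff_schema is hypothetical and does not represent FOREST's actual implementation.

```python
def diff_schema(old, new, path=""):
    """Recursively compare two JSON-Schema-like dicts and list property changes."""
    changes = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    for name in sorted(set(old_props) | set(new_props)):
        here = f"{path}.{name}" if path else name
        if name not in new_props:
            changes.append(("removed", here))
        elif name not in old_props:
            changes.append(("added", here))
        else:
            if old_props[name].get("type") != new_props[name].get("type"):
                changes.append(("type_changed", here))
            # Descend into nested objects present in both versions.
            changes.extend(diff_schema(old_props[name], new_props[name], here))
    return changes
```

Running such a comparison on schemas captured at different times would yield a change log that an investigator could review before reusing an acquisition procedure.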
Copyright Infringement Response
Illegal Streaming Devices
Content 1


Virtual Asset Forensics
Wallet Applications
Monero is a privacy-preserving cryptocurrency that leverages advanced cryptographic mechanisms to conceal transaction participants and amounts, ensuring strong untraceability. Nevertheless, forensic techniques can still uncover sensitive information through the analysis of off-chain artifacts such as memory and wallet files.
In our study, we perform a comprehensive forensic investigation of Monero’s wallet application, emphasizing the internal management of public and private keys as well as its data storage structures. We demonstrate how these cryptographic keys are maintained in memory and propose a memory scanning algorithm capable of detecting key-related data structures. Additionally, we examine the wallet’s key and cache files, introducing a method to decrypt and interpret serialized keys and transaction data encrypted with a user-defined passphrase. Our implementation, developed as an open-source Volatility3 plugin accompanied by dedicated decryption scripts, was evaluated across various cryptocurrency wallets that include Monero components.
Exchange Services
Content 2
Multi-source Off-chain Data Forensic Framework Against Transaction Obfuscation
Each cryptocurrency operates on its own blockchain. Chain-hopping is the practice of exchanging one cryptocurrency for another, and criminals use such cross-chain exchanges as a form of transaction obfuscation to launder illicit money. Although all transactions are recorded on blockchains, enabling investigators to trace cryptocurrency flows, assets moved via cross-chain exchanges are considerably more difficult to trace.
Our study performs digital forensic analysis not only on suspects’ devices but also on data stored in the cloud to trace obfuscated transaction flows. A multi-wallet can hold multiple cryptocurrencies and may offer an in-app exchange function to convert one currency into another. A web-interface exchange enables rapid conversion between cryptocurrencies using only a web browser. Because such exchange services retain records of exchange operations, digital forensics can be used to reconstruct a suspect’s cryptocurrency exchange history.
AI-assisted Forensics
Multimodal Data Analysis
SERENA
Content 1
Suspicious Conversation Detection
This study develops an AI-assisted forensic framework that detects illicit drug and cryptocurrency transactions hidden within messenger conversations. By combining NER, semantic retrieval (ChromaDB), and RAG-based LLM reasoning, the system interprets slang and contextual intent beyond simple keywords. The framework enhances the accuracy, transparency, and efficiency of digital investigations through context-aware evidence analysis.
Source Identification
Video
Video source identification using machine learning: A case study of 16 instant messaging applications
Who gave you this video?!
In recent years, there has been a notable increase in the prevalence of cybercrimes related to video content, including the distribution of illegal videos and the sharing of copyrighted material. This has led to the growing importance of identifying the source of video files to trace the owner of the files involved in the incident or identify the distributor. Previous research has concentrated on revealing the device (brand and/or model) that “originally” created a video file. This has been achieved by analysing the pattern noise generated by the image sensor in the camera, the storage structural features of the file, and the metadata patterns. However, due to the widespread use of mobile environments, instant messaging applications (IMAs) such as Telegram and Wire have been utilized to share illegal videos, which can result in the loss of information from the original file due to re-encoding at the application level, depending on the transmission settings. Consequently, it is necessary to extend the scope of existing research to identify the various applications that are capable of re-encoding video files in transit. Furthermore, it is essential to determine whether there are features that can be leveraged to distinguish them from the source identification perspective. In this paper, we propose a machine learning-based methodology for classifying the source application by extracting various features stored in the storage format and internal metadata of video files. To conduct this study, we analyzed 16 IMAs that are widely used in mobile environments and generated a total of 1,974 sample videos, taking into account both the transmission options and encoding settings offered by each IMA. The training and testing results on this dataset indicate that the ExtraTrees model achieved an identification accuracy of approximately 99.96%.
Furthermore, we developed a proof-of-concept tool based on the proposed method, which extracts the suggested features from videos and queries a pre-trained model. This tool is released as open-source software for the community.
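As one example of a storage-format feature, the order of top-level boxes in an MP4-style (ISOBMFF) file can differ between encoders. The sketch below extracts that order from raw bytes. Which features the proposed method actually uses is not specified here, so treating box order as a feature is an assumption, and the function name top_level_boxes is hypothetical.

```python
import struct


def top_level_boxes(data):
    """Return the ordered list of top-level ISOBMFF box types found in data."""
    boxes = []
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:  # a 64-bit "largesize" follows the type field
            if pos + 16 > len(data):
                break
            size = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
            header = 16
        elif size == 0:  # box extends to the end of the file
            size = len(data) - pos
        if size < header:
            break  # malformed header; stop rather than loop forever
        boxes.append(box_type.decode("latin-1"))
        pos += size
    return boxes
```

The resulting list (for example ftyp, moov, mdat versus ftyp, mdat, moov) could then be encoded as a categorical input to a classifier.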
Audio
Content 1


Forensics on AI-related Systems
Local LLM Applications
Content 1
Deepfake Apps and Services
Content 2
Drone Forensics
(Drone Forensics Team) .. Week Demo
Content 1


Embedded Device Forensics
Consent-Based Mobile Data Collection
Content 1
OS Log Analysis
iOS Sysdiagnose
Intelligent Framework for Automated Digital Trace Analysis
iOS Sysdiagnose serves as a comprehensive diagnostic bundle containing unified logs, power events, and process histories that capture every aspect of device operation. Traditionally, mobile forensics relied on privilege escalation or zero-day vulnerabilities to extract evidence, but such approaches are increasingly infeasible on hardened systems. SIREN (Sysdiagnose Intelligence & Report ENgine) introduces a novel forensic framework that reconstructs user behavior directly from system-level logs—without exploiting vulnerabilities. By correlating logarchive and Powerlog datasets, SIREN systematically normalizes temporal inconsistencies and rebuilds event sequences such as app usage, camera activity, battery patterns, and network transmissions.
The framework integrates 15 analysis modules and an LLM-based report generator that summarizes findings across thousands of raw entries into structured, human-readable evidence. In evaluations, SIREN successfully reconstructed behavioral timelines in cases of data exfiltration, cyberstalking, and identity theft, reducing manual analysis time from days to minutes. This work demonstrates a significant advancement in iOS digital forensics by establishing reliable, reproducible evidence acquisition through system logs—enhancing both efficiency and evidential integrity.
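The idea of normalizing timestamps from several logs and rebuilding a single event sequence can be sketched as follows. The example assumes each source has a known, fixed clock offset in seconds, which is a simplification; how SIREN actually resolves temporal inconsistencies is not described here, and the function name merge_timelines and the sample events are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def merge_timelines(sources):
    """Merge events from several logs into one UTC-ordered timeline.

    sources maps a source name to (offset_seconds, events), where events is a
    list of (naive_datetime, description) and offset_seconds is the correction
    to add so that the source's clock lines up with UTC.
    """
    merged = []
    for name, (offset_seconds, events) in sources.items():
        shift = timedelta(seconds=offset_seconds)
        for stamp, description in events:
            utc_stamp = (stamp + shift).replace(tzinfo=timezone.utc)
            merged.append((utc_stamp, name, description))
    merged.sort(key=lambda item: item[0])
    return merged


timeline = merge_timelines({
    "logarchive": (0, [(datetime(2024, 1, 1, 12, 0, 5), "camera opened")]),
    "powerlog": (-3600, [(datetime(2024, 1, 1, 13, 0, 0), "app foreground")]),
})
```

In this made-up example the Powerlog entry is shifted back one hour and therefore sorts before the logarchive entry.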
Maritime Equipment
Content 2
Multimedia Forensics
Structure-based Forgery Detection
Metadata-based audio file authenticity analysis framework: Galaxy ecosystem as a study
We talked about this! Don’t you remember?!
Verifying the authenticity of audio recordings is difficult when files are lightly edited, re-encoded, or moved across devices in ways that keep waveforms plausible while altering provenance. This paper presents a metadata-centered framework that complements signal-level detectors. We catalog on-device media artifacts, focusing on Android MediaStore records such as creation and modification times, acquisition times, application package provenance, and bitrates, and relate them to ISOBMFF fields. We combine these sources with application and filesystem traces to reconstruct timelines through preparation, acquisition, and examination phases. Controlled case studies cover genuine Android recordings and two tampered scenarios involving smartphone to smartwatch and smartphone to Windows PC transfers with trimming and copy-back insertion. Patterns such as synchronized timestamp resets, unexpected package provenance, and bitrate shifts reveal edits even when audio sounds natural. The approach is limited by device-specific schemas and access constraints but offers a reproducible, low-overhead basis for authenticating everyday audio evidence.


Operating System Forensics
Built-in Anti-Forensic Features
Modern operating systems incorporate various automated management features to enhance system performance and optimize storage efficiency. Among these, some operate as anti-forensic mechanisms by automatically deleting artifacts and files. The Disk Cleanup feature serves as a representative example of such built-in anti-forensic functionality within the operating system. While it helps manage storage space by removing unnecessary files, it can also lead to the unintended deletion of artifacts and files that may be crucial for forensic investigations. This may result in the loss of forensic evidence or even create circumstances that could be misinterpreted as deliberate evidence tampering.
Our study focuses on the disk cleanup functions built into operating systems such as Windows to analyze whether these built-in mechanisms can act as anti-forensic features by deleting or altering digital artifacts. Through reverse engineering and file analysis, we identify the internal structure and operational principles of these functions. We then conduct experiments to observe whether artifacts can be automatically deleted without user intervention. Based on the analysis and experimental results, this research proposes a forensic investigation procedure that accounts for anti-forensic behaviors embedded within the operating system.
Development of DFIR Infrastructures
Software Hash Database
Content 1
Forensic Tool Testing
Content 2
Multi-purpose Synthetic Datasets
Content 1
Logical Image Formats
Content 2


Similarity Comparison
Design Drawing Files
Guarding Engineering Design Drawings: Detecting Data Leaks via Structural Similarity in AutoCAD and OrCAD Files
As industrial environments become increasingly digitalized, engineering drawings and related technical files have emerged as key assets representing an organization’s core technologies. However, leaks of such high-value design data are growing more frequent, threatening both corporate competitiveness and national technological security.
In this study, we aim to detect technology leaks by analyzing the structural characteristics of engineering design files. Targeting AutoCAD and OrCAD drawings, we examine their internal file structures to identify distinctive features and develop a similarity-based comparison algorithm. Using a custom dataset built from real and generated drawings, the proposed method demonstrates strong effectiveness and practicality. The results highlight its potential for protecting industrial technology and supporting digital forensic investigations.
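As a generic illustration of similarity-based comparison over structural features, the sketch below computes the Jaccard similarity of two feature sets. The choice of Jaccard similarity and the example feature names are assumptions made for illustration; the algorithm developed in this study may differ.

```python
def jaccard_similarity(features_a, features_b):
    """Return |A ∩ B| / |A ∪ B| for two collections of hashable features."""
    set_a, set_b = set(features_a), set(features_b)
    if not set_a and not set_b:
        return 1.0  # two empty feature sets are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical structural features extracted from two drawings.
score = jaccard_similarity(
    {"layer:A", "block:B", "block:C"},
    {"block:B", "block:C", "layer:D"},
)
```

A score near 1.0 would suggest that two files share most of their structural elements, which could flag a candidate leak for closer inspection.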
MS Office
Layout-based similar document search through representative image creation: MS PowerPoint as a case study
Find my twin.
As work environments have become fully digitized, digital documents are now used in most work. For digital forensic investigators who must quickly select the documents related to a case, the sheer number of digital documents makes investigations difficult. In eDiscovery in particular, it is important to find meaningful digital evidence by analyzing the associations between many documents and files within a limited time. In a digital forensic investigation, identifying document types and selecting documents of a similar type makes it possible to group only the documents created by a specific organization.
In this paper, we present a method for generating a single representative image from the per-page images of a document, and a method for searching for similar documents by comparing the image hashes of these representative images. Experiments on about 50,000 Microsoft PowerPoint files from the Govdocs1 dataset and about 6,000 Microsoft PowerPoint files from the NapierOne dataset demonstrate the practicality of the method.
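Image-hash comparison can be illustrated with a simple average hash and a Hamming distance. The paper does not name the specific hash here, so average hash is used as a common stand-in; the sketch also assumes the representative image has already been reduced to a small grid of grayscale values, and the function names are hypothetical.

```python
def average_hash(pixels):
    """Compute an average hash from a 2D list of grayscale values (0-255).

    Each bit is 1 when the pixel is brighter than the image mean.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count differing bits between two hashes; smaller means more similar."""
    return bin(hash_a ^ hash_b).count("1")
```

Two representative images whose hashes differ in only a few bits would be treated as layout-similar, with the threshold chosen empirically.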
Analysis of Embedded Device Firmware
FirmAware
Content 1
VEAM+FirmBase
Content 2
FIRE
Content 1
FBOM Generation
Content 2


Monitoring of Online Data for Cybersecurity and Forensics
Internet-Exposed Sensitive Systems
Content 1
National Security Threat Activities
This study investigates a multimodal AI framework for monitoring and detecting terrorist propaganda, recruitment, and fundraising activities across social media and online communities. Through social media mining, the research collects and analyzes open-source data, integrating OCR, ASR, and NER across text, image, audio, and video modalities to identify key indicators such as Telegram IDs, cryptocurrency wallets, and donation messages. The framework aims to provide early evidence and investigative leads that can assist law enforcement in tracing financial networks and initiating formal investigations.