MITRE ATT&CK T1005 Data from Local System
Data theft is a critical concern for organizations, and attackers often seek to extract sensitive information directly from compromised systems. Adversaries collect files and other valuable data stored on infected endpoints in both targeted cyber espionage campaigns and financially motivated cybercrime, as stolen information can be exploited for further attacks, sold on underground markets, or leveraged for ransom demands.
In this blog post, we explain the T1005 Data from Local System technique of the MITRE ATT&CK® framework and examine, with real-world attack examples, how adversaries use it.
Adversary Use of Data from Local System
Adversaries collect data from local systems as part of their broader strategy to achieve their objectives, which may include espionage, financial gain, or disruption. By accessing files directly from the local system, attackers can gather data efficiently without raising suspicion, especially in environments where monitoring or detection mechanisms may not cover such activities comprehensively.
The process typically starts with adversaries identifying and locating the files they intend to access. This could involve using file search utilities, scripts, or built-in operating system commands to find data with specific extensions or stored in common directories. They may target files containing credentials, intellectual property, personally identifiable information (PII), or other sensitive content. Advanced attackers might also search for configuration files, logs, or cached data that can provide additional insights or lead to further exploitation.
The reasons adversaries rely on this technique are closely tied to the accessibility and strategic importance of local data. Files on a local system often contain critical information, and their retrieval can provide attackers with immediate value or serve as stepping stones for lateral movement or privilege escalation. For example, collecting locally stored credentials might allow an attacker to impersonate a legitimate user or gain access to restricted systems. Similarly, exfiltrating proprietary documents can lead to significant financial or reputational damage to the victim organization.
Adversaries also prefer this method because it allows them to operate covertly. Accessing data locally avoids triggering network-based defenses, such as data loss prevention (DLP) systems, which are often focused on monitoring external communications. By collecting data locally first and exfiltrating it later in carefully staged steps, attackers can further reduce the likelihood of detection.
When leveraging the T1005 Data from Local System technique, adversaries often rely on native system tools and commands because these utilities are pre-installed on most operating systems, making them highly accessible. Using such tools allows attackers to avoid introducing custom malware or external binaries, which are more likely to trigger security alerts. Native tools are trusted by the operating system and often used by legitimate users and administrators, enabling attackers to blend their malicious activities into regular system operations and evade detection by traditional security solutions.
Native Tools and Commands Used to Collect Local Files
Adversaries can leverage various built-in operating system tools and commands to locate and collect data from compromised local systems. This section examines in detail the native tools and commands that adversaries abuse on the major operating systems.
1. dir (Windows)
The dir command, which is built into Windows, allows attackers to browse the file system, list files in specific directories, and identify data that might be valuable for collection. By combining dir with various switches, attackers can filter results and focus on files of interest. For example, the /s option searches recursively through subdirectories, while /a displays hidden and system files that may contain sensitive information. Additionally, attackers might use wildcards (e.g., *.docx or *.txt) to search for files with specific extensions commonly associated with valuable data.
In August 2024, the Voldemort backdoor was reported to use the dir command to list the folders and files on compromised systems [1].
2. findstr (Windows)
The findstr command is a native utility designed to search for patterns or keywords within the contents of files. Its versatility and ability to filter through large volumes of data make it an effective tool for attackers to pinpoint sensitive information without having to examine each file manually.
Typically, an adversary may combine findstr with directory enumeration commands like dir to streamline the process of identifying and accessing targeted data. For instance, after locating files of interest using dir or similar tools, they might execute a command such as findstr "password" *.txt to search for occurrences of the word "password" within all .txt files in a directory. This approach allows attackers to zero in on files containing specific terms or strings that are likely to hold valuable information, such as credentials, API keys, or personally identifiable information (PII).
Adversaries can also use findstr with additional options to refine their searches. For example, the /s option enables recursive searching through subdirectories, and /i makes the search case-insensitive, increasing the likelihood of finding relevant results. They might also use wildcards to broaden their search scope, targeting multiple file types simultaneously, such as findstr "secret" *.txt *.log.
In November 2024, CISA reported that the BianLian ransomware group used the following command to find passwords in all files in the current folder and its subfolders [2].
findstr /spin "password" *.* >C:\Users\training\Music\<file>.txt
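The combined switch /spin in the command above packs four separate findstr options into one argument. As a reading aid, the following minimal Python sketch expands such a combined switch into its parts. The helper name explain_findstr_switches and the lookup table are illustrative assumptions and cover only these four switches.

```python
# Meanings of the four findstr switches packed into "/spin".
# Only these four are covered; this is not a full findstr reference.
FINDSTR_SWITCHES = {
    "s": "search the current directory and all subdirectories",
    "p": "skip files that contain non-printable characters",
    "i": "ignore case when matching",
    "n": "print the line number before each matching line",
}


def explain_findstr_switches(combined):
    """Expand a combined switch such as '/spin' into per-letter descriptions.

    Hypothetical helper for illustration; letters outside the table are
    reported as not covered rather than guessed.
    """
    letters = combined.lstrip("/").lower()
    return [
        "/{}: {}".format(letter, FINDSTR_SWITCHES.get(letter, "not covered by this sketch"))
        for letter in letters
    ]


if __name__ == "__main__":
    for description in explain_findstr_switches("/spin"):
        print(description)
```

Read this way, the reported command recursively searches every printable file for "password" regardless of case and records the line number of each hit in the redirected output file.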
3. Get-ChildItem (Windows)
Get-ChildItem is a versatile cmdlet that provides extensive functionality for searching and retrieving file system objects on Windows systems. Its advanced filtering capabilities and seamless integration with other PowerShell features make it particularly attractive for malicious actors.
Attackers use Get-ChildItem to efficiently explore the file system and identify files of interest, such as documents, credentials, or configuration files. By default, the command lists files and directories in a specified location. With different parameters, adversaries can perform recursive searches, set filters such as size or modification date, and pipe into other commands for further processing. The following command recursively lists every item under the current directory that was modified within the past week, potentially indicating active documents or logs.
Get-ChildItem -Recurse | Where-Object {$_.LastWriteTime -gt (Get-Date).AddDays(-7)}
The Chinese APT group Mustang Panda uses the command below in its getdata.ps1 script for reconnaissance and data collection [3].
Get-ChildItem ([environment]::getfolderpath("desktop"))
4. Select-String (Windows)
The Select-String cmdlet allows users to search through file contents for specific strings or patterns of interest. This cmdlet is often described as the PowerShell equivalent of the grep command in Linux and is highly effective for locating sensitive information, such as credentials, configuration data, or personally identifiable information (PII), within the files stored on a compromised system.
Using Select-String, attackers can automate the process of searching through one or multiple files, filtering out irrelevant data, and focusing on content matching predefined keywords or regular expressions. The flexibility of Select-String makes it particularly appealing to adversaries. They can use it with wildcards to target a wide range of files or restrict searches to specific directories and file types. For instance, the following command can search log files for error messages or references to tokens that might reveal authentication details or debugging information. Additionally, adversaries can leverage regular expressions for more complex searches, such as patterns resembling email addresses, URLs, or API keys.
Select-String -Path C:\Logs\*.log -Pattern "Error|Token"
In May 2024, a hacktivist group called Twelve was reported to use the Select-String cmdlet in combination with Get-ChildItem to collect sensitive information from compromised systems [4].
Get-ChildItem -Path C:\ -Recurse -Include *.doc, *.docx, *.xls, *.xlsx, *.ppt, *.pptx, *.pdf, *.eml, *.msg, *.pst, *.mbox, *.csv, *.qbw, *.qba, *.qfx, *.txt, *.rtf, *.xml, *.json, *.conf, *.cfg, *.ini, *.db, *.sql, *.mdb, *.log | Select-String -Pattern "[PATTERN]" -CaseSensitive:$false | Select Path, LineNumber, Line | Out-File -FilePath C:\sensitive_data_results.txt
5. ls (Linux and macOS)
The ls command provides a snapshot of the files and folders within a specific directory, allowing attackers to quickly map the file system and locate valuable data for further analysis or exfiltration. When adversaries gain access to a compromised system, they often begin by using ls to assess the directory structure. By executing a simple ls command, they can list files and subdirectories in the current directory, obtaining a general overview of what is stored there. This basic reconnaissance helps attackers determine whether the directory contains files worth investigating or whether they should navigate to another location in the file system.
ls is a flexible command that includes various options that provide detailed insights about files. For instance, an attacker might use ls -l to display information such as file permissions, ownership, size, and modification times. This data can help them prioritize files based on characteristics like recent changes or accessibility. For example, a recently modified file owned by a privileged user might suggest the presence of current and sensitive data.
Another useful feature for attackers is the ability to list hidden files with the ls -a option. Hidden files, often used for configuration or authentication purposes, can include critical information like credentials, API keys, or cryptographic materials. By running ls -a, an adversary can uncover entries such as the .ssh directory, which holds authorized_keys, or .env files that a default listing would not show.
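To make the difference concrete, the sketch below separates the entries a plain ls would show from those that only appear with -a. It is a simplified model, assuming the usual Unix convention that names beginning with a dot are hidden; the function name split_hidden is illustrative. Defenders can use the same idea to audit their own home directories for stray secrets.

```python
import os


def split_hidden(directory):
    """Return (visible, hidden) entry names for a directory.

    Models the Unix convention that plain `ls` omits names starting
    with a dot, while `ls -a` includes them. The special entries "."
    and ".." are not returned by os.listdir and are left out here.
    """
    visible, hidden = [], []
    for name in sorted(os.listdir(directory)):
        if name.startswith("."):
            hidden.append(name)
        else:
            visible.append(name)
    return visible, hidden


if __name__ == "__main__":
    shown, extra = split_hidden(os.path.expanduser("~"))
    print("shown by ls:", shown)
    print("added by ls -a:", extra)
```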
Adversaries may also use the ls command recursively to enumerate the contents of nested directories. By combining ls with the -R option, they can obtain a comprehensive listing of files across multiple levels of the directory structure. This approach is particularly useful in identifying sensitive data stored in deeply nested directories without needing to navigate manually through each level.
The results of ls commands can also be redirected to files or combined with other tools to enhance reconnaissance efforts. For instance, an attacker might run the following command to generate a detailed inventory of all files and directories under the /home directory, which could then be analyzed offline or used in conjunction with other tools to search for specific patterns or keywords.
ls -lR /home > directory_listing.txt
In the CRON#TRAP campaign, adversaries used the command below to enumerate the directories and confirm file locations [5].
ls -hal
-h: Human-readable sizes
-a: All files
-l: Long format; displays detailed information for each file or directory
6. find (Linux and macOS)
find is a highly versatile command that allows attackers to search the file system based on a wide array of attributes, such as file names, extensions, sizes, modification times, or even file types. When adversaries gain access to a system, they often start by exploring the file system to identify targets. The find command is particularly useful for locating files that match certain patterns or criteria. For instance, an attacker might search for configuration files (*.conf) across the system to uncover sensitive settings, credentials, or API keys stored in plain text. Similarly, they might target document files such as .docx, .pdf, or .xlsx, which are more likely to contain personal, financial, or proprietary information.
The find command is also effective for discovering files based on size, age, or type. Adversaries might use parameters such as -size to locate large files that could contain logs or databases or -mtime to identify files modified within a specific time frame. For instance, find /var/log -size +1M could reveal large log files in the /var/log directory, which might include system activity or authentication details. Similarly, find / -mtime -7 identifies files modified in the last seven days, which might indicate recent activity or updates containing useful information.
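The two predicates above can be read as simple comparisons on file metadata. The following Python sketch models that selection logic under stated assumptions: -mtime -7 is treated as "modified less than 7 × 24 hours ago" and -size +1M as "larger than 1 MiB". The function names are illustrative, and unlike find without -type f, the walker only considers regular files.

```python
import os
import time

SECONDS_PER_DAY = 24 * 60 * 60
MIB = 1024 * 1024


def modified_within_days(path, days, now=None):
    """Approximate `find -mtime -N`: modified less than N*24 hours ago."""
    now = time.time() if now is None else now
    return now - os.stat(path).st_mtime < days * SECONDS_PER_DAY


def larger_than_mib(path, mib):
    """Approximate `find -size +NM`: file size above N MiB."""
    return os.stat(path).st_size > mib * MIB


def walk_matching(root, predicate):
    """Walk a directory tree and return regular files that satisfy predicate."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.isfile(path) and predicate(path):
                    matches.append(path)
            except OSError:
                # Unreadable or vanished entries are skipped so the walk continues.
                continue
    return matches
```

For example, walk_matching("/var/log", lambda p: larger_than_mib(p, 1)) selects roughly the same regular files as find /var/log -size +1M, which a defender could use to review which large logs are readable by an unprivileged account.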
In May 2024, APT36, also known as Transparent Tribe, was reported to use the find command in an obfuscated version of GLOBSHELL malware [6].
7. grep (Linux and macOS)
The grep utility allows attackers to sift through large volumes of data, narrowing down their focus to information that matches a desired pattern, such as credentials, API keys, or sensitive personal information. Attackers often use grep to search for terms associated with valuable data, such as "password," "key," or "token." By tailoring the search term to the context of the compromised environment, adversaries can efficiently locate sensitive information that might facilitate further exploitation.
For example, the command grep -r "secret" /home would scan all user files under the /home directory for the keyword "secret," potentially uncovering confidential information stored in text documents or configuration files. The utility also supports regular expressions, enabling attackers to craft advanced patterns for complex searches. For instance, searching with grep -E "password[:= ]" config.txt would match password=, password:, or password followed by a space, forms that are common in configuration files. This precision allows adversaries to extract specific lines containing relevant data without needing to review entire files manually.
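The behavior of the bracket expression is easy to check in isolation. The sketch below applies the same pattern with Python's re module, which treats this simple expression the same way grep -E does. The sample lines are invented for illustration and do not come from any real configuration file.

```python
import re

# Same expression as in the grep -E example: the word "password"
# followed by exactly one of ':', '=', or a space.
PATTERN = re.compile(r"password[:= ]")

# Invented sample lines, not taken from a real system.
SAMPLE_LINES = [
    "password=hunter2",       # matches: '=' follows
    "password: hunter2",      # matches: ':' follows
    "password hunter2",       # matches: a space follows
    "passwords are rotated",  # no match: 's' follows
    "Password=hunter2",       # no match: case-sensitive unless -i is given
    "reset your password",    # no match: nothing follows the word
]


def matching_lines(lines):
    """Return the lines a case-sensitive search for PATTERN would print."""
    return [line for line in lines if PATTERN.search(line)]


if __name__ == "__main__":
    for line in matching_lines(SAMPLE_LINES):
        print(line)
```

Only the first three sample lines are printed, which shows why attackers often add -i: without it, a capitalized key such as Password= is missed.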
In many cases, attackers combine grep with other commands to streamline their workflow. Pairing grep with find can help locate files of interest and immediately search their contents. The command below identifies files in the /etc directory and searches them for instances of the term "api_key." Such combinations enable efficient and targeted reconnaissance within complex file systems.
find /etc -type f | xargs grep "api_key"
In October 2024, the Shedding Zmiy threat group was reported to use the following command to dump binary logs with utmpdump and, using grep -v, write only the lines that do not contain a given keyword to a file [7].
utmpdump /var/log/wtmp | grep -v "redacted" >.t
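The -v flag in the command above inverts grep's usual behavior: matching lines are dropped and everything else is kept. A minimal Python sketch of that filter follows. It assumes a fixed-string keyword rather than a regular expression, the name grep_v is illustrative, and the sample lines are invented placeholders, not real utmpdump output.

```python
def grep_v(lines, keyword):
    """Mimic `grep -v keyword` for a fixed string.

    Keeps every line that does NOT contain the keyword, preserving order.
    """
    return [line for line in lines if keyword not in line]


if __name__ == "__main__":
    # Invented placeholder records for illustration only.
    records = [
        "login alice",
        "login redacted",
        "login bob",
    ]
    for line in grep_v(records, "redacted"):
        print(line)
```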
References
[1] "The Malware That Must Not Be Named: Suspected Espionage Campaign Delivers 'Voldemort,'" Proofpoint. https://www.proofpoint.com/uk/blog/threat-insight/malware-must-not-be-named-suspected-espionage-campaign-delivers-voldemort
[2] "#StopRansomware: BianLian Ransomware Group," Cybersecurity and Infrastructure Security Agency (CISA). https://www.cisa.gov/news-events/cybersecurity-advisories/aa23-136a
[3] Cyble, "Vietnamese Entities Targeted by China's Mustang Panda," Cyble. https://cyble.com/blog/vietnamese-entities-targeted-by-china-linked-mustang-panda-in-cyber-espionage/
[4] "Ландшафт киберугроз" [Cyber Threat Landscape]. https://media.kasperskycontenthub.com/wp-content/uploads/sites/58/2024/05/20212017/Report_Threat-Landscape_for_Russia_and_CIS.pdf
[5] "CRON#TRAP: Emulated Linux Environments as the Latest Tactic in Malware Staging," Securonix. https://www.securonix.com/blog/crontrap-emulated-linux-environments-as-the-latest-tactic-in-malware-staging/
[6] "Transparent Tribe Targets Indian Government, Defense, and Aerospace Sectors Leveraging Cross-Platform Programming Languages," BlackBerry. https://blogs.blackberry.com/en/2024/05/transparent-tribe-targets-indian-government-defense-and-aerospace-sectors
[7] К. 4rays, "Распутываем змеиный клубок: по следам атак Shedding Zmiy" [Untangling the Snake Knot: On the Trail of Shedding Zmiy Attacks]. https://rt-solar.ru/solar-4rays/blog/4333/