Commit ebf201a4 authored by Brian Carrier

removed docs

parent d22a0064
EXTRA_DIST = other.txt ref_fs.txt ref_timeline.txt \
skins_fat.txt skins_iso9660.txt skins_ntfs.txt skins_windows.txt
The library API documentation can be found online at:
http://www.sleuthkit.org/sleuthkit/docs/api-docs/
For more docs, refer to The Sleuth Kit Informer at:
www.sleuthkit.org/informer
brian
File System Analysis Techniques
Sleuth Kit Reference Document
http://www.sleuthkit.org
Brian Carrier
Last Updated: July 2005
INTRODUCTION
=======================================================================
Currently, evidence is most frequently found in the file system.
This is because it is non-volatile and remnants of deleted files
can typically be found. This file will help one to use the low-level
tools in The Sleuth Kit for a forensic analysis.
This document is organized into small scenarios, which provide
examples of how to use The Sleuth Kit. Most of these functions
are automated with Autopsy, but they are here for reference and
education.
http://www.sleuthkit.org/autopsy
The techniques used here apply to both UNIX and Windows file systems.
TIME LINE
=======================================================================
Suppose you have followed the steps in the timeline Sleuth Kit
Reference Document (using both ils and fls) and noticed some interesting
activity from unallocated inodes, namely MFT Entry 5035 from image
c_drive.dd. To display the contents of this file, use "icat":
# icat images/c_drive.dd 5035 | less
NOTE: To prevent your terminal from getting messed up, pipe all
output of "icat" through a pager like "less".
SEARCH
=======================================================================
In this scenario, we will search the unallocated space of the
"wd0e.dd" image for the string "abcdefg". The first step is to
extract the unallocated disk units using the "blkls" tool (as this
is an FFS image, the addressable units are fragments).
# blkls images/wd0e.dd > output/wd0e.blkls
Next, use the UNIX strings(1) utility to extract all of the ASCII
strings in the file of unallocated data. If we are only going to be
searching for one string, we may not need to do this. If we are going
to be searching for many strings, then this is faster. Use the '-t d'
flags with "strings" to print the byte offset at which each string was found.
# strings -t d output/wd0e.blkls > output/wd0e.blkls.str
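For the single-keyword case mentioned above, a direct scan of the raw file is one alternative to building a strings file first. The following is a minimal Python sketch, not part of The Sleuth Kit; the function name and the sample bytes are made up for illustration, and a real blkls file would be read from disk instead.

```python
from typing import List


def find_offsets(data: bytes, needle: bytes) -> List[int]:
    """Return every byte offset at which needle occurs in data."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        # Continue one byte later so overlapping hits are not missed.
        start = data.find(needle, start + 1)
    return offsets


# Hypothetical stand-in for the contents of output/wd0e.blkls.
sample = b"\x00" * 10 + b"abcdefg" + b"\xff" * 5 + b"abcdefg"
print(find_offsets(sample, b"abcdefg"))
```

The offsets it reports play the same role as the byte offsets printed by "strings -t d".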
Use the UNIX grep(1) utility to search the strings file.
# grep "abcdefg" output/wd0e.blkls.str | less
10389739: abcdefg
We notice that the string is located at byte 10389739. Next,
determine which fragment contains it. To do this, we use the 'fsstat' tool:
# fsstat -f openbsd images/wd0e.dd
<...>
CONTENT-DATA INFORMATION
--------------------------------------------
Fragment Range: 0 - 266079
Block Size: 8192
Fragment Size: 1024
This shows us that each fragment is 1024 bytes long. Using a
calculator, we find that byte 10389739 divided by 1024 is 10146
(and change). This means that the string "abcdefg" is located in
fragment 10146 of the "blkls" generated file. This does not really
help us because the blkls image is not a real file system. To view
the full fragment from the blkls image, we can use dd:
# dd if=output/wd0e.blkls bs=1024 skip=10146 count=1 | less
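The calculator step can be written out as integer division. This small Python sketch uses the byte offset and fragment size from the text; the function name is only illustrative.

```python
def locate(byte_offset: int, unit_size: int):
    """Split a byte offset into (unit index, offset inside that unit)."""
    return divmod(byte_offset, unit_size)


# Values taken from the grep and fsstat output in this scenario.
fragment, within = locate(10389739, 1024)
print(fragment, within)
```

The first value is the fragment number to pass to dd as 'skip'; the second is how far into that fragment the string starts.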
Next, we will identify where this fragment is in the original image.
The "blkcalc" tool will be used for this. "blkcalc" will return the
"address" in the original image when given the "address" in the
blkls generated image (NOTE: this is currently rather slow). The
'-u' flag shows that we are giving it a blkls address. If the '-d'
flag is given, then we are giving it a dd address and it will
identify the blkls address.
# blkcalc -u 10146 images/wd0e.dd
59382
Therefore, the string "abcdefg" is located in fragment 59382. To view
the contents of this fragment, we can use "blkcat".
# blkcat images/wd0e.dd 59382 | less
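The translation from a blkls address to an address in the original image can be pictured as counting unallocated units. The sketch below is a conceptual Python model only, not how "blkcalc" is implemented; the allocation map is a made-up 8-unit example and the function name is hypothetical.

```python
def unalloc_to_original(allocated, unalloc_index):
    """Map the n-th unallocated unit (0-based) to its address in the image."""
    seen = -1
    for address, is_allocated in enumerate(allocated):
        if not is_allocated:
            seen += 1
            if seen == unalloc_index:
                return address
    raise IndexError("image has fewer unallocated units than requested")


# Hypothetical allocation map: True = allocated, False = unallocated.
allocation = [True, True, False, True, False, False, True, False]
print(unalloc_to_original(allocation, 2))
```

Because the blkls output contains only the unallocated units, unit N in that file corresponds to the N-th unallocated unit of the original image.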
To make more sense of this, let us identify if there is a meta data
structure that still has a pointer to this fragment. This is achieved
using "ifind". The '-a' argument means to find all occurrences.
# ifind -a images/wd0e.dd 59382
493
Inode 493 has a pointer to fragment 59382. Let us get more information
about inode 493, using "istat".
# istat images/wd0e.dd 493
inode: 493
Not Allocated
uid / gid: 1000 / 1000
mode: rw-------
size: 92
num of links: 1
Modified: 08.10.2001 17:09:49 (GMT+0)
Accessed: 08.10.2001 17:09:58 (GMT+0)
Changed: 08.10.2001 17:09:49 (GMT+0)
Direct Blocks:
59382
Next, let us find out if there is a file that is still associated with
this (unallocated) inode. This is done using "ffind".
# ffind -a images/wd0e.dd 493
* /dev/.123456
The leading '*' identifies the file as deleted. Therefore, at one point,
the file '/dev/.123456' allocated inode 493, which allocated fragment
59382, which contained the string "abcdefg".
If "ffind" returned more than one file that had allocated inode 493,
it means that either both were hard links to the same file or that one
file (chicken) allocated the inode, it was deleted, a second file (egg)
allocated it, and then it was deleted. The string belongs to the second
file, but it is difficult to determine which came first. On the other
hand, if "ffind" returns two entries where one is deleted and one is not,
then the string belongs to the non-deleted file.
As previously mentioned, Autopsy will do all of this for you when
you do a keyword search of unallocated space.
DELETED CONTENT
=======================================================================
To view all of the deleted file names in an image, use the "fls" tool.
Use the '-r' flag to recurse into directories and the '-d' flag to
show only deleted entries.
# fls -rd images/hda9.dd | less
d/d * 232: /TEMP-823450
r/d * 293: /TEMP-131100
This shows us the full path where the deleted files were located. On some
systems, such as Windows NTFS, the file content may be recovered
(depending on how much system activity has occurred). On other
systems, such as Solaris UFS and Linux Ext3, deleted files cannot
be easily recovered. The number at the beginning of the line is
the inode number. The '*' shows that it is deleted and the 'd' and
'r' show the type (directory and file). The first letter identifies
the directory entry type value (which does not exist in all file
system types) and the second letter is the type according to the
inode. In most cases these should be the same, but they may differ for
deleted files if the inode has been reallocated to a file of a
different type. If we do an "istat" on the directory (232) we will
notice that the size is 0.
# istat images/hda9.dd 232
inode: 232
Not Allocated
uid / gid: 0 / 0
mode: rwxr-xr-x
size: 0
num of links: 0
Modified: 08.23.2001 21:52:33 (GMT+0)
Accessed: 08.23.2001 23:05:39 (GMT+0)
Changed: 08.23.2001 21:52:33 (GMT+0)
Deleted: 08.23.2001 23:05:39 (GMT+0)
Direct Blocks:
Linux does this to all of its deleted directories. It should also
be observed that no block addresses are shown in the "istat" output.
This is because the size is 0 and the program thinks that the address
is bogus. Using the '-b' option of "istat", we can force it to
output the block address. With Linux Ext3, the block pointers would
be 0, but Linux Ext2 kept the old addresses.
# istat -b 2 images/hda9.dd 232
inode: 232
Not Allocated
uid / gid: 0 / 0
mode: rwxr-xr-x
size: 0
num of links: 0
Modified: 08.23.2001 21:52:33 (GMT+0)
Accessed: 08.23.2001 23:05:39 (GMT+0)
Changed: 08.23.2001 21:52:33 (GMT+0)
Deleted: 08.23.2001 23:05:39 (GMT+0)
Direct Blocks:
388 0
Now we can examine the contents of block 388 and see the file
names that were in that directory:
# blkcat -h images/hda9.dd 388 | less
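The fields of an "fls" line described in this section (directory entry type, inode type, deleted marker, inode number, path) can be separated mechanically. This is a minimal Python sketch that assumes only the line layout shown in this document; other "fls" versions or file systems may print different layouts, and the names used here are illustrative.

```python
import re

# Matches the 'fls' line layout shown in this document only, e.g.
#   d/d * 232: /TEMP-823450
#   + r/r 30792: doc/.a/install
FLS_LINE = re.compile(
    r"^(?:\++\s+)?"                            # optional depth marker from 'fls -r'
    r"(?P<name_type>\S)/(?P<meta_type>\S)\s+"  # directory entry type / inode type
    r"(?P<deleted>\*\s+)?"                     # '*' marks a deleted name
    r"(?P<inode>\d+):\s+"
    r"(?P<path>.*)$"
)


def parse_fls_line(line):
    """Return the fields of one fls output line as a dictionary."""
    match = FLS_LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized fls line: %r" % line)
    return {
        "name_type": match.group("name_type"),
        "meta_type": match.group("meta_type"),
        "deleted": match.group("deleted") is not None,
        "inode": int(match.group("inode")),
        "path": match.group("path"),
    }


entry = parse_fls_line("d/d * 232: /TEMP-823450")
print(entry)
```

Comparing "name_type" with "meta_type" for deleted entries is one way to spot inodes that may have been reallocated to a file of a different type.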
MANUAL UNIX FILE RECOVERY
=======================================================================
A UFS/FFS or EXT2FS/EXT3FS file system is organized into groups.
Each group has its own inodes and blocks to store data in. When
a new file is created, it is given an inode in the same group that
the parent directory inode is in (if there are still inodes
available). When a new directory is created, it is given an inode
in a new group. An inode allocates blocks from the same group that
its inode is in.
When recovering a file from one UFS or EXTxFS, the group layout
can be used. When a deleted file is found with 'fls', notice the
inode of the parent directory:
# fls -r images/hda1.dd
d/d 30789: doc
+ r/r * 0: doc/.a/ssh.tar
+ r/r 30792: doc/.a/install
<...>
We want to recover the 'ssh.tar' file and notice that the parent
directory is 30789 and the deleted file has a cleared inode pointer.
To identify the group that it is in, the 'fsstat' tool is used:
# fsstat images/hda1.dd
FILE SYSTEM INFORMATION
--------------------------------------------
File System Type: EXT3FS
<...>
Group: 0:
Inode Range: 1 - 15392
Block Range: 0 - 32767
Super Block: 0 - 0
Group Descriptor Table: 1 - 1
Data bitmap: 2 - 2
Inode bitmap: 3 - 3
Inode Table: 4 - 484
Data Blocks: 485 - 32767
Group: 1:
Inode Range: 15393 - 30784
Block Range: 32768 - 65535
Super Block: 32768 - 32768
Group Descriptor Table: 32769 - 32769
Data bitmap: 32770 - 32770
Inode bitmap: 32771 - 32771
Inode Table: 32772 - 33252
Data Blocks: 33253 - 65535
Group: 2:
Inode Range: 30785 - 46176
Block Range: 65536 - 98303
Data bitmap: 65536 - 65536
Inode bitmap: 65537 - 65537
Inode Table: 65540 - 66020
Data Blocks: 65538 - 65539, 66021 - 98303
<...>
The inode is in the range of inode addresses for group 2 (30785 -
46176). To search for the deleted file, we extract the unallocated
space in that group using 'blkls':
# blkls images/hda1.dd 65536-98303 > output/hda1-grp2.blkls
If we wanted to extract all of the data for the group, we could
use 'dd':
# dd if=images/hda1.dd of=output/hda1-grp2.dd bs=4096 skip=65536 \
count=32768
where the fragment size is 4096 (which can also be found in the
'fsstat' output). Either of these images can then be analyzed for
keywords or with other data carving tools such as 'foremost'.
This process allows one to reduce the amount of data that must be
analyzed.
http://foremost.sourceforge.net
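The group lookup can be checked mechanically. The Python sketch below copies the inode and block ranges of the three groups from the 'fsstat' listing in this section and derives the 'skip' and 'count' values for 'dd'. With those ranges, inode 30789 falls between 30785 and 46176. The table and function names are illustrative only.

```python
# Ranges copied from the 'fsstat' listing in this section.
GROUPS = {
    0: {"inodes": (1, 15392), "blocks": (0, 32767)},
    1: {"inodes": (15393, 30784), "blocks": (32768, 65535)},
    2: {"inodes": (30785, 46176), "blocks": (65536, 98303)},
}


def group_of_inode(inode):
    """Return the number of the group whose inode range contains inode."""
    for number, info in GROUPS.items():
        first, last = info["inodes"]
        if first <= inode <= last:
            return number
    raise ValueError("inode %d is outside the listed groups" % inode)


def dd_skip_and_count(group):
    """Return (skip, count) in blocks; the block range is inclusive."""
    first, last = GROUPS[group]["blocks"]
    return first, last - first + 1


group = group_of_inode(30789)
print(group, dd_skip_and_count(group))
```

Because the block range is inclusive at both ends, the count is one more than the difference between the last and first block.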
-----------------------------------------------------------------------
Send documentation updates to: <doc-updates at sleuthkit dot org>
Copyright (c) 2002-2005 by Brian Carrier. All Rights Reserved
File Activity Timelines
Sleuth Kit Reference Document
http://www.sleuthkit.org
Brian Carrier
Last Updated: Sept 2008
INTRODUCTION
=======================================================================
Creating a timeline of file activity will give an investigator
clues regarding where to probe further. This document will describe
how to generate one using The Sleuth Kit. The timelines in The Sleuth
Kit allow one to quickly get a high-level look at system activity,
such as when files were compiled and when archives were opened.
BACKGROUND
=======================================================================
Many files and directories have times associated with them. The
quantity and description of which depend on the file system type.
FFS file systems have a Modified, Accessed, and Changed time
associated with them. EXT2FS file systems have a Modified, Accessed,
Changed, and Deleted time. FAT stores the Written, Accessed, and
Created time, although by spec the Created and Access times are
optional and the Access time is only accurate to the day.
TIMELINE CREATION
=======================================================================
The creation of a file activity timeline in The Sleuth Kit has
three phases.
1. Gather file data. Using the 'fls' tool, the data associated with
allocated and some unallocated files can be gathered. To do this
requires the '-m' argument with the '-r' flag to gather all files.
This needs to be done for each partition image.
# fls -f openbsd -m / -r images/root.dd > data/body
# fls -f openbsd -m /var/ -r images/var.dd >> data/body
# fls -f openbsd -m /usr/ -r images/usr.dd >> data/body
NOTE: Some systems delete the link between deleted file names and
meta data, such as Solaris, so only information about allocated
files will be useful.
NOTE: This replaces the actions of 'grave-robber -m' in TCT. The
'mac-robber' tool (on the www.sleuthkit.org web site) can also be
used to gather allocated file data on a mounted file system.
'mac-robber' is useful for file systems where tools do not exist
(such as AIX jfs).
2. Gather unallocated meta data. Using the 'ils' tool, the data
associated with unallocated meta data can be gathered. When files
are deleted, the times associated with the file are updated.
Although many times we may not be able to link the original name
to the meta data, it will still give some clue with respect to when
activity occurred. This uses the '-m' flag of 'ils'.
# ils -f openbsd -m images/root.dd >> data/body
# ils -f openbsd -m images/var.dd >> data/body
# ils -f openbsd -m images/usr.dd >> data/body
NOTE: Because of the way that FAT stores time, the timezone is
needed while executing 'ils'. If you will be giving 'mactime' a
timezone to use then set the TZ environment variable:
# export TZ=EST5EDT
3. Format the data nicely. The 'body' file now needs to be run
through the 'mactime' program to sort it and make it organized.
# mactime -b data/body 2002-03-01 > tl.03.01.2002
The above command generates a timeline of file activity from the
previously created data/body file for all activity starting in
March. If the /etc/passwd or /etc/group files are known, they can
be specified using the '-p' and '-g' flags. Otherwise the numerical
values will be displayed. The '-z' flag can be used to specify
the time zone.
# mactime -b data/body -p data/passwd -g data/group 2002-03-01 \
> tl.03.01.2002
The output format has changed slightly since the 'mactime' in TCT. The
inode value is now displayed in a separate column. Previously it was
not displayed.
Some example outputs of mactime will now be shown. The next two
entries are for a deleted socket in an EXT2FS image:
Wed Mar 20 2002 16:56:12 0 ..c s/srwxrwxr-x 500 500 127 /tmp/socket1 (deleted)
0 ..c srwxrwxr-x 500 500 127 <linux.dd-dead-127>
The first is the 'fls' entry and the second is the corresponding
entry from 'ils'. While it may seem redundant to show both, many
times 'fls' will not show the deleted file name because the entry
has been reallocated. Therefore, just the 'ils' dead entry will
appear and the investigator will not know the original path location.
The first 0 is the file size. The "..c" string means that this
entry is for the "Change" value. The dots are replaced with 'm'
or 'a' for other entry types (deleted entries are not created for
EXT2FS). The next string is the file system mode. The entries
from 'fls' will have the directory entry type first, followed by
a slash and the mode from the inode entry. 'ils' entries will only
have the inode mode. The next two are the UID and GID (or names
if the group and passwd file are specified), followed by the inode.
The final entry is the file name (or <IMG-dead-#> for unallocated
inodes).
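The column layout just described can be split apart with a few lines of Python. This sketch assumes exactly the layout of the two example lines above (a five-token date on the first line of a group, no date on continuation lines); it is not a parser for every 'mactime' version, and the names are illustrative.

```python
WEEKDAYS = {"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"}
FIELDS = ("size", "activity", "mode", "uid", "gid", "inode", "name")


def parse_mactime_line(line, previous_time=None):
    """Split one timeline line; continuation lines reuse previous_time."""
    line = line.strip()
    tokens = line.split()
    if tokens and tokens[0] in WEEKDAYS:
        # Date is five tokens: weekday, month, day, year, time.
        timestamp = " ".join(tokens[:5])
        rest = line.split(None, 11)[5:]
    else:
        timestamp = previous_time
        rest = line.split(None, 6)
    entry = dict(zip(FIELDS, rest))
    entry["time"] = timestamp
    entry["size"] = int(entry["size"])
    return entry


first = parse_mactime_line(
    "Wed Mar 20 2002 16:56:12 0 ..c s/srwxrwxr-x 500 500 127 /tmp/socket1 (deleted)"
)
second = parse_mactime_line(
    "0 ..c srwxrwxr-x 500 500 127 <linux.dd-dead-127>", first["time"]
)
print(first["name"], second["name"])
```

The file name is kept as the final field so that suffixes such as "(deleted)" stay attached to it.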
The next two entries are for a file that is deleted, but the inode
that its directory entry points to has been reallocated.
Fri Aug 23 2002 16:56:12 11 .a. l/-rw-r--r-- 0 0 34689 /tmp/file1 (deleted-realloc)
11 .a. -/-rw-r--r-- 0 0 34689 /etc/sysconfig/desktop
This can be seen because there are entries for both the deleted file
(/tmp/file1) and the allocated file (desktop), which have the same
inode (34689). It can also be seen because the deleted entry has
different values for the file type (l and -).
If you are going to include the resulting timeline in a document,
then it may be better to supply the '-d' argument to output in comma
delimited format. The resulting timeline can then be imported into
a spreadsheet and included as a table.
The '-i' option to 'mactime' creates an index summary file, including
how many hits were found per day or hour. Using '-d' with '-i'
allows one to easily import data into a spreadsheet that can be
graphed to spot suspicious behavior.
# mactime -b data/body -d -i hour data/tl-hour-sum.txt > data/timeline.txt
TIME SKEW
=======================================================================
The time skew of the system can also be taken into consideration.
Using the '-s' argument to 'fls' and 'ils', the intermediate body
file can have the adjusted times so that the system is consistent
with other servers.
The argument reflects the skew in seconds. If the original system
was 100 seconds slower than NTP or some other 'main' server, then
the argument would be '-s -100'. If it were 145 seconds fast, then
it would be '-s 145'.
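The sign convention can be illustrated with a short Python sketch. It assumes that the correction is simply the recorded time minus the skew, which matches the two examples above; this is a model of the arithmetic, not a statement about the tools' internals, and the function name is made up.

```python
def corrected_time(recorded_epoch, skew_seconds):
    """Remove the clock skew of the original system from a recorded time.

    skew_seconds is negative for a slow clock and positive for a fast one,
    following the '-s' examples in the text.
    """
    return recorded_epoch - skew_seconds


# A clock that is 100 seconds slow recorded times 100 seconds too early.
print(corrected_time(1000, -100))
# A clock that is 145 seconds fast recorded times 145 seconds too late.
print(corrected_time(1000, 145))
```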
AUTOPSY
=======================================================================
The Autopsy Forensic Browser is a graphical interface to The Sleuth
Kit and it can automate the process of creating and viewing a time
line.
http://www.sleuthkit.org/autopsy
-----------------------------------------------------------------------
Send documentation updates to: <doc-updates at sleuthkit dot org>
Copyright (c) 2002-2008 by Brian Carrier. All Rights Reserved
The FAT File System
Sleuth Kit Implementation Notes (SKINs)
http://www.sleuthkit.org
Brian Carrier
Last Updated: Sept 2008
INTRODUCTION
========================================================================
This document contains information on the implementation of the
FAT file system in The Sleuth Kit. The Sleuth Kit is based on the
original designs of The Coroner's Toolkit (TCT), which was designed
only for UNIX file systems. The FAT file system and UNIX file
systems are very different and this document will identify how
those differences were handled. A basic understanding of FAT is
assumed.
The major design "decisions" that had to be made are related to:
- Disk unit addressing
- Meta-data addressing
DISK UNIT ADDRESSING
========================================================================
FAT saves file content in clusters. A cluster is a grouping of
consecutive sectors (512-bytes each). When a file is described
by the directory entries and File Allocation Table, the cluster
numbers are used as addresses. The problem is that cluster 0 is
not at the beginning of the partition. Cluster 0 is in the Data
Area, which is after the super block and File Allocation Tables
and can be hundreds of sectors into the partition. This creates
a problem because if The Sleuth Kit were to use clusters as the
addressable units, then there would be no way to identify the
non-"data area" sectors.
This problem was solved by making the sector the addressable
unit, instead of the cluster. When a file is described (using
'istat' for example), the sector addresses are given. In the
output of 'fsstat', the File Allocation Table contents are displayed
in sectors and when using 'blkls -l', the sector status is given.
This actually makes manual data recovery easier because one can
use 'dd' to carve out data using the sector addresses. If clusters
were given, the user would have to translate the Data Area address
to sectors before carving out data.
META-DATA ADDRESSING
========================================================================
FAT describes its files in a directory entry structure, which is
contained in the sectors allocated by the parent directory. The
directory entry structures have a fixed size of 32 bytes, are not
addressed, and can exist anywhere in the partition. The Sleuth
Kit requires some type of addressing method for meta data structures,
so this became a problem. Also, the root directory does not have
a directory entry. In other words, there is no descriptive
information for the root directory.
The solution to this problem was to use the same method that is
used in many UNIX implementations. Each sector in the data area
is treated as though it could be full of directory entries. As
each sector is 512-bytes and each directory entry is 32-bytes, each
sector could contain 16 entries. To keep things similar to UNIX,
the root directory is given the value of 2 (and its meta-data is
set to 0). The first 32-bytes of the first sector in the data area
are addressed as 3, the second 32-bytes of the sector are 4 etc.
The Sleuth Kit will scan through the sectors and identify which
ones actually contain directory entries.
This method will produce large gaps of addresses between used
address values and places a limit on the size of the partition that
can be analyzed. The limit is:
2^32 / 16 = 2^28 sectors
Therefore, we can handle partitions of size 137,438,953,472 bytes.
It is unlikely that FAT file systems will be over 128GB in size.
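The addressing scheme and the size limit can be written out as arithmetic. The Python sketch below follows the description above (root directory is 2, the first 32 bytes of the first data-area sector are 3, and so on); the function names and the example data-area start sector of 600 are assumptions for illustration.

```python
SECTOR_SIZE = 512
DENTRY_SIZE = 32
ENTRIES_PER_SECTOR = SECTOR_SIZE // DENTRY_SIZE  # 16 entries per sector
ROOT_ADDRESS = 2
FIRST_ENTRY_ADDRESS = 3


def entry_address(sector, slot, first_data_sector):
    """Meta-data address of directory entry 'slot' (0-15) in 'sector'."""
    if not 0 <= slot < ENTRIES_PER_SECTOR:
        raise ValueError("slot must be between 0 and 15")
    sector_index = sector - first_data_sector
    return FIRST_ENTRY_ADDRESS + sector_index * ENTRIES_PER_SECTOR + slot


def entry_location(address, first_data_sector):
    """Inverse mapping: return (sector, slot) for a meta-data address."""
    index = address - FIRST_ENTRY_ADDRESS
    return (first_data_sector + index // ENTRIES_PER_SECTOR,
            index % ENTRIES_PER_SECTOR)


# Size limit from the text: 2^32 addresses, 16 per sector, 512 bytes each.
max_sectors = 2**32 // ENTRIES_PER_SECTOR
max_bytes = max_sectors * SECTOR_SIZE

# Hypothetical data area starting at sector 600.
print(entry_address(600, 0, 600), entry_address(601, 0, 600), max_bytes)
```

Most addresses produced this way never correspond to a real directory entry, which is the source of the large gaps mentioned above.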
NOTES ON TIMEZONES
========================================================================
FAT does not store the file times in the delta format that UNIX
does. Instead of saving the difference in time from GMT, FAT simply
saves the raw hour, minute, and second values. The Sleuth Kit
stores all times in the UNIX GMT offset format and will translate
the FAT time to the UNIX offset. This uses the current timezone
value when identifying the GMT offset.
If the tool displays the time in a nice ASCII format, the same
timezone will be used to translate the offset value into a date.
Therefore, you can use any timezone value and the time will not
change (just the timezone name). On the other hand, if you use a
tool such as 'ils' or 'fls -m', which display the time in the offset
format, then it will have the offset of the current timezone or
the one specified with '-z'. Therefore, ensure that the same '-z'
argument is used with 'mactime' to display the correct time in
the timeline.
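The effect of the timezone choice on the offset value can be seen in a small Python sketch. It takes the raw date and time fields that FAT stores and interprets them under two different fixed UTC offsets; the function name is illustrative, the sample time is borrowed from elsewhere in this commit, and daylight saving is ignored for simplicity.

```python
from datetime import datetime, timedelta, timezone


def fat_local_to_epoch(year, month, day, hour, minute, second,
                       utc_offset_hours):
    """Interpret raw FAT time fields as local time at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    local = datetime(year, month, day, hour, minute, second, tzinfo=tz)
    return int(local.timestamp())


# The same stored fields, read as UTC-5 and as UTC.
as_utc_minus_5 = fat_local_to_epoch(2002, 3, 20, 16, 56, 12, -5)
as_utc = fat_local_to_epoch(2002, 3, 20, 16, 56, 12, 0)
print(as_utc_minus_5 - as_utc)
```

The two offset values differ by five hours, which is why the timezone used to create the body file and the one given to 'mactime' need to agree.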
GENERAL NOTES ON TIME
========================================================================
Each file in FAT can store up to three times (last accessed, written,
and created). The last written time is the only 'required' time
and is accurate to a second. The create time is optional and is
accurate to the tenth of a second (Note that I have seen several
system directories in Windows that have a create time of 0). The
last access time is also optional and is only accurate to the day
(so the times are 00:00:00 in The Sleuth Kit).
The FAT spec can be found at:
http://www.microsoft.com/whdc/system/platform/firmware/fatgen.mspx
-----------------------------------------------------------------------------
Send documentation updates to: <doc-updates at sleuthkit dot org>
Copyright (c) 2002-2008 by Brian Carrier. All Rights Reserved
The ISO9660 File System
Sleuth Kit Implementation Notes (SKINs)
http://www.sleuthkit.org
Wyatt Banks, Crucial Security
Last Updated: June 2005
INTRODUCTION
=======================================================================
The ISO9660 file system is used on many platforms and has many
variations and extensions. At the most basic level, ISO9660 differs
from traditional file systems in several ways because of the type
of media it is used on.
This document gives a quick overview of ISO9660 and how it was
implemented.
The Sleuth Kit allows one to investigate an ISO9660 image in the same
ways as any UNIX image, including:
- Creation of ASCII timeline of file activity
- File and directory level analysis
ISO9660 OVERVIEW
=======================================================================
This provides a quick introduction to the ISO9660 file system. The
terms used are different than with other file systems. For a full
overview of the file system, refer to the document "Volume and File
Structure of CDROM for Information Interchange"
http://www.ecma-international.org/publications/standards/Ecma-119.htm
Volume descriptors
-----------------------------------------------------------------------
ISO9660 uses structures called Volume Descriptors to store information
about the directory hierarchy of an ISO9660 volume. At 32768 bytes
into the image there is a contiguous list of volume descriptors.
A primary volume descriptor contains an address of a Path Table which
is a list of every directory on the volume. In this path table each
directory record has a single run of contiguous bytes known as an
Extent. Each directory's single data extent contains a group of
contiguous directory descriptors which represent files, directories
or other standard file types.
Primary volume descriptors only allow uppercase filenames in the
8.3 format (8 chars dot 3 chars).
Supplementary volume descriptors are very similar to primary volume
descriptors. The main difference is that supplementary volume
descriptors store filenames as UCS-2 characters and are used
in Microsoft Joliet extensions to allow mixed case filenames up to
103 characters.
All volume descriptors are stored at least once, and only a single
primary volume descriptor is required for an image to be valid.
Supplementary volume descriptors usually contain the same data as
primary volume descriptors.
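Walking the descriptor list that starts 32768 bytes into the image can be sketched in Python as follows. Details not stated in this document are taken from the ECMA-119 standard as understood here and should be treated as assumptions: 2048-byte descriptors, a type byte followed by the identifier "CD001", and a terminator descriptor of type 255. The function name and the synthetic image are made up for illustration.

```python
SECTOR = 2048                    # assumed descriptor size (ECMA-119)
FIRST_DESCRIPTOR_OFFSET = 32768  # offset given in the text
TYPE_NAMES = {
    0: "boot record",
    1: "primary",
    2: "supplementary",
    255: "terminator",
}


def list_volume_descriptors(image: bytes):
    """Return (offset, type name) for each descriptor up to the terminator."""
    found = []
    offset = FIRST_DESCRIPTOR_OFFSET
    while offset + SECTOR <= len(image):
        descriptor = image[offset:offset + SECTOR]
        if descriptor[1:6] != b"CD001":
            break  # not a volume descriptor, stop scanning
        vd_type = descriptor[0]
        found.append((offset, TYPE_NAMES.get(vd_type, "unknown")))
        if vd_type == 255:
            break
        offset += SECTOR
    return found


# Synthetic image: empty system area, then primary, supplementary, terminator.
demo = bytes(FIRST_DESCRIPTOR_OFFSET) + b"".join(
    bytes([t]) + b"CD001" + bytes(SECTOR - 6) for t in (1, 2, 255)
)
print(list_volume_descriptors(demo))
```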
FILES
-----------------------------------------------------------------------
ISO9660 files are stored in an extent whose size is measured in bytes.
A file is considered unique if its extent address is unique.
DIRECTORIES
-----------------------------------------------------------------------
Directory names are only stored in the path table of the volume
descriptor. As a directory is encountered as a directory descriptor
inside another directory's extent, the address of its data extent
is examined by the ISO9660 implementation to see if we've seen this
directory before and figure out what its name is.
Directories are unusual in the way they are identified as a unique
inode. If we examine the root directory using a primary volume
descriptor then its extent address is where on the volume the extent
containing the list of directory descriptors with 8.3 encoded names
exists. If we examine the root directory of that same volume using
a supplementary volume descriptor we will find that the extent
address is different because these directory descriptors are UCS-2
encoded, even though each directory descriptor will point at the same
data extent for each file.
This last paragraph is quite complicated. Let's simplify:
Imagine a CD with 3 files on it: file-1.txt, file-2.txt, file-3.txt.
The path table in a primary volume descriptor has one directory in it
and its extent contains 3 directory descriptor structures with 8.3
uppercase encoding. The path table in a supplementary volume
descriptor describing this same volume has one directory but its extent
is different because those 3 directory descriptor structures are
different than the previous 3. The files are not considered unique
because their extent addresses (where their data lies) are not unique.
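The idea that a file is identified by its extent address can be sketched in Python. The record names, extent addresses, and the numbering of the pseudo inodes below are all hypothetical; the sketch only shows that the directory records reached through two volume descriptors collapse to one entry per extent.

```python
def assign_pseudo_inodes(records):
    """Give each distinct extent address one pseudo inode number.

    records is an iterable of (name, extent_address) pairs, as they might
    be collected from the directory records of every volume descriptor.
    """
    inode_by_extent = {}
    for _name, extent in records:
        # The first time an extent is seen it receives the next free number.
        inode_by_extent.setdefault(extent, len(inode_by_extent))
    return inode_by_extent


# Hypothetical records: the same three files seen through a primary
# (8.3 uppercase names) and a supplementary (mixed case names) descriptor.
records = [
    ("FILE1.TXT", 100), ("FILE2.TXT", 101), ("FILE3.TXT", 102),
    ("file1.txt", 100), ("file2.txt", 101), ("file3.txt", 102),
]
print(assign_pseudo_inodes(records))
```

Six directory records produce only three pseudo inodes, because each pair of records points at the same data extent.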
OF NOTE:
-----------------------------------------------------------------------
Due to many reports of mastering software errata, there are some
issues that The Sleuth Kit handles that the specifications for ISO9660
say will never happen. The specs say that there is only one unique
primary volume descriptor per volume. The Sleuth Kit handles the
possibility of finding more and alerts the user to this.
Inodes don't really exist in ISO9660, so the implementation improvises:
anything whose extent is unique is treated as a different file. The
pseudo inode structure is stored in a linked list to make viewing an
entire image faster.
ISO9660 stores many fields in both byte orders. A 32 bit number
will take 8 bytes: the first 4 are little endian and the last 4 are
big endian.
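Reading such a both-byte-order field can be sketched with Python's struct module. The function name is illustrative, and treating a mismatch between the two halves as an error is a design choice made for this sketch rather than something the document specifies.

```python
import struct


def read_both_endian_u32(field: bytes) -> int:
    """Decode an 8-byte field holding a 32-bit value in both byte orders."""
    if len(field) != 8:
        raise ValueError("expected exactly 8 bytes")
    (little,) = struct.unpack("<I", field[:4])  # first 4 bytes: little endian
    (big,) = struct.unpack(">I", field[4:])     # last 4 bytes: big endian
    if little != big:
        raise ValueError("the two halves disagree: %d vs %d" % (little, big))
    return little


sample = struct.pack("<I", 2048) + struct.pack(">I", 2048)
print(read_both_endian_u32(sample))
```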
USING THE SLEUTH KIT WITH ISO9660
=======================================================================
The Sleuth Kit allows one to view all aspects of the ISO9660 structure.
All Sleuth Kit commands should work the same as they do with other
file system types.
Note that Autopsy can automate this process for you and allows you
to view all attributes.
http://www.sleuthkit.org/autopsy
WHAT THE SLEUTH KIT CANNOT CURRENTLY DO
=======================================================================
There are a few things that The Sleuth Kit is not yet able to do
with ISO9660:
- Multisession CDs are not handled.
- High Sierra is not handled.
- Files that are stored with an interleave gap are not handled.
-----------------------------------------------------------------
Send documentation updates to: <doc-updates at sleuthkit dot org>
The NTFS File System
Sleuth Kit Implementation Notes (SKINs)
http://www.sleuthkit.org
Brian Carrier
Last Updated: Sept 2008
INTRODUCTION
=======================================================================
The NTFS file system is used in all critical Microsoft Windows
systems. It is an advanced file system, which makes it different
from the UNIX file systems that the original TCT was designed for.
This document gives a quick overview of NTFS and how it was
implemented. The biggest difference is the use of Alternate Data
Streams (ADS) when specifying a meta data structure.
The Sleuth Kit allows one to investigate an NTFS image in the same
ways as any UNIX image, including:
- Creation of ASCII timeline of file activity
- Cluster analysis and mapping between clusters and MFT entries
- MFT analysis and mapping between MFT entries and file names
- File and directory level analysis including deleted files
NTFS OVERVIEW
=======================================================================
This provides a quick introduction to the NTFS file system. The
terms used are different than with other file systems. For a full
overview of the file system, refer to the "Inside Windows 2000"
book by Solomon and Russinovich and for details of the file system
structures, refer to the NTFS Source Forge project at:
http://linux-ntfs.sourceforge.net/ntfs/index.html
MFT
-----------------------------------------------------------------------
The Master File Table (MFT) contains entries that describe all
system files, user files, and directories. The MFT even contains
an entry (#0) that describes the MFT itself, which is how we
determine its current size. Other system files in the MFT include
the Root Directory (#5), the cluster allocation map, Security
Descriptors, and the journal.
MFT ENTRIES
-----------------------------------------------------------------------
Each MFT entry is given a number (similar to inode numbers in UNIX).
The user files and directories start at MFT #25. The MFT entry
contains a list of attributes. Example attributes include "Standard
Information" which stores data such as MAC times, "File Name" which
stores the file's or directory's name(s), $DATA which stores the
actual file content, or "Index Alloc" and "Index Root" which contain
directory contents stored in a B-Tree.
Each type of attribute is given a numerical value and more than
one instance of a type can exist for a file. The "id" value for
each attribute allows one to specify an instance. A given file
can have more than one "$Data" attribute, which is a method that
can be used to hide data from an investigator. To get a mapping
of attribute type values to name, use the 'fsstat' command. It
displays the contents of the $AttrDef system file.
Each attribute has a header and a value and an attribute is either
resident or non-resident. A resident attribute has both the header
and the content value stored in the MFT entry. This only works
for attributes with a small value (the file name for example).
For larger attributes, the header is stored in the MFT entry and
the content value is stored in Clusters in the data area. A Cluster
in NTFS is the same as in FAT: it is a consecutive group of sectors.
If a file has too many different attributes, an "Attribute List"
is used that stores the other attribute headers in additional MFT
entries.
FILES
-----------------------------------------------------------------------
Files in NTFS typically have the following attributes:
- $STANDARD_INFORMATION (#16): Contains MAC times, security ID,
Owner ID, permissions in DOS format, and quota data.
- $FILE_NAME (#48): Contains the file name in UNICODE, as well
as additional MAC times, and the MFT entry of the parent
directory.
- $OBJECT_ID (#64): Identifiers regarding the file's original
Object ID, its birth Volume ID, and Domain ID.
- $DATA (#128): The raw content data of the file.
When a file is deleted, the IN_USE flag is cleared from the MFT entry,
but the attribute contents still exist.
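A minimal Python sketch of checking the IN_USE flag follows. It assumes
the commonly documented MFT entry header layout (a "FILE" signature at
offset 0 and 16-bit flags at offset 22, with 0x0001 meaning in use and
0x0002 meaning directory); these offsets and values are assumptions,
not taken from this document, and mft_entry_state is a hypothetical
helper.

```python
import struct

MFT_FLAG_IN_USE = 0x0001      # assumed flag value
MFT_FLAG_DIRECTORY = 0x0002   # assumed flag value


def mft_entry_state(entry):
    """Report allocation state from the 16-bit flags at offset 22."""
    if entry[0:4] != b"FILE":
        raise ValueError("not an MFT entry (missing FILE signature)")
    (flags,) = struct.unpack_from("<H", entry, 22)
    return {
        "allocated": bool(flags & MFT_FLAG_IN_USE),
        "directory": bool(flags & MFT_FLAG_DIRECTORY),
    }


# Hand-built 1024-byte entry for illustration only.
entry = bytearray(1024)
entry[0:4] = b"FILE"
struct.pack_into("<H", entry, 22, MFT_FLAG_IN_USE)
print(mft_entry_state(entry))

# Deleting the file clears the flag; the rest of the entry is untouched.
struct.pack_into("<H", entry, 22, 0)
print(mft_entry_state(entry))
```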
DIRECTORIES
-----------------------------------------------------------------
Directories in NTFS are indexed to make finding a specific entry
in them faster. By default, they are stored in a B-Tree sorted in
alphabetical order. There are two attributes that describe the
B-Tree contents. Directories in NTFS typically have the following
attributes:
- $STANDARD_INFORMATION (#16): See above
- $FILE_NAME (#48): See above
- $OBJECT_ID (#64): See above
- $INDEX_ROOT (#144): The root of the B-Tree. The $INDEX_ROOT
value is one or more "Index Entry" structures that each
describe a file or directory. The "Index Entry" structure
contains a copy of the "$FILE_NAME" attribute for the file or
sub-directory.
- $INDEX_ALLOCATION (#160): The sub-nodes of the B-Tree. For
small directories, this attribute will not exist and all
information will be saved in the $INDEX_ROOT structure. The
content of this attribute is one or more "Index Buffers". Each
"Index Buffer" contains one or more "Index Entry" structures,
which are the same ones found in the $INDEX_ROOT.
- $BITMAP (#176): This describes which structures in the B-Tree
are being used.
When files are deleted from a directory, the tree node is removed
and the tree is resorted. Therefore, the "Index Entry" for the
deleted file may be written over when the tree is resorted. This
is different from what is usually seen with UNIX and FAT file
systems, which always have the original name and structure until
a new file is created. Also, when the tree is resorted, a file
that is at the bottom of the tree can be moved up, leaving a stale
copy of its name at the original location. That name looks deleted
even though the user never deleted the file.
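The effect of resorting can be illustrated with a deliberately
simplified Python model. It treats one tree node as a fixed list of
slots and is not how NTFS or The Sleuth Kit stores index entries; the
function remove_entry and the file names are hypothetical.

```python
def remove_entry(slots, used, name):
    """Remove `name` from the first `used` slots by shifting later
    entries up; return the new used count. Slots past the used count
    are left untouched, as slack space would be on disk."""
    idx = slots.index(name, 0, used)
    for i in range(idx, used - 1):
        slots[i] = slots[i + 1]   # overwrites the deleted entry
    return used - 1


slots = ["a.txt", "b.txt", "c.txt", "d.txt"]
used = remove_entry(slots, 4, "b.txt")
print(slots[:used])   # live entries
print(slots[used:])   # stale entries left in slack
```

In this model the name of the deleted file (b.txt) is overwritten,
while a second copy of d.txt, which was never deleted, remains in the
unused slot.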
USING THE SLEUTH KIT WITH NTFS
=======================================================================
The Sleuth Kit allows one to view all aspects of the NTFS structure.
The biggest difference with using The Sleuth Kit with NTFS instead
of UNIX file systems is the attributes. With UNIX you only need
to reference the inode number because there is only one piece of
content for the file. With NTFS, one can specify just the MFT
number, in which case the default data attribute is used, or append
the attribute type to the MFT entry number (36-128, for example).
If more than one attribute of the same type exists, the id can be
added after the type (36-128-5, for example).
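The three address forms can be split apart with a few lines of Python.
This is a sketch of the notation only, not the parser used inside The
Sleuth Kit, and parse_meta_addr is a hypothetical name.

```python
def parse_meta_addr(text):
    """Split 'MFT', 'MFT-TYPE', or 'MFT-TYPE-ID' into integers.
    Missing parts are returned as None."""
    parts = text.split("-")
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected MFT[-TYPE[-ID]]: %r" % text)
    nums = [int(p) for p in parts]
    nums += [None] * (3 - len(nums))
    return tuple(nums)


for addr in ("36", "36-128", "36-128-5"):
    print(addr, parse_meta_addr(addr))
```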
All Sleuth Kit tools can take MFT values in any of the above formats
and output from the tools will also be in one of the above formats.
For example, the 'istat' tool will list all attributes a file has.
To get the details of MFT entry 49, use:
# istat -f ntfs ntfs.dd 49
MFT Entry: 49
Sequence: 2
Allocated
UID: 0
DOS Mode: File
Size: 15
Links: 1
Name: multiple.txt
$STANDARD_INFORMATION Times:
File Modified: Mon Nov 5 19:58:27 2001
MFT Modified: Mon Nov 5 19:58:27 2001
Accessed: Mon Nov 5 19:58:27 2001
$FILE_NAME Times:
Created: Mon Nov 5 19:57:29 2001
File Modified: Mon Nov 5 19:57:29 2001
MFT Modified: Mon Nov 5 19:57:29 2001
Accessed: Mon Nov 5 19:57:29 2001
Attributes:
Type: $STANDARD_INFORMATION (16-0) Name: N/A Resident size: 72
Type: $FILE_NAME (48-2) Name: N/A Resident size: 90
Type: $OBJECT_ID (64-3) Name: N/A Resident size: 16
Type: $DATA (128-1) Name: $Data Resident size: 15
Type: $DATA (128-5) Name: overhere Resident size: 26
We see that it has 5 attributes and that all of them are resident
(notice the small sizes). Two of the attributes are $DATA attributes
(128-1 and 128-5). The full name of 128-1 is 'multiple.txt' and the
full name of 128-5 is 'multiple.txt:overhere'.
The following command would display the default data attribute
(128-1):
# icat -f ntfs ntfs.dd 49
The following is the same:
# icat -f ntfs ntfs.dd 49-128-1
The following displays the other data stream:
# icat -f ntfs ntfs.dd 49-128-5
As an additional example, the raw format of the $FILE_NAME attribute
can be viewed using:
# icat -f ntfs ntfs.dd 49-48-2
The output of the above command would be a combination of UNICODE
characters and other binary data (I would recommend just using the
output of the istat command for this type of data).
The output of the 'fls' command is similar:
# fls -f ntfs ntfs.dd
<...>
r/r 48-128-1: test-1.txt
r/r 49-128-1: multiple.txt
r/r 49-128-5: multiple.txt:overhere
r/r 50-128-1: test-2.txt
<...>
This allows you to easily identify all data streams.
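Listings in this format can be post-processed to find files with more
than one data stream. The Python sketch below is illustrative only: it
assumes the line layout shown above (plus the '*' marker that 'fls'
prints for deleted names), the sample lines are made up, and
group_streams is a hypothetical helper.

```python
import re
from collections import defaultdict

# type/type, optional '*' for deleted names, MFT-TYPE-ID, colon, name.
FLS_LINE = re.compile(r"^(\S+)\s+(?:\*\s+)?(\d+)-(\d+)-(\d+):\s+(.*)$")


def group_streams(lines):
    """Map MFT entry number -> list of (attribute address, name) for
    $DATA (type 128) attributes."""
    streams = defaultdict(list)
    for line in lines:
        m = FLS_LINE.match(line)
        if m is None:
            continue  # skip elided or unrecognised lines
        entry, atype, aid = (int(m.group(i)) for i in (2, 3, 4))
        if atype == 128:
            addr = "%d-%d-%d" % (entry, atype, aid)
            streams[entry].append((addr, m.group(5)))
    return dict(streams)


# Made-up sample lines in the format shown above.
sample = [
    "r/r 60-128-1:\ta.txt",
    "r/r 60-128-4:\ta.txt:hidden",
    "r/r 61-128-1:\tb.txt",
]
multi = {e: s for e, s in group_streams(sample).items() if len(s) > 1}
print(multi)
```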
Note that Autopsy can automate this process for you and allows you
to view all attributes.
http://www.sleuthkit.org/autopsy
WHAT THE SLEUTH KIT CANNOT CURRENTLY DO
=======================================================================
There are a few things that The Sleuth Kit is not yet able to do
with NTFS:
- The Security Descriptors are not yet analyzed. Therefore, the
exact ACLs of the object cannot be displayed.
- Directories that are indexed by a descriptor other than the file
name are not supported.
- Encrypted files are not supported.
-----------------------------------------------------------------
Send documentation updates to: <doc-updates at sleuthkit dot org>
Copyright (c) 2002-2008 by Brian Carrier. All Rights Reserved
Windows Implementation
Sleuth Kit Implementation Notes
http://www.sleuthkit.org
Brian Carrier
Last Updated: Sept 2008
INTRODUCTION
=======================================================================
Version 2.06 of The Sleuth Kit included support for Microsoft Windows.
There were several design changes that needed to occur so that TSK could
run on both Windows and Unix systems. The biggest change, and the focus
of this document, was how Unicode and non-English characters were dealt
with.
PROBLEM
=======================================================================
Unicode characters can be stored in multiple formats. Unix systems
use UTF-8, which stores the characters in 1, 2, 3, or 4 bytes. Windows
uses UTF-16, which stores characters in 2 or 4 bytes. Because of
this difference, the input to and output of TSK is different on Windows
versus Unix.
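The size difference is easy to see by encoding a few characters both
ways. The Python sketch below is only an illustration of the two
encodings; the sample characters are arbitrary and encoded_sizes is a
hypothetical helper.

```python
def encoded_sizes(ch):
    """Return (UTF-8 bytes, UTF-16 bytes) needed for one character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


# ASCII letter, accented letter, euro sign, and a character outside
# the Basic Multilingual Plane.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    print("U+%04X" % ord(ch), encoded_sizes(ch))
```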
SOLUTION
=======================================================================
The solution to this problem was to create many C #defines that map
a general name to the specific function or type that is used on each
platform. Internally, all code uses the UTF-8 encoding. This means
that the input and output may need to be converted on Windows.
The input data consists of image file names, image and file system types,
and addresses. There is no need to convert the file names because the
native system calls need the same format as the input. For the image,
volume, and file system types, I assume that they will always be in
English and therefore they are easily converted to ASCII on Windows.
Lastly, addresses in a string form are easy to convert to an integer
and this is done using either UTF-8 or UTF-16 atoi-type functions.
For output, the printf and fprintf functions were wrapped with
TSK-specific versions. The wrappers will convert the UTF-8 code to
UTF-16, if needed, and then print the resulting data.
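The idea behind the output wrappers can be sketched in Python. This is
not the TSK wrapper code, which is written in C; the function name
write_converted and the to_utf16 switch (standing in for the Windows
case) are assumptions made for illustration.

```python
import io


def write_converted(stream, utf8_text, to_utf16):
    """Write UTF-8 encoded bytes to `stream`, converting to UTF-16
    first when `to_utf16` is true. Returns the number of bytes
    written."""
    if to_utf16:
        data = utf8_text.decode("utf-8").encode("utf-16-le")
    else:
        data = utf8_text
    stream.write(data)
    return len(data)


name = "na\u00efve.txt".encode("utf-8")   # internal UTF-8 form
unix_out, win_out = io.BytesIO(), io.BytesIO()
print(write_converted(unix_out, name, to_utf16=False))
print(write_converted(win_out, name, to_utf16=True))
```

The caller always passes UTF-8, so the conversion decision stays in one
place, which matches the reason given above for wrapping printf and
fprintf.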
Therefore, few changes occurred to the volume and file system code except
that the printf wrappers were used. The command line tools needed to
be changed to handle the 2-byte TCHAR values as input and to use the T*
functions, which map to either UTF-8 or UTF-16 functions.
Update: When support was added for the mingw cross-compiler, further
changes were needed. The biggest was that the command line arguments
in the tools have to be obtained via GetCommandLineW() instead of
using wmain() because mingw does not support wmain().
-----------------------------------------------------------------------
Send documentation updates to: <doc-updates at sleuthkit dot org>
Copyright (c) 2006-2008 by Brian Carrier. All Rights Reserved