From 31c3ad0544f43f01aecceaedda75123aa21bb241 Mon Sep 17 00:00:00 2001
From: Brian Carrier <carrier@sleuthkit.org>
Date: Fri, 9 Apr 2021 18:40:27 -0400
Subject: [PATCH] Starting point for new docs

---
 bindings/java/doxygen/Doxyfile        |   2 +
 bindings/java/doxygen/datasources.dox |  33 ++++++++
 bindings/java/doxygen/main.dox        |   2 +
 bindings/java/doxygen/os_accounts.dox | 115 ++++++++++++++++++++++++++
 4 files changed, 152 insertions(+)
 create mode 100644 bindings/java/doxygen/datasources.dox
 create mode 100644 bindings/java/doxygen/os_accounts.dox

diff --git a/bindings/java/doxygen/Doxyfile b/bindings/java/doxygen/Doxyfile
index d5819781d..adb110b6d 100644
--- a/bindings/java/doxygen/Doxyfile
+++ b/bindings/java/doxygen/Doxyfile
@@ -765,6 +765,8 @@ INPUT                  = main.dox \
                          artifact_catalog.dox \
                          insert_and_update_database.dox \
                          communications.dox \
+                         datasources.dox \
+                         os_accounts.dox \
                          schema/schema_list.dox \
                          schema/db_schema_8_6.dox \
                          schema/db_schema_9_0.dox \
diff --git a/bindings/java/doxygen/datasources.dox b/bindings/java/doxygen/datasources.dox
new file mode 100644
index 000000000..103121e75
--- /dev/null
+++ b/bindings/java/doxygen/datasources.dox
@@ -0,0 +1,33 @@
+/*! \page mod_dspage Data Sources, Hosts, and Persons
+  
+\section ds_overview Overview
+This page outlines some of the core concepts around data sources and how they are organized. 
+
+\section ds_ds Data Sources
+An org.sleuthkit.datamodel.DataSource represents a set of data that is added to a case. Example data sources include:
+- A disk or phone image
+- A set of logical files
+- A report from another forensics tool
+
+The case database and its objects are generally organized as a tree.  The data sources are usually the top-most items in the tree.
+You can call org.sleuthkit.datamodel.SleuthkitCase.getDataSources() to get all of the data sources in a case. From there, you can call getChildren() recursively to walk from, for example, the disk image to volumes, to file systems, to files and subfolders.
+
+You can add data sources using various SleuthkitCase methods, such as org.sleuthkit.datamodel.SleuthkitCase.makeAddImageProcess().
+
+
+\section ds_hosts Hosts
+All data sources must be associated with an org.sleuthkit.datamodel.Host. A host represents the device that the data source came from.  Some hosts will have only a single data source, such as a computer with one hard drive.  Others may have multiple data sources, such as a phone with an image of the handset and another image of a media card.
+
+If you later learn that two data sources are from the same device, you can merge the hosts.
+
+Hosts are managed from org.sleuthkit.datamodel.HostManager. 
+
+NOTE: Hosts are different from org.sleuthkit.datamodel.HostAddress.  A Host is for devices that were seized and added to the case.  A HostAddress is the address of any external host that was found during the analysis of a data source. For example, a HostAddress for "www.sleuthkit.org" could be created based on web history artifacts.
+
+
+\section ds_person Persons
+You can optionally associate a host with an org.sleuthkit.datamodel.Person.  This makes it easier to organize data in a large case. The concept is that you have a data source because it is owned or used by a given person, so you can group that person's data sources together.
+
+Persons are managed from org.sleuthkit.datamodel.PersonManager. 
+
+*/
diff --git a/bindings/java/doxygen/main.dox b/bindings/java/doxygen/main.dox
index eb5021368..bf9ba7cb4 100644
--- a/bindings/java/doxygen/main.dox
+++ b/bindings/java/doxygen/main.dox
@@ -37,6 +37,8 @@ You can also access the data in its tree form by starting with org.sleuthkit.dat
 
 \section main_other Other Topics
 
+- \subpage mod_dspage describes data source organization 
+- \subpage mod_os_accounts_page
 - \subpage mod_bbpage is where analysis modules (such as those in Autopsy) can post and save their results. 
 - The \subpage artifact_catalog_page gives a list of the current artifacts and attributes used on \ref mod_bbpage.
 - \subpage mod_compage is where analysis modules can store and retrieve communications-related data. 
diff --git a/bindings/java/doxygen/os_accounts.dox b/bindings/java/doxygen/os_accounts.dox
new file mode 100644
index 000000000..9b6625d17
--- /dev/null
+++ b/bindings/java/doxygen/os_accounts.dox
@@ -0,0 +1,115 @@
+/*! \page mod_os_accounts_page OS Accounts and Realms
+
+\section os_acct_overview Overview
+
+This page outlines some of the core concepts around OS Accounts and Realms and how they are stored.
+OS Accounts are more complex than other data types in the TSK data model because
+their details are often only partially known at any given point during processing.
+
+\section os_acct_basics Basic Terminology
+
+- An OS account allows a person to do some action or access some resource on a device. 
+- A realm is the scope wherein the OS Account is defined. A realm can be scoped to a single host (i.e. for accounts that exist only on a single host) or to a network domain (such as Windows domain accounts). 
+
+
+\section os_acct_challenges OS Account Challenges
+
+A key challenge with OS Accounts is that we do not know the account information until we have started to parse files and OS configuration files. We may never know the details if, for example, we have only a media card.
+
+As a user adds a disk image to the case, we may learn about account addresses from the files, but we won't yet know the account name or whether it is domain-scoped or local-scoped. The basic properties of the realm and account may therefore change as more data is ingested and analyzed. This could even require merging realms and accounts.
+
+Another difference from other data types in the TSK data model is that OS Accounts may span multiple data sources if they are domain accounts. Therefore, they are not "children" of a data source.  They exist outside of the usual tree model in TSK.
+
+\section os_acct_realm OS Account Realms
+
+An org.sleuthkit.datamodel.OsAccountRealm represents the scope of a set of OS Accounts. Its scope is defined by org.sleuthkit.datamodel.OsAccountRealm.RealmScope.  By default, the scope is host-level with an org.sleuthkit.datamodel.OsAccountRealm.ScopeConfidence of inferred. As more is learned, the confidence and scope can be made more specific.
+
+A realm has two core fields:
+- Address that the OS uses internally, such as part of a Windows SID
+- Name, which is what users more often see
+
+When searching for realms, the address has priority over the name. With Windows systems, we often have a realm address from SIDs but not a specific realm name.
+
+Realms are managed by org.sleuthkit.datamodel.OsAccountRealmManager.
+
+
+\section os_acct_acct OS Accounts
+
+An org.sleuthkit.datamodel.OsAccount represents an account that was configured in an operating system. It must be defined within the scope of an OsAccountRealm.
+
+It has two core fields:
+- Login name that the user enters (such as jdoe)
+- Address that the OS uses internally (such as a UID of 0 or a Windows SID) 
+
+OS Accounts also have other properties, such as full name and creation date, that can be set after the account is created.
+
+OS Accounts are managed by org.sleuthkit.datamodel.OsAccountManager.
+
+\subsection os_acct_acct_os Supported Operating Systems
+
+At this point, APIs exist only for Windows accounts, such as:
+- org.sleuthkit.datamodel.OsAccountManager.newWindowsOsAccount()
+- org.sleuthkit.datamodel.OsAccountManager.getWindowsOsAccount()
+
+In the future, additional methods will be created for other OSes.
+
+The underlying database schema supports other OSes, but the utility APIs do not exist to populate them other than with Windows SIDs.
+
+\section os_account_storing Storing Original Account Data
+
+We recommend saving the OS account addresses or names that were parsed from the data source alongside any references to OsAccount objects. For example, the TSK database stores the UID or SID that was stored in a file system for a file in addition to the reference to the OsAccount object that is associated with that address.  This helps to ensure the original data is preserved in case an OsAccount can't be created, gets deleted, or is incorrectly merged.
+
+
+\section os_acct_example Example Creation & Update Code
+
+There are three unique elements to creating and updating OS Accounts when adding data to the case database:
+
+1) You cannot create or update OS accounts in a multi-step org.sleuthkit.datamodel.SleuthkitCase.CaseDbTransaction. To avoid duplicates across multi-node systems, you need to insert and update in a single step. If you have a transaction open while creating accounts, the database will likely deadlock in single-user cases because the thread cannot have two connections at the same time.
+
+This means that if you are using CaseDbTransaction to add a lot of files or artifacts, you'll need to:
+- Pre-process the data to identify which accounts you will need references to
+- See if the OS Accounts already exist and update them or make new ones
+- Add the files and artifacts with references to the OsAccounts
+
+2) You need to check if you have more information than what is already stored (e.g., the realm name may have been unknown).
+
+3) You need to record that an OS Account was referenced on a given data source because OS Accounts are stored in parallel to data sources and are not children of them.
+
+Here are some examples.
+
+\subsection os_acct_ex_get Adding a File or Data Artifact
+
+If you pass in an OsAccount to the various methods to add files and data artifacts, then the DB will make the association and record the occurrence. All you need to do is get the account.  You can do that with org.sleuthkit.datamodel.OsAccountManager.getWindowsOsAccount(). Note that the call will sometimes fail if the SID associated with the file is for a group, which can happen when the OS Account has admin rights.
+
+If you get an OsAccount, you can try to update it if you think you may have new information.
+
+Here is example pseudo-code:
+
+\code
+OsAccount osAcct = null;
+
+try {
+    Optional<OsAccount> osAcctOpt = getWindowsOsAccount("S-....", "jdoe", "ACME", host);
+    if (osAcctOpt.isPresent()) {
+            osAcct = osAcctOpt.get();
+            updateWindowsOsAccount(osAcct, "S-.....", "jdoe", "ACME", host);
+    }
+    else {
+            osAcct = newWindowsOsAccount("S-....", "jdoe", "ACME", host);
+    }
+}
+catch (NotUserSIDException ex) {
+    // Ignore this SID
+}
+
+// Pass in osAcct when making artifacts and files 
+\endcode
+
+\subsection os_acct_ex_update Parsing OS Configuration Data
+
+When parsing the Windows registry or other OS configuration files, you may find updated information about OS Accounts.  You can call various org.sleuthkit.datamodel.OsAccountManager methods to get and update the accounts.  When adding extended attributes, you can choose to limit the scope of the attribute to the single host being parsed or make it domain-level.
+
+You should make sure to call org.sleuthkit.datamodel.OsAccountManager.newOsAccountInstance() to ensure it is recorded that there was at least some reference to the account on that data source.  Otherwise, the account will not be associated with the data source unless there were also files or artifacts that were mapped to it.
+
+
+*/
-- 
GitLab