The scenario is a simplified setting with the following components.
Detections are modelled very simplistically in this scenario.
Each aircraft has its own model of Situational Awareness (SA) for its opponents.
If an observation of an opponent is made (or received from a teammate),
the aircraft's own SA is updated.
...
To create a virtual environment and install dependencies, run the following commands:
```bash
>>># ... (create and activate the virtual environment)
>>> pip install -r requirements.txt  # or pip install -r requirements_deploy.txt
```
## Using the environment
The environment creates a scenario between two opposing teams, green and red. Each team uses a specific behaviour
that determines what radar actions the aircraft in the team take. All available behaviours can be found in [behaviour.py](behaviour.py).
The main files for running scenario tests are:
- main.py
- run_testing.sh
- run_scenarios.sh

Before running any type of test, the virtual environment must be activated.
### Performing a single test
Use the [main.py](main.py) file to run a single test, which can be configured with several different flags.
See the `--help` flag for more information about all the available flags.
```bash
>>># source virtual environment
...
>>> python main.py --help
```
### Performing a test of developed behaviours
Three behaviours, based on mixed integer programming, reinforcement learning and algorithm selection, have been developed specifically to be analysed and evaluated in this environment.
To run a test that pits these three behaviours against all other available behaviours, use the file [run_testing.sh](run_testing.sh).
There are some environment variables in the file that can be used to configure a run.
```bash
>>># source virtual environment
>>> sh run_testing.sh
```
For details about the developed behaviours, see [mip_behaviour.py](mip_behaviour.py), [rl_behavior.py](rl_behavior.py) and [algsel_behaviour.py](algsel_behaviour.py).
### Perform all tests
To run a test of all behaviours, use the file [run_scenarios.sh](run_scenarios.sh).
The same environment variables that are available in `run_testing.sh` are available here as well.
The default settings pit all behaviours in the test list against each other 200 times, which takes a considerable amount of time.
```bash
>>># source virtual environment
>>> sh run_scenarios.sh
```
## Analysis
There are a few ways of analysing the results from the test runs.
The main files for analysing the tests are:
- test_analysis.py
- analyse.py

### Perform basic analysis
Perform analysis on a CSV file with data. This gives information about which team won a given metric in a scenario.
```bash
>>># source virtual environment
>>> python analyse.py PATH_TO_CSV_FILE
```
### Perform extended analysis
Perform more extensive analysis on a folder of CSV files. This prints win/draw/loss percentages in the terminal
and can plot the score of each metric at each time step. Use the `--help` flag for a list of options when analysing.
```bash
>>># source virtual environment
>>> python test_analysis.py PATH_TO_CSV_FOLDER
```
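If you want a quick custom look at the raw results outside of test_analysis.py, the CSV files can also be inspected directly, for example with pandas. The sketch below is only an illustration: the column names (`time_step`, `green_score`, `red_score`) are hypothetical and must be replaced with the headers actually written by the test runs.
```python
# Minimal sketch for custom inspection of the result CSVs.
# The column names used here are assumptions; check the real files for the actual headers.
from pathlib import Path

import pandas as pd
import matplotlib.pyplot as plt

csv_folder = Path("results")  # hypothetical path to a folder of result CSVs
frames = [pd.read_csv(f) for f in sorted(csv_folder.glob("*.csv"))]
data = pd.concat(frames, ignore_index=True)

# Average score per time step for each team (column names are assumptions).
mean_scores = data.groupby("time_step")[["green_score", "red_score"]].mean()
mean_scores.plot(title="Average score per time step")
plt.xlabel("time step")
plt.ylabel("score")
plt.show()
```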
### Training an RL-behaviour
Behaviours can be trained using the deep reinforcement learning interface available in the mdptest branch. To influence how a behaviour is trained, the training settings and the reward function can be changed. To start the training, simply run the multiagent_rl_train.py script within the virtual environment.
```bash
>>># source virtual environment
>>> python multiagent_rl_train.py
```
### Changing the reward
The reward can be set in the reward method of the ScalarizedRewardWrapper class, in the Python file scalarized_reward_wrapper.py, by changing the new_reward dict in the else branch. See the code for more information on the different options and how to select them.
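As a loose illustration of what that scalarization amounts to, the sketch below weights a few per-step metrics into a single scalar reward. The metric names and weights are hypothetical; the real keys and options are documented next to the new_reward dict in scalarized_reward_wrapper.py.
```python
# Illustrative sketch only: the metric names and weights are assumptions,
# not the keys actually used in scalarized_reward_wrapper.py.
def scalarize(metrics: dict) -> float:
    """Collapse several per-step metrics into one scalar reward."""
    new_reward = {
        "detection": 1.0 * metrics.get("detections_made", 0.0),  # hypothetical metric
        "stealth": -0.5 * metrics.get("times_detected", 0.0),    # hypothetical metric
        "sa_quality": 0.2 * metrics.get("sa_accuracy", 0.0),     # hypothetical metric
    }
    # The scalarized reward is the sum of the weighted terms.
    return sum(new_reward.values())

print(scalarize({"detections_made": 2, "times_detected": 1, "sa_accuracy": 0.8}))
```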
### Changing the settings
In the config.py file there are many settings available for changing the simulation environment, the training environment, and the network parameters. For the simulation environment, there are options to change the number of planes per team and which behaviour to train against, as well as whether a specific seed should be used for randomisation. For the training environment, the number of simulations running in parallel and whether to use evaluation can be configured. For the network parameters, there are a multitude of options. See the config.py file for a description of all of them.
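As a rough sketch of the kinds of entries this refers to (only use_eval is a name taken from this README; the other names and all values below are hypothetical, so check config.py for the real ones):
```python
# Sketch of config.py-style settings; entries marked "hypothetical" are illustrative only.

# Simulation environment
num_planes_per_team = 2        # hypothetical: aircraft per team
opponent_behaviour = "random"  # hypothetical: behaviour to train against
seed = None                    # hypothetical: set an int for reproducible randomisation

# Training environment
num_parallel_envs = 8          # hypothetical: simulations running in parallel
use_eval = True                # whether to run evaluation during training

# Network parameters (a multitude; see config.py for all of them)
learning_rate = 3e-4           # hypothetical
```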
### Continuing to train an old behaviour
It is possible to continue training an old behaviour. To do this, change the model_dir config in config.py to the path of the models directory of the model you want to keep training. If you want to start a new training from scratch, set model_dir to None.
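For example, assuming a hypothetical path to a previously trained model:
```python
# In config.py: continue training from a saved model, or start fresh.
model_dir = "path/to/old_run/models"  # hypothetical path to the model's "models" directory
# model_dir = None                    # start a new training from scratch
```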
### Rendering of the behaviour
It is possible to render the scenario during the evaluation simulations. To enable this, make sure that use_eval is set to True in config.py. Then, also in config.py, set eval_render to True, and use the visualise_delay and only_delay_middle settings to control if and how the rendering is slowed down. Now, when a behaviour is trained, a window will open during evaluation and the evaluation simulation will be rendered.
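Putting those settings together, a rendering-enabled configuration might look like the sketch below; the values are only examples, and the comments on visualise_delay and only_delay_middle are assumptions about their meaning.
```python
# In config.py: render the evaluation simulations during training.
use_eval = True            # evaluation episodes must be enabled for rendering
eval_render = True         # open a window and render during evaluation
visualise_delay = 0.05     # example value; assumed to be the delay added per rendered step
only_delay_middle = False  # example value; assumed to restrict the slow-down to part of the episode
```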