diff --git a/README.md b/README.md
index f094be4a0f0b8c48fd9b14515f37b901f8244239..4a9f12e25f44a8d377b3914b936651ab28ec2f01 100644
--- a/README.md
+++ b/README.md
@@ -12,11 +12,11 @@ The dataset used consists of 50k labeled IMDb movie reviews. Due to hardware cons
 
 <center>
 
-|           | Train | Test  |
-| :-------: | :---: | :---: |
-| Positive  | 5,189 | 4,766 |
-| Negative  | 1,707 | 1,613 |
-| **Total** | 9,955 | 3,319 |
+|           | Train | Valid | Test  |
+| :-------: | :---: | :---: | :---: |
+| Positive  | 4,849 | 1,033 | 1,013 |
+| Negative  | 4,442 |  958  |  979  |
+| **Total** | 9,291 | 1,991 | 1,992 |
 
 </center>
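+
+As a rough sketch, the subsample and split could be produced as follows. The 70/15/15 ratio matches the counts in the table above, but the fixed seed and the use of scikit-learn's `train_test_split` are assumptions:
+
+```python
+# Sketch only: the 70/15/15 ratio follows from the table above; the seed
+# and the splitting utility are assumptions.
+from datasets import load_dataset
+from sklearn.model_selection import train_test_split
+
+imdb = load_dataset("imdb")  # 25k train + 25k test labeled reviews
+texts = imdb["train"]["text"] + imdb["test"]["text"]
+labels = imdb["train"]["label"] + imdb["test"]["label"]
+
+# Keep a ~13.3k subsample (hardware constraints), then split it 70/15/15.
+sub_x, _, sub_y, _ = train_test_split(
+    texts, labels, train_size=13_274, random_state=42)
+train_x, rest_x, train_y, rest_y = train_test_split(
+    sub_x, sub_y, train_size=0.70, stratify=sub_y, random_state=42)
+valid_x, test_x, valid_y, test_y = train_test_split(
+    rest_x, rest_y, test_size=0.50, stratify=rest_y, random_state=42)
+```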
 
@@ -36,7 +36,7 @@ our model, precision, recall and f1-score will serve as a complement to the a
 
-As baseline for this project, a regular BERT model has been implemented and fine tuned on the task of classifying the sentiment of IMDb reviews.
+As a baseline for this project, a regular BERT model has been implemented and fine-tuned on the task of classifying the sentiment of IMDb reviews.
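+
+A minimal sketch of such a baseline, assuming `bert-base-uncased` and the Hugging Face `Trainer` (only the epoch count and batch size below are taken from this README):
+
+```python
+# Baseline sketch; bert-base-uncased and the Trainer API are assumptions.
+from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
+                          Trainer, TrainingArguments)
+
+tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
+model = AutoModelForSequenceClassification.from_pretrained(
+    "bert-base-uncased", num_labels=2)
+
+def tokenize(batch):
+    return tokenizer(batch["text"], truncation=True,
+                     padding="max_length", max_length=256)
+
+# train_ds / valid_ds: datasets.Dataset splits with "text" and "label" columns.
+train_ds = train_ds.map(tokenize, batched=True)
+valid_ds = valid_ds.map(tokenize, batched=True)
+
+args = TrainingArguments(output_dir="baseline-bert",
+                         num_train_epochs=1,              # from the README
+                         per_device_train_batch_size=32)  # from the README
+Trainer(model=model, args=args, train_dataset=train_ds,
+        eval_dataset=valid_ds).train()
+```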
 
-Training our baseline model for 1 epoch using a batch size of 32 yielded the following results:
+Training our baseline model for 1 epoch using a batch size of 32 yielded the following average results:
 
 <center>
 
@@ -50,7 +50,7 @@ Training our baseline model for 1 epoch using a batch size of 32 yielded the fol
 
 ### Method 1
 
-Method 1 implements a multi layer perceptron to combine the fine-tuned BERT model from our baseline with VAD-scores from VADER. Training the MLP implementation yielded results as follows:
+Method 1 implements a multi-layer perceptron (MLP) that combines the fine-tuned BERT model from our baseline with the VAD-scores from VADER.
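+
+A minimal sketch of one possible wiring, assuming the MLP takes BERT's two class logits concatenated with VADER's four polarity scores (`neg`, `neu`, `pos`, `compound`); the feature choice and hidden size are assumptions rather than the exact architecture used here:
+
+```python
+# Method 1 sketch: MLP over [BERT logits (2) | VADER scores (4)].
+import torch
+import torch.nn as nn
+from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
+
+analyzer = SentimentIntensityAnalyzer()
+
+def vader_features(texts):
+    # One row of [neg, neu, pos, compound] per review.
+    scores = [analyzer.polarity_scores(t) for t in texts]
+    return torch.tensor([[s["neg"], s["neu"], s["pos"], s["compound"]]
+                         for s in scores])
+
+class BertVaderMLP(nn.Module):
+    def __init__(self, in_dim=6, hidden=16):  # hidden size is arbitrary
+        super().__init__()
+        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
+                                 nn.Linear(hidden, 2))
+
+    def forward(self, bert_logits, vader_scores):
+        # bert_logits: (batch, 2); vader_scores: (batch, 4)
+        return self.net(torch.cat([bert_logits, vader_scores], dim=-1))
+```
+
+Training the MLP implementation yielded the following average results: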
 
 <center>
 
@@ -62,7 +62,7 @@ Method 1 implements a multi layer perceptron to combine the fine-tuned BERT mode
 
 ### Method 2
 
-Method 2 assigns weights to the individual results from the fine-tuned BERT and VADER and combines the models with different weight-combinations. The best combination of weights yielded the following results:
+Method 2 assigns weights to the individual predictions of the fine-tuned BERT model and of VADER, and combines the two models under different weight combinations.
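+
+One plausible implementation, sketched below as a grid search over a single convex weight `w` on the validation set, with VADER's compound score rescaled to [0, 1]; the exact weighting scheme is an assumption:
+
+```python
+# Method 2 sketch: convex combination of BERT and VADER, tuned on validation.
+import numpy as np
+
+def combine(bert_pos_prob, vader_compound, w):
+    vader_pos = (vader_compound + 1) / 2        # map [-1, 1] -> [0, 1]
+    return w * bert_pos_prob + (1 - w) * vader_pos
+
+# bert_probs, vader_comp, valid_labels: NumPy arrays over the validation set.
+best_w, best_acc = 0.0, 0.0
+for w in np.linspace(0, 1, 21):                 # 21 weight combinations
+    preds = (combine(bert_probs, vader_comp, w) >= 0.5).astype(int)
+    acc = (preds == valid_labels).mean()
+    if acc > best_acc:
+        best_w, best_acc = w, acc
+```
+
+The best combination of weights yielded the following average results: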
 
 <center>