From 4f331eff92a3b5c3d62282c68d06ed30fa0ac5de Mon Sep 17 00:00:00 2001
From: Daniel Tomlinson
Date: Mon, 27 Sep 2021 19:02:23 +0100
Subject: [PATCH] Deployed cad6612 with MkDocs version: 1.2.2

---
 404.html                        |  136 ++-
 discussion/exploration.html     |  548 ++++++++++++
 discussion/introduction.html    |  553 ++++++++++++
 documentation/installation.html |  140 ++-
 documentation/usage.html        |  155 +++-
 index.html                      |  140 ++-
 pandas-profiling/report.html    | 1439 +++++++++++++++++++++++++++++++
 search/search_index.json        |    2 +-
 sitemap.xml                     |   10 +
 sitemap.xml.gz                  |  Bin 195 -> 198 bytes
 10 files changed, 3111 insertions(+), 12 deletions(-)
 create mode 100644 discussion/exploration.html
 create mode 100644 discussion/introduction.html
 create mode 100644 pandas-profiling/report.html

diff --git a/404.html b/404.html
index 65f92e8..d68a870 100644
--- a/404.html
+++ b/404.html
@@ -158,6 +158,59 @@
+
+
+
@@ -170,8 +223,10 @@ @@ -317,7 +449,7 @@
- + diff --git a/discussion/exploration.html b/discussion/exploration.html new file mode 100644 index 0000000..0472cb8 --- /dev/null +++ b/discussion/exploration.html @@ -0,0 +1,548 @@ + + + + + + + + + + + + + + + + + Data Exploration Report - The Street Group Technical Test + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + +
+
+
+ + + +
+
+
+ + +
+
+
+ + +
+
+ + + + + + + +

Data Exploration Report

+

The full dataset was briefly explored with the pandas-profiling module. The module loads a dataset with pandas and automatically produces quantile and descriptive statistics, common values, extreme values, skew, kurtosis, and more.

+

The script used to generate this report is located in ./exploration/report.py.
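
To make the statistics named above concrete, here is a minimal standard-library sketch that computes a few of them by hand. It is not pandas-profiling and not the contents of ./exploration/report.py; the function name `describe` and the sample prices are made up for illustration, and the skew and kurtosis use the simple population-moment definitions, which may differ slightly from the estimators the report uses.

```python
import statistics


def describe(values):
    """Return a few descriptive statistics for a list of numbers.

    Hypothetical helper for illustration only.
    """
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    # Quartiles using the statistics module's default ("exclusive") method.
    q1, median, q3 = statistics.quantiles(values, n=4)
    # Standardised third and fourth central moments.
    skew = sum((v - mean) ** 3 for v in values) / (n * std ** 3)
    # Excess kurtosis: 0 for a normal distribution.
    kurtosis = sum((v - mean) ** 4 for v in values) / (n * std ** 4) - 3
    return {
        "count": n,
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": mean,
        "std": std,
        "skew": skew,
        "kurtosis": kurtosis,
    }


if __name__ == "__main__":
    # Made-up sale prices, not taken from the real dataset.
    sample_prices = [125000, 180000, 210000, 249950, 325000, 1200000]
    for name, value in describe(sample_prices).items():
        print(f"{name}: {value}")
```

A single very large value, such as the last sample price, pulls the mean above the median and produces a positive skew, which is the kind of pattern the report surfaces automatically for every column.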

+

The report can be viewed by clicking the Data Exploration Report tab at the top of the page.

+ + + + + + + +
+
+
+ +
+ + + + +
+
+
+
+ + + + + + + + + + + + \ No newline at end of file diff --git a/discussion/introduction.html b/discussion/introduction.html new file mode 100644 index 0000000..26e140a --- /dev/null +++ b/discussion/introduction.html @@ -0,0 +1,553 @@ + + + + + + + + + + + + + + + + + Introduction - The Street Group Technical Test + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + +
+
+
+ + + +
+
+
+ + +
+
+
+ + +
+
+ + + + + + + +

Introduction

+

This section discusses the following aspects of the test:

+
    +
  • Data exploration
  • +
  • Cleaning the data
  • +
  • Interpreting the results
  • +
  • Deploying on GCP Dataflow
  • +
  • Improvements
  • +
+ + + + + + + +
+
+
+ +
+ + + + +
+
+
+
+ + + + + + + + + + + + \ No newline at end of file diff --git a/documentation/installation.html b/documentation/installation.html index a5b79e5..66139f9 100644 --- a/documentation/installation.html +++ b/documentation/installation.html @@ -163,6 +163,61 @@ + + +
@@ -175,8 +230,10 @@ @@ -383,7 +517,7 @@
pip install poetry
 

From the root of the repo install the dependencies with:

-
poetry install --nodev
+
poetry install --no-dev
 
@@ -457,7 +591,7 @@
- + diff --git a/documentation/usage.html b/documentation/usage.html index 0af3c3b..d25b4ad 100644 --- a/documentation/usage.html +++ b/documentation/usage.html @@ -163,6 +163,61 @@ + + +
@@ -175,8 +230,10 @@ @@ -436,7 +570,7 @@ optional arguments: --output OUTPUT Full path to the output file without extension.

The default value for input is ./data/input/pp-2020.csv and the default value for output is ./data/output/pp-2020.

-

If passing in values for input/output these should be full paths to the files. The test will parse these inputs as a str() and pass this to beam.io.ReadFromText().

+

If you pass in values for input/output, these should be full paths to the files. The test parses these inputs as a str() and passes this to beam.io.ReadFromText().
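
The defaults and string handling described above could be wired up roughly as follows. This is a sketch under assumptions, not the project's actual code: the function name `parse_paths` and the `--input` help text are hypothetical, and the Beam call appears only in a comment so the snippet runs without Apache Beam installed.

```python
import argparse

# Defaults as stated in the usage documentation.
DEFAULT_INPUT = "./data/input/pp-2020.csv"
DEFAULT_OUTPUT = "./data/output/pp-2020"


def parse_paths(argv=None):
    """Parse --input/--output and return (known_args, remaining_args).

    Hypothetical helper; the real entry point may be structured differently.
    """
    parser = argparse.ArgumentParser(description="Analyse property data.")
    parser.add_argument(
        "--input",
        type=str,
        default=DEFAULT_INPUT,
        help="Full path to the input file.",  # assumed wording
    )
    parser.add_argument(
        "--output",
        type=str,
        default=DEFAULT_OUTPUT,
        help="Full path to the output file without extension.",
    )
    # parse_known_args leaves unrecognised flags such as --runner untouched,
    # so they can be handed on to Beam's own pipeline options.
    return parser.parse_known_args(argv)


if __name__ == "__main__":
    known_args, pipeline_args = parse_paths()
    # In a pipeline, the parsed string would then be read with something like:
    #   lines = pipeline | beam.io.ReadFromText(known_args.input)
    print(known_args.input, known_args.output, pipeline_args)
```

Using `parse_known_args` rather than `parse_args` is a design choice that lets `--runner DirectRunner` pass through without the parser having to declare it.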

Run the pipeline

To run the pipeline and complete the task run:

analyse-properties --runner DirectRunner
@@ -476,6 +610,21 @@ optional arguments:
         
       
       
+        
+        
+