mirror of
https://github.com/dtomlinson91/street_group_tech_test
synced 2025-12-22 03:55:43 +00:00
updating documentation
@@ -27,9 +27,6 @@ To get around public IP quotas I created a VPC in the `europe-west1` region that
Assuming the `pp-2020.csv` file has been placed in the `./input` directory in the bucket you can run a command similar to:
!!! caution
Use the command `python -m analyse_properties.main` as the entrypoint to the pipeline and not `analyse-properties` as the module isn't installed with poetry on the workers with the configuration below.
```bash
python -m analyse_properties.main \
--runner DataflowRunner \
@@ -55,7 +55,7 @@ A possible solution would be to leverage BigQuery to store the results of the ma
In addition to creating the mapping table `(key, value)` pairs, we also save these pairs to BigQuery at this stage. We then yield the element as it is currently written to allow the subsequent stages to make use of this data.
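The save-and-pass-through idea described above can be sketched in plain Python, without Beam or a real BigQuery client. Everything named here is an assumption for illustration: `save_and_passthrough`, the `write_row` callback, and the `{"key": ..., "value": ...}` row shape are hypothetical, and an in-memory list stands in for the BigQuery table.

```python
from typing import Callable, Iterable, Iterator, Tuple

KeyValue = Tuple[str, str]


def save_and_passthrough(
    elements: Iterable[KeyValue],
    write_row: Callable[[dict], None],
) -> Iterator[KeyValue]:
    """Persist each (key, value) pair, then yield it unchanged."""
    for key, value in elements:
        # Hypothetical sink: in the pipeline this would be a BigQuery write.
        write_row({"key": key, "value": value})
        # Yield the original element so later stages see the same data.
        yield key, value


# Usage: a plain list stands in for the BigQuery table.
saved_rows: list = []
pairs = [("id-1", "address-a"), ("id-2", "address-b")]
passed_on = list(save_and_passthrough(pairs, saved_rows.append))
```

Because the element is yielded exactly as it arrived, downstream stages need no changes; only the side effect of saving the pair is new.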
-Remove the condense mapping table stage as it is no longer needed.
+Remove the condense mapping table stage as it is no longer needed (which also saves a bit of time).
Instead of using: