|
|
810f57366b
|
Merge branch 'develop' into main
|
2021-09-28 10:48:19 +01:00 |
|
|
|
e172b704a7
|
typo in documentation
|
2021-09-28 10:47:28 +01:00 |
|
dtomlinson91
|
80376a662e
|
Merge final release (#1)
* adding initial skeleton
* updating .gitignore
* updating dev dependencies
* adding report.py
* updating notes
* adding prospector.yaml
* updating beam to install gcp extras
* adding documentation
* adding data exploration report + code
* adding latest beam pipeline code
* adding latest beam pipeline code
* adding debug.py
* adding latest beam pipeline code
* adding latest beam pipeline code
* adding latest beam pipeline code
* updating .gitignore
* updating folder structure for data input/output
* updating prospector.yaml
* adding latest beam pipeline code
* updating prospector.yaml
* migrate beam pipeline to main.py
* updating .gitignore
* updating .gitignore
* adding download script for data set
* adding initial docs
* moving inputs/outputs to use pathlib
* removing shard_name_template from output file
* adding pyenv 3.7.9
* removing requirements.txt for documentation
* updating README.md
* updating download data script for new location in GCS
* adding latest beam pipeline code for dataflow
* adding latest beam pipeline code for dataflow
* adding latest beam pipeline code for dataflow
* moving dataflow notes
* updating prospector.yaml
* adding latest beam pipeline code for dataflow
* updating beam pipeline to use GroupByKey
* updating download_data script with new bucket
* update prospector.yaml
* update dataflow documentation with new commands for vpc
* adding latest beam pipeline code for dataflow with group optimisation
* updating dataflow documentation
* adding latest beam pipeline code for dataflow with group optimisation
* updating download_data script with pp-2020 dataset
* adding temporary notes
* updating dataflow notes
* adding latest beam pipeline code
* updating dataflow notes
* adding latest beam pipeline code for dataflow
* adding debug print
* moving panda-profiling report into docs
* updating report.py
* adding entrypoint command
* adding initial docs
* adding commands.md to notes
* commenting out debug imports
* updating documentation
* updating latest beam pipeline with default inputs
* updating poetry
* adding requirements.txt
* updating documentation
v1.0
|
2021-09-28 00:31:09 +01:00 |
|
|
|
8a0d8085a2
|
updating documentation
|
2021-09-28 00:29:41 +01:00 |
|
|
|
c481c1a976
|
adding requirements.txt
|
2021-09-28 00:07:08 +01:00 |
|
|
|
577aa9e388
|
updating poetry
|
2021-09-27 23:58:27 +01:00 |
|
|
|
4d3e5fbc23
|
updating latest beam pipeline with default inputs
|
2021-09-27 23:58:19 +01:00 |
|
|
|
a53d79118a
|
Merge branch 'docs/mkdocs' into develop
|
2021-09-27 23:16:41 +01:00 |
|
|
|
4561f1a356
|
updating documentation
|
2021-09-27 23:16:30 +01:00 |
|
|
|
4056ca1f32
|
commenting out debug imports
|
2021-09-27 23:16:18 +01:00 |
|
|
|
cfdee9d3ed
|
adding commands.md to notes
|
2021-09-27 21:20:34 +01:00 |
|
|
|
cbb8a7e237
|
adding initial docs
|
2021-09-27 21:19:28 +01:00 |
|
|
|
a73d7b74a4
|
Merge branch 'wip/dataflow_refactor_group' into develop
|
2021-09-27 21:18:48 +01:00 |
|
|
|
76434fae5b
|
adding entrypoint command
|
2021-09-27 21:18:28 +01:00 |
|
|
|
886a37ca94
|
updating report.py
|
2021-09-27 21:18:14 +01:00 |
|
|
|
3263b3dd8b
|
moving panda-profiling report into docs
|
2021-09-27 21:18:06 +01:00 |
|
|
|
dffc6aa553
|
adding debug print
|
2021-09-27 21:17:49 +01:00 |
|
|
|
f9eeb8bfad
|
adding latest beam pipeline code for dataflow
|
2021-09-27 21:17:39 +01:00 |
|
|
|
cad6612ebe
|
updating dataflow notes
|
2021-09-27 03:39:40 +01:00 |
|
|
|
391861d80c
|
adding latest beam pipeline code
|
2021-09-27 03:39:30 +01:00 |
|
|
|
f60beb4565
|
updating dataflow notes
|
2021-09-27 03:18:49 +01:00 |
|
|
|
f2ed60426d
|
adding temporary notes
|
2021-09-27 03:18:42 +01:00 |
|
|
|
7db1edb90c
|
updating download_data script with pp-2020 dataset
|
2021-09-27 03:18:33 +01:00 |
|
|
|
3a74579440
|
adding latest beam pipeline code for dataflow with group optimisation
|
2021-09-27 03:18:17 +01:00 |
|
|
|
377e3c703f
|
updating dataflow documentation
|
2021-09-27 01:35:48 +01:00 |
|
|
|
a8fc06c764
|
adding latest beam pipeline code for dataflow with group optimisation
|
2021-09-26 23:28:58 +01:00 |
|
|
|
eaa36877f6
|
update dataflow documentation with new commands for vpc
|
2021-09-26 23:28:35 +01:00 |
|
|
|
1941fcb7bf
|
update prospector.yaml
|
2021-09-26 23:28:21 +01:00 |
|
|
|
99e67c2840
|
updating download_data script with new bucket
|
2021-09-26 23:28:12 +01:00 |
|
|
|
8e8469579e
|
updating beam pipeline to use GroupByKey
|
2021-09-26 20:29:11 +01:00 |
|
|
|
4e3771c728
|
adding latest beam pipeline code for dataflow
|
2021-09-26 17:15:35 +01:00 |
|
|
|
8856a9763f
|
updating prospector.yaml
|
2021-09-26 17:15:24 +01:00 |
|
|
|
fded858932
|
moving dataflow notes
|
2021-09-26 17:15:17 +01:00 |
|
|
|
bb71d55f8c
|
adding latest beam pipeline code for dataflow
|
2021-09-26 16:23:19 +01:00 |
|
|
|
8047b5ced4
|
adding latest beam pipeline code for dataflow
|
2021-09-26 16:16:58 +01:00 |
|
|
|
9f53c66975
|
adding latest beam pipeline code for dataflow
|
2021-09-26 15:57:00 +01:00 |
|
|
|
e6ec110d54
|
updating download data script for new location in GCS
|
2021-09-26 14:56:53 +01:00 |
|
|
|
83807616e0
|
Merge branch 'wip/pathlib' into develop
|
2021-09-26 14:56:21 +01:00 |
|
|
|
7f874fa6f6
|
updating README.md
|
2021-09-26 14:56:01 +01:00 |
|
|
|
b8a997084d
|
removing requirements.txt for documentation
|
2021-09-26 14:55:48 +01:00 |
|
|
|
c4e81065b1
|
adding pyenv 3.7.9
|
2021-09-26 14:55:05 +01:00 |
|
|
|
62bd0196ad
|
removing shard_name_template from output file
|
2021-09-26 06:11:42 +01:00 |
|
|
|
7f9b7e4bfd
|
moving inputs/outputs to use pathlib
|
2021-09-26 06:03:55 +01:00 |
|
|
|
7962f40e32
|
adding initial docs
|
2021-09-26 01:34:06 +01:00 |
|
|
|
2a43ea1946
|
adding download script for data set
|
2021-09-26 01:18:23 +01:00 |
|
|
|
07d176be79
|
updating .gitignore
|
2021-09-26 01:10:53 +01:00 |
|
|
|
f804e85cc3
|
updating .gitignore
|
2021-09-26 01:10:32 +01:00 |
|
|
|
9fdc6dce05
|
migrate beam pipeline to main.py
|
2021-09-26 01:10:09 +01:00 |
|
|
|
54cf5e3e36
|
updating prospector.yaml
|
2021-09-26 01:09:48 +01:00 |
|
|
|
2e42a453b0
|
adding latest beam pipeline code
|
2021-09-25 22:15:58 +01:00 |
|