810f57366b
Merge branch 'develop' into main
2021-09-28 10:48:19 +01:00
e172b704a7
typo in documentation
2021-09-28 10:47:28 +01:00
8a0d8085a2
updating documentation
2021-09-28 00:29:41 +01:00
c481c1a976
adding requirements.txt
2021-09-28 00:07:08 +01:00
577aa9e388
updating poetry
2021-09-27 23:58:27 +01:00
4d3e5fbc23
updating latest beam pipeline with default inputs
2021-09-27 23:58:19 +01:00
a53d79118a
Merge branch 'docs/mkdocs' into develop
2021-09-27 23:16:41 +01:00
4561f1a356
updating documentation
2021-09-27 23:16:30 +01:00
4056ca1f32
commenting out debug imports
2021-09-27 23:16:18 +01:00
cfdee9d3ed
adding commands.md to notes
2021-09-27 21:20:34 +01:00
cbb8a7e237
adding initial docs
2021-09-27 21:19:28 +01:00
a73d7b74a4
Merge branch 'wip/dataflow_refactor_group' into develop
2021-09-27 21:18:48 +01:00
76434fae5b
adding entrypoint command
2021-09-27 21:18:28 +01:00
886a37ca94
updating report.py
2021-09-27 21:18:14 +01:00
3263b3dd8b
moving pandas-profiling report into docs
2021-09-27 21:18:06 +01:00
dffc6aa553
adding debug print
2021-09-27 21:17:49 +01:00
f9eeb8bfad
adding latest beam pipeline code for dataflow
2021-09-27 21:17:39 +01:00
cad6612ebe
updating dataflow notes
2021-09-27 03:39:40 +01:00
391861d80c
adding latest beam pipeline code
2021-09-27 03:39:30 +01:00
f60beb4565
updating dataflow notes
2021-09-27 03:18:49 +01:00
f2ed60426d
adding temporary notes
2021-09-27 03:18:42 +01:00
7db1edb90c
updating download_data script with pp-2020 dataset
2021-09-27 03:18:33 +01:00
3a74579440
adding latest beam pipeline code for dataflow with group optimisation
2021-09-27 03:18:17 +01:00
377e3c703f
updating dataflow documentation
2021-09-27 01:35:48 +01:00
a8fc06c764
adding latest beam pipeline code for dataflow with group optimisation
2021-09-26 23:28:58 +01:00
eaa36877f6
update dataflow documentation with new commands for vpc
2021-09-26 23:28:35 +01:00
1941fcb7bf
update prospector.yaml
2021-09-26 23:28:21 +01:00
99e67c2840
updating download_data script with new bucket
2021-09-26 23:28:12 +01:00
8e8469579e
updating beam pipeline to use GroupByKey
2021-09-26 20:29:11 +01:00
4e3771c728
adding latest beam pipeline code for dataflow
2021-09-26 17:15:35 +01:00
8856a9763f
updating prospector.yaml
2021-09-26 17:15:24 +01:00
fded858932
moving dataflow notes
2021-09-26 17:15:17 +01:00
bb71d55f8c
adding latest beam pipeline code for dataflow
2021-09-26 16:23:19 +01:00
8047b5ced4
adding latest beam pipeline code for dataflow
2021-09-26 16:16:58 +01:00
9f53c66975
adding latest beam pipeline code for dataflow
2021-09-26 15:57:00 +01:00
e6ec110d54
updating download data script for new location in GCS
2021-09-26 14:56:53 +01:00
83807616e0
Merge branch 'wip/pathlib' into develop
2021-09-26 14:56:21 +01:00
7f874fa6f6
updating README.md
2021-09-26 14:56:01 +01:00
b8a997084d
removing requirements.txt for documentation
2021-09-26 14:55:48 +01:00
c4e81065b1
adding pyenv 3.7.9
2021-09-26 14:55:05 +01:00
62bd0196ad
removing shard_name_template from output file
2021-09-26 06:11:42 +01:00
7f9b7e4bfd
moving inputs/outputs to use pathlib
2021-09-26 06:03:55 +01:00
7962f40e32
adding initial docs
2021-09-26 01:34:06 +01:00
2a43ea1946
adding download script for data set
2021-09-26 01:18:23 +01:00
07d176be79
updating .gitignore
2021-09-26 01:10:53 +01:00
f804e85cc3
updating .gitignore
2021-09-26 01:10:32 +01:00
9fdc6dce05
migrate beam pipeline to main.py
2021-09-26 01:10:09 +01:00
54cf5e3e36
updating prospector.yaml
2021-09-26 01:09:48 +01:00
2e42a453b0
adding latest beam pipeline code
2021-09-25 22:15:58 +01:00
adfbd8e93d
updating prospector.yaml
2021-09-25 22:15:46 +01:00
1bd54f188d
updating folder structure for data input/output
2021-09-25 22:15:39 +01:00
a7c52b1085
updating .gitignore
2021-09-25 22:15:10 +01:00
24420c8935
adding latest beam pipeline code
2021-09-25 18:25:08 +01:00
44f346deff
Merge remote-tracking branch 'origin/develop' into develop
2021-09-25 16:48:04 +01:00
a37e7817c3
adding latest beam pipeline code
2021-09-25 16:47:52 +01:00
539e4c7786
adding latest beam pipeline code
2021-09-25 16:47:17 +01:00
214ce77d8f
adding debug.py
2021-09-25 16:47:08 +01:00
47a4ac4bc3
adding latest beam pipeline code
2021-09-25 01:44:37 +01:00
aa61ea9c57
adding latest beam pipeline code
2021-09-25 00:48:41 +01:00
94cc22a385
adding data exploration report + code
2021-09-25 00:48:34 +01:00
a05182892a
adding documentation
2021-09-25 00:48:13 +01:00
c38a10ca2f
updating beam to install gcp extras
2021-09-25 00:48:04 +01:00
ab993fc030
adding prospector.yaml
2021-09-25 00:47:49 +01:00
9cf5662600
updating notes
2021-09-24 16:52:40 +01:00
0f262daf39
adding report.py
2021-09-24 16:52:34 +01:00
972d0a852a
updating dev dependencies
2021-09-24 16:52:25 +01:00
5301d5ff04
updating .gitignore
2021-09-24 16:51:56 +01:00
6db2fa59b9
adding initial skeleton
2021-09-24 16:01:03 +01:00