Uploading Results (rclone)¶
At the end of a VeRCYe run the pipeline can optionally package the most
relevant study outputs (a curated subset — not every file produced by the
run) into a zip and upload it to a cloud drive via rclone. This is useful
when the machine running VeRCYe is not where users want to fetch results
from (e.g. a shared HPC node).
The upload step is off by default. It only runs when
packaging_params.rclone_target is set in the run config (or equivalently
when RCLONE_TARGET is set in your .env before running vercye prep).
What gets uploaded¶
Every file under the study directory whose basename matches a glob in
vercye_ops/reporting/output_data_patterns.txt is bundled into a single
zip. The default pattern set contains:
final_report_*.pdf,lai_report_*.pdf,aggregated_met_stats_*.pdfmultiyear_summary*.zip,interactive_map_*.zipagg_yield_estimates_*.csv,all_predictions_*.csv,referencedata_*.csvyield_mosaic_projected_*.tif,aggregated_LAI_MAX_*.tif,aggregated_cropmask_*.tifaggregated_region_boundaries_*.geojsonconfig.yaml
Edit that file to change which outputs are shipped.
Remote folder layout¶
The zip is uploaded into:
<rclone_target>/<rclone_folder_prefix>/<study_id>_<YYYYMMDD>[_N]/
_N is a numeric suffix appended only when a folder with the same base
name already exists on the target, so re-runs on the same day never
overwrite an existing upload.
vercye run --validate checks that rclone is on PATH, that
rclone_target names a configured remote, and that the target is
reachable. Auth / reachability issues surface at validation time rather
than at the end of a long run.
Setting up rclone for Google Drive¶
- Install rclone
bash
# Debian / Ubuntu
sudo apt-get install rclone
# macOS
brew install rclone
# Or the upstream one-liner:
curl https://rclone.org/install.sh | sudo bash
- Create a Google Drive remote
bash
rclone config
Follow the prompts:
n→ new remote- name:
gdrive - storage:
drive - leave
client_id/client_secretempty unless you have your own - scope:
1(full access) or2(file scope) - leave root_folder_id / service account blank for interactive auth
y→ auto config; a browser window opens — sign in and grant accessn→ not a team drive (unless it is)- confirm with
y
On a headless server pick n for auto config and follow the
printed instructions: run rclone authorize "drive" on a machine with
a browser and paste the resulting token back.
- Verify
bash
rclone listremotes # should list 'gdrive:'
rclone lsd gdrive: # should list your Drive top-level folders
- Point VeRCYe at it. Add to
.env(preferred —vercye prepwrites it into the run config) …
dotenv
RCLONE_TARGET="gdrive:"
RCLONE_FOLDER_PREFIX="vercye_kenya"
… or set it directly in config.yaml:
yaml
packaging_params:
rclone_target: "gdrive:"
rclone_folder_prefix: "vercye_kenya"
Each run then uploads to
gdrive:vercye_kenya/<study_id>_<YYYYMMDD>[_N]/<zip>.
Other backends¶
rclone_target can be any rclone-supported backend
— Google Drive, S3, Azure Blob, Dropbox, SFTP, Backblaze B2, OneDrive, etc.
The setup is always the same: rclone config to add the remote, then set
RCLONE_TARGET to <remote_name>: (or <remote_name>:<path>).
Running¶
No extra CLI flag is needed — the upload step runs automatically as the
last rule of vercye run when rclone_target is set. Logs are written to
logs_package_and_upload/<study_id>_package_and_upload.log under the
snakemake run directory, and a marker file
.package_and_upload_<study_id>.done is left in the study directory once
the upload has completed successfully.