Configure An Argo-Based Job to Access S3
Some jobs can be configured to read from or write to Amazon Simple Storage Service (S3).
You can configure a combination of Solr and cloud-based input or output; for example, a job can read from S3 and write to Solr, or vice versa. However, you cannot configure multiple storage sources for input or multiple storage targets for output; only one of each can be configured.
Supported jobs
This procedure applies to these Argo jobs:
- Content based Recommender
- BPR Recommender
- Classification
- Evaluate QnA Pipeline
- QnA Coldstart Training
- QnA Supervised Training
For Spark jobs, see Configure A Spark-Based Job to Access Cloud Storage.
How to configure a job to access S3
1. Gather the access key and secret key for your S3 account. See the AWS documentation.
2. Create a Kubernetes secret:

   kubectl create secret generic aws-secret --from-literal=my-aws-key-file='<access key>' --from-literal=my-aws-secret-path='<secret key>' --namespace <fusion-namespace>
3. In the job’s Cloud storage secret name field, enter the name of the secret for the S3 target as mounted in the Kubernetes namespace. This is the name you specified in the previous step; in the example above, the secret name is aws-secret. You can also find this name using kubectl get secret -n <fusion-namespace>.
4. In the job’s Additional Parameters, add these two parameters:
   - Param name: fs.s3a.access.keyPath
     Param value: <name of the file containing the access key that is available when the S3 secret is mounted to the pod>
   - Param name: fs.s3a.secret.keyPath
     Param value: <name of the file containing the access secret that is available when the S3 secret is mounted to the pod>
   The file name may differ from the secret name. You can check it with kubectl get secret -n <fusion-namespace> <secretname> -o yaml, as shown in the example below.
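
For reference, here is a minimal sketch of how to confirm the mounted file names, assuming the example secret name aws-secret and a hypothetical namespace fusion (substitute your own values). Each key listed under data: in the secret becomes a file with the same name when the secret is mounted to the job pod, so those key names are the values to use for fs.s3a.access.keyPath and fs.s3a.secret.keyPath.

# Inspect the secret; each data key becomes a file name inside the pod
# when the secret is mounted.
kubectl get secret aws-secret -n fusion -o yaml

# With the secret created in step 2, the output would include data keys such as:
#   data:
#     my-aws-key-file: <base64-encoded access key>
#     my-aws-secret-path: <base64-encoded secret key>
#
# In that case, fs.s3a.access.keyPath would be my-aws-key-file and
# fs.s3a.secret.keyPath would be my-aws-secret-path.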