Introduction
Argo Workflows is a native Kubernetes workflow engine, perfect for orchestrating complex CI/CD, ML, or ETL pipelines. Unlike Jenkins or Airflow, it runs directly in the cluster using CRDs to define DAGs (Directed Acyclic Graphs) of containers. In 2026, with Kubernetes 1.32+, Argo shines in cloud-native environments thanks to horizontal scaling, artifact support (S3, GCS), and an intuitive UI.
This expert tutorial guides you step by step: from the Helm installation to advanced workflows with parameters, loops, and shared volumes. You'll learn to avoid performance pitfalls and secure your setup with RBAC. By the end, you'll master production-ready pipelines that any senior DevOps engineer would bookmark.
Prerequisites
- Kubernetes cluster 1.28+ (Minikube, EKS, GKE)
- Configured kubectl
- Helm 3.14+
- Dedicated namespace (argo)
- Advanced knowledge of Kubernetes YAML and Docker containers
Install Argo Workflows with Helm
# Add the Argo Helm repo
helm repo add argo https://argoproj.github.io/argo-helm
helm repo update
# Create the namespace
kubectl create namespace argo
# Install Argo Workflows (stable 2026 release)
helm install argo argo/argo-workflows \
  --namespace argo \
  --set server.service.type=LoadBalancer \
  --set workflows.defaultNamespace=argo \
  --set server.ingress.enabled=false
# Check the pods
kubectl get pods -n argo
# Port-forward for the UI (optional)
kubectl port-forward svc/argo-server -n argo 2746:2746
This script installs Argo via Helm in the 'argo' namespace with a LoadBalancer service for the UI and sets the default namespace for workflows. Avoid default configs in production: enable RBAC and persistence afterward. Check that all pods are 'Running' before proceeding.
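The argo CLI is used later in this tutorial for submitting and tailing workflows. A minimal install sketch for a Linux amd64 machine, assuming release v3.5.8 (pick the latest version from the argoproj/argo-workflows releases page):
# Download the argo CLI binary (adjust version, OS, and arch to your machine)
curl -sLO https://github.com/argoproj/argo-workflows/releases/download/v3.5.8/argo-linux-amd64.gz
gunzip argo-linux-amd64.gz
chmod +x argo-linux-amd64
sudo mv argo-linux-amd64 /usr/local/bin/argo
# Confirm the client works
argo version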
Access the UI
Once installed, access the Argo UI at http://localhost:2746 (via the port-forward) or at the LoadBalancer IP. The UI visualizes DAGs in real time, container logs, and artifacts. To log in, generate a bearer token with kubectl -n argo create token workflow-controller and paste it into the login screen. Analogy: like Grafana for workflows, but native to Kubernetes.
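A minimal sketch of the login flow, reusing the workflow-controller ServiceAccount mentioned above (any ServiceAccount with sufficient RBAC works):
# Expose the UI locally
kubectl -n argo port-forward svc/argo-server 2746:2746 &
# Generate a token and prefix it for the UI login field
ARGO_TOKEN="Bearer $(kubectl -n argo create token workflow-controller)"
echo "$ARGO_TOKEN"
Paste the printed value into the token field on the login screen.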
First Simple Workflow
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: whalesay-
  namespace: argo
spec:
  entrypoint: whalesay
  templates:
    - name: whalesay
      container:
        image: docker/whalesay
        command: [cowsay]
        args: ["hello world"]
---
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-
  namespace: argo
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        - - name: print-message
            template: whalesay
    - name: whalesay
      container:
        image: docker/whalesay
        command: [cowsay]
        args: ["hello world"]
This YAML defines two workflows: a simple 'whalesay' container and one that composes the same template as a step. Submit with kubectl create -f whalesay.yaml -n argo (generateName requires create, not apply). Check the status in the UI. Pitfall: don't forget the 'argo' namespace, or you'll hit 'NotFound' errors.
Submitting and Monitoring
- Submit: argo submit -n argo --watch whalesay.yaml
- UI: filter by label, visualize the DAG.
- Logs: argo logs @latest -n argo (more inspection commands below)
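A few more day-to-day inspection commands, shown as a sketch (@latest resolves to the most recently submitted workflow):
# List workflows in the namespace
argo list -n argo
# Show status, parameters, and the step tree of the latest workflow
argo get @latest -n argo
# Re-run or clean up a finished workflow
argo resubmit @latest -n argo
argo delete @latest -n argo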
Workflow with Parameters and DAG
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: dag-params-
  namespace: argo
spec:
  entrypoint: main
  arguments:
    parameters:
      - name: message
        value: "Argo in 2026"
  templates:
    - name: main
      dag:
        tasks:
          - name: generate
            template: generate
          - name: consume
            template: consume
            depends: generate
            arguments:
              parameters:
                - name: data
                  value: "{{tasks.generate.outputs.result}}"
    - name: generate
      script:
        image: python:3.9
        command: [python]
        source: |
          import json
          print(json.dumps({"message": "{{workflow.parameters.message}}"}))
    - name: consume
      inputs:
        parameters:
          - name: data
      script:
        image: python:3.9
        command: [python]
        source: |
          import json
          data = json.loads(r"""{{inputs.parameters.data}}""")
          print(f"Consumed: {data['message']}")
DAG with injected parameters and data passing via standard output: the 'generate' task prints JSON, which Argo captures as outputs.result and hands to 'consume' as an input parameter. Submit with --parameter message=Test. Pitfall: stdout over 1MB fails; use artifacts for large data.
Workflow with Loops and Conditions
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: loop-condition-
  namespace: argo
spec:
  entrypoint: loop-example
  arguments:
    parameters:
      - name: loops
        value: "3"
  templates:
    - name: loop-example
      steps:
        - - name: loop-task
            template: loop-template
            arguments:
              parameters:
                - name: iteration
                  value: "{{item}}"
            withSequence:
              count: "{{workflow.parameters.loops}}"
        - - name: condition-task
            template: condition-template
            when: "{{steps.loop-task.status}} == Succeeded"
    - name: loop-template
      inputs:
        parameters:
          - name: iteration
      script:
        image: alpine
        command: [sh]
        source: echo "Iteration {{inputs.parameters.iteration}}"
    - name: condition-template
      script:
        image: alpine
        command: [sh]
        source: echo "Loop succeeded!"
Loop using 'withSequence' (count driven by the 'loops' parameter) and a condition with 'when' on the previous step's status. Runs 3 iterations, then checks the result. Great for batch processing. Pitfall: 'withParam' expects a JSON list passed as a string; prefer 'withSequence' for dynamic integer ranges.
Workflow with Artifacts
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifacts-s3-
  namespace: argo
spec:
  entrypoint: main
  volumeClaimTemplates:
    - metadata:
        name: workdir
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi
  templates:
    - name: main
      steps:
        - - name: produce
            template: produce-artifact
        - - name: consume
            template: consume-artifact
            arguments:
              artifacts:
                - name: mydata
                  from: "{{steps.produce.outputs.artifacts.mydata}}"
    - name: produce-artifact
      outputs:
        artifacts:
          - name: mydata
            path: /tmp/data.txt
      script:
        image: alpine
        volumeMounts:
          - name: workdir
            mountPath: /tmp
        command: [sh]
        source: |
          echo "Data from S3 producer" > /tmp/data.txt
    - name: consume-artifact
      inputs:
        artifacts:
          - name: mydata
            path: /tmp/input.txt
      script:
        image: alpine
        volumeMounts:
          - name: workdir
            mountPath: /tmp
        command: [sh]
        source: cat /tmp/input.txt
Local artifacts via a per-workflow PVC (workdir) for sharing between steps. Replace with S3 via the global Argo config (workflow-controller-configmap, artifactRepository.s3). Pitfall: per-workflow PVCs cause contention; use ephemeral volumes for parallelism.
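If one step should push its output straight to a bucket instead of the shared PVC, an output artifact can carry an explicit S3 location. A sketch to drop into the produce-artifact template, assuming the 'my-bucket' bucket and 's3-credentials' Secret configured in the next section:
outputs:
  artifacts:
    - name: mydata
      path: /tmp/data.txt
      s3:
        endpoint: s3.amazonaws.com
        bucket: my-bucket
        key: workflows/{{workflow.name}}/data.txt
        accessKeySecret:
          name: s3-credentials
          key: accesskey
        secretKeySecret:
          name: s3-credentials
          key: secretkey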
Advanced RBAC Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
  namespace: argo
data:
  artifactRepository: |
    s3:
      bucket: my-bucket
      endpoint: s3.amazonaws.com
      insecure: false
      accessKeySecret:
        name: s3-credentials
        key: accesskey
      secretKeySecret:
        name: s3-credentials
        key: secretkey
---
apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials
  namespace: argo
data:
  accesskey: <base64-key>
  secretkey: <base64-secret>
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: argo-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: argo-workflow-controller
    namespace: argo
ConfigMap for S3 artifacts, Secret for the credentials, and a ClusterRoleBinding for permissions. Apply with kubectl apply -f rbac-config.yaml -n argo, then kubectl rollout restart -n argo deployment/argo-server deployment/argo-workflow-controller. Pitfall: values under the Secret's 'data' field must be base64-encoded, not plain text.
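To follow the least-privilege advice from the pitfalls below, a namespaced Role scoped to the Argo CRDs can replace the cluster-admin binding. A starting-point sketch only: the names are illustrative, and the controller also needs access to pods and configmaps, so extend the rules before swapping it in:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: argo-workflow-minimal
  namespace: argo
rules:
  # Workflow CRDs only; add pods/configmaps rules for the controller itself
  - apiGroups: ["argoproj.io"]
    resources: ["workflows", "workflowtemplates", "cronworkflows", "workflowtaskresults"]
    verbs: ["create", "get", "list", "watch", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: argo-workflow-minimal
  namespace: argo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: argo-workflow-minimal
subjects:
  - kind: ServiceAccount
    name: argo-workflow-controller
    namespace: argo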
Best Practices
- Resource limits: always set resources.requests/limits per template for QoS.
- Retry policies: use retryStrategy.backoff.maxDuration for resilience.
- Parallelism: set parallelism: 20 at the workflow level for scaling (see the sketch after this list).
- Labels/Annotations: add labels: {team: devops} for UI/Grafana filtering.
- Persistence: use MinIO for local artifacts in dev, S3 in production.
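A sketch combining these settings in a single workflow spec (the numbers are illustrative, not sizing recommendations for your cluster):
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: best-practices-
  namespace: argo
  labels:
    team: devops
spec:
  entrypoint: main
  parallelism: 20          # cap concurrent pods across the whole workflow
  templates:
    - name: main
      retryStrategy:
        limit: "3"
        backoff:
          duration: "10s"
          factor: "2"
          maxDuration: "1h"
      container:
        image: alpine
        command: [sh, -c, "echo ok"]
        resources:
          requests:
            cpu: 100m
            memory: 64Mi
          limits:
            cpu: 500m
            memory: 256Mi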
Common Errors to Avoid
- Namespace mismatch: workflows submitted to 'default' fail when defaultNamespace=argo.
- Artifact size: >500MB via stdin causes OOM; switch to PVC/S3.
- Infinite loops: an unbounded 'withItems' list hangs the workflow; test with a single iteration (loops=1) first.
- Overly loose RBAC: cluster-admin is risky; restrict to the Workflow CRDs.
Next Steps
- Official docs: Argo Workflows
- Integrate Argo Events/Rollouts for event-driven workflows.
- Monitoring: Prometheus + Argo exporter.