Getting Started

Deploy Cohestra on your Kubernetes cluster and run your first Flink job in under 10 minutes.

Prerequisites

Kubernetes cluster (EKS, GKE, AKS, kind, or any conformant cluster)
Flink Kubernetes Operator v1.15+
Helm 3.x
kubectl configured for your cluster

1. Install Cohestra

helm repo add cohestra https://cohestra-project.github.io/charts
helm install cohestra cohestra/cohestra \
  --namespace cohestra-system --create-namespace \
  --set temporal.enabled=true \
  --set temporal.web.enabled=true

This deploys:

Cohestra API Server — REST API + Operations Console on port 8080
Cohestra Worker — Temporal workflow and activity worker
Temporal Server — durable workflow engine (bundled for trials; bring your own for production)
Temporal Web UI — workflow visibility on port 8088

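Verify that the pods are running and that the API server responds. The curl call assumes the API server is reachable on localhost:8080, for example through a kubectl port-forward: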

kubectl get pods -n cohestra-system
curl http://localhost:8080/healthz
# {"status":"ok"}
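
In a script or CI job it can be handy to wait for the health check instead of running curl by hand. The following is a minimal sketch using only the Python standard library. It assumes the `/healthz` endpoint and the `{"status":"ok"}` body shown above; the function names, timeout, and polling interval are illustrative choices, not part of Cohestra.

```python
import json
import time
import urllib.error
import urllib.request


def is_healthy(body: bytes) -> bool:
    """Return True when a /healthz body reports {"status": "ok"}."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON (or not decodable): treat as unhealthy.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def wait_until_healthy(base_url: str, timeout_s: float = 120.0,
                       interval_s: float = 2.0) -> bool:
    """Poll <base_url>/healthz until it reports ok or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/healthz", timeout=5) as resp:
                if is_healthy(resp.read()):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # API server not reachable yet; retry after the interval
        time.sleep(interval_s)
    return False
```

Calling `wait_until_healthy("http://localhost:8080")` returns `True` once the API server answers, or `False` if it never does within the timeout.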

2. Register a Deployment

Tell Cohestra about the Flink job you want to manage:

curl -X PUT http://localhost:8080/api/v1/deployments/prod/streaming/orders \
  -H 'Content-Type: application/json' \
  -d '{
    "owner": "platform-team",
    "serviceAccount": "flink",
    "nodePool": "default"
  }'

Or with the Python SDK:

from cohestra_sdk import CohestraClient

client = CohestraClient("http://localhost:8080")
client.register("prod", "streaming", "orders", owner="platform-team")
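
If the SDK is not installed, the same PUT can be built with the standard library. This is a sketch, assuming the path segments are environment, namespace, and name in that order, as the curl call above and the `identity` object in the status response suggest. The helper names `deployment_url` and `build_register_request` are hypothetical and are not part of `cohestra_sdk`.

```python
import json
import urllib.request
from urllib.parse import quote


def deployment_url(base_url: str, environment: str, namespace: str, name: str) -> str:
    """Build the deployment resource URL, percent-encoding each path segment."""
    parts = (quote(part, safe="") for part in (environment, namespace, name))
    return f"{base_url.rstrip('/')}/api/v1/deployments/" + "/".join(parts)


def build_register_request(base_url: str, environment: str, namespace: str,
                           name: str, *, owner: str, service_account: str,
                           node_pool: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that registers a deployment."""
    body = json.dumps({
        "owner": owner,
        "serviceAccount": service_account,
        "nodePool": node_pool,
    }).encode("utf-8")
    return urllib.request.Request(
        deployment_url(base_url, environment, namespace, name),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

Passing the returned request to `urllib.request.urlopen` sends it; keeping construction separate from sending makes the request easy to inspect in tests.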

3. Deploy Your Flink Job

curl -X POST http://localhost:8080/api/v1/deployments/prod/streaming/orders/deploy \
  -H 'Content-Type: application/json' \
  -H 'Idempotency-Key: deploy-001' \
  -d '{
    "requester": "ci-pipeline",
    "approved": true,
    "spec": {
      "imageDigest": "your-registry/orders-job@sha256:abc123",
      "flinkVersion": "2.2",
      "parallelism": 8,
      "maxParallelism": 128,
      "resources": {
        "taskManagerCpu": 2,
        "taskManagerMemoryMiB": 4096,
        "taskManagerCount": 2,
        "slotsPerManager": 4
      },
      "stateCompatibility": {
        "jobGraphCompatible": true,
        "operatorUidsStable": true
      }
    }
  }'
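
The numbers in the example spec are consistent with each other: 2 TaskManagers with 4 slots each provide 8 slots, which matches the parallelism of 8, and the parallelism stays below maxParallelism. A small client-side check along these lines can catch obvious mistakes before a request is sent. It is only a sketch: the function name `check_spec` is hypothetical, the slot rule assumes Flink's default slot sharing, and Cohestra may apply its own, different validation.

```python
def check_spec(spec: dict) -> list[str]:
    """Return a list of human-readable problems found in a deploy spec."""
    problems = []
    resources = spec["resources"]

    # With default slot sharing, a job needs as many slots as its parallelism.
    total_slots = resources["taskManagerCount"] * resources["slotsPerManager"]
    if spec["parallelism"] > total_slots:
        problems.append(
            f"parallelism {spec['parallelism']} exceeds available slots {total_slots}"
        )

    # Flink cannot run an operator above its maximum parallelism.
    if spec["parallelism"] > spec["maxParallelism"]:
        problems.append(
            f"parallelism {spec['parallelism']} exceeds maxParallelism {spec['maxParallelism']}"
        )

    # The field is named imageDigest, so expect a digest-pinned reference.
    if "@sha256:" not in spec["imageDigest"]:
        problems.append("imageDigest is not pinned by sha256 digest")

    return problems
```

An empty list means none of these three checks failed; it does not guarantee that the deployment will be accepted.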

Cohestra will:

Take a savepoint of the running job (if upgrading)
Apply the new FlinkDeployment spec via the Kubernetes Operator
Wait for the job to reach RUNNING state
Verify health gates (checkpoints, restarts, backpressure, Kafka lag)
Promote the version, or roll back automatically on failure
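
The control flow behind these steps can be pictured as an ordered list of actions with a rollback on the first failure. The sketch below is purely illustrative and is not Cohestra's implementation: the function `run_rollout`, the step names, and the result strings `PROMOTED` and `ROLLED_BACK` are all hypothetical.

```python
from typing import Callable, List, Tuple

# A step is a name plus an action that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_rollout(steps: List[Step], rollback: Callable[[], None]) -> Tuple[str, List[str]]:
    """Run steps in order; on the first failure, roll back and stop."""
    completed: List[str] = []
    for name, action in steps:
        if not action():
            rollback()
            return "ROLLED_BACK", completed
        completed.append(name)
    return "PROMOTED", completed
```

For example, a rollout whose health-gate step returns `False` stops there, calls `rollback` once, and reports only the steps that completed before the failure.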

4. Check Status

curl http://localhost:8080/api/v1/deployments/prod/streaming/orders/actor | jq

Response:

{
  "identity": {"environment": "prod", "namespace": "streaming", "name": "orders"},
  "status": "IDLE",
  "currentVersion": {
    "versionId": 1,
    "spec": {...},
    "healthSummary": {
      "healthy": true,
      "running": true,
      "checkpointCompleted": true
    }
  },
  "recentOperations": [
    {"operationId": "deploy-001", "commandType": "DeployVersion", "status": "SUCCEEDED"}
  ]
}
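
A pipeline usually wants a yes/no answer from this response rather than the full document. The sketch below reads the fields shown in the sample response. It assumes that `operationId` mirrors the `Idempotency-Key` sent with the deploy request, which the sample suggests but does not state, and the helper names are hypothetical.

```python
from typing import Optional


def operation_status(actor: dict, operation_id: str) -> Optional[str]:
    """Return the status of the given operation, or None if it is not listed."""
    for op in actor.get("recentOperations", []):
        if op.get("operationId") == operation_id:
            return op.get("status")
    return None


def is_deploy_healthy(actor: dict, operation_id: str) -> bool:
    """True when the operation succeeded and the current version reports healthy."""
    current = actor.get("currentVersion") or {}
    health = current.get("healthSummary") or {}
    return (
        operation_status(actor, operation_id) == "SUCCEEDED"
        and health.get("healthy") is True
    )
```

Feed it the parsed JSON from the status endpoint, for example `is_deploy_healthy(json.loads(body), "deploy-001")`.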

5. Open the Operations Console

Navigate to http://localhost:8080 to see the Cohestra dashboard with deployment cards, version history, and operation logs.

Next Steps