Skip to content

Conversation

Future-Outlier
Copy link
Member

Why are these changes needed?

Related issue number

#3860 (comment)

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

Signed-off-by: Future-Outlier <[email protected]>
@Future-Outlier
Copy link
Member Author

cc @machichima

apiVersion: ray.io/v1
kind: RayJob
metadata:
  name: rayjob-sample-external-redis-2-49-0-2
spec:
  # submissionMode specifies how RayJob submits the Ray job to the RayCluster.
  # The default value is "K8sJobMode", meaning RayJob will submit the Ray job via a submitter Kubernetes Job.
  # The alternative value is "HTTPMode", indicating that KubeRay will submit the Ray job by sending an HTTP request to the RayCluster.
  # submissionMode: "K8sJobMode"
  entrypoint: python -c "import os, time; print(os.environ.get('HOSTNAME')); [time.sleep(1) or print(i) for i in range(1000)]"
  # shutdownAfterJobFinishes specifies whether the RayCluster should be deleted after the RayJob finishes. Default is false.
  # shutdownAfterJobFinishes: false

  # ttlSecondsAfterFinished specifies the number of seconds after which the RayCluster will be deleted after the RayJob finishes.
  # ttlSecondsAfterFinished: 10

  # activeDeadlineSeconds is the duration in seconds that the RayJob may be active before
  # KubeRay actively tries to terminate the RayJob; value must be positive integer.
  # activeDeadlineSeconds: 120

  # RuntimeEnvYAML represents the runtime environment configuration provided as a multi-line YAML string.
  # See https://docs.ray.io/en/latest/ray-core/handling-dependencies.html for details.
  # (New in KubeRay version 1.0.)
  runtimeEnvYAML: |
    pip:
      - requests==2.26.0
      - pendulum==2.1.2
    env_vars:
      counter_name: "test_counter"

  # Suspend specifies whether the RayJob controller should create a RayCluster instance.
  # If a job is applied with the suspend field set to true, the RayCluster will not be created and we will wait for the transition to false.
  # If the RayCluster is already created, it will be deleted. In the case of transition to false, a new RayCluster will be created.
  # suspend: false

  # rayClusterSpec specifies the RayCluster instance to be created by the RayJob controller.
  rayClusterSpec:
    rayVersion: '2.46.0' # should match the Ray version in the image of the containers
    gcsFaultToleranceOptions:
    # In most cases, you don't need to set `externalStorageNamespace` because KubeRay will
    # automatically set it to the UID of RayCluster. Only modify this annotation if you fully understand
    # the behaviors of the Ray GCS FT and RayService to avoid misconfiguration.
    # [Example]:
    # externalStorageNamespace: "my-raycluster-storage"
      redisAddress: "redis:6379"
      redisPassword:
        valueFrom:
          secretKeyRef:
            name: redis-password-secret
            key: password
    # Ray head pod template
    headGroupSpec:
      # The `rayStartParams` are used to configure the `ray start` command.
      # See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
      # See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
      rayStartParams: {}
      #pod template
      template:
        spec:
          # terminationGracePeriodSeconds: 1203
          containers:
          - name: ray-head
            image: rayproject/ray:2.46.0
            resources:
              limits:
                cpu: "1"
              requests:
                cpu: "1"
            
            lifecycle:
              preStop:
                exec:
                  command: ["python -c 'import ray; ray.shutdown()'"]
            ports:
            - containerPort: 6379
              name: redis
            - containerPort: 8265
              name: dashboard
            - containerPort: 10001
              name: client
            volumeMounts:
            - mountPath: /tmp/ray
              name: ray-logs
            - mountPath: /home/ray/samples
              name: ray-example-configmap
          volumes:
          - name: ray-logs
            emptyDir: {}
          - name: ray-example-configmap
            configMap:
              name: ray-example
              defaultMode: 0777
              items:
              - key: detached_actor.py
                path: detached_actor.py
              - key: increment_counter.py
                path: increment_counter.py
    workerGroupSpecs:
    # the pod replicas in this group typed worker
    - replicas: 1
      minReplicas: 1
      maxReplicas: 10
      groupName: small-group
      # The `rayStartParams` are used to configure the `ray start` command.
      # See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
      # See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
      rayStartParams: {}
      # Pod template
      template:
        spec:
          containers:
          - name: ray-worker
            image: rayproject/ray:2.46.0
            volumeMounts:
            - mountPath: /tmp/ray
              name: ray-logs
            resources:
              limits:
                cpu: "1"
              requests:
                cpu: "1"
          volumes:
          - name: ray-logs
            emptyDir: {}
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: redis-config
  labels:
    app: redis
data:
  redis.conf: |-
    dir /data
    port 6379
    bind 0.0.0.0
    appendonly yes
    protected-mode no
    requirepass 5241590000000000
    pidfile /data/redis-6379.pid
---
apiVersion: v1
kind: Service
metadata:
  name: redis
  labels:
    app: redis
spec:
  type: ClusterIP
  ports:
  - name: redis
    port: 6379
  selector:
    app: redis
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  labels:
    app: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:7.4.0
        command:
        - "sh"
        - "-c"
        - "redis-server /usr/local/etc/redis/redis.conf"
        ports:
        - containerPort: 6379
        volumeMounts:
        - name: config
          mountPath: /usr/local/etc/redis/redis.conf
          subPath: redis.conf
      volumes:
      - name: config
        configMap:
          name: redis-config
---
# Redis password
apiVersion: v1
kind: Secret
metadata:
  name: redis-password-secret
type: Opaque
data:
  # echo -n "5241590000000000" | base64
  password: NTI0MTU5MDAwMDAwMDAwMA==
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ray-example
data:
  detached_actor.py: |
    import ray

    @ray.remote(num_cpus=1)
    class Counter:
      def __init__(self):
          self.value = 0

      def increment(self):
          self.value += 1
          return self.value

    ray.init(namespace="default_namespace")
    Counter.options(name="counter_actor", lifetime="detached").remote()
  increment_counter.py: |
    import ray

    ray.init(namespace="default_namespace")
    counter = ray.get_actor("counter_actor")
    print(ray.get(counter.increment.remote()))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants