feat(exapp_development): add Kubernetes setup instructions

Signed-off-by: Edward Ly <contact@edward.ly>
This commit is contained in:
Edward Ly
2026-03-11 17:41:14 +01:00
parent 241b4c48c9
commit f986f22c93
7 changed files with 1331 additions and 23 deletions


@@ -1,22 +0,0 @@
Scaling
=======
AppAPI delegates the scaling task to the ExApp itself.
This means that the ExApp must be designed in a way to be able to scale vertically.
As for the horizontal scaling, it is currently not possible except by using,
for example, a Server-Worker architecture, which is a good way to support basic scaling capabilities.
In this case, the Server is your ExApp and the Workers are the external machines that can work with the ExApp
using Nextcloud user authentication.
Additional clients (or workers) can be (optionally) added (or attached) to the ExApp
to increase the capacity and performance.
GPUs scaling
------------
Currently, if a Deploy daemon configured with GPUs available,
AppAPI by default will attach all available GPU devices to each ExApp container on this Deploy daemon.
This means that these GPUs are shared between all ExApps on the same Deploy daemon.
Therefore, for the ExApps that require heavy use of GPUs,
it is recommended to have a separate Deploy daemon (host) for them.


@@ -19,6 +19,5 @@ or provide a brief answer.
DockerContainerRegistry
DockerSocketProxy
GpuSupport
Scaling
BehindCompanyProxy
Troubleshooting


@@ -7,6 +7,7 @@ ExApp development
Introduction
DevSetup
scaling/index.rst
development_overview/index.rst
tech_details/index.rst
faq/index.rst


@@ -0,0 +1,366 @@
Emulating AppAPI
================
This section documents the ``curl`` commands used to emulate AppAPI when
testing HaRP's Kubernetes backend.
Prerequisites
-------------
- HaRP is reachable at: ``http://nextcloud.local/exapps``
- HaRP was started with the same shared key as used below
(``HP_SHARED_KEY``)
- HaRP has Kubernetes backend enabled (``HP_K8S_ENABLED=true``) and can
access the k8s API
- ``kubectl`` is configured to point to the same cluster HaRP uses
- Optional: ``jq`` for parsing JSON responses
--------------
0. Environment variables
------------------------
.. code:: bash
# .env
EXAPPS_URL="http://nextcloud.local/exapps"
APPAPI_URL="${EXAPPS_URL}/app_api"
HP_SHARED_KEY="some_very_secure_password"
# Optional: Nextcloud base (only used by ExApp container env in this guide)
NEXTCLOUD_URL="http://nextcloud.local"
.. code:: bash
source .env
.. note::
All AppAPI-emulation calls go to ``$APPAPI_URL/...`` and require the header ``harp-shared-key``.
.. note::
You can also hit the agent directly on
``http://127.0.0.1:8200/...`` for debugging, but that bypasses the
HAProxy/AppAPI path and may skip shared-key enforcement depending
on your routing.
1. Check if ExApp is present (k8s deployment exists)
----------------------------------------------------
.. code:: bash
curl -sS \
-H "harp-shared-key: $HP_SHARED_KEY" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"name": "test-deploy",
"instance_id": ""
}' \
"$APPAPI_URL/k8s/exapp/exists"
Expected output:
.. code:: json
{"exists": true}
or
.. code:: json
{"exists": false}
2. Create ExApp (PVC + Deployment with replicas=0)
--------------------------------------------------
.. code:: bash
curl -sS \
-H "harp-shared-key: $HP_SHARED_KEY" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"name": "test-deploy",
"instance_id": "",
"image": "ghcr.io/nextcloud/test-deploy:latest",
"environment_variables": [
"APP_ID=test-deploy",
"APP_DISPLAY_NAME=Test Deploy",
"APP_VERSION=1.2.1",
"APP_HOST=0.0.0.0",
"APP_PORT=23000",
"NEXTCLOUD_URL='"$NEXTCLOUD_URL"'",
"APP_SECRET=some-dev-secret",
"APP_PERSISTENT_STORAGE=/nc_app_test-deploy_data"
],
"resource_limits": { "cpu": "500m", "memory": "512Mi" }
}' \
"$APPAPI_URL/k8s/exapp/create"
Expected output (example):
.. code:: json
{"name":"nc-app-test-deploy"}
3. Start ExApp (scale replicas to 1)
------------------------------------
.. code:: bash
curl -sS \
-H "harp-shared-key: $HP_SHARED_KEY" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"name": "test-deploy",
"instance_id": ""
}' \
"$APPAPI_URL/k8s/exapp/start"
Expected: HTTP 204.
4. Wait for ExApp to become Ready
---------------------------------
.. code:: bash
curl -sS \
-H "harp-shared-key: $HP_SHARED_KEY" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"name": "test-deploy",
"instance_id": ""
}' \
"$APPAPI_URL/k8s/exapp/wait_for_start"
Expected output (example):
.. code:: json
{
"started": true,
"status": "running",
"health": "ready",
"reason": null,
"message": null
}
5. Expose + register in HaRP
----------------------------
5.1 NodePort (default behavior)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
**Minimal (uses defaults, may auto-pick a node address):**
.. code:: bash
EXPOSE_JSON=$(
curl -sS \
-H "harp-shared-key: $HP_SHARED_KEY" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"name": "test-deploy",
"instance_id": "",
"port": 23000,
"expose_type": "nodeport"
}' \
"$APPAPI_URL/k8s/exapp/expose"
)
echo "$EXPOSE_JSON"
**Recommended (provide a stable host reachable by HaRP):**
.. code:: bash
# Example: edge node IP / VIP / L4 LB that forwards NodePort range
UPSTREAM_HOST="172.18.0.2"
EXPOSE_JSON=$(
curl -sS \
-H "harp-shared-key: $HP_SHARED_KEY" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"name": "test-deploy",
"instance_id": "",
"port": 23000,
"expose_type": "nodeport",
"upstream_host": "'"$UPSTREAM_HOST"'"
}' \
"$APPAPI_URL/k8s/exapp/expose"
)
echo "$EXPOSE_JSON"
5.2 ClusterIP (only if HaRP can reach ClusterIP + resolve service DNS)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
EXPOSE_JSON=$(
curl -sS \
-H "harp-shared-key: $HP_SHARED_KEY" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"name": "test-deploy",
"instance_id": "",
"port": 23000,
"expose_type": "clusterip"
}' \
"$APPAPI_URL/k8s/exapp/expose"
)
echo "$EXPOSE_JSON"
5.3 Manual (HaRP does not create or inspect any Service)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
EXPOSE_JSON=$(
curl -sS \
-H "harp-shared-key: $HP_SHARED_KEY" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"name": "test-deploy",
"instance_id": "",
"port": 23000,
"expose_type": "manual",
"upstream_host": "exapp-test-deploy.internal",
"upstream_port": 23000
}' \
"$APPAPI_URL/k8s/exapp/expose"
)
echo "$EXPOSE_JSON"
6. Extract exposed host/port for follow-up tests (requires ``jq``)
------------------------------------------------------------------
.. code:: bash
EXAPP_HOST=$(echo "$EXPOSE_JSON" | jq -r '.host')
EXAPP_PORT=$(echo "$EXPOSE_JSON" | jq -r '.port')
echo "ExApp upstream endpoint: ${EXAPP_HOST}:${EXAPP_PORT}"
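If ``jq`` is not installed, the same two fields can be pulled with ``python3`` (standard library only). A sketch, using a hard-coded ``SAMPLE`` stand-in for ``$EXPOSE_JSON`` so it runs offline; the real response's ``host``/``port`` values will differ:

```shell
# Fallback without jq: parse the expose response using python3's json module.
# SAMPLE is a hypothetical stand-in for "$EXPOSE_JSON".
SAMPLE='{"host":"172.18.0.2","port":31234}'
EXAPP_HOST=$(printf '%s' "$SAMPLE" | python3 -c 'import json,sys; print(json.load(sys.stdin)["host"])')
EXAPP_PORT=$(printf '%s' "$SAMPLE" | python3 -c 'import json,sys; print(json.load(sys.stdin)["port"])')
echo "ExApp upstream endpoint: ${EXAPP_HOST}:${EXAPP_PORT}"
# -> ExApp upstream endpoint: 172.18.0.2:31234
```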
7. Check ``/heartbeat`` via HaRP routing (AppAPI-style direct routing headers)
------------------------------------------------------------------------------
This checks HaRP's ability to route to the ExApp given an explicit
upstream host/port and AppAPI-style authorization header.
7.1 Build ``authorization-app-api`` value
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
HaRP typically expects this value to be the **base64-encoded value of**
``user_id:APP_SECRET`` (similar to HTTP Basic without the ``Basic``
prefix). For an “anonymous” style request, use ``:APP_SECRET``.
.. code:: bash
# Option A: anonymous-style
AUTH_APP_API=$(printf '%s' ':some-dev-secret' | base64 | tr -d '\n')
# Option B: user-scoped style (example user "admin")
# AUTH_APP_API=$(printf '%s' 'admin:some-dev-secret' | base64 | tr -d '\n')
7.2 Call heartbeat
~~~~~~~~~~~~~~~~~~
.. code:: bash
curl -sS \
"http://nextcloud.local/exapps/test-deploy/heartbeat" \
-H "harp-shared-key: $HP_SHARED_KEY" \
-H "ex-app-version: 1.2.1" \
-H "ex-app-id: test-deploy" \
-H "ex-app-host: $EXAPP_HOST" \
-H "ex-app-port: $EXAPP_PORT" \
-H "authorization-app-api: $AUTH_APP_API"
If this fails with auth-related errors, verify:
- ``APP_SECRET`` in the ExApp matches what you used here,
- your HAProxy config expectations for ``authorization-app-api`` (raw
vs base64).
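A quick way to rule out encoding mistakes is to decode the header value and confirm it matches the expected ``user_id:APP_SECRET`` string:

```shell
# Sanity check: the header value should decode back to "user_id:APP_SECRET"
# (here the anonymous-style ":some-dev-secret" from above).
AUTH_APP_API=$(printf '%s' ':some-dev-secret' | base64 | tr -d '\n')
DECODED=$(printf '%s' "$AUTH_APP_API" | base64 -d)
echo "$DECODED"
# -> :some-dev-secret
```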
8. Stop and remove (API-based cleanup)
--------------------------------------
Stop ExApp (scale replicas to 0)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
curl -sS \
-H "harp-shared-key: $HP_SHARED_KEY" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"name": "test-deploy",
"instance_id": ""
}' \
"$APPAPI_URL/k8s/exapp/stop"
Remove ExApp (Deployment + optional PVC; Service may be removed depending on HaRP version)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
curl -sS \
-H "harp-shared-key: $HP_SHARED_KEY" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"name": "test-deploy",
"instance_id": "",
"remove_data": true
}' \
"$APPAPI_URL/k8s/exapp/remove"
--------------
Useful ``kubectl`` commands (debug / manual cleanup)
----------------------------------------------------
Check resources
~~~~~~~~~~~~~~~
.. code:: bash
kubectl get deploy,svc,pvc -n nextcloud-exapps -o wide | grep -E 'test-deploy|NAME' || true
kubectl get pods -n nextcloud-exapps -o wide
Delete Service (if it was exposed and needs manual cleanup)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
kubectl delete svc nc-app-test-deploy -n nextcloud-exapps
Delete Deployment
~~~~~~~~~~~~~~~~~
.. code:: bash
kubectl delete deployment nc-app-test-deploy -n nextcloud-exapps
Delete PVC (data)
~~~~~~~~~~~~~~~~~
PVC name is derived from ``nc_app_test-deploy_data`` and sanitized for
k8s, typically: ``nc-app-test-deploy-data``
.. code:: bash
kubectl delete pvc nc-app-test-deploy-data -n nextcloud-exapps
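The exact sanitization logic is internal to HaRP; as a rough sketch of the apparent rule (Kubernetes object names must be lowercase alphanumerics and hyphens per RFC 1123), the mapping looks like:

```shell
# Hypothetical sketch, not HaRP's actual code: lowercase everything and
# replace underscores with hyphens to get a valid k8s object name.
sanitize_k8s_name() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr '_' '-'
}
NAME=$(sanitize_k8s_name "nc_app_test-deploy_data")
echo "$NAME"
# -> nc-app-test-deploy-data
```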


@@ -0,0 +1,521 @@
Autoscaling with KEDA
=====================
This section explains how to set up `KEDA <https://keda.sh/>`__ to auto-scale ExApp pods
(using the `llm2 <https://docs.nextcloud.com/server/latest/admin_manual/ai/app_llm2.html>`_ app as an example)
based on the Nextcloud TaskProcessing queue depth.
Prerequisites
-------------
- A working Nextcloud + HaRP + k8s setup (see
:ref:`scaling-kubernetes-setup`)
- An ExApp deployed and running (e.g. ``llm2`` with deployment name
``nc-app-llm2``)
- ``kubectl`` configured and pointing to the cluster
- ``helm`` installed (`install
guide <https://helm.sh/docs/intro/install/>`__)
- For GPU ExApps: the daemon must be registered with
``--compute_device=cuda``
Architecture overview
---------------------
.. mermaid::
graph TB
Users[Users submit tasks] --> Nextcloud["Nextcloud TaskProcessing Queue (scheduled + running tasks)"]
Nextcloud -->|"GET /ocs/v2.php/taskprocessing/queue_stats, Auth: Basic (admin app_password)"| KEDA["KEDA (metrics-api-server in k8s)"]
KEDA -->|"polls every pollingInterval (e.g. 15s), scales the deployment based on queue depth"| deployment["nc-app-llm2 deployment (1..N pods), each pod independently calls next_task()"]
KEDA uses a ``metrics-api`` trigger (HTTP polling) to query Nextcloud's
``queue_stats`` endpoint.
When the queue grows, KEDA scales the ExApp deployment up.
When the queue shrinks, KEDA scales it back down.
--------------
0. GPU Setup (kind cluster)
---------------------------
If your ExApp needs GPU (e.g. llm2), you must set up GPU passthrough in
the kind cluster.
0.1 Configure Docker on the host
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
sudo nvidia-ctk runtime configure --runtime=docker --set-as-default --cdi.enabled
sudo nvidia-ctk config --set accept-nvidia-visible-devices-as-volume-mounts=true --in-place
sudo systemctl restart docker
0.2 Create kind cluster with GPU support
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: yaml
# kind-gpu-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
extraMounts:
- hostPath: /dev/null
containerPath: /var/run/nvidia-container-devices/all
.. code:: bash
kind create cluster --name nc-exapps --config kind-gpu-config.yaml
0.3 Install nvidia-container-toolkit inside the kind node
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
docker exec nc-exapps-control-plane bash -c '
apt-get update -y && apt-get install -y gnupg2 curl &&
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg &&
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed "s#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g" \
> /etc/apt/sources.list.d/nvidia-container-toolkit.list &&
apt-get update && apt-get install -y nvidia-container-toolkit
'
0.4 Configure containerd and restart
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
docker exec nc-exapps-control-plane bash -c '
nvidia-ctk runtime configure --runtime=containerd --set-as-default &&
systemctl restart containerd
'
0.5 Install NVIDIA device plugin
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For a single GPU shared across multiple pods, use **time-slicing**.
First create a ConfigMap with the number of replicas (virtual GPUs):
.. code:: bash
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
name: nvidia-device-plugin-config
namespace: kube-system
data:
config.yaml: |
version: v1
sharing:
timeSlicing:
renameByDefault: false
resources:
- name: nvidia.com/gpu
replicas: 4
EOF
Then deploy the device plugin with the config:
.. code:: bash
cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: nvidia-device-plugin-daemonset
namespace: kube-system
spec:
selector:
matchLabels:
name: nvidia-device-plugin-ds
template:
metadata:
labels:
name: nvidia-device-plugin-ds
spec:
tolerations:
- key: nvidia.com/gpu
operator: Exists
effect: NoSchedule
priorityClassName: system-node-critical
containers:
- image: nvcr.io/nvidia/k8s-device-plugin:v0.17.0
name: nvidia-device-plugin-ctr
args: ["--config-file=/config/config.yaml"]
env:
- name: FAIL_ON_INIT_ERROR
value: "false"
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop: ["ALL"]
volumeMounts:
- name: device-plugin
mountPath: /var/lib/kubelet/device-plugins
- name: plugin-config
mountPath: /config
volumes:
- name: device-plugin
hostPath:
path: /var/lib/kubelet/device-plugins
- name: plugin-config
configMap:
name: nvidia-device-plugin-config
items:
- key: config.yaml
path: config.yaml
EOF
0.6 Verify GPU is visible
~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
kubectl get nodes -o json | python3 -c "
import json,sys
for n in json.load(sys.stdin)['items']:
gpu = n['status']['capacity'].get('nvidia.com/gpu','N/A')
print(f'{n[\"metadata\"][\"name\"]}: nvidia.com/gpu = {gpu}')
"
Expected: ``nvidia.com/gpu = 4`` (or your configured replicas count).
0.7 Test GPU from a pod
~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
kubectl run gpu-test --image=nvidia/cuda:12.6.3-base-ubuntu24.04 --restart=Never \
--overrides='{"spec":{"containers":[{"name":"gpu-test","image":"nvidia/cuda:12.6.3-base-ubuntu24.04","command":["nvidia-smi"],"resources":{"limits":{"nvidia.com/gpu":"1"}}}]}}' \
-n nextcloud-exapps
sleep 30 && kubectl logs gpu-test -n nextcloud-exapps
kubectl delete pod gpu-test -n nextcloud-exapps
--------------
1. Install KEDA
---------------
.. code:: bash
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
Verify:
.. code:: bash
kubectl get pods -n keda
# All pods should be Running
2. DNS setup (kind only)
------------------------
KEDA pods need to resolve ``nextcloud.local``. **HaRP does this
automatically now** — when ``HP_K8S_HOST_ALIASES`` is set, HaRP patches
the CoreDNS ``ConfigMap`` on startup and restarts CoreDNS so that every
pod in the cluster (including KEDA) can resolve the configured
hostnames.
If you need to do it manually (or verify), the commands are:
.. code:: bash
# Get the nginx proxy IP
PROXY_IP=$(docker inspect master-proxy-1 \
--format '{{(index .NetworkSettings.Networks "master_default").IPAddress}}')
echo "Proxy IP: $PROXY_IP"
# Write the Corefile with the correct IP
cat > /tmp/Corefile << EOF
.:53 {
errors
health {
lameduck 5s
}
ready
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
ttl 30
}
prometheus :9153
hosts {
${PROXY_IP} nextcloud.local
fallthrough
}
forward . /etc/resolv.conf {
max_concurrent 1000
}
cache 30
loop
reload
loadbalance
}
EOF
kubectl create configmap coredns -n kube-system \
--from-file=Corefile=/tmp/Corefile \
--dry-run=client -o yaml | kubectl apply -f -
kubectl rollout restart deployment coredns -n kube-system
Verify:
.. code:: bash
kubectl run dns-test --rm -i --restart=Never --image=busybox -- nslookup nextcloud.local
3. Create a Nextcloud App Password
----------------------------------
KEDA needs credentials to poll the ``queue_stats`` endpoint. The
endpoint is admin-only.
1. Log in to Nextcloud as admin
2. Go to **Settings > Security > Devices & sessions**
3. Enter a name (e.g. ``keda-scaler``) and click **Create new app
password**
4. Copy the password into a **.env** file
.. code:: bash
# .env
NC_USER="admin"
NC_APP_PASSWORD="<the-app-password-you-created>"
NC_URL="https://nextcloud.local"
Verify:
.. code:: bash
source .env
curl -s -k -u "${NC_USER}:${NC_APP_PASSWORD}" \
"${NC_URL}/ocs/v2.php/taskprocessing/queue_stats?format=json"
Expected:
.. code:: json
{"ocs":{"meta":{"status":"ok","statuscode":200,"message":"OK"},"data":{"scheduled":0,"running":0}}}
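The field KEDA will scale on later is ``ocs.data.scheduled`` (the ``valueLocation`` in step 6). It can be extracted from such a response with ``python3`` (standard library only); ``RESPONSE`` below is a stand-in for the ``curl`` output:

```shell
# Extract the queue depth KEDA reads from the queue_stats response.
RESPONSE='{"ocs":{"meta":{"status":"ok","statuscode":200,"message":"OK"},"data":{"scheduled":12,"running":2}}}'
SCHEDULED=$(printf '%s' "$RESPONSE" | python3 -c 'import json,sys; print(json.load(sys.stdin)["ocs"]["data"]["scheduled"])')
echo "$SCHEDULED"
# -> 12
```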
4. Create k8s secret
--------------------
.. code:: bash
kubectl create secret generic nextcloud-keda-auth \
--namespace=nextcloud-exapps \
--from-literal=username="${NC_USER}" \
--from-literal=password="${NC_APP_PASSWORD}"
5. Create KEDA TriggerAuthentication
------------------------------------
.. code:: bash
cat <<'EOF' | kubectl apply -f -
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: nextcloud-auth
namespace: nextcloud-exapps
spec:
secretTargetRef:
- parameter: username
name: nextcloud-keda-auth
key: username
- parameter: password
name: nextcloud-keda-auth
key: password
EOF
6. Create KEDA ScaledObject
---------------------------
.. note::
Nextcloud OCS returns XML by default. Always include ``format=json`` in the URL.
Task type filter
~~~~~~~~~~~~~~~~
llm2 registers many task types. Use a comma-separated list to scale on
all of them:
::
?taskTypeId=core:text2text,core:text2text:chat,core:text2text:summary,core:text2text:headline,core:text2text:topics,core:text2text:simplification,core:text2text:reformulation,core:contextwrite,core:text2text:changetone,core:text2text:chatwithtools,core:text2text:proofread
Apply
~~~~~
.. code:: yaml
# keda-llm2-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: llm2-scaler
namespace: nextcloud-exapps
spec:
scaleTargetRef:
name: nc-app-llm2
pollingInterval: 15
cooldownPeriod: 120
initialCooldownPeriod: 60
minReplicaCount: 1
maxReplicaCount: 4
triggers:
- type: metrics-api
metadata:
url: "https://nextcloud.local/ocs/v2.php/taskprocessing/queue_stats?format=json&taskTypeId=core:text2text,core:text2text:chat,core:text2text:summary"
valueLocation: "ocs.data.scheduled"
targetValue: "5"
authMode: "basic"
unsafeSsl: "true"
authenticationRef:
name: nextcloud-auth
.. code:: bash
kubectl apply -f keda-llm2-scaler.yaml
Scaling formula
~~~~~~~~~~~~~~~
::
desiredReplicas = ceil( metricValue / targetValue )
=============== ============= ===================
Scheduled tasks targetValue=5 Result
=============== ============= ===================
0               \-            1 (minReplicaCount)
3               ceil(3/5)=1   1 pod
12              ceil(12/5)=3  3 pods
50              ceil(50/5)=10 4 (capped at max)
=============== ============= ===================
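As a sanity check, the formula can be reproduced in plain bash (integer ceiling division clamped to the min/max replica counts; the helper name is ours, not KEDA's):

```shell
# desired_replicas METRIC TARGET MIN MAX: ceil(METRIC/TARGET), clamped to [MIN, MAX].
desired_replicas() {
  local metric=$1 target=$2 min=$3 max=$4
  local d=$(( (metric + target - 1) / target ))   # integer ceiling division
  if [ "$d" -lt "$min" ]; then d=$min; fi
  if [ "$d" -gt "$max" ]; then d=$max; fi
  echo "$d"
}
desired_replicas 0 5 1 4    # -> 1 (minReplicaCount)
desired_replicas 12 5 1 4   # -> 3
desired_replicas 50 5 1 4   # -> 4 (capped at maxReplicaCount)
```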
7. Verify and Monitor
---------------------
Quick status
~~~~~~~~~~~~
.. code:: bash
kubectl get scaledobject -n nextcloud-exapps && echo && \
kubectl get deploy nc-app-llm2 -n nextcloud-exapps && echo && \
kubectl get pods -n nextcloud-exapps -l app=nc-app-llm2 -o wide
- ``READY=True`` - KEDA can reach the metrics endpoint
- ``ACTIVE=False`` - no tasks queued
- ``AVAILABLE=1`` - one pod running (minReplicaCount)
Watch scaling live
~~~~~~~~~~~~~~~~~~
.. code:: bash
# Terminal 1: pods
kubectl get pods -n nextcloud-exapps -l app=nc-app-llm2 -w
# Terminal 2: deployment
kubectl get deploy nc-app-llm2 -n nextcloud-exapps -w
# Terminal 3: KEDA logs
kubectl logs -n keda -l app=keda-operator -f --tail=5
Check HPA (KEDA creates this)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
kubectl get hpa -n nextcloud-exapps
kubectl describe hpa -n nextcloud-exapps
Full dashboard
~~~~~~~~~~~~~~
.. code:: bash
echo "=== ScaledObject ===" && \
kubectl get scaledobject -n nextcloud-exapps && echo && \
echo "=== HPA ===" && \
kubectl get hpa -n nextcloud-exapps && echo && \
echo "=== Deployment ===" && \
kubectl get deploy nc-app-llm2 -n nextcloud-exapps && echo && \
echo "=== Pods ===" && \
kubectl get pods -n nextcloud-exapps -l app=nc-app-llm2 -o wide && echo && \
echo "=== Queue ===" && \
curl -s -k -u "${NC_USER}:${NC_APP_PASSWORD}" \
"${NC_URL}/ocs/v2.php/taskprocessing/queue_stats?format=json"
--------------
Tuning Guide
------------
+---------------------------+---------+---------+-------------------------------------+
| Parameter | Example | Default | What it does |
+===========================+=========+=========+=====================================+
| ``pollingInterval`` | 15 | 30 | Seconds between polls. |
| | | | Lower = faster reaction |
+---------------------------+---------+---------+-------------------------------------+
| ``cooldownPeriod`` | 120 | 300 | Seconds to wait before scaling down |
+---------------------------+---------+---------+-------------------------------------+
| ``initialCooldownPeriod`` | 60 | 0 | Wait after new pod starts. Set to |
| | | | 60 for LLM model loading time |
+---------------------------+---------+---------+-------------------------------------+
| ``minReplicaCount`` | 1 | 0 | Min pods. Must be 1+ (AppAPI needs |
| | | | at least one pod for heartbeat) |
+---------------------------+---------+---------+-------------------------------------+
| ``maxReplicaCount`` | 4 | 100 | Max pods. Match your GPU count or |
| | | | time-slicing replicas |
+---------------------------+---------+---------+-------------------------------------+
| ``targetValue`` | 5 | \- | Tasks per pod. |
| | | | Lower = more pods sooner |
+---------------------------+---------+---------+-------------------------------------+
GPU time-slicing notes
~~~~~~~~~~~~~~~~~~~~~~
- One physical GPU can be shared by multiple pods using NVIDIA
time-slicing
- Each llm2 pod uses about 8GB VRAM (model dependent)
- RTX 5090 (32GB): can run 3-4 pods with time-slicing replicas=4
- RTX 4090 (24GB): can run 2-3 pods with time-slicing replicas=3
- Set ``maxReplicaCount`` to match your time-slicing replicas
- Time-slicing gives each pod an equal share of GPU time
LLM notes
~~~~~~~~~
- Model loading takes 30-60s. New pods are not ready right away
- Use ``initialCooldownPeriod`` to avoid over-scaling during warmup
- PVC access mode is ``ReadWriteOnce``. Works on single-node only
- Multi-node clusters are not supported yet
--------------
Cleanup
-------
.. code:: bash
# Remove KEDA ScaledObject
kubectl delete scaledobject llm2-scaler -n nextcloud-exapps
# Remove auth resources
kubectl delete triggerauthentication nextcloud-auth -n nextcloud-exapps
kubectl delete secret nextcloud-keda-auth -n nextcloud-exapps


@@ -0,0 +1,411 @@
.. _scaling-kubernetes-setup:
Setting up Kubernetes
=====================
This guide will help you set up a local Kubernetes cluster
(via `kind <https://kind.sigs.k8s.io/>`__)
with HaRP and AppAPI for ExApp development.
After completing these steps you will be able to register a k8s deploy daemon in Nextcloud and deploy a test app.
Prerequisites
-------------
- Docker must be installed and running
- A `nextcloud-docker-dev <https://github.com/juliusknorr/nextcloud-docker-dev>`__ environment running at ``https://nextcloud.local``
- The Nextcloud container is on the ``master_default`` Docker
network
- ``kubectl`` installed (`install
guide <https://kubernetes.io/docs/tasks/tools/>`__)
- ``kind`` installed (`install
guide <https://kind.sigs.k8s.io/docs/user/quick-start/#installation>`__)
- HaRP repository cloned (e.g. ``~/nextcloud/HaRP``)
Architecture overview
---------------------
.. mermaid::
graph TB
OCC[Browser / OCC / API calls] -->|"Nextcloud (PHP, in Docker container)"| nginx[nginx proxy]
nginx -->|/exapps/| HaRP["HaRP (host network, port 8780)"]
HaRP -->|"k8s API calls (Deployments, Services, PVCs)"| kind["kind cluster (nc-exapps)"]
kind --> ExApp["ExApp pod (e.g. test-deploy)"]
- **HaRP** runs on the host network (``--network=host``) and
communicates with:
- The kind k8s API server (via ``https://127.0.0.1:<port>``)
- ExApp pods via NodePort services (via the kind node IP)
- **Nextcloud** reaches HaRP via the Docker network gateway IP
- **nginx proxy** forwards ``/exapps/`` requests to HaRP
--------------
.. _scaling-kubernetes-setup-step-1:
1. Create the kind Cluster
--------------------------
.. code:: bash
kind create cluster --name nc-exapps
Verify:
.. code:: bash
kubectl config use-context kind-nc-exapps
kubectl cluster-info
kubectl get nodes -o wide
Note the **API server URL** (e.g. ``https://127.0.0.1:37151``) and the
**node InternalIP** (e.g. ``172.18.0.2``):
.. code:: bash
# API server
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
# Node internal IP
kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}'
.. _scaling-kubernetes-setup-step-2:
2. Create Namespace and RBAC
----------------------------
.. code:: bash
# Create the ExApps namespace
kubectl create namespace nextcloud-exapps
# Create a ServiceAccount for HaRP
kubectl -n nextcloud-exapps create serviceaccount harp-exapps
# Grant cluster-admin (for development; restrict in production)
kubectl create clusterrolebinding harp-exapps-admin \
--clusterrole=cluster-admin \
--serviceaccount=nextcloud-exapps:harp-exapps
Generate a bearer token (valid for 1 year):
.. code:: bash
kubectl -n nextcloud-exapps create token harp-exapps --duration=8760h
.. note::
The ``redeploy_host_k8s.sh`` script generates this token
automatically, so you don't need to copy it manually.
.. _scaling-kubernetes-setup-step-3:
3. Configure the nginx Proxy
----------------------------
The nextcloud-docker-dev nginx proxy must forward ``/exapps/`` to HaRP.
Find the gateway IP of the ``master_default`` Docker network (this is
how containers reach the host):
.. code:: bash
docker network inspect master_default \
--format '{{range .IPAM.Config}}Gateway: {{.Gateway}}{{end}}'
Typically this is your host IP like ``192.168.21.1`` (may vary on your
machine).
Edit the nginx vhost file:
.. code:: bash
# Path relative to your nextcloud-docker-dev checkout:
# data/nginx/vhost.d/nextcloud.local_location
Set the content to:
.. code:: nginx
location /exapps/ {
set $harp_addr <GATEWAY_IP>:8780;
proxy_pass http://$harp_addr;
# Forward the true client identity
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
Replace ``<GATEWAY_IP>`` with the gateway from above
(e.g. ``192.168.21.1``).
Reload nginx:
.. code:: bash
docker exec master-proxy-1 nginx -s reload
.. _scaling-kubernetes-setup-step-4:
4. Build and Deploy HaRP
------------------------
From the HaRP repository root:
.. code:: bash
cd ~/nextcloud/HaRP
bash development/redeploy_host_k8s.sh
The script will:
1. Auto-detect the k8s API server URL
2. Generate a fresh bearer token
3. Build the HaRP Docker image
4. Start HaRP with k8s backend enabled on host network
Wait for HaRP to become healthy:
.. code:: bash
docker ps | grep harp
# Should show "(healthy)" after ~15 seconds
Check logs if needed:
.. code:: bash
docker logs appapi-harp --tail=20
.. _scaling-kubernetes-setup-step-5:
5. Register the k8s Deploy Daemon in Nextcloud
----------------------------------------------
Run this inside the Nextcloud container (replace ``<NC_CONTAINER>`` with
your container ID or name, and ``<GATEWAY_IP>`` with the gateway from
:ref:`Step 3 <scaling-kubernetes-setup-step-3>`):
.. code:: bash
docker exec <NC_CONTAINER> sudo -E -u www-data php occ app_api:daemon:register \
k8s_local "Kubernetes Local" "kubernetes-install" \
"http" "<GATEWAY_IP>:8780" "http://nextcloud.local" \
--harp \
--harp_shared_key "some_very_secure_password" \
--harp_frp_address "<GATEWAY_IP>:8782" \
--k8s \
--k8s_expose_type=nodeport \
--set-default
Verify:
.. code:: bash
docker exec <NC_CONTAINER> sudo -E -u www-data php occ app_api:daemon:list
.. _scaling-kubernetes-setup-step-6:
6. Run Test Deploy
------------------
Via OCC
~~~~~~~
.. code:: bash
docker exec <NC_CONTAINER> sudo -E -u www-data php occ app_api:app:register test-deploy k8s_local \
--info-xml https://raw.githubusercontent.com/nextcloud/test-deploy/main/appinfo/info.xml \
--test-deploy-mode
Expected output:
::
ExApp test-deploy deployed successfully.
ExApp test-deploy successfully registered.
Via API (same as what the Admin UI uses)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
# Start test deploy
curl -X POST -u admin:admin -H "OCS-APIREQUEST: true" -k \
"https://nextcloud.local/index.php/apps/app_api/daemons/k8s_local/test_deploy"
# Check status
curl -u admin:admin -H "OCS-APIREQUEST: true" -k \
"https://nextcloud.local/index.php/apps/app_api/daemons/k8s_local/test_deploy/status"
# Stop and clean up
curl -X DELETE -u admin:admin -H "OCS-APIREQUEST: true" -k \
"https://nextcloud.local/index.php/apps/app_api/daemons/k8s_local/test_deploy"
Verify k8s Resources
~~~~~~~~~~~~~~~~~~~~
.. code:: bash
kubectl get deploy,svc,pvc,pods -n nextcloud-exapps -o wide
Unregister
~~~~~~~~~~
.. code:: bash
docker exec <NC_CONTAINER> sudo -E -u www-data php occ app_api:app:unregister test-deploy
--------------
Cluster Overview
----------------
==================== ===========================
Component            Value
==================== ===========================
**Type**             kind (Kubernetes in Docker)
**Cluster Name**     ``nc-exapps``
**Node**             ``nc-exapps-control-plane``
**ExApps Namespace** ``nextcloud-exapps``
**ServiceAccount**   ``harp-exapps``
==================== ===========================
--------------
Monitoring Commands
-------------------
Cluster Status
~~~~~~~~~~~~~~
.. code:: bash
kubectl cluster-info
kubectl get nodes -o wide
kubectl get pods -n nextcloud-exapps
kubectl get pods -n nextcloud-exapps -w # watch in real-time
Pod Inspection
~~~~~~~~~~~~~~
.. code:: bash
kubectl describe pod <pod-name> -n nextcloud-exapps
kubectl logs <pod-name> -n nextcloud-exapps
kubectl logs -f <pod-name> -n nextcloud-exapps # follow logs
kubectl logs --previous <pod-name> -n nextcloud-exapps # after restart
Resources
~~~~~~~~~
.. code:: bash
kubectl get svc,deploy,pvc -n nextcloud-exapps
kubectl get all -n nextcloud-exapps
HaRP Logs
~~~~~~~~~
.. code:: bash
docker logs appapi-harp --tail=50
docker logs -f appapi-harp # follow
--------------
Troubleshooting
---------------
HaRP can't reach k8s API
~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
# Check if kind container is running
docker ps | grep kind
# Verify API server is reachable from host
curl -k https://127.0.0.1:37151/version
Nextcloud can't reach HaRP
~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
# From inside the Nextcloud container, test connectivity to HaRP:
docker exec <NC_CONTAINER> curl -s http://<GATEWAY_IP>:8780/
# Should return "404 Not Found" (HaRP is responding)
# If connection refused: check HaRP is running and gateway IP is correct
Heartbeat fails after successful deploy
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Check HaRP logs for routing errors:
.. code:: bash
docker logs appapi-harp --tail=20
HaRP lazily resolves the k8s Service upstream on first request after a
restart, so restarting HaRP does **not** require re-deploying ExApps. If
heartbeat still fails, verify the k8s Service exists and is reachable:
.. code:: bash
kubectl get svc -n nextcloud-exapps
Pods stuck in Pending
~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
kubectl describe pod <pod-name> -n nextcloud-exapps
# Check Events section for scheduling or image pull issues
Image pull errors
~~~~~~~~~~~~~~~~~
The kind cluster needs to be able to pull images. For public images
(like ``ghcr.io/nextcloud/test-deploy:release``) this should work out of
the box.
Token expired
~~~~~~~~~~~~~
Regenerate by rerunning the redeploy script:
.. code:: bash
cd ~/nextcloud/HaRP
bash development/redeploy_host_k8s.sh
Clean up all ExApp resources
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
kubectl delete deploy,svc,pvc -n nextcloud-exapps --all
Reset everything
~~~~~~~~~~~~~~~~
.. code:: bash
# Remove daemon config
docker exec <NC_CONTAINER> sudo -E -u www-data php occ app_api:daemon:unregister k8s_local
# Delete kind cluster
kind delete cluster --name nc-exapps
# Remove HaRP container
docker rm -f appapi-harp
Then start again from :ref:`Step 1 <scaling-kubernetes-setup-step-1>`.


@@ -0,0 +1,32 @@
Scaling ExApps
==============
AppAPI delegates the scaling task to the ExApp itself.
This means that the ExApp must be designed so that it can scale vertically.
For horizontal scaling, we recommend using Kubernetes.
Alternatively, you can implement a Server-Worker architecture for basic scaling:
the Server is your ExApp, and the Workers are external machines that work with the ExApp
using Nextcloud user authentication.
Additional workers can optionally be attached to the ExApp
to increase capacity and performance.
The rest of this section explains how to set up and use Kubernetes for automated scaling.
Additional instructions are provided for GPU scaling if you have a GPU device.
.. note::
Currently, if a Deploy daemon is configured with GPUs available,
AppAPI will by default attach all available GPU devices to each ExApp container on this Deploy daemon.
This means that these GPUs are shared between all ExApps on the same Deploy daemon.
Therefore, for the ExApps that require heavy use of GPUs,
it is recommended to have a separate Deploy daemon (host) for them.
.. toctree::
:maxdepth: 2
KubernetesSetup
KEDASetup
AppAPIEmulation