Commit 9aeba64

docs: Add Cilium eBPF L4 load balancer solution document

- Add a note on the VIP Layer 2 network requirement
- Add a description of the Custom network mode
- Add the English version of the document

(cherry picked from commit f823a0122ba89d5658d160588d5040b750d3c72c)

---
id: KB260300001
products:
  - Alauda Container Platform
kind:
  - Solution
sourceSHA: pending
---

# High-Performance Container Networking with Cilium CNI and eBPF-based L4 Load Balancer (Source IP Preservation)

This document describes how to deploy Cilium CNI in an ACP 4.2+ cluster and leverage eBPF to implement high-performance Layer 4 load balancing with source IP preservation.

## Prerequisites

| Item | Requirement |
|------|-------------|
| ACP Version | 4.2+ |
| Network Mode | Custom |
| Architecture | x86_64 / amd64 |

> **Note**: Cilium/eBPF requires Linux kernel 4.19+ (5.10+ recommended). The following operating systems are **NOT supported**:
>
> - CentOS 7.x (kernel version 3.10.x)
> - RHEL 7.x (kernel version 3.10.x - 4.18.x)
>
> Recommended:
>
> - RHEL 8.x
> - Ubuntu 22.04
> - MicroOS
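
You can check a node's kernel version before proceeding:

```bash
# Cilium/eBPF requires kernel 4.19+ (5.10+ recommended)
uname -r
```
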
### Node Port Requirements

| Port | Component | Description |
|------|-----------|-------------|
| 4240 | cilium-agent | Health API |
| 9962 | cilium-agent | Prometheus Metrics |
| 9879 | cilium-agent | Envoy Metrics |
| 9890 | cilium-agent | Agent Metrics |
| 9963 | cilium-operator | Prometheus Metrics |
| 9891 | cilium-operator | Operator Metrics |
| 9234 | cilium-operator | Metrics |
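
Before installing, you can confirm that none of these ports are already in use on a node; a minimal check (exact `ss` options may vary by distribution):

```bash
# List any existing listeners on the ports the Cilium components need
ss -lntp | grep -E ':(4240|9962|9879|9890|9963|9891|9234)\b'
```
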
### Kernel Configuration Requirements

Ensure the following kernel configuration options are enabled on the nodes (each can be checked with `grep` against `/boot/config-$(uname -r)`):

- `CONFIG_BPF=y` or `=m`
- `CONFIG_BPF_SYSCALL=y` or `=m`
- `CONFIG_NET_CLS_BPF=y` or `=m`
- `CONFIG_BPF_JIT=y` or `=m`
- `CONFIG_NET_SCH_INGRESS=y` or `=m`
- `CONFIG_CRYPTO_USER_API_HASH=y` or `=m`
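
A loop that checks all of them at once (a minimal sketch; on some distributions the running config is exposed as `/proc/config.gz` instead of a file under `/boot`):

```bash
# Report any required eBPF-related option that is neither built in (=y) nor modular (=m)
for opt in CONFIG_BPF CONFIG_BPF_SYSCALL CONFIG_NET_CLS_BPF \
           CONFIG_BPF_JIT CONFIG_NET_SCH_INGRESS CONFIG_CRYPTO_USER_API_HASH; do
  grep -qE "^${opt}=(y|m)" "/boot/config-$(uname -r)" || echo "MISSING: ${opt}"
done
```
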
## ACP 4.x Cilium Deployment Steps

### Step 1: Create Cluster

On the cluster creation page, set **Network Mode** to **Custom**. Wait until the cluster reaches the `EnsureWaitClusterModuleReady` status before deploying Cilium.

### Step 2: Install Cilium

1. Download the latest Cilium image package (v4.2.x) from the ACP marketplace.

2. Upload to the platform using violet:

```bash
export PLATFORM_URL=""
export USERNAME=''
export PASSWORD=''
export CLUSTER_NAME=''

violet push cilium-v4.2.17.tgz --platform-address "$PLATFORM_URL" --platform-username "$USERNAME" --platform-password "$PASSWORD" --clusters "$CLUSTER_NAME"
```

3. Create a temporary RBAC configuration on the business cluster where Cilium will be installed (this RBAC permission is not configured until the cluster has been successfully deployed):

Create the temporary RBAC configuration file:

```bash
cat > tmp.yaml << 'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cilium-clusterplugininstance-admin
  labels:
    app.kubernetes.io/name: cilium
rules:
  - apiGroups: ["cluster.alauda.io"]
    resources: ["clusterplugininstances"]
    verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cilium-admin-clusterplugininstance
  labels:
    app.kubernetes.io/name: cilium
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cilium-clusterplugininstance-admin
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: admin
EOF
```

Apply the temporary RBAC configuration:

```bash
kubectl apply -f tmp.yaml
```

4. Navigate to **Administrator → Marketplace → Cluster Plugins** and install Cilium.

5. After Cilium is successfully installed, delete the temporary RBAC configuration:

```bash
kubectl delete -f tmp.yaml
rm tmp.yaml
```

## Create L4 Load Balancer with Source IP Preservation

Perform the following operations from the command line on a master node.

### Step 1: Remove kube-proxy and Clean Up Rules

1. Get the current kube-proxy image:

```bash
kubectl get -n kube-system ds kube-proxy -oyaml | grep image
```

2. Back up and delete the kube-proxy DaemonSet:

```bash
kubectl -n kube-system get ds kube-proxy -oyaml > kube-proxy-backup.yaml

kubectl -n kube-system delete ds kube-proxy
```

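If you need to roll back later, the saved manifest can be re-applied (you may first need to strip server-populated fields such as `resourceVersion` from the backup):

```bash
kubectl apply -f kube-proxy-backup.yaml
```
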
3. Create a BroadcastJob to clean up kube-proxy rules:

```yaml
apiVersion: operator.alauda.io/v1alpha1
kind: BroadcastJob
metadata:
  name: kube-proxy-cleanup
  namespace: kube-system
spec:
  completionPolicy:
    ttlSecondsAfterFinished: 300
    type: Always
  failurePolicy:
    type: FailFast
  template:
    metadata:
      labels:
        k8s-app: kube-proxy-cleanup
    spec:
      serviceAccountName: kube-proxy
      hostNetwork: true
      restartPolicy: Never
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-node-critical
      tolerations:
        - operator: Exists
      containers:
        - name: kube-proxy-cleanup
          image: registry.alauda.cn:60070/tkestack/kube-proxy:v1.33.5 # Replace with the kube-proxy image from Step 1
          imagePullPolicy: IfNotPresent
          command:
            - /bin/sh
            - -c
            - "/usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf --hostname-override=$(NODE_NAME) --cleanup || true"
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /var/lib/kube-proxy
              name: kube-proxy
            - mountPath: /lib/modules
              name: lib-modules
              readOnly: true
            - mountPath: /run/xtables.lock
              name: xtables-lock
      volumes:
        - name: kube-proxy
          configMap:
            name: kube-proxy
        - name: lib-modules
          hostPath:
            path: /lib/modules
            type: ""
        - name: xtables-lock
          hostPath:
            path: /run/xtables.lock
            type: FileOrCreate
```

Save as `kube-proxy-cleanup.yaml` and apply:

```bash
kubectl apply -f kube-proxy-cleanup.yaml
```

The BroadcastJob is configured with `ttlSecondsAfterFinished: 300`, so it is automatically cleaned up within 5 minutes after completion.
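
While the job is running, you can watch its per-node pods (the label comes from the template above; the plural resource name `broadcastjobs` is assumed):

```bash
# Watch the per-node cleanup pods spawned by the BroadcastJob
kubectl -n kube-system get pods -l k8s-app=kube-proxy-cleanup -w

# Check the job itself
kubectl -n kube-system get broadcastjobs kube-proxy-cleanup
```
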
### Step 2: Create Address Pool

> **VIP Address Requirement**: Cilium L2 Announcement implements IP failover through ARP broadcasting. The VIP must therefore be in the **same Layer 2 network** as the cluster nodes so that ARP requests can be properly broadcast and answered.
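
To confirm that the VIP lies within the nodes' Layer 2 segment, compare it with the node's interface address (the interface name `eth0` is illustrative):

```bash
# Show the node's IPv4 address and prefix on the announcing interface
ip -4 addr show eth0
```
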
Save as `lb-resources.yaml`:

```yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: lb-pool
spec:
  blocks:
    - cidr: "192.168.132.192/32" # Replace with the actual VIP segment
---
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: l2-policy
spec:
  interfaces:
    - eth0 # Replace with the actual network interface name
  externalIPs: true
  loadBalancerIPs: true
```

Apply the configuration:

```bash
kubectl apply -f lb-resources.yaml
```
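
You can confirm that both resources were created (the plural CRD names shown here are assumed from the Cilium CRDs):

```bash
kubectl get ciliumloadbalancerippools
kubectl get ciliuml2announcementpolicies
```
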
### Step 3: Verification

Create a LoadBalancer Service to verify IP allocation and test connectivity.
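
For example, a minimal Service of this kind (the namespace, name, and selector are illustrative; `externalTrafficPolicy: Local` is the standard Kubernetes setting for preserving the client source IP):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: test
  namespace: cilium-123-1
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local # preserve the client source IP
  selector:
    app: test # assumed label on the backend Pods
  ports:
    - port: 80
      targetPort: 80
```
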
**Verification 1: Check if LB Service has been assigned an IP**

```bash
kubectl get svc -A
```

Expected output example:

```
NAMESPACE      NAME   TYPE           CLUSTER-IP   EXTERNAL-IP       PORT(S)        AGE
cilium-123-1   test   LoadBalancer   10.4.98.81   192.168.132.192   80:31447/TCP   35s
```

**Verification 2: Check the leader node that sends the ARP announcements**

```bash
kubectl get leases -A | grep cilium
```

Expected output example:

```
cpaas-system   cilium-l2announce-cilium-123-1-test   192.168.141.196   24s
```

**Verification 3: Test external access**

From an external client, access the LoadBalancer Service. A packet capture inside the Pod should show the client's IP as the source address, confirming that source IP preservation works.
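
A quick way to run this check (the Deployment name and the availability of `tcpdump` in the Pod image are assumptions):

```bash
# On the external client: hit the VIP assigned to the Service
curl http://192.168.132.192

# Inside the backend Pod: capture traffic on the service port
kubectl -n cilium-123-1 exec -it deploy/test -- tcpdump -nn port 80
```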
