eBPF basic

The BPF ring buffer (ringbuf) solves the memory-efficiency and event re-ordering problems of the BPF perf buffer (perfbuf). It provides a perfbuf-compatible API for easy migration as well as a new reserve/commit API with better usability. Both synthetic and real-world benchmarks show that it outperforms perfbuf in almost all cases, so consider making it the default choice for sending data from a BPF program to user space.

It is a multi-producer, single-consumer (MPSC) queue and can be safely shared across multiple CPUs simultaneously.

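The ring buffer overcomes the two main drawbacks of perfbuf, memory overhead and event re-ordering, so it is the recommended replacement for perfbuf. At its core, perfbuf is a set of per-CPU circular buffers. The ring buffer also supports variable-length records, efficient data exchange between kernel and user space, and both epoll and busy-loop data notification.

Because perfbuf allocates one circular buffer per CPU, the total buffer size grows with the number of CPU cores, and much of that space ends up redundant. The ring buffer is instead a single large global buffer whose size is independent of the core count.

In addition, perfbuf must first copy the event into a per-CPU array (the BPF stack is small, so larger events cannot be placed on the stack) and then copy the data into perfbuf. If perfbuf is out of space at that point and the second copy fails, the first copy into the per-CPU array was wasted work. The ring buffer avoids this with a two-phase reserve/commit scheme: reserve first to make sure space is available, then write the data into the ring buffer.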
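To make the reserve-then-write ordering concrete, here is a minimal user-space sketch. It is a toy, single-threaded model, not the kernel's implementation: the names `toy_rb`, `toy_reserve`, and `toy_commit` are hypothetical, and wrap-around, the consumer side, and multi-producer synchronization are all left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of two-phase reserve/commit. Hypothetical names throughout;
 * this is NOT how the kernel ring buffer is implemented. */
#define TOY_RB_SIZE 64

struct toy_rb {
	uint8_t data[TOY_RB_SIZE];
	size_t reserved;  /* bytes already handed out to producers */
	size_t committed; /* bytes published to the (absent) consumer */
};

/* Phase 1: claim space up front. Returns NULL when the record does not
 * fit, so the caller can bail out before building the event at all. */
static void *toy_reserve(struct toy_rb *rb, size_t len)
{
	if (len > TOY_RB_SIZE - rb->reserved)
		return NULL;
	void *slot = &rb->data[rb->reserved];
	rb->reserved += len;
	return slot;
}

/* Phase 2: publish a record that was previously reserved and filled in. */
static void toy_commit(struct toy_rb *rb, size_t len)
{
	assert(rb->committed + len <= rb->reserved);
	rb->committed += len;
}
```

A caller would run `toy_reserve`, fill the returned slot in place, and then call `toy_commit`. When `toy_reserve` returns NULL, nothing has been copied yet, which is the saving over the perfbuf path described above.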

// Define the ring buffer
struct {
	__uint(type, BPF_MAP_TYPE_RINGBUF);
	__uint(max_entries, 256 * 1024);
} rb SEC(".maps");

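// Use the ring buffer (inside a BPF program)
/* reserve sample from BPF ringbuf */
struct event *e;
e = bpf_ringbuf_reserve(&rb, sizeof(*e), 0);
if (!e)
	return 0; /* no space left: drop the event (the verifier requires this NULL check) */
/* ... fill in the fields of *e here ... */
/* successfully submit it to user-space for post-processing */
bpf_ringbuf_submit(e, 0);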

// Create the ring buffer poller in the user-space program
/* Set up ring buffer polling */
rb = ring_buffer__new(bpf_map__fd(skel->maps.rb), handle_event, NULL, NULL);
if (!rb) {
	err = -1;
	fprintf(stderr, "Failed to create ring buffer\n");
	goto cleanup;
}
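The `handle_event` callback passed to `ring_buffer__new` is not shown in these notes. A minimal sketch might look like the following, assuming a hypothetical `struct event` layout (the real one must match whatever the BPF side writes) and a hypothetical `events_seen` counter added only for illustration. The function signature follows libbpf's sample callback type, `int (*)(void *ctx, void *data, size_t size)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical event layout, for illustration only; it must match the
 * struct the BPF program reserves and submits. */
struct event {
	int pid;
	char comm[16];
};

/* Hypothetical counter so the effect of the callback is observable. */
static int events_seen;

/* Called once per record consumed from the ring buffer.
 * ctx is the pointer given as the third argument of ring_buffer__new. */
static int handle_event(void *ctx, void *data, size_t data_sz)
{
	const struct event *e = data;

	(void)ctx; /* unused in this sketch */
	assert(data != NULL);

	/* Ignore records too short to hold a full struct event. */
	if (data_sz < sizeof(*e))
		return 0;

	printf("pid=%d comm=%s\n", e->pid, e->comm);
	events_seen++;
	return 0; /* 0 keeps consuming; a negative value aborts polling */
}
```

Returning 0 lets the poll loop continue with the next record. Checking `data_sz` before the cast guards against a mismatch between the user-space and BPF-side struct definitions.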

// Poll data from the ring buffer
while (!exiting) {
	err = ring_buffer__poll(rb, 100 /* timeout, ms */);
	/* Ctrl-C will cause -EINTR */
	if (err == -EINTR) {
		err = 0;
		break;
	}
	if (err < 0) {
		printf("Error polling ring buffer: %d\n", err);
		break;
	}
}

/* Clean up */
ring_buffer__free(rb);