-
Notifications
You must be signed in to change notification settings - Fork 136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] rump hypcall integration for LKL #255
base: master
Are you sure you want to change the base?
Conversation
arch/lkl/kernel/time.c
Outdated
@@ -29,6 +29,11 @@ void calibrate_delay(void) | |||
{ | |||
} | |||
|
|||
void read_persistent_clock(struct timespec *ts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might make sense to make this public i.e. in addition add some helper function in user space using settimeofday/adjtimex that syncs the host time with the lkl time. One possible use case is syncing the clock with the host in case of drift in a long-running application or a simple jump of the host time due to manual change. I know everyone can write this quickly themselves, but it would be a nice convenience function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the comment @Rondom.
It might make sense to make this public i.e. in addition add some helper function in user space using settimeofday/adjtimex that syncs the host time with the lkl time.
I don't understand this comment; what do you mean by making a function public ? btw, 66513c9 exposes all kernel functions by putting a prefix to avoid symbol collision.
(see #17 (comment) more detail)
though you might be aware of this, this additional function overrides a function of kernel (below), which each architecture can define its own way to access to a clock.
void __weak read_persistent_clock(struct timespec *ts)
in kernel/time/timekeeping.c.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not aware (and not really awake while writing that answer). Thanks for the explanation.
arch/lkl/defconfig
Outdated
@@ -32,8 +32,8 @@ CONFIG_EXT4_FS_SECURITY=y | |||
CONFIG_XFS_FS=y | |||
CONFIG_XFS_POSIX_ACL=y | |||
CONFIG_ISO9660_FS=y | |||
CONFIG_BTRFS_FS=y | |||
CONFIG_BTRFS_FS_POSIX_ACL=y | |||
#CONFIG_BTRFS_FS=y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason for disabling btrfs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is because this LKL kernel stops during make test, though I need to investigate more what's the root cause.
https://circleci.com/gh/libos-nuse/frankenlibc/136
I was suspecting some of busy loops may cause the stuck.
Awesome ! This looks good at a first glace except two minor issues. One is the BTRFS issue. Do you see the stuck issue when running w/ rump or with regular Linux backend? If it is under rump/qemu you can probably start gdb and see where it is looping. On the other hand I think that we perhaps should look into adding proper IRQ support, since we have at least one new target that supports it. There is another concertn I have with regard to the lkl__sync ops, there might be slower then the __sync primitives and that could cause some performance regressions. |
https://gist.github.com/thehajime/f8af72bf625e90d397c2584a285d97d3 it stopped at
in the current version, I can't find proper atomic ops (at least can't use In any case, since atomic ops seems to not have a portable interface, I thought it's a good idea having wrapper functions (or adding host_ops entries). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LKL_HIJACK_DEBUG doesn't work anymore for me. Can you also rebase to master so that I can check the performance of it?
arch/lkl/include/asm/thread_info.h
Outdated
@@ -22,6 +22,7 @@ struct thread_info { | |||
lkl_thread_t tid; | |||
struct task_struct *prev_sched; | |||
unsigned long stackend; | |||
void *rump_client; /* for syscall proxy */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addr of stackend should be the end of thread info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
though I couldn't find how stackend is used, I agree with you by changing the place of new member.
tools/lkl/lib/fs.c
Outdated
@@ -102,7 +110,6 @@ int lkl_get_virtio_blkdev(int disk_id, uint32_t *pdevid) | |||
|
|||
ret = snprintf(sysfs_path + sysfs_path_len, sizeof(sysfs_path) - sysfs_path_len, "/%s/block", | |||
virtio_name); | |||
free(virtio_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe to do so? seems like d_name is pointing a memory that will be freed in lkl_closedir
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be fixed now with replacing strdup
by lkl_ops.mem_alloc + memcpy
.
a40346e
to
78522ae
Compare
should be fixed now. thanks !
I have a trouble with a rebased version on qemu-x86_64 (on linux and rump/frankenlibc are working fine). It seems like the direct syscall patch (e1cd839) doesn't like me - revert the commit with a hello world program works fine. (#257 didn't help me btw) you can try a measurement with hijack library - this should work fine now. |
Yes. The performance looks neutral. The direct syscall requires support of long jmp. Maybe you want to check if it works as expected on qemu? |
78522ae
to
a0dd66c
Compare
a0dd66c
to
fff885e
Compare
setjmp/longjmp seems to be working on qemu. incomplete pthread implementation outside of LKL might be an issue with the direct syscall implementation (e1cd839). so #267 didn't help either. I will split some of commits which is independent from rump integration, while investigating the above issue separately. |
51eca75
to
76abb0b
Compare
Can one of the admins verify this patch? |
76abb0b
to
291ceb8
Compare
3ae4248
to
7d95a54
Compare
This will allow us to reimplpement atomic ops with lock/touch/unlock way if underlying hosts don't support a particular atomic ops. Signed-off-by: Hajime Tazaki <[email protected]>
This only handles signals right after a system call was invoked. Signed-off-by: Hajime Tazaki <[email protected]>
This allows us to define IRQ related operation outside of the kernel by host_ops implementation. Signed-off-by: Hajime Tazaki <[email protected]>
This host_ops entry will provide a way to pass any variables (e.g., environmental variables) to an LKL application from the host environment. Signed-off-by: Hajime Tazaki <[email protected]>
This commit adds an implementation of new host_ops by using rump hypercall (wrapped as an LKL host_ops) provided by rumpkernel project. The rump hypercall implementation is specifically integrated with the one provided by frankenlibc project, which is an alternative way to use rump hypercalls. There are other rump hypercall implementations such as rumprun, NetBSD kernel tree, but this commit only focuses on the frankenlibc integration. The new features covered by this commits ranges in various ways, but here is some highlights. - New thread primitives frankenlibc implements user-space thread implementation based on makecontext(3)&co with non-preemptive (cooperative) scheduler implementation. - LKL specific standard library bind frankenlibc offers cross-compiler build chains to build an LKL application bound to LKL-versioned standard library (i.e., libc). Our musl libc implementation is provided by different repository. So, any application can be built with this toolchain to use LKL instead of host libc. - New host environments support LKLed application now has potentials to run on the platform which rump hypercall implementations support. frankenlibc supports Liunx, FreeBSD, NetBSD, and qemu-arm. Others supposrts run an LKL applications on hypervisors (kvm, xen, etc) The frankenlibc code is located at the following repository, but it will be upstreamed to the original one (justincormack/frankenlibc). https://github.com/libos-nuse/frankenlibc Signed-off-by: Hajime Tazaki <[email protected]>
qemu-arm support with frankenlibc & lkl is limited at this moment: no network (virtio-net) is not supported yet, etc. Signed-off-by: Hajime Tazaki <[email protected]>
The code is not built as-is but put an implementation using rump syscall proxy which makes an LKL application communicate with outside of the process. Signed-off-by: Hajime Tazaki <[email protected]>
This commit offers an integration with rumprun unikernel. It is only tested with qemu-system-x86_64 but it potentially works fine with x86 baremetal machines as well as xen hypervior. An initial PCI device support is also implemented by relying on exposed virtual devices from hyperviros: virtio-net device is tested for the moment. The rumprun repository to support lkl build is not upstreamed yet - you can try with the following repository before it will be in. https://github.com/libos-nuse/rumprun Signed-off-by: Hajime Tazaki <[email protected]>
Signed-off-by: Hajime Tazaki <[email protected]>
Signed-off-by: Hajime Tazaki <[email protected]>
7d95a54
to
f82b294
Compare
This reverts commit 2c507e4 ("device core: Remove deprecated create_singlethread_workqueue"). This is a temporary fix for qemu-arm hangs on the following command that schedule_work() triggers (?) dead-lock that nanosleep on hello->main() never returns. $ qemu-system-arm -M versatilepb -m 512M -nographic -serial null \ -semihosting -kernel rumpobj/tests/hello This should be fixed in a transparent way in a future. Signed-off-by: Hajime Tazaki <[email protected]>
ea9039a
to
95be33a
Compare
This looks cool. What's the status on it? |
@copumpkin I've constantly updating this branch. 2 PRs were also submitted to rumpkernel but still no immediate upstream yet. |
commit 355627f upstream. Commit 7c05126 ("mm, fork: make dup_mmap wait for mmap_sem for write killable") made it possible to kill a forking task while it is waiting to acquire its ->mmap_sem for write, in dup_mmap(). However, it was overlooked that this introduced an new error path before the new mm_struct's ->uprobes_state.xol_area has been set to NULL after being copied from the old mm_struct by the memcpy in dup_mm(). For a task that has previously hit a uprobe tracepoint, this resulted in the 'struct xol_area' being freed multiple times if the task was killed at just the right time while forking. Fix it by setting ->uprobes_state.xol_area to NULL in mm_init() rather than in uprobe_dup_mmap(). With CONFIG_UPROBE_EVENTS=y, the bug can be reproduced by the same C program given by commit 2b7e866 ("fork: fix incorrect fput of ->exe_file causing use-after-free"), provided that a uprobe tracepoint has been set on the fork_thread() function. For example: $ gcc reproducer.c -o reproducer -lpthread $ nm reproducer | grep fork_thread 0000000000400719 t fork_thread $ echo "p $PWD/reproducer:0x719" > /sys/kernel/debug/tracing/uprobe_events $ echo 1 > /sys/kernel/debug/tracing/events/uprobes/enable $ ./reproducer Here is the use-after-free reported by KASAN: BUG: KASAN: use-after-free in uprobe_clear_state+0x1c4/0x200 Read of size 8 at addr ffff8800320a8b88 by task reproducer/198 CPU: 1 PID: 198 Comm: reproducer Not tainted 4.13.0-rc7-00015-g36fde05f3fb5 lkl#255 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014 Call Trace: dump_stack+0xdb/0x185 print_address_description+0x7e/0x290 kasan_report+0x23b/0x350 __asan_report_load8_noabort+0x19/0x20 uprobe_clear_state+0x1c4/0x200 mmput+0xd6/0x360 do_exit+0x740/0x1670 do_group_exit+0x13f/0x380 get_signal+0x597/0x17d0 do_signal+0x99/0x1df0 exit_to_usermode_loop+0x166/0x1e0 syscall_return_slowpath+0x258/0x2c0 entry_SYSCALL_64_fastpath+0xbc/0xbe ... Allocated by task 199: save_stack_trace+0x1b/0x20 kasan_kmalloc+0xfc/0x180 kmem_cache_alloc_trace+0xf3/0x330 __create_xol_area+0x10f/0x780 uprobe_notify_resume+0x1674/0x2210 exit_to_usermode_loop+0x150/0x1e0 prepare_exit_to_usermode+0x14b/0x180 retint_user+0x8/0x20 Freed by task 199: save_stack_trace+0x1b/0x20 kasan_slab_free+0xa8/0x1a0 kfree+0xba/0x210 uprobe_clear_state+0x151/0x200 mmput+0xd6/0x360 copy_process.part.8+0x605f/0x65d0 _do_fork+0x1a5/0xbd0 SyS_clone+0x19/0x20 do_syscall_64+0x22f/0x660 return_from_SYSCALL_64+0x0/0x7a Note: without KASAN, you may instead see a "Bad page state" message, or simply a general protection fault. Link: http://lkml.kernel.org/r/[email protected] Fixes: 7c05126 ("mm, fork: make dup_mmap wait for mmap_sem for write killable") Signed-off-by: Eric Biggers <[email protected]> Reported-by: Oleg Nesterov <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Commit 7c05126 ("mm, fork: make dup_mmap wait for mmap_sem for write killable") made it possible to kill a forking task while it is waiting to acquire its ->mmap_sem for write, in dup_mmap(). However, it was overlooked that this introduced an new error path before the new mm_struct's ->uprobes_state.xol_area has been set to NULL after being copied from the old mm_struct by the memcpy in dup_mm(). For a task that has previously hit a uprobe tracepoint, this resulted in the 'struct xol_area' being freed multiple times if the task was killed at just the right time while forking. Fix it by setting ->uprobes_state.xol_area to NULL in mm_init() rather than in uprobe_dup_mmap(). With CONFIG_UPROBE_EVENTS=y, the bug can be reproduced by the same C program given by commit 2b7e866 ("fork: fix incorrect fput of ->exe_file causing use-after-free"), provided that a uprobe tracepoint has been set on the fork_thread() function. For example: $ gcc reproducer.c -o reproducer -lpthread $ nm reproducer | grep fork_thread 0000000000400719 t fork_thread $ echo "p $PWD/reproducer:0x719" > /sys/kernel/debug/tracing/uprobe_events $ echo 1 > /sys/kernel/debug/tracing/events/uprobes/enable $ ./reproducer Here is the use-after-free reported by KASAN: BUG: KASAN: use-after-free in uprobe_clear_state+0x1c4/0x200 Read of size 8 at addr ffff8800320a8b88 by task reproducer/198 CPU: 1 PID: 198 Comm: reproducer Not tainted 4.13.0-rc7-00015-g36fde05f3fb5 lkl#255 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014 Call Trace: dump_stack+0xdb/0x185 print_address_description+0x7e/0x290 kasan_report+0x23b/0x350 __asan_report_load8_noabort+0x19/0x20 uprobe_clear_state+0x1c4/0x200 mmput+0xd6/0x360 do_exit+0x740/0x1670 do_group_exit+0x13f/0x380 get_signal+0x597/0x17d0 do_signal+0x99/0x1df0 exit_to_usermode_loop+0x166/0x1e0 syscall_return_slowpath+0x258/0x2c0 entry_SYSCALL_64_fastpath+0xbc/0xbe ... Allocated by task 199: save_stack_trace+0x1b/0x20 kasan_kmalloc+0xfc/0x180 kmem_cache_alloc_trace+0xf3/0x330 __create_xol_area+0x10f/0x780 uprobe_notify_resume+0x1674/0x2210 exit_to_usermode_loop+0x150/0x1e0 prepare_exit_to_usermode+0x14b/0x180 retint_user+0x8/0x20 Freed by task 199: save_stack_trace+0x1b/0x20 kasan_slab_free+0xa8/0x1a0 kfree+0xba/0x210 uprobe_clear_state+0x151/0x200 mmput+0xd6/0x360 copy_process.part.8+0x605f/0x65d0 _do_fork+0x1a5/0xbd0 SyS_clone+0x19/0x20 do_syscall_64+0x22f/0x660 return_from_SYSCALL_64+0x0/0x7a Note: without KASAN, you may instead see a "Bad page state" message, or simply a general protection fault. Link: http://lkml.kernel.org/r/[email protected] Fixes: 7c05126 ("mm, fork: make dup_mmap wait for mmap_sem for write killable") Signed-off-by: Eric Biggers <[email protected]> Reported-by: Oleg Nesterov <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: <[email protected]> [4.7+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
I'm curious how the signals be delivered to the rump-client, cooperated lkl app or hijacked app? |
Did you mean the rump-client as syscall proxy from external processes ? If not, would you care to elaborate ? |
Sure, I meant the rump-client as syscall proxy from external processes. |
this rump hypcall integration has several modes of execution: external process via syscall proxy is one of them. musl libc integration with no dependency to host kernel is the other one. I am specifically trying to support So, we need to have some sort of Is that clear to you ? Let me know if not. |
Is there a blocker on this? We have a rump hypcall-based environment, and would like to use Linux, but I've been waiting to put time into lkl until this was pulled in. Should we stop waiting? |
|
||
if (write) { | ||
if (size == 1) { | ||
#ifdef __x86_64__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing this means that x86-32 is not supported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LKL does not support x86-32, yet #341
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a messy part of lots of ifdefs and a result of what i personally tried. this can be expanded a platform which is currently not supported (e.g., x86-32).
|
||
static void *rump_tls_get(struct lkl_tls_key *key) | ||
{ | ||
return rumpuser_curlwp(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't get the TLS variable given the key, correct? Knowing rumpkernel's handling of TLS (where get
s are implicit on __thread
variables), it seems like there might be a mismatch between the LKL and rumpkernel models here. I like explicit calls to get TLS based on a key in LKL much more.
I suppose the assumption here is that LKL only uses TLS to store the currently active thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as you can see, this function drops the key argument the result only indicate a TLS val, not localized by a key.
I suppose the assumption here is that LKL only uses TLS to store the currently active thread?
Not really, just a limitation of in this RFC stage.
.mem_free = rump_mem_free, | ||
.ioremap = lkl_ioremap, | ||
.iomem_access = lkl_iomem_access, | ||
.jmp_buf_set = jmp_buf_set, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is the LKL way of doing this. Perhaps this isn't the place to ask this, but why not include a more abstract function that simply switches to another thread. Could be implemented with setjmp/longjmp, or with OS primitives below this layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(asking these question here should be fine with me)
I'm not sure if I understand your question. maybe the commit 79131f2 is relevant to this question ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gparmer I have an old branch that removes most of this low level ops and switches to higher level opts like alloc/switch/free contexts, see: https://github.com/tavip/linux/commits/the-expanse, in particular tavip@e182fe0.
The reason I did not merge this is that we have optimizations to reduce the system call latency (see e1cd839) which I could not map well with the new scheme.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you both for the information and references! @thehajime @tavip
I think that the patches @tavip referenced for host context switching do what I was contemplating. I didn't foresee the complications with the system call path (nor do I fully understand the interaction now).
From my perspective (i.e. wanting to not use lkl as a library, but as a normal resource manager and abstraction provider, while using host abstractions such as threads and memory management), some of this seemed a little overly complex. However, given the goals of the system, this is all making quite a bit more sense. It seems really quite well engineered.
I don't mean to distract lkl at all, but I wonder if a number of simplifying assumptions would simplify the design. For example, if I could make the following set of assumptions:
- all host threads that will make "system calls" into the library are created using the thread creation APIs of lkl (with the exception of the initial thread -- that executes
main
), - all interrupts are executed as host threads that the lkl creates and controls (thus execute from the Linux perspective more like the interrupt threads in the real-time patches), and
- the context switch API focused on switching host threads.
Could it enable "direct" system calls (without switching stacks/contexts)? Would it allow preemptive execution? SMP?
Again, sorry for distracting from the main lkl goals and thrust. I certainly understand that lkl will not adopt these assumptions, given its goals. However, I'd very much value your opinion if these assumptions could simplify some of the interactions with the host system.
return ret; | ||
} | ||
|
||
struct lkl_dev_net_ops rumpfd_ops = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LKL way of doing things is to export the raw hardware I/O to the underlying/hosting system. In the BMK rumpkernel implementation, you actually use the devices within the RK for I/O. Given that you've implemented the PCI rumpuser interface, are LKL device drivers supported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this particular piece of code (rumpfd_ops
) is used by https://github.com/libos-nuse/frankenlibc, which is a variant of rumprun unikernel. and you're right: with rumprun bmk LKL can use drivers/net/virtio-net.c or drivers/block/virtio-blk.c. I have tried e1000 driver of Linux with qemu e1000 emulation over LKL. Though recently rumprun LKL is not well maintained (I hope I do as early as possible).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm familiar with rumprun (we have our own hacked up version with a backend for our own OS). I was not sure that you'd be able to run native drivers (e.g. e1000) since I didn't see the virtual to physical rumpuser function (which is used in bmk
, for example to enable devices to understand how to interact with DMA).
Great to hear that devices are at least possible, if not supported!
I was unaware of the mainlining initiative so I certainly understand why this patchset is currently unmaintained.
@gparmer thanks for the comment (and sorry for the late response). My personal opinion on this PR is still not changed: I would go for upstream basic LKL part to linus tree first, then add more extensions like this PR after that. But for the mean time, it would be of course great to receive any comments/suggestions/patches :) (ref: #304) |
Add a new test case which performs double query of the bpf_mprog through libbpf API, but also via raw bpf(2) syscall. This is testing to gather first the count and then in a subsequent probe the full information with the program array without clearing passed structs in between. # ./vmtest.sh -- ./test_progs -t tc_opts [...] ./test_progs -t tc_opts [ 1.398818] tsc: Refined TSC clocksource calibration: 3407.999 MHz [ 1.400263] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd336761, max_idle_ns: 440795243819 ns [ 1.402734] clocksource: Switched to clocksource tsc [ 1.426639] bpf_testmod: loading out-of-tree module taints kernel. [ 1.428112] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel lkl#252 tc_opts_after:OK lkl#253 tc_opts_append:OK lkl#254 tc_opts_basic:OK lkl#255 tc_opts_before:OK lkl#256 tc_opts_chain_classic:OK lkl#257 tc_opts_chain_mixed:OK lkl#258 tc_opts_delete_empty:OK lkl#259 tc_opts_demixed:OK lkl#260 tc_opts_detach:OK lkl#261 tc_opts_detach_after:OK lkl#262 tc_opts_detach_before:OK lkl#263 tc_opts_dev_cleanup:OK lkl#264 tc_opts_invalid:OK lkl#265 tc_opts_max:OK lkl#266 tc_opts_mixed:OK lkl#267 tc_opts_prepend:OK lkl#268 tc_opts_query:OK <--- (new test) lkl#269 tc_opts_replace:OK lkl#270 tc_opts_revision:OK Summary: 19/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
Add a new test case to query on an empty bpf_mprog and pass the revision directly into expected_revision for attachment to assert that this does succeed. ./test_progs -t tc_opts [ 1.406778] tsc: Refined TSC clocksource calibration: 3407.990 MHz [ 1.408863] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fcaf6eb0, max_idle_ns: 440795321766 ns [ 1.412419] clocksource: Switched to clocksource tsc [ 1.428671] bpf_testmod: loading out-of-tree module taints kernel. [ 1.430260] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel lkl#252 tc_opts_after:OK lkl#253 tc_opts_append:OK lkl#254 tc_opts_basic:OK lkl#255 tc_opts_before:OK lkl#256 tc_opts_chain_classic:OK lkl#257 tc_opts_chain_mixed:OK lkl#258 tc_opts_delete_empty:OK lkl#259 tc_opts_demixed:OK lkl#260 tc_opts_detach:OK lkl#261 tc_opts_detach_after:OK lkl#262 tc_opts_detach_before:OK lkl#263 tc_opts_dev_cleanup:OK lkl#264 tc_opts_invalid:OK lkl#265 tc_opts_max:OK lkl#266 tc_opts_mixed:OK lkl#267 tc_opts_prepend:OK lkl#268 tc_opts_query:OK lkl#269 tc_opts_query_attach:OK <--- (new test) lkl#270 tc_opts_replace:OK lkl#271 tc_opts_revision:OK Summary: 20/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
(since this is RFC, I'd be happy to have any comments and suggestions from yours)
This PR provides an initial implementation of rump kernel hypercalls (*1) which brings us a couple of interesting features to LKL.
The highlight of new features are:
The addition of rump hypercalls does not interfere the current implementation of LKL, but with additional external repositories (*2), you will run LKLed applications with rump hypercall.
Non-hijack application support requires to build an (existing) application by dedicated build toolchain (cross compilation as rumprun does), but it will solve most of hijack limitations such as conflicted symbols/namespaces, non exposed symbols (e.g., replace hidden symbols in glibc). With this toolchain, I confirmed
nginx
,netperf
, andghc
(a haskell compiler) somehow work fine, though those need more work to be more than just a hello world.The green thread implementation is obviously useful to avoid the expensive task of context switches. I saw some of netperf benchmarks outperform pthread based one (will re-measure with netperf in my spare time).
(rump hypercall has pthread-based thread implementation but I didn't test it in this PR).
I'm going to upstream the external repositories (*2) as well as this PR so, this PR will be a base of all of external dependencies.
I think merging non-master branch of LKL and synchronize periodically with master branch is also reasonable since the changes (of this PR) are a bit large.
*1 rump kernel
http://book.rumpkernel.org
*2 additional (external) repositories
This change is