GALAXY: tcp connection live migration with CRIU

Galaxy is fascinating.

Program: simple client/server over loopback

Tools: CRIU does the checkpoint/restore work

ps -ef

liuqius+ 20052 2464 0 22:00 pts/6 00:00:00 ./server 7022
liuqius+ 20053 2464 0 22:00 pts/6 00:00:00 ./client 127.0.0.1 7022

SIGCHLD:

I start the server and the client, then use CRIU to checkpoint the server:

root@liuqiushan-K43SA:/home/liuqiushan/BCMatrix/criu/deps/criu-x86_64# criu dump -D checkpoint -t 20052 --shell-job --tcp-established

CRIU generates the checkpoint image files, but it also reports some errors:

Error (parasite-syscall.c:389): si_code=1 si_pid=20114 si_status=0
Error (parasite-syscall.c:389): si_code=1 si_pid=20119 si_status=0

So let’s see parasite-syscall.c:

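I think we can ignore this error: for SIGCHLD, si_code=1 is CLD_EXITED and si_status=0 is the exit status, so the signal only reports a child process that exited normally.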
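The fields in those log lines are the standard siginfo_t members that accompany SIGCHLD. A minimal sketch that reproduces the same values outside of CRIU is shown here. The helper names `child_exit_info` and `cld_code_name` are hypothetical and are not taken from parasite-syscall.c.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `status`, then collect
 * the siginfo fields (si_code, si_pid, si_status) describing that exit.
 * Returns 0 on success, -1 on failure. */
int child_exit_info(int status, siginfo_t *info)
{
    assert(info != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(status);              /* child: terminate immediately */

    memset(info, 0, sizeof(*info));
    /* WEXITED asks waitid() to report children that have terminated. */
    if (waitid(P_PID, (id_t)pid, info, WEXITED) < 0)
        return -1;
    return 0;
}

/* Map the SIGCHLD si_code values to their symbolic names. */
const char *cld_code_name(int si_code)
{
    switch (si_code) {
    case CLD_EXITED:    return "CLD_EXITED";    /* child called exit() */
    case CLD_KILLED:    return "CLD_KILLED";    /* killed by a signal */
    case CLD_DUMPED:    return "CLD_DUMPED";    /* killed, core dumped */
    case CLD_TRAPPED:   return "CLD_TRAPPED";   /* traced child trapped */
    case CLD_STOPPED:   return "CLD_STOPPED";
    case CLD_CONTINUED: return "CLD_CONTINUED";
    default:            return "unknown";
    }
}
```

Calling `child_exit_info(0, &info)` and printing `info.si_code`, `info.si_pid` and `info.si_status` gives a line of the same shape as the CRIU message, which is why a clean exit of a helper process is a plausible reading of it.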

Continue.

We finish dumping the server and the client separately. If we restore them right away, the restore succeeds. But if we reboot the laptop first, we get the following errors:

root@liuqiushan-K43SA:/home/liuqiushan/BCMatrix/criu/deps/criu-x86_64# criu restore -d -D checkpoint --shell-job --tcp-established
iptables: Bad rule (does a matching rule exist in that chain?).
Error (util.c:580): exited, status=1
Error (netfilter.c:69): Iptables configuration failed: No such file or directory
iptables: Bad rule (does a matching rule exist in that chain?).
Error (util.c:580): exited, status=1
Error (netfilter.c:69): Iptables configuration failed: No such file or directory
Can’t read socket: Connection reset by peer

root@liuqiushan-K43SA:/home/liuqiushan/BCMatrix/criu/deps/criu-x86_64# criu restore -d -D checkpoint_client --shell-job --tcp-established
iptables: Bad rule (does a matching rule exist in that chain?).
Error (util.c:580): exited, status=1
Error (netfilter.c:69): Iptables configuration failed: No such file or directory
iptables: Bad rule (does a matching rule exist in that chain?).
Error (util.c:580): exited, status=1
Error (netfilter.c:69): Iptables configuration failed: No such file or directory
PP 26 -> -1

Pay attention to: iptables: Bad rule (does a matching rule exist in that chain?).

The messages from util.c and netfilter.c point the same way, so the problem to focus on is iptables.

More importantly, the server restores successfully, but the client does not:

the client is restored too, but after the ‘PP 26 -> -1’ line (above) it exits abnormally.

We should pay attention to this information:

Because of the SIGCHLD errors mentioned above, I tried changing the order:

dump the client first, then the server; restore the server first, then the client.

That also failed.

A little confusing.

iptables rule:

Now we care about:

Can’t read socket: Connection reset by peer

It’s fatal.

“Connection reset by peer” is the TCP/IP equivalent of slamming the phone back on the hook. It’s more polite than merely not replying, leaving one hanging. But it’s not the FIN-ACK expected of the truly polite TCP/IP converseur.
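At the socket API level, a reset shows up as a read() that fails with errno set to ECONNRESET. A minimal sketch that provokes this on loopback is shown here. It is not the post's client or server: the function name is hypothetical, and the peer is made to send an RST by closing with SO_LINGER set to a zero timeout, which is only one of several ways an RST can be generated.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns the errno seen by the client's read() after
 * the peer aborts the connection, 0 if no error was seen, or -1 if the
 * setup itself fails. */
int errno_after_peer_reset(void)
{
    int result = -1;
    int lfd = -1, cfd = -1, sfd = -1;
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    struct linger lg = { .l_onoff = 1, .l_linger = 0 };
    char byte;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the kernel pick a port */

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        goto out;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
        goto out;
    assert(addr.sin_port != 0);             /* a port must have been assigned */

    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0 ||
        connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;

    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0)
        goto out;

    /* With a zero linger timeout, close() aborts the connection: the
     * kernel sends RST instead of going through the FIN handshake. */
    if (setsockopt(sfd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg)) < 0)
        goto out;
    close(sfd);
    sfd = -1;

    /* The client now reads from a connection that the peer has reset. */
    result = (read(cfd, &byte, 1) < 0) ? errno : 0;

out:
    if (sfd >= 0)
        close(sfd);
    if (cfd >= 0)
        close(cfd);
    if (lfd >= 0)
        close(lfd);
    return result;
}
```

Passing the returned value to strerror() should give the same "Connection reset by peer" text that appears in the restore output.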

So now the question is: who causes ‘Connection reset by peer’?

Does CRIU cause it when it does the dump?

or

Did the iptables rules change after the machine rebooted?

First, I need more details about iptables.

Let us see: +reset+by+peer

And I try this work:

I made the client establish its connection inside a while(1) loop, so that it can build a new connection after the old one is reset by the peer.
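The post does not show the client's source, so the following is only a sketch of what such a retry loop might look like. `connect_with_retry` and its parameters are hypothetical. A `max_attempts` of 0 or less retries forever, which corresponds to the while(1) variant.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: keep trying to connect to ip:port, sleeping
 * delay_ms between attempts. Returns a connected socket descriptor,
 * or -1 if the address is invalid or all attempts failed. */
int connect_with_retry(const char *ip, uint16_t port,
                       int max_attempts, unsigned int delay_ms)
{
    struct sockaddr_in addr;
    int attempt = 0;

    assert(ip != NULL);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;                  /* not a valid IPv4 address */

    for (;;) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
            return fd;              /* connected */

        /* After a failed connect() the socket should not be reused,
         * so close it and create a fresh one on the next attempt. */
        close(fd);

        if (max_attempts > 0 && ++attempt >= max_attempts)
            return -1;

        struct timespec ts = {
            (time_t)(delay_ms / 1000),
            (long)(delay_ms % 1000) * 1000000L
        };
        nanosleep(&ts, NULL);
    }
}
```

In a client built this way, a read() or write() that fails with ECONNRESET would close the old descriptor and call `connect_with_retry("127.0.0.1", 7022, 0, 500)` again. Note that this only helps if the process survives long enough to reach the retry.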

It failed. After a reboot, the client is still terminated.

Pay attention to iptables and the link above.

Who sends the RST?

Note that although the TCP connection cannot be recovered, the server is still restored, while the client is terminated. Look at the server side: why can it be restored?

Pay attention to:

iptables: Bad rule (does a matching rule exist in that chain?)

I tried modifying the CRIU source code:

netfilter.c

From reading its code, I think CRIU does restore the iptables rule related to the TCP connection, unless the rule itself is wrong.
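One reading that fits the message: "Bad rule (does a matching rule exist in that chain?)" is what iptables prints when -D names a rule that is not in the chain. If CRIU adds a per-connection DROP rule with -A at dump time and removes it with -D at restore time, then the delete has nothing to match once the rules are gone, whether because a reboot flushed them or because an earlier restore already removed them. The sketch below builds such a command string. The rule shape is an assumption modeled on the general idea, not copied from netfilter.c, and `format_conn_rule` and the client port are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format an iptables command that adds ("-A") or
 * deletes ("-D") a DROP rule for one TCP connection in the given chain.
 * Assumed rule shape; the real format string in CRIU may differ.
 * Returns 0 on success, -1 if the buffer is too small. */
int format_conn_rule(char *buf, size_t size,
                     const char *action, const char *chain,
                     const char *src, unsigned int sport,
                     const char *dst, unsigned int dport)
{
    assert(buf != NULL);

    int n = snprintf(buf, size,
                     "iptables -t filter %s %s --protocol tcp "
                     "--source %s --sport %u "
                     "--destination %s --dport %u -j DROP",
                     action, chain, src, sport, dst, dport);

    /* snprintf returns the length it wanted to write; a value >= size
     * means the output was truncated. */
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With "-A" and then "-D" for the same arguments, the second command removes exactly the rule the first one added. Running only the "-D" form against an empty chain is a quick way to check whether it reproduces the "Bad rule" message.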

No reboot: tcp checkpoint

Even without rebooting the machine, the restore runs cleanly only the first time.

For the second time:

root@liuqiushan-K43SA:/home/liuqiushan/BCMatrix/criu/deps/criu-x86_64# criu restore -d -D checkpoint --shell-job --tcp-established
iptables: Bad rule (does a matching rule exist in that chain?).
Error (util.c:580): exited, status=1
Error (netfilter.c:69): Iptables configuration failed: No such file or directory
iptables: Bad rule (does a matching rule exist in that chain?).
Error (util.c:580): exited, status=1
Error (netfilter.c:69): Iptables configuration failed: No such file or directory

Despite these errors the restore succeeds: the client and the server recover their TCP connection.

But for the third time:
