A question about heartbeat:

After my heartbeat has started and run for a while, once it detects that the other node is dead, the following messages appear:
heartbeat[18478]: 2007/07/23_18:45:10 WARN: node wxsc: is dead
heartbeat[18478]: 2007/07/23_18:45:10 info: Comm_now_up(): updating status to active
heartbeat[18478]: 2007/07/23_18:45:10 info: Local status now set to: 'active'
heartbeat[18478]: 2007/07/23_18:45:10 info: Starting child client "/data/heartbeat/lib/heartbeat/ipfail" (17,65)
heartbeat[18478]: 2007/07/23_18:45:10 WARN: No STONITH device configured.
heartbeat[18478]: 2007/07/23_18:45:10 WARN: Shared disks are not protected.
heartbeat[18478]: 2007/07/23_18:45:10 info: Resources being acquired from wxsc.
heartbeat[18490]: 2007/07/23_18:45:10 info: Starting "/data/heartbeat/lib/heartbeat/ipfail" as uid 17 gid 65 (pid 18490)
harc[18491]: 2007/07/23_18:45:10 info: Running /etc/ha.d/rc.d/status status
mach_down[18511]: 2007/07/23_18:45:10 info: /data/heartbeat/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
heartbeat[18481]: 2007/07/23_18:45:10 WARN: Performed 1 more non-realtime malloc calls.
heartbeat[18481]: 2007/07/23_18:45:10 info: Total non-realtime malloc bytes: 12288
mach_down[18511]: 2007/07/23_18:45:10 info: mach_down takeover complete for node wxsc.
heartbeat[18478]: 2007/07/23_18:45:10 info: Initial resource acquisition complete (T_RESOURCES(us))
heartbeat[18478]: 2007/07/23_18:45:10 info: mach_down takeover complete.
heartbeat[18478]: 2007/07/23_18:45:10 ERROR: ipc_bufpool_update: magic number in head does not match.Something very bad happened, abort now, farside pid =18490
heartbeat[18478]: 2007/07/23_18:45:10 ERROR: magic=315b6c69, expected value=abcd
heartbeat[18478]: 2007/07/23_18:45:10 info: pool: refcount=1, startpos=0x810d2b0, currpos=0x810d3f8,consumepos=0x810d35c, endpos=0x810e298, size=4096
heartbeat[18478]: 2007/07/23_18:45:10 info: nmsgs=0
heartbeat[18481]: 2007/07/23_18:45:10 CRIT: Emergency Shutdown: Master Control process died.
heartbeat[18481]: 2007/07/23_18:45:10 CRIT: Killing pid 18478 with SIGTERM
heartbeat[18481]: 2007/07/23_18:45:10 CRIT: Killing pid 18482 with SIGTERM
heartbeat[18481]: 2007/07/23_18:45:10 CRIT: Killing pid 18483 with SIGTERM
heartbeat[18481]: 2007/07/23_18:45:10 CRIT: Killing pid 18484 with SIGTERM
heartbeat[18481]: 2007/07/23_18:45:10 CRIT: Killing pid 18485 with SIGTERM
heartbeat[18481]: 2007/07/23_18:45:10 CRIT: Killing pid 18486 with SIGTERM
heartbeat[18481]: 2007/07/23_18:45:10 CRIT: Killing pid 18487 with SIGTERM
heartbeat[18481]: 2007/07/23_18:45:10 CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
heartbeat[18492]: 2007/07/23_18:45:10 info: Local Resource acquisition completed.

How should I fix this? Has anyone run into this problem before?


Hello, Mr. Zhang!

My environment is:
linux-2.4.20
ipvs-1.0.9
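Persistent connections are enabled and the persistence timeout is set to 30, but I have the following problem:
The first request was scheduled to 22. I then wanted to steer connections to 23 by raising the Weight of 23.
At this point there are no connection entries left: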
-bash-2.05b# ipvsadm -Lc
IPVS connection entries
pro expire state source virtual destination

When the same client connects again, the connection is still scheduled to 22:
IP Virtual Server version 1.0.9 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  200.200.30.253:80 wlc persistent 10
  -> 200.200.30.22:80             Local   6      0          1
  -> 200.200.30.23:80             Route   60000  0          0
  -> 200.200.30.24:80             Route   6      0          0

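The symptom is: even after the persistence timeout has elapsed, ipvs still keeps the persistent connection for that IP. Where is this going wrong?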
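
For reference, the setup described above could be reproduced roughly as follows. This is only a dry-run sketch that prints the ipvsadm commands instead of running them. The `run` and `lvs_cmds` helpers are hypothetical names. The `-g` forwarding method is assumed from the "Route" column in the listing, and the initial weight of 6 for 23 is also an assumption.

```shell
#!/bin/sh
# Dry-run sketch: prints the ipvsadm commands rather than executing them.
VIP="200.200.30.253:80"   # virtual service taken from the listing above
PERSIST=30                # persistence timeout in seconds, as described in the post

# Hypothetical helper: only echoes the command.
# Replace the body with "$@" to actually run it (needs root and the ip_vs module).
run() { echo "$@"; }

lvs_cmds() {
    # Virtual service: wlc scheduler with persistence enabled.
    run ipvsadm -A -t "$VIP" -s wlc -p "$PERSIST"
    # Real servers; -g (direct routing) is an assumption, as is the starting weight of 23.
    run ipvsadm -a -t "$VIP" -r 200.200.30.22:80 -g -w 6
    run ipvsadm -a -t "$VIP" -r 200.200.30.23:80 -g -w 6
    run ipvsadm -a -t "$VIP" -r 200.200.30.24:80 -g -w 6
    # Raise the weight of 23 in the hope that new connections go there.
    run ipvsadm -e -t "$VIP" -r 200.200.30.23:80 -g -w 60000
    # List the current connection entries numerically.
    run ipvsadm -L -n -c
}

lvs_cmds
```

Comparing the `ipvsadm -L -n -c` output before and after the timeout, from the same client, should show whether an entry for that client IP is still present when the second connection arrives.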