Discussion: uneven ActiveConn distribution, advice welcome (problem now solved)

master151_11:~ # ipvsadm -ln
IP Virtual Server version 1.2.1 (size=1048576)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.16.151.45:28000 wrr
-> 172.16.151.35:28000 Tunnel 1 82 7
-> 172.16.151.34:28000 Tunnel 1 118 5
-> 172.16.151.33:28000 Tunnel 1 118 2
-> 172.16.151.30:28000 Tunnel 1 123 5
TCP 172.16.151.44:28000 wrr
-> 172.16.151.29:28000 Tunnel 1 10 13
-> 172.16.151.10:28000 Tunnel 1 7 13
-> 172.16.151.138:28000 Tunnel 1 94 12
TCP 172.16.151.180:3306 wrr
-> 172.16.151.37:3306 Tunnel 1 27 0
-> 172.16.151.31:3306 Tunnel 1 33 0

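The output above from the LD (the load director) shows that connections for the middle virtual service (172.16.151.44:28000) are distributed very unevenly. Some background information:
// active connections
atomic_t activeconns; /* active connections */
// inactive connections
atomic_t inactconns; /* inactive connections */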

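Because in Tunnel mode the reply packets from the RS do not pass through the LD, the activeconns value here is inferred from the ACK state. My question is:
Why does only this one virtual service show unbalanced activeconns, while the other 5-6 services on the same director all look well balanced? I have been observing this for nearly two months. Help from anyone with expertise would be much appreciated, thanks!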
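
One way to put a number on the imbalance is to parse the `ipvsadm -ln` listing from the top of this post and compare the busiest and least busy RS within each virtual service. The following is only a rough sketch in Python: the function names `active_conns_by_service` and `spread` are my own (not part of ipvsadm), and the parser assumes the six-column realserver layout shown in the listing.

```python
# Data lines copied from the `ipvsadm -ln` listing in this post.
LISTING = """\
TCP 172.16.151.45:28000 wrr
-> 172.16.151.35:28000 Tunnel 1 82 7
-> 172.16.151.34:28000 Tunnel 1 118 5
-> 172.16.151.33:28000 Tunnel 1 118 2
-> 172.16.151.30:28000 Tunnel 1 123 5
TCP 172.16.151.44:28000 wrr
-> 172.16.151.29:28000 Tunnel 1 10 13
-> 172.16.151.10:28000 Tunnel 1 7 13
-> 172.16.151.138:28000 Tunnel 1 94 12
TCP 172.16.151.180:3306 wrr
-> 172.16.151.37:3306 Tunnel 1 27 0
-> 172.16.151.31:3306 Tunnel 1 33 0
"""


def active_conns_by_service(text):
    """Group ActiveConn counts by virtual service (hypothetical helper)."""
    services = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("TCP", "UDP"):
            # Virtual service line: protocol, VIP:port, scheduler.
            current = fields[1]
            services[current] = {}
        elif (fields[0] == "->" and current is not None
              and len(fields) == 6 and fields[4].isdigit()):
            # Realserver line: -> addr forward weight ActiveConn InActConn.
            services[current][fields[1]] = int(fields[4])
    return services


def spread(counts):
    """Ratio of the busiest to the least busy RS (min is clamped to 1)."""
    values = list(counts.values())
    return max(values) / max(min(values), 1)


for vip, counts in active_conns_by_service(LISTING).items():
    print(vip, counts, round(spread(counts), 2))
```

For the listing in this post, the 172.16.151.44:28000 service comes out at a max/min ratio of about 13 (94 against 7), while the other two services stay at 1.5 or below.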

————————————————————————————————————————————————————————————
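The cause has been found: the close-connection times configured on the RS side were inconsistent (some were too short). The RS with the higher connection count had a close time of 3600, while the others had 30.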
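
Why a longer close time shows up as a higher ActiveConn count can be sketched with Little's law (mean connections held = arrival rate x mean hold time). This is a toy model, not a description of what IPVS does internally. It assumes that wrr with equal weights hands every RS the same rate of new connections, that the 3600 and 30 values are in seconds, and that each connection stays open until either the client is finished or the RS closes it. The arrival rate and the client durations below are made-up illustration values.

```python
def effective_hold(client_durations, server_close_timeout):
    """Mean time a connection stays open when the RS closes it after
    server_close_timeout: each one lasts min(client time, timeout)."""
    held = [min(t, server_close_timeout) for t in client_durations]
    return sum(held) / len(held)


def steady_state_active(arrival_rate, hold_time):
    """Little's law: mean open connections = arrival rate x mean hold time."""
    return arrival_rate * hold_time


# Hypothetical values, for illustration only.
rate_per_rs = 0.5                          # new connections per second per RS
client_durations = [10, 20, 60, 600, 5000]  # seconds a client would stay connected

for timeout in (30, 3600):
    hold = effective_hold(client_durations, timeout)
    print(timeout, hold, steady_state_active(rate_per_rs, hold))
```

With the same arrival rate, the RS that keeps idle connections for 3600 ends up holding far more open connections than the ones that close after 30, which matches the direction of the imbalance seen in the listing. The exact ratio depends on how long clients really stay connected and on the director's own timeout table.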

————————————————————————————————————————————————————————————
Some information that may help in understanding this problem is attached below:
ActiveConn/InActConn connection (LVS)
The output of ipvsadm lists connections, either as

* ActiveConn - in ESTABLISHED state
* InActConn - any other state

With LVS-NAT, the director sees all the packets between the client and the realserver, so always knows the state of tcp connections and the listing from ipvsadm is accurate. However for LVS-DR, LVS-Tun, the director does not see the packets from the realserver to the client. Termination of the tcp connection occurs by one of the ends sending a FIN (see W. Richard Stevens, TCP/IP Illustrated Vol 1, ch 18, 1994, pub Addison Wesley) followed by reply ACK from the other end. Then the other end sends its FIN, followed by an ACK from the first machine. If the realserver initiates termination of the connection, the director will only be able to infer that this has happened from seeing the ACK from the client. In either case the director has to infer that the connection has closed from partial information and uses its own table of timeouts to declare that the connection has terminated. Thus the count in the InActConn column for LVS-DR, LVS-Tun is inferred rather than real.

Entries in the ActiveConn column come from

* service with an established connection. Examples of services which hold connections in the ESTABLISHED state for long enough to see with ipvsadm are telnet and ftp (port 21).

Entries in the InActConn column come from

* Normal operation
  o Services like http (in non-persistent, i.e. HTTP/1.0, mode) or ftp-data (port 20), which close the connection as soon as the hit/data (html page, gif, etc.) has been retrieved (<1 sec). You're unlikely to see anything in the ActiveConn column with these LVS'ed services. You'll see an entry in the InActConn column until the connection times out. If you're getting 1000 connections/sec and it takes 60 secs for the connection to time out (the normal timeout), then you'll have 60,000 InActConns. This number of InActConn is quite normal. If you are running an e-commerce site with 300 secs of persistence, you'll have 300,000 InActConn entries. Each entry takes 128 bytes (300,000 entries is about 40M of memory; make sure you have enough RAM for your application). The number of ActiveConn might be very small.
* Pathological Conditions (i.e. your LVS is not set up properly)
  o identd delayed connections: The 3 way handshake to establish a connection takes only 3 exchanges of packets (i.e. it's quick on any normal network) and you won't be quick enough with ipvsadm to see the connection in the states before it becomes ESTABLISHED. However, if the service on the realserver is under authd/identd, you'll see an InActConn entry during the delay period.
  o Incorrect routing (usually the wrong default gw for the realservers): In this case the 3 way handshake will never complete, the connection will hang, and there'll be an entry in the InActConn column.

Usually the number of InActConn will be larger or very much larger than the number of ActiveConn.
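
The InActConn figures quoted in the excerpt are simple products, and they can be re-derived with a few lines of Python. This is just a back-of-the-envelope sketch: the helper names are mine, and the 128 bytes per entry is the figure given in the excerpt (it may differ between kernel versions).

```python
ENTRY_BYTES = 128  # per-entry size as quoted in the excerpt


def inactive_entries(conn_per_sec, timeout_secs):
    """Entries lingering in the table = arrival rate x time until expiry."""
    return conn_per_sec * timeout_secs


def table_memory_bytes(entries, entry_bytes=ENTRY_BYTES):
    """Memory taken by that many connection entries."""
    return entries * entry_bytes


normal = inactive_entries(1000, 60)        # 60 sec timeout
persistent = inactive_entries(1000, 300)   # 300 sec persistence
print(normal, persistent, table_memory_bytes(persistent))
```

At 1000 connections/sec this gives 60,000 entries for a 60 sec timeout and 300,000 entries for 300 secs of persistence; 300,000 entries at 128 bytes each is 38,400,000 bytes, which is where the "about 40M" comes from.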


I have run into this problem too. Where can the close conn time defined on the RS side be changed?
