See below for a longer, more
detailed technical description.
This document does not
cover the behavior of other SailFin components like
JMS/JDBC/IIOP/SSR/CLB/GMS on a multihomed machine. Extensive testing is
required on a multihomed setup similar to what has been described in
the Description.
This document aims to
address the network-related concerns a user might have while
configuring SailFin on a machine with multiple NICs. Since each
interface (IP address) in a production environment can potentially
be connected to a different network (switch/router/broadcast domain),
multihome support in SailFin is important for making the IP
traffic model of the system deterministic. The document also discusses how
separation of internal and external traffic can be achieved
through configuration.
Current behavior:
It is possible to configure the listen address (incoming) for every
listener exposed in the domain.xml configuration. This ensures
that, on a multihomed system, a listener accepts packets only if the
destination IP address is the one configured for that listener,
and not through any other IP address available on the
system. This behavior is necessary but not sufficient for a multihome
deployment.
When a connection is created from SailFin, the IP
address used by the local socket to connect to
the destination is non-deterministic and depends on the
platform and the operating system network configuration. SailFin does not
exercise any explicit control (bind) during socket creation
(at least on the SIP and the HTTP CLB proxy side), and so the source
address is whatever the JDK ends up choosing.
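The explicit control mentioned above can be illustrated with the standard java.net API. The following is only a minimal sketch, not SailFin code: the class and method names (BoundClientSocket, connectFrom) are hypothetical, and it assumes that the desired local address is already known to the caller.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: opens an outbound TCP connection from a chosen local address. */
final class BoundClientSocket {

    /**
     * Connects to remote with the local end bound to localAddress.
     * Local port 0 lets the OS pick an ephemeral source port.
     */
    static Socket connectFrom(InetAddress localAddress, InetSocketAddress remote,
                              int timeoutMillis) throws IOException {
        Socket socket = new Socket();              // created unbound and unconnected
        try {
            socket.bind(new InetSocketAddress(localAddress, 0)); // fix the source IP
            socket.connect(remote, timeoutMillis);
            return socket;
        } catch (IOException e) {
            socket.close();                        // do not leak the descriptor on failure
            throw e;
        }
    }
}
```

Binding before connecting fixes the source IP address of the connection; which physical interface the packets actually leave from is still decided by the operating system routing table.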
A multihomed host is a computer that has multiple network
connections, which may or may not be to the same
network. The two elements involved in such a setup are the NICs (links) and the
IP addresses. The following combinations are possible:
1. Single link, multiple IP addresses (spaces): generally called IP aliasing.
2. Multiple interfaces, single IP address per interface: the host has multiple interfaces
and each interface has one or more IP addresses. If one of the
links fails, its IP address becomes unreachable, but the other
IP addresses still work. Connection fail-over is possible only
with protocols like SCTP, not with TCP.
3. Multiple links, single IP address (space), using bonding: when one of the links
fails, the protocol notices this on both sides
and traffic is no longer sent over the failing link.
4. Multiple links, multiple IP addresses (spaces): all links can be used
at the same time to increase the total available
bandwidth; link saturation and failures are detected in real time so that
traffic can be redirected, and algorithms allow traffic management.
One such configuration, a combination of 1 and 3, is shown
below and is used as the reference in the rest of this
document.
Figure 14.1.1 Reference Network Diagram : Figure 2
Figure 2 describes the deployment architecture that would be used for
validating the multihome support. The diagram indicates the setup
that would be used for testing/validating multihome support in SailFin.
It is by no means exhaustive, and numerous other configurations are possible
within the scope of this document, but given limited time
and resources only the above setup is considered.
Legend
eth0, eth1: Available (physical) interfaces in the system.
IPv4_1: IPv4 address configured in the private address space and used for
cluster-internal traffic. The assumption in the diagram is that this IP
(network) is from the private backplane network and is not reachable
through the VIP/HardwareLB.
IPv4_2: IPv4 address configured in the public address space and reachable through the VIP/external network.
IPv6_1: IPv6 address configured in the public address space and reachable through the VIP/external network.
sip-listener-1: An IPv4 SIP listener that can receive requests from SIP clients
through the VIP/hardware load balancer. Note that this can also be an IPv6
listener in a production deployment.
http-listener-1: An IPv6-based HTTP listener that can receive HTTP requests from
external clients through the VIP/hardware load balancer. Note that this can
also be an IPv4 listener in a production deployment.
sip-listener-2: An IPv4 SIP listener that is used by the CLB for proxying requests.
Note that this can also be an IPv6 listener in a production deployment,
but for the scope of this release it will be a v4 address.
http-listener-2: An IPv4 HTTP listener that is used by the CLB for proxying requests. Note
that this can also be an IPv6 listener in a production deployment, but
for the scope of this release it will be a v4 address.
Payload_1 to Payload_10: A 10-instance cluster. One instance per machine would be used for
testing, but an ideal deployment could have more instances per machine.
Switch0 and 1: IP switches (or VLANs) that are used for network segregation.
With reference to the above figure, the requirements are as follows.
Inbound traffic: SailFin must be able to bind to one (or more) of the 5 virtual (aliased) IPs
available in bond0 (similarly for bond1). Each of the virtual IP
addresses is connected to a broadcast domain that is implemented
through VLAN configuration in the switch.
For example, sip-listener-1 may
be bound to bond1:1 and sip-listener-2 to bond0:2, where each
listener may be configured as "internal" or "external".
Outbound traffic: Subsequent requests for traffic coming in on bond0:1 can be sent
through another interface such as bond0:3 (if and only if it is an external listener).
In this case the outbound IP:port information would be available
through the TargetTuple interface. There should also be a provision to
configure a default outbound interface, which would be used by
the network manager (NM) in case the TargetTuple lacks the outbound information.
Connection caching:
Today connections are cached with only the destination host,
port and protocol combination as the key. The key has to be enhanced to
include the source IP and port, because the same destination
may have to be reached through
different local IP addresses.
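The enhanced cache key can be sketched as follows. This is an illustration under stated assumptions rather than the network manager's actual classes: ConnectionKey and ConnectionCache are hypothetical names, addresses are kept as plain strings, and the type parameter C stands in for whatever connection object is really cached.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache key: protocol plus local (source) and remote (destination) endpoints. */
final class ConnectionKey {
    private final String protocol;
    private final String localIp;
    private final int localPort;
    private final String remoteHost;
    private final int remotePort;

    ConnectionKey(String protocol, String localIp, int localPort,
                  String remoteHost, int remotePort) {
        this.protocol = Objects.requireNonNull(protocol);
        this.localIp = Objects.requireNonNull(localIp);
        this.localPort = localPort;
        this.remoteHost = Objects.requireNonNull(remoteHost);
        this.remotePort = remotePort;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ConnectionKey)) return false;
        ConnectionKey k = (ConnectionKey) o;
        return localPort == k.localPort && remotePort == k.remotePort
                && protocol.equals(k.protocol)
                && localIp.equals(k.localIp)
                && remoteHost.equals(k.remoteHost);
    }

    @Override
    public int hashCode() {
        return Objects.hash(protocol, localIp, localPort, remoteHost, remotePort);
    }
}

/** Minimal cache keyed by the extended tuple; C is a placeholder connection type. */
final class ConnectionCache<C> {
    private final Map<ConnectionKey, C> connections = new ConcurrentHashMap<>();

    C lookup(ConnectionKey key) { return connections.get(key); }

    void store(ConnectionKey key, C connection) { connections.put(key, connection); }

    int size() { return connections.size(); }
}
```

With this key, two connections to the same destination that leave from different local addresses occupy separate cache entries instead of overwriting each other.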
Traffic separation: CLB changes
Traffic separation is a use case that arises out of the multihome
support in the network manager. External network traffic is serviced at
the public interfaces of the cluster (the Virtual IP). Thereafter this
traffic is load balanced onto any of the server replicas over their
internal network interfaces. Any request-response messaging would occur
over the internal network interfaces that connect the server replicas.
From a CLB perspective, this segregation can be achieved in either of the following ways:
(a) Two-tier deployment
Here
the fronting tier is the pure load balancing cluster, which is
configured to face the external / public network traffic. The back-end
cluster is the pure application tier, with its server replicas
connected over a dedicated internal network path.
The front-end
cluster configuration would need to include certain listeners
that connect over the internal network path. The CLB would use these
listeners when connecting to the back-end cluster while load balancing
the traffic. This is not within the scope of this document and is only
mentioned as an existing alternative for the traffic separation use case.
(b) Self-load-balancing deployment
Here
each server replica would be connected to an external/public-facing
network and, over another NIC, to an internal network path, such that
the CLB would use the listeners on the internal network for load balancing
the traffic.
CLB interfaces that capture the external and internal endpoints:
connid
Pushed as a parameter onto the topmost Via while load
balancing the request to another replica. Records the
connection information between the client / user agent and
the front-end. It would be used for dispatching the request
back to the client.
CLBConstants.CONN_ID
Message attribute which records the connection
information between the front-end and the back-end. Used by the CLB
back-end to dispatch the response back to the originating
replica.
Listener categorization and tagging:
To achieve traffic separation, a mechanism needs to exist that helps
identify whether a listening endpoint is an internal endpoint or an
external endpoint. This information is also crucial to the generation
of the CLB XML configuration file. The HTTP service and the SIP service would
allow tagging the listeners as either internal or external (or both, if
none is specified; this is to maintain backward
compatibility). The set of internal listeners would be used by
the Converged Load Balancer for load balancing of the incoming
traffic.
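The tagging rule can be sketched as a small filter. This is a hypothetical illustration: the class name ListenerTagging, the attribute values "internal" and "external", and the representation of a listener are assumptions, since the concrete attribute name and values are defined by the admin changes, not here. An untagged listener is treated as both internal and external, as described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of listener categorization; not the actual admin or CLB code. */
final class ListenerTagging {

    enum Type { INTERNAL, EXTERNAL, BOTH }

    static final class Listener {
        final String id;
        final Type type;

        /** typeAttribute is the raw tag value; null or blank means "not specified". */
        Listener(String id, String typeAttribute) {
            this.id = id;
            this.type = parse(typeAttribute);
        }
    }

    /** An unspecified tag maps to BOTH, which keeps untagged configurations working. */
    static Type parse(String typeAttribute) {
        if (typeAttribute == null || typeAttribute.trim().isEmpty()) {
            return Type.BOTH;
        }
        switch (typeAttribute.trim().toLowerCase(Locale.ROOT)) {
            case "internal": return Type.INTERNAL;
            case "external": return Type.EXTERNAL;
            default:
                throw new IllegalArgumentException("unknown listener type: " + typeAttribute);
        }
    }

    /** Ids of the listeners the CLB may use: everything not tagged purely external. */
    static List<String> internalListenerIds(List<Listener> listeners) {
        List<String> ids = new ArrayList<>();
        for (Listener listener : listeners) {
            if (listener.type != Type.EXTERNAL) {
                ids.add(listener.id);
            }
        }
        return ids;
    }
}
```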
Via: For internal SIP requests, the address that is used to
send the message will be used as the Via between the FE and the BE. For outgoing requests, the outbound interface
that is configured in the message should be used, or otherwise the default external address (VIP/HLB).
Proxying IPv6 requests: A request (HTTP/SIP) received on IPv6 can be proxied over IPv4 if there are no IPv6 internal listeners.
CLB back-end optimization:
Any optimizations that can be performed in the back-end (including
skipping of layers) would be done only after their benefit has been
established through a prototype.
SIP external address configuration:
Today it is possible to configure one and only one address in the
sip-container->external-address attribute, which is used by the
container to update the SIP headers so that a UA is able to correctly
contact the application server through the front-end VIP/HWLB. If
more than one external SIP listener is configured, and
multiple VIPs/HWLBs spray traffic to their respective
listeners, then one external address per SIP listener is needed
rather than one per container. The
external SIP address configuration therefore has to be per-listener rather than
per-container. If none is specified in the listener, then the
sip-container value will be used; this way backward
compatibility is also preserved. The external address is valid only for
external SIP listeners which receive traffic from UAs, and the list of
such external addresses will be exposed to the application through the
JSR 289 mechanism.
<sip-listener name="listener_1" external-sip-address="xx.xx.xx.xx" external-sip-port="YYYY"....../>
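The per-listener fallback can be sketched as below. This is an assumption-laden illustration, not container code: the class ExternalAddresses and its members are hypothetical, addresses are returned as plain "host:port" strings rather than the SipURI objects an application would see, and a container-level port is assumed to exist alongside the container-level external address.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of resolving the effective external address per SIP listener. */
final class ExternalAddresses {

    static final class SipListener {
        final String name;
        final boolean external;            // listener receives traffic from UAs
        final String externalSipAddress;   // null when not set on the listener
        final Integer externalSipPort;     // null when not set on the listener

        SipListener(String name, boolean external,
                    String externalSipAddress, Integer externalSipPort) {
            this.name = name;
            this.external = external;
            this.externalSipAddress = externalSipAddress;
            this.externalSipPort = externalSipPort;
        }
    }

    /** Listener-level values win; the container-level values are the fallback. */
    static String effective(SipListener listener, String containerAddress, int containerPort) {
        String host = listener.externalSipAddress != null
                ? listener.externalSipAddress : containerAddress;
        int port = listener.externalSipPort != null
                ? listener.externalSipPort : containerPort;
        return host + ":" + port;
    }

    /** Distinct external addresses of all external listeners, in configuration order. */
    static List<String> exposedToApplications(List<SipListener> listeners,
                                              String containerAddress, int containerPort) {
        Set<String> result = new LinkedHashSet<>();
        for (SipListener listener : listeners) {
            if (listener.external) {
                result.add(effective(listener, containerAddress, containerPort));
            }
        }
        return new ArrayList<>(result);
    }
}
```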
Dynamic configuration impact: A listener cannot be changed from one type to another at runtime; the change will take effect only after the next restart.
Listeners can be added at runtime, and subsequent invocations of the exposed APIs will return the new listeners.
The external-sip-address attributes are not dynamically configurable.
IPv6 support and dual-stack support: Excellent documentation of the impact of a dual-stack node on a Java application is available here,
http://java.sun.com/javase/6/docs/technotes/guides/net/ipv6_guide/index.html#dual
(As per http://java.sun.com/javase/6/docs/technotes/guides/net/ipv6_guide/index.html)
The Java networking stack will first check whether IPv6 is supported on
the underlying OS. If IPv6 is supported, it will try to use the IPv6
stack. More specifically, on dual-stack systems it will create an IPv6
socket. On separate-stack systems things are much more complicated.
Java will create two sockets, one for IPv4 and one for IPv6
communication. For client-side TCP applications, once the socket is
connected, the internet-protocol family type will be fixed, and the
extra socket can be closed. For server-side TCP applications, since
there is no way to tell from which IP family type the next client
request will come, two sockets need to be
maintained. For UDP applications, both sockets will be needed for the lifetime of the communication.
The changes in SailFin are:
1. Bind to :: (IPv6 anylocal) rather than the 0.0.0.0 (IPv4 anylocal) address.
2. Remove direct references to IPv4 literal addresses.
3. Parsing of IPv6 addresses in URIs has to be verified, because such addresses contain ":".
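The parsing concern in item 3 can be illustrated with a small host:port splitter. This is a sketch only, not SailFin's URI parser: the class HostPort is hypothetical, and it assumes the usual convention (as in the SIP and HTTP URI grammars) that an IPv6 literal is enclosed in square brackets, so that its colons cannot be confused with the port separator.

```java
/** Hypothetical host[:port] splitter that understands bracketed IPv6 literals. */
final class HostPort {
    final String host;
    final int port;   // -1 when no port is present

    private HostPort(String host, int port) {
        this.host = host;
        this.port = port;
    }

    static HostPort parse(String text) {
        if (text.startsWith("[")) {
            // IPv6 reference: everything up to the closing bracket is the address.
            int end = text.indexOf(']');
            if (end < 0) {
                throw new IllegalArgumentException("missing ']' in " + text);
            }
            String host = text.substring(1, end);
            String rest = text.substring(end + 1);
            if (rest.isEmpty()) {
                return new HostPort(host, -1);
            }
            if (!rest.startsWith(":")) {
                throw new IllegalArgumentException("unexpected text after ']' in " + text);
            }
            return new HostPort(host, Integer.parseInt(rest.substring(1)));
        }
        int colon = text.lastIndexOf(':');
        if (colon < 0) {
            return new HostPort(text, -1);          // hostname or IPv4 literal, no port
        }
        if (text.indexOf(':') != colon) {
            // More than one colon without brackets: ambiguous, so reject it.
            throw new IllegalArgumentException("IPv6 literal must be enclosed in [] : " + text);
        }
        return new HostPort(text.substring(0, colon),
                Integer.parseInt(text.substring(colon + 1)));
    }
}
```

A naive split on the first or last ":" would misread an address such as [2001:db8::1]:5060, which is the case the verification has to cover.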
Configurable properties:
http://java.sun.com/javase/6/docs/technotes/guides/net/ipv6_guide/index.html#ipv6-networking
JSR 289 requirements: Section 14.2:
The
ability to send and receive messages from a particular interface will
be accomplished through the changes that have been described above. The
IP address of the interface through which the message was received is
already exposed to the application today (getLocal...()).
JSR 289 Details
----------------------
The servlet context attribute javax.servlet.sip.outboundinterfaces contains the possible outbound interfaces that can be used
by the application. This will be supported. The list will contain all the external addresses configured for all the listeners
in the server.
The setOutboundInterface method as defined by JSR 289 will be supported in Proxy, ProxyBranch and Session. Applications
can use those methods as defined by the specification. The composition rules specified in section 14.2.1 will be followed.
See the JSR 289 specification and javadoc for more details.
If an application does not set the outbound interface, the default
interface will be the external address configured for the SIP
listener on which the original message was received.
There will not be any specific support for session replication. The outbound interface setting in the session will not be persisted.
An application could use the sessionDidActivate callback to handle the loss of the outbound interface setting during serialization.
There is no impact on the Application Router or its usage.
Enabling strict multihoming in the OS:
Binding the “source” socket to the specific interface that the message should leave from appears to work, and an Ethereal/Wireshark capture will usually confirm that the source IP address is the one from the correct interface. In reality, however,
this is not sufficient, because the OS routing mechanism takes
over at this point (making the link between OSI layers 3 and 2) and will
often apply the “default” routing rule, thereby choosing the interface itself. These rules use only the “TO” part of the IP address to determine the NIC to be used and ignore the “FROM” part.
A strict multihome configuration is required to overcome this problem.
Whether it is needed depends on the platform on which the setup is
deployed.
Covers only Layer 3 and Layer 4 (OSI) changes. For layer 7 changes,
please refer to the FS for JSR 289 related changes [2]. Other protocol
listeners in SailFin are not covered in this document.
TargetTuple has to be enhanced to include the local IP and port through which the request has to be sent.
The new attributes have to be
documented, and the deployment guide should highlight a multihome
system deployment and how the listeners have to be configured to
achieve it. The admin guide should describe the steps to configure a
listener type, as well as the existence (creation?) of the default
outbound
interface.
Admin back-end while rendering the
converged load balancer configuration,
converged-load-balancer.xml, would need to collate only those
server listeners which are marked “external=false”.
Admin changes for accomplishing the dtd changes.
Admin
configuration (domain.xml) should support resolution of IP addresses
(through tokens) in the http-listener and sip-listener configurations.
Port
conflicts (validation of configured ports in listeners) have to be
checked based on the listener IP. For example, it should be possible to
configure 2 listeners on the same port but with different IP addresses.
Please see [2] for further information.
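The IP-aware port validation can be sketched as follows. This is a simplified illustration, not the admin validation code: the class PortConflictCheck is hypothetical, addresses are compared textually (so equivalent IPv6 spellings are not normalized), the transport is ignored, and a listener bound to a wildcard address (0.0.0.0 or ::) is assumed to conflict with every listener on the same port.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of port-conflict validation keyed on listener IP and port. */
final class PortConflictCheck {

    static final class Binding {
        final String listenerId;
        final String address;
        final int port;

        Binding(String listenerId, String address, int port) {
            this.listenerId = listenerId;
            this.address = address;
            this.port = port;
        }
    }

    private static boolean isWildcard(String address) {
        return "0.0.0.0".equals(address) || "::".equals(address);
    }

    /** Same port conflicts only if the addresses match or either one is a wildcard. */
    static boolean conflicts(Binding a, Binding b) {
        if (a.port != b.port) {
            return false;
        }
        return isWildcard(a.address) || isWildcard(b.address)
                || a.address.equalsIgnoreCase(b.address);
    }

    /** Returns one "idA <-> idB" entry per conflicting pair, in configuration order. */
    static List<String> findConflicts(List<Binding> bindings) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < bindings.size(); i++) {
            for (int j = i + 1; j < bindings.size(); j++) {
                if (conflicts(bindings.get(i), bindings.get(j))) {
                    result.add(bindings.get(i).listenerId + " <-> " + bindings.get(j).listenerId);
                }
            }
        }
        return result;
    }
}
```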
Changes to GUI to capture the listener category as external or internal.
Installation will only provide the
default SIP listeners, which listen on all the IP addresses of the machine.
Specific SIP listeners that bind to a particular host:port have to be
added later. The default listener type will be external.
Network partitioning support in SailFin 2.0 is equivalent to a SailFin
1.5 deployment with all existing listeners being assumed internal.
As such, there is no support in SailFin 1.5 to segregate the listeners
into external and internal.
List any requirements on
upgrade tool and migration tool.
JSR 289 multihome changes [2].
Yes, changes would be required in the http-listener configuration to specify the type of the listener; please see [1].
List any third party API
used ( if any).