Multihome support in Network Manager and Clb - One Pager


1. Introduction

1.1. Project/Component Working Name:

Multihome support in Network Manager

1.2. Name(s) and e-mail address of Document Author(s)/Supplier:


Version  Date           Author                Comments
0.1      09-Feb-2009    Ramesh Parthasarathy  First draft
0.2      02-March-2009  Ramesh Parthasarathy  Added reference network diagram, clb changes, ipv6, dual stack support
0.3      16 March 2009  Ramesh Parthasarathy  Moved Reference diagram under separate section and added legend (feedback from Joe).
                                              Added a requirement to the Admin section.
                                              Updated CLB section with comments from Kshitz.
0.4      17 March 2009  Ramesh Parthasarathy  Update based on comments received in Engineering meeting held on 16 March:
                                              JSR289, Admin doc link, CLB optimization, Strict multihoming

1.3. Date of This Document:

02-March-2009

2. Project Summary

2.1. Project Description:

Multihome support in Project SailFin includes the following capabilities. SailFin should be able to:

  1. Bind to a particular IP address (IPv4 or IPv6) and receive SIP messages on that ip:port in a system that has multiple IP addresses configured. The responses for SIP requests received on a particular interface will also go out through that interface (even if the connection is broken and a new connection is established).
  2. Bind to a particular IP address (IPv4 or IPv6) and receive HTTP messages on that ip:port in a system that has multiple IP addresses configured. (GlassFish related)
  3. Send outbound HTTP/SIP messages through a configured interface (ip).
  4. Support separation of external and internal traffic: converged load balancer proxy traffic should be restricted to an interface (ip). Listeners can be configured as internal/external.
  5. Populate the SIP messages with the correct address depending on the interface that is chosen to receive/send the message. [2]
  6. Allow for configuring more than one external and/or more than one internal SIP listener. Configuring more than one internal listener means that the clb implicitly uses them for proxying in a round-robin fashion.
  7. Allow traffic that is received on one interface (IPv4 or IPv6) to have its subsequent outbound traffic (for the same request) sent via any other external interface.
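
As an illustration of item 6, the round-robin use of several internal listeners could be sketched as below. This is only a minimal sketch: the class name RoundRobinListeners and the representation of endpoints as "ip:port" strings are assumptions made for the example, not part of the clb code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: cycles through the configured internal listener endpoints.
final class RoundRobinListeners {
    private final List<String> internalEndpoints;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinListeners(List<String> internalEndpoints) {
        if (internalEndpoints.isEmpty()) {
            throw new IllegalArgumentException("at least one internal listener is required");
        }
        this.internalEndpoints = new ArrayList<>(internalEndpoints); // defensive copy
    }

    /** Returns the next endpoint; floorMod keeps the index valid after counter overflow. */
    String pick() {
        int index = Math.floorMod(next.getAndIncrement(), internalEndpoints.size());
        return internalEndpoints.get(index);
    }

    public static void main(String[] args) {
        List<String> endpoints = new ArrayList<>();
        endpoints.add("10.0.0.1:5060");
        endpoints.add("10.0.0.2:5060");
        RoundRobinListeners rr = new RoundRobinListeners(endpoints);
        for (int i = 0; i < 4; i++) {
            System.out.println(rr.pick());
        }
    }
}
```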

See below for a longer, more detailed technical description.

2.2. Risks and Assumptions:

This document does not cover the behavior of other SailFin components like JMS/JDBC/IIOP/SSR/CLB/GMS on a multihomed machine. Extensive testing is required on a multihomed setup similar to what has been described in the Description.

3. Problem Summary

3.1. Problem Area:

This document addresses the network-related concerns a user might have when configuring SailFin on a machine with multiple NICs. Since each interface (IP address) in a production environment can potentially be connected to a different network (switch/router/broadcast-domain), multihome support in SailFin is important for making the IP traffic model of the system deterministic. The document also discusses how separation of internal and external traffic can be achieved through configuration.

Current behavior: The listen addresses (incoming) can be configured for all the listeners exposed in the domain.xml configuration. In a multihomed system this ensures that a listener accepts packets only if the destination IP address is the one configured in the listener, and not through just any of the IP addresses available in the system. This behavior is necessary but not sufficient for a multihome deployment.
When SailFin creates a connection, the IP address that the local socket uses to connect to the destination is non-deterministic and depends on the platform and the operating system network configuration. We do not exercise any explicit control (bind) during socket creation (at least on the SIP and the http clb proxy side), and so we end up at the mercy of whatever the JDK decides.
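
The explicit control that is missing today amounts to binding the local socket before connecting. The following is a minimal sketch using plain java.net sockets; the class and method names (BoundConnect, connectFrom) are illustrative assumptions and do not refer to existing SailFin code.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper showing an outbound TCP connection with a fixed source IP.
final class BoundConnect {
    /** Connects to remote using localIp as the source address (port 0 = ephemeral port). */
    static Socket connectFrom(InetAddress localIp, InetSocketAddress remote, int timeoutMs)
            throws IOException {
        Socket socket = new Socket(); // unbound and unconnected
        try {
            socket.bind(new InetSocketAddress(localIp, 0)); // pick the source IP explicitly
            socket.connect(remote, timeoutMs);
            return socket;
        } catch (IOException e) {
            socket.close(); // do not leak the descriptor on failure
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = connectFrom(loopback,
                     new InetSocketAddress(loopback, server.getLocalPort()), 2000);
             Socket accepted = server.accept()) {
            System.out.println("peer sees source address " + accepted.getInetAddress());
        }
    }
}
```

Without the bind call, the source address is left to the platform, which is the non-deterministic behavior described above.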

3.2. Justification:

Enables SailFin to be used in an environment that has multiple NICs (IPs)

4. Technical Description:

4.1. Details:

A multi-homed host is a computer that has multiple network connections, which may or may not be to the same network. The two elements involved in such a setup are the NICs and the IP addresses. The possible combinations are listed below.

  1. Single Link, Multiple IP address (Spaces) -
    Generally called IP aliasing.
  2. Multiple Interfaces, Single IP address per interface
    The host has multiple interfaces and each interface has one, or more, IP addresses. If one of the links fails, then its IP address becomes unreachable, but the other IP addresses will still work. Connection fail-over is possible only using protocols like SCTP and not with TCP.
  3. Multiple Links, Single IP address (Space) - Using Bonding
    When one of the links fails, the protocol notices this on both sides and traffic is no longer sent over the failing link.
  4. Multiple Links, Multiple IP address (Spaces)
    This allows use of all links at the same time to increase the total available bandwidth, and detects link saturation and failures in real time to redirect traffic. Algorithms allow traffic management.
One such configuration, which is a combination of 1 and 3, is shown below. It will be used as a reference in the rest of the document.

[Figure 1: topology diagram]


4.1.1 Reference Network Diagram :



[Figure 2: reference network topology diagram]
 
Figure 2 describes the deployment architecture that would be used for testing and validating the multihome support in SailFin. It is by no means exhaustive, and numerous other configurations are possible within the scope of this document. Given the constraints on time and resources, only the above setup will be considered.

Legend

eth0, eth1 : Available interfaces (physical) in the system.

IPv4_1: IPv4 address configured in the private address space and used for cluster internal traffic. The assumption in the diagram is that this IP (network) is from the private backplane network and is not reachable through the VIP/HardwareLB.

IPv4_2: IPv4 address configured in the public address space and reachable through VIP/external network.

IPv6_1: IPv6 address configured in the public address space and reachable through VIP/external network.

sip-listener-1: An IPv4 sip listener that can receive requests from SIP clients through VIP/hardware-load-balancer. Note that this can also be an IPv6 listener in a production deployment.

http-listener-1: An IPv6 based http listener that can receive HTTP requests from external clients through VIP/hardware-load-balancer. Note that this can also be an IPv4 listener in a production deployment.

sip-listener-2: An IPv4 sip listener that is used by the clb for proxying requests. Note that this can also be an IPv6 listener in a production deployment, but for the scope of this release it will be a v4 address.

http-listener-2: An IPv4 http listener that is used by the clb for proxying requests. Note that this can also be an IPv6 listener in a production deployment, but for the scope of this release it will be a v4 address.

Payload_1 to Payload_10: 10 instance cluster. One instance per machine would be used for testing, but an ideal deployment could have more instances per machine.

Switch0 and 1: IP switches (or VLANs) that are used for network segregation.

With reference to the above figure, the requirement is to allow the following.
Inbound traffic: SailFin binds to one (or more) of the 5 virtual (aliased) IPs available in bond0 (similarly for bond1). Each of the virtual IP addresses is connected to a broadcast domain that is implemented through VLAN configuration in the switch.
For example, sip-listener-1 may be bound to bond1:1, sip-listener-2 to bond0:2, and so on, where the listeners may be configured as "internal" or "external".

Outbound traffic: Subsequent requests for traffic coming in on bond0:1 can be sent through another interface such as bond0:3 (if and only if it is an external listener). In this case the outbound IP:port information would be available through the TargetTuple interface. There should also be a provision to configure a default outbound interface, which the NM would use when the TargetTuple lacks the outbound information.
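
The outbound selection rule above could look roughly like the following sketch. It assumes the requested outbound address has already been read from the TargetTuple (null when absent); the class name OutboundSelector and the method signature are hypothetical and not the actual NM interface.

```java
import java.net.InetSocketAddress;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: picks the local ip:port to send from.
final class OutboundSelector {
    /**
     * requested        outbound ip:port taken from the target tuple, or null if it carries none
     * externalListeners ip:port of every listener usable for external traffic
     * defaultOutbound  configured default outbound interface
     */
    static InetSocketAddress resolve(InetSocketAddress requested,
                                     Set<InetSocketAddress> externalListeners,
                                     InetSocketAddress defaultOutbound) {
        if (requested == null) {
            return defaultOutbound; // tuple lacks outbound information
        }
        if (!externalListeners.contains(requested)) {
            // only external listeners may be used for subsequent outbound requests
            throw new IllegalArgumentException(requested + " is not an external listener");
        }
        return requested;
    }

    public static void main(String[] args) {
        // IP literals are used so that no DNS lookup takes place
        InetSocketAddress a = new InetSocketAddress("192.0.2.10", 5060);
        InetSocketAddress b = new InetSocketAddress("192.0.2.11", 5060);
        Set<InetSocketAddress> external = new HashSet<>(Arrays.asList(a, b));
        System.out.println(resolve(null, external, a)); // falls back to the default
        System.out.println(resolve(b, external, a));    // honours the requested interface
    }
}
```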

Connection caching: Today the connections are cached with only the destination host, port and protocol combination as the key. This has to be enhanced to include the source ip, port in the key. This is because there could be instances where the same destination has to be reached through different local ip addresses.
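
A cache key extended in this way might be sketched as follows. The class ConnectionKey and its field names are assumptions for illustration; the existing cache key class in SailFin is not shown in this document.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical cache key: destination (host, port, protocol) plus the local source ip and port.
final class ConnectionKey {
    private final String protocol;   // e.g. "tcp", "udp", "tls"
    private final String localIp;
    private final int localPort;
    private final String remoteHost;
    private final int remotePort;

    ConnectionKey(String protocol, String localIp, int localPort,
                  String remoteHost, int remotePort) {
        this.protocol = Objects.requireNonNull(protocol);
        this.localIp = Objects.requireNonNull(localIp);
        this.localPort = localPort;
        this.remoteHost = Objects.requireNonNull(remoteHost);
        this.remotePort = remotePort;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ConnectionKey)) return false;
        ConnectionKey other = (ConnectionKey) o;
        return localPort == other.localPort
                && remotePort == other.remotePort
                && protocol.equals(other.protocol)
                && localIp.equals(other.localIp)
                && remoteHost.equals(other.remoteHost);
    }

    @Override
    public int hashCode() {
        return Objects.hash(protocol, localIp, localPort, remoteHost, remotePort);
    }

    public static void main(String[] args) {
        Map<ConnectionKey, String> cache = new HashMap<>();
        // Same destination reached through two different local addresses: two cache entries.
        cache.put(new ConnectionKey("tcp", "192.0.2.10", 5060, "198.51.100.7", 5060), "conn-1");
        cache.put(new ConnectionKey("tcp", "192.0.2.11", 5060, "198.51.100.7", 5060), "conn-2");
        System.out.println(cache.size());
    }
}
```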

Traffic separation (Clb changes): Traffic separation is a use case that arises out of the multihome support in the network manager. External network traffic is serviced at the public interface of the cluster, the Virtual IP. This traffic is then load balanced onto any of the server replicas over their internal network interfaces. Any request-response messaging occurs over the internal network interfaces that connect the server replicas.

From a Clb perspective, this segregation can be achieved in either of the following ways:
(a) Two-tier deployment
Here the fronting tier is a pure load-balancing cluster that is configured to face the external/public network traffic. The back-end cluster is a pure application tier, with its server replicas connected over a dedicated internal network path.
The front-end cluster configuration would need to include certain listeners that connect over the internal network path. CLB would use these listeners when connecting to the back-end cluster while load balancing the traffic. This is not within the scope of this document and is mentioned only as an existing alternative for the traffic separation use case.

(b) Self-Load Balancing deployment
Here each server replica is connected to an external/public-facing network and, over another NIC, to an internal network path, so that CLB uses the listeners on the internal network for load balancing the traffic.

 CLB Interfaces that capture the external and internal endpoints :

 connid
       Pushed as a parameter onto the topmost VIA while load
       balancing the request to another replica. Records the
       connection information between client / user agent and
       front-end. It would be used for dispatching the request
       back to the client.

CLBConstants.CONN_ID
      Message attribute which records the connection 
      information between front-end and back-end. Used by CLB
      back-end to dispatch the response back to originating
      replica.

Listener categorization and tagging: To achieve traffic separation, a mechanism is needed to identify whether a listening endpoint is internal or external. This information is also crucial for generating the clb xml configuration file. The HTTP service and SIP service would allow tagging listeners as either internal or external (or both if none is specified, to maintain backward compatibility). The set of internal listeners would be used by the Converged Load Balancer for load balancing the incoming traffic.
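
The tagging rule, including the backward-compatible case where no tag is specified, can be sketched as an enum. The names ListenerType, fromAttribute, usableByClb and usableByUserAgents are illustrative assumptions; only the values internal, external and default come from this document.

```java
import java.util.Locale;

// Hypothetical model of the listener "type" attribute.
enum ListenerType {
    INTERNAL, EXTERNAL, DEFAULT;

    /** Maps the attribute value to a type; a missing or empty value means DEFAULT. */
    static ListenerType fromAttribute(String value) {
        if (value == null || value.trim().isEmpty()) {
            return DEFAULT; // untagged listeners keep the pre-existing behaviour
        }
        // valueOf throws IllegalArgumentException for anything but the three known values
        return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }

    /** Internal and default listeners may be used by the clb for proxying. */
    boolean usableByClb() {
        return this != EXTERNAL;
    }

    /** External and default listeners may receive traffic from UAs. */
    boolean usableByUserAgents() {
        return this != INTERNAL;
    }
}
```

With this model, the clb configuration would be generated from the listeners for which usableByClb() is true.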

Via : For internal SIP requests the address that is being used to send the message will be used as the Via between FE and BE. For outgoing ones we should use the outbound interface that is configured in the message or the default external address (VIP/HLB).

Proxying IPv6 requests: A request (http/sip) received on IPv6 can be proxied over IPv4 (if there are no IPv6 internal listeners).
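
One way to express this fallback is to prefer an internal listener of the same address family as the incoming request and otherwise use whatever internal listener exists. The sketch below is an assumption about how that choice could be made; in particular, the symmetric fallback for IPv4 requests is not stated in this document, and the class name FamilyAwarePicker is hypothetical.

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: chooses an internal listener address for proxying.
final class FamilyAwarePicker {
    /** True if the IP literal is an IPv6 address. Literals do not trigger a DNS lookup. */
    static boolean isIpv6(String ipLiteral) {
        try {
            return InetAddress.getByName(ipLiteral) instanceof Inet6Address;
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP literal: " + ipLiteral, e);
        }
    }

    /** Prefers a listener of the request's family, else falls back to the first listener. */
    static String pick(boolean requestIsIpv6, List<String> internalListenerIps) {
        if (internalListenerIps.isEmpty()) {
            throw new IllegalArgumentException("no internal listeners configured");
        }
        for (String ip : internalListenerIps) {
            if (isIpv6(ip) == requestIsIpv6) {
                return ip;
            }
        }
        return internalListenerIps.get(0); // e.g. IPv6 request proxied over IPv4
    }

    public static void main(String[] args) {
        System.out.println(pick(true, Arrays.asList("10.0.0.5")));
        System.out.println(pick(true, Arrays.asList("10.0.0.5", "2001:db8::5")));
    }
}
```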

Clb backend optimization: Any optimizations that can be performed in the backend (including skipping of layers) would be done only after the results have been established through a prototype.

sip external address configuration: Today exactly one address can be configured, in the sip-container->external-address attribute. The container uses it to update the SIP headers so that a UA can correctly contact the application server through the front-end VIP/HWLB. If more than one external sip listener is configured, and multiple VIPs/HWLBs spray traffic to their respective listeners, there is probably a need for one external address per sip listener rather than one per container. The external sip address configuration therefore has to be per-listener rather than per-container. If none is specified in the listener, the sip-container value is used, which also preserves backward compatibility. The external address is valid only for external sip listeners that receive traffic from UAs, and the list of such external addresses will be exposed to the application through the JSR 289 mechanism.

<sip-listener name="listener_1" external-sip-address="xx.xx.xx.xx" external-sip-port="YYYY"....../>
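
The fallback from the per-listener value to the sip-container value could be sketched as below. The class and method names are hypothetical; the parameters stand for the external-sip-address attribute of the listener and the external-address attribute of the sip-container.

```java
// Hypothetical helper: resolves the external address advertised for one sip listener.
final class ExternalAddressResolver {
    /**
     * Returns the listener's own external address when it is set,
     * otherwise the container-wide value (which may itself be null).
     */
    static String effectiveExternalAddress(String listenerExternalAddress,
                                           String containerExternalAddress) {
        if (listenerExternalAddress != null && !listenerExternalAddress.trim().isEmpty()) {
            return listenerExternalAddress.trim();
        }
        return containerExternalAddress; // backward-compatible behaviour
    }

    public static void main(String[] args) {
        System.out.println(effectiveExternalAddress("192.0.2.20", "192.0.2.1"));
        System.out.println(effectiveExternalAddress(null, "192.0.2.1"));
    }
}
```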


Dynamic configuration impact:
A listener cannot be changed from one type to another at runtime. The change takes effect only after the next restart.
Addition of listeners is possible and the subsequent invocation of the exposed APIs will return the new listeners.
external-sip-address attributes are not dynamically configurable.

IPv6 support and Dual Stack support :
Excellent documentation of the impact of a dual-stack node on a Java application is available here:
http://java.sun.com/javase/6/docs/technotes/guides/net/ipv6_guide/index.html#dual

(As per http://java.sun.com/javase/6/docs/technotes/guides/net/ipv6_guide/index.html)

The Java networking stack will first check whether IPv6 is supported on the underlying OS. If IPv6 is supported, it will try to use the IPv6 stack. More specifically, on dual-stack systems it will create an IPv6 socket. On separate-stack systems things are much more complicated. Java will create two sockets, one for IPv4 and one for IPv6 communication. For client-side TCP applications, once the socket is connected, the internet-protocol family type will be fixed, and the extra socket can be closed. For server-side TCP applications, since there is no way to tell from which IP family type the next client request will come, two sockets need to be maintained. For UDP applications, both sockets will be needed for the lifetime of the communication.

Changes in SailFin are
1. Bind to :: (IPv6 anylocal) rather than the 0.0.0.0 (IPv4 anylocal) address.
2. Remove direct references to IPv4 literal addresses.
3. Parsing of IPv6 addresses in URIs has to be verified, because such an address contains ":" characters.
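
The parsing concern in item 3 is that the ":" inside an IPv6 literal collides with the ":" that separates host and port. In URIs the literal is enclosed in square brackets, so a hostport parser has to handle the bracketed form first. The following is a minimal sketch of such a check; the HostPort class is an assumption for illustration and is not the SailFin URI parser.

```java
// Hypothetical parser for the hostport part of a URI, e.g. "[2001:db8::1]:5060".
final class HostPort {
    final String host;
    final int port; // -1 when no port is present

    private HostPort(String host, int port) {
        this.host = host;
        this.port = port;
    }

    static HostPort parse(String hostPort) {
        if (hostPort.startsWith("[")) { // bracketed IPv6 literal
            int end = hostPort.indexOf(']');
            if (end < 0) {
                throw new IllegalArgumentException("missing ']' in " + hostPort);
            }
            String host = hostPort.substring(1, end);
            String rest = hostPort.substring(end + 1);
            if (rest.isEmpty()) {
                return new HostPort(host, -1);
            }
            if (!rest.startsWith(":")) {
                throw new IllegalArgumentException("unexpected text after ']' in " + hostPort);
            }
            return new HostPort(host, Integer.parseInt(rest.substring(1)));
        }
        int colon = hostPort.lastIndexOf(':');
        if (colon < 0) {
            return new HostPort(hostPort, -1); // host name or IPv4 literal without port
        }
        if (hostPort.indexOf(':') != colon) {
            // more than one ':' without brackets is ambiguous
            throw new IllegalArgumentException("IPv6 literal must be bracketed: " + hostPort);
        }
        return new HostPort(hostPort.substring(0, colon),
                Integer.parseInt(hostPort.substring(colon + 1)));
    }

    public static void main(String[] args) {
        HostPort v6 = parse("[2001:db8::1]:5060");
        System.out.println(v6.host + " " + v6.port);
    }
}
```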

Configurable properties:
http://java.sun.com/javase/6/docs/technotes/guides/net/ipv6_guide/index.html#ipv6-networking

JSR 289 requirements : Section 14.2 :
The ability to send and receive messages from a particular interface will be accomplished through the changes that have been described above. The IP address of the interface through which the message was received is exposed to the application even today (getLocal...()).

JSR 289 Details
----------------------

The servlet context attribute javax.servlet.sip.outboundinterfaces contains the possible outbound interfaces that can be used by the application. This will be supported. The list will contain all the external addresses configured for all the listeners in the server.

The setOutboundInterface method as defined by JSR 289 will be supported in Proxy, ProxyBranch and Session. Applications can use those methods as defined by the specification. The composition rules specified in section 14.2.1 will be followed.
See the JSR 289 specification and javadoc for more details.

If an application does not set the outbound interface, the default interface will be the external address configured for the sip listener on which the original message was received.

There won't be any specific support for session replication. The outbound interface setting in the session will not be persisted.
An application could use the sessionDidActivate callback to handle the loss of the outbound interface setting during serialization.
There is no impact on the Application Router or its usage.

Enabling strict multihoming in the OS: Binding the source socket to the specific interface that the message should leave from seems to work, and an ethereal/wireshark capture will usually confirm it (the source IP address is the one from the correct interface). In reality, however, this does not work, because the OS routing mechanism takes over at this point (doing the link between OSI levels 3 and 2) and will often apply the default routing rule, thus choosing the interface itself. These rules use only the TO part of the IP address to determine the NIC to be used and ignore the FROM part. Strict multihome configuration is required to overcome this problem. Whether it is needed depends on the platform on which the setup is deployed.
 

4.2. Bug/RFE Number(s):

4.2.1 Bug/RFE Numbers from Issue Tracker

                    Issue 913
                    Issue 914
                    Issue 576

4.2.2 Requirement Ids that are being addressed as a part of this proposal.

105 65-0192/03385

105 65-0192/03442

4.3. In Scope:

4.4. Out of Scope:

Covers only Layer 3 and Layer 4 (OSI) changes. For layer 7 changes, please refer to the FS for JSR 289 related changes [2]. Other protocol listeners in SailFin are not covered in this document.

4.5. Interfaces:


4.5.1 Exported Interfaces

Will expose the following interfaces

1. A list of all external sip listeners (IP address and port number), can be used by the higher layers to query the outbound interfaces available in the system.
2. A list of all internal sip listeners (IP address and port), can be used by the load balancer to determine the internal endpoints that are exposed by the system.

Interface: sun-domain_1_4.dtd

(a) Element : http-listener
    Comments: A new attribute called "type" will be
              introduced for the "http-listener" element. The
              default value for SailFin 2.0 deployments
              would be "default".

                   server-name CDATA #REQUIRED
                   redirect-port CDATA #IMPLIED
                   xpowered-by %boolean; "true"
                   enabled %boolean; "true"
              +    type CDATA "default">


(b) Element : sip-listener
    Comments: A new attribute called "type" will be
              introduced for "sip-listener". The default
              value for SailFin 2.0 deployments would be
              "default".

                   address CDATA #REQUIRED
                   port CDATA #REQUIRED
                   transport (udp_tcp | tls) "udp_tcp"
                   enabled %boolean; "true"
              +    type CDATA "default"
              +    external-sip-address CDATA #IMPLIED
              +    external-sip-port CDATA #IMPLIED>

"type" can take values of "internal", "external" or "default". An "internal" listener denotes that it will be used strictly for the purpose of proxying by the clb. An "external" listener will be used only by UAs and not by the clb. A "default" type can be used by UAs as well as clb, and is the default type for a listener.

4.5.2 Imported interfaces

TargetTuple has to be enhanced to include the local IP and port through which the request has to be sent

4.5.3 Other interfaces (Optional)

Any private interfaces that may be of interest?

4.6. Doc Impact:

The new attributes have to be documented, and the deployment guide should highlight a multihome system deployment, and how the listeners have to be configured to achieve the same. Admin guide should describe the steps to configure a listener type, as well as the existence (creation ?) of default outbound interface.

4.7. Admin/Config Impact:

The admin back-end, while rendering the converged load balancer configuration (converged-load-balancer.xml), would need to collate only those server listeners that are marked "external=false".
Admin changes for accomplishing the dtd changes.

Admin configuration (domain.xml) should support resolution of IP addresses (through tokens) in the http-listener and sip-listener configurations.

Port conflicts (validation of configured ports in listeners) have to be checked based on the listener IP. For example, it should be possible to configure 2 listeners on the same port but on different IP addresses.
Please see [2] for further information.
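
A port conflict check keyed on the listener IP could be sketched as follows. Treating a wildcard address (0.0.0.0 or ::) as conflicting with every specific address on the same port is an assumption of this sketch, as is ignoring the transport; the class name ListenerConflict is hypothetical and this is not the admin validation code.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical validation helper for listener address/port pairs.
final class ListenerConflict {
    private static InetAddress parse(String ipLiteral) {
        try {
            return InetAddress.getByName(ipLiteral); // no DNS lookup for IP literals
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP literal: " + ipLiteral, e);
        }
    }

    /** Two listeners conflict when the ports match and the addresses overlap. */
    static boolean conflicts(String ipA, int portA, String ipB, int portB) {
        if (portA != portB) {
            return false;
        }
        InetAddress a = parse(ipA);
        InetAddress b = parse(ipB);
        // assumption: a wildcard bind overlaps with any specific address
        return a.isAnyLocalAddress() || b.isAnyLocalAddress() || a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(conflicts("192.0.2.10", 5060, "192.0.2.11", 5060));
        System.out.println(conflicts("192.0.2.10", 5060, "0.0.0.0", 5060));
    }
}
```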

4.7.1 Configuration changes needed

4.7.2 CLI / GUI impact if any

Changes to GUI to capture the listener category as external or internal.

4.8. HA Impact:

Clb impact covered in [1]

4.9. I18N/L10N Impact:

No

4.10. Packaging & Delivery:

Installation will provide only the default sip listeners, which listen on all the IPs in the machine. Specific sip-listeners that bind to a particular host:port have to be added later. Default listener type will be external.

4.10.1 Binaries in which the code is delivered

comms-appserv-rt.jar

4.11. Security Impact:

None

4.12. Compatibility Impact

Network Partitioning support in SailFin 2.0 is equivalent to a SailFin 1.5 deployment with all existing listeners assumed internal. SailFin 1.5 itself has no support for segregating the listeners into external and internal.

List any requirements on upgrade tool and migration tool.

4.13. Dependencies:

JSR 289 multihome changes [2].

4.13.1 Changes required in GlassFish

Yes, changes would be required in the http-listener configuration to specify the type of the listener , please see [1].

4.13.2 Third Party APIs

List any third party API used ( if any).

4.14 Miscellaneous

Will this component work with IPv6 addresses? Yes
Will this component work with JDK 64bit? Yes
Will this component require configuration using a sun-specific deployment descriptor? If yes, please specify below the configuration elements needed. No

5. Open Issues

These have been highlighted in red in the document.



Issue No    Description    Comments    Resolution

6. Reference Documents:


  1. http://wiki.glassfish.java.net/attach/SFv2FunctionalSpecs/jsr-289-multihost.html
  2. http://wiki.glassfish.java.net/attach/SFv2FunctionalSpecs/sailfin_admin_v2.doc
       

7. Schedule:

7.1. Projected Availability:

            SailFin 2.0