Harpreet Singh: harpreet.singh@sun.com
Jennifer Chou: jennifer.chou@sun.com
Mahesh Kannan: mahesh.kannan@sun.com
Sreenivas Munnangi: sreenivas.munnangi@sun.com
Administrators need a lightweight tool for identifying problems, one that adds little overhead when used in a production environment. They also need enough detail to identify the root cause of a production problem without impacting the business. The application server is the right place to build this capability: it can generate lightweight data in production and analyse it so that it is presented in a meaningful way, letting administrators get to the root cause of a problem. Shipping the right set of tools with the App Server product also adds value for users.
See below for a longer, more detailed technical description.
There is a need for a lightweight monitoring mechanism that allows monitoring to be turned on in a production environment with minimal impact. There should be no overhead when nothing is being monitored.
There is a need for an infrastructure that lets clients decide what should be monitored, including the ability to monitor specific events (e.g. only HTTP 200 OK responses). Clients also need to monitor some activity conditionally (e.g. track EJB methods ONLY when called within a transaction). Additionally, it should be possible to monitor components beyond EJB, Web, etc. (thread pools, replication, and so on).
The GlassFish v2 monitoring infrastructure is heavyweight, with three monitoring levels: OFF, LOW and HIGH.
Also, the set of monitorable data is fixed. There is no easy/extensible way to add new monitoring data to the system.
Now Called: Flashlight
gfProbe is a lightweight and extensible framework that allows clients to monitor GlassFish in a production environment. It is extensible in the sense that new clients can be written after the product has shipped. The infrastructure also ensures that the system operates with zero overhead when no monitoring is going on. Monitoring with the gfProbe framework is done by instrumenting the target code (server code). More specifically, only methods that provide monitoring data (or are being monitored) are instrumented, using the method retransformation support provided by JDK 6. When there are no clients, no instrumentation is done, and hence the system operates with zero monitoring overhead.
The Probe Infrastructure provides a factory class that allows a provider to register an interface as a probe provider. The interface typically defines the methods that signal probe points.
@Contract
public interface ProbeProviderFactory {
    public <T> T getProbeProvider(String moduleName, String providerName, String appName, Class<T> providerClazz)
        throws InstantiationException, IllegalAccessException;
}
The ProbeProviderFactory.getProbeProvider() method allows a provider interface to be registered against a module name, a provider name and an application name. As mentioned previously, a probe provider is basically an interface that provides a high-level view of the probe points (or events) in the system. A v3 container or module may define a ProbeProvider to emit probe events that signal high-level events.
For example, the transaction manager module may define a ProbeProvider as follows:
@ProbeProvider
public interface TxManager {
    @ProbeParams("{txId}")
    @ProbeName("begin")
    public void onTxBegin(String txId);

    @ProbeParams("{status}")
    public void onCompletion(boolean outcome);
}
The transaction manager then registers this interface with the Probe Infrastructure by calling the ProbeProviderFactory.getProvider() method. The return value from this method is an instance of a class that implements the TxManager interface.
The underlying implementation of TxManager is generated using ASM. By default, a no-op implementation of the methods is created, so when there are no listeners the methods do not incur any cost.
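The idea of a generated provider whose methods do nothing until a listener appears can be sketched with a JDK dynamic proxy. This is only an illustration: the spec generates the implementation with ASM, whereas java.lang.reflect.Proxy is swapped in here because it ships with the JDK, and a proxy call is cheap but not free. SketchProviderFactory, addListener and TxProbes are hypothetical names, not part of the gfProbe API.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

// Hypothetical stand-in for the probe infrastructure (not the gfProbe API).
final class SketchProviderFactory {
    // Each listener receives the probe method name and its arguments.
    private final List<BiConsumer<String, Object[]>> listeners = new CopyOnWriteArrayList<>();

    void addListener(BiConsumer<String, Object[]> listener) {
        listeners.add(listener);
    }

    // Returns an implementation of the provider interface that forwards
    // every probe call to the currently registered listeners.
    <T> T getProvider(Class<T> providerClazz) {
        InvocationHandler handler = (proxy, method, args) -> {
            // With no listeners the loop body never runs: the call is a no-op.
            for (BiConsumer<String, Object[]> listener : listeners) {
                listener.accept(method.getName(), args);
            }
            return null; // probe methods return void
        };
        return providerClazz.cast(Proxy.newProxyInstance(
                providerClazz.getClassLoader(), new Class<?>[] {providerClazz}, handler));
    }
}

// Hypothetical provider interface used only for this sketch.
interface TxProbes {
    void onTxBegin(String txId);
}
```

A module would keep the returned object in a field and call onTxBegin at the interesting point; whether anything happens depends entirely on the listeners registered at that moment.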
The @ProbeName annotation defines the name of the probe. If it is not used, the name of the method is used as the probe name. Note that if the provider class has overloaded methods, this annotation should be used to define a non-conflicting name.
The @ProbeParams annotation in the Provider class is used to give a name to each of the values that are passed as arguments. This allows the client to choose a subset of the values in its methods. See the next couple of sections for more details. Note: Instead of using @ProbeParams, we could have used a @ProbeParam annotation in the provider interface to annotate individual probe parameters.
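The naming rules above can be illustrated with plain reflection. The sketch below assumes annotation shapes that the spec does not spell out (a single String for the probe name, a String array for the parameter names), so SketchProbeName, SketchProbeParams and ProbeMetadata are hypothetical names chosen for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Assumed annotation shapes; the real definitions may differ.
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
@interface SketchProbeName { String value(); }

@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
@interface SketchProbeParams { String[] value(); }

interface SketchTxManager {
    @SketchProbeName("begin")
    @SketchProbeParams("{txId}")
    void onTxBegin(String txId);

    @SketchProbeParams("{status}")
    void onCompletion(boolean outcome);
}

final class ProbeMetadata {
    // The annotation wins; otherwise the Java method name is the probe name.
    static String probeName(Method m) {
        SketchProbeName name = m.getAnnotation(SketchProbeName.class);
        return name != null ? name.value() : m.getName();
    }

    // Names given to the probe arguments, in declaration order.
    static List<String> paramNames(Method m) {
        SketchProbeParams params = m.getAnnotation(SketchProbeParams.class);
        return params == null ? Collections.<String>emptyList() : Arrays.asList(params.value());
    }

    // Finds a provider method by its Java name; returns null when absent.
    static Method find(Class<?> provider, String javaName) {
        for (Method m : provider.getMethods()) {
            if (m.getName().equals(javaName)) {
                return m;
            }
        }
        return null;
    }
}
```

Here onTxBegin is published as the probe "begin", while onCompletion falls back to its method name because it carries no name annotation.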
Once registered, the Provider is now ready to emit events that clients can listen to. Thus, deep in the module implementation, when the module knows that it is about to begin a transaction, it can emit the event by calling the onTxBegin method. The code snippet below lists the code for registering the provider and the code for emitting the event.
public class TransactionManagerImpl {
    TxManager txProvider = ProbeProviderFactory.
        createProvider("tx", "TxManager", null,
            TxManager.class);

    public void begin() {
        String txId = createTransactionId();
        ....
        txProvider.onTxBegin(txId); //emit
    }
}
public interface ProbeProviderInfo {
    public String getModuleName();
    public String getProviderName();
    public String getApplicationName();
    public String getProbeName();
    public String[] getParameterNames();
    public Class[] getParameterTypes();
}
Listeners (or clients) express interest in the probe points (or probe events) emitted by a particular set of Probe Providers. The client just needs to annotate its methods with @ProbeListener, passing just the identifier of the probe to listen to:
public class TxListener {
    AtomicInteger txCount = ....;

    @ProbeListener("tx:TxManager::begin")
    public void begin(String txId) {
        txCount.incrementAndGet(); // AtomicInteger does not support ++
    }
}
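The listener string in the example looks like four colon-separated fields: module, provider, application and probe, with the application left empty. That layout is an inference from the example and from the parameters of the provider factory method, not something the spec states, so the parser below is a sketch under that assumption. ProbeId is a hypothetical helper.

```java
// Hypothetical parser for identifiers such as "tx:TxManager::begin".
// Assumed layout: module:provider:application:probe.
final class ProbeId {
    final String module;
    final String provider;
    final String application; // null when the field is empty
    final String probe;

    private ProbeId(String module, String provider, String application, String probe) {
        this.module = module;
        this.provider = provider;
        this.application = application;
        this.probe = probe;
    }

    static ProbeId parse(String spec) {
        // The -1 limit keeps empty fields, e.g. the empty application name.
        String[] parts = spec.split(":", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 ':'-separated fields: " + spec);
        }
        String app = parts[2].isEmpty() ? null : parts[2];
        return new ProbeId(parts[0], parts[1], app, parts[3]);
    }
}
```

Parsing "tx:TxManager::begin" this way yields module "tx", provider "TxManager", no application, and probe "begin".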
The gfProbes infrastructure allows clients to monitor GlassFish even in the absence of provider classes. This is done by allowing clients to receive callbacks when Java methods are entered or exited. Note that while this approach allows a client to monitor legacy code, it may not always be possible to receive "high-level" events. For example, while it is easy to monitor (through gfProbes) when TransactionManagerImpl.begin() is entered or exited, the client cannot determine the transaction ID in this case.
public class TxMonitor {
    private final AtomicLong count = new AtomicLong();

    @MethodEntry("tx:com.sun.tx.TxMgrImpl::onTxBegin")
    public void onTx(String tId) {
        count.incrementAndGet();
    }
}
The @MethodEntry annotation must be used by the client to receive a callback when the target method is entered. The client method's argument types and count must match the target method's parameter types and count. (This restriction might be removed later.)
The @MethodExit annotation must be used by the client to receive a callback when the target method is exited. The client method's argument types and count must match the target method's parameter types and count. (This restriction might be removed later.) The first parameter of the client method should match the return type of the target method (only if the target method has a non-void return type).
The @OnException annotation must be used by the client to receive a callback when the target method exits because of an exception. The client method's argument types and count must match the target method's parameter types and count. (This restriction might be removed later.) The first parameter of the client method should be of type Throwable.
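The three matching rules can be expressed as a small reflection check. This is a sketch of one reading of the rules: for exit and exception callbacks it assumes the extra first parameter (return value or Throwable) is followed by the target's own parameters. SignatureCheck, SketchTarget and SketchClient are hypothetical names.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

final class SignatureCheck {
    // Entry: listener parameters equal the target parameters.
    static boolean matchesEntry(Method target, Method listener) {
        return matches(listener, target.getParameterTypes());
    }

    // Exit: for a non-void target, the return type comes first.
    static boolean matchesExit(Method target, Method listener) {
        Class<?> ret = target.getReturnType();
        Class<?>[] params = target.getParameterTypes();
        return matches(listener, ret == void.class ? params : prepend(ret, params));
    }

    // Exception: a Throwable comes first.
    static boolean matchesException(Method target, Method listener) {
        return matches(listener, prepend(Throwable.class, target.getParameterTypes()));
    }

    // Callback methods must return void and have exactly the expected types.
    private static boolean matches(Method listener, Class<?>[] expected) {
        return listener.getReturnType() == void.class
                && Arrays.equals(expected, listener.getParameterTypes());
    }

    private static Class<?>[] prepend(Class<?> first, Class<?>[] rest) {
        Class<?>[] out = new Class<?>[rest.length + 1];
        out[0] = first;
        System.arraycopy(rest, 0, out, 1, rest.length);
        return out;
    }

    // Finds a declared method by name; returns null when absent.
    static Method find(Class<?> owner, String name) {
        for (Method m : owner.getDeclaredMethods()) {
            if (m.getName().equals(name)) {
                return m;
            }
        }
        return null;
    }
}

// Hypothetical probed class and client used to exercise the checks.
final class SketchTarget {
    public int lookup(String key) { return key.length(); }
}

final class SketchClient {
    public void onEntry(String key) { }
    public void onExit(int result, String key) { }
    public void onError(Throwable error, String key) { }
}
```

A framework could run such a check at registration time and reject a listener whose signature cannot receive the callback.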
Probe clients can express interest in certain predefined values that are not part of the target method definition. For example, ${gf.appname} and ${gf.modulename} are some of the computed params available to clients. These values are provided by the probe infrastructure and are computed/evaluated only on demand.
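On-demand evaluation of computed params can be pictured as a table of suppliers that are only invoked for the names a client actually requests. The sketch below is an assumption about how that could work, not the gfProbe mechanism; ComputedParams, define and resolve are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical registry of computed params such as "${gf.appname}".
final class ComputedParams {
    private final Map<String, Supplier<Object>> suppliers = new HashMap<>();

    void define(String name, Supplier<Object> supplier) {
        suppliers.put(name, supplier);
    }

    // Evaluates only the requested names; other suppliers are never called.
    Object[] resolve(String... requested) {
        Object[] values = new Object[requested.length];
        for (int i = 0; i < requested.length; i++) {
            Supplier<Object> supplier = suppliers.get(requested[i]);
            if (supplier == null) {
                throw new IllegalArgumentException("unknown computed param: " + requested[i]);
            }
            values[i] = supplier.get();
        }
        return values;
    }
}
```

A client that asks only for ${gf.appname} never pays for computing ${gf.modulename}.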
A client is registered with the gfProbe framework to receive callbacks. This is done by calling the ProbeClientMediator.registerListener() method.
@Contract
public interface ProbeClientMediator {
public Collection<ProbeClientMethodHandle> registerListener(Object listener);
}
The listener can be any Java object; it can extend any class and implement any number of interfaces.
The only restriction is that the return type of callback methods must be void.
The listener must be thread safe, as the target method being probed may be entered by multiple threads. To help with this, the framework will provide utility classes for common operations such as count(), avg() and sum().
A listener that is not registered to listen for events will never be called by the framework. Thus unregistered listeners incur no overhead.
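A thread-safe helper for the count, sum and average operations mentioned above might look like the following. It is a minimal sketch built on java.util.concurrent.atomic.LongAdder, not the utility classes the framework will ship; SketchStats is a hypothetical name.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical thread-safe accumulator for use inside a listener.
final class SketchStats {
    private final LongAdder count = new LongAdder();
    private final LongAdder sum = new LongAdder();

    // Safe to call from many probed threads at once.
    void record(long value) {
        count.increment();
        sum.add(value);
    }

    long count() {
        return count.sum();
    }

    long sum() {
        return sum.sum();
    }

    // The two adders are read separately, so a reading taken while
    // writers are active is approximate.
    double avg() {
        long n = count();
        return n == 0 ? 0.0 : (double) sum() / n;
    }
}
```

LongAdder is chosen here because it keeps contention low when many threads update the same counter, which fits a callback that runs on every probed request.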
Also, the set of operations that a BTrace client can perform is limited.
gfProbe clients, on the other hand, are true clients and are in fact called in the same thread as the target/probed method. This allows the client to use thread locals and even access thread locals of the probed system (if allowed). However, a gfProbe client will have the same set of restrictions as a Java EE application (for example, it cannot open a server socket or create a new thread).
BTrace has many features that allow debugging of a target application. For example, it is easy to track the number of times a Java object is locked or unlocked. gfProbe does not provide these facilities.
For the GlassFish v3 Prelude release, the Monitoring infrastructure will depend on the Probe Provider implementation by the Web Container. The Web Container will need to provide certain probe points, as listed by the gfProbes Probe Provider contract.
The Monitoring infrastructure will write ProbeListeners for the probe points listed by the web container Probe Provider. These ProbeListeners are called Telemetry Objects in the context of the Monitoring Infrastructure and are discussed in detail in the following sub-sections.
The Telemetry component's listeners for a provider are registered or unregistered based on lifecycle events from the provider (when the provider is coming up or going down), which keeps the Telemetry component agnostic of the modules it is associated with (i.e. no dependencies). Once the listeners are registered, data is collected from the probe points, where each record can combine several events from several listeners (e.g. deriving the response time from the 'Request start' and 'Request end' probes/listeners). The Analyzer is a built-in facility of the Telemetry module that massages the data collected from the listeners and exposes it through the Object View Hierarchy.
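Combining two probe events into one record, as in the response-time example, could be sketched as follows. The sketch assumes that the start and end events of a request fire on the same thread, which is consistent with listeners being called in the thread of the probed method. RequestTelemetry and its method names are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical telemetry object pairing 'Request start' and 'Request end'.
final class RequestTelemetry {
    private final ThreadLocal<Long> startNanos = new ThreadLocal<>();
    private final LongAdder requests = new LongAdder();
    private final LongAdder totalNanos = new LongAdder();

    // Would be bound to the 'Request start' probe.
    void requestStarted(long nowNanos) {
        startNanos.set(nowNanos);
    }

    // Would be bound to the 'Request end' probe.
    void requestEnded(long nowNanos) {
        Long start = startNanos.get();
        if (start == null) {
            return; // end seen without a start on this thread; ignore it
        }
        startNanos.remove();
        requests.increment();
        totalNanos.add(nowNanos - start);
    }

    long requestCount() {
        return requests.sum();
    }

    double averageResponseNanos() {
        long n = requestCount();
        return n == 0 ? 0.0 : (double) totalNanos.sum() / n;
    }
}
```

In a running server the timestamps would typically come from System.nanoTime() at the moment each listener is called.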
The CLI will expose the 'asadmin set' command to allow the configuration (enable/disable) of monitorable components. This will be in addition to what we would do in V2 (see below).
The existing domain.xml elements and their child nodes will be preserved, so that monitoring levels can still be set at a very high level for a module (for example, turning off monitoring for all the components of the web-container). The levels 'Low' and 'High' will not differ in the V3 release; the UI will expose them simply as 'On' or 'Off'.
Monitoring at a more granular level will not be done for Prelude; only the very high level (per-module) setting will be supported.
Thus information gathering happens at a much more granular level, and only for the requested nodes (or attributes, in the next release). For example, gathering Request Information can be a very expensive operation during peak business hours. One can choose to turn it off, in which case the object is omitted from the hierarchy and its probe listeners are unregistered so that the information is no longer collected.
The CLI will traverse the OVH for a given dotted name, which results in the retrieval of the data corresponding to the dotted name. We will be backward compatible with regards to the dotted name.
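Resolving a dotted name against a tree can be sketched in a few lines. The example assumes that node names themselves contain no dots and that a disabled node hides its whole subtree; SketchNode is a hypothetical class, much smaller than the TreeNode contract listed later in this document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, minimal node of an object view hierarchy.
final class SketchNode {
    final String name;
    Object value;
    boolean enabled = true;
    private final Map<String, SketchNode> children = new LinkedHashMap<>();

    SketchNode(String name) {
        this.name = name;
    }

    // Adds a child and returns it so that paths can be built by chaining.
    SketchNode addChild(SketchNode child) {
        children.put(child.name, child);
        return child;
    }

    // Resolves a dotted name relative to this node.
    // Returns null when a segment is missing or disabled.
    SketchNode getNode(String dottedName) {
        SketchNode current = this;
        for (String part : dottedName.split("\\.")) {
            current = current.children.get(part);
            if (current == null || !current.enabled) {
                return null;
            }
        }
        return current;
    }
}
```

With a root node for the server instance, the name web-container.http-listener.http-listener1 walks three levels down and returns the listener node, or null once that node has been disabled.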
The following figure gives a detailed view of the Monitoring Object View Hierarchy for the Web container, which will be the only provider for this release.
The syntax of the monitor command is as follows:
monitor --type monitor_type --interval 30 --filter filter --filename filename target
To monitor http-listener1 in server instance:
>asadmin monitor --type web-container.http-listener.http-listener1 --interval=5 server
Where type denotes the dotted name value referring to the sub component which you would like to monitor (http-listener1 in this case). Note that the dotted name would be able to identify any third party component also, provided they are implemented with the right contracts (annotations) and interfaces.
The filename option allows user to save the monitoring attributes to a file in comma separated format.
The interval option sets how often the screen is refreshed with new data from the server.
We will also support keywords such as httplistener and jvm for the --type option, to be backward compatible with V2.
asadmin get --monitor dotted-name
set <dotted-name=true|false> [target=server]
ex: asadmin set server.web-container.thread-pool.thread-pool-1.enabled=false
Note that the dotted name exposed should be able to accommodate any third party components also.
The framework will also provide a tree data structure that clients create to store data. This tree will be queried by the runtime to provide data, and it will take the form of the object view hierarchy. Consider a WebTelemetryClient that wants to count the number of times methodEntry was called. It declares a method called "getCount" and makes it monitorable. It uses the Counter utility class to maintain the count.
public class WebTelemetryClient {
    Counter counter;

    @Monitorable("count")
    public long getCount() {
        return counter.getCount();
    }

    @ProbeListener("web::methodEntry")
    public void methodEntered() {
        counter.increment();
    }

    public void init() {
        TreeNode node = TreeNodeFactory.createTreeNode("webTelemetry", this);
        TreeNode child = TreeNodeFactory.createTreeNode("count", this);
        node.addChild(child); // addChild takes a TreeNode, not a name
    }
}
The WebTelemetryClient registers itself as a TreeNode and registers the "getCount" monitorable method with the tree node. At runtime the admin CLI will get an instance of the TreeNode class (the "webTelemetry" object) from the habitat and invoke treeNode.getNode("count").getValue() on it to return the value of getCount.
See the Admin GUI Functional Spec for more details.
Pluggability in general is described in a separate spec. Here we will discuss the pluggability aspects that are specific to monitoring. Any module (either third party or built-in) will be able to use our monitoring infrastructure to expose monitoring functionality for its component. A module owner would need to come up with Telemetry objects (listeners), a Monitoring Object view hierarchy, Probe Points, CLI and, optionally, a GUI interface for the module. We will provide a way for the module writer to register with the Object view hierarchy, so that the module integrates seamlessly with ours and exposes its monitoring capabilities.
A system administrator will write a custom Probe Client that listens to the probe points. The system administrator will deploy this custom script to the Probes Infrastructure and start listening to Probe Events. The scripts will be deployed through the asadmin deploy command. The custom scripts will need to be packaged as a jar to be deployed onto the gfProbes infrastructure.
Once deployed, the scripts can start listening to events, as well as make use of utility classes provided by the infrastructure to maintain structures like count, averages etc.
public interface WebContainerProvider {
public void requestArrived();
public void responseSent();
}
then an instance of provider is created by doing the following:
import com.sun.tracing.*;
....
ProviderFactory factory = ProviderFactory.getDefaultFactory();
WebContainerProvider webProvider = factory.createProvider(WebContainerProvider.class);
....
webProvider.requestArrived();
....
webProvider.responseSent();
To enable easy integration with the above feature, all we have to do is the following:
Exposed @Service interfaces for the third-party monitoring, which would be included as part of Object view hierarchy, which would in turn be exposed as part of dotted names to the CLI commands.
Other exported interfaces are dotted-names and all the CLI commands
package org.glassfish.gfprobe.provider;
@Contract
public interface ProbeProviderFactory {
public <T> T getProvider(String moduleName, String providerName, String appName, Class<T> provideClazz);
}
package org.glassfish.gfprobe.provider;
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface ProbeListener {
public String value() default "";
}
package org.glassfish.gfprobe.client;
@Service
public class ProbeClientMediator {
public void registerClient(Object obj);
}
package org.glassfish.gfprobe.client;
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface ProbeMethodExit {
public String value() default "";
}
package org.glassfish.gfprobe.client;
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface ProbeMethodException {
public String value() default "";
public Class[] exceptions() default {}; // annotation defaults cannot be null
}
4.5.1.3 Utility Framework Classes
For all of these interfaces:
package org.glassfish.flashlight;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
/**
* To designate a class as monitorable so that it is published in the
* MonitoringRegistry
* @author Harpreet Singh
*/
@Target ({ElementType.TYPE,
ElementType.FIELD,
ElementType.METHOD})
@Retention (RetentionPolicy.RUNTIME)
public @interface Monitorable {
String value () default "";
}
/**
* @author Harpreet Singh
*/
@Contract
public interface MonitoringRuntimeDataRegistry {
public void add (String name, TreeNode node);
public void remove (String name);
/**
* @param name of the top node in the registry
* @return TreeNode
*/
public TreeNode get (String name);
}
package org.glassfish.flashlight.datatree;
import java.util.Collection;
import java.util.List;
@Contract
public interface TreeNode {
public String getName ();
public void setName (String name);
// TBD getValue should take varargs
public Object getValue ();
public void setValue (Object value);
public String getCategory ();
public void setCategory (String category);
public boolean isEnabled ();
public void setEnabled (boolean enabled);
// Children utility methods
public TreeNode addChild (TreeNode newChild);
public void setParent (TreeNode parent);
public TreeNode getParent ();
/**
*
* @return the complete dotted name to this node
*/
public String getCompletePathName ();
public boolean hasChildNodes ();
/**
* Returns a mutable view of the children
* @return Collection<TreeNode>
*/
public Collection<TreeNode> getChildNodes ();
public TreeNode getNode (String completeName);
public List<TreeNode> traverse ();
public List<TreeNode> getNodes (String regex);
}
In addition, the documentation will have to provide some sample scripts that enable monitoring of the most commonly monitored data. For example, a web container may provide scripts to monitor the number of web requests, the average response time of such requests, etc.
JSR77 could be an issue as we are trying to incorporate the REST functionality. REST has some limitations on the objects it exposes: each of those objects needs to implement specific interfaces. We need to think of a way to overcome this for the final release.
JMX, we think, could expose the OVH using the MBean Server Interceptors and Virtual MBeans implementation. See link1, link2 for more details. When we solve the JMX problem, AMX shouldn't be an issue.
Dotted names should be able to support and enhance the monitoring and configuration data from the previous release.
Backward compatibility with CallFlow will be supported post-prelude.
Monitoring Infrastructure will be delivered as part of the 'Prelude' release. For this release we will be concentrating on 'Web Container' monitoring only. Probes Infrastructure will be delivered as part of the “Prelude” release.
The aim is to get the web container module to provide probe points for “Prelude”.
For details on the Monitoring schedule, refer to the monitoring specification.