This is the documentation for LucidWorks Platform v2.0; the latest release is v2.1.

Monitoring your application is always an important part of running a production system. Most system administrators use various tools to ensure everything is OK, from the health of the server's filesystem to the temperature of the CPUs. LucidWorks Enterprise provides additional capabilities for integrating application-level statistics into these monitoring tools.

This functionality is available in LucidWorks Enterprise but not LucidWorks Cloud.

JMX

JMX is a standard way of managing and monitoring all varieties of software components in Java applications. JMX uses objects called MBeans (Managed Beans) to expose data and resources from your application. LucidWorks Enterprise provides a number of read-only monitoring beans that expose useful statistical and performance information. Combined with JVM (platform JMX MBeans) and OS-level information, they become a powerful tool for monitoring.

Enabling JMX for LucidWorks Enterprise

By default, JMX is enabled in LucidWorks Enterprise for local access only. To connect to and monitor the application remotely, change the lwecore.jvm.params parameter in the LWE_HOME/conf/master.conf file and add the following JVM parameters:

lwecore.jvm.params=... -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=3000 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djava.rmi.server.hostname=my.server.name

Where 3000 is an unused TCP port number.
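
Before settling on a port number, it can help to confirm that nothing is already listening on it. The following is a minimal Ruby sketch of such a check; the port_free? helper is a hypothetical name, not part of LucidWorks Enterprise, and the check only covers the host address it is given (the loopback address by default).

```ruby
require "socket"

# Returns true when a listening socket can be opened on host:port,
# which means no other process currently holds that port on that address.
# This is a point-in-time check: another process could still claim the
# port before LucidWorks Enterprise is restarted.
def port_free?(port, host = "127.0.0.1")
  server = TCPServer.new(host, port)
  server.close
  true
rescue Errno::EADDRINUSE, Errno::EACCES
  false
end

if __FILE__ == $PROGRAM_NAME
  port = 3000 # the example port used in the JVM parameters above
  puts "Port #{port} is #{port_free?(port) ? 'free' : 'in use or not bindable'}"
end
```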

You might want to secure remote JMX access, either by configuring a software or hardware firewall to allow connections to the specified port only from your hosts or network, or by configuring password authentication and/or SSL encryption. For more information about the various security options, please refer to the JMX documentation.

JMX Clients

There are a number of JMX clients you can use to connect to the LucidWorks Enterprise server and browse the available information.

JConsole

JConsole is a standard graphical monitoring tool (part of the JDK) for the Java Virtual Machine (JVM) and Java applications. It provides a convenient way to display memory and CPU information as well as MBeans from arbitrary applications.

JMXTerm

Jmxterm is an open source, command-line interactive JMX client. It lets you easily navigate JMX MBeans on remote servers without running a graphical interface or opening a JMX port. It can also be integrated with scripting languages such as Bash, Perl, Python and Ruby. The following session shows an example of how it can be used:

sh> java -jar jmxterm-1.0-alpha-4-uber.jar
Welcome to JMX terminal. Type "help" for available commands.

$>jvms
67183    ( ) - start.jar /Users/alexey/LWE/conf/jetty/rails/etc/jetty.xml /Users/alexey/LWE/conf/jetty/rails/etc/jetty-jmx.xml /Users/alexey/LWE/conf/jetty/rails/etc/jetty-ssl.xml
67182    (m) - start.jar /Users/alexey/LWE/conf/jetty/lwe-core/etc/jetty.xml /Users/alexey/LWE/conf/jetty/lwe-core/etc/jetty-jmx.xml /Users/alexey/LWE/conf/jetty/lwe-core/etc/jetty-ssl.xml
93534    ( ) - jmxterm-1.0-alpha-4-uber.jar
8554     ( ) -

$>open 67182
#Connection to 67182 is opened

$>domains
#following domains are available
JMImplementation
com.sun.management
java.lang
java.util.logging
org.mortbay.jetty
org.mortbay.jetty.handler
org.mortbay.jetty.security
org.mortbay.jetty.servlet
org.mortbay.jetty.webapp
org.mortbay.log
org.mortbay.util
solr/LucidWorksLogs
solr/collection1

$>domain solr/collection1
#domain is set to solr/collection1

$>beans
#domain = solr/collection1:
...
solr/collection1:id=collection1,type=core
solr/collection1:id=org.apache.solr.handler.StandardRequestHandler,type=standard
...
solr/collection1:id=org.apache.solr.search.FastLRUCache,type=fieldValueCache
solr/collection1:id=org.apache.solr.search.LRUCache,type=documentCache
solr/collection1:id=org.apache.solr.search.LRUCache,type=filterCache
solr/collection1:id=org.apache.solr.search.LRUCache,type=queryResultCache
solr/collection1:id=org.apache.solr.search.SolrFieldCacheMBean,type=fieldCache
...
solr/collection1:id=org.apache.solr.search.SolrIndexSearcher,type=searcher
solr/collection1:id=org.apache.solr.update.DirectUpdateHandler2,type=updateHandler

$>bean type=updateHandler,id=org.apache.solr.update.DirectUpdateHandler2
#bean is set to solr/collection1:type=updateHandler,id=org.apache.solr.update.DirectUpdateHandler2

$>info
#mbean = solr/collection1:type=updateHandler,id=org.apache.solr.update.DirectUpdateHandler2
#class name = org.apache.solr.core.JmxMonitoredMap$SolrDynamicMBean
# attributes
  %0   - adds (java.lang.String, r)
  %1   - autocommit maxTime (java.lang.String, r)
  %2   - autocommits (java.lang.String, r)
  %3   - category (java.lang.String, r)
  %4   - commits (java.lang.String, r)
  %5   - cumulative_adds (java.lang.String, r)
  %6   - cumulative_deletesById (java.lang.String, r)
  %7   - cumulative_deletesByQuery (java.lang.String, r)
  %8   - cumulative_errors (java.lang.String, r)
  %9   - deletesById (java.lang.String, r)
  %10  - deletesByQuery (java.lang.String, r)
  %11  - description (java.lang.String, r)
  %12  - docsPending (java.lang.String, r)
  %13  - errors (java.lang.String, r)
  %14  - expungeDeletes (java.lang.String, r)
  %15  - name (java.lang.String, r)
  %16  - optimizes (java.lang.String, r)
  %17  - rollbacks (java.lang.String, r)
  %18  - source (java.lang.String, r)
  %19  - sourceId (java.lang.String, r)
  %20  - version (java.lang.String, r)
#there's no operations
#there's no notifications

$>get cumulative_adds
#mbean = solr/collection1:type=updateHandler,id=org.apache.solr.update.DirectUpdateHandler2:
cumulative_adds = 125;
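
When a session like the one above is driven from a script, the text printed by the get command has to be turned into values. The following Ruby sketch shows one way to do that, assuming the output layout shown in the session (status lines starting with "#" and attribute lines of the form "name = value;"). The parse_jmxterm_get helper is a hypothetical name, not part of Jmxterm.

```ruby
# Parses the text printed by a jmxterm "get" command into a Hash.
# Lines starting with "#" are jmxterm status messages and are skipped.
# Attribute lines look like "cumulative_adds = 125;".
# Values are kept as strings, matching the java.lang.String attribute type.
def parse_jmxterm_get(output)
  output.each_line.with_object({}) do |line, attrs|
    line = line.strip
    next if line.empty? || line.start_with?("#")

    match = line.match(/\A(.+?)\s*=\s*(.*?);?\z/)
    attrs[match[1]] = match[2] if match
  end
end

if __FILE__ == $PROGRAM_NAME
  sample = "#mbean = solr/collection1:type=updateHandler\ncumulative_adds = 125;\n"
  puts parse_jmxterm_get(sample).inspect
end
```

Numeric attributes can then be converted with Integer() or Float() before they are sent to a monitoring system.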

JMX MBeans

LucidWorks Enterprise provides a number of useful JMX MBeans, some in Solr and some in LucidWorks Enterprise itself:

Solr MBeans

Domain: solr/<collection_name>
Object: type=updateHandler,id=org.apache.solr.update.DirectUpdateHandler2
Available attributes: cumulative_adds, cumulative_deletesById, cumulative_deletesByQuery, cumulative_errors, commits, autocommits, optimizes, rollbacks, docsPending, etc.
Comments: This MBean provides comprehensive information about indexing activity, such as the number of added documents, errors, commits, autocommits and optimize operations. It is useful to plot this information as graphs in your monitoring system. The cumulative_errors attribute shows the number of low-level IO exceptions.

Domain: solr/<collection_name>
Object: type=/update,id=org.apache.solr.handler.XmlUpdateRequestHandler
Available attributes: requests, errors, avgTimePerRequest, etc.
Comments: If you use the Solr API directly, there are separate beans for each type of handler you can use to index documents, such as the XML, CSV and JSON request handlers. It makes sense to add this update request handler information to your indexing graphs as well. You might also set up a monitoring alert on the number of errors for a particular update handler to make sure LucidWorks Enterprise clients don't hit errors during indexing, such as invalid field names or types, or required fields missing from indexed documents.

Domain: solr/<collection_name>
Object: type=/lucid,id=org.apache.solr.handler.StandardRequestHandler
Available attributes: requests, errors, timeouts, avgTimePerRequest
Comments: This MBean represents the default LucidWorks Enterprise Solr request handler and provides statistics about the number of search requests, errors and timeouts, and the average response time for search requests. It is useful to display this information on monitoring graphs and to set up monitoring alerts such as "notify the administrator if the average response time is more than 0.5 second or the total number of errors and timeouts is more than 1% of total requests".

Domain: solr/<collection_name>
Object: type=searcher,id=org.apache.solr.search.SolrIndexSearcher
Available attributes: numDocs, warmupTime
Comments: numDocs is the total number of documents in the index. warmupTime is the amount of time a new Searcher takes to warm. When LucidWorks Enterprise commits new data to the index, a new Searcher is opened and warmed. The warming operation regenerates caches from the previous Searcher instance and runs queries predefined in solrconfig.xml to warm up the filesystem IO cache and load the Lucene FieldCache into memory. This attribute effectively determines how long it takes after a commit before new data is available to users. It makes sense to monitor this attribute and set up a trigger to alert the LucidWorks Enterprise administrator if warming takes longer than you expect.

Domain: solr/<collection_name>
Object: type=filterCache,id=org.apache.solr.search.LRUCache
Available attributes: cumulative_evictions, cumulative_hitratio, cumulative_hits, cumulative_inserts, cumulative_lookups, warmupTime, etc.
Comments: Solr caches the results of popular filter queries (such as fq=category:IT) as unordered sets of document IDs. This technique significantly improves search filtering and faceting performance. size is the current number of cached filter queries. cumulative_hitratio shows how well the cache is utilized by giving the ratio of successful cache hits to the overall number of lookups. If it stays low (such as < 0.3, or 30%) over a long period of time, you might want to either increase the cache size or disable the cache altogether to reduce performance overhead.

Domain: solr/<collection_name>
Object: type=queryResultCache,id=org.apache.solr.search.LRUCache
Available attributes: cumulative_evictions, cumulative_hitratio, cumulative_hits, cumulative_inserts, cumulative_lookups, warmupTime, etc.
Comments: This cache stores ordered sets of document IDs: the top N results of a query ordered by some criteria. It has the same attributes as filterCache.

Domain: solr/<collection_name>
Object: type=documentCache,id=org.apache.solr.search.LRUCache
Available attributes: cumulative_evictions, cumulative_hitratio, cumulative_hits, cumulative_inserts, cumulative_lookups, etc.
Comments: The documentCache stores Lucene Document objects that have been fetched from disk.
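
The alert rules suggested in the comments above can be sketched as a few lines of Ruby. This is an illustration under stated assumptions, not a shipped feature: the helper names and constants are hypothetical, the attribute values are assumed to have been read from JMX into a Hash (as strings or numbers), and avgTimePerRequest is assumed to be reported in milliseconds.

```ruby
# Thresholds taken from the comments above; adjust them for your own system.
MAX_AVG_RESPONSE_MS = 500.0 # 0.5 second, assuming avgTimePerRequest is in ms
MAX_FAILURE_RATIO   = 0.01  # errors + timeouts above 1% of all requests
MIN_CACHE_HIT_RATIO = 0.3   # below 30%, consider resizing or disabling the cache

# Returns a list of alert messages for the search request handler MBean.
def search_handler_alerts(stats)
  requests = stats.fetch("requests").to_f
  failures = stats.fetch("errors").to_f + stats.fetch("timeouts").to_f
  alerts = []
  alerts << "slow responses" if stats.fetch("avgTimePerRequest").to_f > MAX_AVG_RESPONSE_MS
  alerts << "too many failed requests" if requests > 0 && failures / requests > MAX_FAILURE_RATIO
  alerts
end

# Returns true when the ratio of cache hits to lookups is below the minimum.
def cache_underused?(stats)
  lookups = stats.fetch("cumulative_lookups").to_f
  return false if lookups.zero?

  stats.fetch("cumulative_hits").to_f / lookups < MIN_CACHE_HIT_RATIO
end
```

Because cumulative counters only grow, a production check would normally compare the difference between two samples rather than the lifetime totals.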

LucidWorks Enterprise MBeans

Domain: lwe
Object: id=crawlers,name=<data_source_id>,type=datasources
Available attributes: total_runs, total_time, num_total, num_new, num_updated, num_unchanged, num_failed, num_deleted
Comments: This MBean displays crawler statistics for a specific data source (such as the number of processed documents and the number of errors). If you have a periodically scheduled or long-running data source, you might want to monitor it and raise an alert if there is a problem with the underlying source (web site, SharePoint server, etc.), or track how well optimized your incremental crawl is (the percentage of num_unchanged to num_total), for example.

Domain: lwe
Object: id=crawlers,name=<collection_name>,type=collections
Available attributes: total_runs, total_time, num_total, num_new, num_updated, num_unchanged, num_failed, num_deleted
Comments: If you have multiple data sources and don't want to monitor each one separately, but want to keep an eye on aggregate numbers for the whole collection, you might want to use this bean.

Domain: lwe
Object: id=crawlers,type=total
Available attributes: total_runs, total_time, num_total, num_new, num_updated, num_unchanged, num_failed, num_deleted
Comments: You can use this MBean to monitor at the instance level if you have multiple collections (homogeneous collections or a multi-tenant architecture).
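
The incremental crawl measure mentioned above (the percentage of num_unchanged to num_total) is simple arithmetic once the attributes have been read. The following Ruby sketch uses a hypothetical helper name and assumes the counters arrive as strings or numbers.

```ruby
# Returns part as a percentage of total, rounded to one decimal place.
# Returns 0.0 when nothing has been processed yet, to avoid dividing by zero.
def crawl_percentage(part, total)
  total = total.to_f
  return 0.0 if total.zero?

  (part.to_f / total * 100).round(1)
end

if __FILE__ == $PROGRAM_NAME
  # Made-up sample values for illustration only.
  stats = { "num_total" => "1000", "num_unchanged" => "950", "num_failed" => "12" }
  puts "unchanged: #{crawl_percentage(stats['num_unchanged'], stats['num_total'])}%"
  puts "failed:    #{crawl_percentage(stats['num_failed'], stats['num_total'])}%"
end
```

A high unchanged percentage indicates that the incremental crawl is skipping most documents, while a rising failed percentage points to a problem with the underlying source.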

Integrating with Monitoring Systems

Using the JConsole and Jmxterm tools is a good way to explore the information exposed through JMX, but what you really need is to monitor your application automatically: record historical information, display it in graphical form, configure parameter thresholds as triggers, and send alerts in case of denial of service or performance problems. There are various standard sysadmin tools for that, and integrating LucidWorks Enterprise with them is no different than with any other Java application. The idea is that you retrieve application information and send it to an external monitoring system. This documentation provides two examples of integrating the LucidWorks Enterprise server with popular open source monitoring tools: Zabbix and Nagios.

Zabbix

Zabbix is an enterprise-class open source distributed monitoring solution for networks and applications. It comes with predefined templates for almost all operating systems as well as various open source applications. It also has a template for the JVM that contains the most vital statistics of an arbitrary Java application. There are several ways to integrate LucidWorks Enterprise with Zabbix, and the best approach depends on the Zabbix release version.

Post-2.0 releases

Post-2.0 releases (currently in the beta stage) come with built-in support for monitoring Java applications (the Zabbix Java proxy). For more information, please see the JMX Monitoring section of the Zabbix manual.

Pre-2.0 releases

If you are handy with scripting and command line tools then you can also gather and send all the JMX information using either:

  • UserParameter: You can configure the Zabbix system agent to send custom monitored items using the UserParameter configuration parameter. To retrieve JMX statistics you can use either the cmdline-jmxclient or the jmxterm command line client.
UserParameter=jvm.maxthreads, java -jar cmdline-jmxclient.jar localhost:3000 java.lang:type=Threading PeakThreadCount
  • zabbix_sender tool: If you have a large number of JMX monitored items, or you need to monitor some items quite frequently, then spawning a Java Virtual Machine process to get a single object or attribute can be too expensive. In this case, consider scripting JMX interactions using the Jmxterm command line tool and your favorite scripting language. The solution below is in Ruby, but it could be implemented in any scripting language. The main idea is that you run the Jmxterm Java application from your script and communicate with it through its stdin and stdout streams using the expect library.
require "open3"
require "expect"
...
# Run the jmxterm Java application; popen2e merges its stdout and stderr into one stream
stdin, stdout, wait_thr = Open3.popen2e("java -jar jmxterm-1.0-alpha-4-uber.jar")
# Wait for the jmxterm prompt (up to 60 seconds)
result = stdout.expect("$>", 60)
...
# Connect to a specific JVM by process id
stdin.puts("open #{process_id}")
result = stdout.expect("$>", 60)
...
# Read a single attribute from the searcher MBean
stdin.puts("get -d solr/collection1 -b type=searcher,id=org.apache.solr.search.SolrIndexSearcher numDocs")
result = stdout.expect("$>", 60)
# Parse the response from the jmxterm command
...
# Run zabbix_sender to send a single item, or save multiple values into a file and send them as a batch
output = `zabbix_sender -z #{@server_name} -p #{@server_port} -i file.txt`.chomp
# Parse the response and validate that the operation was successful
...
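
The batch file passed to zabbix_sender with -i is plain text with one item per line in the form "<hostname> <key> <value>". The following Ruby sketch formats such lines. The helper name, the host name and the item keys are hypothetical and must match the host and trapper items defined on your Zabbix server. It also assumes that host names and keys contain no whitespace.

```ruby
# Builds zabbix_sender input lines in the "<hostname> <key> <value>" layout.
# Raises ArgumentError for names containing whitespace, since this sketch
# does not implement any quoting.
def zabbix_sender_lines(host, items)
  items.map do |key, value|
    if [host, key].any? { |name| name.to_s =~ /\s/ }
      raise ArgumentError, "host and key must not contain whitespace"
    end

    "#{host} #{key} #{value}"
  end
end

if __FILE__ == $PROGRAM_NAME
  # Hypothetical host name, item keys and values for illustration only.
  items = { "solr.numDocs" => 125, "solr.warmupTime" => 340 }
  puts zabbix_sender_lines("lwe-server-01", items)
end
```

Writing the resulting lines to file.txt produces the batch file used by the zabbix_sender call in the script above.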

How to integrate with Zabbix 2.0 (1.9.x)

This section is a step-by-step guide to integrating LucidWorks Enterprise with the Zabbix 2.0 (1.9.x) release. It won't work with previous releases (1.8.x) because they lack built-in JMX support.

  1. Download and install the 2.0 (1.9.x) release according to the official documentation. To build the Zabbix JMX proxy, build the Zabbix package with the --enable-java configuration option, for example ./configure --enable-server --with-mysql --enable-java.
  2. After make install, copy the example init.d start script from misc/init.d/debian/zabbix-server into the /etc/init.d directory and edit it to start the JMX proxy daemon as well. To do that, add calls to <install_dir>/sbin/zabbix_java/startup.sh and <install_dir>/sbin/zabbix_java/shutdown.sh to the corresponding options in the init.d script.
  3. Configure the JMX proxy in /etc/zabbix/zabbix_server.conf (see the JavaProxy, JavaProxyPort and StartJavaPollers parameters). Verify that you're using the same port as configured in the <install_dir>/sbin/zabbix_java/settings.sh file. It is also recommended to enable verbose logging for the JMX proxy (edit the <install_dir>/sbin/zabbix_java/lib/logback.xml file, change the file element to point to your log file directory and set the level attribute to debug).
  4. Import the sample Zabbix templates from the lwe_zabbix_templates.xml file found in $LWE_HOME/app/examples/zabbix (the file contains 3 templates).
  5. Install the Zabbix agent on the server where LucidWorks Enterprise is installed and configure it to connect to the Zabbix server.
  6. Add a Zabbix host and assign the proper template for the OS (Linux, FreeBSD, etc.).
  7. Assign the imported templates (Template_JVM, Template_Solr, Template_LWE) to that host.
  8. Enable JMX monitoring in LucidWorks Enterprise and allow the Zabbix server to connect to the JMX interface over the network.
  9. Add a JMX interface to the host where LucidWorks Enterprise is installed.
  10. Start any activity in the LucidWorks Enterprise server (crawling, indexing, serving) and check the graphs for the monitored host (see screenshots below).
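
If you want to monitor an attribute that the imported templates do not cover, you need to add a JMX item of your own. The following Ruby sketch builds an item key for it, assuming the jmx[<object name>,<attribute name>] key format described in the Zabbix 2.0 manual; check the manual for your exact release before relying on it. The helper name is hypothetical, and the sketch assumes the names contain no double quotes.

```ruby
# Builds a Zabbix JMX item key. Both parameters are wrapped in double quotes
# because MBean object names contain commas, which would otherwise be read
# as parameter separators.
def zabbix_jmx_key(object_name, attribute)
  %(jmx["#{object_name}","#{attribute}"])
end

if __FILE__ == $PROGRAM_NAME
  puts zabbix_jmx_key(
    "solr/collection1:type=searcher,id=org.apache.solr.search.SolrIndexSearcher",
    "numDocs"
  )
end
```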

Example graphs

  • Total number of documents in search index
  • Solr index operations (commits, optimizes, rollbacks)
  • Solr document operations (adds, deletes by id or query)
  • Crawling activity - number of total documents processed, number of failures (retrieve, parsing), number of new documents
  • Search activity - number of search requests
  • Search Average Response Time
  • Searcher Warmup Time (how fast committed docs become visible/searchable)
  • Java Heap Memory Usage
  • Caches stats

Nagios

Nagios is a popular open source application for monitoring computer systems and networks. It watches hosts and services, alerting users when things go wrong and again when they get better. There are several Nagios plugins that allow you to monitor Java applications through the JMX interface. We recommend the Syabru Nagios JMX Plugin as the most mature plugin; it supports different data types (integers, floats, string regular expressions) and the advanced Nagios threshold syntax. To install the Syabru Nagios JMX Plugin, copy check_jmx and check_jmx.jar from the downloaded package to the Nagios plugins directory, and either add the check_jmx command definition to the global commands.cfg configuration file or put the jmx.cfg file into the nagios_plugins configuration directory. The next step is to define Nagios services, as in this example:

# LWE searcher warmup time: warning state above 1 second, critical state above 2 seconds
define service {
        hostgroup_name                  all
        service_description             LWE_SEARCHER_WARMUP_TIME
        check_command                   check_jmx!3000!-O "solr/collection1:type=searcher,id=org.apache.solr.search.SolrIndexSearcher" -A warmupTime -w 1000 -c 2000 -u ms
        use                             generic-service
        notification_interval           0
}
# LWE search average response time: warning state above 100ms, critical state above 200ms
define service {
        hostgroup_name                  all
        service_description             LWE_SEARCHER_AVG_RSP_TIME
        check_command                   check_jmx!3000!-O "solr/collection1:type=/lucid,id=org.apache.solr.handler.StandardRequestHandler" -A avgTimePerRequest -w 100 -c 200 -u ms
        use                             generic-service
        notification_interval           0
}
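
The -w and -c values in the services above are simple upper thresholds. The following Ruby sketch shows how such a pair maps a measured value to a Nagios state, using the standard plugin exit codes (0 for OK, 1 for WARNING, 2 for CRITICAL). It illustrates the logic only and is not the plugin's actual implementation. The helper name is hypothetical, and negative values, which the full Nagios range syntax also treats as out of range, are ignored.

```ruby
NAGIOS_OK       = 0
NAGIOS_WARNING  = 1
NAGIOS_CRITICAL = 2

# Maps a value to a Nagios state given simple upper thresholds.
# The critical threshold is checked first so that it takes precedence.
def nagios_state(value, warning, critical)
  return NAGIOS_CRITICAL if value > critical
  return NAGIOS_WARNING if value > warning

  NAGIOS_OK
end

if __FILE__ == $PROGRAM_NAME
  # Thresholds from the LWE_SEARCHER_WARMUP_TIME service above, in ms.
  [850, 1500, 2500].each do |warmup_ms|
    puts "warmupTime=#{warmup_ms}ms -> state #{nagios_state(warmup_ms, 1000, 2000)}"
  end
end
```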

After you set up your services and reload the Nagios configuration, you can monitor the application state in the Nagios web UI or receive email notifications.

  • Nagios UI screenshot (thresholds on the screenshots are lowered to trigger critical state as an example)
  • Nagios email alert

Helpful tips

  • OS file system cache: A frequent problem with LucidWorks Enterprise and other Lucene/Solr applications is that a large index combined with too little free memory leaves not enough room for the file system cache, which shows up as poor performance. The IO cache is a crucial resource for search applications, so it makes sense to monitor this parameter and display it in graphs alongside other memory information such as free memory, JVM heap memory and swap. This parameter is part of the OS-level monitoring in Zabbix (its name is vm.memory.size[cached]).
  • File descriptors: Another problem is that your application can sometimes hit the OS or per-process file descriptor limits. It is recommended to monitor these parameters and set trigger thresholds for them.
  • CPU usage: The default Zabbix templates have triggers for CPU load average numbers. You might want to tune the thresholds for your server based on the number of CPUs and the expected load.
  • Heap memory usage and garbage collector statistics: The Zabbix Java template contains multiple items and triggers for memory and garbage collector invocation counts. You should tune these parameters to match your scenario as well.
  • Solr index size and free disk space: Triggers for these should be set properly to avoid "Out Of Disk Space" errors.
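
If you are not using Zabbix, the size of the file system cache can be read directly on Linux. The following Ruby sketch extracts it from text in the /proc/meminfo format, on the assumption that the "Cached" line there corresponds to what Zabbix reports as vm.memory.size[cached]. The helper name is hypothetical, and the approach does not apply to other operating systems.

```ruby
# Extracts the page cache size in kB from /proc/meminfo-style text.
# The "^" anchor keeps "SwapCached:" from matching. Returns nil when the
# "Cached:" line is missing.
def cached_kb(meminfo_text)
  match = meminfo_text.match(/^Cached:\s+(\d+)\s+kB/)
  match && match[1].to_i
end

if __FILE__ == $PROGRAM_NAME && File.readable?("/proc/meminfo")
  puts "File system cache: #{cached_kb(File.read('/proc/meminfo'))} kB"
end
```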