Have you wondered why Liferay decided to remove support for an internal Lucene cluster in Liferay DXP, and replace that with Elasticsearch? Why do I need the additional infrastructure to host an Elasticsearch cluster for high availability whereas with older Liferay versions, I could get by just fine without it? If you are asking these and other related questions, you are not alone.
Liferay DXP has been out for quite a while, but this question still seems to come up frequently when talking with clients. Since I’m fielding this question from multiple clients, I figured I’d write a blog post to help others in the community. First, I’m going to start off by saying that I truly don’t know exactly why Liferay stopped supporting an internal Lucene cluster in DXP. However, I can explain why it was a logical decision. In the end, I’m sure you’ll agree that this was a good decision on the part of Liferay.
Using the internal Lucene was a fairly quick setup in Liferay 6.2 (and earlier). However, it had several drawbacks. In a clustered environment, it was not particularly stable and it negatively impacted scalability. To start, a Lucene cluster does not scale well to a larger infrastructure. As the app server cluster gets larger, the amount of chatter required between nodes becomes unmanageable, and it becomes easy for the index to get out of sync between nodes. You can also have issues with a smaller number of app servers but a very large search index. In this case, your app servers can run into GC issues because the same JVM is handling user traffic and maintaining the search index. When you dig into the details of the core product, Liferay performs a lot of index updates during regular use of the site. Assets being updated with view counters and users being updated with last login timestamps are just two examples. At one previous client, I measured the index query traffic and the index update traffic and found them to be very close in volume.
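To put a rough number on that chatter, here is a back-of-the-envelope sketch. It assumes a naive model in which every index update made on one node is sent to every other node as a separate message. The function name and the update rate are made up for illustration, and Liferay's actual cluster messaging may batch or route updates differently.

```python
def replication_messages(nodes: int, updates_per_node: int) -> int:
    """Total messages under the assumed model: each update made on a node
    is forwarded once to each of the other (nodes - 1) nodes."""
    return nodes * updates_per_node * (nodes - 1)


if __name__ == "__main__":
    # Hypothetical load: 1,000 index updates per node in some time window.
    for n in (2, 4, 8, 16):
        print(f"{n} nodes: {replication_messages(n, 1000)} messages")
```

Under this assumed model, the message count grows roughly with the square of the cluster size: with the same per-node update rate, going from 2 nodes to 16 nodes takes you from 2,000 messages to 240,000.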
In a small cluster, Lucene worked pretty well, assuming all nodes remained active and search index integrity was not too important. Let’s look at a simplistic view of a two-node cluster and examine how your search index is replicated. Anytime a change is made on node1, a message is sent to node2 to keep the index in sync. For example, when a user registers on the website using node1, they are indexed in Lucene on node1. That index change is then replicated to node2 to keep the two in sync. The same happens in reverse for changes made on node2.
Now imagine we decide to perform a zero-downtime deployment of code. We remove traffic from node1 in the load balancer (or web tier) to deploy our code and restart the app server. While the app server is restarting, a user registers on the website on node2. That user is indexed on node2 but never indexed on node1, because node1 is down at that moment. This can also happen with many other changes such as content modifications (user uploaded documents), user profile changes, etc. Yes, I agree that this is a very simplistic example, but it shows one particular scenario. Realistically, the next time that user is updated they will get re-indexed, which will hopefully bring the Lucene index on both nodes back in sync. However, this example does show how the search index on each app server node can slowly drift apart.
If you’re running Lucene now, check your statistics. I would bet that the number of records indexed on each node is not identical. You may also see a different number of users displayed in the control panel on each node. You can combat these issues by occasionally reindexing content, users, etc. in Liferay to keep them in sync. But this adds overhead and work that is not needed with an external search index such as Elasticsearch or Solr. Furthermore, as the number of users, content, documents, and assets grows in Liferay, the reindex process will take longer. With some of our larger Liferay installations, we have seen reindexing times measured in days rather than minutes.
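The drift scenario described above can be sketched as a toy simulation. This is not Liferay or Lucene code. The Node class, the document IDs, and the assumption that a node which is down simply misses replicated updates (with nothing replayed when it comes back) are all illustrative simplifications of the two-node example.

```python
class Node:
    """A toy app server holding its own local copy of the search index."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.index = set()   # document IDs indexed on this node
        self.peers = []      # other nodes that receive replicated updates

    def index_document(self, doc_id):
        """Index locally, then try to replicate the change to every peer."""
        self.index.add(doc_id)
        for peer in self.peers:
            peer.receive(doc_id)

    def receive(self, doc_id):
        """Apply a replicated update; a node that is down misses it."""
        if self.up:
            self.index.add(doc_id)


def simulate_rolling_deploy():
    node1, node2 = Node("node1"), Node("node2")
    node1.peers, node2.peers = [node2], [node1]

    node1.index_document("user-alice")  # both nodes up: update is replicated
    node1.up = False                    # node1 is restarted for a deployment
    node2.index_document("user-bob")    # registered on node2 while node1 is down
    node1.up = True                     # node1 returns; nothing is replayed
    return node1.index, node2.index


if __name__ == "__main__":
    index1, index2 = simulate_rolling_deploy()
    print("node1:", sorted(index1))
    print("node2:", sorted(index2))
    print("missing on node1:", sorted(index2 - index1))
```

In this sketch, node1 ends up without the user that registered during its restart, and it stays that way until that user is re-indexed or a full reindex is run. With an external search cluster, both app servers would be writing to and reading from the same index, so there is no second copy to fall behind.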
Generally, we’ve always recommended an external search cluster anyway. In the past, that was Solr. Now, we are very excited that DXP supports Elasticsearch out of the box. This provides more options to companies. You can still use a Solr cluster if desired, but the initial setup for Elasticsearch is quicker. Most clients are moving in this direction with their DXP upgrade.
Given all the inconsistency issues along with the obvious scalability issues, I’m sure that is what ultimately led to Liferay’s decision to stop supporting an internal Lucene cluster. The search function in Liferay DXP is more important than ever before.
If you have questions on how you can best leverage Liferay Search Indexing and/or need help with your Liferay DXP implementation, please engage with us via comments on this blog post, or reach out to us at https://www.xtivia.com/contact/.
Additional Reading
You can also continue to explore Liferay DXP and Elasticsearch by checking out Migrating to Liferay DXP, The Top 10 New Features in Liferay DXP, or Search in the Liferay 7 Knowledge Base.