Liferay Portal also provides search functionality out of the box. It includes a search framework that can be integrated with external search engines.
By default, Liferay Portal's search API connects to an embedded Apache Lucene search engine, which stores its search indexes on the local filesystem. When we use Lucene in a clustered environment, we need to make sure the indexes are replicated across the cluster.
Index storage on SAN
One option is to configure Lucene to store its indexes on a centralized network location.
All Liferay Portal nodes then refer to the same copy of the indexes. Liferay provides a property to set the index location. This approach is recommended only if a SAN is available and the SAN provider handles file locking issues.
To configure the location of the index directory, we need to add the following property in portal-ext.properties:
lucene.dir=<SAN Lucene index location>
Index replication using Cluster Link with Lucene
Cluster Link can also replicate Lucene indexes across the Liferay Portal nodes. Cluster Link connects to all the Liferay Portal nodes using UDP multicast. When Cluster Link is enabled, the Liferay search engine API raises an event on Cluster Link for each index change so that the change is replicated across the cluster.
This option is recommended when centralized index storage on a SAN is not available.
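As a minimal sketch, assuming a Liferay 6.x installation where these keys are defined in portal.properties, the following entries in portal-ext.properties on every node enable Cluster Link and Lucene write replication. Check the property names against the portal.properties file shipped with your version before relying on them.

```properties
# Assumption: Liferay 6.x property names; verify against your portal.properties.
# Enable Cluster Link so that nodes can exchange messages.
cluster.link.enabled=true

# Replicate each Lucene index write to the other cluster members.
lucene.replicate.write=true
```

With this setup, each node keeps its own copy of the index on its local filesystem, so lucene.dir does not need to point to shared storage.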
Apache Solr is a powerful open source search engine based on Apache Lucene. Put simply, it wraps the Lucene search engine and exposes the Lucene APIs through web services.
Solr runs as a separate web application.
To integrate Apache Solr with Liferay, we need to install the Solr web plugin. We can set the URL of the Solr server by modifying the configuration of the Solr web plugin. Solr is recommended when the Portal is expected to write a large amount of data to the search indexes.
In that case, the embedded Lucene engine adds a lot of overhead because every index change has to be replicated across the cluster. As Apache Solr runs as a separate web application, it makes the Portal architecture more scalable.
Apache Solr is installed on a separate server. The Apache Solr server internally stores indexes on the filesystem. All Liferay Portal servers are connected with the Apache Solr server. Every search request and index write request will be sent to the Apache Solr server.
With a single Solr server, read and write operations run concurrently against the same index storage.
When this load becomes a concern, it is recommended to use a master-slave Solr setup. In this approach, one master and many slave Solr servers are configured to work together. The master server handles all write operations, and the slave servers handle all read and search operations.
The Solr master server is configured to replicate its indexes automatically to the slave servers. Each Liferay Portal application server is connected to both the master and the slave servers. The Liferay Solr web plugin provides a way to configure separate Solr servers for read and write operations. We can also configure a separate slave server for each Liferay Portal node.