Introduction to Solr:
Solr is an open-source search engine built on top of Apache Lucene. It wraps Lucene's search library in a standalone server, adds functionality on top of it, and is designed for scalability. You can communicate with Solr over HTTP using REST clients, wget, curl, Postman, native client libraries, and so on.
Features of Solr:
- Faceted search
- Highlighting
- Full-text search
- Dynamic clustering
- Real-time indexing
- Replication
- Pagination
- Sorting
- Database integration
- NoSQL features
- Rich document handling (e.g. Word and PDF files)
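To make a few of the features above concrete, here is a minimal sketch of how a single Solr request can combine full-text search, pagination, sorting, faceting and highlighting. The host, the collection name `articles` and the fields `title` and `category` are assumptions made up for illustration; the query parameters themselves (`q`, `start`, `rows`, `sort`, `facet`, `facet.field`, `hl`, `hl.fl`, `wt`) are standard Solr parameters. The code only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode

# Assumptions for illustration: a local Solr on its default port 8983 and a
# hypothetical collection "articles" with "title" and "category" fields.
SOLR_BASE = "http://localhost:8983/solr"


def build_solr_select_url(collection, text, page=0, page_size=10):
    """Build a /select URL that combines several of the features listed above."""
    params = [
        ("q", f"title:{text}"),        # full-text search on one field
        ("start", page * page_size),   # pagination: offset of the first result
        ("rows", page_size),           # pagination: results per page
        ("sort", "score desc"),        # sorting by relevance
        ("facet", "true"),             # faceted search
        ("facet.field", "category"),   # field to compute facet counts for
        ("hl", "true"),                # highlighting
        ("hl.fl", "title"),            # field to highlight
        ("wt", "json"),                # response format
    ]
    return f"{SOLR_BASE}/{collection}/select?{urlencode(params)}"


url = build_solr_select_url("articles", "lucene", page=2, page_size=5)
print(url)
```

The resulting URL can be fetched with curl, wget or any HTTP client, which is all a "REST client" for Solr really needs to do.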
Introduction to Elasticsearch:
Elasticsearch is also an open-source search engine based on the Apache Lucene Java library. It was developed by Shay Banon of Elastic NV. Elasticsearch provides distributed, multitenant full-text search behind an HTTP web interface, and clients communicate with it through its RESTful API.
Features of Elasticsearch:
- Automatic node recovery
- Automatic data rebalancing
- Full-text search
- Field and document level API security
- Filters
- Highly available scalable alerting
- Encrypted communications
- Distributed search
- Multi-tenancy
- An analyzer chain
- Analytical search
- Grouping & aggregation
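As a rough illustration of how full-text search, filters and aggregation fit together, the sketch below builds a search request body in Elasticsearch's JSON Query DSL. The index name `articles` and the fields `title`, `category` and `published` are hypothetical; the structure (`bool` query with `must` and `filter`, a `terms` aggregation, `from` and `size` for paging) follows the Query DSL. Nothing is sent over the network here.

```python
import json

# Assumptions for illustration: a hypothetical index "articles" with a text
# field "title", a keyword field "category" and a date field "published".


def build_es_search_body(text, since, page=0, page_size=10):
    """Build a search body mixing full-text matching, a filter and an aggregation."""
    return {
        "from": page * page_size,  # offset of the first hit
        "size": page_size,         # hits per page
        "query": {
            "bool": {
                # scored full-text match
                "must": [{"match": {"title": text}}],
                # non-scoring filter on the publication date
                "filter": [{"range": {"published": {"gte": since}}}],
            }
        },
        "aggs": {
            # grouping: document counts per category
            "by_category": {"terms": {"field": "category"}}
        },
    }


body = build_es_search_body("lucene", "2020-01-01", page=1, page_size=5)
print(json.dumps(body, indent=2))
```

Assuming a local node on the default port, this body would be sent as a POST to `http://localhost:9200/articles/_search`.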
Comparisons between Solr and Elasticsearch:
Feature | Solr | Elasticsearch |
Configuration | XML file solrconfig.xml is used for configuration. | YAML file elasticsearch.yml is used for configuration. |
Indexing/Searching | Text-oriented. | Better performance on analytical queries. |
Scalability and Clustering | Provides SolrCloud. | Better inherent scalability; designed for the cloud. |
Alerts | Heartbeat, threshold and incident alerts. | Highly available and scalable alerting. Notifications via email, Slack and JIRA. |
REST API | Solr API to manage collection-level configuration, plus a client API. | Document API, Search API, Aggregations API, Ingest API and Management API. |
Full-text search | Faceted search, highlighters, spell check, autocomplete and filter queries. | Inverted index, cross-cluster search, highlighters, Query DSL, typeahead, corrections (spell check). |
Security | Basic authentication, document-level security, needs a firewall. Has no known cross-site scripting vulnerabilities. | Encrypted communications, encryption-at-rest support, attribute-based access control, role-based access control, field- and document-level security, single sign-on (SSO) and IP filtering. |
Ingest Node | | Introduced in Elasticsearch version 5.0 to process documents before they are indexed. |
Community | Larger ecosystem and community. | Large community, discussion forums and learning resources. |
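The ingest node entry in the table is easier to picture with an example. The sketch below builds two requests: one that defines an ingest pipeline, and one that indexes a document through it so the processors run before the document is stored. The pipeline id `normalize-articles`, the index `articles` and the field names are hypothetical, and the URL shapes follow recent Elasticsearch versions, so treat this as a sketch rather than a version-exact recipe. `lowercase` and `set` are built-in ingest processors.

```python
import json
from urllib.parse import quote

ES_BASE = "http://localhost:9200"  # assumption: local node on the default port


def build_pipeline_request(pipeline_id):
    """Return (method, url, body) for defining an ingest pipeline."""
    body = {
        "description": "Normalize fields before indexing",
        "processors": [
            # lowercase the (hypothetical) category field
            {"lowercase": {"field": "category"}},
            # stamp every document with a constant (hypothetical) source field
            {"set": {"field": "source", "value": "blog"}},
        ],
    }
    return "PUT", f"{ES_BASE}/_ingest/pipeline/{quote(pipeline_id)}", body


def build_index_request(index, pipeline_id, doc):
    """Return (method, url, body) for indexing doc through the pipeline."""
    url = f"{ES_BASE}/{quote(index)}/_doc?pipeline={quote(pipeline_id)}"
    return "POST", url, doc


pipeline_req = build_pipeline_request("normalize-articles")
index_req = build_index_request(
    "articles", "normalize-articles", {"title": "Intro", "category": "SEARCH"}
)
for method, url, body in (pipeline_req, index_req):
    print(method, url)
    print(json.dumps(body))
```

With the pipeline in place, the stored document would carry the lowercased category and the added source field, even though the client sent neither.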
Search Engine Rankings:
Below is the ranking chart provided by DB-Engines, which ranks a variety of search engines by popularity. According to this chart, Elasticsearch is currently the more popular search engine.
Conclusion:
Both Solr and Elasticsearch have mature codebases and large, well-documented ecosystems, so either can be chosen depending on the requirements. For text-based search, Solr is the better choice. For distributed, scalable deployments and analytical queries, Elasticsearch is the better choice.