Unleashing Lightning-Fast Searches: Achieving 200x Performance Enhancement with Manticore Search

Search Image — Image by Gerd Altmann from Pixabay

Search suggestions are an indispensable feature of web applications, giving users quick access to the information they want. When search is sluggish, however, users may look for alternatives. A client of Pevatrons faced this challenge with an enterprise app. In this post we walk through the issue and our solution.

The Problem

The client initially relied on Sphinx Search for their search functionality, but over time ran into the following problems:

- They ran Sphinx in plain mode and loaded data from MySQL (the core data repository) into the Sphinx index with the Indexer tool. Newly added data was therefore never searchable immediately, and the loading delay left users wondering whether they had done something wrong.

- Loading data from MySQL into Sphinx would fail during peak times, when the load on the databases increased.

- Because Sphinx real-time mode was not working, searches were served from MySQL instead, which made them very slow: in some cases a search took upwards of 10 seconds, and sometimes it failed altogether.

The team at Pevatrons quickly diagnosed the issue and identified the pain point. The client wanted true real-time search: if data is inserted or updated, it should be searchable immediately. Any solution that was only near-real-time (NRT), which is what Elasticsearch and Sphinx supported, was unacceptable.

The first task was to find a search engine that suited our purpose. We zeroed in on Manticore Search based on a few important criteria:

- Community support and adoption

- Availability of official Docker image

- Manticore is a fork of Sphinx, so the learning curve is reasonable

- Support for real-time search

- Ability to parallelize any search query

- Support for (multi-node) replication with a virtually synchronous multi-master replication

What is Manticore Search and What does it do?

Quoting directly from the Manticore Search website:

Manticore Search is a multi-storage database specifically designed for search, with robust full-text search capabilities.

As an open-source database (available on GitHub), Manticore Search was created in 2017 as a continuation of Sphinx Search engine. Our development team took all the best features of Sphinx and significantly improved its functionality, fixing hundreds of bugs along the way (as detailed in our Changelog). With nearly complete code rewrites, Manticore Search is now a modern, fast, and light-weight database with full features and exceptional full-text search capabilities.

Manticore Search allows you to search your data in a resource-efficient and latency-efficient manner. The core of Manticore Search is written in C++, which keeps it closer to the hardware than many other programming languages would, allowing it to exploit hardware-based optimisations and deliver blazing-fast search with minimal resource consumption.

Manticore Search has beaten MySQL and the crowd favourite Elasticsearch in many benchmarks.

Manticore Search's GitHub page has more to say:

182x faster than MySQL for small data (reproducible❗)

29x faster than Elasticsearch for log analytics (reproducible❗)

15x faster than Elasticsearch for small dataset (reproducible❗)

5x faster than Elasticsearch for medium-size data (reproducible❗)

4x faster than Elasticsearch for big data (reproducible❗)

up to 2x faster max throughput than Elasticsearch's for data ingestion on a single server (reproducible❗)

These numbers made us believe that we had made the right choice. However, the devil is in the details.

Various interfaces to connect to Manticore Search

Manticore Search allows us to connect to it via the following interfaces:

- Manticore Search supports the MySQL protocol, which means you can use any MySQL client library of your choice to connect to it.

- It also supports connections over HTTP/HTTPS. Its endpoints are similar to Elasticsearch endpoints, meaning many of the tools built for Elasticsearch (which is far more widely adopted) can work with it too.

- Manticore internally uses its own binary protocol to connect the nodes in a cluster.

All this meant that Manticore would work regardless of programming language, since MySQL client implementations exist almost everywhere. Where they did not, we could always fall back to the HTTP endpoints for querying the data.
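As a rough illustration of the HTTP route, the sketch below builds a request for Manticore's JSON `/search` endpoint using only the Python standard library. It is not the client's code: the table name is hypothetical, the URL assumes Manticore's default HTTP port (9308) on localhost, and newer Manticore releases prefer the key `table` over `index`.

```python
import json
import urllib.request

# Assumption: Manticore's HTTP listener runs locally on its default port 9308.
MANTICORE_URL = "http://localhost:9308"


def build_search_request(table, text, limit=10, base_url=MANTICORE_URL):
    """Build (but do not send) a POST request for the JSON /search endpoint."""
    body = {
        "index": table,                   # hypothetical table name from the caller
        "query": {"match": {"*": text}},  # full-text match across all text fields
        "limit": limit,
    }
    return urllib.request.Request(
        url=f"{base_url}/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(table, text, limit=10):
    """Send the request and return the decoded JSON response."""
    request = build_search_request(table, text, limit)
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `search("articles", "real time")` needs a running Manticore instance with a table of that name. The same query could equally be sent over the MySQL protocol with any MySQL client library.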

While all of this looks simple to implement, any new component needs to be thoroughly tested before it is introduced into a system.

We also needed to make sure the newly introduced component could handle the load and could fit into the existing architecture with minimal changes.

Legacy and New Architecture

The legacy architecture came with inherent drawbacks:

- User requests are received by Amazon API Gateway, which invokes a Lambda function through endpoints and rules. This function, built with Node.js, handles searches. It formerly called Sphinx for this, but switched to relying solely on MySQL because of the issues described earlier.

- The application's data in MySQL was distributed across various tables, each aligned with specific features. For search purposes, these tables needed to be joined before conducting searches on the combined data. This process was considerably sluggish, akin to searching through one very large table.

- Only full-text search was supported, as anything else would be terribly slow.

- Type-as-you-go suggestions could not be supported, since a search would take upwards of 10 seconds in the worst case.

What's in the new architecture?

Pevatrons' team seamlessly integrated the Manticore Search component into the architecture, yielding the following enhancements:

- Real-time capabilities for Insert, Update, Delete, and Search operations.

- Remarkable acceleration of search queries: what used to take over 10 seconds on the legacy system now completed in under 50 milliseconds at the 95th percentile, a 200x improvement (10,000 ms / 50 ms = 200).

- Prefix-based search in Manticore enabled instant type-as-you-go suggestions once users had typed 3 or more characters.

- The newfound latency improvements made type-as-you-go suggestions a reality, enhancing the overall user experience.
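To make the 3-character rule concrete, here is a minimal sketch of how typed input might be turned into a prefix query. It is an illustration only: the helper names and the table name are hypothetical, and it assumes the table was created with prefix indexing enabled (for example `min_prefix_len = '3'`), without which the trailing `*` would not match.

```python
import re

MIN_PREFIX_CHARS = 3  # from the article: suggestions start at 3 typed characters


def build_suggestion_query(typed):
    """Return a full-text query string for a prefix match, or None.

    Only the last word (the one still being typed) gets the '*' wildcard.
    """
    # Keep only word characters and whitespace so user input cannot inject
    # full-text operators such as |, ", @ or -.
    words = re.sub(r"[^\w\s]", " ", typed).split()
    if not words:
        return None
    last = words[-1]
    if len(last) < MIN_PREFIX_CHARS:
        # Too short to expand as a prefix; fall back to the completed words.
        return " ".join(words[:-1]) or None
    return " ".join(words[:-1] + [last + "*"])


def build_suggestion_payload(table, typed, limit=10):
    """Body for the JSON /search endpoint, or None if there is nothing to search."""
    query = build_suggestion_query(typed)
    if query is None:
        return None
    return {"index": table, "query": {"query_string": query}, "limit": limit}
```

With this sketch, typing `ma` yields no query at all, while `man` becomes `man*` and `real tim` becomes `real tim*`. Returning `None` for short input lets the caller skip the round trip to the search engine entirely.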

The crux of the solution lay in redefining the interplay between Manticore Search and MySQL. While retaining MySQL as the repository for all data, we strategically funneled only searchable data into Manticore Search. To ensure real-time search capabilities, we instituted a direct pipeline for user-inserted or updated data into both Manticore and MySQL. This strategic pivot eliminated the lag that the prior MySQL-to-Manticore synchronization approach had introduced. Naturally, this shift necessitated the development of a data migration script to synchronize MySQL and Manticore for initial parity.
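The write path and the one-off migration described above could be sketched roughly as follows. This is not the production code: the column names, the table name, and the two writer callables are hypothetical stand-ins. The payload shapes follow Manticore's JSON `/replace` and `/bulk` endpoints, where `replace` inserts a document or overwrites the one with the same id.

```python
import json

# Hypothetical column names: only these are copied into the search table.
SEARCHABLE_FIELDS = ("title", "description")


def to_search_doc(row, fields=SEARCHABLE_FIELDS):
    """Keep only the searchable, non-empty columns of a MySQL row (a dict)."""
    return {name: row[name] for name in fields if row.get(name) is not None}


def replace_payload(table, row):
    """Body for the JSON /replace endpoint (insert or overwrite by id)."""
    return {"index": table, "id": row["id"], "doc": to_search_doc(row)}


def save(row, write_mysql, write_manticore, table="articles"):
    """Write path: MySQL first (the source of truth), then Manticore right away.

    write_mysql and write_manticore are caller-supplied callables, so this
    sketch stays independent of any particular client library.
    """
    write_mysql(row)
    write_manticore(replace_payload(table, row))


def bulk_backfill_body(table, rows):
    """NDJSON body for the /bulk endpoint, for the initial MySQL-to-Manticore sync."""
    lines = [json.dumps({"replace": replace_payload(table, row)}) for row in rows]
    return "\n".join(lines) + "\n"
```

Using `replace` for both inserts and updates keeps the write path to a single code branch. A real implementation would also need to decide what happens when the MySQL write succeeds but the Manticore write fails, for instance by retrying or by reconciling the two stores later.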

Load Generation and Testing

We had to make sure the Manticore Search component could handle the necessary load and scale up and down as needed. Load generation is the process of simulating real-world users to make sure the application can handle real-world load.

We used the Locust load generation tool for this purpose. In tests on a compute-optimized AWS EC2 instance, we became confident that the architecture could handle upwards of 1200 users per second with less than 30% CPU utilization. The load test results also gave our client confidence, bringing the new architecture one step closer to production.
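Locust did the real work here. The standard-library sketch below is not Locust and not our test suite; it only illustrates what a load test measures: many concurrent calls, each timed, summarised as a percentile. The `send` callable is a hypothetical placeholder for one search request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(send):
    """Run one request and return its latency in milliseconds."""
    start = time.perf_counter()
    send()
    return (time.perf_counter() - start) * 1000.0


def run_load(send, total_requests, concurrency):
    """Fire total_requests calls to send() from a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(timed_call, send) for _ in range(total_requests)]
        return [future.result() for future in futures]


def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # ceiling of pct * n / 100
    return ordered[rank - 1]
```

A run might look like `percentile(run_load(send, 1000, 50), 95)`, where `send` issues one search query. A dedicated tool such as Locust adds what this sketch lacks: ramp-up, realistic user behaviour, distributed workers, and live reporting.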

Conclusion

The new architecture with Manticore Search gave the enterprise web application real-time, type-as-you-go search, a dramatic change for end users that makes them a lot more productive. While the benefits for the end user are the most apparent, the new architecture also came with impressive engineering advantages such as low resource consumption, the ability to use SQL syntax, and a 'peaceful' co-existence with MySQL, where the main data resides.

Some useful links to understand more about Manticore Search.

We are always happy to hear your thoughts and experiences on this, so feel free to comment and share. If you want to implement real-time search for your application, send an email to queryus@pevatrons.net.