JuangaCovas.info

La página personal de Juan Gabriel Covas

Herramientas de usuario

Herramientas del sitio


linux:howtos:manticore-playground

¡Esta es una revisión vieja del documento!


Manticore Search adventures

This is my page of playing with Manticore Search.

Manticore Search is an open-source search engine designed specifically for search, including full-text search, with focus on low latency and high throughput. It was born in 2017 as a continuation of the famous Sphinx Search engine.

Things I like:

  • SQL-first, you can connect to the server using just a MySQL/MariaDB client or your mysql connection library of choice, from any language.
    • Default port: 9306 instead of 3306 (default for mysql)
  • Official PHP interface (complete HTTP API integration via cURL) for index maintenance, etc. (Searches could be done from mysql or API)
  • Real Time indexes that allow instant updates
    • Attaching a plain index to a real-time index: A plain -static- index can be converted into a real-time index or added to an existing real-time index.
  • Main+Delta schema: There's a frequent situation when the total dataset is too big to be reindexed from scratch often, but the amount of new records is rather small. Example: a forum with a 1,000,000 archived posts, but only 1,000 new posts per day. In this case, “live” (almost real time) index updates could be implemented using so called “main+delta” scheme.
  • Fast geospatial search
  • You could go for distributed architecture for faster indexing and searching over petabytes of data

Windows

Extract the zip, then edit the conf. file:

manticore.conf.in


# holy crap

common {
    plugin_dir = /usr/local/manticore/lib
}

searchd {
    listen = 127.0.0.1:9312
    listen = 127.0.0.1:9306:mysql
#   listen = 127.0.0.1:9308:http
    log = E:/manticore36/log/searchd.log
    query_log = E:/manticore36/log/query.log
    pid_file = E:/manticore36/log/searchd.pid
    data_dir = E:/manticore36/data
    query_log_format = sphinxql
}

Let's install as a service:

E:\Manticore\bin\searchd --install --config E:\Manticore\manticore.conf.in --servicename Manticore

Manticore can be started and stopped from the Services Control Panel or manually from the command line:

sc.exe start Manticore
sc.exe stop Manticore

If you don't install Manticore as Windows service, you can start it from the command line:

.\bin\searchd -c manticore.conf.in

To ensure a fast connection, use 127.0.0.1 and not localhost:

mysql -P9306 -h127.0.0.1

Scripted configuration

Manticore configuration supports shebang syntax, meaning that the configuration can be written in a programming language and interpreted at loading, allowing dynamic settings.

For example, indexes can be generated by querying a database table, various settings can be modified depending on external factors or external files can be included (which contain indexes and/sources).

The configuration file is parsed by declared declared interpreter and the output is used as the actual configuration. This is happening each time the configuration is read (not only at searchd startup).

This facility is not available on Windows platform.

In the following example, we are using PHP to create multiple indexes with different name and we also scan a specific folder for file containing extra declarations of indexes.

manticore.conf.in

#!/usr/bin/php
...
<?php for ($i=1; $i<=6; $i++) { ?>
index test_<?=$i?> {
  type = rt
  path = /var/lib/manticore/data/test_<?=$i?>
  rt_field = subject
  ...
 }
 <?php } ?>
 ...

 <?php
 $confd_folder='/etc/manticore.conf.d/';
 $files = scandir($confd_folder);
 foreach($files as $file)
 {
         if(($file == '.') || ($file =='..'))
         {} else {
                 $fp = new SplFileInfo($confd_folder.$file);
                 if('conf' == $fp->getExtension()){
                         include ($confd_folder.$file);
                 }
         }
 }

Comments

The configuration file supports comments, with # character used as start comment section. The comment character can be present at the start of the line or inline.

Extra care should be considered when using # in character tokenization settings as everything after it will not be taken into consideration. To avoid this, use # UTF-8 which is U+23.

# can also be escaped using \. Escaping is required if # is present in database credential in source declarations.

linux/howtos/manticore-playground.1629627463.txt.bz2 · Última modificación: 22/08/2021 12:17 por Juanga Covas