Shows the differences between two versions of the page.
linux:howtos:manticore-playground [21/08/2021 22:28] – Juanga Covas | linux:howtos:manticore-playground [23/08/2021 19:15] (current) – [Comments] Juanga Covas | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Manticore | + | ====== Manticore |
+ | |||
+ | This page collects my notes from trying out Manticore Search with a view to adopting it. | |
+ | |||
+ | **[[https:// | ||
+ | |||
+ | **Things I like**: | ||
+ | * SQL-first, you can connect to the server using just a MySQL/ | ||
+ | * Default port: 9306 instead of 3306 (the MySQL default) | |
+ | * Official PHP interface (complete HTTP API integration via cURL) for index maintenance, | ||
+ | * Real Time indexes that allow instant updates | ||
+ | * Attaching a plain index to a real-time index: A plain -static- index can be converted into a real-time index or added to an existing real-time index. | ||
+ | * Supports Main+Delta schema: There' | ||
+ | * Fast geospatial search | ||
+ | * A distributed architecture is available for faster indexing and searching over petabytes of data | |
+ | |||
+ | **Other notes** | ||
+ | * A confusing point is the difference between the mode //searchd// runs in ("RT mode" or "Plain mode") and the index types, which are also called RT index and Plain index. **RT mode is __required__ if you want to enable | |
+ | * REAL-TIME MODE **requires** no index definition in the configuration file and having a // | ||
+ | * PLAIN MODE lets you specify the index schema in the configuration file, which is read when Manticore starts; any missing index is created. This mode is especially useful for plain indexes that need to be built from external storage. An index can only be dropped by removing it from the configuration file, or by removing its path setting, and then sending a HUP signal to the server or restarting it.\\ **You can still use real-time indexes (RT indexes) in Plain mode** since [[https:// | |
- | This is my page of playing with Manticore Search | ||
===== Windows ===== | ===== Windows ===== | ||
Line 18: | Line 36: | ||
listen = 127.0.0.1: | listen = 127.0.0.1: | ||
listen = 127.0.0.1: | listen = 127.0.0.1: | ||
- | # | + | # |
log = E:/ | log = E:/ | ||
query_log = E:/ | query_log = E:/ | ||
pid_file = E:/ | pid_file = E:/ | ||
- | | + | # PLAIN MODE is enabled by omitting " |
+ | # | ||
+ | # data_dir = E:/ | ||
query_log_format = sphinxql | query_log_format = sphinxql | ||
} | } | ||
Line 40: | Line 60: | ||
.\bin\searchd -c manticore.conf.in | .\bin\searchd -c manticore.conf.in | ||
- | To ensure a fast connection, use '' | + | To ensure a fast connection, use '' |
mysql -P9306 -h127.0.0.1 | mysql -P9306 -h127.0.0.1 | ||
+ | ===== Scripted configuration ===== | ||
+ | |||
+ | Manticore configuration supports shebang syntax, meaning that the configuration can be written in a programming language and interpreted at loading, allowing dynamic settings. | ||
+ | |||
+ | For example, indexes can be generated by querying a database table, various settings can be modified depending on external factors or external files can be included (which contain indexes and/ | ||
+ | |||
+ | The configuration file is parsed by the declared interpreter and its output is used as the actual configuration. This happens each time the configuration is read, not only at searchd startup. | |
+ | |||
+ | This facility is not available on Windows. | |
+ | |||
+ | In the following example, we use PHP to create multiple indexes with different names, and we also scan a specific folder for files containing extra index declarations. | |
+ | |||
+ | <code php|'' | ||
+ | # | ||
+ | ... | ||
+ | <?php for ($i=1; $i<=6; $i++) { ?> | ||
+ | index test_<? | ||
+ | type = rt | ||
+ | path = / | ||
+ | rt_field = subject | ||
+ | ... | ||
+ | } | ||
+ | <? | ||
+ | ... | ||
+ | |||
+ | <? | ||
+ | | ||
+ | | ||
+ | | ||
+ | { | ||
+ | | ||
+ | {} else { | ||
+ | $fp = new SplFileInfo($confd_folder.$file); | ||
+ | | ||
+ | | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | </ | ||
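The same idea can be sketched in another interpreter. The script below is a minimal illustration in Python, assuming the shebang mechanism accepts any interpreter that writes the configuration to standard output. The ''DATA_ROOT'' path, the number of indexes and the helper names ''render_index'' and ''render_config'' are hypothetical and only serve the example; a real configuration would also need its other sections.

```python
#!/usr/bin/env python3
"""Sketch of a scripted configuration written in Python.

Whatever the script prints to stdout is meant to be used as configuration text.
The index names, the path and the single field are placeholders.
"""

DATA_ROOT = "/var/lib/manticore"  # assumed location, adjust to your setup


def render_index(number: int) -> str:
    """Return the definition of one real-time index as configuration text."""
    name = f"test_{number}"
    return (
        f"index {name} {{\n"
        f"    type = rt\n"
        f"    path = {DATA_ROOT}/{name}\n"
        f"    rt_field = subject\n"
        f"}}\n"
    )


def render_config(count: int = 6) -> str:
    """Return the definitions for test_1 .. test_<count>, one after another."""
    return "\n".join(render_index(i) for i in range(1, count + 1))


if __name__ == "__main__":
    print(render_config())
```

Keeping the rendering in small functions makes it possible to check the generated text before pointing searchd at the script.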
+ | |||
+ | ===== Comments ===== | ||
+ | |||
+ | The configuration file supports comments, with the # character marking the start of a comment. The comment character can appear at the start of a line or inline. | |
+ | |||
+ | Take extra care when using # in character tokenization settings, as everything after it is ignored. To avoid this, write # as its Unicode code point, U+23. | |
+ | |||
+ | # can also be escaped using \. Escaping is required if # is present in the database credentials of a source declaration. | |
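The rules above can be illustrated with a small sketch. This is not Manticore's actual parser, only a hypothetical helper (''strip_comment'') that mimics the described behaviour: an unescaped # starts a comment, while \# yields a literal #. The setting names in the usage lines are just sample input.

```python
def strip_comment(line: str) -> str:
    """Drop an unescaped '#' and everything after it; keep an escaped one."""
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        # A backslash directly followed by '#' stands for a literal '#'.
        if ch == "\\" and i + 1 < len(line) and line[i + 1] == "#":
            out.append("#")
            i += 2
            continue
        # An unescaped '#' turns the rest of the line into a comment.
        if ch == "#":
            break
        out.append(ch)
        i += 1
    return "".join(out).rstrip()


if __name__ == "__main__":
    print(strip_comment(r"sql_pass = secret\#1 # database password"))
    print(strip_comment("charset_table = 0..9, a..z, U+23"))
```

The second usage line shows why U+23 is the safe spelling inside tokenization settings: it contains no literal # for the comment rule to cut at.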
+ | |||
+ | ===== Source ===== | ||
+ | |||
+ | Nice usage of '' | ||
+ | |||
+ | A table to keep some indexing information: | |
+ | CREATE TABLE `product_search_status` ( | |
+ | `id` varchar(30) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL, | ||
+ | `value` bigint(20) UNSIGNED NOT NULL, | ||
+ | PRIMARY KEY (`id`) USING BTREE | ||
+ | ) ENGINE = InnoDB; | ||
+ | |||
+ | < | ||
+ | # we set unicode charset and wait_timeout to a high value to prevent connection timeout errors | ||
+ | sql_query_pre = SET NAMES utf8 | ||
+ | sql_query_pre = SET SESSION wait_timeout=3600 | ||
+ | # we store the index time for information | ||
+ | sql_query_pre = REPLACE INTO product_search_status (id, value) VALUES (' | ||
+ | # we set start-end document ids so that manticore will know where to start and stop indexing | ||
+ | sql_query_range = SELECT MIN(id), MAX(id) FROM product | ||
+ | sql_range_step = 10000 | ||
+ | # this is the main query to create documents | ||
+ | sql_query = SELECT \ | ||
+ | id, \ | ||
+ | name AS name_ft, \ | ||
+ | | ||
+ | name \ | ||
+ | FROM product \ | ||
+ | WHERE id >= $start AND id <= $end | ||
+ | # we store the most recent document id for information | ||
+ | sql_query_post_index = REPLACE INTO product_search_status (id, value) VALUES (' | ||
+ | </ | ||
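To see what the ranged query does, the following sketch reproduces the batching arithmetic in Python. It assumes the documented behaviour of ranged queries: the minimum and maximum ids are fetched once, then the main query runs once per step with $start and $end replaced by inclusive bounds. The function name ''id_ranges'' is hypothetical.

```python
from typing import Iterator, Tuple


def id_ranges(min_id: int, max_id: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) pairs that cover min_id..max_id."""
    if step <= 0:
        raise ValueError("step must be positive")
    start = min_id
    while start <= max_id:
        # The last batch is clipped so it never runs past max_id.
        end = min(start + step - 1, max_id)
        yield start, end
        start = end + 1


if __name__ == "__main__":
    # With ids 1..25000 and sql_range_step = 10000 the main query runs 3 times.
    for start, end in id_ranges(1, 25000, 10000):
        print(f"WHERE id >= {start} AND id <= {end}")
```

Smaller steps mean more queries but shorter ones, which is the reason to combine ranged queries with a generous wait_timeout rather than one huge SELECT.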