Indexing

sir.indexing.reindex(args)
    Reindexes all entity types in args["entity_type"]. If no types are specified, all known entities are reindexed.
    Parameters: args (dict) – A dictionary with a key named entities.
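
The fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the sir implementation: KNOWN_ENTITIES and select_entity_types are hypothetical names, and the sketch assumes that a missing, None, or empty "entity_type" value means "reindex everything".

```python
# Hypothetical stand-in for the entity types the indexer knows about.
KNOWN_ENTITIES = ["artist", "recording", "release"]


def select_entity_types(args):
    """Return the entity types to reindex for an ``args`` dict.

    Assumption: a missing, None, or empty "entity_type" value selects
    every known entity type.
    """
    requested = args.get("entity_type")
    if not requested:
        return list(KNOWN_ENTITIES)
    return list(requested)
```

Calling select_entity_types({"entity_type": ["artist"]}) selects only that type, while an empty dict selects every entry in KNOWN_ENTITIES.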

sir.indexing.index_entity(session, entity_name, bounds, data_queue)
    Retrieve rows for a single entity type identified by entity_name, convert each row to a dict with sir.indexing.query_result_to_dict(), and put the dicts into data_queue.
    Parameters:
    - session (sqlalchemy.orm.Session)
    - entity_name (str)
    - bounds ((int, int))
    - data_queue (Queue.Queue)
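
The data flow (rows selected by an id range, converted to dicts, and placed on a queue) might look like the sketch below. It uses an in-memory list instead of a database session, and rows_to_queue is a hypothetical helper. Treating both ends of bounds as inclusive is an assumption; the documentation above does not say how the bounds are interpreted. The code targets Python 3, where Queue.Queue is spelled queue.Queue.

```python
import queue


def rows_to_queue(rows, bounds, data_queue):
    """Put a dict on ``data_queue`` for every row whose id is within ``bounds``."""
    lower, upper = bounds
    for row in rows:
        # Assumption: both ends of the range are inclusive.
        if lower <= row["id"] <= upper:
            # Copy the row so consumers cannot mutate the source data.
            data_queue.put(dict(row))


rows = [{"id": 1, "name": "a"}, {"id": 5, "name": "b"}, {"id": 9, "name": "c"}]
data_queue = queue.Queue()
rows_to_queue(rows, (2, 9), data_queue)
```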

sir.indexing.queue_to_solr(queue, batch_size, solr_connection)
    Read dict objects from queue and send them to the Solr server behind solr_connection in batches of batch_size.
    Parameters:
    - queue (multiprocessing.Queue)
    - batch_size (int)
    - solr_connection (solr.Solr)
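
A minimal sketch of the batching behaviour, assuming the producer signals the end of the stream with a None sentinel (the documentation above does not say how termination is signalled). FakeSolr and its send method are hypothetical stand-ins used so the example runs without a Solr server; they are not part of any Solr client library.

```python
import queue

STOP = None  # Assumption: a sentinel value marks the end of the stream.


class FakeSolr:
    """Hypothetical stand-in for a Solr connection; records each batch sent."""

    def __init__(self):
        self.batches = []

    def send(self, docs):
        self.batches.append(list(docs))


def drain_in_batches(q, batch_size, connection):
    """Read dicts from ``q`` and send them in batches of ``batch_size``."""
    batch = []
    # iter(callable, sentinel) keeps calling q.get() until it returns STOP.
    for doc in iter(q.get, STOP):
        batch.append(doc)
        if len(batch) >= batch_size:
            connection.send(batch)
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        connection.send(batch)


q = queue.Queue()
for i in range(5):
    q.put({"id": i})
q.put(STOP)

connection = FakeSolr()
drain_in_batches(q, 2, connection)
```

Sending in batches keeps the number of round trips to the server low, and the explicit flush at the end ensures that a trailing partial batch is not lost.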

sir.indexing.send_data_to_solr(solr_connection, data)
    Sends data through solr_connection.
    Parameters:
    - solr_connection
    - data
    Raises:

sir.indexing._multiprocessed_import(entity_names, live=False, entities=None)
    Does the real work of importing all entities named in entity_names in multiple processes via the multiprocessing module.
    When live is True, only the documents whose ids are in the entities dict are live indexed; otherwise the entire table is reindexed for each entity in entity_names.
    Parameters:
    - entity_names
    - live
    - entities

sir.indexing._index_entity_process_wrapper(args, live=False)
    Calls sir.indexing.index_entity() with args unpacked.
    Parameters: live (bool)
    Return type: None or an Exception

sir.indexing.live_index(entities)
    Reindex all documents in entities in multiple processes via the multiprocessing module.
    Parameters: entities (dict(set(int)))
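
The dict(set(int)) type can be read as a mapping to sets of integer ids. The sketch below builds such a structure from (entity name, id) pairs. That the keys are entity names is an assumption, and build_entities is a hypothetical helper that is not part of sir.

```python
def build_entities(changes):
    """Group (entity_name, id) pairs into a dict mapping names to id sets."""
    entities = {}
    for name, entity_id in changes:
        # A set removes duplicate ids for the same entity type.
        entities.setdefault(name, set()).add(entity_id)
    return entities


entities = build_entities(
    [("artist", 1), ("artist", 2), ("artist", 1), ("release", 7)]
)
```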

sir.indexing.live_index_entity(session, entity_name, ids, data_queue)
    Retrieve rows for a single entity type identified by entity_name, convert each row to a dict with sir.indexing.query_result_to_dict(), and put the dicts into data_queue.
    Parameters:
    - session (sqlalchemy.orm.Session)
    - entity_name (str)
    - ids ([int])
    - data_queue (Queue.Queue)