Indexing

sir.indexing.reindex(args)

Reindexes all entity types in args["entity_type"].

If no types are specified, all known entities will be reindexed.
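The fallback described above can be sketched as follows. This is only an illustration, not sir's implementation: KNOWN_ENTITIES and resolve_entity_types are hypothetical names, and the key "entity_type" is taken from the description above.

```python
# Hypothetical stand-in for the entity types the indexer knows about.
KNOWN_ENTITIES = ["artist", "recording", "release"]


def resolve_entity_types(args, known=KNOWN_ENTITIES):
    """Return the requested entity types, or all known ones if none were given."""
    requested = args.get("entity_type")
    # A missing key, None, or an empty list all mean "reindex everything".
    return list(requested) if requested else list(known)


print(resolve_entity_types({"entity_type": ["artist"]}))
print(resolve_entity_types({"entity_type": None}))
```

The first call selects only the requested type; the second falls back to every known entity type.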

Parameters: args (dict) – A dictionary with a key named entities.
sir.indexing.index_entity(entity_name, bounds, data_queue)

Retrieve rows for a single entity type identified by entity_name, convert each row to a dict with sir.indexing.query_result_to_dict(), and put the dicts into data_queue.
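The retrieve, convert, and enqueue flow can be sketched without a database. Everything here is a stand-in: ROWS replaces the query, row_to_dict replaces sir.indexing.query_result_to_dict(), and treating bounds as an inclusive (lower, upper) id range is an assumption rather than something the documentation states.

```python
import queue

# Hypothetical stand-in for database rows: (id, name) tuples.
ROWS = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]


def row_to_dict(row):
    """Stand-in for sir.indexing.query_result_to_dict()."""
    return {"id": row[0], "name": row[1]}


def index_rows(rows, bounds, data_queue):
    """Convert every row whose id lies within bounds and enqueue the result."""
    lower, upper = bounds  # assumed to be an inclusive id range
    for row in rows:
        if lower <= row[0] <= upper:
            data_queue.put(row_to_dict(row))


q = queue.Queue()
index_rows(ROWS, (2, 3), q)
print(q.qsize())
```

A consumer on the other end of the queue can then pick up the dicts independently of how the rows were fetched.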

Parameters:
sir.indexing.queue_to_solr(queue, batch_size, solr_connection)

Read dict objects from queue and send them to the Solr server behind solr_connection in batches of batch_size.
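The batching behaviour can be illustrated with a small sketch. It is not the library's code: FakeSolr and its add_many method are hypothetical stand-ins for a real connection, and using None as an end-of-stream marker is an assumption about how the consumer learns that the queue is exhausted.

```python
import queue

SENTINEL = None  # assumed end-of-stream marker; the real signal may differ


class FakeSolr:
    """Hypothetical stand-in for a Solr connection; records each batch."""

    def __init__(self):
        self.batches = []

    def add_many(self, docs):
        self.batches.append(list(docs))


def drain_in_batches(data_queue, batch_size, connection):
    """Send queued dicts to connection in groups of at most batch_size."""
    batch = []
    while True:
        item = data_queue.get()
        if item is SENTINEL:
            break
        batch.append(item)
        if len(batch) >= batch_size:
            connection.add_many(batch)
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        connection.add_many(batch)
```

With five queued documents and a batch_size of 2, this sketch sends two full batches followed by one batch holding the remaining document.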

Parameters:
sir.indexing.send_data_to_solr(solr_connection, data)

Sends data through solr_connection.

Parameters:
Raises: solr.SolrException

sir.indexing._multiprocessed_import(entity_names, live=False, entities=None)

Does the actual work of importing all entities in entity_names, using multiple processes via the multiprocessing module.

When live is True, only the documents whose ids are listed in the entities dict are indexed; otherwise the entire table is reindexed for each entity in entity_names.
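The difference between the two modes can be sketched as a planning step that decides what each worker would receive. plan_tasks is a hypothetical helper, no processes are started, and the assumption that live mode iterates over the keys of entities is not confirmed by the documentation.

```python
def plan_tasks(entity_names, live=False, entities=None):
    """Return one (entity_name, ids) task per entity type.

    ids is None for a full reindex and a sorted list of ids for live indexing.
    """
    if live:
        entities = entities or {}
        # Live mode: only the listed ids of each entity type are indexed.
        return [(name, sorted(ids)) for name, ids in sorted(entities.items())]
    # Full mode: every row of each named entity type is reindexed.
    return [(name, None) for name in entity_names]


print(plan_tasks(["artist", "release"]))
print(plan_tasks([], live=True, entities={"artist": {3, 1}}))
```

Each resulting task could then be handed to a separate worker process.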

Parameters:
sir.indexing._index_entity_process_wrapper(args, live=False)

Calls sir.indexing.index_entity() with args unpacked.
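A minimal sketch of such a wrapper, assuming it unpacks args into the call and hands any exception back as its return value instead of raising it. call_and_capture is a hypothetical name, and the target function is passed in explicitly so the example runs on its own.

```python
def call_and_capture(func, args):
    """Call func(*args); return None on success or the exception on failure."""
    try:
        func(*args)
    except Exception as exc:
        # Returning the exception instead of raising it lets the caller
        # inspect the failure as an ordinary result value.
        return exc
    return None


print(call_and_capture(lambda a, b: a + b, (1, 2)))
print(repr(call_and_capture(lambda: 1 / 0, ())))
```

A caller that collects these results can treat any non-None value as a failed worker.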

Parameters: live (bool)
Return type: None or an Exception
sir.indexing.live_index(entities)

Reindex all documents in entities in multiple processes via the multiprocessing module.
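One plausible shape for the entities argument is a dict that maps each entity type name to a set of integer row ids. This is a reading of the documented type, not a confirmed contract, and group_ids is a hypothetical helper for building such a dict from (entity_name, row_id) pairs.

```python
from collections import defaultdict


def group_ids(pairs):
    """Group (entity_name, row_id) pairs into a dict of id sets."""
    entities = defaultdict(set)
    for name, row_id in pairs:
        entities[name].add(row_id)  # sets drop duplicate ids automatically
    return dict(entities)


entities = group_ids([("artist", 1), ("artist", 2), ("release", 7), ("artist", 1)])
print(entities)
```

Using sets means an id that is reported more than once is only indexed once per entity type.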

Parameters: entities (dict(set(int)))
sir.indexing.live_index_entity(entity_name, ids, data_queue)

Retrieve the rows listed in ids for a single entity type identified by entity_name, convert each row to a dict with sir.indexing.query_result_to_dict(), and put the dicts into data_queue.

Parameters: