Module docs2answers

Docs2Answers

class Docs2Answers(BaseComponent)

This node converts retrieved Documents into the format of predicted Answers. It is useful when you call a Retriever-only pipeline through the REST API, because it ensures that the output is in a compatible format.
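As a rough illustration of the conversion described above, the sketch below wraps each retrieved document in an answer-shaped record. It does not use Haystack itself: the `Doc` class, the `docs_to_answers` function, and the field names of the resulting record are assumptions made for this example, not the library's actual schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""
    id: str
    content: str
    score: Optional[float] = None
    meta: Dict[str, Any] = field(default_factory=dict)


def docs_to_answers(docs: List[Doc]) -> List[Dict[str, Any]]:
    """Wrap each document in an answer-shaped record (field names are assumed)."""
    return [
        {
            "answer": "",            # a retriever extracts no answer span
            "context": doc.content,  # the document text stands in as context
            "score": doc.score,
            "document_id": doc.id,
            "meta": doc.meta,
        }
        for doc in docs
    ]


answers = docs_to_answers(
    [Doc(id="d1", content="Berlin is the capital of Germany.", score=0.9)]
)
```

The point of the design is that a client written against an answers payload can consume a Retriever-only pipeline without special-casing it.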

Module join_docs

JoinDocuments

class JoinDocuments(JoinNode)

A node that joins the documents output by multiple retriever nodes.

The node allows multiple join modes:

  • concatenate: combine the documents from multiple nodes. Any duplicate documents are discarded. The score is determined only by the last node that outputs the document.
  • merge: merge the scores of documents from multiple nodes. Optionally, each input score can be given a different weight, and a top_k limit can be set. This mode can also be used for "reranking" retrieved documents.
  • reciprocal_rank_fusion: combines the documents based on their rank in multiple nodes.
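The difference between the concatenate and reciprocal_rank_fusion modes can be sketched in plain Python. This is an illustration of the ideas only, not Haystack's implementation: the `Ranked` type, both function names, and the smoothing constant `k = 60` (a commonly used value for reciprocal rank fusion) are assumptions.

```python
from typing import Dict, List, Tuple

# One node's output: (document id, score) pairs, best first. Hypothetical shape.
Ranked = List[Tuple[str, float]]


def concatenate(results: List[Ranked]) -> Dict[str, float]:
    """Union of all documents; a later node overwrites an earlier node's score."""
    joined: Dict[str, float] = {}
    for ranked in results:
        for doc_id, score in ranked:
            joined[doc_id] = score
    return joined


def reciprocal_rank_fusion(results: List[Ranked], k: int = 60) -> Dict[str, float]:
    """Sum 1 / (k + rank) over every list in which a document appears."""
    fused: Dict[str, float] = {}
    for ranked in results:
        for rank, (doc_id, _score) in enumerate(ranked, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return fused


bm25 = [("a", 12.0), ("b", 7.5)]
dense = [("b", 0.91), ("c", 0.80)]

joined = concatenate([bm25, dense])            # "b" keeps the dense score
fused = reciprocal_rank_fusion([bm25, dense])  # "b" appears in both lists
best = max(fused, key=fused.get)
```

Rank-based fusion ignores the raw scores, which helps when the inputs use incomparable scales, as the two lists here do.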

JoinDocuments.__init__

def __init__(join_mode: str = "concatenate", weights: Optional[List[float]] = None, top_k_join: Optional[int] = None)

Arguments:

  • join_mode: concatenate to combine documents from multiple retrievers, merge to aggregate the scores of individual documents, or reciprocal_rank_fusion to apply rank-based scoring.
  • weights: A node-wise list (the length of the list must equal the number of input nodes) of weights for adjusting document scores when using the merge join_mode. By default, equal weight is given to each retriever score. This parameter is not compatible with the concatenate join_mode.
  • top_k_join: Limit documents to top_k based on the resulting scores of the join.
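To make the interaction of weights and top_k_join concrete, here is a minimal sketch of a weighted merge in plain Python. It is not Haystack's code: the `merge` function, the `Ranked` type, and the choice of `1 / n` as the default equal weight are assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple

# One node's output: (document id, score) pairs. Hypothetical shape.
Ranked = List[Tuple[str, float]]


def merge(results: List[Ranked],
          weights: Optional[List[float]] = None,
          top_k_join: Optional[int] = None) -> Ranked:
    """Weighted sum of per-node scores, sorted descending and cut to top_k_join."""
    if weights is None:
        # Assumed default: every input node contributes equally.
        weights = [1.0 / len(results)] * len(results)
    if len(weights) != len(results):
        raise ValueError("weights must have one entry per input node")
    merged: Dict[str, float] = {}
    for ranked, weight in zip(results, weights):
        for doc_id, score in ranked:
            merged[doc_id] = merged.get(doc_id, 0.0) + weight * score
    ordered = sorted(merged.items(), key=lambda item: item[1], reverse=True)
    return ordered[:top_k_join]  # a slice end of None keeps every document


sparse = [("a", 0.9), ("b", 0.4)]
dense = [("b", 0.8), ("c", 0.6)]
top = merge([sparse, dense], weights=[0.25, 0.75], top_k_join=2)
```

With these weights the dense node dominates, so "b" and "c" outrank "a" even though "a" has the highest single score.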

Module join_answers

JoinAnswers

class JoinAnswers(JoinNode)

A node to join Answers produced by multiple Reader nodes.

JoinAnswers.__init__

def __init__(join_mode: str = "concatenate", weights: Optional[List[float]] = None, top_k_join: Optional[int] = None, sort_by_score: bool = True)

Arguments:

  • join_mode: "concatenate" to combine Answers from multiple Readers. "merge" to aggregate scores of individual Answers.
  • weights: A node-wise list (length of list must be equal to the number of input nodes) of weights for adjusting Answer scores when using the "merge" join_mode. By default, equal weight is assigned to each Reader score. This parameter is not compatible with the "concatenate" join_mode.
  • top_k_join: Limit Answers to top_k based on the resulting scores of the join.
  • sort_by_score: Whether to sort the incoming answers by their score. Set this to True if your Answers are coming from a Reader or TableReader. Set to False if any Answers come from a Generator since this assigns None as a score to each.
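The reason sort_by_score matters can be shown with a small plain-Python sketch: a score of None cannot be ordered against a float. The `join_answers` function and the `Answer` tuple shape below are hypothetical and only mimic the "concatenate" behaviour described above.

```python
from typing import List, Optional, Tuple

# Hypothetical answer shape: (answer text, score or None).
Answer = Tuple[str, Optional[float]]


def join_answers(results: List[List[Answer]],
                 top_k_join: Optional[int] = None,
                 sort_by_score: bool = True) -> List[Answer]:
    """Concatenate answers from several nodes, optionally sorted by score."""
    joined = [answer for answers in results for answer in answers]
    if sort_by_score:
        # Comparing None with a float raises TypeError, so this path
        # only works when every answer carries a numeric score.
        joined.sort(key=lambda answer: answer[1], reverse=True)
    return joined[:top_k_join]


reader = [("Berlin", 0.95), ("Bonn", 0.20)]
table_reader = [("Paris", 0.50)]
generator = [("The capital is Berlin.", None)]

ranked = join_answers([reader, table_reader], top_k_join=2)
unsorted = join_answers([reader, generator], sort_by_score=False)
```

Calling `join_answers([reader, generator])` with the default `sort_by_score=True` fails in this sketch, which mirrors why the parameter should be False when a Generator is among the inputs.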

Module route_documents

RouteDocuments

class RouteDocuments(BaseComponent)

A node to split a list of Documents by content_type or by the values of a metadata field and route them to different nodes.

RouteDocuments.__init__

def __init__(split_by: str = "content_type", metadata_values: Optional[List[str]] = None)

Arguments:

  • split_by: Field to split the documents by, either "content_type" or a metadata field name. If this parameter is set to "content_type", the list of Documents will be split into a list containing only Documents of type "text" (will be routed to "output_1") and a list containing only Documents of type "table" (will be routed to "output_2"). If this parameter is set to a metadata field name, you need to specify the parameter metadata_values as well.
  • metadata_values: If the parameter split_by is set to a metadata field name, you need to provide a list of values by which to group the Documents. Documents whose metadata field is equal to the first value of the provided list will be routed to "output_1", Documents whose metadata field is equal to the second value of the provided list will be routed to "output_2", etc.
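The routing rule described by these two parameters can be sketched in plain Python as follows. The `route` function and the dictionary shape used for documents are assumptions for this example, and dropping documents that match no value is a choice made for the sketch, not a statement about Haystack's behaviour.

```python
from typing import Any, Dict, List, Optional

# Hypothetical document shape: {"content": ..., "content_type": ..., "meta": {...}}
Doc = Dict[str, Any]


def route(docs: List[Doc],
          split_by: str = "content_type",
          metadata_values: Optional[List[str]] = None) -> Dict[str, List[Doc]]:
    """Group documents into output_1, output_2, ... by type or metadata value."""
    if split_by == "content_type":
        values = ["text", "table"]  # text -> output_1, table -> output_2
    elif metadata_values:
        values = metadata_values
    else:
        raise ValueError("metadata_values is required for a metadata field")
    outputs: Dict[str, List[Doc]] = {
        f"output_{i + 1}": [] for i in range(len(values))
    }
    for doc in docs:
        if split_by == "content_type":
            key = doc.get("content_type")
        else:
            key = doc.get("meta", {}).get(split_by)
        if key in values:
            outputs[f"output_{values.index(key) + 1}"].append(doc)
        # Documents matching no value are dropped here (assumption of this sketch).
    return outputs


docs = [
    {"content": "intro", "content_type": "text", "meta": {"lang": "en"}},
    {"content": "prices", "content_type": "table", "meta": {"lang": "de"}},
]
by_type = route(docs)
by_lang = route(docs, split_by="lang", metadata_values=["de", "en"])
```

Here `by_type` sends "intro" to output_1 and "prices" to output_2, while `by_lang` reverses them because "de" is the first entry of metadata_values.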