Introduction

Overview

rtrapy is a Python library designed to streamline the process of Retrieval-Augmented Generation (RAG). It acts as a bridge between real-time web search results and Large Language Models (LLMs), allowing developers to generate responses based on the most current information available on the internet.

By leveraging the Google Custom Search API for data retrieval and Hugging Face’s inference endpoints for text generation, rtrapy provides a simple, unified interface for building applications that require up-to-date knowledge beyond a model's static training data.

Key Features

Real-time Web Retrieval: Connects directly to the Google Custom Search API to fetch the latest web snippets.
Context Aggregation: Automatically extracts and prepares relevant search context for model consumption.
Augmented Generation: Interfaces with high-performance LLMs (specifically Mistral-7B-Instruct) to generate detailed, context-aware responses.
Minimalistic API: Provides a high-level RTRAConnector class to handle the entire pipeline in just a few lines of code.

Getting Started

To use rtrapy, you will need a Google Cloud API Key and a Programmable Search Engine ID (CX).

Basic Usage

The primary entry point for the library is the RTRAConnector class.

from rtrapy.rtra_connector import RTRAConnector

# Initialize the connector with your Google Search credentials
connector = RTRAConnector(
    api_key="YOUR_GOOGLE_API_KEY",
    engine_id="YOUR_CUSTOM_SEARCH_ENGINE_ID"
)

# Generate a response based on real-time web data
query = "What are the latest developments in quantum computing as of 2024?"
response = connector.generate_detailed_response(query)

print(response)

API Reference

`RTRAConnector`

The main class used to interface with search engines and LLM providers.

`init(self, api_key, engine_id)`

Initializes the connector with necessary authentication.

api_key (str): Your Google Custom Search API key.
engine_id (str): Your Google Custom Search Engine ID (CX).

`search(self, query)`

Performs a raw search query against the Google Custom Search API.

Parameters: query (str) — The search term.
Returns: list — A list of raw result items (dictionaries) containing snippets, links, and titles.

`combine_search_results(self, query)`

Retrieves search results and concatenates the snippets into a single string for use as LLM context.

Parameters: query (str) — The search term.
Returns: str — A combined block of text from the top 5 search results.

`generate_detailed_response(self, query)`

The core method for RAG. It fetches real-time data for the query and passes it to the Mistral-7B-Instruct model to generate a final answer.

Parameters: query (str) — The user prompt or question.
Returns: str — The text generated by the LLM based on the retrieved web context.

Requirements

Python: 3.x
Libraries: requests
Credentials:
- Google Cloud Console API Key (with Custom Search API enabled).
- Google Custom Search Engine ID.