How to Build an AI Agent for Research Paper Retrieval, Search, and Summarization
For researchers, keeping up with the flood of new publications means finding a needle in a haystack. Imagine an AI-powered agent that not only retrieves the most relevant papers but also summarizes the key findings and answers your specific questions, all in real time.
TL;DR:
Build an agentic AI research assistant using Superlinked's vector search. It replaces complex multi-step RAG pipelines with direct semantic retrieval, making search faster, simpler, and smarter.
(Want to skip straight to the code? Check out the open source notebook on GitHub here. Ready to explore agentic semantic search for your own use case? We're happy to help.)
Get the open source code on GitHub. The article below shows how to build an agent system that uses a Kernel agent to route queries. If you'd rather jump in and run the code in your browser, here's the Colab notebook.
Why build an agentic retrieval system?
Typically, building such a system involves complexity and trade-offs. Retrieval systems usually fetch a large initial set of candidates ranked by relevance, then apply a secondary re-ranking step to refine the results. While re-ranking improves precision, it adds substantial computational cost, latency, and overhead, because a large amount of data has to be retrieved up front. Superlinked sidesteps this by combining numeric and categorical embeddings with semantic text embeddings into unified multimodal vectors, so relevance and recency can be captured in a single vector search.
Building the agent system with Superlinked
Our AI agent can perform three core tasks:
- Find Papers: Search for research papers by topic (e.g. “quantum computing”) and rank them by relevance and recency.
- Summarize Papers: Condense the retrieved papers into bite-sized insights.
- Answer Questions: Extract answers to targeted user queries directly from specific research papers.
The system uses Superlinked's RecencySpace, which encodes temporal metadata so that recent documents are prioritized at query time, removing the need for computationally expensive re-ranking. For example, if two papers have equal relevance, the one published more recently ranks higher.
Step 1: Set up the toolbox
%pip install superlinked
To keep things simple and extensible, we define an abstract Tool class. This streamlines the process of creating and adding new tools:
import pandas as pd
import superlinked.framework as sl
from datetime import timedelta
from sentence_transformers import SentenceTransformer
from openai import OpenAI
import os
from abc import ABC, abstractmethod
from typing import Any, Optional, Dict
from tqdm import tqdm
from google.colab import userdata
# Abstract Tool Class
class Tool(ABC):
    @abstractmethod
    def name(self) -> str:
        pass

    @abstractmethod
    def description(self) -> str:
        pass

    @abstractmethod
    def use(self, *args, **kwargs) -> Any:
        pass
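To make the interface concrete, here is a minimal tool built on the same abstract base. The `EchoTool` name and behavior are our own illustration, not part of the original system, and the base class is repeated inside the snippet so it runs standalone:

```python
from abc import ABC, abstractmethod
from typing import Any

# Repeated here only so the snippet is self-contained
class Tool(ABC):
    @abstractmethod
    def name(self) -> str: ...
    @abstractmethod
    def description(self) -> str: ...
    @abstractmethod
    def use(self, *args, **kwargs) -> Any: ...

# Hypothetical tool: echoes the query back, just to show the contract
class EchoTool(Tool):
    def name(self) -> str:
        return "EchoTool"

    def description(self) -> str:
        return "Returns the query unchanged; useful for wiring tests."

    def use(self, query: str) -> str:
        return f"echo: {query}"

tool = EchoTool()
print(tool.name(), "->", tool.use("hello"))  # EchoTool -> echo: hello
```

Every real tool below (retrieval, summarization, question answering) follows this same three-method contract.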
# Get the API key from Google Colab secrets, falling back to the environment
try:
    api_key = userdata.get('OPENAI_API_KEY')
except Exception:
    api_key = os.environ.get("OPENAI_API_KEY")

if not api_key:
    raise ValueError("OPENAI_API_KEY not found. Add it using Tools > User secrets or set the environment variable.")

# Initialize OpenAI Client
client = OpenAI(api_key=api_key)
model = "gpt-4"
Step 2: Load the dataset
For this example we use a dataset containing roughly 10,000 AI research papers. For convenience, just run the cell below and the data is downloaded automatically into your working directory. You can also use your own data, such as research papers or other academic documents; if you choose to do so, all you need to do is adjust the schema accordingly and rename the columns.
import pandas as pd
!wget --no-check-certificate 'https://drive.google.com/uc?export=download&id=1FCR3TW5yLjGhEmm-Uclw0_5PWVEaLk1j' -O arxiv_ai_data.csv
For now, to keep the examples quick to run, we use a small subset of the papers, but feel free to try the example with the full dataset. One important technical detail here is that the timestamps in the dataset are converted from string timestamps (such as '1993-08-01 00:00:00+00:00') to pandas datetime objects. This conversion is necessary because it enables date/time operations.
df = pd.read_csv('arxiv_ai_data.csv').head(100)
# Convert to datetime but keep it as datetime (more readable and usable)
df['published'] = pd.to_datetime(df['published'])
# Ensure summary is a string
df['summary'] = df['summary'].astype(str)
# Add 'text' column for similarity search
df['text'] = df['title'] + " " + df['summary']
Debug: Columns in original DataFrame: ['authors', 'categories', 'comment', 'doi', 'entry_id', 'journal_ref', 'pdf_url', 'primary_category', 'published', 'summary', 'title', 'updated']
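The conversion can be sanity-checked on a tiny stand-in frame (the row below is made up for illustration, not drawn from the real dataset):

```python
import pandas as pd

# Hypothetical mini-frame mimicking the dataset's raw string timestamps
mini = pd.DataFrame({
    "title": ["Quantum Search"],
    "summary": ["An algorithm for combinatorial search."],
    "published": ["1993-08-01 00:00:00+00:00"],
})

mini["published"] = pd.to_datetime(mini["published"])  # strings -> tz-aware datetimes
mini["summary"] = mini["summary"].astype(str)
mini["text"] = mini["title"] + " " + mini["summary"]   # combined column used for embedding

# Datetime accessors now work, which the recency scoring relies on later
print(mini["published"].dt.year.iloc[0])  # 1993
```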
Understanding the dataset columns
Below are the key columns in our dataset that matter for the next steps:
- title: The title of the research paper.
- summary: The paper's abstract, providing a concise overview.
- entry_id: The unique arXiv identifier for each paper.
For this implementation we rely on exactly four columns: entry_id, published, title, and summary. To improve retrieval quality, the title and summary are merged into a single, comprehensive text column, which forms the backbone of our embedding and search process.
Superlinked's in-memory indexer: Superlinked's in-memory indexing stores our dataset directly in RAM, which makes retrieval extremely fast and is ideal for real-time querying and rapid prototyping. For this proof of concept with 1,000 documents, the in-memory approach boosts query performance by avoiding delays from disk access.
Step 3: Define the Superlinked schema
Next, we need a schema to structure our data. We define PaperSchema with the key fields:
class PaperSchema(sl.Schema):
    text: sl.String
    published: sl.Timestamp  # This will handle datetime objects properly
    entry_id: sl.IdField
    title: sl.String
    summary: sl.String

paper = PaperSchema()
Defining Superlinked spaces for effective retrieval
A crucial step in organizing and querying our dataset effectively is defining two specialized vector spaces: TextSimilaritySpace and RecencySpace.
- TextSimilaritySpace: The TextSimilaritySpace is designed to encode textual information, such as the titles and abstracts of research papers, into vectors. By converting text into embeddings, this space greatly improves the ease and accuracy of semantic search. It is optimized specifically to handle longer text sequences efficiently, allowing precise similarity comparisons across documents.
text_space = sl.TextSimilaritySpace(
    text=sl.chunk(paper.text, chunk_size=200, chunk_overlap=50),
    model="sentence-transformers/all-mpnet-base-v2"
)
- RecencySpace: The RecencySpace captures temporal metadata, emphasizing how recently a paper was published. By encoding timestamps, this space gives greater weight to newer documents. As a result, retrieval results naturally balance content relevance with publication date, favoring more recent insights.
recency_space = sl.RecencySpace(
    timestamp=paper.published,
    period_time_list=[
        sl.PeriodTime(timedelta(days=365)),    # papers within 1 year
        sl.PeriodTime(timedelta(days=2*365)),  # papers within 2 years
        sl.PeriodTime(timedelta(days=3*365)),  # papers within 3 years
    ],
    negative_filter=-0.25
)
Think of RecencySpace as a time-based filter, similar to sorting your inbox by date or viewing the newest Instagram posts first. It answers the question, "How fresh is this paper?"
- Smaller timedeltas (like 365 days) allow more granular, time-based rankings.
- Larger timedeltas (like 1095 days) define broader time periods.
The negative_filter penalizes papers that are older than all of the defined periods.
To make this concrete, consider the following example, where two papers have identical content relevance but different publication dates:
Paper A: Published in 1996
Paper B: Published in 1993
Scoring example:
- Text similarity score: Both papers get 0.8
- Recency score:
- Paper A: Receives the full recency boost (1.0)
- Paper B: Gets penalized (-0.25 due to negative_filter)
Final combined scores:
- Paper A: Higher final rank
- Paper B: Lower final rank
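The arithmetic behind that example can be sketched in plain Python. Note that this is a simplification: the additive blend and the weights (1.0 for relevance, 0.5 for recency, mirroring the retrieval tool's settings) are illustrative assumptions, and Superlinked's internal score normalization differs in detail.

```python
def combined_score(text_sim: float, recency: float,
                   relevance_weight: float = 1.0,
                   recency_weight: float = 0.5) -> float:
    # Simplified additive blend of the two space scores
    return relevance_weight * text_sim + recency_weight * recency

# Paper A: recent enough to receive the full recency boost
score_a = combined_score(text_sim=0.8, recency=1.0)
# Paper B: older than every configured period, so negative_filter applies
score_b = combined_score(text_sim=0.8, recency=-0.25)

print(score_a, score_b)  # 1.3 0.675 -> Paper A ranks higher
```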
These spaces are what make the dataset truly searchable. They enable both content-based and time-based retrieval, and combine the two for a balanced view of relevance and freshness, giving us a powerful way to organize and query the dataset by both topic and publication time.
Step 4: Building the index
Next, the spaces are combined into an index, the backbone of the search engine:
paper_index = sl.Index([text_space, recency_space])
Then the DataFrame is mapped to the schema and loaded in batches (10 papers at a time) into the in-memory store:
# Parser to map DataFrame columns to schema fields
parser = sl.DataFrameParser(
    paper,
    mapping={
        paper.entry_id: "entry_id",
        paper.published: "published",
        paper.text: "text",
        paper.title: "title",
        paper.summary: "summary",
    }
)

# Set up in-memory source and executor
source = sl.InMemorySource(paper, parser=parser)
executor = sl.InMemoryExecutor(sources=[source], indices=[paper_index])
app = executor.run()

# Load the DataFrame with a progress bar using batches
batch_size = 10
data_batches = [df[i:i + batch_size] for i in range(0, len(df), batch_size)]
for batch in tqdm(data_batches, total=len(data_batches), desc="Loading Data into Source"):
    source.put([batch])
The in-memory executor is where Superlinked shines: 1,000 papers fit snugly in RAM, and queries run without disk I/O bottlenecks.
Step 5: Crafting the query
Next comes the query definition. This is where the query template is set up: one that can weigh both relevance and recency. Here's what it looks like:
# Define the query
knowledgebase_query = (
    sl.Query(
        paper_index,
        weights={
            text_space: sl.Param("relevance_weight"),
            recency_space: sl.Param("recency_weight"),
        }
    )
    .find(paper)
    .similar(text_space, sl.Param("search_query"))
    .select(paper.entry_id, paper.published, paper.text, paper.title, paper.summary)
    .limit(sl.Param("limit"))
)
With this, you can decide whether to prioritize relevance (relevance_weight) or recency (recency_weight), a combination that's very handy for our agent's needs.
Step 6: Building the tools
Now comes the tooling part. We'll build three tools:
- Retrieval Tool: This tool is wired to Superlinked's index, allowing it to pull the top 5 papers that match a query. It balances relevance (1.0 weight) and recency (0.5 weight) to accomplish the "find papers" goal. What we want is to find the papers that are relevant to the query. So, if the query is "What quantum computing papers were published between 1993 and 1994?", the retrieval tool will retrieve those papers and return them along with their summaries.
class RetrievalTool(Tool):
    def __init__(self, df, app, knowledgebase_query, client, model):
        self.df = df
        self.app = app
        self.knowledgebase_query = knowledgebase_query
        self.client = client
        self.model = model

    def name(self) -> str:
        return "RetrievalTool"

    def description(self) -> str:
        return "Retrieves a list of relevant papers based on a query using Superlinked."

    def use(self, query: str) -> pd.DataFrame:
        result = self.app.query(
            self.knowledgebase_query,
            relevance_weight=1.0,
            recency_weight=0.5,
            search_query=query,
            limit=5
        )
        df_result = sl.PandasConverter.to_pandas(result)
        # Ensure summary is a string
        if 'summary' in df_result.columns:
            df_result['summary'] = df_result['summary'].astype(str)
        else:
            print("Warning: 'summary' column not found in retrieved DataFrame.")
        return df_result
Next up is the Summarization Tool. This tool is designed for cases where a condensed version of a paper is needed. To use it, provide a paper_id, the ID of the paper to summarize. If no paper_id is supplied, the tool cannot run, since those IDs are required to look up the corresponding papers in the dataset.
class SummarizationTool(Tool):
    def __init__(self, df, client, model):
        self.df = df
        self.client = client
        self.model = model

    def name(self) -> str:
        return "SummarizationTool"

    def description(self) -> str:
        return "Generates a concise summary of specified papers using an LLM."

    def use(self, query: str, paper_ids: list) -> str:
        papers = self.df[self.df['entry_id'].isin(paper_ids)]
        if papers.empty:
            return "No papers found with the given IDs."
        summaries = papers['summary'].tolist()
        summary_str = "\n\n".join(summaries)
        prompt = f"""
        Summarize the following paper summaries:\n\n{summary_str}\n\nProvide a concise summary.
        """
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0.7,
            max_tokens=500
        )
        return response.choices[0].message.content.strip()
Finally, we have the QuestionAnsweringTool. This tool chains the RetrievalTool to fetch relevant papers and then uses them to answer the question. If no relevant papers are found, it answers based on general knowledge instead.
class QuestionAnsweringTool(Tool):
    def __init__(self, retrieval_tool, client, model):
        self.retrieval_tool = retrieval_tool
        self.client = client
        self.model = model

    def name(self) -> str:
        return "QuestionAnsweringTool"

    def description(self) -> str:
        return "Answers questions about research topics using retrieved paper summaries or general knowledge if no specific context is available."

    def use(self, query: str) -> str:
        df_result = self.retrieval_tool.use(query)
        if 'summary' not in df_result.columns:
            # Tag as a general question if summary is missing
            prompt = f"""
            You are a knowledgeable research assistant. This is a general question tagged as [GENERAL]. Answer based on your broad knowledge, not limited to specific paper summaries. If you don't know the answer, provide a brief explanation of why.

            User's question: {query}
            """
        else:
            # Use paper summaries for specific context
            contexts = df_result['summary'].tolist()
            context_str = "\n\n".join(contexts)
            prompt = f"""
            You are a research assistant. Use the following paper summaries to answer the user's question. If you don't know the answer based on the summaries, say 'I don't know.'

            Paper summaries:
            {context_str}

            User's question: {query}
            """
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0.7,
            max_tokens=500
        )
        return response.choices[0].message.content.strip()
Step 7: Building the Kernel Agent
Finally, the Kernel Agent. The Kernel Agent acts as the central controller, keeping the system coherent and efficient. In setups where multiple agents run in parallel, it coordinates communication and routes queries between them. In a single-agent system like ours, the Kernel Agent simply invokes the appropriate tool based on the type of query.
class KernelAgent:
    def __init__(self, retrieval_tool: RetrievalTool, summarization_tool: SummarizationTool, question_answering_tool: QuestionAnsweringTool, client, model):
        self.retrieval_tool = retrieval_tool
        self.summarization_tool = summarization_tool
        self.question_answering_tool = question_answering_tool
        self.client = client
        self.model = model

    def classify_query(self, query: str) -> str:
        prompt = f"""
        Classify the following user prompt into one of the three categories:
        - retrieval: The user wants to find a list of papers based on some criteria (e.g., 'Find papers on AI ethics from 2020').
        - summarization: The user wants to summarize a list of papers (e.g., 'Summarize papers with entry_id 123, 456, 789').
        - question_answering: The user wants to ask a question about research topics and get an answer (e.g., 'What is the latest development in AI ethics?').

        User prompt: {query}

        Respond with only the category name (retrieval, summarization, question_answering).
        If unsure, respond with 'unknown'.
        """
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0.7,
            max_tokens=10
        )
        classification = response.choices[0].message.content.strip().lower()
        print(f"Query type: {classification}")
        return classification

    def process_query(self, query: str, params: Optional[Dict] = None) -> str:
        query_type = self.classify_query(query)
        if query_type == 'retrieval':
            df_result = self.retrieval_tool.use(query)
            response = "Here are the top papers:\n"
            for i, row in df_result.iterrows():
                # Ensure summary is a string and handle empty cases
                summary = str(row['summary']) if pd.notna(row['summary']) else ""
                response += f"{i+1}. {row['title']} \nSummary: {summary[:200]}...\n\n"
            return response
        elif query_type == 'summarization':
            if not params or 'paper_ids' not in params:
                return "Error: Summarization query requires a 'paper_ids' parameter with a list of entry_ids."
            return self.summarization_tool.use(query, params['paper_ids'])
        elif query_type == 'question_answering':
            return self.question_answering_tool.use(query)
        else:
            return "Error: Unable to classify query as 'retrieval', 'summarization', or 'question_answering'."
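The classify_query step above calls an LLM. For offline tests, or as a cheap fallback, a keyword heuristic can approximate the same routing. This stand-in is our own illustration, not part of the original system, and its keyword lists are arbitrary:

```python
def classify_query_offline(query: str) -> str:
    """Rough keyword heuristic mirroring the LLM classifier's three categories."""
    q = query.lower()
    if any(w in q for w in ("summarize", "summary of", "condense")):
        return "summarization"
    if any(w in q for w in ("find", "list", "papers on", "search")):
        return "retrieval"
    if "?" in q or q.startswith(("what", "how", "why", "when", "who")):
        return "question_answering"
    return "unknown"

print(classify_query_offline("Find papers on quantum computing"))  # retrieval
print(classify_query_offline("Summarize this paper"))              # summarization
print(classify_query_offline("What is the latest in AI ethics?"))  # question_answering
```

Swapping this in for classify_query lets you exercise process_query end to end without API calls, at the cost of far cruder routing.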
At this point, all the components of the Research Agent System are in place. The system can now be initialized by equipping the Kernel Agent with the right tools, after which the Research Agent System is fully operational:
retrieval_tool = RetrievalTool(df, app, knowledgebase_query, client, model)
summarization_tool = SummarizationTool(df, client, model)
question_answering_tool = QuestionAnsweringTool(retrieval_tool, client, model)
# Initialize KernelAgent
kernel_agent = KernelAgent(retrieval_tool, summarization_tool, question_answering_tool, client, model)
Let's test the system:
# Test query
print(kernel_agent.process_query("Find papers on quantum computing in last 10 years"))
This query automatically triggers the RetrievalTool. It fetches the most relevant papers, ranked by relevance and recency, and returns the results. If the results include summary content (indicating that the papers come from the dataset), those summaries are returned along with the titles.
Query type: retrieval
Here are the top papers:
1. Quantum Computing and Phase Transitions in Combinatorial Search
Summary: We introduce an algorithm for combinatorial search on quantum computers that
is capable of significantly concentrating amplitude into solutions for some NP
search problems, on average. This is done by...
2. The Road to Quantum Artificial Intelligence
Summary: This paper overviews the basic principles and recent advances in the emerging
field of Quantum Computation (QC), highlighting its potential application to
Artificial Intelligence (AI). The paper provi...
3. Solving Highly Constrained Search Problems with Quantum Computers
Summary: A previously developed quantum search algorithm for solving 1-SAT problems in
a single step is generalized to apply to a range of highly constrained k-SAT
problems. We identify a bound on the number o...
4. The model of quantum evolution
Summary: This paper has been withdrawn by the author due to extremely unscientific
errors....
5. Artificial and Biological Intelligence
Summary: This article considers evidence from physical and biological sciences to show
machines are deficient compared to biological systems at incorporating
intelligence. Machines fall short on two counts: fi...
Let's try another query; this time, a summarization one:
print(kernel_agent.process_query("Summarize this paper", params={"paper_ids": ["http://arxiv.org/abs/cs/9311101v1"]}))
Query type: summarization
This paper discusses the challenges of learning logic programs that contain the cut predicate (!). Traditional learning methods cannot handle clauses with cut because it has a procedural meaning. The proposed approach is to first generate a candidate base program that covers positive examples, and then make it consistent by inserting cut where needed. Learning programs with cut is difficult due to the need for intensional evaluation, and current induction techniques may need to be limited to purely declarative logic languages.
Hopefully this example proves useful as you develop your own AI agents and agent-based systems. Much of the retrieval functionality shown here is powered by Superlinked, so consider starring the Superlinked repository for future projects where accurate retrieval matters to your AI agents!
Key takeaways
- Combining semantic and temporal relevance eliminates expensive re-ranking while keeping search results aligned with current research.
- Time-based penalties (negative_filter=-0.25) deprioritize outdated research when topics have similar content relevance.
- The modular, tool-based architecture lets individual components handle distinct tasks (retrieval, summarization, question answering) while preserving overall system cohesion.
- Loading data in small batches (batch_size=10) with progress tracking keeps ingestion smooth when working with research datasets.
- Adjustable query weights let users tune the balance between relevance (1.0) and recency (0.5) to fit specific research needs.
- The question-answering component gracefully falls back to general knowledge when paper-specific context is unavailable, preventing dead-end user interactions.
Keeping up with the large and ever-growing body of published research is both challenging and time-consuming. An agentic AI assistant workflow that efficiently retrieves relevant papers, summarizes key insights, and answers specific questions about them can greatly streamline that process.
Contributors
- Vipul Maheshwari, author
- Filip Makraduli, author