Ukucaciswa
Kwiintsuku ezidlulileyo, kwakhona i-AI Agent ebangeni. I-Intelligent Internet okanye i-II-agent, ebizwa ngokuba yi-intelligentest ehlabathini kodwa ngokupheleleyo kwi-open-source! Ngoko ke yintoni? Yintoni i-intelligent Internet iye yenza kakhulu kwi-intelligent Internet.I-Agent ye-Autonomous AIizifundo ze-intelligibility? Yintoni i-stack kwi-agents ezininzi ze-AI njenge-I-OpenAI I-Deep ResearchUkucingaUkucingaYayeUkucinga?
Kule nqaku, siza kuthetha kuphela imibuzo ezidlulileyo, kodwa siye kwenziwe iimeko ezininzi zokusebenza kwihlabathi yokwenene ukuba ama-AI-agents ziyafumaneka.
Ngokutsho kwe-Agent III-blog ye-Official Post, ibandakanya i-AI Agents ezininzi kwi-GAIA.
I-GAIA yi-benchmark ye-General AI Assistants enikeza imibuzo yehlabathi ebonakalayo ebonakalayo iinkonzo ezisemgangathweni, njenge-multimodality handling, i-web browsing, kunye ne-tool-use proficiency - imibuzo ebonakalayo kumadoda kodwa ebonakalayo kwi-AI ezininzi ezihlangeneyo, kunye ne-respondents ezininzi ze-92% ngokufanelekileyo kunokuba yi-15% kuphela kwi-GPT-4 eyenziwe nge-plugins.
Yintoni i-Agent II
Iimpawu kunye neziphumo ezizodwa
I-II-Agent yinkqubo ye-intelligent assistant ye-open-source enikezela kwi-benchmarks ye-agents ezinikezwayo nangokuthintela ngokugqithisileyo kunye ne-auditability epheleleyo. Iimpawu zayo zangaphakathi zihlanganisa uphando kunye ne-fact-checking kunye ne-web search ye-multi-step, ukuvelisa inkonzo kuquka iinkcukacha ze-blog kunye neengxaki ze-technical, ukuhlaziywa kwedatha kunye ne-reporting ye-automated, ukuthuthukiswa kwisoftware kwiilwimi ezininzi, i-workflow automation, kunye ne-advanced problem-solving.
Ukusetyenziswa kwe-architecture ye-Anthropic ye-Claude 3.7 Sonnet kwi-core yayo, ukulawula i-context ye-intelligent enokufumana i-120,000 i-token, i-automatization ye-browser ephakamileyo kunye ne-vision capabilities, kunye ne-real-time WebSocket yokuxhumana kwimeko ze-use interactive.
Umgangatho we-competitive vs. umgangatho we-leading:
I-II-Agent ibekwe njenge-openness-source ye-alternative kwi-cutting-edge-source-agents ezifana ne-Manus (eyenza i-65% kwi-GAIA benchmark) kunye ne-GenSpark AI. Nangona iimpumelelo ezidlulileyo ze-open-source ezifana ne-OpenManus yi-Foundation Agent kunye ne-Suna yi-Kortix-AI "benzisa ukuguqula iinkonzo ze-Manus kunye ne-GenSpark", akukho na "kwixesha lokugqibela ukufikelela ukusebenza kwayo okanye ukufumana iziphumo ezinxulumeneyo ze-benchmark".
I-II-Agent ibophelela le ingxaki ngokusebenzisa kuphela ukufaneleka, kodwa ukwandisa ukusebenza kwe-benchmark ngelixa ukugcina izinzuzo ezibalulekileyo ze-open-source, i-transparency, kunye ne-extensible - iimpawu ezininzi ezininzi ezininzi ze-industries ezihlawulwe kunye ne-maximum ye-adoption.
Umzekelo Usetyenziswa Case
Technical ukwakhiwa
Ngaba ufumana i-GitHub repository ye-II-agentYiba.
I-High Level Architecture
I-II-Agent inikeza izixhobo ezimbini eziphambili: isixhobo se-command-line (cli.py) yokusebenzisana kwe-terminal-based ngqo kunye ne-WebSocket server (ws_server.py) enikeza i-Frontend esekelwe kwi-React ebonakalayo kwi-frontend / i-directory (UkucingaUkulungiselela i-agent core ye-src/ii_agent/, leyo i-orchestrates ye-LLM (kwi-LLM/i-clients), i-planning kunye ne-reflection logic, kunye ne-dispatch yeengxaki ze-execution (i-tools/defined) kunye ne-utils/ (UkucingaI-Real-time communication iyatholakala nge-WebSockets, kunye neengxaki zeengxaki ze-agent zihlanganisa iziganeko kwi-user interface ye-browser (Ukucingainguqulelo
I-Architecture ye-internal eyahlukileyo
Nceda ukhangele iimoduli ezithile kwisakhiwo ye-II-Agent repository.
Agents Module (src/ii_agent/agents/): Le module ibandakanya iimveliso ze-agent core ezisebenza njenge-component ye-orchestration ye-central, ukulawula i-cycle ye-agent nge-reasoning ye- iterative kunye ne-capability selection. I-agent ibandakanya i-system prompting kunye ne-context eyenziwe ngokugqithisileyo, ibandakanya i-history ye-interaction epheleleyo, kwaye zihlanganisa ne-LLM ukucacisa imiyalelo ye-user kunye nokuqinisekisa indlela yokusebenza kunye neengxaki. Yonke i-agent instance isebenza ngaphakathi kwindawo zokusebenza ezizahlukileyo kwaye ibandakanya i-flow ukusuka kwi-user input nge-tool execution ukuya kwi-response ye-final delivery.
LLM Interface Module (src/ii_agent/llm/): Le module inikeza isakhiwo se-interface yokusebenza kunye neemodeli ezininzi ze-language, eyenzelwe ngokukodwa kwi-Anthropic's Claude 3.7 Sonnet. Ixhaswa kunye ne-API ye-Anthropic kunye ne-Google Cloud Vertex AI routing, ukulawula i-authentication, i-formatting ye-request, kunye ne-response processing. I-module ivumela ukulawula i-context ye-intelligent kunye ne-token usage estimation kunye ne-truncation strategic ukuxhaswa kwi-120,000 i-token ngexesha lokugcina umgangatho ne-conversation kunye ne-relevance.
Tools Module (src/ii_agent/tools/): Le module ibandakanya iinkcukacha ezininzi zokusebenza eziza kufumaneka umxholo wayo, kuquka ukusebenza kwinkqubo yeefayile nge-intelligent editing code, ukuguqulwa kwe-command line kwiingqungquthela ezigqityiweyo, ukuxhaswa kwewebhu kunye ne-browser automation, kunye neengcaciso ezizodwa kwiimodeli ezahlukeneyo. Izixhobo zinxhaswa i-PDF text extraction, i-audio transcription, i-speech synthesis, i-image generation, i-video generation, kunye ne-deep research integration. Izixhobo ze-browser automation zibonwe kwi-Playwright kunye ne-vision integration, okuvumela i-agent ukuxhaswa kwiifayile zewebhu, ukuthatha i
Utilities Module (src/ii_agent/utils/): Le module inikeza isakhiwo sokuxhomekeke kuquka inkqubo yokulawula inkxaso elandelayo ukuhlaziywa kwe-token kunye ne-optimization, iinkqubo ze-archival ezisekelwe kwifayile ze-outputs ezininzi, i-WebSocket i-communication protocols ye-interaction ye-time, kunye nezinye iimpawu ze-helper ye-data processing kunye ne-system operations. Izixhobo zihlanganisa kwakhona iinkqubo ze-planning kunye ne-reflection eziza ukucaciswa kwe-structured, i-problem decomposition, kunye ne-transparent decision-making processes ezibonisa i-audit trail ye-problem-solving ye-agent.
Ukusuka kwi-Other Open-Source Frameworks
I-II-Agent ifumaneka ngokugqithisileyo kwiinkonzo ezininzi ze-open-source kunye neebhizinisi. Kuba i-LLM ye-backends, i-Anthropic Claude ngokusebenzisa i-Anthropic Python SDK kunye ne-Google's Vertex AI ngokusebenzisa i-cloud-aiplatform client, kwaye ibonelela i-plug-and-play search providers ezifana neTavily, Jina, Firecrawl kunye neSerpAPI (Ukucinga). Kwi-frontend kunye ne-tooling, ibekwe kwi-React / Node.js stack (i-Next.js conventions) kwaye isebenzisa iiprojekthi ezisemthethweni - ikakhulukazi i-AugmentCode ye-software-engineering task orchestration, i-Manus's prompt-architecture patterns, kunye ne-Index Browser Use toolkit for advanced web interactions (Ukucingainguqulelo
I-Easy Test Case
Okokuqala ngaba ufuna ukuyifaka i-test case elula ukuyifaka i-base line. Kwaye akukho i-test case elula kunokuba i-Nokia Snake game. Nangona i-example case yaye i-example case yaye ndandisa i-twist elincane ukuba kufuneka yi-hTML5 mobile game.
[Prompt]:Ukuguqulwa kwimidlalo ye-Nokia Snake njenge-HTML5 web game
And it was a success from the first attempt!Ngaphandle kokuba umdlalo igqityiwe, i-Nokia iye yongezelela iimpawu zokusetyenziswa kwifowuni ezifanelekileyo ezibonisa umdlalo kwi-retro ye-authenticity.
Author’s Notes: Njengomgangatho, lo mfuneko efanayo kwi-Claude Sonnet 3.7 ikhiqiza isicelo esifanayo, kodwa ibonisa ngokuzenzekelayo "I-Game Over" xa umdlali udlala. I-Algorithm ye-collision detection yayo ayikho kwenziwe ngokufanelekileyo, kwaye le mdlalo ayikwazanga.
Ndiyathanda ngokugqithisileyo indlela yokusebenza kwe-Agent II:
First the II-agent created a plan(Ngoku zonke iinkcukacha ezininzi kwi-Agent Autonomous):
- Yenza isakhiwo esisodwa se-HTML kunye ne-canvas element
- Yenza i-CSS Styles ye-Layout ye-Game
- Implement the core Snake game logic in JavaScript:
- Snake movement and controls
- Food generation
- Collision detection
- Score tracking
- Game over handling
- Snake movement and controls
- Ukongeza iimpawu ze-Nokia-style (i-pixel art, i-color scheme)
- Ukuqhagamshelwano mobile-friendly
- Test umdlalo ngokugqithisileyo
- Yenza umdlalo ukuze unako ukudlala
Okulandelayo yenza iifayile ezinxulumene kunye nokuveliswa ikhodi. Kwaye ekugqibeleni itshintshwe njenge-web app ukuze ndiyabakhokela. I-II-agent webui ibala Claude, apho unokufuneka iifayile ezintathu eziphambili kwisithili: Browser, Code, Terminal.
Iingxaki ezininzi ze-test
Ngenxa yokuba umdlalo weNokia Snake iyathunyelwe kakuhle kwaye i-tested kwiinkqubo ezininzi ze-LLM kunye ne-AI Agents, kunokwenzeka ukwenza i-test case engaphezulu kwe-coding elula.
Indawo yam ukuba i-Agent IIbuild a website for Nokia’s historical feature phones, leyo ingxaki kuqukaresearch(iintlobo ze-Nokia zibonisa iifowuni ze-Nokia),design(Ukuhlolwa kwewebhu)codingYayebrowser use(Skrolela kwi-Internet ukuze ufumane iifoto kunye ne-specifics zeemveliso, ubonise kwiwebhusayithi lokugqibela).
Ngoko, iziphumo zibonisa ukuba nathi kwakhona umzila elide.
Ukusebenza Okokuqala
[PromptinguquleloYenza iwebhusayithi ye-Nokia feature phone museum, ezibonisa zonke iifowuni ze-Nokia feature phones kwifomati ye-gallery ye-product. Ngokutsho kwimodeli ye-telephone, ukubonisa i-image ye-product, yonyaka lokuvelisa, iiyunithi ezidlulileyo, kunye nesibhozo esincinane yeengxaki ze-key features.
Yenza iplani elifanelekileyo ukuxhaswa le ngxaki
- Research Nokia feature phone models to collect information on:
- Product images
- Launch years
- Units sold
- Key features for descriptions
- Design the website layout:
- Create a responsive main page with an introduction to Nokia feature phones
- Implement a product gallery section to display all phone models
- Include individual phone detail sections
- Add navigation and search functionality
- Develop the website using:
- HTML for structure
- Tailwind CSS for styling
- JavaScript for any interactive elements (if needed)
- Test the website for:
- Responsiveness
- Proper display of information
- Navigation functionality
- Ukusetyenzisa iwebhusayithi kunye nokunika i-URL ekugqibeleni
Emva koko uye kwenziwa uphando, uphando eningi. Ndingathanda kwi- "Browser" ifayibha leyo i-browsing ye-Internet ngexesha elide,But eventually it failed and stoppedUmzekelo? Umzekelo we-browsing umzekelo we-timed kwi-website ye-slow.
The AI agent’s plan did not include any exception handling and error recovery mechanism.
Umzila wokuqala
Ngenxa yokuba ngexesha lokuqala, i-agent yenzelwe kakhulu ukufumana iinkcukacha ze-Nokia phones, inkxaso yami yesibini yenzelwe ukunciphisa indawo kwi-5 phones ezidlulileyo.
[Prompt]:Yenza iwebhusayithi ye-Nokia feature phone museum, ibonisa i-Nokia's Top 5 iifowuni eziqhelekileyo kwizithuba ye-Gallery ye-product. Ukubonisa i-image ye-product, i-launch year, kunye ne-paragraph one short description of key features. Ukubonisa imifanekiso ye-product, i-lancer year, kunye ne-paragraph one short description of key features.
Top 5 iifowuni eziqhelekileyoKwixesha le mveliso yandibana kwaye yandibana kwakhona, ngenxa yeyiphi na isizathu. I-Anthropic rate limit itshintshwe, kwaye itshintshwe i-error. Kwakhona umvelisi we-AI ayinayo nayiphi na ingxaki yokuthumela kunye ne-recovery mechanism,
Ukongezelela, kukho enye ingxaki glaring. I-AI agentdid NOT follow user instructionUkunciphisa ukucacisa kuphela "i-Top 5 phones", kwaye ngexesha ukucacisa zonke iifowuni ze-feature.
Ukuzama Third
Ukufikelela kwexesha elidlulileyo kunokuba yinto yokuzonwabisa, ndingathanda ngoko ngoko ngoko ngoko akhawulezisa i-agent ukuvelisa iwebhusayithi ye-mockup yokuqala, kwaye emva kokufunda iimodeli ze-telefoni? Ngokwenza oku, sinokufumana okungenani i-Proof of Concept kwimeko yokufunda iimodeli ye-telefoni kwakhona. Ngokufanelekileyo sinxulumene iintlawulo zeeyunithi ezimbini yokufumana le nqaku.
[Prompt]:Yenza i-website ye-Nokia feature phone museum, enikezela iifowuni ze-5 ze-Nokia kwi-product gallery format. Ngokutsho kwimodeli ye-telephone, ukubonisa i-image ye-product, i-lanced year, kunye ne-a-paragraph short description of key features. Qinisekisa:
1) Ukucinga Okokuqala ukwakha website mock up ngaphandle data real
Okokuqala ukwakha website mock up ngaphandle data real2) Xa ucwaningo iimveliso yeefayili Nokia, kuphela ucwaningo iimveliso ezimbini, kuphela 5
Ngoko ke, ingxenye yokuqala yokwenza iwebhusayithi ye-mock-up ibonakala ukuba isebenza kakuhle:
Kodwa xa ixesha ukulayisha iwebhusayithi, kukho inkinga. Isebenzisa i-port 8000 ngenxa ye-hTTP server iingxaki kunye ne-ii-agent ye-backend ye-server yayo. Ngenxa yokuba akukho indlela yokuba i-II-agent kwaye ukuguqulwa kwinombolo ye-port kwi-fly, ndiya kuqala isixeko esitsha ukuze ufake imiyalezo yokuguqulwa.
Uhlobo lwe-4
[Prompt]: _Make website for Nokia feature phone museum, ezibonisa iintlobo ezimbini eziphambili ze-Nokia kwi-product gallery format. Ngokutsho kwimodeli ye-telephone, ukubonisa imifanekiso ye-product, yonyaka lokuvelisa, kunye ne-a paragraph short description of key features. Qinisekisa:
- Okokuqala ukwakha website mock up ngaphandle data real, usebenzisa port 8888 for web server
- Xa ucwaningo iimveliso yeefoni ye-Nokia, iimveliso ezininzi ezimbini kuphela, kuphela 5
Iwebhu ye-mock-up ngokwenene isebenza, kwaye ibonakala kakuhle.
Ukucaciswa kwewebhu okuzenzakalelayo ayisebenza, ngoko ke ngenxa ye-process management kwi-Terminal tab. Ngexesha elide, emva kweengcinga ezininzi ezifanelekileyo, ilandelayo kunye ne-life and said:
"Ndinga ukuqalisa i-web server ye-local kwi-port 8888, kodwa ndiya kuthatha iinkcukacha. Oku akufanele ukuvelisa kwisayithi yayo, njengoko ngoku ndiyaqhubeka ukuhlola iifowuni ze-5 ze-feature ze-Nokia ukongeza iinkcukacha zayo. "
"Ndinga ukuqalisa i-web server ye-local kwi-port 8888, kodwa ndiya kuthatha iinkcukacha. Oku akufanele ukuvelisa kwisayithi yayo, njengoko ngoku ndiyaqhubeka ukuhlola iifowuni ze-5 ze-feature ze-Nokia ukongeza iinkcukacha zayo. "
I felt a glimpse of hope for AI agents now.
Kwixesha lokugqibela, isicelo sokuVavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavavav
Fantasy vs Reality - izifundo zayo
We have always wanted very intelligent AI Agents that can be truly autonomous and do wonderful things on our behalf. But the reality is that we are quite far from it.Yinto ayikwazanga ngokufanelekileyo ye-Agent II, kodwa i-Agents ye-AI ngokubanzi.
Kukho izinto ezithile ukuba umntu uyenza kakuhle kakhulu kwaye phantse ngempumelelo, ngelixa zonke i-state of art I-AI Agents zithembisa.
- Plannng: ukunciphisa izicwangciso ezininzi ezincinane, ukwenza uphando lokufanelekileyo kwezinye izicwangciso, ukudala ukubeka izicwangciso
- I-Rasoning: ukuyifaka into efanelekileyo kunye neengxaki, ukuyifaka i-root cause analysis, ukuyifaka iingxaki zeengxaki kunye neengxaki zeengxaki
- Ukusabela: ukunciphisa indawo, ukuguqulwa kwinkqubo, ukuhlola isisombululo se-alternative, ukuqhuba i-proof-of-concepts, ucebisa ukhuseleko xa kufuneka, ukuqala kwakhona
Ngoko ke, ngeemodeli ze-SOTA kunye neengcali ezidlulileyo, sinayo iindlela ezininzi ngoku:
- Co-Pilot: oku kuboniswa kakhulu ngokwenene ngempumelelo kunye ne-commercial. umzekelo, ukucaciswa nge-Cursor kunye ne-GitHub Co-Pilot; Microsoft Co-Pilot for Office 365.
- I-Human in the loop; isakhiwo esisodwa ebizwa ngokuba yi-Agent Inbox yi-LangChain, apho i-agent inokufutshane abantu xa ziquka okanye kufuneka uncedo. I-challenge ibonakala ukuba ukuba ama-Agents ze-AI ziyafumaneka ngokufanelekileyo ukuba bafumane i-interrupts.
Ngoko ke unayo, umzila wokuqala we-Autonomous AI Agent: ukusuka kwi-Deep Research ye-OpenAI, Manus, GenSpark, kwaye ngoku kwi-agent II. I-Baby isinyathelo ngokwenene, kodwa sisebenza.
Umbhali
Inyanga elinye emva kokwenza iimvavanyo ezidlulileyo kunye nenqaku, ndingathanda ukunika iimvavanyo emnqweno ezininzi. Le nqaku iye yenza (i-99%): iye yenza iwebhusayithi ye-mock-up, uyifumaneka ngempumelelo yokufunda iifowuni ze-5 ezidlulileyo, zibonise zonke iinkcukacha kunye nezithombe, kwaye ekugqibeleni iwebhusayithi!
Yintoni i-99% ngexesha le-100%? Umgca olugqibeleleyo emva kokuguqulwa kwewebhu: ukulungiselela ukulungiselela kwitest. Emva kokufanisa i-manual debugging, kuba ukuba i-II-agent yenza i-session entsha kwaye awukwazi ukufumana iifayile zewebhu yokuqala. Kodwa ukuqhuba iwebhu ngexesha nge-Python, kwaye i-screenshots ziye zithunyelwe.
Oku kubonakala njenge-bug elincane enokufumaneka.