August 21, 2024 – Meta has quietly deployed a new web crawler designed to scour the internet and collect vast amounts of data to train its artificial intelligence models. The crawler, dubbed Meta External Agent, was launched last month, according to three companies that track web scrapers.
Like OpenAI’s GPTBot, Meta’s new crawler harvests AI training data from the web, such as the text of news articles or conversations in online forums. Archival records show that Meta updated a developer-facing company website in late July, adding a tab that reveals the new crawler’s existence. Meta, however, has yet to publicly announce it.
Meta’s Llama stands as one of the largest large language models (LLMs). Although the company has not disclosed the training data used for its latest model, Llama 3, the initial version of the model relied on a massive dataset collected from various sources, including Common Crawl.
Earlier this year, Meta co-founder and CEO Mark Zuckerberg boasted during a financial earnings call that the company’s social platform has amassed a dataset for AI training that “exceeds Common Crawl.”
The emergence of the new crawler suggests that Meta’s existing data holdings may no longer suffice as the company works to update Llama and expand Meta AI, efforts that typically demand a continuous supply of fresh, high-quality training data.
According to data from Dark Visitors, nearly 25% of the world’s most popular websites have blocked GPTBot. In contrast, only about 2% have blocked Meta’s new crawler, indicating either wider acceptance of it or, more likely, a lack of awareness of Meta’s data collection efforts.
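The blocking figures above refer to `robots.txt` directives, the voluntary mechanism compliant crawlers consult before fetching pages. As a minimal sketch of how that check works, the snippet below uses only the Python standard library; the `robots.txt` content and the URLs are hypothetical, and the user-agent token `meta-externalagent` is an assumption based on the crawler's reported name rather than text from this article.

```python
# Sketch: how robots.txt blocking is evaluated, using Python's stdlib parser.
# The rules and URLs below are made up for illustration; "meta-externalagent"
# is an assumed user-agent token for Meta's new crawler.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: meta-externalagent
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A compliant crawler calls can_fetch() before requesting each page.
print(parser.can_fetch("GPTBot", "https://example.com/article"))              # False
print(parser.can_fetch("meta-externalagent", "https://example.com/article"))  # False
print(parser.can_fetch("SomeOtherBot", "https://example.com/article"))        # True
```

Note that `robots.txt` is purely advisory: a named bot is blocked only if the site operator knows its user-agent token and adds a rule for it, which is one plausible reason so few sites currently block the new crawler.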