New initiative enhances AI access to Wikipedia information

By: Bitget-RWA, 2025/10/01 13:25

On Wednesday, Wikimedia Deutschland revealed a new database designed to make Wikipedia’s extensive information more easily available to AI systems.

Named the Wikidata Embedding Project, this platform utilizes a vector-based semantic search method—a process that enables computers to interpret the meanings and connections between words—on the vast data from Wikipedia and its related sites, which together hold close to 120 million records.
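To make the idea of vector-based semantic search concrete, here is a minimal sketch in Python. It assumes each term has already been turned into a numeric vector (an embedding) and ranks terms by cosine similarity to a query vector. The three-dimensional vectors and the names `TOY_INDEX` and `semantic_search` are invented for illustration; they are not the project's actual embeddings, model, or API.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings, made up for this example (real systems use
# hundreds or thousands of dimensions produced by a trained model).
TOY_INDEX = {
    "researcher": [0.9, 0.1, 0.0],
    "scholar": [0.6, 0.4, 0.2],
    "banana": [0.0, 0.1, 0.9],
}


def semantic_search(query_vector, index, top_k=2):
    """Rank index entries by similarity to the query and return the top labels."""
    scored = [
        (cosine_similarity(query_vector, vector), label)
        for label, vector in index.items()
    ]
    scored.sort(reverse=True)
    return [label for _, label in scored[:top_k]]


# Hypothetical embedding for the query "scientist".
QUERY_SCIENTIST = [0.85, 0.15, 0.05]

results = semantic_search(QUERY_SCIENTIST, TOY_INDEX)
print(results)  # ['researcher', 'scholar']
```

The point of the sketch is that "researcher" and "scholar" rank above "banana" even though none of them shares a keyword with the query, which is what separates semantic search from plain keyword matching.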

By integrating support for the Model Context Protocol (MCP)—a standard that enables AI to interact with data sources—the initiative allows LLMs to access the data through natural language queries more effectively.
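As a rough illustration of what an MCP interaction looks like on the wire, the sketch below builds a tool-call request in Python. MCP messages follow JSON-RPC 2.0, and tool invocations use the `tools/call` method. The tool name `search_items` and its `query` argument are assumptions made for this example; the article does not say which tools the Wikidata server exposes.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "search_items" and the "query" argument are hypothetical placeholders.
request = build_tool_call(
    request_id=1,
    tool_name="search_items",
    arguments={"query": "notable nuclear scientists"},
)

# Serialize the request as it would be sent to the server.
payload = json.dumps(request)
print(payload)
```

In practice an LLM client would first ask the server which tools exist and then fill in the name and arguments itself, so the model only has to produce the natural language query.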

Wikimedia’s German division developed the project in partnership with neural search company Jina.AI and DataStax, a real-time training-data company owned by IBM.

For years, Wikidata has provided machine-readable information from Wikimedia sites, but previous tools only supported keyword searches and SPARQL, a specialized query language. The updated system is better suited for retrieval-augmented generation (RAG) setups, which let AI models incorporate external knowledge, giving developers the ability to anchor their models in content reviewed by Wikipedia editors.

The data is organized to deliver essential semantic context. For example, searching for “scientist” in the database will yield lists of notable nuclear scientists, scientists affiliated with Bell Labs, translations of “scientist” in various languages, an approved Wikimedia image of scientists at work, and related terms like “researcher” and “scholar.”

Anyone can access the database on Toolforge. Additionally, Wikidata will host a webinar for developers interested in the project on October 9th.

This initiative arrives at a time when AI developers are urgently seeking reliable, high-quality data to refine their models. Training environments have grown more advanced, and are often built as intricate systems rather than simple datasets, but they still depend on carefully curated information. For applications demanding high precision, trustworthy data is crucial. While Wikipedia may have its critics, its content is far more fact-based than broad collections like Common Crawl, which aggregates vast numbers of pages scraped from across the web.

Sometimes, the pursuit of top-tier data can be costly for AI companies. For instance, in August, Anthropic agreed to pay $1.5 billion to settle a lawsuit with a group of authors whose works were used for training, resolving all related claims.

In a statement to the media, Wikidata AI project manager Philippe Saadé highlighted the project’s independence from major tech firms or leading AI labs. “The launch of this Embedding Project demonstrates that advanced AI doesn’t need to be dominated by a few corporations,” Saadé said. “It can be open, collaborative, and designed to benefit everyone.”
