AI Semantic Matching

Resolving the "Tower of Babel" Problem

Prediction markets rely on natural language to define events. However, different platforms describe the same event in different ways.

  • Polymarket: "Will Bitcoin hit $100,000 in 2024?"

  • Kalshi: "Bitcoin Price > $100,000 on Dec 31, 2024"

  • Limitless: "BTC/USD > 100k @ 2024 End"

To a computer comparing text literally, these are three unrelated strings. To a human, they clearly refer to the same underlying event. Oraclyst uses an AI Semantic Engine to bridge this gap.
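As a rough illustration of why literal comparison fails, the sketch below runs the three titles above through a naive normalizer and a character-level comparison from Python's standard library. The helper names (`normalize`, `char_overlap`) are hypothetical and are not part of Oraclyst; they only show that exact matching cannot link these markets.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Titles quoted from the three venues listed above.
TITLES = {
    "Polymarket": "Will Bitcoin hit $100,000 in 2024?",
    "Kalshi": "Bitcoin Price > $100,000 on Dec 31, 2024",
    "Limitless": "BTC/USD > 100k @ 2024 End",
}


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace (a deliberately naive normalizer)."""
    return " ".join(title.lower().split())


def char_overlap(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1], computed by difflib."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


if __name__ == "__main__":
    for (venue_a, a), (venue_b, b) in combinations(TITLES.items(), 2):
        exact = normalize(a) == normalize(b)
        print(f"{venue_a} vs {venue_b}: exact={exact}, overlap={char_overlap(a, b):.2f}")
```

Even after normalization, no pair matches exactly, and a character-level ratio says nothing about meaning ("BTC" and "Bitcoin" share few characters). That gap is what an embedding-based comparison is meant to close.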

The Matching Process

  1. Ingestion: The system continuously scrapes active markets from all supported venues.

  2. Vectorization: We use OpenAI's text-embedding-3-small embedding model to convert market titles and resolution rules into high-dimensional vectors (1,536 dimensions by default).

  3. Cosine Similarity: The engine compares the vector of a new market against the vectors of existing markets in our database. It calculates a "Similarity Score" from 0.00 to 1.00, where a higher score indicates closer semantic meaning.

  4. Clustering Logic:

    • Score > 0.98: The markets are considered identical. They are merged into a single Unified Event ID in the Oraclyst interface.

    • Score 0.85 to 0.98 (inclusive): The markets are flagged as "Related." They are sent to a human verification queue to confirm that the resolution sources (e.g., AP vs. Reuters) are compatible.

    • Score < 0.85: The market is treated as a unique, standalone event.
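The comparison and clustering steps above can be sketched as follows. This is a minimal illustration, not Oraclyst's actual implementation: it assumes the embedding vectors have already been computed, that the new market is compared against every stored vector with the best match deciding the tier, and that the database is a plain dictionary. All names (`cosine_similarity`, `classify`, `match_market`, the threshold constants) are hypothetical; only the 0.98 and 0.85 cutoffs come from the text.

```python
import math
from typing import Dict, List, Optional, Tuple

# Thresholds taken from the clustering rules above.
MERGE_THRESHOLD = 0.98   # strictly above: treated as identical
REVIEW_THRESHOLD = 0.85  # from here up to 0.98: sent to human review


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def classify(score: float) -> str:
    """Map a similarity score to one of the three tiers described above."""
    if score > MERGE_THRESHOLD:
        return "identical"
    if score >= REVIEW_THRESHOLD:
        return "related"
    return "unique"


def match_market(
    new_vector: List[float],
    existing: Dict[str, List[float]],
) -> Tuple[Optional[str], float, str]:
    """Return (best matching event ID, its score, tier) for a new market.

    Assumption: the single highest-scoring stored market decides the tier.
    """
    best_id: Optional[str] = None
    best_score = 0.0
    for event_id, vector in existing.items():
        score = cosine_similarity(new_vector, vector)
        if best_id is None or score > best_score:
            best_id, best_score = event_id, score
    if best_id is None:
        # Empty database: nothing to merge with, so the market stands alone.
        return None, 0.0, "unique"
    return best_id, best_score, classify(best_score)
```

A caller would merge the new market into the returned event ID when the tier is "identical", enqueue the pair for manual review when it is "related", and create a new standalone event otherwise. A linear scan is used here only for clarity; at the scale of thousands of markets, a vector index would likely replace it.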

This technology allows Oraclyst to aggregate liquidity without requiring manual data entry for thousands of markets.
