Thousands of Strains. 33 Dimensions. Zero Guesswork.

Most recommendation engines guess. They take a category — "Indica", "relaxing", "for sleep" — and return whatever happens to rank highest for that tag. Alfy doesn't guess. It computes. Under the hood is a kD-Tree powered by 33-dimensional emotional vectors, a semantic search engine that understands language the way you do, and an AI agent intelligent enough to know which tool to reach for. Here's exactly how it works.

Alfy is more than a recommender — it's a botanical intelligence trained across thousands of strains and 33 emotional dimensions.

The Database: Thousands of Strains, Scored Across 33 Dimensions

Everything starts with the data. Alfy's database holds thousands of cannabis strains, and for each one, every effect has been scored and normalised — not just named. This isn't a tag list. It's a numerical fingerprint.

1,000s of strains in the database

33 effect dimensions per strain

5 specialised search tools

Each strain record contains three categories of scored effects, measured as normalised floats between 0 and 100:

✨ 13 Positive Effects

Aroused, Creative, Energetic, Euphoric, Focused, Giggly, Happy, Hungry, Relaxed, Sleepy, Talkative, Tingly, Uplifted

🩹 14 Medical Conditions

Cramps, Depression, Eye Pressure, Fatigue, Headaches, Inflammation, Insomnia, Lack of Appetite, Muscle Spasms, Nausea, Pain, Seizures, Spasticity, Stress

⚠️ 6 Negative Effects

Anxious, Dizzy, Dry Eyes, Dry Mouth, Headache, Paranoid

Put them together and each strain becomes a point in 33-dimensional space — a coordinate that describes not just what it does, but how much it does it. Blue Dream isn't just "Creative." It's Creative: 87, Happy: 91, Euphoric: 78, Relaxed: 65, Energetic: 72. That precision is what makes the matching meaningful.
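A minimal sketch of how a strain record might be flattened into a 33-dimensional vector. The dimension names come from the three lists above; the `to_vector` helper and the fixed ordering are assumptions for illustration, not Alfy's actual schema. The Blue Dream scores are the five quoted above, and every other dimension defaults to 0 only to keep the example short.

```python
POSITIVE = ["Aroused", "Creative", "Energetic", "Euphoric", "Focused", "Giggly",
            "Happy", "Hungry", "Relaxed", "Sleepy", "Talkative", "Tingly", "Uplifted"]
MEDICAL = ["Cramps", "Depression", "Eye Pressure", "Fatigue", "Headaches",
           "Inflammation", "Insomnia", "Lack of Appetite", "Muscle Spasms",
           "Nausea", "Pain", "Seizures", "Spasticity", "Stress"]
NEGATIVE = ["Anxious", "Dizzy", "Dry Eyes", "Dry Mouth", "Headache", "Paranoid"]

# Assumed fixed ordering: every strain and every query uses the same axis order.
DIMENSIONS = POSITIVE + MEDICAL + NEGATIVE


def to_vector(scores: dict) -> list:
    """Return scores in DIMENSIONS order; unspecified dimensions default to 0.0."""
    unknown = set(scores) - set(DIMENSIONS)
    if unknown:
        raise KeyError(f"unknown dimensions: {sorted(unknown)}")
    return [float(scores.get(name, 0.0)) for name in DIMENSIONS]


blue_dream = to_vector(
    {"Creative": 87, "Happy": 91, "Euphoric": 78, "Relaxed": 65, "Energetic": 72}
)
print(len(blue_dream))  # number of coordinates in the vector
```

Keeping one shared axis order is the design choice that matters here: distances are only meaningful when coordinate i means the same effect for every strain and for the query.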

[Image: 3D grid of glowing data points representing thousands of strains plotted across 33 effect dimensions] Thousands of strains, each a point in 33-dimensional space, plotted by their effect scores across positive effects, medical conditions, and negatives.

The kD-Tree: Finding Your Nearest Neighbour in 33 Dimensions

When you tell Alfy you want to feel "Happy, Creative, and a little Relaxed," you're not giving it a search term — you're defining a coordinate. The system converts your selection into the same kind of 33-dimensional vector that every strain already occupies. Then it asks: which strain is closest?

That's the job of the kD-Tree (k-Dimensional Tree), a data structure built for fast nearest-neighbour search in multi-dimensional space. Its mathematical foundations were established in Bentley's seminal 1975 paper on multidimensional binary search trees.[1]

"A kD-Tree doesn't search every strain every time. It partitions space into regions, prunes impossible branches, and navigates to the answer in a fraction of the time. Across thousands of strains and 33 dimensions, it's effectively instant."

— How Alfy's RecommendByEffects tool works
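To make the partition-and-prune idea concrete, here is a small pure-Python kD-Tree sketch. It is not Alfy's implementation: the `build` and `nearest` functions, the three-dimensional toy points (Happy, Creative, Relaxed), and the strain labels are all hypothetical, chosen so the example stays readable. The same code accepts points of any dimension.

```python
import heapq
import math
from typing import NamedTuple, Optional


class Node(NamedTuple):
    point: tuple
    label: str
    axis: int
    left: Optional["Node"]
    right: Optional["Node"]


def build(items, depth=0):
    """Build a tree from (label, point) pairs, splitting on the median of one axis per level."""
    if not items:
        return None
    axis = depth % len(items[0][1])
    items = sorted(items, key=lambda it: it[1][axis])
    mid = len(items) // 2
    label, point = items[mid]
    return Node(point, label, axis,
                build(items[:mid], depth + 1),
                build(items[mid + 1:], depth + 1))


def nearest(root, query, k=5):
    """Return up to k (distance, label) pairs, closest first."""
    heap = []  # max-heap via negated distance: the worst of the current best k sits on top

    def visit(node):
        if node is None:
            return
        d = math.dist(node.point, query)
        if len(heap) < k:
            heapq.heappush(heap, (-d, node.label))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, node.label))
        diff = query[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        visit(near)
        # Prune: cross the splitting plane only if a closer point could lie beyond it.
        if len(heap) < k or abs(diff) < -heap[0][0]:
            visit(far)

    visit(root)
    return sorted((-neg_d, label) for neg_d, label in heap)


# Hypothetical strains scored on (Happy, Creative, Relaxed).
strains = [
    ("Strain A", (90, 85, 60)),
    ("Strain B", (40, 30, 95)),
    ("Strain C", (85, 90, 45)),
    ("Strain D", (20, 75, 30)),
    ("Strain E", (95, 60, 70)),
]
tree = build(strains)
for dist, name in nearest(tree, (100, 100, 50), k=3):
    print(f"{name}: {dist:.1f}")
```

In practice a library implementation such as `scipy.spatial.KDTree` would be the more likely choice. Note also that pruning tends to pay off less as the number of dimensions grows, which is one reason this toy uses three.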

Before the tree is built, every effect score is passed through a MinMaxScaler, which normalises it to a 0–100 range across the entire dataset. This ensures that a score of 75 on "Euphoric" and a score of 75 on "Pain" mean the same thing in geometric terms: 75% of the way from minimum to maximum for that dimension. Without normalisation, dimensions with wider natural ranges would dominate the distance calculation unfairly. This is a well-documented failure mode in any distance-based model, and the reason scikit-learn's preprocessing documentation treats feature scaling as a common requirement for many estimators.[3]
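The scaling step can be sketched in a few lines of plain Python. This is the standard min-max formula applied to one dimension at a time; the `min_max_scale` helper and the raw numbers are hypothetical, and scikit-learn's `MinMaxScaler(feature_range=(0, 100))` would be the library equivalent.

```python
def min_max_scale(column, lo=0.0, hi=100.0):
    """Rescale raw scores linearly so the column minimum maps to lo and the maximum to hi."""
    c_min, c_max = min(column), max(column)
    if c_max == c_min:
        # Constant column: there is no spread to preserve, so map everything to lo.
        return [lo for _ in column]
    span = c_max - c_min
    return [lo + (x - c_min) / span * (hi - lo) for x in column]


# Two hypothetical dimensions with very different raw ranges.
raw_euphoric = [2, 5, 8]
raw_pain = [10, 40, 25]

print(min_max_scale(raw_euphoric))
print(min_max_scale(raw_pain))
```

After scaling, the middle value of each column lands at the same point on the 0–100 axis, even though the raw ranges differ by a factor of five.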

Here's what that looks like for a real query: "I want to feel Happy, Creative, and Relaxed"

[Chart: 🔍 Query vector vs. Blue Dream's profile, compared on Happy, Creative, Relaxed, Euphoric, Energetic, and Stress. Euclidean distance to query for 🌿 Blue Dream: 18.4]
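For readers who want the distance formula itself, here is a hedged worked example. It uses the Blue Dream scores quoted earlier, assumes each selected effect is requested at full intensity (100), and restricts the calculation to the three selected dimensions. Those are simplifying assumptions, so this toy calculation is not expected to reproduce the 18.4 shown in the chart.

```python
import math

DIMS = ["Happy", "Creative", "Relaxed"]

query = {"Happy": 100, "Creative": 100, "Relaxed": 100}    # assumed full-intensity selection
blue_dream = {"Happy": 91, "Creative": 87, "Relaxed": 65}  # scores quoted in the article


def euclidean(a, b, dims):
    """Square root of the summed squared differences over the given dimensions."""
    return math.sqrt(sum((a[d] - b[d]) ** 2 for d in dims))


distance = euclidean(query, blue_dream, DIMS)
print(round(distance, 1))
```

The largest gap (Relaxed, 35 points) dominates the result because differences are squared before they are summed.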

[Image: kD-Tree partitioned space with a glowing green search path navigating to the nearest data point] The kD-Tree partitions space into regions and navigates directly to the nearest neighbour, pruning impossible branches without checking every strain.

The kD-Tree returns the 5 closest strains by Euclidean distance — not by opinion, not by trending ranking, not by sponsored placement. Pure geometry. Research on dynamic multi-dimensional spatial indexing confirms that balanced kD-Tree structures deliver predictable tail latency precisely because hyperplane pruning eliminates immense volumes of search space before the nearest-neighbour is reached.[2] The strain that lives closest to the emotional coordinate you described wins the recommendation.

And unlike a keyword search, this catches things that a label would miss. A strain that's 88 on Relaxed and 82 on Happy might never appear in results for "chill creative vibes" — but the kD-Tree finds it because its numbers put it right next to yours.

Emotional Literacy as Coordinates

The bridge between how you feel and what the kD-Tree searches is what Alfy calls emotional literacy mapping. Every option you select in the app — "🧘 Meditation", "🧠 Brain Waves", "😴 Zzz" — translates directly into an effect dimension at a specific intensity.

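# How "Meditation" becomes a search vector
# (MoodOption is defined here only so the snippet runs; its fields are inferred from the calls below)
from dataclasses import dataclass

@dataclass
class MoodOption:
    ui_label: str     # chip text shown in the app
    score_type: str   # score group the effect belongs to
    effect_name: str  # effect dimension the chip sets
    on_select: int    # intensity applied on selection (0-100)

meditation = MoodOption(
    ui_label="Meditation",
    score_type="EffectScores",
    effect_name="Relaxed",
    on_select=100,  # full intensity
)

# "Brain Waves" maps to:
brain_waves = MoodOption(
    ui_label="Brain Waves",
    score_type="EffectScores",
    effect_name="Creative",
    on_select=100,
)

# Combined: {"Relaxed": 100, "Creative": 100}
# → fed directly into the kD-Tree query vector
query = {opt.effect_name: opt.on_select for opt in (meditation, brain_waves)}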

Premium options work inversely — if you select "🍕 Avoid Munchies", the Hungry dimension is set to 0, pulling the results toward strains that score low on that dimension. You're not just describing what you want — you're also specifying what to avoid, and the geometry handles both simultaneously.
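A short sketch of how wanted and avoided effects might share one query vector. The `build_query` and `distance` helpers, the two strains, and the rule that only dimensions named in the query contribute to the distance are all assumptions made for this example.

```python
import math


def build_query(want, avoid):
    """Merge desired effects (at their intensity) with avoided effects (pinned to 0)."""
    overlap = set(want) & set(avoid)
    if overlap:
        raise ValueError(f"cannot both want and avoid: {sorted(overlap)}")
    query = dict(want)
    query.update({name: 0 for name in avoid})
    return query


def distance(query, strain):
    # Assumption: only dimensions present in the query contribute to the distance.
    return math.sqrt(sum((strain.get(d, 0) - v) ** 2 for d, v in query.items()))


query = build_query({"Relaxed": 100, "Creative": 100}, avoid=["Hungry"])

strains = {
    "Hypothetical A": {"Relaxed": 90, "Creative": 85, "Hungry": 80},
    "Hypothetical B": {"Relaxed": 85, "Creative": 80, "Hungry": 10},
}
ranked = sorted(strains, key=lambda name: distance(query, strains[name]))
print(ranked)
```

Strain A scores slightly higher on the wanted effects, but its high Hungry score pushes it far from the query, so Strain B ranks first.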

[Image: Wellness vibe finder on a phone with green selection chips] Each chip you tap becomes a coordinate. By the time you hit send, Alfy has already built your emotional vector and is calculating distances.

Five Specialised Tools — One Intelligent Agent

The kD-Tree is the precision instrument, but it's one of five tools the agent can reach for. Each is designed for a distinct type of question:

SearchStrainsByMood — natural-language semantic search
RecommendByEffects — kD-Tree nearest-neighbour on your effect vector
SearchByMedicalCondition — ranked lookup by medical dimension score
SearchStrainOnline — live web fallback for strains not in the database
GetStrainDetails — exact full-profile lookup by strain name

The art is in knowing which tool — or which combination — fits the question. That's the agent's job.

🔍 SearchStrainsByMood

Semantic vector search via Chroma. Every strain's description, effects, and metadata are embedded into a high-dimensional language vector. Your query is embedded too, and the closest strains by cosine similarity are returned.

Used when: you type in natural language — "something chill for a movie night"
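Cosine similarity, the ranking measure named above, can be sketched without any vector database. The three-number vectors below are hypothetical stand-ins for real embeddings, which would come from an embedding model and live in Chroma; none of that machinery is reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query_vec, docs, k=2):
    """Return the names of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in docs.items()]
    return [name for _, name in sorted(scored, reverse=True)[:k]]


# Hypothetical toy embeddings of strain descriptions.
docs = {
    "Hypothetical A": [0.9, 0.1, 0.0],
    "Hypothetical B": [0.1, 0.9, 0.2],
    "Hypothetical C": [0.7, 0.3, 0.1],
}
query_vec = [1.0, 0.2, 0.0]  # toy embedding of the user's phrase

print(top_matches(query_vec, docs))
```

Cosine similarity compares direction rather than magnitude, which suits text embeddings where vector length carries little meaning.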

📐 RecommendByEffects

The kD-Tree engine. Takes a numeric effect vector — built from your chip selections or the agent's interpretation — and finds the 5 nearest strains by Euclidean distance in 33-dimensional space.

Used when: specific effects or intensities are requested

🩹 SearchByMedicalCondition

Ranks all strains by their normalised score on a specific medical dimension. Returns the top matches by condition score — direct, ranked, no geometry needed.

Used when: a medical need is mentioned, such as pain, insomnia, or stress
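A ranked lookup of this kind needs no geometry, only a sort. The sketch below assumes each strain is a dictionary of normalised scores; the `search_by_condition` function and the data are hypothetical.

```python
def search_by_condition(strains, condition, top_n=3):
    """Return the top_n (name, score) pairs for one condition, highest score first."""
    scored = [(name, scores.get(condition, 0.0)) for name, scores in strains.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical normalised scores on two medical dimensions.
strains = {
    "Hypothetical A": {"Insomnia": 82.0, "Pain": 40.0},
    "Hypothetical B": {"Insomnia": 35.0, "Pain": 90.0},
    "Hypothetical C": {"Insomnia": 67.0, "Pain": 55.0},
}

print(search_by_condition(strains, "Insomnia", top_n=2))
```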

🌐 SearchStrainOnline

Tavily web search fallback. When a strain isn't in the local database — a new cultivar, a regional exclusive, something with an unusual name — Alfy reaches out to the web and synthesises what it finds.

Used when: the strain isn't in the local database

🔬 GetStrainDetails

Exact lookup by strain name. Full record: category, rating, description, all effect scores. When you know what you're looking for and just want the full profile.

Used when: you ask about a specific named strain

The Agent: Deciding Which Tool to Use

None of these tools matter if they're used at the wrong time. Alfy is built on a LangGraph ReAct agent, a loop that thinks before it acts: Reason → Act → Observe → Repeat. This paradigm, introduced in the paper "ReAct: Synergizing Reasoning and Acting in Language Models" (Yao et al., 2022), reduces hallucinations by grounding every response in actual tool observations rather than the model's parametric memory.[7]

For each query, the agent reads the conversation, decides which tool fits, calls it, reads the output, and decides whether to act again or synthesise a final answer. It runs this loop up to 20 times per query — deep enough to handle complex multi-condition requests without getting stuck.

🗣️ You say something

A chip selection, a typed message, or both. Your input arrives with 6 turns of conversation history for context.

🧠 Agent reasons

The LangGraph ReAct agent reads the full context and decides: is this a mood query (semantic search), an effects query (kD-Tree), a medical need, or a named strain lookup? It might call multiple tools in sequence.

📐 Tools execute

The chosen tool runs — the kD-Tree computes distances, Chroma finds semantic matches, or Tavily fetches the web. Results come back as structured data.

🔁 Iterate if needed

If the result is incomplete — e.g. a named strain isn't in the database — the agent calls another tool (web search) rather than giving up. Up to 20 tool calls per query.

✨ Synthesise the answer

The agent writes a warm, human response — 2–3 specific strains with explanations grounded in the actual data retrieved. No hallucination: every recommendation came from the database or the web.
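The loop described in the steps above can be sketched as a toy in plain Python. This does not use LangGraph's API: the `decide` function is a hypothetical rule-based stand-in for the language model's reasoning step, the web tool is a stub that makes no network request, and only the 20-step cap and the tool names come from the article.

```python
MAX_STEPS = 20  # the article states the loop runs up to 20 times per query

LOCAL_DB = {"Blue Dream": {"Creative": 87, "Happy": 91}}  # scores quoted in the article


def get_strain_details(name):
    return LOCAL_DB.get(name)  # None when the strain is not in the local database


def search_strain_online(name):
    return {"source": "web", "name": name}  # stub: no real web request is made


TOOLS = {
    "GetStrainDetails": get_strain_details,
    "SearchStrainOnline": search_strain_online,
}


def decide(name, observations):
    """Hypothetical policy standing in for the model: pick the next action."""
    if not observations:
        return ("call", "GetStrainDetails")
    last_tool, last_result = observations[-1]
    if last_result is None and last_tool == "GetStrainDetails":
        return ("call", "SearchStrainOnline")  # fall back to the web
    return ("answer", last_result)


def run_agent(name):
    """Return (answer, list of tools called) for a named-strain query."""
    observations = []
    for _ in range(MAX_STEPS):
        action, payload = decide(name, observations)  # Reason
        if action == "answer":
            return payload, [tool for tool, _ in observations]
        result = TOOLS[payload](name)                 # Act
        observations.append((payload, result))        # Observe
    return None, [tool for tool, _ in observations]   # step budget exhausted


print(run_agent("Blue Dream"))
print(run_agent("Unknown Cultivar"))
```

The step cap is what keeps a confused policy from looping forever; the fallback branch shows how a missing local record leads to a second tool call rather than a dead end.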

[Image: AI agent at the centre routing queries to multiple specialised tools: semantic search, kD-Tree, medical filter, web search] The ReAct agent at the centre: reading your query, choosing the right tool, observing the result, and deciding whether to iterate or answer.

Why This Is Better Than a Filter

Most cannabis apps let you filter by type and effect tag. That's useful, but it's brittle. Enterprise vector search analysis has documented how boolean filter logic acts as a binary gatekeeper — dropping matching entities entirely the moment a single criterion isn't met, producing false negatives at scale.[4] Filters work by exact match — if a strain isn't tagged "creative", it won't appear in the creative filter, even if its Creative score is 85. And they can't handle nuance — you can't tell a filter "I want something happy but not too energetic, and I sometimes get headaches."

The kD-Tree handles all of this naturally. "Happy but not too energetic" is just a vector where Happy is high and Energetic is low. "I sometimes get headaches" maps to keeping the Headache negative-effect dimension low. The geometry of the space handles the trade-offs — the nearest neighbour is the strain that best satisfies all your constraints simultaneously, not just the ones that are easy to filter for.
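The contrast with a tag filter can be shown on three hypothetical strains. The tags, scores, and the query intensities chosen for "happy but not too energetic, and I sometimes get headaches" are all assumptions for this sketch.

```python
import math

strains = {
    "Hypothetical A": {"tags": {"happy", "energetic"}, "Happy": 92, "Energetic": 88, "Headache": 5},
    "Hypothetical B": {"tags": {"relaxed"}, "Happy": 84, "Energetic": 30, "Headache": 4},
    "Hypothetical C": {"tags": {"happy"}, "Happy": 80, "Energetic": 35, "Headache": 40},
}


def tag_filter(tag):
    """Exact-match filter: a strain appears only if it carries the tag."""
    return [name for name, s in strains.items() if tag in s["tags"]]


def rank(query):
    """Order all strains by Euclidean distance to the query, closest first."""
    def dist(s):
        return math.sqrt(sum((s[d] - v) ** 2 for d, v in query.items()))
    return sorted(strains, key=lambda name: dist(strains[name]))


# Assumed intensities: high Happy, low Energetic, Headache as low as possible.
query = {"Happy": 100, "Energetic": 20, "Headache": 0}

print(tag_filter("happy"))
print(rank(query))
```

Strain B never appears in the "happy" filter because it lacks the tag, yet it ranks first by distance: its Happy score is high, its Energetic score is low, and its Headache score is the lowest of the three.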

"The difference between a filter and a vector search is the difference between a yes/no question and a distance. Filters exclude. Vectors rank. And ranking always finds something useful — even when nothing is a perfect match."

Vector index platforms have independently demonstrated this property: mapping preferences into a continuous geometric space means results are always ranked by proximity rather than eliminated by rigid Boolean constraints, making the system resilient to partial matches.[5][6]

The semantic search layer adds another dimension: it understands language. "Something for a creative Sunday morning" will find strains whose descriptions and effect profiles cluster around that concept — without you having to know the word "Sativa" or the effect name "Creative." The embedding model has already learned what that phrase means.

The Result: A Recommendation That Earns Its Place

When Alfy recommends 🌿 Blue Dream, it's not because Blue Dream is popular, or because it showed up in the right ad buy. It's because Blue Dream's 33-dimensional vector sat closer to your emotional coordinate than every other strain in the database.

That's a claim that can be verified, challenged, and improved. As the database grows, as the emotional mappings get more refined, as your preferences get captured across sessions — the distances get tighter and the recommendations get sharper.

This is what it looks like when a recommendation engine is built around emotional literacy rather than popularity. Not what people usually want. What you actually need, right now, measured in the most precise language we have: mathematics.

References

🧠 See It In Action

Tell Alfy how you're feeling. Watch the reasoning unfold in real time — which tools it calls, what it found, why it chose what it chose.
