what it does

rag-psych is a retrieval-augmented question-answering system over a local corpus of psychiatry / mental-health reference material. You type a clinical question; the system finds the most relevant passages in the corpus, has an LLM compose a grounded answer with citations back to those passages, and shows you the supporting passages alongside the answer so you can verify every claim.

what it offers

Grounded answers

Every factual claim in the response is followed by a [chunk_id] citation linking to the exact passage it came from. Click a citation to scroll to and highlight its chunk.

Source transparency

Retrieved passages are shown on the right with their source (clinical notes, research abstracts, or diagnostic references) colour-coded and labelled. No hidden reasoning.

Hallucination detection

Cited IDs that do not appear in the retrieved set are flagged in the answer and in a warning banner. The model does not get to quote things that weren't retrieved.
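The check can be as simple as comparing the IDs the answer cites against the IDs that were actually retrieved. A minimal sketch, assuming citations are rendered as `[chunk_id]` tokens (the function name and ID pattern are illustrative):

```python
import re

def check_citations(answer: str, retrieved_ids: set[str]) -> list[str]:
    """Return cited IDs that were NOT in the retrieved set, i.e. hallucinated."""
    cited = re.findall(r"\[([A-Za-z0-9_\-]+)\]", answer)
    return [cid for cid in cited if cid not in retrieved_ids]

# "chunk_99" was never retrieved, so it gets flagged; "chunk_03" passes.
flags = check_citations(
    "SSRIs are first-line [chunk_03]; onset is 2-4 weeks [chunk_99].",
    {"chunk_01", "chunk_03"},
)
```

Anything returned by the check can drive both the inline flag and the warning banner.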

Insufficient-evidence refusal

When the corpus doesn't contain an answer, the system returns a canonical refusal string rather than inventing one. Off-topic queries trigger this at the retrieval layer with no LLM call.
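The retrieval-layer gate amounts to a threshold on the best passage score. A minimal sketch; the refusal string, function name, and threshold value here are illustrative, not the system's actual constants:

```python
REFUSAL = "I don't have sufficient evidence in the corpus to answer this."

def answer_or_refuse(scored_passages, threshold=0.35):
    """Gate the LLM call on retrieval confidence.

    scored_passages: list of (score, passage) pairs from the reranker.
    Returns the canonical refusal when nothing clears the bar, else None
    so the caller proceeds to the generation step.
    """
    if not scored_passages or max(s for s, _ in scored_passages) < threshold:
        return REFUSAL  # off-topic / unsupported: no LLM call is made
    return None
```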

Negation-aware retrieval

Passages that deny the queried concept ("patient denies suicidal ideation") are filtered out before reaching the answer step, so they're never cited as positive evidence.
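A rule-based filter of this kind can be sketched with a handful of clinical negation cues; the cue list and window size below are assumptions, and the real filter may be more elaborate:

```python
import re

# Common clinical negation cues (illustrative, not the system's actual list).
NEGATION_CUES = [r"\bdenies\b", r"\bno evidence of\b",
                 r"\bnegative for\b", r"\bruled out\b"]

def drop_negated(passages, concept):
    """Drop passages where the queried concept appears in a negated context.

    A cue must occur within ~60 characters before the concept, within the
    same sentence, for the passage to count as negated.
    """
    kept = []
    for text in passages:
        negated = any(
            re.search(cue + r"[^.]{0,60}" + re.escape(concept), text, re.IGNORECASE)
            for cue in NEGATION_CUES
        )
        if not negated:
            kept.append(text)
    return kept
```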

Hybrid retrieval

Three retrievers run in parallel (dense semantic search, BM25-style keyword search, and literal rare-token matching); Reciprocal Rank Fusion then merges their candidate lists and a cross-encoder re-scores the fused pool.
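Reciprocal Rank Fusion scores each candidate as the sum of 1 / (k + rank) over the lists that returned it, so documents ranked well by several retrievers rise to the top. A minimal sketch with the conventional k = 60 (the chunk IDs are made up):

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60):
    """Fuse several rankings: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; deduplication falls out of the dict keys.
    return sorted(scores, key=scores.get, reverse=True)

dense   = ["c3", "c1", "c7"]
keyword = ["c1", "c3", "c9"]
rare    = ["c7", "c1"]
fused = rrf_fuse([dense, keyword, rare])  # "c1" appears in all three lists
```

The fused list is what the cross-encoder then re-scores.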

what to ask

diagnostic

criteria for generalized anxiety disorder

essential features of post-traumatic stress disorder

diagnostic criteria for obsessive compulsive disorder

clinical scenarios

45-year-old female presenting with depressive symptoms and suicidal ideation

patient medication list including SSRI for depression

research

cognitive behavioral therapy outcomes for anxiety disorders in adolescents

psychosocial interventions for bipolar disorder

cross-source

what does the literature say about the diagnostic criteria for depression

how is suicidal ideation assessed clinically and what is its prevalence

what it can't do

how a query flows

  1. Your query is embedded with a clinical-domain sentence encoder, and in parallel tokenised for keyword and rare-token lookups.
  2. Three retrievers run against the local vector database and return their top candidates independently.
  3. The candidate lists are fused by Reciprocal Rank Fusion, deduplicated, and re-scored by a cross-encoder reranker.
  4. A rule-based negation filter drops any surviving passage in which the queried concept is denied, ruled out, or marked negative.
  5. If the best remaining passage clears a confidence threshold, the top-k are sent to a language model with a strict system prompt: answer only from these, cite every claim, refuse cleanly if the passages don't support an answer.
  6. The response is parsed for citation integrity — any cited ID not in the retrieved set is flagged before the answer is rendered.
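Step 5's strict system prompt can be sketched as a simple template over the top-k passages; the wording and layout below are illustrative, not the system's actual prompt:

```python
SYSTEM_PROMPT = (
    "Answer ONLY from the passages below. Cite every factual claim with its "
    "[chunk_id]. If the passages do not support an answer, reply with the "
    "canonical refusal string and nothing else."
)

def build_prompt(query, passages):
    """Assemble the grounded-generation prompt for step 5.

    passages: list of (chunk_id, text) pairs, already filtered and ranked.
    """
    context = "\n\n".join(f"[{cid}] {text}" for cid, text in passages)
    return f"{SYSTEM_PROMPT}\n\nPassages:\n{context}\n\nQuestion: {query}"
```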

what it could offer next

behind the curtain

A live evaluation dashboard with per-query metrics, source-mix breakdowns, latency profile, and run history is available at /eval — password-protected so it doesn't leak eval numbers to casual visitors. Credentials come from the operator's .env.
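The credential check behind /eval can be as small as a constant-time comparison against environment variables. A minimal sketch; `EVAL_USER` and `EVAL_PASS` are hypothetical variable names, not necessarily what the operator's .env defines:

```python
import hmac
import os

def eval_auth_ok(username: str, password: str) -> bool:
    """Check dashboard credentials against the operator's environment.

    hmac.compare_digest avoids timing leaks on the comparison.
    """
    expected_user = os.environ.get("EVAL_USER", "")
    expected_pass = os.environ.get("EVAL_PASS", "")
    return (hmac.compare_digest(username, expected_user)
            and hmac.compare_digest(password, expected_pass))
```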