promptdojo_

Embedding that fits the budget — pick a model that matches your corpus — step 8 of 9

Write rank_by_similarity(query_vec, doc_vecs) that returns a list of doc INDICES sorted by descending cosine similarity to the query. The highest-scoring doc comes first.

  • Input: a query vector and a list of doc vectors (all the same dim).
  • Output: list of indices into doc_vecs, sorted best-first.
  • Use cosine similarity (dot product / product of norms).

Your function is run on one query and four docs. The query points heavily in the first dimension. Doc 1 matches it most closely and doc 3 is the next-closest. Docs 2 and 0 point elsewhere, with doc 2 still scoring above doc 0.

Expected output:

[1, 3, 2, 0]
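
One possible shape for the solution, as a minimal sketch in plain Python. The helper name cosine_similarity and the sample vectors are illustrative assumptions, not the exercise's actual test data. They were chosen only so that the ranking matches the expected output above. Treating a zero-length vector as similarity 0.0 is also an assumption, since the exercise does not say how to handle that case.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumption: a zero vector has no direction, so score it 0.0
    # instead of dividing by zero.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return indices into doc_vecs, sorted by descending cosine similarity."""
    scores = [cosine_similarity(query_vec, doc) for doc in doc_vecs]
    # Sort the indices, not the vectors, using each index's score as the key.
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Illustrative data only (not the exercise's hidden inputs).
query = [0.9, 0.1, 0.0]
docs = [
    [0.0, 0.0, 1.0],  # doc 0: orthogonal to the query
    [1.0, 0.1, 0.0],  # doc 1: nearly the same direction as the query
    [0.0, 1.0, 0.0],  # doc 2: mostly a different direction
    [0.7, 0.7, 0.0],  # doc 3: partly aligned with the query
]
print(rank_by_similarity(query, docs))  # prints [1, 3, 2, 0]
```

Sorting indices with a score lookup as the key keeps the original doc_vecs untouched. Because Python's sort is stable, docs with equal scores stay in their original relative order.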
