Use Solr to find the documents that are most similar to a given document.

https://solr.apache.org/guide/8_9/morelikethis.html

Finally, the MLT query parser can be used. This operates in much the same way as the request handler but since it is a query parser it can be used in filter queries, boost queries, etc., and results can be paginated or highlighted as needed.
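As a rough illustration of the query parser form described above, here is a minimal Python sketch that only builds the request parameters and URL; it does not contact a server. The base URL, the collection name `programs`, the field name `description`, and the document id `doc42` are all hypothetical placeholders. The `{!mlt qf=... mintf=... mindf=...}` local-params syntax and the `rows`/`start` pagination parameters follow the Solr reference guide linked above, but check them against the version you are running.

```python
from urllib.parse import urlencode


def build_mlt_params(doc_id, field, mintf=1, mindf=1, rows=10, start=0):
    """Build Solr query parameters that use the MLT query parser.

    doc_id is the uniqueKey of the source document; field is the
    field whose terms are compared for similarity.
    """
    # Local-params prefix selects the MLT parser and sets its options.
    local_params = "{!mlt qf=" + field + " mintf=" + str(mintf) + " mindf=" + str(mindf) + "}"
    return {
        "q": local_params + str(doc_id),
        "fl": "id,score",  # return only the id and the similarity score
        "rows": rows,      # page size
        "start": start,    # offset, for pagination
    }


def build_mlt_url(base_url, collection, params):
    """Assemble a /select URL for the given (hypothetical) collection."""
    return base_url.rstrip("/") + "/" + collection + "/select?" + urlencode(params)


# Hypothetical values, for illustration only.
params = build_mlt_params("doc42", "description")
url = build_mlt_url("http://localhost:8983/solr", "programs", params)
print(url)
```

Because this is an ordinary query, the same `{!mlt ...}` expression could in principle be placed in a filter query or boost query instead of `q`, as the documentation excerpt notes.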

Older answers suggested using the TF-IDF scoring algorithm.

https://gitlab.com/find-it-program-locator/findit/-/issues/202