I am wondering how well the "custom model" approach works for larger collections of documents. Assuming I have a corpus of 100,000 sentences, would the custom model approach still be the way to go for looking up the nearest neighbor of a query?
Are there other mechanisms provided by the system for efficient vector similarity lookup?
Is there a rule of thumb for the maximum (key) size at which the custom model still performs acceptably?