Lookup time complexity/methods for "nearest neighbors"?

I am wondering how well the "custom model" approach works for larger collections of documents. Assuming I have a corpus of 100,000 sentences, would the custom model approach still be the way to go for looking up the nearest neighbors of a query?
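For reference, what I mean by nearest-neighbor lookup is essentially the brute-force baseline below (just a sketch with NumPy; the 384-dimensional embeddings and corpus size are assumptions on my part, not anything specific to the system):

```python
import numpy as np

# Hypothetical setup: 100,000 sentence embeddings of dimension 384,
# L2-normalized so that a dot product equals cosine similarity.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(100_000, 384)).astype(np.float32)
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)

def nearest_neighbor(query: np.ndarray) -> int:
    """Return the index of the corpus vector most similar to the query.

    This is O(N * d) per query -- a full scan over all N vectors --
    which is exactly the cost I'd like to avoid at this corpus size.
    """
    q = query / np.linalg.norm(query)
    return int(np.argmax(corpus @ q))

# Sanity check: a corpus vector should be its own nearest neighbor.
print(nearest_neighbor(corpus[42]))  # 42
```

My question is whether the "custom model" approach does something smarter than this linear scan, or whether it degrades the same way as the corpus grows.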

Are there other mechanisms provided by the system for efficient vector similarity lookup?

Is there a rule of thumb for the maximum (key) size of a custom model at which performance is still acceptable?