vLLM online serving

#7
by aperez900907 - opened

When running the model with VLLM, it logs this warning:

WARNING 06-13 09:05:21 [api_server.py:848] To indicate that the rerank API is not part of the standard OpenAI API, we have located it at /rerank. Please update your client accordingly. (Note: Conforms to JinaAI rerank API)

I queried /rerank, /v1/rerank, and /v2/rerank, and I always get this error:

{
    "object": "error",
    "message": "The model does not support Rerank (Score) API",
    "type": "BadRequestError",
    "param": null,
    "code": 400
}

The requests to all of these endpoints go through, but none of them work.
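For reference, vLLM's `/rerank` endpoint follows the JinaAI rerank API shape, as the warning in the log says. A minimal sketch of the request body (the model name, query, and documents here are placeholders for illustration, not from the thread):

```python
import json

# JinaAI-style rerank request body, as accepted by vLLM's /rerank endpoint.
# The model name is a placeholder: use whatever model you served.
payload = {
    "model": "tomaarsen/Qwen3-Reranker-0.6B-seq-cls",
    "query": "What is the capital of France?",
    "documents": [
        "Paris is the capital of France.",
        "Berlin is the capital of Germany.",
    ],
}

body = json.dumps(payload)
print(body)
# Send it with e.g.:
#   curl -X POST http://localhost:8000/rerank \
#        -H "Content-Type: application/json" -d "$body"
```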

How are you running this?

The easiest way is to use an adapted model. I tested it with infinity, and this one worked:

infinity_emb v2 --model-id tomaarsen/Qwen3-Reranker-0.6B-seq-cls
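Once the infinity server is up, you can hit its rerank endpoint. A minimal sketch, assuming infinity's default port (7997) and a JinaAI-style `/rerank` request body; the query, documents, and scores below are illustrative placeholders, not from the thread:

```python
import json
import urllib.request

# Assumed infinity default; adjust host/port to your deployment.
INFINITY_URL = "http://localhost:7997/rerank"

def rerank_request_body(query, documents,
                        model="tomaarsen/Qwen3-Reranker-0.6B-seq-cls"):
    """Build the JSON body for a JinaAI-style rerank request."""
    return json.dumps({"model": model, "query": query, "documents": documents})

def top_documents(documents, scores, n=1):
    """Sort documents by relevance score (highest first), keep the top n."""
    ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ranked[:n]]

docs = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
]
body = rerank_request_body("What is the capital of France?", docs)

# To actually send it (requires the infinity server to be running):
# req = urllib.request.Request(INFINITY_URL, data=body.encode(),
#                              headers={"Content-Type": "application/json"})
# resp = json.load(urllib.request.urlopen(req))

# Illustrative scores standing in for the server response:
print(top_documents(docs, [0.9, 0.1]))  # ['Paris is the capital of France.']
```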
