Auto Fine-Tuning
Just tell us which domain you want your embeddings to excel in, and we automatically deliver a ready-to-use, fine-tuned embedding model for that domain.
What is Auto Fine-Tuning?
Fine-tuning lets you take a pre-trained model and adapt it to a specific task or domain by training it on a new dataset. In practice, many users struggle to find effective training data: effective training takes more than throwing raw PDFs and HTML files at the model, and it is hard to get right. Auto fine-tuning solves this problem by automatically generating effective training data with an advanced LLM agent pipeline and then fine-tuning the model within an ML workflow. You can think of it as a combination of synthetic data generation and AutoML: all you need to do is describe your target domain in natural language and let our system do the rest.
But does it work though?
Auto fine-tuning holds an auto-magical promise to deliver fine-tuned embeddings for any domain you want. But does it really work? This is a fairly reasonable doubt. We've tested it on a variety of domains and base models to find out. Check out the cherry-picked and lemon-picked results below.
jinaai/jina-embeddings-v2-base-en (+2%)
0.505 → 0.532 (+5%)
0.352 → 0.389 (+10%)
0.352 → 0.389 (+10%)
Tested on 50 random samples from tollefj/norwegian-nli-triplets:
0.852 → 0.867 (+2%)
0.800 → 0.820 (+2%)
0.800 → 0.820 (+2%)
4648
4480
168
jinaai/jina-embeddings-v2-base-en (+6%)
0.672 → 0.755 (+12%)
0.567 → 0.675 (+19%)
0.567 → 0.675 (+19%)
Tested on 50 random samples from mteb/askubuntudupquestions-reranking:
0.698 → 0.722 (+3%)
0.515 → 0.549 (+6%)
0.666 → 0.712 (+7%)
616
448
168
jinaai/jina-embeddings-v2-base-en (+9%)
0.727 → 0.861 (+18%)
0.640 → 0.814 (+27%)
0.640 → 0.814 (+27%)
Tested on 50 random samples from mteb/scidocs-reranking:
0.773 → 0.822 (+6%)
0.575 → 0.651 (+13%)
0.823 → 0.884 (+7%)
616
448
168
jinaai/jina-embeddings-v2-base-zh (+1%)
0.718 → 0.785 (+9%)
0.629 → 0.717 (+14%)
0.629 → 0.717 (+14%)
Tested on 50 random samples from C-MTEB/CMedQAv2-reranking:
0.938 → 0.948 (+1%)
0.912 → 0.926 (+2%)
0.920 → 0.933 (+1%)
616
448
168
jinaai/jina-embeddings-v2-base-en (+6%)
0.543 → 0.579 (+7%)
0.402 → 0.452 (+12%)
0.402 → 0.452 (+12%)
Tested on 50 random samples from nc33/triplet_sbert_law2 (machine-translated to Dutch):
0.904 → 0.948 (+5%)
0.870 → 0.930 (+7%)
0.870 → 0.930 (+7%)
9128
8960
168
jinaai/jina-embeddings-v2-base-code (-4%)
0.671 → 0.640 (-5%)
0.569 → 0.525 (-8%)
0.569 → 0.525 (-8%)
Tested on 50 random samples from mteb/stackoverflowdupquestions-reranking:
0.640 → 0.621 (-3%)
0.530 → 0.505 (-5%)
0.555 → 0.532 (-4%)
616
448
168
jinaai/jina-embeddings-v2-base-code (-4%)
0.632 → 0.711 (+13%)
0.517 → 0.622 (+20%)
0.517 → 0.622 (+20%)
Tested on 50 random samples from mteb/stackoverflowdupquestions-reranking:
0.640 → 0.619 (-3%)
0.530 → 0.504 (-5%)
0.555 → 0.525 (-5%)
616
448
168
jinaai/jina-embeddings-v2-base-en (+1%)
0.646 → 0.729 (+13%)
0.535 → 0.644 (+20%)
0.535 → 0.644 (+20%)
Tested on 50 random samples from mteb/askubuntudupquestions-reranking:
0.645 → 0.650 (+1%)
0.452 → 0.462 (+2%)
0.606 → 0.605 (-0%)
616
448
168
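The percentage shown next to each before/after pair of scores in the results above appears to be the relative change from the first score to the second, rounded to a whole percent. The page does not state the formula, so the sketch below is an assumption, and `percent_change` is a name chosen here for illustration. Because the displayed scores are themselves rounded to three decimals, a recomputed figure can differ from the displayed one by a point.

```python
def percent_change(before: float, after: float) -> int:
    """Relative change from `before` to `after`, as a rounded whole percent.

    Assumption: the page reports (after - before) / before; this is not
    confirmed by the source.
    """
    if before == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return round((after - before) / before * 100)


# Score pairs copied from the results above.
pairs = [(0.505, 0.532), (0.727, 0.861), (0.671, 0.640)]
for before, after in pairs:
    print(f"{before:.3f} -> {after:.3f}: {percent_change(before, after):+d}%")
```

For these three pairs the recomputed values match the percentages displayed next to them (+5%, +18%, and -5%).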
Auto Fine-Tuning API
Get fine-tuned embeddings for any domain you want.
Describe the domain you wish to fine-tune for.
Provide a detailed description of how the fine-tuned embeddings will be used. This is essential for generating high-quality synthetic data that will improve the performance of your embeddings.
Choose a base embedding model for fine-tuning.
Please enter the email where you want to receive the download link upon completion.
Agree to the terms and begin fine-tuning by clicking the button below.
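The steps above amount to five pieces of input. As a rough sketch, they can be collected and sanity-checked locally before you fill in the form. Everything here is hypothetical: `FineTuningRequest`, its field names, and the `problems` method are invented for illustration and are not the service's request schema, and the example values are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuningRequest:
    """Hypothetical container mirroring the form fields; not the real schema."""

    domain: str             # the domain you wish to fine-tune for
    usage_description: str  # how the fine-tuned embeddings will be used
    base_model: str         # the base embedding model to fine-tune
    email: str              # where the download link should be sent
    terms_accepted: bool    # whether you agreed to the terms

    def problems(self) -> list[str]:
        """Return a list of problems; an empty list means nothing is missing."""
        found = []
        if not self.domain.strip():
            found.append("domain description is empty")
        if not self.usage_description.strip():
            found.append("usage description is empty")
        if not self.base_model.strip():
            found.append("no base model chosen")
        if "@" not in self.email:
            found.append("email address looks invalid")
        if not self.terms_accepted:
            found.append("terms have not been accepted")
        return found


# Example values are illustrative only.
request = FineTuningRequest(
    domain="Dutch legal documents",
    usage_description="Retrieve relevant statutes for a lawyer's question.",
    base_model="jinaai/jina-embeddings-v2-base-en",
    email="user@example.com",
    terms_accepted=True,
)
print(request.problems() or "ready to submit")
```

Since the usage description drives the synthetic data generation, it is the field most worth writing carefully; the check above only catches an empty one.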
FAQ
Auto Fine-Tuning-related common questions
How much does the Fine-tuning API cost?
What do I need to input? Do I need to provide training data?
How long does it take to fine-tune a model?
Where are the fine-tuned models stored?
If I provide a reference URL, how does the system use it?
Can I fine-tune a model for a specific language?
Can I fine-tune non-Jina embeddings, e.g., bge-M3?
How do you ensure the quality of the fine-tuned models?
How do you generate synthetic data?
Can I keep my fine-tuned models and synthetic data private?
How can I use the fine-tuned model?
I never received the email with the evaluation results. What should I do?
API-related common questions
Can I use the same API key for the embedding, reranking, reader, and fine-tuning APIs?
Can I monitor the token usage of my API key?
What should I do if I forget my API key?
Do API keys expire?
Why is the first request for some models slow?
Is user input data used for training your models?
Billing-related common questions
Is billing based on the number of sentences or requests?
Is there a free trial available for new users?
Are tokens charged for failed requests?
What payment methods are accepted?
Is invoicing available for token purchases?