Reader
Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai. Experience improved output for your agent and RAG systems at no cost.
What is a Reader?
Feeding web information into LLMs is an important step of grounding, yet it can be challenging. The simplest method is to scrape the webpage and feed in the raw HTML. However, scraping can be complex and is often blocked, and raw HTML is cluttered with extraneous elements such as markup and scripts. The Reader API addresses these issues by extracting the core content from a URL and converting it into clean, LLM-friendly text, ensuring high-quality input for your agent and RAG systems.
Reader also reads images!
Images on the webpage are automatically captioned by a vision language model in the Reader and formatted as image alt text in the output. This gives your downstream LLM just enough hints to incorporate those images into its reasoning and summarization. You can ask questions about the images, select specific ones, or even forward their URLs to a more powerful VLM for deeper analysis!
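To make the image handling concrete, here is a minimal sketch of how a client might pull captions and image URLs out of Reader output so they can be filtered or forwarded to another model. It assumes captions appear in standard Markdown image syntax, as in the demo output on this page; the function name `extract_image_captions` and the sample URL are hypothetical and not part of the Reader API.

```python
import re

# Matches Markdown images of the form ![alt text](url).
# Assumption: Reader emits captions as Markdown image alt text.
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def extract_image_captions(markdown: str) -> list[tuple[str, str]]:
    """Return (caption, url) pairs for every Markdown image in the text."""
    return IMAGE_PATTERN.findall(markdown)


# Hypothetical sample line in the same shape as the demo output.
sample = "* [![Image 1: the word jina on a black background](https://example.com/logo.png)"
for caption, url in extract_image_captions(sample):
    print(caption, "->", url)
```

The resulting pairs can be used to pick specific images by caption text before passing their URLs on for deeper analysis.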
The best part? It's free!
The Reader API is available at no cost and does not require an API key. Built on a scalable infrastructure, it offers high accessibility, concurrency, and reliability. We strive to be your preferred solution for all your LLM input requirements.
Try the demo
Title: jinaai (Jina AI)
URL Source: https://huggingface.co/jinaai
Markdown Content:
[jina-embeddings-v2 The V2 family of Jina Embeddings supports encoding large documents with 8k sequence length.](https://huggingface.co/collections/jinaai/jina-embeddings-v2-65708e3ec4993b8fb968e744)
* [![Image 1: the word jina on a black background](https://cdn-avatars.huggingface.co/v1/production/uploads/603763514de52ff951d89793/AFoybzd5lpBQXEBrQHuTt.png)
#### jinaai/jina-embeddings-v2-base-en
Feature Extraction • Updated 22 days ago •
1.05M •
613
](https://huggingface.co/jinaai/jina-embeddings-v2-base-en)
* [![Image 2: the word jina on a black background](https://cdn-avatars.huggingface.co/v1/production/uploads/603763514de52ff951d89793/AFoybzd5lpBQXEBrQHuTt.png)
#### jinaai/jina-embeddings-v2-small-en
Feature Extraction • Updated 22 days ago •
3.66M •
103
](https://huggingface.co/jinaai/jina-embeddings-v2-small-en)
* [![Image 3: the word jina on a black background](https://cdn-avatars.huggingface.co/v1/production/uploads/603763514de52ff951d89793/AFoybzd5lpBQXEBrQHuTt.png)
#### jinaai/jina-embeddings-v2-base-de
Feature Extraction • Updated 7 days ago •
17.6k •
50
](https://huggingface.co/jinaai/jina-embeddings-v2-base-de)
* [![Image 4: the word jina on a black background](https://cdn-avatars.huggingface.co/v1/production/uploads/603763514de52ff951d89793/AFoybzd5lpBQXEBrQHuTt.png)
#### jinaai/jina-embeddings-v2-base-zh
Feature Extraction • Updated 11 days ago •
7.77k •
105
](https://huggingface.co/jinaai/jina-embeddings-v2-base-zh)
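The demo output above begins with `Title:` and `URL Source:` lines followed by a `Markdown Content:` marker. As a rough sketch, a client could split a response into those parts as shown below. The layout is inferred only from this demo output, and the function name `parse_reader_output` is hypothetical.

```python
def parse_reader_output(text: str) -> dict:
    """Split Reader text output into title, source URL, and Markdown body.

    Assumption: the response uses the "Title:", "URL Source:", and
    "Markdown Content:" labels seen in the demo output on this page.
    """
    title = None
    url_source = None
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if line.startswith("Title:"):
            title = line[len("Title:"):].strip()
        elif line.startswith("URL Source:"):
            url_source = line[len("URL Source:"):].strip()
        elif line.startswith("Markdown Content:"):
            # Everything after the marker line is treated as the body.
            body = "\n".join(lines[index + 1:])
            return {"title": title, "url_source": url_source, "markdown": body}
    # No marker found: fall back to treating the whole text as the body.
    return {"title": title, "url_source": url_source, "markdown": text}


sample = (
    "Title: jinaai (Jina AI)\n"
    "URL Source: https://huggingface.co/jinaai\n"
    "Markdown Content:\n"
    "#### jinaai/jina-embeddings-v2-base-en\n"
    "Feature Extraction"
)
parsed = parse_reader_output(sample)
print(parsed["title"])
```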
Reader API
Using the Reader API is straightforward. Simply prepend 'https://r.jina.ai/' to any URL in your code or tool where LLM access is needed.
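As an illustration of the prefix approach, here is a minimal Python sketch using only the standard library. The helper names `reader_url` and `fetch_readable`, the 30-second timeout, and the UTF-8 decoding are assumptions made for this example rather than documented behavior.

```python
import urllib.request

READER_PREFIX = "https://r.jina.ai/"


def reader_url(target_url: str) -> str:
    """Prepend the Reader prefix to the URL you want converted."""
    return READER_PREFIX + target_url


def fetch_readable(target_url: str, timeout: float = 30.0) -> str:
    """Fetch the LLM-friendly text for target_url through the Reader API.

    Assumption: the response body is UTF-8 text.
    """
    with urllib.request.urlopen(reader_url(target_url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_readable("https://example.com")` would request `https://r.jina.ai/https://example.com` and return the converted text, ready to be placed in a prompt.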
Streaming mode is useful when standard mode returns an incomplete result, because streaming mode waits a bit longer for the page to render fully. Use the Accept header to enable streaming mode:
curl -H "Accept: text/event-stream" https://r.jina.ai/https://example.com
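The same request can be sketched in Python. The example below builds a request with the `Accept: text/event-stream` header and includes a small parser for the generic server-sent events format (`data:` lines, with a blank line ending each event). The helper names are hypothetical, and the exact payload Reader places in each event is not specified here, so treat the parsing as an assumption.

```python
import urllib.request

READER_PREFIX = "https://r.jina.ai/"


def build_streaming_request(target_url: str) -> urllib.request.Request:
    """Build a Reader request that asks for streaming mode via the Accept header."""
    return urllib.request.Request(
        READER_PREFIX + target_url,
        headers={"Accept": "text/event-stream"},
    )


def iter_sse_data(lines):
    """Yield the data payload of each server-sent event from an iterable of lines.

    Follows the generic SSE framing: consecutive "data:" lines are joined with
    newlines, and a blank line ends the event. Other fields are ignored.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format allows one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
    if buffer:
        # Lenient choice for this sketch: emit a trailing unterminated event.
        yield "\n".join(buffer)


def stream_reader(target_url: str, timeout: float = 60.0):
    """Yield event payloads from the Reader API in streaming mode (needs network)."""
    request = build_streaming_request(target_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        yield from iter_sse_data(chunk.decode("utf-8") for chunk in response)
```

A caller that only wants the final result could iterate over `stream_reader("https://example.com")` and keep the last payload, assuming later events reflect the more fully rendered page.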
FAQ
Reader-related common questions
What are the costs associated with using the Reader API?
How does the Reader API function?
Is the Reader API open source?
What is the typical latency for the Reader API?
Why should I use the Reader API instead of scraping the page myself?
Does the Reader API support multiple languages?
What should I do if a website blocks the Reader API?
Can the Reader API extract content from PDF files?
Can the Reader API process media content from web pages?
Is it possible to use the Reader API on local HTML files?
Does the Reader API cache the content?
Can I use the Reader API to access content behind a login?
Can I use the Reader API to access PDFs on arXiv?
How does image captioning work in Reader?
What is the scalability of the Reader? Can I use it in production?
What is the rate limit of the Reader API?