Instead of uploading a file, you can point Doc Reviewer at a live URL. The app fetches the page using a headless Chromium browser, extracts instruction sections from the HTML, and treats the result as a regular document ready for evaluation. You can add multiple pages to the same document before running the evaluation.
Web page evaluation requires Chromium, which is not included in the .exe. You must install it once on every machine where you want to use this feature, whether you run the .exe or run from source. Open a terminal and run:
py -3.11 -m playwright install chromium
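
To confirm that the installation succeeded, a quick check along the following lines may help. This is a minimal sketch, not part of Doc Reviewer: the function name chromium_available is hypothetical, and it assumes the playwright Python package is the one the command above installs Chromium for.

```python
import importlib.util
from pathlib import Path


def chromium_available() -> bool:
    """Return True if Playwright is importable and its Chromium binary exists.

    Hypothetical helper for illustration; not part of Doc Reviewer.
    """
    # Without the playwright package there is nothing to check.
    if importlib.util.find_spec("playwright") is None:
        return False
    try:
        from playwright.sync_api import sync_playwright

        with sync_playwright() as p:
            # executable_path is where Playwright expects its bundled Chromium.
            return Path(p.chromium.executable_path).exists()
    except Exception:
        # Treat any driver start-up failure as "not available".
        return False


if __name__ == "__main__":
    print("Chromium available:", chromium_available())
```

If the check reports False, rerunning the install command above is the first thing to try.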

Load a web page

1. Open the Evaluation page

Click Evaluation in the sidebar, or navigate to any project and click Evaluate by URL.

2. Select the By URL tab

At the top of the Evaluation page, click the By URL tab to switch from file upload to URL mode.

3. Paste a URL and click Load

Paste the full URL of the page you want to evaluate (must start with http:// or https://) and click Load. Doc Reviewer launches Chromium in headless mode and fetches the page.

4. Wait for the page to load

Chromium renders the page fully before extraction begins, so JavaScript-rendered content and single-page applications (SPAs) load correctly. Fetching typically takes a few seconds. A loading indicator is shown while Chromium is running.

5. Add more pages (optional)

After the first page loads successfully, a + Add page button appears in the document header. Click it, paste another URL, and click Load to append that page’s instructions to the same document. Repeat for each additional page you want to include.

6. Proceed to evaluation

Once all pages are loaded, the document tree on the left shows all extracted instruction sections. Review the sections, adjust classifications if needed, and click Evaluate to start the LLM evaluation.
All pages you add become part of a single document. Each instruction block extracted from each page appears as a separate section in the document tree and is evaluated individually.
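
The loading steps above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not Doc Reviewer's actual implementation: the names validate_url, fetch_rendered_html, and load_pages are hypothetical, the 30-second timeout is an arbitrary choice, and waiting for "networkidle" is only one possible way to let JavaScript-rendered content settle before reading the HTML.

```python
from urllib.parse import urlparse


def validate_url(url: str) -> str:
    """Return the trimmed URL, or raise ValueError if it is not http(s)."""
    cleaned = url.strip()
    parsed = urlparse(cleaned)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"URL must start with http:// or https://: {url!r}")
    return cleaned


def fetch_rendered_html(url: str, timeout_ms: int = 30_000) -> str:
    """Render the page in headless Chromium and return the resulting HTML."""
    # Imported lazily so the other helpers work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # Assumption: network idle is a good enough "fully rendered" signal.
            page.goto(validate_url(url), wait_until="networkidle", timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()


def load_pages(urls, fetch=fetch_rendered_html):
    """Fetch each URL in order; the (url, html) pairs form one document."""
    cleaned = [validate_url(u) for u in urls]  # reject bad URLs before fetching
    return [(u, fetch(u)) for u in cleaned]
```

The fetch parameter of load_pages is injectable so that the multi-page logic can be exercised with a stub, without launching a browser.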

How web parsing works

Doc Reviewer uses two parsing strategies depending on the source site:
When the page contains custom <instruction>, <action>, or <task> tags (the structure used in Positive Technologies web help), Doc Reviewer uses a specialized parser that:
  • Treats each <instruction> block as a separate document section
  • Correctly numbers steps inside <action> tags
  • Strips layout artifacts such as soft-hyphen characters (&shy;)
This parser produces the highest-quality extraction and is the primary use case for web evaluation.
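
To make the three behaviors listed above concrete, here is a minimal sketch built on Python's standard html.parser module. It is an illustration only, not Doc Reviewer's parser: the class and function names are hypothetical, <task> tags are ignored because their handling is not described here, and the real parser presumably covers more edge cases (nested tags, titles, and so on).

```python
from html.parser import HTMLParser

SOFT_HYPHEN = "\u00ad"  # the character that &shy; decodes to


class InstructionParser(HTMLParser):
    """Collect numbered steps, grouped into one section per <instruction>."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)  # decodes &shy; into U+00AD
        self.sections: list[list[str]] = []
        self._in_instruction = False
        self._in_action = False
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "instruction":
            # Each <instruction> block starts a new document section.
            self._in_instruction = True
            self.sections.append([])
        elif tag == "action" and self._in_instruction:
            self._in_action = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_action:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "action" and self._in_action:
            # Strip soft hyphens, then collapse runs of whitespace.
            raw = "".join(self._buffer).replace(SOFT_HYPHEN, "")
            text = " ".join(raw.split())
            steps = self.sections[-1]
            # Number steps within the current section, starting at 1.
            steps.append(f"{len(steps) + 1}. {text}")
            self._in_action = False
        elif tag == "instruction":
            self._in_instruction = False
            self._in_action = False


def extract_sections(html: str) -> list[list[str]]:
    """Return one list of numbered step strings per <instruction> block."""
    parser = InstructionParser()
    parser.feed(html)
    parser.close()
    return parser.sections
```

Calling extract_sections on a page with two <instruction> blocks returns two lists, and step numbering restarts at 1 in each.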
Because Chromium renders JavaScript before extraction, the app supports JS-rendered pages and SPAs. Static HTML pages load faster, but both work correctly.