This page documents the current limitations and constraints of Doc Reviewer. Understanding these boundaries helps you set realistic expectations, work around known gaps, and avoid common frustrations. Limitations listed here are not bugs; they reflect deliberate design decisions or the current state of the underlying technology.
Document parsing
Icon and symbol fonts in PDFs
PDF documents sometimes use decorative icon fonts to display button glyphs, status indicators, and other UI symbols. These symbols are stored in a non-standard encoding that Doc Reviewer’s PDF parser cannot extract as readable text. When Doc Reviewer encounters such a glyph, it inserts the marker [иконка] (Russian for “icon”) in its place.
This is expected behavior. The LLM evaluation prompt explicitly instructs the model to ignore [иконка] markers and never treat them as missing or unnamed UI elements. If an instruction reads “Click [иконка] Save”, the model understands that [иконка] is a decorative graphic, and “Save” is the actual button name.
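To see how an instruction reads once the markers are disregarded, the following minimal sketch removes them from a string. It is an illustration only: Doc Reviewer leaves the markers in the text and relies on the evaluation prompt to ignore them, and the function name `strip_icon_markers` is hypothetical.

```python
import re

# The literal marker that Doc Reviewer inserts for unreadable icon glyphs.
ICON_MARKER = "[иконка]"


def strip_icon_markers(text):
    """Remove icon markers and collapse the whitespace they leave behind.

    Hypothetical helper for illustration; not part of Doc Reviewer.
    """
    without_markers = text.replace(ICON_MARKER, " ")
    return re.sub(r"\s+", " ", without_markers).strip()


print(strip_icon_markers("Click [иконка] Save"))
```

The remaining text, “Click Save”, shows that the button name survives extraction even though the glyph does not.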
If you see [иконка] markers in the instruction preview panel, this does not indicate a parsing error and does not affect evaluation quality.
Web page parsing
Web page support is optimized for the Positive Technologies web help structure, which uses custom <instruction>, <action>, and <task> HTML tags. When loading pages from this structure, Doc Reviewer parses each <instruction> block as a separate section with correct step numbering and clean text.
For all other websites, Doc Reviewer falls back to a generic HTML-to-Markdown conversion. The fallback works for most pages but may produce lower-quality text extraction on complex layouts, heavily styled pages, or pages with non-standard markup. If evaluation results look inconsistent for a web page, download it as a PDF and upload the file instead.
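The structured case can be pictured with a short sketch that uses only the Python standard library. It is not Doc Reviewer's parser. It assumes, for illustration, that each step is an <action> element nested inside an <instruction> element; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class InstructionExtractor(HTMLParser):
    """Collect the text of <action> elements, grouped by <instruction>.

    Assumption: steps are <action> elements nested inside <instruction>.
    """

    def __init__(self):
        super().__init__()
        self.sections = []       # one list of step strings per <instruction>
        self._in_action = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "instruction":
            self.sections.append([])
        elif tag == "action" and self.sections:
            self._in_action = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "action" and self._in_action:
            # Join text fragments and normalize runs of whitespace.
            step = " ".join("".join(self._buffer).split())
            self.sections[-1].append(step)
            self._in_action = False

    def handle_data(self, data):
        if self._in_action:
            self._buffer.append(data)


def extract_instructions(html):
    """Return a list of sections, each a list of step strings."""
    parser = InstructionExtractor()
    parser.feed(html)
    parser.close()
    return parser.sections


sample = (
    "<task><instruction>"
    "<action>Open  the page.</action>"
    "<action>Click <b>Save</b>.</action>"
    "</instruction></task>"
)
for number, step in enumerate(extract_instructions(sample)[0], start=1):
    print(f"{number}. {step}")
```

Pages without these tags yield no sections in this sketch, which corresponds to the case where Doc Reviewer falls back to generic HTML-to-Markdown conversion.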
Language support
Instruction detection uses morphological analysis optimized for Russian-language text. The detector looks for patterns in section headings (verb forms, noun phrases, and other constructions that signal a procedural step) that are specific to Russian grammar.
English documents can be uploaded and evaluated without errors, but automatic classification accuracy may be lower. More sections may be classified as possible or non-instruction when they should be instruction. You can correct this manually by clicking a section’s classification badge in the document tree and changing it to the appropriate value.
The default criteria set and the LLM role description are also written in Russian. If you are evaluating English documents regularly, translate the active criteria set in Settings → Criteria for more consistent results.
Single-user design
Doc Reviewer is designed for one technical writer on one machine. It does not support:
- Multiple simultaneous users
- Network or server deployment
- Shared databases or collaborative workspaces
- Role-based access control
LLM dependency
Every instruction evaluation requires a live call to an LLM. There is no offline fallback evaluation mode: if the LLM API is unreachable or no model is configured, Doc Reviewer cannot produce evaluation results.
Evaluation quality varies directly with the capability of the model you use. A large, well-tuned model (such as GPT-4o or Claude 3.5 Sonnet) produces accurate, specific recommendations. A small or poorly tuned model may return vague, inconsistent, or incorrect scores. If results do not seem right, switching to a more capable model is the most effective fix.
Web page loading
Loading web pages requires Chromium, which is not bundled inside doc-reviewer.exe. You must install it once with:
Chromium is installed into the Playwright browser cache (%LOCALAPPDATA%\ms-playwright) and is shared across all Playwright-based applications on your machine.
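To confirm that a Chromium build is present in %LOCALAPPDATA%\ms-playwright before loading a web page, a check along these lines may help. It is a sketch under one stated assumption: Playwright keeps each Chromium build in a subfolder named `chromium-<revision>`. The function names are hypothetical and are not part of Doc Reviewer.

```python
import os
from pathlib import Path


def playwright_cache_dir(env=None):
    """Return the ms-playwright folder under LOCALAPPDATA, or None if unset."""
    env = os.environ if env is None else env
    local_app_data = env.get("LOCALAPPDATA")
    if not local_app_data:
        return None
    return Path(local_app_data) / "ms-playwright"


def chromium_installed(cache_dir):
    """Report whether the cache contains a folder that looks like Chromium.

    Assumption: builds live in subfolders named chromium-<revision>.
    """
    if cache_dir is None or not cache_dir.is_dir():
        return False
    return any(entry.is_dir() for entry in cache_dir.glob("chromium-*"))


if __name__ == "__main__":
    cache = playwright_cache_dir()
    print("Chromium found" if chromium_installed(cache) else "Chromium not found")
```

If the check reports that Chromium is missing, repeat the one-time installation step described above.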
Additional constraints for web page loading:
- Sites that block headless browsers — some websites detect and reject automated browser sessions. These pages will fail to load or return incomplete content. Download the page as a PDF or DOCX and upload the file as a workaround.
- JavaScript-heavy SPAs — single-page applications that load content asynchronously may not render completely before the parser captures the page. Results for these pages may be incomplete.