Annotate PDFs. Train Better AI.
Makroly is a precision PDF annotation platform built for AI teams. Define custom entity schemas, highlight structured information from any document, and export clean JSON datasets — ready to fine-tune your language models and document AI pipelines.

Everything your AI team needs
Purpose-built for the specific challenges of creating structured training data from real-world PDF documents.
Schema-Driven Annotation
Define named entity types with typed properties (text, number, date, enum). Every annotation is validated against your schema — no free-form noise.
Precise Text Span Selection
Select exact text fragments from rendered PDFs with character-level accuracy. Supports multi-span annotations for discontinuous evidence.
Implicit Object Support
Annotate entities that are implied by context but never explicitly mentioned, which is critical for training comprehensive document-understanding models.
Structured JSON Export
Export annotations as clean, schema-aligned JSON ready to feed directly into your fine-tuning or RAG pipelines. No post-processing needed.
Schema Import / Export
Save and reuse annotation schemas across documents and projects. Share schemas with your team for consistent multi-annotator datasets.
100% Browser-Based & Private
No server uploads. Your PDFs and annotations never leave your device. Processing is fully local, which makes Makroly suitable for documents covered by sensitive-data policies.
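To make the schema-driven idea concrete, here is a minimal Python sketch of what validating an annotation against a typed schema could look like. It is an illustration only: the class names, field names, and the `validate` function are hypothetical and are not Makroly's actual schema format or API. It assumes the four property types named above (text, number, date, enum), required properties, multi-span annotations, and implicit objects that carry no text span.

```python
from dataclasses import dataclass, field
from datetime import date

# Hypothetical model: every name here is illustrative, not Makroly's real format.


@dataclass
class PropertyDef:
    name: str
    type: str                # "text", "number", "date", or "enum"
    required: bool = False
    choices: tuple = ()      # allowed values, used only when type == "enum"


@dataclass
class EntityType:
    name: str
    properties: list


@dataclass
class Annotation:
    entity: str
    values: dict
    spans: list = field(default_factory=list)  # (page, start, end); may hold several
    implicit: bool = False                      # True when no text span backs the entity


def _value_ok(prop, value):
    """Check one value against its declared property type."""
    if prop.type == "text":
        return isinstance(value, str)
    if prop.type == "number":
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if prop.type == "date":
        try:
            date.fromisoformat(value)  # expects an ISO string such as "2024-03-01"
            return True
        except (TypeError, ValueError):
            return False
    if prop.type == "enum":
        return value in prop.choices
    return False


def validate(annotation, schema):
    """Return a list of problems; an empty list means the annotation is valid."""
    if annotation.entity != schema.name:
        return [f"unknown entity type: {annotation.entity}"]
    errors = []
    known = {p.name for p in schema.properties}
    for key in annotation.values:
        if key not in known:
            errors.append(f"unexpected property: {key}")
    for prop in schema.properties:
        if prop.name not in annotation.values:
            if prop.required:
                errors.append(f"missing required property: {prop.name}")
            continue
        if not _value_ok(prop, annotation.values[prop.name]):
            errors.append(f"invalid {prop.type} for property: {prop.name}")
    if not annotation.implicit and not annotation.spans:
        errors.append("explicit annotation needs at least one span")
    return errors


invoice = EntityType("Invoice", [
    PropertyDef("vendor", "text", required=True),
    PropertyDef("total", "number", required=True),
    PropertyDef("issued", "date"),
    PropertyDef("currency", "enum", choices=("USD", "EUR")),
])

good = Annotation(
    "Invoice",
    {"vendor": "Acme", "total": 120.5, "issued": "2024-03-01", "currency": "EUR"},
    spans=[(1, 10, 14), (2, 40, 46)],  # two spans: discontinuous evidence
)
bad = Annotation("Invoice", {"total": "120.5", "currency": "GBP"})

good_errors = validate(good, invoice)
bad_errors = validate(bad, invoice)
```

In this sketch `good_errors` is empty, while `bad_errors` reports the missing vendor, the string passed where a number is expected, the out-of-range enum value, and the missing span. Returning a list of problems rather than raising on the first one is a design choice that lets an annotator see every issue at once.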
From PDF to training data in minutes
A streamlined four-step workflow that replaces error-prone spreadsheet annotation with a structured, repeatable process.
Upload your PDF
Open any PDF document directly in the browser. Pages render with full fidelity: tables, figures, and mixed layouts are all supported.
Define your schema
Create entity types matching your domain — invoices, contracts, medical reports, patents. Add typed properties and mark which are required.
Annotate with precision
Select text ranges on the PDF to link them to entities. Fill in property values. Add implicit objects for inferred information.
Export structured JSON
One click exports your complete annotation set as structured JSON, perfectly aligned with your schema. LLM-ready, immediately.
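As a rough illustration of the last step, the Python sketch below turns an exported annotation file into JSON Lines records of the kind a fine-tuning script might consume. The export shape shown here (`document`, `annotations`, `spans`, `values`, `implicit`) is an assumption made for the example, not Makroly's documented output format, and `to_training_records` is a hypothetical helper.

```python
import json

# Hypothetical export shape: these field names are assumptions for illustration.
EXPORT = """
{
  "document": "invoice-001.pdf",
  "annotations": [
    {"entity": "Invoice", "implicit": false,
     "spans": [{"page": 1, "text": "Acme GmbH"}, {"page": 1, "text": "EUR 120.50"}],
     "values": {"vendor": "Acme GmbH", "total": 120.5}},
    {"entity": "PaymentTerms", "implicit": true,
     "spans": [],
     "values": {"days": 30}}
  ]
}
"""


def to_training_records(export):
    """Turn each annotation into one record: evidence text in, structured values out."""
    records = []
    for ann in export["annotations"]:
        # Join discontinuous spans; implicit entities have no spans, so this is "".
        evidence = " ... ".join(span["text"] for span in ann["spans"])
        records.append({
            "document": export["document"],
            "entity": ann["entity"],
            "evidence": evidence,
            "target": json.dumps(ann["values"], sort_keys=True),
        })
    return records


records = to_training_records(json.loads(EXPORT))

# One JSON object per line, the layout many fine-tuning tools accept.
jsonl = "\n".join(json.dumps(record) for record in records)
```

Keeping the implicit entity as a record with empty evidence is one option among several; a pipeline could equally attach the full page text to such records so the model learns to infer them from context.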
Built for AI, not for manual review
Traditional tools were designed for human review workflows — not for producing machine-readable training data for generative AI.
| Capability | Traditional Tools | Makroly |
|---|---|---|
| Schema-driven entity types | ❌ Free-form | ✅ Structured schemas |
| Typed properties on entities | ❌ Not supported | ✅ Text, number, date, enum |
| Implicit / inferred entities | ❌ Span-only | ✅ Full implicit object support |
| JSON export for AI pipelines | ⚠️ Requires conversion | ✅ Native structured JSON |
| Schema reuse across documents | ❌ Manual duplication | ✅ Import / export schemas |
| Data privacy | ⚠️ Cloud upload required | ✅ 100% local — no uploads |
| Setup complexity | ⚠️ Account + configuration | ✅ Open browser and go |
What teams use Makroly for
Invoice & Receipt Processing
Annotate line items, totals, vendors, and dates to train document-extraction models.
Legal Contract Analysis
Tag clauses, parties, obligations, and dates to build contract review AI.
Medical Report Extraction
Label diagnoses, medications, and patient data in clinical PDFs to train healthcare AI.
Scientific Literature Mining
Extract methods, findings, and citations from research papers for knowledge graphs.

Ready to build your AI training dataset?
No sign-up. No cloud uploads. Open Makroly in your browser right now and start annotating in seconds.
Open the Annotator — it's free →