Ingesting Content
Every path to adding knowledge to your vault — files, URLs, web crawling, email, quick notes, and direct text.
How Ingestion Works
No matter how content enters Wikori, it follows the same pipeline:
1. Whether you drop a file, paste a URL, or write a quick note, the content always ends up as a .md file in your vault's INGEST/ folder.
2. PDF, DOCX, and XLSX files are converted to Markdown automatically before being queued. The original file is moved to a files/ subdirectory for safekeeping.
3. Wikori sends the Markdown text to your AI endpoint. The model returns a structured YAML block with title, summary, entities, tags, source type, and confidence score.
4. The YAML frontmatter is prepended to the file, and the file is moved from INGEST/ to your vault root. It is now searchable, navigable, and accessible to AI agents via MCP.
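As a rough illustration of the last stage of the pipeline, the sketch below prepends a frontmatter block to a Markdown file and moves it from INGEST/ to the vault root. It is not Wikori's actual implementation: the function names, the metadata keys, and the choice to write each value as a JSON literal (which is also valid YAML) are assumptions made for this example.

```python
import json
import shutil
from pathlib import Path


def render_frontmatter(meta: dict) -> str:
    """Render metadata as a YAML frontmatter block.

    Each value is written as a JSON literal, which YAML also accepts,
    so titles containing colons or quotes stay valid.
    """
    lines = ["---"]
    for key, value in meta.items():
        lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n\n"


def finalize_note(ingest_file: Path, vault_root: Path, meta: dict) -> Path:
    """Prepend frontmatter to the file, then move it into the vault root."""
    body = ingest_file.read_text(encoding="utf-8")
    ingest_file.write_text(render_frontmatter(meta) + body, encoding="utf-8")
    target = vault_root / ingest_file.name
    shutil.move(str(ingest_file), str(target))
    return target
```

Calling `finalize_note(vault / "INGEST" / "note.md", vault, {"title": "Hello", "tags": ["demo"]})` would leave `note.md` in the vault root with the frontmatter on top and the original body unchanged below it.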
From Files
The simplest way to ingest content is to drop a file directly into the INGEST/ folder. Wikori's file watcher picks it up within seconds.
Two ways to do it
Via file manager: Open your vault directory in Finder (macOS), Explorer (Windows), or your Linux file manager. Drag any supported file into the INGEST/ subfolder.
Via the app: Go to the Ingest page and follow the instructions in the "From Files" section — it shows the exact path to your current vault's INGEST folder.
Supported formats
| Type | Extensions | Processing |
|---|---|---|
| Documents | PDF, DOCX, XLSX | Text extracted to Markdown, original preserved in files/ |
| Office (basic) | PPTX, ODT, ODS, ODP, TXT, CSV | Processed as-is or with basic extraction |
| Notes | MD | Processed directly; no conversion needed |
| Images | PNG, JPG, JPEG, WEBP, TIFF, BMP | AI vision model analyzes content, generates description |
Only .md files (and those that get converted to .md) are processed by the AI pipeline. Other file types are moved to the vault but won't receive YAML enrichment unless they are converted first.
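The format table can be read as a lookup from file extension to processing route. The sketch below shows one way to express that mapping. The route labels (`convert`, `basic`, `direct`, `vision`, `unsupported`) and the function name are hypothetical and chosen only for illustration. Only the extension lists come from the table.

```python
from pathlib import Path

# Extension groups taken from the "Supported formats" table.
CONVERTED = {".pdf", ".docx", ".xlsx"}
BASIC = {".pptx", ".odt", ".ods", ".odp", ".txt", ".csv"}
NOTES = {".md"}
IMAGES = {".png", ".jpg", ".jpeg", ".webp", ".tiff", ".bmp"}


def processing_route(filename: str) -> str:
    """Return a hypothetical route label for a file dropped into INGEST/."""
    ext = Path(filename).suffix.lower()
    if ext in CONVERTED:
        return "convert"      # text extracted to Markdown, original kept
    if ext in BASIC:
        return "basic"        # processed as-is or with basic extraction
    if ext in NOTES:
        return "direct"       # already Markdown
    if ext in IMAGES:
        return "vision"       # described by an AI vision model
    return "unsupported"
```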
From URLs & YouTube
Wikori can scrape any public web page and extract YouTube transcripts. Go to Ingest → From URLs.
Adding URLs
Paste one or more URLs into the input field and click Add to Queue. URLs are added to a queue — you can add many before starting the pipeline.
YouTube URLs
Paste any YouTube video URL (e.g. https://youtube.com/watch?v=...). Wikori automatically detects it, fetches the full transcript, and saves it as a Markdown file — no audio processing required.
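For readers curious how YouTube detection can work in principle, here is a minimal sketch that recognizes two common YouTube URL shapes and extracts the video ID. How Wikori actually detects these URLs is not documented here, so the function name and the set of recognized hosts are assumptions.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def youtube_video_id(url: str) -> Optional[str]:
    """Return the video ID if the URL looks like a YouTube video, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Long form: https://youtube.com/watch?v=<id>
    if host in {"youtube.com", "m.youtube.com"} and parsed.path == "/watch":
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None
    # Short form: https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    return None
```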
Running the pipeline
The queue shows a count of pending URLs.
Wikori scrapes pages sequentially with polite delays to avoid triggering anti-bot measures.
Processed URLs disappear from the queue. Each becomes a .md file in INGEST/ and goes through AI enrichment.
Click Stop Pipeline at any time. Use Clear Queue to remove pending URLs without processing them.
Before starting a large batch, click Test Connection to confirm that your AI endpoint is reachable.
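The queue behavior described above (one URL at a time, a pause between requests, and the ability to stop midway) can be sketched as a simple loop. This is an illustration under assumptions, not Wikori's code: `run_queue`, `fetch`, `should_stop`, and the two-second default delay are all hypothetical.

```python
import time
from collections import deque
from typing import Callable, Deque, List


def run_queue(
    queue: Deque[str],
    fetch: Callable[[str], None],
    delay_seconds: float = 2.0,
    should_stop: Callable[[], bool] = lambda: False,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Process URLs one at a time, pausing between requests.

    Processed URLs are removed from the queue. URLs still pending
    when should_stop() returns True remain in the queue.
    """
    processed: List[str] = []
    while queue and not should_stop():
        url = queue.popleft()
        fetch(url)
        processed.append(url)
        if queue:
            sleep(delay_seconds)  # polite delay before the next page
    return processed
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting. In normal use the default `time.sleep` applies.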
Web Crawler
The Web Crawler takes a single starting URL and automatically discovers every linked page — then lets you review, curate, and ingest the ones you want. Go to Ingest → Web Crawler.
This is ideal for ingesting entire documentation sites, knowledge bases, blog archives, or competitor resource sections in one operation — instead of pasting URLs one by one.
How it works
1. Enter the starting page. The crawler follows every link it finds on that page, then the links on those pages, and so on, in breadth-first order.
2. Set the depth (how many levels of links to follow), domain scope (stay on the same domain or allow external links), maximum number of URLs to discover, and optional filters to include or exclude URL patterns.
3. Click Start Crawl. Wikori shows live progress: pages visited, links found, and links discarded by your filters. You can cancel at any time and still review partial results.
4. The discovered URLs appear in a checklist. Select or deselect individual URLs, or batch-select all. This is your chance to remove irrelevant pages (login screens, terms of service, etc.) before processing.
5. Click Add to URL Pipeline. The selected URLs are fed into the same URL scraping pipeline used by single-URL ingestion: each page is scraped, converted to Markdown, and AI-enriched.
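The discovery phase can be pictured as a breadth-first search bounded by depth, domain scope, a URL cap, and exclusion patterns. The sketch below shows that idea on top of a caller-supplied `get_links` function, so it runs without network access. It is an illustration only: the function and parameter names are hypothetical, exclusion is simplified to substring matching, and counting the seed toward the URL cap is an assumption.

```python
from collections import deque
from typing import Callable, Iterable, List, Sequence
from urllib.parse import urlparse


def crawl(
    seed: str,
    get_links: Callable[[str], Iterable[str]],
    max_depth: int = 2,
    same_domain: bool = True,
    max_urls: int = 100,
    exclude: Sequence[str] = (),
) -> List[str]:
    """Discover URLs breadth-first, starting from the seed."""
    seed_host = urlparse(seed).hostname
    discovered = [seed]
    seen = {seed}
    frontier = deque([(seed, 0)])  # (url, depth) pairs, oldest first
    while frontier and len(discovered) < max_urls:
        url, depth = frontier.popleft()
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for link in get_links(url):
            if link in seen:
                continue
            seen.add(link)
            if same_domain and urlparse(link).hostname != seed_host:
                continue  # outside the domain scope
            if any(pattern in link for pattern in exclude):
                continue  # discarded by a filter
            discovered.append(link)
            frontier.append((link, depth + 1))
            if len(discovered) >= max_urls:
                break
    return discovered
```

The returned list corresponds to the checklist shown for review. Nothing is scraped or ingested at this stage.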
Crawler profiles
If you crawl the same sites regularly (e.g., a vendor's documentation that updates monthly), save a crawler profile — a reusable configuration with the seed URL, depth, scope, and filters pre-set. Select a profile next time and click Start instead of reconfiguring.
| Setting | Description | Example |
|---|---|---|
| Seed URL | The starting page | https://docs.example.com |
| Depth | How many link levels to follow | 2 (seed → linked pages → their linked pages) |
| Domain scope | Stay on same domain or allow external | Same domain only |
| Max URLs | Stop after this many discovered URLs | 100 |
| Filters | Include or exclude URL patterns | Exclude /login, /signup, .pdf |
Pro tip: Start with a low depth (1–2) and a tight domain scope to get a feel for the site's structure. You can always run a deeper crawl later. The curation step means you never accidentally ingest hundreds of irrelevant pages.
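Conceptually, a crawler profile is a small named bundle of the settings in the table. The sketch below models one as a dataclass that round-trips through JSON. The field names and the JSON format are assumptions for illustration. Wikori's actual storage format for profiles is not described here.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class CrawlerProfile:
    """Hypothetical in-memory shape of a saved crawler profile."""
    name: str
    seed_url: str
    depth: int = 2
    same_domain_only: bool = True
    max_urls: int = 100
    exclude_patterns: List[str] = field(default_factory=list)


def dumps_profile(profile: CrawlerProfile) -> str:
    """Serialize a profile to a JSON string."""
    return json.dumps(asdict(profile), indent=2)


def loads_profile(text: str) -> CrawlerProfile:
    """Rebuild a profile from its JSON string."""
    return CrawlerProfile(**json.loads(text))
```

For example, `CrawlerProfile("vendor-docs", "https://docs.example.com", exclude_patterns=["/login", "/signup", ".pdf"])` captures the example values from the table.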
From Email (IMAP)
Wikori can monitor an email inbox and automatically convert incoming messages into knowledge entries. Configure this in Settings → Email Ingestion.
Important: Wikori downloads and deletes matching emails from the mailbox. We strongly recommend using a dedicated mailbox or email alias, not your main inbox.
Configuration
| Field | Description |
|---|---|
| IMAP Host | Your mail server address (e.g. imap.gmail.com) |
| Port | Usually 993 for SSL |
| Username | Your email address |
| Password | App password or IMAP password — encrypted on save |
| Trusted Senders | Comma-separated list of email addresses. Only emails from these addresses are processed. |
| Vault Routing | Emails are routed to the vault whose email tag matches a tag in the subject line. Untagged emails go to the active vault. |
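The Trusted Senders and Vault Routing rules amount to two small checks per message. The sketch below shows one possible reading of them. The exact tag syntax in subject lines is not specified above, so treating a tag as a whitespace-separated token such as `#research` is an assumption, as are the function names.

```python
from email.utils import parseaddr
from typing import Dict


def is_trusted(from_header: str, trusted_senders: str) -> bool:
    """Check the From address against a comma-separated allow list."""
    _, address = parseaddr(from_header)
    allowed = {s.strip().lower() for s in trusted_senders.split(",") if s.strip()}
    return address.lower() in allowed


def route_vault(subject: str, vault_tags: Dict[str, str], active_vault: str) -> str:
    """Pick the vault whose tag appears in the subject, else the active vault."""
    words = subject.lower().split()
    for tag, vault in vault_tags.items():
        if tag.lower() in words:
            return vault
    return active_vault
```

With `vault_tags = {"#startup": "Startup", "#research": "Research"}`, a subject of `Notes #research Q2` would route to `Research`, while an untagged subject falls back to the active vault.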
Running the pipeline
After saving your settings, click Start Pipeline to begin monitoring. Wikori checks the inbox periodically and processes any new emails from trusted senders. You can also click Check Now for an immediate poll.
Quick Notes Overlay
Quick Notes is a floating window that appears over any app — Figma, VS Code, a browser, a Zoom call — so you can capture thoughts without breaking your flow.
Opening the overlay
| OS | Hotkey | Notes |
|---|---|---|
| macOS | ⌥ + ⌥ (both Option keys) | Requires Accessibility permission |
| Windows / Linux | Alt + Alt (both Alt keys) | — |
Writing and routing notes
The overlay opens ready to type. Notes auto-save every 5 seconds and when you close the window.
The bottom bar shows a tag for each vault (e.g. #startup, #research). Click one to route this note to that vault. The tag turns bold and uppercase when selected.
Navigate between notes with ⌃ + ⌥ + → / ← (macOS) or Ctrl + Alt + → / ← (Windows/Linux). Each note can be routed to a different vault.
When you are ready to send your notes, click the small dot button at the bottom-right of the tag bar. Wikori routes each tagged note to its vault's INGEST folder, and the button flashes green to confirm. Untagged notes remain in temp storage.
The overlay is draggable, resizable, and always stays on top of other windows. Its position and size are remembered across app restarts. Press Esc to dismiss it — it saves your note before closing.
From Direct Text
For quick notes that you want to type directly in the app without using the overlay, go to Ingest → From Text.
Enter a filename (e.g. meeting-notes-2026-05-18) and type or paste your content. Click Save to INGEST and the file is written directly to your vault's INGEST folder and queued for AI processing.
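In effect, Save to INGEST writes a Markdown file into the vault's INGEST folder. A minimal sketch of that step follows. The filename sanitizing rule and the function name are assumptions for illustration, and this version silently overwrites an existing file with the same name, which the real app may handle differently.

```python
import re
from pathlib import Path


def save_to_ingest(vault: Path, filename: str, content: str) -> Path:
    """Write content to <vault>/INGEST/<filename>.md and return the path."""
    # Replace characters that are unsafe in filenames (assumed rule).
    stem = re.sub(r"[^\w.-]+", "-", filename.strip()).strip("-.") or "untitled"
    if stem.lower().endswith(".md"):
        stem = stem[:-3]  # avoid a doubled extension
    ingest = vault / "INGEST"
    ingest.mkdir(parents=True, exist_ok=True)
    target = ingest / f"{stem}.md"
    target.write_text(content, encoding="utf-8")
    return target
```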
Monitoring Ingest Progress
Switch to the Status page at any time to see what's happening:
| Section | What it shows |
|---|---|
| Vault Status | Active vault path, file watcher state, and total indexed count |
| Queue → Unindexed | Files in vault root not yet in the knowledge index. Click Process Unindexed to queue them. |
| Queue → Ingest Files | Files currently waiting in INGEST/ plus active processing count |
| Queue → Failed Files | Files that failed AI processing — retry individually or in bulk |