File Upload Ingestion
What this page is for
Upload PDF, DOCX, or plain text files to create draft sessions for chunking and publishing.
Endpoint
POST /api/ingest/file (multipart/form-data)
Supported formats
| Format | Extension | Parser |
|---|---|---|
| PDF | .pdf | pdf-parse |
| Word | .docx | mammoth |
| Plain text | .txt | UTF-8 reader |
Maximum file size: 10 MB.
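The limits above can be checked on the client before any bytes are sent. The following is a minimal sketch, not the service's own validation: the names checkUpload, MAX_UPLOAD_BYTES, and SUPPORTED_EXTENSIONS are hypothetical, "10 MB" is assumed to mean 10 * 1024 * 1024 bytes, and extensions are assumed to match case-insensitively. The server remains the authority on both points.

```typescript
// Client-side pre-flight check mirroring the documented limits.
// Assumption: "10 MB" is read as 10 * 1024 * 1024 bytes; the server may count differently.
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024;
const SUPPORTED_EXTENSIONS = [".pdf", ".docx", ".txt"];

type UploadCheck = { ok: true } | { ok: false; reason: string };

function checkUpload(filename: string, sizeBytes: number): UploadCheck {
  // Take everything from the last dot onward as the extension.
  const dot = filename.lastIndexOf(".");
  // Assumption: extension matching is case-insensitive.
  const ext = dot >= 0 ? filename.slice(dot).toLowerCase() : "";

  if (!SUPPORTED_EXTENSIONS.includes(ext)) {
    return { ok: false, reason: `Unsupported file type: ${ext || "(none)"}` };
  }
  if (sizeBytes > MAX_UPLOAD_BYTES) {
    return { ok: false, reason: "File too large: maximum upload size is 10 MB" };
  }
  return { ok: true };
}
```

Calling checkUpload("document.pdf", 1024) yields an ok result, while an unsupported extension or an oversized file yields a reason string that matches the wording used under Troubleshooting.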
Steps
- Prepare a supported file (PDF, DOCX, or TXT).
- Send it as multipart/form-data with the field name file.
- Optionally include chunkingConfig to control the chunking method.
- Store the returned sessionId and proceed to session operations.
Example
curl -X POST http://localhost:3000/api/ingest/file \
-H "X-User-ID: user-1" \
-F "file=@./document.pdf"
With custom chunking configuration:
curl -X POST http://localhost:3000/api/ingest/file \
-H "X-User-ID: user-1" \
-F "file=@./document.pdf" \
-F "chunkingConfig={\"method\":\"character\",\"chunkSize\":500,\"overlap\":100}"
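The same requests can be issued from code. The sketch below assumes a runtime with the standard fetch, FormData, and Blob globals (a browser or Node.js 18+). The ChunkingConfig interface is inferred only from the curl example above, and buildUploadForm and uploadFile are hypothetical helper names, not part of the API.

```typescript
// Hypothetical shape, inferred only from the curl example; the real schema may differ.
interface ChunkingConfig {
  method: string;
  chunkSize?: number;
  overlap?: number;
}

// Build the multipart body: the file goes in the "file" field and the
// optional config is sent as a JSON string in the "chunkingConfig" field.
function buildUploadForm(
  file: Blob,
  filename: string,
  chunkingConfig?: ChunkingConfig,
): FormData {
  const form = new FormData();
  form.append("file", file, filename);
  if (chunkingConfig) {
    form.append("chunkingConfig", JSON.stringify(chunkingConfig));
  }
  return form;
}

// POST the form to the ingest endpoint. No Content-Type header is set here:
// fetch derives the multipart boundary from the FormData body.
async function uploadFile(
  baseUrl: string,
  userId: string,
  form: FormData,
): Promise<unknown> {
  const res = await fetch(`${baseUrl}/api/ingest/file`, {
    method: "POST",
    headers: { "X-User-ID": userId },
    body: form,
  });
  if (!res.ok) {
    throw new Error(`Upload failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

A call such as uploadFile("http://localhost:3000", "user-1", form) corresponds to the curl commands above. Splitting form construction from the network call keeps the field names easy to test without a running server.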
Response
{
  "sessionId": "abc-123",
  "sourceType": "file",
  "sourceUrl": "upload://document.pdf",
  "status": "created",
  "createdAt": "2025-01-15T10:30:00.000Z"
}
Verify
- Response contains sessionId, sourceType: "file", and timestamps.
- GET /api/session/<id> returns the draft with extracted text content.
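The checks above can be sketched as a small type guard plus a helper that builds the follow-up path. Field names are taken from the sample response. The IngestResponse interface and the isFileIngestResponse and sessionPath names are hypothetical, and the real payload may carry additional fields.

```typescript
// Fields mirror the sample response; only the ones named under "Verify" are required here.
interface IngestResponse {
  sessionId: string;
  sourceType: "file";
  createdAt: string;
  sourceUrl?: string;
  status?: string;
}

// Returns true when the value has a non-empty sessionId,
// sourceType "file", and a createdAt string that parses as a date.
function isFileIngestResponse(value: unknown): value is IngestResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.sessionId === "string" &&
    v.sessionId.length > 0 &&
    v.sourceType === "file" &&
    typeof v.createdAt === "string" &&
    !Number.isNaN(Date.parse(v.createdAt))
  );
}

// Path for the follow-up GET that should return the draft session.
function sessionPath(sessionId: string): string {
  return `/api/session/${encodeURIComponent(sessionId)}`;
}
```

After a successful upload, passing the parsed JSON through isFileIngestResponse and then requesting sessionPath(response.sessionId) covers both verification steps.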
Troubleshooting
- Unsupported file type: only .pdf, .docx, and .txt are supported.
- File too large: maximum upload size is 10 MB.
- Empty content: ensure the file contains extractable text (scanned PDFs without OCR will produce empty content).
Next steps
- Sessions — edit and refine the generated chunks.
- Configurable Chunking — control how content is split.