PDF Structure Parser is a Python project designed to extract the structure (e.g., chapters, sections, subsections) from PDF files and output it in a structured JSON format. This tool is particularly ...
A focused pipeline to parse medical guidelines (PDF/HTML) into structured JSON for downstream clinical RAG or summarization. This implements models, parsers, normalization utils, and a CLI to ingest ...