# PDFMathTranslate

**Repository Path**: deepbluethinker/PDFMathTranslate

## Basic Information

- **Project Name**: PDFMathTranslate
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: AGPL-3.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2024-11-12
- **Last Updated**: 2025-01-10

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

PDF scientific paper translation and bilingual comparison.

- Retain formulas and charts.
- Preserve table of contents.
- Support multiple translation services.

Feel free to provide feedback in [issues](https://github.com/Byaidu/PDFMathTranslate/issues) or the [user group](https://t.me/+Z9_SgnxmsmA5NzBl).

## Installation

Requires Python version >=3.8, <=3.12.

```bash
pip install pdf2zh
```

## Usage

Run the translation command from the command line to generate the translated document `example-zh.pdf` and the bilingual document `example-dual.pdf` in the current directory. Google is the default translation service.

### Translate the entire document

```bash
pdf2zh example.pdf
```

### Translate part of the document

```bash
pdf2zh example.pdf -p 1-3,5
```

### Translate with the specified language

See [Google Languages Codes](https://developers.google.com/admin-sdk/directory/v1/languages), [DeepL Languages Codes](https://developers.deepl.com/docs/resources/supported-languages).

```bash
pdf2zh example.pdf -li en -lo ja
```

### Translate with DeepL/DeepLX

See [DeepLX](https://github.com/OwO-Network/DeepLX).

Set environment variables to construct an endpoint like: `{DEEPL_SERVER_URL}/{DEEPL_AUTH_KEY}/translate`

- `DEEPL_SERVER_URL` (Optional), e.g., `export DEEPL_SERVER_URL=https://api.deepl.com`
- `DEEPL_AUTH_KEY`, e.g., `export DEEPL_AUTH_KEY=xxx`

```bash
pdf2zh example.pdf -s deepl
```

### Translate with Ollama

See [Ollama](https://github.com/ollama/ollama).

Set environment variables to construct an endpoint like: `{OLLAMA_HOST}/api/chat`

- `OLLAMA_HOST` (Optional), e.g., `export OLLAMA_HOST=http://localhost:11434`

```bash
pdf2zh example.pdf -s ollama:gemma2
```

### Translate with OpenAI/SiliconCloud

See [OpenAI](https://platform.openai.com/docs/overview).
Set environment variables to construct an endpoint like: `{OPENAI_BASE_URL}/chat/completions`

- `OPENAI_BASE_URL` (Optional), e.g., `export OPENAI_BASE_URL=https://api.openai.com/v1`
- `OPENAI_API_KEY`, e.g., `export OPENAI_API_KEY=xxx`

```bash
pdf2zh example.pdf -s openai:gpt-4o
```

### Use regex to specify formula fonts and characters that need to be preserved

The `-f` option takes a regex matched against font names, and `-c` takes a regex matched against characters; text that matches is preserved as formula content rather than translated.

```bash
pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"
```

## Preview

## Acknowledgement

- Document merging: [PyMuPDF](https://github.com/pymupdf/PyMuPDF)
- Document parsing: [Pdfminer.six](https://github.com/pdfminer/pdfminer.six)
- Document extraction: [MinerU](https://github.com/opendatalab/MinerU)
- Multi-threaded translation: [MathTranslate](https://github.com/SUSYUSTC/MathTranslate)
- Layout parsing: [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)
- Document standard: [PDF Explained](https://zxyle.github.io/PDF-Explained/), [PDF Cheat Sheets](https://pdfa.org/resource/pdf-cheat-sheets/)

## Contributors
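
## Example: checking the `-f` and `-c` patterns

Before running a long translation, it can help to check which font names and characters the regexes from the Usage section would match. The sketch below is a minimal illustration, not the tool's own code: it assumes the patterns are interpreted as Python regular expressions and matched from the start of the string (as `re.match` does). The helper names `is_formula_font` and `is_formula_char` and the sample font names are hypothetical.

```python
import re

# Patterns copied from the pdf2zh command in the Usage section.
# Raw strings keep the backslashes exactly as the shell would pass them.
FONT_PATTERN = re.compile(r"(CM[^RT].*|MS.*|.*Ital)")
CHAR_PATTERN = re.compile(r"(\(|\||\)|\+|=|\d|[\u0080-\ufaff])")


def is_formula_font(font_name: str) -> bool:
    """Return True if the font name matches the -f pattern (hypothetical helper)."""
    return FONT_PATTERN.match(font_name) is not None


def is_formula_char(char: str) -> bool:
    """Return True if the character matches the -c pattern (hypothetical helper)."""
    return CHAR_PATTERN.match(char) is not None


if __name__ == "__main__":
    # Sample font names chosen for illustration only.
    for name in ["CMMI10", "CMR10", "CMTT10", "MSBM10", "Times-Italic", "Times-Roman"]:
        print(f"{name:14s} preserved as formula font: {is_formula_font(name)}")

    for ch in ["+", "=", "7", "(", "|", "a", "\u03b1"]:
        print(f"{ch!r:6s} preserved as formula char: {is_formula_char(ch)}")
```

Under these assumptions, `CM[^RT].*` keeps Computer Modern math fonts while excluding the roman (`CMR`) and typewriter (`CMT`) families, and the character class `[\u0080-\ufaff]` covers non-ASCII symbols such as Greek letters.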