
Ebook Converter

Convert ebooks between EPUB, PDF, FB2, RTF, Markdown & more -- with EPUB 3.0 builder

Table of Contents

  1. Overview
  2. Key Features
  3. How to Use
  4. Technical Deep Dive
  5. Frequently Asked Questions
  6. Privacy & Security

Overview

The Ebook Converter is a browser-based ebook format conversion tool that handles 7 input formats and 4 output formats entirely on your device. It accepts EPUB, PDF, TXT, HTML, FB2 (FictionBook 2), RTF (Rich Text Format), and Markdown files, and can convert them to EPUB, PDF, TXT, or HTML output. With 24 distinct conversion paths (every input-to-output pairing except converting a format to itself), it covers most of the ebook format transformations that readers, authors, and publishers encounter.

At its core, the converter builds valid EPUB 3.0 packages from scratch -- complete with an OPF manifest defining the reading order via spine idref attributes, XHTML 1.1 content documents, and a navigation document with <nav> elements. It parses FB2 XML with semantic tag conversion, extracts PDF text using Mozilla's pdf.js library, strips RTF control sequences with a 3-pass regex pipeline, and renders Markdown using a custom regex-based engine -- all without any server involvement.

The tool relies on four JavaScript libraries that run entirely in your browser: JSZip 3.10.1 for reading and creating ZIP/EPUB containers, jsPDF 2.5.1 for generating PDF documents with proper typography, pdf.js 3.11.174 for extracting text from PDF pages, and FileSaver 2.0.5 for triggering file downloads. Your ebook content, metadata, and file data never leave your device -- every step from parsing to packaging to downloading happens locally.

Key Features

7 Input Formats

EPUB, PDF, TXT, HTML, FB2 (FictionBook), RTF, and Markdown -- covering many of the most widely used ebook and text formats. The input format is auto-detected from the file extension, so you simply drop a file and the converter picks the matching parser.

EPUB 3.0 Builder

Creates valid EPUB packages with a full OPF manifest, spine reading order via idref attributes, a <nav> navigation document, XHTML 1.1 content with proper DOCTYPE, and DEFLATE level 9 compression. The mimetype entry is stored uncompressed as required by the EPUB specification.

FB2 XML Parser

Converts FictionBook 2 semantic tags -- <section> to <div>, <title> to <h2>, <emphasis> to <em>, <strong> to <strong>, and <empty-line/> to <br><br> -- producing styled HTML with Georgia serif typography, 720px max-width, and 1.7 line-height.

PDF Text Extraction

Mozilla's pdf.js 3.11.174 loads each page and extracts its text content via getTextContent(), assembling a complete text representation of multi-page documents. It supports the PDF encodings and font mappings that pdf.js handles.

Content Preview

See your ebook content before converting: EPUB chapters extracted from up to 10 HTML files, PDF text from the first 5 pages, FB2 body content with tags stripped, RTF with control sequences removed, and Markdown rendered. All previews are capped at 5,000 characters with a "[... truncated]" indicator.

Custom Markdown Engine

A regex-based Markdown renderer supporting # h1 through ### h3 headings, **bold**, *italic*, `code`, - lists converted to <ul>, > blockquotes with left-bordered styling, and --- horizontal rules. No external Markdown library is used -- the entire renderer is custom regex.

RTF Control Sequence Stripping

A 3-pass regex pipeline handles Rich Text Format files: first pass converts \par to newlines, second pass removes {\ control sequences and decodes hex escapes like \'XX, and third pass strips remaining control words matching /\\[a-z]+\d*\s?/gi and collapses excessive newlines.

4 JavaScript Libraries

JSZip 3.10.1 for reading and creating ZIP/EPUB containers, jsPDF 2.5.1 for generating PDF documents with Helvetica typography, pdf.js 3.11.174 for extracting text from PDF pages, and FileSaver 2.0.5 for triggering cross-browser file downloads. All run client-side.

How to Use

  1. Open the Ebook Converter - Launch the tool and drag-and-drop your ebook file onto the upload area, or click the area to open a file browser. Supported extensions: .epub, .pdf, .txt, .html, .htm, .fb2, .rtf, and .md.
  2. Auto-Detection - The input format is automatically detected from the file extension. The converter displays the detected format and begins parsing the file immediately to prepare a content preview.
  3. Verify with Preview - Expand the content preview section to see a summary of your ebook content. For EPUB files, you will see text from the first chapters in spine order. For PDFs, text from the first 5 pages. For FB2, the body content with XML tags stripped. This lets you confirm the file was parsed correctly before converting.
  4. Select Output Format - Choose your desired output format from the dropdown menu. Only compatible output formats are shown -- for example, an EPUB input offers PDF, TXT, and HTML as outputs. FB2, RTF, and Markdown inputs offer all four output formats.
  5. Convert - Click the "Convert File" button. A progress bar tracks the three stages: extraction (parsing the source format), conversion (transforming content to the target format), and packaging (assembling the output file). Conversion speed depends on file size and format complexity.
  6. Download - Your converted ebook downloads automatically with the correct file extension. The filename keeps the original base name and replaces the extension -- for example, mybook.epub converted to PDF becomes mybook.pdf.
  7. EPUB Output Validation - When converting to EPUB, the output file is a valid EPUB 3.0 package that opens in any standards-compliant ebook reader including Apple Books, Calibre, Kobo, Google Play Books, and most e-ink devices. The package includes a proper mimetype entry, META-INF/container.xml, OPF manifest with spine, and XHTML content documents.
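
Steps 2 and 6 can be sketched in a few lines. This is a minimal illustration, not the tool's actual code: the names FORMAT_BY_EXTENSION, detectFormat, and outputName are hypothetical, and only the extension list and the mybook.epub example come from the steps above.

```javascript
// Hypothetical lookup table; the extensions are the ones listed in step 1.
const FORMAT_BY_EXTENSION = new Map([
  ["epub", "EPUB"], ["pdf", "PDF"], ["txt", "TXT"], ["html", "HTML"],
  ["htm", "HTML"], ["fb2", "FB2"], ["rtf", "RTF"], ["md", "Markdown"],
]);

// Return the detected input format, or null for an unsupported extension.
function detectFormat(filename) {
  const match = /\.([^.\/\\]+)$/.exec(filename);
  if (!match) return null;
  return FORMAT_BY_EXTENSION.get(match[1].toLowerCase()) ?? null;
}

// Keep the base name and swap the extension: mybook.epub -> mybook.pdf.
function outputName(filename, targetExtension) {
  const base = filename.replace(/\.[^.\/\\]+$/, "");
  return `${base}.${targetExtension}`;
}

console.log(detectFormat("mybook.epub"), outputName("mybook.epub", "pdf"));
```

Matching on the lowercased extension means Book.EPUB and book.epub are treated the same; a file with no recognized extension yields null rather than a guess.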

Technical Deep Dive

EPUB Parsing: OPF Manifest and Spine Reading Order

When the converter reads an EPUB file, it uses JSZip to unzip the container and locate the .opf file (the OPF Package Document). This file is the heart of every EPUB -- it contains three critical sections: the <metadata> block with title, author, and language information; the <manifest> block that maps unique IDs to file paths within the ZIP; and the <spine> block that defines the reading order.

The spine is what makes chapters appear in the correct sequence. Each <itemref idref="chapter1"/> element in the spine references an item in the manifest by its id attribute. The converter builds a map of id to href from the manifest, then iterates through the spine's idref values to read content files in the author's intended order. If the OPF file is missing or malformed, the converter falls back to reading content files in alphabetical order.
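
The id-to-href lookup and the alphabetical fallback can be sketched as follows. This is an illustration under assumptions: readingOrder and parseAttributes are hypothetical names, and the regex-based attribute parsing (double-quoted attributes only) stands in for whatever XML parsing the tool really uses.

```javascript
// Collect double-quoted attributes from one tag, e.g. <item id="c1" href="ch1.xhtml"/>.
function parseAttributes(tagText) {
  const attrs = {};
  const pattern = /([\w:-]+)\s*=\s*"([^"]*)"/g;
  let m;
  while ((m = pattern.exec(tagText)) !== null) attrs[m[1]] = m[2];
  return attrs;
}

// Return content hrefs in spine order. The hrefs are relative to the OPF file.
// Falls back to alphabetical order when the OPF is missing or has no usable spine.
function readingOrder(opfText, contentPaths) {
  const source = opfText || "";
  const idToHref = new Map();
  // \b keeps <item ...> from also matching <itemref ...>.
  for (const [tag] of source.matchAll(/<item\b[^>]*>/g)) {
    const { id, href } = parseAttributes(tag);
    if (id && href) idToHref.set(id, href);
  }
  const ordered = [];
  for (const [tag] of source.matchAll(/<itemref\b[^>]*>/g)) {
    const href = idToHref.get(parseAttributes(tag).idref);
    if (href) ordered.push(href);
  }
  return ordered.length > 0 ? ordered : [...contentPaths].sort();
}
```

Note that the manifest order is irrelevant here: only the sequence of itemref elements decides which chapter comes first.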

For text extraction, each HTML/XHTML content file has its tags stripped and entities decoded -- specifically &nbsp;, &amp;, &lt;, &gt;, &quot;, and &#39;. For HTML extraction mode, the converter extracts the <body> content from each chapter and joins them with <hr> separators to produce a single combined HTML document.
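
A rough sketch of both extraction modes, assuming simple regex handling: the function names htmlToText, extractBody, and combineChapters are hypothetical, and the entity list is the one given above.

```javascript
// The six entities named above. &amp; is decoded last so that text such as
// "&amp;lt;" becomes "&lt;" rather than being decoded twice into "<".
const ENTITIES = [
  ["&nbsp;", " "], ["&lt;", "<"], ["&gt;", ">"],
  ["&quot;", '"'], ["&#39;", "'"], ["&amp;", "&"],
];

// Text mode: strip tags, then decode entities.
function htmlToText(html) {
  let text = html.replace(/<[^>]*>/g, "");
  for (const [entity, char] of ENTITIES) text = text.split(entity).join(char);
  return text;
}

// HTML mode: keep only what sits inside <body>...</body>, if present.
function extractBody(html) {
  const m = /<body[^>]*>([\s\S]*?)<\/body>/i.exec(html);
  return m ? m[1] : html;
}

// Join chapter bodies with <hr> separators into one combined fragment.
function combineChapters(chapters) {
  return chapters.map(extractBody).join("\n<hr>\n");
}
```

Stripping tags before decoding matters: decoding first would turn an escaped "&lt;b&gt;" into a real-looking tag that the tag stripper would then delete.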

EPUB 3.0 Builder: Package Structure

When generating EPUB output, the converter creates a standards-compliant EPUB 3.0 package from scratch using JSZip with DEFLATE level 9 compression. The ZIP structure follows the EPUB specification exactly:

First, the mimetype file is added uncompressed (no DEFLATE, no extra fields) with the exact content application/epub+zip and no trailing newline -- this is a strict requirement of the EPUB specification and is what allows ebook readers to identify the file as an EPUB container.

Next, META-INF/container.xml points to the Package Document at OEBPS/content.opf. The OPF file contains: a <dc:identifier> set to ebook-converter-{timestamp} using Date.now() for uniqueness; <dc:title> from the source filename; <dc:language> set to en; a <meta property="dcterms:modified"> timestamp; a manifest declaring the content file and the navigation document (with the nav property); and a spine with an <itemref> pointing to the content chapter.

The content itself is stored as OEBPS/chapter1.xhtml with a proper XHTML 1.1 DOCTYPE declaration. The navigation document at OEBPS/toc.xhtml uses the HTML5 <nav epub:type="toc"> element as required by EPUB 3.0.
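
The package layout described above can be sketched as plain strings before any zipping happens. This is an illustration only: buildEpubEntries and escapeXml are hypothetical names, the injectable now parameter replaces the direct Date.now() call so the output is reproducible, and the exact XML the tool emits may differ.

```javascript
// Escape text for inclusion in XML element content.
function escapeXml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Build the entries of a single-chapter EPUB as { path, content, compress }.
// `bodyXhtml` is assumed to be well-formed XHTML already.
function buildEpubEntries(title, bodyXhtml, now = new Date()) {
  const safeTitle = escapeXml(title);
  const identifier = `ebook-converter-${now.getTime()}`;
  // dcterms:modified uses CCYY-MM-DDThh:mm:ssZ, so drop the milliseconds.
  const modified = now.toISOString().replace(/\.\d{3}Z$/, "Z");

  const containerXml = `<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>`;

  const opf = `<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">${identifier}</dc:identifier>
    <dc:title>${safeTitle}</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">${modified}</meta>
  </metadata>
  <manifest>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="toc" href="toc.xhtml" media-type="application/xhtml+xml" properties="nav"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>`;

  const chapter = `<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>${safeTitle}</title></head>
<body>${bodyXhtml}</body>
</html>`;

  const nav = `<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head><title>Contents</title></head>
<body>
  <nav epub:type="toc"><ol><li><a href="chapter1.xhtml">${safeTitle}</a></li></ol></nav>
</body>
</html>`;

  return [
    // mimetype goes first, stored without compression and without a trailing newline.
    { path: "mimetype", content: "application/epub+zip", compress: false },
    { path: "META-INF/container.xml", content: containerXml, compress: true },
    { path: "OEBPS/content.opf", content: opf, compress: true },
    { path: "OEBPS/chapter1.xhtml", content: chapter, compress: true },
    { path: "OEBPS/toc.xhtml", content: nav, compress: true },
  ];
}
```

With JSZip, each entry could then be added via zip.file(path, content, { compression: compress ? "DEFLATE" : "STORE" }), keeping the array order so that mimetype is written first.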

FB2 Format: History and Parsing

FictionBook 2 (FB2) is an XML-based ebook format created by Dmitry Gribov in the early 2000s. It became enormously popular in Russia and Eastern Europe as a semantic ebook format -- rather than describing visual layout like HTML, FB2 uses tags that describe the meaning of content. A <title> tag means "this is a title," a <section> tag means "this is a logical section," and <emphasis> means "this text should be emphasized."

The converter's FB2 parser extracts the <body> element and performs semantic tag conversion: <section> becomes <div>, <title> becomes <h2>, <emphasis> becomes <em>, <strong> maps directly to HTML <strong>, and <empty-line/> is converted to <br><br>. The resulting HTML is styled with Georgia serif font, a maximum width of 720px, and a line-height of 1.7 for comfortable reading. For plain text extraction, all XML and HTML tags are simply stripped from the body content.
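
The tag mapping above can be approximated with a handful of regex replacements. This is a sketch, not the tool's parser: fb2BodyToHtml, wrapStyled, and fb2ToText are hypothetical names, only the first <body> is read, and the inline style string is an assumption built from the three style values mentioned above.

```javascript
// Convert the first FB2 <body> into HTML using the tag mapping described above.
function fb2BodyToHtml(fb2Xml) {
  const m = /<body[^>]*>([\s\S]*?)<\/body>/i.exec(fb2Xml);
  const body = m ? m[1] : "";
  return body
    .replace(/<section(?:\s[^>]*)?>/gi, "<div>")
    .replace(/<\/section>/gi, "</div>")
    .replace(/<title(?:\s[^>]*)?>/gi, "<h2>")
    .replace(/<\/title>/gi, "</h2>")
    .replace(/<emphasis>/gi, "<em>")
    .replace(/<\/emphasis>/gi, "</em>")
    .replace(/<empty-line\s*\/>/gi, "<br><br>");
  // <strong> needs no rule: it is already valid HTML.
}

// Wrap converted HTML in the reading styles mentioned above (assumed markup).
function wrapStyled(html) {
  const style = "font-family: Georgia, serif; max-width: 720px; line-height: 1.7;";
  return `<div style="${style}">${html}</div>`;
}

// Plain-text mode: strip every remaining tag from the converted body.
function fb2ToText(fb2Xml) {
  return fb2BodyToHtml(fb2Xml).replace(/<[^>]*>/g, "");
}
```

FB2 titles usually wrap their text in <p>, so this naive mapping produces a <p> inside the <h2>; browsers tolerate that, but a stricter converter would unwrap it.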

RTF Stripping: 3-Pass Regex Pipeline

Rich Text Format files contain a mix of readable text and control sequences that define formatting. The converter uses a 3-pass regex pipeline to extract clean text. The first pass converts \par paragraph markers to newline characters. The second pass handles two tasks: removing {\ control group sequences and decoding hexadecimal character escapes in the format \'XX (where XX is a two-digit hex code) back to their actual characters. The third pass removes remaining RTF control words matching the pattern /\\[a-z]+\d*\s?/gi, strips all remaining curly braces, and collapses runs of three or more consecutive newlines down to two. This is a basic parser suitable for simple RTF documents -- complex formatting, embedded objects, and nested groups may not be fully handled.

Custom Markdown Renderer

Rather than including a full Markdown library like marked.js, the converter uses a custom regex-based renderer that covers the most common Markdown syntax. It processes lines sequentially, converting #, ##, and ### prefixes to <h1>, <h2>, and <h3> tags; **text** to <strong>; *text* to <em>; `code` to <code> with a gray background; lines starting with - to <ul><li> list items; lines starting with > to left-bordered <blockquote> elements; and --- to <hr> horizontal rules. Blocks of text separated by double newlines are wrapped in <p> paragraph tags. The output is styled with Georgia serif typography. Notably, this custom engine does not support links ([text](url)), images (![alt](src)), tables, ordered lists, or nested formatting -- those features would require a full parser rather than line-by-line regex.
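
A line-by-line renderer of this kind can be sketched in well under a hundred lines. The version below is illustrative only: renderInline and renderMarkdown are hypothetical names, the inline style values are assumptions, and the real engine's regexes and edge-case behavior may differ.

```javascript
// Inline pass: bold runs before italic so ** is not read as two single asterisks.
function renderInline(text) {
  return text
    .replace(/\*\*(.+?)\*\*/g, "<strong>$1</strong>")
    .replace(/\*(.+?)\*/g, "<em>$1</em>")
    .replace(/`([^`]+)`/g, '<code style="background:#eee">$1</code>');
}

function renderMarkdown(markdown) {
  const out = [];
  let listOpen = false;
  let paragraph = [];

  const closeList = () => {
    if (listOpen) { out.push("</ul>"); listOpen = false; }
  };
  const flushParagraph = () => {
    if (paragraph.length > 0) {
      out.push(`<p>${renderInline(paragraph.join(" "))}</p>`);
      paragraph = [];
    }
  };

  for (const line of markdown.split(/\r?\n/)) {
    const heading = /^(#{1,3})\s+(.*)$/.exec(line);
    const item = /^-\s+(.*)$/.exec(line);
    const quote = /^>\s?(.*)$/.exec(line);
    const rule = /^---\s*$/.test(line);
    const blank = line.trim() === "";

    if (!item) closeList();
    if (heading || item || quote || rule || blank) flushParagraph();

    if (rule) {
      out.push("<hr>");
    } else if (heading) {
      const level = heading[1].length;
      out.push(`<h${level}>${renderInline(heading[2])}</h${level}>`);
    } else if (item) {
      if (!listOpen) { out.push("<ul>"); listOpen = true; }
      out.push(`<li>${renderInline(item[1])}</li>`);
    } else if (quote) {
      out.push(`<blockquote style="border-left:3px solid #ccc;padding-left:1em">${renderInline(quote[1])}</blockquote>`);
    } else if (!blank) {
      paragraph.push(line.trim());
    }
  }
  flushParagraph();
  closeList();
  return out.join("\n");
}
```

Unsupported syntax simply passes through as text: a line such as [text](url) comes out as a paragraph containing the literal brackets.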

PDF Generation with jsPDF

When outputting to PDF, the converter uses jsPDF to create an A4 document with Helvetica typography. The title is rendered in Helvetica Bold at 18pt, and body text uses Helvetica at 10.5pt. All four margins are set to 20mm, with a 5mm line height for readable spacing. Long lines are automatically wrapped using jsPDF's splitTextToSize() method, which calculates line breaks based on the current font metrics and available page width. When content exceeds the page height, the converter automatically inserts a new page and continues rendering -- ensuring that even very long ebooks produce properly paginated PDF output.
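
The page-break logic follows directly from the numbers above: an A4 page is 297mm tall, so 20mm top and bottom margins leave 257mm, which fits 51 lines at 5mm each. The sketch below shows only that arithmetic, for lines that have already been wrapped; paginate is a hypothetical name, the title block is ignored, and no jsPDF calls are made.

```javascript
const A4_HEIGHT_MM = 297;

// Assign each wrapped line a vertical position, starting a new page whenever
// the next line would cross the bottom margin.
function paginate(lines, { pageHeight = A4_HEIGHT_MM, margin = 20, lineHeight = 5 } = {}) {
  const pages = [[]];
  let y = margin;
  for (const line of lines) {
    if (y + lineHeight > pageHeight - margin) {
      pages.push([]);
      y = margin; // restart at the top margin of the new page
    }
    pages[pages.length - 1].push({ text: line, y });
    y += lineHeight;
  }
  return pages;
}

const demo = paginate(Array.from({ length: 120 }, (_, i) => `line ${i + 1}`));
console.log(demo.map((page) => page.length));
```

In a jsPDF-based implementation, each entry would correspond to a doc.text(text, margin, y) call and each new page to doc.addPage(), with the input lines coming from splitTextToSize().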

Preview System and Truncation

Before conversion, the preview system shows a summary of the source content so you can verify the file was parsed correctly. Each format has a tailored preview strategy: TXT and Markdown files show the first 5,000 characters of raw content; HTML files have tags stripped and whitespace collapsed before showing 5,000 characters; EPUB files combine stripped text from the first 10 HTML content files in spine order; PDF files extract text from the first 5 pages via pdf.js; FB2 files extract the <body> element and strip all tags; and RTF files run through the control sequence stripper. All previews are capped at 5,000 characters, with a [... truncated] indicator appended when content is cut. PDF previews additionally note [... showing first N pages] to clarify the scope.
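
The truncation rule and the page-limited PDF preview can be sketched as below. Assumptions are labeled in the comments: truncatePreview, htmlPreview, and pdfPreview are hypothetical names, the indicator is assumed to be appended after the 5,000-character cap, and pdf is expected to behave like a pdf.js document object (numPages, getPage(), and getTextContent() returning items with a str field).

```javascript
const PREVIEW_LIMIT = 5000;

// Cap text at the limit and mark the cut (assumed: indicator added after the cap).
function truncatePreview(text, limit = PREVIEW_LIMIT) {
  if (text.length <= limit) return text;
  return text.slice(0, limit) + "\n[... truncated]";
}

// HTML preview: strip tags, collapse whitespace, then truncate.
function htmlPreview(html) {
  const text = html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
  return truncatePreview(text);
}

// PDF preview: read at most maxPages pages from a pdf.js-style document.
async function pdfPreview(pdf, maxPages = 5) {
  const pageCount = Math.min(pdf.numPages, maxPages);
  const parts = [];
  for (let n = 1; n <= pageCount; n++) { // pdf.js page numbers start at 1
    const page = await pdf.getPage(n);
    const content = await page.getTextContent();
    parts.push(content.items.map((item) => item.str ?? "").join(" "));
  }
  let text = truncatePreview(parts.join("\n\n"));
  if (pdf.numPages > pageCount) text += `\n[... showing first ${pageCount} pages]`;
  return text;
}
```

Reading only the first few pages keeps the preview fast for long documents, since pages beyond the limit are never loaded.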

Conversion Routing: 24 Paths

The converter supports 24 distinct conversion paths across its 7 input and 4 output formats. EPUB converts to PDF, TXT, and HTML. PDF converts to TXT, HTML, and EPUB. TXT converts to PDF, EPUB, and HTML. HTML converts to PDF, EPUB, and TXT. FB2, RTF, and Markdown are the most flexible inputs, each with all 4 output options: EPUB, PDF, TXT, and HTML. That gives 3 + 3 + 3 + 3 + 4 + 4 + 4 = 24 paths. Each path uses the appropriate extraction method for the source format and the appropriate generation method for the target format, with the content flowing through an intermediate text or HTML representation.
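
The routing described above amounts to a lookup table from input format to allowed outputs. The sketch below encodes exactly the pairs enumerated in the preceding paragraph; ROUTES, outputsFor, canConvert, and countPaths are hypothetical names, not the tool's own.

```javascript
// Allowed outputs per input, as enumerated in the paragraph above.
const ROUTES = new Map([
  ["EPUB", ["PDF", "TXT", "HTML"]],
  ["PDF", ["TXT", "HTML", "EPUB"]],
  ["TXT", ["PDF", "EPUB", "HTML"]],
  ["HTML", ["PDF", "EPUB", "TXT"]],
  ["FB2", ["EPUB", "PDF", "TXT", "HTML"]],
  ["RTF", ["TXT", "PDF", "HTML", "EPUB"]],
  ["Markdown", ["HTML", "PDF", "EPUB", "TXT"]],
]);

// Outputs to show in the dropdown for a detected input format.
function outputsFor(input) {
  return ROUTES.get(input) ?? [];
}

// True when the input/output pair is one of the supported paths.
function canConvert(input, output) {
  return outputsFor(input).includes(output);
}

// Tally every supported input/output pair.
function countPaths() {
  let total = 0;
  for (const outputs of ROUTES.values()) total += outputs.length;
  return total;
}

console.log(outputsFor("EPUB"), canConvert("EPUB", "EPUB"), countPaths());
```

A table like this also drives the output dropdown: same-format conversions never appear because they are simply absent from each input's list.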

Frequently Asked Questions

What ebook formats are supported?
The converter accepts 7 input formats: EPUB, PDF, TXT, HTML, FB2 (FictionBook 2), RTF (Rich Text Format), and Markdown (.md). It outputs 4 formats: EPUB 3.0, PDF, TXT, and HTML. This gives you 24 total conversion paths, since a format is never converted to itself. The input format is auto-detected from the file extension, and only compatible output formats are shown in the dropdown for each input type.
What is EPUB 3.0 and how is it built?
EPUB 3.0 belongs to the EPUB 3 family, the current standard for reflowable ebook packaging, which is now maintained by the W3C. An EPUB file is a ZIP container with a specific structure: a mimetype file (uncompressed, exactly application/epub+zip), a META-INF/container.xml pointing to the Package Document, and an OPF file containing metadata, a manifest (listing all files), and a spine (defining reading order). The converter builds this structure from scratch using JSZip, generating a unique identifier with ebook-converter-{timestamp}, XHTML 1.1 content documents, and a <nav epub:type="toc"> navigation document as required by the EPUB 3.0 specification.
What is FB2 (FictionBook 2)?
FB2 is an XML-based ebook format created by Dmitry Gribov in the early 2000s. It became the dominant ebook format in Russia and Eastern Europe because it uses semantic tags that describe content meaning rather than visual appearance. For example, <title> marks a title, <section> marks a logical chapter, and <emphasis> marks emphasized text. The converter parses these semantic tags and converts them to their HTML equivalents, producing well-structured output regardless of which format you convert to.
Does it preserve formatting when converting?
Basic formatting is preserved: headings, paragraphs, bold text, italic text, and document structure (chapters, sections). However, complex visual layouts, embedded images, custom fonts, CSS styling, tables, and footnotes are not carried through conversion. The converter focuses on textual content fidelity -- your words and their basic structure will be accurate, but the visual presentation will use the output format's default styling (Georgia serif for HTML output, Helvetica for PDF).
How does the EPUB spine work?
The spine is the mechanism that defines reading order in an EPUB. Inside the OPF Package Document, the <manifest> section lists every file in the EPUB with a unique id and its file path (href). The <spine> section then contains a sequence of <itemref idref="..."> elements that reference manifest items by their id. The order of these itemref elements determines the order a reader displays chapters. When the converter parses an EPUB, it reads the spine to extract chapters in the correct sequence rather than just reading files alphabetically.
Can it handle DRM-protected ebooks?
No. DRM (Digital Rights Management) protected ebooks cannot be converted by this tool. DRM-protected EPUB files are encrypted and require authorization keys from the DRM provider to decrypt. The converter does not attempt to circumvent, bypass, or remove DRM protection. Circumventing DRM on copyrighted ebooks can be illegal under the DMCA in the United States and comparable laws in many other countries. This tool is designed for DRM-free ebooks that you own or have the right to convert.
What Markdown features are supported?
The custom Markdown renderer supports: headings (#, ##, ### for h1-h3), bold text (**bold**), italic text (*italic*), inline code (`code`), unordered lists (lines starting with -), blockquotes (lines starting with >), horizontal rules (---), and paragraph wrapping on double newlines. It does not support links ([text](url)), images (![alt](src)), tables, ordered lists, or nested formatting. The engine is a custom regex-based renderer -- not a full Markdown parser like marked.js or CommonMark.
How large can ebook files be?
The converter accepts files up to 50MB, though practical limits depend on your browser's available memory. Most ebooks are well under 5MB -- a typical novel in EPUB format is 300KB-2MB, and even image-heavy ebooks rarely exceed 20MB. PDF files with scanned pages can be larger, but text extraction via pdf.js is memory-efficient since it processes one page at a time. If you encounter issues with very large files, try closing other browser tabs to free memory.
Can I convert PDF to EPUB?
Yes -- PDF to EPUB is a supported conversion path. The converter uses pdf.js to extract text content from each page of the PDF, then wraps the combined text in a valid EPUB 3.0 package with proper OPF manifest, spine, and navigation. However, be aware that PDF is a fixed-layout format while EPUB is reflowable, so complex layouts (multi-column text, tables, sidebars, headers/footers) will be simplified to a linear text flow. The conversion works best with text-heavy PDFs like novels and reports. Scanned PDFs (image-based) will yield little or no text since pdf.js extracts embedded text, not OCR.
Is my ebook data safe?
Absolutely. All processing happens entirely in your browser using client-side JavaScript libraries: JSZip for EPUB packaging, jsPDF for PDF generation, pdf.js for PDF text extraction, and FileSaver for triggering downloads. No ebook content, metadata, filenames, or any other data is ever uploaded to any server. The file never leaves your device -- it is read from your local filesystem by the browser, processed in memory, and the converted output is saved back to your device. There are no analytics on file content, no server-side processing, and no network requests during conversion.

Privacy & Security

Your Ebooks Never Leave Your Device

Your ebooks are processed entirely in your browser using JSZip for EPUB packaging, jsPDF for PDF generation, and pdf.js for text extraction. No book content, metadata, or file data is ever uploaded to any server. The preview, conversion, and download all happen locally -- your reading material remains completely private. Whether you are converting a personal manuscript, a purchased ebook, or a document archive, every byte stays on your machine from start to finish. The four JavaScript libraries (JSZip, jsPDF, pdf.js, FileSaver) all execute client-side with zero network communication during the conversion process.


Milan Salvi

Founder, Leena Software Solutions

Milan is the founder of ZeroDataUpload and Leena Software Solutions, building privacy-first browser tools that process everything client-side.

Last Updated: March 26, 2026