PDF Text Extractor Chrome Extension

A Chrome extension that allows you to extract text from PDF files and paste it directly into text fields on enabled websites.

Features

  • Extract text from PDF files with drag-and-drop functionality
  • Automatically adds upload buttons next to text inputs and textareas
  • Configure which websites the extension works on
  • Simple and intuitive popup interface for site management
  • Supports both main domains and subdomains
  • Works offline - no server required
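As an illustration of the domain and subdomain support listed above, the check could look roughly like the sketch below. The function name `isSiteEnabled` and the shape of `enabledSites` (an array of bare domains) are assumptions made for this example, not necessarily what the extension's code uses.

```javascript
// Hypothetical helper: returns true when `hostname` is one of the enabled
// domains or a subdomain of one. `enabledSites` is assumed to be an array
// of bare domains such as "example.com".
function isSiteEnabled(hostname, enabledSites) {
  const host = hostname.toLowerCase();
  return enabledSites.some((site) => {
    const domain = site.toLowerCase();
    // Match the domain itself, or a subdomain on a label boundary, so that
    // "docs.example.com" matches "example.com" but "notexample.com" does not.
    return host === domain || host.endsWith('.' + domain);
  });
}
```

In a content script this would typically be called with `location.hostname` and the stored site list.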

Installation

  1. Clone this repository or download the ZIP file:
     git clone https://github.com/yourusername/pdf-text-extractor.git
  2. Open Chrome and navigate to chrome://extensions/
  3. Enable "Developer mode" in the top right corner
  4. Click "Load unpacked" and select the extension directory

Directory Structure

pdf-text-extractor/
├── manifest.json
├── popup.html
├── popup.js
├── content-loader.js
├── content.js
├── build/
│   ├── pdf.mjs
│   └── pdf.worker.mjs
└── web/
    ├── cmaps/
    └── standard_fonts/

Usage

  1. Click the extension icon in your Chrome toolbar
  2. Add websites where you want the extension to work:
    • Type a domain manually (e.g., "example.com")
    • Or click "Add Current Site" to add the current website
  3. Visit an enabled website
  4. Look for the upload button next to text inputs
  5. Click the upload button and select a PDF file
  6. The extracted text will be automatically inserted into the input field
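The last step, inserting the extracted text into the field, might be sketched as follows. `insertExtractedText` is a hypothetical name used only for this example; the extension's actual implementation may differ.

```javascript
// Hypothetical helper: writes `text` into a text input or textarea and
// notifies the page that the value changed.
function insertExtractedText(field, text) {
  field.value = text;
  // Many pages react to "input" events rather than polling the value,
  // so dispatch one that bubbles the way a user-typed change would.
  field.dispatchEvent(new Event('input', { bubbles: true }));
}
```

Pages built with frameworks that track input values internally may need additional handling; this sketch covers only plain inputs and textareas.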

Configuration

Adding Sites

  1. Click the extension icon
  2. Enter the domain name in the input field
  3. Click "Add Site" or press Enter

Removing Sites

  1. Click the extension icon
  2. Find the site in the list
  3. Click the × button next to the site name
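The add and remove steps above amount to maintaining a list of domains. A minimal sketch, assuming the list is a plain array of hostnames and that user input is normalized before it is stored, is shown below. The names `normalizeDomain`, `addSite`, and `removeSite` are hypothetical, and stripping a leading "www." is a design choice of this example rather than documented behavior.

```javascript
// Hypothetical helper: reduces input such as "https://www.Example.com/path"
// to a bare hostname ("example.com"). Returns null for unusable input.
function normalizeDomain(input) {
  const trimmed = input.trim().toLowerCase();
  if (!trimmed) return null;
  try {
    // Prepend a scheme when missing so the URL parser accepts bare domains.
    const url = new URL(trimmed.includes('://') ? trimmed : 'https://' + trimmed);
    return url.hostname.replace(/^www\./, '');
  } catch {
    return null; // not parseable as a host
  }
}

// Returns a new list with the normalized domain appended, or the original
// list when the input is invalid or already present.
function addSite(sites, input) {
  const domain = normalizeDomain(input);
  if (!domain || sites.includes(domain)) return sites;
  return [...sites, domain];
}

// Returns a new list without `domain`.
function removeSite(sites, domain) {
  return sites.filter((site) => site !== domain);
}
```

In the popup, the resulting array could then be persisted with `chrome.storage.sync.set`, assuming the `storage` permission is declared in manifest.json.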

Development

Prerequisites

  • Chrome browser
  • Basic understanding of JavaScript and Chrome extensions

Local Development

  1. Make changes to the code
  2. Go to chrome://extensions/
  3. Click the refresh icon on your extension card, then reload any open tabs so the updated content scripts are injected
  4. Test your changes

Files

  • manifest.json: Extension configuration
  • popup.html/js: Site management interface
  • content-loader.js: Initializes the extension on web pages
  • content.js: Main PDF processing functionality
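For orientation, the core of the PDF processing in content.js can be pictured as the sketch below, which uses the PDF.js calls `getDocument`, `getPage`, and `getTextContent`. The function name `extractPdfText` and the way `pdfjsLib` is passed in are assumptions made for this example; the real content.js may be organized differently.

```javascript
// Sketch: extract plain text from a PDF with PDF.js.
// `pdfjsLib` is assumed to be the module loaded from build/pdf.mjs, and
// `data` an ArrayBuffer or Uint8Array holding the file contents.
async function extractPdfText(pdfjsLib, data) {
  const pdf = await pdfjsLib.getDocument({ data }).promise;
  const pages = [];
  // PDF.js page numbers start at 1.
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    const content = await page.getTextContent();
    // Items with a `str` property are runs of text; other items
    // (such as marked-content markers) carry no text and are skipped.
    const pageText = content.items
      .filter((item) => typeof item.str === 'string')
      .map((item) => item.str)
      .join(' ');
    pages.push(pageText);
  }
  // Separate pages with a blank line.
  return pages.join('\n\n');
}
```

Joining text runs with a single space is a simplification and will not preserve the original layout. Before calling `getDocument`, PDF.js also needs `pdfjsLib.GlobalWorkerOptions.workerSrc` to point at the worker script, which in this directory layout would presumably be build/pdf.worker.mjs.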

Dependencies

  • PDF.js: Mozilla's PDF rendering engine
  • No external services or APIs required

Browser Support

  • Chrome (Version 88+)
  • Other Chromium-based browsers (Edge, Brave, etc.)

Known Limitations

  • Only works with text-based PDFs
  • Some PDFs with complex layouts might not extract perfectly
  • Maximum file size depends on available memory

Troubleshooting

Extension Not Working?

  1. Check if the current site is in the enabled list
  2. Verify the PDF file is readable and not corrupted
  3. Try refreshing the page
  4. Check the browser's DevTools console for error messages
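One quick way to check whether a file is a PDF at all is to look at its first bytes, since PDF files normally begin with the signature "%PDF-". The sketch below is an illustrative check, not part of the extension; `looksLikePdf` is a hypothetical name.

```javascript
// Hypothetical helper: returns true when `bytes` (a Uint8Array) starts
// with the PDF signature "%PDF-".
function looksLikePdf(bytes) {
  const signature = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"
  if (bytes.length < signature.length) return false;
  return signature.every((byte, i) => bytes[i] === byte);
}
```

A file that passes this check can still be damaged further in, so it only rules out files that are clearly not PDFs.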

Common Issues

  • PDF Not Loading: Ensure the file is a valid PDF
  • No Upload Button: Refresh the page or check that the site is enabled
  • Extraction Failed: The PDF might be image-based or protected
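The issues above could be turned into user-facing messages along the lines of the following sketch. It assumes that PDF.js reports protected and malformed files through errors named `PasswordException` and `InvalidPDFException`, and it treats an empty extraction result as a hint that the PDF is image-based. The function name `describeExtractionFailure` and the message wording are hypothetical.

```javascript
// Hypothetical helper: maps an extraction outcome to a message.
// Pass the caught error (or null) and the extracted text (or '').
// Returns null when extraction produced usable text.
function describeExtractionFailure(error, extractedText) {
  if (error) {
    // Error names as used by PDF.js (an assumption; check your version).
    if (error.name === 'PasswordException') {
      return 'The PDF is password-protected.';
    }
    if (error.name === 'InvalidPDFException') {
      return 'The file is not a valid PDF or is corrupted.';
    }
    return 'Extraction failed: ' + error.message;
  }
  if (!extractedText || extractedText.trim() === '') {
    // No text layer usually means scanned pages stored as images.
    return 'No text found; the PDF may be image-based.';
  }
  return null;
}
```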

Contributing

  1. Fork the repository
  2. Create a feature branch:
     git checkout -b feature/AmazingFeature
  3. Commit your changes:
     git commit -m 'Add some AmazingFeature'
  4. Push to the branch:
     git push origin feature/AmazingFeature
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • PDF.js by Mozilla
  • Chrome Extensions documentation

Support

For support, please open an issue in the GitHub repository.