Request #43175849
17.04.2025
We are seeking a skilled and motivated developer to create a software solution that converts Indian voter list PDFs (in Hindi, English, and local languages) into Excel format, including PDFs whose content is embedded as images. The solution should automatically extract specific voter data and write it to a standardized Excel layout.
Project Requirements:
Data Extraction: Develop a system to extract the following fields from voter list PDFs provided by the Election Commission: Name, Father's Name, Age, Gender, Voter ID, and Serial Number.
Output Format: Extracted data should be organized in a standard Excel format (no specific formatting required).
Automation: Create a reusable program or script that automates the PDF-to-Excel conversion process for future voter list PDFs.
Language Support: The solution must handle voter lists in Hindi, English, and regional Indian languages.
Image Handling: The software should process and extract relevant information from images embedded in the PDFs.
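To make the extraction requirement concrete, here is a minimal Python sketch of the parsing step only: it takes text that has already been pulled out of a PDF (by OCR or a PDF parser) and maps one voter entry to the six requested fields. Everything about the layout is an assumption for illustration: the English labels, the serial number and Voter ID sharing the first line, and the Voter ID pattern of three letters followed by seven digits. The names `VoterRecord` and `parse_voter_block` are hypothetical. Real rolls vary by state and language, so Hindi and regional-language labels would need their own patterns.

```python
import re
from dataclasses import dataclass, asdict
from typing import Optional

# Assumption: a Voter ID looks like three capital letters followed by seven digits.
EPIC_RE = re.compile(r"\b[A-Z]{3}\d{7}\b")
# Assumption: the entry starts with its serial number.
SERIAL_RE = re.compile(r"\s*(\d+)\b")
# Hypothetical English labels; other languages need additional patterns.
NAME_RE = re.compile(r"^Name[ \t]*[:\-][ \t]*(.+)$", re.MULTILINE)
FATHER_RE = re.compile(r"^Father[’']?s Name[ \t]*[:\-][ \t]*(.+)$", re.MULTILINE)
AGE_RE = re.compile(r"\bAge[ \t]*[:\-][ \t]*(\d{1,3})")
GENDER_RE = re.compile(r"\bGender[ \t]*[:\-][ \t]*(Male|Female|Other)", re.IGNORECASE)


@dataclass
class VoterRecord:
    serial_number: Optional[int]
    voter_id: Optional[str]
    name: Optional[str]
    fathers_name: Optional[str]
    age: Optional[int]
    gender: Optional[str]


def _first_group(pattern: re.Pattern, text: str) -> Optional[str]:
    """Return the first captured group, stripped, or None if there is no match."""
    match = pattern.search(text)
    return match.group(1).strip() if match else None


def parse_voter_block(block: str) -> VoterRecord:
    """Parse one voter entry; fields that cannot be found are left as None."""
    serial = SERIAL_RE.match(block)
    epic = EPIC_RE.search(block)
    age = _first_group(AGE_RE, block)
    gender = _first_group(GENDER_RE, block)
    return VoterRecord(
        serial_number=int(serial.group(1)) if serial else None,
        voter_id=epic.group(0) if epic else None,
        name=_first_group(NAME_RE, block),
        fathers_name=_first_group(FATHER_RE, block),
        age=int(age) if age else None,
        gender=gender.capitalize() if gender else None,
    )


# Made-up sample entry, not taken from any real voter list.
SAMPLE_BLOCK = """12 ABC1234567
Name : Ramesh Kumar
Father's Name : Suresh Kumar
House Number : 45
Age : 34 Gender : Male"""

if __name__ == "__main__":
    record = parse_voter_block(SAMPLE_BLOCK)
    print(asdict(record))
```

Each `asdict(record)` result is one row; a list of such rows could then be written to a worksheet with a library such as openpyxl or pandas. Leaving unmatched fields as `None` rather than raising keeps a single badly recognized entry from stopping a whole batch, and makes incomplete rows easy to flag for manual review.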
Ideal Freelancer:
Advanced expertise in data extraction techniques (e.g., OCR, PDF parsing).
Proficiency in programming languages such as Python, Java, or similar, with experience in libraries like PyPDF2, pdfplumber, Tesseract OCR, or equivalent.
Familiarity with Election Commission data formats or similar structured datasets.
Experience developing automated systems for PDF-to-Excel conversion.
Knowledge of handling multilingual text (Hindi, English, and regional languages) and image processing.
Responsibilities:
Design and develop a robust program or script for extracting voter list data from PDFs.
Ensure the solution is scalable and reusable for ongoing voter list conversions.
Test the software across various voter list PDF samples to ensure accuracy and reliability.
Provide documentation and basic instructions for using the software.
Nice-to-Have:
Experience with Excel automation (e.g., using openpyxl or pandas in Python).
Familiarity with Indian voter list formats and Election Commission guidelines.
Knowledge of cloud-based solutions for processing large volumes of PDFs.
If you are passionate about data extraction and automation and have the skills to build a reliable solution for converting voter list PDFs to Excel, we’d love to hear from you!