auto-fill-form-selenium: June 2025

Tuesday, June 3, 2025

#6 Form Filling Multimodal

✅ Here's your fully upgraded multimodal form-filling agent:

🔗 Download form_filler_agent_final.zip

https://drive.google.com/file/d/1G4CYh81p9GYRPEfQYmgJ_SCw317isYUL/view?usp=sharing

🚀 New Features

| Feature | Description |
| --- | --- |
| ✅ Chrome GUI | Uses the full Chrome browser (non-headless) |
| ✅ Form Auto-Submission | Detects and clicks submit buttons automatically |
| ✅ Multi-field Label Mapping | Enhanced matching using `<label for="id">` and field type |
| ✅ Logging Interface | Displays every action and decision after the form is filled |
| ✅ DOCX, PDF, Text Support | Upload files in various formats; their content is extracted automatically |
| ✅ Dockerized | Portable, containerized app that is easy to scale |
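
The label-mapping row is easier to picture with a small example. The code inside the zip is not reproduced in this post, so the following is only a rough sketch of the idea, using Python's standard `html.parser` on static HTML rather than a live Selenium page. The names `LabelMapper`, `map_labels`, and `SAMPLE_HTML` are made up for illustration.

```python
from html.parser import HTMLParser


class LabelMapper(HTMLParser):
    """Collect label text and field types from static HTML (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.labels = {}        # target id -> visible label text
        self.input_types = {}   # field id -> field type
        self._current_for = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label" and attrs.get("for"):
            # Start collecting the text of a <label for="..."> element
            self._current_for = attrs["for"]
            self._buffer = []
        elif tag in ("input", "textarea", "select") and attrs.get("id"):
            # <input> defaults to type="text"; other tags use their tag name
            default = "text" if tag == "input" else tag
            self.input_types[attrs["id"]] = attrs.get("type", default)

    def handle_data(self, data):
        if self._current_for is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "label" and self._current_for is not None:
            text = " ".join("".join(self._buffer).split())
            self.labels[self._current_for] = text
            self._current_for = None


def map_labels(html):
    """Return {field id: (label text, field type)} for labelled fields."""
    parser = LabelMapper()
    parser.feed(html)
    parser.close()
    return {
        field_id: (parser.labels[field_id], field_type)
        for field_id, field_type in parser.input_types.items()
        if field_id in parser.labels
    }


SAMPLE_HTML = """
<form>
  <label for="full_name">Full name</label>
  <input id="full_name" type="text">
  <label for="dob">Date of birth</label>
  <input id="dob" type="date">
  <input id="untitled" type="text">
</form>
"""

mapping = map_labels(SAMPLE_HTML)
print(mapping)
```

Passing the human-readable label text and the field type to the model, instead of only the raw `name` or `id`, gives it more context for deciding which value belongs in which field.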

🧪 To Use

Start Ollama locally with lemma:
```
ollama run lemma
```
Build Docker image:
```
docker build -t form-filler-gui .
```

Run the container with Chrome GUI:
```
docker run -p 5000:5000 --add-host=host.docker.internal:host-gateway form-filler-gui
```

Open http://localhost:5000 in your browser.

Let me know if you want:

Parallel batch form processing
Cloud-based autoscaling (e.g. using FastAPI + Celery + Redis)
Form filling across authenticated sessions
Exportable logs and filled reports

Ready to help you scale it!

#5 Form Filling Up to Dockerization

🚀 Your enhanced, Dockerized form-filling agent is ready!

👉 Download form_filler_agent_dockerized.zip

https://drive.google.com/file/d/1nOp0s_RCeyf6jHKO5-O3QnmqHEEDybJs/view?usp=sharing

✅ New Features Included

DOCX support (via python-docx)
Enhanced field mapping: Labels associated with inputs are matched more accurately.
Dockerized: Runs in an isolated container with all dependencies.
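
To give a feel for how DOCX support can sit next to PDF and plain-text input, here is a minimal sketch that picks an extractor by file extension. It is not the code from the zip. The function name `extract_text` and the set of accepted extensions are assumptions. `pdfplumber` and `python-docx` are imported lazily, so the plain-text path runs even when they are not installed.

```python
import tempfile
from pathlib import Path


def extract_text(path):
    """Return plain text from a .pdf, .docx, or text file, chosen by extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        import pdfplumber  # third-party, imported only when needed
        with pdfplumber.open(str(path)) as pdf:
            # extract_text() can return None for pages without a text layer
            return "\n".join(page.extract_text() or "" for page in pdf.pages)
    if suffix == ".docx":
        from docx import Document  # provided by the python-docx package
        return "\n".join(p.text for p in Document(str(path)).paragraphs)
    if suffix in (".txt", ".md"):
        return Path(path).read_text(encoding="utf-8")
    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")


# Usage with a throwaway text file
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "info.txt"
    sample.write_text("Name: Ramesh Kumar\n", encoding="utf-8")
    text = extract_text(sample)

print(text)
```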

🧪 To Run in Docker

Build the image:
```
docker build -t form-filler-agent .
```

Run the container:

```
docker run -p 5000:5000 --add-host=host.docker.internal:host-gateway form-filler-agent
```

Open http://localhost:5000

📦 Requirements (Outside Docker)

Ollama must be running locally:
```
ollama run lemma
```
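
The `--add-host=host.docker.internal:host-gateway` flag in the run command matters because, inside a container, `localhost` refers to the container itself and not to the machine running Ollama. The sketch below shows one way the agent could choose its Ollama endpoint. The environment variable name `OLLAMA_BASE_URL` and the helper `ollama_generate_url` are assumptions for illustration, not something the zip is known to use. Port 11434 and the `/api/generate` path match the `agent.py` listing in post #3.

```python
import os

# Default that works from inside the container when the host runs Ollama
DEFAULT_BASE_URL = "http://host.docker.internal:11434"


def ollama_generate_url(env=None):
    """Build the generate endpoint from OLLAMA_BASE_URL (hypothetical variable)."""
    env = os.environ if env is None else env
    base = env.get("OLLAMA_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return f"{base}/api/generate"


print(ollama_generate_url({}))
print(ollama_generate_url({"OLLAMA_BASE_URL": "http://localhost:11434/"}))
```

With something like this in place, the same image could run outside Docker by passing `-e OLLAMA_BASE_URL=http://localhost:11434` or by setting the variable in the shell. On Linux, Ollama may also need to listen on a non-loopback address (its `OLLAMA_HOST` setting) before the container can reach it.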

Let me know if you’d like:

Chrome GUI (non-headless)
Submission automation
Multi-form processing or logging

I can help you keep scaling this!

#4 Form Filling Agent with Flask Frontend

✅ Your ready-to-run ZIP file is available here:

Download form_filler_agent.zip

https://drive.google.com/file/d/110iEVgiA1Jyijr95qiZ3N9yqWjKqv1BL/view?usp=sharing

📦 What's Inside:

app.py – Flask frontend for uploading PDFs or text files and entering a form URL.
agent.py – Main logic for:
- Extracting text from PDFs.
- Calling the Ollama model (lemma) to generate field values.
- Filling out the web form using Selenium.
templates/index.html – Upload form page.

🛠 Requirements

Install dependencies with:

```
pip install flask selenium pdfplumber requests
```

Make sure:

ChromeDriver is installed and accessible in PATH.
Ollama is running the lemma model locally:
```
ollama run lemma
```

▶️ To Run the App

```
python app.py
```

Then open: http://localhost:5000

Let me know if you want:

Support for DOCX
Enhanced field mapping (e.g. based on labels next to fields)
Dockerized version

I'm happy to help you extend this!

#3 AI Agent to Fill Forms

To build a Python AI agent that fills out web forms from web pages, PDFs, and text sources, you can use:

Ollama for local LLM inference (e.g., LLaMA3.2).
LangChain or custom agents for orchestrating form-filling tasks.
PDF/text parsing via libraries like PyMuPDF, pdfplumber, or python-docx.
Web form filling using Selenium.
Multimodal context support (text + form structure) to extract and align relevant information.

🧠 Objective

Create a Python agent that:

Reads content from PDFs / web / plain text.
Understands and extracts the relevant data.
Matches it to form fields using prompt-based reasoning.
Uses Ollama with a LLaMA3.2-based model (e.g., lemma) to infer the correct values.
Automatically fills a web form using Selenium.

📦 Prerequisites

```
pip install selenium requests flask langchain pdfplumber ollama
```

You’ll also need:

A local Ollama instance with the LLaMA3/lemma model running:
```
ollama run lemma
```
ChromeDriver installed (for Selenium).

🧪 Sample Setup

Here’s a minimal Python setup to create such an agent:

📁 `agent.py`

```python
import json
import time

import pdfplumber
import requests
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_NAME = "lemma"  # or llama3


def extract_text_from_pdf(pdf_path):
    """Concatenate the text of every page in the PDF."""
    text = ""
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            # extract_text() returns None for pages with no text layer
            text += (page.extract_text() or "") + "\n"
    return text


def extract_form_fields(driver):
    """Map each input's name (or id) to its WebElement."""
    fields = {}
    # Inputs with no type attribute default to text, so include them too
    elements = driver.find_elements(
        By.XPATH,
        "//input[not(@type) or @type='text' or @type='email' "
        "or @type='number' or @type='date']",
    )
    for element in elements:
        try:
            label = element.get_attribute("name") or element.get_attribute("id")
            if label:
                fields[label] = element
        except Exception:
            # Skip elements that went stale or cannot be inspected
            pass
    return fields


def ask_ollama_to_fill(form_labels, context_text):
    """Ask the model for a JSON object mapping field names to values."""
    prompt = f"""
You are a helpful assistant. Given the following text:\n\n{context_text}\n\n
Fill out this form with these fields: {list(form_labels)}.
Respond in JSON format with field names and their values.
    """
    response = requests.post(
        OLLAMA_URL,
        json={
            "model": MODEL_NAME,
            "prompt": prompt,
            "format": "json",  # ask Ollama to constrain the reply to valid JSON
            "stream": False,
        },
        timeout=120,
    )
    response.raise_for_status()
    raw = response.json().get("response", "")
    try:
        filled_data = json.loads(raw.strip())
    except json.JSONDecodeError:
        print("Could not parse JSON. Response:", raw)
        return {}
    # The fill loop below expects a dict of field name -> value
    return filled_data if isinstance(filled_data, dict) else {}


def fill_web_form(url, context_text):
    """Open the form, ask the model for values, and type them in."""
    options = Options()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        time.sleep(2)  # crude wait for the page to finish rendering

        form_fields = extract_form_fields(driver)
        field_labels = list(form_fields.keys())
        filled_data = ask_ollama_to_fill(field_labels, context_text)

        for label, value in filled_data.items():
            if label in form_fields:
                form_fields[label].send_keys(str(value))

        print("Form filled. Pausing for review...")
        time.sleep(10)
    finally:
        # Always release the browser, even if a step above fails
        driver.quit()


if __name__ == "__main__":
    # Example source: PDF or text
    source_text = extract_text_from_pdf("sample_resume.pdf")
    # Or use plain text: source_text = open("info.txt").read()

    # Example web form URL (local or online)
    form_url = "http://example.com/form"

    fill_web_form(form_url, source_text)
```

📝 Example Prompt Handling

Input PDF Text:

```
Name: Ramesh Kumar
Email: ramesh@example.com
Phone: 9876543210
DOB: 1995-01-15
```

Form Labels (HTML fields):

```
["name", "email", "phone", "dob"]
```

Ollama Output:

```
{
  "name": "Ramesh Kumar",
  "email": "ramesh@example.com",
  "phone": "9876543210",
  "dob": "1995-01-15"
}
```
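
In practice the model does not always reply with a bare JSON object like the one above. It may add a sentence before or after it, which would send `ask_ollama_to_fill` down its parse-failure branch. As one possible way to soften that, the sketch below tries a strict parse first and then falls back to the outermost `{...}` span. The helper name `parse_model_json` is made up for illustration and is not part of `agent.py`.

```python
import json


def parse_model_json(raw):
    """Best-effort extraction of a JSON object from model output."""
    text = raw.strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span, skipping surrounding prose
        start, end = text.find("{"), text.rfind("}")
        if start == -1 or end <= start:
            return {}
        try:
            parsed = json.loads(text[start:end + 1])
        except json.JSONDecodeError:
            return {}
    # Only a dict is usable as field name -> value
    return parsed if isinstance(parsed, dict) else {}


wrapped = (
    'Here is the form data: {"name": "Ramesh Kumar", "dob": "1995-01-15"} '
    "Let me know if you need more."
)
result = parse_model_json(wrapped)
print(result)
```

Returning an empty dict on failure keeps the fill loop safe: it simply finds no matching fields and leaves the form untouched.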

✅ Enhancements

Add support for textarea, select, and checkbox types.
Use LangChain agents with custom tools that wrap Selenium and PDF loading (for example, PyPDFLoader).
Log/Store filled data as JSON for audit.
Integrate file upload UI using Flask for real-world apps.
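
For the audit-logging idea, one simple option is to append one JSON record per filled form to a JSON Lines file. This is only a sketch under assumptions: the function names, the record layout (`timestamp`, `form_url`, `fields`), and the file name `filled_forms.jsonl` are all invented for illustration.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_audit_record(log_path, form_url, filled_data):
    """Append one JSON line describing a filled form and return the record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "form_url": form_url,
        "fields": filled_data,
    }
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def read_audit_log(log_path):
    """Load every record back from the JSON Lines file."""
    with open(log_path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Usage with a throwaway log file
with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "filled_forms.jsonl"
    append_audit_record(log_file, "http://example.com/form", {"name": "Ramesh Kumar"})
    append_audit_record(log_file, "http://example.com/form", {"email": "ramesh@example.com"})
    records = read_audit_log(log_file)

print(len(records))
```

An append-only file keeps earlier entries intact, and each line can be inspected or parsed on its own.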

Would you like me to provide:

A ready-to-run zip file for this?
A Flask frontend where users can upload PDF/text and see the form filled?

Let me know how you'd like to expand this.
