Tuesday, June 3, 2025

#6 Form Filling Multimodal

 ✅ Here's your fully upgraded multimodal form-filling agent:

🔗 Download form_filler_agent_final.zip

https://drive.google.com/file/d/1G4CYh81p9GYRPEfQYmgJ_SCw317isYUL/view?usp=sharing


🚀 New Features

Feature | Description
Chrome GUI | Uses the full Chrome browser (non-headless)
Form Auto-Submission | Detects and clicks submit buttons automatically
Multi-field Label Mapping | Enhanced matching using <label for="id"> and field type
Logging Interface | Displays every action/decision after the form is filled
DOCX, PDF, Text Support | Upload various formats; content is extracted automatically
Dockerized | Easily scalable, portable containerized app
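
The contents of the ZIP are not shown in this post, so the following is only a minimal sketch of what matching on <label for="id"> and field type could look like. It uses Python's standard html.parser on a page's HTML; the class and function names (LabelMapper, map_labels_to_inputs) and the sample form are hypothetical and not taken from the agent.

```python
from html.parser import HTMLParser


class LabelMapper(HTMLParser):
    """Collect <label for="id"> text and the type of each <input id="...">."""

    def __init__(self):
        super().__init__()
        self.labels = {}       # input id -> visible label text
        self.input_types = {}  # input id -> type attribute (defaults to "text")
        self._current_for = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label" and attrs.get("for"):
            self._current_for = attrs["for"]
            self.labels[self._current_for] = ""
        elif tag == "input" and attrs.get("id"):
            self.input_types[attrs["id"]] = attrs.get("type") or "text"

    def handle_data(self, data):
        # Text seen while inside a <label> belongs to that label.
        if self._current_for is not None:
            self.labels[self._current_for] += data

    def handle_endtag(self, tag):
        if tag == "label":
            self._current_for = None


def map_labels_to_inputs(html):
    """Return {input id: (label text, input type)} for inputs that have a label."""
    parser = LabelMapper()
    parser.feed(html)
    parser.close()
    return {
        field_id: (text.strip(), parser.input_types[field_id])
        for field_id, text in parser.labels.items()
        if field_id in parser.input_types
    }


# Hypothetical sample form, for illustration only.
sample = """
<form>
  <label for="f1">Full name</label> <input id="f1" type="text">
  <label for="f2">Email address</label> <input id="f2" type="email">
  <input id="f3" type="hidden">
</form>
"""
print(map_labels_to_inputs(sample))
```

Passing the human-readable label and the field type to the model, rather than only the name or id attribute, gives it more to work with when a form uses opaque ids.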

🧪 To Use

  1. Start Ollama locally with lemma:

    ollama run lemma
    
  2. Build Docker image:

    docker build -t form-filler-gui .
    
  3. Run container with Chrome GUI:

    docker run -p 5000:5000 --add-host=host.docker.internal:host-gateway form-filler-gui
    
  4. Open http://localhost:5000 in your browser.
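
Inside a container, localhost refers to the container itself, which is why the run command maps host.docker.internal to the host. A minimal sketch of how the app could pick the Ollama endpoint is shown here. The environment variable name OLLAMA_BASE_URL and the helper name are assumptions for illustration; port 11434 and the /api/generate path are the ones used by agent.py in post #3.

```python
import os

# Assumed default: Ollama runs on the Docker host, reached via the --add-host mapping.
DEFAULT_OLLAMA_BASE = "http://host.docker.internal:11434"


def ollama_generate_url(env=None):
    """Build the generate endpoint, allowing an override via OLLAMA_BASE_URL."""
    env = os.environ if env is None else env
    base = env.get("OLLAMA_BASE_URL", DEFAULT_OLLAMA_BASE).rstrip("/")
    return f"{base}/api/generate"


print(ollama_generate_url({}))
```

When running outside Docker, setting the variable to http://localhost:11434 would point the same code at a local Ollama instance.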


Let me know if you want:

  • Parallel batch form processing

  • Cloud-based autoscaling (e.g. using FastAPI + Celery + Redis)

  • Form filling across authenticated sessions

  • Exportable logs and filled reports

Ready to help you scale it!

#5 Form Filling Up to Dockerization

 🚀 Your enhanced, Dockerized form-filling agent is ready!

👉 Download form_filler_agent_dockerized.zip

https://drive.google.com/file/d/1nOp0s_RCeyf6jHKO5-O3QnmqHEEDybJs/view?usp=sharing

✅ New Features Included

  • DOCX support (via python-docx)

  • Enhanced field mapping: Labels associated with inputs are matched more accurately.

  • Dockerized: Runs in an isolated container with all dependencies.


🧪 To Run in Docker

  1. Build the image:

    docker build -t form-filler-agent .
    
  2. Run the container:

    docker run -p 5000:5000 --add-host=host.docker.internal:host-gateway form-filler-agent
    
  3. Open http://localhost:5000


📦 Requirements (Outside Docker)

  • Ollama must be running locally:

    ollama run lemma
    

Let me know if you’d like:

  • Chrome GUI (non-headless)

  • Submission automation

  • Multi-form processing or logging

I can help you keep scaling this!

#4 Form Filling Agent with Flask Frontend

 ✅ Your ready-to-run ZIP file is available here:

Download form_filler_agent.zip

https://drive.google.com/file/d/110iEVgiA1Jyijr95qiZ3N9yqWjKqv1BL/view?usp=sharing

📦 What's Inside:

  • app.py – Flask frontend for uploading PDFs or text files and entering a form URL.

  • agent.py – Main logic for:

    • Extracting text from PDFs.

    • Calling the Ollama model (lemma) to generate field values.

    • Filling out the web form using Selenium.

  • templates/index.html – Upload form page.
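
Since the frontend accepts both PDFs and text files, the upload has to be routed to the right extractor somewhere between app.py and agent.py. The ZIP's code is not shown here, so this is only a sketch of one way to do it, dispatching on the file extension. The names load_source_text and read_text_file are hypothetical.

```python
import tempfile
from pathlib import Path


def read_text_file(path):
    """Extractor for plain-text uploads."""
    return Path(path).read_text(encoding="utf-8")


def load_source_text(path, extractors):
    """Pick an extractor by file extension and return the extracted text."""
    suffix = Path(path).suffix.lower()
    if suffix not in extractors:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    return extractors[suffix](path)


# Demo with a temporary text file; a ".pdf" entry would map to a PDF extractor.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "info.txt"
    sample_path.write_text("Name: Ramesh Kumar", encoding="utf-8")
    print(load_source_text(sample_path, {".txt": read_text_file}))
```

Keeping the extractors in a dictionary means a new format can be added with one more entry and no change to the upload route.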

🛠 Requirements

Install dependencies with:

pip install flask selenium pdfplumber requests

Make sure:

  • ChromeDriver is installed and accessible in PATH.

  • Ollama is running the lemma model locally:

    ollama run lemma
    

▶️ To Run the App

python app.py

Then open: http://localhost:5000

Let me know if you want:

  • Support for DOCX

  • Enhanced field mapping (e.g. based on labels next to fields)

  • Dockerized version

I'm happy to help you extend this!

#3 AI agent to fill Form

 To create an AI agent that can fill out forms using web pages, PDFs, and text sources using Python, you can use:

  • Ollama for local LLM inference (e.g., LLaMA3.2).

  • LangChain or custom agents for orchestrating form-filling tasks.

  • PDF/text parsing via libraries like PyMuPDF, pdfplumber, or python-docx.

  • Web form filling using Selenium.

  • Multimodal context support (text + form structure) to extract and align relevant information.


🧠 Objective

Create a Python agent that:

  1. Reads content from PDFs / web / plain text.

  2. Understands and extracts the relevant data.

  3. Matches it to form fields using prompt-based reasoning.

  4. Uses Ollama with a LLaMA3.2-based model (e.g., lemma) to infer the correct values.

  5. Automatically fills a web form using Selenium.


📦 Prerequisites

pip install selenium requests flask langchain pdfplumber ollama

You’ll also need:

  • A local Ollama instance with LLaMA3/lemma model running:

    ollama run lemma
    
  • ChromeDriver installed (for Selenium).


🧪 Sample Setup

Here’s a minimal Python setup to create such an agent:

agent.py

import json
import pdfplumber
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
import time
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_NAME = "lemma"  # or llama3


def extract_text_from_pdf(pdf_path):
    text = ""
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            # extract_text() returns None for pages without a text layer.
            text += (page.extract_text() or "") + "\n"
    return text


def extract_form_fields(driver):
    fields = {}
    elements = driver.find_elements(By.XPATH, "//input[@type='text' or @type='email' or @type='number' or @type='date']")
    for element in elements:
        try:
            label = element.get_attribute("name") or element.get_attribute("id")
            if label:
                fields[label] = element
        except Exception:
            pass
    return fields


def ask_ollama_to_fill(form_labels, context_text):
    prompt = f"""
You are a helpful assistant. Given the following text:\n\n{context_text}\n\n
Fill out this form with these fields: {list(form_labels)}.
Respond in JSON format with field names and their values.
    """
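    response = requests.post(OLLAMA_URL, json={
        "model": MODEL_NAME,
        "prompt": prompt,
        "stream": False
    }, timeout=120)
    response.raise_for_status()  # surface HTTP errors instead of failing later
    raw = response.json().get("response", "")
    try:
        filled_data = json.loads(raw.strip())
    except json.JSONDecodeError:
        print("Could not parse JSON. Response:", raw)
        return {}
    # The fill loop expects a dict of field name -> value.
    return filled_data if isinstance(filled_data, dict) else {}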


def fill_web_form(url, context_text):
    options = Options()
    # Headless mode shows no window; remove this line to watch the form being filled.
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        time.sleep(2)  # crude wait for the page to render

        form_fields = extract_form_fields(driver)
        field_labels = list(form_fields.keys())
        filled_data = ask_ollama_to_fill(field_labels, context_text)

        for label, value in filled_data.items():
            if label in form_fields:
                form_fields[label].send_keys(str(value))

        print("Form filled. Pausing for review...")
        time.sleep(10)
    finally:
        # Always release the browser, even if filling fails.
        driver.quit()


if __name__ == "__main__":
    # Example source: PDF or text
    source_text = extract_text_from_pdf("sample_resume.pdf")
    # Or use plain text: source_text = open("info.txt").read()

    # Example web form URL (local or online)
    form_url = "http://example.com/form"

    fill_web_form(form_url, source_text)

Example Prompt Handling

Input PDF Text:

Name: Ramesh Kumar
Email: ramesh@example.com
Phone: 9876543210
DOB: 1995-01-15

Form Labels (HTML fields):

["name", "email", "phone", "dob"]

Ollama Output:

{
  "name": "Ramesh Kumar",
  "email": "ramesh@example.com",
  "phone": "9876543210",
  "dob": "1995-01-15"
}
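
The exchange above shows an ideal reply. In practice a local model may wrap the JSON in a sentence or add keys that are not on the form, so a more forgiving parser can help. This is a sketch under that assumption; the function name parse_field_values is hypothetical and is not part of agent.py.

```python
import json


def parse_field_values(raw_response, allowed_fields):
    """Extract the first JSON object from a model reply and keep known fields."""
    start = raw_response.find("{")
    if start == -1:
        return {}
    try:
        # raw_decode parses one JSON value and ignores any trailing text.
        parsed, _ = json.JSONDecoder().raw_decode(raw_response[start:])
    except json.JSONDecodeError:
        return {}
    return {
        key: str(value)
        for key, value in parsed.items()
        if key in allowed_fields and value is not None
    }


reply = (
    'Here is the result: {"name": "Ramesh Kumar", '
    '"email": "ramesh@example.com", "age": 30} Hope this helps.'
)
print(parse_field_values(reply, ["name", "email", "phone", "dob"]))
```

The sketch gives up and returns an empty dict if the surrounding prose contains a stray opening brace before the real JSON, which keeps the failure mode the same as in agent.py.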

✅ Enhancements

  • Add support for textarea, select, and checkbox types.

  • Use LangChain Agents with tools like SeleniumTool, PDFLoader, etc.

  • Log/Store filled data as JSON for audit.

  • Integrate file upload UI using Flask for real-world apps.
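
For the audit idea in the list above, one simple option is to append one JSON record per filled form to a log file. The sketch below assumes a JSON Lines layout; the function name, record keys, and file name are hypothetical.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_audit_record(log_path, form_url, filled_data):
    """Append one JSON line describing what was typed into which form."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "form_url": form_url,
        "fields": filled_data,
    }
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


# Demo: write to a temporary file and read the record back.
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "audit.jsonl"
    append_audit_record(log_path, "http://example.com/form", {"name": "Ramesh Kumar"})
    print(log_path.read_text(encoding="utf-8"))
```

In fill_web_form, such a call would go right after the loop that sends the values, using the same filled_data dict.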


Would you like me to provide:

  • A ready-to-run zip file for this?

  • A Flask frontend where users can upload PDF/text and see the form filled?

Let me know how you'd like to expand this.

Thursday, March 27, 2025

#2 DevOps for Beginners - Hands-on

 

DevOps Is a Firecracker! 💥🚀

aurmc2024@gmail.com


Step 1: Go See Big Brother Git! 🎩

👉 First, install Git; without it you can't track your code! 😱 Download it from the Git website.

Check in CMD:

git --version

Output:

git version 2.49.0.windows.1

If you don't see output like this, the installation failed! 😵


Step 2: Set Up a GitHub Account! 🛠️

Go to GitHub.com and create an account! (It's free!)

🎩 Username example: "Yazh24"

📌 Create a repo named DevOps

🚀 To clone it:

git clone https://github.com/Yazh24/devops.git

Step 3: Node.js - Without This We Can't Even Eat! 🟢

📥 Download Node.js

Check that it installed:

node -v

Output:

v22.14.0

Then check npm:

npm -v

Output:

10.9.2

If you get no output, it didn't install properly! 🤨


Step 4: VS Code - The Developer's Good Friend! 💻

📥 Download VS Code

📂 Open it and select the DevOps folder

👀 There should be one file:

D:\Devops\README.md

Step 5: A Small Node.js Project 🎭

✅ Open a terminal and run:

npm init -y

This creates package.json!

📦 Install Express:

npm install express

📑 In package.json, change main to app.js

Write app.js:

const express = require('express');
const app = express();
const PORT = process.env.PORT || 3000;
const name = "CodeTest";

app.get("/", (req, res) => {
    res.send(`Welcome to DevOps Easy from ${name}!`);
});

const server = app.listen(PORT, () => {
    console.log(`Server is running on port ${PORT}`);
});

module.exports = {app, server, name};

🚀 Run it in the terminal:

node app.js

Open localhost:3000 in the browser ➡️ Aha! You're in luck! 🎉


Step 6: Don't Forget Git! 📤

👀 Check the status:

git status

Create .gitignore:

node_modules/
.gitignore

✅ Add & Commit:

git add .
git commit -m "Initial Commit: Basic express setup"
git push

🎉 Go check on GitHub; everything will be there!


Step 7: Testing Is a Must! 🧪

🛠️ Install Jest & Supertest:

npm install jest supertest --save-dev

📜 Update scripts in package.json:

"scripts": {
  "test": "jest"
}

Create test/app.test.js:

const request = require("supertest");
const {app, server} = require('../app');

// app.js starts listening on import; close the server so Jest can exit cleanly.
afterAll((done) => {
    server.close(done);
});

describe('GET /', () => {
    it("should return 200 status and the correct msg", async () => {
        const response = await request(app).get("/");
        expect(response.status).toBe(200);
        expect(response.text).toBe('Welcome to DevOps Easy from CodeTest!');
    });
});

🛠️ Run the test:

npx jest test/app.test.js

If you see the green check mark, you pass!


Step 8: Set Up CI/CD! ✨

🛠️ Create .github/workflows/ci.yaml:

name: CI Pipeline

on:
  push:
    branches:
      - main

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 22  # matches the v22 release installed in Step 3

      - name: Install Dependencies
        run: npm install

      - name: Run Tests
        run: npx jest test/app.test.js

🚀 Push it, then check GitHub Actions! 🎩


Conclusion: You're a DevOps Boss! 🏆🔥

Finish this and you're a DevOps expert! 🎉

Your GitHub repository is now fully DevOps-ready! 💪

If in doubt, reread the steps, or else have some arrack... no, no, have a coffee! ☕😂

Wednesday, March 26, 2025

#1 Basic form filling using selenium

Automate Google Form Filling Using Python and Selenium

Filling out forms manually is a tedious task, but with Python and Selenium, we can automate it in just a few steps! 🚀

Step 1: Create a Google Form

  1. Head over to Google Forms and create a form similar to this:

    • Name (Short Answer)

    • Mobile Number (Short Answer)

    • Email (Short Answer)


  2. Publish your form and copy the form URL (e.g., https://forms.gle/ZVKdhTJYqeHT4W1o6).

  3. Open the form in Google Chrome.

  4. Right-click on each input field, select Inspect, then right-click on the <input> tag and select Copy XPath.

  5. Save the XPath values for later use.
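
One way to save the XPaths is a dictionary keyed by field name, so that filling becomes a loop instead of repeated code. The sketch below is an assumption about how that could look, not the script used in this post: the XPath strings are placeholders to be replaced with your copied values, and RecordingDriver is a stand-in so the example runs without a browser.

```python
# Placeholder XPaths: replace each value with the XPath copied from Chrome.
FIELD_XPATHS = {
    "name": "PASTE_NAME_XPATH_HERE",
    "mobile": "PASTE_MOBILE_XPATH_HERE",
    "email": "PASTE_EMAIL_XPATH_HERE",
}


def fill_fields(driver, xpaths, values):
    """Type each value into the element located by the matching XPath."""
    filled = []
    for field, value in values.items():
        if field not in xpaths:
            continue  # no saved XPath for this field, skip it
        element = driver.find_element("xpath", xpaths[field])
        element.send_keys(value)
        filled.append(field)
    return filled


class _RecordingElement:
    def __init__(self, log, xpath):
        self._log, self._xpath = log, xpath

    def send_keys(self, text):
        self._log.append((self._xpath, text))


class RecordingDriver:
    """Stand-in for a Selenium driver that only records what would be typed."""

    def __init__(self):
        self.log = []

    def find_element(self, by, value):
        return _RecordingElement(self.log, value)


driver = RecordingDriver()
print(fill_fields(driver, FIELD_XPATHS, {"name": "RM.Chandrasekaran", "email": "abc@a.com"}))
print(driver.log)
```

With Selenium installed, passing webdriver.Chrome() in place of RecordingDriver() would type into the real form, since fill_fields only relies on find_element and send_keys.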


Step 2: Automate Form Filling with Python

Install Dependencies

First, install Selenium:

pip install selenium

Also, ensure ChromeDriver is installed and on your system's PATH. Selenium 4.6 and later can also download a matching driver automatically through Selenium Manager.

Create form_fill.py

Open Visual Studio Code and create a file named form_fill.py.

from selenium import webdriver
import time

# Initialize WebDriver
web = webdriver.Chrome()
web.get('https://forms.gle/ZVKdhTJYqeHT4W1o6')
web.maximize_window()

time.sleep(2)  # Allow the page to load

# Fill out the form
sname = "RM.Chandrasekaran"
name = web.find_element('xpath', '//*[@id="mG61Hd"]/div[2]/div/div[2]/div[1]/div/div/div[2]/div/div[1]/div/div[1]/input')
name.send_keys(sname)

time.sleep(2)

mobile = "999999999"
mobil = web.find_element('xpath', '//*[@id="mG61Hd"]/div[2]/div/div[2]/div[2]/div/div/div[2]/div/div[1]/div/div[1]/input')
mobil.send_keys(mobile)

time.sleep(2)

email = 'abc@a.com'
emai = web.find_element('xpath', '//*[@id="mG61Hd"]/div[2]/div/div[2]/div[3]/div/div/div[2]/div/div[1]/div/div[1]/input')
emai.send_keys(email)

time.sleep(2)

# Click Submit Button
submit = web.find_element('xpath', '//*[@id="mG61Hd"]/div[2]/div/div[3]/div[1]/div[1]/div/span/span')
submit.click()

time.sleep(5)  # Wait before closing the browser

print("Form Submitted Successfully! 🎉")

# Close the browser
web.quit()

Step 3: Run the Script

Simply run the script using:

python form_fill.py

And just like that, your form gets submitted automatically! 😎


Conclusion

Automating repetitive tasks like form filling can save time and effort. This method can be expanded for bulk form submissions or integrated with databases to auto-fill dynamic values.
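
As a sketch of the bulk idea, the values could come from a CSV file with one row per submission. The column names and the helper load_submissions are assumptions for illustration; each yielded row would then be passed to the same fill-and-submit steps as in form_fill.py.

```python
import csv
import io


def load_submissions(csv_file, required=("name", "mobile", "email")):
    """Yield one dict per CSV row, skipping rows with a missing required value."""
    for row in csv.DictReader(csv_file):
        cleaned = {key: (row.get(key) or "").strip() for key in required}
        if all(cleaned.values()):
            yield cleaned


# In-memory CSV for the demo; a real run would use open("people.csv", newline="").
demo_csv = io.StringIO(
    "name,mobile,email\n"
    "RM.Chandrasekaran,999999999,abc@a.com\n"
    "Incomplete Row,,missing@a.com\n"
)
for submission in load_submissions(demo_csv):
    print(submission)
```

Skipping incomplete rows up front avoids submitting half-filled forms; the same generator could just as well read from a database cursor.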

Happy Coding! 💻✨
