Job Handling Module GUI Usage Guide
The Compileo Job Handling GUI provides a comprehensive web interface for monitoring, managing, and controlling background jobs using an enhanced Redis-based queue system.
Accessing the Job Dashboard
Navigate to the "๐ Job Dashboard" page from the main menu to access the job monitoring interface.
Job Queue Overview
The dashboard displays real-time job queue statistics and system health:
Queue Statistics Panel
- Pending Jobs: Jobs waiting in queue
- Running Jobs: Currently executing jobs
- Scheduled Jobs: Jobs scheduled for future execution
- Total Jobs: All jobs in system (with automatic cleanup)
- Active Workers: Number of healthy worker processes
- System Health: CPU and memory usage metrics
Enhanced Features
- Real-time Updates: Statistics refresh automatically every 30 seconds
- Worker Health Monitoring: Automatic detection and cleanup of stale workers
- Job Cleanup Status: Last cleanup time and items removed
- System Alerts: Warnings for high resource usage or failed jobs
Job Monitoring
Job List View
The main job table provides comprehensive job information:
-
Job Information:
- Job ID: Unique UUID identifier
- Job Type: extraction, document_processing, taxonomy_processing, dataset_generation
- Status: pending, running, completed, failed, cancelled, scheduled
- Progress: Percentage complete (0-100%)
- User: User who submitted the job
-
Timestamps:
- Created: When job was submitted
- Started: When processing began
- Completed: When job finished (or failed/cancelled)
-
Performance Metrics:
- Execution Time: Total processing duration
- Items Processed: Number of items handled
- Error Messages: Failure details (if applicable)
Filtering and Sorting
- Status Filter: View jobs by status (All, Pending, Running, Completed, Failed, Cancelled)
- Type Filter: Filter by job type
- Time Range: Filter jobs by creation date
- Search: Find jobs by ID or user
- Sorting: Sort by creation time, status, or progress
Job Management
View Job Details
- Select Job: Click on any job row in the table
- Detailed View: Expand to see complete job information including:
- Full parameters and configuration
- Progress history and milestones
- Result summaries (for completed jobs)
- Error details (for failed jobs)
- Performance metrics and timing
Cancel a Job
- Locate Target Job: Find the running or pending job in the list
- Cancel Action: Click the "โ Cancel" button in the Actions column
- Confirmation: Confirm cancellation in the dialog prompt
- Status Update: Job status changes to "cancelled" immediately
Restart Failed Jobs
- Identify Failed Job: Find job with "failed" status
- Restart Action: Click the "๐ Restart" button
- Confirmation: Confirm restart with same parameters
- Re-queue: Job returns to "pending" status with fresh execution
Real-time Monitoring
In-View Synchronous Monitoring
Compileo uses a standardized Synchronous In-View Monitoring pattern for all long-running jobs (parsing, chunking, extraction, etc.) started from the GUI:
- UI Stability: Status updates occur within
st.empty()placeholders, preventing disruptive whole-page reruns and preserving your scroll position. - Accurate Status: Displays stage-based progress messages (e.g., "Generating Splits", "Analyzing Chunks") fetched directly from backend metadata.
- Blocked Execution: The submit button handler remains active until the job reaches a terminal state (Completed, Failed, or Cancelled), ensuring you see the final result immediately.
Live Progress Updates
- Progress Bars: Visual progress indicators for running jobs
- Status Changes: Automatic status updates without page refresh
- Performance Metrics: Live execution throughput updates
- Worker Status: Real-time worker health and activity monitoring
Server-Sent Events (SSE)
The GUI uses Server-Sent Events for real-time updates: - Instant Notifications: Immediate status change alerts - Progress Streaming: Live progress updates during execution - Error Alerts: Real-time failure notifications - Completion Alerts: Success notifications with result summaries
RQ System Features
Enhanced Reliability Features
1. Datetime Compatibility - Automatic handling of timezone-aware timestamps from RQ - Prevents datetime comparison errors during worker monitoring - Compatible with Redis timestamp storage
2. Worker Health Management - Continuous monitoring of RQ worker processes - Automatic cleanup of workers unresponsive for >5 minutes - Prevention of dead worker accumulation in Redis
3. Comprehensive Job Cleanup - Multi-level Cleanup: Runs every 10 minutes - RQ failed jobs: Immediate cleanup - RQ finished jobs: 24-hour retention - Custom jobs: 2-hour retention - Processing locks: 10-minute cleanup - Startup Cleanup: Removes jobs >1 hour old on system start - Registry Management: Proper RQ registry maintenance
4. Duplicate Execution Prevention - Atomic Redis operations prevent race conditions - Early status validation before job execution - UUID-based job identification - Processing locks for concurrent execution safety
Job Types and Operations
Document Processing Jobs
- Parse Documents: Convert PDFs/text to markdown chunks
- Chunk Documents: Apply chunking strategies (token, character, semantic, schema)
- Progress Tracking: Real-time parsing and chunking progress
Taxonomy Processing Jobs
- Generate Taxonomy: AI-powered taxonomy creation from document chunks
- Category Analysis: Hierarchical category structure generation
- Validation: Taxonomy completeness and consistency checks
Extraction Jobs
- Selective Extraction: Taxonomy-based content extraction
- Multi-stage Classification: Primary + validation classifiers
- Confidence Scoring: Result confidence and category matching
Dataset Generation Jobs
- Automated Dataset Creation: Convert extractions to training datasets
- Format Conversion: JSON, CSV, and other dataset formats
- Quality Validation: Dataset completeness and consistency checks
Troubleshooting
Common Issues
Jobs Stuck in Pending: - Check worker health in queue statistics - Verify Redis connectivity - Review worker logs for startup errors - Check resource limits (CPU/memory)
Incorrect Job Counts: - Wait for automatic refresh (30 seconds) - Check last cleanup time in statistics - Verify job status accuracy
High Memory Usage: - Monitor worker accumulation - Check for failed job cleanup - Review large result storage
Datetime Errors: - System automatically handles timezone issues - Check server logs for RQ compatibility - Verify system clock synchronization
Performance Optimization
Best Practices
- Monitor Queue Health: Use statistics panel for system monitoring
- Cancel Unnecessary Jobs: Clean up jobs no longer needed
- Use Appropriate Timeouts: Set reasonable job timeouts
- Monitor Resource Usage: Watch CPU/memory trends
- Regular Cleanup: System performs automatic maintenance
System Limits
- Global Max Jobs: 10 concurrent jobs (configurable)
- Per-User Max Jobs: 3 concurrent jobs (configurable)
- Job Timeout: 72 hours default (configurable)
- Result TTL: 500 seconds default (configurable)
- Cleanup Frequency: Every 10 minutes
Advanced Features
Job Dependencies (Future)
- Chain jobs with prerequisite relationships
- Automatic dependency resolution
- Failure cascade handling
Scheduled Jobs
- Time-based job execution
- Cron-style scheduling
- Recurring job patterns
Bulk Operations
- Multiple job cancellation
- Batch status updates
- Bulk cleanup operations
This enhanced job handling system provides enterprise-grade reliability with comprehensive monitoring, automatic maintenance, and real-time user feedback for all background processing operations.