Workflow CLI Commands
ParquetFrame provides comprehensive command-line tools for workflow management, execution, and monitoring.
pframe workflow
Execute, validate, or visualize YAML workflow definitions.
Usage
pframe workflow <workflow_file> [OPTIONS]
Arguments
Argument |
Type |
Required |
Description |
workflow_file |
PATH |
Yes* |
Path to YAML workflow definition file |
*Required for execution/validation. Optional for --list-steps
and --create-example
.
Options
Execution Options
Option |
Short |
Type |
Description |
--variables |
-V |
TEXT |
Set workflow variables as key=value pairs |
--validate |
-v |
FLAG |
Validate workflow without executing |
--quiet |
-q |
FLAG |
Run in quiet mode with minimal output |
Workflow Management
Option |
Type |
Description |
--list-steps |
FLAG |
List all available workflow step types |
--create-example |
PATH |
Create example workflow at specified path |
Visualization Options
Option |
Type |
Options |
Description |
--visualize |
CHOICE |
graphviz , networkx , mermaid |
Generate workflow visualization |
--viz-output |
PATH |
- |
Output path for visualization file |
--viz-format |
CHOICE |
svg , png , pdf , dot |
Visualization format (default: svg) |
Examples
Basic Execution
# Execute workflow
pframe workflow pipeline.yml
# Execute with variables
pframe workflow pipeline.yml --variables "input_dir=data,min_age=21"
# Quiet execution
pframe workflow pipeline.yml --quiet
Validation and Setup
# Validate workflow syntax
pframe workflow pipeline.yml --validate
# Create example workflow
pframe workflow --create-example my_pipeline.yml
# List available step types
pframe workflow --list-steps
Visualization
# Generate Graphviz SVG
pframe workflow pipeline.yml --visualize graphviz --viz-output dag.svg
# Generate Mermaid diagram
pframe workflow pipeline.yml --visualize mermaid --viz-output diagram.md
# Generate PNG with NetworkX
pframe workflow pipeline.yml --visualize networkx --viz-output graph.png
Sample Output
Execution Output
[EXECUTING] Executing workflow: data_pipeline.yml
[VARIABLES] Variables: {'min_revenue': 1000, 'region': 'US'}
[STEP 1/4] load_customers (read)
✓ Loaded 50,000 rows from raw_customers.parquet (pandas backend)
Duration: 1.23s | Memory: 45.2MB
[STEP 2/4] filter_active (filter)
✓ Filtered to 32,156 rows (64.3% retention)
Duration: 0.45s | Memory: 28.9MB
[STEP 3/4] add_segments (transform)
✓ Applied transformation function
Duration: 2.18s | Memory: 31.4MB
[STEP 4/4] save_results (save)
✓ Saved 32,156 rows to customer_segments.parquet
Duration: 0.89s | Memory: 31.4MB
[SUCCESS] Workflow execution completed successfully!
Total Duration: 4.75s | Peak Memory: 45.2MB
Validation Output
[VALIDATING] Validating workflow: pipeline.yml
[SUCCESS] Workflow validation passed!
[SUMMARY] Workflow summary:
• Name: Customer Data Pipeline
• Description: Process customer data and generate insights
• Steps: 4
• Variables: 2
[STATS] Workflow DAG Statistics:
• Total Steps: 4
• Dependencies: 3
• Is Valid DAG: True
• Longest Path: 4
pframe workflow-history
View and manage workflow execution history with detailed analytics.
Usage
pframe workflow-history [OPTIONS]
Options
Filtering Options
Option |
Short |
Type |
Description |
--workflow-name |
-w |
TEXT |
Filter by specific workflow name |
--status |
-s |
CHOICE |
Filter by status: completed , failed , running |
--limit |
-l |
INT |
Limit number of records (default: 10) |
Display Options
Option |
Short |
Type |
Description |
--details |
-d |
FLAG |
Show detailed execution information |
--stats |
|
FLAG |
Show aggregate statistics |
Management Options
Option |
Type |
Description |
--cleanup |
INT |
Clean up history files older than N days |
Examples
Viewing History
# Show recent executions
pframe workflow-history
# Filter by workflow name
pframe workflow-history --workflow-name "Customer Pipeline"
# Show only failed executions
pframe workflow-history --status failed --limit 5
# Detailed view of recent executions
pframe workflow-history --details --limit 3
Statistics and Analytics
# Show aggregate statistics
pframe workflow-history --stats
# Statistics for specific workflow
pframe workflow-history --workflow-name "Data Pipeline" --stats
History Management
# Clean up files older than 30 days
pframe workflow-history --cleanup 30
# Show more results
pframe workflow-history --limit 25
Sample Output
Summary View
[HISTORY] Workflow Execution History
Recent Workflow Executions
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ Execution ID ┃ Workflow ┃ Status ┃ Started ┃ Duration ┃ Steps ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ customer_pipeline_1698... │ Customer Pipeline │ completed │ 2024-01-15 14:30:25 │ 4.75s │ 4 │
│ data_etl_1698... │ Data ETL Pipeline │ completed │ 2024-01-15 13:45:12 │ 12.34s │ 6 │
│ ml_training_1698... │ ML Training Pipeline │ failed │ 2024-01-15 12:15:33 │ 45.67s │ 8 │
└────────────────────────────┴──────────────────────┴───────────┴─────────────────────┴───────────┴───────┘
Showing latest 10 records. Use --limit to see more.
Detailed View
==================================================
Execution ID: customer_pipeline_1698765425_a1b2c3d4
Workflow: Customer Data Pipeline
Status: completed
Started: 2024-01-15 14:30:25
Duration: 4.75s
Peak Memory: 45.2MB
Steps (4):
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Step ┃ Type ┃ Status ┃ Duration ┃
┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━┩
│ load_customers │ read │ completed │ 1.23s │
│ filter_active │ filter │ completed │ 0.45s │
│ add_segments │ transform │ completed │ 2.18s │
│ save_results │ save │ completed │ 0.89s │
└─────────────────┴───────────┴───────────┴──────────┘
Statistics View
Workflow Statistics - Customer Data Pipeline
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Metric ┃ Value ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ Total Executions │ 15 │
│ Successful │ 13 │
│ Failed │ 2 │
│ Success Rate │ 86.7% │
│ Avg Duration │ 5.42s │
│ Total Duration │ 81.30s │
└─────────────────────┴────────────────┘
Error Handling
Both commands provide comprehensive error handling:
Common Error Types
- File Not Found: Clear message with path verification
- Invalid YAML: Syntax errors with line numbers
- Missing Dependencies: Helpful installation instructions
- Validation Errors: Step configuration issues with fixes
- Execution Failures: Detailed error context and suggestions
Error Examples
# Invalid workflow file
$ pframe workflow missing.yml
[ERROR] Workflow file not found: missing.yml
[TIP] Check the file path and ensure the file exists
# YAML syntax error
$ pframe workflow broken.yml
[ERROR] Invalid YAML syntax in broken.yml:
Line 12: mapping values are not allowed here
[TIP] Check YAML indentation and syntax
# Missing visualization dependency
$ pframe workflow pipeline.yml --visualize graphviz
[ERROR] Workflow visualization requires additional dependencies
Please install with: pip install networkx graphviz matplotlib
- Use
--validate
to check workflows before long executions
- Leverage
--quiet
mode for automated/scripted usage
- Regular cleanup with
--cleanup
to manage disk space
- Use visualization to understand complex dependency chains
- Monitor execution history to identify performance patterns