Configuration Reference

This document provides a complete reference for GoliteFlow’s YAML configuration format.

πŸ“‹ Table of Contents

πŸ—οΈ Basic Structure

version: "1.0"
workflows:
  - name: workflow_name
    schedule: "cron_expression"
    tasks:
      - id: task_id
        command: "command_to_execute"
        # ... other task options

Root Level Fields

Field Type Required Description
version string βœ… Configuration version (currently β€œ1.0”)
workflows array βœ… List of workflow definitions

πŸ”„ Workflow Configuration

Each workflow represents a collection of tasks that run on a schedule.

workflows:
  - name: daily_backup
    schedule: "0 2 * * *"
    tasks:
      - id: backup_files
        command: "tar -czf backup.tar.gz /data"
        retry: 3

Workflow Fields

Field Type Required Description
name string βœ… Unique workflow identifier
schedule string βœ… Cron expression for scheduling
tasks array βœ… List of tasks to execute

βš™οΈ Task Configuration

Tasks are individual commands that execute within a workflow.

tasks:
  - id: download_data
    command: "curl -s https://api.example.com/data"
    retry: 3
    timeout: "30s"
    depends_on: ["previous_task"]

Task Fields

Field Type Required Default Description
id string βœ… - Unique task identifier
command string βœ… - Command to execute
retry integer ❌ 1 Number of retry attempts
timeout string ❌ β€œ30m” Task timeout duration
depends_on array ❌ [] List of task IDs this task depends on

Task Dependencies

Tasks can depend on other tasks using the depends_on field:

tasks:
  - id: step1
    command: "echo 'Step 1'"
  
  - id: step2
    depends_on: ["step1"]
    command: "echo 'Step 2'"
  
  - id: step3
    depends_on: ["step1", "step2"]
    command: "echo 'Step 3'"

Execution Order: step1 β†’ step2 β†’ step3

Retry Configuration

Tasks can be configured to retry on failure:

tasks:
  - id: unreliable_task
    command: "curl -f https://unreliable-api.com/data"
    retry: 5  # Will retry up to 5 times

Retry Behavior:

Timeout Configuration

Tasks can have custom timeouts:

tasks:
  - id: long_running_task
    command: "python long_script.py"
    timeout: "2h"  # 2 hours timeout

Timeout Format: Go duration format

⏰ Cron Schedule Format

GoliteFlow uses standard cron format with 5 fields:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ minute (0 - 59)
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ hour (0 - 23)
β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ day of month (1 - 31)
β”‚ β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ month (1 - 12)
β”‚ β”‚ β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ day of week (0 - 6) (Sunday to Saturday)
β”‚ β”‚ β”‚ β”‚ β”‚
* * * * *

Common Schedule Examples

Schedule Description
"* * * * *" Every minute
"0 * * * *" Every hour
"0 0 * * *" Daily at midnight
"0 9 * * *" Daily at 9 AM
"0 2 * * 0" Every Sunday at 2 AM
"*/15 * * * *" Every 15 minutes
"0 0 1 * *" Monthly on the 1st at midnight
"0 9 * * 1-5" Weekdays at 9 AM
"0 0 * * 0" Every Sunday at midnight

Special Characters

Character Description Example
* Any value * * * * * (every minute)
, List of values 0 9,17 * * * (9 AM and 5 PM)
- Range of values 0 9-17 * * * (9 AM to 5 PM)
/ Step values */15 * * * * (every 15 minutes)

πŸ“ Examples

Basic Workflow

version: "1.0"
workflows:
  - name: simple_backup
    schedule: "0 2 * * *"
    tasks:
      - id: create_backup
        command: "tar -czf backup-$(date +%Y%m%d).tar.gz /data"
        retry: 2
        timeout: "1h"

Complex Workflow with Dependencies

version: "1.0"
workflows:
  - name: data_pipeline
    schedule: "0 3 * * *"
    tasks:
      - id: download_data
        command: "wget -O data.csv https://api.example.com/export"
        retry: 3
        timeout: "30m"
      
      - id: validate_data
        depends_on: ["download_data"]
        command: "python validate.py data.csv"
        retry: 2
        timeout: "10m"
      
      - id: process_data
        depends_on: ["validate_data"]
        command: "python process.py data.csv"
        retry: 2
        timeout: "1h"
      
      - id: upload_results
        depends_on: ["process_data"]
        command: "aws s3 cp results.json s3://my-bucket/"
        retry: 3
        timeout: "15m"
      
      - id: send_notification
        depends_on: ["upload_results"]
        command: "curl -X POST https://hooks.slack.com/services/..."
        retry: 1
        timeout: "30s"

Multiple Workflows

version: "1.0"
workflows:
  - name: hourly_cleanup
    schedule: "0 * * * *"
    tasks:
      - id: cleanup_temp
        command: "rm -rf /tmp/old_files"
        retry: 1
        timeout: "5m"
  
  - name: daily_backup
    schedule: "0 2 * * *"
    tasks:
      - id: backup_database
        command: "pg_dump mydb > backup.sql"
        retry: 2
        timeout: "30m"
      
      - id: compress_backup
        depends_on: ["backup_database"]
        command: "gzip backup.sql"
        retry: 1
        timeout: "5m"
  
  - name: weekly_report
    schedule: "0 9 * * 1"
    tasks:
      - id: generate_report
        command: "python generate_weekly_report.py"
        retry: 2
        timeout: "2h"
      
      - id: email_report
        depends_on: ["generate_report"]
        command: "mail -s 'Weekly Report' admin@company.com < report.pdf"
        retry: 1
        timeout: "1m"

Error Handling Example

version: "1.0"
workflows:
  - name: robust_pipeline
    schedule: "0 4 * * *"
    tasks:
      - id: fetch_data
        command: "curl -f https://unreliable-api.com/data"
        retry: 5  # Retry up to 5 times
        timeout: "2m"
      
      - id: process_data
        depends_on: ["fetch_data"]
        command: "python process.py"
        retry: 3
        timeout: "30m"
      
      - id: fallback_notification
        depends_on: ["fetch_data"]
        command: "echo 'Data fetch failed, using fallback'"
        retry: 1
        timeout: "10s"

βœ… Validation Rules

Configuration Validation

Workflow Validation

Task Validation

Dependency Validation

πŸ”§ Best Practices

1. Naming Conventions

# Good: Descriptive names
- name: daily_database_backup
  tasks:
    - id: backup_postgres
    - id: backup_redis

# Avoid: Generic names
- name: workflow1
  tasks:
    - id: task1
    - id: task2

2. Error Handling

# Good: Appropriate retry counts
- id: api_call
  command: "curl -f https://api.example.com/data"
  retry: 3  # Reasonable for network calls

- id: file_operation
  command: "cp file.txt backup/"
  retry: 1  # File operations usually succeed or fail immediately

3. Timeout Configuration

# Good: Realistic timeouts
- id: download_large_file
  command: "wget https://example.com/large-file.zip"
  timeout: "30m"  # Allow time for large downloads

- id: quick_validation
  command: "python validate.py"
  timeout: "2m"   # Quick operations

4. Dependency Design

# Good: Clear dependency chain
tasks:
  - id: download
    command: "wget data.csv"
  
  - id: validate
    depends_on: ["download"]
    command: "python validate.py data.csv"
  
  - id: process
    depends_on: ["validate"]
    command: "python process.py data.csv"

🚨 Common Mistakes

1. Circular Dependencies

# ❌ Wrong: Circular dependency
tasks:
  - id: task_a
    depends_on: ["task_b"]
    command: "echo A"
  
  - id: task_b
    depends_on: ["task_a"]
    command: "echo B"

2. Invalid Cron Expressions

# ❌ Wrong: Invalid cron format
schedule: "every day at 9am"  # Not valid cron

# βœ… Correct: Valid cron format
schedule: "0 9 * * *"  # Daily at 9 AM

3. Missing Dependencies

# ❌ Wrong: Task depends on non-existent task
tasks:
  - id: process_data
    depends_on: ["download_data"]  # download_data doesn't exist
    command: "python process.py"

4. Invalid Timeout Format

# ❌ Wrong: Invalid duration format
timeout: "2 hours"  # Not valid Go duration

# βœ… Correct: Valid duration format
timeout: "2h"  # 2 hours

Ready to create your workflows?

Check out real-world examples and start building.

View Examples CLI Reference