
A Guide to Python Subprocess

By: Stackify Team | January 21, 2025

Python’s subprocess module stands as one of the most powerful tools in a developer’s toolkit. The module lets you spawn new processes, connect to their input/output/error pipes, and get return codes. Think of the subprocess module as your bridge between Python scripts and system commands, replacing older modules like os.system and os.spawn and providing a better, unified way to handle external processes.

Importance of Subprocess in Automation and Scripting

Automation is the backbone of modern development, and subprocess makes automation possible. Whether you’re building deployment scripts, automating system tasks, or creating development tools, subprocess helps you interact with the system seamlessly. Such versatility allows developers to:

  • Execute system commands from Python scripts
  • Handle command output and errors efficiently
  • Control process environments
  • Manage complex process pipelines
  • Automate repetitive system tasks

Basics of the Subprocess Module

Importing the Subprocess Module

Getting started with subprocess is straightforward. To use subprocess, import it at the top of your script:

import subprocess

This single line opens the door to a range of process management functionalities. The module comes built into Python’s standard library, so no additional installation is needed.

Key Functions and Methods

The subprocess module provides various functions, but these four are most commonly used:

subprocess.run()

The star of the show, run() is the recommended way to run commands in Python 3, providing a clean interface and returning a CompletedProcess instance:

import subprocess
# Simple command execution
result = subprocess.run(['ls', '-l'], capture_output=True, text=True)
print(result.stdout)

# With additional parameters
result = subprocess.run(
    ['python', 'script.py'],
    capture_output=True,
    text=True,
    timeout=60,
    check=True
)
print(result.stdout)

The run() function offers several useful parameters:

  • capture_output: Captures stdout and stderr
  • text: Returns string output instead of bytes
  • timeout: Sets maximum execution time
  • check: Raises CalledProcessError if return code is non-zero
  • cwd: Sets working directory for command execution

subprocess.Popen()

For more control over process execution, Popen provides low-level process creation and management:

import subprocess
import time

# Interactive process
process = subprocess.Popen(
    ['echo', 'Hello World!'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True
)

output, error = process.communicate()
print(output, error)

# Long-running process
background_process = subprocess.Popen(
    ['python', 'server.py'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE
)

# Do other things while process runs
while background_process.poll() is None: # Check if process has finished
    time.sleep(1)
    print("Server is still running")

print("Server stopped")
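When a process runs for a while, you may want its output as it arrives rather than all at once from communicate(). Iterating over the stdout pipe yields lines in real time. Here is a minimal sketch, using the Python interpreter itself as the child so the example stays portable:

```python
import subprocess
import sys

# Stream the child's output line by line as it is produced
process = subprocess.Popen(
    [sys.executable, '-c', 'for i in range(3): print(i)'],
    stdout=subprocess.PIPE,
    text=True
)

lines = []
for line in process.stdout:  # yields each line as the child writes it
    lines.append(line.strip())

process.wait()
print(lines)
```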

subprocess.call()

A simpler interface when you only need the return code:

import subprocess

# Basic usage
return_code = subprocess.call(['ping', '-c', '1', 'google.com'])
print(f"Ping return code '{return_code}'")

# With error handling
try:
    status = subprocess.call(['service', 'nginx', 'restart'])
    if status == 0:
        print("Service restarted successfully")
    else:
        print(f"Service restart failed with code {status}")
except Exception as e:
    print(f"Error occurred: {e}")

subprocess.check_output()

When you just want the output:

import subprocess

# Simple output capture (text=True decodes the bytes output to str)
output = subprocess.check_output(['date'], text=True)
print(f"Date output: {output.strip()}")

# With error handling and timeout
try:
    output = subprocess.check_output(
        ['curl', 'http://api.example.com'],
        timeout=5,
        text=True
    )
    print(f"API Response: {output}")
except subprocess.TimeoutExpired:
    print("API request timed out")
except subprocess.CalledProcessError as e:
    print(f"API request failed with code {e.returncode}")

Executing a Command

Using subprocess.run()

Here’s how you can execute a simple shell command using subprocess.run:

result = subprocess.run(["ls", "-l"], capture_output=True, text=True)
print(result.stdout)

This command lists files in the current directory with details. The capture_output=True parameter captures the command’s output.

Here’s a complete example of running a command with various options:

import subprocess

try:
    # Run command with all common options
    result = subprocess.run(
        ['python', '--version'],
        capture_output=True,
        text=True,
        check=True,
        timeout=30,
        cwd='/path/to/working/dir',
        env={'PATH': '/usr/local/bin:/usr/bin'}
    )
    print(f"Python version: {result.stdout}")
    print(f"Return code: {result.returncode}")
except subprocess.CalledProcessError as e:
    print(f"Command failed with exit code {e.returncode}")
    print(f"Error output: {e.stderr}")
except subprocess.TimeoutExpired:
    print("Command timed out")
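One detail worth noting about the env parameter: it replaces the child's entire environment rather than extending it, so a minimal env={'PATH': ...} can break commands that need other variables. The usual pattern is to copy os.environ and modify the copy (MY_FLAG below is a made-up variable for illustration):

```python
import os
import subprocess
import sys

# env= replaces the child's entire environment, so copy the current
# one first and add or override only what you need
env = os.environ.copy()
env['MY_FLAG'] = '1'  # hypothetical variable for illustration

result = subprocess.run(
    [sys.executable, '-c', 'import os; print(os.environ["MY_FLAG"])'],
    capture_output=True,
    text=True,
    env=env
)
print(result.stdout.strip())
```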

Differences Between run(), call(), and check_output()

While run() is the most versatile, each function serves a specific purpose:

  • run():
    • Most modern and flexible
    • Returns CompletedProcess object
    • Provides access to stdout, stderr, and return code
    • Supports timeout and input
    • Recommended for most use cases
  • call():
    • Simple interface
    • Returns just the return code
    • Good for quick checks
    • Doesn’t capture output by default
  • check_output():
    • Returns output as bytes/string
    • Raises CalledProcessError on non-zero exit
    • Good for commands where output is needed
    • Less flexible than run()
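To make the differences concrete, here is the same command run through all three functions (using sys.executable so the sketch is portable across systems):

```python
import subprocess
import sys

cmd = [sys.executable, '-c', 'print("hi")']

# run(): returns a CompletedProcess with output, errors, and return code
result = subprocess.run(cmd, capture_output=True, text=True)

# call(): returns only the return code; output goes to the console
code = subprocess.call(cmd, stdout=subprocess.DEVNULL)

# check_output(): returns the output, raises CalledProcessError on failure
output = subprocess.check_output(cmd, text=True)

print(result.stdout.strip(), code, output.strip())
```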

Error Handling with try-except

Robust error handling is crucial when working with external processes:

import subprocess

try:
    # Attempt to run command
    result = subprocess.run(
        ['nonexistent_command'],
        capture_output=True,
        text=True
    )
    result.check_returncode()
except FileNotFoundError:
    print("Command not found in system PATH")
except subprocess.CalledProcessError as e:
    print(f"Command failed with exit code {e.returncode}")
    print(f"Error output: {e.stderr}")
except subprocess.TimeoutExpired as e:
    print(f"Command timed out after {e.timeout} seconds")
except Exception as e:
    print(f"Unexpected error: {e}")

Working with Process Information

Understanding Popen Objects

The Popen class provides advanced control over processes, allowing non-blocking execution and real-time interaction with processes:

import subprocess

# Create a process
process = subprocess.Popen(
    ['sleep', '10'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE
)

# Process management
print(f"Process ID: {process.pid}")
print(f"Process still running: {process.poll() is None}")

# Wait for completion with timeout

try:
    process.wait(timeout=5)
except subprocess.TimeoutExpired:
    process.kill()
    print("Process killed after timeout")

print(f"Final return code: {process.returncode}")

Communicating with Processes

stdin, stdout, stderr

These streams manage input, output, and error data, which you can redirect or capture as needed.

Here you can control all process streams for interactive processes:

import subprocess

# Interactive process with all streams
process = subprocess.Popen(
    ['python', '-c', 'print(input("Enter text: "))'],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True
)

# Send input and get output
stdout, stderr = process.communicate(input="Hello from Python!")
print(f"Output: {stdout}")
print(f"Errors: {stderr}")

Using communicate()

The communicate() method sends data to stdin and reads stdout and stderr.

The following is a safe way to interact with process streams without deadlocks:

import subprocess

process = subprocess.Popen(
    ['grep', 'pattern'],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True
)

try:
    stdout, stderr = process.communicate(
        input="some text to search\nwith pattern\n",
        timeout=5
    )

    print(f"Found matches: {stdout}")
except subprocess.TimeoutExpired:
    process.kill()
    print("Process killed due to timeout")

Passing Arguments to Commands

Handling File Paths

Always use list format for commands with paths to avoid shell injection:

import subprocess

# Safe way to handle paths with spaces
source_path = '/path with spaces/source'
dest_path = '/destination/path'

subprocess.run([
    'cp',
    source_path,
    dest_path
])

# Handling multiple files
files = ['file1.txt', 'file2.txt', 'file3.txt']

subprocess.run(['zip', 'archive.zip'] + files)

Using Shell=True and Its Implications

Use shell commands when needed, but carefully:

import subprocess

# With shell=True (use cautiously)
subprocess.run('echo $HOME | grep /home', shell=True)

# Better alternative without shell=True: resolve $HOME in Python instead
import os
home = os.environ.get('HOME', '')
subprocess.run(['grep', '/home'], input=home, text=True)
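A shell pipeline is one of the few things shell=True does conveniently, but you can build the same thing without it by wiring one Popen's stdout into the next one's stdin. The following is a sketch of the standard pattern (POSIX-only, since it relies on printf and grep):

```python
import subprocess

# Equivalent of the shell pipeline: printf 'one\ntwo\nthree\n' | grep t
producer = subprocess.Popen(
    ['printf', 'one\ntwo\nthree\n'],
    stdout=subprocess.PIPE
)
consumer = subprocess.Popen(
    ['grep', 't'],
    stdin=producer.stdout,   # wire the producer's stdout into grep's stdin
    stdout=subprocess.PIPE,
    text=True
)
producer.stdout.close()  # let the producer receive SIGPIPE if grep exits early
output, _ = consumer.communicate()
print(output)
```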

Common Use Cases

Running Shell Scripts

Execute custom scripts with proper error handling:

import subprocess

def run_deployment():
    try:
        # Run deployment script
        result = subprocess.run(
            ['bash', 'deploy.sh'],
            capture_output=True,
            text=True,
            check=True,
            timeout=300 # 5 minute timeout
        )

        print("Deployment successful!")
        print(f"Output: {result.stdout}")
    except subprocess.CalledProcessError as e:
        print(f"Deployment failed: {e.stderr}")
    except subprocess.TimeoutExpired:
        print("Deployment timed out")

run_deployment()

Automating System Administration Tasks

Create powerful automation tools:

import subprocess

def backup_directory(source, destination, exclude=None):
    """
    Backup a directory using rsync with error handling and logging.
    """
    command = ['rsync', '-av']
    if exclude:
        for pattern in exclude:
            command.extend(['--exclude', pattern])
    command.extend([source, destination])
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            check=True
        )
        print(f"Backup completed: {result.stdout}")
        return True
    except subprocess.CalledProcessError as e:
        print(f"Backup failed: {e.stderr}")
        return False

# Example usage
exclude_patterns = ['.git', 'node_modules', '*.pyc']
backup_directory(
    '/source/project',
    '/backup/project',
    exclude=exclude_patterns
)

Building Custom Utilities

Subprocess can act as a glue between tools to build utilities. For example:

import subprocess

def get_system_info():
    """
    Collect various system information using system commands.
    """
    info = {}
    commands = {
        'cpu': ['lscpu'],
        'memory': ['free', '-h'],
        'disk': ['df', '-h'],
        'network': ['ifconfig'],
        'processes': ['ps', 'aux']
    }
    for key, command in commands.items():
        try:
            output = subprocess.check_output(
                command,
                text=True,
                timeout=10
            )
            info[key] = output.strip()
        except Exception as e:
            info[key] = f"Error: {str(e)}"
    return info

# Example usage
system_info = get_system_info()

for category, data in system_info.items():
    print(f"\n=== {category.upper()} ===")
    print(data)

Security Considerations

Risks of Using Shell=True

Using shell=True can introduce several security risks:

  • Command injection vulnerabilities
  • Unexpected shell expansion
  • Performance overhead
  • Difficulty in handling special characters
  • Platform-dependent behavior

Techniques to Mitigate Security Risks

Best practices for secure subprocess usage:

  • Use lists instead of strings for commands
  • Validate all inputs before passing them to subprocess
  • Prefer shlex.quote() to sanitize inputs for shell commands
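When shell=True truly is required (for example, for shell-specific features), shlex.quote() from the standard library escapes a value so the shell treats it as a single literal token. A short sketch:

```python
import shlex
import subprocess

user_input = "file.txt; rm -rf /"  # hostile input

# shlex.quote() wraps the value so the shell sees one literal argument
safe = shlex.quote(user_input)
print(safe)

# The quoted value cannot break out of the cat command; the worst case
# is a "no such file" error, not an executed rm
subprocess.run(f"cat {safe}", shell=True)
```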

Here are some examples of good and bad implementations:

import subprocess

# Bad practice (vulnerable to injection)
user_input = "file.txt; rm -rf /"
subprocess.run(f"cat {user_input}", shell=True)

# Good practice: the argument is passed literally, never interpreted by a shell
subprocess.run(['cat', user_input])

# Handling complex commands
def safe_file_operations(filename):
    """
    Demonstrate safe file operations with subprocess.
    """

    # Validate filename first
    if not filename.endswith('.txt'):
        raise ValueError("Only .txt files allowed")

    try:
        # Read file contents
        result = subprocess.run(
            ['cat', filename],
            capture_output=True,
            text=True,
            check=True
        )
        # Process contents safely
        processed_content = result.stdout.upper()

        # Write back safely: tee's output is redirected to the new file
        with open(f"processed_{filename}", 'w') as f:
            subprocess.run(
                ['tee'],
                input=processed_content,
                stdout=f,
                text=True,
                check=True
            )
    except subprocess.CalledProcessError as e:
        print(f"Operation failed: {e}")

safe_file_operations("test.txt")

# The validation rejects a hostile filename before any command runs
try:
    safe_file_operations("test.txt; rm -rf /")
except ValueError as e:
    print(f"Rejected: {e}")

Application Monitoring for Python Subprocess

Stackify Retrace and Python Subprocess Monitoring

Stackify Retrace offers comprehensive monitoring for Python applications, including subprocess operations. The tool helps you track:

  • Process execution times and patterns
  • Error rates and types across different commands
  • Resource usage during subprocess operations
  • Performance bottlenecks in script execution
  • System command patterns and frequencies

Using custom instrumentation with Python argparse, Stackify Retrace enables detailed monitoring of command-line interfaces, providing insights into:

  • Command usage patterns
  • Parameter frequencies
  • Error conditions
  • Performance metrics
  • Resource utilization

This comprehensive monitoring helps maintain reliable and efficient Python applications.

Conclusion

Python’s subprocess module is a powerful, flexible tool for system interaction and automation. It is crucial, however, to use it securely and efficiently, and monitoring is essential for maintaining reliability and performance, particularly in production applications.

Stackify Retrace provides complete Python monitoring in one tool. With features like code profiling, error tracking, and log management, Retrace helps ensure your Python applications run smoothly. Retrace offers detailed insights into subprocess operations, helping you identify and resolve issues quickly. Whether you’re running simple scripts or complex automation systems, Retrace’s comprehensive monitoring capabilities make it easier to maintain and optimize your Python applications.

The combination of proper subprocess usage and robust monitoring through Retrace creates a powerful foundation for reliable Python applications. This enables developers to build sophisticated automation tools while maintaining security and performance.

Visit Stackify’s Python monitoring page to learn more about how Retrace can help monitor and improve your Python applications. Better still, start your free Retrace trial today.
