Local debug mode#

This page introduces the features and usage of MaxFrame Local Debug Mode. Local Debug Mode lets you debug user-defined functions (UDFs) passed to operators such as apply() and apply_chunk() directly in your local environment, without connecting to remote services.

Background#

In the traditional MaxFrame UDF development workflow, debugging a function passed to apply() or apply_chunk() requires submitting the code to a remote cluster for execution. You cannot set breakpoints or step through the code locally, every change has to be resubmitted for remote execution, and you may end up maintaining separate code paths for local development and production.

MaxFrame Local Debug Mode addresses these problems. Once it is enabled, UDFs run directly in the local Python environment with full IDE breakpoint support, debugging works fully offline, and the same code moves between local debugging and production runs by changing only the debug argument.

Use cases#

| Scenario | Description |
| --- | --- |
| UDF logic development | Debug and verify complex business logic in real time. |
| Data transformation tests | Validate data cleaning and transformation rules. |
| Issue investigation | Locate the root cause of UDF execution failures. |
| Offline development | Continue development without network access. |

Features#

Compared with the traditional remote debugging workflow, Local Debug Mode provides the following advantages:

| Dimension | Local Debug Mode | Traditional approach |
| --- | --- | --- |
| Breakpoint debugging | IDE breakpoints supported | Not supported |
| Remote dependency | Fully offline local debugging | Requires connection to a remote cluster |
| Debug cycle | Local instant execution | Each change must be submitted for remote execution |
| Code modification | A single code base | Multiple code paths must be maintained |

  • Zero-configuration debugging

    Pass debug=True or debug="local" to new_session(). No additional tools or services are required.

    session = new_session(o, debug=True)
    
  • Fully offline

    UDF execution does not depend on network access or remote cluster resources.

  • Native IDE support

    • Supports mainstream IDEs such as PyCharm and VS Code, as well as DataWorks Notebook.

    • Preserves the full debugging experience, including breakpoints, variable watches, and step-by-step execution.

    • Provides the same debugging experience as local Python development.

  • Flexible data sources

    Supports in-memory data, local files, MaxCompute tables, and more.

    | Data source | How to load | Use case |
    | --- | --- | --- |
    | In-memory data | md.DataFrame(pd.DataFrame()) | Quick logic verification |
    | MaxCompute table | md.read_odps_table() | Real-data testing |
    | Local file | pd.read_csv() and other native pandas APIs | Offline development |

  • Seamless switch to production

    The code you debug is the code you run in production. Remove debug=True (or debug="local") from the new_session() call and the same script runs in production without further changes.

    # Debugging
    session = new_session(o, debug=True)
    
    # Production
    session = new_session(o)
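
One way to avoid editing the script when moving between the two modes is to derive the extra argument from the environment. The sketch below is only an illustration: the variable name MAXFRAME_LOCAL_DEBUG and the helper session_kwargs() are hypothetical and not part of MaxFrame. Only the debug=True argument comes from this page.

```python
import os


def session_kwargs(env=None):
    """Build extra keyword arguments for new_session().

    MAXFRAME_LOCAL_DEBUG is a hypothetical variable name chosen for this
    sketch; MaxFrame itself does not read it.
    """
    env = os.environ if env is None else env
    flag = env.get("MAXFRAME_LOCAL_DEBUG", "").strip().lower()
    # Any of these values turns Local Debug Mode on; anything else leaves it off.
    return {"debug": True} if flag in {"1", "true", "yes"} else {}


# Intended use (not executed here, because it needs the MaxFrame SDK):
#     session = new_session(o, **session_kwargs())
print(session_kwargs({"MAXFRAME_LOCAL_DEBUG": "true"}))
print(session_kwargs({}))
```

With this approach the production job simply leaves the variable unset, so new_session(o) is called exactly as in the production line above.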
    

Quick start#

  1. Prepare the environment

    # MaxFrame SDK 2.5.0 or later is required
    pip install --upgrade "maxframe>=2.5.0"
    
  2. Basic example

    from odps import ODPS
    from maxframe import new_session
    import maxframe.dataframe as md
    import pandas as pd
    
    # Replace the placeholders with your own credentials, project, and endpoint
    o = ODPS(
        access_id='your_access_id',
        secret_access_key='your_secret_key',
        project='your_project',
        endpoint='your_endpoint',
    )
    
    # Enable debug mode
    session = new_session(o, debug=True)
    
    df = md.DataFrame(pd.DataFrame({
        "sales": [5000, 8000, 12000, 3000],
        "region": ["A", "B", "C", "D"],
    }))
    
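    def calculate_commission(row):
        # Tiered commission: 15% above 10000, 10% above 5000, otherwise 5%
        sales = row['sales']
        if sales > 10000:  # set a breakpoint here
            rate = 0.15
        elif sales > 5000:  # set a breakpoint here
            rate = 0.10
        else:
            rate = 0.05
        print(rate)  # the UDF runs in the local Python environment, so print() also works
        return sales * rate
    
    # Run the UDF row by row and fetch the result locally
    result = df.apply(calculate_commission, axis=1).execute().fetch()
    print(result)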
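
Because calculate_commission only reads row['sales'], its tier boundaries can also be checked without any session at all. The following is a minimal sketch, assuming plain dicts are an acceptable stand-in for DataFrame rows; it repeats the function without the print calls so that it runs on its own, and the sample values are chosen for illustration.

```python
import math


def calculate_commission(row):
    # Same tiers as the quick-start UDF, repeated so this block is self-contained
    sales = row["sales"]
    if sales > 10000:
        rate = 0.15
    elif sales > 5000:
        rate = 0.10
    else:
        rate = 0.05
    return sales * rate


# Values on either side of each tier boundary (illustrative sample data)
samples = [{"sales": 5000}, {"sales": 5001}, {"sales": 10000}, {"sales": 10001}]
expected = [250.0, 500.1, 1000.0, 1500.15]

commissions = [calculate_commission(row) for row in samples]
for got, want in zip(commissions, expected):
    # Compare with a tolerance, since the rates are floating-point values
    assert math.isclose(got, want), (got, want)
print("all tier boundaries behave as expected")
```

A check like this complements breakpoint debugging: the comparisons are strict, so sales of exactly 5000 or 10000 fall into the lower tier, which is easy to overlook when stepping through a single row.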
    

Notes#

  • Performance differences: Local Debug Mode is intended for development and verification. Its execution time and resource usage do not reflect the production environment.

  • Data volume limit: UDFs run in the local Python environment, so use small sample datasets when debugging.

  • Dependency consistency: Make sure the local Python environment uses the same dependency versions as production.

  • Sensitive data: When debugging with data from MaxCompute tables, follow your data permission and data masking requirements.
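
For the dependency consistency note, a quick way to see what the local environment actually has installed is to query package metadata from the standard library. This is a minimal sketch: the helper installed_versions() and the package list are assumptions for illustration, and the versions to compare against have to come from your own production environment.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Return {name: version or None} for the interpreter and each package."""
    versions = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed locally
    return versions


# The package list is an assumption; replace it with what your UDFs import.
for name, version in installed_versions(["maxframe", "pandas", "numpy"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Comparing this output against the versions used in production before debugging helps rule out version drift as the cause of a behavior difference.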