AI Agent Instructions: GraFlag Method Integration
This document provides precise instructions for an AI agent to integrate new graph anomaly detection methods into the GraFlag benchmarking framework.
Quick Reference
| Item | Location |
|---|---|
| Methods directory | `graflag-shared/methods/` |
| Datasets directory | `graflag-shared/datasets/` |
| Libraries directory | `graflag-shared/libs/` (graflag_runner, graflag_bond, graflag_evaluator) |
| Existing examples | `methods/taddy/`, `methods/bond_cola/` |
| Entry point script | `graflag_runner` module (`python3 -m graflag_runner`) |
| CLI command | `graflag` |
Two Integration Patterns
GraFlag supports two patterns for running methods. Choose based on how the method consumes its parameters.
Pattern A: --pass-env-args (for methods using argparse)
The runner extracts `_`-prefixed env vars and passes them as CLI arguments to the method's command. Parameter names are lowercased by the runner:
- `_BATCH_SIZE=128` becomes `--batch_size 128`
- `_LEARNING_RATE=0.001` becomes `--learning_rate 0.001`
CMD ["python3", "-m", "graflag_runner", "--pass-env-args"]
Used by: taddy, generaldyg, dynwalk, strgnn, slade, gady, anograph, addgraph
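The conversion can be sketched as follows (a hypothetical helper for illustration; the actual graflag_runner implementation may differ):

```python
import os

def env_to_cli_args(environ=os.environ):
    """Sketch of Pattern A: turn _-prefixed env vars into CLI arguments."""
    args = []
    for key, value in environ.items():
        if not key.startswith('_'):
            continue                      # only _-prefixed vars are forwarded
        flag = '--' + key[1:].lower()     # _BATCH_SIZE -> --batch_size
        if value == '':
            args.append(flag)             # empty value: bare boolean flag
        else:
            args.extend([flag, value])
    return args
```

For example, `env_to_cli_args({'_BATCH_SIZE': '128'})` yields `['--batch_size', '128']`.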
Pattern B: Direct env var access (for library-based methods)
The method reads _-prefixed env vars directly via os.environ. No CLI argument conversion.
CMD ["python3", "-m", "graflag_runner"]
Used by: bond_cola, bond_ocgnn, bond_dominant, and all other bond_* methods (which use the graflag_bond library)
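In Pattern B the method itself pulls its hyperparameters from the environment. A minimal sketch (hypothetical helper; graflag_bond's actual parsing may differ):

```python
import os

def read_hparams(environ=os.environ):
    """Sketch of Pattern B: read _-prefixed hyperparameters directly."""
    hparams = {}
    for key, value in environ.items():
        if key.startswith('_'):
            hparams[key[1:].lower()] = value  # _HID_DIM -> 'hid_dim'
    return hparams

# e.g. hid_dim = int(read_hparams().get('hid_dim', 64))
```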
Integration Checklist
For each method, create/verify the following files:
graflag-shared/methods/{method_name}/
├── .env # REQUIRED: Configuration
├── Dockerfile # REQUIRED: Container definition
├── train_graflag.py # REQUIRED: GraFlag wrapper script
├── requirements.txt # OPTIONAL: Python dependencies
└── src/ # OPTIONAL: Original method source code
└── (cloned from GitHub at build time)
File 1: .env Configuration
Rules
- `METHOD_NAME`: Lowercase; alphanumeric and underscores only
- `COMMAND`: Entry point command (e.g., `python3 train_graflag.py` or `python3 -m graflag_bond.train`)
- `SUPPORTED_DATASETS`: Comma-separated list of compatible dataset names (supports wildcards like `bond_*`)
- **Hyperparameters**: ALL must be prefixed with `_` (underscore)
- **Parameter naming (Pattern A only)**: graflag_runner lowercases parameter names:
  - `_LR_G=0.001` is passed as `--lr_g 0.001`
  - `_BATCH_SIZE=128` is passed as `--batch_size 128`
- **Boolean parameters (Pattern A only)**: Use an empty value for True; omit the line entirely for False:
  - `_USE_MEMORY=` means the flag is present (True)
  - (omit line) means the flag is absent (False)
  - NEVER use `_USE_MEMORY=True` or `_USE_MEMORY=False`
- **Reserved variables** (set by the orchestrator, cannot be overridden): `DATA`, `EXP`, `METHOD_NAME`, `COMMAND`, `MONITOR_INTERVAL`
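The `SUPPORTED_DATASETS` wildcard matching can be sketched with the standard library (an illustrative helper, not the orchestrator's actual code):

```python
from fnmatch import fnmatch

def dataset_supported(dataset, supported_csv):
    """Check a dataset name against a SUPPORTED_DATASETS value (wildcards allowed)."""
    patterns = [p.strip() for p in supported_csv.split(',') if p.strip()]
    return any(fnmatch(dataset, p) for p in patterns)
```

For example, `dataset_supported('bond_inj_cora', 'bond_*')` is `True`, while `dataset_supported('uci', 'bond_*')` is `False`.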
Template (Pattern A)
METHOD_NAME={method_name}
DESCRIPTION={Brief description from paper}
SOURCE_CODE={GitHub URL}
SUPPORTED_DATASETS={dataset1},{dataset2}
COMMAND=python3 train_graflag.py
# === HYPERPARAMETERS ===
_EPOCHS=100
_BATCH_SIZE=128
_LEARNING_RATE=0.001
_HIDDEN_DIM=64
_SEED=42
Template (Pattern B – bond_* methods)
METHOD_NAME=bond_{detector}
DESCRIPTION={Detector description}
SOURCE_CODE={GitHub URL}
SUPPORTED_DATASETS=bond_*
COMMAND=python3 -m graflag_bond.train
_HID_DIM=64
_NUM_LAYERS=4
_DROPOUT=0
_WEIGHT_DECAY=0
_LR=0.004
_EPOCH=100
_GPU=0
_BATCH_SIZE=0
Real Example: TADDY
METHOD_NAME=taddy
DESCRIPTION=Anomaly Detection in Dynamic Graphs via Transformer
SOURCE_CODE=https://github.com/example/TADDY
COMMAND=python3 train_graflag.py
_ANOMALY_PER=0.1
_TRAIN_PER=0.4
_NEIGHBOR_NUM=20
_MAX_EPOCH=200
_BATCH_SIZE=128
_LEARNING_RATE=0.001
File 2: Dockerfile
Template (Pattern A)
FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04
ENV DEBIAN_FRONTEND=noninteractive
WORKDIR /app
# System dependencies
RUN apt-get update && apt-get install -y \
python3 python3-pip git \
&& rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir --upgrade pip
# PyTorch (adjust version based on method requirements)
RUN pip install --no-cache-dir \
torch torchvision torchaudio \
--index-url https://download.pytorch.org/whl/cu121
# Common dependencies
RUN pip install --no-cache-dir numpy scipy scikit-learn pandas networkx tqdm
# PyTorch Geometric (if needed)
# RUN pip install --no-cache-dir torch-geometric
# RUN pip install --no-cache-dir \
# torch-scatter torch-sparse \
# -f https://data.pyg.org/whl/torch-2.1.0+cu121.html
# Clone source code from GitHub
RUN git clone {github_url} src
# Copy GraFlag integration files
COPY methods/{method_name}/train_graflag.py ./
COPY methods/{method_name}/*.py ./
# Install graflag_runner library
COPY libs/ ./libs/
RUN pip install --no-cache-dir ./libs/graflag_runner
# Entry point with --pass-env-args
CMD ["python3", "-m", "graflag_runner", "--pass-env-args"]
Template (Pattern B – bond_* methods)
FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04
ENV DEBIAN_FRONTEND=noninteractive
WORKDIR /app
RUN apt-get update && apt-get install -y \
python3 python3-pip git \
&& rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir --upgrade pip
RUN pip install --no-cache-dir \
torch torchvision torchaudio \
--index-url https://download.pytorch.org/whl/cu121
RUN pip install --no-cache-dir torch-geometric pygod
# Install GraFlag libraries (runner + bond wrapper)
COPY libs/ ./libs/
RUN pip install --no-cache-dir ./libs/graflag_runner
RUN pip install --no-cache-dir ./libs/graflag_bond
# No --pass-env-args: graflag_bond reads env vars directly
CMD ["python3", "-m", "graflag_runner"]
Key Points
- **Build context**: The entire `graflag-shared/` directory is the build context, so `COPY libs/` and `COPY methods/` work correctly.
- **Base image**: Use an `nvidia/cuda` image matching the method's CUDA requirements. Common choices:
  - `nvidia/cuda:12.1.0-runtime-ubuntu22.04` (newer methods)
  - `nvidia/cuda:11.1.1-runtime-ubuntu20.04` (older methods)
- **graflag_runner**: Always install it; it handles the execution lifecycle, resource monitoring, and status tracking.
File 3: train_graflag.py
This is only needed for Pattern A methods. Pattern B methods use graflag_bond.train directly.
Critical Implementation Details
1. str2bool Helper (REQUIRED for boolean arguments)
def str2bool(v):
"""
Convert string to boolean for argparse compatibility.
graflag_runner passes: --flag True or --flag False
But argparse action='store_true' expects: --flag (no value)
This helper handles both patterns.
"""
if isinstance(v, bool):
return v
if v.lower() in ('yes', 'true', 't', 'y', '1', ''):
return True
elif v.lower() in ('no', 'false', 'f', 'n', '0'):
return False
else:
raise argparse.ArgumentTypeError('Boolean value expected.')
2. Argument Parsing (Handle case sensitivity)
def parse_args():
parser = argparse.ArgumentParser()
# Standard arguments
parser.add_argument('--data', type=str, default='dataset')
parser.add_argument('--seed', type=int, default=42)
# Numeric with aliases for case sensitivity
# If original uses --lr_G, add lowercase alias
parser.add_argument('--lr_g', '--lr_G', type=float, default=0.0001)
parser.add_argument('--lr_d', '--lr_D', type=float, default=0.0001)
# Boolean arguments -- MUST use str2bool
parser.add_argument('--use_memory', type=str2bool, nargs='?',
const=True, default=False)
parser.add_argument('--use_gpu', type=str2bool, nargs='?',
const=True, default=True)
return parser.parse_args()
3. Environment Variables
The orchestrator sets these environment variables in every container:
| Variable | Description |
|---|---|
| `DATA` | Input dataset directory |
| `EXP` | Experiment output directory |
| `METHOD_NAME` | Method identifier |
| `COMMAND` | Command from `.env` |
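A typical wrapper resolves these at startup; a minimal sketch (the helper name is illustrative):

```python
import os
from pathlib import Path

def get_paths(environ=os.environ):
    """Resolve the orchestrator-provided input/output directories."""
    data_dir = environ.get('DATA')
    exp_dir = environ.get('EXP')
    if not data_dir or not exp_dir:
        raise ValueError("DATA and EXP environment variables must be set")
    data_path = Path(data_dir)
    # The directory name doubles as the dataset identifier, e.g. 'uci'
    return data_path, Path(exp_dir), data_path.name
```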
4. ResultWriter API
from graflag_runner import ResultWriter
writer = ResultWriter() # Auto-reads EXP env var
# Add metadata (call before or after save_scores)
writer.add_metadata(method_name="taddy", dataset="uci", learning_rate=0.001)
# Add resource metrics (optional, also set automatically by graflag_runner)
writer.add_resource_metrics(
exec_time_ms=45230.15,
peak_memory_mb=2048.5,
peak_gpu_mb=4096.0 # optional
)
# Track training progress (creates training.csv)
writer.spot("training", epoch=1, loss=0.5, auc=0.85)
writer.spot("training", epoch=2, loss=0.3, auc=0.90)
# Track validation metrics (creates validation.csv)
writer.spot("validation", val_loss=0.4, val_auc=0.88)
# Save final scores
writer.save_scores(
result_type="EDGE_STREAM_ANOMALY_SCORES",
scores=scores_list,
ground_truth=labels_list,
)
# Finalize (writes results.json)
writer.finalize()
For large results, use streaming to avoid memory issues:
from graflag_runner import ResultWriter, StreamableArray
writer.save_scores(
result_type="NODE_ANOMALY_SCORES",
scores=StreamableArray(score_generator()), # Writes row-by-row
)
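A score generator for this pattern might look like the following (a sketch; the generator name and batching are assumptions, not part of the ResultWriter API):

```python
import numpy as np

def score_generator(batches):
    """Yield anomaly scores one at a time so StreamableArray can write
    row-by-row without materializing the full result in memory."""
    for batch in batches:
        for score in np.asarray(batch).ravel():
            yield float(score)
```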
5. Result Saving (CRITICAL)
# CRITICAL RULES:
# 1. Use TEST data for evaluation (contains anomalies)
# 2. Training data typically has all zeros (no anomalies)
# 3. Always include ground_truth
# 4. Choose correct result_type
writer.save_scores(
result_type=result_type,
scores=scores if isinstance(scores, list) else scores.tolist(),
ground_truth=labels if isinstance(labels, list) else labels.tolist(),
)
Complete Template
"""
GraFlag integration for {MethodName}.
{Description}
Source: {GitHub URL}
"""
import os
import sys
import argparse
import logging
from pathlib import Path
import numpy as np
import torch
# Add original source to path
sys.path.insert(0, 'src')
# Import original method modules
# from your_module import YourModel, YourDataLoader
from graflag_runner import ResultWriter
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
def str2bool(v):
"""Convert string to boolean for argparse compatibility."""
if isinstance(v, bool):
return v
if v.lower() in ('yes', 'true', 't', 'y', '1', ''):
return True
elif v.lower() in ('no', 'false', 'f', 'n', '0'):
return False
else:
raise argparse.ArgumentTypeError('Boolean value expected.')
def parse_args():
parser = argparse.ArgumentParser('{MethodName} GraFlag Integration')
# === DATA ===
parser.add_argument('--data', type=str, default='dataset')
parser.add_argument('--seed', type=int, default=42)
# === MODEL ===
parser.add_argument('--hidden_dim', type=int, default=64)
parser.add_argument('--num_layers', type=int, default=2)
# === TRAINING ===
parser.add_argument('--batch_size', '--bs', type=int, default=128)
parser.add_argument('--epochs', '--n_epoch', type=int, default=100)
parser.add_argument('--lr', '--learning_rate', type=float, default=0.001)
# === BOOLEAN FLAGS (use str2bool) ===
parser.add_argument('--use_feature', type=str2bool, nargs='?',
const=True, default=False)
# === METHOD-SPECIFIC ===
# Add parameters from original method's argparse
return parser.parse_args()
def main():
print("=" * 60)
print("{MethodName} - GraFlag Integration")
print("=" * 60)
args = parse_args()
# Get paths from environment
data_dir = os.environ.get('DATA')
if not data_dir:
raise ValueError("DATA environment variable not set")
data_path = Path(data_dir)
dataset_name = data_path.name
print(f"\nConfiguration:")
print(f" Dataset: {dataset_name}")
print(f" Data Path: {data_path}")
for k, v in vars(args).items():
print(f" {k}: {v}")
print()
# Set seeds
np.random.seed(args.seed)
torch.manual_seed(args.seed)
if torch.cuda.is_available():
torch.cuda.manual_seed(args.seed)
# Initialize ResultWriter
writer = ResultWriter()
try:
# =============================================
# IMPLEMENT: Load data
# =============================================
# data = YourDataLoader(data_path, dataset_name)
# =============================================
# IMPLEMENT: Train model with writer.spot()
# =============================================
# for epoch in range(args.epochs):
# loss = train_epoch(model, data)
# writer.spot("training", epoch=epoch+1, loss=loss)
# =============================================
# IMPLEMENT: Generate predictions on TEST data
# =============================================
# CRITICAL: Use test split that contains anomalies!
# scores = model.predict(data.test)
# labels = data.test_labels
scores = [] # Replace with actual
labels = [] # Replace with actual
# =============================================
# Save results
# =============================================
result_type = "NODE_ANOMALY_SCORES" # Adjust per method
writer.save_scores(
result_type=result_type,
scores=scores,
ground_truth=labels,
)
writer.add_metadata(
method_name="{method_name}",
dataset=dataset_name,
**vars(args),
)
writer.finalize()
print("\n" + "=" * 60)
print("[OK] Results saved successfully")
print("=" * 60)
except Exception as e:
logger.error(f"Error: {e}", exc_info=True)
raise
if __name__ == "__main__":
main()
File 4: Dataset Directory
Rules
- **Naming**: Use descriptive names (e.g., `uci`, `btc_alpha`, `bond_inj_cora`)
- **NO SYMLINKS**: Use actual file copies (symlinks break on the NFS-mounted cluster)
- **Include README.md**: Document the data format and source
Structure
graflag-shared/datasets/{dataset_name}/
├── README.md # Dataset description
└── {data_files} # Actual data files (CSV, NPZ, etc.)
Examples
Existing dataset names in the platform:
- `bond_inj_cora`, `bond_inj_amazon`, `bond_inj_flickr` (injection-based anomalies)
- `bond_books`, `bond_disney`, `bond_enron`, `bond_reddit`, `bond_weibo` (real-world)
- `bond_gen_100`, `bond_gen_500`, `bond_gen_1000`, `bond_gen_5000`, `bond_gen_10000` (synthetic)
- `btc_alpha`, `btc_otc` (cryptocurrency networks)
- `uci` (social network)
Result Types Reference
| Method Output | Result Type | `scores` Format |
|---|---|---|
| Static node scores | `NODE_ANOMALY_SCORES` | |
| Static edge scores | | |
| Static graph scores | | |
| Dynamic node (snapshots) | | |
| Dynamic edge (snapshots) | | |
| Dynamic graph (snapshots) | | |
| Streaming nodes | | |
| Streaming edges | `EDGE_STREAM_ANOMALY_SCORES` | |
| Streaming graphs | | |
Special score values:
- `-1`: Unknown/unassigned
- `-2`: Inactive/unseen at this time step
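A snapshot's score row might be assembled using these sentinels as follows (an illustrative helper, assuming per-node score dicts; not part of the GraFlag API):

```python
def pad_snapshot_scores(scores_by_node, all_nodes, seen_nodes):
    """Assemble one snapshot's score row using the special values:
    -1 for nodes with no assigned score, -2 for nodes not yet seen."""
    row = []
    for node in all_nodes:
        if node not in seen_nodes:
            row.append(-2)                    # inactive/unseen at this time step
        elif node not in scores_by_node:
            row.append(-1)                    # unknown/unassigned
        else:
            row.append(scores_by_node[node])  # actual anomaly score
    return row
```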
Common Errors and Fixes
Error 1: “unrecognized arguments: --param_name value”
Cause: Parameter name mismatch (case sensitivity)
Fix: Add lowercase alias in argparse:
# .env has: _LR_G=0.001
# graflag_runner passes: --lr_g 0.001
# Original expects: --lr_G
parser.add_argument('--lr_g', '--lr_G', type=float, default=0.001)
Error 2: “unrecognized arguments: True” or “False”
Cause: Boolean using action='store_true'
Fix: Use str2bool:
# WRONG:
parser.add_argument('--flag', action='store_true')
# RIGHT:
parser.add_argument('--flag', type=str2bool, nargs='?', const=True, default=False)
Error 3: “File not found” on cluster
Cause: Dataset contains symlinks
Fix: Replace symlinks with actual files:
# Find symlinks
find graflag-shared/datasets/ -type l
# Replace with actual files
cp --remove-destination /actual/path/to/file graflag-shared/datasets/dataset_name/file
Error 4: “AUC is null”
Cause: Using training data (all labels = 0, no anomalies)
Fix: Use TEST data that contains anomalies:
# WRONG: Using training snapshots
for snap in data['snap_train']:
scores.append(predict(snap))
# RIGHT: Using test snapshots with injected anomalies
for snap in data['snap_test']:
scores.append(predict(snap))
Error 5: “ValueError: setting array element with sequence”
Cause: Ragged arrays (different lengths per timestamp)
Status: Handled by graflag_evaluator (flattens ragged arrays automatically)
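Conceptually, the flattening amounts to the following (a sketch of the behavior, not graflag_evaluator's actual code):

```python
from itertools import chain

def flatten_ragged(scores_per_snapshot, labels_per_snapshot):
    """Concatenate ragged per-snapshot score/label lists into flat,
    aligned lists, as the evaluator does before computing metrics."""
    flat_scores = list(chain.from_iterable(scores_per_snapshot))
    flat_labels = list(chain.from_iterable(labels_per_snapshot))
    assert len(flat_scores) == len(flat_labels), "scores/labels misaligned"
    return flat_scores, flat_labels
```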
Testing Commands
# 1. Sync method to cluster
graflag sync --path methods/{method_name}
# 2. Build and run
graflag run -m {method_name} -d {dataset_name} --build
# 3. Check logs (follow in real-time)
graflag logs -e exp__{method_name}__{dataset_name}__TIMESTAMP -f
# 4. Stop if needed
graflag stop -e exp__{method_name}__{dataset_name}__TIMESTAMP
# 5. Evaluate results
graflag evaluate -e exp__{method_name}__{dataset_name}__TIMESTAMP
# 6. Run with custom parameters
graflag run -m {method_name} -d {dataset_name} --params EPOCHS=50 BATCH_SIZE=64
Agent Prompt Template
Use this prompt to instruct an AI agent to integrate methods:
# Task: Integrate Graph Anomaly Detection Methods into GraFlag
## Methods to Integrate
| Method | Paper | GitHub | Description |
|--------|-------|--------|-------------|
| {method1} | {paper1} | {github1} | {desc1} |
| {method2} | {paper2} | {github2} | {desc2} |
## Instructions
For EACH method in the list above, perform the following steps:
### Step 1: Analyze Original Repository
1. Examine the GitHub repository structure
2. Identify:
- Main training script and entry point
- Data loading code and expected data format
- Model architecture files
- All configurable hyperparameters (check argparse)
- Required Python dependencies
- CUDA/PyTorch version requirements
### Step 2: Choose Integration Pattern
- **Pattern A** (`--pass-env-args`): If the method uses argparse for configuration.
The runner converts `_PARAM=value` env vars to `--param value` CLI args.
- **Pattern B** (direct env): If the method is library-based (e.g., PyGOD via `graflag_bond`).
The method reads env vars directly.
### Step 3: Create Method Directory
Create `graflag-shared/methods/{method_name}/` with these files:
#### 3.1: `.env` File
- Set METHOD_NAME (lowercase, alphanumeric, underscores)
- Set DESCRIPTION from paper abstract
- Set SOURCE_CODE to GitHub URL
- Set SUPPORTED_DATASETS to compatible dataset names (comma-separated, wildcards ok)
- Set COMMAND (e.g., `python3 train_graflag.py` or `python3 -m graflag_bond.train`)
- Add ALL hyperparameters with `_` prefix
- For booleans (Pattern A): empty value = True, omit = False
- Remember: parameter names are lowercased by graflag_runner (Pattern A only)
#### 3.2: `Dockerfile`
- Base: `nvidia/cuda` (version matching method requirements)
- Install all Python dependencies
- Clone source code: `RUN git clone {github_url} src`
- Copy train_graflag.py and helper files
- Install graflag_runner: `COPY libs/ ./libs/ && RUN pip install --no-cache-dir ./libs/graflag_runner`
- Install graflag_bond too if using Pattern B: `RUN pip install --no-cache-dir ./libs/graflag_bond`
- CMD: `["python3", "-m", "graflag_runner", "--pass-env-args"]` (Pattern A) or `["python3", "-m", "graflag_runner"]` (Pattern B)
#### 3.3: `train_graflag.py` (Pattern A only)
- Add `str2bool()` helper function at the top
- Implement `parse_args()`:
- Match ALL arguments from original code
- Add lowercase aliases for case-sensitive parameters
- Use `str2bool` for ALL boolean arguments
- Read data from `os.environ.get('DATA')` path
- Import and use original method's model/training code
- Call `writer.spot("training", ...)` during training loop
- Generate predictions on TEST data (contains anomalies)
- Call `writer.save_scores()` with correct result_type and ground_truth
- Call `writer.add_metadata()` with all hyperparameters
- Call `writer.finalize()`
### Step 4: Create Dataset Directory (if needed)
Create `graflag-shared/datasets/{dataset_name}/`:
- Copy actual data files (NO symlinks)
- Create README.md with dataset description
### Step 5: Verify Integration
Report the following for each method:
- [ ] `.env` file created with all parameters
- [ ] `Dockerfile` created with correct dependencies
- [ ] Integration script created (train_graflag.py or graflag_bond)
- [ ] Dataset directory created with actual files
- [ ] Result type determined based on method output
## Critical Requirements
1. **str2bool**: ALL boolean arguments MUST use the str2bool helper (Pattern A)
2. **Case sensitivity**: Add lowercase aliases for arguments like `--lr_G` -> `--lr_g`
3. **No symlinks**: Dataset files must be actual copies
4. **TEST data**: Predictions must be on test data with anomalies, not training data
5. **ground_truth**: Always include ground_truth in save_scores()
6. **Reserved vars**: Never override DATA, EXP, METHOD_NAME, COMMAND, MONITOR_INTERVAL
## Reference Implementations
Study these existing integrations:
- `methods/taddy/` -- Pattern A, temporal edge anomaly detection
- `methods/generaldyg/` -- Pattern A, dynamic graph anomaly detection
- `methods/bond_cola/` -- Pattern B, contrastive self-supervised (PyGOD via graflag_bond)
- `methods/bond_dominant/` -- Pattern B, deep matrix factorization (PyGOD via graflag_bond)
## Workspace Paths
- Methods: `graflag-shared/methods/`
- Datasets: `graflag-shared/datasets/`
- Libraries: `graflag-shared/libs/` (graflag_runner, graflag_bond, graflag_evaluator)
Example: Complete GADY Integration (Pattern A)
.env
METHOD_NAME=gady
DESCRIPTION=GADY: Unsupervised Anomaly Detection on Dynamic Graphs
SOURCE_CODE=https://github.com/CuiYu-Coder/GADY
SUPPORTED_DATASETS=gady_uci,gady_btc_alpha
COMMAND=python3 train_graflag.py
_LR_G=0.0001
_LR_D=0.0001
_BS=64
_N_EPOCH=100
_HIDDEN_DIM=64
_EMBED_DIM=256
_SEED=42
_USE_MEMORY=
train_graflag.py (key parts)
def str2bool(v):
if isinstance(v, bool):
return v
if v.lower() in ('yes', 'true', 't', 'y', '1', ''):
return True
elif v.lower() in ('no', 'false', 'f', 'n', '0'):
return False
raise argparse.ArgumentTypeError('Boolean value expected.')
def parse_args():
parser = argparse.ArgumentParser()
# Lowercase aliases for graflag_runner compatibility
parser.add_argument('--lr_g', '--lr_G', type=float, default=0.0001)
parser.add_argument('--lr_d', '--lr_D', type=float, default=0.0001)
parser.add_argument('--bs', '--batch_size', type=int, default=64)
# Boolean with str2bool
parser.add_argument('--use_memory', type=str2bool, nargs='?',
const=True, default=True)
return parser.parse_args()
Example: bond_cola Integration (Pattern B)
.env
METHOD_NAME=bond_cola
DESCRIPTION=Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning
SOURCE_CODE=https://github.com/pygod-team/pygod
SUPPORTED_DATASETS=bond_*
COMMAND=python3 -m graflag_bond.train
_HID_DIM=64
_NUM_LAYERS=4
_DROPOUT=0
_WEIGHT_DECAY=0
_LR=0.004
_EPOCH=100
_GPU=0
_BATCH_SIZE=0
Dockerfile
FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04
ENV DEBIAN_FRONTEND=noninteractive
WORKDIR /app
RUN apt-get update && apt-get install -y python3 python3-pip git \
&& rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir --upgrade pip
RUN pip install --no-cache-dir \
torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
RUN pip install --no-cache-dir torch-geometric pygod
COPY libs/ ./libs/
RUN pip install --no-cache-dir ./libs/graflag_runner
RUN pip install --no-cache-dir ./libs/graflag_bond
CMD ["python3", "-m", "graflag_runner"]
No train_graflag.py is needed: graflag_bond.train handles everything via the BondDetector registry, which discovers PyGOD detector classes automatically.
Summary
1. **Choose pattern**: Pattern A (`--pass-env-args`) for argparse methods, Pattern B (direct env) for library methods
2. **.env**: Lowercase method name, `_` prefix for params, `SUPPORTED_DATASETS` for compatibility
3. **Dockerfile**: CUDA base, install graflag_runner (and graflag_bond if Pattern B), correct CMD
4. **train_graflag.py** (Pattern A only): `str2bool` helper, lowercase aliases, TEST data for predictions, ResultWriter for output
5. **Dataset**: Actual files (no symlinks)
Key rule: If the original method works but GraFlag integration fails, the issue is almost always argument parsing (case sensitivity or boolean handling).