# L2 Memory Allocator

Generate **tensor allocation and spilling** information for **L2 memory** from an ONNX model. This script parses an ONNX graph, builds a DAG of ops/tensors, and uses a graph memory scheduler (best‑fit strategy) to allocate tensors in L2. It produces three outputs: an **allocation summary**, an **L2 fusion JSON** describing where each tensor in the ONNX graph is placed, and a **visualization JSON** recording which tensors are allocated, deallocated, or spilled as each operator in the computation graph executes.

---

## What this tool does

- Loads an ONNX model and removes ignorable/no-op operators.
- Constructs a **Directed Acyclic Graph (DAG)** of operations and tensors.
- Schedules tensors into L2 memory using **Best‑Fit** allocation.
- Emits a summary of allocations (peak usage, fragmentation, spills, etc.).
- Optionally writes **allocation events** JSON for downstream visualization or analysis.
- Writes **fusion tiling** JSON reflecting the final allocation decisions.
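
The best-fit step can be illustrated with a toy allocator. This is a minimal sketch, not the scheduler used by the script: the class and method names (`BestFitAllocator`, `allocate`, `release`) are hypothetical, and it assumes L2 is a flat byte range with no alignment constraints.

```python
from dataclasses import dataclass


@dataclass
class FreeBlock:
    offset: int
    size: int


class BestFitAllocator:
    """Toy best-fit allocator over a flat address range (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        self.free = [FreeBlock(0, capacity)]
        self.live = {}  # tensor name -> (offset, size)

    def allocate(self, name: str, size: int):
        # Best fit: pick the smallest free block that is still large enough.
        candidates = [b for b in self.free if b.size >= size]
        if not candidates:
            return None  # the caller would treat this as a spill
        block = min(candidates, key=lambda b: b.size)
        offset = block.offset
        block.offset += size
        block.size -= size
        if block.size == 0:
            self.free.remove(block)
        self.live[name] = (offset, size)
        return offset

    def release(self, name: str) -> None:
        offset, size = self.live.pop(name)
        self.free.append(FreeBlock(offset, size))
        self.free.sort(key=lambda b: b.offset)
        # Merge adjacent free blocks to limit fragmentation.
        merged = [self.free[0]]
        for blk in self.free[1:]:
            last = merged[-1]
            if last.offset + last.size == blk.offset:
                last.size += blk.size
            else:
                merged.append(blk)
        self.free = merged
```

In a scheduler, `allocate` would be called when an operator produces a tensor and `release` once the tensor's last consumer has run; a `None` result marks the tensor as spilled.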

---

## Quick start

```bash
python graph/L2_fusion.py \
  --model_path /path/to/model.onnx \
  --model_dtype TensorProto.INT8 \
  --fusion_json_path ./L2_fusion_tiling.json \
  --write_allocation_events \
  --allocation_events_path ./L2_fusion_allocation_events.json \
  --verbosity INFO
```

If you run with **no arguments**, defaults are used:

- `model_path`: `./ResNet50_INT8_Model.onnx`
- `model_dtype`: `TensorProto.INT8`
- `fusion_json_path`: `./L2_fusion_tiling.json`
- `allocation_events_path`: `./L2_fusion_allocation_events.json`
- `verbosity`: `INFO`

---

## Command-line arguments

- `-m, --model_path`
  Path to the input **ONNX** model. _(Default: `./ResNet50_INT8_Model.onnx`)_

- `-d, --model_dtype`
  Model/tensor data type used by the scheduler (e.g., `TensorProto.INT8`, `TensorProto.FLOAT16`).
  The string must be understood by your `Tensor` wrapper. _(Default: `TensorProto.INT8`)_

- `-j, --fusion_json_path`
  Output path for the **L2 fusion tiling** JSON file. _(Default: `./L2_fusion_tiling.json`)_

- `--write_allocation_events`
  If set, writes allocation events JSON (per scheduling step/operation) for visualization.

- `--allocation_events_path`
  Output path for the **allocation events** JSON file. _(Default: `./L2_fusion_allocation_events.json`)_

- `-v, --verbosity`
  Logging level: `DEBUG` or `INFO`. _(Default: `INFO`)_
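
For reference, the flags above could be declared with `argparse` roughly as follows. This is a sketch reconstructed from the documented options and defaults, not a copy of the script's parser; the `build_parser` helper is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented options and defaults (illustrative only)."""
    parser = argparse.ArgumentParser(description="L2 memory allocator (sketch)")
    parser.add_argument("-m", "--model_path", default="./ResNet50_INT8_Model.onnx")
    parser.add_argument("-d", "--model_dtype", default="TensorProto.INT8")
    parser.add_argument("-j", "--fusion_json_path", default="./L2_fusion_tiling.json")
    # Boolean switch: False unless the flag is present.
    parser.add_argument("--write_allocation_events", action="store_true")
    parser.add_argument(
        "--allocation_events_path",
        default="./L2_fusion_allocation_events.json",
    )
    parser.add_argument("-v", "--verbosity", choices=["DEBUG", "INFO"], default="INFO")
    return parser
```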

---

## Tips & notes

- **Ignored ops:** Make sure `IGNORED_OPS` covers graph decorations (e.g., Identity, Quantize/Dequantize, NoOps) that don’t affect memory liveness.
- **Data type:** The `model_dtype` string should name a data type defined by `onnx.TensorProto` (e.g., `TensorProto.INT8`); if unsure, check the ONNX documentation.

---

## Troubleshooting

- **Scheduler fails / unexpected spills**

  - Run with `-v DEBUG` to see step-by-step decisions.
  - Check that the tensor shapes in the model and the `--model_dtype` passed on the command line are correct.

---

# ONNX Graph Cleaner

This script simplifies ONNX graphs by removing L1-fused nodes and collapsing shape-calculation subgraphs that are used only for `Resize`. The goal is to make models easier to use and visualize.

## Functionality

- Removes redundant nodes: `Relu`, `LeakyRelu`, `QuantizeLinear`, `DequantizeLinear`.
- Detects and removes `Shape → Gather → Cast → Slice → Mul → Cast → Concat` subgraphs that feed into `Resize`, replacing them with constant `sizes` inputs.
- Cleans up dangling nodes, initializers, and inputs after removals.
- Produces both an updated ONNX model and a JSON file listing removed nodes.
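
The node-removal step amounts to dropping a node and reconnecting its consumers to its data input. The sketch below shows the idea on plain dictionaries instead of ONNX `NodeProto` objects; the function name and the dictionary layout are hypothetical, and it assumes the node list is in topological order and that the first input of a removed node is its data input.

```python
REMOVABLE = {"Relu", "LeakyRelu", "QuantizeLinear", "DequantizeLinear"}


def remove_passthrough_nodes(nodes, removable=REMOVABLE):
    """Drop removable nodes and rewire consumers to the removed node's data input.

    `nodes` is a topologically ordered list of dicts with the keys
    "name", "op_type", "inputs", and "outputs" (a stand-in for ONNX nodes).
    Returns (kept_nodes, removed_node_names).
    """
    rename = {}  # output tensor of a removed node -> tensor that replaces it
    kept, removed = [], []
    for node in nodes:
        # Apply earlier renames first so chains of removed nodes collapse.
        inputs = [rename.get(t, t) for t in node["inputs"]]
        if node["op_type"] in removable:
            # Extra inputs (e.g., scale and zero point) go away with the node.
            rename[node["outputs"][0]] = inputs[0]
            removed.append(node["name"])
        else:
            kept.append({**node, "inputs": inputs})
    return kept, removed
```

A real implementation would also need to rename graph outputs produced by removed nodes and prune initializers that are no longer referenced.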

## Usage

```bash
python clean_onnx_graph.py --input <path_to_model.onnx>
```

### Arguments

- `--input` (required): Path to the input ONNX model.
- `--output` (optional): Path to save the output ONNX model.
  If not provided, the output is written to the same directory as the input model with the `_L2_graph.onnx` suffix.

### Examples

```bash
# Create cleaned model and JSON in the same directory as the input
python clean_onnx_graph.py --input yolov3/YoloV3_INT8_Model.onnx

# Explicitly set output path
python clean_onnx_graph.py --input yolov3/YoloV3_INT8_Model.onnx --output yolov3/YoloV3_Cleaned.onnx
```

## Outputs

- Cleaned ONNX model: `<input_basename>_L2_graph.onnx` by default, or the file specified with `--output`.
- JSON file mapping removed nodes: `<input_basename>_removed_L1_nodes.json`.
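
The naming rules above can be expressed with `pathlib`. This is a sketch under assumptions: `default_output_paths` is a hypothetical helper, and it assumes the JSON file is always written next to the input model, even when `--output` points elsewhere.

```python
from pathlib import Path
from typing import Optional, Tuple


def default_output_paths(input_path: str, output: Optional[str] = None) -> Tuple[Path, Path]:
    """Return (cleaned_model_path, removed_nodes_json_path)."""
    src = Path(input_path)
    # Without --output, reuse the input directory and basename.
    model_out = Path(output) if output else src.with_name(f"{src.stem}_L2_graph.onnx")
    # Assumption: the JSON always lands beside the input model.
    json_out = src.with_name(f"{src.stem}_removed_L1_nodes.json")
    return model_out, json_out
```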

---

# L2/L3 Memory Allocator

Generate **tensor allocation and spilling** information for **L2 and L3 memory** from an ONNX model. This script parses an ONNX graph, builds a DAG of ops/tensors, and uses a graph memory scheduler (best‑fit strategy) to allocate tensors in L2 and L3. It produces an **allocation summary** and an **L2/L3 fusion JSON** describing where each tensor in the ONNX graph is placed.

---

## What this tool does

- Loads an ONNX model and removes ignorable/no-op operators.
- Constructs a **Directed Acyclic Graph (DAG)** of operations and tensors.
- Schedules tensors into L2 memory using **Best‑Fit** allocation.
- Schedules tensors spilled from L2 to L3 using **Best‑Fit** allocation.
- Emits a summary of allocations (peak usage, fragmentation, spills, etc.).
- Writes **fusion tiling** JSON reflecting the final allocation decisions.
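
The two-level scheme (try L2 first, then place spilled tensors in L3) can be sketched as below. This is an illustration under simplifying assumptions, not the script's scheduler: the function names are hypothetical, tensor lifetimes are ignored so nothing is ever freed, and both memories are modeled as flat byte ranges.

```python
def best_fit(free, size):
    """Carve `size` bytes from the smallest adequate block in `free`.

    `free` is a list of [offset, length] pairs and is modified in place.
    Returns the chosen offset, or None when no block is large enough.
    """
    fits = [blk for blk in free if blk[1] >= size]
    if not fits:
        return None
    blk = min(fits, key=lambda b: b[1])
    offset = blk[0]
    blk[0] += size
    blk[1] -= size
    return offset


def place_tensors(tensors, l2_size, l3_size):
    """Place each (name, size) in L2 if possible, otherwise spill it to L3."""
    l2_free, l3_free = [[0, l2_size]], [[0, l3_size]]
    placement = {}
    for name, size in tensors:
        level, offset = "L2", best_fit(l2_free, size)
        if offset is None:
            # L2 cannot hold the tensor: spill it to L3.
            level, offset = "L3", best_fit(l3_free, size)
        if offset is None:
            raise MemoryError(f"{name} ({size} bytes) fits in neither L2 nor L3")
        placement[name] = (level, offset)
    return placement
```

For example, with a 1024-byte L2 and tensors of 600, 600, and 300 bytes, the second tensor spills to L3 while the third still fits in the remaining L2 space.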

---

## Quick start

```bash
python graph/L2L3_fusion.py \
  --model_path /path/to/model.onnx \
  --fusion_json_path ./L2_fusion_tiling.json \
  --verbosity INFO
```

If you run with **no arguments**, defaults are used:

- `model_path`: `./ResNet50_INT8_Model.onnx`
- `fusion_json_path`: `./L2_fusion_tiling.json`
- `verbosity`: `INFO`
- `both_l2l3`: `false`
- `c64`: `false`
- `l1_fuse`: `false`

---

## Command-line arguments

- `-m, --model_path`
  Path to the input **ONNX** model. _(Default: `./ResNet50_INT8_Model.onnx`)_

- `-j, --fusion_json_path`
  Output path for the **L2 fusion tiling** JSON file. _(Default: `./L2_fusion_tiling.json`)_

- `-v, --verbosity`
  Logging level: `DEBUG` or `INFO`. _(Default: `INFO`)_

- `--both_l2l3`
  Allocate tensors in both L2 and L3. _(Default: `false`)_

- `--c64`
  Make the channel (C) dimension of each tensor a multiple of 64. _(Default: `false`)_

- `--l1_fuse`
  Attempt L1 fusion. _(Default: `false`)_
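
As an illustration of what `--c64` implies for tensor sizes, the sketch below rounds the channel dimension up to the next multiple of 64. The helper name `pad_channels` is hypothetical, and the sketch assumes a channels-last (NHWC) layout and that the adjustment is a round-up; the script may handle this differently.

```python
def pad_channels(shape, multiple=64, channel_axis=-1):
    """Return `shape` with the channel dimension rounded up to `multiple`."""
    padded = list(shape)
    c = padded[channel_axis]
    # Ceiling division without floats: -(-c // m) equals ceil(c / m).
    padded[channel_axis] = -(-c // multiple) * multiple
    return tuple(padded)
```

Under these assumptions, a `(1, 56, 56, 3)` tensor is sized as `(1, 56, 56, 64)`, which can raise the tensor's L2 footprint considerably for shallow layers.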

---

## Examples

```bash
# ResNet50
python L2L3_allocator.py -m ResNet50_INT8_Model.onnx --c64 --l1_fuse

# YoloV3
python L2L3_allocator.py -m yolov3/YoloV3_INT8_Model_cleaned_graph.onnx --c64

# PSU0
python L2L3_allocator.py -m psu0/Model-PSU0-QDQ-v2.4.0_mod_nhwc_fused.onnx --c64

# PSD3
python L2L3_allocator.py -m psd3/PSD3.quant_mod_nhwc_fused.onnx --c64

# PSD5
python L2L3_allocator.py -m psd5/model_mod_nhwc_fused.onnx --c64
```

---

## Troubleshooting

- **Scheduler fails / unexpected spills**

  - Run with `-v DEBUG` to see step-by-step decisions.
  - Check that the tensor shapes and dtypes in the input model are correct.
