# VCD Analysis Tools

This directory contains two VCD (Value Change Dump) analysis tools for extracting timing and efficiency metrics from simulation traces.

## Tools Overview

### 1. `vcdanalyze.py` - Serial Processing (Original)

- **Purpose**: Analyze VCD files to extract timing metrics and MAC efficiency
- **Processing**: Sequential, single-threaded
- **Best for**: Small to medium VCD files, maximum compatibility
- **Portability**: High - uses standard Python libraries only

### 2. `vcdanalyze_parallel.py` - Parallel Processing (Fast)

- **Purpose**: Same analysis as original but with parallel processing
- **Processing**: Multi-process, using memory mapping and process pools
- **Best for**: Large VCD files (>100MB), faster analysis
- **Portability**: Limited - uses `mmap` which may not work on all systems

## What the Tools Analyze

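Both tools extract the following metrics (the parallel version computes the two MAC metrics only when `--with_efficiency` is passed):

1. **E2E Cycles**: End-to-end time in ns, from the earliest point at which a monitored signal goes high to the latest point at which one goes low
2. **Instr-Matrix MAC Efficiency**: Total time (ns) the instruction matrix signals are high
3. **Instr-Vector MAC Efficiency**: Total time (ns) the instruction vector signals are high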

The analysis targets specific DMA channel signals and instruction trace signals in AIE (AI Engine) simulation traces.
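The metric definitions above can be illustrated with a minimal sketch. It assumes each signal has already been parsed into a list of `(timestamp_ns, value)` changes; the function names are hypothetical, and the real tools may handle edge cases (for example, a signal that never goes low) differently.

```python
from typing import Iterable, List, Optional, Tuple

# One value change: (timestamp_ns, value), where value is 1 (high) or 0 (low).
# This in-memory shape is an assumption made for illustration only.
Change = Tuple[int, int]


def high_intervals(changes: Iterable[Change]) -> List[Tuple[int, int]]:
    """Return (rise_ns, fall_ns) for every completed high pulse."""
    intervals = []
    rise: Optional[int] = None
    for t, value in sorted(changes):
        if value == 1 and rise is None:
            rise = t
        elif value == 0 and rise is not None:
            intervals.append((rise, t))
            rise = None
    # A pulse that never falls is ignored in this sketch.
    return intervals


def e2e_ns(signals: Iterable[Iterable[Change]]) -> Optional[int]:
    """Earliest rising edge to latest falling edge across all signals."""
    pulses = [p for changes in signals for p in high_intervals(changes)]
    if not pulses:
        return None
    return max(fall for _, fall in pulses) - min(rise for rise, _ in pulses)


def total_high_ns(changes: Iterable[Change]) -> int:
    """Total time a single signal spends high."""
    return sum(fall - rise for rise, fall in high_intervals(changes))


dma_a = [(12345, 1), (20000, 0)]
dma_b = [(15000, 1), (67890, 0)]
instr = [(100, 1), (500, 0), (900, 1), (1700, 0)]
print(e2e_ns([dma_a, dma_b]))  # 55545
print(total_high_ns(instr))    # 1200
```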

## Usage

### Single VCD File Analysis

```bash
# Serial version
python vcdanalyze.py --vcd trace.vcd

# Parallel version (faster for large files)
python vcdanalyze_parallel.py --vcd trace.vcd --workers 8
```

### Batch Processing (Directory of VCD files)

```bash
# Serial version
python vcdanalyze.py --dir Output_folder/

# Parallel version
python vcdanalyze_parallel.py --dir Output_folder/ --workers 8
```

### Options

#### Common Options (Both Tools)

- `--vcd <file>`: Process a single VCD file (`.vcd` or `.vcd.gz`)
- `--dir <directory>`: Process the VCD files found in each subdirectory of `<directory>`
- `--from_simnow`: Use the signal prefix for simnow traces

#### Parallel Version Only

- `--workers <N>`: Number of parallel workers
- `--with_efficiency`: Compute instruction MAC efficiency (disabled by default for speed)

### Examples

```bash
# Basic analysis of single file
python vcdanalyze.py --vcd simulation_results/trace.vcd

# Fast parallel analysis of large file
python vcdanalyze_parallel.py --vcd large_trace.vcd.gz --workers 16

# Batch process all results with efficiency metrics
python vcdanalyze_parallel.py --dir Output/ --with_efficiency --workers 8

# Process simnow traces
python vcdanalyze.py --dir simnow_output/ --from_simnow
```

## Output

### Single File

Results are printed to stdout:

```
[RESULT] min_first_high_ns=12345
[RESULT] max_last_low_ns=67890
[RESULT] E2E Cycles (ns)=55545
[RESULT] Instr-Matrix MAC Efficiency=1200 ns
[RESULT] Instr-Vector MAC Efficiency=800 ns
```
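If another script needs to consume these lines, a small parser along the following lines may help. It is a sketch that assumes the `[RESULT] key=value` shape shown above stays stable; `parse_results` is a hypothetical helper, not part of either tool.

```python
import re
from typing import Dict

# Matches "[RESULT] <key>=<value>"; the key ends at the first "=".
RESULT_LINE = re.compile(r"^\[RESULT\]\s*(?P<key>.+?)=(?P<value>.+)$")


def parse_results(stdout_text: str) -> Dict[str, str]:
    """Collect [RESULT] lines into a dict, ignoring every other line."""
    results = {}
    for line in stdout_text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            results[match.group("key").strip()] = match.group("value").strip()
    return results


sample = """\
[RESULT] min_first_high_ns=12345
[RESULT] max_last_low_ns=67890
[RESULT] E2E Cycles (ns)=55545
"""
print(parse_results(sample)["E2E Cycles (ns)"])  # 55545
```

Values are kept as strings because some lines carry a unit suffix (for example `1200 ns`).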

### Batch Processing

Batch mode writes a CSV file, `vcd_analysis.csv`, to the target directory:

```csv
Operator,E2E Cycles (ns),Instr-Matrix MAC Efficiency,Instr-Vector MAC Efficiency
conv_op_1,55545,1200,800
relu_op_2,12300,0,450
...
```
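As an example of post-processing, the sketch below ranks operators by end-to-end time. It assumes the column names shown above and integer values; `slowest_operators` is a hypothetical helper.

```python
import csv
import io


def slowest_operators(csv_file, top_n=5):
    """Return (operator, e2e_ns) pairs, slowest first."""
    rows = list(csv.DictReader(csv_file))
    rows.sort(key=lambda row: int(row["E2E Cycles (ns)"]), reverse=True)
    return [(row["Operator"], int(row["E2E Cycles (ns)"])) for row in rows[:top_n]]


# In-memory stand-in for vcd_analysis.csv, using the rows shown above.
sample = io.StringIO(
    "Operator,E2E Cycles (ns),Instr-Matrix MAC Efficiency,Instr-Vector MAC Efficiency\n"
    "conv_op_1,55545,1200,800\n"
    "relu_op_2,12300,0,450\n"
)
print(slowest_operators(sample))  # [('conv_op_1', 55545), ('relu_op_2', 12300)]
```

For a real run, pass a file opened with `open(path, newline="")` instead of the `io.StringIO` object.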

## Portability Notes

### `vcdanalyze.py` (Serial Version)

- **Highly portable**: Uses only standard Python libraries
- Works on Windows, Linux, macOS
- No special system requirements
- Compatible with older Python versions

### `vcdanalyze_parallel.py` (Parallel Version)

- **Limited portability**: Uses `mmap` (memory mapping)
- Works well on Linux and macOS
- May have issues on Windows (especially with large files)
- May fail on network filesystems or special file systems

#### Memory Mapping Limitations

The parallel version uses `mmap` to efficiently access large files, which can cause issues:

1. **Windows**: `mmap` behavior differs from Unix systems
2. **Network drives**: May not support memory mapping
3. **RAM usage**: Maps entire file to virtual memory
4. **File locks**: May interfere with other processes accessing the file
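One common way to work around these limitations is to try `mmap` first and fall back to a plain read when mapping fails. The sketch below shows that pattern with a hypothetical `open_buffer` helper; it is not taken from `vcdanalyze_parallel.py`, which may handle failures differently.

```python
import mmap
from contextlib import contextmanager


@contextmanager
def open_buffer(path):
    """Yield a read-only bytes-like view of the file at `path`.

    Uses mmap when possible; otherwise reads the file into memory.
    """
    with open(path, "rb") as f:
        try:
            mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        except (OSError, ValueError):
            # OSError: the filesystem or OS refused the mapping.
            # ValueError: the file is empty and cannot be mapped.
            mapped = None
        if mapped is None:
            yield f.read()
        else:
            try:
                yield mapped
            finally:
                mapped.close()
```

Both `bytes` and `mmap` objects support slicing and `find()`, so calling code can treat the two cases alike. Note that the fallback loads the whole file into RAM, which trades the mapping problem for higher memory use.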

## Choosing the Right Tool

### Use `vcdanalyze.py` (Serial) when:

- VCD files are small (<100MB)
- Maximum compatibility is needed
- Running on Windows or unknown environments
- Memory is limited
- Processing network-mounted files

### Use `vcdanalyze_parallel.py` (Parallel) when:

- VCD files are large (>100MB)
- Speed is critical
- Running on Linux/macOS with sufficient RAM
- Processing local files
- Batch processing many large files

## Troubleshooting

1. **"Cannot map file" error (parallel version)**

   - Solution: Use the serial version, or copy the file to a local disk first

2. **Out of memory (parallel version)**

   - Solution: Reduce `--workers`, or use the serial version

## File Structure Expected

For batch processing, the tool expects this directory structure:

```
Output_folder/
├── operator_1/
│   └── trace.vcd (or trace.vcd.gz)
├── operator_2/
│   └── trace.vcd (or trace.vcd.gz)
└── ...
```

Results are written to `Output_folder/vcd_analysis.csv`.
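To check a folder against this layout before a long batch run, a small discovery helper can list which subdirectories contain a trace. The sketch below assumes the trace is named exactly `trace.vcd` or `trace.vcd.gz`, as in the tree above; the tools themselves may accept other file names. `find_traces` is a hypothetical name.

```python
from pathlib import Path

# File names taken from the layout above; treated here as an assumption.
TRACE_NAMES = ("trace.vcd", "trace.vcd.gz")


def find_traces(root):
    """Map each operator subdirectory name to the trace file found in it."""
    traces = {}
    for sub in sorted(Path(root).iterdir()):
        if not sub.is_dir():
            continue
        for name in TRACE_NAMES:
            candidate = sub / name
            if candidate.is_file():
                traces[sub.name] = candidate
                break
    return traces
```

Subdirectories missing from the returned dict have no trace under the assumed names.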

---

## Clean Output Directory Script

This script cleans subfolders inside a given root directory by **deleting everything except**:

- The `hw_package` directory (preserved with bins and ASMs)
- The files:

  - `aie4_dma.cpp`
  - `dma.hpp`
  - `graph.hpp`
  - `super.cc`
  - `super.hh`

### Usage

```bash
python clean_output_folder.py /path/to/root
```
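Because the script deletes files, it can be useful to preview what would be removed. The following dry-run sketch applies the keep rules listed above without deleting anything. It assumes the rules are applied to the direct children of each subfolder; `clean_output_folder.py` may match at other depths, so treat this as an approximation.

```python
from pathlib import Path

# Keep rules copied from the list above.
KEEP_DIRS = {"hw_package"}
KEEP_FILES = {"aie4_dma.cpp", "dma.hpp", "graph.hpp", "super.cc", "super.hh"}


def plan_cleanup(root):
    """Return the paths that would be deleted. Nothing is removed."""
    doomed = []
    for sub in sorted(Path(root).iterdir()):
        if not sub.is_dir():
            continue
        for entry in sorted(sub.iterdir()):
            if entry.is_dir() and entry.name in KEEP_DIRS:
                continue
            if entry.is_file() and entry.name in KEEP_FILES:
                continue
            doomed.append(entry)
    return doomed
```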

---

## ONNX Subgraph Cutter

This utility extracts a **subgraph** from an ONNX model between two node names (inclusive).
It preserves the necessary initializers and dependencies, producing a valid ONNX model.

**NOTE:** Quote the `--start` or `--end` name if it contains characters the shell interprets, such as `/`, `(`, `)`, or spaces.

### Usage

```bash
python cut_onnx_subgraph.py \
  --input path/to/model.onnx \
  --start "Conv_3" \
  --end "Relu_7" \
  --output path/to/subgraph.onnx
```

**Arguments:**

- `--input` → Path to the original ONNX model.
- `--start` → Name of the **start node** (inclusive).
- `--end` → Name of the **end node** (inclusive).
- `--output` → Output path for the extracted subgraph.

The script will validate node order and names, then save the extracted subgraph.

### Example: Start from Graph Input

If your model input tensor is named `image_input`, you can start extraction directly from it:

```bash
python cut_onnx_subgraph.py \
  --input path/to/model.onnx \
  --start image_input \
  --end "model/conv2d_58/BiasAdd" \
  --output path/to/subgraph.onnx
```

This will extract everything between the **graph input tensor** `image_input` and the node `model/conv2d_58/BiasAdd`.

---

## Map Layer IDs from `_unique_nodes.json` to `_alloc.json`

Maps `nodenames[0]` from a **unique-nodes JSON** to matching `"name"` entries in an **alloc JSON**, printing the corresponding alloc keys.


### Usage

```bash
python map_unique_layer_ids.py -u unique_nodes.json -a alloc.json [-k PREFIX]
```

**Arguments:**

| Option | Long form       | Description                                             |
| :----- | :-------------- | :------------------------------------------------------ |
| `-u`   | `--unique`      | Path to unique-nodes JSON                               |
| `-a`   | `--alloc`       | Path to alloc JSON                                      |
| `-k`   | `--starts-with` | (Optional) Prefix filter — e.g. `MatMul`, `Conv`, `Add` |


### Example

* **With `-k`** → Prints comma-separated alloc keys for that prefix.
  Example:

  ```
  1,2,3
  ```
* **Without `-k`** → Groups all keys by prefix.
  Example:

  ```
  MatMul: 1,2,3
  Add: 4,5
  Conv: 6,7,8
  ```
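The grouping behavior can be sketched as follows. The JSON shapes are assumptions made for illustration: `alloc` is taken to be a dict of alloc key to an entry with a `"name"` field, `unique_names` is the list of `nodenames[0]` values, and the prefix is the text before the first `_`. The function names are hypothetical and the real script may differ.

```python
from collections import defaultdict


def map_layer_ids(unique_names, alloc):
    """Group alloc keys by name prefix, for names listed in unique_names."""
    wanted = set(unique_names)
    groups = defaultdict(list)
    for key, entry in alloc.items():
        name = entry.get("name")
        if name in wanted:
            prefix = name.split("_", 1)[0]
            groups[prefix].append(str(key))
    return dict(groups)


def format_groups(groups, starts_with=None):
    """Render groups as comma-separated lists with no spaces."""
    if starts_with is not None:
        return ",".join(groups.get(starts_with, []))
    return "\n".join(f"{prefix}: {','.join(keys)}" for prefix, keys in groups.items())


# Hypothetical data shaped per the assumptions above.
alloc = {
    "1": {"name": "MatMul_0"},
    "2": {"name": "MatMul_1"},
    "3": {"name": "Add_0"},
    "4": {"name": "Conv_0"},
}
groups = map_layer_ids(["MatMul_0", "MatMul_1", "Add_0"], alloc)
print(format_groups(groups, starts_with="MatMul"))  # 1,2
print(format_groups(groups))
```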

### Notes

* Prefix is derived from the part of the node name before `_` (e.g., `MatMul_0` → prefix `MatMul`).
* Entries with a missing or invalid `nodenames[0]` are skipped with a warning.
* Output has **no spaces in comma-separated lists**, which keeps it **safe for automation**.

---