Saving and Sharing Graphs with the Caugi Format
Source: vignettes/serialization.Rmd
Overview
The caugi package provides a native JSON-based serialization format for saving and loading causal graphs. This format enables reproducible research, data sharing, and caching of graph structures.
Quick Start
Writing Graphs
First, create a causal graph:
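A graph matching the JSON shown in the Structure section (nodes A to D, four directed edges, class DAG) can be built with the same caugi() call style used under Different Graph Types. The exact call below is an assumption reconstructed from that JSON, not necessarily the original chunk:

```r
library(caugi)

# Assumed construction: four nodes and four directed edges, chosen to
# match the JSON example in the Structure section.
cg <- caugi(
  A %-->% B,
  A %-->% C,
  B %-->% D,
  C %-->% D,
  class = "DAG"
)
```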
Then, write it to a file in the caugi format:
tmp <- tempfile(fileext = ".caugi.json")
write_caugi(cg, tmp,
  comment = "Example causal graph",
  tags = c("research", "example")
)
That’s it! The graph is now saved in a human-readable JSON file.
Reading Graphs
You can read the graph back from the file, and verify it matches the original:
cg_loaded <- read_caugi(tmp)
identical(edges(cg), edges(cg_loaded))
#> [1] TRUE
The Caugi Format
Structure
The caugi format uses a simple, human-readable JSON structure:
{
"$schema": "https://caugi.org/schemas/caugi-v1.schema.json",
"format": "caugi",
"version": "1.0.0",
"graph": {
"class": "DAG",
"nodes": [
"A",
"B",
"C",
"D"
],
"edges": [
{
"from": "A",
"to": "B",
"edge": "-->"
},
{
"from": "A",
"to": "C",
"edge": "-->"
},
{
"from": "B",
"to": "D",
"edge": "-->"
},
{
"from": "C",
"to": "D",
"edge": "-->"
}
]
},
"meta": {
"comment": "Example causal graph",
"tags": [
"research",
"example"
]
}
}
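Because the file is plain JSON, it can also be inspected without caugi. The sketch below parses a trimmed copy of the structure above with the jsonlite package and pulls out the nodes, edges, and metadata. Using jsonlite is an assumption made for illustration; it is not mentioned elsewhere in this document.

```r
library(jsonlite)  # assumption: any JSON parser would do

# A trimmed copy of the example file shown above, as a string.
caugi_json <- '{
  "format": "caugi",
  "version": "1.0.0",
  "graph": {
    "class": "DAG",
    "nodes": ["A", "B", "C", "D"],
    "edges": [
      {"from": "A", "to": "B", "edge": "-->"},
      {"from": "A", "to": "C", "edge": "-->"},
      {"from": "B", "to": "D", "edge": "-->"},
      {"from": "C", "to": "D", "edge": "-->"}
    ]
  },
  "meta": {"comment": "Example causal graph", "tags": ["research", "example"]}
}'

doc <- fromJSON(caugi_json)

doc$graph$class   # graph class as a string
doc$graph$nodes   # character vector of node names
doc$graph$edges   # data frame with columns from, to, edge
doc$meta$tags     # character vector of tags
```

With jsonlite's default simplification, the array of edge objects becomes a data frame, which is convenient for filtering or counting edges by operator.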
Key Features
- Versioned: Schema version 1 with forward compatibility
- Human-readable: Uses node names and DSL operators (not indices)
- Self-documenting: Includes $schema reference for IDE validation
- Metadata support: Optional comments and tags
Edge Types
The format supports all caugi edge types using their DSL operators:
| Operator | Description | Graph Types |
|---|---|---|
| `-->` | Directed edge | DAG, PDAG, ADMG, UNKNOWN |
| `---` | Undirected edge | UG, PDAG, UNKNOWN |
| `<->` | Bidirected edge | ADMG, UNKNOWN |
| `o->` | Partially directed | PDAG, UNKNOWN |
| `--o` | Partially undirected | PDAG, UNKNOWN |
| `o-o` | Partial (both circles) | PDAG, UNKNOWN |
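The operator-to-class mapping in the table can be expressed as a small lookup, which is handy when checking a hand-edited file before loading it. This is a minimal base R sketch; `allowed_ops` and `unexpected_ops()` are hypothetical names for illustration and are not caugi's own validation logic.

```r
# Allowed operators per graph class, transcribed from the table above.
# Illustrative only; caugi performs its own checks when building a graph.
allowed_ops <- list(
  DAG     = c("-->"),
  UG      = c("---"),
  ADMG    = c("-->", "<->"),
  PDAG    = c("-->", "---", "o->", "--o", "o-o"),
  UNKNOWN = c("-->", "---", "<->", "o->", "--o", "o-o")
)

# Hypothetical helper: return the operators in `ops` that the table
# does not list for `class`.
unexpected_ops <- function(ops, class) {
  if (!class %in% names(allowed_ops)) {
    stop("Unknown graph class: ", class)
  }
  setdiff(unique(ops), allowed_ops[[class]])
}

unexpected_ops(c("-->", "-->"), "DAG")  # empty: every operator is listed for DAG
unexpected_ops(c("-->", "<->"), "DAG")  # "<->" is not listed for DAG
```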
Working with the Format
String Serialization
For programmatic use, you can serialize to/from strings:
# Serialize to JSON string
json_str <- caugi_serialize(cg)
cat(substr(json_str, 1, 200), "...\n")
#> {
#> "$schema": "https://caugi.org/schemas/caugi-v1.schema.json",
#> "format": "caugi",
#> "version": "1.0.0",
#> "graph": {
#> "class": "DAG",
#> "nodes": [
#> "A",
#> "B",
#> "C",
#> "D"
#> ...
# Deserialize from JSON string
cg_from_json <- caugi_deserialize(json_str)
Lazy Loading
For large graphs, you can defer building the Rust graph structure until it is needed:
# Read without building the Rust graph structure
cg_lazy <- read_caugi(tmp, lazy = TRUE)
# Build when needed
cg_lazy <- build(cg_lazy)
Metadata
Add context to your graphs with comments and tags:
write_caugi(cg, tmp,
  comment = "Mediation model from Study A",
  tags = c("mediation", "study-a", "validated")
)
Different Graph Types
The format supports all caugi graph classes:
# DAG
dag <- caugi(X %-->% Y, Y %-->% Z, class = "DAG")
# PDAG (with undirected edges)
pdag <- caugi(X %-->% Y, Y %---% Z, class = "PDAG")
# ADMG (with bidirected edges)
admg <- caugi(X %-->% Y, Y %<->% Z, class = "ADMG")
# UG (undirected graph)
ug <- caugi(X %---% Y, Y %---% Z, class = "UG")
# Save them all
write_caugi(dag, tempfile(fileext = ".caugi.json"))
write_caugi(pdag, tempfile(fileext = ".caugi.json"))
write_caugi(admg, tempfile(fileext = ".caugi.json"))
write_caugi(ug, tempfile(fileext = ".caugi.json"))
File Extension Convention
We recommend using .caugi.json as the file extension to
clearly indicate both the format and content type. This helps tools
recognize the files and enables automatic handling by IDEs and
validators.
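One practical benefit of a consistent extension is that graph files can be picked out of a mixed directory with a simple pattern. The base R sketch below uses made-up file names in a temporary directory:

```r
# Create a scratch directory with a mix of (hypothetical) file names.
graph_dir <- tempfile("graphs")
dir.create(graph_dir)
file.create(file.path(
  graph_dir,
  c("study_a.caugi.json", "study_b.caugi.json", "notes.json")
))

# Match only names ending in ".caugi.json".
caugi_files <- list.files(
  graph_dir,
  pattern = "\\.caugi\\.json$",
  full.names = TRUE
)
basename(caugi_files)  # the two .caugi.json files, not notes.json

# Clean up the scratch directory.
unlink(graph_dir, recursive = TRUE)
```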
Schema Validation
All files generated by write_caugi() include a
$schema field pointing to the formal JSON Schema
specification:
https://caugi.org/schemas/caugi-v1.schema.json
This enables:
- IDE support: Autocomplete and inline validation in VS Code, IntelliJ, etc.
- Automated validation: Use standard JSON Schema validators
- Documentation: Hover hints in editors show field descriptions
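Short of full JSON Schema validation, a quick check of the top-level fields can catch files that are not caugi files at all. The helper below is hypothetical (`check_caugi_header()` is not part of caugi), relies on the jsonlite package as an assumption, and further assumes that any 1.x version string is acceptable:

```r
library(jsonlite)  # assumption: used here only to parse the JSON

# Hypothetical helper: TRUE if the top-level fields look like a caugi v1 file.
check_caugi_header <- function(json) {
  doc <- fromJSON(json, simplifyVector = FALSE)
  schema  <- doc[["$schema"]]
  version <- doc[["version"]]
  has_schema <- is.character(schema) && nzchar(schema)
  is_caugi   <- identical(doc[["format"]], "caugi")
  is_v1      <- is.character(version) && startsWith(version, "1.")
  has_schema && is_caugi && is_v1
}

good <- '{
  "$schema": "https://caugi.org/schemas/caugi-v1.schema.json",
  "format": "caugi",
  "version": "1.0.0",
  "graph": {"class": "DAG", "nodes": [], "edges": []}
}'
bad <- '{"format": "other", "version": "1.0.0"}'

check_caugi_header(good)  # passes all three checks
check_caugi_header(bad)   # fails: no $schema and the wrong format
```

This only inspects the header fields; it does not replace validating the file against the published schema.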
Performance
Serialization is implemented in Rust. To see how it performs on your machine, time a write and a read of a generated DAG (here with n = 1000 and m = 500):
tmp_file <- tempfile(fileext = ".caugi.json")
large_dag <- generate_graph(n = 1000, m = 500, class = "DAG")
system.time(write_caugi(large_dag, tmp_file))
system.time(res <- read_caugi(tmp_file))
unlink(tmp_file)