Thank you for your interest in contributing to caugi! This document provides guidelines and information to help you contribute effectively to the project.
Project Scope and Overview
caugi (pronounced “corgi”) is a Causal Graph Interface package for R, providing a high-performance, tidy toolbox for building, coercing, and analyzing causal graphs. The package is designed to be:
- Causality-first: Focused on causal graph operations and algorithms
- High-performance: Leveraging Rust for performance-critical operations
- Tidy: Following tidyverse principles and conventions
- Flexible: Supporting multiple graph types and custom edge definitions
Getting Started
Prerequisites
To contribute to caugi, you’ll need:
- R - See the DESCRIPTION file for minimum version requirements
- Rust toolchain - See the DESCRIPTION file for minimum version requirements and system dependencies
- Development tools - Install the package using pak::pak("frederikfabriciusbjerre/caugi"), which will handle all dependencies
Installing Rust
If you don’t have Rust installed, visit rustup.rs for installation instructions appropriate for your platform.
Development Environment Setup
- Clone the repository:
- Load the package in R:
  devtools::load_all()
- Build Rust code (if needed):
  rextendr::document()
The Rust compilation happens automatically via rextendr when you load or build the package.
Architecture
caugi is built as a hybrid R/Rust codebase:
- R Package: Front-end API using S7 objects
- Rust Backend: Core graph algorithms and data structures for performance
- Graph Storage: Compressed Sparse Row (CSR) format for efficient querying
- Lazy Building: Graph mutations are batched in R and built in Rust on demand
Project Structure
caugi/
├── R/ # R source files
│ ├── caugi.R # Main caugi object constructor
│ ├── edge_operators.R # Edge operator definitions
│ ├── queries.R # Graph query functions
│ ├── metrics.R # Graph metrics (SHD, AID)
│ └── ...
├── src/
│ ├── rust/ # Rust source code
│ │ ├── src/
│ │ │ ├── lib.rs # Main library and extendr bindings
│ │ │ ├── graph/ # Graph data structures and algorithms
│ │ │ └── edges/ # Edge type definitions
│ │ └── Cargo.toml
│ └── entrypoint.c # C entrypoint for R
├── tests/
│ └── testthat/ # Test files (test-*.R)
├── man/ # Generated documentation
└── vignettes/ # Package vignettes
How It Works
CSR Format: Graphs are stored in Compressed Sparse Row format in Rust, which makes queries very fast but mutations more expensive.
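To make the trade-off concrete, here is a minimal CSR sketch in plain Rust. It is not caugi's actual implementation: the struct name Csr, its fields, and the methods from_edges and children are assumptions chosen for illustration, and nodes are assumed to be integer indices 0..n.

```rust
/// Minimal CSR (Compressed Sparse Row) adjacency sketch (hypothetical, not caugi's code).
/// `col_idx[row_ptr[i]..row_ptr[i + 1]]` holds the children of node `i`.
#[derive(Debug)]
pub struct Csr {
    row_ptr: Vec<usize>,
    col_idx: Vec<usize>,
}

impl Csr {
    /// Build from a directed edge list over nodes `0..n`.
    pub fn from_edges(n: usize, edges: &[(usize, usize)]) -> Result<Self, String> {
        // Count the out-degree of each node, shifted by one position,
        // so that a prefix sum turns the counts into row offsets.
        let mut row_ptr = vec![0usize; n + 1];
        for &(from, to) in edges {
            if from >= n || to >= n {
                return Err(format!("edge ({from}, {to}) is out of range for {n} nodes"));
            }
            row_ptr[from + 1] += 1;
        }
        for i in 0..n {
            row_ptr[i + 1] += row_ptr[i];
        }
        // Place each target into its row, tracking the next free slot per row.
        let mut cursor = row_ptr.clone();
        let mut col_idx = vec![0usize; edges.len()];
        for &(from, to) in edges {
            col_idx[cursor[from]] = to;
            cursor[from] += 1;
        }
        Ok(Csr { row_ptr, col_idx })
    }

    /// Children of `node` as one contiguous slice.
    /// Panics if `node` is out of range.
    pub fn children(&self, node: usize) -> &[usize] {
        &self.col_idx[self.row_ptr[node]..self.row_ptr[node + 1]]
    }
}
```

A query is a single slice lookup, which is why reads are fast. Adding one edge, by contrast, shifts every later entry of col_idx and every later offset in row_ptr, so in this sketch the simplest way to mutate is to rebuild from the full edge list.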
Lazy Building: When you mutate a graph (e.g., add edges), the changes are stored in R but not immediately applied in Rust. The graph rebuilds itself in Rust when you query it, or you can force a rebuild with build(cg).
R + Rust Integration: The extendr framework handles the communication between R and Rust, with automatic type conversions and memory management.
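The lazy-building pattern described above can be illustrated in isolation. The sketch below is a single-language simplification: in caugi the pending changes live in R and the build happens in Rust, whereas here both halves sit in one hypothetical LazyGraph struct. All names (LazyGraph, add_edge, build, children, rebuilds) are assumptions made for illustration and are not caugi's API.

```rust
/// Hypothetical illustration of lazy building: mutations are queued cheaply
/// and the query index is rebuilt only when a query (or `build`) needs it.
pub struct LazyGraph {
    n: usize,
    edges: Vec<(usize, usize)>,     // every edge added so far
    built: Option<Vec<Vec<usize>>>, // adjacency index; `None` means stale
    rebuilds: usize,                // rebuild counter, kept only for illustration
}

impl LazyGraph {
    pub fn new(n: usize) -> Self {
        LazyGraph { n, edges: Vec::new(), built: None, rebuilds: 0 }
    }

    /// Record an edge and mark the index as stale; no rebuild happens here.
    pub fn add_edge(&mut self, from: usize, to: usize) -> Result<(), String> {
        if from >= self.n || to >= self.n {
            return Err(format!("edge ({from}, {to}) is out of range for {} nodes", self.n));
        }
        self.edges.push((from, to));
        self.built = None;
        Ok(())
    }

    /// Force a rebuild; does nothing if the index is already up to date.
    pub fn build(&mut self) {
        if self.built.is_some() {
            return;
        }
        let mut adj = vec![Vec::<usize>::new(); self.n];
        for &(from, to) in &self.edges {
            adj[from].push(to);
        }
        self.built = Some(adj);
        self.rebuilds += 1;
    }

    pub fn is_built(&self) -> bool {
        self.built.is_some()
    }

    pub fn rebuilds(&self) -> usize {
        self.rebuilds
    }

    /// Query the children of `node`, rebuilding first if the index is stale.
    /// Panics if `node` is out of range.
    pub fn children(&mut self, node: usize) -> Vec<usize> {
        self.build();
        self.built
            .as_ref()
            .map(|adj| adj[node].clone())
            .unwrap_or_default()
    }
}
```

In this sketch, any number of add_edge calls followed by one query costs a single rebuild, which is the point of batching mutations.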
Code Style Guidelines
R Code
- Follow the tidyverse style guide: Use styler::style_pkg() before committing
- Naming conventions:
  - Functions: snake_case
  - S7 classes: snake_case
  - Internal functions: prefix with . (e.g., .internal_function)
- Documentation: All exported functions must have comprehensive Roxygen2 documentation (CRAN policy):
  - @title - Brief title
  - @description - Detailed description
  - @param - Parameter descriptions
  - @returns - Return value description (required)
  - @examples - Working examples (required)
  - @family - Group related functions together
  - @concept - Add conceptual keywords for help search
  - Update _pkgdown.yaml to organize the function appropriately in the documentation website
Example:
#' @title Get parent nodes
#'
#' @description
#' Returns all parent nodes of the specified node(s) in the graph.
#'
#' @param graph A caugi graph object
#' @param nodes Character vector of node names
#'
#' @returns A character vector of parent node names
#'
#' @examples
#' cg <- caugi(A %-->% B, B %-->% C)
#' parents(cg, "C")
#'
#' @family queries
#' @concept queries
#'
#' @export
parents <- function(graph, nodes) {
  # implementation
}
Rust Code
- Follow Rust standard style: Run cargo fmt in src/rust/ before committing
- Documentation: Use Rust doc comments (///) for public functions and modules
- Performance-focused: Prioritize performance and memory efficiency
- Error handling: Use Result types appropriately and provide meaningful error messages
- Extendr integration: Functions exposed to R should use #[extendr] macros
Example:
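A minimal sketch of these conventions, assuming a hypothetical function parents_of that is not part of caugi. It shows doc comments, a Result return type, and a descriptive error message. The #[extendr] attribute is mentioned only in a comment so that the sketch compiles without the extendr_api crate.

```rust
/// Returns the sorted, de-duplicated parents of `node` in a directed edge list.
///
/// Hypothetical example function; nodes are assumed to be indices `0..n_nodes`.
///
/// # Errors
///
/// Returns an error message if `node` is not in `0..n_nodes`.
// A function exposed to R would additionally carry the `#[extendr]` attribute
// from the `extendr_api` crate; it is left out here to keep the sketch
// free of external dependencies.
pub fn parents_of(
    n_nodes: usize,
    edges: &[(usize, usize)],
    node: usize,
) -> Result<Vec<usize>, String> {
    if node >= n_nodes {
        return Err(format!(
            "node index {node} is out of range: graph has {n_nodes} nodes"
        ));
    }
    // Keep the source of every edge that points at `node`.
    let mut parents: Vec<usize> = edges
        .iter()
        .filter(|&&(_, to)| to == node)
        .map(|&(from, _)| from)
        .collect();
    parents.sort_unstable();
    parents.dedup();
    Ok(parents)
}
```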
Testing
Writing Tests
- Test file naming: test-<feature>.R (e.g., test-queries.R)
- Use testthat: Follow existing patterns with test_that() blocks
- Test edge cases: Consider empty graphs, single nodes, and complex structures
- Test both R and Rust paths: Ensure lazy building works correctly
Example test structure:
Important Testing Considerations
- Lazy building: Remember that graph mutations are batched. Test both before and after explicit build() calls if relevant.
- Graph class invariants: When testing graph classes, ensure that operations maintain the class invariants (e.g., DAGs remain acyclic).
- Edge registry: If modifying the edge registry system, test thoroughly including edge cases.
Submitting Changes
Before Submitting a Pull Request
- Style your code:
  # Style R code
  styler::style_pkg()
- Check the package:
  devtools::check()
  Ensure there are no errors or warnings. This will run all tests and validate documentation.
Pull Request Guidelines
- Create focused PRs: Each PR should address a single feature, bug fix, or improvement
- Write clear commit messages: Start with a capital letter and a verb (e.g., “Fix memory leak in graph builder” or “Add support for custom edge types”)
- Reference issues: If your PR addresses an issue, reference it in the PR description (e.g., “Fixes #123”)
- Update documentation: Include documentation updates for user-facing changes. Also update vignettes if necessary. Inline comments are encouraged for complex logic.
- Add tests with full coverage: New features should include comprehensive tests that provide full code coverage for the added code. Untested code should not be submitted.
- Maintain backward compatibility: Avoid breaking changes to the public API when possible
- Code coverage: Aim to maintain or improve code coverage. New functions should be fully tested. Use devtools::test_coverage() to check coverage levels.
Reporting Issues
Found a bug or have a feature request? Please open an issue on GitHub.
Code of Conduct
By participating in this project, you agree to abide by standard community guidelines. Please report any unacceptable behavior to the maintainers.