Automated Frontmatter Generation

The RallyPoynt OS documentation system includes an automated frontmatter generation tool that can analyze content and suggest appropriate metadata fields. This guide explains how to use this tool to streamline content creation and ensure consistent metadata across documentation files.

What the Tool Does

The frontmatter generation system uses natural language processing and content analysis to automatically:

Extract titles - From the first heading or file name
Generate descriptions - From the first suitable paragraph
Suggest tags - Based on keyword frequency and importance
Determine content type/level - Based on content structure and keywords
Classify business area - Based on content analysis
Suggest sidebar positions - Based on file naming patterns
Add appropriate specialized fields - Based on content type

The tool can be run in suggestion mode (showing what would be generated) or apply mode (actually updating files).

Using the Frontmatter Generation Tool

Basic Usage

To analyze content files and see what frontmatter would be generated:

npm run generate-frontmatter

This command will:

Analyze all Markdown files in the docs/ directory
Show a summary of the analysis
Generate frontmatter suggestions (but not apply them)

Applying Generated Frontmatter

To analyze content files and apply the generated frontmatter:

npm run generate-frontmatter:apply

This will update all files with appropriate frontmatter. For files that already have frontmatter, existing values are preserved, and only missing fields are added.
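
The merge rule described above (keep existing values, add only missing fields) can be sketched as follows. This is a minimal illustration rather than the tool's actual code: `mergeFrontmatter` is a hypothetical helper, and it assumes both sets of frontmatter have already been parsed into plain objects. Whether the real tool treats an empty value as "missing" is not stated, so this sketch only fills keys that are absent.

```javascript
// Hypothetical helper (not the tool's real API): fill in missing fields
// without overwriting anything the author already wrote.
function mergeFrontmatter(existing, generated) {
  const merged = { ...existing };
  for (const [key, value] of Object.entries(generated)) {
    // Only add a generated value when the file has no entry for that key.
    if (!(key in merged)) {
      merged[key] = value;
    }
  }
  return merged;
}

const handWritten = { title: "Hand-written Title" };
const suggested = { title: "Generated Title", tags: ["metadata"] };
console.log(mergeFrontmatter(handWritten, suggested));
```

In this example the hand-written title survives and only the `tags` field is added.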

Interactive Mode

If you want to review and approve each file's changes individually:

npm run generate-frontmatter:interactive

This mode will:

Analyze each file
Show the proposed frontmatter
Ask for confirmation before applying changes to each file

Advanced Options

The tool supports additional command-line options for more fine-grained control:

# Process a specific file
node scripts/js/generate-frontmatter.js --file=docs/specific-file.md

# Process files in a specific directory
node scripts/js/generate-frontmatter.js --dir=docs/components

# Show detailed output for each file
node scripts/js/generate-frontmatter.js --verbose
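
For readers curious how options in this `--name=value` style are typically handled, here is a minimal sketch. `parseOptions` is a hypothetical function written for illustration; the real script's parsing logic, defaults, and whether the options can be combined are not documented here.

```javascript
// Hypothetical parser for the three options shown above.
function parseOptions(argv) {
  const options = { file: null, dir: null, verbose: false };
  for (const arg of argv) {
    if (arg.startsWith("--file=")) {
      options.file = arg.slice("--file=".length);
    } else if (arg.startsWith("--dir=")) {
      options.dir = arg.slice("--dir=".length);
    } else if (arg === "--verbose") {
      options.verbose = true;
    }
  }
  return options;
}

// In a real Node script the input would be process.argv.slice(2).
console.log(parseOptions(["--dir=docs/components", "--verbose"]));
```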

Git Integration

The frontmatter generation system integrates with Git through a pre-commit hook, which makes it easier to keep metadata complete.

Pre-commit Hook

When you commit changes, the pre-commit hook:

Scans all staged Markdown files
Identifies any files without frontmatter
Offers to automatically generate frontmatter for those files
Allows you to continue the commit after generation

This makes it easy to add proper metadata to new documentation files without interrupting your workflow.
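
The detection step of such a hook can be sketched as below. This is an illustration under stated assumptions, not the hook's real implementation: frontmatter is assumed to be a YAML block delimited by `---` lines at the very top of the file, and `hasFrontmatter` and `findMissingFrontmatter` are hypothetical names. A real hook would obtain the staged paths from Git (for example with `git diff --cached --name-only`) and read each file; here the files are passed in as a path-to-content object so the logic stays self-contained.

```javascript
// Assumption: frontmatter is a block fenced by '---' lines at the top of the file.
function hasFrontmatter(text) {
  return /^---\r?\n[\s\S]*?\r?\n---\s*(\r?\n|$)/.test(text);
}

// stagedFiles maps file paths to file contents (a stand-in for reading from Git).
function findMissingFrontmatter(stagedFiles) {
  return Object.entries(stagedFiles)
    .filter(([path]) => path.endsWith(".md"))      // only Markdown files
    .filter(([, text]) => !hasFrontmatter(text))   // keep those lacking frontmatter
    .map(([path]) => path);
}

const staged = {
  "docs/existing.md": "---\ntitle: Existing\n---\n# Existing",
  "docs/new-file.md": "# New File\n\nSome content.",
  "src/app.js": "console.log('not markdown');",
};
console.log(findMissingFrontmatter(staged));
```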

Working with the Pre-commit Hook

When committing files without frontmatter, you'll see:

⚠️ The following files are missing frontmatter:
  - docs/new-file.md
  - docs/another-file.md

Do you want to generate frontmatter for these files? (y/n)

If you select "y":

Frontmatter will be generated and applied to the files
Your commit will continue as normal
The generated frontmatter will remain unstaged (in your working directory)
You should review the generated frontmatter, then stage and amend your commit

# After commit completes, review the generated frontmatter
git diff

# Stage only the files that received generated frontmatter
git add docs/new-file.md docs/another-file.md

# Amend your commit
git commit --amend

This workflow ensures all new content has appropriate frontmatter while giving you a chance to review it.

How the Analysis Works

The frontmatter generator uses several techniques to determine appropriate metadata:

Title Extraction

Looks for the first H1 heading (# Title)
Falls back to first H2 heading if no H1 exists
Uses the filename as a last resort, converting kebab-case to Title Case
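
The fallback chain above can be sketched in a few lines. `extractTitle` and `kebabToTitleCase` are hypothetical names chosen for this example, and the regular expressions are an assumption about how headings are recognized. Note that this simple version does not skip `#` lines inside code blocks.

```javascript
// Convert "getting-started" to "Getting Started".
function kebabToTitleCase(name) {
  return name
    .split("-")
    .filter(Boolean)
    .map((word) => word[0].toUpperCase() + word.slice(1))
    .join(" ");
}

// Hypothetical sketch of the H1 -> H2 -> filename fallback.
function extractTitle(markdown, filePath) {
  const h1 = markdown.match(/^#\s+(.+)$/m);
  if (h1) return h1[1].trim();

  const h2 = markdown.match(/^##\s+(.+)$/m);
  if (h2) return h2[1].trim();

  // Last resort: derive the title from the file name without its extension.
  const baseName = filePath.split("/").pop().replace(/\.mdx?$/, "");
  return kebabToTitleCase(baseName);
}

console.log(extractTitle("Some text without headings.", "docs/getting-started.md"));
```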

Description Extraction

Finds the first paragraph after headings are removed
Checks that the paragraph is between 50 and 250 characters long
Truncates longer paragraphs to 250 characters
Falls back to a generic description if no suitable paragraph is found
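
One way to read these rules is sketched below. It assumes that paragraphs shorter than 50 characters are skipped, that the first remaining paragraph is used, and that it is cut at 250 characters. The function name `extractDescription` and the fallback text are hypothetical, since the tool's actual generic description is not given here.

```javascript
const MIN_DESCRIPTION_LENGTH = 50;
const MAX_DESCRIPTION_LENGTH = 250;
const FALLBACK_DESCRIPTION = "Documentation page"; // hypothetical placeholder text

function extractDescription(markdown) {
  // Remove heading lines so only body text remains.
  const body = markdown
    .split(/\r?\n/)
    .filter((line) => !/^#{1,6}\s/.test(line))
    .join("\n");

  // Split on blank lines and normalize whitespace inside each paragraph.
  const paragraphs = body
    .split(/\n\s*\n/)
    .map((p) => p.replace(/\s+/g, " ").trim())
    .filter(Boolean);

  // Assumption: the first paragraph that meets the minimum length is "suitable".
  const candidate = paragraphs.find((p) => p.length >= MIN_DESCRIPTION_LENGTH);
  if (!candidate) return FALLBACK_DESCRIPTION;

  return candidate.length > MAX_DESCRIPTION_LENGTH
    ? candidate.slice(0, MAX_DESCRIPTION_LENGTH)
    : candidate;
}

console.log(extractDescription("# Title\n\nToo short.\n\n" + "word ".repeat(20)));
```

A production version would probably truncate at a word boundary rather than mid-word; the source does not say which the tool does.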

Tag Generation

Removes stopwords, code blocks, and non-alphabetic tokens
Uses TF-IDF (Term Frequency-Inverse Document Frequency) to identify important terms
Selects the top 5 most relevant terms as tags
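
The following sketch shows how TF-IDF scoring can rank candidate tags. It is an illustration under assumptions, not the tool's code: the stopword list is a tiny stand-in, the smoothed IDF formula is one common variant, and `tokenize` and `suggestTags` are hypothetical names. Tokens of one or two letters are also dropped, which is an extra assumption.

```javascript
const STOPWORDS = new Set([
  "the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "with", "this", "that", "it", "on",
]);

function tokenize(markdown) {
  return markdown
    .replace(/```[\s\S]*?```/g, " ")  // drop fenced code blocks
    .toLowerCase()
    .split(/[^a-z]+/)                 // split on anything non-alphabetic
    .filter((t) => t.length > 2 && !STOPWORDS.has(t));
}

// target: the document to tag; corpus: all documents (including the target).
function suggestTags(target, corpus, maxTags = 5) {
  const docTermSets = corpus.map((text) => new Set(tokenize(text)));
  const tokens = tokenize(target);
  const counts = new Map();
  for (const t of tokens) counts.set(t, (counts.get(t) || 0) + 1);

  const scored = [...counts].map(([term, count]) => {
    const docFreq = docTermSets.filter((terms) => terms.has(term)).length;
    const tf = count / tokens.length;
    const idf = Math.log((1 + docTermSets.length) / (1 + docFreq)) + 1; // smoothed IDF
    return { term, score: tf * idf };
  });

  return scored
    .sort((a, b) => b.score - a.score || a.term.localeCompare(b.term))
    .slice(0, maxTags)
    .map((entry) => entry.term);
}

const page = "Frontmatter frontmatter frontmatter metadata docs";
console.log(suggestTags(page, [page, "docs about hooks", "docs about validation"]));
```

Terms that are frequent in the target but rare across the corpus score highest, which is why a word that appears in every document tends to rank low.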

Content Type Detection

First checks file path for obvious indicators (/components/, /tutorials/, etc.)
Searches for type-specific keywords in the content
Analyzes document structure (step-by-step patterns for tutorials, API patterns for reference, etc.)
Assigns a score for each content type and selects the highest
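
A scoring approach like the one described can be sketched as follows. Everything specific here is an assumption made for illustration: the content types, path hints, keywords, structural patterns, and weights in `CONTENT_TYPE_RULES` are hypothetical, and the path check is modeled as a strong score contribution rather than an early exit, since the source does not say which the tool uses.

```javascript
// Hypothetical rules; the real tool's types, keywords, and weights may differ.
const CONTENT_TYPE_RULES = {
  tutorial: {
    pathHint: "/tutorials/",
    keywords: ["step", "follow", "learn"],
    structure: /^\s*\d+\.\s/m,                    // numbered steps
  },
  component: {
    pathHint: "/components/",
    keywords: ["props", "component", "render"],
    structure: /^#+\s*props\b/im,                 // a "Props" heading
  },
  reference: {
    pathHint: "/reference/",
    keywords: ["api", "endpoint", "parameter"],
    structure: /^#+\s*(parameters|returns)\b/im,  // API-style headings
  },
};

function detectContentType(filePath, markdown) {
  const text = markdown.toLowerCase();
  let best = { type: null, score: 0 };
  for (const [type, rule] of Object.entries(CONTENT_TYPE_RULES)) {
    let score = 0;
    if (filePath.includes(rule.pathHint)) score += 5;  // path indicator
    for (const keyword of rule.keywords) {
      score += text.split(keyword).length - 1;          // substring occurrences
    }
    if (rule.structure.test(markdown)) score += 3;     // structural pattern
    if (score > best.score) best = { type, score };
  }
  return best; // type is null when nothing scored above zero
}

console.log(detectContentType("docs/guides/setup.md", "1. Do this step\n2. Follow the next step"));
```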

Business Area Classification

First checks file path for area indicators
Counts occurrences of area-specific keywords in the content
Assigns a score for each business area and selects the highest
Defaults to "systems" if no clear classification emerges
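
The classification and its default can be sketched in the same style. Only the "systems" default comes from the description above; the other area names, the keyword lists, the path weight, and the reading of "no clear classification" as "no hits or a tie for first place" are assumptions made for this example, and `classifyBusinessArea` is a hypothetical name.

```javascript
// Hypothetical areas and keywords; only the "systems" default is from the guide.
const AREA_KEYWORDS = {
  systems: ["server", "deployment", "database"],
  finance: ["invoice", "budget", "payment"],
  operations: ["schedule", "logistics", "inventory"],
};
const DEFAULT_AREA = "systems";

function classifyBusinessArea(filePath, markdown) {
  const text = markdown.toLowerCase();
  const path = filePath.toLowerCase();

  const ranked = Object.entries(AREA_KEYWORDS)
    .map(([area, keywords]) => {
      let score = path.includes(`/${area}/`) ? 5 : 0;  // path indicator
      for (const keyword of keywords) {
        score += text.split(keyword).length - 1;        // keyword occurrences
      }
      return { area, score };
    })
    .sort((a, b) => b.score - a.score);

  const [top, runnerUp] = ranked;
  // Assumption: no hits, or a tie for first place, counts as "no clear classification".
  if (top.score === 0 || top.score === runnerUp.score) return DEFAULT_AREA;
  return top.area;
}

console.log(classifyBusinessArea("docs/guides/billing.md", "The invoice and payment flow"));
```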

Integration with Validation

The frontmatter generation system works alongside the Frontmatter Validation system:

Use generate-frontmatter:apply to add missing frontmatter to files (generate-frontmatter on its own only shows suggestions)
Then use validate-frontmatter to check for any remaining issues
Fix any validation errors manually or with validate-frontmatter:fix

This workflow ensures that all documentation has consistent, high-quality metadata.

Best Practices

Run generation on new content: Always run the generator on new documentation files
Review auto-generated fields: The suggestions come from heuristics, so always review the generated frontmatter before committing it
Use interactive mode for bulk operations: When applying frontmatter to many files, use interactive mode to approve each file's changes individually
Validate after generation: Always validate frontmatter after automatic generation to catch any issues

By using the automated frontmatter generation system, you can ensure consistent metadata across documentation with minimal manual effort.