Automated Frontmatter Generation

The RallyPoynt OS documentation system includes an automated frontmatter generation tool that can analyze content and suggest appropriate metadata fields. This guide explains how to use this tool to streamline content creation and ensure consistent metadata across documentation files.

What the Tool Does

The frontmatter generation system uses natural language processing and content analysis to automatically:

  1. Extract titles - From the first heading or file name
  2. Generate descriptions - From the first suitable paragraph
  3. Suggest tags - Based on keyword frequency and importance
  4. Determine content type/level - Based on content structure and keywords
  5. Classify business area - Based on content analysis
  6. Suggest sidebar positions - Based on file naming patterns
  7. Add appropriate specialized fields - Based on content type

The tool can be run in suggestion mode (showing what would be generated), apply mode (updating files), or interactive mode (asking for confirmation before updating each file).

Using the Frontmatter Generation Tool

Basic Usage

To analyze content files and see what frontmatter would be generated:

npm run generate-frontmatter

This command will:

  • Analyze all Markdown files in the docs/ directory
  • Show a summary of the analysis
  • Generate frontmatter suggestions (but not apply them)

Applying Generated Frontmatter

To analyze content files and apply the generated frontmatter:

npm run generate-frontmatter:apply

This will update all files with appropriate frontmatter. For files that already have frontmatter, existing values are preserved, and only missing fields are added.
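
The merge rule described above (existing values win, only missing fields are added) can be sketched as follows. This is a minimal illustration, not the tool's actual code: `mergeFrontmatter` is a hypothetical name, and treating a field as "missing" only when its key is absent is an assumption.

```javascript
// Hypothetical helper: combine generated fields with existing frontmatter.
// Existing values are never overwritten; only absent keys are filled in.
function mergeFrontmatter(existing, generated) {
  const merged = { ...existing };
  for (const [key, value] of Object.entries(generated)) {
    if (!(key in merged)) {
      merged[key] = value;
    }
  }
  return merged;
}

const existingFields = { title: "Custom Title" };
const generatedFields = { title: "Generated Title", tags: ["metadata"] };

// The hand-written title is kept and the missing tags field is added.
console.log(mergeFrontmatter(existingFields, generatedFields));
```

Whether the real tool also treats empty values (for example an empty `tags` list) as missing is not stated here, so check a file's diff before relying on that behavior.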

Interactive Mode

If you want to review and approve each file's changes individually:

npm run generate-frontmatter:interactive

This mode will:

  • Analyze each file
  • Show the proposed frontmatter
  • Ask for confirmation before applying changes to each file

Advanced Options

The tool supports additional command-line options for finer-grained control:

# Process a specific file
node scripts/js/generate-frontmatter.js --file=docs/specific-file.md

# Process files in a specific directory
node scripts/js/generate-frontmatter.js --dir=docs/components

# Show detailed output for each file
node scripts/js/generate-frontmatter.js --verbose

Git Integration

The frontmatter generation system is integrated with Git to make it easier to maintain proper metadata:

Pre-commit Hook

When you commit changes, the pre-commit hook:

  1. Scans all staged Markdown files
  2. Identifies any files without frontmatter
  3. Offers to automatically generate frontmatter for those files
  4. Allows you to continue the commit after generation

This makes it easy to add proper metadata to new documentation files without interrupting your workflow.
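
Steps 1 and 2 of the hook amount to a frontmatter check on each staged file. A minimal sketch of that check, assuming frontmatter is a block delimited by `---` lines at the very top of the file; the function names and the `{ path, content }` input shape are hypothetical:

```javascript
// Assumption: a file "has frontmatter" when its first line is '---'
// and a later line closes the block with another '---'.
function hasFrontmatter(text) {
  const lines = text.replace(/^\uFEFF/, "").split(/\r?\n/);
  if (lines[0].trim() !== "---") return false;
  return lines.slice(1).some((line) => line.trim() === "---");
}

// files: array of { path, content } objects for the staged Markdown files.
function findMissingFrontmatter(files) {
  return files.filter((f) => !hasFrontmatter(f.content)).map((f) => f.path);
}

const staged = [
  { path: "docs/existing.md", content: "---\ntitle: Existing\n---\n# Existing\n" },
  { path: "docs/new-file.md", content: "# New File\n\nSome text.\n" },
];
console.log(findMissingFrontmatter(staged));
```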

Working with the Pre-commit Hook

When committing files without frontmatter, you'll see:

⚠️ The following files are missing frontmatter:
  - docs/new-file.md
  - docs/another-file.md

Do you want to generate frontmatter for these files? (y/n)

If you select "y":

  1. Frontmatter will be generated and applied to the files
  2. Your commit will continue as normal
  3. The generated frontmatter will remain unstaged (in your working directory)
  4. You should review the generated frontmatter, then stage and amend your commit

# After commit completes, review the generated frontmatter
git diff

# Stage only the files that received frontmatter
git add docs/new-file.md docs/another-file.md

# Amend your commit
git commit --amend

This workflow ensures all new content has appropriate frontmatter while giving you a chance to review it.

How the Analysis Works

The frontmatter generation uses several techniques to determine appropriate metadata:

Title Extraction

  1. Looks for the first H1 heading (# Title)
  2. Falls back to first H2 heading if no H1 exists
  3. Uses the filename as a last resort, converting kebab-case to Title Case
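
The fallback chain above can be sketched as a single function. This is an illustration under stated assumptions, not the tool's source: `extractTitle` is a hypothetical name, and the sketch does not skip `#` lines inside code blocks, which a real implementation would likely need to handle.

```javascript
// Hypothetical sketch of the H1 -> H2 -> filename fallback.
function extractTitle(markdown, filename) {
  // First H1: a line starting with a single '#' followed by a space or tab.
  const h1 = markdown.match(/^#[ \t]+(.+)$/m);
  if (h1) return h1[1].trim();

  // Otherwise the first H2.
  const h2 = markdown.match(/^##[ \t]+(.+)$/m);
  if (h2) return h2[1].trim();

  // Last resort: strip the directory and extension, then
  // convert kebab-case to Title Case.
  const base = filename.replace(/^.*[\\/]/, "").replace(/\.mdx?$/i, "");
  return base
    .split("-")
    .filter(Boolean)
    .map((word) => word[0].toUpperCase() + word.slice(1))
    .join(" ");
}

console.log(extractTitle("Plain text only.", "docs/getting-started-guide.md"));
```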

Description Extraction

  1. Finds the first paragraph after headings are removed
  2. Checks that the paragraph is between 50 and 250 characters long
  3. Truncates longer paragraphs to 250 characters
  4. Falls back to a generic description if no suitable paragraph is found
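
One possible reading of these rules, sketched in code: take the first paragraph of at least 50 characters, cut it at 250, and otherwise return a generic string. The names and the fallback text are hypothetical, and whether the real tool truncates at a word boundary is not stated, so this sketch cuts at the character limit.

```javascript
const MIN_LENGTH = 50;
const MAX_LENGTH = 250;
// Hypothetical fallback text; the tool's actual generic description is not documented here.
const FALLBACK_DESCRIPTION = "Documentation page.";

function extractDescription(markdown) {
  const paragraphs = markdown
    .split(/\r?\n/)
    // Blank out heading lines so they never become part of a paragraph.
    .map((line) => (/^#{1,6}[ \t]/.test(line) ? "" : line))
    .join("\n")
    // Paragraphs are separated by one or more blank lines.
    .split(/\n\s*\n/)
    .map((p) => p.replace(/\s+/g, " ").trim())
    .filter((p) => p.length > 0);

  const candidate = paragraphs.find((p) => p.length >= MIN_LENGTH);
  if (!candidate) return FALLBACK_DESCRIPTION;
  return candidate.length > MAX_LENGTH ? candidate.slice(0, MAX_LENGTH) : candidate;
}
```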

Tag Generation

  1. Removes stopwords, code blocks, and non-alphabetic tokens
  2. Uses TF-IDF (Term Frequency-Inverse Document Frequency) to identify important terms
  3. Selects the top 5 most relevant terms as tags
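
A compact sketch of TF-IDF tag scoring along these lines. The stopword list, the smoothed IDF formula, and the function names are assumptions chosen for illustration; the tool's own weighting may differ.

```javascript
// Tiny illustrative stopword list; a real list would be much longer.
const STOPWORDS = new Set(["the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with"]);

function tokenize(markdown) {
  return markdown
    .replace(/`{3}[\s\S]*?`{3}/g, " ") // drop fenced code blocks
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    // Keep purely alphabetic tokens that are not stopwords.
    .filter((t) => /^[a-z]+$/.test(t) && !STOPWORDS.has(t));
}

// doc: the Markdown being tagged; corpus: other documents used for document frequency.
function suggestTags(doc, corpus, limit = 5) {
  const corpusTokens = corpus.map(tokenize);
  const terms = tokenize(doc);

  const counts = new Map();
  for (const t of terms) counts.set(t, (counts.get(t) || 0) + 1);

  const scored = [];
  for (const [term, count] of counts) {
    const df = corpusTokens.filter((tokens) => tokens.includes(term)).length;
    // Smoothed IDF so terms absent from the corpus do not divide by zero.
    const idf = Math.log((1 + corpusTokens.length) / (1 + df)) + 1;
    scored.push([term, (count / terms.length) * idf]);
  }

  return scored
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([term]) => term);
}
```

Terms that appear often in one file but rarely elsewhere score highest, which is why a corpus of other documents is passed in alongside the file being tagged.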

Content Type Detection

  1. First checks file path for obvious indicators (/components/, /tutorials/, etc.)
  2. Searches for type-specific keywords in the content
  3. Analyzes document structure (step-by-step patterns for tutorials, API patterns for reference, etc.)
  4. Assigns a score for each content type and selects the highest
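
The scoring approach can be sketched as a rule table with three signals per type. Every type name, keyword, structural pattern, and weight below is an illustrative assumption (only the `/components/` and `/tutorials/` path indicators come from this guide), and the sketch resolves ties in favor of the first rule listed.

```javascript
// Hypothetical rule table: path indicator, keywords, and a structural pattern per type.
const CONTENT_TYPES = {
  tutorial: { path: "/tutorials/", keywords: ["step", "follow", "learn"], structure: /^\s*\d+\.\s/m },
  component: { path: "/components/", keywords: ["props", "component", "render"], structure: /^#+\s*Props\b/im },
  reference: { path: "/reference/", keywords: ["api", "parameter", "returns"], structure: /^#+\s*(Parameters|Returns)\b/im },
};

function scoreContentTypes(filePath, markdown) {
  const text = markdown.toLowerCase();
  const scores = {};
  for (const [type, rule] of Object.entries(CONTENT_TYPES)) {
    let score = 0;
    if (filePath.includes(rule.path)) score += 10;           // path indicator
    for (const kw of rule.keywords) {
      score += text.split(kw).length - 1;                    // keyword occurrences
    }
    if (rule.structure.test(markdown)) score += 5;           // structural pattern
    scores[type] = score;
  }
  return scores;
}

function detectContentType(filePath, markdown) {
  const scores = scoreContentTypes(filePath, markdown);
  // Highest score wins.
  return Object.entries(scores).sort((a, b) => b[1] - a[1])[0][0];
}
```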

Business Area Classification

  1. First checks file path for area indicators
  2. Counts occurrences of area-specific keywords in the content
  3. Assigns a score for each business area and selects the highest
  4. Defaults to "systems" if no clear classification emerges
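
A sketch of the classification with its default. Only the "systems" default comes from this guide; the other area names, the keyword lists, and the path weight are hypothetical, and "no clear classification" is simplified here to "no area scored above zero".

```javascript
// Hypothetical keyword lists per business area.
const AREA_KEYWORDS = {
  systems: ["server", "deployment", "infrastructure"],
  sales: ["customer", "pricing", "lead"],
  operations: ["schedule", "inventory", "process"],
};
const DEFAULT_AREA = "systems";

// Count whole-word occurrences of a keyword in already-lowercased text.
function countOccurrences(text, word) {
  const matches = text.match(new RegExp(`\\b${word}\\b`, "g"));
  return matches ? matches.length : 0;
}

function classifyBusinessArea(filePath, markdown) {
  const text = markdown.toLowerCase();
  let best = DEFAULT_AREA;
  let bestScore = 0;
  for (const [area, keywords] of Object.entries(AREA_KEYWORDS)) {
    // A matching directory name counts as a strong signal.
    let score = filePath.includes(`/${area}/`) ? 10 : 0;
    for (const kw of keywords) score += countOccurrences(text, kw);
    if (score > bestScore) {
      best = area;
      bestScore = score;
    }
  }
  return best;
}

console.log(classifyBusinessArea("docs/notes.md", "Nothing area-specific here."));
```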

Integration with Validation

The frontmatter generation system works alongside the Frontmatter Validation system:

  1. Use generate-frontmatter to add missing frontmatter to files
  2. Then use validate-frontmatter to check for any remaining issues
  3. Fix any validation errors manually or with validate-frontmatter:fix

This workflow ensures that all documentation has consistent, high-quality metadata.

Best Practices

  • Run generation on new content: Always run the generator on new documentation files
  • Review auto-generated fields: The suggestions come from heuristics (keyword counts, file paths, document structure), so always review the generated frontmatter before committing it
  • Use interactive mode for bulk operations: When applying to many files, use interactive mode to review changes
  • Validate after generation: Always validate frontmatter after automatic generation to catch any issues

By using the automated frontmatter generation system, you can ensure consistent metadata across documentation with minimal manual effort.