Automated Frontmatter Generation
The RallyPoynt OS documentation system includes an automated frontmatter generation tool that can analyze content and suggest appropriate metadata fields. This guide explains how to use this tool to streamline content creation and ensure consistent metadata across documentation files.
What the Tool Does
The frontmatter generation system uses natural language processing and content analysis to automatically:
- Extract titles - From the first heading or file name
- Generate descriptions - From the first suitable paragraph
- Suggest tags - Based on keyword frequency and importance
- Determine content type/level - Based on content structure and keywords
- Classify business area - Based on content analysis
- Suggest sidebar positions - Based on file naming patterns
- Add appropriate specialized fields - Based on content type
The tool can be run in suggestion mode (showing what would be generated) or apply mode (actually updating files).
Using the Frontmatter Generation Tool
Basic Usage
To analyze content files and see what frontmatter would be generated:
npm run generate-frontmatter
This command will:
- Analyze all Markdown files in the docs/ directory
- Show a summary of the analysis
- Generate frontmatter suggestions (but not apply them)
Applying Generated Frontmatter
To analyze content files and apply the generated frontmatter:
npm run generate-frontmatter:apply
This will update all files with appropriate frontmatter. For files that already have frontmatter, existing values are preserved, and only missing fields are added.
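The "preserve existing values, add only missing fields" rule can be pictured as a simple merge. The sketch below is illustrative only: `mergeFrontmatter` is a hypothetical name, and it assumes a field counts as "missing" only when its key is absent (the tool may treat empty values differently).

```javascript
// Hypothetical helper, not the tool's actual code.
// Existing values always win; generated values only fill absent keys.
function mergeFrontmatter(existing, generated) {
  const merged = { ...existing };
  for (const [key, value] of Object.entries(generated)) {
    if (!(key in merged)) {
      merged[key] = value;
    }
  }
  return merged;
}

// The hand-written title is kept; the missing tags field is added.
const result = mergeFrontmatter(
  { title: 'Custom Title' },
  { title: 'Generated Title', tags: ['metadata'] }
);
console.log(result.title, result.tags);
```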
Interactive Mode
If you want to review and approve each file's changes individually:
npm run generate-frontmatter:interactive
This mode will:
- Analyze each file
- Show the proposed frontmatter
- Ask for confirmation before applying changes to each file
Advanced Options
The tool supports additional command-line options for more fine-grained control:
# Process a specific file
node scripts/js/generate-frontmatter.js --file=docs/specific-file.md
# Process files in a specific directory
node scripts/js/generate-frontmatter.js --dir=docs/components
# Show detailed output for each file
node scripts/js/generate-frontmatter.js --verbose
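For readers curious how flags in this `--name=value` style are typically read, here is a minimal sketch. `parseArgs` is a hypothetical function, and the sketch assumes the three flags can be combined in one invocation, which the commands above do not confirm.

```javascript
// Hypothetical parser for the three documented flags; the real script may differ.
function parseArgs(argv) {
  const options = { file: null, dir: null, verbose: false };
  for (const arg of argv) {
    if (arg.startsWith('--file=')) {
      options.file = arg.slice('--file='.length);
    } else if (arg.startsWith('--dir=')) {
      options.dir = arg.slice('--dir='.length);
    } else if (arg === '--verbose') {
      options.verbose = true;
    }
  }
  return options;
}

// In a real script the input would be process.argv.slice(2).
console.log(parseArgs(['--dir=docs/components', '--verbose']));
```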
Git Integration
The frontmatter generation system integrates with Git through a pre-commit hook, which makes it easier to maintain proper metadata.
Pre-commit Hook
When you commit changes, the pre-commit hook:
- Scans all staged Markdown files
- Identifies any files without frontmatter
- Offers to automatically generate frontmatter for those files
- Allows you to continue the commit after generation
This makes it easy to add proper metadata to new documentation files without interrupting your workflow.
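The detection step in the list above could look roughly like the following. This is a sketch under stated assumptions: frontmatter is taken to be a block delimited by `---` lines at the very top of the file, and `hasFrontmatter` and `findFilesMissingFrontmatter` are hypothetical names.

```javascript
// Hypothetical check: does the text open with a '---' delimited block?
function hasFrontmatter(text) {
  const lines = text.replace(/^\uFEFF/, '').split(/\r?\n/);
  if (lines[0].trim() !== '---') return false;
  // A closing delimiter must appear on some later line.
  return lines.slice(1).some((line) => line.trim() === '---');
}

// files: array of { path, content } objects for the staged Markdown files.
function findFilesMissingFrontmatter(files) {
  return files.filter((f) => !hasFrontmatter(f.content)).map((f) => f.path);
}

console.log(
  findFilesMissingFrontmatter([
    { path: 'docs/has-meta.md', content: '---\ntitle: Example\n---\n# Example\n' },
    { path: 'docs/new-file.md', content: '# New File\n' },
  ])
);
```

In an actual hook, the list of staged paths could come from `git diff --cached --name-only --diff-filter=ACM`, filtered to `.md` files.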
Working with the Pre-commit Hook
When committing files without frontmatter, you'll see:
⚠️ The following files are missing frontmatter:
- docs/new-file.md
- docs/another-file.md
Do you want to generate frontmatter for these files? (y/n)
If you select "y":
- Frontmatter will be generated and applied to the files
- Your commit will continue as normal
- The generated frontmatter will remain unstaged (in your working directory)
- You should review the generated frontmatter, then stage and amend your commit
# After commit completes, review the generated frontmatter
git diff
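# Stage only the files that received frontmatter (git add . would also stage unrelated changes)
git add docs/new-file.md docs/another-file.md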
# Amend your commit
git commit --amend
This workflow ensures all new content has appropriate frontmatter while giving you a chance to review it.
How the Analysis Works
The frontmatter generation uses several techniques to determine appropriate metadata:
Title Extraction
- Looks for the first H1 heading (# Title)
- Falls back to the first H2 heading if no H1 exists
- Uses the filename as a last resort, converting kebab-case to Title Case
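The fallback chain above can be sketched in a few lines. `extractTitle` is a hypothetical name, and this simplified version does not skip `#` lines inside code blocks, which a production implementation would likely need to handle.

```javascript
// Hypothetical sketch of the H1 -> H2 -> file name fallback chain.
function extractTitle(markdown, filePath) {
  const h1 = markdown.match(/^#[ \t]+(.+)$/m);
  if (h1) return h1[1].trim();

  const h2 = markdown.match(/^##[ \t]+(.+)$/m);
  if (h2) return h2[1].trim();

  // Last resort: strip directory and extension, then kebab-case -> Title Case.
  const fileName = filePath.split(/[\\/]/).pop();
  const baseName = fileName.replace(/\.[^.]+$/, '');
  return baseName
    .split('-')
    .filter(Boolean)
    .map((word) => word[0].toUpperCase() + word.slice(1))
    .join(' ');
}

console.log(extractTitle('No headings here.', 'docs/getting-started.md'));
```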
Description Extraction
- Finds the first paragraph after headings are removed
- Ensures the paragraph is between 50 and 250 characters long
- Truncates longer paragraphs to 250 characters
- Falls back to a generic description if no suitable paragraph is found
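A minimal sketch of these rules follows. It assumes that paragraphs shorter than 50 characters are skipped rather than padded, and the fallback text is a placeholder because the tool's actual generic description is not given here. `extractDescription` and the constants are hypothetical names.

```javascript
const MIN_LENGTH = 50;
const MAX_LENGTH = 250;
// Placeholder only; the real generic description is not documented here.
const FALLBACK_DESCRIPTION = 'Documentation page.';

function extractDescription(markdown) {
  // Drop heading lines, then split the rest into blank-line separated paragraphs.
  const withoutHeadings = markdown.replace(/^#{1,6}[ \t].*$/gm, '');
  const paragraphs = withoutHeadings
    .split(/\n\s*\n/)
    .map((p) => p.replace(/\s+/g, ' ').trim());

  // Assumption: the first paragraph meeting the minimum length is "suitable".
  const candidate = paragraphs.find((p) => p.length >= MIN_LENGTH);
  if (!candidate) return FALLBACK_DESCRIPTION;
  return candidate.length > MAX_LENGTH ? candidate.slice(0, MAX_LENGTH) : candidate;
}
```

Cutting at exactly 250 characters can split a word; trimming back to the last space would be a reasonable refinement.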
Tag Generation
- Removes stopwords, code blocks, and non-alphabetic tokens
- Uses TF-IDF (Term Frequency-Inverse Document Frequency) to identify important terms
- Selects the top 5 most relevant terms as tags
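To make the TF-IDF step concrete, here is a small self-contained sketch. The stopword list, the smoothed IDF formula, and the names `tokenize` and `suggestTags` are all assumptions for illustration; the tool's actual stopwords and weighting are not documented here.

```javascript
// Illustrative stopword list; the real list is likely much longer.
const STOPWORDS = new Set(['the', 'and', 'for', 'with', 'this', 'that', 'are', 'can']);
const FENCE = '`'.repeat(3);
const CODE_BLOCK = new RegExp(FENCE + '[\\s\\S]*?' + FENCE, 'g');

function tokenize(markdown) {
  return markdown
    .replace(CODE_BLOCK, ' ')              // drop fenced code blocks
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => /^[a-z]+$/.test(t))     // drop non-alphabetic tokens
    .filter((t) => t.length > 2 && !STOPWORDS.has(t));
}

// docText: the file being tagged; corpusTexts: the other documents.
function suggestTags(docText, corpusTexts, limit = 5) {
  const docTokens = tokenize(docText);
  const corpusSets = corpusTexts.map((text) => new Set(tokenize(text)));

  const counts = new Map();
  for (const token of docTokens) counts.set(token, (counts.get(token) || 0) + 1);

  const scored = [...counts.entries()].map(([term, count]) => {
    const docsWithTerm = corpusSets.filter((set) => set.has(term)).length;
    // Smoothed IDF keeps the score finite for terms absent from the corpus.
    const idf = Math.log((1 + corpusSets.length) / (1 + docsWithTerm)) + 1;
    return { term, score: (count / docTokens.length) * idf };
  });

  return scored
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.term);
}
```

Terms that are frequent in one file but rare across the rest of the documentation score highest, which is why TF-IDF tends to surface topic words rather than words common to every page.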
Content Type Detection
- First checks file path for obvious indicators (/components/, /tutorials/, etc.)
- Searches for type-specific keywords in the content
- Analyzes document structure (step-by-step patterns for tutorials, API patterns for reference, etc.)
- Assigns a score for each content type and selects the highest
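The scoring approach can be sketched as follows. Everything in `TYPE_RULES` is hypothetical: the type names, keywords, structure patterns, the `/reference/` path, and the weight of 5 for a path match are assumptions chosen for illustration, not the tool's real configuration.

```javascript
// Hypothetical rules; only /components/ and /tutorials/ appear in this guide.
const TYPE_RULES = [
  { type: 'component', pathHint: '/components/', keywords: ['props', 'component'], structure: null },
  { type: 'tutorial', pathHint: '/tutorials/', keywords: ['step', 'walkthrough'], structure: /^\s*\d+\.\s/gm },
  { type: 'reference', pathHint: '/reference/', keywords: ['parameter', 'returns'], structure: /^#+\s.*\(.*\)/gm },
];
const PATH_MATCH_WEIGHT = 5; // assumed weight

function countMatches(text, pattern) {
  const matches = text.match(pattern);
  return matches ? matches.length : 0;
}

function detectContentType(filePath, content) {
  const lowered = content.toLowerCase();
  let best = { type: null, score: 0 };
  for (const rule of TYPE_RULES) {
    let score = filePath.includes(rule.pathHint) ? PATH_MATCH_WEIGHT : 0;
    for (const keyword of rule.keywords) {
      score += countMatches(lowered, new RegExp(`\\b${keyword}\\b`, 'g'));
    }
    if (rule.structure) score += countMatches(content, rule.structure);
    // Keep the highest-scoring type; earlier rules win ties.
    if (score > best.score) best = { type: rule.type, score };
  }
  return best;
}
```

Returning the score along with the type makes it possible to flag low-confidence guesses for manual review.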
Business Area Classification
- First checks file path for area indicators
- Counts occurrences of area-specific keywords in the content
- Assigns a score for each business area and selects the highest
- Defaults to "systems" if no clear classification emerges
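The default behavior is the interesting part of this step, so the sketch below focuses on it. It interprets "no clear classification" as either a zero top score or a tie for first place, which is an assumption. The area names other than "systems", the keywords, and the path weight are hypothetical.

```javascript
// Hypothetical areas and keywords; only "systems" is named in this guide.
const AREA_RULES = {
  finance: { pathHint: '/finance/', keywords: ['invoice', 'budget'] },
  operations: { pathHint: '/operations/', keywords: ['schedule', 'logistics'] },
  systems: { pathHint: '/systems/', keywords: ['server', 'deployment'] },
};
const DEFAULT_AREA = 'systems';
const PATH_WEIGHT = 5; // assumed weight for a path match

function classifyBusinessArea(filePath, content) {
  const words = content.toLowerCase().split(/[^a-z]+/);
  const ranked = Object.entries(AREA_RULES)
    .map(([area, rule]) => {
      const pathScore = filePath.includes(rule.pathHint) ? PATH_WEIGHT : 0;
      const keywordScore = words.filter((w) => rule.keywords.includes(w)).length;
      return { area, score: pathScore + keywordScore };
    })
    .sort((a, b) => b.score - a.score);

  // Assumption: a zero score or a tie for first place is "no clear classification".
  const [top, runnerUp] = ranked;
  const isClear = top.score > 0 && top.score > runnerUp.score;
  return isClear ? top.area : DEFAULT_AREA;
}
```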
Integration with Validation
The frontmatter generation system works alongside the Frontmatter Validation system:
- Use generate-frontmatter to add missing frontmatter to files
- Then use validate-frontmatter to check for any remaining issues
- Fix any validation errors manually or with validate-frontmatter:fix
This workflow ensures that all documentation has consistent, high-quality metadata.
Best Practices
- Run generation on new content: Always run the generator on new documentation files
- Review auto-generated fields: The suggestions come from heuristics (keyword counts, file paths, document structure), so always review the generated frontmatter before committing it
- Use interactive mode for bulk operations: When applying to many files, use interactive mode to review changes
- Validate after generation: Always validate frontmatter after automatic generation to catch any issues
By using the automated frontmatter generation system, you can ensure consistent metadata across documentation with minimal manual effort.