docs/config-reference.md

Config Reference

Issue 10-019: Documentation for config.lua structure and field usage.

Overview

All configuration lives in config.lua at the project root. It uses Lua table syntax and is loaded via libs/config-loader.lua:

local config_loader = require("config-loader")
config_loader.set_project_root(DIR)
local config = config_loader.load()

Each section is wrapped in Vim fold markers (-- {{{ section_name to open, -- }}} to close) for easy navigation in editors.
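
As a rough illustration of that layout, a config.lua skeleton might look like the following. That the file returns a single table, and the exact nesting shown, are assumptions rather than confirmed details of config-loader.lua; all values are placeholders.

```lua
-- config.lua (hypothetical skeleton; values are placeholders)
return {
    -- {{{ asset_paths
    asset_paths = {
        assets_root = "/absolute/path/to/assets",
    },
    -- }}}

    -- {{{ privacy
    privacy = {
        mode = "clean",
    },
    -- }}}
}
```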


Section Reference

asset_paths

Read by: Most scripts
Priority: Required

Root directory for all generated assets: embeddings, caches, indexes.

| Field | Type | Description |
| --- | --- | --- |
| assets_root | string | Absolute path to assets directory |

layout

Read by: src/flat-html-generator.lua:load_layout_from_config()
Priority: Low (sensible defaults exist)

Controls the visual appearance of poem boxes in generated HTML. All values are in characters.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| regular_poem_width | number | 83 | Width of standard poem boxes |
| golden_poem_width | number | 85 | Width of golden poem boxes (1024 chars) |
| text_content_width | number | 80 | Inner content area width |
| left_box_width | number | 11 | Left navigation box width |
| right_box_width | number | 13 | Right navigation box width |
| gap_width | number | 59 | Gap between left and right boxes |
| left_junction_pos | number | 5 | Position of left box junction point |
| right_junction_pos | number | 6 | Position of right box junction point |
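
One thing worth noticing about the defaults above: the left box, gap, and right box widths sum to regular_poem_width (11 + 59 + 13 = 83). The sketch below checks that relationship and shows one way defaults could be merged with a partial layout table. The helper name merge_layout is hypothetical, and whether load_layout_from_config() relies on or enforces this sum is an assumption.

```lua
-- Defaults copied from the layout table.
local LAYOUT_DEFAULTS = {
    regular_poem_width = 83,
    golden_poem_width = 85,
    text_content_width = 80,
    left_box_width = 11,
    right_box_width = 13,
    gap_width = 59,
    left_junction_pos = 5,
    right_junction_pos = 6,
}

-- Hypothetical helper: fill in any field the user left out.
local function merge_layout(user_layout)
    local merged = {}
    for key, default in pairs(LAYOUT_DEFAULTS) do
        local value = user_layout and user_layout[key]
        if value == nil then value = default end
        merged[key] = value
    end
    return merged
end

local layout = merge_layout({ gap_width = 59 })
local nav_row = layout.left_box_width + layout.gap_width + layout.right_box_width
print(nav_row == layout.regular_poem_width)  -- true with the defaults
```

If one of the three widths is changed, adjusting another so the row still adds up is probably the safer choice.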

sources

Read by: libs/sources-loader.lua, all extractors
Priority: Critical

Unified input source configuration. Each source type supports multiple named directories.

Source Types

| Source | Format | Parser | Description |
| --- | --- | --- | --- |
| fediverse | activitypub | Mastodon outbox.json | ActivityPub archives |
| messages | messages_export | export.json | Matrix message exports |
| notes | plaintext | .txt/.md files | Local notes directory |
| bluesky | atproto | CAR files | Bluesky exports |
| images | (special) | File scanner | Image directories |

Common Source Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| enabled | boolean | No (default: true) | Skip this source entirely if false |
| format | string | Yes | Parser to use (see table above) |
| directories | array | Yes | One or more input directories |

Directory Entry Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Human-readable identifier for logs |
| path | string | Yes | Relative to project root, or absolute |
| optional | boolean | No (default: false) | Missing = warning, not error |
| description | string | No | Your notes, unused by code |
| external.source | string | No | Path to rsync from |
| randomize_order | boolean | No | Scatter images randomly in timeline |
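
To show how the common source fields and directory entry fields fit together, here is a sketch of a notes source with two directories. The directory names and paths are placeholders, and wrapping the entry in a local sources table is only there to make the fragment valid Lua on its own.

```lua
-- Hypothetical sources entry; names and paths are placeholders.
local sources = {
    notes = {
        enabled = true,
        format = "plaintext",
        directories = {
            { name = "main-notes", path = "input/notes" },
            {
                name = "old-notes",
                path = "/absolute/path/to/old-notes",
                optional = true,  -- a missing directory is a warning, not an error
                description = "Archived notebook, may not exist on every machine",
            },
        },
    },
}
```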

Archive Entry Fields (ZIP files)

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Identifier for logs |
| source | string | Yes | Absolute path to ZIP file |
| extract_to | string | Yes | Destination (relative to project) |

Image Source Special Fields

| Field | Type | Description |
| --- | --- | --- |
| supported_formats | array | File extensions to include |
| max_file_size_mb | number | Skip files larger than this |
| preserve_structure | boolean | Keep directory hierarchy in output |
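
A sketch of an image source using the special fields above. Placing these fields at the source level, writing extensions without a leading dot, and the specific values are all assumptions. The format value for images is listed only as (special) in the source types table, so it is left out here.

```lua
-- Hypothetical images entry; values are placeholders.
local sources = {
    images = {
        enabled = true,
        supported_formats = { "png", "jpg", "gif" },  -- assumed: no leading dot
        max_file_size_mb = 10,
        preserve_structure = true,
        directories = {
            {
                name = "photos",
                path = "input/images",
                randomize_order = true,  -- scatter these images in the timeline
            },
        },
    },
}
```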

extraction

Read by: Extraction scripts
Priority: Low (all enabled by default)

Controls which input sources are processed during extraction.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enable_fediverse | boolean | true | Process ActivityPub data |
| enable_messages | boolean | true | Process message exports |
| enable_notes | boolean | true | Process plaintext notes |
| enable_bluesky | boolean | true | Process Bluesky data |
| ignored_archives | array | [] | ZIP filename stems to skip |
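
Since ignored_archives holds filename stems rather than full paths, the sketch below shows what matching on a stem could look like. The helper names zip_stem and is_ignored are hypothetical, and the way the real extraction scripts compare names is an assumption.

```lua
local extraction = {
    enable_fediverse = true,
    enable_messages = true,
    enable_notes = true,
    enable_bluesky = false,
    ignored_archives = { "old-backup-2019" },  -- placeholder stem
}

-- Hypothetical helper: "exports/old-backup-2019.zip" -> "old-backup-2019"
local function zip_stem(path)
    local filename = path:match("([^/]+)$") or path
    -- Parentheses drop gsub's second return value (the substitution count).
    return (filename:gsub("%.[Zz][Ii][Pp]$", ""))
end

local function is_ignored(path, ignored)
    local stem = zip_stem(path)
    for _, name in ipairs(ignored) do
        if name == stem then return true end
    end
    return false
end

print(is_ignored("exports/old-backup-2019.zip", extraction.ignored_archives))  -- true
print(is_ignored("exports/current.zip", extraction.ignored_archives))          -- false
```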

excluded_poems

Read by: libs/exclusion-filter.lua
Priority: Optional

Poems to exclude from the collection during extraction. Excluded poems leave gaps in the ID sequence (tombstoning): the IDs of other poems are not shifted down, so existing anchor links stay stable.

ID Formats by Category

| Category | ID Format | Example |
| --- | --- | --- |
| fediverse | Numeric post ID from ActivityPub | "113847291038475" |
| fediverse_boost | Sequential boost number | "0003" |
| notes | Filename without extension | "what-a-lame-movie" |
| messages | Numeric message index | "42" |
| bluesky | AT Protocol record key | "3k...abc" |
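
The tombstoning behavior can be illustrated with a small sketch: IDs are assigned before filtering, so an excluded poem still consumes its number. The function name and data shapes here are hypothetical and not taken from libs/exclusion-filter.lua.

```lua
-- Hypothetical data: poems in extraction order, each with a source ID.
local poems = {
    { source_id = "first-note" },
    { source_id = "embarrassing-draft" },
    { source_id = "third-note" },
}
local excluded = { ["embarrassing-draft"] = true }

-- Assign IDs from the original position, then skip excluded poems.
local function assign_ids_with_tombstones(all_poems, excluded_set)
    local kept = {}
    for index, poem in ipairs(all_poems) do
        if not excluded_set[poem.source_id] then
            kept[#kept + 1] = { id = index, source_id = poem.source_id }
        end
    end
    return kept
end

local kept = assign_ids_with_tombstones(poems, excluded)
for _, poem in ipairs(kept) do
    print(poem.id, poem.source_id)  -- IDs 1 and 3; ID 2 is left as a gap
end
```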

privacy

Read by: Extraction scripts
Priority: Critical for public deployment

Anonymization settings. In "clean" mode, usernames are replaced with sequential identifiers.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| mode | string | "clean" | "clean" (anonymize) or "raw" (preserve) |
| anonymization_prefix | string | "user-" | Prefix for anonymized usernames |
| include_boosts | boolean | false | Include boosted/reblogged posts |
| preserve_original_length | boolean | true | Keep length hints for anonymized names |
| store_anonymization_map | boolean | false | Store username→anonymous mapping |
| local_server_domain | string | - | Your home instance domain |

CLI overrides: --include-boosts, --no-boosts
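
As a minimal sketch of what "sequential identifiers" could mean in "clean" mode, the code below maps each distinct username to the prefix plus a counter. The exact identifier format and the make_anonymizer helper are assumptions, and preserve_original_length is not modeled here.

```lua
local privacy = { mode = "clean", anonymization_prefix = "user-" }

-- Hypothetical anonymizer: the same username always maps to the same identifier.
local function make_anonymizer(settings)
    local map, count = {}, 0
    return function(username)
        if settings.mode == "raw" then return username end
        if not map[username] then
            count = count + 1
            map[username] = settings.anonymization_prefix .. count
        end
        return map[username]
    end
end

local anonymize = make_anonymizer(privacy)
print(anonymize("alice@example.social"))  -- user-1
print(anonymize("bob@example.social"))    -- user-2
print(anonymize("alice@example.social"))  -- user-1 again
```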


golden_poems

Read by: src/html-generator/golden-poem-bonus.lua
Priority: Low

"Golden poems" are exactly 1024 characters - Mastodon's limit. These get a similarity bonus.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enable_golden_prioritization | boolean | true | Apply golden poem bonuses |
| golden_poem_pair_bonus | number | 0.05 | Bonus when both poems are golden |
| golden_poem_single_bonus | number | 0.02 | Bonus when one poem is golden |
| golden_bonus_threshold | number | 0.1 | Maximum bonus cap |
| min_golden_recommendations | number | 2 | Minimum golden poems per page |
| max_golden_recommendations | number | 5 | Maximum golden poems per page |
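
The bonus fields above could combine roughly as follows. This is a sketch under assumptions: the golden_bonus function is hypothetical, and how golden-poem-bonus.lua actually applies the result to a similarity score is not shown in this document.

```lua
local golden = {
    enable_golden_prioritization = true,
    golden_poem_pair_bonus = 0.05,
    golden_poem_single_bonus = 0.02,
    golden_bonus_threshold = 0.1,
}

local GOLDEN_LENGTH = 1024

-- utf8.len counts characters (Lua 5.3+); fall back to byte length otherwise.
local function char_length(text)
    if utf8 then return utf8.len(text) end
    return #text
end

local function is_golden(text)
    return char_length(text) == GOLDEN_LENGTH
end

-- Hypothetical rule: pair bonus if both are golden, single bonus if only one is.
local function golden_bonus(text_a, text_b, settings)
    if not settings.enable_golden_prioritization then return 0 end
    local a, b = is_golden(text_a), is_golden(text_b)
    local bonus = 0
    if a and b then
        bonus = settings.golden_poem_pair_bonus
    elseif a or b then
        bonus = settings.golden_poem_single_bonus
    end
    return math.min(bonus, settings.golden_bonus_threshold)
end

local golden_text = string.rep("a", 1024)
print(golden_bonus(golden_text, golden_text, golden))  -- 0.05
print(golden_bonus(golden_text, "short", golden))      -- 0.02
```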

semantic_colors

Read by: src/semantic-color-calculator.lua
Priority: Low (change for visual customization)

Colors for semantic clustering visualization. Each poem is assigned a color based on its embedding cluster.

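| Color | RGB | Hex |
| --- | --- | --- |
| red | (220, 60, 60) | #dc3c3c |
| blue | (60, 120, 220) | #3c78dc |
| green | (60, 180, 90) | #3cb45a |
| purple | (140, 60, 200) | #8c3cc8 |
| orange | (230, 140, 60) | #e68c3c |
| yellow | (200, 180, 40) | #c8b428 |
| gray | (120, 120, 120) | #787878 |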

The color_names array defines the iteration order, which keeps page generation deterministic.
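
The reason an explicit name array helps is that Lua's pairs() gives no guaranteed order for string keys, while ipairs() over an array does. The sketch below shows that, along with the RGB-to-hex conversion behind the table. The table layout and the order of color_names are assumptions; only the RGB values come from this document.

```lua
-- RGB values copied from the color table; the table layout is an assumption.
local colors = {
    red    = { 220, 60, 60 },
    blue   = { 60, 120, 220 },
    green  = { 60, 180, 90 },
    purple = { 140, 60, 200 },
    orange = { 230, 140, 60 },
    yellow = { 200, 180, 40 },
    gray   = { 120, 120, 120 },
}
local color_names = { "red", "blue", "green", "purple", "orange", "yellow", "gray" }

local function rgb_to_hex(rgb)
    return string.format("#%02x%02x%02x", rgb[1], rgb[2], rgb[3])
end

-- Iterate the array, not the keyed table, to get the same order on every run.
for _, name in ipairs(color_names) do
    print(name, rgb_to_hex(colors[name]))
end
```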


similarity

Read by: src/similarity-calculator.lua
Priority: Low

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| default_algorithm | string | "cosine" | Similarity algorithm |

Available algorithms: cosine, euclidean, manhattan, angular, pearson_correlation
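
For reference, the default "cosine" option corresponds to the standard cosine similarity between two embedding vectors. The sketch below is the textbook definition, not necessarily how src/similarity-calculator.lua implements it. Returning 0 for a zero-length vector is a choice made here for illustration.

```lua
-- Cosine similarity between two equal-length embedding vectors.
local function cosine_similarity(a, b)
    assert(#a == #b, "vectors must have the same length")
    local dot, norm_a, norm_b = 0, 0, 0
    for i = 1, #a do
        dot = dot + a[i] * b[i]
        norm_a = norm_a + a[i] * a[i]
        norm_b = norm_b + b[i] * b[i]
    end
    -- Avoid dividing by zero when either vector has no magnitude.
    if norm_a == 0 or norm_b == 0 then return 0 end
    return dot / (math.sqrt(norm_a) * math.sqrt(norm_b))
end

print(cosine_similarity({ 1, 0 }, { 1, 0 }))  -- same direction
print(cosine_similarity({ 1, 0 }, { 0, 1 }))  -- orthogonal
```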


ollama_servers

Read by: libs/ollama-config.lua
Priority: Required for embedding generation

Multi-server Ollama configuration for embedding generation.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Label for TUI and --ollama flag |
| description | string | No | Human-readable description |
| host | string | Yes | Server hostname or IP |
| port | number | Yes | Ollama API port (default: 11434) |
| model | string | Yes | Default embedding model |
| available_models | array | No | List of models on this server |

default_ollama_server specifies which server to use by default.

CLI overrides: --ollama NAME, --model NAME, --list-ollama
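
A sketch of a two-server setup and of how a name given on the command line might take precedence over default_ollama_server. It assumes ollama_servers is an array of entries; the select_server helper, hosts, and model name are hypothetical placeholders.

```lua
-- Hypothetical values; ollama_servers is assumed to be an array of entries.
local ollama_servers = {
    {
        name = "local",
        description = "Ollama on this machine",
        host = "127.0.0.1",
        port = 11434,
        model = "example-embedding-model",  -- placeholder model name
    },
    {
        name = "workstation",
        host = "192.168.1.50",
        port = 11434,
        model = "example-embedding-model",
        available_models = { "example-embedding-model" },
    },
}
local default_ollama_server = "local"

-- Hypothetical lookup: an explicit name (e.g. from --ollama) wins over the default.
local function select_server(servers, default_name, cli_name)
    local wanted = cli_name or default_name
    for _, server in ipairs(servers) do
        if server.name == wanted then return server end
    end
    return nil, "unknown Ollama server: " .. tostring(wanted)
end

local server = select_server(ollama_servers, default_ollama_server, "workstation")
print(string.format("http://%s:%d", server.host, server.port))
```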


pagination

Read by: src/flat-html-generator.lua:load_pagination_config()
Priority: Medium

Controls how poems are split across HTML pages.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| poems_per_page | number | 200 | Poems per similar/different page |
| minimum_pages | number | 1 | Minimum pages to generate |
| max_pages_per_poem | number | 15 | Maximum pages per poem |
| page_number_padding | number | 2 | Zero-padding (01, 02...) |
| generate_txt_exports | boolean | true | Generate .txt versions |
| chronological_paginated | boolean | false | Split chronological.html |
| chronological_poems_per_page | number | 1000 | Poems per chrono page |

CLI overrides: --poems-per-page, --chrono-per-page, --pages
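
To make the interaction between these fields concrete, here is a sketch of a page-count calculation and zero-padded page numbers. The clamping order (minimum first, then maximum) and both helper names are assumptions about behavior, not a description of load_pagination_config().

```lua
local pagination = {
    poems_per_page = 200,
    minimum_pages = 1,
    max_pages_per_poem = 15,
    page_number_padding = 2,
}

-- Hypothetical: ceil(count / per_page), clamped to the configured min and max.
local function page_count(poem_count, settings)
    local pages = math.ceil(poem_count / settings.poems_per_page)
    pages = math.max(pages, settings.minimum_pages)
    return math.min(pages, settings.max_pages_per_poem)
end

-- Builds a format such as "%02d" from the configured padding width.
local function padded_page_number(page, settings)
    return string.format("%0" .. settings.page_number_padding .. "d", page)
end

print(page_count(450, pagination))        -- 3
print(page_count(10000, pagination))      -- 15 (capped by max_pages_per_poem)
print(padded_page_number(3, pagination))  -- 03
```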


storage

Read by: src/flat-html-generator.lua:load_pagination_config()
Priority: Low

Budget planning for Neocities deployment.

| Field | Type | Description |
| --- | --- | --- |
| limit_gb | number | Total available storage |
| reserved_for_maze_gb | number | Reserved for HTML Maze feature |
| reserved_headroom_gb | number | Safety buffer |
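
One plausible reading of these fields is that the space left for generated pages is the limit minus both reservations. The sketch below assumes exactly that; the numbers are placeholders rather than real defaults, and available_gb is a hypothetical helper.

```lua
-- Placeholder numbers, not real defaults.
local storage = {
    limit_gb = 50,
    reserved_for_maze_gb = 10,
    reserved_headroom_gb = 5,
}

-- Hypothetical: space left for generated pages after both reservations.
local function available_gb(settings)
    return settings.limit_gb
        - settings.reserved_for_maze_gb
        - settings.reserved_headroom_gb
end

print(available_gb(storage))  -- 35
```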

centroids

Read by: src/centroid-generator.lua
Priority: Medium (for mood-based exploration)

Mood-based exploration anchors. Each centroid defines a "semantic target" using keywords.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Internal identifier |
| description | string | No | Human-readable description |
| source_files | array | No | Files to include in centroid |
| keywords | array | Yes | Evocative phrases for embedding |
| output_slug | string | Yes | URL-friendly identifier |

Example: Adding a new mood centroid:

{
    name = "nostalgia",
    description = "Bittersweet memories of the past",
    source_files = {},
    keywords = {
        "childhood memories",
        "old photographs",
        "places that no longer exist",
        "the way things used to be"
    },
    output_slug = "nostalgia"
}
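
Before adding a centroid, it can help to check the required fields from the table above. The validator below is a hypothetical sketch, not part of src/centroid-generator.lua. Its notion of "URL-friendly" (lowercase letters, digits, hyphens) is an assumption.

```lua
local REQUIRED_CENTROID_FIELDS = { "name", "keywords", "output_slug" }

-- Hypothetical validator; returns true, or false plus a list of problems.
local function validate_centroid(centroid)
    local problems = {}
    for _, field in ipairs(REQUIRED_CENTROID_FIELDS) do
        if centroid[field] == nil then
            problems[#problems + 1] = "missing required field: " .. field
        end
    end
    if type(centroid.keywords) == "table" and #centroid.keywords == 0 then
        problems[#problems + 1] = "keywords must not be empty"
    end
    if type(centroid.output_slug) == "string"
        and not centroid.output_slug:match("^[a-z0-9%-]+$") then
        problems[#problems + 1] = "output_slug should be URL-friendly"
    end
    return #problems == 0, problems
end

local ok, problems = validate_centroid({
    name = "nostalgia",
    keywords = { "childhood memories" },
    output_slug = "nostalgia",
})
print(ok, #problems)  -- valid entry: true and zero problems
```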

word_cloud

Read by: src/wordcloud-generator.lua
Priority: Low

Word cloud page settings.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Generate word cloud page |
| output_file | string | "wordcloud.html" | Output filename |
| min_occurrences | number | 5 | Minimum word frequency |
| max_words | number | 200 | Maximum words (0 = unlimited) |
| min_word_length | number | 3 | Ignore shorter words |
| font_size_min | number | 1 | Minimum font tag size |
| font_size_max | number | 7 | Maximum font tag size |
| stop_words | array | [...] | Common words to exclude |
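
The filtering and sizing settings might work together along these lines. Both helpers are hypothetical, and the linear mapping from frequency to font size is an assumption; the real generator may scale differently (for example, logarithmically).

```lua
local word_cloud = {
    min_occurrences = 5,
    min_word_length = 3,
    font_size_min = 1,
    font_size_max = 7,
    stop_words = { "the", "and" },  -- placeholder list
}

-- Hypothetical filter combining the three exclusion settings.
local function keep_word(word, count, settings)
    if count < settings.min_occurrences then return false end
    if #word < settings.min_word_length then return false end
    for _, stop in ipairs(settings.stop_words) do
        if stop == word then return false end
    end
    return true
end

-- Hypothetical linear scaling from frequency to a font size.
local function font_size(count, min_count, max_count, settings)
    if max_count == min_count then return settings.font_size_min end
    local ratio = (count - min_count) / (max_count - min_count)
    local span = settings.font_size_max - settings.font_size_min
    return settings.font_size_min + math.floor(ratio * span + 0.5)
end

print(keep_word("the", 120, word_cloud))   -- false (stop word)
print(keep_word("river", 8, word_cloud))   -- true
print(font_size(5, 5, 105, word_cloud))    -- 1
print(font_size(105, 5, 105, word_cloud))  -- 7
```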

html_theme

Read by: HTML generators
Priority: Low (cosmetic)

Dark mode theme colors applied via HTML body attributes.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| background | string | "#000000" | Background color (OLED-friendly) |
| text | string | "#FFFFFF" | Text color |
| link | string | "#6699FF" | Unvisited link color |
| vlink | string | "#9966FF" | Visited link color |
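
Since the theme is applied through body attributes, the defaults would presumably end up in an opening tag like the one built below. The body_open_tag helper is hypothetical, and mapping the background field to the bgcolor attribute is an assumption; text, link, and vlink share their names with the legacy HTML attributes.

```lua
local html_theme = {
    background = "#000000",
    text = "#FFFFFF",
    link = "#6699FF",
    vlink = "#9966FF",
}

-- Hypothetical helper: render the theme as legacy <body> presentation attributes.
local function body_open_tag(theme)
    return string.format('<body bgcolor="%s" text="%s" link="%s" vlink="%s">',
        theme.background, theme.text, theme.link, theme.vlink)
end

print(body_open_tag(html_theme))
-- <body bgcolor="#000000" text="#FFFFFF" link="#6699FF" vlink="#9966FF">
```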

Common Customization Tasks

Adding a New Poem Source

  1. Add entry to sources:
my_source = {
    enabled = true,
    format = "plaintext",  -- or appropriate format
    directories = {
        { name = "primary", path = "input/my-source" }
    }
}
  2. Add to extraction if needed:
enable_my_source = true
  3. Create extractor script if format requires custom parsing.

Adding an External Sync

Add external field to directory entry:

{
    name = "my-dir",
    path = "input/my-source",
    external = {
        source = "/home/user/original-location"
    }
}

Excluding a Poem

Add ID to appropriate category in excluded_poems:

excluded_poems = {
    fediverse = { "113847291038475" },
    notes = { "embarrassing-draft" }
}

Changing Embedding Server

Either:

  1. Set default_ollama_server in config
  2. Use --ollama server-name CLI flag

Deprecated Sections

| Section | Status | Replacement |
| --- | --- | --- |
| external_files | Deprecated | Use sources[].directories[].external |
| image_sync | Removed | Use sources.images |
| input_sources | Removed | Use sources |

Last updated: 2026-04-06 (Issue 10-019)