diff --git a/CHANGELOG.adoc b/CHANGELOG.adoc index 40af30364bcb198bca27592f0f2aa5f0745b5e09..df3107604a486b9411e2fea4ae6f0616d3635964 100644 --- a/CHANGELOG.adoc +++ b/CHANGELOG.adoc @@ -4,6 +4,17 @@ This is a summary of all notable changes to the Antora Collector Extension by release. For a detailed view of what's changed, refer to the {url-repo}/commits[commit history] of this project. +== Unreleased + +=== Added + +* add an optional `clean` configuration key on each collector entry to specify a directory to clean (i.e., remove) (#10) + +=== Changed + +* don't remove the scan dir unless the `clean: true` is specified on the scan entry (#10) +* process cleans per collector entry rather than before running collector on the origin (#10) + == 1.0.0-alpha.3 (2022-11-12) === Changed diff --git a/docs/modules/ROOT/pages/configuration-keys.adoc b/docs/modules/ROOT/pages/configuration-keys.adoc index fbebf45df8a7369e099cb422b697a4b13c128103..05930740a98e60a15b62c3ef7a01b84a985d61ab 100644 --- a/docs/modules/ROOT/pages/configuration-keys.adoc +++ b/docs/modules/ROOT/pages/configuration-keys.adoc @@ -6,8 +6,12 @@ The `ext` key is the designated area in the component version descriptor for ext [#collector-key] == collector key -The external commands the extension runs and scan settings are defined under the `collector` key nested under the `ext` key in an [.path]_antora.yml_ file. -These commands and settings are applied to a git reference when Antora visits the content roots of the content sources defined in the playbook to produce a content aggregate. +The `collector` key in the component version descriptor ([.path]_antora.yml_) defines the configuration for the clean, run, and scan steps that are carried out on the origin (a git reference) in which the component version descriptor is found. +All three steps are optional. +The settings for these steps are defined using keys of the same names under the `collector` key. 
+The extension runs the steps in aforementioned order, so it's typical to specify the keys in this order as well. + +Here's an example of a component version descriptor that configures the collector extension. .antora.yml [,yaml] @@ -19,7 +23,9 @@ nav: - modules/ROOT/nav.adoc ext: collector: # <.> - - run: # <.> + - clean: # <.> + dir: build/generated # <.> + run: # <.> command: ./gradlew --console rich generateContent # <.> scan: # <.> - dir: build/generated # <.> @@ -28,13 +34,55 @@ ext: dir: artifacts ---- <.> `collector` key +<.> `clean` key +<.> `dir` key that specifies the directory to clean <.> `run` key -<.> `command` key +<.> `command` key that specifies the command to run <.> `scan` key -<.> `dir` key -<.> `files` key +<.> `dir` key that specifies the directory to scan +<.> `files` key that provides a pattern of files to scan + +The `collector` key accepts an array (i.e., list) of collector entries. +If there's only a single entry, the array can be replaced by a map for a single entry (i.e., no preceding hyphen). + +If the `collector` key isn't set in a component version descriptor, or its value is falsy, the extension won't run on that git reference. + +[#clean-key] +== clean key -If `collector` isn't set in a component version descriptor, or its value is falsy, the Collector extension won't run on that git reference. +The `clean` key must be nested under the `collector` key. +The `clean` key accepts a list of built-in key-value pairs that configure the directory (`dir`) to clean. + +.antora.yml +[,yaml] +---- +name: colorado +title: Colorado +version: '5.6.0' +ext: + collector: + - clean: + dir: build/generated + run: + command: ./gradlew --console rich generateContent + scan: + - dir: build/generated + files: '**/*.adoc' + - dir: build/log + - scan: + dir: artifacts +---- + +The `clean` key accepts an array of items (i.e., list) that each have a `dir` key. +Each `clean` entry is invoked sequentially in the order specified in the array. 
+If there's only a single entry, the array can be replaced by a map for a single entry. + +If the directory to clean is the same as the directory to scan, the clean entry can be created implicitly by setting the `clean` key on the scan entry to `true` (e.g., `clean: true`). +See <>. + +Note that when running collector on a content source without a worktree (such as a remote repository), the worktree will always start out in a clean state. +When the content source is local and has a worktree, the worktree may already contain files that are untracked or ignored by git. +In this case, the first clean entry can be important to ensuring a predictable result. [#run-key] == run key @@ -54,6 +102,7 @@ ext: command: ./gradlew --console rich generateContent scan: dir: build/generated + clean: true - run: command: stats --validate package scan: @@ -66,7 +115,7 @@ Each `run` key is invoked sequentially in the order specified in the array. [#scan-key] == scan key -The `scan` key can be nested under the `collector` key or under a `run` key. +The `scan` key can be nested under the `collector` key if there's only a single collector entry, or in an item in the array of entries on the `collector` key. The `scan` key accepts a list of built-in key-value pairs that configure the scan directory (`dir`) and file filter (`files`). .antora.yml @@ -82,20 +131,18 @@ ext: scan: - dir: build/generated files: '**/*.adoc' + clean: true - dir: build/log + clean: true - scan: dir: artifacts ---- -The `scan` key can be specified multiple times. -Each `scan` key is invoked sequentially in the order specified in the array. -When a `scan` key is set directly under `collector` (not nested under `run`), it isn't associated with the outcome of an external command. +The `scan` key accepts an array of items (i.e., list) that have the nested keys. +Each `scan` entry is invoked sequentially in the order specified in the array. 
+If there's only a single entry, the array can be replaced by a map of a single entry. -CAUTION: The current assumption is that the scan directories are generated by a run command. -When using a local content source, the scan directories are removed before any commands are run. -That means that if you point a scan at a directory that already exists, it will be removed. -This behavior is under review and will likely be changed or made configurable in the future. -As a workaround, a run command can be used to copy files from a permanent directory to a temporary one for scanning. +If the `clean: true` key is set on the entry, that implicitly creates a clean entry with the same dir. [#collector-reference] == Available Collector keys @@ -106,11 +153,19 @@ The table below lists the keys that can be defined in a component version descri |=== |Collector Keys |Description |Default |Values +|`clean.dir` +|Defines the location of a directory to clean before processing the rest of the collector entry (i.e., before run and scan). +The value of the `dir` key is a string path for a single directory. +If the value is `.` or starts with `./`, it's resolved starting from the start path. +Otherwise, the path is resolved relative to the worktree. +|Not set +|Path relative to the content root + |`run.command` |Specifies a command to run for the current Antora content root. The command is run from the root of the worktree for the git reference unless the `dir` key is specified. The `dir` key specifies an alternate working directory (e.g., `scripts`). -If the value is `.` or starts with `./`, the value is resolved from the start path. +If the value is `.` or starts with `./`, the value is resolved starting from the start path. The commands are invoked sequentially in the order specified in this array. If the value of `collector` is `false`, no commands will be invoked. |Not set @@ -118,18 +173,17 @@ If the value of `collector` is `false`, no commands will be invoked. 
|`run.dir` |Defines the location (pwd) from where the command is run. -The value of the `dir` key is a string. -If the value is `.` or starts with `./`, it's resolved from the start path. +The value of the `dir` key is a string path for a single directory. +If the value is `.` or starts with `./`, it's resolved starting from the start path. Otherwise, the path is resolved relative to the worktree. |Not set |Path relative to the content root |`scan.dir` |Defines the location from where the extension collects the generated files after the previous command is complete. -The extension then imports the collected files into the bucket in the content aggregate . -The value of the `dir` key is an array. -Each entry in that array provides a path for a single directory. -If the value is `.` or starts with `./`, it's resolved from the start path. +The extension then imports the collected files into the bucket in the content aggregate. +The value of the `dir` key is a string path for a single directory. +If the value is `.` or starts with `./`, it's resolved starting from the start path. Otherwise, the path is resolved relative to the worktree. |Not set |Path relative to the content root diff --git a/packages/collector-extension/lib/index.js b/packages/collector-extension/lib/index.js index fb3f0bbb6c75c77578e8c36773e18b5b6be355be..2db31eff6f6560cf04f23638afe836748a265834 100644 --- a/packages/collector-extension/lib/index.js +++ b/packages/collector-extension/lib/index.js @@ -33,10 +33,14 @@ module.exports.register = function () { if (Array.isArray(collectorConfig) && !collectorConfig.length) continue const worktreeDir = worktree || ospath.join(cacheDir, generateWorktreeFolderName({ url, gitdir, worktree })) const expandPathContext = { base: worktreeDir, cwd: worktreeDir, dot: ospath.join(worktreeDir, startPath) } - const scanDirs = new Set() const collectors = (Array.isArray(collectorConfig) ? 
collectorConfig : [collectorConfig]).map((collector) => { - const { run: runConfig = {}, scan: scanConfig = [] } = collector + const { clean: cleanConfig = [], run: runConfig = {}, scan: scanConfig = [] } = collector + let cleans return { + clean: (cleans = (Array.isArray(cleanConfig) ? cleanConfig : [cleanConfig]).reduce((accum, clean) => { + if (typeof clean.dir === 'string') accum.push({ dir: expandPath(clean.dir, expandPathContext) }) + return accum + }, [])), run: { ...runConfig, cwd: typeof runConfig.dir === 'string' ? expandPath(runConfig.dir, expandPathContext) : worktreeDir, @@ -44,21 +48,22 @@ module.exports.register = function () { scan: (Array.isArray(scanConfig) ? scanConfig : [scanConfig]).reduce((accum, scan) => { if (typeof scan.dir === 'string') { const dir = expandPath(scan.dir, expandPathContext) - scanDirs.add(dir) + if (scan.clean) cleans.push({ dir }) accum.push({ ...scan, dir }) } return accum }, []), } }) - if (worktree) { - for (const scanDir of scanDirs) await fsp.rm(scanDir, { recursive: true, force: true }) - } else { + if (!worktree) { const cache = gitCache[gitdir] || (gitCache[gitdir] = {}) const ref = `refs/${reftype === 'branch' ? 
'head' : reftype}s/${refname}` await prepareWorktree({ fs, cache, dir: worktreeDir, gitdir, ref, remote, bare: worktree === undefined }) } - for (const { run, scan: scans } of collectors) { + for (const { clean: cleans, run, scan: scans } of collectors) { + for (const clean of cleans) { + await fsp.rm(clean.dir, { recursive: true, force: true }) + } const { cwd, command, local } = run if (command) { let cmd = command diff --git a/packages/collector-extension/test/collector-extension-test.js b/packages/collector-extension/test/collector-extension-test.js index 295da436bfb3a5a1281855582c46fc1ca3b8ee5a..e442624f1d42aa682ca54268e1da5b494454648d 100644 --- a/packages/collector-extension/test/collector-extension-test.js +++ b/packages/collector-extension/test/collector-extension-test.js @@ -247,6 +247,7 @@ describe('collector extension', () => { it('should populate properties of file collected from concrete worktree', async () => { const collectorConfig = { + clean: { dir: 'build' }, run: { command: 'node .gen-start-page.js' }, scan: { dir: 'build' }, } @@ -513,6 +514,26 @@ describe('collector extension', () => { }) }) + it('should not delete files in scan dir of worktree before running scan', async () => { + const collectorConfig = { + scan: { dir: 'other-docs' }, + } + await runScenario({ + repoName: 'test-at-start-path', + startPath: 'docs', + collectorConfig, + local: true, + before: (contentAggregate) => { + expect(contentAggregate).to.have.lengthOf(1) + expect(contentAggregate[0].files).to.be.empty() + }, + after: (contentAggregate) => { + expect(contentAggregate[0].files).to.have.lengthOf(1) + expect(contentAggregate[0].files[0].path).to.equal('modules/ROOT/pages/outside-start-path.adoc') + }, + }) + }) + it('should rebase files if base key is set on scan', async () => { const collectorConfig = { scan: { dir: 'code', base: 'modules/extend/examples' }, @@ -651,6 +672,44 @@ describe('collector extension', () => { }) }) + it('should clean specified dirs per collector 
entry', async () => { + const collectorConfig = [ + { + run: { command: 'node .gen-files.js' }, + }, + { + run: { command: 'node .gen-component-desc.js' }, + }, + { + clean: [{ dir: 'build/modules/ROOT/examples' }, { dir: 'build/modules/ROOT/partials' }], + run: { command: 'node .gen-start-page.js' }, + scan: [ + { dir: 'build', files: 'antora.yml' }, + { dir: 'build', files: 'modules/**/*' }, + ], + }, + ] + await runScenario({ + repoName: 'test-at-root', + collectorConfig, + before: (contentAggregate) => { + expect(contentAggregate).to.have.lengthOf(1) + expect(contentAggregate[0].version).to.eql('main') + expect(contentAggregate[0].files).to.be.empty() + }, + after: (contentAggregate) => { + expect(contentAggregate).to.have.lengthOf(1) + const bucket = contentAggregate[0] + expect(bucket.version).to.equal('1.0.0') + expect(bucket.files).to.have.lengthOf(1) + expect(bucket.files.map((it) => it.path)).to.have.members(['modules/ROOT/pages/index.adoc']) + expect( + contentAggregate[0].files.find((it) => it.path === 'modules/ROOT/pages/index.adoc').contents.toString() + ).to.include('= Start Page') + }, + }) + }) + it('should only collect files that match the scan pattern', async () => { const collectorConfig = { run: { command: 'node .gen-files.js' }, @@ -720,7 +779,7 @@ describe('collector extension', () => { expect(getCollectorCacheDir()).to.not.be.a.path() }) - it('should reuse worktree if available', async () => { + it('should reuse worktree without cleaning if no clean step is specified', async () => { const collectorConfig = { run: { command: 'node .gen-start-page.js' }, scan: { dir: 'build' }, @@ -729,23 +788,28 @@ describe('collector extension', () => { repoName: 'test-at-root', local: true, collectorConfig, - before: (contentAggregate) => { + before: async (contentAggregate) => { expect(contentAggregate).to.have.lengthOf(1) expect(contentAggregate[0].files).to.be.empty() + const worktree = contentAggregate[0].origins[0].worktree + const untrackedPagePath = 
ospath.join(worktree, 'build/modules/ROOT/pages/untracked.adoc') + await fsp.mkdir(ospath.dirname(untrackedPagePath), { recursive: true }) + await fsp.writeFile(untrackedPagePath, '= Untracked', 'utf8') }, after: (contentAggregate) => { - expect(contentAggregate[0].files).to.have.lengthOf(1) + expect(contentAggregate[0].files).to.have.lengthOf(2) const worktree = contentAggregate[0].origins[0].worktree expect(ospath.join(worktree, 'build')).to.be.a.directory() expect(ospath.join(worktree, 'build/modules/ROOT/pages/index.adoc')).to.be.a.file() + expect(ospath.join(worktree, 'build/modules/ROOT/pages/untracked.adoc')).to.be.a.file() }, }) }) - it('should clean scan dir in worktree before running command(s)', async () => { + it('should clean scan dir in worktree before running command(s) if clean key on scan entry is true', async () => { const collectorConfig = { run: { command: 'node .gen-start-page.js' }, - scan: { dir: 'build' }, + scan: { clean: true, dir: 'build' }, } await runScenario({ repoName: 'test-at-root', @@ -767,6 +831,7 @@ describe('collector extension', () => { it('should run specified command in temporary worktree if repository is local and reference is not worktree', async () => { const collectorConfig = { + clean: { dir: 'build' }, run: { command: 'node .gen-start-page.js' }, scan: { dir: 'build' }, } @@ -789,6 +854,7 @@ describe('collector extension', () => { it('should run specified command in each branch that defines collector config', async () => { const collectorConfig = { + clean: { dir: 'build' }, run: { command: 'node .gen-start-page.js' }, scan: { dir: 'build' }, } @@ -894,6 +960,7 @@ describe('collector extension', () => { it('should not modify index of local repository when checking out ref to worktree', async () => { const collectorConfig = { + clean: { dir: 'build' }, run: { command: 'node .gen-start-page.js' }, scan: { dir: 'build' }, }
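Review note: the change to `lib/index.js` normalizes each collector entry so that its `clean` list also picks up any scan entry that sets `clean: true`. The sketch below restates that normalization in isolation, under some assumptions. `normalizeCollectors` and `toArray` are hypothetical names that do not appear in the extension, and `ospath.join` is a simplified stand-in for the extension's `expandPath`, so the special handling of `.` and `./` relative to the start path is not modeled here.

```javascript
'use strict'

const ospath = require('node:path')

// Hypothetical helper: wraps a single map in an array, as the diff does inline.
const toArray = (value) => (Array.isArray(value) ? value : [value])

// Hypothetical function: builds the per-entry clean and scan lists.
// Assumption: every dir is resolved against the worktree only.
function normalizeCollectors (collectorConfig, worktreeDir) {
  return toArray(collectorConfig).map(({ clean = [], run = {}, scan = [] }) => {
    // Explicit clean entries; entries without a string dir are ignored.
    const cleans = toArray(clean)
      .filter((it) => typeof it.dir === 'string')
      .map((it) => ({ dir: ospath.join(worktreeDir, it.dir) }))
    const scans = toArray(scan)
      .filter((it) => typeof it.dir === 'string')
      .map((it) => {
        const dir = ospath.join(worktreeDir, it.dir)
        // clean: true on a scan entry adds an implicit clean entry for the same dir.
        if (it.clean) cleans.push({ dir })
        return { ...it, dir }
      })
    return { clean: cleans, run, scan: scans }
  })
}
```

With an entry such as `{ clean: { dir: 'build/generated' }, scan: [{ dir: 'build/log', clean: true }, { dir: 'artifacts' }] }`, the resulting `clean` list holds `build/generated` followed by `build/log`, while `artifacts` is scanned without being removed first. Explicit clean entries come before implicit ones because the scan list is processed after the clean list, which matches the order in the diff.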