Comprehensive Testing: Text Output And Column Selection
This issue details the comprehensive testing strategy for the text output format and column selection features implemented across #585, #587, #589, and #591. The goal is to ensure robust functionality and prevent regressions by thoroughly validating the changes.
References
- Parent Issue: #584 (529.2-cli-output-implementation)
- Testing Standards: Milestone #14 (ls-test-improvement)
- Testing Principles: #556 (Toyota Andon Cord, CRUD lifecycle pattern)
- Research: `docs/research/581-cli-output-research.md`
Testing Philosophy
Our testing philosophy is guided by the principles outlined in #556, which emphasize preventing faulty code from being merged, verifying the full CRUD lifecycle, and avoiding superficial tests that don't validate actual behavior. Adhering to these guidelines lets us catch issues early and prevent regressions.
- Toyota Andon Cord: Any failing test immediately halts the merge process. This prevents potentially unstable code from entering the main codebase.
- CRUD Lifecycle Pattern: We exercise the full create, read, update, and delete cycle, verifying the actual behavior and output of the system rather than simply checking exit codes. This ensures that the system functions as expected throughout its lifecycle.
- Avoid Anemic Tests: We move beyond simply checking `$? == 0`. We verify that the output content matches our expectations. This helps prevent situations where tests pass despite underlying issues.
- Root Cause Prevention: We learn from past mistakes, such as #536, where tests passed but `prompt list` returned empty results. We strive to create tests that are comprehensive and can detect a wide range of potential issues.
Testing Strategy
Our testing strategy is divided into phases, each focusing on a specific aspect of the text output and column selection features. Each phase includes unit and integration tests designed to validate the functionality and ensure it meets our quality standards.
Unit Tests (Phase 1 - #585)
These tests focus on the core logic within `cli/src/output.rs`. The goal is to validate the parsing and rendering of output formats, as well as the handling of column selection.
- [ ] Test `OutputFormat::from_str("text")` parsing: Ensure that the `text` format string is correctly parsed into the corresponding enum variant. This is crucial for handling user input and configuration settings. (See the sketch after this list.)
- [ ] Test `OutputFormat::from_str("records")` parsing: Verify that the `records` format string is also correctly parsed. This ensures that all supported output formats are properly recognized.
- [ ] Test invalid format strings return errors: Confirm that providing an invalid format string results in an appropriate error being returned. This prevents unexpected behavior and provides helpful feedback to the user.
- [ ] Test TSV rendering with tab separators: Validate that the TSV output format uses tab characters as separators between columns. This ensures that the output is correctly formatted for downstream processing.
- [ ] Test TSV rendering escapes special characters correctly: Ensure that special characters within the data are properly escaped when rendering TSV output. This prevents data corruption and ensures that the output is correctly interpreted.
- [ ] Test column metadata trait implementation: Verify that the column metadata trait is correctly implemented for all relevant data structures. This allows for consistent handling of column information throughout the system.
- [ ] Test `--columns` parsing (comma-separated list): Validate that the `--columns` option is correctly parsed, allowing users to specify a comma-separated list of columns to include in the output. This provides flexibility in selecting the desired data fields.
- [ ] Test `--columns` validation (reject unknown column names): Ensure that the `--columns` option validates the provided column names, rejecting any unknown or invalid names. This prevents errors and ensures that users only request valid data fields.
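As a concrete starting point, the format-parsing tests referenced above might look like the following sketch. The `OutputFormat` stand-in below is an assumption: the real enum in `cli/src/output.rs` almost certainly has more variants and a richer error type.

```rust
use std::str::FromStr;

// Minimal stand-in for the enum in cli/src/output.rs; variant names
// beyond Text and Records, and the error type, are assumptions.
#[derive(Debug, PartialEq)]
enum OutputFormat {
    Text,
    Records,
}

impl FromStr for OutputFormat {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "text" => Ok(Self::Text),
            "records" => Ok(Self::Records),
            other => Err(format!("unknown output format: {other}")),
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn parses_known_format_strings() {
        assert_eq!("text".parse::<OutputFormat>().unwrap(), OutputFormat::Text);
        assert_eq!("records".parse::<OutputFormat>().unwrap(), OutputFormat::Records);
    }

    #[test]
    fn rejects_unknown_format_strings() {
        // Invalid strings must surface an error rather than a silent default.
        assert!("definitely-not-a-format".parse::<OutputFormat>().is_err());
    }
}
```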
Integration Tests (Phase 2 - #587)
These tests focus on the `langstar prompt list` command and verify the CRUD lifecycle. We set up test data, run the command with various flags, check the output content, and clean up the test data. Integration tests are crucial for verifying the interaction between different components of the system.
- [ ] `prompt list -o text` outputs TSV format with default columns: Verify that the command outputs data in TSV format when the `-o text` option is specified, using the default set of columns. This ensures that the basic text output functionality is working correctly.
- [ ] `prompt list -o text --columns handle` outputs only the handle column: Confirm that the command outputs only the `handle` column when the `--columns handle` option is specified along with `-o text`. This validates the column selection feature.
- [ ] `prompt list -o text --columns handle,downloads` outputs two columns tab-separated: Ensure that the command outputs both the `handle` and `downloads` columns, separated by a tab character, when the `--columns handle,downloads` option is specified along with `-o text`. This verifies the ability to select multiple columns.
- [ ] `prompt list --show-columns` lists `handle`, `likes`, `downloads`, `public`, `description`, `created_at`: Validate that the `--show-columns` option lists the available columns for the `prompt list` command. This helps users discover the available data fields.
- [ ] Verify tab separators (not spaces) between columns: Confirm that tab characters are consistently used as separators between columns in the TSV output format. This ensures the output is machine-readable and parsable.
- [ ] Verify newlines between rows: Ensure that each row of data is separated by a newline character. This provides clear separation between records in the output.
- [ ] Verify no header row by default: Validate that the TSV output format does not include a header row by default. This simplifies parsing and processing of the output.
- [ ] Handle empty result sets gracefully: Ensure that the command handles empty result sets gracefully, without producing errors or unexpected output. This is important for scenarios where no data matches the query.
- [ ] Handle very long field values (description truncation/escaping): Verify that very long field values, such as descriptions, are handled correctly, either by truncating them or escaping special characters. This prevents display issues and data corruption.
- [ ] Test with `LANGSTAR_OUTPUT_FORMAT=text` environment variable: Confirm that the `LANGSTAR_OUTPUT_FORMAT` environment variable can be used to set the default output format to `text`. This provides a convenient way to configure the output format globally.
Output verification pattern:
```rust
let output = cmd.output()?;
assert!(output.status.success());
let stdout = String::from_utf8(output.stdout)?;

// ✅ GOOD: Verify actual content
assert!(stdout.contains("expected-handle\t123"));
assert_eq!(stdout.lines().count(), expected_row_count);

// ❌ BAD: Only checking exit code (anemic test)
// assert!(output.status.success()); // Not enough!
```
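Extending that pattern to the full CRUD lifecycle could look roughly like the sketch below. The `prompt create` and `prompt delete` subcommands and the fixture handle are assumptions used for illustration; only `prompt list -o text --columns` comes from the checklist above.

```rust
use std::process::Command;

#[test]
fn prompt_list_text_roundtrip() -> Result<(), Box<dyn std::error::Error>> {
    // Setup: create a known prompt (hypothetical subcommand and handle).
    let create = Command::new("langstar")
        .args(["prompt", "create", "test-handle-587"])
        .output()?;
    assert!(create.status.success());

    // Exercise: list in text format, selecting two columns.
    let list = Command::new("langstar")
        .args(["prompt", "list", "-o", "text", "--columns", "handle,downloads"])
        .output()?;
    assert!(list.status.success());

    // Verify actual content, not just the exit code.
    let stdout = String::from_utf8(list.stdout)?;
    assert!(stdout.lines().any(|line| line.starts_with("test-handle-587\t")));

    // Teardown: remove the fixture so the test is repeatable (hypothetical subcommand).
    let delete = Command::new("langstar")
        .args(["prompt", "delete", "test-handle-587"])
        .output()?;
    assert!(delete.status.success());
    Ok(())
}
```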
Integration Tests (Phase 3 - #589)
This phase extends the integration tests to eight additional commands. We apply the same testing pattern used in Phase 2 to ensure consistency and thorough coverage across the entire application.
- [ ] `assistant list -o text --columns <fields>`
- [ ] `graph list -o text --columns <fields>`
- [ ] `runs query -o text --columns <fields>`
- [ ] `queue list -o text --columns <fields>`
- [ ] `dataset list -o text --columns <fields>`
- [ ] `eval list -o text --columns <fields>`
- [ ] `secrets list -o text --columns <fields>`
- [ ] `model-config list -o text --columns <fields>`
For each command:
- Test `-o text` with default columns
- Test `--columns` with a single field
- Test `--columns` with multiple fields
- Test `--show-columns` discovery
- Verify actual output content matches expected data (see the helper sketch below)
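Because these checks repeat for every command in the rollout, a small table-driven helper keeps the tests uniform. The per-command column names below are placeholders; only the command names come from the checklist above.

```rust
use std::process::Command;

/// Runs `langstar <command...> -o text --columns <columns>` and checks
/// that every output row has one tab-separated value per requested column.
/// A sketch: real tests would also assert on known fixture data per command.
fn assert_text_output(command: &[&str], columns: &str) {
    let output = Command::new("langstar")
        .args(command)
        .args(["-o", "text", "--columns", columns])
        .output()
        .expect("failed to run langstar");
    assert!(output.status.success());

    let stdout = String::from_utf8(output.stdout).expect("stdout is not UTF-8");
    let expected_fields = columns.split(',').count();
    for line in stdout.lines() {
        assert_eq!(line.split('\t').count(), expected_fields, "row: {line}");
    }
}

#[test]
fn rollout_commands_support_text_output() {
    // Column names per command are assumptions for illustration.
    assert_text_output(&["assistant", "list"], "name");
    assert_text_output(&["graph", "list"], "name");
    assert_text_output(&["queue", "list"], "name,size");
}
```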
Integration Tests (Phase 4 - #591)
This phase focuses on the records format (psql \x style) and config file support. We ensure that the records format displays data correctly and that config file settings are properly applied.
- [ ] `prompt list -o records` displays vertical format: Verify that the command displays data in a vertical format when the `-o records` option is specified. This ensures that the basic records output functionality is working correctly.
- [ ] Records format shows one record per section: Confirm that the records format displays one record per section, with each field on a separate line. This improves readability and allows for easier data interpretation.
- [ ] Field alignment is correct: Ensure that the fields in the records format are correctly aligned, making the output visually appealing and easy to scan.
- [ ] Multi-record output includes separators (`-[ RECORD N ]`): Validate that multi-record output includes separators between records, such as `-[ RECORD N ]`. This helps distinguish between different records in the output.
- [ ] Empty result sets handled gracefully: Ensure that the command handles empty result sets gracefully, without producing errors or unexpected output. This is important for scenarios where no data matches the query.
- [ ] Long field values display without truncation: Verify that long field values are displayed without truncation in the records format. This ensures that all data is fully visible and accessible.
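For reference, the vertical layout being tested is modeled on psql's expanded (`\x`) display. The sample below is only an illustration: the `-[ RECORD N ]` separator and the column names come from the checklists above, while the padding, separator width, and field values are assumptions.

```
-[ RECORD 1 ]---------------------
handle      | my-prompt
likes       | 42
downloads   | 123
public      | true
description | A sample prompt
created_at  | 2024-01-15T09:30:00Z
-[ RECORD 2 ]---------------------
handle      | another-prompt
```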
Config file support:
- [ ] `~/.config/langstar/config.toml` sets default format: Verify that the config file can be used to set the default output format. This provides a convenient way to configure the output format globally.
- [ ] `[output.columns.prompt]` sets default columns for `prompt list`: Confirm that the config file can be used to set the default columns for the `prompt list` command. This allows users to customize the output for specific commands.
- [ ] CLI flags override config file defaults: Ensure that CLI flags take precedence over config file defaults, so users can deviate from their configured settings when needed. This provides flexibility and control over the output format.
- [ ] Invalid config values produce helpful error messages: Validate that invalid config values produce helpful error messages, guiding users to correct their configuration settings. This improves the user experience and prevents configuration errors.
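Putting those items together, the config file might look like this sketch. The `[output.columns.prompt]` table name comes from the checklist; the `format` and `columns` key names are assumptions about how the implementation will spell them.

```toml
# ~/.config/langstar/config.toml
[output]
format = "text"   # default output format; CLI flags override this

[output.columns.prompt]
columns = ["handle", "downloads", "created_at"]   # defaults for `prompt list`
```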
Test Data Requirements
We have three options for test data management, each with its own advantages and disadvantages. The chosen approach should balance test execution speed, reliability, and maintainability.
Option 1: Use Existing Test Deployment
- Use deployment from `tests/fixtures/test-graph-deployment/`
- Verify test data exists before running tests
- Document required test data shape
Option 2: Mock API Responses
- Use HTTP mocking for SDK responses
- Faster test execution
- No external dependencies
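If Option 2 is chosen, a Rust HTTP-mocking crate such as `httpmock` could stand in for the API. Everything below that touches the API is an assumption for illustration: the endpoint path, the response payload, and the `LANGSTAR_API_URL` environment variable used to redirect the CLI at the mock server.

```rust
use httpmock::prelude::*;
use std::process::Command;

#[test]
fn prompt_list_against_mocked_api() {
    let server = MockServer::start();

    // Hypothetical endpoint and payload; the real API shape may differ.
    let mock = server.mock(|when, then| {
        when.method(GET).path("/api/prompts");
        then.status(200)
            .header("content-type", "application/json")
            .body(r#"[{"handle":"mock-prompt","downloads":7}]"#);
    });

    let output = Command::new("langstar")
        .args(["prompt", "list", "-o", "text", "--columns", "handle,downloads"])
        // Hypothetical env var telling the CLI where the API lives.
        .env("LANGSTAR_API_URL", server.base_url())
        .output()
        .expect("failed to run langstar");

    mock.assert(); // the CLI hit the mocked endpoint exactly once
    assert!(output.status.success());
    let stdout = String::from_utf8(output.stdout).unwrap();
    assert_eq!(stdout.trim(), "mock-prompt\t7");
}
```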
Option 3: Hybrid Approach
- Unit tests use mocks
- Integration tests use real test deployment
- Document when each approach applies
Success Criteria
To be successful, the testing process must meet the following criteria:
- [ ] All unit tests pass for `OutputFormat` parsing and rendering
- [ ] Integration tests verify actual output content (not just exit codes)
- [ ] All 9 list commands have integration test coverage
- [ ] Records format has integration test coverage
- [ ] Config file functionality has integration test coverage
- [ ] CI pipeline fails on any test failure (Toyota Andon Cord)
- [ ] Test documentation added to `docs/dev/testing/`
- [ ] Tests prevent regression of #536-style issues (empty output passing tests)
Dependencies
This testing effort is dependent on the completion of the following issues:
- Depends on: #585 (Phase 1 - Core Infrastructure)
- Depends on: #587 (Phase 2 - Pilot Command)
- Depends on: #589 (Phase 3 - Rollout)
- Depends on: #591 (Phase 4 - Polish)
Effort Estimate
- Unit tests: ~2 hours
- Integration tests (pilot): ~3 hours
- Integration tests (rollout): ~4 hours
- Integration tests (polish): ~2 hours
- Documentation: ~1 hour
- Total: ~12 hours
Notes
This issue consolidates testing tasks previously scattered across #585, #587, #589, and #591. By creating a dedicated testing phase, we ensure comprehensive test coverage following the principles from milestone #14 (ls-test-improvement) and prevent anemic tests that only verify exit codes.