Testing & Validation

Every QSO-Graph server is tested across four independent layers before release. Each layer catches different failure modes. All four must pass before a fleet-wide release.

Layer	Name	What It Catches	Blocking?
L1	Security Audit	Credential leaks, injection, unsafe patterns	Hard stop
L2	Unit Tests (Mock Mode)	Tool logic, parameter handling, return shapes	Hard stop
L3	Integration Tests (Live)	API connectivity, auth flows, data correctness	Pre-release gate
L4	Fleet Composition	Tool name collisions, schema conflicts, cross-server consistency	Fleet releases

Fleet Overview

12 servers, 82 tools, 4 test layers.

Package	Version	Tools	L1 Security	L2 Unit	L3 Live	L4 Fleet
adif-mcp	1.0.1	8	6 PASS	48 PASS	CI/CD	PASS
eqsl-mcp	0.3.1	5	6 PASS	45 PASS	Auth	PASS
qrz-mcp	0.3.1	5	6 PASS	38 PASS	Auth	PASS
lotw-mcp	0.3.1	5	6 PASS	38 PASS	Auth	PASS
hamqth-mcp	0.4.0	7	6 PASS	39 PASS	10 PASS	PASS
pota-mcp	0.2.0	7	6 PASS	45 PASS	15 PASS	PASS
sota-mcp	0.1.4	4	6 PASS	33 PASS	10 PASS	PASS
solar-mcp	0.2.0	6	6 PASS	43 PASS	15 PASS	PASS
wspr-mcp	0.3.0	8	6 PASS	40 PASS	12 PASS	PASS
iota-mcp	0.1.0	6	6 PASS	46 PASS	12 PASS	PASS
n1mm-mcp	0.1.4	8	6 PASS	59 PASS	Local	PASS
ionis-mcp	1.2.8	11	6 PASS	—	Local	PASS
Total	—	82	72	474+	74	20

L3 Live column notes

Auth — requires OS keyring credentials (eQSL, QRZ, LoTW accounts)
Local — requires local infrastructure (N1MM Logger+, SQLite datasets)
CI/CD — tested in GitHub Actions pipeline

L1: Security Audit

Every package includes `test_security.py`, which contains 6 source-code audit tests. They statically scan all Python source files for forbidden patterns and do not execute the code under test.

#	Test	What It Catches
S1	`test_no_print_credentials`	`print()` calls containing password, secret, api_key, or token
S2	`test_no_logging_credentials`	`logging.*()` calls containing credential keywords
S3	`test_no_subprocess`	Any use of `subprocess` or `shell=True` (command injection)
S4	`test_all_urls_https`	Hardcoded `http://` URLs (except localhost)
S5	`test_error_messages_safe`	Exception messages that could expose credentials
S6	`test_no_eval_exec`	Any use of `eval()` or `exec()` (code injection)

These tests run in CI on every push and must pass before any PyPI publish. If the security gate fails, the publish job is blocked. No exceptions.
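
To illustrate what a source-code audit of this kind might look like, here is a minimal sketch in the spirit of S6. It assumes an AST-based scan. The helper name `find_eval_exec` is hypothetical, and the actual `test_security.py` may use a different technique, such as regular expressions.

```python
import ast


def find_eval_exec(source: str, filename: str = "<string>") -> list[tuple[str, int, str]]:
    """Return (filename, line, name) for every direct call to eval() or exec()."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in {"eval", "exec"}
        ):
            hits.append((filename, node.lineno, node.func.id))
    return hits


# One offending call, plus a harmless string that merely mentions eval.
sample = 'x = eval("1 + 1")\nnote = "never call eval()"\n'
print(find_eval_exec(sample))  # [('<string>', 1, 'eval')]
```

Because the scan works on the syntax tree, a string or comment that only mentions `eval` is not flagged. A real test would presumably loop over every `.py` file in the package and assert that the combined list is empty.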

L2: Unit Tests (Mock Mode)

Each server supports a mock mode ({SERVER}_MCP_MOCK=1) that replaces HTTP calls with embedded test fixtures. L2 tests verify tool logic, parameter handling, return shapes, parser correctness, and helper functions without making any API calls.
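
The mock-mode switch can be pictured roughly as follows. This is a sketch under assumptions, not the servers' actual code: the function names, the fixture contents, and the placeholder URL are all hypothetical. Only the `{SERVER}_MCP_MOCK=1` naming pattern comes from the description above.

```python
import json
import os
import urllib.request

# Hypothetical embedded fixture; real fixtures would mirror actual API payloads.
_MOCK_CONDITIONS = {"sfi": 150, "kp": 2}

# Placeholder URL for illustration only, not a real endpoint.
_API_URL = "https://example.invalid/conditions.json"


def is_mock_mode(server: str) -> bool:
    """True when {SERVER}_MCP_MOCK=1 is set in the environment."""
    return os.environ.get(f"{server.upper()}_MCP_MOCK") == "1"


def fetch_conditions() -> dict:
    """Return fixture data in mock mode, otherwise fetch over HTTPS."""
    if is_mock_mode("solar"):
        return dict(_MOCK_CONDITIONS)  # copy so callers cannot mutate the fixture
    with urllib.request.urlopen(_API_URL, timeout=10) as resp:
        return json.load(resp)
```

Checking the environment at call time, rather than at import time, lets a test toggle mock mode without reloading the module.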

Category	What's Tested	Example
Parser/Helper Functions	ADIF parsing, frequency conversion, date normalization, grid validation	`parse_adif()`, `freq_to_band()`, `to_yyyymmddhhmm()`
Tool Return Shapes	Every tool returns expected fields, types, and structures	`eqsl_inbox()` returns `total`, `records`, `by_band`
Parameter Handling	Filters, defaults, edge cases, invalid input	Band filter, callsign uppercase, empty string handling
Caching	TTL expiry, cache hits, overwrites	`_cache_set()` / `_cache_get()` with timed expiry
Data Models	Dataclass immutability, field defaults, type conversions	`FetchResult(records=[])` is frozen

# Run L2 tests for any server (no network needed)
cd solar-mcp
pytest tests/test_tools.py -v

L3: Live Integration Tests

L3 tests hit real APIs with known-good reference values. They verify that external services are responding correctly and that our client code handles real-world responses.

Tests are gated behind a --live flag and skipped by default. This keeps CI fast and avoids hammering volunteer-run services.
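
One common way to gate tests behind a command-line flag in pytest is a `conftest.py` along the following lines. This is a sketch of the standard pytest pattern, not necessarily how the QSO-Graph repositories implement it. In particular, the `live` marker name is an assumption.

```python
# conftest.py (sketch)
import pytest


def pytest_addoption(parser):
    """Register the --live command-line flag."""
    parser.addoption(
        "--live",
        action="store_true",
        default=False,
        help="run tests that call real external APIs",
    )


def pytest_configure(config):
    """Declare the (hypothetical) 'live' marker so pytest does not warn about it."""
    config.addinivalue_line("markers", "live: test calls a real external API")


def pytest_collection_modifyitems(config, items):
    """Skip every test marked 'live' unless --live was given."""
    if config.getoption("--live"):
        return
    skip_live = pytest.mark.skip(reason="needs --live to run")
    for item in items:
        if "live" in item.keywords:
            item.add_marker(skip_live)
```

Under this arrangement a test opts in with `@pytest.mark.live`, and a plain `pytest` run reports it as skipped rather than silently omitting it.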

Server	Tests	APIs Hit	Reference Values
solar-mcp	15	NOAA SWPC	SFI 50-400, Kp 0-9, flare class A-X, 10 HF bands
pota-mcp	15	POTA API	US-0001 (Acadia NP), K4SWL, US-ME parks
sota-mcp	10	SOTA API	W7I/CU-001 (Borah Peak, Idaho)
wspr-mcp	12	wspr.live ClickHouse	DN13→JN48 path, JO62 grid, 20m band activity
iota-mcp	12	iota-world.org	OC-001 (Australia), 1000+ groups in programme
hamqth-mcp	10	HamQTH (public)	W1AW DXCC=291, DX cluster spots, RBN decodes
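
Reference values such as the solar-mcp ranges lend themselves to simple plausibility checks. The sketch below shows the idea. The payload field names (`sfi`, `kp`, `flare_class`) and the helper name are hypothetical, and the actual tests may be structured differently.

```python
FLARE_CLASSES = {"A", "B", "C", "M", "X"}


def check_solar_payload(payload: dict) -> list[str]:
    """Return human-readable problems; an empty list means every value is plausible."""
    problems = []

    sfi = payload.get("sfi")
    if not isinstance(sfi, (int, float)) or not 50 <= sfi <= 400:
        problems.append(f"SFI out of range 50-400: {sfi!r}")

    kp = payload.get("kp")
    if not isinstance(kp, (int, float)) or not 0 <= kp <= 9:
        problems.append(f"Kp out of range 0-9: {kp!r}")

    # Flare class is optional in this sketch; when present, check its leading letter.
    flare = payload.get("flare_class")
    if flare is not None and (not isinstance(flare, str) or flare[:1] not in FLARE_CLASSES):
        problems.append(f"Unknown flare class: {flare!r}")

    return problems


print(check_solar_payload({"sfi": 150, "kp": 3, "flare_class": "C2.1"}))  # []
```

Checking ranges rather than exact values keeps the test stable while live conditions change from day to day.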

# Run L3 live tests (requires network)
cd solar-mcp
pytest tests/test_live.py --live -v

Rate Limiting

WSPR and HamQTH L3 tests include a 1-second pause between requests to respect volunteer-run services. Tests take longer but avoid API bans.
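
A pause between requests can be enforced with a small helper like the one below. This is an illustrative sketch. The `Throttle` class is hypothetical, and the real tests may simply call `time.sleep(1)` between requests. The clock and sleep functions are injectable so the behaviour can be verified without waiting.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval: float = 1.0, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> None:
        """Sleep just long enough to keep requests min_interval apart."""
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` immediately before each live request keeps the spacing at one second or more, and costs nothing when the test itself already took longer than the interval.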

L4: Fleet Composition Tests

L4 tests verify that all 12 servers work correctly when loaded together. They import every server's MCP object, enumerate all tools, and check for cross-server conflicts.

Category	Tests	What's Verified
F1: Tool Name Uniqueness	5	No unexpected name collisions, snake_case convention, server namespacing, tool counts
F2: Schema Validity	7	Non-empty descriptions, typed properties, required fields exist, description length bounds
F3: Fleet Inventory	5	All 12 servers loaded, expected tools present, no empty servers
F4: Cross-Server Consistency	3	Band parameter types, callsign naming, limit parameter types
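
The F1 checks can be sketched with plain data structures. The example below assumes the tool names have already been collected into a mapping from server to tool names. The server names, tool names, and helper functions are hypothetical, and the real suite enumerates tools from each server's MCP object.

```python
import re
from collections import defaultdict

SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def find_collisions(fleet: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each tool name exposed by more than one server to those servers."""
    owners = defaultdict(list)
    for server, tools in fleet.items():
        for tool in tools:
            owners[tool].append(server)
    return {tool: servers for tool, servers in owners.items() if len(servers) > 1}


def non_snake_case(fleet: dict[str, list[str]]) -> list[str]:
    """Return, sorted, the tool names that break the snake_case convention."""
    return sorted({t for tools in fleet.values() for t in tools if not SNAKE_CASE.match(t)})


# Hypothetical two-server inventory with one duplicated tool name.
fleet = {
    "server-a": ["solar_conditions", "band_report"],
    "server-b": ["solar_conditions", "grid_lookup"],
}
print(find_collisions(fleet))  # {'solar_conditions': ['server-a', 'server-b']}
print(non_snake_case(fleet))   # []
```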

Known Findings

Finding	Status	Detail
`solar_conditions` name collision	Resolved	ionis-mcp v1.2.8 renamed its tool to `solar_history`, which removes the collision.
Null defaults from `Optional` params	Tracked	FastMCP generates `{"default": null}` from Python `Optional[str] = None`. Valid JSON Schema but may affect some local LLM tool parsers.
Band parameter type split	By design	qso-graph servers use string band names (`"20M"`), ionis-mcp uses integer ADIF band IDs (`107`).

# Run L4 fleet tests (all 12 servers must be installed)
cd ionis-devel
EQSL_MCP_MOCK=1 HAMQTH_MCP_MOCK=1 LOTW_MCP_MOCK=1 QRZ_MCP_MOCK=1 \
  pytest tests/test_fleet.py -v

adif-mcp Validation (v1.0.0)

adif-mcp is the foundation package. Beyond the standard security and unit tests, it carries a comprehensive validation suite against the ADIF 3.1.6 specification.

Test Matrix: 48/48 PASS

Category	ID Prefix	Tests	Status
Enumeration Listing	ADIF-ENL	3	3/3 PASS
Enumeration Search	ADIF-ENS	6	6/6 PASS
Enum Validation — Simple	ADIF-EVS	6	6/6 PASS
Enum Validation — Compound	ADIF-EVC	3	3/3 PASS
Enum Validation — Conditional	ADIF-EVX	2	2/2 PASS
Enum Validation — Regression	ADIF-EVR	1	1/1 PASS
Official ADIF 3.1.6 Test Corpus	ADIF-TCR	2	2/2 PASS
Enum JSON Parity	ADIF-TCR	1	1/1 PASS
Security Audit	—	6	6/6 PASS
KI7MT Forensic Hard Tests	KI7MT-FRN	12	12/12 PASS
Empty Value Handling	ADIF-EVS	2	2/2 PASS
Total	—	48	48/48 PASS

Official ADIF 3.1.6 Test Corpus

The official test file from G3ZOD's CreateADIFTestFiles is the gold standard for ADIF validation. It contains 6,191 QSO records exercising every enumeration value in the specification.

Test	Records	Result
Zero false errors on all official records	6,191	PASS — 0 errors
All warnings are correct behavior	6,191	PASS — 39 warnings (23 user-defined fields + 16 import-only values)
Enum JSON files match official exports	25 files	PASS — all identical

Rule: If our validator rejects an official ADIF test record, our validator is wrong.

KI7MT Forensic Hard Tests

These 12 tests are derived from forensic analysis of 110,761 real QSO records across three logger dialects (eQSL, QRZ, LoTW). Every test traces to a documented real-world source; none is arbitrary.

ID	Test	Source	Why It Matters
FRN-001	Bread-and-butter FT8 QSO (9 enum fields)	QRZ — 15,000+ FT8 QSOs	FT8 is 88.7% of PSK Reporter spots. If this fails, everything fails.
FRN-002	MODE=FT4 correctly errors	QRZ — 2 of 49,233 records	FT4 must be MODE=MFSK + SUBMODE=FT4 per spec policy
FRN-003	Multi-field contest QSO (8+ enums)	QRZ — CQ WW DX CW logs	Contest logs have highest enum density
FRN-004	LoTW uppercase band (15M)	LoTW — 37,651 records	LoTW exports uppercase; case-insensitive matching required
FRN-005	QSL_VIA=M import-only warning	QRZ — pre-internet QSOs	Deprecated "manager" value must warn, not reject
FRN-006	Six QSL status fields in one record	QRZ — modern multi-service logs	QSL_RCVD + QSL_SENT + EQSL + LOTW — no cross-interference
FRN-007	Deleted DXCC entity (Aldabra)	DXCC enum — 62 deleted entities	Geopolitical mergers are valid historical data
FRN-008	PROP_MODE=SAT (satellite QSO)	QRZ — ISS/satellite contacts	Propagation_Mode enum rarely tested but critical
FRN-009	Credit:CARD&LOTW (& separator)	LoTW DXCC credits	Multi-medium compound format parsing
FRN-010	Freeform CONTEST_ID errors	eQSL — 470 invalid of 23,877	"CQWW 2021" is not a valid Contest_ID
FRN-011	SUBMODE without MODE field	eQSL — incomplete records	Graceful handling of missing parent field
FRN-012	EQSL_AG=Y (Authenticity Guaranteed)	eQSL — AG status for DXCC	3-value enum critical for DXCC credit eligibility
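
To make a few of these cases concrete, the sketch below shows the kind of behaviour FRN-002, FRN-004, and FRN-005 expect. It uses tiny illustrative subsets of the ADIF enumerations and a hypothetical `validate()` function. It is not adif-mcp's API or implementation.

```python
# Tiny illustrative subsets of ADIF enumerations, not the full specification.
BANDS = {"20m", "15m", "10m"}
MODES = {"CW", "SSB", "FT8", "MFSK"}
SUBMODES = {"FT4": "MFSK"}  # submode -> required parent mode
QSL_VIA = {"B", "D", "E"}
QSL_VIA_IMPORT_ONLY = {"M"}


def validate(record: dict[str, str]) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for one QSO record keyed by ADIF field name."""
    errors, warnings = [], []

    # FRN-004: band matching is case-insensitive, so LoTW's "15M" is accepted.
    band = record.get("BAND")
    if band is not None and band.lower() not in BANDS:
        errors.append(f"BAND={band} is not a valid band")

    # FRN-002: a submode given in the MODE field is an error.
    mode = record.get("MODE")
    if mode is not None:
        upper = mode.upper()
        if upper in SUBMODES:
            errors.append(f"MODE={mode} must be MODE={SUBMODES[upper]} + SUBMODE={upper}")
        elif upper not in MODES:
            errors.append(f"MODE={mode} is not a valid mode")

    # FRN-005: import-only values produce a warning, not a rejection.
    via = record.get("QSL_VIA")
    if via is not None:
        if via.upper() in QSL_VIA_IMPORT_ONLY:
            warnings.append(f"QSL_VIA={via} is import-only")
        elif via.upper() not in QSL_VIA:
            errors.append(f"QSL_VIA={via} is not a valid value")

    return errors, warnings


print(validate({"BAND": "15M", "MODE": "FT4", "QSL_VIA": "M"}))
```

Keeping errors and warnings in separate lists lets a caller reject genuinely invalid records while still importing historical ones that use deprecated values.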

Running Tests

Single server: security only

cd eqsl-mcp
pytest tests/test_security.py -v

Single server: full mock suite

cd solar-mcp
pytest tests/ -v

Single server: including live API tests

cd solar-mcp
pytest tests/ -v --live

All servers: security sweep

for repo in adif-mcp eqsl-mcp qrz-mcp lotw-mcp hamqth-mcp pota-mcp \
            sota-mcp solar-mcp wspr-mcp iota-mcp n1mm-mcp ionis-mcp; do
    echo "=== $repo ==="
    (cd "$repo" && pytest tests/test_security.py -v) 2>&1
done

Fleet composition tests

cd ionis-devel
EQSL_MCP_MOCK=1 HAMQTH_MCP_MOCK=1 LOTW_MCP_MOCK=1 QRZ_MCP_MOCK=1 \
  pytest tests/test_fleet.py -v

Audit Process

The QSO-Graph release process has three stages, all of which must be completed before any PyPI publication:

Writer — writes code and runs all tests locally
1st Audit/Reviewer — independent functional review, pulls code, runs tests on separate machine
2nd Audit/Reviewer — failure analysis and security audit, reviews all source files for credential leaks, injection surfaces, and error message safety

All three stages must pass before the release tag is created.

References

Resource	URL
ADIF 3.1.6 Specification	adif.org/316
Official Test Corpus (G3ZOD)	adif.org.uk/316/resources
CreateADIFTestFiles	github.com/g3zod/CreateADIFTestFiles
K1MU ADIF Validator	rickmurphy.net/adifvalidator.html
adif-multitool (flwyd)	github.com/flwyd/adif-multitool
MCP Security Best Practices	modelcontextprotocol.io
QSO-Graph Test Framework (internal)	ionis-devel/planning/QSO-GRAPH-TEST-FRAMEWORK.md