How OSSPath is built
Everything in OSSPath was added by a human. This page explains what that means in practice — what gets included, how it gets classified, how often it is re-verified, and how to report a mistake.
No entries are added automatically. Each job, funding program, and organization is reviewed and added individually. Repository metadata — stars, topics, and dependency data — is fetched from the GitHub API, but inclusion decisions are made by hand. The corpus stays small on purpose.
A smaller set of accurate, current entries is more useful than a larger set that includes stale or misclassified ones. When coverage and accuracy conflict, accuracy wins. Entries are removed when they can no longer be verified.
Each entry is checked against the inclusion criteria before it appears, and none is added without that review. The value of the corpus depends on this staying true.
The underlying data is open source. Mistakes can be seen, reported, and fixed in public. Review intervals and inclusion criteria are documented so that readers can evaluate the freshness of any given entry.
Each entity type has its own criteria. An entry must meet all of the criteria for its type to be included; meeting only some of them is not enough.
OSSPath currently tracks seven entity types. Each has its own archive page, inclusion criteria, and review cadence.
- Repositories: Open-source Rust projects from GitHub, indexed by stars, activity, license, and dependency graph.
- Organizations: Companies and teams with a public GitHub presence and meaningful Rust open-source output.
- Funding programs: Grants, fellowships, and sponsorships from the Rust Foundation, NLnet, Sovereign Tech Fund, and others.
- Funders: The foundations and institutions that run funding programs.
- Jobs: Remote Rust engineering roles, manually reviewed.
- Community: Newsletters, forums, podcasts, and working-group channels.
- Ecosystems: Domain groupings derived from dependency graph analysis; 11 domains are currently tracked.
Dependency pages (/deps) and topic pages (/topics) are derived views over the repository corpus rather than independently curated entity types. Events are tracked separately at /events.
- GitHub: repository metadata (stars, topics, activity, Cargo.toml dependencies) and organization membership
- Company careers pages: reviewed directly for job listings
- Grant program websites: reviewed directly for funding status and application windows
- Conference and event sites: reviewed for dates and registration links
New entries are reviewed against the inclusion criteria before being added. Existing entries are re-verified on a rolling schedule:
Repository ecosystem tags are derived from Cargo.toml dependency data collected during corpus refresh. GitHub topic tags and organization owner signals are used as supplementary inputs. No LLM or AI classification is used.
Repositories are classified into eleven domain ecosystems, and a single repository can belong to more than one:
Each ecosystem has a rule consisting of a set of specific crate names. If a repository's dependency list contains any of those crates, the rule matches — this is an inclusive OR within each rule. For blockchain repositories, the GitHub organization owner and repository topic tags are also considered when dependency data is absent or ambiguous.
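The inclusive OR described above amounts to asking whether a rule's crate set and a repository's dependency set share at least one element. A minimal sketch follows, assuming a rule is simply a set of crate names. The crate names in `CLI_RULE` and the `rule_matches` helper are hypothetical and are not taken from OSSPath's actual rules. The owner and topic fallback for blockchain repositories is not modelled.

```python
from collections.abc import Iterable

# Hypothetical rule: these crate names are illustrative only.
CLI_RULE = frozenset({"clap", "ratatui", "crossterm"})


def rule_matches(rule_crates: frozenset[str], dependencies: Iterable[str]) -> bool:
    """Return True if the repository depends on any crate named by the rule."""
    # isdisjoint() is True when the two collections share nothing,
    # so a match is the negation of that.
    return not rule_crates.isdisjoint(dependencies)


print(rule_matches(CLI_RULE, {"clap", "serde"}))  # True
print(rule_matches(CLI_RULE, {"serde", "tokio"}))  # False
```

One matching crate is enough. Matching more crates from the same rule does not change the result.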
Rules run in specificity order, and a repository receives up to two matching tags. A repository that builds a command-line tool over an HTTP API will match both CLI & TUI and Web & APIs. This is intentional — the repository is genuinely relevant to both views, and excluding it from one would be arbitrary. Ecosystem pages are therefore overlapping, not partitioned.
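The ordering and the two-tag cap can be sketched as a loop over an ordered list of rules that stops after the second match. This is only an illustration under stated assumptions. The rule order, the crate names, the third "Example third domain" rule, and the `assign_tags` helper are all hypothetical. Only the ecosystem names CLI & TUI and Web & APIs come from the text above.

```python
MAX_TAGS = 2  # a repository receives up to two matching tags

# Hypothetical rules in an assumed specificity order (most specific first).
RULES = [
    ("CLI & TUI", frozenset({"clap", "ratatui"})),
    ("Web & APIs", frozenset({"axum", "reqwest"})),
    ("Example third domain", frozenset({"serde"})),
]


def assign_tags(dependencies, rules=RULES, max_tags=MAX_TAGS):
    """Return up to max_tags ecosystem names, in rule order."""
    tags = []
    for name, crates in rules:
        # Inclusive OR within a rule: any shared crate counts as a match.
        if not crates.isdisjoint(dependencies):
            tags.append(name)
            if len(tags) == max_tags:
                break  # cap reached; later rules are not evaluated
    return tags


# A command-line tool built over an HTTP API matches both views.
print(assign_tags({"clap", "reqwest", "serde"}))  # ['CLI & TUI', 'Web & APIs']
```

In this sketch a repository whose dependencies match no rule gets an empty list, so it would not appear on any ecosystem page.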
Repositories with no dependency data and no topic or owner signals receive no ecosystem tag and do not appear on any ecosystem page. They remain discoverable through the repository archive and topic/dependency pages.
Stale data is inevitable in a manually maintained corpus. Job postings expire, repositories get archived, funding programs close, and organizations change focus.
To report an error, open a GitHub issue or use the contact page. Include the URL of the incorrect entry and what the correct state should be. Pull requests with direct data corrections are also accepted.
There is no automated correction queue. Each report is reviewed and applied manually, typically within a few days.
The full dataset, classification rules, and site code are public. The corpus can be inspected, the rules can be read, and the methodology described here can be verified against the source.