Structured data on the web
Build search indexes for your community or domain of interest, focused and functional enough to address your specific needs. Gleaner is open source, written in Go, and easy to deploy. It is one part of the GleanerIO search architecture, described below.
Gleaner is a tool for extracting JSON-LD from web pages. You give Gleaner a list of sites to index, and it retrieves the pages listed in the sitemap.xml of each domain. Gleaner can then check that the documents are well formed and structurally valid, and process the JSON-LD data graphs into a form that can drive a search interface. Gleaner is one piece of a bigger picture.
Communities of practice can combine an open schema (schema.org) with web architecture approaches to build domain search portals, then enhance and extend them with community vocabularies to address specific domain needs. This foundation is also used by Google Dataset Search, and the approach is complementary to that service. Building on web architecture lets a community provide a more detailed community experience while still benefiting from the global reach of commercial search indexes.
Gleaner is just one part of a tool chain that addresses this goal. You also need storage and a way to search the graph Gleaner collects. A basic approach is described, along with alternatives that may be more native or familiar to you. See: The Big Picture