gleaner.io

Tooling for structured data on the web

Gleaner

Gleaner is a program, with a set of supporting Docker containers, for harvesting, validating and indexing JSON-LD data graphs published in web pages. Its goal is to let users generate graphs and other searchable indexes from the JSON-LD that those pages expose.

Gleaner connection diagram

Screencast demonstrating Gleaner (ref: Running Gleaner)
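
The harvesting step described above starts from JSON-LD embedded in a page. The sketch below is not Gleaner's own code; it only illustrates, with the Python standard library, how JSON-LD blocks can be pulled out of a landing page's HTML. The class name, function name and sample page are hypothetical.

```python
import json
from html.parser import HTMLParser


class JSONLDExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        # The type attribute may be missing or valueless, hence the "or".
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip blocks that are not valid JSON.


def extract_jsonld(html_text):
    """Return a list of JSON-LD documents found in an HTML string."""
    parser = JSONLDExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.documents


# Hypothetical landing page used only for illustration.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": {"@vocab": "https://schema.org/"},
 "@id": "https://example.org/dataset/1",
 "@type": "Dataset",
 "name": "Example data set"}
</script>
</head><body><p>Landing page</p></body></html>
"""

if __name__ == "__main__":
    for doc in extract_jsonld(SAMPLE_HTML):
        print(doc["@id"], doc["@type"])
```

A real harvester would also fetch each page over HTTP and load the extracted documents into a graph store; both steps are omitted here.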

Fence

Fence provides an online tool for validating landing pages, checking sitemaps and using framing to extract specific elements of a JSON-LD data graph. Fence links out to other tools and also provides machine-readable versions of its views, so it can be used in workflows such as Jupyter notebooks.

Fence connection diagram

Fence demonstrates validation, framing (extracting key data for further processing) and sitemap checking. Fence can also provide JSON-LD extraction from web sites for further processing in custom workflows. A demonstration notebook is in development and will be shared here when ready.
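
As an illustration of what a sitemap check can involve, the sketch below parses a sitemap with the Python standard library and separates usable URLs from problem entries. It does not reproduce Fence's actual checks, which are not described here; the function name, the rules applied and the sample sitemap are assumptions.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def check_sitemap(xml_text):
    """Return (urls, problems) for the <url> entries of a sitemap document."""
    root = ET.fromstring(xml_text)
    urls, problems = [], []
    for index, entry in enumerate(root.findall("sm:url", SITEMAP_NS), start=1):
        loc = entry.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        parsed = urlparse(loc)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            urls.append(loc)
        else:
            problems.append(f"entry {index}: missing or invalid <loc> ({loc!r})")
    return urls, problems


# Hypothetical sitemap used only for illustration.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/dataset/1</loc></url>
  <url><loc>not-a-url</loc></url>
</urlset>
""".strip()

if __name__ == "__main__":
    good, bad = check_sitemap(SAMPLE_SITEMAP)
    print("usable URLs:", good)
    print("problems:", bad)
```

The returned URL list is the natural input for a later step that fetches each landing page and extracts its JSON-LD.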

Tangram

Tangram is a simple web services wrapper around pySHACL. It provides a RESTful web service for validating JSON-LD data graphs against shape graphs. See the GitHub repo for more details on using this service.

Tangram services diagram

As a simple front end for pySHACL, Tangram can look for shapes in a graph and report, in human-readable or machine-readable form, whether each shape is found. For most people, downloading and using pySHACL directly is a better solution. Tangram simply exposes pySHACL for online workflows.
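
For readers who take the direct route, the sketch below shows one way to validate a JSON-LD data graph against a shape graph with pySHACL. It is not Tangram's code and does not call Tangram's service. It assumes the third-party packages pyshacl and rdflib (version 6 or later, which bundles JSON-LD parsing) are installed; the shape, the sample data and the function name are hypothetical.

```python
# Hypothetical shape: every schema:Dataset must have at least one schema:name.
SHAPE_TTL = """
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix schema: <https://schema.org/> .
@prefix ex: <https://example.org/shapes/> .

ex:DatasetShape a sh:NodeShape ;
    sh:targetClass schema:Dataset ;
    sh:property [ sh:path schema:name ; sh:minCount 1 ] .
"""

# Hypothetical data graph with an inline context; it has no name,
# so it should fail the shape above.
DATA_JSONLD = """
{"@context": {"@vocab": "https://schema.org/"},
 "@id": "https://example.org/dataset/1",
 "@type": "Dataset",
 "description": "A data set without a name"}
"""


def check_dataset(data_jsonld, shape_ttl=SHAPE_TTL):
    """Return (conforms, report_text) for a JSON-LD string and a Turtle shape."""
    # Third-party imports are kept local so the constants above
    # stay usable without the packages installed.
    from rdflib import Graph
    from pyshacl import validate

    data_graph = Graph().parse(data=data_jsonld, format="json-ld")
    shape_graph = Graph().parse(data=shape_ttl, format="turtle")
    conforms, _report_graph, report_text = validate(data_graph, shacl_graph=shape_graph)
    return conforms, report_text


if __name__ == "__main__":
    ok, report = check_dataset(DATA_JSONLD)
    print("conforms:", ok)
    print(report)
```

The boolean result suits machine workflows, while the report text is the human-readable counterpart.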

CDF

EarthCube Council of Data Facilities Semantic Network

The CDF Semantic Network is example output from running Gleaner. Gleaner harvested and indexed more than a dozen members of the EarthCube Council of Data Facilities (CDF) that currently expose JSON-LD on their data set landing pages, and the resulting graph forms this semantic network. The results are available on GitHub and are also used in the GeoDex test search portal.

About

This work is supported by the National Science Foundation through the EarthCube program.