Data FAIRness Practice
Dr Martin Uhrin
Dr Noel Vizcaino MIET IEEE
Synopsis
- Accessible PIDs: graph structure preserved
- Parsable metadata payload (fetched)
- List of expected good practices
- Use international standards
High-level FAIR summary ...
Findable
- All IDs are global (e.g. UUID, hash)
- All IDs are permanent IDs (PIDs)
- Belong to a formal PID scheme
- Self ID: Dataset and Distribution (files)
- Can metadata be retrieved?
- Serialisation parsable?
- It includes core elements for findability e.g. creator
- Optional multiple online sources
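
A minimal sketch of the identifier checks above, using only the Python standard library. The function names and the DOI pattern are illustrative assumptions, not part of any FAIR tool. The pattern is deliberately simplified, and a syntactic match says nothing about whether the PID actually resolves.

```python
import hashlib
import re
import uuid

# Assumption: a simplified DOI shape ("10.<registrant>/<suffix>"), for illustration only.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def is_uuid(value: str) -> bool:
    """Return True if the string parses as a UUID (a global, not yet permanent, ID)."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def looks_like_doi(value: str) -> bool:
    """Syntactic check only: the DOI is not resolved or verified."""
    return DOI_PATTERN.match(value) is not None


def content_hash(payload: bytes) -> str:
    """Derive a global ID from the content itself (SHA-256 hex digest)."""
    return hashlib.sha256(payload).hexdigest()
```

A UUID or hash makes an ID global; permanence still depends on registering it in a formal PID scheme.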
Accessible
- Access level
- Metadata and data accessible by standardised protocols
- Separate access for data and metadata
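
One way to request metadata separately from data over a standardised protocol is HTTP content negotiation. The sketch below assumes the PID resolves to a server that honours the `Accept` header, which is not guaranteed. The function names and the default media type are illustrative choices.

```python
import urllib.request


def build_metadata_request(
    pid_url: str, media_type: str = "application/ld+json"
) -> urllib.request.Request:
    """Build a GET request asking for a metadata serialisation rather than the data."""
    return urllib.request.Request(
        pid_url, headers={"Accept": media_type}, method="GET"
    )


def fetch_metadata(pid_url: str, timeout: float = 10.0) -> bytes:
    """Send the request and return the raw payload (needs network access)."""
    request = build_metadata_request(pid_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

The returned bytes still need to be parsed before the metadata can be called retrievable in any useful sense.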
Interoperable
- Using formal knowledge representation language
- Using semantic resources (e.g. namespaces)
- Metadata and data formally associated
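
To illustrate what a namespace buys, the sketch below expands prefixed terms into full IRIs. The `dcat` and `dct` namespace IRIs are the published ones. The `expand_term` helper is a hypothetical simplification and does not implement full JSON-LD expansion.

```python
# Published namespace IRIs for DCAT and Dublin Core Terms.
NAMESPACES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def expand_term(term: str, namespaces: dict = NAMESPACES) -> str:
    """Expand 'prefix:local' to a full IRI when the prefix is known."""
    prefix, separator, local = term.partition(":")
    if separator and prefix in namespaces:
        return namespaces[prefix] + local
    # Unknown prefix or plain string: return unchanged.
    return term
```

Here `expand_term("dcat:Dataset")` gives `http://www.w3.org/ns/dcat#Dataset`, a term any RDF-aware tool can interpret, whereas a bare key such as `title` stays ambiguous.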
FAIR analysis and scoring
- Take results/score with a grain of salt!
- QA automation (aiding) should be the goal.
- Accounts only for deployed data/scenarios
- Not progressive (FAIR/ER), and any issue causes a cascading failure.
- Note that tools check PIDs and other items with string methods and/or regular expressions.
- More semantic resources are not necessarily better. Hardcoded sets.
- Checking the unknown (absurd requirement), scope problems (software engineering).
- Automation is challenging
- Different tools have different aims.
- DCAT or schema.org serialised as any RDF: FAIR by design (how much?). Tools act as issue detectors.
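
The difference between cascading and progressive scoring can be sketched as follows. This is one possible reading of the point above, not how any particular assessment tool works. The checks, their order, and the metadata keys are hypothetical.

```python
from typing import Callable, List, Tuple

Check = Tuple[str, Callable[[dict], bool]]

# Hypothetical checks, evaluated in order.
CHECKS: List[Check] = [
    ("has identifier", lambda m: bool(m.get("identifier"))),
    ("has creator", lambda m: bool(m.get("creator"))),
    ("has licence", lambda m: bool(m.get("license"))),
]


def cascading_score(metadata: dict, checks: List[Check] = CHECKS) -> float:
    """Stop at the first failure: every later check counts as failed."""
    passed = 0
    for _name, check in checks:
        if not check(metadata):
            break
        passed += 1
    return passed / len(checks)


def progressive_score(metadata: dict, checks: List[Check] = CHECKS) -> float:
    """Evaluate every check independently and give partial credit."""
    return sum(check(metadata) for _name, check in checks) / len(checks)


# A record with a creator and licence but no identifier.
record = {"creator": "A. Author", "license": "CC-BY-4.0"}
```

For `record`, the cascading score is 0.0 while the progressive score is 2/3: the same metadata, two very different results, which is one reason to take scores with a grain of salt.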
Some FAIR assessment tools (somewhat related)
Exercise options
- Make a simple data example FAIRer
- Make DCAT dataset in JSON-LD
- Must validate in playground
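
A possible starting point for the exercise, built as a Python dictionary and serialised to JSON-LD. All `https://example.org/...` IRIs, the title, and the choice of properties are placeholders. The vocabulary terms (`dcat:Dataset`, `dcat:distribution`, `dcat:downloadURL`, `dcat:mediaType`, `dct:title`, `dct:creator`) come from DCAT and Dublin Core Terms.

```python
import json

# Placeholder IRIs under example.org; replace them with real PIDs.
dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "https://example.org/dataset/1",
    "@type": "dcat:Dataset",
    "dct:title": "Example dataset",
    "dct:creator": {"@id": "https://example.org/people/1"},
    "dcat:distribution": {
        "@id": "https://example.org/dataset/1/distribution/csv",
        "@type": "dcat:Distribution",
        "dcat:downloadURL": {"@id": "https://example.org/files/data.csv"},
        "dcat:mediaType": {
            "@id": "https://www.iana.org/assignments/media-types/text/csv"
        },
    },
}

# Serialise to a JSON-LD document that can be pasted into a validator.
document = json.dumps(dataset, indent=2)
print(document)
```

The printed document can be pasted into a JSON-LD playground (assuming that is the playground meant here) to confirm that it parses and expands to the intended IRIs.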