#fairprinciples


Our #FAIRsharingCommunityChampions #MarkMcKerracher has created a short video "Data tips: #FAIR principles in 60 seconds" as part of his work at the SDS repository at #UniversityofOxford, where he also recommends @fairsharing

Take a look at doi.org/10.25446/oxford.283235; the entire series of videos is available at portal.sds.ox.ac.uk/SDS_self_h

See also fairsharing.org/educational#fa

Domain Ontologies: Indispensable for Knowledge Graph Construction

AI slop is all around, and extracting useful information will become increasingly difficult as we feed more noise into an already noisy world of knowledge. We are in an era of unprecedented data abundance, yet this deluge of information often lacks the structure necessary to derive meaningful insights. Knowledge graphs (KGs), which represent entities and their relationships as interconnected nodes and edges, have emerged as a powerful tool for managing and leveraging complex data. However, the efficacy of a KG depends critically on the underlying structure provided by domain ontologies. These ontologies, which are formal, machine-readable conceptualizations of a specific field of knowledge, are not merely useful but essential for creating robust and insightful KGs. Let’s explore the role that domain ontologies play in scaffolding KG construction, drawing on fields such as AI, healthcare, and cultural heritage to illustrate their importance.

Vassily Kandinsky, 1913 – Composition VII (1913)
According to Kandinsky, this is the most complex piece he ever painted.

At its core, an ontology is a formal representation of knowledge within a specific domain, providing a structured vocabulary and defining the semantic relationships between concepts. In the context of KGs, ontologies serve as the blueprint that defines the types of nodes (entities) and edges (relationships) that can exist within the graph. Without this foundational structure, a KG would be a mere collection of isolated data points with limited utility. The ontology ensures that the KG’s data is not only interconnected but also semantically interoperable. For example, in the biomedical domain, an ontology like the Chemical Entities of Biological Interest (ChEBI) provides a standardized way of representing molecules and their relationships, which is essential for building biomedical KGs. Similarly, in the cultural domain, an ontology provides a controlled vocabulary to define the entities, such as artworks, artists, and historical events, and their relationships, thus creating a consistent representation of cultural heritage information.
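As a rough illustration of the "blueprint" idea, here is a minimal Python sketch in which a toy ontology lists the allowed entity types and the subject/object types of each relation, and a graph rejects anything the ontology does not permit. The class names (Ontology, KnowledgeGraph), the relation name "created", and the node identifiers are all made up for this example; real ontologies are usually expressed in languages such as OWL rather than Python dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy ontology: allowed entity types and relation signatures (hypothetical)."""
    entity_types: set
    # relation name -> (required subject type, required object type)
    relations: dict


@dataclass
class KnowledgeGraph:
    """A graph that only accepts nodes and edges the ontology allows."""
    ontology: Ontology
    nodes: dict = field(default_factory=dict)   # node id -> entity type
    edges: list = field(default_factory=list)   # (subject, relation, object)

    def add_node(self, node_id, entity_type):
        if entity_type not in self.ontology.entity_types:
            raise ValueError(f"unknown entity type: {entity_type}")
        self.nodes[node_id] = entity_type

    def add_edge(self, subject, relation, obj):
        if relation not in self.ontology.relations:
            raise ValueError(f"unknown relation: {relation}")
        subject_type, object_type = self.ontology.relations[relation]
        if self.nodes.get(subject) != subject_type:
            raise ValueError(f"{subject} is not a {subject_type}")
        if self.nodes.get(obj) != object_type:
            raise ValueError(f"{obj} is not a {object_type}")
        self.edges.append((subject, relation, obj))


# Illustrative cultural-heritage vocabulary, invented for this sketch.
cultural = Ontology(
    entity_types={"Artist", "Artwork", "Event"},
    relations={"created": ("Artist", "Artwork")},
)

kg = KnowledgeGraph(cultural)
kg.add_node("kandinsky", "Artist")
kg.add_node("composition_vii", "Artwork")
kg.add_edge("kandinsky", "created", "composition_vii")
```

Reversing the last call, so that the artwork "created" the artist, raises a ValueError: the ontology, not the data, decides which statements are well formed.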

One of the primary reasons domain ontologies are crucial for KGs is their role in ensuring data consistency and interoperability. Ontologies provide unique identifiers and clear definitions for each concept, which helps in aligning data from different sources and avoiding ambiguities. Consider, for example, a healthcare KG that integrates data from various clinical trials, patient records, and research publications. Without a shared ontology, terms like “cancer” or “hypertension” may be interpreted differently across these data sets. The use of ontologies standardizes the representation of these concepts, thus allowing for effective integration and analysis. This not only enhances the accuracy of the KG but also makes the information more accessible and reusable. Furthermore, using ontologies that follow the FAIR (Findable, Accessible, Interoperable, Reusable) principles facilitates data integration, unification, and information sharing, essential for building robust KGs.
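To make the alignment step concrete, the sketch below maps free-text condition names from two sources onto one shared concept identifier. The identifiers ("EX:0001", "EX:0002"), the synonym lists, and the record layouts are invented for illustration and do not come from any real ontology; an actual project would take them from an established vocabulary.

```python
# Hypothetical shared vocabulary: concept id -> preferred label and synonyms.
SHARED_VOCAB = {
    "EX:0001": {"label": "hypertension",
                "synonyms": {"high blood pressure", "htn"}},
    "EX:0002": {"label": "cancer",
                "synonyms": {"malignant neoplasm", "malignancy"}},
}


def build_lookup(vocab):
    """Index every label and synonym (lowercased) by its concept id."""
    lookup = {}
    for concept_id, entry in vocab.items():
        for term in {entry["label"], *entry["synonyms"]}:
            lookup[term.lower()] = concept_id
    return lookup


LOOKUP = build_lookup(SHARED_VOCAB)


def normalize(term):
    """Return the shared concept id for a term, or None if it is unmapped."""
    return LOOKUP.get(term.strip().lower())


# Two made-up records that name the same condition differently.
trial_record = {"patient": "p1", "condition": "High blood pressure"}
ehr_record = {"patient": "p1", "condition": "HTN"}

trial_concept = normalize(trial_record["condition"])
ehr_concept = normalize(ehr_record["condition"])
print(trial_concept, ehr_concept, trial_concept == ehr_concept)
```

Returning None for unmapped terms, rather than guessing, keeps ambiguous values visible so that a curator can review them before they enter the graph.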

Moreover, ontologies facilitate the application of advanced AI methods to unlock new knowledge. They support deductive reasoning to infer new knowledge, and they provide structured background knowledge for machine learning. In drug discovery, for instance, a KG built on a biomedical ontology can help identify potential drug targets by connecting genes, proteins, and diseases through clearly defined relationships. This structured approach to data also enables the development of explainable AI models, which are critical in fields like medicine where the decision-making process must be transparent and interpretable. Ontology-grounded KGs can then be used to generate hypotheses that can be validated through manual review, in vitro experiments, or clinical studies, highlighting the utility of ontologies in translating complex data into actionable knowledge.
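The deductive side can be sketched with a very small forward-chaining loop over triples. Everything here is an assumption made for illustration: the relation names ("encodes", "associated_with", "candidate_target_for"), the placeholder entities, and the single two-step rule. It is not a statement about real biology or about how any particular reasoner works.

```python
def infer(triples, rules):
    """Forward-chain two-step rules until no new triples appear.

    rules maps (relation_1, relation_2) to an inferred relation:
    if (a, relation_1, b) and (b, relation_2, c) hold, add (a, inferred, c).
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for (rel1, rel2), inferred_rel in rules.items():
            new = {
                (s1, inferred_rel, o2)
                for (s1, r1, o1) in facts if r1 == rel1
                for (s2, r2, o2) in facts if r2 == rel2 and s2 == o1
            }
            if not new <= facts:      # at least one triple is genuinely new
                facts |= new
                changed = True
    return facts


# Placeholder facts; the entity names are deliberately generic.
facts = {
    ("gene_A", "encodes", "protein_A"),
    ("protein_A", "associated_with", "disease_A"),
}

# Hypothetical rule: a gene whose protein is associated with a disease
# becomes a candidate target for that disease.
rules = {("encodes", "associated_with"): "candidate_target_for"}

inferred = infer(facts, rules) - facts
print(inferred)
```

Each inferred triple can be traced back to the two facts and the rule that produced it, which is the kind of transparency that explainable models in medicine rely on. Inferred triples remain hypotheses to be validated, not established findings.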

Despite their many advantages, domain ontologies are not without their challenges. One major hurdle is the lack of direct integration between data and ontologies, meaning that most ontologies are abstract knowledge models not designed to contain or integrate data. This necessitates the use of (semi-)automated approaches to integrate data with the ontological knowledge model, which can be complex and resource-intensive. Additionally, the existence of multiple ontologies within a domain can lead to semantic inconsistencies that impede the construction of holistic KGs. Integrating different ontologies with overlapping information may result in semantic irreconcilability, making it difficult to reuse the ontologies for the purpose of KG construction. Careful planning is therefore required when choosing or building an ontology.

As we move forward, the development of integrated, holistic solutions will be crucial to unlocking the full potential of domain ontologies in KG construction. This means creating methods for integrating multiple ontologies, ensuring data quality and credibility, and focusing on semantic expansion techniques to leverage existing resources. Furthermore, there needs to be a greater emphasis on creating ontologies with the explicit purpose of instantiating them, and storing data directly in graph databases. The integration of expert knowledge into KG learning systems, by using ontological rules, is crucial to ensure that KGs not only capture data, but also the logical patterns, inferences, and analytic approaches of a specific domain.

Domain ontologies will prove to be the key to building robust and useful KGs. They provide the structure, consistency, and interpretability that enable AI systems to extract valuable insights from complex data. By understanding and addressing the challenges associated with ontology design and implementation, we can harness the power of KGs to solve complex problems across diverse domains, from healthcare and science to culture and beyond. The future of knowledge management lies not just in the accumulation of data but in the development of intelligent, ontologically grounded systems that can bridge the gap between information and meaningful understanding.


Due to 🇺🇦 bureaucratic requirements, many are trying to calculate the 'amount of #FAIR data' in December. This is absurd, as FAIR represents principles, not 'units of data.' There is no standardized method to measure how much data complies with FAIR, and moreover, these principles are multifaceted - each aspect can have varying levels of implementation. In short, FAIR assessment requires a comprehensive analysis, not a simple count.

#FAIRprinciples #OpenScience #DataSharing #FAIRData

🌙 FAIR and Open Data: Bridging the Gap!
At Lübeck’s Nights of Open Knowledge, Maria Chlastak hosted an interactive session on the need for both FAIR and open data in science. Highlighting the importance of open formats and research software, she guided a discussion on how true scientific impact relies on data that’s FAIR and open to all.
👉 nfdixcs.org/meldung/focus-on-r
#Nook24 #FAIRData #FAIRprinciples #OpenScience #OpenData #FDM #RDM #RSM

Essential reading: the new issue of Arabesques, the Abes journal, "Autorité et référentiels : le nouveau paradigme" (Authority and reference data: the new paradigm), 112 | 2024
publications-prairial.fr/arabe

Arabesques 112 | 2024 - Autorités et référentiels (Authorities and reference data) – Arabesques

"Parler avec Autorités" ("Speaking with Authorities") is the invitation the ADBU extends in its Manuel d’instruction à l’usage du bibliothécaire débarquant en infodoc en 2023! Many will therefore welcome the theme of this issue of Arabesques. An earlier feature in 2017 already showed the importance of identifiers, the uses of IdRef outside Abes, and the complementarity of humans and machines in linking documentary resources to authority records. In 2023, in setting out the 6 strengths of Abes, the Hcéres evaluation praised "the quality and relevance of the service offering, in particular the reference data, including IdRef". Indeed, thanks to a policy of aligning identifiers and aggregating resources, IdRef sits at the heart of an interoperable knowledge graph whose boundaries are broad and extensible. Institutions can use this graph to feed their own local information systems, grounding them in valid, controlled data. And of course, it is open and reusable by researchers who want to make their datasets FAIR. Describing today's higher education and research in IdRef through scientific output helps raise the profile of France internationally, in line with the ministry's strategy for data, algorithms and source code, and relies on ORCID and ROR, two international identifiers compatible with open science. However, reference data is not only about people or organizations! As the vitality of Geonames shows, describing place names is not the sole preserve of librarians. Finally, let us not forget Rameau indexing. While its syntax is tending to become simpler, the richness of its vocabulary is a challenge for machines to take up and exploit, with a view to assisting humans.
This issue of Arabesques closes with a portrait of an Authorities correspondent. Through him, Abes thanks all the colleagues who produce and enrich IdRef every day. You are the tireless craftspeople of an essential common good. Thank you!