Orbis net
11/12/2023

Metrics such as F1 and accuracy support the comparison of annotators, but they do not help in explaining annotator performance. Nevertheless, methodological support for explainable benchmarking, which provides researchers with feedback on the strengths and weaknesses of their methods and guidance for their development efforts, is very limited. This work addresses the need for explainability by presenting Orbis, a powerful and extensible explainable evaluation framework which supports drill-down analysis, multiple annotation tasks and resource versioning. It therefore actively aids developers in better understanding evaluation results and identifying shortcomings in their systems. Orbis currently supports four information extraction tasks: content extraction, named entity recognition, named entity linking and slot filling. This article introduces a unified formal framework for evaluating these tasks, presents Orbis' architecture, and illustrates how it (i) creates simple, concise visualizations that enable visual benchmarking, (ii) supports different visual classification schemas for evaluation results, (iii) aids error analysis, and (iv) enhances interpretability, reproducibility and explainability of evaluations by adhering to the FAIR principles and using lenses which make implicit factors impacting evaluation results, such as tasks, entity classes, annotation rules and the target knowledge graph, more explicit.

By Filip Ilievski, submitted on 25/Mar/2022
Suggestion: Major Revision

This paper describes a framework for "explainable" benchmarking, which extends prior benchmarking systems with more information extraction (IE) tasks, version control, and visualization tools. I find the goal of shedding light on information extraction evaluation to be very valuable. Similarly, evaluating IE tasks jointly is a good idea, especially for the tasks chosen in this work, which are compositional (SF>NEL>NER>CE). The framework's focus on versioning and visualization of the system and gold predictions would intuitively help tool developers debug and understand the behavior of their system as a function of the different benchmarks, tasks, and KG versions.

There are two main challenges that prevent me from suggesting acceptance at this stage.

1) Presentation - The paper should be better structured to make explicit its contributions, and how these contributions are justified by the proposed framework and its evaluation. The background section is surprisingly long, and it combines background information with decisions made in the Orbis system (e.g., in 3.1.2) and with in-depth discussion of challenges with evaluating some tasks (3.3). Meanwhile, section 3 does not follow a consistent logic - the subsection on the entity linking task is very detailed, whereas the subsection on NER is much shorter. In some places, writing seems misplaced (e.g., the mention evaluation in 3.3.6 seems to belong in 3.2). Conversely, the Orbis section, which is arguably the key contribution of this work, is much shorter and starts by presenting irrelevant information, such as the dark and standard viewing modes of the tool. It is problematic that the set of functionalities of the Orbis system is never described clearly (it is only buried inside the writing), nor illustrated with a schema.