What is LegalEvalHub?
LegalEvalHub is a platform for sharing and tracking LLM performance across legal tasks and benchmarks.
How do I evaluate my model on LegalEvalHub tasks?
We've created legal-eval-harness, a harness for evaluating LLMs on legal tasks. It is still very much a work in progress, but it is a good starting point, and we encourage contributions to the harness in the form of new tasks and model support.
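As a rough illustration, an evaluation run with the harness might look something like the sketch below. The `legal_eval_harness` module name, the `load_task` and `evaluate` helpers, the task name, and the model identifier are all assumptions for illustration, not the harness's confirmed API; consult the legal-eval-harness documentation for the actual interface.

```python
# Hypothetical sketch of an evaluation run. The module name, helper
# functions, task name, and model identifier below are illustrative
# assumptions, not the harness's actual API.
from legal_eval_harness import load_task, evaluate  # hypothetical imports


def main() -> None:
    # Load a legal benchmark task by name (task name is illustrative).
    task = load_task("contract_nli")

    # Run the model against the task and collect metrics.
    results = evaluate(model="my-model", task=task)

    # Report a summary metric, e.g. for a leaderboard submission.
    print(f"Accuracy: {results['accuracy']:.3f}")


if __name__ == "__main__":
    main()
```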
How can I get involved?
We'd love contributions of tasks, leaderboards, and evaluation runs. Please see the contribution guide for more details.