Need a lawyer? Better Call SaulLM
So essentially,
SaulLM is now available in 54B and 141B model variants.
Paper:
SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain (15 Pages)
Researchers from Equall and CINES are interested in adapting LLMs for legal tasks. Effective legal LLMs could offer crucial support to lawyers and judicial systems, especially since legal systems in many countries are overwhelmed.
Hmm.. What's the background?
Past efforts to adapt LLMs for legal tasks have faced two significant hurdles: limited model scale (capped at 7B or 12B parameters) and restricted training datasets (no more than 30 billion tokens). Earlier models such as LegalBERT, InCaseLawBERT, and SaulLM-7B, while groundbreaking, were constrained by their relatively small scale and the limited scope of their training data.
OK, so what is proposed in the research paper?
This study introduces SaulLM-54B and SaulLM-141B, two large language models specifically designed for the legal domain, featuring architectures of 54 billion and 141 billion parameters, respectively.
These models, built upon the Mixtral architecture, are trained on an extensive legal corpus exceeding 500 billion tokens.
The strategy starts from the base Mixtral models and continues pre-training them on a large corpus of legal text. This step aims to enhance the models' understanding of legal language, terminology, and concepts.
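To make the continued pre-training step a bit more concrete, here is a minimal sketch of one common way to prepare raw text for it: concatenating documents and cutting them into fixed-size blocks. This is an illustration only, not the authors' pipeline. The `pack_documents` helper, the `<eos>` marker, the whitespace "tokenizer", and the tiny block size are all assumptions made for the example.

```python
from typing import Iterable, Iterator, List

EOS = "<eos>"  # hypothetical end-of-document marker, not taken from the paper


def pack_documents(docs: Iterable[str], block_size: int) -> Iterator[List[str]]:
    """Concatenate tokenized documents and yield fixed-size token blocks."""
    buffer: List[str] = []
    for doc in docs:
        # Whitespace splitting stands in for a real subword tokenizer.
        buffer.extend(doc.split())
        buffer.append(EOS)
        # Emit as many full blocks as the buffer currently holds.
        while len(buffer) >= block_size:
            yield buffer[:block_size]
            buffer = buffer[block_size:]
    # Leftover tokens shorter than block_size are dropped in this sketch.


legal_docs = [
    "The court held that the contract was void",
    "The appellant argues the statute does not apply",
]
blocks = list(pack_documents(legal_docs, block_size=6))
for block in blocks:
    print(block)
```

Each block would then serve as one training sequence for next-token prediction. A real setup would use the base model's tokenizer and a block size of several thousand tokens.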
The model is then trained on a mixture of general and legal-specific instructions, enabling it to interpret and execute commands accurately, particularly in legal scenarios. This process involves synthesizing dialogues and question-answer pairs that simulate legal analysis and reasoning.
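As a rough illustration of what "a mixture of general and legal-specific instructions" can look like in practice, the sketch below samples from two instruction pools at a chosen ratio. The `mix_instructions` function, the example records, and the 70% legal share are assumptions for demonstration; the summary above does not state the actual mixing ratio or data format.

```python
import random
from typing import Dict, List

Example = Dict[str, str]


def mix_instructions(
    general: List[Example],
    legal: List[Example],
    legal_fraction: float,
    n: int,
    seed: int = 0,
) -> List[Example]:
    """Sample n examples, drawing each from the legal pool with probability legal_fraction."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    mixed: List[Example] = []
    for _ in range(n):
        pool = legal if rng.random() < legal_fraction else general
        mixed.append(rng.choice(pool))
    return mixed


# Hypothetical instruction records in a simple question-answer format.
general_pool = [
    {"source": "general", "question": "Summarize this paragraph.", "answer": "..."},
]
legal_pool = [
    {"source": "legal", "question": "Which clause governs termination?", "answer": "..."},
    {"source": "legal", "question": "Is this a binding precedent?", "answer": "..."},
]

# The 0.7 legal share is an arbitrary choice for illustration.
dataset = mix_instructions(general_pool, legal_pool, legal_fraction=0.7, n=10)
print(sum(ex["source"] == "legal" for ex in dataset), "legal examples out of", len(dataset))
```

Keeping some general instructions in the mix is a common way to preserve general instruction-following ability while the model specializes in a domain.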
The paper emphasizes the importance of open-sourcing the developed legal LLMs (SaulLM-54B and SaulLM-141B) to foster further research and development in legal NLP.
What's next?
The authors acknowledge the strong performance of the LLaMa3-70B model and suggest that using their scaling and domain adaptation techniques on top of a LLaMa3 base could improve performance beyond SaulLM-141B.